Digital images are stored as data files that contain information about the colors in the image. They don’t know anything about the shapes, lines, people, animals, or scenery – just how much of each color there is in the image, and where it is in the image. In the same way that you use colored pegs on a black background to make a picture with a Lite-Brite (I’m sure that name is trademarked), you use picture elements – or “pixels” – in a digital image to make your picture.
Give or take, a Lite-Brite has a total surface size of about 8″ x 10″. With about eight light pegs per inch, that gives a total resolution of maybe 80 x 64. Even if we’re generous, it can’t be much more than 100 x 75. That’s a total of 7,500 pixels on a Lite-Brite. Your digital camera, on the other hand, has more like 4000 pixels by 3000 pixels, for a total of 12,000,000 picture elements. That’s 1600 times the resolution of a Lite-Brite.
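If you want to check my arithmetic, here’s the comparison spelled out in a few lines of Python (the Lite-Brite numbers are just my generous estimates from above):

```python
# Pixel-count comparison, using the estimates above.
lite_brite = 100 * 75        # a generous Lite-Brite: 7,500 pegs
camera = 4000 * 3000         # a 12-megapixel camera sensor
print(lite_brite)            # 7500
print(camera)                # 12000000
print(camera // lite_brite)  # 1600 -- sixteen hundred times the resolution
```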
Each of these pixels has at least one color attribute associated with it. Usually three: one that tells how red it is, one for green, and one for blue. This is the “RGB space” that you might hear talk about, and it consists of values ranging from zero (for no red, no green, or no blue) up to 255, 4095, 65535, and sometimes 16,777,215 or even as high as 4,294,967,295. That can be a lot of numbers to keep track of! 256 values is 8 bits of color, so starting from zero you can count to 255 with 8 bits. With 12 bits you can count to 4095, and so on with 16, 24, or 32 bits.
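If you’d rather let a machine do the counting, here’s a quick Python sketch of how high each common bit depth can count:

```python
# Starting from zero, n bits can count up to 2**n - 1.
for bits in (8, 12, 16, 24, 32):
    print(bits, "bits ->", 2**bits - 1)
# 8 bits -> 255
# 12 bits -> 4095
# 16 bits -> 65535
# 24 bits -> 16777215
# 32 bits -> 4294967295
```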
Eight bits make a byte, so an 8-bit image with red, green, and blue gives you 3 x 8 bits = 24 bits / 8 bits per byte = 3 bytes of information for each pixel. A 2MP cell phone camera therefore takes 2,000,000 pixels, each with 3 bytes of data, for 6,000,000 bytes of data. That’s 6 megabytes (let’s ignore megabytes versus mebibytes right now) of data in the raw image. Since that is a lot of data, there are lots of ways to save space in the image file and not store every bit of every byte of every pixel.
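The same back-of-the-envelope math in Python, in case you want to plug in your own camera’s numbers:

```python
pixels = 2_000_000                # a 2-megapixel camera
channels = 3                      # red, green, and blue
bits_per_channel = 8
bytes_per_pixel = channels * bits_per_channel // 8  # 24 bits / 8 bits per byte = 3
print(pixels * bytes_per_pixel)   # 6000000 bytes -- about 6 MB of raw data
```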
The easiest way is a simple compression scheme called run-length encoding, or RLE. This means that if there is a string of numbers that are all the same (very likely in a large file), then you only need to keep track of the first one and count how many there are. As a simple example: 1233333456, which takes up 10 characters, can be written as 123.5456, which takes up only 8. A 20% savings in space – but you have to know the code. A dot means the number before it occurs X times in a row, where X is the single digit after the dot. So 1233333333334567 (16 characters long) would have to be written 123.934567 (10 characters long), since we can only fit one digit after the dot. Even so, we saved over 37% by using our simple RLE code.
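Here’s a little Python sketch of that dot code, just as a toy – real run-length encoders use their own formats, but the idea is the same. The function name `dot_encode` and the rule about when a run is “worth” encoding are my own choices for illustration, not part of any standard:

```python
def dot_encode(s):
    """Toy run-length encoder: 'd.N' means the digit d occurs N times in a row.
    Only one digit fits after the dot, so counts cap at 9; leftovers are written out."""
    out = []
    i = 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1                      # find the end of this run of identical digits
        run = j - i
        while run > 0:
            chunk = min(run, 9)         # a single digit after the dot caps us at 9
            if chunk >= 4:              # 'd.N' is 3 characters, so it only pays off for runs of 4+
                out.append(s[i] + '.' + str(chunk))
            else:
                out.append(s[i] * chunk)
            run -= chunk
        i = j
    return ''.join(out)

print(dot_encode('1233333456'))        # 123.5456   (10 characters down to 8)
print(dot_encode('1233333333334567'))  # 123.934567 (16 characters down to 10)
```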
That’s all fine and dandy, but digital images are very complex and might have similar numbers near each other, but not always in a row. So we commonly use a compression technique devised by the Joint Photographic Experts Group, or JPEG (sometimes shortened to JPG for three-letter file extensions). In very simple terms, JPEG uses a trick that exploits the fact that the human eye can’t always tell the difference between two slightly different shades of a color. This lets the total number of colors be reduced, so any given color is likely to show up more often in a row, and something like RLE can then shorten the overall length of the file. It’s all quite complicated, and it doesn’t much matter to us except for one very important thing:
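Here’s a toy sketch of that shade-reduction idea in Python. To be clear, this is not how JPEG actually works inside (real JPEG uses discrete cosine transforms and much cleverer math); the `quantize` helper is purely illustrative of why fewer shades compress better:

```python
def quantize(values, step=16):
    # Snap each value down to the nearest multiple of `step`,
    # collapsing nearby shades into one shared value.
    return [(v // step) * step for v in values]

# A row of pixel values with slight shade variations the eye barely notices:
row = [200, 201, 199, 202, 198, 120, 121, 119]
print(quantize(row))  # [192, 192, 192, 192, 192, 112, 112, 112] -- now run-length friendly
```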
JPEG compression is “lossy,” meaning that once you’ve compressed an image with it, the data that was compressed out is permanently gone. You cannot regain 100% of the data you started with if all you have is a JPEG image.
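You can see the “lossy” part with the same kind of toy shade-snapping trick: once two different originals collapse to the same values, no amount of cleverness can tell them apart again. (Again, just an illustration of the idea, not real JPEG math.)

```python
def quantize(values, step=16):
    # Snap each value down to the nearest multiple of `step`.
    return [(v // step) * step for v in values]

a = [200, 201, 199]
b = [198, 202, 207]
print(quantize(a))                  # [192, 192, 192]
print(quantize(a) == quantize(b))   # True -- two different originals, one compressed result
# Given only the compressed data, there is no way to know which original you started with.
```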
This is fine for most people and for most images. We tend to take pictures of things we see with our eyes because we want to remember them in some way. That’s why JPEG is the standard format that most digital cameras use to store their images. But this can present a problem if we want to edit those pictures on the computer, because the camera has decided what is important and what isn’t, and thrown the rest out when it created the JPEG image. We’ve lost some data that we might have been able to use on the computer.
This is especially problematic with images from the Time Machine, because we are often zooming in, changing colors, altering contrast, extracting data that isn’t visible to normal human vision, and other tricks to bring out the best in our images. If we’re starting with less than 100% of the data, then the best we can do is “pretty good” when it comes to our final image.
And this is why we use a camera that stores the image in both a JPEG format and what’s called a RAW format. RAW is the “raw data” that the imaging sensor detected, simply written to the camera’s storage card (using something like RLE, but with zero loss of data). If we start with the RAW format, we have all of the information and not just some of it, so we can make a better image in the end with further computer processing.
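That “zero loss” is the whole point. With a lossless scheme like our toy dot code from earlier, decoding gives back exactly what went in. Here’s a hypothetical `dot_decode` to show the round trip (assuming the same one-digit-after-the-dot rule as before):

```python
def dot_decode(s):
    """Decode the toy dot code: 'd.N' expands back to the digit d repeated N times."""
    out = []
    i = 0
    while i < len(s):
        if i + 2 < len(s) and s[i + 1] == '.':
            out.append(s[i] * int(s[i + 2]))   # expand 'd.N' back into N copies of d
            i += 3
        else:
            out.append(s[i])                   # a plain digit passes through untouched
            i += 1
    return ''.join(out)

print(dot_decode('123.5456'))    # 1233333456 -- exactly the original, nothing lost
print(dot_decode('123.934567'))  # 1233333333334567
```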
To show you the difference between the two, I’ve added an image in both RAW format and JPEG format of the same door from an area bed and breakfast. (Yes, even the Astropotamus takes a weekend off here and there and stays in his current time.) Compare the two images and you’ll see that the JPEG looks…wait, you decide for yourself what the differences are. Just remember, only the RAW version looks good to the Astropotamus once he’s finished editing it the way he wants on the computer.