Anatal Digilog
It is said that we are now in the digital era. But the solid-state imaging chips of “all-digital” cameras are actually analog devices. On the other hand, the centuries-spanning image-capture technology of motion-picture film operates on straightforward digital principles. The question of whether analog or digital provides higher quality must be answered “either.” And the one about which offers immunity to noise and interference must be answered “neither.”
Confused? That’s not surprising considering the hype that has surrounded the word digital. It simply means relating to digits.
Digit, from the Latin digitus, means finger (or toe), and, because humans count on their fingers, it also means a numeral. Digital, therefore, means expressible as a number.
Analog, from the Greek analogos, means parallel (as a noun) — something that is like something else. Light a scene and shoot it with a video camera. Now add more light. As the light level increases, so does the video-signal level. The video signal is an analog of the amount of light falling on the scene.
If an analog light meter is held in front of the scene, its needle will move the same way the video level does. Add a little light, and the needle moves a little. Add a lot of light, and the needle moves a lot.
A digital light meter might, at first glance, seem similar to an analog light meter (making it an analog of an analog). If it indicates 50 foot-candles at a certain light level, then, when the light level is doubled, it will indicate 100 foot-candles. But there’s an important difference.
Suppose the digital light meter has three digits. It can, therefore, indicate 100 foot-candles or 101. But it cannot indicate 100.5. The light level may be varied between 99.5 and 100.4 foot-candles, and the display will not change. The needle on an analog light meter might not move much over that range, but it will move.
The numbers 100 and 101 are, in this case, light levels. All digital systems depend on some number of distinguishable levels. A level is sometimes called a quantum (plural quanta), from the Latin quantus, not an Australian airline but the question how much? Quantity comes from the same root.
It’s easy to see why an analog video camera is useful. It provides a signal that’s a parallel of the scene, enabling it to be reproduced. But why go digital?
In the children’s game of Telephone, one person whispers a message in another person’s ear. The second person whispers it to a third, and so on, until the last person says it out loud. When the final version is compared to the first, laughter often ensues. The message has become garbled as it passed through the distribution chain.
Analog electronic signals are similar. They are subject to interference that will affect the final result. Introduce a small amount of noise into a video signal, and the final picture will get a little grainy. With more noise, the picture is said to have “snow,” an expression born in the days of black-and-white television, when a noisy signal appeared to be something viewed through a blizzard. With a huge amount of noise, it may no longer be possible to detect the original signal at all.
Now consider a game of Telephone in which all participants know there can be only two possible messages, “one” or “zero.” This time, no matter how long the chain, the final output is likely to be exactly the same as the input. A person might hear “fun” or “ton” instead of “one,” but, knowing there are only two possible messages, will “correct” (regenerate) the word when repeating it to the next person in the chain.
A video signal, of course, has many more than two possible levels, and those levels change instantaneously. If the left edge of the screen is white and the right edge black, and there is a smooth transition between them, then there will be an infinite number of levels between the white and the black. There will also be an infinite number of left-to-right positions (represented in video by an infinite number of moments in time) in which to measure them. A digital video signal would, therefore, seemingly have to carry an infinite number of digits per second, a clear impossibility.
Fortunately, neither the infinite number of levels nor the infinite number of moments is really necessary. The human visual system just isn’t that good. The desired signal-to-noise ratio determines the number of levels required, and the desired fineness of detail determines the number of moments (or samples) required each second.
By international agreement, the brightness-detail (luma) portion of a standard-definition video signal is typically sampled 13.5 million times per second, and each sample is typically assigned to one of 256 different levels. The 13.5-million figure is based on a combination of information theory and existing analog practices around the world. To accommodate U.S. and European video frame rates and numbers of lines per frame, the sampling rate had to be a multiple of 2.25 million times per second, and, to accommodate the desired detail, it had to be above 12 million or so.
The 256 levels represent two to the eighth power. A power of two matters because of that perfect game of Telephone: a choice between only two selections is the most basic choice that can be made. It’s a binary choice and can be represented by a single binary digit (the words are contracted to bit), either zero or one.
Two bits, however, can describe four different conditions: 00, 01, 10, and 11. Three bits can describe eight: 000, 001, 010, 011, 100, 101, 110, and 111. So eight bits can describe 256.
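The doubling with each added bit can be checked by simply listing the patterns. This is a minimal Python sketch; the function name `bit_patterns` is made up for illustration.

```python
from itertools import product

def bit_patterns(n_bits):
    """Return every n-bit pattern as a string, in counting order."""
    return ["".join(bits) for bits in product("01", repeat=n_bits)]

print(bit_patterns(2))        # the four two-bit conditions
print(len(bit_patterns(3)))   # number of three-bit conditions
print(len(bit_patterns(8)))   # number of eight-bit levels
```

Each extra bit doubles the list, which is why eight bits reach 256.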
Now consider signal transmission, distribution, or recording — processes in which analog signals are typically subjected to noise and interference. If some characteristic of the transmission, distribution, or recording processes momentarily alters the video signal by, say, 10%, it will cause a noticeable shift in brightness. In an extreme case, if succeeding passes through the system affect the signal the same way, eventually something that’s white could turn black or vice versa.
A 10% change in level, however, is unlikely to affect a basic digital receiver’s ability to detect whether a one or a zero is being transmitted. As in the two-possible-word game of Telephone, that receiver can therefore perfectly regenerate the original bits. Even if every re-transmission or re-recording causes the same 10% shift, each time that 10% shift will hit a perfect signal. There will be no accumulation of defects.
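The difference between accumulating and regenerating can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than a model of any real equipment: levels run from 0.0 (black, or a zero) to 1.0 (white, or a one), every pass lowers the level by 10% of full scale, and the digital receiver decides at a hypothetical halfway threshold.

```python
SHIFT = -0.10      # assumed: each pass lowers the level by 10% of full scale
THRESHOLD = 0.5    # assumed decision point between a zero and a one

def analog_chain(level, passes):
    """Pass an analog level through the system repeatedly; errors accumulate."""
    for _ in range(passes):
        level = min(1.0, max(0.0, level + SHIFT))  # clamp to the legal range
    return level

def digital_chain(bits, passes):
    """Pass bits through the same system, regenerating them after each pass."""
    for _ in range(passes):
        received = [bit + SHIFT for bit in bits]               # shifted voltages
        bits = [1 if v > THRESHOLD else 0 for v in received]   # regenerate
    return bits

message = [1, 0, 1, 1, 0, 0, 1, 0]
print(analog_chain(1.0, 10))       # white after ten passes: essentially black
print(digital_chain(message, 10))  # the bits after ten passes
```

In this toy model the analog white drifts all the way down, while the bit list comes out exactly as it went in, because each pass starts again from clean ones and zeros.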
That, at least, is the theory behind digital perfection. The practice is something else entirely.
A U.S. broadcast video signal, conforming to the specifications of the National Television System Committee (NTSC) and the Federal Communications Commission (FCC), has a bandwidth of 4.2 million cycles per second (4.2 MHz). Compare that to a standard-definition digital video signal. The 13.5 MHz sampling with eight bits per sample yields 108 million bits per second (108 Mbps), but that covers only luma. To add color, the figure is nominally doubled. That’s 216 Mbps if 256 levels are enough, 270 Mbps if there are 10 bits (1024 levels) per sample.
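The bit-rate arithmetic above is easy to reproduce. In this small Python sketch the function name and the `include_color` flag are made up for illustration; the doubling for color follows the "nominally doubled" figure in the text.

```python
SAMPLE_RATE_HZ = 13_500_000   # luma samples per second

def bit_rate(bits_per_sample, include_color=True):
    """Bits per second for luma alone, or nominally doubled to add color."""
    luma = SAMPLE_RATE_HZ * bits_per_sample
    return luma * 2 if include_color else luma

print(bit_rate(8, include_color=False) / 1e6)  # luma only, 8 bits, in Mbps
print(bit_rate(8) / 1e6)                       # with color, 8 bits
print(bit_rate(10) / 1e6)                      # with color, 10 bits
```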
At first glance, 270 Mbps might seem to be a lot more than 4.2 MHz. But there’s no direct correlation between them. There need not be a one-bit-per-Hz relationship.
The modulation technique used in the U.S. broadcast digital-television standard is called 8-VSB. VSB stands for vestigial sideband. More significantly, 8 stands for eight levels. Each transmitted symbol can have eight different states instead of two, which means that three bits (two to the third power is eight) can be carried instead of just one.
That’s good news, because it triples the transmission capacity. Unfortunately, it’s also bad news, because it complicates the hypothetical game of Telephone. Now eight words are possible instead of just two.
If the eight words are one, three, five, seven, nine, eleven, thirteen, and fifteen, it’s not obvious that each person will be certain of exactly what is said. Pilots sometimes say “niner” instead of “nine” to be sure the listener will know it wasn’t “five.” A whisper of “eleven” might be heard as “seven.”
A 10% change in level could be just as confusing to a digital receiver. With eight different levels squeezed into a one-volt range, the first level might be at zero volts, the next at 0.14 volts, the next at 0.29 volts, and so on. If something comes in at 0.1 volts, was that a 0.1-volt increase in the zero-volt level or a 0.04-volt decrease in the 0.14-volt level?
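The receiver's dilemma can be sketched as a nearest-level decision. This Python fragment assumes the evenly spaced eight-level, one-volt example above; it is not the actual 8-VSB level set, and the function name is invented.

```python
# Eight evenly spaced levels across an assumed one-volt range.
LEVELS = [i / 7 for i in range(8)]

def nearest_level(voltage):
    """Return the index of the level closest to the received voltage."""
    return min(range(len(LEVELS)), key=lambda i: abs(LEVELS[i] - voltage))

print([round(v, 2) for v in LEVELS])
print(nearest_level(0.1))   # a zero-volt symbol that arrived 0.1 volts high
```

A simple receiver picks whichever level is closest, so 0.1 volts is read as the 0.14-volt level. If the sender actually meant zero volts, the 10% shift has produced a wrong symbol, something that would not happen with only two widely separated levels.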
It’s not just the voltage. Contrary to what you may have thought you learned, the speed of light is not constant. Only the speed of light in a vacuum is constant. Light moves slower in air, water, glass, and other seemingly transparent media.
Similarly, electronic signals travel at different speeds through different materials. Two signals that took two different paths will likely arrive at a digital receiver at different times. Each bit of a 270 Mbps signal lasts less than four billionths of a second. If a receiver detects something two billionths of a second after it was expected, is that a late bit from position one or an early bit from position two?
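The timing figures can be worked out directly. In this Python sketch, bit slots are numbered from zero and each is assumed to be expected at a whole multiple of the bit period; the function name is hypothetical.

```python
BIT_RATE = 270_000_000            # bits per second
BIT_PERIOD_NS = 1e9 / BIT_RATE    # duration of one bit in nanoseconds

def nearest_bit_slot(arrival_ns):
    """Index of the bit slot whose expected time is closest to the arrival."""
    return round(arrival_ns / BIT_PERIOD_NS)

print(round(BIT_PERIOD_NS, 2))    # a little under four nanoseconds
print(nearest_bit_slot(2.0))      # which slot a 2 ns late arrival is nearer to
```

With a bit period of roughly 3.7 nanoseconds, something arriving 2 nanoseconds late is marginally closer to the following slot than to its own, which is exactly the ambiguity described above.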
Then there’s filtering. Digital bits are often depicted as perfectly rectangular pulses. They rise instantly to the desired level, stay there for the full duration of the bit period, and then fall just as instantly.
In the real world, there are no such instantaneous changes. Going from zero volts to one volt (or vice versa) takes time. Instead of 90-degree corners, a real-world pulse is rounded. A simple trip through a coaxial cable will have three effects on a pulse: its level will be reduced, it will be spread over more time, and its peak will be delayed.
That’s why seemingly perfect SDI (serial-digital interface) video signals can travel only a certain distance down a cable before it becomes impossible to identify their ones and zeroes. The worse the path, the shorter the distance.
Fortunately, there are measures that can be taken to help. One is the use of regenerators. If a digital signal can travel safely 300 feet down a cable, then a regenerator every 200 feet will allow longer runs. Another option is so-called forward error correction.
If after every two bits another bit is transmitted that says whether the sum of the two bits should be even or odd, then anytime a single bit is lost, that sum can be used to replace it. The advantage is that a single lost bit in each group is repairable; the disadvantage is that the number of bits to be carried increases by 50%.
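The two-bits-plus-one scheme is small enough to sketch in Python. The function names are made up, a lost bit is represented by `None`, and the sketch assumes the receiver knows which position was lost and that the data contains an even number of bits.

```python
def encode(bits):
    """Append one parity bit after every pair of data bits."""
    out = []
    for a, b in zip(bits[0::2], bits[1::2]):
        out += [a, b, a ^ b]   # parity is 1 when the pair's sum is odd
    return out

def repair(group):
    """Fill in a single lost bit (None) in an [a, b, parity] group."""
    missing = [i for i, bit in enumerate(group) if bit is None]
    if len(missing) > 1:
        raise ValueError("only one lost bit per group can be repaired")
    fixed = list(group)
    if missing:
        others = [bit for bit in group if bit is not None]
        fixed[missing[0]] = others[0] ^ others[1]
    return fixed

data = [1, 0, 1, 1]
sent = encode(data)
print(sent, len(sent) / len(data))   # coded bits and the overhead factor
print(repair([1, None, 1]))          # middle bit of the first group was lost
```

Four data bits become six transmitted bits, the 50% increase mentioned above. If two bits in the same group are lost, this crude code cannot recover them.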
Although that’s a crude example of error-correction coding, the amount of additional capacity is not that unusual. In the U.S. broadcast digital-television standard, the increase is closer to 70%. And even that is sometimes not enough.
The real-world conditions of signal reflections and noise have forced the Advanced Television Systems Committee (ATSC), the group that approved the digital-television standard, to consider more-robust versions with even lower capacities. And it’s not just that standard. In Britain, a more-robust reduced-capacity digital-television standard just went on the air.
Regenerators and error-correction coding are aimed at preserving digital-signal quality. But the so-called “perfect” signals were never perfect in the first place. Consider the standard-definition digital-video standard, Recommendation 601, the one with the 13.5 MHz sampling.
In theory, 13.5 MHz sampling should allow perfect reproduction of an analog signal with a bandwidth just under half that rate: 6.75 MHz. In the real world, where there are no perfect filters, it’s more like 5.75 MHz.
That’s still more than enough to cover the U.S. broadcast NTSC limit of 4.2 MHz. It’s even enough for Britain’s analog broadcast limit of 5.5 MHz. But it falls slightly short of France’s analog broadcast limit of 6 MHz.
In analog videotape recorders, each new model tended to produce performance superior to the last. The old analog IVC-9000 had an astounding 8 MHz video bandwidth — much more than any Rec. 601 standard-definition digital videotape recorder. Digital videotape recorders don’t provide improved pictures as newer models are introduced.
Then there are quanta. Assuming a high-enough sampling rate and proper filtering, a digital video signal should be able to reproduce the full range of detail of its analog original. But it can never perfectly reproduce the brightness range.
Consider a sample. What is its level? It is extraordinarily unlikely that it will fall precisely on one of the digital quanta. Instead of being 128, it might be 128.3 or 127.75. If it is rounded off to 128, that rounding error (called quantization error) will likely introduce objectionable distortion in the signal — perhaps a contour line that didn’t exist in the original. In audio, harmonic distortion is introduced.
More bits lower the distortion, but they don’t eliminate it. Perhaps amazingly, the best way to eliminate quantization distortion is to add noise. As long as there is sufficient noise in the input signal so that quantization error becomes random (instead of correlated to the signal), the distortion disappears. So the ideal, “noise-free” digital signal actually needs noise.
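The effect of noise on quantization error can be illustrated with a toy experiment in Python. The uniform noise of plus or minus half a level, the sample count, and the fixed random seed are all assumptions chosen for illustration; real systems shape their dither more carefully.

```python
import math
import random

def quantize(level):
    """Round a level to the nearest whole-number quantum."""
    return math.floor(level + 0.5)

def average_quantized(level, samples=20000, dither=False, seed=0):
    """Average many quantized readings of the same steady level."""
    rng = random.Random(seed)   # fixed seed so the sketch is repeatable
    total = 0
    for _ in range(samples):
        noise = rng.uniform(-0.5, 0.5) if dither else 0.0
        total += quantize(level + noise)
    return total / samples

print(average_quantized(128.3))               # always 128: a fixed 0.3 error
print(average_quantized(128.3, dither=True))  # hovers near 128.3
```

Without noise, every reading of 128.3 lands on 128, so the error is tied to the signal. With noise, individual readings flip between 128 and 129, but their average tracks the true level; the error has been turned into randomness rather than distortion.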
Then there’s bit-rate reduction, commonly called “compression.” Betacam SX, Digital Betacam, Digital-S, DV, DVCAM, DVCPRO, HDCAM, and IMX are all popular digital videotape recording formats. All of them are compressed. So is DVD. So are DirecTV and DISH satellite signals. So is digital cable. So are digital broadcasts.
The compression may be quite severe — 50:1 is not uncommon. Does compression affect quality? Perhaps.
Most compression systems make use of the redundancy in a video signal. Each frame is likely to be similar to the one before and after it. Each picture element is likely to be similar to those around it.
When that’s true, compressed digital video signals can look just as good as uncompressed. But, when the system is stressed by fine detail and rapid, complex motion (as in a shot of players against a crowd in a basketball game), there might not be enough redundancy to support the high compression ratio, and quality will degrade. Even with sufficient sampling rate and quanta, just the right amount of noise, and (thanks to forward error correction and an appropriate number of bits per Hz) perfect reception, excessive compression can result in terrible-looking pictures.
So, neither analog nor digital signals are free from real-world noise and interference. And either analog or digital may offer higher quality, depending on such factors as bandwidth and signal-to-noise ratio in the analog system and sampling rate, bit depth (number of bits per sample), and compression ratio in the digital. But what about those image receptors, camera chips and film?
Just as all digital signals are actually analog (they take time to rise and fall and are subject to analog noise and interference), to some extent all analog signals (and everything else in the world, including humans) are digital. In the realm of quantum physics, electrons in atoms can occupy only discrete energy states. Similarly, if light is thought of as quantities of photons, then any image receptor may serve as a photon counter, an inherently digital device.
In practical terms, however, no common video camera counts individual photons. The imaging chips in “digital” video cameras look like computer chips. They’re even divided into a fixed (and, therefore, digital-looking) grid of columns and rows. But, at the intersection of a column and a row, the sensor delivers an analog signal proportional to the light falling on it. It’s not divided into discrete levels; that happens later, after an analog-to-digital conversion (ADC) stage. And, because the signals from the chip are very weak, there’s likely to be an analog preamplifier as well before the ADC. To put it bluntly, “all-digital” cameras simply aren’t.
In practical terms, the only truly digital video device that straddles the electronics and light realms is the digital micromirror device at the heart of the Texas Instruments DLP (digital light processing) projectors. Like imaging chips, they, too, are chips divided into columns and rows. In this case, however, at the intersection of a column and a row, instead of an image sensor, there’s a tiny, tilting mirror. In one position, light from the mirror hits the screen; in the other, it doesn’t. That’s a truly binary device. Shades of gray are created based on the number of times the mirror sends light to the screen in any given period. That’s numerical, too: a totally digital system. But that’s a light-output system, not a light-input system.
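The idea of building gray out of on/off mirror flips can be sketched in Python. This is a simplification under stated assumptions: one period is divided into 255 equal time slots, and the mirror is simply "on" for as many slots as the gray level calls for. Actual projectors schedule their mirrors differently; the names here are invented.

```python
def mirror_schedule(gray_level, slots=255):
    """Return on/off mirror states for one period; gray_level of them are on."""
    if not 0 <= gray_level <= slots:
        raise ValueError("gray level out of range")
    return [True] * gray_level + [False] * (slots - gray_level)

def perceived_brightness(schedule):
    """Fraction of the period during which light reaches the screen."""
    return sum(schedule) / len(schedule)

print(perceived_brightness(mirror_schedule(0)))    # black
print(perceived_brightness(mirror_schedule(255)))  # white
print(perceived_brightness(mirror_schedule(51)))   # a dark gray
```

Every individual mirror state is binary, yet the fraction of time spent "on" yields as many shades as there are slots.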
Is there no light-input imaging system that is truly binary? As a matter of fact, there is — and it’s quite common. It’s called photographic film, or, to be more specific, the emulsion on photographic film.
A single grain on the film can only be either exposed or not exposed. There are literally no shades of gray in between. That’s a binary condition. But we also know that film does capture shades of gray. In fact, the characteristic way that film deals with different shades of gray is said to be a factor in the desirable “look” that so many people associate with the photochemical medium.
Film captures a range of grays by having a range of grain sizes. Some become exposed at very low levels of light. Others need a lot of light. Still others lie in between. The seemingly natural way that film deals with shades of gray is, in fact, effectively programmed into it by the intentional composition of the emulsion.
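A crude Python model shows how binary grains can add up to gray. The ten evenly spaced thresholds are purely hypothetical; a real emulsion has vast numbers of grains with a deliberately designed mix of sensitivities.

```python
def exposed_fraction(light, thresholds):
    """Fraction of grains that become exposed at a given light level."""
    return sum(light >= t for t in thresholds) / len(thresholds)

# Hypothetical grains: the most sensitive needs 0.1 units of light,
# the least sensitive needs 1.0.
thresholds = [i / 10 for i in range(1, 11)]

for light in (0.0, 0.35, 0.75, 1.0):
    print(light, exposed_fraction(light, thresholds))
```

Each grain is either exposed or not, but the share of exposed grains rises step by step with the light. Changing the spread of thresholds changes the shape of that response, which is the sense in which the film's treatment of gray is programmed into the emulsion.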
Even though they’re digital, the grains aren’t organized into columns and rows, which is another good thing. Those analog imaging chips, which are arranged in numerical rows and columns, can exhibit a degradation called “fixed pattern noise,” a sort of graininess that doesn’t move as the camera is panned or tilted, so that the picture looks as if it were shot through a dirty lens or a screen door.
So there’s nothing inherently better about either digital or analog signals, though the former may be manipulated through binary computer circuitry and are, therefore, perhaps, ultimately easier to understand or at least like. After all, with digital you can dig it all.