Since the first days of television, pictures have been painted on television tubes in the same way. A moving electron beam scans across the screen from left to right; when it finishes the top row, it jumps back to the left and begins painting the next row down. If you look closely, you can even see these scanning lines in the picture. This analog TV system, called NTSC, was originally adopted back in the 1940s. It uses 525 lines to create a full picture, or "frame," but the 525 lines are not scanned one after the other. Rather, each frame is divided into two "fields" of 262.5 lines that "interlace" to form a complete picture. The odd-numbered lines are drawn first, starting at the upper left corner and ending halfway across the screen at the bottom. Then the even-numbered lines fill in the interstices, starting halfway across the screen at the top and ending at the bottom right corner.
Why do something so convoluted? It's because of the way we see, and because of the limitations of the 6-MHz broadcast channel. Movies and TV really aren't "moving" pictures at all. The eye is shown a series of slightly different still pictures in rapid succession. The eye/brain blends these together and perceives smooth motion as long as the pictures appear at a fast enough rate. If the rate is too slow, the picture "flickers." People differ in their sensitivity to flicker, but in general everyone is more sensitive to flicker in bright areas of the screen. Motion pictures are shot at 24 frames per second, which is too slow to prevent flicker in bright areas. To compensate, each frame is flashed on the screen two or three times in succession to boost the rate and trick the eye. The analog NTSC TV system used in this country presents close to 30 frames a second. Sensitive people will still perceive flicker at that rate, especially at the periphery of vision. Splitting the 30 frames into 60 interlaced fields, however, stimulates the eye twice as often and suppresses the flicker. Why not send 60 full frames a second? Because that would require twice as much information, which would exceed the capacity of the 6-MHz channel assigned to conventional TV broadcasters.
(Remember, these decisions were made fifty years ago! The 6-MHz figure defines the slice of the radio spectrum that is available for broadcasting information to your TV. Basically, it defines the size of the "pipe" that the content flows through. Fifty years ago the "pipe" wasn't very large, so compromises were necessary.)
So the information content in 60 fields of 262.5 lines is exactly the same as the information in 30 frames of 525 lines. Interlacing is simply a crutch to reduce flicker without exceeding the information capacity of the broadcast channel. But interlacing is no panacea. If there were no motion and the interlacing were perfect (which is rarely the case), you would expect to have 525 lines of vertical resolution. That's not quite true, however. Forty-two lines are in the so-called "vertical blanking interval" that is used to synchronize the display and transmit text information, closed captioning, setup signals, etc., and some lines are off screen, either just above or just below the edge of the tube. In the end, only about 460 lines are available for the picture that you see on the screen. Keep that number in mind; it's approximately the same as the 480 lines of a VGA computer display.
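The equivalence is easy to check with a little arithmetic. The short Python sketch below is only an illustration: the helper name lines_per_second is made up for this example, and the frame rate is rounded to 30, as it is in the text.

```python
# Illustrative arithmetic only; lines_per_second is a hypothetical helper name.
def lines_per_second(pictures_per_second, lines_per_picture):
    """Number of scan lines that must be sent each second."""
    return pictures_per_second * lines_per_picture

interlaced = lines_per_second(60, 262.5)   # 60 fields of 262.5 lines each
full_frames = lines_per_second(30, 525)    # 30 frames of 525 lines each
doubled = lines_per_second(60, 525)        # 60 full frames: twice the information

print(interlaced, full_frames, doubled)    # 15750.0 15750 31500

# Lines left once the 42-line vertical blanking interval is removed.
# The text's figure of about 460 also subtracts the lines hidden off screen.
lines_after_blanking = 525 - 42
print(lines_after_blanking)                # 483
```

Both interlaced fields and full frames come to the same 15,750 lines per second, while 60 full frames would need 31,500.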
The computer industry has largely avoided interlacing. Type, graphics and slanted lines look terrible on an interlaced screen. Computer displays almost invariably use "progressive" scanning; that is, the lines are created in order from top to bottom. So progressive goes 1, 2, 3, 4, 5, ... whereas interlaced goes 1, 3, 5, 7, 9, ... and then a moment later 2, 4, 6, 8, ... Since each line is presented sequentially, there is no distinction between "fields" and "frames" or between "field rate" and "frame rate." In fact, it's better to think of computer-generated pictures in terms of the "refresh rate," the number of pictures painted on the screen per second. That's what determines flicker. For the same refresh rate and resolution (total number of picture elements in each screen), a progressively scanned display requires twice the bandwidth of an interlaced display. But that's only if we look at this from an analog viewpoint. Computers generate pictures digitally, so bandwidth is less of a consideration.
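The two scanning orders can be written out in a few lines of Python. This is a simplified sketch: the function names are invented for the example, and it ignores the half lines at the field boundaries of a real NTSC signal.

```python
# Simplified sketch of scan order; function names are hypothetical.
def progressive_order(total_lines):
    """Every line in order, top to bottom, in a single pass."""
    return list(range(1, total_lines + 1))

def interlaced_order(total_lines):
    """Two passes: the odd-numbered field first, then the even-numbered field."""
    odd_field = list(range(1, total_lines + 1, 2))
    even_field = list(range(2, total_lines + 1, 2))
    return odd_field, even_field

odd_field, even_field = interlaced_order(9)
print(progressive_order(9))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(odd_field)             # [1, 3, 5, 7, 9]
print(even_field)            # [2, 4, 6, 8]
```

Each field carries only half the lines, which is why an interlaced display needs half the analog bandwidth of a progressive one at the same refresh rate.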
Digital high-definition TV is a whole new ball game. In fact, you can forget the high-definition part; digital TV is simply a different beast than analog TV. Why? Because once in the digital domain, pictures can be compressed with mathematical algorithms that eliminate redundant information. On-screen details that don't change from one frame to the next don't need to be retransmitted. Instead of building each new frame from scratch, the picture on screen is created by modifying the previous frame. Every now and then (whenever the picture changes drastically), a full frame of information is sent, but most "frames" contain very little new information. So frame rate is far less of an issue than it is in the analog world, where every frame must be transmitted whether there's anything new in it or not.
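The idea of sending only what has changed can be shown with a toy example. The Python sketch below is not MPEG or any real broadcast codec; it is a deliberately simple frame-differencing scheme, and the function names, the packet labels "key" and "delta," and the 50 percent threshold are all assumptions made for illustration.

```python
# Toy frame differencing, NOT a real codec. Names and threshold are hypothetical.
def encode(frames, key_threshold=0.5):
    """Turn frames (lists of pixel values) into 'key' and 'delta' packets."""
    packets = []
    previous = None
    for frame in frames:
        if previous is None or len(previous) != len(frame):
            packets.append(("key", list(frame)))        # nothing to diff against
        else:
            changes = [(i, new) for i, (old, new) in enumerate(zip(previous, frame))
                       if old != new]
            if len(changes) > key_threshold * len(frame):
                packets.append(("key", list(frame)))    # drastic change: send it all
            else:
                packets.append(("delta", changes))      # send only changed pixels
        previous = frame
    return packets

def decode(packets):
    """Rebuild every frame by applying each packet to the previous picture."""
    frames = []
    current = None
    for kind, payload in packets:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)                     # copy, then patch
            for index, value in payload:
                current[index] = value
        frames.append(current)
    return frames

frames = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 9, 0, 0, 0, 0],   # one pixel changes
    [0, 0, 0, 9, 0, 0, 0, 0],   # nothing changes
    [5, 5, 5, 5, 5, 5, 5, 5],   # the whole picture changes
]
packets = encode(frames)
print([kind for kind, _ in packets])   # ['key', 'delta', 'delta', 'key']
print(decode(packets) == frames)       # True
```

The third "frame" here costs nothing to send, which is the sense in which bandwidth depends on picture content rather than on frame rate.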
It's fair to say that digital TV images are more akin to computer-generated images than to those of a traditional analog broadcast system. Bandwidth becomes more a function of what the picture contains, and how effectively it can be compressed, than of the display rate. Digital TV doesn't care whether the final image is interlaced or progressively scanned. In fact, it is easier to compress a progressively scanned image than an interlaced image because there's no worry about "A" fields and "B" fields. Each frame is complete in and of itself, which makes it far easier to search consecutive pictures for areas in which nothing has changed. In the digital world, "interlacing" is neither needed nor desirable, so why use it? Partly because of the NIH (Not Invented Here) factor. Some broadcast people are wedded to interlaced pictures because that's what they're used to and that's what they have equipment to generate. Many professional studio cameras produce interlaced fields rather than full frames. Cameras are analog devices, and interlacing reduces the bandwidth by half for apparently comparable resolution. But progressively scanned, high-frame-rate cameras exist. Right now they may be somewhat expensive, but that's because there's been relatively little demand for them.
There's also the numbers game. A "1080i" picture sounds better than a "720p" picture, but is it? Not really. The rule of thumb is that interlaced pictures have only 60 percent of the resolution that the number implies because of the interlace problems described above. Thus, a 1080i picture is approximately equivalent to a progressively scanned picture of 650 lines. In short, 720p is better than 1080i! Computer types would go further, and suggest that you should cut the interlaced number in half because only half the lines are sent every 60th of a second. That may be going overboard, but the point is well taken. There is more information contained in 720 lines sent 60 times a second than in 1,080 lines sent 30 times a second. If that weren't true, there never would have been a reason to interlace analog systems. In a digital system that uses MPEG compression, 720p at 60 frames per second can be handled without straining the transmission channel, so it's pointless to use 1080i at 30 frames per second. Nonetheless, two major networks (CBS and NBC) have adopted 1080i (for now) while ABC and Fox are going with 720p.
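The numbers in this comparison can be reproduced directly. The Python sketch below simply applies the 60 percent rule of thumb and the line counts quoted above; the constant and function names are invented for the example, and the 0.6 factor is the rule of thumb from the text, not a measured value.

```python
# Arithmetic from the text; names are hypothetical, 0.6 is a rule of thumb.
INTERLACE_FACTOR = 0.6

def effective_lines(nominal_lines, interlaced):
    """Approximate progressive-equivalent resolution of a picture."""
    if interlaced:
        return round(nominal_lines * INTERLACE_FACTOR)
    return nominal_lines

def lines_per_second(nominal_lines, full_pictures_per_second):
    """Scan lines delivered each second."""
    return nominal_lines * full_pictures_per_second

print(effective_lines(1080, interlaced=True))    # 648, roughly the 650 quoted
print(effective_lines(720, interlaced=False))    # 720
print(effective_lines(1080, interlaced=True) // 2 * 0 + 1080 // 2)  # 540, the "cut it in half" view

print(lines_per_second(720, 60))    # 43200 lines per second for 720p
print(lines_per_second(1080, 30))   # 32400 lines per second for 1080i
```

By either rule the interlaced figure lands below 720, and 720p delivers more lines each second than 1080i.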
You'd think it strange for the FCC to permit two broadcast standards, but matters are worse than that. After years of development and testing, the major parties (broadcasters, TV manufacturers and computer folk) couldn't agree on a scanning standard. So the FCC compromised by specifying only the digital bitstream and RF transmission parameters. They left the scanning and encoding parameters up to "the Marketplace." And that has created a mess. There are currently 18 different transmission formats that HDTV sets must accommodate! These differ in whether they use interlaced or progressive scanning, in aspect ratio, and in pixel shape, alignment, and resolution.
The consensus for high definition seems to have settled on two resolutions: 1,920H x 1,080V and 1,280H x 720V. Both use square-pixel alignment and a 16:9 aspect ratio. The 1,920H x 1,080V format is a superset of SMPTE 240M, and cameras and other equipment are available to generate pictures in that format. The problem is that 1,920H x 1,080V can't be transmitted at 60 frames per second with progressive scanning. As a result, the high-speed rate is interlaced at 60 fields per second to conserve bandwidth, while the two lower rates use progressive scanning at 24 and 30 frames per second. The 1,280H x 720V format, on the other hand, uses progressive scanning at all frame rates: 24, 30 and 60 frames per second. The problem with this format is the lack of professional equipment to generate it. Most studios haven't invested in the latest cameras. It's hoped that this will soon become a non-issue, but there are still other major issues to be solved. One of the stickiest is how to push the extra information through a typical cable system. Welcome to The Numbers Game.
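To see why the 60-frame progressive version of the larger format is the odd one out, it helps to count raw pixels. The Python sketch below counts uncompressed pixels per second only; it says nothing about the compressed bit rate actually broadcast, and the format labels and function name are shorthand invented for this example.

```python
# Raw, uncompressed pixel counts only; labels and names are hypothetical shorthand.
def pixels_per_second(width, height, full_frames_per_second):
    """Pixels that must be described each second, before any compression."""
    return width * height * full_frames_per_second

# The 1,920H x 1,080V rates from the text. 60 interlaced fields carry the
# same number of lines as 30 full frames, so that entry uses 30.
rates_1080 = {
    "24 progressive": pixels_per_second(1920, 1080, 24),
    "30 progressive": pixels_per_second(1920, 1080, 30),
    "60 interlaced": pixels_per_second(1920, 1080, 30),
    "60 progressive": pixels_per_second(1920, 1080, 60),  # the rate that is not offered
}

for label, rate in rates_1080.items():
    print(f"{label:>15}: {rate / 1e6:6.1f} million pixels per second")
```

Run as written, the 60-frame progressive entry comes to twice the raw pixel rate of the interlaced version, which is the bandwidth that interlacing is being used to save.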