This article is from the MPEG FAQ, by
Frank Gadegast phade@cs.tu-berlin.de with numerous contributions
by others.
67 The 6 Steps to Claiming Bogusly High Compression Ratios: (MPEG-2)
MPEG video is often quoted as achieving compression ratios over 100:1,
when in reality the sweet spot rests between 8:1 and 30:1.
Here's how the fabled greater than 100:1 reduction ratio is derived for
the popular Compact Disc Video (White Book) bitrate of 1.15 Mbit/sec.
Step 1. Start with the oversampled rate
Most MPEG video sources originate at a higher sample rate than the
"target" sample rate encoded into the final MPEG bitstream. The most
popular studio signal, known canonically as D-1 or CCIR 601 digital
video, is coded at 270 Mbit/sec.
The constant, 270 Mbit/sec, can be derived as follows:
Luminance (Y): 858 samples/line x 525 lines/frame x 30 frames/sec x
10 bits/sample ~= 135 Mbit/sec
R-Y (Cr): 429 samples/line x 525 lines/frame x 30 frames/sec x
10 bits/sample ~= 68 Mbit/sec
B-Y (Cb): 429 samples/line x 525 lines/frame x 30 frames/sec x
10 bits/sample ~= 68 Mbit/sec
Total: 27 million samples/sec x 10 bits/sample = 270 Mbit/sec.
So, our compression ratio is: 270/1.15... an amazing 235:1 !!
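The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `component_rate` is made up for illustration, and it uses the nominal 30 frames/sec from the text, which gives 270.27 Mbit/sec rather than exactly 270 (the round figure corresponds to the NTSC rate of 30000/1001 frames/sec).

```python
# Raw 525-line CCIR 601 bit rate, blanking included, as in Step 1.
def component_rate(samples_per_line, lines=525, fps=30, bits=10):
    """Bits per second for one component at the nominal frame rate."""
    return samples_per_line * lines * fps * bits

y = component_rate(858)    # luminance
cr = component_rate(429)   # R-Y
cb = component_rate(429)   # B-Y
total = y + cr + cb
target = 1.15e6            # Compact Disc Video bit rate from the text

print(round(total / 1e6, 1))   # 270.3
print(round(total / target))   # 235
```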
Step 2. Include blanking intervals
Only 720 out of the 858 luminance samples per line contain active
picture information. In fact, the debate over the true number of
active samples is the cause of many hair-pulling cat-fights at TV
engineering seminars and conventions, so it is safer to say that the
number lies somewhere between 704 and 720. Likewise, only 480 lines
out of the 525 lines contain active picture information. Again, the
actual number is somewhere between 480 and 496. For the purposes of
MPEG-1's and MPEG-2's famous conformance points (Constrained Parameters
Bitstreams and Main Level, respectively), the number shall be 704
samples x 480 lines for luminance, and 352 samples x 480 lines for each
of the two chrominance pictures. The figures below, however, work out
only with the upper bound of 720 luminance samples per line (360 per
chrominance component); with 704 and 352, each product comes to roughly
101 Mbit/sec. Recomputing the source rate, we arrive at:
(luminance)
720 samples/line x 480 lines x 30 fps x 10 bits/sample ~= 104 Mbit/sec
(chrominance)
2 components x 360 samples/line x 480 lines x 30 fps x 10 bits/sample
~= 104 Mbit/sec
Total: ~ 207 Mbit/sec
The ratio (207/1.15) is now only 180:1
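As a rough check of the active-picture rate, the sketch below evaluates both ends of the 704 to 720 range. The helper name `active_rate` is an assumption made for this example. The 207 Mbit/sec total and the 180:1 ratio are reproduced by the 720-sample width, not by 704.

```python
# Active-picture 4:2:2 source rate at 10 bits/sample (Step 2).
def active_rate(luma_width, chroma_width, lines=480, fps=30, bits=10):
    luma = luma_width * lines * fps * bits
    chroma = 2 * chroma_width * lines * fps * bits   # two chroma components
    return luma + chroma

rates = {}
for luma_width in (704, 720):
    rates[luma_width] = active_rate(luma_width, luma_width // 2)
    mbit = round(rates[luma_width] / 1e6, 1)
    ratio = round(rates[luma_width] / 1.15e6)
    print(luma_width, mbit, ratio)
# 704 202.8 176
# 720 207.4 180
```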
Step 3. Include higher bits/sample
The MPEG sample precision is 8 bits. Studio equipment often quantizes
samples with 10 bits of accuracy. The 2-bit improvement to the dynamic
range is considered useful for suppressing noise in multi-generation
video.
The ratio is now only 180 * (8/10), or 144:1
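A small sketch of what the precision change does to a sample and to the rate. The rounding rule in `to_8bit` is one plausible choice, not something the text specifies, and the 720 x 480 frame is the size that yields the 207 Mbit/sec figure from Step 2.

```python
def to_8bit(sample_10bit):
    """Round a 10-bit sample (0..1023) to 8 bits (0..255), clamping the top."""
    return min((sample_10bit + 2) >> 2, 255)

rate_10bit = 720 * 480 * 30 * 10 * 2   # 4:2:2 averages 2 samples per pixel
rate_8bit = rate_10bit * 8 // 10       # same samples, 8 bits each

print(to_8bit(512), to_8bit(1023))     # 128 255
print(round(rate_8bit / 1e6, 1))       # 165.9
print(round(rate_8bit / 1.15e6))       # 144
```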
Step 4. Include higher chroma ratio
The famous CCIR 601 studio signal represents the chroma signals (Cb, Cr)
with half the horizontal sample density of the luminance signal, but
with full vertical resolution. This particular ratio of subsampled
components is known as 4:2:2. However, MPEG-1 and MPEG-2 Main Profile
specify the exclusive use of the 4:2:0 format, deemed sufficient for
consumer applications, where both chrominance signals have exactly half
the horizontal and vertical resolution of luminance (the MPEG Studio
Profile, however, centers around the 4:2:2 macroblock structure). Seen
from the perspective of pixels being comprised of samples from multiple
components, the 4:2:2 signal can be expressed as having an average of 2
samples per pixel (1 for Y, 0.5 for Cb, and 0.5 for Cr). Thanks to the
reduction in the vertical direction (resulting in a 352 x 240
chrominance frame), the 4:2:0 signal would, in effect, have an average
of 1.5 samples per pixel (1 for Y, and 0.25 for Cb and Cr each). Our
source video bit rate may now be recomputed as:
720 pixels x 480 lines x 30 fps x 8 bits/sample x 1.5 samples/pixel
~= 124 Mbit/sec
... and the ratio is now 108:1.
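The samples-per-pixel bookkeeping can be written out as follows. This is a minimal sketch; the table `CHROMA_FORMATS` and both function names are invented for the example, and 4:4:4 is listed only for comparison.

```python
from fractions import Fraction

# (horizontal, vertical) chroma subsampling factors for each format
CHROMA_FORMATS = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}

def samples_per_pixel(fmt):
    """One Y sample per pixel plus two chroma samples shared by h*v pixels."""
    h, v = CHROMA_FORMATS[fmt]
    return 1 + 2 * Fraction(1, h * v)

def source_rate(width, height, fps, bits, fmt):
    return width * height * fps * bits * samples_per_pixel(fmt)

rate = source_rate(720, 480, 30, 8, "4:2:0")
print(samples_per_pixel("4:2:2"), samples_per_pixel("4:2:0"))  # 2 3/2
print(round(float(rate) / 1e6, 1))     # 124.4
print(round(float(rate) / 1.15e6))     # 108
```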
Step 5. Include pre-subsampled image size
As a final act of pre-compression, the CCIR 601 frame is converted to
the SIF frame by a subsampling of 2:1 in both the horizontal and
vertical directions, or 4:1 overall. Quality horizontal subsampling
can be achieved by the application of a simple FIR filter (7 or 4 taps,
for example), and vertical subsampling by either dropping every other
field (in effect, dropping every other line) or again by an FIR filter
(regulated by an interfield motion detection algorithm). Our ratio now
becomes:
352 pixels x 240 lines x 30 fps x 8 bits/sample x 1.5 samples/pixel
~= 30 Mbit/sec !!
... and the ratio is now only 26:1
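The conversion to SIF can be sketched in plain Python as below. Several things here are assumptions rather than anything the text specifies: the 7-tap kernel values (a symmetric low-pass filter whose taps sum to 256), the edge handling by sample replication, a 704 x 480 input, and the simple line-dropping variant for the vertical direction. The names `TAPS`, `decimate_line` and `to_sif` are hypothetical.

```python
TAPS = (-29, 0, 88, 138, 88, 0, -29)   # assumed 7-tap kernel; sums to 256

def decimate_line(line):
    """Low-pass filter one line of 8-bit samples, keeping every other one."""
    n = len(line)
    out = []
    for x in range(0, n, 2):
        acc = 0
        for k, tap in enumerate(TAPS):
            # replicate edge samples where the kernel overhangs the line
            i = min(max(x + k - 3, 0), n - 1)
            acc += tap * line[i]
        # divide by 256 with rounding, then clamp to the 8-bit range
        out.append(min(max((acc + 128) >> 8, 0), 255))
    return out

def to_sif(frame):
    """Drop every other line (one field), then decimate each line 2:1."""
    return [decimate_line(row) for row in frame[::2]]

frame = [[128] * 704 for _ in range(480)]   # flat grey test frame
sif = to_sif(frame)
width, height = len(sif[0]), len(sif)
rate = width * height * 30 * 8 * 3 // 2     # 4:2:0, 1.5 samples per pixel

print(width, height)                 # 352 240
print(round(rate / 1e6, 1))          # 30.4
print(round(rate / 1.15e6))          # 26
```

A flat input should come out flat, which is a quick sanity check that the kernel has unity gain.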
Thus, the true A/B comparison should be between the source sequence at
the 30 Mbit/sec stage, the actual specified sample rate in the MPEG
bitstream, and the reconstructed sequence produced from the 1.15
Mbit/sec coded bitstream.
Step 6. Don't forget the 3:2 pulldown
A majority of high-end programs originate from film. Most of the
movies encoded onto Compact Disc Video were captured and reproduced
at 24 frames/sec. So, in such an image sequence, 6 out of the 30
frames every second are in fact redundant and need not be coded into
the MPEG bitstream, leading to the shocking discovery that the actual
source bit rate has really been 24 Mbit/sec all along, and the
compression ratio a mere 21:1 !!! Even at the seemingly modest 20:1
ratio, discrepancies will appear between the 24 Mbit/sec source
sequence and the reconstructed sequence. Only conservative ratios in
the neighborhood of 8:1 have demonstrated true transparency for
sequences with complex spatial-temporal characteristics (i.e. rapid,
divergent motion and sharp edges, textures, etc.). However, if the
video is carefully encoded by means of pre-processing and intelligent
distribution of bits, higher ratios can be made to appear at least
artifact-free.
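The pulldown bookkeeping from Step 6 can be illustrated with a short sketch. The function name `pulldown_fields` is made up here, and the model is simplified: each film frame is simply held for 3 and then 2 video fields in alternation, with no attention paid to field parity.

```python
def pulldown_fields(film_frames):
    """Spread film frames over video fields using the repeating 3,2 cadence."""
    fields = []
    for i, frame in enumerate(film_frames):
        fields.extend([frame] * (3 if i % 2 == 0 else 2))
    return fields

fields = pulldown_fields(range(24))      # one second of film
video_frames = len(fields) // 2          # two fields per video frame
redundant = video_frames - 24

film_rate = 352 * 240 * 24 * 8 * 3 // 2  # SIF, 4:2:0, 8 bits, 24 frames/sec

print(len(fields), video_frames, redundant)   # 60 30 6
print(round(film_rate / 1e6, 1))              # 24.3
print(round(film_rate / 1.15e6))              # 21
```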