This article is from the MPEG FAQ, by Frank Gadegast phade@cs.tu-berlin.de with numerous contributions by others.
Rate control and adaptive quantization are divided into three steps:
Step One: Target Bit Allocation
In Complexity Estimation, the global complexity measures assign
relative weights to each picture type (I,P,B). These weights (Xi, Xp,
Xb) are reflected by the typical coded frame size of I, P, and B
pictures (see typical frame size discussion). I pictures are usually
assigned the largest weight since they have the greatest stability
factor in an image sequence and contain the most new information in a
sequence. B pictures are assigned the smallest weight since the energy
in B pictures does not propagate into other pictures, and B pictures are
usually more highly correlated with neighboring P and I pictures than P
pictures are.
The bit target for a frame is based on the frame type, the number of
bits remaining in the Group of Pictures (GOP) allocation, and the
immediate statistical history of previously coded pictures (sort of a
moving average global rate control, if you will).
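
The allocation described above can be sketched in a few lines of Python.
This is only an illustration in the style of the formulas commonly
attributed to MPEG-2 Test Model 5, not a definitive implementation. The
function names, the constants K_P = 1.0 and K_B = 1.4, and the lower
bound of bit_rate / (8 * picture_rate) are assumptions that do not
appear in the text above.

```python
# Sketch of per-picture bit targets (Test Model style; names are hypothetical).
# K_P and K_B are assumed constants, not values given in this article.
K_P = 1.0
K_B = 1.4


def update_complexity(coded_bits, avg_quant):
    """Global complexity X = bits produced * average quantizer used.

    Called after a picture is coded, so X tracks recent history.
    """
    return coded_bits * avg_quant


def target_bits(ptype, remaining_bits, n_p, n_b, x_i, x_p, x_b,
                bit_rate, picture_rate):
    """Return the bit target for the next picture of type 'I', 'P' or 'B'.

    remaining_bits: bits left in the current GOP allocation
    n_p, n_b:       P and B pictures still to be coded in the GOP
    x_i, x_p, x_b:  global complexity measures (Xi, Xp, Xb)
    """
    if ptype == "I":
        denom = 1.0 + n_p * x_p / (x_i * K_P) + n_b * x_b / (x_i * K_B)
    elif ptype == "P":
        denom = n_p + n_b * K_P * x_b / (K_B * x_p)
    elif ptype == "B":
        denom = n_b + n_p * K_B * x_p / (K_P * x_b)
    else:
        raise ValueError("ptype must be 'I', 'P' or 'B'")
    floor = bit_rate / (8.0 * picture_rate)  # assumed lower bound on a target
    return max(remaining_bits / denom, floor)
```

With complexities in the ratio 160:60:42 and a GOP holding 3 P and 8 B
pictures, the I target comes out largest and the B target smallest,
which matches the weighting described in the text.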
Step Two: Rate Control via Buffer Monitoring
Rate control attempts to adjust bit allocation if there is a significant
difference between the target bits (anticipated bits) and actual coded
bits for a block of data. If the virtual buffer begins to overflow,
the macroblock quantization step size is increased, resulting in a
smaller yield of coded bits in subsequent macroblocks. Likewise, if
underflow begins, the step size is decreased. The Test Model
assumes that the target picture has a spatially uniform distribution
of bits. This is a safe approximation since spatial activity and
perceived quantization noise are almost inversely proportional. Of
course, the user is free to design a custom distribution, perhaps
targeting more bits in areas that contain more complex yet highly
perceptible data such as text.
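
The buffer feedback described above might be sketched as follows. This
is a minimal illustration under stated assumptions: the function names,
the reaction parameter 2 * bit_rate / picture_rate, and the scaling by
31 (the largest quantizer scale code) follow the usual Test Model
convention and are not given in the text above. The uniform spatial
distribution appears as the term target * mb_index / mb_count.

```python
def reaction_parameter(bit_rate, picture_rate):
    """Assumed scaling between buffer fullness and quantizer step."""
    return 2.0 * bit_rate / picture_rate


def reference_quant(d0, bits_so_far, target, mb_index, mb_count, reaction):
    """Return (fullness, reference quantizer) before coding a macroblock.

    d0:          virtual buffer fullness at the start of the picture
    bits_so_far: bits actually produced by macroblocks 0..mb_index-1
    target:      bit target for the whole picture
    mb_index:    zero-based index of the macroblock about to be coded
    mb_count:    number of macroblocks in the picture
    """
    # Bits we expected to have spent so far under a uniform distribution.
    expected = target * mb_index / mb_count
    fullness = d0 + bits_so_far - expected
    # Fuller buffer -> coarser quantizer; emptier buffer -> finer quantizer.
    # The value is left unclipped here; a real encoder would limit it.
    return fullness, fullness * 31.0 / reaction
```

Spending more bits than expected raises the fullness and hence the step
size for subsequent macroblocks; spending fewer lowers it.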
Step Three: Adaptive Quantization
The final step modulates the macroblock quantization step size obtained
in Step 2 by a local activity measure. The activity measure itself is
normalized against the most recently coded picture of the same type (I,
P, or B). The activity for a macroblock is chosen as the minimum among
the four 8x8 block luminance variances. Choosing the minimum reflects
the idea that a macroblock looks no better than its block of highest
visible distortion (weakest link in the chain): quantization noise is
most visible in the smoothest block.
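
A short Python sketch of this modulation step follows. It is an
illustration, not the Test Model source: the function names, the +1
offset on the activity, the normalization formula (which keeps the
factor between 0.5 and 2), and the clipping to 1..31 are assumptions
based on the usual Test Model formulation. Here avg_act stands for the
average activity of the most recently coded picture of the same type.

```python
def block_variance(block):
    """Variance of one 8x8 luminance block given as a flat list of samples."""
    n = len(block)
    mean = sum(block) / n
    return sum((p - mean) ** 2 for p in block) / n


def macroblock_activity(blocks):
    """Activity = 1 + minimum variance among the four luminance blocks."""
    return 1.0 + min(block_variance(b) for b in blocks)


def normalized_activity(act, avg_act):
    """Map activity to a factor in (0.5, 2.0) relative to avg_act."""
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)


def modulated_quant(ref_quant, act, avg_act):
    """Scale the Step 2 quantizer by local activity and clip to 1..31."""
    mquant = ref_quant * normalized_activity(act, avg_act)
    return max(1, min(31, round(mquant)))
```

A macroblock with one flat block gets the minimum activity and so a
finer step size, even if its other three blocks are heavily textured.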
Decision:
[deferred to later date]
 