This article is from the MPEG FAQ, by Frank Gadegast (phade@cs.tu-berlin.de), with numerous contributions by others.
During its formative stages, H.263 was known as "H.26P" or "H.26X". It
is an ITU-T standard for low-bitrate video and audio teleconferencing.
It is designed to be more efficient than H.261 (by at least 2 dB) at
bit rates below 64 kbit/sec (one ISDN B channel). The primary target
bit rate, approximately 27,000 bits/sec, is the payload rate of the
V.34 (a.k.a. "V.Fast" or "V.Last") modem standard. In a typical
scenario, 20 kbit/sec would be allocated to the video portion and
6.5 kbit/sec to the speech portion.
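The arithmetic behind this allocation can be checked with a short
sketch. The three rates are the ones quoted above; the frame rate of
10 pictures/sec is purely an illustrative assumption (the text does not
state one), and the function names are hypothetical.

```python
# Bit budget for the V.34 scenario described in the text.
PAYLOAD_BPS = 27_000   # approximate V.34 payload rate (from the text)
VIDEO_BPS = 20_000     # video allocation (from the text)
SPEECH_BPS = 6_500     # speech allocation (from the text)

ASSUMED_FPS = 10       # ASSUMPTION: illustrative frame rate, not from the text


def remaining_bps(payload=PAYLOAD_BPS, video=VIDEO_BPS, speech=SPEECH_BPS):
    """Bits per second left after the video and speech allocations."""
    return payload - video - speech


def bits_per_frame(video_bps, frames_per_sec):
    """Average number of coded bits available to each picture."""
    return video_bps / frames_per_sec


print(remaining_bps())                         # 500
print(bits_per_frame(VIDEO_BPS, ASSUMED_FPS))  # 2000.0
```

Under these assumptions each coded picture gets about 2000 bits on
average, which gives a feel for how aggressive the compression has to
be at this rate.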
Since the H.261 syntax was defined in 1990, techniques and
implementation power have naturally improved. H.263 collects many of
the advanced methods proposed during MPEG's formative stages into a
syntax that has more in common with MPEG-1 video than with H.261.
The detailed differences and similarities are summarized below:
Sample rate, precision, and color space:
H.263 pictures are transmitted with QCIF dimensions (176 x 144
luminance samples). MPEG and JPEG allow nearly any picture size to be
described in the headers. A fixed
picture size promotes interoperability by forcing all implementors to
operate at a common rate, rather than by allowing implementors to get
away with whatever lowest sample rate the consumer can be tricked into
buying. Another reason for a fixed sample rate is that, unlike MPEG,
which is generic, H.263 is geared towards a specific application
(teleconferencing). Other MPEG applications such as CD Video and Cable
TV define their own fixed parameters. The color space is again YCbCr,
with a 4:2:0 macroblock structure and 8 bits of uniform sample
precision.
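To make the fixed format concrete, the following sketch counts the
samples in one QCIF picture. It assumes the standard QCIF luminance
size of 176 x 144 and 16 x 16 luminance macroblocks; the 10
pictures/sec frame rate and the 20 kbit/sec video rate used for the
ratio are illustrative assumptions, and the function names are
hypothetical.

```python
# Raw size of one QCIF picture in YCbCr 4:2:0 with 8-bit samples.
QCIF_WIDTH, QCIF_HEIGHT = 176, 144  # luminance dimensions of QCIF
BITS_PER_SAMPLE = 8                 # uniform sample precision (from the text)
MACROBLOCK_SIZE = 16                # luminance samples per macroblock side


def samples_per_frame(width=QCIF_WIDTH, height=QCIF_HEIGHT):
    """Luminance samples plus two chrominance planes.

    In 4:2:0, Cb and Cr are each subsampled by 2 in both directions.
    """
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return luma + chroma


def macroblocks_per_frame(width=QCIF_WIDTH, height=QCIF_HEIGHT):
    """Number of 16 x 16 macroblocks covering the picture."""
    return (width // MACROBLOCK_SIZE) * (height // MACROBLOCK_SIZE)


def raw_bits_per_frame():
    """Uncompressed size of one picture in bits."""
    return samples_per_frame() * BITS_PER_SAMPLE


ASSUMED_FPS = 10            # ASSUMPTION: illustrative frame rate
ASSUMED_VIDEO_BPS = 20_000  # ASSUMPTION: illustrative coded video rate

ratio = raw_bits_per_frame() * ASSUMED_FPS / ASSUMED_VIDEO_BPS

print(samples_per_frame())      # 38016
print(macroblocks_per_frame())  # 99
print(raw_bits_per_frame())     # 304128
print(round(ratio))             # 152
```

Under these assumptions the coder has to reduce the raw data by a
factor of roughly 150.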
[details deferred]
 