Understanding H.264 video compression

11 Mar 2009
By Les Simmonds
During 2004 we presented a two-part series on “MPEG - The Standards and History”, available at http://www.lessimmonds.com.au/papers.html. The time has now come to look at H.264, which is starting to show up in CCTV systems.
So what is H.264, and what are the advantages and disadvantages of this new compression technology? Essentially, H.264 is a new video compression scheme which is set to become the worldwide digital video standard for consumer electronics and personal computers. H.264 has already been selected as a key compression scheme (codec) for the new optical disc formats, such as Blu-ray Disc.


The intent of the H.264 standard project was to create a standard capable of providing good video quality at substantially lower bit rates than previous standards (e.g. half or less the bit rate of MPEG-2, H.263, or MPEG-4 Part 2), without increasing the complexity of design so much that it would be impractical or excessively expensive to implement.


An additional goal was to provide enough flexibility to allow the standard to be applied to a wide variety of applications on a wide variety of networks and systems, including low and high bit rates, low and high resolution video, broadcast, DVD storage, RTP/IP packet networks, and ITU-T multimedia telephony systems.

H.264 has been adopted by the Moving Picture Experts Group (MPEG) to be a key video compression scheme in the MPEG-4 format for digital media exchange. H.264 is sometimes referred to as “MPEG-4 Part 10” (part of the MPEG-4 specification), or as “AVC” (MPEG-4’s Advanced Video Coding). This new compression scheme has been developed in response to technical factors and the needs of an evolving market:

* MPEG-2 and other older video codecs are relatively inefficient.

* Much greater computational resources are available today.

* High Definition video is becoming pervasive, and there is a strong need to store and transmit the larger quantity of HD data more efficiently (about 6 times more data than Standard Definition video).
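
The “about 6 times” figure can be checked from pixel counts alone. The short sketch below assumes a 1920x1080 frame for HD and 720x480 or 720x576 frames for SD; these frame sizes are illustrative assumptions rather than figures from this article, and the comparison ignores frame rate and colour sampling.

```python
# Assumed frame sizes in pixels (width, height); not stated in the article.
HD = (1920, 1080)
SD_NTSC = (720, 480)
SD_PAL = (720, 576)

def pixel_ratio(a, b):
    """Return how many times more pixels frame size a has than frame size b."""
    return (a[0] * a[1]) / (b[0] * b[1])

print(pixel_ratio(HD, SD_NTSC))  # 6.0
print(pixel_ratio(HD, SD_PAL))   # 5.0
```

Under these assumptions an HD frame carries five to six times as many pixels as an SD frame, which is consistent with the figure quoted above.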
 
H.264 clearly has a bright future, mostly because it offers much better compression efficiency than previous compression schemes. The improved efficiency translates into three main benefits, or a combination of them:

* Higher video quality at a given bit-rate; reduction in artifacts such as blockiness, color bands, etc.

* Higher resolution; as the video world transitions to High Definition, a mechanism is needed to deliver it. The new Foxtel HD transmission uses H.264.

* Lower storage requirements; these will allow large amounts of content to be delivered on a single disc.

It is likely that future delivery of Digital TV signals (both in SD and HD) will use H.264. For SD, the same content at a given quality can be delivered with a lower bit-rate (allowing for more channels to be transmitted on the same medium), or higher quality and/or higher resolution can be delivered at the same bit-rate. Also, many CCTV suppliers are now showing their new systems with H.264. Future Digital TV delivery vehicles include:

* Satellite

* Cable

* IPTV (over cable or DSL)

* Over-the-Air broadcast

* CCTV systems.

Some of the above are already turning to H.264 as a standard; worldwide, more are likely to announce shortly that they are following.

High-Definition Optical Discs

High-definition video is gaining in popularity, aided by the falling cost of HD television sets. A key deployment vehicle for High Definition content is likely to be optical discs carrying this content. The Blu-ray Disc format is currently proposed. This disc format has chosen to adopt H.264 as one of the key means of storing the HD video content. The high bit-rates that are used to encode the video on HD discs will be particularly challenging for today’s PCs; we will examine this further after we compare MPEG-2 and H.264.

Differences between H.264 and MPEG-2 video decoding

MPEG-2 is today’s dominant video compression scheme. It is used to encode video on DVDs and to stream internet video, and it is the basis for most worldwide digital television (over-the-air, cable and satellite). While MPEG-2 is a video-only format, MPEG-4 is a more generic media exchange format, with H.264 as one of several video compression schemes offered by MPEG-4.

There are numerous differences between these compression schemes, but a key point is that H.264 has been developed to deliver much higher compression ratios than MPEG-2. However, this greater degree of compression (up to 2-3 times more efficient than MPEG-2) comes at the expense of much higher computational requirements. The additional computational complexity is spread throughout the decoding process, but three key areas stand out in adding to the new overhead: entropy encoding, smaller block size and in-loop deblocking.

* Entropy encoding

Entropy encoding is a technique used to store data more compactly by examining how often patterns occur within it and re-encoding the data in a smaller form. H.264 allows for a variety of entropy encoding schemes, compared to the fixed scheme employed by MPEG-2. In particular, the new CABAC (Context-based Adaptive Binary Arithmetic Coding) scheme improves compression efficiency by 5-20% but is much more computationally demanding than MPEG-2’s entropy encoding.
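
To illustrate why examining the frequency of patterns saves space, the sketch below compares a fixed-length code with the theoretical minimum (the Shannon entropy) for a small stream of symbols. It is not CABAC or any scheme from the H.264 standard; the function names and the sample data are made up for illustration.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(symbols):
    """Shannon entropy: the lowest average bits per symbol that any
    entropy coder could achieve for these symbol frequencies."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def fixed_bits_per_symbol(symbols):
    """Bits per symbol when every distinct symbol gets a code of equal length."""
    distinct = len(set(symbols))
    return max(1, math.ceil(math.log2(distinct)))

# Hypothetical data: symbol 0 is very common, symbols 2 and 3 are rare.
data = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2

print(fixed_bits_per_symbol(data))    # 2
print(entropy_bits_per_symbol(data))  # 1.75
```

Because the common symbol can be given a shorter code than the rare ones, the average cost falls below the fixed-length figure. Adaptive schemes such as CABAC go further by updating their frequency estimates as decoding proceeds, which is part of the reason they cost more to compute.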

* Smaller block size

MPEG-2, H.264, and most other codecs treat portions of the video image in blocks, often processed in isolation from one another. Independently of the number of video pixels in the image, the number of blocks has an effect on the computational requirements.

While MPEG-2 has a fixed block size of 16 pixels on a side (referred to as 16x16), H.264 permits the simultaneous mixing of different block sizes (down to 4x4 pixels). This permits the codec to accurately define fine detail (with more, smaller blocks) while not having to ‘waste’ small blocks on coarse detail. In this way, for example, patches of blue sky in a video image can use large blocks, while the finer details of a forest in the frame could be encoded with smaller blocks.
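
The idea of mixing block sizes can be sketched as a recursive split: keep a block whole where the picture is smooth, and divide it where there is detail. The code below is a simplified illustration only. It uses square blocks and a pixel-variance threshold, both of which are assumptions made for this example; a real H.264 encoder also offers rectangular partitions and chooses between them on other criteria.

```python
def variance(frame, x, y, size):
    """Variance of the pixel values in the size x size block at (x, y)."""
    pixels = [frame[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)

def partition(frame, x, y, size, threshold=100.0, min_size=4):
    """Return a list of (x, y, size) blocks covering the given square.
    Smooth areas stay as one block; detailed areas are split in four,
    down to min_size. The threshold value is an arbitrary assumption."""
    if size <= min_size or variance(frame, x, y, size) <= threshold:
        return [(x, y, size)]
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += partition(frame, x + dx, y + dy, half, threshold, min_size)
    return blocks

# Hypothetical 16x16 test frame: flat "sky" (value 100) everywhere except a
# detailed "forest" checkerboard in the top-left 8x8 corner.
frame = [[100] * 16 for _ in range(16)]
for r in range(8):
    for c in range(8):
        frame[r][c] = 255 if (r + c) % 2 else 0

blocks = partition(frame, 0, 0, 16)
print(sorted(size for _, _, size in blocks))  # [4, 4, 4, 4, 8, 8, 8]
```

The detailed corner ends up as four 4x4 blocks while the three flat quarters each remain a single 8x8 block, so small blocks are spent only where they are needed.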

* In-loop deblocking

When the bit-rate of an MPEG-2 stream is low, the blocks (and specifically, the boundaries between them) can be very visible and can clearly detract from the visual quality of the video. “De-blocking” is a post-processing step that adaptively smoothes the edges between adjacent blocks. De-blocking is computationally “expensive”.

In the past, de-blocking has been an optional step in decoding, only enabled when it was possible for the playback device (such as a PC) to perform it in real time. ATI has offered de-blocking capability for playback of video for some time. In H.264, however, in-loop deblocking is introduced. “In-loop” means that the previously ‘de-blocked’ image data, in addition to being displayed, is actually used as part of the decoding of future frames; it is in the decoding ‘loop’. Because of this, the de-blocking is no longer optional. It adds to the quality of the decoded video, but also adds significantly to the computational overhead of H.264 decode.
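
The adaptive smoothing described above can be sketched for a single row of pixels. This is a toy filter written for illustration, not the filter defined in the H.264 standard: the 4-pixel block size, the threshold of 20 and the quarter-step adjustment are all assumptions. The point it shows is the adaptive part: a small step at a block boundary is treated as a compression artifact and softened, while a large step is treated as a real edge in the picture and left alone.

```python
def deblock_row(row, block_size=4, edge_threshold=20):
    """Soften small steps at block boundaries in one row of pixel values.
    Steps of edge_threshold or more are assumed to be real picture edges
    and are not modified. All parameter values are illustrative."""
    out = list(row)
    for b in range(block_size, len(row), block_size):
        p0, q0 = row[b - 1], row[b]  # the two pixels either side of the boundary
        step = q0 - p0
        if 0 < abs(step) < edge_threshold:
            delta = int(step / 4)    # move each side a quarter of the way
            out[b - 1] = p0 + delta
            out[b] = q0 - delta
    return out

# A small step between blocks (likely an artifact) is smoothed.
print(deblock_row([100] * 4 + [108] * 4))
# [100, 100, 100, 102, 106, 108, 108, 108]

# A large step (likely a real edge) is preserved.
print(deblock_row([0] * 4 + [200] * 4))
# [0, 0, 0, 0, 200, 200, 200, 200]
```

In an in-loop arrangement the output of a filter like this, rather than the unfiltered picture, is what later frames are predicted from, which is why every decoder has to run it.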

In the coming months the CCTV industry will see a significant increase in the use of H.264 compression technology by most CCTV manufacturers.
 

Acknowledgment: ATI Technologies Inc.

* Les Simmonds is an independent CCTV consultant. Email: les@cctvconsultants.com.au 
