Benefits of h.264 as compared to ASP codecs
Mutant_Fruit
30th December 2005, 15:38
I'm doing a "research project" type thing as part of my course in college. We were given the wonderfully vague advice to "write a technical article on a subject of your choice", so I decided to write on the advantages of H.264 over previous video codecs.
Here's the basic gist of what I have so far (major differences):
Error Resilience: H.264 is more error resilient with regard to packet loss and file corruption. If a "packet" is lost/damaged, all other packets can still be decoded. They are self-contained.
Improved prediction: H.264 allows for multiple reference frames to be used for prediction as opposed to the 2 allowed for in ASP codecs (one "past" frame and one "future" frame).
Entropy Encoding: We have CABAC, which is a far superior entropy encoder and (according to this white paper) is about twice as efficient as previous methods.
Block sizes: H.264 can use smaller block sizes, from 16x16 down to 4x4, which increases motion estimation accuracy.
Is there anything else that's majorly different between ASP codecs and H.264 that I haven't mentioned? Maybe there's something I considered minor and didn't mention which is actually a major feature.
I'm currently sifting through a dozen PDFs, so it's quite likely I'll be editing this list again.
Thanks for any help offered.
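To make the block-size point above concrete, here is a minimal sketch that lists the luma partition sizes H.264 allows for inter prediction and counts how many blocks of each size tile one 16x16 macroblock. The function and constant names are made up for illustration; they are not taken from any encoder.

```python
# Luma partition sizes (width, height) usable for inter prediction in H.264.
MACROBLOCK_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]
# Each 8x8 partition may be split further into these sub-partitions.
SUB_MACROBLOCK_PARTITIONS = [(8, 8), (8, 4), (4, 8), (4, 4)]


def blocks_per_macroblock(width, height):
    """Number of blocks of the given size needed to tile one 16x16 macroblock."""
    return (16 // width) * (16 // height)


# Skip the duplicate 8x8 entry when joining the two lists.
ALL_SIZES = MACROBLOCK_PARTITIONS + SUB_MACROBLOCK_PARTITIONS[1:]

for w, h in ALL_SIZES:
    print(f"{w}x{h}: {blocks_per_macroblock(w, h)} block(s) per macroblock")
```

Each block carries its own motion vector, so a macroblock split all the way down to 4x4 can describe 16 separate motions, which is where the extra accuracy (and the extra signalling cost) comes from.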
smok3
30th December 2005, 16:07
I'm not an expert, but here is an extract from
http://forum.doom9.org/showthread.php?t=96059 :
- Loop/Deblocking Filter and use of the filtered frames as references (think low bitrate)
- Weighted Prediction (think fadein, fadeout, dissolve maybe?)
- Variable Block Sizes/Macroblock Partitions (including steps like 8x4...) - expanded your point 4.
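A rough illustration of why weighted prediction helps with fades: if the current frame is just a darker copy of the reference, scaling the reference before subtracting leaves almost nothing to encode. This is a toy sketch with floating-point weights on four made-up pixel values; the standard itself uses integer weights and offsets, and the helper names here are hypothetical.

```python
def weighted_prediction(reference, weight, offset=0):
    """Scale and shift the reference samples before using them as a prediction."""
    return [weight * r + offset for r in reference]


def residual_energy(current, prediction):
    """Sum of squared differences, a stand-in for how much is left to encode."""
    return sum((c - p) ** 2 for c, p in zip(current, prediction))


reference = [100, 120, 140, 160]
current = [r // 2 for r in reference]  # fade to half brightness: 50, 60, 70, 80

plain = residual_energy(current, reference)
weighted = residual_energy(current, weighted_prediction(reference, 0.5))

print("residual energy without weighting:", plain)
print("residual energy with weight 0.5:  ", weighted)
```

Real fades are never this clean, so the residual will not drop to zero in practice, but the direction of the effect is the same.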
foxyshadis
30th December 2005, 16:26
One nitpick: CABAC isn't really twice as good, it's more like 10-20% better vs. CAVLC, and if you count from the original DCT coefficients, it's a lot less than that. (Of course even that is nothing to sneeze at.)
Also error resilience is an optional part of the spec that few encoders implement.
Another major feature: Built in deblocking, ie inloop. The best thing to happen to low-bitrate DCT-based encoding in ages. I'm sure you know all about it by now so I won't describe it.
IvS
30th December 2005, 16:30
As always, don't believe theories, believe implementations. "Twice as efficient" it's not.
From my testing, x264 doesn't give any better quality than XviD when CABAC and deblocking are disabled, and in fact the results are worse sometimes, blockier, even with exhaustive motion estimation, RDO partition decision, b-frames with adaptive and weighted biprediction, all partition options on and 8x8 transform. With CABAC and deblocking filter on, it can be better, but again not always noticeably.
What is promising is that without CABAC and deblocking filter, the decoding speed is about the same as ASP with qpel.
An important thing you should mention is that WITH advanced features like CABAC and the deblocking filter, along with the improved quality comes the much more demanding decoding performance. About 3 times as slow as ASP and twice as slow as ASP with qpel, and worse as the resolution of the video is higher. 1280x720 (720p) decoding is pretty much impossible with normal system setups while with ASP it's rarely an issue.
Mutant_Fruit
30th December 2005, 16:36
I'm not an expert, but here is an extract from
http://forum.doom9.org/showthread.php?t=96059 :
Aye, I've been looking at that alright. It's where I got some of the links for the PDFs I've been reading.
I didn't think the inloop filter was a great new feature (is it?) because ASP codecs have deblocking filters (albeit ones that aren't integrated into the specs). Is this really such an important move?
Weighted Prediction: I don't really understand this as of yet. According to the PDF I'm reading now, "overlay coding techniques" are responsible for "improved efficiency during fading cuts". As much as 50% bitrate reduction for same quality. "BT-PAW overlay coding performs better than B-PAW overlay coding that outperforms simple overlay coding."
What exactly is BT-PAW coding? That doesn't seem to be explained here.
EDIT: Looks like I'll have to take a much bigger interest in the inloop filter. It seems to be much more important than I originally thought.
EDIT2: I know I said "twice as efficient", but I assume that means in the optimal conditions, which never occur in real life. It's just like the way B-Frames were supposed to reduce filesize by 25% for the same perceivable quality, but the realistic gain was less than 1/2 that (iirc).
IvS
30th December 2005, 16:42
I can't really explain the features well, there are quite experienced people here who could :) But notice "As much as 50%." That doesn't really mean 50% improvement or even close to it is necessarily the usual result, it's just the usual semi-misleading stuff common in papers and articles.
Edit: OK, I just saw your Edit 2 :).
Mutant_Fruit
30th December 2005, 17:10
Hrmm, I suppose the inloop filter is important in the way that it is *always* going to be applied to the video, so the quality benefits are always going to be there, as opposed to an external deblocker which may or may not actually be activated.
Other than that, is the inloop filter significantly better than other methods? I can't seem to see any mention of advantages it offers by being built-in as opposed to external.
smok3
30th December 2005, 17:21
from what I understand it is just that: the filtered images are used as references, and this should decrease blockiness (everybody seems to agree that avc is better at low bitrates than asp - so that's not just theory). - hopefully somebody with more knowledge can give some proper info on how that really works, in x264 for example.
edit: also this can give you (the encoding guy) full control over how the video should look to the end user (instead of relying on the end user selecting proper post processing, which is not the same thing anyway.)
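The "filtered images are used as references" part can be sketched with a toy decoder loop. Everything here is a simplification for illustration: frames are short lists of numbers, each frame is "previous reference + residual", and `deblock` is a plain smoothing filter standing in for the real, much more selective deblocking filter. The only thing the sketch is meant to show is which picture ends up in the reference buffer.

```python
def deblock(frame):
    """Toy smoothing filter standing in for a real deblocking filter."""
    out = list(frame)
    for i in range(1, len(frame) - 1):
        out[i] = (frame[i - 1] + 2 * frame[i] + frame[i + 1]) / 4
    return out


def decode(residuals, in_loop):
    """Decode frames where each frame = previous reference + residual.

    Returns (displayed frames, final reference picture).
    """
    reference = [0.0] * len(residuals[0])
    displayed = []
    for residual in residuals:
        reconstructed = [r + d for r, d in zip(reference, residual)]
        filtered = deblock(reconstructed)
        displayed.append(filtered)
        # In-loop: the filtered picture becomes the next reference.
        # Post-processing: the unfiltered picture stays the reference.
        reference = filtered if in_loop else reconstructed
    return displayed, reference


residuals = [[0, 0, 8, 8], [0, 0, 0, 0]]  # a hard edge, then "no change"
shown_loop, ref_loop = decode(residuals, in_loop=True)
shown_post, ref_post = decode(residuals, in_loop=False)

print("in-loop reference:        ", ref_loop)
print("post-processing reference:", ref_post)
```

With post-processing the hard edge stays in the reference forever and only the displayed copy is smoothed; with the in-loop variant later frames predict from the smoothed picture. That is also why the in-loop filter has to be part of the standard: encoder and decoder must apply exactly the same filter or their references drift apart.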
nm
30th December 2005, 17:23
As always, don't believe theories, believe implementations. "Twice as efficient" it's not.
From my testing, x264 doesn't give any better quality than XviD when CABAC and deblocking are disabled, and in fact the results are worse sometimes, blockier, even with exhaustive motion estimation, RDO partition decision, b-frames with adaptive and weighted biprediction, all partition options on and 8x8 transform. With CABAC and deblocking filter on, it can be better, but again not always noticeably.
It would be nice to know what kind of bitrates or QP you used in your testing.
What is promising is that without CABAC and deblocking filter, the decoding speed is about the same as ASP with qpel.
An important thing you should mention is that WITH advanced features like CABAC and the deblocking filter, along with the improved quality comes the much more demanding decoding performance. About 3 times as slow as ASP and twice as slow as ASP with qpel, and worse as the resolution of the video is higher. 1280x720 (720p) decoding is pretty much impossible with normal system setups while with ASP it's rarely an issue.
According to many reports, the CoreAVC decoder looks very promising. Even 1080p (at 23.976 fps) decoding on an AthlonXP was claimed to be possible. If that is with CABAC and inloop deblocking, it is very impressive indeed. I have yet to see this myself though, since the decoder is only available on one platform.
smok3
30th December 2005, 17:30
nm, from my quick tests, mplayer is still the fastest thingy for winxp (if you use the proper compile.)
IvS
30th December 2005, 17:36
nm: from about 1000kbps to 5000kbps, indeed I haven't tested lower bitrates really but I was more interested in preserving transparent quality, plus the videos I had to encode were pretty demanding (though not specially crafted or whatever, just vids captured from a Canon S2 camera at about 16000kbps MJPEG), at least 4000kbps was often needed to make them look sharp enough and not smoothed or blocky (with both XviD and x264)
Revgen
30th December 2005, 17:37
CoreAVC indeed does play 1920x1080 material very well on my AMD X2 4600+ (that's 3800+ if you look at it as a single cpu) without any problems. FFDshow is too slow playing back these files. However TCMP does lack support for CQMs and other advanced AVC features, and I can tell because some of my x264 videos play slower on TCMP than they do on FFDshow because I used these features during encoding.
smok3
30th December 2005, 17:52
wikipedia has some good reading too, to see the whole picture primarily:
http://en.wikipedia.org/wiki/H.264
Manao
30th December 2005, 18:34
It is definitely not more error resilient for everyday use. Anybody using main profile ( i.e. everybody ) ends up with a codec less error resilient than good old mpeg2. And cabac makes things even worse in that regard.
Error resilience only comes with extended profile, and I don't know of any codec that can exploit the error resilience functions that come along with that profile.
Inloop deblocking is definitely a plus, but it's not to be compared to the postprocessing deblockers that are used with mpeg2 / mpeg4 asp. The 'Inloop' aspect of the deblocking is really the important part, since it allows the prediction to be improved. In sheer efficiency gain at medium & low bitrates, inloop brings as much gain as cabac over cavlc. Moreover, if the decoder supports it, you can still show the non deblocked picture while still using inloop in the decoding process.
Cabac always brings a 10% size reduction gain in comparison to cavlc, whatever the other features used are. However, it often happens that a 10% reduction gain, translated into a quality gain, isn't visually noticeable, but that's another matter.
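One way to see where an arithmetic coder like CABAC can gain over a variable-length code, and why the gain depends on the data: a code that spends a whole number of bits on every symbol can never go below 1 bit per symbol, while an arithmetic coder can approach the entropy of the source, which is lower when one symbol dominates. The sketch below only computes that entropy for a two-symbol source; it is an illustration of the principle, not a model of CABAC or CAVLC (CAVLC codes groups of values, so the real gap is smaller, in line with the 10-20% mentioned in this thread).

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits/symbol of a two-symbol source with P(symbol A) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


for p in (0.5, 0.7, 0.9, 0.99):
    print(f"P(most likely symbol) = {p}: "
          f"{binary_entropy(p):.3f} bits/symbol achievable, "
          f"1 bit/symbol minimum for a per-symbol VLC")
```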
Mutant_Fruit
30th December 2005, 19:06
Well, the spec allows for extra error resilience features, so the fact that these aren't implemented (as of yet) in any encoders isn't as relevant. The point is that AVC does offer these features to anyone willing to implement them, and the previous generation of codecs didn't have these extra resilience features.
Personally, I can't really see the error resilience being used in anything other than web streaming, and even then, maybe not.
The 'Inloop' aspect of the deblocking is really the important part, since it allows to improve the prediction. In sheer efficiency gain at medium & low bitrates, inloop brings as much gain as cabac over cavlc.
I didn't realise that inloop deblocking led to more accurate prediction and then better efficiency. (I heard mention of it in one of the PDFs but I think I skimmed over it as irrelevant :rolleyes: ). That's definitely worthy of a mention.
Thanks again!
Manao
30th December 2005, 20:05
that AVC does offer these features to anyone willing to implement them
Yes, but if you use them, you fall into the extended profile, which disables most of the quality improving features of AVC : cabac, bframes, 8x8 transform, custom matrices, weighted prediction, interlaced encoding...
Mutant_Fruit
16th January 2006, 19:20
Let me just double check i have this right, and let me know if i've left anything out:
Features of H.264 divided by profile.
Baseline:
No B-Frames (standard i/p frames only)
CAVLC Only (no cabac)
no trellis
i4x4, P4x4, P8x8, B8x8 macroblocks
up to 16 refs
inloop deblocker
Progressive only
Main Profile:
up to 16 B-Frames
adaptive bframes
weighted b prediction
bidirectional M.E.
i4x4, P4x4, P8x8, B8x8 macroblocks
Cabac allowed (CAVLC also an option)
Trellis allowed
up to 16 refs
in loop deblocker
Progressive + Interlaced supported
High Profile:
up to 16 B-Frames
adaptive bframes
weighted b prediction
bidirectional M.E.
Adaptive DCT (for i8x8 usage)
i4x4, P4x4, P8x8, B8x8 and i8x8 macroblocks
Cabac allowed (CAVLC also an option)
Trellis allowed
up to 16 refs
in loop deblocker
Custom Quantizers
Lossless encoding possible
Progressive + Interlaced supported
Extended Profile:
CAVLC Only (no cabac)
Allows I, P, B, SI and SP frames
Progressive only
Inloop deblocker
Supports redundant frames for error correction type of thing
General features
RDO used (i know this is standard agnostic: Any codec can feature this algorithm. just thought i'd mention it here though).
Slices (can be used in any profile)
can use spatial prediction as opposed to just temporal (For b-frames only, so it won't work for baseline).
Supports quarter pel subpixel refinement
IDR frames introduced... it's a frame that doesn't get referenced by any frames after it. Allows for cutting when using multiple refs
EDITED: Put in a few changes as mentioned by manao. Thanks!
Manao
16th January 2006, 19:26
Codec Features of x264 divided by profile.
H264. x264 is a codec that follows the h264/AVC standard, and that doesn't implement everything the standard offers ( no interlacing, no fancy extended profile stuff, not all weighted prediction modes )
bidirectional M.E.
Not a feature. It's a way of finding motion, and isn't part of the standard.
Trellis allowed
Same thing. It's an optimisation of the codec, not a feature of the standard. And it can be used for all profiles ( though since in x264 it works only with cabac, it won't be used in baseline )
General features
* RDO used
* Slices (can be used in any profile?)
* can use spatial prediction as opposed to just temporal
* Supports quarter pel subpixel refinement
RDO is a codec implementation feature, not a standard's. Slices can be used in any profile. Spatial / temporal prediction refers to Bframes ( so no baseline profile ).
Mutant_Fruit
16th January 2006, 19:33
thanks for that, edited my post above. Freudian slip when I wrote x264 instead of H.264. It's the only codec I use, so I got a bit carried away :p
akupenguin
16th January 2006, 21:44
You still haven't (re)moved trellis, bidir ME, and adaptive bframes.
B-frames are not limited to 16, that's just a random large number in x264 so that we can statically allocate some arrays.
No B8x8 in baseline, since there's no B-frames.
There are B4x4 blocks too, just nobody uses them.
On the other hand, if you really do want to keep your x264-specific options/limitations, then do call it "features of x264" and delete extended profile and interlacing.
Mutant_Fruit
16th January 2006, 21:59
On the other hand, if you really do want to keep your x264-specific options/limitations, then do call it "features of x264" and delete extended profile and interlacing.
Nope, it's just I haven't come across a nice table in any of my PDFs showing me the limits of the various things (such as max b-frames). I'm reading through the latest draft of the specs, but since there's over 300 pages in it, it's hard to find all the relevant info I might need. 'Tis why I'm posting here.
No B8x8 in baseline, since there's no B-frames.
Good point. I'll fix that now. *feels stupid*
Hopefully I'll come across some nice tables in the spec PDF listing these kinds of details. I probably just haven't looked in the right place yet.
Thanks.
EDIT: Just found Annex A. Seems to contain the info I need...
puffpio
17th January 2006, 10:38
the h264 spec allows up to 32 reference frames
akupenguin
17th January 2006, 12:08
the h264 spec allows up to 32 reference frames
No.
It allows up to 16 reference frames, and if they're interlaced that's 32 fields.
IDR frames introduced... it's a frame that doesn't get referenced by any frames after it. Allowes for cutting when using multiple refs
IDR-frames are referenced. In fact, they're identical to what you thought of as I-frames or keyframes in previous standards. It's the non-IDR I-frames that are slightly different in that multirefs interfere with seeking/cutting there.
Mutant_Fruit
17th January 2006, 19:32
IDR-frames are referenced. In fact, they're identical to what you thought of as I-frames or keyframes in previous standards. It's the non-IDR I-frames that are slightly different in that multirefs interfere with seeking/cutting there.
Aye, but if an IDR frame is frame number 100, it can only be referenced by frames less than 100! No frame after 100 can reference that particular IDR frame. At least that's what I understand from reading the spec. This means that if the video is cut at that IDR frame, everything can still be reconstructed correctly.
EDIT: I assume that frame 99 couldn't reference 101 either (if 100 is an IDR frame). Otherwise there'd be no point in IDR frames that I can see.
Manao
17th January 2006, 19:38
Aye, but if an IDR frame is frame number 100, it can only be referenced by frames less than 100! No frame after 100 can reference that particular IDR frame.
No
See IDR as a wall. Nothing goes through. Only things beyond the wall can touch it. So frames before the IDR can't reference it nor any of the following frames. And frames after the IDR can reference it, but can't reference any frame before it.
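The wall rule can be written down as a small check. This is only a sketch of the rule as described here, using plain frame numbers like in the example with frame 100; a real decoder works with decoding order and reference lists, which this ignores, and the function names are made up.

```python
from bisect import bisect_right


def segment(frame, idr_frames):
    """Count the IDR frames at or before `frame`.

    Frames with the same count sit between the same pair of walls.
    """
    return bisect_right(sorted(idr_frames), frame)


def may_reference(frame, target, idr_frames):
    """True if `frame` is allowed to use `target` as a reference."""
    if frame == target or frame in idr_frames:
        return False  # an IDR frame is intra coded and references nothing
    return segment(frame, idr_frames) == segment(target, idr_frames)


idr_frames = [100]
for frame, target in [(101, 100), (99, 100), (99, 101), (101, 99), (98, 99)]:
    allowed = may_reference(frame, target, idr_frames)
    print(f"frame {frame} -> frame {target}: {'ok' if allowed else 'blocked'}")
```

Cutting the stream right before frame 100 is therefore safe: nothing from frame 100 onwards depends on anything that was cut away.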
Mutant_Fruit
17th January 2006, 19:42
No
See IDR as a wall. Nothing goes through. Only things beyond the wall can touch it. So frames before the IDR can't reference it nor any of the following frames. And frames after the IDR can reference it, but can't reference any frame before it.
Ah right, I was thinking of an IDR frame backwards. I thought only previous frames could reference an IDR frame, but actually only frames after an IDR frame can reference it. Makes more sense that way if I think about it.