Old 2nd July 2010, 12:34   #1  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
Single threaded H.264 decoding performance

[EDIT] This has all been updated quite a bit. The interesting graph I came up with a few hours later has been added to this OP just to catch the eye.
Consider my thoughts in this OP somewhat misguided...

tl;dr - ffmpeg-mt is faster (for these sources anyway) than CoreAVC or DivX H.264!




Hey folks,

So we all know that CoreAVC is a great place to start if you're looking for FAST H.264 decoding on Windows. That's all well and good.

However, I recently realized I'd made the assumption that it would be very fast in all scenarios. As my numbers below show, this is not the case!

All tests were performed on a 2.4 GHz Intel Q6600 quad-core CPU running Windows 7 x64. Benchmarks were made using avs2avi. DSS2 provided DirectShow interaction. All sources were of identical content, and were 1080p24.

Source 1 - ~50 Mbps x264 encode, "fastdecode" tune (no CABAC, no B-frames, no deblocking)
CoreAVC - 103.7 fps
FFMS2 - 31.5 fps
Fake CoreAVC "single threaded" ~ 25.93 fps

Source 2 - ~50 Mbps x264 encode, no CABAC or B-frames, but with deblocking
CoreAVC - 86.9 fps
FFMS2 - 27.6 fps
Fake CoreAVC "single threaded" ~ 21.73 fps

Source 3 - ~50 Mbps x264 encode, with CABAC, B-frames, and deblocking
CoreAVC - 56.6 fps
FFMS2 - 13.7 fps
Fake CoreAVC "single threaded" ~ 14.2 fps
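
The "fake single threaded" figures look like the four-thread CoreAVC result divided by the Q6600's four cores. That is an inference from the numbers, not something stated outright, so treat this as a rough sketch. The function name `per_core_estimate` is made up for illustration.

```python
# Assumption: the "fake single threaded" figure is the multi-threaded CoreAVC
# result divided evenly across the Q6600's four cores.
CORES = 4

# fps figures copied from the avs2avi results above
coreavc_fps = {"Source 1": 103.7, "Source 2": 86.9, "Source 3": 56.6}
ffms2_fps = {"Source 1": 31.5, "Source 2": 27.6, "Source 3": 13.7}


def per_core_estimate(multi_threaded_fps, cores=CORES):
    """Naive per-core throughput: total fps divided by the core count."""
    return multi_threaded_fps / cores


for name, fps in coreavc_fps.items():
    estimate = per_core_estimate(fps)
    print(f"{name}: CoreAVC per core ~ {estimate:.2f} fps, FFMS2 = {ffms2_fps[name]} fps")
```

The division assumes perfect scaling across cores, which real decoders rarely reach, so the estimate is likely pessimistic for true single-threaded CoreAVC.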

My conclusion? In typical scenarios for this community, CoreAVC or similar high performance multi-threaded decoders are the way to go.

However, in MY CASE, things are a little different. I usually work with H.264 sources that have no B-frames and use CAVLC instead of CABAC. Both of these sacrifices are made to facilitate real-time capture, which is a huge ugly animal all on its own!

Given my typical sources, and the fact that I'm doing a LOT of transcoding at once (anywhere from 3-6 1080p encodes on a system at once, depending on how many cores I have), FFMS2, single threaded, seems like it will actually give me better throughput - provided it never bottlenecks x264!

Sure, the numbers are always higher with CoreAVC, but it gets them by eating up almost 100% of my CPU! That's fine for playback, but for high volume transcoding, I want the most efficient decoder for my type of files, correct?

I THINK I'm seeing this correctly. What do you guys think?

The best possible solution is to do all the decoding on a couple GPUs, but that's another story.

Derek
__________________
These are all my personal statements, not those of my employer :)

Last edited by Blue_MiSfit; 2nd July 2010 at 17:53.
Old 2nd July 2010, 12:59   #2  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
Some more numbers, this time with timeCodec.exe. They're higher overall, but I assume that's because it avoids AviSynth etc. The advantage is that I can restrict CPU affinity reliably this way.

Also, since FFMS2 isn't available in this framework, I'm just using libavcodec from ffdshow-tryouts rev 3463 (May 29, 2010).

Source 1 (--tune fastdecode, zerolatency)
CoreAVC (4 threads) - 126.3 fps
libavcodec (1 thread) - 35.9 fps
CoreAVC (1 thread) - 34 fps

Source 2 (--no-cabac, --tune zerolatency)
CoreAVC (4 threads) - 101.4 fps
libavcodec (1 thread) - 30.4 fps
CoreAVC (1 thread) - 27.7 fps

Source 3 (all on)
CoreAVC (4 threads) - 62.5 fps
libavcodec (1 thread) - 17.6 fps
CoreAVC (1 thread) - 16.5 fps

Interesting. The trend is basically the same, though the differences here are less pronounced (libavcodec leads single-threaded CoreAVC by less than 10% in all three cases).
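
For anyone who wants to check the size of the single-threaded gap, here is a small sketch that computes it from the timeCodec.exe figures above. The helper name `percent_faster` is hypothetical.

```python
# Single-threaded fps figures copied from the timeCodec.exe results above.
results = {
    "Source 1": {"libavcodec": 35.9, "CoreAVC": 34.0},
    "Source 2": {"libavcodec": 30.4, "CoreAVC": 27.7},
    "Source 3": {"libavcodec": 17.6, "CoreAVC": 16.5},
}


def percent_faster(a, b):
    """How much faster a is than b, as a percentage of b."""
    return (a - b) / b * 100.0


gaps = {
    name: percent_faster(r["libavcodec"], r["CoreAVC"])
    for name, r in results.items()
}

for name, gap in gaps.items():
    print(f"{name}: libavcodec is {gap:.1f}% faster than CoreAVC (1 thread each)")
```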

Interestingly, good old MPEG-2 at 80 Mbps, with a typical GOP structure, decodes at ~44 fps single threaded with FFMS2, which is almost 2x faster than DGDecode.

Derek
Old 2nd July 2010, 13:15   #3  |  Link
clsid
*****
 
Join Date: Feb 2005
Posts: 5,647
ffdshow also has multi-threading capability if you select ffmpeg-mt as H.264 decoder.
__________________
MPC-HC 2.2.1
Old 2nd July 2010, 13:19   #4  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
Indeed. In my testing it's always been slower than CoreAVC, but it's worth a look!

Derek
Old 2nd July 2010, 13:58   #5  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
Here are more results, this time using ffmpeg-mt (and including the CoreAVC results for reference).

Source 1 (--tune fastdecode, zerolatency)
ffmpeg-mt (1 thread) - 35.5 fps
ffmpeg-mt (4 threads) - 130.4 fps
CoreAVC (1 thread) - 34 fps
CoreAVC (4 threads) - 126.3 fps

Source 2 (--no-cabac, --tune zerolatency)
ffmpeg-mt (1 thread) - 28 fps
ffmpeg-mt (4 threads) - 110.4 fps
CoreAVC (1 thread) - 27.7 fps
CoreAVC (4 threads) - 101.4 fps

Source 3 (all on)
ffmpeg-mt (1 thread) - 17.3 fps
ffmpeg-mt (4 threads) - 66.2 fps
CoreAVC (1 thread) - 16.5 fps
CoreAVC (4 threads) - 62.5 fps

WOW. I was NOT expecting that! ffmpeg-mt is faster than CoreAVC in all cases for my usual files! It looks like the ffmpeg devs have been working hard.
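
A quick way to see how well each decoder scales from one thread to four is to divide the 4-thread figure by the 1-thread figure. This is only a sketch over the numbers posted above. The helper names `speedup` and `efficiency` are made up for illustration.

```python
THREADS = 4

# (1-thread fps, 4-thread fps) copied from the results above
results = {
    "Source 1": {"ffmpeg-mt": (35.5, 130.4), "CoreAVC": (34.0, 126.3)},
    "Source 2": {"ffmpeg-mt": (28.0, 110.4), "CoreAVC": (27.7, 101.4)},
    "Source 3": {"ffmpeg-mt": (17.3, 66.2), "CoreAVC": (16.5, 62.5)},
}


def speedup(single_fps, multi_fps):
    """Ratio of multi-threaded to single-threaded throughput."""
    return multi_fps / single_fps


def efficiency(single_fps, multi_fps, threads=THREADS):
    """Speedup divided by thread count; 1.0 would be perfect linear scaling."""
    return speedup(single_fps, multi_fps) / threads


for source, decoders in results.items():
    for decoder, (single, multi) in decoders.items():
        print(
            f"{source} {decoder}: {speedup(single, multi):.2f}x speedup, "
            f"{efficiency(single, multi):.0%} scaling efficiency"
        )
```

Both decoders come out above 90% scaling efficiency on these sources, so one 4-thread instance and four 1-thread instances should land in the same ballpark for total throughput.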

I'll make a pretty little graph to illustrate, and maybe put in DivX H.264 as well. I don't have DiAVC and don't feel like spending $10 right now...



Derek
Last edited by Blue_MiSfit; 2nd July 2010 at 14:18.
Old 2nd July 2010, 14:26   #6  |  Link
kieranrk
Registered User
 
Join Date: Jun 2009
Location: London, United Kingdom
Posts: 707
Have you tried DiAVC?
Old 2nd July 2010, 14:58   #7  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
Quote:
I don't have DiAVC and don't feel like spending $10 right now...
Derek
Old 2nd July 2010, 22:49   #8  |  Link
Underground78
Registered User
 
Join Date: Oct 2004
Location: France
Posts: 567
Quote:
Originally Posted by Blue_MiSfit
I don't have DiAVC and don't feel like spending $10 right now...
I thought a trial version was available but I may be mistaken.
Old 2nd July 2010, 23:18   #9  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Performance may depend heavily on the source material -- e.g. whether the bitrate is high or low, etc.
Old 3rd July 2010, 01:01   #10  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,565
Quote:
Originally Posted by Underground78
I thought a trial version was available but I may be mistaken.
The current trial seems to be expired, but schweinsz gladly distributes trials on request.
Old 3rd July 2010, 02:12   #11  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
Indeed. The bitrate was very high in my case, usually around 60mbps (1080p24)

Derek
Old 3rd July 2010, 02:15   #12  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by Blue_MiSfit
Indeed. The bitrate was very high in my case, usually around 60mbps (1080p24)

Derek
Try a much lower bitrate sample. At that bitrate, you're benching the CABAC decoder.
Old 3rd July 2010, 02:28   #13  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
Hence why 2/3 of my tests have CABAC disabled
Old 3rd July 2010, 02:34   #14  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by Blue_MiSfit
Hence why 2/3 of my tests have CABAC disabled
Then you're benching the CAVLC decoder
Old 3rd July 2010, 03:12   #15  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
Fair enough.

Still, I'm trying to determine the optimal way to decode high bitrate H.264 for high volume transcode. In my mind, this means the decoder that gives the best per-thread performance. Is that logical?

This seems like a valid test, no?
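
One hedged way to frame "best per-thread performance" is CPU time spent per decoded frame. The sketch below uses the Source 1 timeCodec.exe figures from earlier in the thread. The metric and the name `core_ms_per_frame` are assumptions for illustration, not anything the decoders report.

```python
# Source 1 figures from the timeCodec.exe runs in this thread: (fps, threads).
source1 = {
    "CoreAVC (4 threads)": (126.3, 4),
    "CoreAVC (1 thread)": (34.0, 1),
    "libavcodec (1 thread)": (35.9, 1),
    "ffmpeg-mt (4 threads)": (130.4, 4),
    "ffmpeg-mt (1 thread)": (35.5, 1),
}


def core_ms_per_frame(fps, threads):
    """Core-milliseconds spent per frame.

    Assumption: every thread keeps one core fully busy for the whole run,
    so this is an upper bound on the real CPU cost.
    """
    return threads / fps * 1000.0


costs = {
    name: core_ms_per_frame(fps, threads)
    for name, (fps, threads) in source1.items()
}

# Cheapest decoder configuration first
for name, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{name}: {cost:.1f} core-ms per frame")
```

Under that assumption the single-threaded runs cost slightly less CPU per frame than the 4-thread runs, which fits the idea of running several single-threaded decodes side by side when the box is saturated with parallel transcodes.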

As I said, MPEG-2 decodes very quickly as well

Derek
Last edited by Blue_MiSfit; 3rd July 2010 at 03:16.
Old 3rd July 2010, 11:12   #16  |  Link
STaRGaZeR
4:2:0 hater
 
Join Date: Apr 2008
Posts: 1,302
If you feel like it, it would be really nice to have a comparison chart with more samples and a complete description of the settings used. That way we'll see where each decoder shines or fails.
__________________
Specs, GTX970 - PLS 1440p@96Hz
Quote:
Originally Posted by Manao View Post
That way, you have xxxx[p|i]yyy, where xxxx is the vertical resolution, yyy is the temporal resolution, and 'i' says the image has been irremediably destroyed.
Old 3rd July 2010, 17:29   #17  |  Link
iwod
Registered User
 
Join Date: Apr 2002
Posts: 756
Um... So we come to the conclusion that ffmpeg-mt is fastest.

No longer a reason to use the CoreAVC, DivX, or DiAVC decoders?
Old 3rd July 2010, 18:27   #18  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,989
Well, for these (unusual) test cases anyway. I haven't bothered to do anything with standard Blu-ray streams or typical 4-15 Mbps backup type streams.

Derek
Old 4th July 2010, 11:16   #19  |  Link
CeeJay.dk
Registered User
 
Join Date: Dec 2003
Location: Denmark
Posts: 122
@Dark Shikari: You wrote on your blog about a new deblocking optimization you made to x264. Could this optimization to an encoder deblocker be applied to a decoder deblocker, like the one in FFmpeg?
Has it already been applied? Is that why the new FFmpeg 0.6 release decodes H.264 faster?
Old 4th July 2010, 19:53   #20  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by CeeJay.dk
@Dark Shikari: You wrote on your blog about a new deblocking optimization you made to x264. Could this optimization to an encoder deblocker be applied to a decoder deblocker, like the one in FFmpeg?
Has it already been applied? Is that why the new FFmpeg 0.6 release decodes H.264 faster?
It could be, and no, it hasn't been. It would help a lot more in a decoder than in an encoder, in fact.

I haven't done it because the second stage of the process -- "fix things up if the neighbors are different for deblocking than for normal decoding" -- is rather hard when you have to consider all possible options (PAFF, MBAFF, etc.).