Test ffdshow DXVA decoder performance
roozhou
30th March 2011, 16:08
This is the first step towards using DXVA in an encoding application. With this method you can offload H264 or VC1 decoding to the GPU during encoding on ALL video cards.
My test program is based on ffdshow's DXVA module. It adds to ffdshow a new mode called "frame grabber" which copies the whole picture from video memory to main memory.
This program tests decoding performance in "frame grabber" mode. It seems that on some DXVA-capable video cards the performance is really bad (<10fps with full CPU usage).
Download link is: http://www.mediafire.com/?czmjke9nmxlmahm. Source code and a patch for ffdshow are included.
Please follow the instructions in readme.txt and post your results in this thread. Don't forget your CPU/GPU/OS specs.
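To get a feel for what "frame grabber" mode has to move, the amount of data copied from video memory to main memory can be estimated. The following is a minimal Python sketch, not part of the test program. It assumes the decoder output is NV12 (8-bit 4:2:0, i.e. 1.5 bytes per pixel) and ignores any row padding or height alignment the driver may add. The function names are made up for illustration.

```python
def nv12_frame_bytes(width: int, height: int) -> int:
    """Bytes in one NV12 frame: a full-resolution 8-bit luma plane
    plus an interleaved chroma plane at half resolution (assumed format)."""
    return width * height * 3 // 2


def readback_mb_per_s(width: int, height: int, fps: float) -> float:
    """Data rate in MB/s (10^6 bytes) needed to copy every frame back."""
    return nv12_frame_bytes(width, height) * fps / 1e6


if __name__ == "__main__":
    # Resolutions of the kind used in this thread; 60fps is an arbitrary target.
    for width, height in [(1280, 720), (1440, 1080), (1920, 1080)]:
        rate = readback_mb_per_s(width, height, 60)
        print(f"{width}x{height} @ 60fps: {rate:.1f} MB/s")
```

Under these assumptions a 1920x1080 clip at 60fps stays below 200 MB/s, so the raw amount of data is modest. The estimate says nothing about how fast a given driver can actually read the surface back.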
roozhou
30th March 2011, 16:18
Here are my test results. My test clips are a 19Mbps MBAFF 1080p H264 clip and a 20Mbps VC1 1080p clip.
Phenom II 3.1G + AMD HD4250 IGP (880G)
WinXP SP3 (bad performance)
Video format: AVC1, 1440x1080
Start decoding ...
995 frames decoded
Average decoding speed is 10.98fps
User-time 78.672s, kernel-time 3.141s
Video format: WVC1, 1920x1080
Start decoding ...
4545 frames decoded
Average decoding speed is 8.97fps
User-time 475.563s, kernel-time 7.281s
Win7 32bit
Video format: AVC1, 1440x1080
Start decoding ...
995 frames decoded
Average decoding speed is 58.67fps
User-time 2.278s, kernel-time 0.218s
Video format: WVC1, 1920x1080
Start decoding ...
4545 frames decoded
Average decoding speed is 62.89fps
User-time 9.719s, kernel-time 1.061s
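The numbers above can be turned into a rough CPU load figure. The sketch below is only an illustration in Python, not part of the test program. It assumes that the reported average fps is frames divided by wall-clock time, so that frames / fps gives the elapsed time. All names in it are hypothetical.

```python
import re

# Matches one result block as printed in this thread.
RESULT = re.compile(
    r"(?P<frames>\d+) frames decoded\s+"
    r"Average decoding speed is (?P<fps>[\d.]+)fps\s+"
    r"User-time (?P<user>[\d.]+)s, kernel-time (?P<kernel>[\d.]+)s"
)


def summarize(log: str) -> list[dict]:
    """Return wall time, CPU time and CPU load (in cores) per result block."""
    rows = []
    for m in RESULT.finditer(log):
        frames = int(m["frames"])
        fps = float(m["fps"])
        cpu_s = float(m["user"]) + float(m["kernel"])
        wall_s = frames / fps  # assumption: fps is measured against wall-clock time
        rows.append({"wall_s": wall_s, "cpu_s": cpu_s, "cpu_cores": cpu_s / wall_s})
    return rows


# The two AVC1 results from this post (WinXP SP3 first, then Win7 32bit).
SAMPLE_LOG = """
995 frames decoded
Average decoding speed is 10.98fps
User-time 78.672s, kernel-time 3.141s
995 frames decoded
Average decoding speed is 58.67fps
User-time 2.278s, kernel-time 0.218s
"""

if __name__ == "__main__":
    for row in summarize(SAMPLE_LOG):
        print(f"wall {row['wall_s']:.1f}s, cpu {row['cpu_s']:.1f}s, "
              f"load {row['cpu_cores']:.2f} cores")
```

For the two AVC1 runs this gives roughly 0.90 of a core on WinXP SP3 and roughly 0.15 on Win7, which suggests that one core is almost fully busy on the slow path.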
roozhou
30th March 2011, 16:21
When I switched to an NVidia GT240 video card, the performance under WinXP was as fast as under Win7.
Phenom II 3.1G + NV GT240 DDR3
WinXP SP3
Video format: AVC1, 1440x1080
Start decoding ...
995 frames decoded
Average decoding speed is 51.15fps
User-time 0.344s, kernel-time 0.516s
Video format: WVC1, 1920x1080
Start decoding ...
4545 frames decoded
Average decoding speed is 68.75fps
User-time 0.016s, kernel-time 0.375s
i7 950 + GT240 GDDR5
Win7 64bit
Video format: AVC1, 1440x1080
Start decoding ...
995 frames decoded
Average decoding speed is 64.32fps
User-time 1.248s, kernel-time 0.374s
Video format: WVC1, 1920x1080
Start decoding ...
4545 frames decoded
Average decoding speed is 73.50fps
User-time 3.666s, kernel-time 0.780s
Chikuzen
30th March 2011, 17:08
Hi, roozhou
sample: This (http://www.mediafire.com/?p2h7dfhe48tu92i) (1280x720, H.264/AVC in MP4, High@4.1, 501frames)
environment: Core2Quad Q9450 2.67GHz / Radeon HD5870 GDDR5 1024MB / Windows7 SP1 (x64)
result:
Average decoding speed is 114.51fps
User-time 0.499s, kernel-time 0.265s
Average decoding speed is 114.57fps
User-time 0.421s, kernel-time 0.312s
Average decoding speed is 114.49fps
User-time 0.468s, kernel-time 0.234s
Average decoding speed is 114.38fps
User-time 0.499s, kernel-time 0.265s
Average decoding speed is 114.25fps
User-time 0.530s, kernel-time 0.234s
Btw, I think that you should prepare some common samples for benchmarking.
CruNcher
31st March 2011, 15:50
Intel Core I5 2400
9800 GT 512 MB (G92/VP2) Forceware 270.51 Beta
Windows XP SP3
1080p 60 FPS http://e.dl.playstation.net/e/wipeouthd/assets/WipEoutHD_EN_1080p.zip
Low Power Profile
Video format: avc1, 1920x1080
Start decoding ...
4304 frames decoded
Average decoding speed is 44.63fps
User-time 0.094s, kernel-time 0.453s
High Power Profile
Video format: avc1, 1920x1080
Start decoding ...
4304 frames decoded
Average decoding speed is 44.81fps
User-time 0.078s, kernel-time 0.266s
Chikuzen
31st March 2011, 19:43
I changed the sample to CruNcher's link.
Average decoding speed is 54.98fps
User-time 7.129s, kernel-time 0.998s
Average decoding speed is 55.00fps
User-time 7.114s, kernel-time 0.842s
Average decoding speed is 54.98fps
User-time 7.301s, kernel-time 0.905s
Average decoding speed is 54.98fps
User-time 7.441s, kernel-time 0.811s
roozhou
1st April 2011, 12:12
Hi CruNcher, could you test the same clip on SNB's integrated video chip?
CruNcher
1st April 2011, 14:05
There is no support in the Windows XP drivers for anything HD2000/HD3000 DXVA related except MPEG-2 :( But yes, I will test it soon, including some multi-GPU DSP things on 7 and Media SDK 2.0, trying to combine them with Nvcuvid and CUDA, if I finally find a fix for my VMR9 problem on XP. Btw, Stan released Mediacoder with support for Intel's encoder and also added Quicksync support to it :) see http://blog.mediacoderhq.com/mediacoder-2011-rc3-is-released-with-intel-media-sdk-encoder-integrated/ and he even replaced his dev system with Sandy Bridge now. So it is the first free application that can utilize Quicksync :) (it's a pity that he violates OSS ethics as such). Not even all of the all-in-one Chinese encoder ISVs (Wondershare, Xilisoft) have jumped on it yet, though they will sooner or later, like they did with nvcuvenc.
This is also one thing I don't like about Intel's ecosystem: they seem to drop Windows XP entirely. I'm not really into being forced onto another OS because of some video stuff (though I see the real pros; multi-GPU/DSP support is one I would leave XP for). I will take this step when I'm ready myself, and this VMR9 thing is something I want to get under control first, or at least understand whether it is really timer related. It might be that Cyberlink uses different code for getting time stamps/buffers, capable of resolutions finer than 1ms, that other ISVs don't utilize.
roozhou
1st April 2011, 14:58
@CruNcher
My bad. I forgot that Intel does not support DXVA1 at all. I have an Intel U4100 + GMA 4500 laptop. It only has bitstream H264 decoding but no VC1 decoding on Win7. Though it does not decode as fast as NVidia and ATI, the performance is fairly acceptable.
I currently lack test results on the following platforms:
1) DXVA2 on Clarkdale and Sandybridge.
2) DXVA2 on NVidia VP2
3) DXVA1 on ATI HD 2/3/4/5 series
I found that the following platforms have very bad performance:
1) DXVA1 on ATI HD 4 series IGP
2) DXVA2 on ATI HD 2 series
Maccara
2nd April 2011, 12:39
I was going to test on AMD X2 4400+, XP x64, ATI 4850, but I just get an exception 0xc0000005 from DXVATest.exe at 0x7d6213e5.
(DXVA decoding in general works with this system, so that's not an issue itself)
tetsuo55
2nd April 2011, 16:44
Do you have DEP enabled? What about a virus scanner?
Maccara
3rd April 2011, 21:40
DEP only for services etc. Yes, Avast.
Anyway, quickly tested turning those off completely and still crashes.
(I would've been a tad surprised if that would've helped anyway, as the patch + exe looks quite trivial, and ffdshow works fine in other contexts)
roozhou
4th April 2011, 07:45
@Maccara
Did you see any output in the console window? If you run the exe without arguments, does it crash?
tetsuo55
4th April 2011, 10:21
Try enabling DEP for everything, and scan your PC using a different virus scanner, maybe an online one?
Maccara
4th April 2011, 20:41
@roozhou:
From memory, there was absolutely no output, just an immediate crash. Without parameters (i.e. the movie file), it didn't crash (but of course didn't do anything interesting :))
@Tetsuo: Say what?? :) Certainly _enabling_ DEP and/or using virus scanners won't make this test software more stable on this system. DEP will only prevent the whole program running in case it is hit - will absolutely not "fix" anything.
My suspicion is that it is actually the ffdshow.ax module at fault here. Might be something because of 64bit XP (can be a bit quirky with certain software sometimes). I'm running 32bit ffdshow otherwise - just assuming the test software is 32bit too. :)
Anyway, I'll try to test again at some point to make sure I understand at what point the crash happens. Maybe it behaves better after a reboot. :)
EDIT:
Just did a quick re-test. It actually crashes because of the file I downloaded from this thread. :D Sorry for wasting your time.
With WipEout_HD_English_1080p.mp4 it crashes immediately with no output whatsoever (I can play it with DXVA otherwise, although it is a bit choppy).
With a couple of my own encodes I get results:
Video format: avc1, 1280x720
Start decoding ...
103 frames decoded
Average decoding speed is 4.16fps
User-time 25.453s, kernel-time 1.266s
Video format: avc1, 640x368
Start decoding ...
1038 frames decoded
Average decoding speed is 16.20fps
User-time 62.000s, kernel-time 1.859s
So, extremely slow also on ATI HD4850 512MB + XP x64 + AMD X2 4400+ (I believe this would be DXVA1 then)
roozhou
5th April 2011, 06:56
Thank you Maccara. So your HD4850 gave the same result as my HD4250 IGP on DXVA1.
nevcairiel
5th April 2011, 06:59
ATI is known to have very bad performance when copying data back from a D3D Texture, especially on WinXP (performance increased on Vista/7, maybe a side-effect of the new driver model). This is mostly why all those non-DirectShow players (VLC and others) have very bad DXVA support for anything other than NVIDIA.
roozhou
5th April 2011, 07:25
@nevcairiel
But for NVidia we can use CUDA instead of DXVA, though CUDA seems to use more VRAM than DXVA.
ATI is known to have very bad performance when copying data back from a D3D Texture, especially on WinXP (performance increased on Vista/7, maybe a side-effect of the new driver model). This is mostly why all those non-DirectShow players (VLC and others) have very bad DXVA support for anything other than NVIDIA.
Aha. That's interesting. I was always wondering why my streaming from Dreambox to VLC was still somewhat glitchy with DXVA on, even after they fixed the long-standing DXVA bug in the 10.12 drivers. My CPU is too slow on its own, especially with high bitrate 1080i and deinterlacing, so I need to enable DXVA. My tests with Nvidia were fine AFAIR.
I'm really starting to consider an Nvidia card again. I always used Nvidia before, but the 5770 seemed like good value at the time, and I wanted to try ATI for once. Between the VLC issue, not being able to use your video decoder, the chroma issue and seemingly constant little driver niggles showing up, I at least know who will make my next card. The only thing left to decide is if I'm gonna wait until I build my next system.
CruNcher
5th April 2011, 16:03
Roozhou, try to use the proper name of the API, as nevcairiel did with LAV CUVID: it's NVCuvid :) Too many people already hype the CUDA term for the video DSP, which actually has nothing to do with it: http://developer.download.nvidia.com/compute/cuda/sdk/website/C/src/cudaDecodeGL/doc/nvcuvid.pdf. Even though NVCUVID means Nvidia Cuda Video Decoder, the main tasks are DSP (fixed function). OK, if you count the 3D texture things as CUDA then yes, it would be somewhat valid, and if you go into deinterlacing and post-processing then CUDA becomes valid, I guess. Indeed too complicated, so CUVID packs everything into one name without the heavy marketing of always mentioning CUDA here, CUDA there ;) That is why you can really like ATI/AMD's discreet name for it, "Open Video Decode": nothing about ATI/AMD or APP, just a plain non-marketing name: http://developer.amd.com/gpu/AMDAPPSDK/assets/OpenVideo_Decode_API.PDF.
Here is a sample of how OVD works with UVD via OpenGL: http://download2-developer.amd.com/amd/Samples/OVDecodeRender.zip. Currently it only works on Windows; there is no Linux support yet, as many had hoped for.
Aha. That's interesting. I was always wondering why my streaming from Dreambox to VLC was still somewhat glitchy with DXVA on, even after they fixed the long-standing DXVA bug in the 10.12 drivers. My CPU is too slow on its own, especially with high bitrate 1080i and deinterlacing, so I need to enable DXVA. My tests with Nvidia were fine AFAIR.
I'm really starting to consider an Nvidia card again. I always used Nvidia before, but the 5770 seemed like good value at the time, and I wanted to try ATI for once. Between the VLC issue, not being able to use your video decoder, the chroma issue and seemingly constant little driver niggles showing up, I at least know who will make my next card. The only thing left to decide is if I'm gonna wait until I build my next system.
First of all try to use 3rd Party ISV supplied stuff those are more enhanced under NDA for a longer time (Cyberlink,Arcsoft,Corel,Nero) ATI/AMD just begun to open up in those regards competing with Nvidia on a bigger field.
First of all try to use 3rd Party ISV supplied stuff those are more enhanced under NDA for a longer time (Cyberlink,Arcsoft,Corel,Nero) ATI/AMD just begun to open up in those regards competing with Nvidia on a bigger field.
What? I have no idea what you're trying to say here, or how it relates to what I wrote. I use VLC only when I have to. I use MPC-HC and nevcairiel's stuff because I want to.
roozhou
6th April 2011, 08:52
@CruNcher
Could you please use more periods and paragraphs when typing? I always feel out of breath when reading your posts.