CoreCodec/H.264 Codec "CoreAVC"
madshi
11th February 2009, 15:52
Ahh... that could be the case then. We are taking the stand to not include the CUDA DLL in our installer but instead opting to use the one that is public. We will see if this is the right approach as time progresses and continue to work with NVIDIA on it.
I agree that distributing the CUDA DLL with CoreAVC doesn't make much sense, since the danger would be high that the CUDA DLL you distribute would be outdated rather soon.
But maybe you could make the latest CUDA DLL available as a separate optional download? Updating this download whenever a new CUDA version is available should be easy for you (no need for any installers or release notes or such stuff, just the DLL).
Gleb Egorych
11th February 2009, 15:57
What I have found out so far based on the feedback is that since every system configuration is different it yields varied results. We are getting some reports of a massive drop of 60% CPU usage down to 1%-7%. But I think the average so far is about a 50-60% reduction (more like 75% for me with a dual core 3.2GHz 9400GPU laptop).
You apparently talk about CUDA acceleration but I asked about pure software decoding, about comparing different CPUs.
For example, timecodec results for Intel Qxxx0, AMD X4, X3 and so on.
DJ Bobo
11th February 2009, 16:35
Since we're talking about CPU usage here, shouldn't the hardware requirements list on the CoreAVC homepage get updated? Recommending a P4@2.8GHz for 1080p videos can be a little bit misleading, can't it? I mean, I'm getting a little over 70% average CPU usage on an Athlon X2 QL-60 (1.9GHz) which should be roughly equivalent to a Pentium D @ 3.2GHz, and you recommend a P4?!
I mean not everybody can use this CUDA thing, I have a Radeon for instance.
Rectal Prolapse
11th February 2009, 17:04
The artifacts jj666 saw look identical to what I saw with Max Payne and Ghost in the Shell: Stand Alone Complex: Solid State Society (whew, long name!), so if you fix those other titles I'm sure it'll work with these. :)
BetaBoy
11th February 2009, 18:12
Since we're talking about CPU usage here, shouldn't the hardware requirements list on the CoreAVC homepage get updated? Recommending a P4@2.8GHz for 1080p videos can be a little bit misleading, can't it? I mean, I'm getting a little over 70% average CPU usage on an Athlon X2 QL-60 (1.9GHz) which should be roughly equivalent to a Pentium D @ 3.2GHz, and you recommend a P4?!
I mean not everybody can use this CUDA thing, I have a Radeon for instance.
We are holding off on such changes until we issue the press release with NVIDIA. Then we will update the front page and the requirements page. However, we have updated both the changelog and the Configuration Guide to reflect the 1.9 release.
squid_80
11th February 2009, 18:24
That's unfortunately not necessarily the case. I use the latest nvcuvid.dll from my contact at Nvidia. The version I have has several fixes beyond what is in the released driver.
Your quickstart html tells people to copy nvcuvid.dll into the windows/system32 directory, so it's going to clobber the one installed with the driver anyway (or alternatively be clobbered if the user installs a new driver).
hajj_3
11th February 2009, 19:01
Would it not be best to include the .dll in 1.9.0.0, then bump the build by one each time the .dll changes, bundle the new .dll, call it 1.9.0.1, and have the changelog say "updated CUDA .dll file"? That would save loads of people from having to mess around.
STaRGaZeR
11th February 2009, 19:13
Field order when using hardware deinterlacing is still wrong.
Guest
11th February 2009, 20:31
Your quickstart html tells people to copy nvcuvid.dll into the windows/system32 directory, so it's going to clobber the one installed with the driver anyway (or alternatively be clobbered if the user installs a new driver). Most CoreAVC users are not DG tool users, so my point stands. Or maybe I missed yours?
My point was that nvcuvid.dll is evolving faster than the driver releases, and it would benefit users to distribute the latest version, as I do for my tools.
Guest
11th February 2009, 20:32
Field order when using hardware deinterlacing is still wrong. The VMR framework forces a one-field delay when the deinterlacer is enabled.
leeperry
11th February 2009, 20:49
thanks for the final 1.9!
but my pulp fiction sample still suffers from banding(doesn't occur in software decode, only CUDA), and seeking still makes terrible artefacts in KMPlayer(only in CUDA mode) :(
that's w/ the 181.20 drivers on XP SP3, I'll try to update..
BetaBoy
11th February 2009, 21:20
My point was that nvcuvid.dll is evolving faster than the driver releases, and it would benefit users to distribute the latest version, as I do for my tools.
Short term that may be the case... but I think in a few weeks it will be different. But maybe there is a compromise. I'll ping the guys.
squid_80
11th February 2009, 21:21
Most CoreAVC users are not DG tool users, so my point stands. Or maybe I missed yours?
My point was that nvcuvid.dll is evolving faster than the driver releases, and it would benefit users to distribute the latest version, as I do for my tools.
My point was if a user says "It doesn't work with CoreAVC, but it works with DGAVCIndexNV" then they obviously do have both installed and have probably copied your nvcuvid.dll to the system directory. But then again if they've installed the driver afterwards it might have switched the dll to an older version. I think it would be better if DGAVCIndexNV didn't rely on nvcuvid.dll being in system32; the program location is stored in the .dga file so why can't DGAVCDecodeNV load nvcuvid.dll from there?
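The idea of loading nvcuvid.dll from the program's own folder rather than relying on system32 boils down to a lookup order. Below is a minimal Python sketch of that order only; it is not how DGAVCDecodeNV or CoreAVC actually load the DLL, and `find_dll` and both directory arguments are hypothetical names.

```python
import os

def find_dll(name, app_dir, system_dir):
    """Return the first existing copy of `name`, preferring app_dir.

    Hypothetical helper: app_dir stands for the program location
    (for example the one recorded in the .dga file) and system_dir
    for the Windows system directory.
    """
    for directory in (app_dir, system_dir):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None  # no copy found in either place
```

With this order, a driver install that replaces the system32 copy would not affect a tool that ships its own copy next to its executable.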
I see 2 strong reasons for not distributing nvcuvid.dll with CoreAVC:
- It's now part of the official driver package from nvidia and it's bad practice for applications to overwrite driver files with unofficial versions (unofficial = no version information attached to the .dll)
- It's a lot easier to trace bugs if you know exactly what configuration the user has. It's easier to ask them what driver version is installed than ask them to find the timestamp/hash of a .dll and hope you've got a matching version somewhere.
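For the second reason, this is roughly what "find the timestamp/hash of a .dll" would involve on the user's side. It is a sketch in Python using only the standard library; `dll_fingerprint` is a made-up name and the path passed to it is up to the caller.

```python
import datetime
import hashlib
import os

def dll_fingerprint(path):
    """Return (SHA-256 hex digest, modification time in UTC) for a file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    modified = datetime.datetime.fromtimestamp(
        os.path.getmtime(path), tz=datetime.timezone.utc
    )
    return digest.hexdigest(), modified
```

A support request would then need both values compared against a list of known builds, which is more work than asking for the installed driver version.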
leeperry: Does the pulp fiction sample look exactly the same as before?
leeperry
11th February 2009, 21:29
leeperry: Does the pulp fiction sample look exactly the same as before?
a bit better, but still quite a lot of banding.
and seeking in KMPlayer looks a bit better too, less artefacts but they're still there.
none of this happens in software mode :o
madshi
11th February 2009, 21:52
@BetaBoy,
I'm a bit confused about what kind of CUDA acceleration you're actually using. It seems to me that there are two different possibilities:
(1) Either you could let the NVidia dedicated video decoding circuit decode the h264 stream.
(2) Or you could use the general purpose CUDA stuff to let NVidia accelerate only some specific parts of your software decoding code. (Probably this would be run by the shader hardware and not the video decoding circuit of the graphics card).
Could you please clarify which solution you're using? To be honest, I don't really like the idea of using (1), because if there's a bug in the video decoding circuit, all hope is lost. Basically by using (1) you'd give up all control over quality. It's possible that there are some shortcuts in the video decoding circuit, so we can't be 100% sure if we get perfect quality or not. I don't see such risks when using approach (2), so I was hoping you'd use that.
Finally, if you (ever) switch from CUDA to OpenCL, probably you'd be forced to choose approach (2), right?
lexor
11th February 2009, 21:54
Hey BetaBoy, you said previously that CUDA has no inherent limitations like DXVA, but the guys in the MPC-HC thread are reporting that they get the exact same compatibility level with 1.9 as they do with the built-in DXVA decoder.
What's the score here?
Rectal Prolapse
11th February 2009, 22:08
Well, according to the DGAVCIndexNV documentation, DGAVCIndexNV requires an NVIDIA card that has the VP2 or better video unit in it. So I suspect that a lot of the work is still being done by the built-in video decoder that is also used by DXVA. I would hazard a guess that CoreAVC does the same thing.
Now the thing is - the Cyberlink (PowerDVD) h264 decoder is DXVA and works fine in h/w mode decoding everything I've thrown at it, and uses the same amount of CPU. Weird huh?
netchris
11th February 2009, 22:54
Well, according to the DGAVCIndexNV documentation, DGAVCIndexNV requires an NVIDIA card that has the VP2 or better video unit in it. So I suspect that a lot of the work is still being done by the built-in video decoder that is also used by DXVA. I would hazard a guess that CoreAVC does the same thing.
Sigh, if that is true, this CUDA implementation will not work with my 8800 GTS 640MB (first generation 8800) either (just as MPC-HC doesn't). I had high hopes CoreAVC 1.9 would work with my GPU. No luck, it seems..
STaRGaZeR
12th February 2009, 00:28
The VMR framework forces a one-field delay when the deinterlacer is enabled.
I can't see why that would affect CoreAVC passing BFF flags with TFF content to the renderer. This is in software mode BTW and has been the same since forever. It has been discussed in this thread a few pages back.
BetaBoy
12th February 2009, 00:33
madshi... more of the #2 approach. That has been the basis for all our goals with GPU... obviously more so with CorePlayer (xScale, ATI, Marvell, Qualcomm/QTv, RMI and next CUDA) than with CoreAVC to this point.
lexor... I'll check with the guys.
Dark Eiri
12th February 2009, 01:09
Just tried the 1.9 trial with CUDA acceleration enabled and I must say it didn't properly decode any of my 720p videos with 10 ref-frames. They get really blocky with green frames all over, then the video stops, but when I try to seek, it plays for a few seconds and stops again. Just tested a 720p with 16 ref-frames and it decodes flawlessly, so it's kinda weird... Blocking when seeking is also present here, but it goes away in a second. Most videos play flawlessly! Again, great job, CoreCodec!
Also, I would like to know why "SD" videos aren't CUDA enabled? I think it would really help some Intel Atom systems with the new nVidia Ion chipsets (GeForce 9400, PV2 \ CUDA capable).
EDIT: Oh, yeah... I'm using 182.05 drivers for Vista 32-bit.
EDIT2: It seems some SD videos are working. Example: Youtube "HQ" MP4 videos are working with CUDA, while some of my DVD backups are falling back to software decoding.
DeepBeepMeep
12th February 2009, 01:20
Well I have tried to replace my nvcuvid.dll with the version that can be obtained in the installation package of DGAVCIndexNV and I still have some ugly blockiness and banding on half my movie samples.
It is important to stress that I don't have the blockiness with either Cyberlink DXVA decoder or CoreAVC CPU decoder.
Well hopefully this can be fixed easily.
On a side note is it possible to use CUDA as well to accelerate VC1 decoding and obtain better performances than with Nvidia DXVA (which are not great)?
_DW_
12th February 2009, 01:59
I have a question not related to CoreAVC decoding abilities. I've been test driving the trial version for the last few days. What I want to know is: if I buy it now, what will it get me in the future? You've been talking about 2.0 for a while now. If I buy 1.9.x now, can I automatically upgrade to 2.0?
BetaBoy
12th February 2009, 02:29
As I've said in earlier posts.... although I've mentioned 2.0 several times (especially for 64bit), we are not even close to a release, as we want to spend time in 'v1.9.x land' for a while exploring, adding, fixing, changing GPU. So we have no idea how long the 1.9.x release cycle will be or how many releases are yet to come. But any current purchaser will get a 'unique' discount upgrade link when 2.0 is out, and any user purchasing within xx days of the last release will get the upgrade for free (with xx to be determined later). Of course this is subject to change... but this is the plan atm.
Cyber-Mav
12th February 2009, 02:35
would have been a lot clearer and simpler if you just said that "version 2.0 will require purchase since upgrade from 1.x will not be free."
_DW_
12th February 2009, 03:16
would have been a lot clearer and simpler if you just said that "version 2.0 will require purchase since upgrade from 1.x will not be free."
That was exactly what I was fishing around for. The change from 1.x to 2.x will cost. Since 1.9 is 15 bucks I'm not really going to worry about it. I've removed the trial version and I'm going to use ffdshow for a few more days till the bugs in 1.9.x are worked out.
STaRGaZeR
12th February 2009, 04:02
So no 64-bit Haali's splitter for several months then. :(
Inspector.Gadget
12th February 2009, 04:58
I have an 8600M GT and I've just updated to the 182.05 drivers. Immediately after that (but after rebooting), I installed CoreAVC Professional 1.9.0.0. When I go to the configuration page for CoreAVC, the "Prefer CUDA Acceleration" option is grayed out. This remains the case even after deleting "CoreAVC.ini" in my Appdata\Roaming folder. The 8600M GT supports VP2, according to Nvidia. Is there something else I need to do to enable CUDA acceleration or have I made a mistake somewhere?
ilkertezcan
12th February 2009, 05:09
The CoreAVC settings file (coreavc.ini) is saved under %userprofile% (\Documents and Settings\...\Application Data). I want to save it to another folder. How can I do this?
Or...
I want only registry entries (no INI file): (HKEY_LOCAL_MACHINE\SOFTWARE\CoreCodec\CoreAVC Trial\ ...)
I created a "brightness" DWORD value, but it doesn't work (coreavc.ini is still created automatically).
BetaBoy
12th February 2009, 05:49
The CoreAVC settings file (coreavc.ini) is saved under %userprofile% (\Documents and Settings\...\Application Data). I want to save it to another folder. How can I do this?
Or...
I want only registry entries (no INI file): (HKEY_LOCAL_MACHINE\SOFTWARE\CoreCodec\CoreAVC Trial\ ...)
I created a "brightness" DWORD value, but it doesn't work (coreavc.ini is still created automatically).
The ini file is the only way and cannot be moved to another location. I'll see about adding a registry option.
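Since the INI file is the only supported store, anyone scripting around it would read settings from that fixed location. The Python sketch below shows the general shape; the real layout of coreavc.ini is not given in this thread, so the section name used in the usage note is a guess, and `default_ini_path` and `read_setting` are made-up helper names.

```python
import configparser
import ntpath

def default_ini_path(appdata):
    """Build a Windows-style path to coreavc.ini under an Application Data folder."""
    return ntpath.join(appdata, "coreavc.ini")

def read_setting(ini_path, section, key, default=None):
    """Read one value from an INI file, returning `default` if it is absent."""
    parser = configparser.ConfigParser()
    parser.read(ini_path)  # a missing file is silently skipped
    return parser.get(section, key, fallback=default)
```

For example, `read_setting(path, "CoreAVC", "brightness", "0")` would return the stored string, or "0" if the file, the (assumed) section, or the key is missing.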
edison
12th February 2009, 06:04
The hardware de-interlacing does not work when playing interlaced h264 video and RGB32 output, my card is GeForce 9600GT.
BetaBoy
12th February 2009, 06:18
edison... you did see the note that interlaced content is not supported with this initial CUDA release, correct?
cyberbeing
12th February 2009, 06:45
So no 64-bit Haali's splitter for several months then. :(
If you're feeling impatient or want to possibly speed up the process, you could use Haali's experimental 64-bit build (http://www.mediafire.com/?qznjzmmnvz0) in the meantime, and help by reporting any bugs found to Haali in the Alternative Matroska Splitter (http://forum.doom9.org/showthread.php?t=80762) thread.
BetaBoy
12th February 2009, 07:18
cyberbeing... did Haali sign off on you distributing that? I know he wants to do more before ppl start commenting on fixing 64 related issues.
cyberbeing
12th February 2009, 08:20
cyberbeing... did Haali sign off on you distributing that? I know he wants to do more before ppl start commenting on fixing 64 related issues.
Haali has been posting builds in the Alternative Matroska Splitter thread (http://forum.doom9.org/showpost.php?p=1240156&postcount=893). You can get it directly from his website if you want: http://haali.net/mkv/mkx.y.8.exe.
If he would like, I'll delete it from Mediafire. I just hosted it there since I find his site slow at times.
As for feedback, he never said not to leave feedback when he posted the build here on doom9, so I just assumed he wanted it. *shrug*
Upon your request, if you don't want CoreAVC users installing this build, and since this is the CoreAVC thread, I would be happy to delete my posts linking to it.
squid_80
12th February 2009, 09:15
The hardware de-interlacing does not work when playing interlaced h264 video and RGB32 output, my card is GeForce 9600GT.
Hardware deinterlacing normally only works with NV12 or YUY2 output. It's a common video card limitation.
squid_80
12th February 2009, 09:18
I have an 8600M GT and I've just updated to the 182.05 drivers. Immediately after that (but after rebooting), I installed CoreAVC Professional 1.9.0.0. When I go to the configuration page for CoreAVC, the "Prefer CUDA Acceleration" option is grayed out. This remains the case even after deleting "CoreAVC.ini" in my Appdata\Roaming folder. The 8600M GT supports VP2, according to Nvidia. Is there something else I need to do to enable CUDA acceleration or have I made a mistake somewhere?
If the option is grayed out, it means the driver did not install properly; required files are missing.
squid_80
12th February 2009, 09:36
I can't see why that would affect CoreAVC passing BFF flags with TFF content to the renderer. This is in software mode BTW and has been the same since forever. It has been discussed in this thread a few pages back.
Because that's not what is happening; CoreAVC is passing the correct flags and VMR9 interprets them wrongly. If it was passing the wrong flags why do VMR7 and Haali's renderer always show the fields in the correct order?
VMR9 is broken. It even changes field order randomly when you seek.
madshi
12th February 2009, 09:37
madshi... more of the #2 approach. That has been the basis for all our goals with GPU...
I'm glad to hear that! :)
Do you happen to know where your GPU accelerated code is being executed on the graphics card? Is it the shader hardware or the dedicated video decoding circuit?
Thank you!
edison
12th February 2009, 10:19
edison... you did see the note that interlaced content is not supported with this initial CUDA release, correct?
HW De-interlacing does work when using YV12 output, but can not work with RGB32 output.
ACrowley
12th February 2009, 10:51
Hm... I don't have an Nvidia card but an ATI HD4870.
Something is strange with 1.9.0.0. BBC-HD 1080i MBAFF Files are stuttering/shaking with Hardware deinterlacing. It works fine with 1.8.5.0
Inspector.Gadget
12th February 2009, 14:05
If the option is grayed out it means the driver did not install properly, there are required files missing.
Thanks. I used the 185.20 drivers from LaptopVideo2go (which didn't require a modded INF) because the Nvidia site didn't list any beta drivers newer than the 179.x series for the 8600M GT. Is there something else I need to do other than just running "setup.exe" and doing the usual installation wizard procedure? If it matters, I have nvcuda.dll in my Windows\System32 folder.
samepaul
12th February 2009, 14:21
2 BetaBoy
Damn. I didn't think I would be the one complaining that "it doesn't work", but unfortunately I am :(
So, here is the movie (downloaded as 1080p example compatible with DXVA)
General
Complete name : E:\Movies\~test\bbc-blue_m1080p.mov
Format : MPEG-4
Format profile : QuickTime
Codec ID : qt
File size : 234 MiB
Duration : 3mn 22s
Overall bit rate : 9 689 Kbps
Movie name : BBC Motion Gallery
Encoded date : UTC 2007-05-04 18:32:15
Tagged date : UTC 2007-05-04 18:32:26
Copyright : ©2007 BBC Motion Gallery
Comment : All Rights Reserved
Video
Format : AVC
Format/Info : Advanced Video Codec
Format profile : Main@L4.1
Format settings, CABAC : No
Format settings, ReFrames : 2 frames
Codec ID : avc1
Codec ID/Info : Advanced Video Coding
Duration : 3mn 22s
Bit rate mode : Variable
Bit rate : 9 563 Kbps
Width : 1 920 pixels
Height : 1 072 pixels
Display aspect ratio : 16/9
Frame rate mode : Variable
Frame rate : 30.000 fps
Minimum frame rate : 16.216 fps
Maximum frame rate : 300.000 fps
Resolution : 24 bits
Colorimetry : 4:2:0
Scan type : Progressive
Bits/(Pixel*Frame) : 0.155
Stream size : 230 MiB (99%)
Audio
...
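As a sanity check on the report above, the "Bits/(Pixel*Frame)" and "Stream size" figures can be recomputed from the other fields. This short Python sketch assumes MediaInfo's "Kbps" means 1000 bits per second and uses the nominal 30 fps; the duration is only given to the second, so the size is approximate.

```python
# Values copied from the MediaInfo report above.
bit_rate_bps = 9_563_000      # video bit rate, 9 563 Kbps
width, height = 1920, 1072    # pixels
frame_rate = 30.0             # nominal frames per second
duration_s = 3 * 60 + 22      # 3mn 22s

# Average number of bits spent on each pixel of each frame.
bits_per_pixel_frame = bit_rate_bps / (width * height * frame_rate)

# Bits -> bytes -> MiB (2**20 bytes).
stream_size_mib = bit_rate_bps * duration_s / 8 / 2**20

print(f"{bits_per_pixel_frame:.3f}")   # compare with "Bits/(Pixel*Frame)"
print(f"{stream_size_mib:.0f} MiB")    # compare with "Stream size"
```

Both results agree with the report (0.155 and 230 MiB), so the file's figures are self-consistent.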
NVIDIA Beta Driver 182.05 downloaded and installed (RivaTuner confirms that the version is indeed 182.05). Board: 8800GTS 320MB.
The "Prefer CUDA acceleration" option is available and selected, but the tray icon does not turn green during playback. Also, CPU usage is the same as it was before the codec upgrade - ~50%.
Tried under XP and under Win7 - same result on both systems.
What's wrong with me? :)
At least multithreading works :)
STaRGaZeR
12th February 2009, 14:29
Because that's not what is happening; CoreAVC is passing the correct flags and VMR9 interprets them wrongly. If it was passing the wrong flags why do VMR7 and Haali's renderer always show the fields in the correct order?
VMR9 is broken. It even changes field order randomly when you seek.
That would be right if I were using VMR9, but it happens with EVR and EVR Custom. When playing TFF content it's clear that the renderer (or another filter located after CoreAVC like ffdshow) is receiving BFF flags. When you force TFF then all is right. Again this has nothing to do with VMR. Already discussed and confirmed with tests a few pages back.
CoreAVC outputting NV12 with TFF Blu-ray content muxed into Matroska and using Haali's splitter:
VMR7 windowed/renderless --> no hardware deinterlacing (not CoreAVC's fault)
VMR9 windowed/renderless --> black screen
Haali renderer --> it does not accept NV12, falls back to Video Renderer, no hardware deinterlacing
EVR --> hardware deinterlacing with wrong field order
EVR Custom --> hardware deinterlacing with wrong field order
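To illustrate why a wrong field-order flag is visible at all, here is a toy Python sketch. It is not CoreAVC or EVR code: a frame is just a list of rows, and the function names are made up. A deinterlacer that trusts a BFF flag on TFF content shows the later field first, so motion appears to jerk backwards.

```python
def split_fields(frame_rows):
    """Split an interlaced frame (list of rows) into (top, bottom) fields."""
    top = frame_rows[0::2]      # rows 0, 2, 4, ... form the top field
    bottom = frame_rows[1::2]   # rows 1, 3, 5, ... form the bottom field
    return top, bottom

def display_order(frame_rows, top_field_first):
    """Return the two fields in the order a deinterlacer would show them."""
    top, bottom = split_fields(frame_rows)
    return [top, bottom] if top_field_first else [bottom, top]

# Toy TFF frame: rows marked "t0" were captured before rows marked "t1".
frame = ["t0", "t1", "t0", "t1"]
correct = display_order(frame, top_field_first=True)    # t0 field, then t1
wrong = display_order(frame, top_field_first=False)     # t1 field, then t0
```

Forcing TFF in the player corresponds to passing `top_field_first=True` regardless of what flag arrived, which matches the observation that forcing TFF fixes playback.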
nm
12th February 2009, 14:30
NVIDIA Beta Driver 182.05 downloaded and installed (RivaTuner confirms that the version is indeed 182.05). Board: 8800GTS 320MB.
[...]
What's wrong with me? :)
Your GPU does not have VP2 (http://en.wikipedia.org/wiki/PureVideo#Table_of_PureVideo_.28HD.29_GPUs), so NVCUVID (and CoreAVC) can't use it.
Carpo
12th February 2009, 14:35
I'm guessing this has been asked before, but is there an x64 version in the offing soon? Or is that a 2.x/3.x release ;)
samepaul
12th February 2009, 14:55
Your GPU does not have VP2 (http://en.wikipedia.org/wiki/PureVideo#Table_of_PureVideo_.28HD.29_GPUs), so NVCUVID (and CoreAVC) can't use it.
I know for certain that the 8800GTS has limited DXVA support, but CUDA is a different thing, unrelated to DXVA, and according to NVIDIA (http://www.nvidia.com/object/cuda_learn_products.html) it is supported on my board.
So I believe the problem is elsewhere...
nm
12th February 2009, 15:09
I know for certain that the 8800GTS has limited DXVA support, but CUDA is a different thing, unrelated to DXVA, and according to NVIDIA (http://www.nvidia.com/object/cuda_learn_products.html) it is supported on my board.
So I believe the problem is elsewhere...
No, CoreAVC uses nvcuvid.dll, which requires VP2 or VP3 video decoding hardware. It does not use the stream processors of the GPU to decode video.
samepaul
12th February 2009, 15:34
No, CoreAVC uses nvcuvid.dll, which requires VP2 or VP3 video decoding hardware. It does not use the stream processors of the GPU to decode video.
Why would CUDA need video decoding hardware? CUDA is a way to perform general purpose calculations on the GPU. How is it related to VP? Tesla processors have no VP at all, since they are CUDA-dedicated, but according to you they can't run CUDA code? I think you're wrong (unless you can provide a link proving that you're right).
BetaBoy can you comment on this?
lucassp
12th February 2009, 15:47
nVidia offered VP2/VP3 access through their CUDA API so you can skip some of the DXVA limitations.
The Tesla cards have the same GPUs as desktop cards with some functionality disabled through software.