Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
18th September 2008, 14:30 | #204 | Link |
Guest
Join Date: Jan 2002
Posts: 21,901
|
I ran a quick test with a CRF encode of 1280x720 material.
E8500 @ 3.8 GHz
NV GPU: 27.9 fps
CoreAVC via DSS: 28.4 fps
Apparently CoreAVC is so good that its CPU utilization is comparable to the overhead of managing decoding on the GPU (memory transfers host<->device, etc.). It's hard to believe, so I'll look to see if I've missed something. I'm beginning to think this is a dead end, since things can only get better on the CPU, whereas the GPUs are "cast in silicon". Maybe I'll update the regular versions with the latest libavcodec code while we wait for a CoreAVC API/SDK. |
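For scale, the gap between the two figures in the post above can be worked out directly. This is only a minimal sketch: the two fps values are taken from the post, while the helper name `relative_gap` is made up for illustration and is not part of any tool discussed in this thread.

```python
def relative_gap(a_fps: float, b_fps: float) -> float:
    """Return how much faster b is than a, as a fraction of a."""
    return (b_fps - a_fps) / a_fps


gpu_fps = 27.9      # "NV GPU" figure quoted in the post
coreavc_fps = 28.4  # "CoreAVC via DSS" figure quoted in the post

gap = relative_gap(gpu_fps, coreavc_fps)
print(f"CoreAVC path is {gap:.1%} faster than the GPU path")
```

A gap of under 2% from a single quick run is probably within run-to-run noise, so the two paths are best read as roughly equal on this machine.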
18th September 2008, 14:44 | #205 | Link |
x264aholic
Join Date: Jul 2007
Location: New York
Posts: 1,752
|
I do have some good news regarding the VP2 deinterlacing.
The following image was deinterlaced using a simple TempGaussMC(): (image not preserved)
The following with the VP2 deinterlacer: (image not preserved)
I deliberately used the default settings, just to play the role of a new user who doesn't know better. The deinterlacer is very, very sharp.
__________________
You can't call your encoding speed slow until you start measuring in seconds per frame. |
18th September 2008, 14:51 | #206 | Link |
Guest
Join Date: Jan 2002
Posts: 21,901
|
Maybe that's the saving grace!
Nice shots, thanks for posting. BTW, I noticed that CoreAVC via DSS() blend-deinterlaced the video, even though I have deinterlacing turned off in its control panel. It also included my clip 4 times in the output! I haven't used CoreAVC much, but I was somewhat surprised. Any idea what is going on?
Last edited by Guest; 18th September 2008 at 14:53. |
18th September 2008, 14:57 | #207 | Link |
Registered User
Join Date: Apr 2006
Posts: 225
|
To be honest, I think that decoding on the GPU is a neat idea, but in the end it should be an option for those with the right hardware who want to try squeezing a little more speed out of their encodes. I'm concerned, though, that eventually the limitations of DXVA on these cards may begin to hinder this great program's ability to work. I agree that the PAFF accuracy is nice (I haven't gotten to try it myself yet), but I know how difficult it can be to make hardware-based tasks like this work consistently. The thought of only one instance at a time is scary too, as I sometimes do more than one encode at once, or may in the future be reading from more than one source at once (doing a fade between two of my camera's clips comes to mind).
Bottom Line: I feel that hardware acceleration is better for realtime-critical applications (playback). However, for applications like transcoding, a pure software solution is better, since speed is nice but not critical, and it can grow as needed without being, as you say, "cast in silicon."
edit: That deinterlaced footage does look nice! Does it change the fps to 60 like DGBob does (assuming, of course, the source was 30i)?
Last edited by Turtleggjp; 18th September 2008 at 15:05. |
18th September 2008, 15:05 | #209 | Link |
x264aholic
Join Date: Jul 2007
Location: New York
Posts: 1,752
|
Yes, indeed. I'll need to test on a longer interlaced source to see how well a "good" deinterlacer fares against AVCSource(deinterlace=true) speed-wise. By "good" I mean one that would produce very similar results without being horribly slow. At the very least, we quite possibly have a fast, high-quality decoder + deinterlacer for those annoying 30i sources.
Edit: I forget exactly where I got the source now, but I found it by searching the Doom9 forums for an interlaced H.264 m2ts source.
Edit 2: neuron2, is it safe to assume you used DSS() with a GraphEdit graph to connect CoreAVC as the decoder?
Last edited by Sagekilla; 18th September 2008 at 15:07. |
18th September 2008, 15:20 | #211 | Link | ||
Registered User
Join Date: Dec 2004
Location: Melbourne, AU
Posts: 1,963
|
Quote:
Quote:
|
||
18th September 2008, 15:25 | #213 | Link |
x264aholic
Join Date: Jul 2007
Location: New York
Posts: 1,752
|
@squid_80: For single-core users who happen to have a very fast GPU, this is indeed very useful.
On another note, while this is an extremely unusual case, I'm curious how much of a speedup I could get from a two-machine setup using TCPServer --> TCPSource.
@neuron2: I'm not completely sure. My normal routine is DirectShowSource("graph.grf"), with graph.grf being a filter graph of source file --> CoreAVC. I don't know if doing it your way could produce different results.
|
18th September 2008, 15:53 | #214 | Link |
Registered User
Join Date: Nov 2003
Posts: 1,281
|
I just ran a quick test on my E6750 @ 3.6 GHz and 8800 GT 512 MB.
BD source cropped and resized to 1280,528.
In VirtualDubMod:
encoding to huffyuv with software decoder = 21 fps
same with hardware = 34-42 fps
encoding to x264 ("turbo" settings) with software decoder = 21 fps
same with hardware = 29 fps
And I've seen the PureVideo deinterlacer at work; I like it a lot. Sorry this isn't more detailed, but it's been a big day and I should be sleeping.
edit: neuron, did I read correctly that you are using an 8600 GPU? I'd like to see some tests from someone with a 9800 GTX x2 or a 280 core, even in SLI. |
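The figures in the post above are easier to compare as speedup factors. The following is a rough sketch under the assumption that the quoted fps values are steady-state averages; the function name `speedup` is illustrative only.

```python
def speedup(baseline_fps: float, accelerated_fps: float) -> float:
    """Speedup factor of the accelerated path relative to the baseline."""
    return accelerated_fps / baseline_fps


software_fps = 21.0            # software decoder, both huffyuv and x264 runs
huffyuv_hw_fps = (34.0, 42.0)  # hardware decoder, reported as a range
x264_hw_fps = 29.0             # hardware decoder, x264 "turbo" settings

# Low and high ends of the reported huffyuv range.
huffyuv_speedup = tuple(speedup(software_fps, f) for f in huffyuv_hw_fps)
x264_speedup = speedup(software_fps, x264_hw_fps)

print(f"huffyuv: {huffyuv_speedup[0]:.2f}x to {huffyuv_speedup[1]:.2f}x")
print(f"x264:    {x264_speedup:.2f}x")
```

The smaller gain for x264 would be consistent with the encoder, rather than the decoder, taking a larger share of the time once decoding moves to the GPU, though a single run is not enough to confirm that.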
18th September 2008, 16:10 | #215 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,926
|
Nice deinterlacing results, though I wouldn't have expected it to beat something as advanced as TempGaussMC. Last time, my subjective visual impression was that it was on the same level as yadif. More tests with different sources definitely need to be done.
I'm going to test DGAVCDecodeNV as soon as I can, though PC resources are not available at the moment.
@Audionut: As Donald already said, it's the same for every Nvidia card: the VP1/VP2 core is clocked at 400 MHz on all of them. (No software can change that yet, and no software even knows this clock exists. It was revealed here almost as a world-first exclusive. It's not really surprising, since it's a separate hardware core, but nobody knew how high it was clocked.) So the only difference is the memory bandwidth. Most probably things like deinterlacing could be a good deal faster on some cards, since I think the shader clock plays a role there. Otherwise, for progressive material, it should be really steady for all of us, and the results posted here show that it indeed is.
__________________
all my compares are riddles so please try to decipher them yourselves :) It is about Time Join the Revolution NOW before it is to Late ! http://forum.doom9.org/showthread.php?t=168004 Last edited by CruNcher; 18th September 2008 at 16:29. |
18th September 2008, 16:18 | #216 | Link |
Registered User
Join Date: Apr 2006
Posts: 225
|
Yeah, the graph option is better, since you can build the graph yourself and guarantee what decoder is being used. I always use either a .m2ts file or .mkv file as my source, and then run that to the desired decoder (CoreAVC for AVC, WMV Decoder DMO for VC-1). Seeking can be unstable with this method though, which is why your DGAVCIndex/DGAVCDecode program is so valuable.
|
18th September 2008, 16:32 | #217 | Link |
Turkey Machine
Join Date: Jan 2005
Location: Lowestoft, UK (but visit lots of places with bribes [beer])
Posts: 1,953
|
A direct comparison between this (DGAVCDecodeNV) and the software DGAVCDecode (currently Alpha 35 IIRC) would be most welcome, as I don't have a capable NVIDIA GPU to test this with (only an ATI Xpress 1150 or an NVIDIA GeForce FX5200). Same settings, same encoder, please.
__________________
On Discworld it is clearly recognized that million-to-one chances happen 9 times out of 10. If the hero did not overcome huge odds, what would be the point? Terry Pratchett - The Science Of Discworld |
18th September 2008, 16:40 | #218 | Link |
Registered User
Join Date: Dec 2007
Posts: 639
|
neuron2: If you get the latest CoreAVC, then it has a tray icon when it's loaded like ffdshow.
And wow, the deinterlacer is really nice. I might have to trade cards with my brother so I get a 9600GT (I've got an 8800GTS 320MB now). |
18th September 2008, 16:48 | #219 | Link | |
Registered User
Join Date: Mar 2006
Posts: 1,538
|
Quote:
DGAVCDecode
Code:
LoadPlugin("[PATH\]DGAVCDecode.dll")
AVCSource("[PATH\]Transporter_DGAVCDecode.dga")
Spline36Resize(1280,544)
Code:
"[PATH\]x264.exe" --crf 18 --ref 5 --mixed-refs --bframes 3 --b-adapt 2 --b-pyramid --b-rdo --bime --weightb --filter -1:-1 --trellis 2 --partitions p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads auto --thread-input --sar 1:1 --progress --no-psnr --no-ssim --output "[PATH\]Transporter_DGAVCDecode.mp4" "[PATH\]Transporter_DGAVCDecode.avs"
DGAVCDecodeNV
Code:
LoadPlugin("[PATH\]DGAVCDecodeNV.dll")
AVCSource("[PATH\]Transporter_DGAVCDecodeNV.dga")
Spline36Resize(1280,544)
Code:
"[PATH\]x264.exe" --crf 18 --ref 5 --mixed-refs --bframes 3 --b-adapt 2 --b-pyramid --b-rdo --bime --weightb --filter -1:-1 --trellis 2 --partitions p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads auto --thread-input --sar 1:1 --progress --no-psnr --no-ssim --output "[PATH\]Transporter_DGAVCDecodeNV.mp4" "[PATH\]Transporter_DGAVCDecodeNV.avs"
DirectShowSource
Code:
DirectShowSource("[PATH\]Transporter.mp4",fps=25,audio=false)
Spline36Resize(1280,544)
Code:
"[PATH\]x264.exe" --crf 18 --ref 5 --mixed-refs --bframes 3 --b-adapt 2 --b-pyramid --b-rdo --bime --weightb --filter -1:-1 --trellis 2 --partitions p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads auto --thread-input --sar 1:1 --progress --no-psnr --no-ssim --output "[PATH\]Transporter_Directshow.mp4" "[PATH\]Transporter_Directshow.avs"
Last edited by rack04; 18th September 2008 at 16:55. |
|
18th September 2008, 17:29 | #220 | Link |
@DVBPortal
Join Date: Feb 2004
Posts: 434
|
Ok, here are my results, dgavcdecode vs. dgavcdecodeNV:
First good news: Using the x264 CLI, two-pass encodes are no problem.
Second good news: Acceleration is massive, especially on the first pass.
Third good news: The quality of the final encode seems to be higher. I don't know why and have to do more tests.
Source: 1080p30 Apple music clip downscaled to 720p30
CPU: Q6600 2.4GHz
GPU: GF 8600 GTS 1.450 GHz
dgavcdecode:
pass #1: encoded 6580 frames, 32.68 fps, 4811.95 kb/s
pass #2: encoded 6580 frames, 21.48 fps, 5005.68 kb/s
x264 [info]: SSIM Mean Y:0.9689052
x264 [info]: PSNR Mean Y:41.892 U:48.994 V:48.284 Avg:43.078 Global:42.345 kb/s:5005.53
dgavcdecodeNV:
pass #1: encoded 6580 frames, 57.11 fps, 4811.95 kb/s
pass #2: encoded 6580 frames, 24.45 fps, 5005.42 kb/s
x264 [info]: SSIM Mean Y:0.9691189
x264 [info]: PSNR Mean Y:41.931 U:49.607 V:49.447 Avg:43.258 Global:42.404 kb/s:5005.27
Last edited by crypto; 18th September 2008 at 17:34. |
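Since the two passes speed up by very different amounts, the overall gain is easier to see as total encode time. The sketch below assumes each pass takes frames / fps seconds and ignores indexing, startup, and muxing time; the frame count and fps figures come from the post, and the names `FRAMES` and `total_seconds` are illustrative.

```python
FRAMES = 6580  # frames encoded in each pass, per the x264 log lines


def total_seconds(frames: int, pass_fps: list[float]) -> float:
    """Sum the wall-clock time of every pass, assuming time = frames / fps."""
    return sum(frames / fps for fps in pass_fps)


software_s = total_seconds(FRAMES, [32.68, 21.48])  # dgavcdecode, pass 1 and 2
hardware_s = total_seconds(FRAMES, [57.11, 24.45])  # dgavcdecodeNV, pass 1 and 2

pass1_speedup = 57.11 / 32.68
pass2_speedup = 24.45 / 21.48
overall_speedup = software_s / hardware_s

print(f"software total: {software_s:.0f} s, hardware total: {hardware_s:.0f} s")
print(f"pass 1: {pass1_speedup:.2f}x, pass 2: {pass2_speedup:.2f}x, "
      f"overall: {overall_speedup:.2f}x")
```

Under these assumptions the overall gain sits much closer to the second-pass figure than the first, because the slower second pass accounts for most of the total time.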
|
|