19th September 2008, 10:43 | #244 | Link |
Registered User
Join Date: Jun 2008
Posts: 177
|
DXVA doesn't support post-processing. BTW, resizing and sharpening can also be done on the GPU.
The biggest benefit would come if someone combined decoding, lanczos/blackman/spline resizing and FFT3DGPU in one package, to avoid transferring the image back and forth. (Like this)
Last edited by Quark.Fusion; 19th September 2008 at 10:48. |
19th September 2008, 10:54 | #245 | Link |
Registered User
Join Date: Sep 2003
Posts: 209
|
Hi,
since Sagekilla asked for a comparison between DGAVCIndex and DGAVCIndexNV on a quad-core CPU, I ran a few tests. My PC is a Q6600 @ 3.4 GHz with 2 GB of DDR2-800 RAM running at 756 MHz. The graphics card is a 9600GT with ForceWare 177.83 drivers, and the OS is Windows XP Pro 32-bit. To get DGAVCIndexNV to run, I only had to unpack it and copy nvcuvid.dll into my system32 directory, as instructed. Both the denoising and the sharpening sliders in the Nvidia control panel were set to 0% during these tests.

The first clip I used for my tests is a DVB recording from "Anixe HD", which is about 2 minutes long. It is 1080i50; the footage seems to have been shot with a video camera, since it really is interlaced. It has an AR of 1.78:1 and looks very clear (no noise or grain).

1.) Decoded with DGAVCDecode and encoded as interlaced:
Code:
LoadPlugin("D:\MPEG\AVISYNTH\PLUGINS\DGAVCDecode.DLL")
AVCsource("anixe.dga")

D:\MPEG\megui\tools\x264>x264.exe --crf 22.0 --level 4.1 --ref 4 --mixed-refs --no-fast-pskip --bframes 3 --b-rdo --bime --weightb --trellis 1 --partitions p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads auto --thread-input --sar 1:1 --progress --no-psnr --interlaced --output "H:\anixe.mkv" "G:\Aufnahmen\anixe.avs"
avis [info]: 1920x1080 @ 25.00 fps (3023 frames)
x264 [warning]: NAL HRD parameters require VBV max bitrate and buffer size to be specified
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64
x264 [info]: slice I:14 Avg QP:21.34 size:229347
x264 [info]: slice P:1330 Avg QP:23.36 size: 65230
x264 [info]: slice B:1679 Avg QP:25.90 size: 20724
x264 [info]: consecutive B-frames: 9.8% 41.6% 17.0% 31.5%
x264 [info]: mb I I16..4: 13.0% 50.0% 37.0%
x264 [info]: mb P I16..4: 1.4% 6.9% 3.2% P16..4: 44.7% 15.1% 10.9% 0.0% 0.0% skip:17.8%
x264 [info]: mb B I16..4: 0.1% 0.3% 0.3% B16..8: 39.7% 1.1% 2.0% direct: 6.7% skip:49.8% L0:36.9% L1:54.5% BI: 8.6%
x264 [info]: 8x8 transform intra:58.1% inter:55.6%
x264 [info]: ref P L0 52.5% 20.0% 12.9% 4.0% 3.7% 2.1% 3.6% 1.2%
x264 [info]: ref B L0 65.5% 20.6% 7.2% 2.4% 3.0% 1.3%
x264 [info]: ref B L1 70.9% 29.1%
x264 [info]: SSIM Mean Y:0.9713192
x264 [info]: kb/s:8254.2
encoded 3023 frames, 5.64 fps, 8254.32 kb/s

2.) Decoded with DGAVCDecodeNV (deinterlace=false) and encoded as interlaced:
Code:
LoadPlugin("D:\MPEG\AVISYNTH\PLUGINS\DGAVCDecodeNV.DLL")
AVCsource("anixe_nv.dga", deinterlace=false)

D:\MPEG\megui\tools\x264>x264.exe --crf 22.0 --level 4.1 --ref 4 --mixed-refs --no-fast-pskip --bframes 3 --b-rdo --bime --weightb --trellis 1 --partitions p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads auto --thread-input --sar 1:1 --progress --no-psnr --interlaced --output "H:\anixe_nv.mkv" "G:\Aufnahmen\anixe_nv.avs"
avis [info]: 1920x1080 @ 25.00 fps (3026 frames)
x264 [warning]: NAL HRD parameters require VBV max bitrate and buffer size to be specified
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64
x264 [info]: slice I:14 Avg QP:21.01 size:228003
x264 [info]: slice P:1162 Avg QP:22.76 size: 66468
x264 [info]: slice B:1850 Avg QP:24.86 size: 16716
x264 [info]: consecutive B-frames: 9.0% 18.8% 25.3% 46.9%
x264 [info]: mb I I16..4: 12.9% 47.7% 39.4%
x264 [info]: mb P I16..4: 1.3% 7.5% 3.3% P16..4: 40.3% 15.9% 11.0% 0.0% 0.0% skip:20.9%
x264 [info]: mb B I16..4: 0.1% 0.2% 0.3% B16..8: 37.4% 0.9% 1.5% direct: 5.2% skip:54.5% L0:33.0% L1:58.9% BI: 8.1%
x264 [info]: 8x8 transform intra:59.7% inter:55.5%
x264 [info]: ref P L0 56.6% 22.8% 8.5% 3.8% 2.7% 2.1% 2.3% 1.2%
x264 [info]: ref B L0 71.8% 20.6% 3.7% 2.0% 1.0% 0.9%
x264 [info]: ref B L1 76.9% 23.1%
x264 [info]: SSIM Mean Y:0.9719650
x264 [info]: kb/s:7359.7
encoded 3026 frames, 5.99 fps, 7359.81 kb/s

3.) Decoded and deinterlaced with DGAVCDecodeNV (deinterlace=true) and encoded as progressive:
Code:
LoadPlugin("D:\MPEG\AVISYNTH\PLUGINS\DGAVCDecodeNV.DLL")
AVCsource("anixe_nv.dga", deinterlace=true)

D:\MPEG\megui\tools\x264>x264.exe --crf 22.0 --level 4.1 --ref 4 --mixed-refs --no-fast-pskip --bframes 3 --b-rdo --bime --weightb --trellis 1 --partitions p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads auto --thread-input --sar 1:1 --progress --no-psnr --output "H:\anixe_nv_deint.mkv" "G:\Aufnahmen\anixe_nv_deint.avs"
avis [info]: 1920x1080 @ 25.00 fps (3026 frames)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64
x264 [info]: slice I:14 Avg QP:19.71 size:161555
x264 [info]: slice P:1117 Avg QP:21.46 size: 54310
x264 [info]: slice B:1895 Avg QP:23.47 size: 14413
x264 [info]: consecutive B-frames: 8.6% 15.1% 22.5% 53.8%
x264 [info]: mb I I16..4: 10.8% 70.6% 18.7%
x264 [info]: mb P I16..4: 1.8% 14.8% 1.6% P16..4: 41.4% 11.9% 7.8% 0.0% 0.0% skip:20.7%
x264 [info]: mb B I16..4: 0.2% 0.7% 0.1% B16..8: 32.7% 0.7% 1.1% direct: 6.1% skip:58.3% L0:34.0% L1:58.0% BI: 7.9%
x264 [info]: 8x8 transform intra:80.1% inter:68.1%
x264 [info]: ref P L0 74.8% 13.3% 7.6% 4.2%
x264 [info]: ref B L0 89.6% 7.0% 3.4%
x264 [info]: SSIM Mean Y:0.9820854
x264 [info]: kb/s:5964.2
encoded 3026 frames, 9.01 fps, 5964.27 kb/s

The second clip I used is also a DVB recording, this time from "Premiere HD", which is also about 2 minutes long. It is 1080i50 and seems to have been shot on celluloid, since the footage is progressive. This clip has an AR of 1.85:1 and looks quite noisy to me, which might be film grain.

4.) Decoded by DGAVCDecode and encoded as progressive:
Code:
LoadPlugin("D:\MPEG\AVISYNTH\PLUGINS\DGAVCDecode.DLL")
AVCsource("Premiere.dga")

D:\MPEG\megui\tools\x264>x264.exe --crf 22.0 --level 4.1 --ref 4 --mixed-refs --no-fast-pskip --bframes 3 --b-rdo --bime --weightb --trellis 1 --partitions p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads auto --thread-input --sar 1:1 --progress --no-psnr --output "H:\Premiere.mkv" "G:\Aufnahmen\Premiere.avs"
avis [info]: 1920x1080 @ 25.00 fps (2977 frames)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64
x264 [info]: slice I:42 Avg QP:14.42 size:171198
x264 [info]: slice P:2020 Avg QP:16.96 size: 84495
x264 [info]: slice B:915 Avg QP:19.28 size: 27135
x264 [info]: consecutive B-frames: 39.6% 55.8% 2.1% 2.5%
x264 [info]: mb I I16..4: 14.2% 77.5% 8.4%
x264 [info]: mb P I16..4: 2.4% 11.8% 0.7% P16..4: 45.1% 20.9% 10.1% 0.0% 0.0% skip: 8.9%
x264 [info]: mb B I16..4: 0.3% 0.8% 0.0% B16..8: 35.5% 0.8% 1.3% direct:16.4% skip:44.8% L0:43.3% L1:50.3% BI: 6.3%
x264 [info]: 8x8 transform intra:78.7% inter:60.6%
x264 [info]: ref P L0 57.8% 20.9% 11.3% 9.9%
x264 [info]: ref B L0 80.5% 10.5% 9.0%
x264 [info]: SSIM Mean Y:0.9782869
x264 [info]: kb/s:13617.7
encoded 2977 frames, 5.89 fps, 13617.88 kb/s

5.) Decoded with DGAVCDecodeNV (deinterlace=false) and encoded as progressive:
Code:
LoadPlugin("D:\MPEG\AVISYNTH\PLUGINS\DGAVCDecodeNV.DLL")
AVCsource("Premiere_nv.dga",deinterlace=false)

D:\MPEG\megui\tools\x264>x264.exe --crf 22.0 --level 4.1 --ref 4 --mixed-refs --no-fast-pskip --bframes 3 --b-rdo --bime --weightb --trellis 1 --partitions p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads auto --thread-input --sar 1:1 --progress --no-psnr --output "H:\Premiere_nv.mkv" "G:\Aufnahmen\Premiere_nv.avs"
avis [info]: 1920x1080 @ 25.00 fps (2980 frames)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64
x264 [info]: slice I:37 Avg QP:14.35 size:149138
x264 [info]: slice P:1752 Avg QP:16.64 size: 78769
x264 [info]: slice B:1191 Avg QP:19.28 size: 25838
x264 [info]: consecutive B-frames: 24.9% 59.9% 10.1% 5.0%
x264 [info]: mb I I16..4: 14.3% 77.9% 7.8%
x264 [info]: mb P I16..4: 2.4% 12.1% 0.5% P16..4: 44.7% 20.5% 10.0% 0.0% 0.0% skip: 9.9%
x264 [info]: mb B I16..4: 0.2% 0.7% 0.0% B16..8: 39.4% 0.6% 1.2% direct:14.4% skip:43.4% L0:39.8% L1:54.4% BI: 5.8%
x264 [info]: 8x8 transform intra:80.1% inter:60.0%
x264 [info]: ref P L0 63.5% 18.4% 12.5% 5.6%
x264 [info]: ref B L0 81.5% 11.8% 6.7%
x264 [info]: SSIM Mean Y:0.9784119
x264 [info]: kb/s:11697.7
encoded 2980 frames, 6.41 fps, 11697.84 kb/s

My preliminary conclusions are: When using a quad-core CPU, offloading the decoding process to the GPU without deinterlacing currently speeds things up by about 5% - 10%. This is a good step in the right direction, but it's not as big a step as some might have hoped for. Nevertheless, I suppose that the performance gain from GPU decoding might be much higher when using a single- or dual-core CPU. On the other hand, GPU decoding has a huge impact when used together with the GPU deinterlacing feature. Here I could observe a performance gain of over 50%.

I also observed that decoding the video through the GPU seems to have a notable impact on the picture quality, resulting in a significantly higher compressibility of the video during the encoding process. And lastly, libavcodec.dll seems to have a bug which makes it lose 3 frames compared to GPU decoding.

Summing up the results of my tests, it's my opinion that, even at this early stage of development, using GPU decoding through DGAVCDecodeNV is absolutely recommendable if you own a suitable graphics card. Even if you might "only" get a speedup of 5% - 10% in the worst case, the better compressibility, the better picture quality and the very good deinterlacing feature make DGAVCDecodeNV superior to DGAVCDecode.
C.U. NanoBot
Last edited by NanoBot; 19th September 2008 at 10:57. Reason: Typo |
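For anyone who wants to recheck the percentages quoted in the conclusions, here is a minimal Python sketch that parses the final "encoded ... frames" summary lines from the five runs and compares each GPU run with its software-decoded baseline. The run labels and the helper names (`parse_summary`, `percent_change`, `compare`) are made up for this illustration; only the summary lines themselves come from the logs above.

```python
import re

# Final summary lines copied from the five x264 runs in the post.
# The dictionary keys are illustrative labels, not names used by any tool.
RUNS = {
    "anixe, DGAVCDecode": "encoded 3023 frames, 5.64 fps, 8254.32 kb/s",
    "anixe, DGAVCDecodeNV": "encoded 3026 frames, 5.99 fps, 7359.81 kb/s",
    "anixe, DGAVCDecodeNV + deinterlace": "encoded 3026 frames, 9.01 fps, 5964.27 kb/s",
    "Premiere, DGAVCDecode": "encoded 2977 frames, 5.89 fps, 13617.88 kb/s",
    "Premiere, DGAVCDecodeNV": "encoded 2980 frames, 6.41 fps, 11697.84 kb/s",
}

SUMMARY = re.compile(r"encoded (\d+) frames, ([\d.]+) fps, ([\d.]+) kb/s")


def parse_summary(line):
    """Return (frames, fps, kbps) from an x264 summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not an x264 summary line: {line!r}")
    return int(match.group(1)), float(match.group(2)), float(match.group(3))


def percent_change(new, old):
    """Relative change of new versus old, in percent."""
    return (new - old) / old * 100.0


def compare(baseline, candidate):
    """Compare two runs from RUNS by speed, bitrate and frame count."""
    b_frames, b_fps, b_kbps = parse_summary(RUNS[baseline])
    c_frames, c_fps, c_kbps = parse_summary(RUNS[candidate])
    return {
        "speedup_pct": percent_change(c_fps, b_fps),
        "bitrate_pct": percent_change(c_kbps, b_kbps),
        "extra_frames": c_frames - b_frames,
    }


if __name__ == "__main__":
    pairs = [
        ("anixe, DGAVCDecode", "anixe, DGAVCDecodeNV"),
        ("anixe, DGAVCDecode", "anixe, DGAVCDecodeNV + deinterlace"),
        ("Premiere, DGAVCDecode", "Premiere, DGAVCDecodeNV"),
    ]
    for base, cand in pairs:
        result = compare(base, cand)
        print(
            f"{cand} vs {base}: "
            f"{result['speedup_pct']:+.1f}% fps, "
            f"{result['bitrate_pct']:+.1f}% bitrate, "
            f"{result['extra_frames']:+d} frames"
        )
```

Note that the deinterlaced run was also encoded without --interlaced, so its speed and bitrate difference reflects both the GPU deinterlacer and the progressive encode, not the decoder alone.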
19th September 2008, 10:58 | #246 | Link | |
Registered User
Join Date: Jan 2007
Location: Romania, Timisoara
Posts: 223
|
Quote:
Firstly you need FFT3DGPU ported to CUDA because right now it uses the Direct3D backend. So we need a good CUDA programmer to do all this |
|
19th September 2008, 10:59 | #247 | Link |
unrecognized user
Join Date: Oct 2005
Location: home of Stella Artois
Posts: 303
|
Better picture quality? That, I think, shouldn't happen. Do you maybe have comparison pics? Or did you mean only when the deinterlacer is used, compared to software ones?
__________________
zzz |
19th September 2008, 11:03 | #248 | Link |
Registered User
Join Date: Sep 2003
Posts: 209
|
Hi Daodan,
perhaps I used misleading wording: decoding through libavcodec.dll gives me macroblock errors, which result in a loss of compression. Those errors are gone if I use GPU decoding. So "better picture quality" was the wrong term in that context; I should have used "better decoding result".
C.U. NanoBot |
19th September 2008, 11:07 | #249 | Link |
Registered User
Join Date: Jun 2008
Posts: 177
|
AFAIK DXVA can't do FFT-NR, high-quality resize and limited sharpening, so you get medium-quality NR, bilinear or at most bicubic resize, and don't-know-what sharpening.
And if you want to encode that to view later on your not-so-powerful notebook or HTPC, you're out of luck with DXVA. |
19th September 2008, 12:08 | #253 | Link |
Registered User
Join Date: Mar 2005
Posts: 89
|
I have found the thread in the meantime, so no more discussion on this apart from a little request: If there is a special bobbing function inside the drivers that can be implemented with just a few commands, please give it a try - we can compare the results then - in a separate thread
|
19th September 2008, 12:26 | #254 | Link |
Registered User
Join Date: Dec 2004
Location: Melbourne, AU
Posts: 1,963
|
I haven't tried it but bobbing should be possible by setting the deinterlace mode to cudaVideoDeinterlaceMode_Bob and toggling the second_field flag of the CUVIDPROCPARAMS struct when mapping the frame (set to 0 when an even frame is requested, 1 for an odd frame). No idea if this functionality is implemented in cuda yet though.
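The field toggling described above can be sketched in Python as plain index arithmetic. This is only an illustration of the even/odd rule from the post, not real cuvid code: the function name `bob_request` is made up, and the assumption that two consecutive bobbed output frames come from the same decoded picture (picture index = output frame // 2) is mine, not something the post confirms.

```python
def bob_request(output_frame):
    """Map a double-rate (bobbed) output frame to a decoded picture and field flag.

    Returns (source_picture, second_field), where second_field follows the
    rule from the post: 0 for an even output frame, 1 for an odd one.
    The source_picture mapping is an assumption made for this sketch.
    """
    if output_frame < 0:
        raise ValueError("output_frame must be non-negative")
    source_picture = output_frame // 2  # assumed: two output frames per picture
    second_field = output_frame % 2     # 0 = first field, 1 = second field
    return source_picture, second_field


if __name__ == "__main__":
    # A 25 fps interlaced source would yield 50 fps after bobbing.
    for n in range(6):
        picture, second_field = bob_request(n)
        print(f"output frame {n}: picture {picture}, second_field={second_field}")
```

In an actual implementation the second value would be written to the second_field member of CUVIDPROCPARAMS before mapping the frame, as the post suggests.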
|
19th September 2008, 13:15 | #255 | Link | |
Registered User
Join Date: Mar 2006
Posts: 1,538
|
Quote:
|
|
19th September 2008, 20:24 | #257 | Link |
x264aholic
Join Date: Jul 2007
Location: New York
Posts: 1,752
|
I suppose this is, at the very least, a big win for anyone with an Nvidia card and some HD interlaced material.
__________________
You can't call your encoding speed slow until you start measuring in seconds per frame. |
19th September 2008, 20:42 | #258 | Link | |
Turkey Machine
Join Date: Jan 2005
Location: Lowestoft, UK (but visit lots of places with bribes [beer])
Posts: 1,953
|
Quote:
Only a slight increase, but it helps. Good one.
__________________
On Discworld it is clearly recognized that million-to-one chances happen 9 times out of 10. If the hero did not overcome huge odds, what would be the point? Terry Pratchett - The Science Of Discworld |
|
19th September 2008, 22:49 | #259 | Link | |
RipBot264 author
Join Date: May 2006
Location: Poland
Posts: 7,815
|
Quote:
__________________
Windows 7 Image Updater - SkyLake\KabyLake\CoffeLake\Ryzen Threadripper |
|
19th September 2008, 23:51 | #260 | Link |
Registered User
Join Date: May 2008
Posts: 39
|
Vista SP1
Athlon64 3500+ 2.25 GHz, 1 GB RAM, 8400GS with VP3
I took a 1080p HD DVD clip, cropped a few lines off the top and bottom, then bilinear-resized to 640x352 and encoded using the MeGUI iPhone profile.
no gpu:
pass 1 encoded 2168 frames, 6.03 fps, 1003.51 kb/s
pass 2 encoded 2168 frames, 6.06 fps, 1000.23 kb/s
gpu:
pass 1 encoded 2170 frames, 11.91 fps, 1003.26 kb/s
pass 2 encoded 2170 frames, 12.06 fps, 1000.27 kb/s |
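To put the two-pass figures above into wall-clock terms, here is a small Python sketch that turns frame counts and fps into an estimated total encode time per configuration. The names `CPU_PASSES`, `GPU_PASSES` and `two_pass_seconds` are hypothetical, and the estimate assumes the reported fps is the average over each whole pass.

```python
# (frames, fps) per pass, taken from the results above.
CPU_PASSES = [(2168, 6.03), (2168, 6.06)]
GPU_PASSES = [(2170, 11.91), (2170, 12.06)]


def two_pass_seconds(passes):
    """Estimated total encode time in seconds: sum of frames / fps per pass."""
    return sum(frames / fps for frames, fps in passes)


if __name__ == "__main__":
    cpu_total = two_pass_seconds(CPU_PASSES)
    gpu_total = two_pass_seconds(GPU_PASSES)
    print(f"no gpu: {cpu_total:.0f} s total")
    print(f"gpu:    {gpu_total:.0f} s total")
    print(f"ratio:  {cpu_total / gpu_total:.2f}x")
```

On this single-core machine the estimate comes out at roughly twice as fast with GPU decoding, which is consistent with the earlier guess that slower CPUs gain more than a quad-core does.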