22nd September 2017, 11:39 | #1 | Link |
Registered User
Join Date: Aug 2006
Location: Taiwan
Posts: 392
|
nnedi3cl
https://github.com/HomeOfVapourSynth...Synth-NNEDI3CL
Here comes another OpenCL variant of the popular filter. As usual, some benchmarks below FYI, measured by "vspipe -p test.vpy .". My CPU is E3-1231v3 and GPU is GTX 660. Sample videos used for benchmarking. test1 is quite unfriendly to the prescreener, while test2 is very friendly to the prescreener.
vpy
Code:
import vapoursynth as vs
core = vs.get_core()
core.max_cache_size = 3072
clip = core.lsmas.LibavSMASHSource('test1.mp4')
#clip = core.lsmas.LibavSMASHSource('test2.mp4')
#clip = core.resize.Bicubic(clip, format=vs.YUV420P16)
# deinterlace
#clip = core.nnedi3.nnedi3(clip, field=1)
#clip = core.nnedi3cl.NNEDI3CL(clip, field=1)
# enlarge
#clip = core.std.Transpose(clip).nnedi3.nnedi3(field=1, dh=True, nsize=4, nns=3).std.Transpose().nnedi3.nnedi3(field=1, dh=True, nsize=4, nns=3)
#clip = core.nnedi3cl.NNEDI3CL(clip, field=1, dh=True, dw=True, nsize=4, nns=3)
clip.set_output()
Code:
YUV420P8:
nnedi3: 19.66 fps
nnedi3cl: 35.82 fps
YUV420P16:
nnedi3: 13.12 fps
nnedi3cl: 32.68 fps
Code:
YUV420P8:
nnedi3: 98.34 fps
nnedi3cl: 89.53 fps
YUV420P16:
nnedi3: 59.34 fps
nnedi3cl: 71.07 fps
Code:
YUV420P8:
nnedi3: 6.60 fps
nnedi3cl: 7.72 fps
YUV420P16:
nnedi3: 4.99 fps
nnedi3cl: 7.44 fps
Code:
YUV420P8:
nnedi3: 28.82 fps
nnedi3cl: 34.85 fps
YUV420P16:
nnedi3: 16.48 fps
nnedi3cl: 29.17 fps
Last edited by HolyWu; 19th October 2017 at 05:01. |
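For a quick read of the figures in the first post, the relative speed of nnedi3cl over nnedi3 can be worked out from the posted fps values. This is only a minimal sketch: the `speedup` helper and the dictionary layout are made up for illustration, and only the first result block is used because the blocks carry no labels here.

```python
# fps figures copied from the first result block of the first post
results = {
    "YUV420P8": {"nnedi3": 19.66, "nnedi3cl": 35.82},
    "YUV420P16": {"nnedi3": 13.12, "nnedi3cl": 32.68},
}


def speedup(cpu_fps, gpu_fps):
    """Return how many times faster the OpenCL run is than the CPU run."""
    return gpu_fps / cpu_fps


# One ratio per pixel format (illustrative helper, not part of any plugin)
ratios = {fmt: speedup(r["nnedi3"], r["nnedi3cl"]) for fmt, r in results.items()}

for fmt, ratio in ratios.items():
    print(f"{fmt}: nnedi3cl is {ratio:.2f}x nnedi3")
```

A ratio below 1 would mean the CPU filter was the faster one for that format.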
22nd September 2017, 15:38 | #2 | Link |
Professional Code Monkey
Join Date: Jun 2003
Location: Kinnarps Chair
Posts: 2,555
|
Interesting... Maybe I'll try similar benchmarks on my computer which has very different hardware.
Btw, why didn't you use the built in resize for the bitdepth conversion?
__________________
VapourSynth - proving that scripting languages and video processing isn't dead yet |
22nd September 2017, 16:46 | #3 | Link |
Registered User
Join Date: Sep 2007
Posts: 5,374
|
Thanks
Is there an "easy" way to toggle between the NNEDI3 and NNEDI3CL versions when testing functions built on them? Maybe some python magic I'm not aware of, since I'm a python newbie? For example, presumably haf.QTGMC would call the nnedi3.nnedi3 version; would I need to find/replace all instances or something like that? |
22nd September 2017, 17:13 | #4 | Link |
Excessively jovial fellow
Join Date: Jun 2004
Location: rude
Posts: 1,100
|
Code:
plain_nnedi = core.nnedi3.nnedi3
core.nnedi3.nnedi3 = core.nnedi3cl.NNEDI3CL
|
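The rebinding trick in the post above can be tried out without VapourSynth installed. This is a minimal sketch with stand-in objects: `fake_core`, `cpu_nnedi3`, `cl_nnedi3` and `library_function` are hypothetical placeholders, not part of any real API, and whether a real VapourSynth core accepts attribute assignment like this may depend on the version in use.

```python
from types import SimpleNamespace


# Hypothetical stand-ins for the two filters; real ones take and return clips.
def cpu_nnedi3(clip, field, **kwargs):
    return f"cpu({clip}, field={field})"


def cl_nnedi3(clip, field, **kwargs):
    return f"opencl({clip}, field={field})"


# Hypothetical stand-in for the core object with its plugin namespaces.
fake_core = SimpleNamespace(
    nnedi3=SimpleNamespace(nnedi3=cpu_nnedi3),
    nnedi3cl=SimpleNamespace(NNEDI3CL=cl_nnedi3),
)


def library_function(core, clip):
    """Stands in for a script function (e.g. a deinterlacer) that calls nnedi3."""
    return core.nnedi3.nnedi3(clip, field=1)


before = library_function(fake_core, "clip")

# Keep a handle on the original, then rebind the name the library looks up.
plain_nnedi = fake_core.nnedi3.nnedi3
fake_core.nnedi3.nnedi3 = fake_core.nnedi3cl.NNEDI3CL
after = library_function(fake_core, "clip")

# Restore the CPU version when done testing.
fake_core.nnedi3.nnedi3 = plain_nnedi
restored = library_function(fake_core, "clip")
```

The point of keeping `plain_nnedi` around is that the swap can be undone in the same script, so both versions can be compared in one run.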
22nd September 2017, 17:22 | #5 | Link |
Registered User
Join Date: Sep 2007
Posts: 5,374
|
I'm getting some issues with build program failure
Win8.1 x64, Vapoursynth R39test4
Code:
Failed to evaluate the script:
Python exception: NNEDI3CL: Build Program Failure
:1:3129: error: incompatible pointer types passing '__local float *' to parameter of type 'const __local float (*)[95]'
:1:161: note: passing argument to parameter 'input' here
:1:4825: error: incompatible pointer types passing '__local float *' to parameter of type 'const __local float (*)[95]'
:1:161: note: passing argument to parameter 'input' here
error: front end compiler failed build.
Code:
import vapoursynth as vs
core = vs.get_core()
clip = core.avisource.AVISource(r'PATH\test.avi', pixel_type="yv12")
#clip = core.nnedi3cl.NNEDI3CL(clip, field=1, dh=True, dw=True)
clip.set_output()
Last edited by poisondeathray; 22nd September 2017 at 17:34. |
23rd September 2017, 04:19 | #6 | Link | |
Registered User
Join Date: Aug 2006
Location: Taiwan
Posts: 392
|
Quote:
Please try https://www.nmm-hd.org/upload/get~gS...NNEDI3CL-r3.7z and see whether it works. |
|
23rd September 2017, 07:06 | #7 | Link | |
Registered User
Join Date: Sep 2007
Posts: 5,374
|
Quote:
What was the issue ? / What is different with this build ? |
|
30th September 2017, 19:04 | #10 | Link |
Registered User
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
|
I can't get it working on NVIDIA Quadro 600 and Windows Server 2012 R2 - https://pastebin.com/Sya8BVKQ
Is that a driver issue? I'm not familiar with OpenCL. Upd.: On another server (same OS) on NVIDIA GeForce GTX 550 Ti I get the same errors. Last edited by DJATOM; 30th September 2017 at 19:29. |
1st October 2017, 13:07 | #11 | Link | |
Registered User
Join Date: Aug 2006
Location: Taiwan
Posts: 392
|
Quote:
|
|
2nd October 2017, 18:43 | #13 | Link |
Professional Code Monkey
Join Date: Jun 2003
Location: Kinnarps Chair
Posts: 2,555
|
All tests performed with:
Code:
vspipe script.vpy . -p
Threadripper 1950X
Sapphire Fury Tri-X
3200 CL14 RAM
All tests were run with 32 threads (default) except for enlarge on cpu where 16 threads for some reason performed substantially better (7-10 fps difference). The source used was 3000 frames from a typical 1080p tv series episode.
deinterlace1:
Code:
YUV420P8:
nnedi3: 352.14 fps
nnedi3cl: 45.68 fps
YUV420P16:
nnedi3: 164.40 fps
nnedi3cl: 40.42 fps
Code:
YUV420P8:
nnedi3: 65.22 fps
nnedi3cl: 10.47 fps
YUV420P16:
nnedi3: 32.03 fps
nnedi3cl: 9.90 fps
__________________
VapourSynth - proving that scripting languages and video processing isn't dead yet Last edited by Myrsloik; 2nd October 2017 at 18:46. |
14th October 2017, 11:41 | #14 | Link |
Beyond Kawaii
Join Date: Feb 2008
Location: Russia
Posts: 724
|
Tried to replace NNEDI3 with NNEDI3CL in QTGMC.
Got this error:
Code:
2017-10-14 13:41:42.021
Failed to evaluate the script:
Python exception: NNEDI3CL: Build Program Failure
:1:528: error: expected identifier or '('
:42:23: note: expanded from here
#define SCALE_ASIZE 0,003472
^
:1:528: error: expected ';' at end of declaration
:42:23: note: expanded from here
#define SCALE_ASIZE 0,003472
^
:1:564: error: expected identifier or '('
:42:23: note: expanded from here
#define SCALE_ASIZE 0,003472
^
:1:564: error: expected ';' at end of declaration
:42:23: note: expanded from here
#define SCALE_ASIZE 0,003472
^
:1:1854: warning: expression result unused
Traceback (most recent call last):
File "src\cython\vapoursynth.pyx", line 1830, in vapoursynth.vpy_evaluateScript (src\cython\vapoursynth.c:36860)
File "D:/video-to-process/Takaradzuka - Phantom 2004/phantom-temp.vpy", line 92, in
deint = haf.QTGMC(weaved, Preset="Placebo", EdiMode="eedi3+nnedi3", ChromaEdi="", EdiQual=2, NNeurons=4, NNSize=3, SubPel=4, SubPelInterp=2, BlockSize=8, Overlap=4, TFF=True, **qtgmcArguments)
File "D:\vapoursynth-plugins\py\havsfunc.py", line 1104, in QTGMC
edi1 = QTGMC_Interpolate(ediInput, InputType, EdiMode, NNSize, NNeurons, EdiQual, EdiMaxD, bobbed, ChromaEdi, TFF)
File "D:\vapoursynth-plugins\py\havsfunc.py", line 1390, in QTGMC_Interpolate
sclip=core.nnedi3cl.NNEDI3CL(Input, field=field, planes=planes, nsize=NNSize, nns=NNeurons, qual=EdiQual))
File "src\cython\vapoursynth.pyx", line 1722, in vapoursynth.Function.__call__ (src\cython\vapoursynth.c:35000)
vapoursynth.Error: NNEDI3CL: Build Program Failure
:1:528: error: expected identifier or '('
:42:23: note: expanded from here
#define SCALE_ASIZE 0,003472
^
:1:528: error: expected ';' at end of declaration
:42:23: note: expanded from here
#define SCALE_ASIZE 0,003472
^
:1:564: error: expected identifier or '('
:42:23: note: expanded from here
#define SCALE_ASIZE 0,003472
^
:1:564: error: expected ';' at end of declaration
:42:23: note: expanded from here
#define SCALE_ASIZE 0,003472
^
:1:1854: warning: expression result unused
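The `0,003472` in the log has a decimal comma where the OpenCL compiler expects a decimal point, which would fit a number being formatted under a locale that uses commas. That reading is an inference from the log, not something confirmed in the thread. As a minimal Python sketch of the general idea (the plugin itself is not written in Python, and `define_line` is a hypothetical helper), a fixed-point format spec emits a `.` regardless of the locale setting:

```python
def define_line(name, value):
    """Build a C-style #define line; the 'f' format spec ignores the locale."""
    return f"#define {name} {value:.6f}"


good = define_line("SCALE_ASIZE", 0.003472)

# Simulate the broken form seen in the log by swapping in a decimal comma.
bad = good.replace(".", ",")

print(good)
print(bad)
```

The second line printed has the same shape as the `#define` the compiler rejected in the log.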
__________________
...desu! |
19th October 2017, 05:25 | #15 | Link | |
Registered User
Join Date: Aug 2006
Location: Taiwan
Posts: 392
|
Update r4 & r5.
The benchmark in the first post is revised. Quote:
|
|
22nd October 2017, 18:15 | #17 | Link |
Registered User
Join Date: Aug 2006
Location: Taiwan
Posts: 392
|
Code:
clip = core.ffms2.Source('lena512.bmp').std.Loop(1000)
#clip = core.nnedi3.nnedi3(clip, field=1, pscrn=2)
#clip = core.znedi3.nnedi3(clip, field=1, pscrn=2)
Code:
nnedi3: Output 1000 frames in 6.70 seconds (149.24 fps)
znedi3: Output 1000 frames in 71.07 seconds (14.07 fps)
Anyway, I personally have no particular preference for CPU or GPGPU. It's simply provided as an alternative here. Users can choose which one to use depending on the speed they get. |
23rd October 2017, 18:21 | #19 | Link |
Registered User
Join Date: Aug 2006
Location: Taiwan
Posts: 392
|
Code:
#clip = core.ffms2.Source('lena512color.tiff').std.Loop(2000)
#clip = core.ffms2.Source('test1.mp4')
#clip = core.ffms2.Source('test2.mp4')
#clip = core.nnedi3.nnedi3(clip, field=1, pscrn=2)
#clip = core.znedi3.nnedi3(clip, field=1, pscrn=2)
#clip = core.nnedi3cl.NNEDI3CL(clip, field=1, pscrn=2)
Code:
nnedi3: Output 2000 frames in 14.52 seconds (137.71 fps)
znedi3: Output 2000 frames in 9.08 seconds (220.17 fps)
nnedi3cl: Output 2000 frames in 17.65 seconds (113.28 fps)
Code:
nnedi3: Output 1250 frames in 63.31 seconds (19.75 fps)
znedi3: Output 1250 frames in 40.80 seconds (30.64 fps)
nnedi3cl: Output 1250 frames in 34.97 seconds (35.75 fps)
Code:
nnedi3: Output 2184 frames in 21.82 seconds (100.10 fps)
znedi3: Output 2184 frames in 21.07 seconds (103.66 fps)
nnedi3cl: Output 2184 frames in 23.73 seconds (92.02 fps)
|
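When comparing several runs like the ones in the last post, the summary lines can be parsed rather than read by eye. This is a minimal sketch that assumes the summary line has exactly the shape quoted in the thread; `LINE_RE` and `parse_line` are illustrative names, and the sample lines are the 1250-frame results from the post.

```python
import re

# Matches the summary line in the form quoted in the thread.
LINE_RE = re.compile(r"Output (\d+) frames in ([\d.]+) seconds \(([\d.]+) fps\)")


def parse_line(line):
    """Return (frames, seconds, fps) from a summary line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"not a summary line: {line!r}")
    return int(match.group(1)), float(match.group(2)), float(match.group(3))


# Figures copied from the 1250-frame result block of the post.
lines = {
    "nnedi3": "Output 1250 frames in 63.31 seconds (19.75 fps)",
    "znedi3": "Output 1250 frames in 40.80 seconds (30.64 fps)",
    "nnedi3cl": "Output 1250 frames in 34.97 seconds (35.75 fps)",
}

parsed = {name: parse_line(text) for name, text in lines.items()}

# Pick the filter with the highest reported fps.
fastest = max(parsed, key=lambda name: parsed[name][2])
print(fastest)
```

Sorting `parsed` by the fps field instead of taking the maximum would give a full ranking for each source clip.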