Doom9's Forum > Capturing and Editing Video > Avisynth Development

4th January 2015, 20:05   #61
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,007
Quote:
Originally Posted by Khanattila View Post
Code:
 Device Version                                    OpenCL 1.0 CUDA
According to the Khronos API:
Code:
 If the -cl-std build option is not specified, the CL_DEVICE_OPENCL_C_VERSION is used to select the version of OpenCL C to be used when building the program executable for each device.
Possibly the same source code generates different machine code, which in this case is slower.

(-cl-std=CL1.0 is not allowed)
I believe that "CL_DEVICE_OPENCL_C_VERSION" refers to "OpenCL C Version" which in my case is "OpenCL C 1.1".
4th January 2015, 23:15   #62
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 434
Quote:
Originally Posted by Groucho2004 View Post
I believe that "CL_DEVICE_OPENCL_C_VERSION" refers to "OpenCL C Version" which in my case is "OpenCL C 1.1".
But your device only supports OpenCL 1.0, so I guess some instructions are downgraded for backward compatibility.

In an OpenCL program, the kernel (the problem to be solved) is compiled just before it runs.

I probably used some OpenCL 1.1 instructions.
5th January 2015, 00:21   #63
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 434
Edit.

Last edited by Khanattila; 6th May 2016 at 15:44.
5th January 2015, 00:45   #64
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,007
0.3.2:
Code:
Frames processed:               500 (0 - 499)
FPS (min | max | average):      13.51 | 14.27 | 13.86
CPU usage (average):            25%
GPU usage (average):            95%
Thread count:                   8
Memory usage (phys | virt):     74 | 73 MB
Time (elapsed):                 00:00:36.062
0.3.4 (legacy = true):
Code:
Frames processed:               500 (0 - 499)
FPS (min | max | average):      13.52 | 14.26 | 13.86
CPU usage (average):            25%
GPU usage (average):            95%
Thread count:                   8
Memory usage (phys | virt):     90 | 88 MB
Time (elapsed):                 00:00:36.063
So, performance is identical to 0.3.2 but the memory usage is a tiny bit higher.
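
As a rough cross-check of the two runs (a sketch in Python; the dictionary layout and the helper name are my own, only the numbers come from the logs above), the elapsed time implied by the average FPS can be compared with the reported time, and the memory difference computed:

```python
# Figures copied from the two benchmark logs; the structure is illustrative.
runs = {
    "0.3.2": {"frames": 500, "fps_avg": 13.86, "elapsed_s": 36.062,
              "phys_mb": 74, "virt_mb": 73},
    "0.3.4 (legacy = true)": {"frames": 500, "fps_avg": 13.86, "elapsed_s": 36.063,
                              "phys_mb": 90, "virt_mb": 88},
}


def implied_seconds(frames, fps_avg):
    """Elapsed time implied by a frame count and an average frame rate."""
    return frames / fps_avg


for name, run in runs.items():
    implied = implied_seconds(run["frames"], run["fps_avg"])
    print("%s: implied %.2f s, reported %.3f s" % (name, implied, run["elapsed_s"]))

old, new = runs["0.3.2"], runs["0.3.4 (legacy = true)"]
phys_delta = new["phys_mb"] - old["phys_mb"]
virt_delta = new["virt_mb"] - old["virt_mb"]
print("memory delta (phys | virt): %d | %d MB" % (phys_delta, virt_delta))
```

The implied and reported times agree to within a few hundredths of a second, so the only measurable difference between the two runs is the memory figure.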
8th January 2015, 16:15   #65
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 434
Edit.

Last edited by Khanattila; 6th May 2016 at 15:44.
8th January 2015, 17:10   #66
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,007
0.4.0 does not work with my card (GT240) any more, whether I specify "legacy = true" or not:
[clBuildProgram (CL_BUILD_PROGRAM_FAILURE)]

I guess that's by design?
8th January 2015, 18:03   #67
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 434
Quote:
Originally Posted by Groucho2004 View Post
0.4.0 does not work with my card (GT240) any more, whether I specify "legacy = true" or not:
[clBuildProgram (CL_BUILD_PROGRAM_FAILURE)]

I guess that's by design?
OpenCL, NVIDIA and portability cannot go together.
Please use this plugin that generates a debug file, link.

EDIT.

Last edited by Khanattila; 8th January 2015 at 18:28.
8th January 2015, 18:23   #68
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,007
Quote:
Originally Posted by Khanattila View Post
Please use this plugin that generates a debug file, link.
Log file:
"Language version specified by -cl-std is greater than the language version supported by the device!"

Last edited by Groucho2004; 8th January 2015 at 18:48.
8th January 2015, 18:59   #69
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 434
Quote:
Originally Posted by Groucho2004 View Post
Log file:
"Language version specified by -cl-std is greater than the language version supported by the device!"
Fixed, just re-download.

Code:
 If the -cl-std build option is not specified, the CL_DEVICE_OPENCL_C_VERSION is used to select the version of OpenCL C to be used when building the program executable for each device.
Thanks Nvidia.

In this version (0.4.0) I specified -cl-std=CL1.1 because AMD supports OpenCL >1.2, which could create problems. With legacy = true this check is now disabled.
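
For illustration only, here is a minimal Python sketch of how a host program could decide which -cl-std option to pass. It is not the plugin's actual code: the function names, the idea of taking the lower of the device version and the OpenCL C version, and the "ExampleVendor" string are all assumptions. It relies only on the documented string formats ("OpenCL <major>.<minor> ..." for CL_DEVICE_VERSION, "OpenCL C <major>.<minor> ..." for CL_DEVICE_OPENCL_C_VERSION) and on the fact that -cl-std=CL1.0 is not allowed.

```python
import re


def parse_version(version_string):
    """Return (major, minor) from e.g. 'OpenCL 1.0 CUDA' or 'OpenCL C 1.1'."""
    match = re.match(r"OpenCL(?: C)? (\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version_string)
    return int(match.group(1)), int(match.group(2))


def cl_std_option(device_version, opencl_c_version, wanted=(1, 1)):
    """Pick a -cl-std build option (hypothetical helper, not the plugin's logic).

    Assumption: be conservative and trust the lower of the two reported versions.
    """
    supported = min(parse_version(device_version), parse_version(opencl_c_version))
    if supported < (1, 1):
        # CL1.0 cannot be requested, so pass nothing and let the driver
        # fall back to CL_DEVICE_OPENCL_C_VERSION.
        return ""
    major, minor = min(wanted, supported)
    return "-cl-std=CL%d.%d" % (major, minor)


# The strings reported for the GT240 in this thread:
print(repr(cl_std_option("OpenCL 1.0 CUDA", "OpenCL C 1.1")))            # ''
# A made-up device that reports 1.2 for both values:
print(repr(cl_std_option("OpenCL 1.2 ExampleVendor", "OpenCL C 1.2")))   # '-cl-std=CL1.1'
```

With a rule like this, the GT240 would get no -cl-std option at all, which has the same effect as legacy = true.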
8th January 2015, 19:06   #70
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,007
Quote:
Originally Posted by Khanattila View Post
Fixed, just re-download.
Works, thanks!

Just to confirm: memory consumption is lower than with the last version (0.3.4).

Last edited by Groucho2004; 8th January 2015 at 19:10.
9th January 2015, 23:33   #71
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,007
This has been asked before in this thread a couple of times - any plans to implement temporal mode?
10th January 2015, 12:59   #72
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 434
Quote:
Originally Posted by Groucho2004 View Post
This has been asked before in this thread a couple of times - any plans to implement temporal mode?
First I have to solve a performance problem with B > 1 or 2.
The problem size is [-Ax, ..., Ax] x [-Ay, ..., Ay] x [-Dpast, ..., Dfuture].

So Az = 1 (three-frame processing) is three times slower.

EDIT.
B = 0, 13.48 FPS //correct
B = 1, 41.38 FPS //correct
B = 2, 47.82 FPS //wrong, expected ~67 FPS
B = 3, 24.01 FPS //wtf???
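
To make the arithmetic concrete, here is a small Python sketch. The helper names are hypothetical, and the 2B+1 scaling is only my reading of the numbers above (13.48 x 5 = 67.4 matches the "~67 FPS" expectation); nothing in the plugin is being quoted here.

```python
def search_positions(ax, ay, d_past=0, d_future=0):
    """Number of candidate positions in
    [-ax..ax] x [-ay..ay] x [-d_past..d_future]."""
    return (2 * ax + 1) * (2 * ay + 1) * (d_past + d_future + 1)


def expected_fps(fps_at_b0, b):
    """Assumed model: throughput grows linearly with 2*b + 1."""
    return fps_at_b0 * (2 * b + 1)


spatial = search_positions(4, 4)          # 81 positions, single frame
temporal = search_positions(4, 4, 1, 1)   # 243 positions, three frames
print(spatial, temporal, temporal / spatial)   # 81 243 3.0

for b in (1, 2, 3):
    print("B = %d: expected about %.1f FPS" % (b, expected_fps(13.48, b)))
```

Under that model B = 1 is close to the measured 41.38 FPS, while B = 2 and B = 3 fall well short of it, which is the performance problem described above.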

Last edited by Khanattila; 10th January 2015 at 14:14.
28th February 2015, 15:59   #73
Pulp Catalyst
Registered User
 
Join Date: May 2006
Posts: 266
Could anyone recommend a default that will give me an ultralight kind of setting? I only want the bare minimum; my aim is for the original source to be affected so lightly that the change cannot be detected by the naked eye. I've been using the ultralight setting in Handbrake (I use VidCoder), but of course I don't know the config parameters they use. (The strength value is very low, though, and it still gives surprisingly pleasing results as a generic profile.)

If anyone can suggest recommended values and possibly an example NLMeansCL2(????) call, that would be great.

Thanks,

P.S. I really hope a GPU version of this gets ported over to Handbrake one day... the CPU version is so SLOWWWWW lol

Last edited by Pulp Catalyst; 28th February 2015 at 16:02.
10th March 2015, 16:33   #74
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 434
Quote:
Originally Posted by Pulp Catalyst View Post
Could anyone recommend a default that will give me an ultralight kind of setting? I only want the bare minimum; my aim is for the original source to be affected so lightly that the change cannot be detected by the naked eye. I've been using the ultralight setting in Handbrake (I use VidCoder), but of course I don't know the config parameters they use. (The strength value is very low, though, and it still gives surprisingly pleasing results as a generic profile.)

If anyone can suggest recommended values and possibly an example NLMeansCL2(????) call, that would be great.

Thanks,

P.S. I really hope a GPU version of this gets ported over to Handbrake one day... the CPU version is so SLOWWWWW lol
Handbrake's nlmeans is very different:
1) it uses a temporal search;
2) it uses a median filter (3x3 / 5x5);
3) it uses an edge mask;
4) other.

Normally, NLMeansCL2(A=4, S=2, B=1, h=1.3) is light enough.
11th March 2015, 04:39   #75
Pulp Catalyst
Registered User
 
Join Date: May 2006
Posts: 266
Quote:
NLMeansCL2(A=4, S=2, B=1, h=1.3, device_type="GPU")
I'm giving it a try, thanks for the feedback.

I did not realize the Handbrake NLMeans was so different. Is there a version more mature than this one (with more features)?
I also hope for the day when I can insert this version into QTGMC... wink wink.
11th March 2015, 08:59   #76
feisty2
I'm Siri
 
Join Date: Oct 2012
Location: Providence, RI
Posts: 2,359
Quote:
Originally Posted by Pulp Catalyst View Post
I'm giving it a try, thanks for the feedback.

I did not realize the Handbrake NLMeans was so different. Is there a version more mature than this one (with more features)?
I also hope for the day when I can insert this version into QTGMC... wink wink.
TNLMeans has a temporal mode, but it's sloooooooow like hell and has no support for high bit depth.
21st March 2015, 16:57   #77
Stereodude
Registered User
 
Join Date: Dec 2002
Location: Region 0
Posts: 1,362
What is the expected behavior if the arguments passed to NLMeansCL2 require more resources (VRAM?) than the OpenCL device has?

I was trying to run this script, suggested in this thread, on an FHD source.

Code:
nlmeanscl2(a=10,b=0,s=4,lsb_inout=true,h=4.8,aa=3.2, device_type="GPU", info=true)
With both the Intel HD 4600 (i7-4770K) and the Nvidia GT440 (1 GB), the graphics driver crashes (a momentary black screen followed by a message in the system tray about it).

Avisynth returns this message to the application:

NLMeansCL2: Houston, we've had a problem!
[clCreateContext (CL_OUT_OF_RESOURCES)]

If I just use an NLMeansCL2 call of
Code:
nlmeanscl2(device_type="GPU", info=true)
it works fine.

I'm running Windows 7 x64 with 16 GB of RAM and have the latest drivers for both the GT440 and the HD 4600.
1st April 2015, 09:32   #78
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 434
Quote:
Originally Posted by Stereodude View Post
What is the expected behavior if the arguments passed to NLMeansCL2 require more resources (VRAM?) than the OpenCL device has?

I was trying to run this script, suggested in this thread, on an FHD source.

Code:
nlmeanscl2(a=10,b=0,s=4,lsb_inout=true,h=4.8,aa=3.2, device_type="GPU", info=true)
With both the Intel HD 4600 (i7-4770K) and the Nvidia GT440 (1 GB), the graphics driver crashes (a momentary black screen followed by a message in the system tray about it).

Avisynth returns this message to the application:

NLMeansCL2: Houston, we've had a problem!
[clCreateContext (CL_OUT_OF_RESOURCES)]

If I just use an NLMeansCL2 call of
Code:
nlmeanscl2(device_type="GPU", info=true)
it works fine.

I'm running Windows 7 x64 with 16 GB of RAM and have the latest drivers for both the GT440 and the HD 4600.
Please use GPU Caps Viewer: http://www.geeks3d.com/20150127/gpu-...23-0-released/
I need some more info. Most likely your system does not have enough video memory.

Go to Validation => Submit and you will get a link like this: http://www.ozone3d.net/gpudb/gpu.php?which=49098&v=2

Last edited by Khanattila; 1st April 2015 at 09:39.
4th April 2015, 14:11   #79
Stereodude
Registered User
 
Join Date: Dec 2002
Location: Region 0
Posts: 1,362
Quote:
Originally Posted by Khanattila View Post
Please use GPU Caps Viewer: http://www.geeks3d.com/20150127/gpu-...23-0-released/
I need some more info. Most likely your system does not have enough video memory.

Go to Validation => Submit and you will get a link like this: http://www.ozone3d.net/gpudb/gpu.php?which=49098&v=2
Here you go. http://www.ozone3d.net/gpudb/gpu.php?which=49162&v=2
7th April 2015, 18:31   #80
Mangix
Audiophile
 
Join Date: Oct 2006
Posts: 354
Latest Nvidia drivers support OpenCL 1.2. Any performance benefits?