Old 6th January 2017, 15:26   #1041  |  Link
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 413
Quote:
Originally Posted by dipje View Post
@khanattila: What about the results of that 'benchmark only' build you did?

Are you going to put 'optimal' settings inside the plugin based on the detected GPU / generation, or are you going to open up parameters like those so we can find our own optimal distribution with the final release?
Good point, the parameters will be already pre-calibrated, but you can overwrite them.
__________________
https://github.com/Khanattila
Old 6th January 2017, 18:50   #1042  |  Link
tormento
Acid fr0g
 
Join Date: May 2002
Location: Italy
Posts: 766
Quote:
Originally Posted by Khanattila View Post
Good point, the parameters will be already pre-calibrated, but you can overwrite them.
I think a --preset value, as in x264, with 4 or 5 groups of settings at most would be nice and noob-proof, better than too many parameters based on the video card.

Speed is determined by total script, not only a plugin. I could trade some speed for better prefiltering or anything else.
__________________
tormento@ircnet

Last edited by tormento; 6th January 2017 at 18:52.
Old 6th January 2017, 20:05   #1043  |  Link
dipje
Registered User
 
Join Date: Oct 2014
Posts: 228
But the parameters have nothing to do with quality or speed settings for the plugin. They are more a way to optimize the workload so it performs better on certain video cards.

Certain AMD generations need different parameters than other AMD cards, or something like that. My otherwise speedy GTX 1060, for instance, needed really different (low) settings to reach maximum speed compared with the latest RX 480 cards.

This has nothing to do with quality or what the plugin does. It is more about the way the plugin gives commands to the GPU, as far as I understand it.
Old 17th January 2017, 15:35   #1044  |  Link
hydra3333
Registered User
 
Join Date: Oct 2009
Location: crow-land
Posts: 436
Hello. A newbie type question, however worth asking since I find I am in need of guidance.

I wonder if you could clarify whether and when KNLMeansCL is appropriate to use as a plain denoiser by itself ? I noticed it's been used in SMDegrain however I am unclear if KNLMeansCL is appropriate for use as a denoiser in its own right, for example on a range of OTA TV captures for moderate denoising a la mdegrain1/2/3. Any advice or links to comparisons somewhere ?

Another objective is to attempt to identify GPU (eg OpenCL) based filters for the times when a fast workflow is OK, ie where some improved quality output is hoped for but not paramount and speed is valued. (Tools = ffmpeg and x264, portable vapoursynth_x64, win10_x64.)

So far I have only seen these GPU based filters
  1. DGDecodeNV for GPU decoding / deinterlacing (nvidia PureVideo) / resizing - in vapoursynth
  2. unsharp - an ffmpeg internal filter which uses OpenCL to sharpen
  3. KNLMeansCL for denoising - OpenCL in vapoursynth

Do you know of any other GPU filters usable in vapoursynth, or ffmpeg, especially sharpeners ? Is there already a list somewhere ?

Thanks.
Old 17th January 2017, 18:22   #1045  |  Link
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 413
Quote:
Originally Posted by hydra3333 View Post
Hello. A newbie type question, however worth asking since I find I am in need of guidance.

I wonder if you could clarify whether and when KNLMeansCL is appropriate to use as a plain denoiser by itself ? I noticed it's been used in SMDegrain however I am unclear if KNLMeansCL is appropriate for use as a denoiser in its own right, for example on a range of OTA TV captures for moderate denoising a la mdegrain1/2/3. Any advice or links to comparisons somewhere ?

Another objective is to attempt to identify GPU (eg OpenCL) based filters for the times when a fast workflow is OK, ie where some improved quality output is hoped for but not paramount and speed is valued. (Tools = ffmpeg and x264, portable vapoursynth_x64, win10_x64.)

So far I have only seen these GPU based filters
  1. DGDecodeNV for GPU decoding / deinterlacing (nvidia PureVideo) / resizing - in vapoursynth
  2. unsharp - an ffmpeg internal filter which uses OpenCL to sharpen
  3. KNLMeansCL for denoising - OpenCL in vapoursynth

Do you know of any other GPU filters usable in vapoursynth, or ffmpeg, especially sharpeners ? Is there already a list somewhere ?

Thanks.
I believe that my filter is the exception, not the rule.
Do not focus yourself on gpu based filters.
__________________
https://github.com/Khanattila
Old 17th January 2017, 19:02   #1046  |  Link
feisty2
I'm Siri
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,049
Quote:
Originally Posted by hydra3333 View Post
Hello. A newbie type question, however worth asking since I find I am in need of guidance.

I wonder if you could clarify whether and when KNLMeansCL is appropriate to use as a plain denoiser by itself ? I noticed it's been used in SMDegrain however I am unclear if KNLMeansCL is appropriate for use as a denoiser in its own right, for example on a range of OTA TV captures for moderate denoising a la mdegrain1/2/3. Any advice or links to comparisons somewhere ?

Another objective is to attempt to identify GPU (eg OpenCL) based filters for the times when a fast workflow is OK, ie where some improved quality output is hoped for but not paramount and speed is valued. (Tools = ffmpeg and x264, portable vapoursynth_x64, win10_x64.)
KNLMeansCL itself is an "appropriate" and very high quality denoiser. In theory it gives better quality than motion-compensation-based filters like MDeGrain, since motion compensation matches macroblocks in the temporal dimension only, while NLMeans does that in both the spatial and temporal dimensions. It has been misused to do other things thanks to folks like me, but that doesn't mean it has lost its original purpose: to work as a plain denoiser!
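
To make the patch-matching idea concrete, here is a toy 1D sketch in plain Python. It is not KNLMeansCL's kernel: the function name `nlmeans_1d`, the border clamping, and the exp(-dist / h^2) weighting are assumptions chosen for illustration, with the parameter names a, s and h borrowed from the plugin. A real filter runs the same kind of loop over x, y and, with d > 0, neighbouring frames.

```python
import math


def nlmeans_1d(signal, a=2, s=1, h=0.1):
    """Toy non-local means on a 1D list (illustrative sketch only).

    a: search radius, s: patch (similarity) radius, h: filter strength.
    """
    n = len(signal)

    def sample(i):
        # Clamp indices at the borders so every patch has the same length.
        return signal[min(max(i, 0), n - 1)]

    out = []
    for i in range(n):
        weight_sum = 0.0
        value_sum = 0.0
        for j in range(i - a, i + a + 1):
            if j < 0 or j >= n:
                continue  # candidate lies outside the signal
            # Mean squared difference between the patches centred on i and j.
            dist = sum(
                (sample(i + k) - sample(j + k)) ** 2 for k in range(-s, s + 1)
            ) / (2 * s + 1)
            # Similar patches get a weight near 1, dissimilar ones near 0.
            w = math.exp(-dist / (h * h))
            weight_sum += w
            value_sum += w * signal[j]
        # j == i always contributes weight 1, so weight_sum is never zero.
        out.append(value_sum / weight_sum)
    return out
```

Each output sample is a weighted average of the samples whose surrounding patch looks like its own, which is why a hard edge survives while flat areas get averaged.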
__________________
If I got new ideas, will post here: https://github.com/IFeelBloated
Old 17th January 2017, 19:09   #1047  |  Link
feisty2
I'm Siri
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,049
Also it's a big mystery how the hell NLMeans ended up acting as a "pre filter" for MDeGrain. You do realize that's like doing a motion compensation pre filtering for RemoveGrain, right?

EDIT: simple rule, the fancier filter gets to be the main filter, so MDeGrain should be the pre filter (rclip) and NLMeans should be the main filter.
__________________
If I got new ideas, will post here: https://github.com/IFeelBloated

Last edited by feisty2; 17th January 2017 at 19:23.
Old 17th January 2017, 20:25   #1048  |  Link
real.finder
Registered User
 
Join Date: Jan 2012
Location: Mesopotamia
Posts: 744
Quote:
Originally Posted by feisty2 View Post
Also it's a big mystery how the hell NLMeans ended up acting as a "pre filter" for MDeGrain. You do realize that's like doing a motion compensation pre filtering for RemoveGrain, right?

EDIT: simple rule, the fancier filter gets to be the main filter, so MDeGrain should be the pre filter (rclip) and NLMeans should be the main filter.
Ask Dogway. Anyway, maybe he wants a purely temporal filter, and KNLMeans has blending artifacts in the temporal dimension.
__________________
My Avisynth Stuff
Old 17th January 2017, 21:08   #1049  |  Link
feisty2
I'm Siri
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,049
Quote:
Originally Posted by real.finder View Post
Ask Dogway. Anyway, maybe he wants a purely temporal filter, and KNLMeans has blending artifacts in the temporal dimension.
I wouldn't be surprised about the "blending" artifacts, since a lot of people who use this filter set "a" to 2 or 3 or so.
Apparently the term "Non-Local" is rocket science and incomprehensible to them, and ironically it is the "Non-Local Means" filter they are using.

Set "a" to 32 and if you can still observe any blending artifacts, post a sample
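
A rough way to see why a small "a" is hardly non-local is to count how many candidate positions each pixel is compared against. The sketch below assumes the search window is a square of radius a, repeated over 2*d + 1 frames; that is one reading of the parameters, and the helper name `search_window_size` is made up for illustration.

```python
def search_window_size(a, d=0):
    """Candidate positions per pixel, assuming a (2a+1) x (2a+1) window
    repeated over (2d+1) frames (an assumption, see the text above)."""
    if a < 0 or d < 0:
        raise ValueError("a and d must be non-negative")
    return (2 * a + 1) ** 2 * (2 * d + 1)


# Compare a typical small radius with the a=32 suggested above.
for radius in (2, 3, 32):
    print(radius, search_window_size(radius))
```

Under that assumption a=2 gives 25 spatial candidates per pixel, while a=32 gives 4225, so the larger radius has far more chances to find a genuinely similar patch.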
__________________
If I got new ideas, will post here: https://github.com/IFeelBloated
Old 18th January 2017, 11:26   #1050  |  Link
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 413
https://github.com/Khanattila/KNLMea.../v1.0.0-beta.3
Code:
KNLMeansCL v1.0.0-beta.3
New:
* Improved performance with CPU.
* Improved performance with AMD GCN Architecture.
* Reduced CPU overload in some systems.
* Two modified bisquare weighting functions.
* Advanced OpenCL parameters for fine tuning.

Changed:
* Replaced 'cmode' with 'channels' and added the options to only process the chroma.
* Increased the maximum 's' value to 8.
* Updated to VapourSynth R35.

Removed:
* Cauchy weighting function.

Fixed:
* Second clip 'rclip' in some circumstances.
It corrects the errors of the previous beta.
__________________
https://github.com/Khanattila

Last edited by Khanattila; 19th January 2017 at 17:11.
Old 19th January 2017, 10:12   #1051  |  Link
feisty2
I'm Siri
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,049

something is still wrong..
__________________
If I got new ideas, will post here: https://github.com/IFeelBloated
Old 19th January 2017, 10:35   #1052  |  Link
feisty2
I'm Siri
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,049
I traced the error to "rclip": it crashes if rclip is not None.
__________________
If I got new ideas, will post here: https://github.com/IFeelBloated
Old 19th January 2017, 17:09   #1053  |  Link
Khanattila
Registered User
 
Join Date: Nov 2014
Posts: 413
Quote:
Originally Posted by feisty2 View Post
I traced the error to "rclip": it crashes if rclip is not None.
The bugs do not exist until someone discovers them, I am sure.
__________________
https://github.com/Khanattila
Old 19th January 2017, 17:33   #1054  |  Link
feisty2
I'm Siri
 
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,049
Quote:
Originally Posted by Khanattila View Post
The bugs do not exist until someone discovers them, I am sure.
I only test for floating point inputs, couldn't care less about all that integer crap.



script works fine with v0.7.7
__________________
If I got new ideas, will post here: https://github.com/IFeelBloated
Old 22nd January 2017, 11:17   #1055  |  Link
cork_OS
Registered User
 
Join Date: Mar 2016
Posts: 97
Beta 3 gives this error (beta 2 works OK):
Code:
 OpenCL Platform
------------------------------------------------------------
 CL_PLATFORM_VENDOR:                Advanced Micro Devices, Inc.
 CL_PLATFORM_NAME:                  AMD Accelerated Parallel Processing
 CL_PLATFORM_VERSION:               OpenCL 2.0 AMD-APP (2236.10)
 CL_PLATFORM_PROFILE:               FULL_PROFILE

 OpenCL Device
------------------------------------------------------------
 CL_DEVICE_VENDOR:                  Advanced Micro Devices, Inc.
 CL_DEVICE_NAME:                    Pitcairn
 CL_DRIVER_VERSION:                 2236.10
 CL_DEVICE_VERSION:                 OpenCL 1.2 AMD-APP (2236.10)
 CL_DEVICE_PROFILE:                 FULL_PROFILE
 CL_DEVICE_IMAGE_SUPPORT:           1
 CL_DEVICE_IMAGE2D_MAX_WIDTH:       16384
 CL_DEVICE_IMAGE2D_MAX_HEIGHT:      16384
 CL_DEVICE_IMAGE_MAX_ARRAY_SIZE:    2048

 Program Build
------------------------------------------------------------
 CL_PROGRAM_BUILD_OPTIONS:          -cl-single-precision-constant
                                    -cl-denorms-are-zero
                                    -cl-fast-relaxed-math
                                    -Werror        
                                    -D NLM_CLIP_TYPE_UNORM
                                    -D NLM_CLIP_REF_LUMA
                                    -D NLM_WMODE_WELSCH
                                    -D VI_DIM_X=720
                                    -D VI_DIM_Y=480
                                    -D HRZ_RESULT=1
                                    -D VRT_RESULT=1        
                                    -D HRZ_BLOCK_X=32
                                    -D HRZ_BLOCK_Y=8 
                                    -D VRT_BLOCK_X=32
                                    -D VRT_BLOCK_Y=8        
                                    -D NLM_D=1
                                    -D NLM_S=4
                                    -D NLM_H=1.200000
                                    -D NLM_WREF=1.000000
 CL_PROGRAM_BUILD_LOG:              
"C:\Users\cork_OS\AppData\Local\Temp\OCL7900T8.cl", line 19: error: global
          variable declaration is corrected by the compiler to have addrSpace
          constant
  const sampler_t nne = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_NONE  | CLK_FILTER_NEAREST;                      
                  ^

"C:\Users\cork_OS\AppData\Local\Temp\OCL7900T8.cl", line 20: error: global
          variable declaration is corrected by the compiler to have addrSpace
          constant
  const sampler_t clm = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP | CLK_FILTER_NEAREST;                      
                  ^

2 errors detected in the compilation of "C:\Users\cork_OS\AppData\Local\Temp\OCL7900T8.cl".
Frontend phase failed compilation.


 RETURN:                            0
__________________
I'm infected with poor sources.
Old 24th January 2017, 12:12   #1056  |  Link
jmac698
Registered User
 
Join Date: Jan 2006
Posts: 1,859
Does this work with OpenCL 1.1? I have Fermi/Nvidia. Maybe before version 0.7?
Old 24th January 2017, 12:36   #1057  |  Link
Groucho2004
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
 
Join Date: Mar 2006
Posts: 3,406
Quote:
Originally Posted by jmac698 View Post
Does this work with OpenCL 1.1? I have Fermi/Nvidia. Maybe before version 0.7?
See here, last paragraph.
Old 24th January 2017, 12:51   #1058  |  Link
jmac698
Registered User
 
Join Date: Jan 2006
Posts: 1,859
Right, thanks. Testing.

Edit:
Code:
colorbars()
KNLMeansCL()
Code:
Plugin was designed for a later version of Avisynth
Am using Avisynth+ 0.1 (r1576, x86)

Another question: if it did work with the x64 version of Avs+ and I used the 64-bit version of the plugin, would I then need the 64-bit version of VC2013?

Was hoping to use it with Avs+.

Will try to find a later version.

Edit 2:
Found latest build on page 145 of Avisynth+ thread.
http://www.mediafire.com/file/bazu8v...splus-r2397.7z

Edit 3:
Used Avs+ 0.1 (r2397, MT, i386)
Code:
KNLMeansCL: AviSynthCreate error (clBuildProgram)!
Please report Log-KNLMeansCL.txt
I can't find that .txt file.

Edit 4:
Got it to work
Avs 2.60
Code:
KNLMeansCL(device_type="GPU")
Also found the log file now (it may have been there previously) in the same place as the .avs file.

The errors were:
Code:
---------------------------------
*** Error in OpenCL compiler ***
---------------------------------

# Build Options
-cl-single-precision-constant -cl-denorms-are-zero -cl-fast-relaxed-math -Werror -D H_BLOCK_X=32 -D H_BLOCK_Y=4 -D V_BLOCK_X=32 -D V_BLOCK_Y=4 -D NLMK_TCLIP=76 -D NLMK_S=4 -D NLMK_WMODE=1 -D NLMK_TEMPORAL=0 -D NLMK_H2_INV_NORM=185.828175 -D NLMK_BIT_SHIFT=0

# Build Log
:119:47: error: double precision constant requires cl_khr_fp64, casting to single precision
:122:40: error: double precision constant requires cl_khr_fp64, casting to single precision
:125:43: error: double precision constant requires cl_khr_fp64, casting to single precision
So I had to set the device type to GPU. Works with Nvidia Fermi and OpenCL 1.1.
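
When comparing logs like the one above between machines, it can help to pull the -D defines out of the build options line. Here is a minimal Python sketch; `parse_defines` is a hypothetical helper, not part of the plugin, and it only assumes the options are whitespace-separated as they appear in the log.

```python
def parse_defines(build_options):
    """Return the -D NAME=VALUE pairs from an OpenCL build options string.

    Flag-style defines without '=' map to an empty string.
    """
    tokens = build_options.split()
    defines = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "-D" and i + 1 < len(tokens):
            # Form used in the logs: "-D NAME=VALUE" as two tokens.
            name, _, value = tokens[i + 1].partition("=")
            defines[name] = value
            i += 2
        elif tok.startswith("-D") and len(tok) > 2:
            # Compact form: "-DNAME=VALUE" as a single token.
            name, _, value = tok[2:].partition("=")
            defines[name] = value
            i += 1
        else:
            i += 1  # other compiler flags such as -Werror are skipped
    return defines


options = "-cl-fast-relaxed-math -Werror -D H_BLOCK_X=32 -D H_BLOCK_Y=4 -D NLMK_S=4"
print(parse_defines(options))
```

Values are kept as strings on purpose, since some defines in these logs are floats and others are bare flags.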

Next: to test avs+ again, then benchmark.

Last edited by jmac698; 24th January 2017 at 13:32.
Old 24th January 2017, 13:50   #1059  |  Link
jmac698
Registered User
 
Join Date: Jan 2006
Posts: 1,859
Tested by playing in media player, running at 30fps results in 97% GPU usage and 24% memory controller load. I had to stop quickly because GPU reached 105deg C and could cause thermal shutdown. I guess I need a laptop cooler to use this

Edit:
This is a known issue with my laptop: the heatsink does not physically touch the Nvidia chip, so there is only a thermal pad. The Nvidia chip isn't as tall as the CPU. The mod solution is to insert a 0.8mm copper shim, which reduces the temperature by 20deg C. So until I can make this mod, I can't use OpenCL

The Intel HD 3000 can use DirectCompute, but OpenCL is not supported, though there is 1.2 emulation on the CPU. Have you looked into this? Does anyone use DirectCompute? It should be supported on any DX10.1 GPU.

Last edited by jmac698; 25th January 2017 at 14:25.
Old 25th January 2017, 18:39   #1060  |  Link
kgrabs
Registered User
 
Join Date: Jan 2017
Posts: 15
I got a log file to report, too. First, the script (same thing happens regardless of the KNL settings tho):
Code:
LWLibavVideoSource(source="00001.m2ts")
Dither_convert_8_to_16()
KNLMeansCL(d=1, a=2, s=0, h=7, channels="UV", device_type="GPU", lsb_inout=true)
DitherPost(mode=8)
And the log:
Quote:

OpenCL Platform
------------------------------------------------------------
CL_PLATFORM_VENDOR: Advanced Micro Devices, Inc.
CL_PLATFORM_NAME: AMD Accelerated Parallel Processing
CL_PLATFORM_VERSION: OpenCL 2.0 AMD-APP (1800.8)
CL_PLATFORM_PROFILE: FULL_PROFILE

OpenCL Device
------------------------------------------------------------
CL_DEVICE_VENDOR: Advanced Micro Devices, Inc.
CL_DEVICE_NAME: BeaverCreek
CL_DRIVER_VERSION: 1800.8 (VM)
CL_DEVICE_VERSION: OpenCL 1.2 AMD-APP (1800.8)
CL_DEVICE_PROFILE: FULL_PROFILE
CL_DEVICE_IMAGE_SUPPORT: 1
CL_DEVICE_IMAGE2D_MAX_WIDTH: 16384
CL_DEVICE_IMAGE2D_MAX_HEIGHT: 16384
CL_DEVICE_IMAGE_MAX_ARRAY_SIZE: 2048

Program Build
------------------------------------------------------------
CL_PROGRAM_BUILD_OPTIONS: -cl-single-precision-constant
-cl-denorms-are-zero
-cl-fast-relaxed-math
-Werror
-D NLM_CLIP_TYPE_STACKED
-D NLM_CLIP_REF_CHROMA
-D NLM_WMODE_WELSCH
-D VI_DIM_X=960
-D VI_DIM_Y=540
-D HRZ_RESULT=1
-D VRT_RESULT=1
-D HRZ_BLOCK_X=32
-D HRZ_BLOCK_Y=8
-D VRT_BLOCK_X=32
-D VRT_BLOCK_Y=8
-D NLM_D=1
-D NLM_S=0
-D NLM_H=7.000000
-D NLM_WREF=1.000000
CL_PROGRAM_BUILD_LOG:
Warnings being treated as errors
"C:\Users\mikeay\AppData\Local\Temp\OCL6E22.tmp.cl", line 19: error: global
variable declaration is corrected by the compiler to have addrSpace
constant
const sampler_t nne = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_NONE | CLK_FILTER_NEAREST;
^

"C:\Users\mikeay\AppData\Local\Temp\OCL6E22.tmp.cl", line 20: error: global
variable declaration is corrected by the compiler to have addrSpace
constant
const sampler_t clm = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP | CLK_FILTER_NEAREST;
^

"C:\Users\mikeay\AppData\Local\Temp\OCL6E22.tmp.cl", line 198: error: "val_x"
has already been declared in the current scope
float val_x = native_divide(num_y, den);
^

"C:\Users\mikeay\AppData\Local\Temp\OCL6E22.tmp.cl", line 199: error:
identifier "val_y" is undefined
write_imagef(U1_out, s, (float4) (val_x, val_y, 0.0f, 0.0f));
^

"C:\Users\mikeay\AppData\Local\Temp\OCL6E22.tmp.cl", line 199: error: not
enough initializer values
write_imagef(U1_out, s, (float4) (val_x, val_y, 0.0f, 0.0f));
^

"C:\Users\mikeay\AppData\Local\Temp\OCL6E22.tmp.cl", line 295: error:
identifier "u" is undefined
write_imageui(R_lsb, s, (uint4) (u & 0xFF, 0u, 0u, 0u));
^

"C:\Users\mikeay\AppData\Local\Temp\OCL6E22.tmp.cl", line 295: error: not
enough initializer values
write_imageui(R_lsb, s, (uint4) (u & 0xFF, 0u, 0u, 0u));
^

"C:\Users\mikeay\AppData\Local\Temp\OCL6E22.tmp.cl", line 296: error:
identifier "v" is undefined
write_imageui(G_lsb, s, (uint4) (v & 0xFF, 0u, 0u, 0u));
^

"C:\Users\mikeay\AppData\Local\Temp\OCL6E22.tmp.cl", line 296: error: not
enough initializer values
write_imageui(G_lsb, s, (uint4) (v & 0xFF, 0u, 0u, 0u));
^

9 errors detected in the compilation of "C:\Users\mikeay\AppData\Local\Temp\OCL6E22.tmp.cl".
Frontend phase failed compilation.


RETURN: 0
and my version: Avisynth+ 0.1 (r2172, MT, i386) x86

...aaand my computer jazz:
Code:
Windows 7 Home Premium 64 bit, Service Pack 1
HP Pavilion dv7 Notebook
AMD A6-3400M APU with Radeon HD Graphics 1.40 GHz
RAM: 6GB (5.48 usable)
Everything is 32 bit, excluding my OS, and possibly my OpenCL driver by proxy. Not sure how to check tbh

The previous beta seemed to have some weird quirks too. With chroma processing it had green bleeding in around the borders. Luma-only mode seemed fine. v0.7.7 had no apparent issues