Old 12th April 2015, 19:06   #81  |  Link
Khanattila
Registered User
Join Date: Nov 2014
Posts: 440
Quote:
Originally Posted by Mangix View Post
Latest Nvidia drivers support OpenCL 1.2. Any performance benefits?
Nvidia doesn't seem to like OpenCL, maybe because of CUDA.
The OpenCL 1.1 and 1.2 APIs are different. The plugin would have to be rewritten to take full advantage of 1.2.

EDIT: OpenCL 1.2 was announced on November 15, 2011.

Last edited by Khanattila; 12th April 2015 at 19:11.
Old 14th April 2015, 11:51   #82  |  Link
Atak_Snajpera
RipBot264 author
Join Date: May 2006
Location: Poland
Posts: 7,812
What are your thoughts about OpenCL 2.0? AMD has just added support in its latest drivers.
Old 17th April 2015, 18:17   #83  |  Link
Khanattila
Registered User
Join Date: Nov 2014
Posts: 440
Quote:
Originally Posted by Atak_Snajpera View Post
What are your thoughts about OpenCL 2.0? AMD has just added support in its latest drivers.

It is a compatibility issue. I'd like to use OpenCL 2.0, but I can't.

To reach the widest possible range of hardware I have to target OpenCL 1.1.
NVIDIA CUDA Toolkit v7.0 doesn't even support OpenCL 1.2. (http://developer.download.nvidia.com/compute/cuda/7_0/Prod/doc/CUDA_Toolkit_Release_Notes.pdf)

I don't want to build two different versions, and I hope I won't have to.
Old 17th April 2015, 18:24   #84  |  Link
Khanattila
Registered User
Join Date: Nov 2014
Posts: 440
Quote:
Originally Posted by Stereodude View Post
Sorry for the late response.
A large "A" probably uses too much private/local memory for your GPU.

(A GPU has three different kinds of memory: private, local and global. Private memory is the fastest and global memory is the largest.)

Please reduce the value of A.
Old 17th April 2015, 18:28   #85  |  Link
Khanattila
Registered User
Join Date: Nov 2014
Posts: 440
I will probably release a new version of NLMeansCL2 this week.
Since it will be a major change, for now I'll call it NLMeansCL2b(),
so you can keep both versions installed.
Old 21st April 2015, 10:56   #86  |  Link
Khanattila
Registered User
Join Date: Nov 2014
Posts: 440
NLMeansCL2b - v.0.5 - Beta 1

Code:
NLMeansCL2b (
    clip src, 
    int D (0),            // Temporal windows, disabled in Beta 1
    int A (4),            // Search window
    int S (2),            // Similarity neighborhood window
    int B (0),            // Base window
    int wmode (1),        // Weighting function
    float h (1.8),        // Strength of the filtering
    string device_type ("default"), 
    int y (3), 
    int u (2), 
    int v (2), 
    bool lsb_inout (false), 
    bool info (false)
)
A brief explanation:

[wmode = 0] The Cauchy weighting function has a very slow decay.
It assigns larger weights to dissimilar blocks than the Leclerc
robust function, which will eventually lead to oversmoothing.

[wmode = 1] The Leclerc weighting function has a faster decay,
but still assigns positive weights to dissimilar blocks. This is the
original NLMeans weighting function.

[wmode = 2] The Bisquare weighting function uses a soft threshold.
Download: removed.

################

New:
  • Temporal windows search (disabled in Beta1).
  • Cauchy weighting function.
  • Bisquare weighting function.
Removed:
  • Ay, Sy and By arguments.
  • aa argument (simple Euclidean distance).
  • sse argument (replaced by wmode).
  • legacy kernel.
Changed:
  • Patchwise Implementation
  • Euclidean distance
Updated:
  • NVIDIA CUDA Toolkit v7.0.
  • AVS 2.6.0 RC 2 [150331].

Last edited by Khanattila; 27th April 2015 at 18:22.
Old 21st April 2015, 18:52   #87  |  Link
Reel.Deel
Registered User
Join Date: Mar 2012
Location: Texas
Posts: 1,666
Hi Khanattila, thanks for the update! I tried using NLMeansCL2b with AviSynth+ r1576 on my work computer but I got this error message: "Plugin was designed for a later version of Avisynth (6)"
I updated to r1779 and it works with that version. Was that intentional or is there something else going on? If so will this also be true for anyone using an older version before AviSynth 2.6 R2?
Old 21st April 2015, 19:56   #88  |  Link
Khanattila
Registered User
Join Date: Nov 2014
Posts: 440
Quote:
Originally Posted by Reel.Deel View Post
Hi Khanattila, thanks for the update! I tried using NLMeansCL2b with AviSynth+ r1576 on my work computer but I got this error message: "Plugin was designed for a later version of Avisynth (6)"
I updated to r1779 and it works with that version. Was that intentional or is there something else going on? If so will this also be true for anyone using an older version before AviSynth 2.6 R2?
Hi! Avisynth 2.6.0 has many bug fixes and improvements, so it is a good idea to upgrade. For this reason the plugin requires an up-to-date version. However, I do not know how Avisynth+ handles this.

Version 6 is 2.6.0.
Version 5 is 2.6.0a1-a5.
Version 4 is reserved.
Version 3 is 2.5.6.

Last edited by Khanattila; 21st April 2015 at 21:49.
Old 21st April 2015, 21:38   #89  |  Link
StainlessS
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
I think there were multiple versions of the Avisynth header at VERSION 5; a change was made to the header at some point which made plugins compiled with it
require Avisynth v2.6 Alpha 4+. The previous-to-current version of the ClipClop plugin crashed on Avisynth v2.6a3 and earlier
(as did other plugins, immediately at startup).
__________________
I sometimes post sober.
StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace

"Some infinities are bigger than other infinities", but how many of them are infinitely bigger ???

Last edited by StainlessS; 21st April 2015 at 21:42.
Old 21st April 2015, 22:41   #90  |  Link
Groucho2004
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by StainlessS View Post
I think there were multiple versions of Avisynth Header VERSION 5, a change was made to the header at some point which made plugins compiled with it
require Avisynth versionv2.6 Alpha 4+.
I think you're getting this mixed up with the introduction of "AVS_linkage" in 2.6 Alpha4.

As for the "AVISYNTH_INTERFACE_VERSION":
"3" : 2.5.x
"5" : 2.6.0 < RC1
"6" : 2.6.0 >= RC1

As for AVS+, the latest builds (r17xx) have the header updated to v6.
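
The error message quoted earlier in the thread fits this numbering: a plugin built against interface version 6 is refused by a host that only implements version 5. Below is a rough Python model of that comparison, using the mapping above. The function name and the way the host version is passed in are illustrative assumptions, not AviSynth's actual API.

```python
# AVISYNTH_INTERFACE_VERSION values as listed above.
INTERFACE_VERSIONS = {
    3: "2.5.x",
    5: "2.6.0 < RC1",
    6: "2.6.0 >= RC1",
}

def check_version(host_version: int, plugin_version: int) -> None:
    """Refuse a plugin built for a newer interface than the host offers.

    Hypothetical helper: it only models the comparison, nothing else.
    """
    if plugin_version > host_version:
        raise RuntimeError(
            f"Plugin was designed for a later version of Avisynth ({plugin_version})"
        )

if __name__ == "__main__":
    check_version(host_version=6, plugin_version=6)      # accepted
    try:
        check_version(host_version=5, plugin_version=6)  # older host
    except RuntimeError as err:
        print(err)
```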
__________________
Groucho's Avisynth Stuff
Old 22nd April 2015, 03:18   #91  |  Link
StainlessS
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
Guilty as charged M'lud
Old 27th April 2015, 18:39   #92  |  Link
Khanattila
Registered User
Join Date: Nov 2014
Posts: 440
Code:
KNLMeansCL (
    clip src, 
    int D (0),                        // Temporal window
    int A (4),                        // Search window
    int S (2),                        // Similarity neighborhood window
    int wmode (1),                    // Weighting function
    float h (1.8),                    // Strength of the filtering
    string device_type ("default"), 
    bool lsb_inout (false), 
    bool info (false)
)
Changelog
  • v0.5.0 Beta2 (2015-04-27)
    - New: Temporal windows search.
    - New: Cauchy weighting function.
    - New: Bisquare weighting function.
    - Changed: plugin name!
    - Changed: luminance (y) is now always processed.
    - Changed: simple Euclidean distance.
    - Changed: sse argument, replaced by wmode.
    - Removed: u and v arguments.
    - Removed: Ay and Sy.
    - Removed: patchwise Implementation.
    - Removed: aa argument.
    - Removed: legacy kernel.
    - Updated: NVIDIA CUDA Toolkit v7.0.
    - Updated: AVS 2.6.0 RC 2 [150331].

Last edited by Khanattila; 6th May 2016 at 15:45.
Old 27th April 2015, 22:14   #93  |  Link
Groucho2004
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by Khanattila View Post
Download: KNLMeansCL_v0.5_Beta2.
Thanks for this new version. I suppose the default for "D" (0) disables temporal operation? I get a huge performance hit when I set it to "1" which I guess is normal.
I really like this filter and I think I have to consider upgrading from my old GT240.
So, how about running a simple benchmark to see how various video cards fare? I'm thinking something like this:
Code:
colorbars(width = 1280, height = 720, pixel_type = "yv12").killaudio().assumefps(24000, 1001)
trim(0,99)
fadeio(48)
trim(0,99)
KNLMeansCL(D = 1)
Running this script through AVSMeter (-log -gpu), I get these results:

Code:
[Runtime info]
Frames processed:               100 (0 - 99)
FPS (min | max | average):      1.421 | 1.452 | 1.448
Memory usage (phys | virt):     44 | 43 MB
Thread count:                   8
CPU usage (average):            25%
GPU usage (average):            99%
Video engine load (average):    0%
GPU memory usage:               67 MB
Time (elapsed):                 00:01:09.051

[Graphics card info]
Card name:               NVIDIA GeForce GT 240
GPU name:                GT215
Memory size:             512
OpenCL version:          OpenCL 1.0 CUDA
Driver version:          6.14.13.4052 (ForceWare 340.52) / XP
Old 28th April 2015, 00:31   #94  |  Link
Khanattila
Registered User
Join Date: Nov 2014
Posts: 440
D is the number of past and future frames that the filter uses to denoise the current frame.

D = 0, only current frame (n).
D = 1, use n - 1, n, n + 1.
D = 2, use n - 2, n - 1, n, n + 1, n + 2.
etc.
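
The same pattern as a small Python helper. This is only a sketch: the function name is made up, and it deliberately ignores what happens at the start and end of the clip, since that is not described here.

```python
def temporal_window(n: int, d: int) -> list[int]:
    """Frame numbers used to denoise frame n for a given D.

    No clip-boundary handling: indices may fall outside the clip.
    """
    if d < 0:
        raise ValueError("D must be non-negative")
    return list(range(n - d, n + d + 1))

if __name__ == "__main__":
    for d in (0, 1, 2):
        print(d, temporal_window(10, d))
```

The window always holds 2 * D + 1 frames.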

D = 0 also exploits the symmetry of the weights, i.e. w(p, p + q) = w(p + q, p). This requires an accumulation buffer, though, which probably hurts performance on an old GPU.

(Symmetry could also be exploited in temporal mode, but it requires too many checks.)
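
To illustrate the symmetry trick, here is a toy 1-D Python sketch. Each pair (p, p + q) is visited once, its weight is computed once, and the result is added to the accumulators of both pixels. The pixel-wise weight function, the 1-D signal and all names are assumptions chosen to keep the example short; it is not the OpenCL kernel.

```python
import math

def weight(a: float, b: float, h: float) -> float:
    # Illustrative pixel-wise weight; symmetric in a and b by construction.
    return math.exp(-((a - b) ** 2) / (h * h))

def denoise_full(x, A, h):
    """Reference: every pixel scans offsets -A..A except 0."""
    n = len(x)
    out = []
    for p in range(n):
        num, den = x[p], 1.0  # centre pixel with weight 1
        for q in range(-A, A + 1):
            if q != 0 and 0 <= p + q < n:
                w = weight(x[p], x[p + q], h)
                num += w * x[p + q]
                den += w
        out.append(num / den)
    return out

def denoise_symmetric(x, A, h):
    """Same result, but only offsets 1..A are evaluated.

    Each weight is scattered to both p and p + q, which is why
    accumulation buffers (num, den) covering the whole signal are needed.
    """
    n = len(x)
    num = list(x)
    den = [1.0] * n
    for p in range(n):
        for q in range(1, A + 1):
            if p + q < n:
                w = weight(x[p], x[p + q], h)  # w(p, p + q) == w(p + q, p)
                num[p] += w * x[p + q]
                den[p] += w
                num[p + q] += w * x[p]
                den[p + q] += w
    return [s / d for s, d in zip(num, den)]

if __name__ == "__main__":
    signal = [1.0, 2.0, 4.0, 3.0, 5.0, 2.5]
    print(denoise_full(signal, 2, 1.8))
    print(denoise_symmetric(signal, 2, 1.8))
```

Both functions produce the same output; the symmetric one evaluates half as many weights but has to write to a neighbour's accumulator, which is the extra buffer traffic mentioned above.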

Quote:
KNLMeansCL(D = 0)

[Runtime info]
Frames processed: 100 (0 - 99)
FPS (min | max | average): 27.89 | 45.95 | 43.44
CPU usage (average): 24%
GPU usage (average): 73%
Thread count: 8
Memory usage (phys | virt): 60 | 75 MB
Time (elapsed): 00:00:02.302
Quote:
KNLMeansCL(D = 1)

[Runtime info]
Frames processed: 100 (0 - 99)
FPS (min | max | average): 6.984 | 7.730 | 7.598
CPU usage (average): 25%
GPU usage (average): 94%
Thread count: 8
Memory usage (phys | virt): 63 | 77 MB
Time (elapsed): 00:00:13.161
Quote:
KNLMeansCL(D = 2)

[Runtime info]
Frames processed: 100 (0 - 99)
FPS (min | max | average): 4.368 | 4.653 | 4.587
CPU usage (average): 25%
GPU usage (average): 95%
Thread count: 8
Memory usage (phys | virt): 64 | 79 MB
Time (elapsed): 00:00:21.799
EDIT.
Computational complexity: ((2 * A + 1) * (2 * A +1) * (2 * D + 1) - 1) / (D ? 1 : 2)
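
Reading this expression as the number of weight evaluations per pixel (that interpretation and the function name are assumptions), a direct Python transcription gives 40 for D = 0, 242 for D = 1 and 404 for D = 2 at the default A = 4. The ratios, roughly 6x and 10x, are in the same range as the slowdowns measured above (43.44 / 7.598 ≈ 5.7 and 43.44 / 4.587 ≈ 9.5).

```python
def weights_per_pixel(A: int, D: int) -> int:
    """((2A+1) * (2A+1) * (2D+1) - 1) / (D ? 1 : 2), transcribed from the post."""
    positions = (2 * A + 1) * (2 * A + 1) * (2 * D + 1) - 1  # exclude the centre
    divisor = 1 if D else 2  # symmetry halves the work only when D == 0
    return positions // divisor  # exact: positions is even when D == 0

if __name__ == "__main__":
    for D in (0, 1, 2):
        print(D, weights_per_pixel(4, D))
```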

Last edited by Khanattila; 28th April 2015 at 00:34.
Old 28th April 2015, 01:25   #95  |  Link
Groucho2004
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
I suppose you used the GTX760 for these measurements?

Last edited by Groucho2004; 28th April 2015 at 10:02.
Old 28th April 2015, 11:53   #96  |  Link
Khanattila
Registered User
Join Date: Nov 2014
Posts: 440
Quote:
Originally Posted by Groucho2004 View Post
I suppose you used the GTX760 for these measurements?
Code:
[Graphics card info]
Card name:               NVIDIA GeForce GTX 760
GPU name:                GK104
Memory size:             2048
OpenCL version:          OpenCL 1.2 CUDA
Driver version:          9.18.13.5012 WHQL (ForceWare 350.12) / Win8.1 64
EDIT.
Would you try this?
- 720x480. KNLMeansCL(0, 2, 1)
- 720x480. KNLMeansCL(0, 3, 1)
- 720x480. KNLMeansCL(0, 5, 1)

The original implementation (see the paper below) reaches 100.00 FPS / 52.46 FPS / 18.46 FPS on a 9600 GT.

(B. Goossens, H.Q. Luong, J. Aelterman, A. Pizurica, and W. Philips,
"A GPU-Accelerated Real-Time NLMeans Algorithm for Denoising Color Video Sequences",
in Proc. ACIVS (2), 2010, pp.46-57. )

Last edited by Khanattila; 28th April 2015 at 12:15.
Old 28th April 2015, 12:19   #97  |  Link
Groucho2004
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by Khanattila View Post
Would you try this?
- 720x480. KNLMeansCL(0, 2, 1)
- 720x480. KNLMeansCL(0, 3, 1)
- 720x480. KNLMeansCL(0, 5, 1)

Original 9600 GT take: 100.00 FPS / 52.46 FPS / 18.46 FPS.
Code:
colorbars(width = 720, height = 480, pixel_type = "yv12").killaudio().assumefps(24000, 1001)
trim(0,99)
fadeio(48)
trim(0,99)

KNLMeansCL(0, x, 1) #x = 2, 3, 5
GT 240: 77 FPS / 40 FPS / 16.5 FPS.

How can this be? The 9600 is ancient.
Old 28th April 2015, 14:15   #98  |  Link
Khanattila
Registered User
Join Date: Nov 2014
Posts: 440
Quote:
Originally Posted by Groucho2004 View Post
Code:
colorbars(width = 720, height = 480, pixel_type = "yv12").killaudio().assumefps(24000, 1001)
trim(0,99)
fadeio(48)
trim(0,99)

KNLMeansCL(0, x, 1) #x = 2, 3, 5
GT 240: 77 FPS / 40 FPS / 16.5 FPS.

How can this be? The 9600 is ancient.
The code is not fully optimized... but it is roughly the same card: http://www.tomshardware.com/reviews/...40,2475-5.html
Old 30th April 2015, 10:35   #99  |  Link
Khanattila
Registered User
Join Date: Nov 2014
Posts: 440
Code:
       Beta2        Beta3      Change
S=0    59.05 FPS    52.44 FPS  -11%
S=1    47.35 FPS    51.82 FPS  + 9%
S=2    43.99 FPS    51.43 FPS  +17%
S=3    38.88 FPS    50.39 FPS  +30%
S=4    36.89 FPS    49.20 FPS  +33%
Ready for the final release.
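
The percentage column is the relative change from Beta2 to Beta3, (Beta3 / Beta2 - 1) * 100, rounded to the nearest integer. A quick Python check of that arithmetic (the helper name is made up for illustration):

```python
def percent_change(old_fps: float, new_fps: float) -> int:
    """Relative change from old to new, in percent, rounded to an integer."""
    return round((new_fps / old_fps - 1.0) * 100.0)

# (S, Beta2 FPS, Beta3 FPS) as posted above.
RESULTS = [
    (0, 59.05, 52.44),
    (1, 47.35, 51.82),
    (2, 43.99, 51.43),
    (3, 38.88, 50.39),
    (4, 36.89, 49.20),
]

if __name__ == "__main__":
    for s, beta2, beta3 in RESULTS:
        print(f"S={s}  {percent_change(beta2, beta3):+d}%")
```

Beta3 is slower only for S = 0; the gain grows with S because its frame rate barely changes with S while Beta2's drops.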

Last edited by Khanattila; 6th May 2016 at 15:45.
Old 30th April 2015, 21:18   #100  |  Link
Groucho2004
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by Khanattila View Post
I'm getting an error with this version:
Code:
[build_programm (CL_BUILD_PROGRAM_FAILURE)]
Also, a text file ("KNLMeansCL.txt") is created with this content:
Code:
"error: macro 'V_BLOCK_Y' contains embedded newline, text after the newline is ignored."
All times are GMT +1.