Doom9's Forum > Capturing and Editing Video > Avisynth Development

Old 17th January 2011, 23:07   #1  |  Link
Jawed
Registered User
 
Join Date: Jan 2008
Location: London
Posts: 156
Deathray - OpenCL GPU accelerated spatial/temporal non-local means de-noising

I've created Deathray, an Avisynth plug-in filter for spatial/temporal non-local means de-noising. It uses OpenCL for GPU acceleration.

The project lives on GitHub now: Deathray on GitHub

To download the DLL only:
On the Deathray project page click on "Deathray.dll" in the list of files. On the new page, press the "Raw" button and you will be offered the option of saving the file.
To download the entire source code, including the compiled DLL:
On the Deathray project page click on the Download ZIP button at the bottom of the right-hand column.
Version History

Code:
1.04 - Marks use of OpenCL 1.1 features, that have been deprecated in 1.2, as preferred. This keeps AMD SDK from complaining.

1.03 - Fixed black pixels bug with low values of hY and z=true (also affected chroma with low values of hUV)
       Fixed bug with allocation of a spurious buffer

1.02 - Added new l, c, z and b options
       Temporal filtering now re-weights target pixel based upon all sampled frames

1.01 - Removed logging to stderr (was only present in NVidia debug version);
       Includes post-filtering correction first introduced in NVidia debug version;
       Updated to use AMD APP SDK;
       Updated to VS2012 Express Edition;
       Updated DLL so that it is not dependent upon visual C runtime - see post 70;

1.00 - Initial version
Post 90 contains a version of Deathray which ignores the Intel OpenCL platform. This is a quick and dirty fix for people who have the Intel graphics driver installed and find that it is stopping Deathray from working on their graphics card.

I've quoted here the Deathray readme.txt:

Code:
Deathray
========

An Avisynth plug-in filter for spatial/temporal non-local means de-noising.

Created by Jawed Ashraf - Deathray@cupidity.f9.co.uk


Installation
============

Copy the Deathray.dll to the "plugins" sub-folder of your installation of 
Avisynth.


De-installation
===============

Delete the Deathray.dll from the "plugins" sub-folder of your installation of 
Avisynth.


Compatibility
=============

The following software configurations are known to work:

 - Avisynth 2.5.8         and 2.6 MT (SEt's)
 - AMD Stream SDK 2.3     and AMD APP SDK 2.8.1
 - AMD Catalyst 10.12     and 13.8 beta 2
 - Windows Vista 64-bit   and Windows 8 64-bit
 
 - NVidia software is known to work but drivers unknown

The following hardware configurations are known to work:

 - ATI HD 5870
 - AMD HD 7770
 - AMD HD 7970
 - Various NVidia, models unknown

Known non-working hardware:

 - ATI cards in the 4000 series or earlier
 - ATI cards in the 5400 series

Video:

 - Deathray is compatible solely with 8-bit planar formatted video. It has
   been tested with YV12 format.


Usage
=====

Deathray separates the video into its 3 component planes and processes each
of them independently. This means some parameters come in two flavours: luma
and chroma.

Filtering can be adjusted with the following parameters, with the default 
value for each in brackets:

 hY  (1.0) - strength of de-noising in the luma plane.

             Cannot be negative.

             If set to 0 Deathray will not process the luma plane.

 hUV (1.0) - strength of de-noising in the chroma planes.

             Cannot be negative.

             If set to 0 Deathray will not process the chroma planes.

 tY  (0)   - temporal radius for the luma plane.

             Limited to the range 0 to 64.

             When set to 0 spatial filtering is performed on the 
             luma plane. When set to 1 filtering uses the prior,
             current and next frames for the non-local sampling
             and weighting process. Higher values will increase
             the range of prior and next frames that are included.

 tUV (0)   - temporal radius for the chroma planes.

             Limited to the range 0 to 64.

             When set to 0 spatial filtering is performed on the 
             chroma planes. When set to 1 filtering uses the prior,
             current and next frames for the non-local sampling
             and weighting process. Higher values will increase
             the range of prior and next frames that are included.

 s   (1.0) - sigma used to generate the gaussian weights.

             Limited to values of at least 0.1.

             The kernel implemented by Deathray uses 7x7-pixel 
             windows centred upon the pixel being filtered. 

             For a 2-dimensional gaussian kernel sigma should be 
             approximately 1/3 of the radius of the kernel, or less,
             to retain its gaussian nature. 

             Since a 7x7 window has a radius of 3, values of sigma 
             greater than 1.0 will tend to bias the kernel towards
             a box-weighting. i.e. all pixels in the window will 
             tend towards being equally weighted. This will tend to 
             reduce the selectivity of the weighting process and 
             result in relatively stronger spatial blurring.

 x   (1)   - factor to expand sampling.

             Limited to values in the range 1 to 14.

             By default Deathray spatially samples 49 windows 
             centred upon the pixel being filtered, in a 7x7
             arrangement. x increases the sampling range in
             multiples of the kernel radius.
             
             Since the kernel radius is 3, setting x to 2 produces
             a sampling area of 13x13, i.e. 169 windows centred
             upon the target pixel. Yet higher values of x such as
             3 or 4 will result in 19x19 or 25x25 sample windows.

             Deathray uses 32x32 tiles to accelerate its processing.
             Each tile is equipped with a border of 8 pixels around
             all four edges, with pixels copied from neighbouring 
             tiles, or mirrored from within the tile if the tile 
             edge corresponds with a frame edge. This apron of 8
             extra pixels ensures that the default sampling of 
             49 windows is correct, allowing pixels near the edge of
             the tile to employ 49 sample windows that all have
             valid pixels.

             When x is set to 2 or more, sampling will "bump" into
             the edges defined by the 48x48 region. With strong 
             values of the de-noising parameters this will create
             artefacts in the filtered image. These artefacts are
             visible as a grid of vertical and horizontal lines
             corresponding with the 32x32 arrangement of the tiles.

 l (false) - linear processing of luma plane.
 
             true or false.

             This option allows processing in linear space instead
             of the default gamma space.

 c (true)  - correction after filtering.
 
             true or false.

             This option applies a correction after filtering
             to limit the amount of filtering per pixel.
			 
             When set to false the naked NLM algorithm is used.
			 
 z (false) - target pixel tends towards zero-weighted.
 
             true or false.

             Reduces the weight of the pixel being filtered to
             a minimum. This results in more even filtering across
             the tonal range from shadows to highlights.
			 
             The standard NLM algorithm gives the pixel being filtered
             the maximum weight of all. A refinement of the algorithm
             is to give the pixel being filtered the maximum weight
             derived from all the other pixels that were inspected.
			 
             This maximum of other pixels' weights is used when z is
             set to false.
			 
             When set to true, the minimum of other pixels' weights is
             used instead.

 b (false) - balanced weighting.
 
             true or false.

             Attempts to balance weighting of pixels based upon their
             luma value.
			 
             This parameter is not applied to chroma planes.
			 
			 
Avisynth MT
===========

Deathray is not thread safe. This means that only a single instance of
Deathray can be used per Avisynth script. By extension this means that
it is not compatible with any of the multi-threading modes of the 
Multi Threaded variant of Avisynth. 

Use:

SetMTMode(5) 

before a call to Deathray in the Avisynth script, if multi-threading
is active in other parts of the script.


Multiple Scripts Using Deathray
===============================

The graphics driver is thread safe. This means it is possible to have
an arbitrary number of Avisynth scripts calling Deathray running on a 
system. 

e.g. 2 scripts could be encoding, another could be running in a media player
and another could be previewing individual frames in AvsP or VirtualDub.

Eventually video memory will probably run out, even though it's virtualised.


System Responsiveness
=====================

Currently graphics drivers are unable to confer user-responsiveness
guarantees on OpenCL applications that utilise GPUs. This means if you
are using Deathray on a frame size of 16 million pixels, there will be some
juddering in Windows every ~0.7 seconds (1.5 frames per second on HD 5870) 
accompanied by difficulty in typing, etc.
Deathray is BSD licensed
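
To make the readme's sampling arithmetic for x concrete, here is a minimal C++ sketch. It is not taken from Deathray's source: the names SamplingGeometry, sampling_for and print_sampling_table are invented for this illustration, and the only inputs are the kernel radius of 3 and the 8-pixel tile apron quoted in the readme.

```cpp
#include <cstdio>

// Constants taken from the readme; the names are made up for this sketch.
const int kKernelRadius = 3;  // a 7x7 window has a radius of 3
const int kTileApron = 8;     // border pixels around each 32x32 tile

struct SamplingGeometry {
    int side;         // width and height of the sampling area, in window centres
    int windows;      // windows sampled per target pixel
    int reach;        // furthest pixel read, measured from the target pixel
    bool fits_apron;  // true if a target pixel on the tile edge stays inside the apron
};

// Hypothetical helper: derive the sampling geometry for a given value of x.
SamplingGeometry sampling_for(int x) {
    const int sample_radius = kKernelRadius * x;      // window centres searched each way
    const int side = 2 * sample_radius + 1;
    const int reach = sample_radius + kKernelRadius;  // the outermost window adds its own radius
    return SamplingGeometry{side, side * side, reach, reach <= kTileApron};
}

// Print one row per value of x from 1 to max_x.
void print_sampling_table(int max_x) {
    for (int x = 1; x <= max_x; ++x) {
        const SamplingGeometry g = sampling_for(x);
        std::printf("x=%d: %dx%d area, %d windows, reach %d, %s\n",
                    x, g.side, g.side, g.windows, g.reach,
                    g.fits_apron ? "inside apron" : "crosses apron");
    }
}
```

By this arithmetic x=1 reaches 6 pixels from the target, which is inside the 8-pixel apron, while x=2 reaches 9, which is consistent with the grid artefacts the readme describes for x of 2 or more.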

Last edited by Jawed; 10th January 2016 at 10:53.
Old 18th January 2011, 23:19   #2  |  Link
Zep
Registered User
 
Join Date: Jul 2002
Posts: 587
awesome. will test this weekend for sure. :P
Old 19th January 2011, 02:29   #3  |  Link
pokazene_maslo
Registered User
 
Join Date: Apr 2009
Location: Martin, Slovakia
Posts: 79
Is this filter motion compensated?
Old 19th January 2011, 02:42   #4  |  Link
pirej
Registered User
 
Join Date: Aug 2009
Posts: 26
Nice one, I'll give it a try.
Is the ATI HD 5750 compatible? I have ATI Stream installed/enabled.


Edit: I guess it's compatible. I loaded the default settings and it works (just previewing the filtered video: CPU load is only 20%, GPU load 50%). Now I have to try tweaking the settings and see the effect.
Thanks Jawed

Last edited by pirej; 19th January 2011 at 03:19.
Old 19th January 2011, 11:33   #5  |  Link
Jawed
Registered User
 
Join Date: Jan 2008
Location: London
Posts: 156
Quote:
Originally Posted by pokazene_maslo View Post
Is this filter motion compensated?
Not explicitly.

The general algorithm searches the entire image for blocks (what are usually called "windows") that look like the block around the target pixel. It uses the similarity of each sampled block as the weighting of the centre pixel of each sampled block.

The filtered pixel is then the weighted average (the weighted sum, normalised by the total weight) of the centre pixels of every sampled block.

In Deathray the search is restricted to 49 windows around the target pixel. This means if motion is "low", i.e. less than 3 pixels in any direction, motion compensation "arises".

Deathray has an option, x, to increase the sampling area.
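
As a rough illustration of the idea, here is a minimal single-pixel spatial NLM sketch in C++. It is not Deathray's kernel: the Plane type and the function names are invented here, windows are compared with a plain (un-weighted) mean squared difference rather than gaussian weights, frame edges are handled by clamping rather than mirroring, and exp(-d / h^2) is just one common choice of weighting function. It assumes h is greater than 0.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// A single 8-bit plane stored row-major. Illustrative type, not Deathray's.
struct Plane {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;

    // Clamp coordinates so windows near the frame edge still read valid pixels.
    float at(int x, int y) const {
        x = std::max(0, std::min(x, width - 1));
        y = std::max(0, std::min(y, height - 1));
        return static_cast<float>(pixels[static_cast<std::size_t>(y) * width + x]);
    }
};

// Mean squared difference between the 7x7 windows centred on (ax, ay) and (bx, by).
float window_distance(const Plane& p, int ax, int ay, int bx, int by) {
    const int r = 3;
    float sum = 0.0f;
    for (int dy = -r; dy <= r; ++dy) {
        for (int dx = -r; dx <= r; ++dx) {
            const float d = p.at(ax + dx, ay + dy) - p.at(bx + dx, by + dy);
            sum += d * d;
        }
    }
    return sum / 49.0f;
}

// Filter one pixel as the weighted average of the centre pixels of the
// 49 windows around (x, y). Larger h accepts less similar windows.
float nlm_pixel(const Plane& p, int x, int y, float h) {
    const int search = 3;  // 7x7 arrangement of window centres
    float weight_sum = 0.0f;
    float value_sum = 0.0f;
    for (int sy = -search; sy <= search; ++sy) {
        for (int sx = -search; sx <= search; ++sx) {
            const float d = window_distance(p, x, y, x + sx, y + sy);
            const float w = std::exp(-d / (h * h));  // identical windows get weight 1
            weight_sum += w;
            value_sum += w * p.at(x + sx, y + sy);
        }
    }
    return value_sum / weight_sum;  // weight_sum >= 1: the target's own window has d = 0
}
```

Note that in this naive form the target pixel always gets the maximum weight of 1, because its window is compared with itself.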

Intrinsically, NLM is a spatial filtering technique based on self-similarity in real-world images, and it is geared towards noise rather than artefacts such as JPEG/DCT blocks or interlacing artefacts. See this paper for a summary:

http://hal.archives-ouvertes.fr/docs...jcvrevised.pdf

One of the problems with NLM (generally as well as in Deathray) is that it isn't doing time-series pixel averaging (what you might do with a series of photographs of a static scene) - the spatial aspect tends to dominate, even with a temporal radius of 5 or even 7.

Ironically, after the grand claims made in the paper linked above, hybrid time-series techniques have been experimented with by some of the same people:

ftp://ftp.math.ucla.edu/pub/camreport/cam09-62.pdf

In this paper you will see reference to something called BM3D, which as far as I can tell is the academics' name for MVTools' MVDegrain (or MVTools2's MDegrain).

The principle of the hybrid approach is to use BM3D where "registration" is achieved (i.e. motion compensation meets a threshold of suitability) and to use NLM where registration fails.

I normally use FizzKiller, which is a variation of MDegrain using a calmed clip for analysis:

http://forum.doom9.org/showthread.php?t=133977

but I'm looking for something faster, so I decided to implement temporal NLM.

I should update the FizzKiller script I posted in that thread (post 23) as I tweaked it a bit. Overall, FizzKiller is awesome.
Old 19th January 2011, 11:42   #6  |  Link
Jawed
Registered User
 
Join Date: Jan 2008
Location: London
Posts: 156
Quote:
Originally Posted by pirej View Post
Nice one, I'll give it a try.
Is the ATI HD 5750 compatible? I have ATI Stream installed/enabled.


Edit: I guess it's compatible. I loaded the default settings and it works (just previewing the filtered video: CPU load is only 20%, GPU load 50%). Now I have to try tweaking the settings and see the effect.
Thanks Jawed
Nice, thanks. Glad to hear it works somewhere else!

I'm working on linear correction, a post-filtering step, to improve detail retention. This improves the result while allowing stronger de-noising, so I will post an updated version of Deathray soon.

I recommend temporal rather than spatial - use 2 or 3 for the radius. I prefer low sigma, i.e. <=1. h varies with material.

With sigma set to ~0.7 you are effectively using a 5x5 kernel - the outer ring of 24 pixels in each 7x7-pixel window is effectively weighted "0" (all 24 of these pixels have a total weighting of 4%). The search area is still 7x7 (i.e. 49 windows), but the smaller kernel is "sharper".
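
For anyone who wants to play with how sigma shifts weight towards or away from the edge of the window, here is a small C++ sketch. It assumes the textbook gaussian weight exp(-(dx^2 + dy^2) / (2 * sigma^2)); how Deathray actually builds and normalises its weights isn't shown in this thread, so the share this computes need not match the 4% figure quoted above. The function name outer_ring_share is invented for this example.

```cpp
#include <cmath>
#include <cstdlib>

// Fraction of the total weight of a (2*radius+1)^2 gaussian window that
// falls on its outermost ring of pixels (24 pixels when radius is 3).
// Assumes the textbook form exp(-(dx^2 + dy^2) / (2 * sigma^2)).
double outer_ring_share(double sigma, int radius = 3) {
    double total = 0.0;
    double ring = 0.0;
    for (int dy = -radius; dy <= radius; ++dy) {
        for (int dx = -radius; dx <= radius; ++dx) {
            const double w = std::exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma));
            total += w;
            if (std::abs(dx) == radius || std::abs(dy) == radius) {
                ring += w;  // pixel lies on the border of the window
            }
        }
    }
    return ring / total;
}
```

As sigma grows the share tends towards 24/49, i.e. the box-weighting case where every pixel in the window counts equally.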

I may implement a native 5x5 variant of Deathray to make it go faster, since I think temporal is more useful than spatial.

I also need to understand how to make arguments for a plug-in optional (this is my first plug-in). Argument handling is very clumsy. Any help on that would be appreciated.

Last edited by Jawed; 19th January 2011 at 12:25.
Old 19th January 2011, 12:28   #7  |  Link
Gavino
Avisynth language lover
 
Join Date: Dec 2007
Location: Spain
Posts: 3,376
Quote:
Originally Posted by Jawed View Post
I also need to understand how to make arguments for a plug-in optional
Simply add the argument name in square brackets in the call to env->AddFunction, and supply a default value when you extract the value using AsInt etc.

For an example, see http://avisynth.org/mediawiki/Filter...le_sample_1.3a.
__________________
GScript and GRunT - complex Avisynth scripting made easier
Old 19th January 2011, 12:45   #8  |  Link
Jawed
Registered User
 
Join Date: Jan 2008
Location: London
Posts: 156
Thanks, this is how I set up Deathray:

Code:
    env->AddFunction("deathray", "c[hY]f[hUV]f[tY]i[tUV]i[s]f[x]i", CreateDeathray, 0);
Then I have:

Code:
AVSValue __cdecl CreateDeathray(AVSValue args, void *user_data, IScriptEnvironment *env) {

	double h_Y = args[1].AsFloat(1.);
	if (h_Y < 0.) h_Y = 0.;

	double h_UV = args[2].AsFloat(1.);
	if (h_UV < 0.) h_UV = 0.;

	int temporal_radius_Y = args[3].AsInt(0);
	if (temporal_radius_Y < 0) temporal_radius_Y = 0;
	if (temporal_radius_Y > 64) temporal_radius_Y = 64;

	int temporal_radius_UV = args[4].AsInt(0);
	if (temporal_radius_UV < 0) temporal_radius_UV = 0;
	if (temporal_radius_UV > 64) temporal_radius_UV = 64;

	double sigma = args[5].AsFloat(1.);
	if (sigma < 0.1) sigma = 0.1;

	int sample_expand = args[6].AsInt(1);	
	if (sample_expand <= 0) sample_expand = 1;
	if (sample_expand > 14) sample_expand = 14;

	return new deathray(args[0].AsClip(),
			    h_Y, 
			    h_UV, 
			    temporal_radius_Y, 
			    temporal_radius_UV, 
			    sigma,
			    sample_expand, 
			    env);
}
If I do this:

deathray(hY=2)

it works fine. If, instead, I try this:

deathray(hy=2,1)

It fails. But now I think about it, I think that's not valid Avisynth function call syntax - you can't put an un-named parameter after a named one. Sigh, addled by C++...
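
For reference, the parameter string passed to AddFunction above can be read as a sequence of type letters (c for clip, f for float, i for int), each optionally preceded by a name in square brackets that makes it a named, optional argument. Here is a toy C++ parser that lists them. It is only a sketch for reading such strings, not Avisynth's own parsing code: Param and parse_param_spec are names invented here, and it ignores the '+' and '*' repeat modifiers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// One entry of an Avisynth-style parameter string. Illustrative type only.
struct Param {
    std::string name;  // empty for an un-named (positional) parameter
    char type;         // type letter, e.g. 'c' clip, 'f' float, 'i' int
};

// Toy parser: splits a string such as "c[hY]f[x]i" into its parameters.
// Stops at the first malformed entry instead of reporting an error.
std::vector<Param> parse_param_spec(const std::string& spec) {
    std::vector<Param> out;
    std::size_t i = 0;
    while (i < spec.size()) {
        Param p{"", '?'};
        if (spec[i] == '[') {
            const std::size_t close = spec.find(']', i);
            if (close == std::string::npos) break;  // missing closing bracket
            p.name = spec.substr(i + 1, close - i - 1);
            i = close + 1;
            if (i >= spec.size()) break;            // name without a type letter
        }
        p.type = spec[i++];
        out.push_back(p);
    }
    return out;
}
```

Applied to the string above, this yields one un-named clip followed by six named parameters, which is why every argument after the clip can be omitted or given by name.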
Old 19th January 2011, 12:59   #9  |  Link
Didée
Registered User
 
Join Date: Apr 2002
Location: Germany
Posts: 5,390
Quote:
Originally Posted by Jawed View Post
The principle of the hybrid approach is to use BM3D where "registration" is achieved (i.e. motion compensation meets a threshold of suitability) and to use NLM where registration fails.
To the experienced Avisynth users around here, this is kind of boring, isn't it?

There's a bunch of scripts that do exactly this kind of "hybrid" filtering ... several by *mp4guy, one or two by me ...
Here is one with a short explanation of the principle, 3 years old. And I definitely know my first mentioning of that problem/solution has been years before that already.
__________________
- We´re at the beginning of the end of mankind´s childhood -

My little flickr gallery. (Yes indeed, I do have hobbies other than digital video!)
Old 19th January 2011, 13:28   #10  |  Link
Jawed
Registered User
 
Join Date: Jan 2008
Location: London
Posts: 156
Quote:
Originally Posted by Didée View Post
To the experienced Avisynth users around here, this is kind of boring, isn't it?
Of course. I didn't suggest it was novel, did I?

Quote:
There's a bunch of scripts that do exactly this kind of "hybrid" filtering ... several by *mp4guy, one or two by me ...
Here is one with a short explanation of the principle, 3 years old. And I definitely know my first mentioning of that problem/solution has been years before that already.
Yes, I made FizzKiller based on the calm-clip idea.

Has anyone built a hybrid of NLM and MDegrain?

I've been thinking about trying Deathray to generate the calm clip for FizzKiller...
Old 19th January 2011, 17:45   #11  |  Link
ChaosKing
Registered User
 
Join Date: Dec 2005
Location: Germany
Posts: 167
I get an error: "Error in OpenCl status=1 frame 0"
My graphics card is a Nvidia 260GTX.
Win7 x64
__________________
Search and denoise
Old 19th January 2011, 18:46   #12  |  Link
Jawed
Registered User
 
Join Date: Jan 2008
Location: London
Posts: 156
That seems to be a compilation error. Or it could be that the code is too complex (which behaves like a compilation error).

I presume you have OpenCL working on your system?

As you can probably tell I don't have any NVidia cards to test with so it's something I can only do remotely with others' help.

If you or others are prepared to "mess about", I can try to diagnose the issue with tailored versions of the DLL (which might not do any filtering, but would verify basic capability).

I'm also planning on a change in architecture (which should improve performance), which has the side-effect of reducing complexity, making the code more likely to work on NVidia. But that's a few days away at least.

I'd be interested in results with NVidia 400 or 500 series as the code's complexity is theoretically less of a problem there.

(The complexity issue is to do with registers. The code uses an extremely high register allocation on ATI, and likely similar on NVidia. NVidia prefers lower register allocations, but 400/500 series should be fine. My planned changes include a reduction in register allocation.)

Did you try Deathray with all default values? i.e. use:

Deathray()

It may also be worth trying

Deathray(hUV=0)

but I'm doubtful that will work if the default version doesn't work.

If you'd like to try some diagnosis, try this version of Deathray:

www.cupidity.f9.co.uk/DeathrayNV110119001.zip

Delete the Deathray DLL that is installed in your plugins folder and put this version of Deathray in there. This version of Deathray merely passes through the frame, with default settings. (It will do something else, not sure what, if you turn on temporal filtering.) Make sure to test with no parameters, please.

I've also increased the detail on the error status message. That might provide some insight.
Old 19th January 2011, 19:49   #13  |  Link
ChaosKing
Registered User
 
Join Date: Dec 2005
Location: Germany
Posts: 167
Yep, OpenCL is working on my system. The NLMeansCL filter works on my system for example ^^

hUV=0 makes no difference

Your debug version of Deathray gives me: "Error in OpenCL status=11 frame=0 and OpenCl status= -30"

Hope these values help.
__________________
Search and denoise
Old 19th January 2011, 21:04   #14  |  Link
Jawed
Registered User
 
Join Date: Jan 2008
Location: London
Posts: 156
Thanks. That means it is having trouble finding devices. Which is definitely not what I was expecting.

Do you have AMD's OpenCL installed on your computer in addition to NVidia's? I'm wondering whether Deathray finds the AMD OpenCL platform first, sees no GPUs there and then fails. But the status you're getting doesn't seem to correspond with that; there's a different error for that situation.

The OpenCL error is more mysterious, "invalid value"...
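
For anyone decoding these numbers themselves: the second number in the message is a raw OpenCL status code. Here is a small C++ lookup covering some of the codes most relevant to platform and device discovery. The numeric values are the ones defined in the Khronos cl.h header; the function name opencl_status_name is invented for this sketch, and the list is deliberately incomplete. The first number in the message (status=11) is Deathray's own and isn't covered here.

```cpp
#include <string>

// Map a raw OpenCL status code to its symbolic name.
// Partial table; values as defined in the Khronos cl.h header.
std::string opencl_status_name(int status) {
    switch (status) {
        case 0:   return "CL_SUCCESS";
        case -1:  return "CL_DEVICE_NOT_FOUND";
        case -2:  return "CL_DEVICE_NOT_AVAILABLE";
        case -3:  return "CL_COMPILER_NOT_AVAILABLE";
        case -5:  return "CL_OUT_OF_RESOURCES";
        case -6:  return "CL_OUT_OF_HOST_MEMORY";
        case -11: return "CL_BUILD_PROGRAM_FAILURE";
        case -30: return "CL_INVALID_VALUE";
        case -31: return "CL_INVALID_DEVICE_TYPE";
        case -32: return "CL_INVALID_PLATFORM";
        case -33: return "CL_INVALID_DEVICE";
        case -34: return "CL_INVALID_CONTEXT";
        default:  return "unrecognised (" + std::to_string(status) + ")";
    }
}
```

So -30 is CL_INVALID_VALUE, whereas a platform with no matching devices would normally be reported as CL_DEVICE_NOT_FOUND (-1).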

If you'd like to help some more, this will tell me which OpenCL call is failing:

www.cupidity.f9.co.uk/DeathrayNV110119002.zip

EDIT: corrected, should be fine now

Last edited by Jawed; 19th January 2011 at 21:16.
Old 19th January 2011, 21:48   #15  |  Link
Didée
Registered User
 
Join Date: Apr 2002
Location: Germany
Posts: 5,390
With this one, it is

"Error reading source frame 0: Avisynth read error: Deathray: Error in OpenCL status=11 frame 0 and OpenCL status=-30"
__________________
- We´re at the beginning of the end of mankind´s childhood -

My little flickr gallery. (Yes indeed, I do have hobbies other than digital video!)
Old 19th January 2011, 22:08   #16  |  Link
Jawed
Registered User
 
Join Date: Jan 2008
Location: London
Posts: 156
Thanks, that's really peculiar. It's asking for the number of devices and seemingly responding that asking for the number of devices is invalid.

This is going to be tedious.

OK, this test version doesn't ask for the device count (eventually Deathray will support multiple cards); it simply assumes there's 1 device:

www.cupidity.f9.co.uk/DeathrayNV110119003.zip

Fingers-crossed.
Old 19th January 2011, 23:19   #17  |  Link
Didée
Registered User
 
Join Date: Apr 2002
Location: Germany
Posts: 5,390
Different message now:

"Error reading source frame 0: Avisynth read error: Single-frame initialisation failed, status=1"


However, note I'm not running the latest NV driver for my GT240, it's one or two revisions behind.
Reports from people running the most recent drivers could be more interesting.
Or, perhaps the GT240 is simply "too small" ?
__________________
- We´re at the beginning of the end of mankind´s childhood -

My little flickr gallery. (Yes indeed, I do have hobbies other than digital video!)
Old 19th January 2011, 23:56   #18  |  Link
ChaosKing
Registered User
 
Join Date: Dec 2005
Location: Germany
Posts: 167
I get exactly the same messages as Didée, both new versions tested.
And no, there is no trace of an AMD driver on my system ^^"

Maybe this information can help?
Code:
===================================================
GPU Caps Viewer v1.9.5
http://www.ozone3d.net/gpu_caps_viewer/
===================================================


===================================[ System / CPU ]
- CPU Name: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz
- CPU Core Speed: 2833 MHz
- CPU Num Cores: 4
- Family: 6 - Model: 7 - Stepping: 10
- Physical Memory Size: 4095 MB
- Operating System: Windows 7 64-bit build 7600 [No Service Pack]
- DirectX Version: 10.0
- PhysX Version: 9100514


===================================[ Graphics Adapter / GPU ]
- SLI: disabled
- GPUs: 1
- Logical GPUs: 1
- OpenGL Renderer: GeForce GTX 260/PCI/SSE2
- Drivers Renderer: NVIDIA GeForce GTX 260
- DB Renderer: NVIDIA GeForce GTX 260
- Device Description: NVIDIA GeForce GTX 260
- Adapter String: GeForce GTX 260
- Vendor: NVIDIA Corporation
- Vendor ID: 0x10DE
- Device ID: 0x05E2
- Sub device ID: 0x1109
- Sub vendor ID: 0x19DA
- Drivers Version: 8.17.12.6099 (10-16-2010) - nvoglv64
- GPU Codename: GT200
- GPU Unified Shader Processors: 192
- GPU Vertex Shader Processors: 0
- GPU Pixel Shader Processors: 0
- SM / SIMD: 24
- TPC: 8
- TPD (Watts): 182
- Video Memory Size: 896 MB
- Video Memory Type: GDDR3
- Clocks level #0: Core: 300MHz - Memory: 100MHz - Shader: 600MHz
- Clocks level #1: Core: 400MHz - Memory: 300MHz - Shader: 800MHz
- Clocks level #2: Core: 576MHz - Memory: 999MHz - Shader: 1242MHz
- BIOS String:  62.0.61.0.0
- Current Display Mode: 1280x1024 @ 60 Hz - 32 bpp


===================================[ OpenGL GPU Capabilities ]
- OpenGL Version: 3.3.0
- GLSL (OpenGL Shading Language) Version: 3.30 NVIDIA via Cg compiler
- ARB Texture Units: 4
- Vertex Shader Texture Units: 32
- Pixel Shader Texture Units: 32
- Geometry Shader Texture Units: 32
- Max Texture Size: 8192x8192
- Max Anisotropic Filtering Value: X16.0
- Max Point Sprite Size: 63.4
- Max Dynamic Lights: 8
- Max Viewport Size: 8192x8192
- Max Vertex Uniform Components: 4096
- Max Fragment Uniform Components: 2048
- Max Geometry Uniform Components: 2048
- Max Varying Float: 60
- Max Vertex Bindable Uniforms: 12
- Max Fragment Bindable Uniforms: 12
- Max Geometry Bindable Uniforms: 12
- Frame Buffer Objects (FBO) Support:[yes]
- Multiple Render Targets / Max draw buffers: 8
- Pixel Buffer Objects (PBO) Support:[yes]
- S3TC Texture Compression Support:[yes]
- ATI 3Dc Texture Compression Support:[no]
- Texture Rectangle Support:[yes]
- Floating Point Textures Support:[yes]
- MSAA: 2X
- MSAA: 4X
- MSAA: 8X
- MSAA: 16X
- OpenGL Extensions: 221 extensions (GL=199 and WGL=22)
    <li>GL_ARB_blend_func_extended</li>
    <li>GL_ARB_color_buffer_float</li>
    <li>GL_ARB_compatibility</li>
    <li>GL_ARB_copy_buffer</li>
    <li>GL_ARB_debug_output</li>
    <li>GL_ARB_depth_buffer_float</li>
    <li>GL_ARB_depth_clamp</li>
    <li>GL_ARB_depth_texture</li>
    <li>GL_ARB_draw_buffers</li>
    <li>GL_ARB_draw_elements_base_vertex</li>
    <li>GL_ARB_draw_instanced</li>
    <li>GL_ARB_ES2_compatibility</li>
    <li>GL_ARB_explicit_attrib_location</li>
    <li>GL_ARB_fragment_coord_conventions</li>
    <li>GL_ARB_fragment_program</li>
    <li>GL_ARB_fragment_program_shadow</li>
    <li>GL_ARB_fragment_shader</li>
    <li>GL_ARB_framebuffer_object</li>
    <li>GL_ARB_framebuffer_sRGB</li>
    <li>GL_ARB_geometry_shader4</li>
    <li>GL_ARB_get_program_binary</li>
    <li>GL_ARB_half_float_pixel</li>
    <li>GL_ARB_half_float_vertex</li>
    <li>GL_ARB_imaging</li>
    <li>GL_ARB_instanced_arrays</li>
    <li>GL_ARB_map_buffer_range</li>
    <li>GL_ARB_multisample</li>
    <li>GL_ARB_multitexture</li>
    <li>GL_ARB_occlusion_query</li>
    <li>GL_ARB_occlusion_query2</li>
    <li>GL_ARB_pixel_buffer_object</li>
    <li>GL_ARB_point_parameters</li>
    <li>GL_ARB_point_sprite</li>
    <li>GL_ARB_provoking_vertex</li>
    <li>GL_ARB_robustness</li>
    <li>GL_ARB_sampler_objects</li>
    <li>GL_ARB_seamless_cube_map</li>
    <li>GL_ARB_separate_shader_objects</li>
    <li>GL_ARB_shader_bit_encoding</li>
    <li>GL_ARB_shader_objects</li>
    <li>GL_ARB_shading_language_100</li>
    <li>GL_ARB_shadow</li>
    <li>GL_ARB_sync</li>
    <li>GL_ARB_texture_border_clamp</li>
    <li>GL_ARB_texture_buffer_object</li>
    <li>GL_ARB_texture_compression</li>
    <li>GL_ARB_texture_compression_rgtc</li>
    <li>GL_ARB_texture_cube_map</li>
    <li>GL_ARB_texture_env_add</li>
    <li>GL_ARB_texture_env_combine</li>
    <li>GL_ARB_texture_env_crossbar</li>
    <li>GL_ARB_texture_env_dot3</li>
    <li>GL_ARB_texture_float</li>
    <li>GL_ARB_texture_mirrored_repeat</li>
    <li>GL_ARB_texture_multisample</li>
    <li>GL_ARB_texture_non_power_of_two</li>
    <li>GL_ARB_texture_rectangle</li>
    <li>GL_ARB_texture_rg</li>
    <li>GL_ARB_texture_rgb10_a2ui</li>
    <li>GL_ARB_texture_swizzle</li>
    <li>GL_ARB_timer_query</li>
    <li>GL_ARB_transform_feedback2</li>
    <li>GL_ARB_transpose_matrix</li>
    <li>GL_ARB_uniform_buffer_object</li>
    <li>GL_ARB_vertex_array_bgra</li>
    <li>GL_ARB_vertex_array_object</li>
    <li>GL_ARB_vertex_buffer_object</li>
    <li>GL_ARB_vertex_program</li>
    <li>GL_ARB_vertex_shader</li>
    <li>GL_ARB_vertex_type_2_10_10_10_rev</li>
    <li>GL_ARB_viewport_array</li>
    <li>GL_ARB_window_pos</li>
    <li>GL_ATI_draw_buffers</li>
    <li>GL_ATI_texture_float</li>
    <li>GL_ATI_texture_mirror_once</li>
    <li>GL_S3_s3tc</li>
    <li>GL_EXT_texture_env_add</li>
    <li>GL_EXT_abgr</li>
    <li>GL_EXT_bgra</li>
    <li>GL_EXT_bindable_uniform</li>
    <li>GL_EXT_blend_color</li>
    <li>GL_EXT_blend_equation_separate</li>
    <li>GL_EXT_blend_func_separate</li>
    <li>GL_EXT_blend_minmax</li>
    <li>GL_EXT_blend_subtract</li>
    <li>GL_EXT_compiled_vertex_array</li>
    <li>GL_EXT_Cg_shader</li>
    <li>GL_EXT_depth_bounds_test</li>
    <li>GL_EXT_direct_state_access</li>
    <li>GL_EXT_draw_buffers2</li>
    <li>GL_EXT_draw_instanced</li>
    <li>GL_EXT_draw_range_elements</li>
    <li>GL_EXT_fog_coord</li>
    <li>GL_EXT_framebuffer_blit</li>
    <li>GL_EXT_framebuffer_multisample</li>
    <li>GL_EXTX_framebuffer_mixed_formats</li>
    <li>GL_EXT_framebuffer_object</li>
    <li>GL_EXT_framebuffer_sRGB</li>
    <li>GL_EXT_geometry_shader4</li>
    <li>GL_EXT_gpu_program_parameters</li>
    <li>GL_EXT_gpu_shader4</li>
    <li>GL_EXT_multi_draw_arrays</li>
    <li>GL_EXT_packed_depth_stencil</li>
    <li>GL_EXT_packed_float</li>
    <li>GL_EXT_packed_pixels</li>
    <li>GL_EXT_pixel_buffer_object</li>
    <li>GL_EXT_point_parameters</li>
    <li>GL_EXT_provoking_vertex</li>
    <li>GL_EXT_rescale_normal</li>
    <li>GL_EXT_secondary_color</li>
    <li>GL_EXT_separate_shader_objects</li>
    <li>GL_EXT_separate_specular_color</li>
    <li>GL_EXT_shadow_funcs</li>
    <li>GL_EXT_stencil_two_side</li>
    <li>GL_EXT_stencil_wrap</li>
    <li>GL_EXT_texture3D</li>
    <li>GL_EXT_texture_array</li>
    <li>GL_EXT_texture_buffer_object</li>
    <li>GL_EXT_texture_compression_latc</li>
    <li>GL_EXT_texture_compression_rgtc</li>
    <li>GL_EXT_texture_compression_s3tc</li>
    <li>GL_EXT_texture_cube_map</li>
    <li>GL_EXT_texture_edge_clamp</li>
    <li>GL_EXT_texture_env_combine</li>
    <li>GL_EXT_texture_env_dot3</li>
    <li>GL_EXT_texture_filter_anisotropic</li>
    <li>GL_EXT_texture_integer</li>
    <li>GL_EXT_texture_lod</li>
    <li>GL_EXT_texture_lod_bias</li>
    <li>GL_EXT_texture_mirror_clamp</li>
    <li>GL_EXT_texture_object</li>
    <li>GL_EXT_texture_shared_exponent</li>
    <li>GL_EXT_texture_sRGB</li>
    <li>GL_EXT_texture_swizzle</li>
    <li>GL_EXT_timer_query</li>
    <li>GL_EXT_transform_feedback2</li>
    <li>GL_EXT_vertex_array</li>
    <li>GL_EXT_vertex_array_bgra</li>
    <li>GL_IBM_rasterpos_clip</li>
    <li>GL_IBM_texture_mirrored_repeat</li>
    <li>GL_KTX_buffer_region</li>
    <li>GL_NV_blend_square</li>
    <li>GL_NV_conditional_render</li>
    <li>GL_NV_copy_depth_to_color</li>
    <li>GL_NV_copy_image</li>
    <li>GL_NV_depth_buffer_float</li>
    <li>GL_NV_depth_clamp</li>
    <li>GL_NV_explicit_multisample</li>
    <li>GL_NV_fence</li>
    <li>GL_NV_float_buffer</li>
    <li>GL_NV_fog_distance</li>
    <li>GL_NV_fragment_program</li>
    <li>GL_NV_fragment_program_option</li>
    <li>GL_NV_fragment_program2</li>
    <li>GL_NV_framebuffer_multisample_coverage</li>
    <li>GL_NV_geometry_shader4</li>
    <li>GL_NV_gpu_program4</li>
    <li>GL_NV_half_float</li>
    <li>GL_NV_light_max_exponent</li>
    <li>GL_NV_multisample_coverage</li>
    <li>GL_NV_multisample_filter_hint</li>
    <li>GL_NV_occlusion_query</li>
    <li>GL_NV_packed_depth_stencil</li>
    <li>GL_NV_parameter_buffer_object</li>
    <li>GL_NV_parameter_buffer_object2</li>
    <li>GL_NV_pixel_data_range</li>
    <li>GL_NV_point_sprite</li>
    <li>GL_NV_primitive_restart</li>
    <li>GL_NV_register_combiners</li>
    <li>GL_NV_register_combiners2</li>
    <li>GL_NV_shader_buffer_load</li>
    <li>GL_NV_texgen_reflection</li>
    <li>GL_NV_texture_barrier</li>
    <li>GL_NV_texture_compression_vtc</li>
    <li>GL_NV_texture_env_combine4</li>
    <li>GL_NV_texture_expand_normal</li>
    <li>GL_NV_texture_multisample</li>
    <li>GL_NV_texture_rectangle</li>
    <li>GL_NV_texture_shader</li>
    <li>GL_NV_texture_shader2</li>
    <li>GL_NV_texture_shader3</li>
    <li>GL_NV_transform_feedback</li>
    <li>GL_NV_transform_feedback2</li>
    <li>GL_NV_vertex_array_range</li>
    <li>GL_NV_vertex_array_range2</li>
    <li>GL_NV_vertex_buffer_unified_memory</li>
    <li>GL_NV_vertex_program</li>
    <li>GL_NV_vertex_program1_1</li>
    <li>GL_NV_vertex_program2</li>
    <li>GL_NV_vertex_program2_option</li>
    <li>GL_NV_vertex_program3</li>
    <li>GL_NVX_conditional_render</li>
    <li>GL_NVX_gpu_memory_info</li>
    <li>GL_SGIS_generate_mipmap</li>
    <li>GL_SGIS_texture_lod</li>
    <li>GL_SGIX_depth_texture</li>
    <li>GL_SGIX_shadow</li>
    <li>GL_SUN_slice_accum</li>
    <li>GL_WIN_swap_hint</li>
    <li>WGL_EXT_swap_control</li>
    <li>WGL_ARB_buffer_region</li>
    <li>WGL_ARB_create_context</li>
    <li>WGL_ARB_create_context_profile</li>
    <li>WGL_ARB_create_context_robustness</li>
    <li>WGL_ARB_extensions_string</li>
    <li>WGL_ARB_make_current_read</li>
    <li>WGL_ARB_multisample</li>
    <li>WGL_ARB_pbuffer</li>
    <li>WGL_ARB_pixel_format</li>
    <li>WGL_ARB_pixel_format_float</li>
    <li>WGL_ARB_render_texture</li>
    <li>WGL_ATI_pixel_format_float</li>
    <li>WGL_EXT_create_context_es2_profile</li>
    <li>WGL_EXT_extensions_string</li>
    <li>WGL_EXT_framebuffer_sRGB</li>
    <li>WGL_EXT_pixel_format_packed_float</li>
    <li>WGL_NVX_DX_interop</li>
    <li>WGL_NV_float_buffer</li>
    <li>WGL_NV_multisample_coverage</li>
    <li>WGL_NV_render_depth_texture</li>
    <li>WGL_NV_render_texture_rectangle</li>


===================================[ NVIDIA CUDA Capabilities ]
- CUDA Device 0
	- Device name: GeForce GTX 260
	- Compute Capability: 1.3
	- Total Memory: 877 MB
	- Shader Clock Rate: 1242 MHz
	- Multiprocessors: 24
	- Warp Size: 32
	- Max Threads Per Block: 512
	- Threads Per Block: 512 x 512 x 64
	- Grid Size: 65535 x 65535 x 1
	- Registers Per Block: 16384
	- Texture Alignment: 256 byte
	- Total Constant Memory: 64 Kb


===================================[ OpenCL Capabilities ]
- Num OpenCL platforms: 1
- Name: NVIDIA CUDA
- Version: OpenCL 1.0 CUDA 3.2.1
- Profile: FULL_PROFILE
- Vendor: NVIDIA Corporation
- Num devices: 1

	- CL_DEVICE_NAME: GeForce GTX 260
	- CL_DEVICE_VENDOR: NVIDIA Corporation
	- CL_DRIVER_VERSION: 260.99
	- CL_DEVICE_PROFILE: FULL_PROFILE
	- CL_DEVICE_VERSION: OpenCL 1.0 CUDA
	- CL_DEVICE_TYPE: GPU
	- CL_DEVICE_VENDOR_ID: 0x10DE
	- CL_DEVICE_MAX_COMPUTE_UNITS: 24
	- CL_DEVICE_MAX_CLOCK_FREQUENCY: 1242MHz
	- CL_NV_DEVICE_COMPUTE_CAPABILITY_MAJOR: 1
	- CL_NV_DEVICE_COMPUTE_CAPABILITY_MINOR: 3
	- CL_NV_DEVICE_REGISTERS_PER_BLOCK: 16384
	- CL_NV_DEVICE_WARP_SIZE: 32
	- CL_NV_DEVICE_GPU_OVERLAP: 1
	- CL_NV_DEVICE_KERNEL_EXEC_TIMEOUT: 1
	- CL_NV_DEVICE_INTEGRATED_MEMORY: 0
	- CL_DEVICE_ADDRESS_BITS: 32
	- CL_DEVICE_MAX_MEM_ALLOC_SIZE: 224608KB
	- CL_DEVICE_GLOBAL_MEM_SIZE: 877MB
	- CL_DEVICE_MAX_PARAMETER_SIZE: 4352
	- CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 0 Bytes
	- CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 0KB
	- CL_DEVICE_ERROR_CORRECTION_SUPPORT: NO
	- CL_DEVICE_LOCAL_MEM_TYPE: Local (scratchpad)
	- CL_DEVICE_LOCAL_MEM_SIZE: 16KB
	- CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64KB
	- CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
	- CL_DEVICE_MAX_WORK_ITEM_SIZES: [512 ; 512 ; 64]
	- CL_DEVICE_MAX_WORK_GROUP_SIZE: 512
	- CL_EXEC_NATIVE_KERNEL: 4751356
	- CL_DEVICE_IMAGE_SUPPORT: YES
	- CL_DEVICE_MAX_READ_IMAGE_ARGS: 128
	- CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 8
	- CL_DEVICE_IMAGE2D_MAX_WIDTH: 4096
	- CL_DEVICE_IMAGE2D_MAX_HEIGHT: 32768
	- CL_DEVICE_IMAGE3D_MAX_WIDTH: 2048
	- CL_DEVICE_IMAGE3D_MAX_HEIGHT: 2048
	- CL_DEVICE_IMAGE3D_MAX_DEPTH: 16
	- CL_DEVICE_MAX_SAMPLERS: 16
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR: 1
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT: 1
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT: 1
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG: 1
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT: 1
	- CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 1
	- CL_DEVICE_EXTENSIONS: 16
	- Extensions:
		- cl_khr_byte_addressable_store
		- cl_khr_icd
		- cl_khr_gl_sharing
		- cl_nv_d3d9_sharing
		- cl_nv_d3d10_sharing
		- cl_khr_d3d10_sharing
		- cl_nv_d3d11_sharing
		- cl_nv_compiler_options
		- cl_nv_device_attribute_query
		- cl_nv_pragma_unroll
		- 
		- cl_khr_global_int32_base_atomics
		- cl_khr_global_int32_extended_atomics
		- cl_khr_local_int32_base_atomics
		- cl_khr_local_int32_extended_atomics
		- cl_khr_fp64


===================================[ Misc. ]


===================================[ Related Graphics Drivers ]
- http://www.geeks3d.com/?page_id=752
- http://downloads.guru3d.com/download.php?id=10
- http://www.tweakguides.com/NVFORCE_1.html
- http://www.nvidia.com/object/winxp-2k_archive.html
- http://www.geeks3d.com/?p=65


===================================[ Related Graphics Cards Reviews ]
- http://www.geeks3d.com/?tag=geforce-gtx-260
- http://www.google.us/search?q=NVIDIA+GeForce+GTX+260+review
__________________
Search and denoise
Old 19th January 2011, 23:57   #19  |  Link
TheProfileth
Leader of Dual-Duality
 
 
Join Date: Aug 2010
Location: America
Posts: 134
Will give this a look-see in a bit.
Edit:
I tried the normal version and the three fixed versions in AvsPmod.
I got this for the normal version:
Quote:
Traceback (most recent call last):
  File "AvsP.pyo", line 5813, in OnMenuVideoRefresh
  File "AvsP.pyo", line 8950, in ShowVideoFrame
  File "AvsP.pyo", line 9492, in PaintAVIFrame
  File "pyavs.pyo", line 322, in DrawFrame
  File "pyavs.pyo", line 306, in _GetFrame
  File "avisynth.pyo", line 309, in GetPitch
ValueError: NULL pointer access
I got this for the first fixed version:
Quote:
Traceback (most recent call last):
  File "AvsP.pyo", line 5813, in OnMenuVideoRefresh
  File "AvsP.pyo", line 8950, in ShowVideoFrame
  File "AvsP.pyo", line 9492, in PaintAVIFrame
  File "pyavs.pyo", line 322, in DrawFrame
  File "pyavs.pyo", line 306, in _GetFrame
  File "avisynth.pyo", line 309, in GetPitch
ValueError: NULL pointer access
and this for the second and third fixed versions:
Quote:
Traceback (most recent call last):
  File "AvsP.pyo", line 6281, in OnButtonTextKillFocus
  File "AvsP.pyo", line 8950, in ShowVideoFrame
  File "AvsP.pyo", line 9492, in PaintAVIFrame
  File "pyavs.pyo", line 322, in DrawFrame
  File "pyavs.pyo", line 306, in _GetFrame
  File "avisynth.pyo", line 309, in GetPitch
ValueError: NULL pointer access
Traceback (most recent call last):
  File "AvsP.pyo", line 7147, in OnPaintVideoWindow
  File "AvsP.pyo", line 9492, in PaintAVIFrame
  File "pyavs.pyo", line 322, in DrawFrame
  File "pyavs.pyo", line 306, in _GetFrame
  File "avisynth.pyo", line 309, in GetPitch
ValueError: NULL pointer access
I really want to test this filter out, so I hope you can fix it soon.
Also, I am able to get nlmeanscl to run fine on my computer. I have a GTX 260 and an AMD Phenom quad core.
__________________
I'm Mr.Fixit and I feel good, fixin all the sources in the neighborhood
My New filter is in the works, and will be out soon

Last edited by TheProfileth; 20th January 2011 at 00:15.
Old 20th January 2011, 00:38   #20  |  Link
Jawed
Registered User
 
Join Date: Jan 2008
Location: London
Posts: 156
Thanks Didée and ChaosKing. That means it's found the GPU and is now trying to create memory allocations and/or prepare the GPU code (a kind of compilation).
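
As a rough illustration of the stages described above (find the GPU, create memory allocations, prepare the GPU code), here is a minimal Python sketch of stage-by-stage error tracking. It is not Deathray's code: the stage names and the stand-in lambdas are hypothetical, and no real OpenCL calls are made. Only the numeric error codes are taken from the OpenCL headers.

```python
# Illustrative only: the stages below are stand-ins, not Deathray code.
# The numeric values match the error codes defined in the OpenCL headers.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -11: "CL_BUILD_PROGRAM_FAILURE",
}


def run_stages(stages):
    """Run (name, callable) stages in order and stop at the first failure.

    Each callable returns an OpenCL-style status code (0 means success).
    Returns (failed_stage_name, status), or (None, 0) if all stages succeed.
    """
    for name, stage in stages:
        status = stage()
        if status != 0:
            return name, status
    return None, 0


def describe(result):
    """Turn a run_stages() result into a one-line message for an error log."""
    name, status = result
    if name is None:
        return "all stages succeeded"
    label = CL_ERROR_NAMES.get(status, "unknown error")
    return "%s failed: %s (%d)" % (name, label, status)


# Hypothetical run: the device is found and buffers are allocated,
# but building the kernel source fails.
stages = [
    ("find GPU device", lambda: 0),
    ("allocate buffers", lambda: 0),
    ("build kernel program", lambda: -11),
]
print(describe(run_stages(stages)))
```

Reporting the stage name alongside the status code is what distinguishes "the GPU was not found" from "the GPU was found but a later step failed", which is the distinction being drawn from the test reports here.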

From another forum (someone else's application) I've learnt there are (or at least, were) problems with NVidia's support for my use of a feature, which complicates things.

This is my last update for tonight:

www.cupidity.f9.co.uk/DeathrayNV110119004.zip

It adds yet more detail to the error tracking, though I'm pessimistic overall because of the problem I mentioned above.