Old 26th November 2005, 23:06   #241  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
Uploaded version 0.5a. It fixes a bug with bt=2. The only file changed is fft3dgpu.hlsl.
__________________
Get my avisynth filters @ http://www.avisynth.org/tsp/
Old 26th November 2005, 23:59   #242  |  Link
Revgen
Registered User
 
Join Date: Sep 2004
Location: Near LA, California, USA
Posts: 1,545
Quote:
Originally Posted by Fizick
What is degrid sharpening?
Quote:
Originally Posted by tsp
sharpening with degrid.
LOL

I'll eventually try it out and see how it works, once I find some time.
__________________
Pirate: Now how would you like to die? Would you like to have your head chopped off or be burned at the stake?

Curly: Burned at the stake!

Moe: Why?

Curly: A hot steak is always better than a cold chop.
Old 27th November 2005, 19:13   #243  |  Link
Kopernikus
Registered User
 
Join Date: Sep 2002
Posts: 125
@tsp: Is more information about shader programming available somewhere? Perhaps some sort of SDK?
Old 27th November 2005, 19:57   #244  |  Link
AssassiNBG
Registered User
 
Join Date: Mar 2005
Location: Pleven, Bulgaria, Europe.
Posts: 45
Umm ... am I blind or is there no new version on http://www.avisynth.org/tsp/ ?
Old 27th November 2005, 20:33   #245  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
Sorry, I forgot to update the index page. It should be fixed now. I also uploaded a new version, 0.51, that fixes a bug where the parameters after NVPerf were shifted one place, so degrid=scutoff, scutoff=svr, etc. It also improves download speed from the GPU, and Kalman should work with GeForce FX 5xxx.

I created a special version of fft3dGPU that reports the time it takes to download the final image from the GPU. You can get it here. Just run the included download speed.avs after the included fft3dgpu.dll has been extracted to the plugin directory. On my computer with an AGP GeForce 6800GT it takes ~4.3 millisec to download it. That's about 100 MBytes/sec [EDIT]92MBytes/sec (the other is million bytes/sec)[/EDIT]. AGPx8 upload speed is about 2100 MBytes/sec. So it would be nice if anyone with a PCI-Express GPU would run the test to compare the result.

Kopernikus: There is the DirectX SDK, which contains some samples. Both NVidia and ATI also have SDKs available. Most of the samples are game oriented, but there are also some image/video and general purpose GPU (GPGPU) shader examples. The sample chapters from the ShaderX book series contain some nice samples for image manipulation on the GPU.
www.gpgpu.org is also a good site, although mostly OpenGL.

Last edited by tsp; 1st December 2005 at 17:30.
Old 27th November 2005, 20:58   #246  |  Link
Kopernikus
Registered User
 
Join Date: Sep 2002
Posts: 125
Thank you
Old 28th November 2005, 11:02   #247  |  Link
ariga
Learning...
 
Join Date: Nov 2005
Location: 12.97°N, 77.56°E
Posts: 135
Quote:
Originally Posted by tsp
ariga: I will post the next version shortly, but please post your script, otherwise it is very difficult to figure out what is wrong.
It's a simple
AviSource("dv.avi")
LeakKernelDeint(order=0)
FFT3DGPU() # no params

It doesn't matter what params I pass, the error is the same.

BTW, I tried 0.47 and it complained about a missing d3dx9_27.dll.

Just downloaded 0.51. Will see if it works.
Old 28th November 2005, 14:57   #248  |  Link
acrespo
Brazilian Anime Ripper
 
Join Date: Nov 2001
Location: Brazil
Posts: 237
I am trying to compare FFT3DFilter and FFT3DGPU.

FFT3DFilter is more effective than FFT3DGPU on my anime captures. The parameters I used:

FFT3DFilter(sigma=5, sharpen=1.0) << version 1.8.3
FFT3DGPU(sigma=5, bh=48, bw=48, mode=1, sharpen=1.0) << version 0.5a

I still get grid lines with FFT3DGPU in some frames, while FFT3DFilter shows no grid anywhere in the video. I guess the only difference in the parameters above is the overlap. The default overlap in FFT3DFilter is bw/3, bh/3 and I don't know the default overlap of FFT3DGPU.
__________________
Capture cards:
Compro VideoMate Gold+ (Philips SAA7134 based) (not active)
Hauppauge PVR 150MCE (not active)
ATI TV Wonder Elite (active)
Old 28th November 2005, 15:16   #249  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
all: I made an installer. It might be more user-friendly.

acrespo: the default overlap is ow=bw/2, oh=bh/2; currently that can't be changed. Also there was a bug in 0.5a that assigned the value from svr to degrid, so the default value was 0.3. That might explain it. That is fixed in 0.51. Another difference is the precision used. By default fft3dgpu uses 16-bit floating-point precision while fft3dfilter uses 32-bit precision. You can change that by setting useFloat16=false. This slows down the filter and uses more memory on the GPU.
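
To give a rough feel for the 16-bit versus 32-bit precision gap mentioned above, here is a small Python sketch. It uses the IEEE half and single formats from the standard struct module on the CPU; whether the GPU's float16 render targets round exactly the same way is an assumption, so treat it as an illustration only, not as what fft3dgpu does internally. The helper names are made up for this example.

```python
import struct

def roundtrip(value: float, fmt: str) -> float:
    """Pack value into the given struct format and read it back as a Python float."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]

def to_half(value: float) -> float:
    """Round value to IEEE 754 half precision (16 bit, 10 mantissa bits)."""
    return roundtrip(value, "<e")

def to_single(value: float) -> float:
    """Round value to IEEE 754 single precision (32 bit, 23 mantissa bits)."""
    return roundtrip(value, "<f")

sample = 0.1
err_half = abs(to_half(sample) - sample)      # rounding error at 16 bit
err_single = abs(to_single(sample) - sample)  # rounding error at 32 bit

print("half   error:", err_half)
print("single error:", err_single)

# Above 2048 the half format can no longer represent every integer,
# while single precision still stores 2049 exactly.
print("2049 as half:  ", to_half(2049.0))
print("2049 as single:", to_single(2049.0))
```

The point is only that each value stored at 16 bit carries far fewer significant digits, which is one plausible way small block-boundary differences could become visible.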

ariga: I don't get any errors with 0.51, fft3dgpu() and a Geforce fx 5200.

Last edited by tsp; 28th November 2005 at 18:56.
Old 29th November 2005, 05:16   #250  |  Link
AI
Registered User
 
Join Date: Jul 2005
Location: Russia, Ural
Posts: 77
Quote:
Originally Posted by tsp
I created a special version of fft3dGPU that reports the time it takes to download the final image from the GPU. You can get it here. Just run the included download speed.avs after the included fft3dgpu.dll has been extracted to the plugin directory. On my computer with an AGP GeForce 6800GT it takes ~4.3 millisec to download it. That's about 100 MBytes/sec. AGPx8 upload speed is about 2100 MBytes/sec. So it would be nice if anyone with a PCI-Express GPU would run the test to compare the result.
ATI x700 DDR(I) 128bit, A64 3000+ S939

if core ratio = 4 (i.e. 800 MHz - min) = 7.5e-4 (sec?)
if core ratio = 9 (i.e. 1800 MHz) = 1.2e-3 (sec?)

I can test on a PIII 800 with an fx5200, but later.

Last edited by AI; 29th November 2005 at 05:20.
Old 29th November 2005, 14:14   #251  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
AI: thanks for the report. The time is reported in seconds. Strange that increasing the core ratio slows the download down. Is the x700 PCI-Express?
Old 30th November 2005, 05:27   #252  |  Link
AI
Registered User
 
Join Date: Jul 2005
Location: Russia, Ural
Posts: 77
Quote:
Originally Posted by tsp
Is the x700 PCI-Express?
Yes, of course... (bought specially for fft3dGPU, and the CPU was upgraded specially for FFT3DFilter)

Quote:
Originally Posted by tsp
Strange that increasing the core ratio slows the download down
Because the CPU keeps the memory busier (my English is very bad).

In the A64 the memory controller is integrated into the core (more exactly, into the CPU chip).

PS What about speed? (How many MB/s for my timings?)

PPS Why "bt=2 uses the current and next frame",
but "bt=4 uses the two previous frames, the current and next frame"?

Last edited by AI; 30th November 2005 at 05:41.
Old 30th November 2005, 07:57   #253  |  Link
ariga
Learning...
 
Join Date: Nov 2005
Location: 12.97°N, 77.56°E
Posts: 135
Quote:
Originally Posted by tsp
ariga: I don't get any errors with 0.51, fft3dgpu() and a Geforce fx 5200.
Still no change with 0.51 and 5600 Ultra. I'll try installing the latest DirectX update and drivers. Just 9.0c with the d3dx9_27.dll may not suffice.
Old 30th November 2005, 12:22   #254  |  Link
acrespo
Brazilian Anime Ripper
 
Join Date: Nov 2001
Location: Brazil
Posts: 237
fft3dfilter 1.8.4 includes a multi-plane option (luma + all chroma planes: plane=4). Can this be developed in fft3dgpu too?
Old 1st December 2005, 05:50   #255  |  Link
acrespo
Brazilian Anime Ripper
 
Join Date: Nov 2001
Location: Brazil
Posts: 237
Quote:
Originally Posted by tsp
acrespo: the default overlap is ow=bw/2, oh=bh/2; currently that can't be changed. Also there was a bug in 0.5a that assigned the value from svr to degrid, so the default value was 0.3. That might explain it. That is fixed in 0.51. Another difference is the precision used. By default fft3dgpu uses 16-bit floating-point precision while fft3dfilter uses 32-bit precision. You can change that by setting useFloat16=false. This slows down the filter and uses more memory on the GPU.
In the new 0.51 version degrid functions correctly now. Also I think the bw and bh defaults are producing grid. When I try 48 (the fft3dfilter default) I don't get grid, so I think changing the bh, bw defaults to 48 would be good.
Old 1st December 2005, 09:14   #256  |  Link
AI
Registered User
 
Join Date: Jul 2005
Location: Russia, Ural
Posts: 77
Quote:
Originally Posted by acrespo
In the new 0.51 version degrid functions correctly now. Also I think the bw and bh defaults are producing grid. When I try 48 (the fft3dfilter default) I don't get grid, so I think changing the bh, bw defaults to 48 would be good.
RTFM

Quote:
Originally Posted by fft3dgpu.txt
bw,bh: block width and block height. It should be a power of 2, i.e. valid values are 4,8,16,32,64,128,256,512. Default=32
IMHO

i.e. when you use bw=bh=48, it is really bw=bh=64 (possibly)
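
A minimal sketch of the guess above, assuming the filter rounds a requested block size up to the next power of 2. That rounding rule and the function names are assumptions made for illustration; the quoted documentation only lists the valid values.

```python
# Valid block sizes as listed in the quoted fft3dgpu.txt excerpt.
VALID_BLOCK_SIZES = (4, 8, 16, 32, 64, 128, 256, 512)

def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("block size must be a positive integer")
    return 1 << (n - 1).bit_length()

def effective_block_size(requested: int) -> int:
    """Hypothetical helper: the block size that would be used if the
    filter rounds the requested value up to a power of two (assumed)."""
    size = next_power_of_two(requested)
    if size not in VALID_BLOCK_SIZES:
        raise ValueError(f"{requested} is outside the documented range")
    return size

for requested in (32, 48, 64):
    print(requested, "->", effective_block_size(requested))
```

Under that assumption a script asking for bw=bh=48 is not comparing like with like against fft3dfilter's 48, which matters when judging the grid.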
Old 1st December 2005, 12:33   #257  |  Link
ariga
Learning...
 
Join Date: Nov 2005
Location: 12.97°N, 77.56°E
Posts: 135
Works now, but slow

Quote:
Originally Posted by ariga
Still no change with 0.51 and 5600 Ultra. I'll try installing the latest DirectX update and drivers. Just 9.0c with the d3dx9_27.dll may not suffice.
Updated to the latest Nvidia drivers and it works! Sorry for the trouble. However, it's still slow (~4fps). Will update DirectX too and see if it makes any difference.
Old 1st December 2005, 17:28   #258  |  Link
tsp
Registered User
 
Join Date: Aug 2004
Location: Denmark
Posts: 807
Quote:
Originally Posted by AI
PS What about speed? (How many MB/s for my timings?)
To find the speed, divide 0.39551 (=720*576/1024/1024) by the time it took. So 7.5*10^-4 sec becomes ~530 MBytes/sec and 1.2*10^-3 sec is ~330 MBytes/sec.
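
The same arithmetic as a short Python sketch, in case anyone wants to plug in their own timing. It assumes a 720x576 frame at one byte per pixel, which is what the 720*576 in the formula implies; the function name and the bytes_per_pixel parameter are made up for this example.

```python
def download_speed_mbytes(seconds: float,
                          width: int = 720,
                          height: int = 576,
                          bytes_per_pixel: int = 1) -> float:
    """Return the download speed in MBytes/sec (1 MByte = 1024*1024 bytes)
    for one frame of the given size transferred in `seconds`."""
    if seconds <= 0:
        raise ValueError("time must be positive")
    frame_mbytes = width * height * bytes_per_pixel / 1024 / 1024
    return frame_mbytes / seconds

# AI's two X700 timings from this thread, plus the 6800GT figure read as
# 4.3 milliseconds (the reading that matches the quoted 92 MBytes/sec).
timings = {
    "X700, core ratio 4": 7.5e-4,
    "X700, core ratio 9": 1.2e-3,
    "6800GT AGP": 4.3e-3,
}

for label, seconds in timings.items():
    print(f"{label}: {download_speed_mbytes(seconds):.0f} MBytes/sec")
```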
Quote:
Originally Posted by AI
PPS Why "bt=2 uses the current and next frame",
but "bt=4 uses the two previous frames, the current and next frame"?
Sorry, bt=2 does use the current and the previous frame. That is an error in the documentation (maybe it did use the next frame in version 0.1?)
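
To make the corrected frame selection concrete, a tiny Python sketch that lists which frames go into the temporal window. Only the two bt values discussed here are included; the table and the function name are illustrative and not taken from the fft3dgpu source.

```python
# Frame offsets relative to the current frame, as described in this thread:
# bt=2 -> previous and current; bt=4 -> two previous, current and next.
BT_OFFSETS = {
    2: (-1, 0),
    4: (-2, -1, 0, 1),
}

def frames_used(current: int, bt: int) -> list[int]:
    """Return the frame numbers that would be combined for frame `current`.
    Clip boundaries are ignored in this sketch."""
    try:
        offsets = BT_OFFSETS[bt]
    except KeyError:
        raise ValueError(f"bt={bt} is not covered by this sketch") from None
    return [current + offset for offset in offsets]

print(frames_used(10, 2))
print(frames_used(10, 4))
```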
Quote:
Originally Posted by ariga
Updated to the latest Nvidia drivers and it works! Sorry for the trouble. However, it's still slow (~4fps). Will update DirectX too and see if it makes any difference.
Good to hear. Just curious, what version did you use before? Also I think the main reason you are seeing such a low speed is that a GeForce FX 5600 Ultra is very slow. It has only 2 pixel pipelines (meaning it can only process two pixels at a time) running at 350 MHz. It doesn't support MRT (Multiple Render Targets), meaning it needs more passes to do the FFT (and Kalman filtering). The good thing about having a slow video card is that the CPU utilization should also be very low, so it is possible to run some very slow filters before fft3dgpu (like a deinterlacer) or use a slower codec.
Quote:
Originally Posted by acrespo
fft3dfilter 1.8.4 includes a multi-plane option (luma + all chroma planes: plane=4). Can this be developed in fft3dgpu too?
Sure, no problem (but it will become plane=3 because plane=1 processes both the U and V chroma planes).
Quote:
Originally Posted by acrespo
In the new 0.51 version degrid functions correctly now. Also I think the bw and bh defaults are producing grid. When I try 48 (the fft3dfilter default) I don't get grid, so I think changing the bh, bw defaults to 48 would be good.
As AI said, fft3dGPU uses bw=64 and bh=64 instead. Also the default setting is mode=0, which does produce some nasty gridding (but is faster). When I implement the variable overlap I think I will use mode=1 as the default. If you use:
FFT3DGPU(sigma=5, bh=32, bw=32, mode=1, sharpen=1.0)
Does the gridding appear?

Last edited by tsp; 1st December 2005 at 17:39.
Old 1st December 2005, 20:53   #259  |  Link
acrespo
Brazilian Anime Ripper
 
Join Date: Nov 2001
Location: Brazil
Posts: 237
I need to run some more tests, but I think mode=1 and useFloat16=false can avoid grid with bw=bh=64. I will take some screenshots to compare side by side with bw=bh=32, but with useFloat16=true I see grid with all bw/bh values and modes (less than in v0.47 and 0.5a).

I have a question about the other planes. In readme.txt you wrote that running fft3dgpu with plane=0 and another instance with plane=1 can decrease speed more than fft3dfilter. Is that true for all situations/video cards?

For your information, I have ForceWare 81.89 and the speed is great.
Old 1st December 2005, 21:47   #260  |  Link
Fizick
AviSynth plugger
 
Join Date: Nov 2003
Location: Russia
Posts: 2,183
tsp,
IMHO, for FFT3DFilter's plane=4 you may consider using plane=4 too,
and for plane=3 consider plane=3 (as an alias of your 1 and 2).