Doom9's Forum > Capturing and Editing Video > Avisynth Development

27th March 2006, 17:25   #361
tsp
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
Boulder: sure, but you will have to wait until Saturday before I can compile it.

Mug Funky: I didn't expect anyone to set ow or oh to zero, so it is more or less unsupported, but I will see what I can do about it. (Just curious: doesn't it produce a lot of grid artifacts, or is it part of a more complicated script?)
__________________
Get my avisynth filters @ http://www.avisynth.org/tsp/
30th March 2006, 06:55   #362
Mug Funky
interlace this!
Join Date: Jun 2003
Location: i'm in ur transfers, addin noise
Posts: 4,555
more a matter of seeing how fast it can go. the grids aren't bad if the sigma is low, and bt=0 gets a lot of smoothing out of a low sigma.

of course, for stuff i care about overlapping is a must. i was just playing with it to see what i could get out of it. if you set the block sizes really small (2), then it acts like a basic temporal filter, which can be fun to have on a GPU. of course it's not the best usage for a filter like this...
__________________
sucking the life out of your videos since 2004
1st April 2006, 21:05   #363
tsp
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
New version, 0.6.2.1, which should fix the HC 0.17 bug.
1st April 2006, 21:07   #364
Boulder
Pig on the wing
Join Date: Mar 2002
Location: Finland
Posts: 5,733
Thanks!

__________________
And if the band you're in starts playing different tunes
I'll see you on the dark side of the Moon...
5th April 2006, 07:34   #365
Firesurfer
Registered User
Join Date: Jul 2005
Posts: 15
Thank you, too!

I was waiting for that...
3rd May 2006, 03:53   #366
swaaye
Registered User
Join Date: Jun 2004
Posts: 144
FFT3DGPU is only loading my dual core to 50%. Is this right? I don't understand why DivX isn't using even part of the 2nd core while FFT3DGPU is in use. It's as if it restricts everything to a single thread. If I disable FFT3DGPU I get back up to 100 fps; with it I can't do better than 25 fps!

Code:
MPEG2Source("aaa", idct=3)
tfm(d2v="aaa")
tdecimate(hybrid=1)
FFT3dGPU(sigma=2,bt=3,sharpen=0.7,precision=1)
Crop(2,0,-2,-0)
BicubicResize(640,352,0,0.75)
I'm running a dual core Opteron 165 @ 2.6 GHz and a Radeon X850 XT. Playing the .AVS back in MPC shows that it can definitely play back in real time at 24 fps. I suppose the video card could be maxed out.

edit: with precision at 0 I get 40-50 fps. CPU usage is slightly higher. I figured that since my GPU is the bottleneck I'll use the extra CPU cycles for higher DivX quality settings.

tsp, you should post on the Beyond3D.com forum about optimizing the shader code. A lot of engineers from ATI and NV, along with some extremely talented graphics programmers, hang out there, and I bet they'd welcome a challenge with such an ingenious use of their hardware.

Last edited by swaaye; 3rd May 2006 at 06:08.
3rd May 2006, 19:34   #367
swaaye
Registered User
Join Date: Jun 2004
Posts: 144
I ran into a strange problem last night. On the 2nd pass of my 2nd encode I lost half of my framerate. I went back and re-ran the first encode and it was also half speed. Rebooted, no help. Reset video driver settings to defaults. Nothing. It seems to be FFT3DGPU that's causing the issue; somehow the video card and it aren't getting along now. I am totally baffled. I underclocked and overclocked my GPU and saw no speed change.

Tonight I will reinstall drivers for the video card and see if that helps. Sorta bummed here cuz it was going so well at 50 fps, but 30fps just isn't going to cut it for how many things I have to encode.
3rd May 2006, 19:59   #368
tsp
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
swaaye: The only restriction fft3dgpu places on multithreading is that it allows only one thread at a time to execute code inside fft3dgpu. This shouldn't cause any problems, so, as you noted, the limiting factor might be your GPU. It also depends on which program you use to encode, as some are single-threaded. VirtualDub works well because the encoding codec and avisynth run on different threads.
As you have a dual-core processor, you shouldn't expect to get 100% CPU utilization unless you're using a multithreaded codec, as fft3dgpu alone will rarely use 50%.
As for optimizing, the slowest part of my GPU code is the FFT. It relies heavily on more or less random texture lookups, so it is very bandwidth limited. I'm working on an improved version that shouldn't require as many texture lookups and should have more linear texture reads.

About your second problem, I don't know how fft3dgpu could cause that. Is the framerate stable or does it fluctuate a lot?

Last edited by tsp; 3rd May 2006 at 20:07.
3rd May 2006, 21:05   #369
AI
Registered User
Join Date: Jul 2005
Location: Russia, Ural
Posts: 77
swaaye:
A good idea for a fast CPU and a fast GPU:
first - luma on the CPU
second - chroma (U and V) on the GPU
tsp:
There is no problem with it; it is only a recommendation.

Last edited by AI; 4th May 2006 at 12:44.
3rd May 2006, 21:39   #370
swaaye
Registered User
Join Date: Jun 2004
Posts: 144
Quote:
Originally Posted by tsp
About your second problem I don't know how fft3dgpu should cause that. Is the framerate stable or does it fluctuate a lot?
I don't know what happened. It has me totally baffled. I didn't change anything between the encodes. Same settings for all. I use StaxRip for encoding and that uses VDubmod. It's definitely multithreaded. I've been encoding with it for months now and I can usually pull 90-100fps with what I'm doing and multithreaded Divx 6.2.

FFT3DGPU dropped me to exactly 50% CPU when I enabled it with the settings above. So, it was doing something making DIVX not use the 2nd core at all. I upped the DIVX quality settings in the codec and managed to pull 80-90% usage while maintaining the same encoding 40-55fps framerate. This worked for 1.5 encodes, on the second pass of the 2nd encode it dropped to 35fps inexplicably.

Now, I can not get that speed back at all and it is using 50% CPU no matter what. I am totally baffled. GPU clock speed had little to no effect which was extremely strange, signifying some other problem. Disabling FFT3DGPU in the filter stream brought speed up to the normal 100fps with both cores working. So it's gotta be something between the video card and the software.

Like I said I'll try reinstalling the video drivers tonight. I updated DirectX 9c to the April 2006 release. Could that be a prob? It worked fine for 3 passes of Divx though.
3rd May 2006, 21:46   #371
tsp
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
AI: what is wrong with
Code:
fft3dfilter(plane=0)
fft3dgpu(plane=1)
??
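
For reference, a rough sketch of the full luma/chroma split AI describes. It assumes fft3dgpu follows fft3dfilter's plane numbering (0 = Y, 1 = U, 2 = V, 3 = U and V, 4 = all planes); that mapping is not stated anywhere in this thread, so check the filter's documentation first. The source name "aaa" and sigma=2 are simply borrowed from swaaye's script as placeholders.

```avisynth
# Sketch only: luma denoised on the CPU, both chroma planes on the GPU.
# "aaa" and sigma=2 are placeholders taken from swaaye's script above.
MPEG2Source("aaa", idct=3)
fft3dfilter(sigma=2, plane=0)   # Y on the CPU
fft3dgpu(sigma=2, plane=3)      # U and V on the GPU (assumed plane numbering)
```

With plane=1, as in the two-line version above, only the U plane would go to the GPU and V would be left unfiltered.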

swaaye:
How does it behave if you replace fft3dgpu with fft3dfilter? Does it make any difference if you use an older version of fft3dgpu? What if you use VDub without StaxRip? Installing the April edition of DirectX shouldn't make any difference (it only installs a single dll file that doesn't overwrite older versions). Has the CPU affinity been changed somehow so that it only runs on 1 core? (Look in the Task Manager.)

Last edited by tsp; 3rd May 2006 at 22:07.
4th May 2006, 03:01   #372
swaaye
Registered User
Join Date: Jun 2004
Posts: 144
I got a monstrous speedup by moving resize from AVISynth to the DIVX codec's internal resize option. VERY odd. But that filter was the problem. I went from 35fps to 50+fps!

Now there are no filters after FFT3DGPU. So perhaps having filters after it can mess it up performance-wise?


edit: it's back to its old slow self again. Something strange is sure happening here. It's like DIVX gets locked to a single thread sometimes. But it's only with FFT3DGPU in use... and then there are the random times when it works and I get up to 90% CPU and 50+fps!

Last edited by swaaye; 4th May 2006 at 05:23.
4th May 2006, 15:13   #373
tsp
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
Swaaye: Try this version and see if it behaves better. I disabled the part of the code that only allows fft3dgpu to run in one thread at a time.
5th May 2006, 00:36   #374
swaaye
Registered User
Join Date: Jun 2004
Posts: 144
Okay I tried a bunch of tests with the new version. I think I've narrowed the problem down to bt=3.

FFT3dGPU(sigma=2,bt=2,sharpen=0.5) = 45-60 fps
FFT3dGPU(sigma=2,bt=3,sharpen=0.5) = 32-40 fps

Code:
MPEG2Source("ddddd", idct=3)
tfm(d2v="ddddd")
tdecimate(hybrid=1)
FFT3dGPU(sigma=2,bt=2,sharpen=0.5)
Basically there is a huge difference in speed with bt=3 vs. bt=2 or lower. What's odd is that I'm fairly sure I saw bt=3 doing that faster speed in an earlier run, but I don't see it now.... Interestingly, running my X800GTO2 at 400/500 vs. 500/610 has an almost unnoticeable impact on speed at the bt=3 setting whereas with the bt=2 speed it gets me 10fps more or so. bt=4 runs at about the same speed as bt=3.

Also, having filters after FFT3DGPU drops the framerate to the level of bt=3 regardless of the actual filter settings, even with default settings (i.e. FFT3DGPU()).
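
A sketch for repeating this comparison with only one value changing between runs. The variable name bt_test is made up for illustration and is not a filter parameter; everything else is the script above.

```avisynth
# Change only this value between runs (for example 2, 3, 4) and note the fps each time.
bt_test = 2   # hypothetical helper variable, not part of FFT3dGPU

MPEG2Source("ddddd", idct=3)
tfm(d2v="ddddd")
tdecimate(hybrid=1)
FFT3dGPU(sigma=2, bt=bt_test, sharpen=0.5)
```

Keeping the encoder settings and the rest of the filter chain fixed makes it easier to tell a bt effect apart from the run-to-run variation described above.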

Last edited by swaaye; 5th May 2006 at 01:06.
5th May 2006, 23:07   #375
tsp
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
Well, a higher bt means more work for the GPU, although it is somewhat strange that you don't see any difference when overclocking it. On my computer (Opteron 165 currently @ 1.8 GHz and a GeForce 7800GT at 500/700 (GPU/Mem)) I get 45-48 fps with bt=2 and 47-50 fps with bt=3. This is with no encoding.
One of the reasons you might see a framerate drop with filters after fft3dgpu is that they are not executed in parallel with fft3dgpu.
Is there any change in the CPU utilization between bt=2 and bt=3?
6th May 2006, 02:53   #376
swaaye
Registered User
Join Date: Jun 2004
Posts: 144
I tested the bt=3 and overclocking again and there does seem to be a difference after all. During encoding it'll vary between 30 and 50fps. Quite a range. I upped the DIVX quality slider to 7 (!!!) and with that I saw about 80% CPU.

I think I was getting thrown off by the initial slow 30fps. It starts off a lot faster at bt=2.

Last edited by swaaye; 6th May 2006 at 07:59.
10th May 2006, 10:21   #377
G_M_C
Registered User
Join Date: Feb 2006
Posts: 1,076
Up till now I've been using the FFT3D filter to clean up my video before processing, and when encoding I get an average of 7 fps using XviD (very slow for a dual Xeon, but hey... I'm still learning. SMP is next on the list, for instance).

This is the very simple script I'm running now:
Quote:
LoadPlugin("C:\Program Files\Video-Programs\DLLs\DGDecode.dll")
mpeg2source("C:\AVI-Forge\Input.D2V",idct=3)
Crop(0,16,0,-16)
Tweak(sat=1.1, bright=1.4, cont=0.96)
LanczosResize (688,384)
fft3dfilter(sigma=2, plane=4)
But today my new graphics board has arrived, a Gainward BLISS 7800GS GoldenSample plus with 512 MB and the full 24 pipelines (AGP, btw).

Later today I'll adapt my script to use FFT3DGPU instead of the "non GPU" version. I'm very curious to see what difference it will make.

As I said, my script is running at 7 fps now, and it's doing the 1st pass of a 2-pass encode. For the second pass I'll use the GPU version, so it will be an "honest" comparison between the two.

PS: Is this filter more dependent on the number of pipelines or on the number of shaders?
10th May 2006, 13:13   #378
tsp
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
fft3dgpu is mainly limited by memory bandwidth (memory clock and bus width). So I would say that more pipelines are better than more shaders per pipeline, but faster memory is even better.
To get the fairest comparison between fft3dfilter and fft3dgpu, use
fft3dgpu(sigma=2,plane=4,precision=2,bw=64,bh=64,ow=24,oh=24)
and I would recommend using the same script for both passes, as there are minor differences between the output of fft3dgpu and fft3dfilter.
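
A sketch of what that could look like in G_M_C's script, with only the denoiser line swapped. It assumes the fft3dgpu plugin is already loaded (for example from the Avisynth autoload plugin folder), since no path for it is given in this thread.

```avisynth
# G_M_C's script from above; only the last filter differs between the two runs.
# Assumption: fft3dfilter and fft3dgpu are loaded automatically from the plugin folder.
LoadPlugin("C:\Program Files\Video-Programs\DLLs\DGDecode.dll")
mpeg2source("C:\AVI-Forge\Input.D2V",idct=3)
Crop(0,16,0,-16)
Tweak(sat=1.1, bright=1.4, cont=0.96)
LanczosResize(688,384)

# CPU run: uncomment this line and comment out the fft3dgpu line below.
# fft3dfilter(sigma=2, plane=4)

# GPU run, with the settings suggested above for a fair comparison.
fft3dgpu(sigma=2,plane=4,precision=2,bw=64,bh=64,ow=24,oh=24)
```

Whichever line is active, the same script should be used for both passes of a 2-pass encode.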
10th May 2006, 16:03   #379
G_M_C
Registered User
Join Date: Feb 2006
Posts: 1,076
Thx for your answer

Anyway, I'm using the GPU version now, and the framerate is about three times as high (21-22 fps instead of 7).

So basically, it's faster to do a new 1st pass and then a new 2nd pass, because 2 passes with FFT3DGPU are faster than 1 pass with "regular" FFT3D (wow).

I'm very impressed.

Many kudos to you, man. Amazing stuff!

Btw, did you mean system memory or graphics memory in your previous post?

Last edited by G_M_C; 10th May 2006 at 16:13.
10th May 2006, 20:51   #380
tsp
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
Is that speed with fft3dgpu(sigma=2,plane=4,precision=2,bw=64,bh=64,ow=24,oh=24)?
The Geforce 7900 is a rather fast card.
I mean graphics memory. The FFT version I'm working on now should be less dependent on memory bandwidth, as I have cut down the number of texture lookups.