Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
27th March 2006, 17:25 | #361 | Link |
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
|
Boulder: sure, but you will have to wait until Saturday before I can compile it.
Mug Funky: I didn't expect anyone to set ow or oh to zero, so it is kind of unsupported, but I will see what I can do about it. (Just curious: doesn't it produce a lot of grid artifacts, or is it part of a more complicated script?)
__________________
Get my avisynth filters @ http://www.avisynth.org/tsp/ |
30th March 2006, 06:55 | #362 | Link |
interlace this!
Join Date: Jun 2003
Location: i'm in ur transfers, addin noise
Posts: 4,555
|
more a matter of seeing how fast it can go. the grids aren't bad if the sigma is low, and bt=0 gets a lot of smoothing out of a low sigma.
of course, for stuff i care about overlapping is a must. i was just playing with it to see what i could get out of it. if you set the block sizes really small (2), then it acts like a basic temporal filter, which can be fun to have on a GPU. of course it's not the best usage for a filter like this...
__________________
sucking the life out of your videos since 2004 |
1st April 2006, 21:05 | #363 | Link |
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
|
new version that should fix the HC 0.17 bug. version 0.6.2.1
__________________
Get my avisynth filters @ http://www.avisynth.org/tsp/ |
3rd May 2006, 03:53 | #366 | Link |
Registered User
Join Date: Jun 2004
Posts: 144
|
FFT3DGPU is only hitting my dual core at 50%. Is this right? I don't understand why Divx isn't even using some of the 2nd core with FFT3DGPU in use. It's like it restricts everything to a single thread. I disable FFT3DGPU and I get back up to 100 fps, with it I can't do better than 25 fps!
Code:
MPEG2Source("aaa", idct=3)
tfm(d2v="aaa")
tdecimate(hybrid=1)
FFT3dGPU(sigma=2,bt=3,sharpen=0.7,precision=1)
Crop(2,0,-2,-0)
BicubicResize(640,352,0,0.75)

edit: with precision at 0 I get 40-50fps. CPU usage is slightly higher. I figured that since my GPU is the bottleneck I'll use the extra CPU cycles for higher DIVX quality settings.
TSP you should post on the Beyond3d.Com forum about optimizing the shader code. A lot of engineers from ATI and NV, along with some extremely talented graphics programmers, hang out there and I bet they'd welcome a challenge with such an ingenious use of their hardware.
Last edited by swaaye; 3rd May 2006 at 06:08. |
3rd May 2006, 19:34 | #367 | Link |
Registered User
Join Date: Jun 2004
Posts: 144
|
I ran into a strange problem last night. On the 2nd pass of my 2nd encode I lost half of my framerate. I went back and re-ran the first encode and it was also half speed. Rebooted, no help. Reset video driver settings to defaults. Nothing. It seems to be FFT3DGPU that's causing the issue, somehow the video card and it aren't getting along now. I am totally baffled. I underclocked and overclocked my GPU and saw no speed change.
Tonight I will reinstall drivers for the video card and see if that helps. Sorta bummed here cuz it was going so well at 50 fps, but 30fps just isn't going to cut it for how many things I have to encode. |
3rd May 2006, 19:59 | #368 | Link |
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
|
swaaye: The only restriction fft3dgpu places on multithreading is that it allows only 1 thread at a time to execute code inside fft3dgpu. This shouldn't cause any problems, so as you noted the limiting factor might be your GPU. It also depends on what program you use to encode with, as some are only single-threaded. VirtualDub works well, as the encoding codec and avisynth run on different threads.
As you have a dual-core processor you shouldn't expect to get 100% CPU utilization unless you're using a multithreaded codec, as fft3dgpu alone will rarely use 50%. As for optimizing, the slowest part of my GPU code is the FFT. It relies heavily on more or less random texture lookups, so it is very bandwidth limited. I'm working on an improved version that shouldn't require as many texture lookups and should have more linear texture reads. About your second problem, I don't know how fft3dgpu could cause that. Is the framerate stable or does it fluctuate a lot?
__________________
Get my avisynth filters @ http://www.avisynth.org/tsp/ Last edited by tsp; 3rd May 2006 at 20:07. |
3rd May 2006, 21:39 | #370 | Link | |
Registered User
Join Date: Jun 2004
Posts: 144
|
Quote:
FFT3DGPU dropped me to exactly 50% CPU when I enabled it with the settings above. So, it was doing something making DIVX not use the 2nd core at all. I upped the DIVX quality settings in the codec and managed to pull 80-90% usage while maintaining the same encoding 40-55fps framerate. This worked for 1.5 encodes, on the second pass of the 2nd encode it dropped to 35fps inexplicably. Now, I can not get that speed back at all and it is using 50% CPU no matter what. I am totally baffled. GPU clock speed had little to no effect which was extremely strange, signifying some other problem. Disabling FFT3DGPU in the filter stream brought speed up to the normal 100fps with both cores working. So it's gotta be something between the video card and the software. Like I said I'll try reinstalling the video drivers tonight. I updated DirectX 9c to the April 2006 release. Could that be a prob? It worked fine for 3 passes of Divx though. |
|
3rd May 2006, 21:46 | #371 | Link |
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
|
AI: what is wrong with
Code:
fft3dfilter(plane=0)
fft3dgpu(plane=1)

swaaye: How does it behave if you replace fft3dgpu with fft3dfilter? Does it make any difference using an older version of fft3dgpu? What if you use VDub without StaxRip? Installing the April edition of DX shouldn't make any difference (as it only installs a single dll file that doesn't overwrite older versions). Has the CPU affinity been changed somehow so it only runs on 1 core? (Look in the Task Manager.)
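A minimal sketch of the swap test suggested above, reusing the script posted earlier in the thread. It assumes the installed fft3dfilter version accepts the same sigma, bt and sharpen arguments; precision is left out because it is an fft3dgpu setting. Encode once with each filter line active and compare framerate and CPU usage:

```avisynth
MPEG2Source("aaa", idct=3)
tfm(d2v="aaa")
tdecimate(hybrid=1)

# Test 1: CPU filter in place of fft3dgpu (assumed to take the same arguments)
FFT3DFilter(sigma=2, bt=3, sharpen=0.7)

# Test 2: the original GPU line; uncomment it and comment out the line above
# FFT3dGPU(sigma=2,bt=3,sharpen=0.7,precision=1)

Crop(2,0,-2,-0)
BicubicResize(640,352,0,0.75)
```

If the encoder still sits at 50% CPU with the CPU filter in place, the GPU filter is probably not what is pinning things to one core.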
__________________
Get my avisynth filters @ http://www.avisynth.org/tsp/ Last edited by tsp; 3rd May 2006 at 22:07. |
4th May 2006, 03:01 | #372 | Link |
Registered User
Join Date: Jun 2004
Posts: 144
|
I got a monstrous speedup by moving resize from AVISynth to the DIVX codec's internal resize option. VERY odd. But that filter was the problem. I went from 35fps to 50+fps!
Now there are no filters after FFT3DGPU. So perhaps having filters after it can mess it up performance-wise?
edit: it's back to its old slow self again. Something strange is sure happening here. It's like DIVX gets locked to a single thread sometimes. But it's only with FFT3DGPU in use... and then there are the random times when it works and I get up to 90% CPU and 50+fps!
Last edited by swaaye; 4th May 2006 at 05:23. |
4th May 2006, 15:13 | #373 | Link |
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
|
Swaaye: Try this version and see if it behaves better. I disabled the part of the code that only allows fft3dgpu to run in one thread at a time.
__________________
Get my avisynth filters @ http://www.avisynth.org/tsp/ |
5th May 2006, 00:36 | #374 | Link |
Registered User
Join Date: Jun 2004
Posts: 144
|
Okay I tried a bunch of tests with the new version. I think I've narrowed the problem down to bt=3.
FFT3dGPU(sigma=2,bt=2,sharpen=0.5) = 45-60 fps
FFT3dGPU(sigma=2,bt=3,sharpen=0.5) = 32-40 fps
Code:
MPEG2Source("ddddd", idct=3)
tfm(d2v="ddddd")
tdecimate(hybrid=1)
FFT3dGPU(sigma=2,bt=2,sharpen=0.5)

Also, having filters after FFT3DGPU drops the framerate to the level of bt=3 regardless of the actual filter settings, even default settings (i.e. FFT3DGPU() ).
Last edited by swaaye; 5th May 2006 at 01:06. |
5th May 2006, 23:07 | #375 | Link |
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
|
Well, higher bt means more to do for the GPU, although it is somewhat strange that you don't see any difference when overclocking it. On my comp (Opteron 165 currently @ 1.8GHz and a Geforce 7800GT 500/700 (GPU/Mem)) I get 45-48fps with bt=2 and 47-50 fps with bt=3. This is with no encoding.
One reason you might see a framerate drop with filters after fft3dgpu is that they are not executed in parallel with fft3dgpu. Is there any change in CPU utilization between bt=2 and bt=3?
__________________
Get my avisynth filters @ http://www.avisynth.org/tsp/ |
6th May 2006, 02:53 | #376 | Link |
Registered User
Join Date: Jun 2004
Posts: 144
|
I tested the bt=3 and overclocking again and there does seem to be a difference after all. During encoding it'll vary between 30 and 50fps. Quite a range. I upped the DIVX quality slider to 7 (!!!) and with that I saw about 80% CPU.
I think I was getting thrown off by the initial slow 30fps. It starts off a lot faster at bt=2.
Last edited by swaaye; 6th May 2006 at 07:59. |
10th May 2006, 10:21 | #377 | Link | |
Registered User
Join Date: Feb 2006
Posts: 1,076
|
Up till now I've been using the FFT3D filter to clean up my video before processing, and upon encoding I get avg. 7 fps using XviD (very slow for a dual Xeon, but hey . . . I'm still learning. SMP is next on the list, for instance).
This is a very simple script I'm running now
Quote:
Later today I'll adapt my script to use FFT3DGPU instead of the "non GPU" version. I'm very curious to see what a difference it will make. As said, my script is running @ 7 fps now and it's doing the 1st pass of a 2-pass encode. With the second pass I'll use the GPU version, so it will be an "honest" comparison between the two.
PS: Is this filter more dependent on the "number of pipelines" or "the number of shaders"? |
|
10th May 2006, 13:13 | #378 | Link |
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
|
fft3dgpu is mainly limited by memory bandwidth (memory clock and width). So I would say that more pipelines is better than more shaders per pipeline but faster memory is even better.
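As a rough illustration of "memory clock and width": peak memory bandwidth is commonly estimated as clock times transfers per clock times bus width in bytes. The figures below are assumptions for illustration only (a 256-bit bus, double data rate, and the 700 MHz memory clock mentioned earlier in the thread taken as the real clock), not measurements:

```latex
\begin{align*}
B &= f_{\text{mem}} \cdot k \cdot \frac{W}{8} \\
  &= 700 \times 10^{6}\ \text{Hz} \cdot 2 \cdot \frac{256\ \text{bit}}{8\ \text{bit/byte}} \\
  &= 44.8 \times 10^{9}\ \text{bytes/s} \approx 44.8\ \text{GB/s}
\end{align*}
```

Here $f_{\text{mem}}$ is the memory clock, $k$ the transfers per clock and $W$ the bus width in bits. Doubling either the clock or the bus width doubles the estimate, while a faster shader core leaves it unchanged.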
To get the fairest comparison between fft3dfilter and fft3dgpu use fft3dgpu(sigma=2,plane=4,precision=2,bw=64,bh=64,ow=24,oh=24), and I would recommend using the same script for both passes, as there are minor differences between the output from fft3dgpu and fft3dfilter.
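A minimal sketch of such a comparison, assuming fft3dfilter accepts the same sigma, plane, bw, bh, ow and oh arguments (precision is specific to fft3dgpu). The source file name clip.d2v is hypothetical. Switch which filter line is commented out and encode the same clip with each:

```avisynth
# Hypothetical source; replace with your own d2v project
MPEG2Source("clip.d2v")

# CPU version (comment out when testing the GPU version)
FFT3DFilter(sigma=2, plane=4, bw=64, bh=64, ow=24, oh=24)

# GPU version with the settings suggested above
# FFT3DGPU(sigma=2, plane=4, precision=2, bw=64, bh=64, ow=24, oh=24)
```

Keeping everything else in the script identical means any difference in fps comes from the filter itself rather than from the block size or overlap settings.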
__________________
Get my avisynth filters @ http://www.avisynth.org/tsp/ |
10th May 2006, 16:03 | #379 | Link |
Registered User
Join Date: Feb 2006
Posts: 1,076
|
Thx for your answer.
Anyway, using the GPU now, and fps is up to about 3x (21-22 fps instead of 7). So basically, it's faster to do a new 1st pass and then a new 2nd pass, because 2 passes with FFT3DGPU are faster than 1 pass with "regular" FFT3D (wow). I'm very impressed. Many kudos to you, man. Amazing stuff!
Btw. did you mean system memory or graphics memory in your post before?
Last edited by G_M_C; 10th May 2006 at 16:13. |
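A quick check of the two-pass claim, as a worked estimate. N is a placeholder for the number of frames in the clip (not a figure from the thread), and 21 fps is taken as the lower end of the reported GPU speed:

```latex
\begin{align*}
t_{\text{CPU, 1 pass}} &= \frac{N}{7\ \text{fps}} \approx 0.143\,N\ \text{s} \\
t_{\text{GPU, 2 passes}} &= \frac{2N}{21\ \text{fps}} \approx 0.095\,N\ \text{s} \\
\frac{t_{\text{GPU, 2 passes}}}{t_{\text{CPU, 1 pass}}} &= \frac{2/21}{1/7} = \frac{2}{3}
\end{align*}
```

At these rates, two full passes with FFT3DGPU take roughly two thirds of the time of a single pass with the CPU filter.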
10th May 2006, 20:51 | #380 | Link |
Registered User
Join Date: Aug 2004
Location: Denmark
Posts: 807
|
Is that speed with fft3dgpu(sigma=2,plane=4,precision=2,bw=64,bh=64,ow=24,oh=24)?
The Geforce 7900 is a rather fast card. I mean graphics memory. The FFT version I'm working on now should be less dependent on memory bandwidth, as I have cut down the number of texture lookups.
__________________
Get my avisynth filters @ http://www.avisynth.org/tsp/ |