New ffdshow build (?)
Sharktooth
18th October 2005, 03:48
it doesnt crash on my box, but produces weird artifacts.
i use the latest CD build though.
Egh
18th October 2005, 04:32
My observations on recent builds:
1. GCC plain build -- slow as t3h hell. I don't know what kind of magical system you have, b0b0r, but on same CPU i have tremendously slowed down playback of AVC even w/o resizing (on VMR9 ofc, and *RGB*, not YV12 output).
2. b0b0r's ICL+gcc is not _that_ bad. But it seems that in highly CPU-intensive operations it still sucks. One of those high-CPU-eater operations is the software resizer, i.e. with that build i can't actually get non-drop-frame playback even on 640*480 xvid-encoded video resized to 800*600 by lanczos in ffdshow :)
3. Last celtic build == t3h win. That's the one which doesn't slow down in all my tests done. I don't know why exactly, but that's the truth :) IIRC movax builds (msvc ax + gcc dlls) are similar in speed too.
movax
18th October 2005, 05:21
MSVC .ax + GCC DLLs do seem to work the best. Trying to find my ICL7 (and then upgrade version ICL8->ICL9 :P) to see what happens with ICL .ax + GCC DLLs.
Kurosu
18th October 2005, 12:14
The following only happens for gcc.
1) To allow building for non-SSE2 computers, -msse2 must be removed from CFLAGS; but then the gcc builtins for SSE2 are undefined, leading to errors wherever those are used. As a consequence, SSE2 code has to be inhibited through defines.
2) But then, if you have an SSE2-capable computer, and some code is left that activates the (now-empty) SSE2 functions, you of course get garbage at the output.
3) Inhibiting code through ifdefs is tedious and boring work; therefore SSE code was not inhibited, and the -msse flag was left in. This yields a build where SSE opcodes are used (much like SSE2 when -msse2 was used), and therefore K6, Pentium 2, first Athlon/Duron and weird CPUs are not supported. This is not going to be fixed, not only because of the boring task but also because it clutters the source code.
4) All code accessing MSVC-compiled binaries (i.e. Avisynth or any dll using the MSVC C++ ABI) just won't work and has to be converted to a C equivalent, if any, or be deactivated. Therefore ffavisynth.dll and the avisynth processing filter are not working and will just crash. You can't use an MSVC version of those either because, again, the ABIs are incompatible.
5) (update) btw, the sluggishness in some dlls is due to gcc not inlining builtins, but actually making function calls out of them. It hardly gets slower than that.
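Points 1) to 3) can be illustrated with a minimal sketch. It assumes a hypothetical helper, add_sat_u8, that is not taken from the ffdshow sources; the only compiler-specific part is the __SSE2__ macro, which gcc defines when SSE2 code generation is enabled (for example through -msse2). The SSE2 loop is compiled only when the builtins exist, and the scalar loop is always present, so a build without -msse2 still produces correct output instead of an empty function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __SSE2__                 /* set by gcc when SSE2 code generation is on */
#include <emmintrin.h>
#endif

/* Hypothetical helper (not from ffdshow): saturating add of two byte
   buffers, dst[i] = min(255, a[i] + b[i]). */
void add_sat_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    size_t i = 0;
    assert(dst != NULL && a != NULL && b != NULL);

#ifdef __SSE2__
    /* SSE2 path: 16 bytes per iteration, built only when the builtins exist. */
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
        _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(va, vb));
    }
#endif

    /* Scalar path: the whole buffer without SSE2, or just the tail with it. */
    for (; i < n; i++) {
        unsigned sum = (unsigned)a[i] + b[i];
        dst[i] = (uint8_t)(sum > 255 ? 255 : sum);
    }
}
```

Note that this is a compile-time switch only: a binary built with the SSE2 path still needs a runtime CPU check before it may run on a non-SSE2 machine, which is the part that gets tedious across many filters.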
Inventive Software
18th October 2005, 12:52
@Celtic_druid: Thank you SO much!
For the record, I'm gonna try compiling ffdshow entirely with GCC, and with ONLY SSE enabled, thus making it work on my CPU without much trouble (hopefully!)
clsid
18th October 2005, 13:00
Here is another build:
http://www.megaupload.com/?d=5HWWOXB0
libavcodec.dll and mplayer.dll are compiled with GCC 4.0.2
everything else with ICL9
bob0r
18th October 2005, 14:26
@Kurosu
Very interesting, maybe someone will be up to do the hard dirty work for old and new cpus.
Any recommendations about how files should be compiled to reach the most people?
Also if you had to advise the ffdshow developer(s), what would your advice be?
@celtic_druid
When you get a fresh CVS source, can you explain to us, step by step, what actions you take, what files you edit, and which (directx) header files you use? Then how you edit those files and of course how you compile your ffdshow builds, i.e. what settings and optimizations, if any.
@all
Blur & NR > Denoise3D and HQ crash icl/gcc
fixed HQ denoise3d crash > http://cia.navi.cx/stats/project/ffdshow/.message/6073914
Comment by Milan: "Fixed."
ffdshow/about/version details crashes (gcc 4.0.2)
http://sourceforge.net/tracker/index.php?func=detail&aid=1329122&group_id=53761&atid=471489
Comment by Milan:
"I think it crashes on Avisynth version check. ffdshow invokes
VersionString command, but avisynth C++ interface is
incompatible with GCC."
This may be fixed in the future hopefully, but it's not that bad we can't use this :o
celtic_druid
18th October 2005, 14:39
Currently I am running the Microsoft DirectX 9.0 SDK (October 2005), which doesn't include dshow and isn't supposed to install on win2k, together with the 2003 SP1 Platform SDK (or was it the PSDK that isn't supposed to work with 2000?).
bob0r
18th October 2005, 14:43
Currently I am running the Microsoft DirectX 9.0 SDK (October 2005), which doesn't include dshow and isn't supposed to install on win2k, together with the 2003 SP1 Platform SDK (or was it the PSDK that isn't supposed to work with 2000?).
Okay...... some more info about how you use them (like i did in some above thread), or are you saying October 2005 header files should speed up ffdshow a lot over Summer 2004? (I btw got the latest platform sdk installed, any reason i should use those for building ffdshow, if so, how?)
celtic_druid
18th October 2005, 14:59
Well the reason I am using it is because the DX SDK for Oct doesn't include dshow.
I think the reason for the speedup is ICL9 although gcc with cpu specific flags could maybe beat it. It can for libavcodec anyway.
I compiled everything with ICL9 except for mplayer and part of libavcodec.
cc979
18th October 2005, 15:06
just tried 02dfe422f480810944852f55dea8db83 *ffdshow-20051018.exe: xvid/x264 decode is very slow compared to 52e5e5a5008760ecfbedbd209f964182 *ffdshow-20051015.exe
i have an nforce2 ultra mobo with an Athlon 2400
Kurosu
18th October 2005, 15:48
@Kurosu
Very interesting, maybe someone will be up to do the hard dirty work for old and new cpus.
Even if someone was, I think Milan would not like to do this, because of the code bloat. Anyway, it's just too much of a mess: gcc doesn't distinguish between mmxext and sse, so builds can only be plain MMX using intrinsics. But I did try this, and gcc was generating goofy opcodes (movl $0, mm0 in h264 chroma loop filter). It's a pity, because that code may not even be used in the end because of CPU detection.
Fixing this without compiler defines would require undoing most of the templatization, which is certainly not something Milan would bother to do now that the code is templated.
Any recommendations about how files should be compiled to reach the most people?
Milan chose the intrinsic way because gcc, icl and cl are able to understand it (producing optimized code from it is another matter). But gcc is geared towards targeted builds. To that end, using only gcc to build the most compatible software yields builds that have to use the lowest common denominator: sse. Otherwise, it's just a matter of which assembly syntax is used that should determine the compiler to use.
Also if you had to advise the ffdshow developer(s), what would your advice be?
I'm already discussing it with Milan. The non-SSE2 fixes are based on a patch I made that proved it was feasible. But I never went through testing all filters.
"I think it crashes on Avisynth version check. ffdshow invokes
VersionString command, but avisynth C++ interface is
incompatible with GCC."
That's one of the last, most obvious, gcc-compatibility fixes needed. Fixing makeAVIS was an incentive and an example on how to do this. Milan has particular plans on this subject, though. Time will tell.
bob0r
18th October 2005, 16:38
@celtic_druid:
Yes i understand WHAT you use to compile ffdshow, BUT how? Unless it's some secret, please share with us how you compile your builds, with as many details as possible.
Because some people are saying my ICL builds are slower than yours, when we both use ICL9.
It could be indeed the different DX SDK files, but, not including dshow, what benefit does this have?
When you say you partly compile libavcodec.dll with gcc, what do you do?
People have asked you this already, i think its time for the ultimate-celtic_druid-guide :D
@Kurosu
Understood and thanks.
bob0r
18th October 2005, 16:50
just tried 02dfe422f480810944852f55dea8db83 *ffdshow-20051018.exe: xvid/x264 decode is very slow compared to 52e5e5a5008760ecfbedbd209f964182 *ffdshow-20051015.exe
i have an nforce2 ultra mobo with an Athlon 2400
gcc = gcc 4.0.2
msvc = msvc 7.1
ffdshow-20051015.exe = (all files gcc, except ffdshow.ax and ff_vfw.dll are msvc)
ffdshow-20051018.exe = (all files gcc)
I was told ffdshow.ax (and ff_vfw.dll for that matter) are only a frontend for the encoders/decoders.
If i am wrong, i am sure the experts will correct me.
So this may be caused by possible libavcodec.dll updates, between 15 oct and 18 oct ( http://cia.navi.cx/stats/project/ffdshow )
Maybe someone with a CPU like yours can test this too.
I must have some weird system, all versions, all combinations, all just play very smooth!
videomixer9
18th October 2005, 19:51
CPU usage on 20051018 is low for me with h264, no difference to the celtic builds. I have nForce2 Ultra 400 1024 MB DDR333 Dual Channel with AMD Athlon 3000+ (Barton), GeForce FX 5600, WindowsXP and tested with Zoomplayer with Overlay Mixer (not using VMR9 because of nasty Luma Shift and various other reasons like high cpu usage for not better quality :), my name may suggest I do the exact opposite hehe). Xvid performance is quite good too and not much different from what I saw with the latest celtic_build.
NoX1911
18th October 2005, 21:07
not using VMR9 because of nasty Luma Shift and various other reasons like high cpu usage for not better quality :)
Enable 'Color Controls' in Zoom Player and leave them at the standard values. The picture (luma) is exactly like Overlay Mixer after that. I hate that Luma shift as well (0-255 <-> 16-235). Maybe it has something to do with TV luma restrictions. 'Color Controls' are finally working in every situation in the latest Zoom Player (i think).
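For reference, the luma shift mentioned here is the mapping between studio-range and full-range levels. The following is a minimal sketch, assuming the usual 8-bit video convention where luma nominally spans 16-235; the function names are made up for illustration and do not come from ffdshow or Zoom Player.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand one studio-range luma sample (nominally 16..235) to full
   range (0..255). Samples outside the nominal range are clamped. */
uint8_t luma_tv_to_pc(uint8_t y)
{
    int v = (int)y - 16;
    if (v < 0)
        v = 0;
    v = (v * 255 + 109) / 219;      /* scale by 255/219, rounded to nearest */
    return (uint8_t)(v > 255 ? 255 : v);
}

/* Apply the expansion in place to a whole luma plane of n samples. */
void expand_luma_plane(uint8_t *plane, size_t n)
{
    assert(plane != NULL);
    for (size_t i = 0; i < n; i++)
        plane[i] = luma_tv_to_pc(plane[i]);
}
```

If a renderer applies this expansion once more than expected, or skips it, the picture looks washed out or crushed, which is one plausible reading of the shift described above.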
Back to topic:
Is there any way to determine exact speed variations between different ffdshow builds, some kind of benchmark or something? Maybe a huge template benchmark video (2048x2048x30fps) that should stress even the biggest cpus and ffdshow is counting/logging fps so ppl can compare results? I mean.. better than just 'feeling' the difference...
cc979
18th October 2005, 22:44
bob0r:
i've re-tested the builds
52e5e5a5008760ecfbedbd209f964182 *ffdshow-20051015.exe
e0983d0fdaee4d123abcd57ee379d61b *ffdshow-20051017.exe
02dfe422f480810944852f55dea8db83 *ffdshow-20051018.exe
using the latest mpc using hardware-overlay with a xvid-file
all these builds work fine when post-processing is off (cpu 25%) but
spp deblocking cpu usage (cpu 85%) is a lot more with ffdshow-20051017.exe and ffdshow-20051018.exe
i could test more if you want.
cc979
18th October 2005, 23:05
just tested build:
866c5aca7188f9040f28e6408c6eb4e0 *ffdshow-20051017-clsid.exe
spp deblocking is ok on this
Blkbird
19th October 2005, 00:17
I've got a similar 100% CPU problem with the 20051018 (02dfe422f480810944852f55dea8db83) build, even without SPP deblocking.
Computer config: Athlon 2000, Windows XP SP2, Radeon 9000, Catalyst 5.8.
Playing 1000 kbit/s XviD, CPU goes to about 100% with my regular postprocessing options: Presets highest auto, Strength 100%, Method mplayer accurate luma full range, Nic's 20*40.
When I deselect Nic's CPU goes back to around 20%.
With 20051015 (52e5e5a5008760ecfbedbd209f964182), CPU is about 35% even with Nic's.
NoX1911
19th October 2005, 02:16
If you want to benchmark more seriously...
- enable 'OSD' with 'CPU load' and 'Save to' option
- Play your video
- open the resulting .csv file in excel
- insert =AVERAGE(A:A)*100 in field B
- repeat this under same conditions with different ffdshow builds and compare the results
Restart player or re-open file (eg. drag&drop) to reinitialize ffdshow to reset statistics. Stopping/restarting same video still uses same statistic file.
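The spreadsheet step can also be scripted. Below is a minimal sketch in C, under assumptions that the post does not confirm: the saved .csv is taken to hold one load sample per line in its first column, stored as a fraction (so 0.25 means 25 %, matching the *100 in the formula above), and the function name mean_cpu_percent is made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Mean of the first CSV column, scaled to percent.
   Lines that do not start with a number (a header, a blank line) are
   skipped. Returns -1.0 if no sample was found. */
double mean_cpu_percent(const char *csv)
{
    double sum = 0.0;
    long count = 0;
    assert(csv != NULL);

    while (*csv != '\0') {
        const char *nl = strchr(csv, '\n');   /* end of the current line */
        char *end;
        double v = strtod(csv, &end);

        /* Count the value only if it was parsed inside this line;
           strtod skips leading whitespace, including newlines. */
        if (end != csv && (nl == NULL || end <= nl)) {
            sum += v;
            count++;
        }
        if (nl == NULL)
            break;
        csv = nl + 1;
    }
    return count > 0 ? (sum / count) * 100.0 : -1.0;
}
```

To use it, read the whole .csv into a NUL-terminated buffer and pass that buffer in; repeating this for each build under the same conditions gives directly comparable numbers.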
Liisachan
19th October 2005, 03:02
for instance:
buildA gives a max CPU load of 90%, average CPU load 85%
buildB gives a max CPU load of 100%+ (unusable), average CPU load 80%
If you really mean it, don't blindly calculate the average
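A small sketch of what reporting more than the average could look like: alongside the mean it keeps the peak and counts how many samples hit a limit. The struct and function names are hypothetical, and the samples are assumed to be CPU load values already expressed in percent.

```c
#include <assert.h>
#include <stddef.h>

struct load_summary {
    double mean;        /* average load in percent */
    double peak;        /* highest single sample in percent */
    size_t overloaded;  /* number of samples at or above the limit */
};

/* Summarize n load samples (in percent) against a limit such as 100.0. */
struct load_summary summarize_load(const double *pct, size_t n, double limit)
{
    struct load_summary s = {0.0, 0.0, 0};
    double sum = 0.0;
    assert(pct != NULL && n > 0);

    for (size_t i = 0; i < n; i++) {
        sum += pct[i];
        if (pct[i] > s.peak)
            s.peak = pct[i];
        if (pct[i] >= limit)
            s.overloaded++;
    }
    s.mean = sum / (double)n;
    return s;
}
```

With samples like {85, 90, 80, 85} for one build and {70, 100, 70, 80} for another, the second has the lower mean (80 against 85) but is the only one with an overloaded sample, which is exactly the case the two example lines describe.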
clsid
19th October 2005, 11:23
Another benchmarking method is to take a short clip, let's say about 3 minutes. Open your Task Manager. Play the entire clip and note the CPU time consumed by your player. Close the player. Replace some files (or the entire build) and repeat.
Leak
19th October 2005, 11:35
Because some people are saying my ICL builds are slower than yours, when we both use ICL9.
Could it be that those people are using AMD CPUs? If so, you should probably have a look at this (http://yro.slashdot.org/comments.pl?sid=155593&threshold=5)... :(
np: Thomas Fehlmann - Hana (Lowflow)
clsid
19th October 2005, 12:23
The GenuineIntel check can be patched. I posted a link for it a few days ago.
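To make clear what such a check changes, here is an illustrative sketch only; it is not the Intel compiler's actual dispatcher, and the names are invented. It contrasts a dispatcher that also requires the "GenuineIntel" CPUID vendor string with one that looks at the reported feature alone.

```c
#include <assert.h>
#include <string.h>

enum code_path { PATH_GENERIC, PATH_SSE2 };

/* Vendor-gated dispatch: the fast path needs SSE2 *and* an Intel
   vendor string, so an SSE2-capable AMD CPU gets the generic path. */
enum code_path pick_path_vendor_gated(const char *vendor, int has_sse2)
{
    assert(vendor != NULL);
    if (has_sse2 && strcmp(vendor, "GenuineIntel") == 0)
        return PATH_SSE2;
    return PATH_GENERIC;
}

/* Feature-based dispatch: only the CPU's reported capability matters. */
enum code_path pick_path_by_feature(int has_sse2)
{
    return has_sse2 ? PATH_SSE2 : PATH_GENERIC;
}
```

Patching the check, in these terms, amounts to turning the first function into the second.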
cc979
19th October 2005, 19:54
i've tested ffdshow-20051015-icl.exe (65c727897c98792b70154603ea09e6c4), it just crashes
ffdshow-20051015.exe (52e5e5a5008760ecfbedbd209f964182) doesn't use ICL but it works best for me
ffdshow-20051018.exe (02dfe422f480810944852f55dea8db83) is all GCC but it does not work as well as ffdshow-20051015.exe
bob0r have you tried compiling all files with gcc, but using ICL for ffdshow.ax and ff_vfw.dll ?
bob0r
19th October 2005, 20:16
...
bob0r have you tried compiling all files with gcc, but using ICL for ffdshow.ax and ff_vfw.dll ?
My goal is to make a full GCC build of ffdshow, so it can be compiled on a weekly basis and on demand.
I don't think using ICL for ffdshow.ax and ff_vfw.dll should speed things up; if it does, something must be broken in the gcc versions.
So if an ICL/MSVC ffdshow.ax (ff_vfw.dll) does speed things up, i will report this as a question/bug.
But maybe some readers can shed some extra light on this.
Even msvc ffdshow.ax + the rest gcc does not slow down, while generally msvc is very slow. For now i am focussing on GCC only, but i guess i can make some ICL/MSVC files in the future; compiling with ICL takes very long, though, and you need to add linkers in ffdshow, which is kind of annoying.
Speaking of gcc and fixed:
it's crashes when opening "version details" window.
release avisynth_c environment > http://cia.navi.cx/stats/project/ffdshow/.message/6086170
Seems it got fixed, it's not crashing for me.
Compiled a new TEST build, all files gcc 4.0.2
http://cia.navi.cx/stats/project/ffdshow (last = 13:58 on Oct 19, 2005)
http://mirror05.x264.nl/public/ffdshow/ffdshow-20051019-test.exe (78ff930250380514ec36e8e9b92e2f0d)
Inventive Software
19th October 2005, 20:27
Right, at the risk of sounding completely dumb: Is ICL free?
cc979
19th October 2005, 20:59
bob0r
sounds a bit complicated, do you compile the GCC stuff in a linux environment?
but keep up the good work
lazyn00b
19th October 2005, 22:56
Recent builds of FFDshow (audio) cause MCE 2005 w/Rollup 2 to crash when decoding MP2 and AC3. 20050930, 20051013, and 20051015 are all affected with this problem. 20050822 and 20050920 still work fine.
Did something change in the audio decoder code between 9/20 and 9/30?
EDIT: I should mention that it's just the Media Center app that crashes, not the whole OS. I am running a Pentium D 820 (2 x 2.8 Ghz) with 1 GB of memory and my sound card is on-board Realtek ALC882 (Intel HD Audio).
vortex_hl
20th October 2005, 00:12
@bob0r
All .txt files in the ffdshow directory are displayed incorrectly in your builds. No problem with the CD and clsid ones.
http://img22.imagevenue.com/loc54/th_507_ffd.jpg (http://img22.imagevenue.com/img.php?loc=loc54&image=507_ffd.jpg)
Egh
20th October 2005, 00:16
@bob0r
All .txt files in the ffdshow directory are displayed incorrectly in your builds. No problem with the CD and clsid ones.
http://img22.imagevenue.com/loc54/th_507_ffd.jpg (http://img22.imagevenue.com/img.php?loc=loc54&image=507_ffd.jpg)
heh, i was mentioning that a couple of days ago, but CR/LF are still broken :)
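If the cause is what the CR/LF remark suggests, i.e. the .txt files being packaged with bare LF (Unix) line endings that some Windows viewers do not break on, the fix is a line-ending conversion before packaging. A minimal sketch under that assumption, with a hypothetical function name:

```c
#include <assert.h>
#include <stddef.h>

/* Copy `in` to `out`, turning every bare LF into CR LF. Existing CR LF
   pairs are left untouched. Returns 1 on success, 0 if `out` (of
   out_size bytes, terminator included) is too small. */
int lf_to_crlf(const char *in, char *out, size_t out_size)
{
    size_t o = 0;
    char prev = '\0';
    assert(in != NULL && out != NULL);

    for (; *in != '\0'; prev = *in++) {
        size_t need = (*in == '\n' && prev != '\r') ? 2 : 1;
        if (o + need >= out_size)       /* keep room for the terminator */
            return 0;
        if (need == 2)
            out[o++] = '\r';
        out[o++] = *in;
    }
    if (o >= out_size)
        return 0;
    out[o] = '\0';
    return 1;
}
```

In a mingw/msys build environment the same effect is usually achieved with an existing tool such as unix2dos, so this only shows what the conversion does.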
cc979
20th October 2005, 00:34
bob0r:
just tested the ffdshow 051019-test.exe (78ff930250380514ec36e8e9b92e2f0d)
everything works smooth when postprocessing is off
i tested the deblocking functions on their own and found that the luminance deblock(h) is using a lot more cpu than usual on my system.
do you know if there were any recent changes to the standard deblocking code?
movax
20th October 2005, 01:23
Right, at the risk of sounding completely dumb: Is ICL free?
Nope, the Intel Compiler costs a pretty penny.
NoX1911
20th October 2005, 03:02
Anyone able to enter negative delay values in audio decoder? Seems that only positive values are accepted... Maybe i'm wrong but wasn't it implemented already before?
bob0r
20th October 2005, 04:37
bob0r
sounds a bit complicated, do you compile the GCC stuff in a linux environment?
but keep up the good work
Windows XP Professional, Service Pack 2 (5.1 - 2600)
Mingw/msys is what i compile on. (same for x264 builds)
@bob0r
All .txt files in the ffdshow directory are displayed incorrectly in your builds. No problem with the CD and clsid ones.
http://img22.imagevenue.com/loc54/th_507_ffd.jpg (http://img22.imagevenue.com/img.php?loc=loc54&image=507_ffd.jpg)
heh, i was mentioning that a couple of days ago, but CR/LF are still broken :)
I don't have it this way, maybe CD and clsid can explain to me what i should do (if its not ffdshow.ax being icl and not gcc as in my case)
bob0r:
just tested the ffdshow 051019-test.exe (78ff930250380514ec36e8e9b92e2f0d)
everything works smooth when postprocessing is off
i tested the deblocking functions on their own and found that the luminance deblock(h) is using a lot more cpu than usual on my system.
do you know if there were any recent changes to the standard deblocking code?
http://cia.navi.cx/stats/project/ffdshow
Have a look for yourself, i will make some ICL/MSVC ffdshow.ax (and ff_vfw.dll) files later, maybe the cause is there.
Anyone able to enter negative delay values in audio decoder? Seems that only positive values are accepted... Maybe i'm wrong but wasn't it implemented already before?
Good question, i'll test this later on as well :D
vidhead
20th October 2005, 05:00
Windows XP Professional, Service Pack 2 (5.1 - 2600)
Mingw/msys is what i compile on. (same for x264 builds)
i've read and followed your guide/steps (above) to compile ffdshow with mingw/msys on winxp...but it's not working out. i'm either too dumb or too busy or both, so please do a detailed breakdown of the steps, no matter how trivial.
celtic_druid
20th October 2005, 07:07
Not working out? What exactly does that mean? Compiled, but doesn't work? Won't compile? If you can give details about what isn't working then it should be possible to do something about it.
Liisachan
20th October 2005, 08:12
ffdshow-20051019-test.exe is too slow for me when I play an ordinary xvid clip with VMR9 renderless + ffdshow-side RGB32 output. The problem is gone with default (YUY2) color spaces.
The same problem exists in ffdshow-20051017.exe and ffdshow-20051018.exe
The last good for me is celtic_druid's ffdshow-20051013.exe
Inventive Software
20th October 2005, 09:38
Nope, the Intel Compiler costs a pretty penny.
At a guess, the pretty penny is likely to break the bank balance, right? I.e., more than £200
clsid
20th October 2005, 11:51
The linux version of ICL is free. For Windows there is an evaluation version.
http://www.intel.com/cd/software/products/asmo-na/eng/compilers/index.htm
Sharktooth
20th October 2005, 13:00
are you sure linux compiler is free? http://www.intel.com/cd/software/products/asmo-na/eng/compilers/219937.htm
dimzon
20th October 2005, 13:58
Has anybody tried the Open Watcom C++ Compiler (http://www.openwatcom.org/)?
Watcom C/C++ compiler v 10.x was THE BEST C/C++ compiler in the middle of the 90's. Wide range of optimizations etc...
Leak
20th October 2005, 14:21
are you sure linux compiler is free? http://www.intel.com/cd/software/products/asmo-na/eng/compilers/219937.htm
As long as you don't use it for commercial purposes... (http://www.intel.com/cd/software/products/asmo-na/eng/download/download/index.htm)
np: I'm Not A Gun - Every Moment Is Ours (Our Lives On Wednesdays)
Sharktooth
20th October 2005, 14:54
Has anybody tried the Open Watcom C++ Compiler (http://www.openwatcom.org/)?
Watcom C/C++ compiler v 10.x was THE BEST C/C++ compiler in the middle of the 90's. Wide range of optimizations etc...
Yep it was widely used for games and other stuff that required performance.
However i don't know its current status...
dimzon
20th October 2005, 15:39
Yep it was widely used for games and other stuff that required performance.
However i don't know its current status...
Just test it!
madman1980
21st October 2005, 00:06
I'm on a nf4 + a64 and I too have the slowdown (unplayable) with SPP deblock lum H+V with version 20051018
Egh
21st October 2005, 04:09
ffdshow-20051019-test.exe is too slow for me when I play an ordinary xvid clip with VMR9 renderless + ffdshow-side RGB32 output. The problem is gone with default (YUY2) color spaces.
The same problem exists in ffdshow-20051017.exe and ffdshow-20051018.exe
The last good for me is celtic_druid's ffdshow-20051013.exe
I'm glad I'm not alone :)
VMR9 renderless really requires a bit more CPU. I guess some of those instructions benefit greatly from optimisation, and if a build is unoptimized then playback is slow. A sheer guess would be that the cause here is the YV12-->RGB conversion, which is applied to every pixel in this mode and relies on that kind of optimisation.
Other filters, like previously mentioned software bicubic/Lanczos resizers (and SPP deblocking, as others reported) also seem to have such behaviour.
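To give an idea of the per-pixel work involved, here is a sketch of a YCbCr to RGB conversion for one pixel. It assumes BT.601 studio-range input and uses the common 8-bit fixed-point coefficients (298, 409, 100, 208, 516); whether ffdshow uses exactly this matrix or these constants is not stated in the thread, and the function names are invented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a value scaled by 256 to 0..255 after removing the scale. */
static uint8_t clip8(int scaled)
{
    if (scaled < 0)
        return 0;
    scaled >>= 8;
    return (uint8_t)(scaled > 255 ? 255 : scaled);
}

/* Convert one studio-range YCbCr sample triple to 8-bit RGB
   (rgb[0] = R, rgb[1] = G, rgb[2] = B). */
void ycbcr_to_rgb(uint8_t y, uint8_t cb, uint8_t cr, uint8_t rgb[3])
{
    int c = (int)y - 16;
    int d = (int)cb - 128;
    int e = (int)cr - 128;
    assert(rgb != NULL);

    rgb[0] = clip8(298 * c + 409 * e + 128);
    rgb[1] = clip8(298 * c - 100 * d - 208 * e + 128);
    rgb[2] = clip8(298 * c + 516 * d + 128);
}
```

Each output pixel costs several multiplies, adds and clamps, and in YV12 every chroma sample is shared by a 2x2 block of pixels that still have to be converted one by one. That is the kind of loop where SIMD code, or the lack of it, makes a large difference.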
Liisachan: what DX version / video card do you have?
Liisachan
21st October 2005, 15:00
ffdshow-20051017-clsid.exe works fine with MPC VMR9 renderless + ffdshow RGB32 too. (Thanks clsid!!)
DirectX 9.0c (October 2005) / Quadro FX 500 / NVidia ForceWare 78.01 WHQL
Sharktooth
21st October 2005, 15:18
RGB... why ppl still use RGB...
Liisachan
21st October 2005, 15:36
Overlay has its strengths, certainly, but still:
"I still feel software-side RGB32 is more beautiful, but that may be just my imagination."
"I was comparing using ffdshow with force rgb32 (high quality) output versus using the hardware (geforce 6) do the color conversion, both using VMR9. I noticed that using ffdshow gives better colors which you clearly see on bright objects, especially red objects."
http://forum.doom9.org/showthread.php?p=719923#post719923