Multithreaded XviD - official thread
sysKin
24th February 2006, 10:27
[edit - this was updated. a lot]
Hi :)
I hope you don't mind I opened a new thread about it, but new MT code is significantly different than previews and I'd really like the first post to explain all current information, not old obsolete experimental stuff.
So, here we go:
What happened: Multithreaded XviD code is now committed to CVS. From now on, all "xvid head" or "xvid 1.2.x" versions have it.
What does it mean: If you have a multi-core, multi-CPU or just Hyperthreaded system, you get more speed by using the new multithreaded capabilities. If not, you can ignore them completely; in fact, you are recommended to ignore them. Everything below only applies when the number of threads is greater than 0; with 0 threads there is no difference from before.
How to use it: In VfW, the number of threads can be changed by clicking on Other Options at the bottom. When using xvid_encraw, use the -threads parameter. You need a new xvid_encraw for that (compiled today or later). In general, use a number of threads equal to your number of cores. You can make your own measurements though - for example, fast first pass seems to be a bit *faster* on just one thread... But all that depends on your setup, settings, complexity of the avisynth script etc.
Is quality lower: No, resulting file is (should be) the same regardless of number of threads.
Is there a maximum: I think a value that's waaay too high will cause a crash ;) and that's my fault. In theory, the maximum # of threads that can be used is equal to picture height divided by 16, rounded up.
Is there an option that b0rks it: Not as such. However, GMC remains not multithreaded at all, so if you use GMC you probably lose most benefits.
CPU usage is better at number of threads = 3 than = 2, should I use 3 then?: DO NOT, I repeat, DO NOT measure this code's performance by looking at CPU usage in task manager. Measure speed. In fact I get better speed at two threads (85% CPU usage) than at three (~95% usage). That's the way it is.
So, can I increase CPU usage because 85% is only 85%: Yes, increase thread priority. On a dual-core system it doesn't hurt anyway, and the higher priority the less idle CPU. You probably shouldn't feel any difference, while "Higher" gives me 98% usage and is faster than "Normal". HOWEVER, "Highest" is slower again. In fact MUCH slower. Probably takes CPU power away from avisynth.
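The ceiling given under "Is there a maximum" is plain integer arithmetic: picture height divided by 16, rounded up. A minimal sketch of that calculation follows. The function name max_useful_threads is made up for illustration and is not part of the XviD API, and nothing here reflects how the codec itself clamps the setting.

```c
#include <assert.h>

/* Hypothetical helper, not an XviD function: upper bound on the number
 * of threads, i.e. picture height divided by 16, rounded up. */
static int max_useful_threads(int height)
{
    assert(height > 0);          /* a picture has at least one line */
    return (height + 15) / 16;   /* integer ceiling of height / 16 */
}
```

For a 480-line picture this gives 30, and for 1088 lines it gives 68, so the limit is far above any sensible setting on a dual-core machine.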
I'm looking forward to your impressions, speed results, and crash reports :)
Regards,
Radek
Until Koepi or Celtic Druid make their builds, you can get a crude binary here: http://syskin.is.dreaming.org/xvid-smp.zip . Unzip, right click on the .inf file, select install.
Axed
24th February 2006, 12:00
Wow, this is great news for those lucky buggers with dual core processors! I'll give it a try on my p4 when I get a chance, but the damned thing isn't booting at the moment (again, time for another new mobo!).
To be truthful, I'm going to be extremely surprised if there's any difference at all using threads on my p4, since I haven't got a boost with any of the other programs with extra threads.
squid_80
24th February 2006, 13:00
How to use it: In VfW, the number of threads can be changed by clicking on Other Options at the bottom. When using xvid_encraw, use -threads parameter. You need a new xvid_encraw for that (compiled today or later).
Use the following number of threads:
* uniprocessor systems: 0
* HT pentium 4s : probably 1. I don't know really.
* Multi cores, multi CPUs: probably equal to real number of cores. I don't know really ;p
Are there plans to implement some sort of no. of cores auto-detection, so an XVID_GBL_INFO call will return the right number of threads to use? Intel has given example assembly code and it seems to have been adopted by AMD to match.
708145
24th February 2006, 13:06
Are there plans to implement some sort of no. of cores auto-detection, so an XVID_GBL_INFO call will return the right number of threads to use? Intel has given example assembly code and it seems to have been adopted by AMD to match.
The optimal number of threads also depends on the settings used. It is thus a bit more complex.
sysKin
24th February 2006, 15:35
The optimal number of threads also depends on the settings used. It is thus a bit more complex.
Actually I just removed this idea of mine, to run encoding as yet another thread. So now number of threads = number of cpus always, after all.
Are there plans to implement some sort of no. of cores auto-detection, so an XVID_GBL_INFO call will return the right number of threads to use? Intel has given example assembly code and it seems to have been adopted by AMD to match.
In theory XVID_GBL_INFO returns that, just look at the code. It's old stuff though.
Anyone knows a win32 equivalent of pthread_num_processors_np() ?
Since threads = 0 is pretty much identical to threads = 1 now, I can make = 0 an autodetection.
Doom9
24th February 2006, 16:32
win32 equivalent of pthread_num_processors_np() ?
Not exactly an equivalent, but get the number of processors environment variable.. it does the trick on NT based boxes. And if it's not there, set the number of threads to one because those pseudo 32 bit Windows can't handle more than one cpu anyway. This is how the autosetting of nbthreads works in megui (except that I'm not catching the error in case the environment variable is there.. some might argue that's a bug.. I consider it a feature you get if you use a crappy OS <evil grin>)
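The environment-variable approach can be sketched in plain C. On NT-based Windows the variable is named NUMBER_OF_PROCESSORS. The helper names below are invented for illustration, and the upper sanity cap of 1024 is an arbitrary assumption. This is not how megui or XviD actually implement it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: turn the text of NUMBER_OF_PROCESSORS into a
 * thread count. A missing or malformed value falls back to 1. */
static int threads_from_env_value(const char *value)
{
    char *end;
    long n;

    if (value == NULL || *value == '\0')
        return 1;                            /* variable not set */
    n = strtol(value, &end, 10);
    if (*end != '\0' || n < 1 || n > 1024)   /* 1024: arbitrary sanity cap */
        return 1;
    return (int)n;
}

/* Read the variable from the environment of the current process. */
static int detect_thread_count(void)
{
    int n = threads_from_env_value(getenv("NUMBER_OF_PROCESSORS"));
    assert(n >= 1);
    return n;
}
```

Unlike the affinity-mask method, this ignores any CPU affinity set on the process, so it can report more processors than the encoder is allowed to use.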
captainvideo
25th February 2006, 00:38
could someone post a link.
dimzon
25th February 2006, 01:23
Is quality lower: No, resulting file is (should be) the same regardless of number of threads.
WOW!
Is this technology applicable to AVC? Is it possible to do the same for x264?
ChronoCross
25th February 2006, 01:25
MINGW32 Compilation failed. 1.1 final compiles fine. does the new 1.2 cvs require anything more than 1.1 in terms of libraries?
EDIT I did fresh checkout a minute ago and it seems as though it's working. although it throws alot of warnings. but it does indeed compile.
the vfw doesn't however.
sysKin
25th February 2006, 02:28
Is this technology applicable to AVC? Is it possible to do the same for x264?
Yes it is, but note it's not as efficient (in terms of speed, scaling on many CPUs etc) as slices used in x264.
Then again I could have used slices for b-frames and didn't, because I wanted identical file ;)
Kostarum Rex Persia
25th February 2006, 03:01
sysKin, but where is the download link for official Multithreaded XviD build?
ChronoCross
25th February 2006, 03:04
sysKin, but where is the download link for official Multithreaded XviD build?
you have to compile it yourself.
sysKin
25th February 2006, 03:21
sysKin, but where is the download link for official Multithreaded XviD build?
Actually I have no reason *not* to make a build of some sorts. But don't count on installer or other fancy stuff. Koepi promised to make a build soon, too.
http://syskin.is.dreaming.org/xvid-smp.zip
sysKin
25th February 2006, 04:46
Anyone knows a win32 equivalent of pthread_num_processors_np() ?
If anyone is interested, I just invented this:
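#include <windows.h>
static int pthread_num_processors_np(void)
{
    DWORD_PTR p_aff, s_aff;   /* the API expects DWORD_PTR, not unsigned int */
    int r = 0;
    /* Fall back to one processor if the affinity query fails. */
    if (!GetProcessAffinityMask(GetCurrentProcess(), &p_aff, &s_aff))
        return 1;
    /* Count the set bits: one per CPU this process may run on. */
    for (; p_aff != 0; p_aff >>= 1)
        r += (int)(p_aff & 1);
    return r;
}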
Takes affinity into account and generally does what I need :)
suxen_drol
25th February 2006, 05:45
Actually I just removed this idea of mine, to run encoding as yet another thread. So now number of threads = number of cpus always, after all.
In theory XVID_GBL_INFO returns that, just look at the code. It's old stuff though.
Anyone knows a win32 equivalent of pthread_num_processors_np() ?
yep! have committed change to cvs.
Since threads = 0 is pretty much identical to threads = 1 now, I can make = 0 an autodetection.
ok, but be wary of other platforms for which we don't yet have a syscall to query the number of processors.
-- pete
Yong
25th February 2006, 07:00
MINGW32 Compilation failed. 1.1 final compiles fine. does the new 1.2 cvs require anything more than 1.1 in terms of libraries?
EDIT I did fresh checkout a minute ago and it seems as though it's working. although it throws alot of warnings. but it does indeed compile.
the vfw doesn't however.
Hmm, weird,
I can compile 1.2 vfw / encraw / xvidcore.dll without problems, but...
yup, it throws a lot of warnings: " ...motion_smp.h. no new line at the end of file..."
Starting from 1.2, I can no longer compile the xvid dshow decoder :(
gcc throws me a "no rules to make debug.obj, needed by xvid.ax"... O.o
Don't know what the hell is going on...
ChronoCross
25th February 2006, 07:19
Yong. My problems were caused by an incomplete commit. all the problems are fixed as of 8pm 02/24.
Cyberace
25th February 2006, 12:19
Feature request; can we please get a multithreaded decoder (option) as well as the multithreaded encoder?
hajj_3
25th February 2006, 12:48
is xvid 1.2 final gunna be out soon?
sysKin
25th February 2006, 14:26
Feature request; can we please get a multithreaded decoder (option) as well as the multithreaded encoder?
I have absolutely no idea which part of decoder can be multithreaded and how (complete rewrite with added pipeline doesn't count).
Guilllo
25th February 2006, 14:42
Hi, it's great news !!
I'm using Mencoder under linux to encode in xvid. How can I set the number of thread to use ?
Thanks
Koepi
25th February 2006, 14:49
Ok, as promised my build is up now, too. Fetch it as long as it's still warm ;-)
Changelog to XviD-1.1:
- {core}: New experimental SMP support.
- {core}: Trellis improvements (according to sysKin).
- On uniprocessor machines set number of threads to 0!
Cheers
Koepi
lantern
25th February 2006, 16:33
I seem to be having a problem compiling the latest CVS build. Here is the error I get when I go to make. I am using mingw & msys.
I ran ./bootstrap.sh then ./configure and finally make. Am I missing any steps?
Thanks!
http://img123.imageshack.us/img123/6074/xvidcompile4xm.jpg (http://imageshack.us)
sysKin
25th February 2006, 16:51
I seem to be having a problem compiling the latest CVS build. Here is the error I get when I go to make. I am using mingw & msys.
Update your cvs, it's a mistake I made ~24 hours ago and fixed ~12 hours ago.
hajj_3
25th February 2006, 21:45
so when is xvid 1.2 final due to be released?
is there just gunna be SMP multithreaded support or will there be any quality improvements?
lantern
25th February 2006, 22:01
I can't seem to find the latest cvs build. I downloaded from here (http://downloads.xvid.org/downloads/xvid_latest.tar.gz) and it was not the latest. I used WinCVS with this command, cvs -d:pserver:anonymous@cvs.xvid.org:/xvid co xvidcore and it is still an old compile. I have tried Koepi's site but can't get the latest (www.koepi.org/xvidcore-1.2.-127.zip). The link doesn't seem to be working.
Thanks for all your help.
ChronoCross
25th February 2006, 22:20
I can't seem to find the latest cvs build. I downloaded from here (http://downloads.xvid.org/downloads/xvid_latest.tar.gz) and it was not the latest. I used WinCVS with this command, cvs -d:pserver:anonymous@cvs.xvid.org:/xvid co xvidcore and it is still an old compile. I have tried Koepi's site but can't get the latest (www.koepi.org/xvidcore-1.2.-127.zip). The link doesn't seem to be working.
Thanks for all your help.
the cvs from where you were downloading is the latest. koepi's is not a .zip. go to his site and in the downloads section there is an installer.
SeeMoreDigital
25th February 2006, 22:35
Here was a direct link to Koepi's latest Koepi's Site (http://www.koepi.org/) download :)
Cheers
[edit by Koepi]: tses ;-P
lantern
25th February 2006, 23:29
Thanks SMD, but I was looking for the latest source to build my own.
I did have the right cvs, but it was showing up as being created 28Dec2005. I needed to reboot after I put it in the system32 directory.
Koepi
26th February 2006, 01:05
Of course the latest source that I used is on my site as well, look out for "xvidcore-1.2.-127-250206.zip (861kb)" (scroll a little down). If you still see "xvidcore-1.2.-127.zip" you might wanna hit the refresh button while holding down your shift key. This tells your proxy to fetch the site fresh from the net.
Cheers
Koepi
sysKin
26th February 2006, 05:24
so when is xvid 1.2 final due to be released?
Probably never. We hardly had enough willpower to release 1.1, and today is the last day of my summer holidays. I expect 1.2.-127 to remain the "latest unstable build" forever and ever ;)
It already has multiple improvements over 1.1, such as better trellis, some packed bitstream fixes, new VfW config, HVS plugin support (still no plugins though) and some other stuff.
LordIntruder
26th February 2006, 09:49
Hi,
I've been making some tests comparing first SMP build released many days ago by Syskin in the other discussion thread with this new one.
Those tests have been made using the exact same script and XviD options of course.
Encoded using 2 threads.
AMD 4200+ , Avisynth 2.5.6 , VDM 1.5.10.2
First smp build:
1st pass : 56min (41,5fps)
2nd pass : 2h 36min (14,9fps)
New build:
1st pass : 50min (46,5fps)
2nd pass : 2h 17min (17fps)
Now the new build without GMC, as Syskin said GMC is not multi-threaded. Just to see how much faster it is.
1st pass : 50min
2nd pass : 1h 59min (19,5fps)
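The relative gains implied by the frame rates above can be checked with a few lines of C. This is only a sketch of the arithmetic, and gain_percent is a name made up for illustration.

```c
#include <assert.h>

/* Relative speed gain, in percent, when going from old_fps to new_fps. */
static double gain_percent(double old_fps, double new_fps)
{
    assert(old_fps > 0.0);
    return (new_fps / old_fps - 1.0) * 100.0;
}
```

With the figures posted here, gain_percent(41.5, 46.5) is roughly 12 for the first pass, gain_percent(14.9, 17.0) is roughly 14 for the second pass, and gain_percent(17.0, 19.5) is roughly 15 for dropping GMC on the new build.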
My CPU usage is around 85% (fluctuating all the time around 80% to 90% and no other program running) but as you said Syskin it is essential to make tests on speed. However there is still CPU power potential that is not used.
Thanks for the work Syskin. :)
I tested 3 and 4 threads just to see if it would decrease my encoding time by increasing my CPU usage, but it takes longer, so forget about that. ;)
Is it impossible to multi-thread GMC?
You say you are stopping 1.2 development, but is there any plan for somebody else to continue your work? There are millions of us waiting :D :D
Anyway again thanks you very much. :thanks:
Kostek80
26th February 2006, 11:53
static unsigned CalculateNumberOfThreads(unsigned numThreads) {
    if(numThreads == 0) {
        SYSTEM_INFO systemInfo;
        ::GetSystemInfo(&systemInfo);
        // microsoft recommended for smp system
        numThreads = systemInfo.dwNumberOfProcessors * 2;
    }
    return numThreads;
}
I hope it will be helpful. I use this formula for scalable systems.
xDrJx
26th February 2006, 12:15
Koepi site requires username and PW:confused:
Did I miss something?
Nevermind, works again!!
Sorry for useless post!!
dimitrik
26th February 2006, 13:26
Of course the latest source that I used is on my site as well, look out for "xvidcore-1.2.-127-250206.zip (861kb)"
Cheers
Koepi
Hi Koepi,
Pardon me for being out of date here:o , but is this compile optimised for any kind of processor? IIRC, some time ago you used to host AMD optimised builds while Celtic Druid did Intel optimised builds.
Many thanks for all you're doing.:)
humax
26th February 2006, 13:37
Hi
i have a Pentium Hyperthreading .
No Dual Core like 820D or something but Hyperthreading simulates 2 Cpus.
How many threads shall i choose. 0 or 1 ???
thx for help
xDrJx
26th February 2006, 13:48
I also have a HT shitbox and for me the number 2 works best. Test it dude.
cheers
708145
26th February 2006, 13:50
static unsigned CalculateNumberOfThreads(unsigned numThreads) {
    if(numThreads == 0) {
        SYSTEM_INFO systemInfo;
        ::GetSystemInfo(&systemInfo);
        // microsoft recommended for smp system
        numThreads = systemInfo.dwNumberOfProcessors * 2;
    }
    return numThreads;
}
I hope it will be helpful. I use this formula for scalable systems.
Never use a single formula for every piece of software, because the optimal #threads depends on the FU/cache/memory usage of the tasks.
The best solution is to benchmark with different #thread settings at install time.
E.g. for my bus simulator, real_cores + 1/4 * hype_cores was optimal from 2 to 4 CPUs.
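The "benchmark, then pick" idea boils down to running the same job at several thread settings and keeping the fastest. A minimal sketch of the selection step is below, assuming the measurements have already been taken. The struct and function names are hypothetical and nothing like this exists in XviD.

```c
#include <assert.h>
#include <stddef.h>

/* One benchmark result: a thread setting and the speed measured with it. */
struct trial {
    int threads;
    double fps;
};

/* Return the thread count of the fastest trial. On a tie the earlier
 * entry wins, so list trials in ascending thread order to prefer
 * fewer threads. */
static int best_thread_count(const struct trial *trials, size_t n)
{
    size_t i, best = 0;

    assert(trials != NULL && n > 0);
    for (i = 1; i < n; i++)
        if (trials[i].fps > trials[best].fps)
            best = i;
    return trials[best].threads;
}
```

Comparing measured speed rather than CPU usage matches the advice in the first post, and it sidesteps the question of how Hyperthreaded cores should be weighted.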
hajj_3
26th February 2006, 14:20
Probably never. We hardly had enough willpower to release 1.1, and today is the last day of my summer holidays. I expect 1.2.-127 to remain the "latest unstable build" forever and ever ;)
It already has multiple improvements over 1.1, such as better trellis, some packed bitstream fixes, new VfW config, HVS plugin support (still no plugins though) and some other stuff.
that SUCKS, why is the official not going to be released? you should make an official of this, then end development, h.264 aint gunna be mainstream for a while, scene groups arent even using xvid 1.1 yet, nevermind h264. it would be a wasted effort not making an official version of this!
Cyberace
26th February 2006, 14:23
so when is xvid 1.2 final due to be released?
Probably never. We hardly had enough willpower to release 1.1. I expect 1.2.-127 to remain the "latest unstable build" forever and ever
How about a news post on the xvid.org front page to inform everyone about that? :(
PS! Probably been asked before, but why is xvid.org forum registration closed? :confused:
xDrJx
26th February 2006, 14:24
...scene groups arent even using xvid 1.1 yet...
:D complain
Koepi
26th February 2006, 16:37
Pardon me for being out of date here:o , but is this compile optimised for any kind of processor?
The usual build should work fast on any kind of processor (icl7 compile with some optimisations). The unstable-build is optimized for pIII and athlon/duron upwards (uses iSSE even if you deselect any processor auto detection) with icl7.1 -- I'm sorry for the k6- and pII-users, but I think it's time to assume that everyone should be using something more recent than a 7 year old processor in the meantime.
@all:
XviD isn't dead. Development is going on, but more slowly again. SysKin hasn't got any more time, but other people sometimes contribute as well. In the foreseeable future XviD will even get AVC support; if that code hits the public, development will get faster again, I think.
Cheers
Koepi
shon3i
26th February 2006, 16:52
In the foreseeable future XviD will even get AVC support
That would be cool!
split710
26th February 2006, 18:35
just to be sure....
I have a pentium 4 3gh, with Hyperthreading support, what i have to set exactly?
shpitz
27th February 2006, 03:48
hello all,
i'm trying to encode a sample of a 720p from a DVB source.
i'm using xvid 1.2.127 with smp alpha from koepi's site, vdub 1.6.11, dgdecode 1.4.6, and avisynth 2.5.6.
i'm using the following script:
LoadPlugin("D:\TBS\Filters\DGDecode.dll")
mpeg2source("F:\720p_clip.d2v")
SelectEven()
Crop(170,4,-160,-4)
BicubicResize(640,480,0,0.75)
my spec is dual xeon 3ghz with 1gb ram. OS is xp pro sp2.
setting thread number from 0 to 4 results in cpu usage around 40%, all resulting in about the same encoding time (1min, about 30fps avg).
when i try 5 threads or above cpu usage goes up to 85% and above, but encoding speeds gets worse and worse.
i tried installing xvid 1.0.3, and it was faster compared to 1.2 smp. i tried installing xvid 1.1.0 beta 2 and it was faster compared to 1.2 smp.
am i doing something wrong?
could be my dgdecode settings are wrong? which idct option should i use?
the cpus have sse2 and sse3.
i will also run a test on a regular non-HD clip.
http://img528.imageshack.us/img528/4784/xvid011sr.th.jpg (http://img528.imageshack.us/my.php?image=xvid011sr.jpg)
http://img528.imageshack.us/img528/2826/xvid029xp.th.jpg (http://img528.imageshack.us/my.php?image=xvid029xp.jpg)
http://img528.imageshack.us/img528/350/xvid032ei.th.jpg (http://img528.imageshack.us/my.php?image=xvid032ei.jpg)
http://img528.imageshack.us/img528/4573/xvidusage3ku.th.jpg (http://img528.imageshack.us/my.php?image=xvidusage3ku.jpg)
foxyshadis
27th February 2006, 04:08
The codec is being starved by avisynth, probably. Try using the MT build of avisynth 2.5.6 and SetMTMode(2) before mpeg2source.
shpitz
27th February 2006, 04:17
wow, that was a quick reply, tnx m8.
i will try the MT version of avisynth from mt v0.5 .
i will report back in a few minutes.
shpitz
27th February 2006, 04:39
ok, i've replaced the avs dll with the MT version.
if i put SetMTMode(2) in the script, the encode speed drops considerably.
if i change it to SetMTMode(2,2) it's fine.
i tried with xvid threads set to 0. 56 & 59 seconds for pass1 and pass2.
with xvid threads set to 2 it works with the same exact speed.
if i try SetMTMode(2,4) it drags ass as well.
so it seems like all this smp'ability just makes encoding slower on my pc ;-((
is there anything else u guys can think of that might be misconfiged or that i'm doing wrong?
shpitz
27th February 2006, 04:40
is it maybe because i'm actually not using any filters on this clip at all? i only crop and resize...
sysKin
27th February 2006, 06:24
I am very much interested in Xeon or dual core P4s. In theory, their primary design fault (no connection between cores other than FSB) can be a major problem. It might also be not a problem at all.
However, I don't like your CPU usage graphs... one virtual core is not being used at all, and this is not something windows does, when it gets the choice..... something might be funny.
Either that, or Xeons are really bad at this (they might be, really might be...)
shpitz
27th February 2006, 07:06
yeah, it is well known that xeons are chokers when it comes to memory-cpu bandwidth. opterons and dual-cores should do a better job in that respect.
i might disable HT and try again, from my tests it appears that 2 threads is optimal as with 4 it really chokes.
i will also disable PAT and see if it makes a difference.
expect more tomorrow ;-)
Revgen
27th February 2006, 07:14
@shpitz
The SetMTmode filter needs to be set before your source like this:
SetMTmode(2,2)
mpeg2source("yourdrive:\yourd2v.d2v")
Yourfilter()
Also, are you actually using a real dual-core or dual CPU solution? HT will not gain as much as the real thing.
HT in some cases can even slow down performance.
sysKin
27th February 2006, 08:18
yeah, it is well known that xeons are chokers when it comes to memory-cpu bandwidth.
Yes, and with an application that works on two cpus, *additional* data transfers happen - between the cpus. They add up to the memory bandwidth because they all use FSB as the data pipe.
This code doesn't use a lot of data to communicate, but it does depend on this communication happening fast. If FSB is delaying the information, one of the threads suddenly has nothing to do (because it's not informed how much another thread has done) and must wait...
Actually, I'll make some tests how this code's speed depends on Hypertransport speed. Good idea :)
Yong
27th February 2006, 08:18
I failed to compile the latest cvs code (libxvidcore)...
Here is the part of msys output:
D: =build
C: ./decoder.c
In file included from ../../src/bitstream/../motion/motion_smp.h:34,
from ../../src/bitstream/../encoder.h:158,
from ../../src/bitstream/bitstream.h:31,
from ../../src/decoder.c:40:
c:/msys/mingw/bin/../lib/gcc/mingw32/3.4.5/../../../../include/winbase.h:552: error: syntax error before "DWORD"
c:/msys/mingw/bin/../lib/gcc/mingw32/3.4.5/../../../../include/winbase.h:556: error: syntax error before "DWORD"
c:/msys/mingw/bin/../lib/gcc/mingw32/3.4.5/../../../../include/winbase.h:558: error: syntax error before "ftLastAccessTime"
c:/msys/mingw/bin/../lib/gcc/mingw32/3.4.5/../../../../include/winbase.h:559: error: syntax error before "ftLastWriteTime"
[...]
a lot of errors, "make distclean && make -s" doesn't help...
But dshow decoder compiling works again, thx xvid devs :)
EDIT: maybe there's something wrong with motion_smp.h,
i reverted it to an older revision, then compiling works again :p
shpitz
27th February 2006, 16:26
@shpitz
The SetMTmode filter needs to be set before your source like this:
SetMTmode(2,2)
mpeg2source("yourdrive:\yourd2v.d2v")
Yourfilter()
Also, are you actually using a real dual-core or dual CPU solution? HT will not gain as much as the real thing.
HT in some cases can even slow down performance.
dual xeon 3ghz, and yes, i've added the setmtmode before mpeg2source.
i will need to thoroughly investigate it, the computer seems to choke completely in terms of smp encoding.
Zep
27th February 2006, 23:39
dual xeon 3ghz, and yes, i've added the setmtmode before mpeg2source.
i will need to thoroughly investigate it, the computer seems to choke completely in terms of smp encoding.
It depends on how low-level sysKin went. Dual CPU sucks when the work is split at a low level, say each CPU working on part of the same macroblock, versus each CPU working on half of a video frame and never accessing the same data, so there's no need to sync much or sync fast and each CPU is always crunching with very few wait states. Splitting at a higher level also makes scaling much better, like when throwing a render farm at a project.
woah!
28th February 2006, 05:14
i am getting nearly double the framerate in encoding by using Vdub with SetMTMode(2) and the latest xvidcore.dll
it sits at about 85% cpu and from 10fps for 1 thread to 18-19fps with 2 threads:
going from 1920x1088i 30fps to 720x304 23.976fps :
SetMTMode(2)
mpeg2source("G:\INDEXED.d2v")
crop(0,144,-0,-148)
Telecide(order=1,guide=1,post=2).Decimate(mode=0)
#TomsMoComp(1,30,1)
#FDecimate(rate=23.976,threshold=3.5)
#Kerneldeint(order=1,sharp=true)
#Kernelbob(order=1,sharp=true) # double framerate 60fps
#ColorMatrix("Rec.709->Rec.601",mmx=true,hints=false)
#LanczosResize(848,480)
#BicubicResize(704, 400, 0.33, 0.33, 8, 8, 1904, 1072)
#bicubicResize(704,400)
LanczosResize(720,304)
asharp(2,4,hqbf=true)
#trim(61000, 61720)
thx for this nice upgrade which i could never have got with any single-core cpu upgrade.
http://images.dr3vil.com/files/default/2core.jpg
heres a clip of the output file:
http://s19.yousendit.com/d.aspx?id=051E4GRBM6JOI18W03XLOVCLQH
shpitz
28th February 2006, 19:37
woah!,
can you post a link to the xvid version you are using?
also, which avisynth version are you using?
did you install anything else (such as MT 0.5 filter) ?
can you also post your xvid encoding settings you used and which matrix? the encode looks great.
AmazingRando
1st March 2006, 02:54
:D I just wanted to thank you guys for your effort in making XviD SMP capable. I've used XviD for years now on hundreds of encodes and it's been great.
Last fall I reluctantly decided to move to Divx 6 since XviD wasn't SMP capable. I found that I needed to encode at "extreme" or "insane" quality to get the same quality as with XviD. But now I'm thrilled to be using XviD once again :D I did several test encodes last night to compare quality and performance between XviD 1.2 and Divx 6.1.1. On a dual-core 2.8Ghz Pentium D I was able to do a two-pass encode of a 44 min DVD ripped TV episode (720x480x24fps) in 47 mins including audio encoding and muxing. Compare that to 2 hours 2 mins for Divx 6.1.1 on insane quality. I judge the quality to be comparable between the two. BTW, the Pentium D 820 doesn't support HT, just two physical processors (cores). I have it set to 2 threads and get about 85-90% CPU utilization.
So, I'm one happy guy :D . Keep up the great work.
AR
PS. Future AVC support would be great!
Koepi
1st March 2006, 07:00
Igor Levicki asked me to post this (which I'll gladly do as it might be helpful):
In case you folks still haven't figured out the way to detect number of CPUs in the system let me help a bit:
#include <stdio.h>
typedef unsigned long u32;
/* Logical processors per package: CPUID leaf 1, EBX bits 23..16. */
u32 GetLogicalCPUCount(void)
{
    u32 logical_cores = 0;
    __asm {
        xor eax, eax
        cpuid
        cmp eax, 1
        jb no_logical_cores
        mov eax, 1
        cpuid
        shr ebx, 16
        and ebx, 0xFF
        mov dword ptr [logical_cores], ebx
    no_logical_cores:
    }
    return logical_cores;
}
/* Cores per package: CPUID leaf 4 (ECX = 0), EAX bits 31..26, plus one. */
u32 GetPhysicalCPUCount(void)
{
    u32 physical_cores = 1;
    __asm {
        xor eax, eax
        cpuid
        cmp eax, 4
        jb no_physical_cores
        mov eax, 4
        xor ecx, ecx
        cpuid
        shr eax, 26
        and eax, 0x3F   ; the field is 6 bits wide
        add eax, 1
        mov dword ptr [physical_cores], eax
    no_physical_cores:
    }
    return physical_cores;
}
/* HTT feature flag: CPUID leaf 1, EDX bit 28. */
u32 HasHyperThreading(void)
{
    u32 has_htt = 0;
    __asm {
        xor eax, eax
        cpuid
        cmp eax, 1
        jb no_htt
        mov eax, 1
        cpuid
        test edx, 0x10000000
        jz no_htt
        mov dword ptr [has_htt], 1
    no_htt:
    }
    return has_htt;
}
int main(int argc, char* argv[])
{
    u32 pcores, lcores, htt;
    lcores = GetLogicalCPUCount();
    pcores = GetPhysicalCPUCount();
    htt = HasHyperThreading();
    /* %lu matches unsigned long */
    printf("Number of logical cores = %lu\n", lcores);
    printf("Number of physical cores = %lu\n", pcores);
    if (htt && ((lcores / pcores) > 1)) {
        printf("CPU has HyperThreading = YES\n");
    } else if (htt && (lcores == 1)) {
        printf("CPU has HyperThreading = YES, DISABLED\n");
    }
    return 0;
}
If you have any doubts, check AP-485 (http://developer.intel.com/design/xeon/applnots/241618.htm) document. Same should apply to AMD.
Take note that logical cpu count == physical cpu count on current dual-core CPUs and that the presence of HyperThreading needs to be detected in a slightly different way than before -- having HTT bit set and 2 logical cores doesn't mean you have HTT. Pentium D 955 reports 4 logical and 2 physical cores. I understand that physical cores are counted as logical for compatibility reasons.
Perhaps it wouldn't hurt to add more checks (CPU family/model, etc), but I believe that this code should be safe to execute even on older CPUs, at least those which have the CPUID instruction, so you might at least want to add a check for that.
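For anyone who cannot use MSVC inline assembly, the bit-field decoding can be separated from the CPUID instruction itself. The sketch below takes raw register values as arguments, however they were obtained, and applies the same shifts as the listing above. The function names are invented for illustration, and the core-count field is treated as six bits wide (EAX bits 31..26 of leaf 4), which is the width Intel documents.

```c
#include <assert.h>
#include <stdint.h>

/* CPUID leaf 1, EBX bits 23..16: logical processors per package. */
static unsigned logical_count_from_ebx(uint32_t ebx)
{
    return (ebx >> 16) & 0xFF;
}

/* CPUID leaf 4 (ECX = 0), EAX bits 31..26: cores per package minus one. */
static unsigned core_count_from_eax(uint32_t eax)
{
    return ((eax >> 26) & 0x3F) + 1;
}

/* CPUID leaf 1, EDX bit 28: the HTT feature flag. */
static int htt_flag_from_edx(uint32_t edx)
{
    return (int)((edx >> 28) & 1);
}

/* Hyper-Threading counts as active only when the HTT flag is set and
 * there are more logical processors than cores. */
static int hyperthreading_active(uint32_t ebx1, uint32_t eax4, uint32_t edx1)
{
    unsigned cores = core_count_from_eax(eax4);

    assert(cores >= 1);
    return htt_flag_from_edx(edx1) &&
           logical_count_from_ebx(ebx1) / cores > 1;
}
```

Keeping the decoding in ordinary C makes it easy to check against register dumps from different CPUs without running on that hardware.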
I would like to comment on FSB and bandwidth issues, I have done some threading of median filtering on my Pentium D 930 recently and I managed to get 2x speedup. With careful threading I bet you can do it too -- just make the threads work independently. I did it by making a queue from which two threads dequeue "packets" for processing which are queued by the third, main thread.
squid_80
1st March 2006, 08:56
Now that's what I was talking about.;) Would fit nicely into cpuid.asm, no?
seehowyouare
1st March 2006, 15:15
On a dual-core 2.8Ghz Pentium D I was able to do a two-pass encode of a 44 min DVD ripped TV episode (720x480x24fps) in 47 mins including audio encoding and muxing. Compare that to 2 hours 2 mins for Divx 6.1.1 on insane quality.
Sorry to be a party pooper but my x2 4400 Windows x64 system doesn't agree, regardless of number of threads I set :eek:
And I tried using GK, AutoGK and RealAnime. Fastest is DivX 6.11 at 75fps encoding speed, then XviD-1.1.0-30122005.exe and XviD-1.2.-127-25022006.exe is the slowest.
FWIW - the x264 standard codec uses only 1 core at 65% CPU @ 8fps. woot ! Yet to try Sharktooth's build.
Maybe my testing is wrong so if someone has an "agreed" test setup/procedure/script I can follow to fairly test the 1 thread vs 2 thread codec speeds on my PC I'd appreciate it.
Thanks
shpitz
1st March 2006, 15:31
anyone that experiences a speed-up, can you please post the exact versions (direct links would help a lot) of the components you are using (codec, avisynth, etc...).
seehowyouare's experience is kinda encouraging so-to-speak, maybe it's not a hardware issue on my end after all...
i did disable PAT but speed was not affected in almost any way, next thing i will try is disable HT in bios.
expect more to come...
celtic_druid
1st March 2006, 15:45
Well if you want an alternative to try:
http://ffdshow.faireal.net/mirror/XviD/xvid.cvs.head.2006.02.28.7z
http://ffdshow.faireal.net/mirror/XviD/XviD.cvs.head.exe
Same thing just with/without installer.
AmazingRando
1st March 2006, 20:01
Maybe my testing is wrong so if someone has an "agreed" test setup/procedure/script I can follow to fairly test the 1 thread vs 2 thread codec speeds on my PC I'd appreciate it.
:confused: That's strange. Is it me or does it seem like the people who are having trouble are AMD X2 users? I'll post my software, versions, etc. tonight. I have a dedicated system for encoding so it's a very clean install. Moreover, I'm using either GK codec pack 1.9 and GK rip pack .35 pack2. The only changes are Koepi's latest xvid 1.2 binary and then the Divx stuff. For Divx I'm using Divx Pro 6.1.1 and Dr. Divx OSS 2.0 beta 7 with the current DrFFMPEG. Pretty stock really. More later...
AR
shpitz
1st March 2006, 20:21
:confused: That's strange. Is it me or does it seem like the people who are having trouble are AMD X2 users?
not only, i'm a xeon (intel) user...
seehowyouare
1st March 2006, 22:17
FWIW - the x264 standard codec uses only 1 core at 65% CPU @ 8fps. woot ! Yet to try Sharktooth's build.
I tried Sharktooth's x264 build set at 4 threads and encoding speed in VirtualDubMod increased from 8fps using the standard single thread x264 codec to 15fps on identical encoding job. CPU usage is much higher on both cores so I know multi thread stuff does work on my system.
Screenshots of Task Manager using single thread (http://img312.imageshack.us/img312/488/x264stdr4088gh.png) x264-Std_r408exe and multi thread (http://img401.imageshack.us/img401/8928/x264445installexe7wv.png)x264-445-install.exe
Yesterday I got a dual cpu box with 3 gig of ram dropped on my desk at work. Each cpu is a dual core opteron 280. Gives me a good excuse to do some XviD multithreaded encoding :)
I'm not really noticing any speed up. But at present i've got a few tasks in the background, so I need to clean things and do some proper testing.
I'll let you know.
-Nic
dimzon
2nd March 2006, 12:49
Yesterday I got a dual cpu box with 3 gig of ram dropped on my desk at work. Each cpu is a dual core opteron 280. Gives me a good excuse to do some XviD multithreaded encoding :)
I'm not really noticing any speed up. But at present i've got a few tasks in the background, so I need to clean things and do some proper testing.
I'll let you know.
-Nic
Can You try elder too?
http://forum.doom9.org/showthread.php?t=100766
Thanx!
Doom9
2nd March 2006, 13:15
one thing not to be forgotten (and I think I'm repeating myself here already) is that even the single threaded XviD build makes use of SMP. Not actually the codec, but since almost all people use XviD from within Virtualdub, the multithreaded architecture of VirtualDub comes to the rescue. You have one thread reading the input (and thus decoding the source), and one doing the actual encoding.. and so the thread scheduler puts the encoding thread on one core, and the decoding thread on another, which in turn means the encoder will max out one core, and the decoding thread will run as fast as the encoder thread can process the data, resulting in not such a shabby speedup. If you really want to see the raw difference, encode with a singlethreaded and a multithreaded encraw build..
At the settings used for the codec comparison, the CPU usage of XviD was in the +70% (no other tasks running but whatever windows is running) which is a lot closer to the cpu usage I'm getting when running a commandline smp capable encoder (x264, elecard, nerodigital).
If you really want to see the difference even in VDub, you can temporarily switch off one core (add /onecpu to boot.ini in the appropriate line, then reboot), make a test (nbthreads set to 0), remove the switch, reboot, make the test again with nbthreads set to 2, and compare the encoding time.
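A rough way to picture the effect described above is a two-stage model: on one core, decoding and encoding of each frame happen back to back, while on two cores the stages overlap and the slower one sets the pace. The sketch below uses made-up stage speeds purely for illustration; they are assumptions, not measurements from any real encode.

```python
def serial_fps(decode_fps: float, encode_fps: float) -> float:
    """One core: every frame is decoded and then encoded, back to back."""
    return 1.0 / (1.0 / decode_fps + 1.0 / encode_fps)


def pipelined_fps(decode_fps: float, encode_fps: float) -> float:
    """Two cores: the stages overlap, so the slower stage limits throughput."""
    return min(decode_fps, encode_fps)


# Hypothetical stage speeds, chosen only to illustrate the model.
decode, encode = 120.0, 40.0
print(f"one core : {serial_fps(decode, encode):.1f} fps")
print(f"two cores: {pipelined_fps(decode, encode):.1f} fps")
```

In this simplified model the gain from the second core shrinks as the decoder gets cheaper relative to the encoder, which is why a multithreaded encoder matters most when the source is easy to decode.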
celtic_druid
2nd March 2006, 13:41
mencoder just got updated to support threads with XviD now too.
seehowyouare
3rd March 2006, 08:36
and I think I'm repeating myself here already) is that even the single threaded XviD build makes use of SMP.
I am sure you are because of people like me :-)
Ok, I'm an encoding noob and just got my first dual core CPU.
Can someone confirm or deny if I got this threading/dual core right in my head.
1) Decoding only uses 1 thread
2) Encoding can use multiple threads (dep. on build etc)
3) VDM uses threads to separate encoding and decoding tasks
4) VDM can allocate threads to CPUs.
5) Having 2 x CPU or 2 x dual core means VDM can decode on 1 x CPU and encode on the 2nd CPU.
6) Setting 2 x XviD threads on my dual core PC will make no difference to speed as VDM controls thread and CPU allocation.
PS - I will receive a dual core 2.8 Intel to keep the x2 4400 company next week.
Ice =A=
3rd March 2006, 12:11
@seehowyouare:
I can at least help in some regards:
1. No, luckily there are also more and more SMP DEcoders available, like Nero, DivX6.1.1 or Quicktime 7 (which is nevertheless very slow on PCs), and some more coming (like CoreAVC).
2. yes
6. There is a difference between the two: Most encoding programs can only put whole tasks on separate threads, like decoding of source material and encoding, or audio and video. Since the encoding itself needs most of the computing time, the biggest gains come from multithreaded encoding, which is where the SMP optimized XviD comes in handy.
There is only one encoding program I know of which can really designate separate threads to a non SMP XviD, namely ELDER...
sysKin
3rd March 2006, 18:03
1. No, luckily there are also more and more SMP DEcoders available, like Nero, DivX6.1.1 or Quicktime 7 (which is nevertheless very slow on PCs), and some more coming (like CoreAVC).
I consider that VERY unlikely. In fact, I consider that almost impossible. Other than postprocessing, I can almost guarantee that there's no multithreaded ASP decoders at all.
1) Decoding only uses 1 thread
2) Encoding can use multiple threads (dep. on build etc)
3) VDM uses threads to separate encoding and decoding tasks
Yes, yes, yes.
4) VDM can allocate threads to CPUs.
No, windows kernel does that.
5) Having 2 x CPU or 2 x dual core means VDM can decode on 1 x CPU and encode on the 2nd CPU.
Yup.
6) Setting 2 x XviD threads on my dual core PC will make no difference to speed as VDM controls thread and CPU allocation.
No, because encoding and source decoding take different processing power. The more threads there are, the more balance between CPUs.
HOWEVER if filtering and encoding take similar power, such as more complex avs script combined with fast-1st-pass, then indeed just one encoding thread seems to be faster.
dimzon
3rd March 2006, 18:42
I consider that VERY unlikely. In fact, I consider that almost impossible. Other than postprocessing, I can almost guarantee that there's no multithreaded ASP decoders at all.
theoretically it's possible to split source bitstream by I-frame:
IpppIpbbp and decode each piece in separate thread
unfortunately this scheme is unusable for normal playback but can be used for speedup during transcoding (on more than 2 CPU machines with huge RAM available)
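As a toy illustration of the splitting idea, the sketch below cuts a frame-type string at every I-frame so that each piece could be handed to its own decoding thread. It only works on the letters, not on a real bitstream, and it assumes each piece is self-contained (no frame refers across an I-frame boundary), which real streams do not always guarantee. The function name is hypothetical.

```python
def split_at_keyframes(frame_types: str) -> list[str]:
    """Split a frame-type string into pieces that each start at an I-frame."""
    pieces: list[str] = []
    for frame_type in frame_types:
        if frame_type == "I" or not pieces:
            pieces.append(frame_type)   # start a new independent piece
        else:
            pieces[-1] += frame_type    # p/b frames stay with their I-frame
    return pieces


print(split_at_keyframes("IpppIpbbp"))  # ['Ippp', 'Ipbbp']
```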
dimzon
3rd March 2006, 18:53
Or, maybe, it's possible to decode B-frames @ separate thread...
Ice =A=
3rd March 2006, 20:18
I consider that VERY unlikely. In fact, I consider that almost impossible. Other than postprocessing, I can almost guarantee that there's no multithreaded ASP decoders at all.
Now that I'm thinking of it technically, it really seems difficult. But I know that with Quicktime there is a big speed difference between one and two cores, and regarding Nero, I at least know it's using (whatever that means) more than one core.
Maybe they can do different threads for luminance and chroma channels or compute different parts of the image in parallel, no idea... :o
foxyshadis
3rd March 2006, 20:31
I know Elecard/Nero and CoreAVC at least do Cabac decoding on a separate thread, that's a huge processor sink for dvd video and up and the most obvious and easiest speedup. I'm sure they both have a smart scheduler to go with it that tries to partition the work to be done across cpus as much as possible.
sysKin
4th March 2006, 02:53
I know Elecard/Nero and CoreAVC at least do Cabac decoding on a separate thread, that's a huge processor sink for dvd video and up and the most obvious and easiest speedup.
Yeah this is why I said ASP decoder. For AVC, there are at least several possibilities to either pipeline (like cabac->everything else ->deblocking) or work in parallel (deblocking itself, or decode slices in parallel if file was encoded with slices).
But ASP... yeah I suppose you can reinvent entire decoder to decode entire frames ahead (either on GOP boundary or b-frames) but that sounds weird. Within one frame, this sounds impossible.
Postprocessing can be split into slices. In fact I might do that if you say that's useful.
As for quicktime, are you sure it's not audio decoder running on another CPU that makes this difference?
lantern
5th March 2006, 21:18
It was built with mingw/msys. I have left the thread count to 0 or 1 or 2. I have a P4 w/Hyperthreading.
Crash report from VDub:
VirtualDub-MPEG2 crash report -- build 23843 (release)
--------------------------------------
Disassembly:
0d0495e0: ffc7 inc edi
0d0495e2: 43 inc ebx
0d0495e3: 1800 sbb [eax], al
0d0495e5: 0000 add [eax], al
0d0495e7: 00c7 add bh, al
0d0495e9: 43 inc ebx
0d0495ea: 1c00 sbb al, 00h
0d0495ec: 0000 add [eax], al
0d0495ee: 00c7 add bh, al
0d0495f0: 43 inc ebx
0d0495f1: 1000 adc [eax], al
0d0495f3: 0000 add [eax], al
0d0495f5: 00c7 add bh, al
0d0495f7: 43 inc ebx
0d0495f8: 1400 adc al, 00h
0d0495fa: 0000 add [eax], al
0d0495fc: 00c7 add bh, al
0d0495fe: 43 inc ebx
0d0495ff: 0800 or [eax], al
0d049601: 0000 add [eax], al
0d049603: 00c7 add bh, al
0d049605: 43 inc ebx
0d049606: 0c00 or al, 00h
0d049608: 0000 add [eax], al
0d04960a: 00c7 add bh, al
0d04960c: 0300 add eax, [eax]
0d04960e: 0000 add [eax], al
0d049610: 00c7 add bh, al
0d049612: 43 inc ebx
0d049613: 0400 add al, 00h
0d049615: 0000 add [eax], al
0d049617: 00c7 add bh, al
0d049619: 83900100000000 adc dword ptr [eax+01], 00h
0d049620: 0000 add [eax], al
0d049622: c7839401000000 mov dword ptr [ebx+194], 00000000
000000
0d04962c: c7838801000000 mov dword ptr [ebx+188], 00000000
000000
0d049636: c7838c01000000 mov dword ptr [ebx+18c], 00000000
000000
0d049640: c7838001000000 mov dword ptr [ebx+180], 00000000
000000
0d04964a: c7838401000000 mov dword ptr [ebx+184], 00000000
000000
0d049654: c7837801000000 mov dword ptr [ebx+178], 00000000
000000
0d04965e: c7837c01000000 mov dword ptr [ebx+17c], 00000000
000000
0d049668: c78570ffffff00 mov dword ptr [ebp-90], 00000000
000000
0d049672: c78574ffffff00 mov dword ptr [ebp-8c], 00000000
000000
0d04967c: 83b8ec00000010 cmp dword ptr [eax+ec], 10h <-- FAULT
0d049683: 89bbf0000000 mov [ebx+f0], edi
0d049689: 0f8463040000 jz 0d049af2
0d04968f: 8b7dd4 mov edi, [ebp-2ch]
0d049692: 8d47e1 lea eax, [edi-1fh]
0d049695: 85c0 test eax, eax
0d049697: 0f8e2d070000 jle 0d049dca
0d04969d: 89f9 mov ecx, edi
0d04969f: bbffffffff mov ebx, ffffffff
0d0496a4: 8b75c8 mov esi, [ebp-38h]
0d0496a7: d3eb shr ebx, cl
0d0496a9: 89c1 mov ecx, eax
0d0496ab: 89b59cfeffff mov [ebp-164], esi
0d0496b1: 21f3 and ebx, esi
0d0496b3: d3e3 shl ebx, cl
0d0496b5: b920000000 mov ecx, 00000020
0d0496ba: 29c1 sub ecx, eax
0d0496bc: 8b45cc mov eax, [ebp-34h]
0d0496bf: d3e8 shr eax, cl
0d0496c1: 09c3 or ebx, eax
0d0496c3: 8d4701 lea eax, [edi+01h]
0d0496c6: 83f81f cmp eax, 1fh
0d0496c9: 89c6 mov esi, eax
0d0496cb: 7625 jbe 0d0496f2
0d0496cd: 8b55cc mov edx, [ebp-34h]
0d0496d0: 8b75d8 mov esi, [ebp-28h]
0d0496d3: 8945d4 mov [ebp-2ch], eax
0d0496d6: 8955c8 mov [ebp-38h], edx
0d0496d9: 89959cfeffff mov [ebp-164], edx
0d0496df: 8b db 8bh
Windows 5.1 (Windows XP build 2600) [Service Pack 2]
EAX = 0d3dc5ac
EBX = 0d3baa2c
ECX = 00000000
EDX = 0000007e
EBP = 0d95ee60
ESI = 00000004
EDI = 0000001a
ESP = 0d95eca8
EIP = 0d04967c
EFLAGS = 00010246
FPUCW = ffff027f
FPUTW = ffffaaaa
Crash reason: Access Violation
Crash context:
An out-of-bounds memory access (access violation) occurred in module 'xvidcore'... ...while running thread "Processing" (thread.cpp:150).
Pointer dumps:
EBX 0d3baa28: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
ESP 0d95eca8: 0d95ee28 00000003 00000000 0d95ee28 00000000 00000001 00000001 00000010
0d95ecc8: 0000000e 0d95ee28 00000001 00000000 00000000 00000000 00000000 003283d8
0d95ece8: 00000000 0d95ed4c 0d95ede0 00000001 0000000b 337f0000 0d95ed18 0d95ed18
0d95ed08: 0d215d50 80000200 000001b6 00acfd68 00000000 00000003 00000040 0000001f
EBP 0d95ee60: 0d95ef40 04602e03 0d215c80 0d95eee8 00000000 00000000 00000000 00000000
0d95ee80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0d95eea0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
0d95eec0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Thread call stack:
0d04967c: xvidcore!0000967c
0460c295: xvidvfw!DriverProc [04600000+bd80+515]
7c910833: ntdll!RtlAllocateHeap [7c900000+105d4+25f]
7c910833: ntdll!RtlAllocateHeap [7c900000+105d4+25f]
7c80b5b8: kernel32!GetModuleHandleA [7c800000+b529+8f]
7c80b58c: kernel32!GetModuleHandleA [7c800000+b529+63]
7c80b5a1: kernel32!GetModuleHandleA [7c800000+b529+78]
7c80b4b6: kernel32!GetModuleFileNameA [7c800000+b357+15f]
7c80b4cb: kernel32!GetModuleFileNameA [7c800000+b357+174]
7c910732: ntdll!RtlAllocateHeap [7c900000+105d4+15e]
7c910732: ntdll!RtlAllocateHeap [7c900000+105d4+15e]
7c9106ab: ntdll!RtlAllocateHeap [7c900000+105d4+d7]
7c9106eb: ntdll!RtlAllocateHeap [7c900000+105d4+117]
7c910732: ntdll!RtlAllocateHeap [7c900000+105d4+15e]
7c910732: ntdll!RtlAllocateHeap [7c900000+105d4+15e]
7c9106ab: ntdll!RtlAllocateHeap [7c900000+105d4+d7]
7c9106eb: ntdll!RtlAllocateHeap [7c900000+105d4+117]
7c910732: ntdll!RtlAllocateHeap [7c900000+105d4+15e]
7c911538: ntdll!wcsncpy [7c900000+10a8f+aa9]
7c911596: ntdll!wcsncpy [7c900000+10a8f+b07]
7c9106eb: ntdll!RtlAllocateHeap [7c900000+105d4+117]
7c910833: ntdll!RtlAllocateHeap [7c900000+105d4+25f]
7c910895: ntdll!RtlImageDirectoryEntryToData [7c900000+10856+3f]
7c910833: ntdll!RtlAllocateHeap [7c900000+105d4+25f]
7c910895: ntdll!RtlImageDirectoryEntryToData [7c900000+10856+3f]
7c9037bf: ntdll!RtlConvertUlongToLargeInteger [7c900000+3745+7a]
7c90da54: ntdll!NtFreeVirtualMemory [7c900000+da48+c]
7c918331: ntdll!RtlReAllocateHeap [7c900000+179fd+934]
7c90d4ea: ntdll!NtAllocateVirtualMemory [7c900000+d4de+c]
7c9180ff: ntdll!RtlReAllocateHeap [7c900000+179fd+702]
7c911bff: ntdll!RtlInitializeCriticalSection [7c900000+11b2d+d2]
7c91825d: ntdll!RtlReAllocateHeap [7c900000+179fd+860]
0050ad37: VDResamplerSeparablePointRowStageMMX::Process()
00509f9c: VDResamplerSeparableStage::ProcessRow()
00509a6a: VDResamplerSeparableStage::ProcessPoint()
75a718a8: MSVFW32!ICSendMessage [75a70000+187d+2b]
75a74c09: MSVFW32!ICCompress [75a70000+4ba6+63]
004ae882: VideoSequenceCompressor::PackFrameInternal()
004ae536: VideoSequenceCompressor::packFrame()
75a718a8: MSVFW32!ICSendMessage [75a70000+187d+2b]
75a74c4d: MSVFW32!ICDecompress [75a70000+4c10+3d]
0047f4ea: Dubber::WriteVideoFrame()
0047ee18: Dubber::WriteVideoFrame()
0047fa63: Dubber::ThreadRun()
004df5fe: VDThread::StaticThreadStart()
005285bf: _threadstartex@4()
7c80b50b: kernel32!GetModuleFileNameA [7c800000+b357+1b4]
-- End of report
woah!
6th March 2006, 03:26
mencoder just got updated to support threads with XviD now too.
where would this version be as i looked around the places i have linked and havent seen it?
celtic_druid
6th March 2006, 04:20
http://www1.mplayerhq.hu/cgi-bin/cvsweb.cgi/main/libmpcodecs/ve_xvid4.c.diff?r1=1.24&r2=1.25
devaster
7th March 2006, 23:56
mencoder just got updated to support threads with XviD now too.
how i activate it ? i cant find it in manual ... or i am blind ?!?
celtic_druid
8th March 2006, 02:29
Probably the manual you are looking at hasn't been updated yet. According to the above it is just threads=#. So -xvidencopts pass=1:threads=2
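For anyone scripting this, here is a minimal sketch that only builds the mencoder argument list with the threads option mentioned above; it does not run anything. The file names are placeholders, and the use of -nosound is an assumption made to keep the example short.

```python
import shlex


def mencoder_xvid_args(src: str, dst: str, pass_no: int, threads: int) -> list[str]:
    """Build (but do not run) a mencoder command line for an XviD encode."""
    xvid_opts = f"pass={pass_no}:threads={threads}"
    return ["mencoder", src, "-ovc", "xvid",
            "-xvidencopts", xvid_opts, "-nosound", "-o", dst]


# Placeholder file names, for illustration only.
args = mencoder_xvid_args("input.avi", "output.avi", pass_no=1, threads=2)
print(shlex.join(args))
```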
devaster
8th March 2006, 22:53
:thanks: ooh thanx
DiJayy
13th April 2006, 23:59
From what I can tell, when encoding MPEG2 transport streams (~18mbps) to XviD using avisynth and vdubmod or avs2avi, using 2 threads is actually a bit slower than 0, maybe because the decoder's doing so much work on one thread that the second xvid thread crowds it, but I don't know enough about this stuff to have a fair estimate. On 0 threads cpu usage is ~80%, on 2 threads it's 100%
AMD Athlon 64 X2 3800+ using Windows XP x64
I also tried playing with the process priority but it doesn't change much, usually just makes the fps less stable and more liquid, but doesn't actually speed up anything.
shpitz
14th April 2006, 14:24
On 0 threads cpu usage is ~80%, on 2 threads it's 100%
again, cpu usage has NOTHING to do with encoding speed.
if you want to assess if you had a speed-up or not you should only look at the TIME it took to encode...
just because the cpu is working harder doesn't mean it is working more efficiently...
ilhyfe
9th May 2006, 07:03
Hi,
Encoded using 2 threads.
AMD 4200+ , Avisynth 2.5.6 , VDM 1.5.10.2
First smp build:
1st pass : 56min (41,5fps)
2nd pass : 2h 36min (14,9fps)
New build:
1st pass : 50min (46,5fps)
2nd pass : 2h 17min (17fps)
I'd be interested in some more details. What kind of source? Any filters? Which destination resolution?
A friend of mine gets more than 100 fps while encoding a DVB stream to xvid (512*384) while I get max 50. He's using a X2 4400+, I a X2 3800+. Both of my cores are working @ ~90% while his are both on 100%.
We are both using koepis smp build.
so long...
:thanks: for the Xvid SMP, much appreciated. Doing a quick calculation I get about a 12% improvement in 2nd pass speeds using 2 threads.
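The quick calculation can be reproduced from the 2nd pass figures reported above (2h 36min at 14,9fps for the first smp build, 2h 17min at 17fps for the new build). The sketch below shows both ways of expressing the improvement, since time saved and fps gained give slightly different percentages.

```python
def to_minutes(hours: int, minutes: int) -> int:
    """Convert an h:min duration to minutes."""
    return hours * 60 + minutes


old_time = to_minutes(2, 36)   # first smp build, 2nd pass
new_time = to_minutes(2, 17)   # new build, 2nd pass

time_saved = (old_time - new_time) / old_time   # shorter encode, relative to old
fps_gain = 17.0 / 14.9 - 1.0                    # higher speed, relative to old

print(f"time saved: {time_saved:.1%}")  # 12.2%
print(f"fps gain  : {fps_gain:.1%}")    # 14.1%
```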
ChronoCross
18th August 2006, 08:10
Is there a way to get the latest cvs co to recognize that I have pthread.h? I can't build the SMP version without it. Configure says I don't have it.
Building x264 SMP works fine (doesn't search for pthread as far as I know)
btw I know this thread is quite old however it is the OFFICIAL thread and therefore compilation problems might be best suited here.
sysKin
18th August 2006, 08:58
Is there a way to get the latest cvs co to recognize that I have pthread.h? I can't build the SMP version without it. Configure says I don't have it.
Unfortunately all pthread logic, including makefile, was put in blindly. I don't know about others but from what I know, this was only ever tested with win32 and VC++.
Test results and patches appreciated.
celtic_druid
18th August 2006, 17:32
checking pthread.h usability... yes
checking pthread.h presence... yes
checking for pthread.h... yes
checking for pthread_create in -lpthread... yes
checking for pthread_join in -lpthread... yes
No problem here.
ChronoCross
18th August 2006, 17:35
checking pthread.h usability... yes
checking pthread.h presence... yes
checking for pthread.h... yes
checking for pthread_create in -lpthread... yes
checking for pthread_join in -lpthread... yes
No problem here.
where is pthread located for you in your mingw?
akupenguin
30th August 2006, 00:27
Yeah this is why I said ASP decoder. For AVC, there are at least several possibilities to either pipeline (like cabac->everything else ->deblocking) or work in parallel (deblocking itself, or decode slices in parallel if file was encoded with slices).
But ASP... yeah I suppose you can reinvent entire decoder to decode entire frames ahead (either on GOP boundary or b-frames) but that sounds weird. Within one frame, this sounds impossible.
Postprocessing can be split into slices. In fact I might do that if you say that's useful.
GOPs or b-frames would only work for certain restricted bitstreams... GOP parallelism would require a ridiculously large buffer with the standard 10 second keyframe limit, and b-frame parallelism fails with adaptive b-frames.
So I propose a different method of frame-level parallelism (which, btw, is codec-agnostic and would work in encoding too):
Decode N consecutive frames with N threads. Whenever a thread tries to decode a motion vector that points into a region of the reference frame that hasn't been decoded yet, it stalls until the thread responsible for that frame has decoded enough. If the movie has enough b-frames that only one thread at a time is in an i/p-frame, then there will be no stalls and it's just like b-frame parallelism. If not, it's slightly less efficient but still works.
Alternate plan: Use dxva/xvmc except with software emulation of the video card. (probably harder to implement, and definitely uses more bus bandwidth)
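The stall-on-dependency idea in the frame-level scheme above can be sketched with one thread per frame and a per-frame progress counter. This is only an illustration of the synchronisation, under simplifying assumptions: every frame references only the previous one, motion vectors reach at most MV_REACH rows further down, and no actual decoding happens. All names and constants are hypothetical, and Python threads will not show a real speedup here.

```python
import threading

ROWS = 8       # macroblock rows per frame (hypothetical)
MV_REACH = 2   # rows below the current one a motion vector may touch (hypothetical)


class FrameProgress:
    """Tracks how many rows of one frame have been decoded so far."""

    def __init__(self) -> None:
        self.rows_done = 0
        self.cond = threading.Condition()

    def wait_for(self, rows_needed: int) -> None:
        with self.cond:
            self.cond.wait_for(lambda: self.rows_done >= rows_needed)

    def advance(self) -> None:
        with self.cond:
            self.rows_done += 1
            self.cond.notify_all()


def decode_frame(index: int, frames: list, finished: list) -> None:
    ref = frames[index - 1] if index > 0 else None   # frame 0 acts as the I-frame
    for row in range(ROWS):
        if ref is not None:
            # Stall until the reference frame has decoded far enough.
            ref.wait_for(min(row + 1 + MV_REACH, ROWS))
        # Real decoding of this macroblock row would happen here.
        frames[index].advance()
    finished.append(index)


def decode_all(n_frames: int):
    frames = [FrameProgress() for _ in range(n_frames)]
    finished: list = []
    workers = [threading.Thread(target=decode_frame, args=(i, frames, finished))
               for i in range(n_frames)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return frames, finished


frames, finished = decode_all(4)
print(sorted(finished))
```

Each thread trails its reference frame by a few rows instead of waiting for the whole frame, which is where the parallelism comes from.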
DaForce
30th August 2006, 05:41
Hey chaps and chapettes,
My friend finally got his Dell machine up and running, it has a Pentium D 2.8ghz.
So we wanted to do some speed comparisons between my machine (x2 4400) and his.
So using Virtualdubmod and the latest Koepi Xvid SMP build (Love your work Koepi) we set off converting a sample piece of footage.
Here are my results for the tests and the details of the file used for encoding
Virtualdubmod + xvid - 2pass
2m34s thread @ 0
2m38s thread @ 1
2m 8s thread @ 2
2m10s thread @ 3
mewig + xvid - 2pass
2m11s
Original file details:
720x576 MPEG2 25FPS
1111frames (duration 0:44s)
Now his machine did it in 1m58s at the time i didnt note what his thread was set to but it could very well have been 0 (I believe it was).
So my machine is some 25% slower than his.
Next we used mewig to convert using x264 and my machine was almost 30% quicker (which is more like what i expected in xvid as well).
NOTE: we had identical settings in both VDM and Mewig
So what were we doing wrong? does xvid (smp build) not like AMDs as much as it likes Intels.
Any suggestions or ideas would be great. I really did expect my machine to beat him considerably in xvid.
sysKin
30th August 2006, 10:37
I really did expect my machine to beat him considerably in xvid.
And you should, that's the slowest P-D of them all.
Very few people tested with multi-core pentiums (or at least, very few people admitted it), but their results were quite bad so far. This is what I'd expect, given P4's lack of communication between cores (like hypertransport).
Perhaps you made some silly mistake, like a different setting somewhere? Or Full Recompress in vdub?
DaForce
30th August 2006, 12:06
Well yeah exactly.. i expected a result the same as the x264 thru mewig... where i was 30% faster.
I have retested again today and got the same problem. Even when using mewig which uses its own xvidcore.dll i get about the same time ..about 2m10s which is slower than the P4D by about 15s
in VDM we are doing a full recompress of the MPEG2 stream to an XVID video, leaving the audio as is (direct stream copy) and not much else really. I have double checked the settings many times.
Im kinda stumped. So basically 2 different programs using 2 different xvidcore.dll's produce the same miserable time for me.
I did a quick render test in 3dsmax using vray and my resulting time on a benchmark scene was as expected.. so my puter isnt running funky or anything.
Any ideas mate?
p.s. Im in Australia as well... Canberra :D
DaForce
31st August 2006, 07:34
hmmm i can get the encoding time down to 1m43s (instead of 2m08s) using Virtualdub (not virtualdubmod) and using avisynth and dvd2avi to be able to load the mpeg2 into virtualdub.
Which is weird. So its 25s quicker when using the above method than when using virtualdubmod. Doesnt make much sense.
Havnt tried the above on my friends P4-D tho.
GodofaGap
31st August 2006, 08:28
VirtualDub does encoding and decoding in separate threads, but it could also be that the MPEG2 decoder in VDM is just slower than DGIndex.
DaForce
31st August 2006, 09:12
I tried using the avisynth route in VDM as well and got basically the same time.
So Virtualdub uses seperate threads and VDM doesnt?
GodofaGap
31st August 2006, 09:29
I know VirtualDub uses separate threads, but it could be that this was introduced in the 1.6.x branch. (Of which there is no VDM version)
DaForce
31st August 2006, 09:41
ahh i see.
Right might stick with avisynth and virtualdub then.
Still i would expect alot faster than my friend p4-d than what im getting. x264 is much better.
ahh well.
Thanks Mate
Zep
1st September 2006, 19:45
ahh i see.
Right might stick with avisynth and virtualdub then.
Still i would expect alot faster than my friend p4-d than what im getting. x264 is much better.
ahh well.
Thanks Mate
Something is wrong on your box. I have the same CPU as you and I get 70+ FPS on 720x576 MPEG2 source video. You are not even reaching real time of 25FPS which is just crazy. If i shut off 1 core I get about 40 FPS. I use avisynth to feed to Vdub.
DaForce
1st September 2006, 20:30
Hey Zep, thanks for your reply man.
Well exactly it should be running faster in xvid. However are you converting from 720x576 to a smaller size ?
As in the test we are doing we are just converting from 720x576 mpeg2 to 720x576 XVID.
A while ago i converted some DVD vob files to a smaller xvid file and was getting about 90FPS on the final pass.
In rendering test (3D) its time is spot on with other x2 4400's so its not the box as a whole but maybe some of the codecs or something.
Would you be interested in testing the file that we are using? Would certainly let me know if the problem is with my machine.
I just tried converting to 320x256 and it was getting over 100fps on the first pass and about 85 average on the pass.
But for the test we were not resizing the footage.
Dreassica
5th September 2006, 18:53
I'm seeing no speedup on using smp 1.2 version compared to old singlethreaded xvid, using same avs and settings. I do notice 2nd core having considerably less load, 28% against 96% for core 1. I have xp patched etc, so that's not it.
foxyshadis
5th September 2006, 20:26
Framerate's only going to be as fast as the slowest part of it. If you open a new virtualdub, load the script, and use analyse video, you'll get the maximum speed any encoder could possibly run at. To get more you'd have to change your script.
To really test xvid mt's speed, try loading a plain avi file instead of a script.
Zep
8th September 2006, 10:57
Hey Zep, thanks for your reply man.
Well exactly it should be running faster in xvid. However are you converting from 720x576 to a smaller size ?
As in the test we are doing we are just converting from 720x576 mpeg2 to 720x576 XVID.
A while ago i converted some DVD vob files to a smaller xvid file and was getting about 90FPS on the final pass.
In rendering test (3D) its time is spot on with other x2 4400's so its not the box as a whole but maybe some of the codecs or something.
Would you be interested in testing the file that we are using? Would certainly let me know if the problem is with my machine.
I just tried converting to 320x256 and it was getting over 100fps on the first pass and about 85 average on the pass.
But for the test we were not resizing the footage.
makes no huge difference really. The CPU eaten via the resize is gotten back because Xvid now encodes a lot less. Basically i get about the same FPS if i do not resize or if i do to 320 x 256. The CPU usage is just in a different area. In this case avisynth more than Xvid when resizing down. Note i use multi threaded avisynth so the resize is fast compared to the avisynth most are using.
let me give you an example. I JUST encoded a HDTV show 720p to 624 x 352 and got a steady 68 FPS. Both source rez and file size i/o is much greater than DVD input. In this case if I do NOT resize down I get only about 56 FPS because Xvid has to encode a lot more and the resize for me is much faster than the Xvid settings I'm using (Max quality)
Oh BTW I use PC4000 and run it @ 267MHz and it makes a huge difference also since avisynth/vdub/xvid/encoding are mega memory read/writes etc...
DaForce
8th September 2006, 11:03
cool thanks for your info.. really helps.
When i last tried converting.. without resize it was about 30fps i think.. with resize down to 320x240 (or so ) it was up to 100fps.
And last night i noticed my memory settings were POV.. bloody CAS3.. (dont know why ) so they are now back to cas2 and tighter timings, will try again later tonight.
Thanks again.
shpitz
8th September 2006, 15:10
If you open a new virtualdub, load the script, and use analyse video, you'll get the maximum speed any encoder could possibly run at.
can you explain how to do that? i don't see any analyze option in the vdub menus.
Note i use multi threaded avisynth so the resize is fast compared to the avisynth most are using.
let me give you an example. I JUST encoded a HDTV show 720p to 624 x 352 and got a steady 68 FPS. Both source rez and file size i/o is much greater than DVD input.
can you point me to the avisynth version you're using?
can you post your script that you get 68fps on HD material?
which OS are you using? you use vdub or vdubmod?
thanks
Selur
8th September 2006, 15:28
can you explain how to do that? i don't see any analyze option in the vdub menus.
File->Run video analysis pass
(using Virtual Dub 1.6.16)
shpitz
8th September 2006, 15:38
thanks Selur
_xxl
1st January 2007, 18:32
checking pthread.h usability... yes
checking pthread.h presence... yes
checking for pthread.h... yes
checking for pthread_create in -lpthread... yes
checking for pthread_join in -lpthread... yes
No problem here.
Latest xvid 20070101. I don't know why
"checking for pthread_create in -lpthread" is disabled.
:confused:
http://i11.tinypic.com/3ywgrc9.jpg
ChronoCross
1st January 2007, 21:55
Latest xvid 20070101. I don't know why
"checking for pthread_create in -lpthread" is disabled.
:confused:
http://i11.tinypic.com/3ywgrc9.jpg
same here
celtic_druid
2nd January 2007, 06:42
checking pthread.h usability... yes
checking pthread.h presence... yes
checking for pthread.h... yes
checking for pthread_create in -lpthread... yes
checking for pthread_join in -lpthread... yes
Check your config.log.
_xxl
2nd January 2007, 09:41
Can you share MinGW and Msys dir?
Yong
2nd January 2007, 10:29
Could someone please test this?
Xvid vfw and encraw(original) compiled with pthreads support
http://www.mytempdir.com/1145230
I only have p4 prescott so i cant test it :p
@drevil_xxl:
You could try this http://sourceware.org/pthreads-win32/
Download the source, then
make GC-inlined
copy the libpthreadGC2.a to msys/lib
rename the libpthreadGC2.a to libpthread.a
Configure xvid again and see if it works.
celtic_druid
2nd January 2007, 11:17
I wouldn't rename since some apps still expect lpthreadGC2. Windows equiv of ln -s is just to copy it though.
My mingw dir is over 1GB.
As I said, check your config.log. If something isn't detected you can generally see why and fix it.
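The copy-instead-of-rename suggestion can be scripted. The sketch below keeps libpthreadGC2.a in place and adds a copy named libpthread.a next to it, so both names resolve. The function name is hypothetical, the library directory depends on your own MinGW/MSYS install, and the demo runs against a temporary folder with a dummy file rather than a real toolchain.

```python
import shutil
import tempfile
from pathlib import Path


def provide_libpthread(lib_dir: Path) -> Path:
    """Copy libpthreadGC2.a to libpthread.a, keeping the original file."""
    src = lib_dir / "libpthreadGC2.a"
    dst = lib_dir / "libpthread.a"
    shutil.copy2(src, dst)
    return dst


if __name__ == "__main__":
    # Demo on a throwaway directory; point lib_dir at your real lib folder instead.
    with tempfile.TemporaryDirectory() as tmp:
        lib_dir = Path(tmp)
        (lib_dir / "libpthreadGC2.a").write_bytes(b"dummy archive")
        print(provide_libpthread(lib_dir).name)
```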
_xxl
2nd January 2007, 13:01
copy the libpthreadGC2.a to msys/lib
rename the libpthreadGC2.a to libpthread.a
Configure xvid again and see if it works.
Works!
I have tested xvid with AMD X2.
1).xvid 1.2-127 multithreaded:
http://i16.tinypic.com/4733s4i.jpg
Total time 2:23s
2).xvid 1.1.2 no multithread:
http://i12.tinypic.com/33m3fja.jpg
Total time 3:12s
What you're seeing is the kernel's habit of tossing threads around in strange fashions. It doesn't really affect performance to execute half-and-half on two cpus instead of one.
pixelk
17th September 2007, 10:10
As sysKin's website seems down, a friend compiled the latest 1.2.x source for me, you can get the binaries ( core + vfw ) here :
http://www.knackes.com/blog/index.php?2007/09/16/151-xvid-multithread
Lenny_Nero
26th September 2007, 10:20
I have said quite a few times before that like for like testing needs to have the same hard drives and data on them, I get about a 30% speed change depending on which way I send to and from my 7200 rpm drives (SATA I only) and even more if going to and from the 10k drives.
Same for a big clean empty space, and again even more speed if the cluster size is tuned to the OS cache working set size, only then can you get the full use of the CPU[s].
MacAddict
26th September 2007, 12:47
(snip) and again even more speed if the cluster size is tuned to the OS cache working set size, only then can you get the full use of the CPU[s].
Any guides or sites that help out on these recommendations?
Mutant_Fruit
26th September 2007, 19:37
The fastest way to 'improve' performance (multithreaded and singlethreaded) would be to run the reading of the data from the disk in a separate thread and buffer 1-5 frames of data in memory so that when xvid itself requires data, it's already in memory and so doesn't stall waiting for the data to be read from disk. x264 does this.
If that's already done in xvid, then ignore my suggestion. For someone competent in C, it'd probably be less than a 1 hour hack (assuming xvid does all its reading from a specific function which can be altered and doesn't require the shotgun approach of changing dozens of areas) .
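The read-ahead idea is easy to sketch in general terms: a reader thread fills a small bounded buffer while the consumer encodes. The example below is a generic illustration, not xvid or x264 code; the frame source and the "encoder" (here just len) are stand-ins, and None is reserved as the end-of-stream marker.

```python
import queue
import threading


def read_frames(source, buf: queue.Queue) -> None:
    """Reader thread: push frames into the buffer, blocking while it is full."""
    for frame in source:
        buf.put(frame)
    buf.put(None)  # sentinel: no more frames


def encode_with_readahead(source, encode, depth: int = 5) -> list:
    """Encode frames while a separate thread reads up to `depth` frames ahead."""
    buf: queue.Queue = queue.Queue(maxsize=depth)
    reader = threading.Thread(target=read_frames, args=(source, buf), daemon=True)
    reader.start()
    results = []
    while (frame := buf.get()) is not None:
        results.append(encode(frame))
    reader.join()
    return results


# Stand-in data: ten 4-byte "frames"; the stand-in encoder just returns their size.
fake_frames = (bytes([i]) * 4 for i in range(10))
print(encode_with_readahead(fake_frames, encode=len))
```

The bounded queue keeps memory use fixed at `depth` frames, matching the 1-5 frame buffer suggested above.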
Lenny_Nero
27th September 2007, 04:28
Any guides or sites that help out on these recommendations?
Most of it is just my own work. I have always mucked about with hard drive cluster sizes because it was one of the first things my dad used to have to set up. HDDs were a bit bigger/smaller then (10 x 12 inch platters got you 40 MB) but the data is still read on and off in the same way.
I have been racking my brain for the keys in the registry. I will post them in this thread as soon as I remember, or get to the other computer that has the text file with my workings, but [HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\FileSystem]
has quite a few under it. There is a Sysinternals tool (IIRC) that can sort of do the same in a very simple sort of way.
I don't use XP or Vista (spit) for Windows; I use NT4 and NT5 AS (win2000) and 64bit 2003.
As for links, these can go some of the way to help.
http://www-128.ibm.com/developerworks/lotus/library/ls-Rules_WinNT2000/#A%20few%20tuning%20tips
http://www-128.ibm.com/developerworks/lotus/library/ls-Rules_WinNT2000/
http://www.ntcompatible.com/thread29591-1.html
http://www.dfwsug.org/cookbook.html
http://www.teamapproach.ca/trouble/CacheSize.htm
http://www.virtualdub.org/blog/pivot/entry.php?id=71
squid_80
27th September 2007, 06:33
The fastest way to 'improve' performance (multithreaded and singlethreaded) would be to run the reading of the data from the disk in a separate thread and buffer 1-5 frames of data in memory so that when xvid itself requires data, it's already in memory and so doesn't stall waiting for the data to be read from disk. x264 does this.
If that's already done in xvid, then ignore my suggestion. For someone competent in C, it'd probably be less than a 1 hour hack (assuming xvid does all its reading from a specific function which can be altered and doesn't require the shotgun approach of changing dozens of areas).
I added it to xvid_encraw just over a month ago. Virtualdub has been doing it forever.
HarryM
1st October 2007, 21:16
I added it to xvid_encraw just over a month ago. Virtualdub has been doing it forever.
It's a good idea! But a customizable read-ahead would be better, using the 1st pass statistics from the 2-pass encoding process (in that case you know the future frame types in the encoded data, which is significant information, of course).
When you set e.g. a 30-frame read-ahead (input buffer = 30 frames), you can theoretically start one new internal encoding thread whenever I-frames are closer together than 30 frames.
But you need an output buffer (= 30 frames) too, for sorting the encoded frames. A minor improvement, but still an improvement.
If you set input buffer = (I-frame distance), you can ALWAYS use two internal encoding threads. But you need a giant input buffer of ~100 to 200MB for a distance bigger than 250.
What is 100MB of memory today? Nothing. :)
It is an idea for three- and quad-core CPUs, mainly.
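As a rough check on the buffer sizes mentioned above, here is a small sketch of the arithmetic. It assumes the buffer holds uncompressed YV12 frames (1.5 bytes per pixel); the two resolutions are only examples picked for illustration, and the function names are made up.

```python
def yv12_frame_bytes(width, height):
    # YV12 = one full-resolution luma plane plus two chroma planes
    # subsampled 2x2, which works out to 1.5 bytes per pixel.
    return width * height * 3 // 2

def buffer_megabytes(width, height, frames):
    # Size of an input buffer holding `frames` uncompressed frames, in MB (10^6 bytes).
    return yv12_frame_bytes(width, height) * frames / 1e6

pal = buffer_megabytes(720, 576, 250)   # full PAL frame, 250-frame I-frame distance
vga = buffer_megabytes(640, 480, 250)   # 640x480, same distance
print(f"720x576: {pal:.1f} MB, 640x480: {vga:.1f} MB")
```

Under those assumptions a 250-frame buffer comes to roughly 155 MB at 720x576 and 115 MB at 640x480, which is in line with the ~100 to 200MB estimate.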
Mutant_Fruit
2nd October 2007, 01:00
When you set e.g. a 30-frame read-ahead (input buffer = 30 frames), you can theoretically start one new internal encoding thread whenever I-frames are closer together than 30 frames.
The benefits seem marginal at best. If the encoder wants 30 frames, it can read them in, and thereafter it will only request 1 at a time (as once it finishes with 1 frame out of that 30, it'll request one more). So there's no need for the disc buffer to retain 30 (or more) frames itself.
I hope that makes sense.
Zach
12th October 2007, 04:09
After reading through all seven pages, I feel I'm crashing a private party here, but I'll ask anyway. :)
I take it the only way to get and test this SMP version of xvid is to download the source and compile it oneself? (The original link is now dead, but it's over a year old, so that's understandable.)
imcold
12th October 2007, 09:23
There are CVS compiles from Celtic Druid at http://mirror.celticdruid.info/XviD/ .
Zach
12th October 2007, 09:33
Sure, but those aren't the multi-threaded variants, are they?
You mean v.1.1.3 on Celtic Druid's site is the multi-threaded counterpart to the "official" one (found here (http://www.xvid.org/Downloads.43.0.html))?
foxyshadis
12th October 2007, 10:40
Anything that says "cvs head" means the latest & greatest up-to-the-minute ("nightly") builds of the development code, which right now means 1.2.
Of course they're a few months old now, I wonder if any big fixes have been introduced in the meantime.
pixelk
13th October 2007, 12:21
Sure, but those aren't the multi-threaded variants, are they?
You mean v.1.1.3 on Celtic Druid's site is the multi-threaded counterpart to the "official" one (found here (http://www.xvid.org/Downloads.43.0.html))?
I compile the last CVS every few days. Get the binary here : http://www.knackes.com/blog/index.php?2007/09/16/151-xvid-multithread&link1
foxyshadis
13th October 2007, 13:45
Thanks for the heads up! Current builds are always appreciated.
Zach
14th October 2007, 01:58
I compile the last CVS every few days. Get the binary here : http://www.knackes.com/blog/index.php?2007/09/16/151-xvid-multithread&link1
Thanks!
MacAddict
14th October 2007, 17:35
I compile the last CVS every few days. Get the binary here : http://www.knackes.com/blog/index.php?2007/09/16/151-xvid-multithread&link1
Thanks for your builds, much appreciated. I seem to be having more frequent crashes using xvid_encraw via megui with the 10-14 build compared to the September builds. Anyone else?
pixelk
15th October 2007, 12:34
xvidcore_20071015 binary should work much better.
|sawo|
31st October 2007, 23:31
I have a Q6600 @ 3GHz and I got 55fps with 4 threads (mpeg2 source, 1500kbps 1 pass xvid).
The fps is pretty much the same as with the regular xvid and the CPU usage is around 50%.
With the same settings I'm able to get ~111fps using divx 6 (also, I set crop, resize and deinterlace directly from the divx panel, almost without an fps drop!)
Ice =A=
31st October 2007, 23:42
The usual answer: What are "same settings"?:devil::sly::confused:
|sawo|
1st November 2007, 00:24
1500kbps 1pass, balanced profile for divx.
Even with the Insane Quality preset activated in divx6 I'm still able to get around ~68fps, which is far more than xvid.
Also don't forget that I have many additional options such as resize, deinterlace, crop etc. activated in the divx6 panel!
Zach
1st November 2007, 03:44
I have a Q6600 @ 3GHz and I got 55fps with 4 threads (mpeg2 source, 1500kbps 1 pass xvid).
The fps is pretty much the same as with the regular xvid and the CPU usage is around 50%.
With the same settings I'm able to get ~111fps using divx 6 (also, I set crop, resize and deinterlace directly from the divx panel, almost without an fps drop!)
But I'll wager that the final xvid file is significantly smaller than the final divx file, though, right? Have you checked? I'm curious.
I've been doing a lot of Divx vs. Xvid(MT) benchmarks over the last week, and as hard as I try to match up the profiles (so that I'm comparing apples to apples), Divx 6.7 always beats the snot out of Celtic Druid's Xvid MT build (found here (http://ffdshow.faireal.net/mirror/XviD/)) in terms of encoding speed. (Xvid's 'Threads' = 2, not 4, since I only have a dual core.)
On the other hand, the Xvid encode always produces a significantly smaller file (edit: without any loss to picture quality as far as my eyes can tell).
Well, logic dictates that this is probably why DivX performs so much faster: it's spending a lot less time compressing.
But, alas, I'm admittedly a n00b when it comes to the low-level profile settings, so maybe I'm not setting up correctly. But even when just comparing the built-in "Home Theater" to "Home Theater" profiles, DivX is faster, but Xvid produces smaller files. <shrug>
foxyshadis
1st November 2007, 08:21
I would certainly hope that xvid wouldn't be significantly smaller than divx if both were set to 1500kbps.
But yes, xvid's threading capability isn't as refined as divx's.
However, sawo, you need to remember not to let the avisynth be a limiting factor - use setmtmode(2) at the beginning of the script (with the latest MT avisynth), and use leakkerneldeint as your deinterlacer (fastest decent-quality avisynth deint). That might boost your xvid fps. If you are feeding it a preprocessed source vid this doesn't apply.
|sawo|
1st November 2007, 10:23
Zach, actually the difference between divx/xvid was 100-200kb with an 80mb mpeg2 source, so I think the size in this case is not a big deal considering the fps difference.
foxyshadis, I'm currently using the latest virtualdub with the vfw versions of the codecs (without avisynth), because it saves time for me. When I try the MT avisynth version I'll post the results.
Zach
2nd November 2007, 02:29
I would certainly hope that xvid wouldn't be significantly smaller than divx if both were set to 1500kbps.
Yes, I should clarify that all my benchmarking was with using a default "Target quantizer: 4.00" rather than a "Target bitrate (kbps)" which, I guess you are saying, is basically VBR vs. CBR, right?
I don't like the concept of hard-coding a bitrate. It just seems wasteful. :p
Alright, well, sorry to intrude. Carry on.
JCDenton
10th November 2007, 18:32
Hi everyone,
I'm sorry to bother you with this, supposedly stupid, question but I really don't know what to do or where else to look for advice.
First off, my system:
A64 X2
WindowsXP SP2
VDub Mod 1.5.10.1
I can't get the XviD codec (tried the "official" one and Koepi's 28062007 build linked at doom9.net) to use both of my cores - the funny thing is, just a couple of days ago it worked perfectly fine, CPU load was near 100% almost every time, even when I switched the priority to "idle".
I also tried pixelk's 20071031 build, but here I only get up to 70% CPU load with 4 threads and "higher" priority.
I don't remember having changed anything, and I even reinstalled the codecs and VDub over and over again - does anyone have an idea what might be wrong here?
Oh, and please pardon my clumsy English; as you've obviously figured out by now, it's not my mother tongue ;)
Regards
denton
foxyshadis
14th November 2007, 11:21
Official and 28062007 are both part of the 1.1 branch, which explains why those won't thread. Don't use 4 threads if you only have 2 cores, though; 2 is best and 3 is the max, may help or may hurt in different situations, but 4 will always hurt by uselessly increasing the cpu meter without improving performance.
The other half of the problem is that the input is probably slowing the whole process down now: You have to use MT avisynth if you have an even remotely complex script to get full utilization.
JCDenton
15th November 2007, 12:25
Stupid me, could have figured for myself that ">2 Threads@2 Cores = bad" :p
Well, thanks for your reply - sadly, I've never used avisynth and don't have any experience with scripting, so I'm just going to stick with pixelk's build, at least that way I can use the remaining CPU cycles to transcode the sound stream ...
Thanks again.
Regards
denton
humax
7th December 2007, 11:35
I have a Q6600 quad-core. Do I have to use 4 threads?
I looked at pixelk's homepage but it is in French. There was a 4 in Number of Threads. In other threads I read that a maximum of 3 threads should be chosen.
Thanks for the help
Adub
8th December 2007, 08:01
You don't "have" to use 4 threads. But it should help. Actually I think it is 6 threads for you, but don't quote me on that.
squid_80
8th December 2007, 08:46
Actually testing on my Q6600 shows 2 threads to be optimal. 3 or 4 may improve performance very slightly, but only if you're not doing anything with the pc; otherwise they're worse than 2.
sysKin
9th December 2007, 05:50
You don't "have" to use 4 threads. But it should help. Actually I think it is 6 threads for you, but don't quote me on that.
Any basis for that? I'd say 3 threads. Definitely no more threads than cores, what made you think that!
foxyshadis
13th December 2007, 19:21
Any basis for that? I'd say 3 threads. Definitely no more threads than cores, what made you think that!
Confusion with x264's behavior. ^.~
ilhyfe
13th December 2007, 22:59
Any basis for that? I'd say 3 threads. Definitely no more threads than cores, what made you think that!
I never tried that before on a quadcore but I get best speed with 3 threads. Can you explain why?
olnima
14th December 2007, 10:29
...and with Dual-core (E6850)? 1 or 2 threads? I tried 3, CPU usage rises without getting more speed.
Olnima
ilhyfe
14th December 2007, 11:02
...and with Dual-core (E6850)? 1 or 2 threads? I tried 3, CPU usage rises without getting more speed.
Olnima
I had the best results with 2 threads.
pixelk
22nd December 2007, 09:55
Somebody just asked on my blog: where can we find the latest MT avisynth? I would like to try to get the maximum fps from my quad-core and, if I have the time, post the results of my tests.
Any advice about how I can (with the latest nightly build) get the highest fps ?
LigH
30th December 2007, 14:43
Using celtic-druid's "head" build (http://mirror.celticdruid.info/XviD/XviD.cvs.head.MTK.exe, 2007-07-25); on an AM2, the "Threads" edit field is disabled (grayed out) with a "2". Is this the expected behaviour? And is this still the most recent Win32 build?
@ celtic-druid: Would be nice to include the version "1.2" in the filename too. Several members of the German forum were confused about "not finding a v1.2 build", not knowing sysKin's remark:
So, here we go:
What happened: Multithreaded XviD code is now committed to CVS. From now on, all "xvid head" or "xvid 1.2.x" versions have it.
pc_speak
31st December 2007, 02:32
@LigH. Interesting. Had the same grayed out problem also. Remembered I had Koepi's XviD-1[1].1.3-28062007.exe installed.
Uninstalled it AND celtic-druid's. Ran a registry cleaner over the system. Reinstalled celtic-druid's XviD.cvs.head.MTK.exe.
Went into 'Configure Decoder' for a quick peek at the settings. Then went into 'Configure Encoder' to set my defaults. "Threads" edit field now enabled. Set it to 4. Quad core. :)
LigH
1st January 2008, 18:18
I see -- XviD is not XviD, regarding installers... Always uninstall! :D
Happy New Year!
philippas
2nd January 2008, 13:07
:thanks:
88keyz
4th January 2008, 00:39
For the last little while I have been playing with compiling an XviD release based on the CVS tarballs. It has taken me a while to figure it out but I think I have it now. This release includes both the encoder and decoder and comes in an EXE installer that creates config shortcuts for both. Also included is a complete uninstaller should there be any problems. If anyone would like to test the first new XviD release of 2008 then I would be curious to know how my compile stacks up. Please uninstall any previous XviD releases from your system before installing this one. Based on the XviD 1.2 code this release fully supports SMP systems.
XviD_1.2.127-29032008.exe (http://rapidshare.com/files/103434080/XviD_1.2.127-29032008.exe)
Please keep in mind that I am not a developer and that I have only compiled the CVS release available from Xvid.org for curiosity.
:)
pc_speak
4th January 2008, 23:28
Installed your compile as per instructions.
Seems fine. Nothing got broken. :D
Configured encoder OK. Could set threads to 4.
Used decoder & encoder. Still OK.
Uninstall worked fine.
I'll leave it installed for a bit and let you know.
:)
88keyz
5th January 2008, 02:39
Thanks for the feedback. I've done some testing and everything seems to work fine for me but I'm sure others out there push the codec closer to its limits than I do.
pc_speak
8th January 2008, 22:42
Did about 8-10 hours on my quad core machine over the weekend. All worked just fine. Congratulations.
88keyz
10th January 2008, 20:46
A small update to the previous codec release. Based on the January 10th tarball code. Includes new icons and a codec info link.
XviD_1.2.127-09022008.exe (http://rapidshare.com/files/90468184/XviD_1.2.127-09022008.exe)
The older compile is still being hosted for now. Thanks to all of you that have downloaded and tried this compile.
:thanks:
Buggle
16th January 2008, 18:53
Maybe someone can give me directions on where to find a changelog for the newest changes in Xvid, like the nice listing of x264 found here (http://trac.videolan.org/x264/log/trunk/common). I am really interested in what kind of experimental fixes and changes are before I put it in action. I have been searching for a while now, but cannot find anything but some mailings that do not get me any further.
Ranguvar
16th January 2008, 22:44
*bows to 88keyz*
Here, Buggle. It's as far as I have. Note that
Xvid-1.1.2-01112006 and Xvid-1.1.3-28062007 do not include the changes in XviD-1.2.*
Xvid-1.1.3-28062007:
- {core}: Fixed possible security issue in mbcoding.c
Xvid-1.1.2-01112006:
- {core}: Fixed bug when frame-drop (N-VOP) feature is used in combination with packed B-frames
- {core}: Fixed potential crash on AMD64/EMT64 architecture.
- {core}: Fix for visual_object_verid vs. video_object_layer_verid problem.
- {core}: Ensure intervening bytes are preserved in BitstreamInit()
- {vfw}: Prevent segfault when encoding application calls compress_end with NULL codec context
- {vfw}: Profile definitions updates.
XviD-1.2.-127-25022006:
Changelog to XviD-1.1:
- {core}: New experimental SMP support.
- {core}: Trellis improvements (according to sysKin).
- On uniprocessor machines set number of threads to 0!
XviD-1.2.-127-07012006
Changelog to XviD-1.1:
- {xvidcore} Experimental SMP support (2 threads hardcoded). Patch for P- and B-frames from sysKin applied by hand.
- {xvidcore} Trellis improvements (according to sysKin).
- {xvidcore} Bumped bitstream version to 42, you never know (41 is XviD-1.1.0-final).
XviD-1.1.0 final build.
Changelog:
- {core}: Field interlaced decoding.
- {dshow}: Additional fourcc support.
- {vfw}: Small updates.
XviD-1.0.3
Changelog:
- {xvidcore} Fixed trellis optimization overflow for quant 1 & qpel modes (motion search was done twice, one in SAD mode and one in RD mode)
-Fixed MV clipping with non valid DivX 5 based sequences.
-Fixed RGB 16 bit C functions.
-Fixed possible VOL header corruption for fps=1 encodes.
-DC misprediction caused by bad value clipping (bug forwarded to the ffmpeg project too).
VFW frontend
-Fixed mismatching of hintswidgets.
1.0.0 RC4, Codenamed "Hola"
xvidcore
-GMC 1 warp point (DivX5)
-GMC 2 warp point fix
-Minor postproc code fixes
-Motion Vector clipping fix for stressing test cases.
-Problems caused by wrong cooperation of bframes and frame dropping code.
-Decoder provides quant information in stats.
VFW frontend
-Multiple instance memory leak fix.
-Improved bitrate calculator.
-Some other minor changes.
-DShow frontend
-Release packages have all needed files to build from source.
1.0.0 RC3, Codenamed "Ni Hao"
xvidcore
-Workaround for dev-api-3 decoding that causes psychedelic color effects for non modulo 16 encodes. dev-api-3 builds were mostly used by win32 users during the transition from 0.9.x series and 1.0.x.
-Buffer overflow reading in decoder (read up to bytes to far).
VFW frontend
-Bitrate calculator fixes.
-Status window fixes, GMC frames get counted too now.
-Mod4 / YV12 resolutions encoding fixed.
DShow frontend
-Updates and cleaning.
1.0.0 RC2, Codenamed "Jambo"
xvidcore:
-Decoder bugfixes (GMC+interlaced).
-Changed the DivX packed user string to version 999 so DivX decodes XviD packed bitstreams like it should have done before.
-Fixed YVYU colorspace space (was using Y as V channel)
VFW frontend:
-Added bitrate calculator.
-Output a DLL linking library when compiling with MSVC.
DShow frontend:
-Video flipping fixed.
-Added MP4V to the supported FourCCs.
-Command line driving.
1.0.0 RC1, codenamed "Niltze"
xvidcore
-Scaled zones should now work in 2pass 1&2.
-Qpel is disabled during first pass now.
-Bug in PP using MB quants badly initialized.
-Changed Win32 build type to DLL.
VFW frontend
-Changed linking policy. Links against xvidcore.dll. The vfw component is now xvidvfw.dll.
-GUI improvements.
-Added PP options as in DShow frontend.
-Added easier constant quant encoding as most of users complained though the feature was available thanks to zones.
DShow frontend
-Changed linking policy. Links against xvidcore.dll.
-Better seeking.
-Fixed colorspace usage.
1.0.0 beta3, Codenamed "Selam"
xvidcore:
-Defaulted back to VGA 1:1 PAR.
-Enabled SSE2 assembly code for IA32 platforms.
-Improved and bugfixed two pass:
-better frame size scaling.
-better defaults.
-handles up to 2TB target filesizes
-1st pass disables automatically CPU hungry features.
-Added fast ME replacement routines.
-Added Post Processing to decoder:
-Deblocking.
-FILM noise.
-Various Bugfixes.
VFW frontend:
-Added AR widget.
-Added "Turbo mode" that enables core fast ME routines.
-Removed DXN profiles from the profile list.
-New defaults.
DShow decoder frontend:
-Added PP widgets.
1.0.0 beta2, Codenamed "Ciao"
xvidcore:
-MPEG4 compliance is back (beta1 was missing the VOS header)
-matrix quantization is finally thread safe
-improved vop type decision
-2pass2 plugin: min key interval was a misleading name. It's been renamed to kfthreshold. And as the kfthresholding behavior was a bit too aggressive, it's been disabled until we decide how it should behave w/o hurting quality
-single plugin: fixed quant capping
-interlacing artefacts fixed.
VFW frontend:
-some misuses of xvidcore were fixed
-min key frame widget renamed and moved to the 2pass panel
Debian package:
-small errors in the control file reported and fixed by Nicolas Boos
1.0.0 beta1, Codenamed "Aloha"
-New API.
-New Motion Estimation system
-with SAD based algorithms,
-or Rate Distortion optimized algorithms.
-Dynamic frame type decision based on a fast motion estimation pass
-Support for bvops and svops (up to 3 warp points).
-QuarterPel precision.
-Trellis optimization for h263 and MPEG quantization schemes.
-Special mode for cartoons/anime like Futurama/The Simpsons or any anime with flat color areas.
-Mod 2 resolution support.
-Two pass algorithm is now part of xvidcore.
Buggle
16th January 2008, 23:35
Thanks Ranguvar, but I meant more like the changes in the nightly builds, or to the cvs, like the changes noted between the latest svns of x264.
88keyz
17th January 2008, 00:51
As far as I'm aware, the Xvid team doesn't publish info about the 1.2.x family of builds. They have only ever been available as CVS source code and there is no info on the site about changes made to the code. According to the site, each night at midnight they simply upload the latest source tarball; nowhere is there any info about the version until you open the tarball, where all it gives you is a date. I have never seen release notes for any of the 1.2.x family of releases other than what Koepi published on his site when he released his first compile of the 1.2.x code.
XviD-1.2.-127-25022006
Changelog to XviD-1.1:
- {core}: New experimental SMP support.
- {core}: Trellis improvements (according to sysKin).
- On uniprocessor machines set number of threads to 0!
To the best of my knowledge the major change made was that the encoder will now use multiple processors. I'm sure there have been other tweaks along the line though, sysKin might be the guy to answer this question.
professor_desty_nova
17th January 2008, 09:57
Go to Celtic Druid's page http://celticdruid.no-ip.com/xvid/ and choose changelog next to the XviD link. The latest that he recorded is on the last page of the forum post (but it only goes to the middle of 2007).
Buggle
20th January 2008, 14:37
A small update to the previous codec release. Based on the January 10th tarball code. Includes new icons and a codec info link.
XviD_1.2.127-10012008.exe (http://rapidshare.com/files/82784043/XviD_1.2.127-10012008.exe)
The older compile is still being hosted for now. Thanks to all of you that have downloaded and tried this compile.
:thanks:
Have encoded lots and lots of stuff with your latest build already. Haven't run into problems on either my dual or my single core. And it's fast, that's really cool :D. Now I encode the second pass at almost realtime speed on my 2800+ Barton. The last time I tried Xvid (a few months ago) that was significantly lower, if I recall correctly. Maybe that's just some wishful thinking ;)
Good goin'!
Cyberace
21st January 2008, 15:47
Any updates on SMP (multi-processor) DECODING support?
sysKin
23rd January 2008, 16:16
Any updates on SMP (multi-processor) DECODING support?
I definitely don't plan to try. What for? I can't imagine any SMP-capable computer not decoding xvid on single thread already (unless it's an ancient dual-cpu).
And anyway, what's the point of using XviD for decoding. Other decoders exist.
OterLabb
25th January 2008, 01:20
A small update to the previous codec release. Based on the January 10th tarball code. Includes new icons and a codec info link.
XviD_1.2.127-10012008.exe (http://rapidshare.com/files/82784043/XviD_1.2.127-10012008.exe)
The older compile is still being hosted for now. Thanks to all of you that have downloaded and tried this compile.
:thanks:
Did a little testing on this release on my Q6600. With 5 threads (which seems to be the default) vdub uses 80-90% of the total CPU, across all four cores. But with 5 threads the fps drops and the overall encode takes longer. I found that 3 threads seemed to work best, which seems to be about the same as the 1.1.3 Final build as far as fps goes. Is anyone else getting more fps on a Q6600 with this build?
Ranguvar
25th January 2008, 03:51
anyway, what's the point of using XviD for decoding. Other decoders exist.
Gah. Bad logic.
What's the point of using Xvid? Other MPEG-4 ASP encoders exist.
;)
sysKin
25th January 2008, 06:32
Gah. Bad logic.
What's the point of using Xvid? Other MPEG-4 ASP encoders exist.
;)
Good logic. XviD compresses better than others.
XviD decompresses identically to others.
Ranguvar
27th January 2008, 17:56
Good logic. XviD compresses better than others.
XviD decompresses identically to others.
Yes, but through demand and work, it could work better.
squid_80
28th January 2008, 05:11
Decoders must produce identical output. There are already faster decoders than xvid, so why waste time speeding up xvid's decoding to try and match them when you're going to get the same identical results?
TripleA
28th January 2008, 17:57
I remember back in the dawn of time when I first started using XviD that the encodes I made required insane hardware requirements such as 800MHz P3s and the like.
I suppose slightly higher requirements are to be expected with the enhancements since then (that was pre-B-frames, btw), but I don't think there is any modern CPU that couldn't decode XviD on a single core without breaking a sweat or, indeed, leaving power-saving mode. Well, maybe VIA's C3 can't. But then the C3 isn't gonna benefit from multi-core optimizations, is it?
squid_80
28th January 2008, 18:20
The Duron 800 in my car has no probs playing back SD clips, with only 384mb ram and gps software running at the same time.
EuropeanMan
28th January 2008, 20:28
^ SQUID80........please help me out man...
your xvid_encraw.exe file for some reason crashes on me... :( I sent you a PM and have a thread open in this forum as well...
PLEEEEEEEEAES HELP
squid_80
29th January 2008, 03:07
^ How not to file a bug report.
You didn't say what the problem was, so I didn't pay attention.
audyovydeo
31st January 2008, 12:04
Maybe an offtopic question, but it's the only thread where I saw v 1.2 mentioned - is xvid 1.2.0 the new project as described here :
http://www.xvid.org/Xvid-Codec.2.0.html
???
cheers
audyovydeo
Ranguvar
31st January 2008, 12:22
No.
The Xvid 1.2.x branch has seemed to focus on SMP (multi-threading) and trellis.
hajj_3
31st January 2008, 18:01
So how much faster is the latest 1.2 build than 1.1.3 on a dual- or quad-core Core 2?
I've got a Core 2 Duo overclocked to 3.6GHz and will be getting a 45nm quad core in 2 months.
Also, any plans to add SSE4 support? divx 6.7 is 60% or something faster with SSE4-capable CPUs.
thanks!
clsid
31st January 2008, 21:34
divx 6.7 is 60% or something faster with sse4 capable cpus.
That is not correct. IT IS A MARKETING STUNT.
They added 1 specific routine to DivX that performs really well with SSE4. However, that particular routine is pretty much useless in real life. So DivX encoding does not really benefit from SSE4.
hajj_3
1st February 2008, 00:23
That is not correct. IT IS A MARKETING STUNT.
They added 1 specific routine to DivX that performs really well with SSE4. However, that particular routine is pretty much useless in real life. So DivX encoding does not really benefit from SSE4.
I've seen reviews of 45nm quad Penryn CPUs encoding movies to divx 6.7 on the anandtech site and it's literally about 60% faster. Independent review, nothing to do with the divx company.
Ranguvar
1st February 2008, 00:36
I've seen reviews of 45nm quad Penryn CPUs encoding movies to divx 6.7 on the anandtech site and it's literally about 60% faster. Independent review, nothing to do with the divx company.
Link?
Ranguvar
1st February 2008, 00:37
So how much faster is the latest 1.2 build than 1.1.3 on a dual- or quad-core Core 2?
I've got a Core 2 Duo overclocked to 3.6GHz and will be getting a 45nm quad core in 2 months.
Also, any plans to add SSE4 support? divx 6.7 is 60% or something faster with SSE4-capable CPUs.
thanks!
For the first, try and see. There's a very new build posted just a little bit ago.
For the second, I doubt it. It's been asked of x264 a lot, and they basically said their own software routines were faster than using SSE4. So I presume it's pretty much the same with Xvid.
sysKin
1st February 2008, 02:11
SSE4 only makes things faster if you use very inefficient algorithm to begin with. Basically what you need is a full search, which is pointless and horribly slow. Then, this full search can be made 60% faster.
This is pretty much what DivX did, they invented particular algorithm for the purpose of using SSE4 with it.
Mgz
3rd February 2008, 12:45
Link?
http://www.anandtech.com/cpuchipsets/intel/showdoc.aspx?i=3195&p=4
clsid
3rd February 2008, 13:32
I have more trust in the expertise of sysKin, Manao, and akupenguin.
http://forum.doom9.org/showthread.php?t=133567&highlight=divx+sse4
http://forum.doom9.org/showthread.php?t=124881&highlight=divx+sse4
lazik
4th February 2008, 16:13
In one test Divx is n times slower than Xvid.
Try to capture a DV source with real-time compression. Yes, I know it is a trick test, but there is no DV capture tool that can use the Divx codec. ;)
foxyshadis
5th February 2008, 01:18
SSE4 optimizations only work with DivX set to Insane. That's exactly what anandtech did. If you just use any of the slow speeds (let alone fast), you'll see no boost from sse4.
cmw
5th February 2008, 18:25
Erm, sorry if this has been answered earlier in this thread, as I'm not done reading through it, but this seems very strange to me. I use the semi-automatic batch script to encode with xvid, have the xvid build from a few days back (posted earlier in this thread) installed, and use the newest xvid_encraw from squid_80. I have an Intel Core 2 Duo E6600, Windows XP SP2 32bit.
Commandline input:
start "TGHQ-30 - Pass 1/2" /b /wait /normal "C:\autok2\bin\xvid_encraw.exe" -zones 0,q,3,KO -threads 2 -progress 100 -max_key_interval 300 -packed -quality 5 -vhqmode 1 -max_bframes 2 -bquant_ratio 162 -bquant_offset 0 -qtype 1 -qmatrix "C:\autok2\matrix\eqm_v3ulr_rev3.xcm" -nochromame -turbo -lumimasking -bitrate 1373 -pass1 "C:\Temp\dvd\tmp\extrem.pass" -type 2 -i "C:\Temp\dvd\extrem.avs"
As you can see, it specifies threads=2
Subsequent encoding:
xvid_encraw - raw mpeg4 bitstream encoder written by Christoph Lampert 2002-2003
Trying to retrieve width and height from input header
xvid [info]: Avisynth detected
xvid [info]: Input colorspace is YV12
xvid [info]: Input is 640 x 352, 29.970fps (30000/1001), starting from frame 0
xvid [info]: Number of frames to encode: 214758, Bitrate = 1373kbps
xvid [info]: xvidcore build version: xvid-1.2.0-dev
xvid [info]: Bitstream version: 1.2.-127
xvid [info]: Detected CPU flags: ASM MMX MMXEXT SSE SSE2 TSC
xvid [info]: Detected cpus = 2, threads requested = 1, threads in use = 1
xvid [info]: Threaded input reading active
Only 1 thread requested? Only 1 thread used? Umm... help pls :)
Does it have something to do with the threaded input reading active? What does that mean? oO
Edit: I've just installed the MT version of avisynth. Encoding speed went up considerably, however, xvid still displays that only 1 thread is used :/
Edit2: I just tried an older version of xvid_encraw, and it says cpu's used = 2 there. The line about threaded input reading is missing there though, so I guess it has something to do with that. Encoding speed is pretty much the same on both versions though.
squid_80
5th February 2008, 21:11
1 thread for encoding + threaded input = 2 threads. If you want to try forcing 2 encoding threads use -nothreadedinput or -threads 3, but you'll probably lose speed.
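The arithmetic here can be written down as a tiny sketch. It is only a guess at the bookkeeping, consistent with the log above (-threads 2 reported as 1 thread in use) and with the two suggestions; the function name and the clamp to at least one thread are assumptions, not taken from the xvid_encraw source.

```python
def encoding_threads(requested: int, threaded_input: bool = True) -> int:
    """Hypothetical model: how many threads are left for the encoder itself.

    Assumption: when threaded input reading is active, one of the requested
    threads is spent on reading frames, and at least one always encodes.
    """
    if threaded_input:
        return max(1, requested - 1)
    return requested


# The three cases discussed in this post:
print(encoding_threads(2))                        # -threads 2, threaded input
print(encoding_threads(3))                        # -threads 3, threaded input
print(encoding_threads(2, threaded_input=False))  # -threads 2 -nothreadedinput
```

Whether two encoding threads are actually faster than one encoding thread plus threaded input is something only a timed run on your own setup can tell.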
88keyz
9th February 2008, 19:56
As a little birthday present to myself I decided to compile a new tarball.
XviD_1.2.127-29032008.exe (http://rapidshare.com/files/103434080/XviD_1.2.127-29032008.exe)
Enjoy.
clsid
9th February 2008, 20:19
I hope you know that the tarball is generated automatically, and not just whenever something has changed. Judging from the file dates inside the tarball, nothing has changed since your previous build.
88keyz
10th February 2008, 08:08
No, I didn't know that. But to be honest I don't really care either, it was just fun to release a compile on my birthday. It's not like it's a lot of work, it probably took less than 5 minutes to do, so I'm not worried about any time I may have wasted.
:rolleyes:
audyovydeo
11th February 2008, 11:44
sorry guys which is the official link for the 1.2.x branch tarballs ?
I couldn't trace it through xvid.org ...
thanks
audyovydeo
clsid
11th February 2008, 12:21
It's right there in the download section:
http://downloads.xvid.org/downloads/xvid_latest.tar.gz
totya
16th February 2008, 10:30
As a little birthday present to myself I decided to compile a new tarball. This one is based on the February 9th release, just like me!
XviD_1.2.127-09022008.exe (http://rapidshare.com/files/90468184/XviD_1.2.127-09022008.exe)
Enjoy.
Thanks!
My problem with xvid (all versions): the preset manager is unusable. Only 1 (one!) user preset is available. This is impossible. Different presets are required for TV capture, AVI recompression, etc... :(
olnima
16th February 2008, 12:08
As a workaround: import/export your settings via regedit.
The xvid settings are stored in the registry under
HKEY_CURRENT_USER\Software\GNU\XviD
This way you can create your own profiles and reimport them before using xvid.
If you use VirtualDub for capturing/postprocessing there is another way called "V2CRS".
http://v2crs.sourceforge.net/
It includes a little program called VDubProfileCreator that does the same as described above, but with a GUI.
Hope that helps,
Olnima
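The regedit workaround above can also be scripted. This is a minimal sketch, assuming the Windows reg.exe tool is on the PATH; the profile names, the "profiles" folder and the helper functions are made up for illustration. By default it only prints the commands instead of touching the registry.

```python
import subprocess
from pathlib import Path

# Registry key named in the post above (HKCU = HKEY_CURRENT_USER).
XVID_KEY = r"HKCU\Software\GNU\XviD"


def export_cmd(profile_dir, name):
    """Command that saves the current xvid settings as <name>.reg."""
    return ["reg", "export", XVID_KEY, str(Path(profile_dir) / f"{name}.reg"), "/y"]


def import_cmd(profile_dir, name):
    """Command that loads a previously saved profile back into the registry."""
    return ["reg", "import", str(Path(profile_dir) / f"{name}.reg")]


def run(cmd, dry_run=True):
    """Print the command (dry run) or execute it; only works on Windows."""
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd, check=True).returncode


# Hypothetical usage: save the current settings as a "tvcapture" profile,
# then show how they would be restored later.
run(export_cmd("profiles", "tvcapture"))
run(import_cmd("profiles", "tvcapture"))
```

Set up xvid the way you want for one job, export it under a name, and import that file again before the next job of the same kind.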
totya
16th February 2008, 12:28
As a workaround: import/export your settings via regedit.
The xvid settings are stored in the registry under
HKEY_CURRENT_USER\Software\GNU\XviD
This way you can create your own profiles and reimport them before using xvid.
Thank you, this works!
If you use VirtualDub for capturing/postprocessing there is another way called "V2CRS".
I use e.g. this: VirtualVCR, and it is supported too :)
DivXko
16th February 2008, 14:05
I downloaded the latest stable Xvid 1.1.3 - http://downloads.xvid.org/downloads/xvidcore-1.1.3.zip
but I don't know how to install it
can someone help me??
_xxl
16th February 2008, 14:10
That is the source code. You have to compile it using MinGW GCC, or just download a binary from:
http://rapidshare.com/files/90468184/XviD_1.2.127-09022008.exe
or just search on doom9 forum.
Buggle
17th February 2008, 19:39
I downloaded the latest stable Xvid 1.1.3 - http://downloads.xvid.org/downloads/xvidcore-1.1.3.zip
but I don't know how to install it
can someone help me??
If you just want a stable release go to koepi.org, if you have a multicore search the latest post in this thread for a link to a 1.2 build.
But if you have to ask this question you might want to start by reading some of the guides on the main Doom9 site.
clsid
17th February 2008, 20:18
Koepi's site no longer exists.
Prettz
19th February 2008, 06:49
This is pretty much what DivX did, they invented particular algorithm for the purpose of using SSE4 with it.
Well to be fair, that's what you're supposed to do when you rewrite an algorithm to take advantage of the latest instruction set extensions. How else are you supposed to take advantage of new SIMD extensions of the instruction set other than a gigantic rewrite using an algorithm that's only awesome because of the new SIMD instructions?
Not that I could ever believe a DXN encoder is better than Xvid, since it hasn't ever been since I freaking joined Doom9.
I only expected that DXN would quickly produce a build optimized for the latest SSE extensions, since they make a profit from their codec (well, I assume) and therefore have the resources to have their devs produce this. And I don't have to have any experience in writing video encoding software (although I'd like to!) to know that producing a highly-optimized binary using a new SIMD algorithm from an already highly-optimized C/C++ algorithm is no fun at all. I've already got plenty of experience writing MMX/SSE code to know how difficult it can sometimes be to convert algorithms to different methodologies.
squid_80
19th February 2008, 07:12
Well to be fair, that's what you're supposed to do when you rewrite an algorithm to take advantage of the latest instruction set extensions. How else are you supposed to take advantage of new SIMD extensions of the instruction set other than a gigantic rewrite using an algorithm that's only awesome because of the new SIMD instructions?
No no no. They wrote a NEW algorithm (not a rewrite). When used without SSE4 it was very slow. When used with SSE4 it was much faster. However the EXISTING algorithm is already just as fast.
olnima
19th February 2008, 18:51
...VirtualVCR, this is supported too :)
Yes, but VDubProfileCreator is useless in that case (any reason not to use the newest VirtualDub for capturing?). Btw., as the name of this tool says, it does NOT do the same as creating/importing xvid profiles via regedit; it does the same for VirtualDub, but I guess you have already noticed that :). If you want, changing the code of VDubProfileCreator to use it explicitly for xvid shouldn't be too difficult (but on the other hand wouldn't give you big advantages)
Olnima
DivXko
22nd February 2008, 00:23
I found Koepi's Xvid 1.1.3 final - http://www.free-codecs.com/download/Koepi_XviD.htm
totya
22nd February 2008, 00:39
Thanks, but VirtualVCR is much better than VirtualDub.
Yes, but VDubProfileCreator is useless in that case (any reason not to use the newest VirtualDub for capturing?). Btw., as the name of this tool says, it does NOT do the same as creating/importing xvid profiles via regedit; it does the same for VirtualDub, but I guess you have already noticed that :). If you want, changing the code of VDubProfileCreator to use it explicitly for xvid shouldn't be too difficult (but on the other hand wouldn't give you big advantages)
Olnima
Ranguvar
22nd February 2008, 03:34
Thanks, but VirtualVCR is much better than VirtualDub.
"Better", along with "best" is not a good term to describe something. (rules) Either narrow down what it is "better" in, and say why, or don't use it, please.
totya
22nd February 2008, 11:28
"Better", along with "best" is not a good term to describe something. (rules) Either narrow down what it is "better" in, and say why, or don't use it, please.
Hi, if you like VirtualDub then use it. I can't write a long detailed answer, because my English is terrible. But true, VirtualVCR is not only "better", but really the best capture application - for me. That's all.
Ranguvar
22nd February 2008, 16:15
Hi, if you like VirtualDub then use it. I can't write a long detailed answer, because my English is terrible. But true, VirtualVCR is not only "better", but really the best capture application - for me. That's all.
What I'm saying is, regardless of each one's ability, the terms "better" and "best" are general terms that should not be used. Check the Rules.
olnima
22nd February 2008, 23:39
Better or not..., development of VVCR stopped years ago and VirtualDub is going on and on...
Olnima
totya
22nd February 2008, 23:55
Better or not..., development of VVCR stopped years ago and VirtualDub is going on and on...
Olnima
A lot of old software is better than new. My English is bad, but I write programs (yes I can), so I know. VirtualDub/Mod is an excellent, great, and free(!) application, I like it, but its capture function is not for me. I think this is off-topic...
but on-topic: the latest beta xvid works for me (I downloaded it from here). 1-core and 2-core modes work correctly. But while capturing, two-core mode is unusable. If CPU usage is higher than 50% in two-core mode (2 CPUs used), I get dropped frames with any capture application. I don't know why.
Sorry for my English.
olnima
23rd February 2008, 12:11
strange. I use xvid 1.2 (2 cores/2 threads) for capturing without any problems.
Olnima
P.S.:
<
but capture function not for me.
>
did You ever try to post your problems in VirtualDub-forum?
totya
3rd March 2008, 18:29
strange. I use xvid 1.2 (2 cores/2 threads) for capturing without any problems.
did You ever try to post your problems in VirtualDub-forum?
This is not my problem... VDub is typically a poor capture app, because it is very sensitive: if I do anything (e.g. open Task Manager) while capturing, I get frame drops/inserts...
Ranguvar
3rd March 2008, 22:14
That's why he asked whether you had brought this up before. VDub may be capable of better capturing, perhaps something odd was enabled on your system.
I don't know much in this field; my cap card only works with proprietary if one wants to use the hardware MPEG-2 encoder, which I do.
olnima
4th March 2008, 08:20
This is not my problem... VDub is typically a poor capture app, because it is very sensitive: if I do anything (e.g. open Task Manager) while capturing, I get frame drops/inserts...
VDub isn't more or less "sensitive" than any other capture software. Dropping/inserting frames depends on CPU consumption during capture and/or on your timing settings. You can set up VirtualDub in a way that your dropped/inserted frame counter is always 0. BUT sometimes it may take more than 10 min. to set up everything the right way. And that seems to be your problem. I really do not want you to switch to VirtualDub (and I also don't get money for this :-) ) but please, do not post such nonsense here.
Olnima
P.S.: This has gone a little bit OT, sorry.
totya
4th March 2008, 10:18
but please, do not post such nonsense
Please don't write to me if you're stupid, thx.
Edit:
1. Sorry for my poor English. I think a "nonsense post" equals "you're stupid".
2. Dropped frames DEPEND on application quality. If you don't know this, that's not my problem.
totya
4th March 2008, 10:21
That's why he asked whether you had brought this up before. VDub may be capable of better capturing, perhaps something odd was enabled on your system.
I don't know much in this field; my cap card only works with proprietary if one wants to use the hardware MPEG-2 encoder, which I do.
Thx, but my dvd player is divx/xvid capable, not need for me mpeg2 output. Mpeg4 result is smaller file. VDub is not my problem, anyone say this app. Thx.
Ranguvar
4th March 2008, 23:56
totya, the second thing I posted had nothing to do with you, just random chatter.
And I was explaining what he said...
CaMoTblku_OnToM
18th March 2008, 18:31
Hi! Plz help
i cant use 100% CPU load for compression, CPU usage when i compress video in virtualdub about 73-77% with normal or even higher priority :(
XviD_1.2.127-09022008
VirtualDub 1.7.8
threads was set 2
Pentium E2160 4Gb Windows XP SP2 x86 + all updates
Buggle
18th March 2008, 19:44
Hi! Plz help
i cant use 100% CPU load for compression, CPU usage when i compress video in virtualdub about 73-77% with normal or even higher priority :(
XviD_1.2.127-09022008
VirtualDub 1.7.8
threads was set 2
Pentium E2160 4Gb Windows XP SP2 x86 + all updates
Then try to set the amount of threads to 3, that will max your CPU usage to 100%. The problem is, however, that this will not raise (and may even lower) your encoding speed, according to some.
I personally have not yet tested the difference, and I do not really care that much, since I let it encode overnight or when I am at work. Having it at a slightly lower usage means less power consumed.
Ranguvar
18th March 2008, 22:08
And if you are using AviSynth input, multithread that. (MT)
CaMoTblku_OnToM
22nd March 2008, 23:31
Then try to set the amount of threads to 3, that will max your CPU usage to 100%.
setting threads to 3 doesn't solve the problem :(
totya
23rd March 2008, 01:31
setting threads to 3 doesn't solve the problem :(
If you use an avs script with filters, or VirtualDub with built-in filters, this is normal... see the message from Ranguvar, but MT is complicated (you need to switch the MT mode manually depending on the filter used).
CaMoTblku_OnToM
23rd March 2008, 11:53
If you use an avs script with filters, or VirtualDub with built-in filters
I don't use AviSynth, and I tested without built-in filters in VirtualDub, in fast recompress mode with no audio; the result is the same :(
also i tested x264-vfw which works right and loads CPU 100% with or without built-in filters in virtual dub
totya
23rd March 2008, 13:45
Hi! Plz help
i cant use 100% CPU load for compression, CPU usage when i compress video in virtualdub about 73-77% with normal or even higher priority :(
Sorry, I didn't read your problem carefully :) 73-77% CPU usage is good. I get 70-90% too. CPU usage is higher if the (xvid) compression quality is higher, and/or you use avs input (with vdub). 100% CPU usage is very hard to reach on multiple CPUs; ask the programmers.
Buggle
24th March 2008, 17:05
setting threads to 3 doesn't solve the problem :(
Strange. Maybe check the power options of your computer? Once I had my laptop set to max 50% usage or something like that and it wasn't exactly that fast... Come to think of it, it might also be a heat problem, it might be an input buffer problem, or it might be something else I cannot think of :P. Have you looked in the Task Manager to see if there is a clear difference between cores?
clsid
24th March 2008, 17:32
Maybe harddrive access (reading/writing) is the bottleneck.
SoRiX
28th March 2008, 20:00
:thanks::thanks::thanks::thanks::thanks::thanks::thanks::thanks::thanks:
Hi,
Thanks for this release, it works fine for me with AutoGK, StaxRip and VirtualDub and it is much faster than the "SingleCore variant" 1.1.3 and again --> :thanks:
I don't notice any quality or compatibility differences compared to the release from www.xvid.org
thx from Germany :)
KML
29th March 2008, 23:14
http://img101.imageshack.us/img101/2594/adszwd1.png
Hi friends My processor is "PentiumD 3.00 GHZ"
And as you can see in my xvid settings, I can't change "number of threads"..
Can i use multithreading or not?
Ranguvar
30th March 2008, 01:34
Do you have a multithreaded Xvid? v1.2.x SMP?
KML
30th March 2008, 03:04
i have this one
http://www.free-codecs.com/Koepi_XviD_download.htm
(1.1.3 Final)
Zarxrax
30th March 2008, 03:30
Out of curiosity, what keeps xvid 1.2.x from replacing 1.1.3 as a stable release? Are there any problems with it?
Lenny_Nero
30th March 2008, 03:44
i have this one (1.1.3 Final)
That is the single-threaded version; try Xvid v1.2.x SMP, as said.
...and as to
what keeps xvid 1.2.x from replacing 1.1.3 as a stable release? Are there any problems with it?
It's not the same branch as I understand it, so it can't replace 1.1.3, but I have not found anything that causes problems; in fact the Jan 2008 version I am using is pushing out some very good encodes at around 10 to 20 per night.
If you have multi chip/core use it.
To CaMoTblku_OnToM
I have found that the hard drives make a big difference to the CPU load, so much so that I can go from 60~70% when working on drives that are 80+% full to 97.3% when using defragmented drives that are 10% full, and even better when writing out to my arrays.
hajj_3
10th May 2008, 20:33
mirror for: XviD_1.2.127-29032008
http://www.sendspace.com/file/flcbir
devil-strike
13th May 2008, 11:55
I have a question: I have downloaded the latest SMP 1.2 build from 3 Apr 2008, and I have a quad-core AMD 9600 2.3 GHz @ 2.6 GHz, but if I use 0 threads then it is faster than 4 threads. Is this a bug or am I doing something wrong?
PS: sorry for my bad English.
Ranguvar
13th May 2008, 12:25
0 is automatic, and it tends to use more threads than you have cores in order to get the most performance. So no, not a bug.
Try manual values higher than 4 though... on my Q6600, 6 gets better performance than both 0 and 4. Any higher and speed decreases.
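Since the best thread count clearly differs per machine, timing a short test encode for each value is the only reliable way to pick one. Below is a rough sketch of such a comparison. The -threads and -i options are the ones used earlier in this thread; everything else (the function names, the input.avs clip, running xvid_encraw with no other options) is an assumption for illustration, so adjust the command to your real settings.

```python
import time


def build_cmd(threads, script="input.avs", exe="xvid_encraw"):
    """Command line for one test encode (hypothetical minimal option set)."""
    return [exe, "-i", script, "-threads", str(threads)]


def benchmark(thread_counts, frames, run, clock=time.perf_counter):
    """Time one encode per thread count and return {threads: frames per second}.

    `run` is any callable that executes a command list, for example
    lambda cmd: subprocess.run(cmd, check=True).
    `frames` is the number of frames in the test clip.
    """
    results = {}
    for n in thread_counts:
        start = clock()
        run(build_cmd(n))
        elapsed = clock() - start
        results[n] = frames / elapsed if elapsed > 0 else float("inf")
    return results


def best(results):
    """Thread count with the highest measured fps."""
    return max(results, key=results.get)
```

To use it for real, pass a runner that calls subprocess.run and a clip short enough to encode in a minute or so, e.g. benchmark([0, 2, 3, 4, 6], frames, runner), and compare the fps figures rather than the CPU usage in Task Manager.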
devil-strike
13th May 2008, 16:41
0 is automatic, and it tends to use more threads than you have cores in order to get the most performance. So no, not a bug.
Try manual values higher than 4 though... on my Q6600, 6 gets better performance than both 0 and 4. Any higher and speed decreases.
Thanks, 3 is the best number for me: both passes run at 90 to 100 fps, whereas 0/1/4 give me a slower rate on the second pass, from 60 to 85 fps.
Buggle
13th May 2008, 21:00
Thanks, 3 is the best number for me: both passes run at 90 to 100 fps, whereas 0/1/4 give me a slower rate on the second pass, from 60 to 85 fps.
Then another question: where did you get the april build?
Hogan77
14th May 2008, 10:49
Then another question: where did you get the april build?
Try here: http://www.koepi.info/
Buggle
16th May 2008, 12:17
Try here: http://www.koepi.info/
Yeah, I know that, but that one's based on old code with a VAQ patch, if I understand correctly. There must have been some work done in the meantime, implying that the latest build posted in this thread is more recent than his, even though it's been stamped april. Or am I not understanding Koepi correctly and did he use the latest codebase for that build?
Lenny_Nero
18th May 2008, 00:52
Yep, when I looked into it the code seemed to be from some early 2006 CVS.
It would be nice to find an easy way to get the later builds, I dont have a box set up to do any M$ builds ATM, but I do want to try out the VAQ patch.
kandrey89
23rd June 2008, 02:31
Could someone make the latest v1.2 build with VAQ patch? As I understand it, Koepi's build is 2 years old.
Thanks
Ranguvar
23rd June 2008, 19:09
There have been VERY few changes since then... but there are builds in this thread. Look back a little.
clsid
23rd June 2008, 21:47
Yep, last time I checked the last source code modification dated back to september 2007.
kandrey89
24th June 2008, 00:28
There have been VERY few changes since then... but there are builds in this thread. Look back a little.
I see other builds in this thread :helpful: , I'm not blind, but are they with VAQ??????????????
If so which one? :mad: