x264 compiling error with Visual Studio 2008
mehmeto
29th November 2008, 20:24
I get this error while compiling the latest x264 tarball:
error PRJ0019: A tool returned an error code from "Assembly d:\x264\x264-snapshot-20081128-2245\x264-snapshot-20081128-2245\common\x86\x86inc.asm"
Assuming that this is an nasm related error I placed nasm and yasm in every possible path but still get this error.
I feel like an Idiot, please help...
MasterNobody
29th November 2008, 20:26
Rename yasm.exe to nasm.exe, or modify the project files so they use yasm instead of nasm.
kemuri-_9
29th November 2008, 20:27
Rename yasm.exe to nasm.exe, or modify the project files so they use yasm instead of nasm.
you do realize, you said two contradictory things?
the latter was correct of course.
mehmeto
29th November 2008, 20:33
Wow, that worked. Many thanks. Wished that this was documented earlier though...
MasterNobody
29th November 2008, 20:33
kemuri-_9
There is no contradiction. I suggested two options:
1) Don't change current project files and substitute nasm with yasm (rename yasm.exe binary as nasm.exe)
OR
2) Find and replace all occurrences of "nasm.exe" with "yasm.exe" in project files.
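Option 2 can be scripted. The following is a minimal Python sketch and not part of x264: the function names are invented for illustration, and the `*.vcproj` pattern and the `build/win32` path mentioned in the comment are assumptions about where the Visual Studio project files live. It works on raw bytes, so single-byte and UTF-8 project files are handled without decoding.

```python
from pathlib import Path


def switch_assembler(data: bytes) -> bytes:
    """Replace every occurrence of nasm.exe with yasm.exe in project-file bytes."""
    return data.replace(b"nasm.exe", b"yasm.exe")


def patch_project_files(root: str, pattern: str = "*.vcproj") -> list:
    """Rewrite matching files under `root` in place and return the changed paths.

    The "*.vcproj" default is an assumption about the project file extension.
    """
    changed = []
    for path in Path(root).rglob(pattern):
        original = path.read_bytes()
        patched = switch_assembler(original)
        if patched != original:
            path.write_bytes(patched)
            changed.append(path)
    return changed


if __name__ == "__main__":
    # In-memory demonstration only; no files are touched here.
    # The command line below is a made-up sample, not copied from x264.
    sample = b'CommandLine="nasm.exe -f win32 -o out.obj in.asm"'
    print(switch_assembler(sample).decode("ascii"))
    # To patch real files (hypothetical path): patch_project_files("build/win32")
```

Option 1 (renaming the binary) needs no script, but as noted later in the thread it can confuse other builds that expect the real nasm.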
LoRd_MuldeR
29th November 2008, 20:36
Why does the project still hold "nasm.exe" after nasm has been dropped recently? :confused:
kemuri-_9
29th November 2008, 20:38
@MasterNobody
oic, so rather than actually staying logically realistic and keeping yasm as yasm and fixing the project settings,
you actually suggested what could totally fubar building of other programs.
@Lord_MuldeR
cuz it hasn't been updated in the repository, since none of the devs use it.
MasterNobody
29th November 2008, 20:44
LoRd_MuldeR
Because the x264 developers don't care about the Visual Studio project files. They modify them only when somebody sends them a patch (here it is: http://stashbox.org/309620/x264_vs_nasm2yasm.diff).
Dark Shikari
29th November 2008, 23:05
LoRd_MuldeR
Because the x264 developers don't care about the Visual Studio project files. They modify them only when somebody sends them a patch (here it is: http://stashbox.org/309620/x264_vs_nasm2yasm.diff).
applied
cogman
30th November 2008, 00:19
So how does the MSVC compiler compare to the mingw-g++ compiler? Better, worse, or the same?
LoRd_MuldeR
30th November 2008, 00:29
So how does the MSVC compiler compare to the mingw-g++ compiler? Better, worse, or the same?
Most important is that many (not all) assembler functions are disabled in MSVC builds, due to stack alignment problems in MSVC.
I think all the SSE code is disabled for MSVC, resulting in working but significantly slower builds. Or has this ever been fixed?
Dark Shikari
30th November 2008, 00:31
Most important is that many (not all) assembler functions are disabled in MSVC builds, due to stack alignment problems in MSVC.
I think all the SSE code is disabled for MSVC, resulting in working but significantly slower builds. Or has this ever been fixed?
All SSE code that requires an aligned stack is disabled. This means SSE2 deblocking and SSE2 hadamard_ac. Not sure what else.
LoRd_MuldeR
30th November 2008, 00:35
All SSE code that requires an aligned stack is disabled. This means SSE2 deblocking and SSE2 hadamard_ac. Not sure what else.
What about the solutions suggested here?
http://forum.doom9.org/showpost.php?p=1217760&postcount=1409
Dark Shikari
30th November 2008, 00:36
What about the solutions suggested here?
http://forum.doom9.org/showpost.php?p=1217760&postcount=1409
That post is completely unrelated to the current topic...
LoRd_MuldeR
30th November 2008, 00:38
That post is completely unrelated to the current topic...
Is it? Isn't it about methods to ensure stack alignment for both compilers, MinGW and MSVC ? :confused:
Dark Shikari
30th November 2008, 00:54
Is it? Isn't it about methods to ensure stack alignment for both compilers, MinGW and MSVC ? :confused:
"Malloc" and "stack" don't belong in the same sentence.
LoRd_MuldeR
30th November 2008, 01:01
"Malloc" and "stack" don't belong in the same sentence.
Well, if there is an alignment bug with the stack, but aligned malloc works properly, we can simply store the data in a malloc'ed memory area instead of stack, right?
After all, both malloc'ed memory and the stack are simply areas of main memory. Only the methods of allocation/deallocation differ (malloc/free vs. push/pop).
You could even malloc an area of memory and keep the pointer to it in the stack. No idea how useful that would be for a real implementation, just speculating...
Dark Shikari
30th November 2008, 01:11
Well, if there is an alignment bug with the stack, but aligned malloc works properly, we can simply store the data in a malloc'ed memory area instead of stack, right?
Have fun calling malloc from assembly functions that need temporary space on the stack. Plus, good thing malloc and free are fast, and take only one or two clock cycles, right? ;)
Oh, and have fun calling malloc to store function arguments, too.
(It would be far faster to include workarounds for a misaligned stack than to call malloc even once.)
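To make "workarounds for a misaligned stack" concrete: one generic technique is to reserve up to 15 bytes more than needed and round the start address up to the next 16-byte boundary. This is an assumption for illustration; the thread does not say which workaround x264 would use. The sketch below only models the address arithmetic in Python (real code would do this in C or assembly), and the address is made up.

```python
ALIGNMENT = 16  # bytes; aligned SSE2 loads/stores such as movdqa need this


def is_aligned(address: int, alignment: int = ALIGNMENT) -> bool:
    """True if `address` is a multiple of `alignment` (a power of two)."""
    return (address & (alignment - 1)) == 0


def align_up(address: int, alignment: int = ALIGNMENT) -> int:
    """Round `address` up to the next multiple of `alignment` (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)


if __name__ == "__main__":
    stack_slot = 0x0012FF74  # made-up address that is only 4-byte aligned
    aligned = align_up(stack_slot)
    padding = aligned - stack_slot  # bytes wasted to reach the boundary
    print(hex(stack_slot), "aligned:", is_aligned(stack_slot))
    print(hex(aligned), "aligned:", is_aligned(aligned), "padding:", padding)
```

The cost is a couple of integer instructions and at most 15 wasted bytes per buffer, which is why it compares so favourably with a malloc call.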
LoRd_MuldeR
30th November 2008, 01:13
Oh, and have fun calling malloc to store function arguments, too.
You'd have to use indirection via pointers I guess (store data in malloc'ed area, pass the pointers to the function on the stack).
But it's too ugly and too inefficient, I see... :o
burfadel
30th November 2008, 07:50
If MSVC outputs such an inefficient compiled version, and the project files aren't being developed, wouldn't it be better to disable support for it altogether?
kemuri-_9
30th November 2008, 08:36
If MSVC outputs such an inefficient compiled version, and the project files aren't being developed, wouldn't it be better to disable support for it altogether?
though it's not generally being developed, it works and is all some people care to compile with.
and running some quick benches...
the msvc version is about 55-60% as fast as the gcc version on my phenom.
so for the people who only want to compile in msvc (and in intel's compiler as it has the same problem apparently), it's a pretty big speed hit.
burfadel
30th November 2008, 11:13
If it were only a few percent it would be different, but I don't think it's a good example of x264's speed if there are MSVC/ICL versions floating around!
roozhou
30th November 2008, 12:43
MSVC makes it easy to debug.
kemuri-_9
30th November 2008, 17:23
MSVC makes it easy to debug.
gcc w/ -g and gdb is just as easy.
akupenguin
30th November 2008, 18:23
the msvc version is about 55-60% as fast as the gcc version on my phenom.
No way. All the asm combined is a factor of 3-4. There's only a few functions affected by stack alignment, and they aren't sad or satd. msvc-compiled C isn't 1.7x slower than gcc either.
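The arithmetic behind the "1.7x" figure: if one build runs at 55-60% of the speed of another, the other build is 1/0.55 to 1/0.60 times as fast. A small Python check (the function name is just for illustration):

```python
def slowdown_factor(relative_speed: float) -> float:
    """How many times faster the reference is, given speed as a fraction of it."""
    return 1.0 / relative_speed


if __name__ == "__main__":
    for percent in (55, 60):
        factor = slowdown_factor(percent / 100)
        print(f"{percent}% as fast -> reference is {factor:.2f}x faster")
```

So "55-60% as fast" corresponds to a factor of roughly 1.67 to 1.82, which is the claim being disputed here.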
MasterNobody
30th November 2008, 20:16
akupenguin
I think he compiled the MSVC build without pthreads (because the current project files don't use it) and the MinGW build with pthreads, and tested them on a 2-core PC.
kemuri-_9
30th November 2008, 20:43
@MasterNobody, no the original quick comparisons were single threaded because i didn't feel like building pthreads for msvc.
hmm... well, 'quick' was fairly quick, and not to mention lazy...
(gcc ndebug vs MSVC debug - was too lazy to switch msvc to release and actually optimize everything)
so then, running a fully fair bench:
gcc - ./configure (pthreadless, gpacless); make fprofiled
MSVC - switch to Release config (pthreadless, gpacless) - then altered to actually optimize everything
(/O2 /Ob2 /Oi /Ot /Oy /GT /GL /ltcg /arch:SSE2) (a lot to change <_<)
they came out really close on the average,
here's the makefile i used to run the benches:
#taken from x264 make fprofile
OPT0 = --crf 30 -b1 -m1 -r1 --me dia --no-cabac --pre-scenecut --direct temporal --no-ssim --no-psnr
OPT1 = --crf 16 -b2 -m3 -r3 --me hex -8 --direct spatial --no-dct-decimate
OPT2 = --crf 26 -b2 -m5 -r2 --me hex -8 -w --cqm jvt --nr 100
OPT3 = --crf 18 -b3 -m9 -r5 --me umh -8 -t1 -A all --mixed-refs -w --b-pyramid --direct auto --no-fast-pskip
OPT4 = --crf 22 -b3 -m7 -r4 --me esa -8 -t2 -A all --mixed-refs
OPT5 = --frames 50 --crf 24 -b3 -m9 -r3 --me tesa -8 -t1 --mixed-refs
OPT6 = --frames 50 -q0 -m9 -r2 --me hex -Aall
OPT7 = --frames 50 -q0 -m2 -r1 --me hex --no-cabac
run:
	$(foreach BIN, $(BINS), $(BIN) --version > $(BIN)_bench.log ;)
	$(foreach V, $(VIDS), $(foreach I, 0 1 2 3 4 5 6 7, $(foreach BIN, $(BINS), echo VID=$(V) >> $(BIN)_bench.log;\
	echo $(OPT$I) >> $(BIN)_bench.log; $(BIN) $(OPT$I) $(V) -o NUL 2>&1 | tee -a $(BIN)_bench.log ;)))
here's those .logs (can't put them in here - too long of a post)
x264_gcc_bench.log (http://kemuri9.net/dev/x264/other/x264_gcc_bench.log)
x264_msvc_bench.log (http://kemuri9.net/dev/x264/other/x264_msvc_bench.log)
MSVC debug profile had a much larger speed impact than i had anticipated it seems.
(apparently too used to gcc having fairly equivalent speeds for debug to ndebug)
so sorry for being misinformative.
roozhou
2nd December 2008, 17:47
I compiled x264 under VS 2005 and it was 5%~10% slower than my GCC 4.3.2 fprofiled build. The .exe is 600kb while gcc build is 1.02mb.
And only four SSE2/SSSE3 functions cannot be enabled with MSVC due to stack alignment.
LoRd_MuldeR
2nd December 2008, 17:49
gcc build is 1.02mb.
Try stripping the binary, my x264.exe compiled with MinGW/GCC is only 722 KB in size. Also try fprofiled, if you didn't already...
roozhou
2nd December 2008, 17:58
Try stripping the binary, my x264.exe compiled with MinGW is only 722 KB in size. Also try fprofiled, if you didn't already...
strip x264.exe
The file size remains 1.02mb.
I should mention that the GCC build uses msvcrt.dll while the MSVC build uses a statically linked CRT. It is quite a pity that MinGW does not have a static CRT library, so all MinGW-built binaries link to an M$ CRT DLL even for something as simple as "Hello world".
LoRd_MuldeR
2nd December 2008, 18:04
It is quite a pity that MinGW does not have a static CRT library and all MinGW-built binaries link to a M$ CRT dll even if it is as simple as "Hello world".
Why? Dynamic linking to msvcrt.dll makes the binary smaller than static linking. And it has no drawback, as msvcrt.dll is present on any Windows system anyway (except for ancient versions maybe).
roozhou
2nd December 2008, 19:01
Why? Dynamic linking to msvcrt.dll makes the binary smaller than static linking. And it has no drawback, as msvcrt.dll is present on any Windows system anyway (except for ancient versions maybe).
But now GCC gives larger binary. How do you strip x264.exe? And how does fprofile work?
P.S. I can make 1kb exe with MSVC, but so far impossible with GCC.
LoRd_MuldeR
2nd December 2008, 19:23
But now GCC gives larger binary. How do you strip x264.exe? And how does fprofile work?
Most likely the binary is bigger because of stronger optimization. For example "loop-unrolling" makes the binary bigger, but faster.
Also you should check your MSVC binary in Dependency Walker (http://www.dependencywalker.com/) to make sure it really isn't linked against any DLLs, for example msvcr80.dll or the like.
You can strip a binary via "strip x264.exe" command in MSYS. If it was compiled with debug symbols, they'll be removed.
And you run fprofile like this:
make fprofiled VIDS="sample.avs"
Make sure the sample.avs points to a suitable sample. Doesn't need to be that long. And don't forget "make clean" before re-compile ;)
P.S. I can make 1kb exe with MSVC, but so far impossible with GCC.
And what's the point? You can get cheap terabyte HDDs nowadays and most people have broadband internet access.
IMO the size of a binary doesn't matter much. Optimizations and performance do matter. I'd always prefer a bigger, but faster binary!
roozhou
2nd December 2008, 19:54
Most likely the binary is bigger because of stronger optimization. For example "loop-unrolling" makes the binary bigger, but faster.
Also you should check your MSVC binary in Dependency Walker (http://www.dependencywalker.com/) to make sure it really isn't linked against any DLL's - for example msvcr80.dll or alike.
Yes, with the /MT switch it uses the static CRT. Actually I hate those programs which link to msvcr80/msvcr90 (e.g. latest ffdshow-mt from xvidvideo.ru).
You can strip a binary via "strip x264.exe" command in MSYS. If it was compiled with debug symbols, they'll be removed.
Seems that my binary has no debug symbols.
And you run fprofile like this:
make fprofiled VIDS="sample.avs"
Make sure the sample.avs points to a suitable sample. Doesn't need to be that long. And don't forget "make clean" before re-compile ;)
I do use fprofile. I wonder why we use fprofile and how gcc works with fprofile.
And what's the point? You can get cheap terabyte HDDs nowadays and most people have broadband internet access.
IMO the size of a binary doesn't matter much. Optimizations and performance do matter. I'd always prefer a bigger, but faster binary!
Max filesize of attachment on doom9 is 200kb.
kemuri-_9
2nd December 2008, 20:05
Seems that my binary has no debug symbols.
debugging symbols are only added with the -g flag in gcc and x264's ./configure does this for you when you specify --enable-debug
I do use fprofile. I wonder why we use fprofile and how gcc works with fprofile.
fprofiling is the act of running a program and tracing the code paths that have been run and how often they were run in order to better optimize those code paths.
Max filesize of attachment on doom9 is 200kb.
maximums have to be set to keep the forum's size reasonable, use a file or web host somewhere like everyone else.
LoRd_MuldeR
2nd December 2008, 20:14
Also in case the size of a binary is that important for you, simply UPX it ;)
I doubt you will ever get x264 below 200 KB though. My MinGW compiled and UPX'd x264.exe is still 260 KB in size.
Since there are more than enough ways to host your files for free, there is no need to use Doom9 (and having to wait for validations by a mod).
BTW: In my experience fprofiled makes the binary smaller (less aggressive optimizations in rarely used paths).
kemuri-_9
2nd December 2008, 20:35
BTW: In my experience fprofiled makes the binary smaller (less aggressive optimizations in rarely used paths).
r1046: same ./configure options,
fprofiled: 1,103,890 bytes in size (~1.05 MB)
non: 1,030,144 bytes in size (~.98 MB)
fprofiled is bigger.
be sure not to compare the profiling (-fprofile-generate) build to the profiled (-fprofile-use) build
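To spell out the distinction between the two builds: a GCC profile-guided build has an instrumented pass (`-fprofile-generate`), a training run of that instrumented binary, and a final pass (`-fprofile-use`). Only the last binary is the "profiled" one whose size is worth comparing. The Python sketch below just assembles the three command lines for a hypothetical single-file program; `demo.c`, `demo` and `-O2` are placeholders, not what x264's Makefile actually passes.

```python
def pgo_build_steps(source: str = "demo.c", binary: str = "demo") -> list:
    """Return the three command lines of a two-pass GCC profile-guided build."""
    return [
        # Pass 1: instrumented ("profiling") build that records execution counts.
        ["gcc", "-O2", "-fprofile-generate", source, "-o", binary],
        # Training run: executing the instrumented binary writes the profile data.
        ["./" + binary],
        # Pass 2: final ("profiled") build that reads the recorded counts.
        ["gcc", "-O2", "-fprofile-use", source, "-o", binary],
    ]


if __name__ == "__main__":
    # Only prints the commands; nothing is compiled or executed here.
    for step in pgo_build_steps():
        print(" ".join(step))
```

The instrumented binary from pass 1 carries extra counting code, so it is expected to be larger and slower than either a plain build or the final profiled build.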
LoRd_MuldeR
2nd December 2008, 20:38
fprofiled is bigger.
libx264-65.dll r1042
./configure --enable-shared --extra-cflags="-march=pentium2"
gcc.exe (4.3.2-tdm-1 for MinGW) 4.3.2
fprofiled: 697 KB (713.757 bytes)
none: 825 KB (844.800 bytes)
not here ;)
kemuri-_9
2nd December 2008, 20:39
that would be why, i'm going on the binary, you're going on the library. different concepts.
LoRd_MuldeR
2nd December 2008, 20:41
that would be why, i'm going on the binary, you're going on the library. different concepts.
Shouldn't make a difference though. Total size will be different, of course. But the effect should be the same. Why shouldn't it?
Shinigami-Sama
2nd December 2008, 20:43
Shouldn't make a difference though. Why should it?
cli is lib + exe stuffage
lib is lib
LoRd_MuldeR
2nd December 2008, 20:51
x264.exe r1042
./configure --extra-cflags="-march=pentium2"
gcc.exe (4.3.2-tdm-1 for MinGW) 4.3.2
fprofiled: 825 KB (844.800 bytes)
none: 837 KB (857.600 bytes)
cli is lib + exe stuffage
lib is lib
You'd need to re-compile the lib's too, of course. My build doesn't include mp4 (gpac) output, which may explain the difference to kemuri's results...
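For reference, the size figures posted so far can be put side by side as relative changes. This Python sketch uses only the byte counts quoted in the posts above; the labels and the function name are for illustration.

```python
# (fprofiled bytes, non-fprofiled bytes) as quoted in the thread.
BUILDS = {
    "x264.exe r1046 (kemuri-_9)": (1_103_890, 1_030_144),
    "libx264-65.dll r1042 (LoRd_MuldeR)": (713_757, 844_800),
    "x264.exe r1042 (LoRd_MuldeR)": (844_800, 857_600),
}


def percent_change(fprofiled: int, plain: int) -> float:
    """Size change of the fprofiled build relative to the plain build, in percent."""
    return (fprofiled - plain) / plain * 100.0


if __name__ == "__main__":
    for label, (fprofiled, plain) in BUILDS.items():
        print(f"{label}: {percent_change(fprofiled, plain):+.1f}%")
```

On these numbers the fprofiled binary is about 7% larger in one setup and about 1.5% to 15.5% smaller in the other, so the direction depends on the compiler version and build options.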
weaker
2nd December 2008, 21:50
Just out of curiosity: Did anyone try a profile guided optimization with VS2005/VS2008 (Professional and TeamSuite only)?
LoRd_MuldeR
2nd December 2008, 22:08
Just out of curiosity: Did anyone try a profile guided optimization with VS2005/VS2008 (Professional and TeamSuite only)?
The question is: Does anybody own a copy of that 630€ (http://shop.cancom.de/online/53/wa/TendiDL/getArticleDetail?id=2536672&shop=microsoft) (800$) software? :D
RadScorpion
2nd December 2008, 22:27
Great thread. I was just about to ask how to build x264 properly with MSVC because I'm getting strange crashes which only happen with my MSVC builds and not with linux builds. I would like to use MSVC so that I can use static linking to build a nice standalone encoder filter.... btw. (I haven't had much time to play) Is it possible to link mingw static libraries with msvc projects ?
(Yepp - I do own a legal copy of VS2005 ;) )
clsid
2nd December 2008, 23:15
Yes, that is possible. MPC-HC is an example of a MSVC project that links with some mingw compiled libs (libavcodec, libgcc.a, libmingwex.a).
weaker
3rd December 2008, 01:21
@LordMulder: As a computer science student that is no problem if your university participates in the MSDNAA programme. It is 0€ software then. You can even get the team edition. Just for clarification. A link in German: http://www.microsoft.com/germany/bildung/infopool/msdnaa.mspx
LoRd_MuldeR
3rd December 2008, 01:23
@LordMulder: As a computer science student that is no problem if your university participates in the MSDNAA programme. It is 0€ software then. You can even get the team edition. Just for clarification. A link in German: http://www.microsoft.com/germany/bildung/infopool/msdnaa.mspx
In fact I have MSDNAA access and could download "Visual Studio 2005 Professional" from the image server at any time ;)
However it seems "Visual Studio 2008" is not available yet. Also there is no "Team Edition" available on the server.
Maybe I'll give it a try when I have some free time. Currently my experience with Visual Studio is zero. I use MinGW/MSYS now ...
Maccara
3rd December 2008, 01:26
The question is: Does anybody own a copy of that 630€ (http://shop.cancom.de/online/53/wa/TendiDL/getArticleDetail?id=2536672&shop=microsoft) (800$) software? :D
Heh, I hope no-one actually would buy a standalone version at those prices, as full msdn licensing is available at far lower prices. :)
Anyway, I tried profile-guided optimization with VS2005 a long time ago. At that time the build without PGO was slightly slower than the GCC build, and the build with PGO was a few percent (~2-3) faster than the GCC version. PGO itself made somewhat less than 10% difference at the time (I don't remember the exact figures anymore). I had mp4 & pthreads enabled.
However, that was very many revisions ago and I have no idea how it would fare with the current revisions. I was actually only testing it because of PGO and wanted to see if it would make a difference in this case.
Hmm. Just tried "svn update" on the x264 sources I had lying around and it doesn't seem to work, so I guess I won't re-run the tests with VS2008. ;) (I use the x264 version which comes with megui updates nowadays anyway, as most of the patches I needed back when I compiled it myself are now included)
Edit: Oops, no wonder svn does not work - git needed. Seems I've been "a little" out of touch with x264 development... :)
kemuri-_9
3rd December 2008, 02:34
My school's a part of the msdnaa and i have the .net studio pro 2008 version, didn't get the team one tho it's available.
@Maccara:
x264 hasn't been on subversion since early March 2008,
pretty out of touch if you haven't noticed that change ;)
but i haven't technically tried PGO on here,
but i guess i can give it a whirl for testing's sake...
doing PGO with the same profiling paths that GCC fprofiles with,
then comparing them against each other on those same settings again on some video:
yuv4mpeg: 640x480@30/1fps, 0:0
the profiling paths (series of settings) are:
--crf 30 -b1 -m1 -r1 --me dia --no-cabac --pre-scenecut --direct temporal --no-ssim --no-psnr
--crf 16 -b2 -m3 -r3 --me hex -8 --direct spatial --no-dct-decimate
--crf 26 -b2 -m5 -r2 --me hex -8 -w --cqm jvt --nr 100
--crf 18 -b3 -m9 -r5 --me umh -8 -t1 -A all --mixed-refs -w --b-pyramid --direct auto --no-fast-pskip
--crf 22 -b3 -m7 -r4 --me esa -8 -t2 -A all --mixed-refs
--frames 50 --crf 24 -b3 -m9 -r3 --me tesa -8 -t1 --mixed-refs
--frames 50 -q0 -m9 -r2 --me hex -Aall
--frames 50 -q0 -m2 -r1 --me hex --no-cabac
GCC
x264 0.65.1046 71d34b4
built on Dec 2 2008, gcc: 3.4.5 (mingw-vista special r3)
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSEMisalign
encoded 980 frames, 60.14 fps, 1422.32 kb/s
encoded 980 frames, 25.29 fps, 5194.67 kb/s
encoded 980 frames, 22.32 fps, 1912.17 kb/s
encoded 980 frames, 4.68 fps, 4150.89 kb/s
encoded 980 frames, 3.77 fps, 2551.43 kb/s
encoded 50 frames, 3.81 fps, 2618.73 kb/s
encoded 50 frames, 4.60 fps, 44098.44 kb/s
encoded 50 frames, 28.84 fps, 48403.27 kb/s
MSVC
x264 0.65.X
built on Dec 2 2008, using a non-gcc compiler
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSEMisalign Slow_mod4_stack
encoded 980 frames, 59.23 fps, 1422.22 kb/s
encoded 980 frames, 25.66 fps, 5194.66 kb/s
encoded 980 frames, 22.54 fps, 1912.17 kb/s
encoded 980 frames, 4.62 fps, 4150.89 kb/s
encoded 980 frames, 3.66 fps, 2551.43 kb/s
encoded 50 frames, 3.83 fps, 2618.66 kb/s
encoded 50 frames, 4.83 fps, 44098.37 kb/s
encoded 50 frames, 28.57 fps, 48403.20 kb/s
kind of flip-floppy if you ask me.
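To quantify "flip-floppy", the fps figures can be compared run by run. The Python sketch below parses the `encoded ... fps` lines exactly as posted above; the regular expression and function names are illustrative, and the parser assumes that line format only.

```python
import re

# "encoded ..." lines copied from the two result listings in the post above.
GCC_LOG = """\
encoded 980 frames, 60.14 fps, 1422.32 kb/s
encoded 980 frames, 25.29 fps, 5194.67 kb/s
encoded 980 frames, 22.32 fps, 1912.17 kb/s
encoded 980 frames, 4.68 fps, 4150.89 kb/s
encoded 980 frames, 3.77 fps, 2551.43 kb/s
encoded 50 frames, 3.81 fps, 2618.73 kb/s
encoded 50 frames, 4.60 fps, 44098.44 kb/s
encoded 50 frames, 28.84 fps, 48403.27 kb/s
"""

MSVC_LOG = """\
encoded 980 frames, 59.23 fps, 1422.22 kb/s
encoded 980 frames, 25.66 fps, 5194.66 kb/s
encoded 980 frames, 22.54 fps, 1912.17 kb/s
encoded 980 frames, 4.62 fps, 4150.89 kb/s
encoded 980 frames, 3.66 fps, 2551.43 kb/s
encoded 50 frames, 3.83 fps, 2618.66 kb/s
encoded 50 frames, 4.83 fps, 44098.37 kb/s
encoded 50 frames, 28.57 fps, 48403.20 kb/s
"""

ENCODED_LINE = re.compile(r"encoded (\d+) frames, ([\d.]+) fps, ([\d.]+) kb/s")


def fps_values(log: str) -> list:
    """Extract the fps figure from every 'encoded N frames, X fps, Y kb/s' line."""
    return [float(match.group(2)) for match in ENCODED_LINE.finditer(log)]


def speed_ratios(reference: list, candidate: list) -> list:
    """Per-run candidate/reference speed ratio; above 1.0 means candidate is faster."""
    return [c / r for r, c in zip(reference, candidate)]


if __name__ == "__main__":
    ratios = speed_ratios(fps_values(GCC_LOG), fps_values(MSVC_LOG))
    for run, ratio in enumerate(ratios):
        print(f"run {run}: MSVC/GCC = {ratio:.3f}")
    print("MSVC faster in", sum(r > 1 for r in ratios), "of", len(ratios), "runs")
```

Each compiler wins half of the eight runs, and every ratio stays within a few percent of 1.0, which is consistent with the two builds being roughly equal in speed on this sample.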
LoRd_MuldeR
3rd December 2008, 03:53
the profiling paths (series of settings) are:
--crf 30 -b1 -m1 -r1 --me dia --no-cabac --pre-scenecut --direct temporal --no-ssim --no-psnr
--crf 16 -b2 -m3 -r3 --me hex -8 --direct spatial --no-dct-decimate
--crf 26 -b2 -m5 -r2 --me hex -8 -w --cqm jvt --nr 100
--crf 18 -b3 -m9 -r5 --me umh -8 -t1 -A all --mixed-refs -w --b-pyramid --direct auto --no-fast-pskip
--crf 22 -b3 -m7 -r4 --me esa -8 -t2 -A all --mixed-refs
--frames 50 --crf 24 -b3 -m9 -r3 --me tesa -8 -t1 --mixed-refs
--frames 50 -q0 -m9 -r2 --me hex -Aall
--frames 50 -q0 -m2 -r1 --me hex --no-cabac
I wonder why the profiling paths don't use threads :confused:
Would multi-threading corrupt the profiling results? Or would it be even better to take threading into account?
At least it would make compile time a lot shorter :D
Shinigami-Sama
3rd December 2008, 04:08
I wonder why the profiling paths don't use threads :confused:
Would multi-threading corrupt the profiling results? Or would it be even better to take threading into account?
At least it would make compile time a lot shorter :D
or have JOBS = # of cores ;)
kemuri-_9
3rd December 2008, 04:08
I wonder why the profiling paths don't use threads :confused:
Would multi-threading corrupt the profiling results? Or would it be even better to take threading into account?
At least it would make compile time a lot shorter :D
common/mc.c:421: error: corrupted profile info: number of executions for edge 19-20 thought to be 391620
common/mc.c:421: error: corrupted profile info: number of executions for edge 19-21 thought to be -1
that's what happens when you try to fprofile with threads, it doesn't work.
(you can't have -1 executions)
LoRd_MuldeR
3rd December 2008, 04:23
common/mc.c:421: error: corrupted profile info: number of executions for edge 19-20 thought to be 391620
common/mc.c:421: error: corrupted profile info: number of executions for edge 19-21 thought to be -1
that's what happens when you try to fprofile with threads, it doesn't work.
(you can't have -1 executions)
I see. Is this a known bug/limitation in GCC's profiling code?
[EDIT]
In fact it seems to run through fine here. I'm using gcc.exe (4.3.2-tdm-1 for MinGW) 4.3.2.
[EDIT²]
I was wrong. The individual tests run through fine, but it crashes at the final step of the building process :o
(There are negative execution counts, as you said)
well, there's also the case in which x264 isn't compiled with pthread support. can't just assume people are going to build it in.
...and the ./configure script could disable/enable "--threads auto" in the profiling section of the Makefile, depending on whether pthread is enabled or not.
kemuri-_9
3rd December 2008, 04:31
well, there's also the case in which x264 isn't compiled with pthread support. can't just assume people are going to build it in.
btw, this is getting pretty OT
Edit:
tried with gcc (Ubuntu 4.3.2-1ubuntu11) 4.3.2 and
gcc-4.2 (GCC) 4.2.4 (Ubuntu 4.2.4-3ubuntu4)
they crashed too.
Maccara
3rd December 2008, 13:11
Would multi-threading corrupt the profiling results? Or would it be even better to take threading into account?
Btw, I did include multi-threading in the PGO when I did my tests in the past with VS2005, so the profiling was not exactly the same as GCC. Can't remember, though, if it made any difference. (I suppose it "might", as then the pthread code would be included in the profiling and maybe some overhead would be reduced - but I have no data on how much overhead pthreads cause, so this is a bit of a stretch)
roozhou
3rd December 2008, 17:08
Btw, I did include multi-threading in the PGO when I did my tests in the past with VS2005, so the profiling was not exactly the same as GCC. Can't remember, though, if it made any difference. (I suppose it "might", as then the pthread code would be included in the profiling and maybe some overhead would be reduced - but I have no data on how much overhead pthreads cause, so this is a bit of a stretch)
You can try xvid's way, replacing pthreads with native win32 api.