19th April 2007, 22:22 | #1 | Link |
A hollow voice says
Join Date: Sep 2006
Posts: 269
|
Koepi vs personal build
I had noted in the past that Koepi's build of 1.1.2 was bigger than my builds.
For example, xvidcore.dll - Koepi 748KB - mine 588KB. I just chalked it up to different compilers but assumed the output was the same. I don't normally muck around with the core (my interest has been the 2nd pass vbv stuff, which I moved to vfw.dll) so I normally use Koepi's core.dll, but I was just trying an experiment and discovered that my assumption was incorrect. For the same clip and encoder options I get somewhat different output files depending upon which xvidcore build I use.
Which begs the questions: Which one is 'right'? How do I determine that? If mine is 'wrong' how do I fix it?
I've been using the visual studio workspace/project files included with the xvid source kit. I'm using MS Visual Studio 6 with Service Pack 5, Processor Pack 5, and nasm 0.98.39. I was under the impression this was a 'correct' setup. I have noted that I do get an error about an unknown compiler option 'Qipo' but as I understand it that is an option used by the Intel C compiler, not the MS one, and is harmless - perhaps not?
Suggestions? |
20th April 2007, 01:58 | #3 | Link |
A hollow voice says
Join Date: Sep 2006
Posts: 269
|
Yeah, I figure he's using a different compiler and/or options.
The significant point is that the results are different as well. As a first take, I diff'd the .pass files for my 7800 frame clip. These were both "full quality" first passes (not using the fast approx routines). There are no differences in I frames (interval 240), scattered differences in isolated b-frames, and two segments with runs of different p and b frames - one 198 frames long, the other 210 frames long. For example: ( <! is Koepi, !> is mine ) Code:
<! b 3 0 2560 0 4104 2879
!> b 3 0 2560 0 4106 2900
<! b 3 0 2560 0 5498 4357
!> b 3 0 2560 0 5471 4337
<! b 3 0 2560 0 15629 7732
!> b 3 0 2560 0 15652 7742
<! b 3 0 2560 0 13280 6691
!> b 3 0 2560 0 13280 6690
<! b 3 0 2560 0 7909 5051
!> b 3 0 2560 0 7907 5048
<! b 3 0 2560 0 10205 5730
!> b 3 0 2560 0 10211 5734
<! b 3 0 2560 0 11031 5689
!> b 3 0 2560 0 11034 5682
<! b 3 0 2560 0 4950 2688
!> b 3 0 2560 0 4947 2688
<! b 3 0 2560 0 3489 2439
!> b 3 0 2560 0 3495 2437
Code:
<! p 2 47 2505 8 17747 5163
!> p 2 46 2506 8 17743 5161
<! b 3 0 2560 0 4003 2819
!> b 3 0 2560 0 3999 2814
<! p 2 48 2503 9 17272 5148
!> p 2 48 2503 9 17271 5145
<! b 3 0 2560 0 3695 2587
!> b 3 0 2560 0 3705 2596
<! p 2 47 2512 1 17926 5131
!> p 2 47 2512 1 17929 5132
<! b 3 0 2560 0 3206 2219
!> b 3 0 2560 0 3204 2215
<! p 2 45 2511 4 17892 4939
!> p 2 46 2510 4 17906 4939
<! b 3 0 2560 0 4197 2658
!> b 3 0 2560 0 4220 2673
<! p 2 111 2442 7 18951 5259
!> p 2 112 2441 7 18939 5250
<! b 3 0 2560 0 2699 1958
!> b 3 0 2560 0 2699 1952
<! p 2 75 2474 11 16939 4795
!> p 2 75 2474 11 16949 4799
<! b 3 0 2560 0 3496 2529
!> b 3 0 2560 0 3493 2529
<! p 2 68 2484 8 17503 4963
!> p 2 68 2484 8 17502 4963
<! b 3 0 2560 0 3120 2309
!> b 3 0 2560 0 3113 2298
<! p 2 70 2480 10 18055 4899
!> p 2 70 2480 10 18068 4894
(continues)
Resulting file sizes:
Koepi 118,568,960 bytes
mine 118,566,912 bytes
Does this provide any clues as to what / where / why they are different? How do you determine which one is "correct"?
Last edited by plugh; 20th April 2007 at 02:20. |
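For anyone wanting to repeat this kind of comparison without eyeballing diff output, here is a minimal sketch in C that parses one .pass line and reports whether two records differ. The field names are my own guesses from the samples above (a frame type letter, a quantizer, three block counts that sum to 2560 here, and two lengths); they are assumptions, not taken from the Xvid documentation.

```c
#include <stdio.h>

/* One record of a first-pass stats file. Field names are assumptions
   made for this sketch, inferred from the sample lines in the post. */
struct pass_rec {
    char type;      /* 'i', 'p' or 'b' */
    int quant;      /* second column */
    int blks[3];    /* three block counts; they sum to 2560 in the samples */
    int length;     /* assumed: frame size */
    int hlength;    /* assumed: header part of that size */
};

/* Parse one line such as "p 2 47 2505 8 17747 5163".
   Returns 1 if all seven fields were read, 0 otherwise. */
static int parse_pass_line(const char *line, struct pass_rec *r)
{
    return sscanf(line, " %c %d %d %d %d %d %d",
                  &r->type, &r->quant,
                  &r->blks[0], &r->blks[1], &r->blks[2],
                  &r->length, &r->hlength) == 7;
}

/* Returns 1 if any field differs between the two records. */
static int pass_rec_differs(const struct pass_rec *a, const struct pass_rec *b)
{
    return a->type != b->type || a->quant != b->quant ||
           a->blks[0] != b->blks[0] || a->blks[1] != b->blks[1] ||
           a->blks[2] != b->blks[2] ||
           a->length != b->length || a->hlength != b->hlength;
}
```

Reading both files line by line and counting the differing records per frame type would give the kind of summary quoted above (no I-frame differences, scattered b-frames, runs of p and b frames).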
20th April 2007, 02:44 | #4 | Link |
Registered User
Join Date: Mar 2004
Posts: 889
|
Celtic Druid's 01 Apr builds are nearly 1MB. I am using Celtic Druid's March builds because the April snapshot seems to have problems in 2-pass rate control.
It appears to be difficult to judge the "correctness" as we don't even have a reference code base (which beta, alpha or even snapshots?) nor build (maybe plain MSVC 7 or old stable GCC w/o any optimization?). |
20th April 2007, 03:08 | #5 | Link |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
If they are using exactly the same codebase, the only thing I can think of is that he's using some option like -ffast-math that doesn't abide perfectly by the ANSI C standards. It's most likely that you're using different codebases though.
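To illustrate why an option of that kind can change results at all: floating-point addition is not associative, so a compiler that is allowed to regroup operations can legitimately produce a different answer. The sketch below is a generic C illustration with made-up function names; it is not a claim about what any of the builds in this thread actually does.

```c
/* Sum three values left to right: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    volatile double t = a + b;   /* volatile forces the intermediate to be written out as a double */
    return t + c;
}

/* The same three values, regrouped: a + (b + c). */
double sum_right(double a, double b, double c)
{
    volatile double t = b + c;   /* a small c is absorbed when b is huge */
    return a + t;
}
```

With a = 1e100, b = -1e100 and c = 1.0, the first grouping keeps the 1.0 and the second loses it, even though both are "the same sum" on paper.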
|
20th April 2007, 04:16 | #6 | Link |
A hollow voice says
Join Date: Sep 2006
Posts: 269
|
The source kit is the "Xvid 1.1.2 stable release" source kit from xvid.org. The source kit includes "Generic install procedure for Win32/MSVC" instructions (from 2004), and includes MS Visual Studio workspace and project files. Assuming you have the required software, unzip it, open the workspace, select the project, and build it. Done!
I believe Koepi's build is also based upon this code base. Like I said, the only glitch I encountered is that the canned build wants to use a "Qipo" compile switch, which MSVC noted and ignored. To be fair, the bulk of the encode appears to be the same - perhaps 500 frames out of the 7800 were different. But that still seems like a lot from simply using a different compiler. |
20th April 2007, 07:29 | #8 | Link |
A hollow voice says
Join Date: Sep 2006
Posts: 269
|
Yes, from what I was able to find on the net, I believe that is an Intel C compiler option. I too thought that was odd, coming from the Xvid canned Visual Studio project files.
I have a (not-installed) copy someplace in my archives. I vaguely recall there was some kind of integration, where you could substitute it for the MS C compiler - compatible command lines etc. But the docs included in the xvid source kit don't mention it at all, just gcc and MSVC. |
21st April 2007, 17:10 | #10 | Link |
A hollow voice says
Join Date: Sep 2006
Posts: 269
|
Hmmm. So perhaps I should try using ICL and see what happens.
As there doesn't appear to be a clear answer as to which is "correct", it occurred to me to ask 'which is better'? In that context, I stumbled across this MSU Video Quality Measurement Tool which appears to be tailor made for this - allowing comparison between source (avs input) and two alternative encodes (avi input). Has anyone else used this tool? Any advice? |
22nd April 2007, 15:25 | #13 | Link |
A hollow voice says
Join Date: Sep 2006
Posts: 269
|
Koepi and celtic_druid builds produced identical results.
So it seems safe to conclude both used same sources, and both used ICL? celtic_druid, did you use the 1.1.2 source kit download zip file, or cvs? Anyone got a gcc build of 1.1.2 xvidcore.dll ? Guess I'll dig out ICL from my archives, and see what that gives me... EDIT: change 'identical' to 'nearly identical' - the .pass files *are* identical, the avi files differ by _one_ byte part way in. Last edited by plugh; 22nd April 2007 at 15:35. |
22nd April 2007, 15:40 | #14 | Link |
Registered User
Join Date: Oct 2001
Location: Melbourne, Australia
Posts: 2,171
|
Could have been zip/tar.bz or CVS. Can't remember.
http://ffdshow.faireal.net/mirror/Xv...id-1.1.2gcc.7z Should be a bunch of gcc builds for different CPU's from recollection. |
22nd April 2007, 18:28 | #15 | Link |
A hollow voice says
Join Date: Sep 2006
Posts: 269
|
As I'm using an 'Applebred' (Duron 1.8GHz) cpu, I used the gcc Athlon-XP build (and I'm not overclocking).
Results:
Comparing the Koepi and gcc-xp .pass files, I get a set of differences ("set1")
Comparing gcc-xp and my-msvc .pass files, I get a set of differences ("set2")
Comparing my-msvc and Koepi .pass files, I get a set of differences ("set3")
-->in the context of this test clip<--
set3 has the smallest number of differences, set2 the largest.
Working from smallest to largest, comparing
set3 and set1 - there are two frames in set3 not in set1 (set3 is NOT a proper subset of set1)
set1 and set2 - set1 IS a proper subset of set2
I'm using a different clip than I used above; with this clip, ALL the differences are B-frames (no P-frames like above).
I also dug out and installed ICL 9.0.28 (fairly old), recompiled using it, and compared to Koepi .pass file - only three frames different, interestingly 2 P's and 1 B, and each by only one byte in length.
Last edited by plugh; 22nd April 2007 at 18:40. |
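The subset comparisons above are easy to get wrong by hand on long diffs. A minimal sketch in C, assuming each "set" has been reduced to a sorted array of differing frame numbers with no duplicates (that representation and the function names are my own choices for illustration):

```c
#include <stddef.h>

/* Returns 1 if every element of a[] also appears in b[].
   Both arrays must be sorted ascending with no duplicates. */
static int is_subset(const int *a, size_t na, const int *b, size_t nb)
{
    size_t i = 0, j = 0;
    while (i < na && j < nb) {
        if (a[i] == b[j]) { i++; j++; }   /* matched, advance both */
        else if (a[i] > b[j]) { j++; }    /* b has an extra frame, skip it */
        else return 0;                    /* a[i] is missing from b */
    }
    return i == na;                       /* all of a must have been matched */
}

/* Proper subset: a subset that is strictly smaller. */
static int is_proper_subset(const int *a, size_t na, const int *b, size_t nb)
{
    return na < nb && is_subset(a, na, b, nb);
}
```

Calling `is_proper_subset(set1, n1, set2, n2)` would correspond to the "set1 IS a proper subset of set2" check, and the same call with set3 and set1 to the check that failed.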
22nd April 2007, 19:39 | #16 | Link |
A hollow voice says
Join Date: Sep 2006
Posts: 269
|
Another quick set of comparisons...
Same clip as above, same three builds (Koepi, my-msvc, gcc-xp)
This time, vhq=0 and 'vhq for b-frames' off (was vhq=3 and 'on' above)
my-msvc vs gcc-xp - .pass files were identical, and _one_ byte difference in avi files (same byte position as above when comparing Koepi and druid ICL builds, but different values)
Koepi vs gcc-xp, and Koepi vs my-msvc - many differences, including both P and B frames
So msvc and gcc give same results, until you turn on vhq. ICL builds give lots of differences from this common result.
(I stated this previously, but as a reminder, these were 'full quality' first passes; NOT using the fast approximation routines.)
Last edited by plugh; 22nd April 2007 at 19:54. |
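For the "one byte difference at the same position" observation, a byte-level comparison is enough. A minimal sketch in C that works on two equal-length buffers already read into memory (the function name and the buffer-based interface are assumptions made for illustration; reading the avi files is left out):

```c
#include <stddef.h>

/* Count the positions where two equal-length buffers differ.
   If first is non-NULL and there is at least one difference,
   *first receives the offset of the earliest one. */
static size_t count_diff_bytes(const unsigned char *a, const unsigned char *b,
                               size_t n, size_t *first)
{
    size_t i, count = 0;
    for (i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            if (count == 0 && first != NULL)
                *first = i;       /* remember only the earliest offset */
            count++;
        }
    }
    return count;
}
```

Running this over each pair of encodes and comparing the reported offset would confirm whether the single differing byte really sits at the same position in both comparisons.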
22nd April 2007, 23:35 | #17 | Link |
A hollow voice says
Join Date: Sep 2006
Posts: 269
|
More comparisons between msvc and gcc build...
They continue to produce identical results, across the entire range of vhq settings, as long as "VHQ for B-frames" is off. Looks like there is something in estimation_rd_based_bvop.c and/or its unique subordinates that causes msvc/gcc to differ. These tests also lead to the conclusion that the ICL builds are BORKED. I'm continuing to compare behaviour on other code paths - qpel, gmc, ... |
23rd April 2007, 02:52 | #19 | Link |
A hollow voice says
Join Date: Sep 2006
Posts: 269
|
I'm not doing multiple threads, so not sure how this helps that.
FYI - I've been using a custom profile that has 4MV off; but have tested 4MV on as well, and so far there does not seem to be dependence on that flag elsewhere - the two builds continue to give identical results with vhq-b off. I've decided to hold off on the qpel testing; I don't use it anyway, and it further multiplies the number of tests. (Though I did do one; vhqb=off, vhq=4, 4mv=on, qpel=on, h263, yielding a single p-frame difference between the builds.)
There seem to be three major predicates in that vhq-b module; h263 or not, inter4v or not, qpel or not. So I'll try some test cases and see what turns up...
But I'm back to my "original" problem - which build (msvc, gcc), if either, is behaving "correctly"? Perhaps this is a gcc glitch... Are there any xvidcore.dll builds around using yet another compiler? The ICL ones are out, since they differ no matter what encoder options I select...
BTW syskin, perhaps you can answer a question. I've noticed several constructs in that module similar to this one Code:
switch(mode) {
	case MODE_DIRECT: return Data_d->iMinSAD[0];
	case MODE_FORWARD: return Data_f->iMinSAD[0];
	case MODE_BACKWARD: return Data_b->iMinSAD[0];
	default:
	case MODE_INTERPOLATE: return Data_i->iMinSAD[0];
}
Last edited by plugh; 23rd April 2007 at 03:11. |
23rd April 2007, 15:32 | #20 | Link |
A hollow voice says
Join Date: Sep 2006
Posts: 269
|
Continuing my comparisons between msvc and gcc,
I tried various relevant encoder options to probe the "vhq for b-frames" behavioural difference. The results were inconclusive. So I decided to try a different tack.
Note that I am using a Duron 1.8GHz that has mmx, xmm, sse, 3dne, 3dne2.
Using encode options vhq-b=on, vhq=1, 4mv=off (this is _a_ case where msvc and gcc builds differ) Code:
Using a normal 'optimized' msvc build, compare
h263,default   h263,mmxonly   identical
mpeg,default   mpeg,mmxonly   different

Using a build with estimation_rd_based_bvop.c compiled noopt
h263,default   h263,mmxonly   identical
mpeg,default   mpeg,mmxonly   different
The mpeg differences *may* simply indicate an accuracy difference between the mmx asm routines, and the default mixed, non-orthogonal, set used on my processor.
Now if there is NO optimization sensitivity on the relevant C code paths, then comparing the above 8 encodes *vertically* should give me all identical comparison results. They don't! Code:
identical   identical
DIFFERENT   identical
Compiling this module without optimization *should* have no effect on its results. The vertical comparisons *should* all be 'identical'. But it appears there is an interaction between the optimization status of this module and the default mix of asm routines used on my processor. This is not good.
For what it's worth, the relevant asm routines are:
quant_mpeg_inter_xmm and dequant_mpeg_inter_3dne
--vs--
quant_mpeg_inter_mmx and dequant_mpeg_inter_mmx
(and perhaps fdct_mmx_skal vs fdct_xmm_skal)
These are called from within two 'large' static inline routines,
Block_CalcBits_BVOP and Block_CalcBits_BVOP_direct
These two inline routines are virtually identical (only ONE line is different), and they are invoked multiple times both within and outside of loops. My gut tells me that the various compiler optimizers are having a field day with this.
Where to go from here? Beats me - perhaps someone 'out there' can look at those asm routines and the relevant C code and figure out why there is an interaction with the C compiler's optimizer...
Last edited by plugh; 23rd April 2007 at 16:07. |
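As a purely hypothetical illustration of how two "equivalent" quantizer routines can disagree by one, which is the kind of accuracy difference speculated about above: one version divides, the other multiplies by a truncated fixed-point reciprocal. These two functions are invented for this sketch and are not the Xvid asm routines named in the post; whether those routines differ in this way is not something this thread establishes.

```c
/* Reference: plain truncating integer division (non-negative coeff). */
static int quant_div(int coeff, int q)
{
    return coeff / q;
}

/* Fast variant: multiply by a 16-bit fixed-point reciprocal of q.
   The reciprocal is truncated, so for some inputs the result comes
   out one lower than the division (non-negative coeff only). */
static int quant_recip(int coeff, int q)
{
    int recip = (1 << 16) / q;        /* e.g. 21845 for q = 3 */
    return (coeff * recip) >> 16;
}
```

For a coefficient of 3 and q = 3 the two disagree, while for 4 and q = 2 they agree, so differences of this sort would show up only on some frames, not everywhere.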