Old 19th April 2007, 22:22   #1  |  Link
plugh
A hollow voice says
 
Join Date: Sep 2006
Posts: 269
Koepi vs personal build

I had noted in the past that Koepi's build of 1.1.2 was bigger than my builds.
For example, xvidcore.dll - Koepi 748KB - mine 588KB

I just chalked it up to different compilers but assumed the output was the same.

I don't normally muck around with the core (my interest has been the 2nd pass vbv stuff, which I moved to vfw.dll) so I normally use Koepi's core.dll, but I was just trying an experiment and discovered that my assumption was incorrect. For the same clip and encoder options I get somewhat different output files depending upon which xvidcore build I use.

Which raises the questions:
Which one is 'right'?
How do I determine that?
If mine is 'wrong' how do I fix it?

I've been using the visual studio workspace/project files included with the xvid source kit. I'm using MS Visual Studio 6 with Service Pack 5, Processor Pack 5, and nasm 0.98.39. I was under the impression this was a 'correct' setup. I have noted that I do get an error about an unknown compiler option 'Qipo', but as I understand it that is an option used by the Intel C compiler, not the MS one, and is harmless - perhaps not?

Suggestions?
Old 20th April 2007, 00:27   #2  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
He's probably using more aggressive optimization options. In GCC land, that might be -O3 and -funroll-loops, either of which can increase the binary size.
Old 20th April 2007, 01:58   #3  |  Link
plugh
A hollow voice says
 
Join Date: Sep 2006
Posts: 269
Yeah, I figure he's using a different compiler and/or options.

The significant point is that the results are different as well.

As a first take, I diff'd the .pass files for my 7800 frame clip. These were both "full quality" first passes (not using the fast approx routines).

There are no differences in I frames (interval 240), scattered differences in isolated b-frames, and two segments with runs of different p and b frames - one 198 frames long, the other 210 frames long. For example: ( <! is Koepi, !> is mine )
Code:
 <! b 3 0 2560 0 4104 2879
 !> b 3 0 2560 0 4106 2900

 <! b 3 0 2560 0 5498 4357
 !> b 3 0 2560 0 5471 4337

 <! b 3 0 2560 0 15629 7732
 !> b 3 0 2560 0 15652 7742

 <! b 3 0 2560 0 13280 6691
 !> b 3 0 2560 0 13280 6690

 <! b 3 0 2560 0 7909 5051
 !> b 3 0 2560 0 7907 5048

 <! b 3 0 2560 0 10205 5730
 !> b 3 0 2560 0 10211 5734

 <! b 3 0 2560 0 11031 5689
 !> b 3 0 2560 0 11034 5682

 <! b 3 0 2560 0 4950 2688
 !> b 3 0 2560 0 4947 2688

 <! b 3 0 2560 0 3489 2439
 !> b 3 0 2560 0 3495 2437
and
Code:
 <! p 2 47 2505 8 17747 5163
 !> p 2 46 2506 8 17743 5161
 <! b 3 0 2560 0 4003 2819
 !> b 3 0 2560 0 3999 2814
 <! p 2 48 2503 9 17272 5148
 !> p 2 48 2503 9 17271 5145
 <! b 3 0 2560 0 3695 2587
 !> b 3 0 2560 0 3705 2596
 <! p 2 47 2512 1 17926 5131
 !> p 2 47 2512 1 17929 5132
 <! b 3 0 2560 0 3206 2219
 !> b 3 0 2560 0 3204 2215
 <! p 2 45 2511 4 17892 4939
 !> p 2 46 2510 4 17906 4939
 <! b 3 0 2560 0 4197 2658
 !> b 3 0 2560 0 4220 2673
 <! p 2 111 2442 7 18951 5259
 !> p 2 112 2441 7 18939 5250
 <! b 3 0 2560 0 2699 1958
 !> b 3 0 2560 0 2699 1952
 <! p 2 75 2474 11 16939 4795
 !> p 2 75 2474 11 16949 4799
 <! b 3 0 2560 0 3496 2529
 !> b 3 0 2560 0 3493 2529
 <! p 2 68 2484 8 17503 4963
 !> p 2 68 2484 8 17502 4963
 <! b 3 0 2560 0 3120 2309
 !> b 3 0 2560 0 3113 2298
 <! p 2 70 2480 10 18055 4899
 !> p 2 70 2480 10 18068 4894
(continues)
fwiw, output files
Koepi 118,568,960 bytes
mine 118,566,912 bytes

Does this provide any clues as to what / where / why they are different?
How do you determine which one is "correct"?
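
The per-frame-type breakdown above can be produced mechanically. What follows is only a minimal sketch, assuming both .pass logs have already been read into arrays of lines with one frame per line, that frame k sits at index k in both, and that the first character of each line is the frame type (as in the excerpts above; a lowercase 'i' for I-frames is an assumption). The names `diff_tally` and `tally_pass_diffs` are made up for illustration and are not part of Xvid.

```c
#include <stddef.h>
#include <string.h>

/* Number of frames whose stats line differs, split by frame type. */
struct diff_tally {
    size_t i_frames;
    size_t p_frames;
    size_t b_frames;
    size_t other;    /* any type character not listed above */
};

/* Compare two first-pass logs line by line.
 * Assumes both arrays hold n_frames NUL-terminated lines and that
 * the first character of a line is the frame type. */
struct diff_tally tally_pass_diffs(const char *const *mine,
                                   const char *const *theirs,
                                   size_t n_frames)
{
    struct diff_tally t = {0, 0, 0, 0};

    for (size_t k = 0; k < n_frames; k++) {
        if (strcmp(mine[k], theirs[k]) == 0)
            continue;                 /* identical line, nothing to count */

        switch (mine[k][0]) {
        case 'i': t.i_frames++; break;
        case 'p': t.p_frames++; break;
        case 'b': t.b_frames++; break;
        default:  t.other++;    break;
        }
    }
    return t;
}
```

Feeding it the two logs gives the counts per frame type directly, which makes it easier to see at a glance whether a given pair of builds disagrees only on b-frames or on p-frames as well.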

Last edited by plugh; 20th April 2007 at 02:20.
Old 20th April 2007, 02:44   #4  |  Link
henryho_hk
Registered User
 
Join Date: Mar 2004
Posts: 889
Celtic Druid's 01 Apr builds are nearly 1MB. I am using Celtic Druid's March builds because the April snapshot seems to have problems in 2-pass rate control.

It appears to be difficult to judge the "correctness" as we don't even have a reference code base (which beta, alpha or even snapshots?) nor build (maybe plain MSVC 7 or old stable GCC w/o any optimization?).
Old 20th April 2007, 03:08   #5  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
If they are using exactly the same codebase, the only thing I can think of is that he's using some option like -ffast-math that doesn't abide perfectly by the ANSI C standards. It's most likely that you're using different codebases though.
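
To illustrate why an option like -ffast-math can change results: it permits the compiler to reassociate floating-point operations, and floating-point addition is not associative. A minimal sketch of that effect in plain C follows; the function names are made up for illustration, and whether any floating-point code sits on the affected path in xvidcore is not established in this thread.

```c
/* Two mathematically equal ways of summing three floats.
 * Under strict IEEE rules they may round differently, so a compiler
 * that is allowed to rewrite one form into the other can change
 * the result. */
float sum_left(float a, float b, float c)
{
    return (a + b) + c;   /* a and b are combined first */
}

float sum_right(float a, float b, float c)
{
    return a + (b + c);   /* b and c are combined first */
}
```

With a = 1e20f, b = -1e20f and c = 1.0f, the first form yields 1.0 while the second yields 0.0, because the 1.0 is lost when it is added to -1e20 before the large terms cancel.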
Old 20th April 2007, 04:16   #6  |  Link
plugh
A hollow voice says
 
Join Date: Sep 2006
Posts: 269
The source kit is the "Xvid 1.1.2 stable release" source kit from xvid.org. The source kit includes "Generic install procedure for Win32/MSVC" instructions (from 2004), and includes MS Visual Studio workspace and project files. Assuming you have the required software, unzip it, open the workspace, select the project, and build it. Done!

I believe Koepi's build is also based upon this code base.

Like I said, the only glitch I encountered is that the canned build wants to use a "Qipo" compile switch, which MSVC noted and ignored.

To be fair, the bulk of the encode appears to be the same - perhaps 500 frames out of the 7800 were different. But that still seems like a lot from simply using a different compiler.
Old 20th April 2007, 06:46   #7  |  Link
henryho_hk
Registered User
 
Join Date: Mar 2004
Posts: 889
Sorry for my ignorance.

Isn't "/Qipo" an ICL option? Given ICL's track record, I believe your pure MSVC compile is more "correct".

Last edited by henryho_hk; 20th April 2007 at 11:59.
Old 20th April 2007, 07:29   #8  |  Link
plugh
A hollow voice says
 
Join Date: Sep 2006
Posts: 269
Yes, from what I was able to find on the net, I believe that is an Intel C compiler option. I too thought that was odd, coming from the Xvid canned Visual Studio project files.

I have a (not-installed) copy someplace in my archives. I vaguely recall there was some kind of integration, where you could substitute it for the MS C compiler - compatible command lines etc. But the docs included in the xvid source kit don't mention it at all, just gcc and MSVC.
Old 21st April 2007, 09:45   #9  |  Link
celtic_druid
Registered User
 
Join Date: Oct 2001
Location: Melbourne, Australia
Posts: 2,171
Yeah, with ICL installed there is an option in VC6 to pick what compiler to use. /Qipo enables multi-file optimisation.
Old 21st April 2007, 17:10   #10  |  Link
plugh
A hollow voice says
 
Join Date: Sep 2006
Posts: 269
Hmmm. So perhaps I should try using ICL and see what happens.

As there doesn't appear to be a clear answer as to which is "correct", it occurred to me to ask 'which is better'? In that context, I stumbled across this MSU Video Quality Measurement Tool which appears to be tailor made for this - allowing comparison between source (avs input) and two alternative encodes (avi input).

Has anyone else used this tool? Any advice?
Old 22nd April 2007, 08:46   #11  |  Link
sysKin
Registered User
 
Join Date: Jun 2002
Location: Adelaide, Australia
Posts: 1,167
The output should be identical regardless of the compiler. Something's funny, maybe Koepi used some earlier sources?
__________________
Visit #xvid or #x264 at irc.freenode.net
Old 22nd April 2007, 13:52   #12  |  Link
celtic_druid
Registered User
 
Join Date: Oct 2001
Location: Melbourne, Australia
Posts: 2,171
In that case I guess also try my 1.1.2 compile since it was also compiled with ICL. See if the output matches.
Old 22nd April 2007, 15:25   #13  |  Link
plugh
A hollow voice says
 
Join Date: Sep 2006
Posts: 269
Koepi and celtic_druid builds produced identical results.

So it seems safe to conclude both used same sources, and both used ICL?

celtic_druid, did you use the 1.1.2 source kit download zip file, or cvs?

Anyone got a gcc build of 1.1.2 xvidcore.dll ?

Guess I'll dig out ICL from my archives, and see what that gives me...

EDIT: change 'identical' to 'nearly identical' - the .pass files *are* identical, the avi files differ by _one_ byte part way in.
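
A claim like "differ by one byte part way in" is easy to pin down with a byte-wise scan. This is a minimal sketch, assuming both files have the same length and have already been read into memory; `first_mismatch` and `count_mismatches` are hypothetical helper names.

```c
#include <stddef.h>

/* Offset of the first byte that differs between two buffers of
 * length n, or n if the buffers are identical. */
size_t first_mismatch(const unsigned char *a,
                      const unsigned char *b,
                      size_t n)
{
    for (size_t k = 0; k < n; k++) {
        if (a[k] != b[k])
            return k;
    }
    return n;
}

/* Total number of byte positions at which the buffers differ. */
size_t count_mismatches(const unsigned char *a,
                        const unsigned char *b,
                        size_t n)
{
    size_t count = 0;
    for (size_t k = 0; k < n; k++) {
        if (a[k] != b[k])
            count++;
    }
    return count;
}
```

Recording the offset as well as the count makes it possible to check later whether a different pair of builds disagrees at the same byte position.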

Last edited by plugh; 22nd April 2007 at 15:35.
Old 22nd April 2007, 15:40   #14  |  Link
celtic_druid
Registered User
 
Join Date: Oct 2001
Location: Melbourne, Australia
Posts: 2,171
Could have been zip/tar.bz or CVS. Can't remember.

http://ffdshow.faireal.net/mirror/Xv...id-1.1.2gcc.7z
Should be a bunch of gcc builds for different CPU's from recollection.
Old 22nd April 2007, 18:28   #15  |  Link
plugh
A hollow voice says
 
Join Date: Sep 2006
Posts: 269
As I'm using an 'Applebred' (Duron 1.8Ghz) cpu, I used the gcc Athlon-XP build (and I'm not overclocking).

Results:

Comparing the Koepi and gcc-xp .pass files, I get a set of differences ("set1")
Comparing gcc-xp and my-msvc .pass files, I get a set of differences ("set2")
Comparing my-msvc and Koepi .pass files, I get a set of differences ("set3")

-->in the context of this test clip<--

set3 has the smallest number of differences, set2 the largest.

working from smallest to largest, comparing

set3 and set1 - there are two frames in set3 not in set1
(set3 is NOT a proper subset of set1)
set1 and set2 - set1 IS a proper subset of set2

I'm using a different clip than I used above; with this clip, ALL the differences are B-frames (no P-frames like above).

I also dug out and installed ICL 9.0.28 (fairly old), recompiled using it, and compared to Koepi .pass file - only three frames different, interestingly 2 P's and 1 B, and each by only one byte in length.
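
The subset relations between the difference sets can be checked mechanically. A minimal sketch, assuming each set is stored as an array of differing frame numbers in strictly ascending order with no duplicates; the function names are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if every element of a[] also appears in b[].
 * Both arrays must be sorted in strictly ascending order. */
bool is_subset(const int *a, size_t na, const int *b, size_t nb)
{
    size_t i = 0, j = 0;

    while (i < na && j < nb) {
        if (a[i] == b[j]) {
            i++;              /* matched, advance both */
            j++;
        } else if (a[i] > b[j]) {
            j++;              /* b[j] is extra, skip it */
        } else {
            return false;     /* a[i] cannot appear later in b */
        }
    }
    return i == na;           /* all of a[] was matched */
}

/* True if a[] is a subset of b[] and b[] has at least one extra element. */
bool is_proper_subset(const int *a, size_t na, const int *b, size_t nb)
{
    return na < nb && is_subset(a, na, b, nb);
}
```

Because the arrays are sorted, a single merge-style pass is enough, so the check stays cheap even for clips with thousands of differing frames.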

Last edited by plugh; 22nd April 2007 at 18:40.
Old 22nd April 2007, 19:39   #16  |  Link
plugh
A hollow voice says
 
Join Date: Sep 2006
Posts: 269
Another quick set of comparisons...

Same clip as above, same three builds (Koepi, my-msvc, gcc-xp)
This time, vhq=0 and 'vhq for b-frames' off (was vhq=3 and 'on' above)

my-msvc vs gcc-xp - .pass files were identical, and _one_ byte difference in avi files (same byte position as above when comparing Koepi and druid ICL builds, but different values)

Koepi vs gcc-xp, and Koepi vs my-msvc - many differences, including both P and B frames

So msvc and gcc give the same results, until you turn on vhq.
ICL builds give lots of differences from this common result.

(I stated this previously, but as a reminder, these were 'full quality' first passes; NOT using the fast approximation routines.)

Last edited by plugh; 22nd April 2007 at 19:54.
Old 22nd April 2007, 23:35   #17  |  Link
plugh
A hollow voice says
 
Join Date: Sep 2006
Posts: 269
More comparisons between msvc and gcc build...

They continue to produce identical results, across the entire range of vhq settings, as long as "VHQ for B-frames" is off.

Looks like there is something in estimation_rd_based_bvop.c and/or its unique subordinates that causes msvc/gcc to differ.

These tests also lead to the conclusion that the ICL builds are BORKED.

I'm continuing to compare behaviour on other code paths - qpel, gmc, ...
Old 23rd April 2007, 01:56   #18  |  Link
sysKin
Registered User
 
Join Date: Jun 2002
Location: Adelaide, Australia
Posts: 1,167
Quote:
Originally Posted by plugh
They continue to produce identical results, across the entire range of vhq settings, as long as "VHQ for B-frames" is off.

Looks like there is something in estimation_rd_based_bvop.c and/or its unique subordinates that causes msvc/gcc to differ.
You're on to something here ~~ perhaps you'll find the evil, elusive bug that causes the output to sometimes (very, very rarely) depend on the number of threads.
Old 23rd April 2007, 02:52   #19  |  Link
plugh
A hollow voice says
 
Join Date: Sep 2006
Posts: 269
I'm not doing multiple threads, so not sure how this helps that.

FYI - I've been using a custom profile that has 4MV off; but have tested 4MV on as well, and so far there does not seem to be dependence on that flag elsewhere - the two builds continue to give identical results with vhq-b off.

I've decided to hold off on the qpel testing; I don't use it anyway, and it further multiplies the number of tests. (Though I did do one; vhqb=off, vhq=4, 4mv=on, qpel=on, h263, yielding a single p-frame difference between the builds.)

There seem to be three major predicates in that vhq-b module; h263 or not, inter4v or not, qpel or not. So I'll try some test cases and see what turns up...

But I'm back to my "original" problem - which build (msvc, gcc), if either, is behaving "correctly"? Perhaps this is a gcc glitch...

Are there any xvidcore.dll builds around using yet another compiler? The ICL ones are out, since they differ no matter what encoder options I select...

BTW syskin, perhaps you can answer a question. I've noticed several constructs in that module similar to this one
Code:
	switch(mode) {
		case MODE_DIRECT: return Data_d->iMinSAD[0];
		case MODE_FORWARD: return Data_f->iMinSAD[0];
		case MODE_BACKWARD: return Data_b->iMinSAD[0];
		default:
		case MODE_INTERPOLATE: return Data_i->iMinSAD[0];
My question is - is 'case interpolate' the correct path when mode is DIRECT_NONE_MV or (in particular) DIRECT_NO4V?
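
For reference, this is what the C language itself does with that construct: a `default:` label placed directly above a `case` label falls through to it, so every mode value not listed explicitly takes the interpolate branch. The sketch below uses a stand-in enum whose ordering and values are assumptions (only the names come from the discussion above), and it returns a marker character instead of touching any Xvid data. It shows only which branch is taken, not whether that branch is the intended one.

```c
/* Stand-in for the mode constants; the real values in xvidcore
 * may differ, only the names are taken from the thread. */
enum bvop_mode {
    MODE_DIRECT,
    MODE_INTERPOLATE,
    MODE_BACKWARD,
    MODE_FORWARD,
    MODE_DIRECT_NONE_MV,
    MODE_DIRECT_NO4V
};

/* Same label layout as the quoted switch; returns a letter naming
 * the branch that was taken. */
char branch_taken(enum bvop_mode mode)
{
    switch (mode) {
        case MODE_DIRECT:      return 'd';
        case MODE_FORWARD:     return 'f';
        case MODE_BACKWARD:    return 'b';
        default:               /* falls through to the next label */
        case MODE_INTERPOLATE: return 'i';
    }
}
```

So with this layout MODE_DIRECT_NONE_MV and MODE_DIRECT_NO4V both end up in the interpolate branch, which is exactly the behaviour the question is about.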

Last edited by plugh; 23rd April 2007 at 03:11.
Old 23rd April 2007, 15:32   #20  |  Link
plugh
A hollow voice says
 
Join Date: Sep 2006
Posts: 269
Continuing my comparisons between msvc and gcc,
I tried various relevant encoder options to probe
the "vhq for b-frames" behavioural difference.

The results were inconclusive.
So I decided to try a different tack.

Note that I am using a Duron 1.8GHz
that has mmx, xmm, sse, 3dne, 3dne2

Using encode options vhq-b=on, vhq=1, 4mv=off
(this is _a_ case where msvc and gcc builds differ)
Code:
Using a normal 'optimized' msvc build, compare

h263,default	h263,mmxonly	identical
mpeg,default	mpeg,mmxonly	different

Using a build with estimation_rd_based_bvop.c compiled noopt

h263,default	h263,mmxonly	identical
mpeg,default	mpeg,mmxonly	different
So far so good. Horizontal differences, while not ideal,
*may* simply indicate an accuracy difference between the
mmx asm routines, and the default mixed, non-orthogonal,
set used on my processor.

Now if there is NO optimization sensitivity on the relevant
C code paths, then comparing the above 8 encodes *vertically*
should give me all identical comparison results. They don't!
Code:
	default 	mmxonly
h263	identical	identical
mpeg	DIFFERENT	identical
To make this clear, whether this module is optimized or not
*should* have no effect on its results. The vertical comparisons
*should* all be 'identical'. But it appears there is an interaction
between the optimization status of this module and the default mix
of asm routines used on my processor. This is not good.

For what it's worth, the relevant asm routines are:

quant_mpeg_inter_xmm and dequant_mpeg_inter_3dne
--vs--
quant_mpeg_inter_mmx and dequant_mpeg_inter_mmx
(and perhaps fdct_mmx_skal vs fdct_xmm_skal )

These are called from within two 'large' static inline routines:
Block_CalcBits_BVOP and Block_CalcBits_BVOP_direct

These two inline routines are virtually identical (only ONE line
is different), and they are invoked multiple times both within
and outside of loops. My gut tells me that the various compiler
optimizers are having a field day with this.

Where to go from here? Beats me - perhaps someone 'out there'
can look at those asm routines and the relevant C code and figure
out why there is an interaction with the C compiler's optimizer...

Last edited by plugh; 23rd April 2007 at 16:07.