View Full Version : Current Patches, Where to get them, How they affect speed/output



3ngel
28th October 2008, 17:49
3.4.6 Broken

http://img219.imageshack.us/img219/230/15390722oy0.th.png (http://img219.imageshack.us/my.php?image=15390722oy0.png)

Up to now

MinGW/GCC v3.4.6 -> Broken

MinGW/GCC v4.3.2 tdm-1 (SJLJ Unwinding) -> Broken

LoRd_MuldeR
28th October 2008, 18:00
MinGW/GCC v3.4.5 here:
http://forum.doom9.org/showpost.php?p=1191032&postcount=1212

MinGW/GCC v4.3.3 here:
http://komisar.gin.by/x264.999kMod.generic.exe

kemuri-_9
28th October 2008, 18:23
3ngel: could you put up a sample of the source for the problematic section?
it would help people to have it so they can test it themselves....

3ngel
28th October 2008, 18:28
3.4.5 Broken

Same Frame as 3.4.6 +

http://img221.imageshack.us/img221/3492/345zo4.th.png (http://img221.imageshack.us/my.php?image=345zo4.png)

MinGW/GCC v3.4.5 -> Broken

MinGW/GCC v3.4.6 -> Broken

MinGW/GCC v4.3.2 tdm-1 (SJLJ Unwinding) -> Broken

@kemuri-_9
It seems that the errors (among other sections) can be systematically reproduced over a small section. If my next test confirms this, I'll post a Lagarith section.

EDIT: Moreover, I think you can take the end credits of any film and you'll see the error sooner or later.

4.3.3 Broken

http://img98.imageshack.us/img98/8937/433xe9.th.png (http://img98.imageshack.us/my.php?image=433xe9.png)

MinGW/GCC v3.4.5 -> Broken

MinGW/GCC v3.4.6 -> Broken

MinGW/GCC v4.3.2 tdm-1 (SJLJ Unwinding) -> Broken

MinGW/GCC v4.3.3 -> Broken

Now I'll try to isolate a small section.

3ngel
28th October 2008, 19:14
The frame that seems to be systematically reproducible is this one:

http://img88.imageshack.us/img88/412/frametv7.th.png (http://img88.imageshack.us/my.php?image=frametv7.png)

This is the Lagarith (lags) video portion:

http://www.megaupload.com/?d=GDGX6WR5

It seems that every gcc version has the error. Can i hazard an asm code error?

EDIT:
And now a ridiculous update to make us all laugh:

On the CPU I ran all the tests above with (Phenom BE), the encode is corrupted. With a 2 cores Athlon the encode is correct.

http://img75.imageshack.us/img75/5694/phenomkk5.th.png (http://img75.imageshack.us/my.php?image=phenomkk5.png)
http://img75.imageshack.us/img75/5179/athloneu7.th.png (http://img75.imageshack.us/my.php?image=athloneu7.png)

That said, I think it's the strangest error I've ever seen.

kemuri-_9
28th October 2008, 19:49
that's interesting...
well as i have the same cpu (phenom 9850 BE), technically i should be able to repeat the problem when i get back from work.

elguaxo
28th October 2008, 20:12
With a 2 cores Athlon the encode is correct.

I was wondering why I never got the artifacts on the many encodes I did with r998, now I know why! Thanks for the tests.

Dark Shikari
28th October 2008, 20:18
A bug that only exists on Phenom--no wonder nobody ever reported it before ;)

poisondeathray
28th October 2008, 20:21
I did a quick test with your sample on a single-core Athlon64 and an Intel Quad, but could not reproduce the artifacts with r998.

akupenguin
28th October 2008, 20:29
could someone post a build with --disable-asm?
Did you skip over the testing of a normal build with --no-asm ?

Dark Shikari
28th October 2008, 20:30
Did you skip over the testing of a normal build with --no-asm ?

Yes, since the bug is most probably in inline assembly, not autoloaded, but that's a good idea--we should test that, too.

3ngel
28th October 2008, 20:42
How can it be a bug on only one type of processor? Never heard of something like this :)

LoRd_MuldeR
28th October 2008, 21:26
Maybe that CPU has some undocumented feature (one may say "bug") that only triggers in this particular situation...

kemuri-_9
28th October 2008, 22:01
what were the x264 parameters that you used? I was unable to reproduce the error with the x264.nl build or with my own build using some simple settings.
it might be specific to the combination of parameters and cpu...

but then again, i'm still at work and doing these tests through RDP, so that may be messing with the visualization.

either way, it would be helpful if you could provide the parameters, as i seem to be the only one around with a phenom who can try reproducing the error,
which in turn can help akupenguin and Dark Shikari solve the issue.

3ngel
28th October 2008, 22:41
It's 2 pass

x264.999kMod.genericGCC433.exe --pass 1 --bitrate 3984 --stats ".stats" --progress --keyint 250 --bframes 16 --qpmin 10 --qpmax 51 --aq-mode 0 --psy-rd 0.8:0 --no-psnr --no-ssim --no-fast-pskip --mixed-refs --b-adapt 0 --trellis 2 --ref 9 --no-deblock --subme 7 --direct auto --direct-8x8 -1 --me umh --merange 32 --nf --weightb --b-pyramid --partitions all --8x8dct --threads auto --thread-input --no-dct-decimate --level 41 --output NUL "t.avs"

x264.999kMod.genericGCC433.exe --pass 2 --bitrate 3984 --stats ".stats" --progress --keyint 250 --bframes 16 --qpmin 10 --qpmax 51 --aq-mode 0 --psy-rd 0.8:0 --no-psnr --no-ssim --no-fast-pskip --mixed-refs --b-adapt 0 --trellis 2 --ref 9 --no-deblock --subme 7 --direct auto --direct-8x8 -1 --me umh --merange 32 --nf --weightb --b-pyramid --partitions all --8x8dct --threads auto --thread-input --no-dct-decimate --level 41 --output "2pass.mkv" "t.avs"

This is getting more and more strange

For the moment i'm alone in my trouble :(

Ehehe

LoRd_MuldeR
28th October 2008, 22:52
3ngel, is your CPU overclocked? Did you run Prime95 for a few hours, check your RAM with Memtest86+ and so on ???

3ngel
28th October 2008, 22:56
No, stock cpu and mem. No overclock.

If it were a memory corruption problem, the --no-asm encode would have the same problem (and the problem could not be reproduced systematically, as it is).
Moreover, it happens only on black-and-white credits streams or white text on black. On color or any other stream there is no problem.

kemuri-_9
28th October 2008, 23:43
It's 2 pass

x264.999kMod.genericGCC433.exe --pass 1 --bitrate 3984 --stats ".stats" --progress --keyint 250
--bframes 16 --qpmin 10 --qpmax 51 --aq-mode 0 --psy-rd 0.8:0 --no-psnr --no-ssim --no-fast-pskip
--mixed-refs --b-adapt 0 --trellis 2 --ref 9 --no-deblock --subme 7 --direct auto --direct-8x8 -1 --me umh
--merange 32 --nf --weightb --b-pyramid --partitions all --8x8dct --threads auto --thread-input --no-dct-decimate --level 41 --output NUL "t.avs"

x264.999kMod.genericGCC433.exe --pass 2 --bitrate 3984 --stats ".stats" --progress --keyint 250
--bframes 16 --qpmin 10 --qpmax 51 --aq-mode 0 --psy-rd 0.8:0 --no-psnr --no-ssim --no-fast-pskip
--mixed-refs --b-adapt 0 --trellis 2 --ref 9 --no-deblock --subme 7 --direct auto --direct-8x8 -1 --me umh
--merange 32 --nf --weightb --b-pyramid --partitions all --8x8dct --threads auto --thread-input --no-dct-decimate --level 41 --output "2pass.mkv" "t.avs"

This is getting more and more strange

For the moment i'm alone in my trouble :(

Ehehe

confirmed error for those settings!
--asm 0 has no error.

StackVertical comparison:
top is --asm 0
bottom is running asm based on cpu detection.
snap (http://kemuri9.net/forumpics/x264_bug.png)

used the std x264.nl build

Dark Shikari
28th October 2008, 23:47
Can you raise --asm progressively (MMX2, SSE2Slow, SSE2, etc, etc) and see which setting is the lowest at which the problem occurs?

Oh, and before you do that, make checkasm and run checkasm. It should fail if there really is a problem.

3ngel
28th October 2008, 23:55
confirmed error for those settings!

Phew... :)

Ehehe

kemuri-_9
28th October 2008, 23:58
sure i'll do that when i get back to the house, about to leave work now... so be a while
however, i did the same settings except with psy-rd set to 0:0 and couldn't see an error with asm active.

Dark Shikari
29th October 2008, 00:12
sure i'll do that when i get back to the house, about to leave work now... so be a while
however, i did the same settings except with psy-rd set to 0:0 and couldn't see an error with asm active.

Since the error could be triggered out of mere coincidence, just because it disappears with an option changed doesn't mean that option is at fault.

kemuri-_9
29th October 2008, 02:10
indeed, since dropping back to --threads 1 also gave no error.

but as for progressively raising --asm to pinpoint the error:
MMX2: no error
SSE2Slow: no error
SSE2: error
SSE2Fast: error
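
A minimal sketch of how the list above pins down the first bad level: walk the levels in increasing order and stop at the first one that shows the error. The names `demo_levels`, `demo_has_error` and `demo_first_failing` are made up for illustration; only the four level names and their results come from the list.

```c
#include <assert.h>
#include <stddef.h>

/* Levels in the order they were raised; names taken from the list above. */
const char *const demo_levels[] = { "MMX2", "SSE2Slow", "SSE2", "SSE2Fast" };

/* 1 = the encode showed the error at that level, 0 = clean (results from the list above). */
const int demo_has_error[] = { 0, 0, 1, 1 };

/* Hypothetical helper: index of the lowest level that fails, or -1 if none does. */
int demo_first_failing(const int *has_error, size_t n_levels)
{
    assert(has_error != NULL || n_levels == 0);
    for (size_t i = 0; i < n_levels; i++)
        if (has_error[i])
            return (int)i;
    return -1;
}
```

With the results above, the helper returns the index of SSE2, so whatever is at fault is first enabled at that level.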


$ ./checkasm.exe
x264: using random seed 56950896
x264: MMX
- pixel sad : [OK]
- pixel sad_aligned : [OK]
- pixel ssd : [OK]
- pixel satd : [OK]
- pixel sa8d : [OK]
- pixel sad_x3 : [OK]
- pixel sad_x4 : [OK]
- pixel var : [OK]
- pixel hadamard_ac : [OK]
- intra satd_x3 : [OK]
- intra sad_x3 : [OK]
- ssim : [OK]
- esa ads: [OK]
- sub_dct4 : [OK]
- sub_dct8 : [OK]
- add_idct4 : [OK]
- add_idct8 : [OK]
- (i)dct4x4dc : [OK]
- zigzag_frame : [OK]
- zigzag_field : [OK]
- mc luma : [OK]
- mc chroma : [OK]
- mc wpredb : [OK]
- hpel filter : [OK]
- lowres init : [OK]
- intra pred : [OK]
- deblock : [OK]
- quant : [OK]
- dequant : [OK]
- denoise dct : [OK]
- cabac : [OK]
x264: MMX Cache64
- pixel sad : [OK]
- pixel sad_x3 : [OK]
- pixel sad_x4 : [OK]
- mc luma : [OK]
- lowres init : [OK]
x264: MMX Cache32
- pixel sad : [OK]
- pixel sad_x3 : [OK]
- pixel sad_x4 : [OK]
- mc luma : [OK]
- lowres init : [OK]
x264: SSE2Slow
- pixel sad_aligned : [OK]
- pixel ssd : [OK]
- pixel satd : [OK]
- pixel sa8d : [OK]
- pixel var : [OK]
- ssim : [OK]
- sub_dct4 : [OK]
- sub_dct8 : [OK]
- add_idct4 : [OK]
- add_idct8 : [OK]
- hpel filter : [OK]
- intra pred : [OK]
- deblock : [OK]
- quant : [OK]
- dequant : [OK]
- denoise dct : [OK]
x264: SSE2Fast
- pixel sad : [OK]
- pixel sad_x3 : [OK]
- pixel sad_x4 : [OK]
- pixel var : [OK]
- pixel hadamard_ac : [OK]
- intra sad_x3 : [OK]
- esa ads: [OK]
- zigzag_frame : [OK]
- mc luma : [OK]
- mc chroma : [OK]
- mc wpredb : [OK]
- hpel filter : [OK]
- lowres init : [OK]
- intra pred : [OK]
x264: SSE2Fast Cache64
- pixel sad : [OK]
- pixel sad_x3 : [OK]
- pixel sad_x4 : [OK]
- mc luma : [OK]
x264: SSE3
- pixel sad : [OK]
- pixel sad_x3 : [OK]
- pixel sad_x4 : [OK]
x264: All tests passed Yeah :)

burfadel
29th October 2008, 03:51
Rev 1000 doesn't have to be (and I'm sure won't be) a bugfix. Any fixes can be released as patches and then submitted to GIT once the time comes :)

kemuri-_9
29th October 2008, 19:04
was the information i posted above able to help you (Dark Shikari/akupenguin) in finding the error, or do you still need some more data?

Dark Shikari
29th October 2008, 19:06
was the information i posted above able to help you (Dark Shikari/akupenguin) in finding the error, or do you still need some more data?

Can you confirm that all --asm levels work properly on other CPUs, that is, the issue only occurs on Phenom?

The next thing you can do is go through x264 commenting out SSE2 function initializations (find if( cpu&X264_CPU_SSE2 ) lines and change them to if(0) ) until the bug stops occurring.
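
For readers who have not seen this kind of code, here is a rough sketch of what that switch-off does. Function pointers start out pointing at plain C versions and are overwritten when a CPU flag is set; forcing the condition to false keeps the C version in place. Everything here (`DEMO_CPU_SSE2`, `demo_pixel_init`, the `force_off` switch, the stand-in "sse2" routine written in C) is hypothetical and is not x264's actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_CPU_SSE2 0x1u   /* hypothetical flag, stands in for X264_CPU_SSE2 */

typedef int (*demo_sad_fn)(const uint8_t *a, const uint8_t *b, int n);

typedef struct {
    demo_sad_fn sad;
} demo_pixel_funcs;

/* Reference C version: sum of absolute differences over n pixels. */
int demo_sad_c(const uint8_t *a, const uint8_t *b, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += abs((int)a[i] - (int)b[i]);
    return sum;
}

/* Stand-in for an optimized routine; a real one would be hand-written asm. */
int demo_sad_sse2(const uint8_t *a, const uint8_t *b, int n)
{
    return demo_sad_c(a, b, n);
}

/* force_off = 1 plays the role of editing the condition to if(0). */
void demo_pixel_init(uint32_t cpu, int force_off, demo_pixel_funcs *pf)
{
    assert(pf != NULL);
    pf->sad = demo_sad_c;                      /* fallback that is always valid */
    if (!force_off && (cpu & DEMO_CPU_SSE2))
        pf->sad = demo_sad_sse2;               /* overridden when the flag is set */
}
```

If the artifacts vanish once a given block is switched off, the faulty routine is one of the pointers that block would have set.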

kemuri-_9
29th October 2008, 23:20
this came out when i plugged it on my other two computers with --asm SSE2:
top: athlon64 x2 4200+
bottom: prescott (p4 3GHz w/ HT)
snapshot (http://kemuri9.net/forumpics/x264_bug2.png)
a different frame within the sample, but still a fubar frame
--asm SSE2Slow was fine once again.

That would explain why 3ngel didn't see the problem on the Athlon64, since the default asm level for those CPUs is SSE2Slow (as you well know).

I'll go through commenting out SSE2 function inits in a bit.

akupenguin
29th October 2008, 23:44
top: athlon64 x2 4200+
bottom: prescott (p4 3GHz w/ HT)
You could have just said that both cpus showed the same error, rather than pasting two copies of the same image and making me overlay them in GIMP to make sure that they're identical.

kemuri-_9
30th October 2008, 00:43
yeah, hadn't thought about that...

so trying what DS said about the asm init disabling:
removing in quant.c - error
removing in pixel.c:
if( (cpu&X264_CPU_SSE2) && !(cpu&X264_CPU_SSE2_IS_SLOW) ) -> if ( 0 ) - error free
if (cpu&X264_CPU_SSE2 ) -> if( 0 ) - error
removing in dct.c - error

commenting out one line at a time within the block found above (only the listed line is commented out each time):
//INIT2( sad, _sse2 ); - error
//INIT2( sad_x3, _sse2 ); - error
//INIT2( sad_x4, _sse2 ); - error
//INIT4( hadamard_ac, _sse2 ); - error free
//INIT_ADS( _sse2 ); - error
//pixf->var[PIXEL_8x8] = x264_pixel_var_8x8_sse2; - error
//pixf->intra_sad_x3_16x16 = x264_intra_sad_x3_16x16_sse2; - error

Edit:
all the error blocks I've been seeing are focused on the bottom of the frame, near the frame edge...
possibly related to having a 532 pixel (non mod16) height?
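
For reference, the arithmetic behind the mod16 remark, as a small sketch: H.264 codes frames in 16x16 macroblocks, so a height that is not a multiple of 16 has to be padded up to the next multiple internally. The helper name is hypothetical.

```c
#include <assert.h>

/* Round a dimension up to the next multiple of 16 (the H.264 macroblock size). */
int demo_align16(int size)
{
    assert(size >= 0);
    return (size + 15) & ~15;
}
```

For a height of 532 this gives 544, i.e. 12 padded rows below the visible picture.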

Dark Shikari
30th October 2008, 00:45
Hadamard_ac... ooooooooooh, it is a bug in psy-RD, but in the assembly, not the C code or the algorithm!

I'm going to guess there was some sort of overflow, considering the fact that it only happens on a black/white area like that.
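
To illustrate the kind of overflow suspected here (a sketch only, not the actual hadamard_ac code): when absolute values are summed in a 16-bit accumulator, the total wraps around once it passes 65535. High-contrast input produces large transform coefficients, which is where such a sum is most at risk. The coefficient values, block size and function names below are invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of absolute values kept in 16 bits: the result wraps modulo 65536. */
uint16_t demo_sum_abs_16(const int16_t *coef, int n)
{
    assert(coef != NULL && n >= 0);
    uint16_t acc = 0;
    for (int i = 0; i < n; i++)
        acc = (uint16_t)(acc + (uint16_t)abs(coef[i]));
    return acc;
}

/* Same sum with a 32-bit accumulator: wide enough for these inputs. */
uint32_t demo_sum_abs_32(const int16_t *coef, int n)
{
    assert(coef != NULL && n >= 0);
    uint32_t acc = 0;
    for (int i = 0; i < n; i++)
        acc += (uint32_t)abs(coef[i]);
    return acc;
}

/* Fill a block with +magnitude / -magnitude alternating (made-up test data). */
void demo_fill_alternating(int16_t *coef, int n, int16_t magnitude)
{
    assert(coef != NULL && n >= 0);
    for (int i = 0; i < n; i++)
        coef[i] = (i & 1) ? (int16_t)-magnitude : magnitude;
}
```

With 64 coefficients of magnitude 2040 the true sum is 130560, while the 16-bit version reports 65024; with small magnitudes the two agree, which would fit an error that only shows up on stark black/white content.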

3ngel
30th October 2008, 01:02
but in the assembly, not the C code

I would have guessed it. Reminds me of the ol' Amiga days, when strange things happened in hard-coded asm code :)

@kemuri-_9
If you look at the "Jiikine" word, it probably happens in the middle of the screen. I found it happens almost anytime.

Good BTW.

Dark Shikari
30th October 2008, 08:42
Here's Akupenguin's fix (http://akuvian.org/src/x264/hadamard_ac.diff).

It seems to fix the problem; it avoids possibility of overflow as far as I can tell.

If no issues are reported, I'll commit it locally (yes, yes, r1k is coming soon!).

LoRd_MuldeR
30th October 2008, 14:11
Here's Akupenguin's fix (http://akuvian.org/src/x264/hadamard_ac.diff).

It seems to fix the problem; it avoids possibility of overflow as far as I can tell.

If no issues are reported, I'll commit it locally (yes, yes, r1k is coming soon!).

x264 r999 + hadamard_ac.diff
http://www.mediafire.com/file/myqnqimddg0/x264-r999-gcc432-hadamard_ac.zip

MinGW/GCC 4.3.2-tdm-1, yasm 0.7.1.2093, march=pentium2, no fprofiled

kemuri-_9
30th October 2008, 15:47
Here's Akupenguin's fix (http://akuvian.org/src/x264/hadamard_ac.diff).

It seems to fix the problem; it avoids possibility of overflow as far as I can tell.

If no issues are reported, I'll commit it locally (yes, yes, r1k is coming soon!).

i ran the patch on my own build and didn't see any fubar blocks in the sample,
so it appears to have done the trick.

3ngel
31st October 2008, 12:29
Confirmed. Encode clean with no problem.

Good work.

Out of curiosity, is there something that could be done to further improve performance on Phenom?

Moreover, could a customizable option be added to buffer frames ahead? I sometimes do encodes over RDP, taking frames from a remote location, and reading over the network loses 3-4 fps. With a read-ahead buffer, x264.exe would have the same performance as locally.

lexor
31st October 2008, 18:24
zomg, the pipeline changes just went live into git! (well, not the Nehalem and interlace ones, obviously) I'm getting giddy just thinking about trying the new toys.

We need a build pronto!

Disabled
31st October 2008, 18:44
We need a build pronto!
x264.nl has one...
Congratulations on the 1000+th revision.

LoRd_MuldeR
31st October 2008, 18:55
x264.nl has one...
Congratulations on the 1000+th revision.

So when is the revision 1000 party at pengvado's villa going to start? :D

Avenger007
31st October 2008, 19:02
So when is the revision 1000 party at pengvado's villa going to start? :D
Thu Oct 16 03:17:53 2008 :p

LoRd_MuldeR
31st October 2008, 19:09
x264 r1016
http://www.mediafire.com/file/dkn5mzndyyi/x264-r1016-gcc432-fprofiled.7z

MinGW/GCC 4.3.2-tdm-1, yasm 0.7.1.2093, march=pentium2, fprofiled

microchip8
31st October 2008, 19:17
So when is the revision 1000 party at pengvado's villa going to start? :D

it has already begun on IRC, the cake is delicious :p

pcordes
31st October 2008, 21:24
Moreover, could a customizable option be added to buffer frames ahead?

A separate program can implement the buffer. There are several buffered-pipe programs around, such as bfr. This works well on Linux, and helps even when feeding x264 from a local but bursty source, e.g. mplayer -vo yuv4mpeg:file=pipe.y4m. See my post here:
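
The idea behind such a buffering program can be sketched as a fixed-size FIFO sitting between a reader and a writer: the reader keeps filling it while the source is fast, and the writer drains it when the encoder asks for data. This is only an illustration of the data structure, with made-up names and a tiny capacity; it is not how bfr itself is implemented, and a real tool would also need non-blocking I/O on both ends.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_CAP 8   /* tiny on purpose; a real buffer would hold many frames */

typedef struct {
    uint8_t data[DEMO_RING_CAP];
    size_t head;   /* index of the next byte to read */
    size_t len;    /* number of bytes currently stored */
} demo_ring;

void demo_ring_init(demo_ring *r)
{
    assert(r != NULL);
    r->head = 0;
    r->len = 0;
}

/* Store up to n bytes; returns how many were accepted (fewer when full). */
size_t demo_ring_put(demo_ring *r, const uint8_t *src, size_t n)
{
    assert(r != NULL && src != NULL);
    size_t stored = 0;
    while (stored < n && r->len < DEMO_RING_CAP) {
        size_t tail = (r->head + r->len) % DEMO_RING_CAP;
        r->data[tail] = src[stored++];
        r->len++;
    }
    return stored;
}

/* Remove up to n bytes in FIFO order; returns how many were delivered. */
size_t demo_ring_get(demo_ring *r, uint8_t *dst, size_t n)
{
    assert(r != NULL && dst != NULL);
    size_t taken = 0;
    while (taken < n && r->len > 0) {
        dst[taken++] = r->data[r->head];
        r->head = (r->head + 1) % DEMO_RING_CAP;
        r->len--;
    }
    return taken;
}
```

The larger the capacity, the longer the encoder can keep running at full speed while the source stalls.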
http://forum.doom9.org/showthread.php?p=1206916#post1206916

Shinigami-Sama
31st October 2008, 23:37
it has already begun on IRC, the cake is delicious :p

the cake is a lie...



it's actually a pie

Adub
1st November 2008, 00:34
Excellent! I can't wait to back up my tv series in even more quality now!

kemuri-_9
1st November 2008, 00:35
thanks for all the revisions devs, they were delicious.

LoRd_MuldeR
1st November 2008, 01:30
it has already begun on IRC, the cake is delicious :p

I'll open a bottle of beer now to celebrate the occasion http://forum.gleitz.info/images/smilies/cheers.gif

skystrife
1st November 2008, 02:01
Broken, because I'm a retard.

XhmikosR
1st November 2008, 13:39
@skystrife: I'm getting an error with your build that pthreadGC2.dll was not found.

http://img376.imageshack.us/img376/9500/01112008143613hz6.png (http://imageshack.us)

techouse's x264 x86 r1016 (http://techouse.project357.com/builds/x264_x86_r1016_techouse.7z) is working fine.

roozhou
1st November 2008, 14:10
@skystrife: I'm getting an error with your build that pthreadGC2.dll was not found.



Seems skystrife's build dynamically links to the pthread library. He needs a patch to build a statically linked x264.

skystrife
1st November 2008, 15:53
Oh, well fuck. I did something stupid with my msys environment and forgot to set it back to normal. Fixed build coming soon.

Fixed: http://skystrife.com/x264/x264.1016.modified.02.exe