x264 "Macroblock Tree Ratecontrol" testing (committed)


G_M_C
11th August 2009, 10:30
I think we'll get the real step forward when adaptive P-frames are introduced, especially on the subject of fades and dark scenes.

I've seen on the GSoC project page that the majority of the set goals have already been achieved, more goals than DS hoped for at first, judging by his comment on that page. If I understand that page correctly, the code-optimisation phase is under way. After that, maybe a more public "alpha / beta / RC" testing phase can be done (like the testing of mbtree done in this thread).

Before that, I don't think discussing whether fades are good, bad or ugly is useful, because it's already known that fades are not handled optimally.

nixo
11th August 2009, 13:58
I'm having a bit of a problem hitting the desired bitrate using mbtree. Source is Elephant's Dream and I'm using the unpatched 32bit build of rev. 1206 from x264.nl.
Perhaps I'm wrong but it doesn't appear to be a question of saturation.
Options used: -B 8000 --level 4.0 --preset veryslow --tune animation -b 3 -I 24 -i 1 --no-progress --vbv-bufsize 31250 --vbv-maxrate 25000

With mbtree:

Pass1:
avis [info]: 1920x1080 @ 24.00 fps (15691 frames)
x264 [warning]: VBV bitrate (25000) > level limit (20000)
x264 [warning]: VBV buffer (31250) > level limit (25000)
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64
x264 [info]: profile Main, level 4.0
x264 [info]: slice I:836 Avg QP:14.77 size:182341
x264 [info]: slice P:6827 Avg QP:18.22 size: 48533
x264 [info]: slice B:8028 Avg QP:21.68 size: 18782
x264 [info]: consecutive B-frames: 13.9% 28.2% 41.6% 16.3%
x264 [info]: mb I I16..4: 43.8% 0.0% 56.2%
x264 [info]: mb P I16..4: 20.1% 0.0% 0.0% P16..4: 36.9% 0.0% 0.0% 0.0% 0.0% skip:43.0%
x264 [info]: mb B I16..4: 3.4% 0.0% 0.0% B16..8: 18.2% 0.0% 0.0% direct: 8.8% skip:69.7% L0:34.8% L1:41.5% BI:23.7%
x264 [info]: final ratefactor: 17.59
x264 [info]: direct mvs spatial:76.6% temporal:23.4%
x264 [info]: coded y,uvDC,uvAC intra:42.2% 44.5% 18.6% inter:14.3% 10.7% 0.6%
x264 [info]: kb/s:7764.6

encoded 15691 frames, 8.23 fps, 7765.44 kb/s

Pass2:
avis [info]: 1920x1080 @ 24.00 fps (15691 frames)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64
x264 [warning]: target: 8000.00 kbit/s, expected: 7659.36 kbit/s, avg QP: 22.2821
x264 [info]: profile High, level 4.0
x264 [info]: slice I:836 Avg QP:15.62 size:184470
x264 [info]: slice P:6827 Avg QP:19.99 size: 46680
x264 [info]: slice B:8028 Avg QP:23.57 size: 19026
x264 [info]: consecutive B-frames: 13.9% 28.2% 41.6% 16.3%
x264 [info]: mb I I16..4: 21.8% 53.4% 24.7%
x264 [info]: mb P I16..4: 5.5% 10.2% 1.7% P16..4: 23.6% 9.5% 6.3% 0.8% 0.4% skip:42.1%
x264 [info]: mb B I16..4: 0.7% 1.5% 0.4% B16..8: 27.1% 2.2% 2.6% direct: 3.6% skip:62.0% L0:47.1% L1:39.1% BI:13.8%
x264 [info]: 8x8 transform intra:56.6% inter:53.6%
x264 [info]: direct mvs spatial:70.0% temporal:30.0%
x264 [info]: coded y,uvDC,uvAC intra:49.5% 46.7% 22.0% inter:14.1% 7.5% 0.9%
x264 [info]: ref P L0 72.2% 14.9% 8.3% 4.7%
x264 [info]: ref B L0 76.8% 15.5% 7.7%
x264 [info]: kb/s:7655.6

encoded 15691 frames, 1.40 fps, 7656.33 kb/s

Without mbtree:

Pass1:
avis [info]: 1920x1080 @ 24.00 fps (15691 frames)
x264 [warning]: VBV bitrate (25000) > level limit (20000)
x264 [warning]: VBV buffer (31250) > level limit (25000)
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64
x264 [info]: profile Main, level 4.0
x264 [info]: slice I:836 Avg QP:15.67 size:146565
x264 [info]: slice P:7300 Avg QP:17.30 size: 48504
x264 [info]: slice B:7555 Avg QP:20.16 size: 20117
x264 [info]: consecutive B-frames: 19.4% 25.9% 37.2% 17.4%
x264 [info]: mb I I16..4: 42.9% 0.0% 57.1%
x264 [info]: mb P I16..4: 19.8% 0.0% 0.0% P16..4: 40.5% 0.0% 0.0% 0.0% 0.0% skip:39.7%
x264 [info]: mb B I16..4: 3.7% 0.0% 0.0% B16..8: 20.9% 0.0% 0.0% direct:11.5% skip:64.0% L0:34.4% L1:43.4% BI:22.2%
x264 [info]: final ratefactor: 18.90
x264 [info]: direct mvs spatial:84.6% temporal:15.4%
x264 [info]: coded y,uvDC,uvAC intra:45.8% 48.9% 20.5% inter:15.6% 13.8% 0.8%
x264 [info]: kb/s:7691.7

encoded 15691 frames, 8.97 fps, 7692.55 kb/s

Pass2:
avis [info]: 1920x1080 @ 24.00 fps (15691 frames)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 Cache64
x264 [info]: profile High, level 4.0
x264 [info]: slice I:836 Avg QP:15.24 size:160784
x264 [info]: slice P:7300 Avg QP:17.46 size: 48311
x264 [info]: slice B:7555 Avg QP:20.90 size: 22024
x264 [info]: consecutive B-frames: 19.4% 25.9% 37.2% 17.4%
x264 [info]: mb I I16..4: 19.0% 56.0% 25.1%
x264 [info]: mb P I16..4: 3.8% 11.0% 2.2% P16..4: 24.6% 11.3% 7.0% 0.7% 0.3% skip:39.2%
x264 [info]: mb B I16..4: 0.8% 1.6% 0.4% B16..8: 28.7% 2.5% 3.0% direct: 4.5% skip:58.4% L0:45.1% L1:40.2% BI:14.7%
x264 [info]: 8x8 transform intra:60.9% inter:53.3%
x264 [info]: direct mvs spatial:73.9% temporal:26.1%
x264 [info]: coded y,uvDC,uvAC intra:54.2% 54.4% 28.2% inter:15.6% 9.6% 1.0%
x264 [info]: ref P L0 72.4% 15.3% 8.2% 4.1%
x264 [info]: ref B L0 77.0% 15.8% 7.2%
x264 [info]: kb/s:7996.1

encoded 15691 frames, 1.30 fps, 7996.89 kb/s

--
Nikolaj
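
As a cross-check on the figures above, the average bitrate can be recomputed from the per-slice statistics that x264 prints. The sketch below assumes the "size" column is the average frame size in bytes and that 1 kbit = 1000 bits; the function names are illustrative only and are not part of x264.

```python
def bitrate_kbps(slices, fps):
    """Average bitrate in kbit/s from (frame_count, avg_size_bytes) pairs."""
    total_frames = sum(count for count, _ in slices)
    total_bits = 8 * sum(count * size for count, size in slices)
    duration_s = total_frames / fps
    return total_bits / duration_s / 1000.0


def deviation_percent(actual_kbps, target_kbps):
    """Signed deviation of the achieved bitrate from the target, in percent."""
    return 100.0 * (actual_kbps - target_kbps) / target_kbps


# Pass 2 slice statistics copied from the logs above: (count, avg size) for I, P, B.
with_mbtree = [(836, 184470), (6827, 46680), (8028, 19026)]
without_mbtree = [(836, 160784), (7300, 48311), (7555, 22024)]

for label, slices in (("mbtree", with_mbtree), ("no mbtree", without_mbtree)):
    kbps = bitrate_kbps(slices, fps=24.0)
    print(f"{label}: {kbps:.1f} kb/s, {deviation_percent(kbps, 8000):+.2f}% vs target")
```

Both results land within a fraction of a kb/s of the "kb/s" lines in the logs, which puts the mbtree encode roughly 4.3% under the 8000 kb/s target while the non-mbtree encode is within 0.1% of it.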

SquallMX
12th August 2009, 19:06
Hi, I'm using MBTree for the first time, and the max bitrate is not respected by x264, which causes buffer underflows.

Non MBTree (x264 1195)

"D:\Archivos de programa\megui\tools\x264\x264.exe" --profile high --pass 2 --bitrate 5900 --stats "D:\Temp\BR\DBE2.stats" --level 4.1 --keyint 48 --min-keyint 4 --b-adapt 2 --direct auto --deblock -2:-1 --psy-rd 0.8:0.2 --partitions p8x8,b8x8,i4x4,i8x8 --qpmax 40 --ipratio 1.1 --pbratio 1.2 --vbv-bufsize 15000 --vbv-maxrate 15000 --me umh --thread-input --aq-strength 0.8 --ssim --output "D:\Temp\BR\DBE.mkv" "D:\Temp\BR\DBE.avs" --mvrange 511 --aud --nal-hrd --sar 4:3

Max bitrate according to DGAVCIndex: 14.736

MBTree (x264 1201)

"D:\Archivos de programa\megui\tools\x264\x264.exe" --profile high --pass 2 --bitrate 5900 --stats "D:\Temp\BR\DBE2.stats" --level 4.1 --keyint 48 --min-keyint 4 --b-adapt 2 --direct auto --deblock -2:-1 -mbtree --psy-rd 0.8:0.2 --partitions p8x8,b8x8,i4x4,i8x8 --qpmax 40 --ipratio 1.1 --pbratio 1.2 --rc-lookahead 47 --aq-mode 2 --vbv-bufsize 15000 --vbv-maxrate 15000 --me umh --thread-input --aq-strength 0.8 --ssim --output "D:\Temp\BR\DBE2.mkv" "D:\Temp\BR\DBE.avs" --mvrange 511 --aud --nal-hrd

Max bitrate according to DGAVCIndex: 23.718
x264 [warning]: VBV underflow (-99323 bits)

Any Help? :thanks:
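
For readers unfamiliar with the warning above, a VBV underflow means the modelled decoder buffer ran out of bits: frames drained it faster than the maximum rate could refill it. The sketch below is a generic leaky-bucket model, not x264's actual rate control. Only the 15000/15000 VBV settings come from the command line above; the frame sizes, the 24 fps frame rate, the 0.9 initial fill and the function name are assumptions for illustration.

```python
def find_vbv_underflows(frame_bits, maxrate_kbps, bufsize_kbit, fps, initial_fill=0.9):
    """Return (frame_index, fill_bits) for every frame that drives the buffer negative."""
    refill_per_frame = maxrate_kbps * 1000 / fps  # bits arriving per frame interval
    capacity = bufsize_kbit * 1000                # buffer size in bits
    fill = capacity * initial_fill                # assumed starting occupancy
    underflows = []
    for index, bits in enumerate(frame_bits):
        fill -= bits                              # the frame is removed from the buffer
        if fill < 0:
            underflows.append((index, fill))      # a negative fill is the underflow
            fill = 0.0
        fill = min(fill + refill_per_frame, capacity)
    return underflows


# Hypothetical stream: twelve frames of 2,000,000 bits each.
events = find_vbv_underflows([2_000_000] * 12, maxrate_kbps=15000, bufsize_kbit=15000, fps=24)
for index, deficit in events:
    print(f"frame {index}: VBV underflow ({deficit:.0f} bits)")
```

In this toy stream each frame is larger than the per-frame refill, so the buffer drains steadily and eventually goes negative. A rate control that is not allowed to raise the quantizer far enough has no way to shrink such frames.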

Dark Shikari
12th August 2009, 19:16
Stop setting qmax to 40; you're crippling the ability of VBV to do its job.

SquallMX
12th August 2009, 19:24
Stop setting qmax to 40; you're crippling the ability of VBV to do its job.

:thanks:, I will try again using the default value (51), but according to AVInaptic the max quant used in the stream is 29. I will post the results later.

Dark Shikari
12th August 2009, 19:27
:thanks:, I will try again using the default value (51), but according to AVInaptic the max quant used in the stream is 29. I will post the results later.
Avinaptic prints frame quantizers, not block quantizers, the latter of which QPmax controls.

juGGaKNot
12th August 2009, 21:28
start "encode" /b /low /wait "%mypath%\bin\x264.exe" --pass 1 --preset veryslow --level %mylevel% --bitrate %btratex264% --stats "%mypath%\%mpath%\T1\%mymovie%.stats" --log-file "%mypath%\%mpath%\x264_1pass.log" --keyint %kint% --min-keyint %mint% --deblock -2;-2 --fullrange on --trellis 2 --mbtree --ref 4 --bframes 4 --merange 32 --ipratio 1.1 --pbratio 1.1 --vbv-bufsize 20000 --vbv-maxrate 20000 --qcomp 1.0 --aq-mode 1 --aq-strength 0.5 --sar 1:1 --aud --nal-hrd --slow-firstpass --output NUL %myavs%
echo.
echo %langi%
echo.
start "encode" /b /low /wait "%mypath%\bin\x264.exe" --pass 2 --preset veryslow --level %mylevel% --bitrate %btratex264% --stats "%mypath%\%mpath%\T1\%mymovie%.stats" --log-file "%mypath%\%mpath%\x264_2pass.log" --keyint %kint% --min-keyint %mint% --deblock -2;-2 --fullrange on --trellis 2 --mbtree --ref 4 --bframes 4 --merange 32 --ipratio 1.1 --pbratio 1.1 --vbv-bufsize 20000 --vbv-maxrate 20000 --qcomp 1.0 --aq-mode 1 --aq-strength 0.5 --sar 1:1 --aud --nal-hrd --output "%mypath%\%mpath%\T1\%mymovie%.264" %myavs%
echo.

Encoding X264 1st Pass :

avis [info]: 1136x656 @ 30.00 fps (756 frames)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Slow
x264 [info]: cabac=1 ref=4 deblock=1:-2:-2 analyse=0x3:0x133 me=umh subme=10 psy=1 psy_rd=1.0:0.0 mixed_ref=1 me_range=32 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 chroma_qp_offset=-2 threads=3 nr=0 decimate=1 mbaff=0 bframes=4 b_pyramid=0 b_adapt=2 b_bias=0 direct=3 wpredb=1 keyint=300 keyint_min=30 scenecut=40 rc=cbr mbtree=0 bitrate=4000 ratetol=1.0 qcomp=1.00 qpmin=10 qpmax=51 qpstep=4 vbv_maxrate=20000 vbv_bufsize=20000 ip_ratio=1.10 pb_ratio=1.10 aq=1:0.50
x264 [info]: profile High, level 4.0
x264 [info]: slice I:12 Avg QP:27.03 size: 22740
x264 [info]: slice P:300 Avg QP:25.28 size: 23924
x264 [info]: slice B:444 Avg QP:27.49 size: 16857
x264 [info]: consecutive B-frames: 5.0% 41.1% 22.6% 19.9% 11.4%
x264 [info]: mb I I16..4: 27.9% 67.7% 4.5%
x264 [info]: mb P I16..4: 4.1% 19.1% 1.6% P16..4: 36.4% 15.2% 8.1% 0.2% 0.1% skip:15.2%
x264 [info]: mb B I16..4: 0.9% 4.6% 0.5% B16..8: 48.1% 3.6% 4.6% direct:12.6% skip:25.2% L0:45.5% L1:44.4% BI:10.1%
x264 [info]: final ratefactor: 27.51
x264 [info]: 8x8 transform intra:76.2% inter:83.4%
x264 [info]: direct mvs spatial:99.8% temporal:0.2%
x264 [info]: coded y,uvDC,uvAC intra:67.3% 62.1% 27.1% inter:36.6% 29.0% 2.3%
x264 [info]: ref P L0 70.1% 14.8% 9.5% 5.6%
x264 [info]: ref B L0 75.2% 15.4% 9.4%

encoded 756 frames, 1.40 fps, 4741.16 kb/s

Encoding X264 2nd Pass :

avis [info]: 1136x656 @ 30.00 fps (756 frames)
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Slow
x264 [info]: cabac=1 ref=4 deblock=1:-2:-2 analyse=0x3:0x133 me=umh subme=10 psy=1 psy_rd=1.0:0.0 mixed_ref=1 me_range=32 chroma_me=1 trellis=2 8x8dct=1 cqm=0 deadzone=21,11 chroma_qp_offset=-2 threads=3 nr=0 decimate=1 mbaff=0 bframes=4 b_pyramid=0 b_adapt=2 b_bias=0 direct=3 wpredb=1 keyint=300 keyint_min=30 scenecut=40 rc=2pass mbtree=0 bitrate=4000 ratetol=1.0 qcomp=1.00 qpmin=10 qpmax=51 qpstep=4 cplxblur=20.0 qblur=0.5 vbv_maxrate=20000 vbv_bufsize=20000 ip_ratio=1.10 pb_ratio=1.10 aq=1:0.50
x264 [info]: profile High, level 4.0
[16.1%] 122/756 frames, 3.88 fps, 1143.08 kb/s, eta 0:02:43

is this normal ?

Dark Shikari
12th August 2009, 21:51
qcomp=1 is synonymous with mbtree=0.

stpdrgstr
13th August 2009, 08:06
Guys, sorry to look dumb, but, what's a good value of lookahead with 2 gb of ram and a 24 minutes video @ 24 fps?

I was playing with mbtree in a 1 minute video at the same frame rate, using 250 of lookahead, but I don't know what would be safe, yet "the max" for a longer one.

nakTT
13th August 2009, 08:12
Guys, sorry to look dumb, but, what's a good value of lookahead with 2 gb of ram and a 24 minutes video @ 24 fps?

I was playing with mbtree in a 1 minute video at the same frame rate, using 250 of lookahead, but I don't know what would be safe, yet "the max" for a longer one.
Yeah, I also wonder the same thing. How about 2-hour-long movies @25fps with 4GB RAM? Any simple formula that we laymen can use as a general guideline?

Dark Shikari
13th August 2009, 08:13
Yeah, I also wonder the same thing. How about 2-hour-long movies? Any simple formula that we laymen can use as a general guideline?
General advice for laymen?!

Don't touch the bloody thing you gits!

nakTT
13th August 2009, 08:15
General advice for laymen?!

Don't touch the bloody thing you gits!
Sorry, wrong word I guess :D. I mean for not so layman. Come on, some advice please.

Comatose
13th August 2009, 08:21
The new CRF (with mbtree) is rather unstable with the new source, whereas previously CRF would produce encodes of similar sources within about 200 kbps of each other.

Mouryou no Hako (grainy, HD anime):

Episode 2 - source (http://i32.tinypic.com/32zrvv6.png)
Episode 2 - CRF 18.5 (before mbtree, b-pyramid enabled) = ~7000 kbps (http://i31.tinypic.com/16bxdgk.png)
Episode 2 - CRF 13 (with mbtree) = ~6900 kbps (http://i28.tinypic.com/2yvpjzk.png)

Episode 4 - CRF 18.5 (mbtree disabled, b-pyramid enabled) = ~7250 kbps
Episode 4 - CRF 12.85 (with mbtree) = ~5600 kbps

Is it likely that episode 4 just included that many more areas where mbtree is effective (considering grain is heaviest in dark scenes and lightest in bright scenes, and episode 4 had more bright scenes than episode 2)?

edit: In episode 2, mbtree is actually worse at times. Added links to episode 2 above. This is compared to an "old" encode, though, from before mbtree was introduced, so it uses the old b-adapt, etc. I'll encode again, this time using the new version with --no-mbtree. Check the bottom left and to the (your) right of the woman's rightmost arm. The earlier encode was not perfect in these regions either, but it actually got worse. Also, there's artifacting to the left of the close guy's neck.

Astrophizz
13th August 2009, 11:34
Yeah, I also wonder the same thing. How about 2-hour-long movies @25fps with 4GB RAM? Any simple formula that we laymen can use as a general guideline?

I think it would be the number of frames your RAM can hold, so (max lookahead) <= (free RAM)/[(# of pixels)x(bits per pixel in raw format - remember YV12)]

Dark Shikari
13th August 2009, 11:47
edit: In episode 2, mbtree is actually worse at times. Added links to episode 2 above. This is compared to an "old" encode, though, from before mbtree was introduced, so it uses the old b-adapt, etc. I'll encode again, this time using the new version with --no-mbtree. Check the bottom left and to the (your) right of the woman's rightmost arm. The earlier encode was not perfect in these regions either, but it actually got worse. Also, there's artifacting to the left of the close guy's neck.
For the umpteenth time, this is intentional. MB-tree redistributes bits, therefore some parts are guaranteed to get worse.

Comatose
13th August 2009, 13:10
Very kind response... Considering the purpose of the thread (you know, TESTING/FEEDBACK), you'd think posting something like that would be met with more than that...

nakTT
13th August 2009, 15:07
I think it would be the number of frames your RAM can hold, so (max lookahead) <= (free RAM)/[(# of pixels)x(bits per pixel in raw format - remember YV12)]
Thanks for the reply bro,

I think I know how to calculate the number of pixels. But bits per pixel in raw format? I'm not quite sure. Care to elaborate? (or if there is a simpler guide).

:thanks:

DarkZell666
13th August 2009, 16:20
Thanks for the reply bro,

I think I know how to calculate the number of pixels. But bits per pixel in raw format? I'm not quite sure. Care to elaborate? (or if there is a simpler guide).

:thanks:

YV12 is 12 bits/pixel (I'm not sure if it's a coincidence or what though :p).

Selur
13th August 2009, 16:46
hmmm,..

1080p = 1920*1080 = 2 073 600 pixel
Yv12 = 12bit per pixel
-> 2073600 * 12 = 24 883 200 bit per frame

1 GB free RAM = 1024 * 1024 *1024 Byte = 8*1 073 741 824 bit = 8 589 934 592 bit
8589934592 / 24883200 = 345.210...

-> did I go wrong somewhere or should this mean that a rc-lookahead of 345 per GB of free RAM would be fine?
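
The estimate above can be written as a short Python sketch. It only counts raw YV12 pixel data at 12 bits per pixel, so treat the result as an optimistic upper bound rather than a safe setting; the function name is made up for illustration.

```python
def naive_max_lookahead(free_ram_bytes, width, height, bits_per_pixel=12):
    """Frames of raw video that fit in free RAM, ignoring all encoder overhead."""
    bits_per_frame = width * height * bits_per_pixel
    return (free_ram_bytes * 8) // bits_per_frame


# 1 GB (1024^3 bytes) of free RAM, 1080p YV12 source.
frames_per_gb = naive_max_lookahead(1024 ** 3, 1920, 1080)
print(frames_per_gb)  # 345
```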

DarkZell666
13th August 2009, 16:56
Thanks for the reply bro,

I think I know how to calculate the number of pixels. But bits per pixel in raw format? I'm not quite sure. Care to elaborate? (or if there is a simpler guide).

:thanks:

hmmm,..

1080p = 1920*1080 = 2 073 600 pixel
Yv12 = 12bit per pixel
-> 2073600 * 12 = 24 883 200 bit per frame

1 GB free RAM = 1024 * 1024 *1024 Byte = 8*1 073 741 824 bit = 8 589 934 592 bit
8589934592 / 24883200 = 345.210...

-> did I go wrong somewhere or should this mean that a rc-lookahead of 345 per GB of free RAM would be fine?

It looks right on paper, except it doesn't account for x264's own per-frame overhead (without lookahead), which is influenced partly by the number of B-frames and reference frames IMHO (dunno how many MB per frame x264 uses to store MVs, weights, DCT data, etc. though ...).

kemuri-_9
13th August 2009, 18:18
did I go wrong somewhere or should this mean that a rc-lookahead of 345 per GB of free RAM would be fine?
let's not forget that rc-lookahead is capped at 250.
but overall the majority of memory usage consumed by x264 is in the allocation of frames.

frames are relatively complex in their allocation and have a number of factors relating to what parameters were given on the command line (the cpu's cacheline split comes into play here too) in addition to the standard resolution/ mb count factors.

look at the code for x264_frame_new (http://git.videolan.org/gitweb.cgi?p=x264.git;a=blob;f=common/frame.c;hb=HEAD) to get the picture.

by no means is it as simple as what you all have been thinking :rolleyes:

Selur
13th August 2009, 20:43
It wasn't my thinking; the formula was presented by Astrophizz, and I knew it was too simple, but it should at least help to give a general size.
To be on the safe side, 5 MB free per frame should be fine unless there's something huge hiding in the code somewhere ;)

akupenguin
13th August 2009, 22:50
To be on the safe side, 5 MB free per frame should be fine unless there's something huge hiding in the code somewhere ;)
There is. hpel and lowres planes bring the pixel data itself up to (1920+64)*(1088+64)*5.5 = 12MB per frame allocated. And metadata isn't negligible either, especially when it's O(n^2) in max_bframes.
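
A rough sketch of the arithmetic in the post above, assuming the frame dimensions are rounded up to a multiple of 16 (1080 becomes 1088), padded by 64 pixels in each dimension, and cost about 5.5 bytes per pixel once the hpel and lowres planes are included. The rounding rule and the helper names are assumptions for illustration, metadata is ignored entirely, and the 250 cap is the one mentioned earlier in the thread.

```python
RC_LOOKAHEAD_CAP = 250  # cap on rc-lookahead mentioned earlier in the thread


def frame_alloc_bytes(width, height, pad=64, bytes_per_pixel=5.5):
    """Approximate pixel-data allocation per frame, following the figures above."""
    mb_width = -(-width // 16) * 16    # round up to a multiple of 16 (assumption)
    mb_height = -(-height // 16) * 16  # 1080 -> 1088
    return (mb_width + pad) * (mb_height + pad) * bytes_per_pixel


def lookahead_budget(free_ram_bytes, width, height):
    """Frames that fit in free RAM by pixel data alone, clamped to the cap."""
    per_frame = frame_alloc_bytes(width, height)
    return min(int(free_ram_bytes // per_frame), RC_LOOKAHEAD_CAP)


print(frame_alloc_bytes(1920, 1080) / 2 ** 20)    # roughly 12 MiB per 1080p frame
print(lookahead_budget(1024 ** 3, 1920, 1080))    # frames per GiB of free RAM
```

With these assumptions, one GiB holds about 85 frames of 1080p pixel data, roughly a quarter of the naive YV12 estimate, and that is before any metadata is counted.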

Chengbin
14th August 2009, 03:07
Is decoding affected with mbtree? I found seeking much slower with mbtree encodes on my Archos 5.

Selur
14th August 2009, 08:05
@akupenguin: huha, that's a huge one ;)

G_M_C
14th August 2009, 08:56
Am I right in thinking that features like mbtree make switching to 64-bit systems/OSes more interesting? I haven't switched to 64-bit yet, but mbtree has the potential to break the 2 GB process limit (with high resolutions and/or large lookahead values).

wyti
14th August 2009, 09:42
Yeah, you're right, tried it on 1080p with slightly insane values and it crashed :p

LoRd_MuldeR
14th August 2009, 11:41
Am I right in thinking that features like mbtree make switching to 64-bit systems/OSes more interesting?

Switching to a 64-bit OS is always a good thing: it lets you use more than ~3 GB of physical RAM and run 64-bit processes where needed, and 32-bit processes will still work fine.

Anyway, I've had no problem with 32-bit x264 and MB Tree, even with 1080p content. So if you run out of memory, you must be provoking the problem by using "overkill" settings...

G_M_C
14th August 2009, 11:45
Switching to a 64-bit OS is always a good thing: it lets you use more than ~3 GB of physical RAM and run 64-bit processes where needed, and 32-bit processes will still work fine.

Anyway, I've had no problem with 32-bit x264 and MB Tree, even with 1080p content. So if you run out of memory, you must be provoking the problem by using "overkill" settings...

It was more a theoretical question for now i suppose; But who knows what happens if i have need for a 4k resolution encoding with GOP size of 60 (60fps) for instance.

LoRd_MuldeR
14th August 2009, 12:03
It was more a theoretical question for now i suppose; But who knows what happens if i have need for a 4k resolution encoding with GOP size of 60 (60fps) for instance.

Yeah, but I think by the time 4k has been established as a standard (if that ever happens), 32-bit OSes will have almost disappeared...

burfadel
14th August 2009, 12:37
Well, by the time 4k video comes out H.265 would be the standard format (it's just in the planning stage). Then again, depending on the timeframe, encoders would probably be wavelet based and not macroblock based. Just a guess, but GPUs would probably suit wavelet encoding much better than macroblock encoding :)

nurbs
14th August 2009, 12:53
Then again, depending on the timeframe encoders would probably be wavelet based and not macroblock based.
I remember reading in some magazine how wavelet based encoding would in the coming years be used everywhere when I was in school 10 years ago.

Guess it wasn't feasible both from quality and needed cpu power (at least if dirac is any indication).

G_M_C
14th August 2009, 12:55
This is all off-topic, but the fact is that more and more apps seem to want more memory. x264 was very modest in that respect, until mbtree. I asked this question just for myself, because I'm still a little nervous about going over to 64 bit.

My machine works like clockwork, very dependable etc. And I don't want to change that if I don't have to. When I do migrate to 64 bit, I get more stuff I have to think about; Avisynth for instance.

Most Avisynth plugins are only 32 bit. I can pipe 32-bit Avisynth output to x264-64, I know. But if I don't have to, why should I?

But going to 64 bit is inevitable, I just want to time my upgrade as well as I can. I'll be going to Win7 then, but I'll wait until most apps/drivers/tools I use are ported to Win7 or commonplace (clsid's tool for preferring non-MS codecs for instance).

And now back to --mbtree testing, the actual topic of this thread :p

LoRd_MuldeR
14th August 2009, 12:58
Go WindowsXP x64-Edition today and you'll notice no difference to good old 32-bit WindowsXP, except that 64-Bit applications suddenly start to work and 4 GB of RAM are accessible.

Driver support is good, except for exotic/archaic hardware maybe. 32-Bit apps that used to work under XP x86 will work under XP x64 flawlessly...

burfadel
14th August 2009, 13:00
CPU usage would probably be higher for the same quality (although 'same quality' is very hard to judge between the two fundamentally different encoding techniques), but this can be overcome by using the vast parallelism that modern GPUs provide, which I suspect would benefit a wavelet-based codec significantly more than a macroblock codec.

LoRd_MuldeR
14th August 2009, 13:02
CPU usage would probably be higher for the same quality (although 'same quality' is very hard to judge between the two fundamentally different encoding techniques), but this can be overcome by using the vast parallelism that modern GPUs provide, which I suspect would benefit a wavelet-based codec significantly more than a macroblock codec.

Can you explain why Wavelet-based compression is so much more suitable for massive parallelization than DCT-based compression?

Remember that when GPGPU is involved we are talking about thousands of ultra-lightweight threads...

G_M_C
14th August 2009, 13:24
[...]
exotic hardware
[...]


According to most reports my Xonar HDAV13 Slim works best on XP-32 for now. Driver support for Win7-64 is somewhat lacking or having difficulties ;)

So, as long as there is no need to upgrade, I'll wait.

And continuing on topic: I've found a good test clip for testing the fade-in/fade-out artifacting. It's a short clip of the BBC documentary "South Pacific", episode "Vulcanoes", the scene where the camera gets lowered into a cave (actually it's an empty lava tube in Hawaii).

Artifacting (loss of shadow detail) is worst with adaptive AQ, strength 1 and mbtree enabled. Best seems to be AQ=1 (regular AQ), rest the same.

popper
14th August 2009, 16:28
It was more a theoretical question for now i suppose; But who knows what happens if i have need for a 4k resolution encoding with GOP size of 60 (60fps) for instance.

Has anyone actually tried MBtree encoding a high-speed 2K and 4K 30-second+ sequence for real yet? I've not seen any posts to that effect here so far. And where would you even find such a high-speed 2K/4K dataset that's freeware to try online?

burfadel
15th August 2009, 01:18
I believe it's something to do with the processing stages of the wavelet algorithm. Also, a wavelet codec is significantly different from a DCT (macroblock) codec, and would have to be developed from the ground up (at least to be most effective). In this process, it would make sense to make the processing as parallel as possible to make use of the available parallel threads. Remember CPUs are already progressing towards 6 cores (12 threads on the i7), and we're talking probably at least a 10-year timeframe here unless something unforeseen happens; by that stage a CPU may more closely resemble what we see in a modern GPU today. It's all guesswork: even the most intelligent and knowledgeable people in the field can only see a current trends & brick walls scenario (such as transistor leakage in current technologies as they become smaller).

In terms of x64 or 32 bit, if you have no need for 64 bit then it's not worth upgrading at all. I would NEVER recommend upgrading to a 32-bit version of Windows 7; as far as I see it there shouldn't even be a 32-bit version of it! If you have to change over operating systems then definitely get 64 bit! The previous statement that there is no difference between XP and XP x64 except for the extra RAM available and 64-bit application availability is false :) XP x64 is based off the Server 2003 code, which was much more stable and refined for speed than the XP code ever was. Server 2003 & XP x64 carry the Windows version 5.2, whereas XP is 5.1. I ran XP x64 from when it first came out till I got Vista x64, and the difference was noticeable for system usability, especially when I had multiple programmes open. Even back then I could play GTA San Andreas (we're talking about 2005), record a TV show off HDTV, encode in the background at low priority and share out some media stuff to friends all at the same time. That's something that can be easily done now (even with today's GTA 4, Far Cry 2 etc), but back then XP just couldn't handle it. The system had 2 GB of RAM so it wasn't a RAM issue; XP just had the tendency to crash, freeze up, or run slow, whereas XP x64 had no such trouble.

I should point out that in terms of gameplay XP is DirectX 9 only, so you can't compare the speed of Crysis from XP to Vista as the DirectX mode is different! Some say XP is faster, and although that may be true running a single, non-HDD-intensive app, when running multiple applications or tasks XP can fall over. It either crashes, freezes, or runs a particular task much slower than it would otherwise. Vista and Windows 7 (particularly x64 versions) can quite happily run multiple tasks (depending on what they are and what priorities they're set to), so you can encode in the background without it making any noticeable difference in performance as long as the priority is set to low. This can be done on x86 etc as well, but try playing a game while encoding, downloading, recording two separate digital TV channels (not just 2 different streams), having several people actually watching stuff off your computer (say, a housemate watching something in the loungeroom whilst leaving music on in the bedroom, and two others doing the same thing), and several other tasks and still be able to play poorly written games like GTA 4 without making any noticeable difference in game performance! (Hint: on Vista and even on Windows 7, disable application superfetch as per the instructions in another thread.)

If you get a new computer or buy a new copy of Windows, in other words just get the x64 version!

This hasn't been entirely off-topic, as the now much higher memory use of x264 better suits operation under an x64 operating system even if it doesn't hit the 32-bit RAM limit, provided you have at least 4 GB of RAM. For the most part, you will probably have to remind yourself you're actually encoding if you run it as a background process :)

burfadel
15th August 2009, 01:40
Although not directly related to this thread, this also somewhat relates under memory usage for mbtree.

I can't stress enough the need to disable application superfetch under Vista and Windows 7. The principle behind superfetch is to load applications automatically into memory based on past application use, so when you need it, the application loads much faster. This is good in principle on a system where the memory usage is relatively stable (such as a workstation that just uses Word and Excel), but it is fundamentally flawed for other system uses such as video encoding. Basically, in an environment where memory usage constantly changes, application superfetch causes continuous disk activity as it loads those applications into the free memory and then clears them to make way for normal memory use. This is just plain stupid!

Disabling application superfetch will still mean your memory may show 0 free under Task Manager, as it will still use your whole memory as cache, but it keeps only the current session's application use in the cache, not loading and unloading apps you haven't even used. With macroblock tree enabled, the memory use of x264 is higher, so running it can stuff around with superfetch.

DO NOT disable the superfetch service. That will also disable prefetch and other performance enhancing features. Just disable it by doing the following:
- Run regedit (just type 'regedit' in the start menu search area)
- Open 'HKEY_LOCAL_MACHINE' --> 'System' --> 'CurrentControlSet' --> 'Control' --> 'Session Manager' --> 'Memory Management' --> 'PrefetchParameters'
- Double click on 'Enablesuperfetch' and change it from 3 to 2 (or to 3 if at any other value)
- Make sure 'Enableprefetcher' is set to 3
- Don't change anything else in the registry
- Close regedit and run 'Services.msc' from the start menu search area
- Ensure 'Superfetch' is set to automatic and is started (just to make sure some other app hasn't changed it)
- Close and reboot

Although you may notice some apps taking slightly longer to load, overall it is much kinder on your system and hdd. Application superfetch can load and unload several gigabytes per hour which in my mind is pointless!

nurbs
15th August 2009, 09:19
There is some discussion about wavelets in this thread: http://forum.doom9.org/showthread.php?t=147319
Note the first post on the second page.

burfadel
15th August 2009, 12:56
Who knows what could be around in 20 years :) I guess everything is just speculation. I suggested wavelets as a possibility, but the maths behind it could be much more complicated than can currently be done on a modern CPU. For this reasoning, consider that the top-of-the-line processor for consumers 20 years ago was the 386, now consider the difference between a 386 and, say, a Core i7 965, then extrapolate that for the next 20 years. It doesn't quite work like that, I know, but from a conceptual view the reasoning is valid. In a more practical sense, the suggested improvement in processing capability will quite possibly break Moore's Law due to parallelism (and AMD's suggested reverse hyperthreading, which would benefit applications that aren't so capable of parallelism) and other factors. The processing capability could allow for extrinsically complex mathematical processing which may, depending on designers' and programmers' capabilities, overcome the problems inherent in a wavelet codec design.

Guest
15th August 2009, 13:02
Wavelets is OT here. Further posts about it will be silently deleted.

@burfadel

Read and follow the forum rules, and stop taking threads OT.

Mr. Brown
15th August 2009, 14:10
Hello, I have been testing mbtree for the last few days, but my feeling is that it is not quite ready!
positive:
the overall picture quality (psnr, ssim & own eyes + brain 1.0) is better
in my tests (normal movies) I get 20-30% higher bitrate without mbtree
negative:
sometimes in dark scenes I see a very blocky picture where with --no-mbtree the picture was good
and water sometimes also doesn't look normal

so i prefer encoding with --no-mbtree until the problems are fixed.

LoRd_MuldeR
15th August 2009, 14:26
so i prefer encoding with --no-mbtree until the problems are fixed.

Another option would be raising "qcomp" a bit to lower the effect of MBTree RC for now. Worth a try at least, I think...

Chengbin
15th August 2009, 14:53
sometimes in dark scenes I see a very blocky picture where with --no-mbtree the picture was good
and water sometimes also doesn't look normal

That's a problem I'm seeing as well.

Is this fixable with weight-p?

juGGaKNot
16th August 2009, 09:45
Is this normal? I'm trying to test mbtree at full power.

Bit rate mode : Variable
Bit rate : 4 000 Kbps
Maximum bit rate : 12.0 Mbps
Writing library : x264 core 71 r1210 42d6b17
Encoding settings : cabac=1 / ref=5 / deblock=1:-1:-1 / analyse=0x3:0x113 / me=umh / subme=7 / psy=1 / psy_rd=1.0:0.0 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=0 / 8x8dct=1 / cqm=0 / deadzone=21,11 / chroma_qp_offset=-2 / threads=3 / nr=0 / decimate=1 / mbaff=0 / bframes=3 / b_pyramid=0 / b_adapt=2 / b_bias=0 / direct=3 / wpredb=1 / keyint=300 / keyint_min=30 / scenecut=40 / rc_lookahead=50 / rc=2pass / mbtree=1 / bitrate=4000 / ratetol=1.0 / qcomp=0.00 / qpmin=10 / qpmax=45 / qpstep=5 / cplxblur=20.0 / qblur=0.5 / ip_ratio=1.60 / aq=2:1.00

set useles2=--level %mylevel% --bitrate %btratex264% --stats "%mypath%\%mpath%\T1\%mymovie%.stats" --log-file "%mypath%\%mpath%\x264_2pass.log" --keyint %kint% --min-keyint %mint% --fullrange on --ref %myrefs%

start "encode" /b /low /wait "%mypath%\bin\x264.exe" --pass 2 --preset slow %useles2% --deblock -1;-1 --trellis 0 --qcomp 0 --subme 7 --qpmax 45 --qpstep 5 --bframes 3 --ipratio 1.6 --pbratio 1.2 --aq-mode 2 --aq-strength 1.0 --sar 1:1 --aud --output "%mypath%\%mpath%\T1\%mymovie%.264" %myxavs%
echo.

also when will the ratefactor be fixed ?

avis [info]: 1184x666 @ 30.00 fps (1800 frames)
x264 [warning]: width or height not divisible by 16 (1184x666), compression will suffer.
x264 [info]: using SAR=1/1
x264 [info]: using cpu capabilities: MMX2 SSE2Slow
x264 [info]: profile High, level 3.2
x264 [info]: slice I:33 Avg QP:16.43 size: 66387
x264 [info]: slice P:761 Avg QP:19.13 size: 27498
x264 [info]: slice B:1006 Avg QP:25.55 size: 6473
x264 [info]: consecutive B-frames: 9.6% 28.6% 44.3% 17.4%
x264 [info]: mb I I16..4: 22.3% 60.0% 17.7%
x264 [info]: mb P I16..4: 6.4% 16.4% 3.2% P16..4: 32.1% 11.6% 8.2% 0.0% 0.0% skip:22.1%
x264 [info]: mb B I16..4: 0.9% 2.3% 0.5% B16..8: 32.8% 1.5% 1.8% direct: 5.3% skip:54.9% L0:46.3% L1:47.5% BI: 6.2%
x264 [info]: final ratefactor: 11.61
x264 [info]: 8x8 transform intra:62.8% inter:65.4%
x264 [info]: direct mvs spatial:99.8% temporal:0.2%
x264 [info]: coded y,uvDC,uvAC intra:56.3% 59.0% 30.1% inter:15.2% 14.8% 3.1%
x264 [info]: ref P L0 72.4% 12.1% 7.7% 4.2% 3.5%
x264 [info]: ref B L0 83.3% 9.1% 4.8% 2.8%
encoded 1800 frames, 3.26 fps, 3950.72 kb/s

Selur
16th August 2009, 10:27
Where's the problem with the ratefactor? (you aimed for a specific bitrate not a ratefactor)
I agree that when aiming for 4000 kb/s and getting 3950.72 kb/s it would be nice if rate control were a bit stricter,...
Have you tried to set --qpmax to 51 and --qpmin to 1 ? (seems to be reasonable with qcomp 0)
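
For reference, the final bitrate can be cross-checked from the slice statistics in the log above. This is only a quick sketch: the frame counts, average sizes, frame rate and target are copied from that log, and it assumes x264 reports 1 kb as 1000 bits. The rounded average sizes mean the estimate will not match the reported figure exactly.

```python
# Slice statistics from the x264 log: type -> (frame count, average size in bytes).
slices = {"I": (33, 66387), "P": (761, 27498), "B": (1006, 6473)}
fps = 30.0               # from "1184x666 @ 30.00 fps"
target_kbps = 4000.0     # --bitrate used for the encode
reported_kbps = 3950.72  # from "encoded 1800 frames, 3.26 fps, 3950.72 kb/s"

total_frames = sum(count for count, _ in slices.values())
total_bytes = sum(count * size for count, size in slices.values())
duration_s = total_frames / fps

# Assumption: 1 kb = 1000 bits.
estimated_kbps = total_bytes * 8 / duration_s / 1000.0
deviation_pct = (reported_kbps - target_kbps) / target_kbps * 100.0

print(f"frames: {total_frames}, duration: {duration_s:.1f} s")
print(f"estimated bitrate from slice sizes: {estimated_kbps:.2f} kb/s")
print(f"reported vs. target: {deviation_pct:+.2f} %")
```

So the encode undershoots the target by a bit more than 1%.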

Dark Shikari
16th August 2009, 10:28
Where's the problem with the ratefactor? (you aimed for a specific bitrate not a ratefactor)
I agree that when aiming for 4000 kb/s and getting 3950.72 kb/s it would be nice if rate control were a bit stricter,...
Have you tried to set --qpmax to 51 and --qpmin to 1 ? (seems to be reasonable with qcomp 0)

More importantly, setting qcomp=0 is probably a really bad idea... :rolleyes:.

Don't touch what you don't understand.

juGGaKNot
16th August 2009, 11:24
Don't touch what you don't understand.

0 for maximum mbtree is bad
default might be bad for psy rd as you said
1.0 will disable mbtree

so when will the default be right ?

and if qcomp is 0, why does the bitrate vary so much ? 12 Mbps ?

Where's the problem with the ratefactor? (you aimed for a specific bitrate not a ratefactor)

it would be nice to have a valid "quality reference"

ratefactor is 11, quality is good but not great and fades suck.