x265 HEVC Encoder [Archive] - Page 91

LigH

26th December 2016, 12:40

Santa was here...

x265 version 2.2 has been released. This release contains new algorithms to limit the search of optimal transform units, a new motion search method, and optimizations to the bitstream. With this release, x265 also supports POWERPC platforms, with key functions optimized by using ALTIVEC kernels.

The latest version can be downloaded from here (https://bitbucket.org/multicoreware/x265/downloads/x265_2.2.tar.gz) (MD5 sum = 36161843a70e4d46af1fa38cf221d0f3). Full documentation is available at http://x265.readthedocs.io/en/stable/.

Release Notes for 2.2
================

Encoder enhancements
----------------------------------
1. Enhancements to TU selection algorithm with early-outs for improved speed; use --limit-tu to exercise.
2. New motion search method SEA (Successive Elimination Algorithm) supported now as --me 4.
3. Bit-stream optimizations to improve fields in PPS and SPS for bit-rate savings through --[no-]opt-qp-pps, --[no-]opt-ref-list-length-pps, and --[no-]multi-pass-opt-rps.
4. Enabled using VBV constraints when encoding without WPP.
5. All param options dumped in SEI packet in bitstream when info selected.
6. x265 now supports POWERPC-based systems. Several key functions also have optimized ALTIVEC kernels.

API changes
-------------------
1. Options to disable SEI and optional-VUI messages from bitstream made more descriptive.
2. New option --scenecut-bias to enable controlling bias to mark scene-cuts via cli.
3. Support mono and mono16 color spaces for y4m input.
4. --min-cu-size of 64 no longer supported for reasons of visual quality (was crashing earlier anyway).
5. API for CSV now expects version string for better integration of x265 into other applications.

Bug fixes
--------------
1. Several fixes to slice-based encoding.
2. --log2-max-poc-lsb's range limited according to HEVC spec.
3. Restrict MVs to within legal boundaries when encoding.

Enjoy!
Pradeep.

New "merge with stable milestone" build:

x265 2.2+2-998d4520d1cf (http://www.mediafire.com/file/fll25auuucv1i85/x265_2.2+2-998d4520d1cf.7z)

Barough

26th December 2016, 17:00

x265 v2.2+5-e40db0bdde4a (http://www8.zippyshare.com/v/vvCRqXqk/file.html) (MSYS/MinGW, GCC 6.2.0, 32 & 64bit 8/10/12bit multilib EXEs)

need4speed

27th December 2016, 17:20

Thanks, but is the SEA motion search really that slow, or is it just me?

need4speed

27th December 2016, 17:21

Everything else the same, a 720p encode goes from 15 fps down to 6?

Barough

27th December 2016, 17:23

Thanks, but is the SEA motion search really that slow, or is it just me?

It's slow......

need4speed

27th December 2016, 18:09

It's slow......

Thanks, just checking really. Output file seems to be a bit better but IMHO and in my own case not worth it.
Is it compatible with tune grain or any issue?

Barough

27th December 2016, 18:20

Thanks, just checking really. Output file seems to be a bit better but IMHO and in my own case not worth it.
Is it compatible with tune grain or any issue?

I'm short on test files (had an HD failure), but it seems to give you a little better output. I don't know how it works with tune grain, though.

The loss of speed vs quality gain is not worth it at all imo.

LigH

28th December 2016, 00:20

At least SEA is faster than the trivial exhaustive search (every sample in the motion range). But it will probably be about as slow as UMH.

There are several motion search algorithms, though, which may miss the optimal vector when it is quite far away from a prediction, and then an intra coded block or a quite coarse match is required, which would worsen the quality per bitrate. Diamond and Hexagonal will only test a small range around the predicted vector. Of course this will be a lot faster. But it may fail for quite random and wide (shaky) motion.

Different motion search algorithms cannot be incompatible with any other part of the encoding, IMHO. They are either able to find the optimal motion vector, or a suboptimal match has to be encoded, which needs more bitrate than an optimal match would minimally require.
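
A minimal sketch of the point about small-pattern searches, assuming a toy cost function instead of a real SAD over pixels. The names `small_diamond_search`, `exhaustive_search` and `toy_cost` are hypothetical and only for illustration; this is not x265 code. The toy cost has a shallow minimum next to the predicted vector and a better one far away, which is the "shaky motion" case described above.

```python
def small_diamond_search(cost, start, max_steps=64):
    """Greedy search: test the 4 neighbours of the current vector and move
    to the best one, until no neighbour improves the cost."""
    best = start
    best_cost = cost(*best)
    for _ in range(max_steps):
        x, y = best
        candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
        nxt = min(candidates, key=lambda mv: cost(*mv))
        nxt_cost = cost(*nxt)
        if nxt_cost >= best_cost:
            break  # local minimum reached, stop searching
        best, best_cost = nxt, nxt_cost
    return best, best_cost


def exhaustive_search(cost, merange):
    """Test every vector in the square [-merange, merange]^2."""
    candidates = [(x, y)
                  for y in range(-merange, merange + 1)
                  for x in range(-merange, merange + 1)]
    best = min(candidates, key=lambda mv: cost(*mv))
    return best, cost(*best)


def toy_cost(x, y):
    """Hypothetical matching cost: mediocre match near (1, 0),
    perfect match far away at (12, 9)."""
    near = (x - 1) ** 2 + y ** 2 + 10
    far = (x - 12) ** 2 + (y - 9) ** 2
    return min(near, far)


print(small_diamond_search(toy_cost, (0, 0)))  # ((1, 0), 10)
print(exhaustive_search(toy_cost, 16))         # ((12, 9), 0)
```

The diamond search tests only a handful of positions but settles for the nearby match; the exhaustive search tests all 1089 positions and finds the distant one.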

pradeeprama

28th December 2016, 05:53

Thanks, just checking really. Output file seems to be a bit better but IMHO and in my own case not worth it.
Is it compatible with tune grain or any issue?

Don't expect any compatibility issues between tune grain and SEA

need4speed

28th December 2016, 06:41

Don't expect any compatibility issues between tune grain and SEA
Thank you all for the reply.
Been playing around for almost two years now with hevc and the improvements have been terrific.
The main issue was and somehow still is grain and detail retention but slowly we're getting there.
I'm not a pixel peeper, hence at the moment, with the latest updates, I have settled for
Medium, crf 19, main10, output10, ctu32, max tu 16, early skip, limit modes, qcomp 0.75, me star, no SAO, no strong intras, range limited, deblock -6:-6, tune grain.
This is my setup for TV shows 1920 to 720 and works very well and fast.
Really satisfied but still looking for fine tuning since details are still a bit of an issue. A lot better but still not there.
10-14 fps and really glad about final video.
Any suggestion here to improve details retention?
Thanks again!

Boulder

28th December 2016, 07:08

--tune grain works wonders when detail and grain retention are important.

need4speed

28th December 2016, 07:39

--tune grain works wonders when detail and grain retention are important.
Tune grain has been a real surprise, and it works wonders.
Again, not a pixel peeper and have tried a lot of fine tuning over the past two years. There is still room for improvement in the detail game, but with my preset I can hardly tell the difference between avc and hevc.
Speed for quality is my personal goal and, to be honest, messing around with a lot of settings wasn't really any help or difference. Except for tune grain, again.
My biggest concern about hevc is that I don't know anything about some parameters, in terms of how messing around with one can break or spoil another.
The forum is a huge help, a lot of technical details and explanations for each item but not much about how one can influence another and how.
So it's a try and go but, again, a terrific improvement along these two years.
So thanks again for all the hard work!

aymanalz

28th December 2016, 09:10

Thanks, but is the SEA motion search really that slow, or is it just me?

In my testing, I'm getting about 12-14 FPS with umh and star, and 4 fps with sea, all other settings being the same.

So yes, it is slow, at least for my test clips and parameters.

LigH

28th December 2016, 09:25

SEA being a lot slower than UMH surprises me. I expected it to narrow down the square to test more thoroughly in each step. Did you use very large motion search ranges in your test? Or does SEA really test more than narrowing down only the best quadrant so far?

I visualized how I imagine them (http://forum.doom9.org/showthread.php?p=1789660#post1789660) and explained why I see UMH and SEA rather close up (http://forum.doom9.org/showthread.php?p=1789870#post1789870). Unfortunately, no developer with real insight confirmed or corrected yet.

Even the idea by 708145 of adaptively selecting motion search methods (by gathered statistics about the amount of motion, or by assumptions about the efficiency depending on frame prediction types - close B frames may be estimated well with faster methods already) sounds interesting.

aymanalz

28th December 2016, 10:22

SEA being a lot slower than UMH surprises me. I expected it to narrow down the square to test more thoroughly in each step. Did you use very large motion search ranges in your test? Or does SEA really test more than narrowing down only the best quadrant so far?

I visualized how I imagine them (http://forum.doom9.org/showthread.php?p=1789660#post1789660) and explained why I see UMH and SEA rather close up (http://forum.doom9.org/showthread.php?p=1789870#post1789870). Unfortunately, no developer with real insight confirmed or corrected yet.

Even the idea by 708145 of adaptively selecting motion search methods (by gathered statistics about the amount of motion, or by assumptions about the efficiency depending on frame prediction types - close B frames may be estimated well with faster methods already) sounds interesting.

I know, and I was rather surprised too, since I had read your explanations earlier.

My experience has always been that umh and star usually have the same speed, within 10% of each other. Neither has been much faster or slower than the other in any of my tests so far.

I just tested a few clips with sea yesterday, and I tried merange of the default 57, and then 40. In both cases, sea was three times slower than umh or star.

I'll wait for more people to report their findings on the question of speed. (Did you test any with sea, btw?)

LigH

28th December 2016, 12:16

My mistake about UMH may be that it first checks the narrow neighborhood around the predicted position, and only extends to the wide multi-hex pattern with recursive refinement if the narrow range did not yet find a good match. If it works this way, then it can indeed be about as fast as hex or star when motion is mainly regular and well predicted.

Przemek_Sperling

28th December 2016, 13:54

Well, I have to say that x265, in general, behaves differently from x264, and I cannot explain it. I have three computers (Intel Core 2 Duo and Sandy Bridge laptops and an AMD Rana powered desktop). I noticed that x265 likes Intel architectures, and I am not writing only about the Sandy Bridge but also about both of my old computers. My desktop (AMD Athlon x3 435 oc'ed to 3.71 GHz) is around twice as fast as my laptop (Core 2 Duo 2.4 GHz) if I encode DVDs with x264, but only ~50% faster when I switch to x265 (I use VidCoder).
On the other hand, both my Intel CPUs are far more sensitive if I change some settings. SEA slows them down, but it has only a minor impact on my AMD CPU.

HWK

28th December 2016, 16:01

Well, I have to say that x265, in general, behaves differently from x264, and I cannot explain it. I have three computers (Intel Core 2 Duo and Sandy Bridge laptops and an AMD Rana powered desktop). I noticed that x265 likes Intel architectures, and I am not writing only about the Sandy Bridge but also about both of my old computers. My desktop (AMD Athlon x3 435 oc'ed to 3.71 GHz) is around twice as fast as my laptop (Core 2 Duo 2.4 GHz) if I encode DVDs with x264, but only ~50% faster when I switch to x265 (I use VidCoder).
On the other hand, both my Intel CPUs are far more sensitive if I change some settings. SEA slows them down, but it has only a minor impact on my AMD CPU.

Raw speed is not the only factor that determines how fast an encoder will run; there are other factors, such as the instruction set extensions available on the host CPU.

brumsky

28th December 2016, 16:03

Tune grain has been a real surprise, and it works wonders.
Again, not a pixel peeper and have tried a lot of fine tuning over the past two years. There is still room for improvement in the detail game, but with my preset I can hardly tell the difference between avc and hevc.
Speed for quality is my personal goal and, to be honest, messing around with a lot of settings wasn't really any help or difference. Except for tune grain, again.
My biggest concern about hevc is that I don't know anything about some parameters, in terms of how messing around with one can break or spoil another.
The forum is a huge help, a lot of technical details and explanations for each item but not much about how one can influence another and how.
So it's a try and go but, again, a terrific improvement along these two years.
So thanks again for all the hard work!

A while back I went through testing a number of settings and combinations of settings. I found that --rd 5 gives the biggest improvement in quality with the least impact on speed. IMHO, it's the best bang for the buck.

LigH

28th December 2016, 16:56

@ Przemek_Sperling:

Different available instruction set extensions (SSE4+, AVX+) and architecture details (e.g. CPU internal cache strategy, instruction pipelining, etc.) speed up several parts of the encoding differently. Their relation to each other will change with the whole architecture, and you may notice a bottleneck on one CPU architecture in one part of the whole encoding algorithm, and on another CPU architecture in a different one.

There is even a noticeable difference among different generations of Intel processors.

need4speed

28th December 2016, 18:58

A while back I went through testing a number of settings and combinations of settings. I found that --rd 5 gives the biggest improvement in quality with the least impact on speed. IMHO, it's the best bang for the buck.
Will try for sure, thanks!
Still testing with samples from different sources and bitrate. Again, can't see any difference in files with sea or star. Maybe it's just me really, but at this stage wondering what good sea might be.

birdie

28th December 2016, 20:33

Well, I have to say that x265, in general, behaves differently from x264, and I cannot explain it. I have three computers (Intel Core 2 Duo and Sandy Bridge laptops and an AMD Rana powered desktop). I noticed that x265 likes Intel architectures, and I am not writing only about the Sandy Bridge but also about both of my old computers. My desktop (AMD Athlon x3 435 oc'ed to 3.71 GHz) is around twice as fast as my laptop (Core 2 Duo 2.4 GHz) if I encode DVDs with x264, but only ~50% faster when I switch to x265 (I use VidCoder).
On the other hand, both my Intel CPUs are far more sensitive if I change some settings. SEA slows them down, but it has only a minor impact on my AMD CPU.

Without numbers your comment looks like utter BS.

1) Encoding time for the same source given the same encoder, encoding settings and OS load
2) Throttling data (throttling or not)
3) Frequency data

Then you forget that most recent AMD CPUs have 40-50% lower IPC than most recent Intel CPUs, so naturally Intel CPUs must be faster. Test again after you buy/assemble a PC based on the AMD Ryzen arch. Also mind that AMD CPUs of today don't support the AVX2 instruction set and their AVX implementation leaves a lot to be desired.

Last but not least, I am pretty sure most programmers optimize/profile/compile for the most prevalent CPU architecture, and that happens to be Intel's.

Instead of a thousand words:

http://images.anandtech.com/graphs/graph10705/83862.png

The top AMD CPU featuring 4 cores is as fast as Intel's i3 CPU featuring 2 cores.

Motenai Yoda

29th December 2016, 14:48

The top AMD CPU featuring 4 cores is as fast as Intel's i3 CPU featuring 2 cores.
AMD's CPUs don't have 4 cores, but 2 cores, each with 2 IPU/logic modules and a single FPU module.

mandarinka

29th December 2016, 16:21

Seems there are two interesting patches on the ML that could bring quality improvements.

https://mailman.videolan.org/pipermail/x265-devel/2016-December/010852.html ("AQMotion")
https://mailman.videolan.org/pipermail/x265-devel/2016-December/010853.html ("SSIM based RDO for mode selection")

@Yoda
That's not really correct; in their characteristics, including performance, they are much closer to cores than not. It is just that the per-MHz performance ("IPC") of these cores is very low. If you disable the second thread in each pair, you will only get a very small speedup (5-10% usually).

I mean, you are right that elements including FPU (SIMD unit) are shared. But it is still more correct to call them cores than anything else.

Atak_Snajpera

29th December 2016, 16:55

Then you forget that most recent AMD CPUs have 40-50% lower IPC than most recent Intel CPUs, so naturally Intel CPUs must be faster. Test again after you buy/assemble a PC based on the AMD Ryzen arch. Also mind that AMD CPUs of today don't support the AVX2 instruction set and their AVX implementation leaves a lot to be desired.

To be honest, AMD FX does not even have "full" AVX256 like Intel. Proof is here:
http://i.cubeupload.com/byOi6P.png

LigH

29th December 2016, 19:22

x265 2.2+15-a18ab7656c30 (https://www.mediafire.com/file/wh752cnzrnm994t/x265_2.2+15-a18ab7656c30.7z)

Speedups in multipass encoding (reusing of more 1st-pass data); SSIM based RDO for mode selection; and some bugfixes.

jlpsvk

29th December 2016, 21:42

Yay.. :) Getting more than 7fps with RipBot "distributed encoding" on 14-cores together (3 PC's) with setting:
--crf 20 --output-depth 10 --preset slow --rd 6 --tu-intra-depth 4 --tu-inter-depth 4 --amp
--cbqpoffs -3 --qpstep 8 --bframes 8 --rc-lookahead 60 --min-keyint 23 --keyint 240 --no-open-gop
--colorprim bt709 --colormatrix bt709 --transfer bt709 --deblock -3:-3 --psy-rdoq 10 --no-sao --rskip
--crqpoffs -3 --high-tier --limit-tu 3 --qg-size 8

Any suggestions for increasing quality (without raising the bitrate too much)?

pradeeprama

30th December 2016, 04:54

Yay.. :) Getting more than 7fps with RipBot "distributed encoding" on 14-cores together (3 PC's) with setting:
--crf 20 --output-depth 10 --preset slow --rd 6 --tu-intra-depth 4 --tu-inter-depth 4 --amp
--cbqpoffs -3 --qpstep 8 --bframes 8 --rc-lookahead 60 --min-keyint 23 --keyint 240 --no-open-gop
--colorprim bt709 --colormatrix bt709 --transfer bt709 --deblock -3:-3 --psy-rdoq 10 --no-sao --rskip
--crqpoffs -3 --high-tier --limit-tu 3 --qg-size 8

Any suggestions for increasing quality (without raising the bitrate too much)?

If you don't mind the impact to speed, then you can try disabling rskip (--no-rskip), and limit-tu (just remove limit-tu from your command line). You can also try other ME search modes, and increasing subme for better quality.

shinchiro

30th December 2016, 11:13

Tried aq-motion setting but there's more noise around edges compared to when disabled

jlpsvk

30th December 2016, 20:43

If you don't mind the impact to speed, then you can try disabling rskip (--no-rskip), and limit-tu (just remove limit-tu from your command line). You can also try other ME search modes, and increasing subme for better quality.

What SUBME are you recommending? ME search "higher" than star is slowing down significantly... :(

LoRd_MuldeR

31st December 2016, 12:22

SEA being a lot slower than UMH surprises me. I expected it to narrow down the square to test more thoroughly in each step.

I'm not surprised at all :p

To my understanding, SEA (this is what x264 uses in "ESA" mode [1], and from there it apparently was ported to x265) is supposed to be a full search. It is mathematically equivalent to a simple "brute force" search on the whole search space, but considerably faster than that, because it eliminates candidates that certainly can't win. UMH, on the other hand, is not a full search. "UMH is much more aggressive than SEA at removing candidates without even looking at them".

See also:
* https://forum.doom9.org/showthread.php?t=141568
* https://x265.readthedocs.io/en/default/cli.html#cmdoption--me

[1] If you look at the x264 source, you'll see that the case block for (T)ESA mode also implements a brute-force search (aka "plain old exhaustive search"), but that code is disabled by a hard #if 0 in favor of SEA (aka "successive elimination by comparing DC before a full SAD")

LigH

31st December 2016, 12:46

Then we still have to know how the elimination works, and which steps are calculated in which order.

I imagined SEA as separating the whole motion search range into quadrants, and selecting the 1/4 quadrant with the best match, then separating this quadrant again into sub-quadrants, until they are only one square sample wide (even when a quite good match is found early, just to ensure there is no better one). That would mean a constant complexity related to the binary logarithm of the number of samples in the motion search range. No lucky best case.

For UMH instead, there are 3 different patterns: 4 samples star, 6 samples narrow hex, and 48 samples wide hex. My mistake was probably to believe that UMH always searches the wide pattern first. But that would mean a constant high complexity. Instead, looking at the "ASCII art" sample by akupenguin (https://forum.doom9.org/showthread.php?p=693742#post693742) again, I believe now that the wide pattern is only tried if the star and hex steps were considered not successful enough. If this strategy is correct, then there is no more surprise that UMH can be much faster when motion is often well-predicted.

LoRd_MuldeR

31st December 2016, 12:54

Then we still have to know how the elimination works, and which steps are calculated in which order.

I think it eliminates candidates which, by looking at the DC alone, cannot be better than the best candidate found so far.

For those "cannot win anyway" candidates, computing a full SAD is avoided and thus CPU time is saved. Simply put, you first have a quick look at each candidate, and you do the "accurate" (expensive) check only when it's actually worth it.

Also, whenever you actually do find a better candidate than your previous best one, even more of the remaining (i.e. not yet checked in-depth) candidates may be dropped. That's the "successive elimination" part, I suppose.

But, in contrast to other methods (e.g. UMH), you never discard any candidate that possibly (though unlikely) might have been the winner. For that reason, even though SEA is faster than "plain old exhaustive search", it's still relatively slow.

The SEA optimization is possible "...because sum(abs(diff)) >= abs(diff(sum))". If you really want to understand it in all detail, you'll probably have to study the source code:
http://git.videolan.org/?p=x264.git;a=blob;f=encoder/me.c;h=cafdfbe5995c882cf944f47a01ef0a13a360cb14;hb=HEAD#l632
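
The elimination described above can be sketched as follows. This is a minimal illustration and not the x264 or x265 code: the function names are hypothetical, the cost is a plain SAD without any motion vector bit cost, and the block sums ("DC") are recomputed for every candidate for brevity, whereas a real encoder would presumably take them from precomputed tables so that the check is much cheaper than a full SAD. The skip test relies on the inequality quoted above, sum(abs(diff)) >= abs(diff(sum)): the difference of the block sums is a lower bound for the SAD.

```python
import random


def block_sum(block):
    """Sum of all samples in a block (its "DC")."""
    return sum(sum(row) for row in block)


def get_block(frame, x, y, n):
    """Cut the n x n block with top-left corner (x, y) out of a frame."""
    return [row[x:x + n] for row in frame[y:y + n]]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def sea_search(cur_block, ref_frame, n):
    """Full search with successive elimination.

    Returns (best_sad, best_position, number_of_full_sads_computed).
    """
    cur_dc = block_sum(cur_block)
    best, best_pos, full_sads = None, None, 0
    height, width = len(ref_frame), len(ref_frame[0])
    for y in range(height - n + 1):
        for x in range(width - n + 1):
            cand = get_block(ref_frame, x, y, n)
            # |sum(cur) - sum(cand)| <= SAD(cur, cand), so a candidate whose
            # lower bound is not below the best SAD so far cannot win.
            lower_bound = abs(cur_dc - block_sum(cand))
            if best is not None and lower_bound >= best:
                continue
            full_sads += 1
            cost = sad(cur_block, cand)
            if best is None or cost < best:
                best, best_pos = cost, (x, y)
    return best, best_pos, full_sads


rng = random.Random(0)
ref = [[rng.randrange(256) for _ in range(24)] for _ in range(24)]
cur = get_block(ref, 10, 7, 8)  # copied from the reference, so a perfect match exists
best_sad, best_pos, full_sads = sea_search(cur, ref, 8)
print(best_sad, best_pos, full_sads, "full SADs of", (24 - 8 + 1) ** 2, "candidates")
```

How many full SADs are skipped depends on the content and the scan order, but the best SAD is always the same as for a plain exhaustive search, since no candidate that could still win is ever dropped.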

LigH

31st December 2016, 13:21

I hoped for someone who was involved in implementing them to explain them; that was too optimistic, so far... :(

Motenai Yoda

31st December 2016, 14:49

--[no-]aq-motion
Adjust the AQ offsets based on the relative motion of each block with respect to the motion of the frame.
The more the relative motion of the block, the more quantization is used. Default disabled.

Requires AQ Mode to be on.
IIRC this should be roughly the same as mb-tree/cu-tree?

--cutree, --no-cutree
Enable the use of lookahead's lowres motion vector fields to
determine the amount of reuse of each block to tune adaptive
quantization factors. CU blocks which are heavily reused as motion
reference for later frames are given a lower QP (more bits) while CU
blocks which are quickly changed and are not referenced are given
less bits. This tends to improve detail in the backgrounds of video
with less detail in areas of high motion. Default enabled

Selur

1st January 2017, 09:29

Refine analysis in 2 pass based on analysis information from pass 1
--multi-pass-opt-analysis doesn't support refining analysis through multiple-passes; it only reuses analysis from the second-to-last pass to the last pass.Disabling reading
To use '--multi-pass-opt-analysis', during which passes do I need to set the option? (During all? Only during the last pass?)

Cu Selur

Ps.: Happy New Year!

CruNcher

2nd January 2017, 05:45

Hmm, did somebody test the efficiency of x265 when re-transcoding other encoders' results (2nd generation), and how the complexity on the decoding side turns out?

And compare that against x264 and its encoding/decoding complexity?

I mean, you only find tests that talk about the encoding efficiency, but nobody really seems interested in how the encoding/decoding complexity goes through the roof, in the overall efficiency, and in how that plays out visually.

Especially in the current state of 10-bit encoders/decoders.

Leo 69

2nd January 2017, 13:03

Can someone elaborate what ssim-rd does exactly?

LigH

2nd January 2017, 13:20

Not yet in the online documentation...

It uses the more elaborate, but also visually quite human-like SSIM metric to optimize the rate distortion in the mode decision step. From the source:

/* SSIM based RDO, based on residual divisive normalization scheme. Used for mode
* selection during analysis of CTUs, can achieve significant gain in terms of
* objective quality metrics SSIM and PSNR */

Remember, SSIM is still an objective metric, meaning a computer can calculate numbers to compare against other methods and options; human image perception, in contrast, is subjective, so SSIM will not necessarily produce the best-looking results for every person watching and judging the video. Different people may find the same quality loss annoying to different degrees.
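
To make "objective metric" concrete, here is a minimal sketch of the textbook SSIM formula on a single window, next to plain SSE. It is an assumption-laden illustration, not how x265 implements --ssim-rd: the function names are hypothetical, the statistics use the population variance over one 16-sample window, and the usual constants for 8-bit video (K1 = 0.01, K2 = 0.03) are assumed. The two distorted versions have the same SSE but different SSIM, which is why the two metrics can rank encoding decisions differently.

```python
def mean(values):
    return sum(values) / len(values)


def ssim_single_window(x, y, max_value=255):
    """Textbook SSIM over one window of samples (no sliding window)."""
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    mx, my = mean(x), mean(y)
    var_x = mean([(a - mx) ** 2 for a in x])
    var_y = mean([(b - my) ** 2 for b in y])
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    numerator = (2 * mx * my + c1) * (2 * cov + c2)
    denominator = (mx ** 2 + my ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def sse(x, y):
    """Sum of squared errors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


original = [100 + 2 * i for i in range(16)]   # a smooth ramp
brighter = [p + 4 for p in original]          # uniform brightness shift
noisy = [p + (4 if i % 2 == 0 else -4)        # alternating +/-4 noise
         for i, p in enumerate(original)]

print(sse(original, brighter), sse(original, noisy))
print(ssim_single_window(original, brighter))
print(ssim_single_window(original, noisy))
```

Both distortions change every sample by 4, so the SSE is identical, yet SSIM rates the uniform shift as much closer to the original than the alternating noise. Whether a viewer agrees is the subjective part.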

Barough

2nd January 2017, 18:34

x265 v2.2+17-a2fe29ca5c6c (http://www64.zippyshare.com/v/UD0c4Ki3/file.html) (MSYS/MinGW, GCC 6.2.0, 32 & 64bit 8/10/12bit multilib EXEs)

Leo 69

2nd January 2017, 21:23

Not yet in the online documentation...

It uses the more elaborate, but also visually quite human-like SSIM metric to optimize the rate distortion in the mode decision step. From the source:

/* SSIM based RDO, based on residual divisive normalization scheme. Used for mode
* selection during analysis of CTUs, can achieve significant gain in terms of
* objective quality metrics SSIM and PSNR */

Remember, SSIM is still an objective metric, meaning a computer can calculate numbers to compare against other methods and options; human image perception, in contrast, is subjective, so SSIM will not necessarily produce the best-looking results for every person watching and judging the video. Different people may find the same quality loss annoying to different degrees.

Thank you, but I still don't get it. :) When I switch it on, psy-rd automatically switches off. I understand to get proper SSIM figures, we always turn off all psychovisual enhancements. But I can't understand the mechanics of this option in case of normal encoding.

Has anyone had any chance to test aq-motion, is it any good?

LigH

2nd January 2017, 22:02

Looks like you have either Psy-RD opt, or SSIM-RD opt, or no RD opt.

Has anyone had any chance to test aq-motion, is it any good?

If it does no good ... why implement it? In theory, when you see a lot of motion, you can't recognize details easily. So scenes with more motion are quantized more coarsely, and this saves bitrate for scenes with less motion, where you have the time to look for quality loss.

Magik Mark

3rd January 2017, 00:21

Guys,

Experimented with the new CLI options:

--aq-motion
--ssim-rd
--multi-pass-opt-analysis
--multi-pass-opt-distortion

x265 2.2+17-a2fe29ca5c6c:[Windows][MSVC 1910][64 bit] 10bit
Encoding settings : cpuid=1173503 / frame-threads=5 / numa-pools=28 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=1920x1080 / interlace=0 / total-frames=154345 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=3 / no-allow-non-conformance / no-repeat-headers / annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop / min-keyint=23 / keyint=250 / bframes=4 / b-adapt=2 / b-pyramid / bframe-bias=0 / rc-lookahead=20 / lookahead-slices=6 / scenecut=40 / no-intra-refresh / ctu=64 / min-cu-size=8 / no-rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=0 / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=2 / limit-refs=3 / no-limit-modes / me=1 / subme=2 / merange=57 / temporal-mvp / weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / sao / no-sao-non-deblock / rd=3 / no-early-skip / rskip / no-fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / rdpenalty=0 / psy-rd=0.00 / psy-rdoq=0.00 / no-rd-refine / analysis-mode=0 / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=abr / bitrate=2991 / qcomp=0.60 / qpstep=4 / stats-write=0 / stats-read=2 / cplxblur=20.0 / qblur=0.5 / ipratio=1.40 / pbratio=1.30 / aq-mode=3 / aq-strength=1.00 / cutree / zone-count=0 / no-strict-cbr / qg-size=32 / no-rc-grain / qpmax=69 / qpmin=0 / sar=0 / overscan=0 / videoformat=5 / range=0 / colorprim=2 / transfer=2 / colormatrix=1 / chromaloc=0 / display-window=0 / max-cll=0,0 / min-luma=0 / max-luma=1023 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / opt-qp-pps / opt-ref-list-length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / aq-motion

I get a lot of blocks of pixels jumping from frame to frame. Can you help figure out which of the new CLI options is causing this?

http://i.imgur.com/ju8iFkS.png

Ma

3rd January 2017, 02:36

Experimented with the new CLI options:

--aq-motion
--ssim-rd
--multi-pass-opt-analysis
--multi-pass-opt-distortion

I get a lot of blocks of pixels jumping from frame to frame. Can you help figure out which of the new CLI options is causing this?

I can confirm this. The simplest options to reproduce are:
-D10 --bitrate 3000 --ssim-rd

In my sample with -D8 (instead of -D10) options the quality is OK. Strange...

Magik Mark

3rd January 2017, 02:44

I can confirm this. The simplest options to reproduce are:
-D10 --bitrate 3000 --ssim-rd

In my sample with -D8 (instead of -D10) options the quality is OK. Strange...

Are you saying that it is only --ssim-rd causing this? How about the other CLI options I have mentioned?

Thanks for prompt response!

Ma

3rd January 2017, 02:58

Are you saying that it is only --ssim-rd causing this? How about the other CLI options I have mentioned?

If I use all your options without --ssim-rd, it is OK.
-D8 --ssim-rd is also OK (crf & bitrate).
-D10 --ssim-rd is wrong (crf & bitrate).

Magik Mark

3rd January 2017, 03:41

Thanks ma for confirming!

D10 means 10bit encoding right?

pradeeprama

3rd January 2017, 05:08

Thank you, but I still don't get it. :) When I switch it on, psy-rd automatically switches off. I understand to get proper SSIM figures, we always turn off all psychovisual enhancements. But I can't understand the mechanics of this option in case of normal encoding.

ssim-rd uses SSIM as an additional distortion metric on top of SSE. Since ssim doesn't capture noise introduced in the picture as a result of lossy encoding, we continue to use SSE instead of completely replacing it with SSIM. We first would like to experiment exhaustively with ssim-rd before enabling psy-rd along with ssim-rd as they may have some interesting implications on each other; that is the reason why psy-rd is turned off right now when using ssim-rd.

pradeeprama

3rd January 2017, 05:09

Thanks ma for confirming!

D10 means 10bit encoding right?

Thanks for the test - we will look into this.

pradeeprama

3rd January 2017, 05:14

--[no-]aq-motion
Adjust the AQ offsets based on the relative motion of each block with respect to the motion of the frame.
The more the relative motion of the block, the more quantization is used. Default disabled.

Requires AQ Mode to be on.
IIRC this should be roughly the same as mb-tree/cu-tree?

--cutree, --no-cutree
Enable the use of lookahead's lowres motion vector fields to
determine the amount of reuse of each block to tune adaptive
quantization factors. CU blocks which are heavily reused as motion
reference for later frames are given a lower QP (more bits) while CU
blocks which are quickly changed and are not referenced are given
less bits. This tends to improve detail in the backgrounds of video
with less detail in areas of high motion. Default enabled

aq-motion attempts to look only at the relative motion of a block with respect to the frame that this block is a part of, to increase QP and save some bits. cutree looks at how this block is referenced by future blocks, to give more bits to those blocks that are referenced more. They can therefore work orthogonally to each other, IMO.