
View Full Version : x265 HEVC Encoder



mandarinka
12th November 2015, 19:13
I have tested x265 several times over the past year or so, but I keep hitting one quality issue: when encoding somewhat grainy/noisy or textured footage (in my case hand-drawn film/cel animation, where the noise isn't necessarily very strong), there are areas where the grain or texture gets wiped out.

These areas are square, and when I checked them in a stream analyzer several months ago, they seemed to align with block boundaries. These areas looked like they have almost no residual (the blocks look flat), as opposed to their neighbours, which show a noise/grain pattern. Basically it seems to me that x265 does keep grain/noise in some CUs/CTUs, but removes it almost completely in others.

I didn't find any way to prevent that. Does anybody have any advice about how to get the frames to look uniform? I'm trying this at high bitrates (15-20 Mbps at 1440x1080p24), so I would expect it to be possible to get close to transparent quality.

Here is a comparison (http://screenshotcomparison.com/comparison/150244/picture:0) (frames 2, 37, 154) showing uncompressed and compressed frames that suffer from this effect.


Source video (http://ulozto.net/x4xd3Bw8/source-ll-mkv) (200 frames), encode (http://ulozto.net/xCzY5Atn/encode-test-doom9-hevc) (20094 kbps).

commandline:

x265_main_1.8+76-bd8237a5d782.GCC520.exe - --input-res 1440x1080 --fps 24000/1001 --preset placebo --subme 7 --ref 6 --aq-mode 1 --qg-size 16 --rd 6 --psy-rd 1.0 --psy-rdoq 1.0 --rdoq-level 1 --crf 14

(I used 64bit build from this post (http://forum.doom9.org/showpost.php?p=1745333&postcount=2882), 8bit mode, run on Windows 10 64, AMD A10-6800K, fed through avs2yuv).


Looking at the analyzer (assuming it isn't broken) seems to show that these flat-residual blocks happen when a 64x64 CU is used.
Screenshots: just residual (http://i.imgur.com/1DVNLkT.png), with the TU grid shown (http://i.imgur.com/FsOWqFU.png). When I tried to use --ctu, the effect was lessened, but it seems to me that some of the smaller blocks in these areas keep being smoothed/denoised. Perhaps it is the blocks that have a strong outline/edge running through them (here is an image of the same frame with --ctu 16 (http://i.imgur.com/Q3WViVH.png))?

------------------------------------------------------------
P.S.
x265 crashed on me with the same commandline and --ctu 16 ( --preset placebo --subme 7 --ref 6 --aq-mode 1 --qg-size 16 --rd 6 --psy-rd 1.0 --psy-rdoq 1.0 --rdoq-level 1 --crf 14 --ctu 16). I had to manually set --tu-inter-depth 1 and --tu-intra-depth 1. Shouldn't these be set automatically when the CTU size is limited?

nandaku2
13th November 2015, 13:20
@Mandarinka,

Thank you for your test case. I ran the exact commandline, and opened the residual in an analyzer. At 95%+ of the places where bad blocks showed up, the mode chosen was intra DC.

Motenai Yoda
14th November 2015, 15:44
What about it? What effect do you observe on speed and visual quality when using --scaling-list default? What presets would you suggest that we change?

Nothing special. It reminds me of an H.263 CQM: it seems to throw away high frequencies, but this didn't help at all with real-life material (at least not crowd_run or park_joy), even at CRF >30...
It seems to be on par for some animation sources, though, better in some places and worse in others.
At the same bitrate the metrics (avg QP, SSIM, PSNR) look better, but visually it isn't better; it is maybe a bit faster (+5%).

x265_Project
15th November 2015, 19:56
Here is an updated look at the improvement that --limit-refs and --limit-modes can make to encoding speed...
http://x265.org.s3.amazonaws.com/img/LimitRefs&LimitModes2.png
Testing was done with 2 clips each at 720P (3 Mbps), 1080P (6 Mbps), and 4K (15 Mbps) on an Intel Core i7 6700K system, using --preset veryslow.

D3C0D3R
15th November 2015, 22:58
The main reason why this was so efficient for H264 does not apply to HEVC anymore though, so the gains should be smaller, and moving from 8 to 10

Name this reason, please.

mandarinka
17th November 2015, 02:59
Probably referring to the bias in biprediction and interpolation, which was leading to off-by-1 errors in prediction that needed to be corrected by coefficients in the residual. IIRC H.264 had such a problem (and HEVC supposedly does not). If I am recalling this wrong or making up non-existent stuff, then somebody please correct me.

LigH
17th November 2015, 13:40
After a longer while without commits, now another build with important changes. This one might be related to the loss of details, as discussed here by mandarinka and nandaku2:

intra prediction: disable 64x64 analysis

In intra CUs, the predictions are applied for each TU sequentially (and not at the
PU level). This patch turns off all 64x64 intra analysis/modes - to analyse which,
previously, x265 averaged a 64x64 block to 32x32 and then did a prediction search
on this averaged block. This is a bad idea for visual quality, and instead x265
will perform 32x32 predictions sequentially.

x265 1.8+106-e8f9a60d4cd9 (GCC 4.9.2) (https://www.mediafire.com/download/822v2kqltv4dulg/x265_1.8+106-e8f9a60d4cd9.GCC492.7z)
x265 1.8+106-e8f9a60d4cd9 (GCC 5.2.0) (https://www.mediafire.com/download/cqqgr4sepimrh9f/x265_1.8+106-e8f9a60d4cd9.GCC520.7z)
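
To illustrate why a search on an averaged block can favour flat-looking predictions, here is a minimal Python sketch. It is not x265 code: the function names and the synthetic "grain" are assumptions made up for illustration. It averages each 2x2 group of samples of a 64x64 block down to 32x32 and compares the sample variance before and after; most of the grain energy is gone in the averaged block, so a search on it sees much less texture than the source has.

```python
import random

def downsample_2x2(block):
    """Average each 2x2 group of samples, halving width and height."""
    n = len(block)
    return [
        [
            (block[y][x] + block[y][x + 1]
             + block[y + 1][x] + block[y + 1][x + 1]) / 4.0
            for x in range(0, n, 2)
        ]
        for y in range(0, n, 2)
    ]

def variance(block):
    """Population variance of all samples in a 2-D block."""
    samples = [v for row in block for v in row]
    mean = sum(samples) / len(samples)
    return sum((v - mean) ** 2 for v in samples) / len(samples)

def make_grainy_block(size=64, base=128, grain=6, seed=1):
    """Hypothetical test input: a flat area with uniform random 'grain'."""
    rng = random.Random(seed)
    return [[base + rng.randint(-grain, grain) for _ in range(size)]
            for _ in range(size)]

if __name__ == "__main__":
    src = make_grainy_block()
    small = downsample_2x2(src)
    print("variance 64x64 source   :", round(variance(src), 2))
    print("variance 32x32 averaged :", round(variance(small), 2))
```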

littlepox
17th November 2015, 13:48
After a longer while without commits, now another build with important changes. This one might be related to the loss of details, as discussed here by mandarinka and nandaku2:

intra prediction: disable 64x64 analysis

In intra CUs, the predictions are applied for each TU sequentially (and not at the
PU level). This patch turns off all 64x64 intra analysis/modes - to analyse which,
previously, x265 averaged a 64x64 block to 32x32 and then did a prediction search
on this averaged block. This is a bad idea for visual quality, and instead x265
will perform 32x32 predictions sequentially.

x265 1.8+106-e8f9a60d4cd9 (GCC 4.9.2) (https://www.mediafire.com/download/822v2kqltv4dulg/x265_1.8+106-e8f9a60d4cd9.GCC492.7z)
x265 1.8+106-e8f9a60d4cd9 (GCC 5.2.0) (https://www.mediafire.com/download/cqqgr4sepimrh9f/x265_1.8+106-e8f9a60d4cd9.GCC520.7z)

what is the difference between this change and --ctu 32 ?

Barough
17th November 2015, 13:48
Thnx for the new compiles LigH :)

Boulder
17th November 2015, 13:54
After a longer while without commits, now another build with important changes. This one might be related to the loss of details, as discussed here by mandarinka and nandaku2:

intra prediction: disable 64x64 analysis

In intra CUs, the predictions are applied for each TU sequentially (and not at the
PU level). This patch turns off all 64x64 intra analysis/modes - to analyse which,
previously, x265 averaged a 64x64 block to 32x32 and then did a prediction search
on this averaged block. This is a bad idea for visual quality, and instead x265
will perform 32x32 predictions sequentially.

Looks like I need to do some comparisons again. Last time I did those a week ago or so, x265 was much closer to x264 than earlier. I did notice that some flat areas were punished quite heavily while neighbouring areas were spared. Brighter uniform areas with grain didn't suffer as much as darker areas.

foxyshadis
17th November 2015, 15:13
what is the difference between this change and --ctu 32 ?

It only applies to intra CUs. Inter CUs in P/B frames still get 64x64 tested, since they never used the speed hack intra 64x64 did. I'm wondering why the hack was removed entirely instead of being restricted to fast presets. (Without the hack, true 64x64 analysis is prohibitively slow. Maybe a candidate for veryslow or placebo preset?)

x265_Project
17th November 2015, 17:08
It only applies to intra CUs. Inter CUs in P/B frames still get 64x64 tested, since they never used the speed hack intra 64x64 did. I'm wondering why the hack was removed entirely instead of being restricted to fast presets. (Without the hack, true 64x64 analysis is prohibitively slow. Maybe a candidate for veryslow or placebo preset?)

We weren't happy with the design of the 64x64 Intra implementation, so it was removed, and for the moment 64x64 Intra analysis is off entirely. I agree that our highest quality presets should analyze every possibility.

Tom

mandarinka
17th November 2015, 19:35
After a longer while without commits, now another build with important changes. This one might be related to the loss of details, as discussed here by mandarinka and nandaku2:

intra prediction: disable 64x64 analysis

In intra CUs, the predictions are applied for each TU sequentially (and not at the
PU level). This patch turns off all 64x64 intra analysis/modes - to analyse which,
previously, x265 averaged a 64x64 block to 32x32 and then did a prediction search
on this averaged block. This is a bad idea for visual quality, and instead x265
will perform 32x32 predictions sequentially.

x265 1.8+106-e8f9a60d4cd9 (GCC 4.9.2) (https://www.mediafire.com/download/822v2kqltv4dulg/x265_1.8+106-e8f9a60d4cd9.GCC492.7z)
x265 1.8+106-e8f9a60d4cd9 (GCC 5.2.0) (https://www.mediafire.com/download/cqqgr4sepimrh9f/x265_1.8+106-e8f9a60d4cd9.GCC520.7z)

:thanks:

I'll test this soon (currently celebrating the anniversary of November 1989).

Edit: From a cursory look, I think this change really helped with the look of those blocks. I think one can still spot that they differ from the source when comparing, but they look more uniform and aren't smoothed (I guess because the encoder is now inter predicting those areas?). Ideally I should make a before/after comparison to really tell, but I think this really improved the ability to achieve transparent quality, with this kind of source at least.

Oh, and thanks again for looking into the sample, Nandaku2!

x265_Project
18th November 2015, 04:23
:thanks:

I'll test this soon (currently celebrating the anniversary of November 1989).

Edit: From a cursory look, I think this change really helped with the look of those blocks. I think one can still spot that they differ from the source when comparing, but they look more uniform and aren't smoothed (I guess because the encoder is now inter predicting those areas?). Ideally I should make a before/after comparison to really tell, but I think this really improved the ability to achieve transparent quality, with this kind of source at least.

Oh, and thanks again for looking into the sample, Nandaku2!

Thanks for your feedback! We are continuing to work on this, but I think we're on the right track.

nandaku2
18th November 2015, 12:45
It only applies to intra CUs. Inter CUs in P/B frames still get 64x64 tested, since they never used the speed hack intra 64x64 did. I'm wondering why the hack was removed entirely instead of being restricted to fast presets. (Without the hack, true 64x64 analysis is prohibitively slow. Maybe a candidate for veryslow or placebo preset?)

So, this hack is not ideal because it does an extra 64x64 analysis (for sa8d), but proper 32x32 TU RDO for final decision. The new patch biases away from blurry modes by removing an extra analysis - without increasing bits.

LigH
23rd November 2015, 15:25
Interesting proposed patch:

rc: implement 2 pass CRF, when vbv is enabled.

Allow CRF with VBV in 2nd pass to increase the quality of capped CRF in the first pass.

I guess that could be useful for HQ movies getting close to the transfer rates of USB 2 devices (like larger Flash sticks that are not very fast to read) or WLAN with suboptimal signal quality and reduced speed? Or is there a different specific purpose for implementing such a refining bitrate control mode as "least possible quality loss below a bitrate cap"?

sneaker_ger
23rd November 2015, 16:06
Are you really asking what the purpose of a function to minimize quality loss is? ;)

But of course doing a second pass of a complete movie is a lot of time to invest. Would be more interesting if it would be limited to affected scenes only.

LigH
23rd November 2015, 16:50
No, I know why someone would want to minimize loss under the constraint of a maximum bitrate. But I wonder which are the most common usage cases where 2-pass CRF+VBV would be preferred over both 1-pass CRF (without VBV constraint) and 2-pass VBR (with target size ... OK, the latter is only necessary when you have a target capacity as constraint, which you don't have for CRF, with or without VBV constraint).

VBV maximum bitrate constraints are important when the reading speed is not much higher than the decoding speed. "Slow" networks or media are a common reason to apply VBV max rate constraints. I am curious if there are other cases in the minds of the developers as reason to develop this mode than the ones I could imagine.

benwaggoner
23rd November 2015, 18:50
No, I know why someone would want to minimize loss under the constraint of a maximum bitrate. But I wonder which are the most common usage cases where 2-pass CRF+VBV would be preferred over both 1-pass CRF (without VBV constraint) and 2-pass VBR (with target size ... OK, the latter is only necessary when you have a target capacity as constraint, which you don't have for CRF, with or without VBV constraint).

VBV maximum bitrate constraints are important when the reading speed is not much higher than the decoding speed. "Slow" networks or media are a common reason to apply VBV max rate constraints. I am curious if there are other cases in the minds of the developers as reason to develop this mode than the ones I could imagine.

VBV constraints are also required to be compliant with a particular Level. Thus they are pretty critical when targeting hardware decoders, and they can also cap worst-case decoder complexity for software decoding.

I always use VBV for any encode I want to be playable on arbitrary decoders, for the same reason I always follow max refs and other constraints.

Motenai Yoda
23rd November 2015, 22:27
But 1-pass CRF can already be VBV-restricted, so what would a 2-pass one do?

- do a 1st analysis pass, then a 2nd encoding pass where the quality reduction is spread over a larger range, including forward?

- do a classical 1st CRF encode pass without VBV and save a log for the 2nd pass, which re-encodes from scratch only those GOPs around critical sections/zones and copies the others?

Why not instead add a lookahead bitrate estimation to work with the usual CRF+VBV?
(but this may not work so well)

x265_Project
24th November 2015, 00:26
CRF is a great "mode" for x265, but it's really not "rate control", it's constant quality. It's a form of variable bit rate (VBR) encoding. The bit rate can vary as widely as needed in order to maintain a target level of quality. If the video in some places is especially complex (lots of detail + lots of motion), the bit rate will be very high.

CRF can produce very efficient encodes in a single pass. Commercial companies need video encoders that meet real-world conditions (understanding channel bandwidth and the real constraints of hardware decoders). So CRF is often used with VBV to ensure that the bit rate meets real-world requirements. This combination is sometimes called capped VBR.

The downside of capped VBR is that while for most of the title you're able to get constant quality, in the sections where VBV kicks in to enforce the VBV parameters, the quality is lower. This new 2 pass CRF feature in development allows for selective re-encoding of the sections where VBV impacted the quality in order to minimize the severity of the quality impact.

foxyshadis
24th November 2015, 17:55
That's pretty cool! How far back does it rewind time when it hits a VBV limit? Just to the beginning of the frame, or multiple frames, or will that be adjustable?

x265_Project
24th November 2015, 18:31
That's pretty cool! How far back does it rewind time when it hits a VBV limit? Just to the beginning of the frame, or multiple frames, or will that be adjustable?
It will analyze and, if it makes sense, re-encode the affected GOP and the previous GOP. The 2nd pass runs similarly to 2 pass ABR, but it only re-encodes affected areas where it's able to make a difference (essentially reducing quality/bit rate a bit prior to the area of high complexity where VBV kicked in, saving more bits for the area where VBV clamps down, lessening the severity of the quality hit caused by VBV).
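
The selection step described above could be pictured roughly like this. This is only a sketch of the idea in Python, not the actual x265 rate control: the function name and the per-GOP "VBV limited" flags are hypothetical, and the "if it makes sense" decision is left out entirely. Each GOP where VBV constrained the first pass is marked for re-encoding together with the GOP before it.

```python
def gops_to_reencode(vbv_limited):
    """Return sorted indices of GOPs a second pass would revisit.

    vbv_limited: list of bools, one per GOP, True where VBV had to
    clamp the bit rate in the first pass (hypothetical input format).
    """
    selected = set()
    for index, limited in enumerate(vbv_limited):
        if limited:
            selected.add(index)          # the affected GOP itself
            if index > 0:
                selected.add(index - 1)  # the previous GOP, to save bits ahead
    return sorted(selected)

if __name__ == "__main__":
    first_pass = [False, False, True, False, True, True]
    print(gops_to_reencode(first_pass))
```

Everything not in the returned list would be kept from the first pass.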

LigH
24th November 2015, 18:53
From my time in a DVD authoring studio, I remember a similar technique from MPEG-2 encoders trying to create DVD Video compliant results. Even some hardware encoders used such a short distance recoding technique. If I'm not completely wrong, also HCenc may do so, or did once?

Boulder
24th November 2015, 18:54
Yes, that's what HCEnc does. It will re-encode the GOP to meet the VBV demands.

Boulder
25th November 2015, 16:33
That's my bug report right there :) I must say that there has been improvement recently, but I haven't had the time to test the most recent changes which should also help with the issue. I suggest that you run some tests yourself and see what you can come up with.

x265_Project
25th November 2015, 16:59
I would note that in my limited testing, recent development builds of x265 seem to tolerate much higher psy-rd settings without producing strange artifacts. So you should be able to increase detail retention even further with higher psy-rd strength. Feedback from the community on this would be welcomed.

Tom

Boulder
25th November 2015, 19:01
@Boulder: What kind of improvements are you specifically referring to?

Detail (=grain etc.) is not as heavily blurred, or at least not in all cases. Some further changes were made after I tested the last time; you can read the last few pages to see what they have been working on since some evidence came up.

Boulder
25th November 2015, 20:18
If you have the time to spare, --preset veryslow with some low CRF value is a very good option (don't ask me what's low, you'll have to find the level which suits your needs). Maybe experiment with the options mentioned in this post (http://forum.doom9.org/showthread.php?p=1746714#post1746714) to find a good quality/speed tradeoff. Chances are that you won't notice a thing if you enable --limit-refs 3 and --limit-modes.

LigH
26th November 2015, 00:38
The encoder will use more elaborate search methods, taking more time to search for the perfect match so that it has the smallest difference to compress (which usually results in smaller compressed data). But if it does not spend so much effort and does not find the optimum, it only has to compress bigger differences and generate a bigger video stream; it can still retain the same quality ... at least in CRF mode without restraining the bitrate. CRF tries to guarantee a quality loss threshold. If your rate factor value is small enough, enough quality will be retained, just at the cost of bigger results.

Try it yourself with a small video clip and compare by how few percent, or even per mille, the results differ in size, and whether the difference in encoding time is more relevant. In either case, if your rate factor was small enough, you will be satisfied with the quality retention, no matter how elaborately the encoder tries to reduce the size of the result by more efficient encoding.

LigH
26th November 2015, 09:16
Compare to car tuning:

The only reliable way to gain horsepower is by increasing the cylinder capacity. Everything else is minor tuning. But then you will have more fuel consumption as price.

The only reliable way to get "visually transparent" copies is to aim for a small enough rate factor. Everything else is minor tuning. But then you will have more bitrate as price.

You may try to increase the psychovisual parameters (just this week I read that their range of values became more tolerant), then you may get a convenient result already with a bigger CRF (16 is already quite small for x265, it's not the same as in x264, I guess around 20 can be enough). And if your source is grainy, try a large NR parameter (>100).

nandaku2
26th November 2015, 09:52
@pingfr: what was the difference in settings between your 2 encodes; you mentioned 155MB larger - than what?

foxyshadis
26th November 2015, 09:58
Well, to decrease the size you just increase CRF. If you can't tell the difference between two encodes then they're larger than they need to be, and you should test a higher CRF. There's a hard limit to how much you can compress film grain (without generating it, which nothing can really do yet), so the more film grain you have and want to keep, the less useful more advanced compression gets. Aside from that, don't mess with --me and --subme after setting a preset; that's the whole point of using a preset in the first place. Just use medium instead of slow if you want it faster.

psy-rd can reasonably be 0.5-1.0 without causing trouble, higher the grainier your video is; the default 0.3 just hasn't been updated yet.

sneaker_ger
26th November 2015, 17:56
What is it with --tune grain and flattening blocks? Take a look at the sky:

x265 10bit 1.8+120

Source (287 MB) (https://mega.nz/#!hh0nRLia!7BghO78Nto3t9jVz_AObXbHuFd5HlDn7k4XOb_acysc) ( Trim(634,1156), crop(0, 24, 0, -24) )
encodes and screenshot package (https://mega.nz/#F!g913GRKZ!wMmEbi1EN2f35xIdXJtUbw)

settings: --preset medium/slow/slower (--tune grain) --limit-refs 3 --limit-modes --bitrate 7000 --pass 1/2

Original:
http://abload.de/img/140_original_gqsiy.png

Without tuning:
http://abload.de/img/140_medium_pssye.png
http://abload.de/img/140_slow_x3snt.png
http://abload.de/img/140_slower_xes31.png

With tuning:
http://abload.de/img/140_medium_grain_8msrl.png
http://abload.de/img/140_slow_grain_y9szd.png
http://abload.de/img/140_slower_grain_zist4.png

LigH
26th November 2015, 18:36
I would guess that "--tune grain" requires a lot more bitrate now, so the general quantisation got worse.

x265_Project
26th November 2015, 21:05
Pingfr - it would help if you show us your results, and not just your settings. If you use the --csv option with --ssim and --psnr, you could see the relative objective quality measurements for each test encode. While ssim and psnr are far from perfect, they are fine for an understanding of the relative quality between different encodes - especially when you're doing things that trade off quality vs speed.

LigH
26th November 2015, 21:20
They will write technical internal values into a log file in CSV format which you can load into a calculation application (spreadsheet, e.g. Excel) and compare graphs. Much smaller, easier to download, and tells tech-savvy people quite a bit before even watching the video.

Furthermore, the content of log files only is certainly not copyrighted ... I don't know which video you uploaded until I download it.

LigH
26th November 2015, 22:40
You have an AviSynth script as source? Then try avs4x26x (http://forum.doom9.org/showthread.php?t=162656) instead of ffmpeg, much simpler, it will handle piping raw video and forwarding resolution and framerate parameters on its own.

avs4x26x -L x265_x64.exe --preset slow --crf 19 --csv crf19.csv -o crf19.265 source.mkv.avs

Do not put files into C:\ - this should be a UAC protected directory; always create a separate subdirectory (and if you have more than one HDD, preferably on a different internal drive than your Windows system).

Apart from that, the "--csv filename" parameter looks not bad, but read also about adjacent parameters, e.g. "--csv-log-level #"; level 0 will only report a summary per encode, level 1 will write a verbose log with per-frame details (ensure to use separate csv filenames per setting), and level 2 even with some performance statistics.

avs4x26x -L x265_x64.exe --preset slow --crf 19 --csv crf19.csv --csv-log-level 2 -o crf19.265 source.mkv.avs
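
If you would rather compare such logs with a script than in a spreadsheet, a small Python sketch could look like this. Note the assumptions: the column names ("Bits", "SSIM") and the sample rows below are made up for the demo, so check the header line of your own CSV and pass the names you actually find there.

```python
import csv
import io

def read_rows(csv_text):
    """Parse CSV text into dicts, stripping spaces around names and cells."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader)]
    return [
        dict(zip(header, (cell.strip() for cell in row)))
        for row in reader
        if row
    ]

def mean_of(rows, column):
    """Average a numeric column, skipping empty or '-' cells."""
    values = [
        float(row[column])
        for row in rows
        if row.get(column) not in (None, "", "-")
    ]
    return sum(values) / len(values) if values else None

if __name__ == "__main__":
    # Made-up sample text, NOT real encoder output.
    sample = (
        "Encode Order, Type, Bits, SSIM\n"
        "0, I-SLICE, 1000, 0.98\n"
        "1, P-SLICE, 500, 0.96\n"
    )
    rows = read_rows(sample)
    print("mean bits:", mean_of(rows, "Bits"))
    print("mean SSIM:", mean_of(rows, "SSIM"))
```

For real logs you would read the text with something like open("crf19.csv").read() and run the same two functions once per encode.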

LigH
27th November 2015, 08:21
Well, my big respect to your efforts, you certainly invested more time into gaining experience than several other people "demanding the best options".
:helpful:
You will find your personal optimum this way. And sharing the results will hopefully add value to the development.
:goodpost:
For Christmas, I wish more users like you! :cool:

foxyshadis
27th November 2015, 09:26
Note that the only reason to use --limit-refs and --limit-modes is as a speed-up option. I have a feeling they'll probably be incorporated into the presets at some point, but with them you can use more references without the speed hit (you only pay the memory cost now). However, --preset slow is already 3 refs, so --limit-refs 3 isn't actually doing anything in that case. (Unless B-frames count as an automatic ref, like in x264? Not sure.)

The quality of --limit-refs 3 is somewhere between just 3 references and the full reference count, closer to full, but about the same speed as ref 3. You can see how useful it can be with more references.

LigH
27th November 2015, 09:43
Average Quantization P..?: CRF gives a kind of quantizer target, but the encoder (x264 or x265) may already stay below the desired threshold of quality loss at a bigger quantizer for scenes with little detail and little motion. CRF doesn't waste precision where not much precision is required to reconstruct the video well enough.

If you had a preset or a custom command line where you allow at most 8 references per frame, but try to limit references to 3, then "full" would be all 8 references (which is hardly ever necessary).

foxyshadis
27th November 2015, 09:59
Ignore what I said, I was confusing it with an older experimental where only X references were considered... totally forgot that the actual implementation applies the same no matter how many refs you have. Sorry.

LigH
27th November 2015, 10:06
So what's "best" a high or a low value? should I be "worried" or "concerned" in any ways at all? :)

Just in parallel to CRF, a smaller quantizer gives a better quality but a less efficient compression. Imagine it like an integer divisor: Thousands of different values (frequency domain parameters you can hardly imagine as a human) are divided to become only a few hundreds of different values to be compressed; during decoding, they are multiplied again with the quantizer to return to their original magnitude, but are then steps of this quantizer apart from each other (multiples of this quantizer).
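
The "integer divisor" picture can be tried out in a few lines of Python. This is a deliberately simplified sketch of that analogy, not how a real encoder quantizes: the coefficient values are arbitrary and the function names are made up for illustration.

```python
def quantize(coeffs, q):
    """Divide each value by the quantizer and keep only the rounded level."""
    return [int(round(c / q)) for c in coeffs]

def dequantize(levels, q):
    """Multiply the levels back; results are always multiples of q."""
    return [level * q for level in levels]

def max_error(coeffs, q):
    """Largest difference between original and reconstructed values."""
    recon = dequantize(quantize(coeffs, q), q)
    return max(abs(c - r) for c, r in zip(coeffs, recon))

if __name__ == "__main__":
    coeffs = [103, -47, 12, 4, 0]   # arbitrary example values
    for q in (2, 10, 40):
        levels = quantize(coeffs, q)
        print("q =", q, "levels:", levels,
              "reconstructed:", dequantize(levels, q),
              "max error:", max_error(coeffs, q))
```

With a small quantizer there are many distinct levels to store and the reconstruction stays close; with a big one most values collapse to a few levels (cheap to compress) and the error grows.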

Very simplified, you may interpret CRF like: "In the worst case, use a quantizer as low as this, but if you discover that a bigger quantizer is good enough, that's even better". Of course, it's a bit more complex.

Interpreting your logs can take some time. We may have jobs, you know ... patience. :)

nandaku2
27th November 2015, 11:18
Ok - will try to answer.

limit-refs (1/2/3) - this option uses heuristics to determine the best reference to be used for a block. Basically, limit-refs 3 combines the heuristics used in limit-refs 1 and 2 (think bitflags, it has nothing to do with actual number of references). Essentially, this means the encoder finds the best reference faster. For fast moving videos, more references is better for compression efficiency. So, turning on limit-refs will enable the user to increase the number of references (--ref) without the speed penalty. And if you compare commandlines with --ref N, and another with --ref N --limit-refs 3, you will find a small drop in SSIM, but visually nothing at all - and of course, the latter will be significantly faster like you found.
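
To make the "think bitflags" remark concrete, here is a small Python sketch. The constant names are assumptions chosen for this example (they are meant to mirror the two heuristics, one working per depth and one per CU); the point is only that 3 is 1 | 2, and that the value says nothing about how many references are used.

```python
# Illustrative flag values; names are assumptions for this sketch.
REF_LIMIT_DEPTH = 1  # heuristic selected by --limit-refs 1
REF_LIMIT_CU = 2     # heuristic selected by --limit-refs 2

def describe_limit_refs(value):
    """List which heuristics a --limit-refs value (0..3) switches on."""
    if not 0 <= value <= 3:
        raise ValueError("--limit-refs takes a value from 0 to 3")
    enabled = []
    if value & REF_LIMIT_DEPTH:
        enabled.append("depth")
    if value & REF_LIMIT_CU:
        enabled.append("cu")
    return enabled

if __name__ == "__main__":
    for value in range(4):
        print("--limit-refs", value, "->", describe_limit_refs(value) or "off")
```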

@foxyshadis: Indeed, we are just winding up our new preset testing, which will increase --ref and turn on limit-refs for better quality. Early next week.

CRF indicates a certain quality level for that particular sequence- the encoder can use any quantizer (determined by frame complexity) with the only objective being maintaining constant visual quality.

Boulder
27th November 2015, 11:26
When it comes to hardware decoding, what kind of limits are there regarding the number of refs? I don't know if there's a simple table to look at like there is for MPEG-4 AVC at Wikipedia.

nandaku2
27th November 2015, 11:41
The maxDpbSize column here indicates the maximum number of references that a decoder conforming to a specific level will support.
https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding_tiers_and_levels

Motenai Yoda
27th November 2015, 14:47
IIRC x265 can use up to 16 refs, but the HEVC spec allows only up to 8 refs/DpbSize.
And you have to take into account, as part of DpbSize, 1 frame for B-frames and 1 frame for B-pyramid, so it should be 6.

the paper says:
The value of sps_max_dec_pic_buffering_minus1[ HighestTid ] + 1 shall be less than or equal to MaxDpbSize, which is derived as follows:

if( PicSizeInSamplesY <= ( MaxLumaPs >> 2 ) )
  MaxDpbSize = Min( 4 * maxDpbPicBuf, 16 )
else if( PicSizeInSamplesY <= ( MaxLumaPs >> 1 ) )
  MaxDpbSize = Min( 2 * maxDpbPicBuf, 16 )          (A-2)
else if( PicSizeInSamplesY <= ( ( 3 * MaxLumaPs ) >> 2 ) )
  MaxDpbSize = Min( ( 4 * maxDpbPicBuf ) / 3, 16 )
else
  MaxDpbSize = maxDpbPicBuf

where MaxLumaPs is specified in Table A.4 and maxDpbPicBuf is equal to 6.
But this doesn't seem to work well: e.g. with a 1280x768 input and level 4, x265 will reduce ref to 7 but "issue a warning that the resulting stream is non-compliant and it signals the stream as profile NONE and level NONE and will abort the encode unless"
x265 [info]: Lowering max references to 7 to meet numPocTotalCurr requirement
x265 [warning]: level 4 detected, but NumPocTotalCurr (total references) is non-compliant
x265 [info]: NONE profile, Level-NONE (Main tier)
x265 [info]: non-conformant bitstreams not allowed (--allow-non-conformance)
x265 [error]: failed to open encoder
avs [error]: Error occurred while writing frame 2
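
For anyone who wants to plug numbers into the derivation quoted above, here is a minimal Python sketch of it. Assumptions: the MaxLumaPs values are typed in from the level limits table as I remember it, so verify them against the spec, and plain width x height is used for PicSizeInSamplesY, while the real value is based on the coded picture size, which can be slightly larger.

```python
# MaxLumaPs per level (assumed values, check against the spec table).
MAX_LUMA_PS = {
    "1": 36864, "2": 122880, "2.1": 245760, "3": 552960, "3.1": 983040,
    "4": 2228224, "4.1": 2228224,
    "5": 8912896, "5.1": 8912896, "5.2": 8912896,
    "6": 35651584, "6.1": 35651584, "6.2": 35651584,
}
MAX_DPB_PIC_BUF = 6  # maxDpbPicBuf from the quoted text

def max_dpb_size(width, height, level):
    """Evaluate the MaxDpbSize derivation (A-2) for a picture size and level."""
    pic_size = width * height  # approximation of PicSizeInSamplesY
    max_luma_ps = MAX_LUMA_PS[level]
    if pic_size <= (max_luma_ps >> 2):
        return min(4 * MAX_DPB_PIC_BUF, 16)
    if pic_size <= (max_luma_ps >> 1):
        return min(2 * MAX_DPB_PIC_BUF, 16)
    if pic_size <= ((3 * max_luma_ps) >> 2):
        return min((4 * MAX_DPB_PIC_BUF) // 3, 16)
    return MAX_DPB_PIC_BUF

if __name__ == "__main__":
    for width, height in ((1280, 768), (1440, 1080), (1920, 1080)):
        print(width, "x", height, "at level 4 ->",
              max_dpb_size(width, height, "4"))
```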

Vesdaris
27th November 2015, 18:43
Guys, what changes should I make to a medium preset that would allow me to greatly reduce blur and keep as much detail as possible without sacrificing much encoding speed?

Cheers

foxyshadis
27th November 2015, 22:20
IIRC x265 can use up to 16 refs, but the HEVC spec allows only up to 8 refs/DpbSize.
And you have to take into account, as part of DpbSize, 1 frame for B-frames and 1 frame for B-pyramid, so it should be 6.

the paper says:

But this doesn't seem to work well: e.g. with a 1280x768 input and level 4, x265 will reduce ref to 7 but "issue a warning that the resulting stream is non-compliant and it signals the stream as profile NONE and level NONE and will abort the encode unless"
x265 [info]: Lowering max references to 7 to meet numPocTotalCurr requirement
x265 [warning]: level 4 detected, but NumPocTotalCurr (total references) is non-compliant
x265 [info]: NONE profile, Level-NONE (Main tier)
x265 [info]: non-conformant bitstreams not allowed (--allow-non-conformance)
x265 [error]: failed to open encoder
avs [error]: Error occurred while writing frame 2

It should probably be setting anything out-of-spec to level 8.5, which is now the designated "unrestricted" level, although a warning that it's not supported in the base spec is useful.

I don't think that NumPicTotalCurr is the actual maximum size of the DPB, because it's never used in the derivation of the reference picture set, but rather the maximum that any one ref list can include. The DPB for 1280x720 @ L4.0 would be 12 pics. When a DPB is larger than 8, at least 4 pics would then be placed in the RefPicSetStFoll in the RPS (unavailable to the current picture but still available to future frames), without having to be marked "unused for reference" (unavailable for good). I'm trying to sort out from the spec whether it's also possible for B-frames to have completely different frames in each list, meaning all 16 can be referenced at once, but I'm not certain.

I don't think x265's test for conformance is correct here.

x265_Project
28th November 2015, 00:03
Guys, what changes should I make to a medium preset that would allow me to greatly reduce blur and keep as much detail as possible without sacrificing much encoding speed?

Cheers
Increase --psy-rd strength to a value higher than the current default (0.3). You can start with 1.0, and see if this produces any visible artifacts (ghosting around objects, or strange motion artifacts). If not, you can try even higher values. If so, reduce psy-rd strength. Do short tests (limited number of --frames) until you figure out what seems to work best for your content and settings.
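
A short series of such test encodes can be prepared with a few lines of Python. This is only a sketch: the source file name, the output names, the CRF value and the list of strengths are placeholders, and the script just builds the command lines so you can inspect them before running anything.

```python
def psy_rd_test_commands(source, strengths, frames=500,
                         preset="medium", crf=20, encoder="x265"):
    """Build one x265 command line per --psy-rd strength to try."""
    commands = []
    for strength in strengths:
        output = "psyrd_{:.1f}.hevc".format(strength)  # placeholder name
        commands.append([
            encoder,
            "--input", source,
            "--preset", preset,
            "--crf", str(crf),
            "--psy-rd", "{:.1f}".format(strength),
            "--frames", str(frames),   # keep each test short
            "--output", output,
        ])
    return commands

if __name__ == "__main__":
    for command in psy_rd_test_commands("clip.y4m", [0.3, 1.0, 2.0]):
        print(" ".join(command))
```

Each list can be passed to subprocess.run() once the names match your setup; then compare the outputs visually for ghosting or motion artifacts as described above.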