Is x265 fundamentally broken?


Grojm
22nd September 2019, 12:33
I just found this: https://bitbucket.org/multicoreware/x265/issues/397/file-size-vs-preset-goes-the-wrong

It seems that x265 preset options are fundamentally broken. The slower presets will just increase file size dramatically (up to 50%) without offering better image quality (maybe in benchmarks, but not in real world).

What do you think? Is it still worth using x265 or should we look for alternatives?

nevcairiel
22nd September 2019, 12:40
The issue smells of half-information and no actual scientific testing.

Size between presets is not comparable when using CRF. If you want a proper comparison, either use a two-pass encode at a fixed bitrate, or adjust the CRF value until the output size is similar. Only then compare quality, both objectively and subjectively.

Nothing in that issue has the makings of decent testing. And CRF has always been vastly misunderstood by many people. It's not designed to give a consistent file size over a wide range of settings, so file size is not a comparable value at all if you change the preset with CRF. Consistent file size is provided by ABR or CBR modes instead.
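
A minimal sketch of the two-pass approach, assuming the x265 command-line encoder and its --input, --output, --preset, --bitrate, --pass and --stats options. The file names and the 2000 kbps target are placeholders, not values from the linked issue. The helper only builds the command lines; run them yourself, then compare the quality of the resulting files, which should come out at roughly the same size:

```python
import os


def two_pass_commands(source, preset, bitrate_kbps, output):
    """Build the two x265 command lines for a 2-pass encode at a target bitrate.

    Nothing is executed here; the caller decides how to run the commands.
    """
    stats = output + ".stats"  # per-preset stats file, so runs do not clash
    common = [
        "x265", "--input", source,
        "--preset", preset,
        "--bitrate", str(bitrate_kbps),
        "--stats", stats,
    ]
    first = common + ["--pass", "1", "--output", os.devnull]  # analysis pass
    second = common + ["--pass", "2", "--output", output]     # final encode
    return first, second


# Same source and same target bitrate for both presets (placeholder values).
for preset in ("fast", "slow"):
    for cmd in two_pass_commands("input.y4m", preset, 2000, preset + ".hevc"):
        print(" ".join(cmd))
```

With both encodes forced to the same average bitrate, any difference you see or measure is a difference in quality per bit, which is the thing a preset is supposed to change.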

Grojm
22nd September 2019, 12:48
CRF should give a constant quality. So in theory, if I change to a slower preset and use the same CRF, file size is expected to go down while offering the same quality.

nevcairiel
22nd September 2019, 12:49
CRF is only "constant quality" when no other settings change. The "constant quality" refers to the encode you are doing right now, i.e. the quality is constant throughout the entire encode. It's not constant when you change every other setting.

If you want to compare quality vs. size, do as I explained above: dial in settings that produce the same file size, and _then_ compare the quality of these. Only then can you begin to judge how different slow vs. fast is.

Grojm
22nd September 2019, 13:05
This is very confusing. Then the metric used by CRF to determine quality must be very fragile and counterintuitive. I'd expected that it would optimize for some visual model to offer such and such quality level. But I understand your point.

Asmodian
23rd September 2019, 19:51
Then the metric used by CRF to determine quality must be very fragile and counterintuitive. I'd expected that it would optimize for some visual model to offer such and such quality level. But I understand your point.

Yes, no quality metric is very accurate. Assigning a single number to quality (so computers can understand it) has been one of those long-running unsolved problems since the beginning.

None of the "constant quality" modes from any codec are constant when you change other settings; x264 behaves in almost exactly the same way. In general the faster settings have noticeably lower quality at the same CRF.

NikosD
24th September 2019, 07:01
It has been shown many times already that the slower presets of x265 in CRF mode are desperately trying to extract quality from the video stream repeating the same algorithms again and again, but most of the time they increase the size of the encoded video without actually increasing quality.

It would be wise to stay at "x265 slow" and not go slower, but nowadays it would be even wiser to try a hardware H.265 encoder like Turing's, which produces small, good-quality (in many cases even better than x265) and extremely fast H.265 encodes.

Asmodian
24th September 2019, 11:48
are desperately trying to extract quality from the video stream repeating the same algorithms again and again, but most of the time they increase the size of the encoded video without actually increasing quality.

While the quality improvements become less and less significant for the CPU time used, x265 does not increase the size without also increasing the quality. You need to increase the CRF value as you decrease the speed to keep the same size, but the quality/size ratio goes up too. Do some tests with two-pass targeting the same size. If you are already transparent, it can simply seem like only the size goes up, but you could also choose a slightly lower CRF (e.g. -0.05) and it would seem like only the size goes up too.

I think you do not understand the consequences of this 'repeating the same algorithms again and again': it only keeps the best choice out of everything it tries.

NikosD
24th September 2019, 13:51
According to an analysis I read recently, lower presets will get you smaller file sizes.

The preset tells your encoder how hard to try to get the quality, so even if it reaches the right quality but still has time to search, it will try find ways to decrease the data required even further.

But there’s an anomaly, sometimes using slower presets (or larger parameters) you will get a larger file size because sometimes with CQ the encoder will search within the parameters you allowed it but couldn’t find enough quality, so it just saves the frame.

But if your parameters are searching far and wide for more options, it can locate tiny pieces of extra quality to fit in and once it finds them, it will use however many bits it needs to fill the target.

Asmodian
24th September 2019, 20:01
Slower presets are not setting "how hard to try to get the quality". That doesn't even make sense: speed settings don't change any quality metrics, that is what CRF sets. A slower preset allows x265 to test more encoding methods (it always picks the most efficient method), but it does not make it want to preserve detail any more or weight quality vs. size differently.

Do you have any sources? I think this is one of those wrong urban legends based on bad testing (using CRF encodes) and a misunderstanding of how x265 works. Also, the desire to say the faster preset is better quality is pretty tempting if you feel like you should use the best quality possible but don't actually want it to be that slow. Sour grapes, if you will.

benwaggoner
24th September 2019, 20:31
Reading through the readthedocs, it's obvious how quality is different between presets, because different modes controlled by different presets use different ways to measure and optimize quality.

Presets control --rd, for a big example. What even gets included in rate distortion optimization changes with rd level. So how chroma quality is measured and optimized changes between 2 and 4. Something with complex chroma running at a fast preset might give smaller size because it is just making the chroma look terrible.

aq-mode also has fundamental impact on how the codec works in practice.

NikosD
24th September 2019, 20:57
It's late September of 2019.

I think it's time to declare project x265 dead.

It can't get any other AVX2/AVX512 optimizations and it can't improve quality in the same size any more.

Time to move on to H.265 hardware encoding and other codecs (AV1)

microchip8
24th September 2019, 21:05
It's late September of 2019.

I think it's time to declare project x265 dead.

It can't get any other AVX2/AVX512 optimizations and it can't improve quality in the same size any more.

Time to move on to H.265 hardware encoding and other codecs (AV1)

I think you're wrong. If x265 is "dead" so is x264. Yet they still get developed further. x265 first listens to its customers and what they want. That's the first priority of the x265 devs. After this comes all other

Asmodian
24th September 2019, 21:08
Time to move on to H.265 hardware encoding and other codecs (AV1)

What? :confused:

Hardware encoding still offers worse quality, and AV1 and similar are not ready yet and are not supported by current hardware for playback. I am totally confused as to what your point is. You just irrationally hate x265?

RanmaCanada
25th September 2019, 04:51
It's late September of 2019.

I think it's time to declare project x265 dead.

It can't get any other AVX2/AVX512 optimizations and it can't improve quality in the same size any more.

Time to move on to H.265 hardware encoding and other codecs (AV1)

AVX512 is useless. The amount of heat that is generated when it's enabled actually slows down your encodes. Also the sheer amount of power used during it makes it far less efficient.

Hardware encoding is garbage. You have fixed functions that can not be changed until the actual hardware itself is updated, at extreme cost to the user.

AV1 is nowhere near ready for home users. It requires a literal order of magnitude more processing power than HEVC to get not even similar results.

If we're to declare it dead, should we declare MPEG2 dead as well? MPEG1? How about Huffy?

MeteorRain
25th September 2019, 17:20
AVX512 is useless. The amount of heat that is generated when it's enabled actually slows down your encodes. Also the sheer amount of power used during it makes it far less efficient.

I'd say they are all implementation related.

When AVXx was first introduced, it was sometimes even slower than SSEx. How about now? We all enjoy AVX2 bringing a 20-30+% performance boost. AVX512 right now is not even widely supported, and there's only Intel's implementation.

It's also widely known that Intel downclocks when executing lots of AVX2 code. Now that AMD finally has a proper AVX2 implementation, we observe that it does not downclock. So we can still hope for a day when we have full-speed AVX512 and it eventually outperforms AVX2.

NikosD
25th September 2019, 18:14
I'd say they are all implementation related.

Excellent post, agreed on everything.
But I must say that my low-TDP (65 W) Coffee Lake Refresh Core i3 9100F, on a budget motherboard with a budget cooler, can keep the absolute, total power virus called Prime95 SmallFFTs running at the all-core turbo clock, exactly the same as any other workload, while running extremely optimized AVX2-FMA3 code (for the few minutes that I was running the torture test).
So, after the "buggy" first AVX2 implementation in Haswell, Intel has improved its AVX2 implementation a lot in later generations.

Blue_MiSfit
27th September 2019, 19:19
It's late September of 2019.

I think it's time to declare project x265 dead.

It can't get any other AVX2/AVX512 optimizations and it can't improve quality in the same size any more.

Time to move on to H.265 hardware encoding and other codecs (AV1)

x265 is generally an excellent HEVC encoder, and it's by far the best free HEVC encoder. It's _widely_ used professionally, and is actively developed. The folks at MulticoreWare are responsive, and extremely sharp.

It's far from dead. If it's dead to you that's fine, but that's not what the industry thinks. In any case, there's plenty of interesting developments in AV1 and hardware HEVC encoding.

BTW - all the off-topic posts related to an older version of this statement have been moved by the HEVC forum moderation team. Let's stay on topic, please.

LoRd_MuldeR
27th September 2019, 19:27
This is very confusing. Then the metric used by CRF to determine quality must be very fragile and counterintuitive. I'd expected that it would optimize for some visual model to offer such and such quality level. But I understand your point.

That is not how CRF mode works. There is no "elaborate" quality metric.

If we ignore MB-Tree for now, it is pretty much just a "complexity" measure based on the expected frame size (in bits) in conjunction with the qComp ("quantizer curve compression") algorithm and a fixed (user-defined) scaling factor.

Read here for some basic information on how the different RC modes work:
https://code.videolan.org/videolan/x264/blob/master/doc/ratecontrol.txt

x264's ratecontrol [and thus x265's] is based on libavcodec's, and is mostly empirical. But I can retroactively propose the following theoretical points which underlie most of the algorithms:
You want the movie to be somewhere approaching constant quality. However, constant quality does not mean constant PSNR nor constant QP. Details are less noticeable in high-complexity or high-motion scenes, so you can get away with somewhat higher QP for the same perceived quality.
On the other hand, you get more quality per bit if you spend those bits in scenes where motion compensation works well: A given artifact may stick around several seconds in a low-motion scene, and you only have to fix it in one frame to improve the quality of the whole scene.
Both of the above are correlated with the number of bits it takes to encode a frame at a given QP.[...]

2pass:
Given some data about each frame of a 1st pass (e.g. generated by 1pass ABR, below), we try to choose QPs to maximize quality while matching a specified total size. This is separated into 3 parts:
(1) Before starting the 2nd pass, select the relative number of bits to allocate between frames. This pays no attention to the total size of the encode. The default formula, empirically selected to balance between the 1st 2 theoretical points, is "complexity ** 0.6", where complexity is defined to be the bit size of the frame at a constant QP (estimated from the 1st pass).
(2) Scale the results of (1) to fill the requested total size. Optional: Impose VBV limitations. Due to nonlinearities in the frame size predictor and in VBV, this is an iterative process.
[...]

1pass, ABR:
The goal is the same as in 2pass, but here we don't have the benefit of a previous encode, so all ratecontrol must be done during the encode.
(1) This is the same as in 2pass, except that instead of estimating complexity from a previous encode, we run a fast motion estimation algo over a half-resolution version of the frame, and use the SATD residuals (these are also used in the decision between P- and B-frames). Also, we don't know the size or complexity of the following GOP, so I-frame bonus is based on the past.
(2) We don't know the complexities of future frames, so we can only scale based on the past. The scaling factor is chosen to be the one that would have resulted in the desired bitrate if it had been applied to all frames so far.[...]

1pass, CRF:
(1) Same as ABR.
(2) The scaling factor is a constant based on the --crf argument.[...]
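
To make the quoted description a bit more concrete, here is a rough Python sketch of the relative allocation ("complexity ** 0.6") and of the scaling step. It is a deliberate simplification, not x264's or x265's actual code: it ignores VBV, MB-Tree and the nonlinearities the text mentions, and `rate_factor` is just a hypothetical stand-in for the constant that the encoder derives from --crf. The complexity numbers are made up.

```python
def relative_bits(complexities, qcomp=0.6):
    """Relative allocation between frames: complexity ** 0.6 by default."""
    return [c ** qcomp for c in complexities]


def two_pass_allocation(complexities, total_bits, qcomp=0.6):
    """2pass: scale the relative allocation so it fills the requested size."""
    weights = relative_bits(complexities, qcomp)
    scale = total_bits / sum(weights)  # chosen to hit the size target
    return [w * scale for w in weights]


def crf_allocation(complexities, rate_factor, qcomp=0.6):
    """CRF: the same relative allocation, but the scaling factor is a constant."""
    return [w * rate_factor for w in relative_bits(complexities, qcomp)]


# Two made-up clips: per-frame "bit size at a constant QP".
simple_clip = [1000.0, 1200.0, 900.0, 1100.0]
complex_clip = [8000.0, 9500.0, 7000.0, 12000.0]

# 2pass hits the requested total regardless of content.
print(sum(two_pass_allocation(simple_clip, 50_000)))
print(sum(two_pass_allocation(complex_clip, 50_000)))

# CRF with one fixed factor: the total simply follows the complexity.
print(sum(crf_allocation(simple_clip, 10.0)))
print(sum(crf_allocation(complex_clip, 10.0)))
```

Nothing in the CRF path looks at a size target, so the resulting size is whatever the complexities make it. The only difference from 2pass is where the scaling factor comes from.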


On the subject of presets and CRF mode:

In general, a "slower" preset improves the "quality per bit" ratio (better quality at same file size), at the cost of slower encoding speed. Conversely, a "faster" preset improves encoding speed at the cost of a worse "quality per bit" ratio (worse quality at same file size). But: When it comes to CRF mode, there is no guarantee (and I think there never was) that a specific CRF value still results in the same absolute file size (average bitrate) after switching to another preset!

Therefore, it is quite possible that, after switching to a "slower" preset, you end up with better quality and a larger file – at the same CRF value. That alone doesn't say anything! However, if you carefully adjusted your CRF value, in order to find the highest CRF value that still gives satisfactory quality, and if you did this separately for each preset, then you would probably find that, in the end, the "slower" preset can get away with a smaller file than the "faster" one.
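
The procedure described above could be sketched like this in Python. The search helper is generic; everything named `toy_*` is a made-up model used only so the example runs, not a measurement of x265. In practice `is_acceptable` would wrap a real encode plus your own viewing test or metric, and the sketch assumes that quality never improves as CRF goes up.

```python
def highest_acceptable_crf(is_acceptable, crf_values):
    """Return the largest CRF for which is_acceptable(crf) holds, or None."""
    for crf in sorted(crf_values, reverse=True):
        if is_acceptable(crf):
            return crf
    return None


# Made-up toy models: "slow" gets a bit more quality at a given CRF,
# but also spends a bit more at that same CRF.
TOY_BONUS = {"fast": 0.0, "slow": 4.0}
TOY_COST = {"fast": 1.0, "slow": 1.1}


def toy_quality(preset, crf):
    return 100.0 - 2.0 * crf + TOY_BONUS[preset]


def toy_size(preset, crf):
    return 1000.0 * TOY_COST[preset] * 2.0 ** (-crf / 6.0)


def compare_presets(presets, crf_values, min_quality):
    """For each preset, find its highest acceptable CRF and the size there."""
    results = {}
    for preset in presets:
        crf = highest_acceptable_crf(
            lambda c: toy_quality(preset, c) >= min_quality, crf_values
        )
        size = toy_size(preset, crf) if crf is not None else None
        results[preset] = (crf, size)
    return results


results = compare_presets(["fast", "slow"], range(16, 29), min_quality=60.0)
for preset, (crf, size) in results.items():
    print(preset, crf, size)
```

In this toy model "slow" is larger than "fast" at the same CRF, yet once each preset is tuned to its own highest acceptable CRF, "slow" ends up with the smaller file. That is only an illustration of the argument, not evidence for it; real encodes are needed for that.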


It's late September of 2019.

I think it's time to declare project x265 dead.

It can't get any other AVX2/AVX512 optimizations and it can't improve quality in the same size any more.

Time to move on to H.265 hardware encoding and other codecs (AV1)

I think you're wrong. If x265 is "dead" so is x264. Yet they still get developed further. x265 first listens to its customers and what they want. That's the first priority of the x265 devs. After this comes all other

BTW: x265 version 3.2 has just been released:
https://bitbucket.org/multicoreware/x265/commits/353572437201d551381002aebf20d244bd49ef17

Also, the longer an encoder has already been under development, the more effort will be required to get an additional improvement. So, it is kind of "natural" that things improve fast at the beginning and slow down after a while...

LoRd_MuldeR
27th September 2019, 19:31
Please note: All off-topic posts have been moved to "moderation". The same is going to happen to any further post that does not add something relevant to the topic.

:thanks:

Grojm
28th September 2019, 19:05
Reading through the readthedocs, it's obvious how quality is different between presets, because different modes controlled by different presets use different ways to measure and optimize quality.

Presets control --rd, for a big example. What even gets included in rate distortion optimization changes with rd level. So how chroma quality is measured and optimized changes between 2 and 4. Something with complex chroma running at a fast preset might give smaller size because it is just making the chroma look terrible.

aq-mode also has fundamental impact on how the codec works in practice.

THIS was the missing piece of information I needed. Thanks!


Because I always thought, if you give the encoder more time (slower preset) to try different encoding settings, at least the efficiency will stay the same or improve. So, if you keep the quality metric fixed (howsoever you define your metric, doesn't matter), the resulting file size will be at least the same or smaller.
But, if with a slower preset, also the metric itself changes, of course this is not the case any more.

I haven't thought about how the preset changes how the metric is calculated itself.

However, I think there should be found a way to keep the metric fixed for different presets. This would avoid a lot of confusion.

benwaggoner
30th September 2019, 18:50
However, I think there should be found a way to keep the metric fixed for different presets. This would avoid a lot of confusion.
The problem is that useful metrics are more expensive to calculate, and so aren't going to be used in faster presets. And there isn't any way to set a metric directly when encoding in a useful mode. Sure you could make an encoder that would encode to exactly 30 PSNR, but it would do it by making every frame and likely every pixel that, which would be a horribly suboptimal way to encode visually.
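
As a small illustration of why a single PSNR number says little about how something looks, here is a sketch using the standard definition, PSNR = 10 * log10(MAX^2 / MSE). The 16-sample "frames" are made up: one spreads a small error over every sample, the other puts all the error into a single sample.

```python
import math


def psnr(reference, distorted, max_value=255):
    """PSNR in dB between two equally long sequences of sample values."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_value ** 2 / mse)


reference = [100] * 16
evenly_off = [102] * 16            # every sample off by 2
one_spike = [108] + [100] * 15     # a single sample off by 8

print(psnr(reference, evenly_off))
print(psnr(reference, one_spike))
```

Both distortions have the same mean squared error and therefore the same PSNR, although one is a faint uniform shift and the other is a single visible outlier.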

What we want an encoder to do is to provide optimal subjective quality, and that can only really be tested by humans in a double-blind environment. VMAF attempts to estimate what a human would rate content, and it's the least-bad metric available to the public. But it falls short in many ways, doesn't do HDR at all, etcetera.

If we had a perfect objective metric, all encoders would do is optimize for that. Alas, things are vastly more complicated.

Z2697
8th May 2026, 09:04
I came across this thread and was amazed that OP still refers to CRF as "metric" after all the discussion.
There's no code that compares the source frame and the reconstructed frame, then says "this is Rate Factor 18" or "this is not Rate Factor 18, let's try something different", and feeds that back to the encoder.

rwill
8th May 2026, 18:05
Well you can say something like "this reconstructed frame is the result of unmodified preset slow and crf 18" if the stream was encoded this way...

Z2697
8th May 2026, 18:29
Well you can say something like "this reconstructed frame is the result of unmodified preset slow and crf 18" if the stream was encoded this way...

That's like knowing the answer prior to the question :rolleyes: