x265 HEVC Encoder [Archive] - Page 86


Jamaika

12th September 2016, 10:06

A question came up in the mailing list about the implementation of tiles. According to Wikipedia, HEVC supports different parallel processing tools (https://en.wikipedia.org/wiki/High_Efficiency_Video_Coding#Parallel_processing_tools). x265 implemented WPP (Wavefront Parallel Processing) first, but is just implementing slices as well (imaginable as horizontal stripes of possibly different heights, IIRC), and tiles would even allow splitting video frames into a grid of rectangular sub-frames.

I read that this isn't true.
http://image.slidesharecdn.com/hevcvideocodecvinayagammariappan-160607083744/95/hevc-video-codec-by-vinayagam-mariappan-56-638.jpg?cb=1465288787
The Main profile doesn't allow the tiles + WPP combination.

LigH

12th September 2016, 10:19

Those are restrictions we should know well. If we have to choose one technique or the other, we had better know good reasons for preferring each in a given case.

I never claimed that two of them could be used in conjunction. I just mentioned that they are implemented one after another: WPP already was, slices are being, and tiles may be implemented in x265.

This diagram explains the WPP dependencies well, IMHO. It appears to prefer top and left prediction, and each additional thread has to wait for more finished partial results.
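
To make the WPP dependency concrete, here is a minimal Python sketch of the wavefront schedule. It assumes every CTU takes exactly one time step and that a row may start once the row above is two CTUs ahead (the top and top-right dependency). The function names are made up for illustration and are not x265 internals.

```python
def wpp_start_step(row, col, lag=2):
    """Earliest time step at which CTU (row, col) can start.

    Assumption: one CTU per time step, and each row trails the row
    above it by `lag` CTUs.
    """
    return col + lag * row


def max_parallel(rows, cols, lag=2):
    """Largest number of CTUs that are processed in the same time step."""
    busy = {}
    for r in range(rows):
        for c in range(cols):
            step = wpp_start_step(r, c, lag)
            busy[step] = busy.get(step, 0) + 1
    return max(busy.values())


if __name__ == "__main__":
    # A wide frame keeps more rows busy at once than a narrow one.
    print(max_parallel(rows=4, cols=8))
    print(max_parallel(rows=4, cols=4))
```

Under these assumptions a narrow frame cannot keep many threads busy, which matches the remark that each additional thread has to wait for more finished partial results.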

Jamaika

12th September 2016, 10:29

I don't know how it is.:stupid:
For example in the widely quoted blog post by Ronald Bultje (https://blogs.gnome.org/rbultje/2015/09/28/vp9-encodingdecoding-performance-vs-hevch-264/), “all forms of threading/tiling/slicing/wpp were disabled”. I didn't notice this until after comments were closed, so I didn't have a chance to respond. As most of you know, x265 is highly multi-threaded, and it uses frame parallelism and Wavefront Parallel Processing by default. These features have minimal impact on quality, but a massive positive impact on performance. WPP is an integral part of the HEVC standard, and we expect most HEVC video to use WPP, as it makes both encoding and decoding run faster and more efficiently. So, I would love to see this test repeated using settings that are more rational.

LigH

12th September 2016, 12:05

From the x265 commits (https://bitbucket.org/multicoreware/x265/commits/all): Introduction of patches to implement slices started on 2016-09-06 (last week). Consider it "experimental"; it may work well only under certain conditions.

Leo 69

12th September 2016, 16:18

Guys, have there been any changes in the code lately, which affect quality in any way?
Also, the question of the day - when will the SAO issue be fixed? I can't seem to include this option in all my encodes, it visually blurs so much. Oh, to an untrained eye it's certainly not noticeable, but we all know what we're talking about, right?

Thanks a lot to the development team for all their efforts. As soon as I have some money (for now I'm completely broke :)), I'll consider making a donation.

burfadel

12th September 2016, 17:42

I think SAO is just set way too strong. Probably even 50 percent of what it is currently would be too strong, I would think more like 20 percent.

LigH

13th September 2016, 07:55

A little late, but as it seems to be a kind of intermediate milestone:

x265 2.0+54-e5ca9b210223 (GCC 5.3.0) (https://www.mediafire.com/download/rjdjp9textb73bp/x265_2.0+54-e5ca9b210223.GCC530.7z)
x265 2.0+54-e5ca9b210223 (GCC 6.1.0) (https://www.mediafire.com/download/w9xgmwnb3k3kb4f/x265_2.0+54-e5ca9b210223.GCC610.7z)

Differences in CLI options since 2.0+16:

--[no-]slices <integer> Enable Multiple Slices feature. Default enabled
...
--[no-]analyze-src-pics Motion estimation uses source frame planes. Default disable
...
--qg-size <int> Specifies the size of the quantization group (64, 32, 16, 8). Default 32
...
--discard-sei Discard SEI packets in bitstream. Default disabled
--discard-vui Discard VUI information from the bistream. Default disabled

--qg-size: Additional size 8.

I wonder how the default of --slices can be "enabled" when it expects an integer parameter. Let's guess the default number is 1?

Barough

15th September 2016, 11:20

x265-2.0+58-cc77b9922b19 (https://1fichier.com/?9n5zszd1od) (MSYS/MinGW, GCC 6.1.0, 32 & 64bit 8/10/12bit multilib EXEs)

x265_Project

16th September 2016, 04:56

I wonder how the default of --slices can be "enabled" when it expects an integer parameter. Let's guess the default number is 1?

Apparently our documentation for this feature is also "experimental". Yes, the default is 1 slice/frame.

Motenai Yoda

16th September 2016, 05:17

For research purposes: which is more compatible with tablets, TVs and boxsets, a Main10 Level 4.0 High Tier or a Main10 Level 4.1 Main Tier?

benwaggoner

16th September 2016, 16:12

For research purposes: which is more compatible with tablets, TVs and boxsets, a Main10 Level 4.0 High Tier or a Main10 Level 4.1 Main Tier?
Probably 4.1 Main Tier. But there are a lot of tablets and some TVs that aren't 10-bit compatible at all.

8-bit Main Level 4.0 is going to be the most compatible option.

Barough

22nd September 2016, 17:49

x265-2.0+65-d20b78d6d138 (http://www45.zippyshare.com/v/zhN1DAFX/file.html) (MSYS/MinGW, GCC 6.2.0, 32 & 64bit 8/10/12bit multilib EXEs)

[EDIT]
Updated to x265-2.0+65

brumsky

25th September 2016, 06:07

Jawed

25th September 2016, 11:58

This is how I do it:

--preset medium --rd 5

versus

--preset medium

To compare encodes I use a script like:

a=FFVideoSource() # original before encoding
b=FFVideoSource() # encoded

c=subtract(a,b)
c=c.lumalevels(133,149)

interleave(a,b,c)

return last

function LumaLevels(clip source, int "black", int "white", float "sat")
{
    # Intended to stretch luma so that the range black..white fills the output range
    black = default(black, 16)  # PCLevels
    white = default(white, 235) # PCLevels
    sat   = default(sat, 1)

    Cont   = 255/float(white-black)             # contrast gain
    Bright = -int(float((black-16)*Cont+16.5))  # brightness offset
    Tweak(source, bright=Bright, cont=Cont, coring=false, sat=sat)
}

This helps me to identify where in the frame there are differences. I can then decide whether I care about those differences.
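
For anyone wondering how strongly the script above amplifies the difference clip, here is a small Python sketch that only repeats the arithmetic of the LumaLevels function. The helper name `luma_levels_params` is hypothetical, and the sketch assumes that `int()` truncates toward zero in both languages.

```python
def luma_levels_params(black=16, white=235):
    """Repeat the Cont/Bright arithmetic from the LumaLevels script."""
    cont = 255 / float(white - black)
    bright = -int(float((black - 16) * cont + 16.5))
    return cont, bright


if __name__ == "__main__":
    # The call used in the script above: lumalevels(133,149)
    cont, bright = luma_levels_params(133, 149)
    print("contrast gain:", cont)
    print("brightness offset:", bright)
```

With 133 and 149 the window is only 16 luma steps wide, so the contrast gain is roughly 16x, which is why small differences become easy to spot.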

For my own encodes I'm currently using:

--crf 20 --preset medium --output-depth 10 --rd 6

--crf 24 --preset medium --output-depth 10 --bframes 2 --rd 6

--crf 28 --preset medium --output-depth 10 --bframes 2 --rd 6

I think medium with --rd 6 is better than slow and it's the same encode speed or a little faster than slow. Slow uses --rd 4. Medium uses --rd 3. So --rd 6 is a significant difference for the medium preset.

Yes, I know that --rd 6 is currently the same as 5 - when 6 arrives, I'm ready!

I decided to focus on --rd due to:

http://forum.doom9.org/showpost.php?p=1779902&postcount=4237

x265_Project

25th September 2016, 18:10

I recently upgraded my computer and therefore have some extra encoding power. I've been thinking about making one of the following changes.

RD 5
Rect
Rect & Amp

Now, I admit I don't fully understand the low level details of the options above - just the bits I can gather from the docs. What is RDO predictions? It is listed as the change to RD 5. It seems to have a very large impact on encoding speed.

RD = Rate Distortion Optimization level
At the core of x265 (and most video encoders) is an algorithm that figures out the best way to encode each block of video. This algorithm weighs the distortion of each candidate "mode" against the bits used (the rate). To do this rate distortion optimization as accurately as possible, the encoder has to accurately calculate the bits that each candidate mode would use. In other words, it has to fully encode the candidate mode, calculating the residual error after prediction (the difference between the inter or intra-predicted block and the source pixels), performing a discrete cosine transform on the residual error, quantizing the residual error, and entropy coding the predicted block plus the residual error. This method is used for RD level 5. For lower RD levels, x265 weighs the bits used by each candidate mode, but it doesn't go through all of the above steps (it will skip entropy coding, for example). Lower levels are roughly accurate when it comes to choosing the best mode, but not as accurate as the highest rd level. Note that there are a number of decisions in x265 that can be rd optimized. More decisions are rd optimized at higher rd level, to a higher degree of precision.
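
The weighing of distortion against rate described above is commonly written as a cost J = D + lambda * R, with the candidate of lowest cost winning. Here is a minimal Python sketch of that idea. The candidate names, distortion values, bit counts and lambda values are invented for illustration and are not taken from x265.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    distortion: float  # e.g. squared error against the source block
    bits: float        # bits needed to code this candidate


def rd_cost(cand, lam):
    """Classic rate-distortion cost: J = D + lambda * R."""
    return cand.distortion + lam * cand.bits


def best_mode(candidates, lam):
    """Pick the candidate with the lowest rate-distortion cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Hypothetical candidates for one block.
CANDIDATES = [
    Candidate("skip", distortion=900, bits=2),
    Candidate("inter", distortion=400, bits=40),
    Candidate("intra", distortion=250, bits=120),
]

if __name__ == "__main__":
    for lam in (0.5, 4, 20):
        print(lam, best_mode(CANDIDATES, lam).name)
```

A larger lambda (lower quality target) pushes the decision toward cheap modes, a smaller one toward accurate modes. The quality of the decision depends on how exact the `bits` figure is, which is the part that higher RD levels spend more time on.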

RECT = allow x265 to evaluate and code rectangular blocks (CUs). Each CTU (usually a 64x64 pixel block, but this depends on your --ctu setting) is analyzed to find the optimal way to encode the pixels in the CTU, partitioning the CTU into smaller CUs. For example, a 64x64 pixel CTU might be encoded as four 32x32 CUs, or sixteen 16x16 pixel CUs. HEVC supports rectangular CUs. For example, a 32x32 pixel block could be encoded as two 16x32 CUs, or two 32x16 CUs. Rectangular CUs allow the partitioning to more ideally match the video content, but of course the more possibilities the encoder wants to evaluate, the longer it takes to evaluate all of the possibilities.

AMP = allow x265 to evaluate asymmetric partitions. Asymmetric partitions are rectangular CUs that have a ratio of 1:4, 3:4, 4:1 or 4:3. For example, a 32x32 pixel block can be encoded as an 8x32 pixel block plus a 24x32 pixel block. Again, these additional possibilities can allow for a more accurate fit, but more possibilities = more time needed to evaluate all possibilities.
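
To show how quickly the number of possibilities grows, here is a rough Python sketch that lists the partition shapes for one square block. The layout labels follow the usual HEVC naming (2NxN, nLx2N and so on). The function itself is hypothetical, and it assumes AMP has no effect unless rect is also enabled, which is how I read the x265 docs.

```python
def partition_layouts(size, rect=False, amp=False):
    """List candidate partition layouts for a size x size block.

    Returns a dict mapping a layout label to a list of (width, height)
    parts. Assumption: AMP is only evaluated when rect is enabled.
    """
    layouts = {"2Nx2N": [(size, size)]}
    if rect:
        half = size // 2
        layouts["2NxN"] = [(size, half), (size, half)]
        layouts["Nx2N"] = [(half, size), (half, size)]
        if amp:
            quarter = size // 4
            rest = size - quarter
            layouts["2NxnU"] = [(size, quarter), (size, rest)]
            layouts["2NxnD"] = [(size, rest), (size, quarter)]
            layouts["nLx2N"] = [(quarter, size), (rest, size)]
            layouts["nRx2N"] = [(rest, size), (quarter, size)]
    return layouts


if __name__ == "__main__":
    for label, parts in partition_layouts(32, rect=True, amp=True).items():
        print(label, parts)
```

For a 32x32 block the `nLx2N` layout is the 8x32 plus 24x32 split from the example above, and the candidate count goes from 1 to 3 to 7 as rect and then amp are switched on.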

burfadel

25th September 2016, 20:26

AMP seems to have much less of an impact on performance compared to RECT.

benwaggoner

26th September 2016, 19:41

AMP seems to have much less of an impact on performance compared to RECT.

And --limit-refs and --limit-modes can make the perf cost of each or both a lot smaller while retaining most of the gains.

burfadel

26th September 2016, 20:19

And --limit-refs and --limit-modes can make the perf cost of each or both a lot smaller while retaining most of the gains.

Yes, that too!

benwaggoner

26th September 2016, 23:20

For my own encodes I'm currently using:

--crf 20 --preset medium --output-depth 10 --rd 6

--crf 24 --preset medium --output-depth 10 --bframes 2 --rd 6

--crf 28 --preset medium --output-depth 10 --bframes 2 --rd 6
Why only 2 b-frames?

Jawed

27th September 2016, 00:36

Why only 2 b-frames?
I found that motion in live action tended to become jittery (sort of as though it were half-rate: "anime" in feel).

http://forum.doom9.org/showpost.php?p=1777880&postcount=4189

I actually found the same source suffered the same fate with x264 at crf 24 (using settings that are almost equivalent to preset very slow in speed). But in x264 I historically did not use crf as high as 24 - principally because blocking/banding with 8-bit encodes becomes unbearable.

So the same source in both x264 and x265 (both 10-bit) produced the same problem, which disappears with crf 20. In x264, setting --tune grain also solved the problem (doubled the bitrate for the water test sequence too...).

So when I started my x265 experiments I started to investigate crf, pushing far beyond 21 which was my limit with x264 8-bit. 10-bit, it transpires, makes x264 blocking/banding a non-issue, which was a good reason to see what I thought of higher crf values. Along the way I stumbled into this problem at crf 24.

(In fact I found the problem back in April and was so dismayed that I just forgot about my x265 experiments and did other things - not realising at the time that it was b-frames causing the problem.)

On the water sequence I reported 4 b-frames as a solution. But in other testing I found that with rapidly turning faces (small in the frame) in low contrast (think "a steamy room") the problem recurred (horrible jumps, like anime). So 2 b-frames it is.

I decided not to evaluate CRFs between 20 and 24 to see how the problem arises. It just seems that at crf 24, b-frames (with x264 or x265 both on very slow preset) are too unreliable. I hadn't noticed in the past with x264 because I hadn't used such a high value for crf.

Dclose

27th September 2016, 01:07

I think medium with --rd 6 is better than slow and it's the same encode speed or a little faster than slow. Slow uses --rd 4. Medium uses --rd 3. So --rd 6 is a significant difference for the medium preset.
I'm testing that right now. --rd 5 is "better" than 4, but it's so slow. And on my current live action test sample, it looks like Medium + --rd 5 is more susceptible to reference frames than Slow + --rd 4.

At 3,4 frames it is more detailed than Slow + 4, but there's more blocking/artifacts. I changed the frames to 4,4 and that helped, though Slow + 4 is still more consistent even at 3,4 frames (but Medium + rd5 still has more detail).

And now I increased frames to 5,7. I didn't notice that helping Slow + rd4, but it sure looks to have helped the artifacts/blocking of Medium + rd5. Slow + rd4 is still better in that regard, though.

This is with using most of the common(?) settings for more detail. Maybe those detail settings are too much for Medium and rd5 and 23-24 CRF.

Are there differences between presets that can't be changed? Right now, I think I'm basically changing Sub-Pixel Precision to be able to run --rd 5 at a decent speed. Medium, rd 5, Sub-Pixel Precision 2 vs. Slow, rd 4, Sub-Pixel Precision 3.

Jawed

27th September 2016, 01:32

Check the presets page:

http://x265.readthedocs.io/en/latest/presets.html

LigH

27th September 2016, 14:37

Another milestone has been passed. Triple feature this time: v2.1+2 stable

x265 2.1+2-c0d91c2b4048 (GCC 5.3.0) (https://www.mediafire.com/file/9ydeug4nzruvmm9/x265_2.1+2-c0d91c2b4048.GCC530.7z)
x265 2.1+2-c0d91c2b4048 (GCC 6.1.0) (https://www.mediafire.com/file/sdz65g1cwajjy39/x265_2.1+2-c0d91c2b4048.GCC610.7z)
x265 2.1+2-c0d91c2b4048 (GCC 6.2.0) (https://www.mediafire.com/file/93utcu99s2hvn0p/x265_2.1+2-c0d91c2b4048.GCC620.7z)

I guess I can skip GCC 6.1.0 soon?

brumsky

27th September 2016, 19:47

Thanks for the reply and info Jawed. I did watch that video when it was released. Based on your suggestion, I decided to take a 30 second clip with high motion and a significant amount of rain then encode it using different settings.

The main goal of this was to determine the extra encoding time required for a variety of settings. This is by no means an exhaustive analysis of all the settings - just a select few.

Please note that I am comparing the additional settings against my base config - see below.

--profile main10 --output-depth 10 --crf 22 --ctu 32 --bframes 6 --bframe-bias 5 --rc-lookahead 40 --scenecut 40 --ref 5 --limit-refs 3 --me 3 --merange 26 --subme 3 --no-rect --no-amp
--limit-modes --max-merge 4 --no-early-skip --b-intra --no-sao --signhide --weightp --weightb --aq-mode 1 --aq-strength 1
--cutree --rd 4 --tu-intra-depth 3 --tu-inter-depth 1 --psy-rd 1 --psy-rdoq 1 --rdoq-level 2 --qcomp 0.65 --no-strong-intra-smoothing --qg-size 16

Using the settings above I began making the following changes then running the encode, and documenting the results.

Here is the link to my dropbox folder, it contains all the videos, images, and excel doc I used.

I will only post a few select images, if you wish to see them all please download from dropbox.

https://www.dropbox.com/sh/mv9jhnoez7u6hcc/AAAlPMAi3E90GbBI7v60IVlZa?dl=0

First off I'd like to say that I selected this clip for two reasons.

It was the best clip I had at the time. ;)
I wanted a clip that would be difficult to encode, lots of motion and rain. The idea being to try and kick x265 in the pants.

Before I post the images I'd like to mention I couldn't tell the difference between any of the encodes when watching the video. The only way I could see a difference was to go frame by frame.

I'd suggest playing the encodes at random to see if you can see a difference.

Source:
https://s22.postimg.org/u7l3o0uh9/637_Vikings_S03_E05_The_Usurper_test.png (https://postimg.org/image/u7l3o0uh9/)

My settings: 4.8 FPS - 2:25 encoding time
https://s10.postimg.org/6z788wclh/637_Vikings_S03_E05_The_Usurper_test_reg.png (https://postimg.org/image/6z788wclh/)

My settings plus RD 6: 3.135 FPS, 3:42 encoding time
https://s10.postimg.org/n3tn24fyd/637_Vikings_S03_E05_The_Usurper_test_rd_6.png (https://postimg.org/image/n3tn24fyd/)

Slow: 2.513 FPS, 4:37 encoding time
https://s10.postimg.org/ww0wlig91/637_Vikings_S03_E05_The_Usurper_test_Slow.png (https://postimg.org/image/ww0wlig91/)

Very Slow: 0.919 FPS, 12:37 encoding time
https://s10.postimg.org/wyks8cjwl/637_Vikings_S03_E05_The_Usurper_test_very_slo.png (https://postimg.org/image/wyks8cjwl/)

Placebo: 0.250 FPS, 46:24 encoding time
https://s10.postimg.org/c0a0wiuo5/637_Vikings_S03_E05_The_Usurper_test_Placebo.png (https://postimg.org/image/c0a0wiuo5/)

If you look closely at the highlighted areas below you can see the areas I was specifically comparing.

https://s21.postimg.org/fn35e019v/637_Vikings_S03_E05_The_Usurper_test_Location.png (https://postimg.org/image/fn35e019v/)

Using my default settings you will see that the areas are less crisp and generally fuzzy. Especially the hair I highlighted. It has noticeably more distortion than any of the other encodes. The only way I was able to get rid of the distortion was by enabling RD6 - yes 5 & 6 are the same right now. With the exception of preset slow, which is RD 4.

I was actually quite puzzled by this. I tried adding rect & amp or tu-inter-depth 3 & rect to remove the distortion. Neither of them worked; the images and videos are in dropbox for those encodes. I'm guessing at this point that since the slow preset uses SAO and strong-intra-smoothing, those are responsible for the reduced distortion observed.

It shouldn't be a surprise that the higher the settings used the better the quality. ;) The question is, is the extra encoding time worth the extra quality?

I'd suggest downloading the excel doc now to view all of the results.

Taking the encoding time into account, my opinion is that adding RD 6 gives the best quality increase per extra encoding time. That single option was able to remove a significant amount of the distortion I mentioned above with the smallest increase in encoding time. If you're looking at the doc, the column "Magnitude difference" (for the life of me I couldn't think of a better term) shows the factor by which encoding time increases. Meaning if you take your FPS and divide it by the corresponding value, it should give you a reasonable idea of your FPS with those settings.

Yes I know it will vary, the point is it will at least give you a good ballpark range.
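
As a worked example of that division, here is a short Python sketch using only the FPS figures posted above (the spreadsheet itself may round differently). The function names and the 10 FPS figure in the usage example are made up for illustration.

```python
BASE_FPS = 4.8  # "My settings" from the results above

# FPS figures as posted above.
RESULTS_FPS = {
    "RD 6": 3.135,
    "Slow": 2.513,
    "Very Slow": 0.919,
    "Placebo": 0.250,
}


def slowdown_factor(base_fps, fps):
    """How many times slower a configuration is than the base settings."""
    return base_fps / fps


def estimate_fps(your_base_fps, factor):
    """Ballpark FPS on another machine, given its base-settings FPS."""
    return your_base_fps / factor


if __name__ == "__main__":
    for name, fps in RESULTS_FPS.items():
        factor = slowdown_factor(BASE_FPS, fps)
        # 10 FPS is a hypothetical base speed on some other machine.
        print(name, round(factor, 2), round(estimate_fps(10.0, factor), 2))
```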

I've decided to have a bit of a rolling scale. Meaning if a particular video is less important to me I'll use my default settings. As the importance of the content increases so will my settings.

On a scale of 1 - 10:
1-4 - my default settings
5-7 - Add RD6
8-10 - add RD 6 rect & amp

Adding at least rect gives the benefit of tu-inter-depth 2 for at least the blocks that use rect or amp, per the docs.

With all of that said, I don't think I'll ever use Placebo or Very Slow presets. Placebo is up to 19x slower than my settings and Very Slow is over 5X slower. While they did produce better encodes, obviously, the increased quality wasn't enough to justify the encode times...

Now if you've actually looked at the excel doc you'll notice I ran the encodes on two different systems. One with AVX2 and one without.

Please Note: I only ran the tests once, so YMMV.

System A:
Dual E5-2670 v1 8c/16t 2.6GHz @ 2.9GHz No AVX2
This is a Sandy Bridge chip and thus does not down clock when running AVX instructions. This means it can run at the 2.9GHz all core turbo setting.

System B:
Single E5-2683 v4 16c/32t 2.0GHz @ 1.9GHz With AVX2.
This is a Broadwell chip and does down clock when running AVX2 instructions.

Despite the 1GHz difference in clock speed when encoding, the Broadwell chip is up to 25% faster depending upon the settings. If you are running CTU 64 it is up to 25% faster. With CTU 32 it is approximately equal.

At first glance this may not appear that impressive, but keep in mind we are comparing two 115 watt chips versus a single 120 watt chip. At the very least they are equal in speed for approximately half the power usage and heat generated, and up to 25% faster with CTU 64!!
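
The performance-per-watt argument can be put into numbers with a few lines of Python. This is only a sketch that treats the rated wattage as if it were the real power draw, which it is not necessarily. The function name is made up.

```python
TDP_SYSTEM_A = 2 * 115  # two Sandy Bridge chips, rated watts
TDP_SYSTEM_B = 120      # one Broadwell chip, rated watts


def perf_per_watt_ratio(speedup_b_over_a, watts_a, watts_b):
    """Performance per watt of system B relative to system A."""
    return speedup_b_over_a * watts_a / watts_b


if __name__ == "__main__":
    # CTU 32: roughly equal speed. CTU 64: up to 25% faster.
    print(round(perf_per_watt_ratio(1.00, TDP_SYSTEM_A, TDP_SYSTEM_B), 2))
    print(round(perf_per_watt_ratio(1.25, TDP_SYSTEM_A, TDP_SYSTEM_B), 2))
```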

EDIT:

I forgot to mention that tu-intra-depth has little to no impact on performance. My default settings now include --tu-intra-depth 4.

brumsky

27th September 2016, 20:24

RD = Rate Distortion Optimization level
At the core of x265 (and most video encoders) is an algorithm that figures out the best way to encode each block of video. This algorithm weighs the distortion of each candidate "mode" against the bits used (the rate). To do this rate distortion optimization as accurately as possible, the encoder has to accurately calculate the bits that each candidate mode would use. In other words, it has to fully encode the candidate mode, calculating the residual error after prediction (the difference between the inter or intra-predicted block and the source pixels), performing a discrete cosine transform on the residual error, quantizing the residual error, and entropy coding the predicted block plus the residual error. This method is used for RD level 5. For lower RD levels, x265 weighs the bits used by each candidate mode, but it doesn't go through all of the above steps (it will skip entropy coding, for example). Lower levels are roughly accurate when it comes to choosing the best mode, but not as accurate as the highest rd level. Note that there are a number of decisions in x265 that can be rd optimized. More decisions are rd optimized at higher rd level, to a higher degree of precision.

RECT = allow x265 to evaluate and code rectangular blocks (CUs). Each CTU (usually a 64x64 pixel block, but this depends on your --ctu setting) is analyzed to find the optimal way to encode the pixels in the CTU, partitioning the CTU into smaller CUs. For example, a 64x64 pixel CTU might be encoded as four 32x32 CUs, or sixteen 16x16 pixel CUs. HEVC supports rectangular CUs. For example, a 32x32 pixel block could be encoded as two 16x32 CUs, or two 32x16 CUs. Rectangular CUs allow the partitioning to more ideally match the video content, but of course the more possibilities the encoder wants to evaluate, the longer it takes to evaluate all of the possibilities.

AMP = allow x265 to evaluate asymmetric partitions. Asymmetric partitions are rectangular CUs that have a ratio of 1:4, 3:4, 4:1 or 4:3. For example, a 32x32 pixel block can be encoded as an 8x32 pixel block plus a 24x32 pixel block. Again, these additional possibilities can allow for a more accurate fit, but more possibilities = more time needed to evaluate all possibilities.

Thank you for the detailed answer!

Is RDO predictions the Entropy coding you mentioned?

trip_let

27th September 2016, 20:54

Is RDO predictions the Entropy coding you mentioned?
I'm not x265_Project and not that familiar with video encoding, but unless entropy coding means something different in this context, usually that just refers to general lossless encoding of information bits by essentially making a more efficient representation using codewords, representing more common patterns with fewer bits and less common patterns with more bits. So this process is not about predictions. Some information used in aiding rate distortion optimization with rd 5 is entropy coded.

Rate distortion optimization is about figuring out which lossy representation is acceptably close to the original. At a high level there are more calculations with higher rd levels to increase accuracy of predictions of what might be the best candidate of choices/modes to use by the encoder.

By the way, does anybody know an easy way to compare what's different between versions? I'm not much of a coder and don't really know platforms like bitbucket etc. I know you can look at the commits but that's a bit low level. Kind of looking for the equivalent of a changelog between stable builds. Maybe you can compare documentation versions? But on readthedocs they have latest, stable, 1.7, 1.6 on down, but no 2.0. Can some other snapshot be accessed? Or perhaps any feature labeled "experimental" might be new?

x265_Project

27th September 2016, 21:41

Thank you for the detailed answer!

Is RDO predictions the Entropy coding you mentioned?

The general sequence of video encoding (simple version)...

Prediction -> Calculate Residual Error after Prediction -> Discrete Cosine Transform the Residual Error -> Quantize the Transformed Residual Error -> Entropy Code the Prediction and the Transformed, Quantized Residual Error

Entropy coding is lossless compression (think of it like zip file compression). HEVC uses CABAC entropy coding. This is the last step for encoding each block.

When you're doing rate-distortion optimization (RDO), and you want to weigh the relative bit rate of multiple encoding candidates for a block of video, you have to do all of the above steps if you want the actual number of bits. This is very time consuming, of course, because you are doing a full transform, quant and entropy coding on every candidate, and you'll throw out all but the best candidate. So for faster settings, we just look at the # of bits for each candidate without doing the entropy coding. It's roughly accurate, but not as accurate.
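
The sequence above can be walked through on a toy example. The Python sketch below runs prediction, residual, transform and quantization on a single row of four pixels and then makes a crude guess at the bits instead of running a real entropy coder. Everything here is a simplification under stated assumptions: HEVC uses 2-D integer transforms and CABAC, not a floating-point 1-D DCT, and the bit estimate and all function names are made up.

```python
import math


def residual(source, prediction):
    """Residual error after prediction."""
    return [s - p for s, p in zip(source, prediction)]


def dct_1d(x):
    """Orthonormal 1-D DCT-II (stand-in for the real integer transform)."""
    n = len(x)
    out = []
    for k in range(n):
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def quantize(coeffs, qstep):
    """Uniform quantization of the transformed residual."""
    return [int(round(c / qstep)) for c in coeffs]


def rough_bits(levels):
    """Crude bit estimate (NOT CABAC): larger levels cost more bits."""
    return sum(1 + 2 * abs(level).bit_length() for level in levels)


if __name__ == "__main__":
    source = [100, 102, 104, 106]
    prediction = [100, 100, 100, 100]
    res = residual(source, prediction)
    levels = quantize(dct_1d(res), qstep=2)
    print(res, levels, rough_bits(levels))
```

The cheap estimate in `rough_bits` plays the role of the faster settings: it ranks candidates roughly right without the cost of actually entropy coding each one.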

RainyDog

28th September 2016, 09:35

I forgot to mention that tu-intra-depth has little to no impact on performance. My default settings now include --tu-intra-depth 4.

Yeah I've noticed this too brumsky. I used to run --tu-inter-depth 2 and --tu-intra-depth 2 but realised that there was no difference in speed by upping intra to 3.

In fact, I wonder if intra actually does anything at certain resolutions as --tu-inter-depth 2 / --tu-intra-depth 3 versus --tu-inter-depth 2 / --tu-intra-depth 1 resulted in practically identical bitrates and file sizes if I recall when I tested them.

Jawed

28th September 2016, 21:47

x265_Project

29th September 2016, 04:48

I wonder if the x265 project will, at some point, completely overhaul their presets.
We take a look at this every so often. We've updated our presets several times already, and we'll certainly do it again as we make improvements to different algorithms, or develop new algorithms. We don't want to do it too often, as we think people like to develop a favorite recipe, and when we change presets it changes things for everyone using x265 (except those who specify every option manually).

Dclose

29th September 2016, 15:10

I've been running tests for days and for my current test sample of 5000 Kbps live-action 720p video, I agree with various proposed settings such as CTU 32. Other standard disabled things are SAO, intra-smoothing, etc. Early Skip really hurts quality imo, but maybe it will do better with higher bitrate, less action, or with more extreme settings I'm just starting to test such as RD5 with --subme 7.

Anyway, it's hard to do a search on this because I'm not sure what I'm looking for. x265 has always looked 2-D and flat and less "alive" to me than x264, and I thought it was a lack of grain and too much motion blur thing. I've been trying to tune those out of x265, but...

I think the 2-D thing is because x265 limits the lighting or color range. With x264, there seems to be more dynamic lighting. x265 seems to tone down bright light shining on someone's face. Or like someone having highlights in their hair. Or how the sun shines on a green pine tree and the edges or certain other parts of the tree have a shine. x265 seems to dull the lighting and/or color.

Is there a known correction for what I'm talking about?

LigH

29th September 2016, 15:31

Please search for "sao" (smooth all objects) and see if it is related...

sneaker_ger

29th September 2016, 16:19

I think the 2-D thing is because x265 limits the lighting or color range. With x264, there seems to be more dynamic lighting. x265 seems to tone down bright light shining on someone's face. Or like someone having highlights in their hair. Or how the sun shines on a green pine tree and the edges or certain other parts of the tree have a shine. x265 seems to dull the lighting and/or color.

Is there a known correction for what I'm talking about?
Post a sample with screenshots that show what you are talking about and your x264/x265 command-lines. Most often this is a user error resulting in wrong color conversion between YUV and RGB (BT.601 vs BT.709) or range (limited/full).
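
To illustrate how a matrix mix-up changes colors, here is a small Python sketch that decodes the same limited-range 8-bit Y'CbCr pixel once with BT.601 and once with BT.709 coefficients. The coefficients are the standard luma weights. The function name and the sample pixel are chosen for illustration only.

```python
BT601 = (0.299, 0.114)     # (Kr, Kb)
BT709 = (0.2126, 0.0722)   # (Kr, Kb)


def ycbcr_to_rgb(y, cb, cr, matrix):
    """Convert limited-range 8-bit Y'CbCr to full-range 8-bit R'G'B'."""
    kr, kb = matrix
    kg = 1.0 - kr - kb
    yn = (y - 16) / 219.0
    pb = (cb - 128) / 224.0
    pr = (cr - 128) / 224.0
    r = yn + 2.0 * (1.0 - kr) * pr
    b = yn + 2.0 * (1.0 - kb) * pb
    g = (yn - kr * r - kb * b) / kg
    # Clamp to the valid range and scale to 8 bits.
    return tuple(round(255 * min(1.0, max(0.0, v))) for v in (r, g, b))


if __name__ == "__main__":
    pixel = (81, 90, 240)  # a saturated red when read as BT.601
    print("BT.601:", ycbcr_to_rgb(*pixel, BT601))
    print("BT.709:", ycbcr_to_rgb(*pixel, BT709))
```

Neutral grays come out identical with both matrices, while saturated colors shift. A range mismatch (limited vs full) is a separate error that this sketch does not cover.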

brumsky

29th September 2016, 17:10

So brumsky, since your encode parameters are so slow, perhaps you can use --rd 6 to replace other of your parameters that are "slow".

Perhaps you can get back to approximately the same quality as your default parameters (without --rd 6) and maybe gain performance :eek:

The parameters (including --rd 6) you provided are very impressive on that clip - I'd say close to, if not as good as, crf 18 slow (and about 1/4 the bitrate).

That's really going some. But the nature of the clip could be biasing such an assessment. In my experience useful conclusions come from many test clips (and single frames are ignored)...

I wonder if the x265 project will, at some point, completely overhaul their presets. It would appear that they've been chosen too early in the development cycle.

One of the things that concerns me with the presets is that higher quality presets often lead to increased bitrate. This is the opposite of what's seen with x264. To me that indicates the presets are pretty immature.

I know the test was limited and a single clip and frame is not indicative of every possible outcome. It was a simple test to verify your statement about RD 5/6, which I ended up agreeing with.

I also wanted to get a better idea of the performance impact vs quality of a variety of settings.

I've seen the opposite with the presets; generally they provide lower bitrates. My test clip isn't a good example because it is high motion.

Dclose

29th September 2016, 22:00

Please search for "sao" (smooth all objects) and see if it is related...
lol, x265 is frustrating compared to x264. I've run 100+ test samples the past few days, and taking some screenshots now, I guess it is a grain/blur thing not a color or lighting thing. Well, I suppose those are related since blurring diffuses highlights/color along with resolution.

I guess it's mostly during movement that x265 looks 2-D and dull, which is practically all the time in modern media with either a person moving or the camera moving.

Early on, I tried the Grain setting, but it was excessive so I didn't think about it anymore. Looking at it again, objects do have more "energy" on screen and look more alive, and I now see Grain's settings disable SAO.

I tried x265 a year ago but gave up on it quickly due to so much blur even at higher resolution/bitrate. Trying it again to see how small in file size a video could get, it is impressive over x264 at very low bitrate, but it still was so blurry at higher bitrates until I happened to turn off SAO. And then reading mentions of SAO in this thread was evidence I wasn't just seeing things and that maybe x265 has more hidden potential.

Very Slow with my test video encodes at 2fps or less. I don't know what Very Slow is trying to do to the video since the settings I'm gravitating towards from my tests look as good or better to me and encode at least 2+ times as fast. My settings are similar to the ones mentioned in the last 20 or so pages that other people are mostly agreeing on. Still working on fine-tuning, of course.

Dclose

29th September 2016, 22:28

May as well post some screenshots. Maybe other people can see things that make the x265 presets look better. The modded setting below has a lot of room left since it encodes much faster than x265 preset Very Slow. I only used that for this since I already had it saved as a profile and it's a decent baseline.
A thing not seen well from screenshots is the blur during motion I mentioned. In the scene, she moves her head just a bit, and x265 presets want to blur her hair, making it look clumped up and flat instead of looking 3-D and alive.

original
http://i64.tinypic.com/28jxna9.jpg

x265 Slower
http://oi64.tinypic.com/24pmq7k.jpg

x265 Very Slow
http://oi66.tinypic.com/120iiw1.jpg

x264 Very Slow
http://oi67.tinypic.com/zwxbwz.jpg

x265 modified
http://oi67.tinypic.com/2qxb4lu.jpg

x265 --preset slow --input - --y4m --ctu 32 --qg-size 16 --merange 25 --no-strong-intra-smoothing --crf 23.50 --qpfile GENERATED_QP_FILE --psy-rd 1.00 --rdoq-level 1 --psy-rdoq 2.00 --deblock=-2:-2 --no-sao --range full --colormatrix bt709 --ipratio 1.38 --pbratio 1.28

Jamaika

30th September 2016, 05:04

Dclose

30th September 2016, 05:28

It is difficult to say anything, because the screenshots you added aren't identical. The "aq-mode 3" is missing. The "zone" is missing. Otherwise, the beginning and the end of the film will be of poor quality (~500 frames).
I didn't notice half of the shots were slightly different since they all read the same frame number in PotPlayer when I saved the screenshot. It looks like the first two x265 are the same, and the x264 and last x265 are the same. Even still, there's some obvious differences there.

I copied and pasted what Hybrid showed in its configuration area. Lacking the "aq-mode 3?" They're the preset settings except for the last one, which is a modded Slow and so uses the same as unmodded Slow.

It's not meant to be a big technical comparison. Just that I was taking screenshots so figured I'd post some. And to show that I think the presets are doing x265 a disservice in showing its potential.

Barough

30th September 2016, 16:36

x265-2.1+12-11bfa0ae9710 (http://www109.zippyshare.com/v/BRqVdrNG/file.html) (MSYS/MinGW, GCC 6.2.0, 32 & 64bit 8/10/12bit multilib EXEs)

Jawed

1st October 2016, 12:14

We take a look at this every so often. We've updated our presets several times already, and we'll certainly do it again as we make improvements to different algorithms, or develop new algorithms. We don't want to do it too often, as we think people like to develop a favorite recipe, and when we change presets it changes things for everyone using x265 (except those who specify every option manually).
Thanks for the explanation. I'm glad to hear that the presets are not sacrosanct.

benwaggoner

1st October 2016, 21:04

Thanks for the explanation. I'm glad to hear that the presets are not sacrosanct.
Are any :)?

Certainly the x265 presets have had many years less tuning than x264's, and don't include --tune animation or film. Plus HEVC is a much more complex codec than H.264, so there are a lot more axes for tuning.

Also (and fortunately) x265 is in MUCH more active development than x264. This also means that settings that worked well 6-12 months ago might not work as well now. In a lot of cases, special settings for special content are less necessary as x265 does a better job adapting to the content by default (for example --rskip is way better now, so --no-rskip isn't nearly as necessary). Also, the quality/perf tradeoffs of different parameters have changed a lot, and we have new parameters like --limit-refs and --limit-modes that allow us to use higher --ref values and advanced features like --amp and --rect with much lower CPU cost.

And for really grainy/noisy content, we have --tune grain with its whole grain-tuned rate control mode.

Exciting stuff is coming (via the changelog), like support for non-IDR I-frames mid-GOP.
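
As a rough illustration of the options named above, here is a minimal Python sketch that assembles the relevant x265 arguments. The helper name and the default values (5 reference frames, limit-refs level 3) are assumptions chosen for the example, not recommendations from this thread; only the flags themselves (--ref, --limit-refs, --limit-modes, --rect, --amp) are real x265 options.

```python
def limited_search_args(ref=5, limit_refs=3, limit_modes=True, rect=True, amp=True):
    """Build x265 arguments that pair a larger search space with the limit options.

    Hypothetical helper: the default values are illustrative only.
    """
    # More reference frames, with --limit-refs pruning which ones get searched.
    args = ["--ref", str(ref), "--limit-refs", str(limit_refs)]
    if limit_modes:
        # Prune rectangular/asymmetric mode analysis based on earlier results.
        args.append("--limit-modes")
    if rect:
        args.append("--rect")
    if amp:
        args.append("--amp")
    return args


if __name__ == "__main__":
    print(" ".join(["x265", "--preset", "slow"] + limited_search_args()))
```

The list could be appended to any existing command line; whether the speed/quality balance is worthwhile depends on the x265 build and the content, so it is worth timing a short sample first.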

x265_Project

1st October 2016, 21:40

Are any :)?

Certainly the x265 presets have had many years less tuning than x264's, and don't include --tune animation or film. Plus HEVC is a much more complex codec than H.264, so there are a lot more axes for tuning.

Also (and fortunately) x265 is in MUCH more active development than x264. This also means that settings that worked well 6-12 months ago might not work as well now. In a lot of cases, special settings for special content are less necessary as x265 does a better job adapting to the content by default (for example --rskip is way better now, so --no-rskip isn't nearly as necessary). Also, the quality/perf tradeoffs of different parameters have changed a lot, and we have new parameters like --limit-refs and --limit-modes that allow us to use higher --ref values and advanced features like --amp and --rect with much lower CPU cost.

And for really grainy/noisy content, we have --tune grain with its whole grain-tuned rate control mode.

Exciting stuff is coming (via the changelog), like support for non-IDR I-frames mid-GOP.

Thanks Ben. To clarify a couple of points, we last updated our presets in December, and we incorporated limit-refs and limit-modes into the presets at that time.

Support for scene change detection within fixed GOPs is coming, but this will be on by default for all presets.

Right now we're looking at how presets might be optimized for different picture sizes (certain settings that work well for 4K might not be as optimal for 480P and below).

eclipse98

2nd October 2016, 21:58

microchip8

2nd October 2016, 23:08

Hi All,

Suppose I encode (x265) at CRF 26 and get a 5 Mbps bitrate. Then I encode with 3 passes at a 5 Mbps target rate. Will there be any difference in quality? Logic says there shouldn't be, just wanted to confirm with experts!

Thanks for your help.

Cheers !

There will be a small, mostly unnoticeable quality difference, due to the difference in the distribution of bits. Also, 3-pass will gain you no more quality than 2-pass, and is mostly used when 2-pass misses the target file size, which happens only rarely.

Rule of thumb is: if you aim for specific quality, use CRF. If you aim for a specific target file size, use 2-pass
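
To make the rule of thumb concrete, here is a minimal Python sketch. It assumes a Y4M input (so no resolution or frame rate flags are needed) and uses hypothetical helper names and a hypothetical stats file name; the x265 flags themselves (--crf, --bitrate in kbps, --pass, --stats, --input, --output) are real. It only builds the command lines and does the size-to-bitrate arithmetic; it does not run the encoder.

```python
import os


def target_bitrate_kbps(size_bytes, duration_s, audio_kbps=0.0):
    """Video bitrate (kbps) that fills size_bytes over duration_s seconds."""
    total_kbps = size_bytes * 8 / duration_s / 1000
    return total_kbps - audio_kbps


def crf_command(src, dst, crf):
    """Quality-targeted encode: the file size is whatever it turns out to be."""
    return ["x265", "--input", src, "--crf", str(crf), "--output", dst]


def two_pass_commands(src, dst, bitrate_kbps, stats="x265_2pass.log"):
    """Size-targeted encode: pass 1 writes the stats file, pass 2 reads it."""
    common = ["x265", "--input", src,
              "--bitrate", str(round(bitrate_kbps)), "--stats", stats]
    first = common + ["--pass", "1", "--output", os.devnull]
    second = common + ["--pass", "2", "--output", dst]
    return [first, second]


if __name__ == "__main__":
    # A 1-hour video that should come out at 2.25 GB works out to 5000 kbps.
    kbps = target_bitrate_kbps(2_250_000_000, 3600)
    print(" ".join(crf_command("in.y4m", "out_crf.hevc", 26)))
    for cmd in two_pass_commands("in.y4m", "out_2pass.hevc", kbps):
        print(" ".join(cmd))
```

The arithmetic is the whole point of the 2-pass route: size in bits divided by duration gives the average bitrate, and any audio or container overhead has to be subtracted before handing the number to the encoder.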

eclipse98

2nd October 2016, 23:19

There will be a small, mostly unnoticeable quality difference, due to the difference in the distribution of bits. Also, 3-pass will gain you no more quality than 2-pass, and is mostly used when 2-pass misses the target file size, which happens only rarely.

Rule of thumb is: if you aim for specific quality, use CRF. If you aim for a specific target file size, use 2-pass

Thanks, just what I thought, appreciate your help !

cojj

3rd October 2016, 01:22

Thanks Ben. To clarify a couple of points, we last updated our presets in December, and we incorporated limit-refs and limit-modes into the presets at that time.

Support for scene change detection within fixed GOPs is coming, but this will be on by default for all presets.

Right now we're looking at how presets might be optimized for different picture sizes (certain settings that work well for 4K might not be as optimal for 480P and below).

First of all, thank you for such a great open-source project.

Now my question: Personally, do you think this change is worth waiting for? I've started re-encoding all my videos but I could wait a bit longer if it's worth it.

x265_Project

3rd October 2016, 15:34

First of all, thank you for such a great open-source project.

Now my question: Personally, do you think this change is worth waiting for? I've started re-encoding all my videos but I could wait a bit longer if it's worth it.
You're welcome. You should not be using fixed GOP length (keyint = min-keyint). If you use our default presets and you don't mess with key interval settings, or turn off scene detection, x265 will create a new GOP at the start of each scene. This will give you the best encoding quality.

Fixed GOP length is never desirable from a quality (compression efficiency) standpoint. Unfortunately, for some scenarios, the system designers chose to use a fixed GOP length for convenience, to simplify streaming or broadcasting. Using variable GOP length for video means that you need to segment audio and metadata into the same varying chunk sizes, which makes the broadcasting or streaming server/client design more complex.
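
The difference between the two approaches can be sketched with a toy model in Python. This is a deliberately simplified assumption of how keyframes get placed, not x265's actual lookahead logic: a keyframe is forced once keyint frames have passed, and a detected scene cut starts a new GOP only if at least min_keyint frames have passed since the last keyframe. The function name and the frame numbers are made up for illustration.

```python
def place_keyframes(num_frames, scene_cuts, keyint=250, min_keyint=25):
    """Return the frame numbers where a new GOP starts (toy model only)."""
    cuts = set(scene_cuts)
    keyframes = [0]          # the first frame always starts a GOP
    last = 0
    for frame in range(1, num_frames):
        gap = frame - last
        forced = gap >= keyint                          # maximum GOP length reached
        at_cut = frame in cuts and gap >= min_keyint    # scene cut, far enough along
        if forced or at_cut:
            keyframes.append(frame)
            last = frame
    return keyframes


if __name__ == "__main__":
    cuts = [100, 110, 220]
    # Variable GOP: keyframes land on the scene cuts.
    print(place_keyframes(300, cuts, keyint=250, min_keyint=25))
    # Fixed GOP (keyint = min-keyint): keyframes ignore the scene cuts.
    print(place_keyframes(300, cuts, keyint=48, min_keyint=48))
```

In the fixed case every scene cut falls mid-GOP, so the first frame of each new scene has to be coded inside an existing GOP rather than starting a fresh one, which is where the efficiency loss described above comes from.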

benwaggoner

3rd October 2016, 20:17

There will be a small, mostly unnoticeable quality difference, due to the difference in the distribution of bits. Also, 3-pass will gain you no more quality than 2-pass, and is mostly used when 2-pass misses the target file size, which happens only rarely.
I saw some rare cases where a 3rd pass actually made things slightly better at a few points where the encode was VBV constrained, mainly in cases where a fast first pass was used, or when the VBV was really constrained. I think those were only in 2015, though.

Rule of thumb is: if you aim for specific quality, use CRF. If you aim for a specific target file size, use 2-pass
Quite right!

mzso

3rd October 2016, 21:32

Hi!

Is aliasing a significant issue for x265 (or HEVC)? I came across two separate x265-encoded files of the same thing. The higher bitrate one had some pretty bad aliasing where there were fine patterns. The lower bitrate one was even worse, with more stuff aliased and the aliasing being more pronounced.

microchip8

3rd October 2016, 22:00