x265 HEVC Encoder
? There is no minimum required bitrate for physical media. Some decoders do choke on abysmally low bitrates, but that's an implementation problem.
Once upon a time, when I worked in a DVD Video authoring studio, some of the several dozen DVD Players certainly did not enjoy longer black screens, so developers of MPEG2 encoders visited us to tune their products... my very rusty 2 cents of practical experience. I will be glad to hear that Blu-ray players today are more robust in this regard.
Z2697
5th June 2025, 06:05
Just don't use too many B-frames; they slow down encoding with b-adapt and have no real benefit.
Some thought (theory) on this:
x265 only uses one middle frame in a mini-GOP (like a sequence of b frames) as reference frame (or not at all if b-pyramid is disabled),
which means in a long mini-GOP, "far" b frames will have "far" references, and have more residual to code... unless the scene is still.
tormento
9th June 2025, 18:18
I am wondering what value I should use with --chromaloc.
I can obtain the original video's value with
ffprobe -v quiet -loglevel panic -print_format json -show_format -show_streams <file>
and I know that
0 left (1080p)
1 center
2 topleft (2160p)
3 top
4 bottomleft
5 bottom
where usually I find 0 for 1080p avc and 2 for 2160p hevc.
What happens if I change the codec and/or the size of the video?
I.e. if I downsize a hevc video from 2160p to 1080p, should I:
1) convert topleft to left
2) apply wanted filters
3) set --chromaloc to 0 (left)
or
1) leave chroma location where it is and don't set --chromaloc at all?
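For reference, a minimal Python sketch of mapping what ffprobe reports to the numeric --chromaloc values listed above. It assumes the JSON comes from the ffprobe command above and that the video stream carries a `chroma_location` field using the same names as the list; the function name and the sample JSON string are made up for illustration.

```python
import json

# Names as listed above, mapped to the x265 --chromaloc numbers.
CHROMALOC = {"left": 0, "center": 1, "topleft": 2,
             "top": 3, "bottomleft": 4, "bottom": 5}

def chromaloc_from_ffprobe(ffprobe_json):
    """Return the --chromaloc number for the first video stream, or None."""
    info = json.loads(ffprobe_json)
    for stream in info.get("streams", []):
        if stream.get("codec_type") == "video":
            # A missing or unknown value means the location is not signalled.
            return CHROMALOC.get(stream.get("chroma_location"))
    return None

# Hypothetical, heavily trimmed ffprobe output for a 2160p HEVC file.
sample = '{"streams": [{"codec_type": "video", "chroma_location": "topleft"}]}'
print(chromaloc_from_ffprobe(sample))
```

A result of None only says the file does not signal a location; it does not tell you where the chroma samples really sit.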
microchip8
9th June 2025, 19:32
I always use chromaloc 2 when downsizing 2160p to 1080p. No issues here
Z2697
10th June 2025, 06:11
How is this related to x265 specifically?
FranceBB
14th June 2025, 18:59
What happens if I change the codec and/or the size of the video?
I.e. if I downsize a hevc video from 2160p to 1080p
Resizers in Avisynth are chroma location aware, which means that if you start with a 3840x2160 clip with top_left chroma location (i.e. 4:2:0 Type 2) and you downscale it to 1920x1080, the chroma location will stay the same, so you should still flag it as top_left. This is what Microchip8 seems to be doing:
I always use chromaloc 2 when downsizing 2160p to 1080p. No issues here
and he's right... BUT (yes, there's a "but") although a player should understand and honor the chroma placement metadata when upscaling the chroma and converting to RGB, there's no guarantee that every single player is gonna do it. You see, setting aside special use cases like MPEG-1 with the center chroma placement etc., for consumers the chroma location really actually became important when H.265 was introduced as it moved the default from left (normal 4:2:0) to top_left (4:2:0 type 2), which means that an astonishing number of old players just automatically assumed left chroma location all the time before this. If you're downscaling to normal FULL HD 4:2:0 and you're feeding an old hardware player then you *might* be in trouble as it *might* assume left, regardless of what you're flagging the file as.
For your use-case, Tormento, I would personally do:
z_ConvertFormat(chromaloc_op="top_left=>left")
to go from 4:2:0 Type 2 to the normal 4:2:0 and then encode setting the left chroma placement. ;)
microchip8
14th June 2025, 19:22
Yea, I don't have anything that old here that doesn't understand chroma location. All my players/streamers are 4K-aware and I mostly use ffmpeg for the resizing (Lanczos3 with accurate_rnd) and encoding, which also has chroma-aware scaling. I sometimes use the zimg (zscale) scaler and don't have any issues either.
rwill
14th June 2025, 19:45
, for consumers the chroma location really actually became important when H.265 was introduced as it moved the default from left (normal 4:2:0) to top_left (4:2:0 type 2),
Well, it's not that simple.
The standard says:
When the chroma_sample_loc_type_top_field and chroma_sample_loc_type_bottom_field are not present,
the values of chroma_sample_loc_type_top_field and chroma_sample_loc_type_bottom_field are inferred to be equal to 0.
With the following additional constraint:
When chroma_format_idc is equal to 1 (4:2:0 chroma format) and the decoded video content is intended for interpretation
according to Rec. ITU-R BT.2020-2 or Rec. ITU-R BT.2100-2, chroma_loc_info_present_flag should be equal to 1, and
chroma_sample_loc_type_top_field and chroma_sample_loc_type_bottom_field should both be equal to 2.
So 4:2:0 BT.2020-2 and BT.2100-2 need to always signal chroma loc 2 but others default to 0 if it is not signaled otherwise.
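Put as a small Python sketch, under the assumption that only the two quoted passages matter (progressive content, top field value only). The function names are made up for illustration and are not part of any decoder API.

```python
def effective_chroma_loc(chroma_loc_info_present_flag, signalled_type=None):
    """Chroma sample location type to assume for a stream."""
    if chroma_loc_info_present_flag and signalled_type is not None:
        return signalled_type
    # Not present: inferred to be equal to 0 (left), per the quoted text.
    return 0

def follows_bt2020_recommendation(chroma_format_idc,
                                  chroma_loc_info_present_flag,
                                  signalled_type):
    """Check the 'should' constraint for 4:2:0 BT.2020-2 / BT.2100-2 content."""
    if chroma_format_idc != 1:
        return True  # the constraint only concerns 4:2:0
    return bool(chroma_loc_info_present_flag) and signalled_type == 2

# A 4:2:0 BT.2020 stream without chroma loc info falls back to type 0,
# which is exactly the case the constraint is meant to prevent.
print(effective_chroma_loc(0), follows_bt2020_recommendation(1, 0, None))
```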
hellgauss
14th June 2025, 20:06
Thank you for the interesting reading!
What exactly is chroma location? AFAIK in "standard" 4:2:0 video I have one luma plane at full resolution and two chroma planes at quarter (2x2) resolution. Just overlap the three planes to get the final image, perhaps using some "smart" upscale for chroma. How does chroma location enter in this process?
In my mind I have two blocky images for chroma and one fine image for luma in which 4 pixels fit exactly into one chroma pixel. Is this interpretation correct?
rwill
14th June 2025, 21:21
Thank you for the interesting reading!
What exactly is chroma location? AFAIK in "standard" 4:2:0 video I have one luma plane at full resolution and two chroma planes at quarter (2x2) resolution. Just overlap the three planes to get the final image, perhaps using some "smart" upscale for chroma. How does chroma location enter in this process?
In my mind I have two blocky images for chroma and one fine image for luma in which 4 pixels fit exactly into one chroma pixel. Is this interpretation correct?
Well you can download https://www.itu.int/rec/T-REC-H.265-202407-I and take a look.
On .pdf page 456 (or page no. 438 as stated in the document) are two figures, E.1 and E.2.
E.1 shows the chroma sample location relative to the luma top and bottom fields. E.2 might be a bit easier to understand, it shows which area a chroma sample covers relative to the 4 luma samples it is associated with.
As can be seen in E.2, the easiest one would be chroma loc 1, where the chroma sample is centered in the middle of the 4 luma samples. I guess this is how MPEG-1 worked. Now when upsampling one needs to go from 1 to 2x2 samples. If one does not want to apply some fancy smooth filter there, one can just assign the chroma sample value to all 4 luma samples the chroma sample covers.
Other locations require other upsampling filters, depending on the position and use case these can be somewhat larger than '1 sample' filters.
A simple chroma loc 0 upsample can look like so:
Horizontal Upsample:
for even luma0 sample: chromasample0
for odd luma0 sample: ( chromasample0 + chromasample1 + 1 ) / 2
( chromasample1 is the one from the next 2x2 luma block )
Vertical Upsample:
for top luma0 sample: ( 3 * chromasample0 + 1 * 'chromasample-1' + 2 ) / 4
for bottom luma0 sample: ( 3 * chromasample0 + 1 * chromasample1 + 2 ) / 4
( chromasample-1 is the sample from the row above and chromasample1 is the sample from the row below )
I hope this makes sense.
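The same filter as a runnable Python sketch, for anyone who wants to play with numbers. Two things here are assumptions and not part of the description above: the horizontal pass runs first and the vertical pass works on its result, and neighbours are clamped at the picture edges. The function name is made up.

```python
def upsample_chroma_loc0(chroma):
    """Upsample one 4:2:0 chroma plane (list of rows of ints) to luma size."""
    h, w = len(chroma), len(chroma[0])

    # Horizontal: even luma columns copy the co-sited chroma sample,
    # odd luma columns average it with the next chroma sample.
    wide = []
    for row in chroma:
        out = []
        for x in range(w):
            nxt = row[min(x + 1, w - 1)]  # clamp at the right edge
            out.append(row[x])
            out.append((row[x] + nxt + 1) // 2)
        wide.append(out)

    # Vertical: 3/4 of the own chroma row plus 1/4 of the row above
    # (top luma row) or of the row below (bottom luma row).
    full = []
    for y in range(h):
        above = wide[max(y - 1, 0)]      # clamp at the top edge
        below = wide[min(y + 1, h - 1)]  # clamp at the bottom edge
        full.append([(3 * c + a + 2) // 4 for c, a in zip(wide[y], above)])
        full.append([(3 * c + b + 2) // 4 for c, b in zip(wide[y], below)])
    return full

plane = [[0, 100],
         [200, 100]]
for row in upsample_chroma_loc0(plane):
    print(row)
```

A 2x2 chroma plane comes out as 4x4, matching the luma size, and a flat plane stays flat.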
Z2697
15th June 2025, 14:21
It's amazing how they have come up with more complex "solutions" instead of an obvious and simple one.
hellgauss
15th June 2025, 14:47
Thanks for precise reference!
Indeed, standards are sometimes mysteriously overcomplicated. I would have chosen only chromaloc=1, perhaps with an optional flag for the default chroma upscale (e.g. point, bilinear, bicubic soft, bicubic). Perhaps there are some cases or hidden reasons to shift chroma half a pixel.
benwaggoner
19th June 2025, 17:40
Generally I only use chromaloc 2 for actual Blu-ray authoring. In practice a lot of early UHD mezzanines and HEVC decoders did everything as chromaloc 0 irrespective of metadata.
With UHD Blu-ray at 2160p getting it wrong made only subtle, easy to miss differences. But the lower the resolution, the more visible an offset error can be.
RARY
25th June 2025, 12:55
I'm planning to buy a 4K TV and would appreciate your recommendations.
Looking for models that support HEVC, VVC, and MV-HEVC decoding, and also have Dolby Vision support.
If you've come across any good options recently, please do share!
tormento
25th June 2025, 13:29
In practice a lot of early UHD mezzanines and HEVC decoders did everything as chromaloc 0
Is there any way to find the real position when the metadata is wrong?
benwaggoner
25th June 2025, 19:35
I'm planning to buy a 4K TV and would appreciate your recommendations.
Looking for models that support HEVC, VVC, and MV-HEVC decoding, and also have Dolby Vision support.
If you've come across any good options recently, please do share!
No one is making stereoscopic 3D TVs anymore, so any MV-HEVC playback would just be single view anyway.
The three questions in picking a TV are:
1) The ambient light in your viewing environment.
2) How big you want it to be. I recommend 10" diagonal for every foot your eyes will be from the screen.
3) How much you want to pay.
OLED is best if you have good light control, but can't go as bright if you don't. Price/Size tradeoffs are personal. Sometimes the best solution is to push your couch forward :sly:.
I'm very happy with both the Sony A95L and LG G5 for OLED Dolby Vision supporting TVs. I've heard great things about the Bravia 8 Mark II, but haven't evaluated it with my own eyeballs yet. It only goes up to 65" though. A95L goes up to 77" and the G5 to 98" (at an astronomical price).
We are truly living in the golden age of display innovation, with mid tier TVs today outperforming the high-end of just five years ago. Get one of the above and put it in Filmmaker Mode, and you can get very close to what the creatives approved on their $30K studio color grading reference monitors. Today's $400 TV is much better than a $4000 TV of ten years ago.
Samsung is also making great TVs, but they don't support Dolby Vision.
Blue_MiSfit
25th June 2025, 20:38
Somewhat OT, do you see any significant benefit of Dolby Vision (or any other system using dynamic metadata for color volume transformation) in content that's graded to a 1000 nit peak on a display that can meet or exceed 1000 nits consistently, Ben?
I know with DoVi P5 / P10 there's the proprietary IPTPQc2 space that allows using full range and dynamic shaping from 12+ bits down to 10 bits while the RPUs can reconstruct closer to 12 bit but... in practice this requires a lot of encoder optimization and is maybe kinda not that big of a deal at the end of the day if you're not doing any mapping?
@Rary, you should probably start a new thread :)
RARY
26th June 2025, 11:11
Has anyone explored efficient multi-threading strategies for the MCSTF module?
benwaggoner
26th June 2025, 20:17
Somewhat OT, do you see any significant benefit of Dolby Vision (or any other system using dynamic metadata for color volume transformation) in content that's graded to a 1000 nit peak on a display that can meet or exceed 1000 nits consistently, Ben?
Not in an appropriately dim ambient lighting environment, no. Dynamic metadata helps map content to the display when the display can only show a subset of the content itself. But if you're watching 400 nit P3 content on a higher end OLED in a 5 nit surround, the display can just show each pixel's spec value perfectly, and the dynamic tone mapping isn't needed. Where tone mapping and dynamic metadata are useful is when content is brighter or more colorful than the display can reproduce, inclusive of ambient light adaptation.
Content can vary a lot, of course. A lot more displays can perfectly reproduce something like Rings of Power than can Inside Out, for example.
I know with DoVi P5 / P10 there's the proprietary IPTPQc2 space that allows using full range and dynamic shaping from 12+ bits down to 10 bits while the RPUs can reconstruct closer to 12 bit but... in practice this requires a lot of encoder optimization and is maybe kinda not that big of a deal at the end of the day if you're not doing any mapping?
Yeah, in theory P5 can help reduce banding as you can have the full 0-1023 range used for any given frame. In practice I've not really seen big net differences from DoVi Profile 8.1 if the HDR-10 base layer is encoded well. Avoiding banding with P5 is easier and can be done at a lower bitrate than 8.1 sometimes. Avoiding banding with HDR-10 can require some finicky tweaking.
Encoding Profile 5 isn't super hard. Weighted prediction is super important, of course, and you want to use an asymmetric chroma offset because Ct and Cp are more psychovisually distinct than Cb and Cr. It certainly can be done well with stock x265 if one didn't want to use Dolby's tools for some reason. No --hdr10-opt of course!
Z2697
26th June 2025, 20:18
Has anyone explored efficient multi-threading strategies for the MCSTF module?
MCSTF is completely useless. Just don't think about it.
It actually reduces quality, unless I'm not testing it right.
There will be ghosting in frames, which is actually its intent, if I understand it correctly. (Imagine an MDegrain filter without strength control and limiting.)
Since there's no "hidden ref frames" in Main HEVC, I guess it's unavoidable. (like how VP-series/AV1 can have temporal filtered hidden alt-ref)
(I use "Main HEVC" because there're extensions, like multi view, I don't know, maybe there will be this kind of extension. Or maybe someone can exploit the views to create hidden reference frames.)
Balling
9th July 2025, 15:32
Thank you for the interesting reading!
What exactly is chroma location? AFAIK in "standard" 4:2:0 video I have one luma plane at full resolution and two chroma planes at quarter (2x2) resolution. Just overlap the three planes to get the final image, perhaps using some "smart" upscale for chroma. How does chroma location enter in this process?
In my mind I have two blocky images for chroma and one fine image for luma in which 4 pixels fit exactly into one chroma pixel. Is this interpretation correct?
No. You have one plane at 8 bit resolution and then another plane at 4 bit resolution that has both Cb and Cr (that is Cb is not 2 bit and Cr is not 2 bit, they both occupy the same 4 bits) and so basically the Chroma sample since it is less resolution needs to be decoded correctly in a single fashion. And it so happens that since both chroma channels are two times less size than Luma it can be when decoded moved in different directions, in fact ffmpeg supports arbitrary direction of the chroma placement.
Asmodian
9th July 2025, 16:43
How does chroma location enter in this process?
In my mind I have two blocky images for chroma and one fine image for luma in which 4 pixels fit exactly into one chroma pixel. Is this interpretation correct?
This Wikipedia article has good diagrams of the different locations that are used for 4:2:0.
https://en.wikipedia.org/wiki/Chroma_subsampling
hellgauss
9th July 2025, 20:10
@Balling
Sorry, I did not understand your answer. I think bit depth is unrelated to the topic.
@Asmodian
Thanks for the reference, those are basically the same pictures as the ones rwill pointed to in the standard. My [possibly wrong] understanding now is that, since a 4:4:4 image must somehow be reconstructed at some point, the chromaloc defines how the half resolution chroma planes must be upscaled and interpolated at full res: either centered or shifted up/down-left/right a little bit during interpolation.
Perhaps there can be a specific algorithm to directly upscale/downscale both luma and chroma at screen resolution in one step.
Asmodian
9th July 2025, 21:56
Perhaps there can be a specific algorithm to directly upscale/downscale both luma and chroma at screen resolution in one step.
You need to scale the three planes separately, but you can scale the chroma planes and center them with luma in one step.
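To make the "center them with luma" part concrete, here is a small Python sketch of where each chroma sample sits on the luma grid for the six 4:2:0 locations. The offsets are my reading of the figures in the standard (in luma sample units, relative to the top-left luma sample of each 2x2 block); the function names are made up for illustration.

```python
# (horizontal, vertical) offset of a chroma sample from the top-left luma
# sample of its 2x2 block, in luma sample units, for --chromaloc 0..5.
CHROMA_OFFSET = {
    0: (0.0, 0.5),  # left
    1: (0.5, 0.5),  # center
    2: (0.0, 0.0),  # topleft
    3: (0.5, 0.0),  # top
    4: (0.0, 1.0),  # bottomleft
    5: (0.5, 1.0),  # bottom
}

def chroma_position(cx, cy, chromaloc):
    """Position of chroma sample (cx, cy) on the luma sample grid."""
    dx, dy = CHROMA_OFFSET[chromaloc]
    return (2 * cx + dx, 2 * cy + dy)

def relocation_shift(src, dst):
    """Shift, in luma samples, between the sample positions of two locations."""
    (sx, sy), (dx, dy) = CHROMA_OFFSET[src], CHROMA_OFFSET[dst]
    return (dx - sx, dy - sy)

# Going from topleft (2) to left (0) moves the sample positions
# half a luma sample down and not at all sideways.
print(relocation_shift(2, 0))
```

A scaler that knows this offset can fold the shift into the same resampling pass that changes the chroma resolution, which is the "one step" mentioned above.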
FranceBB
10th July 2025, 23:04
since a 4:4:4 image must somehow be reconstructed at some point, the chromaloc defines how the half resolution chroma planes must be upscaled and interpolated at full res: either centered or shifted up/down-left/right a little bit during interpolation.
Yes, pretty much.
4:2:0 -> 4:4:4 -> RGB -> Monitor representation.
You start with the chroma half the resolution of the luma, then it gets upscaled to 4:4:4 so that it has the same resolution as the luma and finally it gets converted to RGB so that it can be represented on the display. Unrelated, but in this step the levels are also converted.
Given that you're Italian, if you're interested in the topic, I strongly suggest you read the first few pages of my 2024 academic publication that you can find here: Link (https://view.publitas.com/p222-8308/francesco-bucciantini-utilizzo-dei-gan-nellupscale-delle-immagini-fisse-ed-in-movimento-2024-edition). I'd say up until page 16 as I go through luma and chroma, 4:2:0, 4:2:2, 4:4:4 and the upscale of the chroma for the RGB conversion so that it can be displayed by the TV subpixels which are, of course, red, green and blue.
For instance, you can think about the process as:
#4:2:0 with 3840x2160 luma and 1920x1080 chroma
ColorBars(3840, 2160, pixel_type="YV12")
#Extracting 3840x2160 luma
Y=ConvertToY8()
#Upscaling the 1920x1080 chroma
U=UToY8().Spline64Resize(3840, 2160)
V=VToY8().Spline64Resize(3840, 2160)
#4:4:4 with 3840x2160 luma and chroma
YToUV(U, V, Y)
#RGB24 to display on the TV
ConvertToRGB24()
https://i.imgur.com/i9Ir3Fw.png
If we were to try to display the chroma without upscaling it, we would end up with a bit of a mess like this:
ColorBars(3840, 2160, pixel_type="YV12")
Y=ConvertToY8()
U=UToY8().AddBorders(0, 0, 1920, 1080)
V=VToY8().AddBorders(0, 0, 1920, 1080)
YToUV(U, V, Y)
https://i.imgur.com/8Y6z24j.png
If we got the wrong sample location, we would end up with a different mess in which the chroma wouldn't be properly aligned with the luma.
Imagine the colorbars above, but with the chroma of, let's say, one bar shifted to one side.
So yeah, the TV is upscaling the chroma to make it match the luma resolution before converting to RGB, and the chroma location tells it where the chroma samples are and how far to shift them so that when the planes are overlaid on top of each other they match perfectly. ;)
Barough
20th July 2025, 09:31
x265 v4.1+189-c8ceb6b
GCC 15.1.0 / Win32/64 / 8bit+10bit+12bit
https://www.mediafire.com/file/6xt6fi7fryn40tr
jpsdr
20th July 2025, 11:21
@Barough
Hi.
What repository are you using?
I've just checked the one I'm using (the master branch), I'm still only at "+136".
Z2697
20th July 2025, 14:03
@Barough
Hi.
What repository are you using?
I've just checked the one I'm using (the master branch), I'm still only at "+136".
https://forum.doom9.org/showthread.php?p=2010341#post2010341
jpsdr
20th July 2025, 16:18
I forgot this one...
Barough
20th July 2025, 20:55
@Barough
Hi.
What repository are you using?
I've just checked the one I'm using (the master branch), I'm still only at "+136".
https://bitbucket.org/multicoreware/x265_git.git
jpsdr
21st July 2025, 17:26
https://bitbucket.org/multicoreware/x265_git.git
So do I.
So it must be what Z2697 linked to.
Barough
21st July 2025, 18:31
So do I.
So it must be what Z2697 linked to.
Beats me....
LigH
21st July 2025, 18:47
I stopped compiling x265 since an issue with the version numbering in M-AB-S was discovered, which should be solved as soon as a new version gets tagged.
Balling
24th August 2025, 12:47
@Balling
Sorry, I did not understand your answer. I think bit depth is unrelated to the topic.
It is unrelated, that was just an example. That is, for 10 bit it would be 5 bits used for Cb and Cr together. What I am saying is that Cb and Cr together is less data, so it needs to be restored to full 10 bit, that 5 bit needs to be somehow upscaled using some FIR algorithm to 20 bits. 5 -> 20 bit. Yep
LigH
24th August 2025, 19:20
Sorry, Balling, but your explanation is not correct. All the channels (luma Y as well as chroma differences Cb and Cr) have the same bitdepth (say, precision of values). But due to Chroma Subsampling they don't have the same resolution. In case of YUV 4:2:0, which is most common, chroma planes store average chrominance difference values for each square of 2×2 luminance samples (more or less equal to pixels).
For example, if you have a video with 1280×720 nominal pixels, the luma plane Y has 1280×720 sample values, but the chroma planes Cb and Cr only store 640×360 sample values. All of them have the same precision, though, e.g. 8 or 10 bit per sample value. Splitting these chroma bits per pixel is not valid.
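The arithmetic can be checked with a few lines of Python. This is only an illustration of the numbers above; the function names are made up, and odd dimensions are rounded up for the chroma planes as an assumption.

```python
def plane_sizes_420(width, height):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for 4:2:0 video."""
    luma = (width, height)
    # Each chroma plane has half the width and half the height (rounded up).
    chroma = ((width + 1) // 2, (height + 1) // 2)
    return luma, chroma

def average_bits_per_pixel_420(bit_depth):
    """Storage averaged over pixels; every sample keeps the full bit depth."""
    # One luma sample per pixel, plus two chroma samples shared by 4 pixels.
    return bit_depth * (1 + 2 / 4)

print(plane_sizes_420(1280, 720))
print(average_bits_per_pixel_420(8), average_bits_per_pixel_420(10))
```

The averaged figure is a storage number only; each stored Y, Cb and Cr value still has the full 8 or 10 bit precision.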
Z2697
24th August 2025, 21:37
I guess it's just a very unintuitive way of saying "chroma planes are lower resolution and equivalently less bits". (storage bits per sample)
Which makes little sense. But they are somewhat "frequently" used in the general topic.
Like the "12" in "NV12" (I assume). But then you have "NV21" which makes no sense again. This whole thing is just a hot mess.
You think when the planes are all full resolution, the situation should be better? LOL! Meet RGB16(5-6-5) and Y416.
Anyway, just like anything in the life, you learn the rules, and then you learn the f..kton lot of exceptions to the rules.
"Here's how we name things."
"Here're how we name things differently when they are xxx." (repeated 1000 times)
I don't think Balling is a newbie that mixes up this nonsensical equivalent bits per sample with the precision, but this way of representing things is ambiguous and should be avoided.
Interpolation to more bits makes no sense.
LigH
24th August 2025, 21:54
Alright.
It's just sad that the more things are named wrong out of convenience, the fewer precise terms are left which are not yet "burned".
So many terms which originally had a sensible meaning but now are misunderstood by the average people because they never heard of the original meaning and only ever heard the perverted one.
Sorry, off-topic rant.
Z2697
24th August 2025, 22:23
I don't think they are even convenient at all, or maybe they were convenient at some point, but caused more trouble down the road.