2025 Dolby Vision 4k UHD procedure? [Archive]

aSquirrel

6th November 2025, 05:04

Let me start by saying I don't own a 4k HDR monitor, so I can't actually verify if my encodes are good or not. I'm encoding now mostly because I have the time to do so.

My goals are:

Reduce file size with minimal/no quality loss
Encode fast enough that the energy cost of encoding does not erase savings from reduced file size.
Playback will only ever occur on a full desktop PC. No need to support streaming, plex, standalone players, phones, or any other playback type.
I like to scrub around in my movies/skip scenes, so high I-frame intervals are a bug, not a feature. The default i-frame settings are good.

I ran many tests with h265 and AV1 on 1080p movies I know well and settled on the settings below as settings where I cannot see a quality difference between the original and my encode, but my encode takes up 1/2 to 1/10th the space of the original Blu Ray backup.

Here is the command I use to encode 4k UHD movies today:

ffmpeg -y -i input.mkv -map 0:v:0 -c:v libx265 -crf 21 -preset slow -tune grain -x265-params asm=avx512:slices=8:frame-threads=16:pools=* -map 0:a:0 -c:a libopus -ac 2 -b:a 160k output.mkv

I downmix the audio to stereo because that's all I have at this time. Later on, when I have a proper surround setup, I can get that from the backup and replace the audio track.

Finally, all that out of the way, here is my question:

In 2025, what is the proper way to transcode a Dolby Vision 4k UHD movie? I get a bunch of "[hevc @ 0x563fdd9a4ec0] Skipping NAL unit 63" when I use the command above.

I only today discovered that maybe I need repeat-headers=1:hdr-opt=1 in my x265-params? Why wouldn't repeat-headers get turned on automatically if it's required? hdr-opt is listed as experimental, so I can understand that not being on, but the description for it says "Add luma and chroma offsets for HDR/WCG content. Input video should be 10 bit 4:2:0. Applicable for HDR content" and...that may as well be greek to me. Is there a good reason to turn that on?

Also, are my current 4k encodes junk because I didn't have repeat-headers=1 active? Or will I only know when I buy a HDR monitor and test for myself?

My CPU is a 9800x3d, if it matters.

microchip8

6th November 2025, 08:46

Encoding DoVi with x265 is a whole procedure. Firstly, x265 only supports DoVi profiles 5, 8.1 and 8.2 single layer (SL), so no dual layer (DL) and no profile 7, which is used on virtually all DoVi/UHD discs. You will also have to extract the RPU with something like dovi_tool and add it to the encode with --dolby-vision-rpu <file>

IIRC, hdr10-opt is deprecated (I could be wrong), but it should be enabled for HDR content. For HDR content, I highly recommend using cbqpoffs -2 and crqpoffs -2. What this does is lower the compression of the blue/red chroma channels relative to luma. Why is this important? Well, 4K/UHD/HDR content has a much wider color gamut than SDR so less compression of the chroma channels will give you more "intense" colors at the expense of slight bitrate increase, which in my experience is negligible. repeat-headers is not mandatory but I always have it on for HDR

You can get compiled dovi_tool for Linux from my server https://nextcloud.teambelgium.net/s/y86wANBJNP4q2Pr

GeoffreyA

6th November 2025, 10:15

As a side note on the audio, I find that using loudnorm to perform EBU R 128 normalisation, with two passes and an LRA of 14, helps to flatten dynamic range a bit so that differences between dialogue and action are minimised.

bredboi

6th November 2025, 10:31

for DoVi you have the option to play around with DoViBaker (https://github.com/erazortt/DoViBaker) for P7 FEL sources which ranges from offering a negligible difference in image quality to removing notable compression issues on poorly encoded BLs. this gives you the full 12 bit presentation but in a standard PQ/HDR10 format for easy encoding rather than the original dual layer format. this slows encoding down in my experience so feel free to skip it in most cases. I'm pretty sure there's a list of titles with poorly encoded BLs somewhere but it is eluding my searches at this moment. iirc it's mostly StudioCanal releases.

as microchip said, you'll need to use dovi_tool (https://github.com/quietvoid/dovi_tool) to extract the RPU from the source and also make sure it converts to P8.1 for single layer HDR10 formatting. this RPU can be fed into x265 or optionally re-injected into your encoded HEVC elementary stream using dovi_tool again (remember dovi_tool doesn't support demuxing from containers so you're generally better off encoding to .hevc here and then muxing with encoded audio later if you go this route).
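The extract / encode / inject flow described above can be sketched as a few command builders. This is a minimal sketch, not a tested pipeline: the file names are placeholders, the helper names are made up for illustration, and the `-m 2` mode (convert the RPU to profile 8.1) plus the `extract-rpu` / `inject-rpu` subcommands are taken from dovi_tool's README, so check them against the version you have installed.

```python
import shlex


def rpu_extract_cmd(source_mkv, rpu_out="RPU.bin"):
    """Shell pipeline: ffmpeg demuxes the video to Annex B HEVC on stdout,
    dovi_tool reads it from stdin and writes the RPU, converted to profile 8.1."""
    demux = ["ffmpeg", "-i", source_mkv, "-map", "0:v:0", "-c:v", "copy",
             "-bsf:v", "hevc_mp4toannexb", "-f", "hevc", "-"]
    extract = ["dovi_tool", "-m", "2", "extract-rpu", "-", "-o", rpu_out]
    return shlex.join(demux) + " | " + shlex.join(extract)


def x265_dovi_args(rpu_in="RPU.bin"):
    """Extra arguments for the standalone x265 CLI (route 1: encoder adds the RPU)."""
    return ["--dolby-vision-rpu", rpu_in, "--dolby-vision-profile", "8.1"]


def rpu_inject_cmd(encoded_hevc, rpu_in="RPU.bin", out_hevc="injected.hevc"):
    """Argument list for route 2: inject the RPU into an already encoded .hevc stream."""
    return ["dovi_tool", "inject-rpu", "-i", encoded_hevc,
            "--rpu-in", rpu_in, "-o", out_hevc]


# Placeholder file names; print the commands instead of running them.
print(rpu_extract_cmd("input.mkv"))
print(shlex.join(rpu_inject_cmd("encode.hevc")))
```

Only one of the two routes is needed. As far as I know, x265 also expects VBV limits (vbv-maxrate and vbv-bufsize) to be set when its Dolby Vision options are used.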

the HEVC encode, besides passing in the RPU, should be largely the same as any standard HDR10 encode; DV is merely metadata that sits upon a standard HDR presentation.

Z2697

6th November 2025, 12:46

Don't use slices and tune grain. (Actually, don't use slices together with frame-threads (>1); that's a bug in x265 that they refuse to fix.)
Increasing frame-threads generally isn't gonna give you more speed; the default value should be good enough.
Disable SAO (it was disabled by tune grain).

A good way of speeding up preset slow is to disable rect and change the ME to a faster one.

microchip8

6th November 2025, 13:18

A good way of speeding up preset slow is to disable rect and change the ME to a faster one.

Or you can keep it on (rect and amp) and use limit-modes to speed it up :)

GeoffreyA

6th November 2025, 13:41

+1 for disable SAO.

rwill

6th November 2025, 14:57

bredboi

6th November 2025, 15:29

I am curious... whats a poorly encoded base layer and how is the enhancement layer fixing it?

the Base Layer (BL) in a DV P7 encode is the underlying HEVC HDR10 video. on non-DV capable hardware, this is played as-is.

DV P7 includes the Enhancement Layer (EL), which is a second video track. this second video track is unwatchable by itself, but can contain data derived from the difference between that BL encode and the original 12 bit DV mezzanine video*. upon playback with compatible hardware, the EL is combined with the BL to offer a presentation closer to the 12 bit DV mezzanine.
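This is also where the "Skipping NAL unit 63" messages from the first post come from. By Dolby Vision convention (as handled by tools such as dovi_tool), the RPU is carried in NAL unit type 62 and the enhancement-layer data in type 63. Both are "unspecified" types in HEVC, so a plain HEVC decoder skips them. A minimal sketch of reading the type from the two-byte HEVC NAL header (the helper names are hypothetical):

```python
# HEVC NAL unit header, 16 bits:
#   forbidden_zero_bit (1) | nal_unit_type (6) | nuh_layer_id (6) | nuh_temporal_id_plus1 (3)
DOVI_NAL_TYPES = {
    62: "Dolby Vision RPU",
    63: "Dolby Vision enhancement layer",
}


def hevc_nal_unit_type(nal_header):
    """Return nal_unit_type from the first byte of a NAL unit header."""
    if len(nal_header) < 2:
        raise ValueError("an HEVC NAL header is two bytes long")
    return (nal_header[0] >> 1) & 0x3F


def describe_nal(nal_header):
    """Name the Dolby Vision NAL types; report everything else by number."""
    nal_type = hevc_nal_unit_type(nal_header)
    return DOVI_NAL_TYPES.get(nal_type, "regular HEVC NAL unit (type %d)" % nal_type)
```

So the message is the decoder ignoring the EL, not a sign that the base layer was decoded incorrectly.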

nominally, the intent of DV P7 is to offer a 12 bit presentation despite 4KBD's 10 bit limitation. realistically, the difference between 10 bit and 12 bit is minimal at the best of times, so its usefulness is arguable.

due to the way it works, with the EL being intrinsically based on the final HEVC BL encode and the original DV mezzanine (which will be either ProRes, JPEG2000, or TIFF format, with basically no visible compression artefacts), the EL can actually cancel out compression artefacts that exist within the BL. in most cases, the BL is well encoded, and as such the effect on compression is minimal. but, in the case that the BL is poorly encoded (that is, it has notable visible compression artefacts), the EL can clean up these artefacts for an overall higher quality video stream.

so, in those cases, it will be beneficial to use DoViBaker which can give you the full 12 bit DV video to clean up the image before additional compression.

*not all DV P7 discs contain the 12 bit data, many will contain just the RPU for tonemapping data. this is referred to as a "MEL" (Minimal Enhancement Layer) as opposed to a "FEL" (Full Enhancement Layer) which does contain the 12 bit data
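The residual idea above can be shown with a toy example. It is only a conceptual sketch: the real Dolby Vision composer uses prediction and quantisation parameters from the RPU, and the real EL is itself a lossy video stream, so reconstruction on a disc is approximate rather than exact. The sample values and the injected "compression error" are invented.

```python
def make_layers(mezzanine_12bit, bl_error):
    """Split 12-bit samples into a 10-bit base layer plus a residual layer.

    bl_error simulates lossy compression damage to the base layer.
    """
    base = [min(1023, max(0, (m >> 2) + e))
            for m, e in zip(mezzanine_12bit, bl_error)]
    # The residual is whatever the upscaled base layer fails to reproduce.
    residual = [m - (b << 2) for m, b in zip(mezzanine_12bit, base)]
    return base, residual


def reconstruct(base, residual):
    """Upscale the base layer to 12 bits and add the residual back."""
    return [(b << 2) + r for b, r in zip(base, residual)]


mezzanine = [1000, 1001, 1002, 1003, 2047, 4095]   # invented 12-bit samples
error = [0, 0, 1, 0, -3, 0]                        # invented BL encoding damage
base, residual = make_layers(mezzanine, error)
restored = reconstruct(base, residual)
```

Because the residual is computed against the damaged base layer, it carries both the two missing bits and the correction for the damage. That is the sense in which an EL can "clean up" a poorly encoded BL.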

microchip8

6th November 2025, 15:55

+1 bredboi for explaining. I'm not very versed in DoVi (it seems overly complex) so I keep learning :)

Kuler087

6th November 2025, 15:57

I am curious... whats a poorly encoded base layer and how is the enhancement layer fixing it?

I have many examples in here:
https://docs.google.com/spreadsheets/d/15i0a84uiBtWiHZ5CXZZ7wygLFXwYOd84/edit?gid=1226038728#gid=1226038728

look when the NOTES says, ''FEL improve grain/details''.

This is one good example:
https://slow.pics/c/bTBjYZuB
*HDR images, must be viewed on Chrome, Edge.

Z2697

6th November 2025, 16:52

EL is a low resolution low bitrate hard-to-compress noise-like video, I'd say any "improvement" in terms of compression artefacts is just hallucination.
However it's possible that there's other information being lost in the BL encoding (maybe due to some errors) that can be somewhat "retained" in the residual of the munted EL.
But yeah I'd like to think that 90% of the actual benefit of DoVi is the dynamic metadata.

When I say "artefact" I mean the DCT ringing, blocks, the "unwanted". The loss of detail due to insufficient bit allocation is not included in this context.
I mean, the EL may compensate for some artefacts, but it will introduce its own artefacts, due to its nature, being an encoded video stream as well (let alone one with a lower spec).

Z2697

6th November 2025, 16:53

I have many examples in here:
https://docs.google.com/spreadsheets/d/15i0a84uiBtWiHZ5CXZZ7wygLFXwYOd84/edit?gid=1226038728#gid=1226038728

look when the NOTES says, ''FEL improve grain/details''.

This is one good example:
https://slow.pics/c/bTBjYZuB
*HDR images, must be viewed on Chrome, Edge.

It's different, but is it really better?

microchip8

6th November 2025, 17:00

Guys, we're supposed to help OP with how to encode DoVi with ffmpeg/x265. Discussing DoVi itself and all its layers is out of scope. Let's keep to the topic, or we'll confuse OP (as he seems to be unsure about many DoVi things) :)

Kuler087

6th November 2025, 17:06

It's different, but is it really better?

Are you serious, lol? You can’t see the smeared grain in the white areas of the image? There are tons of comparisons in that link showing clear improvement, just look at the Halloween II (1981) comparison.

Anyway, as microchip8 said, off topic

Z2697

6th November 2025, 17:22

Are you serious, lol? You can’t see the smeared grain in the white areas of the image? There are tons of comparisons in that link showing clear improvement, just look at the Halloween II (1981) comparison.

Anyway, as microchip8 said, off topic

I don't mind if some like grains too much, but without a reference material we can't just say more grain = better.

Kuler087

6th November 2025, 17:30

I don't mind if some like grains too much, but without a reference material we can't just say more grain = better.

This will be my last reply on this here, but we do know!

The HDR10 Base Layer is encoded by comparing pixels, pixel by pixel, against the 12-bit mezzanine file, and ANY difference(grain, banding, black colors, brightness) is encoded in the Enhancement Layer. The only way the decoded BL+FEL image could show more grain is if the 12-bit master actually contained that grain or information.

FEL encoding doesn’t create fake grain out of nothing. When properly encoded, there will be no difference in grain structure between the BL and EL.

Z2697

6th November 2025, 18:02

This will be my last reply on this here, but we do know!

The HDR10 Base Layer is encoded by comparing pixels, pixel by pixel, against the 12-bit mezzanine file, and ANY difference(grain, banding, black colors, brightness) is encoded in the Enhancement Layer. The only way the decoded BL+FEL image could show more grain is if the 12-bit master actually contained that grain or information.

FEL encoding doesn’t create fake grain out of nothing. When properly encoded, there will be no difference in grain structure between the BL and EL.

You are right, I was just thinking you know, there's another layer of distortion which is the final encoding - and a very "heavy" one.
But maybe it's a situation like, "you need 99% of the bitrate to retain 99% of the detail, but you only need 10% of the bitrate to retain 69% of the detail" (just a made-up example).

The compression ratio for EL is more extreme, but maybe what's left is still enough to make BL+FEL better.

aSquirrel

6th November 2025, 18:33

IIRC, hdr10-opt is deprecated (I could be wrong), but it should be enabled for HDR content. For HDR content, I highly recommend using cbqpoffs -2 and crqpoffs -2. What this does is lower the compression of the blue/red chroma channels relative to luma. Why is this important? Well, 4K/UHD/HDR content has a much wider color gamut than SDR so less compression of the chroma channels will give you more "intense" colors at the expense of slight bitrate increase, which in my experience is negligible. repeat-headers is not mandatory but I always have it on for HDR

Thanks for that. Near as I can tell hdr10-opt isn't deprecated (at least it doesn't say it is in the official docs). I guess I really need to invest in a 4k UHD monitor to test all these settings. sigh. I was hoping to put that off until the tech had matured a bit more (right now it's microLED for bright scenes and OLED for dark scenes. I want to own one monitor, not two :/). Plus a couple more generations of video cards and playing 60FPS 4k video games won't be hit and miss anymore (well, on cards that cost < $1000).

As a side note on the audio, I find that using loudnorm to perform EBU R 128 normalisation, with two passes and an LRA of 14, helps to flatten dynamic range a bit so that differences between dialogue and action are minimised.

Can you give me the actual command(s) to do that? On my old, cheap, speakers and my headphones, normalization isn't a problem, but I got some proper speakers and now I have the same problems with normalization everyone complains about. What you said in technical terms is so far outside my wheelhouse I can't really trust I'd even be googling the right thing.

Don't use slices and tune grain. (Actually, don't use slices together with frame-threads (>1); that's a bug in x265 that they refuse to fix.)
Increasing frame-threads generally isn't gonna give you more speed; the default value should be good enough.
Disable SAO (it was disabled by tune grain).

A good way of speeding up preset slow is to disable rect and change the ME to a faster one.

What is SAO? I did some testing this morning and saw the problem with slices. On an episode of Star Trek it added about 20% to the file size (4GB became 5.2 for the same episode). Sadface. The only reason I was using pools/frame-threads was because in 4k encodes I feel like my CPU is being starved somehow. I can hear the fans ramping down at certain periods (nothing else is running except a couple idle terminal windows, and it's linux). So that tells me there's probably some performance in the CPU left on the table. It was my attempt to squeeze that out. It didn't work, and made the files bigger without gain, so back to the drawing board. My 1080p SDR encodes do not have this problem, they run a constant (max) fan speed the entire time.

The reason I use grain tune is because I want a lot of the very fine detail to be preserved, like the granite texture in the lobby scene of The Matrix. I didn't find any other settings that would preserve that texture without making it look like Vaseline was smeared on the columns. I'm open to try other things, but I feel like I tried a lot and nothing did as good as grain does.

Also what is 'ME' and how would I change that? And how would I disable rect? What does rect even do?

Thanks so much all. Learning!

rwill

6th November 2025, 19:18

This is one good example:
https://slow.pics/c/bTBjYZuB
*HDR images, must be viewed on Chrome, Edge.

*sigh*
OK, I see what's meant by poor BLs and how the EL can reconstruct that detail. I took some care when implementing one of the EL generation libraries that this is possible and somewhat easy for the EL video encoder. Keep in mind that the BL can be 600 or 1000 nits HDR10 and the reconstructed Dolby Vision 4000 nits or so. So the EL might contain more than just details but also the color grading differences.

Well anyway, are there publicly available DoVi PC players yet? Otherwise I see a problem with this goal:

Playback will only ever occur on a full desktop PC. No need to support streaming, plex, standalone players, phones, or any other playback type.

So no TV I guess. If no DoVi PC Player is available he has to use DoViBaker to generate a HDR10 and encode/view that, but is missing out on the dynamic metadata.

So what he needs is a DoViBaker and x265 HDR10 crash course?

aSquirrel

6th November 2025, 19:29

So no TV I guess. If no DoVi PC Player is available he has to use DoViBaker to generate a HDR10 and encode/view that, but is missing out on the dynamic metadata.

So what he needs is a DoViBaker and x265 HDR10 crash course?

I had been wondering about that. What a shitshow.

Am I really missing that much with Dolby Vision? Like, if I just don't copy the data over, I still get normal HDR10 data, right? Is it *that* noticeable? Is the juice worth the squeeze?

rwill

6th November 2025, 19:30

What is SAO?

Sample adaptive offset (SAO) is an HEVC tool that x265 implements poorly, so it's better turned off.

The reason I use grain tune is because I want a lot of the very fine detail to be preserved, like the granite texture in the lobby scene of The Matrix. I didn't find any other settings that would preserve that texture without making it look like Vaseline was smeared on the columns. I'm open to try other things, but I feel like I tried a lot and nothing did as good as grain does.

Was this at the same bitrate or did -tune grain balloon the size up by 2x ? When I took a look the details and grain came at a huge size penalty.

Also what is 'ME' and how would I change that? And how would I disable rect? What does rect even do?

ME is motion estimation: the part of the encoder that tries to find, for a block in the current frame, the best match in a previous frame to copy from, which saves bits. x265 supports several motion estimation algorithms that vary in speed/quality.

Rect is for rectangular block partitions so the encoder is no longer limited to quads only. This comes at a harsh speed hit for a better compression efficiency/quality.
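To make the ME description concrete, here is a toy exhaustive block search using the sum of absolute differences (SAD) as the matching cost. It is a sketch of the general technique only, not x265's code: real encoders use smarter search patterns (hex, umh, star), sub-pixel refinement (subme) and a rate-aware cost. Frames are plain lists of rows, and `merange` is named after the x265 option purely by analogy.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur, ref, bx, by, size, merange):
    """Try every displacement within +/-merange and return (cost, dx, dy)
    for the best match, or None if no candidate fits inside the frame."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-merange, merange + 1):
        for dx in range(-merange, merange + 1):
            # Skip candidates that would read outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + size > height or bx + dx + size > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

An exhaustive search like this tests (2 * merange + 1)^2 candidates per block, which is why faster patterns such as hex exist, and why enlarging the range costs more with some algorithms than others.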

rwill

6th November 2025, 20:20

Am I really missing that much with Dolby Vision? Like, if I just don't copy the data over, I still get normal HDR10 data, right? Is it *that* noticeable? Is the juice worth the squeeze?

It is very noticeable under the right viewing conditions.

An enthusiast setup would be a good high end Dolby Vision TV, the original UHD Bluray in a quiet player and a Dolby Atmos receiver if applicable. Then watch at the right distance on a comfy couch in a dark room etc.

Even if there would be a Player for PC I personally think a normal PC setup at a desk is the wrong way to get a good movie experience.

If you just want to watch the movies, hell around the year 2000 people were watching 240p CAM Rips from shaky handheld camcorders taken in normal crowded cinemas, audience mumble, heads in the picture, horrible microphone audio and all. And they liked it.

microchip8

6th November 2025, 20:28

I personally don't bother with DoVi, which is optional on BD discs/content, so there's always HDR10, which is what I use when encoding. Nothing's perfect (what is?) and I'm quite happy with just HDR10. On my TCL Mini LED TV, DoVi looks darker than HDR10 unless I select dynamic or a similar mode. And there are many DoVi variations that I just don't want to bother with. There's only one HDR10 and it's good enough for me :)

GeoffreyA

6th November 2025, 21:18

aSquirrel

7th November 2025, 00:48

Was this at the same bitrate or did -tune grain balloon the size up by 2x ? When I took a look the details and grain came at a huge size penalty.

ME is motion estimation: the part of the encoder that tries to find, for a block in the current frame, the best match in a previous frame to copy from, which saves bits. x265 supports several motion estimation algorithms that vary in speed/quality.

Rect is for rectangular block partitions so the encoder is no longer limited to quads only. This comes at a harsh speed hit for a better compression efficiency/quality.

It did. However, when I don't use grain, to preserve the same level of detail the CRF is so low that it also doubles the file size. If I were willing to accept the visual quality I'd lose by not using grain tune, I'd just encode with AV1 and nearly 5x the encode speed for the same file size. Grain tuning is one of those 'cost of doing business' settings.

I make no claim to be an expert, so it may be that out of all the switches that flips, there's a way to keep most of the file size savings and still preserve the fine details. I don't know what to even try to look into that.

I did a test run (well, part of a run, full test takes an hour) with rect=0 and it actually lowered my bitrate by around 2mb/s while adding about 8 FPS to the encode (18 to 26). I...don't understand that, unless rect is really kinda ass at doing the thing it's supposed to be good at (higher compression). Or is that a quirk of modern CPUs with AVX512 where the squares are just SO much faster to process? I won't be able to run quality checks until the encode finishes.

With rect=0, and ME set to hex with 64 range and subpel 1, I was getting around 33 FPS and the same bitrate. Again, I need the tests to finish before I'll know if the settings are acceptable.

Almost all the movies and TV I own are action or drama. Marvel movies, The Matrix, Bourne, Dredd, stuff like that. For TV it tends to be scifi. Star Trek, The Chosen, etc. It's always humans, I own very little to no animation/cartoon type content (because the remaster of Muppet Christmas Carol is apparently streaming-only. LAME!). Toy Story would be the only animation type movie I'd have in my collection.

What would good ME settings be for that kind of content? Given my goal is to preserve nearly lossless visual quality without wasting bitrate. (first part > second; bitrate is the result of quality, but setting CRF to 10 just wastes bitrate)

microchip8

7th November 2025, 07:36

excellentswordfight

7th November 2025, 08:19

The reason I use grain tune is because I want a lot of the very fine detail to be preserved, like the granite texture in the lobby scene of The Matrix. I didn't find any other settings that would preserve that texture without making it look like Vaseline was smeared on the columns. I'm open to try other things, but I feel like I tried a lot and nothing did as good as grain does.

Tune grain is a very old preset made for a very early version of x265 and has never been updated, and there is a broad consensus that its parameters are not optimal. Even at very high bitrates it can cause issues, as it does not produce a transparent image and can introduce and amplify noise. In CRF mode it will also cause your bitrate to pretty much explode if you encode actual grainy content.

This is a good baseline:

--preset slow --profile main10 --no-sao --deblock -1:-1 --hdr10-opt

Also remember that master-display and max-cll need to be set per title, and use dovi_tool to extract and inject the RPU for DoVi. I also think that it's best practice to specify the level (level-idc), as I've encountered numerous playback issues with unrestricted VBV (I actually go further and cap it at 100 Mbps for all UHD encodes, as I've seen several SoCs that claim level 5.1 playback but struggle with 160 Mbps high tier). Lookahead (rc-lookahead) can also be increased to 4-5 s (96/120 for 24 fps content), as there are bit-allocation and frame-type benefits to having a large lookahead, and the only downside is that it will use more RAM (which will be fine for most systems nowadays).
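The lookahead figures are just seconds multiplied by frame rate. A small sketch of that arithmetic, assuming the upper limit of 250 frames that the x265 documentation gives for rc-lookahead (the function name is made up):

```python
X265_MAX_RC_LOOKAHEAD = 250  # upper bound per the x265 docs (assumption to verify)


def lookahead_frames(seconds, fps):
    """Convert a lookahead duration into a frame count for rc-lookahead."""
    return min(X265_MAX_RC_LOOKAHEAD, round(seconds * fps))


film_4s = lookahead_frames(4, 24)            # 4 seconds of 24 fps content
film_5s = lookahead_frames(5, 24)            # 5 seconds of 24 fps content
ntsc_4s = lookahead_frames(4, 24000 / 1001)  # 23.976 fps rounds to the same count
```

The same multiplication explains the keyint value used in this thread: 240 frames is 10 seconds at 24 fps.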

So complete commandline could look something like this for 2160p24 HDR:
--preset slow --profile main10 --level-idc 51 --crf 16 --keyint 240 --min-keyint 24 --rc-lookahead 96 --no-sao --deblock -1:-1 --hdr10-opt --repeat-headers --vbv-maxrate 100000 --vbv-bufsize 100000 --colorprim bt2020 --transfer smpte2084 --colormatrix bt2020nc --max-cll "1000,400" --master-display "G(13250,34500)B(7500,3000)R(34000,16000)WP(15635,16450)L(10000000,1)"
(if filesize is important, crf would probably need to be increased for grainy/complex sources for it to still produce decent size savings, should still look very good up to 19-20)
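Since the first post drives x265 through ffmpeg, the standalone command line above has to be rewritten as -x265-params. A rough converter is sketched below under a few assumptions: preset, tune and profile go to ffmpeg's own libx265 options; bare switches become name=1 and --no-name becomes name=0; and the deblock value is written with a comma, because ffmpeg splits the parameter string on colons (x265 accepts the comma form as far as I know). The function name is made up.

```python
import shlex

# Options the ffmpeg libx265 wrapper exposes itself instead of via -x265-params.
FFMPEG_FLAGS = {"preset": "-preset", "tune": "-tune", "profile": "-profile:v"}


def x265_cli_to_ffmpeg(cli):
    """Return (ffmpeg_args, x265_params_string) for an x265 CLI option string."""
    tokens = shlex.split(cli)
    ffmpeg_args, params = [], []
    i = 0
    while i < len(tokens):
        if not tokens[i].startswith("--"):
            raise ValueError("expected an option, got %r" % tokens[i])
        name = tokens[i][2:]
        # A following token is a value unless it is another --option
        # (negative numbers such as -1:-1 start with a single dash).
        has_value = i + 1 < len(tokens) and not tokens[i + 1].startswith("--")
        value = tokens[i + 1] if has_value else None
        i += 2 if has_value else 1

        if name in FFMPEG_FLAGS:
            if value is None:
                raise ValueError("--%s needs a value" % name)
            ffmpeg_args += [FFMPEG_FLAGS[name], value]
        elif value is None:
            # Boolean switch: --no-sao -> sao=0, --hdr10-opt -> hdr10-opt=1
            params.append(name[3:] + "=0" if name.startswith("no-") else name + "=1")
        else:
            if name == "deblock":
                value = value.replace(":", ",")  # ":" separates x265-params entries
            params.append(name + "=" + value)
    return ffmpeg_args, ":".join(params)


args, params = x265_cli_to_ffmpeg(
    '--preset slow --profile main10 --crf 16 --no-sao --deblock -1:-1 '
    '--hdr10-opt --repeat-headers --max-cll "1000,400"'
)
```

The result would be used as `ffmpeg ... -c:v libx265 <args> -x265-params <params>`. For 10-bit output ffmpeg will likely also need `-pix_fmt yuv420p10le`.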

SAO is the Sample Adaptive Offset filter. In short, it trades detail for better compression; it is beneficial for low-bitrate encoding and some types of anime/cartoons, but should pretty much always be turned off for high-quality live-action content. no-sao paired with a slow preset should eliminate most grain/detail/sharpness issues. Not all pixel peepers think that's good enough, hence the huge number of tweaks; some will help, some can introduce their own issues that then call for more tweaks to combat those issues, and as mentioned some might even trigger bugs when combined. So if you are going down the tweaking path, enjoy the rabbit hole :) But do note that not all tweaks, in fact not most, are purely beneficial; as always, there are tradeoffs.

GeoffreyA

7th November 2025, 11:21

Can you give me the actual command(s) to do that? On my old, cheap, speakers and my headphones, normalization isn't a problem, but I got some proper speakers and now I have the same problems with normalization everyone complains about. What you said in technical terms is so far outside my wheelhouse I can't really trust I'd even be googling the right thing.

Here is a Windows batch script. I pipe the output from FFmpeg to QAAC, muxing the final file with MP4Box. Since you are using Opus, I altered that. You will have to mux the audio after the video has been encoded.

The loudnorm (https://k.ylo.ph/2016/04/04/loudnorm.html) filter works best in two passes. What you do is, adjust out_lra to a target LRA. Think of it as the dynamic range of the audio: smaller means more compressed and is better suited to headphones or ordinary speakers. Then run the script with the second FFmpeg line commented out. Now, take those values from the first pass and enter them into the in_* and tg_offset variables. Now, run the script with the first FFmpeg line commented out. It will take a few tries to get the final LRA to match your target value. I find that an out_lra of around 3–5 reaches a final LRA of around 14. Unfortunately, I haven't automated this. There is a Python tool, slhck's ffmpeg-normalize (https://github.com/slhck/ffmpeg-normalize), but it takes more work to get up and running.

When you copy the script, put the second FFmpeg command on one line. I split it here for readability.

setlocal
set out_i=-23
set out_tp=-2
set out_lra=14

set in_i=-21.1
set in_tp=8
set in_lra=16.9
set in_thresh=-32.4
set tg_offset=-0.7

ffmpeg -i %1 -map 0:a:0 -af aresample=ochl=stereo:osf=dbl,aresample=192000:resampler=soxr:precision=33,loudnorm=i=%out_i%:tp=%out_tp%:lra=%out_lra%:linear=true:print_format=summary -f null -

::ffmpeg -i %1 -map 0:a:0 -af aresample=ochl=stereo:osf=dbl,aresample=192000:resampler=soxr:precision=33,
loudnorm=i=%out_i%:tp=%out_tp%:lra=%out_lra%:measured_i=%in_i%:measured_tp=%in_tp%:measured_lra=%in_lra%:measured_thresh=%in_thresh%:offset=%tg_offset%:linear=true:print_format=summary,
aresample=48000:resampler=soxr:precision=33 -c:a libopus -b:a 160k "audio.opus"

endlocal
pause

Z2697

7th November 2025, 21:09

x265's tune grain is designed to minimize the QP fluctuations between blocks, so the grains are more "uniform".
Not preserved better, but uniform.
Which is bad. Unless you have a video that's been drowned in noise/grain, with hardly anything else to see, the uniformity of the grain should not be the number one concern.

Z2697

8th November 2025, 02:03

Here is a Windows batch script. I pipe the output from FFmpeg to QAAC, muxing the final file with MP4Box. Since you are using Opus, I altered that. You will have to mux the audio after the video has been encoded.

The loudnorm (https://k.ylo.ph/2016/04/04/loudnorm.html) filter works best in two passes. What you do is, adjust out_lra to a target LRA. Think of it as the dynamic range of the audio: smaller means more compressed and is better suited to headphones or ordinary speakers. Then run the script with the second FFmpeg line commented out. Now, take those values from the first pass and enter them into the in_* and tg_offset variables. Now, run the script with the first FFmpeg line commented out. It will take a few tries to get the final LRA to match your target value. I find that an out_lra of around 3–5 reaches a final LRA of around 14. Unfortunately, I haven't automated this. There is a Python tool, slhck's ffmpeg-normalize (https://github.com/slhck/ffmpeg-normalize), but it takes more work to get up and running.

When you copy the script, put the second FFmpeg command on one line. I split it here for readability.

setlocal
set out_i=-23
set out_tp=-2
set out_lra=14

set in_i=-21.1
set in_tp=8
set in_lra=16.9
set in_thresh=-32.4
set tg_offset=-0.7

ffmpeg -i %1 -map 0:a:0 -af aresample=ochl=stereo:osf=dbl,aresample=192000:resampler=soxr:precision=33,loudnorm=i=%out_i%:tp=%out_tp%:lra=%out_lra%:linear=true:print_format=summary -f null -

::ffmpeg -i %1 -map 0:a:0 -af aresample=ochl=stereo:osf=dbl,aresample=192000:resampler=soxr:precision=33,
loudnorm=i=%out_i%:tp=%out_tp%:lra=%out_lra%:measured_i=%in_i%:measured_tp=%in_tp%:measured_lra=%in_lra%:measured_thresh=%in_thresh%:offset=%tg_offset%:linear=true:print_format=summary,
aresample=48000:resampler=soxr:precision=33 -c:a libopus -b:a 160k "audio.opus"

endlocal
pause

So this script is just a "parsing output of first command" step away from fully automatic...
Which sounds both easy, and hard if constrained to batch...

Z2697

8th November 2025, 02:33

With rect and/or amp, you should use limit-modes=1 to speed it up. Otherwise, it's dog slow. Even with limit-modes set, it will slow down the encode by some but not as much as without limit-modes.

For ME, I use umh. It's slower than star but also goes deeper in analyzing than star. subme is good at 3 and above (I personally use the max of 7)

As I've said many times, x265's subme is an unimportant option and only affects BD rate less than 1%. (unless disabled completely)
But it has quite some speed drawback, so it's one of the first things to be optimized out when speed is concerned.
subme=2 is a good option, it's not slower than 1 and 0 by a lot.

Nor does it make the image sharper, in case you're going to repeat that as well.

Same story goes for rect and yes, ME, but they have more effect on BD rate.
ME is an essential part of the video encoding, but in fact hex is doing a good job already.
hex with larger range can sometimes even compress better than star or umh with default range.
By the way, the merange slows down hex much less than umh, and umh slightly less than star. (so technically, umh can be faster than star)

And as a side note, x264's subme is different, it's actually multiple settings bound together.
x264's hex seems to be capped at merange 16, unlike x265.
(I keep comparing with x264 because x265 is based on x264)
(But they are not the same)

GeoffreyA

8th November 2025, 06:51

So this script is just a "parsing output of first command" step away from fully automatic...
Which sounds both easy, and hard if constrained to batch...

If the output, as JSON, could be parsed and fed to the second pass, that would eliminate the need for the user to touch anything, except for the loudness, true peak, and LRA, the first two of which generally don't need to be adjusted.

With batch, it would be a veritable pain in the backside. Python would be an obvious answer, but I don't know the language and one has got to have a Python environment just to run the script. Probably, a C++ console application, using nlohmann's library, would be the easiest and cleanest approach. The source file and target parameters could be passed as command-line arguments.
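For what it's worth, the parsing step is short in Python. The sketch below assumes the first pass is run with print_format=json instead of summary, so that loudnorm prints a JSON object with keys such as input_i and target_offset at the end of ffmpeg's stderr. The function names and the sample numbers in the usage are invented; the filter options mirror the batch script above.

```python
import json


def parse_loudnorm_stats(stderr_text):
    """Extract the JSON object loudnorm prints at the end of ffmpeg's stderr."""
    start = stderr_text.rfind("{")
    end = stderr_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no loudnorm JSON block found")
    return json.loads(stderr_text[start:end + 1])


def second_pass_filter(stats, out_i=-23, out_tp=-2, out_lra=14):
    """Build the second-pass loudnorm filter string from first-pass measurements."""
    template = (
        "loudnorm=i={i}:tp={tp}:lra={lra}"
        ":measured_i={input_i}:measured_tp={input_tp}"
        ":measured_lra={input_lra}:measured_thresh={input_thresh}"
        ":offset={target_offset}:linear=true:print_format=summary"
    )
    # Unused keys in stats (output_i and so on) are ignored by str.format.
    return template.format(i=out_i, tp=out_tp, lra=out_lra, **stats)


# Invented first-pass output, shaped like loudnorm's JSON report.
sample_stderr = """
[Parsed_loudnorm_0 @ 0x0]
{
    "input_i" : "-21.10",
    "input_tp" : "-8.00",
    "input_lra" : "16.90",
    "input_thresh" : "-32.40",
    "target_offset" : "-0.70"
}
"""
audio_filter = second_pass_filter(parse_loudnorm_stats(sample_stderr))
```

In a real script the stderr text would come from something like subprocess.run([...], capture_output=True, text=True).stderr, and the resulting string would be placed in the second command's -af chain where the hand-edited loudnorm entry sits now.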

rwill

8th November 2025, 07:42

ME is an essential part of the video encoding, but in fact hex is doing a good job already.
hex with larger range can sometimes even compress better than star or umh with default range.
By the way, the merange slows down hex much less than umh, and umh slightly less than star. (so technically, umh can be faster than star)

Hex is unable to break out of local minima while umh and star can. I would always pick one of umh or star just so the ME doesn't fail completely in corner cases.

microchip8

8th November 2025, 08:27

As I've said many times, x265's subme is an unimportant option and only affects BD rate less than 1%. (unless disabled completely)
But it has quite some speed drawback, so it's one of the first things to be optimized out when speed is concerned.
subme=2 is a good option, it's not slower than 1 and 0 by a lot.

NOR does it make the image sharper, if you're also going to repeat yourself.

Same story goes for rect and yes, ME, but they have more effect on BD rate.
ME is an essential part of the video encoding, but in fact hex is doing a good job already.
hex with larger range can sometimes even compress better than star or umh with default range.
By the way, the merange slows down hex much less than umh, and umh slightly less than star. (so technically, umh can be faster than star)

And as a side note, x264's subme is different, it's actually multiple settings bound together.
x264's hex seems to be capped at merange 16, unlike x265.
(I keep comparing with x264 because x265 is based on x264)
(But they are not the same)

I do not notice a big slowdown between subme 3 and 7. It's less than 1.5 fps slowdown, so why not use 7 in such case? It definitely won't hurt. Regardless, my encodes look stunning and I'm quite happy with the balance between compression and speed. I do not yet encode in 4K as my PC is somewhat older and can't take it (unless I wait days for an encode) so scale down to FHD. On average, with rect & amp & subme 7, I get about 7.5-8 fps speed (sometimes more if input is very clean) and I'm quite happy with that

alexmorph3us

8th November 2025, 18:44

Tune grain is a very old preset made for a very early version of x265 and has never been updated, and there is a broad consensus that its parameters are not optimal. Even at very high bitrates it can cause issues, as it does not produce a transparent image and can introduce and amplify noise. In crf mode it will also cause your bitrate to pretty much explode if you encode actual grainy content.

This is a good baseline:

--preset slow --profile main10 --no-sao --deblock -1:-1 --hdr10-opt

Also remember that master-display and max-cll need to be set per title, and use DoviTool to extract and inject the RPU for dovi. I also think that it's best practice to specify the level (level-idc), as I've encountered numerous playback issues with unrestricted vbv (I actually go further and cap it at 100 Mbps for all UHD encodes, as I've seen several SoCs that state 5.1 playback but struggle with 160 Mbps high tier). Lookahead (rc-lookahead) can also be increased to 4-5 s (96/120 frames for 24 fps content), as there are bit-allocation and frametype benefits to having a large lookahead, and the only downside is that it will use more RAM (which will be fine for most systems nowadays).
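
The 96/120 figures are just the lookahead window in seconds multiplied by the frame rate. A minimal sketch of that arithmetic (the helper name is made up for illustration):

```python
def lookahead_frames(seconds, fps):
    """Convert a lookahead window in seconds to a whole number of frames."""
    return round(seconds * fps)


# 24 fps content, 4 s and 5 s windows.
for seconds in (4, 5):
    print(seconds, "s ->", lookahead_frames(seconds, 24), "frames")
```

The same helper gives 96 and 120 for 23.976 fps (24000/1001) sources, since the result is rounded to the nearest frame.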

So a complete command line could look something like this for 2160p24 HDR:

(if filesize is important, crf would probably need to be increased for grainy/complex sources for it to still produce decent size savings, should still look very good up to 19-20)

SAO is the Sample Adaptive Offset filter. In short, it trades detail for better compression, which is beneficial for low-bitrate encoding and some types of anime/cartoons, but it should pretty much always be turned off when doing high-quality live-action content. no-sao paired with a slow preset should eliminate most grain/detail/sharpness issues. Not all pixel peepers think that it's good enough, hence the huge number of tweaks. Some will help, some can introduce their own issues that then call for more tweaks to combat those issues, and as mentioned some might even create bugs when combined, so if you are going down the tweaking path, enjoy the rabbit hole :) But do note that most tweaks are not purely beneficial; as always, there are tradeoffs.

Is it possible to output to a working 12-bit format in order to recreate and in essence preserve the original presentation from the DV mezzanine video when it comes to movies where the enhancement layer (FEL) does make a difference? I'm interested in using tools such as Dolby Media Encoder (DEE) after that, but I can't figure out how to export the 12-bit presentation from the DoVi Baker script. Can you help me in this situation?

Kuler087

8th November 2025, 19:21

but I can't figure out how to export the 12-bit presentation from the DoVi Baker script
the dovi_baker output is in an RGB 16-bit container, so the 12-bit essence is preserved.

alexmorph3us

8th November 2025, 19:30

the dovi_baker output is in an RGB 16-bit container, so the 12-bit essence is preserved.

Yes, but I would like an intermediate format to work with, preferably lossless. Like you said, the FFmpeg JPEG2000 output is not compatible with DEE or CM Analyzer, so what other solutions are there?

I was thinking of using the TIFF format. Could this work as an export from DoVi Baker? From there maybe I could use the Dolby Mezzanine command tool from the Professional Tools in order to create an MXF file container with sidecar XML metadata.

From what I understand, Resolve doesn't "bake" the second layer.

aSquirrel

8th November 2025, 21:28

All of this discussion is good, and it made me go back to my SDR encodes and redo them. It takes a few days because...maybe I'm dumb, maybe I'm stubborn, but I re-encode the entire movie I use as my reference (The Matrix) with all the various settings one at a time (except for AQ settings, as those need to work together) to see what the file size/performance/quality impacts are. Downside? I can only run about 3 test passes per day on my system. Is this sort of data at all useful? What I produce is:

The resulting encode file size in bytes
Encode time in seconds
The SSIM score average of all frames in the video (I chose this out of all metrics because my goal is near-lossless, so to my understanding, actual lossless should have an SSIM of 1)
A list of the top 100 most difficult to encode frames, by frame number
Screenshot(s) of difficult-to-encode frames / a list of which frames are difficult to encode for those settings.
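
For reference, the average SSIM and the "top 100 most difficult frames" list can be produced from a per-frame log with a few lines of Python. This is only a sketch: it assumes the log comes from FFmpeg's ssim filter with its stats_file option, whose lines look like "n:1 Y:0.93 U:0.96 V:0.94 All:0.94 (12.86)". The function names are made up, and this is not necessarily how the tooling described above works.

```python
import re

# Matches the frame number and the combined "All" score on one stats line.
LINE_RE = re.compile(r"n:(\d+)\s.*?All:([0-9.]+)")


def read_ssim(lines):
    """Map frame number -> combined SSIM score for every parsable line."""
    scores = {}
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            scores[int(m.group(1))] = float(m.group(2))
    return scores


def summarize(scores, worst_n=100):
    """Return (mean SSIM, frame numbers of the worst_n lowest-scoring frames)."""
    mean = sum(scores.values()) / len(scores)
    # Lowest score first; ties are broken by frame number for stable output.
    worst = sorted(scores, key=lambda n: (scores[n], n))[:worst_n]
    return mean, worst
```

Typical use would be summarize(read_ssim(open("ssim.log"))), where ssim.log is a hypothetical file name. The frame numbers in the worst list are the ones to pull screenshots from.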

Much of my testing is automated, I just have to write the test cases and run the tools and I get all this data automatically. For example, the 'reference frame' I use for quality is during the Lobby Scene, and consistently the most difficult to encode frame is just before Neo comes out from behind the first pillar he hides behind (essentially an entire frame of exploding debris; high speed, in/out of focus areas, lots of subtle grays, film grain, etc. Worst-case, more or less).

Would that data be helpful for building a reference of 'this setting does X'? I realize how settings interplay with each other would be the next step. rect=0 on its own is one thing, but rect=0 with some other settings may produce a dramatically different result.

As for why I encode the entire movie, my thought is that if motion estimation, for example, is really bad, I'll start seeing low SSIM frames in part(s) of the movie I don't expect. Or if scene transitions break, I'll see lots of low SSIM around that.

Moreover, the reason to go back to SDR, and my logic may be flawed (tell me if so) is that once I move back to HDR encoding, most of the settings will remain the same, and I'll only need to add in HDR specific settings. So by making sure I have a solid foundation here, I am building my house on the rock, not the sand. Am I right in this belief?

aSquirrel

8th November 2025, 21:30

If the output, as JSON, could be parsed and fed to the second pass, that would eliminate the need for the user to touch anything, except for the loudness, true peak, and LRA, the first two of which generally don't need to be adjusted.

With batch, it would be a veritable pain in the backside. Python would be an obvious answer, but I don't know the language and one has got to have a Python environment just to run the script. Probably, a C++ console application, using nlohmann's library, would be the easiest and cleanest approach. The source file and target parameters could be passed as command-line arguments.

I know Python and can code in it. I haven't gotten around to running those commands yet, but maybe I can write the script to do what you have in mind? I just don't have an idea of what you're asking for or why at the moment.

alexmorph3us

8th November 2025, 22:35

@Kuler087: I was thinking what is the purpose of Dolby VES Muxer (which is part of DEE tools) after all? Can it achieve the same thing as DoVi Baker? That would be nice.

Kuler087

8th November 2025, 22:42

alexmorph3us

8th November 2025, 23:20

I don't know. I only use DEE for Profile 5 encoding, and I stick to the script that comes with the package.
It’s designed for 12-bit mezzanine files, so I don’t see any reason it would include anything capable of reconstructing the 12-bit master from an encoded HEVC EL-BL.
I understand. Why do you think that the JPEG2000 output from FFmpeg is not compatible with any of these tools? Is there something wrong with the command line in DoVi Scripts? What other similar lossless exports can be used after "baking" the video? I'm really interested in a workflow for us enthusiasts out there :).

Kuler087

8th November 2025, 23:34

I don't know, I tried a bunch of different settings and never got it to work. Feel free to try; the cmd-line I use in the script is exposed.

alexmorph3us

9th November 2025, 00:14

I don't know, I tried a bunch of different settings and never got it to work. Feel free to try; the cmd-line I use in the script is exposed.

I'll look into it.

As a side note, in the next version, don't forget to change "z_Format" from "YUV420P10" to "YUV422P10" in all the workflows where a ProRes re-encode is used as an intermediate. That intermediate already uses the 422hq codec, so it should stay 4:2:2.

Kuler087

9th November 2025, 00:59

already did. (https://drive.google.com/file/d/128gq8aDUTKA_aT7SQsM9dkjA1EP1sosR/view?usp=drive_link)

GeoffreyA

9th November 2025, 11:05

I know Python and can code in it. I haven't gotten around to running those commands yet, but maybe I can write the script to do what you have in mind? I just don't have an idea of what you're asking for or why at the moment.

No worries. Thanks. I was just thinking aloud in response to Z2697's comment. If the commands work and are useful, perhaps we can look at automation, and what would be helpful.

Z2697

9th November 2025, 19:53

Hex is unable to break out of local minima while umh and star can. I would always pick one of umh or star just so the ME doesn't fail completely in corner cases.

It's about trade-off, corner cases add up to roughly 1.5% increase in BD rate, but the difference in speed is 30~50%, so it's one of the first things to be optimized when speed is one of the priorities.

I do not notice a big slowdown between subme 3 and 7. It's less than 1.5 fps slowdown, so why not use 7 in such case? It definitely won't hurt. Regardless, my encodes look stunning and I'm quite happy with the balance between compression and speed. I do not yet encode in 4K as my PC is somewhat older and can't take it (unless I wait days for an encode) so scale down to FHD. On average, with rect & amp & subme 7, I get about 7.5-8 fps speed (sometimes more if input is very clean) and I'm quite happy with that

So 1.5 fps is 18.75~20% out of 7.5-8 fps, and you trade that for less than 1% improvement, that's fine for you, but OP puts speed in no.2 priority, so probably not fine for him.
These options you mentioned are "high input, low output".
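
The 18.75~20% figure is the 1.5 fps loss divided by the reported 7.5-8 fps. A quick sketch of that arithmetic (the helper name is made up, and measuring against the reported speed is this post's choice of baseline):

```python
def relative_slowdown(fps_lost, fps_reported):
    """Speed loss as a percentage of the reported encode speed."""
    return 100.0 * fps_lost / fps_reported


for fps in (8.0, 7.5):
    print(f"{fps} fps: {relative_slowdown(1.5, fps):.2f}% slower")
```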

microchip8

9th November 2025, 20:33

So 1.5 fps is 18.75~20% out of 7.5-8 fps, and you trade that for less than 1% improvement, that's fine for you, but OP puts speed in no.2 priority, so probably not fine for him.
These options you mentioned are "high input, low output".

It is less than 1.5 fps which is an upper guess by me. rect and amp have a much higher penalty than subme 7. And as I said, all looks stunning here. There's more than one way of doing things :)

(high input, MODERATE output, which is also called refinement!)

aSquirrel

10th November 2025, 01:17

It's about trade-off, corner cases add up to roughly 1.5% increase in BD rate, but the difference in speed is 30~50%, so it's one of the first things to be optimized when speed is one of the priorities.

So 1.5 fps is 18.75~20% out of 7.5-8 fps, and you trade that for less than 1% improvement, that's fine for you, but OP puts speed in no.2 priority, so probably not fine for him.
These options you mentioned are "high input, low output".

(I am OP) I actually have a spreadsheet that I put all my tests into. It uses the file size and encode time to compute whether I'm saving money or not, based on the power draw of my system and the price I pay for storage. Ultimately I look at the data and my decision matrix is a bit fuzzy, but it works something like this:

1. Does it look good enough for 'my standard'. If no, discard. If yes, continue.
2. Among the set of results that are 'good enough', what is the spread of cost savings? ($1.60 vs $0.23 is the spread I have atm, not counting some of the AV1 tests that went negative, costing me more money than I saved by transcoding.)
3. Look at the top savings in that set. If I save $1.50 instead of $1.60, do I encode 2x faster? If I save $1.02, do I get settings that work on all media inputs and don't require tuning per title? There's a lot of 'fuzzy' in this step. This is why I have notes in my spreadsheet, SSIM results per encode test, and a way to pixel-peep compare selected frames (by number, chosen as 'hardest to encode' by SSIM) where I can see why the SSIM is low. (Maybe it crushes shadows, but I don't perceive it as that bad. Maybe it crushes shadows and completely erases detail that should be there. This is why the eyeball test is needed.)

Ultimately I make my decision from the sum of the data. It's not as simple as A,B,C. Maybe I suck at defining standards, but I don't have a good way to explain a decision tree this complex in 'useful' language without ballooning it to like 15 paragraphs.
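
The money part of that spreadsheet boils down to storage saved minus electricity spent. A rough sketch of the calculation follows. The function name and all the numbers in the example (storage price, wattage, electricity rate, file sizes) are hypothetical placeholders, not figures from my system.

```python
def net_savings(original_bytes, encoded_bytes, encode_seconds,
                storage_usd_per_tb, system_watts, usd_per_kwh):
    """Storage cost saved by the smaller file minus energy cost of the encode."""
    saved_tb = (original_bytes - encoded_bytes) / 1e12
    storage_saved = saved_tb * storage_usd_per_tb
    # watts * seconds -> watt-hours -> kilowatt-hours
    energy_kwh = system_watts * encode_seconds / 3600 / 1000
    energy_cost = energy_kwh * usd_per_kwh
    return storage_saved - energy_cost


# Hypothetical example: 60 GB source, 10 GB encode, 10 hour encode at 200 W,
# storage at $20/TB, electricity at $0.15/kWh.
print(round(net_savings(60e9, 10e9, 10 * 3600, 20.0, 200.0, 0.15), 2))
```

A negative result is the "went negative" case from step 2: the encode cost more in electricity than the smaller file saves in storage.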