Doom9's Forum
3rd October 2024, 12:34 | #1 | Link |
Registered User
Join Date: Jun 2024
Location: South Africa
Posts: 145
|
Downmixing multi-channel tracks to stereo and normalising with FFmpeg's loudnorm
Back in the DVD days, I used Azid to apply DRC, downmix, and normalise audio. My downmixes of those 5.1 tracks were always softer than the stereo tracks included on many DVDs. Coming back to encoding years later, the tools have changed but the problems haven't: dialogue is still too soft.
FFmpeg's loudnorm filter does a good job of evening out the volume and seems to be better than dynaudnorm. However, one has to choose the targets: loudness and true peak are easy enough (-23 and -1 or -2), but the range, or LRA, is up to the user's taste. Netflix recommends an LRA between 4 and 18, and I find that 18 gives good results, with audible dialogue.
My question is: what LRA values are others using that give good results with downmixed film material? Also, is it better to run loudnorm before downmixing or after?
Here is my batch file. The first, commented-out FFmpeg line is the measuring pass. After it runs, I set the variables to the measured values and run the second pass.
Code:
set out_i=-23
set out_tp=-2
set out_lra=12
set in_i=-16.8
set in_tp=6.7
set in_lra=23.2
set in_thresh=-29.4
set tg_offset=1.6
::ffmpeg -i %1 -map 0:a:0 -af aresample=ochl=stereo,loudnorm=i=%out_i%:tp=%out_tp%:lra=%out_lra%:print_format=summary -f null -
ffmpeg -i %1 -map 0:a:0 -af aresample=ochl=stereo:osr=192000:resampler=soxr:precision=33,loudnorm=i=%out_i%:tp=%out_tp%:lra=%out_lra%:measured_i=%in_i%:measured_tp=%in_tp%:measured_lra=%in_lra%:measured_thresh=%in_thresh%:offset=%tg_offset%:linear=true:print_format=summary,aresample=48000:resampler=soxr:precision=33 -c:a pcm_f32le -f wav - | "%qaac%" --tvbr 91 --ignorelength --no-delay --verbose - -o "out\2.0-loudnorm.m4a"
Last edited by GeoffreyA; 3rd October 2024 at 12:42. |
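Copying the measured values into the batch file by hand is the error-prone step of the two-pass workflow above. As a rough sketch only, and not something the poster used, the hand-off could be scripted if the measuring pass is run with print_format=json instead of summary. The function names and the sample text are hypothetical; the sample is hand-written to mirror the values in the batch file, not captured from a real run.

```python
import json

def extract_loudnorm_stats(ffmpeg_stderr: str) -> dict:
    """Pull out the JSON object that loudnorm prints at the end of a measuring pass.

    Assumes the loudnorm JSON is the last {...} block in the text and has no nested braces.
    """
    start = ffmpeg_stderr.rindex("{")
    end = ffmpeg_stderr.rindex("}") + 1
    return json.loads(ffmpeg_stderr[start:end])

def second_pass_filter(stats: dict, i=-23, tp=-2, lra=12) -> str:
    """Build the loudnorm filter string for the second pass from the measured values."""
    return (
        f"loudnorm=i={i}:tp={tp}:lra={lra}"
        f":measured_i={stats['input_i']}"
        f":measured_tp={stats['input_tp']}"
        f":measured_lra={stats['input_lra']}"
        f":measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}"
        ":linear=true:print_format=summary"
    )

# Hand-written sample (assumption): only the fields the second pass needs are shown;
# a real measuring pass prints further output_* fields as well.
sample_stderr = """
[Parsed_loudnorm_1 @ 0000000000000000]
{
    "input_i" : "-16.80",
    "input_tp" : "6.70",
    "input_lra" : "23.20",
    "input_thresh" : "-29.40",
    "target_offset" : "1.60"
}
"""

stats = extract_loudnorm_stats(sample_stderr)
print(second_pass_filter(stats))
```

The printed string can replace the loudnorm part of the second ffmpeg line. One thing worth checking in the second pass's summary is the reported normalization type: as far as I understand the filter, linear=true is only honoured when the measured LRA does not exceed the target LRA, so with a measured 23.2 and a target of 12 it would likely fall back to dynamic mode.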
4th October 2024, 10:39 | #2 | Link | |
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 7,127
|
Quote:
If the Center channel (which carries most of the dialogue) is quiet compared with the other channels, it makes no difference whether you apply loudnorm before or after the downmix. I recommend using a downmix with a higher Center contribution. If the standard ffmpeg downmix is:
pan=stereo|FL=.4142c0+.2929c2+.2929c4|FR=.4142c1+.2929c2+.2929c5
I recommend increasing the Center coefficient and decreasing the surround contribution, which is not important at all to the stereo output:
pan=stereo|FL=.4142c0+.3929c2+.1929c4|FR=.4142c1+.3929c2+.1929c5
After that downmix you can use loudnorm to your taste.
__________________
BeHappy, AviSynth audio transcoder. |
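The coefficient change in the post above moves 0.1 of gain from the surrounds to the centre, so each output channel's coefficients still sum to 1. A minimal sketch of that bookkeeping follows; the helper names are hypothetical, and the channel indices simply follow the post (c0/c1 fronts, c2 centre, c4/c5 surrounds, with c3, the LFE, left out).

```python
def boost_centre(front, centre, surround, delta):
    """Move `delta` of gain from the surround coefficient to the centre coefficient."""
    if delta > surround:
        raise ValueError("delta is larger than the surround coefficient")
    return front, round(centre + delta, 4), round(surround - delta, 4)

def pan_stereo(front, centre, surround):
    """Format a 5.1-to-stereo pan filter string in the style used in the thread."""
    def term(value, channel):
        # 0.4142 -> ".4142c0", matching the notation in the post
        return f"{value:.4f}".lstrip("0") + f"c{channel}"
    left = "+".join([term(front, 0), term(centre, 2), term(surround, 4)])
    right = "+".join([term(front, 1), term(centre, 2), term(surround, 5)])
    return f"pan=stereo|FL={left}|FR={right}"

coeffs = boost_centre(0.4142, 0.2929, 0.2929, 0.1)
print(pan_stereo(*coeffs))
```

Trying a different balance then only means changing `delta`; the sum per output channel stays where it was.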
|
4th October 2024, 14:01 | #3 | Link |
Registered User
Join Date: Jun 2024
Location: South Africa
Posts: 145
|
Thanks, tebasuna! I think you've solved my problem and cleared up where the issue actually was.
This morning, incidentally, I tried my old set of PC speakers, which had good sound, and discovered that the loudnorm versions were not that grand after all. (I hadn't tested them on the TV yet.) I took your advice, raising the centre coefficient, and the dialogue came out loud and clear. This eliminates the need for loudnorm, which I was not too happy with because of the resampling, further processing, and potential damage.
As I am working with 32-bit float output, FFmpeg's default coefficients were the 1, 0.707107 set, so I experimented with 0.75-0.9 for the centre, with a corresponding drop in the surround channels. On a few films (Inception, Fellowship of the Ring, and Event Horizon), it works like a charm. At the other end, qaac is normalising to 0 dB before encoding.
If I may ask, is it always necessary to lower the surround channels when raising the centre? (EDIT: Through experimenting with FFmpeg, I see that the coefficients for each output channel must add up to 1. I didn't realise that before.)
Last edited by GeoffreyA; 4th October 2024 at 14:27. |
5th October 2024, 01:41 | #4 | Link | |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 3,118
|
Quote:
To answer your question about loudnorm: I know it's no longer required as you solved your problem, but in case someone reading this needs it, I found my sweet spot at LRA 12 and True Peak -2 (the integrated loudness target can be whatever you need; most countries use -23, Italy uses -24). Last edited by FranceBB; 5th October 2024 at 01:43. |
|
5th October 2024, 04:23 | #5 | Link |
Registered User
Join Date: Apr 2006
Posts: 153
|
The programme is likely normalized so that the stereo downmix doesn't clip, or it has generous headroom. But you can't necessarily avoid clipping by decreasing another channel: the channels likely reach their maximum amplitude at different moments. I'm not familiar with any quirks in ffmpeg, but usually I'd tweak one knob at a time (here, the center channel), then scale all output channels by the same factor if needed.
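The point about peaks falling at different moments can be illustrated with toy numbers. This is only a sketch: the three short "channels" below are invented sample lists that each hit full scale at a different position, and the helper names are hypothetical.

```python
def peak(samples):
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)

def mix(channels, gains):
    """Weighted sum of equally long channels, sample by sample."""
    return [sum(g * ch[n] for g, ch in zip(gains, channels))
            for n in range(len(channels[0]))]

def common_gain(mixed_peak, ceiling=1.0):
    """One factor for every output channel so the peak fits under `ceiling`."""
    return min(1.0, ceiling / mixed_peak)

# Invented toy signals: each reaches full scale, but at a different sample.
front    = [1.0,  0.2, -0.1,  0.0]
centre   = [0.1, -1.0,  0.3,  0.2]
surround = [0.0,  0.3,  0.1, -1.0]
channels = [front, centre, surround]

gains = (0.4142, 0.3929, 0.1929)
mixed = mix(channels, gains)

# Bound that assumes every channel peaks at the same instant with the same sign.
worst_case = sum(g * peak(ch) for g, ch in zip(gains, channels))

print(f"worst case {worst_case:.4f}, actual peak {peak(mixed):.4f}")
print(f"gain needed for a mix peaking at 1.25: {common_gain(1.25):.2f}")
```

Here the coefficient sum only gives the worst-case bound; the actual peak of the toy mix is far lower because the maxima never coincide. That is why measuring the mixed result and then applying one common gain is a reasonable order of operations.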
|
5th October 2024, 09:06 | #6 | Link |
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 7,127
|
Those defaults are the values before normalization: 1, 0.7071
where 0.7071 = (2^0.5)/2, which restores the original Center volume in the phantom center created by that contribution to both FL and FR. That works fine when downmixing 3 channels to 2. But when there are other channels to downmix, the default is (note the < instead of the =):
pan=stereo|FL<c0+.7071c2+.7071c4|FR<c1+.7071c2+.7071c5
which is the same (after automatic normalization) as:
pan=stereo|FL=.4142c0+.2929c2+.2929c4|FR=.4142c1+.2929c2+.2929c5
Now the Center channel is not recovered at the same volume as the original, and sometimes it needs to be raised.
__________________
BeHappy, AviSynth audio transcoder. Last edited by tebasuna51; 5th October 2024 at 09:11. |
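The arithmetic behind the two pan strings in the post above can be checked in a few lines. This is a sketch under my own assumptions: it models the `<` form as dividing each gain by the sum of the gains so they total 1, and it treats the phantom centre as a power sum over FL and FR.

```python
import math

def normalise(weights):
    """Scale the weights so that they sum to 1 (my model of pan's `<` form)."""
    total = sum(abs(w) for w in weights)
    return [w / total for w in weights]

c = math.sqrt(2) / 2                      # 0.7071, the default centre/surround weight
raw = [1.0, c, c]                         # front, centre, surround before normalisation
norm = normalise(raw)
print([round(w, 4) for w in norm])        # [0.4142, 0.2929, 0.2929]

# 3-to-2 case: 0.7071 into both FL and FR keeps the centre's total power at 1.
phantom_power = c ** 2 + c ** 2

# 5.1-to-2 case: after normalisation every channel, centre included, is scaled
# by the same factor, so the centre's absolute level drops by this many dB.
drop_db = 20 * math.log10(norm[0] / raw[0])
print(f"phantom centre power {phantom_power:.4f}, level change {drop_db:.2f} dB")
```

On this reading, the balance between centre and fronts is unchanged by normalisation; what changes is the absolute level, which comes out roughly 7.7 dB lower than in the unnormalised 1, 0.7071 set.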
7th October 2024, 16:44 | #7 | Link |
Registered User
Join Date: Jun 2024
Location: South Africa
Posts: 145
|
Thanks to everyone for their advice and thoughts. Testing this over the weekend, it seems that matters are not as simple as I thought the other day. I might have made a mistake, leaving out qaac's normalising. Also, in light of tebasuna's explanation about the phantom centre, I'm now hesitant to raise the centre coefficient too much. I'll keep on experimenting and report back when things are clearer.
Would you say that you are targeting a final LRA of 12, or putting that value into the loudnorm filter? I find that, to hit a measured LRA of 18, I have to enter a value of around 12, funnily enough. |
30th November 2024, 00:24 | #9 | Link |
Registered User
Join Date: Mar 2006
Posts: 1,050
|
I recommend the sofalizer filter even if you don't intend to listen on headphones; it still seems to do the downmixing better (smarter). This is my personal impression.
Of course, apply loudness normalization to the stereo result, but I would also consider loudness normalization on the center channel (assuming the dialogue is routed there). Tebasuna's idea is OK if you decide not to use the sofalizer filter; in that case I would apply loudness normalization to the center and, after everything else, to the stereo. |
1st December 2024, 12:12 | #10 | Link | |
Registered User
Join Date: Jun 2024
Location: South Africa
Posts: 145
|
Quote:
What I ended up settling on was default downmixing, then normalising with loudnorm. I found that using an LRA of around 12 or 13, instead of 18, brings the dialogue to the forefront; it sounds clear, loud, and how it should be, reminiscent of the cinema. With higher LRAs, the dialogue is audible but not loud enough, and the volume has to be adjusted manually throughout the movie. Using loudnorm is simpler and more general, whereas raising the centre coefficient before downmixing seems to depend on the film, and that means more work. You mention performing loudness normalisation on the centre channel before downmixing. That is an interesting idea. It would take more work but give better results, aligning with the EBU's recommendation of having a dialogue LRA (I think 5) and a programme LRA, and having a ratio between these two. |
|