Old 28th October 2020, 19:21   #21  |  Link
hello_hello
Registered User
 
Join Date: Mar 2011
Posts: 4,823
Quote:
Originally Posted by tebasuna51
The ffmpeg 'compand' filter does the same as sox, but faster. Using sox is always slow.

Also, using ffmpeg is always faster than using AviSynth, and the 'compand' filter does (more or less) the same as the SoftClipperFromAudX() filter used in 32-bit MeGUI.

Is it the SoftClipperFromAudX() filter that's incredibly slow?

I tried downmixing a less than 5 minute 7.1ch wave file to stereo with MeGUI while converting to flac, and after indexing it took roughly one minute to complete. I didn't normalize.

To do the same thing with foobar2000 I downmix 7.1ch to 5.1ch with the Matrix Mixer, run that through the Amplify DSP to increase the volume by a couple of dB (for the next DSP), then through fb2k's Advanced Limiter to limit any loud peaks after combining the surround channels, from there it goes through the Amplify DSP again to reduce the volume by 2dB before being downmixed to stereo with the Matrix Mixer DSP.
For the same file, outputting flac once more, that whole process took about 4 seconds.
Old 29th October 2020, 02:01   #22  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,890
It's not only SoftClipperFromAudX(); 32-bit AviSynth itself is slow. Use ffmpeg instead.
__________________
BeHappy, AviSynth audio transcoder.
Old 2nd November 2020, 09:53   #23  |  Link
PatchWorKs
Registered User
 
Join Date: Aug 2002
Location: Italy
Posts: 303
Hi there, can you finally "sum up" all the (proper?) ways to downmix multichannel audio to stereo with FFmpeg?

I've already checked many approaches, but I'm a bit confused now...
- https://superuser.com/questions/8524...o-using-ffmpeg
- http://forum.doom9.org/showthread.php?t=168267

Thanks in advance.
Old 2nd November 2020, 12:55   #24  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,890
Quote:
Originally Posted by PatchWorKs View Post
...can you finally "sum up" all the (proper ?) ways to downmix multichannels to stereo with FFMPEG ?
1) If your input is 7.1, first do a 7.1 -> 5.1 downmix with your chosen compand curve (see the previous posts):
-filter_complex "asplit [f][s]; [f] pan=3.1|c0=c0|c1=c1|c2=c2|c3=c3 [r]; [s] pan=stereo|c0=0.5*c4+0.5*c6|c1=0.5*c5+0.5*c7, compand=attacks=0:decays=0:points=-90/-84|-8/-2|-6/-1|-0/-0.1, aformat=channel_layouts=stereo [d]; [r][d] amerge [a]" -map "[a]"

2) If your audio equipment supports a Dolby Pro Logic decoder (which recovers 5.0 from your 2.0 file), use the DPLII downmix:
-filter_complex "pan=stereo|FL=.3254FL+.2301FC+.2818BL+.1627BR|FR=.3254FR+.2301FC-.1627BL-.2818BR"

3) If your audio equipment is stereo only (a TV, for instance), there is no single proper way, because two speakers cannot supply the same audio volume as five, and there are many options. Choose the one you prefer:

a) The formal approach, which preserves the balance between all the channels (the LFE channel is ignored, following Dolby's recommendation):
-filter_complex "pan=stereo|FL=.3694FL+.2612FC+.3694BL+0.0LFE|FR=.3694FR+.2612FC+.3694BR+0.0LFE, volumedetect"

b) The extreme case, maximizing dialog:
-filter_complex "pan=stereo|FL=.5FL+.5FC+.0BL+0.0LFE|FR=.5FR+.5FC+.0BR+0.0LFE, volumedetect"

c) Any option between a) and b). The coefficients for each channel must sum to 1 to avoid clipping. For instance:
-filter_complex "pan=stereo|FL=.5FL+.4FC+.1BL+0.0LFE|FR=.5FR+.4FC+.1BR+0.0LFE, volumedetect"

I include the volumedetect filter to see whether the mix allows some gain without clipping.
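
As a rough illustration of how the volumedetect result can be used: the filter reports a max_volume figure in ffmpeg's log, and the distance from that figure to 0 dB is the gain still available before clipping. The sketch below only parses such a line from text you have already captured; the helper name `headroom_db` and the sample log line are made up for the example.

```python
import re


def headroom_db(log_text):
    """Return the gain (in dB) still available before clipping.

    Looks for the "max_volume: X dB" line that volumedetect writes to
    the ffmpeg log. The function name is hypothetical.
    """
    match = re.search(r"max_volume:\s*(-?\d+(?:\.\d+)?)\s*dB", log_text)
    if match is None:
        raise ValueError("no max_volume line found in the log text")
    # max_volume is at or below 0 dB; the headroom is its distance to 0.
    return 0.0 - float(match.group(1))


# Illustrative log line, not taken from a real run.
sample_log = "[Parsed_volumedetect_1 @ 0x0000] max_volume: -2.5 dB"
print(f"available gain: {headroom_db(sample_log):.1f} dB")
```

If the reported headroom is, say, 2.5 dB, the mix could be raised by up to that amount in a second pass.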

Mixes like:
FL=FC+0.30*FL+0.30*BL (sum of coefficients 1.6)
FL=0.5*FC+0.707*FL+0.707*BL+0.5*LFE (sum of coefficients 2.414)
are wrong because they can produce clipping.

A mix like this (note the < instead of the =):
FL < 1.0*FL + 0.707*FC + 0.707*BL
is equivalent, through automatic normalization, to:
FL = 0.414*FL + 0.293*FC + 0.293*BL
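
The arithmetic behind that normalization can be sketched in a few lines of Python. This only illustrates the calculation and is not ffmpeg's own code; the helper names are made up, and dividing by the sum of absolute values is an assumption chosen so that mixes with negative coefficients (such as the DPLII one) are covered too.

```python
import math


def normalize_gains(gains):
    """Scale one output channel's gains so their absolute values sum to 1.

    `gains` maps input channel names to linear coefficients, as in one
    "FL < ..." term of a pan filter. Hypothetical helper.
    """
    total = sum(abs(g) for g in gains.values())
    if total == 0:
        raise ValueError("all gains are zero")
    return {ch: g / total for ch, g in gains.items()}


def worst_case_peak_db(gains):
    """Level in dB if every input were at full scale and in phase."""
    return 20 * math.log10(sum(abs(g) for g in gains.values()))


mix = {"FL": 1.0, "FC": 0.707, "BL": 0.707}
for ch, g in normalize_gains(mix).items():
    print(f"{ch}: {g:.3f}")
print(f"worst case before normalizing: {worst_case_peak_db(mix):+.2f} dB")
```

A positive worst-case figure means the unnormalized mix can clip on loud, correlated material; after normalization the worst case is 0 dB.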

Last edited by tebasuna51; 3rd February 2023 at 10:54. Reason: add info
Old 6th November 2020, 10:12   #25  |  Link
PatchWorKs
Registered User
 
Join Date: Aug 2002
Location: Italy
Posts: 303
Quote:
Originally Posted by tebasuna51
A mix like this (note the < instead of the =):
FL < 1.0*FL + 0.707*FC + 0.707*BL
is equivalent, through automatic normalization, to:
FL = 0.414*FL + 0.293*FC + 0.293*BL

Is this an implementation of the "Novel 5.1 Downmix Algorithm with Improved Dialogue Intelligibility" research?

Quote:


Similar to the state-of-the-art downmix methods, only 5 channels are taken into consideration: L, R, C, Ls and Rs. We can represent the downmix operation in the form of the following equation:

lt [n] = l[n] + 0.707 * c[n] + (dlev - 1) * e[n] + 0.5 * ls [n]
rt [n] = r[n] + 0.707 * c[n] + (dlev - 1) * e[n] + 0.5 * rs [n]


where e[n] is the extracted voice signal, dlev represents the dialogue level and all considered signals are represented in the digital domain, in which n denotes the sample index.
Someone @ Hydrogenaudio forums implemented it in this way:
Quote:
ffmpeg -i 6chan-input.wav -af "pan=stereo|FL < 1.0*FL + 0.707*FC + 0.707*BL|FR < 1.0*FR + 0.707*FC + 0.707*BR" -ac copy stereo.wav
...do you think it is correct (and proper)?

Last edited by PatchWorKs; 6th November 2020 at 16:45.
Old 6th November 2020, 21:42   #26  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,890
Quote:
Originally Posted by PatchWorKs View Post
Nope. I don't know of any algorithms that implement "Disparity analysis" or "Voice channel extraction".

Quote:
Someone @ Hydrogenaudio forums implemented it in this way:
...do you think is correct (and proper) ?
It is one of the mixes covered by my option c).
It can work fine for some sources but badly for others (those with quiet dialog).
I prefer a center (FC) coefficient greater than the back (BL/BR) coefficients.
Old 6th November 2020, 23:36   #27  |  Link
richardpl
Registered User
 
Join Date: Jan 2012
Posts: 272
For quiet dialog, use a dynamic range compressor.
Old 7th November 2020, 08:23   #28  |  Link
PatchWorKs
Registered User
 
Join Date: Aug 2002
Location: Italy
Posts: 303
Quote:
Originally Posted by tebasuna51 View Post
Nope. I don't know algoritms to implement "Disparity analysis" or "Voice channel extraction"
It's described in depth on pages 4-6 of the research PDF...

...do you think it's implementable in FFMPEG ?
Old 7th November 2020, 12:05   #29  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,890
AFAIK the suggested mix is (for each stereo channel):

lt [n] = l[n] + 0.707 * c[n] + (dlev - 1) * e[n] + 0.5 * ls [n]

where e[n] is the extra channel provided by the "Voice channel extraction", and the suggested boost for it is 10 dB (dlev = 3.162). That gives:

FL < 1FL + 0.707FC + 2.162E + 0.5BL

and normalized:

FL = 0.229FL + 0.162FC + 0.495E + 0.114BL

I don't know how to extract the E (Voice channel extraction) channel with ffmpeg functions, but it seems that more than half of the volume of each stereo channel would go to voices only. For me that is too much.

We can obtain the same voice volume (not so clear) with:

FL = 0.229FL + 0.657FC + 0.114BL

For me FL=.5FL+.4FC+.1BL is more than enough, and it preserves the stereo effect much better.
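
The numbers in this post can be reproduced with a short Python sketch. It assumes only that a 10 dB boost corresponds to a linear factor of 10^(10/20) and that the gains are normalized by their sum; the variable names are made up, and "E" stands for the extracted voice channel, which ffmpeg does not provide.

```python
def db_to_linear(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


dlev = db_to_linear(10.0)  # suggested 10 dB dialogue boost

# Left-channel gains of the paper's formula; "E" is the extracted voice.
gains = {"FL": 1.0, "FC": 0.707, "E": dlev - 1, "BL": 0.5}
total = sum(gains.values())
normalized = {ch: g / total for ch, g in gains.items()}

# Without a voice-extraction step, fold the E share into FC instead.
folded = {
    "FL": normalized["FL"],
    "FC": normalized["FC"] + normalized["E"],
    "BL": normalized["BL"],
}

print(f"dlev = {dlev:.3f}")
print("with E:   ", {ch: round(g, 3) for ch, g in normalized.items()})
print("E into FC:", {ch: round(g, 3) for ch, g in folded.items()})
```

The folded version gives the center the same total weight but, since FC carries more than voices, it is not equivalent to boosting the extracted dialogue.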
Old 9th November 2020, 12:39   #30  |  Link
PatchWorKs
Registered User
 
Join Date: Aug 2002
Location: Italy
Posts: 303
Quote:
Originally Posted by tebasuna51 View Post
For me FL=.5FL+.4FC+.1BL is more than enough preserving the stereo effect much more.
...you mean your previous "c" option:
Code:
-filter_complex "pan=stereo|FL=.5FL+.4FC+.1BL+0.0LFE|FR=.5FR+.4FC+.1BR+0.0LFE, volumedetect"
So, in the end, adding "<" instead of "=" (as you suggested) should be the best multichannel to stereo downmix by FFMPEG, right ?
Code:
-filter_complex "pan=stereo|FL<.5FL+.4FC+.1BL+0.0LFE|FR<.5FR+.4FC+.1BR+0.0LFE, volumedetect"

Last edited by PatchWorKs; 9th November 2020 at 12:44.
Old 9th November 2020, 13:10   #31  |  Link
richardpl
Registered User
 
Join Date: Jan 2012
Posts: 272
Quote:
Originally Posted by PatchWorKs View Post
...you mean your previous "c" option:
Code:
-filter_complex "pan=stereo|FL=.5FL+.4FC+.1BL+0.0LFE|FR=.5FR+.4FC+.1BR+0.0LFE, volumedetect"
So, in the end, adding "<" instead of "=" (as you suggested) should be the best multichannel to stereo downmix by FFMPEG, right ?
Code:
-filter_complex "pan=stereo|FL<.5FL+.4FC+.1BL+0.0LFE|FR<.5FR+.4FC+.1BR+0.0LFE, volumedetect"
NO.

FFmpeg has many options for downmixing.
Old 9th November 2020, 14:36   #32  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,890
Quote:
Originally Posted by PatchWorKs View Post
So, in the end, adding "<" instead of "=" (as you suggested)
When the sum of the coefficients is 1 (.5 + .4 + .1 = 1), using < or = gives the same mix.
Quote:
should be the best multichannel to stereo downmix by FFMPEG, right ?
I can't say that. It can change with the source, the user's preferences and the playback device.
Old 11th November 2020, 08:34   #33  |  Link
PatchWorKs
Registered User
 
Join Date: Aug 2002
Location: Italy
Posts: 303
Ok, last question: does this formula "correctly" downmix 7.1/atmos to stereo ?

Code:
-filter_complex "asplit [f][s]; [f] pan=3.1|c0=c0|c1=c1|c2=c2|c3=c3 [r]; [s] pan=stereo|c0=0.5*c4+0.5*c6|c1=0.5*c5+0.5*c7, compand=attacks=0:decays=0oints=-90/-84|-10/-4|-6/-2|-0/-0.3, aformat=channel_layouts=stereo [d]; [r][d] amerge, pan=stereo|FL=.3254FL+.2301FC+.2818BL+.1627BR|FR=.3254FR+.2301FC-.1627BL-.2818BR, volumedetect [a]" -map "[a]"
Thanks.

Last edited by PatchWorKs; 11th November 2020 at 08:37.
Old 11th November 2020, 11:55   #34  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,890
Watch out for the emoticon: the forum turned the ":p" in "decays=0:points" into a smiley. It must be:
Code:
-filter_complex "asplit [f][s]; [f] pan=3.1|c0=c0|c1=c1|c2=c2|c3=c3 [r]; [s] pan=stereo|c0=0.5*c4+0.5*c6|c1=0.5*c5+0.5*c7, compand=attacks=0:decays=0:points=-90/-84|-10/-4|-6/-2|-0/-0.3, aformat=channel_layouts=stereo [d]; [r][d] amerge, pan=stereo|FL=.3254FL+.2301FC+.2818BL+.1627BR|FR=.3254FR+.2301FC-.1627BL-.2818BR, volumedetect [a]" -map "[a]"
And this is a Dolby Pro Logic downmix, to be played on audio equipment with a DPLII decoder.

To play on stereo-only audio equipment (a TV, for instance), you can try a different downmix:
Code:
-filter_complex "asplit [f][s]; [f] pan=3.1|c0=c0|c1=c1|c2=c2|c3=c3 [r]; [s] pan=stereo|c0=0.5*c4+0.5*c6|c1=0.5*c5+0.5*c7, compand=attacks=0:decays=0:points=-90/-84|-8/-2|-6/-1|-0/-0.1, aformat=channel_layouts=stereo [d]; [r][d] amerge, pan=stereo|FL=.4FL+.4FC+.2BL|FR=.4FR+.4FC+.2BR, volumedetect [a]" -map "[a]"
Now with the compand curve (for 7.1 -> 5.1) preferred by Richard1485:
points=-90/-84|-8/-2|-6/-1|-0/-0.1
and a plain stereo mix (not DPLII) with reinforced dialog:
pan=stereo|FL=.4FL+.4FC+.2BL|FR=.4FR+.4FC+.2BR

There is no single correct way to do the downmix, because it depends on:
- The playback device (with or without DPLII)
- The kind of source (whether or not the dialog needs reinforcing)
- User preferences (dialog level, compand curve)
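
Since only two parts of the filter graph change between the variants in this post (the compand curve and the final stereo pan), a small Python helper can assemble the string and make the choices explicit. This is just a convenience sketch: the function name and the preset keys are made up, and the filter text itself is copied from the commands above.

```python
# Final 5.1 -> stereo pan stages quoted in this post.
STEREO_PANS = {
    "dplii": (
        "pan=stereo|FL=.3254FL+.2301FC+.2818BL+.1627BR"
        "|FR=.3254FR+.2301FC-.1627BL-.2818BR"
    ),
    "dialog": "pan=stereo|FL=.4FL+.4FC+.2BL|FR=.4FR+.4FC+.2BR",
}


def build_71_to_stereo(compand_points, stereo_pan):
    """Return the -filter_complex value for a 7.1 -> 5.1 -> stereo chain.

    Hypothetical helper; it only concatenates the filter text shown in
    the thread, with the compand curve and the last pan stage swapped in.
    """
    return (
        "asplit [f][s]; "
        "[f] pan=3.1|c0=c0|c1=c1|c2=c2|c3=c3 [r]; "
        "[s] pan=stereo|c0=0.5*c4+0.5*c6|c1=0.5*c5+0.5*c7, "
        f"compand=attacks=0:decays=0:points={compand_points}, "
        "aformat=channel_layouts=stereo [d]; "
        f"[r][d] amerge, {stereo_pan}, volumedetect [a]"
    )


graph = build_71_to_stereo("-90/-84|-8/-2|-6/-1|-0/-0.1", STEREO_PANS["dialog"])
print(f'-filter_complex "{graph}" -map "[a]"')
```

Building the string in code also avoids the forum's smiley substitution corrupting "decays=0:points" when a command is pasted.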
Old 11th November 2020, 16:22   #35  |  Link
SeeMoreDigital
Life's clearer in 4K UHD
 
Join Date: Jun 2003
Location: Notts, UK
Posts: 12,219
Quote:
Originally Posted by tebasuna51 View Post
And this is a Dolby Prologic downmix to be played with audio equipment with DplII decoder...
I have a few Dolby Surround (Pro Logic) encoded CDs. It would be interesting if such CDs could be reverse encoded to create native surround-sound channels.
__________________
| I've been testing hardware media playback devices and software A/V encoders and decoders since 2001 | My Network Layout & A/V Gear |
Old 11th November 2020, 23:41   #36  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,890
The best option I know of is the foobar2000 plugin FreeSurround, configured as in the attached image.

Maybe the PowerDVD compatibility mode (an inverted surround channel) is better; that depends on the method used to build the DPL encode, which can't be known.

There is an ffmpeg filter for this, but for me it gives worse results. Maybe I haven't found the right parameters; this is my best attempt:

-af "surround=lfe_out=0:level_out=2"
Attached Images
 
Old 2nd February 2022, 03:52   #37  |  Link
damian101
Registered User
 
Join Date: Feb 2021
Location: Germany
Posts: 17
I use this one for all downmixing to stereo:
Code:
-af 'lowpass=c=LFE:f=120,pan=stereo|FL=.3FL+.21FC+.3FLC+.3SL+.3BL+.21BC+.21LFE|FR=.3FR+.21FC+.3FRC+.3SR+.3BR+.21BC+.21LFE'
Works for various channel configurations, including standard 7.1, 6.1, 5.1.

I usually normalize as well, by doing a first run with ebur128 at the end of the filter chain, and then setting a dB offset with volume filter in the actual encode, to reach -23 LUFS.
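
The second step of that two-pass approach is simple arithmetic: the offset is the target loudness minus the measured integrated loudness. A minimal Python sketch, assuming you have already read the integrated loudness (in LUFS) from the ebur128 summary of the first run; the function name and the measured values are made up.

```python
def volume_filter_for_target(measured_lufs, target_lufs=-23.0):
    """Return a volume filter string that moves the measured loudness
    to the target. Hypothetical helper."""
    offset_db = target_lufs - measured_lufs
    return f"volume={offset_db:.1f}dB"


# Illustrative measurements, not real results.
print(volume_filter_for_target(-18.4))  # louder than target: negative offset
print(volume_filter_for_target(-27.5))  # quieter than target: positive offset
```

The resulting string would be appended to the filter chain of the actual encode. A positive offset can push peaks over full scale, so it is worth checking the peak level as well.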
Old 2nd February 2022, 11:55   #38  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,890
Quote:
Originally Posted by damian101 View Post
I use this one for all downmixing to stereo:
As I already said, each user is free to experiment and choose their preferred method.

But I see some problems:

- Dolby recommends not using the LFE channel in the downmix (and it is worse when filtered), because it can cancel or distort low frequencies present in the other channels.

- The volume of the FC channel (.21) is very low compared with the rest of the channels (1.2), so the dialog can become inaudible.

- I can't recommend giving the surround channels (.6 in total) double the volume of the front channels (.3).

Maybe...
Code:
-af 'pan=stereo|FL=.4FL+.3FC+.15FLC+.15SL+.15BL+.15BC|FR=.4FR+.3FC+.15FRC+.15SR+.15BR+.15BC'
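
Because these "universal" formulas name channels that not every input has, the effective sum of coefficients depends on the input layout. The Python sketch below compares the two mixes above for a few layouts. It assumes that channels missing from the input simply contribute nothing, which is what "works for various channel configurations" implies; the names in the code are made up.

```python
# Channels present in a few common input layouts.
LAYOUTS = {
    "5.1": {"FL", "FR", "FC", "LFE", "BL", "BR"},
    "6.1": {"FL", "FR", "FC", "LFE", "BC", "SL", "SR"},
    "7.1": {"FL", "FR", "FC", "LFE", "BL", "BR", "SL", "SR"},
}

# Left-output coefficients of the two mixes discussed above.
MIX_DAMIAN101 = {"FL": .3, "FC": .21, "FLC": .3, "SL": .3, "BL": .3,
                 "BC": .21, "LFE": .21}
MIX_TEBASUNA51 = {"FL": .4, "FC": .3, "FLC": .15, "SL": .15, "BL": .15,
                  "BC": .15}


def active_gain_sum(mix, layout):
    """Sum the gains of the channels that exist in the given layout."""
    present = LAYOUTS[layout]
    return sum(gain for ch, gain in mix.items() if ch in present)


for name, mix in (("damian101", MIX_DAMIAN101), ("tebasuna51", MIX_TEBASUNA51)):
    for layout in LAYOUTS:
        print(f"{name:11s} {layout}: {active_gain_sum(mix, layout):.2f}")
```

A sum above 1 means the mix can clip on that layout; a sum below 1 means it leaves some headroom unused.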

Last edited by tebasuna51; 2nd February 2022 at 12:02. Reason: add info
Old 27th September 2022, 18:23   #39  |  Link
PatchWorKs
Registered User
 
Join Date: Aug 2002
Location: Italy
Posts: 303
Sorry to revive this old thread, but the bet is still kicking...

...just found this interesting reply - dated Aug 25 - from PierU @ Super User:
Quote:

Old question but still interesting to me...

First, I never encountered a global volume issue. After reading the answer from @Franz-Michael Fisher I made a few tests, starting from a file with a DTS 5.1 track and transcoding it to pcm_s16le, pcm_f32le, and aac, all of them with the -ac 2 option. When playing the files with VLC and headphones, all of them sound the same as the original, except the pcm_s16le one that sounds quieter. Since I always use aac, the global volume is apparently not an issue.

Second, I sometimes face the problem of too low perceived dialogs compared to the music/sounds. So it's indeed tempting to downmix with alternate formulas that give more weight to the central channel FC, and I did that for a while... However, it turns out that FC does not contain only voices but also a large part of the music and sounds: as a consequence, overweighting FC is also narrowing the stereo image, which is not desirable...

I kept wondering why the dialogs are sometimes perceived too low after downmixing, and I have a possible explanation: the brain is very good at isolating a voice buried in the ambient noise according to the direction it comes from. That's why people with hearing aids still have difficulties to follow a conversation when multiple people speak at the same time: the hearing aids can restore the volume, but the directivity is lost... So, with a real 5.1 or 7.1 setup the brain is not bothered by the side/rear channels when it comes to focus on the dialog, because they come from fully different directions. After downmix this is not the same: what was coming from the side/rear channels is now coming from the front, making the separation task more difficult for the brain. The solution is hence to downweight the side/rear channels: instead of the ATSC formula

Code:
-af "pan=stereo|FL < 1.0*FL + 0.707*FC + 0.707*BL|FR < 1.0*FR + 0.707*FC + 0.707*BR"
I am now using

Code:
-af "pan=stereo|FL < 1.0*FL + 0.707*FC + 0.4*BL|FR < 1.0*FR + 0.707*FC + 0.4*BR"
...is this a correct argument?
__________________
HYbrid Multimedia Production Suite project @ GitHub
Old 28th September 2022, 09:45   #40  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,890
Quote:
Originally Posted by PatchWorKs View Post
...it is a correct argue ?
Yes, it seems correct to me.

But remember, this thread is about 7.1 downmixing, while this formula is for a 5.1 downmix, for which the canonical downmix is:
pan=stereo|FL=.37FL+.26FC+.37BL|FR=.37FR+.26FC+.37BR
with very low FC presence.

And the suggested:

pan=stereo|FL < 1.0*FL + 0.707*FC + 0.707*BL|FR < 1.0*FR + 0.707*FC + 0.707*BR
is the same as (with = instead of <, and rounding):
pan=stereo|FL=.4FL+.3FC+.3BL|FR=.4FR+.3FC+.3BR

I myself suggested (five posts earlier) a mix with more FC presence:
pan=stereo|FL=.4FL+.4FC+.2BL|FR=.4FR+.4FC+.2BR

The other one:
pan=stereo|FL < 1.0*FL + 0.707*FC + 0.4*BL|FR < 1.0*FR + 0.707*FC + 0.4*BR
is the same as:
pan=stereo|FL=.47FL+.34FC+.19BL|FR=.47FR+.34FC+.19BR
and has stronger fronts; I don't know whether it is better.
It is a matter of taste.
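
One way to compare these mixes beyond taste is to express how loud the center is relative to the surround in each formula, in dB. The ratio does not change with normalization, so the raw coefficients can be used. A minimal Python sketch; the labels are made up for the example.

```python
import math

# (FC coefficient, BL coefficient) for each mix quoted in this post.
MIXES = {
    "canonical":  (0.26, 0.37),
    "ATSC":       (0.707, 0.707),
    "PierU":      (0.707, 0.4),
    "tebasuna51": (0.4, 0.2),
}


def center_vs_surround_db(fc_gain, surround_gain):
    """Level of the center relative to the surround channel, in dB."""
    return 20 * math.log10(fc_gain / surround_gain)


for label, (fc, bl) in MIXES.items():
    print(f"{label:11s} FC vs BL: {center_vs_surround_db(fc, bl):+.2f} dB")
```

Positive values mean the center sits above the surround in the stereo result, which is the direction both PierU's and my own suggestion take.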

Last edited by tebasuna51; 28th September 2022 at 10:11.