9th April 2018, 18:59 | #21 | Link | ||||
Registered User
Join Date: Mar 2011
Posts: 4,829
|
Quote:
-report -loglevel debug -i - -ignore_length true -ac 2 -c:a libmp3lame -q:a 2 -id3v2_version 3 %d
The same source file (5.1ch AC3) was used each time, but for each encode I changed the maximum bit depth in the fb2k encoder configuration.
32 bit float: Quote:
Quote:
Quote:
Last edited by hello_hello; 9th April 2018 at 19:32. |
||||
9th April 2018, 19:20 | #22 | Link | |||
Registered User
Join Date: Mar 2011
Posts: 4,829
|
To make it even more interesting, ffmpeg changes downmix coefficients according to the output format.
-report -loglevel debug -i - -ignore_length true -ac 2 -c:a aac %d
32 bit float: Quote:
Quote:
Quote:
Last edited by hello_hello; 9th April 2018 at 19:35. |
|||
13th April 2018, 03:53 | #23 | Link |
Registered User
Join Date: Nov 2015
Posts: 81
|
Sadly, I'm still getting the same result. I checked the ffmpeg report and this is how it mixes (you know what I mean) the channels:
Code:
[auto_resampler_0 @ 000002e30d6f3d80] [SWR @ 000002e30d6f3e80] FL: FL:1.000000 FR:0.000000 FC:0.707107 LFE:0.000000 BL:0.707107 BR:0.000000
[auto_resampler_0 @ 000002e30d6f3d80] [SWR @ 000002e30d6f3e80] FR: FL:0.000000 FR:1.000000 FC:0.707107 LFE:0.000000 BL:0.000000 BR:0.707107
What about this "Extra" tab? Can you add something there to make MeGUI use a different matrix when downmixing the channels?
Last edited by doomleox999; 13th April 2018 at 07:50. |
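To make the log lines above concrete, here is a minimal Python sketch (not ffmpeg's actual code) that applies the printed matrix to a single 6-channel sample frame. The function name `downmix_frame` and the example frame are hypothetical; only the coefficients come from the log.

```python
# Downmix matrix as printed in the ffmpeg log above (one row per output channel).
# Input channel order follows the log: FL, FR, FC, LFE, BL, BR.
MATRIX = {
    "FL": [1.0, 0.0, 0.707107, 0.0, 0.707107, 0.0],
    "FR": [0.0, 1.0, 0.707107, 0.0, 0.0, 0.707107],
}

def downmix_frame(frame):
    """Mix one 6-channel sample frame (floats, full scale = 1.0) down to (L, R)."""
    left = sum(c * s for c, s in zip(MATRIX["FL"], frame))
    right = sum(c * s for c, s in zip(MATRIX["FR"], frame))
    return left, right

# Hypothetical frame: every channel at half of full scale.
frame = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
left, right = downmix_frame(frame)
print(round(left, 6), round(right, 6))
```

With this matrix a frame at only half of full scale already mixes to about 1.207, which is above full scale. That is why the rest of the thread keeps returning to clipping and Normalize().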
13th April 2018, 09:19 | #24 | Link | |
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 6,915
|
Of course, replace this part of my script:
Code:
fl_sl = MixAudio(fl, sl, 0.414214, 0.292893)
fr_sr = MixAudio(fr, sr, 0.414214, 0.292893)
l = MixAudio(fl_sl, fc, 1.0, 0.292893)
r = MixAudio(fr_sr, fc, 1.0, 0.292893)
MergeChannels(l, r)
Normalize()
with:
Code:
fl_sl = MixAudio(fl, sl, 1.0, 0.707107)
fr_sr = MixAudio(fr, sr, 1.0, 0.707107)
l = MixAudio(fl_sl, fc, 1.0, 0.707107)
r = MixAudio(fr_sr, fc, 1.0, 0.707107)
MergeChannels(l, r)
Now you can get clipped samples, with values anywhere between 1.0 and 2.4, like in my image in https://forum.doom9.org/showthread.p...46#post1838646
Quote:
Nothing useful can be added here.
__________________
BeHappy, AviSynth audio transcoder. Last edited by tebasuna51; 13th April 2018 at 09:22. |
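The "between 1.0 and 2.4" figure and the relation between the two scripts can be checked with a few lines of arithmetic. This is a sketch in Python, not AviSynth; the variable names are made up for illustration, and the coefficients are the ones from the post above.

```python
# Un-normalized coefficients from the second script in the post above.
front, surround, center = 1.0, 0.707107, 0.707107

# Worst case: all three contributing channels at full scale with the same sign.
worst_case_peak = front + surround + center

# Dividing every coefficient by the worst case gives a set that cannot exceed
# full scale, at the cost of a quieter downmix.
scaled = [c / worst_case_peak for c in (front, surround, center)]

print(round(worst_case_peak, 6))
print([round(c, 6) for c in scaled])
```

The scaled values come out at roughly 0.414214 and 0.292893, the coefficients used in the first script, so the two scripts describe the same mix at different overall levels.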
|
19th April 2018, 08:41 | #25 | Link |
Registered User
Join Date: Nov 2015
Posts: 81
|
I still don't get the same result I get when encoding with ffmpeg (2ch sounds the same as original 6ch) so maybe I should give you more info about this whole situation.
The original file is an mkv with video, audio and some subtitles. This is what MediaInfo says about the audio:
Code:
Audio
ID : 2
Format : AAC
Format/Info : Advanced Audio Codec
Format profile : HE-AAC / LC
Format settings : Explicit
Codec ID : A_AAC-2
Duration : 1 h 26 min
Bit rate : 141 kb/s
Channel(s) : 6 channels
Channel positions : Front: L C R, Side: L R, LFE
Sampling rate : 48.0 kHz / 24.0 kHz
Frame rate : 23.438 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 87.6 MiB (7%)
Is it OK to encode the mkv directly to aac or m4a, or is it better to extract the audio and encode that file? I ask because something weird happens when I extract the audio: if I open it, MPC-HC tells me its length is 1 h 22 min, but Windows Explorer says 3 h 46 min. I use this script:
Code:
AddAutoloadDir("C:\Users\LTX\BACKUP\MeGUI\tools\avs\plugins")
LoadPlugin("C:\Users\LTX\BACKUP\MeGUI\tools\lsmash\LSMASHSource.dll")
a=LWLibavAudioSource("C:\Users\LTX\Desktop\EOE_6CH.mkv")
fl = GetChannel(a, 1)
fr = GetChannel(a, 2)
fc = GetChannel(a, 3)
# lf = GetChannel(a, 4)
sl = GetChannel(a, 5)
sr = GetChannel(a, 6)
fl_sl = MixAudio(fl, sl, 1.0, 0.707107)
fr_sr = MixAudio(fr, sr, 1.0, 0.707107)
l = MixAudio(fl_sl, fc, 1.0, 0.707107)
r = MixAudio(fr_sr, fc, 1.0, 0.707107)
MergeChannels(l, r)
And those are the options I want. Another weird thing is that MeGUI takes about 50 min to encode the audio, while ffmpeg is super fast and gives me a better result. |
19th April 2018, 09:24 | #26 | Link |
Registered User
Join Date: May 2016
Posts: 197
|
I somehow forgot about this thread, but now that I see hello_hello's results I see a system behind them: if ffmpeg downmixes to 32-bit float, it uses one set of coefficients; if it downmixes to integer PCM, it uses the other set. The most direct way to test this is to skip the aac and libmp3lame encoders and use -c:a pcm_s16le, pcm_s24le or pcm_f32le instead. If one does so, the results confirm the preceding statement. The aac encoder probably accepts only float input (or prefers it), so the auto_resampler converts everything to float, whereas libmp3lame seems to accept everything, so downmixing in that case changes only the number of channels, not the sample format.
But does anybody have a definitive answer as to why ffmpeg uses different coefficients for float and non-float? In both cases the ratios coincide: 1/0.707107 and 0.414214/0.292893 are both very good approximations of 2^(0.5), which is probably the value they are supposed to be. It is probably because one doesn't need to care about clipping as much with float output. Last edited by mkver; 19th April 2018 at 09:30. |
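The ratio claim is easy to verify numerically. The Python sketch below only checks the arithmetic; which coefficient set goes with which sample format is taken from the reports in this thread, not from ffmpeg's source, and the variable names are illustrative.

```python
import math

float_set = (1.0, 0.707107)       # front, centre/surround: reported for float output
integer_set = (0.414214, 0.292893)  # same pair: reported for integer PCM output

ratio_float = float_set[0] / float_set[1]
ratio_integer = integer_set[0] / integer_set[1]

# Level difference between the two sets, measured on the front coefficient.
offset_db = 20 * math.log10(integer_set[0] / float_set[0])

print(ratio_float, ratio_integer, math.sqrt(2))
print(round(offset_db, 2))
```

Both ratios agree with the square root of 2 to about five decimal places, and the integer set sits roughly 7.7 dB below the float set. That is consistent with the two sets being the same mix, with one scaled down so that it cannot clip.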
19th April 2018, 10:39 | #27 | Link | ||||||
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 6,915
|
Quote:
Can 2 speakers supply the same volume as 6 speakers? Do you listen to the 6 ch audio on a 6-speaker system? Quote:
Quote:
Quote:
Quote:
Then you can select the coefficients with ffmpeg and with AviSynth, and obtain the same result. Quote:
You need Normalize() or you have to accept distorted audio. Maybe it is only a few points and you don't care in this sample, but we can't always expect that.
__________________
BeHappy, AviSynth audio transcoder. Last edited by tebasuna51; 19th April 2018 at 10:43. |
||||||
20th April 2018, 04:57 | #28 | Link | |||
Registered User
Join Date: Nov 2015
Posts: 81
|
Quote:
This is the MeGUI log using the script from my previous reply: (I had to edit it so I could post it, I removed every "[NoImage]") Quote:
Quote:
Last edited by tebasuna51; 20th April 2018 at 12:53. Reason: code -> quote |
|||
20th April 2018, 13:17 | #29 | Link | ||
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 6,915
|
At some point in your audio chain there is a downmix to send the audio to your headphones, which have only 2 speakers.
You are comparing one downmix method with another, not the original with the downmix. Quote:
Quote:
And inside EOE_6CH.avs there is already the desired downmix to 2.0, and also a Normalize(), which is now duplicated. See my image in https://forum.doom9.org/showthread.p...18#post1839018 When the input is an .avs with the desired downmix you can select "Keep Original channels", and if the Normalize() is included in the .avs, uncheck "Normalize peaks" or you waste time with 2 Normalize() functions.
__________________
BeHappy, AviSynth audio transcoder. |
||
22nd April 2018, 01:20 | #30 | Link | |
Registered User
Join Date: Nov 2015
Posts: 81
|
So, I used this script:
Code:
ClearAutoloadDirs()
AddAutoloadDir("C:\Users\LTX\BACKUP\MeGUI\tools\avs\plugins")
LoadPlugin("C:\Users\LTX\BACKUP\MeGUI\tools\lsmash\LSMASHSource.dll")
a=LWLibavAudioSource("C:\Users\LTX\Desktop\EOE_6CH.mkv")
fl = GetChannel(a, 1)
fr = GetChannel(a, 2)
fc = GetChannel(a, 3)
# lf = GetChannel(a, 4)
sl = GetChannel(a, 5)
sr = GetChannel(a, 6)
fl_sl = MixAudio(fl, sl, 1.0, 0.707107)
fr_sr = MixAudio(fr, sr, 1.0, 0.707107)
l = MixAudio(fl_sl, fc, 1.0, 0.707107)
r = MixAudio(fr_sr, fc, 1.0, 0.707107)
MergeChannels(l, r)
Normalize()
return last
And I'm still getting the same result. Here's the log: Quote:
6 CH ORIGINAL MKV.aac
2 CH FFMPEG.aac
2 CH MEGUI.aac
I hope this is enough for you to notice the difference between the 2 encodes. If it's not, let me know and I'll upload more. Last edited by tebasuna51; 24th April 2018 at 21:28. |
|
22nd April 2018, 12:58 | #31 | Link |
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 6,915
|
The audio volume is proportional to the RMS value of the audio signals. Your 6 CH original sample:
Code:
6 CH ORIGINAL MKV.aac
---------------------
RMS power ch0:  3.56% (-28.98 dB)
RMS power ch1:  2.27% (-32.87 dB)
RMS power ch2:  1.80% (-34.91 dB)
RMS power ch3:  3.15% (-30.04 dB)
RMS power ch4:  2.44% (-32.26 dB)
RMS power ch5:  2.43% (-32.28 dB)
               ------
               15.65
Code:
             2 CH FFMPEG.aac       2 CH MEGUI.aac
             -------------------   -------------------
RMS power L:   4.45% (-27.03 dB)     3.19% (-29.91 dB)
RMS power R:   3.58% (-28.92 dB)     2.53% (-31.92 dB)
               -----                 -----
               8.03                  5.72
Min value L: -40.22% (-7.91 dB)    -28.36% (-10.95 dB)
Min value R: -26.14% (-11.65 dB)   -18.59% (-14.61 dB)
Max value L:  39.02% (-8.17 dB)     27.20% (-11.31 dB)
Max value R:  25.67% (-11.81 dB)    16.95% (-15.42 dB)
But re-encoding your sample (the first 15 sec. cut), instead of the original, with the same methods you used:
Code:
             ffmpeg_stereo.aac     Megui_stereo.m4a
             -------------------   -------------------
RMS power L:   4.45% (-27.03 dB)    11.11% (-19.09 dB)
RMS power R:   3.58% (-28.93 dB)     8.82% (-21.10 dB)
               -----                ------
               8.03                 19.93
Min value L: -28.36% (-10.95 dB)  -100.00% (0.00 dB)
Min value R: -18.59% (-14.61 dB)   -66.13% (-3.59 dB)
Max value L:  27.20% (-11.31 dB)    97.13% (-0.25 dB)
Max value R:  16.95% (-15.42 dB)    61.68% (-4.20 dB)
The difference is related to Normalize(). Here Normalize() amplifies the volume to reach the maximum volume without distortion:
Min value L: -28.36% (-10.95 dB) -100.00% (0.00 dB)
But normalizing the full stream reaches the maximum volume at a point not included in the sample and, in the sample, the max volume value is:
Min value L: -40.22% (-7.91 dB) -28.36% (-10.95 dB)
That means that the ffmpeg downmix has parts with clipped samples. If you want the same result, don't Normalize().
__________________
BeHappy, AviSynth audio transcoder. Last edited by tebasuna51; 22nd April 2018 at 13:02. |
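For readers who want to check the tables above, the percentage and dB columns are related by dB = 20 * log10(percent / 100). A small Python sketch follows; the helper name `percent_to_db` is made up for illustration, and the input values are copied from the tables.

```python
import math

def percent_to_db(percent):
    """Convert an amplitude given as a percentage of full scale to dB."""
    return 20 * math.log10(abs(percent) / 100.0)

# RMS values of the six original channels, in percent, as listed above.
rms_6ch = [3.56, 2.27, 1.80, 3.15, 2.44, 2.43]
total_6ch = round(sum(rms_6ch), 2)

print(total_6ch)
for value in (3.56, 40.22, 100.0):
    print(value, round(percent_to_db(value), 2))
```

The results match the posted figures to within about 0.01 dB. The small differences are expected because the percentages in the tables are already rounded.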
23rd April 2018, 16:03 | #32 | Link |
Registered User
Join Date: Nov 2015
Posts: 81
|
I didn't normalize and the result is the same. BTW, I noticed in the last log that MeGUI doesn't mention anything about the matrix thing, so maybe it didn't work.
I'm talking about this part in the custom script: Code:
fl_sl = MixAudio(fl, sl, 1.0, 0.707107)
fr_sr = MixAudio(fr, sr, 1.0, 0.707107)
l = MixAudio(fl_sl, fc, 1.0, 0.707107)
r = MixAudio(fr_sr, fc, 1.0, 0.707107) |
23rd April 2018, 20:44 | #33 | Link | |
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 6,915
|
That is not possible; without the Normalize() we must obtain values similar to ffmpeg's:
Code:
             2 CH FFMPEG.aac       2 CH MEGUI_Not-Nor.
             -------------------   -------------------
RMS power L:   4.45% (-27.03 dB)     4.45% (-27.04 dB)
RMS power R:   3.58% (-28.92 dB)     3.53% (-29.05 dB)
               -----                 -----
               8.03                  7.98
Min value L: -40.22% (-7.91 dB)    -40.70% (-7.81 dB)
Min value R: -26.14% (-11.65 dB)   -27.07% (-11.35 dB)
Max value L:  39.02% (-8.17 dB)     38.43% (-8.31 dB)
Max value R:  25.67% (-11.81 dB)    23.76% (-12.48 dB)
Quote:
If you obtain only two channels, the downmix works.
__________________
BeHappy, AviSynth audio transcoder. |
|
24th April 2018, 19:40 | #34 | Link |
Registered User
Join Date: Nov 2015
Posts: 81
|
These are the scripts I used and 15 seconds samples of both results:
Code:
ClearAutoloadDirs()
AddAutoloadDir("C:\Users\LTX\BACKUP\MeGUI\tools\avs\plugins")
LoadPlugin("C:\Users\LTX\BACKUP\MeGUI\tools\lsmash\LSMASHSource.dll")
a=LWLibavAudioSource("C:\Users\LTX\Desktop\EOE_6CH.mkv")
fl = GetChannel(a, 1)
fr = GetChannel(a, 2)
fc = GetChannel(a, 3)
# lf = GetChannel(a, 4)
sl = GetChannel(a, 5)
sr = GetChannel(a, 6)
fl_sl = MixAudio(fl, sl, 1.0, 0.707107)
fr_sr = MixAudio(fr, sr, 1.0, 0.707107)
l = MixAudio(fl_sl, fc, 1.0, 0.707107)
r = MixAudio(fr_sr, fc, 1.0, 0.707107)
MergeChannels(l, r)
Normalize()
return last
Code:
ClearAutoloadDirs()
AddAutoloadDir("C:\Users\LTX\BACKUP\MeGUI\tools\avs\plugins")
LoadPlugin("C:\Users\LTX\BACKUP\MeGUI\tools\lsmash\LSMASHSource.dll")
a=LWLibavAudioSource("C:\Users\LTX\Desktop\EOE_6CH.mkv")
fl = GetChannel(a, 1)
fr = GetChannel(a, 2)
fc = GetChannel(a, 3)
# lf = GetChannel(a, 4)
sl = GetChannel(a, 5)
sr = GetChannel(a, 6)
fl_sl = MixAudio(fl, sl, 1.0, 0.707107)
fr_sr = MixAudio(fr, sr, 1.0, 0.707107)
l = MixAudio(fl_sl, fc, 1.0, 0.707107)
r = MixAudio(fr_sr, fc, 1.0, 0.707107)
MergeChannels(l, r)
return last |
25th April 2018, 09:18 | #37 | Link | |
Moderator
Join Date: Feb 2005
Location: Spain
Posts: 6,915
|
Quote:
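Normalize() amplifies the source until its peak reaches the maximum value. When only the first 15 seconds are processed, that peak lies inside them (100% in my Megui_stereo.m4a). When the full stream is processed, the 100% point is somewhere outside the first 15 sec., so those 15 seconds can't be amplified any further.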
__________________
BeHappy, AviSynth audio transcoder. |
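The effect described above can be reproduced with a toy peak normalizer. This is a Python sketch of generic peak normalization, not AviSynth's actual Normalize() implementation; the function name, the target peak of 1.0 and the sample values are all assumptions made for illustration.

```python
def normalize_gain(samples, target_peak=1.0):
    """Return the single gain that brings the loudest sample to target_peak."""
    peak = max(abs(s) for s in samples)
    return target_peak / peak

# Hypothetical stream: the loudest sample (-0.8) lies outside the excerpt.
full_stream = [0.1, -0.25, 0.2, 0.05, -0.8, 0.3]
excerpt = full_stream[:3]  # stands in for "the first 15 seconds"

gain_full = normalize_gain(full_stream)
gain_excerpt = normalize_gain(excerpt)

# Peak of the excerpt when it is cut from the normalized full stream.
excerpt_peak_in_full_encode = max(abs(s) for s in excerpt) * gain_full

print(gain_full, gain_excerpt, excerpt_peak_in_full_encode)
```

Normalizing the excerpt on its own applies a much larger gain than normalizing the whole stream, so the same 15 seconds end up at different levels depending on what was fed to the normalizer.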
|
30th April 2018, 18:22 | #38 | Link | |
Registered User
Join Date: Mar 2011
Posts: 4,829
|
Quote:
The coefficients might relate to the ability of the output format/codec to encode peaks above 0dB. I can't speak for ffmpeg's AAC encoder, but QAAC doesn't clip peaks when the input is float. I vaguely remember someone in another forum testing it, and QAAC coped with peaks well over 0dB.
I'm not sure about the LAME encoder. The stand-alone version accepts 32 bit float but converts to 24 bit integer before encoding, so peaks greater than 0dB would be clipped. ffmpeg's built-in LAME encoder seems to accept 32 bit float though. I've compared it to the stand-alone LAME, and ffmpeg's LAME encodes peaks above 0dB, at least according to a true peak scan of the encoded audio. Whether those peaks are encoded accurately.... that might be a different story.
Last edited by hello_hello; 30th April 2018 at 19:07. |
|
30th April 2018, 19:05 | #39 | Link |
Registered User
Join Date: Mar 2011
Posts: 4,829
|
doomleox999,
as an alternative you could try downmixing and encoding with foobar2000. I use it for 99.9% of my audio encoding. You can add DSPs to both the playback and conversion chains, and conversion configurations can be saved as presets. There are plugins for decoding audio it can't decode "out of the box" (DTS and AC3 are the main examples) and there's even a plugin for decoding Avisynth scripts.
Not that I downmix much these days, but if I downmix and encode a bunch of related files (episodes of a TV show etc) I usually downmix with the matrix mixer DSP while reducing the overall volume by 6dB. It's not quite enough to guarantee there won't be clipping, but it's enough most of the time, and that way the relative volumes of the tracks don't change.
Once it's encoded, I run a ReplayGain scan on the encoded audio to check the peak levels (I encode with either QAAC or ffmpeg's MP3 encoder so the peaks aren't clipped before encoding). If a track or two has peaks above 0dB I'd probably re-encode them with a limiter DSP in the conversion chain. On the rare occasion there are lots of tracks with peaks above 0dB, I'd decrease the volume of each output file by an appropriate amount (so the relative volumes remain the same). Foobar2000 can losslessly adjust the volume of MP3 and AAC audio in an MP4 or MKV container, or alternatively you could reduce the downmix volume and re-encode them all a second time, which isn't as bad as it sounds as you can encode multiple files simultaneously.
Fb2k comes with its own downmix DSP but it's not configurable and it doesn't reduce the volume to prevent clipping.
There's a screenshot of the matrix mixer DSP below. The coefficients are linear gains, as they are when downmixing with Avisynth or ffmpeg, but specified as a percentage. This is my standard 5.1ch to stereo downmix preset with an overall 6dB volume reduction. Both the back and side channels need to be included for 5.1ch, as the surround channels can be in either the back or side.
Last edited by hello_hello; 30th April 2018 at 19:10. |
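To see why a 6dB reduction is "not quite enough to guarantee there won't be clipping", the arithmetic can be sketched in Python. The coefficients here are assumed to be the 1.0 / 0.707107 set discussed earlier in the thread; the actual values in the screenshot are not reproduced in this post, so the percentages are an illustration, not the preset itself.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear gain factor."""
    return 10 ** (db / 20)

gain = db_to_gain(-6.0)

# Assumed downmix coefficients (front, centre, surround) from earlier posts.
coeffs = {"front": 1.0, "center": 0.707107, "surround": 0.707107}

# The same coefficients after the 6dB cut, expressed as percentages.
percent = {name: round(100 * c * gain, 1) for name, c in coeffs.items()}

# Worst case: all contributing channels at full scale with the same sign.
worst_case = sum(coeffs.values()) * gain

print(round(gain, 6), percent, round(worst_case, 3))
```

Under these assumptions the worst case still lands at about 1.21 times full scale, so clipping remains possible in principle. That fits the described workflow of scanning the peaks after encoding and only then deciding whether a limiter or a further volume reduction is needed.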
Tags |
6channels, ffmpeg, megui, stereo |