Old 1st December 2014, 17:53   #21  |  Link
hello_hello
Registered User
 
Join Date: Mar 2011
Posts: 4,829
Quote:
Originally Posted by manolito View Post
@ hello_hello
I think you are confusing ReplayGain with EBU R128 a little bit...
Not too much.....

Quote:
Originally Posted by manolito View Post
When you are talking about the foobar2000 ReplayGain using the EBU R128 method for scanning, I believe this is the wrong term. If loudness scanning is done using EBU R128, then it is NOT ReplayGain any more. I would call it EBU R128 with a different reference level (e.g. -18 LUFS instead of the standard -23 LUFS).
Foobar2000 definitely uses the EBU R128 method for ReplayGain scanning, but the result isn't a lot different most of the time, so in that respect how you describe it is probably semantics. It's the target volume of 89dB or -18 LUFS which makes it ReplayGain, in my opinion. If the two scanning methods always produced dramatically different loudness results, that'd be a different story, but mostly they don't seem to. Apparently EBU R128 is supposed to be a little more accurate. I don't know, so I won't argue with that.

http://www.foobar2000.org/changelog
1.1.6
ReplayGain scanner now uses libebur128 for improved accuracy.

https://github.com/jiixyj/libebur128
libebur128 is a library that implements the EBU R 128 standard for loudness normalisation.

https://github.com/jiixyj/loudness-scanner
Usage
The scanner also supports ReplayGain tagging.
The reference volume is -18 LUFS (5 dB louder than the EBU R128 reference level of -23 LUFS).


http://en.wikipedia.org/wiki/ReplayGain#Scanners
foobar2000: Generates metadata through included plugin using EBU R128 (but at old 89dB levels) for all supported tag formats.

Quote:
Originally Posted by manolito View Post
Both ReplayGain and EBU R128 establish a method for loudness scanning, plus they establish a reference loudness level. These two things are independent of each other. You can use ReplayGain scanning and set a reference level different from the standard 89 dB, and you can use EBU R128 scanning and also set a different reference level.
Well.....
If you scan with ReplayGain the tags saved will (should) always specify any volume change required to achieve the 89dB target volume. It couldn't work any other way. ReplayGain tagging can only work if the reference level is always the same.
Yes, some programs let you change the ReplayGain target volume and they'll use that target volume when converting etc, but they'll always write a tag relative to 89dB. Take a file that's already at the 89dB target volume. Scan it with ReplayGain while changing the target volume to 83dB, then convert it using the 83dB target volume. Any ReplayGain tag written will be +6dB. If there's no ReplayGain tag, well, it's no longer ReplayGain; you've just used its scanning to adjust the volume to some arbitrary level.
Even if the playback device lets you change the target playback level it still needs ReplayGain tags relative to 89dB as a point of reference. If a program writes tags relative to some other reference level I can't see how that'd be anything but silly.
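
To make the arithmetic in that example concrete, here is a small Python sketch. It assumes, as described above, that the tag is always written relative to 89dB no matter which target volume was used for the conversion. The function name is made up for illustration and isn't taken from any ReplayGain tool.

```python
REFERENCE_DB = 89.0  # ReplayGain reference level as discussed in this thread


def tag_after_adjust(measured_db: float, target_db: float) -> tuple[float, float]:
    """Return (gain applied to the audio, tag value to write afterwards)."""
    applied_gain = target_db - measured_db   # what the converter does to the audio
    new_level = measured_db + applied_gain   # loudness of the converted file
    tag = REFERENCE_DB - new_level           # tag stays relative to 89dB
    return applied_gain, tag


# A file already at 89dB, converted with an 83dB target volume.
applied, tag = tag_after_adjust(measured_db=89.0, target_db=83.0)
print(f"applied {applied:+.2f} dB, tag {tag:+.2f} dB")
```

A player that honours the tag would end up back at 89dB, which is the point of keeping one fixed reference.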

In respect to the discussion in this thread, ReplayGain and EBU R128 both have the same goal. ie to determine how loud the audio sounds. They mostly don't seem to disagree by much.
As an experiment I tried some movie audio (Jurassic Park). I downmixed to stereo, normalised and converted to MP3 so I could scan with MP3Gain. I also scanned the MP3 with foobar2000.
MP3Gain/ReplayGain said its "Track Gain" is +5.33dB (so its level would need to be increased by 5.33dB on playback to achieve the 89dB target volume). Foobar2000/EBU R128 says it's +5.62dB.

Quote:
Originally Posted by manolito View Post
For my plugin which covers only DVD creation it makes total sense to use the EBU R128 scanning method, but employ a higher reference level like -18 LUFS. Most people seem to agree that the EBU scanning method delivers more consistent results than the ReplayGain method, and it handles 6-ch audio which ReplayGain does not.
I'd agree in respect to scanning 6ch audio, but probably not when it comes to using a higher reference level such as -18 LUFS, assuming you're referring to adjusting the audio to that level. Unless I'm completely misunderstanding what the change in reference level would achieve.
That Jurassic Park audio I mentioned earlier.... to hit a target volume of 89dB (-18 LUFS) it needs to be increased by 5.33dB, which can't be done because it's already been normalised. Peaks at maximum, as loud as it gets. In fact after the MP3 was decoded while scanning it, both ReplayGain and EBU R128 agreed the peak level was already just a tad greater than maximum (1.003232 and 1.003263 respectively.... as a linear ratio of full scale, not dB).

89 - 5.33 = 83.67
The SMPTE reference level of 83dB or R128's -23 LUFS are looking pretty good (assuming 83dB and -23 LUFS are the same target volume).
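
The headroom problem can be put into numbers with a short Python sketch. It assumes the peak values quoted above are linear ratios where 1.0 is digital full scale; the function names are only illustrative.

```python
import math


def headroom_db(peak: float) -> float:
    """Gain in dB that can still be added before a linear peak reaches 1.0."""
    return -20.0 * math.log10(peak)


def shortfall_db(wanted_gain_db: float, peak: float) -> float:
    """How many dB of the wanted gain cannot be applied without clipping."""
    return max(0.0, wanted_gain_db - headroom_db(peak))


# Jurassic Park MP3: TrackGain +5.33dB, peak 1.003232 (values from this post).
print(round(headroom_db(1.003232), 3))         # slightly negative: already over full scale
print(round(shortfall_db(5.33, 1.003232), 2))  # dB of gain that would clip
```

With no headroom left, practically all of the +5.33dB is shortfall, which is why a lower target volume looks attractive for this kind of audio.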

Quote:
Originally Posted by manolito View Post
For FFmpeg the EBU R128 scanning is totally separate from the following loudness adjustment. You have to do the scanning pass, note the LUFS value, calculate the difference to the desired reference value, and then do a second pass for the loudness correction. The R128Gain software can do it all in one step, using FFmpeg for scanning and either SoX or FFmpeg for the loudness adjustment.
I've never used R128Gain. I assume this is it?
http://r128gain.sourceforge.net
I assume its one-step scan/convert process is really an automated two-step process?
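
For anyone wanting to script the two FFmpeg passes described in the quote, a rough Python sketch follows. It only builds the command lines and parses a log, it doesn't run anything. It assumes the ebur128 filter's summary contains a line of the form "I: -28.6 LUFS" and that the volume filter accepts a value in dB; the sample log text is made up for illustration, not real output, and the file names are placeholders.

```python
import re

# Matches "I: -28.6 LUFS" style entries in the ebur128 filter's log output.
_INTEGRATED = re.compile(r"\bI:\s*(-?\d+(?:\.\d+)?)\s*LUFS")


def scan_command(src: str) -> list[str]:
    """Pass 1: measure loudness only, discard the decoded audio."""
    return ["ffmpeg", "-nostats", "-i", src,
            "-filter_complex", "ebur128", "-f", "null", "-"]


def parse_integrated_lufs(ffmpeg_log: str) -> float:
    """Take the last integrated loudness value, which is the final summary."""
    matches = _INTEGRATED.findall(ffmpeg_log)
    if not matches:
        raise ValueError("no integrated loudness found in log")
    return float(matches[-1])


def adjust_command(src: str, dst: str, measured_lufs: float,
                   target_lufs: float = -23.0) -> list[str]:
    """Pass 2: apply the difference between target and measured loudness."""
    gain = target_lufs - measured_lufs
    return ["ffmpeg", "-i", src, "-af", f"volume={gain:.2f}dB", dst]


# Illustrative log fragment (assumed layout, not captured from a real run).
SAMPLE_LOG = """
  Integrated loudness:
    I:         -28.6 LUFS
    Threshold: -39.0 LUFS
"""

measured = parse_integrated_lufs(SAMPLE_LOG)
print(adjust_command("input.wav", "output.wav", measured))
```

A one-step tool presumably automates the same thing: scan, subtract, then apply the gain.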

A look around the R128Gain site would indicate it's kind of ReplayGain with a different name. Same principle, the same sort of tagging, just a different scanning method. Although it appears it does ReplayGain scanning too. I'll definitely have a play with it soon.
I assume it always writes ReplayGain tags relative to -18 LUFS, but what happens in EBU R128 mode in respect to the tags it writes when you change the reference level? Anyway, I'll have a play with it myself tomorrow. I'm keen to check its EBU R128 scan will produce the same result as foobar2000's ReplayGain scan. I assume it will....

Cheers.

Last edited by hello_hello; 1st December 2014 at 18:51.
Old 2nd December 2014, 01:11   #22  |  Link
manolito
Registered User
 
Join Date: Sep 2003
Location: Berlin, Germany
Posts: 3,079
Quote:
Originally Posted by smok3 View Post
And how is that supposed to work? (time travel of some sort?)
It works by using a command line like this:
Code:
r128gain.exe --command="sox %TRACK% %BN%_normalized.wav gain %TGDB%" input.wav
Technically it is a 2-pass process, but from the user's perspective R128Gain does it in one step.


@hello_hello

The fundamental difference between ReplayGain and EBU R128 is the loudness scanning method. If you want to compare the two methods, please do some googling. For the analyzing process EBU R128 uses a K-weighting curve, channel summing and gating (an absolute gate at -70 LUFS plus a relative gate). The ReplayGain scanning is quite different; AFAIK HydrogenAudio has a detailed description. And contrary to your experience most folks say that the results of these two methods are not similar, but quite different.
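
For readers who prefer code to googling, here is a hedged Python sketch of the gating stage only, as I understand ITU-R BS.1770 (which EBU R128 builds on): an absolute gate at -70 LUFS, followed by a relative gate 10 LU below the loudness of the blocks that survived the first gate. The K-weighting filter and the splitting into overlapping 400 ms blocks are not shown; the input is assumed to be a list of already K-weighted, channel-summed mean-square block powers.

```python
import math

ABS_GATE_LUFS = -70.0   # absolute gate
REL_GATE_LU = -10.0     # relative gate, measured from the ungated-by-relative mean


def block_loudness(power: float) -> float:
    """Loudness in LUFS of one mean-square power value."""
    return -0.691 + 10.0 * math.log10(power)


def integrated_loudness(block_powers: list[float]) -> float:
    """Two-stage gated loudness over a list of block powers."""
    above_abs = [p for p in block_powers
                 if p > 0 and block_loudness(p) > ABS_GATE_LUFS]
    if not above_abs:
        return float("-inf")  # nothing but silence
    mean_abs = sum(above_abs) / len(above_abs)
    rel_threshold = block_loudness(mean_abs) + REL_GATE_LU
    kept = [p for p in above_abs if block_loudness(p) > rel_threshold]
    return block_loudness(sum(kept) / len(kept))


# Loud passage followed by a much quieter one: the quiet blocks are gated out.
print(round(integrated_loudness([1.0] * 10 + [0.001] * 10), 3))
```

The gating is why long quiet stretches (typical for movie audio) don't drag the measured loudness down as much as a plain average would.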

You now introduce another ReplayGain feature, which is tagging the output using metadata, so the real content does not get modified at all; the player has to read the metadata and adjust the loudness accordingly.

This concept is absent from the EBU R128 standard, because it is meant for broadcasting purposes. This does not mean that the tagging concept cannot be employed together with EBU R128 (in fact R128Gain does support tagging for FLAC output files). But for broadcasting and also for my target files (DVD creation) the tagging concept cannot be used, so the source has to be reencoded physically.

Back to semantics:
For me it is ReplayGain when the ReplayGain scanning method is used, and it is EBU R128 when the EBU scanning method is used. And this is independent from the reference level and if the output is tagged or reencoded.


Cheers
manolito

Last edited by manolito; 2nd December 2014 at 07:23.
Old 2nd December 2014, 08:59   #23  |  Link
smok3
brontosaurusrex
 
Join Date: Oct 2001
Posts: 2,392
According to ha users, they are actually quite similar (rg and r128), example:
http://www.hydrogenaud.io/forums/ind...5&#entry759785
__________________
certain other member
Old 2nd December 2014, 09:12   #24  |  Link
hello_hello
Registered User
 
Join Date: Mar 2011
Posts: 4,829
Quote:
Originally Posted by manolito View Post
@hello_hello

The fundamental difference between ReplayGain and EBU R128 is the loudness scanning method. If you want to compare the two methods, please do some googling. For the analyzing process EBU R128 uses a K-weighting curve, channel summing and gating (an absolute gate at -70 LUFS plus a relative gate). The ReplayGain scanning is quite different; AFAIK HydrogenAudio has a detailed description. And contrary to your experience most folks say that the results of these two methods are not similar, but quite different.
I'm aware the scanning methods work differently, but I'd disagree the results tend to be a lot different. The fact that software tends to use ReplayGain tags for EBU R128 scanning also seems to blur the line somewhat. That's what R128Gain does by default. It scans with EBU R128 and saves the results in ReplayGain tags.
Only, in what seems like madness to me, it doesn't always use the ReplayGain target volume for the tags. Maybe that's a reason people think the results are so different: by default they're 5dB apart.

Quote:
Originally Posted by manolito View Post
You now introduce another ReplayGain feature, which is tagging the output using metadata, so the real content does not get modified at all; the player has to read the metadata and adjust the loudness accordingly.
That's how ReplayGain was supposed to work. Of course a program can use the same metadata to physically change the volume when re-encoding, and we all use it that way, but that's not how it was originally intended to be used.
The main reason I mentioned ReplayGain in the first place was due to the metadata/tagging, even if it's with EBU R128 scanning. If you're trying to adjust a whole bunch of files so they sound the same in level, a possible barrier to that is the difference in dynamic range. Audio "A" might require the same volume decrease to achieve the target volume as audio "B", but audio "A" requires a further decrease to prevent clipping. As soon as that happens, it's game over in the "sounds the same" department. The only way around it is to adjust a bunch of files as a group and if one requires a further volume adjustment, they all get the same further adjustment. For that though, you need the metadata. Either that or you'd need to use a target volume that pretty much guarantees there'll be enough headroom. ie -23 LUFS.

Adjusting a group of files together with MP3Gain is nice and easy. You just scan them with TrackGain, reduce the target volume until there's no longer any file for which MP3Gain shows clipping, and that's the magic target volume (it adjusts to that volume, but still writes tags relative to 89dB). It uses the metadata to help you set a suitable target volume for a group of files.
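
The "reduce the target until nothing clips" routine can be worked out directly from the metadata. The following Python sketch assumes each file has a TrackGain relative to 89dB and a linear peak where 1.0 is full scale; the example values and function names are invented for illustration and this is not how MP3Gain itself is implemented.

```python
import math

REFERENCE_DB = 89.0  # level the TrackGain tags are relative to


def max_clean_level(track_gain_db: float, peak: float) -> float:
    """Loudest level (dB) a file can be raised to before its peak hits 1.0."""
    current_level = REFERENCE_DB - track_gain_db
    return current_level - 20.0 * math.log10(peak)


def common_target(files: list[tuple[float, float]],
                  wanted_db: float = REFERENCE_DB) -> float:
    """Highest target, not above wanted_db, that no file in the group clips at."""
    return min([wanted_db] + [max_clean_level(g, p) for g, p in files])


# (TrackGain in dB, linear peak) per file; the last one is the movie audio.
group = [(-6.0, 0.98), (-2.0, 0.5), (5.33, 1.003232)]
print(round(common_target(group), 2))
```

The file with the least headroom decides the target for the whole group, so every file gets the same relative treatment.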

Quote:
Originally Posted by manolito View Post
Back to semantics:
For me it is ReplayGain when the ReplayGain scanning method is used, and it is EBU R128 when the EBU scanning method is used. And this is independent from the reference level and if the output is tagged or reencoded.
If you look at the R128Gain presets you'll see the authors of the R128Gain GUI don't quite agree. They appear to have made this distinction:

ReplayGain1 = -18 LUFS target volume, ReplayGain scanning.
ReplayGain2 = -18 LUFS target volume, EBU R128 scanning.
EBU R128 = -23 LUFS target volume, EBU R128 scanning.

Each method seems to always write the scan results as ReplayGain tags, which makes sense, as without the tags you'd just be using the scan to indiscriminately fiddle with the volume.
It's obvious if you think about it. Sure, you can scan, adjust the volume and remove the tags, but then it's just a file with a different volume. How would you later determine how loud it is? You'd scan it again.

The scanning method and the reference level can never be independent or the whole thing falls apart. For ReplayGain it's always been fixed, as far as I know, at 89dB. It appears R128Gain writes a new tag called "replay_gain_reference_loudness" which by default is -23 LUFS (or -18 LUFS for the ReplayGain presets). And sorry, but the tag is called "replay_gain_reference_loudness" no matter which scanning method you use.
If you change the reference level when scanning, TrackGain also changes accordingly.

Default of -23 LUFS
replay_gain_reference_loudness -23 LUFS
replay_gain_track_gain -9.4 LUFS

-17 LUFS
replay_gain_reference_loudness -17 LUFS
replay_gain_track_gain -3.4 LUFS

And at face value that seems fine. Software should be able to read the reference loudness tag and work back to the default of -23 LUFS etc, but there has to be a reference, and unfortunately so far the only software I've tried that seems to acknowledge the reference loudness tag exists is Mp3Tag.
Foobar2000 doesn't display it and I suspect it'll take the TrackGain at face value while assuming 89dB as the target volume, but I haven't tested that yet. MediaInfo doesn't display the reference loudness tag, but it does display this one:
REPLAYGAIN_ALGORITHM : EBU R128
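
Working back from the reference loudness tag is simple arithmetic, sketched here in Python using the two tag sets listed above. The function names are illustrative; no claim is made that any player actually does this.

```python
def measured_loudness(track_gain: float, reference: float) -> float:
    """Loudness the scanner must have measured, given the two tag values."""
    return reference - track_gain


def rebase_gain(track_gain: float, old_reference: float,
                new_reference: float) -> float:
    """Convert a track gain written for one reference level to another."""
    return track_gain + (new_reference - old_reference)


# Tags written with the -23 LUFS default, re-expressed for -17 LUFS.
print(round(rebase_gain(-9.4, -23.0, -17.0), 1))
# Both tag sets describe the same measured loudness.
print(round(measured_loudness(-9.4, -23.0), 1), round(measured_loudness(-3.4, -17.0), 1))
```

Software that ignores the reference tag and assumes a fixed target would be off by exactly the difference between the two reference levels.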

I still haven't played with R128Gain a lot. So far, I've confirmed its EBU R128 scanning and foobar2000's ReplayGain scanning produce the same result, and so far foobar2000's scanning and MP3Gain's scanning differ a little, but we're still within a 1dB range, nowhere near anything that could be called "quite different".

So far, R128Gain's scanning works as I thought it should, but with the addition of the "replay_gain_reference_loudness" tag so fiddling with the reference level at least makes sense. Well, maybe....So far, I don't seem to be able to get R128Gain to change the volume using the GUI. All it's doing is re-writing the file with the new tags, but it's early days. I'll come back to it after a break and try again.

So far, I'm still completely failing to understand how changing the reference level is going to improve things when it comes to normalising the audio, but maybe that'll become clearer once I've got my head around R128Gain a little better.
I'm not saying you need to keep the ReplayGain tags after re-encoding. Chances are the muxing software will remove them anyway. You'll know the audio has been adjusted to the same volume for a particular DVD and that's what matters. The reference loudness used at the time doesn't matter so much as long as they're the same. I just don't get why adjusting to -18 LUFS would necessarily be better when the industry default is -23 LUFS.

Last edited by hello_hello; 2nd December 2014 at 12:50.
Old 2nd December 2014, 09:29   #25  |  Link
manolito
Registered User
 
Join Date: Sep 2003
Location: Berlin, Germany
Posts: 3,079
Quote:
Originally Posted by smok3 View Post
According to ha users, they are actually quite similar (rg and r128), example:
http://www.hydrogenaud.io/forums/ind...5&#entry759785
Just look at the following post:
http://www.hydrogenaud.io/forums/ind...dpost&p=761876
and things do not look this similar now...


Cheers
manolito
Old 2nd December 2014, 12:44   #26  |  Link
hello_hello
Registered User
 
Join Date: Mar 2011
Posts: 4,829
Quote:
Originally Posted by manolito View Post
Just look at the following post:
http://www.hydrogenaud.io/forums/ind...dpost&p=761876
and things do not look this similar now...
Out of 1275 scanned tracks, he made note of 5.
Of those five, the biggest difference was 5.7dB.
I'm not saying that's insignificant, but in the context of 1275 scanned tracks.....
Of the five he noted, the least difference was 3.09dB. Still a bit significant, but not massive.....
Pity he didn't also graph those 5 noteworthy examples with his ears to show which scanning method was getting it right.

I've never thought ReplayGain was perfect. Very, very good, but not perfect. I tend to run my audio player in random mode, so going from Billie Holiday to Billy Idol wouldn't be uncommon, and I've often thought maybe the old low-fidelity stuff is just a little bit quiet.
So I scanned Billy Idol's greatest hits. EBU R128 and ReplayGain don't differ by much. Billie Holiday.... EBU R128 seems to want to increase the TrackGain by between 1.5dB and 3dB for many of those. I guess EBU R128 and my ears agree there.

Those tests weren't particularly scientific. The MP3s were scanned by MP3Gain, their volumes adjusted to nearly 89dB, and TrackGain tags were written to make up any difference (MP3s can only be adjusted losslessly in 1.5dB increments), while I ran an EBU R128 scan on the already adjusted MP3s. Some time soon, I'll do some more accurate comparing, but I still maintain ReplayGain and EBU R128 are generally very similar. If now and then they differ by a few dB because EBU R128 is better, that's got to be a good thing, and a reason to use EBU R128 scanning for ReplayGain, I'd have thought.
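
The "adjusted to nearly 89dB, with tags making up the difference" part can be sketched in Python like this. It assumes the lossless adjustment moves in whole 1.5dB steps and that whatever is left over goes into the TrackGain tag; the function name and the example gain are made up for illustration.

```python
STEP_DB = 1.5  # size of one lossless MP3 gain step, as mentioned above


def split_gain(wanted_gain_db: float) -> tuple[int, float, float]:
    """Return (steps, gain applied losslessly, remainder left for the tag)."""
    steps = round(wanted_gain_db / STEP_DB)   # nearest whole number of steps
    applied = steps * STEP_DB
    return steps, applied, wanted_gain_db - applied


steps, applied, remainder = split_gain(-7.2)
print(steps, applied, round(remainder, 2))
```

The remainder is never more than half a step either way, which is why the adjusted files end up "nearly" rather than exactly at the target.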

Anyway, I never intended to start a ReplayGain vs EBU R128 scanning debate. I just thought EBU R128 is used for ReplayGain scanning a bit these days, and ReplayGain scanning tends to save the metadata in tags (including peak level), and the metadata would be useful for determining the ideal target volume for a group of files so they could all be adjusted to the same target volume without any of them clipping. That's all.....

Last edited by hello_hello; 2nd December 2014 at 13:10.
Old 2nd December 2014, 14:13   #27  |  Link
hello_hello
Registered User
 
Join Date: Mar 2011
Posts: 4,829
Wow..... it takes R128Gain 4 minutes and 20 seconds to scan my Jurassic Park MP3. Is that anywhere in the vicinity of normal?
MP3Gain does it in around 1 minute, 10 seconds and I thought it was slow. Foobar2000 zips through it in 16 seconds and it'll scan multiple files simultaneously.
R128Gain says TrackGain +5.63dB (EBU R128 scanning). Foobar2000 says TrackGain +5.62dB. I'll put the 0.01dB difference down to the decoding rather than the scanning, for the moment.

20 minutes of 5.1ch AC3: "Force Stereo" unchecked, 2 minutes 5 seconds for R128Gain, 7 seconds for foobar2000, TrackGain +4.5dB both times.
Same 20 minute AC3 with "Force Stereo" checked still takes 45 seconds. The TrackGain result changes quite a bit (11.9dB). I'll have to investigate that one as I don't understand it yet.

Why is R128Gain so unbelievably slow?
Old 3rd December 2014, 00:08   #28  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,346
In addition to the normal gain, R128Gain may compute the "True Peak" value as defined by EBU R128, which is a somewhat more expensive process and can slow down the computations a lot.
It may also not have an optimized FFT filter for the K-weighting; a naive C implementation would be relatively slow.

Although, without knowing the length of said MP3, it's certainly super slow to take 2 minutes for something like a 5 minute audio file.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders

Last edited by nevcairiel; 3rd December 2014 at 00:12.
Old 3rd December 2014, 16:06   #29  |  Link
hello_hello
Registered User
 
Join Date: Mar 2011
Posts: 4,829
I can't re-test for a while. Probably tomorrow. The PC's running two encodes at the moment and I don't have much time anyway.
The Jurassic Park mp3 is 2 hours, six minutes long. R128Gain took 4 minutes and 20 seconds to scan it.
The second audio was 20 minutes worth of 5.1ch AC3.

I'll report back when I've been able to re-test with the "true peak" option disabled (it was enabled).
Old 3rd December 2014, 17:55   #30  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,346
2 minutes for a 2 hour audio file seems like an acceptable time if True Peak is being computed. If anything, I would somehow doubt foobar2000 could decode and process the entire file in 16 seconds.
Audio decoding and processing is fast, but not that fast. 2 minutes is 60x realtime, 16 seconds is 472x realtime speed, which is just absurd.
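
The speed factors are easy to recompute from the numbers reported earlier in the thread (a 2 hour, 6 minute MP3; scan times of 4:20, 1:10 and 0:16). A small Python sketch, with the dictionary labels chosen just for this example:

```python
def realtime_factor(audio_seconds: float, scan_seconds: float) -> float:
    """How many times faster than playback speed the scan ran."""
    return audio_seconds / scan_seconds


AUDIO_SECONDS = 2 * 3600 + 6 * 60  # 2 hours, 6 minutes

scan_times = {
    "R128Gain (True Peak on)": 4 * 60 + 20,
    "MP3Gain": 1 * 60 + 10,
    "foobar2000": 16,
}

for tool, seconds in scan_times.items():
    print(f"{tool}: {realtime_factor(AUDIO_SECONDS, seconds):.1f}x realtime")
```

The foobar2000 figure matches the 472x quoted above; the others depend on which scan time is paired with which file.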
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 3rd December 2014, 18:05   #31  |  Link
smok3
brontosaurusrex
 
Join Date: Oct 2001
Posts: 2,392
Quote:
Originally Posted by hello_hello View Post
I can't re-test for a while. Probably tomorrow. The PC's running two encodes at the moment and I don't have much time anyway.
The Jurassic Park mp3 is 2 hours, six minutes long. R128Gain took 4 minutes and 20 seconds to scan it.
The second audio was 20 minutes worth of 5.1ch AC3.

I'll report back when I've been able to re-test with the "true peak" option disabled (it was enabled).
Has anyone tested:
a. what would be the difference in the calculated result if you only scan the first 10-20 minutes?
b. if you only scan every 10th/100th/1000th sample?
(assuming we ignore the true peak)
__________________
certain other member
Old 4th December 2014, 02:45   #32  |  Link
manolito
Registered User
 
Join Date: Sep 2003
Location: Berlin, Germany
Posts: 3,079
FWIW I just found out that Peter Belkner published a follow-up software for R128Gain called BS1770Gain.
http://bs1770gain.sourceforge.net/

It is CLI only so far, parameters are not compatible with R128Gain. It has some interesting new features, it is more universal than R128Gain. I just hope that nobody associates its name with BS...

BTW up to now I was unaware of the newer ReplayGain2 specification. This new spec uses the EBU R128 method for scanning and employs a reference level of -18 LUFS. Looks like much of the confusion between the two terms EBU R128 and ReplayGain originated from this new RG2 flavor.


Cheers
manolito
Old 4th December 2014, 05:36   #33  |  Link
xooyoozoo
Registered User
 
Join Date: Dec 2012
Posts: 197
Quote:
Originally Posted by xooyoozoo View Post
The filter works on a single file basis, but I'd like to obtain an album-gain value. Is there a standard, "correct" way of doing so?

I'm thinking of concatenating an aggregate macro file to run through ffmpeg, but it'd be nice to save an extra processing step if that value can be trivially calculated using per-track values.
To answer my own question, ffmpeg's ebur128 filter calculates the gain value from two internal variables: integrated_sum & nb_integrated. If the values are somehow saved, "album" gain can be recalculated from any number of separate runs.

I'm getting results within <0.01 dB of results from concatenated sound files.
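
In case it helps anyone else, the recombination can be sketched in Python as below. This assumes each run yields a pair of values corresponding to integrated_sum (summed block energy that passed the gate) and nb_integrated (number of such blocks); how they get saved out of ffmpeg isn't shown, and the tuples here are hypothetical numbers. Since the relative gate was applied per track rather than over the whole album, treat the result as an approximation.

```python
import math


def loudness_lufs(mean_power: float) -> float:
    """Convert a mean gated block power to LUFS."""
    return -0.691 + 10.0 * math.log10(mean_power)


def album_loudness(per_track: list[tuple[float, int]]) -> float:
    """per_track: (integrated_sum, nb_integrated) saved from each separate run."""
    total_sum = sum(s for s, _ in per_track)
    total_count = sum(n for _, n in per_track)
    return loudness_lufs(total_sum / total_count)


# Hypothetical per-track values, only to show the calculation.
tracks = [(10.0, 10), (0.1, 10)]
print(round(album_loudness(tracks), 2))
```

Album gain is then the reference level minus that loudness, the same as for a single track.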

Last edited by xooyoozoo; 4th December 2014 at 05:39.
Old 4th December 2014, 08:12   #34  |  Link
hello_hello
Registered User
 
Join Date: Mar 2011
Posts: 4,829
Quote:
Originally Posted by nevcairiel View Post
2 minutes for a 2 hour audio file seems like an acceptable time if True Peak is being computed. If anything, I would somehow doubt foobar2000 could decode and process the entire file in 16 seconds.
Disabling the True Peak calculation in R128Gain's options made a difference. That reduces the scanning time for the 2 hour, six minute mp3 to about 1 minute instead of 4 minutes 20 seconds. That also makes it a little faster than MP3Gain's scanning which took 1 minute 10 seconds.

Quote:
Originally Posted by nevcairiel View Post
Audio decoding and processing is fast, but not that fast. 2 minutes is 60x realtime, 16 seconds is 472x realtime speed, which is just absurd.


It's so absurd, I asked about it at the foobar2000 forum at one stage.
Not that the answer really meant much to me, but so far I've not seen any evidence it affects the accuracy of the scanning.

foobar2000's ReplayGain scanner is faster than probably anything else because foobar2000 can scan multiple files at once, it has very fast file decoders, it uses smart file buffering and Peter has hand optimized various ReplayGain functions.

Edit: When I get a chance, I'll do some more comparison scanning between R128Gain and foobar2000 and report back, although I suspect if foobar2000's scanning was producing inaccurate results, its forum would be filled with posts complaining about it.

Last edited by hello_hello; 4th December 2014 at 13:57. Reason: Trying to get the forum to display an image
Old 4th December 2014, 08:19   #35  |  Link
hello_hello
Registered User
 
Join Date: Mar 2011
Posts: 4,829
Quote:
Originally Posted by manolito View Post
BTW up to now I was unaware of the newer ReplayGain2 specification. This new spec uses the EBU R128 method for scanning and employs a reference level of -18 LUFS. Looks like much of the confusion between the two terms EBU R128 and ReplayGain originated from this new RG2 flavor.
I'd just assumed ReplayGain was ReplayGain, only sometimes these days it's used with EBU R128 scanning. Is there an actual ReplayGain2 specification or is it just some sort of de-facto standard?
Old 4th December 2014, 08:41   #36  |  Link
manolito
Registered User
 
Join Date: Sep 2003
Location: Berlin, Germany
Posts: 3,079
http://wiki.hydrogenaud.io/index.php..._specification


Cheers
manolito