Nero's inferior ac3 transcoding
rpp3po
5th June 2005, 13:18
Hello,
Nero Recode uses a static gain value when transcoding the logarithmic ac3 to aac. This has the following consequences: either the transcoded audio gets corrupted (clipping), or its volume gets too low, which means a loss of audio resolution (quality) and wasted storage space.
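To illustrate the static-gain trade-off described above, here is a minimal Python sketch. It assumes decoded samples as floats with full scale at +/-1.0. The function name `apply_static_gain` and the sample values are made up for illustration and say nothing about how Recode works internally.

```python
def apply_static_gain(samples, gain_db):
    """Apply one fixed gain (in dB) and hard-clip at full scale (+/-1.0).

    Returns the processed samples and the number of samples that clipped.
    """
    factor = 10.0 ** (gain_db / 20.0)  # dB -> linear amplitude factor
    out = []
    clipped = 0
    for s in samples:
        v = s * factor
        if abs(v) > 1.0:
            clipped += 1
            v = 1.0 if v > 0 else -1.0
        out.append(v)
    return out, clipped


# Hypothetical tracks: one with loud peaks, one that stays quiet.
loud_track = [0.05, 0.2, -0.15]
quiet_track = [0.01, -0.02, 0.015]

_, loud_clipped = apply_static_gain(loud_track, 20.0)
quiet_out, quiet_clipped = apply_static_gain(quiet_track, 20.0)
quiet_peak = max(abs(v) for v in quiet_out)

print(loud_clipped)   # clipped samples in the loud track
print(quiet_peak)     # peak of the quiet track after the same gain
```

The same gain that clips the loud track still leaves the quiet one far below full scale, which is why a single value cannot suit every source.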
There have already been many postings about Recode's low audio. JohnV proposed a fix (http://forum.doom9.org/showthread.php?p=648752#post648752), which indicates that they just changed the static gain value. This has made the audio louder, but the trade-off is an increased probability of corruption.
This is just not the perfect way to do it. Because of its logarithmic nature, ac3 is able to carry huge dynamics, theoretically up to 40 bits on a linear PCM scale. Knowing this, a 2-pass ac3-to-aac conversion (including normalization) is a must. It is the only way to ensure that the complete 16-bit range of the aac target gets used. I have had Nero transcodes which never used more than 11 bits at all.
As Nero Recode already does two passes for high quality video, why not include an ac3 peak scan during the first pass as well?
Greetings,
rpp3po
Mug Funky
5th June 2005, 14:37
news flash - 99.99999999% of all ac3 streams come from 16-bit masters.
they are usually normalized with -20dB mix level, and usually peak at -10dB.
don't be worried about dynamic range when you're transcoding lossy to lossy - there'll be not much left anyway. just encode it quiet and turn the volume up :)
rpp3po
5th June 2005, 15:43
news flash - 99.99999999% of all ac3 streams come from 16-bit masters.
What you're claiming is plain b*lls*t:
1. ac3 is logarithmic, meaning that it has no PCM-like linear bit depth itself.
2. I haven't seen a mastering studio for ages which works below 24 bits on a PCM scale. Even the garage ones haven't gone for 16 bit for a long time (except for a final red-book CD mastering stage).
So why the hell should a Lucasarts engineer downmix to a "16-bit master" before feeding the ac3 algorithm? Could you provide some data for your "99.99999999%" claim?
don't be worried about dynamic range when you're transcoding lossy to lossy - there'll be not much left anyway. just encode it quiet and turn the volume up :)
I don't know about your crappy quality standards, and I don't know why in the world transcoding must include a loss of dynamics on top of a loss of quality. Just a real world example: Do a DVD->MP4 backup of a movie containing great dynamics, such as SW Episode II, using a recent version of Nero Recode. Extract one AAC track, convert it to 16-bit WAV and open it in Wavelab. Analyse the bit depth of the contained data. You'll get something far below 16 bits. Then take the DVD again, extract the same ac3 track, and convert it to aac using Besweet with proper normalization before the aac encoding. Convert the result to 16-bit WAV and open it in Wavelab again. Analysis will show full-blown 16-bit dynamics. That's not what I call "not much left anyway". Maybe you should check your own transcoding methods before making such claims.
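For anyone without Wavelab, a rough version of the bit-depth check above can be sketched in Python with the standard `wave` module. This is only an approximation based on the largest sample in the file, not Wavelab's own analysis. The helper names `read_wav_16bit` and `bits_used` are made up for this example.

```python
import struct
import wave


def read_wav_16bit(path):
    """Read all samples of a 16-bit PCM WAV file as a tuple of ints."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    # 16-bit PCM in WAV is little-endian signed; channels are interleaved.
    return struct.unpack("<%dh" % (len(frames) // 2), frames)


def bits_used(samples):
    """Rough count of bits (including sign) needed for the largest sample."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return 0
    # Magnitude bits plus one sign bit, capped at the 16-bit container.
    return min(16, peak.bit_length() + 1)


# Example with made-up sample values instead of a real file:
print(bits_used([0, 1023, -500]))     # a track peaking around +/-1023
print(bits_used([32767, -32768, 12]))  # a track reaching full scale
```

With a real file, `bits_used(read_wav_16bit("track.wav"))` would give the figure for the whole track (the file name here is just a placeholder).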
hhanh00
5th June 2005, 22:11
How do you normalize a 6 channel ac3 in order to properly encode it in aac with nero?
Thanks,
--h
JohnV
5th June 2005, 22:24
There have already been many postings about Recode's low audio. JohnV proposed a fix (http://forum.doom9.org/showthread.php?p=648752#post648752), which indicates that they just changed the static gain value. This has made the audio louder, but the trade-off is an increased probability of corruption.
Shouldn't corrupt, since "the fix" is a gain value used in Dolby decoders by default. We had clipping before with this value, but it was because of another problem.
I agree 2-pass and normalization would be the best way..
SeeMoreDigital
5th June 2005, 22:56
I agree 2-pass and normalization would be the best way..
2-pass, with or without normalization has got to make AAC sound way better anyway....
I guess this is why Micro$oft offers true 2-pass (VBR) for its own WMA file format ;)
Cheers
rpp3po
6th June 2005, 01:50
2-pass, with or without normalization has got to make AAC sound way better anyway....
The "2-pass" talked about here is not about data rate distribution, as in the case of VBR, but about finding audio peaks. Volume normalization is impossible with just one pass. The second pass of a VBR calculation applies a rate distribution curve; the second pass of a volume normalization just applies a constant gain to each sample.
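A minimal Python sketch of that kind of two-pass peak normalization, assuming the decoded audio is available as floats with full scale at +/-1.0. The function names and the `target_peak` parameter are made up for illustration; this is not how Recode or Besweet implement it.

```python
def find_peak(samples):
    """Pass 1: scan everything once and return the largest magnitude."""
    return max((abs(s) for s in samples), default=0.0)


def normalize(samples, target_peak=1.0):
    """Pass 2: scale every sample by the same constant factor."""
    peak = find_peak(samples)
    if peak == 0.0:
        return list(samples)  # pure silence, nothing to scale
    gain = target_peak / peak  # constant for the whole track
    return [s * gain for s in samples]


quiet = [0.1, -0.25, 0.05]
print(normalize(quiet))  # loudest sample now sits at full scale
```

The gain cannot be known before the whole track has been scanned, which is why a single pass is not enough. For multichannel audio one would presumably take the peak across all channels and apply the same gain to each, so the balance between channels stays intact.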
I doubt that there will be a 2-pass VBR AAC encoder from Ahead in the mid-term future. They are aiming at the standalone market. Standalones have got limited resources and are likely to prefer CBR a lot. The video is VBR already, and multiple concurrent VBR streams harm the predictability of the overall bitrate.
But I agree with you that a 2-pass AAC VBR solution would have the best achievable quality/size ratio.
Mug Funky
6th June 2005, 01:53
@ rpp3po:
chill out, man.
audio is recorded and mixed in 24 bits (or more, though A/D converters can't really do any more precision), but they will be mixed down to 16 bits when they get sent to be encoded to DVD.
ac3 is logarithmic in the same way that mp3 is... it stores exponents and mantissas separately. but i can assure you, the source is linear PCM, usually in 16 bits. trust me on this - i encode DVDs for a living, not just a hobby. i've never seen a DA-88 tape come in with higher than 16 bits, even though both the format and our equipment support it (DA-88 is an 8-channel digital tape that supports up to 192k, 24 bits, but is pretty much always 16/48). the digibeta tapes with 4 channels on them (only good for stereo, with 2 languages) can carry up to 20 bits, but will invariably have 16 bit audio on them. now i'm sure these 24 bit tapes exist, but i can assure you they probably don't make it onto the DVD. i've seen hundreds of DVDs pass through here, and none of them have had 24 bit audio on them, regardless of how the audio is actually stored in ac3. it's not a matter of poor quality control - the masters simply don't come in 24 bits. but i dare you to say you can hear the difference in a blind test with any regular run-of-the-mill 5.1 channel mix.
16 bits has a mathematical dynamic range of about 96dB, and with correct dithering it can be anything up to 115dB. this is plenty! it's much more than you'll get in a cinema, due to air conditioning, noisy amplifiers, projector noise, and the ever-present popcorn and plastic sounds. you'd be lucky to get 70dB of actual real-world dynamic range in a cinema.
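The 96dB figure follows from the usual rule of thumb of about 6dB per bit, i.e. 20*log10(2^bits). A small Python sketch, with a made-up function name, covers only this undithered figure, not the dithered one:

```python
import math


def dynamic_range_db(bits):
    """Theoretical dynamic range of linear PCM with the given bit depth."""
    return 20.0 * math.log10(2 ** bits)


for bits in (8, 11, 16, 24):
    print(bits, "bits:", round(dynamic_range_db(bits), 1), "dB")
```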
besides, lossy encoding means adding noise to a signal. the trick is in adding just enough noise that it's not going to be heard. this means the dynamic range is reduced, as the signal-to-noise ratio is lowered. transcoding lossy to lossy will only make this worse, and actually will raise the added quantization noise above the threshold of hearing. this is why i said not to be worried about a measly couple of bits in LPCM when the audio is being double-sledgehammered with both the ac3 and aac encoding processes.
my advice for DVD backup is to simply keep the ac3 track and not touch it - it's suffered enough.
rpp3po
6th June 2005, 02:55
my advice for DVD backup is to simply keep the ac3 track and not touch it - it's suffered enough.
That doing nothing is lossless is no news... :) But I agree that that would be the best way, really.
Sadly there is no MPEG-4 spec for embedded ac3 sound. If I want AVC video (AVC DVD-5 recodes look far superior to MPEG2-CCE shrinks), I must also choose AAC for standalone compatibility.
But I really don't get why a lossy process should be a reason to let things get even worse, rather than doing everything to keep the loss to the minimum possible.
A properly normalized 450 kbit/s 5.1 AAC transcode is really still of very good quality and dynamics. I don't want to stick with 11-bit equivalents instead. Just turning up the volume is nothing but plain compression. Dialogs don't sound much different from explosions anymore (concerning volume). That's easily distinguishable from properly normalized material on good equipment, a whole lot easier than 16/24 bit comparisons. Of course only if you're used to switching off DRC.
Mug Funky
6th June 2005, 03:10
ah, i meant turning up the volume on your stereo... not compressing (dynamic compression) the audio but just playing it back louder.
my DVD player doesn't support DRC, and to be honest i don't miss it. i like having explosions that wake up the neighborhood :)
yes, it'd be good to have ac3-in-mp4 capability, but for now i'm happy just QuEncing a DVD-9 down to a 5 (or just using a dual-layer disc and suffering the very slow layer break). truth be told, i haven't made a DVD backup in ages - no time.
rpp3po
6th June 2005, 03:38
ah, i meant turning up the volume on your stereo... not compressing (dynamic compression) the audio but just playing it back louder.
The combination
16 bit source -> 11 bit transcode due to missing normalization -> playback on 16-bit equipment -> turning up the volume till it is as loud as the source again
is home-made dynamic compression (just without the fancy crap such as dialnorm etc.).
But I get the impression that we're not too far away from each other. Sorry for barking at you...! ;)
Have a good day!
Mug Funky
6th June 2005, 04:56
where is this 11 bit coming from? 11 bit is about 30dB below fullscale. that's around the dialog level on an 83dB SPL mix (most are done at 89 for DVD), which is around the quietest you'll ever get in a sensible movie mix. only bits of foley, actuality and some atmosphere sounds will be lower than that.
DVD soundtracks are mixed to an approximate average of -20dB and a peak level of -10. often they're much louder than that, as the reason for a -10dB peak is an old technicality that only matters in very limited situations (some old equipment would spazz out on higher levels). colourbars and test tones on master tapes are standardised at -20dB, and this is what we use to get levels right for DVD encodes (it's a big assumption that the bars and tones will have anything to do with the program, but it's a standard and we have to stick with it. any variation is the fault of the engineer who mastered it).
having quiet sounds isn't the same as throwing out dynamic range. it's the opposite... low dynamic range is more like modern CDs which are crushed horribly against fullscale for the entirety of the album. you can convert some modern songs to 8 bits (48dB range!!) and not be able to ABX it with the original. the whole reason to have 16 bits and up is that you can have quiet sounds and loud sounds in the same program.
for a DVD to have an effective 11 bit program, you're talking about a PEAK level of less than -30dB. if you come across something like this, send the DVD back because somebody messed up.
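The link between a peak level and an "effective" bit count in a 16-bit file can be sketched like this, again using the roughly 6dB-per-bit rule. The function name and the `container_bits` parameter are made up for the example.

```python
import math

DB_PER_BIT = 20.0 * math.log10(2.0)  # roughly 6.02 dB per bit


def effective_bits(peak_dbfs, container_bits=16):
    """Bits actually exercised when the peak sits peak_dbfs below full scale.

    peak_dbfs is 0.0 at full scale and negative below it.
    """
    return container_bits + peak_dbfs / DB_PER_BIT


for peak in (0.0, -10.0, -20.0, -30.0):
    print(peak, "dBFS peak ->", round(effective_bits(peak), 1), "bits")
```

A -30dB peak lands at roughly 11 bits, while the usual -10dB peak still uses a bit over 14.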
SeeMoreDigital
6th June 2005, 07:55
I doubt that there will be a 2-pass VBR AAC encoder from Ahead in the mid-term future. They are aiming at the standalone market. Standalones have got limited resources and are likely to prefer CBR a lot. The video is VBR already, and multiple concurrent VBR streams harm the predictability of the overall bitrate.
In that case you'll be pleased to know that it's already possible to play 2Ch (and 6Ch) VBR AAC streams in stand-alone players!
Currently I use Foobar2000 with NeroDigital's AAC encoder as it offers a normalizer. It does seem to do a slightly better job at encoding AAC audio than Recode2....
That said, there's never been any doubt in my mind that a full 2-pass encode has got to bring better results ;)
Cheers
rpp3po
6th June 2005, 12:32
@Mug Funky
Yes, I'm talking about a -30 dB peak, not a -30 dB average. I did not say that the DVD source was mastered to this low level. Just the result of Nero Recode's conversion was that quiet. I guess they left headroom for their ac3 conversion because they hadn't got a 100% solution for avoiding clipping at that point.
It is strange that (even the latest) Recode 2.2.6.16 outputs aacs which have a lower peak than Besweet/Azid transcodes with 0 dB gain and DRC switched off! That means that there isn't just missing normalization but attenuation. That shouldn't be there at all.
Ok, the ac3 spec says that even a 0 dB (peak) leveled master can clip after the ac3 conversion stage because of the added quantization noise. The same would apply to the aac compression stage. A little attenuation to get clipping headroom for aac artifacts is ok, but I think Ahead took more than a little.
Concerning your modern music example I agree that there aren't much dynamics anymore. ABX'ing Madonna at 8 bit is nearly impossible. It's a shame that music like this is even released on 24 bit DVD-A.
However, there are no quiet kisses, crashing planes and earthquakes on there. For that I still need dynamics which go further than 11 bits. Concerning music there are luckily other alternatives, too.
Mug Funky
6th June 2005, 16:15
yikes! didn't realise it was THAT low. try using foobar2000 for decoding if it's that bad through nero. i've not actually tried recode as i don't like the interface.
btw, i've noticed that besweet and foobar decode differently, but i can't tell which does it better (wave subtraction doesn't tell anything about which was correct, just that there are differences in the high freqs that have nothing to do with dither). so maybe try both. they probably aren't audibly different, but it is disturbing to find differences.