Quality loss when converting ac3->wav->vorbis?
ultimatebilly
12th April 2002, 10:03
What I mean is this:
Is there any quality loss when I extract the ac3-sound out of the vobs with vob2audio, and then convert the resulting wav to vorbis instead of converting the ac3 to vorbis directly (with headac3he)? Or is it just the same?
The reason I want to do this is that I don't want to use azid to downsample the audio source.
To my ears, wav files generated with azid always seem to be slightly quieter than wav files I generated with vob2audio (but maybe I'm just imagining things...), just as the sound of WinDVD seems a little too quiet (to me) compared with the sound of PowerDVD when playing back the same DVD.
DSPguru
12th April 2002, 10:13
1. direct conversion (ac3->ogg) will give you better quality compared to an ac3->wav->ogg process.
2. azid doesn't downsample. it's out of its scope.
3. azid supports 'dialogue normalization', and in case you used it, you might have got a lower volume compared to vob2audio's result.
4. for the maximum volume use normalization ("find max gain").
5. read the guides and faqs ;).
Dg.
ultimatebilly
12th April 2002, 13:36
I already tried "auto find maximum gain" (with gknot, which uses this setting as a default in azid), but the volume still didn't seem to be the same as with vob2audio. But if the direct conversion with azid yields better results, I will try to find the right settings for my ears...
tangent
13th April 2002, 06:57
Theoretically, direct conversion works better because, while .wav is a lossless medium, storing the intermediate audio in it requires truncating the sample values to 16 bits, whereas direct conversion allows you to pass sample values at 24 bits or higher accuracy in floating point.
In practice, even the best ears are unlikely to hear the difference.
Since most people play back at 16 bits, the errors you get are mostly quantisation errors in the 16th bit. This quantisation noise is usually below the noise floor of many audio systems, and the majority of the best ears out there cannot hear even a difference waveform of 2 bits out of 16. This means you are much less likely to hear difference waveforms of 1 bit out of 16, and less likely still to hear the differences when they are masked by the main audio.
DSPguru
13th April 2002, 07:48
your claim is only true for a normalized signal.
if the signal isn't normalized (the usual case), the comparison isn't between 16bit and 24bit (only the mantissa), it's a comparison between 32bit floating-point and 16bit integers. in that case, your int wave could be using only 12bit, and that error is surely audible.
anyway, direct conversion is also faster, and you can avoid creating a huge temporary wav file on your h.d.
so, any way you put it, direct conversion is better.
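To put rough numbers on the unnormalized case, here is a minimal Python sketch. It is not taken from any tool mentioned in this thread: the 997 Hz test tone, the -24 dB level and the helper names are all assumptions made for illustration. It stores the same quiet signal once as 16-bit integers and once as 32-bit floats and measures the resulting signal-to-noise ratio.

```python
import math
import struct

def to_int16_and_back(x):
    """Round a sample in [-1.0, 1.0] to the 16-bit integer grid (illustrative helper)."""
    return round(x * 32767) / 32767

def to_float32_and_back(x):
    """Round a sample to 32-bit floating point and read it back."""
    return struct.unpack("f", struct.pack("f", x))[0]

def snr_db(signal, store):
    """Signal-to-noise ratio in dB after passing every sample through `store`."""
    noise = sum((s - store(s)) ** 2 for s in signal)
    power = sum(s * s for s in signal)
    return 10 * math.log10(power / noise)

# Assumed test signal: one second of a 997 Hz tone at 48 kHz, 24 dB below full scale.
gain = 10 ** (-24 / 20)
quiet = [gain * math.sin(2 * math.pi * 997 * n / 48000) for n in range(48000)]

print("16-bit int  :", round(snr_db(quiet, to_int16_and_back), 1), "dB")
print("32-bit float:", round(snr_db(quiet, to_float32_and_back), 1), "dB")
```

The integer figure should land roughly 24 dB below what a full-scale tone would get, which is the "only using 12bit" effect, while the float figure barely depends on the level. Whether that gap is audible is a separate question that the sketch does not settle.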
tangent
14th April 2002, 18:51
YMMV, but I haven't seen normalisation gains for the movies I have encoded going above 12dB. A signal requiring 12dB gain to normalize would mean it would have a dynamic range using 14 out of the 16 bits. A signal using only 12 out of the 16 bits dynamic range would require a 24dB gain to normalize, not sure how often you will come across that.
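The bit counts above follow from the rule of thumb that each bit is worth about 6.02 dB (20*log10(2)). A small Python sketch of that arithmetic, with function names made up for illustration:

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB of dynamic range per bit

def bits_used(normalisation_gain_db, container_bits=16):
    """Bits of the container a signal occupies if it needs this much gain to reach full scale."""
    return container_bits - normalisation_gain_db / DB_PER_BIT

def gain_needed_db(bits_in_use, container_bits=16):
    """Gain in dB needed to bring a signal using `bits_in_use` bits up to full scale."""
    return (container_bits - bits_in_use) * DB_PER_BIT

print(round(bits_used(12), 2))       # close to 14 bits
print(round(gain_needed_db(12), 2))  # close to 24 dB
```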
Let's say you have a source signal at 12 bits out of 16, you encode it and decode it to a 16 bit wav, a 24 bit wav and a 32 bit floating point wav. Your 24 bit wav or 32 bit floating wav will not have better quality than the 16 bit wav, because quality is still limited by the SNR of the original recording. You cannot increase SNR just by increasing the dynamic range of a signal. Similarly you can't increase the quality of a 128kbps MP3 by reencoding it at 320kbps.
Let's say the source signal going into the encoder is actually 24 bits but uses only 20 of the 24 for dynamic range, which therefore becomes 12 out of 16 when decoding to 16 bits. However, if you are normalising, wouldn't you be decoding with a +24dB gain? So you will still have the full dynamic range of the 16 bits at your disposal. However, even if you don't normalize the decoding, it is still questionable if the drop in SNR is anywhere near the drop in SNR from the lossy encoding process itself (if you have seen difference wavs between an encoding and original, it often goes beyond 4 bits of error), and we get back to the problem I mentioned earlier, where increasing the dynamic range doesn't do anything because SNR is already below that of a 16bit wav.
By the way, if you really think about it, a 32bit fixed point will give you better SNR than 32bit floating point. If you need me to explain why, I'll be happy to do so.
As for direct conversion being faster, there is one case (which occurs pretty often, maybe more often than not) where it would actually be faster to go through the intermediary wav, and that's if you want to normalize before you decode. Without the intermediary wav, you need to go through 2 decoding passes and 1 encoding pass to encode with normalization (the 2nd decoding pass is done at the same time as the encoding pass but takes up the same time and computations as 2 separate passes). With the intermediary wav, you don't need 2 decoding passes, just 1 decoding pass and 1 encoding pass. I won't argue that without normalization, it will take slightly longer with the intermediary file (equivalent to the time taken to write to and read from the HD, not considering that it can be done in parallel with computations), but it's nothing compared to the difference when you are going to normalize.
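The pass counting can be sketched in Python as follows. Everything here is a hypothetical stand-in: `CountingDecoder` is not a real AC3 decoder, it only counts how often it is asked to decode, and the sketch ignores both the encoder and the 16-bit rounding a real intermediary wav would add.

```python
class CountingDecoder:
    """Hypothetical stand-in for an AC3 decoder; it only counts decode passes."""

    def __init__(self, samples):
        self._samples = list(samples)
        self.passes = 0

    def decode(self):
        self.passes += 1
        return list(self._samples)

def peak_gain(samples):
    """Gain that brings the loudest sample to full scale (1.0)."""
    peak = max(abs(s) for s in samples)
    return 1.0 / peak if peak > 0 else 1.0

def direct_with_normalisation(decoder):
    gain = peak_gain(decoder.decode())           # pass 1: decode only to find the peak
    return [s * gain for s in decoder.decode()]  # pass 2: decode again, scale, hand to encoder

def via_intermediate_wav(decoder):
    wav = decoder.decode()                       # single decode, kept as the intermediary wav
    gain = peak_gain(wav)                        # peak search reads the wav, not the AC3
    return [s * gain for s in wav]

direct = CountingDecoder([0.1, -0.25, 0.2])
staged = CountingDecoder([0.1, -0.25, 0.2])
direct_with_normalisation(direct)
via_intermediate_wav(staged)
print(direct.passes, staged.passes)
```

Both routes end up with the same scaled samples in this toy model; only the number of decoding passes differs.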
DSPguru
14th April 2002, 20:02
a. i always use dialog normalization. maximum gain gets up to 24db, yes. in all my transcoding, gain level was higher than 12db.
b. my source signal isn't 12bit. your second paragraph seems irrelevant.
c. quality is better, and users definitely can hear it. read here (http://forum.doom9.org/showthread.php?s=&threadid=23014) !
d. in our case : 32bit fixed is better than 32bit floats that is better than 24bits fixed that is better than 16bits ints.
e. one-pass normalization changes the whole picture :
Originally posted by tangent
Therefore, I have come up with this process, which I have been using. I know that there will probably be some people unhappy with my process for some reason or another, mainly because it's different from how they have been doing it, as well as being different from what respected people have been recommending in their guides. However, I would ask that everyone review this process and perhaps try it out, and I'm sure that you'll be happier for it.
1) Use Azid to convert to wav (without normalising gain)
2) Use Lame to convert to mp3 with the command line --dm-preset <bitrate> --resample 44 --scale 1
3) Use MP3Gain set at 90.5dB to normalize the mp3
What are the advantages with this process?
1) No more noisy/lossy scaling
2) No more clipping
3) Best quality possible mp3 encoding
4) Much faster. The MP3Gain process (both passes) is much faster than the first pass Azid process
Cheers,
Dg.