Doom9's Forum - View Single Post

KpeX · 7th January 2004, 22:46

General Audio Procedures FAQ
Cross-format techniques for encoding, decoding, and playback.

1. What is the difference between lossy, lossless, and uncompressed audio formats?

Uncompressed audio is simply audio without any compression applied to it. Uncompressed audio is commonly used in AV conversions in PCM or WAV form.
[WAV is a container that can hold several kinds of audio data, not only uncompressed PCM samples]

Lossless audio applies a compression to uncompressed audio without losing any information or degrading the quality at all. Lossless audio is not common in the AV world, but it is possible with formats like WMA Lossless or FLAC in Matroska.
[TrueHD and DTS-HD MA are also lossless and are now used on Blu-ray Discs]

Lossy audio attempts to discard as much 'irrelevant' data as possible from the original, with the goal being to produce a file much smaller than the original that sounds almost identical. This results in a much lower bitrate and filesize than lossless or uncompressed audio. Lossy audio formats are extremely prevalent in AV, and include AC3, DTS, AAC, MPEG audio Layers 1/2/3 (MP1/MP2/MP3), Vorbis, and Real Audio.
[Vorbis has now largely been superseded by Opus]
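
To put rough numbers on the size difference, here is a minimal sketch that estimates payload size from bitrate and duration. The function names and the example figures (a two-hour 5.1 track as 48 kHz/16-bit PCM versus a 448 kbps AC3 stream) are illustrative assumptions, not values taken from this FAQ, and container overhead is ignored.

```python
def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Raw bitrate of uncompressed PCM in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


def stream_size_bytes(bitrate_bps: int, seconds: int) -> int:
    """Payload size of a constant-rate stream, ignoring container overhead."""
    return bitrate_bps * seconds // 8


two_hours = 2 * 60 * 60                       # duration in seconds (assumed)
pcm_rate = pcm_bitrate_bps(48000, 16, 6)      # 5.1 PCM, assumed example
pcm_size = stream_size_bytes(pcm_rate, two_hours)
ac3_size = stream_size_bytes(448_000, two_hours)  # 448 kbps AC3, assumed example

print(f"PCM: {pcm_size / 1e6:.0f} MB, AC3: {ac3_size / 1e6:.0f} MB")
```

With these assumed figures the lossy stream is roughly a tenth of the uncompressed size.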

We also talk about lossy and lossless processes. Whenever you transcode to a lossy format (for example WAV > MP3), a loss in quality follows, therefore it is a lossy process. Transcoding from a lossy format to a lossy format (for example MP3 > AAC) is even worse, since loss is introduced both by the first lossy file and by the encoding of the second.

2. What's CBR/ABR/VBR?

CBR means that the stream's bitrate is constant and never changes.
VBR means that the stream's bitrate is variable, and changes according to the amount of information that needs to be encoded.
ABR has a variable bitrate for each frame, but its average bitrate is a constant.

Conclusion:
CBR is a special case of ABR, which is a special case of VBR.

[For movie audio tracks VBR mode is highly recommended, because silences and dialogue only need a low bitrate and the bits saved there can be spent on the parts that need them.]
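
A small sketch of the three modes, using made-up per-frame bitrates in kbps (the frame values and helper names are assumptions for illustration only). It shows that a CBR stream and an ABR stream can share the same average while only the CBR stream is constant from frame to frame.

```python
def average_kbps(frame_kbps):
    """Mean bitrate over all frames."""
    return sum(frame_kbps) / len(frame_kbps)


def is_constant(frame_kbps):
    """True when every frame uses the same bitrate (the CBR case)."""
    return len(set(frame_kbps)) == 1


cbr_frames = [128, 128, 128, 128]   # never changes
abr_frames = [96, 160, 112, 144]    # varies per frame, averages to the 128 target
vbr_frames = [32, 48, 224, 256]     # follows the content; average is whatever results

for name, frames in (("CBR", cbr_frames), ("ABR", abr_frames), ("VBR", vbr_frames)):
    print(name, average_kbps(frames), is_constant(frames))
```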

3. What's the difference between stereo, joint stereo, and dual channel?

Dual channel has 2 mono channels, meaning each channel is encoded with half of the overall bitrate.
Stereo has 2 separate channels, but the bitrate allocation between those two channels changes according to the amount of information there is in each channel.
Joint stereo has 2 channels, but takes advantage of what is common between the channels, so the compression gain is higher.

4. What are the different kinds of joint stereo?

Two of the most used joint stereo modes are IS (intensity stereo) and M/S (Mid/Side). M/S matrixing computes the sum and the difference of the two audio channels and stores these as two channels. This method is very efficient and is a lossless process, which means the two original channels can be extracted exactly as they were.
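
The reversibility of M/S matrixing can be checked with a few lines. This is a minimal sketch on integer samples with hypothetical helper names; real encoders typically scale or round the mid channel differently, so treat it as an illustration of the principle rather than any codec's actual implementation.

```python
def ms_encode(left, right):
    """Turn L/R sample lists into mid (sum) and side (difference)."""
    mid = [l + r for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def ms_decode(mid, side):
    """Recover L/R exactly: mid + side = 2L and mid - side = 2R."""
    left = [(m + s) // 2 for m, s in zip(mid, side)]
    right = [(m - s) // 2 for m, s in zip(mid, side)]
    return left, right


left = [100, -3, 0, 32767]
right = [98, 4, 0, -32768]
mid, side = ms_encode(left, right)
print(ms_decode(mid, side) == (left, right))
```

When the two channels are similar, the side values stay small, which is where the compression gain comes from.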

Intensity stereo replaces the left and right channels with a single signal plus directional information. This method is lossy and destroys Dolby Pro Logic (DPL) information. It is only recommended at low bitrates.

Many encoders can use a combination of full stereo and either or both of these methods, deciding which to use on a per-frame basis.

5. How can I transcode AC3 5.1 to Pro Logic/Pro Logic II?

There are two steps to having a 2.0 channel stereo file that contains Dolby Pro Logic (II). First, a Dolby Pro Logic source is needed; in this case we apply a DPL downmix to the 5.1 source. This can be done easily with BeSweet: in the '-azid()' section add '-s surround' for Pro Logic or '-s surround2' for Pro Logic II.

[Instead of BeSweet you can now use ffmpeg, eac3to or BeHappy. There are many GUIs for ffmpeg and eac3to; I can support UsEac3to. On the command line I recommend ffmpeg (replace the uppercase names with your own values):

ffmpeg -drc_scale 0 -i INPUT51 -af "pan=stereo|FL=.3254c0+.2301c2+.2818c4+.1627c5|FR=.3254c1+.2301c2-.1627c4-.2818c5" -acodec ac3 -dsur_mode 2 -ab 192k OUTPUT.ac3
]
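
To see what the pan filter in the ffmpeg command computes, here is a sketch that applies the same coefficients to one 5.1 sample frame. It assumes the usual ffmpeg 5.1 channel order (c0 front left, c1 front right, c2 center, c3 LFE, c4 left surround, c5 right surround); the function name is hypothetical.

```python
def dpl2_downmix(fl, fr, fc, lfe, sl, sr):
    """Mix one 5.1 sample frame down to a stereo pair.

    Coefficients are copied from the pan filter; the LFE channel (c3)
    does not appear in that filter, so it is dropped here as well.
    """
    left = 0.3254 * fl + 0.2301 * fc + 0.2818 * sl + 0.1627 * sr
    right = 0.3254 * fr + 0.2301 * fc - 0.1627 * sl - 0.2818 * sr
    return left, right


# Center lands equally in both outputs.
print(dpl2_downmix(0, 0, 1.0, 0, 0, 0))
# A surround-only signal lands with opposite polarity in left and right.
print(dpl2_downmix(0, 0, 0, 0, 1.0, 0))
```

The four coefficients of each output add up to 1.0 in magnitude, so a full-scale input on every channel cannot exceed full scale after the downmix.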

Secondly, when the audio is encoded, the surround information matrixed into the two channels must be preserved. In order for this to happen, either full stereo or M/S joint stereo must be used - intensity stereo will destroy DPL information. More information on Dolby ProLogic & Dolby Surround can be found here. This can be done with all the common audio formats:

MP3: Use the LAME encoder, and use joint stereo. See the MP3 FAQ #9.
Vorbis: Use lossless or light lossy channel coupling.
AAC: Use M/S joint stereo, which pretty much all the AAC encoders use AFAIK.
MP2: MP2 specs do not support M/S, so you'll have to use full stereo.

6. Where can I find more information about audio coding formats and techniques?

http://www.hydrogenaudio.org [Recommended for audio software and listening tests; it also hosts the forum for the foobar2000 player and converter]

7. What is the best lossy audio format for me to use?

As you should know, forum rules prohibit asking what's best. In general your own ears are the best answer to this question.
[See the listening tests at hydrogenaudio, including a test with Opus and a multichannel test.

When selecting a lossy encoder, the first criterion is compatibility with the playback device you want to use.
On a PC or a powerful standalone player connected by HDMI to a modern AVR, all codecs should work, but other devices may have trouble.
Currently AAC is the most widely compatible choice for stereo, and AC3 for multichannel.

After that, and after checking the listening tests, the second criterion can be the quality/size ratio, where the order can be: Opus>AAC>EAC3>AC3.
Also remember that the listening tests show differences at low bitrates, but at high bitrates the differences are small.]

8. When converting, is it better to downsample my DVD audio to 44.1 kHz or keep it at 48 kHz?

Unless you have to meet a standard's specifications, such as 44.1 kHz for SVCD, there is usually no reason to downsample audio to 44.1 kHz. It is possible that downsampling will introduce rounding errors when finding new sample points, which will degrade the quality, but when using a high quality resampler such as SSRC, this quality loss will be extremely minuscule. It can also be argued that certain audio encoders, such as LAME, are better tuned for 44.1 kHz, but these differences are also very minuscule.
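
The reason new sample points have to be "found" at all is that 44.1 kHz and 48 kHz are not related by a whole number. A minimal sketch of the arithmetic (the helper name is an assumption):

```python
from fractions import Fraction

# Output samples per input sample when going from 48 kHz to 44.1 kHz.
ratio = Fraction(44100, 48000)
print(ratio)  # reduced form of the ratio


def resampled_length(n_samples: int, src_rate: int, dst_rate: int) -> int:
    """Number of output samples for n_samples of input, rounded to nearest."""
    return round(n_samples * Fraction(dst_rate, src_rate))


# One second of 48 kHz audio becomes one second of 44.1 kHz audio.
print(resampled_length(48000, 48000, 44100))
```

Since the ratio reduces to 147/160, only one output sample in every 147 coincides with an input sample; all others fall between input samples and must be interpolated, which is where the quality of the resampler matters.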

9. How can I change the samplerate of a wave file?

Download SSRC from here, and use the following command:

Code:

ssrc --rate 44100 input.wav output.wav

In this case, the destination sample rate is 44.1 kHz (44100 Hz).
[ffmpeg, eac3to, BeHappy (supported in this subforum) and others can now be used as well. The CLI command for ffmpeg:
ffmpeg -i INPUT.wav -af "aresample=44100" OUTPUT.wav]

10. What is SPDIF [/HDMI] and when is it used?

SPDIF (Sony/Philips Digital Interface) is a physical digital interface (_not_ analog). The connection can be either coax or optical (fibre). Being digital, it should in general suffer less from noise interference, in my experience (this varies depending on hardware and source). The SPDIF interface can pass PCM [only 2.0], DTS or Dolby Digital streams to your receiver/amplifier for decoding. In theory, your SPDIF capable sound card passes the audio packets without modification (it is only theory because much discussion has taken place as to whether Creative's sound cards alter packets).

[HDMI is the newer physical digital interface, with its own cable, for transferring audio/video (AV) between devices.
For audio, it supports the same formats as SPDIF plus TrueHD, DTS-HD and multichannel PCM.]

11. Is it possible to back up music from a concert DVD onto an audio CD?

Yes, the AC3/DTS needs to be demuxed and then converted to a PCM-WAV, Dolby Digital-Wav, or DTS-Wav.

Check out BeSure from Doom9's download page. It can create DTS-CD, AC3-CD & plain CDDA.
[Instead of BeSure, BeSweet (with the old ac3enc) and SurCode (commercial DTS encoder), you can now use ffmpeg to resample and encode to AC3 or DTS:
ffmpeg -i ANY_INPUT51 -vn -af "aresample=44100" -strict -2 -c:a dca -b:a 1411.2k OUTPUT.dts
or
ffmpeg -i ANY_INPUT51 -vn -af "aresample=44100" -c:a ac3 -b:a 640k OUTPUT.ac3
and afterwards convert the result to AC3-WAV or DTS-WAV with spdifer, a tool from AC3Filter Tools:
spdifer OUTPUT.ac3 OUTPUT.wav -wav
]
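
The 1411.2k figure in the DTS command is the raw data rate of CD audio, which a quick calculation confirms. This is only a sketch of that arithmetic; the constant names are assumptions.

```python
CD_SAMPLE_RATE_HZ = 44100
CD_BITS_PER_SAMPLE = 16
CD_CHANNELS = 2

# Bits per second carried by plain CDDA.
cd_bps = CD_SAMPLE_RATE_HZ * CD_BITS_PER_SAMPLE * CD_CHANNELS
cd_kbps = cd_bps / 1000
print(cd_kbps)

# Share of that rate used by the 640 kbps AC3 stream from the second command.
ac3_share = 640_000 / cd_bps
print(f"{ac3_share:.0%}")
```

So the DTS stream is encoded at the full rate a CD carries, while the 640k AC3 stream uses less than half of it.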

12. When transcoding 5.1 AC3 to 2-channel audio, which AC3 stream should I decode? 5.1 or 2.0?

The 2.0 stream has a better mix for 2 channels than the 5.1 to 2.0 downmix features offered by AC3 decoding tools. Therefore, it should be best to work with the 2.0 stream. Transcoding the 5.1 stream and downmixing it manually will only be useful for users who have a surround system that has a Dolby ProLogic (II) decoder. Some 2.0 tracks aren't downmixed for Pro Logic, so in that case you must take the original 5.1 track and downmix it with a transcoder that is capable of doing it. More info on Dolby ProLogic & Dolby Surround can be found here.

13. How can I extract an audio stream (AC3/MP3/OGG/WAV) from an AVI file?

Download VirtualDubMod and open your AVI file. Click streams -> stream list. Click on the audio stream you want to extract; then click demux to extract the stream losslessly, or click decode to decompress the audio to a .wav file. This will work for any audio stream in an .avi file (first/second etc.)

[With ffmpeg, stream number X (in stream order) can be extracted from any container with:
ffmpeg -i ANY_CONTAINER -map 0:X -acodec copy OUTPUT.EXT]
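
If you extract streams from a script, the ffmpeg command can be assembled as an argument list. This is a minimal sketch: the function name and the file names are hypothetical, and it only builds the command without running it.

```python
import shlex


def build_extract_command(container: str, stream_index: int, output: str) -> list:
    """Build the ffmpeg arguments that copy one stream out of a container."""
    return [
        "ffmpeg",
        "-i", container,
        "-map", f"0:{stream_index}",  # stream X of the first input
        "-acodec", "copy",            # copy the audio, no re-encoding
        output,
    ]


cmd = build_extract_command("movie.mkv", 1, "track.ac3")  # hypothetical names
print(shlex.join(cmd))
```

With ffmpeg installed, the list can be passed to subprocess.run(cmd, check=True). ffmpeg's stream specifiers also accept the form 0:a:0, which selects the first audio stream regardless of its position among the other streams.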

14. How can I split my AC3/DTS/AAC/MP3/MP2/MPA/WAV track into several separate, shorter tracks?

Use BeSplit, it was written especially for that. Note that (Chapter-X-tractor) can generate BeSplit's commandline for .ifo files.

An alternative for DTS is (DTS Trimmer/Concatenator).

[The last BeSplit version (v09b8) can be downloaded here.
Another tool for splitting some audio files is DelayCut]

15. My AVI plays with no audio. What can I do?

First of all, reread the forum rules: you will get no help with downloaded or illegally obtained files here.

If the file was created by yourself, you'll simply need to install a DirectShow filter capable of rendering the audio format. Check Doom9's download page for some common audio filters. GSpot, also available on the download page, can be of assistance in identifying what audio is contained within an AVI.

[The DirectShow filters now recommended are the LAV Filters, which are included in the recommended player MPC-HC (of course there are others: VLC, MPC-BE...).
To obtain information about multimedia files, use MediaInfo.]

16. Please post a list of the most recommended Winamp plugins.

[As an audio player I now recommend foobar2000]

17. Can I decrypt/rip/create a DVD-A (DVD-Audio) disk? What format does DVD-A use?

[Use MakeMKV to decrypt/rip; the resulting MKV files play without problems]

DVD-A uses either LPCM or a lossless encoding called MLP (Meridian Lossless Packing), and supports up to 24 bit and 192 kHz audio.

As far as creation goes, the dvd-audio project aims to provide audio enthusiasts with a set of free software tools to enable the authoring of DVD-Audio disks compliant with hardware DVD-Audio players, plus a software player for such disks. Note that this project is still in an early stage of development.

If you have deep pockets, Minnetonka's (DiscWelder Chrome) is a commercial DVD-Audio authoring application.

18. Is it possible to use Musepack (MPC) audio with video?

At this time it is not yet possible in any container format. The developers of Matroska have plans to include Musepack audio in Matroska, but due to the opinion of the lead developer of Musepack and the bitstream format of Musepack this will not happen until the SV8 bitstream (the next version of Musepack) is finalized. However there may be a Musepack SV7>'SV7.5' bitstream converter, so your current MPC files might be possible in Matroska in the future.
[Musepack SV8 is still not supported]

19. How can I convert the frame rate of my audio with freeware?

First of all, for the nth time, audio does not have a framerate. When someone talks about changing the framerate of audio, they are talking about time stretching the audio to match a video framerate that has been converted. Frame rate conversions (FRC) can be done with several commercial tools, but that is not the focus of this FAQ; use Google to find these.

A 'framerate conversion', which we will refer to as a speed changing operation, can be done either with or without pitch correction. When done without pitch correction, if the speed change is large, you will notice that the audio sounds too high or low pitched, much like changing the speed of a tape or record player. Such an operation can be done with BeSweet, simply add a

Code:

-ota( -r 23976 25000 )

to the ota section of your BeSweet command line, where in this case the original video framerate was 23.976 and the final framerate is 25.000.

A speed change with pitch correction is also known as a tempo change; the speed at which the audio is played back changes without changing the pitch from the original. To do this, you can use the freeware tool Audacity, or use BeSweet v1.5b28 or newer with the following on the command line:

Code:

-soundtouch( -r 23976 25000 )

[ffmpeg can also do the job with a filter. To speed up without changing the pitch: -af "atempo=1.042708"
To speed up and change the pitch (for a 48 kHz source): -af "asetrate=50050, aresample=48000"
There is also SoundStretch.]
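
The two numbers used in the ffmpeg filters follow directly from the frame rates. A minimal sketch of the calculation, assuming the source is exactly 24000/1001 fps (the precise value behind "23.976") and a 48 kHz audio track:

```python
from fractions import Fraction

src_fps = Fraction(24000, 1001)   # "23.976" fps, assumed exact NTSC film rate
dst_fps = Fraction(25)            # PAL frame rate

# Factor by which playback speeds up.
speed = dst_fps / src_fps
atempo = round(float(speed), 6)   # value for the atempo filter

# Telling ffmpeg the 48 kHz samples were recorded at this rate speeds
# them up by the same factor (and raises the pitch).
asetrate = 48000 * speed

print(speed, atempo, asetrate)
```

For the opposite direction (25 to 23.976) the factor is simply inverted, and for a 44.1 kHz track the asetrate value has to be recomputed from 44100.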

Last edited by tebasuna51; 30th September 2021 at 13:23. Reason: update