Quu
15th August 2010, 07:03
I want to test the sync of decoding mp3 and aac audio sources. I will be using the WAVSource, AVISource, DirectShowSource, NicMPG123Source, bassAudioSource and FFAudioSource API calls when I can.
What I found: for decoding MP3 files, use NicAudio if possible; DirectShow might also work. For AAC decoding, there is currently no solution in AVISynth that delivers the audio in proper sync.
Setup
Windows Vista 64 bit (32 bit executable used)
I am using Audacity 1.3.12-beta (Unicode)
I am using AVISynth 2.5.8
I am using WAVI 1.06
I am using FFMpegSource ffms2-mt-r318
I am using Nicaudio 2.04
I am using AudioGraph audgraph_25_dll_20040318
I am using VirtualDub 1.9.9
I am using FAAC 1.28
I am using LAME 3.98.4
I am using BePipe (v1.0.2155.27457 from file properties)
I am using BassAudio bass.dll 2.4
I am using the BassAudio AVISynth Plug ins from BeHappy 20091012
I have Nero 8 installed, so it will be the DirectShow Filters
MP3 - http://otakuvideo.com/~quu/AVS_AudioTimings/FilterGraph-mp3.png
AAC - http://otakuvideo.com/~quu/AVS_AudioTimings/FilterGraph-AAC-MP4.png
I am using VLC 1.1.2
I am using Winamp 5.581
I am using dbPoweramp 13.4
All source files are at http://otakuvideo.com/~quu/AVS_AudioTimings/
For all VirtualDub inspections, a stop point is put at frame 4001, the video is rewound, and then played to the stop (frame 4000), making sure that no errors in seeking show up as sync issues. I picked frame 4000 because it was easy to remember, and in the MP3 I am using for a reference, an "interesting" wave form occurs there.
I am starting with this MP3
http://freemusicarchive.org/music/Fabrizio_Paterlini/Remixed/01_Forever_blue_March_Rosetta_re-imagined_version
I renamed it to Test-MP3.mp3 and used Winamp to remove the album art and ID3 tags.
Testing
I loaded the MP3 file into Audacity and exported it as a WAV. To make sure the WAV and the MP3 match, I reloaded the WAV into Audacity and zoomed in really close to compare the waveforms; they match. I then ran this WAV file through FAAC with all default options, saving it to an .mp4 file and to an .aac (MPEG transport stream) file. Using Audacity, I checked the waveforms of all of these files; they match perfectly.
Top is the uncompressed WAV, middle is the FAAC encoded AAC file, and on the bottom is the original MP3
http://otakuvideo.com/~quu/AVS_AudioTimings/Audacity-SyncCheck.png
Top is the uncompressed WAV, middle is the AAC file in the transport stream, and on the bottom is the AAC in an MP4 file
http://otakuvideo.com/~quu/AVS_AudioTimings/FAAC-MP4vsTS-SyncCheck.png
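The visual comparison in Audacity could also be cross-checked numerically. The following is only a minimal sketch, not what I actually ran: it assumes the two decodes are already available as plain lists of sample values (reading the files is left out), and the function name best_lag is made up for this example. It tries every shift within a window and keeps the one where the two signals line up best.

```python
def best_lag(reference, candidate, max_lag):
    """Return the shift (in samples) at which candidate best matches reference.

    A positive result means candidate is late relative to reference,
    a negative result means it is early. Brute-force cross-correlation,
    so only suitable for short excerpts.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(candidate):
                score += r * candidate[j]
        if score > best_score:
            best, best_score = lag, score
    return best


if __name__ == "__main__":
    # Synthetic data: a short burst, and the same burst delayed by 3 samples.
    reference = [0] * 10 + [1, -2, 3, -1] + [0] * 10
    delayed = [0] * 3 + reference
    print(best_lag(reference, delayed, 8))
```

A result of 0 on an excerpt around an "interesting" part of the wave form would agree with what the Audacity screenshots show.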
Now I am building a set of "baseline-XXX.avs" files which will draw the waveforms as a video so I can visually check the sync. I am going to decode the MP3 via DirectShowSource, FFAudioSource, bassAudioSource, and NicMPG123Source. The AAC files will be decoded with DirectShow and with FFAudioSource twice (once as a transport stream, once as an MPEG-4). The WAV file will be decoded with WAVSource. All of these waveform videos will then be stacked on top of each other so the sync of the different decoding methods can be checked.
MP3 - http://otakuvideo.com/~quu/AVS_AudioTimings/baseline-MP3.png
AAC - http://otakuvideo.com/~quu/AVS_AudioTimings/Baseline-AAC.png
Right away with the MP3 comparison, the bassAudio is about a frame fast. (If there is an image, album art, in the ID3 of the mp3, FFAudioSource is about a frame slow, but that is why we removed it). The AAC decoders were much cleaner, though they all ran about half a frame behind the WAVSource baseline, which is strange since Audacity showed that they were all in sync with the WAV.
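To put "about a frame" and "half a frame" into audio terms, the offsets can be converted to milliseconds and samples. This is a rough sketch only: the frame rate and sample rate of the test clips are not stated in this post, so the 25 fps and 44100 Hz used below are assumptions for illustration, and the function names are made up.

```python
def frames_to_ms(frames, fps):
    """Duration of a number of video frames in milliseconds."""
    return frames * 1000.0 / fps


def frames_to_samples(frames, fps, sample_rate):
    """Number of audio samples (rounded) spanned by a number of video frames."""
    return round(frames * sample_rate / fps)


if __name__ == "__main__":
    fps = 25             # assumed, not taken from the test files
    sample_rate = 44100  # assumed, not taken from the test files
    for frames in (1, 0.5):
        print(frames, "frame(s):",
              frames_to_ms(frames, fps), "ms,",
              frames_to_samples(frames, fps, sample_rate), "samples")
```

Under those assumed rates one frame is 40 ms (1764 samples) and half a frame is 882 samples, which is far more than a rounding error at the sample level.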
LAME was also used to decode the original MP3 into an uncompressed WAV file. This was not used as the baseline WAV file because Audacity showed its sync to differ from the original MP3. But when this WAV file is used as the baseline, bassAudio is the one with the correct sync.
http://otakuvideo.com/~quu/AVS_AudioTimings/baseline-MP3-LAME-WAV-Comparison.png
I decompressed the original MP3 using Audacity, VLC, LAME, Winamp, and dbPoweramp. The VLC and Audacity outputs were identical, while the LAME, Winamp, and dbPoweramp outputs were identical to each other (but out of sync according to Audacity).
http://otakuvideo.com/~quu/AVS_AudioTimings/Synccheck-Audacity-VLC.png
http://otakuvideo.com/~quu/AVS_AudioTimings/Synccheck-Audacity-LAME.png
http://otakuvideo.com/~quu/AVS_AudioTimings/Synccheck-Audacity-Winamp.png
http://otakuvideo.com/~quu/AVS_AudioTimings/Synccheck-Audacity-dbPoweramp.png
Accuracy goes out if you "scrub" to the location instead of playing to that location. If I scrub with the MP3, the DirectShow decoder fails horribly, but the rest are accurate. If we seek ahead with the AAC sources, the DirectShow decoder is fine, while the two MP4 sourced decoders run a fraction of a frame slow.
MP3 - http://otakuvideo.com/~quu/AVS_AudioTimings/baseline-MP3-Seeked.png
AAC - http://otakuvideo.com/~quu/AVS_AudioTimings/Baseline-AAC-Seeked.png
For both types of compressed audio, we got different timings from the same source files based on how we opened the files. Is WAVSource the correct one, or is one of the other methods correct? One possible way to find out is to re-encode from the source several times in iteration, then compare the timing to the originals; whichever stays stable is the "right" one.
The first is the easiest: testing the sync of WAVSource. I have an AVS that opens the WAV file with WAVSource, and then I iterate 5 times using "wavi Test-WAV-[n].avs Test-WAV-[n+1].WAV", with the 5 AVS files prepped. Any sync or timing issues from WAVSource, unless compensated for by "wavi", will show themselves by iteration 5.
Rendered - http://otakuvideo.com/~quu/AVS_AudioTimings/WAVSource-TestResults.png
Audacity - http://otakuvideo.com/~quu/AVS_AudioTimings/WAVSource-TestResults-Aud.png
And by iteration 5, there is no change or adjustment at all to the results, the sync looks perfect.
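The five prepped AVS files and the matching wavi calls could be written out by a small helper like the one below. Treat it as a sketch under assumptions: I am assuming that each Test-WAV-[n].avs contains nothing but a WAVSource call on Test-WAV-[n].WAV, and that Test-WAV-0.WAV is the baseline WAV; the helper name is made up. It only builds the text and the command lines, it does not run anything.

```python
def wav_iteration_plan(iterations=5):
    """Build (avs_name, avs_text, command) for each re-render generation.

    Generation n reads Test-WAV-n.WAV through WAVSource and is rendered
    by wavi into Test-WAV-(n+1).WAV.
    """
    plan = []
    for n in range(iterations):
        avs_name = f"Test-WAV-{n}.avs"
        # Assumed script content: a single WAVSource call and nothing else.
        avs_text = f'WAVSource("Test-WAV-{n}.WAV")\n'
        command = f"wavi {avs_name} Test-WAV-{n + 1}.WAV"
        plan.append((avs_name, avs_text, command))
    return plan


if __name__ == "__main__":
    for avs_name, avs_text, command in wav_iteration_plan():
        print(f"{avs_name}: {avs_text.strip()}")
        print(f"  {command}")
```

The last generation produced this way is Test-WAV-5.WAV, which is the file to compare against the original in Audacity.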
For MP3 sync, it should be easier, as there are fewer variables. Before I start using BePipe to re-encode the MP3 from each of the sources, I need to test LAME with raw files to make sure any sync issues come from the AVISynth source, and not from LAME.
I re-encode the original MP3 and the baseline WAV file with LAME. I use the "--preset insane" command line option (and the sync issues happened with all settings I tried). The resulting MP3 file encoded by LAME does not, according to Audacity, have the same sync as the original files.
The top is the original MP3, the middle is the uncompressed WAV, and on the bottom is the resulting MP3
MP3 to MP3 - http://otakuvideo.com/~quu/AVS_AudioTimings/Lame-Native-MP3-2-MP3-Reencode.png
WAV to MP3 - http://otakuvideo.com/~quu/AVS_AudioTimings/Lame-Native-WAV-2-MP3-Reencode.png
The MP3 source is closer than the WAV source, but they should be dead on. When I ran FAAC on the same uncompressed WAV, the sync was perfect. Because LAME cannot create an MP3 file with the same sync as its source, I won't be able to use it in my testing. But since WAVSource was found to be accurate, we can assume that DirectShow and NicAudio are the proper ways to decode MP3 files in AVISynth, and given that DirectShow depends on whichever filter is installed, Nic's might be best.
To see if maybe LAME was the reference, and all the other filters were wrong, I re-encoded the original MP3 file 5 iterations deep. LAME was the only tool used; it decoded the previous MP3 and encoded that iteration's MP3.
http://otakuvideo.com/~quu/AVS_AudioTimings/LAME-MP3-Reencoded-5ITR.png
By the 5th iteration, the sync between the original MP3 (on top) and the 5th re-encode (bottom) is off. This means that the LAME decoder is off sync, or the LAME encoder is off sync, or possibly both. This bothers me greatly, as I expect an encoder to encode what it is given, not add or remove bits.
That leaves AAC sync testing. FAAC has already shown that the AAC file it creates is in sync with the WAV file it came from, but first we need to make sure that the output from BePipe to FAAC is as accurate. We can use the AVS files from the WAVSource testing to check. I run the command "BePipe --script "Import(^Test-WAV-0.avs^)" | FAAC -o Test-AAC-BePipe-Check.aac -"
Top is WAV, bottom is resulting AAC - http://otakuvideo.com/~quu/AVS_AudioTimings/FAAC-BePipe-SyncCheck.png
As you can see, BePipe and FAAC do not introduce any sync issues that can be shown by Audacity. So now to the re-encoding.
As the sync in AVISynth is identical for all 5 input methods, only one of them needs to be tested. The encoder does not change based on the source, and judging by the waveform in the video, the resulting decoded audio is identical. For this test, I am going to use FFAudioSource on the transport stream (.aac) and repeat the iteration 5 times: "BePipe --script "Import(^Test-AAC-[n].avs^)" | FAAC -o Test-AAC-[n+1].aac -"
Rendered - http://otakuvideo.com/~quu/AVS_AudioTimings/FFAudioSource-AAC-TestResults.png
Audacity - http://otakuvideo.com/~quu/AVS_AudioTimings/FFAudioSource-AAC-TestResults-Aud.png
As you can see, the timing drifts further and further with each subsequent re encode, showing that the delay induced by the AAC decoding is a real delay. This conflicts with my previous comparison between FAAC and Nero, but in that one I used the MP4 instead of the transport stream. So I am going to repeat this test with the only change being using the MP4 instead of the TS versions.
Rendered - http://otakuvideo.com/~quu/AVS_AudioTimings/FFAudiosource-MP4-TestResults.png
Audacity - http://otakuvideo.com/~quu/AVS_AudioTimings/FFAudiosource-MP4-TestResults-Aud.png
Well, the results are obvious: the AAC sources are slightly behind when decoded with AVISynth. When I checked the results of my previous Nero vs. FAAC testing, I only did three iterations and did not use Audacity. Increasing the iterations to 5 and using Audacity has shown the audio drift, even for FAAC sources.
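Why more iterations make the drift easier to see: if every decode and re-encode generation adds roughly the same delay, the offset against the original grows linearly with the generation number. The sketch below fits that simple model. It is an illustration only; the constant-delay model is an assumption, and the offsets in the example are invented numbers, not measurements from my screenshots.

```python
def per_generation_delay(offsets):
    """Estimate the delay added per re-encode generation, in samples.

    offsets[n] is the offset of generation n against the original
    (offsets[0] is the original itself, so it should be 0).
    Least-squares fit of offset = n * delay through the origin.
    """
    numerator = sum(n * off for n, off in enumerate(offsets))
    denominator = sum(n * n for n in range(len(offsets)))
    if denominator == 0:
        raise ValueError("need at least one re-encoded generation")
    return numerator / denominator


if __name__ == "__main__":
    # Invented example values: each generation lands 100 samples later.
    made_up_offsets = [0, 100, 200, 300, 400, 500]
    print(per_generation_delay(made_up_offsets))
```

With a model like this, a delay that is hard to spot after three generations is noticeably larger after five, which fits what the longer run showed.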