Old 15th August 2010, 07:03   #1  |  Link
Quu
Registered User
 
Join Date: Jul 2002
Location: Atlanta, GA
Posts: 33
Audio Sync for MP3 and AAC files in AVISynth

I want to test the sync of decoding mp3 and aac audio sources. I will be using the WAVSource, AVISource, DirectShowSource, NicMPG123Source, bassAudioSource and FFAudioSource API calls when I can.

What I found is that when decoding mp3 files, use NicAudio if possible, and DirectShow might work also. For AAC decoding, there is no solution currently in AVISynth that will provide the audio in the proper sync.

Setup
Windows Vista 64 bit (32 bit executable used)

I am using Audacity 1.3.12-beta (Unicode)
I am using AVISynth 2.5.8
I am using WAVI 1.06
I am using FFMpegSource ffms2-mt-r318
I am using Nicaudio 2.04
I am using AudioGraph audgraph_25_dll_20040318
I am using VirtualDub 1.9.9
I am using FAAC 1.28
I am using LAME 3.98.4
I am using BePipe (v1.0.2155.27457 from file properties)
I am using BassAudio bass.dll 2.4
I am using the BassAudio AVISynth Plug ins from BeHappy 20091012
I have Nero 8 installed, so it will be the DirectShow Filters
MP3 - http://otakuvideo.com/~quu/AVS_Audio...rGraph-mp3.png
AAC - http://otakuvideo.com/~quu/AVS_Audio...ph-AAC-MP4.png
I am using VLC 1.1.2
I am using Winamp 5.581
I am using dbPoweramp 13.4
All source files are at http://otakuvideo.com/~quu/AVS_AudioTimings/

For all VirtualDub inspections, a stop point is put at frame 4001, the video is rewound, and then played to the stop (frame 4000), making sure that no errors in seeking show up as sync issues. I picked frame 4000 because it was easy to remember, and in the MP3 I am using for a reference, an "interesting" wave form occurs there.

I am starting with this MP3
http://freemusicarchive.org/music/Fa...agined_version
Renamed to Test-MP3.mp3 and using Winamp, removed the album art and ID3 tags.

Testing

I loaded the MP3 file into Audacity and exported it as a WAV. To make sure the WAV and the MP3 match, I reloaded the WAV into Audacity and zoomed in really close to compare the waveforms; they match. I then ran this WAV file through FAAC with all default options, saving it once to an .mp4 file and once to an .aac (ADTS transport stream) file. Using Audacity, I checked the waveforms of all of these files, and they match perfectly.

Top is the uncompressed WAV, middle is the FAAC encoded AAC file, and on the bottom is the original MP3
http://otakuvideo.com/~quu/AVS_Audio...-SyncCheck.png
Top is the uncompressed WAV, middle is the AAC file in the transport stream, and on the bottom is the AAC in an MP4 file
http://otakuvideo.com/~quu/AVS_Audio...-SyncCheck.png

Now I am building a set of "baseline-XXX.avs" files which draw the waveforms as video so I can visually check the sync. I am going to decode the MP3 via DirectShowSource, FFAudioSource, bassAudioSource, and NicMPG123Source. The AAC files will be decoded with DirectShow and FFAudioSource twice (once as a transport stream, once as an MP4). The WAV file will be decoded with WAVSource. All of these waveform videos will then be stacked on top of each other so the sync of the different decoding methods can be checked.
MP3 - http://otakuvideo.com/~quu/AVS_Audio...seline-MP3.png
AAC - http://otakuvideo.com/~quu/AVS_Audio...seline-AAC.png
Right away with the MP3 comparison, the bassAudio is about a frame fast. (If there is an image, album art, in the ID3 of the mp3, FFAudioSource is about a frame slow, but that is why we removed it). The AAC decoders were much cleaner, though they all ran about half a frame behind the WAVSource baseline, which is strange since Audacity showed that they were all in sync with the WAV.

LAME was also used to decode the original MP3 into an uncompressed WAV file. This was not used as the baseline WAV file because Audacity showed the sync between it and the original MP3 to be different. But, when this WAV file is used, bassAudio becomes the correct sync.
http://otakuvideo.com/~quu/AVS_Audio...Comparison.png
I decompressed the original mp3 using Audacity, VLC, LAME, Winamp, and dbPoweramp. The VLC and Audacity were identical, while the LAME, Winamp, and dbPoweramp were identical (but out of sync according to Audacity).
http://otakuvideo.com/~quu/AVS_Audio...dacity-VLC.png
http://otakuvideo.com/~quu/AVS_Audio...acity-LAME.png
http://otakuvideo.com/~quu/AVS_Audio...ity-Winamp.png
http://otakuvideo.com/~quu/AVS_Audio...dbPoweramp.png

Accuracy goes out if you "scrub" to the location instead of playing to it. If I scrub with the MP3, the DirectShow decoder fails horribly, but the rest are accurate. If I seek ahead with the AAC sources, DirectShow is fine, while the two MP4-sourced decoders run a fraction of a frame slow.
MP3 - http://otakuvideo.com/~quu/AVS_Audio...MP3-Seeked.png
AAC - http://otakuvideo.com/~quu/AVS_Audio...AAC-Seeked.png

For both types of compressed audio, we got different timings from the same source files depending on how we opened them. Is WAVSource the correct one, or is one of the other methods correct? One possible way to find out is to re-encode from the source over multiple iterations and then compare the timing to the original; whichever method stays stable is the "right" one.

The first is the easiest: testing the sync of WAVSource. I have an AVS that opens the WAV file with WAVSource, and then iterate 5 times using "wavi Test-WAV-[n].avs Test-WAV-[n+1].WAV", with the 5 AVS files prepped. Any sync or timing issues from WAVSource, unless compensated for by "wavi", will show themselves by iteration 5.
Rendered - http://otakuvideo.com/~quu/AVS_Audio...estResults.png
Audacity - http://otakuvideo.com/~quu/AVS_Audio...esults-Aud.png
And by iteration 5, there is no change or adjustment at all to the results, the sync looks perfect.

For MP3 sync, it should be easier, as there are fewer variables. Before I start using BePipe to re-encode the MP3 from each of the sources, I need to test LAME with raw files to make sure any sync issues come from the AVISynth source, and not from LAME.
I re-encoded the original MP3 and the baseline WAV file with LAME. I used the "--preset insane" command line option (the sync issues happened with all settings I tried). The resulting MP3 file encoded by LAME does not, according to Audacity, have the same sync as the original files.
The top is the original MP3, the middle is the uncompressed WAV, and on the bottom is the resulting MP3.
MP3 to MP3 - http://otakuvideo.com/~quu/AVS_Audio...3-Reencode.png
WAV to MP3 - http://otakuvideo.com/~quu/AVS_Audio...3-Reencode.png
The MP3 source is closer than the WAV source, but they should be dead on. When I ran FAAC on the same uncompressed WAV, the sync was perfect. Because LAME cannot create an MP3 file with the same sync as the source, I won't be able to use it in my testing. But given that WAVSource was found to be accurate, we can assume that DirectShow and NicAudio are the proper ways to decode MP3 files in AVISynth, and given that DirectShow depends on whichever filter is installed, Nic's might be best.

To see if maybe LAME was the reference, and all the other filters were wrong, I re-encoded the original MP3 file 5 iterations deep. LAME was the only tool used; it decoded the previous MP3 and encoded that iteration's MP3.
http://otakuvideo.com/~quu/AVS_Audio...coded-5ITR.png
By the 5th iteration, the sync between the original MP3 (on top) and the 5th re-encode (bottom) is off. This means that the LAME decoder is off sync, or the LAME encoder is off sync, or possibly both. This bothers me greatly, as I expect an encoder to encode what it is given, not add or remove bits.

That leaves AAC sync testing. FAAC has already shown that the AAC file it creates is in sync with the WAV file it came from, but first we need to make sure that the output from BePipe to FAAC is as accurate. We can use the AVS files from the WAVSource testing to check. I run the command "BePipe --script "Import(^Test-WAV-0.avs^)" | FAAC -o Test-AAC-BePipe-Check.aac -"
Top is WAV, bottom is resulting AAC - http://otakuvideo.com/~quu/AVS_Audio...-SyncCheck.png
As you can see, BePipe and FAAC do not introduce any sync issues that can be shown by Audacity. So now to the re-encoding.

As the sync is identical, in AVISynth, for all 5 input methods, only one of them needs to be tested. The encoder would not change based on the source, and when looking at the waveform in the video, the resulting decoded audio is identical. For this test, I am going to use FFAudioSource on the transport stream (.aac) and repeat the iteration 5 times, each time running "BePipe --script "Import(^Test-AAC-[n].avs^)" | FAAC -o Test-AAC-[n+1].aac -"
Rendered - http://otakuvideo.com/~quu/AVS_Audio...estResults.png
Audacity - http://otakuvideo.com/~quu/AVS_Audio...esults-Aud.png
As you can see, the timing drifts further and further with each subsequent re encode, showing that the delay induced by the AAC decoding is a real delay. This conflicts with my previous comparison between FAAC and Nero, but in that one I used the MP4 instead of the transport stream. So I am going to repeat this test with the only change being using the MP4 instead of the TS versions.
Rendered - http://otakuvideo.com/~quu/AVS_Audio...estResults.png
Audacity - http://otakuvideo.com/~quu/AVS_Audio...esults-Aud.png
Well, the results are obvious: the AAC sources are slightly behind when decoded with AVISynth. When I checked the results of my previous Nero vs. FAAC testing, I only did three iterations, and did not use Audacity. Increasing the iterations to 5 and using Audacity has shown the audio drift, even for FAAC sources.

Last edited by Quu; 15th August 2010 at 18:28.
Old 15th August 2010, 13:57   #2  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,467
About mp3:

1) Your sample MP3 file has a cover-art JPG included in the ID3v2 tag. Some decoders can find false MP3 headers inside the JPG, so before muxing this MP3 with video, clean the tags (with Winamp or other software) to avoid problems.

2) The reference decoder for MP3 is LAME, so decode the MP3 to WAV using LAME. Many decoders (Audacity among them) can include an initial delay in the WAV file. Your MP3 decoded with LAME is 5:54.3922 long; decoded with Audacity it is 5:54.4555, with an initial delay of 52 ms.
lame --decode Test-MP3.mp3 Test-WAV.wav

3) And yes, it is a known issue: some decoders can introduce delay when decoding MP3.
You can use WavSource("your mp3 decoded by Lame.wav") as the reference.

About aac:

Same issue, some decoders can introduce delay when decoding aac audio.
The reference decoder for aac audio in mp4 is NeroAacDec.exe

Also encoders introduce delay, for instance all AC3 encoders introduce, by default, a delay of 5.333 ms (48 KHz source)
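
To put a figure like that in terms of samples, here is a minimal Python sketch. The helper names are made up for this example; the only inputs taken from the post are the 5.333 ms delay and the 48 kHz sample rate.

```python
def delay_ms_to_samples(delay_ms: float, sample_rate_hz: int) -> int:
    """Convert a delay in milliseconds to the nearest whole number of samples."""
    return round(delay_ms * sample_rate_hz / 1000.0)


def samples_to_delay_ms(samples: int, sample_rate_hz: int) -> float:
    """Convert a sample count back to milliseconds."""
    return samples * 1000.0 / sample_rate_hz


if __name__ == "__main__":
    # 5.333 ms at 48 kHz, the AC3 figure quoted above
    n = delay_ms_to_samples(5.333, 48000)
    print(n, "samples =", round(samples_to_delay_ms(n, 48000), 3), "ms")
```

Working in samples rather than milliseconds makes it easier to compare delays reported by different tools, since most of them count samples internally.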
__________________
BeHappy, AviSynth audio transcoder.
Old 15th August 2010, 17:36   #3  |  Link
Quu
Registered User
 
Join Date: Jul 2002
Location: Atlanta, GA
Posts: 33
Thank you.
I have one fundamental question then... what tool can I use to check the sync of two audio files? I have been using Audacity, loading both (or up to three at a time) files into it, zooming in REALLY close on a peak or interesting waveform and then visually comparing. My test is built on the assumption that Audacity is accurate, mainly because Audacity and AudioGraph seem to agree with each other when the syncs are off.
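
One way to get a number instead of eyeballing waveforms would be to cross-correlate short excerpts of two decoded files and look for the lag with the best match. The following is only a sketch under that assumption: `best_lag` is a made-up helper, it is brute force (fine for a few hundred samples, far too slow for whole files), and the test signal is synthetic rather than taken from the files in this thread.

```python
import math


def best_lag(reference, candidate, max_lag):
    """Return the lag (in samples) at which candidate best matches reference.

    A positive result means candidate is delayed relative to reference.
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(reference):
            j = i + lag
            if 0 <= j < len(candidate):
                score += r * candidate[j]
        if score > best_score:
            best, best_score = lag, score
    return best


if __name__ == "__main__":
    # Synthetic stand-in for a decoded excerpt: a decaying burst.
    ref = [math.sin(0.3 * n) * math.exp(-0.01 * n) for n in range(400)]
    delayed = [0.0] * 37 + ref  # same audio, shifted 37 samples later
    print(best_lag(ref, delayed, max_lag=100))
```

For real files the two sample lists could be read from decoded WAVs (for example with Python's standard `wave` module), using a short window around a distinctive peak.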


1) I removed the cover Art and all ID3 tags in fact, just in case, using Winamp.
I then used Audacity to check the sync on my 4 baseline files, the now non ID3ed MP3 "original", the Uncompressed WAV, and the FAAC encoded files (one in a "Transport Stream (ADTS)", the other in an "MPEG-4 File Format (MP4)")
The sync on all four files is identical as far as I can tell in Audacity, so from what I can tell, the baseline files are solid.
http://otakuvideo.com/~quu/AVS_Audio...-SyncCheck.png

2) I have now used five tools to decode the MP3 to an uncompressed wav file. I used audacity itself, VLC, Winamp, Lame, and dbPoweramp.
When I compare the WAV files in Audacity, I would expect them to be identical, as there is not supposed to be a random element in decoding; the MPEG-1 Layer 3 audio spec is solid, and based on math.
Audacity vs VLC - http://otakuvideo.com/~quu/AVS_Audio...dacity-VLC.png
Audacity vs Lame - http://otakuvideo.com/~quu/AVS_Audio...acity-LAME.png
Audacity vs Winamp - http://otakuvideo.com/~quu/AVS_Audio...ity-Winamp.png
Audacity vs dbPoweramp - http://otakuvideo.com/~quu/AVS_Audio...dbPoweramp.png
So, I have VLC and Audacity on one side, and LAME, Winamp, and dbPoweramp on the other.
Which is correct? On one hand, LAME is the "standard"; on the other hand, the waveforms match on the Audacity/VLC side... blah, I need better tools to test for sync.

3) if I were to use the LAME decoded MP3 file, then only the bassAudioSource is in sync
http://otakuvideo.com/~quu/AVS_Audio...Comparison.png
(Interestingly... by getting rid of the image in the ID3, FFAudioSource is now in sync with the other two methods)

AAC... from what I understood the "reference" encoder for AAC was FAAC, and NeroAACEnc is just a higher quality one that everybody uses.

Encoders and muxers that mess with the sync are problematic for me. If the delay is not known and compensated for, it can mess up the viewing experience of many videos.

I am going to run one more test, and then amend my report above with the findings of it, and what I discovered here.
I am going to take the reference MP3 file and re-encode it 5 iterations deep. If LAME does not induce a delay, and bassAudioSource is the correct decoding of the MP3, then after 5 iterations, the waveforms of the baseline MP3 and the iteration 5 MP3 should be identical, even if offset by the same amount.
Old 15th August 2010, 17:56   #4  |  Link
Quu
Registered User
 
Join Date: Jul 2002
Location: Atlanta, GA
Posts: 33
I wonder if I should add the MKV container to the AAC encoding setup. BassAudio does not support .mka, while FFAudioSource does.
Old 16th August 2010, 00:51   #5  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,467
Quote:
Originally Posted by Quu View Post
...
AAC... from what I understood the "reference" encoder for AAC was FAAC, and NeroAACEnc is just a higher quality one that everybody uses.
...
I said NeroAacDec, the decoder. Faad (and BassAudio) also introduce delay when decoding .m4a.
Old 16th August 2010, 04:41   #6  |  Link
Quu
Registered User
 
Join Date: Jul 2002
Location: Atlanta, GA
Posts: 33
Quote:
Originally Posted by tebasuna51 View Post
I said NeroAacDec, the decoder. Faad (and BassAudio) also introduce delay when decoding .m4a.
Ahh, OK. I was using Audacity to compare the waveforms of the audio files to determine if they were synced properly.

Why do decoders and encoders add delays? That makes no sense to me. I have an audio file I want in another format; why would an encoder or a decoder change the timing of it? Who thought that would be a good idea?

It makes it hard to make sure the audio and video is in the same sync as the master uncompressed AVI file when compressed.
Old 16th August 2010, 19:28   #7  |  Link
Quu
Registered User
 
Join Date: Jul 2002
Location: Atlanta, GA
Posts: 33
Quote:
Originally Posted by tebasuna51 View Post
3) And yes, it is a known issue: some decoders can introduce delay when decoding MP3.
You can use WavSource("your mp3 decoded by Lame.wav") as the reference.

About aac:

Same issue, some decoders can introduce delay when decoding aac audio.
The reference decoder for aac audio in mp4 is NeroAacDec.exe

Also encoders introduce delay, for instance all AC3 encoders introduce, by default, a delay of 5.333 ms (48 KHz source)
I did some more research about this and found this article about AAC on the Apple site
http://developer.apple.com/iphone/li...09/tn2258.html
Quote:
Encoder Delay Recommendation

Apple recommends 3rd party products and devices generating AAC bitstreams do so with the assumption that the playback system will always assume there is an encoding delay of 2112 samples in the produced bitstream. It must also be assumed that without an explicit value, the playback system will trim 2112 samples from the AAC decoder output when starting playback from any point in the bitstream.

Note: This problem is not unique to AAC. MP3 also has these data dependencies and delays in its bitstream, as do proprietary codecs such as AC-3 and others.

In all of these cases the behaviour is as described above; an implicit, not-signalled, assumption is made about the size of this delay and the playback engine is required to trim this designated number of samples from its output at the start of playback.
So I guess the problem I have is that not all encoders/decoders for the same codec seem to agree on the buffer/delay.
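
To make the quoted recommendation concrete, here is a small Python sketch of what "trim 2112 samples from the decoder output" amounts to. The function name and the toy sample list are invented for illustration, and the 44.1 kHz rate used for the millisecond conversion is an assumption, not something stated in the Apple note.

```python
AAC_IMPLICIT_PRIMING = 2112  # samples, the figure from the Apple note quoted above


def trim_priming(decoded, priming=AAC_IMPLICIT_PRIMING):
    """Drop the leading priming samples from raw decoder output."""
    return decoded[priming:]


if __name__ == "__main__":
    raw = [0] * 2112 + [1, 2, 3]            # hypothetical raw decoder output
    print(trim_priming(raw))                 # what a trimming player would present
    print(len(trim_priming(raw, priming=0)))  # a decoder that trims nothing keeps it all
    print(round(2112 / 44100 * 1000, 2), "ms at an assumed 44.1 kHz")
```

A decoder that trims a different amount than the encoder added leaves the difference behind as a fixed offset, which is the kind of disagreement described above.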

I will use LAME as an example.
If LAME adds a delay to its encoded MP3, then when it decodes that MP3 back to an uncompressed WAV file, it should remove that delay. It does not, as I can prove by encoding then decoding the same song again and again; by the 5th iteration the skew is incredibly obvious.

Fundamentally it comes down to which tools you use to test the sync, because if the encoder and decoder don't agree on the delay, then you are doomed, and any encoder/decoder pair provided by the same source had better be in sync.

I am going to rework things from the ground up. I have a specific playback target in mind; I need to define a test that can accurately measure the sync for that target, and then test encoding solutions against that target.
Old 18th August 2010, 10:00   #8  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,467
Using LAME as both encoder and decoder, the WAV isn't delayed.

The same goes for AAC using NeroAacEnc and NeroAacDec.
About the bass_aac/faad2 decoders, you can read Menno's post in:
BASS_AAC is not accuracy?
Old 18th August 2010, 17:40   #9  |  Link
Quu
Registered User
 
Join Date: Jul 2002
Location: Atlanta, GA
Posts: 33
I took a starting mp3 file

I re encoded it with lame 5 times
the resulting file (and each iteration to a growing degree) was out of sync with the original

If using LAME the WAV is not delayed, I would expect, at worst, the first iteration to be out of sync with the original, as the original may have been encoded with a different amount of delay at the beginning.

That was not the case; each iteration of the re-encode using LAME pushed the sync further and further out. Test it yourself.
Take an MP3 file, rename it test.mp3, then run the following

"lame --preset insane test.mp3 test-itr1.mp3"
"lame --preset insane test-itr1.mp3 test-itr2.mp3"
"lame --preset insane test-itr2.mp3 test-itr3.mp3"
"lame --preset insane test-itr3.mp3 test-itr4.mp3"
"lame --preset insane test-itr4.mp3 test-itr5.mp3"
LAME is doing the decoding of the MP3 and then the encoding. Ignoring the loss of quality (we mitigate this with "insane") as you re encode an audio, if LAME does not add a delay, or if it adds the same delay that it compensates for when it decodes, then iteration 1 and iteration 5 should have the exact same timings

I did that experiment, and when I compared iteration 1 and 5 in Audacity, they were out of sync...

OK... what if Audacity and LAME don't agree on the number of empty frames at the beginning... it does not matter; all I am doing is comparing one LAME-encoded MP3 (iteration 1) with another (iteration 5)... so any disagreement between LAME and Audacity would affect both equally... but their sync is off.

From what research I have done, it seems that ALL modern audio encoders need to add a number of empty audio frames to the beginning of their files. It is up to the decoders to know this and compensate. Sadly, from what I read, the number of empty frames is either not part of the formal spec, or not followed, so what you get is an encoder/decoder pair from a single vendor working perfectly, but not when you mix vendors for encode and decode... or in the case of LAME, the same "vendor" not even agreeing with itself.
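
The way a mismatch compounds over repeated re-encodes can be modelled in a few lines of Python. This is a toy model, not a description of what LAME actually does: the padding and trim amounts (10 and 6 samples) are hypothetical, and `one_pass` and `first_nonzero` are made-up helpers.

```python
def one_pass(samples, encoder_padding, decoder_trim):
    """Model one encode+decode: the encoder prepends silence, the decoder trims some."""
    padded = [0] * encoder_padding + samples
    return padded[decoder_trim:]


def first_nonzero(samples):
    """Index of the first non-silent sample, used here as a sync marker."""
    return next(i for i, s in enumerate(samples) if s != 0)


if __name__ == "__main__":
    audio = [0, 0, 0, 9, 4, 7]  # the marker starts at index 3
    # Hypothetical numbers: the encoder adds 10 samples, the decoder removes only 6.
    for generation in range(1, 6):
        audio = one_pass(audio, encoder_padding=10, decoder_trim=6)
        print(generation, first_nonzero(audio))
```

In this model the marker moves later by the same amount on every pass, so the offset between iteration 1 and iteration 5 is four times the per-pass mismatch; a matched padding and trim would leave it where it started.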
Old 18th August 2010, 20:37   #10  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,467
Try with
"lame --decode test.mp3 test-itr1.wav"
"lame test-itr1.wav test-itr1.mp3"
"lame --decode test-itr1.mp3 test-itr2.wav"
and open the wav files in Audacity (don't open mp3 files with Audacity)
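
For more than a couple of iterations, the command lines above can be generated rather than typed. This Python sketch only builds and prints the commands (it does not run LAME); the function name is made up, and the file naming simply follows the pattern used above.

```python
def lame_roundtrip_commands(source_mp3, iterations):
    """Build the decode/encode command lines for N round trips through LAME."""
    commands = []
    current = source_mp3
    for n in range(1, iterations + 1):
        wav = f"test-itr{n}.wav"
        # Decode the current MP3 to WAV; this is the file to open in the editor.
        commands.append(f"lame --decode {current} {wav}")
        if n < iterations:
            # Re-encode that WAV so the next iteration has an MP3 to decode.
            current = f"test-itr{n}.mp3"
            commands.append(f"lame {wav} {current}")
    return commands


if __name__ == "__main__":
    for line in lame_roundtrip_commands("test.mp3", 5):
        print(line)
```

With `iterations=2` this produces exactly the three commands listed above; the resulting test-itr1.wav through test-itrN.wav files are the ones to compare.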
Old 18th August 2010, 21:00   #11  |  Link
Quu
Registered User
 
Join Date: Jul 2002
Location: Atlanta, GA
Posts: 33
tebasuna51, that works (I did 5 iterations, to match the other test)... What audio software should I use instead of Audacity?

btw, when decoding the original, I saw
* skipping initial 1105 samples (encoder+decoder delay)
* skipping final 1689 samples (encoder padding-decoder delay)
but when decoding subsequent versions, I saw
* skipping initial 1105 samples (encoder+decoder delay)
* skipping final 537 samples (encoder padding-decoder delay)

But what bothers me is why that method works, while the timings are different when LAME is passed an MP3 directly as input. Why does LAME not use the same logic when decoding an MP3 to re-encode it that it uses when decoding an MP3 to save as a WAV? When I run my test above, I see no mention of skipping any samples.

blarg... the inconsistencies in audio encoders and decoders hurts my brain
Old 18th August 2010, 23:24   #12  |  Link
IanB
Avisynth Developer
 
Join Date: Jan 2003
Location: Melbourne, Australia
Posts: 3,167
From an Avisynth perspective, it is the *Source() filter's responsibility to deal with any pre-roll or decoder delay.

The API timing elements are :-

For the Audio stream
  • VideoInfo::audio_samples_per_second
  • VideoInfo::num_audio_samples
  • IClip::GetAudio(void* buf, __int64 start, __int64 count, IScriptEnvironment* env)
And for the Video stream
  • VideoInfo::fps_numerator
  • VideoInfo::fps_denominator
  • VideoInfo::num_frames
  • IClip::GetFrame(int n, IScriptEnvironment* env)
At time T=0 seconds the expectation is Audio sample 0 goes with Video frame 0.

At time T=1 second the expectation is Audio sample (VideoInfo::audio_samples_per_second) goes with Video Frame (VideoInfo::fps_numerator/VideoInfo::fps_denominator)

And every time the Avisynth filter graph asks the *Source() filter for sample number A or frame number V the expectation is exactly that be delivered each and every time.
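
The expectation above can be written out as a pair of conversions. This is a Python sketch, not Avisynth code: the function names are made up, the parameters mirror the VideoInfo fields listed above, and the 30000/1001 fps and 44100 Hz used in the example are assumed values, not taken from this thread.

```python
from fractions import Fraction


def first_sample_of_frame(frame, samples_per_second, fps_num, fps_den):
    """Audio sample index that lines up with the start of a video frame."""
    seconds = Fraction(frame * fps_den, fps_num)
    return int(seconds * samples_per_second)  # truncation is a floor for non-negative values


def frame_of_sample(sample, samples_per_second, fps_num, fps_den):
    """Video frame being shown when a given audio sample plays."""
    seconds = Fraction(sample, samples_per_second)
    return int(seconds * Fraction(fps_num, fps_den))


if __name__ == "__main__":
    # T=1 s at 25 fps: sample 44100 goes with frame 25.
    print(first_sample_of_frame(25, 44100, 25, 1))
    # Frame 4000 at an assumed 30000/1001 fps and 44100 Hz.
    start = first_sample_of_frame(4000, 44100, 30000, 1001)
    print(start, frame_of_sample(start, 44100, 30000, 1001))
```

Exact fractions are used so that frame rates like 30000/1001 do not pick up rounding error over thousands of frames.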

Various filters like AudioDub(), DelayAudio(), Trim(), AlignedSplice() (++), UnalignedSplice() (+), Assume*(), etc. allow the Avisynth filter graph to be configured to control the timing and position properties of the clip.

Unfortunately not all source filters live up to this expectation; some have bugs, and some have an impossible task due to the nature of the source data they are decoding.

Where there are bugs, report them. If they are not reported then there is no chance of them ever being fixed.

Where the source format is uncooperative, you need to construct and use your Avisynth script in accordance with the documented restrictions. In the worst case this may mean you have to do a single linear pass of the source data and transcode it to a suitable lossless, cooperative intermediate format.
Old 19th August 2010, 00:18   #13  |  Link
Overdrive80
Anime addict
 
Join Date: Feb 2009
Location: Spain
Posts: 675
It is true that NeroAacEnc introduces delay. A long time ago I opened a thread about something like that, which may interest you: http://forum.doom9.org/showthread.php?p=1420989#post1420989

In Adobe Premiere I could see that the delay was 16 ms, more or less; I couldn't measure it with more precision.

Last edited by Overdrive80; 19th August 2010 at 00:21.
Old 19th August 2010, 02:01   #14  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 6,467
Quote:
Originally Posted by Quu View Post
tebasuna51, that works (I did 5 iterations, to match the other test)... What audio software should I use instead of Audacity?
I always decode to WAV first and then open the WAV files in audio editors.

Quote:
blarg... the inconsistencies in audio encoders and decoders hurts my brain
Welcome to audio software.