Doom9's Forum > Capturing and Editing Video > Avisynth Usage
Old 13th September 2024, 18:34   #1  |  Link
rgr
Registered User
 
Join Date: Jun 2022
Posts: 104
Opening AAC from MP4 file - what else besides LWLibav/LSMASH/ffms2?

I have an audio problem with a 4h MP4 file (H265 VFR + AAC) from a standalone recorder.

I open it in MPC-HC and at 4:00:59.500 I have synchronized video and audio. It looks and sounds OK.

I open it in AviSynth using LWLibavVideoSource:
Code:
name1="VFR.mp4"
a=AudioDub(LWLibavVideoSource(name1,fpsnum=50),LWLibavAudioSource(name1))
return a
And here is the problem: the audio track appears sped up, and what was at 4:00:59.500 moves to approx. 4:00:49.180.
It is the same with ffms2.

(This is not a constant offset, because at 0:13:00.000 I have only 1 frame of shift.)

If, however, I load the same MP4 file as an audio track in VirtualDub (Audio->Audio from another file), everything is perfectly synchronized.

And here's the question -- what else can I use to open audio from such a file?
Old 14th September 2024, 11:54   #2  |  Link
Frank62
Registered User
 
Join Date: Mar 2017
Location: Germany
Posts: 257
What happens if you leave out the "fpsnum=50"?
Old 14th September 2024, 12:17   #3  |  Link
rgr
Quote:
Originally Posted by Frank62 View Post
What happens if you leave out the "fpsnum=50"?
The video frame position changes (but there is still no synchronization).
Nothing happens with the audio track.
Old 14th September 2024, 13:32   #4  |  Link
tebasuna51
Moderator
 
Join Date: Feb 2005
Location: Spain
Posts: 7,110
That is the problem with VFR video forced to play at a fixed frame rate (fpsnum=50): that changes the video duration, which then no longer matches the audio duration.
What about:

a=LWLibavAudioSource(name1)
v=LWLibavVideoSource(name1).ChangeFPS(50)
AudioDub(v,a)

Now AviSynth "Changes the frame rate by deleting or duplicating frames."
__________________
BeHappy, AviSynth audio transcoder.
Old 14th September 2024, 16:17   #5  |  Link
rgr
I'll check, but the problem is with the audio track, not the video track (LWLibavVideoSource handles VFR very well, unlike ffms2).

For now, I used ffmpeg and converted the clip twice -- once using "-c copy", the second time recoding it to mp4 CFR (h264+aac).

In both cases ("copy" and "reencode"), playback in MPC-HC is correct as in the original (synchronization is preserved).

If I load both of these files into VDub using Avisynth (as in the first post), the sound is shifted. (Also when playing AVS script via MPC-HC, the sound is shifted)

If I overlay the audio track using Audio->Audio from other file and load the same file, the sound is correctly synchronized with the image (and this is regardless of whether I load the sound using "Caching Input Driver" or using "ffmpeg").

So it looks like the problem might be somewhere in AviSynth, because I don't believe that every one of these plugins (LWLibavAudioSource, ffms2, BSAudioSource) fails to handle the audio (although all of them are probably based on ffmpeg libraries anyway -- but VDub with the same libraries handles sound correctly).

Edit: LWLibavVideoSource(name1).ChangeFPS(50) works just as well as LWLibavVideoSource(name1,fpsnum=50)
The video track remains correct, the audio track still generates a larger offset the longer it runs.

Last edited by rgr; 14th September 2024 at 16:23.
Old 14th September 2024, 17:06   #6  |  Link
Frank62
50 seems wrong.
Old 14th September 2024, 18:54   #7  |  Link
hello_hello
Registered User
 
Join Date: Mar 2011
Posts: 4,918
Quote:
Originally Posted by tebasuna51 View Post
That is the problem with VFR video forced to play at a fixed frame rate (fpsnum=50): that changes the video duration, which then no longer matches the audio duration.
It shouldn't change the duration. FPSNum should just cause LWLibavVideoSource to add or drop frames to output the specified frame rate. If the video is VFR, ChangeFPS(50) would add or drop frames evenly throughout the video based on the average frame rate and mess with the A/V sync, rather than add or drop them where it's necessary as FPSNum should. If the video is technically 50fps, as an example (to have a standard frame rate) and the first half has an average frame rate of 25fps, while the second half is 50fps, then FPSNum=50 should only add duplicate frames where they're needed in the first half.
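To make that distinction concrete, here is a minimal Python sketch. It is not the actual LWLibavVideoSource or ChangeFPS code; the function names and the toy clip (one second at 25 fps followed by one second at 50 fps) are made up for illustration. It maps each 50 fps output frame to a source frame either by timestamp or by spreading the source frames evenly over the output.

```python
from bisect import bisect_right
from fractions import Fraction

def cfr_by_timestamp(timestamps, fps, duration):
    """For each output frame time n/fps, pick the last source frame whose
    timestamp is <= that time, so duplicates appear only where needed."""
    n_out = int(duration * fps)
    return [bisect_right(timestamps, Fraction(n, fps)) - 1 for n in range(n_out)]

def cfr_by_even_spread(n_src, fps, duration):
    """Ignore timestamps and spread the source frames evenly over the output,
    as if the clip were CFR at its average frame rate."""
    n_out = int(duration * fps)
    return [n * n_src // n_out for n in range(n_out)]

# Hypothetical VFR clip: 25 frames in the first second, 50 in the second.
timestamps = [Fraction(i, 25) for i in range(25)] + \
             [1 + Fraction(i, 50) for i in range(50)]
duration = 2  # seconds

by_ts = cfr_by_timestamp(timestamps, 50, duration)
by_spread = cfr_by_even_spread(len(timestamps), 50, duration)

# Output frame 50 is shown at t = 1.0 s; the source frame stamped 1.0 s is index 25.
print(by_ts[50], by_spread[50])  # prints "25 37"
```

In this toy clip the even spread shows source frame 37 (stamped 1.24 s) at 1.0 s, which is the kind of sync error described above, while the timestamp-driven mapping only duplicates frames in the 25 fps half.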

rgr,
I don't know if this is the reason as I don't know how the audio decoders deal with gaps, but if there's gaps in the audio and they're not accounted for....
Try extracting the audio with eac3to or MeGUI's HD Streams Extractor as eac3to will replace any gaps with silence.

The golden rule I live by is to never extract the audio or other streams unless I'm converting/editing them. You can open most containers with MKVToolNixGUI and it'll remux without changing the timecodes for existing streams and therefore any gaps will be retained. I usually convert the video, open it in MKVToolNix, add the original source file, de-select the original video stream, then remux.

All of the above is, as I said, just a guess as to whether it's the problem.

PS. I should add, there can be issues when FPSNum is used if the specified frame rate is the same as the video frame rate in sections and/or it's largely CFR. Maybe it's no longer the case, but in the past I've seen both FFMS2 and LWLibavVideoSource drop frames they shouldn't have and replace them with duplicates, probably due to jitter in the timecodes or something like that. It doesn't happen often, and maybe not at all now, but there's always the option of converting to a CFR with an Avisynth plugin. It requires extracting the timecodes so the plugin knows where to add/drop frames.
http://avisynth.nl/index.php/TimecodeFPS
http://avisynth.nl/index.php/VfrToCfr

As ChangeFPS(50) didn't seem to make any difference, you should confirm the video is actually VFR. Adding Info() to the script will give you the frame count and frame rate. If it's 50fps then FPSNum wouldn't be needed (unless you want to convert from one CFR to another, but there's better ways to do it). If not, then adding FPSNum would also change the frame count as it's very, very unlikely the average frame rate of a VFR video would exactly match a frame rate you've specified.

Last edited by hello_hello; 14th September 2024 at 20:52.
Old 14th September 2024, 22:54   #8  |  Link
Frank62
Quote:
Originally Posted by rgr View Post
If, however, I load the same MP4 file as an audio track in VirtualDub (Audio->Audio from another file), everything is perfectly synchronized.
You could try to open both again like this, then set Audio/Full processing Mode, save as a WAV file (which can cause problems because of the length, but worth a try, maybe in parts), and try again in Avisynth with the WAV file.
My feeling is, that there is no variable framerate, but some error - but only a feeling.
Old 15th September 2024, 00:01   #9  |  Link
rgr
Quote:
Originally Posted by hello_hello View Post
Try extracting the audio with eac3to or MeGUI's HD Streams Extractor as eac3to will replace any gaps with silence.
Code:
eac3to.exe "VFR.mp4"
Running in fast mode
Keeping dialnorm
The format of the source file could not be detected.
Quote:
Maybe it's no longer the case, but in the past I've seen both FFMS2 and LWLibavVideoSource drop frames they shouldn't have and replace them with duplicates, probably due to jitter in the timecodes or something like that.
ffms2 actually works badly with VFR, but LWLibavVideoSource does very well. I can see it when playing in MPC-HC -- a new scene starts exactly at 4:00:59.500, just as with LWLibavVideoSource.

Quote:
As ChangeFPS(50) didn't seem to make any difference, you should confirm the video is actually VFR.
I took a closer look - ChangeFPS(50) works, but the frames are obviously shifted incorrectly. The file is undoubtedly VFR (Frame rate mode: Variable; Frame rate: 50.000 FPS; Minimum frame rate: 42.373 FPS; Maximum frame rate: 60.976 FPS).
Old 15th September 2024, 00:10   #10  |  Link
rgr
Quote:
Originally Posted by Frank62 View Post
You could try to open both again like this, then set Audio/Full processing Mode, save as a WAV file (which can cause problems because of the length, but worth a try, maybe in parts), and try again in Avisynth with the WAV file.
My feeling is, that there is no variable framerate, but some error - but only a feeling.
I tried. Unfortunately AviSynth was unable to load this wav -- after a little over 3 hours there was no sound. However, when playing the wav in MPC-HC I hear that the audio after recoding is badly shifted.

It is as if playing the original stream from the MP4 file were OK, but each linear re-encoding gradually shortened the audio track.
Old 15th September 2024, 05:09   #11  |  Link
hello_hello
"The format of the source file could not be detected."

Probably because it's an MP4. Remux it as an MKV and you should be okay.

It sounds like VFR but only Info() in a script can tell you for sure. If it doesn't show 50fps then it's variable. Sometimes though, MediaInfo will report VFR when it's actually not, especially for MP4s. Something about the timebase some programs use if I remember correctly, and the frame timestamps don't always end up in the exact spot they should be. For MKV, as an example, timestamps are usually rounded to the nearest millisecond so for a frame rate such as 23.976 the frame duration should be 41.7083333333 ms, but instead the durations alternate in a pattern between 41ms and 42ms. None of that should confuse MediaInfo, and MKVToolNix writes statistics about each track to tags in the MKV so MediaInfo can just read that info. MP4s also often seem to have an initial video delay as well as an audio delay though. I don't understand why, but maybe that confuses MediaInfo sometimes. It's probably VFR but I'd still get a second opinion myself.
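The millisecond rounding can be checked with a few lines of Python. This is only a sketch: it assumes timestamps are rounded half up to whole milliseconds and that the frame rate is exactly 24000/1001, and the function name is made up.

```python
def mkv_timestamps_ms(n_frames, fps_num=24000, fps_den=1001):
    """Frame timestamps in whole milliseconds, rounded half up,
    for a constant frame rate of fps_num/fps_den."""
    # The exact time of frame n is n * fps_den / fps_num seconds;
    # integer arithmetic avoids floating-point rounding surprises.
    return [(2 * n * fps_den * 1000 + fps_num) // (2 * fps_num)
            for n in range(n_frames)]

stamps = mkv_timestamps_ms(8)
durations = [b - a for a, b in zip(stamps, stamps[1:])]
print(stamps)     # [0, 42, 83, 125, 167, 209, 250, 292]
print(durations)  # [42, 41, 42, 42, 42, 41, 42]
```

The video is perfectly constant at 23.976 fps, yet no two neighbouring timestamps are exactly 41.708 ms apart, which is the sort of jitter a tool could misread as VFR.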

Edit: If ChangeFPS(50) is changing something then I guess it's not being decoded as 50fps so you probably don't need to confirm it with Info().

Last edited by hello_hello; 15th September 2024 at 05:16.
Old 15th September 2024, 12:03   #12  |  Link
rgr
Info() will always return 50fps, because that's the average fps and with that AviSynth returns video as CFR.
https://imgur.com/a/GLy2DYu
If it wasn't VFR, then fpsnum=50 would return the same video track. But that doesn't matter, because the problem isn't with the video track.

eac3to v3.52
command line: eac3to.exe VFR.mkv out.wav
------------------------------------------------------------------------------
Running in fast mode
Keeping dialnorm
MKV, 1 video track, 1 audio track, 4:06:24, 49.998p
1: h265/HEVC, 576p50 (15:11)
2: AAC, 2.0 channels, 48kHz, dialnorm: 0dB
[v01] The video bitstream framerate field doesn't match the container framerate. <WARNING>
Track 2 is used for destination file "out.wav".
[a02] Extracting audio track number 2...
[a02] Decoding with DirectShow (Nero Audio Decoder 2)...
[a02] Getting "Nero Audio Decoder 2" instance failed. <ERROR>
Aborted at file position 3932160. <ERROR>

Last edited by rgr; 15th September 2024 at 20:29.
Old 15th September 2024, 19:26   #13  |  Link
hello_hello
Quote:
Originally Posted by rgr View Post
Info() will always return 50fps, because that's the average fps and with that AviSynth returns video as CFR.
The odds of that happening when there's frames with a rate both greater and lower than 50 fps must be very close to zero, but if you're happy for ChangeFPS(50) to be shifting frames when the video is already CFR 50 fps....

For the record it's the source filter that chooses the frame rate as Avisynth, to the best of my knowledge, has no way to know what it is, and I assume ffms2 and Lsmash both calculate the average frame rate for VFR as it's the most logical choice. The frame rate in your screenshot appears to be 49.998 fps anyway, but for VFR video it'll change as the frame rate changes, although as it doesn't matter I'll leave it there.

I'm not sure I've ever extracted AAC audio as a wave file, but I was just playing with eac3to and what I said about it repairing gaps no longer seems to be happening. I'm not sure why but I'll do my best to work it out as the alternative can only be that I've imagined it and I should be living in a room with padded walls. I'll have quite a few MeGUI log files on an old drive I can search through to prove to myself I haven't lost the plot, but I've used eac3to for that purpose many, many times, so I don't understand why it's not fixing them now.

Last edited by hello_hello; 15th September 2024 at 19:29.
Old 15th September 2024, 20:32   #14  |  Link
rgr
Quote:
Originally Posted by hello_hello View Post
The odds of that happening when there's frames with a rate both greater and lower than 50 fps must be very close to zero, but if you're happy for ChangeFPS(50) to be shifting frames when the video is already CFR 50 fps....
As I wrote in the first post -- I do not use ChangeFPS(50), but LWLibavVideoSource(fpsnum=50).
Old 15th September 2024, 23:24   #15  |  Link
hello_hello
Quote:
Originally Posted by rgr View Post
I took a closer look - ChangeFPS(50) works, but the frames are obviously shifted incorrectly
Obviously I was making a point based on what you said happened when you did use ChangeFPS(50) so I'll leave you to it.
Old 17th September 2024, 13:15   #16  |  Link
tebasuna51
Quote:
Originally Posted by rgr View Post
...But that doesn't matter, because the problem isn't with the video track.

eac3to v3.52
command line: eac3to.exe VFR.mkv out.wav
------------------------------------------------------------------------------
...
eac3to can't decode AAC.
First try extracting the AAC with eac3to v3.36 (replace only the eac3to.exe in the eac3to install folder) to see whether it detects audio gaps.

eac3to v3.36
command line: eac3to.exe VFR.mkv 2: out.aac

You may see:

[a02] Audio has a gap of Xms at playtime 0:00:YY. <WARNING>

Then try the extracted AAC as "Audio from other file" in VirtualDub2.
Old 17th September 2024, 15:19   #17  |  Link
rgr
Yes, there are gaps.
There are about 50 entries, which gives a total of about 300ms of shift. And that's right, that's about it.


And what does that mean technically? I always thought that an audio track was a continuous "file" (mp3/aac/...) of course remuxed from video track.


MKV, 1 video track, 1 audio track, 4:06:24, 49.998p
1: h265/HEVC, 576p50 (15:11)
2: AAC, 2.0 channels, 48kHz
v01 The video bitstream framerate field doesn't match the container framerate.
a02 Extracting audio track number 2...
a02 Creating file "out.aac"...
a02 Audio has a gap of 5ms at playtime 0:02:55.
a02 Audio has a gap of 5ms at playtime 0:05:03.
a02 Audio has a gap of 5ms at playtime 0:09:10.
a02 Audio has a gap of 6ms at playtime 0:14:42.
a02 Audio has a gap of 5ms at playtime 0:19:30.
a02 Audio has a gap of 6ms at playtime 0:24:23.
a02 Audio has a gap of 5ms at playtime 0:29:35.
a02 Audio has a gap of 5ms at playtime 0:33:36.
a02 Audio has a gap of 6ms at playtime 0:39:01.
a02 Audio has a gap of 5ms at playtime 0:42:43.
a02 Audio has a gap of 6ms at playtime 0:47:55.
a02 Audio has a gap of 6ms at playtime 0:53:59.
a02 Audio has a gap of 6ms at playtime 0:57:50.
a02 Audio has a gap of 5ms at playtime 1:02:44.
a02 Audio has a gap of 6ms at playtime 1:07:57.
a02 Audio has a gap of 5ms at playtime 1:13:50.
a02 Audio has a gap of 5ms at playtime 1:16:12.
a02 Audio has a gap of 6ms at playtime 1:21:35.
a02 Audio has a gap of 5ms at playtime 1:27:09.
a02 Audio has a gap of 5ms at playtime 1:31:31.
a02 Audio has a gap of 6ms at playtime 1:35:50.
a02 Audio has a gap of 5ms at playtime 1:40:47.
a02 Audio has a gap of 5ms at playtime 1:46:10.
a02 Audio has a gap of 7ms at playtime 1:49:50.
a02 Audio has a gap of 5ms at playtime 1:56:40.
a02 Audio has a gap of 5ms at playtime 2:00:53.
a02 Audio has a gap of 5ms at playtime 2:03:19.
a02 Audio has a gap of 5ms at playtime 2:09:34.
a02 Audio has a gap of 5ms at playtime 2:12:22.
a02 Audio has a gap of 5ms at playtime 2:18:21.
a02 Audio has a gap of 5ms at playtime 2:21:12.
a02 Audio has a gap of 5ms at playtime 2:27:07.
a02 Audio has a gap of 6ms at playtime 2:30:19.
a02 Audio has a gap of 5ms at playtime 2:34:55.
a02 Audio has a gap of 5ms at playtime 2:40:49.
a02 Audio has a gap of 5ms at playtime 2:43:40.
a02 Audio has a gap of 5ms at playtime 2:47:59.
a02 Audio has a gap of 5ms at playtime 2:53:20.
a02 Audio has a gap of 5ms at playtime 2:56:07.
a02 Audio has a gap of 6ms at playtime 3:01:37.
a02 Audio has a gap of 5ms at playtime 3:06:39.
a02 Audio has a gap of 5ms at playtime 3:11:16.
a02 Audio has a gap of 6ms at playtime 3:15:15.
a02 Audio has a gap of 6ms at playtime 3:21:09.
a02 Audio has a gap of 6ms at playtime 3:25:36.
a02 Audio has a gap of 6ms at playtime 3:31:16.
a02 Audio has a gap of 5ms at playtime 3:36:05.
a02 Audio has a gap of 6ms at playtime 3:39:47.
a02 Audio has a gap of 6ms at playtime 3:44:33.
a02 Audio has a gap of 6ms at playtime 3:50:40.
a02 Audio has a gap of 5ms at playtime 3:55:20.
a02 Audio has a gap of 5ms at playtime 4:00:35.
a02 Audio has a gap of 6ms at playtime 4:03:18.
a02 Starting 2nd pass...
a02 Extracting audio track number 2...
a02 Realizing AAC gaps...
a02 Creating file "out.aac"...
eac3to processing took exactly 6 minutes.
Done.
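For anyone who wants to total such a log rather than eyeball it, here is a small Python sketch. The regular expression is simply fitted to the warning lines as they appear above; `parse_gaps` is a made-up helper name, and only the first three warnings are pasted in as a sample.

```python
import re

# Pattern fitted to the eac3to warning lines shown in the log above.
GAP_RE = re.compile(r"Audio has a gap of (\d+)ms at playtime (\d+):(\d+):(\d+)")

def parse_gaps(log_text):
    """Return a list of (playtime_seconds, gap_ms) for every gap warning."""
    gaps = []
    for match in GAP_RE.finditer(log_text):
        gap_ms, hours, minutes, seconds = (int(g) for g in match.groups())
        gaps.append((hours * 3600 + minutes * 60 + seconds, gap_ms))
    return gaps

# First three warnings from the log above, as a sample.
sample = """\
a02 Audio has a gap of 5ms at playtime 0:02:55.
a02 Audio has a gap of 5ms at playtime 0:05:03.
a02 Audio has a gap of 5ms at playtime 0:09:10.
"""

gaps = parse_gaps(sample)
print(len(gaps), "gaps,", sum(ms for _, ms in gaps), "ms in total")
# prints "3 gaps, 15 ms in total"
```

Fed the complete log, it should find the 53 warnings listed above, and the running sum of the gaps gives the audio shift accumulated up to any given playtime.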

Last edited by rgr; 17th September 2024 at 17:25.
Old 18th September 2024, 09:17   #18  |  Link
tebasuna51
Quote:
Originally Posted by rgr View Post
And what does that mean technically? I always thought that an audio track was a continuous "file" (mp3/aac/...) of course remuxed from video track.
An audio track is not continuous.
Compressed audio is encoded in frames. For standard AAC at 48 kHz, each frame holds 1024 samples, so its duration is 1024/48000 s = 21.333 ms.

A container stores video and audio frames, and each one has a "timestamp" that tells the player when to play that frame.
In a 50 fps CFR video the timestamps say to play a frame every 20 ms, but in a VFR video that interval can change.

But there are also "timestamps" for each audio frame. If the first timestamp is 0 ms and the second is 26 ms, there is a gap of 4.667 ms without sound, because the first AAC frame finishes at 21.333 ms.
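The same arithmetic as a short Python sketch. The timestamps are hypothetical and `find_gaps` is a made-up helper; a real tool would read the timestamps from the container.

```python
SAMPLES_PER_FRAME = 1024   # standard AAC frame size, as stated above
SAMPLE_RATE = 48000        # Hz

FRAME_MS = SAMPLES_PER_FRAME / SAMPLE_RATE * 1000  # about 21.333 ms

def find_gaps(timestamps_ms, tolerance_ms=0.001):
    """Yield (index, gap_ms) wherever a frame starts later than the
    previous frame ends."""
    for i in range(1, len(timestamps_ms)):
        expected = timestamps_ms[i - 1] + FRAME_MS
        gap = timestamps_ms[i] - expected
        if gap > tolerance_ms:
            yield i, gap

# Hypothetical timestamps: second frame at 26 ms, as in the example above,
# and a third frame that follows the second without a gap.
stamps = [0.0, 26.0, 26.0 + FRAME_MS]
for index, gap in find_gaps(stamps):
    print(f"gap of {gap:.3f} ms before frame {index}")
# prints "gap of 4.667 ms before frame 1"
```

A decoder that simply joins the decoded frames end to end drops every such gap, so the decoded audio ends up shorter than the video by the sum of all gaps.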

If eac3to does its job well ("Realizing AAC gaps..."), out.aac should have the same duration as the video.

If you extract the AAC with:

eac3to.exe VFR.mkv 2: out.aac -no2ndpass

the gaps are ignored and the audio comes out short, like the audio decoded in AviSynth.

Last edited by tebasuna51; 18th September 2024 at 09:20.
Old 20th September 2024, 13:51   #19  |  Link
rgr
Quote:
Originally Posted by tebasuna51 View Post
But there are also "timestamps" for each audio frame. If the first timestamp is 0 ms and the second is 26 ms, there is a gap of 4.667 ms without sound, because the first AAC frame finishes at 21.333 ms.
OK, thanks for the explanation, now everything is clear.

I have one more question -- is it possible to fill in these gaps using AviSynth (without using eac3to.exe)? As I understand it, LWLibavSource (like ffms2 or BestSource) uses ffmpeg libraries, so it all comes down to the question of whether there is such a possibility in the ffmpeg library?

(Overall I think ignoring audio timestamps is a bug, after all they are there for a reason.)
Old 22nd September 2024, 11:02   #20  |  Link
tebasuna51
Fill the gaps with what?
If you insert decoded silence in the middle of a loud sound, you can get a click when you re-encode it, because encoders must be initialized with the last value of the previous frame.
A few clicks are tolerable, but many...
eac3to shows 53 gaps, but only those of 5 ms or more; there are surely many more that are shorter than 5 ms.

If you want to re-encode that file, the only solution is to preserve the timestamps and apply them to the new audio (and, of course, to the video as well).

Check whether your standalone recorder has other options for storing video/audio, or play the file as is without trying to process it.

Last edited by tebasuna51; 22nd September 2024 at 11:07.

Tags
audio shift, lwlibavaudiosource
