Important audio sync issue found


neuron2
19th January 2005, 22:30
I was working on a stream reported by a correspondent. We found that the actual sync of the stream was -66 ms even though DGIndex reported 0 ms (verified with an audio waveform viewer).

As you may know, DGIndex calculates the delay by comparing the first I frame's PTS to the immediately following audio frame's PTS. This relies upon them being accurate! But we found that some streams are wrong on the first video PTS. Here's an example of a set of video PTSs from the beginning of the VOB. Note that they should increment by 400 ms because they come once per GOP and there are 12 frames per GOP @ 29.97fps.

41
509
910
1307
1715
2110
2508
etc.

The first PTS value is wrong. It should be about 110. That means the delay calculated by DGIndex will be off by 110 - 41 = 69 ms. We observed an actual de-sync of two frames = 66 ms. If you correct the first PTS, DGIndex gives a delay of -69 ms and the resulting movie is dead on sync!

So, I am going to add a check in DGIndex. It will find the good increment pattern and then extrapolate back to what the first PTS should be. If the actual and extrapolated values differ, it will pop up a warning and offer the user the opportunity to use the extrapolated value. Is that reasonable?
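
A minimal sketch of that check in Python, not DGIndex's actual code. It assumes the per-GOP video PTSs are already available in milliseconds; the function name `check_first_pts` and the 20 ms tolerance are made up for illustration. It takes the median increment of the later PTSs and steps back one GOP from the second PTS.

```python
from statistics import median


def check_first_pts(pts_ms, tolerance_ms=20):
    """Extrapolate the first PTS from the later ones and compare.

    pts_ms: per-GOP video PTS values in milliseconds, in stream order.
    tolerance_ms: hypothetical threshold for flagging a mismatch.
    Returns (expected_first, error_ms, suspicious).
    """
    if len(pts_ms) < 3:
        raise ValueError("need at least three PTS values")
    # Increments between consecutive PTSs, ignoring the suspect first value.
    steps = [later - earlier for earlier, later in zip(pts_ms[1:], pts_ms[2:])]
    step = median(steps)  # robust against a single odd increment
    # Step back one GOP from the second PTS to get the expected first PTS.
    expected_first = pts_ms[1] - step
    error_ms = expected_first - pts_ms[0]
    return expected_first, error_ms, abs(error_ms) > tolerance_ms


# PTS values quoted in the post above.
pts = [41, 509, 910, 1307, 1715, 2110, 2508]
expected, error, suspicious = check_first_pts(pts)
if suspicious:
    print(f"First PTS is {pts[0]} ms, expected about {expected} ms "
          f"(delay off by {error} ms)")
```

With the values from the post, the extrapolated first PTS comes out close to the hand estimate of 110 ms, so the warning would fire for this stream.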

This may account for many of our mysterious sync issues. :D

zettai
19th January 2005, 22:53
Oooh fantastic! That sounds like a good operating procedure to use, go for it!

Cyberia
20th January 2005, 00:10
Then this check is done on a per file basis? (eg: each vob or ts in a movie is independently checked?)

neuron2
20th January 2005, 05:06
Originally posted by Cyberia
Then this check is done on a per file basis? (eg: each vob or ts in a movie is independently checked?)
No, the audio delay is calculated from the PTS's of the first video frame and the first audio frame. Those occur at the start of the first loaded file.
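
For reference, that calculation can be sketched as follows. This is an illustration, not DGIndex's source: the function names are invented, PTS values are taken as raw ticks of the 90 kHz clock that MPEG uses for PTS, and the sign convention (audio minus video) is inferred from the example in the first post, where correcting the video PTS from 41 to 110 changes the delay from 0 to -69 ms. The 33-bit PTS wraparound is ignored.

```python
PTS_CLOCK_HZ = 90_000  # MPEG PTS values count ticks of a 90 kHz clock


def pts_to_ms(pts_ticks):
    """Convert a PTS difference or value from 90 kHz ticks to milliseconds."""
    return pts_ticks * 1000 / PTS_CLOCK_HZ


def audio_delay_ms(first_video_pts, first_audio_pts):
    """Audio delay in ms, assumed here to be audio start minus video start.

    Negative when the first audio frame's PTS is earlier than the
    first video frame's PTS.
    """
    return round(pts_to_ms(first_audio_pts - first_video_pts))


# Example from the first post: audio at 41 ms, corrected video at 110 ms.
video_pts = 110 * 90  # 110 ms expressed in 90 kHz ticks
audio_pts = 41 * 90   # 41 ms expressed in 90 kHz ticks
print(audio_delay_ms(video_pts, audio_pts))
```

Because only the first video and first audio PTS enter the calculation, an error in either one shifts the reported delay for the whole project.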

len0x
20th January 2005, 18:02
What would be the default behaviour from CLI?

neuron2
20th January 2005, 18:25
@len0x

Don't worry. It's not going into 1.1.0. I discovered some other things that are wrong with the delay calculation and it turns out that the case described above is rare and it is quite tricky to fix. I need to do more work on it.

But I did find that the delay for all transport streams is way off, and I have a fix for that. Zep had sent me a transport stream that was way off by almost 600ms! Now I get it right. That fix will be in the 1.1.0 release. The fix for the obscure situation described above will be later. I'll probably make a little tool for inspecting PTS's that will help in analysing sync issues.

fccHandler
20th January 2005, 20:39
Originally posted by neuron2
Note that they should increment by 400 ms because they come once per GOP and there are 12 frames per GOP @ 29.97fps.
That assumption seems reasonable, but it may be flawed. I've seen streams like this too, where the video PTS seems to drift, especially in direct-to-MPEG captures, and not just at the beginning. (Sorry I don't have any examples on hand at the moment.)

I believe some hardware encoders are applying a kind of "variable frame rate" in such streams, possibly to compensate for discrepancies in the timing of incoming video and audio samples. I'm not certain, but I think the spec allows this to a limited degree, thus the frame rate in the header is merely the closest approximation.

Note that WMP handles the degenerate situation quite well, perhaps by constantly monitoring the PTS and adjusting its video playback rate to maintain A/V sync throughout.

eb
20th January 2005, 21:51
neuron2 asked: Is that reasonable?

Yes it is.
Nice to see you back with full sack of energy on your back.

It is a very important starting point, until the next drop/break point, which Cyberia is worrying about.
We already agreed that (for live capturing/recording) the original values of timestamps for video and audio should be kept as they are.
With multipart recordings of the same show the endings are also very important, please do not cut them with scissors,
video1---+++---+++---+++
audio1...---+++---+++---+++

video2---+++---+++---+++
audio2...---+++---+++---+++

then, if the recordings were made correctly, joining such parts will be without unpleasant effects.

eb

neuron2
20th January 2005, 23:57
@Cyberia

Can you translate eb's post into English that I can understand? :)

Sorry, eb, it's probably my fault, but I just can't figure out what you are saying.

Cyberia
21st January 2005, 00:01
Ummm. No.

I have to run now, will try to translate again this evening.

len0x
21st January 2005, 00:22
I *think* what eb meant is that audio synch has to be maintained not only at the start of the TS, but also when you make a sub-selection and save part of it.

eb
21st January 2005, 01:36
Sorry for that.
Let's assume that we are talking about .ts streams from digital satellite TV recordings, where both audio and video carry their own timestamps, mainly as real-time values.
It is not important whether they start at zero or xxxxxxxx.
Try to determine the DELAY for the audio at the start, then check it at 1/4, 1/2, 3/4 and at the end of the file. If these values are equal within 50 (350, 400, 375, 368), calculate the average of them and use it as the DELAY. Rewrite all timestamps starting from zero if needed; I prefer to keep the original timestamps as long as possible.
If from these control checks you see that the delay is skewed (50, 250, 450, 650), then apply corrections to the timestamps accordingly.
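
The procedure eb describes could be sketched like this in Python. The function names, the return shape, and the linear-drift model for the skewed case are assumptions for illustration; the 50 ms tolerance and the sample values come from the post.

```python
from statistics import mean


def classify_delay(samples_ms, max_spread_ms=50):
    """Classify delays measured at evenly spaced points through a file.

    Returns (kind, start_delay_ms, drift_ms):
      ("constant", average delay, 0) if all samples agree within max_spread_ms,
      ("skewed", first delay, last minus first) otherwise.
    """
    if len(samples_ms) < 2:
        raise ValueError("need at least two delay samples")
    spread = max(samples_ms) - min(samples_ms)
    if spread <= max_spread_ms:
        return ("constant", mean(samples_ms), 0)
    # Assumed model: the delay drifts linearly from the first to the last sample.
    return ("skewed", samples_ms[0], samples_ms[-1] - samples_ms[0])


def delay_at(position, start_ms, drift_ms):
    """Delay at a fractional position (0.0 = start, 1.0 = end) under linear drift."""
    return start_ms + position * drift_ms


print(classify_delay([350, 400, 375, 368]))  # samples that agree within 50 ms
print(classify_delay([50, 250, 450, 650]))   # samples that grow steadily
```

In the constant case a single delay value is enough; in the skewed case the timestamps would need a position-dependent correction such as the one `delay_at` models.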

Going back to the pictures: with multipart files of the same show, never cut them at the ends so that video and audio end at the same time; keep this DELAY displacement. After joining them there will be no signs of the joints.

I hope that this time you can follow me.

eb

EDIT: take a look at VirtualDubMod FRAMERATE, where there are options to cut the audio and video ends or leave them as they are.

neuron2
21st January 2005, 02:51
Well, obviously, subselection already works.

I think he's saying he's going to load multiple files with different syncs, so that using just the first one is going to be bad.

The only way I could deal with that is to insert or remove audio frames. That won't be in Version 1.1.0.

nnigam
24th January 2005, 17:28
I have a lot of audio sync errors in my OTA HD captures. ProjectX does not provide a delay, and I have had to correct by up to 5000 ms to get a reasonably close approximation of sync. DGIndex does a much better job by providing a delay value, but sometimes this is still not accurate. Maybe the first post from neuron2 describes my problem.

I do know that my HD captures are chopped up into 2 GB files, so a 1-hour recording gives me 5 files for the program. Are these the multiple files that eb is talking about? An audio delay in the first file means that the video ends before the audio in the first file, and this continues onwards for all the files. If the audio is chopped to the end of the first file, as I understand from eb's message, it will put the remaining files out of sync. The sync error that I get appears in one of two forms: 1. the gap continually grows as we progress along the video, or 2. the gap varies by +/- a few ms throughout the video. It is not always easy to tell the +/-, but it is always off; even if I manually adjust the delay, I am unable to get proper sync. The original TP files play in perfect sync in my capture system though. They just take up too much space to keep as is.

I am trying with DGIndex 1.1.0 and will see what happens.

neuron2
24th January 2005, 19:10
Yes, that is eb's issue.

This is high on my priority list, along with multiple range selection on the timeline (cutting).