View Full Version : What IS interleaving?
catfish
24th April 2005, 11:05
I know that AVI stands for Audio Video Interleave.. but in my ever expanding search for knowledge, i came to wonder.. what IS interleaving?
i looked it up on wikipedia, and i quote
Interleaving in Computer Science is a way to arrange data in a noncontiguous way to increase performance.
now that doesn't really tell me anything.. so i was wondering if someone in here might answer me about
WHAT interleaving is
HOW it applies to AVI (and all other containers for that matter)
WHAT is it good for (what's the point?)
I realize this is a newbie question, but since this is the only forum, of its kind, that i know of, it wasn't much of a choice :)
stephanV
24th April 2005, 12:08
It's pretty easy, and the description you got is pretty good...
Let's look at it like this. You have one video stream and one (matching) audio stream and you want those streams put together in one file. One way of doing it would be to paste the audio stream behind the video stream. There is a problem with this, however: if the streams get big, your read device constantly has to seek back and forth in the file. You want video and audio played in sync, so it has to read a part of the video and then the matching part of the audio... after all, buffering a 600 MB video stream is not really an option. On a modern hard disk this kind of reading probably wouldn't be a problem, but when AVI was invented it certainly was... and for CDs/DVDs, constantly seeking back and forth in a file is not really advisable either.
The obvious solution to this problem would be to break the streams up into chunks (for a video stream a chunk is one video frame) and put chunks of video and audio that should be played more or less together close to each other in the file. This way it is much easier to read the file.
For example:
A video stream of 6 frames: V0 V1 V2 V3 V4 V5
A matching audio stream of 4 blocks: A0 A1 A2 A3
A non-interleaved file: V0 V1 V2 V3 V4 V5 A0 A1 A2 A3
Interleaved: V0 A0 V1 A1 V2 V3 A2 V4 A3 V5
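To make the seeking argument concrete, here is a rough sketch that counts how far a reader has to jump in each of the two layouts above. It assumes every chunk has the same size (one "slot" in the file) and that the player needs the chunks in the order of the interleaved line; the function name seek_distance is made up for this illustration.

```python
def seek_distance(file_layout, playback_order):
    """Total distance (in chunk slots) the read head must jump.

    Assumption for illustration: every chunk occupies exactly one slot,
    and after reading a chunk the head sits on the slot right after it.
    """
    position = {name: i for i, name in enumerate(file_layout)}
    head = 0
    total = 0
    for name in playback_order:
        total += abs(position[name] - head)  # jump needed to reach this chunk
        head = position[name] + 1            # head moves past the chunk just read
    return total


# Order in which the player needs the chunks (taken from the example above).
playback = "V0 A0 V1 A1 V2 V3 A2 V4 A3 V5".split()

non_interleaved = "V0 V1 V2 V3 V4 V5 A0 A1 A2 A3".split()
interleaved = "V0 A0 V1 A1 V2 V3 A2 V4 A3 V5".split()

print("non-interleaved:", seek_distance(non_interleaved, playback))
print("interleaved:    ", seek_distance(interleaved, playback))
```

With the interleaved layout the file is read straight through, so no jumps are needed at all; the non-interleaved layout forces a jump for almost every chunk, and the jumps grow with the size of the streams.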
Hope this helps...
catfish
24th April 2005, 14:23
Hope this helps...
It sure does!
Thanks a lot for a simple, understandable explanation :)
i suppose most containers use this method? - like mpg, ogm, matroska etc.?
stephanV
24th April 2005, 14:50
Originally posted by catfish
i suppose most containers use this method? - like mpg, ogm, matroska etc.?
Yup, if not all. I think a container that can't do interleaving is pretty much useless... I think you could even say that interleaving is one of the primary reasons containers exist.
video_magic
24th April 2005, 19:34
I found this thread very interesting, thanks for the nice explanation, stephanV
cypher_soundz
25th April 2005, 20:07
lurker #2 :p
yes thanks very informative
Regards
cyph
echooff
25th April 2005, 20:29
lurker#3:p
I also appreciated it.
ukb008
26th April 2005, 03:25
Quoted from your example:
Interleaved: V0 A0 V1 A1 V2 V3 A2 V4 A3 V5
I notice a little discrepancy in the array of Video and Audio data. A regular mathematical pattern would be:
V0 A0 V1 A1 V2 A2 V3 A3 V4 A4 V5 A5
So, I take it different sequences are possible in interleaving. Could you explain a bit further? For example, what determines those permutations?
Regards.
dragongodz
26th April 2005, 04:22
just to add that interleaving doesn't have to be at single-frame granularity either. you can for example use V0 V1 A0 A1 or V0 V1 V2 A0 A1 A2 etc. have a look at the interleaving option in VirtualDub, for example, where you can set the interleaving interval to a number of frames or milliseconds.
stephanV
26th April 2005, 10:44
Originally posted by ukb008
Quoted from your example:
Interleaved: V0 A0 V1 A1 V2 V3 A2 V4 A3 V5
I notice a little discrepancy in the array of Video and Audio data. A regular mathematical pattern would be:
V0 A0 V1 A1 V2 A2 V3 A3 V4 A4 V5 A5
Actually my pattern is completely regular. Assume audio and video have exactly the same length: the chunks are then interleaved in order of presentation time, and if a video and an audio chunk start at the same time, the video chunk comes first.
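A minimal sketch of that rule, assuming both streams span the same total duration and all chunks within a stream are equally long (the function name is invented for this illustration): sort every chunk by its start time and break ties in favour of video.

```python
from fractions import Fraction


def interleave_by_start_time(n_video, n_audio, total=Fraction(1)):
    """Order chunks by start time; on a tie the video chunk comes first.

    Assumption for illustration: both streams cover the same duration
    `total`, and chunks within one stream all have equal length.
    """
    chunks = []
    for i in range(n_video):
        # Tie-break key 0 sorts video ahead of audio at equal start times.
        chunks.append((total * i / n_video, 0, f"V{i}"))
    for i in range(n_audio):
        chunks.append((total * i / n_audio, 1, f"A{i}"))
    chunks.sort()
    return [name for _, _, name in chunks]


print(" ".join(interleave_by_start_time(6, 4)))
```

For 6 video frames and 4 audio blocks this gives V0 A0 V1 A1 V2 V3 A2 V4 A3 V5, the same order as in the original example. Exact fractions are used instead of floats so that "start at the same time" is an exact comparison.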
Of course, as dragongodz said, interleaving is influenced by settings, so if you want to interleave every three video frames the pattern would become this (note that this is quite specific to AVI; how exactly interleaving is done in other containers I don't know):
V0 V1 V2 A0' V3 V4 V5 A1' (A0' and A1' are merged audio chunks: for 99% of the audio formats there is no point in keeping 2 separate audio chunks, it will only cause extra overhead)
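The grouped pattern can be sketched the same way. This is only an illustration under the same assumptions as before (equal total duration, equally long chunks per stream), not how any particular AVI muxer is implemented; interleave_every is a made-up name, and a merged audio chunk such as A0' is written here as [A0+A1].

```python
from fractions import Fraction


def interleave_every(n_video, n_audio, frames_per_group):
    """Emit groups of video frames, each followed by one merged audio chunk
    holding the audio blocks that start inside that group's time window."""
    total = Fraction(1)  # assumed common duration of both streams
    layout = []
    for first in range(0, n_video, frames_per_group):
        last = min(first + frames_per_group, n_video)
        layout.extend(f"V{i}" for i in range(first, last))
        start = total * first / n_video
        end = total * last / n_video
        blocks = [f"A{j}" for j in range(n_audio)
                  if start <= total * j / n_audio < end]
        if blocks:
            # One merged chunk instead of several small ones (less overhead).
            layout.append("[" + "+".join(blocks) + "]")
    return layout


print(" ".join(interleave_every(6, 4, 3)))
```

With groups of three frames this prints V0 V1 V2 [A0+A1] V3 V4 V5 [A2+A3]. Setting frames_per_group to the full frame count puts all video first and all audio last, which is the non-interleaved case again.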
So, I take it different sequences are possible in interleaving. Could you explain a bit further? For example, what determines those permutations?
Granularity of the streams. You can tell your application to interleave audio every 0.5 seconds, but if your stream has video frames that are each 2 seconds long, you can imagine this causes a bit of a problem. ;) Also remember that, like video, audio also consists of frames (or blocks) of a certain duration (for 48 kHz MP3 one frame is 0.024 s), so you are not allowed to cut up the audio stream at any byte you like. (Actually for MP3 you are, but that's because of the robustness of MP3 decoders; don't try it with AAC, for example.)
So basically you can interleave any way you want as long as you don't break up the video and audio streams at invalid points. You do have to keep in mind the goal you were interleaving for in the first place though, so don't interleave audio only once every 15000 video frames or something...
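As a small worked example of the granularity point: an MPEG-1 Layer III frame holds 1152 samples, which at 48 kHz is the 0.024 s mentioned above. The sketch below (function name invented, and rounding down is just one possible policy) shows how a requested interleave interval has to be snapped to a whole number of audio frames.

```python
from fractions import Fraction

SAMPLES_PER_MP3_FRAME = 1152  # MPEG-1 Layer III


def audio_frames_per_chunk(interval_s, sample_rate,
                           samples_per_frame=SAMPLES_PER_MP3_FRAME):
    """Return (whole frames per chunk, actual chunk duration in seconds).

    The requested interval is rounded down to a whole number of audio
    frames (at least one), since a chunk cannot end in the middle of a frame.
    """
    frame_duration = Fraction(samples_per_frame, sample_rate)
    whole_frames = max(1, int(Fraction(interval_s) // frame_duration))
    return whole_frames, whole_frames * frame_duration


frames, duration = audio_frames_per_chunk("0.5", 48000)
print(frames, float(duration))
```

A requested 0.5 s interval corresponds to about 20.8 MP3 frames, so the chunk ends up holding 20 whole frames, i.e. 0.48 s of audio.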
[edit]some spelling
dragongodz
26th April 2005, 13:50
for 99% of the audio formats there is no point in keeping 2 separate audio chunks, it will only cause extra overhead
yes the way i presented it was to just represent equivalent periods for video and audio. to put it another way would be
V0 A0
where V can be 1 frame, 2 frames, 3 frames etc, all the way up to the total length of the video minus 1. if it's the total then it's not interleaved. the A is an amount of audio equivalent to the selected V length. the interleaving pattern is then repeated until both streams are done.
does that make it easier to understand though ? :D
ukb008
26th April 2005, 14:27
So at the interleaving points, the number of blocks of video and audio must be integers, is that it? Like you can't interleave 3 blocks of video with 5.89 blocks of audio...
Regards.
stephanV
26th April 2005, 14:43
Originally posted by ukb008
So at the interleaving points, the number of blocks of video and audio must be integers, is that it? Like you can't interleave 3 blocks of video with 5.89 blocks of audio...
Exactly. A decoder (understandably) wouldn't know what to do with 89% of one decodable unit. You can only have misaligned chunks if (a) your audio stream has frame headers itself and (b) the decoder can buffer enough data (which I assume is the case with MP3 decoders).
ukb008
26th April 2005, 14:56
It's all clear.
For now.
Regards.
mpucoder
26th April 2005, 16:05
I'm glad no one asked about DVD interleaving, where there can be three layers of interleave (program stream, angle, and story)
dragongodz
26th April 2005, 17:00
I'm glad no one asked about DVD interleaving
yes i was cringing waiting for it to happen. since you have seen this thread i will leave that to you if anyone does ask. after all, now that you have gone and mentioned it i can see someone being tempted to ask. :D
ukb008
27th April 2005, 02:34
Tell us briefly, but in detail. In plain English, please.
Regards.
mpucoder
27th April 2005, 03:03
Well, at the lowest level we don't even call it interleaving, but multiplexing. But it's the same thing. One video, up to 8 audio, up to 32 subpicture, and one NAV stream are all interleaved. Data may not be more than one second ahead of its time to be shown. Sounds simple, right?
The next level up is called interleaving. That's where up to 9 of the lower level multiplexes called "angles", of the same playback time, get woven together to provide for seamless switching between them. (note: there is also a non-seamless interleave that relaxes the constraints on the encoding, but is still equal time angles).
And the top level is also called an interleave. This allows for more than one set of nine angles, of different playback times, to be interleaved in a way that allows no switching, but uninterrupted playback of any possible path.