GOP's: I, P, and B frames explained...
I noticed some confusion about B frames, so I thought I would write down a quick explanation of an MPEG "GOP", or "Group Of Pictures", as explained to me by a professor.
GOP - Begins with an "I" frame, usually followed by a number of "P"
and "B" frames (divx5 only uses B frames I believe)
- each GOP is independent: all frames needed for predictions are
contained within each GOP
- GOPs can be as small as a single I frame, or as large as
desired, but usually no more than 15 frames in length.
- the longer the GOP, the more efficient, but less robust, the
coding
I frame - "Intra-coded" frames : average 7:1 reduction.
- like JPEG, every video frame is broken into blocks of 8x8
pixels of Y, R-Y, and B-Y (although, I am not sure how
this "1/4 pixels" divx5 has plays into all this)
- blocks are grouped into "macroblocks" of 16x16
- macroblocks are grouped horizontally into slices which
have similar average block levels.
- multiple slices form a frame, and these frames are the
resulting "I" frames.
P frame - P frames are predicted based on prior I or P frames plus
the addition of data for changed macroblocks.
- average about 20:1 reduction, or about half the size of I
frames
- I don't think divx5 uses these, MPEG2 does though.
B frame - Bidirectionally predicted frames based on the appearance and
positions of the macroblocks of past and future frames.
- B frames require less data than P frames, averaging about
50:1 reduction.
- B frames require more decoder buffer memory because 2
frames are compared during the reconstruction process.
- B frames also require manipulation of the coding order:
frames moving from the coder to the decoder are NOT in
presentation sequence.
Basically, the B frame will say something like "this frame is the same as the GOP's "I" frame except for this one part; I will only contain the data needed to encode this one part, and combine it with the info from the I frame", in layman's terms of course. This gives DivX5 its optimal reduction capability.
This also means, of course, that your P3500 media box in your living room might struggle with decoding a high-rate D5 encode (not sure about that; D5 is a more intense encoding/decoding process, but DVDs use I, P, and B frames, sooooooo...)
Oh, BTW, in MPEG2 at least, a GOP order is always IPBBPBBPBBIPBBPBB etc etc. (depending on your GOP size), but it is always 1 I, 1 P, and 2 B's, then you can stack more groups of "PBB"s in that one GOP if needed (usually up to 15 total frames).
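As a rough illustration of the ordering described above (one I, then repeated groups of one P and some B's), here is a minimal Python sketch. The function names `gop_pattern` and `max_groups` are made up for this example, and the 15-frame cap is only the "usual" figure quoted above, not a hard limit.

```python
def gop_pattern(groups, b_per_p=2):
    """Build one GOP as a string: one I, then `groups` repeats of P plus B's."""
    return "I" + ("P" + "B" * b_per_p) * groups


def max_groups(max_frames=15, b_per_p=2):
    """How many P+B groups fit after the I frame without exceeding max_frames."""
    return (max_frames - 1) // (1 + b_per_p)


print(gop_pattern(3))             # IPBBPBBPBB
print(gop_pattern(max_groups()))  # IPBBPBBPBBPBB (13 frames)
```

Changing `b_per_p` gives the other groupings that come up later in the thread, e.g. `gop_pattern(2, b_per_p=1)` for a "PB" chain.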
Note: this has no role in 'fps' rates...
Acaila
6th March 2002, 21:56
Thank you for that very good explanation about GOP's. I might even use it in a FAQ someday :)
One thing though:
DivX5 doesn't use only B-frames, it uses P-frames as well, like its predecessor. Contrary to MPEG2, DivX5 uses a "PB" sequence chain instead of "PBB". The latter would result in better compression, but the way I understand it the avi format won't work correctly with more than one sequential B-frame.
Still, using "PB" instead of only P-frames results in a serious size decrease already, so it was definitely worth it.
ahh, didn't know that P frames are used too...
you know what, I was thinking of 3.1 alpha, doesn't that only use I frames??
I never even used 4.XX before, went straight from 3 to 5...
GAteKeeper
7th March 2002, 05:24
To be entirely accurate it uses "BP" grouping (i.e. frame n is B and frame n+1 is P), but it appears as "PB" because you can't decode a frame until the frames it is predicted from are decoded.
And so P frames have to be decoded before the B frames between them.
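The reordering described here can be sketched in a few lines of Python. This is only an illustration of the idea, not any encoder's actual code; `display_to_coded` is a hypothetical helper that takes frame types in display order and moves each run of B-frames to just after the I- or P-frame that follows it.

```python
def display_to_coded(display):
    """Reorder frame types from display order to coded (bitstream) order.

    Returns a list of (display_index, frame_type) pairs.
    """
    coded = []
    pending_b = []  # B-frames waiting for their forward reference
    for index, kind in enumerate(display):
        if kind == "B":
            pending_b.append((index, kind))
        else:
            # I or P: emit the reference first, then the B-frames it unblocks
            coded.append((index, kind))
            coded.extend(pending_b)
            pending_b = []
    coded.extend(pending_b)  # trailing B-frames with no later reference in this list
    return coded


coded = display_to_coded("IBPBPBP")
print("".join(kind for _, kind in coded))  # IPBPBPB
print([index for index, _ in coded])       # [0, 2, 1, 4, 3, 6, 5]
```

So a "BP" grouping in display order shows up as "PB" in the stream, which matches both descriptions above.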
1/4 pixel accuracy is used in motion estimation to get the best fit for each macroblock. Say you have a panning camera and from frame to frame the picture hasn't moved a whole number of pixels across the screen; then 1/4 pixel accuracy helps get the predicted macroblocks into a much better position. This isn't used in I-frames because motion estimation isn't used in I-frames: an I-frame is basically a JPEG-encoded frame using a different quant value (typically 16).
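To make the panning example concrete, here is a minimal one-dimensional Python sketch. It assumes simple linear interpolation between neighbouring pixels, which is a simplification (real codecs use their own interpolation filters), and the helper names `sample`, `sad` and `best_shift` are invented for this illustration. A row of pixels is "panned" by 3/4 of a pixel; a search limited to whole-pixel shifts leaves a residual, while a search in 1/4-pixel steps finds the exact shift.

```python
def sample(row, x):
    """Linearly interpolate `row` at fractional position x (clamped to the row)."""
    x = min(max(x, 0.0), len(row) - 1.0)
    left = int(x)
    right = min(left + 1, len(row) - 1)
    frac = x - left
    return (1 - frac) * row[left] + frac * row[right]


def sad(reference, current, shift):
    """Sum of absolute differences when predicting current[i] from reference[i + shift]."""
    return sum(abs(current[i] - sample(reference, i + shift))
               for i in range(len(current)))


def best_shift(reference, current, step):
    """Try shifts from -2 to +2 pixels in increments of `step`; return (sad, shift)."""
    candidates = [k * step for k in range(int(-2 / step), int(2 / step) + 1)]
    return min((sad(reference, current, s), s) for s in candidates)


reference = [10, 20, 40, 80, 120, 90, 60, 30, 20, 10]
# Simulate a pan of 3/4 pixel between the reference row and the current row.
current = [sample(reference, i + 0.75) for i in range(len(reference))]

print(best_shift(reference, current, 1.0))   # whole-pixel search: non-zero residual
print(best_shift(reference, current, 0.25))  # (0.0, 0.75): exact match
```

The smaller the leftover difference, the less data the P or B frame has to carry for that block.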
I don't think MPEG-2 uses more than 1 B frame but I could be wrong.
The use of B-frames does improve the compression/quality of the codec by a considerable amount but requires a lot more motion estimation (a large portion of the encoding time) to be done, hence the much longer encoding times. B frames are typically 1/2 the size of P-frames but require more encoding and decoding.
GAteKeeper
none-the-less, the encoded file stores the frame in "PB" order, it isn't until reconstruct that the order changes....
and MPEG2 can use either 1 or 2 B's to every P...
BTW, what is the minimum system requirements for D5 reconstruct??
movmasty
7th March 2002, 09:54
hey -i- , what is "D5 reconstruct???? :confused:
more precisely, what is D5 and what is reconstruct?
the correct writing is BP, since a gop starts with an I frame, and not with a P...
>- each GOP is independent....
-this is true only with a "closed" gop structure
usually mpgs are encoded with an "open" structure, because it gives better compression.
good mpg encoders give you the option to encode with open or closed gops.
>- GOP's can be as small as a single I frame, or as large as
desired, but usually no more than 15 frames in length.
-all I frames means no gop at all, like motion jpg.
and you will have playback problems with more than 20 frames in pal and 24 in ntsc.
>- the longer the GOP, the more efficient, but less robust the
coding
-the more the B frames...the more efficient, but less robust the
coding
>I frame : average 7:1 reduction....P frame about 20:1 reduction, or about half the size of I frames (is 20/7 = 2 ????)..B frames....50:1 reduction.
-that depends on the bitrate you encode with....
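For what it's worth, the arithmetic behind the quoted ratios can be checked with a short Python sketch. The 7:1, 20:1 and 50:1 figures are just the rough averages from the first post (and, as noted, they vary with bitrate); `relative_size` and `gop_ratio` are names invented for this example.

```python
# Average reduction ratios quoted in the first post (rough, bitrate dependent).
RATIOS = {"I": 7.0, "P": 20.0, "B": 50.0}


def relative_size(kind, reference="I"):
    """Size of a `kind` frame as a fraction of a `reference` frame."""
    return RATIOS[reference] / RATIOS[kind]


def gop_ratio(pattern):
    """Overall reduction ratio for a GOP given as a string such as 'IPBBPBBPBB'."""
    compressed = sum(1.0 / RATIOS[kind] for kind in pattern)
    return len(pattern) / compressed


print(round(relative_size("P"), 2))       # 0.35, about a third of an I frame, not half
print(round(gop_ratio("IPBBPBBPBB"), 1))  # roughly 24:1 with these figures
```

With these numbers a B frame comes out at 0.4 of a P frame (`relative_size("B", "P")`), which is at least in the same ballpark as the "1/2 the size of P-frames" figure mentioned earlier.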
>....but it is always 1 I, 1 P, and 2 B's,
-i have encoded some mpgs with as many as 8 B frames between 2 P..., or just 1 B, or just I and P frames....
>...this has no role in 'fps' rates
-generally you can use longer gops in ntsc than in pal/film
see also
http://forum.doom9.org/showthread.php?s=&threadid=17166&highlight=frames
very interesting is then the ability of mpg to write a movie header every 1/10 gops; this means that you will be able to play an mpg even if you have only a small chunk of it.
also the "skip x b frames" option, which can give you only a fraction of the nominal fps, e.g. 15, 12.5, 12, 10, 8.33 etc.
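The frame rates listed above fall out of simple fractions if one assumes that "skipping" means dropping B-frames from each repeating P+B group and showing what is left at an even pace. That reading of the option is an assumption, as is the use of 30 rather than 29.97 for NTSC; `fps_after_skipping` is a hypothetical helper, not part of any encoder.

```python
from fractions import Fraction


def fps_after_skipping(nominal_fps, group, skip_b):
    """Effective frame rate when `skip_b` B-frames are dropped from each group.

    `group` is the repeating unit, e.g. "PBB" or "PB".
    """
    dropped = min(skip_b, group.count("B"))
    kept = len(group) - dropped
    return Fraction(nominal_fps) * kept / len(group)


print(float(fps_after_skipping(25, "PB", 1)))   # 12.5
print(float(fps_after_skipping(30, "PBB", 2)))  # 10.0
print(float(fps_after_skipping(25, "PBB", 2)))  # 25/3, i.e. about 8.33
```

Dropping B-frames is safe for the remaining frames because nothing else is predicted from a B-frame.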
kastro68
8th March 2002, 11:41
Just thought people should know
Edit: Well it does look suspicious
duartix
8th March 2002, 19:48
Well I wouldn't be too sure as to -h- being -i-.
D5 = DivX 5
A_Pleite
8th March 2002, 20:51
Well, maybe I've understood something. So
P-frames = Delta-frames ?
And B-frames store the differences relative to the difference between an I-frame and a P-frame, right? :confused:
The thing about a header every 10 frames:
This sounds really interesting, because you can theoretically have a variable fps, since the video is a group of loads of minivideos which are played together. Maybe it is possible to switch the codec every ten frames (theoretically - maybe with a fast pc that has a lot of ram).
:eek: :confused: :) :confused: :eek:
A_Pleite
movmasty
9th March 2002, 01:52
P and B frames are both delta frames, B are more compressed though,
because they refer to both a previous and a following P frame, thus with a better motion estimation.
- when i said "write a movie header every 1/10 gops",
i meant the SAME header.
and in fact if you join two complete mpgs with different frame rates with the dos copy /b command,
the player/driver won't play the second mpg well.
actually..........
(ps, -i- is a newbie, how can someone think that he is -h?
doctor -h and mister -i- ??)
who is "-h-"???
D5 = divx5
reconstruct = playback (I like to say things like "bit rate reduction" rather than "compression";)
some q and a, primarily for movmasty:
an "I"-only frame sequence can be considered a GOP, I just don't think it is done so often, or like you said, it is like M-JPEG, and I think that is semantics more than anything else...
-the reduction ratios I presented are only averages, and as you correctly pointed out, they vary pretty much with bitrate...
- the IPBB thing. from what I understand, the data is stored that way, even though that is out of viewing order. Now this, i am going on what I was taught, should I mention that this might be wrong to my professor?
-and about the "B" frames, how did you get 8 B frames?
who is "-h-"???
Voila! :)
- the IPBB thing. from what I understand, the data is stored that way, even though that is out of viewing order. Now this, i am going on what I was taught, should I mention that this might be wrong to my professor?
The P-frame must be stored in the file before the B-frames which are predicted from it, so yes you'd see IPB, IPBB, IPBBBBBB or whatever in the bitstream. XviD will be working around the avi limitations by storing a P and B frame in a single avi frame, allowing playback without sync issues (or dshow hacks).
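The decoder's side of this can be sketched as follows. It is only a minimal model in Python, assuming each frame is a label such as "P3" (type plus display number), and `coded_to_display` is a name made up for the example: B-frames are shown as soon as they are decoded, while each I- or P-frame is held back until the next reference frame arrives.

```python
def coded_to_display(coded):
    """Turn frames in bitstream order back into display order."""
    display = []
    held = None  # most recent I/P frame, decoded but not yet shown
    for label in coded:
        if label[0] == "B":
            display.append(label)    # B-frames are shown immediately
        else:
            if held is not None:
                display.append(held)  # previous reference can now be shown
            held = label
    if held is not None:
        display.append(held)          # flush the last reference at end of stream
    return display


stream = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]
print(coded_to_display(stream))
# ['I0', 'B1', 'B2', 'P3', 'B4', 'B5', 'P6']
```

The same function handles IPB or IPBBBBBB streams, since it never assumes a fixed number of B-frames.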
-and about the "B" frames, how did you get 8 B frames?
I assume using TMPGEnc - you can specify however many B-frames you'd like to put between P-frames. The same feature will make it into XviD, though there isn't much point going above 2 from memory.
-h
hello, nice name!!!
anyways,
the synch issues...
I personally am not having the slightest synch issues with divx5, and I do not use dshow. so how is it that some are claiming that the B frames are causing a 1-2 frame delay, when I can't see 1.) how, and 2.) if it is so, how come I am not having synch issues?
from what I understand, B,P frames do not really add to the frame rate, rather, they are reference frames for further frames?
I feel I might be wrong on that though....
movmasty
9th March 2002, 06:19
-h , are you talking with yourself ??????????? :p
-h , are you talking with yourself ??????????? :p
I sure hope not, it'd mean my memory is getting even worse..
I personally am not having the slightest synch issues with divx5, and I do not use dshow. so how is it that some are claiming that the B frames are causing a 1-2 frame delay, when I can't see 1.) how, and 2.) if it is so, how come I am not having synch issues?
If you are using vfw for decoding, you certainly are getting a delay :)
DivX5 uses a bit of a hack to use B-frames in the avi format - I wrote up my initial findings here (http://www.videocoding.de/forum/viewtopic.php?topic=321&forum=2&13), and thanks to the way DivX5 reads the avi, it's impossible not to have a delay when using vfw for decoding.
As an experiment, try loading a DivX5 with B-frames avi into VirtualDub, go to the start of the movie, then press the Right arrow twice - see how the picture on the screen stays the same? It's just duplicating the first frame until it finds a B-frame that it can decode, at which point we're 2 frames behind. The dshow filter uses directshow's (more able) api to either read ahead in the avi, or delay the audio sufficiently, to offset this.
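Why some delay is unavoidable can be seen with a toy model in Python. It assumes a strictly one-frame-in, one-frame-out decoder that repeats the previous picture whenever the frame it was just handed cannot be shown yet. This is a simplification for illustration only: it does not model how DivX5 actually buffers, so the exact number of frames of lag it prints should not be read as the real figure, and `one_in_one_out` is a made-up name.

```python
def one_in_one_out(coded):
    """Picture shown after each input frame, for a decoder that must
    return exactly one picture per coded frame it is given."""
    shown = []
    held = None  # latest I/P frame, decoded but not yet due for display
    last = None  # picture currently on screen
    for label in coded:
        if label[0] == "B":
            last = label              # B-frames can be shown right away
        else:
            if held is not None:
                last = held           # the previous reference is now due
            elif last is None:
                last = label          # very first frame: show it at once
            held = label
        shown.append(last)
    return shown


print(one_in_one_out(["I0", "P2", "B1", "P4", "B3"]))
# ['I0', 'I0', 'B1', 'P2', 'B3']
```

The first picture is repeated and everything after it arrives late, which is the same kind of behaviour as the VirtualDub experiment above; a player that can read ahead or shift the audio can hide it.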
-h
movmasty
9th March 2002, 08:01
uhmmmmmm, does vdub behave the same with mpg1 files ??????????
Edit: -h ?
A_Pleite
10th March 2002, 00:19
Answer to the
"Is it really IBPBP.. ?"-question:
Make a 2pass of a vid, look in the created log for
"intra"
and behind "intra" you'll find a number, either
0
or
1
or
2,
right?
(Maybe I'm wrong)
Linux
4th August 2002, 17:45
Each GOP is not automatically independent.
Only those that are encoded as closed.
Think of a GOP as IBBPBBPBBPBBPBBPBBIBBP (standard DVD)
If the GOP is not closed and you use B-frames, the last B-frames depend on the next I-frame.
This makes the movie harder to cut.
If all GOPs are closed you can cut between the B and I frame.
If the GOPs are open you have to delete the last two B-frames and cut between the P and I. This makes all cuts visible on inspection.
You then get a GOP like IBBPBBPBBPBBPBBPIBBP
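The cutting rule described here is easy to express in Python. This sketch follows the notation used in this post (a GOP written as a string that ends in the B-frames leading up to the next I-frame); `cut_after_gop` is a hypothetical helper, not a function from any editing tool.

```python
def cut_after_gop(gop, closed):
    """Frames of `gop` that survive a cut placed right after it.

    In an open GOP the trailing B-frames depend on the next GOP's I-frame,
    so they have to be dropped if that I-frame may no longer follow.
    """
    if closed:
        return gop
    return gop.rstrip("B")


gop = "IBBPBBPBBPBBPBBPBB"
print(cut_after_gop(gop, closed=True))   # IBBPBBPBBPBBPBBPBB
print(cut_after_gop(gop, closed=False))  # IBBPBBPBBPBBPBBP
```

The open-GOP result followed by the next "IBBP" gives exactly the IBBPBBPBBPBBPBBPIBBP sequence shown above.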
zulu
26th August 2002, 03:49
Think of a GOP as IBBPBBPBBPBBPBBPBBIBBP (standard DVD)
hmmm.. that's 22 frames.
correct me if i'm wrong, but isn't the standard dvd gop length 15 frames for pal and 18 frames for ntsc?
Linux
26th August 2002, 12:25
Originally posted by zulu
hmmm.. that's 22 frames.
correct me if i'm wrong, but isn't the standard dvd gop length 15 frames for pal and 18 frames for ntsc?
Sorry if you did not understand me.
The GOP stream automatically repeats itself.
Since I was talking about closed versus non-closed GOPs, I represented the GOP stream by showing one whole GOP and the beginning of the second GOP.
The important thing when talking about non-closed GOPs is that the last B-frame of one GOP depends on the I frame of the next.
If I had represented the GOP stream with only one GOP it would not have been that obvious.
zulu
26th August 2002, 16:29
ouch. sorry, got it. i didn't read carefully enough ;)