To Xvid developers: Variable B-frames?

athos · 6th March 2002, 14:56

Hi all! This is my first post (actually it is my second try at my first post, had some problems with cookies), although i have been lurking this forum for quite some time.

I had written a very elaborate post which got lost due to problems with my webbrowser and firewall [insert some harsh language about these two here] :\

Anyway, my point was that I had an idea about using a variable number of b-frames per sequence (by sequence i mean IPBPBPBPBPBPBPBPBPBPBPBPBPBP, refer to -h's post in "any opinions on divx 5?") in Xvid. I read that divxnetworks have decided on using one b-frame per sequence, whereas most mpeg1/2 encoders use two. this leads me to believe that there is no single optimal number of b-frames for all video, but that it somehow depends on the content. further, i assume that this would then be measurable at encode time.

as an example (note: this is not a real theory) i assume that this depends on the factors that are used when deciding the bitbudget for a part of the video in a vbr situation. in my example, content that requires lower bitrates would lend itself better to using more b-frames (maybe even 3 or 4 per sequence, again just an example) while more demanding content would be better off with fewer or even no b-frames. because i understand that b-frame decoding is pretty cpu demanding, there would be an added advantage in this example because there would be fewer b-frames in high bitrate parts of the video, and more in low bitrate parts.
this dynamic decision would probably be easier to make when doing 2 passes, so in a 1-pass vbr situation you might want to use a more conservative strategy, or a fixed number of b-frames.

because we want xvid to be as user configurable as possible, one would like to be able to set:

whether to use b-frames at all, and if so, whether to use variable b-frames
minimum (probably 0 or 1) and, more importantly, maximum b-frames per sequence (assuming variable b-frames are chosen)
a number or slider controlling the conservatism of the algorithm (again, if variable b-frames are used)
if a fixed number of b-frames is used, how many per sequence
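
The options listed above could be grouped roughly as follows. This is only a sketch: every name in it (`BFrameSettings`, `conservatism`, and so on) is made up for illustration and is not taken from Xvid, and the defaults are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BFrameSettings:
    """Hypothetical encoder options; none of these names come from Xvid itself."""

    use_bframes: bool = True
    variable: bool = False      # let the encoder pick the count per sequence
    min_bframes: int = 0        # only meaningful when variable is True
    max_bframes: int = 2        # only meaningful when variable is True
    conservatism: float = 0.5   # 0.0 = aggressive, 1.0 = very cautious
    fixed_bframes: int = 1      # used when variable is False

    def __post_init__(self):
        # Reject combinations that make no sense before encoding starts.
        if not 0 <= self.min_bframes <= self.max_bframes:
            raise ValueError("need 0 <= min_bframes <= max_bframes")
        if not 0.0 <= self.conservatism <= 1.0:
            raise ValueError("conservatism must be in [0, 1]")
        if self.fixed_bframes < 0:
            raise ValueError("fixed_bframes must be >= 0")

    def allowed_counts(self):
        """B-frame counts the encoder may use between two reference frames."""
        if not self.use_bframes:
            return range(0, 1)
        if self.variable:
            return range(self.min_bframes, self.max_bframes + 1)
        return range(self.fixed_bframes, self.fixed_bframes + 1)
```

With `variable=True, min_bframes=0, max_bframes=3` the encoder would be free to pick anything from 0 to 3 b-frames; with the defaults it is pinned to exactly one.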

a couple of relevant questions:

does the iso mpeg-4 standard allow for variable b-frames? if it is not specified explicitly, perhaps it would be possible for a workaround, while still conforming to standard?
what factor(s) decide the optimal number of b-frames per sequence?
how much would be gained by implementing this idea, assuming the above questions can be answered satisfactorily?

i am sorry that this post is not as well written as it was at first, i just got tired of rewriting the whole thing.

keep up the good work with xvid! i hope the divx scene will move away from divx3 (which i think is obsolete and has no future), towards xvid and perhaps ogg audio instead of mp3

Koepi · 6th March 2002, 15:05

DXN chose the IBPBPBPIBPBP solution because you will get in trouble with the AVI file format if doing otherwise.

Regards,
Koepi

athos · 6th March 2002, 15:31

oh, ok.

Maybe it would be possible using another format, for example OGM, MP4 or MCF?

Koepi · 6th March 2002, 15:35

The primary goal is to stay as compatible as possible.
So i guess we're trying some workarounds for .avi first. Another option is to write our own encoding application - with that we could use whatever we want. But it's a bad idea to write another format parser.

Just sit back and relax, starting on sunday it'll get really interesting

Regards,
Koepi

rui · 6th March 2002, 15:56

Quote:

Originally posted by Koepi

Just sit back and relax, starting on sunday it'll get really interesting

Regards,
Koepi

Hummm....What surprises have you boys arranged for us?

Actron · 6th March 2002, 16:10

i can't wait till sunday :)

athos · 6th March 2002, 16:32

i am also excited about sunday!

but, back to the subject. i understand that variable or even multiple b-frames will not be implemented at this time, because of limitations in the avi format and the striving for compatibility (which i totally agree with). still, i am curious what you people think of the idea of variable b-frames? would it be a good idea, assuming that it will be possible to implement in the future?

Acaila · 6th March 2002, 16:59

@Koepi:

So to be able to use B-frames to their fullest extent you need something that's not avi?

Tronic is hard at work on the first MCF parser at this very moment, and I expect him to have an alpha version done very soon. If XviD developers would work together with the MCF coders you would both benefit from it.
1- You would be able to make the codec a lot better than it already is. That is without the restrictions of avi.
2- You would really help the launch of MCF once it's ready. Being implemented with one of the best codecs around is a great way to bring it to the masses.

Because MCF is still in development stage it's quite easy to add/change features. And XviD is also still in development stage so your job wouldn't change a lot as I see it.

Hmm, the above post reminds me of ChristianHJW, maybe I've been hanging around with him a bit too much

saVe · 6th March 2002, 17:09

i must admit that i have some problems imagining a way this could be achieved. here's my personal q&a:

how will the codec know where to use how many b frames?
could be solved scene-based. scenes with more bits will receive more p-frames rather than b-frames.

how will the codec know where to insert the p-frames to get the most efficient quality?
when the limit of maximum changes in picture information is exceeded the codec inserts a p-frame and processes the missing b-frames.

maybe i'm completely wrong and the codec can't operate scene-based. but my idea would be to do a normal 1st pass using only i-frames and p-frames.
in the second pass each scene (i-frame to i-frame) could be taken and processed on its own: compare the i-frame to the following p-frames, and when a certain (of course optimally tweaked) amount of information has changed relative to the i-frame, insert a p-frame and then the missing b-frames. then the whole thing would be repeated using the new p-frame like the i-frame was used before. this would go on until the next i-frame is reached, where it starts all over again. there could also be a fixed value for high motion scenes and one for low motion scenes, with some sort of breaking point in the average bitrate of the whole scene
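
One way to read the second-pass idea above is as a simple threshold rule. The sketch below assumes a callback `change_from(ref, i)` that returns some non-negative measure of how much frame `i` differs from reference frame `ref` (for example derived from first-pass statistics); that callback, the function name and the optional `max_bframes` cap are all assumptions for illustration. It also assumes the last frame of a scene becomes a p-frame so that no b-frame is left without a following reference.

```python
def place_p_frames(change_from, num_frames, threshold, max_bframes=None):
    """Return a frame-type string such as 'IBBPBP' for one scene.

    change_from(ref, i) is an assumed callback: how much frame i differs
    from reference frame ref, in any non-negative unit.
    """
    types = ["I"]
    ref = 0  # index of the most recent reference frame (I or P)
    for i in range(1, num_frames):
        run = i - ref - 1  # b-frames already pending since the last reference
        too_different = change_from(ref, i) > threshold
        run_full = max_bframes is not None and run >= max_bframes
        last = i == num_frames - 1
        if too_different or run_full or last:
            types.append("P")  # new reference; the pending frames stay b-frames
            ref = i
        else:
            types.append("B")
    return "".join(types)
```

A lower `threshold` gives more p-frames (the high motion case), a higher one gives longer runs of b-frames.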

would this be possible or am i talking complete *edit*?

Nic · 6th March 2002, 17:11

No offense Acailia, but when I first saw you posting in DivX.com I thought you were Christian using a different handle....you both have a very similar, outlook & style....

....

Cheers,
-Nic

ps
Im keeping a look out for the first parser, christian has done amazing work at pushing this along...
(....Tronic's posted at www.videocoding.de as well)

athos · 6th March 2002, 17:28

save> i think the problem can be summarized as this: given a number of frames between two keyframes, how many should be p and how many should be b frames? Ok, maybe im oversimplifying here because there could be a higher complexity in parts of this list of frames than in the others.

still, too many b frames will lead to unnecessarily large p-frames, and too few b frames will lead to too many p frames. the optimal number of b frames in the sequence is a number where if you add b frames the p frames will grow so that the total number of bits used in the sequence (given a fixed quantizer) grows, and if you remove b frames the p frames added instead would also grow the total number of bits used. so the optimal ratio between b frames and p frames is the one which yields the smallest file when using quality-based/fixed quantizer encoding.
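
To make the trade-off concrete, here is a toy model. The two size functions are invented purely for illustration (p-frames and b-frames both get bigger as the references move further apart); real frame sizes depend on the content and would have to be measured or estimated.

```python
def sequence_bits(num_frames, bframes_per_p, p_bits, b_bits):
    """Total bits for num_frames frames that follow one keyframe.

    p_bits(distance) and b_bits(distance) are assumed size models, where
    distance is the spacing between consecutive reference frames.
    """
    distance = bframes_per_p + 1
    total = 0
    remaining = num_frames
    while remaining > 0:
        group = min(distance, remaining)  # (group - 1) b-frames plus 1 p-frame
        total += p_bits(group) + (group - 1) * b_bits(group)
        remaining -= group
    return total


def best_bframe_count(num_frames, max_bframes, p_bits, b_bits):
    """Try every fixed b-frame count and return (best_count, all_costs)."""
    costs = {
        n: sequence_bits(num_frames, n, p_bits, b_bits)
        for n in range(max_bframes + 1)
    }
    return min(costs, key=costs.get), costs


# Made-up numbers: both frame types grow with the distance between references.
def toy_p_bits(distance):
    return 4000 + 1500 * (distance - 1)


def toy_b_bits(distance):
    return 1000 + 500 * (distance - 1)


if __name__ == "__main__":
    best, costs = best_bframe_count(12, 3, toy_p_bits, toy_b_bits)
    print(best, costs)
```

With these toy numbers the totals for 0, 1, 2 and 3 b-frames are 48000, 42000, 44000 and 48000 bits, so the minimum sits at one b-frame and the total rises on either side of it, which is the shape described above.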

now, this is only a description of the problem and not the solution, since you do not want to do several encodes of the same frames just to see what number of b frames is optimal. there should be some way to determine this by looking at some factor, for example the same one used for determining the bitbudget and quantizer settings.

pandv · 6th March 2002, 17:38

About the B frames thing, I am wondering whether it's possible to have IBBBBI (maybe reordered to IIBBBB in the stream), without P frames?

This could be useful in transitions between two images (a scroll, for example), without the penalty of the bigger P frames.
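
The reordering mentioned above follows from the fact that a B frame can only be decoded once both of its reference frames are available, so each run of B frames is written to the stream after the reference frame that follows it in display order. A minimal sketch (the function name is made up for illustration):

```python
def display_to_coded_order(display_types):
    """Reorder frame types from display order to coded (stream) order.

    Each run of B frames is moved behind the reference frame (I or P)
    that follows it in display order. Returns (display_index, type) pairs.
    """
    coded = []
    pending_b = []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append((index, frame_type))
        else:
            coded.append((index, frame_type))  # reference goes out first
            coded.extend(pending_b)            # then the B frames that need it
            pending_b = []
    if pending_b:
        raise ValueError("trailing B frames have no following reference frame")
    return coded
```

For `"IBBBBI"` this yields the types in the order I, I, B, B, B, B, with the second I being display frame 5.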

pandv.

saVe · 6th March 2002, 17:45

i totally agree with you, athos.

i think the perfect number of b-frames between two i-frames or p-frames can only be found by trying. as i wrote before, this could maybe be determined by changes in picture information, so when a p-frame from the 1st pass differs by more than xx% from the i-/p-frame you are referring to, there would be a p-frame inserted in the 2nd pass. of course this value has yet to be found but i think there is one. the developers could implement a slider as you suggested and let us users (aren't we beta testers by now?) find the best settings for the percentages (by using the famous movie "the replacements" *lol*).

athos · 6th March 2002, 17:59

save> i was thinking maybe it is possible to estimate how large the p frames need to be, given a number of b frames, without processing the entire frame? I'm looking for a formula that would state something in the style of:

Code:

4 p-frames of size x
translates to
2 p-frames of size y and 2 b-frames of size z

obviously, you would here be looking for a ratio of b and p frames such that the sum of 2 * y + 2 * z is less than 4 * x.

Now, this estimation will probably not be exact, but hopefully it might be good enough to render good results.

Acaila · 6th March 2002, 19:48

@Nic:

When I started posting at DivX.com I was a total noob, but being compared to Chris is always a compliment in my view.

@Athos:
I hope I understand your thoughts correctly, but the only way to give good results (as I see it) would be for the codec to encode each frame between two I-frames as both P and B frames. Then to calculate which sequence of these P and B frames will result in the smallest size, and discard the rest. I think this will slow down encoding a lot though.
But it would be a much more efficient use of space.

athos · 6th March 2002, 20:54

Acalia> Well, this is one solution, that is slow but thorough. I can think of two alternatives (not sure if they are plausible, just ideas):

1. Encode p frames only in the first pass, and b frames only in the second pass and then compare these. This would only work in 2 pass mode of course. I imagine this might not work, as the second pass is not identical to the first, but maybe the relative relationship in size is comparable?
2. Somehow (more or less heuristically) estimate any bit savings by using b frames instead of p frames. I am not very good at math, so i am not sure if this can be done, and if so how well. I am thinking of some function that takes the factor that decides how useful b frames are (complexity? motion? bitrate demand?) and the size of the p frames and returns an estimate of the savings (or not) from turning them into b frames. a requirement of this function is of course that it is faster (less complex) than actually rendering the b frames and comparing.

as you can see, i am thinking abstractly here, just throwing ideas.

even in the case that you describe (sort of a worst case), maybe the benefits in quality would make it worth implementing (in the future) as an option? if the double rendering is only needed in the second pass, then the rendering time would increase by roughly 50% in a 2 pass encode. of course, if it is needed in both passes it would increase by roughly 100%.

saVe · 6th March 2002, 21:41

@athos:
your first idea cannot work, because the first pass uses bigger quantizers over the whole movie than the second pass, so comparing them in terms of size would not lead to a meaningful result. what we need to do is to compare the size of the second pass p-frames to the size of second pass b-frames.

i think your second idea has the problem that estimating bit savings depends on the source a lot. in high motion scenes too many b-frames would certainly be the worst thing to do because, for example, the ending p-frame is very different from the first b-frame. i hope you get what i mean, maybe it's not the best way of describing it...

@acaila:
when you render all frames as b-frames instead of p-frames you don't take into account that when you insert p-frames more often the b-frames get smaller too. again, if the distance between b-frame and one of the p-frames gets too big, maybe the b-frame will come out bigger than a p-frame would have... in this case inserting another p-frame can save space and the b-frames will be different from the greater number of b-frames that were encoded in the first place. the problem is just where the heck to insert the damn p-frames!
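
If one assumes that the size of a frame depends only on the two reference frames on either side of it, the question of where to put the p-frames can be treated as a shortest-path problem over possible reference positions. The sketch below is one such formulation, not anything Xvid does: `p_cost` and `b_cost` are hypothetical callbacks standing in for real (or estimated) frame sizes, and the last frame of the scene is forced to be a reference.

```python
def best_p_positions(num_frames, p_cost, b_cost, max_bframes):
    """Pick p-frame positions for frames 0..num_frames-1 with minimal total bits.

    Frame 0 is the i-frame and the last frame is forced to be a reference.
    p_cost(prev, cur): assumed bits for a p-frame at cur predicted from prev.
    b_cost(prev, nxt, i): assumed bits for b-frame i between refs prev and nxt.
    Returns (frame_type_string, total_bits_excluding_the_i_frame).
    """
    best = [float("inf")] * num_frames  # best[k]: cheapest way to end a segment at k
    back = [None] * num_frames          # back[k]: previous reference on that path
    best[0] = 0  # the i-frame costs the same for every choice, so it is left out
    for cur in range(1, num_frames):
        for prev in range(max(0, cur - max_bframes - 1), cur):
            cost = best[prev] + p_cost(prev, cur)
            cost += sum(b_cost(prev, cur, i) for i in range(prev + 1, cur))
            if cost < best[cur]:
                best[cur], back[cur] = cost, prev
    # Walk back from the last frame to recover the chosen reference positions.
    types = ["B"] * num_frames
    types[0] = "I"
    cur = num_frames - 1
    while cur > 0:
        types[cur] = "P"
        cur = back[cur]
    return "".join(types), best[-1]
```

This does not need a full encode for every combination, but it still needs a cost for every (prev, cur) pair within `max_bframes + 1` frames of each other, so the hard part remains getting those costs cheaply.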

to you both: please don't take this as an offense, it's an amateur's opinion!

Acaila · 6th March 2002, 22:13

Quote:

@acaila:
when you render all frames as b-frames instead of p-frames you don't take into account that when you insert p-frames more often the b-frames get smaller too. again, if the distance between b-frame and one of the p-frames gets too big, maybe the b-frame will come out bigger than a p-frame would have... in this case inserting another p-frame can save space and the b-frames will be different from the greater number of b-frames that were encoded in the first place. the problem is just where the heck to insert the damn p-frames!

Yes you are correct. I had also realised that before I wrote it down, but I refrained from adding that part because it would have over-complicated things.
I'm no coder, I was just passing an idea around so someone with understanding of codec internals could either use it as inspiration or dismiss it.
I do think the whole idea of variable B-frames is a very good one and would do XviD a lot of good when implemented (if it's possible).

Oh, and thank you for writing my name correctly. It seems many people use a personal variation instead of the real deal. I've seen like 3 different versions already today

saVe · 6th March 2002, 22:26

so you got my point? wow, someone understands what i'm saying!

have to improve my explanation skills though....

i'm not a coder either so maybe we should post this over at videocoding.de/xvid.org! the core coders are all there, maybe they find it interesting...

edit: -h, koepi, nic (in alphabetical order), what do you think about this? would it be possible? if yes, could one of you post it at videocoding.de? i think they would more likely listen to you!

Acaila · 6th March 2002, 22:37

No need to clutter up that forum with trivia. If Koepi, -h or Nic think this is an interesting idea they'll pass it along.
