
The purpose of parts of the H.264 specs...


Quarkboy
27th April 2008, 11:31
I've been reading through the ITU H.264 specifications (2005 revision), mainly for fun (yes, I know, I'm sadistic), but now that I have a much better grasp of the exact structure of the format, a few lingering questions remain about the overall purpose of some of the included syntax/semantics:

First, what exactly is the practical use of slices? I understand that it's useful in theory to split up a picture into smaller logical units that aren't necessarily the same size, but then there are all those various slice mappings (section 8.2.2), which, assuming I'm reading the specs right, basically map various sub-rectangles of the picture to slices and use non-standard orderings of the macroblocks within them (i.e. not simply raster ordering but a bunch of other variants).
While I could see the theoretical benefit of using these types of slices in specific cases, the computational complexity of searching for the optimal organization would seem to far outweigh any benefit of using these non-standard scans... especially considering that you can decide intra/inter prediction at sub-macroblock precision...
I know that x264 uses slices to do multi-threaded encoding on multi-core processors, but there seems to be little use for anything but the simplest style of slicing. Why is such a complicated collection part of the specifications (and required for all profiles, I might add)?

Second, I don't understand how SI and SP macroblocks can be practically applied to the situation they were intended for: random access and ease of editing. First of all, they're only allowed in Extended profile, but Extended also has a bunch of other restrictions that make it a lot closer to Baseline than to High profile, including no CABAC and no higher-precision color modes. The basic idea with them seems to be that they are redundant slices enabling the frames to be decoded in different orderings, but I don't see how an encoder could reasonably make use of this feature to ensure true random-access or editing capability. Wouldn't it be easier just to force insertion of IDR frames in the encoder to create cut points than to try to optimize SP/SI macroblocks? Not to mention that video editing is done per-frame, and not per-slice, so why are SI and SP slice-level parameters in the first place?

I realize that x264 doesn't implement Extended profile, and maybe some of my questions point at the reasons for that, but I'd like to hear what the elders here think about these questions.

Dark Shikari
27th April 2008, 11:35
> First, what exactly is the practical use of slices?

Slices make parallel encoding, especially with hardware-based encoders, a lot easier. Additionally, they make parallel decoding easier. Plus, since the CABAC context restarts at the beginning of each slice, slicing adds redundancy too.

> I understand that it's useful in theory to split up a picture into smaller logical units that aren't necessarily the same size, but then there are all those various slice mappings (section 8.2.2), which, assuming I'm reading the specs right, basically map various sub-rectangles of the picture to slices and use non-standard orderings of the macroblocks within them (i.e. not simply raster ordering but a bunch of other variants).
> While I could see the theoretical benefit of using these types of slices in specific cases, the computational complexity of searching for the optimal organization would seem to far outweigh any benefit of using these non-standard scans... especially considering that you can decide intra/inter prediction at sub-macroblock precision...

Yes, the weird slice orderings aren't used by anyone as far as I know.

> I know that x264 uses slices to do multi-threaded encoding on multi-core processors

No, it doesn't; it uses frame-based threading (and has for well over a year now).

> but there seems to be little use for anything but the simplest style of slicing. Why is such a complicated collection part of the specifications (and required for all profiles, I might add)?

The weird slices (flexible macroblock ordering, etc.) that Extended uses aren't required in any profile, and nobody that I know uses Extended.

> Second, I don't understand how SI and SP macroblocks can be practically applied to the situation they were intended for: random access and ease of editing. First of all, they're only allowed in Extended profile, but Extended also has a bunch of other restrictions that make it a lot closer to Baseline than to High profile, including no CABAC and no higher-precision color modes. The basic idea with them seems to be that they are redundant slices enabling the frames to be decoded in different orderings, but I don't see how an encoder could reasonably make use of this feature to ensure true random-access or editing capability. Wouldn't it be easier just to force insertion of IDR frames in the encoder to create cut points than to try to optimize SP/SI macroblocks? Not to mention that video editing is done per-frame, and not per-slice, so why are SI and SP slice-level parameters in the first place?

See Akupenguin's explanation (http://akuvian.org/src/x264/switching_pictures.txt).
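To make the "slice mappings" in this exchange concrete, here is a minimal Python sketch contrasting plain raster-order slicing with the "dispersed" slice group pattern (slice_group_map_type 1). The dispersed formula follows my reading of subclause 8.2.2 of the spec and should be checked against it; the function names and the even raster split are illustrative assumptions, not anything taken from x264 or another encoder.

```python
def raster_slice_map(total_mbs, num_slices):
    """Simplest slicing: contiguous runs of macroblocks in raster order.

    Returns, for each macroblock address, the index of the slice it falls in.
    The near-even split used here is an illustrative choice, not a spec rule.
    """
    return [i * num_slices // total_mbs for i in range(total_mbs)]


def dispersed_slice_group_map(pic_width_in_mbs, pic_height_in_mbs, num_slice_groups):
    """Dispersed slice group map (slice_group_map_type 1), as I read the spec.

    Each macroblock address i (raster order) is assigned to a slice group so
    that neighbouring macroblocks tend to land in different groups.
    """
    total_mbs = pic_width_in_mbs * pic_height_in_mbs
    return [
        ((i % pic_width_in_mbs)
         + ((i // pic_width_in_mbs) * num_slice_groups) // 2) % num_slice_groups
        for i in range(total_mbs)
    ]


if __name__ == "__main__":
    width, height = 4, 2
    print(raster_slice_map(width * height, 2))
    print(dispersed_slice_group_map(width, height, 2))
```

With two slice groups the dispersed map comes out as a checkerboard, which is why it is usually discussed as an error-concealment tool: if one group is lost, every missing macroblock still has decoded neighbours.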

Quarkboy
27th April 2008, 11:53
...

Thanks! That was fast... Now, rereading a bit, I see that not allowing arbitrary slice ordering also basically forbids any of the interesting slice scanning orders (I think), so everything but Extended doesn't have to worry about that.

Interesting about how SI and SP frames would work in practice... Seems pretty pointless to me, at least from a consumer standpoint, but I could see how broadcasters could use such functionality... Do any of the current HD broadcasts in H.264 use Extended profile? Does hardware that can decode it even exist?

Dark Shikari
27th April 2008, 11:54
> Interesting about how SI and SP frames would work in practice... Seems pretty pointless to me, at least from a consumer standpoint, but I could see how broadcasters could use such functionality... Do any of the current HD broadcasts in H.264 use Extended profile? Does hardware that can decode it even exist?

Not as far as I know. Scalable Video Coding is the next big "it's supposed to be useful to broadcasters" thing, but I don't know anyone going for that yet.

Quarkboy
27th April 2008, 11:59
While I have your attention... a feature request!

It's possible to modify the deblocking parameters per-slice, in the syntax (something I didn't realize before reading it). Would it be possible to add in an "auto" deblocking strength option that automatically chooses deblocking strength per slice based on some more complicated metric than simply QP?

Dark Shikari
27th April 2008, 12:03
> While I have your attention... a feature request!
>
> It's possible to modify the deblocking parameters per-slice, in the syntax (something I didn't realize before reading it). Would it be possible to add in an "auto" deblocking strength option that automatically chooses deblocking strength per slice based on some more complicated metric than simply QP?

Some encoders already have this, an "adaptive deblocking strength" per frame (or even per slice, though I have never seen an encoder that varies it among slices in the same frame). It would require a bit of new bitstream syntax, I think, but not much; it wouldn't be hard.

Some encoders do use adaptive deblocking strengths, but I'm not sure how useful that would be; if you want to change the curve of "deblocking strength versus QP", it's completely impossible when you're using adaptive quantization, since you can only vary deblocking strength per-frame, not per-block.
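As a rough illustration of why the slice-level deblocking parameters can shift the "deblocking strength versus QP" curve but not reshape it per block, here is a small Python sketch of how the two slice header offsets (slice_alpha_c0_offset_div2 and slice_beta_offset_div2) enter the filter's table lookup. It follows my reading of the deblocking subclause (8.7.2) of the spec for luma edges; the function names are made up for this example, and the threshold tables themselves are left out.

```python
def clip3(lo, hi, x):
    """Clamp x into the inclusive range [lo, hi]."""
    return max(lo, min(hi, x))


def deblock_table_indices(qp_p, qp_q, alpha_c0_offset_div2=0, beta_offset_div2=0):
    """Return (indexA, indexB) used to look up the alpha and beta thresholds.

    qp_p and qp_q are the QPs of the two macroblocks on either side of the
    edge. The two offsets are per-slice values, assumed here to lie in
    -6..+6 as in the slice header semantics.
    """
    for offset in (alpha_c0_offset_div2, beta_offset_div2):
        if not -6 <= offset <= 6:
            raise ValueError("slice deblocking offsets must be in -6..+6")
    qp_av = (qp_p + qp_q + 1) >> 1             # average QP across the edge
    filter_offset_a = alpha_c0_offset_div2 * 2  # same offset for every edge in the slice
    filter_offset_b = beta_offset_div2 * 2
    index_a = clip3(0, 51, qp_av + filter_offset_a)
    index_b = clip3(0, 51, qp_av + filter_offset_b)
    return index_a, index_b


if __name__ == "__main__":
    print(deblock_table_indices(26, 28))         # no offsets
    print(deblock_table_indices(26, 28, -3, 2))  # weaker alpha, stronger beta
```

The offsets are constants added to whatever the local average QP happens to be, so with adaptive quantization every block in the slice still gets the same shift on top of its own QP.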

bond
1st May 2008, 10:46
iirc slices are there for error resilience purposes in the first place