Old 24th May 2011, 00:08   #1  |  Link
Rouhi
Registered User
 
Join Date: Apr 2011
Posts: 64
Intra Frame bit stream format in H.264

As we know, the compressed data stored in I-frames in the H.264 standard can be categorised into two main types:
1- Discrete Cosine Transform (or, more precisely, integer transform) coefficients. As I understand it, this type of data is used for the blocks that could not be encoded by intra-frame prediction.
2- Intra-frame prediction codes. These codes range from 0 to 8 for 4x4 blocks and from 0 to 3 for 16x16 blocks.
Do you have any clue how we can access this low-level data and how it is stored inside frames? I know it is in Golomb code format. What's your idea?
Old 24th May 2011, 07:17   #2  |  Link
imcold
pencil artist
 
Join Date: Jan 2006
Location: Slovakia
Posts: 201
You should read the specification of the standard, section "Syntax and semantics". That's about the best explanation you can find. You'll have to parse the bitstream as any H.264 decoder does to get info about each macroblock.
I16x16 coding, for example, assuming CAVLC coding (with CABAC, the macroblock data is arithmetic-coded instead of being written as plain Exp-Golomb codes):
-mb_type: unsigned Exp-Golomb <1..24>, based on CBP for luma & chroma and prediction type
-chroma prediction mode: unsigned Exp-Golomb
-mb_qp_delta: signed Exp-Golomb
-luma DC block
-luma AC blocks
-chroma DC blocks
-chroma AC blocks
Some parts are optional, based on CBP.
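
For readers who have not met Exp-Golomb codes before, here is a minimal sketch of how the unsigned and signed variants mentioned above can be decoded. The `BitReader` class and its method names are hypothetical helpers written for this illustration, not part of any library, and the sketch assumes the input has already had emulation-prevention bytes removed.

```python
class BitReader:
    """Reads bits MSB-first from a bytes object (illustrative helper)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, in bits

    def read_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def read_bits(self, n: int) -> int:
        value = 0
        for _ in range(n):
            value = (value << 1) | self.read_bit()
        return value

    def read_ue(self) -> int:
        """Unsigned Exp-Golomb: count leading zeros, skip the 1, read that many more bits."""
        zeros = 0
        while self.read_bit() == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.read_bits(zeros)

    def read_se(self) -> int:
        """Signed Exp-Golomb: code numbers 1, 2, 3, 4 map to +1, -1, +2, -2."""
        k = self.read_ue()
        return (k + 1) // 2 if k & 1 else -(k // 2)


# Bit string 1 | 010 | 011 | 00100 (padded with zeros) holds the code numbers 0, 1, 2, 3.
reader = BitReader(bytes([0xA6, 0x40]))
print([reader.read_ue() for _ in range(4)])  # [0, 1, 2, 3]
```

Because each code has a variable length, the position of the next syntax element is only known after the current one has been decoded.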
__________________
fevh264 - open-source baseline h.264 encoder
Old 27th May 2011, 04:26   #3  |  Link
Rouhi
Registered User
 
Join Date: Apr 2011
Posts: 64
Do you think the adjacency of predicted block codes can be tracked in coded video without decompressing it?
I hope you understand what I mean. Suppose the intra prediction code of a 4x4 block is 3, and my file pointer is on this coded block. Can I find the adjacent coded blocks in the compressed file? For example, what are the codes of the top, left, right or lower blocks, if they exist?
Old 27th May 2011, 07:48   #4  |  Link
imcold
pencil artist
 
Join Date: Jan 2006
Location: Slovakia
Posts: 201
You can't know the length of (macro)blocks without parsing them first, so seeking to get prediction info only is impossible. Also, blocks don't start on byte boundaries.
Old 31st May 2011, 03:27   #5  |  Link
Rouhi
Registered User
 
Join Date: Apr 2011
Posts: 64
Without parsing it is certainly impossible. But in my question I said without "decompressing"...
Suppose I have parsed the video data and my file pointer is on the prediction code of a 4x4 block, and suppose it has a value, for example 3 (one of the 9 intra prediction codes), OK?
I just want to know how I can find the prediction codes of the other 4x4 blocks that are near this block. The point is that I am still in the compressed domain and don't know where the top or left or ... is. So in this case, is it possible to find the codes of the blocks around our file pointer (from a spatial point of view) in the compressed domain?
Put another way, my question is: is it possible to find spatial direction (top, bottom, left and right) in the compressed domain of a video file in MPEG-4 AVC format?
Old 31st May 2011, 05:30   #6  |  Link
imcold
pencil artist
 
Join Date: Jan 2006
Location: Slovakia
Posts: 201
If you can parse the slice data correctly, then yes: it is "possible to find the other blocks coefficients around", by having a macroblock array/table, built up while parsing and keeping at least the previous and current row of MBs. You'll need to keep at least the prediction info and nonzero counts in the table. You know the image width & height in MB units, so of course you know where the top/left/etc. macroblock is.
I'm talking about parsing because you don't need to fully decompress the data; I hoped that much was clear.
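
A rough sketch of such a table follows. It assumes macroblocks arrive in plain raster order (no FMO or MBAFF) and, for simplicity, keeps the whole picture rather than only two rows of MBs. `blk_xy` and `ModeGrid` are hypothetical names invented for this illustration; the only thing taken from the standard is the zig-zag numbering of the sixteen 4x4 luma blocks inside a macroblock. Slice boundaries are ignored here, although a neighbour in another slice is not available for intra prediction.

```python
def blk_xy(mb_addr, blk_idx, width_in_mbs):
    """Picture-wide (column, row), in 4x4-block units, of luma block blk_idx in macroblock mb_addr."""
    mb_x = mb_addr % width_in_mbs
    mb_y = mb_addr // width_in_mbs
    # Blocks 0..15 are numbered 8x8 quadrant by quadrant, then 4x4 block within the quadrant.
    x = (blk_idx & 1) + 2 * ((blk_idx >> 2) & 1)
    y = ((blk_idx >> 1) & 1) + 2 * (blk_idx >> 3)
    return mb_x * 4 + x, mb_y * 4 + y


class ModeGrid:
    """Stores one intra 4x4 prediction mode per 4x4 block of the picture (illustrative)."""

    def __init__(self, width_in_mbs, height_in_mbs):
        self.width_in_mbs = width_in_mbs
        self.modes = [[None] * (width_in_mbs * 4) for _ in range(height_in_mbs * 4)]

    def store(self, mb_addr, blk_idx, mode):
        bx, by = blk_xy(mb_addr, blk_idx, self.width_in_mbs)
        self.modes[by][bx] = mode

    def neighbours(self, mb_addr, blk_idx):
        """Modes of the left and top neighbours, or None at the picture edge or if not stored yet."""
        bx, by = blk_xy(mb_addr, blk_idx, self.width_in_mbs)
        left = self.modes[by][bx - 1] if bx > 0 else None
        top = self.modes[by - 1][bx] if by > 0 else None
        return {"left": left, "top": top}


grid = ModeGrid(width_in_mbs=2, height_in_mbs=1)
grid.store(0, 5, 3)           # block 5 of MB 0 sits in the rightmost column of MB 0
grid.store(1, 0, 7)           # block 0 of MB 1 is directly to its right
print(grid.neighbours(1, 0))  # {'left': 3, 'top': None}
```

Keeping the left and top modes is not optional when parsing Intra 4x4 macroblocks: the bitstream codes each mode relative to a prediction derived from those two neighbours, so the table is needed just to recover the mode values themselves.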

Last edited by imcold; 31st May 2011 at 05:33.
Old 2nd June 2011, 00:45   #7  |  Link
Rouhi
Registered User
 
Join Date: Apr 2011
Posts: 64
Thanks for your reply. Regarding parsing the slice correctly, we had a discussion (which may still be ongoing) with Selur in this topic:

Bit stream structure of Access Unit and VCL NAL unit in MPEG-4

We finally agreed on the bitstream header of I, P and B frames. As an example, for I slices the header is 0x 00 00 01 X Y

0x 00 00 01 is the three-byte start code prefix.

X = 0x25, 0x45 or 0x65 and consists of: forbidden_zero_bit + nal_ref_idc + nal_unit_type. nal_unit_type = 5 (00101) is used for IDR pictures and I-frames.
X = 0x01 means B-frame and X = 0x41 means P-frame (nal_unit_type = 1).

Y should be 00. If Y is not zero, it means that the I-frame is sliced.

With this approach we got the same results from videos with two different programs.
If you have a look at that topic, maybe you can give me some more good advice like this.
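
As an illustration of how the byte X breaks down into the three fields named above, here is a small sketch. `parse_nal_header` is a hypothetical function name chosen for this example; the field widths (1, 2 and 5 bits) follow the NAL unit header layout.

```python
def parse_nal_header(x: int) -> dict:
    """Split the one-byte NAL unit header into its three fields."""
    return {
        "forbidden_zero_bit": (x >> 7) & 0x01,  # top bit, must be 0
        "nal_ref_idc": (x >> 5) & 0x03,         # next 2 bits
        "nal_unit_type": x & 0x1F,              # low 5 bits
    }


for x in (0x65, 0x45, 0x25, 0x41, 0x01):
    print(hex(x), parse_nal_header(x))
# 0x65, 0x45 and 0x25 all give nal_unit_type 5 with nal_ref_idc 3, 2 and 1;
# 0x41 and 0x01 both give nal_unit_type 1 with nal_ref_idc 2 and 0.
```

Note that nal_unit_type 1 only says "slice of a non-IDR picture". Whether that slice is I, P or B is signalled by slice_type inside the slice header, so reading 0x41 as P and 0x01 as B is a heuristic based on typical encoder behaviour (B-frames are often non-reference, hence nal_ref_idc 0) rather than something the header byte guarantees.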

Last edited by Rouhi; 2nd June 2011 at 00:53.
Old 2nd June 2011, 10:23   #8  |  Link
imcold
pencil artist
 
Join Date: Jan 2006
Location: Slovakia
Posts: 201
Y is part of the slice header; you can start your bit-level parsing there to get more info about the slice (but AFAIK you won't get far if you don't have the info from the SPS/PPS).
What is your ultimate goal, by the way?
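
To make the point about Y concrete: the slice header begins with three unsigned Exp-Golomb fields, first_mb_in_slice, slice_type and pic_parameter_set_id, so Y is bit-packed data rather than a standalone byte. The sketch below decodes just those three fields. `read_ue` and `slice_header_start` are hypothetical helper names, the sample bytes are made up for the illustration, and everything after these fields genuinely depends on the SPS/PPS, as noted above.

```python
def read_ue(data: bytes, pos: int):
    """Decode one unsigned Exp-Golomb code at bit offset pos; return (value, new_pos)."""
    def bit(p):
        return (data[p >> 3] >> (7 - (p & 7))) & 1

    zeros = 0
    while bit(pos) == 0:
        zeros += 1
        pos += 1
    pos += 1  # skip the terminating 1 bit
    suffix = 0
    for _ in range(zeros):
        suffix = (suffix << 1) | bit(pos)
        pos += 1
    return (1 << zeros) - 1 + suffix, pos


# slice_type values 5..9 mean the same as 0..4 (and apply to every slice of the picture).
SLICE_TYPES = ["P", "B", "I", "SP", "SI"]


def slice_header_start(payload: bytes) -> dict:
    """Decode the first three slice header fields from the bytes that follow X."""
    pos = 0
    first_mb, pos = read_ue(payload, pos)
    slice_type, pos = read_ue(payload, pos)
    pps_id, pos = read_ue(payload, pos)
    return {
        "first_mb_in_slice": first_mb,
        "slice_type": SLICE_TYPES[slice_type % 5],
        "pic_parameter_set_id": pps_id,
    }


# 0x88 = 1 | 0001000: first_mb_in_slice = 0, slice_type = 7 (I); 0x84 starts with 1: PPS id 0.
print(slice_header_start(bytes([0x88, 0x84])))
# {'first_mb_in_slice': 0, 'slice_type': 'I', 'pic_parameter_set_id': 0}
```

A first payload byte with its top bit set therefore means the slice starts at macroblock 0.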

Last edited by imcold; 2nd June 2011 at 10:26.
Old 6th June 2011, 04:16   #9  |  Link
Rouhi
Registered User
 
Join Date: Apr 2011
Posts: 64
According to the H.264 standard, the header contains only the 3-byte start code prefix and one byte X, the header that gives the NAL unit type, as shown below:

0x00 00 01 X

But Y, which is the first byte after X, is part of the VCL or non-VCL data. If you have another view, please let me know your reference.
BTW, for SPS the nal_unit_type (the 5 least significant bits of X) should be 7, and for PPS the nal_unit_type would be 8.

You asked
Quote:
What is your ultimate goal, by the way?
My answer is that I am looking for the 4x4 and 16x16 intra prediction modes (0 to 8 for 4x4 and 0 to 3 for 16x16 blocks) in the VCL NAL units.
For my goal, the information stored in the SPS seems not very important, because it contains resolution and colour coding information. But the PPS, which contains information such as picture coding, picture partitioning into slices and entropy coding, would be more important in my research, although I am just looking for intra prediction modes and their location in the slices. Do you have any suggestions?
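
The start code prefix and nal_unit_type values discussed in this post are enough to locate the SPS, PPS and slice NAL units in a byte stream before any bit-level parsing. A minimal sketch, assuming an Annex B style stream held in memory; `iter_nal_units` is a hypothetical name, and the sample bytes are invented for the example.

```python
def iter_nal_units(stream: bytes):
    """Yield (nal_unit_type, payload_offset) for each 00 00 01 start code prefix found."""
    i = 0
    while True:
        i = stream.find(b"\x00\x00\x01", i)
        if i < 0 or i + 3 >= len(stream):
            return  # no more start codes, or a start code with no header byte after it
        header = stream[i + 3]
        yield header & 0x1F, i + 4  # low 5 bits of X; payload begins right after X
        i += 3


# A made-up stream: SPS (0x67), PPS (0x68), then an IDR slice (0x65).
sample = bytes.fromhex("0000000167420000016 8ce00000165 88".replace(" ", ""))
print([nal_type for nal_type, _ in iter_nal_units(sample)])  # [7, 8, 5]
```

Searching for the three-byte pattern is safe because the encoder inserts emulation-prevention bytes (00 00 03) so that a start code can never occur inside a NAL unit; those extra bytes have to be removed again before the payload is parsed bit by bit.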

Last edited by Rouhi; 6th June 2011 at 04:34.
Old 6th June 2011, 10:04   #10  |  Link
imcold
pencil artist
 
Join Date: Jan 2006
Location: Slovakia
Posts: 201
I think you should take a look at the h264bitstream library. It's probably your best choice for getting the data you want, if you don't want to get involved with the H.264 spec.

Last edited by imcold; 6th June 2011 at 10:07.