Old 15th April 2009, 05:58   #1  |  Link
videophool
Registered User
 
Join Date: Nov 2008
Posts: 91
Parsing h.264

I am working on an H.264 parser. I have some confusion with regard to SEI NAL units. The syntax in 7.3.2.3 of the spec says to loop while( more_rbsp_data() ), but I am finding that some SEI NALs have extra bytes that I cannot account for from the spec. For example, I see an SEI type 0 with a payloadSize of 9 bytes. Immediately after the 9 bytes of data there is a 0x40, and then an SEI type 5. The 0x40 appears to be a delimiter, but I cannot find any description of it in the spec, and so my parser misses the second SEI in the NAL. In another SEI in the stream, there is a type 1 with a payloadSize of 7 bytes. The payload data is followed by two additional bytes (0x25 0x10), and then 0x80.

There is always a 0x80 at the end of every SEI, which looks like a byte alignment value (10000000b). However, the spec does not seem to require this byte since the SEI was already byte aligned. But the 0x80 appears to be useful for parsing an SEI.

I don't want to implement my parser based on guesses from what I see in a stream. I must be missing some piece of information from the spec. Any SEI parsing tips would be greatly appreciated. Thanks.
Old 15th April 2009, 06:18   #2  |  Link
Guest
Guest
 
Join Date: Jan 2002
Posts: 21,922
Post a full hex dump of your SEI NALU so that we may parse it without relying on your possibly erroneous interpretation.

Last edited by Guest; 15th April 2009 at 06:20.
Old 15th April 2009, 07:58   #3  |  Link
videophool
Registered User
 
Join Date: Nov 2008
Posts: 91
Here is one where there is an SEI type 0 with 9 bytes followed by an SEI type 5 with 565 bytes. I understand what I am looking at, but I cannot find any details in the spec for parsing multiple SEIs in one NAL.

00 00 00 01 06 00 09 80 00 22 F1 80 00 00 03 00 40 05 FF FF 37 DC 45 E9 BD E6 D9 48 B7 96 2C D8
20 D9 23 EE EF 78 32 36 34 20 2D 20 63 6F 72 65 20 36 37 20 2D 20 48 2E 32 36 34 2F 4D 50 45 47
2D 34 20 41 56 43 20 63 6F 64 65 63 20 2D 20 43 6F 70 79 6C 65 66 74 20 32 30 30 33 2D 32 30 30
39 20 2D 20 68 74 74 70 3A 2F 2F 77 77 77 2E 76 69 64 65 6F 6C 61 6E 2E 6F 72 67 2F 78 32 36 34
2E 68 74 6D 6C 20 2D 20 6F 70 74 69 6F 6E 73 3A 20 63 61 62 61 63 3D 31 20 72 65 66 3D 33 20 64
65 62 6C 6F 63 6B 3D 31 3A 30 3A 30 20 61 6E 61 6C 79 73 65 3D 30 78 31 3A 30 78 31 31 31 20 6D
65 3D 68 65 78 20 73 75 62 6D 65 3D 35 20

Here is a type 1 that has 7 bytes of data, and then two extra bytes.

00 00 00 01 06 01 07 00 00 03 00 00 03 00 25 10 80

Note that both are terminated with 0x80, which is a legit byte alignment value (that seems to be unnecessary, since the SEI is already aligned). I can not find any other explanation in the spec for the meaning of this byte.

Thanks.
Old 15th April 2009, 11:43   #4  |  Link
Sergey A. Sablin
Registered User
 
Join Date: Dec 2004
Location: Tomsk, Russia
Posts: 366
Quote:
Originally Posted by videophool View Post
Here is one where there is a SEI type 0 with 9 bytes followed by an SEI type 5 with 565 bytes. I understand what I am looking at, but I can not find any details in the spec for parsing multiple SEI's in one NAL

7.3.1. NAL unit syntax.
emulation_prevention_three_byte is your keyword.
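
For readers following along, here is a minimal sketch in C of the removal step this reply points to, assuming the input is one NAL unit with the start code already stripped. The function name nal_to_rbsp is made up for this example and is not taken from JM. It drops every 0x03 that follows two 0x00 bytes, which is how emulation_prevention_three_byte shows up in the byte stream.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy a NAL unit (the bytes after the start code) into dst, dropping each
 * emulation_prevention_three_byte: a 0x03 that follows two consecutive
 * 0x00 bytes. dst must have room for at least len bytes.
 * Returns the number of bytes written to dst. */
size_t nal_to_rbsp(const uint8_t *src, size_t len, uint8_t *dst)
{
    size_t out = 0;
    int zeros = 0;                  /* consecutive 0x00 bytes seen so far */

    for (size_t i = 0; i < len; i++) {
        if (zeros >= 2 && src[i] == 0x03) {
            zeros = 0;              /* escape byte: skip it, restart the count */
            continue;
        }
        zeros = (src[i] == 0x00) ? zeros + 1 : 0;
        dst[out++] = src[i];
    }
    return out;
}
```

Applied to the second dump above (06 01 07 00 00 03 00 00 03 00 25 10 80), this gives 06 01 07 00 00 00 00 00 25 10 80: a type 1 payload of exactly 7 bytes followed by the 0x80 trailing byte, so the two "extra" bytes were part of the payload. In the first dump the 03 in 00 00 03 00 40 is removed the same way, which makes 0x40 the ninth payload byte of the type 0 message rather than a delimiter.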
Old 15th April 2009, 14:07   #5  |  Link
Guest
Guest
 
Join Date: Jan 2002
Posts: 21,922
First, as Sergey says, you have to remove the emulation prevention bytes. Then you parse as done in the JM reference software:

Code:
void InterpretSEIMessage(byte* msg, int size, ImageParameters *img)
{
  int payload_type = 0;
  int payload_size = 0;
  int offset = 1;
  byte tmp_byte;
  do
  {
    // sei_message();
    payload_type = 0;
    tmp_byte = msg[offset++];
    while (tmp_byte == 0xFF)
    {
      payload_type += 255;
      tmp_byte = msg[offset++];
    }
    payload_type += tmp_byte;   // this is the last byte

    payload_size = 0;
    tmp_byte = msg[offset++];
    while (tmp_byte == 0xFF)
    {
      payload_size += 255;
      tmp_byte = msg[offset++];
    }
    payload_size += tmp_byte;   // this is the last byte

    switch ( payload_type )     // sei_payload( type, size );
    {
    case  SEI_BUFFERING_PERIOD:
      interpret_buffering_period_info( msg+offset, payload_size, img );
      break;
    case  SEI_PIC_TIMING:
      interpret_picture_timing_info( msg+offset, payload_size, img );
      break;
    case  SEI_PAN_SCAN_RECT:
      interpret_pan_scan_rect_info( msg+offset, payload_size, img );
      break;
    case  SEI_FILLER_PAYLOAD:
      interpret_filler_payload_info( msg+offset, payload_size, img );
      break;
    case  SEI_USER_DATA_REGISTERED_ITU_T_T35:
      interpret_user_data_registered_itu_t_t35_info( msg+offset, payload_size, img );
      break;
    case  SEI_USER_DATA_UNREGISTERED:
      interpret_user_data_unregistered_info( msg+offset, payload_size, img );
      break;
    case  SEI_RECOVERY_POINT:
      interpret_recovery_point_info( msg+offset, payload_size, img );
      break;
    case  SEI_DEC_REF_PIC_MARKING_REPETITION:
      interpret_dec_ref_pic_marking_repetition_info( msg+offset, payload_size, img );
      break;
    case  SEI_SPARE_PIC:
      interpret_spare_pic( msg+offset, payload_size, img );
      break;
    case  SEI_SCENE_INFO:
      interpret_scene_information( msg+offset, payload_size, img );
      break;
    case  SEI_SUB_SEQ_INFO:
      interpret_subsequence_info( msg+offset, payload_size, img );
      break;
    case  SEI_SUB_SEQ_LAYER_CHARACTERISTICS:
      interpret_subsequence_layer_characteristics_info( msg+offset, payload_size, img );
      break;
    case  SEI_SUB_SEQ_CHARACTERISTICS:
      interpret_subsequence_characteristics_info( msg+offset, payload_size, img );
      break;
    case  SEI_FULL_FRAME_FREEZE:
      interpret_full_frame_freeze_info( msg+offset, payload_size, img );
      break;
    case  SEI_FULL_FRAME_FREEZE_RELEASE:
      interpret_full_frame_freeze_release_info( msg+offset, payload_size, img );
      break;
    case  SEI_FULL_FRAME_SNAPSHOT:
      interpret_full_frame_snapshot_info( msg+offset, payload_size, img );
      break;
    case  SEI_PROGRESSIVE_REFINEMENT_SEGMENT_START:
      interpret_progressive_refinement_end_info( msg+offset, payload_size, img );
      break;
    case  SEI_PROGRESSIVE_REFINEMENT_SEGMENT_END:
      interpret_progressive_refinement_end_info( msg+offset, payload_size, img );
      break;
    case  SEI_MOTION_CONSTRAINED_SLICE_GROUP_SET:
      interpret_motion_constrained_slice_group_set_info( msg+offset, payload_size, img );
      break;
    case  SEI_FILM_GRAIN_CHARACTERISTICS:
      interpret_film_grain_characteristics_info ( msg+offset, payload_size, img );
      break;
    case  SEI_DEBLOCKING_FILTER_DISPLAY_PREFERENCE:
      interpret_deblocking_filter_display_preference_info ( msg+offset, payload_size, img );
      break;
    case  SEI_STEREO_VIDEO_INFO:
      interpret_stereo_video_info_info ( msg+offset, payload_size, img );
      break;
    default:
      interpret_reserved_info( msg+offset, payload_size, img );
      break;
    }
    offset += payload_size;

  } while( msg[offset] != 0x80 );    // more_rbsp_data()  msg[offset] != 0x80
  // ignore the trailing bits rbsp_trailing_bits();
  assert(msg[offset] == 0x80);      // this is the trailing bits
  assert( offset+1 == size );
}
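
The loop above trusts its input: a truncated or corrupt NAL makes it read past the end of msg. A hedged sketch of the same header parsing with bounds checks follows. The names sei_msg, sei_next and sei_more_data are hypothetical and not part of JM, and the buffer is assumed to have had its emulation prevention bytes removed already. The 0x80 it stops on is rbsp_trailing_bits(): a stop bit equal to 1 followed by zero bits up to the next byte boundary, which comes out as a full 0x80 byte when the data is already byte aligned.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    unsigned type;          /* payloadType */
    unsigned size;          /* payloadSize in bytes */
    const uint8_t *data;    /* first payload byte, inside the caller's buffer */
} sei_msg;

/* Read one sei_message() starting at buf[*pos]. Returns 1 and advances *pos
 * past the payload on success. Returns 0 and leaves *pos untouched if the
 * message would run past the end of the buffer. */
int sei_next(const uint8_t *buf, size_t len, size_t *pos, sei_msg *m)
{
    size_t p = *pos;
    unsigned type = 0, size = 0;

    /* payloadType: each 0xFF adds 255, the first other byte ends the field */
    while (p < len && buf[p] == 0xFF) { type += 255; p++; }
    if (p >= len) return 0;
    type += buf[p++];

    /* payloadSize uses the same coding */
    while (p < len && buf[p] == 0xFF) { size += 255; p++; }
    if (p >= len) return 0;
    size += buf[p++];

    if (size > len - p) return 0;   /* payload must fit in what is left */

    m->type = type;
    m->size = size;
    m->data = buf + p;
    *pos = p + size;
    return 1;
}

/* Byte-aligned stand-in for more_rbsp_data(): there is more data unless the
 * only thing left is a final 0x80 trailing byte (or nothing at all). */
int sei_more_data(const uint8_t *buf, size_t len, size_t pos)
{
    return pos < len && !(pos + 1 == len && buf[pos] == 0x80);
}
```

A caller would start at pos = 1 (after the NAL header byte), loop while sei_more_data() is true, and call sei_next() for each message. On the first dump in this thread that yields type 0 with 9 bytes, then type 5 with 565 bytes (FF FF 37 = 255 + 255 + 55).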
Old 16th April 2009, 07:47   #6  |  Link
videophool
Registered User
 
Join Date: Nov 2008
Posts: 91
Perfect. Thanks.
Old 15th October 2009, 20:27   #7  |  Link
Francisco Amorim
Registered User
 
Join Date: Oct 2009
Posts: 3
Quote:
Originally Posted by videophool View Post
Here is one where there is a SEI type 0 with 9 bytes followed by an SEI type 5 with 565 bytes. I understand what I am looking at, but I can not find any details in the spec for parsing multiple SEI's in one NAL

Can someone explain to me why the start code is 00 00 00 01 and not 00 00 01, as written in "Advanced video coding for generic audiovisual services"?
Where can I find this information?

Do you know why there are always 2 NAL units before an IDR VCL NAL unit?
Old 15th October 2009, 20:39   #8  |  Link
Guest
Guest
 
Join Date: Jan 2002
Posts: 21,922
Quote:
Originally Posted by Francisco Amorim View Post
Can someone explain to me why the start code is 00 00 00 01 and not 00 00 01, as written in "Advanced video coding for generic audiovisual services"?
See section B.1 in the AVC spec.

Quote:
Where can I find this information?
http://neuron2.net/library/avc/T-REC...1-I!!PDF-E.pdf

Quote:
Do you know why there are always 2 NAL units before an IDR VCL NAL unit?
Maybe you are thinking of SPS and PPS. But they don't *always* have to be present before every IDR. They just need to have been previously sent.

Last edited by Guest; 25th November 2009 at 23:47.
Old 21st October 2009, 17:49   #9  |  Link
Francisco Amorim
Registered User
 
Join Date: Oct 2009
Posts: 3
Thanks neuron2.

Let me ask you something more. I need to build a switch for streams of H.264 coded video data, but first I want to know one thing: if I have 2 video streams of different videos and I want to stop decoding one and start decoding the other, can I start the other one at an IDR NAL unit?
If not, at which NAL should I start it? I can't start it from the beginning.
Old 21st October 2009, 18:03   #10  |  Link
Guest
Guest
 
Join Date: Jan 2002
Posts: 21,922
You need to start by sending the SPS and PPS that are needed followed by an IDR or recovery point.
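
A rough sketch in C of that rule, under stated assumptions: the caller feeds the header byte of each NAL unit of the new stream in order, and the switch forwards nothing but parameter sets until it has seen an SPS, a PPS and then an IDR slice. The names switch_state and switch_accept are invented for this example. It does not check that the parameter set IDs match the ones the IDR slice refers to, it does not handle the recovery point case, and it drops any SEI that arrives before the first IDR.

```c
#include <stdint.h>

enum {
    NAL_SLICE_IDR = 5,   /* coded slice of an IDR picture */
    NAL_SEI       = 6,   /* supplemental enhancement information */
    NAL_SPS       = 7,   /* sequence parameter set */
    NAL_PPS       = 8    /* picture parameter set */
};

/* nal_unit_type is the low 5 bits of the first byte after the start code. */
static int nal_unit_type(uint8_t header)
{
    return header & 0x1F;
}

typedef struct {
    int have_sps;   /* an SPS has been forwarded */
    int have_pps;   /* a PPS has been forwarded */
    int started;    /* an IDR has been forwarded, stream is now live */
} switch_state;

/* Returns 1 if this NAL should be forwarded to the decoder, 0 to drop it. */
int switch_accept(switch_state *s, uint8_t header)
{
    int t = nal_unit_type(header);

    if (t == NAL_SPS) { s->have_sps = 1; return 1; }
    if (t == NAL_PPS) { s->have_pps = 1; return 1; }
    if (s->started) return 1;               /* already switched: pass everything */
    if (t == NAL_SLICE_IDR && s->have_sps && s->have_pps) {
        s->started = 1;                     /* first usable entry point */
        return 1;
    }
    return 0;                               /* still waiting for SPS + PPS + IDR */
}
```

Start each switch with a zeroed switch_state. If the source only sends SPS and PPS occasionally, the switch has to cache the most recent ones and replay them in front of the first IDR it forwards.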
Old 25th November 2009, 23:08   #11  |  Link
Francisco Amorim
Registered User
 
Join Date: Oct 2009
Posts: 3
I still don't understand.

The start code is 00 00 01.
What is the 00 before the start code?
Old 25th November 2009, 23:49   #12  |  Link
Guest
Guest
 
Join Date: Jan 2002
Posts: 21,922
Quote:
Originally Posted by Francisco Amorim View Post
I still don't understand.

The start code is 00 00 01.
What is the 00 before the start code?
trailing_zero_8bits (or leading_zero_8bits if at the start of the stream).

Refer to B.1 in the spec as I said before.
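
In practice a parser does not need to know which Annex B element a given zero belongs to. Besides leading_zero_8bits and trailing_zero_8bits, B.1 also defines a zero_byte that sits directly in front of the 00 00 01 prefix in some cases, for example before SPS and PPS and before the first NAL unit of an access unit. Since the last byte of a NAL unit is never 0x00, every zero between the end of one NAL and the next prefix can be treated as padding. A minimal sketch in C, with the hypothetical names find_start_code and nal_end:

```c
#include <stddef.h>
#include <stdint.h>

/* Return the index of the first 00 00 01 prefix at or after `from`,
 * or len if there is none. */
size_t find_start_code(const uint8_t *buf, size_t len, size_t from)
{
    for (size_t i = from; i + 3 <= len; i++) {
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01)
            return i;
    }
    return len;
}

/* For a NAL unit whose header byte is at `start`, return the index one past
 * its last byte: stop at the next prefix (or the end of the buffer) and
 * strip the zero bytes that precede it. */
size_t nal_end(const uint8_t *buf, size_t len, size_t start)
{
    size_t end = find_start_code(buf, len, start);
    while (end > start && buf[end - 1] == 0x00)
        end--;
    return end;
}
```

The NAL header byte is at find_start_code() + 3. With 00 00 00 01 in the stream the search matches at the second byte, so the 3-byte and 4-byte forms are handled by the same code and the extra leading 00 is simply skipped.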
Old 26th November 2009, 11:34   #13  |  Link
Shevach
Video compressionist
 
Join Date: Jun 2009
Location: Israel
Posts: 124
Quote:
Originally Posted by neuron2 View Post
trailing_zero_8bits (or leading_zero_8bits if at the start of the stream).
One more option: cabac_zero_word bytes. Under some circumstances the standard requires one or more cabac_zero_word elements at the end of a slice, although not all encoders obey this.