h264 bitstream expert needed
madshi
24th December 2007, 01:18
I'm writing a very small h264 bitstream parser, just to figure out details like video width/height, framerate etc. from the bitstream. For HD DVD streams everything seems to work, more or less. But for SkyHD and PremiereHD broadcasts, I'm having major trouble parsing the "Sequence parameter set" correctly, specifically the "VUI parameters" sub-values. The two values "num_units_in_tick" and "time_scale" seem to have rather insane values:
num_units_in_tick: 48
time_scale: 0x1000000
That doesn't make any sense to me. Also, when parsing through to the end of the VUI parameters, I run out of bits. Neither problem occurs with an HD DVD test stream, so I believe there's something wrong somewhere. But I've triple-checked my code and I believe it to be correct (or at least to follow the specs). Here's a SkyHD sample:
http://madshi.net/test.h264 (64KB)
Thanks for any help you can give me!
Dark Shikari
24th December 2007, 01:25
Correct parsing according to Elecard StreamAnalyzer:
http://i15.tinypic.com/6yok11i.png
According to JM Ldecod.trace:
Annex B NALU w/ long startcode, len 2, forbidden_bit 0, nal_reference_idc 0, nal_unit_type 9
Annex B NALU w/ long startcode, len 27, forbidden_bit 0, nal_reference_idc 3, nal_unit_type 7
@0 SPS: profile_idc 01001101 ( 77)
@8 SPS: constrained_set0_flag 0 ( 0)
@9 SPS: constrained_set1_flag 1 ( 1)
@10 SPS: constrained_set2_flag 0 ( 0)
@11 SPS: constrained_set3_flag 0 ( 0)
@12 SPS: reserved_zero_4bits 0000 ( 0)
@16 SPS: level_idc 00101000 ( 40)
@24 SPS: seq_parameter_set_id 1 ( 0)
@25 SPS: log2_max_frame_num_minus4 0001001 ( 8)
@32 SPS: pic_order_cnt_type 1 ( 0)
@33 SPS: log2_max_pic_order_cnt_lsb_minus4 00110 ( 5)
@38 SPS: num_ref_frames 011 ( 2)
@41 SPS: gaps_in_frame_num_value_allowed_flag 0 ( 0)
@42 SPS: pic_width_in_mbs_minus1 0000001111000 (119)
@55 SPS: pic_height_in_map_units_minus1 00000100010 ( 33)
@66 SPS: frame_mbs_only_flag 0 ( 0)
@67 SPS: mb_adaptive_frame_field_flag 0 ( 0)
@68 SPS: direct_8x8_inference_flag 1 ( 1)
@69 SPS: frame_cropping_flag 0 ( 0)
@70 SPS: vui_parameters_present_flag 1 ( 1)
@71 VUI: aspect_ratio_info_present_flag 1 ( 1)
@72 VUI: aspect_ratio_idc 00000001 ( 1)
@80 VUI: overscan_info_present_flag 1 ( 1)
@81 VUI: overscan_appropriate_flag 1 ( 1)
@82 VUI: video_signal_type_present_flag 1 ( 1)
@83 VUI: video_format 101 ( 5)
@86 VUI: video_full_range_flag 1 ( 1)
@87 VUI: color_description_present_flag 1 ( 1)
@88 VUI: colour_primaries 00000001 ( 1)
@96 VUI: transfer_characteristics 00000001 ( 1)
@104 VUI: matrix_coefficients 00000001 ( 1)
@112 VUI: chroma_loc_info_present_flag 1 ( 1)
@113 VUI: chroma_sample_loc_type_top_field 1 ( 0)
@114 VUI: chroma_sample_loc_type_bottom_field 1 ( 0)
@115 VUI: timing_info_present_flag 1 ( 1)
@116 VUI: num_units_in_tick 00000000000000000000000000000001 ( 1)
@148 VUI: time_scale 00000000000000000000000000110010 ( 50)
@180 VUI: fixed_frame_rate_flag 1 ( 1)
@181 VUI: nal_hrd_parameters_present_flag 0 ( 0)
@182 VUI: vcl_hrd_parameters_present_flag 0 ( 0)
@183 VUI: pic_struct_present_flag 1 ( 1)
@184 VUI: bitstream_restriction_flag 0 ( 0)
Annex B NALU w/ long startcode, len 5, forbidden_bit 0, nal_reference_idc 0, nal_unit_type 6
@185 SEI: seq_parameter_set_id 1 ( 0)
Annex B NALU w/ long startcode, len 5, forbidden_bit 0, nal_reference_idc 0, nal_unit_type 6
@186 SEI: recovery_frame_cnt 1 ( 0)
@187 SEI: exact_match_flag 0 ( 0)
@188 SEI: broken_link_flag 1 ( 1)
@189 SEI: changing_slice_group_idc 00 ( 0)
Annex B NALU w/ long startcode, len 4, forbidden_bit 0, nal_reference_idc 3, nal_unit_type 8
@191 PPS: pic_parameter_set_id 1 ( 0)
@192 PPS: seq_parameter_set_id 1 ( 0)
@193 PPS: entropy_coding_mode_flag 1 ( 1)
@194 PPS: pic_order_present_flag 0 ( 0)
@195 PPS: num_slice_groups_minus1 1 ( 0)
@196 PPS: num_ref_idx_l0_active_minus1 010 ( 1)
@199 PPS: num_ref_idx_l1_active_minus1 1 ( 0)
@200 PPS: weighted_pred_flag 0 ( 0)
@201 PPS: weighted_bipred_idc 00 ( 0)
@203 PPS: pic_init_qp_minus26 1 ( 0)
@204 PPS: pic_init_qs_minus26 1 ( 0)
@205 PPS: chroma_qp_index_offset 1 ( 0)
@206 PPS: deblocking_filter_control_present_flag 1 ( 1)
@207 PPS: constrained_intra_pred_flag 0 ( 0)
@208 PPS: redundant_pic_cnt_present_flag 0 ( 0)
Annex B NALU w/ long startcode, len 5, forbidden_bit 0, nal_reference_idc 0, nal_unit_type 6
@209 SEI: pic_struct 0011 ( 3)
@213 SEI: clock_time_stamp_flag 0 ( 0)
@214 SEI: clock_time_stamp_flag 0 ( 0)
Annex B NALU w/ long startcode, len 9581, forbidden_bit 0, nal_reference_idc 3, nal_unit_type 5
@215 SH: first_mb_in_slice 1 ( 0)
@216 SH: slice_type 011 ( 2)
@219 SH: pic_parameter_set_id 1 ( 0)
@220 SH: frame_num 000000000000 ( 0)
@232 SH: field_pic_flag 0 ( 0)
@233 SH: idr_pic_id 1 ( 0)
@234 SH: pic_order_cnt_lsb 000000000 ( 0)
@243 SH: no_output_of_prior_pics_flag 0 ( 0)
@244 SH: long_term_reference_flag 0 ( 0)
@245 SH: slice_qp_delta 00000100101 (-18)
@256 SH: disable_deblocking_filter_idc 011 ( 2)
@259 SH: slice_alpha_c0_offset_div2 1 ( 0)
@260 SH: slice_beta_offset_div2 1 ( 0)
madshi
24th December 2007, 02:25
Thank you, that's really helpful. However, I still don't understand it. The output of "JM Ldecod.trace" is extremely interesting. But as far as I can see, it doesn't match the hex data...
:confused:
Please excuse me, maybe I'm just being stupid!
Anyway, here's the hex data in bytes and in bits:
@000 4D 40 28 89
@032 99 80 F0 08
@064 8B 01 F7 01
@096 01 01 F0 00
@128 00 03 00 10
@160 00 00 03 03
@192 29 40 00
@000 01001101 01000000 00101000 10001001
@032 10011001 10000000 11110000 00001000
@064 10001011 00000001 11110111 00000001
@096 00000001 00000001 11110000 00000000
@128 00000000 00000011 00000000 00010000
@160 00000000 00000000 00000011 00000011
@192 00101001 01000000 00000000
The data matches the JM Ldecod.trace perfectly up to "num_units_in_tick", which is at bit position 116. According to JM Ldecod.trace there should be a 32-bit dword with the value "1". However, in the bitstream I simply don't see that (bits 116 through 147 in the dump above)! That's confusing the heck out of me. It seems that JM Ldecod.trace reads those 32-bit dword values differently than I would expect.
Do you have any thoughts on this?
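For reference, the leading SPS fields in the dump above can be decoded with a small Exp-Golomb reader. The following is only an illustrative sketch, not madshi's parser: `BitReader` and `parse_sps_dimensions` are hypothetical names, and the code only covers the layout seen in this sample (profile_idc 66/77/88, pic_order_cnt_type 0) and ignores cropping. All of these fields end before bit 116, so the bytes are used exactly as posted.

```python
class BitReader:
    """Reads bits MSB-first from a bytes object (hypothetical helper)."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def u(self, n: int) -> int:
        """Fixed-length unsigned value of n bits, u(n) in the spec."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value

    def ue(self) -> int:
        """Unsigned Exp-Golomb value, ue(v) in the spec."""
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)


def parse_sps_dimensions(sps: bytes):
    """Return (width, height) in pixels before cropping. Sketch only."""
    r = BitReader(sps)
    profile_idc = r.u(8)
    r.u(8)  # constraint_set flags + reserved_zero_4bits
    r.u(8)  # level_idc
    if profile_idc not in (66, 77, 88):
        raise ValueError("sketch only covers the Baseline/Main/Extended layout")
    r.ue()  # seq_parameter_set_id
    r.ue()  # log2_max_frame_num_minus4
    if r.ue() != 0:  # pic_order_cnt_type
        raise ValueError("sketch only covers pic_order_cnt_type 0")
    r.ue()  # log2_max_pic_order_cnt_lsb_minus4
    r.ue()  # num_ref_frames
    r.u(1)  # gaps_in_frame_num_value_allowed_flag
    width_mbs = r.ue() + 1         # pic_width_in_mbs_minus1 + 1
    height_map_units = r.ue() + 1  # pic_height_in_map_units_minus1 + 1
    frame_mbs_only_flag = r.u(1)
    height = (2 - frame_mbs_only_flag) * height_map_units * 16
    return width_mbs * 16, height


# SPS payload bytes as posted in the hex dump (NAL header byte not included).
SPS = bytes.fromhex(
    "4D402889 9980F008 8B01F701 0101F000 00030010 00000303 294000"
)
print(parse_sps_dimensions(SPS))  # (1920, 1088) for this sample
```

The result agrees with the trace: 119 + 1 macroblocks wide, and 33 + 1 map units high, doubled because frame_mbs_only_flag is 0.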
drmpeg
24th December 2007, 02:47
It's due to the presence of "emulation_prevention_three_byte" codes. Whenever there is a long sequence of zeroes, a byte with the value 3 is inserted to avoid start code emulation.
Most decoders or parsers scan the NAL for emulation_prevention_three_byte and remove them before doing any further processing.
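This can be checked against the bytes madshi posted. The sketch below is an assumption-laden illustration rather than code from h264_parse: `strip_emulation_prevention` and `read_bits` are hypothetical helpers, and the bit offsets 116 and 148 are simply taken from the JM trace above. It drops every 0x03 byte that follows two zero bytes and then reads the two 32-bit fields.

```python
def strip_emulation_prevention(nal: bytes) -> bytes:
    """Drop each 0x03 that directly follows two 0x00 bytes."""
    out = bytearray()
    zeros = 0  # number of consecutive zero bytes just seen
    for b in nal:
        if zeros >= 2 and b == 3:
            zeros = 0  # emulation_prevention_three_byte: skip it
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


def read_bits(data: bytes, bit_pos: int, n: int) -> int:
    """Read n bits MSB-first starting at absolute bit position bit_pos."""
    value = 0
    for i in range(bit_pos, bit_pos + n):
        value = (value << 1) | ((data[i >> 3] >> (7 - (i & 7))) & 1)
    return value


# SPS payload bytes as posted in the hex dump earlier in the thread.
SPS = bytes.fromhex(
    "4D402889 9980F008 8B01F701 0101F000 00030010 00000303 294000"
)

# Without stripping, the "insane" values from the first post appear.
raw_tick = read_bits(SPS, 116, 32)    # 48
raw_scale = read_bits(SPS, 148, 32)   # 0x1000000

# After stripping, the values match the JM trace.
clean = strip_emulation_prevention(SPS)
num_units_in_tick = read_bits(clean, 116, 32)  # 1
time_scale = read_bits(clean, 148, 32)         # 50
print(raw_tick, hex(raw_scale), num_units_in_tick, time_scale)
```

Two bytes are removed from this SPS, which also explains why the unstripped parse no longer lines up with the trace after bit 116.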
BTW, I have the source for h264_parse packaged here:
http://www.w6rz.net/h264_parse.zip
Ron
madshi
24th December 2007, 02:52
Thanks Ron. You're a life saver once again... :)
madshi
24th December 2007, 13:09
One more question to the Gods of h264:
For Blu-Ray movies I'm getting:
num_units_in_tick: 1001
time_scale: 48000
interlaced: false
Why is that not 24000? How can I get from time_scale/num_units_in_tick to the real framerate? Of course I could hardcode a conversion from 48000 to 24000, but I'd prefer a mathematical way to get the "right" framerate.
Sergey A. Sablin
24th December 2007, 15:06
One more question to the Gods of h264:
For Blu-Ray movies I'm getting:
num_units_in_tick: 1001
time_scale: 48000
interlaced: false
Why is that not 24000? How can I get from time_scale/num_units_in_tick to the real framerate? Of course I could hardcode a conversion from 48000 to 24000, but I'd prefer a mathematical way to get the "right" framerate.
Because this is the _field_ rate, not the frame rate. It doesn't depend on the sequence scan/coding mode; it is the same for both progressive and interlaced sequences.
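As a worked example of that rule: if time_scale / num_units_in_tick is a field rate, the nominal frame rate is half of it. The sketch below assumes a fixed frame rate stream without pulldown signalling; `nominal_frame_rate` is a hypothetical helper name.

```python
from fractions import Fraction


def nominal_frame_rate(num_units_in_tick: int, time_scale: int) -> Fraction:
    """Frame rate = field rate / 2 = time_scale / (2 * num_units_in_tick)."""
    return Fraction(time_scale, 2 * num_units_in_tick)


bluray = nominal_frame_rate(1001, 48000)  # Blu-Ray values quoted above
skyhd = nominal_frame_rate(1, 50)         # SkyHD values from the JM trace
print(bluray, float(bluray))  # 24000/1001, about 23.976
print(skyhd)                  # 25
```

Using an exact fraction avoids rounding 24000/1001 to a decimal before it is written into a container.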
madshi
24th December 2007, 18:04
Because this is the _field_ rate, not the frame rate. It doesn't depend on the sequence scan/coding mode; it is the same for both progressive and interlaced sequences.
Oh, that's good to know. I think I'm all set now. Thanks much! :)
musicman2311
15th January 2008, 19:10
Madshi - is it possible to create a tool that converts AVC from HD-DVD to Blu-ray, for instance to clear the 'gaps_in_frame_num_value_allowed_flag' if set?
Trahald
15th January 2008, 19:43
I fixed h264info to do a reverse pulldown. It goes 29->23 and clears that flag, and does the other stuff you have to do to make 23.976fps acceptable. It's only for CABAC streams (which, for the purpose here, the streams should always be). I didn't really test it much, but the output from my test file did work on Scenarist Blu-ray, so it seems OK. It's on SourceForge (batchccews project) as a test version (ver .14).
PS: if the stream is already 23.976 fps it will still work. You would still select reverse pulldown.
musicman2311
16th January 2008, 14:11
Trahald - thanks - I am impressed, excellent piece of work.
I have not played the result on a PS3 yet, but I will let you know. You are my hero already!
madshi
24th March 2008, 16:06
Dear h264 experts: I need your help again!! :o
I'm desperately trying to figure out which framerate an h264 video stream has just by looking at the bitstream. Sounds simple? Yes, it does: num_units_in_tick and time_scale look like the ticket to perfect framerate detection.
Bah! Unfortunately it's not as simple as it sounds: HD DVD movies are really 23.976p, but num_units_in_tick/time_scale is 30000/1001! Ok, so I'm checking frame_mbs_only_flag. If that is set, it's obviously progressive and so even though num_units_in_tick is 30000, in reality it's 24000? Seemed to work fine for most h264 HD DVD movies. Until I ran over some samples where frame_mbs_only_flag is not set, and num_units_in_tick is still 30000 and the *real* framerate is still 24000/1001. Interestingly the Sonic HD Demuxer is as confused about these samples as I am and plays them too fast.
I'm really confused. Is there any reliable way to find out the true framerate of a h264 bitstream? I'm talking about the framerate I have to mux the bitstream with (e.g. to MKV) to make it play with the correct runtime.
Thanks much!
Dark Shikari
24th March 2008, 16:13
Until I ran over some samples where frame_mbs_only_flag is not set, and num_units_in_tick is still 30000 and the *real* framerate is still 24000/1001. Interestingly the Sonic HD Demuxer is as confused about these samples as I am and plays them too fast.
Telecining.
Guest
24th March 2008, 16:29
Pulldown is signalled by picture timing SEIs. See the spec for details.
If you ditch the pulldown (i.e., ignore the pic timing SEIs) then you can encode at the film rate. DGAVCIndex has an option for that. But you are presumably using DS filters, and so that is probably not an option. If the DS filter doesn't support pulldown, you better get a different one.
madshi
24th March 2008, 17:06
Pulldown is signalled by picture timing SEIs. See the spec for details.
I was already looking at pic_struct in the picture timing SEI, but I think I didn't interpret it correctly. Just for clarification:
pic_struct:
frame -> this is a progressive stream (Blu-Ray)
top_field -> this is an interlaced stream (Blu-Ray or HD DVD)
bottom_field -> this is an interlaced stream (Blu-Ray or HD DVD)
top_field, bottom_field -> this is a progressive stream with pulldown flags (HD DVD)
bottom_field, top_field -> this is a progressive stream with pulldown flags (HD DVD)
top, bottom, top -> this is a progressive stream with pulldown flags (HD DVD)
bottom, top, bottom -> this is a progressive stream with pulldown flags (HD DVD)
Is that correct?
One more question, just to be sure: is it enough to check only one picture timing SEI block? E.g. can it happen that, although we have an interlaced stream, there are still some "frame"s at the beginning of the stream and the real interlaced fields only appear later? Or can I rely on the first picture timing SEI already containing a pic_struct of "top_field" or "bottom_field" if the bitstream is truly interlaced?
Thank you! :)
If you ditch the pulldown (i.e., ignore the pic timing SEIs) then you can encode at the film rate. DGAVCIndex has an option for that. But you are presumably using DS filters, and so that is probably not an option. If the DS filter doesn't support pulldown, you better get a different one.
I need this for my Matroska muxing tool. I can easily ditch the pulldown. Just need to know the right muxing framerate...
Guest
24th March 2008, 17:48
Pic structure 5 and 6 denote field pulldown. Pic structure 7 and 8 denote frame pulldown. The ones below 5 are not signalling pulldown.
Real streams can be mixtures of all types. So it is dangerous to look at just one SEI.
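One way to act on this is to look at many picture timing SEIs and average. The sketch below is an assumption, not Guest's or DGAVCIndex's method: the tick counts per pic_struct are written down from memory of the H.264 spec's timing tables and should be checked against the spec, field pictures (pic_struct 1 and 2) are counted as half a frame, and `average_frame_rate` and `uses_pulldown` are hypothetical names.

```python
from fractions import Fraction

# Display duration of each pic_struct in clock ticks (assumed values,
# to be verified against the H.264 spec): frame, top field, bottom field,
# top+bottom, bottom+top, T/B/T, B/T/B, frame doubling, frame tripling.
TICKS = {0: 2, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 4, 8: 6}


def uses_pulldown(pic_structs) -> bool:
    """True if any picture signals field or frame repetition (5 to 8)."""
    return any(p >= 5 for p in pic_structs)


def average_frame_rate(pic_structs, num_units_in_tick, time_scale) -> Fraction:
    """Coded frames per second of display time, averaged over the input."""
    if not pic_structs:
        raise ValueError("need at least one pic_struct value")
    ticks = sum(TICKS[p] for p in pic_structs)
    # A single field picture is only half of a coded frame.
    frames = sum(Fraction(1, 2) if p in (1, 2) else 1 for p in pic_structs)
    return Fraction(frames) * time_scale / (ticks * num_units_in_tick)


# Hypothetical 3:2 pulldown pattern on a 60000/1001 field-rate stream.
film = [5, 4, 6, 3] * 6
print(uses_pulldown(film), average_frame_rate(film, 1001, 60000))
```

For the hypothetical pattern above the average is 24000/1001, the rate to mux with once the pulldown is dropped. A stream that mixes types gives an in-between value, which is itself a hint that a single constant frame rate does not describe it.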
madshi
24th March 2008, 17:55
Ouch.
Thanks for your help!
h264visa
24th March 2008, 23:28
Bah! Unfortunately it's not as simple as it sounds: HD DVD movies are really 23.976p, but num_units_in_tick/time_scale is 30000/1001! Ok, so I'm checking frame_mbs_only_flag. If that is set, it's obviously progressive and so even though num_units_in_tick is 30000, in reality it's 24000? Seemed to work fine for most h264 HD DVD movies. Until I ran over some samples where frame_mbs_only_flag is not set, and num_units_in_tick is still 30000 and the *real* framerate is still 24000/1001. Interestingly the Sonic HD Demuxer is as confused about these samples as I am and plays them too fast.
My H264Visa Mobile Solution can calculate this fps from num_units_in_tick and time_scale; you can give it a try. The value is shown in the summary dialog, on the Stream tab.
This version is not free, but you can try it for 30 days.
madshi
25th March 2008, 12:25
My H264Visa Mobile Solution can calculate this fps according to num_units_in_tick and time_scale
That's the easy part. I can do that myself already... ;)