Old 21st April 2020, 21:14   #1  |  Link
es
Registered User
 
Join Date: Mar 2018
Posts: 7
Understanding CodecPrivate of an x265-encoded file as reported by mkvmerge

I’m trying to understand why I can’t encode a file to HEVC so that it matches another file for later merging: mkvtoolnix throws a “The codec’s private data does not match. Both have the same length (1136) but different content” warning, even though I’m using an ffmpeg build with the same x265 commit as the original encoding and identical settings (verified in MediaInfo), and I get identical PPS/SPS/VPS values (but with a one-byte difference in SEI). From this "bug" report #2390 and similar issues on this forum and elsewhere, I understand that it’s tricky and depends on the CodecPrivate field.
So I’m trying to understand what this field contains and whether I can tweak anything on the encoding side to achieve full parity (currently I get a difference of three symbols out of ~2k).

For this, I’m trying to match these two CodecPrivate data fields for the same file (my x265 encoding)
  1. From FFMPEG
    - Get all headers
    Code:
    ffmpeg -i in.mkv -c:v copy -an -sn -bsf:v trace_headers -t 0.01 -report -loglevel 0 -f matroska NUL
    → trace_headersS.log (this and other files mentioned below are attached in an archive)
    - Parse trace_headersS.log to leave only binary values in each group → trace_headersBinGroup.log
    - Convert binary values to Hex and compare to the values in 2. → trace_headersBinGroup2HexMatch.md
  2. From MKVmerge
    - Extract a hex field “codec_private_data” from JSON:
    Code:
    mkvmerge -i -F "json" in.mkv > in.json
    → CodecPrivateHex_original
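
For anyone repeating step 2, here is a minimal Python sketch of pulling the hex field out of the JSON and listing the byte offsets where two files differ. It assumes the field sits at tracks[n].properties.codec_private_data in the mkvmerge JSON; the helper names and the tiny sample documents are made up for illustration.

```python
import json


def codec_private_hex(identify_json: str, track_index: int = 0) -> str:
    """Return the codec_private_data hex string of one track.

    Assumes the layout tracks[n].properties.codec_private_data.
    """
    info = json.loads(identify_json)
    return info["tracks"][track_index]["properties"]["codec_private_data"]


def diff_bytes(hex_a: str, hex_b: str):
    """List (offset, byte_a, byte_b) for every position where the blobs differ."""
    a, b = bytes.fromhex(hex_a), bytes.fromhex(hex_b)
    if len(a) != len(b):
        raise ValueError(f"lengths differ: {len(a)} vs {len(b)}")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Made-up stand-ins for the contents of in.json from two different files.
mine = json.dumps({"tracks": [{"properties": {"codec_private_data": "01022000"}}]})
theirs = json.dumps({"tracks": [{"properties": {"codec_private_data": "01022100"}}]})

for offset, x, y in diff_bytes(codec_private_hex(mine), codec_private_hex(theirs)):
    print(f"byte {offset}: {x:#04x} vs {y:#04x}")
```

Working on byte offsets rather than hex symbols makes it easier to map each difference onto a field later on.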

I’ve managed to match all info from the ffmpeg headers to the mkvmerge data (with an occasional emulation_prevention_three_byte complicating the match), but I’m still missing a few chunks of data that are present in the mkvmerge codec_private_data field but absent from the ffmpeg headers. (The ffmpeg headers also had Slice Segment Header data, but as far as I understand it’s not part of CodecPrivate, so it’s fine that the mkvmerge data has none of it.) One of these chunks also happens to be where a two-symbol difference between my encoding and the other file sits.
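
Regarding the emulation_prevention_three_byte complication: a rough Python sketch of dropping those bytes from a raw NAL unit, so that the bytes in CodecPrivate line up with what trace_headers prints. The function name is made up; the rule it implements is the H.265 one, where a 0x03 that follows two zero bytes is an escape byte and not payload.

```python
def strip_emulation_prevention(nal: bytes) -> bytes:
    """Remove every 0x03 that directly follows two 0x00 bytes."""
    out = bytearray()
    zeros = 0  # number of consecutive 0x00 bytes just emitted
    for b in nal:
        if zeros >= 2 and b == 0x03:
            zeros = 0  # escape byte: drop it and restart the zero count
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0x00 else 0
    return bytes(out)


# Made-up fragment containing two escape bytes.
escaped = bytes.fromhex("420000030000030078")
print(strip_emulation_prevention(escaped).hex())
```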

My first question is: is there a way to somehow fully decode the CodecPrivate field given by mkvmerge?


The third differing symbol between my encoding and the other file is in a payload_byte[73] field within the Supplemental Enhancement Information group (all other payload_byte fields are identical).

My second question is: what affects the data written to SEI? Is there a way to somehow decode it to understand what it means (I’m using CodecVisa, but it also shows no extra explanation as to what that payload is)?

Would appreciate any help on how to parse this intricate mess!
Attached Files
File Type: rar BitHexComparison(oHVC5f).rar (17.0 KB, 46 views)
Old 22nd April 2020, 19:16   #2  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,565
Quote:
Originally Posted by es View Post
My second question is: what affects the data written to SEI? Is there a way to somehow decode it to understand what it means (I’m using CodecVisa, but it also shows no extra explanation as to what that payload is)?
MediaInfo will show the SEI contents in a human-readable form.
Old 22nd April 2020, 21:28   #3  |  Link
es
Registered User
 
Join Date: Mar 2018
Posts: 7
Quote:
Originally Posted by sneaker_ger View Post
MediaInfo will show the SEI contents in a human-readable form.
Wow, how? What is the field name I can select to show this?
Or do you mean it is shown as "Encoding settings" in the regular summary view? In this case the reported MediaInfo data is identical, but the SEI data differs in 1 hex symbol.
Or is it also the Default/Forced/Color range fields that are shown below Encoding settings? I've tried encoding with a different Color range setting, but the SEI part of CodecPrivate as reported by mkvmerge didn't change, and Default/Forced are also reported by MediaInfo as identical.
Old 23rd April 2020, 01:51   #4  |  Link
mkver
Registered User
 
Join Date: May 2016
Posts: 197
The trace_headers bitstream filter will only show you the raw output of the content of the various NAL units in the header; it does not show you the mp4 encapsulation. You can find the format of said encapsulation in ISO/IEC 14496-15; or you can just look at the code that extracts the NAL units and strips the encapsulation away here.

[Edit]: Have you updated your compiler/used different binaries? payload_byte[73] is the "9" in "[GCC 9.3.0]".

Last edited by mkver; 23rd April 2020 at 02:08.
Old 23rd April 2020, 15:05   #5  |  Link
es
Registered User
 
Join Date: Mar 2018
Posts: 7
Quote:
Originally Posted by mkver View Post
The trace_headers bitstream filter will only show you the raw output of the content of the various NAL units in the header; it does not show you the mp4 encapsulation. You can find the format of said encapsulation in ISO/IEC 14496-15; or you can just look at the code that extracts the NAL units and strips the encapsulation away here.
Does the fact that I'm working with MKV/x265, not mp4/AVC, change anything in your suggestion?
Re. the standard I was looking at this H.265@11/2019 standard, file T-REC-H.265-201911-I!!PDF-E.pdf (that's where I got e.g. info re. emulation_prevention_three_byte which the trace_headers "eats")
Also, is there a tool that would decode the encapsulation to individual fields in the same way the trace_headers decodes NAL units to individual syntax units (as I'm guessing trying to parse it from the spec/source code might be a bit of a challenge)?

Quote:
Originally Posted by mkver View Post
[Edit]: Have you updated your compiler/used different binaries? payload_byte[73] is the "9" in "[GCC 9.3.0]".
Awesome! You're correct, the GCC version is indeed different between the two files, I had no idea this supposedly extra information is also part of codec private data. I'll try to get/compile an x265 binary not only with the same version, but also with the same compiler version and reencode, maybe it'll also somehow fix the remaining difference in the encapsulation?
From your response I guessed that the SEI is simply an ASCII decoded in HEX, so I've tried the reverse and got almost identical info to what MediaInfo was showing:
+ means it's the same as MI
- means it's not present in MI
|my comments|

-|this is nal_unit_type to last_payload_size_byte|N0<x01><0x05>ÿÿÿø
-|this is payload_byte[0]..[15], but apparently not ASCII|,¢Þ µ<0x17>GÛ»U¤þ<0x7f>ÂüN
+|matches MI|x265
-|not in MI|(build 84) -
+|matches MI|1.9+200 -6098ba3e0cf16b11:[Windows][GCC 9.3.0][64 bit] 10bit
-|not in MI| - H.265/HEVC codec - Copyright 2013-2015 (c) Multicoreware Inc - http://x265.org - options: 1920x1080 fps=24000/1001 bitdepth=10
+|matches MI|wpp ctu=64 ...
-|some kind of end of message?|<0x80>

Q1. What is this gibberish at the second line, where payload_bytes have started? Isn't this also supposed to be ASCII data just like the other payload bytes that follow?
Q2. Given that MediaInfo skips some of the SEI information, do you know a tool that would show the full info?

Thank you!
Old 23rd April 2020, 18:06   #6  |  Link
mkver
Registered User
 
Join Date: May 2016
Posts: 197
Quote:
Originally Posted by es View Post
Does the fact that I'm working with MKV/x265, not mp4/AVC, change anything in your suggestion?
Matroska essentially copied the mp4 encapsulation for both H.264 and HEVC, therefore it does not change anything.
Quote:
Originally Posted by es View Post
Re. the standard I was looking at this H.265@11/2019 standard, file T-REC-H.265-201911-I!!PDF-E.pdf
This standard does not prescribe the encapsulation, although it defines one (namely in Annex B; the resulting format is therefore known as "Annex B"). mp4 uses a different one, where the length of a NAL unit is given by a length field in front of the NAL unit and not by the start code (0x000001) of the next NAL unit.
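
To make the two framings concrete, here is a small Python sketch that splits an Annex B stream at its start codes and rewrites the NAL units with length prefixes. Function names and the sample bytes are made up; a 4-byte length field is assumed, which corresponds to lengthSizeMinusOne = 3.

```python
import re

START_CODE = re.compile(b"\x00\x00\x01")


def split_annex_b(stream: bytes) -> list[bytes]:
    """Split an Annex B byte stream into NAL units, start codes removed."""
    starts = [m.end() for m in START_CODE.finditer(stream)]
    nals = []
    for i, begin in enumerate(starts):
        # A unit ends where the next 3-byte start code begins, or at the end of the stream.
        end = starts[i + 1] - 3 if i + 1 < len(starts) else len(stream)
        # A NAL unit never ends in 0x00, so trailing zeros belong to the
        # zero_byte of a following 4-byte start code or to padding.
        nals.append(stream[begin:end].rstrip(b"\x00"))
    return nals


def to_length_prefixed(nals: list[bytes], length_size: int = 4) -> bytes:
    """Prefix each NAL unit with its big-endian length (mp4/Matroska style)."""
    return b"".join(len(n).to_bytes(length_size, "big") + n for n in nals)


# Made-up stream: one unit behind a 4-byte start code, one behind a 3-byte one.
annex_b = b"\x00\x00\x00\x01\x40\x01\x0c" + b"\x00\x00\x01\x42\x01\x01"
print(to_length_prefixed(split_annex_b(annex_b)).hex())
```

The split is safe only because emulation prevention guarantees that 0x000001 never occurs inside a NAL unit.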
Quote:
Originally Posted by es View Post
Also, is there a tool that would decode the encapsulation to individual fields in the same way the trace_headers decodes NAL units to individual syntax units (as I'm guessing trying to parse it from the spec/source code might be a bit of a challenge)?
Not that I know of.
Quote:
Originally Posted by es View Post
Awesome! You're correct, the GCC version is indeed different between the two files, I had no idea this supposedly extra information is also part of codec private data. I'll try to get/compile an x265 binary not only with the same version, but also with the same compiler version and reencode, maybe it'll also somehow fix the remaining difference in the encapsulation?
From your response I guessed that the SEI is simply an ASCII decoded in HEX,
The syntax of the SEI messages is given in annex D of the H.265 standard (i.e. it does not depend on the encapsulation). The creator of this SEI message can choose what the user_data_payload_byte actually contain; in this case, x265 simply uses an ASCII string.
Quote:
Originally Posted by es View Post
-|this is nal_unit_type to last_payload_size_byte|N0<x01><0x05>ÿÿÿø
-|this is payload_byte[0]..[15], but apparently not ASCII|,¢Þ µ<0x17>GÛ»U¤þ<0x7f>ÂüN
Now that you mention it: You are using an ancient version of ffmpeg (that does not have this commit). Your version had no support for parsing user_data_unregistered SEI messages and so it used the mode for unknown/unimplemented SEI messages, which just shows the bytes. If you had used a newer version, you would have seen why the first 16 bytes are different: They actually contain the "uuid_iso_iec_11578", only the bytes after this are user_data_payload_bytes.
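
A minimal Python sketch of that layout, for illustration: it reads the SEI payload type and size (each coded as a run of 0xFF bytes plus one final byte), then splits a user_data_unregistered payload into the 16-byte UUID and the trailing string. It expects the SEI RBSP without the two-byte NAL header and with emulation prevention bytes already removed; the function names and the demo message are made up.

```python
def read_ff_coded(buf: bytes, pos: int):
    """Decode a value stored as N bytes of 0xFF plus one final byte; return (value, next_pos)."""
    value = 0
    while buf[pos] == 0xFF:
        value += 255
        pos += 1
    return value + buf[pos], pos + 1


def parse_user_data_unregistered(sei_rbsp: bytes):
    """Return (uuid_hex, text) for a payload type 5 SEI message."""
    payload_type, pos = read_ff_coded(sei_rbsp, 0)
    payload_size, pos = read_ff_coded(sei_rbsp, pos)
    if payload_type != 5:
        raise ValueError(f"not user_data_unregistered: type {payload_type}")
    payload = sei_rbsp[pos:pos + payload_size]
    # The first 16 bytes are uuid_iso_iec_11578, the rest is free-form user data.
    uuid, user_data = payload[:16], payload[16:]
    return uuid.hex(), user_data.decode("ascii", errors="replace")


# Made-up message: dummy UUID, short string, then the 0x80 trailing-bits byte.
body = bytes(range(16)) + b"x265 (build 84)"
demo = bytes([5, len(body)]) + body + b"\x80"
print(parse_user_data_unregistered(demo))
print(read_ff_coded(b"\xff\xff\xff\xf8", 0))
```

If the ÿÿÿø characters in the dump above are read as the bytes FF FF FF F8, the size decodes to 255 × 3 + 248 = 1013, which is consistent with payload_byte[1012] being the last payload byte in the trace.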
Quote:
Originally Posted by es View Post
-|not in MI| - H.265/HEVC codec - Copyright 2013-2015 (c) Multicoreware Inc - http://x265.org - options: 1920x1080 fps=24000/1001 bitdepth=10
These options are probably not shown in the string because they are redundant -- MI already shows the dimensions and the frame rate as separate fields.
Quote:
Originally Posted by es View Post
-|some kind of end of message?|<0x80>
This is not part of the SEI message; instead it is the SEI NAL unit's rbsp_trailing_bits() syntax structure. Even your ancient version of trace_headers shows this:
Code:
[AVBSFContext @ 00000217e94bae00] 8152        payload_byte[1012]                                   00110000 = 48
[AVBSFContext @ 00000217e94bae00] 8160        rbsp_stop_one_bit                                           1 = 1
[AVBSFContext @ 00000217e94bae00] 8161        rbsp_alignment_zero_bit                                     0 = 0
[AVBSFContext @ 00000217e94bae00] 8162        rbsp_alignment_zero_bit                                     0 = 0
[AVBSFContext @ 00000217e94bae00] 8163        rbsp_alignment_zero_bit                                     0 = 0
[AVBSFContext @ 00000217e94bae00] 8164        rbsp_alignment_zero_bit                                     0 = 0
[AVBSFContext @ 00000217e94bae00] 8165        rbsp_alignment_zero_bit                                     0 = 0
[AVBSFContext @ 00000217e94bae00] 8166        rbsp_alignment_zero_bit                                     0 = 0
[AVBSFContext @ 00000217e94bae00] 8167        rbsp_alignment_zero_bit                                     0 = 0
Quote:
Originally Posted by es View Post
Q1. What is this gibberish at the second line, where payload_bytes have started? Isn't this also supposed to be ASCII data just like the other payload bytes that follow?
Q2. Given that MediaInfo skips some of the SEI information, do you know a tool that would show the full info?
Q1 has already been answered above.
Q2: trace_headers shows you all the bytes, but not as ASCII string. I don't know a tool that does.

And regarding your actual question: If only x265's SEI message shows a difference, then appending these videos with mkvmerge is ok; you can ignore the warning.

Last edited by mkver; 23rd April 2020 at 18:09.
Old 23rd April 2020, 21:44   #7  |  Link
es
Registered User
 
Join Date: Mar 2018
Posts: 7
Quote:
Originally Posted by mkver View Post
And regarding your actual question: If only x265's SEI message shows a difference, then appending these videos with mkvmerge is ok; you can ignore the warning.
I've managed to find the identical x265 version (same commit, same GCC) and actually my SEI messages are now also identical!!! However, using this standalone x265 binary to encode instead of ffmpeg (and muxing the resulting file to mkv with MKVToolnix) I now have more differences in the encapsulation data fields (previously I had a difference of only two hex symbols in the area before any of the VPS/SPS/PPS/SEI fields, but now it's ten; and the differences appear not only before VPS, but also in between VPS and SPS, and between SPS and PPS, and between PPS and SEI). I wonder why that is, is encapsulation such a flexible thing?

Q3. Is it safe to ignore these differences or can it lead to some subtle bug with the merged file?

Regardless, I'm still interested in getting to the bottom of the difference, so will see if I can somehow catch whichever leading/trailing/zero_byte zeros (from Annex B in HEVC) misbehave between the two otherwise identical files


Quote:
Originally Posted by mkver View Post
Now that you mention it: You are using an ancient version of ffmpeg...
Yes, I needed this ancient version (4.1.5) because newer versions wrote an extra field to vui_parameters in SPS that the original HEVC encoding didn't have, creating more differences in codec private data.
But I didn't really have to get headers with the same version, so when I used the new version I indeed saw that these bytes were uuid_iso_iec_11578[0]..[15] (still not sure what these unregistered user data are for, though).


Quote:
Originally Posted by mkver View Post
These options are probably not shown in the string because they are redundant -- MI already shows the dimensions and the frame rate as separate fields.
Yeah, though this makes it impossible to tell which of the MI information fields are relevant for the merging.

Thank you for your detailed and helpful responses!
Old 24th April 2020, 00:32   #8  |  Link
mkver
Registered User
 
Join Date: May 2016
Posts: 197
Quote:
Originally Posted by es View Post
I've managed to find the identical x265 version (same commit, same GCC) and actually my SEI messages are now also identical!!! However, using this standalone x265 binary to encode instead of ffmpeg (and muxing the resulting file to mkv with MKVToolnix) I now have more differences in the encapsulation data fields (previously I had a difference of only two hex symbols in the area before any of the VPS/SPS/PPS/SEI fields, but now it's ten; and the differences appear not only before VPS, but also in between VPS and SPS, and between SPS and PPS, and between PPS and SEI). I wonder why that is, is encapsulation such a flexible thing?
The outline of the HEVCDecoderConfigurationRecord (as it is called) is this: First some fields that tell you about compatibility with certain HEVC profiles and levels, then certain other info (like whether it is constant framerate), then the length of the size fields (that are prepended in front of the NAL units in each sample/block) and then certain arrays. Each array initially contains one byte whose six lower bits contain the type of the NAL units in this array while the highest bit (called array_completeness) indicates whether there might be further NAL units of this type in the bitstream/the actual samples. Then there is a sixteen bit field (big endian) containing the number of NAL units in this array, followed by the NAL units, each prepended with a 16 bit size field. On a quick test mkvmerge sets the array_completeness flag (when muxing an elementary stream) while ffmpeg does not. These are the differences that you see in between the arrays.
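
As a rough illustration of the array part described above, a Python sketch that walks the arrays and reports array_completeness, the NAL unit type and the NAL units themselves. It assumes the arrays begin with a one-byte count at offset 22, right after the fixed-size fields; the function name and the demo record are made up.

```python
def parse_arrays(record: bytes, offset: int = 22):
    """Walk the NAL unit arrays of an HEVCDecoderConfigurationRecord."""
    num_arrays = record[offset]
    pos = offset + 1
    arrays = []
    for _array in range(num_arrays):
        head = record[pos]
        num_nalus = int.from_bytes(record[pos + 1:pos + 3], "big")
        pos += 3
        nalus = []
        for _nalu in range(num_nalus):
            size = int.from_bytes(record[pos:pos + 2], "big")
            pos += 2
            nalus.append(record[pos:pos + size])
            pos += size
        arrays.append({
            "array_completeness": head >> 7,  # highest bit
            "nal_unit_type": head & 0x3F,     # six lower bits
            "nalus": nalus,
        })
    return arrays


# Made-up record: 22 zero bytes for the fixed fields, then two arrays.
# 0xA0 = type 32 with array_completeness set, 0x21 = type 33 without it.
demo_record = (
    bytes(22) + bytes([2])
    + bytes([0xA0, 0x00, 0x01, 0x00, 0x02, 0x40, 0x01])
    + bytes([0x21, 0x00, 0x01, 0x00, 0x03, 0x42, 0x01, 0x01])
)
for array in parse_arrays(demo_record):
    print(array)
```

With this layout, the flag difference between the two muxers would show up as 0x80 added to the first byte of each array, for example 0xA0 versus 0x20 for an array of type 32.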
Quote:
Originally Posted by es View Post
Q3. Is it safe to ignore these differences or can it lead to some subtle bug with the merged file?
While mkvmerge sets the array_completeness flag in the CodecPrivate, it does not strip away the NAL units from the actual samples. This actually means that the output of mkvmerge is invalid. So I would really use ffmpeg for the time being.
Quote:
Originally Posted by es View Post
Yes, I needed this ancient version (4.1.5) because newer versions wrote an extra field to vui_parameters in SPS that the original HEVC encoding didn't have, creating more differences in codec private data.
But I didn't really have to get headers with the same version, so when I used the new version I indeed saw that these bytes were uuid_iso_iec_11578[0]..[15] (still not sure what these unregistered user data are for, though).
Even if you want to continue to use the ancient version for encoding, you can still use newer versions for analyzing. (And if you encode files that you don't want to append to files encoded by the ancient version, then there is no need to use ancient tools for these at all.)
Old 25th April 2020, 01:44   #9  |  Link
es
Registered User
 
Join Date: Mar 2018
Posts: 7
Quote:
Originally Posted by mkver View Post
The outline of the HEVCDecoderConfigurationRecord (as it is called) is this
Thanks, this term was helpful, I found it in the spec (will copy&paste below for future reference/ease of finding by people with similar issue) and managed to finally decode the non-header information in codec_private_data!!!
Turns out there is only a one-bit difference, in the bit(3) numTemporalLayers; field: my encoding (with the ancient ffmpeg 4.1.5 and an old matching x265 version, though newer GCC) has a value of 1 (per spec this "indicates that the stream is not temporally scalable"), while the original HEVC encoding has a value of 0 (per spec this "indicates that it is unknown whether the stream is temporally scalable").

Q1. How would I know, whether a file has those layers or not if the numTemporalLayers flag is 0? My guess is that it doesn't since the headers file for it shows that all "temporal_id:" have value "0"
Q2. If from Q1 I get that a file doesn't, does the unknown flag matter in merging?
Q3. Is there a config in ffmpeg I could tweak that would affect this value (or is there maybe another way to edit it) so I could finally get fully identical codec private data fields?

Quote:
Originally Posted by mkver View Post
While mkvmerge sets the array_completeness flag in the CodecPrivate, it does not strip away the NAL units from the actual samples. This actually means that the output of mkvmerge is invalid.
Oh, ok. Just to clarify - this is only a bug when muxing elementary streams, but it's otherwise ok to merge mkv files together?


Quote:
Originally Posted by mkver View Post
So I would really use ffmpeg for the time being.
I've tried to do so, but there seems to be an error with ffmpeg muxing of hevc elementary streams
I was using this command (tried with the latest ffmpeg build as well) to mux my HEVC file to an mkv
Code:
ffmpeg -i out.h265 -c:v copy out.mkv
Is this a proper fix for this issue (convert to mp4, then convert the mp4 to mkv, though the first step still throws a warning)?


P.S.
Below is the HEVCDecoderConfigurationRecord structure that helped me decode the Codec Private Data field and find what prevents mkvmerge from successfully merging two identically encoded hevc files. While the spec's pdfs aren't openly available, this part can be found at this stackoverflow question

Code:
aligned(8) class HEVCDecoderConfigurationRecord {
  unsigned int(8) configurationVersion = 1;
  unsigned int(2) general_profile_space;
  unsigned int(1) general_tier_flag;
  unsigned int(5) general_profile_idc;
  unsigned int(32) general_profile_compatibility_flags;
  unsigned int(48) general_constraint_indicator_flags;
  unsigned int(8) general_level_idc;
  bit(4) reserved = ‘1111’b;
  unsigned int(12) min_spatial_segmentation_idc;
  bit(6) reserved = ‘111111’b;
  unsigned int(2) parallelismType;
  bit(6) reserved = ‘111111’b;
  unsigned int(2) chroma_format_idc;
  bit(5) reserved = ‘11111’b;
  unsigned int(3) bit_depth_luma_minus8;
  bit(5) reserved = ‘11111’b;
  unsigned int(3) bit_depth_chroma_minus8;
  bit(16) avgFrameRate;
  bit(2) constantFrameRate;
  bit(3) numTemporalLayers;
  bit(1) temporalIdNested;
  unsigned int(2) lengthSizeMinusOne;
  unsigned int(8) numOfArrays;
  for (j=0; j < numOfArrays; j++) {
    bit(1) array_completeness;
    unsigned int(1) reserved = 0;
    unsigned int(6) NAL_unit_type;
    unsigned int(16) numNalus;
    for (i=0; i< numNalus; i++) {
      unsigned int(16) nalUnitLength;
      bit(8*nalUnitLength) nalUnit;
    }
  }
}
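
For reference, a minimal Python sketch that reads the fixed-size fields of this structure from the CodecPrivate bytes, using the bit widths listed above (184 bits, i.e. 23 bytes, up to and including numOfArrays). The class and function names and the sample record are made up for illustration.

```python
class BitReader:
    """Read big-endian bit fields from a byte string, most significant bit first."""

    def __init__(self, data: bytes):
        self.value = int.from_bytes(data, "big")
        self.remaining = len(data) * 8

    def read(self, width: int) -> int:
        if width > self.remaining:
            raise ValueError("record too short")
        self.remaining -= width
        return (self.value >> self.remaining) & ((1 << width) - 1)


# (name, width in bits), in the order given by the structure above.
FIELDS = [
    ("configurationVersion", 8), ("general_profile_space", 2),
    ("general_tier_flag", 1), ("general_profile_idc", 5),
    ("general_profile_compatibility_flags", 32),
    ("general_constraint_indicator_flags", 48), ("general_level_idc", 8),
    ("reserved", 4), ("min_spatial_segmentation_idc", 12),
    ("reserved", 6), ("parallelismType", 2),
    ("reserved", 6), ("chroma_format_idc", 2),
    ("reserved", 5), ("bit_depth_luma_minus8", 3),
    ("reserved", 5), ("bit_depth_chroma_minus8", 3),
    ("avgFrameRate", 16), ("constantFrameRate", 2),
    ("numTemporalLayers", 3), ("temporalIdNested", 1),
    ("lengthSizeMinusOne", 2), ("numOfArrays", 8),
]


def parse_header(record: bytes) -> dict:
    """Decode the first 23 bytes of an HEVCDecoderConfigurationRecord."""
    reader = BitReader(record[:23])
    fields = {}
    for name, width in FIELDS:
        value = reader.read(width)
        if name != "reserved":
            fields[name] = value
    return fields


# Made-up sample record, split per group of fields for readability.
sample = bytes.fromhex(
    "01" "02" "20000000" "900000000000" "78"
    "f000" "fc" "fd" "fa" "fa" "0000" "0f" "04"
)
for name, value in parse_header(sample).items():
    print(f"{name} = {value}")
```

In this layout constantFrameRate, numTemporalLayers, temporalIdNested and lengthSizeMinusOne share the single byte at offset 21, so a change in any of them shows up as a difference in that one byte.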
Old 25th April 2020, 19:04   #10  |  Link
es
Registered User
 
Join Date: Mar 2018
Posts: 7
Quote:
Originally Posted by es View Post
Also, is there a tool that would decode the encapsulation to individual fields in the same way the trace_headers decodes NAL units to individual syntax units (as I'm guessing trying to parse it from the spec/source code might be a bit of a challenge)?
Quote:
Originally Posted by mkver View Post
Not that I know of.

OMG, I've just discovered by accident that I didn't need to dive into any of the specs to parse the CodecPrivate data field: good ol' MediaInfo already does it for me. In its Debug mode it shows the full parsing of all the metadata (PPS/SPS/VPS/SEI, and as a cherry on top the SEI field is properly decoded as well!) plus all the HEVCDecoderConfigurationRecord fields!!! So now I finally have an easily accessible view of that information.

Turns out that there is a second bit of difference I missed last night (besides the numTemporalLayers) — for some reason my encoding has a constantFrameRate value of 0 (which, per spec "indicates that the stream may or may not be of constant frame rate."), despite MediaInfo showing "Frame rate mode: Constant" in its information window.
Q4. How can I fix the 0 constantFrameRate value?
Update: A4. I've found that FFmpeg simply sets this as a constant (0) in this hevc.c line regardless of the "actual" file value, so this can be fixed with a pre-compile edit.

Update: A3.
And the numTemporalLayers can never be zero in FFmpeg given that it's defined as the maximum of another non-negative var + 1 here in VPS and here in SPS
So it's actually impossible to achieve a match of numTemporalLayers=0 without slightly tweaking the source code for this as well.

I guess the only solution to get a file with identical CodecPrivate is:
1. Build the same old GCC version
2. Manually change the constantFrameRate/numTemporalLayers variables in the code
3. Cross-compile x265 and ffmpeg with these changes
4. Hope that the answers to Q1/Q2 above pose no danger

Last edited by es; 26th April 2020 at 16:49. Reason: Found answers to Q4 and Q3
Tags
codecprivate, mkvmerge, x265
