Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
#1 | Link |
Registered User
Join Date: Sep 2010
Posts: 2
|
MP4 delay - start_pts vs media time from edit list table entry in moov.trak.edts.elst
Hi!
I'd like to understand some MP4- and AAC-related details and how ffmpeg handles them. I'm transcoding 14.5 seconds of footage (50 fps; 696000 audio samples at 48 kHz) from huffyuv+pcm_s16le in MKV to h264+aac in MP4, using the latest stable ffmpeg 3.0.1 (Zeranoe's Win64 static build).
Code:
ffmpeg -i 30-notes-huffyuv.mkv ^
 -pix_fmt:v yuv420p ^
 -c:v libx264 -profile:v high -preset:v fast ^
 -sc_threshold:v 0 -g:v 25 -bf:v 2 -crf:v 18 ^
 -c:a aac -profile:a aac_low -b:a 384k -cutoff:a 22000 ^
 30-notes.mp4
Code:
[STREAM]
index=0
codec_name=h264
codec_long_name=H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
profile=High
codec_type=video
codec_time_base=1/100
codec_tag_string=avc1
codec_tag=0x31637661
width=1920
height=1080
coded_width=1920
coded_height=1088
has_b_frames=2
sample_aspect_ratio=1:1
display_aspect_ratio=16:9
pix_fmt=yuv420p
level=42
color_range=N/A
color_space=unknown
color_transfer=unknown
color_primaries=unknown
chroma_location=left
timecode=N/A
refs=4
is_avc=true
nal_length_size=4
id=N/A
r_frame_rate=50/1
avg_frame_rate=50/1
time_base=1/12800
start_pts=0
start_time=0.000000
duration_ts=185600
duration=14.500000
bit_rate=15634211
max_bit_rate=N/A
bits_per_raw_sample=8
nb_frames=725
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
TAG:language=und
TAG:handler_name=VideoHandler
[/STREAM]
[STREAM]
index=1
codec_name=aac
codec_long_name=AAC (Advanced Audio Coding)
profile=LC
codec_type=audio
codec_time_base=1/48000
codec_tag_string=mp4a
codec_tag=0x6134706d
sample_fmt=fltp
sample_rate=48000
channels=2
channel_layout=stereo
bits_per_sample=0
id=N/A
r_frame_rate=0/0
avg_frame_rate=0/0
time_base=1/48000
start_pts=-1024
start_time=-0.021333
duration_ts=697024
duration=14.521333
bit_rate=235170
max_bit_rate=384000
bits_per_raw_sample=N/A
nb_frames=681
nb_read_frames=N/A
nb_read_packets=N/A
DISPOSITION:default=1
DISPOSITION:dub=0
DISPOSITION:original=0
DISPOSITION:comment=0
DISPOSITION:lyrics=0
DISPOSITION:karaoke=0
DISPOSITION:forced=0
DISPOSITION:hearing_impaired=0
DISPOSITION:visual_impaired=0
DISPOSITION:clean_effects=0
DISPOSITION:attached_pic=0
TAG:language=und
TAG:handler_name=SoundHandler
[/STREAM]
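As a quick sanity check on the ffprobe fields above, the tick values can be converted to seconds by multiplying them with each stream's time_base. This is only a minimal Python sketch: the helper name ticks_to_seconds is made up for illustration, and the numbers are copied from the output above.

```python
from fractions import Fraction


def ticks_to_seconds(ticks: int, time_base: Fraction) -> Fraction:
    """Convert a timestamp in stream ticks to exact seconds (hypothetical helper)."""
    return ticks * time_base


# Values copied from the ffprobe output above.
VIDEO_TIME_BASE = Fraction(1, 12800)
AUDIO_TIME_BASE = Fraction(1, 48000)

video_duration = ticks_to_seconds(185600, VIDEO_TIME_BASE)
audio_start = ticks_to_seconds(-1024, AUDIO_TIME_BASE)
audio_duration = ticks_to_seconds(697024, AUDIO_TIME_BASE)
# If the stream starts at a negative timestamp, its end is start + duration.
audio_end = audio_start + audio_duration

print(f"video duration: {float(video_duration):.6f} s")  # 14.500000
print(f"audio start:    {float(audio_start):.6f} s")     # -0.021333
print(f"audio duration: {float(audio_duration):.6f} s")  # 14.521333
print(f"audio end:      {float(audio_end):.6f} s")       # 14.500000
```

Read this way, the audio stream carries 1024 extra ticks in front of time zero and ends at the same 14.5 s mark as the video.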
But if we look into the internals of this MP4 via Elecard Video Format Analyzer or via a dump from mp4box (I used the latest stable 0.6.1 Win64 build):
Code:
mp4box -std -diso 30-notes.mp4 | egrep -v "\<(CompositionOffsetEntry|SyncSampleEntry|SampleToChunkEntry|SampleSizeEntry|ChunkEntry)\>"

<?xml version="1.0" encoding="UTF-8"?>
<!--MP4Box dump trace-->
<IsoMediaFile Name="30-notes.mp4">
  <FileTypeBox MajorBrand="isom" MinorVersion="512">
    <BoxInfo Size="32" Type="ftyp"/>
    <BrandEntry AlternateBrand="isom"/>
    <BrandEntry AlternateBrand="iso2"/>
    <BrandEntry AlternateBrand="avc1"/>
    <BrandEntry AlternateBrand="mp41"/>
  </FileTypeBox>
  <FreeSpaceBox size="0">
    <BoxInfo Size="8" Type="free"/>
  </FreeSpaceBox>
  <MediaDataBox dataSize="28763883">
    <BoxInfo Size="28763891" Type="mdat"/>
  </MediaDataBox>
  <MovieBox>
    <BoxInfo Size="18880" Type="moov"/>
    <MovieHeaderBox CreationTime="0" ModificationTime="0" TimeScale="1000" Duration="14522" NextTrackID="3">
      <BoxInfo Size="108" Type="mvhd"/>
      <FullBoxInfo Version="0" Flags="0x0"/>
    </MovieHeaderBox>
    <TrackBox>
      <BoxInfo Size="12723" Type="trak"/>
      <TrackHeaderBox CreationTime="0" ModificationTime="0" TrackID="1" Duration="14500" Width="1920.00" Height="1080.00">
        <Matrix m11="0x00010000" m12="0x00000000" m13="0x00000000" m21="0x00000000" m22="0x00010000" m23="0x00000000" m31="0x00000000" m32="0x00000000" m33="0x40000000"/>
        <BoxInfo Size="92" Type="tkhd"/>
        <FullBoxInfo Version="0" Flags="0x3"/>
      </TrackHeaderBox>
      <EditBox>
        <BoxInfo Size="36" Type="edts"/>
        <EditListBox EntryCount="1">
          <BoxInfo Size="28" Type="elst"/>
          <FullBoxInfo Version="0" Flags="0x0"/>
          <EditListEntry Duration="14500" MediaTime="512" MediaRate="1"/>
        </EditListBox>
      </EditBox>
      <MediaBox>
        <BoxInfo Size="12587" Type="mdia"/>
        <MediaHeaderBox CreationTime="0" ModificationTime="0" TimeScale="12800" Duration="185600" LanguageCode="und">
          <BoxInfo Size="32" Type="mdhd"/>
          <FullBoxInfo Version="0" Flags="0x0"/>
        </MediaHeaderBox>
        <HandlerBox Type="vide" Name="VideoHandler" reserved1="0" reserved2="data:application/octet-string,000000000000000000000000">
          <BoxInfo Size="45" Type="hdlr"/>
          <FullBoxInfo Version="0" Flags="0x0"/>
        </HandlerBox>
        <MediaInformationBox>
          <BoxInfo Size="12502" Type="minf"/>
          <VideoMediaHeaderBox>
            <BoxInfo Size="20" Type="vmhd"/>
            <FullBoxInfo Version="0" Flags="0x1"/>
          </VideoMediaHeaderBox>
          <DataInformationBox>
            <BoxInfo Size="36" Type="dinf"/>
            <DataReferenceBox>
              <BoxInfo Size="28" Type="dref"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
              <URLDataEntryBox>
                <!--Data is contained in the movie file-->
                <BoxInfo Size="12" Type="url "/>
                <FullBoxInfo Version="0" Flags="0x1"/>
              </URLDataEntryBox>
            </DataReferenceBox>
          </DataInformationBox>
          <SampleTableBox>
            <BoxInfo Size="12438" Type="stbl"/>
            <SampleDescriptionBox>
              <BoxInfo Size="154" Type="stsd"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
              <AVCSampleEntryBox DataReferenceIndex="1" Width="1920" Height="1080" XDPI="4718592" YDPI="4718592" BitDepth="24">
                <BoxInfo Size="138" Type="avc1"/>
                <AVCConfigurationBox>
                  <AVCDecoderConfigurationRecord configurationVersion="1" AVCProfileIndication="100" profile_compatibility="0" AVCLevelIndication="42" nal_unit_size="4" chroma_format="0" luma_bit_depth="0" chroma_bit_depth="0">
                    <SequenceParameterSet size="27" content="data:application/octet-string,6764002AACD940780227E5C044000003000400000301903C60C658"/>
                    <PictureParameterSet size="6" content="data:application/octet-string,68EAE08CB22C"/>
                  </AVCDecoderConfigurationRecord>
                  <BoxInfo Size="52" Type="avcC"/>
                </AVCConfigurationBox>
              </AVCSampleEntryBox>
            </SampleDescriptionBox>
            <TimeToSampleBox EntryCount="1">
              <BoxInfo Size="24" Type="stts"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
              <TimeToSampleEntry SampleDelta="256" SampleCount="725"/>
              <!-- counted 725 samples in STTS entries -->
            </TimeToSampleBox>
            <CompositionOffsetBox EntryCount="665">
              <BoxInfo Size="5336" Type="ctts"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
              <!-- counted 725 samples in CTTS entries -->
            </CompositionOffsetBox>
            <SyncSampleBox EntryCount="29">
              <BoxInfo Size="132" Type="stss"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
            </SyncSampleBox>
            <SampleToChunkBox EntryCount="93">
              <BoxInfo Size="1132" Type="stsc"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
              <!-- counted 724 samples in STSC entries (could be less than sample count) -->
            </SampleToChunkBox>
            <SampleSizeBox SampleCount="725">
              <BoxInfo Size="2920" Type="stsz"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
            </SampleSizeBox>
            <ChunkOffsetBox EntryCount="679">
              <BoxInfo Size="2732" Type="stco"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
            </ChunkOffsetBox>
          </SampleTableBox>
        </MediaInformationBox>
      </MediaBox>
    </TrackBox>
    <TrackBox>
      <BoxInfo Size="5943" Type="trak"/>
      <TrackHeaderBox CreationTime="0" ModificationTime="0" TrackID="2" Duration="14522" AlternateGroupID="1" Volume="1.00">
        <BoxInfo Size="92" Type="tkhd"/>
        <FullBoxInfo Version="0" Flags="0x3"/>
      </TrackHeaderBox>
      <EditBox>
        <BoxInfo Size="36" Type="edts"/>
        <EditListBox EntryCount="1">
          <BoxInfo Size="28" Type="elst"/>
          <FullBoxInfo Version="0" Flags="0x0"/>
          <EditListEntry Duration="14500" MediaTime="1024" MediaRate="1"/>
        </EditListBox>
      </EditBox>
      <MediaBox>
        <BoxInfo Size="5807" Type="mdia"/>
        <MediaHeaderBox CreationTime="0" ModificationTime="0" TimeScale="48000" Duration="697024" LanguageCode="und">
          <BoxInfo Size="32" Type="mdhd"/>
          <FullBoxInfo Version="0" Flags="0x0"/>
        </MediaHeaderBox>
        <HandlerBox Type="soun" Name="SoundHandler" reserved1="0" reserved2="data:application/octet-string,000000000000000000000000">
          <BoxInfo Size="45" Type="hdlr"/>
          <FullBoxInfo Version="0" Flags="0x0"/>
        </HandlerBox>
        <MediaInformationBox>
          <BoxInfo Size="5722" Type="minf"/>
          <SoundMediaHeaderBox>
            <BoxInfo Size="16" Type="smhd"/>
            <FullBoxInfo Version="0" Flags="0x0"/>
          </SoundMediaHeaderBox>
          <DataInformationBox>
            <BoxInfo Size="36" Type="dinf"/>
            <DataReferenceBox>
              <BoxInfo Size="28" Type="dref"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
              <URLDataEntryBox>
                <!--Data is contained in the movie file-->
                <BoxInfo Size="12" Type="url "/>
                <FullBoxInfo Version="0" Flags="0x1"/>
              </URLDataEntryBox>
            </DataReferenceBox>
          </DataInformationBox>
          <SampleTableBox>
            <BoxInfo Size="5662" Type="stbl"/>
            <SampleDescriptionBox>
              <BoxInfo Size="106" Type="stsd"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
              <MPEGAudioSampleDescriptionBox DataReferenceIndex="1" SampleRate="48000" Channels="2" BitsPerSample="16">
                <BoxInfo Size="90" Type="mp4a"/>
                <MPEG4ESDescriptorBox>
                  <BoxInfo Size="54" Type="esds"/>
                  <FullBoxInfo Version="0" Flags="0x0"/>
                  <ES_Descriptor ES_ID="es2" binaryID="2" >
                    <decConfigDescr>
                      <DecoderConfigDescriptor objectTypeIndication="64" streamType="5" maxBitrate="384000" avgBitrate="235170" >
                        <decSpecificInfo>
                          <DecoderSpecificInfo type="auto" src="data:application/octet-string,%11%90%56%E5%00" />
                        </decSpecificInfo>
                      </DecoderConfigDescriptor>
                    </decConfigDescr>
                    <slConfigDescr>
                      <SLConfigDescriptor >
                        <predefined value="2" />
                        <custom >
                        </custom>
                      </SLConfigDescriptor>
                    </slConfigDescr>
                  </ES_Descriptor>
                </MPEG4ESDescriptorBox>
              </MPEGAudioSampleDescriptionBox>
            </SampleDescriptionBox>
            <TimeToSampleBox EntryCount="2">
              <BoxInfo Size="32" Type="stts"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
              <TimeToSampleEntry SampleDelta="1024" SampleCount="680"/>
              <TimeToSampleEntry SampleDelta="704" SampleCount="1"/>
              <!-- counted 681 samples in STTS entries -->
            </TimeToSampleBox>
            <SampleToChunkBox EntryCount="2">
              <BoxInfo Size="40" Type="stsc"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
              <!-- counted 681 samples in STSC entries (could be less than sample count) -->
            </SampleToChunkBox>
            <SampleSizeBox SampleCount="681">
              <BoxInfo Size="2744" Type="stsz"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
            </SampleSizeBox>
            <ChunkOffsetBox EntryCount="679">
              <BoxInfo Size="2732" Type="stco"/>
              <FullBoxInfo Version="0" Flags="0x0"/>
            </ChunkOffsetBox>
          </SampleTableBox>
        </MediaInformationBox>
      </MediaBox>
    </TrackBox>
    <UserDataBox>
      <BoxInfo Size="98" Type="udta"/>
      <MetaBox>
        <BoxInfo Size="90" Type="meta"/>
        <FullBoxInfo Version="0" Flags="0x0"/>
        <HandlerBox Type="mdir" Name="" reserved1="0" reserved2="data:application/octet-string,6170706C0000000000000000">
          <BoxInfo Size="33" Type="hdlr"/>
          <FullBoxInfo Version="0" Flags="0x0"/>
        </HandlerBox>
        <ItemListBox>
          <BoxInfo Size="45" Type="ilst"/>
          <ToolBox value="Lavf57.25.100" >
            <FullBoxInfo Version="0" Flags="0x1"/>
            <BoxInfo Size="37" Type=".too"/>
          </ToolBox>
        </ItemListBox>
      </MetaBox>
    </UserDataBox>
  </MovieBox>
</IsoMediaFile>
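To compare the two edit list entries in the dump with the ffprobe values, it may help to convert them to seconds. The sketch below assumes the ISO base media file format convention that an edit's segment duration is expressed in the movie timescale (mvhd, 1000 here), while its media time is expressed in the track's own media timescale (mdhd). The function name and dictionary keys are hypothetical; the numbers come from the dump.

```python
from fractions import Fraction

MOVIE_TIMESCALE = 1000  # mvhd TimeScale from the dump


def describe_edit(segment_duration, media_time, media_timescale, sample_delta):
    """Express one elst entry in seconds and in whole samples (illustrative helper)."""
    return {
        # Segment duration is counted in movie timescale units (assumption per ISO BMFF).
        "presented_seconds": Fraction(segment_duration, MOVIE_TIMESCALE),
        # Media time is counted in the track's media timescale units.
        "skipped_seconds": Fraction(media_time, media_timescale),
        # How many samples of constant duration sample_delta the skipped span covers.
        "skipped_samples": Fraction(media_time, sample_delta),
    }


# Video track: elst MediaTime=512, mdhd TimeScale=12800, stts SampleDelta=256.
video = describe_edit(14500, 512, 12800, 256)
# Audio track: elst MediaTime=1024, mdhd TimeScale=48000, stts SampleDelta=1024.
audio = describe_edit(14500, 1024, 48000, 1024)

for name, edit in (("video", video), ("audio", audio)):
    print(
        f"{name}: presents {float(edit['presented_seconds']):.3f} s, "
        f"skips {float(edit['skipped_seconds']):.6f} s "
        f"({edit['skipped_samples']} sample(s))"
    )
```

Under these assumptions the audio edit skips 1024 media ticks (one AAC access unit, about 0.021333 s), which has the same magnitude as the start_pts=-1024 reported by ffprobe, while the video edit skips 512 ticks (two frame durations, 0.04 s). Both edits present 14.5 s.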
1. What am I missing here? Am I interpreting media time incorrectly, i.e. does it have some other meaning than I think it has?

I also have some bonus questions:

2. Isn't an AAC encoder required to produce full access units (typically 1024 samples each)? 697024/1024 = 680.6875 is not an integer.

3. I know that padding info (for start and end) can be stored in the ITUNSMPB tag, but ffmpeg is not using it, adhering (I hope) to ISO only, so where is this tail padding stored? Or is moov.trak.mdia.mdhd.duration allowed to be lower than the real media duration (which would be divisible by 1024)?

4. If ffmpeg uses the ISO way of delaying AAC audio (instead of the iTunes way), shouldn't it also add a sample group (sgpd) with roll distance set to -1, as the edit list (elst) alone is not enough to signal encoder delay?

OK, that's all for my first post on doom9.
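The arithmetic behind questions 2 and 3 can be checked against the audio stts entries in the dump. The following Python sketch only reproduces the numbers; it assumes that each AAC access unit decodes to 1024 samples and that the 1024 ticks skipped by the edit list are leading samples that are not part of the source audio. Both assumptions are labeled in the code and are not confirmed by the thread.

```python
FRAME_SIZE = 1024  # assumption: samples decoded per AAC-LC access unit

# (SampleCount, SampleDelta) pairs from the audio TimeToSampleBox in the dump.
stts_entries = [(680, 1024), (1, 704)]

SOURCE_SAMPLES = 696000  # 14.5 s at 48 kHz, as stated in the post
LEADING_SAMPLES = 1024   # assumption: equals the audio elst MediaTime

access_units = sum(count for count, _ in stts_entries)
# Sum of the signalled durations; this is what mdhd Duration reports.
signalled_duration = sum(count * delta for count, delta in stts_entries)
# What a decoder would output if every access unit yields a full frame.
decoded_samples = access_units * FRAME_SIZE
# Samples left over at the end once leading samples and source audio are accounted for.
trailing_samples = decoded_samples - LEADING_SAMPLES - SOURCE_SAMPLES

print(access_units)        # 681
print(signalled_duration)  # 697024
print(decoded_samples)     # 697344
print(trailing_samples)    # 320
```

With these numbers the stream still consists of 681 whole access units. The non-integer 680.6875 arises because the last stts entry signals a duration of 704 instead of 1024, and the missing 320 ticks match the trailing samples computed above.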
#2 | Link | ||||
Registered User
Join Date: Dec 2002
Posts: 5,565
|
Quote:
Quote:
Quote:
Quote:
#3 | Link |
Registered User
Join Date: Sep 2010
Posts: 2
|
Thank you for your reply, sneaker_ger!
Full dump: http://paste.przemoc.net/doom9/mp4-d...notes_info.xml |
Tags |
acc, ffmpeg, media_time, mp4, start_pts |