SCR Calculation
jdobbs
3rd November 2003, 00:28
Does anyone have any information on how to calculate the SCR value when remuxing into VOB files?
I'm trying to write a routine to remux .M2V files into a stream from a VOB and am having a hell of a time determining how SCR is calculated. It seems to start out:
(LBA number) * (Size_of_LBA) * 90,000 (clock) / Mux_Rate
But in examining the value in many .VOB files, I can't seem to find any consistency... I'm also having a hard time finding any real consistent relationship between SCR and DTS/PTS... other than it is always lower (as you'd expect).
Any help would be appreciated.
Thanks,
jdobbs
diehardii
14th November 2003, 00:25
Hi, you probably already know this or have it, but the mpeg2 stream standard with the explanation can be found at www.neuron2.net. I am trying to write a live source muxer file and have found the documents there to be very helpful.
~Steve
gmo
14th November 2003, 18:36
Hi,
I hacked together my own multiplexer to build DVD-compliant
PS streams. It works without scanning the input data first
(as the bbmpeg muxer does), so it can create
muxed data from two FIFO files (one video, one audio)
while piping ...
Now back to the SCR/PTS stuff...
To play a video you need to know when to display every image
and output the corresponding sound; simply put, you need a clock!
But in consumer DVD players there is no clock like in other
devices with a nice clock chip. No, the player builds its time by
reading it from the data packs (each 2048 bytes in size).
The pack header contains the SCR timestamp, which tells the
player what time it is. That's all there is to SCRs!
(How they are built, see later.)
Now the player has to display pictures and play sound at certain
times (e.g. every 40ms for PAL, which gives the following row of
times: 0ms (1st picture), 40ms (2nd picture), 80ms (3),
120ms (4), ...) -> these times are the PTS values!
The player can see on its clock (the SCRs) when the right display
time has arrived. But it first needs to decode the data of a
pack, so how could it display the picture in time? ...
There is a simple solution -> every PTS gets an offset of a
few 100 ms, e.g. 320 ms, which results in a time row of
320ms (1st picture), 360ms (2nd picture), 400ms (3), ...
Now when the player reads a pack with picture data, the pack SCR is still
before the PTS of the contained data, so it has enough time to
buffer and decode the picture.
And when the player reaches the pack whose SCR matches that PTS,
it CAN display the picture :-) (the data in that
pack belongs to a picture to be displayed later, and so on) ...
The next use of SCR/PTS is the synchronisation of video and audio.
For this, the packs containing the audio portion of a given video
picture need the same PTS timestamp as the picture.
But often the PTS of the picture data packs and of the corresponding
sound packs differ: for technical reasons there may be
a delay for audio which has to be compensated, that's all.
And the same rule applies to audio: packs must never contain data
with PTS timestamps equal to or lower than their pack SCR.
Now to the calculations: we assume that pictures and
their corresponding audio data (covering the duration of one
picture's display time = 40ms for PAL) are multiplexed alternately, one after the
other.
So we have many units, each made of one picture and its audio data,
and we know the number of bytes in such a unit. Therefore we can
set up the following relation: the bytes of one unit
represent the display duration of one picture (= 40ms for PAL),
so after reading that many bytes 40ms will have passed.
Now these bytes (called the "payload unit") must be broken into smaller
parts to fit into packs (e.g. with a size of 2048 bytes, but some
space is needed for headers). Each part of the unit used
to fill a pack corresponds to a part of the 40ms display
duration. In this way the pack SCR can be calculated with this
formula:
SCR(n) = SCR(n-1) +
(picture duration)*(bytes per part)/(bytes per unit)
the PTS with this formula:
PTS = (picture-no) * (picture duration) + offset
Here is an example for better understanding:
one unit (picture and sound portion) has 40000 bytes in total.
Pack size is 2048 bytes, of which some space is needed for headers.
We assume the free space per pack (= payload size) is the same for
every pack and is 2000 bytes -> in total we get 20 packs for this
unit.
The first pack starts with SCR(1) = 0ms, the second pack gets
SCR(2) = SCR(1) + 40ms * 2000/40000 = 2ms, the next pack gets
SCR(3) = SCR(2) + 40ms * 2000/40000 = 4ms, SCR(4) = 6ms, ... ,
SCR(20) = 38ms.
Only the pack with the 1st part of the picture data gets a PTS
timestamp; the other packs of this picture don't need one.
We assume an offset of 320 ms, and the data is embedded in the 1st pack.
The result for this pack is SCR=0ms and PTS=320ms.
The pack with the first part of the audio data also gets a PTS
(same offset of 320ms plus an additional delay of 200 ms); we assume it
is embedded in the last pack (no. 20).
The result for this pack is SCR=38ms and PTS=520ms.
And for the bytes of the next payload unit it goes exactly
the same way:
SCR(21)=40ms/PTS=360ms (oh wonder, this is the next picture
after 40ms of clock time :-), SCR(22)=42ms, ...
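The worked example above can be replayed in a few lines of Python. This is only a sketch of the per-pack rule as described in this post, not code from any muxer; the names `unit_scrs` and `video_pts_ms` are made up for illustration, and it assumes every pack carries the same 2000-byte payload, as in the example.

```python
from fractions import Fraction

def unit_scrs(start_ms, picture_ms, unit_bytes, payload_bytes):
    """Return (SCR of each pack in ms, SCR at which the next unit starts)."""
    scrs = []
    scr = Fraction(start_ms)
    remaining = unit_bytes
    while remaining > 0:
        part = min(payload_bytes, remaining)   # bytes of the unit in this pack
        scrs.append(scr)
        # SCR(n) = SCR(n-1) + (picture duration) * (bytes per part) / (bytes per unit)
        scr += Fraction(picture_ms * part, unit_bytes)
        remaining -= part
    return scrs, scr

def video_pts_ms(picture_no, picture_ms, offset_ms):
    """PTS = (picture-no) * (picture duration) + offset, picture-no counted from 0."""
    return picture_no * picture_ms + offset_ms

scrs, next_start = unit_scrs(0, 40, 40000, 2000)
print(len(scrs), [int(s) for s in scrs[:4]], int(scrs[-1]), int(next_start))
print(video_pts_ms(0, 40, 320), video_pts_ms(1, 40, 320))
```

With the numbers from the example this gives 20 packs with SCRs of 0, 2, 4, ... 38 ms, the next unit starting at 40 ms, and video PTS values of 320 ms and 360 ms for the first two pictures, so every SCR stays below the PTS of the data it carries.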
A few more words on packs without any payload, like VOBU system
headers: they don't advance the SCR timing, because they contain
no payload! Such packs get the same SCR as the pack before.
My time is too short for further explanations about DTS timestamps
(I will talk about them another day), but you can omit them -> consumer
players will play the stream nevertheless ...
But another important point: to make it simpler, you should place
a padding packet into a pack if the last part of the video/audio data
of a unit doesn't use up the whole payload space of the pack.
That way the next pack starts with the 1st part of the following
video/audio data.
Note: the bytes of a padding packet don't count as
payload!
Furthermore, VOBs often have PTS/DTS values only for every
(or every 2nd) I-picture. The rest of the pictures don't need any.
Hope you get a better understanding with the help of my more realistic
point of view...
- GMo -
unixfs
16th November 2003, 09:33
wow, this is the best explanation I found on the subject. THANKS!
Is your muxer open source, or do you plan to release it as such?
It would be very appreciated :)
jdobbs
16th November 2003, 11:53
Good explanation -- thanks. This helps a lot. One thing I've noticed in DVD is a constant value of 146.286 (in 90 kHz ticks) that gets added to every pack -- which I believe is the actual read time for one sector from the disk.
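For what it's worth, the 146.286 figure drops out of the formula from the first post if one plugs in a 2048-byte sector and a mux rate of 10.08 Mbit/s (1,260,000 bytes/s). That mux rate is not stated anywhere in this thread; it is assumed here as the usual DVD program mux rate, and the helper name `scr_for_lba` is made up for this sketch.

```python
from fractions import Fraction

SECTOR_BYTES = 2048
CLOCK_HZ = 90_000                 # SCR base clock used in this thread
MUX_RATE_BYTES_PER_S = 1_260_000  # assumption: 10.08 Mbit/s, not given in the thread

# Time to deliver one sector at the mux rate, in 90 kHz ticks.
TICKS_PER_SECTOR = Fraction(SECTOR_BYTES * CLOCK_HZ, MUX_RATE_BYTES_PER_S)

def scr_for_lba(lba):
    """(LBA number) * (Size_of_LBA) * 90,000 / Mux_Rate, kept as an exact fraction."""
    return lba * TICKS_PER_SECTOR

print(TICKS_PER_SECTOR, float(TICKS_PER_SECTOR))
print(float(scr_for_lba(7)))
```

Under that assumption one sector is worth exactly 1024/7 ticks, which rounds to 146.286. Keeping the value as a fraction avoids accumulating rounding error over many thousands of packs.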
gmo
16th November 2003, 14:10
Hi,
I'm currently working on my multiplexer and it will be GPL'd software...
While testing around with this stuff I get new ideas every day,
and I have a further solution for how it could be done:
As in my last posting, we look at a unit of
one video frame and the corresponding audio portion.
The bytes of such a unit can be packed into the usual packs of
2048 bytes containing the known headers with SCR and PTS
values.
After building (writing) all the packs for one unit, including the packs
(without payload) for VOBUs and similar, we can create the
SCR values for each pack afterwards!
This is closer to the known formula with the mux rate, because
the bytes of one unit represent our picture duration (40ms), and
therefore the number of bytes after "packing" the unit (including
all payloadless packs) does so too.
Now we can use the following formula for one SCR step:
SCRstep = (packsize) / (bytes of "packed" unit) * (picture duration)
(packsize is 2048 bytes, do you remember?)
Starting with SCR(1) = 0, we reach a value a bit before 40ms for
the last pack of the unit, and the first pack of the next unit gets
exactly SCR=40ms. This means you have to calculate the SCRs as
follows:
SCR(n) = SCR(n-1) + SCRstep
starting with n=2 and SCR(1) = 0.
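A small sketch of this second method, assuming the unit has already been written out as whole 2048-byte packs. The pack count of 21 (say, 20 payload packs plus one payloadless pack) is an invented figure, and `scr_steps` is a hypothetical name; times are in 90 kHz ticks, so 40 ms is 3600 ticks.

```python
from fractions import Fraction

PACK_SIZE = 2048

def scr_steps(start_ticks, picture_ticks, packs_in_unit):
    """SCRs for every pack of one packed unit, plus the start of the next unit."""
    packed_bytes = packs_in_unit * PACK_SIZE
    # SCRstep = (packsize) / (bytes of "packed" unit) * (picture duration)
    step = Fraction(PACK_SIZE * picture_ticks, packed_bytes)
    scrs = [start_ticks + n * step for n in range(packs_in_unit)]
    return scrs, start_ticks + packs_in_unit * step

scrs, next_unit = scr_steps(0, 3600, 21)
print(float(scrs[1]), float(scrs[-1]), next_unit)
```

Because every pack has the same size, the step reduces to the picture duration divided by the number of packs in the unit, so it changes from unit to unit with the size of the picture.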
For the PTS stuff there is no difference to my last post (in the 1st
video/audio pack you can put the PTS including an offset...)
Now some words on DTS (Decoding Time Stamp) values: the DTS
tells the player when to decode the picture and should be less than or equal to
the PTS value (equal is allowed because of the assumption that the decoder
needs no time to decode ... as described in the MPEG specs).
Because a P-frame has to be decoded before the B-frames that precede it in
display order but is displayed after them, its DTS should be less than its PTS, and its PTS should be greater(!)
than the PTS of those B-frames. Therefore in VOB streams only the I-frames
(or every 2nd one) have a PTS -> this keeps the PTS handling a bit
easier...
Furthermore, in VOB files DTS values are also used only for I-frames. They
should be equal to the PTS, but often they are lower (by about 40ms or 80ms)
-> probably this is caused by "open" GOPs, where the B-frames following
the I-frame must be DISPLAYED earlier but need the I-frame for their DECODING
first.
Because I haven't worked out yet exactly how the DTS should be placed,
I simply omit it and put only PTS values on every 2nd I-frame.
The result plays fine on my consumer DVD player (Pioneer DV-444), because there are
a lot more parameters in an MPEG stream that help the player work out when it has
to decode a frame ... and it makes use of them if no DTS is present.
Hope this helps.
- GMo -
gmo
17th November 2003, 19:09
Hi,
the source code of my multiplexer (lvemux) can be found here
http://home.arcor.de/gmo18t/lve/download/lvemux.tgz
It is a Linux program using nothing special except libpthread
for a "piping" mode, but this can be removed when porting to
Windows. Then only "file" mode will be available...
The main muxing part (= API) is separated into the files "mux_out.c"
and "mux_out.h". If somebody wants to write his own muxer, he can
use it. In that case libpthread isn't needed at all, and only
the reading of the audio and video data has to be programmed
-> but be sure to call the mux routines every time with a full
video frame (or with 2 interlaced fields) and subsequently the
corresponding audio portion!
- GMo -
unixfs
17th November 2003, 23:07
Good, a new GPL muxer was really needed.
Thanks a lot!
gmo
18th November 2003, 18:37
Hi,
lvemux can be found here (updated)
http://sourceforge.net/project/showfiles.php?group_id=97090
- GMo -
jdobbs
23rd November 2003, 13:08
to make it simplier, you should place
padding packets into a pack
gmo
I noticed one exception here I'm not sure how to handle. I can pad the stream to the end of the payload area as long as I have 6 or more bytes left (4 bytes for the start-code and 2 for the length) then pad with 0xFF -- what do you do if there are fewer than 6 bytes left?
mpucoder
23rd November 2003, 23:18
In that case you pad the video/audio/subpicture packet itself, see the PES header. You add the excess length to the PES header data length, and place that number of pad bytes between the header data and the video/audio/subpicture data.
And you really should do this for less than 7 bytes as some software barfs on a 0 length pad.
btw, it is a requirement to pad the last pack of each stream within a VOBU, i.e. every pack must be full of something. And if anyone is thinking of saving a little space by placing 2 or more streams in one pack, forget it - not allowed (except the NAV pack, which has DSI and PCI packets).
jdobbs
24th November 2003, 00:57
Thanks. That makes sense -- I was wondering when you would ever use the pad in the PES header, but wasn't sure that was the answer. Glad you also mentioned the zero length pad -- because I'd set mine to use it... I'll modify it to use the PES pad.
gmo
24th November 2003, 14:41
Hi,
thanks for the hints ...
@mpucoder:
the maximum possible value for header data length
would be 255 (8 bit). could the number of stuffing
bytes be then also maximally 255 or is there a
smaller limit value for the number of stuffing
bytes?
- GMo -
mpucoder
24th November 2003, 16:00
The combined length of the data and stuffing can not exceed 255, as you noted. This means at most 255 stuffing bytes if there is no data. But generally you do not stuff the PES with more than 6 bytes, you use the padding stream (01 BE) for 7 or more.
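Put together, the rule from the last few posts might look like the sketch below. It assumes `free_bytes` is what is left in the 2048-byte pack after the packet's data; the function name and the returned labels are invented for illustration. For 1 to 6 bytes it returns the 0xFF stuffing to place inside the PES header (the caller still has to add that count to the PES header data length and to the packet length); for 7 or more it returns a complete padding packet.

```python
PADDING_START_CODE = b"\x00\x00\x01\xbe"   # padding stream (01 BE)

def fill_pack_tail(free_bytes):
    """Decide how to fill the unused bytes at the end of a pack."""
    if free_bytes < 0:
        raise ValueError("pack overflow")
    if free_bytes == 0:
        return "exact_fit", b""
    if free_bytes < 7:
        # Too small for a padding packet with a non-zero length:
        # stuff the PES header of the data packet instead.
        return "pes_stuffing", b"\xff" * free_bytes
    # 4-byte start code + 2-byte length + at least 1 pad byte.
    body_len = free_bytes - 6
    packet = PADDING_START_CODE + body_len.to_bytes(2, "big") + b"\xff" * body_len
    return "padding_packet", packet

for n in (0, 2, 6, 7, 100):
    kind, data = fill_pack_tail(n)
    print(n, kind, len(data))
```

In every case the returned bytes are exactly `free_bytes` long, so the pack always ends up full.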
Also, quick note: the pack header (01 BA) has a 3-bit field for stuffing; don't use it for DVD. Most players and software expect the PES header to start at offset 0x00E. They demultiplex by examining the byte at offset 0x011, which should be the stream ID.
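As an illustration of that layout, the following sketch reads the stream ID the way described here. It assumes one 2048-byte sector starting with a 14-byte MPEG-2 pack header whose stuffing field is zero; `stream_id_of` and the test sector are made up for the example, and the sector's timestamp fields are left zeroed, so it is not a valid pack in any other respect.

```python
PACK_START = b"\x00\x00\x01\xba"
PES_OFFSET = 0x00E        # first byte after the 14-byte pack header
STREAM_ID_OFFSET = 0x011  # 00 00 01 prefix, then the stream ID

def stream_id_of(sector):
    """Return the stream ID of the first packet in a DVD pack."""
    if len(sector) != 2048 or sector[:4] != PACK_START:
        raise ValueError("not a 2048-byte pack")
    if sector[13] & 0x07:
        raise ValueError("pack header stuffing present; fixed offsets do not apply")
    if sector[PES_OFFSET:PES_OFFSET + 3] != b"\x00\x00\x01":
        raise ValueError("no start code at offset 0x00E")
    return sector[STREAM_ID_OFFSET]

# Hypothetical sector: bare pack header, then a video packet start code (E0).
sector = bytearray(2048)
sector[0:4] = PACK_START
sector[13] = 0xF8                        # 5 reserved bits set, stuffing length 0
sector[0x0E:0x12] = b"\x00\x00\x01\xe0"
print(hex(stream_id_of(bytes(sector))))
```

Rejecting packs with a non-zero stuffing length mirrors the point above: once the pack header is stuffed, the fixed offsets no longer line up.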
jdobbs
25th November 2003, 19:59
I gotta' tell you, mpucoder, you are one fountain of knowledge.
jdobbs
26th November 2003, 15:03
@mpucoder
When I set the PES header data length and then pad with 0xFF before the payload -- it seems to confuse IFOEDIT (I'm using IFOEDIT to view my resulting VOB files). I have fears I'm doing something wrong. Here's an example:
1. I'm at the end of a GOP, so I'm about to flush my data and start with a new NAVPACK.
2. In the flush process I see that I would have only two remaining bytes in the payload area (offset 0x7FE and 0x7FF) after loading with the remaining MPEG data.
3. So I put the value of 2 in the PES Header data length (it was 0) and pad with 0xFF, 0xFF -- followed by the MPEG data to the end of the payload area.
4. When I review the output -- here is what IFOEDIT shows:
[0016] PES HeaderData length 2 [02]
[0017] PES HeaderData 255 255 [ff ff ]
header data details:
[0000] MPEG Sequence Header start code 567662809 [21d5d8d9]
[0004] Video attributes 250121551 [0ee88dd9]
Video width: 238
Video height: 2189
....
followed by more garbage
I gotta be doing something wrong -- I also don't see the start code in the hex stream that IFOEDIT seems to be attempting to display...
Any help would be appreciated.
jdobbs
mpucoder
26th November 2003, 15:34
IfoEdit? Don't you mean VobEdit?
I've been looking for some examples in commercial DVDs to see if VobEdit interprets it correctly, but little luck. I found one which should be even more rare, an exact fit. That makes me wonder if they padded the video with 0 bytes (you can do this, too, as the decoder just keeps looking for a start code)
Make sure the packet length is 2028 (a full payload). VobEdit will attempt to interpret some video data if a start code is present in the first 3 bytes of the packet payload. It will then try to interpret the data beyond the packet.
Can you send just that sector to mpucoder@comcast.net ?
Edit: I found an example, and VobEdit does interpret it strangely. In this example the PES header data content flags are 8100 (the usual for nothing extra), PES header data length 5, followed by 5 bytes of 0xff. After that VobEdit tried to interpret the data fragment, even though there is no start code. Apparently the determination to interpret is based on the presence of PES header data, normally present only on the first packet of the VOBU (for DTS/PTS).
mpucoder
26th November 2003, 16:13
Up above I said that padding was required on all streams at the end of the VOBU. This is not entirely correct. Video, mpeg audio, and subpictures need to be padded. AC3, and probably other audio types, do not pad every VOBU, just at the end of the cell. DTS, of course, needs no padding as the DVD version is tailored to fit exactly.
jdobbs
26th November 2003, 16:15
Whoops -- yes I meant VOBEdit. I'm constantly clicking between one and the other as I'm building this remuxer.
I'll send the sector. Interestingly enough I also found an exact fit in my file too (only one though).
I had the packet length set to 2028 (0x7EC).
jdobbs
26th November 2003, 16:18
I think this might be a bug in VOBEdit.
jdobbs
26th November 2003, 22:45
This SCR is really kicking my butt. I can't seem to get a steady picture. I've tried several different methods. I took the same M2V file and ran it through Maestro, and it appears that it simply outputs LBAs whose SCR is incremented by 146.286 per pack. Then there is a leap of several thousand ticks at the beginning of each GOP. That seems easy enough -- but when I do it my video stutters...
I tried using the PTS to calculate the time to add to each packet -- no dice.
I tried collecting an entire GOP and distributing the time evenly across it -- nope.
I tried calculating by frame as gmo indicated in this stream -- it doesn't work for me either.
Somebody has to know something more than what I've gotten so far :scared:
Added: Could it be I'm not adding the RFF flags correctly? I set the flag on for frames 2, 3, and 4 out of every 6 frames... (the source is FILM to be played on NTSC).
mpucoder
26th November 2003, 23:05
That's not the proper sequence for 2:3 pulldown. Both the rff and tff flags are involved, in this sequence
tff = 1, rff = 0
tff = 1, rff = 1
tff = 0, rff = 0
tff = 0, rff = 1
Also, pay attention to the temporal sequence number in the Picture Header (01 00). The rff/tff sequence goes with the display order (given by the temporal sequence), not the encoded order.
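To see why this particular order matters, one can expand the flags into fields. The sketch below relies on the usual MPEG-2 meaning of the flags for progressive frames (tff picks which field comes first, rff repeats that first field), which is not spelled out in this thread; `fields_for` is a made-up name. Four film frames should come out as ten fields that strictly alternate top/bottom.

```python
# (tff, rff) per frame in display order, as listed above.
PULLDOWN = [(1, 0), (1, 1), (0, 0), (0, 1)]

def fields_for(frame_count):
    """Expand frames in display order into the fields they produce (T or B)."""
    fields = []
    for i in range(frame_count):
        tff, rff = PULLDOWN[i % 4]
        first, second = ("T", "B") if tff else ("B", "T")
        fields += [first, second]
        if rff:
            fields.append(first)   # repeat_first_field: show the first field again
    return fields

print("".join(fields_for(4)))
print(len(fields_for(24)))
```

Any other flag pattern that yields two fields of the same parity in a row breaks the top/bottom alternation, which is one plausible source of stutter.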
jdobbs
26th November 2003, 23:47
Whoah....:eek: Guess I should have looked a little closer at that... I thought of doing the pulldown as kind of an aside so I could skip doing it on a command line before remerging into the VOB. So I dumped out a few GOPs and threw a couple of lines of code at it. Guess I'll have to fix it now.
Serves me right for not doing the research.
jdobbs
27th November 2003, 00:04
Is it safe to say that I can take the 2 least significant bits (bits 6 & 7 of byte 5 in the header) of the temporal sequence number to determine RFF/TFF output? As in:
00 = TFF=1, RFF=0
01 = TFF=1, RFF=1
10 = TFF=0, RFF=0
11 = TFF=0, RFF=1
BTW -- your help is truly appreciated. I think the resulting product will be worth the effort. I'm building a "two click" utility that prepares CCE encoding (or other third party encoders) and then reintegrates it back into the original DVD structure for output to a DVD-5.
mpucoder
27th November 2003, 05:38
Originally posted by jdobbs
Is it safe to say that I can take the 2 least significant bits (bits 6 & 7 of byte 5 in the header) of the temporal sequence number to determine RFF/TFF output? As in:
00 = TFF=1, RFF=0
01 = TFF=1, RFF=1
10 = TFF=0, RFF=0
11 = TFF=0, RFF=1
Almost - you must continue the sequence into the following GOPs. One way is to use a byte (char) initialized to 2 for the first GOP. Let's call this byte rff_tff. Add rff_tff to each picture's temporal sequence number and use the 2 least significant bits of the sum to form tff (bit 1) and rff (bit 0). At the end of the GOP add the number of pictures in the GOP to rff_tff for the next GOP.
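A sketch of that bookkeeping, under the assumption that each GOP is given simply as its picture count and that temporal sequence numbers run from 0 to n-1 within a GOP; `pulldown_flags` is a hypothetical helper, not code from any of the tools mentioned here.

```python
def pulldown_flags(gop_sizes):
    """Return (tff, rff) for every picture, in display order, across all GOPs."""
    rff_tff = 2                      # initial value for the first GOP
    flags = []
    for pictures_in_gop in gop_sizes:
        for temporal_ref in range(pictures_in_gop):
            v = (rff_tff + temporal_ref) & 3
            flags.append(((v >> 1) & 1, v & 1))   # tff = bit 1, rff = bit 0
        # Carry the position in the 4-picture cycle into the next GOP;
        # masking to 8 bits mimics the byte (char) counter.
        rff_tff = (rff_tff + pictures_in_gop) & 0xFF
    return flags

print(pulldown_flags([4]))
print(pulldown_flags([3, 3]))
```

When writing the flags into the stream, the lookup has to be done per picture with that picture's own temporal sequence number, since the pictures are stored in encoded order rather than display order.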
jdobbs
27th November 2003, 11:17
@mpucoder
Thanks.
Follow-up: That was it. The remultiplexed VOBs are working now.
diehardii
5th December 2003, 05:14
Hi,
Before I spend an inordinate amount of time porting lvemux, has anyone created a windows program using the base of lvemux? Thanks.
~Steve
huangy
16th December 2003, 05:15
Originally posted by gmo
Hi,
lvemux bugfix here (updated)
http://home.arcor.de/gmo18t/lve/download/lvemux-031125.tgz
- GMo -
I can not download it, could you send me a copy of that?
my email: dr_huangy@yahoo.com. Thanks a lot!
mpucoder
16th December 2003, 07:08
Try here (http://lvempeg.sourceforge.net/)
older stuff here (http://earth.prohosting.com/gmo18t/)
(found with google in 3 minutes)
huangy
16th December 2003, 07:17
Originally posted by mpucoder
Try here (http://lvempeg.sourceforge.net/)
older stuff here (http://earth.prohosting.com/gmo18t/)
(found with google in 3 minutes)
Got it! Thank you very much!