Old 29th March 2024, 20:09   #1  |  Link
Unforgettable
Registered User
 
Join Date: Mar 2024
Posts: 6
How to extract the timestamps of PGS subtitles?

I have an M2TS file from a BD rip that contains one subtitle stream with 490 subtitles in track 4. The subtitles are in PGS format.

Now I would like to extract the timestamps of all tracks, including the subtitle timestamps. The fastest method to achieve that seems to be ffprobe:

Code:
ffprobe -show_entries packet=stream_index,pts,duration -of compact=p=0:nk=1 input.m2ts > timestamps.txt
This works correctly for all tracks except the subtitle track. For the latter, there are nearly 4000 timestamps in the resulting file, which does not match the number of subtitles.
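To check how many packets each stream contributes, the ffprobe output can be tallied per stream index. This is a minimal Python sketch, assuming each line of timestamps.txt has the form stream_index|pts|duration; the function name and the sample lines are made up for illustration.

```python
from collections import Counter


def count_packets_per_stream(lines):
    """Count packet lines per stream index in ffprobe compact output.

    Assumes each line looks like "stream_index|pts|duration"; blank or
    malformed lines are skipped.
    """
    counts = Counter()
    for line in lines:
        fields = line.strip().split("|")
        if not fields[0].isdigit():
            continue
        counts[int(fields[0])] += 1
    return counts


# Hypothetical sample lines, not taken from a real file.
sample = [
    "0|54000000|3753",
    "4|54003003|N/A",
    "4|54002700|N/A",
    "1|54000000|2880",
    "4|53996867|N/A",
]
packet_counts = count_packets_per_stream(sample)
for stream, count in sorted(packet_counts.items()):
    print(f"stream {stream}: {count} packets")
```

For a real file, an open file object can be passed directly, e.g. count_packets_per_stream(open("timestamps.txt")).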

I then found out that the command shown is probably wrong, and tried to get the frame timestamps instead of the packet timestamps:

Code:
ffprobe -show_entries frame=stream_index,pts,duration -of compact=p=0:nk=1 input.m2ts > timestamps.txt
This also went wrong: There were lines for the subtitles in the resulting file, but those lines lacked the timestamps (actually, they consisted only of the word "subtitle" if I remember correctly).

Could somebody please explain why the number of subtitle packets exceeds the number of subtitles by a factor of 10 (roughly), and what would be the correct way to extract PGS subtitle timestamps?

I guess I am able to understand technical explanations to a certain degree, but sometimes I find it hard to understand all the abbreviations and the BD-specific terms. I would be very grateful if somebody could give an explanation that's halfway precise, but in simple words :-)

Thank you very much in advance, and Happy Easter!
Old 30th March 2024, 20:11   #2  |  Link
cubicibo
Registered User
 
Join Date: Feb 2022
Posts: 133
Multiple and vastly different elementary packets are needed to define and display one subtitle. ffprobe simply dumps the PTS of every packet. If your stream is basic enough (one subtitle at a time), you will notice that each group of packets starts with one whose PTS is later than those of the packets that follow it: this is your subtitle PTS. With some basic scripting you should be good to go.

Quote:
1|54003003 <--- display
1|54002700
1|53996867
1|53996991
1|53997474
1|53997474
1|54007507 <--- undisplay
1|54007202
1|54007202
I don't think dumping the packet PTS values is a good approach here. SubtitleEdit should give you usable results without having to worry about any of this.
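To read the raw values in the quote as times, they can be converted to a timecode. This is a minimal Python sketch, assuming the PTS values are ticks of the usual 90 kHz MPEG-TS clock (ffprobe reports each stream's actual time_base, which should be checked first); the function name is made up for illustration.

```python
def pts_to_timecode(pts, clock_hz=90_000):
    """Convert a PTS in clock ticks to HH:MM:SS.mmm (rounded to ms)."""
    total_ms = (pts * 1000 + clock_hz // 2) // clock_hz
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


# The two leading PTS values from the quote above.
print(pts_to_timecode(54003003))
print(pts_to_timecode(54007507))
```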
Old 31st March 2024, 18:18   #3  |  Link
Unforgettable
Registered User
 
Join Date: Mar 2024
Posts: 6
Thank you very much for your reply and your explanations!

I'll try to explain my goal: I am currently working on a program that should eventually prevent the offset between audio and video that arises when converting a BD playlist / a series of M2TS files to one MKV file. So far, I haven't found a program or a set of tools that prevents these offsets, at least when re-encoding is not desired.

The current approach that most (if not all) tools use is to first demux the M2TS files and then mux the individual tracks into the MKV file. IMHO, this approach has some problems, although they may not be noticed at first sight in simple situations, e.g., when there are only a few M2TS files with nothing unusual about them. But as soon as special cases come into play, for example multi-edition MKV files, the situation is different.

Therefore I have chosen a different approach which is based on mkvmerge's ability to use external timestamp files and which does not depend on external demuxers. At the moment, the whole thing is highly experimental, and I can't guarantee that it will work out, but on the other hand, it is promising, and I already have a proof of concept, having created a multi-edition MKV of a BD rip that other tools couldn't create in a clean way.

So, what's my problem?

Of course, the problem is generating the external timestamp files that mkvmerge should process. To generate these files, I first have to extract every timestamp for every frame in every track from the original M2TS files, then do my calculations on them, and finally write them out to the timestamp files for mkvmerge. For the first step (extract timestamps from M2TS), I am using ffprobe, but I would be grateful for other suggestions.

It is quite easy to extract the timestamps from the various audio tracks and video tracks, using the command shown in my first post. The proof of concept mentioned above shows that there are no bad surprises with that. However, in that proof of concept, I have left out the subtitles because I couldn't find a method to extract their timestamps correctly.

Before posting, I had already tried SubtitleEdit. It is a great program, and it can export timestamps. Unfortunately, I haven't been able to find out yet whether there is a command line mode for that. My own program can't operate the GUIs of other programs (yet? - my gut feeling is that I shouldn't go this way). But I'll look into SubtitleEdit again. Perhaps I have missed the command line part.

Thank you very much for confirming that the first subtitle packet timestamp is the subtitle display PTS. I already suspected this, because I noticed that for every subtitle there is a timestamp that is later (in time) than the timestamps that follow in the file. But I thought that this may be random and was unsure.

As mentioned above, mkvmerge seems to need the display PTS and the undisplay PTS for each subtitle in the external timestamp file. So perhaps I can implement something like the following in my program to extract the timestamps from the subtitle tracks:
Code:
1. Move to the beginning (in the sense of file offset) of the M2TS subtitle track
2. Extract next timestamp in the track and save as display PTS of (next) subtitle
3. Throw away all following timestamps that are earlier in time than the one saved in step 2
4. Extract next timestamp and save as undisplay PTS of (next) subtitle
5. Throw away all following timestamps that are earlier in time than the one saved in step 4
6. Go to step 2
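The steps above can be sketched in Python as follows. This is only an illustration under the assumption that subtitles never overlap; the function name is made up, and dropping timestamps that are equal to the saved one (the steps only say "earlier") is a choice of this sketch.

```python
def pair_display_undisplay(pts_values):
    """Apply the steps above to the packet PTS values of one subtitle track.

    Keeps every PTS that is later than the last kept one, then pairs the kept
    values alternately as (display, undisplay). Only valid when subtitles
    never overlap. Equal timestamps are dropped as well, which is an
    assumption of this sketch.
    """
    leaders = []
    for pts in pts_values:
        if not leaders or pts > leaders[-1]:
            leaders.append(pts)
    # An unpaired trailing display PTS gets None as its undisplay time.
    if len(leaders) % 2:
        leaders.append(None)
    return list(zip(leaders[0::2], leaders[1::2]))


# Packet PTS values from the quoted ffprobe output earlier in the thread.
packets = [54003003, 54002700, 53996867, 53996991, 53997474, 53997474,
           54007507, 54007202, 54007202]
print(pair_display_undisplay(packets))
```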
So I have two questions:

- Could you give me a tip regarding a free command line tool that can extract the display and the undisplay PTS from a subtitle track in an M2TS file without hassle? (If there is no such tool, I guess the method above should do for now, and of course I'll look into SubtitleEdit again)

- You have written: "... If your stream is basic enough (one subtitle at a time), ...". That makes me mistrustful :-) Could you give me a hint how to handle situations that are not that easy? Or perhaps an example (since currently I can't imagine the effect or the goal of multiple subtitles at the same time)?

Thank you very much, and best regards!

Last edited by Unforgettable; 31st March 2024 at 20:50.
Old 31st March 2024, 22:27   #4  |  Link
cubicibo
Registered User
 
Join Date: Feb 2022
Posts: 133
Quote:
- Could you give me a tip regarding a free command line tool that can extract the display and the undisplay PTS from a subtitle track in a M2TS file without hassle? (If there is no such tool, I guess the method above should do for now, and of course I'll look into SubtitleEdit again)
I don't know of any such tool. You could chain eac3to or tsmuxer with some custom SUPer Python script... but that would be like driving a bulldozer to buy eggs at the local market. Regardless, the method in the code block seems correct, but you seem to be missing something about PGS+MKV. Per the Matroska specs:
Quote:
A Segment is normally shown until a subsequent Segment is encountered. Therefore the Matroska Block MAY have no Duration. In that case, a player MUST display a Segment within a Matroska Block until the next Segment is encountered.
There's no "undisplay PTS" per se. PGS is just a chain of "segments" that always initiate some drawing operation on your screen. That drawing operation may be anything: clearing the screen, drawing a new subtitle, changing a colour, or just refreshing the screen with the exact same data.

Quote:
Or perhaps an example (since currently I can't imagine the effect or the goal of multiple subtitles at the same time)?
One line of dialogue and a sign. Both would have their own "in" and "out" time in your subtitle file and would overlap at some point in time:
Code:
00:00:10.000 -> 00:00:13.000 Dialogue
00:00:12.000 -> 00:00:15.000 Sign
The two lines overlap for one second, starting at 00:00:12.000. The ffprobe output for such a stream would look like this:
  1. PTS1 <- draw ("display" @ 00:00:10.000)
  2. ... PTS2...i < PTS1
  3. PTSj <- draw (@ 00:00:12.000)
  4. ... PTSk...o < PTSj
  5. PTSp <- draw (@ 00:00:13.000)
  6. ... PTSq...u < PTSp
  7. PTSv <- draw ("undisplay" @ 00:00:15.000)
  8. ... PTSw...z < PTSv
If your stream just "displays" and "undisplays" subtitles one at a time, one after the other, it is fairly straightforward, as every other PTS is an "undisplay" operation. But in the overlapping case above, you cannot know what's going on just by looking at the PTS.
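The overlap example can be reproduced with a small Python sketch. It is a simplified model, not how an authoring tool actually builds a PGS stream: every distinct start or end time is assumed to trigger one draw operation, and the function names are made up for illustration.

```python
def draw_times(subtitles):
    """Return the sorted instants at which the screen content changes.

    `subtitles` is a list of (start, end, text) tuples with times in
    milliseconds. Every distinct start or end time needs one draw operation.
    """
    instants = set()
    for start, end, _text in subtitles:
        instants.add(start)
        instants.add(end)
    return sorted(instants)


def visible_at(subtitles, instant):
    """List the texts on screen right after the draw at `instant`."""
    return [text for start, end, text in subtitles if start <= instant < end]


# The dialogue and the sign from the example above, in milliseconds.
events = [(10_000, 13_000, "Dialogue"), (12_000, 15_000, "Sign")]
for t in draw_times(events):
    print(t, visible_at(events, t))
```

Only the last draw leaves the screen empty, so pairing the draws alternately as "display" and "undisplay" would mislabel the ones in between.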

But those files are quite rare. Most authoring businesses don't know anything about their job and only work with .SRT files that get filtered to remove any overlapping events.

Last edited by cubicibo; 2nd April 2024 at 07:54.
Old 1st April 2024, 20:54   #5  |  Link
Unforgettable
Registered User
 
Join Date: Mar 2024
Posts: 6
Thank you very much again for your patience, for your explanations, for your example of a dialogue and a sign, and thanks for the hint regarding the MKV specification!

I have now understood that I was too naive, but that in practice I might get away with it in most cases. I have also understood that there is no "display" or "undisplay" command for subtitles. But I find those terms very meaningful, so I hope you'll excuse me if I stick with them :-)

In the meantime, I have made some progress:

Using ffprobe with a different output format, I have extracted the subtitle frame (not packet) metadata from a certain M2TS test file, including the timestamps. Fortunately, ffprobe found exactly twice as many subtitle frames as there are subtitles in that file. This almost certainly means that each subtitle has one "display" and one "undisplay" frame.

I have observed that the metadata of each frame contains an entry "start_display_time" and an entry "end_display_time". The first is always 0 and the latter is always (2^32 - 1) for every subtitle frame. This is probably a common way to ensure that a subtitle stays on screen until the next subtitle (frame) replaces it.

I have also observed that there is an entry "num_rects" whose value is always 1 in each "display" subtitle frame and always 0 in each "undisplay" subtitle frame. Perhaps this could also be used to tell "display" frames apart from "undisplay" frames. Although I can't make anything useful of that yet, I thought it might be worth mentioning.
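One way the "num_rects" entry might be used is sketched below in Python. It assumes that ffprobe was run with JSON output and that the result has a top-level "frames" list whose subtitle entries carry "pts" and "num_rects"; the function name and the sample values are hypothetical, and the PTS unit is whatever ffprobe reports.

```python
import json


def visible_intervals(ffprobe_json):
    """Build (start_pts, end_pts) pairs from ffprobe subtitle frames.

    Assumes JSON output with a top-level "frames" list. Entries without a
    "num_rects" field are ignored. A frame with num_rects > 0 is treated as
    visible until the next subtitle frame, whatever that frame draws.
    """
    frames = [f for f in json.loads(ffprobe_json).get("frames", [])
              if "num_rects" in f]
    frames.sort(key=lambda f: int(f["pts"]))
    intervals = []
    for current, following in zip(frames, frames[1:] + [None]):
        if int(current["num_rects"]) == 0:
            continue  # nothing drawn: this frame only clears the screen
        end = int(following["pts"]) if following else None
        intervals.append((int(current["pts"]), end))
    return intervals


# Hypothetical, heavily trimmed ffprobe output for illustration.
sample = json.dumps({"frames": [
    {"pts": 1000, "num_rects": 1},
    {"pts": 4000, "num_rects": 0},
    {"pts": 9000, "num_rects": 1},
]})
print(visible_intervals(sample))
```

Unlike strict alternation, this also copes with a "display" frame that is directly replaced by another "display" frame; a last frame that is never cleared gets None as its end.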

I have made an external subtitle track timestamp file for mkvmerge from the extracted timestamps, and have let mkvmerge process that file when merging the MKV. The result is nearly as expected: all subtitles in the MKV are working correctly, except the very first one. The first subtitle does not disappear after a few seconds (as it should), but remains for minutes until the second subtitle replaces it. I have already opened a question at the mkvtoolnix forum about possible reasons or remedies.

I'll report back here when this problem is solved. Integrating further command line tools or different command line parameters for external tools or implementing the code from my previous post into my program won't be difficult, but first I need to know what the issue with the first subtitle in my test environment is.

Thank you very much, and best regards!