View Full Version : Understanding DVD data structures


vi
8th March 2015, 15:39
Hi,

I'm writing a program to extract DVD subtitles. I've spent quite a long time scouring the Internet for information (mainly stnsoft.com; big thanks to whoever maintains that website, as it is basically the only source that describes the IFO structure in detail) and reading the ISO 13818-1 document. However, I still have some trouble piecing everything together.

Namely, what parts, and in what order, must be played ?

If I understood correctly each IFO file describes a list of Titles. Each Title contains a list of Parts-Of-Title, which each point to a Program. The Programs are stored in Program Chains. Each Program contains a list of Cells, which finally point to a block of data that needs to be decoded and played.

What is the correct entry point, the Title or the Program Chain ? If it is the Title, should all the Parts-Of-Title be played in order, or only the first looked up and then the Program Chain from then on ? This is a bit confusing because nothing prevents different Parts-Of-Title of a given Title from pointing to different Program Chains...

Is it even possible to fully know without emulating the DVD VM instructions ?

This is very technical but I hope someone can help me figure this out.

Cheers.

r0lZ
9th March 2015, 12:01
There is no "correct entry point". It depends on the VM command that has been used to call the title. For example, in the pre-commands of the title (of the main movie), there are often several LinkPTT commands that jump to the different chapters (PTTs) of the movie, according to the value of a GPRM (that has been set in the Chapters menu). But of course, if the Title is called directly (and not from the Chapters menu), it's usually the first cell (and first PG and PTT) of the PGC that is played first. You can therefore consider it as the official (or "usual") entry point.

In the PGC, there are also flags to force the programs of the PGC to be played sequentially, randomly, or in shuffle mode. The random and shuffle modes are rarely used, but they exist, and of course, in that case, the programs are not played sequentially.

Don't forget also that a title can have several angles. In that case, although the cells are still played sequentially, only the cells common to all angles or specific to the current angle are played. The cells pertaining to another angle are skipped. (The case of the multi-story Titles is similar, but in that case, there are several PGCs with different sequences of cells. When a PGC is played, the cells pertaining to the other "stories" are also skipped. Since they are not referenced in the PGC, they do not exist from the PGC point of view, but they do exist in the VOBs.)
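To illustrate the angle rule, here is a minimal Python sketch. It assumes each cell has already been tagged with an explicit angle number (None for cells common to all angles). That "angle" field and the names Cell and cells_for_angle are hypothetical, chosen for illustration only; a real IFO parser would have to derive the angle from the cell flags first.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Cell:
    number: int
    # Hypothetical field: None means "common to all angles",
    # otherwise the 1-based angle this cell belongs to.
    angle: Optional[int] = None


def cells_for_angle(cells: List[Cell], angle: int) -> List[Cell]:
    """Keep the common cells and the cells of the requested angle, in PGC order."""
    return [c for c in cells if c.angle is None or c.angle == angle]


# Cell 1 and 4 are common, cell 2 is angle 1, cell 3 is angle 2.
pgc_cells = [Cell(1), Cell(2, angle=1), Cell(3, angle=2), Cell(4)]
print([c.number for c in cells_for_angle(pgc_cells, 2)])  # [1, 3, 4]
```

The cells stay in their original order; only the ones belonging to another angle are dropped.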

Note also that a Title can be split into several PGCs. When that's the case, the first PGC is defined as the Entry PGC in the Title Play Map Table, and the other ones are called via the "Next PGC Link" or with a post or cell command. (The entry PGC is usually the first PGC of the Title, but it's not a requirement.) That feature is also rarely used, but can be useful for games. Anyway, the concept of "entry PGC" for a title is only useful when a multi-PGC Title is called with the JumpTT (Jump to Title) command.

Also, there are two concepts of Title number. There is the global title number, the TT (as seen in the Title menus of some DVD players, and defined in the Title Play Map Table of the VMG), and a title number within the current VTS, the VTS_TT. The TT can be called directly only from the VMG (First-Play PGC or VMGM domain), and the VTS_TT only from the current VTS.

So, to reply to your question, the entry point can be the Title (TT) when a JumpTT command is used to call it from the VMG, a VTS_TT if it is called from the same VTS with JumpVTS_TT, a PGC if it is called with a LinkPGCN or its variants, a PTT with a LinkPTT or JumpVTS_PTT, a Program with a LinkPGN, or a Cell with a LinkCN. It is even possible to return from a menu to any point in the previously played Title with RSM (Resume).

Note also that some cells can be skipped with cell commands. For example, if cell 1 has the cell command "LinkCN 3", when it has finished playing, the cell command is executed, and therefore the playback continues at cell 3, and cell 2 is skipped. The cell commands are rarely used in the middle of a movie, because theoretically it is not possible to play the next cell (cell 3 in my example) seamlessly. Theoretically, the player should pause for one or two seconds at the end of the playback of the cell with a cell command. But cell commands can be used to skip a dummy short cell at the end of the movie (necessary to allow the user to skip the last chapter of the movie with the Next Chapter button of the remote). Cell commands are also used in ARccOS/RipGuard protections to skip the protected cells before the (real) beginning of the movie.
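The "LinkCN 3" example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: every cell command is an unconditional LinkCN (real commands can depend on GPRM values), and the function name playback_order and the link_after mapping are made up for the example.

```python
from typing import Dict, List


def playback_order(cell_count: int, link_after: Dict[int, int]) -> List[int]:
    """Return the order in which cells 1..cell_count would be played.

    link_after maps a cell number to the target of an unconditional
    LinkCN command executed once that cell has finished playing.
    Cells without a command fall through to the next cell.
    """
    order: List[int] = []
    visited = set()
    current = 1
    # Stop at the end of the PGC, or when a link would revisit a cell
    # (a guard against endless loops in this simplified model).
    while 1 <= current <= cell_count and current not in visited:
        visited.add(current)
        order.append(current)
        current = link_after.get(current, current + 1)
    return order


# Cell 1 carries "LinkCN 3", so cell 2 is skipped.
print(playback_order(4, {1: 3}))  # [1, 3, 4]
```

Anything beyond unconditional links (conditions on registers, jumps to other PGCs) would need a real VM emulation.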

Conclusion: Most of the time, when a Title contains a movie, it has a standard structure: a single PGC with some PTTs and PGs (they have to be equal in the Title domain) with each PG containing one or several cells. The first cell is played first, then the next cells are played sequentially, except when the title is multi-angle.

The ARccOS/RipGuard junk adds several dummy short cells before the real beginning of the video, and you will have to skip them if you want to extract only the streams pertaining really to the movie.

It should be possible to detect the correct order of the cells to rip in most cases without having to emulate the VM commands. If the Title is sequential, you can simply analyse the cells of the movie: If there are many short cells with or without cell commands at the beginning of the movie, skip them, because they are probably ARccOS junk. Similarly, you can skip the last cell if it is very short (less than 2 seconds). The remaining cells should not have any cell command (except the last cell). You must also take into account the angle cells (if any), and rip only a specific angle.
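The heuristic above could look roughly like this in Python. It is a sketch, not a tested ripper: it assumes the cell durations (in seconds) have already been read from the PGC, and it reuses the 2-second limit for the leading junk cells as well, which is my own assumption since no threshold is given for them. The name movie_cell_numbers is hypothetical.

```python
from typing import List


def movie_cell_numbers(durations: List[float], short: float = 2.0) -> List[int]:
    """Return the 1-based numbers of the cells that probably belong to the movie.

    durations: playback time of each cell of a sequential PGC, in seconds.
    short: cells shorter than this are treated as junk or dummy cells
           (assumed threshold, see the text above).
    """
    # Skip the run of short cells at the beginning (probable ARccOS junk).
    first = 0
    while first < len(durations) and durations[first] < short:
        first += 1
    # Skip a very short dummy cell at the end, if something else remains.
    last = len(durations)
    if last - first > 1 and durations[last - 1] < short:
        last -= 1
    return list(range(first + 1, last + 1))


durations = [0.5, 0.4, 0.5, 600.0, 720.0, 655.0, 1.0]
print(movie_cell_numbers(durations))  # [4, 5, 6]
```

Angle cells are not handled here; they would have to be filtered before or after this step.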

In all other cases (random or shuffle playback, or a lot of cell commands, or a multi-PGC title) you can usually assume that it's not a movie, and extracting the streams of such titles makes little sense. But there are no rules. For example, I have seen collections of shorts authored as several multi-PGC Titles. The second PGC is used ONLY to jump to the next title when the option to play all shorts has been selected. Of course, it's very confusing! You can of course use PgcEdit to analyse those special cases.

Last note: I haven't explained the organisation of the menu PGCs. There are a lot of differences with the Titles. But I suppose that you are not interested in the menus.

vi
9th March 2015, 21:54
Thanks for the very detailed answer. It will take me a while to process all of it.

So it is possible to have several PGCs in one Title. Which means possibly different CLUTs or having streams becoming available in the middle of a movie. And also VM commands to jump anywhere (Is the VM instruction set Turing-complete ? :D). I'll have to take a few shortcuts and assume these edge cases don't exist because this project is already taking significantly longer than I expected when I started it ;)

What about presentation timestamps when skipping Cells ? Right now to determine the time of a subtitle I'm reading the PTS field of the PES packet which contains the start of the subpicture packet, but if Cells can be in any order then it must be wrong.

By the way is there a test DVD with various configurations available somewhere ?

Cheers.

r0lZ
9th March 2015, 22:59
Yes for the different CLUTs, but no for the different streams. The PGCs of the multi-PGC Title must be in the same VTS domain, and therefore must share the same streams.

I'm not sure about the PTS, but normally, they have to be contiguous only inside the same VOB (the Video OBject, not the .VOB file). In a single Title, there can be several VOBs, and the timestamp is reset at each new VOB. (Look in the Cells Table: when the VOB ID changes, the timestamps are reset.) So, IMO, you have to treat the timestamps of each cell (or at least each VOB) individually.
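If the timestamps really restart at each VOB, one possible way to put the subtitles on a single timeline is to keep an offset per VOB ID. The Python sketch below only illustrates that idea and rests on several assumptions: the PTS values have already been converted to seconds and made relative to the start of their VOB, the cells of a VOB are played one after the other, and the cell durations are known. The name vob_offsets is hypothetical.

```python
from typing import Dict, List, Tuple


def vob_offsets(cells: List[Tuple[int, float]]) -> Dict[int, float]:
    """Map each VOB ID to the playback time at which that VOB starts.

    cells: (vob_id, duration_in_seconds) for each cell, in playback order.
    Assumes all cells of a given VOB are contiguous in that order.
    """
    offsets: Dict[int, float] = {}
    elapsed = 0.0
    for vob_id, duration in cells:
        # The first time a VOB ID is seen, remember the time elapsed so far.
        offsets.setdefault(vob_id, elapsed)
        elapsed += duration
    return offsets


cells = [(1, 600.0), (1, 300.0), (2, 450.0)]
offsets = vob_offsets(cells)
# A subtitle 12.5 s into VOB 2 lands at 900 + 12.5 s on the title timeline.
print(offsets[2] + 12.5)  # 912.5
```

If cells are skipped or reordered inside a VOB, a per-cell offset would be needed instead of a per-VOB one.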

BTW, why do you need to write your own program to extract the subtitles? You can use Jsoto's PgcDemux (http://download.videohelp.com/jsoto/dvdtools.htm), that works pretty well and can be called from the CLI if you wish. It can extract all stream types, or the subtitles only, for a whole PGC, a single VOB or a single cell.