Old 30th March 2007, 06:20   #1  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 429
Converting Bluray disc subtitles

I'm wondering if this is possible yet.

I've tried ripping the data from a bluray disc's .m2ts file. I believe that PIDs starting from 1200 are the subtitle streams, but I have no idea what to do with them.

Anyone have any information on how to convert them to SRTs? Or where I can find the format specifications for them?
Old 30th March 2007, 17:37   #2  |  Link
Pelican9
Coder
 
Join Date: Jan 2007
Location: Around the World
Posts: 697
Quote:
Originally Posted by Rectal Prolapse View Post
I'm wondering if this is possible yet.

I've tried ripping the data from a bluray disc's .m2ts file. I believe that PIDs starting from 1200 are the subtitle streams, but I have no idea what to do with them.

Anyone have any information on how to convert them to SRTs? Or where I can find the format specifications for them?
I think it's similar to the HD DVD's subtitle stream.
Could you send me a sample?
Old 30th March 2007, 20:23   #3  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 429
I'll try to do that this weekend. The router goes down a lot though, so that may not happen.
Old 31st March 2007, 12:08   #4  |  Link
DeepBeepMeep
Registered User
 
Join Date: Jun 2006
Posts: 133
Quote:
Originally Posted by Rectal Prolapse View Post
I'll try to do that this weekend. The router goes down a lot though, so that may not happen.
Rectal Prolapse, which tool did you use to extract the stream? Thanks. I might try it as well to upload a sample for Pelican9.
Old 31st March 2007, 20:54   #5  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 429
I used TSReader Lite. There are also paid versions of it, but I think the Lite version is all we need right now. I guessed that the PIDs for subs begin at PID 1200. I hope I am right!

I would like to try it with VLC too - and see if I get similar results.
Old 13th April 2007, 17:22   #6  |  Link
enantiomer
Registered User
 
Join Date: Jan 2007
Posts: 35
Rectal Prolapse, I agree that PIDs 0x1200 and up are probably where the subtitle streams start.

Pelican9: did you receive a Blu-ray subtitle stream sample? BTW thanks for your efforts on SUPread. Nice little proggie.
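
For anyone who wants to check the PID guess by hand, here is a minimal Python sketch that reads the PID out of a single transport stream packet. It relies on the standard MPEG-2 TS header layout (sync byte 0x47, 13-bit PID in bytes 1 and 2). The 192-byte case assumes that .m2ts packets carry a 4-byte prefix in front of each 188-byte packet; the function name is made up for illustration.

```python
SYNC_BYTE = 0x47


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID of one transport stream packet.

    Accepts a plain 188-byte TS packet, or a 192-byte packet that is
    assumed to be an .m2ts packet with a 4-byte prefix before the sync byte.
    """
    if len(packet) == 192:
        packet = packet[4:]  # drop the assumed 4-byte .m2ts prefix
    if len(packet) != 188 or packet[0] != SYNC_BYTE:
        raise ValueError("not a transport stream packet")
    # PID = low 5 bits of byte 1 followed by all 8 bits of byte 2
    return ((packet[1] & 0x1F) << 8) | packet[2]


# Hand-built packet header with PID 0x1200 and an empty payload
demo = bytes([0x47, 0x52, 0x00, 0x10]) + bytes(184)
print(hex(packet_pid(demo)))
```

Filtering a file would then just mean reading it in fixed-size chunks and keeping the packets whose PID is 0x1200 or whichever subtitle PID you are after.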
Old 13th April 2007, 20:46   #7  |  Link
cfsmp3
Registered User
 
Join Date: Sep 2005
Posts: 15
Any chance I can get a sample as well? I'm doing a lot of subtitle-related work these days.
Old 13th April 2007, 21:50   #8  |  Link
Pelican9
Coder
 
Join Date: Jan 2007
Location: Around the World
Posts: 697
Quote:
Originally Posted by enantiomer View Post
Pelican9: did you receive a Blu-ray subtitle stream sample?
Yes, I did. It's not the same as the HD DVD's subtitle stream. I think it contains some headers.

Quote:
Originally Posted by cfsmp3 View Post
Any chance I can get a sample as well? I'm doing a lot of subtitle-related work these days.
http://www.sendspace.com/file/ce4m87

If you figure out the file structure, I'll change SUPread.

Last edited by Pelican9; 13th April 2007 at 21:55.
Old 15th April 2007, 20:50   #9  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 429
Here is another rip of subtitles, from Memento blu-ray:

http://www.sendspace.com/file/ghi4gb

This should be a better rip than the first one, for which I used TSReader Lite, which doesn't rip from the beginning of the stream!
Old 17th April 2007, 14:40   #10  |  Link
Pelican9
Coder
 
Join Date: Jan 2007
Location: Around the World
Posts: 697
Quote:
Originally Posted by Rectal Prolapse View Post
Here is another rip of subtitles, from Memento blu-ray:

http://www.sendspace.com/file/ghi4gb

This should be a better rip than the first one, for which I used TSReader Lite, which doesn't rip from the beginning of the stream!
I haven't got any info from these files. :-(
Old 21st April 2007, 04:23   #11  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 429
Yeah I know how you feel - it's a big job figuring these files out!

Fortunately, I think I figured out the Run Length Encoding scheme used for the bitmaps. However, I haven't figured out where the subtitle time and duration information is stored, and I also need to decode the color table somehow.

Some examples from my RLE findings:

00 47 80 00 00 = The next 1920 pixels across are of color index 0. The 00 pair at the end signifies the end of a line.

00 84 0e = The next 4 pixels are of the color index 14 (decimal).

00 9e 14 = The next 30 (0x1e) pixels are of the color 20.

00 c1 4e 33 = the next 334 pixels are of the color 51.

If the sequence doesn't begin with a 00 then the value indicates the color of the pixel at that point.

If the sequence ends with two 00's then that means we reached the end of the line.
Old 21st April 2007, 19:47   #12  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 429
More:

00 03 = The next 3 pixels use color index 0.

00 01 = The next pixel is color index 0.

So, basically, 00 is reserved as an RLE command. To get the color 0 you need to do the above.

Given the above, here are some examples:

00 47 80 00 00 00 42 0F 0E 14 00 84 FE 14 0E 00 0F

Breaks down into:

00 47 80 = 1920 pixels of color 0.
00 00 = end of line, go to the beginning of the next line.
00 42 0F = 527 pixels of color 0
0E = pixel at x location 528 is color 14.
14 = pixel at x location 529 is color 20.
00 84 FE = the next 4 pixels are of color 254.
14 = the next pixel is color 20.
0E = the next pixel is color 14.
00 0F = the next 15 pixels are color 0 (a short run with no color byte, as in the 00 03 example).

This will go on until 1920 pixels for the line have been specified, and shall end in a 00 00 pair.

Now, not all subpictures will be 1920 pixels across - this depends on the header before the bitmap info.

Last edited by Rectal Prolapse; 21st April 2007 at 19:57.
Old 25th April 2007, 17:12   #13  |  Link
Pelican9
Coder
 
Join Date: Jan 2007
Location: Around the World
Posts: 697
Code:
rle_coded_line() {
  do {
    if (nextbits != ‘0000 0000b’) {
      pixel_code (8bit)
    } else {
      8-bit_zero (8bit)
      switch_1 (1bit)
      switch_2 (1bit)
      if (switch_1 == ‘0b’) {
        if (switch_2 == ‘0b’) {
          if (nextbits != ’00 0000b’)
            run_length_zero_1-63 (6bit)
          else
            end_of_line_signal (6bit)
        } else {
          run_length_zero_64-16K (14bit)
        }
      } else {
        if (switch_2 == ‘0b’) {
          run_length_3-63 (6bit)
          pixel_code (8bit)
        } else {
          run_length_64-16K (14bit)
          pixel_code (8bit)
        }
      }
    }
  } while (!end_of_line_signal)
}
pixel_code: An 8-bit code, specifying the pixel value as an entry number of a Palette with 256 entries.

8-bit_zero: An 8-bit field filled with ‘0000 0000b’.

switch_1: A 1-bit switch that identifies the meaning of the fields that follow: if set to the value ‘0b’, the field indicates a run-length for a pixel value of ‘0x00’ or an end_of_line_signal; if set to the value ‘1b’, the field indicates a run-length for a pixel value that is not ‘0x00’.

switch_2: A 1-bit switch that identifies the meaning of the fields that follow: if set to the value ‘0b’, the field indicates a small run-length or end_of_line_signal; if set to the value ‘1b’, the field indicates a long run-length.

run_length_zero_1-63: The number of pixels that shall be set to a value of ‘0x00’.

end_of_line_signal: A 6-bit field filled with ’00 0000b’. The presence of this field signals the end of the coded line.

run_length_3-63: The number of pixels that shall be set to the pixel value defined next. This field shall not have a value less than 3.

run_length_zero_64-16K: The number of pixels that shall be set to a value of ‘0x00’. This field shall not have a value less than 64.

run_length_64-16K: The number of pixels that shall be set to the pixel value defined next. This field shall not have a value less than 64.
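
The syntax above can be turned into a small decoder. What follows is only a sketch in Python, not tested against a real disc: the function name and return shape are my own choices, and it does no bounds checking, so truncated data will raise an IndexError. The sample bytes are the ones from the earlier RLE post, with a closing 00 00 added so the second line terminates.

```python
def decode_rle_line(data: bytes, pos: int = 0):
    """Decode one RLE-coded line of palette indexes starting at data[pos].

    Returns (pixels, next_pos); next_pos is the offset just past the
    end_of_line_signal, i.e. where the next coded line starts.
    """
    pixels = []
    while True:
        code = data[pos]
        pos += 1
        if code != 0x00:
            pixels.append(code)            # pixel_code: one literal pixel
            continue

        flags = data[pos]                  # switch_1, switch_2, 6 length bits
        pos += 1
        has_color = bool(flags & 0x80)     # switch_1
        long_run = bool(flags & 0x40)      # switch_2
        run = flags & 0x3F

        if long_run:
            run = (run << 8) | data[pos]   # 14-bit run length
            pos += 1
        elif run == 0 and not has_color:
            return pixels, pos             # end_of_line_signal

        color = 0                          # runs without a color byte are color 0
        if has_color:
            color = data[pos]              # pixel_code that follows the run length
            pos += 1
        pixels.extend([color] * run)


sample = bytes.fromhex("00 47 80 00 00 00 42 0F 0E 14 00 84 FE 14 0E 00 0F 00 00")
line1, pos = decode_rle_line(sample)
line2, pos = decode_rle_line(sample, pos)
print(len(line1), len(line2))
```

A full bitmap decoder would call this once per line until the image height from the section header is reached.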
Old 5th May 2007, 22:40   #14  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 429
Thanks Pelican9!

Last week I spent a lot of time doing more work reverse engineering m2ts subpicture streams.

I believe I have figured out the essentials. I will try to post more details in the coming days.

I can say with absolute certainty that the timestamps are NOT included in the elementary PG stream. They are embedded in the PTS values of the PES packets before each new subtitle.

These PES packets should be extracted by a tool (like TSReader or Manzanita) that preserves all the packets in the stream. Unfortunately this means we will need a transport stream parser.
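
To illustrate the PTS part, here is a rough Python sketch that pulls the PTS out of the start of a PES packet. It uses the standard MPEG-2 PES header layout (start code prefix 00 00 01, flags in byte 7, the 33-bit PTS spread over bytes 9 to 13 with marker bits in between). The function name is made up, and the stream id in the sample packet is just a placeholder, not something I have confirmed for subtitle packets.

```python
def pes_pts(pes: bytes):
    """Return the 33-bit PTS from a PES packet header, or None if absent."""
    if pes[0:3] != b"\x00\x00\x01":
        raise ValueError("missing PES start code prefix")
    if not pes[7] & 0x80:          # top bit of PTS_DTS_flags: PTS present
        return None
    b = pes[9:14]
    if len(b) < 5:
        raise ValueError("PES header too short for a PTS")
    return (
        (((b[0] >> 1) & 0x07) << 30)   # PTS bits 32..30
        | (b[1] << 22)                 # PTS bits 29..22
        | ((b[2] >> 1) << 15)          # PTS bits 21..15
        | (b[3] << 7)                  # PTS bits 14..7
        | (b[4] >> 1)                  # PTS bits 6..0
    )


# Hand-built header: placeholder stream id 0xBD, PTS-only flags, PTS = 90000
header = bytes.fromhex("00 00 01 bd 00 08 80 80 05 21 00 05 bf 21")
print(pes_pts(header))
```

The value is in ticks of a 90 kHz clock, so 90000 corresponds to one second.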

I also discovered how subtitles are forced on.

There are separate commands for defining the palette and the bitmap, and a command for blanking the subtitle. The latter is preceded by a PES packet that specifies the PTS.

One problem remaining is determining the subtitle timing relative to the presentation of the first video frame. I don't yet know how to get this value.

I hope to provide very specific details soon. Hopefully you and Haali and whoever else can make use of this information.

BTW, I used a clean-room approach to this - I do not have the full Blu-ray specification, only the publicly available one. I also used the XVI hex editor and PowerDVD 7.3 for my experiments. Lots of trial and error was involved. Thanks to everyone who helped!
Old 5th May 2007, 23:39   #15  |  Link
Pelican9
Coder
 
Join Date: Jan 2007
Location: Around the World
Posts: 697
Quote:
Originally Posted by Rectal Prolapse View Post
I can say with absolute certainty that the timestamps are NOT included in the elementary PG stream. They are embedded in the PTS values of the PES packets before each new subtitle.
Same as HD DVD.

Quote:
Originally Posted by Rectal Prolapse View Post
These PES packets should be extracted by a tool (like TSReader or Manzanita) that preserves all the packets in the stream. Unfortunately this means we will need a transport stream parser.
I've just asked dmz01 to add a subtitle extractor to his app (called TsRemux).
If he can make a .sup file I will change SUPread to handle it.

Quote:
Originally Posted by Rectal Prolapse View Post
There are separate commands for defining the palette and the bitmap, and a command for blanking the subtitle. The latter is preceded by a PES packet that specifies the PTS.
Same as HD DVD.

Quote:
Originally Posted by Rectal Prolapse View Post
One problem remaining is determining the subtitle timing relative to the presentation of the first video frame. I don't yet know how to get this value.
Same as HD DVD. :-) We have to find the first video PTS and then subtract this value from every subtitle PTS. (It's very easy to find on HD DVD's DSI packet.)
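
As a sketch of that subtraction in Python, assuming both values are raw PTS ticks on the usual 90 kHz MPEG clock and that the target is an SRT-style timestamp. The function name is hypothetical, and PTS wrap-around is not handled; a subtitle PTS earlier than the first video PTS is simply clamped to zero.

```python
PTS_CLOCK_HZ = 90_000  # MPEG PTS values count ticks of a 90 kHz clock


def pts_to_srt_time(pts: int, first_video_pts: int) -> str:
    """Convert a subtitle PTS to an SRT timestamp relative to the first video PTS."""
    ticks = max(0, pts - first_video_pts)      # clamp subtitles that start before the video
    ms = ticks * 1000 // PTS_CLOCK_HZ          # whole milliseconds, rounded down
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


# Subtitle shown exactly one second after a video that starts at PTS 54000
print(pts_to_srt_time(54000 + 90000, 54000))
```

The end time of each SRT entry would come from the PTS of the packet that blanks the subtitle, converted the same way.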

Last edited by Pelican9; 5th May 2007 at 23:49.
Old 5th May 2007, 23:51   #16  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 429
Oh, I'm sure HD-DVD is the same.

I find it very strange that no one has posted their findings on these formats before though - is there another forum for that?

EDIT: Nevermind.

Last edited by Rectal Prolapse; 6th May 2007 at 00:02.
Old 6th May 2007, 00:05   #17  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 429
The subpicture stream consists of data inside sections. Each section begins with an identifier byte, followed by a 16-bit integer that contains the length of the data, followed by the data itself.

Identifiers, in the order they appear in a typical subtitle stream:

16 00 13 07 80 04 38 40 00 01 00 00 00 01 00 00 00 00 00 00 00 00

16 = identifier
00 13 = size of section
07 80 04 38 = 1920x1080.
01 = do not clear. If set to 00 instead it will clear the subpicture!

40 00 01 00 = sequence number??

00 = if set to 40 instead of 00 the next subtitle will be forced on.

17 00 0A 01 00 00 00 00 00 07 80 04 38

17 = ?
07 80 04 38 = 1920x1080

14 04 FD 00 00 00 10 80 80 00

14 = palette definition
04 FD = size of section following in bytes.
00 10 80 80 00 = color index 0 with YCbCr = 10,80,80, alpha channel = 0.

15 89 92 00 00 01 C0 00 89 8B 07 80 04 38 00 47 80 00 00

15 = bitmap picture section
89 92 = size of section in bytes following.
00 00 01 C0 00 = unknown !
89 8B = appears to be another length indicator: 89 92 minus 7 bytes (the 5 unknown bytes plus this 2-byte field).
07 80 04 38 = 1920x1080 image dimensions.

80 00 00
80 = end of picture? No section? End of Epoch? Go blank?
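
The section layout described above (identifier byte, 16-bit big-endian length, data) could be walked with something like the following Python sketch. The names are made up for illustration. The palette parser skips the two bytes that appear before the first entry in the dump, which is an assumption since their meaning is not worked out here, and it returns the two chroma bytes in stream order without deciding which is Cb and which is Cr.

```python
import struct


def iter_sections(stream: bytes):
    """Yield (identifier, data) for each section in an elementary stream dump."""
    pos = 0
    while pos + 3 <= len(stream):
        ident, size = struct.unpack_from(">BH", stream, pos)  # 1-byte id, 16-bit length
        pos += 3
        if pos + size > len(stream):
            raise ValueError("section runs past the end of the data")
        yield ident, stream[pos:pos + size]
        pos += size


def parse_palette(data: bytes):
    """Split a 0x14 section into (index, y, chroma_1, chroma_2, alpha) tuples.

    Assumes two leading bytes of unknown meaning, then 5-byte entries.
    """
    entries = data[2:]
    return [tuple(entries[i:i + 5]) for i in range(0, len(entries) - 4, 5)]


sample = bytes.fromhex(
    "16 00 13 07 80 04 38 40 00 01 00 00 00 01 00 00 00 00 00 00 00 00"
    "17 00 0A 01 00 00 00 00 00 07 80 04 38"
    "14 00 07 00 00 00 10 80 80 00"   # palette cut down to a single entry
    "80 00 00"
)
for ident, data in iter_sections(sample):
    print(f"{ident:02X}", len(data))
```

Keeping the splitter separate from the per-section parsers makes it easy to add handlers for 0x15, 0x16 and 0x17 as their fields get figured out.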
Old 6th May 2007, 00:13   #18  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 429
Quote:
We have to find the first video PTS and then subtract this value from every subtitle PTS. (It's very easy to find on HD DVD's DSI packet.)
This didn't seem to work on the test stream I had. It appeared to be off by 300-400ms. But then again it had VC-1 in it so maybe the TS analyzer (without VC-1 support) I had didn't know where the real video frame was.
Old 6th May 2007, 00:14   #19  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 429
BTW Pelican9, you implied that you already knew the format - is that true? If dmz01 can write a subtitle extractor then he must know the format already?

Is there a reason why this information isn't more widely known? On the other hand, I don't have any special access to Blu-ray specs - so NDAs and licenses are not a concern for me - just old-fashioned true reverse engineering.
Old 6th May 2007, 02:43   #20  |  Link
drmpeg
Registered User
 
Join Date: Jan 2003
Location: Silicon Valley
Posts: 458
Here's an update to xport that demuxes the subtitle (Presentation Graphics) stream.

http://www.w6rz.net/xportpgs.zip

An extra parameter has been added to select the subtitle stream.

xport -h movie.m2ts 1 1 1 1

Output filename is bits0001.pgs and the -u option dumps the PTS.

Ron
__________________
HD MPEG-2 Test Patterns http://www.w6rz.net