Subtitles streams greater than #31
DMD
6th February 2011, 10:22
The stream I am interested in is the "ghost" stream no. 66 (forced Italian).
The other Italian streams are the full subtitles and the commentary.
Ghitulescu
6th February 2011, 11:32
I haven't yet seen a BD with so many subtitles. Maybe the software is right and the "ghost" streams are indeed ghosts, e.g. they refer back to a valid stream while their own payload is 0.
chompy
6th February 2011, 15:49
It's easy (when you know)... The subtitles above number 32 are stored as text subtitles in another m2ts. You can use BDEdit to find out which m2ts it is: go to the movie PlayList and you'll find it under SubPath.
Then you just need to use deank's textST with this m2ts: http://multiavchd.deanbg.com/textST.exe
Greetings
DMD
6th February 2011, 19:32
I confirm that from stream no. 33 onward the subtitles are not in PGS format.
I checked this with the files 00000.mpls and 00001.mpls
in the path BDMV\PLAYLIST.
Checking with MediaInfo, the phantom stream I am interested in extracting (stream no. 66) is detected:
Testo #66
ID : 8448 (0x2100)
Formato : Subtitle
Lingua : Italiano
Files 00000.mpls and 00001.mpls:
Testo #1
ID : 4608 (0x1200)
Formato : PGS
Lingua : Giapponese
Testo #2
ID : 4609 (0x1201)
Formato : PGS
Lingua : Inglese
Testo #3
ID : 4610 (0x1202)
Formato : PGS
Lingua : Francese
Testo #4
ID : 4611 (0x1203)
Formato : PGS
Lingua : Italiano
Testo #5
ID : 4612 (0x1204)
Formato : PGS
Lingua : Tedesco
Testo #6
ID : 4613 (0x1205)
Formato : PGS
Lingua : Spagnolo
Testo #7
ID : 4614 (0x1206)
Formato : PGS
Lingua : Olandese
Testo #8
ID : 4615 (0x1207)
Formato : PGS
Lingua : Coreano
Testo #9
ID : 4616 (0x1208)
Formato : PGS
Lingua : Svedese
Testo #10
ID : 4617 (0x1209)
Formato : PGS
Lingua : Danese
Testo #11
ID : 4618 (0x120A)
Formato : PGS
Lingua : Finnico
Testo #12
ID : 4619 (0x120B)
Formato : PGS
Lingua : Norvegese
Testo #13
ID : 4620 (0x120C)
Formato : PGS
Lingua : Portoghese
Testo #14
ID : 4621 (0x120D)
Formato : PGS
Lingua : Cinese
Testo #15
ID : 4622 (0x120E)
Formato : PGS
Lingua : Giapponese
Testo #16
ID : 4623 (0x120F)
Formato : PGS
Lingua : Giapponese
Testo #17
ID : 4624 (0x1210)
Formato : PGS
Lingua : Norvegese
Testo #18
ID : 4625 (0x1211)
Formato : PGS
Lingua : Portoghese
Testo #19
ID : 4626 (0x1212)
Formato : PGS
Lingua : Giapponese
Testo #20
ID : 4627 (0x1213)
Formato : PGS
Lingua : Francese
Testo #21
ID : 4628 (0x1214)
Formato : PGS
Lingua : Italiano
Testo #22
ID : 4629 (0x1215)
Formato : PGS
Lingua : Tedesco
Testo #23
ID : 4630 (0x1216)
Formato : PGS
Lingua : Spagnolo
Testo #24
ID : 4631 (0x1217)
Formato : PGS
Lingua : Giapponese
Testo #25
ID : 4632 (0x1218)
Formato : PGS
Lingua : Inglese
Testo #26
ID : 4633 (0x1219)
Formato : PGS
Lingua : Francese
Testo #27
ID : 4634 (0x121A)
Formato : PGS
Lingua : Italiano
Testo #28
ID : 4635 (0x121B)
Formato : PGS
Lingua : Tedesco
Testo #29
ID : 4636 (0x121C)
Formato : PGS
Lingua : Spagnolo
Testo #30
ID : 4637 (0x121D)
Formato : PGS
Lingua : Olandese
Testo #31
ID : 4638 (0x121E)
Formato : PGS
Lingua : Portoghese
Testo #32
ID : 4639 (0x121F)
Formato : PGS
Lingua : Giapponese
Testo #33
Formato : Subtitle
Lingua : Inglese
Testo #34
ID : 256 (0x100)
Formato : Subtitle
Lingua : Francese
Testo #35
ID : 512 (0x200)
Formato : Subtitle
Lingua : Italiano
Testo #36
ID : 768 (0x300)
Formato : Subtitle
Lingua : Tedesco
Testo #37
ID : 1024 (0x400)
Formato : Subtitle
Lingua : Spagnolo
Testo #38
ID : 1280 (0x500)
Formato : Subtitle
Lingua : Olandese
Testo #39
ID : 1536 (0x600)
Formato : Subtitle
Lingua : Svedese
Testo #40
ID : 1792 (0x700)
Formato : Subtitle
Lingua : Danese
Testo #41
ID : 2048 (0x800)
Formato : Subtitle
Lingua : Finnico
Testo #42
ID : 2304 (0x900)
Formato : Subtitle
Lingua : Norvegese
Testo #43
ID : 2560 (0xA00)
Formato : Subtitle
Lingua : Portoghese
Testo #44
ID : 2816 (0xB00)
Formato : Subtitle
Lingua : Inglese
Testo #45
ID : 3072 (0xC00)
Formato : Subtitle
Lingua : Francese
Testo #46
ID : 3328 (0xD00)
Formato : Subtitle
Lingua : Italiano
Testo #47
ID : 3584 (0xE00)
Formato : Subtitle
Lingua : Tedesco
Testo #48
ID : 3840 (0xF00)
Formato : Subtitle
Lingua : Spagnolo
Testo #49
ID : 4096 (0x1000)
Formato : Subtitle
Lingua : Olandese
Testo #50
ID : 4352 (0x1100)
Formato : Subtitle
Lingua : Svedese
Testo #51
ID : 4608 (0x1200)
Formato : Subtitle
Lingua : Danese
Testo #52
ID : 4864 (0x1300)
Formato : Subtitle
Lingua : Finnico
Testo #53
ID : 5120 (0x1400)
Formato : Subtitle
Lingua : Inglese
Testo #54
ID : 5376 (0x1500)
Formato : Subtitle
Lingua : Olandese
Testo #55
ID : 5632 (0x1600)
Formato : Subtitle
Lingua : Svedese
Testo #56
ID : 5888 (0x1700)
Formato : Subtitle
Lingua : Danese
Testo #57
ID : 6144 (0x1800)
Formato : Subtitle
Lingua : Finnico
Testo #58
ID : 6400 (0x1900)
Formato : Subtitle
Lingua : Norvegese
Testo #59
ID : 6656 (0x1A00)
Formato : Subtitle
Lingua : Portoghese
Testo #60
ID : 6912 (0x1B00)
Formato : Subtitle
Lingua : Svedese
Testo #61
ID : 7168 (0x1C00)
Formato : Subtitle
Lingua : Danese
Testo #62
ID : 7424 (0x1D00)
Formato : Subtitle
Lingua : Finnico
Testo #63
ID : 7680 (0x1E00)
Formato : Subtitle
Lingua : Norvegese
Testo #64
ID : 7936 (0x1F00)
Formato : Subtitle
Lingua : Inglese
Testo #65
ID : 8192 (0x2000)
Formato : Subtitle
Lingua : Francese
Testo #66
ID : 8448 (0x2100)
Formato : Subtitle
Lingua : Italiano
Testo #67
ID : 8704 (0x2200)
Formato : Subtitle
Lingua : Tedesco
Testo #68
ID : 8960 (0x2300)
Formato : Subtitle
Lingua : Spagnolo
DMD
6th February 2011, 20:16
It's easy (when you know)... The subtitles greater than 32 are stored as text subtitles in another m2ts. You can use BDEdit to find which is this m2ts: Go to the movie PlayList and in SubPath you'll get it....
With BDEdit I just found this:
http://img153.imageshack.us/img153/5739/screenshot001zs.png (http://img153.imageshack.us/i/screenshot001zs.png/)
setarip_old
6th February 2011, 23:57
@DMD/dom161
After much research I managed to find a discussion about my problem
You really mean the link I gave you at the MakeMKV forum, don't you? ;>}
http://www.makemkv.com/forum2/viewtopic.php?f=1&t=2561&p=10789#p10789
Re: Blu-ray to MKV: How to find subtitle ghost?
by setarip_old » Sat Feb 05, 2011 5:41 pm
@dom61
Sorry about that. Here 'tis:
http://forum.doom9.org/showthread.php?t=146501
DMD
7th February 2011, 06:25
@DMD/dom161 You really mean the link I gave you at the MakeMKV forum, don't you? ;>}
Yes!
and thank you for that. :)
I hope to extract the "ghost" stream; until now I have not understood how :confused::o
Thanks
setarip_old
7th February 2011, 07:21
Yes! and thank you for that.
You're quite welcome ;>}
chompy
7th February 2011, 09:20
With BDEdit I just found this:
http://img153.imageshack.us/img153/5739/screenshot001zs.png (http://img153.imageshack.us/i/screenshot001zs.png/)
In the SubPath of your PlayList you can see all the Text Subs you are looking for, and when you click on one of them, the window to the right shows the m2ts that contains it.
If you want sub number 66, look at SubPath 033 and run textST on the m2ts file that BDEdit shows as referenced by that Text Sub.
Greetings
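The numbers in this thread suggest a simple pattern: streams #1 to #32 are PGS, and text stream number n sits in SubPath n - 33 (66 -> 033). The MediaInfo IDs listed earlier fit the same pattern, with 0x100 per SubPath (#34 -> 0x100, #66 -> 0x2100). A minimal Python sketch of that arithmetic follows; the function names are made up for illustration, and the offsets are only read off this one disc, so other discs may differ.

```python
# Sketch only: the offsets below are read off the MediaInfo listing in this
# thread (32 PGS streams, text streams from #33 on); other discs may differ.
FIRST_TEXT_STREAM = 33  # assumption: streams #1-#32 are PGS on this disc


def subpath_index(stream_number: int) -> int:
    """Return the SubPath index for a text subtitle stream number."""
    if stream_number < FIRST_TEXT_STREAM:
        raise ValueError("streams below #33 are PGS, not text subtitles")
    return stream_number - FIRST_TEXT_STREAM


def mediainfo_id(stream_number: int) -> int:
    """Return the ID MediaInfo showed for a text stream (0x100 per SubPath)."""
    return subpath_index(stream_number) << 8


print(f"SubPath {subpath_index(66):03d}, ID 0x{mediainfo_id(66):X}")
```

For stream 66 this gives SubPath 033 and ID 0x2100, matching both chompy's advice and the MediaInfo entry for Testo #66.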
DMD
8th February 2011, 09:21
@ chompy
Thank you for the correct procedure.
I tried to simplify the steps as far as my knowledge allows,
by creating a compressed file that is to be extracted into the STREAM folder of the BD.
http://img96.imageshack.us/img96/7966/image084.png (http://img96.imageshack.us/i/image084.png/)
Right-click commandline.bat and choose Edit from the context menu. This opens the text file, in which you add the streams reported by BDEdit, each after the command textST.exe.
For example:
textST.exe 00003.m2ts
textST.exe 00014.m2ts
textST.exe 00034.m2ts
Then save the file.
Then double-click commandline.bat; after a few seconds of processing, the corresponding files appear in SRT format.
http://img248.imageshack.us/img248/6915/image086a.png (http://img248.imageshack.us/i/image086a.png/)
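The batch file above boils down to one textST.exe call per m2ts name. As a rough sketch of the same idea in Python (not an existing tool; the function names are hypothetical, and it assumes textST.exe sits in the STREAM folder and takes the m2ts name as its only argument, as in the example lines above):

```python
import subprocess
from pathlib import Path


def build_commands(stream_names, exe="textST.exe"):
    """Build one textST command line per m2ts name, as in commandline.bat."""
    commands = []
    for name in stream_names:
        # Accept "00034" as well as "00034.m2ts".
        if not name.lower().endswith(".m2ts"):
            name += ".m2ts"
        commands.append([exe, name])
    return commands


def run_all(stream_dir, stream_names, exe="textST.exe"):
    """Run textST once per stream inside the STREAM folder (Windows only)."""
    stream_dir = Path(stream_dir)
    for _, m2ts_name in build_commands(stream_names, exe):
        # Assumption: textST.exe was unpacked into the STREAM folder itself.
        subprocess.run([str(stream_dir / exe), m2ts_name],
                       cwd=stream_dir, check=True)


if __name__ == "__main__":
    for cmd in build_commands(["00003", "00014", "00034"]):
        print(" ".join(cmd))
```

Run as a script, it only prints the three command lines from the example; `run_all` would actually start textST.exe and is left for the reader to call with the real STREAM path.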
May I kindly ask whether a tool could be developed to simplify this procedure?
I described the complete procedure in this post:
http://forum.doom9.org/showpost.php?p=1476572&postcount=13
THANKS
HPotter
7th November 2014, 08:52
I have a problem decoding text subtitles with textST.
After decoding with textST:
1
00:10:59,338 --> 00:11:03,091
2
00:11:39,044 --> 00:11:42,381
3
00:11:48,303 --> 00:11:53,308
Hei.
4
00:11:53,308 --> 00:11:58,146
Mikset?
It should be:
1
00:01:10,530 --> 00:01:14,290
(Kuorsausta.)
2
00:01:50,240 --> 00:01:53,580
(Puhelin värisee.)
3
00:01:59,500 --> 00:02:04,500
Hei. <i>-Eve täällä, moi.
"Mä en pääse tänään tulemaan.</i>
4
00:02:04,500 --> 00:02:09,340
Mikset?
<i>-Se masentaa mua.</i>
The output has many extra spaces and blank lines, and the timestamps are wrong...
I attached the original 00034.m2ts and the .srt produced by textST.
The text subtitles are in UTF-8.
I need help with this :rolleyes:.
r0lZ
9th February 2015, 10:33
I wrote a small tool to process .pes files extracted with tsRemux or directly the original m2ts files with text subtitles...
[...]
textST (CLI) (http://multiavchd.deanbg.com/textST/) (630KB) self extractable 7z archive (contains calclib.dll and textST.exe), or if you're using multiAVCHD then you can download just the executable textST.exe (http://multiAVCHD.deanbg.com/textST.exe)(80kb) and put it in your multiAVCHD folder.
[...]
The "textST (CLI)" link is dead. (The link with textST.exe alone is still valid.) Can you upload the 7z archive again? Thanks!
DMD
9th February 2015, 12:51
Using the necessary files (calclib.dll, readSUP.exe and textST.exe), I created a very basic commandline.bat.
It extracts the text file from the m2ts stream: just enter the name of the stream you are interested in, without the extension.
http://i57.tinypic.com/2s9t4dl.jpg
http://i61.tinypic.com/2rz99np.jpg
http://i60.tinypic.com/2r5cn7k.jpg
If you are interested, the Text Tool stream .rar archive is here:
http://www.mediafire.com/download/zldn3typ74haphy/Tool+Text+streams+V.ENG.rar
The .rar file must be extracted into the STREAM folder of the Blu-ray disc; then launch commandline.bat.
greetings
r0lZ
9th February 2015, 13:04
Thanks.
Despite its name, it seems that the archive includes the Italian version of the shell script. Not a problem for me.
DMD
9th February 2015, 13:48
The latest version of MakeMKV (1.9.1) is already compatible with this type of subtitle.
So if you select by language, the text-format subtitles of the same language are selected automatically as well.
http://i57.tinypic.com/6dr0n6.jpg
MakeMKV already allows selection by language.
If there were a tool to extract the TextST subtitle directly, we would not have to look for it among the many files in the STREAM folder of the Blu-ray.
For this reason it would be better to improve the tool MKVExtractGUI2.
DVD
20th January 2016, 19:20
Hello everyone,
I stumbled over this post when I tried to convert my "The Mummy" Blu-Ray into an MKV file using MakeMKV. Of course the TextST files are not playable anywhere (VLC, ...) so I thought of converting them to SSA/ASS subtitles. Rather than just SRT, I thought it would be possible to add some sort of formatting or even screen position, which may also be present in the original source data. Since I could not find any source code available, I started some reverse engineering on the format. Here is what I have got so far:
http://www.simonheckmann.de/img/Analysis.png
I used mkvextract to extract the TextST streams from the MakeMKV output MKV file. I used the --fullraw flag on all streams, so maybe some of the header data in the sample is also due to this.
I also tried looking for any specification of the format in the Internet but could not find anything helpful.
Your thoughts/input is highly appreciated.
Kind regards,
Simon
DVD
21st January 2016, 11:35
Hello guys,
I did some further analysis, this time with "The Mummy Returns". As it turns out, some of the subtitles are at the top of the screen (to not overlap with hard subtitle text on the movie-screen) and some are at the bottom. I marked the differences in blue. I think they could impact the position in some way:
http://www.simonheckmann.de/img/Analysis_2.png
bigotti5
23rd January 2016, 00:33
Top and Bottom are defined in RegionStyles
in example above
Top is RegionStyleID 01
Bottom is RegionStyleID 00
PTS are 5 bytes each
and so on
I gathered some info about BD textsubstreams, should help
Dialog Style
  segment type                      08 = 0x81
  length of Dialog Styles           16
  Player Own Style                  08 = 0x00 prohibit
                                         0x80 permit
  Number of Region Styles           08
  Number of User Styles             08
Region Info
  region_style_id                   08
  region_horizontal_position        16
  region_vertical_position          16
  region_width                      16
  region height                     16
  region_bg_color_palette_id        08
  reserved                          08
Textbox Info
  text_box_horizontal_position      16
  text_box_vertical_position        16
  text_box_width                    16
  text_box_height                   16
  text_flow                         08
  text_horizontal_alignment         08 = 0x01 horizontal writing right
                                         0x02 horizontal writing left
                                         0x03 vertical writing
  text_vertical_alignment           08 = 0x01 left
                                         0x02 center
                                         0x03 right
  line_space                        08
  font_id                           08
  font_style                        08 = 0x00 normal
                                         0x01 bold
                                         0x02 italic
                                         0x03 bold+italic
                                         0x04 outline
                                         0x05 outline+bold
                                         0x06 outline+italic
                                         0x07 outline+bold+italic
  font_size                         08 = 0x08 to 0x90
  font_palette_entry                08
  outline_palette_entry             08
  outline_size                      08 = 0x01 thin
                                         0x02 medium
                                         0x03 thick
  .....next region_style_id......
Palette
  length                            16
  palette_entry_id                  08
  Y_value                           08
  Cr_value                          08
  Cb_value                          08
  T_value                           08
  ....
  ....
  (last palette_entries are always 254 -> FE)
Dialog Presentation Segment
  #_of_dialog_presentation_segments 16
  segment_type                      08 = 0x82
  reserved                          08
  segment_length                    08
  dialog_start_time                 40
  dialog_end_time                   40
  palette_update_flag               01   if set
  reserved                          07   then
  .............Palette..........
  numbers_of_regions                08
  continous_present_flag            01
  forced_flag                       01
  reserved                          06
  region_style_id                   08
Text Subtitle
  text_subtitle_length              16
  escape_code                       08 = always 0x1B
  data_type                         08 = 01 Text string start
                                         02 Change font set
                                         03 Change font style
                                         04 Change font size
                                         05 Change font color
                                         0A Line break
                                         0B End of inline style
  data_length                       08 = data_typ 01 -> length of text string
                                         data_typ 02 -> 0x01
                                         data_typ 03 -> 0x03
                                         data_typ 04 -> 0x01
                                         data_typ 05 -> 0x01
                                         data_typ 0A -> 0x00
                                         data_typ 0B -> 0x00
  ....
  ....
  ....
DVD
24th January 2016, 23:10
Hello,
Thank you very much. This really helped. I am able to parse the header successfully now. I will post my Java code here once my parser/converter is done. I will try to support SSA/ASS output as well as SRT.
There is one thing I noticed with your code: It seems to be off by one byte at the beginning. I have:
Identifier: 1 Byte (always 0x81) [position confirmed]
Section length: 2 Byte [position confirmed]
(unknown): 1 Byte
Player Own Style: 1 Byte [???]
# of Region Styles: 1 Byte [position confirmed]
# of User Styles: 1 Byte [???]
...
If I align it like this, it parses nicely (width, height, color all make sense). If I align it like in your code (skip the (unknown) Byte marked in red), all other information is rubbish (width = 65000, ...)
I will continue work and post updates here.
Thanks for your support.
:thanks:
Kind regards,
DVD
bigotti5
25th January 2016, 00:49
Yes, mistake in my records - should be
Dialog Style
  segment type                      08 = 0x81
  length of Dialog Styles           16
  Player Own Style                  01 = 0x00 prohibit
                                         0x80 permit
  reserved                          15
  Number of Region Styles           08
  Number of User Styles             08
  ...
  ...
thx
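For anyone writing their own parser, the corrected header layout above can be read with a few lines of Python. This is only a sketch under the assumptions of this thread: the field order and widths are bigotti5's corrected records, big-endian byte order is assumed, and the function name and sample bytes are invented for illustration.

```python
import struct


def parse_dialog_style_header(data: bytes) -> dict:
    """Parse the first 7 bytes of a Dialog Style segment (layout per this thread)."""
    if len(data) < 7:
        raise ValueError("need at least 7 bytes for the Dialog Style header")
    # >BHHBB: segment type (8 bits), length (16), Player Own Style flag plus
    # reserved bits (1 + 15), number of region styles (8), number of user
    # styles (8). Big-endian is an assumption.
    seg_type, length, flags, n_region, n_user = struct.unpack_from(">BHHBB", data, 0)
    if seg_type != 0x81:
        raise ValueError(f"not a Dialog Style segment: 0x{seg_type:02X}")
    return {
        "length": length,
        "player_own_style": bool(flags & 0x8000),  # top bit of the 16-bit field
        "region_styles": n_region,
        "user_styles": n_user,
    }


# Made-up sample bytes, not taken from a real disc.
sample = bytes([0x81, 0x01, 0x23, 0x80, 0x00, 0x02, 0x01])
print(parse_dialog_style_header(sample))
```

With this alignment the region style count is the sixth byte, which is the position DVD reported as confirmed.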
bigotti5
25th January 2016, 03:27
I did some corrections to my records
- user styles missing
- wrong text flow entries
Dialog Style
  segment type                      08 = 0x81 (Dialog Style)
  length of Dialog Styles           16
  Player Own Style                  01 = 0x00 prohibit
                                         0x80 permit
  reserved                          15
  Number of Region Styles           08
  Number of User Styles             08
Region Info
  region_style_id                   08
  region_horizontal_position        16
  region_vertical_position          16
  region_width                      16
  region height                     16
  region_bg_color_palette_id        08
  reserved                          08
Textbox Info
  text_box_horizontal_position      16
  text_box_vertical_position        16
  text_box_width                    16
  text_box_height                   16
  text_flow                         08 = 0x01 horizontal writing right
                                         0x02 horizontal writing left
                                         0x03 vertical writing
  text_horizontal_alignment         08 = 0x01 left
                                         0x02 center
                                         0x03 right
  text_vertical_alignment           08 = 0x01 top
                                         0x02 middle
                                         0x03 bottom
  line_space                        08
  font_id                           08
  font_style                        08 = 0x00 normal
                                         0x01 bold
                                         0x02 italic
                                         0x03 bold+italic
                                         0x04 outline
                                         0x05 outline+bold
                                         0x06 outline+italic
                                         0x07 outline+bold+italic
  font_size                         08
  font_palette_entry                08
  outline_palette_entry             08
  outline_size                      08 = 0x01 thin
                                         0x02 medium
                                         0x03 thick
User changeable style set if Number of User Styles != 0
  user_style_id                     08
  reg_horiz_pos_direction           01 = 0 right
                                         1 left
  reg_horiz_pos_delta               15
  reg_verti_pos_direction           01 = 0 down
                                         1 up
  reg_verti_pos_delta               15
  font_size_inc_dec                 01 = 0 increase
                                         1 decrease
  font_size_delta                   07
  txtbox_hor_pos_dir                01 = 0 right
                                         1 left
  txtbox_hor_pos_delta              15
  txtbox_vert_pos_dir               01 = 0 down
                                         1 up
  txtbox_vert_pos_delta             15
  txtbox_width_inc_dec              01 = 0 increase
                                         1 decrease
  txtbox_width_delta                15
  txtbox_height_inc_dec             01 = 0 increase
                                         1 decrease
  txtbox_height_delta               15
  line_space_inc_dec                01 = 0 increase
                                         1 decrease
  line_space_delta                  07
  .....next region_style_id......
Palette
  length                            16
  palette_entry_id                  08
  Y_value                           08
  Cr_value                          08
  Cb_value                          08
  T_value                           08
  ....
  ....
  (last palette_entries are always 254 -> FE)
Dialog Presentation Segment
  #_of_dialog_presentation_segments 16
  segment_type                      08 = 0x82
  reserved                          08
  segment_length                    08
  dialog_start_PTS                  40
  dialog_end_PTS                    40
  palette_update_flag               01   if set
  reserved                          07   then
  .............Palette..........
  numbers_of_regions                08
  continous_present_flag            01
  forced_flag                       01
  reserved                          06
  region_style_id                   08
Text Subtitle
  text_subtitle_length              16
  escape_code                       08 = always 0x1B
  data_type                         08 = 01 Text string start
                                         02 Change font set
                                         03 Change font style
                                         04 Change font size
                                         05 Change font color
                                         0A Line break
                                         0B End of inline style
  data_length                       08 = data_typ 01 -> length of text string
                                         data_typ 02 -> 0x01
                                         data_typ 03 -> 0x03
                                         data_typ 04 -> 0x01
                                         data_typ 05 -> 0x01
                                         data_typ 0A -> 0x00
                                         data_typ 0B -> 0x00
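To illustrate the Text Subtitle part of the records above: each item is an escape byte 0x1B, a data_type byte, a data_length byte and then data_length bytes of data. A minimal Python sketch that walks such a payload and keeps only the text and the line breaks might look like this. The function name is hypothetical, the payload is assumed to start right after the 16-bit text_subtitle_length, and UTF-8 is an assumption (it is what HPotter reported for his stream).

```python
def decode_region_text(payload: bytes, encoding: str = "utf-8") -> str:
    """Turn one region's escape-coded payload into plain text."""
    out = []
    pos = 0
    while pos < len(payload):
        if payload[pos] != 0x1B:
            raise ValueError(f"expected escape code 0x1B at offset {pos}")
        if pos + 3 > len(payload):
            raise ValueError("truncated escape sequence")
        data_type = payload[pos + 1]
        data_length = payload[pos + 2]
        data = payload[pos + 3 : pos + 3 + data_length]
        if len(data) != data_length:
            raise ValueError("truncated data block")
        if data_type == 0x01:        # text string
            out.append(data.decode(encoding))
        elif data_type == 0x0A:      # line break
            out.append("\n")
        # 0x02-0x05 and 0x0B only change inline styling and are skipped here.
        pos += 3 + data_length
    return "".join(out)


# Hand-built sample: text, line break, a font style change, more text.
sample = (b"\x1b\x01\x04Hei." + b"\x1b\x0a\x00"
          + b"\x1b\x03\x03\x02\x00\x00" + b"\x1b\x01\x07Mikset?")
print(decode_region_text(sample))
```

A converter that wants to keep italics or colors would handle data types 0x02 to 0x05 and 0x0B instead of skipping them.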
DVD
25th January 2016, 19:26
Hello everyone,
The first version of my parser/converter written in Java is complete. Currently it supports parsing of TextST streams based on the raw data and export to SRT files.
Here is how it works:
1.) Use MakeMKV to extract the video, audio and subtitle information into an MKV file.
2.) Use mkvextract (part of mkvtoolnix (https://mkvtoolnix.download/)) with the MKV file created in 1.) and the following commandline to demux the raw TextST data for all TextST stream IDs:
mkvextract tracks <MKV Source File> --fullraw <First ID>:<Filename ID 1>.<Extension> --fullraw <Second ID>:<Filename ID 2>.<Extension> ... --fullraw <Last ID>:<Filename ID n>.<Extension>
3.) Compile and use my parser (source code attached) with the following command line:
java -jar SubConverter.jar -d=<Source Folder> -e=<Extension> -o=SRT -v
This will parse/convert all files in the folder "<Source Folder>" with the file extension "<Extension>".
-o=SRT provide SRT files as output
-o=SSA/ASS will SOON provide SubStation Alpha files as output (work in progress)
-v will output the parsed data on the console
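With many TextST tracks, the mkvextract call from step 2 gets long. A small Python sketch that assembles the argument list in the form shown above follows; the helper name and the "<ID>.<extension>" file naming are assumptions for illustration and not part of SubConverter.

```python
def build_mkvextract_args(mkv_file, track_ids, extension="textST"):
    """Build the mkvextract argument list with one --fullraw spec per track."""
    args = ["mkvextract", "tracks", mkv_file]
    for track_id in track_ids:
        # Each track is written to "<ID>.<extension>", e.g. "12.textST".
        args += ["--fullraw", f"{track_id}:{track_id}.{extension}"]
    return args


print(" ".join(build_mkvextract_args("movie.mkv", [12, 13])))
```

The resulting list can be passed to `subprocess.run` as-is, which avoids quoting problems with file names that contain spaces.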
Please note that the code does not come with any form of warranty! I tried this with the German and English TextST subtitles on the Mummy and the Mummy Returns and it worked quite well so far.
Feedback highly appreciated!
Special thanks to bigotti5 for providing input on the data format!
bigotti5
25th January 2016, 19:34
Attachments must be approved by an admin - this can take a while
upload your files to an external hoster and provide the link
DVD
25th January 2016, 19:47
I will see what I can do about the attachment.
BTW: There still seem to be some glitches in your format specification. I worked around them based on my findings. Not sure if it works for all files eventually but for mine it did ...
DVD
25th January 2016, 19:50
Oh, and something else comes to mind: do you have an idea what "Font ID" is? What font is it referring to?
bigotti5
25th January 2016, 20:21
refers to BDMV/AUXDATA font files
ID 0 = 00000.otf
ID 1 = 00001.otf
...
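In code, the mapping above is just a zero-padded file name. A tiny Python sketch, assuming the five-digit pattern of the two examples continues for higher IDs (the function name is made up):

```python
from pathlib import PurePosixPath


def font_path(font_id: int, bdmv_root: str = "BDMV") -> str:
    """Return the AUXDATA font file that a TextST font_id refers to."""
    # Assumption: IDs map to five-digit, zero-padded .otf names.
    return str(PurePosixPath(bdmv_root) / "AUXDATA" / f"{font_id:05d}.otf")


print(font_path(0))
print(font_path(1))
```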
bigotti5
26th January 2016, 13:13
There still seem to be some glitches in your format specification
Can you be more precise?
DVD
26th January 2016, 13:51
Sure (as far as I can tell):
#_of_dialog_presentation_segments 16
I doubt that this exists. In my code I read from the palette info directly to "segment_type".
.....next region_style_id......
In your code this is AFTER "User Styles". However I would assume that first we have the "region styles" [1-n] and then the "user styles" [1-n]. Just an assumption.
....
....
(last palette_entries are always 254 -> FE)
Not sure what ... ... is. In my parser I read entry id plus four values per color
length 16
palette_entry_id 08
Y_value 08
Cr_value 08
Cb_value 08
T_value 08
.... next palette entry ...
bigotti5
26th January 2016, 19:04
Last palette
Of course each entry is 5 bytes; my comment was meant to say that palette entry 254 is always present as the last entry.
If there are only, e.g., 3 colors, the IDs are 01, 02 and 254.
However I would assume that first we have the "region styles" [1-n] and then the "user styles" [1-n]. Just an assumption.
Definitely not.
In my code I read from the palette info directly to "segment_type".
In my test encodes from Scenarist, the number of DPS always comes before the first 0x82.
Here is a sample log from a commercial TextST parser (DVDLogic BDReauthor Pro):
...
... Textbox height delta: 0
Line space inc dec: 0 - Increase
Line space delta: 0
Palette
Length: 20
Entry ID: 0, Y: 32, Cr: 118, Cb: 240, T: 0
Entry ID: 1, Y: 235, Cr: 128, Cb: 128, T: 255
Entry ID: 2, Y: 16, Cr: 128, Cb: 128, T: 255
Entry ID: 254, Y: 16, Cr: 128, Cb: 128, T: 0
Number of Dialog Presentation Segments: 2
Dialog Presentation Segment 1
Segment descriptor
Segment type: 0x82 - Dialog Presentation Segment
Segment length: 63
Dialog start PTS: 54450000 - 00:10:05.000
Dialog end PTS: 54720000 - 00:10:08.000
Palette update flag: 0
Number of Regions: 1
Dialog region 1
Continuous present flag: 0
Forced on flag: 0
Region style id ref: 0
Subtitle data length: 0
...
...
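The palette in the log above (length 20, four entries) matches the 5-bytes-per-entry layout discussed here. A minimal Python sketch of reading it, assuming a big-endian 16-bit length followed by entries in the order ID, Y, Cr, Cb, T; the names are invented for illustration:

```python
import struct
from typing import NamedTuple


class PaletteEntry(NamedTuple):
    entry_id: int
    y: int
    cr: int
    cb: int
    t: int


def parse_palette(data: bytes, offset: int = 0):
    """Read a 16-bit length followed by 5-byte palette entries.

    Returns the entries and the offset of the first byte after the palette.
    """
    (length,) = struct.unpack_from(">H", data, offset)
    if length % 5:
        raise ValueError("palette length is not a multiple of 5 bytes")
    start = offset + 2
    if start + length > len(data):
        raise ValueError("palette runs past the end of the data")
    entries = [PaletteEntry(*data[i : i + 5])
               for i in range(start, start + length, 5)]
    return entries, start + length


# The four entries from the BDReauthor log, re-encoded by hand.
sample = struct.pack(">H", 20) + bytes([
    0, 32, 118, 240, 0,
    1, 235, 128, 128, 255,
    2, 16, 128, 128, 255,
    254, 16, 128, 128, 0,
])
entries, end = parse_palette(sample)
for entry in entries:
    print(entry)
```

Returning the end offset lets a parser continue with whatever follows the palette in the stream.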
DVD
26th January 2016, 21:22
Okay, I will update the User Style handling.
As for the other items: in the files I have, neither the color with index 254 nor the number of dialog presentation segments is present.
bigotti5
27th January 2016, 03:14
After a test encode with Blu-print
PaletteID 254 is a quirk in Scenarist
Number of dialog presentation segments is removed from elementary stream by mkvmerge
(from The Mummy 00012.m2ts eng textST)
http://members.aon.at/video.digital/dif.png
DVD
27th January 2016, 10:18
Ah, perfect. This explains it. Still odd that mkvextract removes this.
Anyways, another question: I did some testing yesterday and found that the timestamps I decoded from the stream do not match the movie. Do you have any guidance on how the "dialog_start_time" needs to be interpreted? Currently, I divide the 5-byte number by 90 to get the number of milliseconds. Unfortunately, this does not seem to work. Is the factor wrong, or is there some sort of delay?
bigotti5
27th January 2016, 11:50
You have to remove muxing delay from PTS
In corresponding mpls you can find "in_time" - this is the muxing delay
Look for the first "4D 32 54 53" (M2TS) and skip the next 3 bytes; the following 4 bytes are "in_time". (It should always be at byte position 0x52.)
Divide by 45 and you have the value in ms.
In case of The Mummy e.g.
11.650
http://members.aon.at/video.digital/mpls.png
Common delays are
00:10:00.000 - Scenarist
00:00:11.650 - Blu-Print
00:00:02.000 - Panasonic
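Put together, the timing recipe is: dialog PTS divided by 90 gives milliseconds, in_time divided by 45 gives the muxing delay in milliseconds, and the delay is subtracted. A minimal Python sketch of these steps; the function names are hypothetical, big-endian byte order is assumed, and only the first PlayItem of the playlist is looked at.

```python
import struct

PTS_TICKS_PER_MS = 90      # dialog PTS values are in 90 kHz units
IN_TIME_TICKS_PER_MS = 45  # MPLS in_time is in 45 kHz units


def read_in_time_ms(mpls: bytes) -> int:
    """Return the first PlayItem's in_time in ms, found as described above."""
    marker = mpls.find(b"M2TS")
    if marker < 0:
        raise ValueError("no 'M2TS' marker found")
    start = marker + 4 + 3  # skip the marker and the 3 bytes after it
    if start + 4 > len(mpls):
        raise ValueError("file too short to contain in_time")
    (in_time,) = struct.unpack_from(">I", mpls, start)
    return in_time // IN_TIME_TICKS_PER_MS


def pts_to_ms(pts: int, delay_ms: int = 0) -> int:
    """Convert a dialog PTS to milliseconds and remove the muxing delay."""
    return pts // PTS_TICKS_PER_MS - delay_ms


def srt_timestamp(ms: int) -> str:
    """Format non-negative milliseconds as an SRT timestamp."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


# PTS 54450000 from the log earlier in the thread is 00:10:05.000; with a
# 10-minute delay removed (assumed to apply to that encode) it lands at 5 s.
print(srt_timestamp(pts_to_ms(54450000, delay_ms=600_000)))  # 00:00:05,000
```

For The Mummy, `read_in_time_ms` on the matching .mpls should report the 11650 ms delay mentioned above, which is then passed as `delay_ms`.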
DVD
31st January 2016, 02:55
@ bigotti5: Thanks, this helps!
I added some code to parse all MPLS files from a Disc and list the in_time delays so they can be used to correct the offsets. I am almost done with my code. I need to add some small features/fixes, but for the most part it seems to work now. The ASS/SSA files it produces work quite nicely and they even use the correct formatting/position from the TextST subtitles now (taken from The Mummy Returns):
Original BD Image - Example 1 (http://www.simonheckmann.de/img/ZeilenOben.jpg)
Original BD Image - Example 2 (http://www.simonheckmann.de/img/ZeileUnten.jpg)
Original BD Image - Example 3 (http://www.simonheckmann.de/img/ZeilenUnten.jpg)
ASS representation in the MKV file - Example 1 (http://www.simonheckmann.de/img/Lines_Top.png)
ASS representation in the MKV file - Example 2 (http://www.simonheckmann.de/img/Line_Bottom.png)
ASS representation in the MKV file - Example 3 (http://www.simonheckmann.de/img/Lines_Bottom.png)
DVD
1st February 2016, 18:24
Hello,
I am now done with implementing a more complete version which also includes export to Advanced SubStation Alpha Subtitles (ASS) as well as simple SubRip Text (SRT) support. Although the feature set is far from complete, it worked quite well for my "The Mummy" and "The Mummy Returns" Blu-Rays. Timing and formatting seem fine, and on "The Mummy Returns" it even positions the subtitle at the top or bottom of the screen depending on its assigned screen position in the original TextST file. (For some screenshots of the result, see the previous post.)
I thought I would share my code and some instructions on how to use it, so that somebody willing to use or even extend it can give it a shot and does not need to start from scratch.
Here is how it works:
Use MakeMKV to extract the video, audio and subtitle information into an MKV file.
Use mkvextract (part of mkvtoolnix (https://mkvtoolnix.download/)) with the MKV file created in the first step and the following commandline to demux the raw TextST data for all TextST stream IDs:
mkvextract tracks <MKV Source File> --fullraw <First ID>:<Filename ID 1>.<Extension> --fullraw <Second ID>:<Filename ID 2>.<Extension> ... --fullraw <Last ID>:<Filename ID n>.<Extension>
Now the MPLS files (playlists) need to be analyzed to detect the muxing delay for the subtitle streams. The following command line will do so:
java -jar SubConverter.jar -mode=MPLS -input="/Users/DVD/Movie/MPLS" -verbose
Result:
Blu-Ray TextST Parser/Converter - Version 0.50 alpha - by DVD
Using directory '/Users/DVD/Movie/MPLS' with extension 'mpls'
1 files files found!
Start processing file '/Users/DVD/Movie/MPLS/00000.mpls'
Num bytes read: 1864
Start Time: 00:00:11.650
End Time: 02:09:43.414
Duration: 02:09:31.764
Delay: -11650
Now this delay can be used as input for the actual conversion, e.g. SRT:
java -jar SubConverter.jar -mode=TEXTST -input="/Users/DVD/Movie/" -extension=TextST -output=SRT -delay=-11650 [-verbose]
Along with the following output the SRT files will be generated in the source folder.
Blu-Ray TextST Parser/Converter - Version 0.50 alpha - by DVD
Using directory '/Users/DVD/Movie/' with extension 'TextST'
9 file(s) files found!
Start processing file '/Users/DVD/Movie/12.textST'
Num bytes read: 138311
Parsing successfull for '/Users/DVD/Movie/12.textST'
Exporting to file '/Users/DVD/Movie/12.srt' using SRT format.
Export complete!
[...]
If you want to use ASS output, the following command line should be used:
java -jar SubConverter.jar -mode=TEXTST -input="/Users/DVD/Movie" -extension=TextST -output=SSA/ASS -delay=-11650 -fonts="/Users/DVD/Movie/Fonts" -width=1920 -height=1080 [-verbose]
This will also output all subtitles but as formatted ASS files instead of SRT.
Blu-Ray TextST Parser/Converter - Version 0.50 alpha - by DVD
Using directory '/Users/DVD/Movie' with extension 'TextST'
9 file(s) files found!
Start processing file '/Users/DVD/Movie/12.textST'
Num bytes read: 138311
Parsing successfull for '/Users/DVD/Movie/12.textST'
Checking for font directory ...
Using directory '/Users/DVD/Movie/Fonts' with extension 'otf'
1 file(s) files found!
Exporting to file '/Users/DVD/Movie/12.ass' using SSA/ASS format.
Export complete!
[...]
The additional parameters (as opposed to SRT output) work as follows:
-fonts Identifies the folder that contains the Blu-Ray font files (BDMV/AUXDATA/*.otf). The fonts will not be included into the ASS file, but the tool will read the font name and put it into the result file. You can manually install them on your device or merge them into the final MKV file.
-width x-resolution of the video material used in the ASS file for creating resizing and positioning information.
-height y-resolution of the video material used in the ASS file for creating resizing and positioning information.
The final ASS files are viewable in any text editor and can be modified to your own liking, e.g. font, size, color, etc. For example, I increased the font size a little, from 54 to 65.
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: Region_Style_0,Arial,65,&H00EBEBEB,&H00EBEBEB,&H3F101010,&HFF000000,0,0,0,0,100,100,0,0,1,2,0,2,1,1,142,0
Style: Region_Style_1,Arial,65,&H00EBEBEB,&H00EBEBEB,&H3F101010,&HFF000000,0,0,0,0,100,100,0,0,1,2,0,8,1,1,141,0
In the sample above, each style entry is derived from a region style in the source material: in this case, a text box at the top of the screen and a text box at the bottom.
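The two Style lines differ mainly in the Alignment field (2 = bottom center, 8 = top center in ASS numpad notation). The thread does not say how SubConverter derives this value, so the following Python sketch is only one possible mapping: it uses the TextST alignment codes listed by bigotti5 earlier, and the function name is made up.

```python
# TextST alignment codes as listed by bigotti5 earlier in the thread.
HORIZONTAL = {0x01: "left", 0x02: "center", 0x03: "right"}
VERTICAL = {0x01: "top", 0x02: "middle", 0x03: "bottom"}


def ass_alignment(horizontal: int, vertical: int) -> int:
    """Map TextST alignment codes to an ASS V4+ numpad Alignment value."""
    if horizontal not in HORIZONTAL or vertical not in VERTICAL:
        raise ValueError("unknown alignment code")
    # Numpad layout: 7-9 is the top row, 4-6 the middle, 1-3 the bottom.
    row_base = {0x01: 6, 0x02: 3, 0x03: 0}[vertical]
    return row_base + horizontal


print(ass_alignment(0x02, 0x03))  # bottom center
print(ass_alignment(0x02, 0x01))  # top center
```

A converter could just as well decide between top and bottom from the region's vertical position; which approach fits better depends on how the disc was authored.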
Note: The -verbose flag prints a lot of additional information on the console. This can be used for debugging and analysis.
Please note that the code does not come with any form of warranty!
Again, special thanks to bigotti5 for providing his valuable input!
bigotti5
1st February 2016, 23:34
Attachment approval takes days, so I can't test your program now.
I made a small BD to test.
Contains 3 StyleRegions and many inline styles.
This rar file (https://www.dropbox.com/s/tcmr9ulajnrqjab/TextST.rar?dl=0) contains BD folder and a tes file (number of dialog presentations removed).
StainlessS
1st February 2016, 23:41
Not so long ago, a moderator suggested that fastest way to get an attachment approved was to 'report' your own post.
DVD
2nd February 2016, 00:10
Attachment approval takes days, so I can't test your program now.
I made a small BD to test.
Contains 3 StyleRegions and many inline styles.
This rar file (https://www.dropbox.com/s/tcmr9ulajnrqjab/TextST.rar?dl=0) contains BD folder and a tes file (number of dialog presentations removed).
I must admit that I have not implemented any inline style conversion yet, although I think it can be added. I had only the mummy movies to test with and they do not use inline styles. As soon as I have another day to spare I will look into your file ... Of course you can also take my code and make additions to it!