Old 3rd September 2007, 13:41   #201  |  Link
Taktaal
Registered User
 
Join Date: May 2003
Posts: 114
I'm stupid

The program expects a folder "SupRip" to exist in %appdata%; if it doesn't, the program crashes. I fixed the bug in the new version, 0.61.
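
A minimal sketch of this kind of fix, assuming the settings folder is simply created when it is missing. The helper name `ensure_settings_dir` and its `base_dir` parameter are hypothetical; SupRip's actual code is not shown in this thread.

```python
import os

def ensure_settings_dir(base_dir=None, name="SupRip"):
    """Return the settings folder path, creating the folder if it is missing."""
    # Assumption: use %APPDATA% when set, otherwise fall back to the home directory.
    if base_dir is None:
        base_dir = os.environ.get("APPDATA", os.path.expanduser("~"))
    path = os.path.join(base_dir, name)
    # exist_ok=True makes this a no-op when the folder already exists.
    os.makedirs(path, exist_ok=True)
    return path
```

Calling it once at startup, before any settings file is read or written, avoids the crash on a fresh install.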

Last edited by Taktaal; 3rd September 2007 at 18:29.
Old 7th September 2007, 21:53   #202  |  Link
l3pyr
Registered User
 
Join Date: Oct 2005
Posts: 11
Quote:
Originally Posted by Taktaal View Post
I'm stupid

The program expects a folder "SupRip" to exist in %appdata%; if it doesn't, the program crashes. I fixed the bug in the new version, 0.61.
Great work with Suprip, but there are a few bugs I've come across.

The first thing I noticed is that it didn't like line 4 in the following scn-sst file

Code:
St_Format 4
Display_Area  (0 2 1919 1079)
TV_Type NTSC
SubTitle GitS - English.sup
##########################
SP_NUMBER START END FILE_NAME
It said it should only be two words, so I changed it to

Code:
St_Format 4
Display_Area  (0 2 1919 1079)
TV_Type NTSC
SubTitle GitS-English.sup
##########################
SP_NUMBER START END FILE_NAME
and everything worked fine. You should probably fix it to support spaces in filenames.
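
One way to tolerate spaces in the file name is to split each header line only at the first run of whitespace, so everything after the keyword stays together. A minimal sketch, assuming header lines are `keyword value` pairs as in the sample above; `parse_header_line` is a hypothetical helper, not SupRip's parser.

```python
def parse_header_line(line):
    """Split a header line into (keyword, value) at the first whitespace run."""
    # maxsplit=1 keeps any later spaces inside the value.
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        raise ValueError("expected 'keyword value', got: %r" % line)
    keyword, value = parts
    return keyword, value
```

With this, `SubTitle GitS - English.sup` yields the keyword `SubTitle` and the value `GitS - English.sup`, and `Display_Area  (0 2 1919 1079)` keeps its parenthesised value intact.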

Next, there is one subtitle that causes it to crash for me. It's my 501st line, but even if I put that one subtitle in its own scn-sst file it still causes a crash. My workaround right now is simply removing that line and then adding that subtitle into the finished SRT file after all is said and done. Strange problem, no?

Update: the 644th line also screwed it up. Here are both lines. Need me to post the PNGs?

Code:
501 01:00:01:04 01:00:02:09 line0501.png
644 01:10:26:11 01:10:29:20 line0644.png
Again, great work on this program and I (along with everyone else I'm sure) look forward to its development.

Last edited by l3pyr; 7th September 2007 at 22:39. Reason: Updated info
Old 8th September 2007, 14:22   #203  |  Link
Taktaal
Registered User
 
Join Date: May 2003
Posts: 114
Probably there's a problem with the PNGs, can you post them? I'll fix the line 4 problem.

Also I posted version 0.70 that can directly open SUP files
http://x0r.ch/suprip/

Last edited by Taktaal; 9th September 2007 at 21:48.
Old 10th September 2007, 15:29   #204  |  Link
hristoff2
Registered User
 
Join Date: Jun 2007
Posts: 39
Quote:
Originally Posted by Taktaal View Post
Probably there's a problem with the PNGs, can you post them? I'll fix the line 4 problem.

Also I posted version 0.70 that can directly open SUP files
http://x0r.ch/suprip/

Thanks! I'll check it out on Underworld's German subs (HDDVD).
Old 11th September 2007, 15:28   #205  |  Link
killa_kid
Registered User
 
Join Date: Aug 2007
Posts: 39
The OCR is working great now. I was wondering if it would be possible to add an option (checkbox, dropdown, whatever works best) for bold, italic & underlined. I like my subtitles to look the same as the images...but a lot smaller in size :P
Old 11th September 2007, 18:31   #206  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 430
Taktaal, does SupRip support Blu-ray .sup files (as created by the new xport)? The format is very different from HD-DVD.

Last edited by Rectal Prolapse; 11th September 2007 at 18:34.
Old 11th September 2007, 19:58   #207  |  Link
aqualung99
Registered User
 
Join Date: Jun 2007
Posts: 17
I definitely have better luck with Taktaal's SupRip. By the time I'm 25-50% of the way through the movie it's "learned" pretty much all the characters and doesn't require any more user interaction (even with italics mixed-in.)
Unfortunately, I do still have the capital-I vs. lowercase-l problem; a dictionary would most certainly help with that (and frankly, it still does better in this department than SUPread did for me). Also, it can't handle single-word italics (it treats the whole line as italic or not based on the first word). But that's pretty minor.
All in all, a (say) 3-hour task is still reduced to about 1 and a half. And that's a godsend!
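
A minimal sketch of how a dictionary could settle the capital-I vs. lowercase-l ambiguity: try every I/l spelling of a word and keep the first one the word list knows. The function `fix_i_l` and the word set are hypothetical, and the sketch ignores punctuation attached to the word.

```python
from itertools import product

AMBIGUOUS = "Il"  # capital I and lowercase l look alike in many subtitle fonts

def fix_i_l(word, dictionary, max_ambiguous=8):
    """Return an I/l spelling of word whose lowercase form is in dictionary.

    dictionary is a set of lowercase words. The word is returned unchanged
    when it is already known or when no spelling matches.
    """
    if word.lower() in dictionary:
        return word
    positions = [i for i, ch in enumerate(word) if ch in AMBIGUOUS]
    if not positions or len(positions) > max_ambiguous:
        return word  # nothing to try, or too many combinations to enumerate
    chars = list(word)
    for combo in product(AMBIGUOUS, repeat=len(positions)):
        for pos, ch in zip(positions, combo):
            chars[pos] = ch
        candidate = "".join(chars)
        if candidate.lower() in dictionary:
            return candidate
    return word
```

For example, with a word list containing "tell", "like" and "i", the OCR results "teIl", "Iike" and a lone "l" become "tell", "like" and "I". Words that are valid in both spellings would still need context to resolve.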



Edit: Oops, I found another bug. In "Letters from Iwo Jima" EVODemux doesn't compensate for a really high, uhh, PTS I think it's called? Anyway, the timecodes stored in the subtitle are all really big (the very first one is after more than 5 hours). SupRip chokes about 2/3 of the way into the subtitles and starts to negate all the time values! Observe:

Code:
615
06:37:23,507 --> 06:37:25,714
fight for your fallen brethren...

616
06:37:27,545 --> 06:37:29,251
until the end.

617
-06:-37:-30,-089 --> -06:-37:-28,-610
Get up!

618
-06:-37:-24,-350 --> -06:-37:-22,-280
Major General Hayashi
is leading an attack.

Last edited by aqualung99; 11th September 2007 at 22:24.
Old 12th September 2007, 10:29   #208  |  Link
Taktaal
Registered User
 
Join Date: May 2003
Posts: 114
6 hours 37 minutes is 2.15 billion ticks so yeah that's my error, I'll fix it and add a PTS correction option. Can you post the EVODemux text that appears after decoding the main evobs? I'm especially interested in the PTS values there.
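
A minimal sketch of that overflow, assuming the timestamps are 90 kHz ticks held in a signed 32-bit integer. At that rate 2^31 ticks is about 6 h 37 min 41 s, which is where the SRT above turns negative. The helper names and the sample tick count are hypothetical, not SupRip's code.

```python
TICKS_PER_SECOND = 90_000  # assumption: 90 kHz PTS clock

def to_int32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def to_uint32(value):
    """Reinterpret value as an unsigned 32-bit integer."""
    return value & 0xFFFFFFFF

def format_srt(ticks):
    """Format a non-negative tick count as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = ticks * 1000 // TICKS_PER_SECOND
    seconds, ms = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, seconds, ms)

raw = 2_148_459_286             # hypothetical tick count just past 2**31
wrapped = to_int32(raw)         # negative: what a signed 32-bit variable holds
recovered = to_uint32(wrapped)  # reading the same bits as unsigned restores raw
```

Storing the tick count in an unsigned or 64-bit type, or subtracting the stream's start PTS before converting, keeps the value out of the negative range.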

As for underlined and bold, I haven't yet seen any HD-DVD subtitles that use those styles.

And no, I don't support BluRay subtitles. I'll have to see whether I can find some format documentation for it somewhere. Maybe we'll luck out and Sony will soon give up its newest format experiment, though.
Old 12th September 2007, 11:47   #209  |  Link
aqualung99
Registered User
 
Join Date: Jun 2007
Posts: 17
Quote:
Originally Posted by Taktaal View Post
6 hours 37 minutes is 2.15 billion ticks so yeah that's my error, I'll fix it and add a PTS correction option. Can you post the EVODemux text that appears after decoding the main evobs? I'm especially interested in the PTS values there.
Cool!

Here's the EVODemux dump:

Code:
Opening file feature_IWOJIMADOMNVC1_HD.EVO
Reading...
File size: 11268 Mbytes.
VOB number 0 contains 1 video , 2 audio and 4 subpicture streams.
PTM of first video frame = 6736B4B1
PTM of last video frame = 80002801
Duration = 1:17:00.616
VC-1 video stream 0 found!
   First PTS = 2736B4B1  (+35791394ms)
   Substream id = 55
   Profile = Advanced
   Level = 3
   Chroma Format = 4:2:0
   Size = 1920x1080
   Display size = 1920x1080
   Aspect ratio = 1:1 (square samples)
   Frame Rate = 23.976 (24000/1001)
Dolby TrueHD audio stream 1 found!
   First PTS = 2736B4B1  (+35791394ms)
   Substream id = B1
   TrueHD stream (up to 8 channels)
   Sampling frequency = 48 kHz
   2 ch. decoder
      channel modifier = Lt/Rt
   6 ch. decoder
      channel modifier = Lt/Rt
      channel arrangement = Main (Left,Right), Centre, LFE, Surrounds (Ls/Rs)
   8 ch. decoder
      channel modifier = Lt/Rt
      channel arrangement = Main (Left,Right), Centre, LFE, Surrounds (Ls/Rs)
   Dynamic range control = -10.6371 dB .. 8.4282 dB
Dolby Digital Plus audio stream 0 found!
   First PTS = 6736B4B1
   Substream id = C0
   Stream 0 is Dolby Digital Plus
   frame size = 1280 bytes, number of blocks per frame = 3
   Sampling frequency = 48 kHz
   Transmission bitrate = 640 kbit/s
   Channel arrangement = L + R + C + Ls + Rs, bsid = 16
   LFE channel = present
Subpicture stream 3 found!
   First PTS = 676F7ABE  (+41341ms)
   Substream id = 23
   Format = 8bitRLE
Subpicture stream 2 found!
   First PTS = 676F7ABE  (+41341ms)
   Substream id = 22
   Format = 8bitRLE
Subpicture stream 1 found!
   First PTS = 67E033B2  (+123423ms)
   Substream id = 21
   Format = 8bitRLE
Subpicture stream 0 found!
   First PTS = 67E9D318  (+130430ms)
   Substream id = 20
   Format = 8bitRLE
.
Opening file feature_IWOJIMADOMNVC1_HD_Divide.EVO
Reading...
File size: 8588 Mbytes.
VOB number 0 contains 1 video , 2 audio and 4 subpicture streams.
PTM of first video frame = 80002801
PTM of last video frame = 9479CC0E
Duration = 1:03:36.846
VC-1 video stream 0 found!
   First PTS = 00002801  (+28481545ms)
   Substream id = 55
   Profile = Advanced
   Level = 3
   Chroma Format = 4:2:0
   Size = 1920x1080
   Display size = 1920x1080
   Aspect ratio = 1:1 (square samples)
   Frame Rate = 23.976 (24000/1001)
Dolby Digital Plus audio stream 0 found!
   First PTS = 80000EB1  (+4620544ms)
   Substream id = C0
   Stream 0 is Dolby Digital Plus
   frame size = 1280 bytes, number of blocks per frame = 3
   Sampling frequency = 48 kHz
   Transmission bitrate = 640 kbit/s
   Channel arrangement = L + R + C + Ls + Rs, bsid = 16
   LFE channel = present
Dolby TrueHD audio stream 1 found!
   First PTS = 00001DB1  (+28481516ms)
   Substream id = B1
   TrueHD stream (up to 8 channels)
   Sampling frequency = 48 kHz
   2 ch. decoder
      channel modifier = Lt/Rt
   6 ch. decoder
      channel modifier = Lt/Rt
      channel arrangement = Main (Left,Right), Centre, LFE, Surrounds (Ls/Rs)
   8 ch. decoder
      channel modifier = Lt/Rt
      channel arrangement = Main (Left,Right), Centre, LFE, Surrounds (Ls/Rs)
   Dynamic range control = -10.6371 dB .. 8.4282 dB
Subpicture stream 3 found!
   First PTS = 800F00AD  (+4631426ms)
   Substream id = 23
   Format = 8bitRLE
Subpicture stream 1 found!
   First PTS = 800F00AD  (+4631426ms)
   Substream id = 21
   Format = 8bitRLE
Subpicture stream 0 found!
   First PTS = 800F00AD  (+4631426ms)
   Substream id = 20
   Format = 8bitRLE
Subpicture stream 2 found!
   First PTS = 800F3B54  (+4631593ms)
   Substream id = 22
   Format = 8bitRLE
Done.
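
The millisecond offsets in this dump can be reproduced by subtracting the first file's "PTM of first video frame" from each PTS modulo 2^32 and converting at 90 kHz. A minimal sketch of such a correction, assuming that is the intended reference point; `pts_offset_ms` is a hypothetical helper, not EVODemux or SupRip code.

```python
TICKS_PER_SECOND = 90_000  # assumption: 90 kHz PTS clock

def pts_offset_ms(pts_hex, first_video_ptm_hex):
    """Milliseconds from the first video frame to pts, with 32-bit wraparound."""
    # Masking to 32 bits handles a PTS that wrapped past 0xFFFFFFFF.
    delta = (int(pts_hex, 16) - int(first_video_ptm_hex, 16)) & 0xFFFFFFFF
    return delta * 1000 // TICKS_PER_SECOND
```

For instance, `pts_offset_ms("67E9D318", "6736B4B1")` gives 130430, matching the `+130430ms` shown for subpicture stream 0 in the first file.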
Old 24th September 2007, 00:22   #210  |  Link
Taktaal
Registered User
 
Join Date: May 2003
Posts: 114
OK, I spent a good week trying to get reliable single-word italic detection going, but couldn't do it without getting an unacceptable number of false positives.

Anyway, I uploaded a new version which fixes a few other bugs, but single italic words not getting detected is probably going to stay unless I get an idea I haven't thought of yet.

Also, I started adding Bluray support, but that's going slow since I don't have a Bluray drive or movies to test my code on.

http://x0r.ch/suprip/
Old 25th September 2007, 17:07   #211  |  Link
hristoff2
Registered User
 
Join Date: Jun 2007
Posts: 39
Quote:
Originally Posted by Taktaal View Post
OK, I spent a good week trying to get reliable single-word italic detection going, but couldn't do it without getting an unacceptable number of false positives.

Anyway, I uploaded a new version which fixes a few other bugs, but single italic words not getting detected is probably going to stay unless I get an idea I haven't thought of yet.

Also, I started adding Bluray support, but that's going slow since I don't have a Bluray drive or movies to test my code on.

http://x0r.ch/suprip/
Some feedback:

- There seems to be an OCR error when a lowercase 'l' follows another lowercase 'l' (as in "don't tell..."): you get something like "don't tel" followed by 'I' or 'l' (a capital 'i', I suppose). See the .sup below; it's in one of the first 10 lines.
- Crashes at line #27 for me without any further information (WinXP Pro SP2).

Here's the sup file if you wanna check it: http://rapidshare.com/files/58185467...cture.sup.html

thanks for your work
Old 26th September 2007, 04:04   #212  |  Link
A_C_ONE
Registered User
 
Join Date: Sep 2007
Posts: 18
A pair: IDX \ SUB format

I need a great deal of help from an expert. I am a developer and created a small program to convert SUB to SRT subtitles and perform a lot of operations on SUB (text) subtitles. Now my goal is to "teach" my applet to recognize the BMPs in a VobSub SUB file, the kind that comes paired with an IDX file.
The IDX is a pretty straightforward text file; reading it is no problem.
The format of the SUB file, which contains the subtitle bitmaps, is the problem. I tried to find information on the internet, to no avail: there is no description or specification of the SUB format, and it remains a black box for me. I'm not talking about a general description of the SUB format or what it is for, but an exact description of the format itself.
I am not talking about an OCR program; I haven't gone that far. For a start I would like to "chop" the SUB according to the position data in the IDX file,
and then display those chops in a BMP control, such as an image control, or save them (probably after decoding) as BMP files.
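
A minimal sketch of the "chopping" step, assuming IDX entries look like `timestamp: 00:00:01:101, filepos: 000001000` with `filepos` in hexadecimal and a single subtitle stream in the file. The function names are hypothetical. Each chunk is still raw packet data, so it would need demuxing and run-length decoding before it can be shown as a bitmap.

```python
import re

# Assumed entry layout: "timestamp: HH:MM:SS:mmm, filepos: <hex offset>"
ENTRY = re.compile(
    r"timestamp:\s*(\d+):(\d+):(\d+):(\d+),\s*filepos:\s*([0-9A-Fa-f]+)"
)

def parse_idx(text):
    """Return a list of (start_ms, byte_offset) for each timestamp entry."""
    entries = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            hours, minutes, seconds, millis = (int(g) for g in match.groups()[:4])
            start_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
            entries.append((start_ms, int(match.group(5), 16)))
    return entries

def chunk_ranges(entries, sub_size):
    """Return (start_ms, offset, length); each chunk ends where the next begins."""
    offsets = [offset for _, offset in entries] + [sub_size]
    return [
        (start_ms, offset, offsets[i + 1] - offset)
        for i, (start_ms, offset) in enumerate(entries)
    ]
```

Reading `length` bytes at `offset` from the SUB file then gives the data for one subtitle.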

I deeply appreciate any help in this matter.

with regards
a developer

Last edited by A_C_ONE; 26th September 2007 at 04:06.
Old 26th September 2007, 04:52   #213  |  Link
Rectal Prolapse
Registered User
 
Join Date: Mar 2005
Posts: 430
A_C_ONE, this thread is about subtitles from HD-DVD and Blu-ray movies. I think you are referring to VOBSUB files from DVDs? If so, you may be better off starting a new thread, or find a more relevant thread to post your question in.

Good luck!
Old 29th September 2007, 11:12   #214  |  Link
Zelos
Registered User
 
Join Date: May 2007
Location: Marseille
Posts: 73
Taktaal, good job, OCR works fine!

I have a problem when I OCR everything.
The program crashes; even if I try to do it manually, it crashes after a few pictures.
Old 29th September 2007, 15:37   #215  |  Link
Deckard2019
Registered User
 
Join Date: Jan 2005
Posts: 110
Same problem for me. French language, XP SP2. Always on the same picture :

Old 29th September 2007, 20:32   #216  |  Link
Taktaal
Registered User
 
Join Date: May 2003
Posts: 114
Looks like I accidentally introduced a bug while searching for an algorithm that reliably detects italic words. I fixed it in the new version 0.81.
Old 29th September 2007, 23:06   #217  |  Link
Deckard2019
Registered User
 
Join Date: Jan 2005
Posts: 110
Works better now! Thank you!

But where are the timings? Just after opening subtitles.scn-sst, the SRT tab looks like this:



Same results when OCR ends.

Last edited by Deckard2019; 29th September 2007 at 23:33.
Old 30th September 2007, 01:27   #218  |  Link
Taktaal
Registered User
 
Join Date: May 2003
Posts: 114
Strange. Is it a Bluray or HDDVD subtitle? Does the problem happen too when you open the .sup directly? Can you upload the .sup somewhere?
Old 30th September 2007, 07:52   #219  |  Link
Deckard2019
Registered User
 
Join Date: Jan 2005
Posts: 110
Quote:
Is it a Bluray or HDDVD subtitle?
BluRay.

Quote:
Does the problem happen too when you open the .sup directly?
No.

Quote:
Can you upload the .sup somewhere?
http://deckstuff.free.fr/temp/hb.sup
Old 30th September 2007, 14:12   #220  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,126
Hmmmm... Just beginning to look into subtitles stuff. @Taktaal, thanks for investing work here! I've one crazy idea. Maybe you'll like it, or maybe not. But I thought I'd just post it, just in case:

Couldn't you create one monstrous bitmap (e.g. 800x100000 pixels) and draw all subtitles to that one large bitmap? Additionally you could manually add the timestamps (as written text) to that bitmap, too. We could then feed such a monster bitmap to a good OCR software and the result should be a full SRT subtitle text file. Maybe we'd need a little helper tool which cleans up the final text file to make it fully SRT compatible, but that should be no big problem. What do you think?
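
A minimal sketch of the layout step for such a sheet: given each subtitle image's size, compute where to paste it so the images stack top to bottom with some padding between them. The function `stack_layout` and its defaults are hypothetical, and the actual drawing and OCR are left to whatever imaging and OCR tools are used.

```python
def stack_layout(sizes, padding=20, sheet_width=1920):
    """Return (placements, sheet_height) for stacking images vertically.

    sizes is a list of (width, height); placements is a list of (x, y)
    top-left corners, one per image, horizontally centred on the sheet.
    """
    placements = []
    y = padding
    for width, height in sizes:
        if width > sheet_width:
            raise ValueError("image wider than sheet: %d > %d" % (width, sheet_width))
        x = (sheet_width - width) // 2
        placements.append((x, y))
        # Leave a gap so the OCR software sees separate text blocks.
        y += height + padding
    return placements, y
```

Keeping the list of `y` positions alongside the subtitle numbers makes it possible to map recognised text back to its timestamps afterwards.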