Old 17th May 2005, 05:26   #61  |  Link
Esc
Registered User
 
Join Date: Jan 2005
Posts: 74
I tried to run a hardsubbed file. Out of curiosity. Here is exactly what I did.
I had a successful run of DVD subtitle recognition, so I had a character matrix open and some text in the Subtitles window. Both were saved but not cleared.
I opened an AVI file. Pressed Play. Waited for the first subtitle to appear. Pressed 'Pause' (same button). Checked 'Use' box. Clicked with the mouse on the subtitle several times in different places. Pressed 'Run'. Waited a bit. Got bored. Pressed 'Stop' (same button). A pop-up window came out with some gibberish image and a prompt to enter a character. Whenever I tried to press 'pause/abort' or just close that window by the upper-right cross button I would get an error message. The header said: SubRip - 100%. The body said: List index out of bounds (5). And the pop-up window would stay open.
Old 17th May 2005, 10:48   #62  |  Link
ai4spam
Programmer
 
Join Date: Sep 2003
Posts: 382
Well, I usually reply only to people who actually need the program, not those who just use it "out of curiosity", but here goes: pressing "Run" won't do you any good (it'll give you garbage/gibberish) if you're not clicking inside the subtitle and changing the settings to get the right image in the first place. Basically, it should be white subs on a black background and not much else. For example, if the subtitles are on top of some white object, and that shows up as a big white blob, you should lower the subtitle color tolerance or check the "fill large areas" checkbox. If the subs are too thin or they look like they're "eaten by ants", you should increase the text and/or outline color tolerance. There are other tips in the (now outdated, but still fairly useful) section of the manual/readme.
I'll try to reproduce the bug you reported (adding to a non-empty sub), then fix it.
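
To picture what the color tolerance does, here is a rough sketch in Python. It is not SubRip's actual code: the function name, the per-channel distance test and the tiny pixel row are all assumptions made for illustration. A pixel turns white in the preview when it is close enough to the clicked text color, and black otherwise.

```python
def text_mask(pixels, text_color, tolerance):
    """Return a white-on-black mask (255/0) for pixels near text_color.

    pixels: list of rows, each row a list of (r, g, b) tuples.
    The per-channel comparison is an assumption, not SubRip's real metric.
    """
    def close(pixel):
        return all(abs(a - b) <= tolerance for a, b in zip(pixel, text_color))

    return [[255 if close(p) else 0 for p in row] for row in pixels]


# One row: a bright text pixel, a greyish anti-aliased edge, a dark background.
row = [[(250, 250, 250), (200, 200, 200), (10, 10, 10)]]
low = text_mask(row, (255, 255, 255), tolerance=10)
high = text_mask(row, (255, 255, 255), tolerance=60)
```

With the low tolerance only the brightest pixel survives (thin, "eaten by ants" letters); with the higher one the edge pixel is kept too, which is also why a bright object behind the subs starts to show up as a blob.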

Last edited by ai4spam; 17th May 2005 at 10:52.
Old 17th May 2005, 14:39   #63  |  Link
NN
Registered User
 
Join Date: Nov 2004
Posts: 5
Some suggestions

1. When I use SubRip to convert .idx/.sub files to .srt files I need to supply a name for the .srt file. It would be a lot more convenient if SubRip would default to the name of the .idx file (with .srt extension).

2. If I forget to clear the subtitle text window before processing the next .idx/.sub file I get the following message:
Subtitles text file isn't empty. Add to the end of file? (OK) (Cancel)
I would like a third option: Clear the subtitle text window before proceeding.

3. I had trouble adding the %-character to the matrix. SubRip first detected the first o and after using 'take with next' it included the /, but I couldn't get it to include the second o.

4. The subtitles I converted contained a lot of italic i, l and ! characters which were all detected as /. Is it possible to modify the detection system to recognise that a / within an italic word is probably not a / but an i or an l ?
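
For suggestion 1, the default name is just the .idx path with its extension swapped. A minimal Python sketch (the function name is made up; SubRip itself is written in Delphi, so this only shows the idea):

```python
from pathlib import Path


def default_srt_name(idx_path):
    """Suggest an output name: same folder and base name, .srt extension."""
    return str(Path(idx_path).with_suffix(".srt"))
```

So "movie.idx" would give "movie.srt" as the pre-filled name in the save dialog.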

Thanks, NN
Old 17th May 2005, 15:32   #64  |  Link
Esc
Registered User
 
Join Date: Jan 2005
Posts: 74
Maybe I wasn't clear enough. I didn't just 'press any key'. I clicked inside the subtitle. I did it several times to make sure I hit the right spot. Those subtitles were fairly small. And they were not plain white-on-black but some relatively bright color with a darker outline. It was an anime fansub.
Old 17th May 2005, 16:11   #65  |  Link
ai4spam
Programmer
 
Join Date: Sep 2003
Posts: 382
@Esc:
Increasing the text and outline color tolerances would definitely help you get the whole letters. Use the other features (draw lines between, fill open, fill large) to get rid of the false guesses. Again, adjust the parameters until you see something that makes sense in the preview (white text with red outline).
Also, anime fansubs sometimes use different text and outline colors during the same movie. Nothing can be done at this level, you need to stop processing and click on the text again when the colors change.

Last edited by ai4spam; 17th May 2005 at 21:14.
Old 17th May 2005, 21:12   #66  |  Link
ai4spam
Programmer
 
Join Date: Sep 2003
Posts: 382
@NN:
1) is feasible and easy enough
2) is also feasible, but not really justified, all you have to do is press "Cancel" and then "Clear"
3) is relatively hard, but I'll see what I can do (I had the same problem)
4) you're probably encountering this problem because you're reusing character matrices from other DVDs, instead of making new ones. That's ok in most cases, but you should increase the OCR sensitivity when you get these errors.
Old 17th May 2005, 21:28   #67  |  Link
Esc
Registered User
 
Join Date: Jan 2005
Posts: 74
BTW, is it just me, or is ver 1.20 really so much faster than 1.17?!

Also, I'm curious: where does it get the names for the subtitles? It called the director's commentary subtitles from my last DVD "English caption for children" or something like that. It's not a real problem. Just looks funny.
Old 17th May 2005, 21:50   #68  |  Link
ai4spam
Programmer
 
Join Date: Sep 2003
Posts: 382
@Esc:
It's compiled with a different version of Delphi. Also, Zuggy took out some components that were generating the P4 crash bug.
The names are read from the .ifo/.idx file. If you open it in a text editor, you'll find them there.

Last edited by ai4spam; 19th May 2005 at 08:36.
Old 17th May 2005, 22:27   #69  |  Link
NN
Registered User
 
Join Date: Nov 2004
Posts: 5
Re: Some suggestions

ai4spam, thanks for the quick response to my first post !

ps.
2. I know, but it would save me 5 mouse-clicks (and me feeling stupid every time I forget to use Clear).
4. The subtitles with both the / and the italic i, l and ! came from the same movie. Maybe this is something that could be added to the post-OCR correction process (like fixing l vs. I) ?

I forgot one more suggestion:

5. Could SubRip take the delay in an .idx file into account when creating a .srt file ?

Example:

id: en, index: 0
delay: -02:06:50:700
timestamp: 02:07:04:396, filepos: 000000000
timestamp: 02:07:08:200, filepos: 000001800
...
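
To make suggestion 5 concrete, here is a small Python sketch of what "taking the delay into account" could mean. It assumes the delay is simply added to every timestamp line that follows it, which is a guess, since whether the delay applies per stream or per file is not settled here. All function names are invented for the example.

```python
import re
from datetime import timedelta

TIME_RE = re.compile(r"(-?)(\d+):(\d+):(\d+):(\d+)")


def parse_time(text):
    """Parse hh:mm:ss:ms (optionally negative) as used in the .idx sample."""
    sign, h, m, s, ms = TIME_RE.fullmatch(text.strip()).groups()
    value = timedelta(hours=int(h), minutes=int(m),
                      seconds=int(s), milliseconds=int(ms))
    return -value if sign else value


def to_srt(value):
    """Format a timedelta as an .srt time, clamping negatives to zero."""
    total_ms = max(0, value // timedelta(milliseconds=1))
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


def shifted_timestamps(idx_text):
    """Assumption: a 'delay:' line shifts all timestamps that follow it."""
    delay = timedelta(0)
    result = []
    for line in idx_text.splitlines():
        line = line.strip()
        if line.startswith("delay:"):
            delay = parse_time(line.split(":", 1)[1])
        elif line.startswith("timestamp:"):
            stamp = line.split(",", 1)[0].split(":", 1)[1]
            result.append(to_srt(parse_time(stamp) + delay))
    return result


SAMPLE = """id: en, index: 0
delay: -02:06:50:700
timestamp: 02:07:04:396, filepos: 000000000
timestamp: 02:07:08:200, filepos: 000001800
"""
times = shifted_timestamps(SAMPLE)
```

Under that assumption the two sample subtitles would start at 00:00:13,696 and 00:00:17,500 instead of past the two-hour mark.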
Old 17th May 2005, 22:47   #70  |  Link
ai4spam
Programmer
 
Join Date: Sep 2003
Posts: 382
@NN:
2: ok, I'll see what I can do. I've been sick for the last few days, thus unable to work much.
4: increase the OCR sensitivity all the way to 1000, that should solve your problem for now. Post-OCR for such cases is possible, but I don't know if I'll be the one to implement it (I've got things in my personal life that I need to take care of first).

For the new suggestion: just use the time adjuster for now. We'll put it on the "sensible requests" list.
Old 17th May 2005, 23:50   #71  |  Link
masken
uhm... ?
 
Join Date: Oct 2001
Location: Gothenburg, Sweden
Posts: 281
@NN, what you need to do is work out a technique for building your character matrix. The solution is simple: OCR the first "o" as "%"; at the next OCR stop, most likely the "/" and part of either "o", just hit Enter. Same with the third "o". This builds a correct OCR result directly, as the other pieces are automatically skipped the next time they appear.

There are other examples of this. With the "Extend/Reduce" feature request à la SubResync, the idea was to be able to expand such selections so that the whole "%" would be selected automatically.
Old 18th May 2005, 03:59   #72  |  Link
ai4spam
Programmer
 
Join Date: Sep 2003
Posts: 382
Unfortunately, the first "o" can just as easily be a degree sign, so marking it as "%" is not a good idea. One way would be to say "extend right" for the first "o", set the "o/" to "%" and the second "o" to nothing. Then, each time a degree sign is encountered, it will ask you for a character pair again, because it will be followed either by a space or by a comma or other things, but never a "/". All you have to do is type in the appropriate "o ", "o,", "o.", and so on (here "o" is the degree sign). If the degree sign is found last in a line, the "take with next" character will be replaced with what you input, reverting to single character mode.
Some automatic way to do this might be possible (not easy), but care should be taken with character sets that are not Latin-based, like Chinese or Japanese.
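
The pair approach described above can be sketched as a lookup that always prefers the longest known glyph sequence. This is only an illustration in Python, not how SubRip stores its matrix: the glyph labels ("ring", "low_ring", "slash") are hypothetical, and it assumes the second "o" of a "%" can be told apart from the first one.

```python
def decode(glyphs, matrix):
    """Greedy longest-match lookup of glyph sequences in a character matrix."""
    longest = max(len(key) for key in matrix)
    out, i = [], 0
    while i < len(glyphs):
        for size in range(min(longest, len(glyphs) - i), 0, -1):
            key = tuple(glyphs[i:i + size])
            if key in matrix:
                out.append(matrix[key])
                i += size
                break
        else:
            out.append("?")  # unknown glyph: a real tool would prompt here
            i += 1
    return "".join(out)


# Hypothetical matrix: single digits, plus the pairs discussed in the post.
MATRIX = {(d,): d for d in "0123456789"}
MATRIX.update({
    ("ring", "slash"): "%",   # first "o" taken together with the "/"
    ("low_ring",): "",        # second "o" of the percent sign maps to nothing
    ("ring", "space"): "° ",  # degree sign followed by a space
    ("ring", "comma"): "°,",  # degree sign followed by a comma
    ("ring",): "°",           # degree sign at the end of a line
    ("C",): "C",
})

percent = decode(["5", "0", "ring", "slash", "low_ring"], MATRIX)
degrees = decode(["2", "0", "ring", "space", "C"], MATRIX)
line_end = decode(["2", "0", "ring"], MATRIX)
```

The single ("ring",) entry plays the role of the fallback to single character mode when the degree sign is the last thing on a line.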

Last edited by ai4spam; 18th May 2005 at 04:38.
Old 19th May 2005, 08:48   #73  |  Link
ai4spam
Programmer
 
Join Date: Sep 2003
Posts: 382
Re: Re: Some suggestions

Quote:
Originally posted by NN
5. Could SubRip take the delay in an .idx file into account when creating a .srt file ?

Example:

id: en, index: 0
delay: -02:06:50:700
timestamp: 02:07:04:396, filepos: 000000000
timestamp: 02:07:08:200, filepos: 000001800
...
I've never seen such a file. Could you please mail me an example (look up the address in the readme)? I need to know if the delay is per stream or per file. Thanks.

Last edited by ai4spam; 19th May 2005 at 10:51.
Old 21st May 2005, 00:01   #74  |  Link
masken
uhm... ?
 
Join Date: Oct 2001
Location: Gothenburg, Sweden
Posts: 281
@ai4spam, yeah, I thought someone would mention the degree character. Thing is, though, in most if not all fonts, I've noticed from (long) experience that the "%"-o and the °-character differ enough for SubRip to make them out as two different characters.

But yes, you're right of course, this is definitely one of the reasons I wanted the "extend" feature too.

Last edited by masken; 21st May 2005 at 00:04.
Old 21st May 2005, 07:20   #75  |  Link
ai4spam
Programmer
 
Join Date: Sep 2003
Posts: 382
Better safe than sorry, I always say. No worries, Beta 16 is being worked on.
Old 23rd May 2005, 12:39   #76  |  Link
ai4spam
Programmer
 
Join Date: Sep 2003
Posts: 382
Finally, Beta 16 is up.
Changelog:
- support for extending more than 1 character (test and provide feedback, please)
- GUI additions and fixes in video mode
- changed saved bitmaps from .bmp to .pgm and fixed the cropping
- some of the feature requests have been implemented - you'll just have to get it to see which

I'll give the VirtualDub filter one more try...

ATTN: Big boo-boo on my part, I had broken DVD subtitles. It's fixed now, please download again. I made a couple of changes elsewhere anyway (improved fill sides, plus added shortcuts for extend buttons).

Last edited by ai4spam; 23rd May 2005 at 19:17.
Old 24th May 2005, 02:36   #77  |  Link
Esc
Registered User
 
Join Date: Jan 2005
Posts: 74
Couple of thoughts here.
1. It doesn't show the build number in About window. I know that I am on build 16 because I have installed it right now. But will I remember in 2 weeks if I find a bug? It would be nice if there was the build number in the version.
2. Non-standard letters turn into standard for some reason. My é becomes e. I do not remember such a problem in 1.17.
3. Subtitle Index offset changes its value every time you walk through it. If you go to output format, select SubRip and just keep hitting tab, every time you walk through that field it changes its value to the opposite.
Old 24th May 2005, 04:27   #78  |  Link
Longinus
Registered User
 
Join Date: Apr 2003
Location: Brazil
Posts: 87
Hello..
Thank you for making SubRipAvi. In the past I used AVISubDetector, but I just guessed the options every time, trying to make it work. Your solution is a LOT easier.

But I'm having some problems.. The first is that in the latest beta, the "Video File Viewer" window background is "transparent". You can't read some of the text, and it's drawing everything from another window if you put it on top of it.. (It didn't happen in the earlier betas.)

Another thing is that I'm trying to OCR a timecode (of the frame number), so I have to get every frame. So I set "skip first", "update every" and "Min duration" to 1. But it doesn't work. Sometimes subrip jumps 10 frames, sometimes less, sometimes more.
Am I doing something wrong?
Old 24th May 2005, 05:23   #79  |  Link
ai4spam
Programmer
 
Join Date: Sep 2003
Posts: 382
@Esc:
1) Will take care of this.
2) This happens to me with diacritics in languages like Romanian. It's not SubRip's fault, somehow you changed your system settings (that, plus Delphi makes a hidden ANSI-OEM-ANSI conversion in strings, and it can't be taken out AFAIK). Just go to regional settings and set "language of non-unicode programs" to the language of your choice. Are you sure you're using the é from the default font/charset? In my example (Romanian), some letters would change (t, and s,), but not others (i^ and a^). The way I "fixed" it was to assign the correct characters to some of the 20 buttons on the bottom. If you select the Romanian language in the OCR window, you'll notice that the letters I mentioned still don't show up correctly, even when using the EASTEUROPE charset, but they do show up correctly in the final subs. So, try to play with the charset in the general options window. The rule for characters like your é is: if you copy and paste it on a button (with right click), it should show up as you want it on the button, or else it's not the right character.
3) I fixed it, will upload in next beta.
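
The effect in point 2 can be imitated in a few lines of Python. This is not Delphi's conversion, only a rough stand-in for a "best fit" mapping: a character that the active code page cannot hold loses its diacritic, while one that fits passes through unchanged. The function name is made up, and the code pages are Python's own codecs.

```python
import unicodedata


def through_codepage(text, codepage):
    """Imitate a lossy trip through a non-Unicode code page (an approximation)."""
    out = []
    for ch in text:
        try:
            ch.encode(codepage)
            out.append(ch)  # the code page has this character
        except UnicodeEncodeError:
            # Drop combining marks, e.g. "é" -> "e"; give up with "?" otherwise.
            base = "".join(c for c in unicodedata.normalize("NFKD", ch)
                           if not unicodedata.combining(c))
            try:
                base.encode(codepage)
                out.append(base or "?")
            except UnicodeEncodeError:
                out.append("?")
    return "".join(out)


western = through_codepage("é", "cp1252")    # Western European: é fits
cyrillic = through_codepage("é", "cp1251")   # Cyrillic: é does not fit
romanian = through_codepage("țâ", "cp1250")  # comma-below ț is missing, â fits
```

That lines up with the Romanian case above: the comma-below letters change, the circumflex ones do not, and an é only survives if the selected language has room for it.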

@Longinus:
Thanks for your appreciation.
Your "transparent window" bug is really weird, there is no reason why it should happen. Maybe there's a problem with your video drivers? Please try it on another machine and let me know if it still happens. A screenshot would also be useful. Ah, and it's worth asking: are you sure you can't see the text because of the new feature (fill to the sides of the text with fuchsia color)?
The skip first, update every and min duration were set there for temporal optimization, without them it would be really slow. Beats me why you would try to OCR a timecode...
Anyway, basically I'm processing every frame, when I detect a sub I skip the first frames (to let the sub appear fully), then accumulate for min duration frames, then just compare and reset every now and then.
In your case, if the timecode is the only thing you OCR, then it doesn't change much from frame to frame. Try also lowering the same sub tolerance to a really small number. However, it will still accumulate at least 1 frame, to process every frame I'll need to make some changes (again, done, but will upload in next beta).
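
As a rough picture of the loop described above (skip the first frames, accumulate for the minimum duration, then compare every now and then), here is a simplified Python model. It is a guess at the control flow, not the real implementation: each frame is reduced to a hypothetical "signature" value, with None meaning no subtitle was found.

```python
def segment_subtitles(signatures, skip_first=2, min_duration=3, update_every=4):
    """Return (start, end, reference) tuples; end is exclusive and approximate."""
    segments = []
    i, n = 0, len(signatures)
    while i < n:
        if signatures[i] is None:  # no subtitle on this frame
            i += 1
            continue
        start = i
        # Skip the first frames so the subtitle has fully appeared.
        ref_index = min(i + skip_first, n - 1)
        reference = signatures[ref_index]
        if reference is None:  # it vanished before it could be sampled
            i = ref_index
            continue
        # Accumulate: assume the subtitle lasts at least min_duration frames.
        i = ref_index + min_duration
        # Afterwards only compare every update_every frames.
        while i < n and signatures[i] == reference:
            i += update_every
        end = min(i, n)
        segments.append((start, end, reference))
        i = end
    return segments


# A burned-in frame counter changes on every frame.
counter = segment_subtitles(list(range(6)), skip_first=1, min_duration=1,
                            update_every=1)
# An ordinary subtitle that stays up for five frames.
normal = segment_subtitles([None, "a", "a", "a", "a", "a", None, None],
                           skip_first=1, min_duration=2, update_every=2)
```

In this model a counter that changes every frame is only sampled on every second frame even with all three settings at 1, which is the kind of skipping reported above; getting every frame needs a change in the loop itself.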

Last edited by ai4spam; 24th May 2005 at 05:35.
Old 24th May 2005, 08:38   #80  |  Link
Longinus
Registered User
 
Join Date: Apr 2003
Location: Brazil
Posts: 87
I tried it in my other computer, and it worked. It probably is a driver bug, this computer is a long time overdue for a full format. But anyway, here is the link to the screenshot image.

http://www.unkind.org/blog/images/subrip_screen.png

About the timecode... Someone gave me an edited movie (15m) that used 3 big ones (20m each). But the editor somehow LOST the Final Cut project file. So I'm stuck with the low-quality final movie, and I have to re-edit, putting the same parts but in high quality. Of course this will be a pain in the ass to do by hand... but I have the timecode, and if I can get it in text format, I can code a simple program to create an AviSynth script that will cut the used parts for me.. It will work.. in theory..
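
The "simple program" could look roughly like this Python sketch. It assumes the OCR output has already been turned into a list of source frame numbers in playback order and that everything comes from a single source file ("big.avi" is a placeholder); with three sources, something would also have to say which file each frame belongs to. AviSource, Trim and ++ are standard AviSynth, the rest is made up for the example.

```python
def frame_ranges(frames):
    """Group frame numbers into (first, last) runs, keeping playback order."""
    ranges = []
    for f in frames:
        if ranges and f == ranges[-1][1] + 1:
            ranges[-1] = (ranges[-1][0], f)  # extend the current run
        else:
            ranges.append((f, f))            # a cut: start a new run
    return ranges


def avisynth_script(source, frames):
    """Build a script that splices the used parts of one source clip.

    Caveat: Trim(n, 0) means "until the end of the clip" in AviSynth,
    so a run ending at frame 0 would need special handling.
    """
    trims = " ++ ".join(f"src.Trim({a}, {b})" for a, b in frame_ranges(frames))
    return f'src = AviSource("{source}")\n{trims}\n'


script = avisynth_script("big.avi", [100, 101, 102, 40, 41])
```

Any frames the OCR missed would split a run in two, so the list is worth checking for small gaps before generating the script.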