Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
9th November 2005, 23:56 | #241 | Link | |
Registered User
Join Date: Jan 2004
Location: Czech
Posts: 181
|
Quote:
So I have many matrices with different charmaps. The matrices differ, but each contains only one charmap. I don't have a problem with badly recognized chars.
__________________
(Sorry for my bad english, I'm czech, not englishman... :)) |
|
15th November 2005, 00:40 | #242 | Link | |
Programmer
Join Date: Sep 2003
Posts: 382
|
Quote:
|
|
28th November 2005, 17:13 | #243 | Link |
uhm... ?
Join Date: Oct 2001
Location: Gothenburg, Sweden
Posts: 281
|
@ai4spam, have you worked anything on the "whole words formatting only" issue?
Two other things I've come to notice:
If the vobsubs are smaller than usual, the words often get stuck together unless you manually adjust the Space Width setting. Nothing strange in itself, but vobsub could actually make a pretty good guess at the space width automatically by checking the average character height in a vobsub picture when OCR:ing, and adjust the setting according to a table or math function etc. What do you think about this? Check your PMs for an example of this.
Another thing, in the post-OCR check... I think there might already be something similar there, but I'll post about it anyhow. Since the days before selections could be extended, one has often missed parts of certain characters (especially in italics) when OCR:ing, and many still haven't got into the habit of extending a selection, since they think most of the character is covered. The post-OCR check could automatically replace these character combinations:
.! > !
!. > !
?. > ?
.? > ?
.: > :
:. > :
|
29th November 2005, 10:46 | #244 | Link |
Programmer
Join Date: Sep 2003
Posts: 382
|
@masken:
1) It wouldn't be that easy, because there are other factors involved, and kerning is different from font to font, from regular to italic to bold, and so on.
2) You can do those yourself by editing the "English_screwed.dic" file or creating a new one.
PS: A new version will be up soon, but my time to work on it is limited.
Last edited by ai4spam; 29th November 2005 at 10:53. |
29th November 2005, 13:28 | #245 | Link | |
Registered User
Join Date: May 2002
Location: Czech Republic
Posts: 171
|
Quote:
What about this sentence: [Mama] Wash your hands...! [Good son] Should I do it today...? The other replacements you suggested are already covered by the post-OCR correction. |
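The objection above can be made concrete with a small sketch. It applies the stray-dot replacements proposed earlier in the thread, but leaves a dot alone when it is part of a run of dots such as "...!" or "...?". This is only an illustration under that assumption: the function name and the regular expressions are hypothetical and are not taken from SubRip or its .dic files.

```python
import re

# Assumption: a single "." directly next to "!", "?" or ":" is an OCR artifact,
# while a "." that touches another "." belongs to an ellipsis and must stay.
STRAY_DOT_BEFORE = re.compile(r"(?<!\.)\.([!?:])")  # ".!" -> "!", but not "..!"
STRAY_DOT_AFTER = re.compile(r"([!?:])\.(?!\.)")    # "!." -> "!", but not "!.."


def clean_punctuation(line: str) -> str:
    """Drop a lone stray dot adjacent to '!', '?' or ':' (hypothetical helper)."""
    line = STRAY_DOT_BEFORE.sub(r"\1", line)
    line = STRAY_DOT_AFTER.sub(r"\1", line)
    return line
```

With this guard, "Wait.!" becomes "Wait!", while "Wash your hands...!" passes through unchanged.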
|
4th December 2005, 22:20 | #248 | Link |
Registered User
Join Date: Sep 2004
Location: France
Posts: 367
|
Subripping of idx/sub extracted from an mkv file is really weird
http://lemoi.fr.free.fr/sub_fr.rar
The idx/sub files are correct, but after SubRip the timecodes are messed up. |
5th December 2005, 12:36 | #249 | Link |
Registered User
Join Date: Jul 2003
Location: Brazil
Posts: 234
|
Hi, dear all.
I have tried to rip subtitles from an extra DVD edition and ran into some problems with OCR. In the end, SubRip can "extract" all of the subs to bmp without problems (about 400 bmp files showing all the speech). But when I try to rip to srt text, SubRip can't read them directly. Is there a way to correct that in SubRip? Or maybe by using Ifoedit (or another tool) to create a new (DVD palette) file that SubRip can read better later? And, as an additional future request: could SubRip in the future read/rip/save SUP to srt and vice versa, SUP being the format I get subtitles in when I demux during DVD ripping with DVD-D and similar rippers? Thanks. Last edited by johner23; 5th December 2005 at 13:14. |
5th December 2005, 15:26 | #250 | Link |
Programmer
Join Date: Sep 2003
Posts: 382
|
@LeMoi: I don't understand what you mean by "messed up"; the subtitles you sent work fine here.
@johner23: You'll have to give me more details than "can't read directly". What exactly does not work? Are the colors in the bitmaps somehow screwed up? Can you send me the index file and a couple of bitmaps? By the way, if at all possible, use vsRip from http://guliverki.sf.net for ripping into idx/sub files instead of bitmaps. |
5th December 2005, 16:07 | #251 | Link | |
Registered User
Join Date: Jul 2003
Location: Brazil
Posts: 234
|
Quote:
PS: Is it possible to correct them using Ifoedit or another tool? If I choose to rip to bitmaps, OK: SubRip gives me almost 400 bmp files, which contain all the lines as images. But when I try to rip into an srt text file, no way: SubRip can't recognize the colors, so I need to type the lines manually one by one. There is an option with 4 basic colors in SubRip (blue, red, black and one other that I can't remember now). I turned off 2 of them, and after that SubRip could partly "read" the files. But I still need to type all the lines, because SubRip only recognizes the times, not the words themselves. Do you want me to send you that file? It's 7 MB in size. I've tried to use Gabest's tool, but it seems to fail at extraction. Only SubRip could get the times. Thanks. Last edited by johner23; 5th December 2005 at 16:10. |
|
6th December 2005, 12:41 | #253 | Link |
Registered User
Join Date: Jul 2003
Location: Brazil
Posts: 234
|
@ai4spam
Again, I tried to rip the sub, but this time I used the most recent SubRip version (the other was a little bit older) and could rip it well. Of course there were some small parts that I had to correct manually, but almost 98% was done this time. As a suggestion, can you add functions similar to those of DVD Subtitle Tools to the SubRip core? ---> http://web.quick.cz/FKasparek/ In the worst case, if I couldn't rip the sub, I would try to demux into sup files using DVD-D or VobEdit and try to convert them into txt using DVD Subtitle Tools. If we could deal with sup files and edit them, it would be useful for correcting the colour palette and other aspects in hard ripping cases. Thanks for your attention. Last edited by johner23; 6th December 2005 at 12:44. |
6th December 2005, 13:16 | #254 | Link | |
Registered User
Join Date: Sep 2004
Location: France
Posts: 367
|
Quote:
|
|
7th December 2005, 15:04 | #255 | Link |
Programmer
Join Date: Sep 2003
Posts: 382
|
As a matter of fact, I did. Here are the first and last 3 subtitles, from the .idx and from the result:
Code:
id: fr, index: 0
timestamp: 00:00:48:760, filepos: 000000000
timestamp: 00:04:05:120, filepos: 000001000
timestamp: 00:04:08:560, filepos: 000002000
...
timestamp: 01:43:00:800, filepos: 0003ec000
timestamp: 01:43:03:160, filepos: 0003ed000
timestamp: 01:43:05:720, filepos: 0003ee000
Code:
1
00:00:48,760 --> 00:00:51,593
blah blah blah

2
00:04:05,120 --> 00:04:08,317
blah blah blah

3
00:04:08,560 --> 00:04:10,755
blah blah blah

...

1127
01:43:00,800 --> 01:43:03,872
blah blah blah

1128
01:43:03,160 --> 01:43:05,116
blah blah blah

1129
01:43:05,720 --> 01:43:07,676
blah blah blah
|
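To compare the start times in an .idx file with those in the resulting .srt, as done by hand in the listing above, something like the following sketch could be used. It only assumes the line layout shown in the listing ("timestamp: HH:MM:SS:mmm, filepos: ...") and that filepos is a hexadecimal offset; the function names are made up for illustration and are not part of SubRip.

```python
import re

# Matches lines like "timestamp: 00:04:05:120, filepos: 000001000".
IDX_LINE = re.compile(
    r"timestamp:\s*(\d{2}):(\d{2}):(\d{2}):(\d{3}),\s*filepos:\s*([0-9a-fA-F]+)"
)


def parse_idx_timestamps(idx_text):
    """Return a list of (start_ms, filepos) tuples found in .idx text."""
    entries = []
    for match in IDX_LINE.finditer(idx_text):
        hours, minutes, seconds, millis = (int(g) for g in match.groups()[:4])
        start_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
        # Assumption: filepos is written in hexadecimal (e.g. "0003ec000").
        entries.append((start_ms, int(match.group(5), 16)))
    return entries


def to_srt_time(ms):
    """Format milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"
```

Running each parsed start time through to_srt_time gives the value that should appear on the left of "-->" in the matching .srt entry.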
7th December 2005, 15:07 | #256 | Link | |
Programmer
Join Date: Sep 2003
Posts: 382
|
Quote:
As for functionality implemented elsewhere, there's very little chance you'll see it in SubRip. I simply don't have the time, and it doesn't make sense to reimplement something that works well. If it didn't... that's another story. |
|
7th December 2005, 17:13 | #257 | Link | |
Registered User
Join Date: Sep 2004
Location: France
Posts: 367
|
Quote:
At the end of the subripping process : and the content :
Example :
idx/sub : 00:52:38:320 : Pourquoi ils rient ?
srt : 00:52:38:320 : On a la sensation d'être amoureux
srt : 00:51:33:680 : Pourquoi ils rient ?
idx/sub : 00:52:49:120 : On a la sensation d'être amoureux
Is this normal? |
|
7th December 2005, 19:27 | #258 | Link |
Programmer
Join Date: Sep 2003
Posts: 382
|
Hmm, ok... so the timestamps don't correspond to the content. It's weird, you can actually just open the .sub file (without the .idx) and still get them "wrong". What tool are you using to visualize the .idx/.sub?
Anyway, I can't really help you, maybe Zuggy can take a look, but my guess is that it's a malformed .sub file. Again, just go online and get the text subs from some specialized website. |
7th December 2005, 19:35 | #259 | Link |
Registered User
Join Date: Sep 2004
Location: France
Posts: 367
|
I see the content of the sub with SubResync.
Going online to find subs is not really helpful. If I extract subs from a file, it's often to find other subs in other languages and sync them with the extracted ones so that they match, and then I add those to the original mkv. If the extracted subs are wrong, I can't resync them ^^. I talked about this with Mosu and he thought it wasn't an mkvtoolnix problem, but I think he's wrong, so he asked me to check with you... |
7th December 2005, 21:07 | #260 | Link |
Programmer
Join Date: Sep 2003
Posts: 382
|
Well, sync-ing won't help you, since the contents are shifted. What you need to do is get a sub that is "good" from the web (maybe in another language), then open it in translator mode in SubtitleWorkshop, and load the "bad" sub as the translation. Then the times will be taken from the "good" sub, and all you have to do is save the "bad" sub with the new timings. Of course, SubtitleWorkshop also lets you check if the translation is correct.
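For anyone who prefers a script, the idea of taking the times from the "good" sub and the text from the "bad" one can be sketched roughly as below. This is not how SubtitleWorkshop does it; it is a minimal illustration that assumes both .srt files have the same number of entries in the same order, and the function names are hypothetical. Unlike translator mode, it gives no chance to check by eye that the lines really correspond.

```python
import re

TIMING = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def parse_srt(text):
    """Return a list of (timing_line, text_lines), one per subtitle entry."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            if TIMING.fullmatch(line.strip()):
                cues.append((line.strip(), lines[i + 1:]))
                break
    return cues


def retime(good_srt, bad_srt):
    """Combine the timings of good_srt with the text of bad_srt."""
    good = parse_srt(good_srt)
    bad = parse_srt(bad_srt)
    # Assumption: entry N in one file corresponds to entry N in the other.
    if len(good) != len(bad):
        raise ValueError(f"entry count differs: {len(good)} vs {len(bad)}")
    out = []
    for number, ((timing, _), (_, text)) in enumerate(zip(good, bad), start=1):
        out.append(f"{number}\n{timing}\n" + "\n".join(text))
    return "\n\n".join(out) + "\n"
```

If the two files do not have the same number of entries, the sketch stops with an error instead of guessing, which is the case where checking the pairs manually in translator mode is the safer route.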
|
|
|