Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
21st September 2005, 04:58 | #221 | Link |
Programmer
Join Date: Sep 2003
Posts: 382
|
@TiaoMacaleh: If you mean that you often get the prompt to type the entire dialogue, it's probably because the lines of text are too close together (this usually happens when the second line has accented characters). Try lowering the "min interline height" value in the OCR options. Sometimes it's impossible to draw a straight line between the two lines of text no matter what you do, so you'll need to type in those dialogues manually. It would be possible to separate text lines with a line that "goes around" accents and the like, but these cases are too rare to justify the programming effort at the moment. And since that window now also has the special characters table, it's a lot easier to type the entire line.
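To illustrate why a smaller "min interline height" can help, here is a rough sketch of splitting a bitmap into text lines by looking for blank rows. This is not SubRip's actual code; the function name, the 0/1 bitmap format and the exact meaning given to the threshold are all assumptions made for the example.

```python
from typing import List, Tuple


def split_text_lines(bitmap: List[List[int]],
                     min_interline_height: int = 2) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges, inclusive, for each text line.

    bitmap is a list of pixel rows, 1 = ink, 0 = background (assumed format).
    A run of blank rows separates two lines only if it is at least
    min_interline_height rows tall (assumed meaning of the option).
    """
    has_ink = [any(row) for row in bitmap]
    lines = []
    start = None     # first inked row of the current text line
    last_ink = None  # most recent inked row seen
    for y, ink in enumerate(has_ink):
        if not ink:
            continue
        if start is None:
            start = y
        elif y - last_ink - 1 >= min_interline_height:
            # The blank gap is tall enough: close the line, start a new one.
            lines.append((start, last_ink))
            start = y
        last_ink = y
    if start is not None:
        lines.append((start, last_ink))
    return lines


# Two text lines separated by a single blank row, because an accent on the
# second line (row 3) reaches up into the gap.
sample = [
    [1, 1],  # first line
    [1, 0],
    [0, 0],  # the only blank row
    [0, 1],  # accent on top of the second line
    [1, 1],  # second line
    [1, 1],
]
print(split_text_lines(sample, min_interline_height=2))
print(split_text_lines(sample, min_interline_height=1))
```

With the threshold at 2 the whole picture comes back as one block, which is the situation where you would be asked to type the entire dialogue; with the threshold at 1 the single blank row is enough to separate the two lines. If the accent touched the first line there would be no blank row at all, and no straight cut could work.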
@LeMoi: 1) You probably mean the last CodePage, not CharSet. The French language usually triggers the UniCode flag due to the "oe" ligature character. Answer "No" when asked to save as UniCode, then just leave "DEFAULT_CHARSET" and "1252 - ANSI Latin I". You should have no problems with the resulting subtitle. You should never need to choose CodePage 65000 or 65001 (which are UTF-7 and UTF-8, by the way). When you change the CharSet, SubRip automatically sets the corresponding CodePage for you, and normally you should never have to change the CodePage yourself (it's there only for more flexibility). Alternatively, you can save the text as UniCode, then load it in Word and save it as text with the encoding (CodePage) of your choice. Word highlights the characters that cannot be converted (it shows them as red question marks).
2) The translator didn't respect the requirement to keep the number of characters when translating, so the best guess text overlaps the text that says "Meilleure reponse". It should be a bit better in the last beta, because now the best guess has spaces before and after it, like " l ". The font and color are also different, so you should easily see the best guess. Or... edit "Lang\Francais.lng" with NotePad and put "Meil. repo." instead of "Meilleure reponse". |
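If you want to check for yourself which characters a given CodePage can hold, a small script is enough. This is only a sketch using Python's built-in codecs ("cp1252", "cp1250"), not anything SubRip does internally, and the function names are made up for the example.

```python
from typing import List


def unconvertible_chars(text: str, codepage: str) -> List[str]:
    """Return the distinct characters of text that codepage cannot encode."""
    bad = []
    for ch in text:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            if ch not in bad:
                bad.append(ch)
    return bad


def save_with_codepage(text: str, codepage: str) -> bytes:
    """Encode text, replacing anything unconvertible with '?'."""
    return text.encode(codepage, errors="replace")


french = "Le cœur du garçon"
print(unconvertible_chars(french, "cp1252"))   # nothing is lost in 1252
print(unconvertible_chars("čaj", "cp1252"))    # Czech letters do not fit
print(save_with_codepage("čaj", "cp1252"))     # the lost letter becomes '?'
```

The "oe" ligature and the c with cedilla both exist in CodePage 1252, which is why answering "No" to UniCode and keeping 1252 loses nothing for French text. The question marks from the "replace" mode play the same role as the red question marks Word shows.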
21st September 2005, 10:08 | #222 | Link | |
Registered User
Join Date: Sep 2004
Location: France
Posts: 367
|
Quote:
and thanks for the tip |
|
21st September 2005, 15:49 | #223 | Link |
Programmer
Join Date: Sep 2003
Posts: 382
|
@LeMoi: That's weird, because that character is kept by default. Are you sure you OCR-ed it as such? For example, if you OCR-ed it as regular "C" (either by mistake, or because your char matrix tells it to) and the inter-line distance is too small, it may skip the "," underneath the special "C" altogether.
|
21st September 2005, 15:56 | #224 | Link |
Registered User
Join Date: Sep 2004
Location: France
Posts: 367
|
Hmm, I don't really remember, but I'm almost sure. Maybe I'm wrong, though, since in most cases it's OCR-ed correctly and SubRip advises me to save in Unicode. I'll try with other subs and let you know.
|
21st September 2005, 18:53 | #225 | Link |
Registered User
Join Date: Aug 2004
Posts: 24
|
ai4spam, it worked. I tried changing the setting before without luck, but the problem was that the program doesn't seem to apply the changes right away: you need to restart SubRip for the changes to take effect. After a restart everything was fine. Thanks =]
|
21st September 2005, 21:35 | #228 | Link |
Unbeliever
Join Date: Sep 2002
Location: Greece
Posts: 111
|
No amount of time unfreezes it.
I realised, though, that the problem exists only if the matrix is loaded automatically using the search for match button. If it is loaded manually at the beginning, it works fine. |
22nd September 2005, 01:00 | #229 | Link |
Programmer
Join Date: Sep 2003
Posts: 382
|
@LeMoi: the special ",c" character is part of the ANSI CharSet, so if you only have that and no other special characters SubRip will not ask you to save as UniCode, but will directly go to the saving window and let you set the CharSet and CodePage for conversion.
The problem may be that DEFAULT_CHARSET has a different value on French Windows XP. To be sure, select ANSI_CHARSET instead - this changes the CodePage to 1252 and it should be fine.
@sapient: bad idea, using the search button with my huge matrix. Again, if you see some really weird font, make a new matrix, otherwise just add characters to mine.
Last edited by ai4spam; 22nd September 2005 at 03:09. |
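For reference, the link between a CharSet and a CodePage can be pictured as a simple lookup table. The sketch below uses the standard Windows charset names and their usual code pages; it is not SubRip's own table, and the function name and the system_codepage parameter are assumptions for the example (DEFAULT_CHARSET has no fixed code page, it follows the system locale).

```python
# Usual Windows charset -> code page pairs (a subset).
CHARSET_TO_CODEPAGE = {
    "ANSI_CHARSET": 1252,
    "EASTEUROPE_CHARSET": 1250,
    "RUSSIAN_CHARSET": 1251,
    "GREEK_CHARSET": 1253,
    "TURKISH_CHARSET": 1254,
    "HEBREW_CHARSET": 1255,
    "ARABIC_CHARSET": 1256,
    "BALTIC_CHARSET": 1257,
}


def codepage_for(charset: str, system_codepage: int = 1252) -> int:
    """Return the code page for a charset name.

    DEFAULT_CHARSET depends on the machine, so the caller passes the
    system code page explicitly (hypothetical parameter for this sketch).
    """
    if charset == "DEFAULT_CHARSET":
        return system_codepage
    return CHARSET_TO_CODEPAGE[charset]


print(codepage_for("ANSI_CHARSET"))
print(codepage_for("DEFAULT_CHARSET", system_codepage=1250))
```

This is why picking ANSI_CHARSET explicitly is the safer choice: it always means 1252, whatever the system default happens to be.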
22nd September 2005, 10:31 | #230 | Link | |
Registered User
Join Date: Sep 2004
Location: France
Posts: 367
|
Quote:
|
|
26th October 2005, 13:33 | #232 | Link |
Programmer
Join Date: Sep 2003
Posts: 382
|
SubRip 1.40 Beta 3 is now up.
ChangeLog:
Added support for negative timestamps. It'll show a warning that some delay needs to be added to synchronize, and start with 00:00:00,000 as the first subtitle timestamp.
Added support for using the file offsets from .idx files. It does not seem to help with badly-formed subtitles and screws up some good ones, as the file offsets seem to be bogus, so it's not on by default.
Fixed a few bugs. |
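One way to read the negative timestamp handling is as a plain shift: if the first subtitle starts before zero, move everything forward so that it starts at 00:00:00,000, and report the delay that was added. The sketch below shows that reading only; the function names and the (start, end) millisecond tuples are assumptions for the example, not SubRip internals.

```python
from typing import List, Tuple

Cue = Tuple[int, int]  # (start_ms, end_ms)


def format_srt_time(ms: int) -> str:
    """Format non-negative milliseconds as an SRT timestamp HH:MM:SS,mmm."""
    if ms < 0:
        raise ValueError("SRT timestamps cannot be negative")
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def normalize_negative_start(cues: List[Cue]) -> Tuple[List[Cue], int]:
    """Shift all cues so the earliest start is not negative.

    Returns the shifted cues and the delay (in ms) that was added.
    """
    if not cues:
        return [], 0
    earliest = min(start for start, _ in cues)
    delay = -earliest if earliest < 0 else 0
    return [(start + delay, end + delay) for start, end in cues], delay


cues = [(-1500, 500), (2000, 4000)]
shifted, delay = normalize_negative_start(cues)
if delay:
    print(f"Warning: {delay} ms of delay added; resynchronize if needed")
for start, end in shifted:
    print(format_srt_time(start), "-->", format_srt_time(end))
```

Because every cue moves by the same amount, the spacing between subtitles is preserved and only one global delay has to be corrected afterwards.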
30th October 2005, 03:15 | #234 | Link |
Programmer
Join Date: Sep 2003
Posts: 382
|
There's a mirror (kindly provided by FoxyShadis): http://foxyshadis.slightlydark.com/r...CharMatrix.rar
|
7th November 2005, 18:15 | #236 | Link |
Registered User
Join Date: Jan 2004
Location: Czech
Posts: 181
|
I can't save in the right format
Hi,
I have a big problem with SubRip. When I rip Czech subtitles, the program asks me whether to save as Unicode or not, but I don't want Unicode, because it sucks. So I choose NO, but what now? Which charset should I select? I tried 1250 and 1252, but neither saves in the right format. With the first, it saves "e" instead of the character "č", and with the second it saves "r,u,...." instead of "ř,ů,....". Bah, I want the old SubRip without Unicode, but with the new features... Thanks for any ideas... EDIT: I found where the problem is: I'm using my old saved matrices, which probably weren't saved as Unicode...
__________________
(Sorry for my bad english, I'm czech, not englishman... :)) Last edited by JnZ; 7th November 2005 at 19:35. |
8th November 2005, 13:39 | #238 | Link | |
Registered User
Join Date: Jan 2004
Location: Czech
Posts: 181
|
Quote:
BTW: Lately I've ripped subs from about 30 DVDs, and your matrices didn't cover a single case, while my matrices cover 95% of cases. So they are very valuable to me. :-)
|
9th November 2005, 13:34 | #239 | Link |
Registered User
Join Date: May 2002
Posts: 203
|
char matrix ai4spam problem
Hello!
The char matrix works perfectly, but when I save and choose ANSI charset 1250, the saved file has "!" in place of the character "i", and in italics I get "0" instead of "o". In previous versions, when I had to enter the chars for the matrix myself, there was no option to save the character set. Is there any workaround so that I only have to choose the codepage and that's it? I ask since the char matrix from ai4spam is perfect. Thanks for any suggestions! |
9th November 2005, 19:46 | #240 | Link |
Programmer
Join Date: Sep 2003
Posts: 382
|
@JnZ: Well, you probably use the option to search in other char matrices, and you have many small ones. You should have had them all in one, and only start a new one when the big matrix wasn't working. BTW, if you use alphabetical sorting, correcting them shouldn't be that hard. How do you choose the right matrix out of 200, anyway?
@svcdprayer: Hmm, "perfect" is a little too much. The problem is not the charset you save in, but the OCR sensitivity. Try setting it to 1000. If you still get problems, then somewhere in the matrix you have an "!" instead of an "i" and so on. Use alphabetical sorting (click on the column heading to sort) and look at all the "!" entries; one of them must be wrong. The "0" vs "o" problem may not be solvable, since otherwise identical characters may mean different things on different DVDs. For that, I'd run a spellchecker (Word or SubtitleWorkshop) on the final text. |
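Before running a full spellchecker, a quick script can point at the lines most likely to contain these mix-ups. The sketch below is only a heuristic and an assumption about what "suspicious" means: it flags a "!" or a "0" sitting between a letter and a lowercase letter. The function name is made up, and accented letters are not covered by the simple A-Z ranges used here.

```python
import re
from typing import Iterable, List, Tuple

# A letter, then "!" or "0", then a lowercase letter: "th!s", "f0und".
# Plain ASCII ranges only; accented letters are not matched.
SUSPICIOUS = re.compile(r"[A-Za-z][!0][a-z]")


def flag_suspicious_lines(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, text) for lines that look like OCR mix-ups."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if SUSPICIOUS.search(line)
    ]


subtitle_lines = ["Th!s is f0und", "Stop! Go.", "In 2005"]
for number, line in flag_suspicious_lines(subtitle_lines):
    print(number, line)
```

A normal exclamation mark at the end of a word and ordinary numbers are left alone, so the list stays short. It only tells you where to look; the fix itself still belongs in the matrix or in the spellchecked text.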
|
|