Old 2nd March 2015, 11:06   #321  |  Link
Music Fan
Registered User
 
Join Date: May 2009
Location: Belgium
Posts: 1,744
Quote:
Originally Posted by aax View Post
Exactly. But check the cheat sheet, there are good examples.

For adding space before three points (a.k.a. ellipsis) when there is none this should do it:
find: (?<![\. ])\.{3}(?!\.)
replace with: space and three dots

In regular expressions dot matches any character except line breaks, so when you want it to mean an actual dot you have to "escape" it by putting a backslash before it.
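
The quoted find/replace can also be tried outside the editor. Below is a minimal Python sketch of it; the function and constant names are made up for illustration, and Python's `re` module is only assumed to behave like the regex engine of the subtitle tool for this pattern.

```python
import re

# Three literal dots that are not preceded by a dot or a space
# and not followed by a fourth dot.
ELLIPSIS_WITHOUT_LEADING_SPACE = re.compile(r"(?<![\. ])\.{3}(?!\.)")


def add_space_before_ellipsis(text: str) -> str:
    # "replace with: space and three dots"
    return ELLIPSIS_WITHOUT_LEADING_SPACE.sub(" ...", text)


print(add_space_before_ellipsis("He went there...and left."))
```

The lookbehind and lookahead only test the neighbouring characters, so the replacement never has to put them back.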
Hi,
I have nearly the same question as a few months ago: how do I add a space after (and not before) three dots, but only when a letter follows them (not when the three dots end the sentence)?
For example:
...and he went there.
would become:
... and he went there.
Old 2nd March 2015, 13:47   #322  |  Link
Betsy25
Registered User
 
Join Date: Sep 2008
Location: Holland, Belgium
Posts: 330
Quote:
Originally Posted by Music Fan View Post
Hi,
I have nearly the same question as a few months ago: how do I add a space after (and not before) three dots, but only when a letter follows them (not when the three dots end the sentence)?
For example:
...and he went there.
would become:
... and he went there.
You can try out regexes here: https://regex101.com/

What you need is a lookahead after the match (in human terms: "check whether something specific follows what you want to match").

PS: A tip that makes these a lot easier to decipher: whatever is inside a lookaround group such as (?=somestuffhere) or (?<!somestuffhere) is always checked for, but never ends up in the final match string.

Thus, this should do it :
Find : (3 dots, only if they are directly followed by either a letter or a number)
Code:
\.{3}(?=[a-z0-9])
Replace with 3 dots + space
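
For anyone who wants to test this outside the editor, here is a small Python sketch of the same find/replace. The names are made up for illustration. One assumption to note: [a-z0-9] only lists lowercase letters, so the sketch adds `re.IGNORECASE` to also catch a capital letter after the dots; whether the search in the subtitle tool is case-sensitive depends on its own settings.

```python
import re

# Three literal dots, only if directly followed by a letter or a digit.
# The lookahead checks the next character without consuming it.
ELLIPSIS_BEFORE_TEXT = re.compile(r"\.{3}(?=[a-z0-9])", re.IGNORECASE)


def add_space_after_ellipsis(text: str) -> str:
    # "Replace with 3 dots + space"
    return ELLIPSIS_BEFORE_TEXT.sub("... ", text)


print(add_space_after_ellipsis("...and he went there."))  # ... and he went there.
print(add_space_after_ellipsis("He went there..."))       # unchanged
```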

Last edited by Betsy25; 2nd March 2015 at 14:17. Reason: some rewording.
Old 2nd March 2015, 19:18   #323  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
apparently HD DVD subtitles are not recognized and cannot be opened. could you add support for that please?

https://www.sendspace.com/file/5sije6

thanks!
__________________
Laptop Lenovo Legion 5 17IMH05: i5-10300H, 16 GB Ram, NVIDIA GTX 1650 Ti (+ Intel UHD 630), Windows 10 x64, madVR (x64), MPC-HC (x64), LAV Filter (x64), XySubfilter (x64) (K-lite codec pack)
Old 5th March 2015, 21:24   #324  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
looking at this https://github.com/SubtitleEdit/subt...a656373f305324 a question:

does this only fix the problem in this specific example mentioned here? or would this also fix similar cases, e.g. just with different words than used in the example?


another thing, would it be possible to add a kind of ignore list we can edit, consisting of certain phrases or sentences we can exclude from the hearing impaired removal?

for example, I have "oh" in my hearing impaired removal list, but I like to keep it in cases like "Oh, my god" or "Oh, dear". I always have to check the complete list of changes manually for that and untick those entries. I could add the phrase "my god" to the multiple replace list and set it to replace with "oh, my god", but sometimes there are also lines which really are just "my god", and those would then be changed wrongly to "oh, my god".

perhaps it sounds a little bit picky, but such an ignore list would save me from manually checking the entire proposed hearing impaired removal list of a subtitle file before I can apply the changes.

Also, would it be possible to add a match case box we could tick for each single entry of the hearing impaired removal list? for example, I have "er" in that list to remove things like "Er... I don't know" or "wait...er...". But then the word "ER" (emergency room) also gets removed as a false positive.
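
To make the two requests concrete, here is a rough Python sketch of how an ignore list and a per-entry match case flag could behave. This is not how Subtitle Edit implements its removal; `RemovalEntry`, `remove_interjections`, `ENTRIES` and `KEEP_PHRASES` are hypothetical names used only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class RemovalEntry:
    text: str
    match_case: bool = False  # the requested per-entry "match case" box


def remove_interjections(line, entries, keep_phrases):
    # Ignore list: leave the line untouched if it contains a protected phrase.
    lowered = line.lower()
    if any(phrase.lower() in lowered for phrase in keep_phrases):
        return line
    for entry in entries:
        flags = 0 if entry.match_case else re.IGNORECASE
        # Whole word, plus any punctuation and whitespace trailing it.
        pattern = re.compile(r"\b" + re.escape(entry.text) + r"\b[,.!?]*\s*", flags)
        line = pattern.sub("", line)
    return line.strip()


ENTRIES = [RemovalEntry("oh"), RemovalEntry("Er", True), RemovalEntry("er", True)]
KEEP_PHRASES = ["oh, my god", "oh, dear"]

print(remove_interjections("Oh, my god", ENTRIES, KEEP_PHRASES))          # kept as-is
print(remove_interjections("Take him to the ER", ENTRIES, KEEP_PHRASES))  # kept as-is
print(remove_interjections("Er... I don't know", ENTRIES, KEEP_PHRASES))  # I don't know
```

With match case ticked, "Er" and "er" are listed as separate entries, so "ER" no longer counts as a hit.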

Last edited by Thunderbolt8; 6th March 2015 at 17:34.
Old 6th March 2015, 18:23   #325  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
there seem to be some bugs with the latest update. sometimes the SHD removal or the fixes in the fix common errors menu are not applied. I can press OK, but the related entries don't get removed, and when I open the window again they are all still listed there. it doesn't seem to happen always or with all files, though.

e.g. in this one: https://www.sendspace.com/file/8q11ky
Old 6th March 2015, 22:29   #326  |  Link
Music Fan
Registered User
 
Join Date: May 2009
Location: Belgium
Posts: 1,744
Quote:
Originally Posted by Betsy25 View Post
Thus, this should do it :
Find : (3 dots, only if they are directly followed by either a letter or a number)
Code:
\.{3}(?=[a-z0-9])
Replace with 3 dots + space
Thanks, this works well

Quote:
Originally Posted by Thunderbolt8 View Post
apparently HD DVD subtitles are not recognized and cannot be opened. could you add support for that please?
I already asked it, but IIRC Nikse said it was too much work, and as BDSup2Sub supports this conversion, it's less useful to add it in SE ;
http://forum.doom9.org/showthread.ph...07#post1683707
But you can use SE after BDSup2Sub's conversion (in Blu-ray sup) if you need OCR.
Old 6th March 2015, 23:02   #327  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
Quote:
Originally Posted by Thunderbolt8 View Post
there seem to be some bugs with the latest update. sometimes the SHD removal or the fixes in the fix common errors menu are not applied. I can press OK, but the related entries don't get removed, and when I open the window again they are all still listed there. it doesn't seem to happen always or with all files, though.

e.g. in this one: https://www.sendspace.com/file/8q11ky
it seems the removal of lines consisting only of SHD information, like stuff in () or [], or lines consisting just of "aahh", "uh-huh" etc. is broken. those lines don't get removed, no matter how often I click it.
Old 7th March 2015, 11:55   #328  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
there's still a bug in the SHD removal process despite the latest fixes. a line with SHD information and regular speech, e.g. "[coughs] Well, that it was I mean", now gets entirely removed instead of just the SHD part in the brackets. It's shown correctly in the SHD removal preview window though, just not done correctly during the process.
Old 7th March 2015, 12:19   #329  |  Link
Nikse555
Registered User
 
Join Date: Feb 2004
Location: Mars
Posts: 428
@Thunderbolt8: thx for the bug reports
Latest beta (http://www.nikse.dk/SubtitleEdit.zip - 3.4.5 build 431) seems to work, right?

Edit: He, found the bug... latest beta (http://www.nikse.dk/SubtitleEdit.zip - 3.4.5 build 436) seems to work, right?

Last edited by Nikse555; 7th March 2015 at 12:52.
Old 7th March 2015, 15:19   #330  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
yes, seems like it. thanks for the fix! If I find more, I'll report back.
Old 7th March 2015, 22:24   #331  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
a special case for SHD removal: maybe you could add an exception to the "remove text before a colon (':') only if text is UPPERCASE" scenario for certain names like McWHATEVER. even though the letter c in the name is not uppercase, the name is basically meant to be.
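
A rough Python sketch of what such an exception could look like. It is only an illustration under my own assumptions, not Subtitle Edit's actual rule: the helper names are hypothetical, and only the "Mc" prefix from the example is handled.

```python
import re

# Hypothetical exception: "Mc" directly followed by a capital letter
# is treated as if it were written "MC".
MC_PREFIX = re.compile(r"\bMc(?=[A-Z])")


def is_uppercase_speaker(name: str) -> bool:
    normalized = MC_PREFIX.sub("MC", name)
    has_letters = any(ch.isalpha() for ch in normalized)
    return has_letters and normalized == normalized.upper()


def remove_uppercase_speaker(line: str) -> str:
    # Drop the text before the first colon only if it counts as UPPERCASE.
    speaker, colon, rest = line.partition(":")
    if colon and is_uppercase_speaker(speaker):
        return rest.strip()
    return line


print(remove_uppercase_speaker("McWHATEVER: Hello there."))  # Hello there.
print(remove_uppercase_speaker("He said: go away."))         # unchanged
```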

Last edited by Thunderbolt8; 7th March 2015 at 22:28.
Old 8th March 2015, 23:32   #332  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
could you please make the hearing impaired removal window (ctrl+shift+h) keep the size it was last resized to? This works for the Fix common errors window (ctrl+shift+F), but not for the first one.


and some small SHD fix:

Which one do you want to be?|Uh--

gets changed to

Which one do you want to be?--

afaik this already works if there is only one hyphen "-", and maybe also with the longer special hyphen character, but apparently not yet with double hyphens "--".
Also, please add the double hyphen "--" to all the other hyphen-related checks and corrections that already handle the single hyphen and the special hyphen character, if it is not included there yet. e.g. in the fix common errors section: afaik "fix first letter to uppercase after paragraph" detects a single hyphen (maybe also the special hyphen character) at the end of a line, with no hyphen following in the next line, and correctly does not suggest capitalizing the first letter of that next line (because the sentence is supposed to continue). that doesn't work with a double hyphen, though.
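
As a sketch of the capitalization check described above (hypothetical names, and assuming the "special hyphen character" means an en dash or em dash), one pattern could cover all the hyphen variants at the end of a line:

```python
import re

# One or two plain hyphens, or an en dash / em dash, at the end of the line.
CONTINUATION_END = re.compile(r"(?:-{1,2}|[\u2013\u2014])\s*$")


def should_capitalize_next(previous_line: str) -> bool:
    # A trailing hyphen of any kind means the sentence continues,
    # so the next line should not be capitalized.
    return CONTINUATION_END.search(previous_line) is None


print(should_capitalize_next("I was going to--"))  # False
print(should_capitalize_next("I was there."))      # True
```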


some more:

- Mr. Harding?|-Mm-hm. Oh.

gets changed to

- Mr. Harding?



and similarly:

Oh.|-I'm awfully tired.

gets changed to

-I'm awfully tired.



And:

-Sit down. Sit down.|-Oh! Oh!

gets changed to

- Sit down. Sit down.!

Last edited by Thunderbolt8; 9th March 2015 at 01:26.
Old 9th March 2015, 01:19   #333  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
another kind of problem/bug: when there is an unneeded period, e.g. "!.", then all following lines, or the entire subtitle file, with normal periods are wrongly detected as having unneeded periods as well, although there's nothing wrong with those lines.

here's an example: https://www.sendspace.com/file/zv9znz

the unneeded period is in line 79, and the problem occurs after that (there's also a 2nd one in line 180)




also, in .ass subs there is no missing space in the case of }", because the } is part of the line's on-screen position information, which means that the " is actually the first real character of the line's speech text. e.g. as in
{\an4\pos(717,888)}"How cool,

so no missing space should be detected in the fix common errors section in such a case.
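
One way to picture the fix: strip the {...} override blocks first and only then run the missing space check on what is left. A minimal Python sketch with hypothetical names, not the program's actual code:

```python
import re

# An ASS override block: everything from "{" up to the next "}".
ASS_OVERRIDE_BLOCK = re.compile(r"\{[^}]*\}")


def visible_text(line: str) -> str:
    # Return only the text that is actually shown on screen.
    return ASS_OVERRIDE_BLOCK.sub("", line)


print(visible_text(r'{\an4\pos(717,888)}"How cool,'))  # "How cool,
```

After stripping, the quote mark is the first character of the line, so there is nothing in front of it that could need a space.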

Last edited by Thunderbolt8; 9th March 2015 at 11:24.
Old 9th March 2015, 21:35   #334  |  Link
Nikse555
Registered User
 
Join Date: Feb 2004
Location: Mars
Posts: 428
Quote:
Originally Posted by Thunderbolt8 View Post
another kind of problem/bug: when there is an unneeded period, e.g. "!." then after this instance all following lines or the entire subtitle file...
Nice catch

thx for the bug reports - new portable beta up: http://www.nikse.dk/SubtitleEdit.zip (also on GitHub)
Old 9th March 2015, 23:24   #335  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
thanks for the quick fixes!
Old 9th March 2015, 23:35   #336  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
another thing: when OCRing a subtitle file, it can happen that the baseline of dots, commas and exclamation marks is off, meaning they are then put into a separate line, apart from the rest of the speech line.

e.g.

You forgot "cursed"|.

gets correctly changed to

You forgot "cursed".


in the above case with a dot "." this gets detected in the hearing impaired window and the line break is correctly removed. however, that isn't the case for commas or exclamation marks which are off in the same way. it's not exactly something that belongs to SHD removal, but it would be nice if this could get detected & fixed as well, either in the SHD removal window or in the fix common errors one.

another version of this btw. is when these characters are put into a separate line before the line they belong to:

e.g.

?|How was your day

gets changed to

How was your day ---> should be: How was your day?

(the way this is handled atm is simply to remove the question mark and the line break instead of repositioning the question mark; in case of commas afaik nothing is detected atm; not sure about exclamation marks).
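
Both variants could be handled by one small pass over the lines of a subtitle, sketched here in Python. This is only an illustration with hypothetical names, assuming the "|" in the examples stands for a line break and that a line made up solely of . , ! ? is an OCR orphan.

```python
ORPHAN_MARKS = set(".,!?")


def reattach_punctuation(lines):
    result = []
    pending = ""  # punctuation seen before any text line
    for line in lines:
        stripped = line.strip()
        if stripped and all(ch in ORPHAN_MARKS for ch in stripped):
            if result:
                # Orphan below its text: glue it onto the previous line.
                result[-1] = result[-1].rstrip() + stripped
            else:
                # Orphan above its text: keep it for the next text line.
                pending += stripped
        else:
            result.append(line.rstrip() + pending)
            pending = ""
    if pending:
        result.append(pending)
    return result


print(reattach_punctuation(['You forgot "cursed"', "."]))
print(reattach_punctuation(["?", "How was your day"]))
```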


referring to the above, if there is such a misplaced dot then the SHD removal can get a bit screwed up as well:

- and I'll speak to you later|.|- [ Anna] OK.

gets changed to

- and I'll speak to you later .|- - OK. (the space between the dot and the last word gets inserted there for some reason, even though this wouldnt be the case if the 3rd line wasnt there, as seen in the first example at the beginning)


it works correctly though if the dot is placed where it belongs, so not sure if "fixing" this is maybe taking it too far.

Last edited by Thunderbolt8; 9th March 2015 at 23:49.
Old 7th April 2015, 12:49   #337  |  Link
JayJayH
Registered User
 
Join Date: Feb 2015
Posts: 2
Displaying final subtitles "on screen" during editing (SE 3.3.15)

This is my first post, so please bear with me.
I currently use SubtitleEdit 3.3.15 to edit .SSA subtitles and their timings. On occasion I need two subtitles to be displayed on screen at the same time, for example one in red at the top of the screen (maybe to display a cell phone's text message) while another line at the bottom of the screen, in green, contains what is being said. The attached .jpg best illustrates what I mean. Note that after I make changes, I save (ctrl+S) the .SSA file in SE and the corrected results are displayed.
I have been unable to achieve the same result in any version of SE later than 3.3.15 no matter what I do - even if I change nothing.
I'm using Win 7 with latest updates.
Any ideas anyone, please?
Attached Images
 
Old 10th April 2015, 19:59   #338  |  Link
jpsdr
Registered User
 
Join Date: Oct 2002
Location: France
Posts: 2,316
I've tried to use Subtitle Edit 3.4.5 to OCR an idx/sub file with the image compare method. Almost every time there are 2 lines of text, it asks me for a character but treats letters from both lines as one character. Is there a parameter to set to avoid this issue?
Old 12th April 2015, 20:19   #339  |  Link
Thunderbolt8
Registered User
 
Join Date: Sep 2006
Posts: 2,197
some more SHD removal fixes:

WOMAN: <i>Mr. Sportello?</i>|- Mm-hm.

gets changed to:

<i>- Mr. Sportello?</i>

but should get changed to: <i>Mr. Sportello?</i>

--> the WOMAN: speaker information part is not taken into consideration for SHD removal in combination with the hyphen from the speech part after the line break.
Old 14th April 2015, 14:30   #340  |  Link
minhjirachi
Registered User
 
Join Date: Sep 2012
Posts: 110
Still using version 3.4.2. The newest version has a little bug when exporting as a .sup file or bdn/xml. When exporting as .sup, the program closes suddenly. And with the bdn/xml, I can't import into Scenarist.