View Full Version : Subtitle Edit
Janusz
30th October 2021, 15:42
@Nikse
Thanks for the fix. It works.
@darksen has to find a bug in his script that causes it to hang.
tormento
30th October 2021, 19:09
I assume you have "Settings/Tools/Fix common OCR errors - also use hard-coded rules" enabled.
I have this option turned off so that the rules hidden under it do not change the text corrected according to my rules.
@Nikse555 could you expose the hidden OCR rules?
Nikse555
31st October 2021, 17:43
@tormento:
I think the hard coded rules should probably be moved to the OCR fix replace list... at some point.
I did a small test and mostly got stuff about periods (right part is with hard coded rules):
. ..is a meat by-product. <-> ...is a meat by-product.
How did you.. .? <-> How did you...?
The code is here: https://github.com/SubtitleEdit/subtitleedit/blob/master/src/ui/Logic/Ocr/OcrFixEngine.cs#L999
For now I guess you should disable the hard-coded rules, and add something for the periods.
@Janusz: Did you add some rules to handle periods?
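For readers who disable the hard-coded rules and want "something for the periods", here is a minimal sketch of such a rule in Python. It is not SE's code; the function name and the pattern are assumptions made for illustration, and it only covers the two cases shown in the test output above (three periods with a stray space between them).

```python
import re

# Three periods with at most one stray space between neighbours,
# e.g. ". ..", ".. ." or ". . ." (hypothetical rule, not taken from SE).
_SPLIT_ELLIPSIS = re.compile(r"\.[ ]?\.[ ]?\.")


def fix_split_ellipsis(text: str) -> str:
    """Collapse a broken three-period ellipsis into a plain '...'."""
    return _SPLIT_ELLIPSIS.sub("...", text)


if __name__ == "__main__":
    print(fix_split_ellipsis(". ..is a meat by-product."))
    print(fix_split_ellipsis("How did you.. .?"))
```

A comparable regular expression could probably be added to a custom replace list instead, but the exact syntax accepted there should be checked against SE itself.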
Janusz
1st November 2021, 03:24
@Janusz: Did you add some rules to handle periods?
@Nikse
Yes. A few more rules that were missing when I turned off "hard-coded rules", for example removing spaces but only between "1" and the next digit, setting correct entries for: , . ; : ! ? <i> - .
Since my character base does not contain an "I", I had to add the replacement of "l" with "I".
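The rule "remove spaces, but only between 1 and the next digit" can be sketched with a lookbehind and a lookahead. This is an illustrative Python version, not the rule actually used in the replace list; the function name is hypothetical.

```python
import re

# One or more spaces that directly follow the digit "1" and directly
# precede another digit. Spaces after other digits are left alone.
_SPACE_AFTER_ONE = re.compile(r"(?<=1) +(?=\d)")


def join_one_with_next_digit(text: str) -> str:
    """Remove the OCR-inserted gap between '1' and the following digit."""
    return _SPACE_AFTER_ONE.sub("", text)


if __name__ == "__main__":
    print(join_one_with_next_digit("See you at 1 0:30"))
    print(join_one_with_next_digit("Room 2 5"))
```

Restricting the rule to "1" keeps it from gluing together numbers that are meant to be separate, which a general "remove spaces between digits" rule would do.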
@tormento
Here you have the test files: (https://www.mediafire.com/file/oijczirp4fjh0il/tormento.test.zip/file)
ita_OCRFixReplaceList.xml, test_8.20.237.100e.nocr with character base (contains "l" and "I") - options for ocr set by name: No of ... 8, Max wr ... 20, threshold ... 237
From test.txt, test.srt I created test.sup, from which I got test_ocr.srt. In my opinion everything works as it should, even with the "hard coded rules" option turned on.
tormento
1st November 2021, 10:58
I think the hard coded rules should probably be moved to the OCR fix replace list... at some point.
YES, please.
Plus, as I mentioned some time ago, it would be really helpful to have an additional "common" name list button in OCR, so that a name does not have to be added multiple times when you OCR multiple languages. I usually OCR the original language + Italian, and the proper names are the same in all languages, i.e. Luke is always Luke and so on.
Janusz
1st November 2021, 11:39
@tormento
Words or expressions added to names.xml are checked regardless of the language used.
Add a word or phrase directly to the file by editing, or use the "Name list manager" plug-in in SE.
tormento
1st November 2021, 11:51
@tormento
Words or expressions added to names.xml are checked regardless of the language used.
Use [Word lists], switch to English, add a new word or phrase. From now on you will have the word added in your Italian and I will have the Polish dictionary.
I know the existence of that file but it would be really uncomfortable to exit SE every time I find a name, manually edit the file, run SE again and go on like that. A button inside the OCR would be much better.
Janusz
1st November 2021, 12:45
You don't need to turn off the program during ocr etc.
Just stop OCR, add a word to the file, change the currently used dictionary to another or "none", then return to your dictionary; the necessary, already corrected dictionary files will then be read again. This is definitely not a comfortable solution - an extra button to add a word to names.xml would be better. At least today there is no other option. What helps is that the words added to ..._names_user.xml are at the end and are not sorted, so it's easy to find and transfer them to names.xml
darksen
2nd November 2021, 03:08
@Janusz: thx, slightly improved in beta 234: https://github.com/SubtitleEdit/subtitleedit/releases/download/3.6.2/SubtitleEditBeta.zip
@Nikse
Thanks for the fix. It works.
@darksen has to find a bug in his script that causes it to hang.
Thanks both, I found where the problem with the regex was: it was searching nonstop :D
tormento
3rd November 2021, 15:14
You don't need to turn off the program during ocr etc.
Ease of use is always preferred.
tormento
3rd November 2021, 15:15
Here you have the test files
Unfortunately I use OCR and not nOCR.
von Suppé
4th November 2021, 11:21
Hi Nikse
Getting more & more into the look and feel of subtitles, I find myself struggling in the ASS styles window to determine how the subtitles will actually turn out.
When trying to choose the proper font & settings, I often find myself in need of a WYSIWYG preview. Would you be able to implement such?
Also, I'd more than welcome a window where a custom preview text can be typed in. I find that "real-life" words and sentences look different from the current fixed "ABC... ...123" sample.
Any chance?
cheers
Emulgator
4th November 2021, 12:00
That's why I still mainly work in Aegisub, but SE is happily coming more and more into my workflow.
von Suppé
4th November 2021, 12:51
Yes, I also use Aegisub a lot and often I want to import the files into SE. When already being busy in SE however, more than once I need to add a style for certain occasions. And especially these times I miss my wish-list.
Implementation would provide a significantly quicker way of choosing the right font & settings on the fly, compared to going back to Aegisub or other SE preview windows.
[EDIT] I also maybe found a small bug: in text editor window, "rightclick --> Selected lines --> Save selected lines as" works only when three or more lines are selected. I am not lazy, but just sayin'...:D
Nikse555
4th November 2021, 20:27
@von Suppé: The ASSA style window already has a preview - if you use mpv as a video player, the preview will be generated with mpv (which uses libass), which should be pretty WYSIWYG.
And you can right-click in the preview to change the preview text :)
https://www.nikse.dk/se-assa-styles.png
The ASSA support in SE is slowly improving - check the plugin "ASSA Draw" and a few other tools:
http://nikse.dk/se-assa-tools.png
von Suppé
4th November 2021, 23:25
I did not know this, Nikse.
Thanks, I'll go check it out.
Another thing: When checking "Underline" I experience the images of exported PNG-XML or SUP having no underlined text. Is this by design and is it something reserved for ASS only?
Nikse555
5th November 2021, 07:42
I guess "underline" is not supported... is that something you use?
von Suppé
5th November 2021, 09:59
Hi Nikse
The custom preview text was new to me, works like a charm! Simply didn't know.
I have set mpv as player. As for the preview being WYSIWYG, your screenshot compared to mine will tell you. Please have a look at both.
https://i.ibb.co/qN4z1Gd/ASS-preview.png (https://imgbb.com/)
We have both set font Arial and size 47,0. Your text has been scaled smaller than mine. Also in my image the bottom offset doesn't come near real life representation. I think it has to do with how much the total "ASS styles"-window is stretched, and/or against what the video-resolution of the background the renderer "thinks" it is.
You can imagine when scaling is not right, it's hard to determine how big the subs will really come out. Of course you can compare with other fonts, but this will only tell you size ratio between fonts themselves. But not how big they'll be in the video.
So, I think the preview needs adjustable background-video settings to be able to scale properly. Now it looks as if a default 4:3 background has been set.
Thinking ahead of this, the option of manual setting of video-resolution comes into play. For instance, SUPs - as used in UHD-BD - are still being authored in 1920x1080 screensize. When the preview would auto-set the resolution to the imported 3840x2160 video you're working for, things would go wrong.
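To illustrate why the assumed background resolution matters for the preview, here is a small Python sketch. It assumes the renderer scales style sizes by the ratio of the frame height to the script's PlayResY, which is how ASS renderers generally treat script resolution; the function name is hypothetical and this is not how SE's preview is confirmed to work.

```python
def scaled_size(style_value: float, play_res_y: int, frame_height: int) -> float:
    """Scale a style value (font size, vertical margin) from script
    resolution to the height of the frame it is rendered on.

    Assumption: scaling factor = frame_height / play_res_y.
    """
    if play_res_y <= 0:
        raise ValueError("play_res_y must be positive")
    return style_value * frame_height / play_res_y


if __name__ == "__main__":
    # The same font size 47 on three differently sized backgrounds.
    for height in (540, 1080, 2160):
        print(height, scaled_size(47.0, 1080, height))
```

Under that assumption, a style authored against 1080 lines but previewed on a differently sized background would appear larger or smaller than in the final video, which would match the mismatch described above.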
Sorry it's a bit verbose, but I couldn't explain this in a shorter way.
As for the underlining, it's the first time I indeed use this for image based subtitles. I don't see myself using it often in the future.
Fortunately it concerned just three images, so for this time a basic image editor came to the rescue.
Still have to take a look at the ASS tools.
For now, thanks!
[EDIT] Did I find a bug? When exporting ASS to xml-png or sup, I miss both shadow and outline in the images
darksen
7th November 2021, 19:01
Just noticed that after spell checking with Word lots of lines have +/- 1ms, I noticed this because I ran the spell check and after not changing any text the window title had an asterisk at the end of the filename so I did a ctrl+z and noticed that.
As you can see here all I redo and undo is Word checking:
https://i.imgur.com/5oyJVLl.png
Nikse555
7th November 2021, 19:20
@von Suppé: I'm not really sure about the preview... perhaps it requires FFmpeg too... otherwise a preview.mkv file is used (located in the SE data folder - press ctrl+alt+shift+d to open this folder in SE).
The SE export feature is really simple, only supports very simple stuff like simple colors and italic.
@darksen: I could not re-create this... tried on several subtitles. Can you re-create this? What steps do I need to re-create this?
Please test in latest beta (SE 3.6.3 is really close): https://github.com/SubtitleEdit/subtitleedit/releases/download/3.6.2/SubtitleEditBeta.zip
darksen
7th November 2021, 22:05
What I did to have this happen (as it just happened again with another sub) is load the sub, make some adjustments with multiple replace and Fix common errors and then go to spell check with Word, change the language to my own (for some reason it always auto loads English), get some errors (they are names so I ignore them) and then press OK button.
I have tried again with the same sub with which this just happened minutes ago but couldn't recreate this by just opening the sub and going to spell check directly. Maybe it has to do with these subs coming from OCR (I've previously loaded a SUP file).
Edit before sending the reply: Yes, if I OCR a SUP file and then directly spell check it with Word, this problem happens. Just tried with a SUP file: I stopped the OCR after just a few lines, clicked OK to load SE and used spell check with Word.
Just tried the latest beta and it is still happening.
darksen
8th November 2021, 06:33
Also it seems there is some memory problem here, I have left SE sit doing nothing for 8 hours after OCRing a SUP file (I haven't clicked OK to load SE) and it is now using 14GB of RAM.
https://i.imgur.com/PtUeLov.png
Janusz
10th November 2021, 08:25
@Niksee
Please correct the Polish translation for the program:
https://forum.doom9.org/attachment.php?attachmentid=17940&stc=1&d=1636528009
and the layout of elements in the "Blu-ray (.sup) subtitle file for edit ..." window
This is the window in the English version:
https://forum.doom9.org/attachment.php?attachmentid=17941&stc=1&d=1636528180
and so in the Polish version:
https://forum.doom9.org/attachment.php?attachmentid=17942&stc=1&d=1636528155
Here for:
"Czas rozpoczęcia" you can use "Czas rozp."
"Czas zakończenia" you can use "Czas zak."
"rozp." and "zak." are abbreviations commonly used.
Nikse555
10th November 2021, 11:31
@darksen: I could not re-create either issue - the first probably because I do not have Word - it could also depend on your default subtitle format. As for the second issue, I left SE with waveform using the mpv video player for a day, and it still used the same amount of memory - but it could depend on the video player used in SE. After OCR a lot of memory might be in use...
@Janusz: thx - updated :)
Janusz
10th November 2021, 12:26
@Janusz: thx - updated :)
What about changing "Ogółne" to "Ogólne"?
Nikse555
10th November 2021, 22:33
What about changing "Ogółne" to "Ogólne"?
Sure, updated here: https://github.com/SubtitleEdit/subtitleedit/commit/13b08db6da865135e37da838cb8b81d51b08f5db
Also, probably the final beta of SE 3.6.3 is here: https://github.com/SubtitleEdit/subtitleedit/releases/download/3.6.2/SubtitleEditBeta.zip
Janusz
10th November 2021, 23:46
Yes, the newest beta 359 version already contains the corrected pl-PL.xml file.
Thank you very much.
Janusz
12th November 2021, 09:35
@niksee, Thanks for the new version.
It applies to the stable version SE 3.6.3.
While the calculation of times for subtitles is correct after using "Set start and offset the rest (F9)", the update of the display of these times takes place only from the next line after the selected one.
How to reproduce it:
Method 1. Load subtitles, add a video file, select any subtitle line, select "Set start and offset the rest" or press (F9).
Method 2. Load subtitles, add a video file, select any subtitle line, select any place on the video preview bar, select "Set start and offset the rest" or press (F9).
In both cases, the updated time is displayed only from the next line after the selected one.
Since the problem does not occur with computing subtitle times, I haven't noticed this before. This bug has appeared since beta 296.
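The expected behaviour of "Set start and offset the rest" as described here can be sketched as follows. This is a Python illustration with hypothetical names, not SE's implementation: the selected line itself and every following line shift by the same delta, so a display refresh would need to start at the selected line, not the one after it.

```python
from dataclasses import dataclass


@dataclass
class Line:
    start_ms: int
    end_ms: int


def set_start_and_offset_rest(lines: list[Line], index: int, new_start_ms: int) -> int:
    """Move lines[index] to new_start_ms and shift all later lines equally.

    Returns the applied offset in milliseconds.
    """
    delta = new_start_ms - lines[index].start_ms
    # The slice starts at the selected line, not at the line after it.
    for line in lines[index:]:
        line.start_ms += delta
        line.end_ms += delta
    return delta


if __name__ == "__main__":
    subs = [Line(0, 1000), Line(2000, 3000), Line(4000, 5000)]
    print(set_start_and_offset_rest(subs, 1, 2500))
    print(subs)
```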
Master Yoda
12th November 2021, 15:05
Just installed 3.6.3 and the issue I posted here (https://forum.doom9.org/showpost.php?p=1942909&postcount=1373) is still present. I don't think it's the way SE is outputting the new .sup or png files, I think it's how SE is reading the original .sup.
Instead of importing the original .sup for edit, I imported the .sup for OCR and you can see the broken/rough edges in the preview window.
https://i.imgur.com/UTaREkh.png
Nikse555
12th November 2021, 20:22
@niksee, Thanks for the new version.
It applies to the stable version SE 3.6.3.
While the calculation of times for subtitles is correct after using "Set start and offset the rest (F9)", the update of the display of these times takes place only from the next line after the selected one.
How to reproduce it:
Method 1. Load subtitles, add a video file, select any subtitle line, select "Set start and offset the rest" or press (F9).
Method 2. Load subtitles, add a video file, select any subtitle line, select any place on the video preview bar, select "Set start and offset the rest" or press (F9).
In both cases, the updated time is displayed only from the next line after the selected one.
Since the problem does not occur with computing subtitle times, I haven't noticed this before. This bug has appeared since beta 296.
thx :)
How is this new beta: https://github.com/SubtitleEdit/subtitleedit/releases/download/3.6.3/SubtitleEditBeta.zip ?
tormento
14th November 2021, 14:35
How is this new beta?
I have some problems with OCR by binary image compare with this (https://www.mediafire.com/file/uexa1p51p5ud0t7/WolfChildren.rar/file) subtitle.
Regardless of the text position (I will fix that later with ASS codes), some letters, such as C, O, S and U, are wrongly recognized as capital letters. Yet when I open "Inspect matches for current image" and scroll to the wrong ones, the text associated with the image is actually correct.
I have tried to manually update the characters, deleting Latin.db and starting from scratch but nothing seems to work.
Can you please check it?
Nikse555
14th November 2021, 16:01
Just installed 3.6.3 and the issue I posted here (https://forum.doom9.org/showpost.php?p=1942909&postcount=1373) is still present. I don't think it's the way SE is outputting the new .sup or png files, I think it's how SE is reading the original .sup.
Instead of importing the original .sup for edit, I imported the .sup for OCR and you can see the broken/rough edges in the preview window.
The OCR preview is a little rough... check File - Import - Blu-ray sup for edit to see a better preview (or save the image and open it in paint.net or Photoshop).
I've fixed an issue regarding VLC recently - perhaps it is this issue you got? https://github.com/SubtitleEdit/subtitleedit/issues/5352
Nikse555
14th November 2021, 16:04
I have some problems with OCR by binary image compare with this (https://www.mediafire.com/file/uexa1p51p5ud0t7/WolfChildren.rar/file) subtitle.
Regardless of the text position (I will fix that later with ASS codes), some letters, such as C, O, S and U, are wrongly recognized as capital letters. Yet when I open "Inspect matches for current image" and scroll to the wrong ones, the text associated with the image is actually correct.
I have tried to manually update the characters, deleting Latin.db and starting from scratch but nothing seems to work.
Can you please check it?
I turned off the "NOCR fallback" and OCR'ed the entire subtitle (minus the two/three last lines which are a bit weird).
No problems here...
tormento
15th November 2021, 08:09
I turned off the "NOCR fallback" and OCR'ed the entire subtitle (minus the two/three last lines which are a bit weird).
No problems here...
I even started with a fresh install of SE but couldn't get as good a result as yours. Please attach your settings.xml file.
Nikse555
15th November 2021, 16:36
I even started with a fresh install of SE but couldn't get as good a result as yours. Please attach your settings.xml file.
Yeah, actually also got some casing problems... I think it's just easier to start a new ocr db, like this: https://www.nikse.dk/BinOcrDbWolf.zip
Janusz
15th November 2021, 16:38
I even started with a fresh install of SE but couldn't get as good a result as yours. Please attach your settings.xml file.
If I may offer some advice:
In my opinion, the problem arises in the first three lines. SE does not find the correct height for the strings there, hence the later substitution of lowercase for uppercase or vice versa. It was like that for me too. If I do OCR (Binary image compare) from the fourth line, the phenomenon does not occur, and the text needs only a few corrections in the character base for the entire text to look flawless using just the Italian dictionary. The italic text also introduces some restrictions on spaces, hence a few errors of joined or split words (between "L" and "'": 5 times, where there should be no space - that's just from a quick look).
I turned off the "NOCR fallback" and OCR'ed the entire subtitle (minus the two/three last lines which are a bit weird).
No problems here...
In the sup file I replaced the contents of 2, 3 and the penultimate line, deleted the last empty one and got it. :) (https://www.mediafire.com/file/qx3r0vol8moqewi/tormento.wolf_PID_1200_ita.1.zip/file)
Edit:
A few notes on ocr with my character base created for this file:
1. Settings / Tools / Fix common OCR errors - enabled - fixes OCR errors:
- fixes double apostrophes ('') to a single quotation mark (") (5), but also
- turns double apostrophes + . (''.) into an ellipsis (...) (2).
Conclusion: it is better to leave the option off, because double apostrophes are easy to find and repair during OCR and afterwards.
2. In the default file ita_OCRFixReplaceList.xml, the line <Word from="l" to="I" /> is problematic and generates OCR errors that need to be manually corrected (1). I did not use this file at all.
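One plausible reason the l-to-I word rule causes trouble in Italian (an assumption, the post does not say why): a whole-word replacement also hits the elided article "l'" when the apostrophe counts as a word boundary. The Python sketch below mimics such a rule with a regex; it is not SE's matching logic and the function name is hypothetical.

```python
import re

# Whole-word "l" -> "I", mimicking a word-level replace rule.
# \b treats the apostrophe as a boundary, so "l'" is matched too.
_LONE_L = re.compile(r"\bl\b")


def replace_lone_l(text: str) -> str:
    """Replace a stand-alone lowercase 'l' with an uppercase 'I'."""
    return _LONE_L.sub("I", text)


if __name__ == "__main__":
    print(replace_lone_l("Ho visto l'amico di Luke"))
    print(replace_lone_l("la lingua"))
```

Under this assumption the rule helps in English, where a lone "l" is almost always a misread "I", but damages correct Italian text.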
Master Yoda
16th November 2021, 13:54
The OCR preview is a little rough... check File - Import - Blu-ray sup for edit to see a better preview (or save the image and open it in paint.net or Photoshop).
I've fixed an issue regarding VLC recently - perhaps it is this issue you got? https://github.com/SubtitleEdit/subtitleedit/issues/5352
Opened the sup by going File - Blu-ray sup for edit, selected one of the lines giving issues and export to png. This is what the exported png looks like in photoshop
https://i.imgur.com/s55jGvH.jpg
Nikse555
17th November 2021, 19:44
Opened the sup by going File - Blu-ray sup for edit, selected one of the lines giving issues and export to png. This is what the exported png looks like in photoshop
OK, what bd export settings do you use?
Here is my test - the result is with nice alpha blending:
https://www.nikse.dk/se-bd-export.png
Master Yoda
18th November 2021, 13:49
Just to clarify, I'm not trying to create/export a .sup from srt or another format. I have a .sup which I open by going file-import-Blu-ray (.sup) subtitle file for edit.
Unless I missed something, I didn't see any BD export settings under options.
On this screen, after adjusting times I select save as and it just asks where to save to, no options. If I click on export image, again just asks where to save to, no options.
You can also see in the preview window the edges aren't smooth.
https://i.imgur.com/7N1j5uX.jpg
Janusz
18th November 2021, 17:57
... I have a .sup which I open by going file-import-Blu-ray (.sup) subtitle file for edit.
Or maybe they are animated subtitles and what we see is the first or the last step of this animation?
I suspect that during the export, SE does not transfer the animation, hence the differences in the perception of subtitles when watching the movie and during editing, when we see only one step from the animation.
@Niksee
Is my reasoning correct?
Nikse555
18th November 2021, 20:25
@Janusz: Ah, yes. SE is not showing all images when detecting fade in/out - only the "middle image" which is normally preferred.
@Master Yoda: Could you possibly upload/email the sup somewhere?
Master Yoda
19th November 2021, 17:12
@Nikse555 Uploaded to mediafire. Here (https://www.mediafire.com/file/s4byw4i8jmkmwyl/original_01.sup/file) is the sup file.
Not every line has an issue, some are ok, some have that rough broken edge.
Nikse555
19th November 2021, 21:37
@Nikse555 Uploaded to mediafire. Here (https://www.mediafire.com/file/s4byw4i8jmkmwyl/original_01.sup/file) is the sup file.
Not every line has an issue, some are ok, some have that rough broken edge.
thx for the file :)
Seems to be some palette issue - how is this?
https://github.com/SubtitleEdit/subtitleedit/releases/download/3.6.3/SubtitleEditBeta.zip
I don't really have the bdsup specs, only some old Java code from BdSub2Sup - the fix in the above beta is to always use the last palette, which seems to work for sub #88... it might be correct, or it might not. If anybody knows, feel free to join in :)
Master Yoda
20th November 2021, 14:28
thx for the file :)
Seems to be some palette issue - how is this?
https://github.com/SubtitleEdit/subtitleedit/releases/download/3.6.3/SubtitleEditBeta.zip
I don't really have the bdsup specs, only some old Java code from BdSub2Sup - the fix in the above beta is to always use the last palette, which seems to work for sub #88... it might be correct, or it might not. If anybody knows, feel free to join in :)
Did a quick check of the first 5 minutes and it looks like what you have done has fixed the issue.
I'll do a more thorough check and then reply back once I have.
Nikse555
20th November 2021, 16:53
Did a quick check of the first 5 minutes and it looks like what you have done has fixed the issue.
I'll do a more thorough check and then reply back once I have.
Cool, thx for sharp eyes/testing :)
Master Yoda
21st November 2021, 14:46
@Nikse555
The change you made in the beta seems to have fixed the problem.
Looked through this .sup and all the lines I checked that had the problem now look ok.
Checked another .sup from a different show, which has the same colour and font and also had the same issue in 3.6.3, but it was ok in the beta.
VAMET
24th November 2021, 02:14
Dear Friends
I have a 23.976 fps movie, but without subtitles in my native language. I have found some, but for a 29.97 fps version of the movie. Is there any possibility to change the .srt from 29.97 to 23.976? Will it match my movie?
PS. There is a way to use English subtitles and translate them to my language, but this is the last resort.
Thank you in advance for your help and support.
Sincerely
Nikse555
24th November 2021, 07:33
@Nikse555
The change you made in the beta seems to have fixed the problem.
Looked through this .sup and all the lines I checked that had the problem now look ok.
Checked another .sup from a different show, which has the same colour and font and also had the same issue in 3.6.3, but it was ok in the beta.
Thx again :)
I've also checked a few difficult sups and they are still fine. New version soon...
Nikse555
24th November 2021, 07:40
Dear Friends
I have a 23.976 fps movie, but without subtitles in my native language. I have found some, but for a 29.97 fps version of the movie. Is there any possibility to change the .srt from 29.97 to 23.976? Will it match my movie?
PS. There is a way to use English subtitles and translate them to my language, but this is the last resort.
Thank you in advance for your help and support.
Sincerely
You can try the "Sync" menu item "Change frame rate", but "Sync" - "Visual sync" might be better as it handles all frame rate issues + start offset.
Still, the video might have different scene cuts and extra/deleted scenes, so it's not always easy.
To translate you can use the "Auto-translate - Auto-translate (Ctrl+shift+G)" menu item. The results will not be perfect and the amount of text you can translate per day is limited.
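As a rough illustration of what a frame rate change does to .srt timestamps, here is a Python sketch. It assumes the conversion is a pure speed change, so every time is multiplied by from_fps / to_fps; whether that matches a given 29.97 release depends on how it was produced, which is why visual sync may work better. All function names are hypothetical.

```python
import re

_TIME = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def to_ms(timestamp: str) -> int:
    """Parse an SRT timestamp 'HH:MM:SS,mmm' into milliseconds."""
    match = _TIME.fullmatch(timestamp)
    if match is None:
        raise ValueError(f"not an SRT timestamp: {timestamp!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def from_ms(total_ms: int) -> str:
    """Format milliseconds as an SRT timestamp."""
    seconds, millis = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def change_frame_rate(timestamp: str, from_fps: float, to_fps: float) -> str:
    """Rescale one timestamp, assuming a pure speed change."""
    return from_ms(round(to_ms(timestamp) * from_fps / to_fps))


if __name__ == "__main__":
    print(change_frame_rate("00:01:00,000", 29.97, 23.976))
```

Both the start and end time of every line would be passed through the same function; a constant start offset, if any, has to be handled separately.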
VAMET
24th November 2021, 10:43
Dear Nikse555
You can try the "Sync" menu item "Change frame rate", but "Sync" - "Visual sync" might be better as it handles all frame rate issues + start offset.
Still, the video might have different scene cuts and extra/deleted scenes, so it's not always easy.
To translate you can use the "Auto-translate - Auto-translate (Ctrl+shift+G)" menu item. The results will not be perfect and the amount of text you can translate per day is limited.
Thank you for your reply.
I tried "Point sync via other subtitle" and set a sync point every 100 lines of subtitles, and the end effect is "wow". I have already checked several different parts of the movie and it looks OK.
Sincerely
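The idea behind syncing with several points can be sketched as piecewise linear interpolation: each pair of neighbouring sync points defines its own stretch and offset. This Python sketch is a generic illustration with hypothetical names; how SE's point sync actually computes the new times is not stated in this thread.

```python
from bisect import bisect_right


def map_time(t_ms: float, points: list[tuple[float, float]]) -> float:
    """Map an old time to a new time using sorted (old_ms, new_ms) sync points.

    Times outside the first or last point are extrapolated from the
    nearest segment. Requires at least two points with distinct old times.
    """
    if len(points) < 2:
        raise ValueError("need at least two sync points")
    old_times = [old for old, _ in points]
    # Index of the segment whose left end is the last point <= t_ms.
    i = bisect_right(old_times, t_ms) - 1
    i = max(0, min(i, len(points) - 2))
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return y0 + (t_ms - x0) * (y1 - y0) / (x1 - x0)


if __name__ == "__main__":
    sync = [(0, 1000), (10000, 12000), (20000, 22000)]
    for t in (5000, 10000, 15000, 25000):
        print(t, map_time(t, sync))
```

With a point every 100 lines, drift caused by a frame rate mismatch or a few different scene cuts stays confined to the segment it occurs in, which may explain the good result reported above.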