View Full Version : Subtitle Edit
ndjamena
20th April 2015, 10:38
For what it's worth (which isn't much) here's the source code:
http://www.mediafire.com/download/5gbqrafm7n728c3/608_Captions.zip
Bear in mind, originally I was just trying to split the track into files containing individual frames so I could have a look at them, then I tried translating each of the "commands" into strings I could output into a text file, then I thought I'd try applying the commands to a buffer, and finally I thought I'd try converting the buffer into an srt. So it's just one thing built onto another. I hadn't really figured out how any of it worked until I'd got a few srts out of the way, so I had to patch it up on the way and make it do things it wasn't designed for, and I'm pretty sure there are still some things I'm missing. Basically the code is junk, a testament to my learning process but otherwise rather worthless, and it needs to be rewritten from scratch. (I've started redoing the buffers so I can swap them more easily and use the second channel but haven't gotten very far yet, so there's some unused code in there too.)
The timing isn't obvious from any of the documentation I've found; I assume they don't bother mentioning it because with its original transmission medium (i.e. analogue NTSC) it should be fairly bloody obvious.
http://en.wikipedia.org/wiki/EIA-608
OK,
It uses a fixed bandwidth of 480 bit/s per line 21 field for a maximum of 32 characters per line per caption (maximum four captions) for a 30 frame broadcast.
480bps / 30 fps = 16 bits = 2 bytes
So basically, line 21 of each field contains 2 bytes' worth of information, which means each 2 bytes in 608 captions, regardless of how they're stored or transmitted, advances the current timecode by 1/30 of a second (which is when the next even field begins), and every 608 "frame" (2 bytes) is a P-frame that inherits the state from the previous frame. (Captions update 30 times a second; to convert to srt you need to figure out where the actual "display" memory changes, capture its current state and convert THAT to srt, rather than the commands themselves.)
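A minimal Python sketch of that arithmetic, assuming the caption data is already a flat run of raw 608 byte pairs. The function name and the default of exactly 30 fps are illustrative assumptions, not taken from any spec or from the program above.

```python
from fractions import Fraction

def pair_timestamps(data: bytes, start=Fraction(0), fps=Fraction(30)):
    """Yield (time_in_seconds, byte_pair) for each 2-byte 608 "frame".

    Every pair advances the clock by 1/fps seconds. Pass
    fps=Fraction(30000, 1001) for the exact NTSC rate.
    A trailing odd byte, if any, is ignored.
    """
    for i in range(0, len(data) - 1, 2):
        yield start + Fraction(i // 2) / fps, data[i:i + 2]
```

Using Fraction rather than float keeps the timecodes exact however long the track is.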
The way MP4 stores the captions is misleading, there are no frames in 608 beyond the 2 bytes on each line 21, which leaves the mp4 frames themselves as having durations of [Number_Of_Bytes / 60] seconds. If the timecode of the current mp4 frame is later than the end of the last mp4 frame then that's the equivalent of the time in between the end of the last frame and the beginning of the current frame being filled with nulls. If the end of the last frame overlaps the beginning of the current frame then that's an authoring error.
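The duration and gap rules above can be sketched like this in Python. It assumes the byte count covers caption payload only (no container headers) and a nominal 30 fps; the function names are hypothetical.

```python
from fractions import Fraction

BYTES_PER_SECOND = Fraction(60)  # 2 bytes per frame at a nominal 30 fps

def sample_duration(num_bytes: int) -> Fraction:
    """Duration in seconds of an mp4 caption sample holding num_bytes of 608 data."""
    return Fraction(num_bytes) / BYTES_PER_SECOND

def null_pairs_between(prev_start: Fraction, prev_bytes: int, cur_start: Fraction) -> int:
    """Number of null byte pairs implied by the gap between two samples."""
    prev_end = prev_start + sample_duration(prev_bytes)
    gap = cur_start - prev_end
    if gap < 0:
        # The previous sample runs past the start of this one.
        raise ValueError("samples overlap: authoring error")
    return int(gap * BYTES_PER_SECOND / 2)  # whole 2-byte pairs that fit in the gap
```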
[end of caption] SWAPS the displayed memory with the non-displayed memory. So basically, if I load the words "Rabbit Season!" into Non Displayed Memory [resume caption loading] then send [end of caption], the contents of the Non Displayed Memory and the Display Memory will be swapped and "Rabbit Season!" will be displayed on the screen. If I then load "Duck Season!" into Non Displayed Memory and send [end of caption], it will now display "Duck Season!". If I then wait a few seconds and send [end of caption] again, the buffers will be swapped and it will display "Rabbit Season!" again. I don't know if iTunes will ever use it like that, but they can if they want to.
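A toy Python model of the swap, with each memory reduced to a single string instead of a 15x32 character grid. The class and method names are made up for illustration, and load() simply overwrites the non-displayed memory, which is a simplification.

```python
class CaptionMemory:
    """Two-buffer model of pop-on captions."""

    def __init__(self):
        self.displayed = ""
        self.non_displayed = ""

    def load(self, text: str) -> None:
        # Stands in for [resume caption loading] followed by the caption text.
        self.non_displayed = text

    def end_of_caption(self) -> str:
        # Swap the two memories; nothing is erased by the swap itself.
        self.displayed, self.non_displayed = self.non_displayed, self.displayed
        return self.displayed
```

Running the Rabbit/Duck sequence through it shows "Rabbit Season!", then "Duck Season!", then "Rabbit Season!" again on the third [end of caption].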
The only way to move to a new line is to explicitly tell it to. If you've filled the last column in a row (column 32) and are told to display a new character, the standard is to replace the character in column 32 with the new character and keep doing that until you receive a command to move the cursor. If they send a [backspace] command and you're in column 32, the recommendation is to determine whether you've just written to column 32: if you have, delete the contents of column 32, otherwise delete the contents of column 31. (I keep forgetting that I've left a bug in my code in regards to that: my code sets the column number to -32 once column 32 has been written to, and I was supposed to add math.abs functions to all the array coordinates so they don't error out but haven't got round to it yet [FIXED].)
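The column-32 overwrite rule can be sketched in Python as below. This is only an illustration (the Row class is hypothetical); it keeps the cursor pinned at column 32 rather than using a negative column number.

```python
COLUMNS = 32

class Row:
    """One caption row with a 1-based cursor."""

    def __init__(self):
        self.cells = [" "] * COLUMNS
        self.col = 1

    def put(self, ch: str) -> None:
        self.cells[self.col - 1] = ch
        if self.col < COLUMNS:
            self.col += 1
        # In column 32 the cursor stays put, so the next character
        # overwrites this one until a command moves the cursor.

    def text(self) -> str:
        return "".join(self.cells).rstrip()
```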
[resume direct captioning] writes the following commands directly to the display buffer. So if someone's screaming the same word over and over with increasing volume, you can pop up a caption saying "You!" the first time, then for the second, switch to Direct Captioning and send just a "!" to make it "You!!" and then send another "!" each time they yell, again I'm not sure iTunes would use it that way, but they can.
Most of my code was written with nothing but Wikipedia as a reference, I did find another document that filled in a few blanks, but it's a PDF and it wastes so much effort explaining the differences between how each generation of 608 decoders handle each command that it's hard to find anything useful in there. It doesn't help that they've denied copying permissions, so I can't copy/paste important parts or convert it to a Word Document to make finding things easier.
Anyway, that's most of it. The positioning data means a flawless conversion to srt isn't possible in every case; I can imagine situations where it would be almost impossible to figure out the order in which words are said without user intervention:
Hello |Hi,
How are you? |I'm fine thanks.
{Sorry, the forum is removing all the spaces separating the captions}
It would make perfect sense as a caption when you're watching the video, I could probably think of worse and more likely situations if I tried. If it was a comedy and they were deliberately playing with the captions maybe...
{If left to me my program will likely never get finished, my head's not the most stable construct on Earth and I'm surprised I got this far}
{I had more, but my head hurts so I'll stop now.}
-edit- [Replace 60fps with (60000/1001) fps if you like, or 30 fps with (30000/1001) fps]
Thunderbolt8
27th April 2015, 12:51
some more SHD removal fixes:
WOMAN: <i>Mr. Sportello?</i>|- Mm-hm.
gets changed to:
<i>- Mr. Sportello?</i>
but should get changed to: <i>Mr. Sportello?</i>
--> the WOMAN: speaker information part is not taken into consideration for SHD removal in combination with the hyphen from the speech part after the line break.
@Thunderbolt8: thx, the "remove text for HI" issue should be fixed in latest beta: http://www.nikse.dk/SubtitleEdit.zip (portable version
hey, thanks. however, could you please also include the special "—" hyphen character in all those hyphen removal related rulesets as well? Current example is:
Uh— Well, I, uh—
gets changed to
— Well, I —
it's correct for the last part, in which the hyphen has to remain after the ", uh" removal (because the "I" is still real speech), but in the first part of that line the hyphen is not treated as part of the 'no real speech' SHD stuff, where it should be removed along with it if that's the only thing remaining.
ndjamena
27th April 2015, 14:50
Grrr, I discovered CCExtractor could extract captions from MP4, so I thought I'd give it a try. I figured PGS would be the best output format to keep the positionings. There's an output format called "spupng" which looked promising: it created an xml and a bunch of png images of the captions, but so far I haven't found anything that can even open it, much less convert it to pgs. So I thought I'd try simple srt output. I tested it on a file I'd already converted using my crappy little program... AND THEY GOT THE DAMN TIMECODES WRONG. I know it's wrong, because I can play the files synced with iTunes: mine lines up exactly, theirs doesn't. It's CCExtractor!!! WTF???
Unfortunately I've encountered my first 608 with Roll-Up captions, for one thing they're not what I thought they were, for another it makes mincemeat of my program. I looked at my code and got embarrassed enough to remove it from mediafire, I'll pretend it was never there. I've started rewriting it into a single neater 608 Captions class... I doubt I'll manage to finish it though.
RECOMMENDATION: Service providers shall calculate whether a Backspace is being issued
before or after display in Column 32. If a character or mid-row code has been placed in Column 32,
send a transparent space or Delete to End of Row command (DER) to erase the 32nd column. If
Column 31 is also to be erased, send the transparent space or DER first, then the Backspace.
In general purpose captioning, the Backspace command should not be used as long as TC1
decoders continue to be supported.
---------------
The FCC rules specify that "a Backspace received when the cursor is in Column 1 shall be ignored," but it
does not specify how Backspace should be applied when it is received following a character displayed in
Column 32. Since the rules say, however, that "Backspace shall move the cursor one column to the left,
erasing the character or Mid-Row Code occupying that location," and since there is no Column 33, many
manufacturers have concluded that a Backspace received either before or after displaying a character in
Column 32 shall move the cursor to Column 31 and erase the character there. This application is legal under
the rules, and, although a different method might have been preferable, all decoders shall implement
Backspace in this manner. When erasing Column 31, the decoder may also erase any displayable
character or other code in Column 32 (as is currently done in TC2).
OK, I read that wrong: it's the people who send the captions that are supposed to monitor what's in column 32, not the ones who decode it; we're supposed to delete both columns 31 and 32. I need to rewrite that. I still can't figure out if PACs take up a space, or where I'm supposed to store their text style info... If I process a backspace after a MidRow, do I cancel the style, or just remove the space and not remove the style unless there's another backspace? I don't know.
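Read that way, the decoder-side Backspace from the quoted text could look roughly like this in Python. The function name and the also_clear_col32 switch are illustrative assumptions; how Mid-Row Codes and styles are stored is left out entirely, since that is exactly the open question.

```python
def backspace(cells: list, col: int, also_clear_col32: bool = True) -> int:
    """Apply a Backspace to one row and return the new cursor column.

    cells is a list of 32 one-character strings, col is the 1-based cursor.
    """
    if col == 1:
        return col  # a Backspace in Column 1 is ignored
    new_col = col - 1
    cells[new_col - 1] = " "  # move one column left and erase there
    if col == 32 and also_clear_col32:
        cells[31] = " "  # optionally erase Column 32 as well
    return new_col
```

With the cursor in Column 32 this lands on Column 31 whether or not Column 32 has been written, which matches the "before or after" wording in the quoted passage.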
Thunderbolt8
24th May 2015, 01:32
it seems like quotation marks screw up the SHD line detection a bit:
- Cover him!|EAMES: Down! Down now! ---> - Cover him!|- Down! Down now!
working as intended
- "My father doesn't want me to be him."|EAMES: Exactly. ---> - "My father doesn't want me to be him."|Exactly
the hyphen indicating a change of speaker after the line break is missing
izanami
1st June 2015, 09:42
I don't see my font in the waveform. Please help, what can I do?
http://postimg.org/image/557fqwv9b/
Nikse555
1st June 2015, 15:51
@Thunderbolt8: thx, this case should be fixed in next update.
@izanami: could you test latest beta: http://www.nikse.dk/SubtitleEdit.zip ?
(the text in the waveform will appear double... hopefully one of them will be correct)
@ndjamena: Sorry, I've not had too much time to look at it... will have more time in about 14 days I think
minhjirachi
1st June 2015, 16:42
I don't know why the bit depth of the exported BDN/XML files is always 32-bit. Or I think you should add BDSup2Sub to Subtitle Edit, which processes BDN/XML files really well.
izanami
2nd June 2015, 09:26
Thank you, Nikse555. My font is OK in SE 3.4.6. I really appreciate your help :)
ndjamena
11th June 2015, 15:01
In case you find something useful in it here is the other document I was using:
http://www.mediafire.com/view/0j4whlx7obo7ejf/EIA-CEA-608.unlocked.pdf
(I've unlocked it.)
I suppose it's possible iTunes is buggy and the CCExtractor time-codes are correct, but the muxing mode is Final Cut Pro, which is a program owned and written by Apple... someone with a Mac could check how FCP handles the time-codes; if it agrees with iTunes then that's pretty much the end of the story. I couldn't figure out where CCExtractor was getting its time-codes from, but 608 captions are line 21 captions for NTSC analogue broadcasts. Theoretically, if you play the M4Vs back on an analogue TV you should be able to write the captions back into line 21, which isn't possible with the CCExtractor time-codes because you'd have to start sending the information for the first caption before you even begin playback of the file to get it to display at the right time, and all the other time-codes are off too.
CCExtractor is open source... Apparently. Beyond that VLC has a 608 decoder if you can find it.
https://wiki.videolan.org/VLC_Source_code/
http://ccextractor.sourceforge.net/about-ccextractor.html
My test program has been downloaded 3 times now... I probably should have deleted it, but I guess it kind of works if all you want from it is SRTs and you don't use it on files with roll-up captions... And unless I missed a program, it does seem to be the only way to get the correct time-codes short of owning a Mac... well, this looks like it will do the job but it costs $900:
http://www.drastic.tv/index.php?option=com_content&view=article&id=211&Itemid=303
There's probably something better out there but blow if I can find it.
That's about all I have to contribute. :(
Dean007
21st June 2015, 15:22
Hi. I have a question. Can I sync subtitles with a press of a button like Subtitle Workshop has (alt+m).
For example: I load subs and a video, mark the subtitle I want to sync from, play the video to that subtitle and simply press alt+m, and all the subtitles that were marked are synced with the video.
kalehrl
21st June 2015, 18:39
There is 'visual sync' option where you use 2 points for sync - at the beginning of the video and at the end for best results.
ndjamena
19th July 2015, 22:20
The developer of CCExtractor:
I'm probably not doing it correctly... didn't have too many itunes samples to begin with.
VLC won't play them properly, MPC-HC won't play them at all, neither CCExtractor nor Subtitle Edit will extract them properly, Handbrake doesn't notice they're there...
Does anything that's not an Apple product actually work with these things?
ndjamena
21st July 2015, 01:01
FFMPEG can see them and attempts to convert them to ASS when muxing to an MKV but ultimately the subtitle track comes out empty.
Music Fan
28th July 2015, 20:48
I have a strange problem with version 3.4.7 when I run OCR on French DVB-SUB:
"J'ai gagné !" becomes "♪ ai gagné !"
Most of the J' are considered as the music symbol ♪
It's strange because in the "All fixes" part of the OCR window, I see the correct spelling on the left and the bad correction on the right, whether or not I check "fix OCR errors" (I use the French dictionary).:confused:
I guess it means that the correct spelling is detected but is then wrongly corrected for some reason, while no correction is needed in this case.:confused:
And if I don't choose the French dictionary and choose none, this problem disappears and the J' stays as is.
But I need it, because if I leave it on none I get other errors that I don't get when I choose the French dictionary.:o
Does it mean the French dictionary is bugged ?
Something else : after OCR is done, I click on OK and I get this message : "do you want to discard changes made in current OCR session ?"
If I choose no, nothing happens, and if I chose yes, the OCR window closes and the text appears in the main window with the changes, as usual, thus I don't understand why I get this message :confused:
raymondjpg
29th July 2015, 02:24
Looks like a bug to me. I've gone back to v3.4.6, but I'll gladly persist with v3.4.7 if, as you say, clicking "yes" does not result in changes being discarded.
jpsdr
29th July 2015, 08:53
@Music Fan
As a French user too, I'd like to know whether this issue is specific to 3.4.7, or does it also happen with 3.4.6 ?
Nikse555
29th July 2015, 09:07
@Music Fan: thx for reporting the discard message after pressing OK - it's a bug - should be fixed on latest version on GitHub and also here: http://www.nikse.dk/SubtitleEdit.zip (beta, no installer)
Also, could you email me the sub that causes problems with music notes?
Music Fan
29th July 2015, 23:08
Thanks for the fix, no message anymore.
But OCR on the sup I created yesterday (Blu-ray sup export from TS without OCR) is less well done than with 3.4.7 on some lines : ? is sometimes considered as 'I
edit : I can't reproduce this problem, no problem with the ? now :confused:
Look at this sup file (exported in Blu-ray sup from a TS file, no OCR was done, created with v3.4.7) ;
http://www31.zippyshare.com/v/p6pYpzXt/file.html
Same subtitles but exported this time with your last beta version (again without OCR) ;
http://www25.zippyshare.com/v/81vGbafc/file.html
The result of the OCR with this sup is exactly the same as when done from the original TS.
@ jpsdr : I don't know, I didn't try 3.4.6.
Music Fan
3rd August 2015, 19:10
Nikse555, did you find out why J' becomes ♪ (with the sup I posted here) ?
Thunderbolt8
3rd August 2015, 20:53
could you please provide an example of how the syntax has to look for this and how exactly it is supposed to work? "Remove text for HI - "Remove if text contains" now allows multiple items separated by comma or semicolon" ?
Music Fan
4th August 2015, 12:59
3.4.8 is released ;
3.4.8 (2nd August 2015)
* NEW:
* Added support for Blu-ray TextST - thx Timo/ndjamena
* Added "Google it" to spell check dialog
* IMPROVED:
* Updated Chinese Simplified translation - thx Leon
* Updated Danish translation
* Updated Croatian OCR fix replace list - thx diomed & xylographe
* Remove text for HI - "Remove if text contains" now allows multiple items separated by comma or semicolon - thx Jesper
* Added fix for invalid time codes in Avid (bug in Avid) - thx Xenophon
* All Google urls now uses https
* FIXED:
* Fixed "Discard" message in OCR when pressing "OK" (regression from 3.4.7) - thx Music Fan
* Fixed Blu-ray sup export with frame rate 23.976 (regression from 3.4.7) - thx Arjan
* Fixed remembered value from Tools -> Adjust all times (regression from 3.4.7) - thx GH
* Fixed GT by using https - thx Sopor
* Fixed crash after using "Split" - thx Krystian
* Fix for large data inside "Sami" files - thx hhgyu
* Fix for font tags without quote/apos in "Advanced Substation Alpha" - thx hhgyu
* Now comboboxes from "Remove text for HI " should save/restore last used value - thx Jesper
* Fixed several issues with format "CIP" output - thx Victor
* + Many minor fixes from Ivandrofly and xylographe
I still have my J' problem but it's ok if I uncheck "music symbol" in the OCR window ! I don't remember if this option was already present in previous versions.
Anyway, ♪ can also be converted to J' with the "multiple replace" option.
Thunderbolt8
4th August 2015, 14:57
I'd like to request an option for blocking parts of a sentence from HI removal if you feel they belong together. e.g. I have 'oh' in the interjections list, but I don't want to remove it from "oh, my god" or "oh well", "oh boy" and such. currently, I still have to go through each subtitle line marked for HI removal and see if there is some line I need to untick and process manually. it's quite tiring, especially when doing it for a whole TV series, for example all 86 episode subtitle files from the sopranos. this improvement could save me a lot of time here.
von Suppé
24th August 2015, 07:26
Hi Nikse555,
Is there a possibility to give a color to the breakline tag <br /> and italic tags <i> and </i> in the list view?
Regards
von Suppé
speedyrazor
1st September 2015, 17:26
I am asking this here because I know what an excellent and powerful application this is and would love to use it for this task.
I am trying to convert a .itt file to .ssa and then change the height of the subtitles, top and bottom.
I have attached 2 files, Subtitles_da.itt and Subtitles_da.ssa (attached to this post in a zip file).
I am trying to alter the vertical position of the bottom and top subtitles so that neither comes into the blanking of the video file. So basically I need to make the bottom subtitle higher and the top subtitle lower. I don't think this is possible to do in .itt files, so my question is how do I do this in the Sub Station Alpha file?
I have also attached some screen grabs to show what I am trying to do (attached to this post in a zip file).
Let me know if there's any more info I can provide.
I appreciate any help you could offer.
Kind regards.
von Suppé
4th September 2015, 11:15
Is it allowed to ask questions about movie-subs from BD?
cheers
von Suppé
S_E_New
8th September 2015, 18:43
Hi! I want to know if is there any command line for export .srt to sub/idx with a .bat with these specifications.
http://i.imgur.com/6jBtan4.png?1
:thanks:
ukendt
10th September 2015, 10:27
Sure!
von Suppé
12th September 2015, 09:49
Thank you, ukendt.
I ripped subs from the movie BD Interstellar (2014) and in SE OCR'ed them for editing reasons.
Now, this film has both 16:9 and (approx.) 2.40:1 footage.
Remuxing to mkv I leave video intact so PAR will stay 1:1, AR will stay 16:9 with most of the footage having black bars.
So, when exporting to SUP, I'd like to have the possibility to give some subtitles a different bottom offset.
I know this can be done manually with BDSup2Sub, but the workflow is rather unwieldy and quite laborious for me.
I'd like the idea of (in this case) selecting the concerning subs and give them another offset than the rest.
Any ideas?
Cheers
Thunderbolt8
5th October 2015, 20:40
lines which contain only numbers (and characters such as - + / { ...) are treated as uppercase lines when removing hearing impaired stuff with "remove line if UPPERCASE" ticked. I guess this is not meant to happen.
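One way such a false positive can arise, sketched in Python, is comparing a line with its upper-cased copy. This is an assumption about the general technique, not a claim about Subtitle Edit's actual code, and both function names are made up. Requiring at least one letter avoids it.

```python
def naive_is_uppercase(line: str) -> bool:
    # True for "1-2/{3}" too, because upper-casing digits and symbols changes nothing.
    return line == line.upper()

def is_uppercase(line: str) -> bool:
    # Only call a line uppercase if it actually contains a letter.
    return any(c.isalpha() for c in line) and line == line.upper()
```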
Xebika
7th October 2015, 05:58
3.4.10 is released ;
3.4.10 (6th October 2015)
* NEW:
* Audio visualizer waveform filled - thx jdpurcell
* FIXED:
* Fixed crash in "Visual sync" - thx aMvEL / Bolshevik
* Fixed "Fix common errors in selected lines" - thx ingo
* Fixed audio visualizer "Seek silence" - thx jdpurcell
* Fixed audio visualizer "Guess time codes" - thx jdpurcell
* Fixed possible startup crash with tiny or bad video file - thx ttvd94
* Fixed alpha in ASS styles - thx ravi
* Fixed line splitter regarding unicode 8242 char
3.4.9 (3rd October 2015)
* NEW:
* New subtitle formats (XIF xml, Jetsen, NCI Timed Roll Up Captions and more)
* Ukrainian translation - thx Maximaximum
* Option to play a sound when new network message arrives - thx InCogNiTo124
* IMPROVED:
* Updated Portuguese translation - thx moob
* Updated Korean translation - thx domddol
* Updated French language file - thx JM GBT/xylographe
* Updated Hungarian translation - thx Zityi
* Updated Dutch translation - thx xylographe
* Updated German translation - thx xylographe
* Updated Romanian translation - thx Mircea
* Updated Polish translation - thx admas
* Updated Croatian OCR fix replace list - thx diomed & xylographe
* Generating of spectrogram is now *many* times faster - thx jdpurcell
* SubtitleListView: Enable double buffering to eliminate flickering - thx jdpurcell
* Added "Count" to "Find" dialog - thx ivandrofly
* FFMPEG audio extraction will now prompt for audio file if more than one
* Format PAC now includes "Chinese simplified" - thx Man
* Merge lines with same text now ignores casing - thx Michel
* Now keeps blank lines inside SubRip texts
* Better Croatian/Serbian language detection - thx aaaxx/xylographe
* Sync tools now display info about what is applied (factor and -/+ adjustment)
* Plugins are now allowed to return another format than SubRip
* Better resizing of list view in "Bridge gaps in durations" - thx ivandrofly
* Audio visualizer: don't generate wav if source is already wav - thx MM
* Audio visualizer: zoom with control key + scroll wheel - thx jdpurcell
* "Fix common errors" only shows English "i to I" fix for English language - ivandrofly
* Allow up to 10 mb subtitles in batch convert - thx Kymophobia
* FIXED:
* Fixed tags accumulating texts in Sami format (regression from 3.4.8 refact) - thx domddol
* Show all paragraphs in audio visualizer when zoomed out - thx jdpurcell
* Fixed bug in "Set end and offset the rest" that degraded performance more and more - thx jdpurcell/Leon
* Fixed several issues with format "Cavena 890" - thx Victor
* Better handling of some zero width Unicode spaces - thx Krystian
* Spell check issue with multiple occurrences of same word in one subtitle - thx Krystian
* Rounding issue in formats with duration in output - thx Victor/Jamakmake
* Fixed possible crash in setting regarding VLC path - thx xylographe
* Fixed shortcut key in French replace dialog - thx Claude
* Go to first empty line now also focuses it - thx Jamakmake
* Clear overlap messages in main window after "New" - thx domddol
* Minor fix for split / timed text 1.0 - thx Krystian
* Issues with italic+bold tags in image export - thx marb99/aaaxx
* Possible crash in export to DOST
* Time codes in export to DOST - thx Christian
* Fixed crash when cleaning spectrogram temp images
* Audio visualizer: Fix crash when using mouse wheel without audio - thx jdpurcell
* Audio visualizer: Fix crash when using shortcuts without audio
* Audio visualizer: Fix issues with HH:MM:SS:FF time code format - thx jdpurcell
* Audio visualizer: Stable end time when using HH:MM:SS:FF time code format - thx ing
* Audio visualizer: Fix new selection disappearing if scrolled out of the left - thx jdpurcell
* Inline margin is now loaded/saved in SSA/ASS (can only be edited in source view)
* Changed LAV Filters link from Google Code to GitHub - thx suvjunmd
* Fixed bug in FAB export regarding center alignment - thx felagund
* Some fixes for "Remove text for HI" - thx Rasmus
* Fix for added words no "don't break after" list - thx ivandrofly
* Fixed bug in batch convert filter (2+ lines)
* + Many minor fixes from ivandrofly and xylographe
varekai
7th October 2015, 11:39
Hello!
First time user of SubtitleEdit 3.4.9
Just received a new Blu-ray disc Mad Max: Fury Road.
There are 2 subtitles I'd like to edit,
one is in english SDH where I want to remove the SDH,
and one which is in my opinion poorly translated to swedish.
I extracted the sups (00100.track_4609.sup and 00100.track_4617.sup) with tsMuxer and then imported it in SubtitleEdit.
It does its job and I can see the sup images showing the text but for some reason which I don't understand it won't generate any srt text?
File-->Import/OCR Blu-ray (.sup) subtitle file...-->00100.track_4609.sup--> Hit 'Start OCR' button... and nothing!?
I think I'm missing something fundamental, so any hints and tips would be much appreciated.
Regards
Edit: Just noted there's a new version, will update! Thanks!
Boulder
7th October 2015, 11:48
I've found out that the other OCR option works better than Tesseract so you might want to try that (it's the same you use with DVD subs). You probably need to tweak the two parameters it has, but it's quite easy once you get the hang of it.
varekai
7th October 2015, 12:06
OK, will try that, thanks!
Lucius Snow
10th October 2015, 12:11
Hello all,
I've got a little problem with an Arabic subtitle that I want to export with Final Cut Pro + Image sequences. When there's a sentence in italic, with <i> and </i>, it displays properly in the video preview. However, when exporting, the dot, the comma etc. get inverted (ending up at the right of the sentence instead of at the left).
Any ideas?
Thank you.
mbcd
10th October 2015, 14:03
First of all:
Thanks for that really cool application !! :thanks:
While using it I found some things that got a little tedious when doing batch conversion.
OCR:
Is it possible to do batch-OCR with "picture-recognition" ?
I don't mean a fully automatic batch, but selecting e.g. 20 subs and having them loaded and saved automatically.
Right now you have to do these steps manually:
- Load one Subtitle
- Start OCR
- Save one Subtitle
For mass conversion that's a lot of work. If they were loaded and saved one after another (with OCR by picture-recognition started automatically), it would be very nice.
Export:
Is it possible to do a batch-export for files (load and export already existing subtitles)?
Right now you have to load each subtitle by hand and export each subtitle by hand.
Best Regards
Thunderbolt8
10th October 2015, 22:24
numbers are still considered as uppercase characters in 3.4.10, so a line consisting only of numbers (+ special characters) will be removed when "remove line if UPPERCASE" is ticked in SHD removal.
Thunderbolt8
21st October 2015, 17:40
for some reason this line gets treated incorrectly with SHD removal while other lines of the same type are not. Guess the hyphen at the end of the first displayed line throws it off.
WOMAN: Excuse me-|ALAN: Tim, I gotta call you back.
--> Excuse me-|ALAN: Tim, I gotta call you back.
but should be: - Excuse me-|- Tim, I gotta call you back.
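The expected behaviour can be sketched in Python like this. It is only an illustration of the rule as described in these reports, not Subtitle Edit's implementation: the SPEAKER pattern and function name are hypothetical, "|" stands for the line break, and italic tags are not handled.

```python
import re

# Hypothetical pattern for an all-caps speaker label such as "WOMAN: " or "ALAN: ".
SPEAKER = re.compile(r"[A-Z][A-Z .']*:\s*")

def strip_speaker_labels(sub: str) -> str:
    bodies, marked = [], []
    for line in sub.split("|"):
        line = line.strip()
        had_dash = line.startswith("-")
        body = line.lstrip("-").strip() if had_dash else line
        label = SPEAKER.match(body)
        if label:
            body = body[label.end():]
        bodies.append(body)
        marked.append(had_dash or label is not None)
    # A dash or label on a later line signals a change of speaker,
    # so every line gets a leading dash in that case.
    if len(bodies) > 1 and any(marked[1:]):
        bodies = ["- " + b for b in bodies]
    return "|".join(bodies)
```

Deciding on the dashes only after all labels are stripped means a trailing hyphen or quotation marks on the first line don't affect the result.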
jpsdr
23rd October 2015, 12:41
I have some issues with Tesseract OCR. Sometimes I get a result which is something like "D: oi", when the picture displayed
I've also opened an issue on github here (https://github.com/SubtitleEdit/subtitleedit/issues/1394).
I can provide several files which produce the issue, PM for them if you want.
Version is 3.4.10.
jpsdr
24th October 2015, 09:13
The issue has been identified and fixed, nice work, thanks. Now I just have to wait for the next release.
varekai
3rd November 2015, 13:12
Can anyone identify this font?
Looks really nice and I would like to use it in a video project.
http://i.imgur.com/30cBd4q.jpg
http://i.imgur.com/267zC61.jpg
Edit:
Think I found a close match to the font.
This one will be perfect.
http://www.fontspring.com/fonts/fontsite/microsquare
Thunderbolt8
22nd November 2015, 16:13
some more SHD removal inconsistencies, some parts of lines are not removed correctly when italics are involved
("remove text between ♪ and ♪" is ticked and "remove text if it contains: ♪,♪♪" is unticked)
correct:
- ♪ Was a good friend of mine ♪|- All right! ==> All right!
incorrect:
<i>- ♪♪[Continues ]</i>|- It's pretty strong stuff. --> <i></i>|- It's pretty strong stuff.
this one here is debatable, I guess it's working as intended in leaving the hyphen because it is needed as a signal for speech. it just looks a bit strange to have only the hyphen within the italics and not the rest of the line, as it was already strange before to have the hyphen + [Nick] speaker information in italics and not the rest as well. if you'd ask me, the italics could be removed in such a case, but as said, according to the rules it's probably working as intended:
- The meal is ready. Let's go!|<i>- [Nick]</i> J. T. Lancer! ==> - The meal is ready. Let's go!|<i>-</i> J. T. Lancer!
in this case here, the italics tag needs to be closed after the first line, because the closing tag gets removed along with the content of the line after the break and the italics tag extends over that line break (this should not be a problem in case of subtitles which have a separate open&close tag for each part of a subtitle line, before and after the line break symbol):
<i>- Here it is.|- Whoa!</i> ==> <i>Here it is.
it would be nice if the "remove text between" tick box for ♪ and ♪ could be extended to also have ♪,♪♪ and ♪,♪♪ as in case of the "remove text if it contains:" box right next to it already has. right now, the line
♪ Trotting down the paddock|on a bright, sunny day ♪♪
does not get removed, because the open and closing music symbol is different. the line would get removed if the "remove text if it contains: ♪,♪♪" box was ticked, but then a line like
- ♪ Was a good friend of mine ♪|- All right!
would get removed entirely as well (instead of only the part before the line break). so in such cases, if a subtitle file consists of different lines like
- ♪ Was a good friend of mine ♪|- All right!
and
♪ Trotting down the paddock|on a bright, sunny day ♪♪
there's currently no option to have this handled correctly automatically (that is, to keep the "All right!" part of the first line and remove the second line entirely); you'd have to tick/delete stuff manually. this could be avoided if the options of the first box, "remove text between", were extended from only ♪ to include ♪,♪♪ as well.
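The requested behaviour could be sketched roughly like this, assuming "|" stands for the line break as in the examples above. This is only an illustration of the idea, not how Subtitle Edit implements it; remove_between_music and MUSIC_SPAN are made-up names.

```python
import re

# One or more music notes, the enclosed text, then one or more music notes,
# so that "♪ ... ♪" and "♪ ... ♪♪" are both treated as one span.
MUSIC_SPAN = re.compile(r"♪+.*?♪+", re.DOTALL)

def remove_between_music(text, line_break="|"):
    stripped = MUSIC_SPAN.sub("", text)
    lines = [ln.strip() for ln in stripped.split(line_break)]
    # Drop lines that are empty or hold nothing but a dialogue hyphen.
    lines = [ln for ln in lines if ln.strip("- ")]
    if len(lines) == 1:
        # A single remaining line no longer needs its dialogue hyphen.
        lines[0] = lines[0].lstrip("- ")
    return line_break.join(lines)

print(remove_between_music("- ♪ Was a good friend of mine ♪|- All right!"))
print(repr(remove_between_music("♪ Trotting down the paddock|on a bright, sunny day ♪♪")))
```

With both samples from this post, the first keeps only "All right!" and the second comes back empty, which is the outcome described above.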
in a case like this, if SHD information gets removed and a dash (— or double --) follows afterwards, a syntactic indicator for ending or linking a sentence, like a full stop (.), exclamation mark (!) or comma (,), which directly precedes that SHD information should be removed as well.
Oh. Oh, yeah. Um — ==> Yeah. —
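A rough sketch of that clean-up, reading the request as wanting "Yeah —" instead of "Yeah. —" (that reading is an assumption). The function name tidy_before_dash is hypothetical; it only touches text that the removal step actually changed, and it leaves an ellipsis ("...") alone.

```python
import re

# Full stop, exclamation mark or comma directly before a dash.
# The lookbehind skips the dots of an ellipsis.
PUNCT_BEFORE_DASH = re.compile(r"(?<!\.)[.!,]\s*(—|--)")

def tidy_before_dash(before, after):
    """before: text prior to removal; after: text once the removal has run."""
    if before == after:
        # Nothing was removed, so the punctuation is left as written.
        return after
    return PUNCT_BEFORE_DASH.sub(r" \1", after)

print(tidy_before_dash("Oh. Oh, yeah. Um —", "Yeah. —"))
```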
btw. I'd like to request again a kind of exclusion list (case sensitive) for the "remove interjections" list you can edit manually. in cases like "oh, boy" or "oh, my" I wouldn't like to have the "oh" removed (just "my" doesn't really make sense), while I would like to see it removed in all other cases. same for stuff like "Er" or "er" as SHD information of speech (like errrr....): I would want this gone, but then again not "ER" for emergency room. as it is, you'd have to check the whole SHD removal list manually just to avoid such small unwanted cases.
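To make the request concrete, here is a small sketch of how such an exclusion list might work. It is not taken from Subtitle Edit; the function name, the parameters and the sample lists are all invented for illustration. Interjections are matched case-insensitively, while exclusion phrases are matched case-sensitively and protect whatever they cover.

```python
import re

def remove_interjections(text, interjections, exclusions):
    # Spans covered by a case-sensitive exclusion phrase are protected.
    protected = []
    for phrase in exclusions:
        for m in re.finditer(r"\b" + re.escape(phrase) + r"\b", text):
            protected.append(m.span())

    # Interjection plus trailing punctuation and whitespace, any case.
    words = "|".join(re.escape(w) for w in interjections)
    pattern = re.compile(r"\b(?:" + words + r")\b[,.!?]*\s*", re.IGNORECASE)

    def repl(m):
        if any(start <= m.start() < end for start, end in protected):
            return m.group(0)  # inside an excluded phrase: keep it
        return ""

    return pattern.sub(repl, text)

exclusions = ["Oh, boy", "Oh, my", "ER"]
print(remove_interjections("Oh, boy. Oh, I see.", ["oh", "er"], exclusions))
print(remove_interjections("Take him to the ER, er, now.", ["oh", "er"], exclusions))
```

Because the exclusions are case sensitive, "ER" survives while "er" is still removed; a lowercase "oh, boy" would need its own entry in the list.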
Thunderbolt8
22nd November 2015, 18:22
also, in the "compare subtitles" window, please change the colour used to mark identical lines in the right window, the one showing the subtitle file you are comparing against. currently that colour looks like sand and is too difficult to spot against the white background. please turn it into something easier to distinguish from white.
Lucius Snow
26th November 2015, 16:42
Hello all,
I've got a little problem with an Arabic subtitle that I want to export with Final Cut Pro + Image sequences. When there's a sentence in italics, with <i> and </i>, it displays properly in the video preview. However, when exporting, the dot, the comma etc. get inverted (ending up at the right of the sentence instead of at the left).
Any ideas?
Thank you.
Nobody to fix this bug? (:
Nikse555
26th November 2015, 22:40
Yo :)
In SE 3.4.10 alpha color values from SSA/ASS styles are used in File -> Export -> image based format - easy to make transparent background box etc.
@Thunderbolt8: thx for all the info :)
I've tried to fix the "remove text for HI" issues here: https://github.com/SubtitleEdit/subtitleedit/releases/download/3.4.10/SubtitleEditBetaTest.zip (portable version without installer, beta)
@jpsdr: thx, some nice fixes :)
@mbcd: The above beta has some improvements to "Tools" -> "Batch convert...". OCR might work and BD Sup output is brand new.
@Lucius Snow: I'm not really sure, but try Edit -> Revert RTL start/end... - you can also try the "Simple rendering".
Thunderbolt8
27th November 2015, 18:31
thanks, I'll try it and report back if something is still not working as intended
Lucius Snow
10th December 2015, 20:15
Thank you. It works with "Revert RTL start/end".
Rudde
11th December 2015, 15:44
We need this in the Subtitle OCR world.
http://gizmodo.com/a-new-ai-system-passed-a-visual-turing-test-1747500554
mariner
15th December 2015, 07:54
Greetings Nikse.
There appears to be an issue with add duration in the latest build. The duration of the last subtitle always ends up as 3.099 sec.
Many thanks and best regards.
mbcd
25th December 2015, 11:19
Yo :)
@mbcd: The above beta has some improvements to "Tools" -> "Batch convert...". OCR might work and BD Sup output is brand new.
And I am very happy about that.
Thanks a lot for the new BETA, your program is very well formed, you got the right job and ... yes ... you love it ;)
I am mainly working with Bluray-Subs and I think I found out some little problems:
1.
On Bluray TEXT-Output you can't add a user style yet; you can choose one later by Style-Id, but you can't define your own.
Or are they defined in another place for reusability?
2.
I am missing a feature to change the position of those captions.
SRT doesn't support different positions, but I think that this is an "important" feature.
It would be cool if the position were read out by OCR / capturing and directly reused.
Creating different styles and applying them in the editor (to several captions at the same time) would also be cool.
Often there are some seconds at the beginning of a film where you can read the actors' names or something else.
It would be cool to get a function to shift the text vertically for some captions, so that subtitle text and film text are not in the same position in that case.
Best regards and a happy new Year to you and your family.
Marco
StainlessS
13th January 2016, 14:10
Yo Nikse555,
Great improvements in SE, love it :)
One thing though, Loading MP4/AVC video, 720x436@25-2.35:1 1600Kbps shows weird video
with centre section greyscale and left and right parts weird color versions (as if UV starts 2/3 way across
clip and wraps around to left).
Was same in v2.4.8 which I had installed previously.
Not any problem though in use.