Old 7th December 2023, 11:46   #1721  |  Link
dbtayag
Registered User
 
Join Date: Mar 2013
Posts: 17
Quote:
Originally Posted by dbtayag View Post
This started with OPPENHEIMER and has continued with A HAUNTING IN VENICE. A little squiggle at the end of the letter is screwing up the OCR. I'm currently using NOCR.



Is there a way to do an OCR so the OCR would just ignore the little squiggle?
I was able to find a way to remove the squiggle: right-click > Image pre-processing > move the "Binary image compare threshold" slider left or right until the squiggle is gone.
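
Why a threshold can make a faint squiggle vanish can be shown with a generic binarization sketch. This is not SE's code; the `binarize` helper, the sample pixel values, and the assumption of bright text on a dark background are all made up for illustration.

```python
def binarize(gray_rows, threshold):
    """Map each 0-255 brightness value to 1 (ink) or 0 (background).

    Assumes bright glyphs on a dark background (hypothetical convention).
    """
    return [[1 if px >= threshold else 0 for px in row] for row in gray_rows]


# Hypothetical 3x4 patch: a solid letter stroke (255) and a dim squiggle (90).
glyph = [
    [255, 255, 0, 0],
    [255, 255, 0, 90],
    [255, 255, 90, 0],
]

with_squiggle = binarize(glyph, 64)      # threshold below 90: squiggle kept
without_squiggle = binarize(glyph, 128)  # threshold above 90: squiggle dropped
```

Moving the threshold past the brightness of the squiggle pixels drops them while the fully bright stroke survives, which matches the "move until the squiggle is gone" approach.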
Old 16th December 2023, 08:20   #1722  |  Link
markfilipak
Registered User
 
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 280
Pre-Script:

I got an answer from a heavyweight:
https://forum.doom9.org/showthread.p...44#post1995144
-----
Hi All, Thank you for SE. I like it!

I have a problem that needs some basic info.
Here is an IDX from mkvextract:
Code:
# VobSub index file, v7 (do not modify this line!)
size: 1920x1080
palette: 000000, 0000ff, 00ff00, ff0000, ffff00, ff00ff, 00ffff, ffffff, 808000, 8080ff, 800080, 80ff80, 008080, ff8080, 555555, aaaaaa
langidx: 0

id: und, index: 0
timestamp: 00:00:10:135, filepos: 000000000
timestamp: 00:00:15:808, filepos: 000001800
timestamp: 00:02:18:889, filepos: 000003000
... and so forth
Problem: There are no end times for the subtitle instances.
So, SE did the best it could and made this SRT:
Code:
1
00:00:10,135 --> 00:00:15,808
THE FIFTH ACT

2
00:00:15,808 --> 00:00:23,808
DEMONS

3
00:02:18,889 --> 00:02:21,892
Good day, Mr. Jacobi.
... and so forth
With no end times in the IDX, SE is 'bumping' the subtitles end-to-end, or it's putting up a subtitle for 8 seconds.
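
The behaviour described above can be sketched as follows. This is only a model of what the post observes (end at the next start, capped at a maximum display time), not SE's actual code; the function names and the `max_ms` parameter are hypothetical.

```python
import re

# Matches lines such as "timestamp: 00:00:10:135, filepos: 000000000".
TIMESTAMP_RE = re.compile(r"^timestamp:\s*(\d+):(\d+):(\d+):(\d+),\s*filepos:")


def parse_idx_starts(idx_text):
    """Return the start times (in ms) found in a VobSub .idx text."""
    starts = []
    for line in idx_text.splitlines():
        match = TIMESTAMP_RE.match(line.strip())
        if match:
            hours, minutes, seconds, millis = (int(g) for g in match.groups())
            starts.append(((hours * 60 + minutes) * 60 + seconds) * 1000 + millis)
    return starts


def fallback_end_times(starts, max_ms=8000):
    """Guess end times: the next start, but never more than max_ms after start."""
    ends = []
    for i, start in enumerate(starts):
        end = start + max_ms
        if i + 1 < len(starts):
            end = min(end, starts[i + 1])
        ends.append(end)
    return ends


def srt_time(ms):
    """Format milliseconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02}:{minutes:02}:{seconds:02},{millis:03}"
```

Applied to the three start times in the IDX above, this gives the same end times as the posted SRT for subtitles 1 and 2; calling `fallback_end_times(starts, max_ms=4000)` would model the 4 second cap asked about in Question 3.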

The fault is with mkvextract of course, not with SE.
Question 1: Where does the IDX format come from? -- Who can I contact?
Question 2: Is there another IDX format that has end times? -- I could cook up end times if I knew the format.
Question 3: Is there a way to change the 8 second default to 4 seconds?

Thanks again for SE, and thanks for this opportunity to ask,
Mark.

Last edited by markfilipak; 17th December 2023 at 01:17.
Old 17th December 2023, 02:33   #1723  |  Link
markfilipak
Registered User
 
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 280
Quote:
Originally Posted by markfilipak View Post
Pre-Script:

I got an answer from a heavyweight:
https://forum.doom9.org/showthread.p...44#post1995144...
So it appears the problem _may_be_ with SE.

I'm giving SE an '.idx' that contains correct 'start' timestamps, and a '.sub' that contains the images & durations. So, it appears that SE may be misreading the durations or miscalculating the 'end' timestamps that it puts into the '.srt'. What are the odds of either of those, eh?

I'd be ready and willing to help but I can't parse a '.sub' because I don't know the struct. Can you help at that end of this problem?
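
For anyone who wants to inspect the durations, here is a minimal sketch of reading the stop-display delay from one subpicture unit (SPU). It rests on assumptions that nothing in this thread confirms: that the SPU has already been reassembled from the MPEG program stream packets in the '.sub', that it starts with a 2-byte total size and a 2-byte offset to the control sequences, that each control sequence starts with a 2-byte delay and a 2-byte offset to the next sequence, and that the delay counts units of 1024/90000 s. The function name is hypothetical and this is not SE's code.

```python
import struct

# Parameter byte counts for the fixed-length display-control commands
# (assumed from the commonly documented DVD subpicture layout).
PARAM_BYTES = {0x00: 0, 0x01: 0, 0x02: 0, 0x03: 2, 0x04: 2, 0x05: 6, 0x06: 4}
STOP_DISPLAY = 0x02
END_OF_SEQUENCE = 0xFF


def spu_duration_ms(spu):
    """Return the stop-display delay of one reassembled SPU in ms, or None."""
    if len(spu) < 4:
        return None
    # Header: total size, then offset of the first control sequence.
    _total_size, pos = struct.unpack_from(">HH", spu, 0)
    visited = set()
    while pos not in visited and pos + 4 <= len(spu):
        visited.add(pos)
        delay, next_pos = struct.unpack_from(">HH", spu, pos)
        cmd_pos = pos + 4
        while cmd_pos < len(spu):
            cmd = spu[cmd_pos]
            if cmd == END_OF_SEQUENCE:
                break
            if cmd == STOP_DISPLAY:
                # delay * 1024 / 90000 seconds, expressed in milliseconds
                return round(delay * 1024 / 90)
            if cmd not in PARAM_BYTES:
                return None  # variable-length or unknown command: give up
            cmd_pos += 1 + PARAM_BYTES[cmd]
        # The last sequence is assumed to point at itself, which ends the loop.
        pos = next_pos
    return None
```

Adding the returned value to the 'start' timestamp from the '.idx' would give a candidate 'end' time to compare against what SE writes into the '.srt'.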


Best Regards,
Mark.

Last edited by markfilipak; 17th December 2023 at 02:57.
Old 17th December 2023, 02:55   #1724  |  Link
markfilipak
Registered User
 
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly San Jose, California)
Posts: 280
Hey, I have another question: How can I integrate ffprobe/ffmpeg into SE?

They are already in my system. I've been using them for years. My Windows machine is not on the Internet, so downloading them inside SE is not possible. What do I do to achieve the integration?

I have exactly the same issue with MPV.

Thanks,
Mark.

Later: Thanks VoodooFX.

Last edited by markfilipak; 18th December 2023 at 17:29.
Old 17th December 2023, 18:06   #1725  |  Link
VoodooFX
Banana User
 
Join Date: Sep 2008
Posts: 989
Copy it to the "Subtitle Edit\ffmpeg" folder; at least that's where it is in the standalone.
Old 20th December 2023, 15:40   #1726  |  Link
Lucius Snow
Registered User
 
Join Date: Oct 2003
Posts: 157
The XML/PNG export for DCP SMPTE 2014 is full of bugs. Is there any way to fix it?
Old 20th December 2023, 17:56   #1727  |  Link
VoodooFX
Banana User
 
Join Date: Sep 2008
Posts: 989
Quote:
Originally Posted by Lucius Snow View Post
The XML/PNG export for DCP SMPTE 2014 is full of bugs. Is there any way to fix it?
Yes, describe it here: https://github.com/SubtitleEdit/subtitleedit/issues

Last edited by VoodooFX; 31st December 2023 at 18:38.
Old 23rd December 2023, 19:22   #1728  |  Link
Zetti
Registered User
 
Join Date: Dec 2015
Posts: 309
Thanks for the new release:
https://github.com/SubtitleEdit/subt...ases/tag/4.0.3
Old 30th December 2023, 16:02   #1729  |  Link
GCRaistlin
Registered User
 
Join Date: Jun 2006
Posts: 353
The topic's name is a little confusing.

Bug: the wrong picture is displayed for a character.
Steps to reproduce:
  1. Open the .sup file.
  2. In subtitle 4, select the following:
  3. Then select the next letter.
    The picture is obviously wrong (it is the one for the previously viewed letter).
  4. Click "Add better match" and you'll see the right one:

I can't make SE recognize these letters correctly at all in nOCR mode: it recognizes both of them either as i or as I (or l), depending on which letter I added last as a "better match".
__________________
Windows 8.1 x64

Magically yours
Raistlin

Last edited by GCRaistlin; 30th December 2023 at 17:57.
Old 31st December 2023, 18:51   #1730  |  Link
VoodooFX
Banana User
 
Join Date: Sep 2008
Posts: 989
This is not a bug. I'll look at the sup later, most likely next year.
Old 8th January 2024, 16:25   #1731  |  Link
GCRaistlin
Registered User
 
Join Date: Jun 2006
Posts: 353
Quote:
Originally Posted by VoodooFX View Post
This is not a bug.
Isn't this the actual picture of the recognized letter?

How do I select the audio track in a loaded video file?
__________________
Windows 8.1 x64

Magically yours
Raistlin

Last edited by GCRaistlin; 8th January 2024 at 23:01.
Old 8th January 2024, 17:44   #1732  |  Link
GCRaistlin
Registered User
 
Join Date: Jun 2006
Posts: 353
Audio gets out of sync while playing a video file (6 secs) in SE.
__________________
Windows 8.1 x64

Magically yours
Raistlin

Last edited by GCRaistlin; 8th January 2024 at 17:50.
Old 9th January 2024, 00:07   #1733  |  Link
GCRaistlin
Registered User
 
Join Date: Jun 2006
Posts: 353
Feature requests:
  1. Remember last used section (on the left) in Settings.
  2. Waveform/spectrogram | [ ] On double click, correct end if gap is too short.
    Currently, one has to choose between being able to fix too-small gaps with a mouse click and being able to reduce a gap below the minimum size with the mouse in the waveform.
    With the new option on, a double click on a subtitle's end should correct it if the next gap is smaller than the "Min. gap between subtitles in ms" value. That is, a double click may move the subtitle end to the left, never to the right.
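
The rule requested in item 2 can be sketched as a single function. This only illustrates the request, not existing SE behaviour; the function and parameter names are hypothetical.

```python
def corrected_end_ms(end_ms, next_start_ms, min_gap_ms):
    """Return the subtitle end after a double click under the requested option.

    The end is pulled back only when the gap to the next subtitle is
    smaller than min_gap_ms; it is never pushed later (to the right).
    """
    latest_allowed_end = next_start_ms - min_gap_ms
    return min(end_ms, latest_allowed_end)
```

For example, with a minimum gap of 24 ms, an end at 4990 ms before a subtitle starting at 5000 ms would move to 4976 ms, while an end at 4000 ms would stay where it is.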
__________________
Windows 8.1 x64

Magically yours
Raistlin
Old 14th January 2024, 18:54   #1734  |  Link
Emulgator
Big Bit Savings Now !
 
Join Date: Feb 2007
Location: close to the wall
Posts: 1,545
Small tip to MaestroSBT users:
Subtitle Edit seems to export .ass with font size in float.
But good old Maestro SBT does not know about float.

On import of Subtitle Edit .ass Maestro SBT complains:
Code:
"No styles were found.
Most likely, this is not a SSA V4 file.
If this is a SSA V2 file, you must open it
with Sub Station Alpha and save it as SSA V4."
.ass import from Aegisub 2.1.8 runs fine though.
Comparison of both scripts led me to assume the solution:
If Subtitle Edit exports .ass with font size "32.0", just hand-edit the ".0" away.
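
For files with many styles, the hand edit can be scripted. A minimal sketch, assuming the font size is the third comma-separated field of each "Style:" line (as in the usual ASS style layout); the function name is hypothetical, and whether Maestro SBT then accepts the file is only what the comparison above suggests.

```python
def strip_float_font_sizes(ass_text):
    """Rewrite whole-number float font sizes (e.g. 32.0) as integers (32)."""
    fixed_lines = []
    for line in ass_text.splitlines(keepends=True):
        if line.startswith("Style:"):
            head, _, rest = line.partition(":")
            fields = rest.split(",")
            if len(fields) > 2:
                try:
                    size = float(fields[2])
                except ValueError:
                    size = None
                # Leave genuinely fractional sizes such as 32.5 untouched.
                if size is not None and size.is_integer():
                    fields[2] = str(int(size))
            line = head + ":" + ",".join(fields)
        fixed_lines.append(line)
    return "".join(fixed_lines)
```

Only "Style:" lines are touched, so a "32.0" that happens to appear in dialogue text is left alone.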
__________________
"To bypass shortcuts and find suffering...is called QUALity" (Die toten Augen von Friedrichshain)
"Data reduction ? Yep, Sir. We're that issue working on. Synce invntoin uf lingöage..."
Old 23rd January 2024, 07:05   #1735  |  Link
TDS
Formally known as .......
 
Join Date: Sep 2021
Location: Down Under.
Posts: 995
I have a lot of random problems with Subtitle Edit recognising this symbol. It's really hit and miss: some will be copied, but more often than not they come out as any character it feels like.

And it's very time-consuming to add them manually.
__________________
Long term RipBot264 user.

RipBot264 modded builds..
Old 24th January 2024, 02:31   #1736  |  Link
Emulgator
Big Bit Savings Now !
 
Join Date: Feb 2007
Location: close to the wall
Posts: 1,545
In case you mean OCR within SE: Which OCR machine do you use ?
If Tesseract, which version ?
Or nOCR ? Binary compare ? Google Cloud Vision API ?
__________________
"To bypass shortcuts and find suffering...is called QUALity" (Die toten Augen von Friedrichshain)
"Data reduction ? Yep, Sir. We're that issue working on. Synce invntoin uf lingöage..."
Old 24th January 2024, 03:25   #1737  |  Link
TDS
Formally known as .......
 
Join Date: Sep 2021
Location: Down Under.
Posts: 995
Quote:
Originally Posted by Emulgator View Post
In case you mean OCR within SE: Which OCR machine do you use ?
If Tesseract, which version ?
Or nOCR ? Binary compare ? Google Cloud Vision API ?
I think this was directed to me

I generally use the "Default, based on what is available"

I'm using the latest Tesseract, 5.3.3, SE asked for it after install / update.

So like I said, it's random: sometimes it will recognise all of them, and other times not very many at all.

But I have had this issue for years; it's just that yesterday I was SE'ing a lot of subs with that music character.
__________________
Long term RipBot264 user.

RipBot264 modded builds..
Old 25th January 2024, 13:15   #1738  |  Link
Emulgator
Big Bit Savings Now !
 
Join Date: Feb 2007
Location: close to the wall
Posts: 1,545
I had success once, IIRC with nOCR, and I would suggest trying that.
__________________
"To bypass shortcuts and find suffering...is called QUALity" (Die toten Augen von Friedrichshain)
"Data reduction ? Yep, Sir. We're that issue working on. Synce invntoin uf lingöage..."
Old 25th January 2024, 15:04   #1739  |  Link
TDS
Formally known as .......
 
Join Date: Sep 2021
Location: Down Under.
Posts: 995
Quote:
Originally Posted by Emulgator View Post
I had success once, IIRC with nOCR, and I would suggest trying that.
I tried pretty much every available setting... didn't seem to make much difference

Thanks anyway.
__________________
Long term RipBot264 user.

RipBot264 modded builds..
Old 27th February 2024, 20:38   #1740  |  Link
tormento
Acid fr0g
 
Join Date: May 2002
Location: Italy
Posts: 2,578
I have tried to export a sup to png+xml, but you have to choose the position of every image, even though it is embedded in the sup itself.

My goal is to split the upper and lower parts of each png and OCR them separately, to deal with upper/lower anime subtitles.

Is it possible in any way to export png+xml with the png at 1920x1080 AND proper subtitle positioning?
__________________
@turment on Telegram
All times are GMT +1. The time now is 15:39.

