Old 22nd March 2025, 03:13   #1981  |  Link
markfilipak
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly, Silicon Valley in California)
Posts: 420
In waveform, there really needs to be a way to continuously loop between _any_ two points. Press and hold a key, click point A, click point B, and the waveform is looped, A-B, until the key is released. While the key is held, either point can be dragged and the loop responds.
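The looping behaviour described above could be modelled roughly like this. It is only a sketch, not SE code: the `ABLoop` class and its method names are made up for illustration, and a real player would drive it from its own audio clock.

```python
from dataclasses import dataclass


@dataclass
class ABLoop:
    """Hypothetical A-B loop; both points may change while playing."""
    a: float  # point A, in seconds
    b: float  # point B, in seconds

    def bounds(self):
        # Tolerate A and B being clicked or dragged in either order.
        return min(self.a, self.b), max(self.a, self.b)

    def next_position(self, pos: float, dt: float) -> float:
        """Advance playback by dt seconds, wrapping back towards A after B."""
        start, end = self.bounds()
        if end <= start:
            return start              # empty loop: stay parked on A
        pos += dt
        if pos < start:
            return start              # A was dragged past the play head
        if pos >= end:
            # Wrap around, carrying the overshoot into the next pass.
            return start + (pos - start) % (end - start)
        return pos
```

Because the bounds are re-read on every step, dragging either point while the key is held takes effect on the next tick, which is the "loop responds" part of the request.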
Old 24th March 2025, 23:35   #1982  |  Link
Music Fan
Registered User
 
Join Date: May 2009
Location: Belgium
Posts: 1,761
Hi Nikse,
Is there a way to replace an uppercase letter with a lowercase one when it follows a comma and a space?
For example:
Code:
Hello, How are you Hunter ?
should be replaced by:
Code:
Hello, how are you Hunter ?
This pattern can be matched with:
Code:
(\,\s)([A-Z])
I hoped it could be replaced with the following, but that does not work:
Code:
$1\l$2

Edit: I finally added a rule for each letter:
Code:
(\,\s)(A)
replaced by:
Code:
$1a
...
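
A note on why `$1\l$2` fails: the `\l` case modifier exists in some regex flavours (Perl, for example), but as far as I know it is not supported in .NET replacement strings, which is presumably what SE uses. That is an assumption, not something confirmed in this thread. Outside SE, the same fix can be sketched in Python with a replacement callback (the function name is made up for illustration):

```python
import re

# A comma, one whitespace character, then an ASCII uppercase letter.
PATTERN = re.compile(r"(,\s)([A-Z])")


def lowercase_after_comma(text: str) -> str:
    """Lowercase an uppercase letter that directly follows a comma and a space."""
    return PATTERN.sub(lambda m: m.group(1) + m.group(2).lower(), text)


print(lowercase_after_comma("Hello, How are you Hunter ?"))
# prints: Hello, how are you Hunter ?
```

Note that this lowercases every match, including proper nouns and the pronoun "I" after a comma, so the result still needs a review pass.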

Last edited by Music Fan; 25th March 2025 at 09:56.
Old 1st April 2025, 22:46   #1983  |  Link
GCRaistlin
Registered User
Join Date: Jun 2006
Posts: 372
The default DirectShow video player has an issue with audio sync. The mpv library that SE downloads doesn't work on Windows 8.1 x64; Windows 8.1 users should manually replace it with mpv-dev-x86_64-20240922-git-71f2220.7z.
__________________
Windows 8.1 x64

Magically yours
Raistlin
Old 8th April 2025, 03:46   #1984  |  Link
markfilipak
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly, Silicon Valley in California)
Posts: 420
Faster editing

I'm editing the subtitles for "The Ghost and Mrs. Muir" [1947], DVD. Timing wise, they are a mess. And much of the audio is too soft to see it in waveforms.

With the waveform losing half its resolution (by design), and with no looping function (also by design), I'm stuck. What I want is this: push a key, click point A, click point B, and the audio loops A-to-B; drag A (as the audio loops) to find where an utterance starts, drag B (as the audio loops) to find where it ends, then release the key. Instead, I have to drag A, play, drag A again, play, and so on, then drag B, play, drag B again, play, and so on. Without smarter, better-thought-out functions, editing just takes forever.

I am in despair. I suggest better operations here and get no responses. Does no one give a sh!t?

READ ME: See https://forum.doom9.org/showthread.p...45#post2017445 for the resolution of this issue.

Last edited by markfilipak; 11th April 2025 at 03:54.
Old 8th April 2025, 05:20   #1985  |  Link
TR-7970X
Registered User
Join Date: Jan 2025
Posts: 53
Quote:
Originally Posted by markfilipak View Post
I'm editing the subtitles for "The Ghost and Mrs. Muir" [1947], DVD. Timing wise, they are a mess. And much of the audio is too soft to see it in waveforms.

With the waveform losing half its resolution (by design), and with no looping function (also by design), I'm stuck. What I want is this: push a key, click point A, click point B, and the audio loops A-to-B; drag A (as the audio loops) to find where an utterance starts, drag B (as the audio loops) to find where it ends, then release the key. Instead, I have to drag A, play, drag A again, play, and so on, then drag B, play, drag B again, play, and so on. Without smarter, better-thought-out functions, editing just takes forever.

I am in despair. I suggest better operations here and get no responses. Does no one give a sh!t?
Why don't you download it from somewhere in better resolution than DVD? It may already have the subtitles... or you could get the subtitles from other places (not going to post URLs).

I have had a look, and it's all out there, ready to be got...
__________________
Main Systems:-
Threadripper 7970X on Asus Pro WS TRX50-Sage WiFi
Ryzen 9 9950X3D on MSI Carbon X670E
Ryzen 9 7950X on Gigabyte Aorus Elite B650
Intel 13900KF on MSI Tomahawk B660
Old 8th April 2025, 23:16   #1986  |  Link
markfilipak
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly, Silicon Valley in California)
Posts: 420
Faster editing

Quote:
Originally Posted by TR-7970X View Post
Why don't you download it ...
Good grief. Thank you, but my comment is not about the movie. It's about how poorly thought out SE's editing functions are, and how my suggestions get no response. Correcting subtitle times in waveforms is crude and incredibly tedious because the editing functions are crude.

Last edited by markfilipak; 8th April 2025 at 23:18.
Old 8th April 2025, 23:56   #1987  |  Link
TR-7970X
Registered User
Join Date: Jan 2025
Posts: 53
Quote:
Originally Posted by markfilipak View Post
Good grief. Thank you, but my comment is not about the movie. It's about how poorly thought out SE's editing functions are, and how my suggestions get no response. Correcting subtitle times in waveforms is crude and incredibly tedious because the editing functions are crude.
I guess the reason I didn't make any suggestions is that I don't use SE for what you're trying to do...

I've only just recently started using Whisper....

It's all very time consuming, at the best of times.

Good luck.

Last edited by TR-7970X; 9th April 2025 at 00:09.
Old 9th April 2025, 00:51   #1988  |  Link
VoodooFX
Banana User
Join Date: Sep 2008
Posts: 1,131
Quote:
Originally Posted by markfilipak View Post
I'm editing the subtitles for "The Ghost and Mrs. Muir" [1947], DVD. Timing wise, they are a mess. And much of the audio is too soft to see it in waveforms.
Can you PM me the audio and timestamps where "audio is too soft to see"?
Old 9th April 2025, 02:07   #1989  |  Link
markfilipak
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly, Silicon Valley in California)
Posts: 420
Faster editing

Quote:
Originally Posted by VoodooFX View Post
Can you PM me the audio and timestamps where "audio is too soft to see"?
No, I'm sorry to say that I can't. It's copyrighted video and it's 2.6 GB. Would it do if I posted screen shots with arrows showing where an utterance _actually_ starts and ends but that isn't otherwise obvious?

Sometimes the audio is just a flat line but there are actually several frames of utterance there -- sometimes _seconds_ of utterance. Sometimes the utterance is buried in music, so it's all just jagged. If you've tried to set subtitles precisely (meaning: within 10 frames or so), you've run across this problem. You cannot rely on the waveform to show you where an utterance starts and ends. You have to hear it, and it's best to hear it in a loop and to have the power to move the cues while hearing the loop.

Right now there's no good way to audition an utterance, so there's no good way to set in- and out-cues quickly. I have posted a couple of ways to speed up editing. The latest also has "Faster editing" as the subject. I conservatively estimate that providing that function would speed up editing in the waveform window by at least 10x. My audition between points A & B (looping, with A & B both actively draggable) is not the same as simply looping from in-cue to out-cue. Please read what I wrote and I'm sure you will 'get it'. If you don't 'get it', ask. My proposed method includes a button assignment, clicking A, clicking B, dragging A and/or dragging B while hearing the audition, and releasing the button. That audition is then automatically followed by setting the length of the in-pad and out-pad, with or without an intervening shot change. In other words, everything beautify does, but beautify is incapable of listening to utterances.

READ ME: See https://forum.doom9.org/showthread.p...45#post2017445 for the resolution of this issue.

Last edited by markfilipak; 11th April 2025 at 03:55.
Old 9th April 2025, 02:26   #1990  |  Link
markfilipak
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly, Silicon Valley in California)
Posts: 420
Quote:
Originally Posted by TR-7970X View Post
I guess the reason I didn't make any suggestions is that I don't use SE for what you're trying to do...

I've only just recently started using Whisper....
Cute name. What does it do?

Quote:
It's all very time consuming, at the best of times.
It doesn't have to be so time consuming.

Quote:
Good luck.
Thanks! But luck has little to do with it.
Old 9th April 2025, 03:01   #1991  |  Link
TR-7970X
Registered User
Join Date: Jan 2025
Posts: 53
Quote:
Originally Posted by markfilipak View Post
Cute name. What does it do?
"Whisper" is an add-on for SE that performs an audio-to-text operation; that is, it creates subtitles from audio.

However, if your video/audio isn't "loud" enough, Whisper may not be able to do its job.

I've tried it on a couple of movies that I can't get any subtitles for, and it definitely does a pretty good job...there would be some reviewing & editing, but at least it's a very good start.

https://www.youtube.com/watch?v=4YZ0...el=DavidMbugua

https://www.youtube.com/watch?v=ZDXy...ingwithClaudia

Last edited by TR-7970X; 9th April 2025 at 03:04.
Old 9th April 2025, 05:05   #1992  |  Link
markfilipak
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly, Silicon Valley in California)
Posts: 420
Quote:
Originally Posted by TR-7970X View Post
"Whisper" is an add-on for SE that performs an audio-to-text operation; that is, it creates subtitles from audio.
Ah, that's what I thought. Thanks. And as you note, it wouldn't work with soft utterances. Besides that, of the several hundred subtitles I've done, the videos come with subtitles that I OCR and fix up, so no Whisper. It's those fix ups that take forever with the current SE waveform tools but which could be greatly streamlined.

I've watched quite a few YouTubes, but they weren't useful. All the ones I've seen review how to use SE, not how to deal with difficult subs, and not with how SE can be improved.
Old 9th April 2025, 05:15   #1993  |  Link
TR-7970X
Registered User
Join Date: Jan 2025
Posts: 53
Quote:
Originally Posted by markfilipak View Post
Ah, that's what I thought. Thanks. And as you note, it wouldn't work with soft utterances. Besides that, of the several hundred subtitles I've done, the videos come with subtitles that I OCR and fix up, so no Whisper. It's those fix ups that take forever with the current SE waveform tools but which could be greatly streamlined.


I've watched quite a few YouTubes, but they weren't useful. All the ones I've seen review how to use SE, not how to deal with difficult subs, and not with how SE can be improved.
You could try and run the current video thru Whisper and see what it finds.

I'm actually running it on an old movie that I can't get any subs for, using a "bigger" library/model, and it's taking forever. I hope it finds everything, and accurately too.
Old 9th April 2025, 05:29   #1994  |  Link
markfilipak
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly, Silicon Valley in California)
Posts: 420
Quote:
Originally Posted by TR-7970X View Post
You could try and run the current video thru Whisper and see what it finds.
Oh, that's a very good idea, but I'm very skeptical. I could compare the Whisper subs to the provided subs, but the comparison could only be academic -- not a practical solution, even if it worked. Such a comparison would only take even more time but with no assurance that the in- and out-cues were correct without me listening to them, which is what I'm doing now. I'm not saying that the solution has to be foolproof, only that nothing beats actually listening. It's that listening that I'm trying to optimize.

Quote:
I'm actually running it on an old movie that I can't get any subs for, using a "bigger" library/model, and it's taking forever. I hope it finds everything, and accurately too.
Well, good luck to you!
Old 9th April 2025, 05:39   #1995  |  Link
TR-7970X
Registered User
Join Date: Jan 2025
Posts: 53
Quote:
Originally Posted by markfilipak View Post
Oh, that's a very good idea, but I'm very skeptical. I could compare the Whisper subs to the provided subs, but the comparison could only be academic -- not a practical solution, even if it worked. Such a comparison would only take even more time but with no assurance that the in- and out-cues were correct without me listening to them, which is what I'm doing now. I'm not saying that the solution has to be foolproof, only that nothing beats actually listening. It's that listening that I'm trying to optimize.
Well, good luck to you!
Do you use SE to OCR ??

I generally use gMKVExtractGUI.

I have done a couple of tests with a basic Whisper model, and despite the odd typo or misinterpretation, the timing was pretty good.

I will let you know how this current job turns out. It's STILL going; it's been well over 2 hours for a movie that is 1.5 hours long.

But if it turns out well, then it's better than the alternative, I guess.
Old 9th April 2025, 05:48   #1996  |  Link
TR-7970X
Registered User
Join Date: Jan 2025
Posts: 53
Quote:
Originally Posted by markfilipak View Post
I'm editing the subtitles for "The Ghost and Mrs. Muir" [1947], DVD. Timing wise, they are a mess. And much of the audio is too soft to see it in waveforms.

Does no one give a sh!t?
I just thought of something...

You're saying that the audio is "soft"...what if you extracted the audio and amplified it, and then see if the waveform process works for you !!
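
For what it's worth, the amplification idea amounts to something like the sketch below. It assumes the audio has already been decoded into float samples between -1.0 and 1.0; the function name and the 0.9 target are made up for illustration, and this is not how SE itself draws its waveform.

```python
def amplify_to_peak(samples, target_peak=0.9):
    """Scale float samples (-1.0..1.0) so the loudest one reaches target_peak."""
    samples = list(samples)
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return samples            # pure silence: nothing to scale
    gain = target_peak / peak     # one gain value for the whole track
    return [s * gain for s in samples]


quiet = [0.01, -0.02, 0.005]
print(amplify_to_peak(quiet))
```

A single gain like this raises music and noise along with the speech, so it only helps where the dialogue is quiet but otherwise clean.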
Old 9th April 2025, 05:50   #1997  |  Link
markfilipak
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly, Silicon Valley in California)
Posts: 420
Quote:
Originally Posted by TR-7970X View Post
Do you use SE to OCR ??
Yes. I'm satisfied with it. Not perfect, but very good. Kudos to Nik.
Quote:
I generally use gMKVExtractGUI.
I package solely MP4. MKV has a 1 kHz clock, and that leads to too many problems.
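
To illustrate the 1 kHz point: with 1 ms timestamps (Matroska's default timestamp scale, as far as I know), frame times at a rate such as 24000/1001 fps cannot be stored exactly and get rounded. A minimal sketch, assuming that default scale and plain round-to-nearest (real muxers may behave differently, and the frame rate is only an example):

```python
from fractions import Fraction

FPS = Fraction(24000, 1001)   # example frame rate (NTSC film)
TICK = Fraction(1, 1000)      # 1 ms timestamp resolution (assumed default)


def exact_time(frame: int) -> Fraction:
    """Exact presentation time of a frame, in seconds."""
    return frame / FPS


def stored_time(frame: int) -> Fraction:
    """The same time rounded to the nearest whole tick."""
    return round(exact_time(frame) / TICK) * TICK


for n in range(1, 4):
    error_us = float((stored_time(n) - exact_time(n)) * 1_000_000)
    print(f"frame {n}: exact {float(exact_time(n)):.6f} s, "
          f"stored {float(stored_time(n)):.3f} s, error {error_us:+.1f} us")
```

Each stored time is off by at most half a millisecond, but the error varies from frame to frame, so cue times no longer land exactly on frame boundaries.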

Last edited by markfilipak; 9th April 2025 at 07:34.
Old 9th April 2025, 07:28   #1998  |  Link
markfilipak
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly, Silicon Valley in California)
Posts: 420
Faster editing

Quote:
Originally Posted by TR-7970X View Post
I just thought of something...

You're saying that the audio is "soft"...what if you extracted the audio and amplified it, and then see if the waveform process works for you !!
1) I would have to make the louder audio.
2) I would have to mux the louder audio into the movie at the beginning, and mux it out at the end.
3) Doing so would not improve the situation -- I still have to listen -- and would only add more time to the effort.

The problem isn't that I can't hear the utterances. The problem is that I can't see the actual start and end of the utterances. That's at least partly because the waveform could have twice its current resolution, but doesn't.

The solution is one that facilitates setting in- and out-cues while simultaneously listening, and doing so much more rapidly than is currently possible.

Compare these methods:

Current SE: A is an in-cue, B is an out-cue. Audio is the sub.
Click-drag A, press a key to listen to A plus a little bit, release key.
Click-drag A again, repeat the hunt until A coincides with the start of the utterance.
Click-drag B, press a key to listen to the whole subtitle in order to hear the end, release key.
Click-drag B again, repeat the hunt until B coincides with the end of the utterance.
Manually add in-padding and out-padding by again dragging A, and again dragging B.
It takes many clicks, many drags, and many listen-key presses to accomplish this.

Proposed SE: A is an out-cue, B is an in-cue. Audio is the space between subs.
Press and hold a key, click A, click B, (SE continuously loops A-to-B).
Click-drag A while audio loops and drop it where utterance A ends.
Click-drag B while audio loops and drop it where utterance B begins.
Release key, (SE automatically adds out- and in-padding while taking shot changes into account).
It takes one mode-key press-and-hold, two clicks, and two drags to accomplish this.

You see, the proposed is not editing subs, it's editing the spaces between subs!

Large gaps between subs exist of course. For them, set A & B using the current, hunting method, above. However, small gaps greatly outnumber large gaps in real videos, so the proposed will work in the vast majority of cases.
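
The "release key" step of the proposed method could work along these lines. This is a sketch only: the pad lengths, the shot-change rule and the function name are my own assumptions for illustration, not anything SE currently does.

```python
def pad_gap(a_end, b_start, out_pad=0.3, in_pad=0.1, shot_changes=()):
    """Return (out_cue, in_cue) for the gap between two utterances.

    a_end:        where utterance A was heard to end (seconds)
    b_start:      where utterance B was heard to begin (seconds)
    shot_changes: shot-change times in seconds, in any order
    The pad values and the shot-change rule are illustrative assumptions.
    """
    if b_start < a_end:
        raise ValueError("utterance B must not start before utterance A ends")
    out_cue = a_end + out_pad     # let subtitle A linger a little
    in_cue = b_start - in_pad     # bring subtitle B in slightly early
    cuts = [t for t in shot_changes if a_end <= t <= b_start]
    if cuts:
        # Assumed rule: A must not outlast the first cut in the gap,
        # and B must not appear before the last one.
        out_cue = min(out_cue, min(cuts))
        in_cue = max(in_cue, max(cuts))
    if out_cue > in_cue:
        # Gap too short for both pads: meet in the middle.
        out_cue = in_cue = (a_end + b_start) / 2
    return out_cue, in_cue


print(pad_gap(10.0, 11.0))
print(pad_gap(10.0, 11.0, shot_changes=[10.2]))
```

The point of the design is that the only human input is the two heard positions; everything after that is arithmetic that can run the moment the key is released.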

READ ME: See https://forum.doom9.org/showthread.p...45#post2017445 for the resolution of this issue.

Last edited by markfilipak; 11th April 2025 at 03:56.
Old 9th April 2025, 07:39   #1999  |  Link
TR-7970X
Registered User
Join Date: Jan 2025
Posts: 53
Quote:
Originally Posted by markfilipak View Post
1) I would have to make the louder audio.
2) I would have to mux the louder audio into the movie at the beginning, and mux it out at the end.
3) Doing so would not improve the situation -- I still have to listen -- and would only add more time to the effort.

The problem isn't that I can't hear the utterances. The problem is that I can't see the actual start and end of the utterances. That's at least partly because the waveform could have twice its current resolution, but doesn't.

The solution is one that facilitates setting in- and out-cues while simultaneously listening, and doing so much more rapidly than is currently possible.

Compare these methods:

Current SE: A is an in-cue, B is an out-cue. Audio is the sub.
Click-drag A, press a key to listen to A plus a little bit, release key.
Click-drag A again, repeat the hunt until A coincides with the start of the utterance.
Click-drag B, press a key to listen to the whole subtitle in order to hear the end, release key.
Click-drag B again, repeat the hunt until B coincides with the end of the utterance.
Manually add in-padding and out-padding by again dragging A, and again dragging B.
It takes many clicks, many drags, and many listen-key presses to accomplish this.

Proposed SE: A is an out-cue, B is an in-cue. Audio is the space between subs.
Press and hold a key, click A, click B, (SE continuously loops A-to-B).
Click-drag A while audio loops and drop it where utterance A ends.
Click-drag B while audio loops and drop it where utterance B begins.
Release key, (SE automatically adds out- and in-padding while taking shot changes into account).
It takes one mode-key press-and-hold, two clicks, and two drags to accomplish this.

You see, the proposed is not editing subs, it's editing the spaces between subs!

Large gaps between subs exist of course. For them, set A & B using the current, hunting method, above. However, small gaps greatly outnumber large gaps in real videos, so the proposed will work in the vast majority of cases.
Well, now that you've put it that way, it does sound like a LOT of extra work.

However, I thought I saw that you can export the audio to a text file....and also grab the subs from just the audio track.

I ended up stopping that Whisper run, @ 4 hours, it kept what it had done, and it got up to just over an hour thru the movie, there was a lot of extra stuff generated (not needed), but the timing was pretty good, and there weren't too many typos.

I'm going to try a different library/model...

Has the author of SE got a "git" page ??? maybe you need to post your concerns there, not here....

I might try Whisper on the "The Ghost and Mrs Muir" that I got the other day, even tho it came with subs.
Old 9th April 2025, 16:02   #2000  |  Link
markfilipak
Registered User
Join Date: Jul 2016
Location: Mansfield, Ohio (formerly, Silicon Valley in California)
Posts: 420
Quote:
Originally Posted by TR-7970X View Post
Well, now that you've put it that way, it does sound like a LOT of extra work.
Yes. Going through a 2-hour movie while checking and correcting timing can take a full day. Setting accurate in- and out-cue times is very important for making subtitles that flow well and are therefore easy to read. I try to match the pace of the utterances. In a well-made movie, dialog has a certain pacing that expresses the mood that the director intends. I have found that when the cues match that pacing, the subtitles almost magically become easier to read and understand. It's quite amazing.

Quote:
However, I thought I saw that you can export the audio to a text file...
Yes. I save subtitles in SRT format -- that's text. SRT is easy to mux-merge into a package stream like MP4, via FFmpeg.
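
For anyone who hasn't looked inside one, an SRT cue is just an index line, a time line and the text. A small sketch that writes one cue (the helper names are made up for illustration):

```python
def srt_time(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def srt_cue(index: int, start: float, end: float, text: str) -> str:
    """Build one SRT cue: index line, time line, text line."""
    return f"{index}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n"


print(srt_cue(1, 1.0, 2.5, "Hello, how are you Hunter ?"))
```

In a complete file, cues are separated by a blank line.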

Quote:
and also grab the subs from just the audio track.
It seems that all movies and TV shows after about the year 2000 include subtitles. So, no, I haven't had to make subs from just an audio track. I have some very old DVDs that don't have subtitles but I just leave them be.

Quote:
I ended up stopping that Whisper run, @ 4 hours, it kept what it had done, and it got up to just over an hour thru the movie, there was a lot of extra stuff generated (not needed), but the timing was pretty good, and there weren't too many typos.
Well, that's good to know. May I ask: How do you know the timing was pretty good?

Quote:
Has the author of SE got a "git" page ???
Yes (https://github.com/SubtitleEdit/subtitleedit), and a web site (https://www.nikse.dk/subtitleedit), too.
Quote:
maybe you need to post your concerns there, not here....
Doom9 is for discussion. I appreciate discussion of proposed changes. I think discussion makes for better applications like SE. SE needs to be more interactive than it is now. I've appreciated your thoughts.