Machine Transcription from videos


Lincoln Burrows
19th December 2011, 18:32
YouTube can machine-transcribe a video if the language is English, as you can see in this image:

http://i.imgur.com/VDy62.png

We can download the caption track, captions.sbv, and it looks something like this:

0:00:00.819,0:00:06.160
text

0:00:06.160,0:00:08.449
text

However, YouTube cannot do the same for other languages. And of course, we still need to edit the subtitle to put in the correct sentences.
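As a rough illustration of the layout shown above, here is a minimal Python sketch that reads and writes that kind of timing line. It is based only on the two cues in this post, not on a formal specification, and the names `parse_sbv`, `format_timestamp` and `TIMING_RE` are made up for this example.

```python
import re

# Matches a timing line such as "0:00:00.819,0:00:06.160".
# Pattern inferred from the sample in this post (assumption, not a spec).
TIMING_RE = re.compile(
    r"^(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})$"
)


def to_ms(h, m, s, ms):
    """Convert hour/minute/second/millisecond strings to integer milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def format_timestamp(total_ms):
    """Format integer milliseconds as H:MM:SS.mmm."""
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h}:{m:02d}:{s:02d}.{ms:03d}"


def parse_sbv(text):
    """Return a list of (start_ms, end_ms, caption_text) tuples."""
    cues = []
    # Cues are separated by blank lines; the first line of each is the timing.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if not lines:
            continue
        match = TIMING_RE.match(lines[0].strip())
        if not match:
            continue  # skip anything that does not start with a timing line
        g = match.groups()
        cues.append((to_ms(*g[:4]), to_ms(*g[4:]), "\n".join(lines[1:])))
    return cues
```

Working in integer milliseconds avoids rounding surprises when the timecodes are written back out.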

What I need is software that can do the same, but it doesn't even have to fill in the text, only give me the timecodes for each line.

For example, I have a 5-minute video where the language is Portuguese.

And I already have the full text of what the people are saying.

I need this software to give me something like this:

0:00:00.819,0:00:06.160
???????

0:00:06.160,0:00:08.449
???????

etc.

It would produce this by examining each video.

And then I would fill this in using Notepad, because it's a little harder to guess when each sentence should appear on the screen and fade from it.

Any ideas?

kypec
20th December 2011, 12:53
You can try the latest SubtitleEdit (http://www.nikse.dk/SubtitleEdit.zip); it should support the *.sbv subtitle format and allow you to open & edit it directly ;)

Lincoln Burrows
20th December 2011, 16:54
Thanks, but you can use any editor, like Notepad++, to edit the *.sbv subtitle format.

That's not the issue here. The issue is finding a way to generate timecodes (if that's what you call them) for each line, after checking the video.

You see, since YouTube doesn't recognize other languages, I will need to create this thing > 0:00:06.160 > 0:00:08.449 several times, and it would take some time to adjust the whole thing and check whether the subtitle lines are out of sync.

smok3
20th December 2011, 18:13
I think there is a commercial solution that can add timecodes to transcriptions (but that's probably not what you are looking for?).

Lincoln Burrows
21st December 2011, 01:32
It doesn't have to be free. If you guys know of any software that can do that, please feel free to post it in this thread.

Unfortunately it takes considerable time to create timecodes from scratch and add them to subtitles. It doesn't matter whether the transcription is correct; I don't even need the software to fill in the text, just to create the timecodes for when the people in the video are talking.
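For what it's worth, one simple way to get "timecodes only" is to treat any stretch of audio that is loud enough as speech. The sketch below is just that idea in Python, not what YouTube does and not a real speech recognizer. It assumes you already have a list of per-frame loudness values for the audio track (how you extract them is left open), and `find_speech_segments`, `sbv_skeleton`, `threshold` and `min_gap_frames` are hypothetical names and parameters chosen for the example.

```python
def find_speech_segments(levels, frame_ms, threshold, min_gap_frames=3):
    """Return (start_ms, end_ms) spans where loudness is at or above threshold.

    Quiet gaps shorter than min_gap_frames are bridged, so a brief pause
    inside a sentence does not split it into two cues.
    """
    segments = []
    start = None  # index of the first loud frame of the current span
    quiet = 0     # consecutive quiet frames since the last loud frame
    for i, level in enumerate(levels):
        if level >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_gap_frames:
                end = i - quiet + 1  # frame right after the last loud frame
                segments.append((start * frame_ms, end * frame_ms))
                start = None
                quiet = 0
    if start is not None:  # audio ended while a span was still open
        end = len(levels) - quiet
        segments.append((start * frame_ms, end * frame_ms))
    return segments


def format_timestamp(total_ms):
    """Format integer milliseconds as H:MM:SS.mmm."""
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h}:{m:02d}:{s:02d}.{ms:03d}"


def sbv_skeleton(segments, placeholder="???????"):
    """Build .sbv-style text with a placeholder line under each timecode."""
    blocks = [
        f"{format_timestamp(a)},{format_timestamp(b)}\n{placeholder}"
        for a, b in segments
    ]
    return "\n\n".join(blocks) + "\n" if blocks else ""
```

The placeholder lines can then be replaced by hand with the real sentences. Background music or noise will fool a plain loudness threshold, so the timecodes would still need checking against the video.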

smok3
21st December 2011, 09:16
Try contacting these guys (http://www.zemanta.com/). The tech they use to develop is/was called "odprti kop"; I have no clue what the status of this thing is now.