Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
3rd October 2010, 22:33 | #1 | Link |
Registered User
Join Date: Sep 2010
Posts: 1
|
Announcing VobSub2SRT: convert .sub/.idx to .srt on Linux
VobSub2SRT is a Linux command-line tool that converts VobSub (.idx/.sub) subtitles into the .srt subtitle format. It is based on mplayer's vobsub code and uses tesseract for the OCR part.
You can get the source and manual at http://github.com/ruediger/VobSub2SRT. The quality of the OCR depends heavily on the quality of the subtitles. I'm currently planning to add some preprocessing features (like rescaling) to improve the OCR accuracy.
I'm developing VobSub2SRT on Kubuntu (currently 10.04), but it should work on other Linux systems as well (and maybe even Mac OS X). To build vobsub2srt on Ubuntu use
Code:
sudo apt-get install libavutil-dev tesseract-ocr-dev tesseract-ocr-eng build-essential cmake
./configure
make
sudo make install
To convert subtitles, run
Code:
vobsub2srt Filename
I hope this tool is useful to you. Please give me some feedback.
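Since vobsub2srt is called once per subtitle file, a small wrapper can cover a whole directory. This is only a sketch: the function name convert_all and the VOBSUB2SRT override variable are made up for illustration, and it assumes the tool expects the file name without the .idx/.sub extension, which the post above does not state, so check the manual first.

```shell
#!/bin/sh
# Hypothetical helper, not part of VobSub2SRT.
# VOBSUB2SRT can be overridden, e.g. VOBSUB2SRT=echo for a dry run.
VOBSUB2SRT="${VOBSUB2SRT:-vobsub2srt}"

# convert_all DIR: run the converter on every .idx/.sub pair in DIR.
convert_all() {
    dir="${1:-.}"
    for idx in "$dir"/*.idx; do
        [ -e "$idx" ] || continue        # glob matched nothing
        base="${idx%.idx}"               # strip the extension (assumption: tool wants the bare name)
        if [ ! -e "$base.sub" ]; then
            echo "skipping $idx: no matching .sub file" >&2
            continue
        fi
        "$VOBSUB2SRT" "$base"
    done
}
```

After sourcing the file, something like convert_all /path/to/subtitles would process each pair in turn. Setting VOBSUB2SRT=echo first prints the names that would be passed without running any OCR.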
26th January 2011, 18:13 | #3 | Link |
Registered User
Join Date: Nov 2003
Posts: 3
|
compiling on mac osx
Hi,
I am running Mac OS X 10.6.6 on an early 2008 MacBook Pro. I have installed MacPorts and have installed tesseract from its repositories. The version of tesseract seems to be 3.0:
Code:
cn-b204-2:ruediger-VobSub2SRT-e46e81a bn$ tesseract -v
tesseract 3.00
When configuring, I get the following error:
Code:
cn-b204-2:ruediger-VobSub2SRT-e46e81a bn$ ./configure
-- The C compiler identification is GNU
-- The CXX compiler identification is GNU
-- Checking whether C compiler has -isysroot
-- Checking whether C compiler has -isysroot - yes
-- Checking whether C compiler supports OSX deployment target flag
-- Checking whether C compiler supports OSX deployment target flag - yes
-- Check for working C compiler: /opt/local/bin/gcc
-- Check for working C compiler: /opt/local/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Checking whether CXX compiler has -isysroot
-- Checking whether CXX compiler has -isysroot - yes
-- Checking whether CXX compiler supports OSX deployment target flag
-- Checking whether CXX compiler supports OSX deployment target flag - yes
-- Check for working CXX compiler: /opt/local/bin/c++
-- Check for working CXX compiler: /opt/local/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Source: /Users/bn/Movies/ruediger-VobSub2SRT-e46e81a
-- Binary: /Users/bn/Movies/ruediger-VobSub2SRT-e46e81a/build
-- Build type: Debug
-- checking for module 'libavutil'
-- found libavutil, version 50.15.1
-- Found Tesseract: Tesseract_LIBRARIES-NOTFOUND;/opt/local/lib/libtiff.dylib
CMake Error: The following variables are used in this project, but they are set to NOTFOUND.
Please set them or make sure they are set and tested correctly in the CMake files:
Tesseract_LIBRARIES (ADVANCED)
    linked by target "vobsub2srt" in directory /Users/bn/Movies/ruediger-VobSub2SRT-e46e81a/src
-- Configuring incomplete, errors occurred!
/opt/local/share/tessdata in a file named eng.traineddata.
Are there any ways to get this compiled on my MacBook?
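The error above says CMake could not resolve Tesseract_LIBRARIES, so a first step might be to check whether a Tesseract library file exists under the MacPorts prefix at all. The sketch below is a guess at how to do that: find_tesseract_lib is a made-up name, /opt/local is taken from the paths in the log, and the libtesseract* file name pattern is an assumption about how the port names its libraries.

```shell
#!/bin/sh
# Hypothetical diagnostic helper, not part of VobSub2SRT or MacPorts.
# find_tesseract_lib [PREFIX]: print the first Tesseract library found
# under PREFIX/lib, or return non-zero if there is none.
find_tesseract_lib() {
    prefix="${1:-/opt/local}"            # MacPorts default prefix, as seen in the log
    for f in "$prefix"/lib/libtesseract*.dylib \
             "$prefix"/lib/libtesseract*.a \
             "$prefix"/lib/libtesseract*.so; do
        if [ -e "$f" ]; then             # skip patterns that matched nothing
            echo "$f"
            return 0
        fi
    done
    return 1
}
```

If this prints a path, CMake cache variables can generally be preset with -D, for example cmake -DTesseract_LIBRARIES="$(find_tesseract_lib)" run from the build directory. Whether this project's CMake files accept that override, or whether ./configure forwards such options, is not confirmed anywhere in this thread.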
26th January 2011, 22:36 | #4 | Link |
Registered User
Join Date: Nov 2003
Posts: 3
|
problems with german language?
Hi,
I am also trying to use your program to convert German subtitles. This is on a Linux amd64 machine running Ubuntu 10.10; the version of tesseract is 2.04-2. The German language files are installed under /usr/share/tesseract-ocr/tessdata, and their names start with deu. When running vobsub2srt with --lang de, I get the following:
Code:
vobsub2srt --lang de --verbose output
VobSub: Can't open IFO file
vobsub: ignoring size: 720x576
vobsub: ignoring palette: bbe20c, 0ba7cc, 101010, eaeaea, 438143, ec14ed, ebff0b, 0d617a, 7b7b7b, d1d1d1, 7b2a0e, 0d790d, 0ce60b, eaeaea, bc5a38, bbd838
vobsub: ignoring forced subs: OFF
[vobsub] subtitle (vobsubid): 0 language de
Index Count: 1
Id: 0 Lang: <no id>
Selected VOBSUB language: 0 language: de
Unable to load unicharset file /usr/share/tesseract-ocr/tessdata/ger.unicharset
What could I do to make it work? I already tried installing tesseract 3.00, but this seems to be incompatible with vobsub2srt.
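The log shows vobsub2srt asking for ger.unicharset while the installed files start with deu, so it may help to list which language codes the tessdata directory actually contains. This is a minimal sketch with made-up function names (list_tess_langs, has_tess_lang). It assumes language data is stored as CODE.unicharset (as in the error message) or CODE.traineddata (the file name mentioned earlier in the thread).

```shell
#!/bin/sh
# Hypothetical helpers, not part of VobSub2SRT or Tesseract.

# list_tess_langs DIR: print each language code that has data in DIR, one per line.
list_tess_langs() {
    dir="$1"
    for f in "$dir"/*.unicharset "$dir"/*.traineddata; do
        [ -e "$f" ] || continue          # glob matched nothing
        name="${f##*/}"                  # drop the directory part
        echo "${name%%.*}"               # keep the text before the first dot
    done | sort -u
}

# has_tess_lang DIR CODE: succeed if data for CODE exists in DIR.
has_tess_lang() {
    list_tess_langs "$1" | grep -qxF "$2"
}
```

On the setup described above, has_tess_lang /usr/share/tesseract-ocr/tessdata ger would presumably fail and the same call with deu would succeed, which would match the error message. The sketch only confirms the mismatch; it does not say which code vobsub2srt itself accepts.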
Tags |
linux, ocr, srt, vobsub |