Possible to OCR Japanese subtitles??


kpic
12th May 2010, 23:09
I was wondering whether this is possible.

I have a movie with Japanese subtitles, and I know I can rip the subtitles from the DVD to .sup format. I have Systran translator on my work computer. I had never used it before, but today I played around with it, and it does offer Japanese-to-English translation. However, I would need to copy\paste the text into it. (I know the result will not be perfect, but it would get me in the ballpark.)

So I guess I would need to OCR the subtitles to text (.srt) somehow? I imagine that would be a problem?

LUCHOO
13th May 2010, 12:43
Use SubRip to extract the subtitles and save them in Unicode; then you can translate them at your own pace.
You will also need a keyboard set up for Japanese, and it is very tedious.

http://h.imagehost.org/t/0241/Subrip_kanji.jpg (http://h.imagehost.org/view/0241/Subrip_kanji)
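
Once the subtitles are saved as a Unicode .srt, pulling out just the dialogue for copy\paste into a translator can be scripted. The following is only a minimal sketch, assuming the file was saved as UTF-8 and follows the usual .srt layout (index line, timing line, then text lines); the function names and the file name in the usage note are made up for illustration and are not part of SubRip.

```python
import re
from typing import List


def srt_text_lines(srt_content: str) -> List[str]:
    """Return only the dialogue lines from SRT-formatted text."""
    # Normalise Windows line endings so blank-line splitting works.
    content = srt_content.replace("\r\n", "\n").strip()
    lines: List[str] = []
    for block in re.split(r"\n\s*\n", content):
        parts = block.splitlines()
        # Assumed layout: index, "start --> end" timing line, then text.
        if len(parts) >= 3 and "-->" in parts[1]:
            lines.extend(p.strip() for p in parts[2:] if p.strip())
    return lines


def read_srt(path: str) -> str:
    """Read an .srt file assumed to be UTF-8 (a leading BOM is skipped)."""
    with open(path, encoding="utf-8-sig") as f:
        return f.read()
```

For example, `print("\n".join(srt_text_lines(read_srt("movie.srt"))))` would print the subtitle text without numbers or timings, ready to paste. If the file was saved in a different encoding, the `encoding` argument would need to change to match.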

kpic
13th May 2010, 22:45
OK, thanks for the pic. I know OCR is difficult so I am prepared for that.
When you mention saving as Unicode, would that be the "Code Page" setting, and would I save as UTF-7 or UTF-8? And would I then be able to open this Unicode file as text? I ask because I will need to copy\paste it.

If I use a Japanese\English keyboard, do I need Microsoft IME running? I'm not really sure what the IME is, but I read that I would need it.

Thanks!

06_taro
22nd May 2010, 18:20
There's a tool named IdxSubOCR, which uses the MS Office MODI engine to OCR English, Chinese (Traditional & Simplified), Japanese, Korean and many other languages, provided the MODI module for the language has been installed. (Only the module for your default MS Office language is installed, so you have to install other languages manually if you want to use them for OCR.) Though the GUI is in Chinese, it is very simple to use.