DVB subtitles
Foofaraw
23rd September 2012, 23:07
Does anybody know of a tool that can extract and/or OCR DVB-T subtitles from .mts streams?
Preferably to .srt.
kalehrl
24th September 2012, 12:29
Try ProjectX.
It works for .ts files, but I'm not sure about .mts.
Subtitle Edit is good for OCRing them.
Foofaraw
24th September 2012, 16:17
Yeah, ProjectX just gives me 30000+ errors (literally).
I haven't tried Subtitle Edit yet.
kalehrl
24th September 2012, 19:30
Foofaraw wrote: "Yeah, ProjectX just gives me 30000+ errors (literally)."
Even though ProjectX doesn't support the H.264 codec, it is possible to extract DVB subtitles from an H.264 .ts file despite the errors.
The subtitles are processed correctly.
The same may apply to .mts files.
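One thing that may differ between the two containers is the packet size: plain .ts files use 188-byte packets, while AVCHD-style .mts/.m2ts files normally prefix each packet with a 4-byte timestamp, giving 192 bytes. A minimal sketch for checking which layout a recording uses, assuming the file starts on a packet boundary (the function name and the `probe` parameter are made up for illustration):

```python
SYNC_BYTE = 0x47  # every MPEG-TS packet starts with this byte


def detect_packet_size(data, probe=5):
    """Guess the packet size from the first few sync bytes.

    Returns 188 for plain TS, 192 for M2TS-style packets
    (4-byte timestamp followed by a 188-byte packet), or None.
    """
    for size, sync_offset in ((188, 0), (192, 4)):
        if len(data) < size * probe:
            continue
        # The sync byte must repeat at a fixed stride in a valid stream.
        if all(data[i * size + sync_offset] == SYNC_BYTE for i in range(probe)):
            return size
    return None
```

A tool that only understands 188-byte packets would see a 192-byte stream as a long run of sync errors, so this is worth ruling out before blaming the subtitles.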
Foofaraw
25th September 2012, 02:32
I tried testing the output in Subtitle Edit and it looks rather flaky, alas.
kalehrl
25th September 2012, 14:33
Try a different colour model:
http://i49.tinypic.com/2q038e1.png
Uros
30th September 2012, 22:46
I wrote a simple tool that can extract DVB-T subtitles from MPEG-TS and OCRs them with the Tesseract 3 engine.
Results are not perfect, but recognition can be improved by training Tesseract for a specific font. Unfortunately, that is quite a complicated process.
You can find this tool here (https://sites.google.com/site/multicasttv/)
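For anyone wondering what the last step of such a pipeline looks like, here is a minimal sketch of writing recognised text out as .srt. It is not taken from the tool above; it only assumes that each subtitle comes with start and end presentation timestamps (PTS) on the 90 kHz MPEG clock. The function names and the `first_pts` parameter are made up for illustration, and PTS wrap-around is ignored.

```python
PTS_CLOCK_HZ = 90_000  # MPEG presentation timestamps tick at 90 kHz


def pts_to_srt_time(pts, first_pts=0):
    """Convert a PTS value to an SRT timestamp (HH:MM:SS,mmm)."""
    ms = (pts - first_pts) * 1000 // PTS_CLOCK_HZ
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def format_srt(cues, first_pts=0):
    """Render (start_pts, end_pts, text) tuples as an SRT document."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        timing = (
            f"{pts_to_srt_time(start, first_pts)} --> "
            f"{pts_to_srt_time(end, first_pts)}"
        )
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # SRT cues are separated by one blank line.
    return "\n".join(blocks)
```

Passing the PTS of the first video frame as `first_pts` keeps the subtitle times relative to the start of the recording rather than to the broadcaster's clock.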
Foofaraw
2nd October 2012, 01:02
Uros wrote: "I wrote a simple tool that can extract DVB-T subtitles from MPEG-TS and OCRs them with the Tesseract 3 engine. Results are not perfect, but recognition can be improved by training Tesseract for a specific font. Unfortunately, that is quite a complicated process. You can find this tool here (https://sites.google.com/site/multicasttv/)"
Good to see new programs, but strangely it didn't detect any subtitles in my file.
Ghitulescu
2nd October 2012, 07:52
I believe you have hard subtitles :) Some call them burnt-in; in other words, the subtitles are part of the image. That would also explain the other issues you have reported.
There are tools that "extract" hard subtitles, but you have to test them yourself on your particular file; see VideoHelp for a list of them.
Foofaraw
2nd October 2012, 17:39
Ghitulescu wrote: "I believe you have hard subtitles :) Some call them burnt-in; in other words, the subtitles are part of the image. That would also explain the other issues you have reported. There are tools that "extract" hard subtitles, but you have to test them yourself on your particular file; see VideoHelp for a list of them."
Given that I recorded the DVB-T file from an antenna, that MediaInfo says it contains several text streams, and that VLC's subtitle menu turns them on or off (or shows teletext), I think you are wrong.
Chetwood
3rd October 2012, 06:48
Ghitulescu wrote: "I believe you have hard subtitles :) some call them burnt-in"
Right, most people call them hardcoded.
Uros wrote: "I wrote a simple tool that can extract DVB-T subtitles from MPEG-TS and OCRs them with the Tesseract 3 engine."
I always thought DVB subs were text only?
Uros
3rd October 2012, 10:54
Chetwood wrote: "I always thought DVB subs were text only?"
The ETSI EN 300 743 standard defines both graphical and text subtitles, but in practice they are almost always bitmaps.
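To check which kind a given stream carries, one can look at the object data segments. The sketch below assumes the input is the run of subtitling segments from a PES packet, that is, the bytes after the data_identifier and subtitle_stream_id; each segment starts with the sync byte 0x0F, and object data segments (type 0x13) carry a 2-bit object_coding_method. The function names are made up for illustration.

```python
SEGMENT_SYNC = 0x0F          # first byte of every subtitling segment
OBJECT_DATA_SEGMENT = 0x13   # segment_type of an object data segment
OBJECT_CODING = {
    0: "bitmap (coded pixels)",
    1: "text (string of characters)",
}


def iter_segments(data):
    """Yield (segment_type, page_id, body) for each subtitling segment."""
    pos = 0
    # Stop at the first byte that is not a segment sync byte
    # (for example the 0xFF end-of-data marker).
    while pos + 6 <= len(data) and data[pos] == SEGMENT_SYNC:
        segment_type = data[pos + 1]
        page_id = (data[pos + 2] << 8) | data[pos + 3]
        length = (data[pos + 4] << 8) | data[pos + 5]
        yield segment_type, page_id, data[pos + 6:pos + 6 + length]
        pos += 6 + length


def object_coding_methods(data):
    """Return the coding method of every object data segment found."""
    methods = []
    for segment_type, _page_id, body in iter_segments(data):
        if segment_type == OBJECT_DATA_SEGMENT and len(body) >= 3:
            # body: object_id (16 bits), version (4 bits),
            # object_coding_method (2 bits), then two more flag bits.
            method = (body[2] >> 2) & 0x03
            methods.append(OBJECT_CODING.get(method, "reserved"))
    return methods
```

If every entry comes back as bitmap, OCR is the only route to a text format such as .srt.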
nikopa
4th October 2012, 17:55
Uros wrote: "I wrote a simple tool that can extract DVB-T subtitles from MPEG-TS and OCRs them with the Tesseract 3 engine. Results are not perfect, but recognition can be improved by training Tesseract for a specific font. Unfortunately, that is quite a complicated process. You can find this tool here (https://sites.google.com/site/multicasttv/)"
Interesting tool.
Finland's MTV3 makes it crash after some time (probably when it encounters the first subtitle?).
Yle subtitles seem to (sometimes) go undetected. It probably stops searching too early (I start recordings ~2 min early).
When they are detected, it seems to work with these.
Uros
4th October 2012, 22:44
Can you upload some samples somewhere?
Some transport streams are missing the PAT and PMT tables, or the interval between these tables is way too big, and without them it doesn't even start searching for subtitles. I think some tools exist that can rebuild these tables.
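To illustrate why the tables matter, here is a rough sketch of how subtitle PIDs can be located. It is not the tool's actual code. It assumes 188-byte packets aligned at offset 0 and PAT/PMT sections that fit in a single packet, and it looks for the DVB subtitling_descriptor (tag 0x59) in the PMT. All function names are made up for illustration.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
SUBTITLING_DESCRIPTOR = 0x59  # DVB subtitling_descriptor tag


def pid_of(pkt):
    return ((pkt[1] & 0x1F) << 8) | pkt[2]


def section_from_packet(pkt):
    """Return the PSI section starting in this packet, or None."""
    starts_unit = pkt[1] & 0x40
    afc = (pkt[3] >> 4) & 0x03            # adaptation_field_control
    if not starts_unit or not (afc & 0x01):
        return None
    pos = 4
    if afc & 0x02:                        # skip the adaptation field
        pos += 1 + pkt[4]
    if pos >= TS_PACKET_SIZE:
        return None
    pos += 1 + pkt[pos]                   # skip the pointer_field
    if pos + 3 > TS_PACKET_SIZE:
        return None
    length = ((pkt[pos + 1] & 0x0F) << 8) | pkt[pos + 2]
    if pos + 3 + length > TS_PACKET_SIZE:
        return None                       # spans packets: not handled here
    return pkt[pos:pos + 3 + length]


def parse_pat(section):
    """Map program_number -> PMT PID (the network PID entry is skipped)."""
    programs = {}
    for i in range(8, len(section) - 4, 4):   # last 4 bytes are the CRC
        number = (section[i] << 8) | section[i + 1]
        if number != 0:
            programs[number] = ((section[i + 2] & 0x1F) << 8) | section[i + 3]
    return programs


def parse_pmt_subtitle_pids(section):
    """Return elementary PIDs that carry a subtitling descriptor."""
    if len(section) < 16:
        return []
    end = len(section) - 4
    i = 12 + (((section[10] & 0x0F) << 8) | section[11])
    pids = []
    while i + 5 <= end:
        pid = ((section[i + 1] & 0x1F) << 8) | section[i + 2]
        es_info_length = ((section[i + 3] & 0x0F) << 8) | section[i + 4]
        j, desc_end = i + 5, min(i + 5 + es_info_length, end)
        while j + 2 <= desc_end:
            if section[j] == SUBTITLING_DESCRIPTOR:
                pids.append(pid)
                break
            j += 2 + section[j + 1]
        i += 5 + es_info_length
    return pids


def find_subtitle_pids(data):
    """Single pass: learn PMT PIDs from the PAT, then scan the PMTs."""
    pmt_pids, found = set(), set()
    for off in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = data[off:off + TS_PACKET_SIZE]
        pid = pid_of(pkt)
        if pkt[0] != SYNC_BYTE or (pid != 0 and pid not in pmt_pids):
            continue
        section = section_from_packet(pkt)
        if section is None:
            continue
        if pid == 0 and section[0] == 0x00:
            pmt_pids.update(parse_pat(section).values())
        elif pid != 0 and section[0] == 0x02:
            found.update(parse_pmt_subtitle_pids(section))
    return sorted(found)
```

Without a PAT the scan never learns which PID carries the PMT, so it returns nothing even when subtitle packets are present in the clip.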
nikopa
5th October 2012, 00:13
Clip 1: http://filecloud.io/4hqpwmz1
It crashes after pressing Start (but the subtitles are detected OK).
Clip 2: http://filecloud.io/eqf6rm09 (this is around 100 MB into the file; I just cut the smallest possible clip for upload)
The subtitle with PID 1050 is not detected. PID 1027 is detected, but it is empty.
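A quick way to see whether a PID actually carries data in a short clip is to count its payload-bearing packets. This is a minimal sketch, assuming 188-byte packets aligned at offset 0; the function name is made up for illustration.

```python
from collections import Counter

TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def count_payload_packets(data):
    """Count packets that carry a payload, grouped by PID."""
    counts = Counter()
    for off in range(0, len(data) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        pkt = data[off:off + TS_PACKET_SIZE]
        if pkt[0] != SYNC_BYTE:
            continue  # not aligned on a packet start; skip this slot
        pid = ((pkt[1] & 0x1F) << 8) | pkt[2]
        # Bit 0x10 of byte 3 says the packet contains payload bytes.
        if pkt[3] & 0x10:
            counts[pid] += 1
    return counts
```

Reading the clip with `open(path, "rb").read()` and looking up `counts[1050]` and `counts[1027]` would show whether either PID has any packets in that cut; a count of zero means there is nothing to decode in the clip itself.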
Ghitulescu
5th October 2012, 08:00
Chetwood wrote: "I always thought DVB subs were text only?"
And in addition to them, there are always the teletext/videotext/Ceefax subtitles; the German broadcasters finally agreed on pages 150 and/or 777.