Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
#1 | Link |
Registered User
Join Date: Feb 2009
Posts: 11
|
Converting DVB subtitles to text
Hello
I need to convert DVB subtitles to text format for a project that I am making. I have been trying for a while, and this is what I currently do: demux the videos with ProjectX to .SUP files (bitmaps, like DVD subtitles), then run OCR on them. The best tool so far is DVDSubEdit, which uses GOCR for its OCR. It still has problems with a few characters, however, and I can't correct them all by hand because I need to rip a lot of subs. Does anyone have ideas for a different solution I could use, or for how I can improve the OCR, maybe by improving the bitmaps themselves? |
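One way to approach the "improve the bitmaps themselves" idea is to clean up each subtitle image before it reaches the OCR engine. The following is only a minimal sketch, not something tested against GOCR or DVDSubEdit: it assumes each subtitle has already been exported as rows of grayscale values (decoding the .SUP file is not shown), and the function names `binarize`, `upscale` and `preprocess`, as well as the default threshold of 128 and factor of 2, are hypothetical choices for illustration.

```python
from typing import List

# Assumption: a bitmap is a list of rows of grayscale values, 0 (black) to 255 (white).
Bitmap = List[List[int]]


def binarize(bitmap: Bitmap, threshold: int = 128) -> Bitmap:
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if px >= threshold else 0 for px in row] for row in bitmap]


def upscale(bitmap: Bitmap, factor: int = 2) -> Bitmap:
    """Enlarge the bitmap by an integer factor using nearest-neighbour sampling."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    scaled: Bitmap = []
    for row in bitmap:
        # Repeat each pixel horizontally, then repeat the widened row vertically.
        wide_row = [px for px in row for _ in range(factor)]
        scaled.extend(list(wide_row) for _ in range(factor))
    return scaled


def preprocess(bitmap: Bitmap, threshold: int = 128, factor: int = 2) -> Bitmap:
    """Binarize first, then upscale, so the OCR engine sees crisp, larger glyphs."""
    return upscale(binarize(bitmap, threshold), factor)
```

Whether this actually reduces the error rate depends on the font and on how the broadcaster draws outlines and anti-aliasing, so the threshold and scale factor would need to be tuned on real samples.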
#3 | Link |
Guest
Join Date: Jan 2002
Posts: 21,906
|
|
#4 | Link |
Registered User
Join Date: Mar 2009
Location: Germany
Posts: 5,769
|
Thank you for the link, I have the document already. I was asking for "his" opinion ...
Having worked with DVB subtitles for some time, I have noticed that most people simply don't know how to set up ProjectX, and that TXT subtitles are text-based anyway, so no OCR is needed, unlike the PGS ones. ProjectX can extract both of them as SUP. |
#5 | Link |
Guest
Join Date: Jan 2002
Posts: 21,906
|
In my experience the majority of broadcast subtitles are bitmaps.
I write subtitle drivers for set-top boxes and I have never seen text subtitles. The reason is that multiple languages must be supported, and it is easier to just send bitmaps than to implement font rendering in the set-top box. Last edited by Guest; 19th May 2010 at 15:01. |
#6 | Link |
Registered User
Join Date: Mar 2009
Location: Germany
Posts: 5,769
|
You are US-based.
Within the EU, on the other hand, only the Nordics have PGS (also TXT), followed by the UK and France, and principally for HD: the subtitles for SD are kept in the old(?) format to ensure compatibility with the installed TV base (many/most cable operators simply feed the SAT signal into their network). |
#7 | Link | |
Guest
Join Date: Jan 2002
Posts: 21,906
|
Quote:
Let's wait for the OP to tell us what he actually has in his source material. If he would like to post a sample, we can see what he actually has. Last edited by Guest; 19th May 2010 at 14:39. |
|
#8 | Link |
Registered User
Join Date: Mar 2006
Posts: 1,049
|
The BBC transmits both types (TXT and DVB), including dynamically typed subtitles for live content (for DVB this is quite rare, and it seems that some vendors have problems displaying them properly). BBC subtitles are also rich in attributes (especially colors, sometimes with several changes in one line).
By the way, I am looking for an open-source solution to create DVB subtitles directly (without an intermediate DVD subtitle phase), ideally from a series of pictures. Some limited capabilities are implemented in VLC, but the documentation is quite vague... |
#9 | Link | |
Registered User
Join Date: Mar 2006
Posts: 1,049
|
Quote:
Maybe try the ABBYY OCR solution, for example? http://www.abbyy.com/ (one of the best in my opinion, if not the best of all) |
|
#10 | Link |
Registered User
Join Date: Feb 2009
Posts: 11
|
Sorry about the late response, I haven't looked at this for a while.
I was referring to DVB *bitmap* subtitles, which are the only type available on Freeview television (not Freesat). I found a script to demux and OCR them, but it takes a while and the output is not completely free of errors. It's a real pain. |
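Since hand-correcting every file is not practical, one option is to run the OCR output through a small rule-based cleanup pass. The sketch below is only an illustration and makes several assumptions: that the subtitles are in English, that the confusion between capital "I" and lowercase "l" is among the recurring errors (the thread does not say which characters fail), and that the rule list `_RULES` and the function `fix_ocr_line` are hypothetical names, not part of any existing tool.

```python
import re

# Hypothetical rules for one common OCR confusion: capital "I" vs lowercase "l".
# Assumes English text; a language such as French ("l'homme") would need other rules.
_RULES = [
    # A lone "l" as a word, or an "l" right before an apostrophe ("l'm"), becomes "I".
    (re.compile(r"\bl(?=\s|'|$)"), "I"),
    # A capital "I" sandwiched between two lowercase letters becomes "l".
    (re.compile(r"(?<=[a-z])I(?=[a-z])"), "l"),
]


def fix_ocr_line(line: str) -> str:
    """Apply each correction rule in order to one line of OCR output."""
    for pattern, replacement in _RULES:
        line = pattern.sub(replacement, line)
    return line
```

For example, `fix_ocr_line("l think l'm stiIl here")` returns `"I think I'm still here"`. Rules like these are best built up from the errors actually observed in a sample of the OCR output, and checked so that they do not damage lines that were already correct.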
Tags |
dvb, dvdsubedit, ocr, subtitles, television |