Old 2nd May 2010, 08:06   #1  |  Link
Hixie
Registered User
 
Join Date: Apr 2010
Posts: 5
Heads-up regarding HTML5 and subtitles

Hello,

As part of the HTML5 work at the WHATWG and the W3C, we've added a <video> element to HTML, as you may have heard. Part of that involves adding support for external subtitles.

After doing stoopid amounts of research on existing formats, and after collecting various images and videos showing what we need to support, I've come to the conclusion that the simplest thing to do is to extend SRT to support the few things we need that it doesn't already do (inline italics, bold, and ruby, relative positioning, voice annotation so that the different speakers can be coloured whatever the user's preferred colours are, and karaoke annotations), and work from there. You can see some of the work so far here on our wiki: http://wiki.whatwg.org/wiki/Timed_tracks

If anyone has any feedback, advice, or sees anything we've missed, please feel free to post to the WHATWG mailing list, e-mail me directly at ian@hixie.ch, or of course post on this thread.

Thanks!
Old 2nd May 2010, 08:09   #2  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Why not stick to an existing standard instead of making yet another custom format mutually incompatible with everything else out there and thus unable to make use of the many, many high quality tools that already exist?

You'd be better off picking SSA or ASS and supporting a small subset as opposed to creating yet another entirely new format. Doing something like this is just a recipe for disaster, making it all the more likely that support will be too flaky across modern browsers for anyone to actually use.

Last edited by Dark Shikari; 2nd May 2010 at 08:15.
Old 2nd May 2010, 09:25   #3  |  Link
Hixie
Registered User
 
Join Date: Apr 2010
Posts: 5
The browsers will have to implement something either way; I don't think it matters for those implementations whether or not the format is something that already exists, an extension of something that already exists, or something entirely new. What matters is just the simplicity.

In general though I'd love to use something that already exists. That's why I went through all the formats I could find. Unfortunately none of them really match the needs we have (e.g. none of the simple formats support ruby).

The goal currently is to do something that will read existing SRT files successfully, making it backwards-compatible with existing subtitles in that format, and make it something where you can easily avoid the new features and make content that is compatible with existing SRT processors.

I looked at basing it on SSA/ASS instead of SRT, but SSA is a way more complicated format.
Old 2nd May 2010, 13:42   #4  |  Link
Ritsuka
Registered User
 
Join Date: Mar 2007
Posts: 93
I think there should be an option to expose the subtitle tracks inside a container (mp4, mkv or whatever). It's odd to have video and audio inside a single file, and to have separate files for other track types.

I guess a simple HTML-based file format would be the best option, IMHO.
Old 2nd May 2010, 13:48   #5  |  Link
Keiyakusha
契約者
 
Join Date: Jun 2008
Posts: 1,576
Extending SRT is a very bad idea. ASS is not "way" more complicated. It's human-readable, and that's enough. You really can't expect subtitles to do everything you listed and at the same time not become more complicated. It is possible to support both ASS and plain, non-extended SRT.
More importantly, there are great cross-platform tools for creating ASS subtitles, and converting from SRT to ASS takes just a few clicks (basically, set the path and press "save"). An SRT extension, on the other hand, will most likely never be supported and never have anything similar.

Also, how many years has ASS been with us already? I don't remember, but people are already familiar with it.
Oh, and by the way, there are a few open-source renderers for ASS, which is probably good for browsers like Firefox/Chrome ^__^

Last edited by Keiyakusha; 2nd May 2010 at 14:09.
Old 2nd May 2010, 15:33   #6  |  Link
JEEB
もこたんインしたお!
 
Join Date: Jan 2008
Location: Finland / Japan
Posts: 512
I agree with the other posters in this thread: already-defined formats that have powerful tools to work with them would be very much preferred.

And yes, "extending" srt would in my opinion be a rather bad idea. Creating things that would just look like garbage with current srt renderers would be just too easy to accomplish.

ASS might not be the brightest kid to come out on the subtitling streets, but if it gets overlooked, I'd like for you to at least talk about it all with nielsm (jfs) of the Aegisub developers, since he -- as far as I can see -- has had ideas boiling around for something to "take over" ASS in the open source subtitling area for quite some time now. Because supporting something that the (in my opinion) most versatile subtitle creator/editor so far will/does support is a big thing in my books, I'd say developing ideas and formats together with such people wouldn't exactly be a bad idea.
__________________
[I'm human, no debug]
Old 2nd May 2010, 15:34   #7  |  Link
GoodzMastaJ
Usered Register
 
Join Date: Dec 2006
Posts: 9
I'm with most of the other posters in this thread in recommending a format like SSA or ASS.

One key thing you have to keep in mind is not only do browsers have to implement a renderer, but someone also needs to implement the tools to *create* subtitle tracks in this magical new format. Wouldn't it be much easier to say you can use <insert one of many currently popular ASS subtitle applications here, Aegisub being my favorite for ease of use> but only features x, y, and z are supported and the rest will be ignored?

If you're careful about what features you choose to add to ASS, you can end up with a subtitle track created by existing, mature tools that works both in HTML5 and existing video players. That makes your work pretty easy.
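To make the "only features x, y, and z are supported and the rest will be ignored" idea concrete, here is a rough Python sketch. It assumes a hypothetical renderer that honours only the italic and bold toggles (\i0, \i1, \b0, \b1) and silently drops every other override tag; the whitelist and the function name are made up for illustration and do not come from any existing renderer.

```python
import re

# ASS override blocks look like {\i1\pos(10,20)} inside the dialogue text.
OVERRIDE_BLOCK = re.compile(r"\{([^}]*)\}")
# Inside a block, each tag starts with a backslash and runs to the next one.
TAG = re.compile(r"\\([^\\]*)")
# Hypothetical whitelist: only the simple italic/bold on-off toggles are kept.
KEEP = re.compile(r"[ib][01]")


def filter_overrides(text: str) -> str:
    """Drop every override tag that is not on the whitelist."""

    def filter_block(match: re.Match) -> str:
        tags = [t.strip() for t in TAG.findall(match.group(1))]
        kept = [t for t in tags if KEEP.fullmatch(t)]
        if not kept:
            return ""  # nothing supported is left, so remove the whole block
        return "{" + "".join("\\" + t for t in kept) + "}"

    return OVERRIDE_BLOCK.sub(filter_block, text)


print(filter_overrides(r"{\i1\pos(10,20)}Hello{\i0} {\fs40}world"))
```

A file filtered this way is still valid input for existing ASS tools and players, since nothing new is added to the syntax; only tags are removed.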
Old 2nd May 2010, 17:32   #8  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
ASS is a terrible hack and has several levels of parsing hell (like overloaded tags (\clip) and meta-tags (\t)) but it's by far the most complete and usable format out there today for doing any sort of decently formatted subtitling. It also has a pretty big userbase, is well understood and there are multiple open-source renderers and other tools. If you want an existing format, ASS is the way to go. If you're going for a new format, just skip basing it on something else (realistically, it won't be backwards compatible in any useful way anyway) and Do It Right from scratch. The Aegisub developers have been kicking around ideas about a new format that corrects the failings of ASS, so I'm just gonna page nielsm to this thread and let him handle it.

SRT doesn't have working pixel positioning in any currently existing renderer, by the way. Subrip can output coordinates when doing OCR from a DVD vobsub stream, but the meaning of the coordinates is not known (it's possibly even undefined) and all renderers will ignore them (if they don't just cause a parsing error).

Working any meaningful backwards compatibility into an extended version of SRT would be quite hard, if not impossible, and would severely limit the potential of the format. There are probably dozens (if not hundreds) of parser implementations and despite it being a very simple format, their capabilities and error handling vary wildly because there's no specification whatsoever and even SubRip itself has historically changed the format on a whim now and then. Some implementations blow up when they encounter positioning data. Many blow up on extra newlines, inconsistent line numbering or overlapping lines (that last one is a particularly annoying problem). At least one (VSFilter) allows SSA/ASS override tags as well as the traditional HTML-style ones. Just such a simple thing as how to handle < and > is completely undefined, with some parsers supporting &lt; and &gt;, others just having special-case handling for the four existing tags (i, b, u and font) and still others just having no support at all for those characters.
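As an illustration of just one of the possible behaviours for < and > described above, here is a small Python sketch. It special-cases the four existing tags (i, b, u and font), leaves any other angle bracket as literal text, and then decodes &lt; and &gt;. This is an assumption about what a parser could do, not a description of any particular implementation, and the function name is hypothetical.

```python
import re

# Only the four traditional SRT tags are treated as markup; anything else
# that happens to contain < or > is kept as literal text.
KNOWN_TAG = re.compile(r"</?(?:i|b|u|font)(?:\s[^<>]*)?>", re.IGNORECASE)


def srt_plain_text(cue_text: str) -> str:
    """Strip the four known tags, then decode the two entity escapes."""
    without_tags = KNOWN_TAG.sub("", cue_text)
    # Decoding happens after tag stripping, so an escaped "&lt;b&gt;"
    # ends up as the literal characters "<b>" rather than as a bold tag.
    return without_tags.replace("&lt;", "<").replace("&gt;", ">")


print(srt_plain_text('<i>Hello</i> <font color="#ff0000">world</font>'))
print(srt_plain_text("1 < 2 and <sigh> &lt;b&gt;"))
```

A parser with no entity support, or one that treats every <...> as a tag, would give different results on the second line, which is exactly the kind of divergence an extended SRT would have to live with.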

Personally I'd recommend going for a new format, because quite frankly, all the existing ones suck.

Last edited by TheFluff; 2nd May 2010 at 17:40.
Old 2nd May 2010, 18:46   #9  |  Link
yuvi
Registered User
 
Join Date: Jan 2006
Posts: 30
SSA/ASS have no real spec other than source code, not exactly something to put in a standard. Even the positioning isn't well defined, especially with anamorphic video.

The problem with supporting container subtitles is that then you have to support qt/mp4 timed text, kate, and ssa/ass to cover the major supported containers. Which is more complex than needed given that there isn't a non-Apple renderer for qt/mp4 timed text that doesn't ignore styles+positioning, and there's only a single library for the other two formats. It would be nice, but I don't see multiple browsers doing the work.

Given how much software can already output SRT, I'd say that a new format where well-formed basic SRT (no overrides/tags, only UTF-8/16, strictly consecutive numbers starting at 1, no extraneous spaces/newlines, etc.) is still valid would be nice. Or ignore that and maybe have basic SRT as a second fallback format.

No real need to keep compatibility with all the random permutations of SRT given that various parsers don't accept subsets of that as TheFluff said.
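For what it's worth, the "well-formed basic SRT" subset described above could be checked with something like the following Python sketch. The details are assumptions filled in for illustration: it works on already-decoded text (so the UTF-8/16 check is left to whoever reads the file), accepts CRLF or LF line endings, allows a single trailing newline, and conservatively rejects any text line containing <, > or {. The function name is hypothetical.

```python
import re

# hh:mm:ss,mmm --> hh:mm:ss,mmm with exactly one space around the arrow.
TIMING = re.compile(r"\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}")


def is_basic_srt(text: str) -> bool:
    """Return True if text is strict, tag-free SRT (under the assumptions above)."""
    text = text.replace("\r\n", "\n")
    if text.endswith("\n"):
        text = text[:-1]  # assumption: one trailing newline is acceptable
    for expected, block in enumerate(text.split("\n\n"), start=1):
        lines = block.split("\n")
        if len(lines) < 3:
            return False  # need a number, a timing line and at least one text line
        number, timing, *payload = lines
        if number != str(expected):
            return False  # numbering must be consecutive, starting at 1
        if not TIMING.fullmatch(timing):
            return False
        for line in payload:
            if line == "" or line != line.strip():
                return False  # extraneous newlines or spaces
            if "<" in line or ">" in line or "{" in line:
                return False  # conservative: treat these as tags/overrides
    return True


sample = "1\n00:00:01,000 --> 00:00:02,000\nHello\n\n2\n00:00:03,000 --> 00:00:04,500\nTwo\nlines\n"
print(is_basic_srt(sample))
```

Anything that fails the check could then be handed to a more lenient fallback parser, or simply rejected.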

EDIT: One other thing, whatever format you come up with, do try to encourage overlaid text + optional outlining and drop shadow (see most SSA/ASS, DVD subtitles, and nico video comments) instead of the annoying boxes in many formats (CEA-608 closed captions, qt/mov timed text, youtube annotations).

Last edited by yuvi; 2nd May 2010 at 19:10.
Old 2nd May 2010, 20:20   #10  |  Link
jiifurusu
Aegisub developer
 
Join Date: Jan 2007
Posts: 17
The aforementioned jfs/nielsm of the Aegisub project is me.

I have some ideas for a subtitle format inspired by ASS, SSA, some more, and some new ideas, which I believe to be very strong. I have begun writing specifications for this format, though they are not finished.
The format is based on having basically two levels of compliance, where one level (the basic one) is elastic in that rendering applications can pick and choose features they find reasonable from a set. The advanced compliance level is still only on the idea stage, but is intended to require full conformance on all points, and support everything ASS does, and more, but better.

As mentioned before, the disadvantage of defining a new format is that there won't be any existing software for authoring into the format. I believe that by basing a new format (loosely) on the ideas of existing successful formats getting community support for the format will be easier.

I'd like to point out two subtitle format projects that have failed, and why I believe they failed to gather the required community support.
The first is USF, Universal Subtitle Format, an XML-based format intended to be the primary format for Matroska video. It never became popular because there was never a full-featured renderer (VSFilter has partial support, though I've never seen it in action), there never was a usable authoring application, and authoring XML by hand becomes very tedious. XML does have strengths, but I believe subtitling is not an area it is well suited to.
The other failed format is the SSF, Structured Subtitle Format, by Gabest who is the primary developer of VSFilter. I'm not aware how far the implementation of it got, but the syntax and desired feature-set was specified. Unfortunately semantics were only specified very loosely. The format looks superficially like CSS, or maybe rather JSON, and is based on a somewhat loose concept of "definitions" which can reference each other in various ways, and once a definition becomes "defined enough" it turns into a renderable subtitle. While this concept is powerful in what it can express, it doesn't lend itself well to authoring software, and it's also a bit too verbose for hand-writing though not as bad as USF. The result was again a format with poor to no software support, and hard to author for, and thus a failure.

The lesson I learn from those two formats is that for a subtitle format to be successful it needs three factors: A good renderer, a good authoring application and the basic form must be simple to write. The last, simple to write, makes it easier to obtain the two former, good software support.

As mentioned, we (at the Aegisub team) have a partial draft for this new format which we call AS6 (Advanced Subtitles 6) which could become a complete draft for the basic compliance level with some more work. Personally, I'm busy with schoolwork the next 10 days or so, and it seems I'm the one with all the ideas and knowledge (unfortunately) so things are a bit on hold, but once I get time I will finish the draft for proofreading. I hope this will be considered an option.
On a side-note, the current developers of MPC-HC (Media Player Classic Home Cinema) have also expressed support for AS6.
__________________
Aegisub advanced subtitle editor // Various junk of mine
Better known as jfs on the rest of the 'net, but someone's squatting that here.
Old 3rd May 2010, 18:57   #11  |  Link
Mr VacBob
Registered User
 
Join Date: Feb 2005
Posts: 140
The major problem with subtitles is that, since nobody except ASS users cares about them already, it's probably not possible to convince browser vendors to support anywhere near ASS's feature set.
If they're happy with white monospace text over black boxes, you'll never get anyone to accept having to test a parser for a weird-looking format like ASS, let alone the text style system.

Sneaking something equivalent in using CSS/HTML would probably work better, although you'll still have problems even getting them to support outlines - I can guarantee WebKit will have problems doing it, as I've found myself that the people who wrote the OS X text renderer have never heard of inner-stroked text...

Having said that, ASS works pretty well in the unspecified and poorly tested implementations it already has, and browsers probably can't quite handle rendering HTML on top of a video in realtime. And I don't want to have to go back and support it in Perian if they do.

Quote:
Originally Posted by jiifurusu
As mentioned, we (at the Aegisub team) have a partial draft for this new format which we call AS6 (Advanced Subtitles 6) which could become a complete draft for the basic compliance level with some more work. Personally, I'm busy with schoolwork the next 10 days or so, and it seems I'm the one with all the ideas and knowledge (unfortunately) so things are a bit on hold, but once I get time I will finish the draft for proofreading. I hope this will be considered an option.
On a side-note, the current developers of MPC-HC (Media Player Classic Home Cinema) have also expressed support for AS6.
I believe *not* having ideas is more important than having them; ASS hardly needs to be more expressive. But I certainly want to read your draft.

Last edited by Mr VacBob; 3rd May 2010 at 19:01.
Old 3rd May 2010, 22:03   #12  |  Link
Hixie
Registered User
 
Join Date: Apr 2010
Posts: 5
jiifurusu: Is there a mailing list for the work you're doing, or do you have any links to any of the spec drafts you mentioned? I'd love to take a look; if it's a simple proposal that can be made to do what we need for HTML5, it would be better for us than making a new format.
Old 3rd May 2010, 23:45   #13  |  Link
jiifurusu
Aegisub developer
 
Join Date: Jan 2007
Posts: 17
Hixie: We have some issues with our mailing list right now but hopefully they'll be resolved in a few hours. We're keeping the list closed from the public for now, because of what happened with an older project called AS5.
If you PM me here with an email address I'll see that you get invited to the mailing list. (This is not a general invitation.)

The short story of AS5: Back then (I think early 2007 or 2008) we had also agreed that ASS sucks, and we wanted to define a new format. We started various discussions in public and put working documents up on the Aegisub SVN. Eventually, discussion died out before we really had more than a couple of loose ideas for extensions to ASS. Full semantics were never defined, and the syntax wasn't finalised either. We weren't even sure whether everything was a good idea. (Another thing is that we were still mostly just extending ASS and changing some line formats to be more flexible, instead of really thoroughly rethinking what a subtitle format needs to do.)
So what eventually happened was that some people started implementing parts of this draft into existing ASS renderers, IIRC it's part of what is now called VSFilterMod (which we have a blog post with opinions on) which is really just a bunch of incompatible extensions to the ASS format. But back then, work on AS5 just kind of stopped and we never got further than that incomplete draft.

So, to avoid history repeating itself and people starting to make new, incompatible renderers before the specification for AS6 has reached some level of maturity, we want to keep the details away from the general public until it's reasonably stable.
__________________
Aegisub advanced subtitle editor // Various junk of mine
Better known as jfs on the rest of the 'net, but someone's squatting that here.

Last edited by jiifurusu; 3rd May 2010 at 23:54.
Old 4th May 2010, 00:07   #14  |  Link
Hixie
Registered User
 
Join Date: Apr 2010
Posts: 5
Sure. My address is ian@hixie.ch.
Old 23rd July 2010, 02:19   #15  |  Link
Hixie
Registered User
 
Join Date: Apr 2010
Posts: 5
I ended up speccing a variant of SRT:

http://www.whatwg.org/specs/web-apps....html#websrt-0
http://www.whatwg.org/specs/web-apps...timed-tracks-0

There are no examples there yet; I plan to add a bunch over the next few weeks.