yyC2Swp - Convert Closed Captions (SCC/G608) to subtitles with positioning
yukichigai
24th August 2025, 04:35
After searching for a tool which could convert Closed Captions while preserving positioning information and finding none (or at least none that were affordable) I have made my own. I present to you:
yyC2Swp - Y|yukichigai's Caption to Subtitle (with positioning!) (https://github.com/yukichigai/yyc2swp)
This is a modification of McPoodle's CCASDI Perl script from the discontinued SCC TOOLS project, and works much the same: give it an input caption file in SCC format and it'll spit out a subtitle file that most players can actually read. The key difference here is that yyC2Swp preserves the positioning information in the captions; if text is placed on the top left side of the screen in the captions then it'll show up on the top left side of the screen in the subtitles as well, so on and so forth.
Additionally, yyC2Swp supports input files in the Grid 608 format (https://github.com/CCExtractor/ccextractor/blob/master/docs/G608.TXT) used by CCExtractor. Some versions of CCExtractor have difficulties converting caption data to Scenarist/SCC format depending on the source, but almost all versions are able to output correctly to Grid 608 format (or close enough). Grid 608 is also a lot easier to visually inspect for errors.
Speaking of errors, I know that you can't always control the quality of the input files you work with. For this yyC2Swp has a "-ec" (Error Correction) command line option, which will allow conversion to continue even when errors are encountered in the input file, correcting them automatically if possible or skipping them entirely if not. Errors will still be noted in the console output.
For output formats, yyC2Swp mainly supports output to SubStation Alpha (SSA) and Advanced SubStation (ASS) formats. Caption positioning is preserved with 97.684% accuracy (http://pulled-from-my.ass) with those two formats. TTML/DFXP is also supported: it has similar accuracy to SSA/ASS, but is supported mostly by professional software and unsupported by most consumer software/devices. WebVTT and SAMI are supported for conversion, but have their own issues. Both formats will convert the text correctly, but positioning information is not correctly interpreted (or at all) by many players. SAMI in particular seems to have zero players which support its positioning tags, even using example scripts from the format spec! Output is also supported for some lesser used/esoteric formats: Spruce Technologies Language (STL, aka DVDMaestro), GPAC's TTXT (which can be used to convert to TX3G), and even QuickTime Text.
Anyway, if you'd like to give this a go all you need is Perl and a dream. For Windows users there is also a compiled executable available on GitHub, though it's significantly larger than the Perl version (~9megs vs ~400k).
Also I would like to give many thanks to Emulgator for providing a delightfully specific breakdown of how Closed Captions are decoded and displayed with relation to screen position and resolution. A lot of the positioning math is based off of that analysis and it was incredibly helpful.
Update: Version 1.3.1 Release
Version 1.3.1 has released! This version fine tunes the default font sizing so that it matches Closed Caption presentation even more closely; turns out I'd miscalculated and the correct Courier New font size was 26, not 30. It also improves the efficiency of encoding subtitles with multiple empty line breaks (e.g. text on the top and the bottom of the screen at the same time) and fixes an issue which would cause the script to fail when trying to convert a file with a dash (-) in the name.
The 1.3.1 GitHub release also includes a Windows EXE for Windows users who do not have Perl installed on their system for whatever reason.
Emulgator
24th August 2025, 10:27
Beautiful, yukichigai ! ATM I don't have Perl on my Win10 system, but I am curious how well it fares.
Until now for such task I have used Subtitle Edit, and the numberpad-like alignment use of the {\an1} through {\an9} in .ass was well enough for me.
Now to have any position available brings the CC experience back a lot closer.
(PAL land here, rarely any CC, and I have only a few original US and Japan NTSC CC DVDs to train on)
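To illustrate the difference between the nine {\an1}..{\an9} anchor points and free positioning, here is a minimal Python sketch that maps a caption grid cell to an ASS {\pos} tag. The 15-row by 32-column grid is the standard CEA-608 layout and the 640x480 resolution matches the PlayResX/PlayResY that yyC2Swp writes into its .ass header. The 10% margins and the function name ass_pos_tag are assumptions made purely for illustration; this is not the math yyC2Swp actually uses.

```python
# Hypothetical sketch: map a CEA-608 grid cell to an ASS \pos override tag.
# ROWS/COLS are the standard 608 grid; the 10% margins are an assumption.
ROWS, COLS = 15, 32

def ass_pos_tag(row: int, col: int, res_x: int = 640, res_y: int = 480) -> str:
    """Return an ASS tag anchoring text at the top-left of a grid cell.

    row and col are zero-based (row 0 is the top line, col 0 the left edge).
    """
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise ValueError("row/col outside the 15x32 caption grid")
    margin_x = res_x // 10            # assumed 10% margin on each side
    margin_y = res_y // 10
    width = res_x - 2 * margin_x      # assumed caption area width
    height = res_y - 2 * margin_y     # assumed caption area height
    x = margin_x + col * width / COLS
    y = margin_y + row * height / ROWS
    # \an7 anchors the text by its top-left corner, \pos places that anchor
    return "{\\an7\\pos(%d,%d)}" % (round(x), round(y))

print(ass_pos_tag(0, 0))
print(ass_pos_tag(10, 4))
```

With only {\an1}..{\an9} every caption snaps to one of nine anchors; a per-cell {\pos} keeps the indentation the caption author intended.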
Emulgator
24th August 2025, 11:15
What I get so far is Perl portable seemingly working, on call giving me its CLI.
I pasted the path to script and the source without options.
Script is executed, yyC2Swp reports with its help file, but does not start to transcribe.
C:\_PROG_64\Perl64>"C:\_SETTINGS\Subtitle Scripts\yukichigai SCC to ASS\yyC2Swp.pl" "C:\_SETTINGS\Subtitle Scripts\yukichigai SCC to ASS\vts_02_1_vob_GP_CC.scc"
yyC2Swp Version 0.4
Y|yukichigai's Caption to Subtitle (with positioning!)
Converts Scenarist Closed Captions or Grid 608 files to subtitles while
preserving positioning information. Also converts to SCC Dissasembly.
Based on CCASDI by McPoodle.
Syntax: yyC2Swp -cCC2 -a -o01:00:00:00 -td infile.scc outfile.ass
-c (OPTIONAL): Channel to convert to subtitle. SCC input only.
(CC1 default, CC2, CC3, CC4, T1, T2, T3 and T4 are other choices)
-r (OPTIONAL): Output roll-up subtitles in roll-up format, instead of
one line at a time
-a (OPTIONAL): Adjust timecodes to be start time for SCC
and display time for dissassembly
-o (OPTIONAL): Offset to apply to timecodes, in HH:MM:SS:FF format
(DEFAULT: 00:00:00:00, negative values are permitted)
-f (OPTIONAL): Number of frames per second (range 12 - 60) (DEFAULT: 29.97)
-t (OPTIONAL; automatically sets fps to 29.97):
NTSC timebase: d (dropframe) or n (non-dropframe) (DEFAULT: n)
Notes: outfile argument is optional (name.scc/g608 -> name.ass). Format is
controlled by outfile suffix: .vtt WebVTT, .smi SAMI, .ssa Sub-Station Alpha,
.ass Advanced Sub-Station (default), or .ccd SCC Disassembly (SCC input only).
C:\_PROG_64\Perl64>
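As a side note on the -t option in the help text above: drop-frame and non-drop-frame timecodes map to slightly different real times. Below is a minimal Python sketch of the usual NTSC conversion (30 nominal frames per second running at 30000/1001 fps, with two frame numbers skipped every minute except each tenth minute in drop-frame mode). The function names are hypothetical and this is not taken from the yyC2Swp source.

```python
# Hypothetical sketch: convert an HH:MM:SS:FF timecode to real time.
from fractions import Fraction

def scc_timecode_to_seconds(tc: str, drop_frame: bool = False) -> Fraction:
    """Convert an NTSC timecode to elapsed seconds as an exact fraction."""
    hh, mm, ss, ff = (int(part) for part in tc.replace(";", ":").split(":"))
    frames = (hh * 3600 + mm * 60 + ss) * 30 + ff
    if drop_frame:
        # Drop-frame skips frame numbers 0 and 1 at the start of every
        # minute, except minutes that are divisible by ten.
        total_minutes = hh * 60 + mm
        frames -= 2 * (total_minutes - total_minutes // 10)
    # Each NTSC frame lasts 1001/30000 of a second.
    return Fraction(frames * 1001, 30000)

def ass_timestamp(seconds: Fraction) -> str:
    """Format seconds as the H:MM:SS.cc timestamp used in ASS events."""
    centis = round(seconds * 100)
    h, rem = divmod(centis, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"

print(ass_timestamp(scc_timecode_to_seconds("01:00:00:00")))
print(ass_timestamp(scc_timecode_to_seconds("01:00:00:00", drop_frame=True)))
```

The two printed values differ by about 3.6 seconds, which is why picking the wrong timebase makes subtitles drift over a long programme.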
Running Perl CLI as administrator did not help.
Changing paths without spaces, then without "" did not help.
Going into the .pl script and changing the default "my $convert = 0" to 1;... did not help either.
The .scc file in question could be opened fine in SubtitleEdit 4.0.12
P.S. Ah, found a remnant file in Perl64 folder: "name.ass)", size 0B, creation date 24.08.2025 12:20.
Something stalled.
yukichigai
24th August 2025, 19:19
Huh... well that's a new one. I wonder if it's something about the path included. I'll do some poking.
Does it at least work if you put the files in the same folder as the script?
EDIT: I tried running it on my system with non-local files and it seemed to work. One thing I noticed is that I'm initiating Perl on the command line rather than invoking the CLI first and then running the script. That SHOULDN'T matter, but still... maybe try doing "perl yyC2Swp.pl <filename>" and see if that gets you any better results? Also my system has Strawberry Perl installed, not sure what version you're running.
Emulgator
25th August 2025, 18:39
Strawberry here too, but no install, rather portable, cautious me.
So my system does not know about perl and calling perl directly does not work.
Does it at least work if you put the files in the same folder as the script?
Did that, and it makes no difference.
Although I am running as real Administrator, it could be a Windows specific block, permissions...who knows.
In between I was watching an NTSC CC DVD (Sony Authoring) on an Oppo 205, and:
Oppo's CC placement seems faulty: what should appear at the right side comes out centered at best.
Have to repair my Panasonic DMR-EX95 (won't start up, needs recapping) to see what Panasonic implemented there.
yukichigai
25th August 2025, 19:08
Huh. Well it sounds like I need to put together a compiled Windows version earlier than I thought. I've been testing this on a few systems (Windows and Linux) and the behavior is the same, but then again I've been the one installing Perl on each system. This really sounds like Perl environment differences; easiest way to fix that is to compile under the intended environment. I'll put together a compiled version and let you know when it's up.
Emulgator
25th August 2025, 19:33
Many thanks, yukichigai !
yukichigai
25th August 2025, 20:39
Alright, in trying to compile the script I think I discovered a possible issue: somehow the file encoding on the script was set to UTF-16, not UTF-8. Perl only just supports UTF-8. That may be why your version of Perl was having issues. My bad. I'll update the Github so the file is encoded UTF-8.
In case that wasn't the issue, I've uploaded to MEGA (https://mega.nz/file/h1EHBRBB#5b709uZ-wUsdCw23_gOTZ7QKZGtCLAZBTlQnPQbkYaY) a version of my current WIP that I've compiled to an exe. TTML and SAMI output isn't quite right, and VTT output still doesn't seem to produce any positioning changes, but SSA and ASS output is still correct, minus the inevitable nitpicky tweaks I'll need to make.
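For anyone who wants to check a downloaded script for the UTF-16 mix-up described above, here is a minimal Python sketch that looks at the byte order mark and re-encodes the content as UTF-8. It only inspects the BOM (a UTF-16 file saved without one would not be detected), and the function names are hypothetical.

```python
# Hypothetical sketch: detect a UTF-16 byte order mark and re-encode as UTF-8.
import codecs

def detect_bom(data: bytes) -> str:
    """Guess the codec from a leading byte order mark (default: utf-8)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # decoding with this codec strips the BOM
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"      # this codec reads the BOM to pick endianness
    return "utf-8"

def to_utf8(data: bytes) -> bytes:
    """Return the same text encoded as UTF-8 without a BOM."""
    return data.decode(detect_bom(data)).encode("utf-8")

sample = "#!/usr/bin/perl\n".encode("utf-16")   # BOM plus UTF-16 text
print(detect_bom(sample))
print(to_utf8(sample))
```

Reading the script with open(path, "rb").read() and writing back the result of to_utf8() would be one way to apply it to a file.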
cubicibo
26th August 2025, 08:51
In between I was watching an NTSC CC DVD (Sony Authoring) on an Oppo 205, and:
Oppo's CC placement seems faulty: what should appear at the right side comes out centered at best.
Have to repair my Panasonic DMR-EX95 (won't start up, needs recapping) to see what Panasonic implemented there.
Blu-ray players are not particularly good at, if even capable of, reproducing closed captions. Technology Connections covered that very topic extensively in a recent video (https://www.youtube.com/watch?v=OSCOQ6vnLwU). Unless your player has an analog output, the SoC has to decode, render and overlay the captions itself, but not all (BD / DVD) vendors implement this functionality.
DVD players don't render the captions themselves either, they merely inject the binary data from the DVD straight into the vertical blanking interval: only your TV would render them.
yukichigai
26th August 2025, 17:29
Blu-ray players are not particularly good at, if even capable of, reproducing closed captions. Technology Connections covered that very topic extensively in a recent video (https://www.youtube.com/watch?v=OSCOQ6vnLwU). Unless your player has an analog output, the SoC has to decode, render and overlay the captions itself, but not all (BD / DVD) vendors implement this functionality.
Funnily enough, that video was actually part of the reason I started working on this project in the first place. The examples he uses are very good examples of captions where positioning information is important to them being understandable.
yukichigai
28th August 2025, 06:30
Alright, new version out: 0.6. This adds Timed Text Markup Language (TTML/DFXP) support and improves WebVTT positioning. Unfortunately WebVTT feature support is really inconsistent between players. Currently VLC will display everything as intended except for the horizontal positioning, while MPV and MPC both ignore any formatting not supported by SRT (i.e. anything but bold-italic-underline and colors). Web-based players on the other hand display everything as intended. I'm going to try a few different approaches to see if I can find a way to improve compatibility with more players, or at least get VLC to process horizontal positioning on top of everything else.
TTML is a bit more feast-or-famine. VLC supports all critical features for proper positioning other than line spacing, and that may simply be a matter of me finding the right parameter to modify. Meanwhile MPC doesn't support TTML at all, while MPV does not position subtitles accurately (though it tries) and ignores font specifications. Web-based player support is better, but not as near-universal as WebVTT. In retrospect I'm not sure if there was much gained by adding this format, but I already did it so I may as well keep it in.
Overall Advanced SubStation and SubStation still produce the best looking results, though there's still some fine tuning to be done to get the output to match Closed Caption letter and line spacing exactly.
Emulgator
28th August 2025, 14:32
Just got back to testing the 0.5.exe on a 1956 music film, and it worked this time.
Wow ! Good work, yukichigai !
The CC as .ass look convincing, and add a nice vintage experience to the subtitle choices where no CC decoder is available anymore.
And thanks for the credits, happy to help. I was hoping my digging some years ago would make sense someday.
For refinement:
https://www.zilog.com/docs/tv/z86229.pdf
yukichigai
28th August 2025, 20:20
Great! Glad the results are suitably vintage. :3
One of the reasons I chose Courier New was because Courier-family fonts are the standard for screenwriting. I figured it would give the subtitles an appropriate look.
Still working on updates. Oh and thanks for that link! I'll probably include a link to it in the readme in case anyone wants to know more about the device that those overscan calculations were derived from.
yukichigai
28th September 2025, 21:26
Alright, yyC2Swp v1.0 has released (https://github.com/yukichigai/yyc2swp/releases/tag/Release). This has almost every feature I wanted in the script, missing only a few rarely used or "nice to have" features. For the majority of use cases this will accomplish everything you need.
I have also included a standalone Windows EXE for Windows users who don't have Perl installed on their system. It's significantly larger than the Perl script by itself however (8.9 megs vs 410k) so if you do have Perl I strongly recommend you use the script instead.
Emulgator
30th September 2025, 08:21
Again, many thanks for your continued efforts !
Now let's see if I can ask font guru Peter Wiegel nicely to port the original CC font from the Z86229.
Maybe there is a similar one already available ? I am thankful for pointing me into that direction.
BTW, the 1.0 exe won't start on my Win10P64.
"This app can not be run on your PC.
Consult the software maker to find a suitable version for your PC."
(hand translated from German)
yukichigai
30th September 2025, 19:55
BTW, the 1.0 exe won't start on my Win10P64.
"This app can not be run on your PC.
Consult the software maker to find a suitable version for your PC."
(hand translated from German)
Huh... that's odd. Might be something in the packer settings I used. I'll check.
yukichigai
1st October 2025, 06:57
Alright, no idea what happened to the EXE in between testing on my system and uploading it to GitHub, but I encountered the same issue trying to run it. I recompiled it, tested it, and uploaded a fixed version. Should work now.
Emulgator
1st October 2025, 17:08
exe runs now, but produces an almost empty .ass, only the header.
[Script Info]
Update Details: Converted from scc Closed Captions by yyC2Swp 1.0 (a CCASDI fork)
ScriptType: v4.00+
Collisions: Normal
PlayResX: 640
PlayResY: 480
LayoutResX: 640
LayoutResY: 480
Timer: 100.0000
WrapStyle: 1
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, MarginL, MarginR, MarginV, Encoding
Style: *Default,Courier New,30,&H00FFFFFF,&H0000FFFF,&H00000000,&H00000000,1,0,0,0,100,100,0,0,1,2,3,8,0,0,0,1
[Events]
Format: Layer, Start, End, Style, Actor, MarginL, MarginR, MarginV, Effect, Text
Did the syntax change ? I called only by source file path, no other parameters.
yyC2Swp.exe "C:\_SOFT\yyC2S\1.0\vts_02_1_vob_GP_CC.scc"
-pause
yukichigai
1st October 2025, 19:49
The syntax shouldn't have changed. On my system I'm able to run it using a path to the file in quotes.
That said, I've found an error in the script that crops up only with specific test files. Turns out I'd added a double $$ for a variable name in a few places (oopsie), which would cause the conversion to bomb out midway (though usually with an error). Might be what was causing your issue. I've uploaded a fixed v1.0.1.
Emulgator
2nd October 2025, 01:40
Many thanks, yukichigai !
Well, 1.0.1.exe runs, generates the 17 lines of header plus lines 18..36, then the .ass ends prematurely.
Not just yet, but cloooose ;_)
The 0.5 .exe delivered all 1486 lines in total.
In the .ass the italics tags end up split, the closing {\i0} ends up at the beginning of next line, after \N, and the last closing {\i0} is missing.
Now looking at the .pl: Did the Perl syntax change ? In 1.0.1 I find lots of == changed to "eq"; != to "ne" and so on.
Well, I am without deeper knowledge of course.
WinMerge Diffing with 0.6 shows decode pattern shifts from /27/ to /a7/, 28 to /a8/ and so on.
If you would be so kind and build .exes for 0.6, 0.7, 0.8, 0.9 I can run torture tests with what I have.
That would help narrow down the issue.
P.S: And maybe I should upload the concerning .scc
You may PM me.
yukichigai
2nd October 2025, 06:15
I'm not sure if that was it, but that pointed me at something I'd overlooked. In a previous version I found a few inappropriate uses of "==" vs "eq" (numeric vs string) but there are some replacements that I did and then remember undoing... or at least I thought I did. Versioning issue maybe? I'm working on this on two separate workstations so it's possible. In any case I'll fix those.
The differences in matching logic are also a relic of some changes I thought I'd undone. I made some additions to the code based on McPoodle's latest SCC documentation, but then discovered that his processing block removes 128 (Hex 80) from both the high and low order bytes of each byte pair.
...it seems like I may have somehow lost a revision or two in the middle here. Gimme a bit to go over this again and see what got jacked up where. I'll work on making packed versions of the other scripts as well.
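To make the /27/ versus /a7/ difference concrete: each byte in an SCC pair carries an odd-parity bit in its top position, so a7 and 27 are the same 7-bit value with and without that bit. The Python sketch below shows the "remove 128" step as a bit mask; the function names are hypothetical and this is not the CCASDI or yyC2Swp code.

```python
# Hypothetical sketch: parity handling for one SCC word (two bytes as hex).

def has_odd_parity(byte: int) -> bool:
    """True if the byte contains an odd number of 1 bits."""
    return bin(byte).count("1") % 2 == 1

def strip_parity(word: str) -> tuple[int, int]:
    """Split a 4-digit hex word into two bytes with the top bit cleared."""
    value = int(word, 16)
    high, low = value >> 8, value & 0xFF
    # Clearing bit 7 is the same as subtracting 128 when that bit is set.
    return high & 0x7F, low & 0x7F

high, low = strip_parity("a7a8")
print(hex(high), hex(low))
print(has_odd_parity(0xA7), has_odd_parity(0x27))
```

A matcher that runs before this step has to look for a7, while one that runs after it has to look for 27, so mixing the two conventions in one script makes patterns silently stop matching.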
yukichigai
3rd October 2025, 22:20
Alright, I've found a couple test case files that are producing odd results. SCC -> CCD conversion works fine, but going to ASS produces no output. Haven't figured out the cause yet, but I'm working on it.
Blue_MiSfit
4th October 2025, 05:50
You must check out ttconv :) https://github.com/sandflow/ttconv
Emulgator
4th October 2025, 19:02
yukichigai, I have put up a link on wetransfer, check your PM, valid for 3 days from now on.
yukichigai
7th October 2025, 00:35
yukichigai, I have put up a link on wetransfer, check your PM, valid for 3 days from now on.
I got it, thanks!
Alright, so I'm looking at this SCC file and it appears there's an orphaned single byte pair near the end of the caption at 00:01:12:18. Here's the end of the line; the one that's wrong is the lone "20":
2054 4f4f 20 9137 942c 942c 942f 942f
Now that said, I'm not sure why previous versions would process that while the new version doesn't. An orphaned byte pair is an error no matter how you look at it, and according to the way McPoodle's original logic is supposed to work that should have always been a fatal error. However, it should be printing an error message when it encounters that, specifically "Incorrect word length for word 33, timecode 00:01:12:18, stopped at yyC2Swp.pl line 749, <RH> line 47." I see that on my system when I try to convert the file at least. If you're not then that's also odd.
Still, this is a good test case to have. The rest of the SCC file is correct, and if that errant byte pair is removed it processes fully. I think this calls for a command line option to continue processing even after encountering errors like this. Guess I know one of the first features in v1.1!
EDIT: I think I may know why: in between v0.5 and v1.0 a kind soul on github shared with me the "lost" version 3.8 of CCASDI, which included a number of changes and bugfixes. One of those was a change to the way byte pairs were processed and checked.
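For illustration, here is a minimal Python sketch of the kind of check involved, with a switch that either stops on the first malformed word or logs it and carries on. The function name check_words and the error_correct flag are hypothetical; the sketch only mirrors the behaviour described in this thread and is not the actual yyC2Swp logic.

```python
# Hypothetical sketch: validate the words on one SCC caption line.
import re

WORD_RE = re.compile(r"[0-9a-fA-F]{4}")   # one byte pair = four hex digits

def check_words(words, error_correct=False):
    """Return (good_words, problems).

    Without error_correct the first malformed word raises ValueError;
    with it, malformed words are skipped and reported instead.
    """
    good, problems = [], []
    for index, word in enumerate(words, start=1):
        if WORD_RE.fullmatch(word):
            good.append(word)
            continue
        message = f"Malformed word {index}: {word!r}"
        if not error_correct:
            raise ValueError(message)
        problems.append(message)
    return good, problems

tail = "2054 4f4f 20 9137 942c 942c 942f 942f".split()
good, problems = check_words(tail, error_correct=True)
print(good)
print(problems)
```

Run on the tail of the line quoted above, the lone "20" is reported as word 3 of the excerpt and the remaining seven words pass through untouched.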
yukichigai
7th October 2025, 00:38
You must check out ttconv :) https://github.com/sandflow/ttconv
Ooo, this has EBU-STL support. Neat! I'll definitely give this a poke and see if there's something in there that inspires me for later versions of my script. With credit given, of course. Thanks for the heads up!
Emulgator
7th October 2025, 21:07
Good find ! I had extracted this .scc with General Parser, IIRC, therefore the filename contains _GP_
Maybe I should give other demuxers a try again.
yukichigai
7th October 2025, 22:38
Quite honestly I've yet to find an SCC extractor (other than Scenarist itself) that is perfectly accurate. That's one of the reasons I added Grid 608 support, because CCExtractor never seems to have problems outputting to that format at least.
yukichigai
10th October 2025, 07:02
Alright, Version 1.1 is out. This has the "-ec" (Error Correction) command line option which will allow for processing to continue even when errors are encountered in input files. Errors will still be noted in the console output.
yukichigai
27th November 2025, 04:14
Version 1.2 is out. This adds methods to override the Font and the Font Size, as well as an option to disable the default bolding of text.
Simon_H
25th December 2025, 09:23
Love the tool. I've been working on one, too. In my case it's a self-contained web page.
I created an embeddable monospaced font using the open-font-licensed Fira Code. My version is under 40KiB because I stripped out the characters that aren't supported by CEA-608. It’s called CCapbofira (Closed Captions based on fira; pronounced like capoeira) and uses the same metrics as Lucida Console.
As a Christmas gift, please feel free to embed it in your program. I posted it up at https://forum.videohelp.com/threads/419679-Christmas-gift-App-to-convert-%2A-scc-to-positioned-subs-with-embedded-font