Doom9's Forum - View Single Post

Prettz · 28th May 2010, 02:06

Quote:

Originally Posted by TheFluff

OCR only works decently if the letters look similar each time they appear and if edges are easily detectable. Your video sounds like it doesn't fulfill either of those requirements so no OCR engine will be able to make much sense of it. It's probably faster to just copy the subs by hand.

You're right about my video being total crap. But I'm determined to use OCR; for one thing, I need it to get the timings. There's no way I'm going to do the entire subtitling process by hand (and for the first time ever), hell no. So I at least needed to get SubRip to recognize when there are subs on screen and when there aren't; otherwise it interrupts every few frames and asks me to type in an entire subtitle when there isn't one on screen.

After tons of trial and error I've finally come up with an Avisynth script that gets SubRip to behave acceptably. There's probably more I could do, but I don't have much experience using filters to dramatically change the picture. After playing around with other options I came back to MSmooth because it gives the effect I need best. There was also no way to get acceptable results without feeding SubRip a greatly upscaled video.

Code:

LoadPlugin("C:\Program Files (x86)\Avisynth 2.5\plugins\RemoveGrainSSE3.dll")
LoadPlugin("C:\Program Files (x86)\Avisynth 2.5\plugins\FluxSmooth.dll")
LoadPlugin("C:\Program Files (x86)\Avisynth 2.5\plugins\MSmooth.dll")
LoadPlugin("C:\Program Files (x86)\Avisynth 2.5\plugins\WarpSharpYV12.dll")


AviSource("E:\Video2\Anime\Kiki's Delivery Service (sub).avi")   #560x320

Crop(32,252,-32,0,align=true)   #496x68

FluxSmoothST(4,4)
RemoveGrain(mode=1)

BicubicResize(1736,238,b=1.0/3,c=1.0/3)   #3.5x upscale; 1.0/3 because 1/3 is integer division (=0) in Avisynth
RemoveGrain(mode=1)
MSmooth(threshold=20,strength=5,chroma=true,highq=true)
RemoveGrain(mode=1)
WarpSharp(depth=90,blur=4)
FluxSmoothST(6,6)
RemoveGrain(mode=1)

ConvertToRGB32()
LanczosResize(992,136)
ConvertToRGB24()
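
As a sanity check on the sizes in the script above, here is a small Python sketch that recomputes them. The `crop` helper is hypothetical and only mimics how Avisynth's Crop treats non-positive right/bottom values as offsets from the far edge; everything else is just the numbers from the script.

```python
# Recompute the frame sizes used in the script (numbers taken from the post).
def crop(w, h, left, top, right, bottom):
    # Hypothetical helper: right/bottom <= 0 are offsets from the far edge,
    # which is how the Crop call in the script is written.
    return w - left + right, h - top + bottom

src = (560, 320)
cropped = crop(src[0], src[1], 32, 252, -32, 0)
upscaled = (1736, 238)   # BicubicResize target
final = (992, 136)       # LanczosResize target

print("cropped:", cropped)
print("upscale factor:", upscaled[0] / cropped[0], upscaled[1] / cropped[1])
print("final factor:", final[0] / cropped[0], final[1] / cropped[1])
```

So the filtering happens at 3.5x the cropped size, and SubRip ends up seeing the subtitle strip at 2x.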

My use of FluxSmooth here probably isn't having any helpful effect, but I don't think it's having a detrimental effect on the picture for the later filters, so I keep it there "just in case". Same thing with RemoveGrain. The text still has many different shades of yellow and yellow-green in it (even within the same letter) and there's definitely nothing that can be done about that. So I needed to jack the OCR's text color tolerance way, way up.
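
To illustrate why mixed shades force a high tolerance, here is a rough Python sketch. It assumes a simple per-channel RGB tolerance test, which may not be how SubRip actually matches text color, and the target color and shades are made-up values, not measured from the video.

```python
# Hypothetical per-channel tolerance check; SubRip's real rule may differ.
def matches(pixel, target, tolerance):
    # A pixel counts as "text" if every RGB channel is within the tolerance.
    return all(abs(p - t) <= tolerance for p, t in zip(pixel, target))

target = (230, 220, 40)   # made-up "subtitle yellow"
shades = [                # made-up shades drifting from yellow to yellow-green
    (235, 225, 50),
    (200, 215, 60),
    (180, 210, 70),
]

def count_matches(tolerance):
    # Number of shades that would be accepted as text at this tolerance.
    return sum(matches(s, target, tolerance) for s in shades)

print(count_matches(15))
print(count_matches(50))
```

With a tight tolerance only the shade closest to the target is accepted, so parts of a letter drop out; a wide tolerance accepts all of them, at the cost of also accepting more of the background.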

I'm still not sure whether the filtering task I'm trying to accomplish would be more effective in YV12, with subsampled chroma, or in a 1:1 color space. No way to test that, obviously, so it's just a theoretical question.
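
One part of this can at least be worked out on paper: in YV12 (4:2:0) each chroma plane is half the width and half the height of the luma plane. The Python sketch below just computes the plane sizes at the cropped and upscaled stages of the script, assuming the standard 4:2:0 layout; it says nothing about which color space the filters actually handle better.

```python
# Chroma plane sizes for a given subsampling factor (2,2 = YV12 / 4:2:0).
def chroma_plane_size(width, height, sub_w, sub_h):
    return width // sub_w, height // sub_h

stages = {"cropped": (496, 68), "upscaled": (1736, 238)}

for name, (w, h) in stages.items():
    yv12 = chroma_plane_size(w, h, 2, 2)   # subsampled chroma
    full = chroma_plane_size(w, h, 1, 1)   # 1:1 chroma, same size as luma
    print(name, "YV12 chroma:", yv12, "1:1 chroma:", full)
```

In other words, the yellow/yellow-green color information in the cropped strip is only 248x34 samples in YV12, so any filter run with chroma processing enabled is working on a much coarser picture than the luma it sits on.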