miamicanes
17th December 2010, 01:33
Suppose you have a 720p24 video that you want to convert to 480i60, specifically optimized for playback on an inherently-interlaced CRT 480i60 display. I'm pretty sure I can figure out most of it, but I'm totally stumped about how to do the "middle" (selective interpolation and duplication) part:
-----
for each sequential pair of 720p24 source frames A and B:
1. use MVinterpolate(?) to synthesize a frame that's roughly 2/3 of the way between A and B. We'll call it "Q".
2. output five frames: A A Q B B
-----
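The loop above can be sketched as plain index bookkeeping. This is not Avisynth script, only a rough Python illustration of which frame lands where. It assumes the pairs are non-overlapping, (A,B) then (C,D), and that the interpolation position alternates between 2/3 and 1/3 as in the field layout described next. The names `expand_pair` and `expand_clip` and the `("interp", ...)` placeholder tuple are made up for the example; a real script would call a motion interpolator at that point.

```python
from fractions import Fraction

def expand_pair(a, b, phase=Fraction(2, 3)):
    """Return the five-frame output A A Q B B for one pair of source frames.

    The synthesized frame is only a symbolic placeholder here:
    ("interp", a, b, phase) means "a frame `phase` of the way from a to b".
    """
    q = ("interp", a, b, phase)
    return [a, a, q, b, b]

def expand_clip(frames, phases=(Fraction(2, 3), Fraction(1, 3))):
    """Walk the clip in non-overlapping pairs and concatenate the results.

    Assumption: the phase alternates per pair (2/3, then 1/3).
    A trailing unpaired frame is dropped in this sketch.
    """
    out = []
    for i in range(0, len(frames) - 1, 2):
        phase = phases[(i // 2) % len(phases)]
        out.extend(expand_pair(frames[i], frames[i + 1], phase))
    return out

faux60 = expand_clip(["A", "B", "C", "D"])
print(len(faux60))  # 10 frames out for every 4 frames in
```

Four source frames become ten output frames, which is the 24 to 60 ratio the conversion needs.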
Ultimately, I'd rip through the faux 60fps progressive output and alternate between grabbing odd and even lines to create an interlaced 30fps video. Say A, B, C, and D are 24p source frames, Q is synthesized roughly 2/3 of the way between A and B, V is synthesized roughly 1/3 of the way between C and D, and "o" and "e" indicate the odd or even set of scanlines. Then a 5-frame/10-field chunk of the final video would be in the form:
AoAe QoBe BoCe CoVe DoDe (then repeat from AoAe...)
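As a sanity check on that layout, here is a small Python sketch (again only label bookkeeping, not Avisynth) that takes the ten-frame faux-60p sequence and pairs consecutive frames into interlaced frames. The function name `weave_fields` is hypothetical, and which of "o" or "e" corresponds to the top field on a real display is left open here; the sketch simply follows the labels used above.

```python
def weave_fields(frames60):
    """Pair up consecutive 60p frames into interlaced frames.

    Even-indexed frames (0, 2, 4, ...) contribute the "o" field and
    odd-indexed frames the "e" field, following the post's labels.
    """
    if len(frames60) % 2:
        raise ValueError("need an even number of 60p frames")
    return [
        f"{frames60[i]}o{frames60[i + 1]}e"
        for i in range(0, len(frames60), 2)
    ]

sequence = list("AAQBBCCVDD")          # faux 60p order for one 4-frame chunk
interlaced = weave_fields(sequence)
print(" ".join(interlaced))            # AoAe QoBe BoCe CoVe DoDe

# Q and V are the only synthesized pictures in the chunk.
synthetic = sum(1 for f in sequence if f in "QV")
print(synthetic, "of", len(sequence), "fields are synthetic")
```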
I know this would look horrific on a natively-progressive display, but I believe it would be a nearly-ideal compromise between judder-minimization and motion artifacts from synthetic fields for a SD CRT TV. Each source frame would start at the same relative moment as it would in normal 3:2 pulldown, but the addition of two interpolated fields would turn the 3:2:3:2 cadence into 2:(1):2:2:(1):2. Put another way, it would still "buzz" from judder, but the frequency of the visual buzzing would be roughly double what it is with conventional pulldown.
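The cadence claim can be checked with a few lines of Python. This sketch assumes conventional 3:2 pulldown lays four film frames over ten fields as AAABBCCCDD, and compares it with the proposed AAQBBCCVDD order; `run_lengths` is just an illustrative helper name.

```python
from itertools import groupby

def run_lengths(fields):
    """Count how many consecutive fields show the same picture."""
    return [len(list(group)) for _, group in groupby(fields)]

pulldown = list("AAABBCCCDD")   # conventional 3:2 pulldown, 10 fields
proposed = list("AAQBBCCVDD")   # proposed scheme, 10 fields

print(run_lengths(pulldown))    # [3, 2, 3, 2]
print(run_lengths(proposed))    # [2, 1, 2, 2, 1, 2]

# Each source frame first appears at the same field index in both schemes.
for label in "ABCD":
    assert pulldown.index(label) == proposed.index(label)
```

The only fields that differ from plain pulldown are index 2 (third A becomes Q) and index 7 (third C becomes V).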
I know mixing fields from different source frames in a single video frame is an absolute no-no for progressive output, but logically it seems like for a natively-interlaced display, it would kind of be like using adjacent subpixels to smooth text the way ClearType does. The MPEG encoder might know that film frame #2 is split between video frames #2 and #3, but presumably your eyes aren't keeping tally of field pairs, and the TV itself is completely indifferent. You'd still notice something is amiss, but I'm predicting that it wouldn't be nearly as obvious as the effect you get from traditional 3:2 pulldown. As an added bonus, only 2 out of 10 fields would be synthetic, and each would persist for only 1/60th of a second, so even if a motion glitch caused some distracting artifact, it would only be a single field surrounded by four unmodified fields from the original video source (2 source frames).
Worst case, I'm guessing that it might throw a monkey wrench into the efficiency of the MPEG-2 encoding algorithm by breaking an assumption made by its authors about the nature of the video being encoded... but if file size is only a secondary concern, it seems like it would work spectacularly well (and might even have become the norm, had CRT displays not become commercially obsolete a decade before computers became fast enough to casually synthesize interpolated video fields like this).
Anyway, the part between the hyphenated lines is what I'm really stuck on right now. I've done a fair bit with Avisynth, but I've never tried to do anything that didn't apply to *everything*; I've never had to selectively pick out frames and do specific things for specific frame numbers.