Best way to deinterlace hybrid 1080i29.97 footage [Archive]

groucho86

22nd September 2016, 17:12

I'm hoping someone can steer me in the right direction.

I have an hour long 1080i29.97 clip. Half of it is 23.976 footage with 3:2 pulldown, half of it is true 29.97i footage (some is 480i upscaled to 1080, some is true 1080i).

I have a feeling this requires using scene detect, then evaluating and processing every scene's type of content (ivtc for telecine, tdeint or qtgmc for true interlaced). Still trying to teach myself frame (scene?) evaluation and would love a little jump-start sample code.

Thank you all!

Mystery Keeper

22nd September 2016, 17:23

I would VFM and then deinterlace it all. Though a plugin that detects combing and switches frame for one from deinterlaced clip would be nice.

groucho86

22nd September 2016, 19:35

Interesting...

I realize I didn't mention an important additional variable: fps output.

I'm kinda leaning towards 23.976p (which means decimating ivtc material and using InterFrame/svp to frame-convert the true 29.97 parts to 23.976).

If I were to remain in 29.97(p), that would necessarily mean the 3:2 pulldown elements would have duplicate frames, right?
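As far as the arithmetic goes, yes: field matching without decimation leaves one repeated frame in every five. The toy model below is only a sketch. It ignores field parity and uses letters in place of images, and the function names are made up for illustration.

```python
def telecine_2_3(film):
    """Spread film frames over fields in a 2:3 cadence and pair them into video frames."""
    fields = []
    for i, frame in enumerate(film):
        # even-numbered film frames contribute 2 fields, odd-numbered ones 3
        fields.extend([frame] * (2 if i % 2 == 0 else 3))
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


def field_match(video):
    """Idealised field matching: rebuild each frame from the film frame of its first field."""
    return [first for first, _second in video]


def decimate(frames):
    """Drop a frame whenever it repeats the previous one."""
    return [f for i, f in enumerate(frames) if i == 0 or f != frames[i - 1]]


video = telecine_2_3(list("ABCD"))   # 4 film frames become 5 video frames
matched = field_match(video)         # 5 progressive frames, one of them a repeat
print(video)
print(matched, decimate(matched))
```

Staying at 29.97p keeps the five-frame `matched` list, repeat included. Going to 23.976p corresponds to the `decimate` step.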

brucethemoose

22nd September 2016, 19:45

Is it animation or live action? And how much of it is really 30i?

Animation is usually 12 FPS anyway, so decimating to 23.976 might not look that bad during the 30 parts. You might see some stutters during panning, but that's it.

Also, be careful with interpolation like SVP; as much as I love it, it's going to introduce artifacts... Can it even downsample frames like that instead of upsampling?

groucho86

22nd September 2016, 19:49

Live action. This is a TV documentary that was shot 23.976 but had lots of true 29.97i archival, both SD and HD. It was delivered for broadcast as 1080i29.97.

I've been using SVP successfully to frame-convert 29.97i archival to 23.976. When the artifacting is bad (usually due to crazy handheld motion/rolling shutter) I decimate instead.

jackoneill

22nd September 2016, 20:36

Why does the frame rate have to be constant? You can have the 24 fps parts at 24 fps and the 30 fps parts at 30 fps.

brucethemoose

22nd September 2016, 20:37

Live action. This is a TV documentary that was shot 23.976 but had lots of true 29.97i archival, both SD and HD. It was delivered for broadcast as 1080i29.97.

I've been using SVP successfully to frame-convert 29.97i archival to 23.976. When the artifacting is bad (usually due to crazy handheld motion/rolling shutter) I decimate instead.

Hmm...

Maybe you could run IVTC on a copy of the clip, detect if it's combed or not, and deinterlace if it's combed?

I literally just made a script for that yesterday, but instead of deinterlacing the IVTC'd clip, you would take the original clip, do your SVP + QTGMC magic to it, and replace the old frame with that.

Problem is VFM's and TDeint's comb detection isn't 100% reliable. Whatever you do, it has to look OK on a few frames in the 24fps material too.

groucho86

22nd September 2016, 20:45

Why does the frame rate have to be constant? You can have the 24 fps parts at 24 fps and the 30 fps parts at 30 fps.

The file needs to be usable for editing/grading (Premiere, Resolve, etc.). These applications do not support VFR.

groucho86

22nd September 2016, 20:47

I literally just made a script for that yesterday, but instead of deinterlacing the IVTC'd clip, you would take the original clip, do your SVP + QTGMC magic to it, and replace the old frame with that.

Your script inspired me to revisit this old problem of mine :)

I'll play around with it some more, thanks!

Mug Funky

26th September 2016, 01:32

i'd considered porting (and improving) my old NTSCtools, but by now i've no idea how it worked. it was wired for CFR output, but could be set to an arbitrary rate (like 120 which would alternate 4 dupes and 5 dupes or 2 dupes for 60i).
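The 120 fps idea works because 120 is a common multiple of 24, 30 and 60. A rough sketch of the arithmetic, reading "dupes" as the number of copies of each source frame (or field, for 60i); the NTSC 1000/1001 factor and the function name are assumptions added for illustration:

```python
from fractions import Fraction


def repeats_per_frame(source_fps, target_fps=Fraction(120000, 1001)):
    """How many times each source frame is shown on a constant-rate target timeline."""
    ratio = Fraction(target_fps) / Fraction(source_fps)
    if ratio.denominator != 1:
        raise ValueError("target rate is not an integer multiple of the source rate")
    return int(ratio)


FILM = Fraction(24000, 1001)        # 23.976p
VIDEO_P = Fraction(30000, 1001)     # 29.97p
VIDEO_BOB = Fraction(60000, 1001)   # 59.94 fields per second, bobbed

for name, rate in [("24p", FILM), ("30p", VIDEO_P), ("60i", VIDEO_BOB)]:
    print(name, repeats_per_frame(rate))
```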

brucethemoose

26th September 2016, 10:55

Your script inspired me to revisit this old problem of mine :)

I'll play around with it some more, thanks!

I wish you the best of luck.

Turns out my footage has the exact same problem. Right now I'm "solving" it by just interpolating the IVTC'd footage to 60p with SVP, but it's less smooth than I had hoped.

If we could somehow feed SVP a VFR feed we generate... unfortunately that's beyond my ability to figure out, as combing isn't a very reliable indicator.

groucho86

26th September 2016, 16:10

i'd considered porting (and improving) my old NTSCtools, but by now i've no idea how it worked. it was wired for CFR output, but could be set to an arbitrary rate (like 120 which would alternate 4 dupes and 5 dupes or 2 dupes for 60i).

That would be a very useful porting.

...I'm going to slowly develop my own function. It will be crude and at times inaccurate but I'm sure I'll learn a lot.

My plan is to do a first pass scene detect. Then for each scene, detect if it's fully interlaced or if it's 3:2 (or fully progressive). I'll concatenate each scene after processing them and will need to make sure timing remains accurate.
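One way to keep timing accurate when concatenating per-scene results at 23.976 is to round on the running total rather than on each scene, so rounding errors do not pile up. This is only a sketch of the bookkeeping. The scene lengths are invented and `output_lengths` is a hypothetical helper, not part of any plugin.

```python
from fractions import Fraction


def output_lengths(scene_lengths, ratio=Fraction(4, 5)):
    """Output frame count per scene for a 29.97 -> 23.976 conversion (ratio 4/5).

    Rounding is applied to the cumulative frame count, so the total output
    length never drifts more than half a frame from the ideal duration.
    """
    lengths = []
    src_total = 0
    out_total = 0
    for n in scene_lengths:
        src_total += n
        target = round(src_total * ratio)  # ideal cumulative output length
        lengths.append(target - out_total)
        out_total = target
    return lengths


scenes = [7, 8, 10]            # source frames per detected scene (made up)
print(output_lengths(scenes))  # per-scene output lengths at 23.976
```

Whatever each scene is processed with (decimation for 3:2 material, frame-rate conversion for true 29.97), its result would then be trimmed or padded to the length given here before splicing.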

Mug Funky

27th September 2016, 12:20

@groucho86: if the footage is even slightly clean (ie from a not too compressed source), you can handle it in one pass by comparing fields. in a 2:3 pattern there are 3-field frames whereby the 1st and 3rd field should be identical.

if you grab a sequence of 5 fields (maybe a couple more for safety) you can look at any one field and find 2 of these 3-field groups (one forward and one backward). my old function did 5 separate compares to work with the 5 possible positions the current field can be in. if there was 1 "match" where 2 of these groups are found, the field was declared to be part of a 2:3 sequence.

this makes it very robust. perhaps with VS, new concepts can be implemented to make the whole thing faster and include more potential cases (my script only distinguished between 60i and 24p, treating 30p as a case of 60i and handing it all off to tdeint, which did a good job of not hitting the 30p too badly).
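The repeated-field idea can be sketched without any video library. In the toy below each field is a short list of pixel values, and a field is compared with the same-parity field two positions later. In 2:3 material those near-identical pairs should recur every five fields. The threshold and function names are assumptions, and this is not a port of the old function.

```python
def mean_abs_diff(a, b):
    """Average absolute pixel difference between two equally sized fields."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def repeated_field_positions(fields, threshold=1.0):
    """Indices n where field n and field n+2 (same parity) are near-identical."""
    return [n for n in range(len(fields) - 2)
            if mean_abs_diff(fields[n], fields[n + 2]) < threshold]


def is_2_3_cadence(fields, threshold=1.0):
    """True if repeated fields occur at least twice and exactly five fields apart."""
    hits = repeated_field_positions(fields, threshold)
    return len(hits) >= 2 and all(b - a == 5 for a, b in zip(hits, hits[1:]))


# Toy data: 8 film frames, each a flat 4-pixel "image", telecined 2:3.
film = [[k * 10.0] * 4 for k in range(8)]
telecined = []
for i, frame in enumerate(film):
    telecined.extend([frame] * (2 if i % 2 == 0 else 3))

print(repeated_field_positions(telecined), is_2_3_cadence(telecined))
```

A static shot makes every comparison a hit, so it is reported as "not 2:3" here. Real material would need the surrounding frames to settle such cases, which is presumably why the old function looked both forward and backward.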

Myrsloik

27th September 2016, 13:37

I like to promote creative solutions so I'm just going to point out that you can get the matches and matching metrics as output from VFM which generally should be a fairly good hint of 24/30fps content. You can follow up with VDecimate(dryrun=True) to get an even better difference metric. With all that you could effectively implement your own hybrid decimation using FrameEval. Maybe a bit clunky but even looking at a radius of 20 frames to make the actual decisions shouldn't be that slow since all you need is a cached metric.
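The decision logic on top of cached metrics could look roughly like the sketch below. It does not call VFM or VDecimate. The list of booleans stands in for a per-frame "this frame would be dropped" flag of the kind a decimation dry run could provide, and the 20-frame radius is taken from the post. The tolerance and function names are assumptions.

```python
def duplicate_fraction(dup_flags, n, radius=20):
    """Share of frames flagged as duplicates in a window centred on frame n."""
    lo = max(0, n - radius)
    hi = min(len(dup_flags), n + radius + 1)
    window = dup_flags[lo:hi]
    return sum(window) / len(window)


def classify(dup_flags, n, radius=20, tol=0.08):
    """Call frame n 'film' if about 1 frame in 5 around it is a duplicate."""
    frac = duplicate_fraction(dup_flags, n, radius)
    return "film" if abs(frac - 0.2) <= tol else "video"


# Toy metrics: 100 frames of field-matched 3:2 material, then 100 frames of true video.
flags = [i % 5 == 4 for i in range(100)] + [False] * 100
print(classify(flags, 50), classify(flags, 150))
```

Inside a FrameEval callback the same function would be fed from frame properties instead of a list. Frames near a film/video boundary fall between the two cases and would need a scene-change hint to be placed correctly.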

Not that I like this kind of stuff anyway, I did fight in the VFR war of 2003 (http://forum.doom9.org/showthread.php?p=394581#post394581) after all.

brucethemoose

28th September 2016, 00:47

Ask if you need a test clip, I've got some hybrid 24/30 footage myself.

In fact, VFM/VDecimate has trouble with it as is (I have to use a separate combing detector), so it might be a good stress test.

Mug Funky

28th September 2016, 14:44

i'm all for checking as many frames as possible. the real world always trips up my fancy metrics, but python offers some intriguing possibilities versus what avisynth could do.

in avisynth i ran a single pass with two functions that spoke to each other. one looked at the video and output a "data" clip (a tiny square that was either red for 2:3, green for 30p or blue for 60i), and another function that took the source and the "data" clip as arguments, which did the conversion.

there are about a thousand better ways to do this in a real programming language. i'd jump on it but i scarcely deal with hybrid stuff any more (used to work at an anime distributor, now i work with live-action).

brucethemoose

2nd October 2016, 08:19

That would be a very useful porting.

...I'm going to slowly develop my own function. It will be crude and at times inaccurate but I'm sure I'll learn a lot.

My plan is to do a first pass scene detect. Then for each scene, detect if it's fully interlaced or if it's 3:2 (or fully progressive). I'll concatenate each scene after processing them and will need to make sure timing remains accurate.

i'm all for checking as many frames as possible. the real world always trips up my fancy metrics, but python offers some intriguing possibilities versus what avisynth could do.

in avisynth i ran a single pass with two functions that spoke to each other. one looked at the video and output a "data" clip (a tiny square that was either red for 2:3, green for 30p or blue for 60i), and another function that took the source and the "data" clip as arguments, which did the conversion.

there are about a thousand better ways to do this in a real programming language. i'd jump on it but i scarcely deal with hybrid stuff any more (used to work at an anime distributor, now i work with live-action).

Well if either of y'all ever need a test clip, I attached a clip with a 30p pan + NTSC telecined 24p with ~12fps animation, and another with a rather extreme 24 pan from the same episode:

http://www.mediafire.com/file/66d247iz4geq6xx/testclips.7z

Also, with the SVP script, do you just manually select the 30-> 24 scenes now? I'll try that out, I suppose.

EDIT: Original links were invalid due to copyright detection on 2 separate sites (really? on 30 second clips?), so I just dropped them in a 7z file.

groucho86

3rd October 2016, 19:33

Also, with the SVP script, do you just manually select the 30-> 24 scenes now? I'll try that out, I suppose

That's the plan.

I'm making incredibly slow progress. First time not using spoon-fed functions and actually using FrameEval and accessing frame properties...

brucethemoose

3rd October 2016, 20:48

That's the plan.

I'm making incredibly slow progress. First time not using spoon-fed functions and actually using FrameEval and accessing frame properties...

Is the code on GitHub?

brucethemoose

7th October 2016, 03:53

@groucho86: if the footage is even slightly clean (ie from a not too compressed source), you can handle it in one pass by comparing fields. in a 2:3 pattern there are 3-field frames whereby the 1st and 3rd field should be identical.

if you grab a sequence of 5 fields (maybe a couple more for safety) you can look at any one field and find 2 of these 3-field groups (one forward and one backward). my old function did 5 separate compares to work with the 5 possible positions the current field can be in. if there was 1 "match" where 2 of these groups are found, the field was declared to be part of a 2:3 sequence.

this makes it very robust. perhaps with VS, new concepts can be implemented to make the whole thing faster and include more potential cases (my script only distinguished between 60i and 24p, treating 30p as a case of 60i and handing it all off to tdeint, which did a good job of not hitting the 30p too badly).

I'm looking at NTSCTools now, and I assume this is the code that's doing the source type detection:

function NTSCanalyse (clip c, float "th_film", int "th_film2", int "th_prog", float "tol", int "precision", bool "show")
{
# output clip is teeny coloured clip in same space as input
# red = FILM
# green = progressive
# blue = interlaced

show=default(show,false)
order = c.getparity()==true? 1 : 0

global isfade=true
global filmnum=-4
global progtele=0
global th_film=default(th_film,.3)
global th_film2=default(th_film2,10)
#global th_prog=default(th_prog,2)
global th_prog=default(th_prog,30)
global tol=default(tol,.1)
global precision=default(precision,1)

#d = c.horizontalreduceby2().bob(1,0,height=c.height/2).converttoyv12()
d = c.bicubicresize(32*precision,c.height,1/3.,1/3.,16,0,c.width-32,c.height).separatefields().bicubicresize(32*precision,32*precision,1/3.,1/3.,0,8,32*precision,(c.height/2.)-16).converttoyv12()

global film_c = d
global film_d = d.selectevery(2,0)

# f01: difference between the two fields of the current frame (feeds the progressive test)
global f01 = mt_lutxy(film_c.selecteven(),film_c.selectodd(),expr="x y - abs")#.mt_inpand(mode=mt_rectangle(0,2))

# Same-parity field differences around the current frame. The digits are field offsets
# relative to the frame's first field; f = forward, b = backward (negative offsets).
global f02 = mt_lutxy(film_c.selectevery(2,0), film_c.selectevery(2,2),expr="x y - abs")
global f13 = mt_lutxy(film_c.selectevery(2,1), film_c.selectevery(2,3),expr="x y - abs")
# b1f1: field -1 against field +1
global b1f1 = mt_lutxy(film_c.selectevery(2,-1), film_c.selectevery(2,1),expr="x y - abs")
global b02 = mt_lutxy(film_c.selectevery(2,0), film_c.selectevery(2,-2),expr="x y - abs")
global f24 = mt_lutxy(film_c.selectevery(2,2), film_c.selectevery(2,4),expr="x y - abs")
global b13 = mt_lutxy(film_c.selectevery(2,-1), film_c.selectevery(2,-3),expr="x y - abs")
global b24 = mt_lutxy(film_c.selectevery(2,-2), film_c.selectevery(2,-4),expr="x y - abs")
global f35 = mt_lutxy(film_c.selectevery(2,3), film_c.selectevery(2,5),expr="x y - abs")
global b35 = mt_lutxy(film_c.selectevery(2,-3), film_c.selectevery(2,-5),expr="x y - abs")

global NTSCanalyse_red = blankclip(film_d,width=8,height=4,color=$ff0000)
global NTSCanalyse_green = blankclip(film_d,width=8,height=4,color=$00ff00)
global NTSCanalyse_blue = blankclip(film_d,width=8,height=4,color=$0000ff)

film_d1 = scriptclip(NTSCanalyse_blue,"isfilm==1? NTSCanalyse_red : isprog==1? NTSCanalyse_green : NTSCanalyse_blue")
#film_d2 = frameevaluate(film_d1,"isprog= f01.yplanemax(tol) <= th_prog ? 1 : 0")
film_d2 = frameevaluate(film_d1,"isprog= f01.yplanemax() <= th_prog ? 1 : 0")
film_d3 = frameevaluate(film_d2,"""
\ isfilm= filmnum!=current_frame? 0 : (b1f1.yplanemax() < th_film2) ? 1 :
\ (f02.yplanemax() < th_film2) && (b35.yplanemax() < th_film2) ? 1 :
\ (b02.yplanemax() < th_film2) && (f35.yplanemax() < th_film2) ? 1 :
\ (b13.yplanemax() < th_film2) && (f24.yplanemax() < th_film2) ? 1 :
\ (f13.yplanemax() < th_film2) && (b24.yplanemax() < th_film2) ? 1 :
\ (f24.yplanemax() < th_film2) && (b13.yplanemax() < th_film2) ? 1 :
\ (b24.yplanemax() < th_film2) && (f13.yplanemax() < th_film2) ? 1 : 0 """)

film_d4 = frameevaluate(film_d3,"""
\ filmnum= (b1f1.averageluma() < th_film) ? current_frame :
\ (f02.averageluma() < th_film) && (b35.averageluma() < th_film) ? current_frame :
\ (b02.averageluma() < th_film) && (f35.averageluma() < th_film) ? current_frame :
\ (b13.averageluma() < th_film) && (f24.averageluma() < th_film) ? current_frame :
\ (f13.averageluma() < th_film) && (b24.averageluma() < th_film) ? current_frame :
\ (f24.averageluma() < th_film) && (b13.averageluma() < th_film) ? current_frame :
\ (b24.averageluma() < th_film) && (f13.averageluma() < th_film) ? current_frame : filmnum """)

show==true? overlay(c,film_d4) : film_d4
}

Forgive me, for I'm new to Python and VapourSynth (and totally unfamiliar with AviSynth), but I don't have a grasp on what's going on here... are you checking to see if various parts of frames match each other? Is that what b13, f04 and so on stand for?

I ask because I have all the code I need for hybrid 24/30 IVTC, except for the source type detection part itself.
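Going by the selectevery(2, k) offsets in the listing, the names appear to encode which two fields are differenced: the digits are field offsets from the current frame's first field, with "f" looking forward and "b" backward. The table below is only that reading written out in Python, not something the author confirmed.

```python
# Offsets copied from the selectevery(2, k) calls in the NTSCanalyse listing.
# On a field-separated clip, selectevery(2, k) returns field 2*n + k for frame n.
FIELD_PAIRS = {
    "f02": (0, 2), "f13": (1, 3), "f24": (2, 4), "f35": (3, 5),
    "b02": (0, -2), "b13": (-1, -3), "b24": (-2, -4), "b35": (-3, -5),
    "b1f1": (-1, 1),
}


def compared_fields(name, frame):
    """Absolute field indices that the clip called `name` compares at `frame`."""
    a, b = FIELD_PAIRS[name]
    return (2 * frame + a, 2 * frame + b)


for name in ("f02", "b13", "b1f1"):
    print(name, compared_fields(name, 10))
```

Every pair is two fields apart, so each clip compares fields of the same parity, which fits the "1st and 3rd field should be identical" test for 2:3 material.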

ChiDragon

8th October 2016, 06:39

As a video topic, hybrid sources as well as bizarro pulldown are in my wheelhouse but I've never used VapourSynth or Python.

http://forum.doom9.org/showthread.php?t=149003&highlight=scivtc
http://forum.doom9.org/showthread.php?p=1693859#post1693859

My Avisynth filter works decently when it doesn't crash, provided you're enough of a perfectionist/masochist to manually fix all of the scene change detection errors and film/video pattern errors. I'm not as keen on making things automagical, which is where I believe you guys are at.

A typical override file from a 45-minute '90s TV show. This one is 44 lines including the whitespace separators. I believe it would typically take me about 1.5 hours of eagle-eyed viewing to produce this override file while watching the episode.
8023,8051 dsc+
8122,8160 dsc+
8510,8543 dsc+

#section with video intercut with film
17260 p0
17942 p0
18209 p0
18504 p0
18709 p0
19884 p0

42308 p0

43718 X #bug in encoded stream

49643 sc+

53094 sc-

56057 sc+

72785 p0

#black-and-white choppy video intercut with film
72926,73012 sc-
72926 sc+
72926 p0
73013 p0
73121,73168 sc-
73121 sc+
73121 p0
73169 p0
73196 p0
73282,73356 sc-
73282 sc+
73282 p0

75025,75141 dsc+

#end credits
79037 p0
79622 X
80581 p0

dsc+ indicates a dissolve between two different pulldown patterns.
p0 is a video section (handed over to an external filter; in my case in 2009 this was TFM+TDecimate to poorly blend-down from 29.97 to 23.976 for CFR)
sc+ and sc- are adding or removing scene changes
X is interpolating from same-order field (deinterlacing) rather than matching to anything.
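For anyone wanting to reuse override files like this one from Python, a minimal reader might look like the sketch below. The grammar is inferred from the sample alone (a frame number or a start,end range, then a command, with "#" starting a comment), so treat it as an assumption rather than the filter's actual parser.

```python
def parse_overrides(text):
    """Parse override lines into (start_frame, end_frame, command) tuples."""
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding space
        if not line:
            continue                          # blank line or comment-only line
        frames, command = line.split(None, 1)
        start, _, end = frames.partition(",")
        entries.append((int(start), int(end or start), command.strip()))
    return entries


sample = """8023,8051 dsc+

#section with video intercut with film
17260 p0
43718 X #bug in encoded stream
"""
for entry in parse_overrides(sample):
    print(entry)
```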

brucethemoose

8th October 2016, 10:52

Thanks for posting, I'm gonna give that a shot if I can get it working in VS. Some animation that was edited post-telecine is giving me a real headache, and VIVTC + QTGMC as a decomber is honestly not cutting it.

ChiDragon

8th October 2016, 14:47

I looked at YATTA's Pattern Guidance last night after finally finding a reasonable guide. Points 1 and 4 probably apply to YATTA PG as well as my filter. But I think 2, 3, and 5 are unique. It would be nice if they could be added to YATTA (or the VS rewrite "Wobbly"), cause the interface is a lot nicer than bouncing between VDub or AvsPmod to view + manual text edit + reload script.

I processed your samples using my filter. Will upload later.

ChiDragon

9th October 2016, 05:15

Here are the cartoon samples from post #17 (http://forum.doom9.org/showpost.php?p=1782010&postcount=17), converted to 23.976 CFR using SCIVTC semi-auto. I've never used SVP, so I just got lazy and used horrible TDecimate blending on the 30p pan. But my filter just grabs video-rate segments directly from an external clip, therefore that section could easily be replaced by swapping out one parameter. If the clip instead had 30i sections that should be bobbed, I don't know of a good way to handle that. At one point I wanted to rewrite it to offer a 60p output mode, but I don't think that is in the cards.

Doom9 brucethemoose ATLA samples 1 - SCIVTC.zip (https://mega.nz/#!FUhhRBrJ!uUmaQpEvJWqGDkUuzmeCSRHyWzJxafe8DyMwXig2Csc) (Mega.co.nz | 9.4 MB)