[Questions] Automated script generation [Archive]

cogman

15th December 2020, 02:28

I'm getting ready to set up a super Plex encoding server for fun. One of my goals is to automate the process as much as possible.

So here's my plan of attack, tell me if it's crazy.

Use ffmpeg's interlace detection to figure out if a video has interlacing.

Use ffmpeg's black bar detection to find black bars

Do a test noise filter. Use something like KNLMeansCL with a high strength setting, then compare input and output to see if there's a big difference. I'm thinking something like VMAF would be good. My assumption (would this be wrong?) is that noisy videos would have lower VMAF scores than clean videos. Is that true? Would there be a better way to score noise of a video?

Assuming all that works correctly, go off and generate an appropriate VapourSynth script (adding and removing filters based on what was previously detected).

So, my questions:

Which noise filters should I be looking at?
Which deinterlacing filters should I be looking at?
Does it make sense to try and split a video's frequencies and denoise specific frequencies? (For example, I've seen videos where whites seem to have a ton of noise whereas darks not so much).
Are there tools I'm overlooking?

stax76

15th December 2020, 05:34

Use ffmpeg's black bar detection to find black bars

Not working in my experience.

cogman

15th December 2020, 16:11

Not working in my experience.

Do you know of alternatives? It's worked for me in the past on some media (Haven't extensively tested it though).

Selur

15th December 2020, 18:27

Use ffmpeg's interlace detection to figure out if a video has interlacing.
Problematic...
I get:
Input #0, mpeg, from 'f:\TestClips&Co\files\interlaceAndTelecineSamples\interlaced\eis-sample.mpg':
Duration: 00:00:59.98, start: 0.813356, bitrate: 2441 kb/s
Stream #0:0[0x1e0]: Video: mpeg2video (Main), yuv420p(tv, top first), 480x576 [SAR 8:5 DAR 4:3], 25 fps, 25 tbr, 90k tbn, 50 tbc
Side data:
cpb: bitrate max/min/avg: 2350000/0/0 buffer size: 1835008 vbv_delay: N/A
Stream #0:1[0x1c0]: Audio: mp2, 44100 Hz, stereo, s16p, 224 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (mpeg2video (native) -> rawvideo (native))
Press [q] to stop, [?] for help
Output #0, rawvideo, to 'NUL':
Metadata:
encoder : Lavf58.65.100
Stream #0:0: Video: rawvideo (I420 / 0x30323449), yuv420p(tv, top coded first (swapped)), 480x576 [SAR 8:5 DAR 4:3], q=2-31, 82944 kb/s, 25 fps, 25 tbn
Metadata:
encoder : Lavc58.115.102 rawvideo
frame= 360 fps=0.0 q=-0.0 Lsize= 145800kB time=00:00:14.40 bitrate=82944.0kbits/s speed=79.3x
video:145800kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
[Parsed_idet_0 @ 00000292278275c0] Repeated Fields: Neither: 352 Top: 5 Bottom: 4
[Parsed_idet_0 @ 00000292278275c0] Single frame detection: TFF: 247 BFF: 47 Progressive: 16 Undetermined: 51
[Parsed_idet_0 @ 00000292278275c0] Multi frame detection: TFF: 330 BFF: 22 Progressive: 9 Undetermined: 0
for a source that is entirely TFF.
And I get for example:
Input #0, mpeg, from 'f:\TestClips&Co\files\interlaceAndTelecineSamples\telecine\mpeg2_720x480_4-3.vob':
Duration: 00:03:18.03, start: 0.228411, bitrate: 7030 kb/s
Stream #0:0[0x1bf]: Data: dvd_nav_packet
Stream #0:1[0x1e0]: Video: mpeg2video (Main), yuv420p(tv, fcc/bt470bg/bt470bg, top first), 720x480 [SAR 8:9 DAR 4:3], 29.97 fps, 29.97 tbr, 90k tbn, 59.94 tbc
Side data:
cpb: bitrate max/min/avg: 8000000/0/0 buffer size: 1835008 vbv_delay: N/A
Stream #0:2[0xa0]: Audio: pcm_dvd, 48000 Hz, stereo, s16, 1536 kb/s
Stream mapping:
Stream #0:1 -> #0:0 (mpeg2video (native) -> rawvideo (native))
Press [q] to stop, [?] for help
Output #0, rawvideo, to 'NUL':
Metadata:
encoder : Lavf58.65.100
Stream #0:0: Video: rawvideo (I420 / 0x30323449), yuv420p(tv, fcc/bt470bg/bt470bg, top coded first (swapped)), 720x480 [SAR 8:9 DAR 4:3], q=2-31, 124291 kb/s, 29.97 fps, 29.97 tbn
Metadata:
encoder : Lavc58.115.102 rawvideo
frame= 360 fps=0.0 q=-0.0 Lsize= 182250kB time=00:00:12.01 bitrate=124291.7kbits/s speed=50.2x
video:182250kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%
[Parsed_idet_0 @ 000001dbffdd75c0] Repeated Fields: Neither: 359 Top: 1 Bottom: 1
[Parsed_idet_0 @ 000001dbffdd75c0] Single frame detection: TFF: 332 BFF: 0 Progressive: 0 Undetermined: 29
[Parsed_idet_0 @ 000001dbffdd75c0] Multi frame detection: TFF: 332 BFF: 0 Progressive: 0 Undetermined: 29
for a telecined source.

-> Nope, I don't know any reliable method to properly detect whether a source is interlaced/telecine/mixed/progressive/fieldblended/... other than looking at the source with human eyes.
I tried it years ago with https://github.com/Selur/MPlayerInterlaceDetection and MeGUI also has some interlace detection ( http://avisynth.nl/index.php/Interlace_detection ), but none of these is really reliable.
Please let me know in case you find something good.
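
For anyone who wants to experiment anyway, here is a minimal sketch of how the idet summary lines shown above could be tabulated in Python. The function names and the 90% cut-off (min_share) are made up for illustration, and as the two logs show, the numbers alone do not separate interlaced from telecined material.

```python
import re

# Matches the summary lines that the idet filter prints at the end of a run.
IDET_LINE = re.compile(
    r"(Single|Multi) frame detection:\s*"
    r"TFF:\s*(\d+)\s+BFF:\s*(\d+)\s+Progressive:\s*(\d+)\s+Undetermined:\s*(\d+)"
)

def parse_idet(log_text):
    """Return {'single': {...}, 'multi': {...}} with the idet frame counts."""
    results = {}
    for kind, tff, bff, prog, undet in IDET_LINE.findall(log_text):
        results[kind.lower()] = {
            "tff": int(tff), "bff": int(bff),
            "progressive": int(prog), "undetermined": int(undet),
        }
    return results

def guess_field_order(counts, min_share=0.9):
    """Hypothetical heuristic: name a class only if it holds min_share of all frames."""
    total = sum(counts.values())
    if total == 0:
        return "unknown"
    label, hits = max(counts.items(), key=lambda item: item[1])
    if label == "undetermined" or hits / total < min_share:
        return "unknown"
    return label

# Summary lines copied from the first log above (the TFF sample).
sample_log = """\
[Parsed_idet_0 @ 00000292278275c0] Repeated Fields: Neither: 352 Top: 5 Bottom: 4
[Parsed_idet_0 @ 00000292278275c0] Single frame detection: TFF: 247 BFF: 47 Progressive: 16 Undetermined: 51
[Parsed_idet_0 @ 00000292278275c0] Multi frame detection: TFF: 330 BFF: 22 Progressive: 9 Undetermined: 0
"""

idet = parse_idet(sample_log)
print(guess_field_order(idet["multi"]))   # multi-frame counts
print(guess_field_order(idet["single"]))  # single-frame counts
```

With these counts the multi-frame result clears the cut-off while the single-frame result does not. The telecined sample's multi-frame line (TFF: 332 of 361) would also clear it, which is exactly the problem described above.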

Use ffmpeg's black bar detection to find black bars
You would probably need to run the detection multiple times with different settings (on different parts of the input) and then apply some sanity bounds when evaluating the results.
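
A rough Python sketch of that idea, assuming the cropdetect output of each run has already been captured as text. It only relies on the crop=w:h:x:y token that cropdetect prints; the sample values, the function names and the max_loss bound are made up for illustration.

```python
import re

CROP = re.compile(r"crop=(\d+):(\d+):(\d+):(\d+)")  # w:h:x:y

def parse_crops(log_text):
    """Extract every (w, h, x, y) tuple that cropdetect reported."""
    return [tuple(int(v) for v in m) for m in CROP.findall(log_text)]

def combine_crops(crops, frame_w, frame_h, max_loss=0.5):
    """Keep every pixel that any sample considered picture (union of rectangles).

    max_loss is a hypothetical sanity bound: if the union would remove more
    than that share of the frame area, return None so a human can check.
    """
    if not crops:
        return None
    left = min(x for w, h, x, y in crops)
    top = min(y for w, h, x, y in crops)
    right = max(x + w for w, h, x, y in crops)
    bottom = max(y + h for w, h, x, y in crops)
    width, height = right - left, bottom - top
    if width * height < (1.0 - max_loss) * frame_w * frame_h:
        return None
    return (width, height, left, top)

# Made-up results from three runs on different parts of a 720x480 clip.
runs = [
    "crop=720:352:0:64\ncrop=720:352:0:64",
    "crop=720:360:0:60",
    "crop=704:352:8:64",
]
crops = [c for run in runs for c in parse_crops(run)]
best = combine_crops(crops, 720, 480)
print(best)
```

Taking the union errs on the side of undercropping: a dark scene that makes one run report a smaller picture cannot shrink the final result.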

Do a test noise filter. Use something like KNLMeansCL with a high strength setting, then compare input and output to see if there's a big difference. I'm thinking something like VMAF would be good. My assumption (would this be wrong?) is that noisy videos would have lower VMAF scores than clean videos. Is that true? Would there be a better way to score noise of a video?
I doubt this works, at least I don't know a way to do this properly to differentiate between noise/details and artifacts.

Which noise filters should I be looking at?
Different noise characteristics (spatial/temporal/chroma noise,..) -> different filters

Which deinterlacing filters should I be looking at?
For the analysis steps or for the final deinterlacing?
For the analysis, analyzing the effects of standard IVTC and bob filters is probably good enough. For the final deinterlacing, as a start:
QTGMC for interlaced content
VIVTC for telecined and different pull-down sources (+ maybe vinverse/vinverse2)
soft telecine -> no clue
VFM for field shifted content
QTGMC+sRestore for mixed content (not as good as manual tuning, but a quick way)
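
As a sketch of how that list could feed the script generation step, the following Python builds VapourSynth script text from a detected content type. The content type labels, build_script and the table are made up for illustration, and the filter invocations (ffms2 as source filter, the QTGMC preset, srestore without arguments) are assumptions that should be checked against the installed plugin versions. Soft telecine is left out since no recommendation was given for it.

```python
# Content type -> VapourSynth lines, following the list above.
# The exact calls and parameters are assumptions; verify them against the
# plugins you have installed (havsfunc for QTGMC/srestore, vivtc for VFM).
DEINTERLACE_STEPS = {
    "progressive": [],
    "interlaced": ['clip = havsfunc.QTGMC(clip, Preset="Slower", TFF={tff})'],
    "telecined": [
        "clip = core.vivtc.VFM(clip, order={order})",
        "clip = core.vivtc.VDecimate(clip)",
    ],
    "field_shifted": ["clip = core.vivtc.VFM(clip, order={order})"],
    "mixed": [
        'clip = havsfunc.QTGMC(clip, Preset="Slower", TFF={tff})',
        "clip = havsfunc.srestore(clip)",
    ],
}

def build_script(source_path, content_type, tff=True, crop=None):
    """Return script text; crop is (left, right, top, bottom) in pixels or None."""
    steps = [
        template.format(tff=tff, order=1 if tff else 0)
        for template in DEINTERLACE_STEPS[content_type]
    ]
    lines = ["import vapoursynth as vs"]
    if any("havsfunc" in step for step in steps):
        lines.append("import havsfunc")
    lines.append("core = vs.core")
    lines.append(f"clip = core.ffms2.Source(source={source_path!r})")
    lines.extend(steps)
    if crop is not None:
        # Crop after field matching/deinterlacing so field order is untouched.
        left, right, top, bottom = crop
        lines.append(
            f"clip = core.std.Crop(clip, left={left}, right={right}, "
            f"top={top}, bottom={bottom})"
        )
    lines.append("clip.set_output()")
    return "\n".join(lines) + "\n"

script = build_script("input.mpg", "telecined", tff=True, crop=(0, 0, 60, 60))
print(script)
```

Keeping the filter choice in a plain table means the detection side only has to produce a label, and the table can be tuned by hand per source type.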

So here's my plan of attack, tell me if it's crazy.
Seeing that you need to ask about recommended filters -> It's crazy, but that doesn't mean you can't stumble over some good heuristics. :D
I thought about this when I started with the first versions of sx264 which later became Hybrid, sadly I couldn't figure it out back then.
Assuming your input has similar characteristics regarding noise etc. I think it's possible to do some best guesses based on some tests, but I don't see this reliably working on random sources.
(Knowing for example that the source was created with video camera xy or similar, I think it's possible, but still lots of work.)

Cu Selur

cogman

15th December 2020, 20:47

Seeing that you need to ask about recommended filters -> It's crazy, but that doesn't mean you can't stumble over some good heuristics.

:D Probably. It's been a really long time since I've done any sort of encoding work. I know I'm horribly out of date on a lot of stuff (last time I really approached something like this was back in college ~2006) so I wanted to make sure what I'm trying to do hasn't been done better by others.

I figure this sort of stuff should give me ample opportunity to screw around with a lot of tech and programming that I've not messed with in a long time :D. And who knows, maybe I'll get some good looking videos out the end (lol)

Thanks for all the pointers.

zorr

15th December 2020, 21:20

noisy videos would have lower VMAF scores than clean videos. Is that true? Would there be a better way to score noise of a video?

VMAF and pretty much all the similarity measures available in VapourSynth need a reference video to compare with. But you could try to detect noise by denoising the video and comparing it to the unprocessed version; if the difference (in VMAF, SSIM, GMSD etc.) is small, it's probably not very noisy.
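
A toy Python illustration of the denoise-and-compare idea on plain 2D lists of luma values. The 3x3 box blur stands in for a real denoiser and the mean absolute difference stands in for VMAF/SSIM/GMSD; both are simplifications chosen so the sketch runs without any plugins. Fine detail would raise this score just like noise does.

```python
import random

def box_blur(frame):
    """3x3 mean filter with edge clamping; a crude stand-in for a real denoiser."""
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += frame[yy][xx]
            row.append(total / 9)
        out.append(row)
    return out

def mean_abs_diff(a, b):
    """Average per-pixel difference; stand-in for a proper similarity metric."""
    h, w = len(a), len(a[0])
    return sum(abs(a[y][x] - b[y][x]) for y in range(h) for x in range(w)) / (h * w)

def noise_score(frame):
    """Higher means the 'denoised' frame differs more from the input."""
    return mean_abs_diff(frame, box_blur(frame))

rng = random.Random(0)  # fixed seed so the example is repeatable
clean = [[x * 4 for x in range(32)] for _ in range(32)]  # smooth ramp
noisy = [[v + rng.uniform(-20, 20) for v in row] for row in clean]

print(noise_score(clean) < noise_score(noisy))
```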

I have used a much simpler yet quite effective strategy to measure noise. First apply an edge detection filter such as EdgeDetect(clip, mode='TEdge') from G41Fun and then measure the brightness. The clip is brighter when there's a lot of noise. Of course the edge detection filter is also going to make edges bright, so it gives you false alarms. I have used this method to detect noisy horizontal areas in videos; there I can compare the brightness levels of different slices of the same frame, which makes it much more robust.
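
The slice comparison can be sketched in plain Python as below. The edge operator here is a simple sum of neighbour differences, not the TEdge operator from G41Fun, and the frame is a made-up 40x40 flat image with noise added to its bottom quarter.

```python
import random

def edge_strength(frame):
    """Sum of absolute horizontal and vertical neighbour differences per pixel.
    A simple stand-in for an edge filter, not the TEdge operator itself."""
    h, w = len(frame), len(frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            out[y][x] = (abs(frame[y][x + 1] - frame[y][x])
                         + abs(frame[y + 1][x] - frame[y][x]))
    return out

def slice_brightness(edges, slices):
    """Mean edge brightness of each horizontal band, top to bottom."""
    h = len(edges)
    means = []
    for i in range(slices):
        rows = edges[i * h // slices:(i + 1) * h // slices]
        pixels = [v for row in rows for v in row]
        means.append(sum(pixels) / len(pixels))
    return means

rng = random.Random(1)  # fixed seed so the example is repeatable
frame = [[128.0] * 40 for _ in range(40)]
for y in range(30, 40):  # add noise to the bottom quarter only
    frame[y] = [v + rng.uniform(-10, 10) for v in frame[y]]

means = slice_brightness(edge_strength(frame), slices=4)
print(means.index(max(means)))  # index of the brightest (noisiest) band
```

Comparing bands within one frame sidesteps the need for an absolute threshold, since real edges tend to affect neighbouring bands similarly while localized noise does not.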

stax76

15th December 2020, 21:42

Do you know of alternatives? It's worked for me in the past on some media (Haven't extensively tested it though).

I wrote an autocrop.exe for staxrip, it depends on frameserver.dll also written for staxrip, but I'm not providing any support or help for it unless it's run by staxrip...

There are auto crop avs and vs filters that can also be used for automation.

cogman

15th December 2020, 21:52

I wrote an autocrop.exe for staxrip, it depends on frameserver.dll also written for staxrip, but I'm not providing any support or help for it unless it's run by staxrip...

There are auto crop avs and vs filters that can also be used for automation.

This right?

https://github.com/staxrip/staxrip/blob/a6dc0e02c327ba158adeb9f6d30f8b7171b6bfb2/Tools/AutoCrop/Main.vb

Am I correct in seeing that you are mostly looking for the darkest pixels and then walking inward (based on a tolerance)?

stax76

15th December 2020, 22:12

Sorry, I don't remember how it works, there were never complaints however, so it should work well.

StainlessS

16th December 2020, 00:23

AVS AutoCrop, IIRC, works as follows (described for the TOP edge only).

Starts with Top Limit Cursor about mid frame,

For each sample frame of the SAMPLES frames:
Scan inwards from the Top edge towards the Top Limit Cursor;
if at least Z consecutive raster lines have a luma average above some threshold, then
move the Top Limit Cursor outwards to the outermost of those raster lines and move on to the next frame in SAMPLES.

At end of SAMPLES frames, Top Limit Cursor is the result for top image edge. [above it is border]

The detected image grows over the SAMPLES frames; whatever is left outside of the detection is border.
Requiring Z consecutive raster lines avoids false triggers from noise.

EDIT:
AVS AutoCrop uses default SAMPLES=5, RoboCrop uses default SAMPLES=40.
Not sure, but I think AutoCrop uses Z=3 and RoboCrop Z=4.
AutoCrop uses a fixed threshold. RoboCrop is auto-thresholded via Min Luma over the SAMPLES frames [+ threshold], [or a fixed threshold].
AutoCrop defaults can make major [overcrop] mistakes.
EDIT: AutoCrop/RoboCrop are both clip-global scanners, i.e. they do not detect dynamically changing borders; they are outermost-border detectors.
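
The top-edge scan described above could look roughly like this in Python. This is a reading of the description, not the actual AutoCrop or RoboCrop source; the threshold of 40 is an arbitrary illustrative value, and the scan is allowed to run z-1 lines past the cursor so that a bright run starting just above it is still counted.

```python
def detect_top_border(frames, threshold=40, z=3):
    """frames: list of frames, each a list of per-row average luma values.
    Returns the index of the first picture row; rows above it are border."""
    height = len(frames[0])
    cursor = height // 2  # top limit cursor starts about mid frame
    for row_means in frames:
        run = 0
        # Scan inwards from the top edge towards the cursor.
        for y in range(min(cursor + z - 1, height)):
            run = run + 1 if row_means[y] > threshold else 0
            if run >= z:
                cursor = y - z + 1  # outermost line of the bright run
                break
    return cursor

def make_frame(top, height=20, noise_row=None):
    """Made-up test frame: dark rows (16) above `top`, bright rows (120) below."""
    rows = [16] * top + [120] * (height - top)
    if noise_row is not None:
        rows[noise_row] = 120  # a single bright line inside the border
    return rows

samples = [
    make_frame(6),               # picture starts at row 6
    make_frame(20),              # completely dark frame, tells us nothing
    make_frame(4),               # picture reaches up to row 4
    make_frame(6, noise_row=1),  # one noisy line, shorter than z
]
print(detect_top_border(samples))
```

Because the cursor only ever moves outwards, a dark sample frame cannot cause overcropping, and a single bright line in the border is ignored as long as it is shorter than z lines.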