AviSynth use in a program


Dark Shikari
15th June 2007, 15:52
I've been working on a very simple H.264 encoding GUI with a friend for a bit, and it's coming along well and becoming quite popular in the community we developed it for.

However, recently I've gotten some interesting feature requests that I'm not exactly sure how to deal with.

One issue is that the GUI does not actually read any data from the AviSynth script. It simply creates a script that DirectShowSource()s and ConvertToYV12()s the input file, plus a series of commands to run, covering the video encode, the audio encode, and the muxing.

Because of this, the program itself knows nothing about the video stream.

Is there a way in AviSynth to:

a) Resize an image to a specific % of the original or to a specific height or width while keeping aspect ratio automatically?
b) Crop the edges of a video evenly to make x and y both mod 16, for improved H.264 compression?

If either of these is something one can't do in AviSynth, what's a good way for a program to read the resolution of the input video? I would think the best way would be to read it somehow using the AviSynth script, but I have zero experience with operating on video at this lower level.

Leak
15th June 2007, 16:47
a) Resize an image to a specific % of the original or to a specific height or width while keeping aspect ratio automatically?
AviSynth doesn't know anything about aspect ratios, but every clip (like the implicit "last") has both a "width" and a "height" property, so you can simply use any resize filter:
ratio=0.75 # 75%
newWidth=Round((last.width*ratio)/4.0)*4
newHeight=Round((last.height*ratio)/4.0)*4
last.LanczosResize(newWidth,newHeight)
(with rounding to multiples of 4 for great justice)
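For a program that generates the script, the same arithmetic can be sketched on the program side once the source width and height are known. This is a minimal Python sketch, not part of AviSynth. The function names are hypothetical, and rounding halves upward is an assumption about the desired behaviour. It also covers the "specific width or height" case from the question by deriving the other dimension from the source proportions.

```python
import math


def round_to_mod(value, mod=4):
    """Round value to the nearest multiple of mod (halves round up)."""
    return int(math.floor(value / mod + 0.5)) * mod


def scale_by_ratio(width, height, ratio, mod=4):
    """Scale both dimensions by the same ratio, e.g. 0.75 for 75%."""
    return round_to_mod(width * ratio, mod), round_to_mod(height * ratio, mod)


def scale_to_width(width, height, target_width, mod=4):
    """Pick a width, then derive the height from the source proportions."""
    new_width = round_to_mod(target_width, mod)
    new_height = round_to_mod(height * new_width / width, mod)
    return new_width, new_height


def scale_to_height(width, height, target_height, mod=4):
    """Pick a height, then derive the width from the source proportions."""
    new_height = round_to_mod(target_height, mod)
    new_width = round_to_mod(width * new_height / height, mod)
    return new_width, new_height


print(scale_by_ratio(1280, 544, 0.75))
print(scale_to_width(1280, 544, 640))
```

The returned pair can be written straight into the generated script as the arguments of whichever resize filter is used. Rounding to a multiple of 4 can shift the proportions slightly when the exact result is not already a multiple of 4.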
b) Crop the edges of a video evenly to make x and y both mod 16, for improved H.264 compression?
You can do that similarly:
excessWidth=last.Width%16
excessHeight=last.Height%16
cropLeft=excessWidth/2
cropTop=excessHeight/2
cropRight=excessWidth-cropLeft
cropBottom=excessHeight-cropTop
last.Crop(cropLeft,cropTop,-cropRight,-cropBottom)

Dark Shikari
15th June 2007, 16:52
An additional question: the most common modes used in my program use --crf 25 or --crf 30 for the quality factor. At this sort of quality level, what resizing algorithms are best for downscaling? I heard that Lanczos/Blackman are higher quality and sharper but tend to require a higher bitrate to compress, and so are often not worth it at lower bitrates. Is this true?

Dark Shikari
15th June 2007, 17:38
OK, I've done a test of various resizing functions for downscaling to 75% of the total size using the following x264 options:

--trellis 2 --merange 64 --subme 6 --bframes 16 --ref 3 --b-pyramid --partitions all --aq-strength 0.5 --8x8dct --me umh --bime --no-psnr --b-rdo --mixed-refs --direct auto --weightb --progress --threads 4 --crf 30 --deblock 0:0

Lanczos 75% Test:
Bitrate: 813.16kbps
SSIM: 0.9774791

Blackman 75% Test:
Bitrate: 806.33kbps
SSIM: 0.9774914

Bicubic 75% Test:
Bitrate: 765.73kbps
SSIM: 0.9787687

Bilinear 75% Test:
Bitrate: 765.75kbps
SSIM: 0.9787784

Spline64 75% Test:
Bitrate: 765.72kbps
SSIM: 0.9787686

The Lanczos-based resizers seem to require about 5% more bitrate.

Avenger007
6th April 2008, 08:38
I used a clip from StarCraft II Zerg Reveal Trailer from frame 1947 to 3329 to do this test.

Clip info: 1280x544 @ 29.97 fps (1383 frames) 9808 kbps
To be resized to 640x272.

Settings (using x264 808 VAQ2 alpha build with metric=0 and strength=1 (VAQ1 defaults)):
--pass 2 --bitrate 2000 --keyint 300 --min-keyint 30 --ref 16 --mixed-refs --no-fast-pskip --bframes 8 --b-pyramid --b-rdo --bime --weightb --direct auto --subme 7 --trellis 2 --partitions all --8x8dct --me umh --merange 32 --threads auto --thread-input --sar 1:1 --progress --no-psnr --aq-metric 0 --aq-strength 1

I used 2000 kbps because the full trailer is only 2m 24s. This reduces the file to about 1/4 of the original size. For longer videos, such as the Terran and Protoss gameplay videos, I would use bitrates of 1500 and 1000 respectively.

Resize Function                    SSIM (2000 kbps)    SSIM (1000 kbps)

none (source)                      0.8566463           0.7756393

Spline16Resize(640,272)            0.9580357           0.9142428
Spline36Resize(640,272)            0.9533653           0.9063235
Spline64Resize(640,272)            0.9525919           0.9051941

BlackmanResize(640,272)            0.9525652           0.9053714
BlackmanResize(taps=8,640,272)     0.9480446           0.8964842
BlackmanResize(taps=16,640,272)    0.9452565           -

Lanczos4Resize(640,272)            0.9497143           0.8990401
LanczosResize(taps=8,640,272)      0.9460655           0.8935069
LanczosResize(taps=16,640,272)     0.9439958           -

BicubicResize(640,272)             0.9657626           0.9358651
BilinearResize(640,272)            0.9688197           0.9292139

GaussResize(640,272)               0.9723385           0.9421302
GaussResize(p=60,640,272)          0.9578017           0.9173009
GaussResize(p=90,640,272)          0.9512161           0.9053051

Observations:
Pre-rendered:
Before encoding, Spline, Lanczos and Blackman looked virtually identical, and close to the original minus some degree of sharpness. Gauss p=90 looked as sharp as the others, p=60 looked slightly less sharp, and p=30 was noticeably blurry. However, Gauss showed artifacts such as aliasing of straight lines at p=60 and p=90.

Encoded:
Blackman seemed to look sharper than Lanczos. Higher values of taps increased sharpness, but lower values ultimately kept more detail and so looked closer to the original clip. Spline16 looked closer to Spline64 than to Spline36.
To me, Spline16 and Spline64 looked closest to the original image, followed by Spline36 in its own way (different areas of sharpness and blurring). Next come Blackman and then Lanczos, though it could be the other way around because they differ in where they are sharp and where they lose detail. After that, a soft Bicubic is followed by a softer Bilinear. Gauss simply wasn't able to keep enough textured detail.

These results are a spin-off of my VAQ2 testing. Instead of letting these results go to waste, I decided to post them here in case others find them useful. :)

Dark Shikari
6th April 2008, 08:47
Oh wow, necromancy much?

IanB
6th April 2008, 09:28
Given you are downsizing you probably should also give GaussResize() a whirl with a range of P values.

And possibly tabulate more severe bitrates like 1200 and 1600 (and maybe 2500 for good measure).

And you should reserve the high tap count Lanczos and Blackman for those really high-quality, bandwidth-unlimited examples. The subtle ringing they cause really eats encoding bits, so you are on a hiding to nothing.

Manao
6th April 2008, 11:52
Those SSIM values don't mean much. Suppose the resizer was making the picture entirely black. You'd then get an SSIM of 1.0 with x264, because x264 measures SSIM against the frames it is given, which have already been resized.

Thus, when comparing different prefiltering, you must compare SSIM with the source before any filtering took place, not after.
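The black-picture case can be illustrated with a toy calculation. The sketch below is a simplified, single-window SSIM over a flat list of 8-bit luma samples. Real SSIM implementations average the index over many local windows, so this is not x264's calculation, and the sample values and function name are made up for illustration.

```python
def global_ssim(x, y, peak=255.0):
    """Single-window SSIM of two equal-length sample lists (simplified)."""
    n = len(x)
    c1 = (0.01 * peak) ** 2  # stabilising constant for the mean term
    c2 = (0.03 * peak) ** 2  # stabilising constant for the variance term
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


source = [16, 64, 128, 200, 235, 90, 40, 180]  # hypothetical source samples
filtered = [0] * len(source)                   # a "resizer" that outputs black
encoded = list(filtered)                       # the encoder reproduces black exactly

print(global_ssim(filtered, encoded))  # encoder input vs. output: perfect score
print(global_ssim(source, encoded))    # original source vs. output: near zero
```

The first comparison is what the encoder reports; the second is the one that says something about the prefilter. For a real resize test the encode would also have to be brought back to the source dimensions before comparing, which this toy example does not attempt.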

Avenger007
7th April 2008, 03:54
Updated the post above to include GaussResize as well as SSIM for 1000 kbps.
I also included the SSIM value for the source (no resizing). That encode was sharper than the resized clips on well-defined lines and bright points, but was terribly blurry in areas of textured detail.

Lele-brz
7th April 2008, 09:51
I think the PAR should also be considered when rescaling.
Is there a way in AviSynth to calculate a new size that keeps the aspect ratio even for anamorphic videos?
thanks
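As noted earlier in the thread, AviSynth itself knows nothing about aspect ratios, so the PAR has to come from outside the script. Assuming the calling program knows it, the square-pixel size can be sketched as follows in Python. The function name is hypothetical, and the 16:11 PAR used for a 720x576 widescreen source is only an illustrative input.

```python
from fractions import Fraction


def square_pixel_size(width, height, par, ratio=1, mod=4):
    """Return (width, height) for square pixels, optionally scaled by ratio.

    par is the pixel aspect ratio (pixel width / pixel height). The stored
    width is stretched by par; both results are rounded to a multiple of mod.
    """
    def round_to_mod(value):
        # Round to the nearest multiple of mod, halves upward.
        return int(value / mod + Fraction(1, 2)) * mod

    ratio = Fraction(ratio)
    display_width = width * Fraction(par) * ratio
    display_height = height * ratio
    return round_to_mod(display_width), round_to_mod(display_height)


print(square_pixel_size(720, 576, Fraction(16, 11)))
print(square_pixel_size(720, 576, Fraction(16, 11), ratio=Fraction(1, 2)))
```

The alternative is to leave the stored size alone and signal the PAR to the encoder instead, for example with x264's --sar option used in the settings above.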

IanB
7th April 2008, 11:03
@Avenger007,

Interesting that Gauss gave the highest SSIM values but you did not like the result. This is why I suggested you try it. :D

Avenger007
7th April 2008, 11:16
I decided not to test Gauss initially because most people here don't use Gauss, and rightfully so it seems. :)

IanB
7th April 2008, 15:32
Gauss is an unusual resizer: it has no negative coefficients for the tap points. This is like Bilinear, but it uses more sample points and you can somewhat control the level of blur.
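The difference in coefficients can be seen by sampling the two kinds of kernel. The Python sketch below uses the textbook Lanczos window and a Gaussian of the form 2^(-(p/10)·x²). The Gaussian form is an assumption about how p is mapped, not a quote from the AviSynth source, and the function names are hypothetical.

```python
import math


def gauss_weight(x, p=30.0):
    """Gaussian kernel weight; assumed form 2**(-(p/10) * x*x)."""
    return 2.0 ** (-(p / 10.0) * x * x)


def lanczos_weight(x, taps=3):
    """Lanczos kernel weight: sinc(x) * sinc(x / taps) inside the window."""
    if x == 0:
        return 1.0
    if abs(x) >= taps:
        return 0.0
    px = math.pi * x
    return taps * math.sin(px) * math.sin(px / taps) / (px * px)


for x in (0.0, 0.5, 1.0, 1.5, 2.0, 2.5):
    print(f"x={x:3.1f}  gauss={gauss_weight(x):8.5f}  lanczos={lanczos_weight(x):8.5f}")
```

The Gaussian weights only decay towards zero, while the Lanczos weights go negative between the first and second zero crossings. Those negative lobes produce the overshoot and ringing mentioned earlier; a kernel without them can only blur.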

At some point this softening becomes less objectionable than the other artifacts introduced as you reduce the bandwidth.

But here you are obviously processing for maximum fidelity and are being generous with your bandwidth, so Gauss is not appropriate here.

An interesting exercise may be to try sharpening at the playback stage to somewhat reverse the blur introduced by Gauss.

Razorholt
9th April 2008, 19:19
You can do that similarly:
excessWidth=last.Width%16
excessHeight=last.Height%16
cropLeft=excessWidth/2
cropTop=excessHeight/2
cropRight=excessWidth-cropLeft
cropBottom=excessHeight-cropTop
last.Crop(cropLeft,cropTop,-cropRight,-cropBottom)

With some of my videos I get an error message that says: YV12 images can only be cropped by even numbers. How can we prevent that?

Thanks Leak!

- Dan

Wilbert
9th April 2008, 22:23
cropLeft=excessWidth/2
cropTop=excessHeight/2
Round them to a multiple of two.
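One way to apply that rounding is to round the left and top offsets down to an even number and let the right and bottom sides take the remainder. The Python sketch below shows the arithmetic as a script-generating program might do it. The function names are hypothetical, and it assumes the source width and height are themselves even, as they must be for YV12.

```python
def mod16_crop(width, height, mod=16, step=2):
    """Return (left, top, right, bottom) crop amounts, each a multiple of step."""
    def split(excess):
        # Half of the excess, rounded down to a multiple of step;
        # the opposite side takes whatever is left over.
        first = (excess // (2 * step)) * step
        return first, excess - first

    left, right = split(width % mod)
    top, bottom = split(height % mod)
    return left, top, right, bottom


def crop_line(width, height):
    """Format the crop amounts as a Crop() call for the generated script."""
    left, top, right, bottom = mod16_crop(width, height)
    return f"Crop({left},{top},-{right},-{bottom})"


# 1002 % 16 == 10: a plain 5/5 split would be odd, so this gives 4 and 6.
print(crop_line(1002, 566))
```

The same idea carries over to the script itself: with integer division, (excessWidth/4)*2 yields an even cropLeft, and cropRight=excessWidth-cropLeft stays even whenever the width is even.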

Razorholt
9th April 2008, 22:55
Works like a charm. Thanks Wilbert!

- Dan