Best downscale algorithm for UHD to 1080p [Archive]

xabregas

19th December 2017, 14:01

I want to downscale some 4K videos to my TV's resolution, which is 1080p, because letting the GPU do the downscale still wastes a lot of resources. So my question is whether the downscale algorithm most often used for 1080p to 720p (spline36) is still the best?

Blue_MiSfit

19th December 2017, 21:36

"Best" is quite subjective.

I do like spline36 quite a lot, and it's my go-to.

Even so, for downscaling it might be overkill. Good old bicubic does a pretty good job and is implemented in hardware basically everywhere now.

xabregas

19th December 2017, 22:03

Blue_MiSfit

20th December 2017, 00:57

What, specifically?

Weyoun

20th December 2017, 03:30

bicubic

xabregas

20th December 2017, 05:35

What, specifically?

UHD Bluray HEVC to x264 1080p

Atak_Snajpera

21st December 2017, 18:09

I use Spline36Resize + gentle sharpening
Spline36Resize(1920,800).Sharpen(0.2)

xabregas

21st December 2017, 19:30

I use Spline36Resize + gentle sharpening
Spline36Resize(1920,800).Sharpen(0.2)

But how can I encode 4K HEVC with an AviSynth script in MeGUI? I'm using ffmpeg.

Atak_Snajpera

21st December 2017, 19:32

Are you encoding to 1080p HEVC or AVC?

FranceBB

21st December 2017, 21:59

@xabregas... It's fine to pipe Avisynth to ffmpeg, however you could also compile your own codec (like x264, x265 etc) and pipe Avisynth to it, avoiding MeGUI completely.
As to Avisynth itself, I think both ffms2 and libav can decode HEVC videos.

By the way, I use Spline36Resize if there aren't captions, otherwise I generally use Lanczos to downscale.

GMJCZP

21st December 2017, 23:19

I think that StainlessS had already clarified this question of how to perform the downscale, the bad thing is that I do not remember where the thread is.

IMHO I think that with Simple x264 Launcher you can convert to x264 / 265 without a lot of complications.

EDIT: I remembered, https://forum.doom9.org/showthread.php?t=174496

xabregas

22nd December 2017, 00:27

Are you encoding to 1080p HEVC or AVC?

To AVC, of course. HEVC to AVC with x264, because x265 output, being HEVC, is incompatible with every standalone media player.

@xabregas... It's fine to pipe Avisynth to ffmpeg, however you could also compile your own codec (like x264, x265 etc) and pipe Avisynth to it, avoiding MeGUI completely.
As to Avisynth itself, I think both ffms2 and libav can decode HEVC videos.

By the way, I use Spline36Resize if there aren't captions, otherwise I generally use Lanczos to downscale.

I will try it. ty

Atak_Snajpera

22nd December 2017, 11:59

Do not forget to tonemap bt2020 to bt709.

Forteen88

31st July 2018, 18:28

I use Spline36Resize + gentle sharpening
Spline36Resize(1920,800).Sharpen(0.2)

Isn't it better to just use Spline64Resize then? Since Spline64Resize is sharper than Spline36Resize.

Asmodian

31st July 2018, 19:09

Spline64Resize does ring a lot too, though I am not sure how it compares to Sharpen(0.2) in that regard. They are pretty different ways of getting a sharper image.

Gser

31st July 2018, 21:15

My favorite is using SSIM downscaling from AvisynthShader (https://github.com/mysteryx93/AviSynthShader). It doesn't support BT.2020 so you should convert to BT.709 before downscaling.

Or if you want to use something else Dither tools (http://avisynth.nl/index.php/Dither_tools) is quite good. Both AvisynthShader and dither tools offer you the chance to do colorspace conversions in 16 bit depth and in linear light.

Wolfberry

1st August 2018, 03:27

SSIM downscaling is also available in VapourSynth, implemented by WolframRhodium in his muvsfunc (https://github.com/WolframRhodium/muvsfunc/blob/master/muvsfunc.py#L3434)
The internal calculations are done at 32-bit float, and you can select which kernel you want to use as the base. Bicubic is usually sufficient, Gaussian if you want an even better result.

StainlessS

1st August 2018, 06:34

GMJCZP,
You can find threads started by any user via User Profile/Statistics Tab/
"Find All Posts by User_Name"
"Find All Threads Started by User_Name".

Bicub_b, BicubicResize() b and c args. (If supplied as args, BEST SUPPLY BOTH Bicub_b and Bicub_c)
Bicub_c Default:-
DownSize b = -0.5, c = 0.25, Groucho2004,Didée:- https://forum.doom9.org/showthread.php?p=1802716#post1802716
Upsize, b = 0.0, c = 0.5, Catmull-Rom spline, sharp.
Avisynth BicubicResize() defaults are:- b=1.0/3, c=1.0/3
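
To make the b and c pairs above a little more concrete, here is a minimal Python sketch that evaluates the Mitchell-Netravali cubic for each pair. It assumes BicubicResize follows that standard kernel family; the function names and the half-pixel sample positions are only illustrative, not taken from any plugin.

```python
def bicubic_weight(x, b, c):
    """Mitchell-Netravali cubic kernel value at distance x (in pixels)."""
    x = abs(x)
    if x < 1.0:
        return ((12 - 9 * b - 6 * c) * x**3
                + (-18 + 12 * b + 6 * c) * x**2
                + (6 - 2 * b)) / 6.0
    if x < 2.0:
        return ((-b - 6 * c) * x**3
                + (6 * b + 30 * c) * x**2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6.0
    return 0.0


# The three (b, c) pairs listed in the post above.
PRESETS = {
    "downsize (b=-0.5, c=0.25)": (-0.5, 0.25),
    "upsize, Catmull-Rom (b=0, c=0.5)": (0.0, 0.5),
    "AviSynth default (b=1/3, c=1/3)": (1.0 / 3, 1.0 / 3),
}


def taps_at_half_phase(b, c):
    """Weights of the four nearest samples when the output lands midway between two inputs."""
    return [bicubic_weight(d, b, c) for d in (-1.5, -0.5, 0.5, 1.5)]


for name, (b, c) in PRESETS.items():
    taps = taps_at_half_phase(b, c)
    print(name, [round(t, 4) for t in taps], "sum =", round(sum(taps), 4))
```

The outer taps go negative for the sharper settings, which is where the extra sharpness (and the ringing) comes from, while the weights still sum to 1 so overall brightness is preserved.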

EDIT: Oops, GMJCZP post was from last year.

Ghitulescu

2nd August 2018, 15:44

benwaggoner

2nd August 2018, 19:21

Downscaling from UHD/4k to HD is not that difficult - there are literally millions of points carrying information, so practically all algorithms are on the same level, more or less.

The true difference can be seen where there is little information to start with, like from VideoCD or VHS to FullHD.
Well, UHD->1080p is particularly easy because it is a 2x downscale, so you have exactly four input pixels per output pixel. Downscales >2x can cause issues if algorithms don't equally sample every source pixel. Going from 3840x2160 to 352x198 (10.91x) can cause all kinds of aliasing in many algorithms. Scrolling end credits often make even subtle problems very obvious, since you have fine details that should remain absolutely identical as they move slowly across the pixel grid. Bilinear and Bicubic are pretty bad in this case. ffmpeg's area is good, and Spline36 and Gaussian seem like the best in AviSynth. Commercial tools often have their own proprietary solutions.

A good sign is when a scaling algorithm's performance is roughly proportional to (source height*width) + (output height*width). An algorithm whose perf is proportional mainly to the output frame size likely isn't sampling all the source pixels, and There Be Dragons.
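
The arithmetic behind those numbers can be sketched in a few lines of Python. The `fixed_tap_coverage` helper is a hypothetical model of a resizer that always reads the same number of taps per output pixel without widening its kernel; it is an assumption for illustration, not a description of any particular scaler.

```python
def scale_factor(src_size, dst_size):
    """Downscale factor along one axis."""
    return src_size / dst_size


def source_pixels_per_output_pixel(src_w, src_h, dst_w, dst_h):
    """How many source pixels each output pixel has to account for."""
    return (src_w * src_h) / (dst_w * dst_h)


def fixed_tap_coverage(src_size, dst_size, taps):
    """Fraction of source pixels (one axis) touched by a hypothetical
    resizer that reads a fixed number of taps per output pixel."""
    return min(1.0, taps / scale_factor(src_size, dst_size))


# UHD -> 1080p: exactly 2x per axis, four source pixels per output pixel.
print(scale_factor(3840, 1920), source_pixels_per_output_pixel(3840, 2160, 1920, 1080))

# UHD -> 352x198: about 10.91x per axis, roughly 119 source pixels per output pixel.
print(round(scale_factor(3840, 352), 2),
      round(source_pixels_per_output_pixel(3840, 2160, 352, 198), 1))

# A fixed 4-tap filter would see every source pixel at 2x,
# but only about a third of them per axis at 10.91x.
print(fixed_tap_coverage(3840, 1920, 4), round(fixed_tap_coverage(3840, 352, 4), 3))
```

Under that assumption, the skipped source pixels are exactly the ones that alias, which is why work proportional only to the output size is a warning sign.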

foxyshadis

6th August 2018, 07:59

Something seemingly forgotten here is linear-vs-gamma scaling: both BT.709 and BT.2020 should always be converted to linear light to scale and then converted back (unless you happen to be one of the weirdos using BT.2020 constant luminance or ICtCp); otherwise frames will become uniformly darker and point highlights will be extinguished.
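
A tiny Python sketch of why that matters, averaging one 2x2 block both ways. It assumes a pure power-law gamma of 2.4 as a stand-in for the real transfer function, so the exact numbers are illustrative only.

```python
GAMMA = 2.4  # assumption: simple power law, not the exact BT.709/BT.2020 curves


def to_linear(v):
    """Gamma-encoded value (0..1) -> linear light (0..1)."""
    return v ** GAMMA


def to_gamma(light):
    """Linear light (0..1) -> gamma-encoded value (0..1)."""
    return light ** (1.0 / GAMMA)


def average_gamma_domain(block):
    """Naive downscale: average the encoded values directly."""
    return sum(block) / len(block)


def average_linear_light(block):
    """Linear-light downscale: decode, average, re-encode."""
    return to_gamma(sum(to_linear(v) for v in block) / len(block))


# One white point highlight surrounded by black, reduced to a single pixel.
block = [1.0, 0.0, 0.0, 0.0]
naive = average_gamma_domain(block)
correct = average_linear_light(block)

print("encoded value, gamma-domain average:", naive)
print("encoded value, linear-light average:", round(correct, 3))
print("light emitted by the naive result:", round(to_linear(naive), 3), "instead of 0.25")
```

The naive average keeps only a small fraction of the light the four source pixels emitted together, which is the "point highlights extinguished" effect described above.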

alex1399

8th August 2018, 14:42

Obviously, BT.601 (NTSC variant) is already linear scaling. Everybody should be familiar with the RGB - YUV conversion, which can be expressed by a beautiful linear matrix equation. Correct me if I'm wrong. Entering the HD era and the post-HD era, it seems that there is an elephant in the room. A conversion chain that works like BT.709 to BT.601, up-scaling/down-scaling, then BT.601 to BT.709, instead of directly up-scaling/down-scaling for quality, seems like shooting oneself in the leg.

I'm used to processing timing conversions that recover 23.976/29.97 fps from 60/30 fps, or some arbitrary frame-rate conversion. But when it comes to up-scaling/down-scaling, or even some *** 8-bit-depth to high-bit-depth conversion, no expert has come forward to nag about it, until now.

Suppose the clip is a normal FHD BT.709 video converted into UHD that mistakenly follows the BT.601 characteristic; it is easy to recover it back into FHD. I would kindly like to ask about the proper treatment of cleverly encoded UHD that correctly follows the BT.709/BT.2020 characteristic (up-scaled from HD/FHD): how should we handle those clips to properly down-scale them, or for any arbitrary processing?

alex1399

8th August 2018, 15:03

Maybe the encoder (not THE x265 library) does all of that processing correctly under the hood, but one should not overestimate the wisdom of the crowd. Converting limited color range to full color range twice is just one common mistake of that kind. Does raw yuv420p content, which is without question not BT.609, require some post-processing before scaling?

FranceBB

8th August 2018, 17:55

I would kindly like to ask about the proper treatment of cleverly encoded UHD that correctly follows the BT.709/BT.2020 characteristic (up-scaled from HD/FHD): how should we handle those clips to properly down-scale them?

You are not looking for a downscaler, then, you are looking for a "reverse upscaler".
If you deal with a content that has been originally shot in 4K, then you may just want to downscale it to a lower resolution for whatever reason and you are gonna use a normal resizer, but if the content has been upscaled, then it's different.
If a content has been upscaled from FULL HD to UHD, it means that they used a resizing kernel, generally Bilinear or Bicubic is what companies use (unfortunately).
During the upscaling step, blur is introduced, which means that if you simply downscale an upscaled content back to FULL HD, you'll save space, but the blur is still gonna be there and it's not gonna look as sharp as it used to be before the upscale.
In order to avoid that and in order to resize an upscaled content to its original resolution getting back its original sharpness, a "reverse upscale" has to be done, using "DeBilinear, DeBicubic" and so on, inverting the kernel used during the upscaling step.
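
As a rough illustration of the idea (not of what DeBilinear or DeBicubic actually do internally), here is a 1D toy in Python. It assumes the upscale was a plain 2x linear interpolation on a center-aligned grid with clamped edges; under that assumption the kernel can be inverted in closed form, whereas an ordinary averaging downscale keeps the blur. All function names are made up for the example.

```python
def upscale2x_bilinear(src):
    """2x linear-interpolation upscale of one row (center-aligned, edges clamped)."""
    n = len(src)
    out = []
    for i in range(n):
        left = src[max(i - 1, 0)]
        right = src[min(i + 1, n - 1)]
        out.append(0.75 * src[i] + 0.25 * left)   # output sample 2i
        out.append(0.75 * src[i] + 0.25 * right)  # output sample 2i + 1
    return out


def reverse_upscale2x(up):
    """Invert upscale2x_bilinear by solving its 2x2 equations pair by pair."""
    n = len(up) // 2
    rec = [1.5 * up[2 * i + 1] - 0.5 * up[2 * i + 2] for i in range(n - 1)]
    rec.append(1.5 * up[2 * n - 2] - 0.5 * up[2 * n - 3])  # last sample
    return rec


def box_downscale2x(up):
    """Ordinary downscale: average each pair of samples."""
    return [(up[2 * i] + up[2 * i + 1]) / 2 for i in range(len(up) // 2)]


original = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0]
upscaled = upscale2x_bilinear(original)

print("reverse upscale:", reverse_upscale2x(upscaled))  # matches the original row
print("box downscale:  ", box_downscale2x(upscaled))    # the single-pixel peak is softened
```

The catch is the one raised later in the thread: this only works when the assumed kernel matches what was really used for the upscale.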

Let's suppose that we have a file upscaled from HD to FULL HD using Bicubic and we wanna bring it back to HD, we should do the following:

Avisynth+:

#Use your favorite Indexer
FFMpegSource2("something.mov")

#Reverse Upscale
DeBicubicResizeMT(1280, 720)

Avisynth 2.6.1:

#Use your favorite Indexer
FFMpegSource2("something.mov")

#bring everything to 16bit stacked
Dither_convert_8_to_16()

#16bit reverse upscale 1920x1080 -> 1280x720
ly = debicubicy(1280,720,lsb_inout=true)
lu = utoy().dither_resize16(1280,720,kernel="bicubic",invks=true,invkstaps=3,src_left=0.25,u=1,v=1)
lv = vtoy().dither_resize16(1280,720,kernel="bicubic",invks=true,invkstaps=3,src_left=0.25,u=1,v=1)
ytouv(lu,lv,ly)

#Output 16bit stacked
Dither_Out()

When it comes to UHD, things are more complicated, 'cause you need to tone-map BT2020 to BT709.

UHD BT2020 to FULL HD BT709:

Avisynth+ only afaik

#Use your favorite Indexer
FFMpegSource2("something.mov")

#BT2020 to BT709
ConvertYUVtoXYZ(Color=1)
ConvertXYZtoYUV(pColor=1)

#Reverse Upscale
DeBicubicResizeMT(1920, 1080)

You are gonna need This (https://forum.doom9.org/showthread.php?p=1776920) and This (https://forum.doom9.org/showthread.php?t=175488)

I assume you use Avisynth+.
If you are on Avisynth 2.6.1, you might be in trouble 'cause the HDRTools plugin doesn't work in 16bit stacked/interleaved. Besides, it uses RGB64 or RGBPS and Avisynth 2.6.1 doesn't support either of them.

alex1399

9th August 2018, 04:53

Thank you very much. I will try them. The reverse upscaler reminds me of the blur/anti-alias versus sharp/alias debate from the old days. There are also a couple of ways to measure the light cycle and dark halo ringing caused by a mismatch between the unknown upscale and the reverse upscale.

ffmpeg (in some old builds) uses bicublin, which means bicubic luma and bilinear chroma scaling. Is this why the luma and the chroma use separate resizing kernels?

FranceBB

9th August 2018, 08:59

Is this why the luma and the chroma use separate resizing kernels?

Yes, according to your needs you can use a different kernel for luma and chroma. Generally, studios just upscale using a consistent kernel: bicubic if they do it via software or a sort of fast-bilinear if they do it via hardware (basically an SDI signal routed by a matrix into a device that handles upscale/downscale of the signal).
The hardware way was kinda popular back in the days, when you had VTRs and you had to do everything linearly, routing the signal through different devices, and it's no longer used, except for some live broadcast contents (or by some old broadcast studios that refused to update their workflow due to either stubbornness or lack of money).

mparade

5th July 2019, 20:10

Generally, studios just upscale using a consistent kernel: bicubic if they do it via software...

Is that also true of fake 4K movies burnt to UHD BDs?
I would like to save some encoding time by getting rid of ~half of the pixels in the first place, by reversing the upscale kernel that was supposedly applied in the studio.
Would you suggest reversing the upscaler used by the studio in this case as well? For me it is just a rough assumption that they used a plain bicubic kernel. If I use debicubic on a fake 4K source without knowing exactly what was done to the original 2K content, won't it make the output even worse?

Thank you very much for the help in advance.

benwaggoner

6th July 2019, 20:56

Is that also true of fake 4K movies burnt to UHD BDs?
I would like to save some encoding time by getting rid of ~half of the pixels in the first place, by reversing the upscale kernel that was supposedly applied in the studio.
Would you suggest reversing the upscaler used by the studio in this case as well? For me it is just a rough assumption that they used a plain bicubic kernel. If I use debicubic on a fake 4K source without knowing exactly what was done to the original 2K content, won't it make the output even worse?
For important titles, studios will use very advanced upscaling techniques, even varying parameters shot by shot. Lots of Hollywood content is produced 4K with 2K VFX, with the VFX shots upscaled for the 4K master. Film grain may be rendered at 4K on top of the 2K uprez.

Among other things, Hollywood people would feel guilty about setting something at 4K that could have just been upscaled from 1080p by the TV itself, so they want to do something more hands on and "better" than that.

For older titles, the 4K uprez and the HDR grade are part of the same remastering effort.

There isn't any one kernel that's going to be used.

Stereodude

7th July 2019, 20:26

For important titles, studios will use very advanced upscaling techniques, even varying parameters shot by shot. Lots of Hollywood content is produced 4K with 2K VFX, with the VFX shots upscaled for the 4K master. Film grain may be rendered at 4K on top of the 2K uprez.

Among other things, Hollywood people would feel guilty about setting something at 4K that could have just been upscaled from 1080p by the TV itself, so they want to do something more hands on and "better" than that.

For older titles, the 4K uprez and the HDR grade are part of the same remastering effort.

There isn't any one kernel that's going to be used.
I'm skeptical that they're doing anything special for the scaling. For the color grading and HDR sure, but the scaling...

Judging by what I've seen of older film-based movies that have been re-graded for HDR & WCG and put on UHD-BD, and by what gets past their color grading, where the output is obvious and at times ugly and head-scratching, I can't see them spending any time on the scaling algorithms, where the differences are probably quite minor. If they're not sweating the big stuff, the odds that they're sweating the minor stuff have to be asymptotically approaching 0.

benwaggoner

10th July 2019, 17:53

I'm skeptical that they're doing anything special for the scaling. For the color grading and HDR sure, but the scaling...

Judging by what I've seen of older film-based movies that have been re-graded for HDR & WCG and put on UHD-BD, and by what gets past their color grading, where the output is obvious and at times ugly and head-scratching, I can't see them spending any time on the scaling algorithms, where the differences are probably quite minor. If they're not sweating the big stuff, the odds that they're sweating the minor stuff have to be asymptotically approaching 0.
I don't know how much of a difference it actually makes, but Hollywood has traditionally used higher touch tools when upscaling. Also, upscaling is almost never done by itself for modern content (maybe for long tail old SD TV content). It is generally done as part of a remaster or regrade, so it's rarely the only operation happening.

And Hollywood tech people are just intrinsically opposed to doing just a basic scaling operation if it's something the TV could do at least as well. If customers don't see UHD looking better (or at least different) than 1080p, then they will stop paying extra for the UHD version. Plus they want to feel their craft and experience and attention really matter.

My team did UHD and HDR remastering of a bunch of TV episodes, so I'm speaking with some experience here.

Stereodude

11th July 2019, 13:37

I don't know how much of a difference it actually makes, but Hollywood has traditionally used higher touch tools when upscaling. Also, upscaling is almost never done by itself for modern content (maybe for long tail old SD TV content). It is generally done as part of a remaster or regrade, so it's rarely the only operation happening.

And Hollywood tech people are just intrinsically opposed to doing just a basic scaling operation if it's something the TV could do at least as well. If customers don't see UHD looking better (or at least different) than 1080p, then they will stop paying extra for the UHD version. Plus they want to feel their craft and experience and attention really matter.

My team did UHD and HDR remastering of a bunch of TV episodes, so I'm speaking with some experience here.
Thanks for the reply, but your response isn't necessarily comforting. Turning a knob for the sake of turning a knob doesn't mean the results are optimal, even if the intention behind it is good.

For example I wouldn't actually expect someone to tune the b & c parameters of bicubic resize on a scene by scene basis and would question the thought process of someone who thought that was necessary or beneficial.

DTL

12th September 2021, 22:07

Scrolling end credits often make even subtle problems very obvious, since you have fine details that should remain absolutely identical as they move slowly across the pixel grid. Bilinear and Bicubic are pretty bad in this case. ffmpeg's area is good, and Spline36 and Gaussian seem like the best in AviSynth. Commercial tools often have their own proprietary solutions.

In the old days, Gauss was the only AviSynth resize kernel that low-passed the frequencies valid in the lower-sample-count output domain, re-conditioning the output spectrum against Gibbs ringing. But it typically gives too little sharpness.

Now the most advanced option is UserDefined2ResizeMT from jpsdr's tools. It lets you adjust the 'visual makeup' from almost zero overshoot, like pure film (close to Gauss with a wide enough kernel), to a 'video look' with more over- and undershoot, while still trying to keep 'far' ringing as low as possible. Currently it uses 2 control params, and only a sample table is available for selecting b and c pairs with ringing suppressed as well as 2 control params allow. Maybe later there will be a more user-friendly version with a built-in table and 1 user-defined param to control 'sharper-softer'. The table is in this post: https://forum.doom9.org/showthread.php?p=1951250#post1951250

And if the source was conditioned against ringing in the linear domain, it is also recommended to perform the downsize in the linear domain. But that will require a playback device that also scales in the linear domain (if its display pixel count differs from 1920x1080 and scaling is required at display time). So it is better to test on your own target display device to see where downscaling and conditioning against ringing work better: in the linear domain, or in the typical storage and distribution domain of the compressed transfer function.

Blue_MiSfit

13th September 2021, 04:17

I'd suggest opening a new thread next time instead of resurrecting an old thread from the dead 2+ years later :)

DTL

13th September 2021, 08:35

Obviously, BT.601 (NTSC variant) is already linear scaling. Everybody should be familiar with the RGB - YUV conversion, which can be expressed by a beautiful linear matrix equation. Correct me if I'm wrong.

It is only nice when going from 4:4:4 to 4:4:4, but in that form it is also useless for bitrate compression. Once the chroma is subsampled and only subsampled UV is delivered, it becomes a damaging transform: to de-matrix subsampled UV correctly we need subsampled Y, and to de-matrix full-size YUV correctly we need full-size UV. Neither subsampled Y nor full-size UV is available at the decoder when decoding 4:2:2 or 4:2:0. And I think no completely correct way of decoding exists, because all the data is in compressed-transfer-function form. Also, for lower cost and higher speed, the UV is created from Y'CrCb and not from linearly subsampled linear RGB in a separate subsampled path of the YUV-subsampling encoder.
I hope 4:2:0 encoding, imperfect by design and inherited from the old days, will be dropped now that MPEG encoding has made good progress, and that we will use 4:4:4 R'G'B', because 4:2:0 only adds a 2:1 compression ratio but introduces distortions that are hard or impossible to revert. I ran tests with a pre-265 x264 encoder, and with the High 4:4:4 Predictive profile it produces a smaller file at the same crf setting, not one twice as large. That said, x264 may treat the crf param differently for 4:2:0 and 4:4:4P, so the quality at the same crf may differ.
So, to answer the initial question of the topic: it is best to scale only the luma, cast the UV to 4:4:4, and use the encoder's 4:4:4 profile if it is compatible with the target playback hardware or software setup.
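
The 2:1 figure mentioned above follows from counting samples. A small Python sketch, assuming the usual J:a:b reading of the notation (a block J pixels wide and 2 rows tall, with a chroma samples in the first row and b in the second); the function name is only illustrative.

```python
def samples_per_pixel(j, a, b):
    """Average number of stored samples per pixel for J:a:b chroma subsampling."""
    luma = 2 * j            # one luma sample per pixel in the J x 2 block
    chroma = 2 * (a + b)    # two chroma planes, a + b samples each per block
    return (luma + chroma) / (2 * j)


for scheme in ((4, 4, 4), (4, 2, 2), (4, 2, 0)):
    print("%d:%d:%d ->" % scheme, samples_per_pixel(*scheme), "samples per pixel")

# Raw data ratio between 4:4:4 and 4:2:0 before any encoder is involved.
print("4:4:4 vs 4:2:0:", samples_per_pixel(4, 4, 4) / samples_per_pixel(4, 2, 0))
```

This only counts raw samples; what an encoder such as x264 produces at a given crf is a separate question, as the post notes.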

rwill

13th September 2021, 13:34

lol..

Balling

24th September 2021, 17:57

"Obviously, the BT.601 (NTSC variant) is already linear scaling. Everybody should be familiar about the RGB - YUV conversation which could be expressed by a beautiful linear matrix equality. Correct me if I'm wrong. "

Fine, I will correct you. Do you know what nonlinearity is? It is what is called gamma, or the transfer function. Until a couple of years ago, nonlinear algebra did not even exist! All normal matrices are purely linear operations; one cannot do transfer functions in matrices (nonlinear algebra introduces some nonlinear matrices, but they contain functions, not numbers). And this is not RGB -- YUV. This is R'G'B' --> Y'Cb'Cr', where ' marks the nonlinearity. The nonlinearity is applied before that step, or together with it for constant-luminance BT.2020; in ICtCp, R'G'B' is not used at all, L'M'S' is used instead.
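
A short Python sketch of that distinction, using the BT.709 luma coefficients and, as an assumption, a pure 2.4 power law in place of the real transfer function. The matrix step itself is linear, but because it is fed gamma-encoded R'G'B', the resulting Y' is not the encoded version of true luminance.

```python
KR, KG, KB = 0.2126, 0.7152, 0.0722  # BT.709 luma coefficients
GAMMA = 2.4  # assumption: simple power law standing in for the transfer function


def luma_prime(r_p, g_p, b_p):
    """Y': the linear matrix row applied to gamma-encoded R'G'B'."""
    return KR * r_p + KG * g_p + KB * b_p


def encoded_luminance(r_p, g_p, b_p):
    """True relative luminance (computed in linear light), then gamma-encoded."""
    linear_y = KR * r_p ** GAMMA + KG * g_p ** GAMMA + KB * b_p ** GAMMA
    return linear_y ** (1.0 / GAMMA)


for name, rgb in (("mid gray", (0.5, 0.5, 0.5)), ("pure blue", (0.0, 0.0, 1.0))):
    print(name, "Y' =", round(luma_prime(*rgb), 4),
          "encoded luminance =", round(encoded_luminance(*rgb), 4))
```

For neutral gray the two agree, but for saturated colors they diverge, so treating Y'CbCr as if it were linear light goes wrong exactly where color detail is strongest.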