
View Full Version : How exactly works the super resolution methods?


luquinhas0021
16th May 2015, 02:07
I'm talking about super resolution methods for upscaling single frames and videos. I already know the principles of optical super resolution (microscopy images and videos). If someone wants to explain this to me more precisely, I'd be glad.
I've been reading that motion adaptive super resolution and multi image super resolution are based on sub-pixel displacement and the presence of aliasing. But... how?
Also, what exactly is a sub-pixel, in an image/video context? NNEDI3 causes sub-pixel misalignment and a center shift. What is this?
How does single image super resolution that is based on patch seamless work? Is it based on sub-pixel misalignment too?

raffriff42
16th May 2015, 09:18
IIRC, it's a patented algorithm...*googles* (https://www.google.com/search?q=patent+search+super+resolution)

Here it is (http://techlinkcenter.org/summaries/super-resolution-image-reconstruction-srir), I think

feisty2
16th May 2015, 09:46
nothing fancy, one of the many ways to do something similar in avisynth would be like
1. upscale your clip, with whatever algo you like
2. sharpen it insanely, like, real crazy
3. mcompensate+mt_clamp to limit the result
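
To make the three steps concrete, here is a minimal Python sketch on 1-D toy data. It is not feisty2's script: the "frames" are made-up lists, the sharpener is a simple unsharp-style filter standing in for "sharpen it insanely", and the neighbouring frames are assumed to be already motion-compensated. The helper names (`oversharpen`, `clamp_to_neighbours`) are hypothetical.

```python
def oversharpen(row, strength):
    """Unsharp-style sharpening: push each sample away from the mean of its
    two neighbours (edges are replicated). Deliberately allowed to overshoot."""
    out = []
    for i, v in enumerate(row):
        left = row[max(i - 1, 0)]
        right = row[min(i + 1, len(row) - 1)]
        out.append(v + strength * (2 * v - left - right) // 2)
    return out

def clamp_to_neighbours(sharpened, compensated_frames):
    """Limit each sharpened sample to the [min, max] range found at the same
    position in the (already motion-compensated) frames."""
    result = []
    for i, value in enumerate(sharpened):
        lo = min(frame[i] for frame in compensated_frames)
        hi = max(frame[i] for frame in compensated_frames)
        result.append(max(lo, min(hi, value)))
    return result

# Step 1 (assumed done): an upscaled edge, plus two compensated neighbours.
upscaled    = [10, 40, 120, 200, 230]
neighbour_a = [12, 35, 130, 205, 228]
neighbour_b = [9, 45, 110, 195, 233]

# Step 2: sharpen far too much.
sharpened = oversharpen(upscaled, strength=4)

# Step 3: clamp the result to what the neighbouring frames actually contain.
limited = clamp_to_neighbours(sharpened, [upscaled, neighbour_a, neighbour_b])

print(sharpened)  # [-50, -60, 120, 300, 290]
print(limited)    # [9, 35, 120, 205, 233]
```

The overshoot from the sharpener is cut back to values that really occur in nearby frames, so the edge gets steeper without the halos running wild.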

raffriff42
16th May 2015, 21:47
You da man, feisty2. I'm gonna have to try that. Have you got a script prepared? Because I'm lazy, uh very busy.

feisty2
17th May 2015, 08:36
http://i.imgur.com/2o6SVo1.png
http://i.imgur.com/WY3ghO4.png
http://i.imgur.com/BB75BM5.png


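# Step 1: 2x upscale. eedi3 with dh=true doubles the height; an nnedi3-doubled clip is passed in as sclip.
eedi3 (dh=true,alpha=0.1,beta=0.5,gamma=60,mdis=40,nrad=3,sclip=nnedi3 (dh=true,nns=4,qual=2,nsize=0))
# Rotate, double the other dimension the same way, then rotate back.
turnleft ()
eedi3 (dh=true,alpha=0.1,beta=0.5,gamma=60,mdis=40,nrad=3,sclip=nnedi3 (dh=true,nns=4,qual=2,nsize=0))
turnright ()
uplow=last
up=uplow.dither_convert_8_to_16 ().converttoy8 ()
# Step 2: sharpen heavily (16-bit sharpeners, then eight passes of aWarpSharp2 in 8-bit).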
sharp=up.ShrinkSharp16(str=1.0).CMSharp16(str=2.0).ShrinkSharp16(str=1.0).ditherpost (mode=6).converttoyv12 ().aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).converttoy8 ().dither_convert_8_to_16 ()
# Step 3: motion-compensate the neighbouring frames (tr=3) onto the current one,
# take the max and min across them, and clamp the sharpened clip to that range.
super=uplow.MSuper (pel=4,levels=0,sharp=2)
vmulti=super.MAnalyse(blksize=8,search=3,delta=3,multi=true,overlap=4,badrange=-24,dct=5,searchparam=6)
comp=uplow.MCompensate(super, vmulti, thSAD=400, tr=3, thsad2=200).dither_convert_8_to_16 ().converttoy8 ()
max=comp.MaxMulti (tr=3)
min=comp.MinMulti (tr=3)
up.sharplimit16 (sharp=sharp.Dither_clamp16(max, min, 0, 0))
# Back to 8-bit.
ditherpost (mode=6)

required functions here (https://github.com/IFeelBloated/Placebo/blob/master/avsi)

Warperus
19th May 2015, 13:08
I've been reading that motion adaptive super resolution and multi image super resolution are based on sub-pixel displacement and the presence of aliasing. But... how?
There's a good article for deblur. It starts quite general and then becomes more specific.
http://yuzhikov.com/articles/BlurredImagesRestoration1.htm
http://yuzhikov.com/articles/BlurredImagesRestoration2.htm

Also, what exactly is a sub-pixel, in an image/video context?
It's just a pixel of a higher resolution image. If you divide pixels into smaller pixels, calling them all pixels is a bit confusing. So there's a new name...

How does single image super resolution that is based on patch seamless work?
It looks like a fractal compression algorithm applied to the picture. It produces transition functions for block-domain pairs which, applied repeatedly, can reproduce the source image. If you apply this decompression at a higher resolution, it can actually produce a more detailed image. Up to a point, of course, and with some guessed info. You cannot read a car's number off a 1*2 pixel plate, but something like a 2x resolution increase might be possible.

wonkey_monkey
19th May 2015, 17:18
I've been reading that motion adaptive super resolution and multi image super resolution are based on sub-pixel displacement and the presence of aliasing. But... how?

That sounds like drizzle (http://en.wikipedia.org/wiki/Drizzle_%28image_processing%29) to me.

It works if your camera sensor's elements collect light only in a small area, and are separated from other elements by a gap - as is the case with the Hubble Space Telescope. So what you end up with is aliased - it's almost the equivalent of taking a normal image and scaling down with a pointresize.

By taking lots of photos and slightly shifting the sensor/camera, you can pick up the "missing" light that would have hit the gaps instead of the sensor elements. A lot of people got excited about it for a while, but it's no use unless your sensor is undersampling, and (as I understand it) that rarely applies to cameras.

Think of it like looking through a piece of paper with 1cm square holes punched in it every 2cm. Place it over an image, and you can only see 1/4 of it, but those pixels you can see are sharp and clear. Shift it by 1cm, and now you've seen half the image. Shift again, and again, and you'll get the whole image.
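
The punched-paper analogy can be checked with a few lines of Python. This is only a sketch of the analogy, not of the actual drizzle algorithm: the image is an 8x8 grid, the "paper" exposes one pixel in every 2x2 cell, and `visible_fraction` is a hypothetical helper that counts how much of the grid has been seen after each shift.

```python
def visible_fraction(size, shifts):
    """Fraction of a size x size image seen through a mask that exposes one
    pixel in every 2x2 cell, after placing the mask at each given shift."""
    seen = set()
    for dy, dx in shifts:
        for y in range(dy, size, 2):
            for x in range(dx, size, 2):
                seen.add((y, x))
    return len(seen) / (size * size)

# Four mask positions: no shift, one step right, one step down, both.
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
coverage = [visible_fraction(8, shifts[:n]) for n in range(1, 5)]
print(coverage)  # [0.25, 0.5, 0.75, 1.0]
```

Each shifted exposure adds a quarter of the image that the previous ones missed, which is why the method only pays off when the sensor leaves gaps to begin with.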

One case where it does apply, though, is a technique called Bayer drizzle, which can fill in the gaps (caused by undersampling R, G, and B) in a Bayer sensor (http://en.wikipedia.org/wiki/Bayer_filter) by similar means. You need RAW camera data to do that.

ChiDragon
19th May 2015, 22:01
Think of it like looking through a piece of paper with 1cm square holes punched in it every 2cm. Place it over an image, and you can only see 1/4 of it, but those pixels you can see are sharp and clear. Shift it by 1cm, and now you've seen half the image. Shift again, and again, and you'll get the whole image.
Is this like JVC's e-shift (http://cdn.jvc.eu/dla-x900r/feature01.html)?

wonkey_monkey
20th May 2015, 09:44
Is this like JVC's e-shift (http://cdn.jvc.eu/dla-x900r/feature01.html)?

Hard to say from reading that bit of marketing blurb. It's possibly broadly similar. Sounds to me like they're projecting 4x 1920x1080 images in quick succession and at slight offsets to give you 4K.

Reel.Deel
20th May 2015, 18:33
This doesn't really add any info on how super-resolution works but have you guys heard of waifu2x (http://waifu2x.udp.jp/)? It's described as "Single-Image Super-Resolution for anime/fan-art using Deep Convolutional Neural Networks".


I think NNEDI3 with pre/postprocessing is still better, but to each his own. :)

https://raw.githubusercontent.com/nagadomi/waifu2x/master/images/slide.png

*.mp4 guy
27th May 2015, 20:12
Based on the minute description of its operation, it is yet another of the growing number of clones that are "differently the same" in respect to nnedi. I understand why people want to play around with the techniques tritical developed, but if you end up with nothing but a slightly inferior result for your effort, it just feels wrong to me to present that as something new without even mentioning the originator, or providing a fair comparison to it to show the assumptive differences.

foxyshadis
28th May 2015, 00:44
Based on the minute description of its operation, it is yet another of the growing number of clones that are "differently the same" in respect to nnedi. I understand why people want to play around with the techniques tritical developed, but if you end up with nothing but a slightly inferior result for your effort, it just feels wrong to me to present that as something new without even mentioning the originator, or providing a fair comparison to it to show the assumptive differences.

It uses the techniques of nnedi, but seems to be tuned more like eedi2/3:
https://raw.githubusercontent.com/nagadomi/waifu2x/master/images/lena_waifu2x.png

I'd imagine that an alternative weights file for nnedi3 could be generated specifically for toons/anime/cgi, and it'd be a lot faster than this waifu2x. Probably even better looking, too.

Regarding the thread topic, I've yet to see anything rival Video Enhancer (http://www.infognition.com/super_resolution_avisynth/) by a fellow from MSU, but of course they don't give you more than general details on the algorithm. The comparison shots (http://www.infognition.com/articles/video_resize_shootout.html) are pretty good, though.

Avisynth has an open-source (but very specialized) super resolution: QTGMC and its ancestors, which might well be the best deinterlacers ever made. Nearby frames are used to iteratively add detail to the generated parts. I'm certain you could pick it apart to make a general super resolution script, although some of its detail retention features wouldn't work quite as well.

huhn
28th May 2015, 01:21
Regarding the thread topic, I've yet to see anything rival Video Enhancer (http://www.infognition.com/super_resolution_avisynth/) by a fellow from MSU, but of course they don't give you more than general details on the algorithm. The comparison shots (http://www.infognition.com/articles/video_resize_shootout.html) are pretty good, though.

and you don't think it's fishy that lanczos and spline are better with PSNR than nnedi3?

i get the feeling the downscaling has a lot to do with the results.

Reel.Deel
28th May 2015, 02:45
Based on the minute description of its operation, it is yet another of the growing number of clones that are "differently the same" in respect to nnedi. I understand why people want to play around with the techniques tritical developed, but if you end up with nothing but a slightly inferior result for your effort, it just feels wrong to me to present that as something new without even mentioning the originator, or providing a fair comparison to it to show the assumptive differences.

I agree. It's a shame that waifu2x has gotten so much recognition and praise since its release. 2,940 stargazers and 219 forks on GitHub! :eek:. There are also lots of threads on the web discussing waifu2x, and there are even some people claiming that it's more sophisticated than NNEDI3. Please, I think those claims are just a big pile of chickenshit. IMHO, NNEDI3 with pre/post-processing can produce better results any day. It also works reasonably well with just about any type of content (not just anime). It's crazy that NNEDI3 (and its predecessors) have been around for years yet nobody seemed to care...

Kudos to tritical!!

I've yet to see anything rival Video Enhancer (http://www.infognition.com/super_resolution_avisynth/) by a fellow from MSU, but of course they don't give you more than general details on the algorithm.

Really? I tried it a couple of years back and I was not impressed. I'm not the only one who feels that way either.


In 99% of all practical cases, spline36resize (or pretty much any other resizer) is almost as good as video enhancer (since in fact it doesn't do very much, except for munching CPU time) ... and nnedi2_rpow2 most probably is ahead of video enhancer.

feisty2
28th May 2015, 06:13
here, I present you, fake single frame "SuperResolution" :)

lena high
http://i.imgur.com/6WMb7H9.png

lena low

ImageSource("Lena.png")
interleave (showred ("y8"),showgreen ("y8"),showblue ("y8"))
convert8to16 (false)
dither_resize16 (256,256,kernel="cubic",a1=-1,a2=0)
round8 (false)
mergergb (selectevery (3,0),selectevery (3,1),selectevery (3,2))

http://i.imgur.com/3lYp6R2.png

spline64

ImageSource("Lena.png")
interleave (showred ("y8"),showgreen ("y8"),showblue ("y8"))
convert8to16 (false)
dither_resize16 (256,256,kernel="cubic",a1=-1,a2=0)
round8 (false)
spline64resize (512,512)
mergergb (selectevery (3,0),selectevery (3,1),selectevery (3,2))

http://i.imgur.com/OJ6BpSf.png

eedi3

ImageSource("Lena.png")
interleave (showred ("y8"),showgreen ("y8"),showblue ("y8"))
convert8to16 (false)
dither_resize16 (256,256,kernel="cubic",a1=-1,a2=0)
round8 (false)
converttoyv12 ()
eedi3_rpow2 (rfactor=2,nrad=3,mdis=40)
mergergb (selectevery (3,0),selectevery (3,1),selectevery (3,2))

http://i.imgur.com/XDCOC1z.png

nnedi3

ImageSource("Lena.png")
interleave (showred ("y8"),showgreen ("y8"),showblue ("y8"))
convert8to16 (false)
dither_resize16 (256,256,kernel="cubic",a1=-1,a2=0)
round8 (false)
converttoyv12 ()
nnedi3_rpow2 (rfactor=2,nsize=0,nns=4,qual=2)
mergergb (selectevery (3,0),selectevery (3,1),selectevery (3,2))

http://i.imgur.com/jQYDJl2.png

fake "SR"

ImageSource("Lena.png")
interleave (showred ("y8"),showgreen ("y8"),showblue ("y8"))
convert8to16 (false)
dither_resize16 (256,256,kernel="cubic",a1=-1,a2=0)
round8 (false)

converttoyv12 ()
eedi3 (dh=true,alpha=0.1,beta=0.5,gamma=60,nrad=3,mdis=40,sclip=nnedi3 (dh=true,nns=4,nsize=0,qual=2,etype=0))
turnleft ()
eedi3 (dh=true,alpha=0.1,beta=0.5,gamma=60,nrad=3,mdis=40,sclip=nnedi3 (dh=true,nns=4,nsize=0,qual=2,etype=0))
turnright ()
converttoy8 ()

convert8to16 (false)
up=last
sharp1=up.shrinksharp16 (str=1.0).CMSharp16(str=1.0).shrinksharp16 (str=1.0)
limit1=up.SharpLimit16(sharp=sharp1, str=1.0)
sharp2=limit1.shrinksharp16 (str=1.0).CMSharp16(str=1.0).shrinksharp16 (str=1.0)
limit2=limit1.SharpLimit16(sharp=sharp2, str=1.0)
sharped=up.SharpLimit16(sharp=limit2, str=1.0)
warp=sharped.round8 (false).converttoyv12 ()
\.aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1)
\.aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1)
\.aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1)
\.aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1)
\.aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1)
\.aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1).aWarpSharp2 (depth=1)
\.converttoy8 ().convert8to16 (false)
sharped.SharpLimit16(sharp=warp, str=1.0)
round8 (false)
mergergb (selectevery (3,0),selectevery (3,1),selectevery (3,2))

http://i.imgur.com/iG9rERE.png

foxyshadis
28th May 2015, 08:46
and you don't think it's fishy that lanczos and spline are better with PSNR than nnedi3?

i get the feeling the downscaling has a lot to do with the results.

I ignored PSNR, but I think it's easy to explain: They either didn't correct or didn't fully correct for the half-pixel shift. It's obvious when you look at the shots, which is what I compared to. Now with a buttload of sharpening, like feisty's last shot, it can look pretty good, but it still doesn't incorporate any temporal information.
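
For anyone wondering where the half-pixel shift comes from, here is a small Python sketch of the arithmetic. It assumes a doubler that keeps every original sample and puts it at the even output index (new samples go in between), and it assumes the usual convention that input and output cover the same extent with pixel centres at i + 0.5. Both are assumptions for illustration, and `doubled_positions` is a hypothetical helper, not part of any plugin.

```python
def doubled_positions(n):
    """For a 1-D signal of n samples doubled to 2n samples, return for each
    kept original sample (true position, position the output grid implies),
    both in input-pixel units with pixel centres at i + 0.5."""
    out = []
    for i in range(n):
        true_pos = i + 0.5           # centre of input sample i
        j = 2 * i                    # assumed: original kept at even output index
        nominal_pos = (j + 0.5) / 2  # centre of output sample j on the 2n grid
        out.append((true_pos, nominal_pos))
    return out

shift_in_input_px = [t - p for t, p in doubled_positions(4)]
shift_in_output_px = [2 * s for s in shift_in_input_px]

print(shift_in_input_px)   # [0.25, 0.25, 0.25, 0.25]
print(shift_in_output_px)  # [0.5, 0.5, 0.5, 0.5]
```

Every sample lands a quarter of an input pixel (half an output pixel) away from where a centred resizer would put it. Unless that offset is corrected, a pixel-by-pixel metric like PSNR compares against a slightly displaced image and penalises it.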

As far as I know no one has created a general Super Resolution script or plugin for Avisynth that really works, beyond just basic MVTools averaging, which is why I pointed out Video Enhancer -- it seems to actually add detail that isn't in a frame by combining with what is in nearby frames. (Which is why it's a natural fit for deinterlacers.) In general it may not be any better than nnedi3 in motion, but SR is best used for making maximum detail screenshots.

feisty: It's not useful to me to compare using linear light resize, because no source in the wild will have used linear light to downsize. Pretty much everything available has been downsized in gamma light. Perhaps if you are upsizing from an exceptionally well-scanned and high detail DVD, like a Criterion.

feisty2
28th May 2015, 09:18
nah, I wasn't resizing under linear light, both downsizing and upsizing were done under gamma light

foxyshadis
28th May 2015, 09:25
nah, I wasn't resizing under linear light, both downsizing and upsizing were done under gamma light

Oh, I thought dither_resize16 automatically used linear. My bad.

StainlessS
28th May 2015, 12:49
The first lena high kept more of the real detail on the crown of the hat (comparing with the original photo, not the weird color lena low crop out of the full image, EDIT: Crop version presumably converted to 256 color Gif some years ago).

feisty2
28th May 2015, 13:11
The first lena high kept more of the real detail on the crown of the hat.

sure, because it's the native one, pics below were converted from it

StainlessS
28th May 2015, 13:20
Haha, what a silly billy I am :)

jmac698
6th June 2015, 08:54
I have to point out, there's a very good use for drizzle:
the high frame rate modes in some cameras skip pixels, leading to horrible aliasing and weird colours. Would be very useful to clean that up. In fact I should post a sample.

240fps video just scans a subset of pixels across the sensor to make 320x240 video
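
A toy Python sketch of why skipping pixels aliases so badly, assuming a single row of fine black/white stripes and a camera mode that reads every second pixel. The data and the helper names (`skip_pixels`, `box_downscale`) are made up for illustration and do not model any particular camera.

```python
def skip_pixels(row, step):
    """Read only every step-th pixel, as a pixel-skipping sensor mode would."""
    return row[::step]

def box_downscale(row, step):
    """Average each group of step pixels, i.e. a properly filtered downscale."""
    return [sum(row[i:i + step]) / step for i in range(0, len(row), step)]

fine_stripes = [0, 255] * 8  # 16 pixels of alternating black and white

print(skip_pixels(fine_stripes, 2))      # all 0: the stripes vanish into black
print(skip_pixels(fine_stripes[1:], 2))  # all 255: one pixel over, solid white
print(box_downscale(fine_stripes, 2))    # all 127.5: the expected mid-grey
```

The two skipped readouts disagree completely depending on a one-pixel offset, which is the aliasing. It is also why combining several slightly shifted captures, as drizzle does, can recover what a single one misses.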

wonkey_monkey
6th June 2015, 18:10
the high frame rate modes in some cameras skip pixels, leading to horrible aliasing

I was going to express scepticism of this being very common. Then I went to YouTube and the first video I happened to watch exhibited just this effect!

jmac698
11th June 2015, 06:52
I am avenged :)