Old 15th November 2016, 21:09   #1  |  Link
Overdrive80
Anime addict
Join Date: Feb 2009
Location: Spain
Posts: 619
Enhance! RAISR Sharp Images with Machine Learning

https://research.googleblog.com/2016...h-machine.html
__________________
Intel i7-6700K + Noctua NH-D15 + Z170A XPower G. Titanium + Kingston HyperX Savage DDR4 2x8GB + Nvidia GTX750 2GB DDR5 + SSD Vertex 4 256 GB + Antec EDG750 80 Plus Gold Mod + Corsair 780T Graphite
Old 15th November 2016, 23:59   #2  |  Link
LoRd_MuldeR
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 12,811
Original:


Spline36Resize (4x):


NNEDI3 (4x):


NNEDI3 (4x) + Sharpen:


Theirs:
__________________
There was of course no way of knowing whether you were being watched at any given moment.
How often, or on what system, the Thought Police plugged in on any individual wire was guesswork.



Last edited by LoRd_MuldeR; 16th November 2016 at 00:04.
Old 16th November 2016, 01:19   #3  |  Link
Gser
Registered User
 
Join Date: Apr 2008
Posts: 309
Is the original picture available somewhere or did you just point resize that one? Would like to try some other things.
Old 16th November 2016, 19:31   #4  |  Link
LoRd_MuldeR
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 12,811
Quote:
Originally Posted by Gser View Post
Is the original picture available somewhere or did you just point resize that one? Would like to try some other things.
What they call the "original" seems to be a 4x upscale (using PointResize?) of the actual original. So, yes, I reduced it to 1/4 via PointResize before applying NNEDI3_rpow2.

Quote:
Originally Posted by smok3 View Post
So this will basically be problematic for video/video compression? Only useful for single images?
Any scaling method that works on "single" images can trivially be extended to work on video, because a video is nothing but a sequence of "single" images.

The most common image scaling methods, like BiLinear, BiCubic or Lanczos interpolation, are also "single image" methods, i.e. they process each frame as an independent picture. This applies to plain NNEDI3 just as well.

Whether their method is fast enough to be useful for (real-time) video upscaling is a whole different question, though...
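The frame-by-frame argument above can be sketched in a few lines. This is an editor's illustration in plain NumPy, with a point resize standing in for any single-image scaler (NNEDI3, RAISR, ...) — not any particular implementation:

```python
import numpy as np

def upscale_frame(frame: np.ndarray, factor: int = 2) -> np.ndarray:
    """Nearest-neighbour (point) upscale of one 2-D frame, standing in
    for an arbitrary single-image scaler."""
    return np.repeat(np.repeat(frame, factor, axis=0), factor, axis=1)

def upscale_video(frames, factor: int = 2):
    """Extend the single-image scaler to video by applying it per frame."""
    return [upscale_frame(f, factor) for f in frames]
```

A temporally-aware scaler would also look at neighbouring frames, but nothing in the single-image formulation prevents this per-frame application.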
Last edited by LoRd_MuldeR; 16th November 2016 at 19:45.
Old 21st March 2017, 13:15   #5  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 8,960
FWIW, the blown-up "original" image doesn't match the small sized image posted by LoRd_MuldeR. The "original" image used by RAISR appears to be higher quality. So all the comparison images with "our" algorithms seem to be at a disadvantage here.

It's a pity Google didn't find it necessary to make all the images from their PDF available as PNG (or even JPG). But maybe they had a good reason for that...

The biggest problem with all these neural network algorithms is that usually the guys publishing such "scientific" papers use *one* specific algorithm to downscale their images, and then their neural network just learns how to revert the downscaling. Obviously you can get pretty good results that way. But as soon as you feed those neural networks with images downscaled with a different algorithm (e.g. Lanczos instead of Catrom, or Bilinear or Box), the algos quickly fall apart. The only way to properly evaluate the quality of an upscaling algorithm is to test it with multiple different images, which were downscaled with different algorithms.
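madshi's evaluation criterion can be made concrete: score the same upscaler against test images produced by several different downscaling kernels and look at the spread. A hypothetical sketch in plain NumPy — the kernels, function names and PSNR metric are the editor's choices, not from any paper:

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio between two same-sized images."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float('inf') if mse == 0 else 10 * np.log10(peak ** 2 / mse)

def down_box(img, f=2):
    """Box (average) downscale by factor f."""
    h, w = img.shape
    return img.reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def down_point(img, f=2):
    """Point (decimation) downscale by factor f."""
    return img[::f, ::f]

def up_point(img, f=2):
    """Point upscale, standing in for the upscaler under test."""
    return np.repeat(np.repeat(img, f, axis=0), f, axis=1)

def evaluate(upscaler, img, downscalers):
    """Score one upscaler against several downscaling kernels."""
    return {name: psnr(img, upscaler(d(img)))
            for name, d in downscalers.items()}
```

An upscaler that was trained to invert exactly one of these kernels will show a large PSNR gap between the matching and the non-matching rows of this table.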
Old 22nd March 2017, 13:38   #6  |  Link
luquinhas0021
The image enthusyast
 
Join Date: Mar 2015
Location: Brazil
Posts: 267
Your thoughts make complete sense, Madshi.

I thought the correct approach to upscaling was simply to take an image and increase its size... just that. Not to take a high-resolution image that was downscaled, and have the upscaler try to reverse the downscaling.
__________________
Searching for great solutions
Old 22nd March 2017, 14:55   #7  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 8,960
Quote:
Originally Posted by luquinhas0021 View Post
I thought the correct approach to upscaling was simply to take an image and increase its size... just that. Not to take a high-resolution image that was downscaled, and have the upscaler try to reverse the downscaling.
Yes. However, at doom9 we're talking about DVDs and Blu-rays, which were usually downscaled from higher-resolution masters. So learning how to revert downscaling is not generally a bad idea, IMHO. For our purposes here it could even be a great idea. But the important thing is that an upscaling algorithm which works by trying to revert a prior downscaling operation should not be limited to only one specific downscaling algorithm; it should work reasonably well for all common downscaling algorithms. If it does, it might also work well enough for images that weren't downscaled at all.

My point was that we have to take any results from such "scientific PDFs" with a pinch of salt, because they often train for just one specific downscaling algorithm. So it's important that we're able to double-check the upscaling results ourselves, with test images we created ourselves. Otherwise it's hard to judge how good such an algo really is.
Old 22nd March 2017, 14:59   #8  |  Link
burfadel
Registered User
 
Join Date: Aug 2006
Posts: 2,167
Quote:
Originally Posted by madshi View Post
But as soon as you feed those neural networks with images downscaled with a different algorithm (e.g. Lanczos instead of Catrom, or Bilinear or Box), the algos quickly fall apart. The only way to properly evaluate the quality of an upscaling algorithm is to test it with multiple different images, which were downscaled with different algorithms.
And that's assuming the image wasn't lossy-compressed between the downscaling and the upscaling; compression would presumably throw the algorithms off even further.
Old 16th November 2016, 12:50   #9  |  Link
smok3
brontosaurusrex
Join Date: Oct 2001
Posts: 2,375
from https://arxiv.org/abs/1606.01299
Quote:
Given an image, we wish to produce an image of larger size with significantly more pixels and higher image quality. This is generally known as the Single Image Super-Resolution (SISR) problem. The idea is that with sufficient training data (corresponding pairs of low and high resolution images) we can learn a set of filters (i.e. a mapping) that, when applied to a given image that is not in the training set, will produce a higher resolution version of it, where the learning is preferably low complexity. In our proposed approach, the run-time is more than one to two orders of magnitude faster than the best competing methods currently available, while producing results comparable or better than state-of-the-art.
So this will basically be problematic for video/video compression? Only useful for single images?
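For intuition, the "learn a set of filters" idea from the abstract can be reduced to its simplest form: fit one linear filter, by least squares, that maps each low-resolution patch to the corresponding pixel of the high-resolution target. This toy sketch is the editor's, not Google's code; it skips RAISR's hashing of patches into buckets and works on a same-size blurred/sharp pair:

```python
import numpy as np

def extract_patches(img, k=3):
    """Flattened k x k patches from a 2-D image (valid positions only)."""
    h, w = img.shape
    return np.array([img[i:i + k, j:j + k].ravel()
                     for i in range(h - k + 1)
                     for j in range(w - k + 1)])

def learn_filter(lo, hi, k=3):
    """Least-squares filter mapping each k x k patch of `lo` to the
    centre pixel of `hi`. (RAISR additionally buckets patches by
    gradient angle/strength/coherence, one filter per bucket.)"""
    A = extract_patches(lo, k)                            # one row per patch
    c = k // 2
    y = hi[c:hi.shape[0] - c, c:hi.shape[1] - c].ravel()  # centre pixels
    f, *_ = np.linalg.lstsq(A, y, rcond=None)
    return f
```

Applying the learned filter at run-time is then just a small convolution, which is where the abstract's one-to-two-orders-of-magnitude speed claim over competing learned methods comes from.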
Old 16th November 2016, 13:58   #10  |  Link
feisty2
I'm Siri
Join Date: Oct 2012
Location: Los Angeles, California
Posts: 2,058
I'm a bit disappointed, because NN-based scaling doesn't seem to work so well, IMHO (at least for now)...
Sure, it won't be possible to "recover" the image to a higher resolution, but the guesswork, which is what a NN does, is also not good enough to fool my eyes.
__________________
If I got new ideas, will post here: https://github.com/IFeelBloated
Old 16th November 2016, 20:53   #11  |  Link
smok3
brontosaurusrex
Join Date: Oct 2001
Posts: 2,375
LoRd_MuldeR: What I meant was that two slightly different frames could end up using completely different resizing filters, due to the brute-force methodology used here (which may look weird when 'animated'). But thanks for that explanation anyway, especially the part about "video is nothing but a sequence of single images" < that got me giggling.
Old 16th November 2016, 21:23   #12  |  Link
LoRd_MuldeR
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 12,811
Quote:
Originally Posted by smok3 View Post
Quote:
Given an image, we wish to produce an image of larger size with significantly more pixels and higher image quality. This is generally known as the Single Image Super-Resolution (SISR) problem. The idea is that with sufficient training data (corresponding pairs of low and high resolution images) we can learn a set of filters (i.e. a mapping) that, when applied to a given image that is not in the training set, will produce a higher resolution version of it, where the learning is preferably low complexity. In our proposed approach, the run-time is more than one to two orders of magnitude faster than the best competing methods currently available, while producing results comparable or better than state-of-the-art.
What I meant was that two slightly different frames could end up using completely different resizing filters, due to the brute-force methodology used here (which may look weird when 'animated').
It's not "brute force", it's how neural networks work:

You start with an initial (random) network and then you "train" the network with pairs of input data and corresponding (optimal) output data, aka the "training set" – in this case pairs of low-resolution images and corresponding high-resolution images. In the end, you get a network that (hopefully) produces "good" results, even for unknown inputs – in this case a network that will approximate a "sharp and natural-looking" high-resolution image from an (unknown) low-resolution image.

This is more or less exactly how NNEDI3 (and its predecessors) have been created! I don't think there is any reason to assume that this approach will necessarily create a highly discontinuous filter, i.e. a filter that would produce completely different outputs even for very small deviations of the input. At least NNEDI3 shows the opposite: it works pretty well for (progressive) video upscaling, doesn't it?

(Note: The reason why NNEDI3 alone is not a good double-rate deinterlacer and produces a notable "bobbing" effect is a different one – in that scenario, the images are alternately approximated from "odd" and "even" lines.)
Last edited by LoRd_MuldeR; 16th November 2016 at 21:33.
Old 17th November 2016, 09:10   #13  |  Link
smok3
brontosaurusrex
Join Date: Oct 2001
Posts: 2,375
Define "train". Yeah, it is not brute force once the user gets this; it's rather a cache of brute force.
Old 17th November 2016, 19:39   #14  |  Link
LoRd_MuldeR
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 12,811
Quote:
Originally Posted by smok3 View Post
Define "train".
You present your current network with an input/output pair, the training sample. In this case, the "input" would be a low-resolution version of the original image, and the "output" is the original high-resolution image.

You let your network create its own (high-res) output image from the given (low-res) "input" image, and then you compare that against the given optimal "output" image (the original). Of course, there will be some difference between the network's actual output and the optimal (desired) output - especially at the beginning of the training phase. This difference, or "error", will be used to update (improve) the network, so that the error is reduced. For example, one approach is to let the "error" propagate through the network in backwards direction and adjust the individual weights accordingly.

You repeat this process with many training samples (input/output pairs). In the end, you get a network that (hopefully) produces good results, even for unknown inputs.
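The loop described above, in its smallest possible form: a one-layer "network" (a single linear filter) trained by repeatedly comparing its output to the desired output and stepping the weights against the error. An editor's toy sketch, with the target mapping, learning rate and step count chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy "network": one linear layer mapping a 4-pixel low-res patch to one
# high-res pixel. The desired mapping here is a simple 4-pixel average.
true_w = np.full(4, 0.25)
w = rng.normal(size=4)           # start from a random network

for step in range(2000):         # the training loop
    x = rng.random(4)            # "input": a low-res patch (training sample)
    target = true_w @ x          # "output": the desired high-res pixel
    pred = w @ x                 # the network's own output
    err = pred - target          # the difference, or "error" ...
    w -= 0.1 * err * x           # ... propagated back as a weight update
```

After enough samples the weights converge towards the desired mapping; with a deeper network the same error signal is propagated layer by layer (backpropagation), but the principle is identical.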
Last edited by LoRd_MuldeR; 17th November 2016 at 19:46.
Old 17th November 2016, 19:48   #15  |  Link
Overdrive80
Anime addict
Join Date: Feb 2009
Location: Spain
Posts: 619
Quote:
Originally Posted by smok3 View Post
Define "train". Yeah, it is not brute force once the user gets this; it's rather a cache of brute force.
Quote:
Yaniv Romano, John Isidoro, Peyman Milanfar
(Submitted on 3 Jun 2016 (v1), last revised 4 Oct 2016 (this version, v3))

Given an image, we wish to produce an image of larger size with significantly more pixels and higher image quality. This is generally known as the Single Image Super-Resolution (SISR) problem. The idea is that with sufficient training data (corresponding pairs of low and high resolution images) we can learn a set of filters (i.e. a mapping) that, when applied to a given image that is not in the training set, will produce a higher resolution version of it, where the learning is preferably low complexity. In our proposed approach, the run-time is more than one to two orders of magnitude faster than the best competing methods currently available, while producing results comparable or better than state-of-the-art.
A closely related topic is image sharpening and contrast enhancement, i.e., improving the visual quality of a blurry image by amplifying the underlying details (a wide range of frequencies). Our approach additionally includes an extremely efficient way to produce an image that is significantly sharper than the input blurry one, without introducing artifacts such as halos and noise amplification. We illustrate how this effective sharpening algorithm, in addition to being of independent interest, can be used as a pre-processing step to induce the learning of more effective upscaling filters with built-in sharpening and contrast enhancement effect.
https://arxiv.org/abs/1606.01299
Old 19th November 2016, 16:56   #16  |  Link
CruNcher
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,950
SOA compare of different test cases from the dataset: the building reconstruction result is pretty good, as are all the text (letter) parts and the faceted-eye pattern reconstruction.

Very impressive high-frequency preservation in all of those results, by default.

No one here has yet achieved such sharp results without haloing in the post-process; the human-eye/face sample is still far away from those reconstruction results, especially the fine hair-structure reconstruction (shown also with the cat test case).

https://drive.google.com/file/d/0BzC...VFZGJ4OWc/view

Tritical's NNEDI3 got beaten at first sight by Google R&D.

We need the clown and lighthouse tests with Google's trained algorithm.
__________________
all my compares are riddles so please try to decipher them yourselves :)

It is about Time

Join the Revolution NOW before it is to Late !

http://forum.doom9.org/showthread.php?t=168004

Last edited by CruNcher; 19th November 2016 at 20:06.
Old 21st November 2016, 22:05   #17  |  Link
plasma
Registered User
 
Join Date: Nov 2016
Posts: 15
Using neural-enhance + denoise.

Last edited by plasma; 21st November 2016 at 22:06. Reason: image
Old 4th February 2017, 21:58   #18  |  Link
luquinhas0021
The image enthusyast
 
Join Date: Mar 2015
Location: Brazil
Posts: 267
Where can I get a stable, up-to-date version of this algorithm for personal use? RAISR outperformed Waifu2x in my tests.
Old 17th November 2016, 17:12   #19  |  Link
Gser
Registered User
 
Join Date: Apr 2008
Posts: 309
Theirs:

SuperRes(2, 1, 0, """nnedi3_rpow2(rfactor=4, nns=4, cshift="Spline16Resize")""")

2xSuperResXBR(2, .6, xbrStr=2.3, xbrSharp=1.2)

2xSuperResXBR(1, .7, xbrStr=.1, xbrSharp=.7)

Last edited by Gser; 17th November 2016 at 17:15.
Old 17th March 2017, 22:20   #20  |  Link
Overdrive80
Anime addict
Join Date: Feb 2009
Location: Spain
Posts: 619
https://github.com/google/guetzli/