Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
22nd March 2017, 14:59 | #21 | Link | |
Registered User
Join Date: Aug 2006
Posts: 2,229
|
22nd March 2017, 17:57 | #22 | Link |
The image enthusyast
Join Date: Mar 2015
Location: Brazil
Posts: 270
|
But the important thing is that an upscaling algorithm, which works by trying to revert a prior downscaling operation, should not be limited to work with only one specific downscaling algorithm. But it should work reasonably well for all common downscaling algorithms. If it does, it might also work well enough for images that weren't downscaled at all.
SAR Image Processor 5.2 tries to do this with its Pseudo-Inverse algorithm. Reversing the downscale is not required when the downscaler preserves detail. Another way to approximate the inverse of a downscale is to upscale the image (without assuming any prior downscaling) and then apply a good deblur operator, especially one working in the frequency domain. If the image wasn't lossy-compressed between the downscale and the upscale, this would presumably get you even further. The problem is that most images are lossy-compressed, to a greater or lesser degree. One more reason why upscaling based on an assumed prior downscale is, at least theoretically, not convincing at all.
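The frequency-domain deblur step mentioned above can be sketched with a Wiener filter. This is a minimal illustration, not SAR Image Processor's actual algorithm; the box kernel, the noise-to-signal ratio, and the circular-convolution blur model are all assumptions:

```python
import numpy as np

def _pad_psf(kernel, shape):
    # Embed the small kernel in a full-size array, centered at (0, 0),
    # so its FFT lines up with the image's FFT.
    psf = np.zeros(shape)
    kh, kw = kernel.shape
    psf[:kh, :kw] = kernel
    return np.roll(psf, (-(kh // 2), -(kw // 2)), axis=(0, 1))

def fft_blur(image, kernel):
    # Circular convolution via the FFT (our assumed blur model).
    H = np.fft.fft2(_pad_psf(kernel, image.shape))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * H))

def wiener_deblur(image, kernel, nsr=1e-4):
    # Wiener deconvolution: F = conj(H) * G / (|H|^2 + NSR).
    # The NSR term keeps frequencies where |H| is tiny from exploding.
    H = np.fft.fft2(_pad_psf(kernel, image.shape))
    G = np.fft.fft2(image)
    return np.real(np.fft.ifft2(np.conj(H) * G / (np.abs(H) ** 2 + nsr)))

rng = np.random.default_rng(0)
img = rng.random((32, 32))              # stand-in "sharp" image
box = np.full((3, 3), 1.0 / 9.0)        # hypothetical known blur kernel
blurred = fft_blur(img, box)
restored = wiener_deblur(blurred, box)
```

In practice the kernel is unknown and must be estimated, and lossy-compression artifacts get amplified by the division, which is exactly the objection raised above.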
__________________
Searching for great solutions Last edited by luquinhas0021; 22nd March 2017 at 18:06. |
25th March 2017, 23:45 | #24 | Link |
The image enthusyast
Join Date: Mar 2015
Location: Brazil
Posts: 270
|
The biggest problem with all these neural network algorithms is that usually the guys publishing such "scientific" papers use *one* specific algorithm to downscale their images, and then their neural network just learns how to revert the downscaling. Obviously you can get pretty good results that way. But as soon as you feed those neural networks with images downscaled with a different algorithm (e.g. Lanczos instead of Catmull-Rom, or Bilinear or Box), the algos quickly fall apart. The only way to properly evaluate the quality of an upscaling algorithm is to test it with multiple different images, which were downscaled with different algorithms.
Complementing what you said, Madshi, another big problem with these neural networks is mathematical (remember that RAISR learns the difference between a cheaply upscaled, downsized image and the ground truth, and it uses 4 x 4 patches in both cases): for each pixel, considering just one channel of k bits, there are 2 ^ k possible values. Since values can repeat across pixels, the total number of possible 4 x 4 patches is (2 ^ k) ^ 16 = 2 ^ 16k. For an 8-bit channel that is 256 ^ 16, a number with 39 digits. A training set of that size is impossible, at least today. Google said RAISR was trained on 10000 images. But what are 10000 images compared with the number I wrote?
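The 39-digit claim above checks out; a quick way to verify it:

```python
# Distinct 4x4 patches with one 8-bit channel per pixel: (2^8)^16 = 2^128
patches = 256 ** 16
print(patches)            # 340282366920938463463374607431768211456
print(len(str(patches)))  # 39
```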
__________________
Searching for great solutions Last edited by luquinhas0021; 28th March 2017 at 04:33. |
26th March 2017, 12:00 | #25 | Link |
Registered Developer
Join Date: Sep 2006
Posts: 9,140
|
If you analyze what those neural networks learn, the first layer actually ends up being an edge detector for different angles, plus detectors for smooth areas, and then based on this first layer the neural networks apply their learned upscaling (or rather downscaling inversion). It's pretty cool to see that you just throw images at the network during training, and it ends up using edge detection filters. That is not a decision a human made, but the result of the neural network learning. This also shows that for good quality upscaling, you *do* need to detect edges and treat them differently. Which is why simple linear algorithms like spline, bicubic, lanczos, jinc or sinc are not really good upscalers: they don't detect edges at all. They just naively apply their weights on every pixel, regardless of whether it's an edge or a smooth area.
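The "naive weights" point is easy to make concrete: a linear resampler's weights are a function of the sampling position only, never of the pixel values. A minimal Catmull-Rom example in 1-D:

```python
def catmull_rom_weights(t):
    # Weights for the 4 nearest source samples at fractional
    # position t in [0, 1). No image data is involved: the same
    # weights are applied on an edge and on a flat area alike.
    w0 = -0.5 * t**3 + t**2 - 0.5 * t
    w1 = 1.5 * t**3 - 2.5 * t**2 + 1.0
    w2 = -1.5 * t**3 + 2.0 * t**2 + 0.5 * t
    w3 = 0.5 * t**3 - 0.5 * t**2
    return (w0, w1, w2, w3)

def resample_1d(samples, t):
    # Interpolate between samples[1] and samples[2].
    return sum(w * s for w, s in zip(catmull_rom_weights(t), samples))
```

The weights always sum to 1 and depend on nothing but t, which is precisely why such filters cannot treat an edge specially.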
|
26th March 2017, 13:17 | #26 | Link |
I'm Siri
Join Date: Oct 2012
Location: void
Posts: 2,633
|
Right, the first layer is pretty much an edge detector, but you could always build a neural net with, say, 100 layers, and hopefully it would be able to predict stuff more complex than just edges. There was a 1002-layer neural net out there, last time I checked.
Last edited by feisty2; 26th March 2017 at 13:21. |
26th March 2017, 15:32 | #29 | Link |
Moderator
Join Date: Nov 2001
Location: Netherlands
Posts: 6,364
|
I don't know anything about neural networks. Is it possible (mathematically) to show that the first layer is an edge detector in some way or some form?
I guess it should also be possible to incorporate edge detection into standard resizing routines? That is, you neglect an edge (and every source pixel beyond that edge) that would normally contribute to a target pixel, and normalize the remaining source pixels that do contribute. Perhaps people have tried things like that? |
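A minimal 1-D sketch of that suggestion (the function name, the simple threshold edge test, and the threshold value are all made up for illustration): weights of source pixels beyond a detected edge are zeroed and the rest renormalized.

```python
def edge_aware_resample(samples, weights, center, edge_threshold=64):
    # samples: source pixels; weights: the filter's normal weights
    # (same length); center: index of the sample nearest the target.
    # Any sample separated from the center by a jump larger than
    # edge_threshold stops contributing; the rest are renormalized.
    kept = list(weights)
    # Walk right from the center; stop contributing past an edge.
    for i in range(center, len(samples) - 1):
        if abs(samples[i + 1] - samples[i]) > edge_threshold:
            for j in range(i + 1, len(samples)):
                kept[j] = 0.0
            break
    # Walk left from the center likewise.
    for i in range(center, 0, -1):
        if abs(samples[i - 1] - samples[i]) > edge_threshold:
            for j in range(i - 1, -1, -1):
                kept[j] = 0.0
            break
    total = sum(kept)   # never zero: the center weight is always kept
    return sum((w / total) * s for w, s in zip(kept, samples))
```

With samples [10, 12, 200, 205] and weights [0.1, 0.4, 0.4, 0.1] centered on the second sample, the jump to 200 is treated as an edge, so the result stays near 12 instead of being pulled toward 200.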
26th March 2017, 15:44 | #30 | Link |
Registered Developer
Join Date: Sep 2006
Posts: 9,140
|
One convolutional neural network layer basically consists of a number of rectangular "filters". With the right tools, you can convert these filters into bitmap images, which lets you see what the filters do. Doing that, you can see that the filters in the first layer are usually mostly edge detection filters, plus at least one "smooth area" detection filter, plus a couple of "weird" filters. It's really interesting to look at what a learned neural network does. I've read about this in some PDF some time ago, but I don't have a link at hand right now. The most interesting thing is that all of these filters are created automatically by machine learning, with no human intervention or guiding.
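A hedged sketch of the filter-to-bitmap conversion described above. The weights here are random placeholders standing in for a trained network's first-layer weights; each filter is normalized to the full 8-bit range so its pattern becomes visible:

```python
import numpy as np

def filters_to_bitmaps(weights):
    # weights: array of shape (num_filters, height, width).
    # Each filter is normalized independently so its full value range
    # maps to 0..255, making the learned pattern visible as grayscale.
    bitmaps = []
    for f in weights:
        lo, hi = f.min(), f.max()
        scale = 255.0 / (hi - lo) if hi > lo else 0.0
        bitmaps.append(((f - lo) * scale).astype(np.uint8))
    return np.stack(bitmaps)

# Placeholder "first layer": 8 random filters of size 5x5. In practice
# you would load the weights from a trained network instead.
rng = np.random.default_rng(42)
fake_layer = rng.normal(size=(8, 5, 5))
imgs = filters_to_bitmaps(fake_layer)
```

Saved as small PNGs, trained filters visualized this way typically show oriented edge patterns, one near-uniform patch, and a few hard-to-interpret ones, matching the description above.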
Yes, you could try to tweak standard linear filter resizers to look for edges and treat them differently. Many people have tried that, including myself, but it's hard to make that work really well. Often if you try to interpret gradient/line angles, you end up adding directional artifacts into the interpolated image. E.g. see here for an example: http://www.general-cathexis.com/inte...ownDDL3_4X.jpg (Data-Dependent Lanczos 3) Pretty ugly, if you ask me. That's an extreme example, of course. It's possible to get better results than that. |
27th March 2017, 09:26 | #31 | Link |
Registered User
Join Date: Apr 2002
Location: Germany
Posts: 4,926
|
Nvidia is opening up their Deep Learning Super Resolution R&D a tad more to the public, in the context of VR and game development use cases:
https://developer.nvidia.com/deep-le...erials-texture
__________________
all my compares are riddles so please try to decipher them yourselves :) It is about Time Join the Revolution NOW before it is to Late ! http://forum.doom9.org/showthread.php?t=168004 Last edited by CruNcher; 27th March 2017 at 09:45. |
27th March 2017, 10:08 | #32 | Link |
Registered User
Join Date: May 2014
Posts: 292
|
Pixel Recursive Super Resolution
https://arxiv.org/abs/1702.00783 |
28th March 2017, 04:47 | #33 | Link |
The image enthusyast
Join Date: Mar 2015
Location: Brazil
Posts: 270
|
If you analyze what those neural networks learn, the first layer actually ends up being an edge detector for different angles, plus detectors for smooth areas, and then based on this first layer the neural networks apply their learned upscaling (or rather downscaling inversion). It's pretty cool to see that you just throw images at the network during training, and it ends up using edge detection filters. That is not a decision a human made, but the result of the neural network learning. This also shows that for good quality upscaling, you *do* need to detect edges and treat them differently. Which is why simple linear algorithms like spline, bicubic, lanczos, jinc or sinc are not really good upscalers: they don't detect edges at all. They just naively apply their weights on every pixel, regardless of whether it's an edge or a smooth area.
Anyhow, a couple million images might need to be passed through the neural network. Yes, you could try to tweak standard linear filter resizers to look for edges and treat them differently. Many people have tried that, including myself, but it's hard to make that work really well. Often if you try to interpret gradient/line angles, you end up adding directional artifacts into the interpolated image. See here, for an example, Data-Dependent Lanczos 3... You and I know that periodic-kernel image interpolation tends not to give us optimal results, due to its periodicity artifacts. Data-dependent polynomial interpolation would give us better results, due to its flexibility when we build the linear system, i.e. we can include constraints up to the n-th derivative and whatever else we want, as long as it fits in the system.
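The "build the linear system" idea can be made concrete with the simplest case: fit a cubic per interval by solving a 4x4 system of value and first-derivative constraints. The constraint choices here (central-difference slopes) are just one illustration; with those slopes this reduces to Catmull-Rom, but the system is free to take higher derivatives or data-dependent slopes instead, which is exactly the flexibility argued for above.

```python
import numpy as np

def hermite_coeffs(p0, p1, d0, d1):
    # Cubic p(t) = a + b t + c t^2 + d t^3 constrained by values
    # and first derivatives at t=0 and t=1.
    A = np.array([
        [1.0, 0.0, 0.0, 0.0],   # p(0)  = a
        [1.0, 1.0, 1.0, 1.0],   # p(1)  = a + b + c + d
        [0.0, 1.0, 0.0, 0.0],   # p'(0) = b
        [0.0, 1.0, 2.0, 3.0],   # p'(1) = b + 2c + 3d
    ])
    return np.linalg.solve(A, np.array([p0, p1, d0, d1]))

def interpolate(samples, t):
    # samples: 4 consecutive pixels; interpolate between the middle
    # two, using central differences as the derivative estimates.
    s0, s1, s2, s3 = samples
    d1 = (s2 - s0) / 2.0   # slope at the left sample
    d2 = (s3 - s1) / 2.0   # slope at the right sample
    a, b, c, d = hermite_coeffs(s1, s2, d1, d2)
    return a + b * t + c * t**2 + d * t**3
```

Swapping the two derivative rows for different constraints (a second derivative, a slope estimated from the local gradient direction) changes the interpolant without touching the solver, which is the appeal of the linear-system formulation.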
__________________
Searching for great solutions Last edited by luquinhas0021; 27th November 2017 at 16:30. |
13th November 2017, 19:20 | #34 | Link |
Registered User
Join Date: Oct 2016
Posts: 56
|
online upscaler: https://letsenhance.io
Original, Up x4: Lanczos4 / waifu2x UpPhoto noise_scale Level1 x4 / letsenhance.io
Original downscaled, Up x4: Lanczos4 / waifu2x UpPhoto noise_scale Level0 / letsenhance.io
Last edited by zub35; 13th November 2017 at 20:09. |
13th November 2017, 20:34 | #35 | Link |
Registered Developer
Join Date: Sep 2006
Posts: 9,140
|
Hmmm... That website offers a "boring" and "magic" version. The boring one looks very similar to waifu2x. The "magic" one most probably tries to hallucinate texture detail, based on some of the recent scientific PDFs suggesting that approach. And it works surprisingly well (even magical) on many image areas, especially on trees and stuff. But it also adds a shitload of weird artifacts, sometimes does truly scary things (like mutating an image element into something completely different), and this is when testing with losslessly compressed images. Testing with lossily compressed images, artifacts become even worse. E.g. any sort of Mosquito or Block artifacts are very strongly enhanced and reinterpreted as being weird image detail.
It's a very interesting technology, but I don't think it's a good match for upscaling lossily compressed video content. Just my 2 cents, of course. |
13th November 2017, 21:22 | #36 | Link |
Registered User
Join Date: Oct 2016
Posts: 56
|
So far all of this is demonstrated on individual images, but the technology can be extended with a temporal dimension,
thereby not only improving quality but also minimizing artifacts. A good database of training samples also affects quality considerably. Going even further: in a next-gen video standard, neural data could be embedded in the GOP for fast, high-quality restoration, which would allow compressing 6K-8K at current 2K bitrates. Last edited by zub35; 13th November 2017 at 21:27. |
13th November 2017, 21:41 | #37 | Link |
Registered Developer
Join Date: Sep 2006
Posts: 9,140
|
You do know that neural networks of this size don't process video in real time, right? And that's when talking about single images. Adding temporal processing would help reduce artifacts (though most probably not remove them completely), but it would also slow things down even further.
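A rough back-of-envelope supports this. The layer sizes below are made up but typical of super-resolution networks of this era (e.g. ~64 channels of 3x3 convolutions):

```python
# Multiply-accumulates for one 3x3 conv layer, 64 -> 64 channels,
# on a 1920x1080 frame (hypothetical but representative layer shape).
h, w = 1080, 1920
macs_per_layer = h * w * 64 * 64 * 3 * 3
layers = 20                   # hypothetical mid-sized network
fps = 60
total_flops = macs_per_layer * layers * fps * 2   # 2 FLOPs per MAC
print(total_flops / 1e12)     # ~183 TFLOPS sustained, far beyond 2017 GPUs
```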
|
13th November 2017, 21:49 | #38 | Link |
Registered User
Join Date: Oct 2016
Posts: 56
|
Of course. Real-time processing would require an appropriate implementation of the instruction set, for example as a separate unit in the GPU. That's why I mentioned next-gen and the adoption of appropriate standards.
Maybe even a next-next-gen standard. But you will agree, the technology is extremely promising and has a chance at a hardware implementation |
13th November 2017, 22:47 | #39 | Link |
Registered Developer
Join Date: Sep 2006
Posts: 9,140
|
When even car manufacturers like Tesla use GPUs to run neural networks, I don't know how much more efficient "hardware implementations" would be. Maybe they would achieve a somewhat better performance-per-watt ratio than a GPU, but I wouldn't expect miracles. It's not like e.g. video decoding, where the algorithms are relatively complicated and hardware implementations can achieve dramatic improvements over a general purpose CPU. Processing neural networks is extremely simple math, but requires very high GFLOPS. It's mostly just lots and lots of matrix multiplications, and GPUs are already very good at doing that.
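To make the "simple math, lots of it" point concrete: convolutions are typically lowered to matrix multiplies (im2col), so inference time is dominated by GEMM-shaped work. A toy dense layer shows the core operation:

```python
import numpy as np

def dense_relu(x, W, b):
    # One fully connected layer: a matrix multiply plus bias and ReLU.
    # GPUs (and later tensor cores) are built around exactly this op.
    return np.maximum(x @ W + b, 0.0)

rng = np.random.default_rng(1)
x = rng.normal(size=(4, 128))    # batch of 4 activation vectors
W = rng.normal(size=(128, 64))   # learned weights (random stand-ins)
b = np.zeros(64)
y = dense_relu(x, W, b)
```

Everything here is a fused multiply-add; the difficulty is purely the volume of them, not the complexity of the control flow, which is why fixed-function hardware has less headroom here than it does for video decoding.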
|
13th November 2017, 22:58 | #40 | Link |
Registered Developer
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,342
|
NVIDIA is already putting Tensor cores into their Volta-based datacenter Deep Learning GPUs, which helps neural network inference quite a bit. And some other vendors are working on special chips for NN inference. So in the consumer space, there is still plenty of performance to be gained with special neural network hardware.
__________________
LAV Filters - open source ffmpeg based media splitter and decoders |