#41 | madshi (Registered Developer) | 13th November 2017, 23:18
Ok, let's wait and see what the future brings. The current situation is that GPUs would probably have to be about 20-50x faster than they are now to apply waifu2x-sized neural networks to video in real time. And that's just single-frame processing, not yet taking temporal information into account.

Of course, for offline processing real-time performance is not needed.
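For illustration, here is a rough back-of-the-envelope check of that estimate. All numbers (layer count, channels, GPU throughput) are assumptions chosen only to show the order of magnitude, not measurements of waifu2x itself:

Code:
# Rough sanity check of the "GPUs would need to be 20-50x faster" claim.
layers, channels, kernel = 7, 64, 3   # assumed waifu2x-sized CNN
out_w, out_h, fps = 3840, 2160, 24    # 1080p -> 4K at 24 fps

# 2 FLOPs per multiply-accumulate, per output pixel, per layer.
flops_per_px  = layers * 2 * kernel * kernel * channels * channels
flops_per_sec = flops_per_px * out_w * out_h * fps

gpu_tflops = 3.0                      # assumed sustained 2017 GPU throughput
print(f"need {flops_per_sec / 1e12:.0f} TFLOP/s, "
      f"~{flops_per_sec / (gpu_tflops * 1e12):.0f}x more than available")

With these assumptions it comes out to roughly 100 TFLOP/s needed, about 35x more than such a GPU sustains, which lands inside the 20-50x range.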
#42 | feisty2 (I'm Siri) | 14th November 2017, 09:02
Looks like a GAN-related model to me, judging from all the generated fake detail.
#43 | ABDO (Registered User) | 14th November 2017, 15:46
The authors say:
“We took few state-of-art approaches, hacked around and rolled them into production-ready system. Basically we were inspired by SRGAN and EDSR papers.”
#44 | cork_OS (Registered User) | 14th November 2017, 21:47
Quote:
Originally Posted by zub35
Wow, the clown is unbelievable!
#45 | madshi (Registered Developer) | 16th November 2017, 10:23
One problem with hallucinated texture detail is that it's most probably not "stable" in motion. That means that if there's only a small change in the video image, the hallucinated texture detail may shift its position, while still having a similar "look" to it. In motion this will probably look like very strong dithering noise. In extreme cases it could even turn into flickering.

Unfortunately I've already spent my 5 allowed images on the website, so I can't test that. Maybe someone else can try upscaling the clown image shifted by 1 pixel? Or window boxed by 1 pixel? Or mirrored or rotated? That would be a quick way to check how "stable" the hallucinated textures are.
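In case anyone wants to try, here is a minimal sketch (Python with Pillow) that generates those test variants for upload; "clown.png" is the test image's assumed filename:

Code:
from PIL import Image, ImageChops

img = Image.open("clown.png")                 # assumed filename

# Shift right/down by 1 pixel, wrapping at the borders.
ImageChops.offset(img, 1, 1).save("clown_shift1.png")

# Mirror horizontally, and rotate by 1 degree.
img.transpose(Image.Transpose.FLIP_LEFT_RIGHT).save("clown_mirror.png")
img.rotate(1, resample=Image.Resampling.BICUBIC).save("clown_rot1.png")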
#46 | tebasuna51 (Moderator) | 16th November 2017, 11:43
Quote:
Originally Posted by madshi
Maybe someone else can try upscaling the clown image shifted by 1 pixel? Or window boxed by 1 pixel?
I spent my 5 images on versions of the original clown.png shifted by 1, 2, 3, 4 and 5 pixels, upscaled with magic 4X: https://www.sendspace.com/file/8mifbt

To see them in motion, the order of the letsenhance.io images was c5-0p.png.
#47 | madshi (Registered Developer) | 16th November 2017, 12:06
Thank you! So it seems simple pixel shifting does no harm. I suppose that makes sense, because convolutional neural networks aren't really position dependent. So window boxing wouldn't make any difference, either. Argh, I should have thought of that! Maybe adding a very small amount of noise would be interesting, or changing the image brightness ever so slightly, or rotating by 1 degree, or something like that?
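That intuition can be checked directly: for a plain convolution, shifting the input and then convolving gives the same result as convolving and then shifting. A tiny NumPy/SciPy sketch (wrap-around borders so the identity holds exactly):

Code:
import numpy as np
from scipy.ndimage import convolve

rng = np.random.default_rng(0)
img  = rng.random((64, 64))   # stand-in for an image plane
kern = rng.random((3, 3))     # stand-in for a learned 3x3 kernel

# conv(shift(x)) == shift(conv(x)): translation equivariance.
a = convolve(np.roll(img, 1, axis=1), kern, mode="wrap")
b = np.roll(convolve(img, kern, mode="wrap"), 1, axis=1)
print(np.allclose(a, b))      # True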
#48 | ABDO (Registered User) | 16th November 2017, 12:24
Quote:
Originally Posted by madshi
Maybe someone else can try upscaling the clown image shifted by 1 pixel? Or window boxed by 1 pixel? Or mirrored or rotated?
Mirrored [image]
Rotated 1 [image]
Rotated 2 [image]
#49 | madshi (Registered Developer) | 16th November 2017, 12:57
Very good, thank you! I've de-rotated/de-mirrored your images and here they are for easy comparison:

no modification | rotated left | rotated right | mirrored horizontally [images]

If you compare these images, you'll see that the texture changes a lot in all 4 images; it's completely different in each frame. It still has an overall similar look to it, but the changes are much bigger than any dithering, so in motion this will look extremely noisy and unstable.

For still images it might not matter too much, but for video this type of "texture hallucination" is IMHO currently not feasible, because it is not stable when the image content changes slightly. I'm not sure whether the algorithm could be changed to fix this problem. I doubt it, because the algorithm by design doesn't even try to restore the original texture (which is technically impossible, anyway); it just tries to hallucinate a texture which hopefully looks similar to the texture the original hi-res image had before downscaling. So by design the algorithm is not able to maintain a stable position for the texture in motion.

Even worse, if you look at the very bottom of the image, the asphalt texture changes its brightness very strongly from frame to frame, which will produce visible flickering in motion. That said, these brightness fluctuations should be fixable with better neural network training.
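To put a number on the instability instead of eyeballing it, one could compare the aligned upscales directly. A minimal sketch (filenames are hypothetical; the upscale of the mirrored input is flipped back before comparison):

Code:
import numpy as np
from PIL import Image

ref = np.asarray(Image.open("clown_4x.png"), dtype=np.float64)
mir = np.asarray(Image.open("clown_mirror_4x.png")
                 .transpose(Image.Transpose.FLIP_LEFT_RIGHT),
                 dtype=np.float64)

diff = np.abs(ref - mir)      # per-pixel disagreement between variants
print(f"mean: {diff.mean():.2f}/255  max: {diff.max():.0f}/255")

A stable classic scaler would show differences near zero here; large values confirm that the hallucinated texture moves around.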
#50 | ABDO (Registered User) | 16th November 2017, 13:59
Quote:
Originally Posted by madshi
but the changes are much bigger than any dithering, so in motion this will look extremely noisy and unstable.
That is very sad news.

For still images it really does very well. I have tried most of these single-image super-resolution methods, and most of them give results similar to the "boring" version; only SRGAN gives results nearly as sharp as the "magic" version, but with less accuracy in the hallucinated texture detail.

For video, though, I think this multi-frame approach (Detail-revealing Deep Video Super-Resolution) gives better results than any single-image super-resolution:
https://github.com/jiangsutx/SPMC_VideoSR
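The reason multiple frames help can be shown with a toy shift-and-add example: four low-res frames sampled at different pixel offsets (which camera or object motion provides for free) together contain a full high-res frame. A NumPy sketch of the idea (real video SR like SPMC has to estimate the motion, of course):

Code:
import numpy as np

rng = np.random.default_rng(0)
hi = rng.random((8, 8))                 # toy ground-truth high-res frame

# Four low-res "frames", each sampling at a different pixel offset.
offsets = [(0, 0), (0, 1), (1, 0), (1, 1)]
frames  = [hi[dy::2, dx::2] for dy, dx in offsets]

# Shift-and-add: interleave the samples back onto the fine grid.
rec = np.empty_like(hi)
for (dy, dx), f in zip(offsets, frames):
    rec[dy::2, dx::2] = f

print(np.allclose(rec, hi))             # True: 4 LR frames = 1 HR frame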

#51 | huhn (Registered User) | 16th November 2017, 14:02
I was thinking of doing the extra work of taking 24-48 frames from a video file, scaling them all with this scaler, and turning them into a lossless RGB video file to show the result in motion, but the rotated images already show the issue well enough.
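For reference, the extraction and lossless reassembly parts of that pipeline are straightforward with ffmpeg; a sketch (paths, frame count and frame rate are examples), with the manual upscaling step in between:

Code:
import subprocess

# Extract 48 frames as lossless PNGs.
subprocess.run(["ffmpeg", "-i", "input.mkv", "-vframes", "48",
                "frame%03d.png"], check=True)

# ... upscale each PNG with the web service by hand ...

# Reassemble the upscaled frames into a lossless RGB video.
subprocess.run(["ffmpeg", "-framerate", "24", "-i", "up%03d.png",
                "-c:v", "libx264rgb", "-qp", "0", "motion_test.mkv"],
               check=True)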
#52 | madshi (Registered Developer) | 16th November 2017, 14:51
Quote:
Originally Posted by HolyWu
Here is another example with significant artifacts.
How many neurons did you use for NNEDI3? I'm quite surprised by how much better NGU-AA (medium!!) quality is compared to NNEDI3 with this test image. I find NGU-AA to be not only more detailed, but it also has fewer ringing artifacts and looks more natural at the same time! I guess I should ask (in the madVR thread) once more whether NNEDI3 is still needed/useful.

Quote:
Originally Posted by ABDO
For video, though, I think this multi-frame approach (Detail-revealing Deep Video Super-Resolution) gives better results than any single-image super-resolution:
https://github.com/jiangsutx/SPMC_VideoSR
Yes, it looks good. It would be interesting to see how it handles typical movie sources. Usually these types of algorithms are too slow for real-time use, though.

Quote:
Originally Posted by huhn
I was thinking of taking 24-48 frames from a video file, scaling them all with this scaler, and turning them into a lossless RGB video file to show the result in motion, but the rotated images already show the issue well enough.
The RGB video would have been quite interesting, but it would have required the help of 5-10 users, because the website limits each user to a maximum of 5 upscales.
#53 | zub35 (Registered User) | 16th November 2017, 14:55
HolyWu, madshi

crowd_run in motion:
original 1 2 3 4 5 (Lanczos downscale)
waifu2x 1 2 3 4 5
letsenhance 1 2 3 4 5
Quote:
If the system detects that you uploaded a compressed JPEG, it automatically applies an anti-JPEG neural network. While it is highly efficient at removing JPEG artifacts, it can also lead to some blurring. This is anticipated.
If you would like to avoid it, you can save your JPEG as PNG-24. This will skip the anti-JPEG pass, as well as preserve the JPEG noise. If you are sure that this is something you need, we advise doing the PNG-24 thing.
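Re-saving a JPEG as PNG to skip the anti-JPEG pass is a one-liner with Pillow (filenames are examples):

Code:
from PIL import Image

# PNG upload bypasses the site's anti-JPEG network; the JPEG
# noise is preserved, as the site itself warns.
Image.open("source.jpg").convert("RGB").save("source.png")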

#54 | madshi (Registered Developer) | 16th November 2017, 15:17
@zub35, are you involved in that website? Or how did you manage to create more than 5 images? Would you mind sharing the original crowd_run video? I think I only have an image, but not the video.

On a positive note, the texture of the big tree in the middle is relatively stable in motion. It's not completely stable, but it's better than I expected. However, the crowd in the far (right) background switches between blurry and sharp from one frame to the next, and some of the faces in the crowd look really scary when enhanced with the hallucinated textures. Overall, I still think this algorithm won't work well for video.
#55 | zub35 (Registered User) | 16th November 2017, 15:23
madshi: No, I am just an observer like you. As for uploading more than 5 images: that is limited only by the number of your email addresses and your conscience.
crowd_run and more test samples: https://media.xiph.org/video/derf/

#56 | madshi (Registered Developer) | 16th November 2017, 15:33
Ah cool, thanks for the link. Some of those videos look familiar to me.
#57 | zub35 (Registered User) | 16th November 2017, 16:16
Quote:
Originally Posted by zub35
crowd_run in motion:
original 1 2 3 4 5 (Lanczos downscale)
waifu2x 1 2 3 4 5
letsenhance 1 2 3 4 5
Encoded with x264, medium preset, CRF 25:
original 1 2 3 4 5
lanczos4 1 2 3 4 5
letsenhance PNG 1 2 3 4 5
letsenhance JPG (q100) 1 2 3 4 5

HolyWu: Maybe you used the 1080p sample. Unfortunately, that is a poor-quality downscale; use the 2160p sample.

#58 | madshi (Registered Developer) | 16th November 2017, 16:27
How does it compare to madVR's "remove compression artifacts"?
#59 | zub35 (Registered User) | 16th November 2017, 16:42
madshi: Try it yourself, since I may make mistakes with the choice of filters. The file to download is in the post above.
P.S.: check your private messages.

UPD: From a technical standpoint, it would be better to train a new neural network for restoring/removing artifacts by comparing, frame by frame, the source file against compressed versions of it (several thousand variations, covering different codecs and settings), and then apply this neural network before upscaling, or combine the two into one big neural network.
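Generating such training pairs is mechanical; a sketch using ffmpeg (the codec choice and CRF range are assumptions):

Code:
import subprocess

# Degraded/clean pairs: re-encode a pristine source at many quality levels.
for crf in range(18, 42, 2):
    subprocess.run(["ffmpeg", "-y", "-i", "clean.mkv",
                    "-c:v", "libx264", "-crf", str(crf),
                    f"degraded_crf{crf}.mkv"], check=True)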

Building on the above, and provided the algorithm is stable: the minimum data needed for fast/real-time operation of the neural network in players could be added to the video container (MKV), or shipped as a separate *.neural file.
That way there would be no need to create a new video standard; it could be applied on top of existing AVC or HEVC streams, while retaining the ability to play the video on devices without the processing power for the neural network.
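MKV already supports arbitrary attachments, so shipping per-title model data would need no new standard; a sketch with mkvmerge ("model.neural" is a hypothetical file):

Code:
import subprocess

# Attach the model data; players that understand it can use it,
# all others simply ignore the attachment.
subprocess.run(["mkvmerge", "-o", "output.mkv",
                "--attachment-mime-type", "application/octet-stream",
                "--attach-file", "model.neural",
                "input.mkv"], check=True)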

#60 | ABDO (Registered User) | 16th November 2017, 22:34
Quote:
Originally Posted by madshi
I find NGU-AA to be not only more detailed, but it also has fewer ringing artifacts and looks more natural at the same time! I guess I should ask (in the madVR thread) once more whether NNEDI3 is still needed/useful.
For my part, I have not used NNEDI3 since NGU-AA came to madVR; NGU-AA looks more natural, as you said, and is much faster at the same time.
Quote:
Originally Posted by madshi
It would be interesting to see how it handles typical movie sources. Usually these types of algorithms are too slow for real-time use, though.
Yeah, it is too slow for real-time use. I will train the network soon if the authors do not release a pretrained model, and I hope it gives good results on typical movie sources.

Quote:
Originally Posted by madshi
How does it compare to madVR's "remove compression artifacts"?
It is roughly equal to madVR's RCA at strength 6, but madVR is also much, much faster. I will upload some comparison images soon.
Edit: all images upscaled with NGU Sharp VH.

JPEG source:
https://postimg.org/image/pcaefvqfd/

madVR RCA 6, high, 1080p:
https://postimg.org/image/qtvurl0cp/

letsenhance anti-JPEG, source resolution:
https://postimg.org/image/qv5skkewp/

letsenhance anti-JPEG, 1080p:
https://postimg.org/image/ay70o19pl/

I think that while RCA at strength 6 treats JPEG artifacts about as effectively as letsenhance's anti-JPEG, it is clear that madVR's RCA excels at preserving more sharpness and texture detail. Note that madVR's RCA gives this result in real time, and for now RCA definitely does a better job than letsenhance's anti-JPEG, since we have no control over the anti-JPEG strength. (Sorry if I misunderstood your question.)
