Old 10th October 2015, 05:17   #41  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
I further increased performance by using DirectXMath DirectX::PackedVector::XMConvertFloatToHalfStream instead of D3DXFloat32To16Array.

It went from 18.5fps to 20fps. CPU usage at only 40%.

ConvertToFloat is faster when calculating in INT, but ConvertFromFloat is faster with FLOAT than with INT.
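For readers who have not met these conversion routines: both functions named above turn 32-bit floats into 16-bit half-precision values. Below is a minimal Python sketch of that kind of conversion, using the standard `struct` module as a stand-in for DirectXMath. The helper names and the sample pixel values are made up for illustration and are not part of the plugin.

```python
import struct

def floats_to_half_bytes(values):
    """Pack Python floats as IEEE 754 half-precision values (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)

def half_bytes_to_floats(data):
    """Unpack half-precision bytes back into Python floats."""
    return list(struct.unpack(f"<{len(data) // 2}e", data))

# Hypothetical 8-bit pixel values, normalised to the 0..1 range.
source_pixels = [0, 16, 128, 235, 255]
as_float = [v / 255.0 for v in source_pixels]

packed = floats_to_half_bytes(as_float)      # 2 bytes per value
restored = half_bytes_to_floats(packed)

# Half precision is coarse, but fine enough to recover 8-bit values.
recovered_pixels = [round(v * 255.0) for v in restored]
```

Half floats take half the memory of 32-bit floats, which is presumably why they are attractive for textures, at the cost of precision that is still sufficient for 8-bit sources.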
Old 5th November 2015, 04:19   #42  |  Link
luquinhas0021
The image enthusyast
 
Join Date: Mar 2015
Location: Brazil
Posts: 270
SuperRes doesn't work like typical resize algorithms. It runs around other resizers. So let's say you want to use NNEDI3, it takes the larger image and the original image, does a Bicubic resize on the enlarged image to size it back down to the original, and then creates a difference map between the two, showing the details that were lost during the upsizing. From that diff map, it restores details and edges that were lost. Brilliant idea. It makes even basic resizers like Bilinear look decent.

I'm not saying your work is bad, but to me it looks more like detail restoration than the detail "addition" that super resolution does. On a single image, SR works by searching for similar patterns and the slightly different details in each of them.
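The loop described in the quoted text (upscale, size back down, diff against the original, feed the diff back) can be sketched on a 1-D signal. This is only a toy illustration, not the actual SuperRes shader: linear interpolation stands in for NNEDI3, pair averaging stands in for Bicubic, and all function names are hypothetical. `passes` and `strength` mirror the parameter names used in this thread.

```python
def upscale2x(signal):
    """Double the sample count with linear interpolation (stand-in for NNEDI3)."""
    out = []
    for i, value in enumerate(signal):
        nxt = signal[min(i + 1, len(signal) - 1)]
        out.append(value)
        out.append((value + nxt) / 2.0)
    return out

def downscale2x(signal):
    """Halve the sample count by averaging pairs (stand-in for Bicubic)."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]

def super_res(original, passes=2, strength=0.42):
    """Refine an upscaled signal so that it downsizes back to the original."""
    enlarged = upscale2x(original)
    for _ in range(passes):
        shrunk = downscale2x(enlarged)
        # Difference map: what was lost between the original and the round trip.
        diff = [o - s for o, s in zip(original, shrunk)]
        # Bring the difference map up to the enlarged size and add part of it back.
        diff_up = upscale2x(diff)
        enlarged = [e + strength * d for e, d in zip(enlarged, diff_up)]
    return enlarged

def reconstruction_error(original, enlarged):
    """Sum of absolute differences between the original and the downsized result."""
    shrunk = downscale2x(enlarged)
    return sum(abs(o - s) for o, s in zip(original, shrunk))

edge = [0.0, 0.0, 1.0, 1.0, 0.0, 0.0]
plain_error = reconstruction_error(edge, upscale2x(edge))
refined_error = reconstruction_error(edge, super_res(edge))
```

In this toy setup the refined result downsizes to something closer to the original than the plain upscale does, which is the sense in which the method restores detail rather than inventing it.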
__________________
Searching for great solutions
Old 6th November 2015, 00:11   #43  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
New version v0.9.1 is released. It greatly reduces memory usage!! This version allows running several shaders in a row by creating command chains and calling ExecuteShader() at the end.

https://github.com/mysteryx93/AviSynthShader

As for SuperRes, it cannot yet fully benefit from this as I'm still missing a Bicubic downscaling shader that needs to be run in the middle. I can only combine 2 of the shader calls (twice if doing 2 passes), yet that's enough to considerably reduce memory usage. ConvertToFloat and ConvertFromFloat have also been modified to reduce memory usage.

With this version, you'll be able to run 8 threads without any issue.

If I can get a Bicubic downscaler, then we could remove unnecessary ConvertFromFloat and ConvertToFloat, as well as chain all of the commands to run at once, which would greatly improve memory usage and performance.

This Cubic code would work for Bicubic upscaling, but Bicubic downscaling requires a few tweaks. I can't do this as I know nothing about HLSL programming.
https://github.com/zachsaw/MPDN_Exte...er/Chroma.hlsl

Last edited by MysteryX; 6th November 2015 at 00:15.
Old 6th November 2015, 00:42   #44  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
Here's a comparison of the image quality.

Original
Spline16
NNEDI3(nns=4)
NNEDI3(nns=4)+SuperRes(passes=2, strength=.42)


The results speak for themselves. It makes the image sharper without creating any artificial details.

If I eventually get a Bicubic HLSL downscaler, there might be a 'slight' further quality improvement.

Last edited by MysteryX; 6th November 2015 at 02:02.
Old 6th November 2015, 01:05   #45  |  Link
wonkey_monkey
Formerly davidh*****
 
Join Date: Jan 2004
Posts: 2,496
Quote:
Originally Posted by MysteryX View Post
Here's a comparison of the image quality.

Result speak for themselves.
I find it very hard to see any difference. I don't think your choice of test image was a very good one - it doesn't have a lot of detail and looks very JPEGy.
__________________
My AviSynth filters / I'm the Doctor
Old 6th November 2015, 01:10   #46  |  Link
luquinhas0021
The image enthusyast
 
Join Date: Mar 2015
Location: Brazil
Posts: 270
MysteryX, I didn't say that any SR algorithm creates artificial details. What I meant is that common SR algorithms analyze a set of neighbouring frames or, in a single image, many similar patterns, in order to, let's say, replicate details from one frame to another.
Your algorithm first upscales the image using some algorithm, then downscales it with Bicubic and compares it to the original image, creates a difference map, upscales the missing parts and pastes them into the upscaled image. That is what I understood.
But, generally, upscaling loses detail, and so does downscaling. So the difference map will probably show a lot of difference. What I really want to know is how the details that are in the original image, but not in the downscaled upscaled image, are pasted into the upscaled image, i.e., how these details are upscaled.
David Horman, I don't see much difference either. The most I can see is that some ringing disappears in the SuperRes image.
Old 6th November 2015, 01:20   #47  |  Link
Bloax
The speed of stupid
 
Join Date: Sep 2011
Posts: 317

Here's a nice still image if you need one. :-)
Old 6th November 2015, 01:46   #48  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
Here are some tests with the lighthouse and clown.

Original
Spline16
NNEDI3(nns=4, cshift="Spline16Resize")
NNEDI3+SuperRes(passes=2, strength=.42)
SuperRes+nnedi3_rpow2(rfactor=2, nsize=0, nns=4, qual=2, etype=0, cshift="SincResize", ep0=4, threads=0, opt=0, fapprox=0)
NNEDI3(nns=4, cshift="Spline16Resize")+SuperRes(passes=3, Strength=1, Softness=.85)







I see more difference between NNEDI3 and NNEDI3+SuperRes than between Spline16 and NNEDI3. There's also something funny happening with the reds: they are different, but they actually look better with SuperRes. In one video, a chair was half plain red (color cropping), and after passing it through SuperRes the texture of the chair somehow came back and it looked more like a chair. It must have to do with the way it's doing color conversion, but it's accidental. It seems to 'sometimes' recover cropped colors. Somehow. Another time I saw it turn overflow colors into some other color.

Last edited by MysteryX; 6th November 2015 at 02:43.
Old 6th November 2015, 02:00   #49  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
luquinhas0021, I can't explain the technical details of how it does its job internally, but in SuperRes.avsi you can see the diff map by returning the output of SuperResDiff.cso instead of processing the image with it.

Bloax, here's the result with your image


Last edited by MysteryX; 6th November 2015 at 02:40.
Old 6th November 2015, 02:06   #50  |  Link
luquinhas0021
The image enthusyast
 
Join Date: Mar 2015
Location: Brazil
Posts: 270
Nice, MysteryX. The differences between NNEDI3 upscaling and NNEDI3+SuperRes upscaling, at least the ones I notice, are less haloing and more sharpness. What about using Sinc4 and Lanczos4 and applying super resolution to each one? I believe you applied SR to nnedi3 nns=4 with default parameters. What would happen if SR were applied to this script?

nnedi3_rpow2(rfactor=2, nsize=0, nns=4, qual=2, etype=0, cshift="SincResize", ep0=4, threads=0, opt=0, fapprox=0)
Old 6th November 2015, 02:24   #51  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
I was using
nnedi3_rpow2(2, nns=4, cshift="Spline16Resize", Threads=2)

I have added the results with Sinc above.
Old 6th November 2015, 02:31   #52  |  Link
luquinhas0021
The image enthusyast
 
Join Date: Mar 2015
Location: Brazil
Posts: 270
It came out sharper than all the other algorithms, as I expected! Later I will test your SR algorithm with Lanczos4, because Sinc produces a lot of ringing.
Only one question: have you already tried etype=1 (minimize squared error)? Compared with etype=0, which do you prefer?

I JUST SAW YOU USED SINC WITH EP0=4.

Last edited by luquinhas0021; 6th November 2015 at 02:39.
Old 6th November 2015, 02:39   #53  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
I copy/pasted your command. I personally prefer Spline16 over Sinc, which looks more artificially processed.

I'm adding another test: NNEDI3(spline16)+SuperRes(Passes=3, Strength=1, Softness=.85)

Shiandow added Softness for the purpose of being able to use higher strength and passes and then softening it down. So far I wasn't convinced... I'll see better with this test. In my previous tests, these settings looked good too, but not better than Passes=2, Strength=.42. Let's see what we get! HD pictures might make it better.

OK. With Softness, Clown and Eclipse look a LOT better, but it makes Lighthouse look like a painting. It works for some content but not all. Passes=2 with Strength=.42 gives more consistent results.

Last edited by MysteryX; 6th November 2015 at 02:46.
Old 6th November 2015, 02:49   #54  |  Link
luquinhas0021
The image enthusyast
 
Join Date: Mar 2015
Location: Brazil
Posts: 270
The settings you just posted generate more ringing in the letters of the Eclipse image, but increase sharpness in the Clown image. Maybe using 3 passes of SuperRes is not a great idea for all images. I suggest you use my script but, instead of (...cshift="SincResize", ep0=4...), use (...cshift="Spline144resize"...) with passes=2, strength=1 and softness=0.4.
In the Lighthouse image, your last script generated aliasing in some parts.

Last edited by luquinhas0021; 6th November 2015 at 02:54.
Old 6th November 2015, 07:07   #55  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
Isn't Spline144 a broken algorithm? You're free to play with the algorithms and post your results.

Here are some tests with the lighthouse with NNEDI3(nns=4, cshift="Spline16Resize")

SuperRes(passes=2, strength=XXX, softness=0)
Strength=30, 40, 50, 60, 100


SuperRes(passes=3, strength=1, softness=XXX)
Softness=30, 40, 50, 60, 70, 80, 90, 100


Which one is your favorite?

When playing with madVR, I found that NEDI+SuperRes were doing a good job together; would be worth a try to see how it compares to NNEDI3. Jinc+SuperRes, however, don't go well together.

SuperRes(passes=2, strength=.4, softness=0)

NNEDI3(nns=4, cshift="Spline16Resize")
NNEDI2(cshift="Spline16Resize")
NNEDI3(nns=3, cshift="Spline16Resize")
NNEDI3(nns=1, cshift="Spline16Resize")


Honestly... NNEDI2 does ALMOST as well as NNEDI3(nns=4), but NNEDI3 with lower NNS gets blurrier. NNEDI2 gives a sharp output. The only downside is distortion on the white bars.

Last edited by MysteryX; 6th November 2015 at 07:40.
Old 6th November 2015, 22:24   #56  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
I have added a variant of SuperRes that does the YUV conversion via shaders. Performance is slightly lower, but the quality of colors is better. When doing YUV conversion on the CPU, I get 21fps. However, it is doing Rec.601 color conversion on Rec.709 content! When doing YUV conversion via shaders, I get 17fps (including processing NNEDI3). The first implementation might cause a very slight color distortion, and the 2nd implementation makes the colors more vivid.
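To show why decoding Rec.709 content with Rec.601 coefficients shifts the colors, here is a small Python sketch. It is not the plugin's conversion code: it uses normalised full-range values for simplicity (real video is normally limited range), and the function names are made up. Only the luma coefficients come from the two standards.

```python
# (Kr, Kb) luma coefficients from each standard.
REC601 = (0.299, 0.114)
REC709 = (0.2126, 0.0722)

def rgb_to_yuv(r, g, b, coeffs):
    """Encode R'G'B' (0..1) to Y'CbCr with y in 0..1 and u/v in -0.5..0.5."""
    kr, kb = coeffs
    y = kr * r + (1.0 - kr - kb) * g + kb * b
    u = (b - y) / (2.0 * (1.0 - kb))
    v = (r - y) / (2.0 * (1.0 - kr))
    return (y, u, v)

def yuv_to_rgb(y, u, v, coeffs):
    """Decode Y'CbCr back to R'G'B' using the given luma coefficients."""
    kr, kb = coeffs
    r = y + 2.0 * (1.0 - kr) * v
    b = y + 2.0 * (1.0 - kb) * u
    g = (y - kr * r - kb * b) / (1.0 - kr - kb)
    return (r, g, b)

pure_red = (1.0, 0.0, 0.0)
encoded = rgb_to_yuv(*pure_red, REC709)

matched = yuv_to_rgb(*encoded, REC709)     # same matrix both ways
mismatched = yuv_to_rgb(*encoded, REC601)  # Rec.601 decode of Rec.709 data
```

With matching coefficients the round trip returns the original red. With the mismatched decode, the red channel comes back lower and green goes slightly negative, so saturated reds are the colors most visibly affected.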

To use this variant, use file SuperResYUV.avsi

Here's the comparison. Using SuperRes(passes=2, strength=.42, softness=0)

CPU conversion / GPU conversion








EDIT: Now this is embarrassing and strange... when I put both versions side by side on my computer, I can clearly see a difference. But once converted to PNG, I honestly can't see any difference at all! Perhaps whatever color this makes is being discarded by PNG compression? But PNG is supposed to be lossless. Not sure on this one.

Last edited by MysteryX; 6th November 2015 at 22:35.
Old 6th November 2015, 22:40   #57  |  Link
luquinhas0021
The image enthusyast
 
Join Date: Mar 2015
Location: Brazil
Posts: 270
I don't know if you see the same thing I do, but there's a strange effect in the Lighthouse image when using softness with passes=3: the higher the softness, the very slightly sharper the photo gets. I don't know why!
You asked which parameters I like. To play it safe, I prefer passes=2, strength=1, softness=0. With passes=3 it looks like the algorithm blurs the photo, and, with passes=3, softness then sharpens it. I may be wrong, but this is what I've seen.
Old 6th November 2015, 23:48   #58  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
Yeah, I don't like the effect of Softness, it's not working well. Shiandow might improve it in the future. For now, Passes=2 and Strength=.42 works best for noisy material, and for clear images, you can get away with 2 passes of strength up to 1.

I figured out what I did wrong in the last test: I was upscaling with nns=1 and comparing with the previous results that had nns=4!

Here's a new test of frame quadrupling.

Spline16
edi_rpow2(2, nns=4, cshift="Spline16Resize") uses NNEDI3 but fixes a few details
SuperRes(passes=2, strength=.42)
SuperRes (YUV conversion done on GPU)
SuperRes (YUV conversion done on GPU) + NNEDI2





Interestingly enough, the 3rd one (which is the one I had been testing before) has some distortion with frame quadrupling! Not sure where that's coming from... The newer implementation doesn't have that distortion... or is that the distortion I saw when using NNEDI2? Here's also with NNEDI2 to compare.

EDIT: NNEDI2 is *slower* than NNEDI3(nns=4), so we can discard it, although it gives *almost* identical results. The new implementation with YUV conversion on the GPU does give considerably better image quality. There's a lot less color distortion on the Lighthouse.

Last edited by MysteryX; 7th November 2015 at 05:35.
Old 7th November 2015, 03:39   #59  |  Link
luquinhas0021
The image enthusyast
 
Join Date: Mar 2015
Location: Brazil
Posts: 270
Can you give me the file I should put in the AviSynth plugin folder? Or can you tell me how to install your algorithm in AviSynth?
Old 7th November 2015, 05:48   #60  |  Link
MysteryX
Soul Architect
 
Join Date: Apr 2014
Posts: 2,559
I have improved the performance of the YUV conversion via shaders.

You'll need Shader.dll and the AVSI and CSO files within "Shaders\SuperRes". You might have to specify the "folder" argument to tell SuperRes.avsi where to find all the CSO files. Ideally I'll want to automate that parameter.

SuperRes converts colors on the CPU, while SuperResYUV converts via Shaders. I'll probably remove the CPU-conversion implementation as the other one gives better quality, and now performance is similar. It "was" slower, but then the CPU usage was also lower, so you could just increase the number of threads to make up for it. Now it runs even better.