Old 13th December 2014, 09:23   #61  |  Link
Orf
YAP author
 
Join Date: Jul 2014
Location: Russian Federation
Posts: 111
Quote:
Originally Posted by leeperry View Post
OK thanks for the reply, so basically ACDSee-like features(previous/next/random/shrink to fit/1:1 scaling) with NNEDI3 and Jinc3 aren't gonna happen in YAP?
No, what you've quoted means that madVR in its current state can't be used to accomplish this. But I think this idea of yours is nice, I like it a lot and I want to implement it anyway. So I've already started adding pixel shader support on my own. When it is finished, it will let you apply any combination of HLSL sources to images, with madVR and, in the future, with EVR CP.

Also, I've come to the understanding that compute shaders are a better fit for this purpose than pixel shaders. Can anyone tell me, am I right that NNEDI3 is currently only available here @doom9 as pixel shader HLSL sources?

@madshi
I can't reproduce the F11/F12 bug you reported from your description. Maybe you have something more to report? Does it always happen for you?

Last edited by Orf; 13th December 2014 at 09:43.
Old 13th December 2014, 16:15   #62  |  Link
Shiandow
Registered User
 
Join Date: Dec 2013
Posts: 753
Quote:
Originally Posted by Orf View Post
Also, I've come to the understanding that compute shaders are a better fit for this purpose than pixel shaders. Can anyone tell me, am I right that NNEDI3 is currently only available here @doom9 as pixel shader HLSL sources?
As far as I know it's available as Avisynth (.avsi) or OpenCL (.cl) files. MadVR uses DirectCompute but I'm not sure if madshi ever made that code public.

Edit: I was mistaken. It seems that MadVR also uses OpenCL. And I think part of the code actually comes with MadVR (in the folder 'legal stuff').

Last edited by Shiandow; 13th December 2014 at 16:26.
Old 13th December 2014, 16:19   #63  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,344
NNEDI3 is done in OpenCL in madVR as well. Only the dithering shaders are DirectCompute, afaik.
Old 13th December 2014, 16:24   #64  |  Link
Shiandow
Registered User
 
Join Date: Dec 2013
Posts: 753
Oh, I think you're right, the changelog mentions that NNEDI3 needs OpenCL.
Old 13th December 2014, 16:29   #65  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
I've tried converting NNEDI3 to DirectCompute, but it performed *much* slower than with OpenCL, unlike error diffusion, which was actually slightly faster with DirectCompute. So nevcairiel is right: NNEDI3 is done in OpenCL, error diffusion in DirectCompute.

@Orf, unfortunately I don't have any time atm. But I know that it was 100% reproducible for me when I reported the problem.
Old 13th December 2014, 19:36   #66  |  Link
Orf
YAP author
 
Join Date: Jul 2014
Location: Russian Federation
Posts: 111
Shiandow, nevcairiel, madshi,
first of all, thanks guys for your comments, they are really helpful for finding my way in the dark. But on second thought, and maybe I'm missing something here, why did all of you start talking in one voice about OpenCL vs DirectCompute, when I initially asked about pixel shaders vs compute shaders?
Old 14th December 2014, 10:56   #67  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by Orf View Post
on second thought, and maybe I'm missing something here, why did all of you start talking in one voice about OpenCL vs DirectCompute, when I initially asked about pixel shaders vs compute shaders?
Because you were asking about NNEDI3 GPU implementations, and you were asking about DirectCompute. So we tried to explain to you that for NNEDI3, you'd better be using OpenCL because it's dramatically faster.
Old 15th December 2014, 06:39   #68  |  Link
Orf
YAP author
 
Join Date: Jul 2014
Location: Russian Federation
Posts: 111
madshi, I agree, maybe I didn't ask the question in the correct way. I'll try to correct myself now. As far as I understand, besides using the DirectCompute/OpenCL APIs there's a third way to do it: simply drawing a quad via Direct3D and applying a pixel shader to it. Did you test whether that method is any faster than OpenCL?
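
(For illustration, here is a minimal sketch of what that "quad + pixel shader" route looks like as a PS3.0 HLSL shader; the sampler and entry point names are made up, not taken from madVR, MPDN or MPC-HC.)

Code:
// The host draws a full-screen quad and this shader runs once per
// destination pixel. It can read textures, but it cannot share work or
// intermediate results with neighbouring pixels.
sampler2D srcSampler : register(s0);   // source image, bound by the player

float4 main(float2 uv : TEXCOORD0) : COLOR0
{
    // Plain pass-through; a real scaler would take several taps around uv here.
    return tex2D(srcSampler, uv);
}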
Old 15th December 2014, 08:09   #69  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 10,344
Quote:
Originally Posted by Orf View Post
As far as I understand, besides using the DirectCompute/OpenCL APIs there's a third way to do it: simply drawing a quad via Direct3D and applying a pixel shader to it. Did you test whether that method is any faster than OpenCL?
Pixel Shaders are much more limited, and something as complex as NNEDI3 is unlikely to be possible with pixel shaders alone.
Old 15th December 2014, 08:15   #70  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by Orf View Post
madshi, I agree, maybe I didn't ask the question in the correct way. I'll try to correct myself now. As far as I understand, besides using the DirectCompute/OpenCL APIs there's a third way to do it: simply drawing a quad via Direct3D and applying a pixel shader to it. Did you test whether that method is any faster than OpenCL?
I actually did try to do NNEDI3 via PS3.0 pixel shaders, and from what I remember, it was slower by a factor of around 1000x, compared to OpenCL.

The reason why pixel shaders are so much slower than OpenCL is that pixel shaders apply the math to every destination pixel separately. OpenCL and DirectCompute are more flexible: you can configure them to render multiple destination pixels with one kernel pass. Doing that allows you to cleverly cache things and to share some calculations between multiple pixels, etc. Especially for NNEDI3 that's very important to get things up to speed.
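
(As an illustration of that difference, here is a small DirectCompute-style HLSL sketch with made-up resource names and a 16x16 tile; it is not madVR's actual kernel. Each thread group loads a tile of the source into group-shared memory once, and every thread can then reuse its neighbours' fetches, something a pixel shader cannot do.)

Code:
Texture2D<float4>   src : register(t0);
RWTexture2D<float4> dst : register(u0);

groupshared float4 tile[16][16];    // source pixels cached for the whole thread group

[numthreads(16, 16, 1)]
void main(uint3 tid : SV_GroupThreadID, uint3 dtid : SV_DispatchThreadID)
{
    // Cooperative load: each thread fetches one source pixel into the cache
    // (border and bounds handling omitted for brevity).
    tile[tid.y][tid.x] = src[dtid.xy];
    GroupMemoryBarrierWithGroupSync();

    // Neighbourhood taps now read the shared cache instead of re-sampling the
    // texture, and calculations can likewise be shared between pixels.
    uint2 n = uint2(min(tid.x + 1u, 15u), tid.y);
    dst[dtid.xy] = 0.5 * (tile[tid.y][tid.x] + tile[n.y][n.x]);
}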
Old 16th December 2014, 05:32   #71  |  Link
Orf
YAP author
 
Join Date: Jul 2014
Location: Russian Federation
Posts: 111
nevcairiel, madshi, thanks
I do understand that OpenCL/DirectCompute is more powerful, which is why I initially asked.
But this NEDI/NNEDI business is really confusing me. To summarize what I've learned from you:
- NNEDI3 is the heaviest algorithm, but it gives the best result in the end
- The NEDI implemented here, for example, is less heavy, so using PS is acceptable
- madVR internally should have at least two image processing pipelines: #1 is a PS pipeline, #2 is an OpenCL pipeline. It possibly also has a DirectCompute pipeline as #3
- IMadVRExternalPixelShaders supports only #1 (?)
- To support general and flexible image processing, all three pipelines have to be implemented
- Which one of the three is better is kind of an open question. Also, picking and implementing only one of them will require rewriting the hlsl/cl code (something I'm very unlikely to be able to do myself)

Am I still missing something?
Old 16th December 2014, 09:28   #72  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Quote:
Originally Posted by Orf View Post
But this NEDI/NNEDI business is really confusing me. To summarize what I've learned from you:
- NNEDI3 is the heaviest algorithm, but it gives the best result in the end
- The NEDI implemented here, for example, is less heavy, so using PS is acceptable
- madVR internally should have at least two image processing pipelines: #1 is a PS pipeline, #2 is an OpenCL pipeline. It possibly also has a DirectCompute pipeline as #3
- IMadVRExternalPixelShaders supports only #1 (?)
- To support general and flexible image processing, all three pipelines have to be implemented
- Which one of the three is better is kind of an open question. Also, picking and implementing only one of them will require rewriting the hlsl/cl code (something I'm very unlikely to be able to do myself)
Seems all correct to me.

Although the names suggest otherwise, NNEDI3 and NEDI are *totally* different algorithms, which have almost nothing in common (except for doing an exact 2x upscale). IMO NNEDI3 has better image quality, but it's also quite a bit slower than NEDI. And yes, NEDI works fine with simple PS3.0 pixel shaders, while NNEDI3 requires OpenCL to run at a decent speed.

FYI, Shiandow has written a super-res post-processing algorithm (using simple pixel shaders, once again) which improves NEDI quality even further, bringing it even nearer to NNEDI3 quality. This super-res algorithm is currently only available for NEDI, I think, but it could in theory also be used to improve other 2x upscale algorithms, e.g. NNEDI3, or even Bicubic/Lanczos. I'm not sure if this super-res algorithm would improve NNEDI3 quality, too; we haven't tried that yet, I think. But it might. I'm hoping that the super-res algorithm will sooner or later become a separate filter, running after any other 2x upscaling algorithm.
Old 16th December 2014, 12:24   #73  |  Link
Shiandow
Registered User
 
Join Date: Dec 2013
Posts: 753
Quote:
Originally Posted by madshi View Post
FYI, Shiandow has written a super-res post-processing algorithm (using simple pixel shaders, once again) which improves NEDI quality even further, bringing it even nearer to NNEDI3 quality. This super-res algorithm is currently only available for NEDI, I think, but it could in theory also be used to improve other 2x upscale algorithms, e.g. NNEDI3, or even Bicubic/Lanczos. I'm not sure if this super-res algorithm would improve NNEDI3 quality, too; we haven't tried that yet, I think. But it might. I'm hoping that the super-res algorithm will sooner or later become a separate filter, running after any other 2x upscaling algorithm.
SuperRes works for arbitrary scaling factors and arbitrary algorithms. It's not hard to combine with other scaling algorithms; it basically just needs a 'before' and an 'after' image. So far I'm having a bit of trouble with larger scaling factors, since it's hard to add detail back into the image without introducing aliasing, but I still have some ideas I could try, and MPDN's render scripts make it quite easy to try things out, so hopefully I'll be able to improve that soon.
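
(Just to illustrate the 'before'/'after' idea, here is a rough pixel shader sketch: downscale the current result back onto the source grid, compare it with the original, and push the difference back in. This is my own simplification with invented names, not Shiandow's actual SuperRes code, and it assumes the host can bind both images to one shader.)

Code:
sampler2D upscaled : register(s0);   // 'after'  : output of NEDI/NNEDI3/whatever
sampler2D original : register(s1);   // 'before' : the unscaled source image
float2    origTexelSize;             // 1.0 / original resolution
float     strength;                  // correction strength, e.g. 0.5

float4 main(float2 uv : TEXCOORD0) : COLOR0
{
    float4 hi = tex2D(upscaled, uv);

    // Crude box downscale of the current result back onto the original grid.
    float2 o  = 0.25 * origTexelSize;
    float4 lo = 0.25 * (tex2D(upscaled, uv + float2(-o.x, -o.y)) +
                        tex2D(upscaled, uv + float2( o.x, -o.y)) +
                        tex2D(upscaled, uv + float2(-o.x,  o.y)) +
                        tex2D(upscaled, uv + float2( o.x,  o.y)));

    // Whatever the downscaled result lost compared to the source is pushed
    // back into the upscaled image.
    float4 diff = tex2D(original, uv) - lo;
    return hi + strength * diff;
}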
Old 16th December 2014, 12:40   #74  |  Link
madshi
Registered Developer
 
Join Date: Sep 2006
Posts: 9,140
Ok, sounds good!
Old 16th December 2014, 14:45   #75  |  Link
Gravitator
Registered User
 
Join Date: May 2014
Posts: 292
Quote:
Originally Posted by Orf View Post
I still barely understand who you are and what you want from me.
P.S. Please remove this torrent screenshot of yours from my thread.

This is just a proposal for expanding/improving your product. The problem is in understanding the translation. It would be good to contact you via e-mail (or VKontakte).
Old 17th December 2014, 06:13   #76  |  Link
Orf
YAP author
 
Join Date: Jul 2014
Location: Russian Federation
Posts: 111
madshi,
can you please share your PS and DirectCompute versions of NNEDI3? It would be a nice example for me to estimate the differences in the code base and maybe to test performance. I had a quick look through Shiandow's SuperRes implementation; am I right in guessing that the separate PS hlsl's could theoretically be combined into one DirectCompute hlsl or one OpenCL cl?

Shiandow,
as far as I understand, SuperRes only requires that the separate hlsl's are applied in the correct order to work. So what benefits do MPDN's render scripts give you compared to the MPC-HC way of configuring shaders?

Gravitator,
I don't use any social networks. But you can use PM, I guess.
Old 17th December 2014, 12:07   #77  |  Link
Shiandow
Registered User
 
Join Date: Dec 2013
Posts: 753
Quote:
Originally Posted by Orf View Post
Shiandow,
as far as I understand, SuperRes only requires that the separate hlsl's are applied in the correct order to work. So what benefits do MPDN's render scripts give you compared to the MPC-HC way of configuring shaders?
A few weeks ago madshi asked a similar question; you may want to read my reply here. But the gist of it is that SuperRes needs to compare the current image to the original image, so you need to be able to store the original somewhere. If it weren't for some very creative use of the alpha channel, I wouldn't have been able to do SuperRes with just shaders at all.
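
(A guess at the kind of alpha-channel trick meant here, with invented pass names: when each pass can only see the previous pass's output, a value from the original image can be carried forward by parking it in the otherwise unused alpha channel. In practice these would be two separate shader passes.)

Code:
sampler2D prev : register(s0);   // output of the previous pass

// Early pass, run while the image still matches the original: keep RGB as-is
// but stash the original luma in alpha. Intermediate passes must leave alpha alone.
float4 stashOriginalLuma(float2 uv : TEXCOORD0) : COLOR0
{
    float4 c = tex2D(prev, uv);
    float  origLuma = dot(c.rgb, float3(0.299, 0.587, 0.114));
    return float4(c.rgb, origLuma);
}

// Later pass: alpha still carries the value stored earlier, so the current
// result can be compared against it without needing a second sampler.
float4 compareWithStash(float2 uv : TEXCOORD0) : COLOR0
{
    float4 c = tex2D(prev, uv);
    float  curLuma = dot(c.rgb, float3(0.299, 0.587, 0.114));
    float  diff    = c.a - curLuma;              // original minus current
    return float4(c.rgb + 0.5 * diff, c.a);      // nudge the result towards the original
}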
Old 19th December 2014, 10:03   #78  |  Link
Orf
YAP author
 
Join Date: Jul 2014
Location: Russian Federation
Posts: 111
Shiandow,
sorry for the delay, I had to check some things before answering. That's because we're implementing the same thing. Does it simply mean you need another sampler with the source image at any stage of processing? Because from what I've found here it looks more complex. Like you need samplers with the results of all previous stages, or maybe something more.
Old 19th December 2014, 10:34   #79  |  Link
Shiandow
Registered User
 
Join Date: Dec 2013
Posts: 753
There's only one step where I'd need another sampler with the source image, but more importantly that is still not quite enough to implement SuperRes. To implement SuperRes it's more or less necessary to be able to create new samplers and be able to send multiple samplers to one shader. It's technically possible to do SuperRes for one of the channels by storing things in the alpha channel, but that's not ideal.

The way this is achieved in MPDN is by building a chain of so-called 'filters', which keeps track of allocating textures and sending the right textures to the right shaders. It might seem that you can use the results from all previous stages, but under the hood it will try to allocate as few textures as possible; it also won't calculate results that aren't used, and since recently it can even optimize away unnecessary conversions (so if you have X -> ConvertToYUV -> ConvertToRGB -> Y, it will simply do X -> Y).
Old 19th December 2014, 11:02   #80  |  Link
Orf
YAP author
 
Join Date: Jul 2014
Location: Russian Federation
Posts: 111
In other words, some logic needs to be programmed anyway. I'm currently trying to understand whether creating textures is possible inside a compute shader hlsl. I can't find any useful information so far. Maybe madshi will shed some light on this matter.