Quote:
Originally Posted by Orf
madshi, I agree, maybe I did not ask the question in the right way. I will try to correct myself now. As far as I understand, besides using the DirectCompute/OpenCL APIs there is a third way to do it: simply drawing a quad via Direct3D and applying a pixel shader to it. Did you test whether that method is any faster than OpenCL?
|
I actually did try to do NNEDI3 via PS3.0 pixel shaders, and from what I remember, it was slower than OpenCL by a factor of around 1000.
The reason why pixel shaders are so much slower than OpenCL is that a pixel shader applies its math to every destination pixel separately. OpenCL and DirectCompute are more flexible: you can configure them to render multiple destination pixels with one kernel pass. Doing that lets you cache things cleverly and share some calculations between multiple pixels. For NNEDI3 especially, that is very important to get things up to speed.
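To make the caching argument a bit more concrete, here is a rough sketch in Python. It is only a toy model and not NNEDI3 or any real shader/kernel code: it uses a plain 1-D sum filter, and `TAPS`, `GROUP`, `per_pixel` and `per_group` are names and values I made up for illustration. It simply counts how many source samples get fetched when every output pixel works alone, compared to when a group of output pixels shares one tile of source data.

```python
# Toy model only: a 1-D sum filter standing in for a much larger kernel.
# TAPS and GROUP are assumed values chosen for illustration.

TAPS = 8     # hypothetical filter window size
GROUP = 16   # hypothetical number of output pixels handled per work group


def per_pixel(src):
    """Pixel-shader style: each output pixel fetches its own full window."""
    out, fetches = [], 0
    for x in range(len(src) - TAPS + 1):
        window = src[x:x + TAPS]
        fetches += len(window)
        out.append(sum(window))
    return out, fetches


def per_group(src):
    """Compute-kernel style: fetch one tile, reuse it for GROUP outputs."""
    out, fetches = [], 0
    n_out = len(src) - TAPS + 1
    for start in range(0, n_out, GROUP):
        count = min(GROUP, n_out - start)
        # The tile models data loaded once into fast local/shared memory.
        tile = src[start:start + count + TAPS - 1]
        fetches += len(tile)
        for i in range(count):
            out.append(sum(tile[i:i + TAPS]))
    return out, fetches


if __name__ == "__main__":
    src = list(range(263))  # yields 256 output pixels
    a, fetches_a = per_pixel(src)
    b, fetches_b = per_group(src)
    print("same result:", a == b)
    print("fetches per-pixel:", fetches_a, "per-group:", fetches_b)
```

Both functions produce the same output, but the grouped version fetches far fewer source samples. A real GPU has texture caches that soften the difference for the per-pixel case, and this sketch only models shared fetches, not shared intermediate calculations, so take it as an illustration of the idea and not as a performance prediction.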