Old 17th March 2005, 01:33   #11  |  Link
kassandro
Registered User
 
Join Date: May 2003
Location: Germany
Posts: 502
Corrected version with SSE support

I have removed the attachment from the first post in this thread and replaced it with a link to the corrected version. I have also updated the link from last Sunday. Both links now point to the same location. I have fixed a bug that was difficult to find (it caused no crash and its effect was hard to see). Actually only one character was incorrect, which led to wrong V values. The new plugin also contains an SSE/SSE2 version of CNRF. However, with standard YUY2 the C version of CNRF is used. The SSE version is used only with planar YUY2.

What is planar YUY2? As far as Avisynth is concerned, there is no difference between planar and the normal interleaved YUY2: both are reported as the same colour space. Only the organisation of the data in memory is quite different. Thus if you display a planar YUY2 clip, you will see only garbage. The advantage of planar YUY2 is that it can be processed almost the same way as YV12, i.e. SSE/SSE2/SSE3 can be used nicely. Virtually any YV12 filter can be extended to planar YUY2. Over the course of the next months I will extend the filters RemoveDirt and RemoveGrain to planar YUY2 and remove the support for interleaved YUY2 in RemoveDirt. In particular, the YUY2 performance of RemoveDust will improve quite substantially.
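To make the difference in data organisation concrete, here is a minimal Python sketch that regroups one row of interleaved YUY2 bytes (Y0 U0 Y1 V0 ...) so that all Y samples come first, then all U, then all V. The function names and the exact planar ordering (Y, then U, then V within each row) are assumptions for illustration only; the actual layout used by SSETools is not described in this post and may differ.

```python
def interleaved_to_planar(row: bytes) -> bytes:
    """Regroup one interleaved YUY2 row into Y..., U..., V... order.

    Assumed layout (hypothetical, not taken from SSETools): all luma
    samples first, then all U samples, then all V samples.
    """
    if len(row) % 4:
        raise ValueError("a YUY2 row must be a multiple of 4 bytes")
    y = row[0::2]  # luma sits at every even byte
    u = row[1::4]  # U sits at bytes 1, 5, 9, ...
    v = row[3::4]  # V sits at bytes 3, 7, 11, ...
    return y + u + v


def planar_to_interleaved(row: bytes) -> bytes:
    """Inverse of interleaved_to_planar for the same assumed layout."""
    if len(row) % 4:
        raise ValueError("a YUY2 row must be a multiple of 4 bytes")
    half = len(row) // 2
    quarter = len(row) // 4
    y = row[:half]
    u = row[half:half + quarter]
    v = row[half + quarter:]
    out = bytearray(len(row))
    out[0::2] = y  # put luma back on the even bytes
    out[1::4] = u
    out[3::4] = v
    return bytes(out)
```

In the planar form each component is one contiguous run of bytes, so a SIMD routine can load 8 or 16 samples of the same kind at once, just as with YV12. It also shows why a planar clip looks like garbage when displayed: the viewer still interprets the bytes as interleaved.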

If there are no "natural" planar YUY2 clips, how can one utilise it? Answer: through the conversion filters Interleaved2Planar and Planar2Interleaved of my as yet unofficial plugin SSETools (SSE2Tools is for SSE2 CPUs, i.e. P4/Athlon64/Sempron 3100). Interleaved2Planar is called at the beginning of the filter chain and Planar2Interleaved at the end, but all the filters in the chain must support planar YUY2. Currently CNRF is the only one that does so explicitly, though many temporal filters (but not CNRT) also work correctly with planar YUY2. Interleaved2Planar and Planar2Interleaved are highly optimised (they should run at almost bitblt speed). Since Avisynth doesn't know about planar YUY2 (actually there is also planar RGB24 and planar RGB32), one has to tell a filter that the input is planar and not interleaved. This is usually done with the boolean variable "planar". Thus
Code:
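function CNRF_SSE(clip input)
{
   # RemoveGrain still works on interleaved YUY2, so convert its output afterwards
   smooth = RemoveGrain(input, mode=12).Interleaved2Planar()
   planar = Interleaved2Planar(input)
   # CNRF runs its SSE path on planar data; convert back at the end of the chain
   return CNRF(planar, smooth, planar=true).Planar2Interleaved()
}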
does the same as
Code:
function CNRF_C(clip input)
{
   smooth = RemoveGrain(input, mode=12)
   return CNRF(input, smooth)
}
but should be somewhat faster. Unfortunately the interleaved RemoveGrain(input, mode=12) is currently a major bottleneck in CNRF_SSE. Once RemoveGrain supports planar YUY2,
Code:
function CNRF_SSE(clip input)
{
   # Convert once at the start; every filter in between works on planar data
   planar = Interleaved2Planar(input)
   smooth = RemoveGrain(planar, mode=12, planar=true)
   return CNRF(planar, smooth, planar=true).Planar2Interleaved()
}
should be a lot faster.

I hope to post a YV12 version of CNRF next weekend.