Old 15th December 2004, 22:50   #1  |  Link
Antitorgo
Registered User
 
Join Date: Dec 2004
Posts: 32
AviShader (Hardware assisted avisynth plugin)

Hi folks, been a long time lurker.

I've been really interested in the prospect of doing a hardware-assisted plugin for AviSynth, and have been playing around with the ATI Video Shader demo, which uses the pixel shaders on modern graphics cards to do video manipulation. The demo has some pretty cool HLSL files that do various kinds of processing, and even has one that performs an FFT/iFFT; I was getting about 15fps with it on my laptop.

So, being that I'm a developer, and have a little bit of free time on my hands, I've started development on what I'm calling "AviShader". It is a plugin that I'm basing off of the SimpleSample plugin that Si put together (great job BTW), and the ATI Video Shader demo.

I'm wondering if anyone else has done any work in this area (I don't think so), and wanted some advice/input/thoughts from you folks on things you'd like to see, or things I should watch out for. I'm aware of the usual arguments that reading video memory is too slow, etc. However, I'm not sure how slow it really is. If I could get 5fps on something that does the same thing as IIP and uses very little of my CPU, that would be a huge improvement. Plus, with PCI-E, I think the video card slow read issue isn't going to apply much longer (look at nVidia's TurboCache that they announced today; it renders to system memory).

Another thought: since I'm an image/video processing amateur, could someone explain some of the more advanced filtering techniques so that I could try coding some of them in HLSL? Something like Dust (I know Steady isn't coming around much anymore, and Didée seems to have some idea of what it does), or Fizick's new very slow FFT denoiser, which looks cool, but the paper he mentioned in his post on Motion Picture Restoration seems not to be there anymore. If there is already a thread like this, I couldn't find it using search.

Anyway, I'm hoping to have a beta of "AviShader" out soon, maybe by XMas or New Years? It would be cool to have some HLSL or FX files that do something useful by then.
Old 15th December 2004, 23:40   #2  |  Link
Fizick
AviSynth plugger
 
Fizick's Avatar
 
Join Date: Nov 2003
Location: Russia
Posts: 2,183
Links:

Paper:
http://www.mee.tcd.ie/~ack/papers/a4ackphd.ps.gz

GPU:
http://forum.doom9.org/showthread.ph...&highlight=GPU

http://forum.doom9.org/showthread.ph...&highlight=GPU

http://forum.doom9.org/showthread.ph...&highlight=GPU

etc...
__________________
My Avisynth plugins are now at http://avisynth.org.ru and mirror at http://avisynth.nl/users/fizick
I usually do not provide a technical support in private messages.
Old 16th December 2004, 04:55   #3  |  Link
morsa
the dumbest
 
Join Date: Oct 2002
Location: Malvinas
Posts: 494
If I'm not wrong, I guess Shodan was into using the GPU for something I don't remember
Old 16th December 2004, 07:53   #4  |  Link
Leak
ffdshow/AviSynth wrangler
 
Leak's Avatar
 
Join Date: Feb 2003
Location: Austria
Posts: 2,441
Quote:
Originally posted by morsa
If I'm not wrong, I guess Shodan was into using the GPU for something I don't remember
Dunno about sh0dan, but Avery Lee is using the GPU in VirtualDub for Bicubic resizing:

http://www.virtualdub.org/oldnews
Old 16th December 2004, 11:33   #5  |  Link
Kurosu
Registered User
 
Join Date: Sep 2002
Location: France
Posts: 432
I think TheJam79, whose page can't be accessed anymore, had written such a tool, called 'GPU'. He had implemented Convolution3D, a temporal smoother and some colorspace transform AFAIK. It was using DX9 and needed at least a Radeon 9500 IIRC.

Maybe I can dig up the sources, but it had never worked on my 9800, even when recompiled.

I remember a syntax along these lines:
GPU_Start()
GPU_<function>
GPU_End()

You could probably put several GPU functions between Start() and End().
Old 16th December 2004, 14:09   #6  |  Link
bill_baroud
Registered User
 
Join Date: Feb 2002
Posts: 407
Quote:
I'm wondering if anyone else has done any work in this area (I don't think so)
Well, I had the same thought as you some time ago and tried some things, but I never got anything really working and my free time disappeared.

I'm looking forward to your progress.

(And btw, have a look at the nVidia Cg toolkit... that's much nicer than asm to start playing with shaders.)
Old 16th December 2004, 14:54   #7  |  Link
sh0dan
Retired AviSynth Dev ;)
 
sh0dan's Avatar
 
Join Date: Nov 2001
Location: Dark Side of the Moon
Posts: 3,480
I've been toying with GPU-based video processing earlier. It seems promising, but I don't see any good way of integrating it into AviSynth, so I basically started from scratch on my tests.

For now I've tested upload and download speeds over the bus, and most performance issues arise from this. I've used DirectX 9 as the framework and Cg for pixel shaders. DX9 is really great and does a good job. Cg is a bit more flaky, with several compiler bugs already peeping out (though v1.3 fixed most of them).

* Pixel Shader 2.X (DX9) is needed for this to be useful.
* For now I use FP16 4:4:4 YUV.
* Uploading YV12 as three greyscale textures works quite nicely.
* Downloading frames is a bit more tricky, as it will lock the CPU while the GPU is producing the image. This will also make SLI-supported processing harder. Buffering might be able to make this better.
* Interlaced processing is tricky, but possible with PS 2.X.
* Making frame-based decisions (like Telecide, for instance) is a rather big problem.
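
To make the "three greyscale textures" point concrete, here is a minimal sketch of how a packed YV12 frame splits into planes. It assumes a tightly packed buffer with no row padding (real AviSynth frames carry a pitch that can be larger than the width), and the name split_yv12 is made up for illustration; it is not taken from anyone's code in this thread.

```python
def split_yv12(buf: bytes, width: int, height: int):
    """Split a tightly packed YV12 frame into (Y, U, V) planes.

    Assumes even width/height and no row padding (pitch == width).
    YV12 stores the full-resolution Y plane first, then V, then U,
    with both chroma planes subsampled 2x horizontally and vertically.
    """
    if width % 2 or height % 2:
        raise ValueError("YV12 needs even dimensions")
    y_size = width * height
    c_size = (width // 2) * (height // 2)
    if len(buf) != y_size + 2 * c_size:
        raise ValueError("buffer size does not match a packed YV12 frame")
    y = buf[:y_size]
    v = buf[y_size:y_size + c_size]   # V comes before U in YV12
    u = buf[y_size + c_size:]
    return y, u, v
```

Each returned plane would then go to the card as its own single-channel texture: Y at width x height, U and V at half that size in each direction.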

I've got a basic framework working, but nothing big this far. A bit more info on my blog.
__________________
Regards, sh0dan // VoxPod
Old 16th December 2004, 17:20   #8  |  Link
Leak
ffdshow/AviSynth wrangler
 
Leak's Avatar
 
Join Date: Feb 2003
Location: Austria
Posts: 2,441
Quote:
Originally posted by sh0dan
* Making frame-based decisions (like Telecide, for instance) is a rather big problem.
How about just calculating the metrics that Telecide and Decomb (and TIVTC's equivalents) use on the GPU and reading those back? One metric per 32x32 block is a lot smaller than a full image, and doing the final decision on the CPU is almost nothing compared to the metrics calculation...
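
A rough CPU-side sketch of that idea, just to show how little data would need to come back. The combing score used here (how far each pixel deviates from the average of the lines above and below) is a made-up stand-in, not the metric Telecide, Decomb or TIVTC actually use, and block_comb_metrics is a hypothetical name.

```python
def block_comb_metrics(frame, block=32):
    """Return one combing score per block x block tile of a luma frame.

    frame is a list of rows of 0..255 ints. The score is an illustrative
    stand-in for a real field-matching metric: for every pixel that has a
    line above and below it, add |2*cur - above - below| to its tile.
    """
    height, width = len(frame), len(frame[0])
    rows = (height + block - 1) // block
    cols = (width + block - 1) // block
    metrics = [[0] * cols for _ in range(rows)]
    for y in range(1, height - 1):
        above, cur, below = frame[y - 1], frame[y], frame[y + 1]
        for x in range(width):
            metrics[y // block][x // block] += abs(2 * cur[x] - above[x] - below[x])
    return metrics
```

For a 720x480 frame this gives 23 x 15 = 345 values to read back instead of 345,600 luma pixels, and the final match/no-match decision can then run on the CPU over that small grid.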

Just braindumping here, though.

np: Autechre - Remix Of Spangle By Seefeel
Old 16th December 2004, 20:50   #9  |  Link
Prettz
easily bamboozled user
 
Prettz's Avatar
 
Join Date: Sep 2002
Location: Atlanta
Posts: 373
I have yet to learn anything about pixel shaders (been meaning to one of these days), but would PS 3.0 bring any further benefits, at least ones which would justify writing separate routines for 3.0?
Old 16th December 2004, 21:56   #10  |  Link
Soulhunter
Bored...
 
Soulhunter's Avatar
 
Join Date: Apr 2003
Location: Unknown
Posts: 2,812
A GPU accelerated ssxsharpen would be cool...

Guess the supersampling could be done much faster this way !?!


Bye
__________________

Visit my IRC channel
Old 16th December 2004, 23:01   #11  |  Link
Antitorgo
Registered User
 
Join Date: Dec 2004
Posts: 32
Yes, a hardware accelerated ssxsharpen would be cool, but seeing as how I don't know all the specifics of how it works, I can only guess on it supersampling and doing maybe an unsharp mask? I dunno. That was one of the points to me starting this thread.

I have coded up an HLSL function that does a so called "smart" sharpen by doing an unsharp mask on sobelized output so that only the edges get sharpened. I thought it would be kinda interesting to apply the sharpening to the edges, and leave the gaussian blur on the rest. It was kinda cool when I was testing, I left the sobelized edges in there instead of the unsharp mask because it turned into a cartoonish type medium (would be cool for someone wanting to do non-photorealistic type stuff).
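
For anyone who wants to play with the idea outside a shader, here is a small sketch of that kind of "smart" sharpen on a greyscale image. It is only an approximation of what is described above: the 3x3 gaussian kernel, the way the Sobel magnitude is normalised, and the threshold/amount parameters are all assumptions made for the example, not the actual HLSL.

```python
import math

GAUSS = ((1, 2, 1), (2, 4, 2), (1, 2, 1))          # weights sum to 16
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def _conv3(img, x, y, kernel):
    """Apply a 3x3 kernel at (x, y), clamping coordinates at the borders."""
    h, w = len(img), len(img[0])
    total = 0
    for ky in range(3):
        yy = min(max(y + ky - 1, 0), h - 1)
        for kx in range(3):
            xx = min(max(x + kx - 1, 0), w - 1)
            total += kernel[ky][kx] * img[yy][xx]
    return total


def smart_sharpen(img, threshold=0.5, amount=1.0):
    """Unsharp-mask pixels on Sobel edges, gaussian-blur everything else."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    max_grad = 4 * 255 * math.sqrt(2)   # upper bound on the Sobel magnitude
    for y in range(h):
        for x in range(w):
            blur = _conv3(img, x, y, GAUSS) / 16.0
            gx = _conv3(img, x, y, SOBEL_X)
            gy = _conv3(img, x, y, SOBEL_Y)
            edge = math.hypot(gx, gy) / max_grad
            if edge >= threshold:
                # Unsharp mask: push the pixel away from its blurred value.
                value = img[y][x] + amount * (img[y][x] - blur)
            else:
                value = blur
            out[y][x] = int(min(max(round(value), 0), 255))
    return out
```

Writing the normalised edge value straight to the output instead of the sharpened pixel would give the cartoonish edge-only look mentioned above.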

Anyway, as far as the comments on using Cg go: HLSL isn't the asm pixel shader stuff, I am more familiar with it, and I think that Cg is an nVidia thing, and since I have ATI... I might see about using FX, which lets you specify some more rules around the HLSL you use in it, and I think it would be "free" in the sense that MS makes it as easy to compile the FX file into a shader as it does the HLSL.

Beyond that, I was going to post links and maybe some screen caps of my progress, but I'll do that tonite when I have more time.
Old 16th December 2004, 23:29   #12  |  Link
Soulhunter
Bored...
 
Soulhunter's Avatar
 
Join Date: Apr 2003
Location: Unknown
Posts: 2,812
Quote:
Originally posted by Antitorgo
Yes, a hardware accelerated ssxsharpen would be cool, but seeing as how I don't know all the specifics of how it works, I can only guess on it supersampling and doing maybe an unsharp mask? I dunno. That was one of the points to me starting this thread.
Afaik it works like this...
Code:
LanczosResize(ox*4,oy*4).XSharpen(255,255).LanczosResize(ox,oy)
Maybe have a look here, here, here and here !!!


Bye
__________________

Visit my IRC channel

Last edited by Soulhunter; 16th December 2004 at 23:35.
Old 17th December 2004, 00:20   #13  |  Link
AS
Registered User
 
Join Date: Jan 2002
Posts: 76
Actually the whole point of ssxsharpen is to sharpen everything, both edges and areas
Old 17th December 2004, 02:54   #14  |  Link
Antitorgo
Registered User
 
Join Date: Dec 2004
Posts: 32
I've coded up XSharpen as a shader with hardcoded 255,255 values and a straight comparison instead of a luma comparison (I was trying to calculate the luma inline and saturated the registers, so to do true luma I'd have to do two separate shaders with one rendering to a different texture for lookups, and I was too lazy, so I settled on a straight comparison as a quick compromise). The results look about the same as XSharpen in the static images, but I will compare actuals when I fix the screen capturing code. BTW, XSharpen on Shrek2 w/o supersampling looks like absolute crap.

So I just need to add supersampling code and see what happens... As my only way to test the HLSL right now is the super buggy Video Shader (which I'm trying to debuggify a bit), the built in screen caps are resizing to some odd size (the video being displayed was off as well, but I fixed that particular bug).

Anyway, with luck, I'll get some of the supersampling coded and have some results later tonite.
Old 17th December 2004, 04:28   #15  |  Link
Soulhunter
Bored...
 
Soulhunter's Avatar
 
Join Date: Apr 2003
Location: Unknown
Posts: 2,812
Quote:
Originally posted by Antitorgo

Anyway, with luck, I'll get some of the supersampling coded and have some results later tonite.
Nice...

Btw, what kind of resizing you gonna use ???


Bye
__________________

Visit my IRC channel
Old 17th December 2004, 19:32   #16  |  Link
Antitorgo
Registered User
 
Join Date: Dec 2004
Posts: 32
Ugh... I need to work more on learning DX9 and dusting off my C++ skills, I've been pampered too much by .NET. Plus, this is a long rambling type post, so bear with me...

Anyway, I think I'm going to concentrate right now on getting a basic framework in place for the plugin, and then I can work more on the shader algorithms. I have some screencaps of Shrek2 with XSharpen and my modified smart sharpen (with my gaussian blur on non-edges and the edge detection cranked up to a high threshold of .9 (90%)). The smart sharpen is actually damn impressive looking, and of course XSharpen looks like crap because either I implemented something wrong (possible), or it needs supersampling really bad. It seems that XSharpen on animation just introduces a crapload of aliasing; maybe it does better on natural images? Or maybe it is because I have it hardcoded to (255,255) type parameters (not really hardcoded, but left off for simplicity at the time, and 255,255 is the "default" behavior).

Oh, and I thought of a good way to calculate luma using fewer instructions (it is a dot product of two vectors, so I would imagine that it calculates luma a hell of a lot faster too, since the hardware likes doing dot products; I wouldn't know offhand because it is so damn fast anyway). So the XSharpen uses true luma for all its calculations now.
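
The dot-product trick, sketched outside the shader. The Rec. 601 weights below are an assumption, since the constants actually used are not given here; in HLSL the same thing is a single dot() call on the colour and a float3 of weights.

```python
REC601_LUMA = (0.299, 0.587, 0.114)   # assumed weights; they sum to 1.0


def luma(rgb, weights=REC601_LUMA):
    """Luma as a dot product of an (r, g, b) triple with a weight vector."""
    return sum(c * w for c, w in zip(rgb, weights))
```

Swapping in other constants (for example the Rec. 709 set 0.2126, 0.7152, 0.0722) only changes the weight vector, not the instruction count.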

I would upload the screenshots to my web site, but my hosting provider apparently screwed something up, because I can't upload files via FTP. So I guess I'll post later when I get some files uploaded.

As far as the resizing (interpolation), I have no idea how lanczos works. I sorta know how bicubic and bilinear work, but then again, I could just let the video card handle it (which was my initial thought). But then I did a search on interpolation algorithms to see how hard it would be to maybe do lanczos, and ran across a triangular interpolation algorithm which is custom tailored for 3D video cards, in that you could map the triangles to a mesh and resize like crazy and it would be super incredibly fast. (Imagine almost free resizing to do some crazy supersampling (of course, memory becomes a concern here)). One of the amazing things about it is that setting up the mesh is pretty fast, and it should be comparable to bicubic in terms of speed (I think; this is all theoretical on my part, and I could be totally wrong). Anyway, I will leave that for later I think, and just concentrate on the basics and go from there.

For now though, I am thinking that I will implement first with HLSL for the shader language, and then probably graduate to Effect (.FX) files so you could define multi-pass and multi-shader scenarios in one file (which would be really friggin awesome). The major hurdle though is getting the framework in place, and testing the speed of reads from the graphics card (I wish I had PCI-E).
Old 17th December 2004, 20:15   #17  |  Link
AS
Registered User
 
Join Date: Jan 2002
Posts: 76
Quote:
Originally posted by Antitorgo
xSharpen looks like crap because either I implemented something wrong (possible), or it needs supersampling really bad.
No, you probably have done it right and are getting the correct behaviour. XSharpen(255,255) does need supersampling badly, which is why we use supersampling for it: to disguise the aliasing caused by XSharpen().

As for LanczosResize(), it's a native AviSynth filter, so you can get its source from the AviSynth source code, I believe.
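
In case a plain description helps more than digging through the source: the Lanczos kernel is a sinc function windowed by a wider sinc. The sketch below is a generic 1-D Lanczos resampler written from that textbook definition, not a transcription of AviSynth's code; the 3-tap default, the edge clamping and the function names are assumptions for the example.

```python
import math


def lanczos_kernel(x, taps=3):
    """Lanczos window: sinc(x) * sinc(x / taps) for |x| < taps, else 0."""
    if x == 0:
        return 1.0
    if abs(x) >= taps:
        return 0.0
    px = math.pi * x
    return taps * math.sin(px) * math.sin(px / taps) / (px * px)


def lanczos_resize_1d(samples, new_len, taps=3):
    """Resample a 1-D list of samples to new_len samples."""
    old_len = len(samples)
    scale = old_len / new_len
    stretch = max(scale, 1.0)          # widen the kernel when downscaling
    support = taps * stretch
    out = []
    for i in range(new_len):
        center = (i + 0.5) * scale - 0.5       # position in source coordinates
        lo = math.floor(center - support) + 1
        hi = math.floor(center + support)
        acc = weight_sum = 0.0
        for j in range(lo, hi + 1):
            w = lanczos_kernel((j - center) / stretch, taps)
            src = samples[min(max(j, 0), old_len - 1)]   # clamp at the edges
            acc += w * src
            weight_sum += w
        out.append(acc / weight_sum)           # normalise so flat areas stay flat
    return out
```

A 2-D resize is the same thing applied to every row and then to every column, which is also how it would map onto two shader passes.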
Old 17th December 2004, 20:45   #18  |  Link
Antitorgo
Registered User
 
Join Date: Dec 2004
Posts: 32
Yeah, I know lanczos is in AviSynth and I could go dig through the source code to find it. I hate to do that because it is like deciphering Greek and translating it into Babylonian cuneiform. I'd rather do English to Babylonian cuneiform (heh).

For example, looking at the XSharpen code, it took me a minute to figure out that he was bitshifting things around to get the RGB and store the luma in the A channel, and then to decipher the fact that he used an unrolled loop to sample the 3x3 matrix. A description like "take the RGB values in a 3x3 matrix centered on the target, find the max/min luma, and use whichever is closest to the center pixel as the new value" would have been easier for me to code. Actually, I spent a couple of hours trying to calculate the luma correctly, and everyone seems to have slightly different opinions on the constants to multiply your RGB values by in order to get the luma. But I digress...
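
That plain-English description translates almost line for line into code. The sketch below works on a luma-only image and is a reading of the description above, not a port of the real XSharpen source; in particular, how strength and threshold are applied here is an assumption.

```python
def xsharpen(luma, strength=255, threshold=255):
    """Snap each pixel toward the nearest extreme of its 3x3 neighbourhood."""
    h, w = len(luma), len(luma[0])
    out = [row[:] for row in luma]          # border pixels are copied as-is
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [luma[yy][xx]
                      for yy in (y - 1, y, y + 1)
                      for xx in (x - 1, x, x + 1)]
            lo, hi = min(window), max(window)
            cur = luma[y][x]
            # Pick whichever extreme is nearer to the center pixel.
            if hi - cur < cur - lo:
                target, dist = hi, hi - cur
            else:
                target, dist = lo, cur - lo
            if dist < threshold:
                # strength 255 replaces the pixel outright; lower values blend.
                out[y][x] = (strength * target + (255 - strength) * cur) // 255
    return out
```

With the (255,255) settings every interior pixel gets replaced by a neighbourhood extreme, which is consistent with the heavy aliasing seen without supersampling.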

I think what I want to do is see if I can't get the D3D environment set up, render a frame to it, and read it back, all in an AviSynth plugin. I think that getting to that point is going to be the biggest challenge. One of the difficulties is that the ATI Video Shader demo uses VMR9 to do all the decoding and stream serving, so I need to replace that and strip out all the code they have in there for setting up cube maps, textures, etc. for some of the more "advanced" features. To be honest, I think I'm better off starting w/o that baggage and just using it for reference as needed. Kinda going with the KISS principle.
Old 17th December 2004, 21:51   #19  |  Link
Soulhunter
Bored...
 
Soulhunter's Avatar
 
Join Date: Apr 2003
Location: Unknown
Posts: 2,812
Quote:
Originally posted by Antitorgo

Yeah, I know lanczos is in AviSynth and I could go dig through the source code to find it. I hate to do that because it is like deciphering Greek and translating it into Babylonian cuneiform...
Not sure, but maybe this could help ya...

EDIT: Or this here ???


Bye
__________________

Visit my IRC channel

Last edited by Soulhunter; 17th December 2004 at 22:33.
Old 18th December 2004, 07:46   #20  |  Link
Antitorgo
Registered User
 
Join Date: Dec 2004
Posts: 32
Okay, finally was able to upload to my web server. Here's some screen caps of the raw, "smart" sharpen, and XSharpen (w/o supersampling). It isn't exactly the same frame, because Video Shader doesn't have frame level control, but I got them as close to the same as possible. (Sorry, I'll hopefully have better tonite).

[Edit] I wanted to point out that Default Shader does nothing, so it is the raw image [/Edit]





I'd be interested to see what comments anyone has. I think the smart sharpen looks a heck of a lot better than the original, and you can tell by the file sizes that it would seem it compresses quite a bit less than the original.

As far as the supersampling for XSharpen... I've been thinking about how to do this, and the biggest issue that I'm running into is that I don't think I can cheat and get a "quick" example by using the hardware to resize in one pass; I'd have to do multiple passes on it, feeding the textures back in. Maybe I can size it up, take a screen cap, and then resize in Photoshop as a quick example? Anyone actually interested? I'm kinda curious just to see what happens to performance if I resize 4x... So curious, I just went and hacked it in, and a 4x XSharpen runs at ~8fps, while the 4x SmartSharpen ran at ~6fps... Note that this isn't doing any copying back to system memory yet, so speed might be quite a bit slower. This makes me think that if I code the resize as a "mapped" function within HLSL, it would be the more optimal solution... I'll have to think about it...
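
A rough back-of-the-envelope for what copying back to system memory would involve. This is only a sketch: the 720x480 source size and 4 bytes per pixel are assumptions, neither is stated above, and readback_mb_per_second is a made-up helper.

```python
def readback_mb_per_second(width, height, scale, fps, bytes_per_pixel=4):
    """Megabytes (2**20 bytes) per second needed to read back every frame.

    scale is the supersampling factor applied to both width and height.
    """
    frame_bytes = (width * scale) * (height * scale) * bytes_per_pixel
    return frame_bytes * fps / (1024 * 1024)


# Reading back the 4x supersampled frame versus the final-size frame at ~8fps.
supersampled = readback_mb_per_second(720, 480, 4, 8)
final_size = readback_mb_per_second(720, 480, 1, 8)
```

Under those assumptions the 4x frame needs 168.75 MB/s while the final-size frame needs about 10.5 MB/s, a 16x difference, so doing the downscale on the card before reading back looks like the part worth getting right.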

I've written up a quick little app in C# to figure out how to set up all the D3D rendering, and have it all working with the exception of getting the rendered image back to system memory so I can save it off to a file or whatever. With any luck, I'll have that done tonight, and I'll post the source code up on my server, along with the smart sharpen and xsharpen HLSL shaders so people can play around with it and see if it'll work on their hardware or not. Oddly enough, I was writing part of it on my work computer, which has an onboard Intel video solution, and it says it supported pixel shader 2_0 (PS_2_0) and it did actually render everything correctly (I'm not sure what the speed situation was though, since it is limited to a static image for now).

Anyway, next step will be to take what I've learned on the D3D stuff, and write it in C++ and write it as an AviSynth plugin. Overall, it is pretty cool.

Oh, and lastly, I've thought more about my triangular resize using a mesh, and I'm liking it even more and more. I think I'll try coding something up to see if it works like I think it will, and then I'll make a post with neat graphics on how it works, and why it is better than other interpolations...