Old 17th October 2012, 19:52   #1  |  Link
SassBot
Guest
 
Posts: n/a
ColorMatrix for VapourSynth

So I've finished the initial port of ColorMatrix to VapourSynth. It can be downloaded here. Source code is on GitHub here. It seems to work fine on the few test clips I tried, but I can't say I've tested it fully, so let me know if you stumble upon any issues. There is still further cleanup and removal of Windows dependencies to finish, but this should be enough to at least test.

Parameters are the same as in the original ColorMatrix; the only difference is that the bool values have been replaced with ints (0 for false and 1 for true, for anyone who is unfamiliar).

Usage:

Code:
>>> import vapoursynth as vs
>>> core = vs.Core()
>>> core.std.LoadPlugin('/path/to/ColorMatrix.dll')
And the minimum needed to call it:

Code:
>>> ret = core.colormatrix.ColorMatrix(clip)
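
As a rough illustration of the bool-to-int change, a small helper like the one below could convert Python booleans before they are handed to the filter. The helper name `as_int_flags` is hypothetical. `mode`, `interlaced` and `hints` are parameter names from the original Avisynth ColorMatrix, and the sketch assumes the port kept them unchanged.

```python
def as_int_flags(**kwargs):
    """Return a copy of kwargs with every bool replaced by 0 or 1."""
    return {
        name: int(value) if isinstance(value, bool) else value
        for name, value in kwargs.items()
    }

# Sample arguments; names assumed to match the original Avisynth filter.
args = as_int_flags(mode="Rec.601->Rec.709", interlaced=True, hints=False)
print(args)

# Hypothetical call, assuming `core` and `clip` exist as in the post above:
# ret = core.colormatrix.ColorMatrix(clip, **args)
```

Non-bool values such as the `mode` string pass through untouched, so the same helper works for a mixed argument list.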

Last edited by SassBot; 17th October 2012 at 19:57.
Old 17th October 2012, 22:14   #2  |  Link
SassBot
Guest
 
Posts: n/a
I forgot to add that this requires the ported Avisynth core filters, since it still calls Limiter to do the clamping.
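
For anyone wondering what the clamping step amounts to, here is a minimal sketch in plain Python. It assumes the usual Avisynth Limiter defaults (16-235 for luma, 16-240 for chroma); the function name `clamp_plane` and the sample values are made up for illustration and are not taken from the plugin source.

```python
def clamp_plane(samples, low, high):
    """Clamp every 8-bit sample in a plane to the range [low, high]."""
    return [min(max(s, low), high) for s in samples]

# Assumed Limiter defaults: luma 16-235, chroma 16-240.
luma = clamp_plane([0, 16, 128, 235, 255], 16, 235)
chroma = clamp_plane([0, 16, 128, 240, 255], 16, 240)
print(luma)    # [16, 16, 128, 235, 235]
print(chroma)  # [16, 16, 128, 240, 240]
```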
Old 17th October 2012, 22:36   #3  |  Link
active1
Registered User
 
Join Date: Nov 2011
Location: spain
Posts: 45
Great job! I hope that you can make it work on Linux too.
Old 17th October 2012, 22:39   #4  |  Link
SassBot
Guest
 
Posts: n/a
Yeah, that's a goal. Most of the Windows dependency is in the threading it uses, which isn't really necessary anymore (it actually borks with the parallelizing that VapourSynth does and causes green frames at intervals, so the filter is set to run serially), and in making the asm portable.

Last edited by SassBot; 17th October 2012 at 22:43.
Old 17th October 2012, 22:59   #5  |  Link
Myrsloik
Professional Code Monkey
 
Myrsloik's Avatar
 
Join Date: Jun 2003
Location: Kinnarps Chair
Posts: 2,647
Quote:
Originally Posted by SassBot View Post
Yeah, that's a goal. Most of the Windows dependency is in the threading it uses, which isn't really necessary anymore (it actually borks with the parallelizing that VapourSynth does and causes green frames at intervals, so the filter is set to run serially), and in making the asm portable.
Converting all that asm to something sane like yasm+x86inc.asm will take a lot of work. As usual I also suspect that rewriting it from scratch would be easier...
__________________
VapourSynth - proving that scripting languages and video processing isn't dead yet

Last edited by Myrsloik; 17th October 2012 at 23:00. Reason: me being sleepy
Old 17th October 2012, 23:45   #6  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by Myrsloik View Post
Converting all that asm to something sane like yasm+x86inc.asm will take a lot of work. As usual I also suspect that rewriting it from scratch would be easier...
Actually, the C routines are just as fast as the SSE2 routines. To double-check that, I just compiled a ColorMatrix DLL that uses only the C routines (removed the asm code completely). I get 300 fps on an old 2.5 GHz Core 2 Duo, single thread.
Old 17th October 2012, 23:56   #7  |  Link
SassBot
Guest
 
Posts: n/a
Quote:
Originally Posted by Groucho2004 View Post
Actually, the C routines are just as fast as the SSE2 routines. To double-check that, I just compiled a ColorMatrix DLL that uses only the C routines (removed the asm code completely). I get 300 fps on an old 2.5 GHz Core 2 Duo, single thread.
Okay, well if the asm really isn't needed then it's just a case of removing the internal threading. Makes things much simpler.
Old 18th October 2012, 23:29   #8  |  Link
SassBot
Guest
 
Posts: n/a
Quote:
Originally Posted by Myrsloik View Post
Converting all that asm to something sane like yasm+x86inc.asm will take a lot of work. As usual I also suspect that rewriting it from scratch would be easier...
Actually, I looked into it and it's been pretty simple so far, and I'm far from an assembly expert. It's only about 800 lines and it doesn't seem to use anything MASM-specific, so it's been mostly just copy-and-paste extraction.
Old 18th October 2012, 23:33   #9  |  Link
Myrsloik
Professional Code Monkey
 
Myrsloik's Avatar
 
Join Date: Jun 2003
Location: Kinnarps Chair
Posts: 2,647
So you are seriously going to port the asm despite people reporting the same speed without it?
This is just ridiculous. At least benchmark it properly to check that the asm gives a boost of 50% or more; otherwise it must be truly horribly written.

What is it with Avisynth filter writers and inline asm anyway?
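
A proper comparison does not need much machinery. The sketch below shows one way to time two code paths and report the relative speedup, using Python's `timeit`. The two workload functions are stand-ins invented for this example, and `best_time` and `speedup_percent` are hypothetical helper names; a real test would call the C and SSE2 builds of the filter instead.

```python
import timeit

def best_time(func, repeat=5, number=1000):
    """Return the fastest of several timed runs (seconds for `number` calls)."""
    return min(timeit.repeat(func, repeat=repeat, number=number))

def speedup_percent(baseline_seconds, candidate_seconds):
    """How much faster the candidate is than the baseline, in percent."""
    return (baseline_seconds / candidate_seconds - 1.0) * 100.0

# Two stand-in workloads; replace them with calls into the real code paths.
def plain_version():
    return sum(i * 3 for i in range(256))

def tuned_version():
    return 3 * sum(range(256))

baseline = best_time(plain_version)
candidate = best_time(tuned_version)
print(f"speedup: {speedup_percent(baseline, candidate):.1f}%")
```

Taking the minimum of several repeats reduces the influence of background noise on the machine, which matters when the expected gain is in the 15-50% range being argued about here.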
__________________
VapourSynth - proving that scripting languages and video processing isn't dead yet
Old 18th October 2012, 23:34   #10  |  Link
SassBot
Guest
 
Posts: n/a
Like I said it's really just copy and paste. If it was going to take significant effort I wouldn't do it.
Old 18th October 2012, 23:37   #11  |  Link
Myrsloik
Professional Code Monkey
 
Myrsloik's Avatar
 
Join Date: Jun 2003
Location: Kinnarps Chair
Posts: 2,647
Quote:
Originally Posted by SassBot View Post
Like I said it's really just copy and paste. If it was going to take significant effort I wouldn't do it.
The reason I object isn't that you're copying the asm as such. asm can be a fun hobby. The real reason is that maintainability goes down the drain.
__________________
VapourSynth - proving that scripting languages and video processing isn't dead yet
Old 18th October 2012, 23:39   #12  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
Quote:
Originally Posted by SassBot View Post
Actually, I looked into it and it's been pretty simple so far, and I'm far from an assembly expert. It's only about 800 lines and it doesn't seem to use anything MASM-specific, so it's been mostly just copy-and-paste extraction.
Just nuke it. There's absolutely no benefit in keeping it, as Groucho demonstrated above.

Avisynth filter writers frequently suffer from the Gentoo Ricer Syndrome: they write hilariously unoptimized algorithms in assembler even though they don't really understand what they're doing, use ricer compiler switches that don't do what they think they do, and use needlessly unreadable bitwise operations, all because "it's faster", and then never do any actual benchmarks. All of these are contributing factors as to why a lot of Avisynth filters are unmaintainable and unportable messes. The documentation really doesn't help in this regard either; the "simple filter example" that is supposed to teach people how to use the API very quickly goes completely bananas and ends up in a ridiculous mess of trying to teach you how to write "optimized" inline assembler.

If you have an opportunity to clean this up, please please please do so. Unmaintainable open source code could just as well be closed source. Maintaining existing assembler is really hard even if it's well written and documented; in this case it clearly isn't so just scrap it and let someone write new code from scratch if they want to optimize it.

Last edited by TheFluff; 18th October 2012 at 23:41.
Old 18th October 2012, 23:41   #13  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
I tested again on my i5-2500K and got about 20% more speed with SSE2 compared to the C routines.

However, I have always used the C routines because of this little note in the ColorMatrix manual:

Due to rounding differences, the output from the MMX and SSE2 routines (only present for YV12) is not exactly the same as the output from the C routine (the C routine is more accurate). The maximum difference between the SIMD and C routines is ±1 on the Y/U/V planes.

And, I really don't care if the filter runs with 400 or 500 fps.
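
A ±1 difference of this kind is what you would typically see when two fixed-point implementations round differently. The sketch below only illustrates that effect and is not derived from the ColorMatrix source; the 16 fractional bits, the coefficient 0.9 and the function names are all assumptions.

```python
SHIFT = 16                # assumed number of fractional bits
HALF = 1 << (SHIFT - 1)   # 0.5 in the same fixed-point scale

def scale_rounded(sample, coeff_fixed):
    """Multiply by a fixed-point coefficient and round to nearest."""
    return (sample * coeff_fixed + HALF) >> SHIFT

def scale_truncated(sample, coeff_fixed):
    """Same multiply, but the fractional part is simply dropped."""
    return (sample * coeff_fixed) >> SHIFT

coeff = int(0.9 * (1 << SHIFT))  # hypothetical coefficient, not from the filter
diffs = [scale_rounded(s, coeff) - scale_truncated(s, coeff) for s in range(256)]
print(min(diffs), max(diffs))  # 0 1
```

Both routines use the same coefficient, so the results can never be more than one step apart; which one counts as "more accurate" depends only on how the fraction is handled.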
Old 19th October 2012, 00:03   #14  |  Link
SassBot
Guest
 
Posts: n/a
Yeah, I saw a 15-25% speedup on the 3 systems I tested when comparing forced pure C against SSE2. So it's enough that I think the mostly minimal copy-and-paste effort is worth it. It has consisted of cutting, pasting, slapping on the prologue and ret macros, and renaming some arguments. If it were going to take days or weeks of time, then I'd agree, but it's pretty simplistic "porting".
Old 19th October 2012, 00:21   #15  |  Link
SassBot
Guest
 
Posts: n/a
Quote:
Originally Posted by Groucho2004 View Post
However, I have always used the C routines because of this little note in the ColorMatrix manual:

Due to rounding differences, the output from the MMX and SSE2 routines (only present for YV12) is not exactly the same as the output from the C routine (the C routine is more accurate). The maximum difference between the SIMD and C routines is ±1 on the Y/U/V planes.

And, I really don't care if the filter runs with 400 or 500 fps.
Sure, the output is not exactly the same, but can you really see the difference? I doubt it.




Quote:
Originally Posted by TheFluff View Post
If you have an opportunity to clean this up, please please please do so. Unmaintainable open source code could just as well be closed source. Maintaining existing assembler is really hard even if it's well written and documented; in this case it clearly isn't so just scrap it and let someone write new code from scratch if they want to optimize it.
I'll clean up the C++ code for sure. I'm really just copying the assembly over because it does give some benefit to the people who want to run it with the SSE2 optimization. I'll probably drop the MMX and SSE versions and keep only C and SSE2. If anyone really wants the other assembly versions, they can compile their own version. As I said, I'm not investing any significant effort into this because it honestly isn't worth that much, but copying and pasting the SSE2 code over and getting it to assemble with yasm took literally 30 minutes, after working out some Visual Studio wonkiness with x86inc.asm's name mangling that was causing some linker errors.

Last edited by SassBot; 19th October 2012 at 03:35.
Old 19th October 2012, 00:21   #16  |  Link
Wilbert
Super Moderator
 
Join Date: Nov 2001
Location: Netherlands
Posts: 6,375
Quote:
The documentation really doesn't help in this regard either; the "simple filter example" that is supposed to teach people how to use the API very quickly goes completely bananas and ends up in a ridiculous mess of trying to teach you how to write "optimized" inline assembler.
In what way does it go bananas? The simple filter example doesn't contain any assembler code at all, so I'm curious what you mean here.
Old 19th October 2012, 04:03   #17  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
Quote:
Originally Posted by Wilbert View Post
In what way does it go bananas? The simple filter example doesn't contain any assembler code at all, so i'm curious what you mean here.
Seems I either misremembered or it's been removed. I have a distinct memory of being angry about asm in simplesample but maybe my brain is just making things up.
Old 19th October 2012, 17:17   #18  |  Link
StainlessS
HeartlessS Usurer
 
StainlessS's Avatar
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 11,406
The Assembler Optimizing section is in the SDK.
http://avisynth.org/mediawiki/Filter...ler_optimizing
__________________
I sometimes post sober.
StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace

"Some infinities are bigger than other infinities", but how many of them are infinitely bigger ???
Old 19th October 2012, 17:23   #19  |  Link
kolak
Registered User
 
Join Date: Nov 2004
Location: Poland
Posts: 2,869
Quote:
Originally Posted by SassBot View Post
Sure, the output is not exactly the same, but can you really see the difference? I doubt it.





Well I can see it easily.
Old 19th October 2012, 17:50   #20  |  Link
SassBot
Guest
 
Posts: n/a
I can only hope you're being sarcastic... Subtract the two images in Avisynth and you get an almost perfectly gray image. Running ColorYUV on the two versions gives a difference of 0.01 on each plane. There is no way the difference is even remotely perceptible. If you aren't being sarcastic, then you are basically imagining a difference.
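
To put a number on a comparison like this, one could compute simple per-plane statistics. A minimal sketch, assuming each plane is available as a flat list of 8-bit samples; `plane_difference` and the sample data are hypothetical and only stand in for real frame data.

```python
def plane_difference(plane_a, plane_b):
    """Return (max absolute difference, mean absolute difference) for two planes."""
    if len(plane_a) != len(plane_b):
        raise ValueError("planes must have the same number of samples")
    diffs = [abs(a - b) for a, b in zip(plane_a, plane_b)]
    return max(diffs), sum(diffs) / len(diffs)

# Made-up sample data standing in for the same plane from two code paths.
c_plane = [16, 100, 128, 200, 235, 64, 90, 180]
simd_plane = [16, 101, 128, 200, 235, 64, 90, 180]
max_diff, mean_diff = plane_difference(c_plane, simd_plane)
print(max_diff, mean_diff)  # 1 0.125
```

The maximum shows the worst single sample, while the mean shows how widespread the deviation is across the plane.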

Last edited by SassBot; 19th October 2012 at 17:52.