ColorMatrix for Vapoursynth
SassBot
17th October 2012, 19:52
So I've finished the initial porting of ColorMatrix to Vapoursynth. It can be downloaded here (https://github.com/downloads/amichaelt/vscolormatrix/ColorMatrix.7z). Source code is on Github here (https://github.com/amichaelt/vscolormatrix). It seems to work fine on the few test clips I tried but I can't say I tested it fully so let me know if you stumble upon any issues. There is still further cleanup and removal of Windows dependencies to finish up, but this should be enough to at least test.
Parameters are the same as in the original ColorMatrix; the only difference is that the bool values have been replaced with ints (0 for false and 1 for true, for anyone who is unfamiliar).
Usage:
>>> import vapoursynth as vs
>>> core = vs.Core()
>>> core.std.LoadPlugin('/path/to/ColorMatrix.dll')
And the minimum needed to call it:
>>> ret = core.colormatrix.ColorMatrix(clip)
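Since the bool parameters are now ints, a call from Python passes 0 or 1 rather than True or False. A minimal sketch of a hypothetical helper (not part of the plugin or of VapourSynth) that converts Python bools to ints before the call. The parameter names `mode`, `interlaced` and `hints` are assumed from the original Avisynth ColorMatrix and may differ in this port:

```python
def int_flags(**kwargs):
    """Return kwargs with Python bools converted to 0/1 ints.

    Hypothetical convenience helper; not part of ColorMatrix or VapourSynth.
    """
    return {k: int(v) if isinstance(v, bool) else v for k, v in kwargs.items()}


# Parameter names are assumptions based on the Avisynth ColorMatrix docs.
args = int_flags(mode="Rec.709->Rec.601", interlaced=False, hints=True)
print(args)

# With the plugin loaded, the call would then look like:
# ret = core.colormatrix.ColorMatrix(clip, **args)
```

Non-bool values such as the mode string pass through unchanged, so the helper can wrap the whole argument list.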
SassBot
17th October 2012, 22:14
I forgot to add that this does require you to have the ported Avisynth core filters, since it still calls Limiter to do the clamping.
active1
17th October 2012, 22:36
great job! i hope that you can make it work on linux too :)
SassBot
17th October 2012, 22:39
Yeah, that's a goal. Most of the Windows dependency is in the threading it uses, which isn't really necessary anymore (it actually borks with the parallelizing that VapourSynth does and causes green frames at intervals, so the filter is set to run serially), and in making the asm portable.
Myrsloik
17th October 2012, 22:59
Yeah, that's a goal. Most of the Windows dependency is in the threading it uses, which isn't really necessary anymore (it actually borks with the parallelizing that VapourSynth does and causes green frames at intervals, so the filter is set to run serially), and in making the asm portable.
Converting all that asm to something sane like yasm+x86inc.asm will take a lot of work. As usual I also suspect that rewriting it from scratch would be easier...
Groucho2004
17th October 2012, 23:45
Converting all that asm to something sane like yasm+x86inc.asm will take a lot of work. As usual I also suspect that rewriting it from scratch would be easier...
Actually, the C routines are just as fast as the SSE2 routines. To double check that, I just compiled a Colormatrix dll so it uses only the C routines (removed the asm code completely). I get 300 fps on an old Core 2 Duo with 2.5 GHz, single thread.
SassBot
17th October 2012, 23:56
Actually, the C routines are just as fast as the SSE2 routines. To double check that, I just compiled a Colormatrix dll so it uses only the C routines (removed the asm code completely). I get 300 fps on an old Core 2 Duo with 2.5 GHz, single thread.
Okay, well if the asm really isn't needed then it's just a case of removing the internal threading. Makes things much simpler.
SassBot
18th October 2012, 23:29
Converting all that asm to something sane like yasm+x86inc.asm will take a lot of work. As usual I also suspect that rewriting it from scratch would be easier...
Actually I looked into it and it's been pretty simple so far and I'm far from an assembly expert. It's only about 800 lines and it doesn't seem to use anything MASM-specific so it's been mostly just copy-and-paste extraction.
Myrsloik
18th October 2012, 23:33
So you are seriously going to port the asm despite people reporting the same speed without it?
This is just ridiculous. At least benchmark it properly to check that the asm gives a 50% boost or more; otherwise it must be truly horribly written.
What is it with avisynth filter writers and inline asm anyway?
SassBot
18th October 2012, 23:34
Like I said it's really just copy and paste. If it was going to take significant effort I wouldn't do it.
Myrsloik
18th October 2012, 23:37
Like I said it's really just copy and paste. If it was going to take significant effort I wouldn't do it.
The reason I object isn't that you're copying the asm as such. asm can be a fun hobby. The real reason is because of the maintainability going down the drain.
TheFluff
18th October 2012, 23:39
Actually I looked into it and it's been pretty simple so far and I'm far from an assembly expert. It's only about 800 lines and it doesn't seem to use anything MASM-specific so it's been mostly just copy-and-paste extraction.
Just nuke it. There's absolutely no benefit in keeping it, as Groucho demonstrated above.
Avisynth filter writers frequently suffer from the Gentoo Ricer Syndrome (http://funroll-loops.info/) and write hilariously unoptimized algorithms in assembler even though they don't really understand what they're doing, use ricer compiler switches that don't do what they think they do and use needlessly unreadable bitwise operations, all because "it's faster" and then never do any actual benchmarks. All of these are contributing factors as to why a lot of Avisynth filters are unmaintainable and unportable messes. The documentation really doesn't help in this regard either; the "simple filter example" that is supposed to teach people how you use the API very quickly goes completely bananas and ends up in a ridiculous mess of trying to teach you how to write "optimized" inline assembler.
If you have an opportunity to clean this up, please please please do so. Unmaintainable open source code could just as well be closed source. Maintaining existing assembler is really hard even if it's well written and documented; in this case it clearly isn't so just scrap it and let someone write new code from scratch if they want to optimize it.
Groucho2004
18th October 2012, 23:41
I tested again on my i5-2500K and got about 20% more speed with SSE2 compared to the C routines.
However, I have always used the C routines because of this little note in the Colormatrix manual:
Due to rounding differences, the output from the mmx and sse2 routines (only present for YV12) is not exactly the same as the output from the c routine (the c routine is more accurate). The maximum difference between the simd and c routines is +-1 on the Y/U/V planes.
And, I really don't care if the filter runs with 400 or 500 fps. :rolleyes:
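A difference of at most 1 is what one would expect when one code path rounds in floating point and the other uses fixed-point integer arithmetic. A minimal sketch of that effect, not the actual ColorMatrix code: the coefficient, the 16-bit fixed-point scale and the truncating shift are all assumptions chosen for illustration.

```python
import math

COEFF = 0.7152   # illustrative coefficient (assumption), not taken from the filter
SHIFT = 16       # assumed fixed-point precision in bits
COEFF_FIXED = round(COEFF * (1 << SHIFT))


def scale_float(x):
    """Reference path: multiply in floating point, round half up."""
    return math.floor(x * COEFF + 0.5)


def scale_fixed(x):
    """Integer path: fixed-point multiply, then truncate with a shift."""
    return (x * COEFF_FIXED) >> SHIFT


# Compare both paths over every 8-bit input value.
diffs = [scale_float(x) - scale_fixed(x) for x in range(256)]
print("max difference:", max(abs(d) for d in diffs))
```

Both paths compute the same product; only the final rounding step differs, which is why the results never drift apart by more than one code value.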
SassBot
19th October 2012, 00:03
Yeah, I saw a 15-25% speedup on the 3 systems I tested when comparing forced pure C against SSE2. So it's enough that I think the mostly minimal copy-and-paste effort is worth it. It's consisted of cutting, pasting, clapping on the prologue and ret macros and renaming some arguments. If it was going to take days or weeks of time, then I'd agree, but it's pretty simplistic "porting".
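A speedup figure like this can be checked with a repeatable timing loop. A rough sketch, assuming two interchangeable per-frame functions; `process_c` and `process_simd` are hypothetical stand-ins written for this example, not the filter's real routines:

```python
import time


def measure_fps(process_frame, frame, n_frames=200):
    """Run process_frame n_frames times and return frames per second."""
    start = time.perf_counter()
    for _ in range(n_frames):
        process_frame(frame)
    elapsed = time.perf_counter() - start
    return n_frames / elapsed


def speedup_percent(fps_base, fps_new):
    """Relative gain of fps_new over fps_base, in percent."""
    return (fps_new - fps_base) / fps_base * 100.0


# Hypothetical stand-ins for the C and SSE2 code paths.
def process_c(frame):
    return [min(255, p + 1) for p in frame]


def process_simd(frame):
    return bytes(min(255, p + 1) for p in frame)


frame = bytes(range(256)) * 16
fps_c = measure_fps(process_c, frame)
fps_simd = measure_fps(process_simd, frame)
print(f"C: {fps_c:.0f} fps, SIMD stand-in: {fps_simd:.0f} fps, "
      f"gain: {speedup_percent(fps_c, fps_simd):+.1f}%")
```

Running the loop several times and on more than one machine gives a better idea of how stable the percentage is.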
SassBot
19th October 2012, 00:21
However, I have always used the C routines because of this little note in the Colormatrix manual:
Due to rounding differences, the output from the mmx and sse2 routines (only present for YV12) is not exactly the same as the output from the c routine (the c routine is more accurate). The maximum difference between the simd and c routines is +-1 on the Y/U/V planes.
And, I really don't care if the filter runs with 400 or 500 fps. :rolleyes:
Sure they are not exactly the same output but can you really see the difference? I doubt it.
http://imageshack.us/a/img823/9199/shot1l.png
http://imageshack.us/a/img42/8043/shot2v.png
If you have an opportunity to clean this up, please please please do so. Unmaintainable open source code could just as well be closed source. Maintaining existing assembler is really hard even if it's well written and documented; in this case it clearly isn't so just scrap it and let someone write new code from scratch if they want to optimize it.
I'll clean up the C++ code for sure. The assembly I'm really just copying over because it does give some benefit for the people who want to run it with the SSE2 optimization. I'll probably just drop the MMX and SSE and just keep it as the C or SSE2. If anyone really wants the other assembly versions they can compile their own version. As I said, I'm not investing any significant effort into this because it honestly isn't worth it that much, but copying and pasting the SSE2 code over and getting it to assemble with yasm took literally 30 minutes after working out some Visual Studio wonkiness with x86inc.asm's name mangling that was causing some linker errors.
Wilbert
19th October 2012, 00:21
The documentation really doesn't help in this regard either; the "simple filter example" that is supposed to teach people how you use the API very quickly goes completely bananas and ends up in a ridiculous mess of trying to teach you how to write "optimized" inline assembler.
In what way does it go bananas? The simple filter example doesn't contain any assembler code at all, so i'm curious what you mean here.
TheFluff
19th October 2012, 04:03
In what way does it go bananas? The simple filter example doesn't contain any assembler code at all, so i'm curious what you mean here.
Seems I either misremembered or it's been removed. I have a distinct memory of being angry about asm in simplesample but maybe my brain is just making things up.
StainlessS
19th October 2012, 17:17
The Assembler Optimizing section is in the SDK.
http://avisynth.org/mediawiki/Filter_SDK/Assembler_optimizing
kolak
19th October 2012, 17:23
Sure they are not exactly the same output but can you really see the difference? I doubt it.
http://imageshack.us/a/img823/9199/shot1l.png
http://imageshack.us/a/img42/8043/shot2v.png
Well I can see it easily.
SassBot
19th October 2012, 17:50
I can only hope you're being sarcastic... Subtract the two images in Avisynth and you essentially get an almost perfect gray image. Running coloryuv on the two versions gives a difference of .01 on each plane. There is no way the difference is even remotely perceptible. If you aren't being sarcastic then you are basically imagining a difference. :p
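The subtraction check can also be expressed numerically. A small sketch using plain Python lists as stand-ins for one plane of each image; the sample values and the mid-gray offset of 128 are illustrative assumptions, not data from the screenshots:

```python
def subtract_plane(a, b, offset=128):
    """Per-pixel a - b plus a gray offset, clamped to 0..255."""
    return [max(0, min(255, pa - pb + offset)) for pa, pb in zip(a, b)]


def mean_abs_diff(a, b):
    """Average absolute per-pixel difference between two planes."""
    return sum(abs(pa - pb) for pa, pb in zip(a, b)) / len(a)


# Made-up sample planes that differ by at most 1 code value.
plane_c = [16, 64, 128, 200, 235, 90, 91, 92]
plane_sse2 = [16, 65, 128, 200, 234, 90, 91, 92]

diff = subtract_plane(plane_c, plane_sse2)
print(diff)
print(mean_abs_diff(plane_c, plane_sse2))
```

Identical pixels land exactly on the offset, so a result that stays within one step of it corresponds to the "almost perfect gray image".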
Myrsloik
19th October 2012, 18:07
I also call bullshit on this one. I have a calibrated ips screen and even after minutes of staring both pictures look identical.
kolak
19th October 2012, 19:29
For me the bottom one is a bit darker/more saturated, especially in the top-left quarter (and on reds).
Funnily enough, I loaded them into Photoshop and thought that there was no difference, but difference mode shows a difference, exactly in the places I described :)
I think it was more imaginary and due to LCD angles, even though I have a very good monitor, but I'm also convinced that I did see a "real" difference :)
SassBot
19th October 2012, 19:39
No, it's not. You are imagining a difference that does not exist. The pictures are basically identical beyond the value difference in the YUV planes being .01 between them. Here are the two images run through Histogram. If there was a brightness or saturation difference the graphs would not be almost exactly the same.
http://imageshack.us/a/img525/2908/shot1a.png
http://imageshack.us/a/img88/2199/shot2h.png
To add further, this is the output from changing them with only Tweak(sat=1.1,bright=.9). It's hardly a dramatic change, yet you can see the YUV graphs are noticeably different.
http://imageshack.us/a/img24/7752/shot3j.png
kolak
19th October 2012, 19:52
Yes, they are almost identical, basically identical, but funnily enough the difference is in the areas which I described before I checked it in Photoshop. Probably a simple coincidence.
I think it's due to the monitor's viewing angle imperfections.
SassBot
19th October 2012, 19:54
Yes, as I said on the previous page you get basically a perfect gray image by subtracting them. So, in conclusion, while the doc did have a warning that there will potentially be mathematically different values between the C and SSE2 routines, the difference is hardly going to be perceptible.
kolak
19th October 2012, 19:59
I would imagine that difference at this level is not a problem at all.
ajp_anton
19th October 2012, 23:07
If you're looking at them side by side, your conclusion means nothing because, as you said, the monitor's angle differences will be much larger anyway.
I also have a high quality screen, and I can't spot a difference even if I flip back and forth between them in different tabs.
jmac698
22nd October 2012, 21:44
http://www.screenshotcomparison.com/comparison/153403
I admit I can't see a difference.
ryrynz
25th October 2012, 00:38
It'll likely be the viewing angle properties of his LCD screen affecting the contrast of the bottom image more as he's not doing a proper comparison in an image viewer and instead looking at them as they're presented on the forum.
jackoneill
11th November 2012, 10:44
Hello.
Could you share that ported assembly code, please?
Or... nevermind now, I guess.