ColorMatrixTransform v0.1

TomArrow · 20th November 2021, 14:09

Hey guys,

I'm not sure if this is a "dupe" but I couldn't find any plugin that does this, so I made it myself.

Download and code (x64 and x86):
https://github.com/TomArrow/ColorMat...eases/tag/v0.1

You provide an input clip (only RGBPS is supported atm) and exactly 12 float values. The 12 values are a color transformation matrix, allowing you to convert between color spaces, apply color calibration matrices or do any other fancy effects you might be tempted to do.

The 12 float values you provide are like in a 3x4 matrix, in this order:
aa ab ac ad
ba bb bc bd
ca cb cc cd

The last column holds the offsets, similar to what many RGB mixer filters (Photoshop, After Effects,...) provide.

A call that wouldn't affect the image at all would look like this (identity matrix):

Code:

ColorMatrixTransform(1,0,0,0,0,1,0,0,0,0,1,0)
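For anyone wondering what the 12 values do to a single pixel, here is a minimal Python sketch. It assumes the usual reading of a 3x4 matrix with offsets in the last column (each output channel is a weighted sum of R, G and B plus its offset); it is not the plugin's actual C++ code, and the name `apply_color_matrix` is made up for illustration.

```python
def apply_color_matrix(pixel, m):
    """Apply 12 row-major coefficients (a 3x4 matrix) to one (r, g, b) float pixel.

    Assumed layout, following the post: aa ab ac ad / ba bb bc bd / ca cb cc cd,
    where the fourth value of each row is an offset added after the weighted sum.
    """
    if len(m) != 12:
        raise ValueError("expected exactly 12 coefficients")
    r, g, b = pixel
    return tuple(
        m[row * 4] * r + m[row * 4 + 1] * g + m[row * 4 + 2] * b + m[row * 4 + 3]
        for row in range(3)
    )


# Same values as the identity call above.
IDENTITY = (1, 0, 0, 0,
            0, 1, 0, 0,
            0, 0, 1, 0)

print(apply_color_matrix((0.25, 0.5, 0.75), IDENTITY))  # (0.25, 0.5, 0.75)
```

With the identity values every channel passes through unchanged, which is why that call leaves the image untouched.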

Simple use case example:
1. Let's say you want to convert to a different color space that isn't built into something like Avsresize.

2. You go get yourself the correct color transform matrix you need: http://color.support/colorspacecalculator.html

3. Use Avsresize to convert to linear RGBPS (make sure the color matrix you specify in z_ConvertFormat is the source color space of your transform matrix).

4. Pass the 12 values of that matrix to ColorMatrixTransform. Done.

5. Optionally, now do whatever adjustments you wanted to make in this color space and convert back.

This could also be used for very fancy white point adjustments in the Bradford color space and whatever else you can think of.

I don't know how well it will perform, but I at least used OpenMP on one of the loops, so there should be a reasonable amount of multiprocessing and I'm hoping the optimizer did some vectorization. Runs somewhat well for me. If anyone wants to contribute more advanced optimizations, feel free to send a pull request.

Edit:

Possible other use cases as I come up with them:
- Converting to black and white with your own weights
- Fine tuning saturation/separation of different colors
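As a rough illustration of those two use cases, the 12 values could be generated like this in Python. The helper names are hypothetical, and the Rec.709 luma weights (0.2126, 0.7152, 0.0722) are just one example choice of weights; the plugin itself only sees the resulting 12 numbers.

```python
def grayscale_matrix(wr, wg, wb):
    """12 values that write the same weighted sum into R, G and B (offsets 0)."""
    row = (wr, wg, wb, 0.0)
    return row * 3


def saturation_matrix(s, wr, wg, wb):
    """Blend between grayscale (s=0) and identity (s=1); s > 1 boosts saturation."""
    weights = (wr, wg, wb)
    m = []
    for row in range(3):
        for col in range(3):
            identity = 1.0 if row == col else 0.0
            m.append((1.0 - s) * weights[col] + s * identity)
        m.append(0.0)  # offset column stays at zero
    return tuple(m)


def as_call(m):
    """Format the 12 values as a ColorMatrixTransform(...) call string."""
    return "ColorMatrixTransform(" + ", ".join(repr(float(v)) for v in m) + ")"


# Example weights only: Rec.709 luma coefficients.
REC709 = (0.2126, 0.7152, 0.0722)

print(as_call(grayscale_matrix(*REC709)))
print(as_call(saturation_matrix(1.2, *REC709)))
```

Because the weights sum to 1, neutral grays stay neutral at any saturation setting; only colored pixels move.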

Dogway · 20th November 2021, 17:38

Thanks! It only works in 32-bit float for now, right?

TomArrow · 20th November 2021, 18:06

Quote:

Originally Posted by Dogway

Thanks! It only works in 32-bit float for now, right?

Correct. It's arguably wiser to work only in floating point anyway because color space conversions can result in negative values.
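A small Python sketch of why that matters. The coefficients below are the rounded values commonly quoted for converting linear BT.2020 RGB to BT.709 primaries; treat them as illustrative rather than authoritative, and note that the function names are made up for this example.

```python
# Rounded linear-light BT.2020 -> BT.709 coefficients in the 3x4 layout
# (offset column is zero). Illustrative values, not taken from the plugin.
BT2020_TO_BT709 = (
     1.6605, -0.5876, -0.0728, 0.0,
    -0.1246,  1.1329, -0.0083, 0.0,
    -0.0182, -0.1006,  1.1187, 0.0,
)


def apply_color_matrix(pixel, m):
    """Weighted sum of r, g, b plus the offset, for each of the three rows."""
    r, g, b = pixel
    return tuple(
        m[row * 4] * r + m[row * 4 + 1] * g + m[row * 4 + 2] * b + m[row * 4 + 3]
        for row in range(3)
    )


def clamp01(pixel):
    """What a format limited to the 0..1 range would effectively keep."""
    return tuple(min(1.0, max(0.0, c)) for c in pixel)


pure_green = (0.0, 1.0, 0.0)  # fully saturated green in the wider gamut
converted = apply_color_matrix(pure_green, BT2020_TO_BT709)

print(converted)           # red and blue come out negative, green above 1.0
print(clamp01(converted))  # clamping discards that out-of-range information
```

In float the out-of-range values survive, so a later transform back can recover the original color; once clamped, it cannot.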

Let me know if other color spaces are desired and if so, which.

Dogway · 20th November 2021, 18:33

From a usability standpoint maybe it would be useful to allow transposing, inverting and array inputs with broadcasting. The thing is that a 3x4 matrix is going to get in the way of most matrix definitions, maybe the offset args could be moved to last position to allow both things or just auto broadcast when array size is 9.

I will run some benchmarks.

EDIT: runs a bit slower than Expr(), 113fps vs 143 fps, I used 16-bit clip in Expr() though (internally works in float)

TomArrow · 20th November 2021, 21:32

Quote:

Originally Posted by Dogway

From a usability standpoint maybe it would be useful to allow transposing, inverting and array inputs with broadcasting. The thing is that a 3x4 matrix is going to get in the way of most matrix definitions, maybe the offset args could be moved to last position to allow both things or just auto broadcast when array size is 9.

I will run some benchmarks.

EDIT: runs a bit slower than Expr(), 113fps vs 143 fps, I used 16-bit clip in Expr() though (internally works in float)

Thanks for the feedback. Can you explain what transposing and those other terms mean, and what your use case is? The order was chosen based on how the Adobe suite typically presents it, but it would be no problem to just add an alternate function name for a different order if desired, or one that just works without offsets.

Does Expr allow matrix transforms like this? It's actually one of the first things I looked into when I tried doing this but I couldn't figure it out. I know how to apply operations on individual pixels in Expr but I don't know how to take the other channels into account in the calculation.

Dogway · 20th November 2021, 21:46

No problem, I crafted a wrapper function of sorts around Expr() to work as a Matrix transform, look here. A bit below it you can see examples for transpose and invert, and the order I follow. I think offsets are nice but not commonly used(?)
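For readers unfamiliar with the terms, this is what transposing and inverting a 3x3 matrix mean in plain linear algebra, sketched in Python. It is not the Expr() wrapper mentioned above; the function names are hypothetical, and padding a 9-value matrix with zero offsets is only one possible reading of the "auto broadcast when array size is 9" suggestion.

```python
def transpose3(m9):
    """Swap rows and columns of a row-major 3x3 matrix given as 9 values."""
    return tuple(m9[col * 3 + row] for row in range(3) for col in range(3))


def invert3(m9):
    """Inverse of a row-major 3x3 matrix via the adjugate and determinant."""
    a, b, c, d, e, f, g, h, i = m9
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("matrix is singular and cannot be inverted")
    adjugate = (
        e * i - f * h, c * h - b * i, b * f - c * e,
        f * g - d * i, a * i - c * g, c * d - a * f,
        d * h - e * g, b * g - a * h, a * e - b * d,
    )
    return tuple(x / det for x in adjugate)


def with_zero_offsets(m9):
    """Pad 9 values (3x3) to the 12-value 3x4 layout with all offsets set to 0."""
    out = []
    for row in range(3):
        out.extend(m9[row * 3:row * 3 + 3])
        out.append(0.0)
    return tuple(out)


m = (1, 2, 3,
     4, 5, 6,
     7, 8, 9)
print(transpose3(m))  # (1, 4, 7, 2, 5, 8, 3, 6, 9)
print(with_zero_offsets(invert3((2, 0, 0, 0, 4, 0, 0, 0, 8))))
```

Inverting is what lets you undo a conversion (matrix A takes you from space X to Y, its inverse takes you back), and transposing matters because some sources list a matrix column by column instead of row by row.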

It would be great if you could vectorize the code. I know nothing about programming but maybe there's room for improvement.

TomArrow · 20th November 2021, 23:41

Aah. So you extract each plane as its own clip. And that performance comparison you did was your wrapper against my plugin?

If so, I'm guessing there is no need for my plugin in the first place then. Vectorizing better is a good idea but I'm still a beginner in C++ so for the most part I just hoped that the compiler would do some of this work for me.

Looking at the Expr documentation it seems it converts the expressions to assembly. That kind of stuff is definitely beyond my skill set.

If I get any good ideas to improve performance I'll make a new post, and if someone wants to contribute faster code that's of course also welcome. But I don't think I can make any promises, especially since I also have other projects and this looks like something with a steeper learning curve.

Interesting though about the array stuff. I had no idea AviSynth supported arrays. I'll have to look into that someday.

Dogway · 21st November 2021, 00:08

Yes it was against my wrapper, fair to say that just a few days ago I was running 3 Expr(3 clips) and 1 CombinePlanes() because I didn't know about the merge feature of Expr(), it was then about same speed as ColorMatrixTransform().

Maybe you can have a look at similar plugins like avsresize or fmtconv to see how they did it. I think that currently anything that isn't super optimized or very specific isn't worth it, but there are lots of things I would like to see as plugins, like a Median filter, a Hysteresis plugin, an analysis tool like ShowChannels() but with HBD support, an optimized box average filter, an HMT fill mode, segmentation filters, etc. Many things can't be done with just avs syntax.

TomArrow · 21st November 2021, 01:10

Ah I see, interesting. I agree, it would only make sense if it's faster.

Meanwhile I have been taking some baby steps and looked into how auto-vectorization works. Turns out you can print diagnostics and it showed me that it wasn't doing any vectorization at all.

So now I tuned it a bit and it does do it now. In addition, I've now also created AVX2 and AVX512 compiles, where the old one was merely AVX (because that's what my CPU supports).

Would you do another one of your benchmarks with the updated version?
https://github.com/TomArrow/ColorMat...ses/tag/v0.1.1

Just pick whatever instruction set your CPU can handle, though the AVX512 is with a newer compiler version which in my tests seemed to actually produce a bit slower code, so if you try the AVX512 one, maybe also try the AVX2 one. And for AVX and AVX2 I included a second version called "OMPtest" which has the OpenMP parallelization of the outer loop (rows) while the other one does not have this.

To me subjectively it feels snappier now, but maybe I'm imagining it. Whereas with the OpenMP thing I really can't tell and I don't know how to do benchmarks.

Dogway · 21st November 2021, 01:23

Thanks, I tested with AVX2 as my CPU doesn't support AVX512. The OpenMP build was kinda slower than before, the other one was a little bit faster, maybe 114 or 115 fps. I'm testing with Prefetch(6), which gives me the highest numbers on my CPU. You can run benchmarks with AVSMeter, very useful tool.

Maybe someone else can give you hints on SIMD optimization, looking over stackoverflow is also good practice.

TomArrow · 21st November 2021, 01:29

Oh well that's quite the letdown then. I could swear it felt faster to me but such is life.

Guess I'm at my wits' end for the time being then.
