RGVS float version [Archive]

feisty2

21st June 2015, 12:58

binary (x64+IA32) https://github.com/IFeelBloated/RGSF/releases/tag/r5
source
https://github.com/IFeelBloated/RGSF

32-bit floating point version of the RemoveGrain plugin
all single precision floating point color spaces are supported (GRAYS/YUV4xxPS/RGBS)

low precision color spaces (anything below 32 bits) are not supported!!! DO NOT APPLY IT TO THOSE VIDS!!!

feisty2

21st June 2015, 16:37

https://github.com/IFeelBloated/RGVS/releases/tag/r2
r2
fixed a bug in removegrain mode5
new: rgsf.Repair
new: rgsf.Cleanse
new: rgsf.ForwardCleanse
new: rgsf.BackwardCleanse
new: rgsf.VerticalCleaner

it now has every function rgvs has, so it's a complete replacement for rgvs on float clips

jackoneill

21st June 2015, 18:58

But... why isn't it integrated into the existing rgvs source code?

feisty2

21st June 2015, 19:09

Cuz I'm a lame ass newbie programmer, I just wrote it for temporary emergency, I'll delete it when you guys come up with better code, like asm opt for float32

feisty2

26th June 2015, 14:13

r3
https://github.com/IFeelBloated/RGVS/releases/tag/r3
new binary compatible with vspipe

MonoS

27th June 2015, 23:03

Hi feisty, I'd like to do some optimization like I did with dctfilter. I'll do AVX optimization and remove the C version. I'll make all the modifications on my fork without testing [maybe I'll test them if they compile without hurting me].

Let me know if you need lower SSE optimization, I may do it.

EDIT: Also, asm optimization is simpler than you'd expect, it's just a matter of thinking "I just need to do this x times at once", with x being 4 if you use SSE2, 8 for AVX and 16 for AVX-512 (for 32-bit floats).
I suggest watching the Handmade Hero episodes about SIMD optimization.

EDIT2: Done with the first batch of conversions, you can find all the commits at https://github.com/MonoS/RGVS
I have NOT tested the code, but at least it's in a compiling state.
I think the code will crash due to unaligned loads/stores, I'll fix this tomorrow, but I hope there won't be any other bugs introduced by my translation to AVX.
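
To make the "do this x times at once" idea concrete, here is a minimal sketch (not taken from either repo) of a per-pixel clamp done 4 floats at a time with SSE intrinsics. The function names and the per-pixel `lo`/`hi` bounds are made up for illustration; for a RemoveGrain-style clamp the bounds would come from the neighbouring pixels. It uses the unaligned load/store intrinsics, which work on any address, and hands the leftover pixels at the end of the row to the plain C++ version.

```cpp
#include <algorithm>
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

// Plain C++ reference: clamp each pixel of src to [lo, hi], one at a time.
inline void clamp_row_scalar(const float* src, const float* lo, const float* hi,
                             float* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = std::min(std::max(src[i], lo[i]), hi[i]);
}

// Same operation, four floats per iteration where SSE is available.
inline void clamp_row_simd(const float* src, const float* lo, const float* hi,
                           float* dst, std::size_t n) {
    std::size_t i = 0;
#ifdef HAVE_SSE
    for (; i + 4 <= n; i += 4) {
        // loadu/storeu accept unaligned addresses; the aligned variants
        // (_mm_load_ps/_mm_store_ps) require 16-byte alignment.
        __m128 v = _mm_loadu_ps(src + i);
        v = _mm_max_ps(v, _mm_loadu_ps(lo + i));
        v = _mm_min_ps(v, _mm_loadu_ps(hi + i));
        _mm_storeu_ps(dst + i, v);
    }
#endif
    // Scalar tail: the last n % 4 pixels (or the whole row without SSE).
    clamp_row_scalar(src + i, lo + i, hi + i, dst + i, n - i);
}
```

An AVX version would follow the same pattern with 8 floats per iteration. NaN inputs are not considered here; the scalar and SSE min/max may treat them differently.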

feisty2

28th June 2015, 07:51

I know shit about asm.... and not planning to learn it any time soon, just be my guest, do whatever you think, okay or cool or.. kinda stuff
:)

MonoS

28th June 2015, 10:55

At least tell me what your CPU capabilities are [you can check using cpuz or x264, at the line "using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX"], so that at least you can use and test the new codepath.

Please, listen to me: I've probably been using C/C++ for more years than you [well, I've also released some games written in C/C++], but until two months ago I thought the same about asm optimization. Then I saw Handmade Hero, told myself "this is not hard" and did some [as someone on this forum has noticed].
If you need a hand, please contact me, I can teach you something if you need :)

PS: if you can test the new version you'll make me very happy, testing is BOOORING and I prefer using that time for translation XD

feisty2

28th June 2015, 11:37

I'm... still at the "Hello World!" kinda stage of cpp, ain't got it through yet, so really not ready to learn asm I guess, c'mon, I'm just 18 and graduated from high school months ago, I'm not even, like, a freshman college nerd before Sep 29, I don't think I got enough stuff in my head to handle asm...
but I'll test your code and report back...

feisty2

28th June 2015, 12:16

guess I won't be able to test the AVX version for the next couple of days, my main computer with an i7 4960x is doing some simulation stuff about black hole (I'm gonna major in physics) on matlab, now I only got an outdated low end computer available, and cpuz tells me it supports instructions up to SSE4A at most

feisty2

28th June 2015, 15:09

I'm getting lots of errors in "repair.cpp" (uninitialized local variable "a" used), so removed it
cleanse compiled smoothly, but I can't test it now (no avx support on this shitty computer)
binary (x64), test it yourself, https://github.com/IFeelBloated/RGVS/releases/tag/tmp

MonoS

28th June 2015, 15:59

Those are warnings, not errors, so they don't harm the compilation; a good compiler will remove these variables automatically and produce no code for them.

I don't remember having problems compiling it, well, I didn't even try to be honest XD
Thanks, will do asap

feisty2

28th June 2015, 16:20

those are errors in vs2015 RC, gotta remove "repair.cpp" or it will deny to give you a binary
it's been an exciting and proud couple of days for Americans lately, it's 8:21 AM, I'll go out and join the celebration, fuck yeah!

MonoS

28th June 2015, 16:35

Please celebrate for me too :D

Also, maybe you turned on some option like "treat warnings as errors", that would be why.

jackoneill

28th June 2015, 18:57

Please always keep a (correct, up to date) plain C/C++ equivalent for every function with SIMD optimisations. It will make it easier for other people (and future you) to understand what the code does, and it allows the plugin to run on systems without those SIMD instruction set(s).

(There is no Awarpsharp2 plugin for VapourSynth yet because someone decided years ago that the Avisynth plugin didn't really need plain C equivalents for the most important functions in it. :mad: )

MonoS

28th June 2015, 19:08

And you are perfectly right. I've done so because feisty2's repo is a fork that will be merged someday; same as what I did for dctfilter, it's more of a POC than a full-fledged release.

I'm also doing some optimization on ContinuityFixer, and there I'll comment out the C equivalent and leave only the AVX implementation [as for 16bit, I don't want to waste time now learning how to build a CPU dispatcher, or maybe I could use Agner Fog's vector class library, maybe...].

Also, in this precise case, a C equivalent is still present because I need it for the last few pixels in each row (fewer than a full vector of 8), and the code is in a repo, so I can always go back to a previous commit and look at the code.
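
For what it's worth, keeping the plain path around and choosing between it and an optimised path at run time does not need much machinery. The sketch below is only an illustration under assumptions: all names are hypothetical, the "optimised" path is a stand-in written in plain C++ (blocks of 8, then the plain version for the leftover pixels), and `cpu_has_wide_simd` is a placeholder for whatever CPU feature check a real dispatcher would perform.

```cpp
#include <algorithm>
#include <cstddef>

// Signature shared by every code path (hypothetical, for illustration only).
using ClampRowFn = void (*)(const float* src, float* dst, std::size_t n,
                            float lo, float hi);

// Plain C++ reference: always compiled, runs on any CPU.
void clamp_row_plain(const float* src, float* dst, std::size_t n,
                     float lo, float hi) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = std::min(std::max(src[i], lo), hi);
}

// Stand-in for an optimised path: whole blocks of 8 pixels first,
// then the plain version for whatever is left at the end of the row.
void clamp_row_blocked(const float* src, float* dst, std::size_t n,
                       float lo, float hi) {
    constexpr std::size_t kBlock = 8;
    std::size_t i = 0;
    for (; i + kBlock <= n; i += kBlock)
        for (std::size_t j = 0; j < kBlock; ++j)  // fixed trip count per block
            dst[i + j] = std::min(std::max(src[i + j], lo), hi);
    clamp_row_plain(src + i, dst + i, n - i, lo, hi);
}

// Run-time selection. The flag is a placeholder for a real CPU feature query.
ClampRowFn select_clamp_row(bool cpu_has_wide_simd) {
    return cpu_has_wide_simd ? clamp_row_blocked : clamp_row_plain;
}
```

Because both paths share one signature, the plain version doubles as a reference: run both on the same row and compare the outputs.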

Please tell me if any of my thinking is wrong

MonoS

29th June 2015, 18:00

I've translated modes 19-23, 1 and 17.
The others need sorting or equality tests and I want to tackle them with a fresh mind; if someone has hints on how to do things, let me know.

feisty2

1st July 2015, 08:12

https://github.com/vapoursynth/vapoursynth/blob/master/src/filters/removegrain/removegrainvs.cpp
asm code is there for every single mode, I don't think it would be hard to make them floating point if you know asm, "know" know, not just the name for sure, like already, know asm...

MonoS

1st July 2015, 20:30

Well, the original code already uses intrinsics, so it's just a matter of translating them.
I'll probably need to change the equality tests.
With integers it's pretty trivial to test equality, there are very few values, but floats create a LOT of rounding issues, so I'll need to think about it.
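
One common way around the float equality problem is to compare with a tolerance instead of `==`. The helper below is only a sketch under assumptions: the name `nearly_equal` and both default tolerances are made up for illustration, and whether a tolerance is appropriate at all depends on what each mode uses the comparison for.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper: true when a and b are within abs_tol of each other
// (useful near zero) or within rel_tol of the larger magnitude.
inline bool nearly_equal(float a, float b,
                         float rel_tol = 1e-5f, float abs_tol = 1e-7f) {
    const float diff = std::fabs(a - b);
    if (diff <= abs_tol)
        return true;
    return diff <= rel_tol * std::max(std::fabs(a), std::fabs(b));
}
```

A value that picked up a few rounding errors along the way, such as 0.1f added up ten times, then still compares as equal to 1.0f, while values that are visibly different, such as 0.5f and 0.6f, do not.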

Your last sentence, other than being pretty funny, is almost correct: it's enough to know the names of the intrinsics to call and then magic happens XD.

EDIT: if only Jai were available I'd make a combined version of the integer RGVS and the float one, but sadly it's not released yet.
Also, I'd like to know how he deals with out of boundary pixels: he loads 8 or 16 pixels at a time but doesn't seem to care about going past the border.

jackoneill

1st July 2015, 20:43

Also, i'd like to know how he deal with out of boundary pixel

Don't touch them at all.

MonoS

1st July 2015, 21:07

I think you are referring to the first and last lines of the frame. I'm talking about the right side of the frame, and he does exactly what I do: exclude the last pixels and process them in C.

feisty2

18th January 2016, 16:59

https://github.com/IFeelBloated/RGSF/releases/tag/r4
r4
1.added support to all floating point color spaces
2.can get along with the default removegrain plugin and not tryna kill each other now

feisty2

25th January 2016, 10:33

https://github.com/IFeelBloated/RGSF/releases/tag/r5
r5
precision boost, float -> double for all internal processes
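
As a rough illustration of why doing the intermediate math in double can help (a generic sketch, not taken from the RGSF source, with made-up function names): adding up many float values in a float accumulator drifts, while a double accumulator that is rounded back to float once at the end stays much closer to the exact result.

```cpp
#include <cmath>
#include <cstddef>

// Adds x to itself n times, keeping the running total in float.
inline float sum_in_float(float x, std::size_t n) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        acc += x;
    return acc;
}

// Same sum, but the running total is a double; the result is rounded
// back to float only once, at the very end.
inline float sum_in_double(float x, std::size_t n) {
    double acc = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        acc += x;
    return static_cast<float>(acc);
}

// Absolute distance between a float result and a double reference value.
inline double abs_error(float value, double exact) {
    return std::fabs(static_cast<double>(value) - exact);
}
```

Comparing `abs_error(sum_in_float(0.1f, n), exact)` with `abs_error(sum_in_double(0.1f, n), exact)` for a large `n`, with `exact` computed as `double(0.1f) * n`, shows the difference.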

Quadratic

16th September 2021, 12:46

Does anyone have this working on Linux? I am facing the same issue as here: https://github.com/IFeelBloated/RGSF/issues/1

WolframRhodium

16th September 2021, 13:37

It works on my machine by compiling with

g++ -shared -o librgsf.so *.cpp -O3 -march=native

Quadratic

16th September 2021, 14:51

I apologise for my incompetence, but I do not understand where I am going wrong.

Using your command, vsedit shows me autocomplete suggestions for the plugin but when I attempt to preview the frame it closes with a segfault "[1] 5192 segmentation fault (core dumped) vsedit"
Inspecting the file with vspipe tells me "AttributeError: No attribute with the name rgsf exists. Did you mistype a plugin namespace?"

This information is conflicting, and I don't know how to proceed.

Edit:
The vsedit error was because I forgot to change my clip to RGBS. My mistake. However, the second problem remains: despite being able to preview the script (and benchmark it with VSEdit), I cannot call the script with vspipe.

For whatever reason, other scripts attempt to claim that lsmas does not exist. "AttributeError: No attribute with the name lsmas exists. Did you mistype a plugin namespace?". This is completely false, I have a working lsmas install and use it frequently...

Edit again:
I am asking here: https://forum.doom9.org/showthread.php?p=1952415#post1952415