24th March 2020, 14:15 | #1 | Link |
I'm Siri
Join Date: Oct 2012
Location: void
Posts: 2,633
|
vsFilterScript: writing C++ plugins like python scripts (WIP)
https://github.com/IFeelBloated/vsFilterScript
This is yet another C++ wrapper for VSAPI. However, it is much higher level than vsxx and provides a "scripting" kind of experience to help you sketch your filter in the fastest possible way.
Take a look at the 3x3 Gauss blur example: less than 40 lines of code and you've got your filter up and running. A temporal median example is also provided to show you how to write temporal or spatiotemporal filters. There are 2 more examples, Crop and Rec601ToRGB, showing you filters with advanced VapourSynth features. Crop shows you how to write filters that modify the measures of the input (e.g. frame size) and filters that adapt to inputs with arbitrary bitdepths. Rec601ToRGB converts a YUV444 clip to RGB using the Rec601 matrix; it shows you how to write filters that modify the format of the input (e.g. YUV->RGB) and how to manipulate frame properties.
C++20 support is required (you probably need GCC10 from the trunk). The scripting-style syntax is only possible with the latest C++ standard.
I haven't finished porting all C APIs to this wrapper, but the filter skeleton generator is here: https://github.com/IFeelBloated/vsFi.../Interface.hxx
It requires certain properties, some constants and some member functions, as shown in the example filter. The skeleton generator works in a duck-typing manner: it generates a set of skeleton functions as long as the filter struct has all the required properties.
You should write each filter in a header file, include the headers in "EntryPoint.cxx" and register each filter with "VaporInterface::RegisterFilter".
The "Clip" object can be accessed as a 4D array ([time (frame)][channel][height][width]) with "GetFrames" or as a 3D array ([channel][height][width]) with "GetFrame". The "time" dimension is relative to the current frame (t=0 for the current frame, t=-1 for the previous frame and t=1 for the next); the other 3 dimensions are absolute.
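To make the indexing scheme concrete, here is a minimal sketch in plain Python (not vsFilterScript code). The names `ToyClip`, `get_frame` and `get_frames` are hypothetical stand-ins for the "GetFrame"/"GetFrames" accessors, and the clip is just nested lists. It only shows how a relative time offset maps to an absolute frame number; offsets that leave the clip are simply rejected in this sketch.

```python
class ToyClip:
    """A clip stored as frames[n][channel][y][x], with absolute frame numbers n."""

    def __init__(self, frames):
        self.frames = frames

    def get_frame(self, n):
        # 3D view: [channel][height][width] of absolute frame n.
        return self.frames[n]

    def get_frames(self, n):
        # 4D view centred on frame n: the first index is a relative offset t.
        clip = self

        class Window:
            def __getitem__(self, t):
                index = n + t  # relative offset -> absolute frame number
                if not 0 <= index < len(clip.frames):
                    raise IndexError("offset leaves the clip (not handled in this sketch)")
                return clip.frames[index]

        return Window()


# Three 1-channel 1x2 frames whose pixel values encode the frame number.
clip = ToyClip([[[[10 * n, 10 * n + 1]]] for n in range(3)])
window = clip.get_frames(1)           # "current frame" is frame 1
previous_pixel = window[-1][0][0][1]  # t=-1 -> frame 0, channel 0, y=0, x=1
current_pixel = window[0][0][0][1]    # t=0  -> frame 1
next_pixel = window[1][0][0][1]       # t=1  -> frame 2
```

Only the first index of the 4D view is relative; channel, row and column are addressed exactly as in the 3D view.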
Out-of-bound access is allowed and triggers automatic padding. The behavior of out-of-bound access is defined by concrete padding policies (repeat, reflect, zero...) and the default padding policy is "repeat" for both spatial and temporal dimensions. More details about this part are defined in Plane.hxx, Frame.hxx and Clip.hxx.
Latest update:
New functionality: full integration of C++ exceptions. With exceptions, you no longer have to manually handle any of the following errors:
a) failing to invoke an external plugin (plugin does not exist)
b) failing to invoke an external filter
c) failing to invoke a Python function
d) failing to invoke SelfInvoker
... and possibly many more.
SelfInvoker is now allowed to throw exceptions, so the earlier restriction requiring SelfInvoker to always be successfully evaluated has been removed. Any of these errors will transparently pass through your filters and propagate to a root caller like Create(), which automatically handles any error. To you, it would be like the error does not exist, so you NEVER have to worry about errors. It's now one step closer to Python scripts.
Initialize() has been replaced by normal constructors because, with exceptions, it is no longer required to return a value to indicate whether the filter has been successfully constructed.
Last edited by feisty2; 8th October 2020 at 11:45. |
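The padding behaviour described in this post can be sketched roughly as follows, in plain Python rather than the wrapper's C++. Everything here is an assumption made for illustration: the helper names `resolve`, `sample` and `gauss_blur_3x3` are hypothetical, and "reflect" is taken to mean mirroring without duplicating the edge sample (as in NumPy's "reflect" mode), which may not match what Plane.hxx actually does.

```python
def resolve(index, size, policy="repeat"):
    """Map a possibly out-of-range index to a valid one, or None for zero padding."""
    if 0 <= index < size:
        return index
    if policy == "repeat":
        # Clamp to the nearest edge sample.
        return min(max(index, 0), size - 1)
    if policy == "reflect":
        # Mirror around the edges without duplicating the edge sample (assumption).
        if size == 1:
            return 0
        period = 2 * (size - 1)
        index %= period
        return index if index < size else period - index
    if policy == "zero":
        return None
    raise ValueError(f"unknown padding policy: {policy}")


def sample(plane, y, x, policy="repeat"):
    """Read plane[y][x], padding any out-of-bound coordinate."""
    row = resolve(y, len(plane), policy)
    column = resolve(x, len(plane[0]), policy)
    if row is None or column is None:
        return 0.0
    return plane[row][column]


def gauss_blur_3x3(plane, policy="repeat"):
    """3x3 Gaussian blur with kernel [1,2,1; 2,4,2; 1,2,1] / 16."""
    kernel = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]
    height, width = len(plane), len(plane[0])
    return [
        [
            sum(
                kernel[dy + 1][dx + 1] * sample(plane, y + dy, x + dx, policy)
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ) / 16
            for x in range(width)
        ]
        for y in range(height)
    ]


# A single bright pixel spreads into the kernel shape; the border pixels read
# their missing neighbours through the padding policy instead of special cases.
plane = [[0.0, 0.0, 0.0], [0.0, 16.0, 0.0], [0.0, 0.0, 0.0]]
blurred = gauss_blur_3x3(plane)
```

The point of the design is that the filter loop never branches on "am I at the border?"; the bounds check lives in one place, which is also where the runtime cost of out-of-bound detection would sit.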
24th March 2020, 16:07 | #2 | Link |
Professional Code Monkey
Join Date: Jun 2003
Location: Kinnarps Chair
Posts: 2,555
|
I'm curious, what does the actual generated code look like for this? What's the performance penalty for all the abstraction?
__________________
VapourSynth - proving that scripting languages and video processing isn't dead yet |
24th March 2020, 18:50 | #3 | Link | |
I'm Siri
Join Date: Oct 2012
Location: void
Posts: 2,633
|
Quote:
The main runtime overhead is automatic padding (out-of-bound access detection), which gcc -O3 seems to handle pretty well. Everything else is determined at compile time and is thus a zero-cost abstraction.
Last edited by feisty2; 25th March 2020 at 05:57. |
|
25th March 2020, 10:09 | #4 | Link |
I'm Siri
Join Date: Oct 2012
Location: void
Posts: 2,633
|
Did some speed tests:
Code:
clp = core.test.GaussBlur(clp)
Code:
clp = core.std.Convolution(clp, [1,2,1,2,4,2,1,2,1])
The comparison is not completely fair though: test.GaussBlur is a 100% C++ filter (GCC doesn't seem to autovectorize any loop) and std.Convolution has manual AVX2 optimization.
Last edited by feisty2; 25th March 2020 at 10:12. |
26th March 2020, 12:31 | #5 | Link |
I'm Siri
Join Date: Oct 2012
Location: void
Posts: 2,633
|
new example: temporal median
44 lines of perfectly readable, script-like code that is probably even easier to write, vs. 376 lines of cryptic C code |
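For readers unfamiliar with the operation itself, a temporal median can be sketched like this in plain Python. This is not the linked example; it is a minimal illustration that assumes single-plane frames stored as nested lists, a hypothetical `temporal_median` helper, and "repeat" padding in time at the ends of the clip.

```python
from statistics import median


def temporal_median(frames, radius=1):
    """Per-pixel median over frames n-radius..n+radius, repeating the edge frames."""
    last = len(frames) - 1
    output = []
    for n in range(len(frames)):
        # Relative offsets t = -radius..radius, clamped to the clip ("repeat" in time).
        window = [frames[min(max(n + t, 0), last)] for t in range(-radius, radius + 1)]
        height, width = len(window[0]), len(window[0][0])
        output.append([
            [median(frame[y][x] for frame in window) for x in range(width)]
            for y in range(height)
        ])
    return output


# Four 1x1 frames; the outlier in frame 1 is replaced by the median of its neighbours.
frames = [[[1.0]], [[9.0]], [[2.0]], [[3.0]]]
filtered = temporal_median(frames)
```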
29th March 2020, 20:14 | #7 | Link |
Registered User
Join Date: Jun 2012
Location: Ibiza, Spain
Posts: 321
|
I noticed that when using the vsedit benchmark utility, CPU cores are at 50% for this and at 20% for Convolution.
I reran the test with vspipe and got this:
Code:
14488fps all cores at ~85% load (Convolution)
4098fps all cores at 100% load (test) |
29th March 2020, 21:29 | #9 | Link |
I'm Siri
Join Date: Oct 2012
Location: void
Posts: 2,633
|
@Are_
could you compile this GaussBlur filter written with low level APIs and run a speed test again? |
30th March 2020, 17:17 | #13 | Link | |
HeartlessS Usurer
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
|
Quote:
From here:- https://forum.doom9.org/showthread.p...59#post1905459
Code:
clip = core.std.BlankClip(format=vs.GRAYS, length=100000, fpsnum=24000, fpsden=1001, keep=True)
__________________
I sometimes post sober. StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace "Some infinities are bigger than other infinities", but how many of them are infinitely bigger ??? |
|
31st March 2020, 16:36 | #16 | Link |
Registered User
Join Date: Apr 2010
Posts: 16
|
I am sorry, I am not a pro developer (it is my hobby), so I am not sure if I understand you correctly. Nim compiles to C. It is garbage collected, but you can disable the garbage collector (just by passing "--gc:none"). So my understanding is that there is no runtime (so no overhead in that regard).
It is my first time dealing with memory, so I am probably doing something really bad. I have asked for advice here. You might want to take a look. When I compile the following code:
Code:
import ../vapoursynth
import options
BlankClip(
  format=pfGrayS.int.some, width=640.some, height=480.some,
  length=100000.some, fpsnum=24000.some, fpsden=1001.some,
  keep=1.some).Convolution(@[1.0,2.0,1.0,2.0,4.0,2.0,1.0,2.0,1.0]).Savey4m("/dev/null")
Code:
$ nim c -f --threads:on --gc:none -d:release -d:danger modifyframe
$ time ./modifyframe
real 0m58,879s
user 0m54,969s
sys 0m7,433s
This uses the Convolution filter plus a custom-made filter (Savey4m) that is surely adding some overhead, despite sending the data to "/dev/null". How do you read the memory once you have the plane's pointer? Could you send me a link to that particular piece of code? (I don't understand much C/C++, but I hope to understand enough.) |
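As general background on the pointer question, and not taken from any of the plugins discussed here: in the VapourSynth C API a plane is exposed as a byte pointer plus a stride in bytes (via getReadPtr and getStride), and row y starts y * stride bytes after the pointer, where the stride may be larger than width * bytes-per-sample. The sketch below mimics that layout in plain Python with a fake buffer; `read_row` and the buffer contents are hypothetical, and little-endian 32-bit float samples (as in a GRAYS clip on x86) are assumed.

```python
import struct


def read_row(plane_bytes, stride, width, y):
    """Return the `width` float32 samples of row y from a strided plane buffer."""
    # Row y starts y * stride bytes into the buffer; stride may exceed width * 4.
    return list(struct.unpack_from(f"<{width}f", plane_bytes, y * stride))


# A fake 2x2 float plane with 16-byte rows: 8 bytes of pixels + 8 bytes of padding.
stride, width, height = 16, 2, 2
plane_bytes = b"".join(
    struct.pack("<2f", *row) + bytes(stride - 8)
    for row in [(1.0, 2.0), (3.0, 4.0)]
)
rows = [read_row(plane_bytes, stride, width, y) for y in range(height)]
```

The common mistake this guards against is stepping through the buffer by width instead of by stride, which reads the alignment padding at the end of each row as if it were pixels.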
31st March 2020, 17:03 | #20 | Link | ||
I'm Siri
Join Date: Oct 2012
Location: void
Posts: 2,633
|
Quote:
Also, in your other post:
Quote:
Then, it seems that your Nim version operates on int8 clips, while the C and C++ plugins were coded for fp32 clips. There's also a significant performance gap there, so you can't compare them like that.
Last edited by feisty2; 31st March 2020 at 17:29. |