Watermark?
Yanak
9th February 2018, 22:26
Hello,
ran my tests again with this one and yes indeed it is faster :
https://s13.postimg.org/ggm8878h3/2018-02-09_212710.png
Now Master tell me your secrets please :p
Looked a bit and ICL13 seems to be a (not very recent, apparently) version of the Intel C++ compiler, and PGO a possible optimization setting. There seems to be something more or less similar in VS2017, so I'll have to read up on this and try it in VS2017; it might serve for the future. Like I said, I'm very new to this, and despite reading a lot these last weeks I am still a complete noob and it will probably stay like this for a long, long time...
I bet it is possible to integrate the Intel compiler into VS2017, a bit like it is possible with LLVM/clang? And most probably the Intel compiler is not free either, so probably useless for the little needs I plan to have.
Sorry for hijacking the thread a bit, but now you got me really curious; if it's possible to get better performance I'd rather start with this before going on to further experiments in this domain.
Thanks a lot,
and thanks for the faster dll ;)
Groucho2004
9th February 2018, 23:12
Looked a bit and ICL13 seems to be a (not very recent, apparently) version of the Intel C++ compiler, and PGO a possible optimization setting
Not the latest version but that's what I have (and an even older ICL10).
PGO - Profile Guided Optimization, look it up.
I bet it is possible to integrate the Intel compiler into VS2017
The Intel C compiler is not stand-alone, it has to be used in conjunction with Visual Studio (or at least its libraries and command line tools). It also integrates with VS.
most probably the Intel compiler is not free either, so probably useless for the little needs I plan to have.
Definitely not free.
Yanak
9th February 2018, 23:33
Was looking quickly around for PGO in VS2017; I will have to figure this out in the next days, as it seems to improve things nicely in some cases. Will dig into this and do some testing.
For the Intel compiler I saw the prices for the basic version... I'd rather renew a good part of my PC hardware if I had such a sum to spend :/
There are ways to get it for free, but many years have passed since I was a student, and I don't fall into the other categories like open source contributor or educator either, so nope, I just need to forget about this option :p
PGO is something I will study as it can serve for many other things. I had not seen it before in all my reading and searching about optimizations for VS2017 during the last couple of weeks. Thanks for making me discover something new, it will keep me busy for the next few days until I figure it out.
Thanks again for all.
StainlessS
10th February 2018, 13:28
Not sure, think the Intel compiler may be free on Linux (not that it helps here).
I've finally downloaded the XP 64-bit updates (WSUS Offline Update), and may at some point get a 64-bit system up and running, and see if I can improve on my 64-bit compile using VS 2008. I used 32-bit offsets to arrays (64-bit simply not necessary), but VS2008 may sign-extend everything and make the compiled code slower, whereas the later compilers may do as they were told (or convert ints to ptrdiff_t, or whatever that horrible replacement for int is), don't know.
I'll probably have to acquaint myself at least a little with assembler so that I can see what code the compiler emits, and so better judge what may be of benefit to speed.
Nice job guys, every little bit helps :)
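A minimal sketch of the indexing question raised in this post, written for a C++11 compiler (so not VS2008 as-is). Both function names are hypothetical and are not taken from the WaterMark2 source. The two loops do the same work; the only difference is the index type. On an x64 target a signed 32-bit index has to be widened to 64 bits before it can take part in an address calculation, while a std::ptrdiff_t index is already pointer-sized. Whether the widening costs anything in practice depends on the compiler, which is why looking at the generated assembler is the way to settle it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: sums n bytes using a 32-bit signed index.
// On x64 the index is widened to 64 bits for the address calculation;
// the compiler may or may not be able to hoist that out of the loop.
inline std::uint32_t sum_with_int_index(const std::uint8_t* p, int n) {
    assert(n >= 0);
    std::uint32_t sum = 0;
    for (int i = 0; i < n; ++i) {
        sum += p[i];
    }
    return sum;
}

// Same loop with a pointer-sized signed index (std::ptrdiff_t),
// which needs no widening on a 64-bit target.
inline std::uint32_t sum_with_ptrdiff_index(const std::uint8_t* p, std::ptrdiff_t n) {
    assert(n >= 0);
    std::uint32_t sum = 0;
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        sum += p[i];
    }
    return sum;
}
```

Both versions return the same result for the same input, so any difference is purely in the emitted code, not in behaviour.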
Groucho2004
10th February 2018, 15:04
@Stainless
The reason for the slow speed of your build is the compiler settings. You neither specified "O2" nor "Ob2". I just did a test with VC10 and your settings and it is also very slow.
Here is my makefile with the VC10 settings I use for best performance:
CXX=@cl.exe
CXXFLAGS=/MT /EHsc /O2 /Ob2 /D "NDEBUG" /D "WIN32" /D "_WINDOWS"
LINK32=@link.exe
LINK32FLAGS=/dll /machine:x64 /nologo

WaterMark2_x64.dll: WaterMark2.obj
	$(LINK32) $(LINK32FLAGS) WaterMark2.obj /out:WaterMark2_x64.dll

WaterMark2.obj: WaterMark2.cpp
	$(CXX) $(CXXFLAGS) WaterMark2.cpp -c
Yanak
10th February 2018, 18:45
After struggling to find a decent "how to" for PGO under VS2017 I found this: https://docs.microsoft.com/en-us/windows/uwp/debug-test-perf/pgo-for-uwp
Everything else I found references things I don't seem to have installed or available on my version...
- Step 1: used my existing solution directly. Step 2 was already set. Followed step 3, and also set the same thing in "General > Whole Program Optimization > Profile guided optimization - Instrument"; this was not indicated.
- Step 4: I only built the .dll; it seems I can't activate the deploy option, for some reason it stays grayed out...
- Step 5: the pgort140.dll path is not correct, found it at "C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.12.25827\bin\Hostx64\x64\pgort140.dll"
- Step 6: I did this:
I left the instrumented WaterMark2_x64.dll in the solution folder (AVSMeter seems to create 4 metric files on each run if the DLLs are put inside the AviSynth+ auto-load plugin folder, probably because of the auto-load).
Copied pgort140.dll there, next to the .pgd file, then ran all my above tests once using AVSMeter (it is very slow of course, around 6 fps for all the tests and 8 fps for the 1920x800px file...).
The only change was pointing LoadPlugin() at the WaterMark2_x64.dll in my solution's Release folder (and also removing the existing WaterMark2_x64.dll I had in the AviSynth plugins folder, so it won't auto-load and mess everything up).
- Skipped steps 7 & 8, not needed.
- AVSMeter created two .pgc files for each script I ran, WaterMark2_x64!1.pgc + WaterMark2_x64!2.pgc, one at the start of the script run and another once the script was processed. I guess the 2nd one is probably written when it unloads the script/DLLs, so I deleted the 2nd .pgc file of each pair to keep only the ones created while the script was running: *!1.pgc, *!3.pgc, *!5.pgc, etc.
- Followed steps 9 and 10 (+ went again to: General > Whole Program Optimization > Profile guided optimization - Optimize); this was not indicated.
Then ran my tests with the newly created and "trained" .dll made using PGO, and these are the results:
https://s13.postimg.cc/aft4ofg93/2018-02-10_174906.png
Very very happy with this, wasn't expecting such a boost in performance :)
Thanks a lot Groucho2004 for making me discover this, took me a bit to figure it out but it's something I will use again for sure. (Left me wondering what it could have been with the Intel compiler tho :p)
I have one last question about the compiler.h file and how to add more recent definitions from the SDK there, but I won't pollute this thread more. I'll revive StainlessS's "Avisynth CPP Interface for C Programmers. - 31 Mar 2015" thread and ask there, after first trying by myself tonight or tomorrow :p
I leave the boosted WaterMark2_x64.dll + sources + solution + the .pgd & .pgc "training" files inside the "Source > x64 > Release" folder
(the initial .pgd does not have the training .pgc files merged in; this will be done automatically when importing the solution and building the project for the first time. After that the .pgc files are not needed anymore: on the next builds only the .pgd file with the metrics already merged will be used, if I'm not mistaken)
Thanks again to all, learned a lot doing this and I'm quite happy with the results. It's a game changer for me to have been able to do this; a couple of weeks ago I was staying far away from compiling stuff after failing at it a few times in the past ;)
Mirror for file download : http://www75.zippyshare.com/v/o3gy0rpL/file.html
StainlessS
10th February 2018, 19:58
You neither specified "O2" nor "Ob2"
Then I need a good slap.
Yanak,
Nice job, did the "Learn X in Y minutes" thing come in handy ? [EDIT: Originally linked via VapourSynth thread]
https://learnxinyminutes.com/
Many programming languages covered, and in multiple spoken languages.
Yanak
10th February 2018, 22:08
Hi and thanks,
I started to dig and read into this, thanks again for the links provided earlier. I'll need more time to read and assimilate some of this stuff, but it is really handy and well explained; hopefully I will finish this in a few days and maybe start to experiment a bit more.
PS: I don't set /Ob2; if I understood correctly it is automatically enabled when /O1, /O2 or /Ox is set, in VS2017 at least, so I just leave it at the default but have /O2 set. Might try to enable it too...
Thanks.
StainlessS
17th February 2018, 22:45
@Stainless
The reason for the slow speed of your build are the compiler settings. You neither specified "O2" nor "Ob2". I just did a test with VC10 and your settings and it is also very slow.
Actually, I think I've found a bit of a disturbing bug in VS2008 Express x64 [EDIT: and x86].
By default, in IDE project configuration, it looks like
Configuration Properties/C/C++/Optimization/Optimization=Maximize Speed (/O2)
with the Maximize Speed (/O2) not in bold, i.e. default setting.
However, it seems that Maximize Speed is NOT the default actually used, and is only properly set when switched to something else and then back to Maximize Speed again (where it is now bolded).
If compiled with default (not bolded) I'm getting about 16.75 FPS on my 2.4GHz Core 2 Quad (1 core used).
If switched to Full Optimization (/Ox), I'm getting about 36 FPS.
If switched back to Maximize Speed (/O2), I'm still getting about 36 FPS (where it originally gave 16.75 FPS).
Also NOTE, should have preprocessor definitions also set: "Configuration Properties/C/C++/PreProcessor/PreProcessor Definitions=_WIN64;_AMD64"
_WIN64 is said by M$ to be a 'standard' definition, but the same source also says that it is not set by default by the IDE (when compiling x64), and must be specified in the preprocessor definitions.
https://social.msdn.microsoft.com/Forums/vstudio/en-US/b1efc987-490f-4690-a964-f9bcce1ae625/win64-is-not-defined-for-my-64-bit-appications?forum=vclanguage
EDIT: By MSDN moderator:-
The IDE doesn't know beans about x64. Maybe next version.
EDIT: This is used in Compiler.h:
// Compile for minimum supported system. MUST be defined before includes.
// NEED to use SDK for updated headers, TK3 will give error messages/Warnings about Beta versions.
#ifdef _WIN64
#define WINVER 0x0502 // XP 64 Bit or Server 2003
#define _WIN32_WINNT 0x0502 // XP 64 Bit or Server 2003
#define NTDDI_VERSION 0x05020000 // Server 2003 SP0
#define _WIN32_IE 0x0700 // 0x0700=IE 7 SP0
#else
#define WINVER 0x0501 // XP 32 bit
#define _WIN32_WINNT 0x0501 // XP 32 bit
#define NTDDI_VERSION 0x05010300 // XP SP3
#define _WIN32_IE 0x0603 // 0x0603=IE 6 SP2
#endif
EDIT: Maybe above _WIN64 is only a prob in IDE, and compiles ok, but IDE shows wrong code in Compiler.h greyed out when _WIN64 not defined.
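A small sketch for checking the _WIN64 question from the compiler's side rather than the IDE's. It assumes nothing about Compiler.h beyond the #ifdef shown above; the three function names are hypothetical. The idea is that whatever the editor greys out, the compiled code can report which branch of the #ifdef the compiler actually took, and that can be cross-checked against the pointer size of the target.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical helper: reports whether the compiler saw _WIN64
// when this translation unit was built.
inline bool built_for_win64() {
#ifdef _WIN64
    return true;
#else
    return false;
#endif
}

// Pointer width in bits for the current target.
inline std::size_t pointer_bits() {
    return sizeof(void*) * CHAR_BIT;
}

// Sanity check: if _WIN64 was seen, pointers must be 64 bits wide.
inline bool win64_flag_is_consistent() {
    return !built_for_win64() || pointer_bits() == 64;
}
```

Calling built_for_win64() once from the plugin's debug output in an x64 build would show directly whether the 0x0502 branch is the one being compiled.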
raffriff42
18th February 2018, 00:17
36FPS? On an effect that doesn't change from frame to frame? You might not be caching properly. I can do 130FPS in script.
FFMS2("Sintel.mp4")
## the watermark: any white on black mask
M=BlankClip(Last)
\ .Subtitle("© copyright Blender Foundation", size=42,
\ align=3, text_color=$ffffff)
\ .ColorYUV(levels="TV->PC")
watermark42(M, 0.2, nofill=true)
return Last
https://www.dropbox.com/s/tvzcpri8qbc2ykw/watermark22-demo1.jpg?raw=1
##################################################
function watermark42(clip C, clip M, float opacity, bool "nofill")
{
nofill = Default(nofill, false)
C
## 'CF' = 'fill' (white part of mask)
## this is one possible treatment: shift + low contrast
CF=BilinearResize(Width, Height, -(-3), -(-3), Width, Height) [* shift left, up 3px *]
\ .ColorYUV(cont_y=f2c(0.7)) [* lower contrast *]
\ .Trim(0, length=1).Loop
#return CF
(nofill) ? Last : Overlay(CF, mask=M, opacity=opacity)
## 'MC' = mask + convolution (embossed effect)
MC=M.ConvertToRGB32(matrix="PC.709")
\ .GeneralConvolution(128, "
-1 0 0
0 0 0
0 0 1")
\ .Trim(0, length=1).Loop
## 'MH' = highlight part of emboss
MH=MC.Levels(128, 1, 255, 0, 255, coring=false)
\ .Trim(0, length=1).Loop
Overlay(ColorYUV(off_y=(60)), mask=MH, opacity=0.5*opacity)
## 'MS' = shadow part of emboss
MS=MC.Levels(128, 1, 0, 0, 255, coring=false)
\ .Trim(0, length=1).Loop
#return MS
Overlay(ColorYUV(off_y=(-60)), mask=MS, opacity=0.5*opacity)
return Last
## scale ColorYUV args to a "normal" scale
function f2c(float f) {
return Round((f - 1.0) * 256.0)
}
}
as shown: 130FPS
w/o code in blue: 45FPS
StainlessS
18th February 2018, 00:44
I can do 130FPS in script.
I'm not too worried about my speed, it was on the original 'Text' logo over Colorbars [EDIT: As in post #43, much bigger than your little logo, little clip and simpler args]. My machine is a Core 2 Quad Q6600 (65 nm lithography, not the later 45 nm) with DDR2, so is quite crappy (I also switched off other optimization functionality for that bug test).
EDIT: Arh, your WaterMark42() does not even use WaterMark2() and so cannot produce same effect as in post #43.
I tested with VS2008 Express 32 bit, compiling to x64, and same bug as posted in my prev post (not tested with compile to 32 bit as yet, I imagine it will be same bug).
Tested the _WIN64 thing, and does indeed get defined and correct code produced, problem only for IDE.
You can set optimization to defaults by hand editing WaterMark2_x64.vcproj, this part (delete lines marked in BLUE,
[EDIT: Well actually delete them all except the Name="VCCLCompilerTool" line and preprocessor defines])
<Configuration
Name="Release|x64"
OutputDirectory="$(SolutionDir)$(PlatformName)\$(ConfigurationName)"
IntermediateDirectory="$(PlatformName)\$(ConfigurationName)"
ConfigurationType="2"
EnableManagedIncrementalBuild="0"
>
<Tool
Name="VCPreBuildEventTool"
/>
<Tool
Name="VCCustomBuildTool"
/>
<Tool
Name="VCXMLDataGeneratorTool"
/>
<Tool
Name="VCWebServiceProxyGeneratorTool"
/>
<Tool
Name="VCMIDLTool"
/>
<Tool
Name="VCCLCompilerTool"
EnableIntrinsicFunctions="true"
FavorSizeOrSpeed="1"
WholeProgramOptimization="true"
PreprocessorDefinitions="_WIN64;_AMD64"
EnableEnhancedInstructionSet="2" # EDIT: SSE2, Will be ignored when compiled to x64, so remove anyway
AssemblerOutput="4"
Detect64BitPortabilityProblems="true" # EDIT: Deprecated
/>
Note, the Release|x64 section.
EDIT: Optimization="2" is set in the BLUE segment when the non-default Optimization=Maximize Speed (/O2) is bolded.
StainlessS
18th February 2018, 08:53
WaterMark2, v2.10 x86 and x64, new version (~61KB, EDIT: Actually correct zip is 137KB).
http://www.mediafire.com/file/61ugwwvvc7rx111/WaterMark2_dll_v2.10_x86_x64_20180218.zip
EDIT: Above link updated.
Considerably faster where single frame watermark.
For post#43 style Colorbars and "Text" logo [with loop() removed for single frame watermark], x86 about 194 FPS, and x64 about 209FPS, on my crappy machine.
Give it a whirl Yanak :)
EDIT: I took the opportunity to mod the WATER_SWITCH from 32 to 128.
Watermark luma counts as black when Y is less than 128, and as white when Y is greater than or equal to 128.
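The threshold rule just described can be sketched as below. WATER_SWITCH is the name used in the post; the helper function is hypothetical and is not copied from the plugin source, it only illustrates where the black/white boundary sits.

```cpp
#include <cassert>
#include <cstdint>

// Threshold value as stated in the post (previously 32).
const int WATER_SWITCH = 128;

// Hypothetical helper: true when a watermark luma sample counts as white,
// i.e. Y >= WATER_SWITCH; anything below the threshold counts as black.
inline bool is_watermark_white(std::uint8_t y) {
    return y >= WATER_SWITCH;
}
```

With this rule a sample of 127 is black and a sample of 128 is white, so a mid-grey watermark pixel now lands on the white side.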
StainlessS
18th February 2018, 14:19
Oops, sorry guys, I linked older WaterMark.zip in above post, updated. :(
EDIT: Also available via SendSpace link below this post.
Yanak
19th February 2018, 12:42
Hello,
I won't be able to test it for a few days sadly, but be sure that I will do so as soon as possible and report back. The improvements seem to be really nice; I have to deal with a few things and hopefully in 4-5 days I will have time to test this properly.
Thanks a lot.
PS :
Just noticed that my last attachment on previous post is still not approved :/ Added a mirror link for the download.
Edit :
Still haven't had time to dig into this compiler.h file and see what it does exactly. I noticed the reference to the SDK in it; this is the content of sdkddkver.h using the Win10 SDK in VS2017:
https://pastebin.com/raw/wNggMEh8
Not sure if it can help to update the compiler.h file or not.
StainlessS
19th February 2018, 14:57
Compiler.h is just a 'standard' (i.e. my standard) header to switch off some things, switch on some things, and also set the required Windows version and IE version for some of the plugs I compile; maybe not necessary for most plugs but just to save me time.
Also, provides for enabling/disabling my debug messages via DPRINTF(), for VS2008 and above, or alternatively below VS2008.
You don't really need to worry about it at all (but to correctly set the minimum OS, you should set _WIN64 on 64 bit, where the min req OS would be XP 64 bit).
Not sure what the IE min was for, probably something in RT_Stats (open file selector or something like that).
EDIT: Yeh, not many attachments are approved of late, perhaps the mods are on sabbatical :)
I downed your mirror link and tried with same script where I got 209FPS in new x64 version, with yours I am getting 51FPS,
so new one about 4x faster for single frame watermark (BuildMask is only called once on a persistent buffer, rather than at every frame).
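A minimal sketch of the "build once, keep in a persistent buffer" idea, assuming a single-frame watermark. BuildMask is the name mentioned in the post; the class, its members and the trivial thresholding used as the mask are all hypothetical stand-ins, not the plugin's actual code. It needs a C++11 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical cache: the mask is computed on the first request only
// and then served from a buffer that lives as long as the filter.
class MaskCache {
public:
    MaskCache() : built_(false), build_count_(0) {}

    // Returns the mask for a single-frame watermark, building it once.
    const std::vector<std::uint8_t>& get(const std::vector<std::uint8_t>& watermark_luma) {
        if (!built_) {
            mask_ = build_mask(watermark_luma);
            built_ = true;
        }
        return mask_;
    }

    // Number of times the expensive build step actually ran.
    int build_count() const { return build_count_; }

private:
    // Stand-in for the real BuildMask: a plain threshold at 128.
    std::vector<std::uint8_t> build_mask(const std::vector<std::uint8_t>& luma) {
        ++build_count_;
        std::vector<std::uint8_t> mask(luma.size());
        for (std::size_t i = 0; i < luma.size(); ++i) {
            mask[i] = (luma[i] >= 128) ? 255 : 0;
        }
        return mask;
    }

    std::vector<std::uint8_t> mask_;
    bool built_;
    int build_count_;
};
```

Requesting the mask for every output frame then costs one branch after the first frame, which is consistent with the large gain reported for the single-frame case. A multi-frame watermark would need the cache invalidated per frame, which this sketch does not attempt.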
EDIT: Script used.
#LoadPlugin("D:\WaterMark\WaterMark2_x86.dll")
LoadPlugin("D:\WaterMark\WaterMark2_x64.dll")
Colorbars.killaudio.Trim(0,-1).Loop(20000)
L="TR"
#L="BR"
#L="BL"
#L="TL"
C =Last
WM = ImageSource(".\ExampleText.jpg", end = 0, pixel_type = "RGB32").BicubicResize(Width,Height)
WM2 = WM.ConvertToYV12 #.Loop(Framecount) # Single frame watermark ONLY
LIGHT=200
SOFTEDGE=true
DEP=10
DISP=20
return Watermark2(watermark = WM2,lightFrom=L,light=LIGHT,softEdge=SOFTEDGE,depth=DEP,displace=DISP)
And result:-
https://s20.postimg.cc/bxzwbmo8t/Watermark.jpg
Yanak
21st February 2018, 01:40
Sorry for the delay in replying to this, but I was really too busy these last days.
Thanks for the info StainlessS, I will not worry about the compiler.h file then; I was not sure what it was doing exactly, now I have the answer.
Took a couple of hours to try your new compiled dll, using the same scripts as before; the new code is more efficient compared to the previous one.
Your x64 dll, then the same code compiled in VS2017, then compiled in VS2017 and also PGO trained:
https://s14.postimg.cc/g61axmlxd/2018-02-21_003936.png
Very nice results and improvements.
Did not have much time to test the scripts above, but the default ColorBars size is only 640x480; the smaller the frame size, and the smaller the mask to "print" on the video, the faster it goes.
On 1080p footage, and with a text or logo to print that takes up quite some space like in my tests, the speed takes a hit.
Anyway, I resized my mask picture to 640x480 and ran your script above (only once, no time to push this further):
- your compiled x64 dll gives me 475.2 FPS
- Mine with PGO training goes up to 584.7 FPS o_O
now if i use this script with 1080p size :
ColorBars(width = 1920, height = 1080).killaudio.Trim(0,-1).Loop(20000)
L="NE"
#L="BR"
#L="BL"
#L="TL"
C =Last
WM = ImageSource("H:\WaterMark_Pic_000.bmp", end = 0, pixel_type = "RGB24").BicubicResize(Width,Height)
WM2 = WM.ConvertToYV12 #.Loop(Framecount) # Single frame watermark ONLY
LIGHT=100
SOFTEDGE=true
DEP=3
DISP=10
return Watermark2(watermark = WM2,lightFrom=L,light=LIGHT,softEdge=SOFTEDGE,depth=DEP,displace=DISP)
- Your x64 dll : 74.07 fps
- PGO one : 91.25 fps
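For reference, the relative gain behind these figures can be worked out with a one-line helper (the function name is hypothetical). Applied to the numbers in this post, PGO comes out roughly 23% ahead in both cases: 475.2 to 584.7 FPS at 640x480, and 74.07 to 91.25 fps at 1080p.

```cpp
#include <cassert>
#include <cmath>

// Relative gain of tuned_fps over base_fps, in percent.
inline double speedup_percent(double base_fps, double tuned_fps) {
    assert(base_fps > 0.0);
    return (tuned_fps / base_fps - 1.0) * 100.0;
}
```

These are single-run measurements, so the second decimal place should not be read as significant.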
I leave the last dll i compiled with PGO training and your last code here : http://www56.zippyshare.com/v/91J6OXZq/file.html
The results are quite impressive if you look at where this .dll started for speed, really happy about this, and a big thank you for making all this possible guys.
PS :
@raffriff42 :
Thank you.
I'll try your script when I have a bit of time too, but I'm afraid the result is not the same. The dll gives a kind of glossy effect that is really nice; it can be used with text or a logo picture, or even with a white and black rendered full frame, and you can simply make a kind of glossy emboss effect on your video with it. The results you can get with this dll go far beyond a watermarking effect.
Yanak
31st March 2018, 14:35
Intel System Studio is now free (https://www.pcper.com/news/General-Tech/Intel-System-Studio-2018-Commercial-License-Now-Free). You need to register to get a 90-day license, which can still be renewed for free after the 3 months:
https://software.intel.com/en-us/system-studio/choose-download
Integrated the Intel C++ Compiler 18.0 into Visual Studio 2017, made a DLL with PGO to compare with the one I had before, and it gains a bit in some tests:
https://i.imgur.com/TcAgoRn.png
Well, the previous one gains a bit too; it's a different version of AVSMeter and a different version of AviSynth+ too (r2664), so I'm not sure what made the previous dll I posted gain a few fps compared to my last tests... Anyway, the Intel compiler squeezes out a bit more, so all good for me.
The script I posted in my previous post now surpasses 100 FPS by a tiny bit with this new dll.
For what it's worth this is it : http://www38.zippyshare.com/v/HOAQ2sJA/file.html
For Intel CPUs only, I guess, since it uses some Intel tricks for the optimizations.
:)