Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
#102 | Link | |
Registered User
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,583
|
Quote:
Has Depan fixed this bug (it turns the video green)?
__________________
See My Avisynth Stuff |
|
#103 | Link | ||
Registered User
Join Date: Mar 2012
Location: Texas
Posts: 1,655
|
Quote:
Quote:
A while back tp7 started working on a 16-bit MaskTools; unfortunately, it was not finished. See here: https://github.com/tp7/masktools/commits/16bit Maybe someone will come along and finish it. Another route is porting VapourSynth's Expr and co. |
||
#104 | Link |
I'm Siri
Join Date: Oct 2012
Location: void
Posts: 2,633
|
Someday, maybe there will be a complete version of mvtools that merges all the mvtools variations into one.
One binary would serve as both a VapourSynth plugin and an AviSynth plugin, supporting bit depths from 8 to 32 and an arbitrary temporal radius. |
#105 | Link | |
Registered User
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,583
|
Quote:
__________________
See My Avisynth Stuff |
|
#106 | Link |
Registered User
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,583
|
MvTools2 2.7.5.22d is slower than MvTools2 2.7.0.22d
(~1.7 fps vs ~1.5 fps with the same complex script). Is this because it was built without ICC?
__________________
See My Avisynth Stuff |
#107 | Link | |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Quote:
If you post the script that causes this behaviour, and possibly mention what CPU you have, you'll probably get a better answer.
__________________
Groucho's Avisynth Stuff |
|
#108 | Link | |
Registered User
Join Date: Jan 2014
Posts: 2,275
|
Quote:
Since then I have found some bottlenecks where using __forceinline helped poor VS2015. Testing on a typical MDegrain3 script:
2.7.5.22: 4.13 fps (VS2015)
2.7.0.22d: 4.69 fps
2.7.futu.re: 4.62 fps (VS2015)
Promising. |
|
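For readers who have not used it, here is a minimal sketch of what forcing a hot helper inline can look like. This is not code from mvtools: the macro name MV_FORCEINLINE and the functions abs_diff and sad_row are made up for illustration, and whether forcing inlining actually helps has to be measured per compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical portability macro (not from mvtools): MSVC spells it
// __forceinline, GCC and Clang use the always_inline attribute.
#if defined(_MSC_VER)
#  define MV_FORCEINLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#  define MV_FORCEINLINE inline __attribute__((always_inline))
#else
#  define MV_FORCEINLINE inline
#endif

// Hypothetical hot helper: absolute difference of two 8-bit pixels.
static MV_FORCEINLINE unsigned abs_diff(std::uint8_t a, std::uint8_t b) {
    return a > b ? unsigned(a - b) : unsigned(b - a);
}

// Sum of absolute differences over one row. The helper runs once per pixel,
// so a missed inlining decision here would cost one call per pixel.
unsigned sad_row(const std::uint8_t* cur, const std::uint8_t* ref, std::size_t width) {
    assert((cur != nullptr && ref != nullptr) || width == 0);
    unsigned sum = 0;
    for (std::size_t x = 0; x < width; ++x)
        sum += abs_diff(cur[x], ref[x]);
    return sum;
}
```

The macro only changes how the compiler is asked to treat the helper; the result of sad_row is the same either way, so it is easy to switch off again if a profiler shows no gain.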
#109 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
What CPU do you use for these tests? Also, what switches for ICC?
__________________
Groucho's Avisynth Stuff |
#111 | Link |
Registered User
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,583
|
Why did you stop using ICL in the latest version? For AMD users?
__________________
See My Avisynth Stuff |
#112 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Well, that's just one switch (QaxNNN). From my experience with Intel compilers, there are at least 5-7 other switches that can affect performance. Just a little selection:
Code:
/O3                                   optimize for maximum speed and enable more aggressive optimizations
/Qipo[n]                              enable multi-file IP optimization between files
/Qunroll-aggressive
/Qopt-ra-region-strategy[:<keyword>]  select the method that the register allocator uses to partition each routine into regions
                                        routine - one region per routine
                                        block   - one region per block
                                        trace   - one region per trace
                                        loop    - one region per loop
                                        default - compiler selects best option
/Qprof                                profiling
__________________
Groucho's Avisynth Stuff |
#114 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
One more thing about the automatic CPU dispatcher (enabled with "Qax...") in the Intel compiler: this could actually have an impact on AMD CPUs, since I suspect that even with the latest incarnation of the compiler it may choose sub-optimal code paths for those.
If I build for specific instruction sets, I always "hard-code" them by using "Qx..." instead of "Qax...". This way all CPUs use the same code path. That of course means building a separate binary for each instruction set. I also recommend testing whether the SSE4.x or even AVX options really make a difference. More often than not, they don't. As usual, it all depends on the code.
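To illustrate the "same code path for all CPUs" idea, here is a small sketch of a hand-written dispatcher that selects an implementation from feature flags only. It says nothing about how the Intel dispatcher works internally. All names (CpuFeatures, sum_c, sum_unrolled, select_sum) are hypothetical, and the "fast" path is a plain unrolled loop standing in for real SIMD code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical feature flags; a real build would fill these from CPUID.
struct CpuFeatures {
    bool sse2 = false;
    bool avx2 = false;
};

using SumFn = std::uint32_t (*)(const std::uint8_t*, std::size_t);

// Baseline path: plain C++, one pixel per iteration.
std::uint32_t sum_c(const std::uint8_t* p, std::size_t n) {
    assert(p != nullptr || n == 0);
    std::uint32_t s = 0;
    for (std::size_t i = 0; i < n; ++i)
        s += p[i];
    return s;
}

// Stand-in for an optimized path: four pixels per iteration plus a scalar
// tail. In a real plugin this would be the SIMD version.
std::uint32_t sum_unrolled(const std::uint8_t* p, std::size_t n) {
    assert(p != nullptr || n == 0);
    std::uint32_t s = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        s += std::uint32_t(p[i]) + p[i + 1] + p[i + 2] + p[i + 3];
    for (; i < n; ++i)
        s += p[i];
    return s;
}

// Selection looks only at the feature bits, never at the vendor, so two CPUs
// reporting the same features always run the same code.
SumFn select_sum(const CpuFeatures& f) {
    return f.sse2 ? sum_unrolled : sum_c;
}
```

Both paths must return identical results, which also makes it easy to benchmark them against each other on Intel and AMD hardware before deciding whether the extra path is worth shipping.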
__________________
Groucho's Avisynth Stuff |
#115 | Link | |
Registered User
Join Date: Oct 2002
Location: France
Posts: 2,262
|
Quote:
Didn't know some of the options:
Code:
/Qunroll-aggressive
/Qopt-ra-region-strategy[:<keyword>] |
|
#116 | Link | |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
Quote:
I suppose my way of building binaries differs a lot from what everyone else does. I don't use the IDE; I create makefiles for my projects and build from the command line with batch files. The makefiles have easily accessible compiler and linker options, so I can quickly change them, rebuild and test.
As for all the compiler options: run "icl -help" and pipe the output into a text file. And there are of course the Intel docs that come with the compiler, which have lots of stuff about optimization (which almost nobody reads, I guess).
__________________
Groucho's Avisynth Stuff |
|
#117 | Link | |
Registered User
Join Date: Jan 2014
Posts: 2,275
|
Quote:
In the end I couldn't find my old ICC settings. Sure, loop unrolling was at its default, so it was not fine-tuned, but I've seen unrolled loops in the asm output (sometimes I check the asm code that compilers generate). Global optimization was on, and maximum optimization, too.
To gain back a lot of the speed, I used VS2015's performance profiler, which showed me the parts where the code spends most of its time. Then I forced these functions to be inline. I have made other optimizations as well, so perhaps an ICC build from the current codebase could also be faster.
An interesting article on the optimizer changes that came with VS2015 Update 3: https://blogs.msdn.microsoft.com/vcb...ode-optimizer/ |
|
#118 | Link |
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
|
There are also oddball cases where the compiler options for max. speed have the opposite effect. I have a couple of programs where turning off automatic inlining or even using "O1" instead of "O2/O3" results in faster binaries. Always test, if possible on Intel and AMD CPUs.
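Since "always test" comes down to comparing fps numbers between builds, here is a rough sketch of a timing helper. It is only an illustration: the function names are made up, and a real comparison would run the actual script (for example through an AviSynth benchmarking tool) rather than a C++ callback.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Frames per second from a frame count and elapsed wall-clock seconds.
// Returns 0.0 when no measurable time has elapsed.
double frames_per_second(std::size_t frames, double seconds) {
    if (seconds <= 0.0)
        return 0.0;
    return static_cast<double>(frames) / seconds;
}

// Relative change of a candidate build against a baseline, in percent.
// Negative values mean the candidate is slower.
double percent_change(double baseline_fps, double candidate_fps) {
    assert(baseline_fps > 0.0);
    return (candidate_fps - baseline_fps) / baseline_fps * 100.0;
}

// Times `frames` calls of a per-frame callable and returns the measured fps.
template <typename F>
double measure_fps(F&& process_frame, std::size_t frames) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < frames; ++i)
        process_frame(i);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return frames_per_second(frames, elapsed.count());
}
```

Plugging in the figures quoted earlier in the thread, percent_change(4.69, 4.13) works out to about 12% slower for the VS2015 build of 2.7.5.22 compared with 2.7.0.22d.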
__________________
Groucho's Avisynth Stuff |
#120 | Link |
Registered User
Join Date: Jan 2014
Posts: 2,275
|
New version: 2.7.6.22
Depan and DepanEstimate are unchanged.
https://github.com/pinterf/mvtools/r...s/tag/2.7.6.22
Change log
2.7.6.22 (20161204)
- fixes and speedup
Last edited by pinterf; 4th December 2016 at 21:03. |