Old 10th May 2017, 15:04   #201  |  Link
tormento
Acid fr0g
 
 
Join Date: May 2002
Location: Italy
Posts: 2,563
Quote:
Originally Posted by pinterf View Post
For speed: if a script is using mt_lutxy, it cannot always use fast lookup tables for memory reasons.
Above 12 bits mt_lutxy calculates the expression realtime, pixel-by-pixel, which is slooooow (unlike Expr in VapourSynth).
For the specific bit depths at which realtime expression evaluation kicks in, see masktools2 readme ("feature matrix" section) or the wiki.

Memory? On a modern x64 machine it shouldn't be a problem anymore. I'm no programmer, so could you explain a bit more?

Size... that matter has no answer yet.
__________________
@turment on Telegram
Old 10th May 2017, 15:21   #202  |  Link
pinterf
Registered User
 
Join Date: Jan 2014
Posts: 2,314
Set realtime=false manually for a 16 bit lutxy. On x64 it can work if you have plenty of memory. The LUT size is 8 GB, and even the initial LUT calculation takes about a minute, I guess. Beyond a certain clip length it would still be faster, however.
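
For anyone who wants to check the 8 GB figure, here is a rough back-of-the-envelope sketch in Python. It assumes one table entry per (x, y) pair and 2 bytes per entry above 8 bits; the function name `lut_bytes` is made up for this example and is not part of masktools2.

```python
def lut_bytes(bits_per_sample: int, inputs: int = 2) -> int:
    """Bytes needed for a full lookup table with one entry per input combination.

    Assumption: entries are stored as 1 byte for 8 bit clips and 2 bytes otherwise.
    """
    entries = (1 << bits_per_sample) ** inputs
    bytes_per_entry = 1 if bits_per_sample <= 8 else 2
    return entries * bytes_per_entry


if __name__ == "__main__":
    for bits in (8, 10, 12, 14, 16):
        size = lut_bytes(bits)
        print(f"{bits:2d} bit lutxy: {size / 2**20:10.3f} MiB")
    # 16 bit works out to 8192 MiB, i.e. the 8 GB mentioned above.
```

The table grows by a factor of 16 for every 2 extra bits, which is why a two-input LUT stops being practical quickly as bit depth goes up.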
Old 10th May 2017, 21:04   #203  |  Link
wonkey_monkey
Formerly davidh*****
 
 
Join Date: Jan 2004
Posts: 2,496
Quote:
Originally Posted by pinterf View Post
Set realtime=false manually for a 16 bit lutxy. On x64 it can work if you have plenty of memory. The LUT size is 8 GB, and even the initial LUT calculation takes about a minute, I guess. Beyond a certain clip length it would still be faster, however.
How about calculating each term when required, then putting it in the lut for the next time those x and y values crop up?

Of course you'd need more memory to store whether or not a particular value was already in the lut, unless you can determine an "illegal" value beforehand. The additional checks would make it slightly slower than a full lut, but at least you wouldn't have to wait for the whole table to be generated before you got results.
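
A minimal sketch of that fill-on-demand idea, in Python for readability. The class name `LazyLutXY` and the one-bit-per-entry "filled" bitmap are assumptions made for illustration; nothing here is taken from masktools2.

```python
from array import array


class LazyLutXY:
    """Two-input LUT that computes an entry the first time it is requested."""

    def __init__(self, bits, func):
        self.bits = bits
        self.func = func
        size = 1 << (2 * bits)
        # 16 bit unsigned entries, zero-initialised.
        self.table = array("H", bytes(2 * size))
        # One bit per entry records whether that entry has been computed yet.
        self.filled = bytearray((size + 7) // 8)
        self.evaluations = 0

    def __call__(self, x, y):
        idx = (x << self.bits) | y
        byte, mask = idx >> 3, 1 << (idx & 7)
        if not self.filled[byte] & mask:
            # First time this (x, y) pair shows up: evaluate and remember it.
            self.table[idx] = self.func(x, y)
            self.filled[byte] |= mask
            self.evaluations += 1
        return self.table[idx]


if __name__ == "__main__":
    lut = LazyLutXY(8, lambda x, y: abs(x - y))
    print(lut(10, 3), lut(10, 3), lut.evaluations)
```

The second lookup of the same pair is served from the table, so the expression is only evaluated once per distinct (x, y) pair.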
__________________
My AviSynth filters / I'm the Doctor
Old 10th May 2017, 21:12   #204  |  Link
Myrsloik
Professional Code Monkey
 
 
Join Date: Jun 2003
Location: Kinnarps Chair
Posts: 2,554
Quote:
Originally Posted by davidhorman View Post
How about calculating each term when required, then putting it in the lut for the next time those x and y values crop up?

Of course you'd need more memory to store whether or not a particular value was already in the lut, unless you can determine an "illegal" value beforehand. The additional checks would make it slightly slower than a full lut, but at least you wouldn't have to wait for the whole table to be generated before you got results.
Now we need 8.5GB per lut.
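
The 8.5 GB presumably comes from adding a one-bit-per-entry "already computed" bitmap to the table. A quick Python check, assuming 2 bytes per table entry (the variable names are just for this example):

```python
GIB = 2**30

entries = (2**16) ** 2        # one entry per 16 bit (x, y) pair
table_bytes = entries * 2     # assumption: 2 bytes per entry
bitmap_bytes = entries // 8   # 1 bit per entry to mark "already computed"
total_bytes = table_bytes + bitmap_bytes

print(table_bytes / GIB, bitmap_bytes / GIB, total_bytes / GIB)  # 8.0 0.5 8.5
```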
__________________
VapourSynth - proving that scripting languages and video processing isn't dead yet
Old 11th May 2017, 05:51   #205  |  Link
MysteryX
Soul Architect
 
 
Join Date: Apr 2014
Posts: 2,559
Quote:
Originally Posted by Myrsloik View Post
Now we need 8.5GB per lut.
That's why your computer has 4 memory slots
Old 11th May 2017, 09:15   #206  |  Link
tormento
Acid fr0g
 
 
Join Date: May 2002
Location: Italy
Posts: 2,563
Let's forget speed for a moment; there is an explanation for that. Size, i.e. noise reduction, is the culprit here. Why is 16 bit so inefficient?
__________________
@turment on Telegram
Old 11th May 2017, 11:14   #207  |  Link
dlnm
Registered User
 
Join Date: Jun 2009
Posts: 7
Sorry if I post in the wrong thread.
May I ask where I can find the x64 version of RemapFrames 0.4.1?
It seems the link in the wiki (http://avisynth.nl/index.php/AviSynth%2B) is not valid anymore.
Thank you.
Old 11th May 2017, 15:49   #208  |  Link
MysteryX
Soul Architect
 
 
Join Date: Apr 2014
Posts: 2,559
Quote:
Originally Posted by tormento View Post
Let's forget speed for a moment; there is an explanation for that. Size, i.e. noise reduction, is the culprit here. Why is 16 bit so inefficient?
If you're using prefilter, I'm pretty sure MinMax isn't doing what it's supposed to do, which will give weird results. It hasn't been ported to 16-bit. It doesn't crash but treats the data as 8-bit.

That could explain what you're seeing in terms of high bit-rate.
Old 11th May 2017, 16:10   #209  |  Link
tormento
Acid fr0g
 
 
Join Date: May 2002
Location: Italy
Posts: 2,563
Quote:
Originally Posted by MysteryX View Post
If you're using prefilter, I'm pretty sure MinMax isn't doing what it's supposed to do, which will give weird results. It hasn't been ported to 16-bit. It doesn't crash but treats the data as 8-bit.

That could explain what you're seeing in terms of high bit-rate.


I use prefilter=4
__________________
@turment on Telegram
Old 11th May 2017, 17:23   #210  |  Link
Motenai Yoda
Registered User
 
 
Join Date: Jan 2010
Posts: 709
I'm pretty sure smdegrain (at least the current one) feeds minblur with an 8 bit clip.
__________________
powered by Google Translator
Old 11th May 2017, 22:28   #211  |  Link
real.finder
Registered User
 
Join Date: Jan 2012
Location: Mesopotamia
Posts: 2,587
Quote:
Originally Posted by tormento View Post
Let's forget speed for a moment; there is an explanation for that. Size, i.e. noise reduction, is the culprit here. Why is 16 bit so inefficient?
this is what pinterf said back then

Quote:
There's no reason to get identical results.
For 16 bit input, even if the original clip is 8 bits and its straight 16 bit conversion has zero lsb, the lower resolution subclips in Super are already interpolated and have meaningful lsb parts.
So the vectors after MAnalyze are possibly different from those that would be estimated from a single 8 bit source.
Then the weighting and blending inside MDegrain works with higher precision than for an 8 bit input. That is a difference, too.
and this

Quote:
Indeed, when I alt-tabbed between the 8 bit and 10+ bit results, the 10+ bit version had possibly found better motion vectors than the 8 bit one: I saw fewer orphaned contour lines and remnants from previous frames.
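
To illustrate the "meaningful lsb" point from the first quote, here is a tiny Python sketch. It assumes the straight 8 to 16 bit conversion is a left shift by 8, and it uses a plain rounded two-sample average as a stand-in for whatever interpolation Super really applies; `to16` and `rounded_average` are names invented for this example.

```python
def to16(v8: int) -> int:
    """Straight 8 bit to 16 bit conversion (assumed here to be a left shift by 8)."""
    return v8 << 8


def rounded_average(a: int, b: int) -> int:
    """Stand-in for interpolation: rounded mean of two neighbouring samples."""
    return (a + b + 1) >> 1


a8, b8 = 100, 101
a16, b16 = to16(a8), to16(b8)

# The converted samples carry no information in their low byte...
print(a16 & 0xFF, b16 & 0xFF)          # 0 0

# ...but the interpolated 16 bit sample does.
avg16 = rounded_average(a16, b16)
print(avg16 & 0xFF)                     # 128

# Doing the same in 8 bits and converting afterwards gives a different value.
avg8 = rounded_average(a8, b8)
print(to16(avg8), avg16)                # 25856 25728
```

So even with an 8 bit source, the 16 bit path searches for motion on slightly different pixel values, which fits the explanation that identical results should not be expected.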
__________________
See My Avisynth Stuff
Old 11th May 2017, 23:23   #212  |  Link
Groucho2004
 
Join Date: Mar 2006
Location: Barcelona
Posts: 5,034
Quote:
Originally Posted by dlnm View Post
Sorry if I post in the wrong thread.
May I ask where I can find the x64 version of RemapFrames 0.4.1?
It seems the link in the wiki (http://avisynth.nl/index.php/AviSynth%2B) is not valid anymore.
Thank you.
Yes, link is dead and I couldn't find the file anywhere else. So, I have taken cretindesalpes' last update to the code of RemapFrames from here, updated it to AVS2.6 interface and built 32 and 64 bit versions. I tested them very briefly, let me know how it goes.

Link is in my signature.
__________________
Groucho's Avisynth Stuff

Last edited by Groucho2004; 11th May 2017 at 23:26.
Old 15th May 2017, 15:56   #213  |  Link
ajp_anton
Registered User
 
 
Join Date: Aug 2006
Location: Stockholm/Helsinki
Posts: 805
Quote:
Originally Posted by pinterf View Post
For speed: if a script is using mt_lutxy, it cannot always use fast lookup tables for memory reasons.
Above 12 bits mt_lutxy calculates the expression realtime, pixel-by-pixel, which is slooooow (unlike Expr in VapourSynth).
For the specific bit depths at which realtime expression evaluation kicks in, see masktools2 readme ("feature matrix" section) or the wiki.
Why is the VapourSynth-version faster?

Quote:
Originally Posted by pinterf View Post
Set realtime=false manually for a 16 bit lutxy. On x64 it can work if you have plenty of memory. The LUT size is 8 GB, and even the initial LUT calculation takes about a minute, I guess. Beyond a certain clip length it would still be faster, however.
Don't know if it already is, but couldn't the LUT calculation be quite easily multithreaded?
Old 15th May 2017, 16:07   #214  |  Link
Myrsloik
Professional Code Monkey
 
 
Join Date: Jun 2003
Location: Kinnarps Chair
Posts: 2,554
Quote:
Originally Posted by ajp_anton View Post
Why is the VapourSynth-version faster?

Don't know if it already is, but couldn't the LUT calculation be quite easily multithreaded?
1. Because on x86 it converts the expression to native SSE2 code and does all calculations in floating point. Including some optimizations like pre-calculating constant parts of the expression and other fun stuff.

2. If your LUT has that many values, a LUT is generally a bad idea.
__________________
VapourSynth - proving that scripting languages and video processing isn't dead yet
Old 15th May 2017, 22:26   #215  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
Quote:
Originally Posted by Myrsloik View Post
1. Because on x86 it converts the expression to native SSE2 code and does all calculations in floating point. Including some optimizations like pre-calculating constant parts of the expression and other fun stuff.
Or, to explain it with a few more words: writing a runtime RPN expression evaluator in C++ is pretty trivial and a ton of CS undergrad students have been subjected to it as an exercise. It's easy to write but if you want to put it in a video filter it gets really slow since the expression has to be re-evaluated for every pixel value, and the runtime expression evaluator is a pretty hefty bit of code compared to the tiny bits of math that you're actually writing in the RPN expression.
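
For reference, the "undergrad exercise" kind of evaluator looks roughly like this. It is sketched in Python rather than C++ to keep it short, it only knows a handful of binary operators, and `eval_rpn` and `apply_realtime` are names made up for the example, not anything from masktools2.

```python
import operator

# Supported binary operators (a deliberately small, assumed subset).
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
    "min": min,
    "max": max,
}


def eval_rpn(expr: str, x: float, y: float) -> float:
    """Evaluate an RPN expression for one pair of pixel values."""
    stack = []
    for tok in expr.split():
        if tok in OPS:
            b = stack.pop()
            a = stack.pop()
            stack.append(OPS[tok](a, b))
        elif tok == "x":
            stack.append(x)
        elif tok == "y":
            stack.append(y)
        else:
            stack.append(float(tok))
    return stack.pop()


def apply_realtime(expr: str, plane_x, plane_y):
    """Re-parse and re-evaluate the whole expression for every pixel pair."""
    return [eval_rpn(expr, x, y) for x, y in zip(plane_x, plane_y)]


if __name__ == "__main__":
    print(eval_rpn("x y - 2 *", 10, 3))               # 14.0
    print(apply_realtime("x y max", [1, 5], [4, 2]))  # [4, 5]
```

Note how the token loop, the dictionary lookups and the stack traffic all run again for every single pixel, even though the actual math is one subtraction and one multiplication.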

The expr filter in VS isn't like that. The expr filter in VS is (on x86) a fully-fledged, optimizing just-in-time compiler that takes your RPN expression and compiles it to SSE2-optimized native code. When I say "optimizing" I mean it does things like optimize out constant parts of the expression so they don't have to be re-calculated for each pixel, including optimizing out immutable conditionals so you can avoid branching where possible. It also does auto-vectorization, so the compiled code loads, processes and stores four pixels at a time (since XMM registers are 128 bits wide and it works with 32-bit floats internally). In other words, its performance is on the same level as if you had written the equivalent of your RPN expression in C, compiled it as a plugin and used that instead of mt_lut.
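
A very loose Python analogue of the "compile once, then run" idea, just to show the shape of it. This is not how VS does it: the real thing emits SSE2 machine code, whereas this sketch folds constant sub-expressions and then builds a plain Python lambda with `eval`. All names (`fold_constants`, `compile_rpn`) and the operator subset are assumptions for illustration.

```python
import operator

# Each operator: (function used for folding, source template used for codegen).
OPS = {
    "+": (operator.add, "({a} + {b})"),
    "-": (operator.sub, "({a} - {b})"),
    "*": (operator.mul, "({a} * {b})"),
    "/": (operator.truediv, "({a} / {b})"),
    "min": (min, "min({a}, {b})"),
    "max": (max, "max({a}, {b})"),
}


def fold_constants(tokens):
    """Pre-compute every operator whose two operands are both literals."""
    out = []
    for tok in tokens:
        if tok in OPS:
            if len(out) >= 2 and all(isinstance(t, float) for t in out[-2:]):
                b = out.pop()
                a = out.pop()
                out.append(OPS[tok][0](a, b))
            else:
                out.append(tok)
        elif tok in ("x", "y"):
            out.append(tok)
        else:
            out.append(float(tok))
    return out


def compile_rpn(expr: str):
    """Translate RPN to a Python lambda once; return (function, source text)."""
    stack = []
    for tok in fold_constants(expr.split()):
        if isinstance(tok, float):
            stack.append(repr(tok))
        elif tok in ("x", "y"):
            stack.append(tok)
        else:
            b = stack.pop()
            a = stack.pop()
            stack.append(OPS[tok][1].format(a=a, b=b))
    source = f"lambda x, y: {stack.pop()}"
    return eval(source), source


if __name__ == "__main__":
    fn, source = compile_rpn("x y - 2 3 * +")
    print(source)      # lambda x, y: ((x - y) + 6.0)
    print(fn(10, 3))   # 13.0
```

The parsing and the `2 3 *` part are paid for once, up front; the per-pixel work is then only the generated function itself.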

8 GB LUTs are almost definitely slow as molasses in comparison. Memory bandwidth isn't free.
Old 15th May 2017, 22:42   #216  |  Link
wonkey_monkey
Formerly davidh*****
 
 
Join Date: Jan 2004
Posts: 2,496
Quote:
Originally Posted by TheFluff View Post
The expr filter in VS isn't like that. The expr filter in VS is (on x86)
Why not on x64 as well?

Quote:
a fully-fledged, optimizing just-in-time compiler that takes your RPN expression and compiles it to SSE2-optimized native code.
Is it based on some other piece of open-source software?

My rgba_rpn plugin does something similar, but using the x87 FPU. I'm wondering now if I should move to SSE2 instead. It certainly isn't crazy-optimal, although I've done my best, and it can do a lot more than expr can.

It's too complex to warrant vectorization, but if there's any interest/need I'd be willing to look into crafting something similar to VS's expr - I was going to provide it as an alias, anyway, but if people would find it really useful it might be worth writing something more optimal for those specific requirements.
__________________
My AviSynth filters / I'm the Doctor
Old 15th May 2017, 23:24   #217  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
By x86 I meant x86_64 too. VS can be compiled for other archs as well, but for those there's no JIT compilation.

It uses jitasm to actually do the compilation but all the code generation/optimization is mainly Myrsloik's and dubhater's work AFAIK.

Last edited by TheFluff; 16th May 2017 at 00:15.
Old 17th May 2017, 22:10   #218  |  Link
ajp_anton
Registered User
 
 
Join Date: Aug 2006
Location: Stockholm/Helsinki
Posts: 805
What I meant was: why can't the Avisynth version be as fast as VS? Anything preventing the use of the same code?
Old 17th May 2017, 22:25   #219  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
I... don't think so? It's a fairly simple filter, so feel free to go hog wild
Old 18th May 2017, 20:41   #220  |  Link
wonkey_monkey
Formerly davidh*****
 
 
Join Date: Jan 2004
Posts: 2,496
I'm trying to think about updating my plugin to handle all the new colour spaces. I've been reading this:

https://forum.doom9.org/showpost.php...postcount=2484

as a reference and I'm wondering about using the new stuff like ComponentCount() - what happens if I make use of that in my code, but then someone still running AviSynth 2.6 tries to use it? Will it fail? Is there a "best way" to code for this to maintain compatibility?
__________________
My AviSynth filters / I'm the Doctor