Old 21st March 2018, 10:29   #41  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
No comment on the source filter handling penalty? That appears more serious than the avsresize slowness.

Last edited by videoh; 21st March 2018 at 13:17.
Old 21st March 2018, 12:46   #42  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
Quote:
Originally Posted by poisondeathray View Post
How about avsmeter diagnostics ?
Can't work with Vapoursynth, so comparison would not be possible.

Quote:
changing prefetch values ? - check memory / cpu usage etc... is "prefetch(8)" the "best" number in that situation ? Personally I like the auto threading of vpy better.
Prefetch 4 or 8 made no difference. Anyway, if users are forced to tweak things like this then it is just braindead, IMHO.

Quote:
Can you double check some other measurement tools? maybe vspipe to ffmpeg, vs. avs to ffmpeg for example .
Never used those and don't see the relevance.
Old 21st March 2018, 12:59   #43  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
I want to demonstrate the source filter handling penalty of Avisynth+. Recall that the same DGSource executable is used by both Vapoursynth and Avisynth+. Here are the two scripts:

Vapoursynth:

Code:
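import vapoursynth as vs
core = vs.get_core()

# Load the Avisynth build of DGDecodeNV through the Vapoursynth Avisynth compatibility layer.
core.avs.LoadPlugin(path="D:/Don/Programming/C++/DGDecNV/DGDecodeNV/x64/release/DGDecodeNV.dll")
video = core.avs.DGSource('THE GREAT WALL.dgi', fulldepth=True)
# Declare a very high frame rate so playback is not throttled to the clip's real rate.
video = core.std.AssumeFPS(video, fpsnum=1000, fpsden=1)
video.set_output()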
Avisynth+:

Code:
loadplugin("d:\don\Programming\C++\dgdecnv\DGDecodeNV\x64\release\dgdecodenv.dll")
SetFilterMTMode("DGSource", MT_MULTI_INSTANCE)
dgsource("THE GREAT WALL.dgi",fulldepth=true)
assumefps(1000.0)
prefetch(8)
Results playing in VirtualDub2:

Vapoursynth: 6 seconds
Avisynth+: 12 seconds

If I remove the prefetch then Avisynth+ finishes in 7 seconds. But that will disable multithreading for anything else in the script.

This, together with the inefficient conversions, makes Avisynth+ unusable for me. I remind you of the overall result for the simple process of tonemapping a UHD stream:

Vapoursynth: 13 seconds
Avisynth+ with avsresize: 45 seconds

Which one do you think a sensible person would use?
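
To put the timings above side by side, here is a small sketch that expresses each one as a multiple of the Vapoursynth time. The numbers are the ones reported in this post; the helper name `slowdown` and the dictionary labels are made up for illustration.

```python
# Timings reported above, in seconds (playback in VirtualDub2).
source_only = {
    "Vapoursynth": 6,
    "Avisynth+ prefetch(8)": 12,
    "Avisynth+ no prefetch": 7,
}
tonemap = {
    "Vapoursynth": 13,
    "Avisynth+ with avsresize": 45,
}

def slowdown(timings, baseline):
    """Return each timing as a multiple of the baseline entry (hypothetical helper)."""
    base = timings[baseline]
    return {name: round(t / base, 2) for name, t in timings.items()}

source_ratios = slowdown(source_only, "Vapoursynth")
tonemap_ratios = slowdown(tonemap, "Vapoursynth")
print(source_ratios)
print(tonemap_ratios)
```

On these figures the source-only script with prefetch(8) takes twice as long as Vapoursynth, and the full tonemapping chain takes roughly three and a half times as long.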
Old 21st March 2018, 16:55   #44  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
As much as I approve of demonstrating the superiority of Vapoursynth, I really don't think setting a source filter to MT_MULTI_INSTANCE is a good idea. MT_SERIALIZED forces everything upstream of the filter to be synchronous and single threaded, but source filters have no upstream so it doesn't matter. VS source filters tend to be serial in nature too.
Old 21st March 2018, 16:56   #45  |  Link
Stephen R. Savage
Registered User
 
Join Date: Nov 2009
Posts: 327
It is because the avsresize was never tested with gamma function operations. It is missing this statement from the VS z.lib filter that enables certain speed-ups:

Code:
vsresize.cpp:L683: m_params.allow_approximate_gamma = 1;
Old 21st March 2018, 16:59   #46  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
Quote:
Originally Posted by TheFluff View Post
As much as I approve of demonstrating the superiority of Vapoursynth, I really don't think setting a source filter to MT_MULTI_INSTANCE is a good idea. MT_SERIALIZED forces everything upstream of the filter to be synchronous and single threaded, but source filters have no upstream so it doesn't matter. VS source filters tend to be serial in nature too.
I'm sure you're right, but the setting doesn't seem to have any effect at all. I'm not familiar with Vapoursynth internals.
Old 21st March 2018, 17:01   #47  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,371
You would expect prefetch(2) to be faster in that script

For the src filter threading issue, it did not affect the old avisynth(-) mt (at least no reports as bad as this) - what is different about the threading model here in avisynth(+) ?
Old 21st March 2018, 17:01   #48  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
Quote:
Originally Posted by Stephen R. Savage View Post
It is because the avsresize was never tested with gamma function operations. It is missing this statement from the VS z.lib filter that enables certain speed-ups:

Code:
vsresize.cpp:L683: m_params.allow_approximate_gamma = 1;
Thanks, I will try adding that. Of course, that may remove one penalty but not the source filter penalty.
Old 21st March 2018, 17:14   #49  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
Quote:
Originally Posted by Stephen R. Savage View Post
It is because the avsresize was never tested with gamma function operations. It is missing this statement from the VS z.lib filter that enables certain speed-ups:

Code:
vsresize.cpp:L683: m_params.allow_approximate_gamma = 1;
Bravo Stephen! With that change, avsresize performance is the same as Vapoursynth. Will you release an update with this change?

Now let's try to figure out the second issue.

Last edited by videoh; 21st March 2018 at 18:25.
Old 21st March 2018, 17:23   #50  |  Link
poisondeathray
Registered User
 
Join Date: Sep 2007
Posts: 5,371
For the src filter issue - it doesn't appear to be a GPU latency /transfer issue, because avs+ prefetch(4) incurs no significant penalty with dss/dss2 cuvid
Old 21st March 2018, 17:24   #51  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
Quote:
Originally Posted by poisondeathray View Post
You would expect prefetch(2) should be faster in that script

For the src filter threading issue, it did not affect the old avisynth(-) mt (at least no reports as bad as this) - what is different about the threading model here in avisynth(+) ?
I ran the entire HDR->SDR process again with the avsresize fix. Here is the script:

Code:
loadplugin("d:\don\Programming\C++\dgdecnv\DGDecodeNV\x64\release\dgdecodenv.dll")
loadplugin("D:\Don\Programming\C++\Avisynth filters\ToneMap float\x64\release\tonemap.dll")
loadplugin("D:\Don\Programming\C++\avsresize-r1c\x64\release\avsresize.dll")
SetFilterMTMode("z_ConvertFormat", MT_MULTI_INSTANCE)
dgsource("THE GREAT WALL.dgi",fulldepth=true)
z_ConvertFormat(pixel_type="RGBPS",colorspace_op="2020ncl:st2084:2020:l=>rgb:linear:2020:l", dither_type="none")
tonemap()
z_ConvertFormat(pixel_type="YV12",colorspace_op="rgb:linear:2020:l=>709:709:709:l",dither_type="ordered")
prefetch(2)
The overall performance is now the same as Vapoursynth!

The prefetch number didn't affect the timing but value 2 made things smooth and 8 made it go in stops and starts.

So, hey, Avisynth+ lives on. but

Last edited by videoh; 21st March 2018 at 17:37.
Old 21st March 2018, 17:32   #52  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
I'll release an Avisynth+ port of Phillip Blucas's tonemap filter within a few days (with some performance improvements). Then there will be a full HDR->SDR solution for Avisynth+.

It's close to being realtime for UHD (22fps on my machine) when playing in VirtualDub2. It would be nice if we could squeeze a little more performance out of zimg (or port equivalent conversions to CUDA).
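
For a rough sense of how much data moves at that rate, here is a back-of-the-envelope sketch. It assumes 3840x2160 frames, 16-bit YUV 4:2:0 out of the source filter and 32-bit float planar RGB (RGBPS) in the middle of the chain, and it ignores row padding and alignment. The helper name `frame_bytes` is made up for illustration.

```python
WIDTH, HEIGHT = 3840, 2160   # assumed UHD frame size
FPS = 22                     # playback rate reported above

def frame_bytes(width, height, bytes_per_sample, samples_per_pixel):
    """Bytes in one planar frame, ignoring padding (hypothetical helper)."""
    return int(width * height * samples_per_pixel * bytes_per_sample)

# 4:2:0 has one full-size luma plane plus two quarter-size chroma planes: 1.5 samples per pixel.
yuv420p16 = frame_bytes(WIDTH, HEIGHT, bytes_per_sample=2, samples_per_pixel=1.5)
# RGBPS has three full-size planes of 32-bit floats.
rgbps = frame_bytes(WIDTH, HEIGHT, bytes_per_sample=4, samples_per_pixel=3)

for name, size in (("YUV420P16", yuv420p16), ("RGBPS", rgbps)):
    print(f"{name}: {size} bytes/frame, {size * FPS / 1e6:.1f} MB/s at {FPS} fps")
```

Under these assumptions the float RGB intermediate is four times the size of the 16-bit source frame, which is one reason the conversion steps dominate.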

Last edited by videoh; 22nd March 2018 at 11:35.
Old 21st March 2018, 17:38   #53  |  Link
Sharc
Registered User
 
Join Date: May 2006
Posts: 3,997
Quote:
Originally Posted by videoh View Post
so, hey, avisynth+ lives on. But
oh happy day
Old 21st March 2018, 17:49   #54  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
Setting the source filter to MT_MULTI_INSTANCE with prefetch(2) is functionally much like doing something like
Code:
src1 = dgsource("file.dgi")
src2 = dgsource("file.dgi")
interleave(src1.selecteven(), src2.selectodd())
So if you have too many instances you'll start thrashing the disk, because you'll have many threads reading from the same file. Depending on the filter chain, it may or may not be beneficial to do this. You'll get as many instances as you have threads, whether you want them or not. On the other hand, if the source filter is MT_SERIALIZED there'll only be one instance of it and it'll only process one frame request at a time, but other filters that are requesting frames from it may run in parallel in various ways. The requesting thread will always block and wait for the request to complete, though, because that's just how the API is designed.
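
The interleave analogy can be sketched as a toy model of which frames each source instance ends up decoding. This assumes a strict round-robin split, which is a simplification: the real scheduler hands frames to whichever thread is free. The function name `frames_per_instance` is made up for illustration.

```python
def frames_per_instance(num_frames, num_instances):
    """Toy model: frame n is served by instance n % num_instances (assumed round-robin)."""
    split = {i: [] for i in range(num_instances)}
    for n in range(num_frames):
        split[n % num_instances].append(n)
    return split

split = frames_per_instance(num_frames=8, num_instances=2)
for instance, frames in split.items():
    print(f"instance {instance} decodes frames {frames}")
```

In this model every instance skips over the frames handled by the others, so no single instance reads the file strictly in order, and each added instance is another reader on the same file.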

In VS, all Avisynth filters are always run single threaded and in a single instance, but VS attempts to fit them into its nonblocking frame request framework as well as it can, and it will also automatically attempt to request frames from source filters linearly in order. I'd expect it to run most source filters slightly faster than Avs+ would with MT_SERIALIZED.
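
The single-instance behaviour described above (one source, one frame request at a time, callers blocking until it is their turn) can be sketched with a lock. This is only a toy model in Python, not how either framework is implemented, and it does not model the in-order request scheduling that VS attempts. The names `SerializedSource` and `run` are made up for illustration.

```python
import threading
import time

class SerializedSource:
    """Toy single-instance source: serves one frame request at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0        # requests currently inside the decoder
        self.max_active = 0     # highest concurrency ever observed
        self.served = []        # frame numbers in the order they were served

    def get_frame(self, n):
        with self._lock:        # callers block here until the source is free
            self._active += 1
            self.max_active = max(self.max_active, self._active)
            time.sleep(0.001)   # stand-in for decoding work
            self.served.append(n)
            self._active -= 1
            return n

def run(num_frames=16, num_threads=4):
    """Let several worker threads request frames from one shared source."""
    source = SerializedSource()

    def worker(start):
        for n in range(start, num_frames, num_threads):
            source.get_frame(n)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return source

source = run()
print("max concurrent decodes:", source.max_active)
print("frames served:", len(source.served))
```

The workers still run in parallel for anything outside `get_frame`, but the source itself never sees more than one request at once.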

I wrote a very long and boring attempt to explain the differences between the concurrency models here a while ago.

Last edited by TheFluff; 21st March 2018 at 17:52.
Old 21st March 2018, 18:01   #55  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
Thanks for that explanation. I tried to read your link some time ago, but I fell asleep.

Is there anything to be gained by making a Vapoursynth-native version of DGSource()?
Old 21st March 2018, 18:14   #56  |  Link
TheFluff
Excessively jovial fellow
 
Join Date: Jun 2004
Location: rude
Posts: 1,100
For a source filter I don’t think there’s anything significant to gain performance-wise, no. Not sure about how complete the VS compatibility layer is in terms of various exotic colorspaces and/or bitdepths though.
Old 21st March 2018, 18:16   #57  |  Link
DJATOM
Registered User
 
Join Date: Sep 2010
Location: Ukraine, Bohuslav
Posts: 377
Quote:
Originally Posted by videoh View Post
Thanks for that explanation. I tried to read your link some time ago, but I fell asleep.

Is there anything to be gained by making a Vapoursynth-native version of DGSource()?
Yes. Autoloading is great!
Old 21st March 2018, 18:17   #58  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
Quote:
Originally Posted by TheFluff View Post
For a source filter I don’t think there’s anything significant to gain performance-wise, no. Not sure about how complete the VS compatibility layer is in terms of various exotic colorspaces and/or bitdepths though.
Thank you. All I need is YUV420P16 and that is covered now.

Last edited by videoh; 21st March 2018 at 18:21.
Old 21st March 2018, 18:17   #59  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
Quote:
Originally Posted by DJATOM View Post
Yes. Autoloading is great!
Alrighty then, I'll do it. Actually I already have it from a while ago, just have to add a few fixes that flowed under the bridge since then.

Last edited by videoh; 21st March 2018 at 18:20.
Old 21st March 2018, 22:04   #60  |  Link
videoh
Useful n00b
 
Join Date: Jul 2014
Posts: 1,667
Please give the script that plays that fast for UHD content (3840x2160 16-bit). Just to be clear, I am talking about playing the script in VirtualDub2.

Will you release a fixed avsresize so I don't have to include one? Thanks.

Last edited by videoh; 21st March 2018 at 22:29.