Old 7th February 2023, 19:28   #181  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,068
Quote:
Originally Posted by anton_foy View Post
If using the first Manalyse line I get this error "MAnalyse: frame width not supported by DX12_ME, max supported width 0".
It looks like something is wrong with the DX12 environment on the system running the script: maybe the driver or the device does not support the motion estimator.

HRESULT feature_support = dev_D3D12VideoDevice->CheckFeatureSupport(D3D12_FEATURE_VIDEO_MOTION_ESTIMATOR, &MotionEstimatorSupport, sizeof(MotionEstimatorSupport));

This returns no error, but the MotionEstimatorSupport structure looks like it was not filled in with real working values.

You may try the old DX12-ME checking tool from the first posts in this thread - https://forum.doom9.org/showthread.p...67#post1959067

On a correctly working system it should report max frame sizes like 4096x4096 or maybe more, something like in the post at https://forum.doom9.org/showthread.p...78#post1959078 .

"multi_vec=MAnalyse(sup8, multi=true, blksize=8, delta=tr, search=3, searchparam=2, overlap=0, optSearchOption=1, optPredictorType=4, chroma=false, mt=false, levels=1).convertbits(16).converttoYUV444()"

The output of MAnalyse is a special clip intended for the other mvtools filters; it does not need to be, and cannot correctly be, processed by any other AVS filters. It is also independent of the bit depth and colour format of the processed clip: it depends only on block size, number of blocks and the other analysis params. So it is not correct to attempt to change it with .convertbits(16).converttoYUV444(). The same applies to all the other lines with MAnalyse. If the MVs clip is damaged by any processing, it may cause crashes in other mvtools filters, because it is not validated by a checksum or any other means. This is stated in the documentation: you can not modify it with almost any tools (except some mvtools ones like MScaleVect() and others).

From mvtools2.html :
Technical note: MAnalyse does not generate a regular clip that can be displayed. Don't try to modify its content or it will get corrupted. It is made of a single long line, actually containing binary data (vectors, block SAD, misc. information…) instead of pixel values. It also alters the audio descriptor to pass additional data to other filters before any frame request. Therefore, in the current state, a vector clip cannot be saved to a lossless file and reloaded later for processing. If you want to do so, you have to transcode it first with MStoreVect and MRestoreVect. Furthermore, joining vector clips generated with different parameters may lead to unexpected results, because the aforementioned additional data is global to the whole clip and is not updated on each frame. When the MAnalyse filter is destructed (removed from the memory), the additional data is lost too, and an attempt to using the produced vectors may crash the application or give wrong results. For this reason, avoid using MAnalyse in ScriptClip and other functions of the Avisynth runtime subsystem.

Last edited by DTL; 7th February 2023 at 20:24.
Old 8th February 2023, 03:35   #182  |  Link
anton_foy
Registered User
 
Join Date: Dec 2005
Location: Sweden
Posts: 703
Tried the D3D12HelloTriangle and I got "result: 0". I use the SSE2 version of your mvtools.
Old 8th February 2023, 06:44   #183  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,068
"Tried the D3D12HelloTriangle and I got "result: 0"."

An HRESULT of 0 in Microsoft APIs typically means S_OK (no error). But you need to get non-zero (and not very small) MaxH/MaxW values in

MEstimator Feature support: D3D12_VIDEO_SIZE_RANGE SizeRange MaxW 4096 MaxH 4096 MinW 32 MinH 32

They must be larger than your frame size.
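The check described here can be sketched in portable C++. This is only an illustration: `SizeRange` and `frameFitsSizeRange` are hypothetical names that mirror the MaxW/MaxH/MinW/MinH values printed by the test tool, not the real D3D12 types, and it assumes a frame exactly at the reported limit is still accepted.

```cpp
#include <cstdint>

// Hypothetical mirror of the values printed by the checking tool
// (MaxW / MaxH / MinW / MinH); this is not the real D3D12 structure.
struct SizeRange {
    std::uint32_t maxW, maxH, minW, minH;
};

// True only if the reported range is non-empty and the frame fits inside it.
// An all-zero range means the structure was never filled in with real values.
bool frameFitsSizeRange(const SizeRange& r, std::uint32_t w, std::uint32_t h) {
    if (r.maxW == 0 || r.maxH == 0) {
        return false;  // no usable motion estimator reported
    }
    return w >= r.minW && w <= r.maxW && h >= r.minH && h <= r.maxH;
}
```

With the 4096x4096 / 32x32 range shown above, a 1920x1080 frame passes; with an all-zero range nothing passes, which matches the "max supported width 0" error.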
Old 8th February 2023, 23:47   #184  |  Link
anton_foy
Registered User
 
Join Date: Dec 2005
Location: Sweden
Posts: 703
My card does not support it then:
D3D12_VIDEO_SIZE_RANGE SizeRange MaxW 0 MaxH 0 MinW 0 MinH 0
Old 9th February 2023, 09:44   #185  |  Link
ReinerSchweinlin
Registered User
 
Join Date: Oct 2001
Posts: 454
Quote:
Originally Posted by anton_foy View Post
My card does not support it then:
D3D12_VIDEO_SIZE_RANGE SizeRange MaxW 0 MaxH 0 MinW 0 MinH 0
What card are you using?
Old 9th February 2023, 11:01   #186  |  Link
anton_foy
Registered User
 
Join Date: Dec 2005
Location: Sweden
Posts: 703
Quote:
Originally Posted by ReinerSchweinlin View Post
What card are you using?
The GTX 970
Old 9th February 2023, 11:34   #187  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,068
It may really not support exposing the DX12-ME API (though it may support hardware MPEG encoding via an older version of the MPEG encoder ASIC). As my tests show, the DX12-ME interface is most probably supported starting from the GTX 1xxx series of cards.

Last edited by DTL; 10th February 2023 at 12:25.
Old 10th February 2023, 11:50   #188  |  Link
ReinerSchweinlin
Registered User
 
Join Date: Oct 2001
Posts: 454
Quote:
Originally Posted by anton_foy View Post
The GTX 970
That's the Maxwell 2.0 GM204 chip. AFAIR the encoder is one generation too old.
Old 10th February 2023, 12:32   #189  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,068
It will also be interesting to see when the Intel integrated MPEG encoder (or maybe separate video boards or internal accelerators) finally exposes the DX12-ME interface. At some theoretical point one could also collect hardware MPEG encoders from 3 different manufacturers (NVIDIA, AMD and Intel) exposing the DX12-ME interface and compare their relative quality on a single piece of noisy footage.

Different ME engines may have different search ranges, and the most interesting thing is the quality of the MVs on a significantly noisy source: for example, how many definitely false non-zero MVs an ME engine produces in flat, static, noise-only areas. For a visual comparison of quality the output MVs may be displayed with the MShow() filter. If MShow() crashes, the padding in MSuper() needs to be greatly increased, because some hardware MEs produce large out-of-frame MVs and the current MShow() has no protection against running out of the buffer when it attempts to draw such a long invalid MV. The on-CPU 'standard' MAnalyse does not search for MVs outside the given padded buffer, or clips MVs to the given padded buffer, so there are no additional out-of-buffer checks in the other mvtools filters (except MDegrainN in its latest builds). So even MCompensate with DX12-ME search in MAnalyse may also crash now. Sorry, it was only tested with MAnalyse+MDegrainN for degraining.

Double checking and clipping of MVs in both MAnalyse (as received from DX12-ME) and MDegrainN would slightly degrade performance. So maybe the additional checking/clipping needs to be added to all the other consumer filters (or finally moved from MDegrainN to MAnalyse if that does not cause additional issues).
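To make the clipping idea concrete, here is a minimal C++ sketch. It is not the code used in MAnalyse or MDegrainN: the `MV` struct, the function name and the parameters are hypothetical, and it assumes whole-pixel vectors (pel=1) and the same padding on all four sides.

```cpp
#include <algorithm>

// Hypothetical motion vector in whole pixels (illustrative only).
struct MV {
    int x;
    int y;
};

// Clamp mv so that a blk x blk block whose top-left corner is at (bx, by),
// displaced by mv, stays inside a w x h frame padded by `pad` pixels per side.
MV clipToPaddedFrame(MV mv, int bx, int by, int blk, int w, int h, int pad) {
    const int minX = -pad - bx;           // block may reach the left padding edge
    const int maxX = w + pad - blk - bx;  // block may reach the right padding edge
    const int minY = -pad - by;
    const int maxY = h + pad - blk - by;
    MV out;
    out.x = std::max(minX, std::min(mv.x, maxX));
    out.y = std::max(minY, std::min(mv.y, maxY));
    return out;
}
```

For a 64x64 frame with pad=8 and blk=8, a block at (56,56) with the invalid vector (100,-100) is clipped to (8,-64), while a valid vector such as (3,-2) passes through unchanged.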

Last edited by DTL; 10th February 2023 at 12:41.
Old 12th February 2023, 01:01   #190  |  Link
anton_foy
Registered User
 
Join Date: Dec 2005
Location: Sweden
Posts: 703
Thanks Reiner and DTL. I have now borrowed a graphics card and tested it with the d3d test; it reported 4096x4096, so hopefully it works. I will report back.

EDIT: Nope, "can not load file Compute.cso". I have made a separate folder with only DTL's mvtools2.dll and the .cso-file.

EDIT2: now it works when I change from optSearchOption=5 to optSearchOption=1.

Last edited by anton_foy; 13th February 2023 at 08:15.
Old 13th February 2023, 22:34   #191  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,068
" "can not load file Compute.cso"."

The Compute.cso file needs to be in the 'current working directory'. That depends on the settings of the application loading your AVS script. You may start by placing Compute.cso in the directory with the .avs script.
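One quick way to see what 'current working directory' means in practice: a bare filename such as "Compute.cso" is resolved against the working directory of the host process, so simply trying to open it from a small test program started the same way shows whether the file would be found. A minimal C++ sketch (the function name is hypothetical, and this is not how mvtools itself loads the file):

```cpp
#include <fstream>
#include <string>

// Returns true if `name` can be opened for reading. A bare filename with no
// directory part is looked up in the current working directory of the process.
bool canOpenFromCwd(const std::string& name) {
    std::ifstream f(name.c_str(), std::ios::binary);
    return f.good();
}
```

Calling `canOpenFromCwd("Compute.cso")` from a program launched from the same directory as the host application gives a rough idea of whether the shader file is where it will be looked for.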

"I change from optSearchOption=5 to optSearchOption=1."

DX12_ME is used only with SO=5 or 6. Option 5 uses Compute.cso and computes SAD on the accelerator (faster); option 6 computes SAD on the CPU (slower, but maybe a bit better quality with pel 2 or 4).
Old 14th February 2023, 00:14   #192  |  Link
anton_foy
Registered User
 
Join Date: Dec 2005
Location: Sweden
Posts: 703
Really? In the directory where my .avs script is, OK!
Old 14th February 2023, 11:58   #193  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,068
If Compute.cso loads OK, you should be able to run a processing mode like:
Code:
MSuper(pelrefine=false, pel=4) # do not create refined subplanes for pel > 1
MAnalyse(optSearchOption=5) # use hardware motion search and SAD computing with Compute Shader on the same accelerator
MDegrainN(UseSubShift=1) # use blend-time computing of subsample shifted blocks
Old 14th February 2023, 13:02   #194  |  Link
Boulder
Pig on the wing
 
Boulder's Avatar
 
Join Date: Mar 2002
Location: Finland
Posts: 5,733
Can the mod be used just to speed up calculating the MVs by setting optSearchOption=6 in MAnalyse without any ill effects, and use other MVTools functions from the original pinterf build?
__________________
And if the band you're in starts playing different tunes
I'll see you on the dark side of the Moon...
Old 14th February 2023, 16:07   #195  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,068
You may try it and see whether it works or throws some errors (or crashes). optSearchOption=6 also uses DX12-ME to get MVs from the accelerator and only computes the dissimilarity metric (SAD or others) on the CPU, because DX12-ME does not provide SAD data (or any other (dis)similarity metric). So either the compute shader is used with optSearchOption=5, or with optSearchOption=6 the old MAnalyse runs a single pass without search, computing only the SAD (or other) metric.
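For readers unfamiliar with the metric: SAD is the sum of absolute differences between the samples of the current block and the block the MV points to. A minimal C++ sketch for 8-bit samples follows; it is a plain reference loop with hypothetical names, not the optimized code from mvtools or the compute shader.

```cpp
#include <cstdint>
#include <cstdlib>

// Sum of absolute differences between two blk x blk blocks of 8-bit samples.
// `cur` and `ref` point to the top-left sample of each block; the strides are
// the distances in bytes between the starts of consecutive rows.
unsigned blockSad(const std::uint8_t* cur, int curStride,
                  const std::uint8_t* ref, int refStride, int blk) {
    unsigned sad = 0;
    for (int y = 0; y < blk; ++y) {
        for (int x = 0; x < blk; ++x) {
            const int a = cur[y * curStride + x];
            const int b = ref[y * refStride + x];
            sad += static_cast<unsigned>(std::abs(a - b));
        }
    }
    return sad;
}
```

Identical blocks give 0; the larger the value, the worse the match, which is what thresholds like thSAD are compared against.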

If some consumer plugin/function does not need SAD data from the MAnalyse output, one more option with better performance could be designed (output MVs only, like optSearchOption=7 for example).

Because mvtools supports lots of different bit depths, block sizes and other processing combinations, lots of different limitations also exist. For example:
optSearchOption=5 - supports only SAD computing in the current shader implementation.
optSearchOption=6 - can use any dissimilarity metric (defined by the DMFlags option) like classic SAD or SSIM or VIF (or some combination like SSIM-structure only).

In a future mvtools it would be better to rename the SAD-only dismetric to DM, so param names would change from thSAD to thDM, for example.

Last edited by DTL; 14th February 2023 at 16:12.
Old 14th February 2023, 16:17   #196  |  Link
Boulder
Pig on the wing
 
Boulder's Avatar
 
Join Date: Mar 2002
Location: Finland
Posts: 5,733
Well, looks like overlap is not supported so that's a no-no for me
Old 14th February 2023, 16:49   #197  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,068
Interpolated overlap is currently only in MDegrainN. It needs to be implemented in the other functions.

Or it may be added to MAnalyse to output 4x overlap (i.e. overlap=blocksize/2) in the DX12-ME MVs processing mode. The better-performing 'diagonal 2x' overlap needs to be implemented in each consumer filter, in both 'real search' and 'interpolated' versions.

You may also try 'scripted diagonal overlap' with 2 shifted clips and the BlockOverlap() plugin. It will be 2 real searches and possibly a bit higher in quality compared with 1 search and interpolation. But it requires changing your processing scripts.

Where do you need overlap? MCompensate for QTGMC?

It was expected that some programmers might extend existing working solutions, like the interpolated overlap currently designed into MDegrainN, to the other filters/functions required by existing users. But it looks like the current civilization is dying too fast and almost no programmers or users of avisynth and mvtools are left today. And it is only 2023; next year will be even worse.

I do not deinterlace interlaced SD/HD and do not use QTGMC, so I have no great need to redesign MCompensate or other parts of mvtools.

Last edited by DTL; 14th February 2023 at 23:12.
Old 17th February 2023, 15:58   #198  |  Link
anton_foy
Registered User
 
Join Date: Dec 2005
Location: Sweden
Posts: 703
Thank you DTL, it is working now but I get severe blockiness in high motion areas. I tried many script variants.
Old 17th February 2023, 19:45   #199  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,068
To suppress the blocky look (block edges) with a single non-overlapped search with DX12-ME in MAnalyse, interpolated overlap processing was designed into MDegrainN. To enable it, set the IntOvlp param of MDegrainN > 0. I typically use IntOvlp=3 (2x diagonal overlap with SAD re-check); better quality but slower is IntOvlp=1 (4x the number of blocks, equal to mvtools overlap=blksize/2 in both H and V).
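The "4x blocks number" figure can be checked with a little arithmetic. The C++ sketch below assumes blocks are counted per dimension as (size - overlap) / (blksize - overlap) with integer division; that formula and the function names are assumptions for illustration, and any extra handling at frame edges is ignored.

```cpp
// Number of blocks along one dimension for a given block size and overlap.
// Assumed formula: (size - overlap) / (blksize - overlap), integer division.
int blocksAlong(int size, int blksize, int overlap) {
    return (size - overlap) / (blksize - overlap);
}

// Total number of blocks in a w x h frame.
int blockCount(int w, int h, int blksize, int overlap) {
    return blocksAlong(w, blksize, overlap) * blocksAlong(h, blksize, overlap);
}
```

Under that assumption a 1920x1080 frame with blksize=8 has 240 x 135 = 32400 blocks without overlap and 479 x 269 = 128851 blocks with overlap=4, i.e. roughly 4x as many blocks to search or interpolate.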

So when you use optSearchOption=5 or 6 in MAnalyse you can not set overlap > 0 in MAnalyse, and it is recommended to use IntOvlp > 0 in MDegrainN (valid values are from 1 to 4). The same holds for better MAnalyse performance in all modes: set a non-overlapped search in MAnalyse and use interpolated overlap in MDegrainN, because overlapped search in MAnalyse significantly degrades performance, while interpolated overlap (the fastest is 2x diagonal) runs much faster with possibly still good enough quality.
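The rules from the last few posts can be summarized as a small checker. This C++ sketch is purely illustrative: the `Settings` struct and function names are hypothetical, nothing like it exists in mvtools, and it assumes IntOvlp=0 means interpolated overlap is switched off.

```cpp
#include <string>
#include <vector>

// Hypothetical subset of the script params discussed in this thread.
struct Settings {
    int optSearchOption;  // MAnalyse optSearchOption
    int overlap;          // MAnalyse overlap
    int intOvlp;          // MDegrainN IntOvlp (assumed: 0 = off, 1..4 = on)
};

// DX12-ME is used only with optSearchOption 5 or 6.
bool usesDx12Me(int so) { return so == 5 || so == 6; }

// Only option 5 computes SAD with the compute shader, so only it needs Compute.cso.
bool needsComputeCso(int so) { return so == 5; }

// Collects human-readable problems for a combination of settings.
std::vector<std::string> checkSettings(const Settings& s) {
    std::vector<std::string> problems;
    if (usesDx12Me(s.optSearchOption) && s.overlap > 0) {
        problems.push_back("overlap must be 0 in MAnalyse with optSearchOption=5 or 6");
    }
    if (s.intOvlp < 0 || s.intOvlp > 4) {
        problems.push_back("IntOvlp must be 0 (off) or 1..4");
    }
    return problems;
}
```

This also shows why switching to optSearchOption=1 made the Compute.cso error go away earlier in the thread: that mode does not use DX12-ME at all, so the shader file is never needed.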

Last edited by DTL; 17th February 2023 at 19:54.
Old 18th February 2023, 04:20   #200  |  Link
magnetite
Registered User
 
Join Date: May 2010
Posts: 64
Quote:
Current possibly best settings for processing:

Code:
MDegrainN(last,super, multi_vec, tr, thSAD=250, thSAD2=240, thSADC=240, mt=false, wpow=4, thSCD1=400, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=16, MVLPFGauss=0.9, 
thMVLPFCorr=50, UseSubShift=1, IntOvlp=3, MPBthSub=5, MPBthAdd=5, MPBNumIt=3, MPB_SPCadd=3, MPB_SPCsub=0.3, MPBthIVS=1500, showIVSmask=false, MPB_DMFlags=64, MPB_MVlth=8, MPBtgtTR=tr)
As an end-user, I'm not really sure what all this does aside from some of the basic options like TR, thSAD/C, MT. Is there any way to make a preset quality value so people who are unfamiliar with all the settings can get decent results without all the guesswork? I think QTGMC has a preset quality feature, but that's a script, not a plugin like this.

Last edited by magnetite; 18th February 2023 at 04:30.