Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
#221 | Link | |
Registered User
Join Date: Dec 2005
Location: Sweden
Posts: 663
|
Quote:
|
|
#222 | Link |
Registered User
Join Date: Jul 2018
Posts: 989
|
Yes - MAnalyse with the optional SuperCurrent can be used with any optSearchOption value (including the hardware search options). I am even thinking of using more than one hardware accelerator (HWacc) in the system for better performance in a pipelined way: the first HWacc makes the initial analysis and the second makes the refining step.
Later there will be many cheap second-hand HWaccs capable of DX12-ME, so this can be tested. For now you can try a different accnum for each MAnalyse: Code:
init_thSAD=400
s1=MSuper()
mv1 = MAnalyse(s1, optSearchOption=5, accnum=1) # use first DX12-ME accelerator in system, or accnum=0 ? needs testing
dg1 = MDegrain(s1, mv1, thSAD=init_thSAD)
gen1_thSAD = Int(init_thSAD/1.8) # divisor - subject to Zopti refine ?
s2=MSuper(dg1)
mv2 = MAnalyse(s2, SuperCurrent=s1, optSearchOption=5, accnum=2) # use second DX12-ME accelerator in system, or accnum=1 - needs testing
dg2=MDegrain(s1, mv2, thSAD=gen1_thSAD)
A combination of one external PCI-board DX12-ME accelerator and the one built into the CPU may also be tested where available. Also, as I read, some NVIDIA boards/chips have more than one MPEG encoder ASIC (?), so they may expose more than one full-speed DX12-ME interface to applications. See the columns "# of chips", "# of NVENC/chip" and "Total # of NVENC" at https://developer.nvidia.com/video-e...ort-matrix-new
So GeForce GTX 965M > 980M / 980MX (Maxwell, 2nd Gen) may have 2 full-speed DX12-ME interfaces? Also GeForce GTX 960 Ti / 970 / 980, GeForce GTX 980 Ti, GeForce GTX Titan X, GeForce GTX 1070M / 1080M, GeForce GTX 1070 / 1070Ti, GeForce GTX 1080, GeForce GTX 1080 Ti, GeForce GTX Titan X / Titan Xp. The same goes for GeForce RTX 4080 Laptop, GeForce RTX 4080 16GB, GeForce RTX 4090 Laptop, GeForce RTX 4090 - but they are much more expensive. Also Titan V - 3 NVENC.
Dogway has a GTX 1070? It may be good to ask him to test 1 vs 2 MAnalyse performance (and also whether different accnum values >0 or >1 are accepted).
Addition: I am not sure whether several MPEG encoder ASICs located on a single physical board will be exposed as different Direct3D12 devices selectable with the 'accnum' param. Maybe the environment will auto-spread motion estimation tasks if a single board has several task dispatch resources available. So 2-NVENC boards may simply allow running 2 MAnalyse at about equal speed with the default accnum=0. Last edited by DTL; 8th March 2023 at 14:59. |
#223 | Link |
Registered User
Join Date: Jul 2018
Posts: 989
|
New release: https://github.com/DTL2020/mvtools/r.../r.2.7.46-a.20
Fixed a possible bug with trymany in MAnalyse. Added trymany to optPredictorType=1 mode (zero, global and median predictors only). Added a partial fix for the chroma shift issue when processing 4:2:x formats in MAnalyse, MDegrainN, MCompensate (maybe also MRecalculate), using the current pel precision from MSuper. Multi-generation MV refining also appears to work very visibly against blurring of complex motion such as facial animation. Processing script, cleaned of MShow: Code:
my_DMFlags=1
my_thSAD=300
my_thSAD2=250
my_thSAD_mg=150
my_thSAD2_mg=100
my_thSCD=500
my_global=true
my_pzero=10
my_pnew=10
my_pglobal=10
my_pel=2
my_trymany=true
my_oPT=1
tr=6
super=MSuper(last,chroma=true, mt=false, pel=my_pel)
multi_vec=MAnalyse(super, multi=true, delta=tr, search=3, searchparam=2, trymany=my_trymany, overlap=0, chroma=true, mt=false, optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=my_global, levels=4, DMFlags=my_DMFlags, optPredictorType=my_oPT)
g1=MDegrainN(super, multi_vec, tr, thSAD=my_thSAD, thSAD2=my_thSAD2, mt=false, wpow=4, thSCD1=my_thSCD, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=16, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=3)
gen1=g1
super_g1=MSuper(gen1,chroma=true, mt=false, pel=my_pel)
multi_vec_g2=MAnalyse(super_g1, SuperCurrent=super, multi=true, delta=tr, search=3, searchparam=2, trymany=my_trymany, overlap=0, chroma=true, mt=false, optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=my_global, levels=4, DMFlags=my_DMFlags, optPredictorType=my_oPT)
g2=MDegrainN(super, multi_vec_g2, tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, mt=false, wpow=4, thSCD1=my_thSCD, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=16, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=3)
gen2=g2
super_g2=MSuper(gen2,chroma=true, mt=false, pel=my_pel)
multi_vec_g3=MAnalyse(super_g2, SuperCurrent=super, multi=true, delta=tr, search=3, searchparam=2, trymany=my_trymany, overlap=0, chroma=true, mt=false, optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=my_global, levels=4, DMFlags=my_DMFlags, optPredictorType=my_oPT)
g3=MDegrainN(super, multi_vec_g3, tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, mt=false, wpow=4, thSCD1=my_thSCD, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=16, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=3)
gen3=g3
super_g3=MSuper(gen3,chroma=true, mt=false, pel=my_pel)
multi_vec_g4=MAnalyse(super_g3, SuperCurrent=super, multi=true, delta=tr, search=3, searchparam=2, trymany=my_trymany, overlap=0, chroma=true, mt=false, optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, global=my_global, levels=4, DMFlags=my_DMFlags, optPredictorType=my_oPT)
g4=MDegrainN(super, multi_vec_g4, tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, mt=false, wpow=4, thSCD1=my_thSCD, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=16, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=3)
gen4=g4
super_g4=MSuper(gen4,chroma=true, mt=false, pel=my_pel)
multi_vec_g5=MAnalyse(super_g4, SuperCurrent=super, multi=true, delta=tr, search=3, searchparam=2, trymany=my_trymany, overlap=0, chroma=true, mt=false, optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=my_global, levels=4, DMFlags=my_DMFlags, optPredictorType=my_oPT)
g5=MDegrainN(super, multi_vec_g5, tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, mt=false, wpow=4, thSCD1=my_thSCD, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=16, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=3)
gen5=g5
super_g5=MSuper(gen5,chroma=true, mt=false, pel=my_pel)
multi_vec_g6=MAnalyse(super_g5, SuperCurrent=super, multi=true, delta=tr, search=3, searchparam=2, overlap=0, chroma=true, mt=false, optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=my_global, levels=4, DMFlags=my_DMFlags, optPredictorType=my_oPT)
g6=MDegrainN(super, multi_vec_g6, tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, mt=false, wpow=4, thSCD1=my_thSCD, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=16, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=3)
return Interleave(g6.Subtitle("g6"), g2.Subtitle("g2"), g1.Subtitle("g1"))
Frames g1, g2 and g6, enlarged 2x with BSpline. The source was interlaced and not field-separated, so 2 fields are present.
Maybe these many calls to MSuper/MAnalyse/MDegrainN for each generation of MV refining can be compacted into an AVS function to make the script smaller. imgsli comparisons: https://imgsli.com/MTYwODgx https://imgsli.com/MTYwODgw Last edited by DTL; 9th March 2023 at 13:26. Reason: fixed error in last gen6 MDegrainN |
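To make the repetition in the script above easier to see, here is a minimal Python sketch that only prints the per-generation lines as text. The helper name `emit_generations`, the `multi_vec_g1` naming for the first vector clip, and the shortened argument strings are illustrative assumptions, not part of mvtools or of the posted script.

```python
# Hypothetical helper: emits the repeated per-generation AviSynth lines as text.
# The argument strings are abbreviated placeholders, not the full parameter lists.
ANALYSE_ARGS = "multi=true, delta=tr, search=3, searchparam=2"
DEGRAIN_ARGS = "mt=false, wpow=4, thSCD1=my_thSCD, IntOvlp=3"


def emit_generations(count):
    """Return script lines for `count` generations of MV refining (count >= 1)."""
    lines = [
        "super=MSuper(last, chroma=true, mt=false, pel=my_pel)",
        f"multi_vec_g1=MAnalyse(super, {ANALYSE_ARGS})",
        f"g1=MDegrainN(super, multi_vec_g1, tr, thSAD=my_thSAD, thSAD2=my_thSAD2, {DEGRAIN_ARGS})",
    ]
    for gen in range(2, count + 1):
        prev = gen - 1
        # Later generations search on the previous degrained result,
        # while SuperCurrent and the degrain input stay the original super clip.
        lines.append(f"super_g{prev}=MSuper(g{prev}, chroma=true, mt=false, pel=my_pel)")
        lines.append(
            f"multi_vec_g{gen}=MAnalyse(super_g{prev}, SuperCurrent=super, {ANALYSE_ARGS})"
        )
        lines.append(
            f"g{gen}=MDegrainN(super, multi_vec_g{gen}, tr, "
            f"thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, {DEGRAIN_ARGS})"
        )
    return lines


script_lines = emit_generations(6)
print("\n".join(script_lines))
```

Every generation after the first follows the same three-line pattern, which is what a single AVS function taking the previous result and the original super clip would wrap.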
#224 | Link |
Registered User
Join Date: Jul 2018
Posts: 989
|
Real working script with both accelerator and CPU search and refining functions. For 1920x1080i input.
Code:
# Input plugins
LoadPlugin("ffms2.dll")
LoadPlugin("mvtools2.dll")
SetFilterMTMode("DEFAULT_MT_MODE", 3)
my_thSAD=260
my_thSAD2=240
my_thSAD_mg=130
my_thSAD2_mg=120
my_thSCD=500
my_pzero=10
my_pnew=10
my_pglobal=10
my_pel=2
my_thCohMV=5 # 5..8 for pel=2, 10..16 for pel=4 ?
my_trymany=true
my_oPT=1
my_overlap=0
my_IntOvlp=3
my_searchparam=2
my_MPBNumIt=2
my_init_tr=12
my_refine_tr=12
Function RefineMV(clip mvclip, clip super_ref, clip src, int _thSAD, int _thSAD2, int in_tr, int refine_tr, int my_thSCD, int my_pel, bool my_trymany, int my_pnew, int my_pzero, int my_pglobal, \
int my_oPT, int my_overlap, int my_searchparam, int my_IntOvlp, int my_thCohMV)
{
  g_next=MDegrainN(src, super_ref, mvclip, in_tr, thSAD=_thSAD, thSAD2=_thSAD2, mt=false, wpow=4, thSCD1=my_thSCD, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=my_thCohMV, \
  MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=my_IntOvlp)
  super_g_next=MSuper(g_next,chroma=true, mt=false, pel=my_pel)
  return MAnalyse(super_g_next, SuperCurrent=super_ref, multi=true, delta=refine_tr, search=3, searchparam=my_searchparam, trymany=my_trymany, overlap=my_overlap, chroma=true, mt=false,\
  optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=true, optPredictorType=my_oPT)
}
Function RefineMV_HW(clip mvclip, clip super_ref, clip src, int _thSAD, int _thSAD2, int in_tr, int refine_tr, int my_thSCD, int my_pel, int my_thCohMV)
{
  g_next=MDegrainN(src, super_ref, mvclip, in_tr, thSAD=_thSAD, thSAD2=_thSAD2, mt=false, wpow=4, thSCD1=my_thSCD, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=my_thCohMV, \
  MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=3, UseSubShift=1)
  super_g_next=MSuper(g_next,chroma=true, mt=false, pel=my_pel, levels=1, pelrefine=false)
  return MAnalyse(super_g_next, SuperCurrent=super_ref, multi=true, delta=refine_tr, chroma=true, mt=false, optSearchOption=5, levels=1)
}
FFmpegSource2("1920x1080i.mp4")
AddBorders(0,0,0,72)
noproc=last
SeparateFields()
super_hwa=MSuper(last, mt=false, chroma=true, pel=my_pel, hpad=8, vpad=8, levels=1, pelrefine=false)
super_cpu=MSuper(last, mt=false, chroma=true, pel=my_pel, hpad=8, vpad=8, levels=0, pelrefine=true)
multi_vec_hwa=MAnalyse(super_hwa, multi=true, blksize=8, delta=my_init_tr, overlap=0, chroma=true, optSearchOption=5, mt=false, levels=1)
multi_vec_cpu=MAnalyse(super_cpu, multi=true, delta=my_init_tr, search=3, searchparam=my_searchparam, trymany=my_trymany, overlap=my_overlap, chroma=true, mt=false, \
optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=true, optPredictorType=my_oPT)
multi_vec_cpu2=RefineMV(multi_vec_cpu, super_cpu, last, my_thSAD, my_thSAD2, my_init_tr, my_refine_tr, my_thSCD, my_pel, my_trymany, my_pnew, my_pzero, my_pglobal, my_oPT, \
my_overlap, my_searchparam, my_IntOvlp, my_thCohMV)
multi_vec_hybr2=RefineMV(multi_vec_hwa, super_cpu, last, my_thSAD, my_thSAD2, my_init_tr, my_refine_tr, my_thSCD, my_pel, my_trymany, my_pnew, my_pzero, my_pglobal, my_oPT, \
my_overlap, my_searchparam, my_IntOvlp, my_thCohMV)
multi_vec_hwa2=RefineMV_HW(multi_vec_hwa, super_hwa, last, my_thSAD, my_thSAD2, my_init_tr, my_refine_tr, my_thSCD, my_pel, my_thCohMV)
cpu2=MDegrainN(last,super_cpu, multi_vec_cpu2, my_refine_tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, mt=false, wpow=4, UseSubShift=1, thSCD1=my_thSCD, adjSADzeromv=0.7, \
adjSADcohmv=0.7, thCohMV=my_thCohMV, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=my_IntOvlp, MPBthSub=5, MPBthAdd=20, MPBNumIt=my_MPBNumIt, \
MPB_SPCsub=0.5, MPB_SPCadd=1.5, MPBthIVS=2200, showIVSmask=false).Weave().Subtitle("cpu2")
hwa2=MDegrainN(last,super_hwa, multi_vec_hwa2, my_refine_tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, mt=false, wpow=4, UseSubShift=1, thSCD1=my_thSCD, adjSADzeromv=0.7, \
adjSADcohmv=0.7, thCohMV=my_thCohMV, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=my_IntOvlp, MPBthSub=5, MPBthAdd=20, MPBNumIt=my_MPBNumIt, \
MPB_SPCsub=0.5, MPB_SPCadd=1.5, MPBthIVS=2200, showIVSmask=false).Weave().Subtitle("hwa2")
hybr2=MDegrainN(last,super_hwa, multi_vec_hybr2, my_refine_tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, mt=false, wpow=4, UseSubShift=1, thSCD1=my_thSCD, adjSADzeromv=0.7, \
adjSADcohmv=0.7, thCohMV=my_thCohMV, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=my_IntOvlp, MPBthSub=5, MPBthAdd=20, MPBNumIt=my_MPBNumIt, \
MPB_SPCsub=0.5, MPB_SPCadd=1.5, MPBthIVS=2200, showIVSmask=false).Weave().Subtitle("hybr2")
Interleave(cpu2, hybr2, hwa2, noproc.Subtitle("src"))
#last=hybr2
Crop(0,0,1920,1080)
Prefetch(6)
Hybrid mode with a GTX1060 and an i5-9600K CPU runs at about 1.24 fps (pel=4) and 1.75 fps (pel=2). Quality is close to full CPU search. Full accelerator search and refine runs only a bit faster (about 1.3 fps with pel=4) and quality is a bit lower than hybrid mode in some scenes. Last edited by DTL; 10th March 2023 at 17:25. Reason: updated script |
#225 | Link |
Registered User
Join Date: Jul 2018
Posts: 989
|
New version: https://github.com/DTL2020/mvtools/r.../r.2.7.46-a.21
Added a new processing mode to MDegrainN: MEL (Most Equal Looking) search mode for TTH (Temporal Thresholding).
New params to MDegrainN:
pmode=0 (default) - standard blending, pmode=1 - MEL search and TTH only.
TTH_DMFlags - dismetric flags for estimating block difference in the TTH compare. Flags 0x01 to 0x20 are valid (except 0x08).
TTH_thUPD (default 0: additional thresholding disabled, 100% linear mode; must be >0 for pmode=1) - integer threshold for the selection: either keep outputting the block from memory (the old block in pmode=0, the 'best' block in pmode=1), or update the block in memory and output the new block. Typical working values are expected to be significantly below thSAD (like thSAD/3..thSAD/4 and less), starting from 0. 0 means no blocks from memory are used (standard MDegrainN mode - FIR filter).
TTH_chroma - use chroma in the TTH dismetric analysis (slower, better quality) or not (faster).
Fixed a performance issue with double processing of chroma planes in combined YUV processing with no overlap. Current test script: Code:
tr=10
super=MSuper(last, mt=false, chroma=true, pel=2, hpad=8, vpad=8, levels=0, pelrefine=true)
multi_vec=MAnalyse(super, multi=true, blksize=8, delta=tr, search=3, searchparam=2, overlap=0, optSearchOption=1, optPredictorType=0, chroma=false, mt=false)
ref=MDegrainN(last,super, multi_vec, tr, thSAD=185, thSAD2=170, mt=false, wpow=4, thSCD1=350, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=16, MVLPFGauss=0.9, thMVLPFCorr=50, IntOvlp=3)
super2=MSuper(ref, mt=false, chroma=true, pel=2, hpad=8, vpad=8, levels=0, pelrefine=true)
MDegrainN(ref,super2, multi_vec, tr, thSAD=250, thSAD2=240, mt=false, thSCD1=350, pmode=1, TTH_thUPD=100, IntOvlp=3)
The complexity of analysis in pmode=1 is currently ~tr^2, so it may use a separate tr value (and an mvclip created with a lower tr value). Quality is expected to scale with the tr value (the probability of finding the most common-looking block in the total tr-pool). The thSAD param in pmode=1 also controls initial block skipping when accumulating blocks in the analysis pool. TTH_thUPD is the main param to adjust: the higher its value, the more noise blocks are skipped, but too high a value may cause visible 'hanging' blocks or motion quality degradation. Setting thSAD too high in pmode=1 may also cause more artifacts. pmode=1 is expected to be the 'final cleaning' after an initial MDegrainN (it must also use a new super clip made from the pre-denoised frames), for when the highest quality is required. For general everyday encodes it may be enough to play with the TTH_thUPD param in standard pmode=0. The last MDegrainN with pmode=1 may or may not use a refined mvclip (for best results the best refined mvclip is recommended). pmode=1 does not blend at all, so it does not degrade detail quality at any thSAD. It is only an additional way to select the 'best' looking block in the current tr-scope and duplicate it in the output frames until the visual difference from the current frame block exceeds the threshold. TTH_thUPD may be enabled in any MDegrainN in the processing script (in MV refining, final degrain and final cleanup).
TTH_DMFlags may select any available dismetric for the visual difference analysis (SAD - faster, SSIM and VIF - slower) in any MDegrainN with TTH_thUPD or pmode=1 enabled. Last edited by DTL; 16th March 2023 at 10:25. |
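The two ideas described above (MEL selection over a pool of blocks, and thresholded reuse of a block held in memory) can be sketched in a few lines of Python. This is a simplified illustration, not the MDegrainN implementation: plain SAD stands in for whatever TTH_DMFlags selects, blocks are flat lists of samples, and the names `mel_select` and `TemporalThreshold` as well as the exact `>=` comparison are assumptions.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b))


def mel_select(blocks):
    """Index of the 'most equal looking' block: lowest summed SAD to all others.

    Every block is compared with every other block, which is why the cost
    grows roughly with the square of the pool size.
    """
    totals = [sum(sad(b, other) for other in blocks) for b in blocks]
    return totals.index(min(totals))


class TemporalThreshold:
    """Repeat the stored block until a new block differs by th_upd or more."""

    def __init__(self, th_upd):
        self.th_upd = th_upd
        self.memory = None

    def process(self, block):
        if self.memory is None or sad(block, self.memory) >= self.th_upd:
            self.memory = list(block)  # update memory and output the new block
        return self.memory             # otherwise keep outputting the stored block


pool = [[10, 10], [12, 12], [11, 11], [13, 13], [40, 40]]
print(mel_select(pool))  # the outlier [40, 40] is never picked

tth = TemporalThreshold(th_upd=5)
for frame_block in ([100, 100], [101, 102], [110, 110]):
    print(tth.process(frame_block))
```

With `th_upd=0` the condition is always true, so every new block replaces the stored one, which matches the statement that 0 means no blocks from memory are used.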
#226 | Link |
Registered User
Join Date: Jul 2018
Posts: 989
|
A quick morning implementation of this year's idea of estimating noise bitrate to check degrain quality.
Release 30.03.2023 - https://github.com/DTL2020/mvtools/r.../r.2.7.46-a.22
Added computing and displaying of the residual noise bits (RNB) count per frame to MCompensate. It computes the sum of log2 of the absolute sample differences between the source and the motion-compensated output frame of MCompensate. For a completely static frame sequence RNB=0 bits/frame. For noise bitrate per second the value should be multiplied by the frame rate.
New param to MCompensate: showRNB (default = false).
Usage example:
super=MSuper()
mv=MAnalyse(super)
MCompensate(super, mv, showRNB=true)
Currently only for 8-bit sources. The processing function needs to be moved to a template for HBD support. It can process a Y-only input clip or YUV/RGB (3 planes present). For more than one plane the sum over all planes is displayed. Computing part: Code:
for (int y = 0; y < nHeight; y++)
{
  uint8_t* pDstFrame = pDst[0] + nDstPitches[0] * y;
  uint8_t* pSrcFrame = (uint8_t*)pSrc[0] + nSrcPitches[0] * y + nOffset[0];
  for (int x = 0; x < nWidth; x++)
  {
    // 32 - lzcnt(d) is the number of significant bits in the absolute difference d
    iSumNzBits += 32 - __lzcnt(SADABS((int)pSrcFrame[x] - (int)pDstFrame[x]));
  }
}
Usage example to measure the denoise process: Code:
SeparateFields()
fields_orig=last
tr=3
super=MSuper(last, mt=false, chroma=true, pel=2, hpad=8, vpad=8, levels=0, pelrefine=true)
multi_vec=MAnalyse(super, multi=true, blksize=8, delta=tr, search=3, searchparam=2, overlap=0, optSearchOption=1, optPredictorType=0, chroma=false, mt=false, DMFlags=1)
ref=MDegrainN(last,super, multi_vec, tr, thSAD=185, thSAD2=170, mt=false, wpow=4, thSCD1=350, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=16, MVLPFGauss=0.9, thMVLPFCorr=50, IntOvlp=3)
super2=MSuper(ref, mt=false, chroma=true, pel=2, hpad=8, vpad=8, levels=0, pelrefine=true)
MDegrainN(ref,super2, multi_vec, tr, thSAD=350, thSAD2=340, mt=false, thSCD1=350, pmode=1, TTH_thUPD=100, IntOvlp=3)
super_ref=MSuper(ref)
mv_ref=MAnalyse(super_ref)
rnb_den_ref=MCompensate(super_ref, mv_ref, showRNB=true)
super2=MSuper()
mv2=MAnalyse(super2)
rnb_den=MCompensate(super2, mv2, showRNB=true)
super_orig=MSuper(fields_orig)
mv_orig=MAnalyse(super_orig)
rnb_orig=MCompensate(super_orig, mv_orig, showRNB=true)
StackHorizontal(rnb_den, rnb_den_ref, rnb_orig)
Weave()
Yes - the fields are not blended very nicely. It shows that for static areas the second MDegrainN (pmode=1) decreases the noise bit count about 10 times. The first denoise stage decreases the noise bitrate about 2.9 times. So adding the secondary non-linear IIR-type filter with memory decreases the noise bitrate about 30 times relative to the source. A completely (100%) temporally denoised frame sequence for zero calibration is Trim(1,1).Loop() Last edited by DTL; 30th March 2023 at 14:30. |
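The RNB measure from the C++ fragment above can be restated as a small Python sketch. It assumes flat lists of 8-bit samples instead of frame planes with pitches, and the function name `residual_noise_bits` is made up for illustration. Note that `32 - __lzcnt(d)` is the bit length of `d` (that is, floor(log2(d)) + 1 for d > 0, and 0 for d = 0), which Python exposes as `int.bit_length()`.

```python
def residual_noise_bits(src, comp):
    """Sum of bit lengths of |src - comp| over all samples."""
    total = 0
    for s, c in zip(src, comp):
        # Same value as 32 - lzcnt(d) on a 32-bit integer; 0 when d == 0.
        total += abs(s - c).bit_length()
    return total


source = [10, 20, 30, 40]
compensated = [11, 17, 30, 40]           # differences: 1, 3, 0, 0
rnb_per_frame = residual_noise_bits(source, compensated)
print(rnb_per_frame)                     # 1 bit + 2 bits + 0 + 0
print(rnb_per_frame * 25)                # bits per second at an assumed 25 fps
```

A perfectly static (or perfectly compensated) sequence gives 0, which is the zero calibration mentioned in the post.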
#227 | Link | |
Formally known as .......
Join Date: Sep 2021
Location: On a need to know basis.
Posts: 806
|
Quote:
__________________
This can be Very "TeDiouS".. Long term RipBot264 user. Ryzen 9 7950X Intel i9-13900KF Ryzen 9 5950X Ryzen 9 5900X Ryzen 9 3950X Link to RB v1.27.0 |
|
#229 | Link |
Registered User
Join Date: Sep 2008
Posts: 365
|
DTL, I can't get your builds to work at all in Windows11, I have tried all 4 different .dll's in the zip file...
AVSmeter just stops at 0 frames forever, until I hit ctrl+c. Code:
AVSMeter64.exe -o d:\test.avs
AVSMeter 3.0.9.0 (x64), (c) Groucho2004, 2012-2021
AviSynth+ 3.7.3 (r3973, 3.7, x86_64) (3.7.3.0)
Number of frames: 33304
Length (hh:mm:ss.ms): 00:23:09.054
Frame width: 960
Frame height: 720
Framerate: 23.976 (24000/1001)
Colorspace: YV12
Frame (current | last): 0 | 33303
Code:
An out-of-bounds memory access (access violation) occurred in module 'VirtualDub64'...
...reading address FFFFFFFFFFFFFFFF.
Code:
Traceback (most recent call last):
  File "_ctypes/callbacks.c", line 315, in 'calling callback function'
  File "avsp.pyo", line 5136, in local_wnd_proc
WindowsError: exception: access violation reading 0xFFFFFFFFFFFFFFFF
I also have no idea how to troubleshoot this other than to give you some information about my system and hope you have some idea of what is wrong:
AviSynth+ 3.7.3 (r3973, 3.7, x86_64)
Windows 11 22H2 (22621.1413)
Avisynth script I tested with: Code:
FFVideoSource("test.mkv")
SMDegrain(tr=3, thSAD=400, RefineMotion=false, contrasharp=false, plane=4, prefilter=0, chroma=true)
__________________
(i have a tendency to drunk post) Last edited by mastrboy; 30th March 2023 at 18:56. |
#230 | Link |
Registered User
Join Date: Jul 2018
Posts: 989
|
Unfortunately my builds may not be compatible with many old scripts (those using the non-default block size of 16x16 and maybe some other untested options). So they are still pre-release demos of some features, tested mostly with the example scripts provided here and typically with a block size of 8x8. I even made a somewhat changed QTGMC to use with my builds when I tested deinterlacing.
So it is not a good idea to put this .dll in the 'common' folder; it is recommended to load it with LoadPlugin() from the current working folder. It is expected that maybe in some years (at the beginning of 2024 a great all-planet celebration of 20 years of mvtools is expected) we will have some features ported to a 'more official' build by pinterf or maybe another programmer capable of testing and bugfixing most of the supported modes of mvtools. But it has not happened yet. I am going to make some e-table (maybe Google web docs?) of all the new features and ideas accumulated and partially implemented for the post-2.7.45 version, with current 'status' and other data, for analysis and for creating a list of the most important features to port/bugfix. Also I do not use the SMDegrain script and make my own denoise scripts based on mvtools only. So I do not know what causes the crash there. Maybe some day I will have time to try installing SMDegrain and check with a debugger where it crashes with those settings and whether it can be fixed more or less quickly. As the very first possible solution it is recommended to test with a block size of 8x8 (the internal default for mvtools). Though if you use SMDegrain, as I read it still does not support any new features of post-2.7.45 mvtools, so it may be safer to use the old 'stable' 2.7.45 build from pinterf. It may still be many years until we have a more stable post-2.7.45 build that is fully compatible with 2.7.45 processing at the default new settings, and Dogway makes changes to SMDegrain to use the new features. Last edited by DTL; 30th March 2023 at 20:14. |
#231 | Link |
Registered User
Join Date: May 2018
Posts: 155
|
@DTL
Have you seen this? https://devblogs.microsoft.com/direc...y-sdk-1-710-0/ Is it applicable to mvtools? |
#232 | Link |
Registered User
Join Date: Jul 2018
Posts: 989
|
I have a strategic idea of how to make the current post-2.7.45 version more compatible with old scripts and the 2.7.45 build: rename all filters by adding _a to the end (as in alpha state). Then it would be possible to load both 2.7.45 and post-2.7.45 mvtools in a single AVS environment and use only selected filters from post-2.7.45 where required (the super and mv clips may also be (partially) compatible between the two). Now, because of the identical naming, it either does not load or may cause undefined use of different filters from different .dlls. Maybe in the next build.
"Is it applicable for mvtools?" About the new heaps mode - currently performance is limited very little by texture upload, and the backward download of MVs and SAD data is very small in size. About sampling - currently a 'simple' sampling mode is used in the SAD compute shader (a sort of sample(x,y) request, as a CPU does from host RAM; not the 'complex 3D' sampling where a texture is mapped onto some virtual triangle or other patch). So no update of sampling is required, and it cannot help performance. |
#233 | Link |
Registered User
Join Date: Jul 2018
Posts: 989
|
New release: https://github.com/DTL2020/mvtools/r.../r.2.7.46-a.23
Added a denoise mask clip input to MDegrainN. It works only in block-based mode. The mask must be in Y8 format with a frame size equal to the number of blocks to process (including any overlap mode used). New param to MDegrainN: dnmask - clip. 0 is full standard denoise, 255 is no denoise (so the positive Y channel can be used as a mask to degrain only low brightness levels). Example script (for IntOvlp=3):
dn_mask1=ConvertToY8()
blksize=8
#int_ ovlp=3
dn_mask_x=dn_mask1.width/blksize
overlap_size=blksize/2
dn_mask_y=(dn_mask1.height-overlap_size)/(blksize-overlap_size)
dn_mask1=BilinearResize(dn_mask1, dn_mask_x, dn_mask_y)
dn_mask1=Levels(dn_mask1, 0, 1, 100, 0, 255, coring=false)
dn_masked=MDegrainN(.., IntOvlp=3, dnmask=dn_mask1)
Added updating of MEL memory with the best block (lowest sum of its DM table row), and memory for the sum of the block currently stored in IIR memory. A really fast way to get the block numbers is to feed a mask clip of any size and read the error message if the size is not correct - it will show the current number of blocks in the H and V directions for the overlap mode in use. A simple BilinearResize does not make a perfect mask for any overlap mode, because even block rows are shifted to the right by the overlap size (typically half the block size with max overlap). So it is better to separate odd and even rows, shift the even rows to the right and combine them back into a frame. But the overlap processing modes seem to hide these errors when the block size is not too large. Last edited by DTL; 18th June 2023 at 15:09. |
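As a worked instance of the mask-size arithmetic in the example script above, the following Python sketch repeats the same integer divisions for a given frame size. It only mirrors the script's formulas for the IntOvlp=3 case; the function name `dnmask_size` is made up, and the block counts MDegrainN actually expects are the ones reported in its error message, which this sketch does not try to reproduce.

```python
def dnmask_size(width, height, blksize=8):
    """Mask width/height as computed in the example script (integer division)."""
    overlap_size = blksize // 2
    mask_x = width // blksize
    mask_y = (height - overlap_size) // (blksize - overlap_size)
    return mask_x, mask_y


# 1920x1080 frame with 8x8 blocks: 1920 // 8 and (1080 - 4) // (8 - 4)
print(dnmask_size(1920, 1080))
# 960x720 frame with the same block size
print(dnmask_size(960, 720))
```

If the plugin reports different numbers for a given overlap mode, the reported values should be used instead of this calculation.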
#234 | Link |
Registered User
Join Date: Dec 2005
Location: Sweden
Posts: 663
|
@DTL
How is it possible to use mdegrain2 with your version, or is it only possible with tr=1 and mdegrain()? I would like to make my Clay script to work with your version separately to get a speed boost and also a quality boost and yet keep close to the results I get with the current Pinterf version of mvtools. But I think I have to restructure the script without using mrecalculate and overlap in manalyse. Edit: maybe you have any further ideas for improvement in both speed and quality when using your version? Need to make it quite simple with for example HQ=true/false or I will have to add many possible parameters. Last edited by anton_foy; 6th July 2023 at 13:16. |
![]() |
Registered User
Join Date: Jul 2018
Posts: 989
|
"is it only possible with tr=1 and mdegrain()?"
MDegrain2 is tr=2. Yes - all new features are included only in MDegrainN. Also, as was found while testing IIR mode with TTempSmooth, any IIR filter (one with memory of previous frames) can only run correctly in the MT_SERIALIZE AVS+ MT mode. So with any IIR setting enabled (TTH_thUPD > 0), the current MDegrainN release (up to a.23) also requires MT_SERIALIZE to be set manually for MDegrainN (with SetFilterMTMode(.., force=true)), and to keep multithreading, only the internal AVSTP-based multithreading should be used (mt=true, with the updated avstp.dll from pinterf to avoid hangs). Thank the gods pinterf found and fixed that odd issue in avstp, and now mvtools can again run with internal multithreading, as it is really the only possible MT mode with 'temporal' processing like IIR filtering enabled. MT_SERIALIZE is also auto-activated for MAnalyse if the 'temporal' predictor is used, for the same reason. In the next versions MDegrainN will register itself with MT_SERIALIZE automatically if any IIR setting is activated. Maybe also try setting mt=true?
"I would like to make my Clay script to work with your version separately to get a speed boost and also a quality boost and yet keep close to the results I get with the current Pinterf version of mvtools. But I think I have to restructure the script without using mrecalculate and overlap in manalyse." The best MV quality is expected only from multi-generation MV refining - there is also an example in https://forum.doom9.org/showthread.p...64#post1987964 It is more complex to control because at least 2 different thSAD values must be adjusted, for the first and the following generations. With a not very small tr value for the first generation, a significant part of the noise is expected to be removed in the first generation, so the thSAD for the second generation may be about 0.6..0.7 of the first-generation thSAD. With noise removed perfectly, the last thSAD is expected to be first_thSAD/2.
But the best strategy for the number of refining generations and for decreasing thSAD at each generation (and maybe changing the tr value from lower in the first generation to higher in the second and later generations) is a subject for research (maybe also with Zopti). For such research the quality metric should preferably be structure-aware (like SSIM or VIF or another). Also, the best quality is expected from onCPU MAnalyse and full 4x overlap in both MAnalyse and MDegrainN (it is the slowest mode). So currently very many performance/quality modes are possible. Last edited by DTL; 10th July 2023 at 10:00. |
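The thSAD guidance above (second generation at about 0.6..0.7 of the first, never lower than first_thSAD/2) can be turned into a small worked example. The geometric decay used here is only one possible reading and is an assumption, as are the function name `thsad_schedule` and the default ratio of 0.65; the post explicitly leaves the best schedule open to research.

```python
def thsad_schedule(first_thsad, generations, ratio=0.65):
    """thSAD per generation: multiply by `ratio` each step, floor at first/2."""
    floor = first_thsad / 2
    values = []
    for gen in range(generations):
        # gen 0 is the initial degrain; later generations use a lower threshold
        values.append(int(round(max(first_thsad * ratio ** gen, floor))))
    return values


print(thsad_schedule(400, 4))
print(thsad_schedule(300, 3, ratio=0.7))
```

The values could be passed as thSAD to the MDegrainN call of each generation, with thSAD2 scaled alongside.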
#236 | Link |
Registered User
Join Date: Jul 2018
Posts: 989
|
New version - https://github.com/DTL2020/mvtools/r.../r.2.7.46-a.24
Added Auto-thSAD for MDegrainN. New params to MDegrainN:
thSADA_a (float), default = 0. Multiplier applied to the estimated noise level.
thSADA_b (float), default = 0. Offset added to the calculated Auto-thSAD.
If both thSADA_a and thSADA_b = 0, Auto-thSAD is disabled.
The Auto-thSAD used is a scaled and offset arithmetic mean of the block SAD values below thSCD1 (noise_estimate). The adjusting params are then applied:
Auto_thSAD = thSADA_a * noise_estimate + thSADA_b
thSAD2, thSADC and thSADC2 are calculated proportionally to the old param values provided. For a typical workflow the user must provide non-default values for both thSADA_a and thSADA_b. If only thSADA_b is provided, it acts as a static thSAD. Expected starting values are thSADA_a = 1.0 and thSADA_b = 10. Setting thSADA_a < 1.0 gives relatively higher denoise on low-noise scenes and lower on high-noise scenes. Setting thSADA_a > 1.0 gives relatively higher denoise on high-noise scenes and lower on low-noise scenes. thSADA_b is a simple additive offset (it may be negative too). This is the initial release of the Auto-thSAD feature for testing. Example: Code:
MDegrainN(last,super, multi_vec, tr, thSAD=135, thSAD2=120, mt=false, wpow=4, thSCD1=450, thSADA_a=1.05, thSADA_b=5, adjSADzeromv=0.5, adjSADcohmv=0.5, thCohMV=16, MVLPFGauss=0.9, thMVLPFCorr=50, IntOvlp=3) |
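To make the Auto-thSAD formula above concrete, here is a minimal Python sketch using the values from the example call (thSAD=135, thSAD2=120, thSCD1=450, thSADA_a=1.05, thSADA_b=5). The block SAD values are invented for illustration, the function names are hypothetical, and the proportional scaling of thSAD2 is an interpretation of "calculated proportionally to provided old params values", not a confirmed detail of the implementation.

```python
def auto_thsad(block_sads, thscd1, a, b):
    """Auto_thSAD = a * noise_estimate + b, using the mean of SADs below thSCD1."""
    usable = [s for s in block_sads if s < thscd1]
    if not usable:
        return None  # no block below thSCD1, nothing to estimate from
    noise_estimate = sum(usable) / len(usable)
    return a * noise_estimate + b


def scale_proportionally(auto, thsad, other):
    """Scale a secondary threshold by the same factor as thSAD was changed."""
    return auto * other / thsad


sads = [100, 120, 500]                   # made-up block SADs; 500 is above thSCD1
auto = auto_thsad(sads, thscd1=450, a=1.05, b=5)
print(auto)                              # about 1.05 * 110 + 5
print(scale_proportionally(auto, thsad=135, other=120))
```

A noisier frame raises the mean SAD and therefore the threshold, which is the point of making thSAD follow the content.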
#237 | Link | |
Registered User
Join Date: Dec 2005
Location: Sweden
Posts: 663
|
Quote:
Edit: btw. I tried to make your version correspond visually to pinterf's latest version of mvtools2 but yours with optSearchOption=5 and intOvlp=3 gave less denoising and less temporal stability even if I turned up thsad. Will post the full script comparisons later today if I can (Clay with fast=true since your version does not have MDegrain2 now) . Also I did not get any speed improvement which I guess is because of my old GPU. Last edited by anton_foy; 1st August 2023 at 08:13. |
|
#238 | Link |
Registered User
Join Date: Jul 2018
Posts: 989
|
"Does this new feature slow things down alot?"
I did not run performance tests. But it is expected to be very fast and not to cause a visible performance hit. If the performance hit is visible, performance may be improved in the next versions by pre-calculating the tr-weights. Currently the tr-weights roll-off for each frame (defined by the thSAD/thSAD2 difference) is calculated using the float cos() function. " optSearchOption=5 and intOvlp=3 gave less denoising and less temporal stability" In my tests the quality of ME with the GTX1060 card is somewhat worse than that of onCPU MAnalyse, but acceptable for denoising documentary series while offloading part of the work from the CPU, so the total mvtools+x264 encoding runs faster. For highest quality denoise work only onCPU MAnalyse is recommended (optSearchOption != 5/6). Hardware ME from an MPEG encoder ASIC is not simply a hardware-accelerated MAnalyse but a completely different ME engine that may be optimized for faster MPEG encoding rather than for quality. Also, each hardware version and each vendor (NVIDIA/AMD/Intel/others?) may provide different quality and performance. Maybe hardware ME can be used to speed things up in multi-generation ME refining, as the first generation of MAnalyse. My current test script for 2-generation MV refining with Auto-thSAD: Code:
# Input plugins
LoadPlugin("ffms2.dll")
LoadPlugin("mvtools2.dll")

SetFilterMTMode("DEFAULT_MT_MODE", 3)

my_thSADA_a=1.1
my_thSADA_b=50
my_thSAD=250
my_thSAD2=Int(Float(my_thSAD) * 0.8)
my_thSAD_mg=150
my_thSAD2_mg=Int(Float(my_thSAD_mg) * 0.8)
my_thSCD=my_thSAD+200
my_pzero=10
my_pnew=10
my_pglobal=10
my_pel=2
my_thCohMV=5 # 5..8 for pel=2, 10..16 for pel=4 ?
my_trymany=false
my_oPT=1
my_overlap=0
my_IntOvlp=3
my_searchparam=2
my_MPBNumIt=2
my_init_tr=6
my_refine_tr=6

Function RefineMV(clip mvclip, clip super_ref, clip src, int _thSAD, int _thSAD2, int in_tr, int refine_tr, int my_thSCD, int my_pel, bool my_trymany, int my_pnew, int my_pzero, int my_pglobal, \
 int my_oPT, int my_overlap, int my_searchparam, int my_IntOvlp, int my_thCohMV)
{
g_next=MDegrainN(src, super_ref, mvclip, in_tr, thSAD=_thSAD, thSAD2=_thSAD2, mt=false, wpow=4, thSCD1=my_thSCD, adjSADzeromv=0.6, adjSADcohmv=0.6, thCohMV=my_thCohMV, \
 MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=my_IntOvlp)
super_g_next=MSuper(g_next,chroma=true, mt=false, pel=my_pel)
return MAnalyse(super_g_next, SuperCurrent=super_ref, multi=true, delta=refine_tr, search=3, searchparam=my_searchparam, trymany=my_trymany, overlap=my_overlap, chroma=true, mt=false,\
 optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=true, optPredictorType=my_oPT)
}

Function RefineMVa(clip mvclip, clip super_ref, clip src, int _thSAD, int _thSAD2, float _thSADA_a, float _thSADA_b, int in_tr, int refine_tr, int my_thSCD, int my_pel, bool my_trymany, int my_pnew, int my_pzero, int my_pglobal, \
 int my_oPT, int my_overlap, int my_searchparam, int my_IntOvlp, int my_thCohMV)
{
g_next=MDegrainN(src, super_ref, mvclip, in_tr, thSAD=_thSAD, thSAD2=_thSAD2, thSADA_a=_thSADA_a, thSADA_b=_thSADA_b, mt=false, wpow=4, thSCD1=my_thSCD, adjSADzeromv=0.6, adjSADcohmv=0.6, thCohMV=my_thCohMV, \
 MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=my_IntOvlp)
super_g_next=MSuper(g_next,chroma=true, mt=false, pel=my_pel)
return MAnalyse(super_g_next, SuperCurrent=super_ref, multi=true, delta=refine_tr, search=3, searchparam=my_searchparam, trymany=my_trymany, overlap=my_overlap, chroma=true, mt=false,\
 optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=true, optPredictorType=my_oPT)
}

FFmpegSource2("src.mp4")

noproc=last

super_cpu=MSuper(last, mt=false, chroma=true, pel=my_pel, hpad=8, vpad=8, levels=0, pelrefine=true)

multi_vec_cpu=MAnalyse(super_cpu, multi=true, delta=my_init_tr, search=3, searchparam=my_searchparam, trymany=my_trymany, overlap=my_overlap, chroma=true, mt=false, \
 optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=true, optPredictorType=my_oPT)

multi_vec_cpu2=RefineMV(multi_vec_cpu, super_cpu, last, my_thSAD, my_thSAD2, my_init_tr, my_refine_tr, my_thSCD, my_pel, my_trymany, my_pnew, my_pzero, my_pglobal, my_oPT, \
 my_overlap, my_searchparam, my_IntOvlp, my_thCohMV)

multi_vec_cpu2a=RefineMVa(multi_vec_cpu, super_cpu, last, my_thSAD, my_thSAD2, my_thSADA_a, my_thSADA_b, my_init_tr, my_refine_tr, my_thSCD, my_pel, my_trymany, my_pnew, my_pzero, my_pglobal, my_oPT, \
 my_overlap, my_searchparam, my_IntOvlp, my_thCohMV)

cpu2=MDegrainN(last,super_cpu, multi_vec_cpu2, my_refine_tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, mt=false, wpow=4, UseSubShift=1, thSCD1=my_thSCD, adjSADzeromv=0.7, \
 adjSADcohmv=0.7, thCohMV=my_thCohMV, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=my_IntOvlp, MPBthSub=5, MPBthAdd=20, MPBNumIt=my_MPBNumIt, \
 MPB_SPCsub=0.5, MPB_SPCadd=1.5, MPBthIVS=2200, showIVSmask=false)

cpu2a=MDegrainN(last,super_cpu, multi_vec_cpu2a, my_refine_tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, thSADA_a=my_thSADA_a, thSADA_b=my_thSADA_b, mt=false, wpow=4, UseSubShift=1, thSCD1=my_thSCD, adjSADzeromv=0.7, \
 adjSADcohmv=0.7, thCohMV=my_thCohMV, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=my_IntOvlp, MPBthSub=5, MPBthAdd=20, MPBNumIt=my_MPBNumIt, \
 MPB_SPCsub=0.5, MPB_SPCadd=1.5, MPBthIVS=2200, showIVSmask=false)

cpu2a_s=MDegrainN(last,super_cpu, multi_vec_cpu2a, my_refine_tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, thSADA_a=my_thSADA_a, thSADA_b=my_thSADA_b, mt=false, wpow=4, UseSubShift=1, thSCD1=my_thSCD, adjSADzeromv=0.7, \
 adjSADcohmv=0.7, thCohMV=my_thCohMV, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=my_IntOvlp)

cpu_s=MDegrainN(last,super_cpu, multi_vec_cpu, my_init_tr, thSAD=my_thSAD, thSAD2=my_thSAD2, mt=false, thSCD1=my_thSCD, IntOvlp=my_IntOvlp)

Interleave(noproc.Subtitle("src"),cpu2.Subtitle("cpu2"), cpu2a_s.Subtitle("cpu2a_s"), cpu_s.Subtitle("cpu_s"))

Prefetch(..)

src - input source
cpu2 - 2-generation MV refining with MPB and static thSAD
cpu2a_s - 2-generation MV refining without MPB, with Auto-thSAD at all generations
cpu_s - single-generation MAnalyse and MDegrain with static thSAD (mostly close to the 2.7.45 version, only interpolated overlap is used for better performance).

The MAnalyse settings in the script are not the best possible for quality - I set them lower for better performance on my old E7500 test CPU. Better quality is expected with
my_pel=4
my_thCohMV=12 # 10..16 for pel=4 ?
my_trymany=true
my_oPT=0 # all predictors used
my_overlap=4 # full 4x real search overlap - slowest
my_IntOvlp=0
my_searchparam=2 # better expected with >2 and also pelsearch > 4 (for pel=4)

MPB processing in MDegrainN still does not seem to make things visibly better (at least on my grainy test footage), so for now it may be disabled for slightly better performance. 2-generation MV refining sometimes also reduces search errors at the borders of objects and in dark parts of scenes. Using Auto-thSAD (together with the earlier SAD-related tweaks for static, moving and 'coherently moving' blocks, adjSADzeromv and adjSADcohmv) keeps more detail in some areas, such as moving parts, by denoising less there.
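As an illustration of the tr-weights roll-off between thSAD and thSAD2 mentioned at the top of this post, here is a rough Python sketch. It is not taken from the mvtools source: the function name thsad_rolloff and the raised-cosine curve are assumptions, chosen only to show how a per-distance threshold table could be calculated once per tr instead of calling cos() for every frame.

```python
import math

def thsad_rolloff(thsad, thsad2, tr):
    """Hypothetical per-distance SAD thresholds for frames at distance 1..tr.

    Assumption: a raised-cosine blend from thsad (nearest frame) down to
    thsad2 (furthest frame). The real MDegrainN curve may differ.
    """
    if tr == 1:
        return [float(thsad)]
    table = []
    for d in range(1, tr + 1):
        x = (d - 1) / (tr - 1)                     # 0 at distance 1, 1 at distance tr
        w = 0.5 * (1.0 + math.cos(math.pi * x))    # weight falls smoothly from 1 to 0
        table.append(thsad2 + (thsad - thsad2) * w)
    return table

# Values from the test script: my_thSAD=250, my_thSAD2=Int(250*0.8)=200, tr=6.
# The table depends only on these three numbers, so it can be built once.
table = thsad_rolloff(250, 200, 6)
```

With the script values the table starts at thSAD for the nearest frame and ends at thSAD2 for the frame at distance tr, so pre-calculating it removes the per-frame cos() cost without changing the thresholds.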
Last edited by DTL; 1st August 2023 at 09:49. |
#240 | Link |
Registered User
Join Date: Jul 2018
Posts: 989
|
It is not likely to happen soon, not until we get good programmers to fix the bugs that have already been added. When I tried to enable internal MT with avstp.dll, both MAnalyse and MDegrain crashed with something like memory corruption. It only works stably with AVS+ global frame-based MT, so compatibility with internal MT via AVSTP looks severely broken. Internal MT in MDegrainN is highly recommended when using the additional IIR-based temporal filtering (which only works well in MT_SERIALIZED). So I think Dogway does not want to use such unstable versions.
About using hardware ME with very slow filtering - it really helps a lot in 2-generation MV refining. Two MAnalyse calls with 'very' slow settings like pel=4, tr=12, trymany=true come close to not starting at all. With DX12-ME from the GTX1060 used for the first MAnalyse, total transcoding runs at about 0.25 fps on an i5-9600K CPU. The current practical script with 'best quality' settings is: Code:
# Input plugins
LoadPlugin("ffms2.dll")
LoadPlugin("mvtools2.dll")

SetMemoryMax(10000)

my_thSADA_a=1.3
my_thSADA_b=80
my_thSAD=250
my_thSAD2=Int(Float(my_thSAD) * 0.8)
my_thSAD_mg=150
my_thSAD2_mg=Int(Float(my_thSAD_mg) * 0.8)
my_thSCD=my_thSAD+200
my_pzero=10
my_pnew=10
my_pglobal=10
my_pel=4
my_thCohMV=4 # 5..8 for pel=2, 10..16 for pel=4 ?
my_trymany=true
my_oPT=0
my_overlap=4
my_IntOvlp=0
my_searchparam=4
my_pelsearchparam=4
my_MPBNumIt=2
my_init_tr=12
my_refine_tr=12
my_MT=false

Function RefineMVa(clip mvclip, clip super_hwa, clip super_ref, clip src, int _thSAD, int _thSAD2, float _thSADA_a, float _thSADA_b, int in_tr, int refine_tr, int my_thSCD, int my_pel, bool my_trymany, int my_pnew, int my_pzero, int my_pglobal, \
 int my_oPT, int my_overlap, int my_searchparam, int _pelsearchparam, int my_IntOvlp, int my_thCohMV, bool _my_MT)
{
g_next=MDegrainN(src, super_hwa, mvclip, in_tr, thSAD=_thSAD, thSAD2=_thSAD2, thSADA_a=_thSADA_a, thSADA_b=_thSADA_b, mt=_my_MT, wpow=4, UseSubShift=1, thSCD1=my_thSCD, adjSADzeromv=0.6, adjSADcohmv=0.6, thCohMV=my_thCohMV, \
 MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=3)
super_g_next=MSuper(g_next,chroma=true, mt=_my_MT, pel=my_pel)
return MAnalyse(super_g_next, SuperCurrent=super_ref, multi=true, delta=refine_tr, search=3, searchparam=my_searchparam, pelsearch=_pelsearchparam, trymany=my_trymany, overlap=my_overlap, chroma=true, mt=false,\
 optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=true, optPredictorType=my_oPT)
}

FFmpegSource2("src.mp4")

super_cpu=MSuper(last, mt=my_MT, chroma=true, pel=my_pel, hpad=8, vpad=8, levels=0, pelrefine=true)
super_hwa=MSuper(last, mt=my_MT, chroma=true, pel=4, hpad=8, vpad=8, levels=1, pelrefine=false)

multi_vec_cpu=MAnalyse(super_cpu, multi=true, delta=my_init_tr, search=3, searchparam=my_searchparam, trymany=my_trymany, overlap=my_overlap, chroma=true, mt=false, \
 optSearchOption=1, truemotion=false, pnew=my_pnew, pzero=my_pzero, pglobal=my_pglobal, global=true, optPredictorType=my_oPT)

multi_vec_hwa=MAnalyse(super_hwa, multi=true, delta=my_init_tr, chroma=true, mt=false, \
 optSearchOption=5, levels=1)

multi_vec_cpu2a=RefineMVa(multi_vec_hwa, super_hwa, super_cpu, last, my_thSAD, my_thSAD2, my_thSADA_a, my_thSADA_b, my_init_tr, my_refine_tr, my_thSCD, my_pel, my_trymany, my_pnew, my_pzero, my_pglobal, my_oPT, \
 my_overlap, my_searchparam, my_pelsearchparam, my_IntOvlp, my_thCohMV, my_MT)

MDegrainN(last,super_cpu, multi_vec_cpu2a, my_refine_tr, thSAD=my_thSAD_mg, thSAD2=my_thSAD2_mg, thSADA_a=my_thSADA_a, thSADA_b=my_thSADA_b, mt=my_MT, wpow=4, UseSubShift=1, thSCD1=my_thSCD, adjSADzeromv=0.7, \
 adjSADcohmv=0.7, thCohMV=my_thCohMV, MVLPFGauss=0.9, thMVLPFCorr=50, adjSADLPFedmv=0.9, IntOvlp=my_IntOvlp)

Prefetch(..)

This gave me new ideas for getting better-quality MVs from the still-free resources of the hardware accelerator:

To use the spare capacity of the hardware ME accelerator (which typically also cannot do overlapped processing in mvtools order within a single search job), send several slightly shifted frames to the MV search, each with a slightly different block assignment (e.g. +-1 sample for 4:4:4 formats and +-2 samples for 4:2:0 formats). This generates 4 or 8 additional MVs around the 'current' block position, and some averaging of these 5 or 9 MVs may give a more noise-free MV for the current block. Averaging modes may be arithmetic mean or median (or other non-linear filtering of the data as a 1D vector or even a 2D array).

To make it usable with any MAnalyse mode and any other filter consuming MV data, it should end up as a separate mvtools filter, like MVProc(), with 5 or 9 possible inputs from several MAnalyse calls (or, in the future, 1 input from a single MAnalyse in a special multi-mode). MVLPF (and other possible future intermediate MV processing) could also move into this filter, so that it can be used with any MV-consuming filter in complex scripts and its output can be split to different filters using AVS scripting, for example as an additional predictor in multi-generation search scripts (see also feature 48).

The number of search positions around the current block may be increased until all possible integer block positions are covered. Sub-sample shifted positions might also be added (to fill a radius of 0.5..0.25 to 1.25 and more around the current block position).

The expected script for the new feature looks like: Code:
#current block pos
super=MSuper(last)
current_mvclip=MAnalyse(super,..)

#shifted 2 samples up block pos
# AddBorders takes (left, top, right, bottom): pad 2 rows at the bottom to restore the frame height
sh2_up=Crop(0,2,last.width, last.height-2).AddBorders(0,0,0,2)
super_sh2up=MSuper(sh2_up)
sh2_up_mvclip=MAnalyse(super_sh2up,..)

# same here for shifting 2 samples left, down, right

# combine 5 MV clips from current and shifted blocks assignment
mvclip=MVProc(current_mvclip, sh2_up_mvclip, sh2_down_mvclip, sh2_left_mvclip, sh2_right_mvclip, average_mode="median",.., optional MVLPF and other)

MDegrainN(last, super, mvclip,...)
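The averaging step of the proposed MVProc() can be sketched in Python as below. MVProc() does not exist yet, so this is only an assumption of how its "mean" and "median" modes might behave for one block; the name combine_mvs and the sample vectors are made up for the illustration.

```python
from statistics import median

def combine_mvs(candidates, mode="median"):
    """Combine several (dx, dy) vectors found for one block into a single vector.

    candidates: list of (dx, dy) tuples, e.g. one from the current block
    position and four from the shifted positions.
    mode: "median" (taken per component) or "mean" (arithmetic mean).
    """
    xs = [mv[0] for mv in candidates]
    ys = [mv[1] for mv in candidates]
    if mode == "median":
        return (median(xs), median(ys))
    if mode == "mean":
        return (sum(xs) / len(xs), sum(ys) / len(ys))
    raise ValueError("unknown mode: " + mode)

# Five estimates for the same block, one of them a search error
mvs = [(4, -2), (4, -2), (5, -2), (4, -1), (-12, 9)]
robust = combine_mvs(mvs, "median")
average = combine_mvs(mvs, "mean")
```

In this example the median ignores the single bad vector while the mean is pulled towards it, which is why a median (or another non-linear filter) looks more attractive for noisy MVs. Note that a per-component median is not a true vector median, so the result is not guaranteed to be one of the input vectors.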