Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
28th June 2022, 18:54 | #1301 | Link | |
21 years and counting...
Join Date: Oct 2002
Location: Germany
Posts: 716
|
Quote:
|
|
28th June 2022, 19:05 | #1302 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,361
|
Finally updated SMDegrain. It adds RGB support to ex_luma_rebuild(), fixes the issue in ex_retinex(), and adds an option to disable highlight compression in ex_retinex(), so it can work purely as a shadow enhancer; it's built inline, so for lvl=1 it's faster than before.
Also fixed saturation in ex_contrast() in GradePack, as it was a regression I didn't notice.
__________________
[i7-4790K@Stock::GTX 1070] AviSynth+ filters and mods on GitHub + Discussion thread |
29th June 2022, 10:25 | #1303 | Link |
Registered User
Join Date: Jul 2018
Posts: 1,074
|
About high thSAD:
As I see in the current version of SMDegrain:

Code:
thSCD1 = Default(thSCD1, round(0.35*thSAD + 260)) # Typically between 330 and 400

So if the user wants to use a thSAD of 1000, the default internal thSCD1 will be only about 610. If block SADs are generally above 610, the per-frame counter of "too bad" blocks quickly overflows, the frame is marked as unusable, and any further increase of tr (and thSAD) is useless (total MDegrain processing is even disabled if all reference frames are marked unusable). So for content with a high noise level it is better to set the thSCD1 parameter manually; otherwise increasing thSAD above about 500 quickly becomes useless and does not bring more blocks/frames into processing.

It would be good to add a note for script users: if thSAD > 500, it is recommended to set thSCD1 manually to about thSAD (not less), with a caution about possibly more bad-block blending errors at scene changes. Or it could be added into the script as a condition. So the formula thSCD1 = 0.35*thSAD + 260 stops working well with thSAD around 500..600+, and the user silently gets a non-degrained result. Maybe it is a design feature of SMDegrain, to save the user from a more distorted result at the cost of more remaining noise? If so, it should be documented.

The function that marks a whole frame as unusable is https://github.com/pinterf/mvtools/b...Blocks.cpp#L72

Code:
IsSceneChange(sad_t nTh1, int nTh2) const
{
    int sum = 0;
    for (int i = 0; i < nBlkCount; i++)
        sum += (blocks[i].GetSAD() > nTh1) ? 1 : 0;
    return (sum > nTh2);
}

So the "scene change" detector may also trigger the frame-unusable flag on very noisy content (too high inter-frame SAD of blocks) combined with a too-low thSCD1 parameter. When a frame is marked unusable, not a single block of that frame is used for denoising the output; the whole frame is discarded. Last edited by DTL; 29th June 2022 at 10:47. |
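[Editor's sketch] The condition suggested in the post above could look like this in AviSynth syntax, reusing the script's own Default() idiom. This is an illustration only, not part of SMDegrain:

Code:
# Sketch only: keep the modeled default for low thSAD, but once thSAD
# passes ~500 never let the default thSCD1 fall below thSAD itself.
thSCD1 = Default(thSCD1, (thSAD > 500) ? thSAD : round(0.35*thSAD + 260))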
29th June 2022, 10:50 | #1304 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,361
|
The formula was modeled after scene-change detection. For noisy sources anything higher than ~600 was not detecting any scene change at all and hence was blending scenes. It typically ends up above the default of 400, so it already improves on the defaults. Maybe we should decouple it from pure SC detection.
__________________
[i7-4790K@Stock::GTX 1070] AviSynth+ filters and mods on GitHub + Discussion thread |
29th June 2022, 12:53 | #1306 | Link |
Registered User
Join Date: Jul 2018
Posts: 1,074
|
"Maybe we should decouple it from pure SC detection."
There is no other 'scene detect' in the MAnalyse+MDegrain system. So the only high-quality solution for very noisy movies may be to cut scenes into separate clips, manually or automatically with AVS plugins, process each clip separately, and concatenate them back. That prevents frames from different scenes blending together when they fall inside the 2*tr group of neighbour frames, even with very high thSAD values and with the SCD functionality completely disabled by setting thSCD1 to some high value like 10000 and thSCD2 to 255. |
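[Editor's sketch] The manual per-scene workflow described above might look like this; the cut points and SMDegrain parameter values are made up for illustration, with SCD disabled exactly as suggested (thSCD1=10000, thSCD2=255):

Code:
# Sketch only: process two scenes separately, then splice back.
scene1 = Trim(0, 1499).SMDegrain(tr=12, thSAD=1000, thSCD1=10000, thSCD2=255)
scene2 = Trim(1500, 0).SMDegrain(tr=12, thSAD=1000, thSCD1=10000, thSCD2=255)
scene1 ++ scene2  # splice the processed scenes back together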
29th June 2022, 13:08 | #1307 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,361
|
I think block size of 16 is not supported, which is the default for HD. I should look into other changes though. Current defaults are modeled after vanilla mvtools, so I would need to account for the changes. In any case I still have the final Zopti optimization run with synthetic grain pending. What I'm seeing more and more is playback film-grain addition in codecs like JXL, AV1, HEVC, etc. So the trend is to encode clean and add synthetic grain on top at playback, although I'm suspicious of the quality of such real-time grain.
__________________
[i7-4790K@Stock::GTX 1070] AviSynth+ filters and mods on GitHub + Discussion thread |
29th June 2022, 13:30 | #1308 | Link |
Registered User
Join Date: Jul 2018
Posts: 1,074
|
" block size of 16 is not supported"
It is supported: 16x16 is the maximum block size of the DX12-ME hardware-accelerated mode, and it also works in most of the new MDegrainN modes. It is simply not well tested, because it produces lower quality, and I currently use only 8x8 for HD. As for speed in hardware ME, I don't remember whether 16x16 gives significantly better speed. Some 'fastest' hacked ME mode is planned: supply a half-sized image to the ME engine (level 1 of the MSuper clip instead of level 0), search with 8x8 blocks, and interpolate the MVs to double size with a 16x16 block size. It is still not finished. It should also already work with 'onCPU' MAnalyse with optPredictorType around 5 (output only the interpolated prediction at level 0 plus SAD recalculation). |
6th July 2022, 10:18 | #1310 | Link |
21 years and counting...
Join Date: Oct 2002
Location: Germany
Posts: 716
|
I'm playing around with prefilters at the moment. I can't decide between DGDenoise and BM3DCuda.
DGDenoise is very fast even on older graphics cards, but I've read quite a few times now that it should be used mainly on anime and not on film because it is very soft? I don't know how much of that still applies when it's running as a prefilter for MDegrain. Comments? Last edited by LeXXuz; 6th July 2022 at 10:57. |
6th July 2022, 10:36 | #1311 | Link | |
Registered User
Join Date: Dec 2005
Location: Sweden
Posts: 703
|
Quote:
Edit: although I only used it on 4K resolution material, I would think it needs different settings for lower resolutions. Last edited by anton_foy; 6th July 2022 at 11:38. |
|
7th July 2022, 07:30 | #1312 | Link | |
21 years and counting...
Join Date: Oct 2002
Location: Germany
Posts: 716
|
Quote:
|
|
7th July 2022, 08:23 | #1313 | Link | |
Registered User
Join Date: Dec 2005
Location: Sweden
Posts: 703
|
Quote:
Code:
ConvertBits(16)
# Levels shadow compensate:
Levels(0, 1, 255*256, 36*256, 255*256, coring=false)
# Prefilter:
p = Levels(12*256, 1.0, 100*256, 0, 255*256).TemporalSoften(3, 20, 255, 15, 2).ex_DGDenoise(str=0.3)
# Denoise:
SMDegrain(prefilter=p)
# Levels reset:
Levels(36*256, 1.0, 255*256, 0, 255*256)
Last edited by anton_foy; 9th September 2022 at 15:04. Reason: Edit: forgot to add 'Levels' before parenthesis in the first line. |
|
7th July 2022, 17:00 | #1314 | Link |
Registered User
Join Date: Dec 2005
Location: Sweden
Posts: 703
|
ex_reduceflicker
ex_reduceflicker(strength=1) is great, but it gives me strange halos and sometimes double-exposed, offset light sources on 4K sources. I can post before/after images if needed. It seems to only happen in bright areas.
|
7th July 2022, 17:18 | #1315 | Link | |
Registered User
Join Date: Jul 2018
Posts: 1,074
|
Quote:
In the new approach of the 2.7.45+ builds, the noise in the MVs is more or less removed after the MAnalyse search and before MDegrain blending, using the properties of long enough time-domain vector sequences: for a static block the MV in each frame of the tr-scoped group of frames should be zero, and with real-life motion the MV coordinates cannot change too fast across a sequence of frames. I still have not found a significant benefit from a prefilter while running hardware ME on a GTX 1060 and applying MVLPF in the new mvtools builds as intermediate processing between ME and MDegrain blending to fix some MV noise.

Currently this MVLPF filtering is integrated into MDegrainN (of the 2.7.45+ builds) and is not available for direct tweaking by script writers. I really wonder why such useful and not very complex processing was not implemented many years ago during the active development of mvtools. Or maybe it is somehow covered by MRecalculate or other parts of mvtools? My only idea is that the old MDegrainX (X = 1, 2, 3) ran too slowly on old CPUs for efficient time-domain filtering to be visibly profitable. Nowadays, with MDegrainN and tr > 10 running at acceptable speed, and ME acceleration where available, it becomes more usable and allows more complex transfer-function filters in the time domain for the motion of each block. It is very science-heavy processing (there may be no single best-for-all linear or even non-linear filter), and it may need to be tweaked per movie or per scene. So a long-requested script-accessible interface for the MV datastream between MAnalyse and MDegrain is planned, so that script writers can experiment with additional processing of the intermediate MV data to get the best results, just as a lot of work went into pre-processing before MAnalyse in the past.

Note, though, that anti-noise processing of MVs is not an image-processing task but DSP processing in the time domain: it cannot be performed fast with the existing 2D processing filters, and direct data-access scripts in interpreted mode may be much slower. So this is mostly about finding profitable filtering methods for future addition in compiled executable form. Last edited by DTL; 7th July 2022 at 17:32. |
|
7th July 2022, 19:13 | #1316 | Link | ||
21 years and counting...
Join Date: Oct 2002
Location: Germany
Posts: 716
|
Quote:
Quote:
|
||
8th July 2022, 10:33 | #1317 | Link |
Registered User
Join Date: Jul 2018
Posts: 1,074
|
The sad truth may be that mvtools was not designed as a denoising tool; the degrain part is only a small part of it. Its main task was frame interpolation for motion smoothing, which drove the main hacks and tweaks in MAnalyse toward better motion interpolation (fewer visible bugs), not toward the best possible degraining for video compression. So old mvtools versions have no built-in prefiltering or inter-filtering of data for the best possible denoising, and users have to test different handmade approaches around the given filter set of the mvtools project.
*After re-reading the mvtools2 doc*: Do the MStoreVect and MRestoreVect functions work for converting the internal MV format to a 'standard clip' and back, so that a script writer can access the MV data, do intermediate filtering of the MVs (between MAnalyse and MDegrain), and pass the changed data on to the next mvtools filters? Maybe this functionality already exists (even if not originally designed for this purpose) and does not need to be developed again. It is sparsely documented and does not document the mapping of each block's MV to the sample data of the 'normal clip', but it could be checked.

A question to Dogway: was this tested as a way to access MAnalyse MV data (both dx, dy for inter-filtering, and the frequently requested SAD for script-based automation like automatic thSAD assignment), using MStoreVect and pixel-accessing script functions (like Eval?)?

Also about MFlow: the documentation says it can be used for pixel-based rather than block-based compensation, and for denoising too (without block artifacts, and possibly without requiring overlap mode). So in theory it can smooth detail less on small, complex moving imagery like facial animation, where even a mid-sized 8x8 block is too large (and the typical 16x16 and 32x32 are awfully large) and downsampling the whole clip for 4x4 blocks is too slow. Was it used in the known degrain scripts already? Or was it too slow, so the old single-filter block-based MDegrain runs faster? Are there plans to test MFlow-based degraining in Dogway's scripts against the block-based MDegrain engine? Unfortunately, it looks like it does not currently support 'wrapped' multi-tr data passing from MAnalyse to degraining, so long hand-written scripts are needed to pass the +/- forward/backward-compensated groups of frames to a blending engine like the old MDegrainX for every tr value (or maybe a script function could be created?).

The simulation of the simplest MDegrain1 with MFlow in the documentation is already awfully large: Code:
super = MSuper()
backward_vectors = MAnalyse(super, isb = true)
forward_vectors = MAnalyse(super, isb = false)
forward_compensation = MFlow(super, forward_vectors, thSCD1=500) # or use MCompensate
backward_compensation = MFlow(super, backward_vectors, thSCD1=500)
# create interleaved 3-frame sequences
interleave(forward_compensation, last, backward_compensation)
DeGrainMedian() # place your preferred temporal (spatial-temporal) denoiser here
selectevery(3,1) # return filtered central (not-compensated) frames only

Or it is a task for some future MDegrainNFlow() function, or an option to MDegrainN like (UseFlowInterpolation=0/1).

Also, as an addition to the new 2.7.45+ builds supporting hardware ME acceleration and different block sizes: currently the Microsoft DX12-ME API supports only 8x8 and 16x16 block sizes. But as I see from the mvtools2 doc, there are special functions for 'scaling' MVs between different block sizes (for faster processing). So if a block size of 32x32 or higher is required, a downsampled clip can be fed to MAnalyse with a 16x16 block size and post-processed with MScaleVect to a 32x32 block size. It is the same idea as optSearchOption=6 inside MAnalyse, which then looks unnecessary and could be removed, since the functionality already exists as a developed mvtools function. It may be important in the era of mixed HD/UHD/UHD2 formats, or for 8K, which is not directly covered by the old ME accelerators of the mid-2010s. So an 8x8 block size can be processed on a 4K clip downsampled from the 8K source and, after MScaleVect to 16x16, supplied to an MDegrain processing the full-size 8K input clip. The precision reserve, defaulting down to qpel at 8x8 and 16x16 block sizes from ME accelerators at full speed, is quite enough for 4K and 8K processing: even after scaling to double block size, the precision is only lowered to half-pel. Last edited by DTL; 8th July 2022 at 10:47. |
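[Editor's sketch] The downscale-then-scale-vectors idea described above might be written as follows, assuming MScaleVect scales vectors by an integer factor as the mvtools2 doc describes; the resizer choice and all parameter values are illustrative only:

Code:
# Sketch only: analyse a half-size clip with 16x16 blocks, then scale
# the vectors 2x so they fit 32x32 blocks on the full-size clip.
src    = last
sup_lo = src.BilinearResize(src.Width()/2, src.Height()/2).MSuper()
bv     = sup_lo.MAnalyse(isb=true,  blksize=16).MScaleVect(2)
fv     = sup_lo.MAnalyse(isb=false, blksize=16).MScaleVect(2)
src.MDegrain1(src.MSuper(), bv, fv, thSAD=400)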
8th July 2022, 16:03 | #1318 | Link | |
Registered User
Join Date: Dec 2005
Location: Sweden
Posts: 703
|
Quote:
|
|
8th July 2022, 17:10 | #1319 | Link |
Registered User
Join Date: Jul 2018
Posts: 1,074
|
" I see you use degrainmedian/"use your favourite spatial/temporal denoiser""
It is an example copied directly from the mvtools2 doc, with its comments. Now I see that DeGrainMedian() is possibly a real external denoiser plugin/script.

"is this like motion compensation without mdegrain?"

In this example, which simulates MDegrain1 processing, the motion compensation is performed by MFlow, but some external engine is used for the final blending (maybe with protection from too-bad blending based on the absolute difference of sample values?). Or, with an even number of frames and/or equal weighting, maybe AviSynth's internal blenders could be used, like Overlay() or Layer() with a 1/3-weight mask for each of the 3 clips (current, -1 frame compensated, +1 frame compensated).

MDegrainX/N is a compiled executable for several operations:
1. Motion compensation of blocks based on MAnalyse data.
2. Blending of the compensated blocks in the tr-scoped pool of surrounding frames, using weighting based on SAD values relative to the thSAD params (plus scene-change detection), in simple or overlapped blending mode based on service data from MAnalyse.
Last edited by DTL; 8th July 2022 at 17:14. |
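[Editor's sketch] For the equal-weight case, a simpler alternative to Overlay()/Layer() with a 1/3-weight mask is AviSynth's built-in Merge; the clip names below refer to the MFlow example from the mvtools2 doc, with `last` as the current (uncompensated) clip:

Code:
# Sketch only: average the current frames with the two MFlow-compensated
# neighbour clips at equal 1/3 weights.
pair = Merge(forward_compensation, backward_compensation)  # (fwd + bwd) / 2
Merge(pair, last, 1.0/3)  # pair*2/3 + current*1/3 = equal thirds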
9th July 2022, 10:57 | #1320 | Link |
21 years and counting...
Join Date: Oct 2002
Location: Germany
Posts: 716
|
I'm looking for a preset between 'Low' and 'Normal' for ex_BM3D in SMDegrain.
I'm asking because the low profile still leaves a lot of headroom on my GPUs with 1080p sources. The 'Normal' profile, however, drops performance significantly and makes the GPU the bottleneck. Both with radius=1, of course. I just don't want to tweak numbers here as long as I don't understand what they really do, and maybe risk worsening quality. I also don't quite understand why the RTX 30 GPU performs just slightly better than an old GTX 10 GPU, although it has several times the CUDA processing power according to Geekbench. |
Tags |
avisynth, dogway, filters, hbd, packs |