Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
23rd November 2021, 21:34 | #641 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,371
|
I ported NVIDIA's new sharpener; read its blog post, and here is the repo. It's open source, to compete against AMD's FSR. Someone else can port their scaler, as that isn't possible with AviSynth syntax.
I didn't like it much though. It's a bit naive: a simple min/max edge mask that is later discretized, min/max limiting, and a simple unsharp mask fixed at a 3x3 window. The bitrate increase is understandable, since you are increasing acutance, which raises SAD, but when I encode I only care about output quality. I guess if you want to maximize quality per bitrate, LSFmod's options are best suited, because you can tell it to localize sharpening only where it matters most.
I didn't read much of the new MVTools developments by DTL. I read about some concessions on using only 8x8 blocks, which is not optimal for HD. Furthermore, I would really like to see adaptive block sizes like in H.265: a clear sky wouldn't need blocks smaller than 32, whereas a city landscape would do better with 8. I will wait for a stable release. I will have a look now at the pending issues.
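For readers who want to see what "unsharp mask fixed at a 3x3 window plus min/max limiting" amounts to, here is a minimal sketch in plain Python. It is a generic illustration, not NVIDIA's code and not the AviSynth port: the function names and the `strength` parameter are made up for this example, the 3x3 blur is assumed to be a plain box average, and the discretized edge mask is not modelled at all.

```python
def window_3x3(img, y, x):
    """Return the nine values around (y, x), repeating border pixels at the edges."""
    h, w = len(img), len(img[0])
    return [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
            for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def sharpen_limited(img, strength=1.0):
    """Unsharp mask on a 3x3 box blur, clamped to the local 3x3 min/max."""
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            win = window_3x3(img, y, x)
            blur = sum(win) / 9.0
            # Classic unsharp mask: add back the difference to the blur.
            sharp = img[y][x] + strength * (img[y][x] - blur)
            # Min/max limiting: never leave the range of the neighbourhood.
            row.append(min(max(sharp, min(win)), max(win)))
        out.append(row)
    return out


ramp = [[0, 2, 8, 10]] * 3   # soft edge: gets steeper
step = [[0, 0, 10, 10]] * 3  # hard edge: limiting removes the overshoot
ramp_out = sharpen_limited(ramp)
step_out = sharpen_limited(step)
```

On the soft ramp the inner pixels are pushed apart (more acutance), while on the hard step the limiter clamps every overshoot back to the original values, so no halo appears.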
__________________
i7-4790K@Stock::GTX 1070] AviSynth+ filters and mods on GitHub + Discussion thread |
23rd November 2021, 22:03 | #643 | Link | |
Registered User
Join Date: Nov 2009
Posts: 2,371
|
ah, that's easy, use blur with clamping. ex_luts("clamp") as alternative to dehaloers or custom convolutions.
Quote:
Code:
limit = 0.30
convertbits(16)
a=SPresso(bias=20, biasc=40, rgmodec=4, limit=limit, limitc=limit*2)
ex_makediff(a, aug=100, UV=128)

When using your call, yes, some blocks seem to change more going from 0.29 to 0.30, but this also happens with older SPresso scripts; this is the nature of hard thresholding/limiting. I implemented soft thresholding in a few of my filters (not all). Anyway, I'm going to add old=true to my ex_MinBlur() call so it's an exact match to the older SPresso. I will edit the script again when pinterf adds the sign operator.
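To illustrate why a tiny change in a limit can flip whole areas with hard thresholding, here is a small sketch in Python. It is a generic per-pixel model, not the actual SPresso or ExTools expressions: `hard_threshold`, `soft_threshold` and the `width` parameter are hypothetical names, and the linear fade is just one possible way to soften the decision.

```python
def hard_threshold(src, flt, th):
    """Accept the filtered value only if it changed the source by at most th."""
    return flt if abs(src - flt) <= th else src


def soft_threshold(src, flt, th, width):
    """Like hard_threshold, but fade from flt back to src over [th, th + width]."""
    d = abs(src - flt)
    if d <= th:
        return flt
    if d >= th + width:
        return src
    w = (d - th) / width          # 0 at th, 1 at th + width
    return flt + (src - flt) * w  # linear blend towards the source


# A pixel whose filtered change sits right at the limit:
at_limit = hard_threshold(100, 110, 10)      # change of 10 is accepted
just_over = hard_threshold(100, 110.5, 10)   # change of 10.5 is rejected
halfway = soft_threshold(100, 112, 10, 4)    # change of 12 is half accepted
```

With the hard version the output jumps from 110 to 100 when the change grows by only 0.5, which is the kind of all-or-nothing behaviour described above; the soft version moves gradually instead.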
__________________
i7-4790K@Stock::GTX 1070] AviSynth+ filters and mods on GitHub + Discussion thread Last edited by Dogway; 23rd November 2021 at 22:49. |
|
24th November 2021, 05:15 | #644 | Link | |
Pig on the wing
Join Date: Mar 2002
Location: Finland
Posts: 5,812
|
Quote:
__________________
And if the band you're in starts playing different tunes I'll see you on the dark side of the Moon... |
|
24th November 2021, 17:47 | #646 | Link |
Pig on the wing
Join Date: Mar 2002
Location: Finland
Posts: 5,812
|
I found a rather peculiar problem with SoftLimiter and apparently multithreading.
This script causes strange seeking, which quite often crashes RequestLinear with "RequestLinear: internal error (frame not cached)!". I'm currently on test 27 of AviSynth+'s latest builds.

DGSource("source.dgi")
RequestLinear(clim=100, debug=true)
SoftLimiter(dyn=2)
Prefetch(threads=24, frames=12) # good for 12-core 3900X

https://pastebin.com/TgjREc0f, see for example line 588.
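As a way to reason about this kind of error, here is a toy model in Python of a filter that only reads its source forwards and keeps the most recent frames in a bounded cache. It is purely illustrative and makes no claim about how RequestLinear is implemented: the class, the exception and the `cache_size` parameter are all hypothetical.

```python
from collections import OrderedDict


class FrameNotCached(Exception):
    """Raised when a requested frame has already fallen out of the cache."""


class LinearReader:
    """Toy forward-only reader with a bounded cache of recent frames."""

    def __init__(self, cache_size):
        self.cache_size = cache_size
        self.cache = OrderedDict()
        self.next_frame = 0
        self.decoded = []  # order in which frames were pulled from the source

    def get(self, n):
        # Only ever read forwards, one frame at a time.
        while self.next_frame <= n:
            self.decoded.append(self.next_frame)
            self.cache[self.next_frame] = "frame%d" % self.next_frame
            self.next_frame += 1
            if len(self.cache) > self.cache_size:
                self.cache.popitem(last=False)  # drop the oldest frame
        if n not in self.cache:
            raise FrameNotCached("frame %d not cached" % n)
        return self.cache[n]


reader = LinearReader(cache_size=4)
for n in (0, 2, 1, 3):   # mildly out-of-order requests are fine
    reader.get(n)

reader.get(10)           # cache now holds frames 7..10
try:
    reader.get(5)        # too far behind the cache window
    failed = False
except FrameNotCached:
    failed = True
```

In this model, requests that arrive slightly out of order (as worker threads might produce) are served from the cache, but a request that lands further back than the cache reaches cannot be satisfied without seeking.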
__________________
And if the band you're in starts playing different tunes I'll see you on the dark side of the Moon... |
24th November 2021, 19:46 | #648 | Link |
Pig on the wing
Join Date: Mar 2002
Location: Finland
Posts: 5,812
|
This strange glitch, https://github.com/pinterf/mvtools/issues/37, does not occur if I use RequestLinear there.
__________________
And if the band you're in starts playing different tunes I'll see you on the dark side of the Moon... |
24th November 2021, 20:09 | #649 | Link |
Useful n00b
Join Date: Jul 2014
Posts: 1,666
|
DGIndex and DGIndexNV both use the same random access logic, so if you can find a way to produce this error with DGIndex then perhaps pinterf will be able to look at it. We should chip in to get him an nVidia GPU, if anyone can find one.
|
24th November 2021, 22:42 | #650 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,371
|
It's the same thing I posted a while ago: antialiasing, dehaloers and masked blurs are all good for that.
__________________
i7-4790K@Stock::GTX 1070] AviSynth+ filters and mods on GitHub + Discussion thread |
25th November 2021, 11:21 | #651 | Link |
Acid fr0g
Join Date: May 2002
Location: Italy
Posts: 2,846
|
Unfortunately, none of them gave good results on movies such as Transformers, Pacific Rim, etc., where the CGI sharpness was exaggerated to increase "realism".
When you have time, please have a look at my message about Brazil, before the file shares expire.
__________________
@turment on Telegram |
25th November 2021, 11:55 | #652 | Link | |
Registered User
Join Date: Jul 2018
Posts: 1,218
|
Quote:
Though in the latest speed test I tried a 4x 8x8 block search (in a 4x1 arrangement, which is faster to load from memory) with both the H and V steps reloading from memory (cache), and it is still a bit faster (about 1.5x). The 4x 8x8 blocks may also be arranged into a 'macroblock' in different ways, such as 4x1 in the H direction, or 2x2 in H and V, which makes a 16x16 block. But this multi-8x8 block processing still has 8x8-precision refining, because each 8x8 block is refined. In a classic 16x16 block search there is no refining down to 8x8 granularity. But because a 16x16 block really fits in 1/2 of the AVX2 register file, I will add 16x16 block search in 'standard' mode (with 16x16 granularity). It will not be as fast as 8x8 (inside the search algorithm), but it should still be faster than the old 16x16 each-position SSE search (Expanding search), because of 4 times less vector data processing overhead compared with the 8x8 block size. It is still only for AVX2-capable chips, though.
Sadly, the war between Intel and AMD caused AMD to go to massively multicore chips without AVX512. So it looks like the CPU manufacturers lost the war against HW accelerators for large-vector processing, and the future expansion of large-vector co-processors in standard CPU cores may be stopped. Maybe AVX2 is the last commonly used large-vector co-processing capability in general client CPU chips in the current dying civilization. All other large-vector processing must be offloaded to separate HW accelerators (like a GPU or other).
There are some ideas for 'distributed' MAnalyse processing: sending 'workunits' to 'workers' that can be IP-connected to any host on the net. The localhost can also have any kind of 'worker' (also GPU-accelerated) to receive working tasks from the main MAnalyse process and return a small amount of data about the refined vectors (a 12-byte chunk per block). But all workers must have access to the files being processed (the source).
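The register-file arithmetic behind the "16x16 fits in 1/2 of the AVX2 register file" remark can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes 8-bit samples (one byte per pixel) and the architectural register counts in 64-bit mode (16 YMM registers of 32 bytes for AVX2, 32 ZMM registers of 64 bytes for AVX-512); the helper names are made up for this example.

```python
AVX2_REGISTER_FILE = 16 * 32     # 16 YMM registers x 32 bytes = 512 bytes
AVX512_REGISTER_FILE = 32 * 64   # 32 ZMM registers x 64 bytes = 2048 bytes


def block_bytes(width, height, bytes_per_sample=1):
    """Size of one block in bytes (8-bit samples assumed by default)."""
    return width * height * bytes_per_sample


def register_file_share(width, height, register_file):
    """Fraction of the register file that one block occupies."""
    return block_bytes(width, height) / register_file


four_8x8 = 4 * block_bytes(8, 8)                                  # 256 bytes
share_16x16 = register_file_share(16, 16, AVX2_REGISTER_FILE)     # 0.5
share_32x32 = register_file_share(32, 32, AVX512_REGISTER_FILE)   # 0.5
```

Under these assumptions four 8x8 blocks and one 16x16 block both take 256 bytes, which leaves the other half of the AVX2 registers for reference data and intermediate results; with 16-bit samples the same blocks would need twice the space.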
So users/programmers can make 'vector search/refine task workers', separated by the IP layer, with any processing capabilities (CPU/GPU/ASIC, etc.), and it will be easier to debug and scale. The size of a 'workunit' may be as small as one block or as large as a tr-step (a total tr-stepped frame for MDegrainN, though currently MDegrainN asks for each tr frame separately with GetFrame, not for all 2*tr MAnalyse data at once).
AVX512 can fit a 32x32 block for faster search, but it looks like the present and near future of AVX512 in client chips are not as nice as expected. Though some pros with high-performance workstations (some HP workstations have Xeon Gold chips with AVX512) and servers can use it.
I am also moving towards the idea of dynamic block size processing, because 8x8 still has a lot of vector processing overhead. If the vectors of a local region are coherent enough, we can use multi-8x8 block processing for SAD search/refining, with an increase in speed. But typically clear sky and clear low-detail (and low-contrast?) areas do not create coherent enough vector fields. And since the level 0 search, the most critical one for speed, internally uses r=1, we need really totally coherent prediction vectors (x1=x2=x3=x4 and y1=y2=y3=y4) to use multi-block processing (or even to arrange the area into a 4x macroblock), because any possible refining at the finest level can only be +-1 in the x,y coordinates. Maybe the maximum non-coherency could be +-1, because if the predicted block coordinates are the best, the search phase will simply confirm the prediction and regenerate the SAD value (as is currently done with optPredictorType=3). Or the algorithm for switching to 'macroblocks' needs additions, such as checking not only vector coherency but maybe also the spectrum or total contrast, or something else. Last edited by DTL; 25th November 2021 at 12:28. |
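The coherency test described in this post (all four predicted vectors equal, or optionally within +-1) could be sketched as follows in Python. This is an illustration of the idea only, not MVTools code: the function name, the `tolerance` parameter and the tuple representation of vectors are assumptions made for this example.

```python
def can_use_macroblock(vectors, tolerance=0):
    """Decide whether four 8x8 predictions are coherent enough to merge.

    vectors   -- four (x, y) predicted motion vectors, one per 8x8 block
    tolerance -- largest allowed spread per coordinate (0 means x1=x2=x3=x4
                 and y1=y2=y3=y4; 1 allows the +-1 relaxation)
    """
    xs = [v[0] for v in vectors]
    ys = [v[1] for v in vectors]
    return (max(xs) - min(xs) <= tolerance and
            max(ys) - min(ys) <= tolerance)


identical = [(3, -1), (3, -1), (3, -1), (3, -1)]
nearly = [(3, -1), (4, -1), (3, -1), (3, -2)]
scattered = [(3, -1), (9, 4), (3, -1), (3, -1)]

strict_ok = can_use_macroblock(identical)
strict_nearly = can_use_macroblock(nearly)
relaxed_nearly = can_use_macroblock(nearly, tolerance=1)
relaxed_scattered = can_use_macroblock(scattered, tolerance=1)
```

Extra criteria such as local contrast or spectrum would simply become further conditions on top of this check.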
|
25th November 2021, 12:42 | #653 | Link | |
Pig on the wing
Join Date: Mar 2002
Location: Finland
Posts: 5,812
|
Quote:
Then again, the strange seeking in SoftLimiter is also something worth investigating.
__________________
And if the band you're in starts playing different tunes I'll see you on the dark side of the Moon... |
|
25th November 2021, 13:41 | #654 | Link | |
Registered User
Join Date: Nov 2009
Posts: 2,371
|
Quote:
I already downloaded the clips, and at first glance it seems to be a problem of high 'thSADC', but I need to take a deeper look at it. As for softening, I can't recommend anything other than deringing (which ex_luts("clamp") does) and/or blurring. I ran tests on it and it worked fairly well for Transformers with the antialiasing + deringing combo. Not sure what your goal is; this is as good as it gets without butchering the source.
Antialiasing for cutting out the sharpness
Deringing to eliminate acutance
Blurring (optional) for extra softening
As a side effect, mosquito noise also seems to be eliminated, so no bitrate-hungry artifacts remain, especially if you pair it with a faint temporal denoiser like FluxSmooth or STpresso.
Code:
SantiagMod(strh=2,strv=2)
a=last
ex_luts(last,removegrain(12,0),mode="clamp",UV=1)
th = ex_bs(7, 8, bi, false)
ex_lutxy(last,a,Format("x y - abs {th} > y x y min ?")) # hard threshold
# The next part is entirely optional
msk=ex_makediff(ex_guidedblur(12,use_gauss=false,regulation_mode=2), aug=10, UV=128)
ex_merge(last,ex_boxblur(0.15,mode="weighted"),msk,luma=true)

AA+DR, AA+DR+Blur, AA+DR+Localized Blur

@Boulder: Can you try with Dyn=0? I think the issues may come from ShowChannels() or something, maybe ScriptClip(), since I'm using "After_Frame=True". Until pinterf has a look at what is preventing ScriptClip from catching ShowChannels() globals, I can't switch to the newer ScriptClip() syntax (more MT friendly).
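For anyone not used to reverse Polish expressions, the string "x y - abs {th} > y x y min ?" reads in infix as "if |x - y| > th then y, else min(x, y)". Here is a per-pixel sketch in Python of that reading; it only models the expression itself, with `th` as a plain number (the bit-depth scaling done by ex_bs() in the script is not modelled), and the function name is hypothetical.

```python
def hard_threshold_pixel(x, y, th):
    """Infix form of the RPN expression "x y - abs th > y x y min ?".

    x  -- pixel from the first clip passed to the expression
    y  -- pixel from the second clip
    th -- threshold, in the same units as the pixel values
    """
    if abs(x - y) > th:
        return y          # difference too large: take the second clip
    return min(x, y)      # otherwise keep the darker of the two


def apply_expr(row_x, row_y, th):
    """Apply the expression to two rows of pixels."""
    return [hard_threshold_pixel(x, y, th) for x, y in zip(row_x, row_y)]


result = apply_expr([100, 100, 104], [120, 104, 100], th=7)
```

Because the switch at `th` is a plain comparison, two pixels whose difference sits just on either side of the threshold can end up with noticeably different results, which is what the "hard threshold" comment refers to.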
__________________
i7-4790K@Stock::GTX 1070] AviSynth+ filters and mods on GitHub + Discussion thread Last edited by Dogway; 25th November 2021 at 14:08. |
|
25th November 2021, 16:00 | #655 | Link | |
Pig on the wing
Join Date: Mar 2002
Location: Finland
Posts: 5,812
|
Quote:
__________________
And if the band you're in starts playing different tunes I'll see you on the dark side of the Moon... |
|
29th November 2021, 14:29 | #657 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,371
|
Just updated SPresso again with more HD modes. I haven't done STpresso, but you can use STTWM(), which is now included in ExTools.
I'm also in the middle of a big update for all my filters (big in importance, not in number of changes), since I'm going to rework the whole HBD scale thingy to reach parity with the AVS+ core and masktools2 after the changes in test26. This is important for MIX mods, but I need to synchronize all the scripts to upload them at once.
__________________
i7-4790K@Stock::GTX 1070] AviSynth+ filters and mods on GitHub + Discussion thread |
29th November 2021, 17:50 | #658 | Link | |
Acid fr0g
Join Date: May 2002
Location: Italy
Posts: 2,846
|
Quote:
Unfortunately, I have absolutely no idea how to translate the parameters from STpresso to STTWM.
__________________
@turment on Telegram |
|
29th November 2021, 18:55 | #659 | Link |
Registered User
Join Date: Nov 2009
Posts: 2,371
|
Code:
sthres -> limit
tthres -> tlimit
tw -> tbias
sw -> bias
aw -> back

Also, the temporal part is not motion compensated, because a weighted temporal median is already self motion-protected. The benefit of all these changes is greater speed and simplicity. The only thing that could be improved is to use Didée-style thresholding/limiting or my own soft threshold solution.
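The table above can be turned into a small lookup for converting an existing call. The sketch below is in Python and reads the left-hand names as STTWM() parameters and the right-hand names as their STpresso() equivalents, which is an assumption about the direction of the arrows; the helper name is made up, and any argument without an entry in the table is returned separately rather than guessed at.

```python
# Left: STTWM name, right: STpresso name (assumed reading of the table).
STTWM_TO_STPRESSO = {
    "sthres": "limit",
    "tthres": "tlimit",
    "tw": "tbias",
    "sw": "bias",
    "aw": "back",
}
STPRESSO_TO_STTWM = {old: new for new, old in STTWM_TO_STPRESSO.items()}


def stpresso_to_sttwm(args):
    """Rename STpresso keyword arguments to their STTWM counterparts.

    Returns (translated, leftover); leftover holds arguments that have
    no entry in the table and need a manual decision.
    """
    translated, leftover = {}, {}
    for name, value in args.items():
        new_name = STPRESSO_TO_STTWM.get(name.lower())
        if new_name is None:
            leftover[name] = value
        else:
            translated[new_name] = value
    return translated, leftover


translated, leftover = stpresso_to_sttwm(
    {"limit": 10, "bias": 30, "tlimit": 8, "tbias": 49, "back": 1, "mc": True}
)
```

Here `mc` ends up in `leftover`, which fits the remark that the temporal part is not motion compensated.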
__________________
i7-4790K@Stock::GTX 1070] AviSynth+ filters and mods on GitHub + Discussion thread |
30th November 2021, 14:36 | #660 | Link | |
Acid fr0g
Join Date: May 2002
Location: Italy
Posts: 2,846
|
Quote:
Despite being a B&W movie, it has some really difficult trade-offs between noise reduction and detail (marble, clothes, wood, smoke), plus I am having a really hard time obtaining a decent bitrate when encoding. I have tried some of your previous hints, plus others I knew already, such as:

ConvertBits(16)
SMDegrain (tr=9, thSAD=900, refinemotion=true, contrasharp=false, PreFilter=5, plane=4, chroma=true)
fmtc_bitdepth (bits=8,dmode=8)

ConvertBits(16)
pre=smdegrain(tr=6,mode="temporalsoften",blksize=32,thSAD=900,prefilter=5,contrasharp=false,refinemotion=true)
smdegrain(tr=6,mode="MDegrain",blksize=32,prefilter=pre,thSAD=900,LFR=false,contrasharp=false,refinemotion=true)
fmtc_bitdepth (bits=8,dmode=8)

ConvertBits(16)
Spresso(limit=10,bias=30).STTWM(sw=30, tw=49, aw=1, sthres=10, tthres=8)
fmtc_bitdepth (bits=8,dmode=8)

ConvertBits(16)
Spresso(10,30).STpresso(limit=10,bias=30,RGmode=4,tthr=22,tlimit=8,tbias=49,back=1,mc=true)
SMDegrain (tr=6, thSAD=600, refinemotion=true, contrasharp=false, PreFilter=4, plane=4, chroma=true)
fmtc_bitdepth (bits=8,dmode=8)

ConvertBits(16)
Spresso(limit=10,bias=30).STTWM(sw=30, tw=49, aw=1, sthres=10, tthres=8)
SMDegrain (tr=6, thSAD=600, refinemotion=true, contrasharp=false, PreFilter=4, plane=4, chroma=true)
fmtc_bitdepth (bits=8,dmode=8)

Needless to say, I am not getting good results, in terms of either quality or size. Can you pull a rabbit out of the hat?
__________________
@turment on Telegram |
|
Tags |
avisynth, dogway, filters, hbd, packs |