Old 19th October 2021, 04:59   #41  |  Link
MysteryX
Soul Architect
 
 
Join Date: Apr 2014
Posts: 2,559
Released beta 4

TODO:
- re-test GRAY source and chroma=False
- for CPU mode, use BM3DCPU 32-bit or BM3D 16-bit?
- once FMTC releases a bug fix, use it for OPP-RGB conversion for better performance

Let me know if you see missing dependencies in the doc.

Last edited by MysteryX; 19th October 2021 at 15:23.
Old 20th October 2021, 17:28   #42  |  Link
MysteryX
Soul Architect
 
 
Join Date: Apr 2014
Posts: 2,559
Beta 5

It's feature-complete. The only things left to do at this point are:
- look for a more optimized YUV-OPP conversion
- fix bugs
- convert to Avisynth

One possible performance improvement would be to process Luma and Chroma with different settings; but I run it at max quality anyway.

Last edited by MysteryX; 20th October 2021 at 17:34.
Old 22nd October 2021, 18:10   #43  |  Link
MysteryX
Soul Architect
 
 
Join Date: Apr 2014
Posts: 2,559
Optimized RGB-OPP conversion for BM3D (10x faster)
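
For readers unfamiliar with OPP: the opponent transform is a plain 3x3 matrix, so it can be sketched per pixel in a few lines. This is only an illustration in plain Python, assuming the normalized matrix [1/3 1/3 1/3; 1/2 0 -1/2; 1/4 -1/2 1/4] given in the VapourSynth-BM3D documentation. It is not the optimized conversion used in the script, and the function names are made up for this example.

```python
# Per-pixel RGB <-> OPP sketch (assumed normalized opponent matrix; names are illustrative).

def rgb_to_opp(r, g, b):
    """Forward transform: one achromatic channel and two opponent channels."""
    o = (r + g + b) / 3.0
    p1 = (r - b) / 2.0
    p2 = (r - 2.0 * g + b) / 4.0
    return o, p1, p2


def opp_to_rgb(o, p1, p2):
    """Inverse transform, obtained by inverting the matrix above."""
    r = o + p1 + (2.0 / 3.0) * p2
    g = o - (4.0 / 3.0) * p2
    b = o - p1 + (2.0 / 3.0) * p2
    return r, g, b


if __name__ == "__main__":
    pixel = (0.8, 0.4, 0.1)
    restored = opp_to_rgb(*rgb_to_opp(*pixel))
    # The round trip should reproduce the input up to floating-point error.
    print(all(abs(a - b) < 1e-9 for a, b in zip(pixel, restored)))
```

Because it is a fixed matrix, the same arithmetic can be applied to whole planes at once, which is where the speed of a real implementation comes from.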

Convert to YCgCoR for KNLMeans, which removes a slight "bleach" effect.
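
As background, YCgCo-R is the reversible (lifting) variant of YCgCo: on integer samples it round-trips exactly. The sketch below shows the standard lifting steps in plain Python for a single pixel. How the script itself performs the conversion is not shown in this thread, so treat this purely as an illustration with made-up function names.

```python
# Integer YCgCo-R lifting sketch (standard lifting steps; function names are illustrative).

def rgb_to_ycgco_r(r, g, b):
    """Forward lifting. Cg and Co need one extra bit of range compared to the input."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co


def ycgco_r_to_rgb(y, cg, co):
    """Inverse lifting: undo the forward steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b


if __name__ == "__main__":
    samples = [(0, 0, 0), (255, 0, 0), (12, 200, 97), (255, 255, 255)]
    # Every sample should survive the round trip unchanged.
    print(all(ycgco_r_to_rgb(*rgb_to_ycgco_r(*s)) == s for s in samples))
```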
Old 23rd October 2021, 04:14   #44  |  Link
MysteryX
Soul Architect
 
 
Join Date: Apr 2014
Posts: 2,559
Could someone help me out with optimization?

I wrote a second version of the script that manages formats and colorspaces differently. The numbers below are the output of vspipe -p, which shows per-plugin performance (only the top of each list is shown).

Script #1 runs at 0.66fps
Code:
BM3D                 parallel        99.94      46.98
KNLMeansCL           parreq          67.36      31.66
Bicubic              parallel        47.61      22.38
Degrain3             parallel        39.56      18.59
Analyse              parallel        33.87      15.92
Analyse              parallel        33.84      15.91
Analyse              parallel        33.68      15.83
Analyse              parallel        33.66      15.82
Analyse              parallel        33.44      15.72
Analyse              parallel        33.19      15.60
Bicubic              parallel        30.92      14.53
Super                parallel        25.06      11.78
Bicubic              parallel        24.06      11.31
Script #2 is crawling at 0.45fps
Code:
Bicubic              parallel        91.75      61.10
BM3D                 parallel        67.49      44.94
bitdepth             parallel        55.19      36.75
KNLMeansCL           parreq          54.42      36.24
Degrain3             parallel        43.89      29.22
Analyse              parallel        40.95      27.27
Analyse              parallel        40.59      27.03
Analyse              parallel        39.86      26.55
Analyse              parallel        39.49      26.30
Analyse              parallel        38.80      25.83
Analyse              parallel        38.39      25.56
Bicubic              parallel        36.30      24.17
Super                parallel        23.90      15.92
Bicubic at the top of the list!? That's the Bicubic that converts YUV420P8 to RGB48. Why does it crawl like that?

The strange thing is that Script #2 has only a single such resize, whereas Script #1 has two of them near the top.
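
If anyone wants to compare such listings more systematically, they are easy to parse. The sketch below is plain Python and assumes each row is "name, mode, percent, seconds" as in the listings above (the column meaning is an assumption based on how the numbers line up). The helper names are made up. Note that summing seconds across instances of the same filter is only a rough view, since filters run in parallel.

```python
# Parse a vspipe -p style listing and total the time per filter name.
# Assumes rows of: name, mode, percent, seconds (an assumption; helper names are illustrative).

SAMPLE = """\
Bicubic              parallel        91.75      61.10
BM3D                 parallel        67.49      44.94
bitdepth             parallel        55.19      36.75
KNLMeansCL           parreq          54.42      36.24
Bicubic              parallel        36.30      24.17
"""


def parse_profile(text):
    """Return a list of (name, mode, percent, seconds) tuples, skipping other lines."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 4:
            continue
        name, mode, percent, seconds = parts
        try:
            rows.append((name, mode, float(percent), float(seconds)))
        except ValueError:
            continue  # header or unrelated line
    return rows


def seconds_by_filter(rows):
    """Sum the seconds column per filter name, largest total first."""
    totals = {}
    for name, _mode, _percent, seconds in rows:
        totals[name] = totals.get(name, 0.0) + seconds
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


if __name__ == "__main__":
    for name, total in seconds_by_filter(parse_profile(SAMPLE)).items():
        print(f"{name:<12} {total:7.2f} s")
```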

Does anyone know what's wrong?

In terms of quality, I'm unable to detect a difference. Both process BM3D in OPP colorspace and KNLMeans in YCgCoR. MVTools running in YUV isn't having much impact on the output.

EDIT: I tested with a 720p video. Script 1 goes at 14.6fps with 3.1GB memory usage, Script 2 goes at 14.0fps with 2.8GB memory usage. That's more what I would expect.

Thus, the problem is a VapourSynth performance problem with high-res videos. I reported it here.

Last edited by MysteryX; 24th October 2021 at 00:40.
Old 24th October 2021, 06:06   #45  |  Link
MysteryX
Soul Architect
 
 
Join Date: Apr 2014
Posts: 2,559
Released beta 6

Added feisty2's ChromaReconstructor_faster and nnedi3 chroma upscaling options to script 2. Set chroma = none|bicubic|nnedi3|reconstructor. nnedi3 has very little impact on performance, but reconstructor is very heavy, even though it's already a faster variation of the original script.

Unfortunately that 2nd script isn't very usable on 4K+ content until the performance bug is fixed in VapourSynth, but works perfectly fine for SD.

Last edited by MysteryX; 24th October 2021 at 18:05.
Old 27th October 2021, 22:42   #46  |  Link
MysteryX
Soul Architect
 
 
Join Date: Apr 2014
Posts: 2,559
Released beta 7.

A few tweaks, and it now supports RGB source formats. For 4K+ sources, it seems to work better in VapourSynth r55 than in r57 in certain cases.

It looks stable now. Let me know if you encounter any issue.

Last edited by MysteryX; 27th October 2021 at 22:48.
Old 6th November 2021, 08:42   #47  |  Link
PatchWorKs
Registered User
 
 
Join Date: Aug 2002
Location: Italy
Posts: 304
Hi there, very nice work !

About optimizations: I fear that a port to a faster language (such as C / C++ or, even better, ASM) is mandatory, but a colab version like this would also be nice.
Old 6th November 2021, 14:17   #48  |  Link
Quadratic
Registered User
 
Join Date: Jul 2021
Posts: 26
Quote:
Originally Posted by PatchWorKs
About optimizations: I fear that a port to a faster language (such as C / C++ or, even better, ASM) is mandatory, but a colab version like this would also be nice.
I'm not sure that would bring much benefit at all, since all the heavy processing is being done by plugins which are already written in the usual suspects (C / C++ / Rust).
Old 7th November 2021, 04:28   #49  |  Link
MysteryX
Soul Architect
 
 
Join Date: Apr 2014
Posts: 2,559
Yes all the hard work is done by plugins; BM3D, MVTools and KNLMeans being most of the processing. They are already optimized; MVTools is probably the most poorly optimized old code.

Btw, did you know it can be faster to write a filter with Expr than to write it in C++? Expr gets compiled to SIMD code automatically, whereas C++ code needs hand-written intrinsics or assembly (or a cooperative compiler) to not be slow.
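
For anyone who hasn't used it: std.Expr takes a reverse-Polish expression in which x, y, z name the pixel values of the input clips, e.g. core.std.Expr([clip_a, clip_b], "x y + 2 /") averages two clips. The toy evaluator below shows what such an expression means for a single pixel. It is plain Python for illustration only, supports just the four basic operators, and has nothing to do with how VapourSynth actually compiles expressions.

```python
# Toy single-pixel evaluator for Expr-style reverse-Polish strings (illustration only).
import operator

_OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(expr, **pixels):
    """Evaluate an RPN expression; keyword arguments supply pixel values such as x and y."""
    stack = []
    for token in expr.split():
        if token in _OPS:
            right = stack.pop()
            left = stack.pop()
            stack.append(_OPS[token](left, right))
        elif token in pixels:
            stack.append(float(pixels[token]))
        else:
            stack.append(float(token))  # numeric constant
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


if __name__ == "__main__":
    # Average of two pixel values, the same expression as in the Expr call above.
    print(eval_rpn("x y + 2 /", x=100, y=50))
```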
Old 10th November 2021, 18:22   #50  |  Link
MysteryX
Soul Architect
 
 
Join Date: Apr 2014
Posts: 2,559
Beta 8: now uses FMTC for resizing and resampling, for better performance.
Old 20th November 2021, 09:14   #51  |  Link
PatchWorKs
Registered User
 
 
Join Date: Aug 2002
Location: Italy
Posts: 304
@Selur just tested HINet. How does it perform (in terms of both speed and fidelity) compared to xClean?

The authors claim:
Quote:
With the help of HIN Block, HINet surpasses the state-of-the-art (SOTA) on various image restoration tasks. For image denoising, we exceed it 0.11dB and 0.28 dB in PSNR on SIDD dataset, with only 7.5% and 30% of its multiplier-accumulator operations (MACs), 6.8 times and 2.9 times speedup respectively. For image deblurring, we get comparable performance with 22.5% of its MACs and 3.3 times speedup on REDS and GoPro datasets. For image deraining, we exceed it by 0.3 dB in PSNR on the average result of multiple datasets with 1.4 times speedup.
Quote:
Originally Posted by Selur
Has anyone tested https://github.com/HolyWu/vs-hinet ?

Here are a few screen shots: (not sure what to make of them and for what content this is really useful)

Mode: Deblur GoPro

Mode: Deblur REDS

Mode: denoise

Mode: derain

Last edited by PatchWorKs; 20th November 2021 at 09:32.
Old 17th January 2022, 00:19   #52  |  Link
Reclusive Eagle
Registered User
 
Join Date: Oct 2021
Posts: 83
Bit of an older post, but why not use DFTTest as a reference for BM3D?

I also find that pre-sharpening the clip before creating the reference preserves more detail.
This combination preserves all but the most minute details in heavily noisy clips.
I'm talking about a loss of less than 0.5-1%, and only in details that were already the faintest.

For anything that isn't compressed SD DVD grain, you will preserve all detail this way.



In fact, after testing this I would recommend:

1: Double the resolution of your clip and then sharpen it. (This reduces the effect of sharpening the noise.)
2: Denoise with DFTTest
3: Use the DFTTest result as the reference for BM3D
4: Use the first BM3D result as the reference for a second BM3D
5: Downscale to the original resolution

This will preserve a massive amount of detail, if not all of it, and because it relies on BM3D CUDA it renders at 2 fps or more.
So roughly 2 hours for 25 minutes of footage. If you have a better GPU than my 2014 GTX 750 Ti you will get dramatically faster results.
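
To put such numbers in perspective, render time is simply frame count divided by render speed. The snippet below is a small sketch of that arithmetic, assuming 23.976 fps output after IVTC (an assumption; the thread doesn't state the frame rate). Under that assumption, 25 minutes takes a little under two hours at the 5.5 fps quoted further down, and close to five hours at 2 fps.

```python
# Rough render-time estimate: frames to process divided by render speed.
# The 23.976 fps source rate is an assumption for IVTC'd DVD material.

def render_hours(minutes_of_footage, source_fps, render_fps):
    """Return the estimated render time in hours."""
    frames = minutes_of_footage * 60 * source_fps
    return frames / render_fps / 3600


if __name__ == "__main__":
    film_fps = 24000 / 1001
    for speed in (2.0, 5.5):
        print(f"{speed} fps -> {render_hours(25, film_fps, speed):.1f} h")
```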

Here is an example of a very noisy DVD IVTC:

Here is the original image after IVTC:
https://i.slow.pics/4c69Dt8B.png

Here is the same image after denoising:
https://i.slow.pics/jNtO6qyC.png

Even the clouds keep all their detail, which is extremely hard to preserve in older anime backgrounds when denoising.

To achieve this I:
1: Pre-sharpened with LSFmod
2: Denoised with DFTTest
3: Denoised with BM3D using the DFTTest result as a reference
4: Denoised with a second BM3D using the first BM3D as a reference.



Here is the code I used, if you or anyone else would like to replicate it.
Just remember that this DFTTest nlocation is custom-made for this DVD to preserve extremely small details. It won't work on all clips.

Quote:
import vapoursynth as vs
import havsfunc as haf
core = vs.core

# clip = the source clip after IVTC (loading not shown)

# Upscale
clip = core.resize.Lanczos(clip, 1440, 1080)

# Pre-sharpen clip
clip = haf.LSFmod(clip, preblur=3, strength=110, Smode=1, Smethod=3, soothe=True, edgemaskHQ=True, secure=True, soft=10)
clip = core.fmtc.bitdepth(clip, bits=32)

# 3 Pass Denoise
ref = core.dfttest.DFTTest(clip, ftype=1, tbsize=3, nlocation=[35,0,0,10, 28,0,83,87, 95,0,0,100], sigma=10.0, sst=[0.0,2.0, 1.0,20.0], ssystem=1, opt=3)
ref2 = core.bm3dcuda.BM3D(clip, ref=ref, sigma=[25,25,25], block_step=3)
clip = core.bm3dcuda.BM3D(clip, ref=ref2, sigma=[17,17,17], block_step=3)

clip = core.fmtc.bitdepth(clip, bits=8)
clip.set_output()
Notice how each denoising pass decreases in intensity. Also, for BM3D a block_step of 1 will reduce performance by 80% or more.
Having it at 3 is the difference between rendering at 5.5fps vs 0.6fps, and there is ZERO difference in quality (in this case).
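
A rough way to see why block_step matters so much: if reference blocks are placed on a regular grid with a stride of block_step, the number of blocks to process drops with the square of the step. The sketch below counts grid positions for a 1440x1080 frame, assuming 8x8 blocks and ignoring edge handling (both are simplifying assumptions, not taken from the plugin source). Going from step 1 to step 3 gives roughly 9x fewer blocks, which lines up with 5.5fps vs 0.6fps.

```python
# Count reference-block positions on a regular grid (simplified model, 8x8 blocks assumed).

def reference_positions(width, height, block_step, block_size=8):
    """Number of top-left block positions when stepping by block_step in both directions."""
    across = (width - block_size) // block_step + 1
    down = (height - block_size) // block_step + 1
    return across * down


if __name__ == "__main__":
    dense = reference_positions(1440, 1080, block_step=1)
    sparse = reference_positions(1440, 1080, block_step=3)
    print(f"step 1: {dense} blocks, step 3: {sparse} blocks, ratio {dense / sparse:.1f}x")
```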

Last edited by Reclusive Eagle; 17th January 2022 at 01:22.
Old 17th January 2022, 01:22   #53  |  Link
Reclusive Eagle
Registered User
 
Join Date: Oct 2021
Posts: 83
In fact, at this point the denoising with BM3D is so good that, if you want, you can just keep stacking BM3D references at decreasing intensity.
You can have 8-pass noise reduction with 1 DFTTest and 7 BM3D passes at lightning speed compared to KNLMeans.
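
The stacking idea can be written as a small loop: each pass denoises the same sharpened clip, using the previous result as its reference and a lower sigma. The helper below is a generic sketch; stack_references and the denoise callback are made-up names, and the callback would wrap whatever BM3D call you use.

```python
# Chain denoise passes so that each result becomes the reference for the next pass.
# stack_references and the denoise callback are illustrative names, not part of any plugin.

def stack_references(clip, first_ref, sigmas, denoise):
    """Run one pass per sigma; denoise(clip, ref, sigma_list) must return the new reference."""
    ref = first_ref
    for sigma in sigmas:
        ref = denoise(clip, ref, [sigma] * 3)
    return ref


# Possible use with the calls from the script above (not executed here):
#   bm3d = lambda c, ref, sigma: core.bm3dcuda.BM3D(c, ref=ref, sigma=sigma, block_step=3)
#   clip = stack_references(clip, ref, [25, 17, 13], bm3d)
```

The sigma list controls both the number of passes and how quickly the strength tapers off.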

And don't be afraid to add more sharpness after every few stacks.
For example, I took the above code and stacked it 4 times, this time adding more sharpening to the original clip. Because the references were so good,
I retained more detail, increased overall sharpness and gained zero new noise compared to the above example, with this code:

Quote:
import vapoursynth as vs
import havsfunc as haf
core = vs.core

# clip = the source clip after IVTC (loading not shown)

# Upscale
clip = core.resize.Lanczos(clip, 1440, 1080)

# Pre-sharpen
clip = haf.LSFmod(clip, preblur=3, strength=110, Smode=1, Smethod=3, soothe=True, edgemaskHQ=True, secure=True, soft=10)
clip = core.fmtc.bitdepth(clip, bits=32)

# 3 Pass denoise
ref = core.dfttest.DFTTest(clip, ftype=1, tbsize=3, nlocation=[35,0,0,10, 28,0,83,87, 95,0,0,100], sigma=10.0, sst=[0.0,2.0, 1.0,20.0], ssystem=1, opt=3)
ref2 = core.bm3dcuda.BM3D(clip, ref=ref, sigma=[25,25,25], block_step=3)
ref3 = core.bm3dcuda.BM3D(clip, ref=ref2, sigma=[17,17,17], block_step=3)

# Post reference sharpen
clip = haf.LSFmod(clip, preblur=3, strength=40, Smode=1, Smethod=3, soothe=True, edgemaskHQ=True, secure=True, soft=10)

# Final 4th pass denoise
clip = core.bm3dcuda.BM3D(clip, ref=ref3, sigma=[13,13,13], block_step=3)

clip = core.fmtc.bitdepth(clip, bits=8)
clip.set_output()
The result is, again, increased sharpness from the post-reference sharpen, but with zero increase in overall noise after the final 4th denoise pass.
Obviously you can see sharpening halos, but those can be entirely fixed by masking.

You can find a really detailed masking tutorial on kageru's blog:
https://blog.kageru.moe/legacy/edgemasks.html

and you can fix sharpening halos directly with this tutorial also by Kageru:
https://guide.encode.moe/encoding/ma...iting-etc.html

Btw, with this setup, including NNEDI3, IVTC, upscaling and 4-pass denoising,
I am rendering at 2.8fps to ProRes 1440x1080 with an i5 9600K and a GTX 750 Ti.

Just as a baseline for anyone who wants to compare future settings.


I also recommend TBilateral after your pre-sharpen but before your denoise references, or
apply it to your final reference before the final denoise (same effect either way).

This will give you an insanely clean image.

Last edited by Reclusive Eagle; 19th January 2022 at 03:44.
Old 17th January 2022, 06:21   #54  |  Link
kedautinh12
Registered User
 
Join Date: Jan 2018
Posts: 2,156
Wow, it's great
Old 23rd January 2022, 10:12   #55  |  Link
MysteryX
Soul Architect
 
 
Join Date: Apr 2014
Posts: 2,559
Reclusive Eagle, your reference clip is an anime while mine is raw camera footage -- great method, but completely different use-case.

It gives a nice polishing effect on your anime; but that will most likely give the plastic effect on other videos that I've been trying to avoid.
Old 19th November 2022, 01:05   #57  |  Link
WarnerBrah
DPX -> HEVC@Veryslow
 
 
Join Date: Oct 2022
Posts: 6
Script Error
Script error: There is no function named 'nmod'.
TransformsPack - Main.avsi, line 456
xClean.avsi, line 232

any help?
Old 19th November 2022, 01:46   #58  |  Link
kedautinh12
Registered User
 
Join Date: Jan 2018
Posts: 2,156
You're missing this script:
https://github.com/Dogway/Avisynth-S...izersPack.avsi
Old 19th November 2022, 21:11   #59  |  Link
WarnerBrah
DPX -> HEVC@Veryslow
 
 
Join Date: Oct 2022
Posts: 6
nnedi3_weights.bin is missing now
Old 19th November 2022, 21:21   #60  |  Link
kedautinh12
Registered User
 
Join Date: Jan 2018
Posts: 2,156
You're missing NNEDI3CL. Download the "extra" package from release 1.0.3 to get the .bin file, and put it in the same folder as the .dll file from 1.0.5.
https://github.com/Asd-g/AviSynthPlus-NNEDI3CL/releases

Last edited by kedautinh12; 19th November 2022 at 21:26.