Old 2nd November 2018, 10:13   #61  |  Link
ErazorTT
Registered User
 
Join Date: Mar 2003
Location: Germany
Posts: 215
Were you using both fftthreads=4 and Prefetch(2) at the same time? I thought those were mutually exclusive, so I would have guessed that using only Prefetch, which could then be set higher, would perform better. What is your experience with that?

Additionally I really would not set degrainTR higher than FPS/8, thus for the usual cinema film not higher than 3. There can be some very rare, hard to find but ugly effects! I could show an example when I have time to prepare the pictures later today. Thus I would decrease degrainTR to 3 or even 2 and instead increase postSigma to 2 or 3.
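
As a rough illustration of the FPS/8 rule of thumb, here is a minimal Python sketch. The function name max_degrain_tr is made up for this example and is not part of TemporalDegrain2; it simply rounds FPS/8 down to an integer, which is one strict way to read "not higher than FPS/8".

```python
import math


def max_degrain_tr(fps: float) -> int:
    """Largest integer degrainTR that does not exceed fps / 8 (never below 1).

    Hypothetical helper for illustration only; rounding down is an assumption.
    """
    return max(1, math.floor(fps / 8))


for fps in (24, 25, 50, 60):
    print(f"{fps} fps -> degrainTR <= {max_degrain_tr(fps)}")
```

Note that with strict rounding down, 23.976 fps material lands at 2 rather than 3, so treat the result as a starting point and not as a hard limit.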

Well I have never used Vapoursynth, so this is not something which will happen immediately..

Last edited by ErazorTT; 2nd November 2018 at 10:25.
Old 2nd November 2018, 10:26   #62  |  Link
ChaosKing
Registered User
 
Join Date: Dec 2005
Location: Germany
Posts: 1,795
Quote:
Originally Posted by ErazorTT View Post
Were you using both fftthreads=4 and Prefetch(2) at the same time? I thought those were mutually exclusive.
Yeah, it seems mixing both is not a good idea. You can see the negative effects even more with higher values. Prefetch(8) alone now runs at 7.2 fps. If I add fftthreads (tried 2 and 8) it slows down to 3.5 fps regardless of which fftthreads value I set.

If you want to give VapourSynth a try you can use my portable pack (all filters, scripts and editors included). https://forum.doom9.org/showthread.php?t=175529

EDIT
For some reason I set degrainTR=6 while testing with fftthreads...
Prefetch(8) + fftthreads=8 => 9.2 fps, so yes, mixing them is OK

Code:
TemporalDegrain2(degrainTR=4, postFFT=3, fftthreads=8) # 9.2fps
Prefetch(8)

TemporalDegrain2(degrainTR=3, postFFT=3, fftthreads=8) # 12.5fps
TemporalDegrain2(degrainTR=2, postFFT=3, fftthreads=8) # 17.3fps

TemporalDegrain2(degrainTR=3, postFFT=3) # 10.1fps
TemporalDegrain2(degrainTR=2, postFFT=3) # 14.0fps
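
To put the numbers above side by side, here is a small Python sketch that computes the relative gain from fftthreads=8. It assumes that all four timings for degrainTR=3 and degrainTR=2 were taken with Prefetch(8), which the list does not state explicitly.

```python
# fps figures copied from the list above; the Prefetch(8) setting is assumed
# to apply to every line.
with_fftthreads = {3: 12.5, 2: 17.3}     # degrainTR -> fps with fftthreads=8
without_fftthreads = {3: 10.1, 2: 14.0}  # degrainTR -> fps without fftthreads

# Relative speedup of the fftthreads=8 run over the plain run.
gains = {
    tr: with_fftthreads[tr] / without_fftthreads[tr]
    for tr in with_fftthreads
}

for tr, gain in sorted(gains.items()):
    print(f"degrainTR={tr}: {gain:.2f}x")
```

Both settings come out at roughly the same ratio, so on this machine the fftthreads gain does not seem to depend much on degrainTR.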
__________________
AVSRepoGUI // VSRepoGUI - Package Manager for AviSynth // VapourSynth
VapourSynth Portable FATPACK || VapourSynth Database

Last edited by ChaosKing; 2nd November 2018 at 10:45.
Old 2nd November 2018, 11:04   #63  |  Link
tormento
Acid fr0g
 
Join Date: May 2002
Location: Italy
Posts: 2,582
Quote:
Originally Posted by ErazorTT View Post
@tormento: Could you please also test using postFFT=3 with your Prefetch of 6?
Results on 1920x1032 input:

TemporalDegrain2(degrainTR=4,postFFT=0,postSigma=3,postDither=0) 0.69 (1.56*) fps, 3791.50 kb/s
TemporalDegrain2(degrainTR=4,postFFT=3,postSigma=3,postDither=0) 0.45 (0.85*) fps, 3060.34 kb/s
TemporalDegrain2(degrainTR=4,postFFT=4,postSigma=3,postDither=0) 0.64 (0.98*) fps, 3331.73 kb/s

(* Prefetch=6)

And, as comparison:

SMDegrain (tr=4, thSAD=400, refinemotion=false, n16=true, mode=0, contrasharp=false, PreFilter=4, truemotion=false, plane=4, chroma=true) 9.34 fps, 4083.54 kb/s

I think the problem does not lie on the postFFT side. As you can see, the delta is negligible with postFFT=0. Please note that every output is then encoded with the x264 slow preset.
Quote:
Originally Posted by ErazorTT View Post
And could it be that you also have an internal Intel graphics card apart from your actual GPU? In that case I would guess that KNLMeans perhaps runs on the wrong GPU.
My CPU is a Sandy Bridge and does not support OpenCL. Plus the slowness would appear on SMDegrain too.
Quote:
Originally Posted by ErazorTT View Post
Could you give it another using KNLMeans with this version?
All the new trials were made with your newer script.
__________________
@turment on Telegram

Last edited by tormento; 2nd November 2018 at 16:55. Reason: Misplaced values TBD
Old 2nd November 2018, 11:34   #64  |  Link
ChaosKing
Registered User
 
Join Date: Dec 2005
Location: Germany
Posts: 1,795
I noticed that your script produces some halos:

200% zoom
https://i.imgur.com/IK83WEL.png

I used TemporalDegrain2(degrainTR=3, postFFT=3, fftthreads=8). It's still visible with postFFT=0 but slightly less.

Last edited by ChaosKing; 2nd November 2018 at 11:45.
Old 2nd November 2018, 12:17   #65  |  Link
ErazorTT
Registered User
 
Join Date: Mar 2003
Location: Germany
Posts: 215
Quote:
Originally Posted by tormento View Post
TemporalDegrain2(degrainTR=4,postFFT=0,postSigma=3,postDither=0) 0.86 fps 3060 kb/s
TemporalDegrain2(degrainTR=4,postFFT=3,postSigma=3,postDither=0) 0.95 fps 3371 kb/s
TemporalDegrain2(degrainTR=4,postFFT=4,postSigma=3,postDither=0) 1.58 fps 3791 kb/s
This is ultra weird! How can it be faster to have postFFT enabled than to have it disabled?
Were you running with Prefetch(x)? What do you get when you increase your Prefetch(x) setting from 1 to whatever you usually use?
Old 2nd November 2018, 12:25   #66  |  Link
ErazorTT
Registered User
 
Join Date: Mar 2003
Location: Germany
Posts: 215
Quote:
Originally Posted by ChaosKing View Post
I noticed that your script produces some halos:
200% zoom
https://i.imgur.com/IK83WEL.png
Is the original on the right? Because I definitely see the same structure there. It seems it is just visually hidden behind the noise. Once the noise is removed, the structure is revealed.

Last edited by ErazorTT; 2nd November 2018 at 16:50.
Old 2nd November 2018, 12:47   #67  |  Link
tormento
Acid fr0g
 
Join Date: May 2002
Location: Italy
Posts: 2,582
Quote:
Originally Posted by ErazorTT View Post
This is ultra weird!
I will rerun again to have confirmation.

Last edited by tormento; 2nd November 2018 at 16:56.
Old 2nd November 2018, 16:56   #68  |  Link
tormento
Acid fr0g
 
Join Date: May 2002
Location: Italy
Posts: 2,582
Quote:
Originally Posted by ErazorTT View Post
This is ultra weird!
Post fixed.
Old 2nd November 2018, 23:20   #69  |  Link
zorr
Registered User
 
Join Date: Mar 2018
Posts: 447
AvisynthOptimizer could be used to test the speed of different Prefetch and SetFilterMTMode settings.

Quote:
Originally Posted by ErazorTT View Post
Additionally I really would not set degrainTR higher than FPS/8, thus for the usual cinema film not higher than 3. There can be some very rare, hard to find but ugly effects!
Also by comparing the degrained clip to the original with SSIM it should be easy to spot any anomalies in the rendering. The script could return the worst SSIM of any frame. It will probably need a long clip to find those rare errors though, but it should be possible. If anyone is interested I can help make the script ready for the optimizer.
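
To make the "worst SSIM of any frame" idea concrete, here is a minimal Python sketch. It is not the AviSynth SSIM plugin and not the optimizer's own code: global_ssim and worst_frame are hypothetical names, frames are plain lists of 8-bit pixel values, and the score is computed over the whole frame at once, whereas real SSIM implementations average over local windows.

```python
from statistics import fmean

# Standard SSIM stabilising constants for 8-bit data (K1=0.01, K2=0.03, L=255).
C1 = (0.01 * 255) ** 2
C2 = (0.03 * 255) ** 2


def global_ssim(x, y):
    """Single-window SSIM over two equally sized lists of pixel values."""
    mx, my = fmean(x), fmean(y)
    var_x = fmean((a - mx) * (a - mx) for a in x)
    var_y = fmean((b - my) * (b - my) for b in y)
    cov = fmean((a - mx) * (b - my) for a, b in zip(x, y))
    return ((2 * mx * my + C1) * (2 * cov + C2)) / (
        (mx * mx + my * my + C1) * (var_x + var_y + C2)
    )


def worst_frame(frames):
    """Return (index, score) of the frame pair with the lowest SSIM."""
    scores = [global_ssim(ref, test) for ref, test in frames]
    index = min(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


# Toy data: frame 1 is deliberately broken (pixel order reversed).
original = [10, 50, 90, 130, 170, 210]
frames = [
    (original, list(original)),
    (original, original[::-1]),
    (original, [p + 1 for p in original]),
]
index, score = worst_frame(frames)
print(f"worst frame: {index} (SSIM {score:.4f})")
```

Returning the minimum instead of the average is the point here: a single badly rendered frame barely moves the mean over a long clip, but it determines the minimum.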
Old 3rd November 2018, 11:33   #70  |  Link
ErazorTT
Registered User
 
Join Date: Mar 2003
Location: Germany
Posts: 215
Quote:
Originally Posted by zorr View Post
AvisynthOptimizer could be used to test the speed of different Prefetch and SetFilterMTMode settings.
That would be great. However, as much as I would love to do this test myself, I guess my crappy dual-core laptop is not powerful enough to do this analysis.

Quote:
Originally Posted by zorr View Post
Also by comparing the degrained clip to the original with SSIM...
Yes, something along the lines of this: http://forum.doom9.org/showthread.ph...726#post861726

Quote:
Originally Posted by zorr View Post
It will probably need a long clip to find those rare errors though...
The suggestion by StainlessS is probably a good starting point:
http://forum.doom9.org/showthread.ph...48#post1856048

Quote:
Originally Posted by zorr View Post
If anyone is interested I can help make the script ready for the optimizer.
Yeah, that would be interesting to see! (Who knows, perhaps this would trigger my urge to get a decent standalone PC again.)

Last edited by ErazorTT; 3rd November 2018 at 11:42.
Old 3rd November 2018, 12:14   #71  |  Link
tormento
Acid fr0g
 
Join Date: May 2002
Location: Italy
Posts: 2,582
Quote:
Originally Posted by ErazorTT View Post
That would be great. However, as much as I would love to do this test myself, I guess my crappy dual-core laptop is not powerful enough to do this analysis.
My two cents: I don't want to discourage you, but I think that the script, as good as it is, is simply too slow to gain anything from tweaks on the Prefetch side. I have looked at the CPU usage, both single-threaded and MT, and there is a lot of time where it plainly does nothing. With Prefetch(6) on 8 logical cores, I get an average of 30% with peaks of 50%.
You (we) should analyze every single routine inside and find where the cycles are wasted.

And, since x264 Simple Launcher was created, it's the first time I see it telling me the script is not responding, to give you an idea.

Please, if you have time, try to create a simpler one, using only OpenCL filtering (KNLMeansCL in my mind) + MVTools. Probably the results will be crappier but we could have an idea of theoretical speed.

Last edited by tormento; 3rd November 2018 at 12:17.
Old 3rd November 2018, 12:26   #72  |  Link
ChaosKing
Registered User
 
Join Date: Dec 2005
Location: Germany
Posts: 1,795
I offer to run the Optimizer tests on my 8 core Ryzen.
Old 3rd November 2018, 16:58   #73  |  Link
ErazorTT
Registered User
 
Join Date: Mar 2003
Location: Germany
Posts: 215
Well, considering that the results from ChaosKing are more than an order of magnitude faster than yours, and yours are exactly as fast as both TemporalDegrain v1 and v2 on my crappy laptop, I would guess that something is wrong on your end. I don't know what Simple Launcher is supposed to do, but apparently something goes wrong there.
Considering that you have a Sandy Bridge, this means that you have 4 cores. Perhaps it would be an idea to not set Prefetch higher than 3 and leave the 4th core to x264. And also try to run once with postFFT=1.

Last edited by ErazorTT; 3rd November 2018 at 17:04.
Old 3rd November 2018, 17:15   #74  |  Link
ChaosKing
Registered User
 
Join Date: Dec 2005
Location: Germany
Posts: 1,795
My CPU utilization was still at 50-60% max with Prefetch(8). But the big difference could be this: I have an RX480 GPU, which means KNLMeansCL is not that slow on my system.
tormento, what fps do you get with KNLMeansCL alone?
Old 3rd November 2018, 21:28   #75  |  Link
ChaosKing
Registered User
 
Join Date: Dec 2005
Location: Germany
Posts: 1,795
Here is a quick test for:
optimize degrainTR = _n_ | 1..4 | degrainTR
optimize postFFT = _n_ | 0,1,3 | postFFT

I used AddGrain(80, 0, 0, seed=2) on a clean anime 720p source
But after iter 16 I got into an endless DUPLICATE PARAMS loop.
Code:
225.84503 33630 degrainTR=1 postFFT=0 
241.66373 76180 degrainTR=4 postFFT=0 
239.99747 60510 degrainTR=3 postFFT=0 
239.99747 61410 degrainTR=3 postFFT=0 
239.99747 62140 degrainTR=3 postFFT=0 
244.31024 16655 degrainTR=3 postFFT=3 
244.09448 95600 degrainTR=4 postFFT=1 
244.91441 18072 degrainTR=4 postFFT=3 
244.09448 95270 degrainTR=4 postFFT=1 
236.55763 48110 degrainTR=2 postFFT=0 
225.84503 34390 degrainTR=1 postFFT=0 
243.23091 80750 degrainTR=3 postFFT=1 
244.91440 18117 degrainTR=4 postFFT=3 
242.97444 15382 degrainTR=2 postFFT=3 
241.66373 75960 degrainTR=4 postFFT=0 
232.85495 53540 degrainTR=1 postFFT=1
Old 3rd November 2018, 22:28   #76  |  Link
ErazorTT
Registered User
 
Join Date: Mar 2003
Location: Germany
Posts: 215
Ok that is interesting! Perhaps you could also include a sweep over the postSigma = _n_ | 1,2,3,4 | . Since postFFT=0 is apparently always worse you could exclude it to save some time.

Last edited by ErazorTT; 3rd November 2018 at 22:34.
Old 3rd November 2018, 22:36   #77  |  Link
ErazorTT
Registered User
 
Join Date: Mar 2003
Location: Germany
Posts: 215
@tormento: Could you please try again with this version here? I included knlDevId, so perhaps changing it from 0 to 1 will help.
Old 3rd November 2018, 22:36   #78  |  Link
zorr
Registered User
 
Join Date: Mar 2018
Posts: 447
Quote:
Originally Posted by ChaosKing View Post
Here is a quick test for:
optimize degrainTR = _n_ | 1..4 | degrainTR
optimize postFFT = _n_ | 0,1,3 | postFFT

I used AddGrain(80, 0, 0, seed=2) on a clean anime 720p source
But after iter 16 I got into an endless DUPLICATE PARAMS loop.
Hmm, seems *almost* normal behavior. The DUPLICATE PARAMS happens when the optimizer is unable to find a parameter combination it hasn't already tried. I should probably add an ending condition there as well or make it smart enough to realize it has tried all combinations.

However what I don't quite understand is why it did that after 16 iterations, there are 4*3 possible combinations which is 12. And it already has some duplicates in your log...
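
The duplicate question can be checked mechanically. Below is a small Python sketch (not the optimizer's own code; parse_params is a name invented for this example) that takes the 16 log lines from the quoted run, collects the distinct parameter combinations, and compares them against the full 4*3 grid.

```python
from itertools import product

# The 16 result lines from the quoted run, copied verbatim.
LOG = """\
225.84503 33630 degrainTR=1 postFFT=0
241.66373 76180 degrainTR=4 postFFT=0
239.99747 60510 degrainTR=3 postFFT=0
239.99747 61410 degrainTR=3 postFFT=0
239.99747 62140 degrainTR=3 postFFT=0
244.31024 16655 degrainTR=3 postFFT=3
244.09448 95600 degrainTR=4 postFFT=1
244.91441 18072 degrainTR=4 postFFT=3
244.09448 95270 degrainTR=4 postFFT=1
236.55763 48110 degrainTR=2 postFFT=0
225.84503 34390 degrainTR=1 postFFT=0
243.23091 80750 degrainTR=3 postFFT=1
244.91440 18117 degrainTR=4 postFFT=3
242.97444 15382 degrainTR=2 postFFT=3
241.66373 75960 degrainTR=4 postFFT=0
232.85495 53540 degrainTR=1 postFFT=1
"""


def parse_params(line):
    """Return the name=value fields of one log line as ((name, int), ...)."""
    fields = line.split()[2:]  # skip the two leading numeric columns
    return tuple((name, int(value)) for name, value in (f.split("=") for f in fields))


runs = [parse_params(line) for line in LOG.splitlines() if line.strip()]
tried = set(runs)

# Full search space: degrainTR in 1..4, postFFT in {0, 1, 3}.
grid = {
    (("degrainTR", tr), ("postFFT", fft))
    for tr, fft in product(range(1, 5), (0, 1, 3))
}
missing = sorted(grid - tried)

print(f"{len(runs)} runs, {len(tried)} distinct, {len(grid)} possible")
print("never tried:", missing)
```

On this log it reports 16 runs but only 10 distinct combinations out of 12; degrainTR=1 with postFFT=3 and degrainTR=2 with postFFT=1 never appear.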
Old 3rd November 2018, 23:24   #79  |  Link
ChaosKing
Registered User
 
Join Date: Dec 2005
Location: Germany
Posts: 1,795
Code:
degrainTR = 4					# optimize degrainTR = _n_ | 1..4 | degrainTR
postFFT = 1					# optimize postFFT = _n_ | 0,1,3 | postFFT
postSigma = 1					# optimize postSigma = _n_ | 1..4 | postSigma
Now with Prefetch(8), lower res (640x360) and trim to 150 frames.
Is it normal that the graph doesn't show lower values like 135?

Code:
145.50601 9670 degrainTR=1 postFFT=1 postSigma=3 
145.67668 25930 degrainTR=2 postFFT=3 postSigma=1 
142.8922 17800 degrainTR=3 postFFT=0 postSigma=1 
147.25772 20230 degrainTR=3 postFFT=1 postSigma=3 
142.8922 17830 degrainTR=3 postFFT=0 postSigma=2 
142.8922 17730 degrainTR=3 postFFT=0 postSigma=1 
135.96149 7380 degrainTR=1 postFFT=0 postSigma=1 
145.5126 20260 degrainTR=3 postFFT=1 postSigma=1 
142.8922 17950 degrainTR=3 postFFT=0 postSigma=2 
135.96149 7350 degrainTR=1 postFFT=0 postSigma=1 
142.8922 17950 degrainTR=3 postFFT=0 postSigma=1 
144.04181 14660 degrainTR=2 postFFT=1 postSigma=1 
143.90645 22950 degrainTR=4 postFFT=0 postSigma=2 
143.90645 23380 degrainTR=4 postFFT=0 postSigma=1 
140.8963 12090 degrainTR=2 postFFT=0 postSigma=2 
147.57607 31830 degrainTR=3 postFFT=3 postSigma=3 
145.67668 25760 degrainTR=2 postFFT=3 postSigma=1 
147.12352 36800 degrainTR=4 postFFT=3 postSigma=1 
140.8963 12100 degrainTR=2 postFFT=0 postSigma=3 
140.8963 12100 degrainTR=2 postFFT=0 postSigma=1 
145.50601 9660 degrainTR=1 postFFT=1 postSigma=3 
139.70255 9660 degrainTR=1 postFFT=1 postSigma=1 
147.17488 25320 degrainTR=4 postFFT=1 postSigma=2 
143.70836 9700 degrainTR=1 postFFT=1 postSigma=2 
146.21446 20850 degrainTR=1 postFFT=3 postSigma=4 
146.8522 14590 degrainTR=2 postFFT=1 postSigma=3 
135.96149 7390 degrainTR=1 postFFT=0 postSigma=2 
135.96149 7380 degrainTR=1 postFFT=0 postSigma=3 
147.3242 31640 degrainTR=3 postFFT=3 postSigma=2 
142.22742 21040 degrainTR=1 postFFT=3 postSigma=1 
147.40378 20430 degrainTR=3 postFFT=1 postSigma=4 
142.8922 17960 degrainTR=3 postFFT=0 postSigma=4 
147.4705 25270 degrainTR=4 postFFT=1 postSigma=3 
147.70525 32060 degrainTR=3 postFFT=3 postSigma=4 
146.24329 9590 degrainTR=1 postFFT=1 postSigma=4 
147.18077 25890 degrainTR=2 postFFT=3 postSigma=3 
147.1184 14480 degrainTR=2 postFFT=1 postSigma=4 
144.46893 21110 degrainTR=1 postFFT=3 postSigma=2 
145.5126 20240 degrainTR=3 postFFT=1 postSigma=1 
142.8922 17930 degrainTR=3 postFFT=0 postSigma=3 
146.74043 26020 degrainTR=2 postFFT=3 postSigma=2 
140.8963 12540 degrainTR=2 postFFT=0 postSigma=4 
147.25772 20260 degrainTR=3 postFFT=1 postSigma=3 
147.41321 26070 degrainTR=2 postFFT=3 postSigma=4 
143.90645 22770 degrainTR=4 postFFT=0 postSigma=4 
147.84923 36290 degrainTR=4 postFFT=3 postSigma=4 
135.96149 7400 degrainTR=1 postFFT=0 postSigma=4 
146.84929 20350 degrainTR=3 postFFT=1 postSigma=2 
146.16771 14490 degrainTR=2 postFFT=1 postSigma=2 
147.7625 36640 degrainTR=4 postFFT=3 postSigma=3 
147.56343 25300 degrainTR=4 postFFT=1 postSigma=4 
145.57582 20730 degrainTR=1 postFFT=3 postSigma=3 
146.18916 25550 degrainTR=4 postFFT=1 postSigma=1 
143.90645 23240 degrainTR=4 postFFT=0 postSigma=3 
147.58546 36390 degrainTR=4 postFFT=3 postSigma=2 
146.68575 31290 degrainTR=3 postFFT=3 postSigma=1



The output by evaluate mode is also a bit strange, worst result = best?
Code:
Single run results: 56
Series results: 0

Run 1 best: 147.84923 36290 degrainTR=4 postFFT=3 postSigma=4

Best result: 147.84923
Worst result: 147.84923

Pareto front:
  147.84923 36290 degrainTR=4 postFFT=3 postSigma=4
  147.70525 32060 degrainTR=3 postFFT=3 postSigma=4
  147.57607 31830 degrainTR=3 postFFT=3 postSigma=3
  147.56343 25300 degrainTR=4 postFFT=1 postSigma=4
  147.4705 25270 degrainTR=4 postFFT=1 postSigma=3
  147.40378 20430 degrainTR=3 postFFT=1 postSigma=4
  147.25772 20230 degrainTR=3 postFFT=1 postSigma=3
  147.1184 14480 degrainTR=2 postFFT=1 postSigma=4
  146.24329 9590 degrainTR=1 postFFT=1 postSigma=4
  135.96149 7350 degrainTR=1 postFFT=0 postSigma=1
__________________
AVSRepoGUI // VSRepoGUI - Package Manager for AviSynth // VapourSynth
VapourSynth Portable FATPACK || VapourSynth Database

Last edited by ChaosKing; 3rd November 2018 at 23:28.
Old 4th November 2018, 00:45   #80  |  Link
ErazorTT
Registered User
 
Join Date: Mar 2003
Location: Germany
Posts: 215
Judging by this measurement the optimum is also the maximal settings which were allowed:

147.84923 36290 degrainTR=4 postFFT=3 postSigma=4

It would appear that the maxima for degrainTR and postSigma were still too low in this run. However, the result should differ widely in different circumstances: for clips with a lot of fast motion, for example, I would guess that increasing degrainTR above some threshold will be counterproductive.
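
Since the best score is also the most expensive setting, the trade-off is easier to judge from the Pareto front than from the single best line. Here is a minimal Python sketch of how such a front can be computed. It assumes the first column is a quality score where higher is better and the second column is a cost (presumably runtime) where lower is better; it uses only a handful of lines copied from the log and is not the optimizer's actual implementation.

```python
from typing import List, Tuple

Result = Tuple[float, int, str]  # (score, cost, parameters)


def pareto_front(results: List[Result]) -> List[Result]:
    """Keep every result that no cheaper-or-equal result beats on score."""
    front: List[Result] = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; on equal cost, higher score first.
    for score, cost, params in sorted(results, key=lambda r: (r[1], -r[0])):
        if score > best_score:
            front.append((score, cost, params))
            best_score = score
    return front


# A subset of the results posted above.
results = [
    (147.84923, 36290, "degrainTR=4 postFFT=3 postSigma=4"),
    (147.7625, 36640, "degrainTR=4 postFFT=3 postSigma=3"),
    (147.56343, 25300, "degrainTR=4 postFFT=1 postSigma=4"),
    (142.8922, 17800, "degrainTR=3 postFFT=0 postSigma=1"),
    (147.1184, 14480, "degrainTR=2 postFFT=1 postSigma=4"),
    (146.24329, 9590, "degrainTR=1 postFFT=1 postSigma=4"),
    (145.50601, 9670, "degrainTR=1 postFFT=1 postSigma=3"),
    (135.96149, 7350, "degrainTR=1 postFFT=0 postSigma=1"),
]

front = pareto_front(results)
for score, cost, params in front:
    print(f"{score:10.5f} {cost:6d} {params}")
```

A result drops out as soon as something cheaper scores higher, which is why the postFFT=0 line at cost 17800 disappears here: degrainTR=2 with postFFT=1 and postSigma=4 is both cheaper and better.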