Need Suggestions for VERY GRAINY source [Archive] - Page 6

Sagekilla

15th February 2008, 22:50

Also, I just got my hands on a new Blu-ray drive, so I can finally rip and play with 300 in HD! I'll see if I can do a before-and-after comparison using 1080p frames to show how well it works, which should also help me test and improve it. If I have enough time, I might even encode it twice to show the bitrate reduction possible.

Edit: How odd, I tried changing the idx on the NR2 MVDegrain from idx=2 to idx=3 and wanna know what my results were? idx=2 (same idx as the last MVdegrain) was more detailed and looked better. It was only MARGINALLY better though. You have to look VERY closely to see if there was any change.

Results from no TD (first one) vs with TD
G:\Movies\300_HD>x264_aq --crf 18 --ref 5 --mixed-refs --no-fast-pskip --bframes 16 --bime --weightb --b-pyramid --b-rdo --8x8dct --subme 7 --me umh --trellis 1 --aq-strength 0.3 --threads auto --progress "source.avs" --output "Grainy.264" --pass 1 --stats "Grainy.log"
avis [info]: 1920x800 @ 23.98 fps (360 frames)
x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 SSE3 3DNow!
x264 [info]: slice I:2 Avg QP:15.80 size:402956 PSNR Mean Y:47.30 U:51.04 V:51.57 Avg:48.26 Global:48.26
x264 [info]: slice P:181 Avg QP:18.05 size:224610 PSNR Mean Y:44.65 U:49.72 V:50.04 Avg:45.80 Global:45.78
x264 [info]: slice B:177 Avg QP:20.15 size:152771 PSNR Mean Y:42.90 U:49.27 V:49.64 Avg:44.20 Global:44.18
x264 [info]: mb I I16..4: 29.0% 45.7% 25.3%
x264 [info]: mb P I16..4: 29.5% 16.7% 12.0% P16..4: 18.4% 17.3% 5.1% 0.0% 0.0% skip: 0.9%
x264 [info]: mb B I16..4: 13.4% 7.8% 7.2% B16..8: 38.1% 6.1% 6.6% direct:14.4% skip: 6.4%
x264 [info]: 8x8 transform intra:28.5% inter:17.4%
x264 [info]: ref P 75.9% 12.5% 6.4% 3.2% 2.1%
x264 [info]: ref B 74.2% 19.5% 4.4% 2.0%
x264 [info]: SSIM Mean Y:0.9813726
x264 [info]: PSNR Mean Y:43.801 U:49.506 V:49.853 Avg:45.025 Global:44.928 kb/s:36497.25

encoded 360 frames, 0.68 fps, 36497.60 kb/s

G:\Movies\300_HD>pause
Press any key to continue . . .


G:\Movies\300_HD>x264_aq --crf 18 --ref 5 --mixed-refs --no-fast-pskip --bframes 16 --bime --weightb --b-pyramid --b-rdo --8x8dct --subme 7 --me umh --trellis 1 --aq-strength 0.3 --threads auto --progress "source.avs" --output "NotGrainy.264" --pass 1 --stats "NotGrainy.log"
avis [info]: 1920x800 @ 23.98 fps (360 frames)
x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 SSE3 3DNow!
x264 [info]: slice I:2 Avg QP:15.45 size:278436 PSNR Mean Y:47.03 U:51.37 V:51.57 Avg:48.07 Global:48.07
x264 [info]: slice P:136 Avg QP:17.75 size:108867 PSNR Mean Y:44.63 U:50.04 V:50.26 Avg:45.81 Global:45.79
x264 [info]: slice B:222 Avg QP:19.84 size: 32984 PSNR Mean Y:43.42 U:50.00 V:50.23 Avg:44.73 Global:44.71
x264 [info]: mb I I16..4: 21.3% 54.7% 24.0%
x264 [info]: mb P I16..4: 6.4% 12.5% 3.1% P16..4: 34.6% 30.1% 11.2% 0.0% 0.0% skip: 2.1%
x264 [info]: mb B I16..4: 0.3% 0.7% 0.2% B16..8: 38.1% 3.7% 8.2% direct: 6.2% skip:42.6%
x264 [info]: 8x8 transform intra:57.0% inter:46.0%
x264 [info]: ref P 70.6% 21.3% 3.3% 2.4% 2.3%
x264 [info]: ref B 85.3% 13.8% 0.6% 0.3%
x264 [info]: SSIM Mean Y:0.9810072
x264 [info]: PSNR Mean Y:43.895 U:50.026 V:50.245 Avg:45.159 Global:45.100 kb/s:12086.75

encoded 360 frames, 0.26 fps, 12087.09 kb/s

G:\Movies\300_HD>pause
Press any key to continue . . .
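
For anyone who wants the two runs side by side, here is a rough Python sketch of the comparison. It only uses the kb/s and SSIM figures printed in the two logs; the helper name `percent_reduction` is made up for this illustration.

```python
# Figures copied from the two x264 logs in this post.
grainy_kbps = 36497.25      # without TemporalDegrain
degrained_kbps = 12086.75   # with TemporalDegrain
grainy_ssim = 0.9813726
degrained_ssim = 0.9810072


def percent_reduction(before, after):
    """Return how much smaller `after` is than `before`, in percent."""
    return (before - after) / before * 100.0


reduction = percent_reduction(grainy_kbps, degrained_kbps)
ssim_drop = grainy_ssim - degrained_ssim

print(f"bitrate reduction at the same CRF: {reduction:.1f}%")
print(f"SSIM difference between the two runs: {ssim_drop:.7f}")
```

Keep in mind that x264 measures SSIM and PSNR against whatever clip it was fed, so the second figure compares each encode with its own (grainy or degrained) input, not with the untouched source.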

Sagekilla

17th February 2008, 00:02

Sasovics, did my updated script or the uploaded version work for you at all or do you still have problems?

Sasovics

17th February 2008, 02:06

Post moved to separate topic here (http://forum.doom9.org/showthread.php?t=134917)

Sagekilla

17th February 2008, 02:55

Please don't use my FastDegrain -- It's currently b0rked. Also, Sasovics please don't cross post.

Nikos

17th February 2008, 03:48

Edit: How odd, I tried changing the idx on the NR2 MVDegrain from idx=2 to idx=3 and wanna know what my results were? idx=2 (same idx as the last MVdegrain) was more detailed and looked better. It was only MARGINALLY better though. You have to look VERY closely to see if there was any change.
According to Didée from post 125 for using 2 mvdegrain (mvdegrain2+mvdegrain1):
Without corrected weightings, it will denoise less, not more. (Unless you split the idx'es, then it will denoise more indeed.)

By "split", does he mean a different idx in each DegrainX?
We need a little more teaching from Didée about idx :)

Terranigma

17th February 2008, 04:30

I think the guy "Bobby" who encoded the movie, had to use some kind of combination of avs filters as I am using the very same x264 encoder settings and of course the very same source, US retail HDDVD.

I have tried various avs filters, including Sagekilla's TemporalDegrain, FastDegrain, Dark Shikari's GrainOptimizer,
as well the LimitedSharpenFaster alone and many more.

None of them helped to get to the quality of iLL release :(

Because iLL's like the greatest (real-life source) encoder there is?
They don't use any of these premade scripts, I can tell you that for sure. :)

Sasovics

17th February 2008, 04:33

Because iLL's like the greatest (real-life source) encoder there is?
They don't use any of these premade scripts, I can tell you that for sure. :)

Don't understand what you mean with that "real-life source"
.. if they don't use these scripts, what are they using ? How can you tell ?

Terranigma

17th February 2008, 04:42

Don't understand what you mean with that "real-life source"

Real Life source, as in, non-toon sources (like this movie 300 for instance).

.. if they don't use these scripts, what are they using ? How can you tell ?
I can tell because they've been encoding way before temporaldegrain, mc_spuds, or this thread started, so that shouldn't be so hard to figure out.:cool:
(btw, let's leave out encoders or groups for that matter, as it's against the rules per se.)

Let's just say that what they use is actually simpler than you think; btw, it's not so hard to write scripts, just look over the syntax document in AviSynth's directory, or take examples from scripts like mc_spuds or temporaldegrain for that matter.

Sagekilla

17th February 2008, 05:14

Real Life source, as in, non-toon sources (like this movie 300 for instance).

I can tell because they've been encoding way before temporaldegrain, mc_spuds, or this thread started, so that shouldn't be so hard to figure out.:cool:
(btw, let's leave out encoders or groups for that matter, as it's against the rules per se.)

Let's just say that what they use is actually simpler than you think; btw, it's not so hard to write scripts, just look over the syntax document in AviSynth's directory, or take examples from scripts like mc_spuds or temporaldegrain for that matter.

Indeed, TD is fairly simple as far as filter chains go. It doesn't use a lot of complex concepts compared to others imo.

Didée

17th February 2008, 19:03

ATM I don't have the energy (a really bad flu, can't concentrate) to cut through all the stuff posted on the last few pages.

Not sure whether to rate it as "tragedy" or "comedy" ... In any case, lots of nonsense has been posted.

There have been really screwed-up findings & correlations about blocksizes, bogus inventions like "detection thresholds", funny alternate filters (dfttest with default tbsize=1 versus fft3dfilter with bt=5 ... hell, you cannot compare that!), instances of non-functional functions, even posted to the wiki (Sagekilla, do you actually try your own stuff? MVAnalyse has no argument "ov", only "overlap", and why is "ov" responsible for "blocksize"?), backwards-evolutions of proven concepts (in the more distant past, the biggest complaint about MVDenoise was blockiness due to not-yet-implemented block-overlap modes ... nowadays block overlapping is hard-coded away for the sake of performance ... welcome to the stone age), and what not else.

Back to sick bed, crying in silence.

Terranigma

17th February 2008, 19:42

Welcome back Didée. Hope you get well soon, and then you can explain to us about some of the nonsense that was posted (I know that I'm responsible for most of it. :D). Honestly though, Fizick would have to update his document if blocksize 4 is weaker than blocksize 8 for denoising. I was thinking that they also functioned like the blockx, blocky arguments in tfm. Again, from comparisons I've done, blocksize 4 seemed to be stronger than 8 and 16, just like Fizick has it documented, but whatever: you're the expert around these parts. :)

jeffy

17th February 2008, 20:54

Didée,
please get well soon and forget all the nonsense for a while!
:thanks: for everything.

Back to sick bed, crying in silence.

foxyshadis

17th February 2008, 22:22

Zanejin

17th February 2008, 23:32

It's not so much that blocksize 4 is stronger, it's that it has a higher chance of making bad correlations and subsequently denoising places that shouldn't by rights have been affected. Meaning smudgier details and edges in random places. blocksize 8 or 16 with a soft threshold, as Didée's been asking for forever, would be a better improvement.

So for MVAnalyse, based purely on results for MVDegrain, the blksize value that gives the greatest speed also happens to give the strongest and most accurate filtering?

Nikos

18th February 2008, 03:45

Welcome back Didée.
Please explain to me the difference, if any, between 1 and 2:

1.
source=last
b1_vec = source.MVAnalyse(isb = true, delta = 1, blksize = 8, overlap = 4, pel = 2, sharp = 2, idx = 1)
f1_vec = source.MVAnalyse(isb = false, delta = 1, blksize = 8, overlap = 4, pel = 2, sharp = 2, idx = 1)

NR1 = source.MVDegrain1(b1_vec, f1_vec, thSAD = 400, idx = 1)
NRa = NR1.MVDegrain1(b1_vec, f1_vec, thSAD = 400, idx = 1)

2.
source=last
b1_vec = source.MVAnalyse(isb = true, delta = 1, blksize = 8, overlap = 4, pel = 2, sharp = 2, idx = 1)
f1_vec = source.MVAnalyse(isb = false, delta = 1, blksize = 8, overlap = 4, pel = 2, sharp = 2, idx = 1)

NR1 = source.MVDegrain1(b1_vec, f1_vec, thSAD = 400, idx = 1)
NRb = NR1.MVDegrain1(b1_vec, f1_vec, thSAD = 400, idx = 2)

And how, if at all, do the two above relate to this:
3.
source=last
b2_vec = source.MVAnalyse(isb = true, delta = 2, blksize = 8, overlap = 4, pel = 2, sharp = 2, idx = 1)
b1_vec = source.MVAnalyse(isb = true, delta = 1, blksize = 8, overlap = 4, pel = 2, sharp = 2, idx = 1)
f1_vec = source.MVAnalyse(isb = false, delta = 1, blksize = 8, overlap = 4, pel = 2, sharp = 2, idx = 1)
f2_vec = source.MVAnalyse(isb = false, delta = 2, blksize = 8, overlap = 4, pel = 2, sharp = 2, idx = 1)

NRc=source.MVDegrain2(b1_vec, f1_vec, b2_vec, f2_vec, thSAD = 400, idx = 1)

I know that (3) uses more frames.

Thank you.

foxyshadis

18th February 2008, 03:58

Strongest and fastest, but least accurate. On the other hand, it can be more accurate on some edges. Using large blocksizes in most areas and progressively smaller ones in occluded areas is the best way, as x264 does, but what can you do at the moment, y'know? This was discussed in the mvtools thread a while back.

Nikos

18th February 2008, 15:17

Thank you foxyshadis for the reply. As I understand it, you mean that 1 and 2 are stronger and faster but less accurate than 3.

What's the difference between 1 and 2?
Are both of them correct, or only one?

foxyshadis

18th February 2008, 19:34

Sorry, that was a reply to Zanejin.

You always have to use a new idx when you use a new source (unless you use pel=1, then idx doesn't offer a speedup). So you have one idx for MV with source, and another idx for MV on NR1.
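
A toy Python model may make the idx rule easier to picture. This is only a sketch under an assumption: that with pel=2 the plugin keeps interpolated frames in a cache keyed by idx and trusts that the same idx always means the same clip. The real MVTools internals are not shown in this thread, and the names `_cache` and `interpolated` are invented for the illustration.

```python
# Toy model only: not the real MVTools code.
# Assumption: interpolated (pel=2) frames are cached per idx, and the
# plugin never checks that the clip behind an idx is still the same one.
_cache = {}


def interpolated(clip_name, idx):
    """Return the cached 'interpolated' data for idx, filling it on first use."""
    if idx not in _cache:
        _cache[idx] = f"interpolated({clip_name})"
    return _cache[idx]


# Script 1: the second MVDegrain1 runs on NR1 but reuses idx=1.
first = interpolated("source", idx=1)
stale = interpolated("NR1", idx=1)    # silently gets the data built from source

# Script 2: NR1 gets its own idx, so it gets its own data.
fresh = interpolated("NR1", idx=2)

print(stale)  # interpolated(source)
print(fresh)  # interpolated(NR1)
```

In this model, reusing idx=1 for NR1 hands back data built from `source`, which is why a new clip gets a new idx.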

Nikos

18th February 2008, 20:25

Thank you foxyshadis for the explanation.
In simple terms, the right script is number 2, with idx=2.
2.
source=last
b1_vec = source.MVAnalyse(isb = true, delta = 1, blksize = 8, overlap = 4, pel = 2, sharp = 2, idx = 1)
f1_vec = source.MVAnalyse(isb = false, delta = 1, blksize = 8, overlap = 4, pel = 2, sharp = 2, idx = 1)

NR1 = source.MVDegrain1(b1_vec, f1_vec, thSAD = 400, idx = 1)
NRb = NR1.MVDegrain1(b1_vec, f1_vec, thSAD = 400, idx = 2)

thetoof

19th February 2008, 07:37

@ Didée

From what you said... I guess/hope you'll rewrite the script to correct everything and I was wondering if you could add (or simply explain how to do it) a function to use dfttest and fft3dfilter at the same time. Maybe it's just a bad idea to use 2 denoisers... but the source I have requires additional chroma denoise with specific s2/3/4 values with fft3d to be clean.

I tweaked the parameters of dfttest in the TemporalDegrain.avsi and I got pretty good results, but couldn't find a way to add fft3d... hope you'll be willing to help me.

archaeo

25th February 2008, 00:16

Just tried sagekilla's TemporalDegrain() on a very grainy old film transfer: "I am a Fugitive from a Chain Gang" (1932), and I am very impressed with the results (See attachments). The grain on the original transfer was extremely heavy, and other degrain combinations never got close to the results I got with TD... Slow...YES, (even with an E6700, I only got about 2 fps) but well worth the wait.
I have also tried this on a couple of other older films, with similar results - grain virtually gone, but excellent detail retention. Very nice script!

I only wish there was a way to speed it up a bit with MT :(

Original clip: http://www.mediafire.com/?mi11znznz3r
TD clip: http://www.mediafire.com/?its3m3z9yud

stills:

8141

8142

Nikos

28th February 2008, 02:01

I read an old post by Didée on another forum and replaced the prefilter with this:

soft = hqdn3d(source).bicubicresize(source.width/4/4*4,source.height/4/4*4).bicubicresize(source.width,source.height,1,0)
filter=mt_lutxy( source, soft, "x y = x x x y - abs 1 + 1 2 / ^ x y - x y - abs / * - ?", U=2, V=2 )

The difference in rendering speed was respectable: +40% with overlap=4, and +70% with overlap=2.
I use MVDegrain2 and MT(2,2).
The quality looked the same to my eyes, but I'd like an opinion from an experienced user.

Any idea for another correct "spat" filter?

Didee help us :)

Didée

28th February 2008, 08:42

Several issues.

1) The string in lutxy is wrong. That syntax was for YV12lutxy() of old MaskTools v1.
For mt_Lutxy of MaskTools v2, the string has to be "x y == ..." (TWO equal signs. Try to return "filter" with your current syntax, and you'll immediately see that something's wrong.)

2) With that order of filters, you're actually nullifying the effect of HQDN3D. In this case, the more reasonable ordering is
soft = source.bicubicresize(source.width/4/4*4,source.height/4/4*4).bicubicresize(source.width,source.height,1,0)
filter=mt_lutxy( source, soft, "x y == x x x y - abs 1 + 1 2 / ^ x y - x y - abs / * - ?", U=2, V=2 ).hqdn3d()

3) Whether pre-denoising is needed at all depends on the strength of the grain. I see a slight danger that this sort of processing now gets thrown at all kinds of sources, even those that don't need such a processing method.

4) That combo is a very speedy trick-filter. It might work out for the most part, but there'll be cases where it bites you back. Dumb filters can destroy the motion of rather smooth regions (e.g. close-up of smooth faces + head-movement) strongly enough that MVTools won't recognize motion anymore.
BTW, HQDN3D is generally not fully safe in this respect.

5) We have about one thousand different denoising filters or scripts. If the source has strong grain, the pre-filtering should be barely strong enough to make static areas calm.
Which filter that is or could be, depends on the source. If you're going to ask about all thousand possibilities, this thread will become long. ;)

Nikos

28th February 2008, 14:52

Thanks Didée for the correction. That syntax was for YV12lutxy(). The reverse polish notation is a little difficult. What's the difference between "==" and "="? These days I read your old posts to find solutions to my problems :)
If you're willing, please translate the function into simple maths.

I need a speedy prefilter with acceptable (not high) quality for High Definition sources with moderate or low noise. Should the prefilter be spatial, temporal, or spatiotemporal? Should I put the spatial or the temporal part first?

Is FFT3DGPU suitable as a prefilter?
It would be a good idea to write a "magic" prefilter script for MVDegrain with options for heavy, medium or light denoising :)

Any help.....

Didée

28th February 2008, 16:35

"=" is an assignment, whereas "==" means "is equal to" in logical expressions.

a = 3 # assign the value "3" to the variable "a"

a == 4 # this is "TRUE" if "a" has the value "4", and is "FALSE" if "a" has a value other than "4".

The syntax used by MaskTools v1 was always wrong in that respect. MaskTools v2 now has it correct.

The reverse polish notation is a little difficult.
No, not really. It's just a matter of practice.

"x y == x x x y - abs 1 + 1 2 / ^ x y - x y - abs / * - ?"

reads in English

IF (x is equal to y)
THEN (use x)
ELSE (apply the square root of (difference between x and y, plus 1) to x)
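
For practice, the string can be stepped through with a tiny stack evaluator. The following Python sketch handles only the handful of tokens used in this one expression. It is not how MaskTools is implemented, it does not round or clamp to 0..255 the way a real lookup table would, and treating a division by zero as 0 is an assumption of the sketch (it only matters when x equals y, where that branch is thrown away anyway).

```python
import operator


def safe_div(a, b):
    # Assumption of this sketch: dividing by zero yields 0 instead of failing.
    # It only happens when x == y, where the result is discarded anyway.
    return a / b if b != 0 else 0.0


BINARY = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": safe_div,
    "^": operator.pow,
    "==": lambda a, b: a == b,
}


def eval_rpn(expr, x, y):
    """Evaluate a small subset of MaskTools-style RPN for one pixel pair."""
    stack = []
    for tok in expr.split():
        if tok == "x":
            stack.append(float(x))
        elif tok == "y":
            stack.append(float(y))
        elif tok == "abs":
            stack.append(abs(stack.pop()))
        elif tok == "?":
            # Ternary: condition, value-if-true, value-if-false.
            if_false = stack.pop()
            if_true = stack.pop()
            cond = stack.pop()
            stack.append(if_true if cond else if_false)
        elif tok in BINARY:
            b = stack.pop()
            a = stack.pop()
            stack.append(BINARY[tok](a, b))
        else:
            stack.append(float(tok))  # numeric literal
    return stack.pop()


EXPR = "x y == x x x y - abs 1 + 1 2 / ^ x y - x y - abs / * - ?"

print(eval_rpn(EXPR, 100, 100))  # 100.0
print(eval_rpn(EXPR, 100, 85))   # 96.0
print(eval_rpn(EXPR, 85, 100))   # 89.0
```

With x=100 and y=85 the difference is 15, so x moves by sqrt(15 + 1) = 4 towards y; with the roles swapped it moves by 4 in the other direction.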

I need a speedy prefilter with acceptable (not high) quality for High Definition sources with moderate or low noise.
For low noise you need exactly no prefiltering at all. Use MVDegrainX directly as stated in MVTools' documentation.

Is FFT3DGPU suitable as a prefilter?
When asking questions where the only objective answer is "there is no objective answer", it won't get better if you guys keep asking over and over again.

Nikos

28th February 2008, 17:09

Now, with your help, I'm beginning to understand MaskTools and reverse Polish notation.
Now I'm off to practice. I will be back soon with more serious questions :)

Thanks for your time to answer my questions.

jeffy

28th February 2008, 19:23

"x y == x x x y - abs 1 + 1 2 / ^ x y - x y - abs / * - ?"

reads in english

IF (x is equal to y)
THEN (use x)
ELSE (apply the square root of (difference between x and y, plus 1) to x)

Didée, could you please rewrite the lime part in a normal (infix) notation? Thank you.

Didée

28th February 2008, 21:01

In infix notation it is

(x == y) ? x : x - ( sqrt( abs( x - y ) + 1 ) * ( x - y ) / abs( x - y ) )

That's very basic math. Everyone once has used something like

clip1 = last
clip2 = clip1.blur(1)
result = clip1 .merge( clip2, 0.4 )

What is this? It is: apply a filter to your source (here: blur filter), but use only 40% of the result. (Or of the resulting effect.) That's trivial, isn't it?

From a slightly different angle, it is this: calculate something, but instead of just using the resulting difference, multiply the difference by 0.4 before using it. That's a linear function: F(diff) = diff * p

Now, what the initial code does is just this: instead of using a linear function to reduce the effect, it uses a square root function: F(diff) = sqrt(diff)
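
To see the two curves next to each other, here is a small Python sketch. It is a per-pixel illustration only: `linear_limit` stands in for the merge(clip2, 0.4) case, `sqrt_limit` follows the infix formula (including its "+ 1"), and the helper names are made up for this example.

```python
import math


def linear_limit(diff, p=0.4):
    """F(diff) = diff * p, the merge(clip2, 0.4) case."""
    return diff * p


def sqrt_limit(diff):
    """Sign-preserving square root of (|diff| + 1), as in the infix formula."""
    if diff == 0:
        return 0.0
    return math.copysign(math.sqrt(abs(diff) + 1), diff)


def limited_merge(x, y, limiter):
    """Move x towards y by the limited difference."""
    return x - limiter(x - y)


# Small, medium and large differences from x = 100.
for y in (99, 85, 36):
    print(y, limited_merge(100, y, linear_limit), limited_merge(100, y, sqrt_limit))
```

For a difference of 15 the linear version moves x by 6 and the square-root version by 4; for a difference of 64 it is 25.6 against roughly 8, so the square root holds large changes back far more than the linear factor does.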

That's all, and hardly worth all the babbling. :)

Lorax2161

28th February 2008, 22:48

Actually, thank you for "babbling." Not everyone is at the same level, and I learned some things from it. A good explanation goes a long way to helping people open doors they thought were locked.

BTW, to further on what archaeo said, I ran TemporalDegrain() on a somewhat noisy cable tv capture and the results were stunning. (I didn't know how noisy it was until after it was cleaned.) I only get 1.8fps on my dual core machine, but I haven't seen anything at any speed that comes close to that quality, especially true if I follow it with LSF. So to you and sagekilla I say thanks.

Nikos

28th February 2008, 22:48

To help a little.
( x - y ) / abs( x - y ) = sign(x-y)

In masktools Round(1.5)=? and Round(-1.5)=?
I suppose 2 and -1, but i am not sure.

Terranigma

28th February 2008, 23:22

I think you guys may want to split this thread into a new one, because it's now getting way offtopic :)

NathanX

3rd March 2008, 10:22

Please don't use my FastDegrain -- It's currently b0rked. Also, Sasovics please don't cross post.

Your FastDegrain script sounds interesting to me, is it still in development?
Thank you!

ChrisW77

3rd March 2008, 15:55

Sagekilla, that's one amazing script that works extremely well on SD material, and heck, even on old VHS it does wonders.

So thanks to both you and Didee :)

I'm currently using Temporal Degrain V1.18 (Feb 14, 2008), but cannot seem to get any joy from MT, CPU doesn't go above 60% usage, no matter where I put SetMTmode(2) in a simple script.

For instance

SetMemoryMax(512)
LoadPlugin("C:\Program Files\AviSynth\plugins\DGDecode.dll")
LoadPlugin("C:\Program Files\AviSynth\plugins\mpasource.dll")
Import("C:\Program Files\AviSynth\plugins\TemporalDegrain2.avs")

SetMTmode(2)
video = mpeg2source("D:\DVB-T\test1.d2v", cpu=4, iPP=true, idct=3)
audio = MPASource("D:\DVB-T\test1 T01 DELAY 0ms.mpa", normalize = false)
AudioDub(video, audio)
ConvertToYV12(interlaced=true)

AssumeTFF()
SeparateFields()
TemporalDegrain()
Weave()
Crop(8,8,-8,-8,align=true).Addborders(8,8,8,8)

This is just a simple DVB-T capture test, keeping interlaced full frame, and on my Core2duo E6300 I get around 3 fps with or without SetMTmode(2).

One thing I have tried, as I have a powerful GPU (8800GT), gained me 13fps (from 3fps to 16fps) without any perceived loss in quality: modifying your function as follows

Change all FFT3D entries to FFT3dGPU, changing BT=5 to BT=4, as the GPU version doesn't support 5, and blksize from 8 to 16. Huge speedup, and I couldn't see any difference in quality.

Many thanks for this function, it's amazing. :D

archaeo

3rd March 2008, 19:09

I'm currently using Temporal Degrain V1.18 (Feb 14, 2008), but cannot seem to get any joy from MT, CPU doesn't go above 60% usage, no matter where I put SetMTmode(2) in a simple script.

Try MT("TemporalDegrain()").
It worked well for me - I am seeing 90% plus usage on my Core2

Boulder

3rd March 2008, 19:20

I wonder if MVAnalyse/MVDegrain plays nice with MT..nobody ever tested it IIRC.

ChrisW77

3rd March 2008, 20:17

Try MT("TemporalDegrain()").
It worked well for me - I am seeing 90% plus usage on my Core2

Thanks, I'll give that a try.

Terranigma

3rd March 2008, 20:21

I wonder if MVAnalyse/MVDegrain plays nice with MT..nobody ever tested it IIRC.

I don't know what this (http://forum.doom9.org/showthread.php?p=1056510#post1056510) is exactly, but I stumbled across it in the MT thread. :P

maxisvk

3rd March 2008, 20:44

Thanks, I'll give that a try.

This is my test on Core 2 Duo with TemporalDegrain()
SetMTMode(2,0) , 50% Cpu
SetMTMode(2,3) , 65% Cpu
SetMTMode(2,4) , 90/100% Cpu

On a Q6600 it's the same; only with SetMTMode(2,8) does it run at 100%.

ChrisW77

3rd March 2008, 21:01

SetMTMode(2,4) , 90/100% Cpu

Yes, yes, now that worked well, getting 95% continuous. Cheers :) Never thought of using 4 threads.

A strange thing I found with MVTools was this

SetMemoryMax(512)
LoadPlugin("C:\Program Files\AviSynth\plugins\RemoveGrainSSE2.dll")
LoadPlugin("C:\Program Files\AviSynth\plugins\MVTools.dll")
LoadPlugin("C:\Program Files\AviSynth\plugins\DeGrainMedian.dll")
LoadPlugin("C:\Program Files\AviSynth\plugins\DGDecode.dll")
LoadPlugin("C:\Program Files\AviSynth\plugins\mpasource.dll")

SetMTmode(2)
video = mpeg2source("D:\VHS\90sadverts.d2v", cpu=6, iPP=true, idct=3)
audio = MPASource("D:\VHS\90sadverts T01 DELAY 0ms.mpa.d2a", normalize = false)
AudioDub(video, audio)

source=ConvertToYV12(interlaced=true).Crop(10,4,-10,-10,align=true).Addborders(10,7,10,7)
fields=source.AssumeTFF().SeparateFields().DeGrainMedian(mode=0,interlaced=false).RemoveGrain(mode=2)

lbda=512
blk=16
ol=8
sch=2
pel=1

backward_vec2 = fields.MVAnalyse(isb=true, search=sch, truemotion=true, lambda=lbda, delta=2, pel=pel, blksize=blk, overlap=ol, sharp=1, idx=1)
backward_vec1 = fields.MVAnalyse(isb=true, search=sch, truemotion=true, lambda=lbda, delta=1, pel=pel, blksize=blk, overlap=ol, sharp=1, idx=1)
forward_vec1 = fields.MVAnalyse(isb=false, search=sch, truemotion=true, lambda=lbda, delta=1, pel=pel, blksize=blk, overlap=ol, sharp=1, idx=1)
forward_vec2 = fields.MVAnalyse(isb=false, search=sch, truemotion=true, lambda=lbda, delta=2, pel=pel, blksize=blk, overlap=ol, sharp=1, idx=1)
fields.MVDegrain2(backward_vec1,forward_vec1,backward_vec2,forward_vec2,thSAD=800,idx=1).MVDenoise(backward_vec2,backward_vec1,forward_vec1,forward_vec2,tht=10,thSAD=800)
Weave()

That script ran well and fast, constantly at 100%, until it finished. Yet, add anything after weave, such as LimitedSharpenFaster, or even Tweak, and it kind of broke MT, and you were left with a script running 55%.

Cheers to maxisvk, that worked well.
Next, to look at a cheap quad core upgrade :D

Didée

3rd March 2008, 21:13

That script ran well and fast, constantly at 100%, ...
... and the result was 100% garbage.

source=ConvertToYV12(interlaced=true).Crop(...).Addborders(10,7,10,7)
Progressive YV12 shouldn't be cropped or add-border'ed by odd numbers vertically.
For interlaced YV12, it MUST NOT be. Never, never, never ever. If you do, you're changing chroma phase, i.e. luma and chroma are no longer temporally aligned.
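
As a quick sanity check before running a script, the rule can be written down in a few lines of Python. This is a sketch under a stated assumption: top crop and top border values should be multiples of 2 for progressive YV12 and multiples of 4 for interlaced YV12, which is the usual guideline given for AviSynth's Crop. Only the top values are checked because those are the ones that move picture rows; the function names are invented for the example.

```python
def vertical_mod(interlaced):
    """Required multiple for top crop/border values on YV12.

    Assumption: mod 2 for progressive (4:2:0 chroma is half height) and
    mod 4 for interlaced (each field carries its own half-height chroma).
    """
    return 4 if interlaced else 2


def bad_top_values(values, interlaced):
    """Return the top offsets that would shift luma against chroma."""
    mod = vertical_mod(interlaced)
    return [v for v in values if abs(v) % mod != 0]


# Top values from Crop(10,4,-10,-10) and AddBorders(10,7,10,7).
crop_top = 4
border_top = 7

print(bad_top_values([crop_top, border_top], interlaced=True))  # [7]
```

Here the crop of 4 lines passes, but the 7-line top border does not: the picture ends up 3 lines lower than it started, which is an odd shift.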

ChrisW77

3rd March 2008, 21:41

... and the result was 100% garbage.

Looked good enough to MY eyes ;)

Progressive YV12 shouldn't be cropped or add-border'ed by odd numbers vertically.
For interlaced YV12, it MUST NOT be. Never, never, never ever. If you do, you're changing chroma phase, i.e. luma and chroma are no longer temporally aligned.

Then suggest something then, rather than just spouting off one.
How would one, by Didee's standards, crop off the shite on a Interlaced VHS cap.
I'm listening, o Great one :D

thetoof

4th March 2008, 08:10

Basic rule of MT : Look at the time it takes, not the CPU usage!
Also, I think setmtmode(2) for MVtools would be better since:
1- In the guide, there is this info: # Note: SetMTMode(2) mode of multithreaded AviSynth is also supported since MVTools v.1.8.4.1 (beta testing)
2-MT("filter()") splits the frames in multiple parts (depending on the # of threads)... and unless you use a HUGE overlap, the motion vectors won't be as accurate as if you had used the complete frame (which is done by setmtmode(2))

If you didn't know, setmtmode(2) separates the frames temporally (so it's always the complete frame that is processed), while MT() separates them spatially.
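
To picture the difference, here is a rough Python sketch of the two ways of dividing the work. It is an illustration only: the round-robin frame assignment, the horizontal strips and the function names are all assumptions, not the actual scheduling used by the MT plugin.

```python
def temporal_split(num_frames, threads):
    """SetMTMode-style: every thread gets whole frames.

    Round-robin assignment is an assumption made for illustration.
    """
    return [list(range(t, num_frames, threads)) for t in range(threads)]


def spatial_split(height, threads, overlap):
    """MT()-style: every thread gets a strip of each frame.

    Strips are extended by `overlap` rows on each inner edge; the strip
    direction is an assumption made for illustration.
    """
    strip = height // threads
    ranges = []
    for t in range(threads):
        top = max(0, t * strip - overlap)
        # The last strip always runs to the bottom of the frame.
        bottom = height if t == threads - 1 else min(height, (t + 1) * strip + overlap)
        ranges.append((top, bottom))
    return ranges


print(temporal_split(8, 2))        # [[0, 2, 4, 6], [1, 3, 5, 7]]
print(spatial_split(800, 2, 16))   # [(0, 416), (384, 800)]
```

In the spatial case each thread only ever sees its own strip plus the overlap rows, so a motion search cannot look further than that; in the temporal case every thread has the full 800 rows to search.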

maxisvk

4th March 2008, 13:43

Basic rule of MT : Look at the time it takes, not the CPU usage!

ok, obviously, but in my test:

SetMTMode(2,0) , 50% Cpu
-> encoded 120 frames, 1.61 fps, 855.64 kb/s

SetMTMode(2,3) , 65% Cpu
-> encoded 120 frames, 1.64 fps, 855.64 kb/s

SetMTMode(2,4) , 90/100% Cpu
-> encoded 120 frames, 2.15 fps, 855.64 kb/s

I also noticed that with only MVDegrain instead of TemporalDegrain, the best result tops out at SetMTMode(2,3). Why? I really don't know!
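
Turning the three fps figures above into speedups and run times makes the comparison easier to read. A small Python sketch, using only the numbers reported in this post (120 frames, three thread settings):

```python
# (threads argument -> fps) as reported above for SetMTMode(2, n).
results = {0: 1.61, 3: 1.64, 4: 2.15}
baseline = results[0]
test_frames = 120

for threads, fps in results.items():
    speedup = fps / baseline
    minutes = test_frames / fps / 60
    print(f"SetMTMode(2,{threads}): {speedup:.2f}x, {minutes:.2f} min")
```

Going from 50% to 65% CPU usage barely changes the run time, while the 4-thread setting is about a third faster than the baseline.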

thetoof

4th March 2008, 22:35

Because TemporalDegrain uses a predenoised reference clip for motion vector search... so it takes more computer resources, and it's very possible your hard disk can't handle the high number of reads and writes. That's why there's a peak in speed. You can "solve" that by making the reference clip yourself with FFT3DFilter.

If you only want to denoise, use this:

1 - Create a simple script to make your reference clip. It doesn't really matter if you lose some details and it looks a tad overdenoised, since it's only a clip used for motion vector search:

whateversource("yoursource")
FFT3DFilter(your favorite parameters... check the documentation for more info)

2 - Process through Direct Stream Copy in virtualdubmod. If you have another HDD, save it on this one. The file'll be HUGE.

3 - create another script with temporaldegrain using the source clip that is on the other HDD

source = whateversource("youroriginalclip")
denoised = avisource("thereferenceclip")
temporaldegrain(source, denoised)

About your speedtest... it's OBVIOUS that using more threads will increase the speed. SetMTmode(2) is always the best choice since it'll set the number of threads to the # of available processors.

When I say to look at speed instead of CPU usage, it's with the same # of CPUs. The thing is, when you have a more complex script, MT may not help (I have a script that runs @ 1fpm with 750MB of RAM and 25% CPU usage (100% of 1 processor on my Quad-core) and it's a LOT slower with MT and uses 3GB of RAM).

foxyshadis

5th March 2008, 07:32

Eh, no matter how slow a script is, writing part of it to disk and reading it back will always be slower. Whether you're going to make another lossless encode from the final output, or use it in a one-pass encoding, by the time you're done you'll have put much more effort into it and it will have taken longer overall. Memory overflow and disk swapping is about the only way that partial scripting can be faster. TemporalDegrain can't even remotely cause a hard drive bottleneck anyway, even with uncompressed sources it takes about a hundred times as long to filter as read.

On the other hand, if you have a second computer handy, running fft3dfilter on that and using TCPSource can really speed things up, as long as the second computer's fast enough to keep up.

I'd also say that your conclusion that SetMTmode(2) is best is mistaken; after all, you just said more threads was faster, and speed is all that matters, as long as the final encoder's thread is included rather than just the avisynth script measured. In some scripts 4 threads will be faster, others 1, though I suppose yours might be swapping like crazy once it gets there. Always measure though, never take a forum poster's word. :p

thetoof

5th March 2008, 08:39

Oh well, seems like I was lucky on my first try using the method I described before... I ran some other tests to show how effective it was, but everything I got looked like crap.
Dunno why it worked once (and there was actually a great speed improvement from outputting the reference clip to another HDD and then calling it in the script using TemporalDegrain as "denoise": from 53 mins total (complete script) to 13 mins total (split script) for the same number of frames) and I'm sorry for posting it before running more tests... but heh, I'm learning!

Also, about my statement regarding setmtmode(2), I had in mind that it was regarding MVtools usage (and from what I understood from its documentation, it's the most appropriate mode). My bad for not being clear.

Zep

6th March 2008, 20:46

Eh, no matter how slow a script is, writing part of it to disk and reading it back will always be slower. Whether you're going to make another lossless encode from the final output, or use it in a one-pass encoding, by the time you're done you'll have put much more effort into it and it will have taken longer overall. Memory overflow and disk swapping is about the only way that partial scripting can be faster. TemporalDegrain can't even remotely cause a hard drive bottleneck anyway, even with uncompressed sources it takes about a hundred times as long to filter as read.

Yup, for 1-pass encodes it would not help, but 2-pass encoding is much faster when I use twriteavi() on the first pass of an encode that is slow because of scripts like the ones in this thread.

If you are only getting 2 FPS like me you for sure want to save out a lossless during the first pass else you are stuck doing it all again on the second pass. resize, decimation, MVtools, fft3dfilter, etc... I go from 2 FPS on the first pass to 14 FPS on the second pass and this is now with the real encoding being done also. (it would be only about .75 FPS on pass 2 if I didn't do this) huge time savings on 2 pass encodes.
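
Putting rough numbers on that, here is a Python sketch of the total two-pass time with and without the lossless intermediate. The fps figures are the ones quoted above; the clip length is a made-up example, and the sketch assumes the first pass runs at 2 fps either way (i.e. that writing the lossless file costs next to nothing).

```python
def two_pass_seconds(frames, pass1_fps, pass2_fps):
    """Total wall-clock time for both passes."""
    return frames / pass1_fps + frames / pass2_fps


frames = 10_000  # hypothetical clip length, not from the post

# 2 fps on pass 1, then 14 fps on pass 2 when reading the lossless
# intermediate, or about 0.75 fps when the filters have to run again.
with_intermediate = two_pass_seconds(frames, 2.0, 14.0)
without_intermediate = two_pass_seconds(frames, 2.0, 0.75)
ratio = without_intermediate / with_intermediate

print(f"with lossless intermediate:    {with_intermediate / 3600:.2f} h")
print(f"without lossless intermediate: {without_intermediate / 3600:.2f} h")
print(f"ratio: {ratio:.1f}x")
```

The ratio does not depend on the clip length, only on the three fps figures.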

no I would not do it like thetoof did it since you want the lossless (I use VBLE cause it is fast and thus little CPU overhead) made the same time as first pass stats are being made :)

g-force

7th March 2008, 07:49

TemporalDegrain looks really nice, but my PC is way too slow to run this thing practically. So I used Didee's limiting approach and wrote something that required fewer MV operations, with very good results IMHO. Let me know what you think!

# GTemporalDegrainFaster ver.1.00 6MAR08
# Function by G-force,"Limited" Concept by Didee
# Requires FFT3DFilter.dll, mt_masktools, MVtools.dll, RemoveGrain.dll

function GTemporalDegrainFaster (clip input,int "threads",int "sigma")
{
source = input
ncpu = default(threads,1) #max number of CPU threads to use in FFT calculation (int>0, default=1)
sigma = default(sigma, 16) #turn down for more limited denoising
s2 = floor (sigma * 0.625)
s3 = floor (sigma * 0.375)
s4 = floor (sigma * 0.250)

filter = source.fft3dfilter(ncpu=ncpu,sigma=sigma,sigma2=s2,sigma3=s3,sigma4=s4,bt=5,bw=16,bh=16,ow=8,oh=8)
filterD = mt_makediff(source,filter)

backward_vec1= filter.MVAnalyse(isb=true, delta=1,pel=2,overlap=4,sharp=1,idx=1)
forward_vec1 = filter.MVAnalyse(isb=false,delta=1,pel=2,overlap=4,sharp=1,idx=1)
bw1 = source.MVCompensate(backward_vec1,idx = 2) #idx = 2 since working on a different clip than MVAnalyse
fw1 = source.MVCompensate(forward_vec1, idx = 2)

#nice thing about this approach is that it doesn't necessarily use any pixels from current frame
#this gets rid of a lot more dirt, scratches etc. (things that last only one frame)
nr = Interleave(bw1,source,fw1).Clense().SelectEvery(3,1)

# Limit "nr" to not do more than what "filter" would do. -Didee
nrD = mt_makediff(source,nr)
DD = mt_lutxy(filterD,nrD,"x 128 - abs y 128 - abs < x y ?")
source.mt_makediff(DD,U=2,V=2)

output = last
return(output)
}

-G
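
For readers who want to see what the limiting step in the function above does per pixel, here is a small plain-Python illustration (not AviSynth). It assumes mt_makediff(a, b) stores a - b + 128 clamped to 0..255, and it uses made-up pixel values; the helper names are hypothetical.

```python
def clamp(value):
    """Clamp to the 8-bit range used by the difference clips."""
    return max(0, min(255, value))

def makediff(a, b):
    """Assumed behaviour of mt_makediff: a - b + 128, clamped."""
    return [clamp(x - y + 128) for x, y in zip(a, b)]

def limit_diff(x, y):
    """Same logic as the LUT "x 128 - abs y 128 - abs < x y ?":
    keep whichever difference is closer to 128 (i.e. the smaller change)."""
    return x if abs(x - 128) < abs(y - 128) else y

def limited_denoise(source, filtered, denoised):
    filter_d = makediff(source, filtered)   # what fft3dfilter would remove
    nr_d = makediff(source, denoised)       # what the MV/Clense step would remove
    dd = [limit_diff(x, y) for x, y in zip(filter_d, nr_d)]
    return makediff(source, dd)             # subtract the limited difference

# Made-up luma values for three pixels.
source   = [100, 100, 100]
filtered = [ 98, 100, 110]
denoised = [ 90, 104, 105]

result = limited_denoise(source, filtered, denoised)
print(result)
```

In each pixel the output follows whichever of the two filters changed the source less, so the motion-compensated result never moves further from the source than the FFT3DFilter result does.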

Didée

8th March 2008, 20:43

... and the result was 100% garbage.
Looked good enough to MY eyes ;)

I don't know your eyes, so I can't tell what seems good enough to them and what not.

With your sequence of crop+addborders, this is what happens to an interlaced YV12 source:

http://img401.imageshack.us/img401/5262/yv12iwrongua9.th.png (http://img401.imageshack.us/my.php?image=yv12iwrongua9.png)

In contrast, with correct treatment you get:

http://img401.imageshack.us/img401/967/yv12icorrectkt7.th.png (http://img401.imageshack.us/my.php?image=yv12icorrectkt7.png)

Then suggest something then, rather than just spouting off one.
How would one, by Didee's standards, crop off the shite on a Interlaced VHS cap.
I'm listening, o Great one :D

It's a basic technical requirement that the top border of interlaced YV12i may only be altered in mod4 steps.
This has nothing-at-all to do with "my standards".
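
A minimal plain-Python sketch (not AviSynth) of the mod4 rule stated above, applied to top-edge values like those in the script below. The classification is my reading of why the rule holds: in interlaced YV12 each chroma line belongs to one field and covers two luma lines of that field, so the pattern repeats every 4 frame lines. The function name is hypothetical, and the sketch ignores any rounding AviSynth itself may apply to odd values.

```python
def top_shift_effect(lines):
    """Classify what shifting the top edge by `lines` does to interlaced YV12."""
    if lines % 4 == 0:
        return "ok"                      # luma and chroma both stay in their field
    if lines % 2 == 1:
        return "fields swapped"          # top-field lines land on bottom-field rows
    return "chroma fields swapped"       # luma stays put, chroma moves to the other field

for label, lines in [("crop top 4", 4),
                     ("addborders top 7", 7),
                     ("addborders top 8", 8),
                     ("addborders top 6", 6)]:
    print(f"{label}: {top_shift_effect(lines)}")
```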

-----

Generating script: (yeah, it's not elegant):p

base = blankclip(width=704,height=576,pixel_type="YV12",fps=50.000,color=$408080)

base = base.subtitle("This is some scrolling text ...",size=40,x=128,y=480)

interleave( base, base.crop(0,16*1,-0,-0).addborders(0,0,0,16*1,color=$408080),
\ base.crop(0,16*2,-0,-0).addborders(0,0,0,16*2,color=$408080),
\ base.crop(0,16*3,-0,-0).addborders(0,0,0,16*3,color=$408080),
\ base.crop(0,16*4,-0,-0).addborders(0,0,0,16*4,color=$408080),
\ base.crop(0,16*5,-0,-0).addborders(0,0,0,16*5,color=$408080),
\ base.crop(0,16*6,-0,-0).addborders(0,0,0,16*6,color=$408080),
\ base.crop(0,16*7,-0,-0).addborders(0,0,0,16*7,color=$408080),
\ base.crop(0,16*8,-0,-0).addborders(0,0,0,16*8,color=$408080),
\ base.crop(0,16*9,-0,-0).addborders(0,0,0,16*9,color=$408080),
\ base.crop(0,16*10,-0,-0).addborders(0,0,0,16*10,color=$408080),
\ base.crop(0,16*11,-0,-0).addborders(0,0,0,16*11,color=$408080),
\ base.crop(0,16*12,-0,-0).addborders(0,0,0,16*12,color=$408080),
\ base.crop(0,16*13,-0,-0).addborders(0,0,0,16*13,color=$408080),
\ base.crop(0,16*14,-0,-0).addborders(0,0,0,16*14,color=$408080),
\ base.crop(0,16*15,-0,-0).addborders(0,0,0,16*15,color=$408080) )

vid_50p = last

vid_50i = assumeTFF().SeparateFields().SelectEvery(4,0,3).Weave()

# the wrong way:
#---------------
vid_50i
crop(0,4,-0,-10)
addborders(0,7,0,7)

wrong = last

# the better way:
#---------------
vid_50i
crop(0,4,-0,-12)
addborders(0,8,0,8)

better = last

result_wrong = stackhorizontal( wrong.subtitle("interlaced after non-MOD4 cropping or addborders",y=20),
\ stackvertical( wrong.SeparateFields().selecteven().subtitle("--> top field",y=20),
\ wrong.SeparateFields().SelectOdd().Subtitle("--> bottom field",y=20) ) )

result_better = stackhorizontal( better.subtitle("interlaced after correct (MOD4) cropping or addborders",y=20),
\ stackvertical(better.SeparateFields().selecteven().subtitle("--> top field",y=20),
\ better.SeparateFields().SelectOdd().Subtitle("--> bottom field",y=20) ) )

result_wrong
#result_better

return( last )