Color banding and noise removal [Archive] - Page 5

mswaino2

15th July 2011, 17:34

:eek: smode=1 is indeed slower than smode=0, but not that much slower. Something must be wrong. Are you sure your figures are right? Anyway, if you do a 2-pass encoding, render first to a lossless file then encode it, you'll save one avisynth pass.

I dunno, maybe it's just that I'm using it wrong in the script or something. Could it be because I'm using a denoiser and a sharpener along with it?

I've been doing a 2-pass encode with this script:

LoadPlugin("C:\Program Files\megui\tools\dgindex\DGDecode.dll")

Import("C:\Program Files\AviSynth 2.5\plugins\LimitedSharpenFaster.avsi")

LoadPlugin("C:\Program Files\AviSynth 2.5\plugins\mt_masktools-25.dll")
Import("C:\Program Files\AviSynth 2.5\plugins\dither.avsi")
Import("C:\Program Files\AviSynth 2.5\plugins\GradFun2DBmod.avsi")
Import ("C:\Program Files\AviSynth 2.5\plugins\mt_xxpand_multi.avsi")
DGDecode_mpeg2source("C:\Users\Owner\Desktop\Case Closed - The Last Wizard Of The Century\VTS_06_1.d2v", info=3)
LoadPlugin("C:\Program Files\Megui\tools\avisynth_plugin\ColorMatrix.dll")
ColorMatrix(hints=true, threads=0)
#deinterlace
crop( 0, 0, -2, 0)
LanczosResize(852,480) # Lanczos (Sharp)
LoadPlugin("C:\Program Files\megui\tools\avisynth_plugin\UnDot.dll")
Undot() # Minimal Noise
deen()
LimitedSharpenFaster(ss_x=2.0, ss_y=2.0, strength=150, overshoot=0, undershoot=0, soft=0, edgemode=0)
GradFun3(smode=1)
__film = last
__t0 = __film.trim(0, 142520)
__t0

The only thing I have changed is the script, and it's taking a very long time for some reason. I was getting 11-12 FPS; now I'm at 3 FPS after adding it.

Dogway

16th July 2011, 10:50

cretindesalpes: Would it be possible to add an lsb_in feature to mvtools2? It would be very useful for certain sources with blocking+noise, where I would run a spatial dfttest pass and then mdegrain. I read that spatial denoisers should normally go before temporal ones.

mandarinka

16th July 2011, 22:21

You can use a prefilter for the motion vector search (which I assume is completely fine to do in 8bit), to partially achieve that.

Something like this, I think (hopefully I got it right).
prefilt = FFT3Dgpu(sigma=2,bw=32,bh=32,bt=3,ow=16,oh=16,plane=4)

superfilt = prefilt.MSuper(pel=2, sharp=1)
super = MSuper(pel=2, sharp=1)
backward_vec2 = MAnalyse(superfilt, isb = true, delta = 2, overlap=4)
backward_vec1 = MAnalyse(superfilt, isb = true, delta = 1, overlap=4)
forward_vec1 = MAnalyse(superfilt, isb = false, delta = 1, overlap=4)
forward_vec2 = MAnalyse(superfilt, isb = false, delta = 2, overlap=4)

MDegrain2(super, backward_vec1,forward_vec1,backward_vec2,forward_vec2,thSAD=300,lsb=true)

Dogway

17th July 2011, 08:37

Hello mandarinka. Thanks for the input. I meant lsb_in for the denoising itself, not feeding a better-suited (prefiltered) clip to motion estimation. I know that lsb_in in mvtools is going to be quite annoying for cretindesalpes to implement, since he would probably need to adapt all the functions (manalyse, mmask, mcompensate, etc.) for lsb_in in mdegrain. But I think it makes sense, because spatio-temporal denoisers normally operate spatially first and then temporally internally (dfttest, tnlmeans, etc.).

By the way, I probably found a bug in Dither_resize16 when processing a .png file. I tested, and the error comes from the fh and fv parameters when the resize dimensions are small.
This is my code; pity I couldn't take advantage of the high-bitdepth RGB to YUV conversion : P
converttoyv12(matrix="PC.601")
MT("""Dither1Pre(flt="tnlmeans()",stacked=true)""",2,2,true)
dither_resize16(224,316,kernel="spline64",fh=1.1,fv=1.1)
Dither_convert_yuv_to_rgb(lsb_in=true,tv_range=false,output="rgb32",mode=6,ampn=0.5)

cretindesalpes

17th July 2011, 12:25

I know that lsb_in in mvtools is going to be quite annoying for cretindesalpes
Yes it would... but I could just leave the analysis part in 8 bits and focus on MSuper/MDegrain. Still, it's a lot of work, so I don't plan to do this in the near future.

By the way I found probably a bug in Dither_resize16 when processing a .png file.
I can't reproduce it here, everything looks OK. What is the bug, exactly? What are the dimensions of the original picture?

Dogway

17th July 2011, 12:45

Still, it's a lot of work, so I don't plan to do this in the near future.
No problem, I knew it was. Your hard work on Dither is already impressive for one man.
I can't reproduce it here, everything looks OK. What is the bug, exactly? What are the dimensions of the original picture?
Dimensions are 610x1114. I think it's relevant because the other images work.

cretindesalpes

17th July 2011, 20:03

Dimensions are 610x1114. I think it's relevant because the other images work.
OK I spotted the problem, a tile size calculation error again, resulting in a buffer overflow.

Fixed (I hope so) in Dither 1.9.4 (http://forum.doom9.org/showthread.php?p=1386559#post1386559).

TheProfosist

18th July 2011, 21:02

:eek: smode=1 is indeed slower than smode=0, but not that much slower. Something must be wrong. Are you sure your figures are right? Anyway, if you do a 2-pass encoding, render first to a lossless file then encode it, you'll save one avisynth pass.

I'm getting crazy slow speeds with smode=1 as well

smode=0 1.87FPS
smode=1 0.12FPS
this is with 1080p

cretindesalpes

19th July 2011, 07:46

I'm getting crazy slow speeds with smode=1 as well

Weird.

I ran a few speed tests with 1280x720 progressive input, in single-threading and multi-threading modes. I added a naked dfttest() because it's the core of GradFun3(smode=1) and I would suspect this is the one slowing down the whole thing. Also added GradFun2DBMod for reference.

#SetMTMode (5, 4)
FFVideoSource ("random 720p source")
#SetMTMode (2)
Trim (0, -1000)

# Select one of these lines
#NOP ()
#GradFun3 (smode=0, mask=0)
#GradFun3 (smode=0)
#GradFun3 (smode=1, mask=0)
#GradFun3 (smode=1)
#GradFun3 (smode=2, mask=0)
#GradFun3 (smode=2)
#dfttest (sigma=35, tbsize=1, sbsize=36, sosize=27, lsb=true)
#GradFun2DBmod ()

PointResize (32, 32)
# Encoded with: x264 --preset ultrafast --output NUL "speed.avs"

Results (Phenom II X4 965 3.4 GHz):

                             | Single-threaded   | Multi-threaded x4
                             |    fps     CPU    |    fps     CPU
-----------------------------+-------------------+------------------
NOP ()                       | 453.93     33 %   | 444.44     33 %
GradFun3 (smode=0, mask=0)   |  30.72     27 %   |  89.63     90 %
GradFun3 (smode=0)           |   7.38     25 %   |  26.74     99 %
GradFun3 (smode=1, mask=0)   |   7.96     78 %   |   9.53     97 %
GradFun3 (smode=1)           |   4.38     55 %   |   7.65     97 %
GradFun3 (smode=2, mask=0)   |  10.03     26 %   |  32.80     98 %
GradFun3 (smode=2)           |   4.93     25 %   |  17.25     99 %
dfttest (...)                |   8.87     85 %   |  10.11     98 %
GradFun2DBmod ()             |   5.19     25 %   |  18.95     96 %

So, with the default mask, GradFun3(smode=1) is 3.5 times slower than GradFun3(smode=0) in 4x MT mode, and just 1.7 times slower in single-threaded mode.
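
For anyone who wants to compare their own numbers against this benchmark, the slowdown ratios can be recomputed from the table. The following is only a small sketch in Python (not part of the AviSynth script); the fps values are copied from the default-mask rows of the table and from the 1080p figures reported earlier in the thread, and the names `fps` and `slowdown` are made up for the illustration.

```python
# fps figures copied from the benchmark table (1280x720 input, default mask)
fps = {
    "single": {"smode=0": 7.38, "smode=1": 4.38},
    "multi_x4": {"smode=0": 26.74, "smode=1": 7.65},
}

def slowdown(mode):
    """How many times slower smode=1 is than smode=0 in the given mode."""
    return fps[mode]["smode=0"] / fps[mode]["smode=1"]

for mode in fps:
    print(f"{mode}: smode=1 is {slowdown(mode):.2f}x slower")

# The 1080p figures reported by TheProfosist, for comparison
reported_1080p = 1.87 / 0.12
print(f"reported 1080p slowdown: {reported_1080p:.1f}x")
```

The reported 1080p slowdown is several times larger than anything measured at 720p here, which is why it looks like something other than the plain cost of smode=1.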

I'll check later if I get very different figures for 1080p input.

Dogway

19th July 2011, 14:02

Another error: 1434x1476 dimensions for Dither_resize16 resizing to half. I hope you get these kinds of problems sorted; sorry I can't help more.

cretindesalpes

19th July 2011, 21:28

Another error: 1434x1476 dimensions for Dither_resize16 resizing to half. I hope you get these kinds of problems sorted; sorry I can't help more.
I can't reproduce it. Is 1434x1476 the input size? Halving 1434 gives 717 which is odd. Could you please post the script?

Yellow_

19th July 2011, 23:14

I'm using Dither to go from 8bit h264AVC to 16bit OpenEXR image sequences but really those should be linear not gamma encoded, the compositing application I'm importing them into assumes linear. So does the Dither_y_gamma_to_linear function serve this purpose?

Is it only necessary to linearize the luma? Is it more accurate and technically correct to do it that way, before the conversion to RGB, rather than applying a typical 0.45 reverse gamma to all channels of the RGB data after conversion?

Also as the source was encoded with a BT709 transfer curve not sRGB 2.2, to undo that ie: linearize YCbCr, I should be assuming something like the reciprocal of 2.35? If I understand correctly this helps prevent compressing shadow detail that can occur applying 0.45 to BT709 source?

Dogway

20th July 2011, 03:22

This is the code I'm testing with:
Interleave(showred("yv12"), showgreen("yv12"), showblue("yv12")).Dither_convert_8_to_16
Dither_resize16(1434/2,1476/2,kernel="spline36",y=3,u=1,v=1)

The error is very ugly, I thought Dither_resize did some kind of rounding. Anyway in the LinearResize function I changed target dimensions to (w%2+w,h%2+h). Nice to know it wasn't THAT error again.
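
The (w%2+w, h%2+h) workaround simply bumps an odd dimension up to the next even number. A tiny Python sketch of that arithmetic for the dimensions in this post (the helper name `round_up_to_even` is invented for the illustration, it is not a Dither function):

```python
def round_up_to_even(n):
    """Same arithmetic as the (w%2+w, h%2+h) workaround: add 1 to odd values."""
    return n + n % 2

src_w, src_h = 1434, 1476
half_w, half_h = src_w // 2, src_h // 2   # 717 x 738; 717 is odd
target = (round_up_to_even(half_w), round_up_to_even(half_h))
print(target)  # (718, 738)
```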

As a last question, I'm forcing Dither_y_gamma_to_linear to process in PC range to have more data to work with. Do you think this is more detrimental than beneficial?

cretindesalpes

21st July 2011, 08:29

Dither 1.9.5 (http://forum.doom9.org/showthread.php?p=1386559#post1386559) released:

Bug fixed in Dither_resize16(), displaying green bars when SSE2 optimisations are disabled. Better check of stack16 clip dimensions.

As a last question, I'm forcing Dither_y_gamma_to_linear to process in PC range to have more data to work with. Do you think this is more detrimental than beneficial?
TV range is all about keeping headroom for the ringing caused by some filters (resizers, sharpeners…), so with PC range this headroom is lost. But this is not a big difference, and the bottom headroom is cleared by the gamma/linear conversion anyway.

the compositing application I'm importing them into assumes linear. So does the Dither_y_gamma_to_linear function serve this purpose?
Yes.

Is it only necessary to linearize the luma? Is it more accurate and technically correct to do it that way, before the conversion to RGB, rather than applying a typical 0.45 reverse gamma to all channels of the RGB data after conversion?
GammaYUV -> linearYUV -> linearRGB will give you wrong colors. Inverse colorspace conversions should be done in the same gamma/linearity as the forward conversion. For gamma YUV to linear RGB conversion you can use:

Dither_convert_yuv_to_rgb (output="rgb48y")
Dither_y_gamma_to_linear (tv_range_in=false, tv_range_out=false)
Dither_convey_rgb48_on_yv12 (
\ SelectEvery (3, 0),
\ SelectEvery (3, 1),
\ SelectEvery (3, 2) )

Also as the source was encoded with a BT709 transfer curve not sRGB 2.2, to undo that ie: linearize YCbCr, I should be assuming something like the reciprocal of 2.35? If I understand correctly this helps prevent compressing shadow detail that can occur applying 0.45 to BT709 source?
You're right, it appears that BT709 doesn't use the same transfer curve as sRGB; the slope at 0 in sRGB is much steeper than in BT709. If you want to linearize the R'G'B' components converted from the BT709 Y'CbCr, apply to each one: R = R'/4.5 if R' < 0.081, or R = ((R' + 0.099) / 1.099) ^ (1/0.45) if R' ≥ 0.081. The formulas have the same form as in sRGB, but with different constants. I will add the BT709 mode to the Dither conversion functions in a future release.
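
To make the piecewise formula concrete, here is a minimal Python sketch of it for a normalized component in [0, 1]. It only illustrates the math quoted in this post, not how Dither implements or will implement it; the function names are invented, and the pure 2.2 power curve is an assumed stand-in for the "typical reverse gamma", shown for comparison only.

```python
def bt709_to_linear(v):
    """Inverse BT.709 transfer curve for a normalized component v in [0, 1]."""
    if v < 0.081:
        return v / 4.5                          # linear segment near black
    return ((v + 0.099) / 1.099) ** (1 / 0.45)  # power segment

def pure_gamma_to_linear(v, gamma=2.2):
    """Plain power-law decoding (assumed gamma 2.2), for comparison only."""
    return v ** gamma

for v in (0.0, 0.05, 0.081, 0.5, 1.0):
    print(f"{v:.3f} -> BT.709 {bt709_to_linear(v):.5f}, "
          f"pure 2.2 {pure_gamma_to_linear(v):.5f}")
```

In the shadows the BT.709 curve returns noticeably larger linear values than the pure power law, which matches the concern about shadow detail being compressed when a plain gamma is applied to a BT709 source.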

SilaSurfer

21st July 2011, 19:40

Hey cretindesalpes. Just wanted to stop by and say outstanding work. Finished off an encode using your ordered dithering in my filter chain. Awesome results. Thanks again for all of your hard work.

Dogway

21st July 2011, 23:48

Hey, I just needed to use AddBorders on a stacked 16-bit clip, so I thought I'd share the function (primitive but useful). Thanks for the new version too!

Function Dither_addborders16 (clip src, int "left", int "top",
\ int "right", int "bottom")
{
left = Default (left, 0)
top = Default (top, 0)
right = Default (right, 0)
bottom = Default (bottom, 0)

src
msb = crop(0,0,width,height/2).addborders (left, top, right, bottom)
lsb = crop(0,height/2,width,height/2).addborders (left, top, right, bottom)

StackVertical (msb, lsb)
}

btw I'm also looking forward to the v&c implementation

mandarinka

22nd July 2011, 00:08

Hey cretindesalpes. Just wanted to stop by and say outstanding work. Finished off an encode using your ordered dithering in my filter chain. Awesome results. Thanks again for all of your hard work.

Indeed. Dither brought quite some fresh momentum into the abilities of avisynth, single-handedly. Thanks from me too.

Alek93j

22nd July 2011, 21:49

Function zzz_denoise (clip src, float "sigma", int "thr", bool "mask", int "sad")
{
sigma = Default (sigma, 16)
thr = Default (thr, 5)
mask = Default (mask, False)
sad = Default (sad, 200)

w = src.Width ()
h = src.Height ()

# Motion analysis
super = MSuper (src)
super_a = MSuper (src.TTempSmooth ().RemoveGrain (12))

fwd_vect_3 = super_a.MAnalyse (isb=false, delta=3, overlap=4)
fwd_vect_2 = super_a.MAnalyse (isb=false, delta=2, overlap=4)
fwd_vect_1 = super_a.MAnalyse (isb=false, delta=1, overlap=4)
bck_vect_1 = super_a.MAnalyse (isb=true, delta=1, overlap=4)
bck_vect_2 = super_a.MAnalyse (isb=true, delta=2, overlap=4)
bck_vect_3 = super_a.MAnalyse (isb=true, delta=3, overlap=4)

fwd_comp_2 = src.MCompensate (super, fwd_vect_2, thSAD=sad)
fwd_comp_1 = src.MCompensate (super, fwd_vect_1, thSAD=sad)
bck_comp_1 = src.MCompensate (super, bck_vect_1, thSAD=sad)
bck_comp_2 = src.MCompensate (super, bck_vect_2, thSAD=sad)

# Spatio-temporal denoising using modified dfttest
c_dft = Interleave (fwd_comp_2, fwd_comp_1, src, bck_comp_1, bck_comp_2)
c_dft = c_dft.dfttest (sigma=sigma, lsb=true) # Double height
c_dft = c_dft.SelectEvery (5, 2)

# Temporal-only denoising using modified MDegrain
c_deg = src.MDegrain3 (super, bck_vect_1, fwd_vect_1, bck_vect_2, fwd_vect_2, bck_vect_3, fwd_vect_3, thSAD=sad, lsb=true) # Double height

# Spatio-temporal denoising smoothes too much the details,
# therefore we use pure temporal denoising on edges or detailed areas.
edge_src = c_deg.Crop (0, 0, w, h)
edge_mask = edge_src.mt_edge (mode="prewitt", thY1=thr, thY2=thr)
edge_mask = edge_mask.mt_expand ()
edge_mask = StackVertical (edge_mask, edge_mask) # Double height
c_hyb = mt_merge (c_dft, c_deg, edge_mask, luma=true, y=3, u=3, v=3)

return (mask ? edge_mask.GreyScale () : c_hyb)
}

Maybe I'm about to say something stupid, but this line:
super = MSuper (src)
is used only by MCompensate and MDegrain3, which don't need levels=0 (if I understood right). So could it be:
super = MSuper (src,levels=1)
to get the same result with a small speed increase?

mswaino2

23rd July 2011, 07:13

I'm getting crazy slow speeds with smode=1 as well

smode=0 1.87FPS
smode=1 0.12FPS
this is with 1080p

Seems I wasn't the only one who had this problem. I guess I'll just have to wait it out and hope it gets faster/fixed in time.

cretindesalpes

23rd July 2011, 08:45

so, it would be:
super = MSuper (src,levels=1)
to get the same result with a small speed increase?
Yes, you're right. No need for all levels in MDegrain and MCompensate.

Seems I wasn't the only one who had this problem. I guess I'll just have to wait it out and hope it gets faster/fixed in time.
Have you run the test in post #211 (http://forum.doom9.org/showthread.php?p=1514203#post1514203)? How does it compare to my results? I don't know where your problem comes from, so I need a bit more information.

Hey, I just needed to use AddBorders on a stacked 16-bit clip, so I thought I'd share the function (primitive but useful).
[...]
btw I'm also looking forward to the v&c implementation
Thank you. I added it to the next Dither release, but with the LSB part being a true 0.
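
A short Python sketch of why the LSB half matters for the border. It assumes the stacked layout used throughout this thread (16-bit value = MSB * 256 + LSB) and, as an assumption about AddBorders, that the default black border is written as 8-bit luma 16 into whatever plane it is given; the names are invented for the illustration.

```python
def stacked_value(msb, lsb):
    """Combine the two 8-bit halves of a stacked 16-bit sample."""
    return msb * 256 + lsb

BORDER_8BIT = 16   # assumed fill value AddBorders writes into a luma plane

# Same 8-bit fill in both halves, as in the function posted above
naive = stacked_value(BORDER_8BIT, BORDER_8BIT)
# LSB half forced to a true 0
clean = stacked_value(BORDER_8BIT, 0)

print(naive, clean, naive - clean)   # 4112 4096 16
```

Under those assumptions the posted version makes the border slightly brighter than exact 16-bit black, which is a very small error but easy to avoid.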
What do you mean by "v&c"?

SilaSurfer & mandarinka: thank you for your kind words.

Dogway

23rd July 2011, 16:45

What do you mean by "v&c"?

Void & Cluster algo (http://forum.doom9.org/showthread.php?p=1513101#post1513101). I also recall reading something related to it a few months ago, but I can't talk since I have no idea on dither algorithms. Did you test it?

cretindesalpes

23rd July 2011, 17:21

OK. No, I haven't tested it yet.

Dogway

24th July 2011, 17:16

I had some time to look at making nnedi work with Dither. I also wanted to contrasharpen after dfttest and before nnedi, so I mixed both workarounds. Can you confirm it is correct?
I also added a mask for nnedi because it was introducing some banding/artifacts in flat/gradient areas...

This is ultimately going to be processed by smdegrain(lsb=true) so I dither it down to mode=6. Is this ok, or should I use mode=-1?
raw=last

predf=dfttest(sigma=10,tbsize=1,lsb=true)
post=predf.ditherpost(mode=-1)
contr=Contrasharpening(post,raw).LSFmod(defaults="slow",strength=30,edgemode=1,soothe=true,ss_x=1.0,ss_y=1.0)

spl=contr.Spline36Resize(640,360)
nn=contr.nnedi3_rpow2(rfactor=2,cshift="spline64resize",nns=4,qual=2,pscrn=4)
hop=mt_edge (mode="prewitt", thY1=10, thY2=30).mt_expand.BilinearResize(640,360)
nn=mt_merge(spl,nn,hop, luma=true, y=3)

mask=DitherBuildMask(contr, post)
sharped=Dither_merge16_8(predf,contr.Dither_convert_8_to_16,mask).Dither_resize16(640,360,kernel="spline36")

mask2=DitherBuildMask(nn,spl)
Dither_merge16_8(sharped,nn.Dither_convert_8_to_16,mask2)

DitherPost(mode=6,prot=true)

SilaSurfer

24th July 2011, 18:14

Error diffusion modes (6, 7, 8) are awesome when using high bitrates; otherwise I use mode 0 (8-bit ordered dither + noise, Bayer matrix), except that I don't use the ampn setting but rather ampo (~0.5) when using higher compression.

SSH4

24th July 2011, 20:59

v&c looks great but "This technique is covered by Ulichney’s U.S. patent 5535020 and the specific implementation we showed is partly covered by Epson’s U.S. patent 6088512."
But I hope you don't care about this. :)

atra dies

25th July 2011, 03:22

I'm testing some images from here:
http://www.4p8.com/eric.brasseur/gamma.html

ImageReader("C:\gamma_colors.jpg")

Dither_convert_rgb_to_yuv(matrix="601",tv_range=false,lsb=false)
Dither_convert_8_to_16()
Dither_y_gamma_to_linear(false,false)
Dither_resize16(128,192)
Dither_y_linear_to_gamma(false,false)
DitherPost(mode=-1)

What am I doing wrong? The page claims they come out wrong but the one at the bottom says the scaling software rules. Also the NASA big lights from space image comes out with the lights less yellow than the original.

I modified the linear to gamma function to process chroma but that didn't change anything. Is chroma on the power scale too? Is this plugin built just for yv12 or should it work for rgb the same way? It could be something I don't know about because I haven't read much about it.

cretindesalpes

25th July 2011, 08:12

Can you confirm it is correct?
Yes, at first glance it looks correct, but it is becoming a bit complicated to follow... Anyway, I would have extracted the "hop" edge mask from the nnedi result rather than from the original picture, although I don't know if it makes a significant difference.

This is ultimately going to be processed by smdegrain(lsb=true) so I dither it down to mode=6. Is this ok, or should I use mode=-1?
Definitely not mode=-1, you'd lose all the benefits of this complex script. Mode 6 looks like the most appropriate dither if you're going to filter the clip again, because you need the maximum quality at this intermediate stage and you don't want the motion estimation to be tricked by the regular patterns of an ordered dithering.

What am I doing wrong? The page claims they come out wrong but the one at the bottom says the scaling software rules. Also the NASA big lights from space image comes out with the lights less yellow than the original.
This is the same problem as with the colorspace conversions mentioned a few posts above. Ideally, for a color sRGB picture, linear/gamma scaling should be done on the RGB components, not on the converted Y signal only. This modified piece of code should work as expected:
ImageReader ("gamma_colors.jpg")

Interleave (ShowRed ("YV12"), ShowGreen ("YV12"), ShowBlue ("YV12"))
Dither_convert_8_to_16()
Dither_y_gamma_to_linear(false,false) # u=1, v=1
Dither_resize16(128, 192, u=1, v=1)
Dither_y_linear_to_gamma(false,false) # u=1, v=1
DitherPost(mode=-1, u=1, v=1)
MergeRGB (SelectEvery (3, 0), SelectEvery (3, 1), SelectEvery (3, 2))
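
To see why the scaling has to happen in linear light, consider what a downscaler does when it averages a black and a white pixel. The Python sketch below is only an illustration of the principle: it uses a pure 2.2 power law as an assumed approximation of the sRGB curve, and all function names are invented.

```python
GAMMA = 2.2  # assumed pure power-law approximation, not the exact sRGB curve

def to_linear(v8):
    """Gamma-encoded 8-bit value -> linear light in [0, 1]."""
    return (v8 / 255.0) ** GAMMA

def to_gamma(lin):
    """Linear light in [0, 1] -> gamma-encoded 8-bit value."""
    return round(255.0 * lin ** (1.0 / GAMMA))

def average_gamma_space(a, b):
    """Average the stored 8-bit values directly (what a naive resizer does)."""
    return round((a + b) / 2)

def average_linear_light(a, b):
    """Linearize, average, then re-apply the gamma."""
    return to_gamma((to_linear(a) + to_linear(b)) / 2)

print(average_gamma_space(0, 255))   # 128
print(average_linear_light(0, 255))  # 186
```

The naive average comes out much darker than the linear-light one, which is the kind of shift the test images on that page are designed to expose.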

I modified the linear to gamma function to process chroma but that didn't change anything. Is chroma on the power scale too?
Chroma depends on something on a power scale. The correct conversion is related to the initial RGB->YUV matrix. Keeping the chroma untouched is technically wrong, but the formula to fix it is complicated!

wOxxOm

26th July 2011, 09:10

pandy

26th July 2011, 11:39

is there any way to reduce bit depth with dither to less than 8 bits? (like 5 or 4 for embedded applications - quite frequently only 4 or 5 bits depth per component is supported)?

cretindesalpes

27th July 2011, 07:39

How about an option to have a static ampn noise? e.g. a negative value = static, or better an additional parameter like ampn_temp_avg=0..100 (similar to GrainFactory3) with -1 for a completely static noise.
You'll have static noise in the next Dither release. For a more fancy noise, I think it's better to provide a noise clip produced by an external source. I might implement that too.

is there any way to reduce bit depth with dither to less than 8 bits?
Yes. Reduce the signal before dithering and amplify it afterwards.
bits = 4 # Valid range: 0 - 8
mul = String (Pow (2, 8 - bits))

Dither_convert_8_to_16()
Dither_lut16 (expr="x "+mul+" /")
DitherPost ()
mt_lut ("x "+mul+" *", y=3, u=3, v=3)
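
The divide, quantize, multiply idea can be checked outside AviSynth with a few lines of Python. This is only a sketch of the arithmetic: plain truncation stands in for DitherPost (so there is no dithering at all here), and `reduce_bits` is an invented name.

```python
def reduce_bits(v8, bits):
    """Quantize an 8-bit value to `bits` bits, then scale back to 8-bit range."""
    mul = 2 ** (8 - bits)      # same factor as in the script above
    return (v8 // mul) * mul   # truncation stands in for DitherPost here

levels = sorted({reduce_bits(v, 4) for v in range(256)})
print(len(levels), levels[:4], levels[-1])   # 16 [0, 16, 32, 48] 240
```

With bits = 4 only 16 distinct output levels remain, spaced 16 apart, which is what a 4-bit display path can represent.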

SilaSurfer

28th July 2011, 18:31

Hello guys

cretindesalpes, what would be better in your opinion? I'm doing some heavyweight MVTools-style filtering on a 1080p Blu-ray source that is going to be encoded with x264. But my processor won't handle that kind of stress (time to get an i7 I guess :p). I was thinking of doing a lossless pass and then going to x264, but should I use your dithering before the lossless pass, or later on the lossless source before feeding it to x264? Thanks in advance

cretindesalpes

29th July 2011, 11:56

should I use your dithering before the lossless pass or later on lossless source before feeding it to x264? Thanks in advance

I think it doesn't matter much. DitherPost() is quite light on resource, compared with 1080p MC filtering. But if the output of your first lossless pass has a 16-bit depth, the generated file will be huge (compression of the LSB part is inefficient).

SilaSurfer

29th July 2011, 13:34

But if the output of your first lossless pass has a 16-bit depth, the generated file will be huge (compression of the LSB part is inefficient).

No, I was thinking of staying in 8 bits and then, with the lossless file as source, doing:

Dither_convert_8_to_16 ()
Smoothgrad if needed
DitherPost()

atra dies

30th July 2011, 01:43

Seems I wasn't the only one who had this problem. I guess I'll just have to wait it out and hope it gets faster/fixed in time.

I am getting super slow speeds with smode 1 and 2 (I think 2 is supposed to be the highest quality, so I would try that one instead). I tried lowering the radius from 13 to 1 and got an error saying valid values are 2-64, yet it wouldn't accept anything below 6. With the lowered radius it runs at twice the speed, so I would try lowering the radius and seeing if you get acceptable results.

I am using it to smooth horrible compression artifacts, bands of blocks for flat animation colors and nothing removes it so completely as gradfun3 .66 strength and I tried a lot of filters and pp settings. For non-animation I have found that smoothd works wonders at removing lots of noise but blurs the picture, set adaptive mode to 1. Seems promising if someone used that principle for another filter let me know. (Edit: sorry, should have tried mdegrain2, it is great but slow and one of the mod16 filters.)

Now I have to figure out where or if levels or regular denoisers can be done on linear gamma for most accurate results?

Dogway

31st July 2011, 10:47

Definitely not mode=-1, you'd lose all the benefits of this complex script. Mode 6 looks like the most appropriate dither if you're going to filter the clip again, because you need the maximum quality at this intermediate stage and you don't want the motion estimation to be tricked by the regular patterns of an ordered dithering.

I kept wondering... smdegrain, or actually any mdegrain that uses motion vectors, is going to denoise only static or semi-static areas. This way moving areas are going to keep the Floyd-Steinberg dithering, which consumes lots of bitrate. Do you think manalyse is really affected by the ordered dithering if I use mode=0 instead of mode=6?
Anyway, this is about the easy way. I guess I can do something like: run motion analysis on a mode=6 dummy, degrain the mode=0 clip, and implement this inside the smdegrain function as a "fake" lsb_in.

In my last post I also wanted to tell you about something I observed. I didn't fully test it, so I was unsure whether to comment, but you may want to check the fh and fv parameters of Dither_resize16 for the next version. I think that on some occasions I couldn't get sharper results, maybe with small sources, downsizing or upsizing, I can't remember. Can I assume >1.0 = sharper is always true?

@atra dies: Lately I discovered how effective dfttest can be at deblocking. This is a known fact, but the tricky thing I found is that deblocking is more effective with tbsize=1 (no temporal). Try dfttest(tbsize=1,sigma=10,lsb=true).ditherpost. I'm very surprised (using it every day). It blurs a little, but this is unavoidable, although you can (contra)sharpen afterwards. sigma=10 looks like a fixed value; you won't likely deblock more with higher values.

Heaud

31st July 2011, 21:10

How about an option to have a static ampn noise? e.g. a negative value = static, or better an additional parameter like ampn_temp_avg=0..100 (similar to GrainFactory3) with -1 for a completely static noise.

For now you can achieve static noise on the post dithered content with addgrainc(constant=true). Set the strength to 0.8 as a good starting point and then adjust to taste.

wOxxOm

31st July 2011, 21:23

For now you can achieve static noise on the post dithered content with addgrainc(constant=true). Set the strength to 0.8 as a good starting point and then adjust to taste.
Actually for some types of content (and for most cartoons similar to anime) this is undesirable since the constant grain over the whole frame will be perceived as physical dust on a display device, visible everywhere except very bright areas during any panorama shift or character movement.

Adding of grain should be performed only in areas altered by DitherPost, and proportionally of course, so that there wouldn't be any abrupt grain density changes. Currently I do so by using a motion-mask, so that moving areas get moving grain applied and still areas get constant grain, with the final result applied only to areas altered by GradFun3.

SilaSurfer

1st August 2011, 13:40

Adding of grain should be performed only in areas altered by DitherPost, and proportionally of course, so that there wouldn't be any abrupt grain density changes. Currently I do so by using a motion-mask, so that moving areas get moving grain applied and still areas get constant grain, with the final result applied only to areas altered by GradFun3.

wOxxOm, could you post a sample of this script? I'm interested in trying it out.

wOxxOm

1st August 2011, 14:34

well it might need some tweaking to make it usable for your type of content, but even in this state it might be handy to conceal dither pattern of GradFun3 in default mode=0. Requires variableblur plugin (http://avisynth.org/mediawiki/Variableblur).

#your GradFun3 settings
vGF3=GradFun3(0.5,8,2,1,thr_det=3,thr_edg=20,dthr=0.05,debug=0,smode=1)

#build motion mask
blk=8
vMA=msuper(rfilter=1,hpad=blk,vpad=blk).manalyse(blksize=blk)
vSC=MSCDetection(vMA,thscd1=300).mt_invert()
vMotionMask=mmask(vMA,ml=10,kind=0).mt_lut(y=0,w=blk).mt_lut(y=0,h=blk).mt_lut(y=0,offx=width-blk,w=blk).mt_lut(y=0,offy=height-blk)
vMotionMask=vMotionMask.mt_logic(vMotionMask.trim(1,0),"or").mt_logic(vSC,"and").mt_logic(vSC.trim(1,0),"and")
vMotionMask=vMotionMask.bilinearresize(ceil(width/blk/2)*2,ceil(height/blk/2)*2).mt_binarize(16)
vMotionMask=vMotionMask.removegrain(4).mt_expand().removegrain(11).bicubicresize(width,height).binomialblur(200,u=1,v=1)

#create motion-adaptive grain
vGray=mt_lut(y=-128,u=-128,v=-128)
vGrain=vGray.addgrainC(1,0,0,0,-1,true).mt_merge(vGray.addgrainC(1,0,0,0,-1,false),vMotionMask,true)

#grain will be added to dark areas only
vGrainMask=mt_lut("255 x 32 - 2 << 255 / 2 ^ 255 * -")
#contract mask down to areas touched by GF3 only, apply extreme blur to feather edges of the mask
vGrainMask=vGrainMask.mt_logic(mt_lutxy(vGF3,last,"x y - abs 8 << ").mt_expand().binomialblur(200,u=1,v=1),"min")

#apply grain to GF3
vGF3.mt_merge( vGF3.mt_adddiff(vGrain,u=2,v=2), vGrainMask,true )

SilaSurfer

1st August 2011, 15:12

Thanks man. ;)

upyzl

3rd August 2011, 16:09

Excuse me

Does dither.dll & dither.avsi support x64 now?
These are pretty nice tools and I want to use them for 10-bit x264 encoding

atra dies

6th August 2011, 21:46

I am using it to smooth horrible compression artifacts, bands of blocks for flat animation colors and nothing removes it so completely as gradfun3 .66 strength and I tried a lot of filters and pp settings. For non-animation I have found that smoothd works wonders at removing lots of noise but blurs the picture, set adaptive mode to 1. Seems promising if someone used that principle for another filter let me know. (Edit: sorry, should have tried mdegrain2, it is great but slow and one of the mod16 filters.)

Make that .41 strength and default radius (defaults are good). Took a better look at it with histogram "luma". I would like to apply only to Y. Is there a command to show Y or U or V separately in avisynth?

The live video problem was 16mm grain. I would use ttempsmoothf but I see banding in the motion parts (in 8bit land). I tried all kinds of denoisers but I won't accept softening instead of grain. I usually deblock live stuff with cpu=6 or cpu2 on just the luma but this also leads to banding and what would I use on that? haha

Dogway

8th August 2011, 19:14

Following up on my last post.
I haven't done any tests yet, but I'm sketching some theories to put into practice later and see the results.
edit: updated with results. This is with the ffv1 lossless codec; I'm unsure whether ordered dithering is rather targeted at h.264 encodes, but the results are more or less as expected.

EX01
ditherpost(mode=6)
+
mdegrain in mode=0
theory = consumes lots of bitrates because of mode=6 left overs
result = to my surprise this one was the best in compressibility o_O!
clip = 48.37Mb (http://www.mediafire.com/?x49ww3sggh7mx62)

EX02
ditherpost(mode=0)
+
mdegrain in mode=0
theory = motion estimation is(could be) altered
result = This was the worst for compressibility. It showed heavy overlapping ordered dithering
clip = 50.49Mb (http://www.mediafire.com/?hdj8djbd8kj6dq2)

EX03
ditherpost(mode=0)
+
motion estimation of ditherpost(mode=6) and mdegrain in mode=0
theory = ideal but probably overlapping ordered dithering, plus denoising "ordered dither" which is not recommended.
result = Looks like motion estimation has some effect when mode=0, so this compresses better than EX02. Actually the results are closer to EX01 than EX02 !O_o
clip = 49.08Mb (http://www.mediafire.com/?bssfx4opk4tkuuv)

EX04
ditherpost(mode=0)
+
mdegrain in mode=0 over mode=6 dummy
+
Ditherbuildmask workaround
theory = paired with motion estimation over the mode=6 dummy this could work.
result = The thinking behind stays true but the code is a bit different, see the code box below.
clip = 48.4Mb (http://www.mediafire.com/?0vbtb8bp42pmpp1)

o=dfttest(tbsize=1,sigma=10,lsb=true)
la2=o.DitherPost(mode=6)
la1=o.DitherPost(mode=-1)

super = la2.MSuper(pel=2, sharp=2) #hpad=0,vpad=0, for more speed
b3vec = super.MAnalyse(isb = true, delta = 3, overlap=2, blksize=8, search=4)
b2vec = super.MAnalyse(isb = true, delta = 2, overlap=2, blksize=8, search=4)
b1vec = super.MAnalyse(isb = true, delta = 1, overlap=2, blksize=8, search=4)
f1vec = super.MAnalyse(isb = false, delta = 1, overlap=2, blksize=8, search=4)
f2vec = super.MAnalyse(isb = false, delta = 2, overlap=2, blksize=8, search=4)
f3vec = super.MAnalyse(isb = false, delta = 3, overlap=2, blksize=8, search=4)

ms=la1.MDegrain3(super, b1vec, f1vec, b2vec, f2vec , b3vec, f3vec, thSAD=400, limit=255, limitc=255,lsb=true)
ms2=la2.MDegrain3(super, b1vec, f1vec, b2vec, f2vec , b3vec, f3vec, thSAD=400, limit=255, limitc=255,lsb=true)

msk=DitherBuildMask(la1,ms.ditherpost(mode=-1))
Dither_merge16_8(o,ms2,msk)

DitherPost(mode=0)

Conclusion: EX04 could possibly be better than EX01, but the differences are too small to judge fairly, plus processing time would be longer and compressibility reduced. So I'm more or less back where I started, still wondering. : /

I decided to spatially analyse the dither, checking with the used masks:
http://img833.imageshack.us/img833/8471/mdegrainmask.th.png (http://img833.imageshack.us/img833/8471/mdegrainmask.png)http://img508.imageshack.us/img508/1300/ex04gammeddfttestmdegra.th.png (http://img508.imageshack.us/img508/1300/ex04gammeddfttestmdegra.png)http://img202.imageshack.us/img202/8324/ex04gammed.th.png (http://img202.imageshack.us/img202/8324/ex04gammed.png)http://img839.imageshack.us/img839/1351/ex01gammed.th.png (http://img839.imageshack.us/img839/1351/ex01gammed.png)

What I see: This is probably not the best example, as there is little motion (shown by the mdegrain mask). The relevant "mask" to check is the one in the 2nd picture: the enclosed regions belong to the pure dfttest parts, while the rest is mdegrain. Here you can see, especially on the cloak, how error diffusion changes to ordered dithering, making it ideal for encoding.
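To make the ordered dithering vs. error diffusion distinction above a bit more concrete, here is a rough 1-D sketch in Python. It is not the Dither plugin's code: the threshold pattern and both helper functions are hypothetical stand-ins. It only shows that an ordered pattern depends on pixel position and value alone, while error diffusion also depends on what came before, which may be part of why the two behave differently once a denoiser and an encoder get involved.

```python
# Sketch only: 1-D stand-ins for ordered dithering and error diffusion.
# Not the Dither plugin's code; the threshold pattern is an assumption.

def ordered_dither(row16):
    """Round 16-bit samples to 8 bits using a fixed, repeating threshold pattern."""
    thresholds = [32, 160, 96, 224]  # hypothetical 4-step pattern, in LSB units
    out = []
    for x, v in enumerate(row16):
        msb, lsb = divmod(v, 256)
        out.append(min(255, msb + (1 if lsb > thresholds[x % 4] else 0)))
    return out

def error_diffusion(row16):
    """Round 16-bit samples to 8 bits, carrying each rounding error to the next sample."""
    out = []
    err = 0
    for v in row16:
        target = v + err
        q = max(0, min(255, (target + 128) // 256))
        err = target - q * 256
        out.append(q)
    return out

# A flat area a quarter of the way between 8-bit levels 100 and 101.
flat = [100 * 256 + 64] * 16
od = ordered_dither(flat)
ed = error_diffusion(flat)

# Same area, but the first sample is different (e.g. a bit of leftover noise).
perturbed = [100 * 256 + 200] + flat[1:]
od_p = ordered_dither(perturbed)
ed_p = error_diffusion(perturbed)

print(od)
print(ed)
print(od_p[1:] == od[1:])  # ordered pattern is unaffected by the neighbour
print(ed_p[1:] == ed[1:])  # error diffusion pattern shifts
```

Both methods preserve the average level of the flat area; the difference is only in how stable the pattern is when nearby samples change.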

06_taro

9th August 2011, 01:45

Excuse me

Does dither.dll & dither.avsi support x64 now?
They're nice tools and I want to use them for 10-bit x264 encoding

You can find the answer here (http://forum.doom9.org/showthread.php?p=1432866#post1432866) and here (http://forum.doom9.org/showthread.php?p=1505297#post1505297).

SSH4

10th August 2011, 23:25

I thought about upscaling a source after the 16-bit dfttest mod. It would be great to dither the 16-bit data after upscaling, so that the dithering itself does not get upscaled.
I played a little with 16-bit MSB/LSB data and upscaling with nnedi3, and found that it is not a good idea for the moment. The LSB part has aliased edges where the value drops from 255 to 0, and nnedi3 and most other scalers smooth them, so after ditherpost() we get a worse result :(
Is there another way to simulate 16-bit data in AviSynth, without MSB/LSB and the sharp edges in the LSB?
My brain isn't working and I can't think of anything %)
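The LSB wrap-around problem described above can be shown with a little arithmetic. This is only an illustrative Python sketch: plain linear interpolation stands in for a real resizer, and the `split` and `join` helpers are made up for the example.

```python
# Sketch only: why filtering the MSB and LSB planes independently breaks
# 16-bit data. Linear interpolation stands in for a real resizer.

def split(v):
    """Split a 16-bit value into (MSB, LSB) bytes."""
    return v >> 8, v & 0xFF

def join(msb, lsb):
    """Rebuild a 16-bit value from its two bytes."""
    return (msb << 8) | lsb

# Two neighbouring samples, 32 steps apart, straddling an MSB boundary.
a, b = 0x01F0, 0x0210

# Interpolating the real 16-bit values gives the expected midpoint.
true_mid = (a + b) // 2

# Interpolating each plane as if it were an ordinary 8-bit image.
(a_msb, a_lsb), (b_msb, b_lsb) = split(a), split(b)
mid_msb = (a_msb + b_msb) // 2
mid_lsb = (a_lsb + b_lsb) // 2
naive_mid = join(mid_msb, mid_lsb)

print(hex(true_mid), hex(naive_mid))
```

The naive midpoint lands outside the range of the two original samples, because the LSB jump from 0xF0 to 0x10 is a wrap-around and not a real edge. A resizer that works on the combined 16-bit value avoids this.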

cretindesalpes

11th August 2011, 01:18

You can resize stacked 16-bit MSB/LSB data with Dither_resize16(). Probably not as sharp as nnedi3, but it works.
It's also possible to combine 8-bit nnedi3 with Dither_resize16(). For example:

# 16 bit input

nw = 1280 # new width
nh = 960 # new height

upnn8 = DitherPost (mode=-1)
upnn8 = upnn8.nnedi3_rpow2 (rfactor=2, fwidth=nw, fheight=nh, cshift="Spline36Resize")
upnn16 = upnn8.Dither_convert_8_to_16 ()

Dither_resize16 (nw, nh, kernel="bicubic")

last.Dither_limit_dif16 (upnn16, thr=1.0, elast=2.0)
DitherPost ()

TheProfosist

11th August 2011, 18:45

Right now I'm trying to use:

Dither2Pre (flt="FFT3DGpu(Sigma=4)")
SmoothGrad (radius=16, thr=0.25, elast=2)
DitherPost ()

but it throws an error that asks me to debug:
http://img.photobucket.com/albums/v519/TheProfosist/doom9/Unexpectederrorencountered_2011-08-11_12-11-21.png

I would like to use dither+smoothgrad with my current plugins as well as possible. Here is a sample of the filters I normally use in a script:
#RemoveGrain()
#RemoveGrainHD()

#crop(0,0,-0,-0,align=true)

#FFT3dGPU(sigma=0.2, precision=2)
#FFT3DFilter(sigma=0.2, ncpu=4)

#EdgeCleaner()

#DAA()
#MAA()

#LSFmod(strength=20, preblur="OFF", ss_x=2.0, defaults="slow")

#DeHalo_alpha(darkstr=1.0, brightstr=1.0, ss=2.0)

#GradFun2DBmod()

I would like to keep that order if possible (not required).
I would like the script to be able to output 16-bit for 10-bit x264, but also 8-bit for 8-bit x264 (should be a one-line difference, DitherPost()?)

Since I don't think anyone would want to work on the whole script, I plan to get one filter working at a time, as I need them.

For now I just need Dither+SmoothGrad to output 8-bit and 16-bit for 8-bit and 10-bit x264 respectively.

cretindesalpes

11th August 2011, 19:44

Right now I'm trying to use:
Dither2Pre (flt="FFT3DGpu(Sigma=4)")

IIRC only one FFT3DGpu instance can run at a time. Dither2Pre instantiates many of them, so you cannot use it here. You can try the classic FFT3DFilter instead, but it will be very slow. A much better alternative would be dfttest(lsb=true) (no need for Dither2Pre).

I would like the script to be able to output 16-bit for 10-bit x264, but also 8-bit for 8-bit x264 (should be a one-line difference, DitherPost()?)
For 8/8 bits, use DitherPost() as usual. For 16/10 bits, replace DitherPost() with Dither_convey_yuv4xxp16_on_yvxx(). No further filtering will be possible past this line, so this should be the last one of the script. Then encode with the appropriate command line (see the Dither documentation).

Edit:

A more general way to insert 8-bit processing between 16-bit filters is the following:
dfttest (lsb=true) # or whatever generating stack16 data

# Insert 16-bit filters here

s16 = last
DitherPost (mode=-1)

# Insert 8-bit filters here

Dither_convert_8_to_16 ()
s16.Dither_limit_dif16 (last, thr=1.0, elast=2.0)

# Insert 16-bit filters again...

# Finally
DitherPost () # or Dither_convey_yuv4xxp16_on_yvxx ()

To work correctly, the 8-bit filters must leave the low-gradient or flat areas intact, i.e. touch only the edges, details or their immediate surroundings. It won't work with level/curve things (use SmoothAdjust in 16 bits instead).

Set thr and elast according to taste. You can also use masking instead of Dither_limit_dif16(), but this is a bit more work.
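For anyone wondering what thr and elast do in the merge step, here is a per-pixel Python sketch of a soft difference limiter. This is an assumption about the general idea, not the actual Dither_limit_dif16() implementation; the linear fade between thr and thr * elast in particular is a guess, so check the Dither documentation for the real curve. The `limit_dif` name and its arguments are hypothetical.

```python
# Sketch only: a soft "limit the difference" function in the spirit of the
# thr/elast parameters. The exact curve used by Dither_limit_dif16() is an
# assumption here; check the Dither documentation for the real definition.

def limit_dif(flt, ref, thr=1.0, elast=2.0):
    """Return flt where it stays close to ref, ref where it strays too far.

    Values are on an 8-bit scale; fractions stand in for 16-bit precision.
    """
    dif = flt - ref
    mag = abs(dif)
    lo, hi = thr, thr * elast
    if mag <= lo:   # small difference: keep the high-precision value
        return flt
    if mag >= hi:   # large difference: fall back to the reference
        return ref
    weight = (hi - mag) / (hi - lo)  # linear fade between the two thresholds
    return ref + dif * weight

# Flat area the 8-bit filters left alone: the 16-bit value survives.
flat_area = limit_dif(100.3, 100.0)

# Edge the 8-bit filters changed a lot: the 8-bit result wins.
edge_area = limit_dif(100.3, 110.0)

# In between: a blend of the two.
blend = limit_dif(100.0, 101.5)

print(flat_area, edge_area, blend)
```

With flt as the saved 16-bit clip and ref as the 8-bit-processed clip, this matches the requirement above: where the 8-bit filters barely touched the picture, the 16-bit data is kept, and where they made real changes, their result is used.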