Speeding HDRAGC up by a large margin


Ponder
8th May 2011, 23:24
I wanted to get the fantastic effects from HDRAGC, but it is 240% slower (CPU dependent) on DVD,
and much slower on HD. On some material the slowdown is 100% worthwhile, but in general it may not
be suitable. Based on the idea that side-by-side pixels are very much alike, a color mask can be
extracted and applied later with proper correction to get the speed-up.

Visually the result is close to identical (my goal), but a little sharper. The speed-up
is as follows:

On 1280x720, decoding is 60% faster, x264 is 38% faster.
Resizing 1280 to 1024 (if that is the desired final size): decoding 39% faster, x264 28% faster.
On DVD, x264 is 28% faster, with 7% larger size on long A.Flux clips. It sharpens the DVD a little;
on soft material it actually looks better. On short extreme-motion scenes, size can be 20% bigger.
Using undot or RemoveGrain(1) before and/or after helps shave several % off the size.
It is one of my favorite scripts now.

More improvements may be possible in size, speed, or even visually, since I have not tried parameters
from the great unfilter resizers, or limiter techniques using MaskTools. Any refinements are welcome.

Please share your CPU findings comparing pure HDRAGC and this Speed_up_HDRAGC script. It will be
interesting to see how dual core, quad core, cache size, Intel, and AMD all stack up.
The results above were tested on an E5300.

#LoadPlugin("f:\AviSynth 2.5\PLUGINS1\DGDecodeNanSSE2.dll")
LoadPlugin("F:\AviSynth 2.5\PLUGINS\SimpleResize.dll") # sharp, use for hd
#LoadPlugin("F:\AviSynth 2.5\PLUGINS1\BicublinResize.dll")#sharper than SimpleResize,good for dvd
MPEG2Source("b:\z.d2v",idct=2,cpu=0)
#ffdshow("default") #if want, set 1024 inside ffdshow's LanczosResize, 10% faster
#RemoveGrain(1)
source=last #To be safe, mod4 preferred
#width=1024
my_wid0=1280
#my_wid0=1024
ratio=my_wid0/4
mod4h=4*ROUND((HEIGHT*ratio)/width) #wrong mod may get ghosting
my_HEIGHT=mod4h
my_wid = (width > my_wid0) ? my_wid0 : width

a = source.SimpleResize(my_wid/2,my_HEIGHT/2)
a_up=a.SimpleResize(my_wid,my_HEIGHT)# (width,HEIGHT)
a_diff=mt_makediff(a_up,source)#get detail from unaltered source

a=a.HDRAGC(Corrector=0.7,Reducer=1,protect=1,MODE=1,PASSES=2)
b=a.SimpleResize(my_wid,my_HEIGHT) #(width,HEIGHT)
bri2=mt_makediff(b,a_up)
briten=mt_adddiff(bri2,source) # Add "brighten diff mask2" to unaltered source; sharp, adds details
mt_makediff(briten,a_diff)
#RemoveGrain(1)

hdr=source.HDRAGC(Corrector=0.7,Reducer=1,protect=1,MODE=1,PASSES=2)
stackhorizontal(hdr,last)
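
A note on the height calculation in the script: HEIGHT, ratio and width are all integers, and AviSynth truncates integer division, so Round() never sees a fraction. For a 720x480 source with my_wid0=1024, for example, the script's formula gives 680, where rounding to the nearest multiple of 4 would give 684. A minimal sketch of a floating-point version follows; Mod4Height is a hypothetical helper name, not something from the original script or any plugin.

```avisynth
# Hypothetical helper (not in the original post): height for a target width,
# keeping the source frame proportions and rounding to the nearest multiple of 4.
function Mod4Height(clip c, int target_width)
{
    # Float() forces floating-point division so Round() sees the fraction
    return 4 * Round(Float(c.Height()) * target_width / (4.0 * c.Width()))
}

# Possible use in the script, in place of the ratio/mod4h lines:
# my_HEIGHT = Mod4Height(source, my_wid0)
```

When the source width already equals my_wid0, as in the 1280x720 test, both versions give the same height.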

Ponder
25th May 2011, 18:37
Version 2: this one is even closer to HDRAGC than the previous method, and compresses more. x264
is about 3% slower. It is better for HD. The previous version is still better on DVD: it gets the
HDRAGC effect and enhances subtle details at the same time. If the material is grainy, you must
degrain first.

LoadPlugin("F:\AviSynth 2.5\PLUGINS1\BicublinResize.dll")# simpleResize is just as good
#ffdshow("default") #Use ffdshow's LanczosResize to resize initially, extremely fast
source=last
#RemoveGrain(1)
#width=1024
my_wid0=1280
#my_wid0=1024
ratio=my_wid0/4 #Mod8h may cause pixel overlap and a bright green edge; mod4 preferred
mod4h=4*ROUND((HEIGHT*ratio)/width)
my_HEIGHT=mod4h
my_wid = (width > my_wid0) ? my_wid0 : width
# Separate source into 2 parts. Pure gradient, and details
a = source.FastBicubicResize(my_wid/2,my_HEIGHT/2)
a_up=a.FastBicubicResize(my_wid,my_HEIGHT)
a_diff=mt_makediff(a_up,source) # Get back all details from source
a=a.HDRAGC(Corrector=0.7,Reducer=1,protect=1,MODE=1,PASSES=2,max_sat=1.35)
b=a.FastBicubicResize(my_wid,my_HEIGHT)
bri2=mt_makediff(b,a_up)
source1=mt_average(source,a_up)
mt_adddiff(bri2,source1)
briten=mt_adddiff(bri2,source)
mt_makediff(briten,a_diff)
RemoveGrain(18)

hdr=source.HDRAGC(Corrector=0.7,Reducer=1,protect=1,MODE=1,PASSES=2,max_sat=1.35)
hdr=hdr.Subtitle(String("original"), x=400, y=40, font="lucida console", size=20)
#ColorYUV(Analyze=True)
interleave(hdr,last) #see diff
#stackhorizontal(hdr,last)

Ponder
26th May 2011, 14:19
Correction: don't use RemoveGrain(18); it is just too strong. UnFilter(-22,-22) can be useful.
source1=mt_average(source.RemoveGrain(18),a_up) can be helpful, but does not look its best. Still working on a faster version.

Ponder
27th May 2011, 01:05
In version 2, I forgot to put a 1 after "source" on the briten line, and this made a heaven-and-earth
difference. briten=mt_adddiff(bri2,source) should be briten=mt_adddiff(bri2,source1) instead; also
use SimpleResize instead of FastBicubic.
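
To see why that one character matters, the chain can be worked through per pixel. This is only a sketch: it assumes the documented MaskTools v2 definitions mt_makediff(x,y) = x - y + 128, mt_adddiff(x,y) = x + y - 128 and mt_average(x,y) = (x + y)/2, and it ignores rounding and clipping to 0..255. The symbols are shorthand introduced here, not names from the script: s = source, u = a_up, b = the upsized HDRAGC clip, d = a_diff, m = bri2, r = briten, s_1 = source1.

```latex
\begin{aligned}
d &= u - s + 128, \qquad m = b - u + 128 \\[4pt]
\text{version 2 as posted:}\quad r &= m + s - 128 = b + (s - u) \\
\text{out} &= r - d + 128 = b + 2\,(s - u) \\[4pt]
\text{version 2a:}\quad s_1 &= \tfrac{1}{2}(s + u) \\
r &= m + s_1 - 128 = b + \tfrac{1}{2}(s - u) \\
\text{out} &= r - d + 128 = b + \tfrac{3}{2}(s - u)
\end{aligned}
```

Under those assumptions, version 2 as posted adds the detail layer s - u back at twice its original strength (the same as version 1) and its output never depends on source1, while version 2a adds the detail back at 1.5 times.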

Version 2a is much better than version 2.
1280p to 1280p: x264 is 45% faster; decoding to a 1.4GB Gavotte ramdrive with direct copy via
VirtualDub is 73%-77% faster.
In the version 1 test, x264 was 38% faster and decoding to the ramdrive was 60% faster.

Version 2a should be 7%-10% slower than version 1 on decoding because of mt_average's extra work.
It measured faster anyway because, in the version 1 test, the 1280p source was MPEG-2 and my old
MPEG decoder used only 1 CPU, whereas version 2a's source is x264, which DivX decodes with 2 CPUs,
hence faster. This means that if version 1 had used a multithreaded MPEG-2 decoder, I would expect
decoding at least 75%-ish faster than pure HDRAGC.

Size of version 2a versus pure HDRAGC on a short clip: 0.3% bigger on a clean source, and 6% bigger
on a source with light dancing grain. I'll call it Speedy HDRAGC, or SHDRAGC.
Speedy HDRAGC 1 was born about 8 weeks ago. Version 2a is now fully mature for HD: extremely
close to pure HDRAGC, but with more subtle details.

LoadPlugin("F:\AviSynth 2.5\PLUGINS\SimpleResize.dll")# simpleResize is visually better for HD
#ffdshow("default") #Use ffdshow's LanczosResize to resize initially, extremely fast
source=last
#RemoveGrain(1)
#width=1024
my_wid0=1280
#my_wid0=1024
ratio=my_wid0/4 #Mod8h may cause pixel overlap and a bright green edge; mod4 preferred
mod4h=4*ROUND((HEIGHT*ratio)/width)
my_HEIGHT=mod4h
my_wid = (width > my_wid0) ? my_wid0 : width

# Separate source into 2 parts. Pure gradient, and 100% details, using Hermione's magic potion

#HDRAGC(Corrector=0.7,Reducer=1,protect=1,MODE=1,PASSES=2,max_sat=1.35)
#return last #Uncomment these 2 lines to test pure HDRAGC

a = source.SimpleResize(my_wid/2,my_HEIGHT/2)
a_up=a.SimpleResize(my_wid,my_HEIGHT)
a_diff=mt_makediff(a_up,source) # Get back all details from source
a=a.HDRAGC(Corrector=0.7,Reducer=1,protect=1,MODE=1,PASSES=2,max_sat=1.35)
b=a.SimpleResize(my_wid,my_HEIGHT)
bri2=mt_makediff(b,a_up)
source1=mt_average(source,a_up)
mt_adddiff(bri2,source1)
briten=mt_adddiff(bri2,source1)
mt_makediff(briten,a_diff)

hdr=source.HDRAGC(Corrector=0.7,Reducer=1,protect=1,MODE=1,PASSES=2,max_sat=1.35)
hdr=hdr.Subtitle(String("original"), x=400, y=40, font="lucida console", size=20)
#ColorYUV(Analyze=True)
interleave(hdr,last) #see diff
#stackhorizontal(hdr,last)
#yLevels(0, 1.8, 255, 60,273) #to see

A version where one can choose how much detail to enhance is possible; nothing special.
A sped-up temporal version for low-motion material may be possible by taking the HDRAGC info from
this frame and applying it to the next 1 or 2 frames... should be fun, or a headache.

Ponder
3rd June 2011, 22:10
This Speedy HDRAGC version 3 is intended for light (to medium) dancing grain. If there is no
dancing grain, remove FluxSmoothT.
Changes:

source=last.FluxSmoothT(8) Values of 6-10 can be used, since this script has an internal sharpening
effect. Higher settings than in other scripts are warranted.

General FluxSmoothT tips:
To get good quality, you must verify visually: increase from FluxSmoothT(5) to FluxSmoothT(x)
on a flat-area scene near a scene change until the dancing grain is gone. Flip back and forth at
the scene change to see whether minuscule cross-frame color bleeding occurs. If it bothers you,
decrease FluxSmoothT until it is gone; FluxSmoothT(12+) may cause such blending at scene changes.
In general, I would rather have tiny, tiny bleeding (hard to see unless you freeze the frame) in
1 frame than dancing grain in the next 200 frames, so I use FluxSmoothT(7+).
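
One way to do this visual check is to interleave two strengths, so that stepping through frames flips between them. A minimal sketch, assuming the FluxSmooth plugin is loaded and the source clip is in last; the two thresholds are only example values, and low/high are names made up for this snippet.

```avisynth
# Compare two FluxSmoothT strengths on the same frames.
src  = last
low  = src.FluxSmoothT(5).Subtitle("FluxSmoothT(5)")
high = src.FluxSmoothT(8).Subtitle("FluxSmoothT(8)")
# Each source frame now appears twice in a row (low, then high), so stepping
# one frame forward and back flips between the two settings.
Interleave(low, high)
```

Seek to a flat area just after a scene change, raise the second value until the grain stops dancing, then check the first frame after the cut for bleeding.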

FluxSmoothT was tried on various lines of the script to see its effect, with interesting results,
but the best place is still at the beginning. For other scripts, the best place could be at the
end of the script.

Also use source1=mt_lutxy(source,a_up,"x y 4 * + 5 /") instead of mt_average.

This will soothe the grain by using 1 part from source and 4 parts from the gradients (softer).
Use source1=mt_lutxy(source,a_up,"x y 12 * + 13 /") or more if the grain is coarser.
For zero grain, source1=mt_lutxy(source,a_up,"x y 2 * + 3 /") is still excellent, a tiny bit
smoother than source1=mt_average(source,a_up), for people who like a softer look.

The good thing is that one can vary the amount of gradients to taste (how sharp or soft
one wants). For heavy dancing grains. Wbb.
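
The mt_lutxy expressions above are reverse Polish notation: "x y 4 * + 5 /" means (x + 4*y)/5, i.e. one part source to four parts a_up. As a rough sketch of making that ratio a parameter, the helper below builds the expression string; GradientBlend and parts are hypothetical names chosen for this example, not part of MaskTools or the original scripts.

```avisynth
# Hypothetical helper: mix 1 part source with 'parts' parts of the blurred clip.
function GradientBlend(clip source, clip a_up, int "parts")
{
    parts = Default(parts, 4)
    # e.g. parts=4 builds "x y 4 * + 5 /", i.e. (x + 4*y) / 5
    expr = "x y " + String(parts) + " * + " + String(parts + 1) + " /"
    return mt_lutxy(source, a_up, expr)
}

# Possible use in the script, in place of the mt_average line:
# source1 = GradientBlend(source, a_up, 4)    # light dancing grain
# source1 = GradientBlend(source, a_up, 12)   # coarser grain
```

Larger values of parts pull source1 closer to a_up, which by the description above gives a softer result.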