27th January 2008, 20:31   #1
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Gamma errors in resizing filters?

An interesting read.

Any chance we could get a properly gamma-corrected Avisynth filter?
27th January 2008, 21:43   #2
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
Not just resizing, anything that averages pixels ... so denoisers, deinterlacers, the works.

Hell, even video codec subpixel interpolation introduces errors (needing extra bits to encode) because of the nonlinear color space (although it probably won't make a huge difference on average).

Another nice site with some examples and conversion source code :
http://mysite.verizon.net/spitzak/conversion/index.html

Last edited by MfA; 27th January 2008 at 21:56.
27th January 2008, 22:51   #3
Wilbert
Moderator
 
Join Date: Nov 2001
Location: Netherlands
Posts: 6,365
They claim that the filters need to operate at a higher bit depth. Luckily 16 bit will be available in v2.6, so that should be good enough.
28th January 2008, 11:48   #4
IanB
Avisynth Developer
 
Join Date: Jan 2003
Location: Melbourne, Australia
Posts: 3,167
Don't be mistaken, don't be misled ...

2.6.0 will only have API infrastructure for future 16 and 32 bit formats. There will be no 16 bit code for any filters.

In the interim
Code:
...
Levels(16, 1/2.2, 235, 0, 255, False) # Use PC range to avoid truncation
...Resize(640, 480)
Levels(0, 2.2, 255, 16, 235, False)
This can cause banding with some images. A compromise is to lower the gamma from 2.2, accepting some gamma error in exchange for less quantization error.
28th January 2008, 18:59   #5
Wilbert
Moderator
 
Join Date: Nov 2001
Location: Netherlands
Posts: 6,365
Quote:
Don't be mistaken, don't be misled ...

2.6.0 will only have API infrastructure for future 16 and 32 bit formats. There will be no 16 bit code for any filters.
I know there is no 16/32 bit code yet. But once the API infrastructure is ready for it, people can start coding. I can't wait to start with some
30th January 2008, 17:04   #6
2Bdecided
Registered User
 
Join Date: Dec 2002
Location: UK
Posts: 1,673
Idiot's question: can't the code work in 16 bits (or whatever) internally, as long as it reads and writes 8-bit data from/to what's around it?

I.e. couldn't you make a gamma correct resize filter which slotted into AVIsynth?

(I know nothing).


btw, those pages are fascinating. A lot of the features in Photoshop-like programs that don't work quite as you expect, or are difficult to set properly, behave that way because of this error! I love the torture images in the first link, which show just how bad it can get.

Cheers,
David.
30th January 2008, 22:04   #7
IanB
Avisynth Developer
 
Join Date: Jan 2003
Location: Melbourne, Australia
Posts: 3,167
@2Bdecided,

Yes it would be possible. The current resizers do the intermediate calculations with 16 bit arithmetic already.

The real problem is doing the gamma exponentiation quickly. The fastest method I have in my kit bag is to use precalculated lookup tables. While this is fairly quick as things go, it is significantly slower than the current resizer code.

And yes, the plan for 16-bit and 32-bit per pixel channel colour spaces is for them to be gamma=1.0.
31st January 2008, 03:59   #8
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
You can do really fast exponentiation with int/float casting tricks (and log on the way back). In a pinch, squaring is pretty close too.
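The general idea behind such casting tricks is that the bit pattern of a positive IEEE 754 float, read as an integer, is roughly a scaled and offset log2 of its value. A minimal C sketch of the logarithm direction, assuming 32-bit IEEE 754 floats; `approx_log2` is a hypothetical name and the code is not taken from any source linked in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rough log2 of a positive, normal float, read
   straight from its bit pattern. Exact at powers of two; in between,
   the mantissa stands in linearly for the logarithm, so the result is
   low by up to about 0.09. */
float approx_log2(float x)
{
    assert(x > 0.0f);
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);            /* reinterpret without pointer casts */
    return (float)bits / 8388608.0f - 127.0f;  /* bits / 2^23, minus the exponent bias */
}
```

For example, `approx_log2(8.0f)` is exactly 3, while `approx_log2(3.0f)` gives 1.5 against a true value of about 1.585. Running the same mapping backwards yields a rough 2^x.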
31st January 2008, 04:18   #9
IanB
Avisynth Developer
 
Join Date: Jan 2003
Location: Melbourne, Australia
Posts: 3,167
@MfA,

Yes, X*X is fast in MMX/SSE, which is great for gamma=2.0, but no use in the general case.

However, for Sqrt(X) I don't have MMX/SSE code that is as fast as a lookup table.

Any and all code fragment donations accepted

Damn Intel, I'd kill for PXLAT[BW] instructions or Movq mm0, [ebx+mm0] type of addressing.
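As a scalar illustration of the gamma=2.0 shortcut discussed in this post, the C sketch below linearises with X*X and re-encodes with an integer square root. It makes no attempt at MMX/SSE, the function names are hypothetical, and treating gamma as exactly 2.0 is itself an approximation.

```c
#include <assert.h>
#include <stdint.h>

/* Integer square root by bisection: largest r with r*r <= n, for n <= 255*255. */
static uint32_t isqrt_u16(uint32_t n)
{
    assert(n <= 255u * 255u);
    uint32_t lo = 0, hi = 255;
    while (lo < hi) {
        uint32_t mid = (lo + hi + 1) / 2;
        if (mid * mid <= n)
            lo = mid;
        else
            hi = mid - 1;
    }
    return lo;
}

/* Hypothetical helper: averages two 8-bit pixels assuming gamma is
   exactly 2.0, so x*x linearises and sqrt re-encodes. */
uint8_t average_gamma2(uint8_t a, uint8_t b)
{
    uint32_t lin = ((uint32_t)a * a + (uint32_t)b * b + 1) / 2; /* mean in linear light */
    uint32_t r = isqrt_u16(lin);
    /* round to nearest: step up when lin is closer to (r+1)^2 than to r^2 */
    if (r < 255 && lin - r * r > r)
        ++r;
    return (uint8_t)r;
}
```

With this sketch a black and a white pixel average to 180, against 128 for a plain average of the code values.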
31st January 2008, 06:22   #10
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
Do you have IEEE Xplore access?

http://ieeexplore.ieee.org/xpls/abs_...rnumber=595279

Lack of SIMD LUT is painful, but understandable ... multiple very narrow ports to the cache? Forget it.

PS. GPUs can do all this a lot faster of course ...

PPS. LUT is faster for square root than SSE RSQRTPS + RCPPS?

Last edited by MfA; 31st January 2008 at 06:29.
31st January 2008, 06:29   #11
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by IanB
@MfA,

Yes, X*X is fast in MMX/SSE, which is great for gamma=2.0, but no use in the general case.

However, for Sqrt(X) I don't have MMX/SSE code that is as fast as a lookup table.

Any and all code fragment donations accepted

Damn Intel, I'd kill for PXLAT[BW] instructions or Movq mm0, [ebx+mm0] type of addressing.
Sqrt(x)? I remember a fast bit of code Pengvado tossed me a while back that did it in about 10 clock cycles.
31st January 2008, 07:48   #12
IanB
Avisynth Developer
 
Join Date: Jan 2003
Location: Melbourne, Australia
Posts: 3,167
Quote:
Originally Posted by MfA
IEEE Xplore access?
No.
Quote:
Originally Posted by MfA
LUT is faster for square root than SSE RSQRTPS + RCPPS?
It's borderline, but generally 4 LUT lookups beat the SIMD Sqrt, and it only works for gamma=2.0, not general values.
Quote:
Originally Posted by Dark Shikari
Sqrt(x)? I remember a fast bit of code Pengvado tossed me a while back that did it in about 10 clock cycles.
Well, don't hoard it, share it.
31st January 2008, 07:55   #13
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Ah, never mind, I looked it up. It turned out it was EXP2, not square root.

Code:
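// Approximates 2^x: splits x into round(x) and frac(x), evaluates a
// second-order Taylor series for 2^frac(x), then adds round(x) to the
// exponent bits. GCC-style inline asm, x86 with SSE2 only.
static inline float fast_exp2_sse2(float x)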
{
    static const float ss_bias = 12582912.0; // 3<<22
    static const float ss_ln2 = 0.693147180559945;
    static const float ss_1 = 1.0;
    static const float ss_0_5 = 0.5;
    float t, u;
    asm volatile (
        "movaps %0, %1 \n\t"
        "addss  %3, %0 \n\t"
        "movaps %0, %2 \n\t"
        "subss  %3, %0 \n\t" // round(x)
        "pslld $23, %2 \n\t" // round(x) in the exponent
        "subss  %0, %1 \n\t" // frac(x)
        "mulss  %4, %1 \n\t" // frac(x)*ln2 = y
        "movaps %1, %0 \n\t"
        "mulss  %1, %1 \n\t" // y*y
        "addss  %5, %0 \n\t" // 1+y
        "mulss  %6, %1 \n\t" // y*y*.5
        "addss  %1, %0 \n\t" // 1+y+y*y*.5
        "paddd  %2, %0 \n\t" // (1+y+y*y*.5)<<round(x)
        :"+x"(x), "=x"(t), "=x"(u)
        : "m"(ss_bias), "m"(ss_ln2), "m"(ss_1), "m"(ss_0_5)
    );
    return x;
}
Couldn't one use similar code to the "fast inverse square root" used in the Quake 3 engine?
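For reference, the trick in question looks roughly like the C sketch below. It is the widely circulated approximation rewritten with memcpy instead of pointer casts; the function names are hypothetical, and whether it would beat a lookup table in a resizer is not something this sketch settles.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Approximates 1/sqrt(x) for positive, normal 32-bit IEEE 754 floats.
   With the single Newton-Raphson step below, the relative error stays
   under about 0.2%. */
float fast_rsqrt(float x)
{
    assert(x > 0.0f);
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);       /* initial guess from the bit pattern */
    float y;
    memcpy(&y, &bits, sizeof y);
    return y * (1.5f - 0.5f * x * y * y);   /* one Newton-Raphson refinement */
}

/* sqrt(x) follows as x * (1/sqrt(x)). */
float fast_sqrt(float x)
{
    return x * fast_rsqrt(x);
}
```

Like X*X, this only covers an exponent of 0.5 (gamma=2.0), not a general gamma value.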
31st January 2008, 13:49   #14
Jawed
Registered User
 
Join Date: Jan 2008
Location: London
Posts: 156
Presumably any filter, such as fft3dfilter, can work in linear space provided that it contains code at the start to linearise and code at the end to encode as gamma-2.2.

I'm also wondering whether algorithms that use sum of absolute differences or sum of squares would stand to gain.
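Whether such algorithms would gain is an open question here, but a small C sketch can at least show that the metric itself changes. It assumes x*x (gamma 2.0) as a cheap stand-in for the real linearisation, and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* SAD on the stored, gamma-encoded code values. */
uint32_t sad_encoded(const uint8_t *p, const uint8_t *q, size_t n)
{
    assert(p && q);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += (uint32_t)abs((int)p[i] - (int)q[i]);
    return sum;
}

/* SAD after linearising with x*x, i.e. assuming gamma 2.0.
   Intended for small blocks; the 32-bit sum is not guarded against overflow. */
uint32_t sad_linear(const uint8_t *p, const uint8_t *q, size_t n)
{
    assert(p && q);
    uint32_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += (uint32_t)abs((int)p[i] * p[i] - (int)q[i] * q[i]);
    return sum;
}
```

A step from 10 to 20 and a step from 200 to 210 give the same `sad_encoded`, but `sad_linear` weights the bright step more than ten times as heavily, so block matching or thresholding would behave differently in linear light.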

Jawed

Last edited by Jawed; 31st January 2008 at 13:51. Reason: clarity
31st January 2008, 20:45   #15
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
Code written by Chris Lomont for the casting tricks I mentioned can be found here :

http://www.lomont.org/Software/Misc/.../FloatHack.zip
6th February 2008, 19:50   #16
sunitram
Registered User
 
Join Date: Jan 2008
Posts: 1
floating point tricks for exponentials

Hi, I have an approximation to exp that might be useful here. I use it in a Java program, where it works quite well. Here is a working version in C (I don't have much experience in C, so I am sure this can be written in a better way):

Code:
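#include <stdint.h>
#include <string.h>

/* Approximates 2^a; valid only while the result is a normal double. */
double fast_exp2(double a) {
    /* Upper 32 bits of the IEEE 754 double: exponent plus a linear mantissa guess. */
    int32_t hi = (int32_t)(1048576.0*a + 1072632447);
    uint64_t bits = (uint64_t)(uint32_t)hi << 32;  /* lower 32 bits stay zero */
    double p;
    memcpy(&p, &bits, sizeof p);                   /* avoids the strict-aliasing cast */
    return p;
}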
This approximates 2^x with an average precision of about 2.5% or so, and is extremely fast at it. It is basically the same as using a lookup table with 2048 values. The line where tmp is calculated is basically the same as

Code:
int tmp = (int)((1<<20) * a + (1023* (1<<20) - 60801));
For more info see this paper: http://citeseer.ist.psu.edu/schraudolph98fast.html
I have a compileable example here: http://martin.ankerl.com/wp-content/...008/02/exp.cpp
(compiles with g++ -O3 -fno-strict-aliasing exp.cpp)

Actually this technique can be easily used to do all kinds of approximations, e^x, log(x), a^b and so on.
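As one possible reading of that remark, the same bit layout can be stretched to a^b: the upper 32 bits of a positive double are roughly 2^20 * log2(a) plus the offset used in fast_exp2, so scaling the offset-free part by b approximates the bits of a^b. The C sketch below is an assumption-laden illustration with a hypothetical name, not code from the linked paper, and it is much coarser than a lookup table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: rough a^b for a > 0, assuming IEEE 754 doubles.
   Errors of several percent are normal and grow with |b|; results that
   leave the normal double range are not handled. */
double approx_pow(double a, double b)
{
    assert(a > 0.0);
    uint64_t bits;
    memcpy(&bits, &a, sizeof bits);
    int32_t hi = (int32_t)(bits >> 32);                  /* sign, exponent, top of mantissa */
    hi = (int32_t)(b * (hi - 1072632447) + 1072632447);  /* scale the log2 estimate by b */
    bits = (uint64_t)(uint32_t)hi << 32;                 /* low 32 bits are left at zero */
    double r;
    memcpy(&r, &bits, sizeof r);
    return r;
}
```

For example, `approx_pow(4.0, 0.5)` returns about 1.97 instead of 2. That level of error is probably too coarse for gamma conversion of 8-bit video without some refinement step.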
7th February 2008, 03:07   #17
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
Which is exactly what Blinn did in his paper a year before that one.
14th February 2008, 16:05   #18
Archimedes
Registered User
 
Join Date: Apr 2005
Posts: 213
Quote:
Originally Posted by Dark Shikari
An interesting read.

Any chance we could get a properly gamma-corrected Avisynth filter?
As a workaround, you can try this script. Input is an RGB32 image. Requires MaskTools 2.

Code:
function PhotoResize(clip input, int Width, int Height, bool GammaCorrection) {
  Width = (Width * input.Height / input.Width <= Height) ? Width : Height * input.Width / input.Height
  Height = (Width * input.Height / input.Width <= Height) ? Width * input.Height / input.Width : Height
  clip1 = input.Spline36Resize(Width, Height)
  clip2 = (GammaCorrection == true) ? input.Levels(0, 1/2.2, 255, 0, 255).Spline36Resize(Width, Height).Levels(0, 2.2, 255, 0, 255) : clip1
  difference = (GammaCorrection == true) ? Subtract(input.Levels(0, 1/2.2, 255, 0, 255).Levels(0, 2.2, 255, 0, 255), input).Spline36Resize(Width, Height) : clip1
  GammaCorrection == true ? Subtract(clip2, difference) : clip1
  clip1 = (GammaCorrection == true) ? clip1.RGBtoYV12() : last
  clip2 = (GammaCorrection == true) ? last.RGBtoYV12() : last
  GammaCorrection == true ? mt_clamp(clip2, clip1, clip1, 255, 0, u=3, v=3) : last
  isYV12() ? YV12toRGB() : last

  function RGBtoYV12(clip input) {
    input.Crop(0, 0, -(input.Width % 2), -(input.Height % 2))
    PointResize(last.Width * 2, last.Height * 2)
    ConvertToYV12(matrix="pc.601")
  }
  function YV12toRGB(clip input) {
    input.ConvertToRGB32(matrix="pc.601")
    PointResize(last.Width / 2, last.Height / 2)
  }
}
LanczosResize, incorrect scaling (GammaCorrection=false):
http://img89.imageshack.us/img89/884...lanczospi4.png

ImageMagick, correct scaling (with gamma correction), 16 bit:
http://img246.imageshack.us/img246/2...imagemasq0.png

LanczosResize, correct scaling (GammaCorrection=true) without error correction, 8 bit:
http://img99.imageshack.us/img99/813...lanczoszn8.png

LanczosResize, correct scaling (GammaCorrection=true) with error correction, 8 bit:
http://img263.imageshack.us/img263/9...lanczosyr5.png

Last edited by Archimedes; 16th February 2008 at 18:29.
15th February 2008, 01:01   #19
IanB
Avisynth Developer
 
Join Date: Jan 2003
Location: Melbourne, Australia
Posts: 3,167
@Archimedes,

As I pointed out above, range scaling YUV images to PC levels as part of the gamma conversion is very worthwhile; you should try it in addition to your error correction.
Quote:
Originally Posted by IanB
Code:
Levels(16, 1/2.2, 235, 0, 255, False) # Use PC range to avoid truncation
...Resize(640, 480)
Levels(0, 2.2, 255, 16, 235, False)
15th February 2008, 13:54   #20
Didée
Registered User
 
Join Date: Apr 2002
Location: Germany
Posts: 5,391
@ IanB,

Archimedes is dealing with RGB input, so that kind of range expansion is not applicable.


@ Archimedes:

Oh, that's not the kind of error protection that I had thought of. But thinking about it, this is probably the better way, and ... then you can simplify it.

> mt_clamp(clip2, clip1, clip1, 255, 0, u=3, v=3)

is identical to, but slower than

> mt_lutxy(clip2, clip1, "x y > x y ?", u=3,v=3)

which in turn is identical to, but slower than

> mt_logic(clip2, clip1, "max", u=3,v=3)

Seeing this, it should be possible to ditch MaskTools completely, together with the 2x-supersampled RGB->YV12 conversion, and just use

> Overlay(clip2, clip1, mode="Lighten", pc_range=true)
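The equivalence between the clamp and the per-pixel maximum can be checked exhaustively for 8-bit values. The C sketch below models mt_clamp as limiting a pixel to the range from the dark limit minus undershoot to the bright limit plus overshoot, saturated to 0..255; that reading of MaskTools, and both function names, are assumptions of the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-pixel model of mt_clamp(c, bright, dark, overshoot, undershoot). */
static uint8_t clamp_model(int c, int bright, int dark, int overshoot, int undershoot)
{
    assert(overshoot >= 0 && undershoot >= 0);
    int hi = bright + overshoot;
    if (hi > 255) hi = 255;
    int lo = dark - undershoot;
    if (lo < 0) lo = 0;
    if (c > hi) c = hi;
    if (c < lo) c = lo;
    return (uint8_t)c;
}

/* Returns 1 if clamp(x, y, y, 255, 0) equals max(x, y) for every 8-bit pair. */
int clamp_equals_max(void)
{
    for (int x = 0; x < 256; ++x)
        for (int y = 0; y < 256; ++y) {
            int expected = x > y ? x : y;  /* "x y > x y ?" in mt_lutxy notation */
            if (clamp_model(x, y, y, 255, 0) != expected)
                return 0;
        }
    return 1;
}
```

With an overshoot of 255 the upper limit never bites, so only the lower limit remains, which is exactly a maximum.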

Last edited by Didée; 15th February 2008 at 14:10.