Go Back   Doom9's Forum > Capturing and Editing Video > VapourSynth

Old 1st November 2014, 01:45   #1  |  Link
Stephen R. Savage
Registered User
 
Join Date: Nov 2009
Posts: 327
Scaling, colorspace conversion, and dithering library

Some folks have expressed interest in having a program library to perform the common tasks of manipulating resolution, colorspace, and depth. The "z" library (name subject to change) provides a simple interface to perform those tasks. An open source VapourSynth plugin is provided, demonstrating the usage of the API.

Code:
NAME:
    z - z.lib for VapourSynth

SYNOPSIS:
    z.Format(clip clip,
             int "width",
             int "height",
             int "format",
             enum "matrix",
             enum "transfer",
             enum "primaries",
             enum "range",
             enum "chromaloc",
             enum "matrix_in",
             enum "transfer_in",
             enum "primaries_in",
             enum "range_in",
             enum "chromaloc_in",
             string "resample_filter",
             float "filter_param_a",
             float "filter_param_b",
             string "resample_filter_uv",
             float "filter_param_a_uv",
             float "filter_param_b_uv",
             string "dither_type")

    z.Subresize(clip clip,
                int width,
                int height,
                float "shift_w",
                float "shift_h",
                float "subwidth",
                float "subheight",
                string "resample_filter",
                float "filter_param_a",
                float "filter_param_b",
                string "dither_type")

DESCRIPTION:
    z.Format is a drop-in replacement for the built-in VapourSynth resize
    functions. It converts a clip of known or unknown format to another clip
    of known or unknown format, changing only the parameters specified by the
    user. z.Subresize provides advanced resampling capabilities intended for
    use by script writers.

    Arguments denoted as type "enum" may be specified by numerical index
    (see ITU-T H.265 Annex E.3) or by name. Enums specified by name have their
    argument name suffixed with "_s".

    clip:                   input clip
        The input may be of COMPAT color family (requires VS R28).

    width,
    height:                 output image dimensions

    format:                 output format preset id
        The output may be of COMPAT color family (requires VS R28).

    matrix,
    transfer,
    primaries:              output colorspace specification
        If not provided, the corresponding attribute from the input clip will
        be selected, except for YCoCg and RGB color families, where the
        corresponding matrix is set by default.

    range:                  output pixel range
        For integer formats, this allows selection of the legal code values.
        Even when set, out of range values (BTB/WTW) may be generated. If the
        input format is of a different color family, the default range is
        studio/limited for YUV and full-range for RGB.

    chromaloc:              output chroma location
        For subsampled formats, specifies the chroma location. If the input
        format is 4:4:4 or RGB and the output is subsampled, the default
        location is left-aligned, as per MPEG.

    matrix_in,
    transfer_in,
    primaries_in,
    range_in,
    chromaloc_in:           input colorspace/format specification
        If the corresponding frame property is set to a value other than
        unspecified, the frame property is used instead of this parameter.
        Default values are set for certain color families.

    resample_filter,
    filter_param_a,
    filter_param_b:         scaling method for RGB and Y-channel
        For the bicubic filter, filter_param_a/b represent the "b" and "c"
        parameters. For the lanczos filter, filter_param_a represents the
        number of taps.

    resample_filter_uv,
    filter_param_a_uv,
    filter_param_b_uv:      scaling method for UV channels

    dither_type:            dithering method
        Dithering is used only for conversions resulting in an integer format.

    shift_w,
    shift_h:                offset of image top-left corner
        The top-left image corner is assumed to be at coordinate (0, 0) and
        the first sample centered at coordinate (0.5, 0.5). An offset may be
        applied to the assumed image origin to "shift" the image.

    subwidth,
    subheight:              fractional dimensions of input image
        The input image is assumed to span from its origin a distance equal to
        its dimensions in pixels. An alternative image resolution may be
        specified.

    The following tables list values of selected colorspace enumerations and
    their abbreviated names. For all possible values, see ITU-T H.265.
        Matrix coefficients (ITU-T H.265 Table E.5):
        rgb         Identity
                    The identity matrix.
                    Typically used for GBR (often referred to as RGB);
                    however, may also be used for YZX (often referred to as
                    XYZ);
        709         KR = 0.2126; KB = 0.0722
                    ITU-R Rec. BT.709-5
        unspec      Unspecified
                    Image characteristics are unknown or are determined by the
                    application.
        470bg       KR = 0.299; KB = 0.114
                    ITU-R Rec. BT.470-6 System B, G (historical)
                    (functionally the same as the value 6 (170m))
        170m        KR = 0.299; KB = 0.114
                    SMPTE 170M (2004)
                    (functionally the same as the value 5 (470bg))
        ycgco       YCgCo
        2020ncl     KR = 0.2627; KB = 0.0593
                    Rec. ITU-R BT.2020 non-constant luminance system
        2020cl      KR = 0.2627; KB = 0.0593
                    Rec. ITU-R BT.2020 constant luminance system

        Transfer characteristics (ITU-T H.265 Table E.4):
        709         V = a * Lc^0.45 - ( a - 1 ) for 1 >= Lc >= b
                    V = 4.500 * Lc for b > Lc >= 0
                    Rec. ITU-R BT.709-5
                    (functionally the same as the values 6 (601),
                    14 (2020_10) and 15 (2020_12))
        unspec      Unspecified
                    Image characteristics are unknown or are determined by the
                    application.
        601         V = a * Lc^0.45 - ( a - 1 ) for 1 >= Lc >= b
                    V = 4.500 * Lc for b > Lc >= 0
                    Rec. ITU-R BT.601-6 525 or 625
                    (functionally the same as the values 1 (709),
                    14 (2020_10) and 15 (2020_12))
        linear      V = Lc for all values of Lc
                    Linear transfer characteristics
        2020_10     V = a * Lc^0.45 - ( a - 1 ) for 1 >= Lc >= b
                    V = 4.500 * Lc for b > Lc >= 0
                    Rec. ITU-R BT.2020
                    (functionally the same as the values 1 (709),
                    6 (601) and 15 (2020_12))
        2020_12     V = a * Lc^0.45 - ( a - 1 ) for 1 >= Lc >= b
                    V = 4.500 * Lc for b > Lc >= 0
                    Rec. ITU-R BT.2020
                    (functionally the same as the values 1 (709),
                    6 (601) and 14 (2020_10))

        Color primaries (ITU-T H.265 Table E.3):
        709         primary x y
                    green 0.300 0.600
                    blue 0.150 0.060
                    red 0.640 0.330
                    white D65 0.3127 0.3290
                    Rec. ITU-R BT.709-5
        unspec      Unspecified
                    Image characteristics are unknown or are determined by the
                    application.
        170m        primary x y
                    green 0.310 0.595
                    blue 0.155 0.070
                    red 0.630 0.340
                    white D65 0.3127 0.3290
                    SMPTE 170M (2004)
                    (functionally the same as the value 7 (240m))
        240m        primary x y
                    green 0.310 0.595
                    blue 0.155 0.070
                    red 0.630 0.340
                    white D65 0.3127 0.3290
                    SMPTE 240M (1999)
                    (functionally the same as the value 6 (170m))
        2020        primary x y
                    green 0.170 0.797
                    blue 0.131 0.046
                    red 0.708 0.292
                    white D65 0.3127 0.3290
                    Rec. ITU-R BT.2020

        Pixel range (ITU-T H.265 Eq E-4 to E-15):
        limited     Y = Clip1Y( Round( ( 1 << ( BitDepthY - 8 ) ) *
                                              ( 219 * E'Y + 16 ) ) )
                    Cb = Clip1C( Round( ( 1 << ( BitDepthC - 8 ) ) *
                                               ( 224 * E'PB + 128 ) ) )
                    Cr = Clip1C( Round( ( 1 << ( BitDepthC - 8 ) ) *
                                               ( 224 * E'PR + 128 ) ) )

                    R = Clip1Y( ( 1 << ( BitDepthY - 8 ) ) *
                                       ( 219 * E'R + 16 ) )
                    G = Clip1Y( ( 1 << ( BitDepthY - 8 ) ) *
                                       ( 219 * E'G + 16 ) )
                    B = Clip1Y( ( 1 << ( BitDepthY - 8 ) ) *
                                       ( 219 * E'B + 16 ) )
        full        Y = Clip1Y( Round( ( ( 1 << BitDepthY ) - 1 ) * E'Y ) )
                    Cb = Clip1C( Round( ( ( 1 << BitDepthC ) - 1 ) * E'PB +
                                          ( 1 << ( BitDepthC - 1 ) ) ) )
                    Cr = Clip1C( Round( ( ( 1 << BitDepthC ) - 1 ) * E'PR +
                                          ( 1 << ( BitDepthC - 1 ) ) ) )

                    R = Clip1Y( ( ( 1 << BitDepthY ) - 1 ) * E'R )
                    G = Clip1Y( ( ( 1 << BitDepthY ) - 1 ) * E'G )
                    B = Clip1Y( ( ( 1 << BitDepthY ) - 1 ) * E'B )

        Chroma location (ITU-T H.265 Figure E.1):
        left
        center
        top_left
        top
        bottom_left
        bottom

    The following scaling methods are available:
        point, bilinear, bicubic, spline16, spline36, lanczos
    The following dithering methods are available:
        none, ordered, random, error_diffusion
Release 2.0.2: Download link
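
To make the matrix, transfer and pixel range tables above concrete, here is a minimal pure-Python sketch (no VapourSynth needed). It encodes linear RGB with the 709 transfer function, applies the 709 matrix, and quantizes with the limited or full range equations. The function names are invented for this example and are not part of the z library's API. The values a = 1.099 and b = 0.018 are the rounded BT.709 constants, which the table leaves symbolic.

```python
import math

KR_709, KB_709 = 0.2126, 0.0722  # "709" row of the matrix table


def oetf_709(lc, a=1.099, b=0.018):
    """709 transfer: V = a * Lc^0.45 - (a - 1) at or above b, 4.5 * Lc below."""
    return a * lc ** 0.45 - (a - 1) if lc >= b else 4.5 * lc


def rgb_to_ypbpr(r, g, b, kr=KR_709, kb=KB_709):
    """Non-constant-luminance matrix from KR/KB; inputs are E'R, E'G, E'B."""
    y = kr * r + (1 - kr - kb) * g + kb * b
    return y, (b - y) / (2 * (1 - kb)), (r - y) / (2 * (1 - kr))


def _round(x):
    # H.265 Round(): round half away from zero.
    return int(math.copysign(math.floor(abs(x) + 0.5), x))


def _clip(x, depth):
    # Clip1Y / Clip1C: clamp to the valid code range of the bit depth.
    return max(0, min((1 << depth) - 1, x))


def quantize(y, pb, pr, depth=8, limited=True):
    """Apply the "limited" or "full" pixel range equations to Y, Cb, Cr."""
    if limited:
        s = 1 << (depth - 8)
        vals = (s * (219 * y + 16), s * (224 * pb + 128), s * (224 * pr + 128))
    else:
        peak, mid = (1 << depth) - 1, 1 << (depth - 1)
        vals = (peak * y, peak * pb + mid, peak * pr + mid)
    return tuple(_clip(_round(v), depth) for v in vals)


linear_red = (1.0, 0.0, 0.0)
encoded = [oetf_709(c) for c in linear_red]
print(quantize(*rgb_to_ypbpr(*encoded)))             # 8-bit limited range
print(quantize(*rgb_to_ypbpr(*encoded), depth=10))   # 10-bit limited range
```

Swapping KR/KB for another row of the matrix table (for example 0.299 and 0.114 for 170m) changes only the rgb_to_ypbpr step. The range equations stay the same.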

Last edited by Stephen R. Savage; 7th December 2015 at 07:55. Reason: v2.0.2
Old 1st November 2014, 14:25   #2  |  Link
kolak
Registered User
 
Join Date: Nov 2004
Location: Poland
Posts: 2,854
How is this different from fmtconv, which I have found to be very good?
Old 1st November 2014, 16:51   #3  |  Link
kolak
Registered User
 
Join Date: Nov 2004
Location: Poland
Posts: 2,854
Is the library fully free? Is it cross-platform?

What error diffusion method is implemented? Floyd-Steinberg?

Last edited by kolak; 1st November 2014 at 16:53.
Old 30th November 2014, 02:15   #4  |  Link
Stephen R. Savage
Registered User
 
Join Date: Nov 2009
Posts: 327
After much (?) work, I am pleased to announce the release of "zlib" v1.0 FINAL. The library and VapourSynth example are linked in the first post. Work is underway to improve the multithreaded scalability of the software, making it a better fit for playback scenarios.

Last edited by Stephen R. Savage; 30th November 2014 at 02:17.
Old 30th November 2014, 16:02   #5  |  Link
kolak
Registered User
 
Join Date: Nov 2004
Location: Poland
Posts: 2,854
Quote:
Originally Posted by Stephen R. Savage View Post
After much (?) work, I am pleased to announce the release of "zlib" v1.0 FINAL. The library and VapourSynth example are linked in the first post. Work is underway to improve the multithreaded scalability of the software, making it a better fit for playback scenarios.
Great, thank you for your work.

Could you add some noise generator?
I found that Floyd-Steinberg dithering with a tiny amount of noise gives very good results. The small amount of noise helps to cover the contouring left after dithering.
Another option is the Filter Lite (Sierra Lite) algorithm, which is close to Floyd-Steinberg but apparently much faster.
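
For readers comparing the two methods, here is a minimal pure-Python sketch of error diffusion with both kernels. It is an illustration of the technique only, not the z library's implementation. The function name and the list-of-lists image layout are choices made for this example. The kernel weights are the standard published ones: 7/16, 3/16, 5/16, 1/16 for Floyd-Steinberg and 2/4, 1/4, 1/4 for Sierra's Filter Lite.

```python
import math

# Each kernel lists (dx, dy, weight) for pixels that have not been visited yet.
FLOYD_STEINBERG = [(1, 0, 7 / 16), (-1, 1, 3 / 16), (0, 1, 5 / 16), (1, 1, 1 / 16)]
SIERRA_LITE = [(1, 0, 2 / 4), (-1, 1, 1 / 4), (0, 1, 1 / 4)]


def error_diffuse(image, bits, kernel):
    """Quantize floats in [0, 1] to `bits`-bit integers with error diffusion."""
    height, width = len(image), len(image[0])
    peak = (1 << bits) - 1
    work = [[v * peak for v in row] for row in image]  # scaled working copy
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            value = work[y][x]
            # Round to the nearest code and clamp to the valid range.
            code = max(0, min(peak, math.floor(value + 0.5)))
            out[y][x] = code
            err = value - code
            for dx, dy, weight in kernel:
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and ny < height:
                    work[ny][nx] += err * weight  # error past the edge is dropped
    return out


flat_gray = [[0.5] * 16 for _ in range(16)]
for name, kernel in (("floyd-steinberg", FLOYD_STEINBERG), ("sierra lite", SIERRA_LITE)):
    result = error_diffuse(flat_gray, bits=1, kernel=kernel)
    print(name, "mean output:", sum(map(sum, result)) / 256)
```

Filter Lite spreads the error to three neighbours instead of four and uses power-of-two weights, which is the usual explanation given for its lower cost.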

Last edited by kolak; 30th November 2014 at 16:05.
Old 1st December 2014, 14:55   #6  |  Link
mandarinka
Registered User
 
Join Date: Jan 2007
Posts: 729
Wouldn't that be a step back (removal of error-diffusion dithering)?
Old 1st December 2014, 19:41   #7  |  Link
kolak
Registered User
 
Join Date: Nov 2004
Location: Poland
Posts: 2,854
No, the Floyd-Steinberg method would stay.
Filter Lite would be an additional option. It's very close to Floyd-Steinberg, but can be much faster. At least this is what I have read.
Old 7th December 2014, 15:49   #8  |  Link
YamashitaRen
Registered User
 
Join Date: Apr 2014
Location: France
Posts: 33
Question of the day:
What is it with you guys and cats?

Answer of the day:
I tried the filter on my Odroid-U2 (unlike fmtconv, it doesn't throw SSE2 errors at me) to resize/crop a 1920x1080 Blu-ray to 1280x536.
Swscale is 2x as fast as zlib (8.79 fps against 4.36 fps). It's probably caused by the lack of NEON optimizations.
Old 17th April 2015, 17:58   #9  |  Link
mawen1250
Registered User
 
Join Date: Aug 2011
Posts: 103
Thanks for your reply!
This build fixes the magnitude of ordered dither (to 0.5), which solved lots of problems. The underflow issue is solved, but the overflow issue for 9-15 bit output still exists.

2. Not only full range YUV can produce this kind of overflow, but also limited range RGB/YUV with out-of-range values, such as reducing 65535 to 9-15 bit with dithering, or converting 65535 from limited range 16 bit to full range 9-15 bit. Thus, IMO this may lead to problems in practice, since we cannot guarantee that the input image is perfect. Actually, converting to 9-15 bit is not done very often except for final output, so I prefer safe output over performance or intermediate precision. Perhaps an additional option for clamping the result to the valid range could be added? It could also be used for limiting the values to limited range when fullrange_out=False.

3. My mistake, the ordered dither does affect only the least significant bit of the output; maybe I was misled by the image with underflow issues. In the previous build, when the output depth and range were the same as the input, the ordered dithering pattern was still applied to the image. Fixing the magnitude of the ordered dither solved this as well. I suppose it would be faster to directly return the src frame pointer in this case, since the frame data is always unchanged.

4. Yes, I got it.
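
The overflow cases described in point 2 can be reproduced with a few lines of arithmetic. This is a simplified model of the conversions, not the library's code: the helper names and the `clamp` flag are hypothetical, and limited-range depth changes are modelled as a plain power-of-two scale followed by rounding.

```python
import math


def reduce_depth_limited(value, depth_in, depth_out, dither=0.0, clamp=False):
    """Limited-range depth change: scale by a power of two, add dither, round."""
    scaled = value * 2.0 ** (depth_out - depth_in)
    code = math.floor(scaled + dither + 0.5)
    if clamp:
        code = max(0, min((1 << depth_out) - 1, code))
    return code


def limited_to_full(value, depth_in, depth_out, clamp=False):
    """Limited-range luma/RGB code to a full-range code at another depth."""
    black = 16 << (depth_in - 8)   # code for nominal black
    span = 219 << (depth_in - 8)   # distance from nominal black to white
    code = math.floor((value - black) / span * ((1 << depth_out) - 1) + 0.5)
    if clamp:
        code = max(0, min((1 << depth_out) - 1, code))
    return code


# 65535 is a whiter-than-white input; the 10-bit peak code is 1023.
print(reduce_depth_limited(65535, 16, 10))               # exceeds the peak
print(reduce_depth_limited(65535, 16, 10, clamp=True))   # clamped to the peak
print(limited_to_full(65535, 16, 10))                    # exceeds the peak
print(limited_to_full(65535, 16, 10, clamp=True))        # clamped to the peak
```

Without the clamp, a result above the peak code no longer fits in the declared bit depth, which is the risk the requested option is meant to remove.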
Old 18th April 2015, 21:08   #10  |  Link
mawen1250
Registered User
 
Join Date: Aug 2011
Posts: 103
I see. This will keep the filter clean and clear. On the other hand, users need to be aware of what they are doing and take more care with the risks in these special cases.
Anyway, this is a great library and thanks for your efforts!
Old 25th April 2015, 07:52   #11  |  Link
foxyshadis
Angel of Night
 
Join Date: Nov 2004
Location: Tangled in the silks
Posts: 9,562
The download link has an underscore when it should be a dash.
Old 7th June 2015, 03:39   #12  |  Link
mawen1250
Registered User
 
Join Date: Aug 2011
Posts: 103
Code:
# YUV420P8 input
src = core.z.Depth(src, depth=16, fullrange_in=False)
src = core.z.Resize(src, src.width, src.height, filter_uv="bicubic", filter_param_a_uv=1/3, filter_param_b_uv=1/3, subsample_w=0, subsample_h=0)
src = core.z.Colorspace(src, matrix_in=1, transfer_in=1, primaries_in=1, matrix_out=0, transfer_out=1, primaries_out=1)
# RGB48 output
When I run this script, there's a filter error from z.Colorspace.
Quote:
Error getting the frame number 27800:
unsupported pixel type
EDIT: Oops, I forgot z.Colorspace only accepts float input...

Last edited by mawen1250; 7th June 2015 at 04:54.
Old 15th August 2015, 13:24   #13  |  Link
mawen1250
Registered User
 
Join Date: Aug 2011
Posts: 103
zimg-1.1.1 crashes VS for an unknown reason after running a script for some time (about 70-100s in my tests).

Windows 7 x64
VapourSynth R27 x64
threads=8

I just found that using BM3D built with MSVC14 also introduces the same problem (64bit crashes, 32bit doesn't in my test).
Considering zimg-1.1.1 is also built with MSVC14, could this be a problem related to VapourSynth and MSVC14?

Last edited by mawen1250; 15th August 2015 at 13:31.
Old 15th August 2015, 13:37   #14  |  Link
feisty2
I'm Siri
 
Join Date: Oct 2012
Location: void
Posts: 2,633
Quote:
Originally Posted by mawen1250 View Post
zimg-1.1.1 crashes VS for an unknown reason after running a script for some time (about 70-100s in my tests).

Windows 7 x64
VapourSynth R27 x64
threads=8

I just found that using BM3D built with MSVC14 also introduces the same problem (64bit crashes, 32bit doesn't in my test).
Considering zimg-1.1.1 is also built with MSVC14, could this be a problem related to VapourSynth and MSVC14?
http://forum.doom9.org/showthread.ph...27#post1727527
like this?
80% sure it's a vs2015 issue
Old 16th August 2015, 06:51   #15  |  Link
mawen1250
Registered User
 
Join Date: Aug 2011
Posts: 103
This build works fine.
Old 8th October 2015, 16:26   #16  |  Link
mawen1250
Registered User
 
Join Date: Aug 2011
Posts: 103
Two issues with 1.95 Beta:
1. If the clip is first cropped with std.CropAbs/std.CropRel and the width is not mod16, the right-most pixels are corrupt.
2. There are no arguments for shift_w, shift_h, subwidth, subheight.

Last edited by mawen1250; 8th October 2015 at 17:20.
Old 8th October 2015, 18:14   #17  |  Link
Stephen R. Savage
Registered User
 
Join Date: Nov 2009
Posts: 327
Quote:
Originally Posted by mawen1250 View Post
Two issues with 1.95 Beta:
1. If the clip is first cropped with std.CropAbs/std.CropRel and the width is not mod16, the right-most pixels are corrupt.
2. There are no arguments for shift_w, shift_h, subwidth, subheight.
1. I found an issue in the border handling of BYTE dither. It will be fixed in the next prerelease.
2. Is there a use case for this that's not covered by specifying chromaloc?

Last edited by Stephen R. Savage; 8th October 2015 at 19:02.
Old 9th October 2015, 04:50   #18  |  Link
mawen1250
Registered User
 
Join Date: Aug 2011
Posts: 103
Quote:
Originally Posted by Stephen R. Savage View Post
2. Is there a use case for this that's not covered by specifying chromaloc?
1. as a post-resampler for non-center aligned resampler such as nnedi/eedi, this is commonly needed in AA/scaling scripts using edi
2. to do top-left aligned resampling, for example mv.Super and warp.AWarp may need it
3. any other time you want to do a (sub-pixel) cropping/padding
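
To illustrate what these use cases ask of the resampler, here is a minimal sketch of the coordinate mapping implied by the shift_w/subwidth description in the first post (image origin at 0, first sample centered at 0.5). The function names are hypothetical, and the formula is the conventional mapping, assumed here rather than taken from the library source.

```python
import math


def source_position(j, dst_width, src_width, shift_w=0.0, subwidth=None):
    """Continuous input coordinate sampled by the center of output pixel j."""
    if subwidth is None:
        subwidth = float(src_width)  # default: the active area is the whole input
    return shift_w + (j + 0.5) * subwidth / dst_width


def nearest_source_pixel(j, dst_width, src_width, shift_w=0.0, subwidth=None):
    """Input pixel whose area contains that coordinate, clamped to the image."""
    pos = source_position(j, dst_width, src_width, shift_w, subwidth)
    return max(0, min(src_width - 1, math.floor(pos)))


# Plain 2x downscale: output pixel 0 is centered between input pixels 0 and 1.
print(source_position(0, dst_width=4, src_width=8))
# Sub-pixel crop: start 0.25 px in and use only 7.5 px of an 8 px wide input.
print(source_position(0, dst_width=8, src_width=8, shift_w=0.25, subwidth=7.5))
# Compensating a half-pixel offset left by an earlier filter, same size out.
print(source_position(0, dst_width=8, src_width=8, shift_w=0.5))
```

The same mapping applies vertically with shift_h and subheight. Chroma location handling alone cannot express these cases because it only selects among a fixed set of offsets.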
Old 24th October 2015, 11:49   #19  |  Link
kolak
Registered User
 
Join Date: Nov 2004
Location: Poland
Posts: 2,854
It looks like the engine gets saturated at around 32 threads.
How does the threading work? Is the frame split into slices? Does that mean higher resolutions will scale better?

The latest ffmpeg added a zscale filter, which I think is great news!
Old 24th October 2015, 22:50   #20  |  Link
kolak
Registered User
 
Join Date: Nov 2004
Location: Poland
Posts: 2,854
I meant: performance doesn't scale linearly with cores. The provided graphs show a big drop in speed per core at higher core counts. Speed still rises, but we are wasting many cores and a lot of CPU power to get, e.g., just a 20% speedup.

I think my question is: does the z library engine scale linearly with the number of cores? If we were able to deliver raw video data at unlimited speed and just measure z performance, would processing speed rise linearly with the number of cores?
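
One common way to reason about this kind of saturation is Amdahl's law, sketched below. Whether it describes the z library's behaviour is an assumption, and the 0.95 parallel fraction is an arbitrary illustrative value, not a measurement from the graphs.

```python
def amdahl_speedup(cores, parallel_fraction):
    """Ideal speedup when only `parallel_fraction` of the work can be split."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


for cores in (1, 2, 4, 8, 16, 32, 64):
    s = amdahl_speedup(cores, 0.95)  # 0.95 is illustrative only
    print(f"{cores:3d} cores: speedup {s:5.2f}, efficiency {s / cores:4.0%}")
```

Under this model any serial stage, such as delivering the source frames, caps the total speedup no matter how well the resampling itself is threaded. That is why measuring the library in isolation, as asked above, would be the meaningful test.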

Last edited by kolak; 24th October 2015 at 22:54.