
Variance AQ Megathread (AQ v0.48 update--defaults changed)



Dark Shikari
15th December 2007, 10:57
So, after collecting enough magic pixie dust, I've come up with an AQ algorithm that just might work. Its purpose is to avoid blocking in flat areas like regular AQ, but more importantly, to avoid blurring in relatively flat textured areas, such as grass at a football game or film grain. It seems to be relatively ineffective (or worse than no AQ at all) on low-bitrate anime, but I have gotten reports that it's quite useful at higher bitrates, so try it yourself and see.

Patch (http://files.x264.nl/AQ/AQ_0.48.diff)
Build (http://files.x264.nl/AQ/x264.736.dark.aq.0.48.exe)
PthreadGC2.dll if you need it (http://mirror05.x264.nl/Dark/force.php?file=./pthreadGC2.dll).
A 5 minute or so 1080p sample encoded with the new AQ (0.45) at 5 megabits per second (http://mirror05.x264.nl/Dark/force.php?file=./PlanetEarthSample.mkv)
A 44-second sample of 1080p encoded at 10 megabits per second with AQ 0.47 (strength 0.9, sensitivity 14, qcomp=1) (http://x264.nl/x264.736.aq.0.47.mkv)

How to use AQ:

1. AQ is on by default at strength 0.5. Change --aq-strength to make it stronger or weaker.
2. In 2pass, use AQ on both passes with the same settings.
3. Watch in wonder as all the blurred details come back to life and the SSIM of your video rises.

Version history:

0.48: AQ strength 0.5, sensitivity 13 made the defaults. Updated to r736. Qcomp is now scaled based on AQ strength automatically.
0.47: Rounding error fixed at low QPs. Code cleanup/optimization by Akupenguin.
0.46: Code cleanup, documentation updated, defaults changed based on testing, and RDRC removed from the main patch in preparation for putting AQ in SVN. RDRC builds and patches can still be found on Mirror05/Dark.
0.45: Fixed bug if variance=0. Additionally, when static sensitivity is used, no limit is put on the quantizers other than qpmin/qpmax; this allows one to use AQ as a form of quasi-ratecontrol to redistribute more bits to flatter frames to improve quality.
0.44: Bug in x264 CABAC encoding (mb_qp_delta) fixed. While this isn't a bug in AQ, it only showed up when using CABAC, interlaced mode, and AQ.
0.431: Crash bug fixed.
0.43: Code cleaned and re-organized. Massive speed increase of the AQ itself due to performance optimizations.
0.42: Lambda-based AQ removed due to incompatibilities with deadzone that will take some work to resolve.

0.4 (Huge overhaul):

1. Totally rewritten AQ. Same basic concept, but now uses a logarithmic scale instead of a hackneyed exponential one.
2. For B-frames, uses a tricky bit of lambda-changing instead of QP changing; this requires absolutely no bits for QP-deltas!
3. For P-frames, uses a slight bit of trickery to reduce the bit cost of QP deltas.
4. Totally rewritten, far faster automatic sensitivity. Respects bitrate in CRF mode better also.
5. Now based on r720.

0.3 (Major overhaul of automatic thresholding and options)
0.21 (Fixed overflow bug with variance function. Code now is allowed to raise quantizers. If this causes problems I may restrict it somewhat.)
0.2 (Re-introduced pre-pass for automagic thresholding)
0.11 (Heavily optimized code)
0.1 (Initial release, fixed scaling formula, removed pre-pass)
0.01 (Initial algorithm)

Results from a test of AQ 0.42:

(1-0.9769377) / (1- 0.9799512) = 15% SSIM boost
(37.751-36.792)/0.05 = 19% PSNR drop
7416.06 / 6960.12 - 1 = 6.55% bitrate drop
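
For anyone who wants to re-run the SSIM and bitrate arithmetic above on their own numbers, here is a minimal sketch. The helper names ssim_distance_gain and relative_change are made up for illustration; the PSNR line is left out because the post does not explain where its 0.05 divisor comes from.

```c
#include <assert.h>

/* Compares how far each encode is from a perfect SSIM of 1.0:
 * (1 - ssim_a) / (1 - ssim_b) - 1, as a fraction (0.15 means 15%). */
static double ssim_distance_gain(double ssim_a, double ssim_b)
{
    assert(ssim_b < 1.0);            /* avoid dividing by zero */
    return (1.0 - ssim_a) / (1.0 - ssim_b) - 1.0;
}

/* Plain relative difference a / b - 1, as a fraction. */
static double relative_change(double a, double b)
{
    assert(b != 0.0);
    return a / b - 1.0;
}
```

Called as ssim_distance_gain(0.9769377, 0.9799512) and relative_change(7416.06, 6960.12), these reproduce the roughly 15% and 6.55% figures quoted above.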

Inventive Software
15th December 2007, 10:59
Trust you to come up with this an hour before I go home for 3 weeks! :D

I'll have a play next week, if you're not too busy changing the innards of the algorithm. ;)

ToS_Maverick
15th December 2007, 12:23
OMG Dark Shikari you are a HERO :D

i could only do a quick test for now, but @CRF20 it was COMPLETELY transparent, at CRF22 it was VERY good...

a few questions:
- how does it work, it's very effective?
- where can i donate :D

more detailed test with screens coming soon ;)

Dark Shikari
15th December 2007, 12:30
It works by using the AC energy of each macroblock as a metric. In other words, it takes the average of the block's pixels, subtracts that average from each original pixel, and then takes the sum of squares of the result. This means that blocks that aren't completely flat but have a lot of texture still get hit by AQ.

There are some slight optimizations to this (like the fact that this is mathematically equivalent to SSD - SAD^2/N, where SSD and SAD are both measured against an all-zero block and N is the number of pixels, and so forth). Also note I made a major change to the build that's up there (I reuploaded)--it now has --aq-sensitivity affect the algorithm. This affects the thresholding in the following way: lower values mean that blocks have to be "flatter" to be affected by AQ, while higher values mean blocks don't have to be as flat.
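
As a rough illustration of the AC-energy metric described above, here is a plain-C sketch that computes it both ways: directly from the definition, and through the pixel-sum / sum-of-squares shortcut. The function names are invented for this example and it does not use x264's pixf routines; it only assumes 8-bit pixels and blocks no larger than 16x16.

```c
#include <assert.h>
#include <stdint.h>

/* Direct definition: subtract the (truncated integer) block mean from
 * every pixel and sum the squares of what is left. */
static uint32_t ac_energy_direct(const uint8_t *pix, int stride, int w, int h)
{
    assert(w > 0 && h > 0);
    uint32_t sum = 0;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            sum += pix[y * stride + x];
    int mean = (int)(sum / (uint32_t)(w * h));

    uint32_t energy = 0;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            int d = pix[y * stride + x] - mean;
            energy += (uint32_t)(d * d);
        }
    return energy;
}

/* Shortcut: against an all-zero reference, SAD is just the pixel sum and
 * SSD is the sum of squared pixels, so energy = SSD - SAD^2 / N. */
static uint32_t ac_energy_fast(const uint8_t *pix, int stride, int w, int h)
{
    assert(w > 0 && h > 0);
    uint64_t sad = 0, ssd = 0;
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            uint64_t p = pix[y * stride + x];
            sad += p;
            ssd += p * p;
        }
    return (uint32_t)(ssd - sad * sad / (uint64_t)(w * h));
}
```

The two versions agree exactly when the block mean is a whole number; otherwise the truncated mean in the direct version makes them differ slightly.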

AGDenton
15th December 2007, 16:54
Can you do an svn diff against some revision of x264 ? I'd like to test this, but I'm not under win32...

Dark Shikari
15th December 2007, 20:26

Index: encoder/encoder.c
===================================================================
--- encoder/encoder.c (revision 712)
+++ encoder/encoder.c (working copy)
@@ -472,6 +472,8 @@
if( !h->param.b_cabac )
h->param.analyse.i_trellis = 0;
h->param.analyse.i_trellis = x264_clip3( h->param.analyse.i_trellis, 0, 2 );
+ if( h->param.analyse.b_aq && h->param.analyse.f_aq_strength <= 0 )
+ h->param.analyse.b_aq = 0;
h->param.analyse.i_noise_reduction = x264_clip3( h->param.analyse.i_noise_reduction, 0, 1<<16 );

{
Index: encoder/analyse.c
===================================================================
--- encoder/analyse.c (revision 712)
+++ encoder/analyse.c (working copy)
@@ -29,6 +29,7 @@
#endif

#include "common/common.h"
+#include "common/cpu.h"
#include "macroblock.h"
#include "me.h"
#include "ratecontrol.h"
@@ -2031,8 +2032,61 @@
}
}

+//Finds the total AC energy of the block in all planes.
+static int ac_energy_mb(x264_t *h)
+{
+ DECLARE_ALIGNED( static uint8_t, zero[FDEC_STRIDE*16], 16 );
+ int avg[3];
+ int x,y;
+ for(y = 0; y < 16; y++)
+ for(x = 0; x < 16; x++)
+ zero[FDEC_STRIDE*y+x]=0;
+ avg[0] = h->pixf.sad[PIXEL_16x16](zero,FDEC_STRIDE,h->mb.pic.p_fenc[0],FENC_STRIDE) >> 8;
+ avg[1] = h->pixf.sad[PIXEL_8x8](zero,FDEC_STRIDE,h->mb.pic.p_fenc[1],FENC_STRIDE) >> 6;
+ avg[2] = h->pixf.sad[PIXEL_8x8](zero,FDEC_STRIDE,h->mb.pic.p_fenc[2],FENC_STRIDE) >> 6;
+ int totalSSD = 0;
+ for(y = 0; y < 16; y++)
+ for(x = 0; x < 16; x++)
+ zero[FDEC_STRIDE*y+x]=avg[0];
+ totalSSD += h->pixf.ssd[PIXEL_16x16](zero,FDEC_STRIDE,h->mb.pic.p_fenc[0],FENC_STRIDE);
+ for(y = 0; y < 8; y++)
+ for(x = 0; x < 8; x++)
+ zero[FDEC_STRIDE*y+x]=avg[1];
+ totalSSD += h->pixf.ssd[PIXEL_8x8](zero,FDEC_STRIDE,h->mb.pic.p_fenc[1],FENC_STRIDE);
+ for(y = 0; y < 8; y++)
+ for(x = 0; x < 8; x++)
+ zero[FDEC_STRIDE*y+x]=avg[2];
+ totalSSD += h->pixf.ssd[PIXEL_8x8](zero,FDEC_STRIDE,h->mb.pic.p_fenc[2],FENC_STRIDE);
+ return totalSSD;
+}

/*****************************************************************************
+ * x264_adaptive_quant:
+ * check if mb is "flat", i.e. has most energy in low frequency components, and
+ * adjust qp down if it is
+ *****************************************************************************/
+void x264_adaptive_quant( x264_t *h, x264_mb_analysis_t *a )
+{
+ int qp = h->mb.i_qp;
+ int ac_energy = ac_energy_mb(h);
+ x264_cpu_restore(h->param.cpu);
+ float result = ac_energy;
+ const float expconst = 0.367879441;
+ float threshold = powf(h->param.analyse.f_aq_sensitivity,4)/2;
+ if(result < threshold)
+ {
+ if(result == 0) result = 1;
+ else
+ result = (expconst-expf(-powf(threshold/result,0.2))) * 2.71828183;
+ }
+ else result = 0;
+ int qp_adj = (qp * result * h->param.analyse.f_aq_strength) / 2;
+ qp_adj = x264_clip3(qp_adj, 0, qp/2);
+ h->mb.i_qp = a->i_qp = qp - qp_adj;
+ h->mb.i_chroma_qp = i_chroma_qp_table[x264_clip3( h->mb.i_qp + h->pps->i_chroma_qp_index_offset, 0, 51 )];
+}
+
+/*****************************************************************************
* x264_macroblock_analyse:
*****************************************************************************/
void x264_macroblock_analyse( x264_t *h )
@@ -2040,9 +2094,14 @@
x264_mb_analysis_t analysis;
int i_cost = COST_MAX;
int i;
+
+ h->mb.i_qp = x264_ratecontrol_qp( h );

+ if( h->param.analyse.b_aq )
+ x264_adaptive_quant( h, &analysis );
+
/* init analysis */
- x264_mb_analyse_init( h, &analysis, x264_ratecontrol_qp( h ) );
+ x264_mb_analyse_init( h, &analysis, h->mb.i_qp );

/*--------------------------- Do the analysis ---------------------------*/
if( h->sh.i_type == SLICE_TYPE_I )
Index: x264.c
===================================================================
--- x264.c (revision 712)
+++ x264.c (working copy)
@@ -243,6 +243,12 @@
" - 2: enabled on all mode decisions\n", defaults->analyse.i_trellis );
H0( " --no-fast-pskip Disables early SKIP detection on P-frames\n" );
H0( " --no-dct-decimate Disables coefficient thresholding on P-frames\n" );
+ H0( " --aq-strength <float> Amount to adjust QP per MB [%.1f]\n"
+ " 0.0: no AQ\n"
+ " 1.1: strong AQ\n", defaults->analyse.f_aq_strength );
+ H0( " --aq-sensitivity <float> \"Flatness\" threshold to trigger AQ [%.1f]\n"
+ " 5: applies to almost no blocks\n"
+ " 35: applies to almost all blocks\n", defaults->analyse.f_aq_sensitivity );
H0( " --nr <integer> Noise reduction [%d]\n", defaults->analyse.i_noise_reduction );
H1( "\n" );
H1( " --deadzone-inter <int> Set the size of the inter luma quantization deadzone [%d]\n", defaults->analyse.i_luma_deadzone[0] );
@@ -406,6 +412,8 @@
{ "trellis", required_argument, NULL, 't' },
{ "no-fast-pskip", no_argument, NULL, 0 },
{ "no-dct-decimate", no_argument, NULL, 0 },
+ { "aq-strength", required_argument, NULL, 0 },
+ { "aq-sensitivity", required_argument, NULL, 0 },
{ "deadzone-inter", required_argument, NULL, '0' },
{ "deadzone-intra", required_argument, NULL, '0' },
{ "level", required_argument, NULL, 0 },
Index: common/pixel.c
===================================================================
--- common/pixel.c (revision 712)
+++ common/pixel.c (working copy)
@@ -213,6 +213,14 @@
PIXEL_SATD_C( x264_pixel_satd_4x8, 4, 8 )
PIXEL_SATD_C( x264_pixel_satd_4x4, 4, 4 )

+static int x264_pixel_count_8x8( uint8_t *pix, int i_pix, uint32_t threshold )
+{
+ int x, y, sum = 0;
+ for( y=0; y<8; y++, pix += i_pix )
+ for( x=0; x<8; x++ )
+ sum += pix[x] > (uint8_t)threshold;
+ return sum;
+}

/****************************************************************************
* pixel_sa8d_WxH: sum of 8x8 Hadamard transformed differences
@@ -473,6 +481,8 @@
pixf->ads[PIXEL_16x8] = pixel_ads2;
pixf->ads[PIXEL_8x8] = pixel_ads1;

+ pixf->count_8x8 = x264_pixel_count_8x8;
+
#ifdef HAVE_MMX
if( cpu&X264_CPU_MMX )
{
Index: common/pixel.h
===================================================================
--- common/pixel.h (revision 712)
+++ common/pixel.h (working copy)
@@ -84,6 +84,8 @@
void (*ads[7])( int enc_dc[4], uint16_t *sums, int delta,
uint16_t *res, int width );

+ int (*count_8x8)( uint8_t *pix, int i_pix, uint32_t threshold );
+
/* calculate satd of V, H, and DC modes.
* may be NULL, in which case just use pred+satd instead. */
void (*intra_satd_x3_16x16)( uint8_t *fenc, uint8_t *fdec, int res[3] );
Index: common/common.c
===================================================================
--- common/common.c (revision 712)
+++ common/common.c (working copy)
@@ -123,6 +123,9 @@
param->analyse.i_chroma_qp_offset = 0;
param->analyse.b_fast_pskip = 1;
param->analyse.b_dct_decimate = 1;
+ param->analyse.b_aq = 0;
+ param->analyse.f_aq_strength = 0.0;
+ param->analyse.f_aq_sensitivity = 15;
param->analyse.i_luma_deadzone[0] = 21;
param->analyse.i_luma_deadzone[1] = 11;
param->analyse.b_psnr = 1;
@@ -455,6 +458,13 @@
p->analyse.b_fast_pskip = atobool(value);
OPT("dct-decimate")
p->analyse.b_dct_decimate = atobool(value);
+ OPT("aq-strength")
+ {
+ p->analyse.f_aq_strength = atof(value);
+ p->analyse.b_aq = (p->analyse.f_aq_strength > 0.0);
+ }
+ OPT("aq-sensitivity")
+ p->analyse.f_aq_sensitivity = atof(value);
OPT("deadzone-inter")
p->analyse.i_luma_deadzone[0] = atoi(value);
OPT("deadzone-intra")
@@ -939,6 +949,9 @@
s += sprintf( s, " zones" );
}

+ if( p->analyse.b_aq )
+ s += sprintf( s, " aq=1:%.1f:%.1f", p->analyse.f_aq_strength, p->analyse.f_aq_sensitivity );
+
return buf;
}

Index: x264.h
===================================================================
--- x264.h (revision 712)
+++ x264.h (working copy)
@@ -230,6 +230,9 @@
int i_trellis; /* trellis RD quantization */
int b_fast_pskip; /* early SKIP detection on P-frames */
int b_dct_decimate; /* transform coefficient thresholding on P-frames */
+ int b_aq; /* psy adaptive QP */
+ float f_aq_strength;
+ float f_aq_sensitivity;
int i_noise_reduction; /* adaptive pseudo-deadzone */

/* the deadzone size that will be used in luma quantization */

akupenguin
15th December 2007, 20:43
In addition to being unnecessary as discussed before, your zero array is not thread safe. And if you did for whatever reason need a dc array, its stride should be 0 to reduce the amount of data to initialize.

Dark Shikari
15th December 2007, 20:50
Yup, yup, I will fix the code. Wait, the zero array isn't threadsafe though? Isn't that what ordinary AQ uses?

akupenguin
15th December 2007, 20:52
That zero array contains zeros. Yours gets modified. In short: there shouldn't be any non-const static variables.
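
A small sketch of the two points above, assuming a plain-C stand-in for the SAD routine (sad_16x16 and pixel_sum_16x16 are hypothetical helpers, not x264 functions): the zero row is const and never written, so threads can share it, and passing stride 0 re-reads the same 16 bytes for every line instead of initializing a full 16x16 block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Read-only and never modified, so any number of threads can share it. */
static const uint8_t zero_row[16] = {0};

/* Plain-C stand-in for a 16x16 SAD routine. */
static int sad_16x16(const uint8_t *a, int stride_a,
                     const uint8_t *b, int stride_b)
{
    int sum = 0;
    for (int y = 0; y < 16; y++) {
        for (int x = 0; x < 16; x++)
            sum += abs(a[x] - b[x]);
        a += stride_a;   /* stride 0 keeps pointing at the same row */
        b += stride_b;
    }
    return sum;
}

/* SAD against zero is simply the sum of the block's pixels. */
static int pixel_sum_16x16(const uint8_t *pix, int stride)
{
    assert(stride >= 16);
    return sad_16x16(zero_row, 0, pix, stride);
}
```

Any per-block scratch data that does have to be written would go on the stack or in per-thread state rather than in a static.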

Dark Shikari
15th December 2007, 21:10
Bleh, fixed code.

Not bit-equivalent to the old one, but close enough, and faster.

Index: encoder/encoder.c
===================================================================
--- encoder/encoder.c (revision 712)
+++ encoder/encoder.c (working copy)
@@ -472,6 +472,8 @@
if( !h->param.b_cabac )
h->param.analyse.i_trellis = 0;
h->param.analyse.i_trellis = x264_clip3( h->param.analyse.i_trellis, 0, 2 );
+ if( h->param.analyse.b_aq && h->param.analyse.f_aq_strength <= 0 )
+ h->param.analyse.b_aq = 0;
h->param.analyse.i_noise_reduction = x264_clip3( h->param.analyse.i_noise_reduction, 0, 1<<16 );

{
Index: encoder/analyse.c
===================================================================
--- encoder/analyse.c (revision 712)
+++ encoder/analyse.c (working copy)
@@ -29,6 +29,7 @@
#endif

#include "common/common.h"
+#include "common/cpu.h"
#include "macroblock.h"
#include "me.h"
#include "ratecontrol.h"
@@ -2031,8 +2032,51 @@
}
}

+//Finds the total AC energy of the block in all planes.
+static int ac_energy_mb(x264_t *h)
+{
+ DECLARE_ALIGNED( static uint8_t, zero[16], 16 );
+ int sad,ssd;
+ int totalSSD = 0;
+ sad = h->pixf.sad[PIXEL_16x16](zero,0,h->mb.pic.p_fenc[0],FENC_STRIDE);
+ ssd = h->pixf.ssd[PIXEL_16x16](zero,0,h->mb.pic.p_fenc[0],FENC_STRIDE);
+ totalSSD += ssd - ((sad * sad) >> 8);
+ sad = h->pixf.sad[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[1],FENC_STRIDE);
+ ssd = h->pixf.ssd[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[1],FENC_STRIDE);
+ totalSSD += ssd - ((sad * sad) >> 6);
+ sad = h->pixf.sad[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[2],FENC_STRIDE);
+ ssd = h->pixf.ssd[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[2],FENC_STRIDE);
+ totalSSD += ssd - ((sad * sad) >> 6);
+ return totalSSD;
+}

/*****************************************************************************
+ * x264_adaptive_quant:
+ * check if mb is "flat", i.e. has most energy in low frequency components, and
+ * adjust qp down if it is
+ *****************************************************************************/
+void x264_adaptive_quant( x264_t *h, x264_mb_analysis_t *a )
+{
+ int qp = h->mb.i_qp;
+ int ac_energy = ac_energy_mb(h);
+ x264_cpu_restore(h->param.cpu);
+ float result = ac_energy;
+ const float expconst = 0.367879441;
+ float threshold = powf(h->param.analyse.f_aq_sensitivity,4)/2;
+ if(result < threshold)
+ {
+ if(result == 0) result = 1;
+ else
+ result = (expconst-expf(-powf(threshold/result,0.2))) * 2.71828183;
+ }
+ else result = 0;
+ int qp_adj = (qp * result * h->param.analyse.f_aq_strength) / 2;
+ qp_adj = x264_clip3(qp_adj, 0, qp/2);
+ h->mb.i_qp = a->i_qp = qp - qp_adj;
+ h->mb.i_chroma_qp = i_chroma_qp_table[x264_clip3( h->mb.i_qp + h->pps->i_chroma_qp_index_offset, 0, 51 )];
+}
+
+/*****************************************************************************
* x264_macroblock_analyse:
*****************************************************************************/
void x264_macroblock_analyse( x264_t *h )
@@ -2040,9 +2084,14 @@
x264_mb_analysis_t analysis;
int i_cost = COST_MAX;
int i;
+
+ h->mb.i_qp = x264_ratecontrol_qp( h );

+ if( h->param.analyse.b_aq )
+ x264_adaptive_quant( h, &analysis );
+
/* init analysis */
- x264_mb_analyse_init( h, &analysis, x264_ratecontrol_qp( h ) );
+ x264_mb_analyse_init( h, &analysis, h->mb.i_qp );

/*--------------------------- Do the analysis ---------------------------*/
if( h->sh.i_type == SLICE_TYPE_I )
Index: x264.c
===================================================================
--- x264.c (revision 712)
+++ x264.c (working copy)
@@ -243,6 +243,12 @@
" - 2: enabled on all mode decisions\n", defaults->analyse.i_trellis );
H0( " --no-fast-pskip Disables early SKIP detection on P-frames\n" );
H0( " --no-dct-decimate Disables coefficient thresholding on P-frames\n" );
+ H0( " --aq-strength <float> Amount to adjust QP per MB [%.1f]\n"
+ " 0.0: no AQ\n"
+ " 1.1: strong AQ\n", defaults->analyse.f_aq_strength );
+ H0( " --aq-sensitivity <float> \"Flatness\" threshold to trigger AQ [%.1f]\n"
+ " 5: applies to almost no blocks\n"
+ " 35: applies to almost all blocks\n", defaults->analyse.f_aq_sensitivity );
H0( " --nr <integer> Noise reduction [%d]\n", defaults->analyse.i_noise_reduction );
H1( "\n" );
H1( " --deadzone-inter <int> Set the size of the inter luma quantization deadzone [%d]\n", defaults->analyse.i_luma_deadzone[0] );
@@ -406,6 +412,8 @@
{ "trellis", required_argument, NULL, 't' },
{ "no-fast-pskip", no_argument, NULL, 0 },
{ "no-dct-decimate", no_argument, NULL, 0 },
+ { "aq-strength", required_argument, NULL, 0 },
+ { "aq-sensitivity", required_argument, NULL, 0 },
{ "deadzone-inter", required_argument, NULL, '0' },
{ "deadzone-intra", required_argument, NULL, '0' },
{ "level", required_argument, NULL, 0 },
Index: common/pixel.c
===================================================================
--- common/pixel.c (revision 712)
+++ common/pixel.c (working copy)
@@ -213,6 +213,14 @@
PIXEL_SATD_C( x264_pixel_satd_4x8, 4, 8 )
PIXEL_SATD_C( x264_pixel_satd_4x4, 4, 4 )

+static int x264_pixel_count_8x8( uint8_t *pix, int i_pix, uint32_t threshold )
+{
+ int x, y, sum = 0;
+ for( y=0; y<8; y++, pix += i_pix )
+ for( x=0; x<8; x++ )
+ sum += pix[x] > (uint8_t)threshold;
+ return sum;
+}

/****************************************************************************
* pixel_sa8d_WxH: sum of 8x8 Hadamard transformed differences
@@ -473,6 +481,8 @@
pixf->ads[PIXEL_16x8] = pixel_ads2;
pixf->ads[PIXEL_8x8] = pixel_ads1;

+ pixf->count_8x8 = x264_pixel_count_8x8;
+
#ifdef HAVE_MMX
if( cpu&X264_CPU_MMX )
{
Index: common/pixel.h
===================================================================
--- common/pixel.h (revision 712)
+++ common/pixel.h (working copy)
@@ -84,6 +84,8 @@
void (*ads[7])( int enc_dc[4], uint16_t *sums, int delta,
uint16_t *res, int width );

+ int (*count_8x8)( uint8_t *pix, int i_pix, uint32_t threshold );
+
/* calculate satd of V, H, and DC modes.
* may be NULL, in which case just use pred+satd instead. */
void (*intra_satd_x3_16x16)( uint8_t *fenc, uint8_t *fdec, int res[3] );
Index: common/common.c
===================================================================
--- common/common.c (revision 712)
+++ common/common.c (working copy)
@@ -123,6 +123,9 @@
param->analyse.i_chroma_qp_offset = 0;
param->analyse.b_fast_pskip = 1;
param->analyse.b_dct_decimate = 1;
+ param->analyse.b_aq = 0;
+ param->analyse.f_aq_strength = 0.0;
+ param->analyse.f_aq_sensitivity = 15;
param->analyse.i_luma_deadzone[0] = 21;
param->analyse.i_luma_deadzone[1] = 11;
param->analyse.b_psnr = 1;
@@ -455,6 +458,13 @@
p->analyse.b_fast_pskip = atobool(value);
OPT("dct-decimate")
p->analyse.b_dct_decimate = atobool(value);
+ OPT("aq-strength")
+ {
+ p->analyse.f_aq_strength = atof(value);
+ p->analyse.b_aq = (p->analyse.f_aq_strength > 0.0);
+ }
+ OPT("aq-sensitivity")
+ p->analyse.f_aq_sensitivity = atof(value);
OPT("deadzone-inter")
p->analyse.i_luma_deadzone[0] = atoi(value);
OPT("deadzone-intra")
@@ -939,6 +949,9 @@
s += sprintf( s, " zones" );
}

+ if( p->analyse.b_aq )
+ s += sprintf( s, " aq=1:%.1f:%.1f", p->analyse.f_aq_strength, p->analyse.f_aq_sensitivity );
+
return buf;
}

Index: x264.h
===================================================================
--- x264.h (revision 712)
+++ x264.h (working copy)
@@ -230,6 +230,9 @@
int i_trellis; /* trellis RD quantization */
int b_fast_pskip; /* early SKIP detection on P-frames */
int b_dct_decimate; /* transform coefficient thresholding on P-frames */
+ int b_aq; /* psy adaptive QP */
+ float f_aq_strength;
+ float f_aq_sensitivity;
int i_noise_reduction; /* adaptive pseudo-deadzone */

/* the deadzone size that will be used in luma quantization */

EXE updated.

LigH
15th December 2007, 21:41
Hooray!

Thanks for this patch. I bet some friends in the german board will test it too.

Sagekilla
16th December 2007, 00:39
So now the general starting point for using AQ-strength would be 1.0 and not 0.5 as before? If so that would be quite nice, since I'd imagine it'd give me some more leeway with only using tiny amounts of AQ in those pesky movies where there are relatively few dark scenes.

Dark Shikari
16th December 2007, 01:47
I found a bug with my energy function where I get an integer overflow in some extremely bright blocks, resulting in AQ not being activated even if the block is flat. A fix will come in a bit.
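
For reference, this kind of overflow is easy to hit in the SSD - SAD^2 form: a 16x16 block's pixel sum can reach 255 * 256 = 65280, and squaring anything above 46340 (an average pixel value of about 181) no longer fits in a signed 32-bit int. The sketch below shows one possible way around it, forming the product in 64 bits; it is an illustration with made-up function names, not necessarily the fix that went into the patch.

```c
#include <assert.h>
#include <stdint.h>

/* AC energy of one plane from its pixel sum (sad) and sum of squared
 * pixels (ssd), both measured against zero. shift is log2 of the pixel
 * count: 8 for 16x16, 6 for 8x8. The square is formed in 64 bits. */
static uint32_t ac_energy_plane(uint32_t sad, uint32_t ssd, int shift)
{
    assert(shift >= 0 && shift < 32);
    uint64_t mean_term = ((uint64_t)sad * sad) >> shift;
    assert(mean_term <= ssd);   /* holds for sums taken from a real block */
    return (uint32_t)(ssd - mean_term);
}

/* Returns 1 if sad * sad would not fit in a signed 32-bit int. */
static int sad_square_overflows_int32(uint32_t sad)
{
    return (uint64_t)sad * sad > (uint64_t)INT32_MAX;
}
```

A flat, fully white 16x16 block has sad = 65280 and ssd = 16646400, which gives an energy of 0 here, as it should for a flat block.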

I'm also working on a magical algorithm to automatically find a good threshold value for each frame. :)

Sagekilla
16th December 2007, 02:10
Could this be magically added to a multi-patched x264 with all your other wonderful patches too? :)

Dark Shikari
16th December 2007, 02:15
Soon. In the meantime, a teaser of the latest algorithm:

(1-pass ABR, 1000 kbit, a comparison of two I-frames)

Original:

http://i6.tinypic.com/72riyc7.png
http://i2.tinypic.com/6lb7k2g.png

AQ:

http://i4.tinypic.com/8ftnotu.png
http://i17.tinypic.com/8borxbp.png

Note most of the ringing is from the original source, which was not a very well-encoded DVD (and so blurring obscures the ringing when AQ isn't used).

If you want a huge contrast between the two, look at the wheel in the background on the first image. Or the bricks in the background on the second image.

Sagekilla
16th December 2007, 02:16
Very nice, some good detail retention in areas that I'd imagine would otherwise be killed off..

kumi
16th December 2007, 04:38
I see a huge difference in CRF output size with --aq-sensitivity 0. Normal?

--crf 21.5
Size: 9.99 MB
Bitrate (Avg): 1.169

--crf 21.5 --aq-str 1.0
Size: 8.12 MB
Bitrate (Avg): 0.950

--crf 21.5 --aq-str 1.0 --aq-sens 0
Size: 36.8 MB
Bitrate (Avg): 4.309

Sagekilla
16th December 2007, 04:41
Perhaps it may be borked with 0, like one of those divide by zero errors.

kumi
16th December 2007, 04:56
Yes, but
"2. For the automagic thresholding algorithm, use --aq-sensitivity 0."

Dark Shikari
16th December 2007, 05:11
Try with bitrate mode to make the results more comparable. It's likely screwing up ratecontrol--I will try to see what I can do to make it avoid blowing up the filesize.

That is natural though--AQ does drastically raise filesize with CRF. It's just that in this case it has raised it a bit more than usual.

check
16th December 2007, 05:43
what do you get with a sensitivity very near 1?

Sagekilla
16th December 2007, 05:43
I have to say, I find your AQ to be quite interesting.. I actually got a -huge- reduction in bitrate when I used it. 1738 kbps without vs 1475 kbps with, in one of my tests. That was using a simple --aq-strength 0.5 --aq-sensitivity 15.

@Check: At that point I think it'd be still running at the regular non-adaptive sensitivity so it would activate on very few blocks according to what the help says (low aq = less blocks activated on, high aq = more blocks activated on)

Dark Shikari
16th December 2007, 06:17
0 = adaptive, any other value = regular scheme.

It's possible adaptive could be a bit too strong by default, so experiment with lower strength values (and I could experiment with slightly better adaptive schemes).

Sagekilla
16th December 2007, 06:41
It seems like it, because I tried the adaptive mode (wouldn't that make it... adaptive adaptive quantization?) myself and I ended up with a severely bloated file compared with having it at the default sensitivity of 15. Personally I like how it decreases the file sizes while not really decreasing the quality at all, so that's just about reason enough for me to just go with strength 0.9 and sensitivity 15.

Dark Shikari
16th December 2007, 11:18
I found some serious problems with automagic thresholding--it was consistently overestimating the necessary threshold.

As a result, I implemented a much more brute-force (and as a result slightly slower) algorithm that should be able to find a better threshold. Try it out--unlike before, it shouldn't screw up ratecontrol.

Strength has also been moved to a different part of the formula for easier control over the results of the algorithm.

The goal of this latest algorithm is to keep the average QP per frame the same. This, in most cases, keeps the bits per frame relatively similar, which means AQ should no longer drastically increase or decrease bitrate at a given CRF/QP.

kumi
16th December 2007, 11:41
Great! Can't wait to test :D

ToS_Maverick
16th December 2007, 13:23
what i found out about 0.3 with BlackPearl:
- --aq-strength 1.0 --aq-sensitivity 20 and CRF20 is transparent
- your new AQ produces bigger files, but the quality is better than ever!
- str 1.0 is very balanced
- below sens 20 some areas get left behind
- auto mode (sens 0) is producing heavily undersized files
- --aq-strength 1.0 --aq-sensitivity 20 and CRF20 = --aq-strength 1.0 and CRF15, about the same size and quality (only a small difference)
- SSIM is the same (CRF20 with AQ compared to CRF17 without, same size)
- PSNR is 1 dB lower (OMG ;))

why is your adaptive-mode acting so weird? what is it supposed to do?

Sagekilla
16th December 2007, 16:43
The "adaptive mode" for adaptive quantization is supposed to dynamically choose the best sensitivity for each frame, so it can change the qps accordingly, or that's what it seems to be doing from what I can infer.

Dark Shikari
16th December 2007, 21:07
And it defines "best" as the sensitivity that results in the average QP for that frame not changing--i.e. if it raises 20 QPs by 5, it also has to lower other QPs by a total of 100.
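
A toy version of that zero-sum idea, to show what a brute-force search for such a threshold could look like. Everything here is an assumption made for illustration: the per-block rule in qp_delta (one QP step per doubling of energy relative to the threshold) and the power-of-two candidate thresholds are stand-ins, not the formula or search used in the actual patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* floor(log2(v)); returns 0 for v = 0 or 1. */
static int ilog2_u32(uint32_t v)
{
    int n = 0;
    while (v >>= 1)
        n++;
    return n;
}

/* Hypothetical per-block rule: positive raises QP, negative lowers it. */
static int qp_delta(uint32_t energy, uint32_t threshold)
{
    return ilog2_u32(energy) - ilog2_u32(threshold);
}

/* Try every power-of-two threshold and keep the one whose QP deltas sum
 * closest to zero, i.e. the one that leaves the frame's average QP alone. */
static uint32_t find_neutral_threshold(const uint32_t *energy, int count)
{
    assert(count > 0);
    uint32_t best = 1;
    long best_abs = LONG_MAX;
    for (int bits = 0; bits < 32; bits++) {
        uint32_t threshold = (uint32_t)1 << bits;
        long sum = 0;
        for (int i = 0; i < count; i++)
            sum += qp_delta(energy[i], threshold);
        long abs_sum = sum < 0 ? -sum : sum;
        if (abs_sum < best_abs) {
            best_abs = abs_sum;
            best = threshold;
        }
    }
    return best;
}
```

With 20 blocks at energy 32768 and 50 blocks at energy 256, this picks a threshold of 1024: the 20 high-energy blocks each go up by 5 and the 50 low-energy blocks each go down by 2, so the raises (100) and the reductions (100) cancel.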

Sagekilla
16th December 2007, 21:38
Does this affect the qps of each block after x264 chooses a qp for a given frame or does this intermix with the qp decision to give the frame?

Dark Shikari
16th December 2007, 21:43
After, because x264's frame-QP decision is already based on the relative complexity of the frame.

Sagekilla
16th December 2007, 21:45
Gotcha, so in this case the new adaptive mode will just be adding and removing bits here and there without actually reducing or increasing the bitrate significantly?

Dark Shikari
16th December 2007, 22:48
Ideally, yes.

ToS_Maverick
16th December 2007, 23:22
then why does it lead to a massive undersize with this sample, while it actually should increase the bitrate?

and why are the final quants so low (15-17)?

Dark Shikari
16th December 2007, 23:28
Can you upload the .h264 stream so I can look at it?

ToS_Maverick
17th December 2007, 00:00
you should have the sample, try it with these settings:
--crf 20.0 --level 4.1 --keyint 100 --min-keyint 1 --ref 3 --mixed-refs --no-fast-pskip --bframes 2 --b-pyramid --bime --weightb --filter -2,-2 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-bufsize 9781 --vbv-maxrate 29400 --threads auto --thread-input --progress --no-dct-decimate --output "output" "input" --aq-strength 1.0

i can't upload it today, if you need it i'll upload it tomorrow!

thx and good night ;)

Dark Shikari
17th December 2007, 00:05
What sample? What do you mean, "you should have the sample"? That's not exactly descriptive... :rolleyes:

kumi
17th December 2007, 02:57
It seems the oversizing is a little better now, only +14% @ 0.9 strength. I haven't encountered any undersizing yet :rolleyes:

25.1 MB --crf 21.5
25.4 MB --crf 21.5 --aq-strength 0.3 --aq-sensitivity 0
26.7 MB --crf 21.5 --aq-strength 0.6 --aq-sensitivity 0
29.2 MB --crf 21.5 --aq-strength 0.9 --aq-sensitivity 0



I compared AQ on a 2% sample of bright, outdoor, well-shot prosumer SD DV movie. Scenes consist of lots of close-ups of people's faces talking, with lots of action in the background. No dark scenes at all.

25.1 MB --crf 21.5
vs
25.2 MB --crf 22.5 --aq-strength 0.9 --aq-sensitivity 0

Right away the most noticeable improvement is in the increased facial detail and skin tones. I mean HUGE improvement. Mosquito noise, blocking and banding are all much less visible. And not just the dark and/or detailed flat areas, either. Everywhere there is detail to bring out, like human hair, it seems to bring it out. In fact I can't find areas that look worse than before... where are the extra bits coming from?! This is voodoo magic! :eek:

If there's anything I would ask, it would be to speed it up a bit (if possible), and release a fast-ref-search/AQ binary :p But this is #$%@ing great work you've done here, thank you. :cool:

Dark Shikari
17th December 2007, 03:03
Right away the most noticeable improvement is in the increased facial detail and skin tones. I mean HUGE improvement. Mosquito noise, blocking and banding are all much less visible. And not just the dark and/or detailed flat areas, either. Everywhere there is detail to bring out, like human hair, it seems to bring it out. In fact I can't find areas that look worse than before... where are the extra bits coming from?! This is voodoo magic! :eek:

It takes the bits from the areas with the highest variance--a very sharp boundary with strong brightness differences, for example, would get bits taken away. I'm not sure if this would have a negative effect in anime--in live action any negative effect seems to be nearly invisible.

One thing you'll find when looking at bit distribution of non-AQ encodes is that often the vast majority of the bits are concentrated in very small areas; one can easily take a few away without there being much noticeable difference.
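
To make the idea concrete, here is a minimal Python sketch of variance-driven bit redistribution. It is not the formula from the patch: the log2 mapping, the function names and the use of the frame average as the pivot are all assumptions made for illustration. Only the direction comes from the post above (high-variance blocks give up bits, flatter blocks gain them).

```python
import math

MB = 16  # macroblock size in pixels


def block_variance(frame, x0, y0, size=MB):
    """Population variance of the luma samples in one size x size block."""
    samples = [frame[y][x]
               for y in range(y0, y0 + size)
               for x in range(x0, x0 + size)]
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def qp_offsets(frame, strength=0.5, size=MB):
    """Per-block QP offsets relative to the frame average (hypothetical mapping).

    Positive offset = coarser quantizer (bits taken away),
    negative offset = finer quantizer (bits given).
    """
    height, width = len(frame), len(frame[0])
    energy = [[math.log2(block_variance(frame, x, y, size) + 1.0)
               for x in range(0, width - size + 1, size)]
              for y in range(0, height - size + 1, size)]
    flat = [e for row in energy for e in row]
    average = sum(flat) / len(flat)
    return [[strength * (e - average) for e in row] for row in energy]


# Two macroblocks side by side: a flat grey block and a hard checkerboard.
frame = [[128] * 16 + [255 if (x + y) % 2 else 0 for x in range(16)]
         for y in range(16)]
offsets = qp_offsets(frame, strength=0.5)
print(offsets)  # flat block gets a negative offset, checkerboard a positive one
```

Because the offsets are centred on the frame average they sum to zero, which mirrors the "ideally, no net bitrate change" intent discussed earlier in the thread, though real bit cost is not linear in QP.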

Sagekilla
17th December 2007, 03:40
Everywhere there is detail to bring out, like human hair, it seems to bring it out. In fact I can't find areas that look worse than before... where are the extra bits coming from?! This is voodoo magic! :eek:

Voodoo magic? No.. This.. Is.. SPARTA!!

@Dark Shikari: If it were to be that way, wouldn't it be a good idea to use the mode you're using right now as a "real life" mode, and then use a different type of AQ as an "anime" mode? Because, I do encode a few anime sources where I do need to use AQ, and if the new AQ will harm anime then I really think you should consider adding a switch to choose anime/real life mode or something to that effect.

Sharktooth
17th December 2007, 04:01
why don't you try the new AQ on your anime, see it with your eyes and report back?
it would be really useful info...

Dark Shikari
17th December 2007, 04:29
why don't you try the new AQ on your anime, see it with your eyes and report back?
it would be really useful info...

I've tried it, and it's hard to tell. It really does salvage some detail, much like in ordinary encodes, but I'm really not that sure about it.

Sharktooth
17th December 2007, 15:57
... it was directed to sagekilla ...

i know you probably tested it on anime too, but a second POV would be useful...

ToS_Maverick
17th December 2007, 19:55
Dark Shikari, i meant the BlackPearlSample, my main testsample ;)

from your screens i could see you still got it, anyway, i posted the link here:
http://forum.doom9.org/showthread.php?p=1028047#post1028047

i used this script:
DGDecode_mpeg2source("Black.Pearl.Sample.d2v", idct=7)
trim(2,0)
crop(0,58,0,-62)
LanczosResize(768,320)

crf 15 --aq-strength 1.0:
--[NoImage] Job commandline: "C:\Programme\megui\tools\x264\x264.exe" --crf 15.0 --level 4.1 --keyint 100 --min-keyint 1 --ref 3 --mixed-refs --no-fast-pskip --bframes 2 --b-pyramid --bime --weightb --filter -2,-2 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-bufsize 9781 --vbv-maxrate 29400 --threads auto --thread-input --progress --no-dct-decimate --output "F:\Video\BlackPearl\Black.Pearl.Sample crf 15 newaq10.mkv" "F:\Video\BlackPearl\Black.Pearl.Sample.avs" --aq-strength 1.0
--[Information] [16.12.2007 13:08:58] Encoding started
--[NoImage] Standard output stream
--[NoImage] Standard error stream
---[NoImage] avis [info]: 768x320 @ 23.98 fps (3622 frames)
---[NoImage] x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 SSE3 SSSE3 Cache64
---[NoImage] x264 [info]: slice I:98 Avg QP:14.66 size: 33513 PSNR Mean Y:46.78 U:48.70 V:49.63 Avg:47.40 Global:46.44
---[NoImage] x264 [info]: slice P:1955 Avg QP:16.93 size: 16632 PSNR Mean Y:44.88 U:47.55 V:48.51 Avg:45.69 Global:45.52
---[NoImage] x264 [info]: slice B:1569 Avg QP:18.58 size: 6651 PSNR Mean Y:43.48 U:46.79 V:47.66 Avg:44.39 Global:44.23
---[NoImage] x264 [info]: mb I I16..4: 14.4% 24.3% 61.2%
---[NoImage] x264 [info]: mb P I16..4: 8.8% 21.7% 13.7% P16..4: 21.7% 21.7% 9.9% 0.0% 0.0% skip: 2.5%
---[NoImage] x264 [info]: mb B I16..4: 2.1% 5.5% 2.1% B16..8: 42.6% 3.6% 8.0% direct:14.7% skip:21.4%
---[NoImage] x264 [info]: 8x8 transform intra:47.9% inter:29.8%
---[NoImage] x264 [info]: ref P 74.8% 17.5% 7.7%
---[NoImage] x264 [info]: ref B 79.5% 16.6% 3.9%
---[NoImage] x264 [info]: SSIM Mean Y:0.9818539
---[NoImage] x264 [info]: PSNR Mean Y:44.322 U:47.254 V:48.177 Avg:45.170 Global:44.935 kb/s:2448.41
---[NoImage] encoded 3622 frames, 31.19 fps, 2448.63 kb/s

crf 20 --aq-strength 1.0 --aq-sensitivity 20
--[NoImage] Job commandline: "C:\Programme\megui\tools\x264\x264.exe" --crf 20.0 --level 4.1 --keyint 100 --min-keyint 1 --ref 3 --mixed-refs --no-fast-pskip --bframes 2 --b-pyramid --bime --weightb --filter -2,-2 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-bufsize 9781 --vbv-maxrate 29400 --threads auto --thread-input --progress --no-dct-decimate --output "F:\Video\BlackPearl\Black.Pearl.Sample crf 20 newaq10 sens10.mkv" "F:\Video\BlackPearl\Black.Pearl.Sample.avs" --aq-strength 1.0 --aq-sensitivity 20
--[Information] [16.12.2007 11:58:52] Encoding started
--[NoImage] Standard output stream
--[NoImage] Standard error stream
---[NoImage] avis [info]: 768x320 @ 23.98 fps (3622 frames)
---[NoImage] x264 [info]: using cpu capabilities: MMX MMXEXT SSE SSE2 SSE3 SSSE3 Cache64
---[NoImage] x264 [info]: slice I:98 Avg QP:19.66 size: 28360 PSNR Mean Y:45.00 U:47.60 V:48.59 Avg:45.74 Global:43.20
---[NoImage] x264 [info]: slice P:1955 Avg QP:21.93 size: 15696 PSNR Mean Y:44.31 U:47.18 V:48.16 Avg:45.15 Global:44.86
---[NoImage] x264 [info]: slice B:1569 Avg QP:23.58 size: 6291 PSNR Mean Y:42.89 U:46.36 V:47.24 Avg:43.83 Global:43.60
---[NoImage] x264 [info]: mb I I16..4: 15.5% 28.0% 56.5%
---[NoImage] x264 [info]: mb P I16..4: 8.2% 21.3% 14.6% P16..4: 22.7% 22.1% 8.9% 0.0% 0.0% skip: 2.3%
---[NoImage] x264 [info]: mb B I16..4: 2.3% 6.0% 2.1% B16..8: 43.3% 3.6% 7.1% direct:14.3% skip:21.2%
---[NoImage] x264 [info]: 8x8 transform intra:47.9% inter:29.6%
---[NoImage] x264 [info]: ref P 74.8% 17.6% 7.6%
---[NoImage] x264 [info]: ref B 79.4% 16.6% 4.0%
---[NoImage] x264 [info]: SSIM Mean Y:0.9807162
---[NoImage] x264 [info]: PSNR Mean Y:43.712 U:46.837 V:47.774 Avg:44.595 Global:44.223 kb/s:2294.88
---[NoImage] encoded 3622 frames, 32.24 fps, 2295.10 kb/s

for crf 20 --aq-strength 1.0 i get
918 kb/s
SSIM 0.969
which is way too low in size, SSIM and visual quality

Dark Shikari
17th December 2007, 19:58
for crf 20 --aq-strength 1.0 i get
918 kb/s
SSIM 0.969
which is way too low in size, SSIM and visual quality

How about you compare two different videos, at the same bitrate, visually?

ToS_Maverick
17th December 2007, 20:13
ok, i think i have to express myself a bit more clearly ;)

the crf 15 vid has 150 kb/s more bitrate than the crf 20 one, that's 6.5 %. for me, that's close enough. i compared them visually of course.
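
The figures can be checked against the two x264 logs posted earlier. A small Python sketch follows; the function name is made up, and the two bitrates are copied from the final `kb/s` values in those logs.

```python
def percent_difference(a_kbps, b_kbps):
    """Return (absolute difference, percent of b, percent of a)."""
    diff = a_kbps - b_kbps
    return diff, 100.0 * diff / b_kbps, 100.0 * diff / a_kbps


crf15_kbps = 2448.41  # crf 15 --aq-strength 1.0
crf20_kbps = 2294.88  # crf 20 --aq-strength 1.0 --aq-sensitivity 20

diff, pct_of_crf20, pct_of_crf15 = percent_difference(crf15_kbps, crf20_kbps)
print(f"{diff:.2f} kb/s, {pct_of_crf20:.1f}% of the crf 20 rate, "
      f"{pct_of_crf15:.1f}% of the crf 15 rate")
```

Depending on which encode is taken as the base, the gap works out to somewhere between roughly 6.3 and 6.7 percent, so the rounded "150 kb/s, 6.5 %" is a fair summary.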

to give you an idea:
vid1=directshowsource("Black.Pearl.Sample crf 15 newaq10.mkv", audio=false).lanczosresize(1280,536)
vid2=directshowsource("Black.Pearl.Sample crf 20 newaq10 sens20.mkv", audio=false).lanczosresize(1280,536)
interleave(vid1,vid2)

normally, i predict the quality of my encodes with the average quant/ratefactor. now a sample, that is transparent at 20, suddenly needs 15. that's a bit weird for me.

Dark Shikari
17th December 2007, 20:23
ok, i think i have to express myself a bit more clearly ;)

the crf 15 vid has 150 kb/s more bitrate than the crf 20 one, that's 6.5 %. for me, that's close enough. i compared them visually of course.

to give you an idea:
vid1=directshowsource("Black.Pearl.Sample crf 15 newaq10.mkv", audio=false).lanczosresize(1280,536)
vid2=directshowsource("Black.Pearl.Sample crf 20 newaq10 sens20.mkv", audio=false).lanczosresize(1280,536)
interleave(vid1,vid2)

normally, i predict the quality of my encodes with the average quant/ratefactor. now a sample, that is transparent at 20, suddenly needs 15. that's a bit weird for me.

I'll have to do some testing with this to see why the bitrate is changing so much--whether it's the fact that it doesn't need all that bitrate when AQ is applied, or whether AQ is just being applied unevenly.

I might have found the problem. It might be because of the black on the bottom and the top, the letterbox padding--this is counted as "flat" and so the formula screws up completely. But I'm not 100% sure about this... /goes back to testing.
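
A rough Python sketch of how that could happen, assuming (purely for illustration) that block complexity is measured as log2(variance + 1) and compared against a frame-wide average. Neither the measure nor the block counts below come from the patch; they only show how zero-variance border blocks can drag such an average down.

```python
import math


def mean_log_energy(variances, ignore_flat_borders=False):
    """Average log2(variance + 1) over blocks, optionally skipping zero-variance ones."""
    values = [math.log2(v + 1.0) for v in variances
              if not (ignore_flat_borders and v == 0)]
    return sum(values) / len(values)


# Hypothetical frame: 60 picture blocks with moderate texture,
# plus 20 pure-black letterbox blocks with zero variance.
picture = [200.0] * 60
borders = [0.0] * 20

with_bars = mean_log_energy(picture + borders)
without_bars = mean_log_energy(picture + borders, ignore_flat_borders=True)
print(f"average with bars: {with_bars:.2f}, without bars: {without_bars:.2f}")
```

In this made-up frame every picture block sits well above the bar-inclusive average, so a scheme that raises the quantizer for "above average" blocks would penalise the whole visible picture. Excluding the flat border blocks from the average is one possible workaround, not something the thread confirms.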

foxyshadis
17th December 2007, 22:50
Maybe the quant accounting is just off, since it'll average out to the same quant but a very different bit allocation. It probably doesn't really matter that much; forcing the same bitrate would require modifying the RC or accounting for how many bits every quant change adds or subtracts, which is a lot of work for questionable gain.

Gilgamesh83
17th December 2007, 23:49
Hi!

Was just wondering about what the aq does in a codec,
does it:

a) pull bitrate from dynamic parts of the frame to non dynamic part of the frame?

b) pull bitrate from brighter parts of the frame to darker parts of the frame? (does it have to do with the fact that codecs give less bitrate in darker areas with the same quantizer in bright areas? or something like that.)

again, a quick answer is ok since I'm a noob, or rather, I know nothing about programming but am an avid user of x264 in megui.
I have always imagined that aq could give different quantizers within a frame, e.g. to only a small dynamic part of a frame. (like when in an anime there is a frame that is still but has a tv that shows static; that frame would still get a great amount of bitrate because of the dynamics in the static tv part.)

Dark Shikari
18th December 2007, 01:38
a) pull bitrate from dynamic parts of the frame to non dynamic part of the frame?

This would be some sort of motion-based AQ. I've never seen one, personally.

b) pull bitrate from brighter parts of the frame to darker parts of the frame? (does it have to do with the fact that codecs give less bitrate in darker areas with the same quantizer in bright areas? or something like that.)

That's called brightness-based AQ. Elecard supports this, and regular x264 AQ, though not brightness-based, is thresholded by brightness.

Regular x264 AQ finds the flattest parts of the frame (least complex) and gives them more bits. Mine is somewhat similar, but does it using different math and is much more willing to move bits around.

Dark Shikari
18th December 2007, 02:01
A test of the AQ, at 2 megabits per second:

Original (no AQ) (http://mirror05.x264.nl/Dark/force.php?file=./Original.mkv)

New AQ (http://mirror05.x264.nl/Dark/force.php?file=./AQTest.mkv)

Notice the massively better grain retention with the AQ.

Sharktooth
18th December 2007, 02:20
what settings did you use for those 2 encodings?
also how does it behave at very low quantizers (below 18)?
it seems in the original one, background grain is there on the i frame, but on Bs it gets washed. it does the same with AQ, but it "drops" much less.

Dark Shikari
18th December 2007, 03:08
what settings did you use for those 2 encodings?
also how does it behave at very low quantizers (below 18)?
it seems in the original one, background grain is there on the i frame, but on Bs it gets washed. it does the same with AQ, but it "drops" much less.

I used 3pass with pretty much maxed settings, with --no-dct-decimate. Trellis 1.

I-frames will obviously have better grain because they have lower quantizers.

I'm testing a 5 megabit encode right now to see behavior at lower quantizers.

Also note that the original video clip isn't the best quality--some blurring and even blocking is actually from the original.

Gilgamesh83
18th December 2007, 03:20
This would be some sort of motion-based AQ. I've never seen one, personally.
That's called brightness-based AQ. Elecard supports this, and regular x264 AQ, though not brightness-based, is thresholded by brightness.

Regular x264 AQ finds the flattest parts of the frame (least complex) and gives them more bits. Mine is somewhat similar, but does it using different math and is much more willing to move bits around.

thx for the answer. would be cool if a motion-based aq existed.

foxyshadis
18th December 2007, 08:14
Motion-based isn't that interesting. You'd really want some sort of area-of-visual-interest AQ, but that is firmly in Hard Problem territory. You'd practically need the director or someone intimately familiar with the film to highlight the area the eye is focusing on in every frame. Research on this is progressing every year, though; there are a lot of papers out there if anyone ever wants to take a (quixotic imho) stab at it.

The quantizer, deadzone, and custom matrix are all there to tweak how the codec responds to motion and detail, and generally do a fine job; it's grain that has generated nearly all of the complaints over the years. The lousy performance of h.264 with grain is the main reason VC-1 even exists.

btw, xvid and ffmpeg use method b and call it lumimasking. x264 doesn't need that since it doesn't have the same overquantization problems in dark areas they do, although there's some overlap when dark areas are flat but grainy.

ToS_Maverick
18th December 2007, 09:32
@Dark Shikari:
did you find something out during your testing? just curious ;)

foxyshadis
18th December 2007, 11:28
Also: It definitely helps reduce blocking of gradients in anime. It can increase bitrate by a pretty inordinate amount in some scenes, though, even with zero visual difference - anime's probably so flat that the algorithm goes a little crazy.

Valeron
18th December 2007, 16:51
hi, Dark Shikari, not good news from my anime encode experience.
crf18 same setting, one with ur new AQ strength 1.0 and threshold 0 (automatic) enabled, the other with AQ disabled; the AQ-enabled one looks bad compared to the no-AQ encode. And it is 27MB larger in size.
If u would like some screen shot, I can post here tomorrow.

Dark Shikari
18th December 2007, 17:11
hi, Dark Shikari, not good news from my anime encode experience.
crf18 same setting, one with ur new AQ strength 1.0 and threshold 0 (automatic) enabled, the other with AQ disabled; the AQ-enabled one looks bad compared to the no-AQ encode. And it is 27MB larger in size.
If u would like some screen shot, I can post here tomorrow.

"Bad" is sort of a bad term--screenshots are definitely useful to illustrate. Also, comparing to files with different sizes is generally bad, too.

burfadel
18th December 2007, 17:15
"Bad" is sort of a bad term--screenshots are definitely useful to illustrate. Also, comparing to files with different sizes is generally bad, too.

Especially since CRF is constant quality, not constant bitrate (which would end up with the same file size). The use of P and B frames, as well as macroblocks would also be different with constant quality and AQ enabled (?), so although the image may have the same CRF, the filesize will end up being different! The filesize could go either way?...

Dark Shikari
18th December 2007, 17:22
Especially since CRF is constant quality, not constant bitrate (which would end up with the same file size). The use of P and B frames, as well as macroblocks would also be different with constant quality and AQ enabled (?), so although the image may have the same CRF, the filesize will end up being different! The filesize could go either way?...

And of course comparing individual frames is also bad unless the GOPs have the same structure--comparing a B-frame to an I-frame, for example, is just meaningless.

Best way to compare is just to run two two-pass encodes at the same target bitrate, one with AQ and one without, and compare the results.
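
A sketch of what that comparison could look like, written as a small Python helper that only builds the x264 command lines (it does not run them). The executable names, file names and the 700 kbit target are placeholders; `--pass`, `--bitrate`, `--stats` and `--output` are standard x264 options, and `--aq-strength` is the switch from the patched build, passed to both passes as the first post recommends. Writing the first pass to `NUL` assumes Windows.

```python
def two_pass_commands(x264_exe, source, output, bitrate_kbps, stats_file, extra=()):
    """Build argument lists for pass 1 and pass 2 at a fixed target bitrate."""
    common = [x264_exe, "--bitrate", str(bitrate_kbps), "--stats", stats_file, *extra]
    first = common + ["--pass", "1", "--output", "NUL", source]   # discard pass-1 video
    second = common + ["--pass", "2", "--output", output, source]
    return [first, second]


# Reference encode with an unpatched build, AQ encode with the patched build.
reference = two_pass_commands("x264_reference.exe", "clip.avs", "no_aq.mkv",
                              700, "no_aq.stats")
with_aq = two_pass_commands("x264_aq.exe", "clip.avs", "aq.mkv",
                            700, "aq.stats", extra=("--aq-strength", "1.0"))

for command in reference + with_aq:
    print(" ".join(command))
```

Each encode gets its own stats file so the two runs cannot contaminate each other, and since both land on the same size, any visual difference can be attributed to how the bits were distributed.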

Sharktooth
18th December 2007, 17:44
i'd suggest to encode at a target bitrate and see if the AQ encode looks better...

zbutsam
18th December 2007, 23:53
I read in a post on this topic that this AQ patch could improve the encoding of scenes with grass textures and I had a great clip to test it on. It is a trailer of the film "Kicking and Screaming" which contains a lot of action on football fields.
The clip (originally in 720p) was resized to 560x304 for speed's sake and encoded in MeGUI using the HQ-Fast profile and with a two-pass target bitrate of 700kbits (rather low I know but I wanted to see how the AQ would react with lower bitrates).
As a reference I used the latest x264 build that MeGUI would download (709).
For the AQ x264.exe I added the switches
--aq-strength 1.0 --aq-sensitivity 20.

The results were great :) Whereas the original would struggle with the low bitrate, having to blur details on the grass and producing a flat effect, the new AQ managed to preserve much more detail without noticeable loss of quality anywhere else.

On a 1000 kbit 2-pass I tried just with the AQ build it retained much more detail and the picture was sharp (however I will have to go back and repeat this last test with the unmodified build to see how that does).

Speed-wise the build with the AQ patch is about 10% slower for me.

I will try to post more tests later but, so far, congratulations :) it seems to be working great

Sagekilla
19th December 2007, 00:05
I read in a post on this topic that this AQ patch could improve the encoding of scenes with grass textures and I had a great clip to test it on. It is a trailer of the film "Kicking and Screaming" which contains a lot of action on football fields.
The clip (originally in 720p) was resized to 560x304 for speed's sake and encoded in MeGUI using the HQ-Fast profile and with a two-pass target bitrate of 700kbits (rather low I know but I wanted to see how the AQ would react with lower bitrates).
As a reference I used the latest x264 build that MeGUI would download (709).
For the AQ x264.exe I added the switches
--aq-strength 1.0 --aq-sensitivity 20.

The results were great :) Whereas the original would struggle with the low bitrate, having to blur details on the grass and producing a flat effect, the new AQ managed to preserve much more detail without noticeable loss of quality anywhere else.

On a 1000 kbit 2-pass I tried just with the AQ build it retained much more detail and the picture was sharp (however I will have to go back and repeat this last test with the unmodified build to see how that does).

Speed-wise the build with the AQ patch is about 10% slower for me.

I will try to post more tests later but, so far, congratulations :) it seems to be working great

700 kbps actually isn't that low for 560x304. I manage to get around 1.1 mbps (300, surprisingly) to about 2 mbps on most of my encodes @ 864x480. Then again, I do typically enable most settings except for esa. On some of my encodes I've decided to just go with the full 16 refs since they're so slow to begin with because of preprocessing.


In any case, that's very interesting to hear. I'm waiting for the preprocessing to finish on one of my videos before I decide how to tackle it with AQ. Last encode I ran on it, I ended up using an older build of the new AQ and the newer builds seem to be doing an even better job, so I'm redoing it for the probably 8th time now.

Dark Shikari
19th December 2007, 01:15
700 kbps actually isn't that low for 560x304. I manage to get around 1.1 mbps (300, surprisingly) to about 2 mbps on most of my encodes @ 864x480.

The main issue that I find, however, is that without my AQ, you need very high bitrates to retain fine background detail, like grass; it simply doesn't put the bits where they need to be. As a result, you end up needing vastly higher bitrates to achieve transparency, even though most of those bits end up wasted.

Sagekilla
19th December 2007, 02:45
The main issue that I find, however, is that without my AQ, you need very high bitrates to retain fine background detail, like grass; it simply doesn't put the bits where they need to be. As a result, you end up needing vastly higher bitrates to achieve transparency, even though most of those bits end up wasted.


Neat, all the more for me to be excited about re-encoding 300 for the zillionth time. At the bitrates I said above, I usually find the videos to be mostly transparent (except backgrounds, which tend to be slightly blocked but manageable). I just hope the rate control won't get screwed up badly using crf 18. By the way, why did you say that grass tends to be unfairly smeared by x264 again? I always found that a bit of an odd quirk.

Sharktooth
19th December 2007, 03:02
metrics do not represent the eye's perception. they're just numbers that represent an average deviation from the source picture. since x264's internals were designed to get the best compression while keeping metrics high, sometimes the codec produces unwanted (visually speaking) results.
however, every codec developer is more or less using the same method because it's easier to compare eventual (metric) improvements, and that leads to faster development. when the algos are optimized and the compression gets close to the theoretical maximum, then visual optimizations (psy and other stuff) are introduced to obtain a visually pleasing picture.
in other words, x264 has very good compression, but the picture quality can be improved drastically.

rhester72
19th December 2007, 21:58
Since there are no current diffs, any chance we could get the test EXEs compiled with MP4 output support?

Rodney

Dark Shikari
19th December 2007, 22:02
Here's a diff... (http://pastebin.com/f54db05b9)

I really need to work on the automatic thresholding and the formula though--there are some cases in which the AQ really doesn't work well. I've been busy lately though--final exams and watching Haruhi.

Sagittaire
19th December 2007, 22:24
Well, here is a well-known psy optimisation for noise/grain: for the HVS, noise in dark areas is useless.

1) Pre-process to reduce noise in dark areas. Apply strong denoising/degraining in dark areas (with luma < 40, for example).

2) Use a lower quant in dark areas for better HVS quality there. After "dark denoising", dark areas will be more compressible. Using high quality (a low quantizer) for dark areas is really important for TFT screens.

3) Use a higher quant for complex textures. With this HVS AQ the quality of flat areas will be noticeably better. Combining the "dark denoising pre-process" with "spatial complexity AQ" will directly produce better quality in dark areas.
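
Steps 2 and 3 can be sketched as a per-block QP offset rule. This is only an illustration in Python: the luma threshold of 40 is the example value from step 1, while the variance threshold and the size of the offsets are invented numbers, and the denoising pre-process of step 1 is not shown.

```python
def hvs_qp_offset(mean_luma, variance,
                  dark_luma=40,            # example threshold from the post
                  texture_variance=1000.0,  # hypothetical "complex texture" cutoff
                  dark_bonus=-2,            # hypothetical: finer quantizer in dark blocks
                  texture_penalty=2):       # hypothetical: coarser quantizer in busy blocks
    """QP offset for one block given its mean luma and luma variance."""
    offset = 0
    if mean_luma < dark_luma:
        offset += dark_bonus
    if variance > texture_variance:
        offset += texture_penalty
    return offset


print(hvs_qp_offset(mean_luma=20, variance=50.0))     # dark, flat block
print(hvs_qp_offset(mean_luma=120, variance=5000.0))  # bright, heavily textured block
print(hvs_qp_offset(mean_luma=120, variance=50.0))    # bright, flat block
```

With these made-up thresholds a block that is both dark and heavily textured ends up with an offset of zero, which is one reason the denoising in step 1 matters: it is meant to lower the variance of dark blocks so they keep the finer quantizer.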

ToS_Maverick
19th December 2007, 22:32
Dear Dark Shikari, I got a pre-christmas present for you :D

i recorded PotC1 some time ago from HDTV and can now present you, the same sample in 1080p broadcasted at about 6 MBit:
http://www.megaupload.com/de/?d=O262L4JJ

to get the same screen size and picture area, i use this script:
directshowsource("Black.Pearl.Sample HD.mkv",audio=false,fps=25)
trim(82,3703)
ColorMatrix(mode="Rec.709->Rec.601")
crop(0,140,0,-140)
lanczosresize(768,320)

the broadcast isn't perfect, but the very fine detail in the background is preserved very well, which should be good input for your AQ!

have fun :cool:

kumi
19th December 2007, 22:33
Good luck with your exams, and Haruhi :) I hope you find time to adjust the automatic thresholding for use with crf mode. I know that you said this isn't like constant bitrate mode and we shouldn't expect filesize parity, but a little more predictability would be real nice. I just finished an encode that came out massively oversized, (2.3GB vs 1.4GB without AQ). Other movies haven't been nearly as bad, though.

Well I guess I should have been doing a prediction pass from the start... stupid me :p

Happy holidays, everyone

Dark Shikari
19th December 2007, 23:02
Good luck with your exams, and Haruhi :) I hope you find time to adjust the automatic thresholding for use with crf mode. I know that you said this isn't like constant bitrate mode and we shouldn't expect filesize parity, but a little more predictability would be real nice. I just finished an encode that came out massively oversized, (2.3GB vs 1.4GB without AQ). Other movies haven't been nearly as bad, though.

Well I guess I should have been doing a prediction pass from the start... stupid me :p

Happy holidays, everyone

The other issue is that edges are really getting screwed up in some cases with my AQ; my algorithm seems to work at its absolute best when there are no sharp edges in the video, and worst when there are plenty--so perhaps I have to deal with edge masking or similar.

Sagekilla
20th December 2007, 04:48
Hmm, the latest build (0.3) is rather quirky in your transmagical adaptive sensitivity mode with crf. At strength 1 I got 1.6 mbps on a 720p resized portion of that PotC source. Increasing strength to 2 made the bitrate go down further, to 1.3 mbps! I'm decreasing crf to 16.5 to see if that'll change anything, but the rate control really gets screwed up with your transmorphmagical mode :(

kumi
20th December 2007, 05:04
Rate control gets out of whack, but it does have potential... I've been trying to match the (overall) quality achieved with aq-sensitivity 0 on a certain source... and no amount of fiddling with >0 aq-sensitivity values was able to approach it.

Dark Shikari
20th December 2007, 05:06
Rate control gets out of whack, but it does have potential... I've been trying to match the (overall) quality achieved with aq-sensitivity 0 on a certain source... and no amount of fiddling with >0 aq-sensitivity values was able to approach it.

It seems to be absurdly source-dependent--which is what AQ should try to avoid.

This winter, I'll try to find a good way to fix it if I can. It's a tough problem.

Gromozeka
20th December 2007, 08:52
When you can add AQ in official build? :)
It already now gives very big positive effect.
It would allow more better testing yours AQ many people From the different countries
Thank you

Sharktooth
20th December 2007, 16:40
AQ is not ready yet. stop asking those silly questions.
all Dark Shikari needs is testing from qualified persons, not all idiots on the planet.

ToS_Maverick
20th December 2007, 17:58
@Dark Shikari and Sharktooth:
What type of samples, genres, ... still need to be tested? It would be nice to have a list or sth, to know which samples are still open.

maybe i got some other useful stuff that could be tested.

Gromozeka
20th December 2007, 19:10
To Sharktooth
[translated from Russian] Listen here, smart guy, derivative of a decibel, so-called brain of the planet, be a bit more modest! My questions may not shine with knowledge of English or of programming algorithms, but I suspect you could use your brains a little and be somewhat more polite.
And if you are not capable of that, then shut your mouth, sheathe the horns on your head and keep quiet!
With unconcealed irritation, but with respect, Igor

Sagekilla
20th December 2007, 19:19
When you can add AQ in official build? :)
It already now gives very big positive effect.
It would allow more better testing yours AQ many people From the different countries
Thank you


Personally I don't think it's ready yet. Currently the rate control for constant rate factor is completely off when using adaptive sensitivity for AQ, so it'll take a lot of testing before AQ will be introduced to svn since it's completely failing with crf.

sp@rrow
20th December 2007, 19:26
Gromozeka
[translated from Russian] Bwahaha, that one gets a 5 :-)))

fields_g
20th December 2007, 19:35
To Sharktooth
[translated from Russian] Listen here, smart guy, derivative of a decibel, so-called brain of the planet, be a bit more modest! My questions may not shine with knowledge of English or of programming algorithms, but I suspect you could use your brains a little and be somewhat more polite.
And if you are not capable of that, then shut your mouth, sheathe the horns on your head and keep quiet!
With unconcealed irritation, but with respect, Igor

Igor (Gromozeka),
You've missed years of people asking for AQ to be added to SVN. Not only that, but you have also missed years of different methods, tweaks, and revisions. AQ (or other psy enhancements) is greatly needed, but is not currently stable.

I think the x264 community can be proud of the integrity of our SVN. Patches are accepted into SVN when the developers are comfortable with what it does and feel they can maintain the addition properly. Feel free to compile x264 yourself with any available patches you choose. In fact, I would guess that the majority of people here use builds that are NOT pure SVN.

BTW.... You might want to brush up on Rule 13.

Sharktooth
20th December 2007, 20:13
just as a reminder...
13) The official language is English. Outside the translator forum English is the only allowed language.

Gromozeka
20th December 2007, 21:59
I think the x264 community can be proud of the integrity of our SVN. Patches are accepted into SVN when the developers are comfortable with what it does and feel they can maintain the addition properly. Feel free to compile x264 yourself with any available patches you choose. In fact, I would guess that the majority of people here use builds that are NOT pure SVN.

BTW.... You might want to brush up on Rule 13.

I have understood you on 75 %. Russian language is combined even for Russian people at times. And to communicate here on it it is wrong. Thanks for all

Chainmax
22nd December 2007, 18:18
fields_g: Gromozeka's post was polite and, more importantly, was volunteering for testing. What you described is not an excuse for Sharktooth's awfully rude, insulting and snobbish retort. He should know better.


just as a reminder...
13) The official language is English. Outside the translator forum English is the only allowed language.

Just another reminder:
4) Be nice to each other and respect the moderator. Profanity and insults will not be tolerated. If you have a problem with another member turn to the respective moderator and if the moderator can't help you send a private message to Doom9

bond
22nd December 2007, 18:50
yeah everyone keep cool please...

desta
27th December 2007, 02:35
Sorry to ask probably an obvious question, but does "--aq-sensitivity 0" have to be input to use the 'automatic' thresholding? Going by --help and the info in the first post, I would've assumed that automatic is.. automatic, but going from certain other posts in this thread, it seems it does need to be input.

Dark Shikari
27th December 2007, 02:54
Sorry to ask probably an obvious question, but does "--aq-sensitivity 0" have to be input to use the 'automatic' thresholding? Going by --help and the info in the first post, I would've assumed that automatic is.. automatic, but going from certain other posts in this thread, it seems it does need to be input.

Automatic wasn't the default until version 0.3. Now it is.

desta
27th December 2007, 03:06
Ah I see. Thanks for clarifying. :)

Ranguvar
28th December 2007, 02:30
I have a video that would be very good for testing this, IMO. I'd like to do so and provide screenshots.

It's an HD trailer for a video game. Not capped by myself, provided by GameTrailers. Would it be a Rule 6?

Dark Shikari
28th December 2007, 03:26
I have a video that would be very good for testing this, IMO. I'd like to do so and provide screenshots.

It's an HD trailer for a video game. Not capped by myself, provided by GameTrailers. Would it be a Rule 6?

Nope, those are fine to use.

Sharktooth
28th December 2007, 17:41
fields_g: Gromozeka's post was polite and, more importantly, was volunteering for testing. What you described is not an excuse for Sharktooth's awfully rude, insulting and snobbish retort. He should know better.
ppl need to learn to speak only when necessary and not just blow air out of their mouths.
the fact that there were like 1 million ppl requesting AQ in the SVN doesn't mean it DESERVES to be there.
There are several test builds and I can't see the reason why incomplete and experimental code should be put in the x264 SVN.
Also, since it has been asked SO MANY TIMES (and the answer was always the same), he could SEARCH before posting (you should too).
No offense, it's just my way...
and excuse me for the OT.

bond
28th December 2007, 20:16
guys, again, keep cool and nice please. next offense, no matter which one, will get striked

ToS_Maverick
4th January 2008, 23:54
@Dark Shikari
i played around with VC-1 a bit, while an idea struck me:
would it make sense, to apply AQ only to I or I and P frames?

i think it would be interesting to see the effect of this. quality and metric-wise and if you could save some bitrate by leaving the B frames out.

Dark Shikari
5th January 2008, 00:01
@Dark Shikari
i played around with VC-1 a bit, while an idea struck me:
would it make sense, to apply AQ only to I or I and P frames?

i think it would be interesting to see the effect of this. quality and metric-wise and if you could save some bitrate by leaving the B frames out.
That's not a bad idea, since the main disadvantage of my AQ is actually the cost of encoding the qp_delta bits.
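
To put rough numbers on the qp_delta cost mentioned above: with CAVLC entropy coding, H.264 writes mb_qp_delta as a signed Exp-Golomb code, so every macroblock whose QP differs from the previous one pays extra bits. The following is a minimal sketch under stated assumptions: the function names are made up for illustration and are not x264 functions, every macroblock is assumed to code its delta, and CABAC (which x264 uses by default) codes the same syntax element with a different, context-adaptive scheme whose cost is not modelled here.

```c
#include <assert.h>

/* Bits used by an unsigned Exp-Golomb code ue(v): 2*floor(log2(v+1)) + 1. */
static int ue_bits(unsigned v)
{
    int log2_floor = 0;
    for (unsigned t = v + 1; t > 1; t >>= 1)
        log2_floor++;
    return 2 * log2_floor + 1;
}

/* Bits used by a signed Exp-Golomb code se(v).
 * H.264 maps v > 0 to 2v-1 and v <= 0 to -2v, then codes the result as ue(). */
static int se_bits(int v)
{
    unsigned mapped = v > 0 ? 2u * (unsigned)v - 1u : 2u * (unsigned)(-v);
    return ue_bits(mapped);
}

/* Hypothetical helper: total CAVLC bits spent on mb_qp_delta for a run of
 * per-macroblock QPs, assuming every macroblock codes its delta. */
static int qp_delta_bits(const int *qp, int n, int start_qp)
{
    assert(n >= 0);
    int bits = 0, last = start_qp;
    for (int i = 0; i < n; i++) {
        bits += se_bits(qp[i] - last);  /* delta against the previous MB's QP */
        last = qp[i];
    }
    return bits;
}
```

Under these assumptions a constant QP costs 1 bit per macroblock, while a QP that flips by 1 at every macroblock costs 3 bits each time. That is consistent with the patch posted later in this thread reusing the previous macroblock's QP when the new one differs from it by exactly 1.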

ToS_Maverick
5th January 2008, 00:06
great that i could be helpful!

well since VC1 is using things from AVC, why not have a look at what they are doing.

at the MS VC1 codec you can set this. i got the idea because the background grain started to "update" with every I frame. i could not test the I/P setting since AVS2ASF seems bit buggy.

mahsah
5th January 2008, 18:46
Any idea what settings (if any) for AQ I could use to retain the dithering added by Gradfun2db?

Sagekilla
5th January 2008, 22:15
Any idea what settings (if any) for AQ I could use to retain the dithering added by Gradfun2db?

Dithering is actually quite "complex" since it's not a uniform, flat color like the sky would be. It bunches together a number of different colors in a set pattern to make it LOOK like another color, so it'll end up getting mushed because it looks like grain or noise.

AQ looks to increase the bits allocated towards flat areas so that they aren't blocky. Dithering looks flat to the human eye but not to an algorithm, so, as I just explained above, it won't work well.

Dark Shikari
5th January 2008, 22:25
AQ looks to increase the bits allocated towards flat areas so that they aren't blocky. Dithering looks flat to the human eye but not to an algorithm, so, as I just explained above, it won't work well.
My algorithm will most definitely consider dithered blocks to be very very close to flat, and as a result will decrease their quantizer (though not necessarily by enough to keep the dither accurately).

Sagekilla
5th January 2008, 22:33
My algorithm will most definitely consider dithered blocks to be very very close to flat, and as a result will decrease their quantizer (though not necessarily by enough to keep the dither accurately).

Oho, that's interesting. Is this something unique to your latest version?

Dark Shikari
5th January 2008, 22:42
Oho, that's interesting. Is this something unique to your latest version?
No, it's inherent in the concept--dithered blocks will have extremely low variance, and so will get the strongest AQ applied.
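
As a rough illustration of why dither reads as "nearly flat": the variance measure is the block's AC energy, i.e. the sum of squared pixel values minus the squared pixel sum divided by the pixel count. The sketch below is plain C written for this explanation (the function names and the checkerboard test pattern are assumptions, not x264 code); the patch posted later in this thread computes the same kind of quantity using x264's SAD/SSD primitives against an all-zero block.

```c
#include <assert.h>
#include <stdint.h>

/* AC energy of an n-pixel block: sum(x^2) - (sum(x))^2 / n,
 * which is n times the variance (up to integer rounding). */
static int64_t ac_energy(const uint8_t *pix, int n)
{
    assert(n > 0);
    int64_t sum = 0, sum_sq = 0;
    for (int i = 0; i < n; i++) {
        sum    += pix[i];
        sum_sq += (int64_t)pix[i] * pix[i];
    }
    return sum_sq - (sum * sum) / n;
}

/* Illustrative helper: fill a 16x16 block with a checkerboard of lo and hi. */
static void fill_checker(uint8_t *pix, uint8_t lo, uint8_t hi)
{
    for (int y = 0; y < 16; y++)
        for (int x = 0; x < 16; x++)
            pix[y * 16 + x] = ((x + y) & 1) ? hi : lo;
}
```

A perfectly flat block has zero energy, a one-level dither (128/129 checkerboard) has an energy of 64, and a high-contrast checkerboard (64/192) has an energy of 1048576. On a logarithmic scale the dithered block therefore sits right next to the flat one, far below genuinely textured blocks.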

burfadel
6th January 2008, 14:31
Will the latest changes made to x264 with version 717 affect the patch? is it possible for a new build :)?!

Dark Shikari
6th January 2008, 15:57
Will the latest changes made to x264 with version 717 affect the patch? is it possible for a new build :)?!
717 just looks like an ESA improvement.

burfadel
6th January 2008, 16:10
ah ok! I see that now :) wouldn't a 1.3x increase (30 percent) in ESA speed bring it reasonably close to the speed of Multi hex?

akupenguin
6th January 2008, 16:29
ah ok! I see that now :) wouldn't a 1.3x increase (30 percent) in ESA speed bring it reasonably close to the speed of Multi hex?
No. Maybe if you also pull in UMH's early termination and range adaption, so that only the multi-hexagon part is replaced by ESA.

Dark Shikari
6th January 2008, 16:59
No. Maybe if you also pull in UMH's early termination and range adaption, so that only the multi-hexagon part is replaced by ESA.
While we're at it, how much does this speed up ESA SATD?

Sagekilla
6th January 2008, 19:43
While we're at it, how much does this speed up ESA SATD?

Pardon my slightly off topic question but isn't the current esa actually a completely different algo? I remember you referring to it as "SEA," what exactly does that stand for?


On a side note, nice to hear that esa got a 30% boost in speed. I've been using esa (along with a number of other insane settings) on relatively short clips and I've been pleased with the (slight, but noticeable) improvement over umh.

Dark Shikari
6th January 2008, 19:44
Pardon my slightly off topic question but isn't the current esa actually a completely different algo? I remember you referring to it as "SEA," what exactly does that stand for?

On a side note, nice to hear that esa got a 30% boost in speed. I've been using esa (along with a number of other insane settings) on relatively short clips and I've been pleased with the (slight, but noticeable) improvement over umh.
ESA SATD, at least Aku's version, uses SEA.

SEA is the Successive Elimination Algorithm, and it uses a layered image representation to losslessly eliminate candidates (IIRC).
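
For readers unfamiliar with the idea: successive elimination relies on the bound |sum(cur) - sum(ref)| <= SAD(cur, ref), so any candidate whose block sums differ by at least the best SAD found so far can be skipped without changing the search result. Below is a toy one-dimensional sketch of that bound on top of a plain exhaustive search. All names are illustrative, and x264's real implementation is organised differently (it works on 2-D blocks, and a practical version would precompute the candidate sums rather than recompute them per candidate as this sketch does).

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Sum of n samples. */
static int block_sum(const uint8_t *p, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += p[i];
    return s;
}

/* Sum of absolute differences between two n-sample blocks. */
static int block_sad(const uint8_t *a, const uint8_t *b, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++)
        s += abs(a[i] - b[i]);
    return s;
}

/* Search offsets [0, range) in ref for the block that best matches cur.
 * With use_sea set, a candidate is skipped when the lower bound
 * |sum(cur) - sum(candidate)| is already >= the best SAD so far.
 * *sads_done reports how many full SADs were actually computed. */
static int search(const uint8_t *cur, const uint8_t *ref, int n, int range,
                  int use_sea, int *sads_done)
{
    assert(n > 0 && range > 0);
    int cur_sum = block_sum(cur, n);
    int best = INT_MAX, best_off = 0;
    *sads_done = 0;
    for (int off = 0; off < range; off++) {
        if (use_sea && abs(cur_sum - block_sum(ref + off, n)) >= best)
            continue;  /* cannot beat the current best, so skip the SAD */
        int sad = block_sad(cur, ref + off, n);
        (*sads_done)++;
        if (sad < best) {
            best = sad;
            best_off = off;
        }
    }
    return best_off;
}
```

Because a skipped candidate can never have a strictly smaller SAD than the current best, both modes return the same offset; only the number of full SAD computations differs.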

Sagekilla
6th January 2008, 20:04
Also, how compatible are the various speed and quality patches (fast ref, new aq) with rev 715? I'm looking to compile my own build with fast ref search, the older AQ algo (new one is dodgy with crf), and get the nice esa speed boost from the latest patch.

ToS_Maverick
10th January 2008, 00:39
just for the record

the new AQ at
--aq-strength 1.0 --aq-sensitivity 17

has the same size as the old AQ @
--aq-strength 0.6 --aq-sensitivity 10 or --aq-strength 0.9

with all CRFs (tested on 18, 20 and 22)
with a higher visual quality.

JvA_
11th January 2008, 12:12
Dark Shikari, could you please make a diff against the latest SVN checkout, if possible?

Apart from knowledge of C/C++, how much mathematical knowledge is required to start hacking on x264? I have no previous knowledge of video compression, but I've had some courses on the Fourier transform and similar math. I've seen that JPEG2000 uses a lot of the math I've studied so far, so if MPEG4 has similarities I suppose I would understand a lot ;)

So, what do you recommend? Start digging the source code right away, or are there documents that I should read first that will give me a better "hands on"?

Dark Shikari
11th January 2008, 13:53
Dark Shikari, could you please make a diff against the latest SVN checkout, if possible?
I don't think there should be any incompatibilities between my old patch and the current SVN. If there are, I'll fix them.

Apart from knowledge of C/C++, how much mathematical knowledge is required to start hacking on x264? I have no previous knowledge of video compression, but I've had some courses on the Fourier transform and similar math. I've seen that JPEG2000 uses a lot of the math I've studied so far, so if MPEG4 has similarities I suppose I would understand a lot ;)
You don't need much math knowledge. There are dozens of places in x264 that you can start hacking at without even knowing more than a small amount about video compression, since one can easily treat every other part of the code as a "black box" and ignore how it works. This is why I started on me.c and was able to do a few useful things even as a clueless newbie back in the day.

So, what do you recommend? Start digging the source code right away, or are there documents that I should read first that will give me a better "hands on"?
Dig through the source code while asking every question you can think of in #x264dev on Freenode. You'll learn faster than you thought you ever could.

JvA_
11th January 2008, 17:58
Thanks for your reply!

I tried to patch the source code with your web-published diff using patch -Np0 -i ../path/to/the/file.txt from within the checked-out x264 directory. Got a reject on every line it tried to add. If you have time, please change the diff so it matches. Can't wait to try this AQ :)

Dark Shikari
11th January 2008, 18:14
Thanks for your reply!

I tried to patch the source code with your web-published diff using patch -Np0 -i ../path/to/the/file.txt from within the checked-out x264 directory. Got a reject on every line it tried to add. If you have time, please change the diff so it matches. Can't wait to try this AQ :)
Are you patching the SVN code, or the web code? The code on mirror05 is already patched with the other AQ, which would result in the errors.

JvA_
11th January 2008, 18:41
I'm patching the code I've fetched from:
svn co svn://svn.videolan.org/x264/trunk x264

Dark Shikari
14th January 2008, 22:28
New AQ is out. Massive changes.

1. Totally rewritten AQ. Same basic concept, but now uses a logarithmic scale instead of a hackneyed exponential one.
2. For B-frames, uses a tricky bit of lambda-changing instead of QP changing; this requires absolutely no bits for QP-deltas!
3. Totally rewritten, far faster automatic sensitivity. Respects bitrate in CRF mode better also.

Will post it in the original post soon.

Sagekilla
14th January 2008, 22:41
Very nice, I'll go try this out and see how well it behaves on my encoding. I've been holding out on a number of movies because I couldn't get decent flat detailed areas with good quality.

akupenguin
14th January 2008, 22:45
Pardon my slightly off topic question but isn't the current esa actually a completely different algo? I remember you referring to it as "SEA," what exactly does that stand for?
Technically ESA and SEA are different algorithms. But since they produce the same result and differ only in implementation details, I saw no reason to rename the commandline option when I switched algorithm in r388.

mitsubishi
14th January 2008, 23:12
The link for 0.4 doesn't seem to work: "Invalid Quickkey. This error has been forwarded to MediaFire's development team."

Dark Shikari
14th January 2008, 23:14
The link for 0.4 doesn't seem to work:
Fixed.

Atak_Snajpera
14th January 2008, 23:35
Thanks Dark Shikari !
Finally I'm fully convinced to new AQ algorithm :)
No more dancing blocks on blue sky :)

BTW strength 1.0 gives me the best result. I've noticed also that sometimes file size is even lower than 0.5 in CQ mode.

MasterNobody
15th January 2008, 00:29
Dark Shikari
Maybe you could also upload a source diff against the current x264 trunk so people can compile their own builds (for example, I want to make an experimental x264vfw version with this new AQ patch)

Dark Shikari
15th January 2008, 02:51
Dark Shikari
Maybe you could also upload a source diff against the current x264 trunk so people can compile their own builds (for example, I want to make an experimental x264vfw version with this new AQ patch)
Patch version 0.41 (http://pastebin.com/f7fb3770d). 0.41 is just mainly documentation/code cleanup/code comments.

ToS_Maverick
15th January 2008, 10:34
Dark Shikari, i got bad news for you:

just did a quick test with these settings:

--crf 20.0 --level 3 --keyint 100 --min-keyint 1 --ref 3 --mixed-refs --no-fast-pskip --bframes 2 --b-pyramid --bime --weightb --filter -2,-2 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-bufsize 1835 --vbv-maxrate 10000 --threads auto --thread-input --progress --no-dct-decimate --no-psnr --no-ssim --output "output" "input" --aq-strength 1.0
and got the following result
http://img444.imageshack.us/img444/2190/newaqpy2.th.png (http://img444.imageshack.us/my.php?image=newaqpy2.png)

for comparison your 0.3 algo:
http://img256.imageshack.us/img256/7029/aqck8.th.png (http://img256.imageshack.us/my.php?image=aqck8.png)

otherwise, your new AQ looks very promising, but i got the feeling that the grain "stutters" a little bit? like it has only 1/2 or 1/4th of the fps.

Dark Shikari
15th January 2008, 16:19
Dark Shikari, i got bad news for you:

just did a quick test with these settings:

--crf 20.0 --level 3 --keyint 100 --min-keyint 1 --ref 3 --mixed-refs --no-fast-pskip --bframes 2 --b-pyramid --bime --weightb --filter -2,-2 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-bufsize 1835 --vbv-maxrate 10000 --threads auto --thread-input --progress --no-dct-decimate --no-psnr --no-ssim --output "output" "input" --aq-strength 1.0
and got the following result
http://img444.imageshack.us/img444/2190/newaqpy2.th.png (http://img444.imageshack.us/my.php?image=newaqpy2.png)

for comparison your 0.3 algo:
http://img256.imageshack.us/img256/7029/aqck8.th.png (http://img256.imageshack.us/my.php?image=aqck8.png)

otherwise, your new AQ looks very promising, but i got the feeling that the grain "stutters" a little bit? like it has only 1/2 or 1/4th of the fps.
Whoa... something went wrong there... :p

It's likely the "stuttering" problem is due to the lambda trick in B-frames, since that really doesn't retain grain very well. I can add a commandline option to either do AQ on all frames, AQ on non-B-frames and lambda on B-frames, or AQ on just P-frames.

Edit: Tested your command on the exact same source and didn't get the problem you got...

DeathTheSheep
15th January 2008, 16:43
Is this AQ patch applicable on top of hadamard, me-prepass, etc for r720?

Dark Shikari
15th January 2008, 16:50
Is this AQ patch applicable on top of hadamard, me-prepass, etc for r720?
fpel-cmp seems to be incompatible as of r717, so you'll have to ask Pengvado to fix that.

ME-prepass should continue to be fine. Both should have no problem with AQ, and AQ should be fine with RDRC also.

DeathTheSheep
15th January 2008, 17:28
RDRC? Rate distortion rate control? I must have missed this...!

Anyway, I guess I'm backtracking to r680 until the patches gain some steam again. I'm guessing it was the ESA speedup that caused the incompatibilities, and r680 is pretty much the same speed/quality-wise for me without it.

Will this AQ work with 680 on top of the other patches?

PS: I'm not very compelling when it comes to asking people to do things, so a new fpel-cmp feels like a pipe dream. :)

Dark Shikari
15th January 2008, 17:30
RDRC? Rate distortion rate control? I must have missed this...!

Anyway, I guess I'm backtracking to r680 until the patches gain some steam again. I'm guessing it was the ESA speedup that caused the incompatibilities, and r680 is pretty much the same speed/quality-wise for me without it.

Will this AQ work with 680 on top of the other patches?

PS: I'm not very compelling when it comes to asking people to do things, so a new fpel-cmp feels like a pipe dream. :)
AQ should work fine on r680 as far as I can think.

fpel-cmp just has to be updated--it doesn't have to be rewritten.

RDRC... come on #x264dev, and learn all about the potential 1db+ PSNR gain :p

ToS_Maverick
15th January 2008, 20:44
ok, now i reencoded the sample and got 3!!! different file-sizes?!

please reencode the sample and look at the file sizes, i hope you get the same result...

obviously there is something strange happening here

Dark Shikari
15th January 2008, 21:06
ok, now i reencoded the sample and got 3!!! different file-sizes?!

please reencode the sample and look at the file sizes, i hope you get the same result...

obviously there is something strange happening here
I just ran it three times... bitwise equivalent result. Something must be wrong with your machine.

Atak_Snajpera
15th January 2008, 21:08
I have no problems either. Core2Duo 1.86GHz overclocked to 2.8GHz :)

DeathTheSheep
15th January 2008, 21:52
Overclock? Strange results? Data processing corruption? Hmm, something smells...like burning CPU!! Is your system stable, ToS_Maverick?

Oh and DS, hadamard doesn't work with r716 either. Compiles but crashes (and burns). :)

I'm wondering if r681 is a good choice, now that I look carefully, since it seems to integrate your improved subme7, correct? (As you can see, I want to test this with some insane options. Adaptive quantization goes best with static insanity, does it not?).

Atak_Snajpera
15th January 2008, 21:56
Download Go-orthos and check if your cpu is stable (run it for at least a few hours)

G_M_C
15th January 2008, 22:03
Whoa... something went wrong there... :p

Its likely the "stuttering" problem is due to the lambda trick in B-frames, since that really doesn't retain grain very well. I can add a commandline to either do AQ on all frames, AQ on non-B-frames and lambda on B-frames, or AQ on just P-frames.

Edit: Tested your command on the exact same source and didn't get the problem you got...

The grainpulsing might be less when you preprocess the source with grainoptimizer (http://forum.doom9.org/showthread.php?p=1052870#post1052870).

ToS_Maverick
15th January 2008, 22:56
thank you all for your tips!

if my pc were unstable, many other programs would crash and i would get freezes, BSODs and whatever, but my machine is rock solid. i even underclocked it now to test.

i did a lot of tests now, but could not always get bit identical results.

what i tested:
my machine (C2D 6600@2.33GHz)
AQ 0.4 build - 4 of 6 identical
AQ 0.4 build no AQ - none
rev 720 std - 2 of 3 identical

father's machine (Athlon X2 3800 all std)
AQ 0.4 build - none
rev 720 std - none

whether the files come out identical seems to be related to the current system load. when nothing else is running during the encoding process, the files have a higher chance of being identical. it's very strange that the athlon doesn't produce identical files. shouldn't a program deliver correct results, no matter what runs nearby?

maybe someone else has a better explanation for this? and sorry for being a bit off topic.

akupenguin
15th January 2008, 23:03
Before trying to debug nondeterministic r680 (if that's what you're doing), read r713 (http://trac.videolan.org/x264/changeset/713). That wouldn't cause colored snow like ToS_Maverick saw, but it was a bug.

Dark Shikari
16th January 2008, 01:33
Two bugs have been discovered.

1. The modified lambda system doesn't handle chroma QPs properly, which causes breakage at high QPs. This has been fixed in my latest internal build. It will be uploaded soon. In the meantime, here's the updated patch (http://pastebin.com/f499214d3).

2. The deadzone lambda changing is somewhat broken. Using trellis is recommended until this is fixed.

Razorholt
16th January 2008, 01:46
trellis 1 or 2? Doesn't matter?

Thanks
- Dan

Dark Shikari
16th January 2008, 01:50
trellis 1 or 2? Doesn't matter?

Thanks
- Dan
2 will completely eliminate the use of deadzone, ensuring no problems ever. 1 is probably sufficient.

DeathTheSheep
16th January 2008, 01:52
Aku: Then could you update your SATD patch for 720? It would make testing this more beneficial in the extreme scenarios.

Also, Dark Shikari: How about a refresh of that me-prepass? Last I checked it was thrown all over the place (b0rked diffs, missing lines, updates, etc)?

'Twould truly be much appreciated on my part. :)

desta
16th January 2008, 01:59
So using this AQ + deadzones is a no go (for now)?

Dark Shikari
16th January 2008, 02:00
Use this new patch:
Index: encoder/encoder.c
===================================================================
--- encoder/encoder.c (revision 720)
+++ encoder/encoder.c (working copy)
@@ -374,7 +374,7 @@
h->param.analyse.i_direct_mv_pred = X264_DIRECT_PRED_SPATIAL;
}
}
-
+
if( h->param.rc.i_rc_method < 0 || h->param.rc.i_rc_method > 2 )
{
x264_log( h, X264_LOG_ERROR, "no ratecontrol method specified\n" );
@@ -472,6 +472,8 @@
if( !h->param.b_cabac )
h->param.analyse.i_trellis = 0;
h->param.analyse.i_trellis = x264_clip3( h->param.analyse.i_trellis, 0, 2 );
+ if( h->param.analyse.b_aq && h->param.analyse.f_aq_strength <= 0 )
+ h->param.analyse.b_aq = 0;
h->param.analyse.i_noise_reduction = x264_clip3( h->param.analyse.i_noise_reduction, 0, 1<<16 );

{
@@ -1020,6 +1022,32 @@
x264_macroblock_slice_init( h );
}

+//Finds the total AC energy of the block in all planes.
+static int ac_energy_mb(x264_t *h)
+{
+ DECLARE_ALIGNED( static uint8_t, zero[16], 16 );
+ int sad = h->pixf.sad[PIXEL_16x16](zero,0,h->mb.pic.p_fenc[0],FENC_STRIDE) >> 4;
+ int ssd = h->pixf.ssd[PIXEL_16x16](zero,0,h->mb.pic.p_fenc[0],FENC_STRIDE);
+ int totalSSD = ssd - (sad * sad);
+ sad = h->pixf.sad[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[1],FENC_STRIDE) >> 3;
+ ssd = h->pixf.ssd[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[1],FENC_STRIDE);
+ totalSSD += ssd - (sad * sad);
+ sad = h->pixf.sad[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[2],FENC_STRIDE) >> 3;
+ ssd = h->pixf.ssd[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[2],FENC_STRIDE);
+ totalSSD += ssd - (sad * sad);
+ return totalSSD;
+}
+
+//Find the total SATD score of a block. Represents the block's overall complexity (bit cost) for intra encoding.
+static int satd_mb(x264_t *h)
+{
+ DECLARE_ALIGNED( static uint8_t, zero[16], 16 );
+ int totalSATD = h->pixf.satd[PIXEL_16x16](zero,0,h->mb.pic.p_fenc[0],FENC_STRIDE);
+ totalSATD += h->pixf.satd[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[1],FENC_STRIDE);
+ totalSATD += h->pixf.satd[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[2],FENC_STRIDE);
+ return totalSATD;
+}
+
static void x264_slice_write( x264_t *h )
{
int i_skip;
@@ -1045,7 +1073,51 @@
}
h->mb.i_last_qp = h->sh.i_qp;
h->mb.i_last_dqp = 0;
-
+ x264_cpu_restore(h->param.cpu);
+ /* Adaptive AQ sensitivity algorithm. */
+ if(h->param.analyse.f_aq_sensitivity == 0 && h->param.analyse.f_aq_strength != 0)
+ {
+ double total = 0;
+ double n = 0;
+ /* FIXME: Easier way to iterate over MBs? Do we need to do the full cache_load? */
+ /* FIXME: Some of the SATDs might be already calculated elsewhere (ratecontrol?). Can we reuse them? */
+ /* FIXME: Store the data, then do the logs after, to avoid the cpu_restores every single cycle? */
+ /* FIXME: Is chroma SATD necessary? */
+ for( mb_xy = h->sh.i_first_mb; mb_xy < h->sh.i_last_mb; )
+ {
+ const int i_mb_y = mb_xy / h->sps->i_mb_width;
+ const int i_mb_x = mb_xy % h->sps->i_mb_width;
+ x264_macroblock_cache_load( h, i_mb_x, i_mb_y );
+ int energy = ac_energy_mb(h);
+ x264_cpu_restore(h->param.cpu);
+ /* Weight the energy value by the SATD value of the MB. This represents the fact that
+ the more complex blocks in a frame should be weighted more when calculating the optimal sensitivity.
+ This also helps diminish the negative effect of large numbers of simple blocks in a frame, such as in the case
+ of a letterboxed film. */
+ if(energy != 0)
+ {
+ int satd = satd_mb(h);
+ x264_cpu_restore(h->param.cpu);
+ total += log(energy) * satd;
+ n += satd;
+ }
+ if( h->sh.b_mbaff )
+ {
+ if( (i_mb_y&1) && i_mb_x == h->sps->i_mb_width - 1 )
+ mb_xy++;
+ else if( i_mb_y&1 )
+ mb_xy += 1 - h->sps->i_mb_width;
+ else
+ mb_xy += h->sps->i_mb_width;
+ }
+ else
+ mb_xy++;
+ }
+ x264_cpu_restore(h->param.cpu);
+ /* Calculate and store the threshold. */
+ if(n == 0) h->aq_threshold = 100000;
+ else h->aq_threshold = expf(total / n);
+ }
for( mb_xy = h->sh.i_first_mb, i_skip = 0; mb_xy < h->sh.i_last_mb; )
{
const int i_mb_y = mb_xy / h->sps->i_mb_width;
Index: encoder/analyse.c
===================================================================
--- encoder/analyse.c (revision 720)
+++ encoder/analyse.c (working copy)
@@ -29,6 +29,7 @@
#endif

#include "common/common.h"
+#include "common/cpu.h"
#include "macroblock.h"
#include "me.h"
#include "ratecontrol.h"
@@ -2037,8 +2038,68 @@
}
}

+//Finds the total AC energy of the macroblock in all planes.
+static int ac_energy_mb(x264_t *h)
+{
+ DECLARE_ALIGNED( static uint8_t, zero[16], 16 );
+ int sad = h->pixf.sad[PIXEL_16x16](zero,0,h->mb.pic.p_fenc[0],FENC_STRIDE) >> 4;
+ int ssd = h->pixf.ssd[PIXEL_16x16](zero,0,h->mb.pic.p_fenc[0],FENC_STRIDE);
+ int totalSSD = ssd - (sad * sad);
+ sad = h->pixf.sad[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[1],FENC_STRIDE) >> 3;
+ ssd = h->pixf.ssd[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[1],FENC_STRIDE);
+ totalSSD += ssd - (sad * sad);
+ sad = h->pixf.sad[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[2],FENC_STRIDE) >> 3;
+ ssd = h->pixf.ssd[PIXEL_8x8](zero,0,h->mb.pic.p_fenc[2],FENC_STRIDE);
+ totalSSD += ssd - (sad * sad);
+ return totalSSD;
+}

/*****************************************************************************
+* x264_adaptive_quant:
+ * adjust macroblock QP based on variance (AC energy) of the MB.
+ * high variance = higher QP
+ * low variance = lower QP
+ * This generally increases SSIM and lowers PSNR.
+ * To save bits in B-frames, adaptive lambda is used instead of adaptive quantization.
+*****************************************************************************/
+void x264_adaptive_quant( x264_t *h, x264_mb_analysis_t *a )
+{
+ int qp = h->mb.i_qp;
+ int ac_energy = ac_energy_mb(h);
+ x264_cpu_restore(h->param.cpu);
+ float result = ac_energy;
+ float threshold;
+ /* In the case of adaptive AQ sensitivity, grab the value from the frame pre-process. Otherwise, calculate
+ the AQ sensitivity value for the current frame. */
+ if(h->param.analyse.f_aq_sensitivity == 0)
+ threshold = h->aq_threshold;
+ else threshold = powf(h->param.analyse.f_aq_sensitivity,4)/2;
+ /* Adjust the QP based on the AC energy of the macroblock. */
+ int qp_adj = -3.0 * h->param.analyse.f_aq_strength * log(result / threshold);
+ qp_adj = x264_clip3(qp_adj,-5*h->param.analyse.f_aq_strength,5*h->param.analyse.f_aq_strength);
+ int new_qp = x264_clip3(qp - qp_adj,h->param.rc.i_qp_min,h->param.rc.i_qp_max);
+ /* Change the lambda values. */
+ /* If the QP of this MB is within 1 of the previous MB, code the same QP as the previous MB, to lower the bit
+ cost of the qp_delta. */
+ if(abs(new_qp - h->mb.i_last_qp) == 1) new_qp = h->mb.i_last_qp;
+ a->i_lambda = i_qp0_cost_table[new_qp];
+ a->i_lambda2 = i_qp0_cost_table[new_qp];
+ //h->i_mod_qp = new_qp;
+ //h->i_mod_chroma_qp = i_chroma_qp_table[x264_clip3( new_qp + h->pps->i_chroma_qp_index_offset, 0, 51 )];
+ /* Adaptive quantization only applies to I/P frames. Applying it to B-frames generally results in a lot of
+ unnecessary bits spent on delta_qp. Instead, the lambdas are changed.
+ FIXME: Choose whether to use adaptive lambda or adaptive quantization on a per-block basis?
+ FIXME: Choose which to use based on bit cost?
+ FIXME: Is this optimal? */
+ //if(h->sh.i_type != SLICE_TYPE_B )
+ {
+ /* Save the final QP and update the chroma QP. */
+ h->mb.i_qp = a->i_qp = new_qp;
+ h->mb.i_chroma_qp = i_chroma_qp_table[x264_clip3( h->mb.i_qp + h->pps->i_chroma_qp_index_offset, 0, 51 )];
+ }
+}
+
+/*****************************************************************************
* x264_macroblock_analyse:
*****************************************************************************/
void x264_macroblock_analyse( x264_t *h )
@@ -2046,9 +2107,19 @@
x264_mb_analysis_t analysis;
int i_cost = COST_MAX;
int i;
+
+ h->mb.i_qp = x264_ratecontrol_qp( h );
+
+ if( h->param.analyse.b_aq )
+ x264_adaptive_quant( h, &analysis );
+ //else
+ //{
+ // h->i_mod_qp = h->mb.i_qp;
+ // h->i_mod_chroma_qp = i_chroma_qp_table[x264_clip3( h->mb.i_qp + h->pps->i_chroma_qp_index_offset, 0, 51 )];
+ //}

/* init analysis */
- x264_mb_analyse_init( h, &analysis, x264_ratecontrol_qp( h ) );
+ x264_mb_analyse_init( h, &analysis, h->mb.i_qp );

/*--------------------------- Do the analysis ---------------------------*/
if( h->sh.i_type == SLICE_TYPE_I )
Index: x264.c
===================================================================
--- x264.c (revision 720)
+++ x264.c (working copy)
@@ -243,6 +243,14 @@
" - 2: enabled on all mode decisions\n", defaults->analyse.i_trellis );
H0( " --no-fast-pskip Disables early SKIP detection on P-frames\n" );
H0( " --no-dct-decimate Disables coefficient thresholding on P-frames\n" );
+ H0( " --aq-strength <float> Amount to adjust QP/lambda per MB [%.1f]\n"
+ " 0.0: no AQ\n"
+ " 0.7: medium AQ\n"
+ " 1.4: strong AQ\n", defaults->analyse.f_aq_strength );
+ H0( " --aq-sensitivity <float> \"Center\" of AQ curve. [%.1f]\n"
+ " 0: automatic sensitivity (recommended)\n"
+ " 5: almost all QPs are raised\n"
+ " 35: almost all QPs are lowered\n", defaults->analyse.f_aq_sensitivity );
H0( " --nr <integer> Noise reduction [%d]\n", defaults->analyse.i_noise_reduction );
H1( "\n" );
H1( " --deadzone-inter <int> Set the size of the inter luma quantization deadzone [%d]\n", defaults->analyse.i_luma_deadzone[0] );
@@ -406,6 +414,8 @@
{ "trellis", required_argument, NULL, 't' },
{ "no-fast-pskip", no_argument, NULL, 0 },
{ "no-dct-decimate", no_argument, NULL, 0 },
+ { "aq-strength", required_argument, NULL, 0 },
+ { "aq-sensitivity", required_argument, NULL, 0 },
{ "deadzone-inter", required_argument, NULL, '0' },
{ "deadzone-intra", required_argument, NULL, '0' },
{ "level", required_argument, NULL, 0 },
Index: common/common.c
===================================================================
--- common/common.c (revision 720)
+++ common/common.c (working copy)
@@ -123,6 +123,9 @@
param->analyse.i_chroma_qp_offset = 0;
param->analyse.b_fast_pskip = 1;
param->analyse.b_dct_decimate = 1;
+ param->analyse.b_aq = 0;
+ param->analyse.f_aq_strength = 0.0;
+ param->analyse.f_aq_sensitivity = 0;
param->analyse.i_luma_deadzone[0] = 21;
param->analyse.i_luma_deadzone[1] = 11;
param->analyse.b_psnr = 1;
@@ -455,6 +458,13 @@
p->analyse.b_fast_pskip = atobool(value);
OPT("dct-decimate")
p->analyse.b_dct_decimate = atobool(value);
+ OPT("aq-strength")
+ {
+ p->analyse.f_aq_strength = atof(value);
+ p->analyse.b_aq = (p->analyse.f_aq_strength > 0.0);
+ }
+ OPT("aq-sensitivity")
+ p->analyse.f_aq_sensitivity = atof(value);
OPT("deadzone-inter")
p->analyse.i_luma_deadzone[0] = atoi(value);
OPT("deadzone-intra")
@@ -939,6 +949,9 @@
s += sprintf( s, " zones" );
}

+ if( p->analyse.b_aq )
+ s += sprintf( s, " aq=1:%.1f:%.1f", p->analyse.f_aq_strength, p->analyse.f_aq_sensitivity );
+
return buf;
}

Index: common/common.h
===================================================================
--- common/common.h (revision 720)
+++ common/common.h (working copy)
@@ -232,6 +232,10 @@

struct x264_t
{
+ float aq_threshold;
+ //int i_mod_qp;
+ //int i_mod_chroma_qp;
+
/* encoder parameters */
x264_param_t param;

Index: x264.h
===================================================================
--- x264.h (revision 720)
+++ x264.h (working copy)
@@ -230,6 +230,9 @@
int i_trellis; /* trellis RD quantization */
int b_fast_pskip; /* early SKIP detection on P-frames */
int b_dct_decimate; /* transform coefficient thresholding on P-frames */
+ int b_aq; /* psy adaptive QP */
+ float f_aq_strength;
+ float f_aq_sensitivity;
int i_noise_reduction; /* adaptive pseudo-deadzone */

/* the deadzone size that will be used in luma quantization */
This should compile (I haven't tried it) and should have absolutely no potential for bugs/breakage. It just entirely disables the lambda-based AQ for the time being, since it's only on B-frames and not really a big deal.

Dark Shikari
16th January 2008, 04:10
New executable uploaded. Removing the lambda AQ didn't seem to have much of a negative or positive effect for now, but will fix most of the bugs with this version. I may add it back in later.

Dark Shikari
16th January 2008, 05:55
I have posted results in the OP of a test of the new AQ. Yes, that's right, roughly 21.5% bitrate-adjusted SSIM improvement. :eek:

acrespo
16th January 2008, 06:32
I didn't use version 0.4 but v0.42 crashes in Vista x64. When I execute I receive a message from windows that the pthreadGC2.dll is missing. I returned to version 0.3 and don't have problems.

Dark Shikari
16th January 2008, 06:37
I didn't use version 0.4 but v0.42 crashes in Vista x64. When I execute I receive a message from windows that the pthreadGC2.dll is missing. I returned to version 0.3 and don't have problems.

That's because... you need pthreadGC2.dll?

You can get it here (http://mirror05.x264.nl/Dark/force.php?file=./pthreadGC2.dll).

ToS_Maverick
16th January 2008, 09:45
you somehow forgot to replace the link, it still points to 0.4
0.42 is there but not labeled as .exe:
http://mirror05.x264.nl/Dark/force.php?file=./x264_Experimental_AQ_0.42

CruNcher
16th January 2008, 09:53
@Dark Shikari
Great work, you really solved parts of the Banding Problem with this, im amazed by the results :)

Without Dark Shikaris Magic AQ
http://rapidshare.com/files/84197179/testseq-pearl-nodarkaq.mkv

With Dark Shikaris Magic AQ (HVS Quality is greatly improved)
http://rapidshare.com/files/84197485/testseq-pearl-darkaq.mkv

Get it now, it makes your "darkest" dreams come true ;D

I think I tested an older version of this, though (I found the patch in your folder), with --aq-strength 0.9. Below that it didn't look good enough, and --aq-strength 1.0 caused a problem in this test cut (going to retest with the new patch) :)

And yes, I'm as crazy as you were with Vendetta: 3 Mbit for 1080p ;)

Here is the bug with --aq-strength 1.0 (it also happens with --aq-strength 0.9 if --no-dct-decimate is used). Excuse me if this problem has already been reported; I haven't read the whole thread yet (and maybe it doesn't even happen with the new patch anymore).
http://rapidshare.com/files/84200572/testseq-pearl-darkaq-bug.mkv (look @ the top left when the calendar pages are turned)

burfadel
16th January 2008, 11:18
This patch definitely makes a better quality image, without sacrificing bitrate, at least in crf mode :) It's to the point where I can say it should be enabled by default, and with a strength of 1.0 (100 percent? :) ) by default, only because it works so well, and unlike the old patch such a high AQ seems to be better. Except, of course, for the bug that CruNcher has mentioned, which I haven't seen! Hopefully this is a real step towards including it in the svn!

acrespo
16th January 2008, 12:22
That's because... you need pthreadGC2.dll?

You can get it here (http://mirror05.x264.nl/Dark/force.php?file=./pthreadGC2.dll).

Why does the new version need this file? I noticed that the file size decreased too. Is that because you removed this library from inside the .EXE?

Atak_Snajpera
16th January 2008, 13:28
(look @ the top left when the calendar pages are turned)

Could you make a screenshot and mark spot because I can't see it.
BTW 3MBps for 1440x1080... You are crazy :)

Good advice use mediafire.com instead of Rapidshare.
(I've just reached the limit for user)

Dark Shikari
16th January 2008, 15:11
Why does the new version need this file? I noticed that the file size decreased too. Is that because you removed this library from inside the .EXE?

I'm too lazy to fix it for now ;_;

Also, I fixed the link.

Sharktooth
16th January 2008, 17:53
nice one DS! really... :)

ToS_Maverick
16th January 2008, 18:32
Dark Shikari, could you implement the additional switches you suggested, in the next release? i'd really like to play around with a fully enabled AQ (also on B frames).

Dark Shikari
16th January 2008, 18:35
Dark Shikari, could you implement the additional switches you suggested, in the next release? i'd really like to play around with a fully enabled AQ (also on B frames).

This one is fully enabled on B-frames :)

I completely took out all frame-specific code and lambda-based code for the meantime.

DeathTheSheep
16th January 2008, 19:12
Even so, with r681, it yields an average of 33.4% bitrate-adjusted SSIM increase on my anime test clips (as compared to r680) in conjunction with hadamard, me-prepass, and 681's subme7 tweak. (Baseline profile, relatively high quantizers too).

Hmm, I wonder what all of these quality patches have in common? Like common place of origin, etc? Hmmm, nope, nothing. :p

Dark Shikari
16th January 2008, 19:19
Even so, with r681, it yields an average of 33.4% bitrate-adjusted SSIM increase on my anime test clips

:eek:

Dark Shikari
16th January 2008, 19:27
Here is the bug with --aq-strength 1.0 (it also happens with --aq-strength 0.9 if --no-dct-decimate is used). Excuse me if this problem has already been reported; I haven't read the whole thread yet (and maybe it doesn't even happen with the new patch anymore).
http://rapidshare.com/files/84200572/testseq-pearl-darkaq-bug.mkv (look @ the top left when the calendar pages are turned)
Something is HORRIBLY screwed up with the quantizers in that frame.

What commandline are you using, and where can I get your source? There seems to be something like a complete AQ reversal (!?!?!) It almost looks like the AQ sensitivity for that frame is negative (?!!!) I'm guessing it has something to do with the dark/light areas and an overflow of some sort in the calculations.

DeathTheSheep
16th January 2008, 19:34
...and the more modest 17.1% without hadamard, me-prepass, and subme7 patch (r680).

(This "expectedly unexpected" development only serves to add fuel to the fire in favor of updating the old patches; for some reason, quality increase is almost doubled when all used together! :D).

[edit]By the way, I'm testing in constant quantizer mode only. Are these results at all expected?

Sagekilla
16th January 2008, 19:40
...and the more modest 17.1% without hadamard, me-prepass, and subme7 patch (r680).

(This "expectedly unexpected" development only serves to add fuel to the fire in favor of updating the old patches; for some reason, quality increase is almost doubled when all used together! :D).

[edit]By the way, I'm testing in constant quantizer mode only. Are these results at all expected?

That could have something to do with it... You should try a similar constant quality mode instead to see what kind of quality gains can be had.

DeathTheSheep
16th January 2008, 19:46
What do you mean? As in the qcomped "crf" instead of "qp"?

It is bitrate adjusted, after all (bitrate delta compared with SSIM delta over 3 qp range works too).

Also, it's fun to note that q29 aq strength 0.5 produces nearly identical filesize (on 2 of my anime test clips) as q30 without aq, but SSIM goes up from [av] 0.9714289 to [av] 0.9747350.
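
Because the two encodes in this comparison have nearly the same file size, their SSIM scores can be compared directly. Below is a minimal sketch in C of one way to express the gain, as the relative reduction in dissimilarity (1 - SSIM). The function names are made up for illustration, and this is not necessarily how the "bitrate-adjusted" percentages quoted in this thread were calculated.

```c
#include <assert.h>

/* Structural dissimilarity: the part of a perfect SSIM score that was lost.
 * ssim is expected to lie in [0, 1]. */
double dssim(double ssim)
{
    assert(ssim >= 0.0 && ssim <= 1.0);
    return 1.0 - ssim;
}

/* Fraction by which the dissimilarity shrinks when SSIM moves from
 * ssim_before to ssim_after. A positive result means an improvement. */
double dssim_reduction(double ssim_before, double ssim_after)
{
    double before = dssim(ssim_before);
    assert(before > 0.0); /* a perfect score cannot be improved on */
    return (before - dssim(ssim_after)) / before;
}
```

With the figures from the post above, dssim_reduction(0.9714289, 0.9747350) comes out at roughly 0.116, i.e. about 11.6% less dissimilarity at about the same file size.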

bob0r
16th January 2008, 20:02
@Dark Shikari
Great work, you really solved parts of the Banding Problem with this, im amazed by the results :)

Without Dark Shikaris Magic AQ
http://rapidshare.com/files/84197179/testseq-pearl-nodarkaq.mkv

With Dark Shikaris Magic AQ (HVS Quality is greatly improved)
http://rapidshare.com/files/84197485/testseq-pearl-darkaq.mkv

Get it now, it makes your "darkest" dreams come true ;D

...

Wow what a difference!!!
Truly stunning to see this much improvement... i have mirrored the files: http://files.x264.nl/cruncher/

CruNcher
16th January 2008, 21:01
Thanks bob0r for mirroring, but I might have bad news: the new patch does nothing on this scene. It seems the adaptive --aq-sensitivity is failing, and even if I set it manually to 15 and strength to 0.9 or 1.0, nothing changes anymore, unlike with the old patch :(

CMD is


Old patch (x264_aq-brdo.diff) (massively improves the perceived visual quality of this scene)
x264-oldaq pearl.avs --bitrate 3000 --level 4.1 --min-keyint 1 --keyint 15 --aq-strength 0.9 --no-fast-pskip --bframes 0 --ref 1 --weightb --subme 1 --8x8dct --qpmin 15 --ipratio 1.1 --trellis 0 --deadzone-inter 11 --deadzone-intra 20 --nf --vbv-bufsize 14754 --vbv-maxrate 29400 --vbv-init 1.0 --me dia --threads auto --no-chroma-me --thread-input --aud --progress --sar 16:9 -o pearl-aq.mkv

Old patch, reversed AQ bug (x264_aq-brdo.diff) (still improves, but shows a problem in one sequence)
x264-oldaq pearl.avs --bitrate 3000 --level 4.1 --min-keyint 1 --keyint 15 --aq-strength 1.0 --no-fast-pskip --bframes 0 --ref 1 --weightb --subme 1 --8x8dct --qpmin 15 --ipratio 1.1 --trellis 0 --deadzone-inter 11 --deadzone-intra 20 --nf --vbv-bufsize 14754 --vbv-maxrate 29400 --vbv-init 1.0 --me dia --threads auto --no-chroma-me --thread-input --aud --progress --sar 16:9 -o pearl-aq.mkv

New patch (no reaction, also none with strength 0.9/1.0 and sensitivity 15)
x264-newaq pearl.avs --bitrate 3000 --level 4.1 --min-keyint 1 --keyint 15 --aq-strength 1.0 --no-fast-pskip --bframes 0 --ref 1 --weightb --subme 1 --8x8dct --qpmin 15 --ipratio 1.1 --trellis 0 --deadzone-inter 11 --deadzone-intra 20 --nf --vbv-bufsize 14754 --vbv-maxrate 29400 --vbv-init 1.0 --me dia --threads auto --no-chroma-me --thread-input --aud --progress --sar 16:9 -o pearl-aq.mkv

Dark Shikari
16th January 2008, 21:03
Can you get me the source for this so I can find and fix the bug?

Sharktooth
16th January 2008, 21:04
bobor, i hope you've messed up the file names on your mirror, coz the nodarkaq looks better and less blocky than the darkaq sequence...

Dark Shikari
16th January 2008, 21:06
bobor, i hope you've messed up the file names on your mirror, coz the nodarkaq looks better and less blocky than the darkaq sequence...

That's probably due to the bug in my AQ that seems to show its ugly face on certain parts of this clip.

Sharktooth
16th January 2008, 21:18
no, i saw blockiness in bright areas.

DeathTheSheep
16th January 2008, 21:20
Funny I noticed similar... but at my bitrates, the increase in quality of everything else more than makes up for it.

Dark Shikari
16th January 2008, 21:32
no, i saw blockiness in bright areas.

Uh, that's part of the bug. My AQ has absolutely zero, zip, zilch to do with "brightness" or "darkness."

Sharktooth
16th January 2008, 21:39
but the blockiness was accentuated from the nodarkaq encode... so IMHO you should watch out from stealing bits from those areas unless you want them to look worse than without AQ.

Dark Shikari
16th January 2008, 21:43
but the blockiness was accentuated from the nodarkaq encode... so IMHO you should watch out from stealing bits from those areas unless you want them to look worse than without AQ.

The entire bug is that AQ is reversed in certain frames. This means it's INCREASING blockiness in blocky areas--doing the opposite of what it should. How much do I have to explain this? Look at the quantizer distribution yourself--it's obviously completely reversed.

So far, I have been unable to replicate the bug using his encoded source as my source, with a CRF pass.

CruNcher
16th January 2008, 21:45
The perceived quality improvement here is in the really dark scenes, but even with standard VMR9 and a calibrated display you can see those problems without Dark Shikari's AQ (especially on an LCD). They even become visible in the gradient of the non-dark (green/blue) background with the text ("The Rise and fall of our Empire is at stake"); you can clearly see how Dark's AQ improves this background gradient drastically :). Sure, you won't see the problems in the dark area before that if you turn the brightness down a little, but in this area you would still see the gradient problems even at lower brightness. And for me personally, that banding is more annoying than the few extra blocks it introduces, mostly in the fast action :D. Never forget we are working lossy here; we have to set visual priorities, and mine is clear in this case ;)

DeathTheSheep
16th January 2008, 21:50
In my case, it seems there were flaws in the original mpeg2, which I misconstrued as problems with the AQ. No worries for me...

What was the exact binary, avs script, decoder, and commandline used to produce the bug? I want to replicate this thing--I can't believe it would favor one system over another.

CruNcher
16th January 2008, 21:52
Indeed, the source is bad: one of the first Blu-ray MPEG-2 encodes (full of film grain mixed with encoder quantization noise, and banding that causes problems when cleaning it), so it's perfect for testing extreme cases ;). OK, I found a problem: using --sar 16:9 makes this hardware-incompatible :( --sar 4:3 works, but the picture is huge now, and I'm sure it wasn't intended to be displayed that way. Jesus, I'm never going to understand AR: its AR is 2.40, so I think it's wrong that with --sar 4:3 it's now displayed at AR 16/9. How crazy is that ;).
OK, so there's no way to stay hardware compatible and also keep the right AR (2.40) on screen; you must include the letterbox. I see no other way, or I'm blind ;)

Ok here another one with the old AQ, this time it should be almost 100% Hardware Decoding Compatible (and it's low complexity) :)
http://mirror05.x264.nl/CruNcher/new-olddarkaq-hdc.mkv

Another thing I noticed is that with VMR9 Windowed (at least on my Nvidia 8800 GT) I see every mini block, especially the problems in the letterbox area (grey/white flashing, because brightness is much higher now; most probably the old TV level thing).
With VMR9 Renderless it looks much better (because it's darker, so a lot of problems aren't visible at first sight (luma HVS tricks) and are hard to spot even at second sight). Dark's AQ improves the rest: mostly the visible banding in the bright fire flash gradient, then the background gradient of the text scene, and also the dark blue gradient background outside :) I also left out --no-fast-pskip, which seems to have improved it a little overall. Viewed from approx. 75cm-1m away, this looks really nice :). The only thing I'm not happy with is the color representation compared to the source; it seems wrong, much duller. It's strange, and it doesn't seem to be a colormatrix problem, as the uncompressed FFV1 sample shows all the color fidelity. As expected, it's the way Cyberlink's decoder does it under VMR9 and even Overlay Mixer. The only decoder doing it right from the start is CoreAVC: it immediately shows the same full color representation as the source (and by doing that it even gets rid of more of the visual flaws that you would see with other decoders; even MPlayer and VideoLAN show the same dull colors) :(

Dark Shikari
17th January 2008, 03:08
Same settings as you cruncher--and absolutely no problem like I encountered in your stream. QPs came out exactly as I expected:

http://i18.tinypic.com/8dvvvqq.png

Also, I vastly optimized the AQ, removing the need for duplicate cache_loads and SADs and SSDs. The main result is that the CPU cost of AQ is cut drastically.

CruNcher
17th January 2008, 03:16
Dark, did you use the old AQ patch or the new one? The new one didn't work at all for me, and the old one showed that bug at --aq-strength 1.0; strange, this is.
This is the one I mean: http://files.x264.nl/force.php?file=./dark/x264_aq-brdo.diff. I patched that one onto 720 and got those --aq-strength 1.0 results.
gcc.exe (GCC) 3.4.5 (mingw special) <--- now im scared i better try a newer gcc version asap

Dark Shikari
17th January 2008, 03:21
Dark, did you use the old AQ patch or the new one? The new one didn't work at all for me, and the old one showed that bug at --aq-strength 1.0; strange, this is.
This is the one I mean: http://files.x264.nl/force.php?file=./dark/x264_aq-brdo.diff. I patched that one onto 720 and got those --aq-strength 1.0 results.

That's an ancient and bugged AQ patch.

Here, I'll give you a little something that will make you all warm and fuzzy inside: a combined patch of RDRC and AQ... AQ 0.43, not 0.42! Nice and fast... until you turn on RCRD ;)

Patch (http://pastebin.com/f227baccb)

Build (http://www.mediafire.com/?bcd2dd5ygtj)

CruNcher
17th January 2008, 03:28
Nice and where is Lookahead ? a complete RDRC but no Lookahead for ABR geez ;)

Sharktooth
17th January 2008, 03:31
could you encode the clip again with this one? just AQ, no RDRC...

Dark Shikari
17th January 2008, 03:35
could you encode the clip again with this one? just AQ, no RDRC...

That's what I just did--just AQ, not RCRD.

RCRD is unrelated and I'm not using it for most of the AQ work; it's just something some people were requesting.

CruNcher
17th January 2008, 03:55
Dark, ehh, I tried your new build (I didn't patch; I used your .exe), but --aq-strength 1.0 does nothing compared to the old one (that you call buggy AQ) in the key spots I talked about; it just ignores them. And with --aq-strength 1.0 --aq-sensitivity 15 your .exe crashes the encode right after the first frame :(

Dark Shikari
17th January 2008, 03:57
Dark, ehh, I tried your new build (I didn't patch; I used your .exe), but --aq-strength 1.0 does nothing compared to the old one (that you call buggy AQ) in the key spots I talked about; it just ignores them. And with --aq-strength 1.0 --aq-sensitivity 15 your .exe crashes the encode right after the first frame :(

Well it works just fine here... :rolleyes: same source, same commandlines...

Can you come on Freenode and we'll try to resolve this?

mahsah
17th January 2008, 03:58
Wow, this is great! I tried this on a test clip of Futurama, and while the results are not very noticeable (and I was doing two pass encoding), there are fewer blocks with AQ on. Also the png filesize is lower, which I guess means it is more compressible.

NoAQ:
http://img47.imageshack.us/my.php?image=noaqtq1.png

AQ:
http://img337.imageshack.us/img337/5960/aqgx4.png

command line:

"C:\Program Files\megui\tools\x264\x264.exe" --pass 2 --bitrate 1350 --stats "hfyu_go.stats" --ref 8 --mixed-refs --bframes 16 --b-pyramid --b-rdo --bime --weightb --direct auto --filter 1,1 --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --merange 12 --threads auto --thread-input --sar 243:200 --progress --no-psnr --no-ssim --output "hfyu_go.mkv" "hfyu_go.avs"

CruNcher
17th January 2008, 05:03
Dark, which commandline? I pasted 3 :)

So OK, I did a test again with the old AQ and your new AQ, both with the same settings: --aq-strength 0.9 this time, as 1.0 is extremely buggy, as you said. No --aq-sensitivity was given for either (so not strictly equal: the old one has --aq-sensitivity 15 as default). Here is a visual comparison of the key spots I talked about and their subjectively perceived visual differences.

Not many people, for sure, have eyes as trained as mine, and maybe that's why I notice such scenes so fast, and they really hurt me (please note it's not about the individual frames here; it's about the scenes they mark and the time they stay in focus and are perceptible to the eye) :(

New AQ
http://s3.directupload.net/images/080117/llnhtezh.png
http://s1.directupload.net/images/080117/tu7hg58j.png
http://s2.directupload.net/images/080117/gueurani.png
http://s1.directupload.net/images/080117/ft79axio.png

Old AQ
http://s1.directupload.net/images/080117/gh3325fq.png
http://s2.directupload.net/images/080117/2dp7ut4p.png
http://s5.directupload.net/images/080117/dvit3x7j.png
http://s5.directupload.net/images/080117/hgc6ar9j.png

These are the key areas for me that decide what is good and what is bad, and how the whole sequence is rated by the viewer; or rather by me, as Sharktooth, for example, doesn't like the blocks in the bright areas, but those pass so fast anyway. This is what really hurts my eyes about your new AQ compared to the old :)
I have exaggerated the decoding brightness problems (using VMR9 Windowed and ffdshow is the best combination to achieve that, hehe) to show those key areas. This is of course not the same as what you would see on screen with a good calibration; the best way to rate this on Windows seems to be CoreAVC (perfect rendering quality, right color levels, right brightness levels, everything perfect) :)

These key spots have to be seen in motion (especially the color gradients), and then it's clear that the new AQ is not for everyone :)

Dark Shikari
17th January 2008, 05:11
It'd be far more useful if you showed me what you thought was a "good" quantizer distribution and what you thought was a "bad" one.

I can't tell crap from those images without seeing the quantizers.

Sagekilla
17th January 2008, 05:22
Dark Shikari, does your latest linked build feature your RCRD as well? I downloaded what I thought had this feature and I found it produced bit-for-bit identical encodes as the current r720 build, with the exception of r720 being marginally faster (6.88 vs 6.47 fps).

Nevermind, I forgot to enable the switches to begin with. Are there any suggestions for --rcrd-lambda, like a starting point for example? :)

Dark Shikari
17th January 2008, 05:24
For lambda, note that lambda does not correspond to a specific CRF--the bitrate result of lambda will vary based on your window size, source, and AQ settings.

As I mentioned in the OP, lambda 500 will give you something roughly similar to CRF 24-26.

CruNcher
17th January 2008, 05:26
@Dark
Sorry gonna upload the results from both showing the quantizers and some results showing both rendered by CoreAVC in the key spots (also interesting) :)


This is the nicest representation of the encode (which comes closest to the source in color fidelity and brightness)

CoreAVC New AQ
http://s3.directupload.net/images/080117/g7vcy5ab.png
http://s3.directupload.net/images/080117/jtx66peq.png
http://s3.directupload.net/images/080117/o3jlae5c.png

CoreAVC Old AQ
http://s5.directupload.net/images/080117/wz44kip6.png
http://s3.directupload.net/images/080117/y73468f4.png
http://s4.directupload.net/images/080117/pgqe79ch.png

Even with hardware decoding it's not possible to reach this quality, at least not with MPCHC + Cyberlink and Nvidia on VMR9 or Overlay; the colors will always be duller than the actual source was :(

Sagekilla
17th January 2008, 05:38
@CruNcher: Sounds like you're not properly adjusted for PC levels.. If you're not, then video will always look dull and washed out (black will be gray, whites won't be as white, etc) I forgot which setting, but it involves YUV conversion in hardware. Don't do that, that's what causes it to be set to TV scale instead of PC scale.


On a side note, is there any possibility of speed optimizations for RCRD or is it inherently that slow?

Dark Shikari
17th January 2008, 05:49
On a side note, is there any possibility of speed optimizations for RCRD or is it inherently that slow?

I have already made a number of optimizations, but more are possible.

1) Take more shortcuts when doing the lookahead encodes.

2) Take more shortcuts when choosing the QP.

3) The full effect of RDRC can be simulated by analyzing the video on the first pass to figure out, for every case in which data is added to the video stream, how much that data is used in all future frames. This is an extremely difficult problem, but if we could somehow get an answer to that question, we would have an RC method that would get results close to RDRC and absolutely blow away all other video codecs out there. I already have some theories on how to do this, but I'm not sure how good an approximation they would be.

Sagekilla
17th January 2008, 06:16
Interesting work.. Any chance of any portion of this being added to the RC for crf mode? Or is this something that'll only work for 2-pass?

Dark Shikari
17th January 2008, 06:18
Interesting work.. Any chance of any portion of this being added to the RC for crf mode? Or is this something that'll only work for 2-pass?

The speed cost is extremely heavy; the only thing I could think of for general use would be to use it to make I-frame/scenecut quantizer decisions or something.

It doesn't need to be twopass--that's just to make the coding easier (to avoid having to decide frametypes).

CruNcher
17th January 2008, 06:24
Here, Dark: as you can see, in the gradient areas the quantization with the new AQ is too high or unevenly distributed.

New AQ Quantization Distribution
http://s4.directupload.net/images/080117/qiil5rwn.png
http://s6.directupload.net/images/080117/92elpdmw.png
http://s4.directupload.net/images/080117/dj435r2x.png
http://s3.directupload.net/images/080117/5xdn8u82.png

Old AQ Quantization Distribution
http://s1.directupload.net/images/080117/kgtd8olt.png
http://s3.directupload.net/images/080117/p86p3bdv.png
http://s3.directupload.net/images/080117/st6beg4c.png
http://s3.directupload.net/images/080117/9ar89obb.png

Maybe a quantizer of 15-20 (the old AQ is really low most of the time, around 15-17 over several frames) would help in such areas :) 24-28 seems to be too much, and the result then seems to be very visible banding.

Here are also both clips for you to analyze
http://mirror05.x264.nl/CruNcher/pearl-aq-newdark.mkv
http://mirror05.x264.nl/CruNcher/new-olddarkaq-hdc.mkv

Dark Shikari
17th January 2008, 06:27
The explanation there is that the new AQ (when on automatic sensitivity) tries to avoid redistributing bits among frames, while older AQs didn't.

I would suggest you try a higher AQ strength, also, as the new AQ uses a different formula.
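
The idea of not moving bits between frames can be illustrated with a small C sketch: if the per-macroblock QP offsets of a frame are shifted so that they average to zero, the frame's mean QP stays where ratecontrol put it, and AQ only moves bits around inside the frame. This is an assumption-laden illustration, not the patch's code; the automatic-sensitivity logic is not shown in this excerpt, and center_offsets is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Shifts all per-macroblock QP offsets of one frame so that they average
 * to zero, and returns the mean that was removed.
 * Hypothetical helper for illustration; not taken from the AQ patch. */
double center_offsets(double *offsets, size_t count)
{
    assert(offsets != NULL && count > 0);

    double mean = 0.0;
    for (size_t i = 0; i < count; i++)
        mean += offsets[i];
    mean /= (double)count;

    /* After this loop the offsets sum to (approximately) zero, so some
     * blocks get a lower QP and others a higher one, but the frame's
     * average QP is unchanged. */
    for (size_t i = 0; i < count; i++)
        offsets[i] -= mean;

    return mean;
}
```

A zero-mean set of QP offsets keeps the frame size only roughly constant, since bits do not scale linearly with QP; a fixed (static) sensitivity skips this centering and therefore lets whole frames drift up or down in QP.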

Atak_Snajpera
17th January 2008, 13:31
@CruNcher

...and don't forget to use RGB32 instead of YV12 !

CruNcher
17th January 2008, 15:58
@Sagekilla and Atak
I used the high quality YV12->RGB32 conversion, but enabling/disabling it doesn't do anything to the picture quality with ffdshow and VMR9 windowed/renderless.
I don't know, though, what Sagekilla means by the YUV conversion in hardware; I never saw such an option in MPCHC, Cyberlink or the VMR9 renderer.
I think the problem here arises once again from the Nvidia driver: somehow the DVI output is set to TV level. There is a registry fix for that; I'm going to try whether it improves anything :).
It's strange that CoreAVC somehow compensates for this; I think it's more than just that (also something with color telemetry and the decoder).
I saw a Bt709 option in ffdshow now, that's something new, and also the Range setting, but neither improved anything in the visible picture quality.

@Dark Shikari
I'll try a higher strength now, thx for that info :)

Atak_Snajpera
17th January 2008, 16:11
@CruNcher
In MPC use System Default instead of vmr9

Source: HDV camcorder
Settings: PS3 profile 2-pass 6144 kbps.

AQ 1.0
http://img508.imageshack.us/img508/1593/aq1ov8.th.png (http://img508.imageshack.us/my.php?image=aq1ov8.png)


NOAQ
http://img502.imageshack.us/img502/5281/noaqca4.th.png (http://img502.imageshack.us/my.php?image=noaqca4.png)

Beautiful job Dark!

Dark Shikari
17th January 2008, 16:44
An extreme example. Same bitrate (3 megabit), same encoding settings, almost the same framesize.

Except the AQ one looks so much better it's astounding (second is AQ):

http://i6.tinypic.com/8aksxz9.png
http://i17.tinypic.com/6tmoyty.png

Swap between them fast to see a huge difference.

CruNcher
17th January 2008, 16:51
I don't need to swap between them; you can already see less ringing at first sight :) that's amazing :D

New AQ
- less ringing
- less banding
- more details

:D That's 3 for one, and with very little speed loss (and only a small complexity disadvantage) :)

Yeah Dark, --aq-strength 1.5 enhanced it; I'll go higher now :)

final ratefactor: 24.74 <- before that it was somewhere around 30 :D

But still, I don't reach the optimum visual quality I reached with the old one at 0.9.
Hmm, and the higher I go with the new one, the more the bitrate decreases and the smaller the file gets in the end, but those spots are still not enhanced. Jesus, I'll try a strength of 5.0 now ;)
A strength of 5.0 is extreme: it shows all the faces basically smashed up into blocks, but the background seems to get ignored completely; at least it isn't enhanced the way it was with the old AQ.
I think with a little more balance the new AQ could be perfect for all situations; you just have to find it :). It seems easy now: compare the old AQ with the new AQ and the spots they actually enhance/degrade, mix the two, and voila = Super AQ :D

@Dark
here is the visual result with strength 5.0
http://mirror05.x264.nl/CruNcher/pearl-newdark-5.0.mkv

I haven't yet found the setting that gives the same visual quality for the whole scene as the old AQ did (still searching)
http://forum.doom9.org/showthread.php?p=1088356#post1088356

--aq-strength 2.0 seems to come near the visual result of the old one; it's not quite the same, but I think I'm getting closer (at least it gets harder to spot the difference).
Maybe I should take the final size as the indicator for it :)

Old AQ --aq-strength 0.9 --aq-sensitivity 15 = 11.273.040 Bytes
New AQ --aq-strength 2.0 = 10.643.143 Bytes

Yep, it seems to be somewhere below the old AQ; I'm getting nearer :)
I have now reached the same filesize at --aq-strength 0.3, but the visual result is suboptimal for this scene compared with the old AQ at --aq-strength 0.9 and --aq-sensitivity 15.

desta
17th January 2008, 18:00
I'm a bit lost now - the higher the strength of the new AQ, the less overall bitrate it needs?!

CruNcher
17th January 2008, 18:12
Yep, that's how it looks to me. I'm almost sure the problem lies somewhere in the adaptive --aq-sensitivity in the new AQ, because that's also what crashes the encoding process if you try to set it manually.

Here is the nearest I could get with the new AQ to the old AQ, visually, for this test cut (just adjusting --aq-strength for both)

New AQ (--aq-strength 0.3) SSIM Mean Y:0.9870448 PSNR Mean Y:47.322 U:48.604 V:51.085 Avg:47.950 Global:47.431 final ratefactor: 25.26
http://mirror05.x264.nl/CruNcher/pearl-aq-newdark-0.3.mkv

Old AQ (--aq-strength 0.9) SSIM Mean Y:0.9838171 PSNR Mean Y:45.361 U:47.704 V:49.915 Avg:46.192 Global:45.801 final ratefactor: 30.14
http://mirror05.x264.nl/CruNcher/pearl-aq-olddark-0.9.mkv

Now it's up to the viewer to decide which gives better results for this scene


I haven't given a shit about SSIM and PSNR for years now while doing my visual optimization stuff, keep that in mind (because SSIM doesn't register these visual problems at all). I'm more an artist than a mathematician, and I'm absolutely thrilled by HVS optimization and tricking the eye.

Dark Shikari
17th January 2008, 18:23
I'm a bit lost now - the higher the strength of the new AQ, the less overall bitrate it needs?!

Generally, yes, because it raises quantizers on high-detail areas that don't need as many bits.

bob0r
17th January 2008, 18:25
@CruNcher
You deleted all your test files?
As in the last two links above?

CruNcher
17th January 2008, 18:44
No, they are up; they just weren't up yet when I wrote that. Yes, I deleted the --aq-strength 5.0 one; I wasn't sure Dark really needs it, as he can reproduce it, but maybe it's interesting for the rest to watch, so I uploaded it again ;)

Dark Shikari
17th January 2008, 18:47
No, they are up; they just weren't up yet when I wrote that. Yes, I deleted the --aq-strength 5.0 one; I wasn't sure Dark really needs it, as he can reproduce it, but maybe it's interesting for the rest to watch, so I uploaded it again ;)

Strength 5.0 is retarded since it allows QP adjustments of up to 25 in each direction :p
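
The remark that strength 5.0 allows QP adjustments of up to 25 in each direction suggests a limit of about 5 QP steps per unit of strength. A minimal C sketch of that clamping follows; the linear scaling is inferred from this single data point, aq_max_delta and aq_apply are hypothetical names, and the 0..51 range in the usage note is the normal 8-bit H.264 QP range rather than anything specific to the patch.

```c
#include <assert.h>

/* Largest per-macroblock QP change in either direction.
 * Assumption: the limit grows linearly with strength (5.0 -> 25). */
double aq_max_delta(double strength)
{
    assert(strength >= 0.0);
    return 5.0 * strength;
}

/* Applies an AQ offset to the frame QP. The offset is first limited to
 * +/- aq_max_delta(strength), then rounded to the nearest integer, and
 * the result is kept inside [qp_min, qp_max]. */
int aq_apply(int frame_qp, double offset, double strength,
             int qp_min, int qp_max)
{
    assert(qp_min <= qp_max);

    double limit = aq_max_delta(strength);
    if (offset > limit)
        offset = limit;
    if (offset < -limit)
        offset = -limit;

    /* Round half away from zero without needing math.h. */
    int qp = frame_qp + (int)(offset >= 0.0 ? offset + 0.5 : offset - 0.5);

    if (qp < qp_min)
        qp = qp_min;
    if (qp > qp_max)
        qp = qp_max;
    return qp;
}
```

For example, with a frame QP of 26 and strength 5.0, a block could in principle land anywhere from QP 1 to QP 51, which is why such a setting smashes detailed areas into blocks; at strength 0.5 the same block stays within about 2.5 steps of 26.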

CruNcher
17th January 2008, 18:51
Dark, seeing those metric results, did you really blindly optimize for SSIM?
I think the above is the best example to show where such optimizations end and where you should rather go the HVS (psy optimization) way. The Ateme guys understood that quickly after I did my first beta tests of their encoder back then, and they have done some great HVS research since (across the whole area: picture sharpness, perfect masking, and some more clever things) :)

Dark Shikari
17th January 2008, 18:57
Dark, seeing those metric results, did you really blindly optimize for SSIM?
The reason SSIM rises is simple: SSIM is directly related to the variance of a block. One of its main advantages over PSNR is that it measures distortion relative to the detail already present in a block--which makes perfect sense visually.

My AQ has the exact same approach, and for the same reasons; X amount of lossiness in a low detail block looks far worse than X amount of lossiness in a high detail block. Therefore, low-detail blocks should have much less lossiness than high-detail blocks.

I actually learned that SSIM worked this way slightly after I wrote my first version of this AQ, but the results are not at all surprising given what the AQ does. It doesn't specifically optimize for SSIM, but by its very nature it should increase SSIM.
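
To make the variance point concrete, here is a minimal Python sketch. It assumes the standard single-window SSIM formula with the usual constants (K1 = 0.01, K2 = 0.03, 8-bit range); the sample blocks and the error pattern are made up for illustration, and none of it is taken from x264's code. The same error is added to a nearly flat block and to a high-detail block, and the flat block ends up with the lower score.

```python
def mean(xs):
    return sum(xs) / len(xs)

def covariance(xs, ys):
    # Population covariance; covariance(xs, xs) is the variance.
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def ssim(xs, ys, k1=0.01, k2=0.03, peak=255):
    # Single-window SSIM over two equally sized lists of samples.
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2
    mx, my = mean(xs), mean(ys)
    var_x, var_y = covariance(xs, xs), covariance(ys, ys)
    luminance = (2 * mx * my + c1) / (mx * mx + my * my + c1)
    contrast_structure = (2 * covariance(xs, ys) + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure

# Hypothetical sample blocks: one nearly flat, one with strong detail.
flat = [100, 100, 101, 101, 100, 100, 101, 101]
detail = [60, 60, 140, 140, 60, 60, 140, 140]
# Hypothetical coding error, identical for both blocks.
err = [3, -3, 3, -3, 3, -3, 3, -3]

def distort(block):
    return [p + e for p, e in zip(block, err)]

ssim_flat = ssim(flat, distort(flat))
ssim_detail = ssim(detail, distort(detail))
print(ssim_flat, ssim_detail)
```

Because the block variances sit in the denominator, a fixed amount of error costs far more SSIM when the variance is small, which is the same reasoning the AQ uses to hand lower quantizers to low-detail blocks.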

desta
17th January 2008, 19:08
Generally, yes, because it raises quantizers on high-detail areas that don't need as many bits.
So to preserve detail evenly, it's best to use moderate strength and/or lower sensitivity?

Dark Shikari
17th January 2008, 19:14
So to preserve detail evenly, it's best to use moderate strength and/or lower sensitivity?
Sensitivity should be kept at automatic whenever possible.

Obviously too high strength will ruin quality.

The trick is that with no AQ, detail is kept far better in high-detail areas than in low-detail areas. This is because of how quantization works. Let's say we have two blocks, each of which consists of a single frequency (for simplicity):

Block 1 frequency: 102.9

Block 2 frequency: 7.4

Let's say our quantizer allows values 0, 5, 10, etc.

Block 1 will be rounded to 105, resulting in an error of about 2%. Block 2 will be rounded to 5, resulting in an error of about 30%. Yet they both used the same quantizer!

This AQ tries to resolve this by forcing Block 1 to use a higher quantizer and Block 2 to use a lower quantizer.
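
The arithmetic above can be checked with a short Python sketch. The step sizes 10 and 2.5 used for the "with AQ" case are made-up values for illustration only; the actual patch moves QPs rather than picking steps like this.

```python
def quantize(value, step):
    # Round to the nearest multiple of the quantizer step.
    return step * round(value / step)

def relative_error(value, step):
    # Quantization error as a fraction of the original value.
    return abs(quantize(value, step) - value) / abs(value)

block1, block2 = 102.9, 7.4

# Same quantizer (step 5) for both blocks, as in the example above.
print(quantize(block1, 5), relative_error(block1, 5))
print(quantize(block2, 5), relative_error(block2, 5))

# Hypothetical AQ-style choice: coarser step for the strong block,
# finer step for the weak one.
print(quantize(block1, 10), relative_error(block1, 10))
print(quantize(block2, 2.5), relative_error(block2, 2.5))
```

With these made-up steps the two relative errors end up much closer together, which is the kind of balance the AQ is aiming for.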

desta
17th January 2008, 19:20
Right, yeah I understand. Thanks for the explanation.

It threw me a bit when CruNcher showed the stronger AQ using less bits, but I see now it's pretty much the same principle as your original AQ - just that pushing the AQ strength too much will tip the results in the opposite direction.

CruNcher
17th January 2008, 19:28
Might be, Dark, but that doesn't change the fact that your new AQ isn't visually at the optimum the way the old one is; the old one does a better job of rendering especially the very problematic background edges that the eye immediately notices (in motion even more).
This is clearly visible even with correctly calibrated color and brightness, as CoreAVC shows it, and with a bad calibration (which many Windows users surely have) your new approach looks even worse compared to your old approach. You really should think about it.
I'm not sure more detail is worth having such scenes appear that way to the eye, so that the viewer gets distracted and the illusion of a very clean source is gone in that moment (talking especially about very low bitrates here).

Dark Shikari
17th January 2008, 19:34
Might be, Dark, but that doesn't change the fact that your new AQ isn't visually at the optimum the way the old one is
I've found it to be much better, personally; but perhaps you're looking for an AQ with a stronger QP-lowering effect at very low variances? This would act more like Haali's AQ, and have a much stronger effect on dark backgrounds.

CruNcher
17th January 2008, 19:48
Dark, I'm now working on some more improvements for this source project. I'm going to do two encodes of the complete 3 hours, one with your new AQ and one with the old one at the given settings, then do a subjective comparison of most of the scenes and tell you the final results. First I'm doing an extreme cut of what you just saw, with many different scenes put together (I also used that to evaluate Ateme's quality), and I'll show you those results with both AQs along with my visual understanding of them (the encode with the fewest scenes where you can tell it's a lossy encode wins) :) Later I'll also do that with an almost completely dark movie and see how well each of those AQs does in each situation. It will definitely take some time.

DeathTheSheep
17th January 2008, 19:50
This RDRC is revolutionary... (and slow, of course, but the slower the revolution, the more exquisite the results).

Looks like it's ignoring/washing out complex detail that's only shown for a split second, keeping all the goodies that persist in the following frames. This reminds me of some beta vp6 builds of yore.

Dark Shikari
17th January 2008, 19:50
Dark, I'm now working on some more improvements for this source project. I'm going to do two encodes of the complete 3 hours, one with your new AQ and one with the old one at the given settings, then do a subjective comparison of most of the scenes and tell you the final results. First I'm doing an extreme cut of what you just saw, with many different scenes put together (I also used that to evaluate Ateme's quality), and I'll show you those results with both AQs along with my visual understanding of them :)
How about you encode using a source that isn't atrocious?

And try using a sane bitrate? :p

Encoding with unrealistic settings is not a way to test an AQ.

DeathTheSheep
17th January 2008, 19:56
Haha, maybe unrealistic settings on an atrocious source brings out the best in an unrealistic, atrocious AQ? :)

CruNcher
17th January 2008, 20:06
Dark, you know the EBU (broadcast) test sequence? My stuff isn't far from that test cut, except that I use entirely Hollywood (film) material.
And we all know that x264 still has problems: just run ParkRun through it at a low bitrate and you'll see there is still a lot that could be done in psy optimizations for x264.

Dark Shikari
17th January 2008, 20:19
Dark, you know the EBU (broadcast) test sequence? My stuff isn't far from that test cut, except that I use entirely Hollywood (film) material.
And we all know that x264 still has problems: just run ParkRun through it at a low bitrate and you'll see there is still a lot that could be done in psy optimizations for x264.
Given what my AQ does, I suspect it would give extremely good results for ParkRun...

DeathTheSheep
17th January 2008, 20:22
Are there any plans for improving the quality algorithm further? You mentioned lambda support and such being removed "for the time being" since they "screw up" B frames on high quantizers or whatnot, but are there plans to pump the algo to the next level soon enough?

Dark Shikari
17th January 2008, 20:25
Are there any plans for improving the quality algorithm further? You mentioned lambda support and such being removed "for the time being" since they "screw up" B frames on high quantizers or whatnot, but are there plans to pump the algo to the next level soon enough?
At this point the only real possible benefits are:

1) Adding the ability to drop QP even further in extremely flat areas, in addition to the current method.

2) Trellising the QP_deltas to save bits.

3) Lambda modification to save bits (will be a hell of an annoyance to do, and I'm not sure how good an idea it is). Lambda modification is better in Xvid or similar, where you can't move quantizers around as effectively.

At this point, for the most part, I think this AQ is ready for primetime; I have yet to see it have a negative effect on anything but a cartoon source at low bitrates. The positive effects are of course huge.

DeathTheSheep
17th January 2008, 21:15
Mm, good stuff. I'm curious as to the origin of this statement: "Its not particularly good at cartoons; I wouldn't use it on anime/cartoons," especially in light of its groundbreaking success on my anime test clips at high QP, and whether the improvements you mention have the potential to remedy what problems may remain in this regard.

Danisan
17th January 2008, 21:36
Mm, good stuff. I'm curious as to the origin of this statement: "Its not particularly good at cartoons; I wouldn't use it on anime/cartoons," especially in light of its groundbreaking success on my anime test clips at high QP, and whether the improvements you mention have the potential to remedy what problems may remain in this regard.

Do you have any comparison pictures for the anime tests you've made? :thanks:

CruNcher
17th January 2008, 21:49
Dark, what about the --aq-sensitivity crash in your latest build?

Dark Shikari
17th January 2008, 22:09
Dark, what about the --aq-sensitivity crash in your latest build?
What crash? If someone reported a bug, I missed it...

ditche
17th January 2008, 22:20
Yep, I have a crash with v0.42, there's no encoding...

-[Information] Log for job1 (video, video test.avs -> video test new aq.mp4)
--[Information] [17/01/2008 20:26:02] Started handling job
--[Information] [17/01/2008 20:26:02] Preprocessing
--[NoImage] Job commandline: "C:\Program Files\megui\x264.exe" --qp 23 --ref 10 --mixed-refs --no-fast-pskip --bframes 16 --b-pyramid --bime --weightb --direct auto --filter -2,-1 --trellis 2 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads 3 --thread-input --sar 1:1 --progress --no-dct-decimate --no-psnr --no-ssim --output "D:\VIDEO\rip\video test new aq.mp4" "D:\VIDEO\rip\video test.avs" --aq-strength 0.6
--[Information] [17/01/2008 20:26:02] Encoding started
--[NoImage] Standard output stream
--[NoImage] Standard error stream
--[Information] [17/01/2008 20:26:02] Job completed

:helpful::)

Dark Shikari
17th January 2008, 22:26
Found one crash bug. Note to self: you cannot say if(floating point value == 0) before calling x264_cpu_restore(). Updated builds/patch.

CruNcher
17th January 2008, 23:07
Wow, this really almost flipped me out of my chair :D

No AQ
http://mirror05.x264.nl/CruNcher/parkrun-noaq.mkv

New AQ
http://mirror05.x264.nl/CruNcher/parkrun-newaq-1.0.mkv

Old AQ
http://mirror05.x264.nl/CruNcher/parkrun-oldaq-0.9.mkv
http://mirror05.x264.nl/CruNcher/parkrun-oldaq-1.0.mkv

Dark Shikari
17th January 2008, 23:12
Wow, this really almost flipped me out of my chair :D

No AQ
http://mirror05.x264.nl/CruNcher/parkrun-noaq.mkv

New AQ
http://mirror05.x264.nl/CruNcher/parkrun-newaq-1.0.mkv

Old AQ
http://mirror05.x264.nl/CruNcher/parkrun-oldaq-0.9.mkv
http://mirror05.x264.nl/CruNcher/parkrun-oldaq-1.0.mkv

Holy crap! I thought it was just a ratecontrol issue until I realized the I-frames were the same size between the three encodes... holy shit!

The difference in some of the later I-frames between those is unbelievable.

Dark Shikari
17th January 2008, 23:18
>>>WOW<<<

http://i8.tinypic.com/6pocmcy.png

http://i2.tinypic.com/87k6ipf.png

:eek::eek::eek::eek::eek::eek::eek::eek:

I didn't realize any AQ could be this effective. Wow.

(both I-frames, same size)

Romario
17th January 2008, 23:24
Dark Shikari, what about Vista users, especially Vista 64 users?

Do you have a plan to compile a 64-bit build?

Dark Shikari
17th January 2008, 23:27
Dark Shikari, what about Vista users, especially Vista 64 users?

Do you have a plan to compile a 64-bit build?
I don't have a 64-bit OS, so I can't make a 64-bit build--but the patch is available for someone else to.

Also, here's an animated GIF of the difference, since it's so shocking:

http://i18.tinypic.com/82u8c9j.gif

CruNcher
17th January 2008, 23:29
Just the pulsing from the closed GOP needs to be worked around visually, then it would be perfect :)

Snowknight26
17th January 2008, 23:32
The I-frames really detract from the visual aesthetics when watching those samples.
Pretty impressive nonetheless.

Dark Shikari
17th January 2008, 23:33
The I-frames really detract from the visual aesthetics when watching those samples.
The main problem is that Cruncher used 1-pass ABR. The second problem being his crappy settings :p

Razorholt
17th January 2008, 23:34
@Darky: what build did you use?

Dark Shikari
17th January 2008, 23:38
Apparently (and not surprisingly) people want to see comparisons of P-frames in properly encoded clips, rather than Cruncher's subme1 atrocities, so I will post a real comparison in a bit.

CruNcher
17th January 2008, 23:39
I know, Dark, but it's realtime that way on my 2.8 GHz dual-core machine :D Sure, with subme 5, B-frames, more refs and all that stuff this would look much better, but complexity would go up and speed down, and that's not the way I balance things. My goal is also a different one than the best compression possible ;)

DeathTheSheep
17th January 2008, 23:45
Is 1.0 now the recommended strength? Everyone seems to scream 1.0 is the best, but I see no evidence. Did you do your tests with 1.0, DS?

Dark Shikari
17th January 2008, 23:49
Is 1.0 now the recommended strength? Everyone seems to scream 1.0 is the best, but I see no evidence. Did you do your tests with 1.0, DS?
Yeah, I've been using 1.0.

1.0 = max QP adjustment of +/- 5
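
Going by the two figures given in this thread (strength 1.0 allows up to +/- 5 QP, strength 5.0 up to 25), the limit appears to scale linearly with strength. The Python sketch below assumes exactly that; the function names are hypothetical and not taken from the patch. It also uses the fact that the H.264 quantizer step size doubles every 6 QP, to show how big a swing in step size each limit implies.

```python
def max_qp_delta(strength):
    # Assumed linear rule, fitted to the two figures quoted in the thread:
    # strength 1.0 -> +/- 5, strength 5.0 -> +/- 25.
    return 5.0 * strength

def qstep_ratio(qp_delta):
    # In H.264 the quantizer step size doubles for every 6 QP.
    return 2.0 ** (qp_delta / 6.0)

for strength in (0.5, 1.0, 5.0):
    delta = max_qp_delta(strength)
    print(strength, delta, qstep_ratio(delta))
```

Under that assumption, strength 1.0 lets the step size move by a bit less than a factor of 2 in either direction, while strength 5.0 allows a swing of more than 16x, which is why such a setting is not useful in practice.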

Atak_Snajpera
18th January 2008, 00:25
PS3 profile 2-pass 6144 kbps

frame 370
no AQ
http://img174.imageshack.us/img174/5076/noaquq8.th.jpg (http://img174.imageshack.us/my.php?image=noaquq8.jpg)

AQ1.0
http://img156.imageshack.us/img156/5189/aq1yu6.th.jpg (http://img156.imageshack.us/my.php?image=aq1yu6.jpg)

Dark Shikari
18th January 2008, 00:35
:eek:

I just finished my own test encodes, 5 megabits at 25 frames per second, maxed out settings.

AQ got a 37% SSIM boost.

The quality difference is mindblowing--the AQ, even in motion, looks nearly transparent, while the non-AQ looks atrocious.

Linkage to download both clips (AQ, no AQ) (http://www.mediafire.com/?62dzmttn0n4).

Inventive Software
18th January 2008, 00:38
And the award for next SVN entry goes to:


Dark Shikari! :D

DeathTheSheep
18th January 2008, 00:42
You know, I'd like a link to the clips (preferably the anime) you claim didn't benefit visually from the AQ.
I want to test if any settings combo can give a better effect, then see if I can generalize these to other problem anime samples.

Dark Shikari
18th January 2008, 00:47
You know, I'd like a link to the clips (preferably the anime) you claim didn't benefit visually from the AQ.
I want to test if any settings combo can give a better effect, then see if I can generalize these to other problem anime samples.
I haven't tested a lot on cartoons--once I fix interlacing to work correctly with AQ, I'll go back to them.

CruNcher
18th January 2008, 01:03
I haven't tested a lot on cartoons--once I fix interlacing to work correctly with AQ, I'll go back to them.

Arghh, don't de-optimize x264 for anime stuff, or at least don't unbalance the real-footage behaviour; that would be a disaster.

This I-frame pulsing should be looked into; it's visible even at 6 Mbit/s and even with heavy encoding settings. I think the problem has been deep inside x264 since the first days and has to do with the partitioning itself, so it surely won't be easy to fix (to make it more consistent). Hmm, playing around with the deadzone reduces it a little, but it's still there.

Dark Shikari
18th January 2008, 01:03
Interlacing's stream corruption appears to be not a bug in AQ but actually a long-standing bug in x264--the CABAC context for QP_deltas seems to be wrong in interlaced mode, since it only appears with CABAC, and appears even if I entirely remove AQ and do nothing but randomly apply quantizers to different blocks.

DeathTheSheep
18th January 2008, 01:13
Crunch: Nobody's de-optimizing anything.

DS: If interlacing is causing you trouble, don't use it. It's more visually difficult to spot differences in screencaps anyway with the presence of two fields. Instead (just for testing purposes, if nothing else), deinterlace the source well first; it might even be advisable to discard one field entirely and interpolate the width (via spline36) to maintain the AR--this way, you'll have a great low-res, low-br test clip on your hands.

Dark Shikari
18th January 2008, 01:17
Crunch: Nobody's de-optimizing anything.

DS: If interlacing is causing you trouble, don't use it. It's more visually difficult to spot differences in screencaps anyway with the presence of two fields. Instead (just for testing purposes, if nothing else), deinterlace the source well first; it might even be advisable to discard one field entirely and interpolate the width (via spline36) to maintain the AR--this way, you'll have a great low-res, low-br test clip on your hands.
I don't have this choice. Dakaz insists that interlacing and AQ work together :p