View Full Version : Which frame looks better?


Dark Shikari
27th September 2007, 08:30
The following frames are from 5 different encodes plus a reference frame (not indicated which is which). Not all are necessarily various modifications of x264--in fact, they might all be different codecs... I'm not telling for now. If you figure it out, don't say, because that sort of eliminates the point of this post.

For the following 6 images, comment on what you like and don't like about each, and which one(s) you think are the best. Look very carefully at them--find blocks in the background, ringing, sharpness, grain retention, etc. Comment on them--I want to know what is wrong with each of them, and what is right--not just that one sucks and another is good!

Also note that I will tell which is which later. One of them is the original.

1: http://img233.imageshack.us/img233/6760/89164279aw2.png
2: http://img72.imageshack.us/img72/899/22269025xa2.png
3: http://img63.imageshack.us/img63/3974/93797124ps4.png
4: http://img295.imageshack.us/img295/5980/39562005yz4.png
5: http://img295.imageshack.us/img295/159/94900362yd7.png
6: http://img295.imageshack.us/img295/3848/96015227wh9.png

They are all from streams at the same ABR bitrate. The original source was quite grainy, so all the grain shown is from the original. No FGM was used.

Sergey A. Sablin
27th September 2007, 10:19
do you mind to post first image of the same resolution as others?

Dark Shikari
27th September 2007, 10:23
do you mind to post first image of the same resolution as others?

I accidentally overcropped the frame--let me fix it.

Edit: Frame fixed, and a 5th and a 6th added. I have also noted that a reference frame is hidden among them, rather than leaving you to decide on your own whether there was one or not.

Japhsoncross
27th September 2007, 10:54
frame 2 doesn't have the same vertical position as other frames, but its quality is almost the same as frame 1. frame 3,4,5 seem to be compressed a bit. at least the background is smoothed in frame 3,4,5. it seems frame 4,5 show more grain in the background than 4, but they don't have better quality on the center part of the picture as frame 4. i think frame 3 is a bit better than frame 5 since i can see contours in frame 5.

---------------------------
after posting, i found 6 frame totally. i'll check them again.

le_canz
27th September 2007, 12:01
Pic 2 clearly seems to be the smoothest to me.

Pic 6 seems the most blocky (at least the most obvious), and pic 5 is nearly as blocky.

Pic 1 and 2 are close except that 2 is smoother, but they're not blocky.

Pic 3 has more grain than 4, they're not blocky.

I think I prefer picture 3.

I hope it will be helpful :)

CruNcher
27th September 2007, 12:38
yep definitely 3 but taking also the size into account (size/quality balance seems perfect) for this 1 frame at least, but how about in motion ?

ToS_Maverick
27th September 2007, 14:00
2 is smooth and flawless
1 and 3 are very close, 3 is probably a bit better
4
6
5 is the worst

*.mp4 guy
27th September 2007, 14:27
I completely agree with ToS_Maverick.

Wishbringer
27th September 2007, 15:05
1 has most details for me and a nice remaining grain in the background.

can't decide if 5 or 6 looks worst in background for me... think 5

hand in 2 is brighter than 1, more shiny (but not better, i prefer 1)

3 - 6 is overbrightend at hand, reducing details. 4 - 6 are worst.

5 has worst grain, then 4, 6, 3, 2, 1

So I think 1 is original, 2 has AQ.
5 is without optimization, then ranking 3, 4, 6 (better to worse)

Egladil
27th September 2007, 15:18
1 is best for me, sharp and nice background
2 is too smooth
3 looks very good too, has a bit less grain preserved than 1 in detailed areas
4 bad background
5 grain in background lost.
6 bad background

my preference:
1, 3, 5, 2, (4, 6)

Brother John
27th September 2007, 16:17
1 is best.
2 is a little smoother and less detailed than 1.
3 preserves detail a tiny bit better than 2 in the left part of the picture but pays by being slightly smoother and more blocky on the right (the dark blue area).
4 is smoother than 2/3 and has noticeable blocks.
5 and 6 are even more blocky.

My favourite is 1 and I'm about indifferent between 2-3. 4-6 all have too many blocks.

check
27th September 2007, 16:38
There's definitely a green cast to 2. 2 is a little more blocky as well. See the brown to light blue transition in the upper right.
3 seems to have banding in the gradients. See the brown to light blue transition in the upper right. The high detail areas seem sharper.
4 is blockier in the bottom right again... looks like it would be extra noticeable in motion, but fixable with deband in ffdshow. I think there's a little more detail in the high noise areas... but it's hard to tell.
5 is nice and noise free, there are some underlying gradients that are more pronounced now, but looks reasonably nice otherwise. Probably rather high quality compared to the input. I'd imagine it would be a little flickery in motion.
6 has the green cast again. Blocky in the gradients. Just looks worse in general.

My pick would be 1, because it doesn't have any obvious flaws, and the background is detailed. 5 would be nice if there was a little more bits in the background too.

Selur
27th September 2007, 16:46
I go with:
1, 3, 5, 2, (4, 6)
same reason as Egladil

addit
27th September 2007, 16:56
My preference order:

1) Sharpest image - best foreground detail and best background grain retention.
3) Almost as good foreground as 1) but a subtly smoother/less grainy background.
2) Better background grain retention than 3) but a softer, smoother foreground hence worse than 3)
4) Slight blocking in background, foreground seems alright
6) Slightly more blocking in the background.
5) Worst blocking in the background...

Adam

honai
27th September 2007, 18:03
The original source was quite grainy, so all the grain shown is from the original.

Not the best source for this test, to be honest. The Chronicles of Riddick was one of the first A-list HD-DVD titles to use VC-1, and given the fact that they worked straight off the digital master this release is sub-par for various reasons:


grain retention is actually quite bad
slightly wrong levels, and hue is off as well
obviously the source material has been softened all over the place


Also, the VC-1 tool chain wasn't really mature when this title was produced.

A much better choice would be either one of (in no particular order):

Black Snake Moan
Casino Royale
The Departed
Apocalypto
Mission Impossible 3
Pirates of the Caribbean 2
X-Men 3


Otherwise you'd be only amplifying the problems of the source material.

Dark Shikari
27th September 2007, 18:05
Not the best source for this test, to be honest. The Chronicles of Riddick was one of the first A-list HD-DVD titles to use VC-1, and given the fact that they worked straight off the digital master this release is sub-par for various reasons:


grain retention is actually quite bad
slightly wrong levels, and hue is off as well
obviously the source material has been softened all over the place


Also, the VC-1 tool chain wasn't really mature when this title was produced.

A much better choice would be either one of (in no particular order):

Black Snake Moan
Casino Royale
The Departed
Apocalypto
Mission Impossible 3
Pirates of the Caribbean 2
X-Men 3


Otherwise you'd be only amplifying the problems of the source material.

It's from the DVD, not the HD-DVD :)

honai
27th September 2007, 18:11
Oops! ;-)

reepa
27th September 2007, 18:22
2 and 6 have the same greenish color cast. 2 retains most of the grain. 5 has obliterated the grain. 1 and 3 are similar - they get the second place in grain retention. 4 and 6 are slightly worse, 4 being better. My ranking from nicest to least nice: 2, 1/3, 4, 6, 5.

Mtz
27th September 2007, 18:41
1 and 3, or 3 and 1.

enjoy,
Mtz

fields_g
27th September 2007, 18:44
Image 3 is 720x352 instead of 720x356. Image 2 has a slight vertical shift down. Could this be fixed? I know this sounds really picky.


My RANKINGS:

1=3>2>4>6>5

Dark Shikari
27th September 2007, 18:53
Image 3 is 720x352 instead of 720x356. Image 2 have a slight vertical shift down. Could this be fixed? I know this sounds really picky.

I also noticed that 1 and 2 are darker than the others... I wonder if this points to different encoding....
3 has been fixed for you.

IgorC
27th September 2007, 19:09
How useful is it to compare still images?
There can be changes between frames during playback that nobody takes into account. A lot of issues would be invisible: flickering, motion that isn't smooth, etc...

Dark Shikari
27th September 2007, 19:11
How usefull is compare still images?
There can be changes between frames during playback those nobody don't get in count. A lot of issues would be invisible. Flickering, not smooth motion etc...
First of all, not all of the frames are from streams which I even have; though I won't post which is which yet, one of them is from the latest in-house version of the Ateme encoder, and for obvious reasons bobobolo refused to give me anything more than a screenshot :p

The motion is smooth on all of them as far as I can tell, none are particularly bad in terms of motion. We're mainly looking at lack of blocking, sharpness, grain retention, etc here.

foxyshadis
27th September 2007, 19:21
Proof, if anyone needed it, that I'm an undiscerning video consumer: I can't tell a difference between 1, 2, and 3, aside from that color cast. 4 is okay, I dislike the banding of 5 & 6 more than I dislike microblock artifacts.

Dark Shikari
27th September 2007, 19:23
Eh, I think there have been enough responses, I'll tell which is which.

This is a scene in which my new AQ (not the one in the other thread--yet another new algorithm) doesn't really shine that much, because the old AQ is already so good at dark/blue blocking artifacts.

1: My new AQ algorithm at full strength.
2: Original
3: Old AQ at full strength.
4: New AQ at half strength.
5: No AQ at all.
6: Ateme's latest in-house encoder with psyopts enabled

Since more people answered 1) than 2) as their best, I think I can declare 1 to be basically transparent and 3 probably pretty damn close.

Egladil
27th September 2007, 19:36
Hm why is the source smoother than the encodes? did you apply sharpening?

Dark Shikari
27th September 2007, 19:39
Hm why is the source smoother than the encodes? did you apply sharpening?

Nope, but I noticed this in a number of places throughout my encode--my AQ seems to actually sharpen the source slightly for some reason. Many edges seem slightly sharper.

It could be because x264 is rounding up some of the higher frequencies, due to the lower quantizer, and so it looks sharper.

addit
27th September 2007, 20:46
Alright enough showmanship! Give us the patch already so we can try it out for ourselves.:devil:

Adam

Daodan
27th September 2007, 20:54
Yeah, :stupid:

Nr 1 was gonna be my pick as well (heh, the higher quantizers reduced some of the blocks in bright areas from the source).
Also question, in the other AQ thread, was the source also untouched? Because the encode with old aq was visibly sharper than the source.

Dark Shikari
27th September 2007, 23:50
Yeah, :stupid:

Nr 1 was gonna be my pick as well (heh, the higher quantizers reduced some of the blocks in bright areas from the source).
Also question, in the other AQ thread, was the source also untouched? Because the encode with old aq was visibly sharper than the source.

Yup, untouched. All the sharpening is due solely to the rounding up of high-frequency coefficients, I think.

Here is the new AQ code; no modifications are required to any file other than analyze.c.

static int x264_subtract_dctq( int16_t dct[8][8], int16_t origdct[8][8], int dctweight[8][8] )
{
    int i, j, t = 0;
    for( i=0; i<8; i++ )
        for( j=0; j<8; j++ )
        {
            unsigned int s = (abs(dct[i][j] - origdct[i][j]) * x264_dct8_weight_tab[1+(i<<3)+j]);
            s = (s*s) / ((abs(origdct[i][j])+10)* x264_dct8_weight_tab[1+(i<<3)+j]);
            t += (s >> dctweight[i][j]);
        }
    return t;
}

static int x264_sum_dctq( int16_t dct[8][8], int dctweight[8][8] )
{
    int i, j, t = 0;
    for( i=0; i<8; i++ )
        for( j=0; j<8; j++ )
        {
            t += (abs(dct[i][j]) * x264_dct8_weight_tab[1+(i<<3)+j]) >> dctweight[i][j];
        }
    return t;
}

/*****************************************************************************
 * x264_adaptive_quant:
 * check if mb is "flat", i.e. has most energy in low frequency components, and
 * adjust qp down if it is
 *****************************************************************************/
void x264_adaptive_quant( x264_t *h, x264_mb_analysis_t *a )
{
    DECLARE_ALIGNED( static uint8_t, zero[FDEC_STRIDE*8], 16 );
    DECLARE_ALIGNED( int16_t, dct[8][8], 16 );
    DECLARE_ALIGNED( int16_t, origdct[8][8], 16 );
    int dctweight[8][8];
    int i,j,k;
    float fc;
    int total = 0;
    int qp = h->mb.i_qp, qp_adj;
    int weightqp[16] = {0,0,qp/50,qp/40,qp/30,qp/20,qp/15,qp/12,qp/10,qp/8,qp/6,qp/5,qp/4,qp/3,qp/2,qp};
    for( i=0; i<8; i++ )
        for( j=0; j<8; j++ )
            dctweight[i][j] = weightqp[i+j];
    for( i=0; i<4; i++ )
    {
        h->dctf.sub8x8_dct8( dct, h->mb.pic.p_fenc[0] + (i&1)*8 + (i>>1)*FENC_STRIDE, zero );
        for( j=0; j<8; j++ )
            for( k=0; k<8; k++ )
                origdct[j][k]=dct[j][k];
        h->quantf.quant_8x8( dct, h->quant8_mf[CQM_8IY][qp], h->quant8_bias[CQM_8IY][qp] );
        h->quantf.dequant_8x8( dct, h->dequant8_mf[CQM_8IY], qp );
        total += x264_subtract_dctq(dct,origdct,dctweight);
    }
    x264_cpu_restore( h->param.cpu );
    fc = expf(-1e-6 * total);
    qp_adj = (int)(qp * ((2.0 * h->param.analyse.f_aq_strength / pow(2 - fc, h->param.analyse.f_aq_sensitivity))));
    int newQP = qp - qp_adj;
    h->mb.i_qp = a->i_qp = X264_MAX(X264_MIN(X264_MAX(X264_MIN(newQP,3*qp/2),0),h->param.rc.i_qp_max),h->param.rc.i_qp_min);
    h->mb.i_chroma_qp = i_chroma_qp_table[x264_clip3( h->mb.i_qp + h->pps->i_chroma_qp_index_offset, 0, 51 )];
    a->i_lambda = i_qp0_cost_table[h->mb.i_qp];
    a->i_lambda2 = i_qp0_cost2_table[h->mb.i_qp];
}

Note that my weights, while obviously an educated guess (higher QP = higher frequencies matter less, lower QP = higher frequencies matter more), are just that, a guess. If someone can come up with a good way to better choose weights, my AQ could get considerably better.
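To see what those guessed weights look like for a given QP, the table can be built on its own. This is only a sketch: the helper name build_dct_weights is made up here, and the sixteen entries are copied from the posted weightqp[] list.

```c
/* Fill an 8x8 table of right-shift amounts from the posted weightqp[] list.
 * The helper name build_dct_weights is hypothetical, not part of x264. */
void build_dct_weights( int qp, int dctweight[8][8] )
{
    /* Same 16 entries as the posted code; i+j only reaches indices 0..14. */
    const int weightqp[16] = { 0, 0, qp/50, qp/40, qp/30, qp/20, qp/15, qp/12,
                               qp/10, qp/8, qp/6, qp/5, qp/4, qp/3, qp/2, qp };
    int i, j;
    for( i = 0; i < 8; i++ )
        for( j = 0; j < 8; j++ )
            dctweight[i][j] = weightqp[i+j];   /* constant along each anti-diagonal */
}
```

At qp = 26, for example, the lowest frequencies get a shift of 0 and the highest coefficient (i = j = 7) gets a shift of 13, so its error term is divided by 2^13 before being summed.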

Here's how it works:

1. Get the DCT of the current 8x8 block.
2. Quantize, and then dequantize, the DCT from 1). Call it DCT2.
3. Calculate (abs( DCT2(i) - DCT(i) ) ^ 2) / (abs( DCT(i) ) + 10) and then right-shift by the appropriate weight value. Do this for each of the 64 coefficients and add them up.
4. Do the above for all 4 8x8 blocks and add up the results.
5. This total is the weight used for the AQ algorithm. At this point very little is changed from the original AQ except various constants.
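Steps 1 to 3 can be tried outside x264 with a toy quantizer. The sketch below is not the posted code: toy_requantize is a hypothetical stand-in that rounds to the nearest multiple of a step size (x264's real quantizer uses per-coefficient tables and a deadzone), and the x264_dct8_weight_tab factor is left out.

```c
#include <stdlib.h>

/* Hypothetical stand-in for quantize + dequantize: round to the nearest
 * multiple of step. */
int toy_requantize( int coef, int step )
{
    int sign = coef < 0 ? -1 : 1;
    return sign * ( ( abs( coef ) + step / 2 ) / step ) * step;
}

/* Steps 1-3 for one 8x8 block of DCT coefficients: squared quantization
 * error divided by the original magnitude (+10 as in the posted code),
 * then shifted right by a per-coefficient weight and summed. */
int block_flatness_error( int dct[8][8], int step, int shift[8][8] )
{
    int i, j, total = 0;
    for( i = 0; i < 8; i++ )
        for( j = 0; j < 8; j++ )
        {
            int diff = toy_requantize( dct[i][j], step ) - dct[i][j];
            int s = ( diff * diff ) / ( abs( dct[i][j] ) + 10 );
            total += s >> shift[i][j];
        }
    return total;
}
```

With a step of 16, a lone coefficient of 7 is rounded to 0 and a lone coefficient of 199 is rounded to 192. Both lose 7, but the first contributes 49/17 = 2 and the second 49/209 = 0, which is the "flat blocks are penalised more" behaviour the steps describe.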

ToS_Maverick
28th September 2007, 00:13
ha, i knew i was right ;)

@Dark Shikari:
what's the difference of your latest AQ version, and the version i tested and made screenshots with?

the reason i ask is, that i was quite satisfied with the version i tested, cause it prevented the slightest details in the background from getting quantized away. the only downside was, that it got slightly blocky at high contrast areas.

lexor
28th September 2007, 00:27
As you might remember from our little discussion in the new AQ thread, I like clear picture with sharp details.

1 = blurry mess, especially around edges, area of the left of the ball is too dark
2 = slight improvement on 1, especially on the ball, but not much
3 = has the greatest sharpness, provides greatest detail = best picture perceptually (I'm not claiming it's original, just that it looks the best). Overall the best grain-feel to it
4 = very close to 3, but bottom right is a bit oversmoothed, same darkness problem on left as 1
5 = almost as good as 3, but ball is slightly blurrier, detail on the mesh to the left of the arm that holds the ball is slightly shifted (shifted noise, not better or worse than 3), ridges to the right of ball blurrier, too much blur in bottom right area
6 = same as 5 on the figure, but more blur on ball and ridges right of ball, and too much colour separation in bottom right again, almost like sharp gradient.

So 3 >> 6 > 5 > 4 > 2 > 1

>> meaning it's much better than the rest, no contest if I had to pick one.

/EDIT: oops didn't notice second page before posting, I guess contest is over :( No offense, you do great work, but I hope this patch doesn't get committed then, since 3 is old AQ

/EDIT2: I can't believe people are picking #1, it's so incredibly blurry, do you all like that?

Dark Shikari
28th September 2007, 00:30
ha, i knew i was right ;)

@Dark Shikari:
what's the difference of your latest AQ version, and the version i tested and made screenshots with?

the reason i ask is, that i was quite satisfied with the version i tested, cause it prevented the slightest details in the background from getting quantized away. the only downside was, that it got slightly blocky at high constrast areas.
The old version had some issues in high contrast areas that could not be solved with the algorithm it used.

My current method is probably better at both the background and on high-detail areas, because it doesn't raise quants. I found that raising quants is generally a bad idea when using a DCT algorithm, because DCT itself cannot be guaranteed to find the quants that "should" be raised, and one or two mistakes give you one or two horrible blocks.

Dark Shikari
28th September 2007, 01:13
Also, frame sizes:

1: 26k
2: Original
3: 26k
4: 20k
5: 18k
6: 12k

They're not very meaningful without knowing the sizes of nearby frames, of course (if Ateme used fewer B-frames, for example, that would explain the lower P-frame size), but they're both a good comment on the ratecontrol ability of the encoder and the efficiency.

Terranigma
28th September 2007, 01:20
/EDIT2: I can't believe people are picking #1, it's so incredibly blurry, do you all like that?
Better to be smooth than to be sharp and noisy.

*.mp4 guy
28th September 2007, 01:37
Nope, but I noticed this in a number of places throughout my encode--my AQ seems to actually sharpen the source slightly for some reason. Many edges seem slightly sharper.

It could be because x264 is rounding up some of the higher frequencies, due to the lower quantizer, and so it looks sharper.

I don't believe this is the case, I have noticed many times that blurring can actually lead to a greater perception of sharpness; generally what appears to happen is that when you remove small details/gradients, but leave the big ones, the big details look more prominent and, thus, sharper. Psychovisually, a detail on a flat background is more noticeable than a detail on a detailed background.

Deen can sometimes have this effect, as can most masked/thresholded blurs, X264 always exhibits this to some extent, with high deadzones + the flat matrix being the most obvious.

Dark Shikari
28th September 2007, 01:46
I don't beleive this is the case, I have noticied many times that bluring can actually lead to a greater perception of sharpness; generally what apears to happen is that when you remove small details/gradients, but leave the big ones, the big details look more prominent and, thus, sharper. Psychovisually, a detail on a flat background is more noticible then a detail on a detailed background.

Deen can sometimes have this effect, as can most masked/thresholded blurs, X264 always exhibits this to some extent, with high deadzones + the flat matrix being the most obvious.

That is an interesting idea...

Another thought I had: does it cost more bits to retain grain, or to remove that grain, add dither, and use ridiculously low deadzone settings (adaptive deadzone) to retain that dither?

desta
28th September 2007, 01:54
/EDIT2: I can't believe people are picking #1, it's so incredibly blurry, do you all like that?
I can't believe you think 3 and 5 look the same. No offense, but are you looking at the same pics as everyone else?

3. http://img63.imageshack.us/img63/3974/93797124ps4.png

5. http://img295.imageshack.us/img295/159/94900362yd7.png

morph166955
28th September 2007, 02:17
Here is the new AQ code; no modifications are required to any file other than analyze.c.


Any chance of getting a diff off of r680?

Dark Shikari
28th September 2007, 02:52
Any chance of getting a diff off of r680?

I'll get around to it later--the code has no changes in that area between r676 and 680 as far as I know, so it should be a simple copy-paste job.

Sagekilla
28th September 2007, 03:36
6 seems to have changed the color in the background. 4 and 5 have banding. 2 seems to have a very small loss in quality.

Aside from that I can discern no other differences.

lexor
28th September 2007, 04:31
I can't believe you think 3 and 5 look the same. No offense, but are you looking at the same pics as everyone else?

Where did I say they are the same? I said 5 is almost as good, it has nothing to do with being the same or not. And the comment was mostly meant for the objects not for backgrounds (in which case 3 is much better than 5 and 6 as I noted)

As for:
Better to be smooth than to be sharp and noisy.
I do want a little noise to replace grain that would be there in any decent source (and killed off by filter to help encode better) motion will alleviate the little noise anyway. And I take sharp and grainy over smooth and blurry any day, since smoothness itself is an artifact of upscaling DVDs (SD footage in general) to high monitor resolutions.

woah!
28th September 2007, 05:01
can i ask how you came up with 720x356 ??? image seems stretched vertically no???

my hd version is 2.35:1 ratio when cropped.... 720x306 ?

fields_g
28th September 2007, 05:30
can i ask how you came up with 720x356 ??? image seems streched vertically no???

my hd version is 2:35:1 ratio when cropped.... 720x306 ?

Your HD version has a PAR of 1:1. This source was an anamorphic DVD. I'm not exactly sure but the dvd decoder would either:
a) squeeze this image vertically down to 720x306 on playback
or
b) stretch it horizontally to 836x356 on playback

Either way you get 2.35:1. The aspect is off in this sample because the frame was left unscaled, which eliminates the stretch as a source of the visual problems.
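The two playback options reduce to simple arithmetic. A minimal sketch, assuming a 2.35:1 target and truncation to whole pixels; the function names are made up for illustration and a real player may round differently.

```c
/* Width needed when the stored height is kept and the image is stretched
 * horizontally to reach the display aspect ratio dar. */
int stretched_width( int height, double dar )
{
    return (int)( height * dar );
}

/* Height needed when the stored width is kept and the image is squeezed
 * vertically to reach the display aspect ratio dar. */
int squeezed_height( int width, double dar )
{
    return (int)( width / dar );
}
```

For the 720x356 crop, stretched_width( 356, 2.35 ) gives 836 and squeezed_height( 720, 2.35 ) gives 306, matching the two sizes mentioned above.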

DeathTheSheep
28th September 2007, 05:55
6: Ateme's latest in-house encoder with psyopts enabled
...at 12kb/frame. Hmm. Something tells me their psychovisual system only produces its desired effects at framerates > 10fps (meaning not so good when you actually freeze-frame it). Just a little reminder from 2004's beta...

And by latest do you mean their H264 Encoder Suite or their newer Ateme File Encoder (broadband)?
Also, maybe Mainconcept Reference Pro should've been given a shot. ;)

Dark Shikari
28th September 2007, 06:31
...at 12kb/frame. Hmm. Something tells me their psychovisual system only produces its desired effects at framerates > 10fps (meaning not so good when you actually freeze-frame it). Just a little reminder from 2004's beta...

And by latest do you mean their H264 Encoder Suite or their newer Ateme File Encoder (broadband)?
Also, maybe Mainconcept Reference Pro should've been given a shot. ;)

By latest, I mean the latest experimental in-house version that is so sekrit that I couldn't even get more than a single frame and its size and type in terms of information.

Gabriel_Bouvigne
28th September 2007, 08:37
...at 12kb/frame. Hmm. Something tells me their psychovisual system only produces its desired effects at framerates > 10fps (meaning not so good when you actually freeze-frame it). Just a little reminder from 2004's beta...
I'm not objecting to the fact that a single frame could be a bad quality indicator of a video (what about stability?), but you are not seriously expecting the psychovisuals to be identical/close to the 2004 ones?

Dark Shikari
28th September 2007, 08:40
Also, though there's still tons of tuning to do before I post anything... adaptive deadzone has been implemented :p

Gabriel_Bouvigne
28th September 2007, 08:51
1: My new AQ algorithm at full strength.
2: Original
3: Old AQ at full strength.


Since more people answered 1) than 2) as their best, I think I can declare 1 to be basically transparent and 3 probably pretty damn close.

Transparency is not a matter of preference, but the fact that you can not spot differences between the source and modified/encoded version.
If people said that they like 1) better than 2), then it means that it's not transparent (in the context of a single, non moving frame), but you have no way to assess transparency by this test.

Dark Shikari
28th September 2007, 08:51
Transparency is not a matter of preference, but the fact that you can not spot differences between the source and modified/encoded version.
If people said that they like 1) better than 2), then it means that it's not transparent (in the context of a single, non moving frame), but you have no way to assess transparency by this test.

Probably, though to me I actually cannot tell the difference without extremely close inspection.

There are some other scenes with cases in which it is clearly not 100% transparent; I'm working on them.

Wishbringer
28th September 2007, 09:43
With a good TFT monitor with high contrast you can easily see the differences between 1 and 2.
So it's not 100% transparent, but none of the other pictures are either.
1 comes far nearer to 2 than any other picture.
On a CRT monitor you really have to search for differences.

lexor
28th September 2007, 14:57
Probably, though to me I actually cannot tell the difference without extremely close inspection.

There are some other scenes with cases in which it is clearly not 100% transparent; I'm working on them.

You kidding right? 2 is waaay brighter than 1, I'd call changing the contrast of the image a pretty big screw up. I'm beginning to think people got bad monitors around here (or at least poorly calibrated).

Wishbringer
28th September 2007, 15:23
You kidding right? 2 is waaay brighter than 1, I'd call changing the contrast of the image a pretty big screw up. I'm beginning to think people got bad monitors around here (or at least poorly calibrated).

So people see different results:
1 and 2 are both at same darkness level for me.
3 to 6 are way brighter for me

And I have a S-PVA Panel and not a TN+Film one.

fields_g
28th September 2007, 15:54
I thought there was a really big change in brightness/contrast also BUT.... I found the problem was my own machine.
Under IE, 1 and 2 were darker at first. Later, under IE, 1-3 were darker. This made me suspicious. Loaded up firefox and all were the same. I don't know if this phenomenon is what is happening to you guys also.

Dark Shikari
28th September 2007, 16:03
Any overall brightness difference is entirely coincidental or my mistake.

foxyshadis
28th September 2007, 17:22
Well, the original DVD snap appears to have been screenshotted using another color matrix, so it comes out a bit off. (Actually I just checked, it looks the same as the others when using 601->709 conversion.)
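Why a matrix mismatch shows up as a colour shift can be checked numerically. The sketch below decodes the same Y'CbCr sample with the BT.601 luma coefficients (Kr = 0.299, Kb = 0.114) and the BT.709 ones (Kr = 0.2126, Kb = 0.0722). It assumes normalised full-range values (y in 0..1, cb and cr in -0.5..0.5) with no studio-range offsets, and the type and function names are made up for this example.

```c
typedef struct { double r, g, b; } rgb_t;

/* Decode one normalised Y'CbCr sample using the luma coefficients kr and kb
 * of the chosen matrix. */
rgb_t ycbcr_to_rgb( double y, double cb, double cr, double kr, double kb )
{
    double kg = 1.0 - kr - kb;
    rgb_t out;
    out.r = y + 2.0 * ( 1.0 - kr ) * cr;
    out.b = y + 2.0 * ( 1.0 - kb ) * cb;
    /* Green follows from y = kr*r + kg*g + kb*b. */
    out.g = ( y - kr * out.r - kb * out.b ) / kg;
    return out;
}
```

A neutral grey (cb = cr = 0) decodes identically under both matrices, but a sample with cb = cr = -0.1 comes out with visibly more green under the 601 coefficients than under the 709 ones, so only coloured areas shift.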

DeathTheSheep
29th September 2007, 04:31
Just as an aside, what movie was that screenshot taken from? Actually, what the heck is in that frame? A tower? Grrr...

Dark Shikari
29th September 2007, 04:36
Just as an aside, what movie was that screenshot taken from? Actually, what the heck is in that frame? A tower? Grrr...

It's from Chronicles of Riddick, the intro sequence. It's one of the alien guys grabbing some sword thing.

DeathTheSheep
29th September 2007, 04:39
Aha! I thought it was some alien and some staff/sword, but then I second-guessed myself... :p.

Well anyway, screen 1 looks the best to me, too, so congratulations on the AQ even though it doesn't apply to my INSANE-ly low bitrates. :D

...probably.

Dark Shikari
29th September 2007, 05:04
Aha! I thought it was some alien and some staff/sword, but then I second-guessed myself... :p.

Well anyway, screen 1 looks the best to me, too, so congratulations on the AQ even though it doesn't apply to my INSANE-ly low bitrates. :D

...probably.

I could definitely make a "low-bitrate" AQ, but it would have to be totally different. Here's one idea I had:

1. In the first pass, store the locations of the macroblocks that use up the most bits, and how many bits they use relative to the rest of the frame.
2. In the second pass, raise the quants on these blocks proportional to the amount of bits they used the first time around.

At low bitrates, you can deal with losing quality in some places if it means greater quality everywhere else.
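One way the two-pass idea could be sketched is below. None of this comes from an actual patch: the function names, the strength knob, and the choice to penalise every macroblock above the frame average (rather than only the top few) are all assumptions made for illustration.

```c
/* First pass: mean bits per macroblock in a frame (0 for an empty frame). */
int mean_bits( const int *bits, int n )
{
    long sum = 0;
    int i;
    for( i = 0; i < n; i++ )
        sum += bits[i];
    return n > 0 ? (int)( sum / n ) : 0;
}

/* Second pass: raise the QP of a macroblock in proportion to how far its
 * first-pass cost exceeded the frame average. strength is a hypothetical
 * tuning knob: extra QP per multiple of the average. Clamped to 51. */
int second_pass_qp( int base_qp, int mb_bits, int avg_bits, int strength )
{
    int qp = base_qp;
    if( avg_bits > 0 && mb_bits > avg_bits )
        qp += strength * ( mb_bits - avg_bits ) / avg_bits;
    return qp > 51 ? 51 : qp;
}
```

With strength 2, a macroblock that cost three times the average in the first pass would go from QP 26 to QP 30, while average or cheaper macroblocks keep their QP.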

RaynQuist
29th September 2007, 18:28
unsigned int s = (abs(dct[i][j] - origdct[i][j]) * x264_dct8_weight_tab[1+(i<<3)+j]);
s = (s*s) / ((abs(origdct[i][j])+10)* x264_dct8_weight_tab[1+(i<<3)+j]);
t += (s >> dctweight[i][j]);


Optimization ideas:

int s = (dct[i][j] - origdct[i][j]);
s = (s*s) / (abs(origdct[i][j])+10) * x264_dct8_weight_tab[1+(i<<3)+j];
t += (s >> dctweight[i][j]);
1. abs before squaring is redundant, unless you need that s to be unsigned. What's the range of DCT coefficients? Can they be negative?
2. One factor of x264_dct8_weight_tab cancels out mathematically: it is squared in the numerator and appears once in the denominator.


As far as the algorithm goes, this doesn't make sense to me:
difference^2 / original

If you're scaling by the original I would do:
(difference / original)^2

I would also argue that the difference doesn't need to be scaled by the original. I think 3->0 is as much of an error as 203->200. So I say just do:
difference^2
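The three candidates can be compared directly on the 3->0 and 203->200 cases. A small sketch with made-up function names; bias plays the role of the +10 in the posted code and also keeps the divisions defined when the original coefficient is 0.

```c
/* Magnitude helper so the sketch needs no library headers. */
double magnitude( double x )
{
    return x < 0 ? -x : x;
}

/* difference^2 / original */
double err_sq_over_orig( double orig, double recon, double bias )
{
    double d = recon - orig;
    return d * d / ( magnitude( orig ) + bias );
}

/* (difference / original)^2 */
double err_relative_sq( double orig, double recon, double bias )
{
    double d = ( recon - orig ) / ( magnitude( orig ) + bias );
    return d * d;
}

/* difference^2 */
double err_sq( double orig, double recon )
{
    double d = recon - orig;
    return d * d;
}
```

With bias 0, the 3->0 case scores 3, 1 and 9 under the three measures, while 203->200 scores about 0.044, about 0.0002 and 9. Only the plain squared difference treats the two errors as equal; both scaled versions weigh the error on the small coefficient far more heavily.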


Also, about weights:


int weightqp[16] = {0,0,qp/50,qp/40,qp/30,qp/20,qp/15,qp/12,qp/10,qp/8,qp/6,qp/5,qp/4,qp/3,qp/2,qp};
for( i=0; i<8; i++ )
for( j=0; j<8; j++ )
dctweight[i][j] = weightqp[i+j];


1. Max of i+j is 14, so your weightqp has 1 extra number.
2. You can just have dctweight be a constant and multiply by qp when you use it so you don't have to recompute (as much) every time the function is called.
3. Do we need to multiply weight by qp? aq_strength already multiplies qp when calculating qp_adj; I don't think aq_sensitivity should scale with qp.
4. I don't think shifting by the weight gives enough precision and I think it's too aggressive. I say multiply by the weights and just tweak the weights to achieve your intended scaling curve.

Dark Shikari
29th September 2007, 21:04
I would also argue that the difference doesn't need to be scaled by the original. I think 3->0 is as much of an error as 203->200. So I say just do:
difference^2

No, this is an important tenet of the algorithm--a flat block has much higher error relative to its actual frequency values. The same error is much more tolerable in a "non-flat" block.


Also, about weights:

1. Max of i+j is 14, so your weightqp has 1 extra number.
2. You can just have dctweight be a constant and multiply by qp when you use it so you don't have to recompute (as much) everytime the function is called.
3. Do we need to multiply weight by qp? aq_strength already multiplies qp when calculating qp_adj; I don't think aq_sensitivity should scale with qp.
4. I don't think shifting by the weight gives enough precision and I think it's too aggresive. I say multiply by the weights and just tweak the weights to achieve your intended scaling curve.
Yeah, the weights are completely arbitrary and should really be redone based on some actual thought/testing.

Not sure what you mean by 3).

2) can't be done because the QP can change from frame to frame, so I'd have to at least redo it for each frame.

Otherwise those are definitely some good optimizations.

RaynQuist
30th September 2007, 05:16
No, this is an important tenant of the algorithm--a flat block has much higher error relative to its actual frequency values. The same error is much more tolerable in a "non-flat" block.

I see what you're doing now and I like it.


2) can't be done because the QP can change from frame to frame, so I'd have to at least redo it for each frame.

I meant something like:

int weightqp[16] = {0,0,1/50,1/40,1/30,1/20,1/15,1/12,1/10,1/8,1/6,1/5,1/4,1/3,1/2,1};
for( i=0; i<8; i++ )
for( j=0; j<8; j++ )
dctweight[i][j] = weightqp[i+j];

static int x264_subtract_dctq()
{
...
t += (s * dctweight[i][j]);
...
}

void x264_adaptive_quant()
{
...
for()
{
total += x264_subtract_dctq()
}
total *= qp;
...
}

If we multiply by weight instead of shifting then qp can be factored out. Of course weightqp needs to be changed so it's not all zero.
Another idea is that since we only have 52 qp's we can just precompute them all.
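The precompute-everything variant might look like the sketch below, which keeps the original shift-based weights rather than switching to multiplication. The table and function names are made up, and the entries are the posted weightqp[] values without the unreachable sixteenth one.

```c
#define QP_COUNT   52   /* H.264 QPs 0..51 */
#define DIAG_COUNT 15   /* i+j ranges over 0..14 in an 8x8 block */

/* Shift amount for every (qp, i+j) pair, filled once at startup. */
int shift_by_qp[QP_COUNT][DIAG_COUNT];

void init_shift_table( void )
{
    int qp, d;
    for( qp = 0; qp < QP_COUNT; qp++ )
    {
        const int w[DIAG_COUNT] = { 0, 0, qp/50, qp/40, qp/30, qp/20, qp/15,
                                    qp/12, qp/10, qp/8, qp/6, qp/5, qp/4,
                                    qp/3, qp/2 };
        for( d = 0; d < DIAG_COUNT; d++ )
            shift_by_qp[qp][d] = w[d];
    }
}
```

The per-macroblock loop would then read shift_by_qp[qp][i+j] instead of rebuilding dctweight on every call.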


Not sure what you mean by 3).

Using qp to influence "how much aq this block needs" is already done here:

qp_adj = (int)(qp * ((2.0 * h->param.analyse.f_aq_strength / pow(2 - fc, h->param.analyse.f_aq_sensitivity))));

I'm asking if we really need to use qp in the weights also, because then we'd be scaling by qp twice.
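Written out, the quoted line and the fc line before it in the posted code amount to the following. This is only a restatement; the symbols T for total, s for f_aq_strength and σ for f_aq_sensitivity are shorthand introduced here, and the floor assumes the value is non-negative so that the (int) cast truncates downwards.

```latex
f_c = e^{-10^{-6}\,T}, \qquad
\mathrm{qp\_adj} = \left\lfloor \mathrm{qp}\cdot\frac{2s}{(2 - f_c)^{\sigma}} \right\rfloor
```

Assuming T is non-negative, f_c lies in (0, 1], so for positive σ the divisor runs from 1 (when T = 0) towards 2^σ (when T is large). qp multiplies the whole expression here, and it also enters T through the shift weights, which is the double scaling in question.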

And I forgot something:

for( j=0; j<8; j++ )
for( k=0; k<8; k++ )
origdct[j][k]=dct[j][k];

This can use a memcopy or something.
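A sketch of the memcpy version, wrapped in a hypothetical helper so it can be tried on its own. The byte count is spelled out because sizeof on an array parameter gives the size of a pointer; inside x264_adaptive_quant, where origdct is a local array, sizeof(origdct) would do.

```c
#include <stdint.h>
#include <string.h>

/* Copy one 8x8 block of coefficients in a single call instead of a
 * double loop. The helper name is made up for this example. */
void copy_dct8x8( int16_t dst[8][8], int16_t src[8][8] )
{
    memcpy( dst, src, 8 * 8 * sizeof(int16_t) );
}
```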

Dark Shikari
30th September 2007, 06:51
The reason I used QP in the weights was I *guessed* that when encoding at lower QPs, you'd need to take into account the high frequency components of the source more than at high QPs.

As I said, the weights need actual testing, and perhaps some input from someone with a lot of experience with DCT encoding.

kad77
30th September 2007, 09:02
Hello all, I've read through this thread, and perhaps I missed someone saying this...

#1 has *very* noticeably decreased luminosity in at least the upper left of the frame compared to #2 and #3. The decreased brightness is not uniform to all of #1, but this color/brightness distortion would need to be fixed before switching away from the old algorithm.

FWIW, I liked 2,(3/1),4,5,6 before I read the results, although #1 dealt with grain better than #3 except for the darkening of the image.

#3 seemed to enhance the grain too much, and #1 kept edges defined while subtlety lessening grain (probably luminosity tampering again). #4 artificially increased brightness in the lower part that was originally dark, which is a distortion in the other direction of #1 and negatively affected the scene.

#5 blocky, blah, #6 seemed very low bitrate + blocky, and it turned out in fact that it was.

Nice job, keep up the good work! I think your new algo has promise.

Sagittaire
30th September 2007, 11:11
Dark Shikari, you can write a patch for your new AQ ...

addit
12th October 2007, 22:06
Any news on this? I've just encountered a grain heavy source that would be perfect to test this with .:)

Dark Shikari
12th October 2007, 22:12
Any news on this? I've just encountered a grain heavy source that would be perfect to test this with .:)

I've been quite busy with real-life work lately, though I've been working on my GrainOptimizer some.

Perhaps in a bit I'll get back to working on my AQ and optimizing it for grain, and not just truly flat areas.

Ranguvar
14th October 2007, 03:52
They are close enough that I don't see the point - only after staring at it for many minutes can I discern any difference at all, so there's no need to be picky during encoding, go with the fastest :)