3+ - pass encoding for XviD [Archive]

molerus

19th February 2003, 11:26

Hi!

I must admit that I'm sometimes not fully satisfied with the results of 2-pass encoding. Recently I encoded "The Terminator", and while some scenes look perfect, others are shitty. The thing is that these shitty scenes are neither low- nor high-bitrate; they are just average. While some scenes can be compressed more (by up to 50%), others need almost 100% of the original size.
I'm thinking about writing an AviSynth filter that would compare two clips (i.e. before and after encoding) and compute the "shittiness" of each frame. The results would be stored in an external file and afterwards treated the way NanDub treats luminance noise: the tougher the frame is to encode, the lower the LumaNoise. They could then be put into XviD's stats file, and the bitrate curve could be scaled again using GKnot.
My question is what the numerical criteria of "shittiness" should be: the difference between the original and the encoded frame, or just the visual quality of the output? I know that NanDub does a similar thing, called Anti-Shit, but frankly I don't know how it works.
If someone could give me hints on how to assess the resulting video, I would appreciate it. :p

NuclearFusi0n

19th February 2003, 11:41

PSNR?

molerus

19th February 2003, 15:11

Yes, I've heard about it, but what is it? I know you will say Peak Signal-to-Noise Ratio, but what does it tell us? Where can I learn more about it?

Selur

19th February 2003, 15:27

dude use google and you'll find things like:
http://osl.iu.edu/~tveldhui/papers/MAScThesis/node18.html
http://bmrc.berkeley.edu/courseware/cs294/fall97/assignment/psnr.html
...
and here's a tool to compare the psnr of 2 avi files:
http://www.vsofts.com/codec/codec_psnr.html
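
For readers who just want to see what the number measures, here is a minimal sketch of the usual PSNR formula for 8-bit samples. It is not taken from any of the tools linked above; the function name `psnr` and the tiny 2x2 frames are made up for illustration.

```python
import math


def psnr(original, encoded, peak=255):
    """PSNR in dB between two equally sized frames given as flat lists of samples."""
    if not original or len(original) != len(encoded):
        raise ValueError("frames must be non-empty and of equal size")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, encoded)) / len(original)
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)


# Hypothetical 2x2 luma frames, flattened row by row.
src = [16, 128, 200, 235]
enc = [18, 126, 201, 235]
print(round(psnr(src, enc), 2))
```

A higher value means a smaller average squared error. The value is purely numerical and carries no model of what the eye notices.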

int 21h

19th February 2003, 20:24

Many years ago, I was very interested in integrating PSNR measurements into a tool called M4C. (Back before Nandub) Eventually I abandoned the project, because I realized all too soon that PSNR was not an exact representation of perceived quality. That is why the Anti-Shit feature of Nandub was used very sparingly to detect only frames that were totally foobar'd, it wasn't meant to be a catch all for bad quality.

So what can be used? It's hard to say. It's not exceedingly difficult to model a human eye, but it is very difficult to model human perception; perception is what gives you the ability to distinguish good quality from bad, pretty girl from ugly. For now, we can only hope to embed a sort of common sense into the algorithm (perhaps combining criteria such as luma, compression quant, bitrate, etc.).

Liisachan

20th February 2003, 05:40

There's another insane tool, called ERRCHK,
which will read 2 AVS files (usually one for a Huffyuv source and one for the compressed one), and will compare Y U V for ALL pixels in ALL frames.

Its evaluation is not psychological but quantitative.
It will simply check how much the compressed one is "wrong."

Sample output:

            Y                  U                  V
Error 2   : 113725 (32.907%),  75240 (43.542%),   61341 (35.498%),
Error 4   :  27937 (8.084%),   17180 (9.942%),     9745 (5.640%),
Error 8   :   2617 (0.757%),    1509 (0.874%),      673 (0.390%),
...
Error 128 :      0 (0.000%),       0 (0.000%),        0 (0.000%),

You can get this kind of output for each frame, for average, for the "worst" frame, etc.

This calculation may be too time-consuming to use like m4c or anti-shit in Nandub, but you can use it to evaluate quantitatively
which settings (or which codec) is better than the other--
you'll only have to run ERRCHK twice, once for Sample A vs. the source and once for Sample B vs. the source.

This comparison can be informative as a performance test,
but it cannot evaluate psychological "quality/shitness."
A compressed frame with a higher error might look higher quality to human eyes for psychological reasons.
Although this method may be too obvious, it's another way "to assess resulting video" anyway :)

Unfortunately, there's only Japanese version atm:
http://www.geocities.co.jp/SiliconValley-Sunnyvale/3109/
http://www.geocities.co.jp/SiliconValley-Sunnyvale/3109/errchk02.zip
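
A rough sketch of this kind of per-plane error count, for readers who cannot run the tool. This is not ERRCHK's code: it assumes that an "Error N" row counts samples whose absolute error is at least N, which is only a guess from the sample output, and the function name and test data are invented.

```python
def error_counts(original, encoded, thresholds=(2, 4, 8, 128)):
    """Count samples whose absolute error is >= each threshold.

    Returns {threshold: (count, percent_of_all_samples)} for one plane.
    The ">=" semantics are an assumption, not ERRCHK's documented behaviour.
    """
    if not original or len(original) != len(encoded):
        raise ValueError("planes must be non-empty and of equal size")
    diffs = [abs(a - b) for a, b in zip(original, encoded)]
    result = {}
    for t in thresholds:
        count = sum(1 for d in diffs if d >= t)
        result[t] = (count, 100.0 * count / len(diffs))
    return result


# Hypothetical Y plane of 8 samples.
y_src = [100, 100, 100, 100, 100, 100, 100, 100]
y_enc = [100, 101, 102, 103, 104, 108, 109, 100]
for threshold, (count, percent) in error_counts(y_src, y_enc).items():
    print(f"Error {threshold:3d} : {count} ({percent:.3f}%)")
```

Running it once per plane (Y, U, V) and per frame would give a table of the same shape as the sample above.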

yaz

20th February 2003, 10:34

hi molerus!

Originally posted by molerus

The thing is that these shitty scenes are neither low- nor high-bitrate; they are just average. While some scenes can be compressed more (by up to 50%), others need almost 100% of the original size.

the final size of a single frame tells not too much. anyway, the values u dropped prompts me a kinda 'compressibility problem'. they seem to be too high even if u go for 2cd. how much is the effective bits per pixel value & how much is compression (say, given by gknot) ? maybe, the problem's there.

Originally posted by molerus

I'm thinking about writing an AviSynth filter that would compare two clips (i.e. before and after encoding) and compute the "shittiness" of each frame. The results would be stored in an external file and afterwards treated the way NanDub treats luminance noise: the tougher the frame is to encode, the lower the LumaNoise. They could then be put into XviD's stats file, and the bitrate curve could be scaled again using GKnot.

luma noise could be detected even in the 1st pass & imho, it'd be very useful. i suggested it to the xvid devels, but it was refused as 'xvid works in a different way'. however, some bytes are (were?) reserved for this info in the stat file, but it's never been implemented. i luved this option in nandub (not shitframe but lumanoise correction!). it helped me outta the mud many times.

Originally posted by molerus

My question is what the numerical criteria of "shittiness" should be: the difference between the original and the encoded frame, or just the visual quality of the output? I know that NanDub does a similar thing, called Anti-Shit, but frankly I don't know how it works.

my question is what do u mean by 'shit' xactly? why don't u like those frames? imho, u should define the term 'shitness'. is it 'blocking', 'tiling', 'smeared details' or what? imho, without it we can't really help.
anyway, encoding that movie is not a joy ride. i suffered a lot with it & the final result was ... khmm ... say, acceptable :-))

the bests
y

molerus

20th February 2003, 12:04

Hi!

Originally posted by yaz

luma noise could be detected even in the 1st pass & imho, it'd be very useful. i suggested it to the xvid devels, but it was refused as 'xvid works in a different way'. however, some bytes are (were?) reserved for this info in the stat file, but it's never been implemented. i luved this option in nandub (not shitframe but lumanoise correction!). it helped me outta the mud many times.

y

At first I thought it could be the fault of the missing luma noise correction too. So I ported NanDub's code for computing it into a VirtualDub filter. This filter stores the luma noise of each frame in a separate file. Additionally, I wrote a short program which muxes the luma information into the stats file. Then one can use GKnot to scale the bitrate curve. If you are interested, I can post this filter as an attachment.

However, it turned out that the scenes I mentioned are TOTALLY incompressible. Even when I set the quant to 2 I got blocks. Only setting the quantizer to 1 solved the problem, but the thing is that one cannot force XviD to use such a low quantizer in 2-pass encoding.

Frankly, I don't know what else I can do :sly: .

Didée

20th February 2003, 12:54

Originally posted by molerus
... Only setting quantization to 1 solved the problem, but the thing is that one can not force XviD to use such low quantizer in 2-pass encoding.
Hint:
There is a way to use quant 1, if you need it by all means:
You can "misuse" the start and end credits function to use specific quants on certain scenes. Okay, this obviously limits you to only 2 different scenes in the movie ... unless you break up your new encoding into several parts, taking the bitrate distribution of your initial encoding as guidance for the size of each part. A lot of manual work, yes ... but it's better than trying some dozens of encodings with all kinds of filters and settings, and in the end you still don't get what you want ;)
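
The "bitrate distribution as guidance" step is simple proportional arithmetic. A minimal sketch, assuming you already have the per-frame sizes of the initial encode as a plain list; the function name, the frame sizes and the cut points below are all hypothetical.

```python
def part_budgets(frame_sizes, boundaries, target_total):
    """Split target_total among parts in proportion to their size in the initial encode.

    frame_sizes: per-frame sizes (bytes) from the initial encode.
    boundaries:  frame indices at which a new part starts (the first part starts at 0).
    """
    starts = [0] + sorted(boundaries)
    ends = starts[1:] + [len(frame_sizes)]
    # Bytes each part received in the initial encode.
    part_sizes = [sum(frame_sizes[s:e]) for s, e in zip(starts, ends)]
    total = sum(part_sizes)
    return [target_total * size / total for size in part_sizes]


# Hypothetical 6-frame clip cut into three parts at frames 2 and 4.
sizes = [1000, 1000, 4000, 4000, 1000, 1000]
print(part_budgets(sizes, [2, 4], 24000))
```

Each part then gets its own 2-pass encode with the computed target size, and the parts that need fixed quants can be handled separately.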

And, once more: Verify your blocking problem is also there without "dust", or any other noise filter.

Oh, one thing more: I encode most stuff with mpeg quantization, for that's more my taste.
Lately, I tried h.263 quantization once again -
- and found lots of blocks in flat areas, where I was not used to see any blocks in earlier (mpeg quant) tests ...

Regards

Didée

Prosper

23rd February 2003, 03:32

Once upon a time there was a program called 'Fair Use' (FU) which had a similar sort of algorithm for comparing 2 (or more) frames, and determining which was better. Too bad it's closed-source. The author used to (still does?) hang out here, maybe if you asked real nice...

Belgabor

23rd February 2003, 05:46

Just a small idea that came to me while reading this: how about subtracting the original and the compressed frame and running an edge detection for vertical & horizontal edges on the difference to detect blockiness?
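
One possible reading of this idea, as a hedged sketch: subtract the frames, take simple horizontal and vertical gradients of the difference, and compare the gradient strength on the 8x8 block grid with the strength everywhere else. The 8-pixel grid, the plain first-difference "edge detector" and all names are assumptions made for illustration, not anything proposed in detail in the thread.

```python
def grid_edge_strength(original, encoded, block=8):
    """Mean |gradient| of the difference image on and off the block grid.

    Frames are lists of rows of luma samples. Returns (mean_on_grid, mean_off_grid);
    a much larger first value hints at error edges aligned with block boundaries.
    """
    height, width = len(original), len(original[0])
    if height <= block and width <= block:
        raise ValueError("frame must span more than one block")
    # Step 1: subtract the encoded frame from the original.
    diff = [[original[y][x] - encoded[y][x] for x in range(width)] for y in range(height)]
    on_grid, off_grid = [], []
    # Step 2a: horizontal gradient, which responds to vertical edges.
    for y in range(height):
        for x in range(1, width):
            g = abs(diff[y][x] - diff[y][x - 1])
            (on_grid if x % block == 0 else off_grid).append(g)
    # Step 2b: vertical gradient, which responds to horizontal edges.
    for y in range(1, height):
        for x in range(width):
            g = abs(diff[y][x] - diff[y - 1][x])
            (on_grid if y % block == 0 else off_grid).append(g)
    return sum(on_grid) / len(on_grid), sum(off_grid) / len(off_grid)


# Hypothetical 16x16 test: a horizontal ramp, "encoded" as two flat 8-pixel-wide columns.
src = [[4 * x for x in range(16)] for _ in range(16)]
enc = [[14 if x < 8 else 46 for x in range(16)] for _ in range(16)]
print(grid_edge_strength(src, enc))
```

On real footage the off-grid value also picks up ordinary detail loss, so the ratio of the two numbers is probably more telling than either one alone.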

molerus

23rd February 2003, 12:33

Hi!

@Didée

I think I found another way to force quant 1. All you have to do is use a custom matrix whose coefficients are equal to one half of the original matrix (either h.263 or MPEG). It should do the same as setting quant to 1 :) .

@Belgabor

Hmmmmmm. I think I must first subtract the two frames and enhance the differences to see what the difference is. But I think you may be right. I'll try it.

@Prosper

Thx. I'll try to torment Google:D . Maybe I'll find something.

BTW, I made a comparison based on PSNR. NO (I repeat NO) correlation with the HVS (human visual system). It sucks :mad: .

Didée

23rd February 2003, 16:08

Originally posted by molerus
I think I found another way to force quant 1. All you have to do is use a custom matrix whose coefficients are equal to one half of the original matrix (either h.263 or MPEG). It should do the same as setting quant to 1 :) .
Heheh ;)

Congratulations - by yourself, you found the exact thing that I am elaborating on recently.
From my current findings, 2-pass encoding can get a very noticeable overall quality boost.
However, it is not all that easy:
With a "halfened" matrix, you will get double the quantizers, compared to standard encoding. So, quant-2 with that custom matrix will equal quant-1 of the normal one, yes.
But, when encoding with that custom matrix, the codec will almost never use quant-2, only on some mini-frames, probably black ones.
But quant-3 will get used quite often (especially if you use some proper curve compression to help small frames). And that equals ~ "quant 1.5" the normal way, and gives a really high quality picture. In my last tests, I used standard CC with hi=30, lo=15, which gave me a really nice quantizer distribution. I did my test with a 10% snip through LOTR FOTR SEE. 3h 20min will go at 544*224 on one 800MB XCD, with Vorbis @ Q1 ...

First test with CC hi-38 lo-14, not capped [edit: pixiedust(limit=3) was used]:
Q:2:399
Q:3:1031
Q:4:3424
Q:5:4877
Q:6:2992
Q:7:1669
Q:8:659
Q:9:152
Q:10:33
Q:11:8
Q:12:2
Second test with CC hi-28 lo-14, minQ=4 [edit: pixiedust(limit=3) was used]:
Q:4:5017
Q:5:4830
Q:6:3285
Q:7:1611
Q:8:444
Q:9:57
Q:10:2
(All readers remember to divide all the quantizers above by 2 to compare to "normal" encoding)
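
To make the two distributions easier to compare, here is a small sketch that computes the frame-weighted mean quantizer of each test and divides it by 2, as the note above asks. The parsing assumes nothing more than the `Q:quantizer:frames` layout shown; the function names are made up.

```python
def parse_quant_histogram(text):
    """Parse whitespace-separated entries like 'Q:4:5017' into {quantizer: frame_count}."""
    hist = {}
    for entry in text.split():
        _, quant, count = entry.split(":")
        hist[int(quant)] = int(count)
    return hist


def mean_quant(hist):
    """Frame-weighted average quantizer."""
    frames = sum(hist.values())
    return sum(q * n for q, n in hist.items()) / frames


first = parse_quant_histogram(
    "Q:2:399 Q:3:1031 Q:4:3424 Q:5:4877 Q:6:2992 Q:7:1669 "
    "Q:8:659 Q:9:152 Q:10:33 Q:11:8 Q:12:2"
)
second = parse_quant_histogram(
    "Q:4:5017 Q:5:4830 Q:6:3285 Q:7:1611 Q:8:444 Q:9:57 Q:10:2"
)

for name, hist in (("first", first), ("second", second)):
    avg = mean_quant(hist)
    print(f"{name}: {sum(hist.values())} frames, mean q' {avg:.2f}, "
          f"normal-matrix equivalent {avg / 2:.2f}")
```

Both tests cover the same 15246 frames, and both come out at a normal-matrix equivalent of about 2.6, so the two settings differ mainly in how the quantizers are spread, not in the average.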

In the end, I will use hi-30 lo-15 with minQ=3.

Much more testing has to take place, but I think this is a very promising way.
Another benefit is that the distribution of quantisation will be spread much finer than with normal matrices. With a usual hi-quality encoding, the quantizers are jumping mostly between q2 and q3. Now, that is already a reasonable jump in quality! With a "halfened" matrix, the distribution will be q'3-q'4-q'5-q'6, instead of only q2-q3.
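
The arithmetic behind the finer spread can be sketched as follows. It uses a simplified model in which the quantization step is proportional to matrix coefficient times quantizer, so halving the matrix and doubling the quantizer cancel out; real codecs add rounding on top of that, which this sketch ignores. The names are invented for illustration.

```python
from fractions import Fraction

MATRIX_SCALE = Fraction(1, 2)  # the "halfened" matrix: every coefficient divided by 2


def normal_equivalent(custom_quant, scale=MATRIX_SCALE):
    """Quantizer of the unscaled matrix giving the same coefficient * quantizer product."""
    return custom_quant * scale


def relative_jump(quant):
    """Relative increase in step size when moving from quant to quant + 1."""
    return Fraction(quant + 1, quant) - 1


for q in range(2, 7):
    print(f"q'{q} ~ normal q{float(normal_equivalent(q))}, "
          f"next step is {float(relative_jump(q)) * 100:.0f}% coarser")
```

Under this model, going from normal q2 to q3 makes the step 50% coarser in one go, while q'4 to q'5 is 25% and q'5 to q'6 is 20%.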

Post is long enough now - waiting for comments.

Regards

Didée

pandv

23rd February 2003, 16:23

I thought about this (halving the coefficients) some time ago.

But I think there is a problem:

The very first coefficient of the intra matrix (the upper-left or DC component) can't be modified. It's fixed at 8.

So if you get a new quantizer of 4, equivalent to the old quantizer 2, you are overquantizing the DC component.

But maybe your tests show that this is only a theoretical problem, not a practical one.

pandv

molerus

24th February 2003, 11:38

Hi!

@Didée

Would you be so kind as to attach the matrix you're using? :p I would appreciate it.
Another idea popped into my mind. Currently, quantization is done in such a way that the DCT coefficients are divided by the quantization matrix, and then by a coefficient depending on the quantizer. What if, instead of dividing by that single coefficient, one used a matrix biased towards lower frequencies? For quantizer 2, dividing by that matrix would take place once, for 3 twice, and so on. That way one would preserve the information in the low frequencies while losing more and more information about details, noise etc. I once read that Koepi had thought about introducing separate matrices for all quantizers, but it seems that idea died. Pity :( .
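
A toy sketch of how that proposal could work for a single coefficient, purely as an illustration of the idea and not of anything XviD does: the divisor is the base matrix entry multiplied by a per-frequency bias entry applied (quantizer - 1) times. The bias values and the function name are hypothetical.

```python
def total_divisor(base_coeff, bias_coeff, quant):
    """Divisor for one DCT coefficient when the bias matrix is applied (quant - 1) times."""
    if quant < 1:
        raise ValueError("quantizer must be >= 1")
    return base_coeff * bias_coeff ** (quant - 1)


# Hypothetical entries: a low-frequency cell with almost no bias,
# and a high-frequency cell that is penalised more at every quantizer step.
low_base, low_bias = 8, 1.1
high_base, high_bias = 16, 1.6

for quant in (1, 2, 3, 4):
    print(quant,
          round(total_divisor(low_base, low_bias, quant), 2),
          round(total_divisor(high_base, high_bias, quant), 2))
```

With a bias close to 1 in the low-frequency cells and larger values towards the high frequencies, raising the quantizer would throw away detail and noise much faster than it touches the coarse structure.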

BTW - I compress using Duron 950 so if I were you I wouldn't expect my comments soon:D .

@pandv

I think there's no point in altering the intra matrix. It seems to influence neither the quality nor the file size. In the inter matrix, however, you can put any coefficient you want. But if you're so determined, you can deceive XviD through the matrix file. Its structure is as simple as a shoe :). All you have to do is edit this file and replace the first byte, for example by copying the second byte over it.

Didée

24th February 2003, 12:41

molerus,

too lazy to type that matrix by hand ? :)

Anyway, I attached it here: <DoubleBitrateMatrix.txt> Wait for a moderator to approve the attachment.
Another thing I am wondering is, if there are greater rounding errors introduced by doubling the quantizers due to the matrix delivering double the data rate? This would destroy a lot of the theoretical benefit ...
These days, I have some little problems with comparing things: the very same clip looks like crap today, pretty good tomorrow, and so on ...

BTW, since my encoding rig with Athlon XP1800 is busy (almost) 24/7, most of my testing is actually done on an Athlon Classic 700 (!) - so don't bother about your machine ;)

molerus

24th February 2003, 23:30

I don't want to complain, but there is no attachment :( .

Aktan

25th February 2003, 08:33

wait for it........!

Didée

25th February 2003, 09:41

Patience, molerus, patience ...

Are you waiting for a magic present, or what ;) ?

The matrix indeed looks as ridiculously simple as
8 8 9 10 11 12 13 14
8 9 10 11 12 13 14 15
9 10 11 12 13 14 15 16
10 11 12 13 14 15 16 17
11 12 13 14 15 16 17 18
12 13 14 15 16 17 18 19
13 14 15 16 17 18 19 20
14 15 16 17 18 19 20 22
for both intra and inter matrix. And it is not too hard to come to that conclusion, from what was written above ;)

Now, warm up your fingers ...

Happy testing

Didée

sam_b

25th February 2003, 13:46

Have either of you tried andreas' 78'ers matrix? It has really amazed me, and it allows larger file sizes as the quants are generally lower, so it seems to do what you guys want. To give an example, quality 88 with b-frames still looks pretty good, and the file size is pretty normal.

molerus

25th February 2003, 14:31

@Didée

Hmmm, mhhhmm, hmmmm :rolleyes:

I'm not convinced. Your matrix looks like you divided the MPEG intra matrix by 2. But bear in mind that the intra and inter matrices for MPEG differ. I think that the coefficients for the intra matrix could be as you have written (actually the first could even be 4, XviD allows it). But the inter matrix coefficients (of course we are talking about MPEG-like quantization) should range from 8 to 16 or 17. With the matrix you attached you should get a very sharp, coarse picture, along with a lot of mosquito noise.

Personally I prefer h.263-like quantization. Since the intra matrix is not so important I left it as it is, while for the inter matrix I set all the coefficients to 8. See what will happen :scared: .

@sam_b

Could you post a link? I couldn't find a user named Andreas 78 :) .

sam_b

25th February 2003, 14:45

This help ya?

http://forum.doom9.org/showthread.php?s=&threadid=46579

It's at the bottom of one of wotef's posts, 22 posts from the start.

Didée

25th February 2003, 16:52

Originally posted by molerus
Your matrix seems to look like you divided MPEG intra matrix by 2
Indeed.
And that is exactly what I was talking about :)

Basically, the results of that matrix should be very similar to the standard matrix, since during 2nd-pass double the quantizers will get used.
When I tried that, my primary goal was to achieve a finer distribution of quantization. No more, no less. So, since you can't use "half" quantizers during encoding, I simply halved the coeffs in the matrix, period. But you still have the advantage that a lower effective quantization than "normal q2" can take place on certain sequences, and that the "big jump" between q2 and q3 is avoided.

In the end, there is no magic matrix, alas. You only have a certain amount of bits that you can distribute somehow.
It's just like life - you win here, you lose there ...

Didée

Didée

26th February 2003, 09:43

And about the mosquito noise specifically:

Well, to reduce the mosquito noise, you would need to lower exactly those coefficients that cause the noise. Now, which coeffs in the table are those? You can't tell ... you know that nice little graphic that shows which DCT frequencies belong to which cell in the matrix? Good. But you cannot draw any direct dependency between (the visualisation of) these frequencies and the actual image content! That's because it is not the actual image content that gets compressed, but the difference between the frames (<- a little simplified, I know).
Because of that, AND because XviD's motion estimation seems to work pretty well, only ca. 33% of the table, the top-left area, is responsible for ca. 95% of all content (wild guessing for those numbers, but it will give the idea, I think).
As a result, I think it is not possible to influence mosquito noise alone, and leave the rest untouched, by only manipulating the quantization matrix.

As for blocking problems on flat surfaces and gradients like smoke, perhaps one could try to lower only the three numbers in the upper left? Let's see ...

What eases my mind a little is that "trbarry" Tom, who for sure understands much more of all that stuff than I do, wrote his DCT filter, but admitted that he does not really understand/foresee all the consequences ;)

Regards

Didée