How to make a good custom matrix ???


Soulhunter
13th March 2004, 01:26
Anything special I have to pay attention to when making a custom matrix ???

Example: How do the values from top to bottom n' left to right relate to each other... :confused:


Tia, Soulhunter

RadicalEd
13th March 2004, 02:27
Well, lower and rightmost values quantize out finer details, while upper left values zero the coarser part of the image. Ideally you want to discard as much detail as possible up to the visible threshold, so it's a dance of how far up/left you can go without artifacts. Since the types and amounts of detail are content specific, you'll have to tailor it to what you're doing. That's why there are so many of the things :|
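A rough way to see what a single matrix entry does: below is a minimal Python sketch, assuming a simplified model where the step size for a coefficient is matrix entry * quantizer / 8 and the result is rounded to the nearest level. Real encoders (XviD included) add rounding offsets and dead zones, so the numbers are only illustrative, and the function names are made up for this example.

```python
def quantize(coeff, weight, quant):
    """Quantize one DCT coefficient (simplified model, see assumptions above)."""
    step = weight * quant / 8.0
    # int() truncates toward zero, so offset by 0.5 to round to nearest
    return int(coeff / step + (0.5 if coeff >= 0 else -0.5))


def dequantize(level, weight, quant):
    """Reconstruct the coefficient value the decoder would see."""
    return level * weight * quant / 8.0


# Same quantizer (q2), same small coefficient, three different matrix entries.
for weight in (8, 16, 32):
    level = quantize(3, weight, 2)
    print(f"matrix entry {weight}: level {level}, "
          f"reconstructed {dequantize(level, weight, 2)}")
```

With the same quantizer, a small coefficient survives under a low matrix entry but is zeroed under a high one, which is the "quantize out" effect described above.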

Soulhunter
13th March 2004, 19:23
Have already done one...


8 12 14 16 16 16 18 18 12 14 16 18 18 18 22 22
12 14 16 16 16 18 18 22 14 16 18 18 18 22 22 24
14 16 16 16 18 18 22 22 16 18 18 18 22 22 24 24
16 16 16 18 18 22 22 24 18 18 18 22 22 24 24 26
16 16 18 18 22 22 24 24 18 18 22 22 24 24 26 26
16 18 18 22 22 24 24 26 18 22 22 24 24 26 26 28
18 18 22 22 24 24 26 28 22 22 24 24 26 26 28 28
18 22 22 24 24 26 28 32 22 24 24 26 26 28 28 32


My goal is to make a matrix that looks better than MPEG but doesn't produce files as big as Didée's 6of9 !!!


Test: T3 (PAL) @ Q2 n' 1024x576

- H263 = 3.24 GB

- MPEG = 3.78 GB

- MINE = 4.51 GB


Bye

Didée
13th March 2004, 20:31
Filesize at a fixed quant-[X] is of no importance at all. *)

E.g. speaking of SixOfNine again: filesize at q2 is much higher than standard, yes. But so is the quality. As a rule of thumb for SixOfNine, it's about

q2 [std] == q3 [6of9]
-- [std] == q4 [6of9]
q3 [std] == q5 [6of9]

So, (one) trick there is simply to make an additional quantizer available, where the std matrix would always have to do the "big jump" between q2 and q3. See the other thread, where I explained it in more detail.

In the end, all one can do is to "shift around" the distribution of bits. But still, there are only so many bits to distribute - not one bit more, not one bit less.


- Didée


*) This is not completely true, at least for XviD. AFAIK XviD makes changes within the motion estimation for higher quants to accommodate somehow for the expected high quantization. But this seems to be hard-wired to certain quantizers, not to the resulting bitrate (that algo simply doesn't have high-bitrate matrices in mind... ).
Therefore, if you make a matrix out of the standard matrix with all values just halved, that matrix *should*, at doubled quantizers, behave almost exactly like the standard matrix. Or: at quant-16, it *should* behave just like the standard matrix at quant-8.
But it does not, because the motion estimation gets changed with those higher quantizers, and your "halved" matrix will deliver worse results then. That's also why my encodings with SixOfNine will hardly ever use higher quants than q6 for P's and q8 for B's *at max*, at least not for archiving stuff.

(Could that perhaps be something for past-1.0 XviD ?)
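The "halved matrix at doubled quantizer" argument can be checked on paper with a few lines of Python. This is only a sketch, assuming the effective quantization strength of a coefficient is proportional to matrix entry * quantizer. The example row is the first row of the matrix posted earlier in the thread (chosen because all entries are even), not the actual standard matrix. It does not model the quantizer-dependent motion estimation that makes real encodes differ.

```python
def effective_strength(matrix_row, quant):
    """Matrix entry times quantizer for every entry (assumed proportional to step size)."""
    return [weight * quant for weight in matrix_row]


full = [8, 12, 14, 16, 16, 16, 18, 18]   # example row, all entries even
halved = [weight // 2 for weight in full]

print("full   @ q8 :", effective_strength(full, 8))
print("halved @ q16:", effective_strength(halved, 16))
print("identical on paper:",
      effective_strength(full, 8) == effective_strength(halved, 16))
```

On paper the two are the same; any difference in a real encode has to come from encoder decisions that depend on the quantizer number itself.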

LigH
13th March 2004, 20:58
I'm currently working on some kind of matrix viewer/editor:

http://forum.doom9.org/showthread.php?s=&threadid=72211

If you are seriously interested, you may have a look at the current stage of development, just to tell me if it is going in the right direction. If you are interested, you'll have to request it by PM (I do not yet want to post it publicly).

Soulhunter
13th March 2004, 22:10
Originally posted by Didée
Filesize at a fixed quant-[X] is of no importance at all. *)

I want to produce a matrix that makes 4GB out of a 3GB MPEG matrix filesize !!!

So far I think my first test matrix works pretty well... :)

Suggestions or tests would be nice !!!


Tia, Soulhunter

LigH
13th March 2004, 22:24
If your original video had only 3 GB due to using high quantization factors, and you re-encode that one with a matrix which creates a 4 GB file, then you won't gain any quality! Re-encoding always decreases quality - maybe not obviously, but measurably.

But if you had top-quality original material, and you encode it on one hand with a standard matrix and get 3 GB, and on the other hand encode it with the special matrix and get 4 GB, then the second copy may have better quality than the first copy.

Soulhunter
13th March 2004, 22:46
Originally posted by LigH
If your original video had only 3 GB due to using high quantization factors, and you re-encode that one with a matrix which creates a 4 GB file, then you won't gain any quality! Re-encoding always decreases quality - maybe not obviously, but measurably.

But if you had top-quality original material, and you encode it on one hand with a standard matrix and get 3 GB, and on the other hand encode it with the special matrix and get 4 GB, then the second copy may have better quality than the first copy.
Source was a DVD (Terminator 3 / German / R2)

I mean 3GB filesize with MPEG matrix n' 4GB size with my matrix...

It's for the situation when a q2 encode is too small, but a q1 encode is too big !!!

Bye

Didée
14th March 2004, 03:11
Of course you are perfectly right.

In your given case, you'll get best results if you tweak the matrix in such a way that you reach your desired filesize with q2 only. That way, you'll have the smallest possible loss (as far as quantization is concerned, technically).

My schoolmasterly talking ;) was about general usage, because that was how I understood your post with the matrix:

> Looks it good ???
> My goal is to make a matrix that looks better than MPEG but produces
> not so big files like Andreas 78er or Didée's 6of9 !!!
> [.. followed by a comparison of q2-tests ..]

And for general usage, i.e. 2-pass encodes in a reasonable range with single-digit quantizers, my above statement is valid.

To your matrix: it seems not bad to me. Perhaps you should reduce the "steps" between the quantization levels (you're first settling on '18', then jumping to '22', settling again, jumping to '24' ...). Try to "blur" it a little together, at least in the border regions.

What I'm still not sure about is the relative quantization relation INTRA <-> INTER. Looking at I-frames only, I'd prefer to have the I's quantized a little bit more than the P's - especially at quant 2. But we also have the INTRA matrix used in P-frames ... don't yet know if it's generally better to have these INTRA blocks in P-frames quantized lower/equal/higher than the rest. (?)
But then again, for a constant-q2 encode, you probably could just set INTRA==INTER.

- Didée

sysKin
14th March 2004, 03:55
Originally posted by Didée
In your given case, you'll get best results if you tweak the matrix in such a way that you reach your desired filesize with q2 only. That way, you'll have the smallest possible loss (as far as quantization is concerned, technically).

Can you explain why?

I always had the feeling that it's better to operate at higher quants (say 5..8) because quant variations are smaller (small undersize is compensated by a slightly lower quant, rather than a jump to q1, which is a waste. Similarly, small oversize will raise the quantizer just a bit, not all the way up to 3).

> What I'm still not sure about is the relative quantization relation INTRA <-> INTER. Looking at I-frames only, I'd prefer to have the I's quantized a little bit more than the P's - especially at quant 2. But we also have the INTRA matrix used in P-frames ... don't yet know if it's generally better to have these INTRA blocks in P-frames quantized lower/equal/higher than the rest. (?)
> But then again, for a constant-q2 encode, you probably could just set INTRA==INTER.

INTRA has higher quantization in all matrices I've seen. There must be a reason for this :)

Radek

Didée
14th March 2004, 05:04
Originally posted by sysKin
Can you explain why?

Yes I can. :)

> I always had the feeling that it's better to operate at higher quants (say 5..8) because quant variations are smaller (small undersize is compensated by a slightly lower quant, rather than a jump to q1, which is a waste. Similarly, small oversize will raise the quantizer just a bit, not all the way up to 3).

I could not agree more. Once more, for exactly the same idea I mostly use 6of9 now: to make the codec use smaller steps in quantization.

What I was saying about "minimizing the loss" was referring to soulhunter's specific case: he wants to hit a desired filesize (4GB) as close as possible with quant-2 only. For that aim, it's obviously a good (but timeconsuming) idea to tweak the matrix accordingly. In the end, he will end up with minimal usage of q3's. Presuming his finally used matrix will produce a filesize a tad above 4GB, and he uses 2-pass. For 1-pass, he'll just get what he'll get: a file ~about~ 4GB.

> INTRA has higher quantization in all matrices I've seen. There must be a reason for this :)

There are so many more people using DivX than people using XviD.
There must be a reason for this. A big majority will never be wrong - therefore I assume DivX is so much better :D

BTW, is that thingie about simplifying/whateveritmaybeexactly the motion vectors for high quants still true for XviD 1.0 tree? When you once made comments about that, 0.9 tree was bleeding edge ...

- Didée

sysKin
14th March 2004, 08:02
Originally posted by Didée
What I was saying about "minimizing the loss" was referring to soulhunter's specific case: he wants to hit a desired filesize (4GB) as close as possible with quant-2 only. For that aim, it's obviously a good (but timeconsuming) idea to tweak the matrix accordingly. In the end, he will end up with minimal usage of q3's. Presuming his finally used matrix will produce a filesize a tad above 4GB, and he uses 2-pass. For 1-pass, he'll just get what he'll get: a file ~about~ 4GB.

OK, fair enough :)

> There are so many more people using DivX than people using XviD.
> There must be a reason for this. A big majority will never be wrong - therefore I assume DivX is so much better :D

Well, this time I mean default mpeg-4 matrix (the one used as default in "mpeg" quantization... who called it "mpeg" quantization anyway?) and also default mpeg-2 matrix... which still does not mean much of course.

> BTW, is that thingie about simplifying/whateveritmaybeexactly the motion vectors for high quants still true for XviD 1.0 tree? When you once made comments about that, 0.9 tree was bleeding edge ...

Yup, still true, but I wouldn't worry about that too much. Similar hard-coded bits/quality as a function of quantizer is used in VHQ and trellis quant, but I would worry even less. In fact, if you worry about it, use VHQ of 2 (or more) which will smartly counteract any possible wrong decisions made by normal ME.

Radek

LigH
14th March 2004, 12:23
Originally posted by sysKin
INTRA has higher quantization in all matrices I've seen.

... as a maximum quantization factor (for the high frequency edges/corner). But if you compare the minimum (for the low frequency edges/base corner), often the opposite is true: INTRA probably must always start with 8, INTER usually starts with 16 (but probably doesn't need to).

Soulhunter
18th March 2004, 20:26
Got the same ffdshow playback issues as those reported for 6of9... :\

With XviD's decoder all works well, but I wanna use ffdshow !!!


What do I have to change here ???

8 12 14 16 16 16 18 18 12 14 16 18 18 18 22 22
12 14 16 16 16 18 18 22 14 16 18 18 18 22 22 24
14 16 16 16 18 18 22 22 16 18 18 18 22 22 24 24
16 16 16 18 18 22 22 24 18 18 18 22 22 24 24 26
16 16 18 18 22 22 24 24 18 18 22 22 24 24 26 26
16 18 18 22 22 24 24 26 18 22 22 24 24 26 26 28
18 18 22 22 24 24 26 28 22 22 24 24 26 26 28 28
18 22 22 24 24 26 28 32 22 24 24 26 26 28 28 32


Would this fix it ???


8 12 14 16 16 16 18 18 16 16 16 18 18 18 22 22
12 14 16 16 16 18 18 22 16 16 18 18 18 22 22 24
14 16 16 16 18 18 22 22 16 18 18 18 22 22 24 24
16 16 16 18 18 22 22 24 18 18 18 22 22 24 24 26
16 16 18 18 22 22 24 24 18 18 22 22 24 24 26 26
16 18 18 22 22 24 24 26 18 22 22 24 24 26 26 28
18 18 22 22 24 24 26 28 22 22 24 24 26 26 28 28
18 22 22 24 24 26 28 32 22 24 24 26 26 28 28 32



Tia, Soulhunter
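To see exactly what differs between the two pasted blocks, a small diff helps. The sketch below copies both 8x16 blocks from the post and lists the changed entries. The thread does not say which half is which; the left half starting with 8 suggests intra on the left and inter on the right, but that is an assumption, and the helper names are made up.

```python
BEFORE = """\
8 12 14 16 16 16 18 18 12 14 16 18 18 18 22 22
12 14 16 16 16 18 18 22 14 16 18 18 18 22 22 24
14 16 16 16 18 18 22 22 16 18 18 18 22 22 24 24
16 16 16 18 18 22 22 24 18 18 18 22 22 24 24 26
16 16 18 18 22 22 24 24 18 18 22 22 24 24 26 26
16 18 18 22 22 24 24 26 18 22 22 24 24 26 26 28
18 18 22 22 24 24 26 28 22 22 24 24 26 26 28 28
18 22 22 24 24 26 28 32 22 24 24 26 26 28 28 32
"""

AFTER = """\
8 12 14 16 16 16 18 18 16 16 16 18 18 18 22 22
12 14 16 16 16 18 18 22 16 16 18 18 18 22 22 24
14 16 16 16 18 18 22 22 16 18 18 18 22 22 24 24
16 16 16 18 18 22 22 24 18 18 18 22 22 24 24 26
16 16 18 18 22 22 24 24 18 18 22 22 24 24 26 26
16 18 18 22 22 24 24 26 18 22 22 24 24 26 26 28
18 18 22 22 24 24 26 28 22 22 24 24 26 26 28 28
18 22 22 24 24 26 28 32 22 24 24 26 26 28 28 32
"""


def parse(text):
    """Turn a pasted block into a list of integer rows."""
    return [[int(v) for v in line.split()] for line in text.strip().splitlines()]


def diff(before, after):
    """Return (row, column, old, new) for every entry that changed."""
    changes = []
    for r, (row_b, row_a) in enumerate(zip(before, after)):
        for c, (old, new) in enumerate(zip(row_b, row_a)):
            if old != new:
                changes.append((r, c, old, new))
    return changes


changes = diff(parse(BEFORE), parse(AFTER))
for row, col, old, new in changes:
    half = "left" if col < 8 else "right"
    print(f"row {row}, column {col % 8} ({half} half): {old} -> {new}")

right_half_min = min(v for row in parse(AFTER) for v in row[8:])
print("smallest value in right half after the change:", right_half_min)
```

Only three entries in the top-left corner of the right half change, so that after the edit no value in that half is below 16.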

Sharktooth
18th March 2004, 21:27
Originally posted by Soulhunter
Would this fix it ???

8 12 14 16 16 16 18 18 16 16 16 18 18 18 22 22
12 14 16 16 16 18 18 22 16 16 18 18 18 22 22 24
14 16 16 16 18 18 22 22 16 18 18 18 22 22 24 24
16 16 16 18 18 22 22 24 18 18 18 22 22 24 24 26
16 16 18 18 22 22 24 24 18 18 22 22 24 24 26 26
16 18 18 22 22 24 24 26 18 22 22 24 24 26 26 28
18 18 22 22 24 24 26 28 22 22 24 24 26 26 28 28
18 22 22 24 24 26 28 32 22 24 24 26 26 28 28 32

Probably yes.

Soulhunter
18th March 2004, 22:00
Thanks... ;)

MfA
18th March 2004, 22:25
Originally posted by Didée
What I was saying about "minimizing the loss" was referring to soulhunter's specific case: he wants to hit a desired filesize (4GB) as close as possible with quant-2 only.

Of course then the question becomes why he wants that ... quant2 has no magical properties which makes it any better than quant4 with a matrix with all entries halved.

Marco

Soulhunter
19th March 2004, 23:57
Originally posted by MfA
Of course then the question becomes why he wants that ... quant2 has no magical properties which makes it any better than quant4 with a matrix with all entries halved.

Just to follow my routine... ;)


- I start with a fixed quant. 2 encode first

- If the filesize is too high I use BVOP's @ 1/1/1

- If the filesize is only a bit too high I use BVOP's @ 1/1/0

- If the filesize is too low I re-do the above stuff with another quant. matrix


Bye

Chainmax
20th March 2004, 15:46
Is 6of9-HVS only useable for Q2 encodes? If not, what would you guys say is the bitrate floor in which it can be used? Also, how do QPel and Trellis play with it?

Didée
20th March 2004, 20:23
Personally, I use 6of9 for _general_ usage. But then, I usually do relatively high bitrate encodings only ...
I can't tell you much about the "bitrate floor". People always talk about bitrate, but they never talk about *resolution*! Now, say I encode at 1500 kbps ... for a 640*272 2.35:1 DVD rip, this is a pretty high bitrate. For a 720*576 4:3 (or 16:9 anamorph) rip, it is a weak medium bitrate, and for 1280*720 "HDTV", this is a bitrate so low that it's out of discussion ...

If you browse through the iiP (http://forum.doom9.org/showthread.php?s=&threadid=70916) thread, you'll find links to some samples I did: 704*528 (16:9 anamorph) @ 1000 kbps with 6of9 as quant matrix, plus qpel, plus trellis. To give a better idea, this equals about 0.10 bits/(pixel*frame) (remember? "Aim for 0.2 for good quality, and never go below 0.15...")

So, people ... forget about "bitrate" unless you are *more specific*. (Please.)
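The bits/(pixel*frame) figure is easy to reproduce. A minimal sketch, assuming 25 fps (PAL) for every example and 1 kbps = 1000 bit/s; neither is stated in the post, and the function name is made up:

```python
def bits_per_pixel_frame(width, height, kbps, fps=25.0):
    """Average number of bits spent per pixel per frame."""
    return kbps * 1000.0 / (width * height * fps)


examples = [
    (640, 272, 1500),   # 2.35:1 DVD rip
    (720, 576, 1500),   # full PAL resolution
    (1280, 720, 1500),  # "HDTV"
    (704, 528, 1000),   # the 6of9 sample encode
]
for width, height, kbps in examples:
    value = bits_per_pixel_frame(width, height, kbps)
    print(f"{width}x{height} @ {kbps} kbps: {value:.3f} bits/(pixel*frame)")
```

The same 1500 kbps lands at very different values depending on resolution, which is the reason a bitrate alone says little.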

Talking about quantizers: Generally, I aim for 6of9 encodes to not exceed q5 for p-frames (q6 is OK for curve compression's overflow control), and quant 7~8 for B-frames.

As mentioned several times now, (one of) the main trick(s) is to make smaller quantization steps available to the codec.
To give a visualization:
Relative framesizes per quant

  Standard         SixOfNine
  *                *
  *                *
  *                *  *
  *                *  *
  *  *             *  *  *
  *  *             *  *  *  *
  *  *  *          *  *  *  *  *
  *  *  *          *  *  *  *  *
  *  *  *          *  *  *  *  *
  *  *  *          *  *  *  *  *
  q2 q3 q4         q3 q4 q5 q6 q7
As can be seen easily, for the same compression range that the standard matrix offers for q2-q4, a high bitrate matrix offers "additional" quantization steps. Especially the avoidance of the "big jump" between q2 and q3 comes in handy.

Hugh, spoken I have ;)

- Didée
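The "big jump" between q2 and q3 can be put into numbers. A minimal sketch, assuming the quantization step simply scales with the quantizer number (everything else in the encoder is ignored; the function name is made up):

```python
def relative_jump(q):
    """Relative increase in step size when going from quantizer q to q + 1."""
    return (q + 1) / q - 1.0


for q in range(2, 8):
    print(f"q{q} -> q{q + 1}: +{relative_jump(q) * 100:.0f}%")
```

Going from q2 to q3 coarsens the quantization by half, while one step further up the scale is a much smaller change, so a matrix that moves the same bitrate range to higher quantizer numbers gives the rate control finer steps to work with.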

Chainmax
20th March 2004, 21:28
I was thinking of using 6of9_HVS for both a cartoon rip at 640x480 with an average bitrate of ~1200kbps and for an anime rip at 592x320 with an average bitrate of ~850kbps. From what you are posting, I gather that the cartoon could benefit from 6of9_HVS but the anime rip won't (I'll use HVS_better for that). Thanks for the quick reply :).

LigH
20th March 2004, 23:07
How much do a cartoon and an anime differ, except that anime is a Japanese cartoon?

TorgoGuy
20th March 2004, 23:17
Originally posted by Didée
So, people ... forget about "bitrate" unless you are *more specific*. (Please.)
Agreed. Saying bitrate by itself is completely meaningless.

Originally posted by Didée
Talking about quantizers: Generally, I aim for 6of9 encodes to not exceed q5 for p-frames (q6 is OK for curve compression's overflow control), and quant 7~8 for B-frames. As mentioned several times now, (one of) the main trick(s) is to make smaller quantization steps available to the codec.
I understand this concept, and have heard Syskin talking about it before. What I wonder is why you couldn't go further with it. For example, cut your matrix numbers in half and let Xvid work with even finer granularity (e.g, q7s-q10s).

Is it because INTRA frames seem to require an 8 for the top left number? Or perhaps because your matrix numbers get so small that it is difficult to create a good matrix (not enough granularity available to specify the matrix itself?)

Also, is the 8 in the top left of INTRA required by the spec or is it imposed by Xvid (or is it even just something about the algorithm itself that makes numbers less than 8 make no sense)?

Thanks!

Sharktooth
21st March 2004, 15:50
Originally posted by Chainmax
I was thinking of using 6of9_HVS for both a cartoon rip at 640x480 with an average bitrate of ~1200kbps and for an anime rip at 592x320 with an average bitrate of ~850kbps. From what you are posting, I gather that the cartoon could benefit from 6of9_HVS but the anime rip won't (I'll use HVS_better for that). Thanks for the quick reply :).
I managed to encode some animes (Cowboy bebop) @ 720x576 with an average bitrate of 860kbps (final video size including vorbis audio-> 140Mb - credits were cut off) and h263 matrix. All the 10 encoded episodes (session 1 and 2) are hardly distinguishable from the original DVDs...

Chainmax
21st March 2004, 18:30
Originally posted by LigH:
How much do a cartoon and an anime differ, except that anime is a Japanese cartoon?
Of course, you're right. By "cartoon" I mean stuff like "Simpsons" or "Futurama", by "anime" I mean detailed cartoons. Bad definition, I know.

Originally posted by Sharktooth:
I managed to encode some animes (Cowboy bebop) @ 720x576 with an average bitrate of 860kbps (final video size including vorbis audio-> 140Mb - credits were cut off) and h263 matrix. All the 10 encoded episodes (session 1 and 2) are hardly distinguishable from the original DVDs...
Really? :eek: How detailed is Cowboy Bebop? Somehow I doubt Grave of the Fireflies will be as compressible. I'm still torn on whether to use MPEG matrix + QPel or 6of9_HVS without QPel on the cartoon rip.

Leak
21st March 2004, 20:54
Originally posted by Chainmax
Really? :eek: How detailed is Cowboy Bebop? Somehow I doubt Grave of the Fireflies will be as compressible. I'm still torn on whether to use MPEG matrix + QPel or 6of9_HVS without QPel on the cartoon rip.

Oh, Cowboy Bebop is quite detailed, but then again it's also quite a bit newer than Grave of the Fireflies (1998 vs. 1988), and since it was done using digital ink & paint rather than cels it's got much less noise and a stabler picture than GotF - which of course helps compressibility quite a lot.

Hell, all episodes from my Witch Hunter Robin Vol. 1 DVD came out severely undersized with H.263; the first ep was 122MB in 640x480 *with* 128kBit CBR MP3-Audio, and I was going for a final file size of 170MB... *eg*

np: The Orb - Assassin (7" Mix) (U.F.Off: The Best Of The Orb)

Soulhunter
21st March 2004, 23:52
@LigH

Some progress with your Custom Quant Matrix Editor ???


Btw, how does it work ???

Is it a "simple" editor to write a custom matrix, or has it some thrilling extras like...

- Calculate a modulation of two different matrices

- Calculate a source "matched" matrix through analyzing it

- Estimate compression results by telling it what you got with another matrix

- Give popup tips/warnings while writing a matrix


Bye

Sharktooth
22nd March 2004, 03:13
Originally posted by Chainmax
Really? :eek: How detailed is Cowboy Bebop? Somehow I doubt Grave of the Fireflies will be as compressible. I'm still torn on whether to use MPEG matrix + QPel or 6of9_HVS without QPel on the cartoon rip.
Well, it's quite detailed. However, the compressibility tests didn't go as well as I thought (about 20%). I had to apply some filters to get better results (the Italian DVD source is quite badly compressed, I could clearly see macroblocks here and there :sly: ).
In some cases the backup looks better than the original, especially in flat coloured areas. I also had to apply some specific filters for some scenes and use quant zones to compensate for some weird quantizer distribution.
I had to go with the h263 matrix as well. MPEG or other detailed matrices were producing what I tried to eliminate even from the source - macroblocks.
It was hard work but I'm satisfied with the results.

LigH
22nd March 2004, 09:57
@ Soulhunter:

It is an editor. It helps you create matrices from templates, from other existing matrices, or parametrically. But it is in no way able to tell you if a matrix is good or which compressibility you can expect, or to run a test sequence (to be able to provide that, I would first have to find out how I can control codecs and compress videos inside a Delphi application, which would take time...); I don't think I would even want that. Top of the hill might be exporting a VDub job script, if at all.
__

P.S.: Modulation between matrices... well, I'll think about that point.

Soulhunter
23rd March 2004, 21:14
@LigH

Your Matrix Editor works excellent...


I made this custom matrix with it !!!

http://img8.imageshack.us/img8/4930/SoulhuntersV3.png

Works really nice for my purpose, think Ill stay with it... :)

But how to call it now ???


Bye

LigH
23rd March 2004, 22:32
As you like it; what about "Soulhunters Simple HQ"?

Soulhunter
23rd March 2004, 22:51
Originally posted by LigH
As you like it; what about "Soulhunters Simple HQ"?
Hmm... As its inter matrix is similar to the standard MPEG one, but the intra matrix is changed to give higher quality, I thought of MPEG-EX as an extension to Soulhunters maybe !!!


Bye

LigH
23rd March 2004, 23:13
:sly: The MPEG consortium might not like that name, sounds so official... :rolleyes: (if they ever hear of it)

Soulhunter
23rd March 2004, 23:45
Originally posted by LigH
The MPEG consortium might not like that name...Ohw... :scared:

Originally posted by LigH
...sounds so official...
But also cool... :cool:

Originally posted by LigH
...if they ever hear of it
They are everywhere... :D

Nah, Ill think about it... ;)


Bye

TorgoGuy
24th March 2004, 05:56
Does anybody know why the first number of an Intra frame has to be 8?

Is it in the MPEG4 specs or an XviD requirement?

Thanks in advance!

LigH
24th March 2004, 12:50
In my humble opinion, this default value has to be assumed since MPEG-1, it represents the average base brightness/color of the macro block, and therefore must be quantized rather exactly. But I don't have any quote from the official standards available, sorry...

Soulhunter
24th March 2004, 19:44
Hmm, thought about the name...

I'll simply use the name of the screenshot above !!!

So, its just Soulhunters V3 now... ;)


Bye

Soulhunter
26th March 2004, 19:53
Originally posted by RadicalEd
Well, lower and rightmost values quantize out finer details, while upper left values zero the coarser part of the image.

So, would it be logical to raise the bottom/right values roughly when dealing with toon/anime stuff ???

Coz this kind of content doesn't have as much detail as real movies... :\


Tia, Soulhunter

RadicalEd
26th March 2004, 23:06
Definitely. A good way to understand the empirical side of this is to look at the set of 8x8 DCT basis functions (http://home.uchicago.edu/~orebas/jpeg/DCT.jpg). Essentially, what the DCT does is supply a matrix of amplitude values (with frequency and direction defined by the element placement). Thanks to Fourier's work on wave synthesis, we know these different waves can be added to form any wave possible. So, what the DCT describes then, is the 8x8 pixel values as a set of waves that can be combined to reconstruct the wave/pixels of the original block. That's why if you quantize the higher and medium frequency waves out, you'll wind up with an image full of simple wave-like gradients. All they are are simple cosine waves of luminance. Test it out with a quant matrix, you'll see firsthand what I mean.

With that in mind, lower right values aren't important to anime but top-right and bottom-left are. You need a *lot* of coefficients to reshape a wave to fit and approximate a square, which is what high-contrast sharp lines (like drawn edges) are. For example, a line of luminance pixels where there is an edge may go {191, 173, 169, 0, 0, 0, 120, 135}. Graphed, this makes a line kind of like: ---__--, which as you could imagine, doesn't fit a wave ~ well normally.
That's the reason anime doesn't work so well with the DCT, while natural content works rather well. So an ideal anime matrix may look something like:
x x x x x x x x
x x x x o o x x
x x o o o o o x
x x o o o o o o
x o o o o o o o
x o o o o o o o
x x o o o o o o
x x x o o o o o

where x is stuff you're trying to keep and o is stuff you're dropping more heavily.
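The point about edges needing many coefficients can be tried out in one dimension. The sketch below is a plain-Python orthonormal 8-point DCT (written out by hand, not taken from any codec), applied to the example line from the post and to a smooth gradient that is made up for comparison. It keeps only the three lowest-frequency coefficients and measures how far the rebuilt line is from the original.

```python
import math


def _scale(k, n):
    """Normalization factor of the orthonormal DCT-II."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct_1d(samples):
    """Forward DCT-II of a list of samples."""
    n = len(samples)
    return [
        _scale(k, n) * sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                           for i, x in enumerate(samples))
        for k in range(n)
    ]


def idct_1d(coeffs):
    """Inverse of dct_1d."""
    n = len(coeffs)
    return [
        sum(_scale(k, n) * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k, c in enumerate(coeffs))
        for i in range(n)
    ]


def rms_error_after_truncation(samples, keep):
    """Zero all but the first `keep` coefficients, rebuild, return the RMS error."""
    coeffs = dct_1d(samples)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    rebuilt = idct_1d(truncated)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(samples, rebuilt)) / len(samples))


edge = [191, 173, 169, 0, 0, 0, 120, 135]          # example line from the post
ramp = [100, 110, 120, 130, 140, 150, 160, 170]    # hypothetical smooth gradient

for name, line in (("edge", edge), ("ramp", ramp)):
    print(name, "coefficients:", [round(c, 1) for c in dct_1d(line)])
    print(name, "RMS error with 3 coefficients kept:",
          round(rms_error_after_truncation(line, 3), 1))
```

The gradient is rebuilt almost perfectly from three coefficients, while the hard edge is not, which is why drawn lines suffer when the mid and high frequencies are quantized away.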

/101

Soulhunter
26th March 2004, 23:27
Thanks, well explained... :)

Bye

Chainmax
27th March 2004, 06:40
Would you be willing to try and create a custom matrix for animated material? It would be good to have an alternative to H.263...

BoNz1
27th March 2004, 06:57
Originally posted by Chainmax
Would you be willing to try and create a custom matrix for animated material? It would be good to have an alternative to H.263...

Don't think so, I spent an entire afternoon trying to do this and in the end mine was only marginally better than or worse than H.263. No point in my opinion. And although sometimes there were scenes where my matrix beat H.263 quite handily there would be some scenes where it did look bad. H.263 seems really efficient for animated material. Certainly for low bitrate I think you would be hard pressed to make something better. A high bitrate animated matrix might be kinda cool though...

LigH
27th March 2004, 18:41
In my opinion, creating a matrix for cartoons is a two-sided problem: On one hand, you may want to decrease the influence of noise because usually there are smoothly colored areas. But on the other hand, you may not want to decrease the influence of linear structures too much, because they are necessary to reproduce the black borders (originally drawn in "india-ink") sharply, without too many ringing artefacts (a.k.a. "Mosquitos"). Therefore you see that some elements at the (left to middle) lower and (upper to middle) right edge of RadicalEd's example matrix still have the status "worth to keep".
__

By the way: I still wonder if there is an integer matrix representation of the H.263 quantization.

Soulhunter
27th March 2004, 18:43
Originally posted by Chainmax
Would you be willing to try and create a custom matrix for animated material? It would be good to have an alternative to H.263...

Maybe next month... :rolleyes:

Atm, my PC re-boots every 15 minutes !!!

Need to buy a new PSU the next days... :angry:


Bye

Leak
27th March 2004, 19:55
Originally posted by Soulhunter
@LigH

Your Matrix Editor works excellent...


You know, even if it looks a bit like a matrix editor it really is Q-Bert in disguise... :D ;)

(@LigH - every program needs an easter egg, so why not this? :p)

np: Deadbeat - Portable Memory (Something Borrowed, Something Blue)

Didée
28th March 2004, 00:40
*knock knock*

A little cooldown before the excitement goes overboard:

You are all aware of the fact that the equation
"high DCT frequencies" == "fine image detail"

is way too simplistic, yes? It's only true for I-frames. For 99% of all frames, the codec is not coding picture images, but instead "error maps" of what is left over from the codec's motion compensation. And these error maps have very little in common with the original image ...
It may happen that detail, even fine detail, "vanishes" in the ME, but mostly it will not. It is much more likely that the ME will *create* frequencies on its own that are not present in the original image: imagine a stick thrown into the air, rotating on its way. Orthogonal based ME will fail to catch the rotating stick, and produce other frequencies (except if the stick is coded into INTRA blocks).
And since it is not clear what the ME routines will spit out, I consider it dangerous to say "this kind of content needs these frequencies, that content needs those frequencies". Simply because after ME there can appear *all* kinds of frequencies.

- Didée

RadicalEd
28th March 2004, 00:53
I plead guilty to the fact that my advice was only directed at the intra matrix. Guess I should have mentioned so. Inter I haven't even given a thought to at this point :|

It stands, though, that most high frequency detail in an interblock is going to be noise, especially with anime. Other than that, I can't think of any good temporal motion generalizations at the moment. So I guess my matrix example is kind of still applicable, but on a more restricted basis.

Soulhunter
28th March 2004, 18:56
So for the intra matrix something like this...

X X X X X X X X
X X X O O O X X
X X O O O O O X
X O O O O O O O
X O O O O O O O
X O O O O O # #
X X O O O # # #
X X X O O # # #

X = Low values
O = Med values
# = High values
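A mask like this can be turned into a numeric matrix mechanically. A minimal sketch: the three numbers behind "low", "med" and "high" are placeholders picked for illustration, not tested values, and the top-left entry is forced to 8 because the thread says an intra matrix has to start with 8.

```python
MASK = """\
X X X X X X X X
X X X O O O X X
X X O O O O O X
X O O O O O O O
X O O O O O O O
X O O O O O # #
X X O O O # # #
X X X O O # # #
"""

# Hypothetical values for the three classes; tune them by testing.
LEVELS = {"X": 12, "O": 24, "#": 48}


def mask_to_matrix(mask, levels, dc=8):
    """Replace every mask symbol by its value and pin the top-left entry."""
    matrix = [[levels[symbol] for symbol in line.split()]
              for line in mask.strip().splitlines()]
    matrix[0][0] = dc
    return matrix


intra = mask_to_matrix(MASK, LEVELS)
for row in intra:
    print(" ".join(f"{value:3d}" for value in row))
```

Keeping the mask and the numbers separate makes it easy to try different low/med/high values without redrawing the pattern.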

And for the inter matrix... :confused:

Playing around n' comparing the results !!!

But my amount of this content is very limited, so I would need some ppl to test it as well... :D

Btw, In some white-papers I read that the maximum allowed value for matrices is 62... Is this really right ???


Bye

RadicalEd
28th March 2004, 20:08
XviD allows up to 255, but in some jpeg examples I've seen > 300. That's jpeg, of course, so it may or may not be applicable to mpeg 4.

Soulhunter
28th March 2004, 20:14
Originally posted by RadicalEd
XviD allows up to 255, but in some jpeg examples I've seen > 300. That's jpeg, of course, so it may or may not be applicable to mpeg 4.
Ok, I know that bigger values are decode-able... :o

But I meant... Is it also MPEG4-standard conform ???


Bye

RadicalEd
28th March 2004, 20:26
The xvid devs are the only guys I know with the spec, so I'll trust what they're putting out :|

MfA
28th March 2004, 22:05
intra_quant_mat: This is a list of 2 to 64 eight-bit unsigned integers.

So 1-255 it is.

For the encoder it would be nice if you could enter 0s in the matrix though, which the encoder should then interpret as "always 0 this coefficient" (in the bitstream it should put 255, which is the closest to the divisor actually used ... namely infinity).

LigH
28th March 2004, 22:20
NOTE: CCE uses signed short integers (unfortunately), so for CCE the maximum is 127. I read that somewhere in a hint shipped with an other program...
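The limits mentioned in the last few posts can be collected into a small sanity check. This is a sketch built only on what the thread says: entries from 1 to 255, an intra matrix starting with 8, and an optional lower ceiling (127 is the CCE figure quoted above, which is hearsay). The function name and messages are made up.

```python
def check_matrix(matrix, kind, max_value=255):
    """Return a list of problems found in an 8x8 matrix ('intra' or 'inter')."""
    if len(matrix) != 8 or any(len(row) != 8 for row in matrix):
        return ["matrix must be 8x8"]
    problems = []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if not 1 <= value <= max_value:
                problems.append(f"entry ({r},{c}) = {value} is outside 1..{max_value}")
    if kind == "intra" and matrix[0][0] != 8:
        problems.append("intra matrix should start with 8")
    return problems


flat = [[16] * 8 for _ in range(8)]      # flat example matrix
loud = [row[:] for row in flat]
loud[7][7] = 128                          # one entry above the stricter ceiling

print(check_matrix(flat, "inter"))
print(check_matrix(flat, "intra"))
print(check_matrix(loud, "inter", max_value=127))
```

Passing a different `max_value` lets the same check be reused for tools with a stricter limit.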

RadicalEd
28th March 2004, 22:44
Originally posted by MfA
intra_quant_mat: This is a list of 2 to 64 eight-bit unsigned integers.

2 - 64? o_O How can mpeg 4 use a < 64 element quantization matrix?

Soulhunter
28th March 2004, 23:24
Originally posted by MfA
So 1-255 it is.

Thanks for clarification... :)


@LigH

Dont wanna bother you, but...

Maybe some progress with your nice tool ???


Bye

LigH
28th March 2004, 23:30
Yes, quite some. The current state is an unofficial "RC1", and there is no more feedback, so I guess I'll release it during this next week (how about a "not April joke"?).

Soulhunter
28th March 2004, 23:46
Originally posted by LigH
Yes, quite some. The current state is an unofficial "RC1", and there is no more feedback, so I guess I'll release it during this next week (how about a "not April joke"?).

Can't wait for it... :D


Btw, think I will start with this as intra...


8 8 10 10 12 14 16 16
8 10 10 12 14 16 16 16
10 10 12 14 16 16 18 18
10 12 14 16 16 18 24 24
12 14 16 16 18 24 32 32
14 16 16 18 24 32 32 64
16 16 18 24 32 32 64 96
16 16 18 24 32 64 96 128


Bye

Selur
29th March 2004, 09:08
@LigH:

It would be nice to be able to '(re)load' an intra/inter matrix from a file after the initial creation (via 'new'), so one could e.g. keep the changes to the inter matrix and just reload the intra matrix.

Another nice gimmick feature would be the ability to export/save and import/load as clean txt, so it would be easier to copy&paste matrices to or from a message.

Cu Selur

LigH
29th March 2004, 17:11
Keep existing intra or inter matrix - oh, you are right, this will be useful!

Import / export "comma separated values" - would be useful, too.

And I'm about to look if I can import / export Nero Vision matrices, if I can find out where they are stored. I'm just installing it; but you can help and tell me if you already know the location.

Soulhunter
29th March 2004, 18:51
I forgot... Maybe you could add drag&drop support too ???


Bye

Selur
30th March 2004, 09:45
so back to the main topic:

does anyone have some good advice on how to tune the inter matrix ?
Normally I make it somewhat similar to the intra matrix, which sometimes does a good job and sometimes doesn't ;)

Cu Selur

Soulhunter
30th March 2004, 18:51
Originally posted by Selur
so back to the main topic:

does anyone have some good advice on how to tune the inter matrix ?
Normally I make it somewhat similar to the intra matrix, which sometimes does a good job and sometimes doesn't ;)

Cu Selur
Maybe just imagination, but I found a flat one pleasing for low compression rates... :)

But for usual (medium) compression, INTRA = INTER works quite well, I think !!!

But my "tests" were somewhat limited... Only a handful of sources till now... ;)


Bye

Soulhunter
31st March 2004, 21:17
Originally posted by Soulhunter

Maybe just imagination, but I found a flat one pleasing for low compression rates... :)

But for usual (medium) compression, INTRA = INTER works quite well, I think !!!

But my "tests" were somewhat limited... Only a handful of sources till now... ;)


Bye
No comments on this "presumably bad" post... :rolleyes:


Bye

Soulhunter
1st April 2004, 20:58
Nah, so here is my first attempt to make a "good" anime matrix... :rolleyes:

http://img8.imageshack.us/img8/5351/Soulhunters-Anime-Matrix-V1.png

Still needs some testing to prove itself... Anyone interested ???


Bye

LigH
1st April 2004, 21:37
Export it as "Text (CSV)" using the CQME 1.0, then I'll check; I won't type the values from the image, no way!

Soulhunter
1st April 2004, 22:09
Nice, could I mail it to you... :)

LigH
1st April 2004, 22:19
Just post it here (code).

Soulhunter
1st April 2004, 22:29
Originally posted by LigH
Just post it here (code).
Ahh, yeah, didn't think of this possibility... :D


Soulhunters V3
8 8 10 12 14 16 18 20
8 10 12 14 16 18 20 22
10 12 14 16 18 20 22 24
12 14 16 18 20 22 24 26
14 16 18 20 22 24 26 28
16 18 20 22 24 26 28 30
18 20 22 24 26 28 30 32
20 22 24 26 28 30 32 32

16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16
16 16 16 16 16 16 16 16

Soulhunters Anime
8 8 10 10 12 14 16 16
8 10 10 12 14 16 16 16
10 10 12 14 16 16 18 18
10 12 14 16 16 18 24 24
12 14 16 16 18 24 32 32
14 16 16 18 24 32 32 64
16 16 18 24 32 32 64 96
16 16 18 24 32 64 96 128

16 16 16 18 22 24 28 28
16 16 18 22 24 28 28 28
16 18 22 24 28 28 32 32
18 22 24 28 28 32 38 48
22 24 28 28 32 38 56 56
24 28 28 32 38 56 64 64
28 28 32 38 56 64 82 96
28 28 32 48 56 64 96 128

Bye
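
Matrices posted as plain text like the ones above can be read back into rows programmatically. This is a minimal sketch, assuming 64 integers separated by spaces, commas, or line breaks; the actual file format written by CQME is not described in this thread, and the helper names `parse_matrix` and `format_matrix` are made up for illustration.

```python
def parse_matrix(text):
    """Parse 64 integers (space, comma, or newline separated) into 8 rows of 8."""
    values = [int(token) for token in text.replace(",", " ").split()]
    if len(values) != 64:
        raise ValueError(f"expected 64 values, got {len(values)}")
    return [values[row * 8:(row + 1) * 8] for row in range(8)]


def format_matrix(rows):
    """Write a matrix back as 8 lines of space-separated values."""
    return "\n".join(" ".join(str(value) for value in row) for row in rows)


# First matrix listed under "Soulhunters V3" in the post above.
V3_FIRST = """\
8 8 10 12 14 16 18 20
8 10 12 14 16 18 20 22
10 12 14 16 18 20 22 24
12 14 16 18 20 22 24 26
14 16 18 20 22 24 26 28
16 18 20 22 24 26 28 30
18 20 22 24 26 28 30 32
20 22 24 26 28 30 32 32
"""

matrix = parse_matrix(V3_FIRST)
print(matrix[0])
print(format_matrix(matrix) == V3_FIRST.strip())
```

Accepting commas as well as whitespace means the same parser handles both a forum paste and a simple comma-separated export.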

Zarxrax
1st April 2004, 23:58
I have a few questions about custom matrices.

First of all, what matrix does H.263 use? I was under the impression that the intra frame was all 16's and inter frame was all 32's, but that seems not to be the case.

And next, what is the best way to test matrices that you create? With a changing filesize, I don't really see how to make an accurate comparison.

RadicalEd
2nd April 2004, 05:07
Like I said on AIM (omg we are such losers), do it RD-style. Take a representative few sample frames, compress using a bunch of matrices, and compare size to SSIM or some quality metric. That should give the most suited matrix regardless of quant differences.

sysKin
2nd April 2004, 06:07
Originally posted by RadicalEd
Like I said on AIM (omg we are such losers), do it RD-style. Take a representative few sample frames, compress using a bunch of matrices, and compare size to SSIM or some quality metric. That should give the most suited matrix regardless of quant differences.
I think it can be proven that the H.263 quant type gives the highest PSNR. Just think in frequency terms: if any single frequency is treated worse than the others, the error at that frequency will be higher than at any other, and the peak error will be there.

I might be wrong but there is something in it.

Radek

MfA
2nd April 2004, 08:19
No, an all-8 matrix gives better MSE. Of course, in an R/D sense all bets are off entirely: if the distribution of the coefficients after quantization with a flat matrix isn't the one for which the VLC is optimal (and it practically never is), a flat quantizer matrix will not be ideal.

Radical, the problem is that the R/D curve is by no means a flat line. A given matrix can be the best for a given size target, and not another. So in the extreme case you'd have to test encode with all matrices, there are shortcuts ... but no real systematic way.
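
The size-versus-quality comparison discussed in the last few posts can be sketched as follows: encode the same sample at several quantizers per matrix, record (size, quality) pairs, and compare the matrices at equal file size by interpolating along each curve. All names are hypothetical and the numbers are made-up placeholders chosen so that the two curves cross; they are not measurements.

```python
from bisect import bisect_left


def quality_at_size(points, target_size):
    """Linearly interpolate quality at target_size from (size, quality) samples."""
    pts = sorted(points)
    sizes = [size for size, _ in pts]
    if not sizes[0] <= target_size <= sizes[-1]:
        raise ValueError("target size is outside the measured range")
    i = bisect_left(sizes, target_size)
    if sizes[i] == target_size:
        return pts[i][1]
    (s0, q0), (s1, q1) = pts[i - 1], pts[i]
    return q0 + (q1 - q0) * (target_size - s0) / (s1 - s0)


def best_matrix(curves, target_size):
    """Name of the matrix with the highest interpolated quality at target_size."""
    return max(curves, key=lambda name: quality_at_size(curves[name], target_size))


# Placeholder data: size in MB, quality as some score where higher is better.
curves = {
    "matrix_a": [(2.0, 0.90), (4.0, 0.95), (6.0, 0.96)],
    "matrix_b": [(2.0, 0.88), (4.0, 0.94), (6.0, 0.98)],
}

print(best_matrix(curves, 3.0))
print(best_matrix(curves, 5.0))
```

With these placeholder curves the winner changes between the two size targets, which is the situation described above: a matrix can be best for one size target and not for another.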

LigH
2nd April 2004, 15:18
I asked the same question as Zarxrax (H.263 matrix) a few times. Until today, no one could explain, if there is a matrix representation, or why H.263 may be so different that this would be impossible...

Soulhunter
2nd April 2004, 22:00
From some H.263 white paper... :D
In baseline H.263, quantization is performed using the same step size within a macroblock by working with a uniform quantization matrix. Except for the first coefficient of an intra block which is coded using a step size of eight, even quantization levels in the range from 2 to 62 are allowed.
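
One way to read the quoted sentence: the step size is twice the quantizer, and the quantizer runs from 1 to 31, which gives exactly the even values 2 to 62. The sketch below spells that out; the 1-31 range is an assumption taken from how the H.263 quantizer parameter is usually described, and the function name is made up.

```python
INTRA_DC_STEP = 8  # fixed step size for the first coefficient of an intra block


def h263_step_size(quant):
    """Uniform step size for all other coefficients (assumed: step = 2 * quant)."""
    if not 1 <= quant <= 31:
        raise ValueError("quantizer is assumed to lie in 1..31")
    return 2 * quant


steps = [h263_step_size(quant) for quant in range(1, 32)]
print(steps[0], steps[-1], len(steps))
```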

Zarxrax
2nd April 2004, 22:30
In baseline H.263, quantization is performed using the same step size within a macroblock by working with a uniform quantization matrix. Except for the first coefficient of an intra block which is coded using a step size of eight, even quantization levels in the range from 2 to 62 are allowed.

But this does not seem to be the case. I performed a test on H.263 intra block matrix, and the result was 3.84 MB.
I then tested matrices with uniform coefficients of 22,23, & 24. The 24 matrix produced a file slightly smaller than the H.263 matrix, and the 22 and 23 matrices produced larger files. Therefore, it can't be possible that H.263 is using a uniform matrix, or else one of these matrices would have matched it.

LigH
2nd April 2004, 22:35
Shade wrote me a PM where he described the reason; but I wonder why he didn't write it here, in the thread:



Shade wrote on 2nd April 2004 18:12:
I asked the same question as Zarxrax (H.263 matrix) a few times. Until today, no one could explain, if there is a matrix representation, or why H.263 may be so different that this would be impossible...
Because H.263 uses a totally different quantization function from MPEG quantization, so no MPEG matrix performs like H.263.

The H.263 quantization:
intra:
QF[u][v] = |F[u][v]| / (2*quantizer_scale)

inter:
QF[u][v] = (|F[u][v]|/2 - quantizer_scale) / (2*quantizer_scale)


MPEG quantization:
intra:
QF[u][v] = (F[u][v]*16) / (W[u][v]*2*quantizer_scale)

inter:
QF[u][v] = (F[u][v]*16/W[u][v] - k*quantizer_scale) / (2*quantizer_scale)

k = sign(QF[u][v])


For more info, see skal's webpage
http://skal.planet-d.net/coding/quantize.html

Search for "Experiments"; there you will find the C program "quantization/dequantization functions" and a performance analysis.

Soulhunter
2nd April 2004, 23:09
So, we would need a custom H.263 too... :p

RadicalEd
2nd April 2004, 23:18
Oh, so H.263 doesn't *have* a quantization matrix to begin with, that explains why none are around to be seen :P
Thanks for the heads up.

Odd, according to Shade's equation, an MPEG matrix of straight 16's should match H.263 at all quants.

F(u,v) / (2 * Q) = (F(u,v) * 16) / (16 * 2 * Q)

The 16's cancel. This is, again, not the case.
The mystery continues...
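
The cancellation argument can be checked with exact fractions. This is a minimal sketch of the two intra formulas exactly as posted by Shade, evaluated without any rounding or truncation and for non-negative coefficients only; the function names are made up.

```python
from fractions import Fraction


def h263_intra(f, q):
    """H.263 intra formula as posted: |F| / (2*Q)."""
    return Fraction(abs(f), 2 * q)


def mpeg_intra_as_posted(f, w, q):
    """MPEG intra formula as posted: F*16 / (W*2*Q)."""
    return Fraction(f * 16, w * 2 * q)


# With W = 16 everywhere the two expressions agree for every coefficient
# and every quantizer tried here.
identical = all(
    h263_intra(f, q) == mpeg_intra_as_posted(f, 16, q)
    for f in range(0, 2048)
    for q in range(1, 32)
)
print(identical)
```

So, taken at face value, the posted formulas predict identical results for a flat matrix of 16s, which is exactly what the encoding tests contradict.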

RadicalEd
4th April 2004, 00:56
Ah-ha, the equation given was wrong. It should read:


QF[u][v] = (F[u][v]*16 + W[u][v]*quantizer_scale) / (W[u][v]*2*quantizer_scale)


For which there is no real equivalent to H.263. Oh well? :|
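
A short check of what the corrected formula does with a flat matrix of 16s: the extra term works out to a constant +1/2 on top of the H.263 intra expression, so the two no longer coincide. This is a sketch under two assumptions: the corrected formula is the MPEG intra case, and the final integer level is obtained by truncation. Function names are made up.

```python
from fractions import Fraction


def h263_intra(f, q):
    """H.263 intra formula as posted earlier in the thread: |F| / (2*Q)."""
    return Fraction(abs(f), 2 * q)


def mpeg_intra_corrected(f, w, q):
    """Corrected formula: (F*16 + W*Q) / (W*2*Q)."""
    return Fraction(f * 16 + w * q, w * 2 * q)


# Collect every difference between the two formulas for W = 16.
offsets = {
    mpeg_intra_corrected(f, 16, q) - h263_intra(f, q)
    for f in range(0, 256)
    for q in range(1, 32)
}
print(offsets)

# Assumed truncation to an integer level, for F = 3 and Q = 2.
print(int(h263_intra(3, 2)), int(mpeg_intra_corrected(3, 16, 2)))
```

Algebraically, (16*F + 16*Q) / (32*Q) = F / (2*Q) + 1/2, so under the truncation assumption the corrected formula rounds to the nearest level where the H.263 expression rounds down. That is one plausible reason why a flat matrix of 16s does not reproduce H.263 results.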