Understanding a Quantization Matrix
crusty
24th May 2003, 20:19
Ok Folks, here is an explanation of what a matrix really does.
With Xvid you can not only use different built-in Quantization Matrices, like H.263 and Mpeg, but you can also use custom matrices either made by yourself or someone else.
Since few people actually understand what a matrix does, I will try to give an explanation here that is possible to understand even if you're not a math expert.
You will all have heard about macroblocks by now. These are the 16x16 blocks of which a single MPEG-4 frame is composed.
Each macroblock groups four 8x8 brightness blocks together, plus the matching color blocks. These 8x8 blocks form the basis of MPEG-4 compression.
Instead of being composed of proper pixels, like a normal bitmap or a film still, a block is more like a representation of a complex formula that tries to mimic the content of the original picture as best as it can.
The human eye is much more sensitive to changes in brightness (luminance) than to changes in color.
So MPEG-4 uses a type of color space in its file structure that assigns fewer bits to color changes than it does to brightness changes.
An 8x8 block is not stored as pixels. It is stored as a single value which represents the average brightness (or color) value, and all the remaining values are mathematical representations of the amount of variation from this average value for the whole block.
To put it in other words, you have one basic mean, or average value, and all the variation, or detail of the picture, is represented by the end result of a certain complex formula.
The other places in the 8x8 block, which we invariably but inaccurately call pixels, represent different types of detail, and especially the variation of this detail.
Let's take a look at one block:
X X X X X X X X
X X X X X X X X
X X X X X X X X
X X X X X X X X
X X X X X X X X
X X X X X X X X
X X X X X X X X
X X X X X X X X
The first place in the upper left corner represents the average, or mean value. So if the whole block is dark brown or light red on average, this place says so.
Going from this place to the right or down, we get representations of the amount of variation from this value.
Now this is hard to grasp, so pay attention.
Going from left to right, or from top to bottom, the amount of detail represented gets higher.
If you take the original picture of 8x8 pixels, the amount of detail is transformed into values depending on a certain frequency at which the detail is present.
Finer detail is represented by higher frequencies.
So, the further you go to the right in the block, the higher the frequency (or the finer the detail).
Let's take three examples:
An 8x8 picture with one big iron bar in it has little detail, so it has a very low frequency.
An 8x8 picture with four broomsticks in it has some detail, so it has a medium frequency.
An 8x8 picture with rain, a cornfield or hair in it, has lots of detail in it and so it has a very high frequency.
As you can guess by now, the values in a block represent both the horizontal and vertical frequency in the original picture.
The formula that does the translation or transformation from detail to frequency is called the Discrete Cosine Transform, or DCT.
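To make the idea concrete, here is a minimal sketch of an 8x8 DCT in plain Python. It is only an illustration, not the codec's actual implementation: the scaling convention is an assumption (an orthonormal DCT-II), and the names `dct_1d`, `dct_2d` and `largest_ac` are made up for this example. It shows that a flat block ends up with a single non-zero value in the upper left corner, while fine vertical stripes put their biggest value at the far right of the top row.

```python
import math

N = 8  # block size used throughout the thread

def dct_1d(values):
    """Orthonormal 1-D DCT-II of N samples (one common scaling convention)."""
    result = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        total = sum(values[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                    for n in range(N))
        result.append(scale * total)
    return result

def dct_2d(block):
    """2-D DCT: transform every row, then every column of the row results."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d([rows[y][x] for y in range(N)]) for x in range(N)]
    # cols[x][y] -> output[y][x], so [0][0] stays in the upper left corner
    return [[cols[x][y] for x in range(N)] for y in range(N)]

def largest_ac(coeffs):
    """Position (row, col) of the biggest coefficient other than the upper left one."""
    positions = [(r, c) for r in range(N) for c in range(N) if (r, c) != (0, 0)]
    return max(positions, key=lambda p: abs(coeffs[p[0]][p[1]]))

# A flat grey block: no detail at all.
flat = [[100] * N for _ in range(N)]
# One-pixel-wide vertical stripes: the finest horizontal detail possible.
stripes = [[255 * (x % 2) for x in range(N)] for _ in range(N)]

flat_c = dct_2d(flat)
stripes_c = dct_2d(stripes)

# With this scaling the upper left value is 8 times the average pixel value.
print(round(flat_c[0][0], 3))
print(largest_ac(stripes_c))
```

Note that with this scaling the upper left value is proportional to the average rather than equal to it, which does not change the argument.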
So when looking at a block in the way we have just come to understand it, we can see the following:
                          Average brightness and color of the whole block
                          I
                          I  Low frequency (bigger) detail
                          I  I
                          I  I  Medium frequency (normal) detail
                          I  I  I
                          I  I  I  Higher frequency (fine) detail
                          I  I  I  I
                          X--X--X--X--X--X--X--X
                          X--\--X--X--X--X--X--X
Low frequency detail ---- X--X--\--X--X--X--X--X
                          X--X--X--\--X--X--X--X
Medium frequency detail - X--X--X--X--\--X--X--X
                          X--X--X--X--X--\--X--X
                          X--X--X--X--X--X--\--X
High frequency detail --- X--X--X--X--X--X--X--A
                                               I
                                               I
                          Mathematical representation of the finest
                          detail, both horizontally and vertically
EDIT: Finally used the code tag...
Ok, now we understand how a block is built.
So how do quantization matrices come into play?
A quantization matrix (QM from now on) looks something like this:
08 16 19 22 26 27 29 34
16 16 22 24 27 29 34 37
19 22 26 27 29 34 34 38
22 22 26 27 29 34 37 40
22 26 27 29 32 35 40 48
26 27 29 32 35 40 48 58
26 27 29 34 38 46 56 69
27 29 35 38 46 56 69 83
Now there is actually a rather complex process behind this, but I'll try to describe it in a simple manner:
Every value in the QM acts as the threshold for the DCT value in the same position of the block.
All detail below the threshold will not be regarded as detail and will NOT be encoded. It will simply be discarded.
Now you can understand why Xvid is a so called lossy codec.
It throws detail away. The amount of detail thrown away is determined by the QM.
As you can see, the farther to the right and the farther to the bottom, the higher the threshold gets.
The finer the detail, the more it has to stand out from the rest of the picture to be encoded and not thrown away.
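The threshold behaviour can be sketched in a few lines of Python. This is a simplification under stated assumptions: each DCT value is divided by the matrix entry in the same position and rounded to the nearest whole number, and the per-frame quantizer and any special handling of the upper left value are ignored. The helper names and the test block (the same weak value of 12 at every frequency) are hypothetical.

```python
# Matrix copied from the post above.
QM = [
    [8, 16, 19, 22, 26, 27, 29, 34],
    [16, 16, 22, 24, 27, 29, 34, 37],
    [19, 22, 26, 27, 29, 34, 34, 38],
    [22, 22, 26, 27, 29, 34, 37, 40],
    [22, 26, 27, 29, 32, 35, 40, 48],
    [26, 27, 29, 32, 35, 40, 48, 58],
    [26, 27, 29, 34, 38, 46, 56, 69],
    [27, 29, 35, 38, 46, 56, 69, 83],
]

def quantize_value(coeff, step):
    """Divide by the matrix entry and round half away from zero (assumed rule)."""
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) / step + 0.5)

def quantize_block(coeffs, qm):
    """Quantize every value of an 8x8 block with the matching matrix entry."""
    return [[quantize_value(c, q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qm)]

# Hypothetical block: equally weak detail (12) at every frequency.
coeffs = [[12] * 8 for _ in range(8)]
quantized = quantize_block(coeffs, QM)

# Count how many values survive (are not rounded down to zero).
survivors = sum(1 for row in quantized for v in row if v != 0)

for row in quantized:
    print(row)
print(survivors, "of 64 values survive")
```

Under these assumptions only the values in the upper left region survive. The same amount of variation placed at a high frequency is rounded to zero and never reaches the file.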
So say you have a picture of a girl with long blond hair standing in front of a very light gray wall and behind two prison bars. (don't we all love to see that now) :D
Of course it's hard to portray that in an 8x8 picture, but bear with me for a minute.
The blond girl and the wall will have little contrast between them, so the difference between the average values for brightness and contrast and the maximum values will not be that much.
The values will be lower and will not go over the threshold that often. This means less difference has to be encoded, and the picture will have high compressibility.
If the wall had been black, the contrast would be much higher and the difference between the average value (sort-of-grey) and the extremes (blond and black) would be much higher. So many more values would go over the threshold and would have to be encoded. So higher-contrast scenes result in lower compressibility, which we already know of course.
Now the two black prison bars in front of the girl are detail, albeit not very fine. So they get a low frequency and they will be encoded if their difference from the average goes over the threshold.
So assuming they're not blond, they will be encoded. :cool:
Now the girl has hair, and the texture of hair is of course very fine. So the hair has a lot of detail and gets a very high frequency.
As you can see in our QM, the threshold for high frequencies is much higher than for low frequencies. So the fine detail (the hair) has to differ much more from the average values to be encoded.
So unless the difference is very high, which in this case we assume isn't, the details of the hair won't be encoded.
The matter would be different if she has something like coloured streaks in her hair which would increase the contrast.
So the end result is that the finer the detail, the bigger the contrast of this detail needs to be from the average values of the picture, to be encoded.
This is of course on a per-block basis and not on the picture as a whole, which generally consists of more than one 8x8 block. Let's hope you can see through the simplification.
Now you can understand why some matrices soften the picture, like H.263, while others like Mpeg produce a sharper picture.
The values in one QM simply give finer detail a lower threshold, so that QM is more likely to encode finer detail, at the price of compressibility.
You can also see that the QM that I took as an example isn't a very high compressibility matrix; the values are rather low in general.
Some other points:
-End credits usually have very little detail, so you could design a QM especially for this, with VERY high compressibility.
-A heavy compression matrix simply ups all the finer detail values so that less of that detail gets encoded.
-You could design specific matrices for specific type of content.
You could design matrices especially for sci-fi space adventures, Anime and animation movies, and jungle scenes.
-If you know the exact frequency of interlacing artifacts, you could up their threshold to filter them out!
-Same might work for other types of noise and artifacts.
-I don't know if the one QM is meant for both luminance and color information. I assume it is, but separate QMs for luminance and color would produce higher tweakability. (Don't know if it would be MPEG-4 compliant though, or if it's possible at all).
--------------------------------------------------------------------
Well that's it folks!
Hope I didn't make too many errors, and please correct me if and where I'm wrong.
Cheers,
Crusty
EDIT:
Edited some spelling errors
Removed part about Modulated QM
Reedit: altered macroblock to block. A macroblock is 4 8x8 blocks grouped together.
Try using the [code] tag for nonproportional text.
Acaila
24th May 2003, 22:15
Modulated QM uses different matrices for different types of scenes within one clip. For different parts of the clip, it takes the QM with the highest compression for that particular clip.
- Modulated uses the MPEG matrix when a frame's quantizer is <=3, and the H.263 matrix when the quantizer is >=4.
- New Modulated HQ uses the exact opposite, so the H.263 matrix when a frame's quantizer is <=3, and the MPEG matrix when the quantizer is >=4.
So it's not as smart as you might have hoped, but it's still quite effective.
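As a small sketch, the two selection rules above could be written like this in Python. The function names are made up for illustration, and this is not Xvid's actual code.

```python
def modulated(quantizer):
    """Modulated: MPEG matrix for quantizers up to 3, H.263 from 4 upward."""
    return "MPEG" if quantizer <= 3 else "H.263"

def new_modulated_hq(quantizer):
    """New Modulated HQ: the exact opposite choice."""
    return "H.263" if quantizer <= 3 else "MPEG"

for q in (2, 3, 4, 5):
    print(q, modulated(q), new_modulated_hq(q))
```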
crusty
24th May 2003, 22:20
Ok, removed that part...
Rest of it is OK?
Acaila
24th May 2003, 22:30
As far as I know everything else is exactly correct :).
snowbeach
25th May 2003, 01:27
Very good explanation crusty! Maybe I should add this to my guide! If I am allowed? :D
MrBunny
25th May 2003, 02:07
Originally posted by crusty
Some other points:
-End credits usually have very little detail, so you could design a QM especially for this, with VERY high compressability.
As Syskin says in this post: http://forum.doom9.org/showthread.php?s=&postid=305827#post305827 changing quant types (in this case to a new quant matrix for the credits) is not MPEG-4 compliant, though it is xvid compliant if on an i-frame. Just a little something to note.
Other than that, it's a really nice comprehensive guide. You also might want to make a quick reference to quantization error (from quantization rounding). It doesn't have much to do with compressibility, but might affect image quality.
Keep up the good work :)
crusty
25th May 2003, 02:44
@Snowbeach:
Permission granted :D :D
Have fun with it.
MrBunny
You also might want to make a quick reference to quantization error (from quantization rounding). It doesn't have much to do with compressibility, but might affect image quality.
I would if I understood it. I know that rounding exists but I don't know the effect. :confused:
BTW, is your first name Energizer ? :D
MrBunny
25th May 2003, 04:00
From what I recall of quantization and inverse quantization:
Quantization: Quantized coeff = Floor(DCT coeff/QM coeff)
Inverse Quantization: Reconstructed coeff = (Quantized coeff + 0.5) * QM coeff
The problem arises with the rounding error involved in the floor and using the midpoint value of each coefficient to reconstruct.
Consider the DCT coefficients 75, 88, and 99 with a QM coefficient of 25. Each will result in a quantized coefficient of 3 and each will reconstruct to 87.5. As you can see, there is significant error.
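The same arithmetic as a short Python sketch, using exactly the two formulas given above (floor on the way in, midpoint on the way out). The function names are only for illustration.

```python
import math

def quantize_floor(dct_coeff, qm_coeff):
    """Quantized coeff = Floor(DCT coeff / QM coeff)."""
    return math.floor(dct_coeff / qm_coeff)

def reconstruct_midpoint(level, qm_coeff):
    """Reconstructed coeff = (Quantized coeff + 0.5) * QM coeff."""
    return (level + 0.5) * qm_coeff

qm_coeff = 25
results = []
for dct_coeff in (75, 88, 99):
    level = quantize_floor(dct_coeff, qm_coeff)
    rebuilt = reconstruct_midpoint(level, qm_coeff)
    results.append((dct_coeff, level, rebuilt, abs(rebuilt - dct_coeff)))

for original, level, rebuilt, error in results:
    print(original, "->", level, "->", rebuilt, "error", error)
```

All three inputs land on the same level and the same reconstructed value, so the error is anywhere from 0.5 to 12.5, that is, up to half the QM coefficient.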
However, I don't visualize well in the frequency domain and don't know how much of an effect it would have in the spatial domain (after iDCT), nor exactly what type.
I guess at least one of the lessons that should be taken away is that you should be really careful with putting really high coefficients in a QM, since large coeffs will result in large error (when the QM fails to zero that particular coefficient). If you do want to simply zero a certain number of the highest frequency values, you might want to have a look at the avisynth based DCTfilter (http://forum.doom9.org/showthread.php?s=&threadid=38539 and http://forum.doom9.org/showthread.php?s=&threadid=45695) then use a "normal" quant type.
I think quantization error is one of the reasons QMs look like they do (smaller values for low freq and increasingly larger for higher freq). You need to have the most accurate reconstruction possible of the most important coefficients (thus small quants/low error for low freq), whereas it is less important for the less important high freq coeffs (high quants for high freq with hopes of zeroing).
I hope this isn't too confusing ;)
Acaila
25th May 2003, 11:42
@MrBunny:
IIRC JPEG uses flooring, but MPEG uses rounding after quantization. Mainly because rounding to nearest integer results in a smaller quantization error.
...is that you should be really careful with putting really high coefficients in a QM, since large coeffs will result in large error
But high values in the matrix will drive more coefficients to zero. Because quantization matrix values are 8-bit values, I believe the highest you can put in there is 256. If you want a certain coefficient to become zero, why shouldn't you just put 256 in that position?
MrBunny
25th May 2003, 18:49
@Acaila
I said to be careful about doing it, not to not do it at all :)
And I was thinking in terms of JPEG compression, I was unaware that MPEG handled quantization differently.
Can DCTed coefficients be larger than 256 (or is it just 128 if it rounds)? If they can, then even a QM with a 256 coefficient isn't guaranteed to zero them. In that case I still think that using the DCTfilter to zero coefficients is a better alternative than using a custom quant matrix. Otherwise, putting 256 should be sufficient.
Acaila
25th May 2003, 19:33
Yes in theory DCT coefficients can easily be larger than 256, however except for the top-left DC coefficient almost all other AC coefficients will be quite low in practice. So I don't see anything wrong in using that value in a matrix, however since I've never tried it out myself I could be wrong of course :).
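A quick numerical check of how large the values can get, as a sketch only. It assumes 8-bit pixels (0 to 255) and an orthonormal 8x8 DCT-II. The two test blocks are deliberately extreme and are not typical picture content.

```python
import math

N = 8

def dct_1d(values):
    """Orthonormal 1-D DCT-II of N samples (assumed scaling convention)."""
    result = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        total = sum(values[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                    for n in range(N))
        result.append(scale * total)
    return result

def dct_2d(block):
    """2-D DCT: rows first, then columns."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d([rows[y][x] for y in range(N)]) for x in range(N)]
    return [[cols[x][y] for x in range(N)] for y in range(N)]

# Extreme case 1: a completely white block. Only the DC value is non-zero.
white = [[255] * N for _ in range(N)]
dc_white = dct_2d(white)[0][0]

# Extreme case 2: a black/white checkerboard, the harshest fine detail possible.
checker = [[255 * ((x + y) % 2) for x in range(N)] for y in range(N)]
finest_ac = dct_2d(checker)[7][7]

print(round(dc_white, 3))
print(round(abs(finest_ac), 3))
```

Under these assumptions the DC value of a white block is 8 x 255 = 2040, and the checkerboard's highest-frequency AC value is also well above 256, although real footage rarely contains anything that harsh.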
crusty
25th May 2003, 20:42
So higher QM values will result in more variation thrown away in the first place, but will work with a bigger margin of error when they don't throw away variation.
But as long as you use the higher values only for finer detail, it really doesn't matter that much now does it?
I don't know for sure, but I assume that higher rounding errors are less of a problem for finer detail.
The end effect would be that finer detail, if allowed through by the QM value, would simply have a bigger luminance error.
Is the human eye less sensitive for the luminance of fine detail?
The only type of scenes that I can think of that would be affected, off the top of my head, would be dark skies with stars in them or 'proper' space scenes.
This is because a real night sky with stars in it has high-contrast fine detail.
The stars are basically points, but there is a difference in luminance that is very easily spotted. They differ strongly from the average value, which would be the mean value of the sky (very dark blue), so they would easily get above even a high threshold, but they would then have a pretty large margin of error in luminance.
If you wanted to portray this type of scene realistically, you would have to find out the DCT frequency of the stars and then lower the QM value to allow for a smaller margin of error.
Otherwise the stars will look washed out and you might lose the 'twinkling' effect that a real-life night sky has.
Mind you, DVDs use MPEG-2 compression which also uses DCT and macroblocks, so unless they specifically used a smaller QM value for the conversion from movie to DVD, this type of scene will look washed out on DVD anyway.
Also, most 'artificially' created (Special Effects) space scenes don't come close to realistically portraying stars. When astronauts first went up into space, they found the stars to be very bright and to have none of the 'twinkling' effect whatsoever, which of course is created by the atmosphere.
But in space the stars are so bright compared to the darkness of space that no TV or monitor can get close to portraying this kind of contrast.
Another thing;
I think that especially for anime movies many of the QM values could be higher, because contrast is usually much higher. I don't know for sure because I have never done an anime movie before, but I'll do Akira probably somewhere this month and I'll try it with different custom matrices.
I suspect a QM with lower values to low frequencies and higher to high frequencies will give a better result. And because anime has in general much sharper contrast than real-life footage, all the values could probably be higher to start with.
It might also be possible to find a 'sweet spot' for each different anime movie or series because one movie could use finer lines than the other.
To all: Please try custom matrices based on this and report the effect.
Question to developers:
Is mosquito noise around sharp edges created by using too high frequencies for these edges?
I can imagine a very bright thin line getting a higher frequency than a very bright thick line, ergo the macroblock with a thin line getting additional thin lines in decoding because of the higher frequency used.
Is this correct or is this just plain nonsense?
If it's correct, then I assume that some form of postprocessing filter could be made to detect this type of noise and remove it, not just on Xvid, but on all Mpeg4.
Originally posted by crusty
Question to developers:
Is mosquito noise around sharp edges created by using too high frequencies for these edges?
I can imagine a very bright thin line getting a higher frequency than a very bright thick line, ergo the macroblock with a thin line getting additional thin lines in decoding because of the higher frequency used.
Is this correct or is this just plain nonsense?
If it's correct, then I assume that some form of postprocessing filter could be made to detect this type of noise and remove it, not just on Xvid, but on all Mpeg4.
Hah! finally I can engage in this conversation as well. Ringing around sharp edges is created by dropping coefficients.
crusty
26th May 2003, 02:30
Dropping coefficients ?
Could you elaborate on that ?
MrBunny
26th May 2003, 03:42
I believe he means zeroing coefficients (as we've been calling it).
It makes sense. Edges are typically not very "average" with respect to its surroundings. Thus it is not well covered by the DC coefficient and other lower freq components and generally requires a number of the higher freq components to represent it correctly. When some of these components get zeroed, the edge gets eroded (less sharp, less defined and unsettles the surrounding area) resulting in the mosquito noise we all hate.
Didée
26th May 2003, 10:24
After so much elaborating about which cells of the quantization table correspond to which frequencies, and to what kind of image detail, I feel the need to remind of this:
All of the above written is generally correct for intra-frames, resp. intra-blocks, only.
In I-frames the actual image gets coded, and therefore all the frequency stuff is directly related to image detail.
Now, in the usual way we all encode MPEG-4, most of the video stream (~ 98%) is P- and/or B-frames. And for these, it's quite a different story.
What gets DCT'ed is what is left of the image after motion compensation, and therefore it is not possible to directly draw the conclusion "fine detail" -> "high frequency", or the other way round. For example, it is perfectly possible that you have a block with pretty fine detail, but that block gets caught very nicely by ME, so the result that gets DCT'ed consists only of low frequencies!
Because of that, the relation between quantization coefficients and image detail is not as direct as our intuition would suggest. Keep that in mind!
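A toy illustration of that point in Python, with made-up numbers: a block with very fine checkerboard detail is predicted perfectly by ME except for a slight uniform brightening. The DCT here is an orthonormal DCT-II, which is an assumption about scaling, and `largest_ac` is a hypothetical helper.

```python
import math

N = 8

def dct_1d(values):
    """Orthonormal 1-D DCT-II of N samples (assumed scaling convention)."""
    result = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        total = sum(values[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                    for n in range(N))
        result.append(scale * total)
    return result

def dct_2d(block):
    """2-D DCT: rows first, then columns."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d([rows[y][x] for y in range(N)]) for x in range(N)]
    return [[cols[x][y] for x in range(N)] for y in range(N)]

def largest_ac(coeffs):
    """(row, col, magnitude) of the biggest coefficient other than DC."""
    positions = [(r, c) for r in range(N) for c in range(N) if (r, c) != (0, 0)]
    r, c = max(positions, key=lambda p: abs(coeffs[p[0]][p[1]]))
    return r, c, abs(coeffs[r][c])

# Current block: fine checkerboard texture (values 100 and 140).
current = [[100 + 40 * ((x + y) % 2) for x in range(N)] for y in range(N)]
# Prediction found by ME: the same texture, just 3 levels darker.
prediction = [[v - 3 for v in row] for row in current]
# The residual is what actually gets DCT'ed in a P-frame block.
residual = [[c - p for c, p in zip(crow, prow)]
            for crow, prow in zip(current, prediction)]

intra_peak = largest_ac(dct_2d(current))
inter_peak = largest_ac(dct_2d(residual))

print("intra block, biggest AC at", intra_peak[:2])
print("residual, biggest AC magnitude", round(inter_peak[2], 6))
```

Coded as an intra block, the texture puts its energy at the highest frequency. Coded as a residual, the same block has nothing but a small DC value left, so the high-frequency entries of the matrix never come into play.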
Another point in this context:
For the above reasons, it makes a big difference if you "zero-out" some frequencies by either the encoder's quantization table (so: after ME), or by Tom's DCTfilter (before ME).
Zeroing by DCTfilter will directly remove image detail, whereas zeroing by the encoder's matrix will remove the differences that are left over by ME. Those are two different pairs of shoes!
Regards
Didée
kilg0r3
26th May 2003, 10:46
So what do we learn from this regarding the relation between an intra cell and the corresponding P-frame cell?
After all, there are matrices that distinguish between these two. Others, however, have the same value for I- and P-cells. Any general suggestion what might be a wiser way to go?
crusty
26th May 2003, 19:38
MrBunny:
When some of these components get zeroed, the edge gets eroded (less sharp, less defined and unsettles the surrounding area) resulting in the mosquito noise we all hate.
So, if higher QM values for higher frequencies results in less accurate edge compression and more mosquito noise, I suggest the following to the developers:
Insert some form of 'edge detection algorithm' between DCT and Quantization, at least for the higher frequencies. That way, if normal picture content is below the threshold, it will not be encoded, but if there is an edge in the picture, the resulting DCT value would get an additional bonus, so it would go over the threshold more easily, therefore reducing noise.
It would probably hurt compressibility a bit, but it would result in less mosquito noise. It would be part of the encoding process only, so it would not break MPEG-4 compatibility.
Call it 'DCT edge detection' or something and allow the user to adjust its 'DCT edge bonus'.
The question is of course whether the improvement in quality would counter the cost in bitrate, but there's only one way to find out....program it and let us test it.
Nic, SysKin, Koepi and others....any thoughts on this?
Didée:
Zeroing by DCTfilter will directly remove image detail, whereas zeroing by encoder's matrix will remove the differences that are left over by ME. That are two different pairs of shoes!
This leads me to conclude that, since Qpel works with smaller motions than Hpel, Qpel will be less efficient when using high QM values, because the effect of Qpel will be nullified by the high thresholds of the QM. The minimum use of bits by Qpel will remain the same tho.
Am I right in this?
What gets DCT'ed is the image after the motion compensation, and therefore it is not possible to directly draw the conclusion "fine detail" -> "high frequency", or the other way round. For example, it is perfectly possible that you have a block with pretty fine detail, but that block gets very nicely catched by ME, so the result that gets DCT'ed consists only out of low frequencies!
Isn't it so that when a picture's content is caught by ME, the corresponding macroblock doesn't contain ANY texture bits, but just motion vectors?
Kilg0r3:
After all there are matrices that distinguish between these two. Other, however, have the same value for I-and P-cells. Any general suggestion what might be a wiser way to go?
If DCT in P- and B-frames generally results in low frequencies, low settings for high frequencies seem to me to have little or no influence on the end result. Maybe a higher threshold for these could lead to higher compressibility without too much quality loss.
Also, if high QM values for high frequencies seem to counter the effect of Qpel, I would suggest not using Qpel with high compression matrices, since the amount of bits required by the Qpel technique may not be offset by the beneficial effect of Qpel.
The higher rounding errors created by high QM values would then also make Qpel less interesting.
So, to sum up:
1-higher QM values will lead to bigger rounding errors, nullifying subtle luminance variations in Keyframes.
2-Qpel 'might' be useless with high QM values for high frequencies. Don't use Qpel with high compression QM's, at least not with high compression P/B-frame QM's.
3-Using high values for high frequencies in B/P-frame QM could improve compressibility. Lowering the high frequency QM values in I-frames could be used to add bits to fine detail in keyframes, possibly creating better values for the B/P-frame process to work with.
4-using the same QM for both Inter- and Intra-frames is less efficient than using different QM's, because of the difference in method used.
Any discussion on these points would be greatly appreciated.
Also, does anybody have the QM's for DivX 3 Low-motion and Fast-motion?
I wonder if they are any different.
It would also be very interesting to see the effects of different QM's on low and high-motion content.
Too bad I'm trying to flush the content of my harddrive because I'm installing a new motherboard this week....otherwise I would be experimenting with this stuff all day long. :D
Only 30 movies to go.....;)
Acaila
26th May 2003, 19:57
This leads me to conclude that, since Qpel works with smaller motions than Hpel, Qpel will be less efficient when using high QM values, because the effect of Qpel will be nullified by the high thresholds of the QM. The minimum use of bits by Qpel will remain the same tho.
Am I right in this?
I don't know why you brought QPel into this, for it has next to nothing to do with quantization. QPel increases the precision of Motion Estimation. Once that process is finished, the motion-compensated frame is subtracted from the current frame, with the result being a residual frame. It is on this residual frame that quantization is applied. QPel works before QM, and so there's no way that high QM could nullify the effect of QPel (more like the other way around if anything).
QPel helps to catch more motion, so it decreases the size of the residual frame, and thus also decreases the amount of effect of the QM. If ME were to work for 100% the residual frame would be empty and no quantization (of the residual frame anyway) could take place.
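The order of operations can be sketched with a toy one-dimensional example in Python. Everything here is hypothetical: a single row of pixels, whole-pixel shifts only (no Hpel or QPel), wrap-around at the edges, and made-up function names. It is meant to show the residual idea, not how Xvid searches.

```python
def predict(reference, shift):
    """Shift the reference row cyclically by `shift` pixels (toy motion compensation)."""
    n = len(reference)
    return [reference[(i - shift) % n] for i in range(n)]

def motion_search(current, reference, max_shift):
    """Try every whole-pixel shift and keep the one with the smallest residual."""
    best = None
    for shift in range(-max_shift, max_shift + 1):
        prediction = predict(reference, shift)
        residual = [c - p for c, p in zip(current, prediction)]
        cost = sum(abs(r) for r in residual)  # sum of absolute differences
        if best is None or cost < best[0]:
            best = (cost, shift, residual)
    return best[1], best[2]

# A bright bump that has moved two pixels to the right between frames.
reference = [10, 10, 50, 90, 50, 10, 10, 10]
current = [10, 10, 10, 10, 50, 90, 50, 10]

# Without ME the residual is simply current minus reference.
no_me_residual = [c - r for c, r in zip(current, reference)]
best_shift, me_residual = motion_search(current, reference, max_shift=3)

print("residual without ME:", no_me_residual)
print("best shift:", best_shift)
print("residual with ME:   ", me_residual)
```

In this toy case ME works for 100%, so the residual is empty and the quantization matrix has nothing left to act on. A finer search precision such as QPel would only matter when the true motion is not a whole number of pixels.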
3-Using high values for high frequencies in B/P-frame QM could improve compressability.
And that's exactly what all matrices already do...
And since using too high values causes ringing as mentioned before, I believe the standard matrices are already quite well made. The only reason I could see anyone using a custom QM is for increasing the amount of detail/size of a video (since for higher compression you could just as well use a higher quant, no need for a custom matrix).
Also, does anybody have the QM's for DivX 3 Low-motion and Fast-motion?
As far as I know the two flavors of DivX 3 have identical quantization matrices. The only difference is the quant ranges that they can use. I believe low-motion could use anything 2-31, while the high-motion codec could only use 4-31, but I'm not sure if these ranges are correct.
crusty
26th May 2003, 21:23
I don't know why you brought QPel into this,
I figured that since they both work with the precision of motion, there has to be some sort of interaction.
And that's exactly what all matrices already do...And since using too high values cause ringing as mentioned before, I believe the standard matrices are already quite well made.
Of course they are, but there certainly are certain types of film that might improve with a tweaked QM, like cartoons.
What do you think of the edge detection in the codec I suggested?
manono
27th May 2003, 01:22
Hi Acaila-
while the high-motion codec could only use 4-31
It's even worse than that-the lowest quant that Hi-Mo uses is 5. Minor point though, in your informative series of posts in this thread.
Acaila
27th May 2003, 09:42
What do you think of the edge detection in the codec I suggested?
For one thing it would have to be done before DCT because edges no longer exist in the transformed frame (I think). All spatial information is gone after DCT, only frequency remains; high frequency can just as well mean noise or edges, there's no way to know after the DCT.
Giving a bonus to edges sounds to me very much like sharpening, and that's not always a good idea: it either creates ringing or increases edge enhancement (->halos). But other than that I have no idea if it would be a good idea in general.
Maybe some real developer can say if it's possible/worthwhile.
Ps. Thanks Manono :)
EDIT: Yes, edge detection using the DCT coefficients (or another transform for that matter) seems possible but from what I have read it's very complicated too.
Didée
27th May 2003, 10:28
Originally posted by Acaila
And since using too high values cause ringing as mentioned before, I believe the standard matrices are already quite well made. The only reason I could see anyone using a custom QM is for increasing the amount of detail/size of a video (since for higher compression you could just as well use a higher quant, no need for a custom matrix).
Well, let me bore you with my points in using custom matrices.
The standard matrix is well designed as an all-purpose-matrix. It delivers reasonable results with all quality ranges. But obviously, it is not at all optimized for either hi-quality or hi-compression scenarios. For both, a good custom one will fit better.
On the hi-quality side, we already had the discussion "better quality at max quality" (here (http://forum.doom9.org/showthread.php?s=&threadid=40588&highlight=better+quality+max)). Quant-2 doesn't deliver super-high quality, and quant-1 ... you know.
Moreover, I think that the bigger coeffs for the high frequencies are mostly there to deal with noise, not with image detail. (Well, intended or not, it IS like that.) But if you have a well-denoised picture, then your interest in keeping high frequencies will rise quickly, proportional to the ratio of win/loss forced by keeping them. Hah, what a sentence ;)
Furthermore, there is an additional benefit for 2-pass scenarios:
The codec can scale the bitrate much smoother.
Usually, for really good quality, we're still aiming for an average quantizer of '3' or better - at least for P-frames. Right?
Now, with the standard matrix we have that "big jump" in filesize, or bitrate, between quant-2 and quant-3. Therefore, when the codec is working with only q2 and q3, the bitrate is not distributed smoothly: one expensive q2 costs as much as many cheap q3's, and the quality improvement from those rare q2's is hard to notice. There should be something between q2 and q3 ...
ATM, I use the following matrix almost all the time. It scales well with all high-bitrate to better-medium-bitrate scenarios, but only for well-denoised material. I always use PixieDust(2|3), seldom with a little calming-down of the noise beforehand by light Convolution3D or Fluxsmooth. Undot() of course.
Since attaching seems down again, I'll do a little more typing. Perhaps you will, too.
# "SixOfNine"
08 10 11 12 12 13 14 15 10 10 11 12 12 13 14 15
10 11 12 13 13 15 15 16 10 11 12 13 14 14 15 16
11 12 12 14 15 15 16 17 11 12 12 14 14 15 16 17
12 13 14 15 15 16 17 18 12 13 14 14 15 16 17 18
12 13 15 15 16 17 18 19 12 14 14 15 16 17 18 19
13 15 15 16 17 18 19 19 13 14 15 16 17 18 19 20
14 15 16 17 18 19 19 20 14 15 16 17 18 19 20 20
15 16 17 18 19 19 20 20 15 16 17 18 19 20 20 20
- This matrix's quantizer range q2-q8 roughly equals the standard matrix range 'q1.4'-'q4.5'
- q3 is a tad better than q2 (standard)
- q4 is between q2 and q3 (std)
- q5 is a little weaker than q3 (std)
- ...
For the same bitrate range, the codec simply has more quantizers available for scaling.
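A very rough way to see the "more quantizers available" effect in Python. Several things here are assumptions and not from the codec: the left 8x8 half of "SixOfNine" is treated as one matrix, the matrix from the first post of this thread stands in for a standard one, and "average matrix entry times quantizer" is used as a crude stand-in for overall coarseness. It will not reproduce the exact q-equivalences listed above.

```python
# Left 8x8 half of "SixOfNine" as typed above.
SIX_OF_NINE = [
    [8, 10, 11, 12, 12, 13, 14, 15],
    [10, 11, 12, 13, 13, 15, 15, 16],
    [11, 12, 12, 14, 15, 15, 16, 17],
    [12, 13, 14, 15, 15, 16, 17, 18],
    [12, 13, 15, 15, 16, 17, 18, 19],
    [13, 15, 15, 16, 17, 18, 19, 19],
    [14, 15, 16, 17, 18, 19, 19, 20],
    [15, 16, 17, 18, 19, 19, 20, 20],
]

# Example matrix from the first post of this thread (stand-in for a standard matrix).
EXAMPLE = [
    [8, 16, 19, 22, 26, 27, 29, 34],
    [16, 16, 22, 24, 27, 29, 34, 37],
    [19, 22, 26, 27, 29, 34, 34, 38],
    [22, 22, 26, 27, 29, 34, 37, 40],
    [22, 26, 27, 29, 32, 35, 40, 48],
    [26, 27, 29, 32, 35, 40, 48, 58],
    [26, 27, 29, 34, 38, 46, 56, 69],
    [27, 29, 35, 38, 46, 56, 69, 83],
]

def mean_entry(matrix):
    """Average of all 64 matrix entries."""
    return sum(sum(row) for row in matrix) / 64.0

def coarseness(matrix, quantizer):
    """Crude proxy: average entry scaled by the frame quantizer (an assumption)."""
    return mean_entry(matrix) * quantizer

# Coarseness range covered by the example matrix at quantizers 1 to 4.
low = coarseness(EXAMPLE, 1)
high = coarseness(EXAMPLE, 4)

example_levels = [q for q in range(1, 32) if low <= coarseness(EXAMPLE, q) <= high]
six_levels = [q for q in range(1, 32) if low <= coarseness(SIX_OF_NINE, q) <= high]

print("example matrix quantizers in range:", example_levels)
print("SixOfNine quantizers in range:     ", six_levels)
```

With this proxy the flatter, lower-valued matrix fits more quantizer steps into the same coarseness range, which is the scaling benefit described above.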
Regards
Didée
JimiK
27th May 2003, 12:26
@crusty
I agree with Acaila that imho there is no use for detecting edges. Even if it were possible, what benefit would you have? You would know there is an edge, and now what? This is not meant to sound rude.
Now I always thought a bit differently about quantizer matrices. Why do they have higher coefficients for higher frequencies (lower right of the matrix)? Well, I thought it is because you cannot notice a loss in fine detail that easily. That's why you can quantize it more. It depends on the human visual system. Ever noticed macroblocks in evenly colored areas? Of course that's because the whole block gets the same value, which is another problem. But why do we see these blocks, even though the difference is not big (blue block in the sky next to another blue block with almost the same color)? That's because the human eye notices differences in a color gradient much more easily than differences between two completely different colors.
So it's not possible to quantize low frequencies even more, you would easily notice. You can of course quantize high frequencies less and get more details and maybe less mosquito noise, but you would have to pay with a much higher need of bitrate, while it's quite hard to see the difference to the other picture.
Again (as I always say): I don't now much about QM's, so maybe that's wrong, but I think I read that somewhere.
@Didee
Thats interesting, with the more even distribution. But there is only a slight difference between the quantization of low to high frequencies (ratio 2:1) which would leed to the problem (bitrate) I just mentioned. Strange, didn't you say you put LOTR on one CD with decent quality? And another point: doesn't this matrix cause problems with ffdshow? As far as I know, it does not like values below 16.
Best regards,
JimiK
vinouz
27th May 2003, 13:43
Theoretically you could detect edges in the DCT, as an edge produces a fair amount of odd-order coefficients. Computing the power of the odd coefficients minus the power of the even ones, pow(odd*odd) - pow(even*even), could maybe give a serious hint about edges. And then the corresponding matrix could be a special one for edges (I talked about this once before).
Odd harmonics encode a square signal (multiplied by the inverse of their order). The only thing is, I don't know how well this pure square 'signature' holds up against noise and geometric distortion. But it's certainly worth a look.
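The odd-versus-even idea can be tried on a single row of eight samples. The sketch below is only an illustration of the suggestion, not anything XviD does: it uses a plain unnormalised 1-D DCT-II, and the function names and sample rows are invented for the example.

```python
import math


def dct_1d(samples):
    """Unnormalised 1-D DCT-II of a list of samples."""
    n = len(samples)
    return [
        sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, v in enumerate(samples))
        for k in range(n)
    ]


def odd_even_energy(samples):
    """Return (odd, even) energy of the AC coefficients; DC (k = 0) is excluded."""
    c = dct_1d(samples)
    odd = sum(c[k] ** 2 for k in range(1, len(c), 2))
    even = sum(c[k] ** 2 for k in range(2, len(c), 2))
    return odd, even


# Hypothetical test rows: an edge in the middle, an edge off-centre, no edge.
centered_edge = [0, 0, 0, 0, 255, 255, 255, 255]
shifted_edge = [0, 0, 255, 255, 255, 255, 255, 255]
flat = [128] * 8

if __name__ == "__main__":
    for name, row in [("centered", centered_edge),
                      ("shifted", shifted_edge),
                      ("flat", flat)]:
        odd, even = odd_even_energy(row)
        print(f"{name:9s} odd - even = {odd - even:.1f}")
```

An edge exactly in the middle of the row puts all of its AC energy into odd-order coefficients, which is the 'signature' described above. As soon as the edge moves off-centre, even-order coefficients appear as well, so the difference alone may not be a robust detector.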
Acaila
27th May 2003, 14:34
@Didee:
JimiK made some valid points there. I agree with him that the coefficients should not be scaled as linearly as you've done in your matrices, because high frequencies are harder to notice than low ones. With those matrices you rely almost solely on quantizers to scale the bitrate, while I believe you should rely on both the QM and the quantizers. What I mean is that a matrix with higher values in the bottom-right and lower values in the top-left should give better scaling (with higher quality possible at quant 2 and less filesize distance between quants, see below). Once a frame gets encoded with higher quantizers you don't want to lose (relatively) just as many low frequencies as high ones, because the low ones have a lot more impact on visual quality.
Also ffdshow indeed has problems with values less than 16, however as far as I can tell it only has problems with the top-left value being lower than 16. Whenever I put a 16 in the top-left corner everything played just fine no matter how low the other values were.
Something else I've noticed is that VHQ doesn't seem to work very well (lower PSNR) on custom matrices compared to H.263 and MPEG. Can anyone confirm this?
The matrices I would recommend:
Intra:                    Inter:
08 09 10 11 12 13 14 15   16 08 10 12 14 16 18 20
09 10 11 12 13 14 15 16   08 10 12 14 16 18 20 22
10 11 12 13 14 15 16 17   10 12 14 16 18 20 22 24
11 12 13 14 15 16 17 18   12 14 16 18 20 22 24 27
12 13 14 15 16 17 18 19   14 16 18 20 22 24 27 30
13 14 15 16 17 18 19 20   16 18 20 22 24 27 30 33
14 15 16 17 18 19 20 20   18 20 22 24 27 30 33 36
15 16 17 18 19 20 20 20   20 22 24 27 30 33 36 40
Filesize and PSNR are distributed as below (high to low):
Matrix Quantizer
Mine 2
H.263 2
Mine 3
Mine 4
H.263 3
Mine 5
Mine 6
H.263 4
Mine 7
Mine 8
H.263 5
As you can see, this gives better bitrate scaling, just as Didee suggested, and the option for better quality at quant 2. It also doesn't scale low and high frequencies linearly, because I don't think that's right.
ssaga
27th May 2003, 15:54
Acaila's QM has a very interesting feature: the numbers on the same line are the same. Does it have any special meaning?
Another question: since the right side and the bottom side both mean higher frequency, what's the difference between the numbers on the same diagonal (such as (3,5) vs (5,3))?
Acaila
27th May 2003, 15:57
Acaila's QM has a very interesting feature: the numbers on the same line are the same. Does it have any special meaning?
I like symmetry :).
Another question: since the right side and the bottom side both mean higher frequency, what's the difference between the numbers on the same diagonal (such as (3,5) vs (5,3))?
From left to right is increasing horizontal frequency. From top to bottom is increasing vertical frequency.
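To make the horizontal/vertical distinction concrete, here is a small sketch that transforms two artificial 8x8 blocks with a plain unnormalised 2-D DCT-II and reports which matrix positions end up non-zero. The block contents and helper names are invented for the illustration; a real encoder uses a fast, normalised transform.

```python
import math

N = 8


def dct_2d(block):
    """Unnormalised 2-D DCT-II. result[v][u]: v = vertical, u = horizontal frequency."""
    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos(math.pi * (2 * x + 1) * u / (2 * N))
                              * math.cos(math.pi * (2 * y + 1) * v / (2 * N)))
            out[v][u] = total
    return out


def nonzero_positions(coeffs, eps=1e-6):
    """List of (row, column) positions whose coefficient is not (numerically) zero."""
    return [(v, u) for v in range(N) for u in range(N) if abs(coeffs[v][u]) > eps]


# Vertical stripes: brightness changes only from left to right.
vertical_stripes = [[255 if x % 2 else 0 for x in range(N)] for _ in range(N)]
# Horizontal stripes: brightness changes only from top to bottom.
horizontal_stripes = [[255 if y % 2 else 0 for _ in range(N)] for y in range(N)]

if __name__ == "__main__":
    print("vertical stripes  ->", nonzero_positions(dct_2d(vertical_stripes)))
    print("horizontal stripes ->", nonzero_positions(dct_2d(horizontal_stripes)))
```

Vertical stripes only populate the top row of the coefficient block (horizontal frequencies), and horizontal stripes only populate the left column (vertical frequencies). So positions (3,5) and (5,3) describe the same amount of detail, just oriented differently.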
crusty
27th May 2003, 20:22
JimiK:
Ever noticed macroblocks in evenly colored areas? Of course that's because the whole block gets the same value which is another problem.
Well, I read in one thread that Chroma Motion Estimation is designed especially for less color-blocking in almost evenly colored spaces.
I don't know exactly which thread it was, try using the search.
Acaila:
For one thing it would have to be done before DCT because edges no longer exist in the transformed frame (I think). All spatial information is gone after DCT, only frequency remains; high frequency can just as well mean noise or edges, there's no way to know after the DCT.
Well I already knew that it would have to happen before DCT, perhaps I was not clear enough on that, no problem.
Giving a bonus to edges sounds to me very much like sharpening, and that's not always a good idea; it either creates ringing or increases edge enhancement
Well the idea was really to decrease compression artifacts.
Giving a bonus to edges does seem like sharpening on first sight, but since we would be using it for edge enhancing anyway, that's basically what we would be doing. For normal or oversharpened movies this would not come in handy, but it would be good for movies that are too soft. I personally like sharpness and crispness a lot.
And if you think about it, it's only the REAL edges, determined by the to-be-programmed edge enhancement, that get the bonus. So you wouldn't be adding noise or artifacts, just enhancing the real footage.
But I was thinking a bit more about what would really be useful, and I figured out something else.
Maybe edges don't need a bonus on the current frequency like this:
Edge detection --> determines bonus for DCT coefficient --> DCT --> DCT coefficient gets bonus added --> final value goes through QM......
Which was what I first had in mind...
This would indeed increase the visibility of the edges, but it would not reduce the noise.
Instead I am now thinking about assigning edges a LOWER frequency based on an edge detection bonus than they would have gotten without the edge detection.
If mosquito noise is created by the codec thinking there is MORE detail (ergo, noise) than there really is, then assigning a lower frequency would reduce the noise now, wouldn't it?
So instead of the edge detection bonus upping the DCT value for the determined frequency's threshold, it would now assign the macroblock a different frequency, based on a more precise estimate of the real content, created by the edge detection algorithm.
So it would instead look more like this:
Edge detection --> assigning extra (internal to codec) value to edges -->
DCT algorithm -->assigning different frequency to macroblock determined by internal value --> QM
It would probably be hella slow, but you might get rid of a lot of noise :D
Just some food for thoughts...
Vinouz:
And then the corresponding matrix could be one special for edges
Wouldn't that entail more than one QM per frame, and is that a) possible and b) Mpeg4 compliant?
Didée:
The codec can scale the bitrate much smoother.
Could you elaborate a bit more on that? I'm not entirely sure on how that would be affected by a different QM.
For the same bitrate range, the codec simply has more quantizers available for scaling.
So you're saying that by lowering the QM values, you give the codec more freedom to allocate bits when needed and when not? How good do you find this effect in practice?
I always use PixieDust(2|3)
Slightly off-topic:
PixieDust created artifacts for me when I used it at resolutions lower than 576xXXX. Have you experienced the same, or haven't you used Dust at such low resolutions?
/slightly off-topic.
Acaila:
Also ffdshow indeed has problems with values less than 16, however as far as I can tell it only has problems with the top-left value being lower than 16. Whenever I put a 16 in the top-left corner everything played just fine no matter how low the other values were.
So what DOES happen when you use a value less than 16? Does it refuse to play or do you get artifacts?
Also @ Acaila:
I see that your QM for P/B-frames has twice the values for the higher frequencies compared with your I-frame QM. This would result in your B/P-frames having a bigger bias towards low frequencies.
Compared to the standard H263 or MPEG QM, did you adjust the B-frame settings (Q-ratio, offset, threshold etc.) to compensate for the different end result created by your custom QM?
Acaila:
From left to right is increasing horizontal frequency. From top to bottom is increasing vertical frequency
So from left to right it goes something like: 1 prison bar, 2 prison bars, 4 prison bars, etc etc.
And from top to bottom it goes something like: 1 stairstep, 2 steps, 4 steps, etc etc.
Right?
Also, it would not always be prudent to use a symmetrical QM, depending on the type of footage.
Say you've got a jungle scene. Most of this would be trees and leaves.
I'd say that would have a bias towards vertical objects, ergo a lot of horizontal frequencies (think about the texture on the trees).
And if you've got a scene with a lot of stairs in it, it would be a lot of horizontal objects, so a lot of vertical frequencies.
Of course these are normally just small clips in an entire movie, but I hope it proves the point. :D
Acaila
27th May 2003, 21:23
So what DOES happen when you use a value less than 16? Does it refuse to play or do you get artifacts?
It's like pieces of the frame stay put while others start moving around. All I know is it's ugly :).
Compared to the standard H263 or MPEG QM, did you adjust the B-frame settings (Q-ratio, offset, threshold etc.) to compensate for the different end result created by your custom QM?
I haven't tested B-frames yet with that matrix. It's mostly theoretical; I haven't done any extensive testing with it, but it seemed to perform great in the few tests I have done.
Also, it would not always be prudent to use a symmetrical QM, depending on the type of footage.
Say you've got a jungle scene. Most of this would be trees and leaves.
I'd say that would have a bias towards vertical objects, ergo a lot of horizontal frequencies (think about the texture on the trees).
And if you've got a scene with a lot of stairs in it, it would be a lot of horizontal objects, so a lot of vertical frequencies.
Of course these are normally just small clips in an entire movie, but I hope it proves the point.
The trees create vertical frequencies, the leaves create horizontal frequencies. So you see things are not as simple as that :). Some scenes in a video might have more horizontal frequencies, other scenes might have more vertical ones. So the best matrix is one that can accommodate both types equally. That is why I always make my matrices symmetrical. If we could have something that creates QMs on the fly, optimized for each frame, we wouldn't go through such trouble to make a good QM, now would we? We can only have one QM, so it must be optimal for as many different scenes as possible, hence symmetrical.
As for your custom edge enhancement theory I'm still not convinced it is actually useful or will even have a different effect from sharpening for that matter. Edge detecting during filtering: very useful. Edge detection during encoding: doubtful. I believe it's not the codec's job to reduce noise (unless it is done with preprocessing filters).
That doesn't mean it might not be useful, just that this is my opinion.
Didée
27th May 2003, 23:13
Back from some testing. However, there is much more to test and compare than available time :(
I took 1500 frames of a 5%-snip through chapter 2 and 3 of LOTR Fellowship SEE, with a snipsize of 100 frames (4s, PAL).
Treatment was
crop(8,80,-8,-80).undot().LanczosResize(640,272).undot()
\ .ConvertToYUY2().pixiedust(limit=2).ConvertToYV12()
XviD (14052003-1) settings were basically default, +chroME +qpel +chromaOptimizer. And custom quants, of course ...
Target size was 9200 kB, which was 75% of the 1st-pass size with standard-mpeg.
Originally posted by trbarry, some time ago
Ah, a table!
          | 1st-pass | PSNR min  PSNR avg  PSNR max | MeanAbsDev (avg) | q-distrib.
=======================================================================================================
Standard  | 12542 kB | 41.6922   45.5240   50.1203  | 0.9432           | q2:541  q3:956  q4:4
-------------------------------------------------------------------------------------------------------
SixOfNine | 20544 kB | 41.5405   45.5355   49.9382  | 0.9384           | q3:388  q4:970  q5:138  q6:5
-------------------------------------------------------------------------------------------------------
Acaila    | 19438 kB | 40.9498   45.0723   50.1547  | 0.9900           | q3:466  q4:906  q5:128  q6:1
Acaila,
my matrix is not as flat as you may think. In fact, SixOfNine was originally designed to behave very much like the standard matrix, just to get a differently scaled quantizer range in the end.
I also tested around with bigger values in the critical upper-left cell some time back. But I got the impression that flat areas became more blocky with that strategy, and so I dropped the idea again. Also it has its pros in reducing the raised bitrate. Erh, what?
And I'm not forced to let ffdshow decode my videos ... it is sufficient as a postprocessor only.
Too tired for analyzing now ... perhaps tomorrow with more test results.
Good night, gentlemen,
please try to not post more than 2-3 additional pages on this thread in the next 24 hours, okay? (Crusty!! Carry on!)
Didée
LoKi128
28th May 2003, 02:09
Well, since we are talking about matrices here, I have a question.
When encoding an anamorphic source... would a matrix with higher detail in the horizontal axis provide better quality? Most matrices that I've seen are pretty even in both the horizontal and vertical... but in an anamorphic encode it is important to keep the higher horizontal frequencies, since squishing the image in essence shifts any lower frequencies up.
Defiler
28th May 2003, 06:16
Acaila:
Ouch, the first pass is huge with your matrix. :)
MrBunny
28th May 2003, 07:38
Originally posted by Defiler
Acaila:
Ouch, the first pass is huge with your matrix. :)
There isn't an issue with large first pass sizes. In a two-pass situation, a large first pass will simply cause a higher average quantizer. However, since the Acaila and SixOfNine QMs have relatively small coefficients compared to other QMs, even if the average quantizer is larger (as shown to be true in Didée's table), the actual value used for quantization is still comparable to that of "normal" QMs.
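The "higher average quantizer" can be read straight off the q-distribution column of Didée's table. A minimal sketch, with the frame counts copied from that table and the helper name `average_quantizer` made up for the example:

```python
# Frames per quantizer, copied from the q-distribution column of Didée's table.
Q_DISTRIBUTION = {
    "Standard":  {2: 541, 3: 956, 4: 4},
    "SixOfNine": {3: 388, 4: 970, 5: 138, 6: 5},
    "Acaila":    {3: 466, 4: 906, 5: 128, 6: 1},
}


def average_quantizer(histogram):
    """Frame-weighted mean quantizer for one encode."""
    frames = sum(histogram.values())
    return sum(q * count for q, count in histogram.items()) / frames


if __name__ == "__main__":
    for name, histogram in Q_DISTRIBUTION.items():
        print(f"{name:10s} frames={sum(histogram.values())} "
              f"avg q={average_quantizer(histogram):.2f}")
```

This gives roughly 2.64 for the standard matrix, 3.84 for SixOfNine and 3.78 for Acaila's matrix. The custom matrices run about one quantizer step higher, but with their smaller coefficients the resulting quantization is still in the same region.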
and back a few posts...
Originally posted by crusty
Wouldn't that entail more than one QM per frame, and is that a) possible and b) Mpeg4 compliant?
As I'd mentioned in a post in the previous page, Syskin talks about switching QMs on the fly in this post: http://forum.doom9.org/showthread.p...5827#post305827
Keep up the interesting discussion (Sorry if we overwhelm you Didée :D )
Acaila
28th May 2003, 14:21
@Didee:
I just finished a dozen tests on a 5000-frame Colosseum scene from Gladiator and my findings support yours 100%, so I won't post them here. In 2-pass my matrix had a lower PSNR (by 0.5 dB) than the standard or SevenOfNine matrices on all occasions.
But it looked nice in theory :D.
I've also tried some variations on your SevenOfNine matrix, but no matter what I tried, PSNR always dropped. So I believe you've got quite a good matrix there ;).
Substituting the top-left value 8 with a 16 (in the interframe) to make it ffdshow compatible didn't seem to harm quality at all and PSNR only dropped a little for chroma, not luma. So I see no harm in doing that if anyone wants to use that matrix and uses ffdshow to decode.
Originally posted by LoKi128
When encoding an anamorphic source... would a matrix with higher detail in the horizontal axis provide better quality? Most matrices that I've seen are pretty even in both the horizontal and vertical... but in an anamorphic encode it is important to keep the higher horizontal frequencies, since squishing the image in essence shifts any lower frequencies up.
Honestly I have no idea. But I don't think it makes much difference.
If only we had a way to see the DCT coefficients before they get quantized...I bet that would help a lot in creating a good matrix.
Defiler
28th May 2003, 14:53
Originally posted by Acaila
If only we had a way to see the DCT coefficients before they get quantized...I bet that would help a lot in creating a good matrix.
Yes, and it would also be nice to be able to enter values in the matrix, and instantly see the changes previewed on an I-frame. That would be extremely cool.
By the way.. I know the large first pass is irrelevant, because what matters is the actual coefficient, not the quantizer value itself. I just had to up the disk quota in that folder to make room for the first pass. Heh.
Defiler
28th May 2003, 14:56
Also.. are there a pair of tools that can:
1. Create XviD-compatible QM presets from their text equivalent.
2. Load a saved QM into the Custom page via the command-line?
It would be nice to be able to schedule a large set of test encodes, rather than having to load and save manually between each one.
crusty
28th May 2003, 17:10
Acaila:
I haven't tested B-frames yet with that matrix.
It looks like B-frames would be compressed even more than normal with this QM, so it might produce a lower quality end result if B-frames are enabled.
If we could have something that creates QM's on the fly optimized for each frame we wouldn't go through such trouble to make a good QM, now would we? We can only have one QM
Why? I want more !! :D :D
No seriously, even if it would not be mpeg4 compliant, if we could have more QM's in one movie, it would probably improve quality at the same compression level or improve compression at the same quality level. That's what modulated QM did do, even if the modulation decision was as dumb as hell.
I believe it's not the codec's job to reduce noise
Well it's the codec's job to reduce size, and noise adds size. So reducing noise, especially when it's created by the compression itself, would be the codec's responsibility.
Reducing artifacts created by the compression process is different from reducing noise already present in the original footage. It's the difference between preserving the original picture and altering the original picture.
The process I suggest is to preserve the original picture, while improving compressibility.
Didée:
please try to not post more than 2-3 additional pages on this thread in the next 24 hours, okay? (Crusty!! Carry on!)
Working on it...processing....processing...please stand by :D
MrBunny, I couldn't find the thread you posted.
In a two pass situation, a large first pass will simply cause a higher average quantizer.
You mean, because a large first pass tells the codec that the end result would be bigger than allowed if it used a lower quantizer, right? Just to get this straight...
I was just thinking....
When I started encoding a few years ago I had a really hard time encoding the movie '2010: The year we made contact' and I ended up using both the DivX 3 Fast-motion and Low-motion codecs.
It's a space movie and on the whole it looked better with the Low-motion codec, but there were two or three scenes of fast motion that looked blocky.
I encoded the whole movie with both codecs and then used a tool (cannot remember its name) that would merge the best parts of both.
The end result was very good.
Now I also have used different QM's recently inside one movie, by using a Heavy Compression matrix on the end credits and using H263 or modulated on the main movie.
It may not be mpeg4 compliant, but it works like a charm.
So using different QM's in one clip is quite possible.
This leads me to the following idea (you're probably guessing it already):
Encode the entire movie with several different QM's and then use a tool to switch between the best looking/best compressed parts and merge those. Then you would have a modulated end result.
And with this process you can easily alter the algorithm for choosing between the QM's without altering the codec itself, as it's just a standalone tool.
This way you could use up to a zillion different QM's in one clip, depending on how you make the decision. So you could automatically use different QM's for end credits, low motion and fast motion scenes and dark and light content etc etc.
Any thoughts on this?
Defiler
28th May 2003, 17:15
Can you merge two different AVI files that use different custom XviD quantizer matrices? I've never tried.
By the way.. B-frames work fine with Acaila's matrix. At least, they did for me.
Acaila
28th May 2003, 17:32
You're not the first to think of this :), DCTune (http://ise.stanford.edu/class/psych221/98/dctune/yuke/index.htm) already does per-frame QM optimization during encoding. Even though the program seems to be gone by now, they did prove it was possible. But I bet it required a lot of horsepower as well :D.
Doing it with an external program adds some extra difficulties. I-frames wouldn't be a problem, but P-/B-frames require a reference frame, as you all know. Normally this reference frame is a decoded version of a previously encoded frame on which motion estimation is performed.
If you take two videos, each encoded with a different matrix, this will mean that all frames that are used as reference will be different between those two videos. If you take an I-frame out of the first video and combine it with the following P-frame of the second video, the P-frame would act on other data than it was created for and the result would be a total mess.
The only way that I can see your idea can work, is if by analysing multiple videos you could output a logfile that says with which QM to encode each frame based on which would result in the smallest frame. During encoding this logfile would then be used to encode the video with the appropriate QM.
But faster would be to just calculate multiple matrices during encoding and pick the one that results in the smallest frame. But that would have to be incorporated into a future codec someday.
crusty
28th May 2003, 20:07
If you take two videos, each encoded with a different matrix, this will mean that all frames that are used as reference will be different between those two videos. If you take an I-frame out of the first video and combine it with the following P-frame of the second video, the P-frame would act on other data than it was created for and the result would be a total mess.
Well a quick workaround for that would be to only cut those parts with keyframes on the same frames.
Say movie 1 has keyframes on these:
1 112 229 300 330 400 567 1000.....
And movie 2 has keyframes on these:
1 96 229 300 334 401 567 1000.....
You see that it would be possible to swap out sections 1-229, 1-300, 229-300, 1-567, 229-567, 300-567 etc etc ...
Nothing difficult here, really.
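As a sketch of that workaround, the shared cut points for the two keyframe lists above can be found with a set intersection. The helper names are invented for the example, and it assumes that a section starting on a keyframe can be swapped as a whole, with nothing after the cut referring back across it.

```python
def common_keyframes(*keyframe_lists):
    """Frame numbers that are keyframes in every encode, in ascending order."""
    shared = set(keyframe_lists[0]).intersection(*keyframe_lists[1:])
    return sorted(shared)


def swappable_segments(cut_points):
    """Smallest swappable sections as (start, end) pairs between adjacent cut points."""
    return list(zip(cut_points, cut_points[1:]))


# Keyframe positions from the example above.
movie1 = [1, 112, 229, 300, 330, 400, 567, 1000]
movie2 = [1, 96, 229, 300, 334, 401, 567, 1000]

cuts = common_keyframes(movie1, movie2)
segments = swappable_segments(cuts)

if __name__ == "__main__":
    print("common keyframes:", cuts)
    print("smallest sections:", segments)
```

For these lists the shared keyframes are 1, 229, 300, 567 and 1000. Longer sections such as 1-300 or 229-567 are just runs of adjacent smallest sections.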
The only way that I can see your idea can work, is if by analysing multiple videos you could output a logfile that says with which QM to encode each frame based on which would result in the smallest frame. During encoding this logfile would then be used to encode the video with the appropriate QM.
This would indeed be a better but far more complex approach. Even more so because you would have to take into account the I/B/P-frame decision and the use of different QMs for I- and B/P-frames.
So, if you did an analysis like that on each frame, you would have to take into account whether the current frame would be an I-frame, in which case the calculations aren't that complex, or a B/P-frame, in which case you would have to recalculate the preceding frames back to the I-frame used for reference. This would be both inefficient and far more CPU-intensive.
If you just calculated all frames as I-frames, you would ignore the effects of a separate B/P-frame QM. It would be far simpler, though.
But faster would be to just calculate multiple matrices during encoding and pick the one that results in the smallest frame.
True. That's why I suggested to put it in the codec in the first place. :D
But the crude way I mentioned above, splitting only when I-frames match, would be far easier to implement.
It would allow people to modulate QM on a much more intelligent basis, than just the current Quantizer-based decision. It would also allow modulation on more than 2 QM's provided you have the cpu and time to spare, and you would be able to use other QM's than just h263 and Mpeg.
MrBunny
28th May 2003, 20:21
Originally posted by crusty
MrBunny, I couldn't find the thread you posted.
You mean, because a large first pass tells the codec that the end result would be bigger than allowed if it used a lower quantizer, right? Just to get this straight...
Sorry, I don't know what happened to the url, must have copy&pasted it wrong.
http://forum.doom9.org/showthread.php?s=&postid=305827#post305827
Originally posted by crusty
I was just thinking....
When I started encoding a few years ago I had a really hard time encoding the movie '2010: The year we made contact' and I ended up using both the DivX 3 Fast-motion and Low-motion codecs.
It's a space movie and on the whole it looked better with the Low-motion codec, but there were two or three scenes of fast motion that looked blocky.
I encoded the whole movie with both codecs and then used a tool (cannot remember its name) that would merge the best parts of both.
The end result was very good.
Now I also have used different QM's recently inside one movie, by using a Heavy Compression matrix on the end credits and using H263 or modulated on the main movie.
It may not be mpeg4 compliant, but it works like a charm.
So using different QM's in one clip is quite possible.
This leads me to the following idea (you're probably guessing it already):
Encode the entire movie with several different QM's and then use a tool to switch between the best looking/best compressed parts and merge those. Then you would have a modulated end result.
And with this process you can easily alter the algorithm for choosing between the QM's without altering the codec itself, as it's just a standalone tool.
This way you could use up to a zillion different QM's in one clip, depending on how you make the decision. So you could automatically use different QM's for end credits, low motion and fast motion scenes and dark and light content etc etc.
Any thoughts on this?
Ahhh, the old manual MM4 days. I remember using a tool called ProjectDivX for it, but your point is well taken. As with that tool, the QM/cutscene points would need to be at I-frames so as not to break XviD compatibility (much for the reasons Acaila described).
I personally would love a tool like ProjectDivX just for cutting and pasting between two XviD files. Sometimes I enjoy tweaking a scene or two (or many more...) that the curve just doesn't deal with right. Using it for clips with different QMs would also be an option. Multiple-QM XviD files are possible as long as they are cut at I-frames; I've done one myself. But that was before I knew about MPEG-4 compatibility issues, and right now I prefer to stay MPEG-4 compliant, even at the cost of a little quality.
Sigh...new post as I was writing this :D
My major problem with auto-selection of a QM by any algorithm is that it can't "see" what the end result is. It's quite possible that matrix A results in slightly smaller frame sizes than matrix B, but that the encode with matrix A looks really ugly relative to B. I think such an algorithm would have to be very carefully written.
Acaila
28th May 2003, 20:33
Well a quick workaround for that would be to only cut those parts with keyframes on the same frames.
No, you misunderstood me. What I meant was that a frame encoded with one matrix cannot serve as a reference for the following frame when that frame expects a reference frame encoded with some other matrix.
Suppose you have one video encoded with H263 and another with the MPEG matrix. If you take an I-frame from the first video and put a P-frame from the second after it, the P-frame will not have the correct data because it expects an I-frame encoded with MPEG, but gets one with H263.
The fact that a frame is encoded with either H263 or MPEG makes it look completely different and mismatched I-frames like that will corrupt all P- and B-frames referencing it.
So it's simply not possible to cut&paste a video together to get the most efficient matrices. Not without decoding-reencoding anyway.
Defiler
28th May 2003, 21:12
This sounds like a job for... KLUDGE!
http://sourceforge.net/projects/kludge
crusty
29th May 2003, 13:31
No, you misunderstood me. What I meant was that a frame encoded with one matrix cannot serve as a reference for the following frame when that frame expects a reference frame encoded with some other matrix.
Ok, I see. But that doesn't change the workaround because you can still cut on keyframes, because they have no reference.
Originally posted by Acaila
The only way that I can see your idea can work, is if by analysing multiple videos you could output a logfile that says with which QM to encode each frame based on which would result in the smallest frame. During encoding this logfile would then be used to encode the video with the appropriate QM.
But faster would be to just calculate multiple matrices during encoding and pick the one that results in the smallest frame. But would have to be incorporated into a future codec someday.
Why smallest frame? If I have a matrix that destroys lots of detail, it will almost always create the smallest frames, and will "win" the decision over all the other matrices I might have. I'd say something like PSNR coupled to size, but then again, PSNR would probably always be related to size (since in this stage all you're doing is reducing coefficients, which will always kill detail and thus reduce PSNR), so you'd have to involve some psychovisual mathematics in it.
Acaila
29th May 2003, 14:31
I followed the smallest frame scenario because that is what crusty originally mentioned when he brought that idea up. But you're right, something based on an optimal rate-distortion curve would probably be preferable. That way you try to achieve the highest PSNR possible while still keeping the same total bitrate as the original video.
Well ok in the 2nd pass it's doable :). But at constant quant a decision would be harder.
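The difference between "pick the smallest" and a rate-distortion style choice can be sketched as a per-section cost of size plus a weight times distortion. Everything here is hypothetical: the matrix names, the (size, distortion) numbers and the weight `lam` are invented purely to show the mechanics, and distortion is taken as any lower-is-better measure such as mean squared error.

```python
def pick_per_section(candidates, lam=0.0):
    """Choose one matrix per section by minimising size + lam * distortion.

    candidates maps a matrix name to a list of (size_bytes, distortion) tuples,
    one tuple per section. lam = 0 reduces to 'smallest section wins'.
    """
    names = list(candidates)
    sections = len(candidates[names[0]])
    choice = []
    for i in range(sections):
        best = min(names,
                   key=lambda n: candidates[n][i][0] + lam * candidates[n][i][1])
        choice.append(best)
    return choice


# HYPOTHETICAL measurements for two sections encoded with two matrices.
measurements = {
    "heavy":  [(1000, 4.0), (800, 9.0)],   # smaller, but more distortion
    "gentle": [(1200, 2.0), (900, 3.0)],   # larger, but less distortion
}

if __name__ == "__main__":
    print("smallest only :", pick_per_section(measurements, lam=0.0))
    print("with weighting:", pick_per_section(measurements, lam=50.0))
```

With `lam = 0` the heavier matrix wins every section, which is the objection raised above. With a non-zero weight the second section switches to the gentler matrix, because the size saved there no longer pays for the extra distortion. Choosing the weight, and a distortion measure that matches what the eye sees, is the hard part.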
Acaila
29th May 2003, 16:39
In the 2nd pass? We were talking about an external program to cut up videos that were already finished (which would always be suboptimal compared to encoder based optimization, but that's beside the point). Whether that video was created with 2-pass or 1-pass is no longer relevant at that point.
crusty
29th May 2003, 18:14
Of course it would be suboptimal. It would be relatively easy to implement too. I'm no programmer, but I'd say that probably one experienced person could program something like this in a few days.
That's a lot quicker than getting it through a CVS thread in an Opensource project. And just because something isn't in the codec doesn't have to mean it's not good. Nandub SBC anyone? :D
You could indeed have two 1-pass files encoded at constant Quantizer or constant quality and then have the tool merge the best parts based on different assumptions. These could be PSNR, filesize, Quantizer, PSNR/Filesize ratio, etc, whatever you can think of basically.
You could even add intelligent filters that would base the decision on motion estimation or luminance variations, for instance to switch QM's intelligently between fast-motion and low-motion scenes.
You could also do this with 2-pass files. Or Nth-pass if you like DivX, except of course that you can't use different QMs with DivX. :D
Of course, the more files you have to merge from, the fewer mergeable parts there will be, because there will be fewer and fewer identical keyframes with every extra file.
You also do not necessarily have to go for the smallest filesize, but instead for a certain quantizer range or PSNR. Smallest filesize for the same quality or the best possible quality at a given filesize are however my greatest interests.
I have to think a bit about how this would affect the end result of pre-processing noise filters. Say for instance you used a different filter-set for two 2-pass files.
You could devise a way to take the best parts of both files, those parts with the fewest artifacts, and still use heavy filtering.
One point I often find is that you cannot set filters at the strength you would like the most, without introducing horrible artifacts in some scenes in your movie. That's why I always encode end credits separately, even if it's just one minute, because you can use much heavier filtering on end credits.
You could take filtered and unfiltered files and simply take the best parts of both and merge them, if your decision algorithm is intelligent enough.
Maybe this is not possible, but maybe it is. A decision like that would probably not be based on PSNR because a filtered file would probably have a lower PSNR. Maybe you could add a PSNR offset to the filtered file and simply switch whenever the filtered file has a better PSNR (adjusted with the offset) than the unfiltered file... everybody still getting this?? :D :D
You can even take this one step further and make it a kind of general multi-file merge tool, with plugin modules that add different merge-decision-algorithms into the tool. And in the end you could even try to merge this tool itself with VirtualDubMod. :D
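A rough sketch of the merge decision described in this post, under stated assumptions: both encodes come from the same source, both start with a keyframe at frame 0, and per-frame PSNR values are already available. The function names, the `offset_b` parameter and the data are hypothetical; this is not an existing tool.

```python
def common_cut_points(keyframes_a, keyframes_b, total_frames):
    """Frames where both encodes have a keyframe; only there can we switch files."""
    shared = sorted(set(keyframes_a) & set(keyframes_b))
    return shared + [total_frames]


def merge_plan(psnr_a, psnr_b, cuts, offset_b=0.0):
    """For each segment between shared keyframes, pick the file with the
    higher mean PSNR. offset_b is added to file B (e.g. a filtered encode)."""
    plan = []
    for start, end in zip(cuts, cuts[1:]):
        mean_a = sum(psnr_a[start:end]) / (end - start)
        mean_b = sum(psnr_b[start:end]) / (end - start) + offset_b
        plan.append((start, end, "A" if mean_a >= mean_b else "B"))
    return plan


# Made-up example: 6 frames, keyframes only partly line up.
cuts = common_cut_points([0, 2, 4], [0, 3, 4], total_frames=6)
psnr_a = [40, 40, 40, 40, 38, 38]
psnr_b = [39, 39, 39, 39, 39, 39]

print(merge_plan(psnr_a, psnr_b, cuts))
print(merge_plan(psnr_a, psnr_b, cuts, offset_b=1.5))
```

Only frames 0 and 4 are keyframes in both files, so there are just two switchable segments. This also shows why adding more source files leaves fewer places to cut.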
JimiK
29th May 2003, 18:45
@Acaila
Did you compare the quality of the clips when you confirmed Didée's test? Because as we all know: higher PSNR does not necessarily mean better visual quality (that's why they used another method called JNI in a test by the German computer magazine c't. JNI should also consider the human visual system). Of course I know that Didée has been here for a long time and I don't doubt that this matrix is great, but it just makes me wonder why.
@all
Some thoughts about different matrices:
1. you would no longer be able to use bframes (as you might know, Modulated and bframes don't mix). Do you really think you could achieve the same quality without bframes, by just using better fitting matrices? (I mean quality at low bitrates).
2. you would have to store all these matrices (the decoder has to know the matrix the encoder used). Then you would have to set a flag saying which matrix is used. I don't know how modulated is implemented. But in the best case scenario, you would have to set a flag every time the matrix changes. In the worst case, you would need a flag for every frame that tells the decoder which matrix to use (well, the worst case would be if the matrix were not stored at a single point in the file, but had to be there every time it changes. But that is highly unlikely). Even in the best case, you would waste some bits. Say you have up to 256 different matrices. The flag would need 8 bits = 1 byte to say which matrix to use.
@crusty
You don't have to encode your credits separately. Just use the trim command in avisynth and filter this section more than the other. Then you can also use the xvid credits section and compress this part more than the others.
Best regards,
JimiK
crusty
29th May 2003, 19:18
Some thoughts about different matrices: 1. you would no longer be able to use bframes (as you might know, Modulated and bframes don't mix). Do you really think you could achieve the same quality without bframes, by just using better fitting matrices? (I mean quality at low bitrates).
You're both right and wrong here.
Modulated and B-frames don't mix because the Xvid codec doesn't alter the QM only at keyframes in modulated mode. (this should be a quick fix for modulated BTW)
Xvid takes h263 when the average quantizer is under one value and mpeg when it is over one value. It doesn't do this just at keyframes but whenever it feels like doing that. So altering the QM at a B or P frame would break predictability in the codec. Ergo, no modulated QM with B-frames.
But the method I mentioned only alters the QM in complete I-B-P frame sequences, so there should be no problem.
So yes, you could still use B-frames. You could even alter all the B-frame settings (ratio, offset and threshold) for every separate clip because those are only ENcoding settings and do not matter when DEcoding.
@point 2:
Good question. I have NO idea how big a QM actually is inside a mpeg4 stream. Anyone got the answer? :confused:
Say you have up to 256 different matrices
A bit overkill probably, I doubt anyone would ever use more than 8 matrices. :D
I encode my end credits using separate avs files and the trim function. And I do filter them much heavier and I DO use the lowest bitrate without introducing too many artifacts. That's what I said. :)
Acaila
29th May 2003, 20:09
Originally posted by JimiK
Did you compare the quality of the clips when you confirmed Didée's test? Because as we all know: higher PSNR does not necessarily mean better visual quality (that's why they used another method called JNI in a test by the German computer magazine c't. JNI should also consider the human visual system). Of course I know that Didée has been here for a long time and I don't doubt that this matrix is great, but it just makes me wonder why.
I quickly scanned through them and couldn't see any differences, however I doubt anyone could see the difference between e.g. a PSNR of 45 and a PSNR of 45.5 though (yes I know it's possible, but I couldn't do it). I did my tests at a compressibility of about 70%, so either will look great. And when both videos look great, I tend to label the one with the highest PSNR as the best.
Originally posted by crusty
Good question. I have NO idea how big a QM actually is inside a mpeg4 stream. Anyone got the answer?
All I know is that QM values are 8-bit, so those 64 values would take up 64 bytes. I don't know if they have to be repeated at every QM change, or can be set with a flag with all matrices written out in a header somewhere.
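The overhead under discussion is easy to estimate. The sketch below takes the figures from this thread (64 values of 8 bits each, one flag wide enough to index the matrices) and compares the two storage layouts being guessed at. Both layouts are hypothetical; nothing here describes how an MPEG-4 stream actually stores matrices.

```python
import math

VALUES_PER_MATRIX = 64   # one weight per position in an 8x8 block
BITS_PER_VALUE = 8       # as stated in the post above


def matrix_bytes():
    """Size of one matrix written out in full."""
    return VALUES_PER_MATRIX * BITS_PER_VALUE // 8


def flag_bits(num_matrices):
    """Bits needed to index one of num_matrices stored matrices."""
    return max(1, math.ceil(math.log2(num_matrices)))


def overhead_bytes(num_changes, num_matrices, repeat_matrix):
    """Total overhead for num_changes switches, under two hypothetical layouts:
    repeat_matrix=True  -> the full matrix is written at every change
    repeat_matrix=False -> matrices stored once in a header, changes use a flag"""
    if repeat_matrix:
        return num_changes * matrix_bytes()
    header = num_matrices * matrix_bytes()
    flags = math.ceil(num_changes * flag_bits(num_matrices) / 8)
    return header + flags


print(flag_bits(256), flag_bits(8))
print(overhead_bytes(1000, 8, repeat_matrix=True))
print(overhead_bytes(1000, 8, repeat_matrix=False))
```

Either way the cost is small next to a movie-sized file: even repeating the full matrix at 1000 changes adds about 64 KB.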
crusty
30th May 2003, 01:32
I quickly scanned through them and couldn't see any differences, however I doubt anyone could see the difference between e.g. a PSNR of 45 and a PSNR of 45.5 though (yes I know it's possible, but I couldn't do it).
That's probably just a matter of training.
After ripping 400 audio CDs I could differentiate easily between a 192 kbps mp3 and the original WAV...hell I could even hear the difference between CDs and records by then!
And I found lately when doing a lot of difficult audio syncing (Enterprise series from mpeg to xvid) that I could 'feel' the difference between 10 and 20 ms of delay. That's 1/100th of a second.
I guess that with enough training people would probably be able to see PSNR drops like you mentioned.
All I know is that QM values are 8-bit, so those 64 values would take up 64 bytes. I don't know if they have to be repeated at every QM change, or can be set with a flag with all matrices written out in a header somewhere.
Well that would mean that at every QM change 64 bytes extra are used, because I doubt that this would be implemented by flags.
Actually a .5 PSNR increase is quite a lot. It's noticeable on a full movie.
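To put 0.5 dB into perspective, here is a small sketch of the usual PSNR definition for 8-bit samples, 10 * log10(255^2 / MSE). The function names are illustrative only; the second function converts a PSNR difference into a ratio of mean squared errors.

```python
import math


def psnr(original, encoded, peak=255):
    """PSNR in dB between two equal-length sequences of 8-bit sample values."""
    mse = sum((o - e) ** 2 for o, e in zip(original, encoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


def mse_ratio(delta_db):
    """Factor by which the MSE shrinks for a PSNR gain of delta_db."""
    return 10 ** (delta_db / 10)


print(round(mse_ratio(0.5), 3))
```

Under this definition a 0.5 dB gain corresponds to roughly 11% lower mean squared error. Whether that is visible is the perceptual question argued over in the posts above.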
OUTPinged_
30th May 2003, 22:44
That's strange. I am getting pretty bad results with the provided matrixes on test clip.
It is actually a "torture" one, the codec is expected to provide good-looking picture with ~30% 1/2_pass_ratio. Last koepi's build is used, vhq4+bframes+chroma etc etc.
new matrixes were working in quantization range of 4-10 while h263 was using 2-6 range. h263 was looking pretty good everywhere, while custom matrixes looked like shit on hi-motion scenes where quantizers jumped up to 8 and beyond.
Interesting thing was that if i was using custom matrix with minimum value of 16 (increasing to 32 same way akaila's does), it looked ok again. Quantizer range used was 3-7.
So i picked a frame which was referred by KF and had practically same size for all matrixes.
it had: quant6 for h263, quant 7 for "16-32" matrix and quant9 for akaila's (i was using it since they both give pretty same results and incompatibility with ffdshow is a big no-no).
here are images that show this very well:
"http://www.mif.vu.lt/~dmku1330/lowkoeffbug-16-32 matrix at quant7.jpg"
"http://www.mif.vu.lt/~dmku1330/lowkoeffbug-akaila's matrix at quant9.jpg"
"http://www.mif.vu.lt/~dmku1330/lowkoeffbug-h263 matrix at quant6.jpg"
notice how 16-32 and h263 matrixes look pretty much the same while "akaila" shows up blocky.
screens were taken in vdub.
Does that occur everywhere where higher quantizer ranges are used? (literally, am i supposed to get crap quality with those low coeff. matrixes for low 1/2 pass ratios?) Maybe there is a bug someplace i missed?
Also, here is a small OT question: i thought that h263 was equivalent to "all 16" matrix but 16s are giving me higher compression ratio. Any ideas which matrix will represent h263 most closely?
OUTPinged_
30th May 2003, 22:46
@Acaila
i believe, QM is stored inside each intra frame.
Acaila
31st May 2003, 09:05
I don't know why these custom matrices performed badly in your high compression tests. The only reason I can think of is that they quantize low coefficients roughly the same way as high coefficients, whereas the standard matrices quantize the high coefficients much more strongly than the low ones. Once you get to high quants (>6) this could have a great effect on the image detail because you'll start losing a lot of low coefficients as well. But that's just a guess.
One thing I did notice is that the picture from the frame encoded with my matrix looked a lot clearer (less blurry) than the other two. Was the rest of the video also like that or was it just that one frame?
crusty
31st May 2003, 17:44
i believe, QM is stored inside each intra frame.
Damn, I keep forgetting the difference between inter and intra frame :(
That's why I talk about Key-frames and B/P-frames, cause that's a lot clearer.
Do you mean at every Key-frame?
One thing I did notice is that the picture from the frame encoded with my matrix looked a lot clearer (less blurry) than the other two.
Yeah I noticed that too. It also looks like most of this particular frame is ok but there is distortion in part of the picture, which gets amplified by Acaila's QM.
Was this distortion also present in the original? Please check.
Maybe if you could post a small clip (<250 frames) we could take a look at it.
EDIT: is it me or has this forum slowed down a bit the last few weeks?
OUTPinged_
3rd June 2003, 08:10
@acaila:
In fact, it isn't any "clearer" or a tiny bit better-looking in any other frame.
The "regular encoding frame" used h263 and looked pretty much as it is supposed to. Use it as a reference.
"16-32" matrix uses the same matrix coefficient scaling as your matrix does, it is just bumped up so that 16 is the lowest coeff.
To me this proves that in certain scenarios matrixes with "under-16" coefficients may be buggy. Maybe it is that particular build of xvid i am using (22032003-1, by koepi).
@crusty:
Do you mean at every Key-frame?
Yes.
Was this distortion also present in the original? Please check.
Checked it several times.
You can try it yourself:
Take a 20-30 second clip from a clean dvd source at high resolution. It should contain both a slow and a high motion scene (slo-mo after hi-mo in my case). Then:
1. Enable chroma_me+bframes+vhq4+me6.
2. Select the h263 quantizer and make the first pass.
3. Calculate a target filesize so that it gives about a 35% first/second pass ratio, then make the second pass.
4. Now make both passes with akaila's matrix, and you'll get a new clip with the same size.
5. Zoom both of them in VirtualDub and examine.
My point is that akaila's matrix provides better quality only in those "special" cases, and shouldn't be used in "high compression ratio" encodes.
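The target size in that recipe is simple arithmetic. A small sketch, assuming "first/second pass ratio" means the second-pass target as a fraction of the first-pass size (the post does not define it), with made-up numbers and hypothetical function names:

```python
def second_pass_target(first_pass_bytes, ratio=0.35):
    """Target size for the second pass, as a fraction of the first-pass size."""
    return round(first_pass_bytes * ratio)


def average_bitrate_kbps(size_bytes, duration_seconds):
    """Average video bitrate implied by a file size and clip length."""
    return size_bytes * 8 / duration_seconds / 1000


first_pass = 20_000_000          # hypothetical first-pass size in bytes
target = second_pass_target(first_pass)
print(target, average_bitrate_kbps(target, duration_seconds=30))
```

Using the same target for both matrices is what makes the resulting clips comparable frame by frame.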
Didée
3rd June 2003, 10:21
OUTPinged_ ,
seems like your interpretation of what was said before is a little screwed.
1. Initially, I mentioned that my matrix SixOfNine was invented for high bitrate scenarios.
@ Acaila: SixOfNine, not Seven. For sure, that poor little matrix can't cope with the bombshell Jeri Ryan ;)
When you are going to test things with a compressibility in the range of 30-35%, which is pretty low, you should try to use a matrix that reduces bitrate, instead of raising it.
A screwdriver is pretty handy for screwing screws, but I'd not try to saw any wood with it!
But, perhaps my matrix is a ScrewHammerSeaSaw? Read on.
2. Looking at that one frame, your testing material seems ... funny. But just for completeness, you should repeat that test with my matrix instead of Acaila's:
In that frame, we see mostly high-blurry content, with no edges at all, plus that funny distortion consisting of very fine detail, almost like noise. Now, if you compare the matrices of Acaila and me, you will see that my matrix will keep high frequencies much longer, when they already will get zeroed by Acaila's. That could explain the bad result you got.
I'm not sure whether my matrix will perform better in your special tests, or not. But you should TRY it, since they both are different.
Just to repeat myself again:
SixOfNine was designed to behave very similar to the standard matrix, but with a smoother quantizer scaling. Thusly, it should perform not too bad even with your heavy compression tests.
3. Acaila spells Acaila, not Akaila.
4. I disagree with your
incompatibility with ffdshow is a big no-no
This is only true, of course, when it is important for you to spread your stuff all over the world. I don't.
My encodings are for me, and some very good friends, perhaps. Why should I rely on ffdshow for decoding when it is obvious that libavcodec is buggy. If it will be fixed - fine. As long as the bug is there, I feel very comfortable to let XviD itself decode its content.
Using coeffs < 16 is perfectly allowed. If ffdshow can't handle that, I don't care.
5. More citing:
... Last koepi's build is used ...
... that particular build of xvid i am using (22032003-1) ...
... VHQ4 ...
Now, if your "last Koepi's build" is from March 22: Welcome back on earth, OUTPinged_ ;)
In its early stages, VHQ > 1 often showed severe problems. But VHQ has been improved considerably in Koepi's 14052003 build. Perhaps you should upgrade.
6. Completely OT
Even if it is very, very late:
OUTPinged_, I still have to thank you for your elaboration on XviD's AltCC-settings. (Do you remember that monster-thread?) It was your detailed explanation that made me finally understand it :)
Regards
Didée
OUTPinged_
4th June 2003, 11:28
Initially, I mentioned that my matrix SixOfNine was invented for high bitrate scenarios
That wasn't it. I was testing if your "high bitrate" matrix is ok to use with medium/low bitrate encodes. If it would be performing as it was intended to (that means, perfectly compatible with near linear 8->20 scaling), it would give out similar picture.
If you use inter coefficients of 16 for matrix A and 32 for matrix B, then identical frames with identical references will have the same quality if matrix A's quantizer is 2 times larger than matrix B's (so that coefficient times quantizer is the same for both).
For some strange reason that wasn't true for any matrixes that were using coefficients under 16 when i tested - first, the rate control algorithm went nuts and wasn't hitting frames at desired sizes - i had to tweak "overflow improvement/degradation" settings in order to get the same results (which i don't usually have to). Second, the output came out looking far worse than it should.
The worst thing was that when i updated to later xvid version, the difference decreased. Crap. :/
Anyway, for myself i decided that matrixes with under-16 coefficients are not 100% safe to use with xvids and playback software. I want to use them to encode releases so it has to be compatible :-(
Also, i would like to state this thing:
for "better-than-quantizer2" quality it is useless to further reduce high frequencies - they are always encoded perfectly at quant2 with most matrixes, and i have never seen any movie where i would be able to say "this high-frequency part doesn't look good enough with quant2". The scenes that don't look good are all foggy/dark/low frequency ones. But the interesting thing is that your matrixes don't provide much lower low-freq. coefficients than h263 does!
Simple as that: let's imagine we have a foggy scene with low details behind the fog, and it gets smeared even if we use quant2 for it. Unforgivable! Now look what coefficients we will have for 4 last digits with different matrixes:
h263 48
akaila 66-80
didee 40
mpeg*** 112-166
cg-animation* 32
low bitrate* 60-200
very low bitr.* 200-200
ultimate* 64
hvs-best** 80-260
hvs-better** 166-234
hvs-good** 176-288
*-were in "xvid custom quantization matrices.zip" file that was posted on this forum long ago
**-was posted on this forum as "HVS matrix pack" or something like that.
***-values were taken from some smart paper.
No comments on that. The thing i wanted to ask was, how large coefficients are enough for low freq., what do you think?
(oh dear, this post is gonna be long..)
When you are going to test things with a compressibility in the range of 30-35%, which is pretty low, you should try to use a matrix that reduces bitrate, instead of raising it.
Matrixes don't "reduce" bitrate. To check that, make one matrix with all 16s and one with all 64s. Now encode the first with constant quant8, the second with constant quant2 and check for differences.
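That check can be reproduced on paper. A minimal sketch, assuming a simplified quantizer in which each coefficient is divided by (matrix weight times quantizer) and truncated toward zero; the coefficients are made up, and real encoders add rounding details that are not modeled here:

```python
def quantize(coeffs, matrix, quantizer):
    """Simplified model: divide each coefficient by weight * quantizer and
    truncate toward zero. Real encoders add rounding rules not shown here."""
    out = []
    for c, w in zip(coeffs, matrix):
        step = w * quantizer
        q = abs(c) // step
        out.append(q if c >= 0 else -q)
    return out


coeffs = [620, -310, 145, -90, 40, -12, 5, 0]   # made-up DCT coefficients
flat16 = [16] * len(coeffs)
flat64 = [64] * len(coeffs)

a = quantize(coeffs, flat16, quantizer=8)   # step 16 * 8 = 128
b = quantize(coeffs, flat64, quantizer=2)   # step 64 * 2 = 128
print(a, a == b)
```

In this model only the product of weight and quantizer matters, so a flat matrix by itself cannot lower the bitrate. What a matrix can change is how that step size is distributed across frequencies.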
A screwdriver is pretty handy for screwing screws, but I'd not try to saw any wood with it!
You are telling that to a wrong person :-)
In that frame, we see mostly high-blurry content, with no edges at all, plus that funny distortion consisting of very fine detail, almost like noise.
Yes, that is what makes it perfect for testing things like that. I can very easily tell if low/high frequencies are cut out by quantization. The other part of the clip has a couple of seconds where heavy mosquito noise pops up.
Now, if you compare the matrices of Acaila and me, you will see that my matrix will keep high frequencies much longer
I am amused to say that, but for some reason i thought that your matrix scales up to 40 too. Some of my sentences don't apply to your matrix then. But again, it was buggy as hell with that build of xvid i used.
3. Acaila spells Acaila, not Akaila.
roger that sir <o.
Why should I rely on ffdshow for decoding when it is obvious that libavcodec is buggy
For release encodes that is essential - too many linux pcs around.
In its early stages, VHQ > 1 often showed severe problems
No, VHQ wasn't guilty in that case.
It was your detailed explanation that made me finally understand it
Well thank you for that, but i am sad to see that it is now officially recommended to use regular CC algo to not confuse newbies :-( Sheesh, even doom9 used it in his test encodes. Crap.
Arcon
15th June 2004, 14:10
just a short question: is the decoding of mpeg4 material generic or does the decoder need to have the same QM that has been used for the encoding?
and is every mpeg4 video encoded with a custom matrix mpeg4 compliant or is only h.263 allowed to claim full mpeg4 compatibility?
bond
15th June 2004, 14:34
Originally posted by Arcon
just a short question: is the decoding of mpeg4 material generic or does the decoder need to have the same QM that has been used for the encoding?
there are 3 possibilities
1) h263, which every mpeg-4 (advanced) simple profile decoder handles
2) mpeg, which every mpeg-4 advanced simple profile decoder handles
3) custom matrix, which every mpeg-4 asp decoder should handle too (to answer your question: no, it doesn't matter which custom matrix was used)
and is every mpeg4 video encoded with a custom matrix mpeg4 compliant or is only h.263 allowed to claim full mpeg4 compatibility?
all 3 possibilities are compliant to the mpeg-4 standard
Arcon
15th June 2004, 17:38
Originally posted by bond
3) custom matrix, which every mpeg-4 asp decoder should handle too
is the matrix included in the video-stream in this case or is knowing the exact matrix irrelevant for decoding purposes?
RadicalEd
15th June 2004, 18:05
Yeah, the matrix is always included in the bitstream (in the case of mpeg quantization).
Arcon
15th June 2004, 18:09
Originally posted by RadicalEd
(in the case of mpeg quantization).
and if it's not mpeg quant? h263 does not need to be included but everything else?
crusty
17th June 2004, 12:55
AFAIK, the QM is always contained in the file, no matter if it is a custom one or a standard one.
And even though things like modulated QM are not MPEG-4 compliant, they do seem to work more often than not. So, within one videostream it is possible (but not up to spec!!) to use different QM's.
Because these streams get decoded (played) correctly most of the time, I would deduce that the QM is stored in the stream.
I have no idea how, or if this is even correct. Perhaps some dev could shed some light on the subject.... :)
RadicalEd
17th June 2004, 22:26
Originally posted by Arcon
and if it's not mpeg quant? h263 does not need to be included but everything else?
H263 quantization is not actually a matrix, nor can it be directly compared or converted to one. It's just a method of quantizing that doesn't use weights (which MPEG does, and those are the quant matrix elements).
Shinigami-Sama
14th March 2005, 08:38
wow
if I wasn't so dizzy from reading all of this I would form a coherent comment, but until then, if for nothing more than getting e-mail notifications for later
wow
everything makes a small amount of sense now
I can now understand why a quant matrix is as important as it is
and the basics of how they work
this is probably full of horrible typos
but I can't see them yet
so from what I understand and from serious thinking from an overloaded mind *looks at his title so you know in advance*
using more than one QM breaks mpeg4 specs
and if you don't care about spec this has little effect on you
so encoding with more than one QM can be good if you're planning to watch with a s/w player
how could I go about making it use two or more QMs? I may have read it, I'm not sure
but is it possible to specify where I want it to use a different QM, if at all?
before I confuse you more with my bad typing and confused post I'll finish it here
p.s
thanks crusty for making this thread
Sharktooth
14th March 2005, 15:54
Actually XVID does support only 1 quant matrix. "Modulated" feature was removed coz it was not MPEG4 compliant.
Shinigami-Sama
15th March 2005, 02:53
ahh
well I don't care about compliancy anyways
just as long as it will play on a PC that's all that matters for me
now I understand the basics I think
only took a day and a half for it to sink in
crusty is good at explaining things it seems
hopefully in a few days I may be able to put together a newbish CQM myself
samo_jurdik
27th June 2006, 17:09
Ok Folks, here is an explanation of what a matrix really does.
Guys, this is my first post in this forum....and I have a question. Apologies upfront in case it is a stupid one...
I don't understand the relation between the 'quantizers' and the 'quantums' in the q matrices. I will narrow my focus down to iframes (needed at all?).
Quote:
http://www.ece.purdue.edu/~ace/jpeg-tut/jpgquan1.html
"The quantizer divides the DCT coefficient by its corresponding quantum, then rounds to the nearest integer. Large quantums drive small coefficients down to zero. The result: many high frequency coefficients become zero, and therefore easier to code."
MY QUESTION: is 'quantizer' the name of the result of the division of the DCT signal by the corresponding quantum from the q matrix??? ie quantizer = round (DCT / quantum) ???
I hope the answer is yes, but the following quotation makes me doubt:
http://www.cmlab.csie.ntu.edu.tw/cml/dsp/training/coding/jpeg/jpeg/encoder.htm
"Quatization is defined as division of each DCT coefficient by its corresponding quantizer step size, followed by rounding to the nearest integer"
In here, 'quantizer' is what I believe should be 'quantum'.
I would greatly appreciate if you guys could shed some light into this wording mess (as it appears to me at least). Thank you!
Samo
foxyshadis
27th June 2006, 22:06
Result[x] = pDCT[x] / (Matrix[x] * Quantizer)
All results are rounded down (never up) to a minimum of 0. You can rearrange the algebra to find the meaning of Quantizer. The exact terminology is a little... malleable, and some researchers will use terms to mean different things than others.
It doesn't actually code the raw DCT, but rather a predicted DCT based on previous DCTs.
Note that AVC's equation is more complex, because the Quantizer scale is logarithmic.
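The formula above can be tried out directly. The sketch below applies it to a made-up 8x8 block with a made-up matrix whose weights grow toward the high-frequency corner, then multiplies back to show what is lost. It implements only the simplified equation as written in this post, not the exact arithmetic of any particular codec, and the function names are illustrative.

```python
def quantize_block(block, matrix, quantizer):
    """Result[x] = pDCT[x] / (Matrix[x] * Quantizer), magnitude rounded down."""
    result = []
    for row_c, row_m in zip(block, matrix):
        out_row = []
        for c, w in zip(row_c, row_m):
            level = abs(c) // (w * quantizer)
            out_row.append(level if c >= 0 else -level)
        result.append(out_row)
    return result


def dequantize_block(levels, matrix, quantizer):
    """Inverse step: multiply back; the truncated remainder is lost."""
    return [[l * w * quantizer for l, w in zip(row_l, row_m)]
            for row_l, row_m in zip(levels, matrix)]


# Made-up 8x8 data: coefficients shrink and weights grow toward the
# high-frequency (bottom-right) corner.
block = [[400 // (1 + r + c) for c in range(8)] for r in range(8)]
matrix = [[8 + 2 * (r + c) for c in range(8)] for r in range(8)]

levels = quantize_block(block, matrix, quantizer=4)
recon = dequantize_block(levels, matrix, quantizer=4)

zeros = sum(level == 0 for row in levels for level in row)
print(levels[0])
print(zeros, "of 64 coefficients quantized to zero")
```

Each reconstructed value differs from the original by less than Matrix[x] * Quantizer, which is one way to read the "meaning of Quantizer": together with the matrix weight it sets the step size, and so the maximum error, per coefficient.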
samo_jurdik
3rd July 2006, 11:23
Result[x] = pDCT[x] / (Matrix[x] * Quantizer)
Contemplating what this means....
In the 1st pass, the codec sets the quantizers to values of 2, 2 and 4 for the respective frame types and runs the source movie through to get the bit rate picture.
In the 2nd pass the codec then dynamically changes the quantizers in order to get close to the requested bit rate / file size (considering the additional 2nd pass settings - iframe tuning, overflow, curve comp and possible quantizer limitations set in Advanced options).
Can you please confirm whether this is approximately correct or not. Thank you ! :thanks:
Samo
Poutnik
4th July 2006, 13:56
You have got it.
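The two-pass idea confirmed here can be illustrated with a toy model. It assumes that frame size is inversely proportional to the quantizer and scales every first-pass quantizer by one common factor. That is a deliberate oversimplification for illustration, not Xvid's actual rate control, which also applies the curve compression, overflow and I-frame settings mentioned above. All names and numbers are hypothetical.

```python
def second_pass_quants(first_pass_sizes, first_pass_quants, target_bytes,
                       min_q=2, max_q=31):
    """Toy model: assume frame size is inversely proportional to the quantizer,
    then scale every quantizer by the same factor to approach target_bytes."""
    total = sum(first_pass_sizes)
    scale = total / target_bytes
    quants = []
    for q in first_pass_quants:
        new_q = round(q * scale)
        quants.append(max(min_q, min(max_q, new_q)))
    return quants


# Hypothetical first pass: four frames encoded at fixed quantizers 2, 2, 4, 2.
sizes = [30000, 8000, 4000, 8000]
quants = [2, 2, 4, 2]
print(second_pass_quants(sizes, quants, target_bytes=25000))
```

Halving the size budget doubles every quantizer in this model. A real encoder would instead spend the budget unevenly, giving more bits to the frames that need them.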
omega6666
31st March 2007, 07:59
I also found this article to be very informative about the way quantization works; http://direct.xilinx.com/bvdocs/appnotes/xapp615.pdf
"Quantization is done to achieve better compression. Quantization reduces the number of bits needed to store information by reducing the size of the integers representing the information in the scene. These are details that the human visual system ignores. This step represents one key segment in the multi- compression process. A reduction in the number of bits reduces storage capacity needed, improves bandwidth, and lowers implementation costs."
BluDRed
10th November 2008, 05:12
Here's a bit of a noobish question.
With some of the videos I compress, I find that dark scenes tend to come out blocky, whereas the light scenes are coming out good with no blockiness. Now, I think I understand that a dark scene is defined as being particularly low frequency detail. This is when using the default XviD matrix H.263, would I be better off customising a matrix with a higher starting figure then continuing with less of a margin until reaching medium frequency detail for such scenes? ack I hope I'm not coming across as hopeless :D