29th May 2003, 00:45 | #1 | Link |
Flying Attack Donkey
Join Date: Mar 2002
Posts: 159
|
Spectral Sub-band Separation in H.26L...
... aka The "Integer DCT" transform.
Why the committee chose it is a mystery to me, for two main reasons.
1. It is clearly worse than the DCT at separating the different frequencies; numerous tests have shown that. The argument that it allows more accurate reconstruction doesn't fly: even at a low quantizer, roundoff will almost certainly make any difference very small.
2. It doesn't allow DCT-domain sub-pixel motion estimation. Since that approach is faster (and may possibly give better quality), it is a must in a system like H.26L, where even 8th-pel precision is allowed.
Does anyone know the reasons?
__________________
"Bork Bork Bork" |
29th May 2003, 03:35 | #4 | Link |
retired developer
Join Date: Oct 2002
Location: Canada
Posts: 8,978
|
Geez... this is high-tech stuff...
A little question from an H.264 noob: Is H.26L = H.264?
__________________
Detritus Software |
29th May 2003, 10:32 | #5 | Link | |||
Registered User
Join Date: Jun 2002
Location: Adelaide, Australia
Posts: 1,167
|
Re: Spectral Sub-band Separation in H.26L...
Quote:
Quote:
Quote:
Again, it might be that I don't understand and am wrong. Please explain: I will be doing "Kludge" and I'm planning on using an integer transform. (Just note: I _will_ need 100% perfect reconstruction; some things will be predicted from the picture, and the prediction will determine the VLCs used, so it must be the same on both ends.) Regards, Radek |
|||
29th May 2003, 10:44 | #6 | Link | |
Registered User
Join Date: Oct 2001
Location: Germany
Posts: 7,277
|
@sirber:
Quote:
|
|
29th May 2003, 11:56 | #7 | Link | ||||
Flying Attack Donkey
Join Date: Mar 2002
Posts: 159
|
Quote:
Quote:
Quote:
Quote:
__________________
"Bork Bork Bork" |
||||
29th May 2003, 12:35 | #8 | Link | ||
Registered User
Join Date: Jun 2002
Location: Adelaide, Australia
Posts: 1,167
|
Quote:
Do you happen to know how much I can lose when changing the 8x8 transform (from the DCT to this integer one)? Quote:
Radek |
||
29th May 2003, 15:36 | #9 | Link |
Registered User
Join Date: Mar 2002
Posts: 1,075
|
The integer transform is quite a bit different from the DCT, so I wouldn't be surprised by the 7% number ... but being different from the DCT doesn't say much about the compression. There are integer transforms which approximate the DCT very well, but they take a lot more multipliers and adders.
Does anyone have any numbers on coding loss, either measured with constant PSNR or with constant size, for non-intra frames? (I'm not interested in some measure of decorrelation/energy compaction/whatever; only cold hard bits after entropy coding count.) I seriously doubt it is significant. Transform-domain MC doesn't seem to make much sense with such small blocks and long interpolation filters. As for ME ... encoder complexity was always a secondary concern. Personally I still think the most appropriate way of coding the DFD is VQ. Last edited by MfA; 29th May 2003 at 16:20. |
29th May 2003, 19:24 | #11 | Link | |
Registered User
Join Date: Mar 2002
Posts: 863
|
Quote:
|
|
1st June 2003, 02:38 | #12 | Link | |
Registered User
Join Date: May 2003
Posts: 328
|
Quote:
Concerning the DCT mismatch issue, it is more visible in MPEG-4 than in MPEG-1/2 because the usual GOP is much larger (~300 frames is common) and because most encoders don't intra-code a macroblock after it has been predictively coded 132 consecutive times, which the standard requires in order to limit the accumulation of DCT mismatch errors. This mismatch problem is probably one of the motivations that led to the choice of an integer transform. -- bobololo. |
|
1st June 2003, 08:20 | #13 | Link | ||
Registered User
Join Date: Jan 2003
Posts: 69
|
Re: Spectral Sub-band Separation in H.26L...
Quote:
1. It's easy to implement and takes a lot of computational load off the CPU.
2. It can be implemented on 16-bit processors (which makes the semiconductor industry very happy, I guess). And of course it fixes the compliance problem of the "old" DCT.
3. I guess someone will collect a bunch of money from the royalties (politics is an issue as well).
Quote:
The standard is supposed to define the best methods to code the data and construct a compliant stream. The algorithms to use are up to you.
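To illustrate the "easy to implement" point, here is a minimal sketch of a 4-point forward transform that needs only additions, subtractions and one-bit shifts. It assumes the 4x4 core matrix of the final H.264 standard (rows 1 1 1 1 / 2 1 -1 -2 / 1 -1 -1 1 / 1 -2 2 -1), which is not quoted anywhere in this thread and differs from the 13/17/7 draft matrix discussed further down. The names `CORE`, `forward_4pt` and `forward_4pt_reference` are made up for the example.

```python
# Assumed matrix: 4x4 core transform of the final H.264 standard.
CORE = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

def forward_4pt(x0, x1, x2, x3):
    """One-dimensional 4-point transform using only adds, subtracts and shifts."""
    s0 = x0 + x3          # sums feed the even outputs
    s1 = x1 + x2
    d0 = x0 - x3          # differences feed the odd outputs
    d1 = x1 - x2
    return (
        s0 + s1,          # row 0:  1  1  1  1
        (d0 << 1) + d1,   # row 1:  2  1 -1 -2
        s0 - s1,          # row 2:  1 -1 -1  1
        d0 - (d1 << 1),   # row 3:  1 -2  2 -1
    )

def forward_4pt_reference(x):
    """Same transform as a plain matrix-vector product, for comparison."""
    return tuple(sum(c * v for c, v in zip(row, x)) for row in CORE)

if __name__ == "__main__":
    sample = (10, -3, 7, 255)
    print(forward_4pt(*sample))
    print(forward_4pt_reference(sample))
```

With 9-bit residual inputs the intermediate values stay far below the 16-bit limit, which is presumably what point 2 is getting at; the sketch itself does not check overflow.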
||
1st June 2003, 14:53 | #14 | Link |
Registered User
Join Date: Jun 2002
Posts: 76
|
In this zip file there is also a program (not really good: it has some calculation errors, and higher multipliers mean less rounding error...) that can calculate a matrix for an integer DCT. I've made some tests, and the 4x4 matrix of H.26L is the best I've seen; no other matrix is more accurate than this one. OK, there is some error if you want a lossless forward/inverse transform. I don't know much about float transforms, but I think they have equal or larger rounding errors than this integer DCT.
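The program in the zip is not reproduced here, but the general idea of deriving an integer matrix from the DCT can be sketched as follows: build the orthonormal DCT-II basis, multiply by a constant and round. This is only my guess at the approach, not the actual code from the zip. The function names are invented, and the multiplier 26 is chosen because it happens to reproduce the 13/17/7 matrix quoted later in this thread.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        # Row 0 uses sqrt(1/n), all other rows use sqrt(2/n).
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows

def integer_approximation(n, multiplier):
    """Scale the DCT basis by `multiplier` and round every entry to an integer."""
    return [[round(multiplier * c) for c in row] for row in dct_matrix(n)]

if __name__ == "__main__":
    for row in integer_approximation(4, 26):
        print(row)
```

A larger multiplier tracks the DCT more closely but needs more bits per coefficient, which is the trade-off mentioned in the post.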
|
1st June 2003, 15:32 | #15 | Link |
Registered User
Join Date: Jan 2003
Posts: 69
|
@bergi
I took a brief look at your tests. I think you might be wrong in constructing the matrices. You see, the inverse transform is not the transpose of the forward transform, but the inverse matrix. Anyway, the math isn't that hard. I calculated the inverse transform for:
 13  13  13  13
 17   7  -7 -17
 13 -13 -13  13
  7 -17  17  -7
and it's:
 20  26  20  11
 20  11 -20 -26
 20 -11 -20  26
 20 -26  20 -11
with 10 bits of accuracy. You might want to correct the code and run it with the right matrices. You can use MATLAB to calculate the inverse transforms. |
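For anyone without MATLAB, the numbers above can be checked with a few lines of plain Python. This is a sketch that only handles matrices whose rows are mutually orthogonal with equal norm, which is the case for the 13/17/7 matrix (every row has squared norm 676). For such a matrix the exact inverse is the transpose divided by that norm, so as far as I can tell the transpose and the true inverse differ here only by a scale factor. The helper names are made up, and "10 bits" is taken to mean scaling by 2**10 before rounding.

```python
from fractions import Fraction

FORWARD = [
    [13,  13,  13,  13],
    [17,   7,  -7, -17],
    [13, -13, -13,  13],
    [ 7, -17,  17,  -7],
]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def scaled_inverse(m, bits):
    """Return round(2**bits * inverse(m)) for a matrix with orthogonal, equal-norm rows."""
    gram = matmul(m, transpose(m))
    norm = gram[0][0]
    n = len(m)
    # Guard: the shortcut below is only valid if m * m^T == norm * identity.
    if any(gram[i][j] != (norm if i == j else 0) for i in range(n) for j in range(n)):
        raise ValueError("rows are not orthogonal with equal norm")
    # inverse(m) == transpose(m) / norm, computed exactly before rounding.
    return [[round(Fraction(v * 2 ** bits, norm)) for v in row] for row in transpose(m)]

if __name__ == "__main__":
    for row in scaled_inverse(FORWARD, 10):
        print(row)
```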
1st June 2003, 15:46 | #16 | Link |
Registered User
Join Date: Jun 2002
Posts: 76
|
@shlezman
I don't think so. The 4x4 matrix calculated with my program is the same as the H.26L matrix. And matrices calculated with my program were used in my other program (for transforming a BMP) included in the zip file, and I didn't have any (big) noticeable errors. |
1st June 2003, 16:48 | #17 | Link |
Registered User
Join Date: Jan 2003
Posts: 69
|
The integer transform of H.264 is a rough approximation to the 4x4 DCT.
The matrix that you use is a much better approximation to that transform, so the results of a forward transform followed by an inverse transform MUST be 99.9% error free. Test that with no quantization at all (quantization of 1), then calculate the PSNR:
sq2 = sum of the squared error (pixelwise)
psnr = 10*log10(255*255*image_width*image_height/sq2)
If it's less than 60 dB then the calculation is wrong; otherwise your calculations are correct and I'm wrong. |
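The PSNR check described above can be written out as a short sketch. It assumes 8-bit images stored as equally sized 2-D lists of pixel values; the function name `psnr` and the `peak` parameter are just for illustration.

```python
import math

def psnr(original, reconstructed, peak=255):
    """PSNR in dB between two equally sized 2-D lists of pixel values."""
    height = len(original)
    width = len(original[0])
    # sq2: sum of the squared pixelwise error over the whole image.
    sq2 = sum((a - b) ** 2
              for row_o, row_r in zip(original, reconstructed)
              for a, b in zip(row_o, row_r))
    if sq2 == 0:
        return math.inf  # identical images: no error at all
    return 10 * math.log10(peak * peak * width * height / sq2)

if __name__ == "__main__":
    a = [[0, 0], [0, 0]]
    b = [[0, 0], [0, 1]]
    print(psnr(a, b))
```

To run the test from the post, feed it the source picture and the result of forward plus inverse transform with no quantization, then compare the value against the 60 dB threshold.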
3rd June 2003, 21:27 | #18 | Link |
Registered User
Join Date: Jun 2002
Posts: 76
|
@shlezman
OK, you are right, but the difference wasn't that big. I don't remember the numbers for sure, I think 77,??? for both; your matrix was a little bit better. I also think I've found the error: I calculated the inverse transform for the float forward transform, and your matrix is the inverse transform for the (rounded) integer transform, right? I'm going to update the program in the next few days, but first I want to change some things:
- clean up the code (it should be easy for everybody to understand)
- recalculate all the inverse matrices, because the 8x8, 16x16 and 32x32 ones were calculated the same way (can anybody give me a source on how to calculate the inverse transform this way?)
- perhaps add some wavelet block matrices (test only; I don't think wavelets will look good on small blocks)
|
|