Old 29th May 2003, 00:45   #1  |  Link
SirDavidGuy
Flying Attack Donkey
 
Join Date: Mar 2002
Posts: 159
Spectral Sub-band Separation in H.26L...

... aka The "Integer DCT" transform.

Why the committee chose it is a mystery to me, for two main reasons.

1. It is clearly worse than the DCT in terms of its ability to separate the different frequencies; numerous tests have shown that. The argument which says that it allows more accurate reconstruction doesn't fly. Even at a low quant, the roundoff will almost certainly make any difference very low.

2. It doesn't allow DCT-domain sub-pixel motion estimation. Since this approach is faster (and may possibly present better quality), in a system like H.26L, where even 8th-pel precision is allowed, this is a must.

Any reasons as to why?
__________________
"Bork Bork Bork"
Old 29th May 2003, 00:53   #2  |  Link
bergi
Registered User
 
Join Date: Jun 2002
Posts: 76
Quote:
The argument which says that it allows more accurate reconstruction doesn't fly
Another argument:
Integer instructions are faster than floats.
Old 29th May 2003, 01:28   #3  |  Link
SirDavidGuy
Flying Attack Donkey
 
Join Date: Mar 2002
Posts: 159
Quote:
Originally posted by bergi
Another argument:
Integer instructions are faster than floats.
But this is addressed in point 2.
__________________
"Bork Bork Bork"
Old 29th May 2003, 03:35   #4  |  Link
Sirber
retired developer
 
Join Date: Oct 2002
Location: Canada
Posts: 8,978
Geez... this is high-tech stuff...

A little question from an H.264 noob: is H.26L = H.264?
__________________
Detritus Software
Old 29th May 2003, 10:32   #5  |  Link
sysKin
Registered User
 
Join Date: Jun 2002
Location: Adelaide, Australia
Posts: 1,167
Re: Spectral Sub-band Separation in H.26L...

Quote:
Originally posted by SirDavidGuy
1. It is clearly worse than the DCT in terms of its ability to separate the different frequencies; numerous tests have shown that.
Are you sure about it? I've read that the difference between this transform and the DCT is up to 1% of any coefficient. I might be wrong though. Also, I don't see why separating frequencies is important.
Quote:
The argument which says that it allows more accurate reconstruction doesn't fly. Even at a low quant, the roundoff will almost certainly make any difference very low.
I disagree. DCT problems are there, and they are very ugly. Do you remember XviD qpel smearing? It's horrible. Even if we can solve it on PCs by using the same DCT for all projects, I fear that _all_ qpel videos will not be decodable on stand-alone players.
Quote:
2. It doesn't allow DCT-domain sub-pixel motion estimation. Since this approach is faster (and may possibly present better quality), in a system like H.26L, where even 8th-pel precision is allowed, this is a must.
Can you explain? You always can do DCT if you want to, and you won't use the data during encoding anyway (I doubt 4x4 transforms will do any good for ME)... so I don't see the point.
Again, it might be that I don't understand and am wrong. Please explain: I will be doing "Kludge" and I'm planning on using an integer transform (just note: I _will_ need 100% perfect reconstruction; some stuff will be predicted from the picture, and the prediction will determine the VLCs used, so it must be the same on both ends).

Regards,
Radek
__________________
Visit #xvid or #x264 at irc.freenode.net
Old 29th May 2003, 10:44   #6  |  Link
Selur
Registered User
 
Join Date: Oct 2001
Location: Germany
Posts: 7,277
@sirber:
Quote:
The H.264/MPEG-4 AVC standard has had several names over the course of its development. It was initially known as ITU-T H.26L and is now formally becoming Part 10 of the ISO/IEC MPEG-4 standard identified as ISO/IEC 14496-10 AVC.
source: VideoLocus Introduces World's First Real-Time H.264/MPEG-4 AVC Standard Definition Video Encoder
Old 29th May 2003, 11:56   #7  |  Link
SirDavidGuy
Flying Attack Donkey
 
Join Date: Mar 2002
Posts: 159
Quote:
I've read that the difference between this transform and DCT is up to 1% of any coefficient.
I got different figures; around 7-8%.

Quote:
I don't see why separating frequencies is important.
The more into the frequency domain you rotate it, the less noticeable (in the spatial domain) the quantization step will be. To a much smaller extent, sort of like the difference between quantization in the frequency (DCT) domain and the spatial domain (direct quantization of the data).

Quote:
I disagree. DCT problems are there, and they are very ugly. Do you remember XviD qpel smearing? It's horrible. Even if we can solve it on PCs by using the same DCT for all projects, I fear that _all_ qpel videos will not be decodable on stand-alone players.
Then the "little" difference shouldn't mean much, should it?

Quote:
Can you explain? You always can do DCT if you want to, and you won't use the data during encoding anyway (I doubt 4x4 transforms will do any good for ME)... so I don't see the point.
But then time is wasted doing the DCT, which would presumably be done on 8x8 blocks.
__________________
"Bork Bork Bork"
Old 29th May 2003, 12:35   #8  |  Link
sysKin
Registered User
 
Join Date: Jun 2002
Location: Adelaide, Australia
Posts: 1,167
Quote:
Originally posted by SirDavidGuy
The more into the frequency domain you rotate it, the less noticeable (in the spatial domain) the quantization step will be. To a much smaller extent, sort of like the difference between quantization in the frequency (DCT) domain and the spatial domain (direct quantization of the data).
OK, I get it now, thanks.
Do you happen to know how much I can lose when changing the 8x8 transform (from the DCT to this integer one)?
Quote:
Then the "little" difference shouldn't mean much, should it?
Well, you can't watch the video, so I don't know if that counts as a little difference. DCT errors propagate. It wasn't a problem in MPEG-1/2 because the propagation was short. In XviD, even with many B-frames, it just looks horrible...

Radek
__________________
Visit #xvid or #x264 at irc.freenode.net
Old 29th May 2003, 15:36   #9  |  Link
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
The integer transform is quite a bit different from the DCT, so I wouldn't be surprised by the 7% number... but being different from the DCT doesn't say much about the compression. There are integer transforms which approximate the DCT very well, but they take a lot more multipliers and adders.
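
The size of the "difference from the DCT" depends on which integer matrix is compared and how the difference is measured. A minimal sketch, assuming the comparison is made between unit-length basis rows: it checks the 13/17/7 matrix quoted later in this thread and, as an extra assumption not taken from the posts, the 1/2/1 core matrix of the final H.264 standard. The helper names are made up for illustration.

```python
import math

def dct_matrix(n=4):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows

def normalize_rows(matrix):
    """Scale every row to unit length so it is comparable with the DCT rows."""
    out = []
    for row in matrix:
        norm = math.sqrt(sum(v * v for v in row))
        out.append([v / norm for v in row])
    return out

def max_abs_difference(a, b):
    """Largest absolute element-wise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

# 13/17/7 matrix as quoted in this thread.
DRAFT = [[13, 13, 13, 13], [17, 7, -7, -17], [13, -13, -13, 13], [7, -17, 17, -7]]
# Core transform matrix of the final H.264 standard (assumption: not from the posts).
FINAL = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]

dct = dct_matrix(4)
for name, matrix in (("13/17/7", DRAFT), ("1/2/1", FINAL)):
    diff = max_abs_difference(normalize_rows(matrix), dct)
    print(f"{name}: largest basis-coefficient difference = {diff:.4f}")
```

Measured this way the 13/17/7 rows sit within about 0.002 of the DCT rows, while the 1/2/1 rows differ by roughly 0.05, so both a "1%" and a "7-8%" figure are plausible depending on the matrix and the metric. As the post says, this measures closeness to the DCT only, not coding loss.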

Does anyone have any numbers on coding loss, measured either at constant PSNR or at constant size, for non-intra frames? (I'm not interested in some measure of decorrelation/energy compaction/whatever; only cold hard bits after entropy coding count.) I seriously doubt it is significant.

Transform-domain MC doesn't seem to make much sense with such small blocks and long interpolation filters. As for ME... encoder complexity was always a secondary concern.

Personally I still think the most appropriate way of coding the DFD is VQ.

Last edited by MfA; 29th May 2003 at 16:20.
Old 29th May 2003, 19:15   #10  |  Link
SirDavidGuy
Flying Attack Donkey
 
Join Date: Mar 2002
Posts: 159
Why does Q-Pel amplify the DCT error? I don't know much about it, technically. Is the interpolation technique defined in the standard, the stream, or chosen at encode time?
__________________
"Bork Bork Bork"
Old 29th May 2003, 19:24   #11  |  Link
Tommy Carrot
Registered User
 
Join Date: Mar 2002
Posts: 863
Quote:
Originally posted by SirDavidGuy
Why does Q-Pel amplify the DCT error? I don't know much about it, technically. Is the interpolation technique defined in the standard, the stream, or chosen at encode time?
I don't know, but every mpeg4 codec has this issue, so it's the standard's fault. Even halfpel does this, just to a lesser extent. So the integer transforms definitely help here.
Old 1st June 2003, 02:38   #12  |  Link
bobololo
Registered User
 
Join Date: May 2003
Posts: 328
Quote:
Originally posted by Tommy Carrot
I don't know, but every mpeg4 codec has this issue, so it's the standard's fault. Even halfpel does this, just to a lesser extent. So the integer transforms definitely help here.
The main qpel issue with different codecs is primarily related to the definition of the qpel interpolation specified in the ISO/IEC standard. The first specification was very confusing and was completely updated recently (in a draft corrigendum from WG11, not publicly published yet). The result is that different codecs have their own implementations that follow the standard more or less closely and are not 100% interoperable.

Concerning the DCT mismatch issue, it is more visible in MPEG-4 than in MPEG-1/2 because the usual GOP is much larger (~300 frames is common) and because most encoders don't intra-code macroblocks after 132 consecutive predictive codings, as the standard requires in order to limit the accumulation of DCT mismatch errors.
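
The forced refresh described above can be sketched as a per-macroblock counter. This is only an illustration under assumptions: `choose_modes` and its `wants_inter` input are hypothetical names, the limit of 132 is the figure quoted in this post, and a real encoder would make the decision inside its mode-decision loop.

```python
FORCED_INTRA_LIMIT = 132  # figure quoted in the post, not checked against the standard here

def choose_modes(wants_inter, limit=FORCED_INTRA_LIMIT):
    """Return 'I' or 'P' per frame for a single macroblock position.

    wants_inter: iterable of bools, True when the (hypothetical) mode
    decision would prefer predictive coding for that frame.
    """
    modes = []
    predicted_run = 0  # consecutive predictive codings since the last intra
    for prefer_inter in wants_inter:
        if prefer_inter and predicted_run < limit:
            modes.append("P")
            predicted_run += 1
        else:
            # Intra coding does not depend on the previous reconstruction,
            # so any accumulated mismatch for this macroblock is discarded.
            modes.append("I")
            predicted_run = 0
    return modes

modes = choose_modes([True] * 300)
print("first forced intra at frame index", modes.index("I"))
print("forced intra count in 300 frames:", modes.count("I"))
```

With a 300-frame GOP and an encoder that always prefers prediction, the counter forces an intra macroblock twice, which bounds how long a mismatch can accumulate.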

This mismatch problem is probably one of the motivations that led to the choice of an integer transform.

-- bobololo.
Old 1st June 2003, 08:20   #13  |  Link
shlezman
Registered User
 
Join Date: Jan 2003
Posts: 69
Re: Spectral Sub-band Separation in H.26L...

Quote:
Originally posted by SirDavidGuy
1. It is clearly worse than the DCT
It's not as good as the "real" DCT, but it's not as bad as you make it sound. The integer transform is an approximation to the 4x4 DCT. It has three clear advantages which compensate for the fact that it isn't as good as the DCT:
1. It's easy to implement and takes a lot of computational load off the CPU.
2. It's possible to implement it on 16-bit processors (which makes the semiconductor industry very happy, I guess). And of course it fixes the compliance problem of the "old" DCT.
3. I guess someone will collect a bunch of money from the royalties (politics is an issue as well).
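
To illustrate the first two points, here is a minimal sketch of a 4x4 forward integer transform built only from additions, subtractions and one-bit shifts. It assumes the 1/2/1 core matrix of the final H.264 standard rather than the 13/17/7 matrix discussed elsewhere in this thread, and it leaves out the scaling that the standard folds into quantization. The function names are made up for illustration.

```python
CORE = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]]

def butterfly(x0, x1, x2, x3):
    """1-D transform of four samples using adds, subtracts and shifts only."""
    s0, s1 = x0 + x3, x1 + x2
    d0, d1 = x0 - x3, x1 - x2
    return s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1)

def forward_4x4(block):
    """2-D transform: butterfly over the rows, then over the columns."""
    rows = [butterfly(*row) for row in block]        # horizontal pass
    cols = [butterfly(*col) for col in zip(*rows)]   # vertical pass
    return [list(r) for r in zip(*cols)]             # transpose back to row order

def reference_4x4(block):
    """Plain matrix product CORE * block * CORE^T, used only as a check."""
    tmp = [[sum(CORE[i][k] * block[k][j] for k in range(4)) for j in range(4)]
           for i in range(4)]
    return [[sum(tmp[i][k] * CORE[j][k] for k in range(4)) for j in range(4)]
            for i in range(4)]

block = [[5, 11, 8, 10], [9, 8, 4, 12], [1, 10, 11, 4], [19, 6, 15, 7]]
print(forward_4x4(block) == reference_4x4(block))
```

The largest absolute row sum of `CORE` is 6, so for residuals in the range -255..255 no output can exceed 255 * 6 * 6 = 9180 in magnitude, which fits in a signed 16-bit word.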
Quote:
Originally posted by SirDavidGuy
2. It doesn't allow DCT-domain sub-pixel motion estimation
There are many advantages to the method you describe, but it's not a consideration when forming a standard.
The standard is supposed to define the best methods to code the data and construct a compliant stream. The algorithms to use are up to you.
Old 1st June 2003, 14:53   #14  |  Link
bergi
Registered User
 
Join Date: Jun 2002
Posts: 76
In this zip file there is also a program (not really good: it has some calculation errors, and higher multipliers mean less rounding error...) that can calculate a matrix for an integer DCT. I've made some tests, and the 4x4 matrix of H.26L is the best I've seen; no other matrix is more correct than this one. OK, there is some error if you want a lossless forward-inverse transform. I don't know much about float transforms, but I think they have equal or more rounding error than this integer DCT transform.
Old 1st June 2003, 15:32   #15  |  Link
shlezman
Registered User
 
Join Date: Jan 2003
Posts: 69
@bergi
I took a brief look at your tests, and I think you might be wrong in constructing the matrices. You see, the inverse transform is not the transpose of the forward transform but the inverse matrix. Anyway, the math isn't that hard. I calculated the inverse transform for:

13 13 13 13
17 7 -7 -17
13 -13 -13 13
7 -17 17 -7

and it's :

20 26 20 11
20 11 -20 -26
20 -11 -20 26
20 -26 20 -11

with 10 bits accuracy.
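
A minimal sketch of how these numbers can be reproduced, assuming "10 bits accuracy" means scaling the exact inverse by 2^10 and rounding to the nearest integer. For this particular forward matrix the rows are mutually orthogonal and all have squared length 676, so the exact inverse is the transpose divided by 676.

```python
from fractions import Fraction

FORWARD = [[13, 13, 13, 13],
           [17, 7, -7, -17],
           [13, -13, -13, 13],
           [7, -17, 17, -7]]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

# FORWARD * FORWARD^T is 676 times the identity, so inverse = transpose / 676.
gram = matmul(FORWARD, transpose(FORWARD))
norm = gram[0][0]

SCALE = 1 << 10  # assumed meaning of "10 bits accuracy"
inverse_q10 = [[round(Fraction(v * SCALE, norm)) for v in row]
               for row in transpose(FORWARD)]

for row in inverse_q10:
    print(row)
```

Under that assumption the rounded result matches the matrix given above, and it also shows that here the inverse and the scaled transpose are the same thing.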

You might want to correct the code and run it with the right matrices. You can use MATLAB to calculate the inverse transforms.
Old 1st June 2003, 15:46   #16  |  Link
bergi
Registered User
 
Join Date: Jun 2002
Posts: 76
@shlezman
I don't think so. The 4x4 matrix calculated with my program is the same as the H.26L matrix. And matrices calculated with my program were used in my other program (for transforming a BMP) included in the zip file, and I didn't have any (big) noticeable errors.
Old 1st June 2003, 16:48   #17  |  Link
shlezman
Registered User
 
Join Date: Jan 2003
Posts: 69
The integer transform of H.264 is a rough approximation to the 4x4 DCT.
If the matrix that you use is a much better approximation to that transform, then the result of the forward transform followed by the inverse transform MUST be 99.9% error free.
Test that with no quantization at all (quantization of 1), then calculate the PSNR:
sq2 = sum of the squared error (pixelwise)
psnr = 10*log10(255*255*image_width*image_height/sq2)

If it's less than 60 dB then the calculation is wrong; otherwise your calculations are correct and I'm wrong.
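
The PSNR check above can be sketched as follows, assuming 8-bit images stored as equally sized lists of rows; the function name `psnr` and the tiny test image are made up for illustration.

```python
import math

def psnr(original, reconstructed, peak=255):
    """PSNR in dB between two equally sized 2-D lists of pixel values."""
    height, width = len(original), len(original[0])
    # Sum of squared pixel-wise errors.
    sq2 = sum((a - b) ** 2
              for row_a, row_b in zip(original, reconstructed)
              for a, b in zip(row_a, row_b))
    if sq2 == 0:
        return math.inf  # identical images: no error at all
    return 10 * math.log10(peak * peak * width * height / sq2)

original = [[10, 20], [30, 40]]
reconstructed = [[10, 20], [30, 41]]  # a single pixel is off by one
print(f"{psnr(original, reconstructed):.2f} dB")
```

Running the forward and inverse transform without quantization and feeding both images to this function gives the number to compare against the 60 dB threshold.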
Old 3rd June 2003, 21:27   #18  |  Link
bergi
Registered User
 
Join Date: Jun 2002
Posts: 76
@shlezman
OK, you are right, but the difference wasn't so high. I don't remember the numbers for sure, I think 77,??? for both; your matrix was a little bit better. Also, I think I've found the error: I calculated the inverse transform for the float forward transform, and your matrix is the inverse transform for the (rounded) integer transform, right?
I'm going to update the program in the next few days, but first I want to change some things:
- clean up the code (the code should be easy to understand for everybody)
- recalculate all inverse matrices, because the 8x8, 16x16 and 32x32 are calculated the same way (can anybody give me a source on how to calculate the inverse transform this way?)
- perhaps add some wavelet block matrix (test only; I don't think wavelets will look good on small blocks)
Old 4th June 2003, 09:10   #19  |  Link
shlezman
Registered User
 
Join Date: Jan 2003
Posts: 69
The easiest way is to use MATLAB or Octave to invert the matrices and quantize them. It can be VERY useful for testing all these transforms, wavelets included.