Apply codec logic to blocks or whole image?
koliva
13th August 2009, 10:32
Hello,
I am trying to write my own codec in MATLAB. I know it will never be as fast as current H.264 implementations; this is purely for research. I have a question about something I just ran into.
After computing the motion vectors, I create a residual image for the whole P-frame. I then take the DCT of this image and quantize it. In other words, I run every stage (DCT, quantization, entropy coding) on the whole image rather than block by block. I suspect this is why my Huffman coding takes so long during both encoding and decoding. Is it correct to run these stages on the whole image instead of block by block? How is this done in H.264? Thanks.
Dust Signs
13th August 2009, 14:45
It seems to me that you are mixing up H.264 with MPEG-2 and/or JPEG. The latter two use Huffman entropy coding; H.264 does not (it uses CAVLC or CABAC). Both MPEG-2 and JPEG use the DCT; H.264 does not (it uses an integer transform that approximates the DCT, plus an additional Hadamard transform on some residuals). In all of these standards the transform is applied to blocks, not to the whole image.
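To illustrate what "applied to blocks" means, here is a minimal Python sketch. It assumes the frame dimensions are multiples of 4 and uses the 4x4 forward core transform matrix as it is usually presented for H.264. The helper names (`core_transform_4x4`, `split_blocks`, `transform_image`) are made up for this example, and the scaling that a real encoder folds into quantization is left out, so this is not a conforming implementation.

```python
# 4x4 forward core transform matrix as usually given for H.264.
# Assumption: post-scaling (normally merged into quantization) is omitted.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Plain integer matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def core_transform_4x4(block):
    """Compute Y = CF * X * CF^T for a single 4x4 residual block."""
    return matmul(matmul(CF, block), transpose(CF))


def split_blocks(image, n=4):
    """Yield (row, col, block) for every n x n tile of the image.

    Hypothetical helper: assumes height and width are multiples of n.
    """
    for r in range(0, len(image), n):
        for c in range(0, len(image[0]), n):
            yield r, c, [row[c:c + n] for row in image[r:r + n]]


def transform_image(residual):
    """Transform each 4x4 block on its own, keyed by its top-left corner."""
    return {(r, c): core_transform_4x4(b) for r, c, b in split_blocks(residual)}


# Usage: a flat 8x8 residual splits into four blocks, each with only a DC term.
flat_residual = [[3] * 8 for _ in range(8)]
coeffs = transform_image(flat_residual)
```

Because every block is handled independently, the cost per block is constant and the total work grows linearly with the number of blocks.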
I suggest you dig into the theory first and read a book about H.264 or one of the technical papers that cover it (this is a good one, for example: http://www.ebu.ch/en/technical/trev/trev_293-schaefer.pdf).
Even if you plan to write a "codec" for educational purposes, there are more suitable tools for doing so. Altogether, I would suggest you first read the theory, then the important parts of the H.264 standard itself (ISO/IEC 14496-10), and then maybe start looking at some encoder and/or decoder implementations, like x264 or ffmpeg. This should give you a more thorough understanding and keep you from trying to write an encoder from scratch in MATLAB...
Dust Signs
Dark Shikari
13th August 2009, 20:23
The best choice of DCT size is one that roughly matches the feature size of the image. Since the features are quite a bit smaller than the whole image, a full-frame DCT would be a rather bad idea for decorrelating the image data.
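The point about feature size can be checked with a small experiment. The following is a rough Python sketch, not a codec: it uses a naive orthonormal floating-point DCT-II, a made-up 16x16 test image with one flat 4x4 feature (deliberately aligned to the block grid to keep the example simple), and an arbitrary threshold of 1.0 for calling a coefficient "significant". All function names are hypothetical.

```python
import math


def dct_1d(v):
    """Naive orthonormal DCT-II of a sequence."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2D DCT: transform the rows, then the columns."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def count_significant(coeffs, threshold=1.0):
    """Count coefficients whose magnitude exceeds the (arbitrary) threshold."""
    return sum(1 for row in coeffs for c in row if abs(c) > threshold)


def blockwise_significant(image, n, threshold=1.0):
    """Apply the DCT to each n x n block and sum the significant coefficients."""
    total = 0
    for r in range(0, len(image), n):
        for c in range(0, len(image[0]), n):
            block = [row[c:c + n] for row in image[r:r + n]]
            total += count_significant(dct_2d(block), threshold)
    return total


def make_test_image(size=16):
    """Black image with one flat 4x4 feature covering exactly one block."""
    img = [[0.0] * size for _ in range(size)]
    for r in range(4, 8):
        for c in range(8, 12):
            img[r][c] = 100.0
    return img


image = make_test_image()
full_count = count_significant(dct_2d(image))   # one 16x16 transform
block_count = blockwise_significant(image, 4)   # sixteen 4x4 transforms
print("full-frame:", full_count, "block-wise:", block_count)
```

With the block size matched to the feature, the feature collapses into a single DC coefficient, whereas the full-frame transform spreads the same feature over many coefficients that all have to be quantized and entropy coded.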