H.264 for Image coding

sharp81 · 24th May 2010, 14:06

Hello everybody,

While searching for information on H.264, specifically the intra prediction mode, I came across this informative website. I found it very interesting to figure out how the whole process is done. I am trying to learn and maybe implement a part of the H.264 standard. I am interested only in intra prediction, i.e. I-frames, and I want to use it for still images rather than videos. The idea is to check the performance of the various predictor modes and select the best possible predictor. After that I want to compare blockwise and pointwise compression. Since I am a beginner in the field of digital image processing, I want to first try a very simple implementation. I was looking at the block prediction algorithm but I don't quite understand how it works. Could you give me some ideas on how to proceed with a complete system design, and maybe some coding help? I am considering starting with only a 4x4 block implementation.

Also, every paper mentions using reference pixels above and to the left to predict the block. But how do I select the reference pixels?

Thanks in advance.
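On the reference-pixel question, here is a minimal sketch of how the neighbours of a 4x4 block could be gathered. It assumes blocks are coded in plain raster order (real H.264 orders 4x4 blocks inside macroblocks, which changes when the above-right pixels are available), that the image width is a multiple of the block size, and that `recon` holds the already reconstructed pixels rather than the original ones, so a decoder could form the same prediction. The function name and return layout are hypothetical.

```python
def gather_reference_pixels(recon, x, y, size=4):
    """Collect the neighbours of the block whose top-left corner is (x, y).

    recon: list of rows of already reconstructed pixel values.
    Returns (top, left, corner). An entry is None when those neighbours
    lie outside the image.
    """
    top = left = corner = None
    if y > 0:
        row = recon[y - 1]
        above = row[x:x + size]
        above_right = row[x + size:x + 2 * size]
        if len(above_right) < size:
            # At the right image border the last above pixel is repeated.
            above_right = [above[-1]] * size
        top = above + above_right
    if x > 0:
        # One pixel to the left of each row of the block.
        left = [recon[y + j][x - 1] for j in range(size)]
    if x > 0 and y > 0:
        corner = recon[y - 1][x - 1]
    return top, left, corner


# Small demo on an 8x8 image whose pixel value encodes its position.
demo = [[r * 8 + c for c in range(8)] for r in range(8)]
top, left, corner = gather_reference_pixels(demo, 4, 4)
```

Blocks on the top row or left column get `None` for the missing neighbours, so any mode that needs them has to be skipped there.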

sauvage78 · 24th May 2010, 14:26

Technically I can't help but your post reminds me of this:

http://www.bilsen.com/index.htm?http...ilsen.com/aic/

If you don't already know about this, maybe you'll find some interesting info there.

jmartinr · 24th May 2010, 14:48

Have you read this: http://forum.doom9.org/showthread.ph...97#post1299497?

sharp81 · 24th May 2010, 17:39

Yeah, I have read that forum post. But I think just using the intra prediction from H.264 should give better results than JPEG, which I don't know yet.

So I want to check whether the PSNR and entropy are better.

This page on Advanced Image Coding is very nice and interesting. Thanks for the link. But I would be glad if somebody could explain how I could go about the implementation.

Thanks.
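For the PSNR and entropy check mentioned above, a minimal sketch in plain Python could look like this. It assumes 8-bit grayscale images stored as lists of rows, and it measures zeroth-order entropy only (each symbol treated independently), which is a rough proxy for what a real entropy coder achieves. The function names are hypothetical.

```python
import math
from collections import Counter


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized images (lists of rows)."""
    squared_error = 0
    count = 0
    for row_o, row_d in zip(original, decoded):
        for a, b in zip(row_o, row_d):
            squared_error += (a - b) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical images
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)


def entropy_bits(symbols):
    """Zeroth-order Shannon entropy in bits per symbol of a list of values."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(n / total * math.log2(n / total) for n in counts.values())
```

Running `entropy_bits` on the flattened prediction residuals of each mode gives a quick way to compare how compressible the different predictors leave the data.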

Dark Shikari · 24th May 2010, 17:53

Quote:

Originally Posted by sharp81

Yeah, I have read that forum post. But I think just using the intra prediction from H.264 should give better results than JPEG, which I don't know yet.

Yes you do. It does.

Feel free to use x264 to confirm if you want.

sharp81 · 24th May 2010, 18:23

Hi Dark Shikari, is there any documentation on the source code of x264? It's very hard to understand a complete codebase and algorithm without documentation.

Dark Shikari · 24th May 2010, 18:27

Quote:

Originally Posted by sharp81

Hi Dark Shikari, is there any documentation on the source code of x264? It's very hard to understand a complete codebase and algorithm without documentation.

I didn't say to use the source. Just x264 with --tune psnr --preset placebo --frames 1 --crf X, for some value of X, is sufficient to generate your x264-compressed, psnr-optimized images.

For psy optimization, replace psnr with "stillimage".
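As a sketch of how the command line above could be scripted for several values of X, the helper below only assembles the argument list. It uses the flags quoted in the post plus x264's `-o` output option. The helper names and the file names in the usage note are hypothetical, and it assumes the input is in a format the local x264 build can read (for example a one-frame `.y4m` file).

```python
import subprocess


def build_x264_command(input_path, output_path, crf, tune="psnr"):
    """Assemble the x264 argument list for a single-frame encode."""
    return [
        "x264",
        "--tune", tune,          # "psnr", or "stillimage" for psy optimization
        "--preset", "placebo",
        "--frames", "1",
        "--crf", str(crf),
        "-o", output_path,
        input_path,
    ]


def run_x264(command):
    """Run the encoder; raises CalledProcessError if x264 reports failure."""
    subprocess.run(command, check=True)
```

For example, `run_x264(build_x264_command("image.y4m", "image.264", 20))` would encode one frame at CRF 20, assuming x264 is on the PATH.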

sharp81 · 24th May 2010, 18:35

Is it possible to get some explanation of the source code, so that I know how it works behind the scenes?

Dark Shikari · 24th May 2010, 18:45

Quote:

Originally Posted by sharp81

Is it possible to get some explanation of the source code, so that I know how it works behind the scenes?

Hop on the x264 IRC on Freenode and ask questions.

sharp81 · 25th May 2010, 09:03

The bilsen project looks interesting. Unfortunately I don't have Delphi to compile it. Is there anything similar available written in C/C++?

cyberbeing · 25th May 2010, 09:34

Quote:

Originally Posted by Dark Shikari

Quote:

Originally Posted by sharp81

Yeah, I have read that forum post. But I think just using the intra prediction from H.264 should give better results than JPEG, which I don't know yet.

Yes you do. It does.

Feel free to use x264 to confirm if you want.

http://forum.doom9.org/showthread.php?t=142910
Here is a comparison of JPEG, JPEG2000, HDPhoto, and x264 that was done back in 2008. x264 was better than JPEG back then, so it should be even better today.

sharp81 · 31st May 2010, 00:49

I guessed that it must definitely be better than JPEG, because the prediction algorithm is definitely better. Here the DCT is also applied to the difference picture rather than to the picture itself, as in JPEG.

What I found out about intra coding is that the predicted pixel values are derived from the reference pixels by interpolation and extrapolation, taking a weighted average. But I still am not able to follow the math here. Could someone explain the whole math?

Thanks
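To illustrate the point about transforming the difference picture, here is a minimal sketch of the residual step followed by the 4x4 integer core transform that H.264 uses in place of a floating-point DCT. It assumes the prediction block is already available and leaves out the scaling, which the standard folds into quantization. The function names are hypothetical.

```python
# Core matrix of the H.264 4x4 forward integer transform.
CORE = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def matmul(a, b):
    """Product of two 4x4 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def residual(block, prediction):
    """Difference between the source block and its prediction."""
    return [[block[i][j] - prediction[i][j] for j in range(4)]
            for i in range(4)]


def forward_core_transform(res):
    """Y = C * X * C^T, without the scaling applied during quantization."""
    return matmul(matmul(CORE, res), transpose(CORE))


# A block that differs from its prediction by a constant 1 everywhere.
block = [[130] * 4 for _ in range(4)]
prediction = [[129] * 4 for _ in range(4)]
coeffs = forward_core_transform(residual(block, prediction))
```

When the prediction is good the residual is small and flat, so most coefficients are zero or near zero, which is where the gain over transforming the picture directly comes from.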

sharp81 · 18th June 2010, 23:45

Does JPEG2000 use the same prediction method as H.264, differing only in that it uses a DWT instead of a DCT? Is that true?

Dark Shikari · 18th June 2010, 23:46

Quote:

Originally Posted by sharp81

Does JPEG2000 use the same prediction method as H.264, differing only in that it uses a DWT instead of a DCT? Is that true?

No; the prediction methods in H.264 are mutually incompatible with large wavelet transforms.

sharp81 · 19th June 2010, 22:01

From what I understand, in JPEG there is no prediction; instead the DCT is applied directly to the picture, which is divided into blocks. How exactly does JPEG2000 work, and how is the H.264 intra mode better than JPEG2000? Is there reference source code available?

foxyshadis · 19th June 2010, 22:35

H.264 JM Reference Source Code
JPEG2000 Reference Source Code

JPEG2000 has no block or pixel prediction. It's a direct wavelet transform of the source, same as JPEG is a direct DCT transform. The primary gain at low bitrates comes from the LL band of the wavelet transform being recursively transformed, and selectively throwing away higher resolution information.
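The recursive LL-band idea can be sketched with a Haar wavelet. This is a deliberate simplification: JPEG2000 itself uses longer 5/3 or 9/7 wavelet filters, and Haar stands in here only because it fits in a few lines. The sketch assumes the image dimensions are divisible by two at every level, and the function names are hypothetical.

```python
def haar_level(image):
    """One 2-D Haar level on a list of rows with even width and height.

    Returns (ll, horizontal, vertical, diagonal), each a quarter-size band.
    """
    ll, horiz, vert, diag = [], [], [], []
    for r in range(0, len(image), 2):
        ll_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 4)  # local average (LL band)
            h_row.append((a - b + d - e) / 4)   # left/right difference
            v_row.append((a + b - d - e) / 4)   # top/bottom difference
            d_row.append((a - b - d + e) / 4)   # diagonal difference
        ll.append(ll_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return ll, horiz, vert, diag


def haar_pyramid(image, levels):
    """Repeatedly transform the LL band, collecting the detail bands."""
    details = []
    ll = image
    for _ in range(levels):
        ll, horiz, vert, diag = haar_level(ll)
        details.append((horiz, vert, diag))
    return ll, details
```

Dropping or coarsely quantizing the early entries of `details` discards the highest-resolution information first, while the small final `ll` band still carries a low-resolution version of the whole picture.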

H.264 can predict a block based on DC, up, left, up-left, up-right, and even down-left. Here's a very good H.264 intra primer (pdf) you could look at.
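A minimal sketch of four of the 4x4 luma prediction modes, plus a simple way to pick one, is given below. It assumes all neighbours are available, with `top` holding eight pixels (above and above-right) and `left` holding four. Choosing by sum of absolute differences (SAD) is a simplification: real encoders such as x264 also weigh the bits needed to signal the mode. The function names are hypothetical.

```python
def predict_vertical(top, left):
    # Every row copies the four pixels directly above the block.
    return [top[:4] for _ in range(4)]


def predict_horizontal(top, left):
    # Every row is filled with the pixel to its left.
    return [[left[y]] * 4 for y in range(4)]


def predict_dc(top, left):
    # Rounded mean of the four above and four left pixels.
    dc = (sum(top[:4]) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


def predict_diag_down_left(top, left):
    # Weighted (1, 2, 1) average along the 45 degree down-left direction.
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):
        for x in range(4):
            i = x + y
            if i == 6:
                pred[y][x] = (top[6] + 3 * top[7] + 2) >> 2
            else:
                pred[y][x] = (top[i] + 2 * top[i + 1] + top[i + 2] + 2) >> 2
    return pred


MODES = {
    "vertical": predict_vertical,
    "horizontal": predict_horizontal,
    "dc": predict_dc,
    "diag_down_left": predict_diag_down_left,
}


def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(block[y][x] - pred[y][x])
               for y in range(4) for x in range(4))


def best_mode(block, top, left):
    """Return (mode name, SAD) of the mode with the smallest SAD."""
    costs = {name: sad(block, fn(top, left)) for name, fn in MODES.items()}
    name = min(costs, key=costs.get)
    return name, costs[name]
```

The `(a + 2b + c + 2) >> 2` expression is the kind of weighted average the directional modes use, and the remaining directional modes follow the same pattern with different neighbour indices.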

sharp81 · 20th June 2010, 11:14

Thanks a lot for the info. Can you tell me which C files in the JM software would be worth looking at just for intra prediction? What's the major difference between this JM software and x264?

Underground78 · 20th June 2010, 13:43

Quote:

Originally Posted by sharp81

What's the major difference between this JM software and x264?

JM is more of a proof of concept than an encoder you can use for real daily encoding. In fact, it is slow as hell, and x264 probably has some black-magic quality optimizations JM does not have. But JM source code is probably easier to understand.

Dark Shikari · 20th June 2010, 13:46

Quote:

Originally Posted by Underground78

But JM source code is probably easier to understand.

I doubt it. It's practically commentless and has no IRC channel where one can easily talk to the people who wrote it.

foxyshadis · 20th June 2010, 15:28

I found ffmpeg clear and reasonably concise when I was implementing H.264 parsing in Python, trying to minimize outside dependencies. I just used x264 directly for encoding; there is no way I could compete with that.
