Old 24th May 2010, 14:06   #1  |  Link
sharp81
Registered User
 
Join Date: May 2010
Posts: 11
H.264 for Image coding

Hello everybody,

While searching for information on H.264 intra prediction, I came across this informative website, and I found it very interesting to see how the whole process works. I am trying to learn, and maybe implement, a part of the H.264 standard. I am interested only in intra prediction, i.e. I-frames, and I want to use it for still images rather than video. The idea is to check the performance of the various prediction modes and select the best predictor. After that I want to compare blockwise and pointwise compression. Since I am a beginner in digital image processing, I want to start with a very simple implementation. I looked at the block prediction algorithm, but I don't quite understand how it works. Could you give me some ideas on how to proceed with a complete system design, and maybe some coding help? I am considering starting with a 4x4 block implementation only.

Also, every paper mentions using reference pixels above and to the left to predict the block. But how do I select the reference pixels?
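
For what it's worth, here is a minimal Python sketch of how the reference pixels are usually picked: they are the row directly above and the column directly to the left of the block, taken from the already reconstructed picture (not the original), so that a decoder can form the same prediction. The function name `reference_pixels` and the list-of-rows image layout are assumptions made for illustration, and the extra above-right pixels that some 4x4 modes use are left out.

```python
def reference_pixels(recon, bx, by, n=4):
    """Return (top, left, corner) neighbours of the n x n block at column bx, row by.

    recon is a list of rows holding already reconstructed pixel values.
    A neighbour is None when it would lie outside the picture.
    """
    # Row of n pixels directly above the block.
    top = recon[by - 1][bx:bx + n] if by > 0 else None
    # Column of n pixels directly to the left of the block.
    left = [recon[by + i][bx - 1] for i in range(n)] if bx > 0 else None
    # Single pixel diagonally above-left of the block.
    corner = recon[by - 1][bx - 1] if bx > 0 and by > 0 else None
    return top, left, corner


# Hypothetical 8x8 test picture where pixel (row r, column c) has value 10*r + c.
recon = [[10 * r + c for c in range(8)] for r in range(8)]
top, left, corner = reference_pixels(recon, 4, 4)
```

Blocks on the first row or first column simply have no top or left neighbours, which is why the sketch returns None there; a predictor has to fall back to whichever neighbours exist.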

Thanks in advance.
Old 24th May 2010, 14:26   #2  |  Link
sauvage78
Registered User
 
Join Date: Jan 2009
Posts: 5
Technically I can't help but your post reminds me of this:

http://www.bilsen.com/index.htm?http...ilsen.com/aic/

If you don't already know about this, maybe you'll find some interesting info there.
Old 24th May 2010, 14:48   #3  |  Link
jmartinr
Registered User
 
Join Date: Dec 2007
Location: Enschede, NL
Posts: 301
Have you read this: http://forum.doom9.org/showthread.ph...97#post1299497?
__________________
Roelofs Coaching
Old 24th May 2010, 17:39   #4  |  Link
sharp81
Registered User
 
Join Date: May 2010
Posts: 11
Yeah, I have read that forum post. But I think just using the intra prediction from H.264 should give better results than JPEG, though I don't know that yet.

So I want to check whether the PSNR and entropy are better.
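
A minimal sketch of the two measurements, assuming 8-bit pixels passed as flat lists and taking "entropy" to mean the zero-order Shannon entropy of the values to be coded. The function names `psnr` and `entropy_bits` are made up for this example.

```python
import math
from collections import Counter


def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized flat lists of pixel values."""
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    # Identical pictures have zero error, i.e. infinite PSNR.
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def entropy_bits(symbols):
    """Zero-order Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())
```

Higher PSNR means the decoded picture is closer to the original, while lower entropy of the residual or coefficients means fewer bits are needed in principle, so the two have to be compared together at matching quality or matching size.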

This page on Advanced Image Coding is very nice and interesting. Thanks for the link. But I would be glad if somebody could explain how I could go about the implementation.

Thanks.
Old 24th May 2010, 17:53   #5  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by sharp81 View Post
Yeah, I have read that forum post. But I think just using the intra prediction from H.264 should give better results than JPEG, though I don't know that yet.
Yes you do. It does.

Feel free to use x264 to confirm if you want
Old 24th May 2010, 18:23   #6  |  Link
sharp81
Registered User
 
Join Date: May 2010
Posts: 11
Hi Dark Shikari, is there any documentation for the x264 source code? It's very hard to understand a complete codebase and algorithm without documentation.
Old 24th May 2010, 18:27   #7  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by sharp81 View Post
Hi Dark Shikari, is there any documentation for the x264 source code? It's very hard to understand a complete codebase and algorithm without documentation.
I didn't say to use the source. Just x264 with --tune psnr --preset placebo --frames 1 --crf X, for some value of X, is sufficient to generate your x264-compressed, psnr-optimized images.

For psy optimization, replace psnr with "stillimage".
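
As a rough sketch, the command line above can be assembled in Python like this. The helper name `x264_still_command`, the file names, and the CRF value 23 are placeholders chosen for illustration; only the flags themselves come from the post, plus x264's `-o` output option.

```python
def x264_still_command(src, dst, crf, tune="psnr"):
    """Build the argument list for encoding a single frame with x264."""
    return ["x264", "--tune", tune, "--preset", "placebo",
            "--frames", "1", "--crf", str(crf), "-o", dst, src]


# Hypothetical file names; pass tune="stillimage" for the psy-optimized variant.
cmd = x264_still_command("image.y4m", "image.264", 23)
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would run the encode, assuming an x264 binary is on the PATH and the input is in a format that build can read.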
Old 24th May 2010, 18:35   #8  |  Link
sharp81
Registered User
 
Join Date: May 2010
Posts: 11
Is it possible to get some explanation of the source code, so that I know how it works behind the scenes?
Old 24th May 2010, 18:45   #9  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by sharp81 View Post
is it possible to obtain some explanation on the source code? So that I know how it is working behind the scenes.
Hop on the x264 IRC on Freenode and ask questions.
Old 25th May 2010, 09:03   #10  |  Link
sharp81
Registered User
 
Join Date: May 2010
Posts: 11
The bilsen project looks interesting. Unfortunately I don't have Delphi to compile it. Is there anything similar available written in C/C++?
Old 25th May 2010, 09:34   #11  |  Link
cyberbeing
Broadband Junkie
 
Join Date: Oct 2005
Posts: 1,859
Quote:
Originally Posted by Dark Shikari View Post
Quote:
Originally Posted by sharp81 View Post
Yeah, I have read that forum post. But I think just using the intra prediction from H.264 should give better results than JPEG, though I don't know that yet.
Yes you do. It does.

Feel free to use x264 to confirm if you want
http://forum.doom9.org/showthread.php?t=142910
Here is a comparison between JPEG, JPEG2000, HDPhoto, and x264 that was done back in 2008. x264 was better than JPEG back then, so it should be even better today.

Last edited by cyberbeing; 31st May 2010 at 07:14.
Old 31st May 2010, 00:49   #12  |  Link
sharp81
Registered User
 
Join Date: May 2010
Posts: 11
I guessed that it must definitely be better than JPEG, because the prediction algorithm is definitely better. Here the DCT is also done on the difference picture rather than on the picture itself, as in JPEG.

What I found out about intra coding is that the pixel values are derived from the reference pixels by interpolation and extrapolation, taking a weighted average. But I still don't get the math here. Could someone explain the whole math?
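
To make the weighted average concrete, here is a sketch of one directional 4x4 mode, diagonal down-left, as I understand it from the standard: each predicted sample is a rounded (1, 2, 1)/4 average of three neighbouring pixels from the row above and above-right of the block. The function name `predict_diag_down_left` is made up for this example, and the sketch assumes all eight upper pixels are available.

```python
def predict_diag_down_left(top8):
    """4x4 diagonal-down-left prediction from the 8 pixels above and above-right."""
    p = list(top8)
    pred = [[0] * 4 for _ in range(4)]
    for y in range(4):
        for x in range(4):
            i = x + y  # samples on the same anti-diagonal share one value
            if i == 6:
                # Bottom-right sample: there is no pixel beyond p[7],
                # so p[7] is counted with weight 3 instead.
                pred[y][x] = (p[6] + 3 * p[7] + 2) >> 2
            else:
                # Rounded weighted average with weights 1, 2, 1.
                pred[y][x] = (p[i] + 2 * p[i + 1] + p[i + 2] + 2) >> 2
    return pred
```

The `+ 2` before the shift by 2 is just rounding to the nearest integer when dividing by 4. The other directional modes follow the same pattern with different neighbours and directions.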

Thanks
Old 18th June 2010, 23:45   #13  |  Link
sharp81
Registered User
 
Join Date: May 2010
Posts: 11
Does JPEG2000 use the same prediction method as H.264, with the only difference being that it uses a DWT instead of a DCT? Is that true?
Old 18th June 2010, 23:46   #14  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by sharp81 View Post
Does JPEG2000 use the same prediction method as H.264, with the only difference being that it uses a DWT instead of a DCT? Is that true?
No; the prediction methods in H.264 are incompatible with large wavelet transforms.
Old 19th June 2010, 22:01   #15  |  Link
sharp81
Registered User
 
Join Date: May 2010
Posts: 11
From what I understand, in JPEG there is no prediction; instead the DCT is applied directly to the picture, which is divided into blocks. How exactly does JPEG2000 work, and how is the H.264 intra mode better than JPEG2000? Is there reference source code available?
Old 19th June 2010, 22:35   #16  |  Link
foxyshadis
Angel of Night
 
Join Date: Nov 2004
Location: Tangled in the silks
Posts: 9,559
H.264 JM Reference Source Code
JPEG2000 Reference Source Code

JPEG2000 has no block or pixel prediction. It's a direct wavelet transform of the source, just as JPEG is a direct DCT of the source. The primary gain at low bitrates comes from the LL band of the wavelet transform being recursively transformed, and from selectively throwing away higher-resolution information.
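
To illustrate the recursion on the LL band, here is a toy sketch that uses a Haar wavelet. That is a deliberate simplification: JPEG2000 itself uses longer filters, not Haar, and the function names and the LH/HL naming convention below are assumptions. Image sides are assumed to be divisible by 2 at every level.

```python
def haar_2d(img):
    """One level of a 2D Haar transform; returns the (LL, LH, HL, HH) sub-bands."""
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for y in range(0, h, 2):
        for x in range(0, w, 2):
            a, b = img[y][x], img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            ll[y // 2][x // 2] = (a + b + c + d) / 4  # local average (half resolution)
            hl[y // 2][x // 2] = (a - b + c - d) / 4  # left/right difference
            lh[y // 2][x // 2] = (a + b - c - d) / 4  # top/bottom difference
            hh[y // 2][x // 2] = (a - b - c + d) / 4  # diagonal difference
    return ll, lh, hl, hh


def decompose(img, levels):
    """Transform the LL band again at every level; returns (final LL, detail bands)."""
    details = []
    ll = img
    for _ in range(levels):
        ll, lh, hl, hh = haar_2d(ll)
        details.append((lh, hl, hh))
    return ll, details
```

Dropping or coarsely quantizing the entries of `details`, starting with the first (finest) level, is the "throwing away higher-resolution information" part; the small final LL band carries the overall picture.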

H.264 can predict a block based on DC, up, left, up-left, up-right, and even down-left. Here's a very good H.264 intra primer (pdf) you could look at.
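
A rough sketch of the three simplest 4x4 modes (up, left, DC) and of picking the best one for a block. Choosing by sum of absolute differences (SAD) is a stand-in used here for simplicity, not necessarily what x264 or JM do, and the function names are invented. Both neighbours are assumed to be available.

```python
def predict_4x4(mode, top, left):
    """Return the 4x4 prediction for one of three simple modes."""
    if mode == "vertical":    # "up": copy the row above downwards
        return [list(top) for _ in range(4)]
    if mode == "horizontal":  # "left": copy the left column rightwards
        return [[left[y]] * 4 for y in range(4)]
    if mode == "dc":          # rounded mean of the 8 neighbouring pixels
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(mode)


def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(block[y][x] - pred[y][x]) for y in range(4) for x in range(4))


def best_mode(block, top, left):
    """Pick the mode with the smallest SAD; return (mode, residual)."""
    modes = ("vertical", "horizontal", "dc")
    mode = min(modes, key=lambda m: sad(block, predict_4x4(m, top, left)))
    pred = predict_4x4(mode, top, left)
    # The residual (block minus prediction) is what gets transformed and coded.
    residual = [[block[y][x] - pred[y][x] for x in range(4)] for y in range(4)]
    return mode, residual
```

For a block of vertical stripes that continue the row above, the vertical mode predicts it exactly and the residual is all zeros, which is where the gain over transforming the picture directly comes from.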
Old 20th June 2010, 11:14   #17  |  Link
sharp81
Registered User
 
Join Date: May 2010
Posts: 11
Thanks a lot for the info. Can you tell me which C files in the JM software I should look at if I only want to study the intra prediction? What's the major difference between the JM software and x264?
Old 20th June 2010, 13:43   #18  |  Link
Underground78
Registered User
 
Join Date: Oct 2004
Location: France
Posts: 567
Quote:
Originally Posted by sharp81 View Post
What's the major difference between this JM software and x264?
JM is more of a proof of concept than an encoder you can use for real daily encoding. In fact, it is slow as hell, and x264 probably has some black-magic quality optimizations that JM does not have. But the JM source code is probably easier to understand.
Old 20th June 2010, 13:46   #19  |  Link
Dark Shikari
x264 developer
 
Join Date: Sep 2005
Posts: 8,666
Quote:
Originally Posted by Underground78 View Post
But JM source code is probably easier to understand.
I doubt it. It's practically commentless and has no IRC channel where one can easily talk to the people who wrote it.
Old 20th June 2010, 15:28   #20  |  Link
foxyshadis
Angel of Night
 
Join Date: Nov 2004
Location: Tangled in the silks
Posts: 9,559
I found ffmpeg clear and reasonably concise when I was implementing H.264 parsing in Python, trying to minimize outside dependencies. I just used x264 directly for encoding; there's no way I could compete with that.