6th November 2016, 23:54   #2
LoRd_MuldeR
Software Developer
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
Quote:
Originally Posted by Ataril
For example, where can I read about the very beginning of the encoding process? Let's say I know that the coder gets data already in YUV format, but what is responsible for converting an input RGB file containing normal pixels? Is this functionality part of the codec, or how else is it implemented? (Or does it all depend on the codec?)
Color-space conversion, such as RGB to YUV (actually YCbCr), is not specific to H.264 at all. You will find a lot of information about it, even on Wikipedia:
https://en.wikipedia.org/wiki/YCbCr#YCbCr

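Note that compressed video formats usually operate in the YCbCr color space, because it separates "chrominance" (color) information from "luminance" (brightness) information, which helps compression. Human vision is less sensitive to fine color detail than to brightness detail, so the chroma planes can be stored at reduced resolution.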
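To make the conversion concrete, here is a minimal Python sketch of a per-pixel RGB to YCbCr conversion. It assumes 8-bit full-range values and the BT.601 coefficients in the form used by JFIF/JPEG; video encoders often use limited-range or BT.709 variants instead, so treat the constants as one possible choice. The function name `rgb_to_ycbcr` is made up for illustration.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 YCbCr (JFIF variant).

    Assumption: inputs are integers in 0..255. Other standards (limited
    range, BT.709) use different constants.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b                     # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b          # blue-difference chroma
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b          # red-difference chroma

    def clamp(v):
        # Round to the nearest integer and keep the result in 0..255.
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cb), clamp(cr)
```

Any gray pixel (r = g = b) ends up with both chroma values at the neutral midpoint 128, which shows how brightness and color are kept apart.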

See also:
https://en.wikipedia.org/wiki/Chroma_subsampling
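
As a rough illustration of chroma subsampling, the following Python sketch reduces one chroma plane to half resolution in both directions (the 4:2:0 layout) by averaging each 2x2 block. The averaging filter and the name `subsample_420` are assumptions for this example; real converters differ in filter choice and chroma sample positioning.

```python
def subsample_420(plane):
    """Halve a chroma plane horizontally and vertically.

    `plane` is a list of rows of integers; width and height are assumed
    to be even. Each output sample is the rounded mean of a 2x2 block.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]
```

The luma plane is left at full resolution, so a 4:2:0 frame carries half as many samples as the RGB original before any actual compression takes place.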

Quote:
Originally Posted by Ataril
As far as I understand, even the original ITU-T papers mainly describe the decoder's model in detail.
Video compression standards, such as H.264 or H.265, only describe what a valid bit-stream looks like and how a compliant decoder handles such a bit-stream.

But how to generate a valid bit-stream that, after decompression, resembles the original input video as closely as possible (under the given bitrate limitations) is left entirely undefined.

That exercise is left for the encoder developers to figure out.

Quote:
Originally Posted by Ataril
Another mystery for me is motion estimation. I have seen mentions of the Diamond, Hex, UMH and ESA methods for finding the best matching block in the frame (it is still part of the block-matching algorithm, right?), but I have never found a detailed, comprehensive explanation of the differences between them or of when one or the other should be used.
In order to find the "best" motion vectors, the encoder has to try them out and keep the result that performed best, e.g. in terms of smallest "error".
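
One common way to measure that "error" is the sum of absolute differences (SAD) between the current block and the block the candidate vector points to in the reference frame. The sketch below assumes frames are plain lists of rows of luma values and that the displaced block stays inside the reference frame; the function name and parameters are illustrative only. Real encoders may use other metrics and also weigh in the cost of coding the vector.

```python
def sad(cur, ref, bx, by, mvx, mvy, size=4):
    """Sum of absolute differences for one candidate motion vector.

    cur, ref : frames as lists of rows of integer luma samples
    bx, by   : top-left corner of the block in the current frame
    mvx, mvy : candidate motion vector (displacement into `ref`)
    size     : block width and height (assumed square)
    """
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x]
                         - ref[by + mvy + y][bx + mvx + x])
    return total
```

A cost of zero means the displaced reference block matches the current block exactly; the encoder keeps the candidate with the lowest cost.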

Now, there are way too many possibilities to try them all (in reasonable time). So, the encoder has to search the space of possible motion vectors in a "smart" way.

Simply put, what the encoder does in practice is to try only a few possibilities (according to some "search pattern") and then refine the most promising candidates.

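The names "diamond" (DIA), "hexagonal" (HEX), "uneven multi-hexagon" (UMH) and "exhaustive search" (ESA) refer to such search methods/patterns. Roughly speaking, they are listed here from fastest and least thorough to slowest and most thorough; exhaustive search tests every vector within the search range.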
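
To show the difference in spirit, here is a simplified Python sketch that contrasts an exhaustive search with a small-diamond search. It is not taken from any particular encoder: the SAD cost, the fixed start at vector (0, 0), the four-point diamond and all function names are assumptions made for this example. Real implementations add predicted start vectors, early termination, sub-pixel refinement and more.

```python
def sad(cur, ref, bx, by, mvx, mvy, size):
    """Sum of absolute differences for one candidate vector."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + mvy + y][bx + mvx + x])
        for y in range(size) for x in range(size)
    )


def in_bounds(ref, bx, by, mvx, mvy, size):
    """True if the displaced block lies fully inside the reference frame."""
    return (0 <= bx + mvx and bx + mvx + size <= len(ref[0])
            and 0 <= by + mvy and by + mvy + size <= len(ref))


def exhaustive_search(cur, ref, bx, by, size=4, search_range=3):
    """Try every vector in the square search window; keep the cheapest."""
    best_cost, best_mv = None, None
    for mvy in range(-search_range, search_range + 1):
        for mvx in range(-search_range, search_range + 1):
            if not in_bounds(ref, bx, by, mvx, mvy, size):
                continue
            cost = sad(cur, ref, bx, by, mvx, mvy, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (mvx, mvy)
    return best_mv, best_cost


def diamond_search(cur, ref, bx, by, size=4, search_range=3):
    """Greedy small-diamond search starting at (0, 0).

    Each step tests the four neighbours of the current best vector and
    moves to the cheapest one; it stops when no neighbour improves.
    """
    mv = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, size)
    while True:
        cx, cy = mv
        scored = [
            (sad(cur, ref, bx, by, nx, ny, size), (nx, ny))
            for nx, ny in ((cx, cy - 1), (cx - 1, cy), (cx + 1, cy), (cx, cy + 1))
            if abs(nx) <= search_range and abs(ny) <= search_range
            and in_bounds(ref, bx, by, nx, ny, size)
        ]
        if not scored:
            return mv, best_cost
        cost, candidate = min(scored)
        if cost >= best_cost:          # no neighbour is better: stop here
            return mv, best_cost
        mv, best_cost = candidate, cost
```

The exhaustive version always finds the cheapest vector in the window, at the price of (2 * search_range + 1) squared cost evaluations per block. The diamond version evaluates far fewer candidates, but it can stop at a local minimum, which is the trade-off the more elaborate patterns try to balance.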
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊

Last edited by LoRd_MuldeR; 7th November 2016 at 00:08.