Quote:
Originally Posted by Ataril
For example, where can I read about the very beginning of encoding process? Let's say I know that coder gets data already in YUV format, but what is responsible for converting input RGB file containing normal pixels? This functional is a part of codec or otherwise how it is implemented? (or it's all depends on codec?)
|
Color-space conversion, such as RGB to YUV (actually YCbCr), is not specific to H.264 at all. You will find a lot of information about it, even on Wikipedia:
https://en.wikipedia.org/wiki/YCbCr#YCbCr
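Note that compressed video formats usually operate in the YCbCr color space, because it separates "chrominance" (color) information from "luminance" (brightness) information. This helps compression: the eye is less sensitive to fine color detail than to brightness detail, so the color planes can be stored at reduced resolution.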
See also:
https://en.wikipedia.org/wiki/Chroma_subsampling
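To make that a bit more concrete, here is a minimal sketch in Python of what such a pre-processing step could look like. It assumes the full-range BT.601 coefficients (the JFIF variant from the Wikipedia article above) and 8-bit samples; real tools may use other matrices (e.g. BT.709) or limited range. The function names `rgb_to_ycbcr` and `subsample_420` are made up for illustration and do not come from any particular codec or library.

```python
def clamp8(v):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(v))))

def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 YCbCr (JFIF variant)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return clamp8(y), clamp8(cb), clamp8(cr)

def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging each 2x2 block.

    `plane` is a list of rows with even width and height (an assumption
    made here to keep the sketch short).
    """
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]
```

Only the Cb and Cr planes would be passed through `subsample_420`; the Y plane keeps its full resolution. This kind of conversion normally happens before the encoder proper, in the application or in a filter stage.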
Quote:
Originally Posted by Ataril
As far as I understand even in original ITU-T papers mainly decoder's model are described in details.
|
Video compression standards, such as H.264 or H.265, only describe what a valid bit-stream looks like and how a compliant decoder handles such a bit-stream.
But how to generate a valid bit-stream that, after decompression, resembles the original input video as closely as possible (under the given bitrate limitations) is not specified at all.
That exercise is left for the encoder developers to figure out.
Quote:
Originally Posted by Ataril
Another mistery for me is the motion estimation. I met mentions of Diamon, Hex, UMH and ESA methods for searching the best matching block in the frame (It is still the part of block-matching algorithm, right?), but never met detailed comprehensive explanation what is the difference between all of them or in which case one or the other should be used.
|
In order to find the "best" motion vectors, the encoder has to try them out and keep the result that performed best, e.g. in terms of smallest "error".
Now, there are way too many possibilities to try them all (in reasonable time). So, the encoder has to search the space of possible motion vectors in a "smart" way.
Simply put, what the encoder does in practice is try only a few possibilities (according to some "search pattern") and then refine the most promising candidates.
The names "diamond" (DIA), "hexagonal" (HEX), "uneven multi-hexagon" (UMH) and "exhaustive search" (ESA) refer to such search methods/patterns, listed here roughly in order of increasing thoroughness and computational cost.
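As a rough illustration of the idea (not how any real encoder implements it), the following Python sketch compares a greedy small-diamond search with an exhaustive search. It assumes SAD (sum of absolute differences) as the "error" measure, integer-pixel motion vectors only, and a tiny synthetic frame; all function names are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by the candidate motion vector (dx, dy)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
        for j in range(n) for i in range(n)
    )

def valid(ref, bx, by, dx, dy, n, max_range):
    """True if (dx, dy) is within the search range and inside the frame."""
    return (abs(dx) <= max_range and abs(dy) <= max_range
            and 0 <= bx + dx and bx + dx + n <= len(ref[0])
            and 0 <= by + dy and by + dy + n <= len(ref))

def diamond_search(cur, ref, bx, by, n, max_range):
    """Greedy small-diamond search: test the 4 neighbours of the current
    best vector, move to the best one, stop when none improves."""
    evals = 0

    def cost(dx, dy):
        nonlocal evals
        if not valid(ref, bx, by, dx, dy, n, max_range):
            return float("inf")
        evals += 1
        return sad(cur, ref, bx, by, dx, dy, n)

    best, best_cost = (0, 0), cost(0, 0)
    while True:
        neighbours = [(best[0] + dx, best[1] + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))]
        c, mv = min((cost(*m), m) for m in neighbours)
        if c >= best_cost:
            return best, best_cost, evals
        best, best_cost = mv, c

def exhaustive_search(cur, ref, bx, by, n, max_range):
    """Try every vector in the search window and keep the best one."""
    r = range(-max_range, max_range + 1)
    cands = [(dx, dy) for dy in r for dx in r
             if valid(ref, bx, by, dx, dy, n, max_range)]
    best = min(cands, key=lambda m: sad(cur, ref, bx, by, m[0], m[1], n))
    return best, sad(cur, ref, bx, by, best[0], best[1], n), len(cands)

# Synthetic 16x16 frames: the current frame is the reference shifted so
# that the true motion vector is (2, 1).
ref = [[8 * x + y for x in range(16)] for y in range(16)]
cur = [[8 * (x + 2) + (y + 1) for x in range(16)] for y in range(16)]

dia_mv, dia_cost, dia_evals = diamond_search(cur, ref, 6, 6, 4, 4)
esa_mv, esa_cost, esa_evals = exhaustive_search(cur, ref, 6, 6, 4, 4)
```

On this smooth test pattern both searches end at the same vector, but the diamond search computes far fewer SADs. On real content a greedy search can get stuck in a local minimum, which is the reason for the larger and more elaborate patterns: they trade extra computation for a better chance of finding the true best match.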