How frequency domain analysis in frame processing works? [Archive]

luquinhas0021

23rd April 2015, 21:23

Hi, my dear people!

A lot of image processing algorithms (and the articles about them) talk about using the frequency domain to improve the quality of a frame. But, at least in the ones I have seen, nobody explains exactly how they arrive at this frequency domain.
What I know is that, to calculate the frequency domain, the algorithm has to find the time-domain signal and then apply the Fourier transform (it seems to produce only one function, unlike the Fourier series, which gives many terms). However, how do I arrive at the time-domain equation? What data do I use: the RGB values in numeric form, the color's hexadecimal value? If it is none of those, what is it?
It is used in sound processing too, and in many other areas.
P.S.: I know that frequency domain analysis gives me the components of a signal. Can it be used to discover which colors compose a given color x (or a given set of RGB component values)?
Please answer all of my questions.

Thank you!

LoRd_MuldeR

23rd April 2015, 22:00

A frequency transform for 2D images (and video is very similar, as it's just a sequence of 2D images) doesn't transform from the time domain to the frequency domain, but from the spatial domain to the frequency domain.

Put simply, you cut the input image into fixed-size blocks and transform each block into a linear combination of certain "patterns" (the basis functions). For example, these are the DCT (http://en.wikipedia.org/wiki/Discrete_cosine_transform) basis functions used by JPEG:

http://upload.wikimedia.org/wikipedia/commons/2/23/Dctjpeg.png

So, while in the spatial domain each 8x8 block consists of 64 distinct pixel values, in the frequency domain you have 64 frequency coefficients. And each coefficient corresponds to the "weight" of one of the above patterns.
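
To make the 64-values-in, 64-coefficients-out idea concrete, here is a minimal sketch in plain Python. It assumes the orthonormal DCT-II scaling (real codecs use fast integer approximations), and the names `dct_basis`, `dct2` and `idct2` are made up for this illustration:

```python
import math

N = 8  # block size used by JPEG

def dct_basis(k, i, n=N):
    """Value of the k-th orthonormal 1D DCT-II basis function at sample i."""
    scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    return scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))

def dct2(block):
    """Forward 2D DCT: n*n pixel values -> n*n frequency coefficients."""
    n = len(block)
    return [[sum(block[y][x] * dct_basis(v, y, n) * dct_basis(u, x, n)
                 for y in range(n) for x in range(n))
             for u in range(n)]
            for v in range(n)]

def idct2(coeffs):
    """Inverse 2D DCT: rebuild the pixels as a weighted sum of the patterns."""
    n = len(coeffs)
    return [[sum(coeffs[v][u] * dct_basis(v, y, n) * dct_basis(u, x, n)
                 for v in range(n) for u in range(n))
             for x in range(n)]
            for y in range(n)]

# A perfectly flat block only needs the top-left "constant" pattern (the DC
# coefficient). With this scaling the DC value is 8 times the block mean.
flat = [[100] * N for _ in range(N)]
coeffs = dct2(flat)
restored = idct2(coeffs)
```

Here `coeffs[v][u]` is the weight of the pattern in row `v`, column `u` of the picture above, and `idct2` gives back the original pixels up to floating-point rounding.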

The primary advantage is: In the frequency domain, you can usually approximate an 8x8 block pretty well with only a small number of non-zero frequency coefficients. This is where bits are saved, in the entropy coding (http://en.wikipedia.org/wiki/Entropy_encoding) stage.

http://img.tomshardware.com/us/1999/09/24/video_guide_part_3/dct.gif
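
The "few non-zero coefficients" point can be tried out with a rough sketch like the following. It is only an illustration: the block is a made-up smooth gradient, and keeping the 4 largest coefficients stands in for the quantization step a real encoder would use. All function names are hypothetical.

```python
import math

N = 8  # block size

def basis(k, i):
    """Orthonormal 1D DCT-II basis function k evaluated at sample i."""
    scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return scale * math.cos((2 * i + 1) * k * math.pi / (2 * N))

def dct2(block):
    """Forward 2D DCT of an 8x8 block."""
    return [[sum(block[y][x] * basis(v, y) * basis(u, x)
                 for y in range(N) for x in range(N))
             for u in range(N)]
            for v in range(N)]

def idct2(coeffs):
    """Inverse 2D DCT of an 8x8 coefficient block."""
    return [[sum(coeffs[v][u] * basis(v, y) * basis(u, x)
                 for v in range(N) for u in range(N))
             for x in range(N)]
            for y in range(N)]

def keep_largest(coeffs, k):
    """Zero every coefficient except the k with the largest magnitude."""
    ranked = sorted(((abs(coeffs[v][u]), v, u)
                     for v in range(N) for u in range(N)), reverse=True)
    keep = {(v, u) for _, v, u in ranked[:k]}
    return [[coeffs[v][u] if (v, u) in keep else 0.0 for u in range(N)]
            for v in range(N)]

def max_error(a, b):
    """Largest absolute per-pixel difference between two blocks."""
    return max(abs(a[y][x] - b[y][x]) for y in range(N) for x in range(N))

# Made-up smooth block: a diagonal gradient with values from 50 to 155.
block = [[50 + 10 * x + 5 * y for x in range(N)] for y in range(N)]

sparse = keep_largest(dct2(block), 4)   # 4 of 64 coefficients survive
approx = idct2(sparse)
error = max_error(block, approx)
nonzero = sum(1 for row in sparse for c in row if c != 0.0)
```

For a smooth block like this one, `error` stays at a few gray levels even though 60 of the 64 coefficients were thrown away. A noisy or very detailed block would need more coefficients for the same quality.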

luquinhas0021

24th April 2015, 17:01

LoRd MuldeR, are the pixel values in the matrix, before the DCT, RGB values? If not, what are they?

LoRd_MuldeR

24th April 2015, 18:06

Usually image/video compression works in YCbCr (http://en.wikipedia.org/wiki/YCbCr) color space, with the two chroma channels sub-sampled (http://en.wikipedia.org/wiki/Chroma_subsampling). The transformation will be done separately for each channel.
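
As a rough illustration of that color-space step, the sketch below converts RGB to YCbCr and then halves the chroma resolution. It assumes the full-range BT.601 constants used by JPEG/JFIF (video codecs often use other ranges or matrices) and a simple 2x2 average for 4:2:0 sub-sampling. The function names are made up for this example.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style BT.601)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging 2x2 neighborhoods.

    Assumes the plane has an even width and height.
    """
    return [[(plane[y][x] + plane[y][x + 1] +
              plane[y + 1][x] + plane[y + 1][x + 1]) / 4.0
             for x in range(0, len(plane[0]), 2)]
            for y in range(0, len(plane), 2)]

# White has full brightness and neutral chroma.
white = rgb_to_ycbcr(255, 255, 255)

# A 4x4 chroma plane shrinks to 2x2; the luma plane would stay at 4x4.
chroma = [[1, 3, 5, 7],
          [1, 3, 5, 7],
          [2, 2, 6, 6],
          [2, 2, 6, 6]]
small = subsample_420(chroma)
```

So the numbers that go into the DCT are the Y values (brightness) and the sub-sampled Cb and Cr values (color difference), not the RGB triples themselves.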

luquinhas0021

24th April 2015, 18:26

Does each channel (Y, Cb and Cr) get its own DCT? Or is it one for Y and one for Cb+Cr?

LoRd_MuldeR

24th April 2015, 18:37

It's always the same DCT, I suppose, but applied separately to each channel.
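
A small sketch of what "same DCT, applied separately" could look like: each plane is cut into 8x8 blocks and every block goes through one shared transform function. The 16x16 frame, the plane contents and all names here are hypothetical, and the DCT is written in the separable rows-then-columns form with orthonormal scaling.

```python
import math

def dct1d(values):
    """Orthonormal 1D DCT-II of a list of samples."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(values[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                               for i in range(n)))
    return out

def dct2(block):
    """2D DCT done as 1D DCTs over the rows, then over the columns."""
    rows = [dct1d(row) for row in block]
    cols = [dct1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to [v][u]

def split_blocks(plane, size=8):
    """Yield (top, left, block) tiles; assumes dimensions are multiples of size."""
    for top in range(0, len(plane), size):
        for left in range(0, len(plane[0]), size):
            yield top, left, [row[left:left + size] for row in plane[top:top + size]]

def transform_plane(plane):
    """Run the same block DCT over every 8x8 tile of one channel."""
    return {(top, left): dct2(block) for top, left, block in split_blocks(plane)}

# Hypothetical 16x16 frame in 4:2:0: full-size luma, half-size chroma planes.
planes = {
    "Y": [[(x + y) * 8 for x in range(16)] for y in range(16)],
    "Cb": [[128] * 8 for _ in range(8)],
    "Cr": [[128] * 8 for _ in range(8)],
}
coeffs = {name: transform_plane(plane) for name, plane in planes.items()}
```

With these sizes the Y plane yields four coefficient blocks while Cb and Cr yield one each, and no channel is ever mixed with another inside the transform.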