View Full Version : How to prevent grain induced block noise in uniform areas?
adoniscik
27th May 2004, 21:26
I am trying to get a 1 Mbps video with a smooth sky and a reasonable encoding rate (over 10 fps), but I realize this may not be possible. If not, I want to know how close I can get.
I read the archives voraciously, yet I could not solve the problem. Some of the suggestions contradict each other! Some people say block noise is due to bit starvation, so noise should be added. Others say it is a result of noise, and should be filtered out. What is a poor soul to do?
Since there was so much grain to begin with I felt no need to use the dithering approach. That leaves the noise reduction approach, of which I tried the following Avisynth 2.5 filters to no avail, used after de-interlacing with tomsmocomp(1,5,1):
mpeg2source with the cpu=6 parameter.
Convolution3d(preset="movieLQ") (as well as other parameters)
peachsmoother(NoiseReduction=100)
smoothhiq(7,20,20,256,0)
VagueDenoiser(threshold=0.8,method=1,nsteps=6,chroma=true)
The only thing that remotely helped was enabling "Film Effect" in the XVID decoder.
Xvid 1.0 parameters:
AS @ L5, Two-pass @ 1000kbps
"Ultra High" motion precision search
"Mode decision" VHQ mode
Default quantizer restrictions: 1,31,1,31,1,31
Trellis quantization
MPEG quantizer (but I tried 'em all)
Asmodian
27th May 2004, 22:19
Try with no trellis quantization. Also a quant range of 2-31 and the h263 or HVS Best matrix might help.
EDIT: Do you just use standard B frame settings?
*.mp4 guy
27th May 2004, 22:26
try these settings:
custom mpg matrix sooulhuntersv5: " "$ "$0 "$08 "$08@ "$08@€ $ $* $*0 $*08 $*08@ $*08@` $*08@`€
gmc
interlaced
qpel
default bvops
packed bitstream
closed gov
chroma optimizer enabled
motion search6
vhq4
use chroma motion
I frame interval 300
min quant 2 for all frames
trellis quant
to use the custom matrix just copy paste it into a txt file and open the txt file under the load custom matrix section.
Blockiness on smooth surfaces, which contain little detail... A widely-discussed and never ending problem. Filtering out the grain only makes things worse in such situations.
Do not filter out the grain and use a matrix that helps keep details. (Since you say the film-effect -adding grain- during playback doesn't disturb you.) If high quantizers are used in that scene, use "zones" feature to assign lower quants to that particular section.
Manao
27th May 2004, 22:53
adoniscik : blocks are indeed due to overquantization of relatively smooth areas. This overquantization happens when you don't have enough bits to properly encode a frame. It is mainly visible in smooth areas, because blocks are more visible in those areas.
So, a way to fight it is to make those areas less smooth, that's why some people were advising to add noise. But there is a counterpart : if you add noise, you're making the movie less compressible, so you'll lower the overall quality of the movie. The trick is to add enough noise to make the blocks almost disappear, and yet retain enough quality for the rest of the movie.
However, I would tend to use postprocessing to get rid of blocks. I don't like the idea of making the movie less compressible ( but that's a personal opinion ).
Now, for XviD's settings, I don't know whether you used b-frames or not, but you should use them. As Asmodian says, you should try without Trellis quantization, because one of its effects is to discard useless DCT coefficients, which alas creates such blocks. You should definitely try a custom matrix ( I like HVS-Best ).
Teegedeck
28th May 2004, 10:26
I do agree with iago and Manao. Though I'm not sure about disabling Trellis, because Trellis has very good effects that I wouldn't want to miss. For example it gives a lower filesize while contours seem to remain more accurate with it than without it, and in most cases it seems to prevent more quantization errors than it creates...
As iago said, first try to quantize the problematic zone less, use hvs-best as Manao said. If that doesn't help, add noise with that avisynth filter, err, I believe its name was "blockbuster", but only for that zone.
Lord_KiRon
29th May 2004, 14:23
adoniscik - are you by any chance trying to play your clip on an MTK-based hardware MPEG-4 player ? These players are known to produce a lot of blocks when displaying uniform areas, whereas on a PC the blocks are noticeable only if you set a high resolution.
adoniscik
29th May 2004, 16:31
No, I am using the standard PC players; BSPlayer, MPC, and Media Player. I did use B-frames, at the Xvid 1.0 standard settings.
I ended up using noise reduction and enabling "film effect" in the decoder, but keep the suggestions coming. I am sure someone will benefit from them. Out of curiosity, does this problem exist with wavelet compression?
Prettz
29th May 2004, 23:32
The best solution is to completely and totally eliminate all noise in the still areas. If you can do this, you will (obviously) eliminate the problem. However, usually this is not possible, and in some cases attempting to remove the noise just makes the compressed result look even worse than before (especially when you try to remove the noise with Convolution3D).
In cases where you can't eliminate the noise, you should just do some light filtering (with filters like FluxSmooth and Undot) and use zones to up the bitrate, lower the quantizers, and reduce the number of bframes used. At least, this is the solution that has worked best for me in the past.
Manao
29th May 2004, 23:49
Originally posted by Prettz
The best solution is to completely and totally eliminate all noise in the still areas. If you can do this, you will (obviously) eliminate the problem.
No. Doing so, you'll end up with a smooth gradient, which will be overquantized by XviD, because there isn't enough detail in it. Overquantization will break the smooth gradient ( in a sort of step-function ( if that word exists ) ), hence creating blocks. Such denoising will only prevent them from moving.
adoniscik,
I processed your sample, you can find it on
ftp://www.eb.enterpol.pl
user: www.eb.enterpol.pl
password : eb
name of the sample ss1++.avi
eb
adoniscik
30th May 2004, 03:16
I downloaded your version. Thank you for the effort. The blocking is gone, but I notice ghosting, presumably due to aggressive temporal smoothing. Do you see it too? Judging by the 4CC, I think you used 3ivx rather than xvid.
Yes 3ivx was used.
But the weirdest thing in your sample is the shaking picture, so I used the deshaker plugin, then smoother, denoiser and again smoother, all this in the second pass of deshaker.
eb
adoniscik
30th May 2004, 05:21
Is that Gunnar Thalin's deshaker (http://biphome.spray.se/gunnart/video/deshaker.htm)? I would imagine stabilizing affects compressibility because shaking makes motion estimation more difficult. I will investigate this. What is the performance penalty?
lordadmira
30th May 2004, 11:24
Yeah, it's a common problem u have there. But if you ask me trying to get 1 Mbps is pushing it for that frame size. Nevertheless I gave a go at it, I reused most of the settings I use for encoding my Inuyasha eps. :) I got a 504kB clip that looks much better than eb's attempt (overall). He used Quant 1 which is insane. ;) I uploaded the config file too so you can check out the settings.
The problem with fine gradients like that is, like Manao said, a problem with the dynamic range. However I don't see that problem in ur clip. The sky is chaotic enough to encode fine. It's the *really* fine gradients that have the potential to break the codec. Any blockiness u see is due to too low a bitrate. My encode went to about 820 kbps, 1000 is too high for this particular scene. I did a Q=2 encode and it only came out at under 1300 kbps. There is some choppiness to the sky (choppiness is diff from blockiness), that's normal for this codec for the dynamic range considerations already discussed. This is because the codec works by forcing square pegs into round holes, and there are limits to what that can achieve. Even Q=1 in my tests gave minor choppiness. To get any better u need custom matrices. That's a trial and error process. The default matrix is limited in what it can do. I used Virtual Dub Mod and Smart Deinterlacer. The other filters are built in.
http://w3.goodnews.net/~wagnerc/adonisck.vcf
http://w3.goodnews.net/~wagnerc/adonisck[Xvid_1000].avi
LA
lordadmira
30th May 2004, 11:47
Originally posted by Manao
No. Doing so, you'll end up with a smooth gradient, which will be overquantized by XviD, because there isn't enough detail in it. Overquantization will break the smooth gradient ( in a sort of step-function ( if that word exists ) ), hence creating blocks. Such denoising will only prevent them from moving.
Eh, sort of. What we call a fine gradient is an area that has a small dynamic range. It's a range and pattern that doesn't correspond well to any of the quant matrix's transforms. So the codec has to force the area to take on one of a set of bad matches. The result is choppiness. Blockiness is when the edge patterns of blocks don't line up/match up. The codec is optimized to handle well a certain range of patterns and gradients that occur in natural scenes. Fine gradients break that at one end of the spectrum and anime breaks it at the other end. Solid color -> black line -> solid color is not something it likes to deal with. It's the same phenomenon that causes the codec to go into rate control anarchy when it starts using quants 1's and 2's. Math might tell u that 1.52987 is the right choice, but sorry! U have to pick 1 or 2. Same with the fine gradients.
@iago
I don't think the amount of noise makes any difference to the dynamic range problem. Because although the noise will expand the absolute range, it will not change the average range. This is why temporal-spatial smoothers are able to filter out noise in the first place. So adding noise will just add to the blockiness problem since bits have to be spent to handle that large absolute range and do nothing for the choppiness problem which is one of average dynamic range.
LA
Manao
30th May 2004, 12:17
lordadmira : I think we're speaking of the same thing. With low quant, you can approximate very well ( to a certain extent ) all the values taken by DCT coefficients, while at higher quants, a stronger rounding will happen. When you round too much the coefficient that represents the mean value of the block ( I don't remember whether it is DC or AC, let's call it DC ), you obtain the block effect : DC coefficients will differ from one block to another, and continuity won't be brought back by other coefficients, because they are even more quantized.
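The rounding effect described here can be tried out with a small Python sketch. It is only an illustration under stated assumptions: it models a single row of each 8-pixel block with an orthonormal 1-D DCT, and `encode_ramp` with its uniform `step` is a hypothetical quantizer, not XviD's H263 or MPEG quantization. With a very fine step the ramp survives; with a coarse step every AC coefficient rounds to zero, each block goes flat, and the ramp turns into a staircase.

```python
import math

N = 8  # block width; only one row of an 8x8 block is modelled here

def dct(block):
    """Orthonormal 1-D DCT-II of a list of N samples."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        out.append(scale * sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                               for n, x in enumerate(block)))
    return out

def idct(coeffs):
    """Inverse of dct() (orthonormal DCT-III)."""
    out = []
    for n in range(N):
        total = 0.0
        for k, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
            total += scale * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
        out.append(total)
    return out

def encode_ramp(step):
    """Quantize a 0..255 luma ramp block by block with a uniform step (assumed quantizer)."""
    ramp = list(range(256))
    recon = []
    for start in range(0, len(ramp), N):
        coeffs = dct(ramp[start:start + N])
        quantized = [round(c / step) for c in coeffs]      # the lossy rounding
        recon.extend(idct([q * step for q in quantized]))  # what a decoder would rebuild
    return recon

def block_ranges(signal):
    """Max minus min inside each block: 7 for the untouched ramp, 0 for a flat block."""
    return [max(signal[i:i + N]) - min(signal[i:i + N])
            for i in range(0, len(signal), N)]

fine = encode_ramp(0.001)   # almost lossless
coarse = encode_ramp(16)    # coarse enough to zero every AC coefficient of this ramp

print(max(abs(a - b) for a, b in zip(fine, range(256))))  # tiny reconstruction error
print(max(block_ranges(coarse)))                          # ~0: every block is flat
```

Once the blocks are flat, the only thing left to carry the gradient is the per-block mean, so the picture can only change at block borders, which is where the eye finds the steps.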
lordadmira
30th May 2004, 13:12
Heheh, yeah, choosing the right words gets hard in these esoteric discussions. ^_^ U need to be a damn English major sometimes...
manono
30th May 2004, 15:57
adoniscik-
You've got a bad PAL to NTSC DVD there. Part of your problem stems from the fact that you're just deinterlacing it (and filtering the hell out of it) and keeping it at 29.97fps. If you were to use the KernelBob/RePAL combo, it'll become 24.975fps, distribute the available bits among fewer frames, have higher quality with a lower average quant for the same file size, and generally give you a much better looking result.
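The bit budget argument is simple arithmetic and can be sketched in Python. The 1000 kbps figure is the target from the first post; `bits_per_frame` is just an illustrative helper and it ignores the fact that real encoders spend bits unevenly across frame types.

```python
def bits_per_frame(bitrate_bps, fps):
    """Average bit budget available to each frame at a given average bitrate."""
    return bitrate_bps / fps

BITRATE = 1_000_000  # 1000 kbps, the target mentioned at the start of the thread

deinterlaced = bits_per_frame(BITRATE, 29.97)   # plain deinterlace, frame rate kept
restored = bits_per_frame(BITRATE, 24.975)      # after KernelBob/RePAL

print(round(deinterlaced), round(restored))
print(restored / deinterlaced)  # about 1.2, i.e. roughly 20% more bits per frame
```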
lordadmira wrote
I got a 504kB clip that looks much better than eb's attempt (overall). He used Quant 1 which is insane.
Yes you are right, I used this to make a quick assessment of what max bitrate is needed so as not to lose details, then I forgot about this, jumping to the deshaking problem.
And I must underline that the prime target of my post was to draw adoniscik's attention to the shaking problem.
lordadmira:
Heheh, yeah, choosing the right words gets hard in these esoteric discussions. ^_^ U need to be a damn English major sometimes...
so what I can say with my crappy English
eb
adoniscik
31st May 2004, 01:27
You've got a bad PAL to NTSC DVD there.
Why did I not think of that? I saw the blended frames, but I did not infer the presence of telecine. You learn something new every day! If I have any questions about the tutorial (http://www.doom9.org/ivtc-tut.htm) I know where to ask.
lordadmira
31st May 2004, 05:13
You have to be very careful dealing with inverse telecining. U can only successfully do it if the movie was telecined properly in the first place. I've run into videos that were constructed from various telecined pieces stitched together. The IVT starts off in sync but then hits a stitch point and the sync is lost. Stuff after that point will have strange interlacing artifacts due to the out of sync IVT. IVT only works right if the whole video was telecined after being made. The only real way to tell this is to load it up in VDub with the IVT and frame through it looking for artifacts. It's a real pain.
Looking at ur video it seems not to be telecined but truly interlaced. Doing an IVT left visible interlacing artifacts.
edit: Looking again it appears 1 in 6 frames is non-interlaced. This could indicate some strange film -> PAL -> NTSC conversion and some more bizarre telecining process. Undoing that much butchery might be more trouble than it's worth. Straight deinterlacing gave me good results.
eb: LOL :)
Lord_KiRon
1st June 2004, 09:03
Can't the process be "automated" ? I mean, it shouldn't be too hard to extend the telecine detection process (algorithm) so that it also detects videos that were constructed from various telecined pieces; after that we only need to decide what to do at each stitch point.
use donald graft's decomb for avisynth.
it has a reset feature especially for this problem - you'll need to read the help file
I reckon your initial problem is noise induced. remove it with a high radius using temporal soften
temporalsoften(8, 2 or 3, 8, 16, 2)
Close examination of the clip shows a fair amount of chroma noise. you might want to try cnr2()
Prettz
2nd June 2004, 12:17
Originally posted by Manao
No. Doing so, you'll end up with a smooth gradient, which will be overquantized by XviD, because there isn't enough detail in it. Overquantization will break the smooth gradient ( in a sort of step-function ( if that word exists ) ), hence creating blocks. Such denoising will only prevent them from moving.
I completely disagree with this. When I said "completely and totally eliminate all noise in the still areas" I meant the best case scenario when you can do this without smoothing or removing detail. This happens occasionally.
If you can't remove random noise, you can often end up with just larger solid areas of random noise moving about, which look absolutely hideous in the final encode (this happens a lot with animation). In these cases you're best off trying to reduce the sharpness of the noise as much as possible without just smoothing the noise around.
Manao
2nd June 2004, 12:58
Prettz : as soon as I get home, I'll create a noiseless gradient, and test my assertion ( which is that if I overquantize that noiseless gradient, I'll get blocks ). But I'm almost sure to be right.
lordadmira
2nd June 2004, 13:12
Of course it's possible to crush any image type (including a pure gradient) enough to get blocks. The question is how much will Xvid decide on its own to quantize fine gradients in the context of the video overall. I think ur point is that fine gradients will break the codec more readily than general images. In these cases u need a custom matrix to bring more quant numbers into the range that the gradient "wants" to quantize to.
LA
Manao
2nd June 2004, 17:50
I did the test : I created a gradient going horizontally from deep black to deep white ( luma increasing by one with each pixel ). I encoded it with XviD, const quant 4 and 7, H263 ( I also tried quant 3, and HVSBest quant 3, results were almost the same as H263 quant 4 ), without fancy options. The pictures shown are keyframes, but the following frames are almost the same ( since there is no motion ).
http://jourdan.madism.org/~manao/gradient-original.png
http://jourdan.madism.org/~manao/gradient-h263-q4.png
http://jourdan.madism.org/~manao/gradient-h263-q7.png
You should see that the gradient isn't smooth anymore, even at quant 4 ( which can't be called image crushing, imho )
Teegedeck
2nd June 2004, 18:00
:goodpost:
lordadmira
2nd June 2004, 18:25
That's a really good demonstration of the limitations of DCT encoding. And actually, that is an even steeper gradient than the ones that give us problems in real life encodes. I'd be interested to see what quants 1 and 2 look like. Hmm, what gradient would be considered "the worst case scenario" for Xvid? We should save that and add it to a FAQ somewhere.
PS Please forgive my cavalier use of the word "crush". :D
Manao
2nd June 2004, 19:25
lordadmira wrote
Hmm, what gradient would be considered "the worst case scenario" for Xvid?
It will be most visible on artificial gradients, because the eye expects something XviD can't reproduce. On natural gradients, it's less visible.
Another thing : I made screenshots with VDubMod, and something strikes me : even with the original gradient, we can see a sort of blocking ( very faint, only visible if zoomed enough ). I wonder whether a conversion to RGB wouldn't be the culprit of this one.
And for the third screenshot, am I the only one to see faint green lines ? The AVS script was a blankclip() followed by a filter which touched only the luma channel, so chroma should be uniform, so XviD shouldn't show such green lines. Is that an encoder or decoder bug ( if it is even a bug ) ?
Finally, I also tested quant 2. The keyframe is slightly blocky ( though there the RGB conversion may be the culprit ), and, more surprising, not vertically uniform. However, the following p-frames tend to correct both blocks and uniformity ( though it needs 5 p-frames ). The same effect ( disappearance of the blocks ) was also almost achieved after 7-8 p-frames at Q4, but not at Q7.
lordadmira
2nd June 2004, 19:35
But which artificial gradient would be the worst case scenario? You probably had 16 bit color on the preview windows. That's the default unless u change it. There's no problem going between YUV and RGB. (Not counting YV12 which is evil!) I didn't see any green lines on #3 but there is some strange color-like mottling. That could be anything from a compression artifact to a moire illusion. I think there is some Xvid bug where it can't handle pure B&W material.
Soulhunter
2nd June 2004, 22:05
Hmm...
Loading the pic as uncompressed Avi in VDub via AviSynth...
AviSource("C:\Blah.avi")
http://img6.imageshack.us/img6/5669/00001.png
Loading the pic as uncompressed Avi in VDub via AviSynth...
AviSource("C:\Blah.avi").ConverttoYV12()
http://img17.imageshack.us/img17/6972/00004.png
Bye
lordadmira
2nd June 2004, 22:38
Originally posted by Soulhunter
AviSource("C:\Blah.avi").ConverttoYV12()
Ah, sheer beauty. A clear demonstration of the evil of the YV12 colorspace. :devil:
LA
addendum: I opened up the "original" gradients in Paint Shop and there are only 219 colors present. The YV12 gradient only has 189 colors.
Manao
2nd June 2004, 23:00
Soulhunter : this should not be relevant, since the output of my avs script is YV12, and XviD also encodes to YV12.
What bothers me, however, is that if I open the avi directly with virtual dub, or with avisynth, ( AVISource("foo.avi", pixel_type="YV12") ), I get different results.
lordadmira : I don't think ( alas ) that it's a moire illusion. It is not a problem of VDub being in 16 bits ( it was in 'desktop color depth' mode ). And for black & white, YV12 is not evil.
Edit : for the number of color : somebody ( something ) put a limiter somewhere ( hence reducing the luma range to 16-235 which, omg, makes only 219 available grey levels )
RadicalEd
2nd June 2004, 23:09
The green lines seem to come from the non-mod-16 height. Crop it down to 256x96 and I'd bet it'll work.
Manao
3rd June 2004, 00:15
RadicalEd : alas, no. I encoded a 720x544 clip, and I cropped after having made the screenshots. I should have mentioned this, sorry.
RadicalEd
3rd June 2004, 00:18
Strange. I encoded one sample at 256x96, another at 256x90, and the same kind of lines were present (only) in the latter.
lordadmira
3rd June 2004, 00:20
But there are fewer grays in the YV12 sample than in the normal sample. How do u explain that. I know it "shouldn't" matter.
@ed Going non mod16 shouldn't make any difference.
Manao
3rd June 2004, 00:28
lordadmira : RGB -> YV12 conversions are lossy. Rounding is applied, so it may happen that the number of colors is reduced.
Look at the following example :
Suppose that for converting from a monocolorspace X to another monocolorspace Y, you have to multiply the value by 2. Now, you have a gradient going from 0 to 255 in the Y colorspace. You convert it to the X colorspace ( so you're dividing by 2, and rounding ), then back to the Y colorspace. You now have the following gradient :
0 0 2 2 4 4 6 6 .. 254 254
That's a crude example, but that's roughly what is happening when converting from RGB to YV12.
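The same toy example can be written as a few lines of Python. The colorspaces X and Y are the made-up ones from the example above, `y_to_x` and `x_to_y` are illustrative names, and "rounding" is assumed to mean dropping the fraction, which is what reproduces the 0 0 2 2 pattern.

```python
def y_to_x(value):
    """Y -> X: divide by 2 and drop the fraction (integer storage)."""
    return value // 2

def x_to_y(value):
    """X -> Y: multiply by 2."""
    return value * 2

gradient = list(range(256))
round_trip = [x_to_y(y_to_x(v)) for v in gradient]

print(round_trip[:8])                            # [0, 0, 2, 2, 4, 4, 6, 6]
print(len(set(gradient)), len(set(round_trip)))  # 256 128
```

Half of the levels are gone after a single round trip, even though each individual conversion looks harmless.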
lordadmira
3rd June 2004, 00:39
But for a grayscale image a YV12 conversion shouldn't be lossy. It is lossy for color images. This is because only the UV planes are subsampled. The Y plane, the only one with nonzero data in a grayscale image, is full rez. So if the chroma info is blank, as it should be in these pics, a YV12 conversion shouldn't degrade the image.
@ed I don't know what green lines ur talking about. I ran the color dropper over the third image and nothing that could be considered green came up. Almost everything was a pure gray, a couple lines had a slight excess of green but only by 3 or 4.
RadicalEd
3rd June 2004, 00:56
You're not looking hard enough. The discontinuities are definitely visible, zoomed or not. Even so, no mathematical discontinuities should be present between RGB channels.
[edit]FYI, the first one goes from y = 6 to 9
Prettz
3rd June 2004, 01:00
Eh, I never mentioned anything about creating a gradient, only removing noise. In my experience the gradients are often destroyed not by the encoder but by the noise removing filters themselves (again, especially Convolution3D).
I think Xvid generally handles gradients very well, and orders of magnitude better than Divx 5.
Manao
3rd June 2004, 01:00
lordadmira wrote
But for a grayscale image a YV12 conversion shouldn't be lossy.
I just looked at the source code, and, at least for the C version, it's lossy. Basically, the luma channel is computed by the following formula :
(0.114 * r + 0.587 * g + 0.299 * b) * 219 / 255 + 16
Hence, if we had 219 different grey levels, after making a conversion to RGB, then back to YV12, there'll be 219 * 219 / 255 = 188.08 different grey levels.
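That estimate can be checked by brute force in Python. This is a sketch under assumptions: only pure greys are considered (r = g = b, so the three weights sum to 1 and drop out), the result is rounded to the nearest integer, and `squeeze` is an illustrative name, not the AviSynth function. Whether the real conversion chain applies the squeeze twice is exactly what is being debated in this thread, so the second pass is a hypothesis.

```python
def squeeze(level):
    """Map a 0..255 grey level into 16..235, following the quoted luma formula for r = g = b."""
    return round(level * 219 / 255) + 16

# One pass over every possible 8-bit grey.
once = sorted({squeeze(g) for g in range(256)})
# Hypothetical second pass: the already squeezed levels get squeezed again.
twice = sorted({squeeze(g) for g in once})

print(len(once), once[0], once[-1])
print(len(twice), twice[0], twice[-1])
```

With this rounding the first pass keeps 220 distinct levels (16 to 235 inclusive) and the second pass keeps 189, which is in line with the 219 * 219 / 255 estimate and with the colour counts reported earlier in the thread.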
lordadmira wrote
a couple lines had a slight excess of green but only by 1 or 2.
So I wasn't dreaming. I don't see what could mathematically create these lines. It means that DCT coefficients ( other than DC ) are different from 0, where there should have been zero. But since the chroma input is homogeneous, I can't see how it would happen ( except, again, bad rounding, but since it happens only at higher quants, there is something wrong ).
lordadmira
3rd June 2004, 01:07
Oh ur talking about *horizontal* lines. I was only looking at the vertical shading lines. :D Now I can see the slight horizontal green lines. They're made up of pixels that are 3 points higher in green than pure gray. They are 4 pixels high and go through the middle of the macroblocks. As for the vertical discontinuities, I think those are illusions since I examined it with the dropper and way zoomed in. They're just different widths of the various gray lines and the fact that the gray is the wrong shade.
lordadmira
3rd June 2004, 01:22
Originally posted by Manao
I just looked at the source code, and, at least for the C version, it's lossy. Basically, the luma channel is computed by the following formula :
(0.114 * r + 0.587 * g + 0.299 * b) * 219 / 255 + 16
That is seriously screwed up. What program is that? AviSynth? Are u sure that's not for the YUY2 colorspace used by NTSC television? The Y channel must have a valid range of 0 - 255. All this crap is why I say to hell with non RGB colorspace. :devil:
I think this green line has to do with a semi known bug where Xvid pukes on B&W material. Some people see green, some see pink. I've had B&W encodes look pink.
Edit: NTSC uses the YIQ colorspace.
Manao
3rd June 2004, 03:21
lordadmira wrote
What program is that? AviSynth?
Yep
lordadmira wrote
Are u sure that's not for the YUY2 colorspace used by NTSC television?
It's the C function called RGB2YUV, which converts an RGB pixel into a YUV one. But I don't know how the real converttoyv12() works, since it's all in asm / mmx.
lordadmira wrote
I think this green line has to do with a semi known bug where Xvid pukes on B&W material
I always thought it was because the iDCT had changed many times in the early releases of XviD.
lordadmira wrote
All this crap is why I say to hell with non RGB colorspace.
If only screens were made of YUV pixels instead of RGB ones. I don't like RGB because it doesn't take into account how the HVS works.
lordadmira
3rd June 2004, 15:21
Well YUV was invented as a hack when color TV's came out so that black and white TV's would still work with the new signals. It used to be that there was just the "Y" signal. Instead of making a new transmission scheme they cleverly just "added" the U and V signals so that the new TV's could be full color and the old ones would still work. We've been paying the price ever since....
LA
virus
3rd June 2004, 17:12
Originally posted by lordadmira
We've been paying the price ever since....
Actually YUV is not a burden to carry on for historical reasons, but a way to provide better compression. The YUV components are (almost) decorrelated whilst the RGB ones are not. That means less variance for the U and V planes, hence less entropy, thus they can be represented with less bits.
All serious pictures/video compression codecs use some YUV colorspace (not always the same) to reduce spectral redundancy.
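A rough way to see the decorrelation is to compare per-channel variance before and after the transform. The Python sketch below uses made-up test data (a brightness ramp with a small colour wobble) and the common BT.601 style weights with signed U and V; both are assumptions chosen for illustration, not measurements from any clip in this thread.

```python
def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def rgb_to_yuv(r, g, b):
    """BT.601-style weights; U and V are signed here (0 means no colour)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

# Hypothetical pixels: R, G and B all follow the same brightness ramp,
# with a small wobble on R and B so the picture is not perfectly grey.
pixels = [(v + i % 3, v, v + (i * 7) % 5) for i, v in enumerate(range(10, 240, 5))]

rgb_var = [variance([p[c] for p in pixels]) for c in range(3)]
yuv = [rgb_to_yuv(*p) for p in pixels]
yuv_var = [variance([p[c] for p in yuv]) for c in range(3)]

print("R G B variance:", [round(x, 1) for x in rgb_var])
print("Y U V variance:", [round(x, 3) for x in yuv_var])
```

In RGB all three channels carry the full brightness swing; after the transform nearly all of it sits in Y and the U and V planes are almost constant, which is what makes them cheap to code.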
RadicalEd
3rd June 2004, 19:40
Originally posted by virus
Actually YUV is not a burden to carry on for historical reasons, but a way to provide better compression. The YUV components are (almost) decorrelated whilst the RGB ones are not. That means less variance for the U and V planes, hence less entropy, thus they can be represented with less bits.
That and there's the whole HVS factor, where our ability to sense change in color is that much lower than change in luma, thus allowing for even fewer bits.
virus
3rd June 2004, 21:28
Originally posted by RadicalEd
That and there's the whole HVS factor, where our ability to sense change in color is that much lower than change in luma, thus allowing for even fewer bits.
Yes. This is related to the fact that the luma component has a lot of green in it (see the equation above), since the eye is especially sensitive to the intensity of the green component and thus less sensitive to the U and V planes (which contain little green, being almost uncorrelated to the luma component).
Also, U/V signals have typically slow variations: that means that the U and V planes have a lower bandwidth than luma and thus they can (almost) safely be subsampled. And with this step we move from a generic YUV space to the well-known YUY2/YV12 spaces. And the eye still feels pretty fine :D
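To put numbers on the subsampling step, here is a small Python sketch that counts samples per frame for the three layouts. The 720x544 frame size is the one mentioned earlier in the thread; `samples_per_frame` is an illustrative helper and assumes 8 bits per sample and even dimensions.

```python
def samples_per_frame(width, height, chroma_div_x, chroma_div_y):
    """Luma samples plus two chroma planes, each subsampled by the given divisors."""
    luma = width * height
    chroma = 2 * (width // chroma_div_x) * (height // chroma_div_y)
    return luma + chroma

W, H = 720, 544

full = samples_per_frame(W, H, 1, 1)  # 4:4:4, no subsampling
yuy2 = samples_per_frame(W, H, 2, 1)  # 4:2:2, chroma halved horizontally
yv12 = samples_per_frame(W, H, 2, 2)  # 4:2:0, chroma halved in both directions

print(full, yuy2, yv12)
print(yv12 / full)  # 0.5: YV12 needs half the samples of full 4:4:4
```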
JasonFly
3rd June 2004, 22:06
This is a very interesting topic. I am very surprised concerning the conversion from RGB to YV12, so I also made some tests.
I made an RGB gradient image in photoshop. Here is the source:
http://perso.wanadoo.fr/paille/GradientRGB_source.png
I made two 2-second clips (one giving RGB and one giving YV12) with avisynth using these scripts.
RGB:
ImageReader("E:\GradientRGB_source.png", 0, 0,25,false).Loop(50,0,0)
YV12:
ImageReader("E:\GradientRGB_source.png", 0, 0,25,false).Loop(50,0,0).ConvertToYV12()
The results confirmed what has already been said: RGB->YV12 conversion is lossy.
RGB in Vdub:
http://perso.wanadoo.fr/paille/Vdub_RBG.png
YV12 in Vdub:
http://perso.wanadoo.fr/paille/Vdub_YV12.png
I also made a test giving the RGB source to Vdub and encoding with DivX 5 and XviD. The test was made at q=2 with almost default params for both codecs. I think I just put "slowest" for DivX and "hvsbest" for XviD. Anyway, that's not very important. Filesize for XviD is bigger than the DivX one. Concerning image quality, DivX is better than XviD.
DivX5:
http://perso.wanadoo.fr/paille/GradientRGB_divx5.png
XviD, hvsbest:
http://perso.wanadoo.fr/paille/GradientRGB_hvsbest.png
Does that mean that the DivX YV12 conversion is better than the XviD one?
Manao
3rd June 2004, 22:14
To compare DivX5 and XviD on that particular matter, you should use for XviD the H263 matrix. And I would also try to use the same DCT / iDCT.
JasonFly
3rd June 2004, 22:50
Yes you're right, results are far better with h263 than with hvsbest. But is it possible to use the same iDCT for both codecs? What is the DCT used by DivX?
H263
http://perso.wanadoo.fr/paille/GradientRGB_h263.png
dude051
4th June 2004, 06:36
@JasonFly
This test you did to compare DivX to XviD doesn't seem to prove anything about XviD's color space conversion to me. This is because when you changed the matrix, the result was in fact better... so doesn't this show a defect in the compression used with the matrix? To me, im no expert though, the problem looks to be mainly in the VDub or Avisynth's color conversion factors.
Just an opinion to throw out.
is it possibly caused by using only 8 bit matrix?
lordadmira
4th June 2004, 12:51
Originally posted by virus
Actually YUV is not a burden to carry on for historical reasons, but a way to provide better compression. The YUV components are (almost) decorrelated whilst the RGB ones are not. That means less variance for the U and V planes, hence less entropy, thus they can be represented with less bits.
No no no!! YUV isn't a way to get better compression, it's a way to cheat on bandwidth. YUV's only "advantage", if it can be called that, is that it allows for chroma subsampling of an image. That's fancy tech speak for chinsing on the color bandwidth. A hack. Now there would be nothing wrong with a full range full sample 4:4:4 YUV colorspace. But thanks to bean counters we have abominations like YV12, 4:2:0, 4:1:1. These non pure YUV spaces make certain YUV <-> RGB conversions lossy. Pushing more significant data into the Y channel doesn't decrease the total entropy. Entropy always increases, the zeroeth law of thermodynamics. You can take advantage of that with high entropy encoding schemes for the Y channel and low entropy schemes for the UV channels but I don't think that will save any bandwidth.
RadicalEd wrote
That and there's the whole HVS factor, where our ability to sense change in color is that much lower than change in luma, thus allowing for even fewer bits.
Only by using subsampling which increases the lossiness of the compression.
virus wrote
Yes. This is related to the fact that the luma component has a lot of green in it (see the equation above), since the eye is especially sensitive to the intensity of the green component and thus less sensitive to the U and V planes (which contain little green, being almost uncorrelated to the luma component).
Uh, I don't think u know what ur talking about. :D U and V are not red and blue.
@Jason
Yes a *color* RGB -> YV12 conversion will always be lossy since YV12 subsamples the chroma planes. U can see that in ur YV12 image, those are the subsampling artifacts. The B&W examples before should not have been lossy going from RGB -> YV12 because there is no chroma information. I don't know why that AviSynth function was mapping the 256 RGB grays into 219 Y grays. It's probably some attempt to compensate for the fact that each RGB channel contributes differently to the intensity of the gray. Ur compression tests on the gradients make sense since Mpeg4 can't replicate a gradient. It transforms images into waveforms, waves that must rise, peak, and fall. A gradient has no fall so it cannot be encoded.
LA
Manao
4th June 2004, 13:44
drcl : no
lordadmira : you may be interested in reading this (http://jehoo.netian.com/tech%20brief/brief%203/dvcontents/dv-14.html)
"Entropy always increases, the zeroeth law of thermodynamics."
In the real world, yes, but we're talking about informatics here.
"Uh, I don't think u know what ur talking about. U and V are not red and blue."
They are when Y = 0. When U = V = 0, the picture can vary from dark green ( Y = 0) to bright green ( Y = 255 ).
virus
4th June 2004, 13:54
lordadmira, mate, listen to me ;)
Statements like that:
Originally posted by lordadmira
Entropy always increases, the zeroeth law of thermodynamics.
where you mix in stuff from physics which has nothing in common with information theory, only show a lack of understanding of the very basics of data compression. If you have never heard words like "random variable", "zeroth order entropy of a source", "mutual information", "coding gain" or "spectral decorrelation", it doesn't mean they don't exist, just that you know nothing about them. I'd warmly suggest you read a book on information theory ;)
"Uh, I don't think u know what ur talking about. :D"
Strange, I was thinking the very same thing about you :D
Anyway FYI I have an engineering degree, earned with a thesis on lossless image compression (and the related theory behind it). Looks like I've been really lucky that they didn't trash all my work altogether, uh? ;)
virus
lordadmira
4th June 2004, 17:00
Originally posted by Manao
lordadmira : you may be interested in reading this (http://jehoo.netian.com/tech%20brief/brief%203/dvcontents/dv-14.html)
That's a really good article. Helps to elucidate this YUV phenomenon. Jason, it helps explain the 219 grays issue, "white clipping".
"They are when Y = 0. When U = V = 0, the picture can vary from dark green ( Y = 0) to bright green ( Y = 255 )."
When U and V are 0 the picture is grayscale. The R, G, and B values will always be equal. When Y is 0 you have black. This is because the presence of any R, G, or B value will raise the Y value. They all must be 0 for Y to be 0. But strictly speaking, U and V, they are not red and blue. I created a spreadsheet to play around with the values. I used these formulas:
RGB -> YUV
Y = 0.299*R + 0.587*G + 0.114*B
U = -0.147*R - 0.289*G + 0.436*B
V = 0.615*R - 0.515*G - 0.100*B
YUV -> RGB
Red = Y + 1.140*V
Green = Y - 0.396*U - 0.581*V
Blue = Y + 2.029*U
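For anyone who wants to try the spreadsheet experiment without a spreadsheet, here is a minimal Python sketch of the formulas above. The function names are made up for illustration, and the coefficients are simply copied from this post, so the round trip is only approximate.

```python
def rgb_to_yuv(r, g, b):
    """RGB -> YUV with the coefficients quoted in the post (U and V are signed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.147 * r - 0.289 * g + 0.436 * b
    v = 0.615 * r - 0.515 * g - 0.100 * b
    return y, u, v


def yuv_to_rgb(y, u, v):
    """YUV -> RGB with the inverse coefficients quoted in the post."""
    r = y + 1.140 * v
    g = y - 0.396 * u - 0.581 * v
    b = y + 2.029 * u
    return r, g, b


# Grays give U = V = 0 (up to float rounding); saturated colors do not.
for rgb in [(0, 0, 0), (128, 128, 128), (255, 255, 255), (255, 0, 0), (0, 0, 255)]:
    y, u, v = rgb_to_yuv(*rgb)
    print(rgb, "->", round(y, 1), round(u, 1), round(v, 1))
```

With these signed formulas any gray input lands on U = V = 0, and only black gives Y = 0.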
@virus
I must disagree. Physics has everything to do with this discussion. Video encoding and display come directly from the physics of electromagnetic waves and their interaction with the human eye. The standards evolved from the study of these physical phenomena. And I wasn't the one who brought the word "entropy" into this. ;) Some people call video compression "entropy encoding". When u touch the dragon of entropy it will only get bigger, never smaller. Lossy encoding can be considered a form of allowing entropy to grow. Everything's information anyway, matter, energy it's all information (ref: implicate order). Data compression? This forum has nothing to do with data compression. U might want to visit a RAR or BZ2 forum...:cool:
Anyway, it's good that u have a degree. Let's not let this fall into a stupid tomâto, tomäto argument.
LA
virus
4th June 2004, 17:11
lordadmira:
ROTFL :D
Really, you should make a job out of posts like these... please keep on enlightening and entertaining us with your stuff. :D
Manao
4th June 2004, 17:45
lordadmira : no, the picture is in greyscale when U = V = 128 ( around 128 at least ). You have the correct mathematical formulas, but in computer science, Y, U and V range from 0 to 255, and your formulas ( RGB to YUV ) give negative U and V. So you have to offset them after multiplying by the matrix.
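A rough Python sketch of that offset, assuming the same coefficients lordadmira posted and a plain +128 shift followed by clamping to 0..255. The function name is hypothetical, and real digital conversions also rescale the chroma so it fits in 8 bits, which this sketch deliberately does not do.

```python
def to_8bit_yuv(r, g, b):
    """Apply the matrix, then shift U and V by 128 and clamp everything to 0..255."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.147 * r - 0.289 * g + 0.436 * b
    v = 0.615 * r - 0.515 * g - 0.100 * b

    def clamp(x):
        return max(0, min(255, int(round(x))))

    return clamp(y), clamp(u + 128), clamp(v + 128)


print(to_8bit_yuv(100, 100, 100))  # a gray: U and V sit at the 128 midpoint
print(to_8bit_yuv(255, 0, 0))      # pure red: V overflows and gets clamped
```

The clamping on pure red is the reason a plain offset is not enough on its own.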
For the rest of your post : entropy in computer science is nothing less than the minimal (considering no prediction schemes are used) mean number of bits per symbol needed to write a series of symbols. Shannon told us 50 years ago that you could not write that series of symbols with a lower mean number of bits per symbol. Entropy coding refers to lossless encoding schemes which allow you to almost reach the entropy while compressing.
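To make that definition concrete, here is a small Python sketch that computes the zeroth-order entropy (mean bits per symbol, no prediction) of a list of symbols. The function name is an illustrative choice, not something from any codec.

```python
from collections import Counter
from math import log2


def zeroth_order_entropy(symbols):
    """Mean number of bits per symbol, treating symbols as independent draws."""
    total = len(symbols)
    if total == 0:
        return 0.0
    counts = Counter(symbols)
    return -sum((n / total) * log2(n / total) for n in counts.values())


flat_plane = [128] * 64         # a constant plane: nothing to encode
noisy_plane = list(range(256))  # every 8-bit value equally likely
print(zeroth_order_entropy(flat_plane))
print(zeroth_order_entropy(noisy_plane))
```

A plane with little variation scores close to 0 bits per symbol, while a uniformly spread 8-bit plane needs the full 8 bits.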
lordadmira
4th June 2004, 20:01
Originally posted by Manao
lordadmira : no, the picture is in greyscale when U = V = 128 ( around 128 at least ). You have the correct mathematical formulas, but in computer science, Y, U and V are ranged from 0 to 255, and your formula ( RGB to YUV ) give negative U and V. So you have to offset them after multiplying them by a matrix.
I based my statements off the RGB -> YUV equations. I knew that a remapping had to take place to put U and V into 8 bits unsigned. Y has a natural range of 0 to 255, U -111.2 to +111.2, V -156.8 to +156.8. V's range exceeds 256 values so it has to be smashed into that range. U and V can never have their minimum values simultaneously, that yields invalid RGB values no matter what Y is. R, G, and B can have values independent of each other. Y, U, and V cannot take arbitrary values, since each one is constrained by the other two.
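Those ranges can be checked with a short Python sketch. Since U and V are linear in R, G and B, their extremes fall on the corners of the RGB cube, so it is enough to try the eight corners. The helper names are hypothetical and the coefficients are the ones posted earlier in the thread.

```python
from itertools import product


def u_of(r, g, b):
    return -0.147 * r - 0.289 * g + 0.436 * b


def v_of(r, g, b):
    return 0.615 * r - 0.515 * g - 0.100 * b


# The eight corners of the 8-bit RGB cube.
corners = list(product((0, 255), repeat=3))

u_min_corner = min(corners, key=lambda c: u_of(*c))
u_max_corner = max(corners, key=lambda c: u_of(*c))
v_min_corner = min(corners, key=lambda c: v_of(*c))
v_max_corner = max(corners, key=lambda c: v_of(*c))

print("U range:", u_of(*u_min_corner), "to", u_of(*u_max_corner))
print("V range:", v_of(*v_min_corner), "to", v_of(*v_max_corner))
print("U min at", u_min_corner, "/ V min at", v_min_corner)
```

The two minima come from different RGB corners, which is why no valid RGB color has both U and V at their minimum.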
For the rest of your post : entropy in computer science is nothing less than the minimal (consi........................<snip>
I don't see what any of that has to do with this thread, u should address that to virus, he's the one that brought it up. No reply necessary, but I disagree with that definition of entropy.
Hmm, somewhere does any of this have anything to do with the failure of YUV color spaces to handle gradients? :D
LA
OCedHrt
13th July 2004, 20:58
Originally posted by Manao
What bothers me, however, is that if I open the avi directly with virtual dub, or with avisynth, ( AVISource("foo.avi", pixel_type="YV12") ), I get different results.
Edit : for the number of color : somebody ( something ) put a limiter somewhere ( hence reducing the luma range to 16-235 which, omg, makes only 219 available grey levels )
I think that has to do with what I just read on avisynth's site:
This parameter will extend the YUV range from 16-235 (this is the range used by all avisynth converters) to 0-255.
Wilbert
13th July 2004, 21:53
@Manao,
Sorry, I have nothing useful to say about your problem.
@lordadmira,
No no no!! YUV isn't a way to get better compression it's a way to cheat on bandwidth. YUV's only "advantage", if it can be called that, is that it allows for chroma subsampling of an image. That's fancy tech speak for chinsing on the color bandwidth. A hack. Now there would be nothing wrong with a full range full sample 4:4:4 YUV colorspace. But thanks to bean counters we have abominations like YV12, 4:2:0, 4:1:1. These non pure YUV spaces make certain YUV <-> RGB conversions lossy.
That's true, but note that many color format conversions are lossy. RGB <-> HSV is also lossy.
Anyway, chroma subsampling is not a hack. The eye is less sensitive to changes in chroma than to changes in luma. So, why not use that? It enables us to put more info on a DVD. Of course, if your source/target is YUV you shouldn't convert to RGB in the first place.
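As a rough illustration of what the subsampling does to a chroma plane, here is a Python sketch that averages each 2x2 block (as in 4:2:0) and then blows the result back up by pixel replication. The averaging and replication filters are assumptions made for the example; real encoders and decoders use a variety of filters. Plane dimensions are assumed to be even.

```python
def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (even width and height assumed)."""
    h, w = len(plane), len(plane[0])
    return [
        [(plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]


def upsample_nearest(small):
    """Restore full resolution by repeating every sample 2x2 times."""
    out = []
    for row in small:
        wide = [v for v in row for _ in (0, 1)]
        out.append(wide)
        out.append(list(wide))
    return out


flat = [[128] * 4 for _ in range(4)]             # no chroma detail (gray image)
stripes = [[0, 255, 0, 255] for _ in range(4)]   # one-pixel-wide chroma stripes

print(upsample_nearest(subsample_420(flat)) == flat)        # detail-free plane survives
print(upsample_nearest(subsample_420(stripes)) == stripes)  # fine chroma detail is lost
```

A flat chroma plane comes back unchanged, while one-pixel chroma detail is averaged away, which is the lossy part of a color RGB -> YV12 conversion.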
Yes a *color* RGB -> YV12 conversion will always be lossy since YV12 subsamples the chroma planes. U can see that in ur YV12 image, those are the subsampling artifacts. The B&W examples before should not have been lossy going from RGB -> YV12 because there is no chroma information. I don't know why that AviSynth function was mapping the 256 RGB grays into 219 Y grays. It's probably some attempt to compensate for the fact that each RGB channel contributes differently to the intensity of the gray. Ur compression tests on the gradients make sense since Mpeg4 can't replicate a gradient. It transforms images into waveforms, waves that must rise, peak, and fall. A gradient has no fall so it cannot be encoded.
The main problem is that you have YUV [0,255], YUV [16,235], RGB [0,255] and RGB [16,235].
MPEG1/2/4 is all YV12 [16,235]. That's why AviSynth (and most codecs) change the luma range when converting from RGB to YUV. Of course, it would be nice if an RGB->YUV conversion that leaves the luma range unchanged existed in AviSynth. But that is only useful for "artificial" purposes.
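A small Python sketch of that range change for grays, assuming a simple linear scale from 0..255 to 16..235 with rounding to nearest. The function name is hypothetical and this is not AviSynth's actual code.

```python
def full_to_limited(gray):
    """Map a full-range gray (0..255) to limited-range luma (16..235)."""
    return 16 + int(219 * gray / 255 + 0.5)


# Collect every distinct luma code produced by the 256 input grays.
levels = {full_to_limited(g) for g in range(256)}
print(min(levels), max(levels), len(levels))
```

The 256 input grays collapse onto the codes 16 through 235, so some neighbouring grays end up with the same luma value. The "219" quoted in this thread is the width of that range (235 - 16); counting both endpoints there are 220 codes.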
Note that some MJPEG codecs keep the luma range fixed when converting YUV to RGB, giving RGB [16,235] (assuming your YUV clip is also [16,235]).
The main problem is just that there are different color spaces being used, and we need to convert between them sometimes. An additional problem is the two luma ranges floating around.