Doom9's Forum > Announcements and Chat > General Discussion

Poll Results: Metrics
Only SSIM is a reliable metric test: 4 votes (9.09%)
Only PSNR is a reliable metric test: 0 votes (0%)
Both SSIM and PSNR are reliable metric tests: 13 votes (29.55%)
I measure quality only by my own visual perception: 27 votes (61.36%)
Voters: 44

27th April 2006, 05:03   #1
IgorC
Registered User
 
Join Date: Apr 2004
Posts: 1,025
Do you believe metric results? Do you agree with them?

Lately, SSIM and OPSNR tests have become even more popular among hobbyists and developers of video compression.

Personally, I have often disagreed with OPSNR metrics: the difference was +/- 0.35-0.5 dB, yet visually I preferred the video with the lower OPSNR, and the difference was noticeable.
My visual perception correlates better with SSIM metrics. I have always agreed with the SSIM results when the difference was noticeable, at least +/- 0.2 SSIM.

Last edited by IgorC; 27th April 2006 at 05:12.
27th April 2006, 05:06   #2
siddharthagandhi
Go Nero Digital
 
Join Date: Jan 2006
Location: Edison, NJ
Posts: 466
Yes, generally the metrics don't lie. Generally, I say.
27th April 2006, 05:24   #3
Mug Funky
interlace this!
 
Join Date: Jun 2003
Location: i'm in ur transfers, addin noise
Posts: 4,267
i agree with the wording that "PSNR and SSIM are reliable metrics", because that statement says nothing about perceptual quality...

i judge quality with my eyes, but numbers are a very good guide. for example, when i do MPEG-2 (which is all the time), i'll do a bitrateview scan and look for the bits where the quants visibly spike. you're unlikely to have quality problems with q2 or q1 footage. but q17 stuff requires visual inspection...

so i guess i'd have to say my quality metric is quantiser rather than PSNR or SSIM, as i hardly ever use either of them.

also, i read a lot of threads with metric comparisons, and PSNR and SSIM are almost always in agreement. they only differ greatly when the visual quality starts getting bad, or in special cases that i can't think of offhand.
__________________
interlace... right or wrong, just deal with it.
27th April 2006, 05:37   #4
foxyshadis
ангел смерти
 
Join Date: Nov 2004
Location: Oceanborn
Posts: 7,598
PSNR has a noise band; from what I can tell it's something like 1-2 dB at low values and more as you go higher. Any two encodes whose scores fall within that band may be better, worse, or identical; the scores are only reliable when they are very different (at which point the visual quality differences are very obvious anyway). SSIM is the same, but with much smaller intervals, which is the only reason it's more useful. Still, fine differences can throw it off, and it isn't really optimized for all the different kinds of artifacts that encoders can cause on different kinds of video.
__________________
There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order. ~ Ed Howdershelt
27th April 2006, 08:44   #5
Doom9
clueless n00b
 
Join Date: Oct 2001
Location: somewhere over the rainbow
Posts: 10,258
c't performs codec tests using the most complex metric of 'em all: SND. Yet in my codec comparisons, the codecs they find to be great do not do so well, hence I trust only my eyes. No matter how much time and maths you invest in modelling the HVS, each and every one of us has a different visual trace and focuses on different areas. The occasional block may be distracting, but personally, if the alternative means a lack of crispness, then when I watch both clips one after another I'll still rate the crisper but blockier one higher than the washed-out one. But I think Joe Average would rate them just the reverse. There are enough people who are still happy with VHS quality after all, and that just looks terrible. Even a lot of DVDs don't look that great, and the fact that directors add arbitrary noise only makes the situation worse.
__________________
For the web's most comprehensive collection of DVD backup guides go to www.doom9.org
27th April 2006, 09:09   #6
GodofaGap
Registered User
 
Join Date: Feb 2006
Posts: 620
I think for an intra-codec comparison (finding out which options are better for one codec) PSNR certainly is a useful value. Except for CQMs, I think any option that is supposed to increase quality will also increase PSNR. Codecs seem to be mostly tuned to achieve the highest PSNR anyway.

The problem with PSNR is that there is no useful scale that correlates with subjective judgments like good, ok, bad, etc. It is very easy to pump up a PSNR score by adding letterboxing (which won't increase quality, of course), and it is also very easy to decrease PSNR by changing the brightness just the slightest bit (which won't really decrease the quality, of course). So if you have codec A with score X and codec B with score Y, it is very difficult to say which one is better, because we lack this scale. Except, of course, when one codec scores idiotically low.
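
To make the letterboxing and brightness points concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the 16x8 synthetic "frame", the fake encode with a +/-2 error on every pixel, the 4-row black bars (assumed to be coded without any error), and the helper names psnr and letterbox. It is not how any real tool measures a clip.

```python
import math

def psnr(reference, encoded, peak=255):
    """PSNR in dB between two equal-sized 8-bit frames given as lists of rows."""
    squared_error = 0
    count = 0
    for ref_row, enc_row in zip(reference, encoded):
        for r, e in zip(ref_row, enc_row):
            squared_error += (r - e) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)

def letterbox(frame, bar_rows=4):
    """Add black bars above and below a frame."""
    bar = [[0] * len(frame[0]) for _ in range(bar_rows)]
    return bar + frame + bar

WIDTH, HEIGHT = 16, 8

# Synthetic source frame with pixel values between 20 and 219.
source = [[(16 * y + 8 * x) % 200 + 20 for x in range(WIDTH)]
          for y in range(HEIGHT)]

# Fake "encode": every pixel is off by +2 or -2 in a checkerboard pattern.
encoded = [[p + (2 if (x + y) % 2 == 0 else -2) for x, p in enumerate(row)]
           for y, row in enumerate(source)]

# No coding error at all, only a global brightness shift of 3 levels.
brightened = [[p + 3 for p in row] for row in source]

plain = psnr(source, encoded)
boxed = psnr(letterbox(source), letterbox(encoded))
shifted = psnr(source, brightened)

print(f"encode vs source:            {plain:.2f} dB")
print(f"same encode, letterboxed:    {boxed:.2f} dB")
print(f"brightness +3, no artifacts: {shifted:.2f} dB")
```

The bars double the pixel count while adding zero error, so the letterboxed score comes out higher although the picture area is coded exactly as before, and the brightness-shifted frame scores lower than the noisy encode.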
27th April 2006, 10:32   #7
Sagittaire
Testeur de codecs
 
Join Date: May 2003
Location: France
Posts: 2,065
PSNR simply compares corresponding pixels: if the encoded video converges closely on the source, then PSNR is high.
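
For reference, this is the usual textbook definition for 8-bit video. The symbols are chosen here for illustration: I is the source frame, K the decoded frame, both m by n pixels.

```latex
\mathrm{MSE} = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}\bigl(I(i,j)-K(i,j)\bigr)^{2},
\qquad
\mathrm{PSNR} = 10\log_{10}\frac{255^{2}}{\mathrm{MSE}}\ \text{dB}
```

Every pixel difference counts the same wherever it sits in the frame, which is why the score only says how close the output is to the input.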


Quote:
It is very easy to pump a PSNR score by adding letterboxing (which won't increase quality of course)
PSNR is not an absolute quality measurement, only a relative test. You must compare codec A vs codec B, or feature A vs feature B, with exactly the same source.


Quote:
it is also very easy to decrease PSNR by changing the brightness just the slightest bit (which won't really decrease the quality of course).
You can also flip the video (horizontally or vertically) or desynchronise the frames and get a very bad PSNR. But in that case the sources are not equivalent. Denoising, sharpening, or tweaking brightness, saturation or contrast decreases PSNR, but these transformations produce a non-equivalent source, and no codec in the world applies such pre-processing to the source, simply because that is not the objective of a video codec. The only objective of a video codec is to achieve the highest possible convergence, for the eyes, between output and input, and PSNR can check that very well.

Quote:
So if you have codec A with score X and codec B with score Y, it is very difficult to say which one is better
Well, PSNR can really easily say:
MPEG4 ASP is better than MPEG2
MPEG4 AVC is better than MPEG4 ASP

To make a PSNR judgement you must use a threshold: if codec A obtains 43.0 dB and codec B obtains 43.1 dB, we can't say that "codec B is better than codec A". But if codec A obtains 43.0 dB and codec B obtains 44.0 dB, we can say with very high probability that "codec B is better than codec A".
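
The threshold rule can be sketched like this. The 0.5 dB default is only an assumption picked for the example; the post only implies that 0.1 dB is too small a gap to call and 1.0 dB is large enough. The function name compare_psnr is made up here.

```python
def compare_psnr(psnr_a, psnr_b, threshold_db=0.5):
    """Return 'A', 'B', or 'inconclusive' for two PSNR scores in dB.

    threshold_db is an assumed uncertainty band, not a measured value.
    """
    diff = psnr_b - psnr_a
    if abs(diff) < threshold_db:
        return "inconclusive"  # scores too close to call either way
    return "B" if diff > 0 else "A"

print(compare_psnr(43.0, 43.1))
print(compare_psnr(43.0, 44.0))
```

The band has to be wider for PSNR than for SSIM if, as argued in this post, PSNR tracks the HVS less well.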

The threshold of uncertainty for PSNR is higher than the threshold of uncertainty for SSIM, simply because SSIM models the HVS better than PSNR does.

You do not believe in the metrics? Well, metric tests of the codecs gave exactly the same result as the last doom9 codec test ... explain that however you want. All the codec developers in the world use metrics to improve their codecs ... explain that however you want.
__________________
Le Sagittaire ... ;-)

1- Ateme AVC or x264
2- VP7 or RV10 only for anime
3- XviD, DivX or WMV9
27th April 2006, 10:41   #8
Doom9
clueless n00b
 
Join Date: Oct 2001
Location: somewhere over the rainbow
Posts: 10,258
Quote:
you do not believe in the metrics : well metric test for codec done exactly the same result that the last doom9 codec test ... try explain that like you want. All the codec developper in the world use metric to improve their codec ... try explain that like you want.
But there are people who disagree with my results. I don't have their eyes, so unless their claims are really outrageous, I can't really say their opinion is wrong.

And I guess I've got SSIM eyes then, because SND doesn't agree with my eyes.
27th April 2006, 10:49   #9
Mug Funky
interlace this!
 
Join Date: Jun 2003
Location: i'm in ur transfers, addin noise
Posts: 4,267
Quote:
Originally Posted by Sagittaire
Well PSNR can really easy say:
MPEG4 ASP is better than MPEG2
MPEG4 AVC is better than MPEG4 ASP
but it can't prove it except under a very limited set of conditions.

it's used by codec developers to improve their codecs because they are doing several small controlled tests - they'll vary one thing and check the metrics. usually it's because the differences are imperceptible that metrics are used - we can't see 1 pixel difference in luma around the edge of a man's glasses while said man is being shot out of a cannon at high shutter-speed, but an objective metric can.

then there's codec optimisation techniques that would make it impractical to use visual inspection: i remember reading here Pengvado's way of finding an optimal command line for (i think) mencoder - construct a genetic algorithm with bitrate and PSNR as selection criteria, then breed the settings together. doing that with visual inspection would yield possibly stupid and superfluous settings, but using PSNR helped remove the crap. contrast this with LAME's old "r3mix" preset which was a mixture of good sense and stupidity. however the analogy ends there because objective metrics mean even less to audio, where blind testing and statistical analysis is the only way to trust one's results.
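
a rough sketch of what "breeding settings" with bitrate and PSNR as selection criteria could look like. this is not Pengvado's actual method: the setting names, their ranges, the fake_encode stand-in (a real run would call the encoder and measure both numbers) and the fitness weights are all made up for illustration.

```python
import random

# Hypothetical encoder settings and allowed ranges (illustrative only,
# not real mencoder options).
RANGES = {"setting_a": (0, 4), "setting_b": (1, 8), "setting_c": (1, 6)}

def fake_encode(settings):
    """Stand-in for a real encode: returns (bitrate_kbps, psnr_db)."""
    psnr = (38.0 + 0.3 * settings["setting_a"]
            + 0.2 * settings["setting_b"] + 0.4 * settings["setting_c"])
    bitrate = 1000 - 20 * settings["setting_a"] - 5 * settings["setting_b"]
    return bitrate, psnr

def fitness(settings):
    """Reward PSNR, penalise bitrate (the weighting is arbitrary)."""
    bitrate, psnr = fake_encode(settings)
    return psnr - bitrate / 1000.0

def random_settings(rng):
    return {name: rng.randint(lo, hi) for name, (lo, hi) in RANGES.items()}

def breed(a, b, rng, mutation_rate=0.1):
    """Take each setting from one parent at random, then maybe mutate it."""
    child = {name: rng.choice((a[name], b[name])) for name in RANGES}
    for name, (lo, hi) in RANGES.items():
        if rng.random() < mutation_rate:
            child[name] = rng.randint(lo, hi)
    return child

def evolve(generations=20, size=12, seed=0):
    rng = random.Random(seed)
    population = [random_settings(rng) for _ in range(size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: size // 2]  # the fitter half survives unchanged
        children = [breed(rng.choice(parents), rng.choice(parents), rng)
                    for _ in range(size - len(parents))]
        population = parents + children
    return max(population, key=fitness)

best = evolve()
print(best, round(fitness(best), 3))
```

because the fitter half always survives, the best score never gets worse from one generation to the next. with a real encoder every fitness call is a full encode, which is exactly why nobody would do this by eye.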

it would be difficult to compare AVC with ASP when the loop filter is turned on - perceptual quality increases and PSNR drops.

and as psychovisual modelling in codecs becomes more sophisticated, we will expect to see more of this kind of thing happening - PSNR going down while perceptual quality goes up.

Last edited by Mug Funky; 27th April 2006 at 10:52.
27th April 2006, 12:31   #10
Sagittaire
Testeur de codecs
 
Join Date: May 2003
Location: France
Posts: 2,065
Quote:
But there are people who disagree with my results I don't have their eyes so unless claims are really outrageous, I can't really say their opinion is wrong.
Well, perhaps in a DivX vs XviD fight, but certainly not in an ASP vs ASP fight.


Quote:
And I gess I got SSIM eyes then because SND doesn't agree with my eyes.
You must use an average over users for the comparison too. If SND does not give the same judgement, then SND is simply a bad metric.


Quote:
it would be difficult to compare AVC with ASP when the loop filter is turned on - perceptual quality increases and PSNR drops.
why ... ???

And the in-loop filter always increases PSNR:
- best size with -2:-2 (constant quant encoding)
- best PSNR with 0:0 (constant quant or constant size encoding)
- best SSIM generally with -1:-1 (constant quant or constant size encoding)
- the best default value for developers is in the [-2;0] interval: x264 uses 0:0, Elecard uses -1:-1 and Ateme uses -2:-2
27th April 2006, 12:42   #11
dragongodz
....
 
Join Date: May 2002
Location: Australia
Posts: 2,789
Quote:
All the codec developper in the world use metric to improve their codec ... try explain that like you want.
lets be clear about why devs can use metrics before trying to espouse them saying metrics = quality. lets go back in the past a little
http://forum.doom9.org/showthread.ph...634#post698634

i give my reply a few posts down about how you misunderstand the statement you use/quote. so yes devs can and do use metrics to try and give clues or ideas where their codec/encoder may need improvement. this does not mean they suddenly think it will tell you what's the best quality all the time or to everybody because that simply is not the case.

EDIT:
oh and PLEASE stop saying silly things like "All the codec developper in the world" since you have absolutely no way of knowing such a thing.
__________________
Narrator: And of course, with the birth of the artist came the inevitable afterbirth - the critic. (History of the World part 1)
27th April 2006, 13:03   #12
Sagittaire
Testeur de codecs
 
Join Date: May 2003
Location: France
Posts: 2,065
Quote:
oh and PLEASE stop saying silly things like "All the codec developper in the world" since you have absolutly no way of knowing such a thing
Here is an exhaustive list:
- Ateme developers
- Elecard developers
- x264 developers
- RV10 developers
- XviD developers
- DivX developers
- VP7 developers
- VC1 developers

... etc etc etc

and most of them use metrics to compare their codecs with the competition.

Quote:
Originally Posted by Babayaga (Nero/Ateme developper MPEG4 AVC)

PSNR at constant quantizer gives a good idea of the raw efficiency since this is what core encoder try to optimize most of the time.
MPSNR/OPSNR add some hints about rate-control and SSIM (JND metrics or whatever) add clues about adaptive quantization.

A codec which is inferior in those 3 metrics will almost surely be seen as subjectively inferior.

Besides, subjective testing is also very difficult to do and is not always reliable.
So far I have no example that contradicts that ... but if you find one, please demonstrate it ...
27th April 2006, 14:10   #13
dragongodz
....
 
Join Date: May 2002
Location: Australia
Posts: 2,789
Quote:
here exhaustive list:
Quote:
... etc etc etc
oh come on, thats plain insulting. when you can list all the codecs and encoders and show where they have said they use metrics then you have a case. until then you do NOT know that they all have. saying "many devs" would be more realistic, not this "all" you have repeatedly used.

Quote:
Originally Posted by Babayaga
i already commented on that quote when you used it before. yes metrics can be used to try and spot problem areas or to try and increase efficiency etc etc and yes the end goal is to have the best quality with best efficiency as possible. that does NOT mean metrics give you an absolute on quality.
take the line
Quote:
A codec which is inferior in those 3 metrics will almost surely be seen as subjectively inferior.
notice a very important word ? ALMOST. thats right , not an absolute.
so please stop pushing that quote as some evidence that metrics will ALWAYS show you whats best quality to everybody because it simply is NOT true.

Quote:
Actually I have no example to contradict that ... but if you find this example make demonstration ...
are you asking me to show where a metric has failed ? if so i already answered that. its in the giant Rejig thread where Makira made a change that increased PSNR but when people started testing it the quality was actually inferior. now you have come back against that saying it cant have been OPSNR etc, well you would have to ask Makira as he was the one who did it. the point is its a real example of a metric failing which you choose to constantly not believe.

i am rather tired of having to say the same things to you about this so i wont bother anymore.
27th April 2006, 15:01   #14
Sagittaire
Testeur de codecs
 
Join Date: May 2003
Location: France
Posts: 2,065
Quote:
notice a very important word ? ALMOST. thats right , not an absolute.
so please stop pushing that quote as some evidence that metrics will ALWAYS show you whats best quality to everybody because it simply is NOT true.
Well, I didn't say always and everyone. I said:
"for a very large majority, with high probability, if there is a substantial difference"

A little example:

At this time it is impossible to say with metrics which is the best between XviD and DivX, simply because the metric results are really close. Many users find that XviD is visually the best and many users find that DivX is visually the best.

-> That's the reality


At this time it is really easy to say with metrics which is the best between ASP and AVC, simply because the metric results are much better for AVC. A large majority of users find that AVC is visually better than ASP by far.

-> That's the reality


Many developers use metrics to compare their codecs with the competition, and for me too "subjective testing is also very difficult to do and is not always reliable", simply because an overall subjective judgement is really, really hard (very easy for me to demonstrate here)

-> That's the reality
27th April 2006, 16:04   #15
Mug Funky
interlace this!
 
Join Date: Jun 2003
Location: i'm in ur transfers, addin noise
Posts: 4,267
i wonder what would look better between an AVC encode with +2,+2 vs an ASP encode with a higher PSNR? obviously this is strongly dependent on the amount of quantization going on, but my point is that the ASP encode would have visible blocking where the AVC wouldn't. on a lot of sources the AVC would look much better in spite of being further away from the original, metric wise.

hopefully one can see that such a metric-based comparison breaks down when it's applied to drastically different technologies.

however (and i suspect this is what codec developers would be doing, though i can't make such a call and i can't be bothered sitting up all night digging up quotes), "quality" metrics are valid and very useful when tweaking settings within the same codec, on the same source. tests of this kind must be controlled or they are invalid and useless (like going xvid + fft3dfilter versus divx + conv3d in a PSNR battle would say nothing about either the codecs or the filters).

also, though i will look at the numbers when doing an encode, nothing goes on a DVD master that hasn't been watched by a human in a controlled environment. though quants/metrics can tell us where the problem areas might be, they won't say whether there's other problems going on. in the case of a total A/V dropout, PSNR can go to -inf, and quants will drop to 1, but that doesn't mean the quality is good because in fact it's a total disaster and will require several hours in photoshop cursing the name of certain equipment manufacturers.

[edit]

Quote:
Originally Posted by Sagittaire
Many developper use metric to compare their codec with concurence and for me too "subjective testing is also very difficult to do and is not always reliable" simply because overall sujective judgement is really really hard (very easy for me to make demonstration here)
if the codecs are subjectively tied, it surely just goes to prove that the metrics have a long way to go in modelling the human visual system! i mean if the metrics say there's 3dB difference when you see none, what on earth does that say about the metrics? such a comparison is meaningless IMHO unless someone is interested in a "my codec is bigger than yours" zealot battle with numbers as ammunition.

Last edited by Mug Funky; 27th April 2006 at 16:08.
27th April 2006, 17:53   #16
Sagittaire
Testeur de codecs
 
Join Date: May 2003
Location: France
Posts: 2,065
Quote:
however (and i suspect this is what codec developers would be doing, though i can't make such a call and i can't be bothered sitting up all night digging up quotes), "quality" metrics are valid and very useful when tweaking settings within the same codec, on the same source. tests of this kind must be controlled or they are invalid and useless (like going xvid + fft3dfilter versus divx + conv3d in a PSNR battle would say nothing about either the codecs or the filters).

also, though i will look at the numbers when doing an encode, nothing goes on a DVD master that hasn't been watched by a human in a controlled environment. though quants/metrics can tell us where the problem areas might be, they wont say whether there's other problems going on. in the case of a total A/V dropout, PSNR can go to -inf, and quants will drop to 1, but that doesn't mean the quality is good because in fact it's a total disaster and will require several hours in photoshop cursing the name of certain equipment manufacturers.
That doesn't mean anything for a metric test: you must always compare exactly equivalent sources. The only objective of a video codec must be to obtain the greatest possible convergence, for the eyes, between input and output. If the input is crap, the objective of the video codec must simply be to produce exactly the same crap video at the output. Improving the quality of the source is not the objective of a video codec ... only pre-processing is for that.


Quote:
i wonder what would look better between an AVC encode with +2,+2 vs an ASP encode with a higher PSNR? obviously this is strongly dependent on the amount of quantization going on, but my point is that the ASP encode would have visible blocking where the AVC wouldn't. on a lot of sources the AVC would look much better in spite of being further away from the original, metric wise.
1) PSNR for AVC is always better than for ASP, and by far, at the same size
2) "result better than original" doesn't mean anything for a video codec

Last edited by Sagittaire; 27th April 2006 at 18:13.
27th April 2006, 18:47   #17
foxyshadis
ангел смерти
 
Join Date: Nov 2004
Location: Oceanborn
Posts: 7,598
Quote:
Originally Posted by Sagittaire
Doen't mean anything for metric test: you must always compare exactly equivalent source. The only objective for video codec must be to obtain the most possible convergence for eyes between input and output. If input is crap objective to video codec must be simply to make exactly the same crap video in output. Improve quality of the source is not the objectif of a video codec ... ... only pre-process for that.

1) PSNR for AVC is always better than ASP and by far at same size
2) "result better than original" doesn't mean anything for video codec
This is why metrics that compare only against the original are fundamentally flawed as arbiters of quality. Being told that a frame has a 20 dB difference from the original is useless when what you really want to know is whether the output is blocky, noisy, ringy, blurry, blended, or has other common video artifacts, whether the original has them or not, as I said in the other thread. There are so many ways that the encoder can change the quality, and different types of reductions can be more or less irritating at different times, in different areas.

I hope someone comes up with a quality evaluator like that someday, it'd be so useful. (For one thing, GUIs could use it to place certain filters in their scripts, and tweak them in certain frame ranges as necessary, extending the trend of automated filtering.)
27th April 2006, 20:16   #18
Sagittaire
Testeur de codecs
 
Join Date: May 2003
Location: France
Posts: 2,065
Quote:
This is why metrics that compare only against the original are fundamentally flawed as arbiters of quality. Being told that a frame has a 20db difference from the original is useless when what you really want to know is if the output is blocking, noisy, ringy, blurry, blended, or has other common video artifacts,
But SSIM does that ...
http://www.cns.nyu.edu/~zwang/files/...demo_lena.html
http://www.cns.nyu.edu/~zwang/files/...demo_blur.html
http://www.cns.nyu.edu/~zwang/files/.../demo_jpg.html
http://www.cns.nyu.edu/~zwang/files/...mo_couple.html
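
For readers who don't follow the links, this is the standard SSIM index as published by Wang et al., written for two image windows x and y. The notation is assumed here rather than taken from this thread: the mu terms are the window means, the sigma-squared terms the variances, sigma_xy the covariance, and C_1, C_2 small stabilising constants derived from the dynamic range L.

```latex
\mathrm{SSIM}(x,y) =
\frac{(2\mu_x\mu_y + C_1)\,(2\sigma_{xy} + C_2)}
     {(\mu_x^{2} + \mu_y^{2} + C_1)\,(\sigma_x^{2} + \sigma_y^{2} + C_2)},
\qquad C_1 = (K_1 L)^{2},\quad C_2 = (K_2 L)^{2}
```

It is computed over local windows and averaged across the frame, so it reacts to changes in local structure and contrast, which a pure pixel-error average like PSNR does not separate out.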

You speak about metrics but you don't use them, or you don't know their real capabilities and limitations ... ;-)
27th April 2006, 23:12   #19
siddharthagandhi
Go Nero Digital
 
Join Date: Jan 2006
Location: Edison, NJ
Posts: 466
Both are reliable tests. SSIM may be better, but not that much. However, they are by no means perfect and they really need to be improved. But they generally should be good.

I am using PSNR for my upcoming codec comparison.
27th April 2006, 23:54   #20
Soulhunter
Leaker!
 
Join Date: Apr 2003
Location: Unknown
Posts: 2,773
I think it always depends on... what you wanna test with the metrics, how you do the test, and how you interpret the results... In some cases metric tests can be reliable, in others not, eh!? ^^


Bye
__________________

For my stuff: My crappy Website (back online) - or - #Videots-United



Last edited by Soulhunter; 28th April 2006 at 00:24.