Custom matrix comparison - V3 - Round 2 - Finished


Soulhunter
16th November 2004, 13:43
Here we go again...


1. How is it supposed to work ???


We did an encode of this source...

Gladiator (PAL/R2) / Chapter No. 19 (3:57 min.)


With this AVS...

LoadPlugin("C:\SOFTS\AVSFILES\MPEG2DEC3.DLL")

Mpeg2Source("gladiator.d2v", idct=2)

Crop(6,76,704,432)


And these settings...

XviD v.1.0.2 (Koepis) / Defaults / 2Pass @ 1600kbps (VHQ4) / Trellis


With 10 different matrices !!!


We took 2 samples of each encode, and named them like this:

CQM_NO_006_CLIP-01.avi


The number after "CQM_NO_" (here 006) indicates the matrix that was used...

The part after that (here CLIP-01) indicates the scene that was used...
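For anyone who wants to sort the downloaded samples by script, the naming scheme can be split mechanically. This is only a minimal Python sketch: the function name and the regular expression are illustrative assumptions, and only the CQM_NO_xxx_CLIP-yy.avi pattern itself comes from the post.

```python
import re

# Pattern for sample names such as "CQM_NO_006_CLIP-01.avi".
# Assumption: three-digit matrix number, two-digit clip number.
SAMPLE_RE = re.compile(r"^CQM_NO_(\d{3})_CLIP-(\d{2})\.avi$")


def parse_sample_name(name):
    """Return (matrix_number, clip_number) as ints, or None if the name does not match."""
    match = SAMPLE_RE.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Calling parse_sample_name("CQM_NO_006_CLIP-01.avi") gives the matrix number and the clip number as a tuple, so the samples can be grouped per clip before watching.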



2. But how to vote ???


Simple, watch the samples under your usual playback conditions...

Once you have a good impression, write down a number between 1-5 !!!

1 = Crap
2 = Bad
3 = Tolerable
4 = Good
5 = Perfect


Then send me your votes like this...

CLIP-01:

CQM_NO_001 = 2.0

CQM_NO_002 = 4.5

CQM_NO_003 = 5.0

etc...


And after some time (maybe 4 weeks), I'll post the results !!!
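The vote format above is regular enough to be collected by script. The following Python sketch parses ballots written in that format and averages them per matrix. The function names are made up for illustration, and the assumption that the final results are plain arithmetic means is mine; the post does not say how the scores were combined.

```python
import re
from collections import defaultdict

# One vote line, e.g. "CQM_NO_002 = 4.5" (a decimal comma is accepted too).
VOTE_RE = re.compile(r"^(CQM_NO_\d{3})\s*=\s*([0-9]+(?:[.,][0-9]+)?)$")


def parse_ballot(text):
    """Parse one ballot into {clip: {matrix: score}}; unknown lines are ignored."""
    ballot = defaultdict(dict)
    clip = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            clip = line[:-1]  # "CLIP-01:" starts a new clip section
            continue
        match = VOTE_RE.match(line)
        if match and clip is not None:
            ballot[clip][match.group(1)] = float(match.group(2).replace(",", "."))
    return dict(ballot)


def mean_scores(ballots, clip):
    """Average each matrix's score for one clip over all ballots that rated it."""
    collected = defaultdict(list)
    for ballot in ballots:
        for matrix, score in ballot.get(clip, {}).items():
            collected[matrix].append(score)
    return {matrix: sum(s) / len(s) for matrix, s in collected.items()}
```

Filler lines such as "etc..." simply fail the pattern and are skipped, so a ballot can be pasted in as received.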



3. How to get the samples ???


1st. Click here... (http://service.gmx.net/mc/m2FEwRtLTUegEj2le4yaKVpUIV1bOT)

2nd. Press this button -> http://img60.exs.cx/img60/6854/bt_gmxmediacenterstarten.gif

3rd. Download the standard or broad-band zip file !!!



4. Finished...

Here (http://forum.doom9.org/showthread.php?s=&postid=578992#post578992) are the results !!!


Bye

kurt
16th November 2004, 14:35
nice, votes will follow soon :D

Teegedeck
16th November 2004, 19:54
Soulhunter, there is something I have to know in order to judge this fairly: Are the 'dancing blocks' in the dark-red wall-painting of the arena already present in the source?

Soulhunter
16th November 2004, 20:48
Nope, they are definitely not visible in the source... (http://img15.exs.cx/img15/4449/Sorce1.png)

Is your monitor very bright, or do you maybe have an LCD ???

Coz I'm not able to see these blocks without raising the brightness !?!


Bye

Teegedeck
16th November 2004, 20:57
A TFT. It is merciless!

Soulhunter
16th November 2004, 21:30
True, recently I was looking for a new monitor @ the local MediaMarkt...

After appraising some of these "frys ya eyes" TFTs, I decided to buy a 21" CRT !!!


Bye

Teegedeck
16th November 2004, 22:06
Now, that was a 21''-mistake! :D
Excuse the OT-talk, I'm gonna stop here.

merlinhd
17th November 2004, 03:00
Hi, I hope what I'm going to say doesn't bother anyone - especially whoever is doing the poll.

The thing is that this is simply not the right way of acquiring this sort of information. I happened to mention this just a few days ago in another thread in this same forum, when talking about another subject. People have a hard time measuring, in absolute terms, characteristics such as quality, beauty, comfort, and so on. It looks like the information being gathered in this poll falls into this category. In general, the numbers obtained won't be really representative, except maybe for the best and worst elements.

On the other hand, it's much easier for people to compare elements in pairs. That is, compare the results of two matrices and simply decide which one is better. Each individual would then only have to compare a few pairs. By adding up all the information obtained, it'd be easy to generate a ranking of matrices, along with other information such as 'how many people think this matrix is the best' and so on.
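The "adding up" step of such a pairwise scheme could look like the Python sketch below. It ranks items by simple win rate, which is only one possible way to aggregate; the function name and the sample votes are hypothetical, and more principled models (Bradley-Terry, for example) exist for the same job.

```python
from collections import Counter


def rank_from_pairs(comparisons):
    """Rank items by win rate; `comparisons` is a list of (winner, loser) tuples."""
    wins = Counter()
    games = Counter()
    for winner, loser in comparisons:
        wins[winner] += 1
        games[winner] += 1
        games[loser] += 1
    # Highest win rate first; ties are broken alphabetically for a stable order.
    return sorted(games, key=lambda item: (-wins[item] / games[item], item))


# Hypothetical votes: each tuple means "first matrix looked better than second".
votes = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]
ranking = rank_from_pairs(votes)
```

With these made-up votes A wins every comparison it appears in, B wins one of three and C none, so the ranking comes out as A, B, C.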

ObiKenobi
17th November 2004, 04:48
OT a bit, but I was gonna ask you, Soulhunter: at what bitrates is your Y.A.C.Q.M matrix recommended? I was gonna do some matrix comparisons for myself and thought I'd give it a try.

Soulhunter
17th November 2004, 15:34
Originally posted by ObiKenobi

OT a bit, but I was gonna ask you, Soulhunter: at what bitrates is your Y.A.C.Q.M matrix recommended? I was gonna do some matrix comparisons for myself and thought I'd give it a try.

Let's say, I use it for anime and cartoon @ medium/high bitrates... ;)


Bye

ObiKenobi
17th November 2004, 16:12
Originally posted by Soulhunter
Let's say, I use it for anime and cartoon @ medium/high bitrates... ;)

Would this be saying it wouldn't be the best for compressing a 90 minute movie to a 1 cd rip? :p

Sharktooth
17th November 2004, 16:19
Originally posted by ObiKenobi
Would this be saying it wouldn't be the best for compressing a 90 minute movie to a 1 cd rip?
Uhm... let me guess... no?

ObiKenobi
17th November 2004, 16:25
Originally posted by Sharktooth
Uhm... let me guess... no?

Figured as much, guess I could still try it out though.

Soulhunter
17th November 2004, 17:19
I suggest using a bitrate of 1220kbps or so...

This way you can put 20 episodes on a DVD blank !!!


EDIT:

Btw, this CQM needs properly denoised anime/cartoon to work correctly...

If you wanna put 90min. on 1CD, have a look @ the 900kbps comparison !!!
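The arithmetic behind figures like these is easy to check. The Python sketch below is only a rough illustration under assumptions that are not in the post: 24-minute episodes, a 4.7 GB (decimal) DVD blank, a 700 MiB CD, and 1220kbps meaning video only.

```python
def size_mb(bitrate_kbps, minutes):
    """Stream size in decimal megabytes for an average bitrate and a duration."""
    return bitrate_kbps * 1000 * minutes * 60 / 8 / 1_000_000


def bitrate_kbps(size_mib, minutes):
    """Average total bitrate (kbit/s) that exactly fills size_mib mebibytes."""
    return size_mib * 1024 * 1024 * 8 / (minutes * 60) / 1000


# Assumed values, not taken from the thread.
EPISODE_MINUTES = 24
DVD_CAPACITY_MB = 4700

dvd_video_mb = 20 * size_mb(1220, EPISODE_MINUTES)  # 20 episodes of video
cd_total_kbps = bitrate_kbps(700, 90)               # 90 minutes on one CD
```

Under those assumptions 20 episodes need about 4392 MB of video, which leaves roughly 300 MB of the DVD for audio, and a 90-minute movie on a single CD has about 1087 kbps in total to share between video and audio.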


Bye

dr.Prozac
17th November 2004, 19:42
Hi !
Soulhunter
This time I'm also going to vote :)
I've got just one question. I would like to describe this custom matrix comparison on our Polish forum and also post the link to download your samples. The users will judge them and send their results to me, and I'll send all of them to you. What do you think?

Teegedeck
17th November 2004, 19:48
BTW, just to underline a change in the testing procedure from the last test: This time it is not anymore 'which clip looks best?' but 'how good does this clip look?'. So if the best clip of the bunch would still look like crap, you'd give it a '1'. :D

Soulhunter
17th November 2004, 19:52
Originally posted by dr.Prozac

I've got just one question. I would like to describe this custom matrix comparison on our Polish forum and also post the link to download your samples. The users will judge them and send their results to me, and I'll send all of them to you. What do you think?

Sure, very nice idea... :)

But I would prefer that they send their results directly to this (EC03@gmx.de) address !!!

Not that I don't trust you, but I wanna keep control of the whole process... ;)


Tia n' Bye

dr.Prozac
17th November 2004, 20:00
But I would prefer that they send their results directly to this address !!!
Ok, this is a good idea ! :)

Soulhunter
23rd November 2004, 14:11
Well, 2 votes after the first week !!!


Bye

Teegedeck
23rd November 2004, 14:27
I was busy... :(

Soulhunter
23rd November 2004, 14:40
Busy X-Mas shopping ???

Yeah, I know this... :D


Bye

Teegedeck
24th November 2004, 12:52
Results sent. :)

Sharktooth
24th November 2004, 13:55
This time i need some more time, i'm a bit busy with my job...

Soulhunter
24th November 2004, 18:48
5 votes... :D


Bye

calinb
25th November 2004, 23:45
Soulhunter, thanks for doing round 2!

My lame attempt at ranking sent :) Without prejudicing others to any particular CQM (hopefully), these are my general thoughts.

On my 50" Sammy DLP, artifacts are my main concern, and squirming textures/surfaces drive me nuts with all the mp4 codecs and CQMs. Neither of these clips contains the worst case for this problem, however (low contrast backgrounds with or without a moving bright object, extreme facial close-ups with skin complexion). Still, there are some moving textures on flesh and background surfaces that are obvious in clip-1. Clip-2 has high motion so it's not a problem. Both clips are pretty blocky with all CQMs too.

The blocks are also less intrusive on the fast motion clip-2, even though it contains more blocks than clip-1 (unless you weight single stepping frame evaluation highly, which I don't). Like most such evaluations, ranking depends on where you look. There are definitely areas where one sample does better than others, but it's usually offset by looking elsewhere--even in the same frame. All in all, I'm waiting for a solid H.264 codec to bring dramatic improvement to my encodings at 1/3 to 1/4 the bitrate of decent mpeg2 source.

Teegedeck
25th November 2004, 23:58
1/what of the MPEG-2 was this anyway? :D

Soulhunter
26th November 2004, 01:12
Originally posted by Teegedeck

1/what of the MPEG-2 was this anyway? :D

Well, here are the source specs...

http://img98.exs.cx/img98/7872/MPEG2Info.png


Bye

Teegedeck
26th November 2004, 08:53
So we have 27% of the original MPEG-2, here. That's less than 1/3, more like 1/4 - and I think the results look pretty good for that. Yes, I also think that at such percentage figures H.264/AVC could yield better results if you want to keep full resolution. But that is theoretical and remains to be tested.

Again, most of the test clips look 'tolerable' but at least to me some look almost 'good' (we shouldn't mention which ones, in order not to influence others that may wish to join the test).

Didée
26th November 2004, 11:26
Originally posted by Teegedeck
Again, most of the test clips look 'tolerable' but at least to me some look almost 'good'

(To bridge the time until I get my voting ready: )

Didn't anybody wonder about the overall quant distribution? For my taste, the 1st clip has come out *heavily* over-quantized, in relation to the 2nd one.

With the good'old "steal a bit from the rich, to give plenty to the poor" strategy, the result would've been better, IMO: While the 2nd appears very OK for the given bitrate, the 1st one looks really bad to my eyes. Not exactly what I would have expected at this bitrate.

See: some prefer a "flat" quant distribution, and some (me) prefer slightly higher quants on hi-mo scenes & lower ones on lo-mo scenes. But here it's the opposite: high motion has gotten lower quants than low motion ... that can't be the right deal !?
(I never got such a thing from XviD - but then, I usually have a 5/5 setting in the CC fields...)

Moreover, for the 1st sample, the size deviations between the matrices are rather big, whereas the sizes for the 2nd sample are consistent.

BTW, both observations are the very same as in the 900kbps comparison!

Who is at fault? XviD's bitrate management on a "too short" sequence? Overflow control in particular?

Teegedeck
26th November 2004, 11:35
Eh? Too short sequences? :) From what I understood, the full movie has been encoded 10 times for this test. It just seems to compress rottingly bad, right? (I don't own this one, so I cannot tell for sure, but Soulhunter is going to let us know the first-pass' filesize at the end of the test - or perhaps even now? *wink*)

Edit: That is why I find it, kind of, a bit *cough* misleading to call it a test at "1600 kbps bitrate". Bitrate is such an eye-catcher and says so little. I expect that this is not at all a typical 2-CD scenario but more like what you could expect from a 1-CD encode. IMHO; if I'm not totally wrong with my assumptions about what the first pass filesize was.

Didée
26th November 2004, 11:44
Well, Soulhunter initially wrote

We did an encode of this source...

Gladiator (PAL/R2) / Chapter No. 19 (3:57 min.)


I understand this as "we encoded just that chapter". If they really did full encodings, then the wording should've been different.

Teegedeck
26th November 2004, 11:48
Whoooops; somewhere along the way I must have gotten this very wrong, indeed!

Edit: BTW, I generally found sample 2 worse than sample 1, as had to be expected. But that verdict is based upon watching it at 1/4th speed(!) which I had to do in order to rate it at all. And which I decidedly think is problematic, and I wrote that in the round-1-thread already. At full speed I wouldn't be able to tell a noteworthy difference between the clips for sample 2 and would rate them higher than the average of the sample 1 clips.

Soulhunter
26th November 2004, 19:57
Originally posted by Didée

I understand this as "we encoded just that chapter".

Yes, we only used chapter No.19 for these 3 rounds !!!

Btw, transferring 180MB via mail was a hard task !!!

Seems mail providers don't like big attachments...


Originally posted by Teegedeck

...let us know the first-pass' filesize at the end of the test - or perhaps even now? *wink*


H.263 - 1st pass

http://img113.exs.cx/img113/4872/H264_1st.png


H.263 - 2nd pass

http://img113.exs.cx/img113/4905/H264_2nd.png


MPEG - 1st pass

http://img113.exs.cx/img113/8643/MPEG_1st.png


MPEG - 2nd pass

http://img113.exs.cx/img113/1333/MPEG_2nd.png


Originally posted by Didée

Who is at fault? XviD's bitrate management on a "too short" sequence?

Well, I could post the stats of a full movie encode !?!

I think this should prove whether the sample was too short...


Bye

calinb
26th November 2004, 21:18
Originally posted by Teegedeck
But that verdict is based upon watching it at 1/4th speed(!) which I had to do in order to rate it at all.
HeHe. Yup, Teegedeck. I had to cheat too--but only half as much as you. ;) I could barely differentiate the clip-2 samples from each other at 1/2 speed and I had a tough time feeling confident about my rankings of clip-2.

CruNcher
27th November 2004, 00:37
H.264 - 1st pass <- ohhh no now DXN knows damn Soulhunter how could ya ;)

Soulhunter
27th November 2004, 02:17
Originally posted by CruNcher

H.264 - 1st pass <- ohhh no now DXN knows damn Soulhunter how could ya ;)

Whoops... :D

Ok, I've corrected my post !!!


Bye

dr.Prozac
28th November 2004, 20:04
Hi !
I've just received one post as a reply to my comparisons on our Polish forum. The user wrote that it is the worst idea to make such comparisons using short fragments of VOBs or just parts of movies. He used very strong words and said that all such comparisons (VHQ, custom matrices, filters, etc.) are incorrect and that their authors are incompetent! In his opinion, a comparison is only proper when comparing full movies or at least 30 minutes of a movie.
His gauche opinion really irritates me :angry:
What do you think about it ?

PiXuS
28th November 2004, 22:41
@dr.Prozac


What do you think about it ?


Grill cheese are good! (or said otherwise... who cares?)

Soulhunter
28th November 2004, 22:43
Well, he could start his own comparison with different settings !!!

But I doubt many people are gonna download 3GB of sample clips... :rolleyes:


Bye

dr.Prozac
28th November 2004, 23:15
@PiXuS
Grill cheese are good! (or said otherwise... who cares?)
Thx for your opinion.

@Soulhunter
Exactly.

Soulhunter
8th December 2004, 19:24
Matrices and appropriate numbers


CQM_NO_001 = H.263
CQM_NO_002 = Andreas 78er
CQM_NO_003 = Standard MPEG
CQM_NO_004 = HVS Better
CQM_NO_005 = HVS Best
CQM_NO_006 = EQM v3LR
CQM_NO_007 = SixOfNine HVS
CQM_NO_008 = Soulhunters v6
CQM_NO_009 = Soulhunters v3
CQM_NO_010 = YACQM



Results from best to worst


CLIP-01:

3,44 = H.263
3,18 = Soulhunters v6
3,02 = HVS Better
2,99 = Soulhunters v3
2,86 = Standard MPEG
2,85 = HVS Best
2,81 = EQM v3LR
2,76 = Andreas 78er
2,76 = YACQM
2,74 = SixOfNine HVS


CLIP-02:

3,20 = H.263
3,11 = Soulhunters v6
2,93 = HVS Better
2,86 = Standard MPEG
2,76 = Soulhunters v3
2,73 = YACQM
2,69 = EQM v3LR
2,64 = Andreas 78er
2,63 = HVS Best
2,54 = SixOfNine HVS



Comments...


- Seems everybody likes H.263... :D

- Wow, my CQMs are better than I thought !?!


Bye

Didée
8th December 2004, 22:18
Again too late, for the 2nd time. Grrr, damn!

Soulhunter, perhaps a small notice 2 or 3 days before deadline, next time ... for those that never get their a** lifted unless things become urgent? :)

Well ...

Conclusion #1: I feel strongly seconded in my preference of never doing a serious encoding without any filtering ;) - After all, the result's mean is *below* "tolerable", given that everyone ranked as supposed.

Conclusion #2: The 10th-best matrix of that comparison was never really meant to be used in the range of double-digit quantizers(HEAVEN HELP!) :)

Conclusion #3: Saggitaire always was right :eek: ... (He always preferred the one that came out 10th-worst in this round :D )


Time to get my hands on that source and look how it should be done in fact.

Soulhunter
8th December 2004, 22:41
Well, I posted the results coz it's been nearly 2 weeks since I got the last vote...


Bye

stephanV
8th December 2004, 23:23
@soulhunter: could you perhaps give some info on the number of votes and the standard deviation of the results?

thank you for your hard work! (and testers too of course :) )

Soulhunter
8th December 2004, 23:32
@ stephanV

1.) Six people voted...

2.) Sorry, but I have no idea what "standard deviation" means !?!


Bye

stephanV
8th December 2004, 23:42
STD is the spread of the results... uhm... it's not really important.
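For readers who have not met the term: the standard deviation measures how far the individual votes scatter around their average. A minimal Python sketch follows, using six hypothetical votes for one sample. The individual votes of this comparison were never posted, so these numbers are made up purely for illustration.

```python
import statistics

# Hypothetical votes from six participants for a single sample (not real data).
votes = [3.0, 3.5, 4.0, 2.5, 3.5, 4.5]

mean = statistics.mean(votes)
# Sample standard deviation (divides by n - 1), the usual choice for a small panel.
spread = statistics.stdev(votes)
# Standard error of the mean: a rough idea of how uncertain the average itself is.
std_error = spread / len(votes) ** 0.5
```

If real ballots scattered like this made-up set, the average of six votes would be uncertain by roughly 0.3 points, which is worth keeping in mind when two matrices differ by less than that.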

calinb
9th December 2004, 00:07
Without doing any statistical analysis (not sure I remember how :)), I think there's pretty good correlation in the rankings between clip1 and clip2. (H.263 won in both clips and SixOfNine HVS was tenth in both clips. The mid-pack placements are ordered similarly in each group--roughly.)

However, I don't know if the correlation exists because some CQMs were universally better or worse, or if the participants had a tendency/desire to place consistent votes when ranking the samples in the clip1 and clip2 groups. Although the participants did not know which CQMs were associated with which sample numbers, they DID know (or assumed) that the numbering was the same in both groups.
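The agreement between the two rankings can be put into a number with a rank correlation. The Python sketch below computes a Spearman-style coefficient (Pearson correlation of the ranks, with tied scores sharing an average rank) from the mean scores posted in the results. The helper names are illustrative, and the coefficient says nothing about why the rankings agree.

```python
def average_ranks(scores):
    """Rank scores from best (1) to worst; tied scores share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Mean scores as posted, in matrix order CQM_NO_001 .. CQM_NO_010.
clip1 = [3.44, 2.76, 2.86, 3.02, 2.85, 2.81, 2.74, 3.18, 2.99, 2.76]
clip2 = [3.20, 2.64, 2.86, 2.93, 2.63, 2.69, 2.54, 3.11, 2.76, 2.73]

rho = pearson(average_ranks(clip1), average_ranks(clip2))
```

For these posted scores rho comes out close to 0.9, which matches the impression that the two clips were ranked very similarly.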

Although it would make tracking slightly more difficult, it would probably be valuable to renumber the samples from one set of clips to the next, i.e., rescramble the deck for each set of clips and don't reveal anything until after the comparison is closed.

Just an idea for next time. Thanks for the interesting comparison once again, Soulhunter :goodpost:
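Rescrambling the numbering per clip is a few lines of script. A possible Python sketch is shown below; the function name, the label format reuse and the seed handling are my own assumptions, not anything the organisers used.

```python
import random


def scramble_labels(matrices, clips, seed=None):
    """Return {clip: {label: matrix}} with an independent random numbering per clip."""
    rng = random.Random(seed)  # pass a seed to be able to reproduce the key later
    key = {}
    for clip in clips:
        shuffled = list(matrices)
        rng.shuffle(shuffled)
        key[clip] = {
            f"CQM_NO_{number:03d}": matrix
            for number, matrix in enumerate(shuffled, start=1)
        }
    return key


# The organiser keeps this key private until the comparison is closed.
secret_key = scramble_labels(
    ["H.263", "Standard MPEG", "YACQM"], ["CLIP-01", "CLIP-02"], seed=2004
)
```

Each clip gets its own shuffle, so a voter cannot carry an opinion about "number 3" from one clip over to the next.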

Teegedeck
9th December 2004, 09:14
Thanks for the test, Soulhunter; interesting results! I ranked some of the matrices differently, but I think it is clear that H.263's virtue for 'low bitrates' has been much undervalued in all the custom-matrix-frenzy.

Which brings me to the point, didn't you want to remind us about the compressibility values when giving the results? :P

Let me again stress this: We are talking about H.263 quantizer at quants 4 to 5. This is low compressibility. We have results that shed light on the choice of quantizer for 1-CD encoding! :)

For 'deeper' analysis of the results:


2 = Bad
3 = Tolerable
4 = Good

All matrices lie in the range between 2.5 and 3.5. As a matter of fact, there is less than one whole mark between the top scorer and the lowest scorer in each test (2.74 - 3.44; 2.54 - 3.20). So they range between 'not bad, but not yet tolerable' (2.54) and 'not yet good, but better than tolerable' (3.44). There were no real over- or underachievers.
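The "less than one whole mark" observation is plain subtraction on the posted results. As a tiny Python check, using only the top and bottom mean scores from the results post (the variable names are illustrative):

```python
# Top and bottom mean scores per clip, taken from the posted results.
results = {
    "CLIP-01": {"best": 3.44, "worst": 2.74},
    "CLIP-02": {"best": 3.20, "worst": 2.54},
}

# Distance between the best and the worst matrix, rounded to two decimals.
spread = {clip: round(r["best"] - r["worst"], 2) for clip, r in results.items()}
```

That gives a spread of 0.70 points for clip 1 and 0.66 points for clip 2 on the 1-5 scale.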

Differences between the matrices were very gradual; you cannot really tell apart groups but the ranking is quite clear and - interestingly - almost the same for both clips (I urge you strongly not to use the same sequence of matrices for the two clips, next time, Soulhunter); h.263 can quite clearly be declared a winner with Soulhunter v.6 as runner-up! Though h.263 clearly did better than the other matrices in the decisive clip 1.

On the other end, the lowest-ranking matrices scored almost the same:
For clip 1
2,76 = Andreas 78er
2,76 = YACQM
2,74 = SixOfNine HVS
score almost the same and are on the last rank within the margin of error. Though
2,86 = Standard MPEG
2,85 = HVS Best
2,81 = EQM v3LR
are not more than 0.1 points from them and didn't do much better.

For clip 2, SixOfNine-HVS takes the last place with 2.54 points, followed by the group of
2,64 = Andreas 78er
2,63 = HVS Best
a mere 0.09 points above it.

The bottom-line is: h.263 seems the choice for 1-CD-type compressibilities. (Though I tried that after this test, and in several instances SixOfNine-HVS looked better than h.263. What now? Am I blind or was the test-sample not representative? It would seem that the look of the content you encode makes for a big difference.)

Edit: I hope you can collect more votes for upcoming tests. People, please participate! Fewer than 10 votes really make it hard to call a test representative.

Didée
9th December 2004, 11:22
Somehow I've the feeling that I'm more ranting in this thread than being constructive :| , but it isn't meant like that. So, first let me say again THANK YOU, Soulhunter and Sharktooth, for putting time and effort into this comparison ... and perhaps for round 3 I'll finally manage to participate right in time ...

However, I want to mention again that the results of sample #1 are badly suited to make any comparisons between the candidates - the differences in bitrate are way too big:
The biggest (h.263) came out with 35% higher bitrate than the smallest (YACQM), which effectively makes a comparison *impossible*. If we "cut out" the biggest & smallest sample, then there remains a bitrate span of 15%, which still is *a lot*. And compared to the mean bitrate over all samples, h.263 had a ~15% advantage, too.
Also note that in sample #1, the way undersized YACQM came in 9th. In sample #2 (where it was scaled correctly), it came in 6th, very close to 5th ... I could imagine that in #1, a correctly scaled YACQM perhaps could've made the jump into the top-3, instead of being the last but one.
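The three percentages mentioned here (largest vs. smallest, span after dropping both extremes, largest vs. mean) can be reproduced for any set of sample sizes. The Python sketch below shows the calculation; the sizes in it are hypothetical and chosen only to exercise the functions, since the real file sizes were not posted in the thread.

```python
def percent_over(value, reference):
    """How many percent `value` lies above `reference`."""
    return (value / reference - 1.0) * 100.0


def size_report(sizes_kb):
    """Summarise the size spread of samples encoded at a nominally equal bitrate."""
    ordered = sorted(sizes_kb.values())
    mean = sum(ordered) / len(ordered)
    return {
        "largest_vs_smallest": percent_over(ordered[-1], ordered[0]),
        # Same comparison after dropping the single largest and smallest sample.
        "trimmed_span": percent_over(ordered[-2], ordered[1]),
        "largest_vs_mean": percent_over(ordered[-1], mean),
    }


# Hypothetical sizes in KB (not the real clips).
example = {"A": 1350, "B": 1150, "C": 1100, "D": 1000}
report = size_report(example)
```

Since the samples of one clip all have the same duration, file size and average bitrate are proportional, so the same percentages apply to both.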

Teegedeck, this should (partly) be an answer to your contradicting "h.263 vs. SixOfNine" observation.

I know that all of this shows a problem of XviD's curve scaling (on short samples) rather than a characteristic of the respective matrices ... but as said earlier, one can hardly test a matrix's performance on its own, only in conjunction with the "matrix/XviD" interaction.

So, the topic keeps being "difficult".

Teegedeck
9th December 2004, 11:48
Ah, yes, I forgot about the filesize discrepancies. This would speak strongly in favour of encoding the whole movie next time, no matter how long it takes, doesn't it? Though I know we currently use all of our CPU resources on that dreaded Recode2 (locks up my PC completely when I try and cancel an encoding session)... ;)

Didée
9th December 2004, 12:37
Originally posted by Teegedeck
we currently use all of our CPU resources on that dreaded Recode2
Me not :)

Perhaps I'll have a look after the first couple of updates has taken place ... for now, the circle of testers is already exquisite ;) - and when you/they've put enough fingers on the weak points, even DAU's like me might get into the swing of AVC ...

Sharktooth
9th December 2004, 13:26
This time i couldn't vote.
I'm too busy with my job and don't have enough time.
However HVS better and EQM V3LR have almost the same coefficients, and scored 3rd and 6th (7th in the second clip) respectively.
Uhm, i doubt there is that "big" difference between the 2...
H.263 is the "softest" sample too... i wonder how ppl have assigned the best score to it.

Teegedeck
9th December 2004, 13:32
And the results also say there isn't: 0.07 points (0.15 in the high motion clip respectively) difference really don't mean anything. :)

(edited: points for 2nd clip)

Soulhunter
9th December 2004, 13:35
Originally posted by Teegedeck

Which brings me to the point, didn't you want to remind us about the compressibility values when giving the results? :P


You mean the h.263/MPEG results ???

I already posted them @ page 2 iirc... :o

Or do you mean the "full movie" encodes ???

Well, Im still waiting (http://img8.exs.cx/img8/7461/hdd8po.png) for my new 200GB HDD... :(

I can't even finish my 30h AVC encode of Reloaded !!!


Originally posted by Teegedeck

This would speak strongly in favour of encoding the whole movie next time...

The difference between the "full chapter" encodes is less than 1% !!!

Only the short "cut out" samples have such a different bitrate... ;)


Bye

Teegedeck
9th December 2004, 13:38
Ah! That would alleviate things. So how long was the chapter you've encoded in the first place?

And: yup, I meant repeating those values in order to remind people, but we have pointed to that clearly enough now, haven't we? ;)

Soulhunter
9th December 2004, 13:46
Originally posted by Teegedeck

Ah! That would alleviate things.

So how long was the chapter you've encoded in the first place?

Originally posted by Soulhunter

We did an encode of this source...


Gladiator (PAL/R2) / Chapter No. 19 (3:57 min.)


Bye

Teegedeck
9th December 2004, 13:47
:o ...