New public listening Tests @128kbps
Kurtnoise
5th December 2005, 19:27
Sebastian Mares has launched a new public blind listening test at 128 kbps.
The contenders:
Nero AAC 3.1.0.2 (http://www.rarewares.org/rja/nero_aac_encoder_v3_1_0_2.rar) VBR/Stereo - Streaming, 100-120 kbps [LC AAC]
iTunes AAC 6.0.1.3 (http://www.apple.com/itunes/) 128 kbps, VBR
LAME 3.97 Beta 2 (http://lame.sourceforge.net/) -V5 --vbr-new
Ogg Vorbis AoTuV 4.51 Beta (http://www.geocities.jp/aoyoume/aotuv/) -q 4.25
WMA Professional 9.1 (http://www.microsoft.com/windows/windowsmedia/default.mspx) Quality-Based VBR, Q50
Shine 0.1.4 (http://www.mp3-tech.org/programmer/encoding.html) (Low Anchor) -b 128
More info on HA (http://www.hydrogenaudio.org/forums/index.php?showtopic=39448) and here (http://www.maresweb.de/listening-tests/mf_128_1.php)...
bond
5th December 2005, 22:16
yeah!
Kurtnoise
20th December 2005, 15:15
bump...;)
Just to say that the test is scheduled to end on December 25th, and you don't need to test all samples to participate...
Kurtnoise
24th December 2005, 00:38
Test extended until the end of January 13th, 2006.
bond
15th January 2006, 15:28
ok the results are in:
http://www.maresweb.de/listening-tests/mf-128-1/results.htm
http://www.maresweb.de/listening-tests/mf-128-1/resultsz2.png
seems all codecs are pretty much on par (which imho means lame performed very well, as mp3 is the "worst" technology)
nero has been disqualified because it had some weird behaviour that made it perform better in an unfair way with no real-life relevance (how exactly this worked out is described here (http://www.maresweb.de/listening-tests/mf-128-1/miscellaneous/nero.txt) and was found by "super-ear" guruboolez). it was a bug, not cheating...
enjoy :)
JohnV
18th January 2006, 09:43
nero has been disqualified because it had some weird behaviour that made it perform better in an unfair way with no real-life relevance (how exactly this worked out is described here (http://www.maresweb.de/listening-tests/mf-128-1/miscellaneous/nero.txt) and was found by "super-ear" guruboolez). it was a bug, not cheating...
enjoy :)
Nero was disqualified because of a bug in bit-reservoir handling. However, it is wrong to say it only made Nero perform better. After about the halfway point of each test track (once the reservoir had drained), the bug actually caused worse performance than the encoder would have delivered without it. But those who conducted the test were afraid that people would rate the samples based only on the beginning of the tracks. Also, the quality issues due to the bit-reservoir bug, which caused slight overcoding in the first half of the samples and slight undercoding in the second half, are not so big that the results could be said to have no real-life relevance; it was just a bit-reservoir bug in the ABR mode after all. We are not even talking about VBR or any psychoacoustic problems here.
If it is assumed (as it should be) that people listen to the whole 30 sec sample and not only half of it, then the Nero score is exactly what it should be for those samples. In that case the score may very well be worse, not better, than it could have been, because the bit-reservoir bug caused "undercoding" after the halfway point of the tracks, and problems and differences from the original (which are of course more likely during undercoding) are the basis for scoring.
I am biased, but imo it was not the right decision to exclude the Nero results because of assumptions about testers' listening habits (that everybody would listen only to the first half of each sample). Imo the whole point of a group test is that factors like this are averaged out across the group.
Moreover, as for "perform better in an unfair way", it has to be remembered that we were using ABR mode. Even during the "unfair overcoding" of the first part, the bitrate there was not high compared to what codecs using VBR mode can do...
Sagittaire
18th January 2006, 13:04
Just my 2 cents ... ;-)
1) Well ... this is not really a 128 Kbps test but a 140 Kbps listening test
Sample (Duration in Seconds)      iTunes    LAME    Nero   AoTuV  WMA Professional
----------------------------------------------------------------------------------
BigYellow (24)                       139     141     139     147     138
bodyheat (25)                        136     146     138     139     143
Carbonelli (17)                      128     121     142     143      92
Coladito (20)                        145     152     140     152     162
DontLetMeBeMisunderstood (30)        143     162     137     163     165
yello (9)                            142     160     152     175     115
Elizabeth (29)                       128     109     137     112     117
eric_clapton (25)                    141     153     138     146     153
ReunionBlues (30)                    144     155     137     143     163
LesJoursHeureux (20)                 136     146     141     180     119
macabre (17)                         133     147     142     149     125
MysteriousTimes (28)                 143     148     137     146     153
ravel (28)                           140     149     137     131     157
School (19)                          144     153     141     163     150
Senor (17)                           135     137     142     132     131
SongForGuy (15)                      133     144     144     161     126
TheDraperyFalls (30)                 138     146     137     156     140
WhiteAmerica (30)                    128     113     137     125     113
----------------------------------------------------------------------------------
Average (22.94)                   137.56  143.44  139.89  147.94  136.78
2) Target bitrate is always a big problem for me
1- Vorbis with rating 4.79 and 147 Kbps for bitrate
2- iTunes with rating 4.74 and 137 Kbps for bitrate
That is a 7% difference in bitrate. By comparison, the RDO advantage for MPEG-4 ASP is only 6% in metric tests (VHQ0 vs VHQ4, for example). A visual test with XviD at 930 Kbps and DivX at 1000 Kbps would certainly not be a good test either ...
I'm really curious to see a test with this:
1- Vorbis with rating X and 127 Kbps for bitrate
2- iTunes with rating 4.74 and 137 Kbps for bitrate
or a test with this:
1- Vorbis with rating X and 116 Kbps for bitrate
2- Vorbis with rating Y and 128 Kbps for bitrate
3- Vorbis with rating Z and 140 Kbps for bitrate
3) Ogg Vorbis has a very good quality preset. Why use q4.25 ... and miss the target size? Why not q4.00 or another setting that hits the target bitrate better ... ???
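The 7% figure in point 2 can be checked with a few lines of Python. This is only a sketch: the helper name `relative_gap_percent` is made up for illustration, and the inputs are the per-codec averages from the table in this post plus the XviD/DivX bitrates used in the analogy.

```python
def relative_gap_percent(bitrate_a, bitrate_b):
    """Return how much higher bitrate_a is than bitrate_b, in percent of bitrate_b."""
    return (bitrate_a - bitrate_b) / bitrate_b * 100


# Averages taken from the table in this post (Kbps).
vorbis_avg = 147.94
itunes_avg = 137.56

# Vorbis vs iTunes in the listening test.
print(round(relative_gap_percent(vorbis_avg, itunes_avg), 1))  # 7.5

# The video analogy: DivX at 1000 Kbps vs XviD at 930 Kbps.
print(round(relative_gap_percent(1000, 930), 1))  # 7.5
```

With the rounded values quoted in the post (147 and 137 Kbps) the gap comes out a little lower, at about 7.3%.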
Gabriel_Bouvigne
19th January 2006, 15:54
Sagittaire, you already asked the same questions on the HA board, and received answers.
Why are you asking exactly the same questions here, as if you had not already received explanations?
Doom9
19th January 2006, 16:59
I don't post at HA;)
Sagittaire does have a point though.. it's kinda unfair if different bitrates are used. For such short samples, rate control (or whatever you want to call it for audio) is apparently far from perfect. Wouldn't it be better to encode at least a song's worth of audio, make sure the average bitrate matches, then pick samples from those songs at strategic (difficult?) locations and not worry about the size of these samples? Obviously that means you risk missing something, but that's a risk I think well worth taking. With bitrate differences of almost 30 Kb/s (way over 20%), that is likely to have a sizeable effect on quality.
Gabriel_Bouvigne
19th January 2006, 17:59
Doom9: Settings were chosen to produce 130kbps overall on a whole music library. (but this is mentioned in the test description)
Bitrate of short samples can of course vary, and is not representative by itself of the bitrate of whole tracks or whole libraries. It is the same thing as the bitrate of a specific scene in a whole movie encoding.
Doom9
20th January 2006, 13:13
Bitrate of short samples can of course vary, and is not representative by itself of the bitrate of whole tracks or whole libraries.
I expected as much, and I did read "Settings were chosen to produce 130kbps overall on a whole music library".. but.. for reproducibility, what does that library consist of, and what was the precise bitrate of the entire library? And.. was the whole library encoded and the samples cut from it, or was the library only used to check that the settings result in the proper size?
I encode a "whole library (movie)" and then look at pieces of it. If I were to encode only the pieces, with the same settings, the result would not be the same. Naturally, part of that is due to two-pass encoding and the encoder knowing the source after the first pass, something which doesn't apply to audio, but in VBR mode an audio encoder has a certain bit reservoir to draw from as well.. if you only have a 9 second sample, you could drain it all, and if we imagine that the encoder had to continue its work after those 9 seconds, having drained its reservoir, second 10 would probably sound horrible.
Gabriel_Bouvigne
20th January 2006, 13:59
The bitrate test was done in the preparation thread on HA. A few people tested the proposed settings and checked the overall bitrate.
Samples were then encoded using those settings, i.e. samples are not extracted from a bigger encoding. The two-pass extraction problem does not occur here, as encoders were only using 1-pass modes. Moreover, the beginning of the encoded samples was not tested, to let encoders adapt themselves to the content, as in a real encoding.
The encoders tested in VBR were not targeting bitrate but quality, thus they are not trying to fit into a given bitrate. Bit reservoir should then not be a problem.
All encoders but the Nero one are publicly available encoders, so we know that they do not show this behavior (spending many bits first then being starved).
However, the only encoder submitted specifically for this test was in fact found to fall into this case. The Nero encoder was then excluded from the test results.
Doom9
20th January 2006, 15:10
Wouldn't quality-based mode result in different filesize (and thus average bitrate) on different content? That's what happens with video in quality-based modes. Do you have a link to the preparatory thread? I can't find it anywhere in the article itself (a major omission if you ask me.. reproducibility is the A&O of any test)
Gabriel_Bouvigne
20th January 2006, 15:22
wouldn't quality based mode result in different filesize (and thus average bitrate) on different content?
Of course. The target bitrate was really overall, and can change according to content type.
http://www.hydrogenaudio.org/forums/index.php?showtopic=38955
http://www.hydrogenaudio.org/forums/index.php?showtopic=38723
reproducibility is the A&O of any test
What does "A&O" mean?
Doom9
20th January 2006, 16:26
What does "A&O" mean?
I used Latin letters instead of Greek ones.. A stands for alpha, O for omega.. aka the beginning and the end, or something crucial. Thanks for the links.
guada 2
21st January 2006, 19:58
" Sebastian Mares has launched a new public listening blind-tests at 128bkps.
The contenders:
Nero AAC 3.1.0.2 VBR/Stereo - Streaming, 100-120 kbps [LC AAC]
iTunes AAC 6.0.1.3 128 kbps, VBR
LAME 3.97 Beta 2 -V5 --vbr-new
Ogg Vorbis AoTuV 4.51 Beta -q 4.25
WMA Professional 9.1 Quality-Based VBR, Q50
Shine 0.1.4 (Low Anchor) -b 128 " (Kurtnoise13)
Just a question:
And SHINE 0.1.4???????
Where is it ?
bond
21st January 2006, 20:23
shine is the low anchor, it's not meant to produce any usable quality
guada 2
21st January 2006, 20:34
@bond
Thank you for the clarification.
Bye.
guada 2
21st January 2006, 20:43
Ogg Vorbis AoTuV 4.51 Beta -q 4.25?
Is it CBR or VBR?
bond
21st January 2006, 20:46
-q is vbr
guada 2
21st January 2006, 20:57
Strange!!!!
q 4.25 vbr = 123 kbps ???
bond
21st January 2006, 21:08
vbr means the bitrate can vary
guada 2
21st January 2006, 21:56
@bond
I understood it well.
But I would prefer to compare the vbr mode of each codec with their size.
What do you think of my comment?
The Link
21st January 2006, 22:09
@bond
I understood it well.
But I would prefer to compare the vbr mode of each codec with their size.
What do you think of my comment?
And what would the result tell us?
guada 2
21st January 2006, 22:20
@link
For me the size of an audio file is very significant, because it determines the clarity of the audio file.
If one must compare audio bitrates, other codecs should be used and not some audio codecs.
It is my point of view.
The Link
21st January 2006, 22:31
@link
For me the size of an audio file is very significant, because it determines the clarity of the audio file.
If one must compare audio bitrates, other codecs should be used and not some audio codecs.
It is my point of view.
I'm not sure if I understood you correctly: You mean the bigger the audio file, the better it sounds? While this might be true if the file size difference is relatively big (using up-to-date codecs), it is not true in a smaller/marginal range. That's the point of testing several different audio samples whose overall mean bitrate is nearly equal: you test how good the psychoacoustic model used in the different codecs is and how well the actual bitrate allocation works for each codec (codecs were tested with their VBR settings). That means bigger audio files might sound significantly worse than smaller ones.
guada 2
21st January 2006, 22:39
The link
For me, it would be interesting to see the file size for each codec at the same bitrate and to compare them.
" vbr means the bitrate can vary ".
Why compare different audio codecs with random bitrates (VBR mode) while forgetting the final size?
The Link
21st January 2006, 22:52
The link
For me, it would be interesting to see the file size for each codec at the same bitrate and to compare them.
" vbr means the bitrate can vary ".
Why compare different audio codecs with random bitrates (VBR mode) while forgetting the final size?
The mean bitrate over all samples used was nearly the same for each codec afaik, and thus so was the total size of all audio files for each codec. Since different samples of different music styles were used, this approach makes sense.
Is it normal that the bitrate is very high on some samples (even 180 kbps)?
Yes, and that is the beauty of VBR encoding - it will simply ignore bitrate limitations whenever possible, using as much bits as needed to encode a problematic sample.
Although that raises issues of fairness, it is the best way to compare modern codecs that shine most in VBR mode, like MP3 and Vorbis. Trying to force a VBR setting to match a desired bitrate, although fairer, is far from the usual practice of audio encoding, where it's more usual that a user just sticks to a quality setting, not caring much about a specific bitrate.
The quality settings for the VBR codecs were chosen because they average out to about 128 kbps over a number of encoded albums. It would be unfair to tie the hands of VBR codecs and punish them for being smart about where to spend what turns out to be the same number of bits over the long run.
(from http://www.maresweb.de/listening-tests/mf-128-1/index.php)
guada 2
21st January 2006, 23:10
A comment:
Perhaps you are right? :)
But there must be some differences between the codecs at the origin.
Look at video codecs. They are not all equivalent. No.....
NOTE: it is not similar but it is like just . :)
Rockaria
22nd January 2006, 00:24
vbr means the bitrate can vary...
It does not mean the VBR bitrate is set to a random value; rather, it is set by a reasonable calculation (moving average) for each frame (segment or sample...).
The rough formula for each sample (not considering the adjacent samples) would be (0 ~ 100% of decoded or original PCM, i.e. 16 bit * 48k) * codec efficiency * q (represented as percentage / 100).
And the codec efficiency and (thus) the bitrate change from segment to segment depending on the signal complexity (as represented in the binary image).
Finally, the overall (average) VBR rate is (file size in bits - extra) / duration (in seconds), which is, I guess, how most meta tag utils show the bitrate of VBR-encoded content.
The codec efficiency mentioned above is my own term for the perfect (lossless, transcode) encoding quality (q == 100%), which varies with each codec; it is not a professional definition. ;)
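The overall-rate formula in this post can be sketched in Python as follows. This is an illustration only, not how any particular tagging tool works: the function name `average_bitrate_kbps` and the `overhead_bytes` parameter (standing in for the "extra" non-audio data such as tags and container headers) are assumptions, and the file size in the usage line is a made-up example.

```python
def average_bitrate_kbps(file_size_bytes, duration_seconds, overhead_bytes=0):
    """Average bitrate in kbps: (file size in bits - extra) / duration."""
    if duration_seconds <= 0:
        raise ValueError("duration must be positive")
    # Subtract the non-audio overhead, then convert bytes to bits.
    audio_bits = (file_size_bytes - overhead_bytes) * 8
    # 1 kbps is taken as 1000 bits per second.
    return audio_bits / duration_seconds / 1000


# Hypothetical 30-second sample: 480,000 bytes on disk, no tags.
print(average_bitrate_kbps(480_000, 30))  # 128.0
```

Because only the total size and duration enter the calculation, two files with the same reported average can still distribute their bits very differently over time.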
Sagittaire
22nd January 2006, 15:13
Well I read that:
http://www.hydrogenaudio.org/forums/index.php?showtopic=38955
1) test procedure
Your test procedure for bitrate is like a "statistical poll". For a good poll you need a large sample size (1000 for a good poll and 10000 for a very, very good poll) with highly representative samples. If your samples are not representative, you need an even larger sample size for a better result.
- You use only 150 samples for your "statistical poll" ... !!!
- Are your samples really representative ... are you sure of that ... ???
2) Average bitrate
Your average bitrate is not correct. You must compute the bitrate differently, taking the length of each sample into consideration.
A little example:
sample1 : 10 sec and 96 Kbps
sample2 : 100 sec and 128 Kbps
sample3 : 1000 sec and 160 Kbps
Overall bitrate is (160*1000 + 128*100 + 96*10) / (1000 + 100 + 10) = 156 Kbps and certainly not ( 160 + 128 + 96 ) / 3 = 128 Kbps
I will try with 1000 samples and a proper bitrate calculation method and see if my results are the same as yours (IMO certainly not ... lol)
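The difference between the two averages in the example above can be reproduced with a short Python sketch. The function names are made up for illustration; the three (duration, bitrate) pairs are the ones from the example.

```python
def simple_mean(samples):
    """Unweighted mean of the per-sample bitrates (ignores duration)."""
    return sum(kbps for _, kbps in samples) / len(samples)


def duration_weighted_mean(samples):
    """Overall bitrate: total kilobits divided by total seconds."""
    total_kilobits = sum(seconds * kbps for seconds, kbps in samples)
    total_seconds = sum(seconds for seconds, _ in samples)
    return total_kilobits / total_seconds


# (duration in seconds, bitrate in Kbps) for sample1..sample3 above.
samples = [(10, 96), (100, 128), (1000, 160)]

print(round(simple_mean(samples), 1))             # 128.0
print(round(duration_weighted_mean(samples), 1))  # 156.5
```

The two numbers only agree when all samples have the same length, or when bitrate and length happen to be unrelated.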