
View Full Version : Taking submissions for a small encoder comparison



Biggiesized
12th May 2010, 00:15
btw, in these VC-1 encodes, I don't see them using the max GOP of 250?
PSE is limited to 120 frames as the maximum GOP length for custom encodes.

Biggiesized
12th May 2010, 00:25
Can anyone encode via broadcast H.264 encoders (in the EU)? Their bit rate ceiling will be much lower but I'm curious how good it will look.

kieranrk
12th May 2010, 09:25
Can anyone encode via broadcast H.264 encoders (in the EU)? Their bit rate ceiling will be much lower but I'm curious how good it will look.

I doubt many people will have 1080p50 broadcast encoders.

roozhou
12th May 2010, 10:20
Real Video 4 (erv4.dll 10.0.0.2) using Easy RealMedia Producer

EHQ = 100
BFrames = 3

http://www.mediafire.com/?qmmddcnzztv

kolak
13th May 2010, 19:54
Stuff I already have:

x264 (PSNR-optimized and psy-optimized)
VP7
Theora (Thusnelda and Ptalabvorm)
Dirac (through Schroedinger)
Xvid
ffmpeg mpeg-4
ffmpeg mpeg-1
Bink
SVQ1
SIF1
CudaH264Enc
Ateme (v1.5 and v2.0)
Elecard
Samsung+BBC H.265 proposal
Mainconcept 8.5
Microsoft VC-1 SDK



Where can I download these files?


Andrew

Dark Shikari
13th May 2010, 20:10
Nowhere yet. Wait until the results are published.

Atak_Snajpera
13th May 2010, 20:18
Nowhere yet. Wait until the results are published.
Will it be a "blind test"? Similar to how audio compression is judged?

Dark Shikari
13th May 2010, 20:22
Will it be a "blind test"? Similar to how audio compression is judged?

Here's the planned instructions.

To avoid skewing the results, the most absurd comedy options will probably be omitted. The rest will be posted for users to view (with no names, of course).

"Comparison" should be done by tabbing back and forth in one's browser.

1. Quickly glance over all the images to get an idea of how the quality varies. This will allow you to calibrate your rating scale--to get an idea of how good "good" is and how bad "bad" is.

2. Go through the list rating each on a scale from 1 to 10, 10 being perfect quality. Fractional values are allowed.

3. Go again through the list, but this time in the order of your ratings (from lowest to highest). Compare each pair of neighboring images in said order. Revise your ratings if necessary. For example, it might turn out that something you rated 5.8 actually looks better than something you rated 6.3 when you put the two side-by-side.

4. If you changed any of your ratings in step 3) in such a fashion that it changed the order of your ratings, check again. Repeat until you settle on an order you agree with.

5. Post the results.

Should take 10-20 minutes, I would guess.
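
Steps 3 and 4 of the instructions above can be sketched in a few lines of Python. This is only an illustration of the procedure, not a tool used in the comparison; the function names and the example image labels are made up.

```python
def neighbour_pairs(ratings):
    """Return the image pairs to compare side by side (step 3).

    ratings maps an image label to its score; pairs are neighbours
    in rating order, lowest first.
    """
    ordered = sorted(ratings, key=ratings.get)
    return list(zip(ordered, ordered[1:]))


def order_changed(before, after):
    """True if revised ratings changed the ranking, so step 4 asks for another pass."""
    return sorted(before, key=before.get) != sorted(after, key=after.get)


# Hypothetical first-pass ratings for three anonymous images.
first = {"img_a": 6.3, "img_b": 5.8, "img_c": 8.0}
print(neighbour_pairs(first))

# After the side-by-side check, img_b turns out to look better than img_a.
revised = dict(first, img_b=6.5)
print(order_changed(first, revised))
```

Repeating the two calls until `order_changed` returns False corresponds to "repeat until you settle on an order you agree with".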

kolak
13th May 2010, 20:24
Nowhere yet. Wait until the results are published.

But I already have video from other encoders.
We're going to have all the files anyway as proof, so it doesn't matter.

What decodes the source file: ffdshow?

Andrew

Dark Shikari
13th May 2010, 20:26
But I already have video from other encoders.
We're going to have all the files anyway as proof, so it doesn't matter.

What decodes the source file: ffdshow?

Andrew

I'd rather not post stuff too early because that could skew the results of a blind test if too many people look at it early.

I'm using ffmpeg for decoding whenever possible and Virtualdub for cases of proprietary encoders like SIF1, VP7, etc.

kolak
13th May 2010, 20:28
Here's the planned instructions.

To avoid skewing the results, the most absurd comedy options will probably be omitted. The rest will be posted for users to view (with no names, of course).

"Comparison" should be done by tabbing back and forth in one's browser.

1. Quickly glance over all the images to get an idea of how the quality varies. This will allow you to calibrate your rating scale--to get an idea of how good "good" is and how bad "bad" is.

2. Go through the list rating each on a scale from 1 to 10, 10 being perfect quality. Fractional values are allowed.

3. Go again through the list, but this time in the order of your ratings (from lowest to highest). Compare each pair of neighboring images in said order. Revise your ratings if necessary. For example, it might turn out that something you rated 5.8 actually looks better than something you rated 6.3 when you put the two side-by-side.

4. If you changed any of your ratings in step 3) in such a fashion that it changed the order of your ratings, check again. Repeat until you settle on an order you agree with.

5. Post the results.

Should take 10-20 minutes, I would guess.

There have to be quite a few random frames to avoid I vs. B frame comparisons. It would be good to have a few consecutive frames.
We also need the files available so we can watch them in motion.

Andrew

kolak
13th May 2010, 20:30
I'd rather not post stuff too early because that could skew the results of a blind test if too many people look at it early.

I'm using ffmpeg for decoding whenever possible and Virtualdub for cases of proprietary encoders like SIF1, VP7, etc.

...but how do I decode the source file: ffdshow, VDub?


Andrew

Dark Shikari
13th May 2010, 20:32
...but how do I decode the source file: ffdshow, VDub?


Andrew

As I said, I use ffmpeg to decode the source file, except where no open source decoder is available, in which case I use the one they give me as part of the codec.

There have to be quite a few random frames to avoid I vs. B frame comparisons. It would be good to have a few consecutive frames.
We also need the files available so we can watch them in motion.

Andrew

That would be nice, but also a lot of work, which most people who are doing the comparison won't do, so it won't help the results either.

It certainly won't bias in favor of x264, since I intentionally picked a frame that was a B-frame in x264's encode ;)

kolak
13th May 2010, 20:47
As I said, I use ffmpeg to decode the source file, except where no open source decoder is available, in which case I use the one they give me as part of the codec.

That would be nice, but also a lot of work, which most people who are doing the comparison won't do, so it won't help the results either.

It certainly won't bias in favor of x264, since I intentionally picked a frame that was a B-frame in x264's encode ;)

Frames should be chosen randomly, not according to some (whatever) rules.

No files, no fair comparison, but I would still like to see the results :)


Andrew

Dark Shikari
13th May 2010, 20:55
Frames should be chosen randomly, not according to some (whatever) rules.

No files, no fair comparison, but I would still like to see the results :)

If you want, I can include the files in the comparison to allow people to choose based on them, but I doubt most people will use them.

And there are reasons to have "rules" for picking frames. For example, don't pick one immediately after an I-frame (biases towards encoders that use overly high-quality I-frames).
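
The two positions here (random frames, but with rules) are compatible. Below is a minimal Python sketch that picks frames at random while skipping any frame that is an I-frame, or directly follows one, in any of the encodes. The input format (one string of frame-type letters per encoder, in display order) and the function name are assumptions made for this example; excluding the I-frames themselves is also an assumption, not something stated in the thread.

```python
import random


def pick_frames(frame_types_per_encoder, count, skip_after_i=1, seed=None):
    """Pick `count` random frame indices that are safe for every encoder.

    frame_types_per_encoder: list of strings such as "IPBBP...", one
    character per frame in display order (hypothetical input format).
    skip_after_i: how many frames after each I-frame to exclude as well.
    """
    n = min(len(types) for types in frame_types_per_encoder)
    banned = set()
    for types in frame_types_per_encoder:
        for i, frame_type in enumerate(types[:n]):
            if frame_type == "I":
                # Exclude the I-frame and the frames right after it.
                banned.update(range(i, i + skip_after_i + 1))
    candidates = [i for i in range(n) if i not in banned]
    rng = random.Random(seed)
    return sorted(rng.sample(candidates, min(count, len(candidates))))


# Two made-up frame-type sequences from two different encoders.
encodes = ["IPBBPBBPBB", "IBPBPIBPBP"]
print(pick_frames(encodes, count=3, seed=0))
```

Passing a fixed `seed` makes the selection reproducible, so others can check that the frames were not hand-picked.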

kolak
13th May 2010, 21:03
If you want, I can include the files in the comparison to allow people to choose based on them, but I doubt most people will use them.

And there are reasons to have "rules" for picking frames. For example, don't pick one immediately after an I-frame (biases towards encoders that use overly high-quality I-frames).

Exactly, so that's why you choose random frames :)
Different encoders will have different "better" frames, so a random choice will average them out.

Anyway, let's wait for the results.

Andrew

Boolsheet
13th May 2010, 21:04
and Virtualdub
Careful with the yuv->rgb conversions in VirtualDub, it's slightly different than ffmpeg.

shon3i
13th May 2010, 21:05
I think there is no problem with this source since all encoders use 2-3 I frames, so i think comparing will be fair enough

kolak
13th May 2010, 21:09
I think there is no problem with this source since all encoders use 2-3 I frames, so i think comparing will be fair enough

Yes, the GOP size is massive and the source file is very short.


Andrew

Dark Shikari
13th May 2010, 21:11
Careful with the yuv->rgb conversions in VirtualDub, it's slightly different than ffmpeg.

Yeah, not sure quite what to do here. ffmpeg is defaulting to BT.601, which is wrong, but it's fine as long as every single clip uses the same wrong conversion, since the difference is pretty minor.

Boolsheet
13th May 2010, 21:37
Yeah, not sure quite what to do here.

You can save the frame to avi in raw yv12. VirtualDub doesn't touch the frame if you set the color depth for input and output to yv12.

poisondeathray
13th May 2010, 21:42
another option would be to use avisynth and avsp to take screenshots decoded as rec709 by using converttorgb(matrix="rec709"). This way everything back to the original film transfer should have rec709 preserved (assuming the various encoders did things correctly as well)

but I agree as long as it's consistently done it shouldn't matter too much
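
To get a feel for how much the matrix choice matters, here is a small Python sketch that converts one limited-range 8-bit Y'CbCr sample to RGB with both sets of luma coefficients (BT.601: Kr = 0.299, Kb = 0.114; BT.709: Kr = 0.2126, Kb = 0.0722). It illustrates the math only; it is not the code path of ffmpeg, VirtualDub or AviSynth, and the sample value is arbitrary.

```python
BT601 = (0.299, 0.114)    # (Kr, Kb)
BT709 = (0.2126, 0.0722)  # (Kr, Kb)


def ycbcr_to_rgb(y, cb, cr, coeffs):
    """Convert one limited-range 8-bit Y'CbCr sample to 8-bit R'G'B'."""
    kr, kb = coeffs
    kg = 1.0 - kr - kb
    # Normalise: luma 16..235 -> 0..1, chroma 16..240 -> -0.5..0.5
    yn = (y - 16) / 219.0
    pb = (cb - 128) / 224.0
    pr = (cr - 128) / 224.0
    r = yn + 2.0 * (1.0 - kr) * pr
    b = yn + 2.0 * (1.0 - kb) * pb
    g = (yn - kr * r - kb * b) / kg

    def to_8bit(v):
        return max(0, min(255, round(v * 255)))

    return to_8bit(r), to_8bit(g), to_8bit(b)


sample = (100, 90, 160)  # arbitrary coloured pixel
print("BT.601:", ycbcr_to_rgb(*sample, BT601))
print("BT.709:", ycbcr_to_rgb(*sample, BT709))
```

Neutral greys come out identical with both matrices; only coloured pixels shift, which fits the point that a consistent conversion across all clips matters more than which matrix is used.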

Atak_Snajpera
13th May 2010, 22:16
I think we should judge clips in motion instead of static screenshots.

Dark Shikari
13th May 2010, 22:17
I think we should judge clips in motion instead of static screenshots.

It's far easier to measure small differences when comparing screenshots though... I think the best way is to compare screenshots, but to use motion as a supplement. This allows you to catch cases in which it looks much worse in motion than you expected, e.g. if there's a lot of smearing.

Atak_Snajpera
13th May 2010, 22:35
but to use motion as a supplement.
in flash player?

Dark Shikari
13th May 2010, 22:42
in flash player?

No, I'm just going to post them for people to view how they want.

poisondeathray
13th May 2010, 22:42
in flash player?

not all the formats are compatible with flash players

maybe you could do "part A" with screenshots, and "part B" with clips.

but playing back clips might make the "blind" nature difficult, because if the clip extension is ".wmv" or ".rmvb" for example....hmmm I wonder what that is :)

CruNcher
13th May 2010, 22:44
I think the best way is to compare screenshots, but to use motion as a supplement.


But what if one of the encoders uses psy optimizations that rely on motion, so that the effect isn't visible in a single screenshot?
I agree with kolak that the files need to be released; nothing else really matters then. By the way, kolak, the most important ones are already posted here and can be compared already :)
Of the H.264 ones, people expect to see the difference between Ateme, Mainconcept, Nvidia and x264, though the Ateme encode was done with the oldest encoder version.
The subjectively best of those should then be compared against the non-MPEG stuff, especially Dirac, VP7, pre-H.265 and SIF1, where VP7 and pre-H.265 are obviously the most important for predicting VP8 as a possible future MPEG contender, and MPEG's own future :)
Also, if I'm not wrong, Ateme, Mainconcept, Elecard and maybe also Nvidia use VBV restrictions by default, which don't apply to x264 in its default configuration.

Atak_Snajpera
13th May 2010, 23:12
but playing back clips might make the "blind" nature difficult, because if the clip extension is ".wmv" or ".rmvb" for example....hmmm I wonder what that is
I think we should create a simple app/GUI for this. Files would be stored without extensions. The order would be randomized by the GUI. Clips would be viewed in ffplay, for example. Results would be automatically sent to an email address.
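
The file-handling part of that idea could look roughly like the Python sketch below: copy each clip to an extensionless, numbered name in shuffled order, and keep the mapping back to the originals private until the ratings are in. The function name, the `clip_NN` naming and the file names are invented for this example; players such as ffplay generally detect the container from the file contents rather than the extension, but that is worth verifying per format.

```python
import random
import shutil
from pathlib import Path


def make_blind_set(clips, out_dir, seed=None):
    """Copy clips to extensionless names in random order.

    Returns the secret key: {blind_name: original_path}.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    order = list(clips)
    random.Random(seed).shuffle(order)
    key = {}
    for n, src in enumerate(order, start=1):
        blind_name = f"clip_{n:02d}"  # no extension on purpose
        shutil.copyfile(src, out / blind_name)
        key[blind_name] = str(src)
    return key


if __name__ == "__main__":
    # Hypothetical file names; replace with the real encodes.
    existing = [p for p in ("enc1.wmv", "enc2.rmvb", "enc3.mkv") if Path(p).exists()]
    print(make_blind_set(existing, "blind_clips"))
```

Renaming alone does not hide the container or codec from a tool that inspects the file contents, so this only blinds casual viewing.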

CruNcher
13th May 2010, 23:14
MSU has such an application, and everwicked developed Video Quality Studio (http://visumalchemia.com/vqstudio/#download) for that purpose, though neither is cross-platform. MSU's doesn't support HDTV in the free version, and everwicked's is rather limited. Of course, neither has an e-mail function for sending out the subjective rating results gathered :)

creamyhorror
14th May 2010, 08:50
I think we should create a simple app/GUI for this. Files would be stored without extensions. The order would be randomized by the GUI. Clips would be viewed in ffplay, for example. Results would be automatically sent to an email address.
Then for safety you'd have to do an encryption of the files so that their type couldn't be easily checked via MediaInfo, and possibly hide their filesize through concatenation/splitting...lots of things to consider :devil:

How about having two neutral/fair/respected parties transcode all the video files to a lossless format, then distributing the full pack via bittorrent? (The two parties would compare hashes of the transcodes to ensure neither did anything wrong.) Would solve the problem of playback codecs and allow for blind(-ish) testing.

Or we could just leave the videos unblinded and do blind testing only for the screencaps.
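
The hash cross-check between the two parties is easy to sketch in Python. This assumes both parties produce bit-identical lossless files (same encoder version and settings); if that cannot be guaranteed, hashing the decoded frames instead would be the more robust variant. The function names are made up for this example.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large clips don't fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mismatched(hashes_a, hashes_b):
    """Names whose hashes differ, or that only one party has."""
    names = set(hashes_a) | set(hashes_b)
    return sorted(n for n in names if hashes_a.get(n) != hashes_b.get(n))


# Each party builds {file_name: file_sha256(path)} and they exchange the dicts.
party_a = {"clip_01": "ab12", "clip_02": "cd34"}
party_b = {"clip_01": "ab12", "clip_02": "ffff"}
print(mismatched(party_a, party_b))
```

An empty result means both transcodes agree file for file.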

julius666
14th May 2010, 12:58
How about having two neutral/fair/respected parties transcode all the video files to a lossless format, then distributing the full pack via bittorrent? Would solve the problem of playback codecs and allow for blind(-ish) testing.

Yeah. The playback speed would be sooo low (we are talking about lossless Full HD content!) that it would be like comparing screenshots... :rolleyes:
And I myself don't want to download ~1 GB just for testing codecs.

I think comparing well-chosen screenshots is a good compromise. And anyone who is interested could download the video after the comparison.

BTW Dark Shikari, is there any codec for H.265? I can't wait to see what H.265 is capable of :)

poisondeathray
14th May 2010, 14:09
BTW Dark Shikari, is there any codec for H.265? I can't wait to see what H.265 is capable of :)


DS wrote a bit about an H.265 proposal in his blog
http://x264dev.multimedia.cx/?p=360

Dark Shikari
14th May 2010, 16:19
My test uses the Samsung-BBC proposal. It's not a 100% fair test, since the proposal encoder doesn't have b-adapt and some other features, but on the other hand, Samsung/BBC cheated the crap out of that encoder (using optimizations they weren't supposed to, optimizing specifically for the test clips they were given, etc), so perhaps it's fair game ;)

creamyhorror
15th May 2010, 05:50
Yeah. The playback speed would be sooo low (we are talking about lossless Full HD content!) that it would be like comparing screenshots... :rolleyes:
Huh? Lossless doesn't mean slow to decode. If anything, it'd likely be faster to decode than a lossy x264 encode of the same material.


And I myself don't want to download ~1 GB just for testing codecs.
1GB isn't much, compared to the x264 Blu-ray.


I think comparing well-chosen screenshots is a good compromise. And anyone who is interested could download the video after the comparison.
Of course. Chosen screenshots are the essential requirement; this video stuff I was referring to is an additional part.

Dark Shikari
15th May 2010, 06:02
Huh? Lossless doesn't mean slow to decode. If anything, it'd likely be faster to decode than a lossy x264 encode of the same material.

Lossless x264 is not fast to decode. Combine that with the disk bottleneck...

And it might be 500MB... per encode. That's a lot of gigabytes to download for all the different sources.
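
The size and disk-bottleneck concerns can be checked with some quick arithmetic. The Python sketch below assumes 8-bit 4:2:0 video (1.5 bytes per pixel), the 1080p50 format mentioned in the thread, and a clip of about 500 frames (roughly 10 seconds); the 500-frame figure is an assumption for illustration.

```python
def raw_yv12_bytes(width, height, frames):
    """Size of uncompressed 8-bit 4:2:0 video: 1.5 bytes per pixel per frame."""
    return width * height * 3 // 2 * frames


WIDTH, HEIGHT, FPS = 1920, 1080, 50
FRAMES = 500  # assumption: about 10 s of 50 fps video

clip_bytes = raw_yv12_bytes(WIDTH, HEIGHT, FRAMES)
per_second = raw_yv12_bytes(WIDTH, HEIGHT, FPS)

print(f"raw clip size:  {clip_bytes / 1e9:.2f} GB")
print(f"raw data rate:  {per_second / 1e6:.1f} MB/s")
print(f"ratio at 500 MB lossless: {clip_bytes / 500e6:.1f}x")
```

Under these assumptions the raw clip is about 1.56 GB and real-time playback of uncompressed frames needs roughly 156 MB/s, so a 500 MB lossless file corresponds to about 3x compression and still demands on the order of 50 MB/s of sustained disk reads.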

creamyhorror
15th May 2010, 10:36
Lossless x264 is not fast to decode. Combine that with the disk bottleneck...
I'm referring to HuffYUV or maybe Ut. But the disk bottleneck might apply, admittedly.

And it might be 500MB... per encode. That's a lot of gigabytes to download for all the different sources.
Ouch, okay.

IgorC
15th May 2010, 16:56
Maybe I'm forgetting something, but what are the reasons and advantages of choosing a 50fps source?

Dark Shikari
15th May 2010, 18:12
Maybe I'm forgetting something, but what are the reasons and advantages of choosing a 50fps source?

All the SVT clips are high-framerate. If you want it to be 25fps, it can be 25fps, just slow it down 2x ;)

KikeG
15th May 2010, 21:57
I know that you know all of this, and I don't know as much as you about video encoding, but wouldn't comparing still frames fail to take into account that some encoding schemes take advantage of the fact that our eyes are less sensitive to artifacts when there is motion? I mean, the already mentioned B-frames issue, for example. I don't know much about x264, but with Xvid's default parameters isolated B-frames look much worse than I or P frames. Also, some codecs take the amount of motion into account to compress the frame more or less, and others don't. So depending on whether you compare high-motion or low-motion frames, this will benefit some encoders or others, a benefit that may not hold when looking at the actual moving video. I also believe some artifacts (moving textures, etc.) are more visible in moving video than in still frames.

I think that if, despite all this, the comparison is based on still frames, several different frames, high and low motion, should be used in order to take these issues into account, and the result for each encoder should be the average result over the different frames.

Also, blind tests always need at least one control or anchor encoder. That is, an encoder whose quality is known and clearly worse or better than the encoders under test, so that differences between encoders can be expressed relative to the anchors. Usually a low anchor is employed.
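
The averaging and anchoring described above can be sketched in Python as follows. The encoder labels, scores and function name are invented for illustration; the only idea taken from the post is "average per encoder over several frames, then express it relative to a known anchor".

```python
from statistics import mean


def summarize(ratings, anchor):
    """Average each encoder's per-frame scores and compare with the anchor.

    ratings: {encoder: [score for frame 1, score for frame 2, ...]}
    Returns {encoder: (mean_score, mean_score - anchor_mean)}.
    """
    means = {enc: mean(scores) for enc, scores in ratings.items()}
    base = means[anchor]
    return {enc: (m, m - base) for enc, m in means.items()}


# Made-up scores for a high-motion and a low-motion frame.
ratings = {
    "low_anchor": [2.0, 3.0],
    "encoder_x": [6.0, 8.0],
    "encoder_y": [5.5, 6.5],
}
for enc, (avg, delta) in summarize(ratings, "low_anchor").items():
    print(f"{enc}: mean {avg:.2f}, {delta:+.2f} vs anchor")
```

Reporting the difference from the anchor makes scores from raters with different personal scales easier to compare.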

CruNcher
15th May 2010, 23:28
Yep, and as the strongest contenders in the H.264 range here are very heavily developed, it will be hard to see differences with just 1 test sequence. It would be better to compare whole encodes against each other, covering different levels of complexity. I know of test sequences where x264 and Mainconcept would lose heavily to Ateme, for example, so I'm not sure it's good to test such heavily R&D'd encoders with just 1 sequence. It gives somewhat of a picture of the state they're in, but not a complete overview. For example, Mainconcept lost heavily on the ParkRun sample some years ago, and x264 also only started to optimize for such complexity cases after that; Ateme had already done so before both of them and won visually by a very big margin ;) (I mean, would it have been fair to compare x264 this way???). Also, comparing psy (look & feel) is hardly possible with just 1 sequence, especially not through screenshot differences, IMHO.
I don't think that picking out every small difference in detail and saying "here x264 does it visually better" is a fair approach, and it surely shouldn't be sold as such. But combined with MSU's results, for example, which should be much better worked out this time, it can give a nice overall picture, though I think Ateme isn't participating again, so 1 major result would still be missing. Anyway, for the average consumer this comparison is also rather useless, as there is no real up-to-date contender to x264 in that space, except maybe currently the DivX Plus-HD Converter ;)

[ReX]
15th May 2010, 23:57
You could take a screenshot of each video every 2secs (or 1.2, 1.5, etc).
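
As a small Python sketch of that suggestion: given a frame rate and an interval in seconds, list the frame numbers to capture. The 498-frame length used in the example is an assumption (about 9.96 s at 50 fps), and the function name is made up.

```python
def sample_frames(total_frames, fps, interval_s):
    """Frame indices spaced interval_s seconds apart, starting at frame 0."""
    step = max(1, round(fps * interval_s))
    return list(range(0, total_frames, step))


print(sample_frames(total_frames=498, fps=50, interval_s=2))
print(sample_frames(total_frames=498, fps=50, interval_s=1.5))
```

A fixed interval can still land on an I-frame in some encode, so it may be worth checking the chosen frames against each encoder's frame types.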

CruNcher
16th May 2010, 00:27
Do you think Hollywood compared solutions this way when they were examining the HD DVD and Blu-ray participants? ;)

Atak_Snajpera
16th May 2010, 12:12
At the moment x264 has only one competitor: Ateme V2. The rest are horrible!
http://img268.imageshack.us/img268/9611/stackhorizontalavssnaps.png

Dark Shikari
16th May 2010, 12:26
At the moment x264 has only one competitor: Ateme V2. The rest are horrible!
http://img268.imageshack.us/img268/9611/stackhorizontalavssnaps.png

IMO Mainconcept 8.5 stands up rather well too. Too bad you can't actually find any products using it.

Atak_Snajpera
16th May 2010, 12:31
x264 is also amazing in intra-only mode. Original vs x264 intra (~250KB) vs mjpeg (~250KB)
http://img22.imageshack.us/img22/7835/new1pe.png (http://img22.imageshack.us/i/new1pe.png/)


CruNcher
16th May 2010, 12:53
x264 is also amazing in intra-only mode. Original vs x264 intra (~250KB) vs mjpeg (~250KB)
http://img22.imageshack.us/img22/7835/new1pe.png (http://img22.imageshack.us/i/new1pe.png/)


except the blocking yeah ;)

CruNcher
16th May 2010, 12:56
IMO Mainconcept 8.5 stands up rather well too. Too bad you can't actually find any products using it.

DivX Plus-HD Converter has been using it for some time now. It seems to be a 2-pass system by default, because the pre-encode stage takes a long time: you see an "Is preparing" message and CPU utilization at 100% (it seems an analysis pass is taking place). So it takes a very long time to get the result compared to a well-balanced x264 CRF encode :)

I'm going to upload the default result that consumers can expect :)

DivX Plus HD Converter result (everything at default, just pushed start)

http://www.mediafire.com/download.php?wjzmnzndwoh


Mediainfo result:

General
Complete name : C:\Dokumente und Einstellungen\Administrator\Eigene Dateien\Eigene Videos\DivX Movies\parkjoy.mkv
Format : Matroska
File size : 17.0 MiB
Duration : 9s 960ms
Overall bit rate : 14.3 Mbps
Writing application : DivXMKVMux 3.4.1.0004
Writing library : libDivXMediaFormat 3.4.1.0004

Video
ID : 1
Format : AVC
Format/Info : Advanced Video Codec
Format profile : High@L4.0
Format settings, CABAC : Yes
Format settings, ReFrames : 4 frames
Muxing mode : Container profile=Unknown@4.0
Codec ID : V_MPEG4/ISO/AVC
Duration : 9s 960ms
Bit rate : 14.0 Mbps
Width : 1 280 pixels
Height : 720 pixels
Display aspect ratio : 16:9
Frame rate : 50.000 fps
Standard : PAL
Resolution : 8 bits
Colorimetry : 4:2:0
Scan type : Progressive
Bits/(Pixel*Frame) : 0.305
Stream size : 16.7 MiB (98%)
Language : English
colour_primaries : BT.709-5, BT.1361, IEC 61966-2-4, SMPTE RP177
transfer_characteristics : BT.709-5, BT.1361
matrix_coefficients : BT.709-5, BT.1361, IEC 61966-2-4 709, SMPTE RP177

So, as Dark_Shikari said, since 1080p at 50 fps is outside the DivX Plus HD specs, it changed the output resolution by default to the next supported one, which is 1280x720p at 50 fps.
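
The Bits/(Pixel*Frame) figure in the report can be reproduced from the other fields: bit rate divided by width times height times frame rate. A minimal Python sketch, using the rounded values shown in the report (the function name is made up):

```python
def bits_per_pixel_frame(bitrate_bps, width, height, fps):
    """Average number of bits spent per pixel per frame."""
    return bitrate_bps / (width * height * fps)


# Values as reported: 14.0 Mbps, 1280x720, 50 fps.
bpp = bits_per_pixel_frame(14.0e6, 1280, 720, 50)
print(f"{bpp:.3f}")
```

This gives roughly 0.304; the small gap to the reported 0.305 presumably comes from the tool working with the unrounded bit rate.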

Dark Shikari
16th May 2010, 13:10
DivX Plus-HD Converter uses it since some time now with a default complexity masking setting
Then why does it suck so much?

Atak_Snajpera
16th May 2010, 13:36
except the blocking yeah
I always convert footage from my AVCHD camcorder using --preset superfast --tune fastdecode --keyint 1. I need deinterlaced footage that can be decoded easily (lower CPU utilization) by Sony Vegas. That's why I cannot use deblocking and CABAC, for example. If you see blocks in x264, you'd better check what you get with MJPEG. Also, don't forget that this is at 200% zoom.