View Full Version : New home-made cross-OS x265 benchmark


Losko
18th February 2022, 11:22
I've run a series of HEVC encodes using x265 on my home laptop, which now has two OSs installed (LMDE (64 bit) (https://www.linuxmint.com/) and Haiku (64bit) (https://www.haiku-os.org/)), both freshly updated.
The source content was the two files provided by Ben here (SolLevante and TearsOfSteel) (https://forum.doom9.org/showthread.php?p=1853595#post1853595), and the command line was as simple as:
x265 --preset slow --crf 22 --input filename --aq-mode 3 --selective-sao 2 --pools 4 --output output.265
On my LMDE (which is Debian 10 (Buster) based) I used two kernel versions, 4 and 5, both available from the repos through Synaptic, and two x265 releases for comparison.

linux 4 means:
Linux debbieb1 4.19.0-18-amd64 #1 SMP Debian 4.19.208-1 (2021-09-29) x86_64 GNU/Linux

linux 5 means:
Linux debbieb1 5.10.0-0.bpo.9-amd64 #1 SMP Debian 5.10.70-1~bpo10+1 (2021-10-10) x86_64 GNU/Linux

haikuDx means:
Haiku shredder 1 hrev55874 Feb 14 2022 09:04:41 x86_64 x86_64 Haiku
(this is a nightly build; Dx is the kernel debug level: 0 (no debug) or 2 (max debug))

On the other hand, x265 3.3 means:
x265 [info]: HEVC encoder version 3.3+10-gd4b5ab60b
x265 [info]: build info [Linux][GCC 8.3.0][64 bit] 8bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2

and x265 3.4 means (on linux):
x265 [info]: HEVC encoder version 3.4.1+1-1827b372c
x265 [info]: build info [Linux][GCC 8.3.0][64 bit] 8bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
and (on haiku):
x265 [info]: HEVC encoder version 3.4.1+1-1827b372c
x265 [info]: build info [Unk-OS][GCC 11.2.0][64 bit] 8bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast LZCNT SSSE3 SSE4.2 AVX FMA3 BMI2 AVX2
(note that the compiler here is the more recent GCC 11.2)

My laptop is equipped with a Core i7 7500U processor (2 cores + HT) and 8 GB (4+4) of Kingston DDR4.
Both source y4m files were copied onto a USB 3.0 stick, which should keep access times from becoming a bottleneck.
I launched each encoding three times after an OS reboot, then took an average of reported fps stats (stat values were substantially constant, though).
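
The averaging step described above could be scripted roughly as follows. This is only a sketch: it assumes the encoder's final summary line has the form "encoded N frames in Ts (F fps), ...", and the log excerpts are made-up placeholders, not measurements from this benchmark.

```python
import re
from statistics import mean

# Assumed shape of x265's final summary line; check it against your build's output.
SUMMARY_RE = re.compile(r"encoded (\d+) frames in ([\d.]+)s \(([\d.]+) fps\)")


def parse_fps(log_text):
    """Return the fps from the last summary line in an x265 log, or None."""
    matches = SUMMARY_RE.findall(log_text)
    return float(matches[-1][2]) if matches else None


def average_fps(logs):
    """Average the fps over several runs, skipping logs without a summary line."""
    values = [fps for fps in map(parse_fps, logs) if fps is not None]
    if not values:
        raise ValueError("no fps summary found in any log")
    return round(mean(values), 2)


# Hypothetical log excerpts for three runs (placeholder numbers, not measured).
sample_logs = [
    "encoded 100 frames in 50.00s (2.00 fps), 1000.00 kb/s, Avg QP:30.00",
    "encoded 100 frames in 40.00s (2.50 fps), 1000.00 kb/s, Avg QP:30.00",
    "encoded 100 frames in 33.33s (3.00 fps), 1000.00 kb/s, Avg QP:30.00",
]
print(average_fps(sample_logs))
```

In practice the logs would come from capturing the encoder's console output for each of the three runs.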

Finally, the numbers:

               | x265 3.4 | x265 3.4 | x265 3.3 | x265 3.4 | x265 3.4 |
               | haikuD2  | haikuD0  | linux 4  | linux 4  | linux 5  |
-----------------------------------------------------------------------
Sol Levante    | 1.19 fps | 1.20 fps | 1.95 fps | 1.94 fps | 1.78 fps |
-----------------------------------------------------------------------
Tears Of Steel | 2.19 fps | 2.23 fps | 3.65 fps | 3.66 fps | 2.26 fps |
-----------------------------------------------------------------------

While the identical performance of x265 3.3 and x265 3.4 on linux 4 is reassuring, I find the drop on linux 5 shocking. It should be noted that, although linux 5 is available in the repositories, the official kernel on Buster is still linux 4, so maybe the newer kernel never received any fine tuning. I had some expectations for Haiku's performance, and they were disappointed. The OS is still in beta (beta4 is expected soon), so I will post these numbers on their forum as well, in the hope they can be helpful.
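
To put the gaps in the table into relative terms, a small sketch like the one below computes how much slower each configuration is than x265 3.4 on linux 4. The fps values are copied from the table; taking linux 4 as the baseline is just a choice made for this illustration.

```python
# fps values copied from the table above (x265 3.4 columns only).
results = {
    "Sol Levante": {"haikuD2": 1.19, "haikuD0": 1.20, "linux 4": 1.94, "linux 5": 1.78},
    "Tears Of Steel": {"haikuD2": 2.19, "haikuD0": 2.23, "linux 4": 3.66, "linux 5": 2.26},
}


def slowdown_percent(fps, baseline_fps):
    """Percentage by which fps falls short of baseline_fps."""
    return round((1 - fps / baseline_fps) * 100, 1)


def slowdowns(table, baseline="linux 4"):
    """Map each clip to the slowdown of every non-baseline configuration."""
    out = {}
    for clip, row in table.items():
        base = row[baseline]
        out[clip] = {
            name: slowdown_percent(fps, base)
            for name, fps in row.items()
            if name != baseline
        }
    return out


for clip, row in slowdowns(results).items():
    for name, pct in row.items():
        print(f"{clip:15s} {name:8s} {pct:5.1f}% slower than linux 4")
```

Seen this way, the linux 5 drop is small on Sol Levante (about 8%) but roughly as large as the Haiku gap on Tears Of Steel (about 38%).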

benwaggoner
18th February 2022, 18:14
It'd be interesting to compare these results to Windows 11 on the same hardware.

As for a benchmark, I might recommend using --preset slower instead of slow. Slower adds some new interesting things for the encoder to chew on, and reduces the odds of IO limitations being a material factor.

I'd probably be using --aq-mode 4 by default these days.

Losko
21st February 2022, 10:17
Since one result of this comparison was noticeable interest on the Haiku forum (here (https://discuss.haiku-os.org/t/new-home-made-cross-os-x265-benchmark/11954)), and my purpose is actually to support their development, I will likely go further and run the benchmark again (read: I will tolerate a longer wait for the encodes at the slower preset).

It'd be interesting to compare these results to Windows 11 on the same hardware.
I'm usually open to installing and trying new OSs, BUT I definitely don't want MS's latest beast to mess up my partitioned HD, so no way.

I'd probably be using --aq-mode 4 by default these days.
I went with --aq-mode 3 because here (https://x265.readthedocs.io/en/master/cli.html) I read that '3' is "recommended for 8-bit encodes", which is what I've done. Are there any improvements with setting '4'?

Losko
28th February 2022, 14:21
Updated table with benchmark run on haiku (custom kernel built with no debugging features).

microchip8
28th February 2022, 15:00
I'd probably be using --aq-mode 4 by default these days.

You're crazy. You should be using aq-mode 1 and high values of psy-rd and psy-rdoq. These are the only optimal values, especially if you don't want banding.

All other aq-modes are sub-optimal. Again, you're crazy

excellentswordfight
28th February 2022, 17:00
You're crazy. You should be using aq-mode 1 and high values of psy-rd and psy-rdoq. These are the only optimal values, especially if you don't want banding.

All other aq-modes are sub-optimal. Again, you're crazy
I actually ran some tests after I saw Ben's post, and I was rather impressed by aq-mode 4, definitely enough so to continue investigating that mode.

For lowish bitrate encoding though...

microchip8
28th February 2022, 17:58
I actually ran some tests after I saw Ben's post, and I was rather impressed by aq-mode 4, definitely enough so to continue investigating that mode.

For lowish bitrate encoding though...

Impressed how? From my own testing, it does little to get rid of banding (10 bit encode) and blows up the bitrate considerably (tested on Blade Runner 2049).

I've done dozens of test encodes on a clean source, testing the different aq-modes, and none of them was able to perform as claimed (all tests on Blade Runner 2049). The only thing that can match the input as closely as possible is aq-mode 1 with high values of psy-rd and psy-rdoq

excellentswordfight
28th February 2022, 20:35
Impressed how? From my own testing, it does little to get rid of banding (10 bit encode) and blows up the bitrate considerably (tested on Blade Runner 2049).

First of all, I mostly did the comparisons against modes 2 & 3, and compared to those there were mostly improvements. Compared to aq-mode 1 it was a bit hit & miss: it was more prone to onion artifacts (banding, but not on flat surfaces, more in sections in motion), but surprisingly some areas had better fine detail/noise/grain with aq-mode 4.

I have no idea what it does to the bitrate as I always test settings with 2pass vbr.

I've done dozens of test encodes on a clean source, testing the different aq-modes, and none of them was able to perform as claimed (all tests on Blade Runner 2049). The only thing that can match the input as closely as possible is aq-mode 1 with high values of psy-rd and psy-rdoq
Yes, and I use aq-mode 1 when I try that as well, but when you do 3-4 Mbps encodes for 1080p, trying to match the input as closely as possible might not produce the most enjoyable image. I haven't really decided yet which version looked best between 1 & 4.

benwaggoner
1st March 2022, 18:23
I have no idea what it does to the bitrate as I always test settings with 2pass vbr.
Which is really the only good way to do it. A parameter change that increases bitrate and increases quality is a lot harder to evaluate in CRF mode. With 2-pass VBR, you're always comparing at a fixed ABR so all you are evaluating are the quality differences.
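
As a rough illustration of that kind of comparison, the sketch below builds the two x265 command lines for a 2-pass encode at a fixed average bitrate, so that two aq-mode settings can be judged at the same size. The file names, the 3500 kbps target and the helper function are hypothetical; only the --pass, --bitrate, --stats, --input, --output and --aq-mode options are taken from the x265 CLI.

```python
import os


def two_pass_commands(source, output, bitrate_kbps, extra_args=(), stats_file="x265_2pass.log"):
    """Return (first_pass, second_pass) argument lists for an x265 2-pass ABR encode."""
    common = [
        "x265", "--input", source,
        "--bitrate", str(bitrate_kbps),
        "--stats", stats_file,
        *extra_args,
    ]
    # The first pass only gathers statistics, so its bitstream is discarded.
    first = common + ["--pass", "1", "--output", os.devnull]
    second = common + ["--pass", "2", "--output", output]
    return first, second


# Hypothetical comparison of two aq-modes at the same target bitrate.
for mode in ("1", "4"):
    first, second = two_pass_commands(
        "source.y4m",                       # placeholder input name
        f"aq{mode}.265",                    # placeholder output name
        3500,                               # placeholder target in kbps
        extra_args=["--aq-mode", mode],
        stats_file=f"aq{mode}.log",         # separate stats file per variant
    )
    print(" ".join(first))
    print(" ".join(second))
```

Each list can be handed to subprocess.run as is. Keeping a separate stats file per variant stops one setting's first pass from feeding the other's second pass.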

rwill
2nd March 2022, 09:23
Arguing about x265 AQ-Modes is like arguing about the prostheses used at the special olympics. Please just stop.

benwaggoner
4th March 2022, 00:42
Arguing about x265 AQ-Modes is like arguing about the prostheses used at the special olympics. Please just stop.
This is Doom9. It is for arguing about aq-modes :sly:!