View Full Version : New advanced Benchmark


Atak_Snajpera
21st November 2021, 18:12
Sounds interesting. Could you please give a bit more information / sources about chunk encoding? I get the idea but how to approach this technically?

It works like this
https://i.postimg.cc/Vk7yzLZN/client.png

You basically divide your video clip virtually in an AviSynth script with the Trim function and then encode all chunks at once. Then you combine all .265 files into one and mux. With this approach I can easily saturate even a Threadripper 64C/128T using just a 720p source.
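
A rough sketch of that workflow in Python, in case it helps. The helper names (chunk_ranges, trim_script, concat_streams) are made up for illustration and are not part of any client shown above; the only AviSynth calls assumed are Import() and Trim(first, last), and the joining step assumes the .265 files are raw elementary streams that can simply be appended byte for byte.

```python
import shutil


def chunk_ranges(total_frames, n_chunks):
    """Split frames 0..total_frames-1 into n_chunks contiguous (first, last) ranges."""
    if n_chunks < 1 or total_frames < n_chunks:
        raise ValueError("need at least one frame per chunk")
    base, extra = divmod(total_frames, n_chunks)
    ranges = []
    start = 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional frame each.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def trim_script(source_avs, first, last):
    """Text of a small AviSynth script that loads the main script and keeps one chunk.

    Caveat: in AviSynth, Trim(x, 0) means "from x to the end of the clip",
    so a range whose last frame is 0 would need special handling.
    """
    return f'Import("{source_avs}")\nTrim({first}, {last})\n'


def concat_streams(part_paths, out_path):
    """Append the encoded chunk files, in order, into a single output file."""
    with open(out_path, "wb") as out:
        for path in part_paths:
            with open(path, "rb") as part:
                shutil.copyfileobj(part, out)
```

Each generated script would then be handed to its own encoder instance, and the resulting files joined in chunk order before muxing.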

kolak
21st November 2021, 21:41
There are problems with it as well: VBV buffer flow and VBR efficiency with very short chunks, but in most cases it works fine. You should also use the stitching option in encoders and make sure headers are set correctly, especially for the last chunk.
SPS/PPS should also be set properly (for TS muxing), as some hardware boxes don't like it if those change within one video.
You should also pay attention to where you place chunk points; scene changes are best. If streaming requires a fixed I-frame distance, chunk points should instead be aligned to it.
For perfect streams it's not as easy as simply dividing into n parts and encoding them simultaneously, but it can be done. It all depends on what the end target is.
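
To illustrate the chunk-point placement, here is a minimal Python sketch. Both function names and the example numbers are hypothetical; it only shows the boundary arithmetic and does nothing about VBV, headers or SPS/PPS.

```python
def aligned_chunk_starts(total_frames, n_chunks, keyint):
    """Chunk start frames, each rounded down to a multiple of the fixed I-frame distance."""
    ideal = total_frames / n_chunks
    starts = []
    for i in range(n_chunks):
        snapped = int(i * ideal) // keyint * keyint
        # Drop boundaries that collapse onto the previous one (very short clips).
        if not starts or snapped > starts[-1]:
            starts.append(snapped)
    return starts


def snap_to_scene_cuts(ideal_starts, scene_cuts):
    """Move each ideal start frame to the nearest known scene-change frame."""
    snapped = {min(scene_cuts, key=lambda cut: abs(cut - s)) for s in ideal_starts}
    return sorted(snapped)
```

For example, aligned_chunk_starts(1000, 4, 48) gives [0, 240, 480, 720], and snap_to_scene_cuts([0, 250, 500, 750], [0, 231, 510, 702, 990]) gives [0, 231, 510, 702]. The scene-change list itself would have to come from a separate detection pass.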

sebastian1
24th November 2021, 17:35
5900X@PBO/+50MHz/142W
|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| CPU | x264 | x265 | LAVC | auto | MMX2 | SSE | SSE2 | SSE3 | SSE4 | AVX | AVX2 | All |
|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| Ryzen 9 5900X | 48.29 | 8.88 | 224 | 6.94 | 2.39 | 2.40 | 3.67 | 3.93 | 5.66 | 5.72 | 6.86 | N/A |

tormento
26th November 2021, 19:28
Yet nobody with Alder Lake? :)

RanmaCanada
26th November 2021, 21:33
Yet nobody with Alder Lake? :)

We would honestly need to get the benchmark updated for anything in Alder Lake to be relevant.

GEfS
1st January 2022, 06:17
R5 2600 @ 4.1 GHz, 1.35 V
2666 MHz CL16-18-18-38 RAM.


|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| CPU | x264 | x265 | LAVC | auto | MMX2 | SSE | SSE2 | SSE3 | SSE4 | AVX | AVX2 | All |
|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| Ryzen 5 2600 | 16.42 | 2.75 | 97 | 2.26 | 0.90 | 0.84 | 1.30 | 1.44 | 2.09 | 2.12 | 2.18 | N/A |

Emulgator
2nd January 2022, 17:40
|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| CPU | x264 | x265 | LAVC | auto | MMX2 | SSE | SSE2 | SSE3 | SSE4 | AVX | AVX2 | All |
|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| 11th Gen Core i9-11900K | 29.92 | 5.16 | 169 | 4.07 | 1.34 | 1.29 | 2.09 | 2.23 | 2.92 | 3.51 | 4.05 | N/A |

Corei9-11900K, Notebook-MB Z590
No CPU/GPU OC, Busy Clock was between 3,6 and 4,5GHz, idling around 5,1..5,3 GHz,
3200MHz RAM underclocked to 2933 (CR 2T 1467MHz 21-21-21-47)
Consumption depending on passes
CPU cores ate min.80W..max.145W
CPU Package ate min.90W..max.165W
Total System ate min.150W..max.262W

GEfS
6th January 2022, 08:44
i5-12600@stock with ID-Cooling SE-207-XT on Gigabyte Z690 Gaming X DDR4
RAM at 2133 MHz, the Corsair stock setting (3600C18 kit, Micron E-die), because I forgot to turn on XMP. I don't have time for a second run because the machine is going to be shipped very soon.
Benchmark suite running on an NVMe PCIe 4.0 Samsung PM9A1 512GB

|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| CPU | x264 | x265 | LAVC | auto | MMX2 | SSE | SSE2 | SSE3 | SSE4 | AVX | AVX2 | All |
|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| Core i5-12600K | 33.75 | 6.34 | 163 | 4.69 | 1.62 | 1.54 | 2.49 | 2.77 | 3.91 | 4.10 | 4.55 | N/A |


Will update with a proper setup (with at least 3200 MHz RAM) whenever I get my hands on an Alder Lake system.

rwill
7th January 2022, 18:48
|----------------------------------------------|---------|----------|---------|---------|---------|----------|---------|---------|---------|---------|---------|---------|
| CPU | x264 | x265 | LAVC | auto | MMX2 | SSE | SSE2 | SSE3 | SSE4 | AVX | AVX2 | All |
|----------------------------------------------|---------|----------|---------|---------|---------|----------|---------|---------|---------|---------|---------|---------|
| Ryzen Threadripper 3970X 32-Core Processor | 57.04 | 15.53 | 209 | 9.92 | 3.64 | 3.64 | 5.37 | 5.78 | 8.34 | 8.49 | 10.14 | N/A |

CPU utilization wasn't even close to 99%, though.

rwill
7th January 2022, 18:53
By the way:

https://www.igorslab.de/en/intel-deactivated-avx-512-on-alder-lake-but-fully-questionable-interpretation-of-efficiency-news-editorial/

benwaggoner
7th January 2022, 23:18
By the way:

https://www.igorslab.de/en/intel-deactivated-avx-512-on-alder-lake-but-fully-questionable-interpretation-of-efficiency-news-editorial/
The thermal throttling caused by the current AVX512 implementation tends to counteract the potential speed gains in going from 256 to 512 bit SIMD. With x265, turning off AVX512 yields faster encoding below something like 4K at --preset veryslow.

I can see Intel very much wanting to keep AVX512 out of consumer systems where it would actually cause performance regressions most of the time. It's great news if this means they'll provide a better, cooler AVX512 soon that could be of more practical use.

AVX2 had a similar course, where the thermal throttling of the initial implementation made it only a small boost, but a later revision had better thermal characteristics and made AVX2 a lot more valuable to use.

nevcairiel
7th January 2022, 23:40
The thermal throttling caused by the current AVX512 implementation tends to counteract the potential speed gains in going from 256 to 512 bit SIMD.

Did you actually test on Alder Lake, and not some previous implementation?
Because between the new process node as well as general improvements, the downsides should be much reduced.

tonemapped
10th January 2022, 07:48
The thermal throttling caused by the current AVX512 implementation tends to counteract the potential speed gains in going from 256 to 512 bit SIMD. With x265, turning off AVX512 yields faster encoding below something like 4K at --preset veryslow.

I can see Intel very much wanting to keep AVX512 out of consumer systems where it would actually cause performance regressions most of the time. It's great news if this means they'll provide a better, cooler AVX512 soon that could be of more practical use.

AVX2 had a similar course, where the thermal throttling of the initial implementation made it only a small boost, but a later revision had better thermal characteristics and made AVX2 a lot more valuable to use.

The problem with Intel's f**kery with 'accidentally' leaving AVX512 not fused off, therefore allowing motherboard manufacturers to enable it via BIOS (which Intel would have noticed before launch), is that it gave an estimated uplift during the initial review cycle (first week) of ~9% in many applications/benchmarks. I don't see that as a coincidence.

On top of that, I know Intel added AVX2 support to Alder Lake's Atom cores, sorry I mean "efficiency cores", so that people could utilise both the proper cores and the Atom cores, which makes total sense, but Intel stated about six months before 12th Gen was released that AVX512 would be fused off as the Atom cores can't support it.

Then there's the power issue with disabling AVX512. The Alder Lake 12900K already uses ~80% more power for ~21% more performance, when compared to a CPU with fewer cores but the same threads (5900X), and it would appear disabling AVX512 increases power use.


IgorsLab conducted some tests on the 12900K with power measured via the EPS 12V, so a reliable reading, with the following configurations:

12900K 8+8(24) / no AVX512 due to Atom cores
12900K (P-core only) / AVX512 enabled
12900K (P-core only) / AVX512 disabled


Now the interesting part is the power consumption:

12900K 8+8(24) / no AVX512 due to Atom cores: 255W
12900K (P-core only) / AVX512 enabled: 307W
12900K (P-core only) / AVX512 disabled: 328W


I'd really hoped Alder Lake would be more power efficient, but it seems Intel has taken Nvidia's approach and given up on power efficiency. My 5900X can encode two instances of x265 -slower @ ~10.6 fps (1.14V, 139W, 4.35 GHz). The 12900K is certainly impressive, but I don't see how people can justify ~255W for 21% more performance.

And in case people think I'm noting the above because I have a 5900X - the same applies to the 3080 (which I have) vs the 2080 Ti. The former offers ~17% more performance for an additional ~120W. It's obscene.
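
The efficiency argument above can be put into numbers with a few lines of Python. This is only arithmetic on the figures quoted in this post (~21% more performance for ~80% more power, and the three power readings); the function name is made up for illustration and nothing here is a new measurement.

```python
def relative_efficiency(perf_gain, power_gain):
    """Performance per watt relative to the baseline part (1.0 means equal)."""
    return (1.0 + perf_gain) / (1.0 + power_gain)


# ~21% more performance for ~80% more power, as quoted above.
eff = relative_efficiency(0.21, 0.80)

# Power readings quoted above for the three 12900K configurations (watts).
power_w = {
    "8+8, no AVX512": 255,
    "P-core only, AVX512 enabled": 307,
    "P-core only, AVX512 disabled": 328,
}
extra_w = power_w["P-core only, AVX512 disabled"] - power_w["P-core only, AVX512 enabled"]
extra_pct = 100.0 * extra_w / power_w["P-core only, AVX512 enabled"]

print(f"perf/W relative to baseline: {eff:.2f}x")
print(f"AVX512 disabled draws {extra_w} W more ({extra_pct:.1f}%)")
```

That works out to roughly 0.67x the performance per watt, and 21 W (about 7%) more draw with AVX512 disabled in the P-core-only configuration.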

RanmaCanada
11th January 2022, 20:50
One could say Intel left AVX512 in on purpose to give idiots the false sense that their chips are superior. Now that it is being taken away, all that people will remember is that when the chips were launched, they were faster, and they won't notice, or care that their chips are now actually slower, and use far more power.

Intel breaking the law again with false advertising? Nah, they would never break the law, ever.

benwaggoner
11th January 2022, 23:22
One could say Intel left AVX512 in on purpose to give idiots the false sense that their chips are superior. Now that it is being taken away, all that people will remember is that when the chips were launched, they were faster, and they won't notice, or care that their chips are now actually slower, and use far more power.

The thing is, customers aren't going to see any actual real-world regressions without AVX512 with the implementations to date, outside of maybe some specific scientific computing applications, or 4K preset veryslow x265 encoding.

nevcairiel
12th January 2022, 00:11
Intel breaking the law again with false advertising? Nah, they would never break the law, ever.

Intel never advertised AVX512 for Alder Lake, and even directly said to media that AVX512 is (supposed to be) disabled on those chips. Any conclusions drawn otherwise are entirely on the side of the media.

The vast majority of benchmarks or real-world use-cases won't even benefit from AVX512, and the fact that you have to disable the E-cores to make use of it also would only ever show any uplift on extremely heavy AVX512 work-loads, which would be capable of offsetting the reduction in cores.
For x265 for example, I would blindly claim that enabling the E-cores is extremely likely to be faster than making use of AVX512.

RanmaCanada
12th January 2022, 16:25
Intel never advertised AVX512 for Alder Lake, and even directly said to media that AVX512 is (supposed to be) disabled on those chips. Any conclusions drawn otherwise are entirely on the side of the media.

The vast majority of benchmarks or real-world use-cases won't even benefit from AVX512, and the fact that you have to disable the E-cores to make use of it also would only ever show any uplift on extremely heavy AVX512 work-loads, which would be capable of offsetting the reduction in cores.
For x265 for example, I would blindly claim that enabling the E-cores is extremely likely to be faster than making use of AVX512.

https://videocardz.com/newz/intel-confirms-alder-lake-p-h-mobile-specs-publishes-hybrid-architecture-optimization-guide-for-developers

I hate to use the site, but Intel released documentation that stated all you would need to do was disable the E-Cores to enable AVX512. Intel then backpedaled and claimed their official documentation was "wrong".

Atak_Snajpera
12th January 2022, 16:48
Intel never advertised AVX512 for Alder Lake, and even directly said to media that AVX512 is (supposed to be) disabled on those chips. Any conclusions drawn otherwise are entirely on the side of the media.

The vast majority of benchmarks or real-world use-cases won't even benefit from AVX512, and the fact that you have to disable the E-cores to make use of it also would only ever show any uplift on extremely heavy AVX512 work-loads, which would be capable of offsetting the reduction in cores.
For x265 for example, I would blindly claim that enabling the E-cores is extremely likely to be faster than making use of AVX512.

E-Cores are very weak in floating point calculations. If you take the lower clock into account, then 4 E-Cores will be equal to 1 P-Core.
https://i.postimg.cc/GdGk8kQj/Flops-CPUv2.png

Locked clocks at 4 GHz (P-CORE) and 3 GHz (E-CORE) for testing purposes.
https://abload.de/img/flops61ojmu.png

nevcairiel
12th January 2022, 17:14
E-Cores are very weak in floating point calculations. If you take the lower clock into account, then 4 E-Cores will be equal to 1 P-Core.

Luckily video encoding doesn't use much floating point, if any at all. It's all integer math.

It's also all just theory; a proper comparison would be the only interesting part. Even if your assumption is right, there are 8 E-cores on the big CPUs, and if they make up 25% extra performance (e.g. 2 extra cores), that's likely going to beat or match AVX512.
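
The back-of-the-envelope estimate, written out in Python. The 4:1 ratio is the assumption quoted above, not a measurement, and the function name is hypothetical.

```python
def p_core_equivalents(p_cores, e_cores, e_per_p=4.0):
    """Total throughput in 'P-core equivalents', assuming e_per_p E-cores match one P-core."""
    return p_cores + e_cores / e_per_p


with_e_cores = p_core_equivalents(8, 8)   # 8 P-cores plus 8 E-cores
p_cores_only = p_core_equivalents(8, 0)   # E-cores disabled (needed for AVX512)
uplift = with_e_cores / p_cores_only - 1.0

print(f"E-cores add {uplift:.0%} under the 4:1 assumption")
```

So AVX512 on 8 P-cores alone would have to deliver more than that 25% to come out ahead, under the same assumption.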

benwaggoner
12th January 2022, 17:16
Wow, I was not expecting a big jump in x87 performance this decade!

Atak_Snajpera
12th January 2022, 17:37
Wow, I was not expecting a big jump in x87 performance this decade!

Intel CPUs will again crush AMD CPUs in QUAKE 2 game (software mode)!

rwill
12th January 2022, 18:13
Intel CPUs will again crush AMD CPUs in QUAKE 2 game (software mode)!

You are joking but there still seem to be some middleware libraries for games that are using x87. I think the last prominent one that got called out was in Skyrim.

Atak_Snajpera
13th January 2022, 16:49
Luckily video encoding doesn't use much floating point, if any at all. It's all integer math.

It's also all just theory; a proper comparison would be the only interesting part. Even if your assumption is right, there are 8 E-cores on the big CPUs, and if they make up 25% extra performance (e.g. 2 extra cores), that's likely going to beat or match AVX512.

AVX-512 supports integer operations as well
https://en.wikipedia.org/wiki/AVX-512
Introduced with Cannon Lake:

AVX-512 Integer Fused Multiply Add (IFMA) - fused multiply add of integers using 52-bit precision.

GEfS
17th January 2022, 08:05
|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| CPU | x264 | x265 | LAVC | auto | MMX2 | SSE | SSE2 | SSE3 | SSE4 | AVX | AVX2 | All |
|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| Core i7-11700 | 28.40 | 5.55 | 168 | 4.39 | 1.36 | 1.37 | 2.14 | 2.32 | 3.18 | 3.10 | 3.97 | N/A |

B560M Aorus Elite, i7 11700, tower cooler rated at 180W.
4.4 GHz all-core under full load (near 90C, ambient 20C).
RAM 2x8GB at 3200 C16-20
SSD Seagate Q5 500GB while running the test.

i5-12600@stock with ID-Cooling SE-207-XT on Gigabyte Z690 Gaming X DDR4
RAM at 2133 MHz, the Corsair stock setting (3600C18 kit, Micron E-die), because I forgot to turn on XMP. I don't have time for a second run because the machine is going to be shipped very soon.
Benchmark suite running on an NVMe PCIe 4.0 Samsung PM9A1 512GB

|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| CPU | x264 | x265 | LAVC | auto | MMX2 | SSE | SSE2 | SSE3 | SSE4 | AVX | AVX2 | All |
|---------------------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| Core i5-12600K | 33.75 | 6.34 | 163 | 4.69 | 1.62 | 1.54 | 2.49 | 2.77 | 3.91 | 4.10 | 4.55 | N/A |


Will update with a proper setup (with at least 3200 MHz RAM) whenever I get my hands on an Alder Lake system.

And even the worst setup for the 12600K Alder Lake is still better than the i7 11700.
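
For anyone who wants the column-by-column comparison, a small Python sketch that parses the two result rows above and prints the ratio per column. It assumes higher numbers are better in every column; the function names are made up for illustration.

```python
HEADER = "| CPU | x264 | x265 | LAVC | auto | MMX2 | SSE | SSE2 | SSE3 | SSE4 | AVX | AVX2 | All |"
ROW_12600K = "| Core i5-12600K | 33.75 | 6.34 | 163 | 4.69 | 1.62 | 1.54 | 2.49 | 2.77 | 3.91 | 4.10 | 4.55 | N/A |"
ROW_11700 = "| Core i7-11700 | 28.40 | 5.55 | 168 | 4.39 | 1.36 | 1.37 | 2.14 | 2.32 | 3.18 | 3.10 | 3.97 | N/A |"


def parse_row(line):
    """Split one pipe-delimited table line into stripped cell strings."""
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def ratios(header, new_row, old_row):
    """Per-column ratio new/old, skipping cells that are not numbers (e.g. N/A)."""
    names = parse_row(header)[1:]
    new_cells = parse_row(new_row)[1:]
    old_cells = parse_row(old_row)[1:]
    result = {}
    for name, new, old in zip(names, new_cells, old_cells):
        try:
            result[name] = float(new) / float(old)
        except ValueError:
            continue  # non-numeric cell such as "N/A"
    return result


for column, ratio in ratios(HEADER, ROW_12600K, ROW_11700).items():
    print(f"{column:>5}: {ratio:.2f}x")
```

By these rows the 12600K is ahead in every column except LAVC, where the 11700 is slightly ahead (163 vs 168).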