Old 9th February 2024, 14:58   #1  |  Link
ShogoXT
Registered User
 
Join Date: Dec 2011
Posts: 95
BIG vs little, Intel vs AMD

Going forward, Intel and AMD are diverging in how they design their CPUs. They have completely different visions of how to scale up core counts.

For AMD, Zen 4 and Zen 5 will have compact core variants called Zen 4c and Zen 5c, tuned for density and lower clocks rather than truly cacheless. These will STILL fully support AVX512 and be functionally identical to the normal cores; they just take up a lot less space on the die, so more of them fit. Naturally they perform worse.

Check the benchmarks for the Ryzen 5 8500G, which I would consider a prototype of this approach. It is also known as Phoenix 2; expect to see Zen 5c in Strix Point and Strix Halo.

Now for Intel it's more drastic. Their little cores for Alder Lake were upgraded to support AVX2, and some new instructions were added, because AVX512 could no longer be offered across the whole chip. Their large P cores could run it, but it was disabled.

People wondered when it would return. Intel wants to rework AVX for hybrid processors, but that won't bring AVX512 back for a long time and probably won't apply to current CPUs. See AVX10 below.

https://www.anandtech.com/show/18975...architectures-

Not only that, but it seems they are separating the core types on the server platform: P cores for Granite Rapids and E cores for Sierra Forest. So only Granite Rapids keeps AVX512.

https://www.anandtech.com/show/20034...a-forest-xeons

Meteor Lake is out now, mobile only, with new P and E cores, an NPU, and a new iGPU with AV1 encoding. No AVX10. I suspect that will come with Arrow Lake on the next platform and socket, LGA 1851.

Finally, both brands now have AI NPU options built in: AMD Phoenix and Intel Meteor Lake. Can these eventually be used by filters and encoders the way OpenCL was? OpenML does exist; I imagine it would be great for motion estimation.

Thanks guys, I hope to hear from you all! Sorry for any misspellings, I'm on my phone.
Old 10th February 2024, 02:42   #2  |  Link
StainlessS
HeartlessS Usurer
 
Join Date: Dec 2009
Location: Over the rainbow
Posts: 10,980
Sorry if you feel this is OT.

Quote:
Now for Intel it's more drastic. Their little cores for Alder Lake were upgraded to support avx2 and new instructions were added because they couldn't do avx512 anymore on the whole system. Their large p cores could, but they disabled them.
People wondered when they would return, well Intel wants to rework avx for hybrid processors, but it won't support avx512 for a long time and probably won't support current cpus
I've got a Lenovo ThinkCentre M70q Gen 2 Tiny, with i5-11400T (35W TDP) that claims AVX512 (at least partial).
Also, a Dell OptiPlex 7000 Micro, with i7-12700T (35W TDP) that professes not to support AVX512 [8 P cores {+ 8 hyperthreaded}, 4 E cores].

I find below a little interesting but have not as yet investigated it.
Quote:
Software support

Alder Lake requires special support from the operating system due to its relatively unusual-for-x86 hybrid nature. For software unable to be upgraded, a UEFI-provided compatibility mode may be used to disable the E cores; it is enabled by the user turning on scroll lock.[29]
Quote:
This problem has been fixed in a microcode update. The P and E cores now return the same CPUID when both are enabled. A different CPUID is reported when E cores are disabled and only P cores are enabled. The AVX-512 instruction set extension is implemented in the P cores but disabled due to incompatibility with the E cores.[32] Hackers have shown that it is possible to enable the AVX-512 instructions on the P cores when the E cores are disabled and an old microcode version is used.[33]
https://en.wikipedia.org/wiki/Alder_Lake#Dies

When using MeGUI x64 to encode an Avisynth script, I'm finding that the i7-12700T machine is basically using only the 4 E cores to encode
(well, a few % of the 8 P cores [+ 8 hyperthreaded logical cores] is also used); that's got me a little perplexed.
I'll havta look into it a little more next time I use the i7-12700T machine [I can turn off any number of E cores or P cores in BIOS UEFI Firmware].

EDIT: I'm using Win10 on all machines [I really don't want W11].
EDIT: Long ago, I used to use a program called something like "Prio" to set default Processor Affinity & Priority for certain applications [Addon TAB for TaskManager].
EDIT: This seems to be it, not tested in W10. [EDIT: Supposed to be W7+ compatible].:- https://www.prnwatch.com/prio/
EDIT: This seems to be a similar app to prio, Process Lasso {There is a limited free version}:- https://bitsum.com/
EDIT: 6 Tools to Permanently Set Process Priority in Windows:- https://whatsoftware.com/permanently...ger-with-prio/
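
For anyone wanting to try pinning an encoder to the P cores without one of those tools, here is a minimal Python sketch that just builds the affinity bitmask. The logical CPU numbering (P-core threads as 0-15, E cores as 16-19) is an assumption for an 8P+4E chip with Hyper-Threading and should be checked in Task Manager first; the helper name affinity_mask is made up for illustration.

```python
def affinity_mask(logical_cpus):
    """Return an integer bitmask with one bit set per logical CPU index."""
    mask = 0
    for cpu in logical_cpus:
        if cpu < 0:
            raise ValueError("logical CPU indices must be non-negative")
        mask |= 1 << cpu
    return mask


# ASSUMPTION: on an 8P+4E part with Hyper-Threading the P-core threads are
# enumerated as logical CPUs 0-15 and the E cores as 16-19. Verify this on
# the actual machine before relying on it.
P_CORE_THREADS = range(0, 16)
E_CORES = range(16, 20)

p_mask = affinity_mask(P_CORE_THREADS)
e_mask = affinity_mask(E_CORES)

if __name__ == "__main__":
    # Hex form is what tools that take an affinity mask usually expect.
    print(f"P-core mask: {p_mask:X}")
    print(f"E-core mask: {e_mask:X}")
```

The hex value can then be passed to the Windows cmd built-in, e.g. start /affinity FFFF followed by the program to launch, which restricts that process to the masked logical CPUs.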

EDIT: Flags determined by Avisynth for the i7-12700T with E cores disabled in BIOS (same as for the i7-8700). No change to the flags whether E cores/Scroll-Lock are enabled or disabled.
Code:
Blankclip
info


And for i5-11400T [Has Avx512 flags]


EDIT: i5-12500T does not have AVX512 either and as it does not have E cores, then does not seem to have any reason to not support AVX512 (unless it's to avoid users buying i5 rather than i7/i9 to get AVX512 without the P/E cores hybrid incompatibility problem).
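
The flags above came from Avisynth's Info() on Win10. As a rough cross-check on a Linux box, the sketch below lists any avx512* flags the kernel reports. It assumes the usual /proc/cpuinfo layout with a "flags" line per logical CPU; the function names are made up for illustration, and it does nothing useful on Windows.

```python
import os


def avx512_flags(cpuinfo_text):
    """Return the sorted avx512* feature flags found in cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found.update(f for f in value.split() if f.startswith("avx512"))
    return sorted(found)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the Linux cpuinfo file, or return an empty string if absent."""
    if not os.path.exists(path):
        return ""
    with open(path) as fh:
        return fh.read()


if __name__ == "__main__":
    flags = avx512_flags(read_cpuinfo())
    print(flags if flags else "no AVX512 flags reported")
```

A chip like the i5-11400T would be expected to list at least avx512f here, while the i7-12700T with current microcode would be expected to list none.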
__________________
I sometimes post sober.
StainlessS@MediaFire ::: AND/OR ::: StainlessS@SendSpace

"Some infinities are bigger than other infinities", but how many of them are infinitely bigger ???

Last edited by StainlessS; 10th February 2024 at 04:39.
Old 10th February 2024, 19:29   #3  |  Link
ShogoXT
Registered User
 
Join Date: Dec 2011
Posts: 95
The Alder Lake situation post mortem is the main reason I made this thread. AVX512 was only made available by some motherboard vendors' BIOSes early on. Then Intel forced them to disable it. If you have access to one of those early BIOSes or a user-altered BIOS you can still use it, but you would miss out on all the CPU flaw fixes and other platform fixes released since.

Not only that, on Raptor Lake and later-binned Alder Lake CPUs the feature was fused off at the hardware level, so it became impossible from then on. With that and the AVX10 plans, people just have to accept that Rocket Lake was the only real consumer AVX512 line from Intel.

In other news, Zen 5 was just confirmed to get new AVX-family instructions that first appeared in Tiger Lake and Alder Lake era Intel CPUs.

https://www.phoronix.com/news/AMD-Zen-5-Znver-5-GCC

This also ignores the point that Zen 4 was BETTER at AVX512 than Intel ever was. The point being that in the future Intel is going to be the mOrE cOrEs brand while AMD will be the bigger-cores one. Intel won't be able to fully let consumers use its big cores for several years...
Old 11th February 2024, 07:51   #4  |  Link
DTL
Registered User
 
Join Date: Jul 2018
Posts: 1,070
Intel has simply, finally, separated Professional CPUs from Home CPUs. Intel's Professional CPUs are named Xeon and support AVX512 in nearly every model. Intel's Home CPUs are named i3/i5/i7/... and do NOT support AVX512 because it brings too little benefit in home applications.

For users of moving-picture processing applications Intel has a large series of Xeon chips. Also, to get the benefit of AVX512 a higher RAM transfer speed is required, so Xeons typically have 4 to 12 (16 ?) RAM channels, while Home chips are 2 channels max now. A single AVX2 core can saturate 1 channel of DDR4 SDRAM with a simple activity like YV12 to RGB decode, so there is no need to put expensive AVX512 into 4..8+ cores on a poor man's 2-RAM-channel home machine.
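
The bandwidth argument above can be put into rough numbers. The Python sketch below is only a back-of-envelope estimate under stated assumptions: one 64-bit DDR4-3200 channel, 1080p frames, and RGB32 output, none of which are given in the post. It ignores caches, write-allocate traffic, and real-world DRAM efficiency.

```python
def ddr_channel_bandwidth(mega_transfers_per_s, bus_bytes=8):
    """Theoretical peak bytes per second of one 64-bit DDR channel."""
    return mega_transfers_per_s * 1_000_000 * bus_bytes


def yv12_to_rgb_bytes_per_frame(width, height, rgb_bytes_per_pixel=4):
    """Bytes read as YV12 (12 bits/pixel) plus bytes written as RGB."""
    pixels = width * height
    read_bytes = pixels * 3 // 2
    write_bytes = pixels * rgb_bytes_per_pixel
    return read_bytes + write_bytes


# ASSUMPTIONS (not from the post): DDR4-3200, 1920x1080, RGB32 output.
channel = ddr_channel_bandwidth(3200)                 # 25.6 GB/s theoretical
per_frame = yv12_to_rgb_bytes_per_frame(1920, 1080)
max_fps = channel / per_frame                         # frames/s at the memory limit

if __name__ == "__main__":
    print(f"channel peak: {channel / 1e9:.1f} GB/s")
    print(f"traffic per frame: {per_frame / 1e6:.2f} MB")
    print(f"memory-bound ceiling: {max_fps:.0f} frames/s")
```

Whether a single AVX2 core actually reaches that ceiling depends on the conversion code; the point of the sketch is only that the ceiling is set by the channel, not by the SIMD width.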

It is simply market segmentation in the 202x years. Home users no longer require high-performance computing, so they do not want to invest in AVX512 and large RAM bandwidth. And all home gaming tasks are now better served by discrete GPUs.

If home users want high-performance Intel in 202x, the only option is a many-RAM-channel Xeon machine such as the Workstation series. Or wait for the big step into the future of high-performance computing: Xeon MAX in a Personal Computer, finally with HBM RAM integrated on the CPU.

AMD's 7xxx home chips can execute AVX512 code, but performance may simply be limited when large transfers are required, as with moving-picture frames and many CPU cores active in multithreading, because home AMD chips use the same 2-channel DDR SDRAM.

It is a really big blow for what remains of the open-source programmers serving the home-user community. There is no reason to put large efforts, for free, into much more complex AVX512 programming, because it will not bring much benefit on the physically limited home machines of 202x. Only a limited number of algorithms, with datasets small enough to live with very limited main RAM bandwidth, can be implemented with AVX512 and show the expected benefit (about 2..4+x faster than AVX2).

Also, AMD's AVX512 support on home chips may only mean that program execution does not stop: 512-bit data is dispatched through 256-bit dispatch ports at a lower rate, while full-blooded AVX512 chips really have full-width 512-bit dispatch ports and run at higher speed.
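
The difference between 512-bit operations split over 256-bit ports and native 512-bit ports can be illustrated with an idealized issue-cycle count. This Python sketch is a toy model only: the port counts are hypothetical, and it ignores latency, memory stalls, and clock differences between chips.

```python
import math


def cycles_for_vector_ops(num_ops, op_bits, port_bits, num_ports):
    """Idealized issue cycles when each op is split into port-width pieces."""
    pieces_per_op = math.ceil(op_bits / port_bits)
    return math.ceil(num_ops * pieces_per_op / num_ports)


# HYPOTHETICAL configurations, both with two vector ports.
split_256 = cycles_for_vector_ops(1000, op_bits=512, port_bits=256, num_ports=2)
native_512 = cycles_for_vector_ops(1000, op_bits=512, port_bits=512, num_ports=2)

if __name__ == "__main__":
    print(f"512-bit ops on 256-bit ports: {split_256} cycles")
    print(f"512-bit ops on 512-bit ports: {native_512} cycles")
```

In this model the split design takes twice the issue cycles for the same 512-bit work, which is the "lower rate" described above; real chips can narrow or widen that gap through clocks and other resources.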

" would be great for motion estimation."

There is a special hardware ASIC in the GPU that serves MPEG encoders with ME data. In Win10 and DX12 Microsoft finally exposed it as an API for user applications. So instead of loading general-purpose ML cores with the application-specific task of ME, it is better to keep improving the ME ASIC, which benefits both MPEG encoder performance and all other moving-picture processing via DX12-ME (and other open APIs expected later).
I hope Intel will finally support DX12-ME in some of the GPUs integrated in its CPU chips, so users can compare the performance/quality of the hardware ME ASIC across NVIDIA/AMD/Intel implementations.

"i5-12500T does not have AVX512 either and as it does not have E cores, then does not seem to have any reason to not support AVX512"

A full-blooded AVX512 implementation is a quite separate and power-hungry hardware unit. Running at full load it can require about 10x the power of a standard scalar core. So for a T-marked chip with a very low TDP of 35W it is physically impossible to run AVX512 peaking at 350W or 700W with the same number of cores. There is no power budget at any stage (PSU + motherboard + heat sink) to run AVX512.

It currently looks like Intel disables even AVX2 in low-power cores to fit into a very small TDP. And AVX512 in a good hardware implementation (many dispatch ports running at the same time) may require 4..10+x more power.

So AVX512 support marked on the CPU box or in the description does not by itself mean a real performance benefit. It must have a good hardware implementation, and good typically means a large TDP addition over running the scalar core.

The peak internal performance of a SIMD compute unit and its top power budget depend not only on the SIMD arch name (SSE/AVX) but also on the number of dispatch ports that can dispatch the same instructions on the available dataset. This is an additional layer of superscalarity and depends on the current hardware implementation.
Total theoretical multi-core chip peak performance is about
Num Cores x SIMD dataword width x SuperScalarityFactor x SIMD dispatch port performance

So to fit within the TDP the CPU designer can change the number of dispatch ports that execute the same SIMD instructions, and this directly affects the real TDP and the top computing performance. If the programmer (or compiler) provides enough data in the register file and the CPU implementation has enough dispatch ports for the SIMD arch used, the internal core compute performance may scale close to linearly with the number of dispatch ports used. But it will also increase peak power draw (in this part of the program).
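
The peak-performance product above can be written out as a small Python sketch. The mapping is an interpretation, not something the post spells out: "SuperScalarityFactor" is taken as the number of SIMD dispatch ports, and "dispatch port performance" as operations per port per cycle times the clock. The two example chips use made-up port counts purely to show how the terms trade off.

```python
def peak_ops_per_second(cores, simd_bits, element_bits, ports,
                        ops_per_port_per_cycle, clock_hz):
    """Theoretical peak element operations per second for a whole chip."""
    lanes = simd_bits // element_bits  # elements processed per SIMD op
    return cores * lanes * ports * ops_per_port_per_cycle * clock_hz


# HYPOTHETICAL configurations; port counts and ops/cycle are assumptions.
low_clock_512 = peak_ops_per_second(
    cores=6, simd_bits=512, element_bits=32, ports=2,
    ops_per_port_per_cycle=1, clock_hz=1_300_000_000)
high_clock_256 = peak_ops_per_second(
    cores=8, simd_bits=256, element_bits=32, ports=2,
    ops_per_port_per_cycle=1, clock_hz=3_000_000_000)

if __name__ == "__main__":
    print(f"6 cores, 512-bit, 1.3 GHz: {low_clock_512 / 1e9:.1f} Gops/s")
    print(f"8 cores, 256-bit, 3.0 GHz: {high_clock_256 / 1e9:.1f} Gops/s")
```

With these made-up numbers the narrower but faster-clocked chip comes out ahead, which matches the point that the SIMD arch name alone does not decide peak throughput.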

So the industry is currently moving toward application-specific CPU design: some companies can order special builds of Xeon configured with the required number of cores, RAM channels, and AVX512 SIMD dispatch ports to get the best performance in the customer's application.

And for the general market Intel produces CPUs that are simply optimized in some way for a standard test set of applications for each market area (home PCs, Workstations, Servers and others).

Computing means power consumption. AVX is about high-performance computing, so it causes high TDP. And home users do not want to pay for a 500+W TDP desktop simply to run one AVX512 application once per year. It is physically possible to design a low-power AVX dispatch engine (as we see in AMD 7xxx), but it will also have low performance, which defeats the idea of AVX. So in 202x the market is separated into high-performance Xeons with AVX512 and high TDP, and low-power home CPUs.

"i5-11400T (35W TDP) that claims AVX512"

The 11400T has a 1.3 GHz base clock frequency and only 6 cores. So it can run its AVX512 units only with very significant clock throttling. If it exceeds the 35W TDP while trying to run 6 cores under a heavy AVX512 load, it may also try other throttling methods such as idling cores.

" i7-12700T (35W TDP) that professes not to support AVX512 [8 P cores"

The Xeon W3-2435, with 8 AVX512-capable cores at 3.1 GHz, has a Base Power of 165 W (and may run much higher at high load on the AVX512 units).

Last edited by DTL; 11th February 2024 at 09:29.