Old 9th November 2019, 11:05   #141  |  Link
aymanalz
Registered User
 
Join Date: May 2015
Posts: 68
Quote:
Originally Posted by DotJun View Post
The odd part is that both auto detect (AVX2) and AVX512 needed the same -3 offset or Windows becomes unstable and crashes. Also, AVX2 ran 5-10°C hotter than AVX512, which is the opposite of what I thought would happen.

Results might be different on a longer sample clip, but I really didn't want to invest too much time into it since, for whatever reason, AVX2 was pushing out more heat than I am comfortable with.
This may have more to do with your processor being overclocked than with any quirks of AVX2.

Have you tried the same tests on stock clocks?
Old 9th November 2019, 11:39   #142  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 9,837
Quote:
Originally Posted by aymanalz View Post
Have you tried the same tests on stock clocks?
No one runs a Skylake X series on stock clocks; that would be a waste of those CPUs, since they OC extremely well. As such, any tests on stock really don't tell you anything useful.

For the question at hand: yes, AVX512 can run cooler overall than AVX2, because it does twice the work in the same time, or needs less time to do the same work. That means the AVX units are less busy overall and don't heat up as much, since software like x265 isn't pushing a pure AVX512 load; it still has loads of other things to compute.
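
To put rough numbers on the "less busy" argument, here is a minimal sketch. The 50% share and the 2x kernel speedup are made-up illustration values, not measurements from x265, and the function name is hypothetical.

```python
def vector_busy_share(simd_fraction, simd_speedup):
    """Share of the (shortened) runtime that is still spent in SIMD kernels.

    simd_fraction: share of the original runtime spent in SIMD kernels (assumed)
    simd_speedup:  how much faster those kernels run with the wider units (assumed)
    """
    simd_time = simd_fraction / simd_speedup   # SIMD part shrinks
    other_time = 1.0 - simd_fraction           # everything else is unchanged
    return simd_time / (simd_time + other_time)


# Hypothetical encoder: half of the time in SIMD kernels with AVX2,
# and AVX512 runs exactly those kernels twice as fast.
print(f"AVX2:   {vector_busy_share(0.5, 1.0):.0%} of the runtime in SIMD kernels")
print(f"AVX512: {vector_busy_share(0.5, 2.0):.0%} of the runtime in SIMD kernels")
```

Under those assumptions the vector units go from being busy half of the time to about a third of the time, which is the direction of the effect described above. It says nothing about actual temperatures.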
__________________
LAV Filters - open source ffmpeg based media splitter and decoders
Old 9th November 2019, 13:16   #143  |  Link
DotJun
Registered User
 
Join Date: Aug 2014
Posts: 23
Quote:
Originally Posted by nevcairiel View Post
No one runs a Skylake X series on stock clocks; that would be a waste of those CPUs, since they OC extremely well. As such, any tests on stock really don't tell you anything useful.

For the question at hand: yes, AVX512 can run cooler overall than AVX2, because it does twice the work in the same time, or needs less time to do the same work. That means the AVX units are less busy overall and don't heat up as much, since software like x265 isn't pushing a pure AVX512 load; it still has loads of other things to compute.
Ok, that makes sense. Is that also why I'm seeing increased performance with AVX512, because it has time to do other things since it's less busy overall due to finishing the AVX stuff sooner?
Old 9th November 2019, 14:24   #144  |  Link
nevcairiel
Registered Developer
 
Join Date: Mar 2010
Location: Hamburg/Germany
Posts: 9,837
Quote:
Originally Posted by DotJun View Post
Ok, that makes sense. Is that also why I'm seeing increased performance with AVX512, because it has time to do other things since it's less busy overall due to finishing the AVX stuff sooner?
That's generally how all SIMD speedup works: you make it spend less time in typical DSP functions, which are easy to optimize, so the overall process runs faster.
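
This is essentially Amdahl's law. A small sketch of the arithmetic follows; the fractions and the 2x kernel speedup are assumptions for illustration, not profiled x265 numbers.

```python
def overall_speedup(simd_fraction, simd_speedup):
    """Amdahl's law: only the SIMD-friendly share of the runtime gets faster."""
    return 1.0 / ((1.0 - simd_fraction) + simd_fraction / simd_speedup)


# Hypothetical shares of runtime spent in DSP functions, with those
# functions running 2x faster after the SIMD optimization.
for fraction in (0.3, 0.5, 0.7):
    gain = overall_speedup(fraction, 2.0)
    print(f"{fraction:.0%} of time in DSP code -> {gain:.2f}x overall")
```

The point of the sketch is that doubling the speed of the DSP functions never doubles the encode speed; the gain depends on how much of the runtime those functions take up in the first place.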
Old 10th November 2019, 05:39   #145  |  Link
aymanalz
Registered User
 
Join Date: May 2015
Posts: 68
Quote:
Originally Posted by nevcairiel View Post
No one runs a Skylake X series on stock clocks; that would be a waste of those CPUs, since they OC extremely well. As such, any tests on stock really don't tell you anything useful.
Yes, running an X series on stock clocks would be a waste of money. But I was suggesting that as a test to eliminate overclocking issues as a likely culprit in the instability and crashing that he mentioned. I don't see how AVX2 can cause crashes at this point, as x265 has been pretty well optimized for AVX2 by now. A bad overclock, on the other hand...

If those crashes and instability occur at stock settings, then he can be sure that it's not related to the overclock.
Old 10th November 2019, 13:19   #146  |  Link
DotJun
Registered User
 
Join Date: Aug 2014
Posts: 23
Quote:
Originally Posted by aymanalz View Post
Yes, running an X series on stock clocks would be a waste of money. But I was suggesting that as a test to eliminate overclocking issues as a likely culprit in the instability and crashing that he mentioned. I don't see how AVX2 can cause crashes at this point, as x265 has been pretty well optimized for AVX2 by now. A bad overclock, on the other hand...

If those crashes and instability occur at stock settings, then he can be sure that it's not related to the overclock.
It wouldn't crash at stock speeds if it isn't crashing at 4.4 GHz with a -3 offset.
Old 15th November 2019, 12:58   #147  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,710
The Ryzen 9 3900X 12C/24T was already very fast, faster than any Intel desktop processor.

But the Ryzen 9 3950X 16C/32T is even faster, a lot faster.

But you can always wait for the Threadrippers that will be released this month (if you can afford the price premium).

__________________
Win 10 x64 (18363.476) - Core i3-9100F - nVidia 1660 (441.41)
HEVC decoding benchmarks
H.264 DXVA Benchmarks for all
Old 15th November 2019, 13:26   #148  |  Link
Stereodude
Registered User
 
Join Date: Dec 2002
Location: Region 0
Posts: 1,138
It's looking good. Can you give us more details on that graph? Like what resolution / settings were used in HandBrake?

Personally, I want to see how the 3960X and 3970X compare before buying something. I presume reviews will drop by the 25th, when they all go on sale, though I don't plan to be a day 1 buyer anyhow.
Old 15th November 2019, 16:07   #149  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,710
Quote:
Originally Posted by Stereodude View Post
It's looking good.
Can you give us more details on that graph?
Like what resolution / settings were used in handbrake?
It's a graph from the legitreviews.com review.
Quote:
We used Big Buck Bunny as our input file, which has become one of the world standards for video benchmarks.

For our benchmark scenario we used a standard 2D 4K (3840x2160) 60 FPS clip in the MP4 format and used Handbrake version 1.2.2
Old 15th November 2019, 17:17   #150  |  Link
Stereodude
Registered User
 
Join Date: Dec 2002
Location: Region 0
Posts: 1,138
Quote:
Originally Posted by NikosD View Post
It's a graph from legitreviews.com review.
So based on their picture in the review they're taking a 2160p60 input clip and converting it to 1080p30 (x264) using the "Fast 1080p" preset in Handbrake 1.2.2. I'm guessing the x265 test is also a 1080p encode.

Edit: This is probably not correct.

Last edited by Stereodude; 16th November 2019 at 17:41.
Old 15th November 2019, 17:38   #151  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,710
After reading almost 10 reviews of the Ryzen 9 3950X today, I can clearly say that this mainstream desktop CPU is AMD's best processor, and the best on the x86/x64 platform in general, of all time.
It's the fastest single-thread CPU among all Intel and AMD processors and AMD's fastest gaming CPU ever (Intel still has a slight advantage in games at 1080p).
Its multi-thread performance is most of the time better than the 18-core Intel HEDT part and a lot faster than the second-generation 16-core Threadripper.
All of that at the same power consumption as the 9900K (~140W at real all-core turbo load), which has only half the cores (8), making the 3950X an extremely efficient processor.
You can even drop the TDP (base frequency) to 65W instead of 105W with a performance loss of 10% - 15%.
All of that for $750.
AMD at its best after many, many years.
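
As a rough worked example of what the 65W mode would mean for efficiency, the sketch below simply divides relative performance by relative TDP. It assumes TDP tracks real power draw, which it often does not, so treat the result as a ballpark only.

```python
STOCK_TDP_W = 105  # figure quoted above
ECO_TDP_W = 65     # figure quoted above


def relative_perf_per_watt(performance_loss):
    """Performance per TDP watt in 65W mode, relative to 105W mode (= 1.0).

    Assumes power scales with TDP, which is a simplification.
    """
    relative_performance = 1.0 - performance_loss
    relative_power = ECO_TDP_W / STOCK_TDP_W
    return relative_performance / relative_power


# The 10% - 15% performance loss range quoted above.
for loss in (0.10, 0.15):
    print(f"{loss:.0%} slower -> {relative_perf_per_watt(loss):.2f}x performance per TDP watt")
```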
Old 15th November 2019, 19:23   #152  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,575
^^ I'm SUPER impressed with it.

I have a 9900K at work, and the improvement relative to that may get me to upgrade from my 7700K at home, especially with me getting more interested in SVT-AV1 testing.

16 cores / 32 threads is perfect for x265 as well, especially all in 1 NUMA node.

Last edited by Blue_MiSfit; 15th November 2019 at 19:48.
Old 15th November 2019, 20:04   #153  |  Link
Stereodude
Registered User
 
Join Date: Dec 2002
Location: Region 0
Posts: 1,138
Do we know yet if the new Zen 2 based Threadrippers (3960X / 3970X) will have more than one NUMA node splitting the cores up?
Old 15th November 2019, 20:44   #154  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,241
Quote:
Originally Posted by Stereodude View Post
Do we know yet if the new Zen 2 based Threadrippers (3960X / 3970X) will have more than one NUMA node splitting the cores up?
3980X (48C/96T) and 3990X (64C/128T) will be seen by Windows as 2 NUMA nodes unless you disable SMT.
Old 15th November 2019, 20:58   #155  |  Link
Stereodude
Registered User
 
Join Date: Dec 2002
Location: Region 0
Posts: 1,138
Quote:
Originally Posted by Atak_Snajpera View Post
3980X (48C/96T) and 3990X (64C/128T) will be seen by Windows as 2 NUMA nodes unless you disable SMT.
Are you sure? The 64 core Epyc Rome is only 2 NUMA nodes with 8 chiplets, so it seems like 4 chiplets in a single NUMA node should be possible.
Old 15th November 2019, 21:31   #156  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,241
Quote:
Originally Posted by Stereodude View Post
Are you sure? The 64 core Epyc Rome is only 2 NUMA nodes with 8 chiplets so it seems like 4 chiplets in single NUMA node should be possible.
Yes, 48T and 64T will be seen as a single NUMA node by Windows. A higher number of threads will give you an extra node in Task Manager.
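
The thread-count cutoff lines up with Windows processor groups, which hold at most 64 logical processors each. A minimal sketch of that arithmetic is below; it only counts groups and assumes the split shown in Task Manager follows them, it does not query a real system.

```python
import math

# Windows processor groups hold at most 64 logical processors each.
GROUP_SIZE = 64


def processor_groups(logical_processors):
    """Minimum number of processor groups needed for this many threads."""
    return math.ceil(logical_processors / GROUP_SIZE)


# Thread counts discussed in this thread.
for threads in (48, 64, 96, 128):
    print(f"{threads:>3} threads -> {processor_groups(threads)} group(s)")
```

With SMT disabled, the 48-core and 64-core parts drop to 48 and 64 threads, which fits in one group again.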
Old 16th November 2019, 02:52   #157  |  Link
RanmaCanada
Registered User
 
Join Date: May 2009
Posts: 110
Well if anyone here has a Zen 2, aka Ryzen 3000 series chip, they should run the benchmark Sagitare created and post their results there. That way we can get some real world numbers, and not the canned crap that benchmarking sites use. Though the exe files need to be updated as they're over a year old now. I honestly wish all sites would do crf 20 slow, then slower, slowest and placebo. No one uses the settings they benchmark with so the results they produce are useless.

https://forum.doom9.org/showthread.php?t=174393
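
For anyone who wants to post comparable numbers, something like this sketch would at least make the command lines explicit. The file names are hypothetical, "slowest" is assumed to mean x265's veryslow preset (there is no preset called slowest), and the script only prints the commands instead of running them.

```python
import shlex

SOURCE = "bbb_2160p60.y4m"  # hypothetical input clip
PRESETS = ["slow", "slower", "veryslow", "placebo"]  # "slowest" read as veryslow


def x265_command(preset, crf=20):
    """Build one x265 command line for the given preset at a fixed CRF."""
    return [
        "x265",
        "--input", SOURCE,
        "--preset", preset,
        "--crf", str(crf),
        "--output", f"bench_{preset}.hevc",  # hypothetical output name
    ]


# Print the exact command lines so they can be posted next to the results.
for preset in PRESETS:
    print(shlex.join(x265_command(preset)))
```

Posting the printed lines together with the fps that x265 reports would give everyone the same settings to compare against.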
Old 16th November 2019, 17:53   #158  |  Link
Stereodude
Registered User
 
Join Date: Dec 2002
Location: Region 0
Posts: 1,138
Quote:
Originally Posted by Stereodude View Post
So based on their picture in the review they're taking a 2160p60 input clip and converting it to 1080p30 (x264) using the "Fast 1080p" preset in Handbrake 1.2.2. I'm guessing the x265 test is also a 1080p encode.
So I decided to download HandBrake 1.2.2 and the 2160p60 version of Big Buck Bunny and try this test myself, to see how my E5-2687Wv2 computer compares to the systems Legit Reviews used and what sort of improvement I might get. I got an average FPS of 47.3 using x264 and the settings shown in their screenshot. They're clearly not using the HandBrake settings shown in the screenshot. There's no way an 8C/16T Xeon without AVX2 is going to best an 8C/16T 9th Gen i9 by over 50% when the i9 has the same number of cores, a sizeable clock speed advantage, ~6 generations of IPC improvements, and support for additional instructions.

So, no idea what settings they're using, but it's definitely not what's shown in the screenshot.
Old 16th November 2019, 18:25   #159  |  Link
RanmaCanada
Registered User
 
Join Date: May 2009
Posts: 110
Quote:
Originally Posted by Stereodude View Post
So I decided to download HandBrake 1.2.2 and the 2160p60 version of Big Buck Bunny and try this test myself, to see how my E5-2687Wv2 computer compares to the systems Legit Reviews used and what sort of improvement I might get. I got an average FPS of 47.3 using x264 and the settings shown in their screenshot. They're clearly not using the HandBrake settings shown in the screenshot. There's no way an 8C/16T Xeon without AVX2 is going to best an 8C/16T 9th Gen i9 by over 50% when the i9 has the same number of cores, a sizeable clock speed advantage, ~6 generations of IPC improvements, and support for additional instructions.

So, no idea what settings they're using, but it's definitely not what's shown in the screenshot.
And that is why we can't trust reviews for encoding benchmarks if they don't give us the command lines used. That's also why I suggested that any one of the numerous people here who have a Zen 2 should run Sagitare's benchmark, so we can at least get an apples to apples comparison. Though as the exes are actually 2 years old, Sagitare needs to update it (I tried and failed miserably haha).
Old 25th November 2019, 17:52   #160  |  Link
Warrex
Registered User
 
Join Date: Dec 2005
Posts: 71
Threadripper 39xxX:



For comparison:



Intel SVT (AV1, etc.) performance:

https://techgage.com/article/amd-thr...980xe-linux/2/