Which processor to encode x265 4K?
NikosD
15th November 2019, 17:38
After reading almost 10 reviews of the Ryzen 9 3950X today, I can clearly say that this mainstream desktop CPU is AMD's best processor, and the best on the x86/x64 platform in general, of all time.
It's the fastest single-thread CPU among all Intel and AMD processors and AMD's fastest gaming CPU ever (Intel still has a slight advantage in games at 1080p).
Multi-thread performance is most of the time better than the 18-core Intel HEDT part and a lot faster than the second-generation 16-core Threadripper.
All of that at the same power consumption as the 9900K (~140W at all-core turbo in real workloads), which has half the cores (!), only 8, making the 3950X an extremely efficient processor.
You can even drop the TDP (base frequency) to 65W instead of 105W with a performance loss of 10% - 15%.
All of that for $750.
AMD at its best after many, many years.
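The efficiency claim above can be checked with simple arithmetic. The sketch below uses only the figures quoted in the post (~140W all-core draw for both chips, 16 vs 8 cores, 10% - 15% loss at 65W); note that TDP is a rating rather than measured power, so the 65W/105W ratio is an assumption used purely for illustration.

```python
# Figures quoted in the post; the ~140 W all-core draw is the poster's estimate.
ALL_CORE_POWER_W = 140


def watts_per_core(power_w: float, cores: int) -> float:
    """Average package power per core under an all-core load."""
    return power_w / cores


def relative_perf_per_tdp_watt(perf_loss: float,
                               tdp_low: float = 65,
                               tdp_high: float = 105) -> float:
    """Performance per TDP watt at the low setting, relative to the high one.

    Assumption: actual power scales with the TDP rating, which is only
    roughly true in practice.
    """
    return (1 - perf_loss) / (tdp_low / tdp_high)


print(f"3950X: {watts_per_core(ALL_CORE_POWER_W, 16):.2f} W per core")
print(f"9900K: {watts_per_core(ALL_CORE_POWER_W, 8):.2f} W per core")
for loss in (0.10, 0.15):
    gain = relative_perf_per_tdp_watt(loss)
    print(f"65 W mode with {loss:.0%} loss: {gain:.2f}x performance per TDP watt")
```

Under these assumptions the 65W mode would deliver roughly 1.4x the performance per TDP watt of the 105W setting.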
Blue_MiSfit
15th November 2019, 19:23
^^ I'm SUPER impressed with it.
I have a 9900k at work and the improvement relative to this may get me to upgrade from my 7700k at home, especially with me getting more interested in SVT AV1 testing :)
16 cores / 32 threads is perfect for x265 as well, especially all in 1 NUMA node.
Stereodude
15th November 2019, 20:04
Do we know yet if the new Zen 2 based Threadrippers (3960X / 3970X) will have more than one NUMA node splitting the cores up?
Atak_Snajpera
15th November 2019, 20:44
Do we know yet if the new Zen 2 based Threadrippers (3960X / 3970X) will have more than one NUMA node splitting the cores up?
3980x (48C/96T) and 3990x (64C/128T) will be seen by Windows as 2 NUMA nodes unless you disable SMT.
Stereodude
15th November 2019, 20:58
3980x (48C/96T) and 3990x (64C/128T) will be seen by Windows as 2 NUMA nodes unless you disable SMT.
Are you sure? The 64 core Epyc Rome is only 2 NUMA nodes with 8 chiplets so it seems like 4 chiplets in single NUMA node should be possible.
Atak_Snajpera
15th November 2019, 21:31
Are you sure? The 64 core Epyc Rome is only 2 NUMA nodes with 8 chiplets so it seems like 4 chiplets in single NUMA node should be possible.
Yes. With 48 or 64 threads, Windows will see a single NUMA node. A higher number of threads will give you an extra node in Task Manager.
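The split described above matches the Windows limit of 64 logical processors per processor group. As a minimal sketch, assuming that limit is what produces the extra node in Task Manager, the number of groups for a given thread count is:

```python
import math

# A Windows processor group holds at most 64 logical processors.
LOGICAL_CPUS_PER_GROUP = 64


def processor_groups(logical_cpus: int) -> int:
    """Number of processor groups needed for the given logical CPU count."""
    return math.ceil(logical_cpus / LOGICAL_CPUS_PER_GROUP)


for label, threads in [("48C, SMT off", 48), ("64C, SMT off", 64),
                       ("48C/96T", 96), ("64C/128T", 128)]:
    print(f"{label}: {processor_groups(threads)} group(s)")
```

This only models the group count; how an encoder spreads its threads across groups depends on the application.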
RanmaCanada
16th November 2019, 02:52
Well if anyone here has a Zen 2, aka Ryzen 3000 series chip, they should run the benchmark Sagitare created and post their results there. That way we can get some real world numbers, and not the canned crap that benchmarking sites use. Though the exe files need to be updated as they're over a year old now. I honestly wish all sites would do crf 20 slow, then slower, slowest and placebo. No one uses the settings they benchmark with so the results they produce are useless.
https://forum.doom9.org/showthread.php?t=174393
Stereodude
16th November 2019, 17:53
So based on their picture in the review they're taking a 2160p60 input clip and converting it to 1080p30 (x264) using the "Fast 1080p" preset in Handbrake 1.2.2. I'm guessing the x265 test is also a 1080p encode.
So I decided to download HandBrake 1.2.2 and the 2160p60 version of Big Buck Bunny and try this test myself to see how my E5-2687Wv2 computer compares to the systems Legit Reviews used to see what sort of improvement I might get. I got an average FPS of 47.3 using x264 and the settings shown in their screenshot. They're clearly not using the HandBrake settings shown in the screenshot. There's no way an 8C/16T Xeon without AVX2 is going to best an 8C/16T 9th Gen i9 by over 50% when the i9 has the same number of cores, a sizeable clock speed advantage, and ~6 generations of IPC improvements and support for additional instructions.
So, no idea what settings they're using, but it's definitely not what's shown in the screenshot.
RanmaCanada
16th November 2019, 18:25
So I decided to download HandBrake 1.2.2 and the 2160p60 version of Big Buck Bunny and try this test myself to see how my E5-2687Wv2 computer compares to the systems Legit Reviews used to see what sort of improvement I might get. I got an average FPS of 47.3 using x264 and the settings shown in their screenshot. They're clearly not using the HandBrake settings shown in the screenshot. There's no way an 8C/16T Xeon without AVX2 is going to best an 8C/16T 9th Gen i9 by over 50% when the i9 has the same number of cores, a sizeable clock speed advantage, and ~6 generations of IPC improvements and support for additional instructions.
So, no idea what settings they're using, but it's definitely not what's shown in the screenshot.
And that is why we can't trust reviews for encoding benchmarks if they don't give us the command lines used. That is also why I suggested that any of the numerous people here who have a Zen 2 should run Sagitare's benchmark, so we can at least get an apples-to-apples comparison. Though as the exes are actually 2 years old, Sagitare needs to update it (I tried and failed miserably haha).
Warrex
25th November 2019, 17:52
Threadripper 39xxX:
https://p.xfastest.com/~sinchen/AMD-Ryzen-Threadripper-3960X-3970X/AMD-Ryzen-Threadripper-3960X-3970X-33.jpg
For comparison:
https://p.xfastest.com/~sinchen/AMD-Ryzen-7-3700X-9-3900X/AMD-Ryzen-7-3700X-9-3900X-34.jpg
Intel SVT (AV1, etc.) performance:
https://techgage.com/article/amd-threadripper-3960x-3970x-intel-i9-10980xe-linux/2/
Stereodude
25th November 2019, 21:45
And what does the XFASTEST x264/x265 test consist of?
Their data for x265 looks promising, but shows it's not faster in x264.
Others show improvements in both (from Legit Reviews):
https://i.imgur.com/oDRhNTr.png
Warrex
25th November 2019, 22:54
And what does the XFASTEST x264/x265 test consist of?
As already discussed in this thread XFASTEST use Atak's benchmarks:
https://forum.pclab.pl/topic/1184884-x265-FHD-Benchmark/
https://www.guru3d.com/files-details/x264-fhd-benchmark-v1-1-64bit.html
RanmaCanada
26th November 2019, 05:35
I personally feel no benchmark is valid unless they are using at least the slow preset. I will wait until I see someone running with those, like techpowerup did (https://www.techpowerup.com/review/amd-ryzen-7-3700x/13.html) for the Ryzen 3700x and 3900x. Though they used time to encode, and did not give us the fps, nor the length of the actual clip they used. I really hate how no one properly benches encoding. Tell us how long the clip is, how many frames it has, and what type of media it is. I would honestly suggest using one of the test pattern clips, or any of the open source/Creative Commons movies, like Tears of Steel. I mean how else are we the public supposed to replicate the results if we don't know what reviewers are using to test with?
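The point about publishing the frame count and clip length can be made concrete: a time-to-encode figure only converts to fps if the frame count is known. A minimal sketch follows; the numbers in the example are hypothetical and not taken from any review.

```python
def encode_fps(frames: int, elapsed_seconds: float) -> float:
    """Average encoding speed in frames per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return frames / elapsed_seconds


def clip_duration_seconds(frames: int, source_fps: float) -> float:
    """Playback length of the clip in seconds."""
    return frames / source_fps


# Hypothetical example: a 7200-frame clip at 24 fps, encoded in 600 seconds.
frames, source_fps, elapsed = 7200, 24, 600
print(f"clip length: {clip_duration_seconds(frames, source_fps):.0f} s")
print(f"encode speed: {encode_fps(frames, elapsed):.1f} fps")
```

Without the frame count, the same 600-second result could correspond to any fps value, which is why a bare encode time cannot be compared across reviews.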
nevcairiel
26th November 2019, 08:50
I mean how else are we the public supposed to replicate the results if we don't know what reviewers are using to test with?
You are not, because otherwise there would be fanbois from both sides showing up, playing point and counter-point to show how the results the website posted are wrong in either direction.
That's why you keep benchmarks at least slightly opaque. If you want to do your own benchmarks, you need to make your own references to benchmark against.
Atak_Snajpera
26th November 2019, 15:54
Even 3950X is extremely good in x265
https://i.postimg.cc/KYgqT9Cf/Untitled-1.png
NikosD
27th November 2019, 02:23
And SVT-AV1 too
https://i.postimg.cc/Bn7dyfhg/y-V94uwn-ZCpkfk4-B3z-UR4-Td-650-80.png
Nintendo Maniac 64
27th November 2019, 19:50
And SVT-AV1 too
I know you can't compare setups, but how the heck did Tom's get such ridiculously faster numbers than Phoronix? Windows vs Linux shouldn't cause that much of a difference!
https://www.phoronix.com/scan.php?page=article&item=amd-linux-3960x-3970x&num=8
Atak_Snajpera
27th November 2019, 20:17
I know you can't compare setups, but how the heck did Tom's get such ridiculously faster numbers than Phoronix? Windows vs Linux shouldn't cause that much of a difference!
https://www.phoronix.com/scan.php?page=article&item=amd-linux-3960x-3970x&num=8
Easier to compress sample? Different ENC mode? Hard to tell...
NikosD
27th November 2019, 20:35
I know you can't compare setups, but how the heck did Tom's get such ridiculously faster numbers than Phoronix?
I know nothing about SVT-AV1, but I must say that according to Phoronix, Enc4 and Enc8 differ by one order of magnitude (x10) in absolute numbers (!)
I think we should focus on the relative performance within each of Tom's and Phoronix's tests for Enc4.
For example, in both Enc4 tests the new 32C Threadripper is about two times faster than the 18C Core i9 and 2.5 to 3 times faster than the previous-gen 32C Threadripper.
Absolute performance probably matters more to those who actually use this specific AV1 encoder rather than just benchmarking with it.
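Comparing relative rather than absolute numbers amounts to normalizing each site's results to a common baseline CPU. The sketch below shows the idea; the fps values are hypothetical placeholders chosen only to illustrate that two sites with very different absolute numbers can agree on the ratio.

```python
def relative_to_baseline(results: dict, baseline: str) -> dict:
    """Express each CPU's score as a multiple of the baseline CPU's score."""
    base = results[baseline]
    return {cpu: score / base for cpu, score in results.items()}


# Hypothetical fps figures from two sites using different clips and settings.
site_a = {"Threadripper 32C (new)": 10.0, "Core i9 18C": 5.0}
site_b = {"Threadripper 32C (new)": 1.0, "Core i9 18C": 0.5}

for name, results in (("site A", site_a), ("site B", site_b)):
    ratios = relative_to_baseline(results, "Core i9 18C")
    print(name, {cpu: round(r, 2) for cpu, r in ratios.items()})
```

Both hypothetical sites report a 2.0x ratio even though their absolute numbers differ by a factor of ten.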
TEB
12th December 2019, 14:51
So... I got access to a Lenovo SR630 with an Epyc 7742 at work, now running RHEL 8.1. Any benchmarks you guys want me to run on it?
processor : 127
vendor_id : AuthenticAMD
cpu family : 23
model : 49
model name : AMD EPYC 7742 64-Core Processor
stepping : 0
microcode : 0x830101c
cpu MHz : 2332.947
cache size : 512 KB
physical id : 0
siblings : 128
RanmaCanada
12th December 2019, 17:49
Try running Sagitare's benchmark. https://forum.doom9.org/showthread.php?t=174393 It's a little old.
You could also run a 1080p or 4k benchmark of, say, Tears of Steel at CRF 18-20 and go from placebo all the way to ultrafast (if you have time), because this would be a real world workload and let people see the difference in each preset.
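A preset sweep like the one suggested above could be scripted roughly as follows. This sketch only builds the command lines and does not run them; the source file name and output naming are hypothetical, and it assumes an `x265` binary on the PATH that accepts the `--input`, `--preset`, `--crf` and `--output` options.

```python
import shlex

# x265 preset names, slowest to fastest.
PRESETS = ["placebo", "veryslow", "slower", "slow", "medium",
           "fast", "faster", "veryfast", "superfast", "ultrafast"]


def build_commands(source: str, crf: int = 18, encoder: str = "x265") -> list:
    """Return one x265 argument list per preset for the same source and CRF."""
    commands = []
    for preset in PRESETS:
        output = f"out_{preset}_crf{crf}.hevc"  # hypothetical naming scheme
        commands.append([encoder, "--input", source,
                         "--preset", preset, "--crf", str(crf),
                         "--output", output])
    return commands


# Hypothetical source file name; substitute the real Y4M path.
for cmd in build_commands("tears_of_steel_1080p.y4m", crf=18):
    print(shlex.join(cmd))
```

Publishing the printed command lines alongside the fps results is what makes the run reproducible by others.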
Atak_Snajpera
18th December 2019, 23:14
This https://forum.pclab.pl/topic/1184884-x265-FHD-Benchmark/
TEB
18th December 2019, 23:30
This https://forum.pclab.pl/topic/1184884-x265-FHD-Benchmark/
Running RHEL 8.x
Atak_Snajpera
18th December 2019, 23:42
Running RHEL 8.x
Windows in VM maybe?
blublub
20th January 2020, 20:43
If anyone wants me to run a test on a Threadripper 3960X let me know.
Conditions:
Windows based
easy to set up - ideally specify the source with a link, and the program to encode with plus the wanted preset
RanmaCanada
21st January 2020, 01:33
If anyone wants me to run a test on a Threadripper 3960X let me know.
Conditions:
Windows based
easy to set up - ideally specify the source with a link, and the program to encode with plus the wanted preset
For simplicity I would say to do Tears of Steel with each preset at CRF 16-18 (what most people here use). If you have time do 1080p to 1080p and 4k to 4k. (https://mango.blender.org/download/)
If that's too long, do the good old Park Joy (https://media.xiph.org/video/derf/).
You could use handbrake, or staxrip, your choice. No filtering, no extras, just let the presets do what they are supposed to.
The reason I ask this is because I feel Atak's benchmark uses a preset that no one will use in real life, and every site that does benchmark it does not give us the length, frames, etc of the source they are using, which makes their benchmarks useless.
By using these well known, open source/royalty free videos, the results can be properly compared with other processors, and results can be replicated.
Atak_Snajpera
21st January 2020, 11:46
For simplicity I would say to do Tears of Steel with each preset at CRF 16-18 (what most people here use). If you have time do 1080p to 1080p and 4k to 4k.
You are also making the blind assumption that most people use your settings. You have zero data to prove your claim, so stop saying that my benchmark using default settings is unrealistic. Furthermore, your CRF 16-18 is total overkill for most users (bitrate will go through the roof). Another issue: HandBrake won't saturate all 48 threads at 1080p, and probably not at 4K either. Chunked encoding is the only effective way of achieving constant 100% CPU usage. With the 3990X, chunked encoding is basically required.
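Chunked encoding means splitting the source into frame ranges that are encoded by separate encoder instances in parallel and joined afterwards. The sketch below only computes the frame ranges; it is an illustration of the idea, not how RipBot implements it. Real tools typically cut at scene changes rather than at equal intervals. With x265, each range could be passed to an instance via its `--seek` and `--frames` options.

```python
def split_into_chunks(total_frames: int, chunk_count: int) -> list:
    """Split a frame count into near-equal (start_frame, frame_count) ranges."""
    if chunk_count <= 0:
        raise ValueError("chunk_count must be positive")
    base, extra = divmod(total_frames, chunk_count)
    chunks = []
    start = 0
    for index in range(chunk_count):
        # The first `extra` chunks take one additional frame each.
        length = base + (1 if index < extra else 0)
        chunks.append((start, length))
        start += length
    return chunks


# Hypothetical 1000-frame clip split across 3 encoder instances.
for start, length in split_into_chunks(1000, 3):
    print(f"--seek {start} --frames {length}")
```

Each instance then runs its own thread pool, which is how the total load can reach levels that a single encode of a 1080p source cannot.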
RanmaCanada
21st January 2020, 17:16
You are also making the blind assumption that most people use your settings. You have zero data to prove your claim, so stop saying that my benchmark using default settings is unrealistic. Furthermore, your CRF 16-18 is total overkill for most users (bitrate will go through the roof). Another issue: HandBrake won't saturate all 48 threads at 1080p, and probably not at 4K either. Chunked encoding is the only effective way of achieving constant 100% CPU usage. With the 3990X, chunked encoding is basically required.
Just go through all the threads in the various subs here and almost EVERYONE uses slow or slower for their encodes, with an average CRF of 18. You will have the odd person who is "blind" and thinks 22 or higher is fine, but the majority of people use CRFs in the teens.
Remember, this is a forum for the 1% of encoders and enthusiasts. Most people want archival quality of their encodes.
I am sorry you feel offended by what I said about your benchmark, but in real world scenarios your benchmark doesn't hold up. For example on your benchmark it states my 2700 gets 21+ fps, which I've never EVER seen in encoding at slow or slower. Not even at medium have I seen it at that speed, and I run with a pretty bare command line. Benchmarks are supposed to be about real world performance, and no one who cares about quality would be using fast and higher presets.
microchip8
21st January 2020, 17:25
Just go through all the threads in the various subs here and almost EVERYONE uses slow or slower for their encodes, with an average CRF of 18. You will have the odd person who is "blind" and thinks 22 or higher is fine, but the majority of people use CRFs in the teens.
Remember, this is a forum for the 1% of encoders and enthusiasts. Most people want archival quality of their encodes.
I am sorry you feel offended by what I said about your benchmark, but in real world scenarios your benchmark doesn't hold up. For example on your benchmark it states my 2700 gets 21+ fps, which I've never EVER seen in encoding at slow or slower. Not even at medium have I seen it at that speed, and I run with a pretty bare command line. Benchmarks are supposed to be about real world performance, and no one who cares about quality would be using fast and higher presets.
I'm not "blind" and use CRF 21 in 10-bits and it looks totally fine. In fact, I can't see a (major) difference between 21 and 18 or 19, except for the bitrate. I must be blind, then?
Atak_Snajpera
21st January 2020, 17:59
Just go through all the threads in the various subs here and almost EVERYONE uses slow or slower for their encodes, with an average CRF of 18. You will have the odd person who is "blind" and thinks 22 or higher is fine, but the majority of people use CRFs in the teens.
Remember, this is a forum for the 1% of encoders and enthusiasts. Most people want archival quality of their encodes.
I am sorry you feel offended by what I said about your benchmark, but in real world scenarios your benchmark doesn't hold up. For example on your benchmark it states my 2700 gets 21+ fps, which I've never EVER seen in encoding at slow or slower. Not even at medium have I seen it at that speed, and I run with a pretty bare command line. Benchmarks are supposed to be about real world performance, and no one who cares about quality would be using fast and higher presets.
Because this benchmark uses samples containing a lot of motion and details!
crowd_run_1080p50.yuv
ducks_take_off_1080p50.yuv
in_to_tree_1080p50.yuv
old_town_cross_1080p50.yuv
park_joy_1080p50.yuv
Look at this as a worst-case scenario.
Besides, you are probably encoding at a cropped 1920x800 resolution instead of full 1920x1080.
This benchmark shows you what you can expect from CPU A vs CPU B. For example: should I buy a Ryzen 3950X or an Intel Core i9 10980XE? Do not look at the raw numbers, because that does not make sense.
PS. The 2700X gets 28 fps in my benchmark, so be more precise next time, OK?
https://i.postimg.cc/G2K5grgN/Capture.png
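The resolution point above can be quantified: a 1920x800 crop has noticeably fewer pixels per frame than full 1920x1080. A minimal sketch of the arithmetic follows, with the caveat that encoder speed does not scale exactly linearly with pixel count, so this is only a rough indicator.

```python
def pixel_ratio(width_a: int, height_a: int, width_b: int, height_b: int) -> float:
    """Pixels per frame of resolution A as a fraction of resolution B."""
    return (width_a * height_a) / (width_b * height_b)


ratio = pixel_ratio(1920, 800, 1920, 1080)
print(f"1920x800 has {ratio:.1%} of the pixels of 1920x1080")
```

So a cropped encode processes roughly a quarter fewer pixels per frame than the full-frame samples used in the benchmark.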
Asmodian
21st January 2020, 18:02
Fast/slow/slower?
CRF 21 is not bad with x265. I use lower for x264 but with x265 below that usually results in bitrates near where I would also get transparent encodes using x264. However, I think a CPU benchmark should use slower not fast. Hardware encoding is good enough today that unless you are using slower+ a Turing GPU encode is probably a better option.
microchip8
21st January 2020, 18:06
Fast/slow/slower?
CRF 21 is not bad with x265. I use lower for x264 but with x265 below that usually results in bitrates near where I would also get transparent encodes using x264. However, I think a CPU benchmark should use slower not fast. Hardware encoding is good enough today that unless you are using slower+ a Turing GPU encode is probably a better option.
Custom settings that have features from both slow and slower preset
blublub
21st January 2020, 20:50
Fast/slow/slower?
CRF 21 is not bad with x265. I use lower for x264 but with x265 below that usually results in bitrates near where I would also get transparent encodes using x264. However, I think a CPU benchmark should use slower not fast. Hardware encoding is good enough today that unless you are using slower+ a Turing GPU encode is probably a better option.
I haven't tried GPU encoding for ages. Is it really good looking now?
I will see what I can bench test over the weekend.
From my perspective it doesn't matter what presets are used for a benchmark.
Consistency is important to compare CPUs
RanmaCanada
21st January 2020, 20:54
Because this benchmark uses samples containing a lot of motion and details!
crowd_run_1080p50.yuv
ducks_take_off_1080p50.yuv
in_to_tree_1080p50.yuv
old_town_cross_1080p50.yuv
park_joy_1080p50.yuv
Look at this as a worst-case scenario.
Besides, you are probably encoding at a cropped 1920x800 resolution instead of full 1920x1080.
This benchmark shows you what you can expect from CPU A vs CPU B. For example: should I buy a Ryzen 3950X or an Intel Core i9 10980XE? Do not look at the raw numbers, because that does not make sense.
PS. The 2700X gets 28 fps in my benchmark, so be more precise next time, OK?
https://i.postimg.cc/G2K5grgN/Capture.png
I said 2700, not 2700x
Atak_Snajpera
22nd January 2020, 00:02
I said 2700, not 2700x
According to this graph you should get 24+ FPS then
https://cdn.mos.cms.futurecdn.net/KfSHM4wcaYKM9VWgPk9FdS-650-80.png
Indeed huge error from my side ...
RanmaCanada
22nd January 2020, 00:29
Which I do see with an updated version of x265. Still nowhere near what I get in real world situations, hence the request for slow and lower presets as that is what the majority of us who encode would be using. Depending on material at slow I see 4-7 fps, and on slower I'm lucky if I get 2, but usually hang around 1.3-1.5 fps. This is for 1080p content. For 4k, on slow I'm lucky if I hit 1 fps.
blublub
22nd January 2020, 06:05
Hi
Ok. Here is one result:
Tears of Steel 4K, x265 preset slow: 6.32 fps
Software: RipBot with HEVC encoder version 3.2+34-8e6db24c1517
settings:
x265_x64.exe" --colorprim bt709 --transfer bt709 --colormatrix bt709 --crf 18 --fps 24 --min-keyint 24 --keyint 240 --frames 17616 --sar 1:1 --profile main10 --output-depth 10 --preset slow --ctu 64 --merange 57
@Atak_Snajpera:
I tried to download your benchmark but the download site kept generating one "key" after another and never presented the download link.
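The result in this post includes everything needed to turn the fps figure into a wall-clock estimate. Using only the values quoted in the post (`--frames 17616`, `--fps 24`, and the reported 6.32 fps), a small sketch of the arithmetic:

```python
FRAMES = 17616      # --frames value from the posted command line
SOURCE_FPS = 24     # --fps value from the posted command line
ENCODE_FPS = 6.32   # reported speed at preset slow

clip_seconds = FRAMES / SOURCE_FPS      # playback length of the clip
encode_seconds = FRAMES / ENCODE_FPS    # wall-clock time for the encode
realtime_ratio = ENCODE_FPS / SOURCE_FPS

print(f"clip length: {clip_seconds / 60:.1f} min")
print(f"encode time: {encode_seconds / 60:.1f} min")
print(f"speed: {realtime_ratio:.2f}x realtime")
```

That works out to a clip of about 12 minutes taking roughly 46 minutes to encode, or about a quarter of realtime speed.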
RanmaCanada
22nd January 2020, 06:57
Thank you!
Atak_Snajpera
22nd January 2020, 12:57
Hi
Ok. Here is one result:
Tears of Steel 4K, x265 preset slow: 6.32 fps
Software: RipBot with HEVC encoder version 3.2+34-8e6db24c1517
settings:
x265_x64.exe" --colorprim bt709 --transfer bt709 --colormatrix bt709 --crf 18 --fps 24 --min-keyint 24 --keyint 240 --frames 17616 --sar 1:1 --profile main10 --output-depth 10 --preset slow --ctu 64 --merange 57
@Atak_Snajpera:
I tried to download your benchmark but the download site kept generating one "key" after another and never presented the download link.
Important question: was CPU usage at 100%?
blublub
22nd January 2020, 13:11
No, but close. Since I used the preset slower it was around 80 to 85%. On medium it is around 65%.
Atak_Snajpera
22nd January 2020, 15:49
No, but close. Since I used the preset slower it was around 80 to 85%. On medium it is around 65%.
Try again with Distributed Encoding mode active.
Use these settings
https://i.postimg.cc/Gt0TtbFX/Capture.png
https://i.postimg.cc/FzjdJwGN/Capture2.png
Flat 100% CPU usage guaranteed!
blublub
22nd January 2020, 18:10
Hi, I'm having trouble getting it to encode. I disabled all NICs except one, disabled the firewall and uninstalled the AV, but the encode still won't start.
blublub
24th January 2020, 09:25
OK, I did get distributed encoding to start correctly.
CPU load was mostly between 80 and 90%. That is still an increase of 10 to 15% in CPU load, but no matter what I do I can't get it consistently close to 100%.
I have no clue what the issue here is.
I run the complete encode on an NVMe Samsung Pro SSD, so that can't be it.
From what I know about encoding it shouldn't be limited by memory speed, and I have quad channel 2933 with more than 50 GB/s, so I guess that's not it either.
Maybe it's a bottleneck within the CPU, e.g. AVX capacity.
I don't know, but from what I am seeing here it kind of doesn't make sense to buy a CPU with more than 24 cores for encoding at the moment.
Atak_Snajpera
24th January 2020, 11:57
Have you tried with a 3rd server?
blublub
24th January 2020, 12:56
Not yet.
One encode with these settings has 60 to 80% usage. Since a 2nd instance only marginally improves the usage, I doubt a 3rd encode will help. There seems to be some sort of bottleneck.
Atak_Snajpera
24th January 2020, 13:03
Run my benchmark and see if you get fps at least similar to what is in my database.
https://i.postimg.cc/KYgqT9Cf/Untitled-1.png
blublub
24th January 2020, 13:45
Can u post an alternate download link?
I already tried to download it 2 days ago but the website did not generate a link.
Atak_Snajpera
24th January 2020, 13:55
Maybe you should disable adblock? Change/update browser? Works fine here. Just checked.
blublub
24th January 2020, 13:59
K, I'll try tonight
blublub
24th January 2020, 14:30
Run my benchmark and see if you get fps at least similar to what is in my database.
https://i.postimg.cc/KYgqT9Cf/Untitled-1.png
Score is 108