View Full Version : Which processor to encode x265 4K ?



D3X
14th February 2022, 14:09
Is there a possibility to render x265 on the GPU?
Cuda would be very nice. I don't want a lecture on why it would be bad,
just wondering if anyone has that bitbucket.org CUDA on GPU x265 repo or so.
Thanks!

D3X
14th February 2022, 14:10
https://bitbucket.org/vovagubin/x265-hevc-opencl-or-cuda-encoder

RanmaCanada
14th February 2022, 17:45
Is there a possibility to render x265 on the GPU?
Cuda would be very nice. I don't want a lecture on why it would be bad,
just wondering if anyone has that bitbucket.org CUDA on GPU x265 repo or so.
Thanks!

If you're going to use a GPU to encode, just use Quick Sync. It will be far faster and the quality will be pretty darn close, especially with the new ASICs in Alder Lake.

RanmaCanada
14th February 2022, 17:51
Hi :)
Now there is a new Intel gen, what is the best choice to encode 4K ? I'm still on a B450M + 3700X combo but I would like to know which one between 5900X or i9 12900 would be the best choice if I upgrade ? The 5900X could be used on the B450M platform. Or perhaps a 3900X used if price is good.
Thanks !

The 12900 beats a 5900X (https://www.techpowerup.com/review/intel-core-i9-12900k-alder-lake-12th-gen/14.html), albeit with power usage out the wazoo (https://www.techpowerup.com/review/intel-core-i9-12900k-alder-lake-12th-gen/20.html). But then you also get access to Quick Sync. If the cost of your electricity is extremely high, i.e. you live in Europe, the 12900 is a really bad choice for software encoding, as it uses almost twice the power of a 5900X when encoding.

Nico8583
14th February 2022, 19:56
Thanks, I live in Europe... :rolleyes:

D3X
15th February 2022, 03:28
If you're going to use a GPU to encode, just use Quick Sync. It will be far faster and the quality will be pretty darn close, especially with the new ASICs in Alder Lake.

Okay, let's pretend I'm a total newb with the QS commands.
What would you recommend?
I have a RTX 3070 Ti

tormento
15th February 2022, 12:51
Or wait for a discrete Intel GPU with QuickSync.

asarian
15th February 2022, 13:40
Definitely the i9 12900K, hands down. It is almost twice as fast as the i7 10700K I (briefly) had before when it comes to x265 encoding.

SquallMX
16th February 2022, 17:32
Assuming around 2 fps (I hope) with a Ryzen 9 5900X, here is my reasoning; please tell me if it is correct.
One second of film contains about 24 frames and the processor encodes 2 frames per second, so the ratio is 1:12 and a movie with a duration of 2 hours needs 24 hours of processing. Is that correct?
If necessary, is it possible to split the source file into several parts, do the processing at different times, and then join the results at the end?
Thank you

I can confirm that. With a 5900X without OC you get 2.xx-3.xx fps depending on the movie content; grainy ones are usually the slowest. --preset medium with a few tweaks gets you 5.xx-6.xx fps without noticeable differences.

And no, Hardware Encoding (QS / nVidia NVENC) is not in the same ballpark quality-wise.
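
For anyone who wants to redo the arithmetic from the question above with their own numbers, here is a minimal Python sketch. The function name and its parameters are made up for illustration; the 24 fps source rate and the 2 fps / 4 fps encode speeds are just the figures mentioned in this thread, not measurements.

```python
def encode_hours(duration_hours: float, source_fps: float, encode_fps: float) -> float:
    """Estimate wall-clock hours to encode a title.

    duration_hours: running time of the source
    source_fps:     frame rate of the source (about 24 for film)
    encode_fps:     average speed the encoder reports
    """
    total_frames = duration_hours * 3600 * source_fps
    return total_frames / encode_fps / 3600


# A 2-hour, 24 fps movie at the encoder speeds discussed in the thread.
for speed in (2.0, 4.0):
    print(f"{speed:.1f} fps -> {encode_hours(2, 24, speed):.1f} h")
```

Real encodes will drift from this estimate because fps varies with content (grainy scenes are slower), so treat the result as a rough planning figure.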

RanmaCanada
16th February 2022, 23:38
Okay, let's pretend I'm a total newb with the QS commands.
What would you recommend?
I have a RTX 3070 Ti
I can't give you any commands that will work with QS, as your 3070 Ti doesn't have it. It has NVENC, which uses totally different ASICs.

D3X
17th February 2022, 10:47
I can't give you any commands that will work with QS, as your 3070 Ti doesn't have it. It has NVENC, which uses totally different ASICs.

That's why I desperately need that BitBucket repo I linked to earlier. :)

excellentswordfight
17th February 2022, 13:27
That's why I desperately need that BitBucket repo I linked to earlier. :)
Well, that didn't use NVENC/ASICs. If I remember correctly, it was just an experimental fork of x265 that offloaded some work using OpenCL/CUDA and never got any real development, afaik. Even if you find it, it will be based on a very old x265 version that could give you some unexpected results.

This is an NVENC-based encoder: https://github.com/rigaya/NVEnc/releases

Definitely the i9 12900K, hands down. It is almost twice as fast as the i7 10700K I (briefly) had before when it comes to x265 encoding.
Although the 12900K is indeed very fast, it's worth keeping in mind what RanmaCanada wrote. When the Alder Lake models go above about 125W, efficiency goes out the window, and if you take efficiency into account the 12700K is a much more compelling option for encoding, as you probably want to keep PL1 rather conservative anyway. When my 12700K drops from PL2 (~200W) to PL1 (125W) after about 60s, the decrease in performance is nowhere near the drop in power consumption.

The 5900X is probably still the king of consumer CPUs for x264/x265 encoding once you factor in efficiency. But I would say that the 12700K is still a very valid option. I would only go for the 12900K if pure speed is pretty much the only criterion.

Nico8583
17th February 2022, 15:34
Thanks all for feedback about 12700K / 12900K / 5900X

Selur
18th February 2022, 10:14
if I remembered correctly it was just some experimental fork of x265 that offloaded some work utilizing opencl/cuda that never got any real development afaik.
I agree, as far as I know there never was any usable code in it.

benwaggoner
18th February 2022, 18:17
I agree, as far as I know there never was any usable code in it.
Yeah, I think it had a maximum theoretical speed boost of only 20%, and there was lower-hanging, more portable fruit for optimization. Analysis save/load, reusing data from an existing H.264 encode, etc. can all deliver bigger performance boosts. And using a HW encoder as input to refine is pretty similar conceptually, and can work on most GPUs that would support sufficiently performant CUDA or OpenCL.

speedy
23rd February 2022, 23:04
Is there somewhere I can go to see how my i9-9900K compares with newer processors for 4K x265 encoding or does anyone here have data on this? Thanks!

Blue_MiSfit
24th February 2022, 01:20
I have a 9900k at work and a Ryzen 9 5900x at home (12 core). I'll do a quick test with Tears of Steel 4k :)

Both have pretty powerful AIO water coolers so they can maintain decent boost clocks.

RanmaCanada
24th February 2022, 02:16
Is there somewhere I can go to see how my i9-9900K compares with newer processors for 4K x265 encoding or does anyone here have data on this? Thanks!

It's a little faster than a 2700X (https://www.techpowerup.com/review/intel-core-i9-9900k/6.html), slower than a 3900X (https://www.techpowerup.com/review/amd-ryzen-9-3900x/13.html), and roughly half the speed of a 5900X (https://www.techpowerup.com/review/amd-ryzen-9-5900x/14.html). Those results are at 1080p, so 4K would of course be "about" the same.

You can also always run the old benchmark to see how you stack up in "real world" terms.

https://forum.doom9.org/showthread.php?t=174393

Blue_MiSfit
24th February 2022, 03:26
Ok, so I did preset slower and crf 20 (everything else at defaults). This was using a recent ffmpeg build with x265 3.5. The source is the 4k SDR 4:2:0 8 bit ~80 Mbps H.264 version available from the Tears of Steel site.

- The 9900k gets about 0.8 fps
- The 5900x gets about 1.4 fps.

Looks like the better IPC and 12 cores of the Ryzen 9 (vs 8 cores on the 9900K) compensate well for the lower clock speed.

The 9900K can sustain a 4.68 GHz all-core turbo whereas the 5900X hovers around 4.2 GHz. Both systems have beefy water coolers, the Corsair H115i if memory serves...

CPU usage was 100% in both cases.
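
For anyone who wants to repeat this test, here is a rough Python sketch that builds an ffmpeg command line with the settings described above (libx265, preset slower, CRF 20, everything else at defaults). The file names and the helper function are placeholders invented for this example, and this is not necessarily the exact command that was run.

```python
import shlex


def x265_benchmark_cmd(source: str, output: str, preset: str = "slower", crf: int = 20) -> list:
    """Build an ffmpeg argument list for a libx265 test encode."""
    return [
        "ffmpeg",
        "-i", source,        # input file (hypothetical name below)
        "-c:v", "libx265",   # encode video with x265 via ffmpeg
        "-preset", preset,   # x265 speed/efficiency preset
        "-crf", str(crf),    # constant rate factor
        output,
    ]


# Print the command so it can be inspected or pasted into a shell.
cmd = x265_benchmark_cmd("tears_of_steel_4k.mov", "tos_x265_slower_crf20.mkv")
print(shlex.join(cmd))
```

ffmpeg prints the running fps while encoding, which is the number to compare between machines. Keep the source, build and settings identical or the results are not comparable.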

excellentswordfight
24th February 2022, 15:21
Ok, so I did preset slower and crf 20 (everything else at defaults). This was using a recent ffmpeg build with x265 3.5. The source is the ~80 Mbps H.264 version available from the Tears of Steel site.

- The 9900k gets about 0.8 fps
- The 5900x gets about 1.4 fps.

Looks like its better IPC and 12 cores vs 8 cores offsets the lower clock speed. The 9900k can sustain 4.68 GHz all-core turbo whereas the 5900x can hover around 4.2.

CPU usage was 100% in both cases.
One should be careful with comparisons that have so many different variables, though. I get about 1.4 fps when not power limited and 1.2 fps @ 125W on a 12700K with preset slower for ToS, and that seems to line up with the x265 results TechPowerUp publishes when compared to your numbers.

Although I still don't think consumer CPUs are fast enough for preset 'slower' to make sense for UHD encoding for private use, we are definitely starting to get usable performance with 'slow' now (~4 fps, so about 12h encoding time for a 2h title).

speedy
24th February 2022, 15:53
That's impressive. Looks like I'm due for an upgrade soon. I'm sure I can find a home for my 9900K in another computer.

speedy
9th March 2022, 22:56
I just finished upgrading my 9900K to a 12900K and am now wondering how to get the most out of it while encoding with x265.
Any advice?
Thanks!

RanmaCanada
10th March 2022, 16:24
I just finished upgrading my 9900K to a 12900K and am now wondering how to get the most out of it while encoding with x265.
Any advice?
Thanks!

If you're not seeing 100% utilization then use Ripbot to divvy up the job and do it in chunks?

speedy
18th March 2022, 20:26
Does anyone else here have experience with x265 encoding on Intel 12th Gen Alder Lake CPUs?

I'm running into issues with my encodes being moved to the E-cores (Efficiency cores) when I disconnect from my Windows user session. I typically RDP (Remote Desktop Protocol) into the encoding server to check on and set up my encodes, but after disconnecting it seems like the encode workloads are moved to the E-cores. If I stay logged in, the encodes run on the P-cores (Performance cores). I've been able to work around this for now by using Chrome Remote Desktop, which makes Windows think I'm logged into the machine locally, so when I disconnect it doesn't realize I'm no longer "remoted in".

Encodes seem to take 4 times longer running on the E-cores so I'd really like to figure this out.

P.S. I'm using Handbrake 1.5.1

mastrboy
20th March 2022, 01:06
Does anyone else here have experience with x265 encoding on Intel 12th Gen Alder Lake CPUs?

I'm running into issues with my encodes being moved to the E-cores (Efficiency cores) when I disconnect from my Windows user session. I typically RDP (Remote Desktop Protocol) into the encoding server to check on and set up my encodes, but after disconnecting it seems like the encode workloads are moved to the E-cores. If I stay logged in, the encodes run on the P-cores (Performance cores). I've been able to work around this for now by using Chrome Remote Desktop, which makes Windows think I'm logged into the machine locally, so when I disconnect it doesn't realize I'm no longer "remoted in".

Encodes seem to take 4 times longer running on the E-cores so I'd really like to figure this out.

P.S. I'm using Handbrake 1.5.1

Not really a solution, but maybe a decent workaround is using Process Lasso to pin the cores: https://bitsum.com

RanmaCanada
20th March 2022, 20:12
Does anyone else here have experience with x265 encoding on Intel 12th Gen Alder Lake CPUs?

I'm running into issues with my encodes being moved to the E-cores (Efficiency cores) when I disconnect from my Windows user session. I typically RDP (Remote Desktop Protocol) into the encoding server to check on and set up my encodes, but after disconnecting it seems like the encode workloads are moved to the E-cores. If I stay logged in, the encodes run on the P-cores (Performance cores). I've been able to work around this for now by using Chrome Remote Desktop, which makes Windows think I'm logged into the machine locally, so when I disconnect it doesn't realize I'm no longer "remoted in".

Encodes seem to take 4 times longer running on the E-cores so I'd really like to figure this out.

P.S. I'm using Handbrake 1.5.1

I would try other programs like staxrip, hybrid, fastflix, xmedia recode, etc, to see if it's program related or windows related.

benwaggoner
23rd March 2022, 03:49
If you're not seeing 100% utilization then use Ripbot to divvy up the job and do it in chunks?
Does Ripbot properly use --chunk-start and --chunk-end to avoid quality issues at stitch points?

Ala https://patentimages.storage.googleapis.com/29/05/b6/ed3668747b35b0/US10863179.pdf
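
To make the chunking idea concrete, here is a small Python sketch that only works out frame ranges for splitting a title into chunks. It is not how Ripbot does it. The function name, the 1-based inclusive numbering and the 10-minute chunk size are all assumptions for illustration, and how these ranges should map onto --chunk-start and --chunk-end (numbering base, overlap frames fed in for lookahead) needs to be checked against the x265 documentation.

```python
def split_into_chunks(total_frames: int, chunk_frames: int) -> list:
    """Return (first_frame, last_frame) pairs covering 1..total_frames.

    Frame numbers are 1-based and inclusive here; that convention is an
    assumption of this sketch, not something taken from x265.
    """
    if total_frames <= 0 or chunk_frames <= 0:
        raise ValueError("total_frames and chunk_frames must be positive")
    chunks = []
    start = 1
    while start <= total_frames:
        end = min(start + chunk_frames - 1, total_frames)
        chunks.append((start, end))
        start = end + 1
    return chunks


# Example: a 2-hour title at 24 fps, split into 10-minute chunks.
total = 2 * 3600 * 24
for first, last in split_into_chunks(total, 10 * 60 * 24):
    print(first, last)
```

The point of the chunk flags is that each worker can see frames outside its own range for lookahead and rate decisions, which is what avoids visible quality steps at the stitch points. A naive split that hands each worker only its own frames would not give that.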

Blue_MiSfit
24th March 2022, 02:19
Nice patent, Ben ;)

RanmaCanada
25th March 2022, 05:13
Does Ripbot properly use --chunk-start and --chunk-end to avoid quality issues at stitch points?

Ala https://patentimages.storage.googleapis.com/29/05/b6/ed3668747b35b0/US10863179.pdf

I believe it does but Atak_Snajpera would be the best person to answer this question.

speedy
13th April 2022, 03:23
Not really a solution, but maybe a decent workaround is using Process Lasso to pin the cores: https://bitsum.com

I would try other programs like staxrip, hybrid, fastflix, xmedia recode, etc, to see if it's program related or windows related.

I ended up figuring out that you need to run these two commands in Windows for Intel 12th Gen (Alder Lake) CPUs:
POWERCFG /POWERTHROTTLING DISABLE /PATH "c:\Program Files\HandBrake\HandBrake.Worker.exe"
POWERCFG /POWERTHROTTLING DISABLE /PATH "c:\Program Files\HandBrake\HandBrake.exe"
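
If you use more than one encoder front end, the same pattern can be generated for a list of executables. This is only a sketch: it prints the commands instead of running them, the helper name is made up, and only the two HandBrake paths come from this post. Any other path you add is your own assumption about where the program is installed.

```python
import subprocess


def powerthrottling_cmds(exe_paths):
    """Build one 'powercfg /powerthrottling disable /path ...' command per executable."""
    return [["powercfg", "/powerthrottling", "disable", "/path", p] for p in exe_paths]


# Paths taken from the two commands above; add your own encoder executables here.
ENCODER_EXES = [
    r"c:\Program Files\HandBrake\HandBrake.Worker.exe",
    r"c:\Program Files\HandBrake\HandBrake.exe",
]

for cmd in powerthrottling_cmds(ENCODER_EXES):
    # list2cmdline quotes arguments containing spaces the way cmd.exe expects.
    print(subprocess.list2cmdline(cmd))
```

Paste the printed lines into a command prompt on the encoding machine to apply them.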

DMD
11th October 2022, 13:55
Good morning.
The i9-13900K should debut within days. Where could I find an x265 encoding comparison against the Ryzen 9 7950X?
Thanks

excellentswordfight
11th October 2022, 14:10
Good morning.
The i9-13900K should debut within days. Where could I find an x265 encoding comparison against the Ryzen 9 7950X?
Thanks
Of the big review sites, I think TechPowerUp does the best x265 test (using preset slow and crf 20), so I would check there once it's reviewed.

https://www.techpowerup.com/review/amd-ryzen-9-7950x/17.html

tormento
12th October 2022, 15:32
I'd really like to see some Alder Lake or other Intel AVX 512 vs Zen 4 with AVX512 results.

Anyone? :)

rwill
12th October 2022, 16:58
I'd really like to see some Alder Lake or other Intel AVX 512 vs Zen 4 with AVX512 results.

Anyone? :)

Alder Lake does not support AVX 512 ?

tormento
12th October 2022, 22:43
Alder Lake does not support AVX 512 ?
Early versions do, with the E-cores disabled.

RanmaCanada
13th October 2022, 05:03
I'd really like to see some Alder Lake or other Intel AVX 512 vs Zen 4 with AVX512 results.

Anyone? :)

You won't be able to get it as Intel in their wisdom has decided end users do not deserve to have access to it. Very few chips support it, and those that did will also have it disabled if users update their BIOS, with no means to flash back, as per Intel's instructions to manufacturers.

There is no point in the benchmark as the chips are unobtainable at this time.

excellentswordfight
13th October 2022, 11:14
Early versions do, with the E-cores disabled.
In that case AVX512 will be slower; you lose about 30% performance by disabling the E-cores for x265 encoding on i7/i9. I very much doubt that AVX512 gives that kind of performance gain; it never even compensated for the frequency loss in all the AVX512 tests I've done on Xeons. On something like an i9 that is already pushed power-wise, if it can keep its frequency under AVX512, power consumption must go through the roof.

For Zen 4, in the tests I've seen, the AVX512 implementation seems rather good from a power perspective, but it doesn't seem to result in that much gain for x265.

RanmaCanada
15th October 2022, 05:37
In that case AVX512 will be slower; you lose about 30% performance by disabling the E-cores for x265 encoding on i7/i9. I very much doubt that AVX512 gives that kind of performance gain; it never even compensated for the frequency loss in all the AVX512 tests I've done on Xeons. On something like an i9 that is already pushed power-wise, if it can keep its frequency under AVX512, power consumption must go through the roof.

For Zen 4, in the tests I've seen, the AVX512 implementation seems rather good from a power perspective, but it doesn't seem to result in that much gain for x265.

In 2018 Intel released a white paper (https://www.intel.com/content/www/us/en/developer/articles/technical/accelerating-x265-with-intel-advanced-vector-extensions-512-intel-avx-512.html) that touted the speed increase AVX512 would give processors in encoding HEVC. It's funny how they not only took this away from consumers, but it didn't generate the speed increases they tried to market.

tormento
15th October 2022, 10:16
In that case AVX512 will be slower; you lose about 30% performance by disabling the E-cores for x265 encoding on i7/i9.
Watch der8auer's video on YouTube about Alder Lake speed in AVX512 :)

benwaggoner
16th October 2022, 22:23
In 2018 Intel released a white paper (https://www.intel.com/content/www/us/en/developer/articles/technical/accelerating-x265-with-intel-advanced-vector-extensions-512-intel-avx-512.html) that touted the speed increase AVX512 would give processors in encoding HEVC. It's funny how they not only took this away from consumers, but it didn't generate the speed increases they tried to market.
The AVX512 instruction set is quite powerful, and really does improve performance per clock with x265, particularly at higher resolutions. Intel's problem is that using AVX512 quickly lowers the clock speed a lot.

The instruction set will probably become as essential as AVX2 eventually, as thermal issues get addressed. AVX2 had much the same problem when it first launched, which was much improved in the next major revision.

But it's not like the typical consumer would miss AVX2 instructions either for the most part. It's going to be much more valuable for workstations and compute-optimized servers.

frencher
17th October 2022, 01:28
Hello all,

AMD Threadripper 3990X or Intel i5 13600k for 4K encoding ?
What would be the frame rate for the i5 13600k in 4k encoding ?
A result under x265 FHD Benchmark ?

Thanks

excellentswordfight
17th October 2022, 08:48
Watch der8auer's video on YouTube about Alder Lake speed in AVX512 :)
I have, and I don't remember him testing x265. Just because one load gets a huge speedup from AVX512 doesn't mean another does. As someone who has benchmarked AVX512 on Xeon, I can tell you that the speedup from AVX512 in x265 is not big.

Here is a x265 benchmark for avx512 on Zen4
https://i.ibb.co/qCJy7hL/x265.jpg

And in case you would just assume that Zen 4 has bad AVX512 performance in general, here is one with 3DPM:
https://images.anandtech.com/graphs/graph17585/130235.png

benwaggoner
18th October 2022, 19:31
MCW's guidance some years back was that AVX512 was generally a speed regression on Intel CPUs of that era for anything below 4K --preset veryslow.

I've got some 8K --preset placebo test scripts that I can try with AVX512 on/off when I get back from my work trip this Friday.

benwaggoner
18th October 2022, 19:50
I have, and I don't remember him testing x265. Just because one load gets a huge speedup from AVX512 doesn't mean another does. As someone who has benchmarked AVX512 on Xeon, I can tell you that the speedup from AVX512 in x265 is not big.

Here is a x265 benchmark for avx512 on Zen4
Do you know what resolution was being encoded with what preset? AVX512 should help more with more pixels and with more complex presets.

excellentswordfight
19th October 2022, 08:44
Do you know what resolution was being encoded with what preset? AVX512 should help more with more pixels and with more complex presets.
No, but judging from the fps it's either 1080p at a slow preset, or 2160p at maybe medium? But as the speed is really dependent on source complexity, it's hard to say.

What's interesting is that Zen 4 doesn't suffer from downclocking. It seems like Zen 4 has a very different implementation of AVX512, which makes it much more power efficient.

"But rather going for a 512-bit FPU data path and the possibility of reduced clock frequencies and power/thermal concerns, they employed a 256-bit "double pumping" strategy."

"Here is a look at the CPU peak frequency across the entire span of benchmarks tested... No real change compared to without AVX-512"

"Likewise, the CPU power consumption was similar when running the AVX-512 enabled software. For nearly all the tests the AVX2 vs. AVX-512 results were almost identical"

https://www.phoronix.com/review/amd-zen4-avx512/6

This is the speed I got on Xeon the last time I did a test (ToS 2160p @ preset slow):

Intel Xeon "Cascade Lake Refresh" 2x6226R 16c/32T 150W MSRP 1300USD (each).
avx256: 4,85fps
avx512: 4,49fps

AMD EPYC "ROME" 7502P 32c/64t 180W MSRP 2300USD
avx256: 6,26fps

Unfortunately we don't have any Ice Lake-SP models I can test on, as we have started to switch to Epyc. I would love to see a comparison between the Xeon Gold 6314U and the EPYC 7543P.
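
To put the figures above into relative terms, here is a tiny Python sketch. The fps values are the ones listed in this post; the helper name is made up for the example.

```python
def percent_change(baseline_fps: float, candidate_fps: float) -> float:
    """Relative speed difference of candidate vs baseline, in percent."""
    return (candidate_fps - baseline_fps) / baseline_fps * 100.0


# Results from the post: ToS 2160p @ preset slow.
XEON_AVX256 = 4.85   # 2x Xeon 6226R, AVX512 off
XEON_AVX512 = 4.49   # 2x Xeon 6226R, AVX512 on
EPYC_AVX256 = 6.26   # EPYC 7502P

print(f"Xeon, AVX512 on vs off: {percent_change(XEON_AVX256, XEON_AVX512):+.1f}%")
print(f"EPYC vs Xeon (AVX512 off): {percent_change(XEON_AVX256, EPYC_AVX256):+.1f}%")
```

So on that Xeon pair, enabling AVX512 cost roughly 7% in this test rather than gaining anything.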

excellentswordfight
20th October 2022, 15:16
Good morning.
The i9-13900K should debut within days. Where could I find an x265 encoding comparison against the Ryzen 9 7950X?
Thanks
https://tpucdn.com/review/intel-core-i9-13900k/images/encode-h265.png

That's a win for the 7950X. The 13900K needs close to 400W to be able to push in front of it, and at stock it's still a bit more power hungry than the 7950X.

benwaggoner
20th October 2022, 19:39
https://tpucdn.com/review/intel-core-i9-13900k/images/encode-h265.png

That's a win for the 7950X. The 13900K needs close to 400W to be able to push in front of it, and at stock it's still a bit more power hungry than the 7950X.
Did you try --avx512 on the 13900? I'm curious how it might have improved the thermal throttling behaviors to increase throughput.

Zebulon84
20th October 2022, 20:29
Did you try --avx512 on the 13900?
If I understand correctly intel i9-13900K specifications (https://www.intel.com/content/www/us/en/products/sku/230496/intel-core-i913900k-processor-36m-cache-up-to-5-80-ghz/specifications.html#specs-1-0-7), the Instruction Set Extensions include AVX2 but not AVX 512.

benwaggoner
20th October 2022, 22:29
If I understand correctly intel i9-13900K specifications (https://www.intel.com/content/www/us/en/products/sku/230496/intel-core-i913900k-processor-36m-cache-up-to-5-80-ghz/specifications.html#specs-1-0-7), the Instruction Set Extensions include AVX2 but not AVX 512.
Oh, right.

When are the 13th gen equivalent Xeons coming out? Those should still have AVX512.

excellentswordfight
21st October 2022, 06:46
Oh, right.

When are the 13th gen equivalent Xeons coming out? Those should still have AVX512.
Sapphire Rapids is the code name for the next-gen Xeon SP, the follow-up to Ice Lake-SP. It will be built on the same node and have the Golden Cove cores found in 12th-gen "Core" processors (which are pretty much the same as 13th gen). I think it's been delayed again to Q1/Q2 next year. It will also use MCM (multi-chip module), so core count in the top models should increase by a lot.

4th-gen Epyc, Genoa, is due for release soon as well and will also feature AVX512. And given that Zen 4 on Ryzen doesn't downclock at all, that will be rather interesting as well.