Doom9's Forum |
3rd September 2017, 19:01 | #1 | Link |
Registered User
Join Date: Sep 2007
Posts: 24
|
x265 encoding on arm64
Hi,
I'm on aarch64/arm64/armv8l and trying to figure out how to transcode video (MPEG-1, MPEG-2, AVC) to H.265 with Opus audio. The H.265 encoder built into FFmpeg on Debian squeezy does not currently support multicore encoding (surprisingly, the H.264 one does), and in the current version FFmpeg's Opus encoder also appears broken. So I'd need to use ffmpeg to decode the video and feed it directly to the x265 encoder, then decode the audio and create an Opus file with opusenc, and finally mux the two streams. Can someone advise me how to do this? I'd have grabbed Handbrake a long time ago, but it's not available for arm64 yet either. Thanks |
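For reference, the three-step workflow asked about here could look roughly like the sketch below. It is not a tested recipe: the file names are placeholders, the preset and bitrates are borrowed from later posts in this thread, and mkvmerge (from MKVToolNix) is my own pick for the muxing step, nobody in the thread names a muxer. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch only: file names are placeholders, nothing here was run on arm64.
IN="${IN:-input.mpg}"      # hypothetical source file (no spaces in the name)
DRY_RUN="${DRY_RUN:-1}"    # 1 = just print the commands, 0 = execute them

# 1) ffmpeg decodes the video and pipes Y4M frames into the x265 CLI.
VIDEO_CMD="ffmpeg -i $IN -an -pix_fmt yuv420p -f yuv4mpegpipe - | x265 --y4m --input - --preset slower --bitrate 400 -o video.hevc"
# 2) ffmpeg decodes the audio to WAV on stdout, opusenc reads it from stdin.
AUDIO_CMD="ffmpeg -i $IN -vn -f wav - | opusenc --bitrate 64 - audio.opus"
# 3) Mux both elementary streams (mkvmerge is an assumed choice of muxer).
MUX_CMD="mkvmerge -o output.mkv video.hevc audio.opus"

for cmd in "$VIDEO_CMD" "$AUDIO_CMD" "$MUX_CMD"; do
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$cmd"
    else
        sh -c "$cmd" || exit 1
    fi
done
```

Run it once as is to check the printed commands, then with DRY_RUN=0 and IN set to the real file. Going through the standalone x265 binary sidesteps the FFmpeg build question, at the cost of two intermediate files.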
4th September 2017, 12:28 | #5 | Link |
Registered User
Join Date: Dec 2014
Posts: 40
|
x265 on arm64, even with NEON, is slower by several orders of magnitude than x86.
Your best bet is to have a SoC with an HEVC hardware encoder and use that. To actually answer your question: you need to build FFmpeg linked against libx265 and libopus. That way, you can encode streams with a command roughly like this: Code:
ffmpeg -i <input> -c:v libx265 -c:a libopus <output> |
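Before trying that command, it may be worth checking that the ffmpeg binary at hand really was built with both libraries. A small sketch, assuming the usual layout of `ffmpeg -encoders` output (flags in the first column, encoder name in the second); `has_encoder` is a helper name made up for this example:

```shell
# has_encoder NAME: reads an encoder list on stdin and succeeds if NAME
# shows up in the second column.
has_encoder() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}

for enc in libx265 libopus; do
    if ffmpeg -hide_banner -encoders 2>/dev/null | has_encoder "$enc"; then
        echo "$enc: available"
    else
        echo "$enc: missing (ffmpeg not installed or built without it)"
    fi
done
```

If either one is reported missing, the command above will fail with an unknown encoder error and FFmpeg has to be rebuilt or replaced.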
4th September 2017, 13:12 | #6 | Link |
Registered User
Join Date: Sep 2007
Posts: 24
|
I don't know, really....
x265 on arm64 runs at 0.35-0.45fps on the 'slower' preset for 352x272 MPEG-2 video. When encoding the same file on an i7-3820QM downclocked to 2.2GHz I get 3.4fps. But this is most likely because x265 on arm64 is not multithreaded: with multithreading as efficient as on x86-64, the performance would be on par, or much closer, certainly not orders of magnitude slower. Last edited by wqcr; 4th September 2017 at 13:16. |
5th September 2017, 06:03 | #8 | Link | |
Registered User
Join Date: Sep 2007
Posts: 24
|
Quote:
Have you actually tried it before assuming all ARM SoCs are just too weak to handle this? I did, and not only does the SoC not throttle after days and days of 100% load (all 8 cores), its performance is also equivalent to an i5-3340M with HT at 3.2GHz (Geekbench3 and Opus decoding speed). It has no difficulty with x264, even multithreaded. By all means it's not only suitable for x265 encoding, it should even be preferred due to its vastly superior power efficiency, several magnitudes better than even the latest generation of Kaby Lake. The only drawback comes from the current implementation of x265, which on arm64 supposedly neither uses NEON nor is multithreaded. Still, with manual parallelization 2.5+fps is possible, which isn't that far from the above result measured on the quad-core i7. Last edited by wqcr; 5th September 2017 at 06:09. |
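The post does not say how the manual parallelization was done. One common way, shown here purely as an assumption, is to run one single-threaded encoder process per clip, with as many processes in flight as there are cores. The clip names are placeholders, the ffmpeg options follow the command from post #5 plus the settings mentioned later in the thread, and `xargs -P` is a GNU/BSD extension rather than strict POSIX.

```shell
# Sketch: one encoder process per clip, up to JOBS at a time.
JOBS="${JOBS:-8}"                 # assumed: one job per core of an 8-core SoC
ENCODE="${ENCODE:-echo ffmpeg}"   # dry run by default; set ENCODE=ffmpeg to encode

encode_all() {
    # Clip names arrive on stdin, one per line; each output is <clip>.mkv.
    xargs -P "$JOBS" -I{} $ENCODE -i {} -c:v libx265 -preset slower -b:v 400k -c:a libopus -b:a 64k {}.mkv
}

printf '%s\n' clip1.mp4 clip2.mp4 clip3.mp4 | encode_all
```

This only helps when there are several clips (or one clip cut into segments beforehand), and the aggregate fps is then the sum over all running processes.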
|
9th September 2017, 08:04 | #10 | Link | |
Guest
Posts: n/a
|
Quote:
We have some limited ARM Neon optimization (x265\source\common\arm), but this is not anywhere near as complete as our x86 SIMD optimization. We've had discussions with various people at various times about doing a full optimization effort, but as of today this hasn't bubbled up to the top of the priority list for our customers or our strategic hardware partners. Of course, x265 is open source, and contributions are always welcomed. |
|
10th September 2017, 23:29 | #11 | Link |
Registered User
Join Date: Jan 2007
Posts: 729
|
What SoC/CPU are you using? "arm64" could mean anything from an awfully broad spectrum of slow to reasonably fast chips. A Cortex-A53 is quite different from a higher-end out-of-order core, or even from the architectures Apple implemented. Clocks matter, number of cores too, etc.
|
30th September 2017, 11:32 | #12 | Link |
Registered User
Join Date: Sep 2007
Posts: 24
|
Encode finished in just under 78 hours: 7 h264 clips, 41 minutes long each at 352x272, encoder preset "slower", bitrate-based at 400kbps, 1-pass, audio 64k Opus.
I'm quite satisfied with the result. The CPU throttled a little with all its cores loaded, but only 10-20% under extreme conditions (35°C ambient). 2-pass encoding seems to be broken though: even with the correct params, log files were not created. If you really want to know, this is the system used for encoding:
Redmi Note 4 Global version
CPU - Snapdragon 625, A53 octa-core at 2.02GHz
RAM - 4GB LPDDR3-1600
Setup - Rooted AOSGP X 2.11 on Android 7.1.1, Debian Stretch running through chroot, using hotspot mode in conjunction with sshd and a VNC server to operate the machine remotely
Typical power consumption - 3.7W on full load
Last edited by wqcr; 30th September 2017 at 11:44. |
16th October 2017, 14:14 | #14 | Link |
Registered User
Join Date: Sep 2007
Posts: 24
|
Another encode finished, this time 2-pass, PAL source.
Preset slower, fast first pass, video bitrate 400, audio bitrate 64 (Opus). 7 clips (each 7 minutes long) were finished in 22 hours. This time I used Slimrom, which by default disables any thermal throttling, so the CPU was at 2016 MHz all the time. Tcase was just under 74°C, and the phone's outer case was no more than 48°C. Power consumption jumped to 4W, so a beefier 5V/2A source had to be used. Still, I'm again very satisfied with the result, even though x265 hasn't used any arm64 optimizations. Performance is more or less directly comparable to a similarly clocked C2Q, except that would run at 12 times the power consumption of this little SoC. |
17th October 2017, 15:01 | #15 | Link |
Registered User
Join Date: Jan 2010
Posts: 709
|
7 x 7 min each = 49 min
If those are PAL, that's usually 720x576 at 25fps, so about 385kPx/s. That looks a bit too slow to me for a 2.0GHz C2Q; also, it's useless to compare the efficiency of a 10-year-old CPU with a new one. My RPi3 reaches 200kPx/s (1.2GHz armv6 + NEON, maybe I have to compile it with more appropriate flags)
__________________
powered by Google Translator Last edited by Motenai Yoda; 17th October 2017 at 15:09. |
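The 385kPx/s figure above can be checked with a few lines of shell arithmetic. This is just a sketch of that calculation: `px_rate` is a name made up here, the 720x576 at 25fps source format is the poster's own guess, and the shell is assumed to do 64-bit integer arithmetic.

```shell
# px_rate WIDTH HEIGHT FPS CLIP_MINUTES ENCODE_HOURS
# Prints encoder throughput in pixels per second (integer, rounded down).
px_rate() {
    pixels=$(( $1 * $2 * $3 * $4 * 60 ))   # total pixels in the source
    seconds=$(( $5 * 3600 ))               # wall-clock encode time
    echo $(( pixels / seconds ))
}

px_rate 720 576 25 49 22    # 49 min of assumed full PAL in 22 h -> 384872
```

That is about 385kPx/s, matching the post. The same function works for the earlier 352x272 encode once the source frame rate is known, which the thread does not state.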
12th November 2017, 19:03 | #16 | Link |
Registered User
Join Date: Mar 2004
Posts: 1,157
|
Would be nice if someone with one of these Qualcomm 2400 ARM chips could do some x264 and x265 encode tests: https://blog.cloudflare.com/arm-takes-wing/
|
30th December 2020, 15:50 | #17 | Link |
Registered User
Join Date: Oct 2001
Posts: 465
|
Since I stumbled across some mentions of x265 on the M1 around the web over the past few days, I wanted to see if there are any in-depth benchmarks of the M1 Handbrake/ffmpeg HEVC variants.
Found some interesting stuff: https://www.reddit.com/r/hardware/co..._i_benched_it/ https://www.youtube.com/watch?v=iGVK...ature=youtu.be Does anyone have some more comparisons? Looking at the very low power a Mac Mini draws, and given the price, it looks like the M1 chip could be a very interesting option for x265 HEVC encodes if you don't need "the fastest around" but solid performance for desktop/hobby use cases. Thinking of powering this with solar |
30th December 2020, 18:08 | #18 | Link | |
Registered User
Join Date: Dec 2008
Posts: 416
|
Quote:
Last edited by nakTT; 30th December 2020 at 18:21. |
|
31st December 2020, 00:14 | #20 | Link |
Derek Prestegard IRL
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,997
|
Maybe too slow for you, but ARM is rapidly becoming more and more prevalent as ARM chips can occupy an interesting quadrant on the power / speed curve. I imagine with thorough assembly optimization a modern ARM server CPU could outperform a modern x86_64 CPU in terms of efficiency.
If this wasn't the case we probably wouldn't see AWS, Apple, and Microsoft all investing in their own ARM silicon. Granted, HEVC compression is a very specific use case |
|
|