Old 22nd August 2018, 23:09   #6301  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by RieGo View Post
yes my feeling was that it might be slower with pmode, but as i said i never actually did speed tests, just looked at cpu usage lol. my bad...
maybe it's a good idea for me to just remove it.
but going to slower is not an option (for me) - slow -> slower almost increases encoding time 100%
Yeah, it is quite likely that pmode is slowing you down a bunch and turning it off could get you that 100% speed back.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 23rd August 2018, 00:27   #6302  |  Link
vidschlub
Registered User
 
Join Date: May 2016
Posts: 20
I'm having a discussion with someone online and they are citing an early 2016 discussion about 264 vs 265.
Does anyone know if there's a much more recent comparison of 264 to 265?

My assumption is, by now, with the correct settings used in the encoder, 265 should basically provide a superior image at the same bitrate, almost always (until the returns diminish at very high bitrates)
Surely, that is now the case?
Old 23rd August 2018, 05:41   #6303  |  Link
alex1399
Registered User
 
Join Date: Jun 2018
Posts: 56
H.264 is still the grain king if you don't mind the blocking it has.
Old 23rd August 2018, 06:42   #6304  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,988
Quote:
Originally Posted by alex1399 View Post
H.264 is still the grain king if you don't mind the blocking it has.
Unless we're talking 4K, particularly for HDR.

If you can't afford archival level bitrates, HEVC is dramatically better in almost every case, especially at high resolution.
Old 23rd August 2018, 08:20   #6305  |  Link
Forteen88
Herr
 
Join Date: Apr 2009
Location: North Europe
Posts: 556
Quote:
Originally Posted by vidschlub View Post
I'm having a discussion with someone online and they are citing an early 2016 discussion about 264 vs 265.
Does anyone know if there's a much more recent comparison of 264 to 265?
Video Codecs Comparison 2017,
http://www.compression.ru/video/codec_comparison/hevc_2017/


Although they probably didn't compare 10-bit x265 vs 8-bit x264 (10-bit H.264 isn't supported by hardware decoders!).
EDIT: OK, an Nvidia GeForce 950/960 or better PC GPU supports full H.265/HEVC 10-bit hardware decoding, but it doesn't support H.264 10-bit hardware decoding.

Last edited by Forteen88; 23rd August 2018 at 14:47.
Old 23rd August 2018, 12:47   #6306  |  Link
NikosD
Registered User
 
Join Date: Aug 2010
Location: Athens, Greece
Posts: 2,901
Quote:
Originally Posted by Forteen88 View Post
Although they probably didn't compare 10-bit x265 vs 8-bit x264 (10-bit H.264 isn't supported by hardware decoders!).
I think that mobile SoCs include HW decoding of 10-bit H.264.

Probably Smart TVs, too.
__________________
Win 10 x64 (19042.572) - Core i5-2400 - Radeon RX 470 (20.10.1)
HEVC decoding benchmarks
H.264 DXVA Benchmarks for all
Old 23rd August 2018, 12:56   #6307  |  Link
microchip8
ffx264/ffhevc author
 
Join Date: May 2007
Location: /dev/video0
Posts: 1,843
Quote:
Originally Posted by NikosD View Post
I think that mobile SoCs include HW decoding of 10-bit H.264.

Probably Smart TVs, too.
I have 2 recent Smart TVs (Samsung and Panasonic) and 3 blu-ray players (2 from Samsung and 1 from LG). The Samsung BD players are UHD models

None of the devices above support 10-bit H.264 decoding
__________________
ffx264 || ffhevc || ffxvid || microenc
Old 23rd August 2018, 14:47   #6308  |  Link
excellentswordfight
Lost my old account :(
 
Join Date: Jul 2017
Posts: 322
Quote:
Originally Posted by vidschlub View Post
I'm having a discussion with someone online and they are citing an early 2016 discussion about 264 vs 265.
Does anyone know if there's a much more recent comparison of 264 to 265?

My assumption is, by now, with the correct settings used in the encoder, 265 should basically provide a superior image at the same bitrate, almost always (until the returns diminish at very high bitrates)
Surely, that is now the case?
I would say yes, if speed is not considered. But when tuning x265 to be as fast as x264 it falls behind imo.

Most of the tests I've done have been in the "rip" category. I found that x265 --preset slow --no-sao --crf 18 has very similar fidelity to x264 --preset slower --tune film --crf 18 for 1080p Blu-ray re-encoding, with a 20-30% bitrate reduction. Most tests I've done have been on Tears of Steel, which is a pretty good source for "general" film content imo, but there could ofc be sources where these numbers don't apply at all (though it has been the case on a few other random Blu-rays I've tested as well).
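
A rough sketch of that comparison, for anyone who wants to repeat it. It only builds the two command lines described above and computes the relative bitrate saving from the measured results. The file names and the example bitrates are placeholders (assumptions), not numbers from the post.

```python
import shlex

def x265_cmd(src, dst, preset="slow", crf=18, sao=False):
    """Build the x265 command line as an argument list."""
    cmd = ["x265", "--preset", preset, "--crf", str(crf)]
    if not sao:
        cmd.append("--no-sao")  # disable sample adaptive offset
    cmd += ["--input", src, "--output", dst]
    return cmd

def x264_cmd(src, dst, preset="slower", tune="film", crf=18):
    """Build the matching x264 command line as an argument list."""
    return ["x264", "--preset", preset, "--tune", tune,
            "--crf", str(crf), "--output", dst, src]

def bitrate_reduction(ref_kbps, test_kbps):
    """Percentage by which test_kbps is lower than ref_kbps."""
    return 100.0 * (ref_kbps - test_kbps) / ref_kbps

if __name__ == "__main__":
    # Hypothetical file names; pass the lists to subprocess.run to encode.
    print(shlex.join(x265_cmd("source.y4m", "out.hevc")))
    print(shlex.join(x264_cmd("source.y4m", "out.264")))
    # Placeholder bitrates, only to show the arithmetic.
    print(bitrate_reduction(10000, 7500))
```

Comparing bitrates only makes sense once the two encodes have been judged visually similar, since CRF 18 does not mean the same thing in both encoders.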

Last edited by excellentswordfight; 23rd August 2018 at 14:52.
Old 23rd August 2018, 20:04   #6309  |  Link
Forteen88
Herr
 
Join Date: Apr 2009
Location: North Europe
Posts: 556
I hope that they finish the Video Codecs Comparison 2018 on that website soon. They've released an Express Report 2018, but it doesn't include "Ultra Ripping: Comparison on extremely slow presets" yet,
http://www.compression.ru/video/code...son/hevc_2018/
Old 25th August 2018, 16:13   #6310  |  Link
Przemek_Sperling
Registered User
 
Join Date: Jun 2009
Location: Poland
Posts: 125
Quote:
Originally Posted by froggy1 View Post
I have 2 recent Smart TVs (Samsung and Panasonic) and 3 blu-ray players (2 from Samsung and 1 from LG). The Samsung BD players are UHD models

None of the devices above support 10-bit H.264 decoding
Weird, I own a cheap set-top box (Opticum Sloth Combo Plus) and it decodes 10-bit H.264 as well as 12-bit H.265. Maybe that's because of its chipset (Sunplus 1507). Most such devices have Ali chipsets, and maybe those cannot decode such material.
Old 25th August 2018, 17:44   #6311  |  Link
microchip8
ffx264/ffhevc author
 
Join Date: May 2007
Location: /dev/video0
Posts: 1,843
Quote:
Originally Posted by Przemek_Sperling View Post
Weird, I own a cheap set-top box (Opticum Sloth Combo Plus) and it decodes 10-bit H.264 as well as 12-bit H.265. Maybe that's because of its chipset (Sunplus 1507). Most such devices have Ali chipsets, and maybe those cannot decode such material.
I don't know the chipsets of my devices, but you are most likely correct. That said, there are quite a few devices (TVs, BD players & co) that don't support 10-bit H.264. My Samsung TV, however, supports decoding of 10-bit HEVC but not 10-bit H.264. I haven't tested 12-bit HEVC on it yet.
Old 26th August 2018, 16:33   #6312  |  Link
singhkays
Registered User
 
Join Date: Aug 2018
Posts: 18
Quote:
Originally Posted by excellentswordfight View Post
Using --no-sao for a tune film is imo valid. In my experience no-sao improves fine detail a lot with almost no negative effects for general "film" content at lower crf values. Preset slow together with no-sao is imo enough for detail retention nowadays. Not sure which setting does it, but I find preset medium to be way softer than preset slow (imo there should only be a bitrate difference between them when doing a CRF encode, but it doesn't work like that I guess).

I have found sao to be useful for both animation and low bitrate content though (as expected).


To add to this, I see around 70-80% utilization on dual Xeon E5-2680 v3 (48t) systems for 2160p content using preset slow. Imo that is a very reasonable amount of multithread performance. For 1080p I wouldn't bother with anything more than 8-12C. Start using chunk encoding if better multithread utilization is needed.

But I still think Atak's question is valid: does the 2990WX need any NUMA tweaking to perform correctly?
Quote:
Originally Posted by FranceBB View Post
Not just 64-thread CPUs: at work I have two Intel Xeon E5-2660 v4 (14c/28t each) for a total of 28c/56t, and I still can't saturate both CPUs with 2160p 10-bit HDR10 content encoded with preset medium and Blu-ray compatible specs.


Some consumers are moving to AMD, but the majority of businesses are using Intel Xeon CPUs (my company included), so that's what they ask optimizations for.
They are simply following the market needs, nothing more.
I recently did some investigations around x265 scaling with 128 cores. You might be interested in the results https://www.singhkays.com/blog/x265-...-hdr-azure-vm/



Quote:
Originally Posted by Atak_Snajpera View Post
No it is not too low. Dual socket (2 NUMA) Intel Xeon E5-4660 v3 (56 threads total) still scales much better than single socket (4 NUMA) 2990WX.
It would probably scale even better if I set numa pools manually.

According to x265 documentation ( https://x265.readthedocs.io/en/default/threading.html )

Can somebody verify that I'm setting numa pools correctly in my previous post?
See my investigation above. The CPU details are in the blog post. Not sure if I can help you verify something.
Old 26th August 2018, 17:39   #6313  |  Link
Atak_Snajpera
RipBot264 author
 
Join Date: May 2006
Location: Poland
Posts: 7,806
4k veryslow is the best case scenario for core utilization. 1080p with default medium preset would require at least 8 concurrent encodes to saturate all those 128 cores.

PS: I'm not surprised that 2160p scales up to 32 cores. If we divide 2160 by the default CTU size of 64, we get 33.75.
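
The arithmetic can be sketched as below. It assumes that the number of CTU rows is what limits row-level parallelism within a single frame, and it rounds up because a partial row at the bottom of the picture still has to be coded as a row. The function name is made up for illustration.

```python
import math

def ctu_rows(height, ctu_size=64):
    """Number of CTU rows in a frame; a partial last row counts as a full row."""
    return math.ceil(height / ctu_size)

if __name__ == "__main__":
    for height in (1080, 2160):
        print(height, height / 64, ctu_rows(height))
```

For 2160p this gives 33.75, i.e. 34 rows, and for 1080p 16.875, i.e. 17 rows, which fits the observation that a single 1080p encode has far less row-level work to spread around than a 2160p one.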

Last edited by Atak_Snajpera; 26th August 2018 at 17:50.
Old 27th August 2018, 08:38   #6314  |  Link
K.i.N.G
Registered User
 
Join Date: Aug 2009
Posts: 90
Quote:
Originally Posted by froggy1 View Post
I have 2 recent Smart TVs (Samsung and Panasonic) and 3 blu-ray players (2 from Samsung and 1 from LG). The Samsung BD players are UHD models

None of the devices above support 10-bit H.264 decoding
my nvidia shields, sony led tv (4k non-hdr) and lg oled tv (4k hdr) all decode it just fine, even my phone (samsung s5) plays them.
Old 27th August 2018, 10:15   #6315  |  Link
microchip8
ffx264/ffhevc author
 
Join Date: May 2007
Location: /dev/video0
Posts: 1,843
Quote:
Originally Posted by K.i.N.G View Post
my nvidia shields, sony led tv (4k non-hdr) and lg oled tv (4k hdr) all decode it just fine, even my phone (samsung s5) plays them.
With the exception of the two Samsung BD players, all my other devices are Full HD only. I must have bad luck because none can decode 10-bit H.264, including the Samsung BD players, which say "file unsupported" when I try to feed them 10-bit H.264.

That said, it's not important for me since I moved over to 10-bit HEVC, which is decodable on the Samsung BD players *and* my Full HD Samsung TV. My Panasonic TV (also FHD) doesn't support it, so I use one of the BD players to decode and feed it. I also prefer using the BD players to stream to my TVs as I can use bitstream passthrough for the audio, which I can't when using the TVs directly (they internally convert it to AC3; I use Toslink to feed audio from the TVs to a Yamaha receiver, and neither Toslink nor ARC supports lossless audio passthrough).

Last edited by microchip8; 27th August 2018 at 10:17.
Old 28th August 2018, 18:05   #6316  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by Atak_Snajpera View Post
4k veryslow is the best case scenario for core utilization. 1080p with default medium preset would require at least 8 concurrent encodes to saturate all those 128 cores.

PS: I'm not surprised that 2160p scales up to 32 cores. If we divide 2160 by the default CTU size of 64, we get 33.75.
There is a 3 CTU lag in frame parallelism in x265, however. Multithreading performance tuning in x265 is a pretty complex matter. I find this invaluable:

https://x265.readthedocs.io/en/default/threading.html

(x265 has the best documentation of any codec, ever!)
Old 28th August 2018, 18:38   #6317  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by singhkays View Post
I recently did some investigations around x265 scaling with 128 cores. You might be interested in the results https://www.singhkays.com/blog/x265-...-hdr-azure-vm/





See my investigation above. The CPU details are in the blog post. Not sure if I can help you verify something.
Interesting data.

I would expect that, running multiple instances on multiple sockets like that, your 4x performance would be better if you used --pools to lock each instance to one socket to improve cache coherency and reduce NUMA utilization. Ala --pools "+,-,-,-" to lock to just the first socket of four.
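
A small sketch of how those --pools strings could be generated, one per instance, assuming one x265 process per NUMA node. The helper names, file names, and the four-node count are placeholders for illustration; only the "+"/"-" per-node syntax comes from the suggestion above.

```python
import shlex

def pools_for_node(node, num_nodes):
    """Return a --pools value that enables only the given NUMA node."""
    if not 0 <= node < num_nodes:
        raise ValueError("node must be in range(num_nodes)")
    return ",".join("+" if i == node else "-" for i in range(num_nodes))

def pinned_cmd(src, dst, node, num_nodes, preset="slow"):
    """Build an x265 command line restricted to one NUMA node."""
    return ["x265", "--pools", pools_for_node(node, num_nodes),
            "--preset", preset, "--input", src, "--output", dst]

if __name__ == "__main__":
    # Hypothetical chunk names: one instance per node on a four-node machine.
    for node in range(4):
        cmd = pinned_cmd(f"chunk{node}.y4m", f"chunk{node}.hevc", node, 4)
        print(shlex.join(cmd))
```

Each printed line can be started as its own process, so that every instance keeps its threads and memory on a single node.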
Old 28th August 2018, 22:11   #6318  |  Link
singhkays
Registered User
 
Join Date: Aug 2018
Posts: 18
Quote:
Originally Posted by benwaggoner View Post
Interesting data.

I would expect that, running multiple instances on multiple sockets like that, your 4x performance would be better if you used --pools to lock each instance to one socket to improve cache coherency and reduce NUMA utilization. Ala --pools "+,-,-,-" to lock to just the first socket of four.
Thanks! I'm doing a follow up based on the comment below on the blog, so I'll include the above optimization as well. Are there other optimizations you'd like to see?

Quote:
I'm confused about using HEVC as the input file codec -- this will force decode delay, and unless it is full-intra, it will be multi-frame decode which will churn memory and processing just to get a decoded frame into the encode pipe.

A commercial application of this would be to encode a feature-length mezzanine, say an uncompressed MXF with a bit-depth and raster-size matching the output. Such a file would test raw encoding capability in high-cpu environments.
Old 29th August 2018, 00:41   #6319  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,750
Quote:
Originally Posted by singhkays View Post
Thanks! I'm doing a follow up based on the comment below on the blog, so I'll include the above optimization as well. Are there other optimizations you'd like to see?
That's the only one that popped out. Changing --pools will reduce the number of logical cores available and thus will also reduce the default --frame-threads and anything else that is based on core count, but that should happen automatically.

If you really want to stress single-instance encoding across all those sockets, try --pmode, or maybe even --pme if that doesn't saturate things. Those both increase CPU utilization more than they increase speed, but I bet you'd get more net speed out of a single instance with four cores with --pmode.

If you are looking to add more work for the encoder to do, both --cu-lossless and --tskip will help.
Old 4th September 2018, 08:20   #6320  |  Link
RainyDog
Registered User
 
Join Date: May 2009
Posts: 184
Unfortunately my PC's just crashed during the 2nd pass of a 2-pass encode.

If I run a 1-pass ABR encode using/reading the stats file that was generated during the 1st pass of the encode that crashed, am I right in assuming the result will be the same as if the 2nd pass had completed?
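
For reference, a sketch of what re-running only the second pass might look like with the x265 CLI. It assumes the first-pass stats file survived the crash intact and that the same bitrate and settings as the original run are reused. The file names and the bitrate are placeholders, and the sketch says nothing about whether the output is bit-identical to what the crashed run would have produced.

```python
import shlex

def second_pass_cmd(src, dst, bitrate_kbps, stats_path, extra_args=()):
    """Build an x265 command line that runs pass 2 against an existing stats file."""
    return ["x265", "--pass", "2", "--stats", stats_path,
            "--bitrate", str(bitrate_kbps), *extra_args,
            "--input", src, "--output", dst]

if __name__ == "__main__":
    # Placeholder values; reuse whatever the crashed run was started with.
    cmd = second_pass_cmd("source.y4m", "out.hevc", 4000, "first_pass.log",
                          extra_args=["--preset", "slow"])
    print(shlex.join(cmd))
```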

Thanks.