View Full Version : x265 HEVC Encoder
Romario
30th April 2017, 00:02
We plan to optimize x265 with AVX-512 instructions, as soon as possible.
What speed gains are possible, when all is finished?
uneedme
30th April 2017, 17:50
Hi all
I am a little confused: is the lambda table applied automatically in version 2.4? Is there no need to reference the CSV files using --lambda-file?
If the parameter table (CSV file) has been optimized, where can I find the latest one?
Cheers
LigH
30th April 2017, 18:03
See: commit 94d59c3 (2017-04-13) (https://bitbucket.org/multicoreware/x265/commits/94d59c325e975888e4f7b152cc90b4199d9d24c4)
The lambda tables are not in separate CSV files, but in source/common/constants.cpp (you would have to extract them to separate CSV files if you wanted to switch them "on demand").
I don't know which syntax they have to use if provided as CSV. One value per line?
x265_Project
30th April 2017, 18:52
I am a little confused: is the lambda table applied automatically in version 2.4? Is there no need to reference the CSV files using --lambda-file?
Correct. v2.4 incorporates the new lambda tables.
x265_Project
30th April 2017, 18:55
See: commit 94d59c3 (2017-04-13) (https://bitbucket.org/multicoreware/x265/commits/94d59c325e975888e4f7b152cc90b4199d9d24c4)
The lambda tables are not in separate CSV files, but in source/common/constants.cpp (you would have to extract them to separate CSV files if you wanted to switch them "on demand").
I don't know which syntax they have to use if provided as CSV. One value per line?
Yes. Externally referenced lambda tables have one value per line. There are 70 entries for lambda, and 70 more for lambda2. Blank lines are ignored, so I usually insert a blank line in between lambda and lambda2.
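As a rough illustration of the layout described above (one value per line, 70 lambda entries followed by 70 lambda2 entries, blank lines ignored), here is a minimal Python sketch that writes and re-reads such a file. The function names are hypothetical, and the numbers are placeholders, NOT the real x265 tables, which live in source/common/constants.cpp.

```python
ENTRIES_PER_TABLE = 70  # per the post above: 70 lambda values, then 70 lambda2 values


def format_lambda_file(lambda_values, lambda2_values):
    """Return file text with one value per line and a blank line between the tables."""
    if len(lambda_values) != ENTRIES_PER_TABLE or len(lambda2_values) != ENTRIES_PER_TABLE:
        raise ValueError("each table needs exactly 70 entries")
    lines = [repr(v) for v in lambda_values] + [""] + [repr(v) for v in lambda2_values]
    return "\n".join(lines) + "\n"


def parse_lambda_file(text):
    """Read the values back, skipping blank lines, and split them into the two tables."""
    values = [float(line) for line in text.splitlines() if line.strip()]
    if len(values) != 2 * ENTRIES_PER_TABLE:
        raise ValueError("expected 140 values in total")
    return values[:ENTRIES_PER_TABLE], values[ENTRIES_PER_TABLE:]


# Placeholder data only (an assumption for the demo, not real encoder tables).
demo_lambda = [0.25 * (i + 1) for i in range(ENTRIES_PER_TABLE)]
demo_lambda2 = [v * v for v in demo_lambda]

demo_text = format_lambda_file(demo_lambda, demo_lambda2)
parsed_lambda, parsed_lambda2 = parse_lambda_file(demo_text)
```

The blank separator line is only there for readability; the parser drops it, just as the post says the encoder ignores blank lines.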
brumsky
1st May 2017, 16:18
What speed gains are possible, when all is finished?
I too would like to know this.
x265_Project
1st May 2017, 19:31
What speed gains are possible, when all is finished?
Speedup from SIMD instructions like AVX-512 varies a lot, depending on the machine you're using (and # of cores/threads you're using), the x265 preset you're running, and the picture size of the video. For AVX-2, we saw as much as a 53% speedup on a Haswell generation server (E5-2699 v3) for 4K Ultrafast at CRF 22, and as little as 0% speedup for medium preset (increasing again to ~ 10% speedup for veryslow preset).
You can easily test this yourself on any Haswell or later generation Intel chip, simply by running a job once as usual, and then again (after your system has cooled down) with --asm avx, which will turn off AVX-2 optimization. To get valid results you would need to be testing only x265 and not FFMPEG (so, encode from a YUV file to an HEVC bitstream, ideally from and to a RAMdisk to eliminate I/O bottlenecks as a factor).
We expect even bigger gains on Purley generation (E5 v5) Xeons and Skylake Extreme Edition chips due to the higher internal memory bandwidth and many other CPU improvements.
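The A/B test described above can be scripted roughly as follows. This is only a sketch: the file names, resolution and timings are made-up placeholders, and the helper names are hypothetical. It builds the two command lines (the second adds --asm avx to switch off AVX2) and converts two measured wall-clock times into a speedup percentage.

```python
def build_x265_command(source, output, resolution, fps, preset, crf, asm=None):
    """Build an x265 command line for a raw YUV input; asm=None keeps the default SIMD level."""
    command = [
        "x265", "--input", source, "--input-res", resolution, "--fps", str(fps),
        "--preset", preset, "--crf", str(crf), "-o", output,
    ]
    if asm is not None:
        command += ["--asm", asm]  # e.g. "avx" caps SIMD at AVX, disabling AVX2 code paths
    return command


def speedup_percent(baseline_seconds, optimized_seconds):
    """Speedup of the optimized run relative to the baseline run, in percent."""
    return (baseline_seconds / optimized_seconds - 1.0) * 100.0


# Placeholder paths on a RAM disk (assumptions for the demo).
with_avx2 = build_x265_command("R:/clip.yuv", "R:/avx2.hevc", "3840x2160", 24, "ultrafast", 22)
without_avx2 = build_x265_command("R:/clip.yuv", "R:/avx.hevc", "3840x2160", 24, "ultrafast", 22, asm="avx")

print(" ".join(with_avx2))
print(" ".join(without_avx2))

# Made-up timings: 153 s without AVX2 versus 100 s with AVX2.
example_speedup = speedup_percent(153.0, 100.0)
```

Each command would be run on its own (for example with subprocess.run and a timer around it), letting the machine cool down between runs as suggested above.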
Romario
1st May 2017, 19:35
Speedup from SIMD instructions like AVX-512 varies a lot, depending on the machine you're using (and # of cores/threads you're using), the x265 preset you're running, and the picture size of the video. For AVX-2, we saw as much as a 53% speedup on a Haswell generation server (E5-2699 v3) for 4K Ultrafast at CRF 22, and as little as 0% speedup for medium preset (increasing again to ~ 10% speedup for veryslow preset).
You can easily test this yourself on any Haswell or later generation Intel chip, simply by running a job once as usual, and then again (after your system has cooled down) with --asm avx, which will turn off AVX-2 optimization. To get valid results you would need to be testing only x265 and not FFMPEG (so, encode from a YUV file to an HEVC bitstream, ideally from and to a RAMdisk to eliminate I/O bottlenecks as a factor).
We expect even bigger gains on Purley generation (E5 v5) Xeons and Skylake Extreme Edition chips due to the higher internal memory bandwidth and many other CPU improvements.
OK, thank you. And do you know in what time frame AVX-512 support will be finished?
For x265 version 2.6?
NikosD
1st May 2017, 19:43
We plan to optimize x265 with AVX-512 instructions, as soon as possible.
And what about optimizations for the Ryzen architecture (not SIMD specifically)?
x265_Project
1st May 2017, 19:54
Ok, thank you. And do you know in which time frame will AVX-512 support be finished?
For x265 2.6 Version?
I can't say at this point. We're actively trying to figure out how to get it done as soon as possible, but it's a big project.
x265_Project
1st May 2017, 19:57
And what about RyZen architecture (not SIMD specifically) optimizations?
As with all platforms, we're always reaching out to hardware companies to find ways to optimize performance. There is nothing I can update you on (with respect to Ryzen) at this time.
NikosD
1st May 2017, 20:00
As with all platforms, we're always reaching out to hardware companies to find ways to optimize performance. There is nothing I can update you on (with respect to Ryzen) at this time.
Interesting.
No help from AMD ?
Seems strange to me, since they are trying hard to be close to developers especially now, since Zen is a very new and promising architecture.
Sagittaire
1st May 2017, 21:34
To get valid results you would need to be testing only x265 and not FFMPEG (so, encode from a YUV file to an HEVC bitstream, ideally from and to a RAMdisk to eliminate I/O bottlenecks as a factor).
Hmm... if you use ffmpeg as a frameserver, decoding is far faster than encoding when you use the medium (or a slower) preset.
ffmpeg\ffmpeg.exe -i Sample\Exodus_UHD_HDR_Exodus_draft.mp4 -an -f rawvideo - | x265\x265.exe --input-res 3840x2160 --fps 23.976 - -o Output\x265_2160p.265 --input-depth 10 --output-depth 10 --crf 24 --preset medium --tune grain --bframes 3 --min-keyint 1 --qcomp 0.75 --ssim --psnr
With that, decoding the HEVC stream takes less than 5% of the CPU load, without I/O bottlenecks.
x265_Project
1st May 2017, 21:40
Interesting.
No help from AMD ?
I didn't say that.
You have to understand that our conversations with our chip partners are confidential. They wouldn't want me to characterize or elaborate on the extent of their support or cooperation, or in some cases, the lack thereof. When we have some improvements or progress to report, we'll let you know. From our perspective, you can be sure that we want more performance on every platform, and we're doing everything possible to get it.
x265_Project
1st May 2017, 21:44
Hmm... if you use ffmpeg as a frameserver, decoding is far faster than encoding when you use the medium (or a slower) preset.
With that, decoding the HEVC stream takes less than 5% of the CPU load, without I/O bottlenecks.
Performance measurement is a tricky thing. You don't just have the CPU % to consider. You also have to consider data availability (cache poisoning) and data dependencies (x265 waiting for decoded frames from FFMPEG, for example). When you're trying to measure the effect of SIMD optimization on a particular workload, you want to make sure that's the only workload running. We do this at both the kernel level and the full x265 library level.
NikosD
2nd May 2017, 13:53
I didn't say that.
You have to understand that our conversations with our chip partners are confidential. They wouldn't want me to characterize or elaborate on the extent of their support or cooperation, or in some cases, the lack thereof. When we have some improvements or progress to report, we'll let you know. From our perspective, you can be sure that we want more performance on every platform, and we're doing everything possible to get it.
Thank you for your reply.
I'm always here to help.
Agner's optimization manuals, now including Ryzen, have just been released here:
http://www.agner.org/optimize/optimization_manuals.zip
The low-level performance of Ryzen in instruction latency/throughput is impressive: most of the time it looks on par with or even faster than Intel in IPC, something we haven't seen so far, obviously due to other limitations.
The long awaited:
x265 2.4+6-fd01abfc7898 (https://www.mediafire.com/file/qskcb7il5jdrot7/x265_2.4%2B6-fd01abfc7898.7z) (merge with stable) contains an additional multi-lib EXE with Dynamic HDR10 enabled.
jlpsvk
3rd May 2017, 09:16
Can I use the multilib EXE with DHDR enabled for encoding non-HDR and normal HDR?
Of course. Just don't specify DynamicHDR10 control files.
I'm liking the new 10-bit lambda tables, thank you x265 developers! So far, I'm finding that a reduction of 0.3 CRF is appropriate when targeting a similar size/quality to the old lambda tables.
--crf 20 --preset slow --tune grain --profile main10 --no-strong-intra-smoothing --deblock -3:0 --rskip --ctu 32
While CRF 20.3 is good enough for many sources, I choose CRF 20 when I want better quality. For my eyes, anything worse than CRF 20.3 when combined with --tune grain is not worth it.
Could someone please explain why encoding an 8-bit AVC source as 10-bit x265 is higher quality than 8-bit x265? It boggles my mind and I can't help but question whether I've misunderstood the intention behind certain comments in this thread.
I suspect my last comment is only visible to moderators for some reason since it never appeared in the thread, yet benwaggoner replied (https://forum.doom9.org/showthread.php?p=1802854#post1802854) to it. The quoted text is not the full comment. For context, I'm tuning for optimal file size without encoding much slower than the command line above, given a quality constraint.
Modern video encoders don't store the pixels in a video frame; they transform parts of it into a spectrum of visual frequencies and store their parameters. An encoder with 8-bit precision stores up to 256 different frequency parameter values. An encoder with 10-bit precision may store up to 1024 different values. Where an 8-bit encoder has to decide between n and n+1, a 10-bit encoder has 4 times the number of values available to store the frequency parameters as precisely as possible. Precision can be a trade-off: on one hand, more distinct numbers take more space; on the other hand, more precise encoding may leave smaller differences and lead to more efficient compression, especially with predicted (P, B) frames. Many people reported 10-bit x264 being quite efficient for cartoons, in comparison to 8-bit x264.
Re-coding one lossy format with another lossy format may look worse where the second encoder wastes bitrate on quantization errors the first encoder already left in the video frames. It's quite possible that one kind of quantization (similar to integer division while encoding, and multiplication while decoding) does not match the factor of the other, leaving occasional larger gaps in the "stair steps" between re-quantized parameters. Try it yourself: first calculate the closest multiple of, let's say, 5 for a range of natural numbers, and then the closest multiple of 7 to each result. You will notice that the steps get more irregular.
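The "multiples of 5, then multiples of 7" experiment can be run directly. The sketch below is a toy model of re-quantization, not anything an encoder actually does; the helper names are made up for the demo. It compares how many consecutive inputs land on each output step when rounding straight to multiples of 7 versus rounding to 5 first.

```python
from itertools import groupby


def nearest_multiple(value, step):
    """Round a non-negative integer to the nearest multiple of step (ties round up)."""
    return (2 * value + step) // (2 * step) * step


def run_lengths(outputs):
    """Collapse a sequence into (value, number of consecutive occurrences) pairs."""
    return [(key, len(list(group))) for key, group in groupby(outputs)]


inputs = range(0, 36)
direct = [nearest_multiple(v, 7) for v in inputs]
requantized = [nearest_multiple(nearest_multiple(v, 5), 7) for v in inputs]

print("direct:     ", run_lengths(direct))
print("requantized:", run_lengths(requantized))

# Worst-case distance between an input and its final quantized value.
max_error_direct = max(abs(v - q) for v, q in zip(inputs, direct))
max_error_requantized = max(abs(v - q) for v, q in zip(inputs, requantized))
```

Quantizing directly gives evenly sized steps (7 inputs per interior step), while the two-stage version alternates between wide and narrow steps (10 and 5 inputs) and has a larger worst-case error. That is the irregularity described above.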
Boulder
6th May 2017, 08:32
Many people reported 10-bit x264 being quite efficient for cartoons, in comparison to 8-bit x264.
In x264, the 10-bit encode was usually 10-15% smaller in my tests (CRF mode, of course). In x265, the 10-bit encode is larger than the 8-bit one. The 12-bit x265 encode is once again smaller than the others :)
nevcairiel
6th May 2017, 09:30
In x264, the 10-bit encode was usually 10-15% smaller in my tests (CRF mode, of course). In x265, the 10-bit encode is larger than the 8-bit one. The 12-bit x265 encode is once again smaller than the others :)
You can't really compare sizes of the same settings with different bitdepths, a comparison here isn't very meaningful. The encoder makes different decisions based on the bitdepth, so not only can the size vary, but the quality as well.
A better comparison would be to tweak the rate-control settings to generate the exact same file size and then compare quality, or tweak quality settings to generate the exact same measurable quality and then compare sizes. Both are, of course, somewhat involved processes, but it's the only way to draw any sort of conclusion.
Boulder
6th May 2017, 10:02
You can't really compare sizes of the same settings with different bitdepths, a comparison here isn't very meaningful. The encoder makes different decisions based on the bitdepth, so not only can the size vary, but the quality as well.
Hmm, I recall seeing a technical explanation as to why the 10-bit x264 encode is more efficient compression-wise than the 8-bit one. I was just wondering whether it's the same case here regarding the 12-bit x265 encode.
LoRd_MuldeR
6th May 2017, 16:24
Hmm, I recall seeing a technical explanation as to why the 10-bit x264 encode is more efficient compression-wise than the 8-bit one. I was just wondering whether it's the same case here regarding the 12-bit x265 encode.
Higher "internal" bit-depth of the codec (encoder/decoder) results in higher compression efficiency, because there are fewer rounding errors in all the various intermediate stages - that's regardless of whether you are encoding a true "high bit-depth" source or an 8-Bit source. Encoding at 8-Bit rather than 10-Bit (or 12-Bit) is merely a speed hack. Or, if you think about hardware codecs, a way to save some silicon. If speed weren't an issue, we could use 32-Bit right away.
Some people make the false assumption that encoding at 10-Bit/12-Bit requires a higher bit-rate than encoding at 8-Bit, because 10-Bit/12-Bit video contains more information (more LSBs) than 8-Bit video. But, when encoding at 10-Bit/12-Bit, you can compensate for that by using higher quantizers, thus retaining exactly as many "significant bits" as in an 8-Bit encode - and still get improved compression efficiency due to fewer rounding errors in the intermediate stages.
See also:
https://forum.doom9.org/showpost.php?p=1260621&postcount=19
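To put a rough number on "using higher quantizers": in H.264 and HEVC the quantizer step size approximately doubles every 6 QP, so dropping 2 extra bits (a factor of 4) corresponds to about +12 QP. The sketch below only illustrates that arithmetic. The formula is the usual approximation, the function names are made up, and encoders may already apply such an offset internally, so this is not a recipe for picking --crf or --qp values.

```python
import math


def approx_qstep(qp):
    """Approximate quantizer step size, assuming the step doubles every 6 QP."""
    return 2.0 ** ((qp - 4) / 6.0)


def equivalent_qp(qp_8bit, bit_depth):
    """QP whose step is 2**(bit_depth - 8) times larger than at qp_8bit (6 QP per extra bit)."""
    return qp_8bit + 6 * (bit_depth - 8)


qp_10bit = equivalent_qp(24, 10)
qp_12bit = equivalent_qp(24, 12)
step_ratio_10bit = approx_qstep(qp_10bit) / approx_qstep(24)
step_ratio_12bit = approx_qstep(qp_12bit) / approx_qstep(24)

print(qp_10bit, round(step_ratio_10bit, 3))
print(qp_12bit, round(step_ratio_12bit, 3))
```

With a step 4 times (10-bit) or 16 times (12-bit) larger, the quantized values keep about as many significant bits as the 8-bit encode, while the intermediate math still runs at the higher precision.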
When will --tune film be released?
nevcairiel
7th May 2017, 00:56
Hmm, I recall seeing a technical explanation as to why the 10-bit x264 encode is more efficient compression-wise than the 8-bit one. I was just wondering whether it's the same case here regarding the 12-bit x265 encode.
The exact same argument doesn't fully apply to H.265 anymore, since the internal precision was increased independently of the actual bitdepth. However, there are still some gains - just not as high as there were with H.264.
Midzuki
9th May 2017, 22:24
x265.exe 2.4+9-7ecd263f6d43
(GCC 7.1.0, Dynamic HDR10, multilib, x64)
https://forum.videohelp.com/threads/357754-%5BHEVC%5D-x265-EXE-mingw-builds?p=2485466#post2485466
Khun_Doug
10th May 2017, 06:34
When will --tune film be released?
There was some discussion some months back with folks picking their own favorite settings for something like tune=film. I haven't seen anything on that for some time now. I also was waiting, but then another discussion on tune=grain explained that grain makes internal adjustments above and beyond CLI controls. Some simple tests proved it, so I am a believer.
Here is what I have found seems to work well. First, I am interested in retaining high quality rather than getting small compressed files. Generally, I use 1 or 2 passes with grain and set my bitrate in the range of 11000 to 12000. I also try a 10-minute segment to see how it compresses and use that as a guide to my bitrate and choice of 1 pass or 2 pass. I also have a few instances of using CRF 20 and grain.
Essentially, I start the compress when I am finished with the machine for the evening and let it run over night.
There was some discussion some months back with folks picking their own favorite settings for something like tune=film. I haven't seen anything on that for some time now. I also was waiting, but then another discussion on tune=grain explained that grain makes internal adjustments above and beyond CLI controls. Some simple tests proved it, so I am a believer.
Here is what I have found seems to work well. First, I am interested in retaining high quality rather than getting small compressed files. Generally, I use 1 or 2 passes with grain and set my bitrate in the range of 11000 to 12000. I also try a 10-minute segment to see how it compresses and use that as a guide to my bitrate and choice of 1 pass or 2 pass. I also have a few instances of using CRF 20 and grain.
Essentially, I start the compress when I am finished with the machine for the evening and let it run over night.
I'm talking about small files, but even there, with x264 there is only a small difference from the large file, while with x265 the difference is bigger. Using normal x265 settings changes things a lot: tune grain is made for large file sizes, while the tune film suggested here is good but, I think, needs some improvements.
benwaggoner
12th May 2017, 17:04
You can't really compare sizes of the same settings with different bitdepths, a comparison here isn't very meaningful. The encoder makes different decisions based on the bitdepth, so not only can the size vary, but the quality as well.
A better comparison would be to tweak the rate-control settings to generate the exact same file size and then compare quality, or tweak quality settings to generate the exact same measurable quality and then compare sizes. Both are, of course, somewhat involved processes, but it's the only way to draw any sort of conclusion.
...plus "measurable quality" is very challenging to measure in a well subjectively correlated way, particularly when comparing different bit depths. Of the built-in x265 metrics, SSIM is the best, but still has lots of limitations. And the relevance of the mean of all frames is also questionable for content more than a few seconds long. A CBR and a VBR encode might have the same mean SSIM, but the CBR have wildly oscillating values and a much worse subjective viewer experience.
Comparing subjective quality at the same file size and parameters (bitrate, max-vbv, max-bufsize) is really the only reliable way to measure subtle quality differences.
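The point about mean SSIM hiding oscillation is easy to show with numbers. The per-frame SSIM values below are invented purely for illustration (they do not come from any real encode): both sequences have the same mean, yet one of them swings widely from frame to frame.

```python
from statistics import mean, pstdev

# Hypothetical per-frame SSIM values (assumptions for the demo, not measurements).
steady_ssim = [0.985, 0.984, 0.986, 0.985, 0.985, 0.985]
oscillating_ssim = [0.999, 0.971, 0.998, 0.972, 0.999, 0.971]


def summarize(ssim_values):
    """Return mean, worst frame and standard deviation for a list of per-frame SSIM values."""
    return {
        "mean": mean(ssim_values),
        "worst": min(ssim_values),
        "stdev": pstdev(ssim_values),
    }


steady_summary = summarize(steady_ssim)
oscillating_summary = summarize(oscillating_ssim)

print("steady:     ", steady_summary)
print("oscillating:", oscillating_summary)
```

Looking at the worst frame or the spread alongside the mean at least flags the oscillating case, although it is still no substitute for watching the two encodes side by side.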
x265 seems really powerful at a quantizer of ~25 with default settings. It's certainly the best quantizer range for making comparisons. 1000 kbps for a 2K source at q35 definitely has too low a visual quality.
Out of curiosity, I encoded a 2K version of Tears of Steel @ 2000 kbit/s; the quality is much better.
Command line:
f:\speed\tear>ffmpeg -i ../tearsofsteel-4k.y4m -pix_fmt yuv420p16 -vf "scale=1920:-4:flags=bicubic+accurate_rnd+full_chroma_int+full_chroma_inp:param0=-0.5:param1=0.25,setsar=1" -v warning -strict -1 -f yuv4mpegpipe - | x265 --y4m - --bitrate 2000 -p9 --deblock -1 --keyint 480 --multi-pass-opt-distortion -o p1m.hevc --pass 1
[yuv4mpegpipe @ 000000000046c9c0] Warning: generating non standard YUV stream. Mjpegtools will not work.
y4m [info]: 1920x804 fps 24/1 i420p16 sar 1:1 unknown frame count
raw [info]: output file: p1m.hevc
x265 [info]: HEVC encoder version 2.4+14-bc0e9bd7c08f
x265 [info]: build info [Windows][MSVC 1910][64 bit] 10bit+8bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [info]: Main 10 profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(13 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 4 inter / 4 intra
x265 [info]: ME / range / subpel / merge : star / 92 / 5 / 5
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 480 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 60 / 8 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1
x265 [info]: References / ref-limit cu / depth : 5 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : ABR-2000 kbps / 0.60
x265 [info]: tools: rect amp rd=6 psy-rd=2.00 rdoq=2 psy-rdoq=1.00 tskip
x265 [info]: tools: signhide tmvp b-intra strong-intra-smoothing
x265 [info]: tools: deblock(tC=-1:B=-1) sao stats-write
x265 [info]: frame I: 129, Avg QP:22.35 kb/s: 13618.64
x265 [info]: frame P: 3666, Avg QP:24.43 kb/s: 5747.81
x265 [info]: frame B: 13825, Avg QP:30.17 kb/s: 845.36
x265 [info]: Weighted P-Frames: Y:5.5% UV:4.3%
x265 [info]: Weighted B-Frames: Y:3.8% UV:2.6%
x265 [info]: consecutive B-frames: 10.7% 5.3% 7.8% 27.8% 16.8% 15.1% 4.2% 6.4% 6.0%
encoded 17620 frames in 81497.66s (0.22 fps), 1958.88 kb/s, Avg QP:28.92
f:\speed\tear>ffmpeg -i ../tearsofsteel-4k.y4m -pix_fmt yuv420p16 -vf "scale=1920:-4:flags=bicubic+accurate_rnd+full_chroma_int+full_chroma_inp:param0=-0.5:param1=0.25,setsar=1" -v warning -strict -1 -f yuv4mpegpipe - | x265 --y4m - --bitrate 2000 -p9 --deblock -1 --keyint 480 --multi-pass-opt-distortion -o p2m.hevc --pass 2
[yuv4mpegpipe @ 00000000004ec9c0] Warning: generating non standard YUV stream. Mjpegtools will not work.
y4m [info]: 1920x804 fps 24/1 i420p16 sar 1:1 unknown frame count
raw [info]: output file: p2m.hevc
x265 [info]: HEVC encoder version 2.4+14-bc0e9bd7c08f
x265 [info]: build info [Windows][MSVC 1910][64 bit] 10bit+8bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [info]: Main 10 profile, Level-4 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 2 / wpp(13 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 4 inter / 4 intra
x265 [info]: ME / range / subpel / merge : star / 92 / 5 / 5
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 480 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 60 / 8 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1
x265 [info]: References / ref-limit cu / depth : 5 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : ABR-2000 kbps / 0.60
x265 [info]: tools: rect amp rd=6 psy-rd=2.00 rdoq=2 psy-rdoq=1.00 tskip
x265 [info]: tools: signhide tmvp b-intra strong-intra-smoothing
x265 [info]: tools: deblock(tC=-1:B=-1) sao stats-read
x265 [info]: frame I: 129, Avg QP:20.66 kb/s: 15524.06
x265 [info]: frame P: 3666, Avg QP:25.10 kb/s: 5673.36
x265 [info]: frame B: 13825, Avg QP:30.80 kb/s: 897.89
x265 [info]: Weighted P-Frames: Y:2.2% UV:1.8%
x265 [info]: Weighted B-Frames: Y:0.9% UV:0.6%
x265 [info]: consecutive B-frames: 10.7% 5.3% 7.8% 27.8% 16.8% 15.1% 4.2% 6.4% 6.0%
encoded 17620 frames in 74910.62s (0.24 fps), 1998.55 kb/s, Avg QP:29.54
Video: www.msystem.waw.pl/x265/tears-2Mbps.mkv
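As a sanity check on how to read these logs, the overall bitrate on the "encoded ..." line is simply the frame-count-weighted mean of the per-frame-type bitrates. The sketch below redoes that arithmetic for the first pass using the numbers printed above; the variable names are just for the demo.

```python
# (frame type, frame count, kb/s) copied from the first-pass log above.
first_pass_stats = [
    ("I", 129, 13618.64),
    ("P", 3666, 5747.81),
    ("B", 13825, 845.36),
]


def weighted_bitrate(stats):
    """Frame-count-weighted average bitrate across frame types, in kb/s."""
    total_frames = sum(count for _, count, _ in stats)
    weighted_sum = sum(count * kbps for _, count, kbps in stats)
    return weighted_sum / total_frames


total_frames = sum(count for _, count, _ in first_pass_stats)
overall_kbps = weighted_bitrate(first_pass_stats)
frames_per_second = total_frames / 81497.66  # elapsed seconds reported by the first pass

print(total_frames, round(overall_kbps, 2), round(frames_per_second, 2))
```

The result agrees with the 17620 frames, 1958.88 kb/s and 0.22 fps reported by the encoder. It also shows how much of the budget goes to the relatively few I and P frames.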
Midzuki
15th May 2017, 17:47
x265.exe 2.4+22-c102c809fc4f
https://forum.videohelp.com/threads/357754-%5BHEVC%5D-x265-EXE-mingw-builds?p=2485922#post2485922
zub35
15th May 2017, 18:18
Please rename the existing preset "placebo" to "superslow" and add a new (true) preset "placebo":
+ bframes=16 ref=16 rc-lookahead=120 me=full subme=7 frame-threads=1
benwaggoner
15th May 2017, 18:40
Please rename the existing preset "placebo" to "superslow" and add a new preset "placebo":
+ bframes=16 ref=16 rc-lookahead=120 me=full subme=7 frame-threads=1
Well, if we are going FULL PLACEBO, we should also add:
--cu-lossless
--ref 6
--tskip
And for psychovisual optimization, --aq-mode 3 would likely help.
Also, this looks like just the first pass of what's intended to be a multipass encode. Presuming this is meant to be VBR, doing a second pass would improve quality significantly.
sneaker_ger
15th May 2017, 18:49
ref 16 is not even allowed in HEVC spec.
zub35
15th May 2017, 19:01
sneaker_ger: That's right, my mistake. As benwaggoner said, ref=6.
benwaggoner: Psychovisual optimization is an additional correction; it does not belong among the compression options of the speed presets. It is closer to the "tune" presets.
x265 2.4+22-c102c809fc4f (https://www.mediafire.com/file/i2aev4frrc9joa5/x265_2.4%2B22-c102c809fc4f.7z)
adds CTUInfo API (seems to be a debug feature, quite time consuming) and an AVX2 framework of integral functions to speed up SEA motion search.
pradeeprama
17th May 2017, 04:39
x265 2.4+22-c102c809fc4f (https://www.mediafire.com/file/i2aev4frrc9joa5/x265_2.4%2B22-c102c809fc4f.7z)
adds CTUInfo API (seems to be a debug feature, quite time consuming) and an AVX2 framework of integral functions to speed up SEA motion search.
The CTUInfo API is a new API introduced to enable applications that include the x265 library to provide additional information about certain locations in the video that have to be treated differently than the rest of the video. Through a combination of the CTUInfo API and the --ctu-info options, a partition shape can be forced on the region of interest passed via this API, and encoding parameters can be controlled as described in the online docs.
x265_Project
18th May 2017, 21:45
Please rename the existing preset "placebo" to "superslow" and add a new (true) preset "placebo":
+ bframes=16 ref=16 rc-lookahead=120 me=full subme=7 frame-threads=1
We look at our performance presets every so often, and we run a large batch of tests to determine the optimal combination of settings to achieve the best speed vs. compression efficiency trade-off for each of the 10 presets. I take responsibility for our preset strategy. The strategy for --preset placebo has not been to make every possible speed trade-off in search of every last possible bit of quality. Rather, this point on the speed vs. efficiency curve is the slowest possible encoding speed that we believe any reasonable person would tolerate. Maybe it will take more than a day to encode a short video, but it won't take weeks or months. It's generally well beyond the point of diminishing returns. Certainly, you can modify some settings to go even slower (and we've experimented with all of these, except the newer options like --me sea or --me full), but you will likely see almost no visible benefits. Now, if someone shows me that there are real compression efficiency benefits to be gained, we'll consider making placebo even slower. You would have to run some tests, using placebo vs. your modified placebo, and show that at the same bit rate, your modified placebo setting achieves visibly higher subjective visual quality. But no one is going to use placebo if it encodes one frame per minute.
Reminder... we care only about subjective visual quality at identical bit rates, not objective measurements like PSNR or SSIM.
Tom
nakTT
21st May 2017, 04:40
We look at our performance presets every so often, and we run a large batch of tests to determine the optimal combination of settings to achieve the best speed vs. compression efficiency trade-off for each of the 10 presets. I take responsibility for our preset strategy. The strategy for --preset placebo has not been to make every possible speed trade-off in search of every last possible bit of quality. Rather, this point on the speed vs. efficiency curve is the slowest possible encoding speed that we believe any reasonable person would tolerate. Maybe it will take more than a day to encode a short video, but it won't take weeks or months. It's generally well beyond the point of diminishing returns. Certainly, you can modify some settings to go even slower (and we've experimented with all of these, except the newer options like --me sea or --me full), but you will likely see almost no visible benefits. Now, if someone shows me that there are real compression efficiency benefits to be gained, we'll consider making placebo even slower. You would have to run some tests, using placebo vs. your modified placebo, and show that at the same bit rate, your modified placebo setting achieves visibly higher subjective visual quality. But no one is going to use placebo if it encodes one frame per minute.
Reminder... we care only about subjective visual quality at identical bit rates, not objective measurements like PSNR or SSIM.
Tom
Thanks for a very nice clarification. Now I understand more on the reasons behind x265 presets, especially the placebo.
Btw, do any of you think it is wise for me to encode 640x272 video with the x265 veryslow preset at a video bitrate of 300 kbps? Is there any way for me to be more efficient?
Thank you in advance. :thanks:
I ran some tests of a "true placebo" option: 2-pass encodes of the first 2160 frames of big_buck_bunny_1080p24.y4m.
The common options were:
-D8 --bitrate 1500 -I480 --psnr --ssim -p9 --no-psy-rd --multi-pass-opt-distortion -f2160 --pass 1/2
Results:
Global PSNR | SSIM Mean Y (dB) | tested/additional options:
44.055 | 17.401 | no additional options
44.043 | 17.392 | --cu-lossless
44.041 | 17.388 | --cu-lossless --me sea
44.078 | 17.408 | --rc-lookahead 100
44.083 | 17.409 | --rc-lookahead 100 --bframes 10
44.093 | 17.418 | --rc-lookahead 120 --bframes 12
44.107 | 17.433 | --rc-lookahead 120 --bframes 12 --ref 6
44.115 | 17.440 | --rc-lookahead 120 --bframes 12 --ref 6 --subme 7
44.120 | 17.442 | --rc-lookahead 120 --bframes 12 --ref 6 --subme 7 -F1
Full results in attachment.
Only --cu-lossless and --me sea made the PSNR/SSIM results worse, and --me sea is also a big slowdown. I tried to test --me full, but the encoding speed was about 0.03 fps, so I gave up.
From my tests, the true placebo option is:
-p9 --rc-lookahead 120 --bframes 12 --ref 6 --subme 7 -F1
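To make the size of these differences easier to see, the PSNR/SSIM table above can be expressed as gains over the baseline row. The sketch below uses only the numbers from that table; the variable and function names are made up for illustration.

```python
# (Global PSNR, SSIM Mean Y in dB, additional options) copied from the results table above.
results = [
    (44.055, 17.401, "no additional options"),
    (44.043, 17.392, "--cu-lossless"),
    (44.041, 17.388, "--cu-lossless --me sea"),
    (44.078, 17.408, "--rc-lookahead 100"),
    (44.083, 17.409, "--rc-lookahead 100 --bframes 10"),
    (44.093, 17.418, "--rc-lookahead 120 --bframes 12"),
    (44.107, 17.433, "--rc-lookahead 120 --bframes 12 --ref 6"),
    (44.115, 17.440, "--rc-lookahead 120 --bframes 12 --ref 6 --subme 7"),
    (44.120, 17.442, "--rc-lookahead 120 --bframes 12 --ref 6 --subme 7 -F1"),
]


def gains_over_baseline(rows):
    """Return (options, PSNR gain, SSIM gain) for every row, relative to the first row."""
    base_psnr, base_ssim, _ = rows[0]
    return [(opts, round(psnr - base_psnr, 3), round(ssim - base_ssim, 3))
            for psnr, ssim, opts in rows]


gains = gains_over_baseline(results)
for opts, psnr_gain, ssim_gain in gains:
    print(f"{psnr_gain:+.3f} dB PSNR | {ssim_gain:+.3f} dB SSIM | {opts}")

best_options, best_psnr_gain, best_ssim_gain = max(gains, key=lambda row: row[1])
```

The best combination is only about 0.065 dB PSNR and 0.041 dB SSIM above plain -p9. That is an objective-metric difference and says nothing by itself about visible quality.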
--------------------------
Sample to watch: 2K version of Big Buck Bunny encoded @ 1500 kbps with true placebo option: www.msystem.waw.pl/x265/bb1500.mkv
Encoder output:
f:\speed\2.4+25>x265 -D8 --bitrate 1500 -I480 --psnr --ssim -p9 --no-psy-rd --multi-pass-opt-distortion --rc-lookahead 120 --bframes 12 --ref 6 --subme 7 -F1 ../big_buck_bunny_1080p24.y4m btp1.hevc --pass 1
y4m [info]: 1920x1080 fps 24/1 i420p8 sar 1:1 frames 0 - 14314 of 14315
raw [info]: output file: btp1.hevc
x265 [info]: HEVC encoder version 2.4+25-b4149e898b50
x265 [info]: build info [Windows][MSVC 1910][64 bit] 8bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [warning]: --psnr used with psy on: results will be invalid!
x265 [warning]: --tune psnr should be used if attempting to benchmark psnr!
x265 [info]: Main profile, Level-5 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 1 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 4 inter / 4 intra
x265 [info]: ME / range / subpel / merge : star / 92 / 7 / 5
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 480 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 120 / 12 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1
x265 [info]: References / ref-limit cu / depth : 6 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : ABR-1500 kbps / 0.60
x265 [info]: tools: rect amp rd=6 rdoq=2 psy-rdoq=1.00 tskip signhide tmvp
x265 [info]: tools: b-intra strong-intra-smoothing deblock sao stats-write
x265 [info]: frame I: 132, Avg QP:20.77 kb/s: 23917.87 PSNR Mean: Y:47.813 U:50.299 V:50.809 SSIM Mean: 0.989789 (19.909dB)
x265 [info]: frame P: 3202, Avg QP:24.87 kb/s: 4070.12 PSNR Mean: Y:44.882 U:48.155 V:48.646 SSIM Mean: 0.984755 (18.169dB)
x265 [info]: frame B: 10981, Avg QP:31.11 kb/s: 480.10 PSNR Mean: Y:44.928 U:48.058 V:48.668 SSIM Mean: 0.984954 (18.226dB)
x265 [info]: Weighted P-Frames: Y:6.7% UV:4.7%
x265 [info]: Weighted B-Frames: Y:2.5% UV:1.7%
x265 [info]: consecutive B-frames: 10.3% 13.5% 13.5% 30.6% 8.1% 10.0% 3.1% 4.6% 1.6% 1.3% 0.9% 1.1% 1.6%
encoded 14315 frames in 71887.79s (0.20 fps), 1499.24 kb/s, Avg QP:29.62, Global PSNR: 45.806, SSIM Mean Y: 0.9849541 (18.226 dB)
f:\speed\2.4+25>x265 -D8 --bitrate 1500 -I480 --psnr --ssim -p9 --no-psy-rd --multi-pass-opt-distortion --rc-lookahead 120 --bframes 12 --ref 6 --subme 7 -F1 ../big_buck_bunny_1080p24.y4m btp2.hevc --pass 2
y4m [info]: 1920x1080 fps 24/1 i420p8 sar 1:1 frames 0 - 14314 of 14315
raw [info]: output file: btp2.hevc
x265 [info]: HEVC encoder version 2.4+25-b4149e898b50
x265 [info]: build info [Windows][MSVC 1910][64 bit] 8bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX
x265 [warning]: --psnr used with psy on: results will be invalid!
x265 [warning]: --tune psnr should be used if attempting to benchmark psnr!
x265 [info]: Main profile, Level-5 (Main tier)
x265 [info]: Thread pool created using 4 threads
x265 [info]: Slices : 1
x265 [info]: frame threads / pool features : 1 / wpp(17 rows)
x265 [info]: Coding QT: max CU size, min CU size : 64 / 8
x265 [info]: Residual QT: max TU size, max depth : 32 / 4 inter / 4 intra
x265 [info]: ME / range / subpel / merge : star / 92 / 7 / 5
x265 [info]: Keyframe min / max / scenecut / bias: 24 / 480 / 40 / 5.00
x265 [info]: Lookahead / bframes / badapt : 120 / 12 / 2
x265 [info]: b-pyramid / weightp / weightb : 1 / 1 / 1
x265 [info]: References / ref-limit cu / depth : 6 / off / off
x265 [info]: AQ: mode / str / qg-size / cu-tree : 1 / 1.0 / 32 / 1
x265 [info]: Rate Control / qCompress : ABR-1500 kbps / 0.60
x265 [info]: tools: rect amp rd=6 rdoq=2 psy-rdoq=1.00 tskip signhide tmvp
x265 [info]: tools: b-intra strong-intra-smoothing deblock sao stats-read
x265 [info]: frame I: 132, Avg QP:20.86 kb/s: 23111.23 PSNR Mean: Y:48.060 U:50.452 V:50.964 SSIM Mean: 0.990132 (20.057dB)
x265 [info]: frame P: 3202, Avg QP:26.04 kb/s: 4188.95 PSNR Mean: Y:44.876 U:48.047 V:48.562 SSIM Mean: 0.985262 (18.316dB)
x265 [info]: frame B: 10981, Avg QP:32.33 kb/s: 456.05 PSNR Mean: Y:44.802 U:47.901 V:48.521 SSIM Mean: 0.985439 (18.368dB)
x265 [info]: Weighted P-Frames: Y:1.1% UV:0.5%
x265 [info]: Weighted B-Frames: Y:0.6% UV:0.4%
x265 [info]: consecutive B-frames: 10.3% 13.5% 13.5% 30.6% 8.1% 10.0% 3.1% 4.6% 1.6% 1.3% 0.9% 1.1% 1.6%
encoded 14315 frames in 52657.88s (0.27 fps), 1499.94 kb/s, Avg QP:30.82, Global PSNR: 45.700, SSIM Mean Y: 0.9854429 (18.369 dB)
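For anyone who wants to compare the two passes without reading the logs by eye, here is a rough Python sketch that parses the final "encoded ..." summary line of each pass. The regular expression and the helper names (`parse_summary`, `ssim_to_db`) are my own and only assume the line layout printed above; the dB conversion assumes the usual -10*log10(1 - SSIM) form, which does reproduce the figures in these logs.

```python
import math
import re

# Final summary lines copied from the two passes above (console wrap rejoined).
PASS1 = ("encoded 14315 frames in 71887.79s (0.20 fps), 1499.24 kb/s, "
         "Avg QP:29.62, Global PSNR: 45.806, SSIM Mean Y: 0.9849541 (18.226 dB)")
PASS2 = ("encoded 14315 frames in 52657.88s (0.27 fps), 1499.94 kb/s, "
         "Avg QP:30.82, Global PSNR: 45.700, SSIM Mean Y: 0.9854429 (18.369 dB)")

# Hypothetical pattern: it only matches the summary layout shown in this thread.
SUMMARY = re.compile(
    r"encoded (?P<frames>\d+) frames in (?P<seconds>[\d.]+)s "
    r"\((?P<fps>[\d.]+) fps\), (?P<kbps>[\d.]+) kb/s, "
    r"Avg QP:(?P<qp>[\d.]+), Global PSNR: (?P<psnr>[\d.]+), "
    r"SSIM Mean Y: (?P<ssim>[\d.]+) \((?P<ssim_db>[\d.]+) ?dB\)"
)


def parse_summary(line):
    """Return the numeric fields of one summary line as a dict."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not a recognised x265 summary line")
    fields = {name: float(value) for name, value in match.groupdict().items()}
    fields["frames"] = int(fields["frames"])
    return fields


def ssim_to_db(ssim):
    """Convert a mean SSIM value to decibels (assumed -10*log10(1 - SSIM))."""
    return -10.0 * math.log10(1.0 - ssim)


p1 = parse_summary(PASS1)
p2 = parse_summary(PASS2)

print("pass 2 / pass 1 encode time: %.3f" % (p2["seconds"] / p1["seconds"]))
print("bitrate difference: %+.2f kb/s" % (p2["kbps"] - p1["kbps"]))
print("Global PSNR difference: %+.3f dB" % (p2["psnr"] - p1["psnr"]))
print("SSIM difference: %+.3f dB" % (ssim_to_db(p2["ssim"]) - ssim_to_db(p1["ssim"])))
```

On these two lines it shows the second pass taking roughly 73% of the first pass's time at practically the same bitrate, with Global PSNR slightly lower and SSIM slightly higher.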
Atak_Snajpera
21st May 2017, 17:45
Thanks for a very nice clarification. Now I understand more on the reasons behind x265 presets, especially the placebo.
Btw, any of you guys think that it is wise for me to encode videos with resolution 640*272 using x265 Very Slow preset at the bitrate of 300kbps for the video? Is there any way for me to be more efficient?
Thank you in advance. :thanks:
For such ultra low resolutions you should stick with x264. x265 works best with FHD+ resolutions. 640x272 is less than DVD.
divxmaster
21st May 2017, 23:33
Thanks for a very nice clarification. Now I understand more on the reasons behind x265 presets, especially the placebo.
Btw, any of you guys think that it is wise for me to encode videos with resolution 640*272 using x265 Very Slow preset at the bitrate of 300kbps for the video? Is there any way for me to be more efficient?
Thank you in advance. :thanks:
This is possible, but only in certain situations. I have done a lot of work with very low bitrate x265. It depends on what screen size you are viewing it on. For phone and tablet viewing, I have found even much lower bitrates to work fine.
For example, with Stargate SG-1, I keep the full resolution, 720x480, and encode with CRF 28 and --nr-intra 400, --nr-inter 400. I wouldn't normally use those, but on a small screen you cannot really see the detail loss.
Using 10-bit is essential; it allows a low bitrate and removes the banding that would otherwise come with it. The bitrate I get for s05e22 is only 119 kbps! No macroblocking, etc... Heck, I even watched a bit on my 15.6" laptop screen and it is watchable. But not on a big screen, say 48".
Note this is with an older version of x265 (1.8+106); I haven't tried newer versions yet. I can provide full parameters if you wish.
Cheers,
Divxmaster
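As a rough sketch of what the settings named in this post could look like on a command line, the Python below only builds the argument list and prints it; it does not run the encoder. The file names and the helper `build_x265_args` are hypothetical, the full parameter set was not posted, and `--output-depth 10` assumes an x265 build with 10-bit support.

```python
import shlex


def build_x265_args(source, output, crf=28, nr_intra=400, nr_inter=400, depth=10):
    """Assemble an x265 command line from the settings named in the post.

    Only the options mentioned there are included (CRF, intra/inter noise
    reduction, 10-bit output); everything else is left at encoder defaults.
    """
    return [
        "x265",
        "--input", source,
        "--output", output,
        "--output-depth", str(depth),  # 10-bit, assumed supported by the build
        "--crf", str(crf),
        "--nr-intra", str(nr_intra),
        "--nr-inter", str(nr_inter),
    ]


# Hypothetical file names, for illustration only.
args = build_x265_args("sg1_s05e22_720x480.y4m", "sg1_s05e22.hevc")
print(shlex.join(args))
```

Keeping the command as a list makes it easy to hand to `subprocess.run` later, or to vary one value (say, the CRF) across a batch of test encodes.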
Dclose
21st May 2017, 23:43
Btw, any of you guys think that it is wise for me to encode videos with resolution 640*272 using x265 Very Slow preset at the bitrate of 300kbps for the video? Is there any way for me to be more efficient?
You might find that 360p (and its aspect-ratio equivalents), possibly with a higher CRF to compensate for size, is worth it over 272p and 288p, since x265 likes resolution more than texture at extremely low bitrates, and 360p seems to add just enough extra resolution to go from a 288p video that's "ewww" to "eh, I've seen worse."
You might even find 200 kbps or less tolerable.
720p at a CRF of 28 or even higher can look surprisingly good (relatively speaking), especially the farther away you sit or the smaller the screen you watch it on, since at a distance grain and texture disappear and you focus more on the edges of objects. But that setting will still usually use more bitrate than an "eh, I've seen worse" 360p encode.
A main problem of higher resolution still comes down to x265 being so slow. 720p is a lot slower than 360p resolution. At extreme low bitrate, 720p may not be worth the encoding time and electricity over 360p resolution if you're just going for small size and tolerable quality.
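To put rough numbers on the resolution versus bitrate trade-off, one common back-of-the-envelope measure is bits per pixel per frame. The Python sketch below assumes a 24 fps source and scales the 640x272 frame to 360p and 720p at the same aspect ratio; the frame rate and both helper names are my own assumptions, and bits per pixel is only a crude indicator, not a quality metric.

```python
def bits_per_pixel(kbps, width, height, fps=24.0):
    """Average bits spent per pixel per frame at a given bitrate."""
    return kbps * 1000.0 / (width * height * fps)


def scaled_width(height, aspect=640.0 / 272.0):
    """Width for a given height at the source aspect ratio, rounded to an even number."""
    return int(round(height * aspect / 2.0)) * 2


# The 300 kbps figure comes from the question being answered; 24 fps is assumed.
for height in (272, 360, 720):
    width = scaled_width(height)
    bpp = bits_per_pixel(300, width, height)
    print("%4dx%-3d at 300 kbps: %.4f bits per pixel" % (width, height, bpp))
```

The same 300 kbps is spread much more thinly at 720p than at 272p, which is the budget the encoder has to work with when resolution goes up and bitrate does not.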
For such ultra low resolutions you should stick with x264. x265 works best with FHD+ resolutions. 640x272 is less than DVD.
I disagree. The lower the resolution and bitrate, the more x265 shines over x264. x265 makes extreme low bitrate and res tolerable to watch compared to x264 making it unwatchable.
sneaker_ger
21st May 2017, 23:48
x265 shines at low bitrates compared to x264 but lower resolutions don't increase that effect, they decrease it. So x265 becomes better with decreasing bitrate and increasing resolution.
Dclose
22nd May 2017, 00:25
x265 shines at low bitrates compared to x264 but lower resolutions don't increase that effect, they decrease it. So x265 becomes better with decreasing bitrate and increasing resolution.
I don't see (literally!) :) why x264 is better. x265 is known for less inherent detail/grain than x264, and that doesn't matter much at extreme low bitrate/res, and x265 is better than x264 at "holding the picture together" for lack of a better term, regardless of resolution.
I've encoded lots of extreme-low bitrate stuff at 360p, 288p, all the way down to around 150p, with x265. I've tried to do the same with x264 on Placebo and it's just a macroblocking mess. That's not to say x265 at extreme-low bitrate/res is great to look at, but it's not as messy as x264 is.
Basically, at such low bitrate/resolution, what I would consider unwatchable on x264 I consider watchable with x265.
Different people have different tastes. How many pages have people already spent discussing the loss of detail that they did not enjoy? There are people who tolerate some compression artefacts more easily than a "sterile" result... But that happens at rather comfortable bitrates.
sneaker_ger
22nd May 2017, 07:36
I don't see (literally!) :) why x264 is better.
I didn't say x264 is better at low resolution/low bitrate. The point of my post is: if you lower the resolution the gap between x264 and x265 will close, not widen.
benwaggoner
22nd May 2017, 16:06
x265 shines at low bitrates compared to x264 but lower resolutions don't increase that effect, they decrease it. So x265 becomes better with decreasing bitrate and increasing resolution.
Another way to look at it is that x265 can use a higher resolution than x264 at the same bitrate, and deliver more detail that way.