View Full Version : x265 HEVC Encoder



RBX
24th March 2015, 15:32
In my tests, Intel builds are faster than GCC (4.9.2) builds. The difference is greater with psychovisual options and adaptive quantisation turned off (maybe because of assembly code).

Machine: i7-2630QM (4 cores + HT), not 100% utilisation.
Video: 1280x720
Edit: Tested only x265 8bpp

jlpsvk
24th March 2015, 16:13
Sorry to all, but I don't understand comparing x265 vs x264 frame by frame and hunting for fine details. I don't care whether there are two little dots or one when, in a real movie-watching scenario, you won't even notice the dots because you are focusing on something completely different... In real movie watching, CRF 20 in x265 gives me a much smaller file size with the same visual quality as x264 at CRF 20, and that's what is important to me: lower bitrate, same resolution, same visual quality while watching (not meaningless frame comparing)... :)

x265_Project
24th March 2015, 17:54
Seems to me, most people prefer to use CRF.
On the other hand, I prefer to use CQP.

I would like to know what is the recommended value for x265 CQP?
We don't recommend using constant QP. You will get better visual quality using CRF or 2 pass ABR.

RBX
24th March 2015, 18:20
when, in a real movie-watching scenario, you won't even notice the dots because you are focusing on something completely different...
With a good source and a big enough screen, even little changes are visible. Doesn't matter much if you intend to watch on something like a laptop.

Ma
25th March 2015, 00:09
I noticed that during the tests, the CPU utilization never reached 100%. It was hovering around 60-80%, so I think seeing as your "threads" are real, physical cores (whereas half of mine are the HTs), the difference is more palpable.

Anonymlol's results http://forum.doom9.org/showthread.php?p=1714309#post1714309 look similar to my results (the difference between GCC 5.0.0 and 4.9.3), but the fps are higher. That could be down to the i7 CPU. Maybe your CPU runs at up to 35 W and this reduces the speed differences.

xooyoozoo
25th March 2015, 05:55
I was looking through libvpx's (http://git.chromium.org/gitweb/?p=webm/libvpx.git;a=commit;h=9a1ce7be7d4a3056b9da1df64cb7d5115a513dd9) (initial) interpretation of psy-rd and comparing it to x265's.

What immediately stuck out was that for both AQ and Psy, libvpx only considers the Luma plane, which makes obvious sense in a perceptual feature. In contrast, x265 equally weighs all three planes for both AQ and Psy, which has non-obvious implications.

What is "energy" in a Cb/Cr plane, and why should a human care? Is there any research that relates higher chroma complexity to better subjective responses? As far as I know, the answer to "should we consider Chroma as a perceptual estimator" is almost always "make a better Luma estimator".

LigH
25th March 2015, 08:21
Daala even tries to derive chrominance from luminance.

jlpsvk
25th March 2015, 09:40
With a good source and a big enough screen, even little changes are visible. Doesn't matter much if you intend to watch on something like a laptop.

What do you mean by big screen? I have a 50" 1080p plasma, watching from 3-4 meters, and everything is OK.

Anonymlol's results http://forum.doom9.org/showthread.php?p=1714309#post1714309 look similar to my results (the difference between GCC 5.0.0 and 4.9.3), but the fps are higher. That could be down to the i7 CPU. Maybe your CPU runs at up to 35 W and this reduces the speed differences.

Did you try my encode settings?

anonymlol
25th March 2015, 10:09
jlpsvk, could you please run Ma's test as well? I'm curious about how much speed difference there is between your AVX2 capable i7-4790k (4.8GHz) and my AVX capable i7-2700k (4.8 GHz).

Boulder
25th March 2015, 11:27
What do you mean by big screen? I have a 50" 1080p plasma, watching from 3-4 meters, and everything is OK.
Everything is OK as you are looking at the screen from quite a big distance :devil:

http://s3.carltonbale.com/resolution_chart.html

My viewing distance (from a 50" plasma) is something like 3.5 meters and I would definitely like to move at least one meter closer, which isn't possible at the moment but might be in the future. Then again, I encode 99% of the time at 720p or lower, depending on the source.

THX even recommends that you sit less than 5.6 feet away from a 50" TV :eek:

jlpsvk
25th March 2015, 12:05
Yeah....when using 720p, then CRF 17-18 will be OK. :)

jlpsvk
25th March 2015, 12:10
jlpsvk, could you please run Ma's test as well? I'm curious about how much speed difference there is between your AVX2 capable i7-4790k (4.8GHz) and my AVX capable i7-2700k (4.8 GHz).

I will, but no earlier than in 13-14 hours. I must wait until the current encode finishes. Hmm... I just realised that after the BIOS update I forgot to set the CPU back to 4.8GHz, so it is now encoding at only 4.2GHz. :(

Ma
25th March 2015, 15:28
Did you try my encode settings?

I had to stop my previous encode and start from the beginning (sound card driver failure), but now my encoding options are closer to your proposal:
--preset slower --crf 17.0 --rdoq-level 1 --psy-rd 0.4 --deblock -1

If there is too much blur I will try something stronger (maybe exactly your proposal). I want more detail and less blur, but I don't want to destroy the quite good and balanced default output of 10-bit x265.

RBX
25th March 2015, 16:09
What do you mean by big screen? I have a 50" 1080p plasma, watching from 3-4 meters, and everything is OK.


I said "and", that means you need a good source as well. Re-encoding a poorly encoded file doesn't change much.

MoSal
25th March 2015, 16:14
Daala even tries to derive chrominance from luminance.

It's the other way around.

Ajvar
25th March 2015, 16:32
Your assumptions about crf are wrong. Just download the files and you will see they are both the same size.
My assumption about CRF is 100% solid truth. And about those 2 files: one was encoded with --tune film (x264) and the second with --tune grain (x265), and that's not how a proper comparison is made.
jlpsvk, could you please run Ma's test as well? I'm curious about how much speed difference there is between your AVX2 capable i7-4790k (4.8GHz) and my AVX capable i7-2700k (4.8 GHz).

Are you totally sure that AVX Ivy Bridge is actually using AVX for everything and not SSE4 for some part of the work, for example? I don't know this, but I have a feeling that AVX2 CPUs use AVX2 instructions for more of the work than older processors use AVX. This is just speculation though, because I am not an expert; it's just a feeling based on how big a performance boost it gives.

benwaggoner
25th March 2015, 16:57
Are you totally sure that AVX Ivy Bridge is actually using AVX for everything and not SSE4 for some part of the work, for example? I don't know this, but I have a feeling that AVX2 CPUs use AVX2 instructions for more of the work than older processors use AVX. This is just speculation though, because I am not an expert; it's just a feeling based on how big a performance boost it gives.
In comparisons between the Ivy Bridge cc3.8xlarge instances and the Broadwell cc4.8xlarge instances, we see a healthy double-digit encoding speed improvement per clock/core. And the gap has been growing substantially as more AVX2 instructions get checked in. The differential is bigger with faster presets.

Some of that is due to other microarchitectural changes, but I think the majority has been from AVX2 for a while now.

And yes, SSE4 and straight assembler gets used for lots of algorithms even with AVX and AVX2. Although there are now lots of AVX2 optimized functions that don't have an AVX function.

Checking the commits is pretty amazing to see the pace of AVX2 optimizations going in over the last month.

Stereodude
25th March 2015, 17:07
What do you mean by big screen? I have a 50" 1080p plasma, watching from 3-4 meters, and everything is OK.
Perhaps because you're too far away from a screen that size to notice. It's not just a function of screen size or just seating distance. Ignoring potential differences in visual acuity, it's a function of the ratio between screen size and how far away you are from it.
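
To put rough numbers on that ratio, here is a minimal sketch that computes the horizontal viewing angle and the pixels per degree for a 16:9 screen. The function name and its defaults are made up for illustration; nothing here was measured in this thread.

```python
import math

def pixels_per_degree(diagonal_in, distance_m, h_pixels=1920, aspect=(16, 9)):
    """Return (horizontal viewing angle in degrees, pixels per degree).

    Illustrative helper: assumes a flat screen viewed head-on from its centre.
    """
    aw, ah = aspect
    # Screen width in metres from the diagonal (inches) and the aspect ratio.
    width_m = diagonal_in * 0.0254 * aw / math.hypot(aw, ah)
    # Full horizontal angle subtended by the screen at the viewer's position.
    angle = 2 * math.degrees(math.atan(width_m / (2 * distance_m)))
    return angle, h_pixels / angle

for distance in (2.0, 3.5):
    angle, ppd = pixels_per_degree(50, distance)
    print(f"50 inch 1080p at {distance} m: {angle:.1f} deg wide, {ppd:.0f} px/deg")
```

The further back you sit, the more pixels fall within each degree of vision, so small encoding differences get harder to see even though the screen size has not changed.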

Boulder
25th March 2015, 18:16
My assumption about CRF is 100% solid truth. And about those 2 files: one was encoded with --tune film (x264) and the second with --tune grain (x265), and that's not how a proper comparison is made.
Unfortunately there is no equivalent for --tune film in x265; I would have used it if there were. --tune grain was used to retain as much detail as possible, and as you can see from my later post, I also tried without any tuning and with --rdoq-level 1, but it still smooths the image much more than x264.

I'm not looking to compare identical settings, I'm looking to get x265's detail retention level as close as possible to x264's and thus possibly benefit from the more advanced techniques it uses.

jlpsvk
26th March 2015, 13:11
jlpsvk, could you please run Ma's test as well? I'm curious about how much speed difference there is between your AVX2 capable i7-4790k (4.8GHz) and my AVX capable i7-2700k (4.8 GHz).

Running at only 4.2GHz for now, as I am not at my PC. But I noticed something interesting... --pmode slows down encoding by about 0.3fps. Even though CPU usage is at 98-100% with --pmode, the encoding fps is lower. Without --pmode, CPU usage is about 80-85% but the encoding fps is higher.

LigH
26th March 2015, 15:19
I remember this being mentioned already when it was introduced several months ago: Parallelizing mode decisions may be only useful under specific conditions. Possibly mainly for large dimensions. In my tests with smaller dimensions (up to 720p) and rather outdated CPUs, --pmode always slowed down the encoding.

Instead, parallelizing motion estimation (--pme) can speed up encoding, depending on the complexity of the encode (a.k.a. preset).
__

Out-of-schedule, here are some additional builds, experimenting with jb_alvarado's media-autobuild_suite.

x265_1.5+420-24fdb661bb57.7z (https://www.mediafire.com/download/jne4nzav92hl972/x265_1.5+420-24fdb661bb57.7z) (MSYS, MinGW32, GCC 4.8.2, package from xhmikosr + 4x cross compile script, EXE+DLL)
x265_1.5+420-24fdb661bb57.GCC492.7z (http://www.mediafire.com/download/c57rpnqaz8na5bb/x265_1.5+420-24fdb661bb57.GCC492.7z) (MSYS2, MinGW64, GCC 4.9.2, media-autobuild_suite, EXE only, stripped and UPX'ed)

jlpsvk
26th March 2015, 17:20
Yep... Removing --pmode and adding --pme makes a 1 fps difference on my CPU.
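
For anyone who wants to repeat this kind of comparison, a small sketch that builds the x265 command lines for each variant so they can be timed one after another. The helper name, the file names and the default preset/CRF are placeholders; only switches already mentioned in this thread (plus --input/--output) are used, and the commands are printed rather than executed.

```python
def x265_command(source, output, preset="slow", crf=20.0,
                 pmode=False, pme=False, pools=None):
    """Build an x265 argument list for one benchmark variant (hypothetical helper)."""
    cmd = ["x265", "--preset", preset, "--crf", str(crf)]
    if pmode:
        cmd.append("--pmode")   # parallel mode decision
    if pme:
        cmd.append("--pme")     # parallel motion estimation
    if pools is not None:
        cmd += ["--pools", str(pools)]
    cmd += ["--input", source, "--output", output]
    return cmd

# Variants discussed above; "clip.y4m" is a placeholder source file.
variants = {
    "baseline": {},
    "pmode": {"pmode": True},
    "pme": {"pme": True},
}

for name, flags in variants.items():
    print(name, " ".join(x265_command("clip.y4m", f"{name}.hevc", **flags)))
```

Each list can be handed to subprocess.run and timed; compare the fps x265 reports rather than CPU usage, since higher utilisation did not mean faster encoding in the tests above.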

LazyNcoder
26th March 2015, 17:38
I have a dual-CPU (6 cores/12 threads each) machine and it's killing me (33% - 720p - slow preset - --aq-mode 2 --aq-strength 1.0). I'm using --pmode --pme --threads 48. It makes the final size a bit larger, though (I think it lowers the quality too, not sure yet). Now I'll try it without --pmode to see what happens.

There's something I need to know, though. Someone mentioned that --frame-threads 1 would reduce or remove the banding problem. I don't know if it's true or not (I haven't tested it yet), but I wonder if using the options I mentioned above would increase the banding problem.

jlpsvk
26th March 2015, 17:57
@LigH
GCC 4.9.2 added another 0.4fps speed increase!!! Thanks for that build.

jlpsvk
26th March 2015, 17:59
I have a dual-CPU (6 cores/12 threads each) machine and it's killing me (33% - 720p - slow preset - --aq-mode 2 --aq-strength 1.0). I'm using --pmode --pme --threads 48. It makes the final size a bit larger, though (I think it lowers the quality too, not sure yet). Now I'll try it without --pmode to see what happens.

There's something I need to know, though. Someone mentioned that --frame-threads 1 would reduce or remove the banding problem. I don't know if it's true or not (I haven't tested it yet), but I wonder if using the options I mentioned above would increase the banding problem.

If you're using 10-bit x265, banding isn't an issue. :) So --frame-threads 1 isn't necessary. I tested it myself. Remove --pmode and replace --threads with --pools.

LigH
26th March 2015, 19:14
Allowing more and more threads (via --pools) doesn't guarantee that more are eventually used: x265 doesn't use more frame threads than it calculates to be efficient.

jlpsvk
26th March 2015, 19:23
LigH... with the GCC 4.9.2 build the video does not play smoothly? Will test again... just first thoughts.

Motenai Yoda
26th March 2015, 19:39
Out-of-schedule, here are some additional builds, experimenting with jb_alvarado's media-autobuild_suite.

Do you know how to set it to compile for a specific -march or native?
Edit: I'm trying to replace -mtune=generic with -march=native -O2 in media-autobuild_suite.bat

RBX
26th March 2015, 21:16
I have a dual-CPU (6 cores/12 threads each) machine and it's killing me (33% - 720p - slow preset - --aq-mode 2 --aq-strength 1.0). I'm using --pmode --pme --threads 48. It makes the final size a bit larger, though (I think it lowers the quality too, not sure yet). Now I'll try it without --pmode to see what happens.


I once did tests for pmode and the size was a little bit higher than without it. I also once tried aq-mode 2, and resulting file size was larger than source file size with the settings I normally use. It might have been some minor bug, or source file problem. How well does it work for you?

LazyNcoder
26th March 2015, 22:05
I once did tests for pmode and the size was a little bit higher than without it. I also once tried aq-mode 2, and resulting file size was larger than source file size with the settings I normally use. It might have been some minor bug, or source file problem. How well does it work for you?

With pmode, it uses more CPU than without it. But I didn't test if it really helps the speed or not. I always thought more CPU utilization = faster encode.

aq-mode 2 was the default in previous versions of x265, I did lots of encodes with it, and it's still my favorite. I like it better than aq-mode 1. The output is maybe a little larger sometimes (depends on the content), but the thing you described never happened to me. (I've been using x265 since v1.2.)

Ma
26th March 2015, 22:10
The default branch has now been merged with stable, but the options "--rdoq-level 1" and "--deblock -1" are still missing from the text information at the beginning of the output HEVC file.

The info in the HEVC file after these two different encodes is the same:
x265 --preset slow input output
x265 --preset slow --rdoq-level 1 --deblock -1 input output
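
One way to check this without a media info tool is to search the start of the raw bitstream for the encoder's text header. The sketch below assumes the info string begins with "x265 (build" and contains an "options: " marker followed by printable ASCII; that layout is an assumption about the header, and the sample bytes are made up for the demonstration rather than taken from a real encode.

```python
def find_x265_options(data: bytes):
    """Return the option string embedded in an x265 stream, or None if absent.

    Assumes the info text starts with b"x265 (build" and runs until the
    first non-printable byte (assumption, see the lead-in).
    """
    start = data.find(b"x265 (build")
    if start < 0:
        return None
    end = start
    while end < len(data) and 0x20 <= data[end] < 0x7F:
        end += 1
    info = data[start:end].decode("ascii")
    marker = "options: "
    pos = info.find(marker)
    return info[pos + len(marker):] if pos >= 0 else info

# Synthetic example data (not a real bitstream).
sample = (b"\x00\x00\x01\x4e\x01\x05"
          b"x265 (build 51) - 1.5+444 - options: psy-rd=0.40 rdoq-level=1"
          b"\x80\x00\x00\x01")
print(find_x265_options(sample))
```

With a real file, reading only the first few kilobytes (for example open(path, "rb").read(65536)) should be enough, since the header sits at the beginning of the stream; two encodes can then be compared by diffing the returned strings.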

LigH
26th March 2015, 22:12
Do you know how to set it to compile for a specific -march or native?

I am already happy when it passes. Modifying? Not me...

jb_alvarado is looking for a new project host, running out of time.

MeteorRain
27th March 2015, 00:07
The default branch has now been merged with stable, but the options "--rdoq-level 1" and "--deblock -1" are still missing from the text information at the beginning of the output HEVC file.

A quick patch here (https://gist.github.com/msg7086/192b89bcc45c3d3b03ce). Don't know how to reach them to put the patch in though.

x265_Project
27th March 2015, 01:10
A quick patch here (https://gist.github.com/msg7086/192b89bcc45c3d3b03ce). Don't know how to reach them to put the patch in though.
Thanks MeteorRain. See https://bitbucket.org/multicoreware/x265/wiki/Contribute for all the details on how to contribute to the x265 project.

We always welcome contributions from talented developers!

MeteorRain
27th March 2015, 02:08
Thanks MeteorRain. See https://bitbucket.org/multicoreware/x265/wiki/Contribute for all the details on how to contribute to the x265 project.

We always welcome contributions from talented developers!
I have put the patch in the public domain. You should be able to merge it without any problem. I also posted it on the mailing list.
If you are still looking for a signed agreement, please let me know.

x265_Project
27th March 2015, 02:19
I have put the patch in the public domain. You should be able to merge it without any problem. I also posted it on the mailing list.
If you are still looking for a signed agreement, please let me know.
Hi MeteorRain,
No problem. To clarify, we don't want to make anyone jump through hoops to contribute a quick fix for something fairly trivial (a couple of missing parameters in the SEI message string). For trivial fixes, you can just email the x265-devel mailing list and say anything to the effect of "I contribute this". We only need a signed contributor agreement if you want to contribute a more significant patch (optimization, new feature or algorithm improvement, etc.), and you want to be credited as the author.

legend
27th March 2015, 07:09
Is there any good software for x265 encoding?

Ma
27th March 2015, 08:27
A quick patch here (https://gist.github.com/msg7086/192b89bcc45c3d3b03ce). Don't know how to reach them to put the patch in though.

Thanks! After work I will try this code.

LigH
27th March 2015, 08:44
@ legend:

x265 is software for encoding to HEVC video. If you need a user interface to handle it in an easier way than using a console, you may try: StaxRip, Hybrid, MeGUI, ...

Ma
27th March 2015, 16:55
A quick patch here (https://gist.github.com/msg7086/192b89bcc45c3d3b03ce). Don't know how to reach them to put the patch in though.

Now it is much better. There are differences between the stderr output and your new header format: in stderr, x265 displays deblock right before sao, and the numbers are written in the form deblock(tC=-1:B=-1).

So my proposal is a small modification to your code:
s += sprintf(s, " psy-rd=%.2f", p->psyRd);
s += sprintf(s, " rdoq=%d", p->rdoqLevel);
s += sprintf(s, " psy-rdoq=%.2f", p->psyRdoq);
BOOL(p->bEnableSignHiding, "signhide");
BOOL(p->bEnableLoopFilter, "lft");
if (p->bEnableLoopFilter)
{
    BOOL(true, "deblock");
    if (p->deblockingFilterBetaOffset || p->deblockingFilterTCOffset)
        s += sprintf(s, "(tC=%d:B=%d)", p->deblockingFilterTCOffset, p->deblockingFilterBetaOffset);
}
BOOL(p->bEnableSAO, "sao");

I think we should print information in stderr and header in similar form.

sborho
27th March 2015, 18:58
The string emitted by param2string() and included in the info SEI is also used in two-pass stats files and analysis load/save files, and it must be parseable by x265_param_parse, so it must match the command line format. I'm removing 'lft', which is now deprecated, and using "deblock=%d,%d", which is parseable.
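
The round-trip requirement can be illustrated with a toy model. This is not x265's param2string() or x265_param_parse; the two functions and the key names below are simplified stand-ins that only show why a "key=value" form survives a write/read cycle while a display form like deblock(tC=-1:B=-1) does not.

```python
def params_to_string(params):
    """Serialise a dict of settings as space-separated command-line style tokens."""
    parts = []
    for key, value in params.items():
        if value is True:
            parts.append(key)
        elif value is False:
            parts.append("no-" + key)
        elif isinstance(value, tuple):
            parts.append(f"{key}={','.join(str(v) for v in value)}")
        elif isinstance(value, float):
            parts.append(f"{key}={value:.2f}")
        else:
            parts.append(f"{key}={value}")
    return " ".join(parts)

def parse_params(text):
    """Parse the string written by params_to_string back into a dict."""
    params = {}
    for token in text.split():
        if "=" in token:
            key, raw = token.split("=", 1)
            if "," in raw:
                value = tuple(int(v) for v in raw.split(","))
            else:
                try:
                    value = int(raw)
                except ValueError:
                    value = float(raw)  # raises ValueError on unparseable text
            params[key] = value
        elif token.startswith("no-"):
            params[token[3:]] = False
        else:
            params[token] = True
    return params

# Hypothetical settings, using option names mentioned in this thread.
settings = {"psy-rd": 0.4, "rdoq-level": 1, "signhide": True,
            "deblock": (-1, -1), "sao": True}
text = params_to_string(settings)
print(text)
print(parse_params(text) == settings)
```

Feeding the stderr-style token deblock(tC=-1:B=-1) to this toy parser raises a ValueError, which is the kind of failure the command-line-compatible format avoids.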

Ma
27th March 2015, 20:07
The string emitted by param2string() and included in the info SEI is also used in two-pass stats files and analysis load/save files, and it must be parseable by x265_param_parse, so it must match the command line format. I'm removing 'lft', which is now deprecated, and using "deblock=%d,%d", which is parseable.

Thanks for commit https://bitbucket.org/multicoreware/x265/commits/2da2b9dd7eb3bb724a2436d848921251a69d11e5

Ma
28th March 2015, 12:37
There is a new header in the output HEVC file in x265, but on my favorite site http://chromashift.org/x265_builds/ the current version is 1.5+443, which is the last version with the old header (without rdoq-level and deblock).

If someone wants to check new builds (from 1.5+444 on), I have put them at http://msystem.waw.pl/x265/

MeteorRain
28th March 2015, 15:09
Thanks for the builds.

I also put my mod in my signature for those who want it. Compiled with lavf and l-smash.

stax76
29th March 2015, 00:33
I made a snapshot of the x265 documentation when I built StaxRip's x265 support, and I've found an online diff application showing me what has changed in the meantime:

https://www.diffchecker.com/fk4vsibm

I now have a problem: there is not enough space in the Analysis section. I'll have to move a few options to a new tab because the tab is full. Does anybody have an idea which options could be moved?

http://x265.readthedocs.org/en/latest/cli.html#mode-decision-analysis

Maybe somebody can also look if the order and captions make sense, it looks like this currently:

http://oi59.tinypic.com/jqorv8.jpg


Property RD As New OptionParam With {
.Switch = "--rd",
.Text = "RD:",
.Options = {"0 - SA8D mode and split decisions, intra w/ source pixels",
"1 - Recon generated (better intra), RDO merge/skip selection",
"2 - RDO splits and merge/skip selection",
"3 - RDO mode and split decisions, chroma residual used for sa8d",
"4 - Adds RDO Quant",
"5 - Adds RDO prediction decisions",
"6 - Currently same as 5"},
.Expand = True,
.Help = "Level of RDO in mode decision. The higher the value, the more exhaustive the analysis and the more rate distortion optimization is used. The lower the value the faster the encode, the higher the value the smaller the bitstream (in general)." + CrLf2 + "Default 3."}

Property MinCuSize As New OptionParam With {
.Switch = "--min-cu-size",
.Text = "Minimum CU size:",
.Options = {"64", "32", "16", "8"},
.Values = {"64", "32", "16", "8"},
.Help = "Minimum CU size (width and height). By using 16 or 32 the encoder will not analyze the cost of CUs below that minimum threshold, saving considerable amounts of compute with a predictable increase in bitrate. This setting has a large effect on performance on the faster presets." + CrLf2 + "Default: 8 (minimum 8x8 CU for HEVC, best compression efficiency)"}

Property MaxCuSize As New OptionParam With {
.Switch = "--ctu",
.Text = "Maximum CU size:",
.Options = {"64", "32", "16"},
.Values = {"64", "32", "16"},
.Help = "Maximum CU size (width and height). The larger the maximum CU size, the more efficiently x265 can encode flat areas of the picture, giving large reductions in bitrate. However this comes at a loss of parallelism with fewer rows of CUs that can be encoded in parallel, and less frame parallelism as well. Because of this the faster presets use a CU size of 32." + CrLf2 + "Default: 64"}

Property TUintra As New NumParam With {
.Switch = "--tu-intra-depth",
.Text = "TU Intra Depth:",
.Help = "The transform unit (residual) quad-tree begins with the same depth as the coding unit quad-tree, but the encoder may decide to further split the transform unit tree if it improves compression efficiency. This setting limits the number of extra recursion depth which can be attempted for intra coded units." + CrLf2 + "Default: 1, which means the residual quad-tree is always at the same depth as the coded unit quad-tree." + CrLf2 + "Note that when the CU intra prediction is NxN (only possible with 8x8 CUs), a TU split is implied, and thus the residual quad-tree begins at 4x4 and cannot split any further.",
.MinMaxStep = {1, 4, 1}}

Property TUinter As New NumParam With {
.Switch = "--tu-inter-depth",
.Text = "TU Inter Depth:",
.Help = "The transform unit (residual) quad-tree begins with the same depth as the coding unit quad-tree, but the encoder may decide to further split the transform unit tree if it improves compression efficiency. This setting limits the number of extra recursion depth which can be attempted for inter coded units." + CrLf2 + "Default: 1, which means the residual quad-tree is always at the same depth as the coded unit quad-tree unless the CU was coded with rectangular or AMP partitions, in which case a TU split is implied and thus the residual quad-tree begins one layer below the CU quad-tree.",
.MinMaxStep = {1, 4, 1}}

Property rdoqLevel As New NumParam With {
.Switch = "--rdoq-level",
.Text = "RDOQ Level:",
.Help = "Specify the amount of rate-distortion analysis to use within quantization:" + CrLf2 + "At level 0 rate-distortion cost is not considered in quant" + CrLf2 + "At level 1 rate-distortion cost is used to find optimal rounding values for each level (and allows psy-rdoq to be effective). It trades-off the signaling cost of the coefficient vs its post-inverse quant distortion from the pre-quant coefficient. When --psy-rdoq is enabled, this formula is biased in favor of more energy in the residual (larger coefficient absolute levels)" + CrLf2 + "At level 2 rate-distortion cost is used to make decimate decisions on each 4x4 coding group, including the cost of signaling the group within the group bitmap. If the total distortion of not signaling the entire coding group is less than the rate cost, the block is decimated. Next, it applies rate-distortion cost analysis to the last non-zero coefficient, which can result in many (or all) of the coding groups being decimated. Psy-rdoq is less effective at preserving energy when RDOQ is at level 2, since it only has influence over the level distortion costs.",
.MinMaxStep = {0, 2, 1}}

Property Rect As New BoolParam With {
.Switch = "--rect",
.NoSwitch = "--no-rect",
.Text = "Enable analysis of rectangular motion partitions Nx2N and 2NxN",
.Help = "Enable analysis of rectangular motion partitions Nx2N and 2NxN (50/50 splits, two directions)." + CrLf2 + "Default disabled."}

Property AMP As New BoolParam With {
.Switch = "--amp",
.NoSwitch = "--no-amp",
.Text = "Enable analysis of asymmetric motion partitions",
.Help = "Enable analysis of asymmetric motion partitions (75/25 splits, four directions). At RD levels 0 through 4, AMP partitions are only considered at CU sizes 32x32 and below. At RD levels 5 and 6, it will only consider AMP partitions as merge candidates (no motion search) at 64x64, and as merge or inter candidates below 64x64. The AMP partitions which are searched are derived from the current best inter partition. If Nx2N (vertical rectangular) is the best current prediction, then left and right asymmetrical splits will be evaluated. If 2NxN (horizontal rectangular) is the best current prediction, then top and bottom asymmetrical splits will be evaluated. If 2Nx2N is the best prediction, and the block is not a merge/skip, then all four AMP partitions are evaluated. This setting has no effect if rectangular partitions are disabled." + CrLf2 + "Default disabled."}

Property EarlySkip As New BoolParam With {
.Switch = "--early-skip",
.NoSwitch = "--no-early-skip",
.Text = "Early Skip",
.Help = "Measure full CU size (2Nx2N) merge candidates first; if no residual is found the analysis is short circuited." + CrLf2 + "Default disabled."}

Property FastIntra As New BoolParam With {
.Switch = "--fast-intra",
.NoSwitch = "--no-fast-intra",
.Text = "Fast Intra",
.Help = "Perform an initial scan of every fifth intra angular mode, then check modes +/- 2 distance from the best mode, then +/- 1 distance from the best mode, effectively performing a gradient descent. When enabled 10 modes in total are checked. When disabled all 33 angular modes are checked. Only applicable for --rd levels 4 and below (medium preset and faster)."}

Property BIntra As New BoolParam With {
.Switch = "--b-intra",
.NoSwitch = "--no-b-intra",
.Text = "Evaluate intra modes in B slices",
.Help = "Enables the evaluation of intra modes in B slices." + CrLf2 + "Default disabled."}

Property FastCBF As New BoolParam With {
.Switch = "--fast-cbf",
.Text = "Short circuit analysis if a prediction is found that does not set the coded block flag",
.Help = "Short circuit analysis if a prediction is found that does not set the coded block flag (aka: no residual was encoded). It prevents the encoder from perhaps finding other predictions that also have no residual but require less signaling bits or have less distortion. Only applicable for RD levels 5 and 6." + CrLf2 + "Default disabled."}

Property CUlossless As New BoolParam With {
.Switch = "--cu-lossless",
.Text = "CU Lossless",
.Help = "For each CU, evaluate lossless (transform and quant bypass) encode of the best non-lossless mode option as a potential rate distortion optimization. If the global option --lossless has been specified, all CUs will be encoded as lossless unconditionally regardless of whether this option was enabled. Default disabled. Only effective at RD levels 3 and above, which perform RDO mode decisions."}

Property Tskip As New BoolParam With {
.Switch = "--tskip",
.Text = "Enable evaluation of transform skip coding for 4x4 TU coded blocks",
.Help = "Enable evaluation of transform skip (bypass DCT but still use quantization) coding for 4x4 TU coded blocks. Only effective at RD levels 3 and above, which perform RDO mode decisions." + CrLf2 + "Default disabled."}

Property TskipFast As New BoolParam With {
.Switch = "--tskip-fast",
.Text = "Only evaluate transform skip for NxN intra predictions (4x4 blocks)",
.Help = "Only evaluate transform skip for NxN intra predictions (4x4 blocks). Only applicable if transform skip is enabled. For chroma, only evaluate if luma used tskip. Inter block tskip analysis is unmodified." + CrLf2 + "Default disabled."}
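
The coarse-to-fine scan described in the --fast-intra help text above can be sketched as follows. This is an illustration of the idea only, not x265's implementation: the cost function is a stand-in for the encoder's real mode cost, and starting the coarse pass at mode 5 is an assumption (HEVC angular modes are numbered 2 to 34).

```python
def fast_intra_search(cost, first=5, last=30, step=5):
    """Coarse-to-fine search over angular intra modes 2..34.

    cost: callable mapping a mode number to a cost (lower is better).
    Returns (best_mode, sorted list of modes that were evaluated).
    """
    evaluated = {}

    def try_mode(mode):
        # Skip modes outside the angular range or already measured.
        if 2 <= mode <= 34 and mode not in evaluated:
            evaluated[mode] = cost(mode)

    # Pass 1: every fifth angular mode.
    for mode in range(first, last + 1, step):
        try_mode(mode)
    best = min(evaluated, key=evaluated.get)

    # Pass 2 and 3: refine at distance 2, then distance 1, around the best mode.
    for dist in (2, 1):
        try_mode(best - dist)
        try_mode(best + dist)
        best = min(evaluated, key=evaluated.get)

    return best, sorted(evaluated)

# Toy cost with a single minimum at mode 18 (stand-in for a real distortion measure).
best, tested = fast_intra_search(lambda mode: abs(mode - 18))
print(best, len(tested), tested)
```

With these assumptions the search always measures 6 + 2 + 2 = 10 modes instead of all 33, matching the count in the help text, at the price of possibly settling on a local minimum when the cost has several dips.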

MeteorRain
29th March 2015, 00:45
From my point of view, I'm completely confused by these options.
For most users using a GUI, the balance between encoding efficiency and speed should be grouped into a slider, from very fast to very slow, and that's it.
It's not very likely that users will toggle and test every option to find the optimal settings before they start encoding.

stax76
29th March 2015, 01:34
Presets are the first thing that every GUI supports, I guess. In my case there are some things I don't like about trackbars/sliders and radio buttons. I'm not saying they are bad for users, and I like to use them myself in other applications; I just don't happen to use them as a programmer in my own applications.

Ajvar
29th March 2015, 02:46
Houston, we have a problem. Using LAV 0.64 (12-30) with the CUDA decoder, trying to play a file encoded with build 444 or later gives me a player crash, and the info says it's because of a cuvid.dll crash. With LAV 0.63 all is OK, but HEVC decoding made a leap in the 0.64 version.
Files encoded with build 439 and earlier play super fine.

I use GUI software (Hybrid) but really doubt that it has anything to do with that.
I used builds from several different sources (snowfag, Ma's, Yuuki's) to double-check.
Presets medium and slow. CRF and 2pass bitrate.
"C:\PROGRA~1\Hybrid\x265-8b.exe" --pmode --input - --input-res 1920x1080 --fps 30 --no-high-tier --min-cu-size 16 --no-open-gop --crf 22.7 --no-rdoq-level --no-psy-rdoq --vbv-maxrate 8000 --vbv-bufsize 8000 --deblock=-1:-1 --colormatrix bt470bg
439 - http://fs8.www.ex.ua/load/6ca2e6560a09a8e09df9ebe3026c54e4/158030847/439.mkv
445 - http://fs8.www.ex.ua/load/1dce5ce570740508b2a882ccceb91a76/158030905/445%20bug.mkv

MeteorRain
29th March 2015, 02:56
Houston, we have a problem. Using LAV 0.64 (12-30) with the CUDA decoder, trying to play a file encoded with build 444 or later gives me a player crash, and the info says it's because of a cuvid.dll crash. With LAV 0.63 all is OK, but HEVC decoding made a leap in the 0.64 version.
Files encoded with build 439 and earlier play super fine.

I use GUI software (Hybrid) but really doubt that it has anything to do with that.
I used builds from several different sources (snowfag, Ma's, Yuuki's) to double-check.
Presets medium and slow. CRF and 2pass bitrate.
"C:\PROGRA~1\Hybrid\x265-8b.exe" --pmode --input - --input-res 1920x1080 --fps 30 --no-high-tier --min-cu-size 16 --no-open-gop --crf 22.7 --no-rdoq-level --no-psy-rdoq --vbv-maxrate 8000 --vbv-bufsize 8000 --deblock=-1:-1 --colormatrix bt470bg
439 - http://www.ex.ua/load/158030847?fs_id=8
445 - http://www.ex.ua/load/158030905?fs_id=8

The file storage is quite confusing as I'm seeing Japanese and Russian mixing on a single web page.

But anyway I'll try with your settings and see what I can do.

= EDIT =

Switched to English and I still have no idea how to get the files. Would you mind using some other file host?

And I tried with your settings. They just produced the same files, binary identical in the video data part.

Ajvar
29th March 2015, 03:52
The file storage is quite confusing as I'm seeing Japanese and Russian mixing on a single web page.

But anyway I'll try with your settings and see what I can do.

= EDIT =

Switched to English and I still have no idea how to get the files. Would you mind using some other file host?

And I tried with your settings. They just produced the same files, binary identical in the video data part.

My bad. Updated with direct links now, sorry.
The settings don't matter. It's about the different encoder version.