
View Full Version : x265 HEVC Encoder



LigH
23rd February 2017, 21:21
*sigh* How much energy for development is wasted in ego wars instead. :(

troica
23rd February 2017, 21:37
Hello guys, can anyone help me? I'm trying to write a program in Visual Studio that produces a text file reporting the changes after encoding a YUV video. The text file should list the frame #, slice #, CTU # etc. Thank you for any help.

LigH
23rd February 2017, 21:50
What kind of changes do you mean? Differences according to metrics like PSNR or SSIM? Absolute differences per YUV channel?

x265 can already produce a CSV log file per frame.

--csv <filename> Comma separated log file, if csv-log-level > 0 frame level statistics, else one line per run
--csv-log-level <integer> Level of csv logging, if csv-log-level > 0 frame level statistics, else one line per run: 0-2
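
For anyone scripting this, here is a minimal Python sketch that assembles an x265 command line with the two logging options quoted above. The helper name `build_csv_command` and the file names are made up for illustration; only `--csv`, `--csv-log-level` and the `-o` output switch come from this thread, and the sketch assumes the input is a Y4M file.

```python
import shlex


def build_csv_command(source, output, csv_path, log_level=2):
    """Return an x265 argument list that enables CSV logging.

    log_level follows the 0-2 range quoted above: 0 writes one line per
    run, values above 0 write frame-level statistics.
    """
    if log_level not in (0, 1, 2):
        raise ValueError("csv-log-level must be 0, 1 or 2")
    return [
        "x265",
        "--csv", csv_path,
        "--csv-log-level", str(log_level),
        source,
        "-o", output,
    ]


if __name__ == "__main__":
    cmd = build_csv_command("source.y4m", "out.hevc", "stats.csv")
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(cmd))
```

The list form can be handed to `subprocess.run` as-is once x265 is installed; the resulting CSV is per frame at most, not per CTU.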

birdie
23rd February 2017, 22:14
Ryzen supports AVX2, so let's hope for the best and expect the worst. :-)

LigH
23rd February 2017, 22:23
You probably mean it similar to the support of SSE3 by Phenom-II, which is considered too slow by x264/x265 to be enabled? :rolleyes:

NikosD
23rd February 2017, 22:31
The developers of x265 can tell us for sure if and how FMA3 is used.

Because nobody here disagrees that integer AVX2 is used a lot by x265.

Usually a floating point division or some other single floating point instruction could be used in projects like x264 or x265 but nothing more than that regarding floating point support.

SSEx is very fast in RyZen, it's as fast as Kabylake.

AVX/AVX2 is supported, but it's about half speed.

Ma
23rd February 2017, 23:18
The developers of x265 can tell us for sure if and how FMA3 is used.

Neither FMA3 nor FMA4 is used in x265. It could be used, for example, in
https://bitbucket.org/multicoreware/x265/src/820f4327ddac44decb4328602ca63e84197ab473/source/common/x86/mc-a2.asm?at=default&fileviewer=file-view-default#mc-a2.asm-1120
but the speed-up is only a few CPU cycles in a function that is not important for the overall encoding time. It is OK to use MUL and ADD instead of one FMA.

troica
24th February 2017, 17:02
What kind of changes do you mean? Differences according to metrics like PSNR or SSIM? Absolute differences per YUV channel?

x265 can already produce a CSV log file per frame.

--csv <filename> Comma separated log file, if csv-log-level > 0 frame level statistics, else one line per run
--csv-log-level <integer> Level of csv logging, if csv-log-level > 0 frame level statistics, else one line per run: 0-2

I mean something like what kind of config you have before and after encoding.

If I encode 10 frames, it should show 10 frames, and record the quantization parameters per CTU etc.

LigH
25th February 2017, 13:24
So you are looking for an API that will trigger even more detailed reports than the CSV log with level 2, even a per-CTU log... I believe there is none yet, only per-file and per-frame logs (see: x265-extras.h).

troica
25th February 2017, 13:32
So you are looking for an API that will trigger even more detailed reports than the CSV log with level 2, even a per-CTU log... I believe there is none yet, only per-file and per-frame logs (see: x265-extras.h).

If so, how do I access the x265-extras? Sorry for being a noob at encoders.

And is there anything I can code in Visual Studio to report the per-CTU log?

LigH
25th February 2017, 13:46
I assume you already used Mercurial to clone the whole x265 source repository? Apart from that, you can look at the source here:

https://bitbucket.org/multicoreware/x265/src / source / x265-extras.h (as well as x265-extras.cpp)

troica
25th February 2017, 14:20
How do you access the source files?

birdie
25th February 2017, 15:04
Intel got scared: http://wccftech.com/intel-amd-price-war-ryzen-processors/

Hopefully you haven't bought any Intel CPUs lately 'cause it's time to kick yourself in the balls.

Lovely!

LigH
25th February 2017, 15:11
@ troica:

Always read the documentation (https://bitbucket.org/multicoreware/x265/wiki/Home) first...

troica
25th February 2017, 15:58
Okay, sorry. I really don't know how to start using HEVC or how to simulate a wireless video transmission between Visual Studio and the ns3 program. It's where my thesis will start, so sorry if I have a lot of questions. Btw, is this x265 the HM model?

Barough
25th February 2017, 18:02
x265 v2.3+9-820f4327ddac (http://www82.zippyshare.com/v/XdCFQwng/file.html) (MSYS/MinGW, GCC 6.3.0, 32 & 64bit 8/10/12bit multilib EXEs)

x265 [info]: HEVC encoder version 2.3+9-820f4327ddac
x265 [info]: build info [Windows][GCC 6.3.0][32 bit/64 bit] 8bit+10bit+12bit
x265 [info]: using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX AVX2 FMA3 LZCNT BMI2

https://bitbucket.org/multicoreware/x265/commits/branch/default

Sagittaire
25th February 2017, 21:57
I was told by an Intel fanboy and RyZen 7 reviewer that you need a Core i7 7700K@4.8GHz to catch the performance of RyZen 7@4.0GHz at x265 2nd pass or you could see the same thing reversed.

Will see...

No, the Ryzen 7 1800X 8C/16T at 500$ will certainly be on par with the i7 6900K 8C/16T at 1000$ for x264/x265 encoding. You would certainly need something like an O/C to 6 GHz (nitrogen!!!) on an i7 7700K (400$), or even more, to match the performance of an R7 1700 (350$) at stock frequency.

NikosD
25th February 2017, 22:01
x264 is a lot different than x265 due to AVX2 optimizations.

Yes, I agree that using x264 RyZen is going to have a party.

But x265 has a lot of AVX2 optimizations and it's not possible for RyZen 7 to catch the performance of an Intel 8C/16T.

Motenai Yoda
25th February 2017, 22:23
No, the Ryzen 7 1800X 8C/16T at 500$ will certainly be on par with the i7 6900K 8C/16T at 1000$ for x264/x265 encoding. You would certainly need something like an O/C to 6 GHz (nitrogen!!!) on an i7 7700K (400$), or even more, to match the performance of an R7 1700 (350$) at stock frequency.

I wouldn't be so enthusiastic about Ryzen, as the only public comparison is still AMD's own charts, and as Bulldozer and its derivatives are no better than Intel's ones.

Also 7700k cost now about 340€ (maybe 320/300€ with the 10/15$ prices cut), sure not 400$, and 1700x is (yep) an esacore, but with castrated avx2 and few/no compiler optimizations.

The 6900K is currently the non plus ultra for a consumer PC, so it's badly overpriced.

Sagittaire
25th February 2017, 22:33
I wouldn't be so enthusiastic about Ryzen, as the only public comparison is still AMD's own charts, and as Bulldozer and its derivatives are no better than Intel's ones.

Also 7700k cost now about 340€ (maybe 320/300€ with the 10/15$ prices cut), sure not 400$, and 1700x is (yep) an esacore, but with castrated avx2 and few/no compiler optimizations.

The 6900K is currently the non plus ultra for a consumer PC, so it's badly overpriced.

Here in France, "CPC hardware magazine" has had an exclusive test out for a month. The tested CPU is a 3.15/3.3/3.5 GHz 8C/16T sample (it will be like the R7 1700).

Overall benchmark for x264 HB encoding 2K, x265 HB encoding 4K, WPrime, PovRay, Blender, 3DMax 2015/Mental Ray, Corona Benchmark

i7 6900K: 193.4
R7 3.15/3.3/3.5 Ghz: 168.7
i7 6800K: 152.5
i7 6700K: 137.3
i7 4790K: 127.7
FX-8370: 105.2
i5 6600K: 100.0

Make scaling for R7 1800X at 3.6/3.8/4.0 Ghz by yourself ... ;-)
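
One naive way to do that scaling is to assume the composite score grows linearly with clock speed, which real workloads rarely do exactly (memory speed, turbo behaviour and thermal limits all interfere). Under that assumption, and taking the first figure of each clock triple quoted above (3.15 GHz for the sample, 3.6 GHz for the R7 1800X), a rough Python sketch looks like this. `scale_score` is a made-up helper name, and the result is an optimistic estimate, not a benchmark.

```python
def scale_score(score, sample_clock_ghz, target_clock_ghz):
    """Scale a benchmark score linearly with clock speed (optimistic estimate)."""
    return score * target_clock_ghz / sample_clock_ghz


if __name__ == "__main__":
    sample_score = 168.7  # R7 sample score from the list above
    estimate = scale_score(sample_score, 3.15, 3.6)
    print(f"Linear estimate for R7 1800X: {estimate:.1f}")
    print(f"i7 6900K score for reference: {193.4:.1f}")
```

With these inputs the linear estimate comes out at about 192.8, next to the 193.4 listed for the i7 6900K. Whether real silicon scales that cleanly is exactly what the launch reviews will have to show.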

Sagittaire
25th February 2017, 22:45
x264 is a lot different than x265 due to AVX2 optimizations.

Yes, I agree that using x264 RyZen is going to have a party.

But x265 has a lot of AVX2 optimizations and it's not possible for RyZen 7 to catch the performance of an Intel 8C/16T.

Ryzen seems to have a really good AVX/AVX2 implementation too ... Almost as good as Kaby Lake

NikosD
25th February 2017, 22:46
I certainly don't believe that.

nevcairiel
26th February 2017, 00:02
Ryzen seems to have a really good AVX/AVX2 implementation too ... Almost as good as Kaby Lake

Even AMD's documentation says that AVX/AVX2 is implemented using 128-bit processing units (and not 256-bit), so they have about half the throughput of Intel - which would have quite a strong impact on x265 performance.
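
To make the "about half the throughput" argument concrete, here is a small Python sketch that only counts how many full-width vector operations are needed to touch a block of samples on a 128-bit versus a 256-bit datapath. It deliberately ignores clock speed, latency, the number of execution units and memory effects, so it illustrates the lane arithmetic and nothing more. The function name and the 64x64 block of 16-bit samples are illustrative assumptions.

```python
def vector_ops_needed(num_elements, element_bits, datapath_bits):
    """Count how many full-width operations are needed to touch every element."""
    lanes = datapath_bits // element_bits
    # Round up: a partially filled vector still costs one operation.
    return -(-num_elements // lanes)


if __name__ == "__main__":
    # One 64x64 block of 16-bit samples, a size chosen purely for illustration.
    samples = 64 * 64
    for width in (128, 256):
        print(width, "bit:", vector_ops_needed(samples, 16, width), "operations")
```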

Anyway, we'll see actual believable results in a week or so, not "leaked" results or AMD marketing results.

birdie
26th February 2017, 00:08
March the 2nd is really close, it's just four days from now. I guess we can all stop and wait for the official benchmarks. Also, there are rumors that AMD has already shipped a million Ryzen CPUs so we'll have a lot of data to chew on.

LigH
26th February 2017, 00:10
I'll be curious too. AMD used to be "notorious" for well optimized die designs vs. "brute force" preferred by intel. And even if RyZen is not as powerful as people may hope for, we still have to thank AMD for pushing intel (and Nvidia?) back to a solid ground, regarding price management ... :rolleyes: — Thanks birdie for the link a page ago.

pingfr
26th February 2017, 00:45
Anyone could take a quick peek at these settings and tell me if I'm missing a tweak here and there, thanks?

C:\x26x>x265.exe --crf 18 --aq-mode 1 --ctu 64 --qg-size 32 --deblock -1:-1 --me star --bframes 6 --rc-lookahead 60 --ref 5 --b-adapt 2 --tu-intra-depth 4 --tu-inter-depth 4 --merange 92 --weightp --weightb --scenecut 40 --rd 4 --limit-ref 0 --limit-modes --tskip --rect --amp --max-merge 5 --subme 7 --b-intra source.y4m -o out.hevc

Thanks!

Edit: Also what about --cutree ?

aymanalz
26th February 2017, 03:08
Anyone could take a quick peek at these settings and tell me if I'm missing a tweak here and there, thanks?

C:\x26x>x265.exe --crf 18 --aq-mode 1 --ctu 64 --qg-size 32 --deblock -1:-1 --me star --bframes 6 --rc-lookahead 60 --ref 5 --b-adapt 2 --tu-intra-depth 4 --tu-inter-depth 4 --merange 92 --weightp --weightb --scenecut 40 --rd 4 --limit-ref 0 --limit-modes --tskip --rect --amp --max-merge 5 --subme 7 --b-intra source.y4m -o out.hevc

Thanks!

Edit: Also what about --cutree ?

merange 92 is insanely high. What is the video resolution?

divxmaster
26th February 2017, 07:18
Anyone could take a quick peek at these settings and tell me if I'm missing a tweak here and there, thanks?

C:\x26x>x265.exe --crf 18 --aq-mode 1 --ctu 64 --qg-size 32 --deblock -1:-1 --me star --bframes 6 --rc-lookahead 60 --ref 5 --b-adapt 2 --tu-intra-depth 4 --tu-inter-depth 4 --merange 92 --weightp --weightb --scenecut 40 --rd 4 --limit-ref 0 --limit-modes --tskip --rect --amp --max-merge 5 --subme 7 --b-intra source.y4m -o out.hevc

Thanks!

Edit: Also what about --cutree ?

I would add
--rskip --no-sao --no-open-gop
and I prefer
--psy-rdoq 1.1

Cheers,
Divxmaster

troica
26th February 2017, 07:29
Hello guys, are x265 and the HM reference model the same?

Selur
26th February 2017, 08:11
they both implement the same standard, but x265 has a lot of optimizations.
HM = example on how it could be done for academic purposes
x265 = real world implementation which is a lot faster and usable

WhatZit
26th February 2017, 08:59
Anyone could take a quick peek at these settings and tell me if I'm missing a tweak here and there, thanks?

I'm constantly amazed that people keep pursuing gigantic strings of parameters to try and solve a problem that's already been solved.

--tune grain, along with its built-in rc-grain algorithm, fundamentally changes the encoding decisions x265 makes by weighting towards quality over compression (aka high-frequency retention).

The settings contained in these presets...

http://i66.tinypic.com/ka495x.jpg

...retain more detail the slower you go. So much so that you can easily drop your non-grain CRF's BACK one or two notches at the slower presets.

I suspect that most people who don't understand tunegrain have just left the rest of their options at "veryslow CRF18" when "slow CRF20" would probably produce comparable visual quality, faster encodes and smaller filesizes.

This is my current command line, which replaced something 20+ parameters long:

--profile main10 --tune grain --deblock=-6:-6 --no-strong-intra-smoothing

Note that I haven't included the CRF or preset? That's because preset+CRF is ALL I change to adjust speed/filesize whilst still retaining quality for ANY encode, depending on the source.

Now, I'll admit that tunegrain could still undergo some tweaks for speed, and others have suggested the parameters to do that. Myself, I couldn't be happier about ditching the pursuit of endless quality experiments with this-command-line and that-command-line.
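
The "fixed options, vary only preset and CRF" workflow described in this post can be sketched in Python as follows. The helper name `build_grain_command` and the file names are hypothetical; the fixed options are copied from the command line above, `--preset` and `--crf` are the spellings used elsewhere in this thread, and the commands are printed, not executed.

```python
import shlex

# Fixed part of the command line quoted above.
FIXED_ARGS = [
    "--profile", "main10",
    "--tune", "grain",
    "--deblock=-6:-6",
    "--no-strong-intra-smoothing",
]


def build_grain_command(source, output, preset, crf):
    """Build an x265 argument list where only preset and CRF vary per encode."""
    return ["x265", "--preset", preset, "--crf", str(crf),
            *FIXED_ARGS, source, "-o", output]


if __name__ == "__main__":
    # The two preset/CRF pairs mentioned above.
    for preset, crf in (("veryslow", 18), ("slow", 20)):
        print(shlex.join(build_grain_command("in.y4m", "out.hevc", preset, crf)))
```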

troica
26th February 2017, 09:07
they both implement the same standard, but x265 has a lot of optimizations.
HM = example on how it could be done for academic purposes
x265 = real world implementation which is a lot faster and usable

Are they both programmable using Visual Studio? My thesis is about wireless video transmission using HEVC with HM, but I don't really know how to start using Visual Studio :(

Selur
26th February 2017, 09:11
not sure what you understand under 'programmable', but both can be compiled with MSVC and iirc both have an API and can be compiled as libraries.
-> no clue how to best start

troica
26th February 2017, 09:54
Thanks. I wanted to generate a log file using the HM model that records the difference in SNR, quantization parameter etc. per frame, CTU, and slice number. Is there such a thing, or should I code it manually?

Boulder
26th February 2017, 11:12
Now, I'll admit that tunegrain could still undergo some tweaks for speed, and others have suggested the parameters to do that. Myself, I couldn't be happier about ditching the pursuit of endless quality experiments with this-command-line and that-command-line.

This. What I've done is choose --preset slower --tune grain --crf 21 as the baseline, then picked some recommended settings to tune performance (--limit-refs 3 --limit-tu 3 --rskip) and some quality-related ones from the bottom two presets (--tu-inter-depth 4 --tu-intra-depth 4 --max-merge 4). Produces 10-30% smaller files than x264 with my preferred settings at CRF 18, and seems to keep detail better.

pingfr
26th February 2017, 12:11
merange 92 is insanely high. What is the video resolution?

1920x1080.

pingfr
26th February 2017, 12:15
I would add
--rskip --no-sao --no-open-gop
and I prefer
--psy-rdoq 1.1

I suppose --rskip stands for the "recursion skip", however both veryslow and placebo presets have the value set at 0. I suppose that means they are kind of --no-rskip?

Also I would assume --no-sao disables SAO, setting it to 0, whereas every preset except ultrafast and superfast has the value set to 1, which makes me actually WANT SAO.

Now can some kind soul explain what is the rdoq-level? why is it set at 2 and how to make sure to keep it set at 2 from the command line args?

Thanks.

microchip8
26th February 2017, 12:35
@pingfr

you don't want SAO. It blurs things so if you want to preserve detail as much as possible, disable it and disable strong-intra-smoothing as well

pingfr
26th February 2017, 12:38
@froggy1: Noted. Any idea why the "best presets" have it enabled then?

microchip8
26th February 2017, 12:40
@froggy1: Noted. Any idea why the "best presets" have it enabled then?

The presets of x265 are optimized for (very) low bitrate where at those levels, blur is preferred over other compression artifacts. That's why most have SAO and intra smoothing enabled. Also the x265 presets haven't been optimized in a long time. If you want to squeeze out as much detail as possible, you'll need to tweak manually (IMHO)

pingfr
26th February 2017, 12:46
The presets of x265 are optimized for (very) low bitrate where at those levels, blur is preferred over other compression artifacts.

That's not my case here, quite the contrary, I'm trying to optimize for top-notch quality at a 10%~15% size/compression efficiency over x264, bitrate isn't an issue and never will be.

That's why most have SAO and intra smoothing enabled.

Gotcha.

Also the x265 presets haven't been optimized in a long time.

Not really helpful now is it?

If you want to squeeze out as much detail as possible, you'll need to tweak manually (IMHO)

That's exactly what I'm doing as we speak.

Speaking of quality or should we rather call it "detail retention", I see everyone coming up with different custom deblocking values.

Some are like -1:-1, whereas others recommend going as far as -6:-6. Is there a "best" and a "safe" value? Or when is too much deblocking actually... well... "too much"?

My latest test settings are:

x265.exe --crf 21 --aq-mode 1 --ctu 64 --qg-size 32 --deblock -6:-6 --me star --bframes 8 --rc-lookahead 60 --ref 5 --b-adapt 2 --tu-intra-depth 4 --tu-inter-depth 4 --merange 92 --weightp --weightb --scenecut 40 --rd 4 --limit-ref 0 --limit-modes --tskip --rect --amp --max-merge 5 --subme 7 --b-intra --no-rskip --no-sao --no-strong-intra-smoothing in.y4m -o out.hevc

microchip8
26th February 2017, 12:50
your line looks good, except for the, IMHO, too high merange value. Even for UHD, that's overkill. As to deblocking, there's no magic value and different people prefer different values. I'm fine with -3 and even -2. also, if you're mostly encoding HD/FHD, a CTU of 32 is slightly preferred

pingfr
26th February 2017, 12:59
your line looks good, except for the, IMHO, too high merange value. Even for UHD, that's overkill.

Will lowering the --merange 92 back to a saner --merange 57 value have any significant impacts on either resulting quality (better/worse looking?) and/or compression speed (faster/slower?) and/or compression efficiency (bigger/smaller filesize?)?

As to deblocking, there's no magic value and different people prefer different values. I'm fine with -3 and even -2.

If you were asked to justify your choices of a -3 or -2 values, how would you do so? :)

also, if you're mostly encoding HD/FHD, a CTU of 32 is slightly preferred

I'm tackling with 1080p/720p encodes from Blu-Ray disc sources at the moment, however it appears all the presets recommend using a --ctu 64. I think I'll stick to that at least for now.

Edit: Also I can't find the command line switch to set a custom lookahead-slices arg.

microchip8
26th February 2017, 13:23
Will lowering the --merange 92 back to a saner --merange 57 value have any significant impacts on either resulting quality (better/worse looking?) and/or compression speed (faster/slower?) and/or compression efficiency (bigger/smaller filesize?)?

it will have a slight impact on speed. As to quality, it will be difficult to spot the difference, unless you really "stick your nose" into the picture with a microscope.

The reason I said that 92 is overkill is that with that value, the encoder may select a motion vector that's not necessarily the best one. It is a similar situation with x264, where values above 32 can in some cases actually hurt quality instead of improving it.



If you were asked to justify your choices of a -3 or -2 values, how would you do so? :)


Simply, I prefer a slightly "soft" picture instead of one that jumps at me



I'm tackling with 1080p/720p encodes from Blu-Ray disc sources at the moment, however it appears all the presets recommend using a --ctu 64. I think I'll stick to that at least for now.


Keep in mind that the presets, as I said, are optimized for 2 things. One I already mentioned (very low bitrates) while the other is at least UHD resolutions. At these resolutions, a larger CTU makes sense


Edit: Also I can't find the command line switch to set a custom lookahead-slices arg.

I don't think it has one yet

ChaosKing
26th February 2017, 13:28
Will lowering the --merange 92 back to a saner --merange 57 value have any significant impacts on either resulting quality (better/worse looking?) and/or compression speed (faster/slower?) and/or compression efficiency (bigger/smaller filesize?)?


You can always test it yourself with a short sample and compare the results...

Yes the filesize will be a little bit smaller or the quality a little bit better, but it will be so small that it's not really worth it. That's why it is in the placebo preset, bcs it's sloooow ;)
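
A minimal Python sketch of that kind of A/B test: it generates two otherwise identical command lines that differ only in `--merange`, plus a helper to express the size difference as a percentage. The function names, the file names and the byte counts in the demo are placeholders, not measurements; actual sizes would come from `os.path.getsize()` after running both encodes on a short sample.

```python
def with_merange(args, merange):
    """Return a copy of an x265 argument list with the --merange value replaced."""
    result = list(args)
    index = result.index("--merange")  # raises ValueError if the option is absent
    result[index + 1] = str(merange)
    return result


def percent_change(reference, candidate):
    """Relative change of candidate versus reference, in percent."""
    return (candidate - reference) / reference * 100.0


if __name__ == "__main__":
    base = ["x265", "--crf", "21", "--merange", "92", "sample.y4m", "-o", "out.hevc"]
    for value in (92, 57):
        print(" ".join(with_merange(base, value)))
    # Placeholder byte counts, only to show how the helper is called.
    print(f"{percent_change(1_000_000, 1_010_000):+.1f}% size change")
```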

aymanalz
26th February 2017, 14:50
1920x1080.

Then I repeat, merange of 92 is insanely high. Even the default of 57 is very high.
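
For a sense of scale, this Python sketch compares the size of the search window for the two values, assuming merange is the maximum offset in pixels in each direction, so the window is a square of side 2 * merange + 1. That is an upper bound on where a motion vector may land, not the number of positions actually tested: pattern searches such as `--me star` probe only a small subset of the window.

```python
def window_positions(merange):
    """Integer positions in a square window reaching merange pixels each way."""
    side = 2 * merange + 1
    return side * side


if __name__ == "__main__":
    for merange in (57, 92):
        print(merange, window_positions(merange))
    ratio = window_positions(92) / window_positions(57)
    print(f"ratio: {ratio:.2f}")
```

Under that assumption the window for 92 covers 34225 positions against 13225 for 57, roughly 2.6 times the area.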

aymanalz
26th February 2017, 14:53
Will lowering the --merange 92 back to a saner --merange 57 value have any significant impacts on either resulting quality (better/worse looking?) and/or compression speed (faster/slower?) and/or compression efficiency (bigger/smaller filesize?)?



If you were asked to justify your choices of a -3 or -2 values, how would you do so? :)



I'm tackling with 1080p/720p encodes from Blu-Ray disc sources at the moment, however it appears all the presets recommend using a --ctu 64. I think I'll stick to that at least for now.

Edit: Also I can't find the command line switch to set a custom lookahead-slices arg.


1) Lowering the merange to 57 will significantly boost your encoding speed, and it is very unlikely (read, impossible) that you will notice any quality degradation.

2) CTU of 32 is good enough for 1080p, and that too will give you significant speed increase. Only much higher resolutions will benefit from CTU 64.

A couple of pages back, you were complaining about slow encoding speed, so I'm wondering why you would set these unhelpfully large values for merange especially, and CTU.
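
To put the CTU sizes from this exchange into numbers, the following Python sketch counts how many CTUs tile a frame at each size, rounding partial CTUs at the frame edge up. It says nothing about speed or quality by itself; it only shows how coarse the 64x64 grid is at 1080p compared with 2160p. `ctus_per_frame` is a made-up helper name.

```python
import math


def ctus_per_frame(width, height, ctu_size):
    """Number of CTUs needed to tile a frame, rounding partial CTUs up."""
    return math.ceil(width / ctu_size) * math.ceil(height / ctu_size)


if __name__ == "__main__":
    frames = {"1080p": (1920, 1080), "2160p": (3840, 2160)}
    for name, (w, h) in frames.items():
        for ctu in (32, 64):
            print(f"{name} CTU {ctu}: {ctus_per_frame(w, h, ctu)} CTUs")
```

A 1080p frame with CTU 32 and a 2160p frame with CTU 64 both come out at 2040 CTUs, while 1080p with CTU 64 has only 510.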

littlepox
26th February 2017, 16:47
--ssim-rd seems to be another detail killer. May I know how to use it in near-transparent encoding?

pingfr
26th February 2017, 17:48
Then I repeat, merange of 92 is insanely high. Even the default of 57 is very high.

Back to 57 then I guess. Thanks! ;)

pingfr
26th February 2017, 17:51
1) Lowering the merange to 57 will significantly boost your encoding speed, and it is very unlikely (read, impossible) that you will notice any quality degradation.

57 it is then. Case settled.

2160p = 92 merange.
1080p = 57 merange.
720p = ?? merange.

2) CTU of 32 is good enough for 1080p, and that too will give you significant speed increase. Only much higher resolutions will benefit from CTU 64.

2160p = CTU 64.
1080p = CTU 32.
720p = CTU ??.

A couple of pages back, you were complaining about slow encoding speed, so I'm wondering why you would set these unhelpfully large values for merange especially, and CTU.

Because... these values were the ones taken directly from the placebo presets? :D
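
The tentative per-resolution table from this post can be kept as a small lookup in Python. The values are just the ones written down above and are not a recommendation; the 720p entries were left open in the thread, so they stay `None` here and are not guessed. `SUGGESTED` and `suggested_args` are hypothetical names.

```python
# Values collected from the post above; 720p was left open in the thread,
# so it stays None here instead of being guessed.
SUGGESTED = {
    2160: {"merange": 92, "ctu": 64},
    1080: {"merange": 57, "ctu": 32},
    720: {"merange": None, "ctu": None},
}


def suggested_args(height):
    """Return x265 arguments for a frame height, skipping unknown values."""
    settings = SUGGESTED.get(height, {})
    args = []
    for option in ("merange", "ctu"):
        value = settings.get(option)
        if value is not None:
            args += [f"--{option}", str(value)]
    return args


if __name__ == "__main__":
    for height in (2160, 1080, 720):
        print(height, suggested_args(height))
```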