x265: Are new versions = encoding speed ? [Archive]

Forteen88

11th November 2018, 10:13

Hi. I wonder, in the new versions of x265 that are coming out, are they mostly (like 95%) for speed, or are there many video-quality improvements also?
:thanks:

excellentswordfight

11th November 2018, 15:29

This question is probably best answered by someone closer to the development of x265, but tbh I have seen very little improvement in either speed or quality for some time now. I have basically seen the same speed/quality for my use cases since the new lambda tables in 2.4. I have glanced at most of the patch notes since then, and from what I've seen most of them consist of bug fixes and added features (chunk encoding this year, for example). AVX-512 is the exception, but the gains there were somewhat disappointing. I hope I'm wrong, but I view x265 as very mature in these regards at this point.

Blue_MiSfit

12th November 2018, 00:09

There were recently some nice fixes for some really nasty VBV bugs that made a huge difference in my workflows.

benwaggoner

12th November 2018, 18:49

There have been some features that directly improve quality for some scenarios since 2.4. Also, performance improvements ARE quality improvements, because you can use a slower preset and get improved quality at the same encoding time. All the analysis reuse stuff can really speed up encoding a 12-stream adaptation set. And chunked encoding is huge.

Specific to quality:

2.5: --tune grain
2.7: --radl and --gop-lookahead
2.8: --refine-intra 4 and VBV Lookahead fix
2.9: fix rowStat computation and disable noise reduction with VBV

Only the VBV Lookahead and rowStat fixes would be on by default in a basic encode; the others are context specific. Tune grain only for really grainy stuff, radl only for fixed GOP, and I'm not really sure what gop-lookahead is for (can't I just set --keyint and --min-keyint?).

But all things together, on a good AVX-512 capable system, for a lot of scenarios one can now use --preset slower in the time that --preset slow took in 2.3. Heavily leveraging the analysis reuse for multibitrate encoding, I'm sure it can get up to 3-4x for some scenarios.

MGarret

13th November 2018, 20:51

--gop-lookahead overrides the GOP boundary set by --keyint if it finds a new scene change within the additional lookahead that you specify. It's for adaptive GOP, not for fixed.

benwaggoner

16th November 2018, 17:40

Right. I'm just fuzzy on the difference between using --keyint 60 and --keyint 50 --gop-lookahead 10.

Those would both have a maximum GOP duration of 60 frames. Would the --gop-lookahead approach yield a bias towards having GOPs closer to 50 or something?

MGarret

17th November 2018, 16:38

In theory the --gop-lookahead option should work like that: it should bias towards 50 and then sometimes extend the GOP to 60 for a scenecut. But I found that x265 likes to save bits by coding many scene changes as P and not IDR/I, so this option is basically useless.
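
For anyone comparing the two setups discussed above, here is a minimal sketch that only prints the two command lines side by side rather than running them. The file names input.y4m and out.hevc are placeholders, and gop_cmd is a hypothetical helper, not part of x265.

```shell
# Print (do not run) an x265 command line for a given GOP setup.
# gop_cmd, input.y4m and out.hevc are hypothetical placeholders.
gop_cmd() {
    keyint=$1
    lookahead=$2
    cmd="x265 --input input.y4m --keyint $keyint"
    # Only add --gop-lookahead when it is non-zero (0 is the default).
    if [ "$lookahead" -gt 0 ]; then
        cmd="$cmd --gop-lookahead $lookahead"
    fi
    echo "$cmd --output out.hevc"
}

gop_cmd 60 0    # plain maximum GOP length of 60 frames
gop_cmd 50 10   # boundary at 50, may be extended by up to 10 frames to reach a scenecut
```

Both variants cap the GOP at 60 frames; the difference is only in where the encoder prefers to place the boundary when no scenecut is found.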

Blue_MiSfit

18th November 2018, 06:02

What's the benefit of the chunked encoding method in x265?

I've always done this by simply using ffmpeg to encode chunks of a source and then stitching the result, e.g.

ffmpeg -ss 30 -i input.mov -c:v libx265 -t 10 output.mp4

From the x265 docs, this sounds weird:

--chunk-start <integer>
First frame of the chunk. Frames preceding this in display order will be encoded, however, they will be discarded in the bitstream. This feature can be enabled only in closed GOP structures. Default 0 (disabled).

When would this be useful?

benwaggoner

23rd November 2018, 23:09

It would let you encode a section of video that you want rate control and VBV to account for, but that won't be included in the output bitstream.

Might help some with stuff like weighted prediction as well, if GOP boundaries catch a transition partway through.
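
As a rough sketch of what that looks like on the command line, assuming a closed-GOP encode: the helper below only prints an x265 invocation, it does not run one. chunk_cmd and the file names are hypothetical, the frame numbers are made-up values, and how they interact with --seek is not something this sketch has verified.

```shell
# Print (do not run) an x265 command that keeps only frames first..last.
# chunk_cmd, input.y4m and the output name are hypothetical placeholders.
chunk_cmd() {
    first=$1
    last=$2
    echo "x265 --input input.y4m --no-open-gop --chunk-start $first --chunk-end $last --output chunk_${first}_${last}.hevc"
}

# Going by the docs text quoted above, frames before 721 are still encoded
# (so rate control and VBV see them) but are discarded from the bitstream.
chunk_cmd 721 1440
```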

Blue_MiSfit

27th November 2018, 01:05

Interesting - so I could encode 30 second chunks but have each include a few seconds of the prior chunk in the rate control. Neat!

Selur

27th November 2018, 05:45

About chunked encoding: would this be a good start for an additional multi-threading option? Like 'virtually' splitting the source into chunks and encoding them in parallel, without having to physically split the source, on systems where the CPU usage isn't that high?

benwaggoner

27th November 2018, 17:04

It could be, for cases where you have a lot of unused RAM or cores. People have already done scripts or apps using the .dll for that.

Blue_MiSfit

30th November 2018, 07:15

Yep, absolutely. This is how you encode super fast using tons of machines e.g. in the cloud :)
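
A minimal sketch of the "virtual split" idea, assuming fixed-length chunks and the --seek/--frames switches that come up elsewhere in this thread. plan_chunks is a hypothetical helper, the frame counts and file names are made up, and the commands are only printed. In practice you would probably cut at scene changes rather than at fixed intervals.

```shell
# Print "index seek frames" for each chunk of a source with $1 frames,
# cut into pieces of at most $2 frames. plan_chunks is a hypothetical helper.
plan_chunks() {
    total=$1
    len=$2
    start=0
    i=0
    while [ "$start" -lt "$total" ]; do
        n=$len
        # The last chunk may be shorter than the others.
        if [ $((start + n)) -gt "$total" ]; then
            n=$((total - start))
        fi
        echo "$i $start $n"
        start=$((start + n))
        i=$((i + 1))
    done
}

# Print one x265 command per chunk; input.y4m and chunk_N.hevc are placeholders.
plan_chunks 1700 720 | while read -r i seek frames; do
    echo "x265 --input input.y4m --seek $seek --frames $frames --no-open-gop --output chunk_$i.hevc"
done
```

Each printed line could then be handed to a separate core or machine, since the chunks do not depend on each other.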

BLKMGK

9th December 2018, 03:16

How frame accurate is this? Are you removing audio first? I've been struggling to do this on video that's had its audio removed, using x265 on Linux. As I understand it, I can specify start and end frames and use keyframes to find entry points. Damned if I can get the command line right, though. ffmpeg as above looks much simpler, but reading the docs it didn't look frame accurate to my novice eyes. I may try it and see, could it really be that easy?

x265 --seek 0 --crf 20 --fps 2400/1001 --keyint 240 --frames 1430 --y4m --preset slow "crim.mkv" --output 1.mkv

and

ffmpeg.exe -loglevel panic -i -strict -1 -f yuv4mpegpipe - | "x265.exe" --seek 0 --colorprim bt709 --transfer bt709 --colormatrix bt709 --crf 20 --fps 24000/1001 --min-keyint 24 --keyint 240 --frames 1430 --sar 1:1 --preset slow --ctu 16 --y4m --pools "+" --output "1.mkv"

Laughs at me telling me it cannot open the source file :mad::rolleyes:

Selur

9th December 2018, 09:40

To state the obvious: at least in the command line you posted, you are not telling x265 that the source is a pipe. You are missing ' --input - ' in your x265 call, and your ffmpeg call also doesn't have an input file.

Cu Selur
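
Put together, a corrected pipeline might look roughly like the sketch below. The source name is a placeholder (the posted command had nothing after -i), writing a raw .hevc stream is an assumption made here, only a subset of the original switches is kept, and the command is printed rather than executed.

```shell
# Placeholder source name; the posted command gave no file after -i.
src="source.mkv"

# ffmpeg decodes to y4m on stdout; x265 reads it from stdin via "--input -".
cmd="ffmpeg -loglevel panic -i $src -strict -1 -f yuv4mpegpipe - | x265 --input - --y4m --crf 20 --min-keyint 24 --keyint 240 --frames 1430 --preset slow --output 1.hevc"

echo "$cmd"
```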

BLKMGK

9th December 2018, 10:02

D'oh! Still having trouble but I see what you're pointing out at least, it was confusing when it would name the input file and state it couldn't open it in some of my attempts. Learning the hard way and obviously cribbing from examples where I can find them! :rolleyes: I won't pull the conversation further off-topic with my attempts but thank you for the pointer!

Trying to figure out how best to encode pieces and chunked encoding as a feature caught my eye. Not seeing much documentation on it but saw seek and frames as possibilities.

Blue_MiSfit

10th December 2018, 21:30

How frame accurate is this? Are you removing audio first? I've been struggling to do this on video that's had its audio removed, using x265 on Linux. As I understand it, I can specify start and end frames and use keyframes to find entry points. Damned if I can get the command line right, though. ffmpeg as above looks much simpler, but reading the docs it didn't look frame accurate to my novice eyes. I may try it and see, could it really be that easy?

x265 --seek 0 --crf 20 --fps 2400/1001 --keyint 240 --frames 1430 --y4m --preset slow "crim.mkv" --output 1.mkv

and

ffmpeg.exe -loglevel panic -i -strict -1 -f yuv4mpegpipe - | "x265.exe" --seek 0 --colorprim bt709 --transfer bt709 --colormatrix bt709 --crf 20 --fps 24000/1001 --min-keyint 24 --keyint 240 --frames 1430 --sar 1:1 --preset slow --ctu 16 --y4m --pools "+" --output "1.mkv"

Laughs at me telling me it cannot open the source file :mad::rolleyes:

Yep, you do audio and video completely separately. We deliver them separately anyway via HLS and DASH (since multiple audio tracks and formats are common)

benwaggoner

12th December 2018, 20:12

MCW has just checked in a bunch of improvements, after a long drought. Being able to use cutree in analysis reuse should help speed and quality in those scenarios somewhat. Also a new parameter to make muxing after chunked encoding a lot easier. The other stuff is for Dolby Vision, including QP tuning for Y'CtCp.

BLKMGK

18th December 2018, 06:03

Okay, I successfully split up and encoded a video using ffmpeg piped to x265. I split the jobs at keyframes reported by ffmsindex, using the x265 --seek and --frames switches, and built the final video file with mkvmerge. Lastly I muxed in the original audio track and the new video with mkvmerge. The video plays well, and I cannot spot the seams in this single example I've managed to build. :devil:
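
Roughly, the bookkeeping for that workflow can be sketched like this. The cut points are made-up frame numbers standing in for the keyframes from ffmsindex, ranges_from_cuts and join_cmd are hypothetical helpers, and nothing is actually encoded or muxed: the script only prints the seek/frames pairs and the mkvmerge append command.

```shell
# Turn a total frame count plus a list of cut points into "seek frames" pairs.
# Usage: ranges_from_cuts TOTAL CUT0 CUT1 ...   (hypothetical helper)
ranges_from_cuts() {
    total=$1
    shift
    prev=$1
    shift
    # Each range runs from one cut point to the next; the last one ends at TOTAL.
    for cut in "$@" "$total"; do
        echo "$prev $((cut - prev))"
        prev=$cut
    done
}

# Build an mkvmerge command that appends chunk files in order ("+" means append).
# Usage: join_cmd OUTPUT FIRST [NEXT ...]   (hypothetical helper)
join_cmd() {
    out=$1
    shift
    cmd="mkvmerge -o $out $1"
    shift
    for f in "$@"; do
        cmd="$cmd + $f"
    done
    echo "$cmd"
}

ranges_from_cuts 900 0 240 455 700
join_cmd video.mkv chunk_0.hevc chunk_1.hevc chunk_2.hevc chunk_3.hevc
```

Each "seek frames" pair maps onto one x265 job with --seek and --frames, and the audio would be muxed in a separate mkvmerge step afterwards.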

To automate this is going to take me some time but it's my goal, best of all this seems to be using all cross platform tools so in theory my Windows and Linux machines can all play together. However...

Reading around, I see there's a new feature named chunk. Its advantage appears to be that it reads before and after the given start/stop frames, which helps the encoder's prediction and analysis - yes? It appears to be geared towards exactly what I'm attempting to do. However, when I attempt to use it I receive errors that it must be used on a "closed GOP" structure. I understand that this has to do with the way frames are written/sequenced in the video, but I'm not sure how to satisfy this option or how I might pull out good starting and stopping points as I did with ffmsindex. Google isn't helping me much, I'm afraid :( Little help for the newbie please?

I'd make this a separate post but this was one of the sparse few threads I've found that mention chunking and thus far I've found no examples using it <sigh> :o

:thanks:

Bonus question - if I'd like to trim off black bars around a video, as say HandBrake does, how best to do this with something cross-platform?

Blue_MiSfit

18th December 2018, 10:35

Set open-gop=0 (in ffmpeg speak, --open-gop=0 in x265 binary speak) to make all GOPs closed.

BLKMGK

19th December 2018, 05:26

I think x265 does it slightly differently but you set me on the right path after I had troubles with ffmpeg, thank you! I used --no-open-gop and it seemed to like it, much experimentation left to do. Does closing gop structures impact encoding efficiency? Seemed to run across some information that sounded like it might. Appreciate the help!

--open-gop, --no-open-gop
Enable open GOP, allow I-slices to be non-IDR. Default enabled

For black borders the following website had some good info using ffplay.

http://www.renevolution.com/ffmpeg/2013/05/23/understanding-ffmpeg-part-iii-cropping.html
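
A minimal sketch of the same idea with ffmpeg instead of ffplay, assuming the cropdetect filter's usual "crop=W:H:X:Y" log output. last_crop is a hypothetical helper, the sample log line uses made-up values, and the final encode command is only printed.

```shell
# Pull the last "crop=W:H:X:Y" value out of cropdetect log output on stdin.
# last_crop is a hypothetical helper.
last_crop() {
    grep -o 'crop=[0-9:]*' | tail -n 1
}

# Real use (not run here): ffmpeg writes the cropdetect lines to stderr.
#   ffmpeg -i input.mkv -vf cropdetect -f null - 2>&1 | last_crop

# Illustrative log line in the shape cropdetect prints; the values are made up.
sample='[Parsed_cropdetect_0 @ 0x0] x1:0 x2:1919 y1:140 y2:939 w:1920 h:800 x:0 y:140 pts:1001 t:0.041708 crop=1920:800:0:140'
crop=$(printf '%s\n' "$sample" | last_crop)

# Feed the detected value back in as a crop filter; file names are placeholders.
echo "ffmpeg -i input.mkv -vf $crop -f yuv4mpegpipe - | x265 --input - --y4m --output cropped.hevc"
```

Since ffmpeg runs on both Windows and Linux, this keeps the cropping step cross-platform.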

BLKMGK

22nd December 2018, 22:38

Interesting - so I could encode 30 second chunks but have each include a few seconds of the prior chunk in the rate control. Neat!

about chunked encoding: Would this be a good start for additional multi-threading option? Like 'virtually' splitting the source in chunks and encode them in parallel without having to physically split the source on systems where the cpu usage isn't that high?

it could be, for cases where you have a lot of unused RAM or cores. People have already done scripts or apps using the .dll for that.

Yep, absolutely. This is how you encode super fast using tons of machines e.g. in the cloud :)

Reading this exchange I built a script to do exactly this, but was pretty puzzled by some of the results I was seeing. Turns out "chunk" starts its encoding at the beginning of the file even if you've specified a starting frame deep into the source video. It will merrily encode and throw away the work until it hits the starting frame, save frames until it hits the provided end frame, and then quit. I had expected I could give it a start frame and it would read back to a keyframe to begin encoding and then spit out the frames I specified - it does NOT in my testing :eek: Judging from the conversation here it sounds like I wasn't the only one confused, I'm pretty bummed...

Edit: My previous test was cropping, and I decided to drop that to see if it impacted the behavior. I changed to a much earlier portion of the source (frame 2000) and encoded only 500 frames. When it hit the end frame at 2500 it kept running until I broke it: "Aborted input at frame 108145 output frame 501". So it seems that it doesn't stop encoding at the chunk end frame either; previously I must have used a point near the end of the video. I'm really confused as to how this function is used and hope I've simply screwed something up.