Old 14th June 2024, 00:25   #1  |  Link
rdseven
Registered User
 
Join Date: Jun 2024
Posts: 1
HEVC for pixelated game footage

Hello! Firstly, I'd like to say thank you all for creating and maintaining this forum! I've been hobbying with video encoding for 4 years, and this forum has been such a wealth of knowledge, explanation, and expertise that I haven't needed to make an account until now. Secondly: I apologize for how long this is lol

Some background info: I am a somewhat high-level player of the platforming game Celeste. I'm also a video nerd, and typically record all of my footage losslessly through OBS (I believe it's the NVENC H.264 encoder with the lossless tune, not QP 0). However, the files are obviously stupidly large, so I end up lossily re-encoding to H.265. I chose libx265 because AV1 was just much slower and didn't provide any major benefits over H.265 (although I'm definitely still open to changing codecs).
Celeste is a 2D platformer which comes across as pixelated: the game itself renders at 320x180 and upscales from there. However, it also draws a 1920x1080 "HUD" (or overlay. It's just an overlay lol) on top and then up/downscales to whatever your monitor resolution is. I mention this because it means that not everything in the game can be downscaled back to its original resolution or recorded at the base 180p. I had this idea early on, tried it, and the outcome looked great... until I opened the pause menu, which renders at 1080p. The only resolution that maintains perfect visual fidelity would be 1080p, and I am currently working on a mod to record the game at 1080p before the up/downscaling to the monitor happens; but that's getting into tangent territory. I currently record in 1440p, which is my monitor resolution.
Celeste also features a mix of fast, dynamic motion and slow, precise motion. A good chunk of recordings are just dying, a screen wipe, and respawning (which makes for a lovely scene cut insertion). But in the times I'm not dying, sometimes it's faster than the game camera can keep up with, and other times the screen doesn't move at all and very little motion happens. Sometimes there's a lot going on in the background, and other times there isn't. I'll attach some videos for some overall context, but the point is that Celeste is rather dynamic with its content and motion.

This exposition dump leads me to a couple interesting questions:
1) Would it be worth it to determine when the overlay is affecting the game and encode those segments differently? My reason for this is that without the overlay, you're essentially just looking at an upscaled 180p image and a lot of the settings for "finer" (i.e. higher resolution) details don't need to be kept.
2) I assume HEVC already does this to some degree, but would it be a good idea to determine the overall entropy of a scene and change the settings around that? I feel like this is overkill, and honestly probably works against x265, but I figured I'd ask anyways.
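For question 1, here is a minimal sketch of how overlay-free frames could be detected. It assumes the capture is lossless and that the game layer reaches the recording as an exact integer nearest-neighbour upscale (8x from 180p to 1440p, or 6x in a 1080p capture); the 1080p-then-rescale path described above may not satisfy that. The function names and the nested-list frame layout are hypothetical, chosen only to keep the example dependency-free.

```python
def is_pure_upscale(frame, factor):
    """Return True if every factor x factor block of the frame holds one value.

    `frame` is a list of rows, each row a list of pixel values (ints or RGB
    tuples). A frame that is only a nearest-neighbour upscale of the low
    resolution game image passes; a frame with a full-resolution overlay
    drawn on top generally does not.
    """
    height = len(frame)
    width = len(frame[0]) if height else 0
    if height == 0 or height % factor or width % factor:
        return False
    for y in range(0, height, factor):
        for x in range(0, width, factor):
            reference = frame[y][x]
            for dy in range(factor):
                row = frame[y + dy]
                if any(row[x + dx] != reference for dx in range(factor)):
                    return False
    return True


def upscale(small, factor):
    """Nearest-neighbour upscale, used here only to build a demo frame."""
    return [[value for value in row for _ in range(factor)]
            for row in small for _ in range(factor)]


game = upscale([[1, 2], [3, 4]], factor=8)   # stand-in for a 180p game frame
with_hud = [row[:] for row in game]
with_hud[5][5] = 99                          # one full-resolution overlay pixel

print(is_pure_upscale(game, 8))      # True
print(is_pure_upscale(with_hud, 8))  # False
```

Runs of frames that pass the check would be candidates for different settings; on real footage an array library would be far faster than nested lists.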

Alright, on to the actual encoding stuff! My goal for encoding is to have a personal archive of everything I do in the game. As a relatively high-level player, I record almost everything I do, and it is nice to just have an archive of it all. Files should be relatively low bitrate, encoding should be somewhat fast (hopefully faster than 1/3 the fps of the input), and the result should be essentially indistinguishable from the original. I know that last one is a terrible measurement, but since this is a pixelated game, sharpness and being able to have each game "pixel" not blend into another is pretty important to me. I definitely hold the quality standard way higher than it really needs to be, and I know that... but I'm still not changing it lol.

I use a program called av1an to manage chunking and encoding multiple chunks at a time. It is intended for AV1, but can be used with x264 and x265. I'm pretty sure it runs only the first pass of a low-resolution rav1e 2-pass encode to find the best positions for chunk boundaries, and then passes those chunks to the encoder. My current encoder parameters are:
--preset medium --crf 14 --tune animation --input-csp 3 --input-depth 10 --output-depth 10 --range full --colorprim bt709 --colormatrix bt709 --level-idc 5.1 --high-tier --no-scenecut --keyint -1 --hme --hme-range 35,57,57 --hme-search umh,umh,star --rd 4 --bframes 8 --subme 7 --ref 4 --tskip --tu-inter 4 --tu-intra 4 --limit-tu 4 --tu-intra-depth 4 --tu-inter-depth 4 --no-sao --aq-mode 4 --rskip 2 --rskip-edge-threshold 3 --no-strong-intra-smoothing --fast-intra --qcomp 0.7 --qblur 0 --cplxblur 0
I'll do my best to explain what I know about each one, and please let me know if something seems amiss or if there's a parameter that might work that I missed.
--preset medium According to the documentation, I change a lot of the settings the preset affects, so there isn't too much of a change if I switch this. However, I still notice a difference in size and visual fidelity between fast and medium (but not really encoding speed). I'm not sure if something internally changes based on the preset, or if the few options that do change make that difference, but it's definitely noticeable. Medium works best for me though.
--crf 14 seems to be the sweet spot. 12 is a bit too large, 16 is a bit too compressed, 14 worked nicely. Of course, I'd love to have it look "better", but the difference in file size for 12 was just not worth it.
--tune animation According to the forum, this does some stuff to make sharpness better. This is a pixelated game, so sharpness is obviously very important.
--input-csp 3 --input-depth 10 --output-depth 10 --range full --colorprim bt709 --colormatrix bt709 --level-idc 5.1 --high-tier Format stuff. I record at 4:4:4, full range, and I'm actually not 100% sure why I said the input depth is 10 bit. It isn't, it's 8 bit, I'll have to change that. I figure it's better to specify this stuff. I ran into issues with specifying --profile main10 because it doesn't like 4:4:4 or something.
--no-scenecut --keyint -1 av1an does this for me. No need to put extra stress on the encoder.
--hme --hme-range 35,57,57 --hme-search umh,umh,star This just works better. I think I enabled this because I thought the multiple motion search would make it more efficient and improve sharpness, but it's somehow better than that in every way: it's faster, it looks better, and the file is smaller. I don't know how, but I'll take it. x265 doesn't seem to acknowledge the multiple ranges in its log output though, which may be a bug, but it works for me. Log message: (x265 [INFO]: HME L0,1,2 / range / subpel / merge : umh, umh, star / 57 / 7 / 3)
--rd 4 It just worked better than 6. 6 is substantially slower and from my tests didn't provide a whole ton of benefit here. In my test, it's about 40% the speed of 4 without providing much benefit (~2% smaller file size, doesn't really look much better than 4). I sort of doubt that it's 40% the speed and it's likely just the way the video is or how it got chunked, but even then I doubt it's worth it.
--bframes 8 This is (subjectively) the best setting. I would crank this up to 16 if I didn't have to break conformance, and actually might. In a test encode, 90% of my b-frames are in groups of 8. I don't know if non-conformance would break HW decoding playback at this point, so if y'all have any opinions on this let me know.
--subme 7 Is the most efficient without adding too much extra encoding time (<1% over --subme 5). I'm not sure why, considering other encodes slow down significantly from this, but I'm guessing the content doesn't utilize it a whole bunch.
--ref 4 I did a bunch of testing on this after some very weird initial results where 4 was faster and smaller than both 5 and 6. 4 seems to be the perfect spot here. The results from 5 and 6 are slightly smaller (<2%), but have no visual difference to my eye and are slower by a significant margin (~5% for 6 and ~3% for 5). I did not test 3 or lower, but if recommended I will do that.
--tskip apparently helps with sharpness (I think I read that somewhere on the forum). Testing with --tskip-fast didn't help at all here, it looked worse and was marginally faster.
--tu-inter 4 --tu-intra 4 --limit-tu 4 --tu-intra-depth 4 --tu-inter-depth 4 I might need some help with. I believe I read here on the forum that these help with anime/sharpness, but I'm not entirely sure what they do. I probably need a better understanding of exactly how x265 works to fully understand what they're doing.
--no-sao If I recall correctly, SAO likes to blur things... Which is absolutely not what you want when you want to preserve sharp edges.
--aq-mode 4 The documentation states "auto variance with edge information". I think this refers to information on the edges of each block, which then determines how quantized/compressed the block should be. I think this would be best for the content I'm working with? I also attempted --auto-aq, but it was slower and I don't think it had much benefit. Can try again if anyone recommends it.
--rskip 2 --rskip-edge-threshold 3 Once again, I read this here on the forum. The documentation says this marginally reduces quality while increasing performance. As with aq-mode, I think edge density is what works best here? But I have no clue why I chose 3 for the edge threshold. The default is 5, so maybe I just went lower for no good reason lol.
--no-strong-intra-smoothing removes the bi-linear interpolation for block corners. The less smoothing and removal of sharp edges, the better.
--fast-intra Apparently this reduces the number of checks the encoder has to do by about a third. I'm not sure what an "angular mode" is, and I'm not sure whether going through all 33 of them is worth it or provides much compression benefit.
--qcomp 0.7 I believe this determines how variable the QP is for P and B frames. 0.7 was smaller and didn't look that different from 0.8, but I may end up changing this for better visuals.
--qblur 0 --cplxblur 0 I think both of these blur blocks, but in different ways? Which... Less blur=better! (I think, at least for this scenario)
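With this many flags, it can help to keep them grouped by purpose and generate the parameter string from that. The sketch below uses only flags from the list above; the group names and the `build_params` helper are illustrative, `--input-depth` is set to 8 as noted in the format section, and the shorter `--tu-inter`/`--tu-intra` spellings are left out on the assumption that they duplicate the `-depth` flags.

```python
import shlex

# Flags copied from the post; the grouping and group names are illustrative.
PARAM_GROUPS = {
    "quality": ["--preset", "medium", "--crf", "14", "--tune", "animation"],
    "format": ["--input-csp", "3", "--input-depth", "8",  # source is 8 bit
               "--output-depth", "10", "--range", "full",
               "--colorprim", "bt709", "--colormatrix", "bt709",
               "--level-idc", "5.1", "--high-tier"],
    "gop": ["--no-scenecut", "--keyint", "-1", "--bframes", "8", "--ref", "4"],
    "motion": ["--hme", "--hme-range", "35,57,57",
               "--hme-search", "umh,umh,star", "--subme", "7"],
    "analysis": ["--rd", "4", "--tskip", "--limit-tu", "4",
                 "--tu-intra-depth", "4", "--tu-inter-depth", "4",
                 "--rskip", "2", "--rskip-edge-threshold", "3", "--fast-intra"],
    "filters": ["--no-sao", "--no-strong-intra-smoothing"],
    "rate_control": ["--aq-mode", "4", "--qcomp", "0.7",
                     "--qblur", "0", "--cplxblur", "0"],
}


def build_params(groups):
    """Flatten the groups, in order, into one shell-safe parameter string."""
    flat = [token for tokens in groups.values() for token in tokens]
    return shlex.join(flat)


print(build_params(PARAM_GROUPS))
```

Testing one change at a time then becomes a matter of editing a single group, and the printed string is what would be handed to av1an as the encoder parameters.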

A couple of options I thought might be interesting for advice on:
--rect and --amp. I did some very early tests on these and they just seemed to be slower and not super useful. Any thoughts on these?
--b-intra. I'm not sure how this works or how it would help. But with how many b-frames I have, it might be useful?
--bframes 16 I'm almost positive increasing b-frames is one of the best things I can do. But, if I remember, b-frames higher than 8 would need --allow-non-conformance and I don't know how detrimental that is or if playback is flexible enough to allow this. I have a decent enough CPU to deal with software decoding but GPU decoding is always a nice option to have.

I feel like this sort of content can be super compressed/efficient due to how flat most of it is. There are very few gradients, motion is relatively predictable, and the difference between most frames is usually as simple as a transposition. Currently these settings work extremely well, with 1440p encodes averaging around 3 Mbps to 7 Mbps, and peaking around 12 Mbps for high complexity/fast movement. Although it is barely noticeable to anyone watching at full speed, there is a little bit of edge blurring on the few gradients, which may be inevitable. I've linked to a zip file with content and notes and stuff for all of this if anybody wants to take a look. It'll only be up for 3 days to help the site owner not permanently store files, so get it while it's hot!
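For an archive, those bitrates translate into storage roughly as follows. This is plain arithmetic for the video stream only, using decimal gigabytes; the helper name is made up for the example.

```python
def gigabytes_per_hour(mbps):
    """Convert an average bitrate in megabits per second to GB per hour."""
    bits = mbps * 1_000_000 * 3600   # bits in one hour of footage
    return bits / 8 / 1_000_000_000  # bits -> bytes -> decimal gigabytes


for rate in (3, 7, 12):
    print(f"{rate} Mbps -> {gigabytes_per_hour(rate):.2f} GB/hour")
# 3 Mbps -> 1.35 GB/hour
# 7 Mbps -> 3.15 GB/hour
# 12 Mbps -> 5.40 GB/hour
```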
Anyways, I appreciate y'all and thank you for reading through this massive text dump! I look forward to any advice or input you have, and I appreciate this forum highly. The massive history and knowledge in these posts has helped me whenever I've gotten stuck (which is a lot lol)
Old 18th June 2024, 03:26   #2  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,878
I'm sure you could do a lot of clever things with --zones to change parameters based on the regions.
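As a rough sketch of what that could look like: x265's --zones takes zones of the form start,end,option joined by "/", where the option is either q=<integer> (force QP) or b=<float> (bitrate multiplier). The helper below only formats that string. The segment boundaries are hypothetical, and frame numbers are assumed to be relative to the stream the encoder actually sees, which matters if something like av1an splits the input into chunks first.

```python
def build_zones(segments):
    """Format (start_frame, end_frame, bitrate_multiplier) tuples for --zones."""
    parts = []
    for start, end, multiplier in segments:
        if end < start:
            raise ValueError(f"zone ends before it starts: {start},{end}")
        # b=<float> scales the bitrate within the zone
        parts.append(f"{start},{end},b={multiplier:g}")
    return "/".join(parts)


# Hypothetical segments: frames 0-599 have the pause menu up, 600-1799 do not.
segments = [(0, 599, 1.0), (600, 1799, 0.8)]
print("--zones", build_zones(segments))
# --zones 0,599,b=1/600,1799,b=0.8
```

Note that --zones on its own only adjusts QP or bitrate per region; switching other encoder settings per region would need a different mechanism.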

It's kinda hard to speculate without at least some frame grabs of the two kinds of footage to check out.

There are all kinds of fun optimizations that can be done with game footage. You are tuning a lot of knobs at once that can have some complex interactions, some of which are pretty unlikely to be helpful, and some of which will limit how many hardware decoders are compatible with the streams without actually making anything better. Best to get something that gives you quality and bitrates you like, and then iterate from there.

What frame rate is your source? The below assumes 60 fps. It sounds like your source is full range 8-bit RGB. I presume you'll be converting to limited range YUV HEVC for compatibility (and almost double the speed), so you'll want to convert to 10-bit to preserve all the precision of the 8-bit full range. 10-bit also encodes a bit smaller, and has way less risk of having banding issues.
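The precision argument can be checked with a little arithmetic. The sketch below is a luma-only simplification (a real RGB to YCbCr conversion also involves the colour matrix and chroma): it maps every full-range 8-bit value into limited range at 8 and 10 bits and counts how many distinct codes survive. The function name is made up for the example.

```python
def to_limited(value_8bit_full, bit_depth):
    """Map a full-range 8-bit luma value to limited range at bit_depth."""
    scale = 1 << (bit_depth - 8)          # 1 for 8-bit, 4 for 10-bit
    low, span = 16 * scale, 219 * scale   # luma occupies 16..235 at 8-bit
    return low + round(value_8bit_full * span / 255)


for depth in (8, 10):
    codes = {to_limited(v, depth) for v in range(256)}
    print(depth, len(codes))
# 8 220   -> some of the 256 input values collide in 8-bit limited range
# 10 256  -> every input value stays distinct in 10-bit limited range
```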

To set an initial high quality test that you can tune for speed from, and incorporating some of your discoveries so far, start with (presuming generic x265 3.6)

--preset veryslow --profile Main10 --level-idc 4.1 --crf 14 --tskip --selective-sao 2 --bframes 16 --aq-mode 4 --rskip 2 --rskip-edge-threshold 3

Veryslow includes a good set of quality-over-speed defaults that you speculate about above, tuned to work well together.

It won't be particularly fast, but if it gives good quality and file size, you can start tuning to speed it up (first up, try --preset slower and adding --tskip-fast and see if there are acceptable quality/size regressions).

And if quality still isn't quite where you want, use --preset placebo --cu-lossless. That'll be quite a lot slower. Normally --cu-lossless doesn't do much beyond wasting time and watts, but it might actually be helpful for this particular kind of content.
You can also try with/without --tune animation. I invented that myself years ago as a suggested starting point for MCW to tune from, and it wound up being incorporated as is. I'm not that good, so I am not confident in how well tested it is.
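One way to organize those comparisons is to generate the command lines from a shared base, so that each test differs only in the part being evaluated. This is a sketch, not a recommended workflow: the file names and variant labels are hypothetical, and it assumes the source has already been converted to a format that Main10 accepts. The commands are only printed, not run.

```python
# Flags shared by every test, taken from the suggested starting point above.
BASE = ["--profile", "Main10", "--level-idc", "4.1", "--crf", "14", "--tskip",
        "--selective-sao", "2", "--bframes", "16", "--aq-mode", "4",
        "--rskip", "2", "--rskip-edge-threshold", "3"]

# Each variant changes only the preset and the extras under test.
VARIANTS = {
    "baseline": ["--preset", "veryslow"],
    "faster": ["--preset", "slower", "--tskip-fast"],
    "max_quality": ["--preset", "placebo", "--cu-lossless"],
    "baseline_animation": ["--preset", "veryslow", "--tune", "animation"],
}


def build_commands(source, variants=VARIANTS):
    """Return one x265 argument list per variant, keyed by variant name."""
    commands = {}
    for name, extra in variants.items():
        commands[name] = ["x265", "--input", source, *extra, *BASE,
                          "--output", f"{name}.hevc"]
    return commands


for name, cmd in build_commands("sample.y4m").items():
    print(name, " ".join(cmd))
```

Comparing file size, encode time, and a visual check across the outputs shows which step down in preset is still acceptable.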
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book