HEVC Shrinker: Bash script for transcoding big collections


ABurns
2nd March 2025, 18:40
So I was recently faced with needing to transcode and shrink several terabytes of videos that were mostly encoded with fairly high bitrate h264 but also included a hodgepodge of esoteric and obsolete containers and codecs, like wmv, flv, QuickTime, DivX, mpeg-1/2, etc. Because this was primarily a file shrinking operation, I wanted to use AVIsynth to filter it a little for better compression. LRemoveDust is somewhat destructive to very fine detail, but the compression gains are substantial in many cases. At first I tried some of the automated GUIs like MeGUI and a couple of others, but they're simply not suited for thousands of oddball files like that.

Those apps were crash-prone: I would start one up at bedtime and wake up to find that it had crashed on the 2nd or 3rd file with nothing done. So I needed something that was robust with good error handling, lightweight enough not to bog down when queueing up thousands of files, and more intelligent than a simple encoder. So I got ChatGPT to help me write a bash script to do what I wanted. I had several requirements:

- Output had to be transparent, i.e., the same visual quality as input. I was after small size, but not at the expense of quality.
- It needed to be robust and able to load a lot of oddball formats without crashing.
- Output needed to be standardized in Matroska containers with AAC audio.
- Sometimes the new file was actually bigger, so it needed to be able to compare the new file to the old one and keep the smaller.
- Many of the files had cover art in the folder, so it needed to mux those images in as attachments.
- It needed to be able to efficiently keep track of which files had already been processed to avoid encoding the same file multiple times if the script gets restarted.
- Needed to be able to log errors so I could easily go back to problematic files later.
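
The "log the error and keep going" requirement can be sketched roughly as below. This is only an illustration, not the code from the repo: `process_all` and the per-file command passed to it are hypothetical names, and the log line format is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the repo): run one command per file,
# append failures to a log, and continue with the next file.
# Usage: process_all LOGFILE COMMAND FILE...
process_all() {
    local log="$1" cmd="$2" f failed=0
    shift 2
    for f in "$@"; do
        # "$cmd" stands in for the real per-file encode step.
        if ! "$cmd" "$f"; then
            # Assumed log format: timestamp, status, file name (tab separated).
            printf '%s\tFAILED\t%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$f" >> "$log"
            failed=$((failed + 1))
        fi
    done
    # Report how many files failed so the caller can summarize.
    printf '%d\n' "$failed"
}
```

Because a failure only appends a line and bumps a counter, one bad file never stops the overnight run; the log can be reviewed afterwards.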

So here is what the script does:

- Runs in a Git Bash for Windows terminal (although I'm sure it can be pretty easily modified to run on *nix and wine)
- Searches recursively for video files and gathers stream information
- Files that are already HEVC/AAC are not re-encoded, just remuxed; everything else gets an AVS script with LRemoveDust and is encoded with ffmpeg and qaac.
- Downscales 4k to a max of 1080p and halves the frame rate of 50/60 fps videos with a simple SelectEven() command. Frame rate doesn't really affect file size, just encoding time.
- Compares the new file and original file and keeps the smaller, UNLESS the original streams are incompatible with Matroska (like wmv), in which case the new output is always kept. Files are moved to ./.Trash for recycle bin-like functionality in case of errors.
- Searches the directory for cover art under a number of naming conventions and if found, muxes it in.
- Uses a SQLite database file to record successfully processed files, ensuring that files are not re-processed when the script is restarted
- Logs any errors to error.log and continues processing subsequent files.
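
For anyone curious how the "keep the smaller file" step might look, here is a minimal sketch. It is not the repo's code: `keep_smaller` and its `FORCE_NEW` flag are hypothetical, and it assumes the loser is moved into a `.Trash` folder next to the original.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the repo).
# Usage: keep_smaller ORIGINAL NEW [FORCE_NEW]
# FORCE_NEW=1 models the case where the original streams cannot live
# in Matroska (e.g. wmv), so the new file is kept regardless of size.
keep_smaller() {
    local orig="$1" new="$2" force_new="${3:-0}"
    local trash orig_size new_size
    [ -f "$orig" ] && [ -f "$new" ] || return 1
    trash="$(dirname "$orig")/.Trash"
    mkdir -p "$trash" || return 1
    # wc -c is used instead of stat because stat's flags differ by platform.
    orig_size=$(wc -c < "$orig" | tr -d '[:space:]')
    new_size=$(wc -c < "$new" | tr -d '[:space:]')
    if [ "$force_new" = "1" ] || [ "$new_size" -lt "$orig_size" ]; then
        mv -- "$orig" "$trash/" && echo "kept new"
    else
        mv -- "$new" "$trash/" && echo "kept original"
    fi
}
```

Nothing is deleted here; the file that loses the comparison only moves to `.Trash`, so a bad encode can still be undone by hand.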

Dependencies:
AviSynth+ (https://github.com/AviSynth/AviSynthPlus) with the LSMASHSource (http://avisynth.nl/index.php/LSMASHSource) plugin
LRemoveDust and its dependencies (included in the repo)
FFmpeg with AviSynth support compiled in (Media Autobuild Suite (https://github.com/m-ab-s/media-autobuild_suite))
QAAC (https://github.com/nu774/qaac) (and its dependency, iTunes)
SQLite3 (https://www.sqlite.org/download.html)
Git Bash for Windows (https://gitforwindows.org/) (to execute the script)

Github Repo:
https://github.com/heyburns/hevc-shrinker

I created this only for my specific use case, and it does what I need. I'm sharing just on the off chance that somebody else might find it useful, but I don't really have any interest in supporting or maintaining it beyond my own needs. If you do use it, be aware that it's unforgiving and deletes files permanently with no warning, so be sure you test it thoroughly and understand what it does. As I'm not much of a programmer and ChatGPT did a lot of the heavy lifting here, it's entirely possible - likely, even - that there are better ways to do certain things. If you see something that can be improved, I'm entirely open to suggestions, bug fixes, criticism, etc.

[Updated: pushed several updates, mkvtoolnix and directshowsource are no longer dependencies, move files to ./.Trash, convert to 10-bit before encoding.]

Z2697
3rd March 2025, 01:25
profile=main10 won't work this way.
What you need is to convert the video to yuv420p10le.

And you are forcing AviSynth to output yuv420p8, which is a weird combination to be honest. I know encoding an 8-bit source with 10-bit x265 is usually beneficial, but what if the source file is 10-bit? You are reducing it to 8-bit unnecessarily.


echo "ConvertBits(8)"
echo "ConverttoYV12()"
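
As an illustration of the point about the pixel format, a rough sketch of an ffmpeg command line that asks for 10-bit explicitly might look like this. `build_x265_cmd` is a hypothetical helper, not something from the repo, and the CRF default of 23 and the `-an` (audio handled separately) are assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the repo): assemble an ffmpeg/libx265
# command that requests 10-bit 4:2:0 output via -pix_fmt.
# Usage: build_x265_cmd INPUT OUTPUT [CRF]
# Prints one argument per line so the result is easy to inspect.
build_x265_cmd() {
    local src="$1" dst="$2" crf="${3:-23}"
    local args=(
        ffmpeg -hide_banner -i "$src"
        -c:v libx265
        -pix_fmt yuv420p10le   # what actually makes the encode 10-bit
        -crf "$crf"
        -an                    # assumption: audio is encoded separately
        "$dst"
    )
    printf '%s\n' "${args[@]}"
}
```

With a 10-bit pixel format going into libx265, the encoder should pick a 10-bit profile on its own, so there is no need to force one.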

ABurns
3rd March 2025, 01:37
That's fair.

[EDIT: My previous response was from faulty memory. This is the correction]

LRemoveDust requires YV12 input, and ConverttoYV12() requires 8-bit input. I'm not sure how else to approach that but I'm open to suggestions.

ABurns
3rd March 2025, 18:02
I just pushed an update that moves files to ./.Trash instead of deleting them. It's amazing how we get so hung up in details that we miss the simplest things. :rolleyes:

Z2697
4th March 2025, 01:06
You can at least convert to 10-bit afterwards; otherwise x265 is not actually encoding in 10-bit.

ABurns
5th March 2025, 19:54
I'll do some testing with that.

ABurns
7th March 2025, 05:48
Pushed an update that adds ConvertBits(10) after LRemoveDust.
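
The resulting filter order (down to 8-bit YV12 for the denoiser, then up to 10-bit for the encoder) can be sketched as a small AVS generator. This is an illustration, not the repo's code: `write_avs` is a hypothetical name, `LWLibavVideoSource` is assumed as the LSMASHSource loader, and the default denoise call `LRemoveDust_YV12(17, 2)` is an assumed example whose arguments may not match what the script really uses.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the repo): print an AviSynth script
# with the filter order discussed in this thread.
# Usage: write_avs SOURCE [DENOISE_CALL]
write_avs() {
    local src="$1"
    # Assumption: the denoise call and its arguments are placeholders.
    local denoise="${2:-LRemoveDust_YV12(17, 2)}"
    cat <<EOF
LWLibavVideoSource("$src")
ConvertBits(8)
ConvertToYV12()
$denoise
ConvertBits(10)
EOF
}
```

For example, `write_avs 'C:\videos\clip.wmv' > clip.avs` would write a five-line script ending in `ConvertBits(10)`. A 10-bit source still passes through 8-bit in this chain, which is the trade-off raised above.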

excellentswordfight
7th March 2025, 09:52
If you have native 10-bit, it's a bit of a shame to go down to 8-bit and back up again. Is there any specific reason for LRemoveDust? It sounds like you use it as a denoiser; surely there are plenty of options there? If it's AviSynth+ based, there should be viable high-bit-depth options.

And just on a personal note, I saw that SAO was configurable, which is good, but if you are denoising and encoding at that low a bitrate (CRF 23), you are actually in a use case where SAO is beneficial, so with that use case as the default, I wouldn't also have no-sao as the default. Just a guesstimate: I assume that a normal 1080p movie "shrinks" to 1-2 Mbps? That's definitely a high enough compression rate where SAO starts to make a lot of sense.

ABurns
10th March 2025, 23:22
If you have native 10-bit, it's a bit of a shame to go down to 8-bit and back up again. Is there any specific reason for LRemoveDust? It sounds like you use it as a denoiser; surely there are plenty of options there? If it's AviSynth+ based, there should be viable high-bit-depth options.

You're right, it's not ideal. But the goal here is 1) size reduction at a reasonable approximation of the same quality, and 2) universal application without user intervention, not quality optimization. If quality is the top concern, then you probably wouldn't even want to use this script at all and stick to dealing with files individually. This is for shrinking large file collections where you're not terribly vexed about getting the most out of every pixel, i.e., your porn collection. There are other denoisers, but not any that I've found which offer the kind of compression gains that it does. I did some pretty extensive testing a few years back and found it to be the best for that purpose. It was one of the more destructive denoisers I tested, but it was also the best for compression by a fairly wide margin and wickedly fast. The destructiveness was fairly minor relative to the compression gains. To me, it's a trade-off that I'm willing to make for certain types of content. I would never use it for content where quality was my top concern, but for bulk transcoding of unimportant files it's fine. It's been a while so there may be other options now, but last time I tested for it, that was the one to go with. If you have any suggestions for possible replacements I'm more than willing to test again.

And just on a personal note, I saw that SAO was configurable, which is good, but if you are denoising and encoding at that low a bitrate (CRF 23), you are actually in a use case where SAO is beneficial, so with that use case as the default, I wouldn't also have no-sao as the default. Just a guesstimate: I assume that a normal 1080p movie "shrinks" to 1-2 Mbps? That's definitely a high enough compression rate where SAO starts to make a lot of sense.

I'm not sure how you're working out that math. One of my goals is reasonable transparency, and CRF 23 actually gives a rather high average bitrate for x265. The x265 default is 28; for 1080p, the typical bitrate at 23 is usually around 4-5 Mbps. Of course, every file is different, but I typically see around 50% reductions in file size over h264. SAO tends to apply too much smoothing for my taste. However, if there's a compelling case to be made for its use, I'm open.
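
To put those bitrates in file-size terms, here is a quick back-of-the-envelope helper. `est_size_mb` is a hypothetical name; it counts the video stream only (no audio or container overhead) and uses decimal megabytes.

```bash
#!/usr/bin/env bash
# Hypothetical helper: estimate video stream size from average bitrate.
# Usage: est_size_mb KBPS SECONDS
# kilobits/s * seconds = kilobits; /8 = kilobytes; /1000 = megabytes.
est_size_mb() {
    local kbps="$1" secs="$2"
    echo $(( kbps * secs / 8 / 1000 ))
}

# A 2-hour (7200 s) movie at 4.5 Mbps:
est_size_mb 4500 7200   # prints 4050
```

So a two-hour 1080p encode at about 4.5 Mbps comes to roughly 4 GB of video. A 50% reduction would imply a source of around 9 Mbps, which the same formula puts at about 8.1 GB; that source bitrate is an inference from the numbers above, not a measured figure.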