ABurns
2nd March 2025, 18:40
So I recently needed to transcode and shrink several terabytes of video. Most of it was fairly high-bitrate H.264, but there was also a hodgepodge of esoteric and obsolete containers and codecs: WMV, FLV, QuickTime, DivX, MPEG-1/2, etc. Because this was primarily a file-shrinking operation, I wanted to use AviSynth to filter the video a little for better compression. LRemoveDust is somewhat destructive to very fine detail, but the compression gains are substantial in many cases. At first I tried some of the automated GUIs like MeGUI and a couple of others, but they're simply not suited to thousands of oddball files like that.
Those apps were crash-prone: I would start one up at bedtime and wake up to find that it had crashed on the 2nd or 3rd file and nothing was done. So I needed something that was robust with good error handling, lightweight enough not to bog down when queueing up thousands of files, and more intelligent than just a simple encoder. So I got ChatGPT to help me write a bash script to do what I wanted. I had several requirements:
- Output had to be transparent, i.e., the same visual quality as input. I was after small size, but not at the expense of quality.
- It needed to be robust and able to load a lot of oddball formats without crashing.
- Output needed to be standardized in Matroska containers with AAC audio.
- Sometimes the new file was actually bigger, so it needed to be able to compare the new file to the old one and keep the smaller.
- Many of the files had cover art in the folder, so it needed to mux those images in as attachments.
- It needed to be able to efficiently keep track of which files had already been processed to avoid encoding the same file multiple times if the script gets restarted.
- Needed to be able to log errors so I could easily go back to problematic files later.
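For anyone curious what "log errors and keep going" looks like in bash, here is a minimal sketch. It is not lifted from the repo: process_file is a hypothetical stand-in for the real encode step, and the ERROR_LOG variable and log line format are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: run one step per file, log failures, keep going.
# Nothing here is taken from the actual script.

ERROR_LOG="${ERROR_LOG:-error.log}"

# Placeholder for the real encode step. Here it simply fails for files
# that are missing or empty, so the loop has something to report.
process_file() {
    [ -s "$1" ]
}

# run_batch FILE...
# Processes every file; a failure is appended to $ERROR_LOG and the loop
# moves on to the next file instead of aborting the whole run.
run_batch() {
    local failed=0 f
    for f in "$@"; do
        if ! process_file "$f"; then
            printf '%s\tFAILED\t%s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$f" >> "$ERROR_LOG"
            failed=$((failed + 1))
        fi
    done
    printf '%d file(s) failed\n' "$failed"
}
```

The point of the design is simply that a bad file costs one log line rather than a whole night of encoding.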
So here is what the script does:
- Runs in a Git Bash for Windows terminal (although I'm sure it can be pretty easily modified to run on *nix and wine)
- Searches recursively for video files and gathers stream information.
- Skips re-encoding files that are already HEVC/AAC and just remuxes them; otherwise it creates an AVS script with LRemoveDust and encodes with ffmpeg and qaac.
- Downscales 4K to a max of 1080p and halves the frame rate of 50/60 fps videos with a simple SelectEven() command. Halving the frame rate doesn't reduce file size much; it mainly cuts encoding time.
- Compares the new file and original file and keeps the smaller, UNLESS the original streams are incompatible with Matroska (like wmv), in which case the new output is always kept. Files are moved to ./.Trash for recycle bin-like functionality in case of errors.
- Searches the directory for cover art under a number of naming conventions and if found, muxes it in.
- Uses a SQLite database file to record successfully processed files, ensuring that files are not re-processed when the script is restarted.
- Logs any errors to error.log and continues processing subsequent files.
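To make the keep-the-smaller step a bit more concrete, here is a rough sketch of how it could be written. The function names, the TRASH_DIR variable, and the FORCE_NEW flag are all made up for this example and are not how the repo necessarily does it; the flag models the "original streams are incompatible with Matroska" case, where the new file wins regardless of size.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the keep-the-smaller step. Names are illustrative,
# not taken from the actual script.

TRASH_DIR="${TRASH_DIR:-./.Trash}"

# Size in bytes; wc -c behaves the same in Git Bash and on *nix.
file_size() {
    wc -c < "$1" | tr -d '[:space:]'
}

# keep_smaller ORIGINAL NEW [FORCE_NEW]
# FORCE_NEW=1 keeps the new encode no matter how the sizes compare.
# The losing file is moved to $TRASH_DIR rather than deleted.
keep_smaller() {
    local original=$1 new=$2 force_new=${3:-0}
    mkdir -p "$TRASH_DIR" || return 1
    if [ "$force_new" = 1 ] || [ "$(file_size "$new")" -lt "$(file_size "$original")" ]; then
        mv -- "$original" "$TRASH_DIR/" && echo "kept new"
    else
        mv -- "$new" "$TRASH_DIR/" && echo "kept original"
    fi
}
```

Something like keep_smaller "clip.wmv" "clip.mkv" 1 would then always keep the .mkv, while leaving off the third argument falls back to the plain size comparison.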
Dependencies:
- AviSynth+ (https://github.com/AviSynth/AviSynthPlus) with the LSMASHSource (http://avisynth.nl/index.php/LSMASHSource) plugin
- LRemoveDust and its dependencies (included in the repo)
- FFmpeg with AviSynth support compiled in (Media Autobuild Suite (https://github.com/m-ab-s/media-autobuild_suite))
- qaac (https://github.com/nu774/qaac) (and its dependency, iTunes)
- SQLite3 (https://www.sqlite.org/download.html)
- Git Bash for Windows (https://gitforwindows.org/) (to execute the script)
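If you want to confirm the command-line tools are reachable before starting a long run, a small pre-flight check along these lines may help. This is only a sketch: check_deps is a hypothetical helper, and the executable names you pass it are assumptions about your install (qaac, for instance, may be present as qaac64). It can only see programs on PATH, so it says nothing about AviSynth itself or its plugins.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: report any required tool missing from PATH.

# check_deps TOOL...
# Prints one "missing: NAME" line to stderr per tool that cannot be found
# and returns non-zero if at least one is missing.
check_deps() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A call such as check_deps ffmpeg qaac sqlite3 || exit 1 near the top of a script would stop it before any files are touched.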
GitHub repo:
https://github.com/heyburns/hevc-shrinker
I created this only for my specific use case, and it does what I need. I'm sharing just on the off chance that somebody else might find it useful, but I don't really have any interest in supporting or maintaining it beyond my own needs. If you do use it, be aware that it's unforgiving and deletes files permanently with no warning, so be sure you test it thoroughly and understand what it does. As I'm not much of a programmer and ChatGPT did a lot of the heavy lifting here, it's entirely possible - likely, even - that there are better ways to do certain things. If you see something that can be improved, I'm entirely open to suggestions, bug fixes, criticism, etc.
[Update: pushed several changes. MKVToolNix and DirectShowSource are no longer dependencies, files are now moved to ./.Trash, and video is converted to 10-bit before encoding.]