Auto Target Encoder - an AV1 GUI based on Machine Learning
Kurt.noise
17th August 2025, 14:21
https://github.com/Snickrr/Auto-Target-Encoder
A sophisticated, GUI-based encoding tool designed for automated batch processing of your videos that do not require comprehensive fine-tuning. It leverages machine learning to create high-quality, efficient AV1 video encodes. This application automates the entire workflow for large batches of files: it learns from past encodes to predict optimal quality settings, intelligently analyzes each video's complexity, and displays the progress of all parallel jobs in a real-time dashboard.
This tool moves beyond single-file, trial-and-error encoding by building persistent knowledge. A RandomForest machine learning model predicts the exact CQ/CRF value needed to hit a target quality score (VMAF, SSIMULACRA2, BUTTERAUGLI), while other models provide highly accurate ETA predictions by learning your hardware's real-world performance across hundreds of encodes.
https://github.com/Snickrr/Auto-Target-Encoder/raw/main/demo.gif
----
Neither tested nor created by me.
Blue_MiSfit
18th August 2025, 06:03
Fascinating! Thank you for sharing.
Z2697
18th August 2025, 14:48
This feels like madness to me.
Or maybe AV1 encoders are just so bad that they need AI assistance (which I don't think is the case).
The whole README feels AI-generated, judging by the excessive use of emojis alone... And it is too "well written" for a repo that contains just 1 source file and where 7 out of 9 commits only edit the README.
The whole project might be AI-generated as well.
EDIT: it is.
This project was created by someone with no prior coding experience, using AI assistance (Claude Opus 4.1 and Gemini 2.5 Pro) for advanced mathematics and coding implementation. The core ideas and extensive debugging/fine-tuning were done manually.
Now don't get me wrong, AI can be a useful tool to help coding, but I don't think this one is.
Leo 69
19th August 2025, 01:47
Originally Posted by Z2697: "Now don't get me wrong, AI can be a useful tool to help coding, but I don't think this one is."
Could you explain in detail why you think there's something wrong with this tool just because it was created with the help of AI? Which bugs or flaws have you already found after using the tool?
Z2697
19th August 2025, 02:01
Originally Posted by Leo 69: "Could you explain in detail why you think there's something wrong with this tool just because it was created with the help of AI? Which bugs or flaws have you already found after using the tool?"
What I mean is just that it's created entirely by AI, not created with the help of AI.
Leo 69
19th August 2025, 07:37
Originally Posted by Z2697: "What I mean is just that it's created entirely by AI, not created with the help of AI."
You're overestimating the capabilities of modern LLMs. They can't create anything on their own without a user's input, which I'd say has to be rather specific. So, to create such a tool the human does need to actually know what he's doing in order to make correct prompts, do the quality assessment of the LLM's output, wrap up the complete package and publish it on the forum. LLMs can't do that alone. There's obviously a lot of human work that was done to make this tool possible. You're wrong.
Z2697
19th August 2025, 16:19
Originally Posted by Leo 69: "You're overestimating the capabilities of modern LLMs. They can't create anything on their own without a user's input, which I'd say has to be rather specific. So, to create such a tool the human does need to actually know what he's doing in order to make correct prompts, do the quality assessment of the LLM's output, wrap up the complete package and publish it on the forum. LLMs can't do that alone. There's obviously a lot of human work that was done to make this tool possible. You're wrong."
The human is helping the AI, then.
But I agree the concept of this tool should work, at least in theory.
RanmaCanada
19th August 2025, 18:07
This looks interesting, but I don't know if it's practical since, as we know, automated metrics can be unreliable.
benwaggoner
19th August 2025, 23:38
Huh. We need a lot more details about what it is actually generating, based on what training data! Is it generating command lines? Trained on what? And with what input data? What source attributes go into the generation?
AI and ML certainly have applications for video encoding, but I don't see enough details to know what this is doing, or how much better it could do than a skilled human.
Leo 69
19th August 2025, 23:53
Here is a stage-by-stage description of what it does:
Stage 1: Initialization and Learning
When the application starts, it first loads all user settings from a config.ini file. It then initializes its Machine Learning (ML) models by training them on historical data stored in a local SQLite database. This allows the app to learn from previous encodes to make smarter predictions about encoding speed and quality settings for new videos.
Stage 2: File Queuing and Pre-Filtering
The user selects a folder of videos to process. The script scans this folder and populates a queue. Before any heavy processing begins, it performs a fast pre-filter, automatically skipping files that are too short, too small, or have a bitrate lower than user-defined thresholds.
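As a rough illustration of the Stage 2 pre-filter, here is a minimal Python sketch. It is not taken from the project's source: the names `VideoInfo`, `FilterThresholds`, `skip_reason` and `prefilter`, as well as the default threshold values, are assumptions made up for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class VideoInfo:
    path: str
    duration_s: float
    size_bytes: int
    bitrate_kbps: float


@dataclass
class FilterThresholds:
    # Hypothetical defaults; per the description, the real values are user-defined.
    min_duration_s: float = 30.0
    min_size_bytes: int = 50 * 1024 * 1024
    min_bitrate_kbps: float = 1000.0


def skip_reason(info: VideoInfo, t: FilterThresholds) -> Optional[str]:
    """Return why a file should be skipped, or None if it should be queued."""
    if info.duration_s < t.min_duration_s:
        return "too short"
    if info.size_bytes < t.min_size_bytes:
        return "too small"
    if info.bitrate_kbps < t.min_bitrate_kbps:
        return "bitrate too low"
    return None


def prefilter(
    files: List[VideoInfo], t: FilterThresholds
) -> Tuple[List[VideoInfo], List[Tuple[str, str]]]:
    """Split files into a work queue and a list of (path, reason) skips."""
    queue, skipped = [], []
    for info in files:
        reason = skip_reason(info, t)
        if reason is None:
            queue.append(info)
        else:
            skipped.append((info.path, reason))
    return queue, skipped
```

The point of running these cheap checks first is that a file rejected here never reaches the much slower analysis and sample-encoding stages.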
Stage 3: Video Analysis
For each valid file, the application performs a detailed analysis. It uses ffprobe to get technical details (resolution, frame rate, etc.) and performs a complexity analysis to understand the video's content (e.g., detecting scene changes). This data is compiled into a set of "features" that the ML models can understand.
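The ffprobe part of Stage 3 could look roughly like the sketch below. The ffprobe options used (`-v error`, `-print_format json`, `-show_format`, `-show_streams`) are standard, but the helper names and the particular set of features are assumptions for illustration, and the complexity analysis itself is not shown.

```python
import json
import subprocess
from typing import Any, Dict, List


def ffprobe_cmd(path: str) -> List[str]:
    """Build an ffprobe command that prints container and stream info as JSON."""
    return ["ffprobe", "-v", "error", "-print_format", "json",
            "-show_format", "-show_streams", path]


def parse_frame_rate(text: str) -> float:
    """Convert ffprobe's 'num/den' frame rate string to a float (0.0 if undefined)."""
    num, _, den = text.partition("/")
    den_val = float(den) if den else 1.0
    return float(num) / den_val if den_val else 0.0


def extract_features(probe: Dict[str, Any]) -> Dict[str, float]:
    """Pick a few numeric features from parsed ffprobe JSON (hypothetical feature set)."""
    video = next(s for s in probe["streams"] if s.get("codec_type") == "video")
    fmt = probe.get("format", {})
    return {
        "width": int(video["width"]),
        "height": int(video["height"]),
        "fps": parse_frame_rate(video.get("avg_frame_rate", "0/1")),
        "duration_s": float(fmt.get("duration", 0)),
        "bitrate_kbps": float(fmt.get("bit_rate", 0)) / 1000.0,
    }


def probe_file(path: str) -> Dict[str, float]:
    """Run ffprobe on a file and return its features; requires ffprobe on PATH."""
    out = subprocess.run(ffprobe_cmd(path), capture_output=True,
                         text=True, check=True).stdout
    return extract_features(json.loads(out))
```

Calling `probe_file("input.mkv")` would return a flat dictionary of numbers, which is the kind of input a tabular model such as a random forest expects.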
Stage 4: ML-Driven Quality Search
This is the core of the application. To find the perfect quality setting (CQ/CRF value) that meets the user's target (e.g., a VMAF score of 95):
It first creates a short, high-quality "master sample" by stitching together representative clips from the video.
It uses its trained Quality Model to predict the best CQ value needed to hit the target score.
Based on the model's confidence, it intelligently tests one or two CQ values by encoding only the small sample file, which is extremely fast.
If the prediction is wrong or the model is not confident, it falls back to an efficient search to find the optimal CQ value. All test results are cached in the database to avoid re-doing work.
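The Stage 4 logic (try the predicted CQ first, fall back to a search, cache every test) can be sketched as follows. The description does not say which search the tool actually uses, so a plain binary search stands in for it here, under the assumption that the quality score never increases as CQ increases. The function `find_cq`, its parameters and the CQ range are hypothetical.

```python
from typing import Callable, Dict, Optional


def find_cq(
    measure: Callable[[int], float],   # encodes the sample at a CQ, returns its score
    target: float,
    predicted: Optional[int] = None,   # the model's guess, if any
    cq_min: int = 15,
    cq_max: int = 45,
    tolerance: float = 0.5,
    cache: Optional[Dict[int, float]] = None,
) -> Optional[int]:
    """Return a CQ whose score meets the target, or None if none in range does."""
    cache = {} if cache is None else cache

    def score(cq: int) -> float:
        # Each CQ is encoded at most once; repeat lookups hit the cache.
        if cq not in cache:
            cache[cq] = measure(cq)
        return cache[cq]

    # Fast path: accept the prediction if it lands just above the target.
    if predicted is not None and cq_min <= predicted <= cq_max:
        if 0 <= score(predicted) - target <= tolerance:
            return predicted

    # Fallback: binary search for the highest CQ that still meets the target.
    lo, hi, best = cq_min, cq_max, None
    while lo <= hi:
        mid = (lo + hi) // 2
        if score(mid) >= target:
            best = mid
            lo = mid + 1
        else:
            hi = mid - 1
    return best
```

With a good prediction this costs a single sample encode; with a bad one it costs about log2 of the CQ range, which is where the savings over a blind search come from.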
Stage 5: Final Encoding
Once the optimal CQ value has been found, the script proceeds to encode the full-length original video using that setting. It monitors the encoding process in real-time to provide progress updates and detect if the process has stalled.
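How the stall detection in Stage 5 is implemented is not stated. One simple way to do it, shown purely as an assumption, is to flag a stall when the reported progress value has not changed for a configurable time. The class name `StallDetector` and the 120-second default are made up for this sketch.

```python
import time
from typing import Callable, Optional


class StallDetector:
    """Flags an encode as stalled when progress stops changing for too long."""

    def __init__(self, timeout_s: float = 120.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_progress: Optional[int] = None
        self.last_change = clock()

    def update(self, progress: int) -> bool:
        """Record the latest progress (e.g. frames encoded); return True if stalled."""
        now = self.clock()
        if progress != self.last_progress:
            self.last_progress = progress
            self.last_change = now
        return now - self.last_change > self.timeout_s
```

A monitoring loop would call `update()` each time it parses a progress line from the encoder and kill or retry the job when it returns `True`.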
Stage 6: Finalization, Logging, and Learning
After the encode is complete, the script verifies the new file. If it meets the criteria (e.g., sufficient size reduction), it saves the final file and can optionally delete the original. Critically, it logs the performance data (how long it took, the final file size, etc.) back into the SQLite database. This logging closes the feedback loop: each processed video adds training data, so the ML models should become more accurate over time.
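The feedback loop in Stage 6 comes down to appending one row per finished encode and reading the rows back as training data. Below is a minimal sketch using Python's built-in `sqlite3` module. The table name, the columns and the helper functions are assumptions for illustration, not the tool's actual schema.

```python
import sqlite3
from typing import List, Tuple

# Hypothetical schema: one row per completed encode.
SCHEMA = """
CREATE TABLE IF NOT EXISTS encodes (
    id             INTEGER PRIMARY KEY,
    path           TEXT    NOT NULL,
    cq             INTEGER NOT NULL,
    score          REAL    NOT NULL,
    encode_seconds REAL    NOT NULL,
    input_bytes    INTEGER NOT NULL,
    output_bytes   INTEGER NOT NULL
)
"""


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the history database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def log_encode(conn: sqlite3.Connection, path: str, cq: int, score: float,
               encode_seconds: float, input_bytes: int, output_bytes: int) -> None:
    """Append one finished encode; 'with conn' commits or rolls back as a unit."""
    with conn:
        conn.execute(
            "INSERT INTO encodes (path, cq, score, encode_seconds, input_bytes, output_bytes) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (path, cq, score, encode_seconds, input_bytes, output_bytes),
        )


def training_rows(conn: sqlite3.Connection) -> List[Tuple[int, float, float, float]]:
    """Return (cq, score, encode_seconds, size_ratio) rows for retraining the models."""
    return conn.execute(
        "SELECT cq, score, encode_seconds, CAST(output_bytes AS REAL) / input_bytes "
        "FROM encodes ORDER BY id"
    ).fetchall()
```

On the next start, rows like these would be what the models from Stage 1 are retrained on.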
GeoffreyA
20th August 2025, 07:22
So, if I understand correctly, it extends the approach of Av1an, reducing the CQ/CRF search space by inferring what would be the right value. The updating of the database and retraining of the model should, in theory, lead to improved accuracy in that selection.