View Full Version : AlphaVC E2E AI codec


birdie
4th August 2022, 22:42
Recently, learned video compression has drawn a lot of attention and shown a rapid development trend with promising results. However, previous works still suffer from some critical issues and have a performance gap with traditional compression standards in terms of the widely used PSNR metric. In this paper, we propose several techniques to effectively improve the performance. First, to address the problem of accumulative error, we introduce a conditional-I-frame as the first frame in the GoP, which stabilizes the reconstructed quality and saves the bit-rate. Second, to efficiently improve the accuracy of inter prediction without increasing the complexity of the decoder, we propose a pixel-to-feature motion prediction method at the encoder side that helps us to obtain high-quality motion information. Third, we propose a probability-based entropy skipping method, which not only brings performance gain, but also greatly reduces the runtime of entropy coding. With these powerful techniques, this paper proposes AlphaVC, a high-performance and efficient learned video compression scheme. To the best of our knowledge, AlphaVC is the first E2E AI codec that exceeds the latest compression standard VVC on all common test datasets for both PSNR (-28.2% BD-rate saving) and MSSSIM (-52.2% BD-rate saving), and has very fast encoding (0.001x VVC) and decoding (1.69x VVC) speeds.

Paper: https://arxiv.org/abs/2207.14678

soresu
5th August 2022, 08:48
Errr... 0.001x VVC is fast encoding?

I'm hoping this is just proof of concept code vs production code they are talking about bcuz that is daaayyuummm slow.

Zarxrax
5th August 2022, 22:03
Errr... 0.001x VVC is fast encoding?

I'm hoping this is just proof of concept code vs production code they are talking about bcuz that is daaayyuummm slow.

Looking into the paper, it says the encoding speed is 715ms per frame, which I believe works out to about 1.4 fps.
More concerning is decoding speed at 379ms per frame though, which is way below real-time. And this is on a V100 GPU, so way out of consumer range at the moment.
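
For anyone who wants to check the arithmetic, here is a quick sketch of the conversion from per-frame time to frames per second. The 715 ms and 379 ms figures are the ones quoted above; the function names and the 30 fps playback target are my own assumptions for illustration, not something taken from the paper.

```python
def fps_from_frame_time(ms_per_frame: float) -> float:
    """Convert a per-frame processing time in milliseconds to frames per second."""
    if ms_per_frame <= 0:
        raise ValueError("ms_per_frame must be positive")
    return 1000.0 / ms_per_frame


def realtime_factor(ms_per_frame: float, target_fps: float = 30.0) -> float:
    """Fraction of real-time speed reached; target_fps=30 is an assumed playback rate."""
    return fps_from_frame_time(ms_per_frame) / target_fps


# Per-frame times quoted in this thread.
encode_fps = fps_from_frame_time(715)
decode_fps = fps_from_frame_time(379)

print(f"encode: {encode_fps:.2f} fps")  # encode: 1.40 fps
print(f"decode: {decode_fps:.2f} fps")  # decode: 2.64 fps
print(f"decode vs. assumed 30 fps target: {realtime_factor(379):.1%}")
```

Under that assumed 30 fps target, a factor below 1.0 means the decoder cannot keep up with playback on the hardware tested.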

ksec
6th August 2022, 10:05
PSNR (-28.2% BD-rate saving) and MSSSIM (-52.2% BD-rate saving), and has very fast encoding (0.001x VVC) and decoding (1.69x VVC) speeds.

Compared to the next version from JCT-VC, which barely managed ~15% BD-rate saving with hundreds more tools, a 50% saving over VTM is quite impressive. It does seem to me that work on next-gen video (H.267?) is accelerating compared to HEVC and VVC at a similar point in their timelines.