View Full Version : FFV1 Vulkan compute shader decoder in FFmpeg - has anyone got it working?


Z2697
24th February 2026, 06:48
I'm not sure if this is the most correct place to post, but I first noticed the problem with it some time ago, when opening a FFV1 video with MPV.
For me, it brings down my video driver, even now, after tons of related commits.
Sometimes the driver dies, sometimes it doesn't, but it always makes my system unresponsive for some length of time (I dare not try long clips).
I guess the driver just crashes itself due to some timeout.
Driver is restarted no problem, thankfully.
It's enabled by default in MPV thanks to the same author.

Error from MPV:

[ffmpeg] Unable to submit command buffer: VK_ERROR_DEVICE_LOST
[ffmpeg] vk: Unable to submit command buffer: VK_ERROR_DEVICE_LOST
Could not copy back hardware decoded frame.
Error while decoding frame (hardware decoding)!


The prores and prores_raw compute shader decoders work better. Don't know why.
And I refuse to call them hwaccel or hwdec.

GeoffreyA
24th February 2026, 10:13
I tried an FFV1 video with MPV. The video played all right, but I don't think it was running the compute shader. How does one enable it? I don't use this player, so am not familiar with its workings.

Z2697
24th February 2026, 12:12
hwdec is off by default, there are mainly 3 ways to enable it:
1. write hwdec=auto to the mpv.conf file
2. use --hwdec=auto command line option
3. use ctrl+h hotkey during playback to toggle on/off
(https://mpv.io/manual/master/#options-hwdec
https://mpv.io/manual/master/#options-hwdec-codecs
by default hwdec is off, but ffv1 and other compute shader decoders are included in the hwdec codecs list)

To confirm the "hwdec" is being used, you can check the console output (in the terminal or via the [`] key), or the stats by pressing the [i] key.
If the vulkan decoder is in use, there should be something like one of the following:

Using hardware decoding (vulkan).
Using hardware decoding (vulkan-copy).
HW: vulkan
HW: vulkan-copy


Yes, these things use the same "naming convention" as the actual hardwired decoder.
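
For anyone who wants to check this from a saved console log rather than by eye, here is a minimal sketch. It assumes the log contains the "Using hardware decoding (...)" line exactly as shown above; the function names are made up for illustration. Because the compute shader decoders share the name with the hardwired ones, it can only tell you that the vulkan path is active, not which kind of decoder is behind it.

```python
import re
from typing import Optional

# Matches the console line shown above, e.g. "Using hardware decoding (vulkan-copy)."
_HWDEC_LINE = re.compile(r"Using hardware decoding \(([^)]+)\)")


def detect_hwdec(log_text: str) -> Optional[str]:
    """Return the hwdec name from mpv console output, or None if no such line exists."""
    match = _HWDEC_LINE.search(log_text)
    return match.group(1) if match else None


def is_vulkan_hwdec(log_text: str) -> bool:
    """True for both 'vulkan' and 'vulkan-copy'."""
    name = detect_hwdec(log_text)
    return name is not None and name.startswith("vulkan")
```

Feeding it the captured terminal output of a playback run is enough; a log with no such line simply yields None.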

GeoffreyA
24th February 2026, 13:28
Thanks for the helpful description. Unfortunately, hardware acceleration doesn't work for FFV1, but does for other formats by way of D3D11. Must be disabled, or not implemented, on Intel Arc.

Z2697
24th February 2026, 16:10
The prebuilt binaries from shinchiro don't have the vulkan compute shader decoders enabled (other vulkan features are intact). I had a quick look around and didn't find code that explicitly does that, so maybe it's a coincidental thing.

I'm now using MPV from m-ab-s, those decoders are compiled fine.
But I was using shinchiro's releases until 2025-10, and my "first encounter" with the problem was in 2025-04, I have some local file's timestamp as evidence... usually I don't trust my memory, but there's proof.

So something's changed somewhere during that time that makes the decoders fail to compile by shinchiro's procedure...
I guess it's the recent "convert shaders to compile time SPIRV generation" thing.

Quite a bit of rambling, yeah, what can I say.

Update: actually, some of the vulkan filters that depend on spirv_compiler according to configure script are also missing, which confirms my theory, mostly.

GeoffreyA
24th February 2026, 19:01
Today, I used the zhongfly build.

Yes, there were a lot of Vulkan/SPIRV changes recently in FFmpeg. So it could have broken other things elsewhere.

Z2697
24th February 2026, 19:08
I think they are more or less the same? The main difference is the release schedule.

pirlouy
24th February 2026, 19:18
I remember having desktop freezes with FFV1 videos and Geforce 1060.
With Radeon 9060 XT, I don't get a freeze, but it's not working: the process is there without anything displayed (until I kill it, maybe it would appear after some time).

I don't know what this FFV1 format is for, I have it only for a test file, so I don't really care, I just use a mpv.conf file in this directory with
hwdec-codecs=h264,vc1,vp8,vp9,av1,prores

GeoffreyA
24th February 2026, 20:40
FFV1 is a lossless codec, useful when needing to preserve quality perfectly. Also quite fast.

huhn
24th February 2026, 21:29
lossless h264 is, well, lossless and hardware accelerated, just saying.

GeoffreyA
25th February 2026, 07:16
If I remember correctly, FFV1 compresses slightly better than lossless H.264 and faster. It seems to be accepted as a standard too, different institutions using it to archive material.

huhn
25th February 2026, 08:33
i did a small test.

321 mb lossless H264
-c:v ffv1 -level 3 -coder 1 -context 1 -g 1 -slices 24 -slicecrc 1 -c:a copy

result 1670 mb edit: -slices 4 and no or default -slicecrc is down to 1600 mb at 40 fps /edit
about 60 fps encode speed on my low end HTPC
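
To repeat this test with different slice counts, the command line above can be generated from a small helper. This is only a sketch: the flags are the ones quoted above, while the function name, the input/output file names and the use of Python are assumptions for illustration. It does not check whether a given slice count is accepted by the encoder.

```python
from typing import List


def ffv1_encode_args(src: str, dst: str, slices: int = 24, coder: int = 1,
                     context: int = 1, gop: int = 1, slicecrc: int = 1) -> List[str]:
    """Build an ffmpeg argument list using the FFV1 options quoted above."""
    return [
        "ffmpeg", "-i", src,
        "-c:v", "ffv1", "-level", "3",
        "-coder", str(coder),
        "-context", str(context),
        "-g", str(gop),
        "-slices", str(slices),
        "-slicecrc", str(slicecrc),
        "-c:a", "copy",
        dst,
    ]


# "input.mkv" and "output.mkv" are hypothetical file names.
args = ffv1_encode_args("input.mkv", "output.mkv", slices=4)
```

The list can be passed to subprocess.run(args, check=True) on a system where ffmpeg is on the PATH.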

the file was recoded in realtime on the same CPU at 144 FPS (the ultrafast file size was bigger; the file encode was placebo) while other stuff was using it.

there are a lot of repeated frames in there and P frames may help a lot, but that's x5 while the repeated frames are only x3.

and the biggest thing: it can not be decoded in realtime on this system, not even close. i guess that's the reason for the thread, to use the GPU to help, while the h264 stream can easily be decoded on the CPU.

i'm not convinced. it has crc and stuff like that, which is nice, but you can also just save the file on zfs or a similar file system, which not only gives you checks but also limited correction. and i'm aware that this is only one test.

H264 also has hardware decoder and encoder available...

GeoffreyA
25th February 2026, 09:14
I'm surprised there is such a big discrepancy in the file size. What content was the video?

It does use a lot more CPU, yes, and is without hardware decoding.

huhn
25th February 2026, 09:40
quite static, very predictable motion, gradients, very low detail / bulky lines, but real high resolution, true 1080 to UHD: https://drive.google.com/file/d/1arhPLG8QFLKw4qgRCI9034F-LZrmw1mD/view?usp=sharing

because i had it lying around.

i used the test video i created to test nvidia's terrible frame interpolation; it is 24 to 48 in 144. the encodes are not intra only (i see no point in ever doing that), and with the repeated frames that should be the major reason.

Z2697
25th February 2026, 18:41
Since "semi-official" builds don't have the decoder (for now (https://forum.doom9.org/showthread.php?p=2028640#post2028640)), here's a light build of MPV that I just made with m-ab-s.
https://pixeldrain.com/u/TU9idmyx
https://workupload.com/file/RZPxKMcfYEu

pirlouy
25th February 2026, 20:18
I don't know if we were supposed to test something, and forgive me because I'm just a simple user, but I tested your build.
I tried gpu-api=vulkan and indeed, with your build, I see "HW" decoding, but the GPU is at 100% and the video is not really working (stuck for several seconds).
The video I've been using: https://drive.google.com/file/d/1RaA9h1Lfpa-6i_V8rs9qYBxT1W5DYu-n/view?usp=sharing

Btw, I don't understand why people use Vulkan on Windows. There's a delay to start video, and a delay each time you go Fullscreen. Why would you choose that ?

huhn
25th February 2026, 20:49
the GPU is at 100% because it is not fixed function hardware decoding but it is using the shader to do the work which your GPU couldn't handle.

i can software decode the video you are using and i have the same GPU which should tell you quite a bit...

Z2697
25th February 2026, 21:14
prores and prores_raw compute shader decoder work better (probably still have some minor issues), so the problem isn't "ANY shader is bad".
I know that the complexity of the decoder will make a difference, but judging from the size of their GLSL sources, ffv1 is less than 2x as complex as prores.
Combined with the fact that I'm using the 2nd best consumer GPU currently on the planet, and I can't even decode a single second of 320x240 video, it's ridiculous.
So it made me wonder if I am the problem (for not using it correctly).

Yes, no one cares (probably), software decoding works very well, but I'm just curious.

FFV1 is a "semi-intra-only" lossless codec with some sort of context (not sure exactly) that spans across frames, but you can make it "truly intra-only" by using <-g 1>; the compression ratio doesn't decrease very much.

H264 Lossless on the other hand, needs inter frame motion prediction to be able to beat FFV1 in compression ratio, in general.
Roughly a couple of P frames (x264 doesn't use B frames in lossless mode) are required, more P frames / longer GOP affects compression ratio less in lossless mode than in lossy mode. And YMMV depending on the type of video, how predictable it is.
You can optionally make decoding a lot faster than FFV1 by using CAVLC, with some cost in compression ratio of course. (FFV1 decoding is faster than H264 with CABAC)
The entropy coding takes a major role in the speed when the data rate is huge.

So FFV1 works better when you need intra-only video, but a super-short-GOP H264 is pretty close.
Other advantages of FFV1 over x264 lossless include encoding speed and pixel format support. (Specifically x264, for the speed comparison, because a HW encoder can be faster)
To me, there's no other technical reason to use FFV1.

huhn
25th February 2026, 21:54
give me your mpv config and i will test with an AMD card on a system where the CPU can easily decode 1080p of that codec.
cause i get VK_ERROR_UNKNOWN

edit: got it working, cause mpv being mpv.

this decoding is so slow this has to be a proof of concept there is no way to use it.
it's like 0.1-0.5 FPS
on software i get on a single thread realtime!

Z2697
25th February 2026, 22:16
Oh, this thing is very much more than just a proof of concept.
If the recent commits hadn't broken it for some prebuilt binaries, anyone who wants HW decoding and uses the recommended "hwdec=auto" would get hit by this.

huhn
25th February 2026, 22:23
and the code you used is pushed up stream this way?
and they really consider to put "hybrid" decoder/fake hardware decoder under hardware decoding?

do i understand this correctly?

pretty much every hybrid decoder ever released is so bad you are better off using software once CPUs got better, just saying...

Z2697
25th February 2026, 22:30
This is literally the way it is in the upstream repos right now.
Several months already, actually.

that prores_raw decoder is really faster than CPU (seemingly single thread only), prores is meh (consumes roughly same amount of power as a CPU that's as fast, and CPU can be faster), ffv1 is ugh.
dpx is uncompressed I don't know why a compute shader decoder exists.

huhn
25th February 2026, 22:50
pirlouy has close to zero FPS
i have close to zero FPS on both AMD and nvidia
you have zero FPS

you did nothing wrong. this build or this decoder isn't there "yet".

BTW. my 4060 is massively faster compared to my 9060 XT like 5-10 times still useless.

putting down shader decoder as hardware is just wrong. the expected result of hardware decoding is close to zero load on CPU and GPU cause a dedicated piece of hardware does the job. this is just a "hybrid" decoder and currently not a useful one.

GeoffreyA
26th February 2026, 21:55
Since "semi-official" builds don't have the decoder (for now (https://forum.doom9.org/showthread.php?p=2028640#post2028640)), here's a light build of MPV that I just made with m-ab-s.
https://pixeldrain.com/u/TU9idmyx
https://workupload.com/file/RZPxKMcfYEu

It wouldn't play the FFV1 file when hwdec was enabled.

Z2697
7th March 2026, 21:00
https://github.com/FFmpeg/FFmpeg/commit/5a6eeed9f0e375bde70c8e6c82fad58d8d06700c
vulkan_ffv1: warn users on low number of slices

Hmm...
Does this mean I am doing it wrong?

huhn
7th March 2026, 22:10
this is so stupid...
slices decrease compression efficiency by a lot in this type of codec and are literally threads, and a GPU likes to "multi thread", so yes, a low-slice encode can not be as easily multi-thread decoded as a high-slice encode, maybe.

no you did just fine. the existence of this decoding is just questionable.
so i did the only sane thing and encoded with 1024 threads, going from 1.6 GB to 1.9 GB.
still unplayable; unless you have a new build, nothing changed.

this is some code for clusters or other server farms to do some stuff with that thing where efficiency is ignored and speed is the only thing that matters. maybe this is a tiny bit faster on 2000 AI cards compared to a 16 core CPU or something...
or for ultra high resolution where the upload to the GPU wouldn't be possible or something. i may be overdosing on copium here.

i will do 16 now. i expect nothing...

Z2697
8th March 2026, 08:57
So I tested it again, apparently it provides "0.5 frames per second per slice per frame" for a 1080p yuv420p10le video, on a RTX 4090.
Amazing, yeah. If I encode such video with 200 slices I get roughly 100 fps on 100% load.
(but the power consumption is still relatively low at ~90w, compared to the rated 450w or smth, so there's a lot of "wasted cycles" I guess?)
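
As a back-of-the-envelope sketch of that figure: the 0.5 fps per slice value is the one measured above (1080p yuv420p10le, RTX 4090), and treating it as perfectly linear is an assumption that will stop holding once the GPU is saturated. The function names are made up for illustration.

```python
import math

# Measured above: roughly 0.5 frames per second per slice per frame.
FPS_PER_SLICE = 0.5


def estimated_fps(slices: int, fps_per_slice: float = FPS_PER_SLICE) -> float:
    """Estimated decode speed, assuming throughput scales linearly with slice count."""
    return slices * fps_per_slice


def slices_for_fps(target_fps: float, fps_per_slice: float = FPS_PER_SLICE) -> int:
    """Smallest slice count that reaches target_fps under the same linear assumption."""
    return math.ceil(target_fps / fps_per_slice)
```

With these numbers, 200 slices gives the roughly 100 fps mentioned above, and a 24 fps video would need about 48 slices just to play in realtime.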

My driver crash happens when decoding rice coded video.
If the range coder is used, no crash (so far), which is the default or only option for high bitdepth anyway.

huhn
8th March 2026, 11:10
maybe int32 only or something like that. it is not bus or memory limited
i would move on. ffmpeg just has very very low standards for such stuff, there is no other explanation.
as long as it mathematically does something correct it seems to be allowed in the code, even though 1000 entries of BonziBuddy wouldn't be as harmful as this.

BTW. it can not decode 1024 slices correctly so much about the mathematically correct part...

what is rice now?

Z2697
8th March 2026, 14:08
Golomb-rice coding, or "rice" coding, is the default coder in FFV1 (when possible).
https://en.wikipedia.org/wiki/Golomb_coding

-coder        <int>  E..V....... Coder type (from -2 to 2) (default rice)
   rice       0      E..V....... Golomb rice
   range_def  -2     E..V....... Range with default table
   range_tab  2      E..V....... Range with custom table
   ac         1      E..V....... Range with custom table (the ac option exists for compatibility and is deprecated)


It often performs worse than range coder in FFV1, compression ratio-wise. But it's faster.
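
For the curious, here is a minimal textbook sketch of Rice coding with a fixed parameter k: the quotient n >> k is written in unary (ones followed by a zero) and the remainder in k plain bits. This is NOT the FFV1 bitstream, whose details are not covered here; it only illustrates the basic idea, and the function names are made up.

```python
def rice_encode(n: int, k: int) -> str:
    """Encode a non-negative integer as a bit string: unary quotient + k-bit remainder."""
    if n < 0 or k < 0:
        raise ValueError("n and k must be non-negative")
    quotient = n >> k
    remainder = n & ((1 << k) - 1)
    # With k == 0 there are no remainder bits at all.
    remainder_bits = format(remainder, "b").zfill(k) if k > 0 else ""
    return "1" * quotient + "0" + remainder_bits


def rice_decode(bits: str, k: int) -> int:
    """Decode one value produced by rice_encode with the same k."""
    quotient = bits.index("0")  # count of leading ones before the terminating zero
    remainder_bits = bits[quotient + 1:quotient + 1 + k]
    remainder = int(remainder_bits, 2) if k > 0 else 0
    return (quotient << k) | remainder
```

Small values get short codes and large values get long unary runs, which is cheap to decode but less efficient than a range coder when the data does not fit that distribution.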

GeoffreyA
8th March 2026, 14:31
I wonder if the other Vulkan encoders added of late are of this calibre.

Z2697
8th March 2026, 15:03
Here's a new light build if anyone wants to try. (not recommended)
https://workupload.com/file/F98d3wnmwm9
https://pixeldrain.com/u/tK6JBypq

(should there be a difference? idk.)

GeoffreyA
8th March 2026, 15:26
I've lost track of what's going on, having been away from home for three weeks.

Z2697
8th March 2026, 15:52
Oh, some vacation? :)

You are fine, this is nothing important.

GeoffreyA
8th March 2026, 22:13
Sort of. I was at my grandparents' but am back now :)