vs-propainter on mac - it works [Archive]

elcoyote

19th December 2024, 20:25

I don't know if there's any interest here in running vs-propainter on a Mac, but I got it working and wanted to share my little how-to:

I'm using dan64's vs-propainter here: https://github.com/dan64/vs-propainter

MPS support was added a while ago, but if you try to use the script as-is on a Mac, you'll get CUDA errors because propainter defaults to a CUDA device. If you set "device_index=-1" in the propainter function of your script, it will work, but on the CPU only, so it is very, very slow.

What I did to enable MPS was to modify __init__.py so that "device_index=-1" uses "mps" instead of "cpu". This is on lines 173 and 365:

Change device = torch.device("cpu") to device = torch.device("mps")
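
The edit above hard-codes "mps" for device_index=-1. A slightly more defensive variant would only pick "mps" when the backend is actually usable. This is a minimal sketch, not code from vs-propainter: the helper names pick_device_name and detect_backends are hypothetical, and the CPU fallback for a missing CUDA device is my assumption. Only torch.cuda.is_available() and torch.backends.mps.is_available() are real PyTorch calls.

```python
def pick_device_name(device_index, cuda_available, mps_available):
    """Return a torch device string for device_index (hypothetical helper)."""
    if device_index >= 0:
        # Non-negative index: use that CUDA device, or CPU if CUDA is absent
        # (assumed fallback, the plugin itself may behave differently).
        return f"cuda:{device_index}" if cuda_available else "cpu"
    # device_index == -1: prefer Apple's MPS backend over plain CPU.
    return "mps" if mps_available else "cpu"


def detect_backends():
    """Return (cuda_available, mps_available); both False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False, False
    # Very old torch builds have no torch.backends.mps module at all.
    mps = getattr(torch.backends, "mps", None)
    return torch.cuda.is_available(), bool(mps and mps.is_available())
```

With torch installed, the patched line could then read device = torch.device(pick_device_name(device_index, *detect_backends())).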
If you just do that, you'll still get errors because some torch operators are not supported on MPS. The trick is to enable PYTORCH_ENABLE_MPS_FALLBACK=1, which lets the unsupported operators run on the CPU, by running this in the terminal:
export PYTORCH_ENABLE_MPS_FALLBACK=1

You can enable this permanently with:
echo 'export PYTORCH_ENABLE_MPS_FALLBACK=1' >> ~/.zshrc
source ~/.zshrc
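
If the script is launched from a GUI editor that does not read ~/.zshrc, the same variable can probably be set from Python instead. This is a sketch under one assumption: that the variable has to be in the environment before torch is first imported, so these lines would go at the very top of the .vpy script (or of __init__.py), ahead of any torch import. The helper mps_fallback_enabled is hypothetical and only there for checking.

```python
import os

# Assumed to be needed before the first "import torch" in the process.
# setdefault keeps any value that was already exported in the shell.
os.environ.setdefault("PYTORCH_ENABLE_MPS_FALLBACK", "1")


def mps_fallback_enabled():
    """True if the fallback variable is set to "1" for this process."""
    return os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK") == "1"
```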

On a Mac Studio M1 running macOS Sonoma, I also needed to add torch.set_num_threads(1) in the __init__.py file after line 29 to avoid segmentation faults:
Add torch.set_num_threads(1)

I had to install torch using:
pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
I also had to install some other packages:
pip3 install opencv-python
brew install scipy
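
After installing, it may be worth checking that the torch build actually sees MPS before touching the .vpy script. A minimal sketch: the function name mps_status and the returned strings are made up for illustration, while torch.backends.mps.is_built() and torch.backends.mps.is_available() are existing PyTorch calls.

```python
def mps_status():
    """Describe whether the installed torch can use the MPS backend."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    # Very old torch builds have no torch.backends.mps module at all.
    mps = getattr(torch.backends, "mps", None)
    if mps is None or not mps.is_built():
        return "torch built without MPS"
    if not mps.is_available():
        return "MPS built but not available"
    return "MPS available"


if __name__ == "__main__":
    print(mps_status())
```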
Then, in the .vpy script, I use this:
clip = propainter(clip, clip_mask=clipMask, device_index=-1)

Performance is pretty good and it seems to run mostly on the GPU. I only tested on a MacBook Air M2 with 24GB of RAM, but I will test on a Mac Studio with 128GB of RAM soon. I hope that the Mac Studio with its 128GB of unified memory will be able to process higher resolution clips than the PC with an RTX 3090 (only 24GB of VRAM) that I'm using now.

elcoyote

19th December 2024, 22:35

I did not do a thorough comparison with propainter running on CUDA yet, but it seems to work fine.

Here are the "bmx" and "tennis" outputs:
https://www.dropbox.com/scl/fi/cky5rq7jfo52vac7uk761/bmx_inpaint.mp4?rlkey=7sqc34fzzsuw3v7yhs5cqn7f9&st=ebgjhxeg&dl=0
https://www.dropbox.com/scl/fi/2ggzvadulunxp324bu5vj/tennis_inpaint.mp4?rlkey=7fmjbilszwbbo8ut4h2gyqurg&st=gechgtqv&dl=0

Have you seen temporal stability problems in the past with MPS?