VSGAN - VapourSynth GAN Implementation, based on ESRGAN's implementation [Archive] - Page 3

Selur

25th April 2022, 18:08

knumag

25th April 2022, 19:42

I got it working now, but why does VSGAN seem better when looking at the result? The same model is being used, but it still seems sharper.

knumag

25th April 2022, 19:54

And also, using the latest vsrealesrgan with ONNX, R58 and Python 3.8, I got 90-100% GPU usage at the same fps as I got with the older vsrealesrgan 1.2.0 at 10% GPU usage, which I did get to work with R57 and Python 3.9.
Very confusing. CPU usage might have been a bit higher without ONNX, but that difference is weird when getting the same fps, no?

knumag

25th April 2022, 19:58

a. That guide won't work with the current vsrealesrgan.
b. Assuming you use Vapoursynth R57 and not the new Vapoursynth R58, the steps I posted over at https://forum.doom9.org/showthread.php?t=184000 should work.
Using VapourSynth R58 would require Python 3.8 (not 3.9 or 3.10), since R58 only supports 3.8 and 3.10, but onnxruntime does not support 3.10 atm.

Cu Selur
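
The version constraint described above can be sketched as a simple set intersection. The R58 set is taken from the post; the onnxruntime set is only an assumption for illustration, since the post states nothing beyond 3.10 being unsupported at the time.

```python
# Python versions supported by VapourSynth R58, as stated in the post above.
VAPOURSYNTH_R58 = {"3.8", "3.10"}

# ASSUMPTION: illustrative set only; the post just says 3.10 is not supported yet.
ONNXRUNTIME = {"3.7", "3.8", "3.9"}


def usable_python_versions(*supported_sets):
    """Return the Python versions accepted by every component, sorted numerically."""
    common = set.intersection(*supported_sets)
    return sorted(common, key=lambda v: tuple(int(p) for p in v.split(".")))


print(usable_python_versions(VAPOURSYNTH_R58, ONNXRUNTIME))
```

Under these assumptions only 3.8 remains, which matches the recommendation in the post.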

I tried that guide also, but it didn't work for me. I had to install earlier versions of a lot of the packages to get it to work, for some reason.
Been at this now for 10-15 hours.. :D

lansing

7th December 2022, 04:50

How much GPU RAM is required to run this? I'm upscaling a 1080p image 2x and it's eating up all 6 GB of my VRAM.

Selur

7th December 2022, 09:20

@lansing: going from 1080p to 4K with a 2x_-model at 16 bit shows a VRAM usage of 6.3 GB on my card (GeForce RTX 4080), so you might be out of luck. You could try setting an overlap value; maybe that triggers the tiling support of VSGAN and helps with the RAM shortage.
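
To illustrate why tiling with an overlap can lower peak memory, here is a minimal, generic sketch that splits a frame into overlapping boxes. It is not VSGAN's own implementation; the function name, the tile size of 512 and the overlap of 16 are all assumptions chosen for the example.

```python
def split_into_tiles(width, height, tile, overlap):
    """Yield (left, top, right, bottom) boxes that together cover the frame.

    Neighbouring boxes share `overlap` pixels so seams can be blended away
    after each tile has been processed on its own.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    step = tile - overlap
    for top in range(0, max(height - overlap, 1), step):
        for left in range(0, max(width - overlap, 1), step):
            yield (left, top, min(left + tile, width), min(top + tile, height))


# Hypothetical values: a 1080p frame, 512 px tiles, 16 px overlap.
tiles = list(split_into_tiles(1920, 1080, tile=512, overlap=16))
largest = max((right - left) * (bottom - top) for left, top, right, bottom in tiles)
print(len(tiles), "tiles; largest tile covers", largest / (1920 * 1080), "of the frame")
```

Only one tile has to go through the model at a time, so the working set scales with the tile area rather than the full frame, at the cost of some redundant work in the overlapping borders.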

kedautinh12

7th December 2022, 09:36

Lol, 4080, a rich man :D

Selur

7th December 2022, 09:53

Not any more. :( (new card + new power supply + new ups)
After 5 years my old 1070 simply was having more and more issues, since NVIDIA messed up the driver support more and more. (Accessing my 5k display, which needs dual DisplayPort input, wasn't possible with any drivers newer than the drivers from May.)

lansing

7th December 2022, 11:14

kedautinh12

7th December 2022, 11:40

But AMD cards don't support CUDA, if VSGAN needs CUDA :D

Julek

7th December 2022, 18:21

Is the plugin for Nvidia cards only? AMD cards offer so much more RAM for a lower price.

You can use vsncnn with vs-mlrt (https://github.com/AmusementClub/vs-mlrt#vsncnn-ncnn-based-gpu-vulkan-runtime) to run on AMD. Here is the list of models (https://github.com/AmusementClub/vs-mlrt/wiki#models) currently supported by vs-mlrt; other models you need to convert to ONNX yourself.

lansing

8th December 2022, 06:49

You can use vsncnn with vs-mlrt (https://github.com/AmusementClub/vs-mlrt#vsncnn-ncnn-based-gpu-vulkan-runtime) to run on AMD. Here is the list of models (https://github.com/AmusementClub/vs-mlrt/wiki#models) currently supported by vs-mlrt; other models you need to convert to ONNX yourself.

I'm waiting for the RX 7900 XTX launch next week to upgrade to AMD. But wow this field is so dominated by Nvidia. It's like nobody is developing anything for AMD...

The models I'm interested in are 2x_LD-Anime_Skr_v1.0 and some anime sharpening model for old anime. It's turning my anime DVD into Blu-ray.

original
https://i.imgur.com/yhPEkaI.png

2x_LD-Anime_Skr_v1.0
https://i.imgur.com/wpSbhLB.png

color matched to cel
https://i.imgur.com/ahY3plj.jpg

I tried to run another 2x sharpening upscale on the result but ran out of memory.

ChaosKing

8th December 2022, 10:05

kedautinh12

8th December 2022, 11:16

Do you have a Discord channel for encoders? Can you let me in? :D

Krizzz989

8th December 2022, 23:03

ChaosKing, those look great. Are you open to sharing your script for that? I'm interested in upscaling the same anime but I'm completely lost on where to start.

lansing

8th December 2022, 23:47

Some test I made, One Piece DVD:
https://cdn.discordapp.com/attachments/426802196194263051/1003626930169454592/op1.png <-- (stronger filtering)

https://cdn.discordapp.com/attachments/426802196194263051/1003622612938801204/unknown.png
https://cdn.discordapp.com/attachments/426802196194263051/1003622370965209128/unknown.png
https://cdn.discordapp.com/attachments/426802196194263051/1003623183980699738/unknown.png

One problem that I often see with these models is a color shift. It is not very fast, but the quality is awesome (if the source video "matches").

I just found the model's author on Discord, I'll see if I can get some support from him.

UPDATE:

I got a reply from the author. I don't know how to interpret it, so I'm quoting him directly:
Because LDs have crap colors. I had to color match the entire data set, so in the end, there's still a bit of shift

So the shift is expected, since our source is a DVD instead of an LD.

lansing

9th December 2022, 00:16

ChaosKing, those look great. Are you open to sharing your script for that? I'm interested in upscaling the same anime but I'm completely lost on where to start.

OP has an installation guide:
https://vsgan.phoeniix.dev/en/stable/installation.html

Total file size would be a few GB.

And then go here to look for a model; there are models for different situations. What we use here is 2x_LD-Anime_Skr_v1.0:
https://upscale.wiki/wiki/Model_Database#Anime

import vapoursynth as vs
from vsgan import ESRGAN

core = vs.core

# your DVD source (placeholder path; needs the ffms2 source plugin)
clip = core.ffms2.Source(source=r'my_path\my_dvd.mkv')

# convert to RGB, which the ESRGAN model works on
clip = core.resize.Bicubic(clip=clip, format=vs.RGB24, matrix_in_s="470bg", range_s="limited")

# path to the downloaded model file
model = r'my_path\2x_LD-Anime_Skr_v1.0.pth'

# instantiate ESRGAN on the GPU, load the model and apply it
esrgan = ESRGAN(clip, "cuda")
esrgan.load(model)
esrgan.apply()
clip = esrgan.clip

# set the output
clip.set_output()

ChaosKing

9th December 2022, 09:52

I don't remember which models I used, I simply tried out different models (and combined some).
You can start with one of these: 2x_AnimeClassics_UltraLite_510K, 2x_LD-Anime_Skr_v1.0, 2x_SHARP_ANIME_V1, 2x_DigitalFlim_SubCompact_nf24-nc8_289k_net_g

I also used the example script from the docs. Nothing special, just patience, oh and save your script from time to time, because your editor will crash!

LD = Laserdisc? My source is a Japanese R2 DVD.

lansing

9th December 2022, 11:55

After going through many models for upscaling old anime, I think the models with the best quality are the ones that were trained on the actual Blu-ray, such as 2x_LD-Anime_Skr_v1.0. A guy on the Sailor Moon forum (https://www.sailormoonforum.com/index.php?threads/sailor-moon-color-correction.33743/page-6#post-975611) who trained his model on Blu-rays of the Sailor Moon movies is also getting amazing results.

ChaosKing

9th December 2022, 12:08

I really need to learn how to train models myself. Imagine if you had like 200 CELs to train on.
I read that you need at least 100 different frames to get decent results.

lansing

9th December 2022, 12:18

I need to learn how to train models myself. Imagine if you had like 200 CELs to train on.
I read that you need at least 100 different frames to get decent results.

I'm also thinking about training one myself (don't even know how yet). I need a 1x model for sharpening right after 2x_LD-Anime but couldn't find any that were trained from blu-ray.

Selur

9th December 2022, 15:35

color matched to cel
https://i.imgur.com/ahY3plj.jpg

@lansing: How did you do the color adjustment?

lansing

9th December 2022, 16:05

@lansing: How did you do the color adjustment?

I use 3D LUT Creator

Selur

9th December 2022, 16:09

Ah, okay. (I was hoping for some Vapoursynth plugin I wasn't aware of. ;))

lansing

9th December 2022, 16:29

Ah, okay. (I was hoping for some Vapoursynth plugin I wasn't aware of. ;))

Vapoursynth is not the right tool for color adjustment

kedautinh12

9th December 2022, 16:51

Vapoursynth is not the right tool for color adjustment

Vapoursynth has vscube, which can load 3D LUTs:
https://github.com/sekrit-twc/timecube

lansing

9th December 2022, 17:11

Vapoursynth has vscube, which can load 3D LUTs:
https://github.com/sekrit-twc/timecube

What I mean is that we can load the finished LUT into Vapoursynth, sure, but we can't use Vapoursynth for the adjustment process itself.

mastrboy

9th December 2022, 20:04

I really need to learn how to train models myself. Imagine if you had like 200 CELs to train on.
I read that you need at least 100 different frames to get decent results.

Any chance you could write down the process if you figure this out?
My attempts at training my own models have not produced good results at all, and there's not a lot of good documentation out there either...

Selur

9th December 2022, 20:05

I agree some good documentation would be nice. :)

lansing

10th December 2022, 01:39

Selur

10th December 2022, 06:51

Multiple solutions:
a. don't use a VSGAN model, but something like:
import havsfunc
import hysteria
# adjusting color space from RGB24 to YUV444P8 for vsKNLMeans
clip = core.resize.Bicubic(clip=clip, format=vs.YUV444P8, matrix_s="470bg", range_s="limited")
# denoising using KNLMeansCL
clip = core.knlm.KNLMeansCL(clip=clip, d=0, h=10.00, channels="Y")
# dehaloing using DeHalo_alpha
clip = havsfunc.DeHalo_alpha(clip)
# adjusting color space from YUV444P8 to YUV420P10 for vsHysteria
clip = core.resize.Bicubic(clip=clip, format=vs.YUV420P10, range_s="limited")
# line darkening using Hysteria
clip = hysteria.Hysteria(clip=clip)
for the cleaning, and simply don't sharpen the lines. (I got RGB24 as source color sampling since I used your image as source.)

b. use a filter that does the smoothing and only apply it using an edge mask. :) (see: https://guide.encode.moe/encoding/masking-limiting-etc.html for the general idea)
c. use a dehalo filter :)

Cu Selur

pandy

12th December 2022, 12:38

I want to create images with softer edges from Blu-ray sources for training. Is there any filter that only blurs the lines? I tried downsizing the original to 50% and resizing it back in Photoshop, but that adds some ringing along the lines.

Use a classic kernel filter:

0 .5 0
0 1 0
0 .5 0

For resizing, use bilinear; then no ringing should be introduced.
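
As a rough illustration of what the kernel above does, here is a minimal pure-Python sketch that applies it to a tiny test image. The helper name, the normalisation by the kernel sum and the edge handling (repeating border pixels) are assumptions for the example, not any particular plugin's behaviour.

```python
# The 3x3 kernel from the post, written row by row.
KERNEL = [
    [0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.5, 0.0],
]


def convolve3x3(image, kernel):
    """Apply a 3x3 kernel to a 2D list of pixel values.

    The result is divided by the kernel sum so overall brightness is kept,
    and border pixels are repeated where the kernel reaches past the edge.
    """
    height, width = len(image), len(image[0])
    total = sum(sum(row) for row in kernel)
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            acc = 0.0
            for ky in (-1, 0, 1):
                for kx in (-1, 0, 1):
                    sy = min(max(y + ky, 0), height - 1)
                    sx = min(max(x + kx, 0), width - 1)
                    acc += image[sy][sx] * kernel[ky + 1][kx + 1]
            row.append(acc / total)
        out.append(row)
    return out


# A horizontal line (row 2) and a vertical line (column 2) as test images.
horizontal = [[0] * 5, [0] * 5, [100] * 5, [0] * 5, [0] * 5]
vertical = [[0, 0, 100, 0, 0] for _ in range(5)]

print([row[2] for row in convolve3x3(horizontal, KERNEL)])  # one column through the horizontal line
print(convolve3x3(vertical, KERNEL)[2])                     # one row through the vertical line
```

Note that, as written, this kernel only mixes a pixel with its neighbours above and below, so it softens horizontal lines and leaves vertical ones unchanged; the transposed kernel would be needed for the other direction.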

lansing

14th December 2022, 08:47

I was testing out a standalone image processing program called chaiNNer for running ESRGAN. Upscaling the same 640x480 image with a 2x model, it uses 1.8 GB of GPU RAM during processing and drops to 600 MB when finished, while Vapoursynth uses 1.7 GB and never releases it.

lansing

14th December 2022, 12:28

I've found the denoising in the 2x_LD-Anime_Skr_v1.0 model exceptionally impressive. It is able to smooth out the chroma noise in near-black colors, making it possible to change color/brightness in those areas without getting artifacts. I couldn't even do this in Neat Video.

https://imgsli.com/MTM5ODIz

Selur

14th December 2022, 16:32

@lansing: Couldn't you use DPIR or CCD for that, which both should be faster?

lansing

14th December 2022, 18:27

@lansing: Couldn't you use DPIR or CCD for that, which both should be faster?

I just tested them. CCD is not doing a thing on these super dark areas. DPIR did do some smoothing, but not good enough to avoid artifacts.

DPIR +50% brightness in dark
https://imgur.com/rM3jYI8

Selur

14th December 2022, 21:05

What threshold did you use for CCD? (try 5 or even higher)
If you mainly want to filter just the dark areas, you could also use something like:
## Starting applying 'limit' masked filtering for vsCCD
clipMask = clip
clipMask = core.std.BinarizeMask(clipMask, 30)
clipMask = core.std.InvertMask(clipMask)
clipFiltered = clip
# adjusting color space from RGB24 to RGBS for vsCCD
clipFiltered = core.resize.Bicubic(clip=clipFiltered, format=vs.RGBS, range_s="limited")
# chroma denoising using CCD
clipFiltered = core.ccd.CCD(clip=clipFiltered, threshold=50.00)
clipFiltered = core.resize.Bicubic(clip=clipFiltered, format=vs.RGB24, range_s="limited", dither_type="error_diffusion")
clip = core.std.MaskedMerge(clip, clipFiltered, clipMask)
## Finished applying 'limit' masked filtering for vsCCD

With DPIR:
from vsdpir import DPIR
## Starting applying 'limit' masked filtering for vsDPIRDeblock
clipMask = clip
clipMask = core.std.BinarizeMask(clipMask, 30)
clipMask = core.std.InvertMask(clipMask)
clipFiltered = clip
# adjusting color space from RGB24 to RGBS for vsDPIRDeblock
clipFiltered = core.resize.Bicubic(clip=clipFiltered, format=vs.RGBS, range_s="limited")
# deblocking using DPIRDeblock
clipFiltered = DPIR(clip=clipFiltered, strength=150.000, task="deblock", provider=1, device_id=0, dual=True)
clipFiltered = core.resize.Bicubic(clip=clipFiltered, format=vs.RGB24, range_s="limited", dither_type="error_diffusion")
clip = core.std.MaskedMerge(clip, clipFiltered, clipMask)
## Finished applying 'limit' masked filtering for vsDPIRDeblock
(Masked BasicVSR++ should also do the trick, but that might be slower)
Taking the image as source:
## Starting applying 'limit' masked filtering for vsLevels
clipMask = clip
clipMask = core.std.BinarizeMask(clipMask, 30)
clipMask = core.std.InvertMask(clipMask)
clipFiltered = clip
# Color Adjustment using Levels on RGB24 (8 bit)
clipFiltered = core.std.Levels(clip=clipFiltered, min_in=16, max_in=235, min_out=16, max_out=235)
clip = core.std.MaskedMerge(clip, clipFiltered, clipMask)
## Finished applying 'limit' masked filtering for vsLevels
also seems to work.
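
The masking pattern shared by the three snippets above (binarize at 30, invert, then merge) can be sketched on a single row of 8-bit values. This is a simplified, single-channel model written in plain Python; the helper names are made up for the example and the all-zero `filtered` row is just a stand-in for whatever filter is used.

```python
def binarize(pixels, threshold, peak=255):
    """Pixels at or above the threshold become `peak`, everything else 0."""
    return [peak if p >= threshold else 0 for p in pixels]


def invert(mask, peak=255):
    """Flip the mask so that dark source pixels end up selected."""
    return [peak - m for m in mask]


def masked_merge(base, filtered, mask, peak=255):
    """Take `filtered` where the mask is bright and `base` where it is dark."""
    return [round(b + (f - b) * m / peak) for b, f, m in zip(base, filtered, mask)]


source = [5, 20, 29, 30, 120, 240]
filtered = [0] * len(source)  # stand-in for the output of any filter

mask = invert(binarize(source, 30))
print(mask)
print(masked_merge(source, filtered, mask))
```

Only the pixels below 30 are replaced by the filtered result, so the filter strength can be pushed quite high without touching the brighter parts of the picture.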

lansing

15th December 2022, 01:58

I used the CCD from Virtualdub, set it to max 10 and nothing changed on the dark areas, I don't think it's really meant for this.

Your methods are simply clipping the dark areas to limited range to wipe out the chroma noise; that has nothing to do with the tested filters.
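
The clipping effect can be shown with a small sketch of a clamp-and-remap levels function, modelled on how I understand the formula in the std.Levels documentation (the rounding here is an assumption). With identical input and output ranges of 16-235, it does nothing except clamp values outside that range.

```python
def levels(value, min_in=16, max_in=235, min_out=16, max_out=235, gamma=1.0):
    """Clamp to the input range, then remap to the output range."""
    clamped = min(max(value, min_in), max_in)
    normalised = (clamped - min_in) / (max_in - min_in)
    return round(normalised ** (1.0 / gamma) * (max_out - min_out) + min_out)


# Everything below 16 collapses to 16, everything above 235 to 235.
print([levels(v) for v in (0, 5, 16, 100, 235, 250)])
```

Since the noisy near-black values all collapse to the same level, the noise disappears regardless of which denoiser ran before.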

Selur

15th December 2022, 05:43

I used the CCD from Virtualdub, set it to max 10 and nothing changed on the dark areas, I don't think it's really meant for this.
Okay, I thought you were using Vapoursynth.

Your methods are simply clipping the dark areas to limited range to wipe out the chroma noise; that has nothing to do with the tested filters.
Yes, that was the last approach, since I was thinking that it would make no sense that the video is 4:4:4 and pc scale. :)

Selur

17th December 2022, 19:58

btw. for those filtering animes&cartoons: https://github.com/HolyWu/vs-animesr/ might be interesting :)
(TensorRT + RGBH doesn't seem to be the only thing not working atm.)

lansing

18th December 2022, 06:47

btw. for those filtering animes&cartoons: https://github.com/HolyWu/vs-animesr/ might be interesting :)
(TensorRT + RGBH doesn't seem to be the only thing not working atm.)

Where do you unzip the file?

Selur

18th December 2022, 08:43

I installed it by:
a. running pip install -U vsanimesr
b. downloading CUDA-11.7_cuDNN-8.6.0_TensorRT-8.5.2.2_win64.7z and extracting the DLLs into the runtime folder (which I add in my scripts)
c. installing 'tensorrt-8.5.2.2-cp310-none-win_amd64.whl' through python -m pip install tensorrt-8.5.2.2-cp310-none-win_amd64.whl

lansing

18th December 2022, 11:35

Got it working. Its pretrained models didn't look good on my old anime. What kind of source is it supposed to be good at atm?

Selur

18th December 2022, 13:02

No clue what it's supposed to be good at. But, looking at the examples over at https://github.com/TencentARC/AnimeSR, I would say cartoons & anime where compression artifacts are the main problem.
So depending on your source, normal denoising & co. should still be applied, and this mainly saves you the separate resizing, line darkening and sharpening.

AnimeSR_v1-PaperModel.pth: v1 model, also the paper model. You can use this model for paper results reproducing.
AnimeSR_v2.pth: v2 model. Compare with v1, this version has better naturalness, fewer artifacts, and better texture/background restoration. If you want better results, use this model.
source: https://github.com/TencentARC/AnimeSR#zap-quick-inference
I would say it's mainly suited for simpler anime & cartoons.

Cu Selur

ChaosKing

18th December 2022, 14:30

The demo videos seem a bit overfiltered to me. Not bad, but also not that good. :)
But it's good to have options. Could be a good prefilter or you can merge/average it with other filters.

Selur

18th December 2022, 16:17

Yup, I too think it's mainly about having options. :)

lansing

19th December 2022, 04:45

I tried it on their monkey test clip; it pretty much wipes out all small details, leaving only the solid lines.

Selur

19th December 2022, 16:19

Here are some examples of that clip processed with AnimeSR_v2 model:
https://i.ibb.co/z2NPgpj/grafik.png (https://ibb.co/C9WVygN)
https://i.ibb.co/5TvXG0j/grafik.png (https://ibb.co/kK0V57D)
https://i.ibb.co/rZs9ms5/grafik.png (https://ibb.co/Zhm9Vmc)
https://i.ibb.co/5jFcbwc/grafik.png (https://ibb.co/NNSYcbY)
https://i.ibb.co/5n2P2Zd/grafik.png (https://ibb.co/pRnSnqg)
https://i.ibb.co/pZ9d4b6/grafik.png (https://ibb.co/ct7Q2kK)
https://i.ibb.co/FmGvZHX/grafik.png (https://ibb.co/jb7K0Jz)
last image with the intended resolution (x4, not scaled down to source)
https://i.ibb.co/cgNsr4j/grafik.png (https://ibb.co/JBzSsD9)
so everyone can make up their mind.

Cu Selur

ChaosKing

19th December 2022, 19:07

Definitely better than the demo video!

ReinerSchweinlin

20th December 2022, 09:44

Not too bad. Reminds me of results from a clever combination of traditional filters: area smoothers, line darkeners, etc. How is the speed?