Waifu2x-caffe


HolyWu
8th July 2016, 15:35
https://github.com/HomeOfVapourSynthEvolution/VapourSynth-Waifu2x-caffe

Another implementation of waifu2x, based on the waifu2x-caffe (https://github.com/lltcggie/waifu2x-caffe) library. It can only run on NVIDIA GPUs.

HolyWu
11th July 2016, 17:05
Update r6.


Update waifu2x-caffe library to 1.1.6.
Add noise reduction level 0 and make it the default level.

HolyWu
11th September 2016, 10:00
Update r7.


Update upconv_7 models.
Change the default model to upconv_7_anime_style_art_rgb.
Update waifu2x-caffe library to 1.1.7.1.

brucethemoose
19th September 2016, 17:40
Out of curiosity, why does Waifu2x caffe support non-integer scaling when this version doesn't? 2.25x scaling would be perfect for 480p -> 1080p.

EDIT: Also, great filter BTW. This is by far the best animation scaling I've seen.

HolyWu
20th September 2016, 13:56
Out of curiosity, why does Waifu2x caffe support non-integer scaling when this version doesn't? 2.25x scaling would be perfect for 480p -> 1080p.

Internally the image is always enlarged by a power of 2. So even if you specify a scale ratio of 2.25x, the program still enlarges the image to 4x first and then downscales the 4x image to the final resolution. I simply don't want to hardcode the downscaling part in my filter. It's up to the users to decide which resizer to use for downscaling.
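
To make the arithmetic above concrete, here is a minimal sketch in plain Python of how one might pick the power-of-2 factor and the intermediate size before downscaling. The helper names (power_of_two_scale, plan_upscale) are made up for illustration and are not part of the plugin. The resulting factor is what would be passed as scale, with the final downscale left to a resizer of your choice.

```python
def power_of_two_scale(target_ratio):
    """Return the smallest power of 2 that is >= target_ratio (hypothetical helper)."""
    scale = 1
    while scale < target_ratio:
        scale *= 2
    return scale


def plan_upscale(src_w, src_h, dst_h):
    """Return (scale, intermediate_w, intermediate_h) for a wanted output height."""
    scale = power_of_two_scale(dst_h / src_h)
    return scale, src_w * scale, src_h * scale


# 480p -> 1080p is a 2.25x ratio, so the filter has to run at 4x first.
scale, mid_w, mid_h = plan_upscale(640, 480, 1080)
print(scale, mid_w, mid_h)  # 4 2560 1920
```

In this example the 2560x1920 intermediate would then be downscaled to 1440x1080 in a separate step.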

HolyWu
23rd May 2017, 19:42
Update r8.


Update waifu2x-caffe library to 1.1.8.3.
Now uses cuDNN 6.0.
Add configure script. It's compilable on Linux now. Thanks Are_ for testing.

stax76
1st June 2017, 11:22
I am trying to replicate an exception posted to the staxrip issue tracker. The user is getting E_ACCESSDENIED after staxrip calls TaskDialog to show an error returned from VapourSynth. I get the following error:

Python exception: Waifu2x-caffe: failed open model file at initialization

Where can I find this model file?

import vapoursynth as vs
core = vs.get_core()
core.std.LoadPlugin(r"D:\Projekte\VS\VB\StaxRip\bin\Apps\Plugins\vs\fmtconv\fmtconv.dll")
core.std.LoadPlugin(r"D:\Projekte\VS\VB\StaxRip\bin\Apps\Plugins\vs\vslsmashsource\vslsmashsource.dll")
clip = core.lsmas.LWLibavSource(r"D:\Temp\staxrip\ash.mkv")
core.std.LoadPlugin(path=r'D:\Temp\Waifu2x-caffe-r8\Win64\Waifu2x-caffe.dll')
clip = core.fmtc.bitdepth(clip,bits=32)
clip = core.caffe.Waifu2x(clip, noise=1, scale=2, block_w=512, block_h=512, model=3, cudnn=True, processor=0, tta=False)
clip.set_output()

https://github.com/stax76/staxrip/issues/190

HolyWu
1st June 2017, 13:04
Python exception: Waifu2x-caffe: failed open model file at initialization

Where can I find this model file?

They are on the GitHub page as well, just in a previous release.

stax76
1st June 2017, 18:12
Thanks, got it working now.

poisondeathray
3rd June 2017, 00:34
Thanks HolyWu,

I tested on some low to midrange cards, and this seems much faster than the vpy Waifu2x-w2xc implementation. Playing with the block size can make a big difference in speed.

Offhand, do you know of any big differences quality-wise? Or would there be any reason NOT to switch to the caffe version if you had a compatible GPU?

HolyWu
3rd June 2017, 03:45
Offhand, do you know of any big differences quality-wise? Or would there be any reason NOT to switch to the caffe version if you had a compatible GPU?

The difference in quality is quite minor. Actually, models 0~2 in the caffe version are the same as in the w2xc version. The major difference is that the caffe version has upconv_7 models and cuDNN support while the w2xc version doesn't. Both of those primarily concern speed. There is really no reason to use the w2xc version when you have a compatible NVIDIA GPU though.

lansing
12th June 2017, 13:12
What is the recommended card for this? I'm using the default setting on a clip scaled to 1440x1020 after the filter, and I'm getting 0.66fps...

HolyWu
12th June 2017, 16:48
What is the recommended card for this? I'm using the default setting on a clip scaled to 1440x1020 after the filter, and I'm getting 0.66fps...

There is no officially recommended card. The only requirement is an NVIDIA GPU of compute capability 3.0 or higher.

I guess the color family of your clip is YUV? You will get slightly better speed if you convert to RGB before invoking the filter. Furthermore, tweak the block_w/block_h settings so that they divide the width/height of the clip with no remainder. A larger value isn't necessarily better, because the optimum depends on the card's capability and/or memory size. You have to experiment with different values to find the optimal pair for a specific resolution.

lansing
12th June 2017, 17:19
Furthermore, tweak block_w/block_h setting so that it can divide the width/height of the clip with no remainder. Larger value isn't necessarily better because it has to do with the card's capability and/or memory size. You have to do experiments with some different values to find out the optimal pair for the specific resolution.

Do you mean the width/height of the video before the filter or after?

poisondeathray
13th June 2017, 06:37
Do you mean the width/height of the video before the filter or after?

No, block_w and block_h are filter arguments.

https://github.com/HomeOfVapourSynthEvolution/VapourSynth-Waifu2x-caffe



block_w: The horizontal block size for dividing the image during processing. Smaller value results in lower VRAM usage, while larger value may not necessarily give faster speed. The optimal value may vary according to different graphics card and image size.

block_h: The same as block_w but for vertical.




It can make a big difference in speed; play with different values.

Don't complain, this is much faster than the Waifu2x-w2xc implementation, and that was much, much faster than the AviSynth waifu2x implementation.

HolyWu
13th June 2017, 18:00
I guess lansing was asking whether he should divide the image of the original size or the upscaled size. The answer is the original size though.
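
As a rough illustration of the "divide with no remainder" advice, the sketch below lists block sizes that divide the original (pre-upscale) width and height exactly. The function name and its minimum and multiple_of arguments are hypothetical conveniences, not plugin parameters; multiple_of is included only because a later release note in this thread says CUnet block sizes must be divisible by 4. Which candidate is actually fastest still has to be found by testing on your own card.

```python
def block_candidates(length, minimum=32, multiple_of=1):
    """List block sizes that divide `length` exactly, largest first.

    `minimum` and `multiple_of` are illustrative knobs, not plugin parameters.
    """
    return [
        size
        for size in range(length, minimum - 1, -1)
        if length % size == 0 and size % multiple_of == 0
    ]


# Source (pre-upscale) frame size from the thread's DVD examples.
width, height = 720, 480
print(block_candidates(width)[:5])   # [720, 360, 240, 180, 144]
print(block_candidates(height)[:5])  # [480, 240, 160, 120, 96]
```

Each value from the first list is a candidate for block_w and each value from the second a candidate for block_h.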

HolyWu
28th October 2018, 11:07
Update r10.


Update waifu2x-caffe library to 1.1.9.2.
Add UpResNet10 model.
Add batch parameter.
Now uses CUDA Toolkit v10.0.130 and cuDNN v7.3.1.

amayra
28th October 2018, 21:44
So does this work only on Windows?

Blovesx
29th October 2018, 00:38
Okay, where can I find an explanation of what UpResNet10 is? How different is it from the other models?

What's the batch option for?

I'm using the GUI version for Windows. How can I encode video using this program? You posted the link for VapourSynth (whatever it is), but I don't get how to compile it. Can't you upload it as you do with the caffe-gui releases?

WolframRhodium
29th October 2018, 02:09
Okay, where can I find an explanation of what UpResNet10 is? How different is it from the other models?

As I understand it, UpResNet10 combines the advantages of the residual block (https://arxiv.org/abs/1512.03385) and the squeeze-and-excitation block (https://arxiv.org/abs/1709.01507) (the former eases training while the latter boosts representational power):
The network is made up of 5 "residual SE blocks" with a global skip connection. Each residual SE block includes 2 convolution layers (5 * 2 = 10, which gives the name UpResNet10) and a channel attention mechanism.
Previous waifu2x models do not make use of any of these techniques.

Similar structure can be found in recent state-of-the-art super-resolution models like EDSR (https://arxiv.org/abs/1707.02921), RCAN (https://arxiv.org/abs/1807.02758), etc.

HolyWu
24th November 2018, 09:12
Update r11.


Update waifu2x-caffe library to 1.2.0.
Add CUnet model and it's the default model now.
Now uses cuDNN v7.4.1.

2gig
11th December 2018, 00:31
Hi, sorry if this is the wrong place to be posting this, but I'm having some trouble with this plugin. There's a good chance that my vapoursynth script is simply wrong, since I only started using it yesterday. (I used Avisynth for years; finally made the switch for this plugin).

Vapoursynth Script:
import vapoursynth as vs
core = vs.get_core()
vid = core.ffms2.Source(source=r'D:\path\to\source.vob')
#vid = core.std.Trim(vid, 0, 238)
#vid = core.nnedi3.nnedi3(vid, field=1, dh=False, nsize=6, nns=1, qual=1, etype=0, pscrn=2, opt=True, int16_prescreener=True, int16_predictor=True, exp=2, show_mask=False, combed_only=False)
vid = core.fmtc.bitdepth(vid, bits=32)
vid = core.w2xc.Waifu2x(vid, noise=0, scale=2, block=512, photo=False, gpu=1, processor=-1, list_proc=False, log=False)
vid = core.fmtc.bitdepth(vid, bits=16)
vid = core.std.AssumeFPS(vid, fpsnum=30000, fpsden=1001)
vid.set_output()

Command-line:
for %%f in (*.vpy) do START "vspipe" /B /BELOWNORMAL /WAIT "C:\Program Files (x86)\VapourSynth\core64\vspipe.exe" "%%f" - | "C:\Program Files\x265\x265.exe" --input - --input-depth 16 --fps 30000/1001 --input-res 1440x960 --output-depth 10 --level-idc 5.0 --no-high-tier --profile main10 -preset slow --psy-rd 0.4 --aq-strength 0.6 --crf 23 --ref 6 -o "converted/%%~nf_slow_10bit_crf23_vs.hevc"

Source Mediainfo:
ID : 224 (0xE0)
Format : MPEG Video
Format version : Version 2
Format profile : Main@Main
Format settings : CustomMatrix / BVOP
Format settings, BVOP : Yes
Format settings, Matrix : Custom
Format settings, GOP : Variable
Format settings, picture structure : Frame
Duration : 26 min 35 s
Bit rate mode : Variable
Bit rate : 7 641 kb/s
Maximum bit rate : 8 000 kb/s
Width : 720 pixels
Height : 480 pixels
Display aspect ratio : 4:3
Frame rate : 29.970 (30000/1001) FPS
Standard : NTSC
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Scan type : Interlaced
Scan order : Top Field First
Compression mode : Lossy
Bits/(Pixel*Frame) : 0.738
Time code of first frame : 00:00:00:00
Time code source : Group of pictures header
GOP, Open/Closed : Open
GOP, Open/Closed of first frame : Closed
Stream size : 1.42 GiB (88%)

My hardware is i7-7700k (no overclock for testing), GTX 1080 Ti, 32GB Ram @2133MHz (again, no OC for testing).

The two commented-out lines don't seem to matter whether they're included or not. Commenting out the Waifu2x line and halving the resolution in the command-line works as expected. With GPU=1, the whole thing terminates rather abruptly with x265 simply stating "encoded 0 frames". With GPU=0, it runs, but the output is garbage (https://i.imgur.com/2m8cfS6.jpg), a technicolor silhouette of the original source. Sorry about the lack of a log; I can't figure out how to produce one with vspipe. Setting log=True in Waifu2x doesn't appear to do anything.

Edit: Forgot to mention that I tried lowering the blocksize, but that didn't improve anything. Also I have gotten the old x32 Waifu2x for Avisynth to run, but it's incredibly slow.

Edit2: So, I discovered VSEdit. GPU=1 causes VSEdit to immediately close itself. GPU=0 will actually display a preview that looks pretty good, though the colors are off (https://imgur.com/a/86IawkW) compared to source/avisynth-w2x. I tested with just the fmtc bit-depth conversions, commenting out w2x, and the colors come out fine that way. It was also surprisingly fast. GPU=2 also works, but is about as fast as GPU=0 and has the same color issue.

HolyWu
11th December 2018, 15:39
Vapoursynth Script:
import vapoursynth as vs
core = vs.get_core()
vid = core.ffms2.Source(source=r'D:\path\to\source.vob')
#vid = core.std.Trim(vid, 0, 238)
#vid = core.nnedi3.nnedi3(vid, field=1, dh=False, nsize=6, nns=1, qual=1, etype=0, pscrn=2, opt=True, int16_prescreener=True, int16_predictor=True, exp=2, show_mask=False, combed_only=False)
vid = core.fmtc.bitdepth(vid, bits=32)
vid = core.w2xc.Waifu2x(vid, noise=0, scale=2, block=512, photo=False, gpu=1, processor=-1, list_proc=False, log=False)
vid = core.fmtc.bitdepth(vid, bits=16)
vid = core.std.AssumeFPS(vid, fpsnum=30000, fpsden=1001)
vid.set_output()

My hardware is i7-7700k (no overclock for testing), GTX 1080 Ti, 32GB Ram @2133MHz (again, no OC for testing).

Since you have quite a decent NVIDIA graphics card, I would recommend using Waifu2x-caffe instead of Waifu2x-w2xc (you are using Waifu2x-w2xc but posting in the Waifu2x-caffe thread). Waifu2x-caffe has many more models than Waifu2x-w2xc and is probably also faster. Don't forget to convert the format to the RGB color family. And you should use the newer znedi3 instead of nnedi3 as well. But is your source interlaced rather than telecined?


Update r12 by the way.

Update waifu2x-caffe library to 1.2.0.1.
Now it's an error when block size of CUnet isn't divisible by 4.

2gig
12th December 2018, 01:28
Since you have quite a decent NVIDIA graphics card, I would recommend using Waifu2x-caffe instead of Waifu2x-w2xc (you are using Waifu2x-w2xc but posting in the Waifu2x-caffe thread). Waifu2x-caffe has many more models than Waifu2x-w2xc and is probably also faster. Don't forget to convert the format to the RGB color family. And you should use the newer znedi3 instead of nnedi3 as well. But is your source interlaced rather than telecined?

So, I didn't realize that w2xc and caffe were two different things; I thought I was using caffe... I switched it to caffe now, and it's great! Much faster and looks perfect. I also did convert to RGB now, which upscales a lot faster than yuv420p32.

Edit: Actually I was wrong. The color-changing issue still exists. The changes in color appear identical to those of the last example I linked, even though that one was made with w2xc.

Edit2: Changing block_w=512, block_h=512 to block_w=720, block_h=480 (source dimensions) and cropping after w2x improved speed considerably. Colors are still off.

Edit3: I found the source of my confusion regarding the color-change. The color change does not appear in VSEdit, only VSPipe. So I guess that probably means it's not anything to do with this plugin.

I also didn't know that telecine was a thing. I've only worked on Blu-Ray/UHD stuff until recently. I looked it up and I am fairly certain that my source is indeed telecine. Quite confusing when mediainfo tells me interlaced. So I am now using vivtc instead of a deinterlacer. I will also look up and add to my plugins znedi3 for when I need to work on interlaced content in the future.

Thank you. You have been extremely helpful.

(Edited) Current VPY for reference:

import vapoursynth as vs
core = vs.get_core()
vid = core.ffms2.Source(source=r'D:\path\to\source.vob')
vid = core.vivtc.VFM(vid, order=1, mode=0)
vid = core.vivtc.VDecimate(vid)
vid = core.fmtc.resample (clip=vid, css="444")
vid = core.fmtc.matrix (clip=vid, mat="601", col_fam=vs.RGB)
vid = core.fmtc.bitdepth (clip=vid, bits=32)
vid = core.caffe.Waifu2x(vid, noise=-1, scale=4, block_w=720, block_h=480, model=6, cudnn=True, tta=False)
vid = core.fmtc.matrix (clip=vid, mat="601", col_fam=vs.YUV, bits=32)
vid = core.fmtc.resample (clip=vid, css="420")
vid = core.fmtc.bitdepth (clip=vid, bits=8)
vid = core.std.Crop(vid, 24, 24, 24, 8)
vid = core.resize.Lanczos(vid, 1440, 1080)
vid = core.std.AssumeFPS(vid, fpsnum=24000, fpsden=1001)
vid.set_output()

HolyWu
30th January 2019, 05:23
Update r12.


Update waifu2x-caffe library to 1.2.0.1.
Now it's an error when block size of CUnet isn't divisible by 4.


Update r13.


Update waifu2x-caffe library to 1.2.0.2, which fixed incorrect models being used for CUnet.

thedangle
8th April 2019, 12:51
Not sure if this is a VS or waifu2x caffe thing, but using scripts with caffe seems to cause large amounts of virtual memory to be committed, even with core.max_cache_size set to 5124. Physical memory use seems to adhere to the memory limit, though, even if there's about 6gb of extra "available" physical memory and 3gb free VRAM.

I'd just ignore it since memory is meant to be used anyway but when it hits my commit cap virtualdub64 crashes. Video output before the crash is fine and cunet denoiser seems much more accurate than upconv. Example script (source res is 720x480):


import vapoursynth as vs
import havsfunc as haf
import adjust
from vapoursynth import core
core.num_threads = 8
core.max_cache_size = 5124

clp = core.lsmas.LWLibavSource(r'test.avi',threads=8)
clp = adjust.Tweak(clp,sat=1.05)
clp = core.fft3dfilter.FFT3DFilter(clp, sigma = 2.2, planes=[1,2])
clp = core.pp7.DeblockPP7(clp, qp=1.8, mode=2)
clp = core.fmtc.matrix (clp, mat="601",col_fam=vs.RGB)
clp = core.fmtc.bitdepth (clp,bits=32,dmode=0)
clp = core.caffe.Waifu2x(clp, noise=2, model=6, scale=2, block_w=320, block_h=240, cudnn=True, tta=False, batch=3) # crashes at 720 wblock
clp = core.fmtc.bitdepth(clp, bits=16,dmode=0)
clp = core.knlm.KNLMeansCL(clp, d=0, a=12, s=0, h=.23, wmode=0)
clp = core.f3kdb.Deband(clp,random_algo_ref=2,random_algo_grain=2,blur_first=True,dynamic_grain=True,sample_mode=1,range=16,dither_algo=1,y=64,cb=80,cr=80,grainy=0,grainc=0,output_depth=16)
clp.set_output()

HolyWu
8th April 2019, 16:02
Not sure if this is a VS or waifu2x caffe thing, but using scripts with caffe seems to cause large amounts of virtual memory to be committed, even with core.max_cache_size set to 5124. Physical memory use seems to adhere to the memory limit, though, even if there's about 6gb of extra "available" physical memory and 3gb free VRAM.

I'm not sure what the culprit is, but I'd guess batch=3. Have you tried batch=1?

thedangle
8th April 2019, 22:52
Batch 1 does drop RAM use by about 3 GB. Looking at it again in perfmon, it seems Windows counts local memory and VRAM use together in the commit charge, so when a script uses 5 GB and waifu uses 6 GB of VRAM, the commit charge rises by 11 GB. It seems to consider VRAM use part of the total, but it crashes at the regular motherboard RAM + virtual memory limit.

ChaosKing
11th October 2019, 17:39
There is a much faster Waifu2x "anime" version available on https://github.com/aka-katto/dandere2x
It is faster because it processes only the changed parts of the next frame https://github.com/aka-katto/dandere2x/wiki/How-Dandere2x-Works

Maybe this could also be ported to VS? :D


p.s. There is also a vulkan version of Waifu2x (for AMD users) https://github.com/nihui/waifu2x-ncnn-vulkan

HolyWu
12th October 2019, 12:50
p.s. There is also a vulkan version of Waifu2x (for AMD users) https://github.com/nihui/waifu2x-ncnn-vulkan

I do plan to port ncnn-vulkan version but that won't happen soon due to my other plans and limited spare time.

ChaosKing
12th October 2019, 12:53
I do plan to port ncnn-vulkan version but that won't happen soon due to my other plans and limited spare time.

That's great news :thanks:

ReinerSchweinlin
4th December 2019, 08:32
There is a much faster Waifu2x "anime" version available on https://github.com/aka-katto/dandere2x
It is faster because it processes only the changed parts of the next frame https://github.com/aka-katto/dandere2x/wiki/How-Dandere2x-Works

I tried this on my Kaby Lake i5/GTX 960 with SD anime files. In my configuration, Dandere2x isn't faster than simply processing each frame completely in waifu2x-caffe via Hybrid or video2x.

AlphaAtlas
12th December 2019, 23:02
I tried this on my Kaby Lake i5/GTX 960 with SD anime files. In my configuration, Dandere2x isn't faster than simply processing each frame completely in waifu2x-caffe via Hybrid or video2x.

If you're interested in something faster than Waifu2X, there's this: https://github.com/Sg4Dylan/vapoursynth-fsrcnn-ncnn-vulkan

Some of the older 2x mxnet models are pretty quick in Kice's MXNet plugin as well.


Unfortunately, neither is particularly easy to set up :/

ChaosKing
13th December 2019, 00:04
This one is easy https://github.com/Nlzy/vapoursynth-waifu2x-ncnn-vulkan/
Also available via vsrepo

luigizaninoni
15th December 2019, 19:04
This one is easy https://github.com/Nlzy/vapoursynth-waifu2x-ncnn-vulkan/
Also available via vsrepo

I'm trying this plugin with Staxrip 2.0.2.4 but gives this error:
Python exception: Core only supports API r3.5 but the loaded plugin requires API r3.6

How do I fix this? I have Python 3.7.4 and VapourSynth 45 installed.

ChaosKing
15th December 2019, 19:44
Update vapoursynth.

luigizaninoni
15th December 2019, 20:11
Update vapoursynth.

thanks, vapoursynth 47 fixed it

ReinerSchweinlin
15th December 2019, 23:50
If you're interested in something faster than Waifu2X, there's this: https://github.com/Sg4Dylan/vapoursynth-fsrcnn-ncnn-vulkan

Some of the older 2x mxnet models are pretty quick in Kice's MXNet plugin as well.


Unfortunately, neither is particularly easy to set up :/
Thank you. On my journey with AI upscaling, I ran across
https://github.com/ptrsuder/IEU.Winforms
which uses ESRGAN. From my limited knowledge, I figured ESRGAN is capable of better photo upscaling than waifu2x, so I installed ESRGAN. None of my test files look "good" right now (Topaz Gigapixel works much better at this point).

So much to learn. I will take a look at your links.

brucethemoose
30th December 2019, 19:21
Thank you. On my journey with AI upscaling, I ran across
https://github.com/ptrsuder/IEU.Winforms
which uses ESRGAN. From my limited knowledge, I figured ESRGAN is capable of better photo upscaling than waifu2x, so I installed ESRGAN. None of my test files look "good" right now (Topaz Gigapixel works much better at this point).

So much to learn. I will take a look at your links.

Sorry for the late reply. ESRGAN works best with models trained specifically on the style of content you're trying to process, see this for some examples:

https://upscale.wiki/wiki/Model_Database

Otherwise Gigapixel usually ends up looking better. There are also some "video" networks out there, but they're tricky to set up, even trickier to train, and tend to benefit live action the most.

dadix
29th January 2020, 16:26
Can you make TensorFlow-GPU work with this? I think object detection would be very good to use with this.

AOmundson
13th March 2021, 04:20
I seem to be getting an error while attempting to run the latest version
import vapoursynth as vs
core = vs.get_core()

clip = r'D:\Video\title_t02.mkv' #replace with your video file
clip = core.ffms2.Source(clip)

#this is for converting the video to 32 bit float which the plugin needs
clip = core.fmtc.resample (clip, css="444")
clip = core.fmtc.matrix (clip, mat="601", col_fam=vs.RGB)
clip = core.fmtc.bitdepth(clip, bits=32)

#see plugin github page for what the arguments (settings) do
clip = core.caffe.Waifu2x(clip, noise=0, scale=2, block_w=128, block_h=block_w, model=6, cudnn=True, processor=0, tta=False, batch=1)

#convert back to 8bit so we get the correct output for encoding
clip = core.fmtc.matrix (clip, mat="601", col_fam=vs.YUV, bits=32)
clip = core.fmtc.resample (clip, css="420")
clip = core.fmtc.bitdepth(clip, bits=8)

#Press F5 to preview

#Press F8 to encode
#Set header to Y4M, ffmpeg as the executeable and the string itself should look like this
#-i pipe: -c:v libx264 -crf 18 -y test.mp4

clip.set_output()

Error Message:

2021-03-12 21:12:05.709
Failed to evaluate the script:
Python exception: name 'block_w' is not defined

Traceback (most recent call last):
File "src\cython\vapoursynth.pyx", line 2244, in vapoursynth.vpy_evaluateScript
File "src\cython\vapoursynth.pyx", line 2245, in vapoursynth.vpy_evaluateScript
File "E:\VapourSynthEditor\Untitled.vpy", line 13, in
clip = core.caffe.Waifu2x(clip, noise=0, scale=2, block_w=128, block_h=block_w, model=6, cudnn=True, processor=0, tta=False, batch=1)
NameError: name 'block_w' is not defined

poisondeathray
13th March 2021, 04:38
block_w cannot be used as a variable there because no variable with that name exists in your script. Either leave block_h out (then it will be equal to block_w) or put in a number.



clip = core.caffe.Waifu2x(clip, noise=0, scale=2, block_w=128, block_h=128, model=6, cudnn=True, processor=0, tta=False, batch=1)
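
For anyone hitting the same NameError: the value on the right-hand side of a keyword argument is an ordinary Python expression, so block_h=block_w looks for a script variable named block_w and does not refer to the other argument. The sketch below shows this with a stand-in function (fake_waifu2x is hypothetical and only echoes its arguments; it is not the plugin) and one way to reuse a single value for both arguments.

```python
def fake_waifu2x(**kwargs):
    """Hypothetical stand-in for core.caffe.Waifu2x; it just returns its arguments."""
    return kwargs


# Reproduces the error: block_w on the right-hand side is an undefined variable.
try:
    fake_waifu2x(block_w=128, block_h=block_w)
except NameError as exc:
    message = str(exc)
    print(message)  # name 'block_w' is not defined

# Define the value once in the script, then pass it to both arguments.
block = 128
settings = fake_waifu2x(block_w=block, block_h=block)
print(settings)  # {'block_w': 128, 'block_h': 128}
```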

AOmundson
13th March 2021, 05:25
Thanks for that. Now my only remaining questions are
1) How to (if it's at all possible) stop it from stretching the video horizontally. The answer seems to involve the way it treats the video as 720x480 when it's actually 640x480 with excess vertical lines.
2) Set it to deinterlace the video while it's rendering.

poisondeathray
14th March 2021, 00:37
1) How to (if it's at all possible) stop it from stretching the video horizontally. The answer seems to involve the way it treats the video as 720x480 when it's actually 640x480 with excess vertical lines.



A 4:3 NTSC DVD will be 720x480, but it's resampled on playback to 640x480. The actual encoded pixels are 720x480.

You can resize it, and/or crop as you see fit. It would make more sense to do that before upscaling, as that would be fewer pixels to upscale, and faster
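
A small sketch of the numbers involved, assuming the simple 4:3 mapping described above (the helper name is made up for illustration): it derives the square-pixel width for a 480-line frame and the share of pixels saved by resizing before the upscale.

```python
from fractions import Fraction


def square_pixel_width(height, display_aspect):
    """Width that shows `height` lines at the given aspect ratio with square pixels."""
    return round(height * display_aspect)


stored_w, stored_h = 720, 480
display_w = square_pixel_width(stored_h, Fraction(4, 3))

# Fraction of pixels the upscaler no longer has to process after resizing first.
saving = 1 - Fraction(display_w * stored_h, stored_w * stored_h)
print(display_w, saving)  # 640 1/9
```

So resizing 720x480 to 640x480 first leaves about 11% fewer pixels for the upscaler to process.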


2) Set it to deinterlace the video while it's rendering.

If it's a "cartoon" or animated material, generally you would IVTC, not deinterlace. Either would be performed before upscaling in the script.

If it's live action material, generally waifu2x would not be an ideal choice.