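In nnedi3_rpow2, you should use only std.Transpose() instead of your TurnLeft/TurnRight functions. FlipHorizontal and FlipVertical are unnecessary in this case, because the clip is transposed back after interpolation and a second transpose undoes the first.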
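To illustrate why the flips are not needed, here is a rough sketch in plain Python (no VapourSynth required). Nested lists stand in for a frame, and `double_height` is a hypothetical placeholder for the nnedi3 height-doubling step: it just repeats rows. The point is only the orientation handling, not the interpolation.

```python
def transpose(img):
    """Swap rows and columns, like std.Transpose() does for a frame."""
    return [list(col) for col in zip(*img)]

def double_height(img):
    """Hypothetical stand-in for the nnedi3 doubling step: repeats each row."""
    out = []
    for row in img:
        out.append(list(row))
        out.append(list(row))
    return out

def rpow2_once(img):
    """One 2x step: double the height, transpose, double again, transpose back."""
    tall = double_height(img)
    wide = double_height(transpose(tall))
    return transpose(wide)

src = [[1, 2, 3],
       [4, 5, 6]]
up = rpow2_once(src)

# A turn is a transpose plus a flip; since the clip is turned back afterwards,
# the flip on the way in and the flip on the way out only cancel each other.
turn_right = [row[::-1] for row in transpose(src)]  # Transpose + FlipHorizontal
turn_left = transpose(src)[::-1]                    # Transpose + FlipVertical
```

Here `up` ends up with twice the rows and twice the columns of `src`, in the original orientation, using nothing but transposes around the doubling step.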
Code:
yexpr = 'x 128 - y 128 - * 0 < 128 x 128 - abs y 128 - abs < x y ? ?'
I suspect that Lut2 is faster than Expr for an expression this complicated.
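For reference, this is what the expression above computes, written as an ordinary Python function, plus a small RPN evaluator to check that the two agree for every 8-bit pair. This is only a sketch: `limit` and `eval_rpn` are names I made up, and the evaluator handles just the operators used in this expression. With 8-bit clips, a function like `limit` could presumably be handed to std.Lut2 through its `function` argument, so the 256x256 table is built once instead of evaluating the expression per pixel.

```python
yexpr = 'x 128 - y 128 - * 0 < 128 x 128 - abs y 128 - abs < x y ? ?'

def limit(x, y):
    """Infix form of yexpr for 8-bit values centred on 128."""
    dx, dy = x - 128, y - 128
    if dx * dy < 0:          # differences have opposite signs: return neutral grey
        return 128
    return x if abs(dx) < abs(dy) else y  # otherwise keep the one closer to 128

def eval_rpn(expr, x, y):
    """Minimal RPN evaluator covering only the tokens used in yexpr."""
    stack = []
    for tok in expr.split():
        if tok == 'x':
            stack.append(x)
        elif tok == 'y':
            stack.append(y)
        elif tok == 'abs':
            stack.append(abs(stack.pop()))
        elif tok in ('-', '*', '<'):
            b, a = stack.pop(), stack.pop()
            if tok == '-':
                stack.append(a - b)
            elif tok == '*':
                stack.append(a * b)
            else:
                stack.append(1 if a < b else 0)
        elif tok == '?':
            c, b, a = stack.pop(), stack.pop(), stack.pop()
            stack.append(b if a > 0 else c)
        else:
            stack.append(int(tok))
    return stack.pop()

# Precomputed table, indexed as table[y][x]; this layout is my own choice.
table = [[limit(x, y) for x in range(256)] for y in range(256)]

same = all(eval_rpn(yexpr, x, y) == table[y][x]
           for y in range(256) for x in range(256))
```

If `same` is true, the lookup table and the RPN string are interchangeable for 8-bit input.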
Also, since the warpsharp package is not compatible with VS, you can't use UnsharpMask().