The memory access pattern on the vertical shear is nasty (it walks down columns, so in a row-major image consecutive accesses are a full row apart), but on something as simple as this, who cares. Michael Unser has a paper on how accurate this approach is compared to direct 2D interpolation (probably more than good enough most of the time):
http://bigwww.epfl.ch/publications/unser9502.pdf
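
To make the access-pattern point concrete, here is a rough sketch of a vertical shear on a row-major image stored as a list of rows. The name `vertical_shear` and the parameters `slope` and `fill` are made up for illustration, and plain linear interpolation is swapped in for brevity; it is not claimed to be the interpolator the paper uses.

```python
import math

def vertical_shear(img, slope, fill=0.0):
    """Shift column x down by slope * x using 1D linear interpolation.

    Assumes img is a non-empty list of equal-length rows (row-major).
    """
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for x in range(w):                  # one column at a time
        shift = slope * x
        for y in range(h):
            src = y - shift             # source row for this output pixel
            y0 = math.floor(src)
            t = src - y0                # fractional part, 0 <= t < 1
            # Both taps are in column x but in neighbouring rows, so in a
            # flat row-major buffer they sit a whole row stride apart.
            a = img[y0][x] if 0 <= y0 < h else fill
            b = img[y0 + 1][x] if 0 <= y0 + 1 < h else fill
            out[y][x] = (1 - t) * a + t * b
    return out
```

The inner loop steps through rows while staying in one column, which is the cache-unfriendly part. A horizontal shear has the same structure with the roles of x and y swapped, and there the taps are adjacent in memory.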