Frame pointer and line pitch for Avisynth frames: Guaranteed mod 16?


Mini-Me
12th December 2011, 15:14
Does Avisynth guarantee that the frame pointer and pitch for each line will both be mod 16? It would greatly simplify SSE code if I know I can rely on that, but I want to make sure.

Gavino
12th December 2011, 17:25
I believe the pitch will always be mod16, but the frame pointer may not be (eg if it is the result of a Crop).

Mini-Me
12th December 2011, 17:36
Gavino wrote: "I believe the pitch will always be mod16, but the frame pointer may not be (eg if it is the result of a Crop)."

If that's the case, is it better to use unaligned loads or realign the data first? If realigning data is the right answer, how should I go about that? If the data pointer is e.g. mod 4, is that where the allocated memory starts? Or does it actually start at the previous mod 16 boundary?

I'm sorely tempted to just throw an Avisynth exception for "unaligned" frames that says, "Hey, don't use my filter after an unaligned crop!" :p It would definitely cut down on extraneous codepaths. Do you think that would be acceptable? As long as I'm not violating any unspoken rules, it's definitely easiest on me.

IanB
12th December 2011, 21:43
The default 2.6 behaviour is to align all plane start addresses and pitches to mod 16. Prior to 2.5.7 chroma pitch was only mod 8 as it was always 0.5 of luma pitch. In 2.5.8 luma pitch was made mod 32 to fix this. From 2.6 all planes have independent mod 16 pitch, i.e. assuming chroma pitch is 0.5 of luma pitch is no longer acceptable; you must do a GetPitch call for the chroma planes. There is legacy support for YV12, but new code must not assume this.

Plugin authors can request arbitrary pitch on new video frames, but not arbitrary plane start addresses. SubFrame allows arbitrary mashing of the frame geometry, including start addresses and pitch. Of course it is better if they do not, and so far I don't know of any authors who do so, nor can I think of applications that might need this.

Crop can advance the plane start addresses to a non-aligned value, but the original input frame would always have had mod 16 plane start addresses.

So you must always at least test for suitable alignment and handle the unaligned cases. Throwing an exception does count as "handling" the case, however I do not think your users will really appreciate this option. Crop does have the Align=True option to force a frame blit if the start addresses are not mod 16; if they already are, there is no cost. So you could require this option from your users when they crop directly before your plugin.
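
The alignment test described above could look roughly like the following. This is only a minimal sketch: `is_aligned` is a hypothetical helper name, not part of the Avisynth API, and it assumes the alignment is a power of two (16 for SSE).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an Avisynth function).
// Returns true when both the plane start address and the pitch are
// multiples of `alignment`, which must be a power of two.
inline bool is_aligned(const void* ptr, int pitch, std::size_t alignment = 16) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uintptr_t mask = alignment - 1;
    // For a power-of-two alignment, "multiple of alignment" is the same as
    // "low bits are all zero", which also holds for a negative pitch.
    return (reinterpret_cast<std::uintptr_t>(ptr) & mask) == 0
        && (static_cast<std::uintptr_t>(pitch) & mask) == 0;
}
```

In a filter, the arguments would presumably come from `GetReadPtr` and `GetPitch` for each plane, with the result selecting the aligned path, the unaligned path, or the exception.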


For algorithms that read the input data once, it is very difficult to amortise the cost of the extra frame blit in the total processing time. One unaligned read in your code is always going to be faster than (one unaligned read plus one aligned write in the blit) plus one aligned read in your code. If your code reads the input multiple times then the performance can tip back in favour of the blit, i.e. (1ur + 1aw + Nar) < Nur for some moderate value of N, where ur is the cost of an unaligned read, aw of an aligned write, ar of an aligned read, and N is the number of times your code reads the input.
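
To make the inequality concrete, here is a small sketch that solves it for N. The function name and the idea of expressing the three costs as relative per-pass numbers are assumptions for illustration; real values would have to be measured on the target CPU.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical cost model for the inequality ur + aw + N*ar < N*ur.
//   ur: cost of one unaligned read pass over the frame
//   aw: cost of one aligned write pass (the blit's output)
//   ar: cost of one aligned read pass
// Returns the smallest N for which the blit path is strictly cheaper,
// or 0 if the blit can never pay off (unaligned reads are not slower).
inline int break_even_passes(double ur, double aw, double ar) {
    assert(ur >= 0.0 && aw >= 0.0 && ar >= 0.0);
    if (ur <= ar) return 0;
    // Rearranged: N > (ur + aw) / (ur - ar)
    const double threshold = (ur + aw) / (ur - ar);
    return static_cast<int>(std::floor(threshold)) + 1;
}
```

With made-up costs ur = 1.5, aw = 1.0, ar = 1.0 the threshold is 5, so the blit only wins from the sixth read of the input onwards. The smaller the penalty for unaligned reads, the larger N has to be.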

An often overlooked option to regain alignment is to partially undo the input crop. The original frame would always have been aligned, so stepping the start address back to the previous aligned location is quite legal. Of course the pixel data in this pre-pad region must be considered random uninitialised trash, and you probably still need to write out aligned data.
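
A minimal sketch of that back-stepping, assuming (as stated above) that the frame originally started on a mod 16 address and was only advanced by Crop. `AlignedView` and `step_back_to_alignment` are hypothetical names, not Avisynth functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical result type: an aligned base pointer plus the number of
// leading bytes that are pre-pad (undefined data, not real pixels).
struct AlignedView {
    const std::uint8_t* base;
    std::size_t skip;
};

// Steps `ptr` back to the previous multiple of `alignment` (a power of two).
// Only valid if the memory before `ptr` belongs to the same allocation,
// which is the assumption made for frames that were merely cropped.
inline AlignedView step_back_to_alignment(const std::uint8_t* ptr,
                                          std::size_t alignment = 16) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(ptr);
    const std::size_t skip = static_cast<std::size_t>(addr & (alignment - 1));
    return AlignedView{ ptr - skip, skip };
}
```

The row loop would then process `skip + row_width` bytes starting at `base` with aligned loads and either ignore or mask off the results for the first `skip` bytes.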


At the end of the day it is probably better to have your plugin require aligned input than to not have it at all. Of course it is even better if you handle unaligned input faster than the added time caused by the alignment blit, but this is not always possible.

Mini-Me
19th December 2011, 02:02
Thank you for all the details, Ian! My plugin deals with RGB, YUY2, and planar input of 8-bit and 16-bit varieties, which requires 6 codepaths for the input and output sections already. Factor in SSE and fallback code and I'm up to 12, but I may just code an extra 6 for unaligned SSE loads as well, since it'd be a near copy-paste of the aligned SSE path. It's messy, and it doesn't give me the same kind of warm fuzziness as throwing a snarky exception, but hey. :p
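
One possible way to avoid the copy-paste, sketched here under assumptions: write each SSE kernel once as a template and pass the load flavour as a policy type. The kernel itself (a saturating add of a constant to each byte) and all names are made up for illustration and are not taken from the plugin; only the SSE2 intrinsics are real.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Load policies: the only difference between the two code paths.
struct AlignedLoad {
    static __m128i load(const std::uint8_t* p) {
        return _mm_load_si128(reinterpret_cast<const __m128i*>(p));
    }
};
struct UnalignedLoad {
    static __m128i load(const std::uint8_t* p) {
        return _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
    }
};

// Hypothetical kernel: dst[x] = saturate(src[x] + offset).
// Assumes width is a multiple of 16 and dst is 16-byte aligned.
template <typename Load>
void add_offset_row(const std::uint8_t* src, std::uint8_t* dst,
                    std::size_t width, std::uint8_t offset) {
    const __m128i k = _mm_set1_epi8(static_cast<char>(offset));
    for (std::size_t x = 0; x < width; x += 16) {
        const __m128i v = Load::load(src + x);
        _mm_store_si128(reinterpret_cast<__m128i*>(dst + x), _mm_adds_epu8(v, k));
    }
}

// Picks the instantiation once per row based on the source address.
inline void add_offset_row_dispatch(const std::uint8_t* src, std::uint8_t* dst,
                                    std::size_t width, std::uint8_t offset) {
    assert((reinterpret_cast<std::uintptr_t>(dst) & 15) == 0);
    if ((reinterpret_cast<std::uintptr_t>(src) & 15) == 0)
        add_offset_row<AlignedLoad>(src, dst, width, offset);
    else
        add_offset_row<UnalignedLoad>(src, dst, width, offset);
}
```

The compiler generates both variants from the same source, so the six input formats would need six templated kernels rather than twelve hand-written SSE ones.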