You'll be mixing chroma from one picture with the luma of a different picture in some cases, won't you?
It seems to me that it doesn't matter which chroma sampling scheme was used. Putting matching fields back together fixes up the frame either way.
Interestingly, examining the MPEG syntax of DVD VOB rips shows that they are almost invariably encoded with chroma_420_type set to 0, which means interlaced sampling. This is true even for 3:2 material. I don't have any PAL DVDs to look at. It makes sense, because DVD players did interlaced upsampling (before the more intelligent upsampling approaches based on the flags came along).
I agree that YV12 can get very confusing. Just think about what happens when you call SeparateFields() on a progressively sampled YV12 clip: each field ends up with chroma rows that no longer line up with its luma rows, so the chroma assignments get mangled. But if you then Weave() the same fields back together, things are OK again. I think that may be the key to your dilemma.
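To make that concrete, here is a rough Python sketch of the row bookkeeping. It is not AviSynth code. It assumes that SeparateFields() simply takes alternate rows of every plane, chroma included, and that Weave() interleaves them back. The helper names and the 8-row frame are made up for illustration.

```python
def separate_fields(plane):
    """Split a plane (a list of rows) into top and bottom fields by row parity."""
    return plane[0::2], plane[1::2]


def weave(top, bottom):
    """Interleave two fields back into one plane, top field row first."""
    woven = []
    for t, b in zip(top, bottom):
        woven.extend([t, b])
    return woven


def progressive_chroma_for(luma_row):
    """Progressive 4:2:0: chroma row c is shared by luma rows 2c and 2c+1."""
    return luma_row // 2


HEIGHT = 8                           # hypothetical tiny frame
luma = list(range(HEIGHT))           # each row labelled with its frame row index
chroma = list(range(HEIGHT // 2))    # half as many chroma rows in 4:2:0

top_luma, bottom_luma = separate_fields(luma)
top_chroma, bottom_chroma = separate_fields(chroma)

# Treat the top field as its own YV12 picture: field luma row k is paired
# with field chroma row k // 2. Collect the original luma rows that end up
# with a chroma row other than the one they were sampled with.
mismatched = [
    src_luma
    for k, src_luma in enumerate(top_luma)
    if top_chroma[k // 2] != progressive_chroma_for(src_luma)
]

print("top field luma rows:   ", top_luma)
print("top field chroma rows: ", top_chroma)
print("luma rows with the wrong chroma:", mismatched)

# Weaving the untouched fields restores the original pairing exactly.
print("luma restored:  ", weave(top_luma, bottom_luma) == luma)
print("chroma restored:", weave(top_chroma, bottom_chroma) == chroma)
```

Under those assumptions, every other luma row in the separated field is paired with its neighbour's chroma, yet weaving the same fields back gives the original frame, whichever sampling scheme was used to create it.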
OK, where did I go wrong?