STPresso seems worth a look. But in general, yes, downsizing + temporal filtering is the way to go. Downsize especially if this is footage from a typical camera you see these days, one with lots of advertised "megapixels" but with optical and noise problems that make that high resolution useless. Downsizing in that case loses very little information. (EDIT: in fact, it also filters out some sensor noise.)
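To illustrate the noise point, here is a rough toy sketch (not how any particular encoder or scaler does it). It assumes a flat gray frame with independent Gaussian noise on every pixel and a plain 2x2 box average for the downsize; the function names are made up for the example. Real sensor noise is usually correlated between neighbouring pixels, so the real-world gain is smaller than what this shows.

```python
import random
import statistics


def box_downsample_2x(img):
    """Halve width and height by averaging each 2x2 block of pixels."""
    h, w = len(img), len(img[0])
    return [
        [
            (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
            for x in range(0, w - 1, 2)
        ]
        for y in range(0, h - 1, 2)
    ]


def noise_std(img):
    """Standard deviation of all pixel values (the frame is flat, so this is pure noise)."""
    return statistics.pstdev([v for row in img for v in row])


# Assumption: flat gray frame (value 128) plus independent Gaussian noise, sigma = 10.
rng = random.Random(0)
noisy = [[128 + rng.gauss(0, 10) for _ in range(128)] for _ in range(128)]
small = box_downsample_2x(noisy)

print("noise before downsizing:", round(noise_std(noisy), 2))
print("noise after downsizing: ", round(noise_std(small), 2))
```

Averaging four independent samples cuts the noise standard deviation roughly in half, which is why the downsized frame comes out cleaner and gives the encoder less noise to spend bits on.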
And yeah, HandBrake has a reputation for being very bit-efficient for some reason.