16th August 2009, 21:51, post #10
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 4,770
Quote:
Originally Posted by CruNcher
Benwaggoner, is there any paper on how Microsoft's adaptive-deadzone HVS algorithm works internally, and what it adapts to (quantization, motion, or both)?
Not that I know of, but I don't really track the academic/patent side of these things. Generally any kind of psychovisual optimization publication from Microsoft will have Tom Holcomb as a co-author.

Quote:
I saw a lot of interesting possibilities with x264 in terms of perception: smoothing out quantization for sharper results (without the risk of amplifying blocking) or blurrier ones (to save bits) in different motion situations and sources. Trellis is already something like this, though I find it visually destructive, and I had better results, both in speed and visually, doing something trellis-like outside of x264 using MVTools in combination with the GPU. Using it as artificial encoder sharpening is also interesting; with sharp MPEG-2 sources I get very interesting visual results trying to smooth out the blurring effect of quantization at lower bitrates for interesting ROI scenes, though it is not easy to keep edges stable in heavy motion, especially when AQ and deadzones come together.
Yeah, blocking/dithering and the perceptual non-uniformity of gamma are an interesting and probably under-researched area of codecs. We've certainly done a fair amount around dark-luma DQuant, which has proved to be a really big deal when targeting consumer LCD displays. And when a display shows adjoining blocks of Y'=16 and Y'=17 as visibly different, in-loop deblocking or the overlap transform aren't going to help, since the output of the decoder is still 8-bit 4:2:0.
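To make the dark-luma point concrete, here's a back-of-envelope sketch (my illustration, not any actual encoder code) using a simplified pure-gamma-2.4 display model on video-range 8-bit Y'. It shows why a one-code step near black is a far larger relative jump in displayed light than the same step near white:

```python
def to_linear(y_code, gamma=2.4):
    """Map an 8-bit video-range Y' code (16..235) to relative linear light [0, 1].
    Simplified pure-gamma display model, assumed for illustration."""
    norm = max(0.0, (y_code - 16) / (235 - 16))  # normalize video range
    return norm ** gamma

def step_ratio(y_code):
    """Relative luminance jump going from y_code to y_code + 1."""
    lo, hi = to_linear(y_code), to_linear(y_code + 1)
    return (hi - lo) / hi if hi else float("inf")

dark = step_ratio(17)     # near black: the step is most of the final level
bright = step_ratio(200)  # near white: the step is a tiny fraction
print(f"17->18: {dark:.1%} of final level; 200->201: {bright:.3%}")
```

Under this model the 17-to-18 step is on the order of 80% of the final light level, while a step near white is around 1%, which is why adjoining dark blocks one code value apart can be so visible on a bright LCD.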

Much of the value of PEP and VC-1 for HD optical came from the outboard dithering we had to address this issue, but I think that an encoder that integrated the 10-bit to 8-bit conversion could do even better psychovisual tuning. In the end, dithering's a pretty lossy process, essentially randomizing the LSB. If the codec were able to account for the source frequencies (what was really a band and what was a gradient or noisy region in the source), it'd be able to figure out much less costly ways of preventing blocking and banding.
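A minimal sketch of the "randomizing the LSB" idea, assuming simple uniform noise dithering during 10-bit to 8-bit truncation (this is my illustration, not the PEP or Fathom implementation):

```python
import random

def dither_10_to_8(samples_10bit, seed=0):
    """Convert 10-bit samples to 8-bit by adding uniform noise before the
    2-bit truncation, trading banding for fine grain. Illustrative sketch."""
    rng = random.Random(seed)
    out = []
    for s in samples_10bit:
        # Noise spanning one 8-bit quantization step (4 ten-bit codes)
        noisy = s + rng.uniform(-2.0, 2.0)
        out.append(min(255, max(0, round(noisy / 4))))
    return out

# A shallow 10-bit ramp that bands badly if simply truncated:
ramp = [400 + i // 8 for i in range(64)]  # 10-bit codes 400..407
truncated = [s // 4 for s in ramp]        # collapses into two flat bands
dithered = dither_10_to_8(ramp)           # noise breaks up the band edges
```

The truncated version turns the smooth ramp into two flat regions with a visible edge; the dithered version spends those lost two bits as spatial noise instead, which is exactly the "pretty lossy" trade-off described above.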

And even that dithering's pretty rare. The only reasonably mainstream professional compression tool that includes good dithering is Inlet's Fathom.

While 10-bit isn't an option for the recompression workflow, 10-bit 4:2:2 sources are very common in professional content. And if we could really take advantage of 10-bit sources for content delivered as compressed, a lot more could be authored accordingly. All the major NLEs can do 10-bit 4:2:2 or beyond, and we've got nice visually lossless authoring codecs like ProRes, DNxHD, and Cineform that make it much less expensive than it was a few years ago, when v210 and expensive RAIDs were the only real option.

Quote:
By the way, x264 can by now, with very low effort, compress your "What happens in Vegas" DLNA EE sample down by half at 1.5x realtime (1-pass) on two cores (K8), keeping the grain structure (and most of the VC-1 compression artifacts) visually intact in most scenes, without using psy-RD or any RD at all; especially the background grain where the two women stand in front of the door. The biggest remaining visual problem is the fades, but hopefully that will soon be history as well.
I'm about to compare it with the Island trailer that you did back then with the Rhozet Carbon encoder (MPEG-2, blocky high-motion scenes).
I have permission to provide the Island source, if you like. Unfortunately it came from an 8-bit DPX sequence.

Quote:
Maybe you could create a new sample with EEV3 using higher compression efficiency, maybe unrestricted, at 2 Mbps 720p.
Fine idea! I've got a big backlog of blog tutorials and samples from when I was working on the book, but it's all checked in now, thank goodness. The new dynamic motion vector cost makes a big improvement on film grain encoding, pretty much eliminating false-positive motion vectors due to noise. This both looks a lot better and saves bits that can be used elsewhere.
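The general idea behind a motion vector cost (this is the standard rate-constrained motion search formulation, not the actual VC-1 encoder internals, and all names here are my own) is that a candidate vector is judged on distortion plus the bits it costs to code, scaled by a lambda. Raising lambda on grainy sources makes cheap predictors like (0,0) win over spurious matches found in the noise:

```python
def mv_cost(sad, mv_bits, lam):
    """Rate-constrained motion search metric: J = SAD + lambda * bits(mv).
    Illustrative sketch; names and numbers are assumptions."""
    return sad + lam * mv_bits

# On grainy footage, a random MV can edge out (0,0) on raw SAD alone...
sad_zero, bits_zero = 1000, 1    # (0,0) predictor: nearly free to code
sad_noise, bits_noise = 980, 14  # slightly better match found in the grain
assert mv_cost(sad_noise, bits_noise, lam=0) < mv_cost(sad_zero, bits_zero, lam=0)
# ...but with a higher lambda, the cheap zero vector wins:
assert mv_cost(sad_zero, bits_zero, lam=4) < mv_cost(sad_noise, bits_noise, lam=4)
```

Adapting that lambda dynamically to the content is one way an encoder can stop chasing noise without hurting real motion.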

I've got a bunch of new trailers using the Smooth Streaming VC-1 VBR encoder that uses variable bitrate, chunk/GOP duration, and frame size. Hopefully they'll get posted this week.

I never use unrestricted for any real-world content, though, since there's always some upper bound of decode complexity and peak bitrate that should be applied. In practice, with a high enough buffer, constrained and unconstrained give the same output, so there's no downside to having some constraint. My default (with many exceptions, particularly as ABR goes down) is a 1.5x ABR/PBR ratio. That provides probably 80% of the theoretical maximum quality advantage of VBR, without making decode complexity and VBV management much harder than CBR.
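The 1.5x rule above works out to simple arithmetic; a quick sketch (function and parameter names, and the 4-second buffer, are my illustrative assumptions, not a tool's API):

```python
def vbr_caps(abr_kbps, ratio=1.5, buffer_seconds=4.0):
    """Derive peak bitrate and a VBV buffer from an average bitrate,
    using the 1.5x ABR/PBR ratio described above. Illustrative only."""
    peak_kbps = abr_kbps * ratio
    vbv_kbits = peak_kbps * buffer_seconds
    return peak_kbps, vbv_kbits

peak, vbv = vbr_caps(2000)  # e.g. a 2 Mbps 720p target
# -> 3000.0 kbps peak, 12000.0 kbit buffer
```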
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book

Last edited by benwaggoner; 17th August 2009 at 08:15.