YV12->RGB Conversion resample method for NLE use [Archive]

Yellow_

3rd September 2010, 11:18

Hi

A query about pros/cons of resample method when doing a YV12 to RGB conversion.

I'm doing full-range conversion to RGB32 from various sources: DSLR MPEG-4 AVC (minimal in-camera sharpening, saturation, contrast), HDV 1440x1080 MPEG-2 (minimal in-camera sharpening, saturation, contrast) and consumer DV.

I understand AviSynth's default method is bilinear, but my queries are about getting good output for further processing and grading in a typical NLE scenario.

**EDIT** It appears the default method is linear, not bilinear, in AVS up to 2.56, and hardcoded bicubic in the 2.6 dev builds. Is this still hardcoded?

Which would you consider to be a good resampling method for 'quality' conversion?

I assume there is a trade-off between resampling methods and that the choice may be dependent on the source? For example, TV card captures to YV12/YUY2 where processing has been done before encoding?

But I also assume too much resampling at the outset could be detrimental if further image processing is to be done?

DSLR video is line-skip sampled from a hi-res chip down to HD resolution, which causes moire. Is there anything specific that could be done in the conversion to improve its appearance, minimising the moire rather than accentuating it?

2Bdecided

3rd September 2010, 12:49

I wonder if you could use the tricks that NNEDI2 / EEDI2 use for luma on chroma? Interesting thought.

Is the HDV progressive or interlaced? If interlaced, some of the intelligence of a good deinterlacer could help, even if you want to keep the final result interlaced.

FWIW, apart from broken chroma upsampling, it's really hard to see any difference - that's why people don't seem to care. Red/Blue diagonal borders are good at revealing any problems.

Don't forget any subsequent use of this footage is likely to require YV12 encoding. Unless you're green-screening (or similar) it's not worth worrying about what the RGB intermediate looks like - worry more about making sure YV12 > RGB > YV12 is minimally damaging or even lossless. Near-lossless is possible, but good luck getting your NLE to do it!

Cheers,
David.
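
For anyone who wants to test the YV12 > RGB > YV12 round trip mentioned above, here is a minimal Python sketch. It assumes full-range BT.709 coefficients (one reading of what a "PC.709" style matrix means) and plain float math rounded to 8 bits. AviSynth's own integer implementation will differ in detail, chroma subsampling is ignored entirely, and the function names are made up for illustration.

```python
# Full-range BT.709 luma coefficients (assumption: the "PC.709" style case).
KR, KB = 0.2126, 0.0722
KG = 1.0 - KR - KB


def clip8(x):
    """Round to the nearest integer and clamp to the 8-bit range 0..255."""
    return max(0, min(255, int(round(x))))


def ycbcr_to_rgb(y, cb, cr):
    """Full-range 8-bit Y'CbCr -> 8-bit R'G'B' (per pixel, no subsampling)."""
    r = y + 2.0 * (1.0 - KR) * (cr - 128)
    b = y + 2.0 * (1.0 - KB) * (cb - 128)
    g = (y - KR * r - KB * b) / KG
    return clip8(r), clip8(g), clip8(b)


def rgb_to_ycbcr(r, g, b):
    """Inverse of the conversion above, again rounded to 8 bits."""
    y = KR * r + KG * g + KB * b
    cb = (b - y) / (2.0 * (1.0 - KB)) + 128
    cr = (r - y) / (2.0 * (1.0 - KR)) + 128
    return clip8(y), clip8(cb), clip8(cr)


def survives_round_trip(y, cb, cr):
    """True if Y'CbCr -> RGB -> Y'CbCr returns the original triplet."""
    return rgb_to_ycbcr(*ycbcr_to_rgb(y, cb, cr)) == (y, cb, cr)


print(survives_round_trip(128, 128, 128))  # neutral grey
print(survives_round_trip(16, 240, 240))   # lands outside the RGB cube
```

Neutral grey comes back intact, while a heavily saturated triplet such as (16, 240, 240) maps outside the RGB cube, gets clipped, and cannot be recovered. Looping `survives_round_trip` over real frame data would show how much of a given clip is affected.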

Yellow_

3rd September 2010, 13:30

Thanks for the reply. Oops, fell foul of rule 12. :-)

The sources are all progressive. For interlaced material I prefer to leave it interlaced and let the player deinterlace. If I have to deinterlace, I use TempGaussMC_Beta. It impacts my carbon footprint though. :-)

It may be difficult to tell the difference, but a few more image processing ops are likely after the conversion: primary colour correction, transitions and grading on the 8-bit source (done at higher precision). Does the resampling choice make a difference to how much 'post' a frame will stand? Not that it'll stand much in 8-bit, I guess.

2Bdecided

4th September 2010, 13:11

IME the lossy coding (HDV MPEG-2 or AVC MPEG-4) is what limits how much processing you can do. Extreme colour correction brings out hidden chroma artefacts from the coding far far more than the sub-sampling. e.g. overly boost the saturation on a mostly grey scene and see what happens!

Cheers,
David.
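
The saturation test is easy to reason about with a small Python sketch. It assumes 8-bit chroma centred on 128 and treats a saturation boost as a simple gain around that centre; `boost_saturation` is a hypothetical helper, not any NLE's actual implementation.

```python
def boost_saturation(cb, cr, gain):
    """Scale 8-bit chroma around the neutral point 128 and clamp to 0..255."""
    def scale(c):
        return max(0, min(255, round(128 + (c - 128) * gain)))
    return scale(cb), scale(cr)


# A truly grey pixel stays grey however hard it is pushed.
print(boost_saturation(128, 128, 8))
# A one-code-value coding error in each chroma channel becomes eight code values.
print(boost_saturation(129, 127, 8))
```

On a mostly grey scene almost all of the chroma signal is coding error, so the boost amplifies little else.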

poisondeathray

4th September 2010, 15:22

^ I agree with David. It's also the noise (usually from compression, but can also be from low quality optics, or low lighting), that limits grading far before sampling. Especially in shadow regions.

Depending on what software you are using, converting to a 10-bit intermediate can help with reducing new banding during grading (even though you are working at 16-bit or 32-bit float). Some native 8-bit formats do not interpolate correctly for some reason.

For the DSLR footage, if this is a Canon with a MOV wrapper, make sure you decode outside of QuickTime when encoding to your intermediate.

Yellow_

5th September 2010, 13:05

David, agreed: 8-bit is OK if you get the right 8 bits through shooting settings and only do a small amount of grading, which is the aim. Compression is a given and fixed. My thought was that resampling is a choice, and that choice may or may not make the compression look worse. For example, could choosing Lanczos over Gaussian or bilinear appear to 'over sharpen' or 'ring', or does that only happen when resizing a frame afterwards?

poisondeathray, thanks. Yes, I avoid QuickTime completely and use FFmpegSource2 for decoding and AviSynth for the RGB conversion due to swscale 'issues'. I have used Haali Media Splitter / GDSMux to mkv too, I think.

Getting 10-bit / 16-bit intermediates is something I've done with batch scripts, and Wilbert's ImageMagick script / .dll looks good for simplifying that process. However, Wilbert's ImageMagick builds and plugin have been compiled for 8-bit processing, whereas compiling for 16-bit would allow exporting 16-bit TIFFs, albeit with only an 8-bit amount of data. I believe Adobe and Apple add some discreet noise to generate a bit of extra data to push about when processing.

poisondeathray

5th September 2010, 18:07

"Getting 10-bit / 16-bit intermediates is something I've done with batch scripts, and Wilbert's ImageMagick script / .dll looks good for simplifying that process. However, Wilbert's ImageMagick builds and plugin have been compiled for 8-bit processing, whereas compiling for 16-bit would allow exporting 16-bit TIFFs, albeit with only an 8-bit amount of data. I believe Adobe and Apple add some discreet noise to generate a bit of extra data to push about when processing."

I guess it depends on how that 10-bit or 16-bit data was generated: whether it's just padded "zeros" or some other method. I haven't used ImageMagick much, but I'll check it out.
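
To illustrate the difference between padded zeros and "some other method", here is a small Python sketch of two common ways to promote 8-bit values to a deeper container. Both helper names are hypothetical. Which convention a given tool uses is an assumption to check per tool; video formats commonly use the shifted form (8-bit levels 16-235 become 10-bit levels 64-940), while full-range image formats often rescale so that 255 maps to the maximum code.

```python
def pad_zeros(v, bits):
    """Promote an 8-bit value by shifting left, i.e. padding the low bits with zeros."""
    return v << (bits - 8)


def full_scale(v, bits):
    """Promote an 8-bit value by rescaling so that 255 maps to the maximum code."""
    max_code = (1 << bits) - 1
    return (v * max_code + 127) // 255  # integer round-to-nearest


for v in (0, 16, 128, 235, 255):
    print(v, pad_zeros(v, 10), full_scale(v, 10), pad_zeros(v, 16), full_scale(v, 16))

# Either way there are still only 256 distinct levels in the output.
print(len({pad_zeros(v, 16) for v in range(256)}))
print(len({full_scale(v, 16) for v in range(256)}))
```

Neither method adds information by itself. The extra precision only pays off once later processing (or added dither noise) produces in-between values.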

Some formats work better than others - you can test it yourself and see quantization/banding in the histogram and corresponding to banding in gradients in the footage. For AE, I find v210 works well

Apple for sure adds noise when decoding through QuickTime, at least for the h264/mov 5D/7D Canon DSLR footage. AE dithers slightly upon export; not sure about PP. The noise can be beneficial in some circumstances, acting as a dither for gradients. You can see evidence of the noise in the screenshots below, especially in the blue channel. v210 is uncompressed 10-bit 4:2:2, so the noise must be from the Apple decoder. If you unwrap the native h.264 stream and use an .mp4 container, the noise disappears. Libavcodec will decode at full range for these cameras, so you may have to make levels adjustments depending on your goal.

There is quite a bit of "buzz" about a product called 5DtoRGB (for Mac); if you search, there are several reviews. You can get similar results on Windows just by avoiding decoding through QuickTime, using AviSynth and libavcodec instead. As you can see in the screenshots below, if you were doing FX work, keying etc., it will definitely get you better results.

Summary: Quicktime is BAD

http://i55.tinypic.com/29y2m51.png

Yellow_

5th September 2010, 19:02

poisondeathray

5th September 2010, 19:07

"I'll look into this. Thanks. I'm familiar with 5DtoRGB, but I'm on Linux + Wine. I contacted Rarevision recently to ask about progress (no updates since 2009) and whether they'd share the secret if they're not developing further. :-)"

There still is active development. Beta testers are evaluating batch encoding functionality right now, and the developers have stated there will be future Windows support (not sure about Linux).

Anyways, not to derail the thread too much, but the point was to avoid QT decoding, and using avisynth for manipulations + ffmpeg works great (which you can already batch with).

Yellow_

5th September 2010, 20:54

Windows is close enough, hopefully will run with Wine on linux. :-)

Yes, back on topic. I've done some tests recently with svn FFmpeg swscale resampling in the conversion to RGB, and bicubic looked best. However, the colour space conversion is apparently BT.601 only and possibly limited to 16-235, which rules FFmpeg out for that step, hence using AviSynth. As there is no real definitive choice, I'll try out some AviSynth resample methods and look at v210 decoding, then draw my own conclusions, those of an untrained eye.

poisondeathray

5th September 2010, 21:06

How are you determining that "bicubic" looks best? What criteria/method? Do you mean bicubic resize, or for the Y'CbCr => RGB conversion? I started reading this thread on gamut conversions, and there might be interesting possibilities using LUTs, but I'm still digesting it:
http://forum.doom9.org/showthread.php?t=139389

The Canon cameras actually use BT.601 when decoded through QT, so when the footage is converted to v210 in ffmpeg the colour looks a bit different than if you were viewing it through QuickTime (but the difference is also partly due to the QT gamma shift bug). You can see the "pink" looks slightly different in the RGB screenshot above. Of course you can manipulate it to whatever you want if you are converting to RGB in AviSynth. Also, v210 is Y'CbCr 4:2:2 (like YUY2, but 10-bit), so your other application may handle the RGB conversion differently, or you may have to use colour management. For example, v210 is decoded using BT.709 by default in AE, but you can force a different working space or use colour management.

You will also notice libavcodec will decode at 1920x1088 (as the elementary stream is actually 1920x1088)
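
The 1088 figure follows from the coded picture size being a whole number of macroblocks. A minimal Python sketch, assuming 16x16 macroblocks and progressive frame coding (`coded_size` is a made-up helper name):

```python
def coded_size(n, block=16):
    """Round a dimension up to the next multiple of the macroblock size."""
    return -(-n // block) * block  # ceiling division, then scale back up


print(coded_size(1920), coded_size(1080))  # width is already aligned, height is not
print(coded_size(1080) - 1080)             # extra rows that should be cropped on display
```

The stream normally signals how many rows to crop, so a decoder or script that ignores the cropping shows the padded height.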

Yellow_

6th September 2010, 10:18

Re bicubic: judged by my untrained eye, the histogram and the number of unique colours, but it's all subjective and almost certainly flawed. :-) It's in the conversion to RGB, rather than a resize afterwards.
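
As a caution on the unique-colours metric, here is a toy Python sketch of doubling one row of chroma samples two ways. It is a 1-D simplification with made-up helper names, assuming point sampling versus a two-tap linear filter with left co-sited chroma; real converters filter in two dimensions with their own siting and rounding.

```python
def upsample_point(chroma):
    """Double the sample count by repeating each chroma sample."""
    out = []
    for c in chroma:
        out += [c, c]
    return out


def upsample_linear(chroma):
    """Double the sample count, averaging neighbours for the in-between samples."""
    out = []
    for i, c in enumerate(chroma):
        nxt = chroma[min(i + 1, len(chroma) - 1)]  # repeat the last sample at the edge
        out += [c, (c + nxt + 1) // 2]
    return out


row = [100, 200, 200]
print(upsample_point(row), len(set(upsample_point(row))))
print(upsample_linear(row), len(set(upsample_linear(row))))
```

Any smoother filter invents in-between values at edges, so the unique count goes up without the picture necessarily being more faithful to the source.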

Yes I saw that thread, very interesting but like you still trying to digest. :-) And for me as a beginner a lot to take in.

This may explain a bit where I'm at. www.blendervse.wordpress.com Feel free to comment. :-)

poisondeathray

6th September 2010, 14:58

I've only experimented a little with Blender, but could it be that Blender isn't handling your HDV interlaced footage properly? Just like you have to specify ConvertToRGB(matrix="PC.709", interlaced=true) in AviSynth, maybe there is a flag or interlaced switch in Blender?

Yellow_

6th September 2010, 15:32

Blender uses FFmpeg to do the decode and swscale (part of FFmpeg) to do the colour space conversion. Blender wants RGB internally. There is other Blender-specific code around FFmpeg's tasks that may be getting involved, though. :-) Checking for interlaced material before the conversion to RGB, for example.

The source HDV is actually progressive, but psf25, 2x streams of segmented frames in a container. I'm assuming they're identical streams.

Perhaps svn FFmpeg is assuming they're interlaced?

Do the differences between interpolation methods surprise you? I've never looked into it before.

The last image, the AVIsynth conversion uses FFMS2 as the decoder and ConvertToRGB with interlaced=false.

poisondeathray

6th September 2010, 15:46

The blocky stair stepping low quality upsample reminds me a bit of other programs that have an interlaced chroma bug

IIRC, ffmpeg doesn't have a switch for YUV<=>RGB full range conversions, but Bt.709 is signalled by -colorspace 4 (however the last time I tested it a few months ago, it didn't work)

2Bdecided

6th September 2010, 16:08

"The source HDV is actually progressive, but psf25, 2x streams of segmented frames in a container. I'm assuming they're identical streams."

psf25 isn't "2x streams of segmented frames in a container". It's a way of storing progressive footage in interlaced video, done (in this context) purely to make certain consumer HDV camcorders not quite as good as real professional ones. It's trivial to use full progressive 1440x1080p25 MPEG-2 encoding for HDV, but they choose not to for commercial reasons.

The key points are that it probably contains interlaced-sampled chroma, and lots of software will resample the chroma in a way appropriate for interlaced video. You need to force software to treat it as progressive, and resample the chroma in a progressive way.

http://www.hometheaterhifi.com/the-dvd-benchmark/179-the-chroma-upsampling-error-and-the-420-interlaced-chroma-problem.html

Cheers,
David.
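
The difference between the two chroma interpretations can be sketched in a few lines of Python. This is a nearest-neighbour simplification that ignores chroma siting offsets and filtering, and the function names are hypothetical. It only shows which stored 4:2:0 chroma row each luma row borrows from under each assumption.

```python
def chroma_row_progressive(y):
    """Progressive 4:2:0: each chroma row covers two adjacent luma rows."""
    return y // 2


def chroma_row_interlaced(y):
    """Interlaced 4:2:0: each field is subsampled separately, so a chroma row
    covers two luma rows of the same field (rows y and y + 2)."""
    field = y % 2            # 0 = top field, 1 = bottom field
    row_in_field = y // 2    # luma row index within that field
    return 2 * (row_in_field // 2) + field


print([chroma_row_progressive(y) for y in range(8)])
print([chroma_row_interlaced(y) for y in range(8)])
```

Luma rows 1 and 2 swap chroma sources between the two interpretations, which is the kind of mismatch that produces stair-stepped colour edges when footage is upsampled under the wrong assumption.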

Yellow_

6th September 2010, 16:27

poisondeathray, thanks, re bt709 switch, I'd seen it mentioned here and there, but nothing recent. The authors of another open source NLE, called Kdenlive which is based on FFmpeg have discovered the same.

David, thank you for the correction, it explains a lot. :-)

This is another reason I just use AVISynth where there is some visibility and take the conversion out of FFmpeg / Blenders control.

So either svn FFmpeg (and many versions prior) or the Blender code is messing it up.

When I get back later I'll try FFmpeg on the CLI and go from there.

I'm very grateful for your comments.

poisondeathray

6th September 2010, 16:51

David's last comment was what I was alluding to for the switch - I'm not sure if Blender has this. In other programs, you can interpret the asset in different ways. Because it's packed in an interlaced "wrapper" most programs will "see" it as interlaced by default, and upsample it as such. For example, 30p in 60i (the NTSC variant of yours) is handled the same way, and you use the switch to interpret as progressive

You can check to see if there are any other problems beyond this, by repeating your tests with pN (native progressive) footage instead in blender/ffmpeg . Then you can test the resample methods without the extra variable

I use avisynth for the same control reasons :) , because many commercial programs just can't seem to get it right or enough control

Yellow_

6th September 2010, 18:24

Sorry, I'm a bit slow at times. No, there are no options in Blender to interpret incoming assets. Blender's not really an NLE; its VSE (video sequence editor) is really geared for assembling rendered image sequences, adding sound and encoding out.

FFmpeg does most of the work, I think. But I'll test FFmpeg on the CLI to avoid any interference.

But I have tried to go about the test in a meaningful way. I took the same Blender source code and the same svn version of FFmpeg (which is my system FFmpeg), and built each of the Blender binaries against that same FFmpeg version. All I amended was the one line of swscale flags in Blender's source code, so that when Blender imports a video it calls upon FFmpeg to decode it and convert it to RGB with the swscale flags I set.

But why would the results differ so much between those swscale methods? It's all the same source code and versions. It's the same HDV clip and the same frame rendered to pixel peep.

The images on the blog are zoomed 800% and screen grabbed.

But I'll try again and as you suggest with some native progressive T2i shots.

Thanks again.

**EDIT**

Added links to full images for comparison. :-)