Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules.
25th November 2013, 21:25 | #20961
Registered User
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,407
@makakam
Those predicted dropped or repeated frames are not due to the speed of your system but to a mismatch between your monitor's refresh rate and the video's frame rate. Turning on smooth motion will help. You can also tune your refresh rate by creating a custom resolution, or use ReClock.
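To make the mismatch concrete, here is a small back-of-the-envelope sketch (an illustration only, not madVR's actual scheduling logic) of how video frames map onto display refreshes when the frame rate does not evenly divide the refresh rate:

```python
# Illustration only (not madVR's actual scheduling code): when the video
# frame rate does not evenly divide the display refresh rate, some frames
# must stay on screen for more refreshes than others, visible as judder.
import math

def refreshes_per_frame(fps, hz, n_frames):
    """For each of the first n_frames video frames, count how many display
    refreshes it stays on screen under ideal presentation timing."""
    counts = []
    for k in range(n_frames):
        start = math.floor(k * hz / fps)
        end = math.floor((k + 1) * hz / fps)
        counts.append(end - start)
    return counts

print(refreshes_per_frame(24, 60, 4))  # [2, 3, 2, 3] -- the classic 3:2 cadence
print(refreshes_per_frame(25, 60, 5))  # [2, 2, 3, 2, 3] -- uneven repeats
```

With a refresh rate that is an exact multiple of the frame rate, every frame gets the same number of refreshes and the judder disappears, which is why tuning the refresh rate helps.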
26th November 2013, 20:53 | #20967
Nicolas Robidoux
Join Date: Mar 2011
Location: Montreal Canada
Posts: 269
A broadcaster, deeply involved in the standardization of 1080p and Ultra HD receivers, has kindly asked for my opinion regarding down/up-sampling of progressive scan video. As I am no video expert, I would like to ask for comments on what I would like to suggest as starting points. (I don't mind being called stupid, especially if a better solution is proposed.)

Context TTBOMK (I am not really at liberty to say much, and I certainly don't want to be put on the spot if what I describe fits nothing that ever sees the light of day): Let's talk luma. The full res signal will be downsampled by one of several rational ratios ranging between 1 (no downsampling) and 1/4, then compressed. Some of the rational ratios do not have a unit numerator (5/6, for example). This downsampled signal will then be decompressed, and upsampled or downsampled to fit the display by ratios ranging from 1/2 (further downsampled) to 4 (enlarged "a lot"). If I understand correctly, the recommendations must be "cheap to implement". At the very least, complications should give good bang for the buck.

This is what I suggest for downsampling. If the ratio is 1, there is nothing to do (except compress). For the other downsampling ratios, keeping in mind that the downsampled and compressed signal is NOT meant to be viewed "as is", I would suggest the following. (Note that this describes the effect of the filter, not an efficient implementation.)

Step 1: Convert to YcCbcCrc.
Step 2: Box filter by 2x2 (2 horizontally and 2 vertically) if the downsampling ratio is greater than or equal to 1/2, by 3x3 if it is greater than or equal to 1/3, by 4x4 if ... 1/4.
Step 3: Decimate. (For example, with downsampling ratio 5/6, keep the first 5 pixel values (each of which is an average of 4, since 5/6 >= 1/2), skip the 6th, keep the next 5, skip the 12th, etc. Then keep the first 5 scanlines, drop the 6th, etc.)
Step 4: Convert back to Y'Cb'Cr'.

Rationale:
1) Downsampling through something that approximates linear light is a high-impact improvement, so push for that as the one "quality expense".
2) Halos and other sharpening artifacts will re-enlarge (and further downsample) badly, so we should not use a sharpening filter. In addition, clipping leads to loss of information (and visually annoying artifacts when re-enlarging). So we may as well use the simplest low-pass filter followed by decimation.
3) Processing the decompressed downsampled image for viewing then becomes a clearly defined problem. For the sake of exposition, take the 5/6 ratio re-enlarged by 6/5: given averages of 4 pixels, except that every 6th column and every 6th scanline of such averages is missing, reconstruct the full res image so it looks good.

I'll discuss the issue of upsampling (actually resampling, since we may also be downsampling further) in another post (if I have time...). However, the one thing I am quite sure to recommend is NOT to perform this final resampling/sharpening through linear light: Filter the Y'Cb'Cr' values directly. Rationale: Section 6.6 of http://web.cs.laurentian.ca/nrobidou...tersThesis.pdf and http://www.imagemagick.org/Usage/filter/nicolas/#short.

Last edited by NicolasRobidoux; 26th November 2013 at 21:24.
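The effect of Steps 2 and 3 for the 5/6 example could be sketched as follows. This is a toy single-channel version only: the color space conversions of Steps 1 and 4 are omitted, boundary handling is the simplest possible ("valid" region only), and the function names are mine.

```python
# Sketch of "box filter then decimate" for a 5/6 downsampling ratio, on a
# single channel assumed already converted to (approximately) linear light.
# Color space conversions (Steps 1 and 4) and careful boundary handling
# are deliberately omitted.
def box2x2(img):
    # 2x2 moving average: sample (i, j) of the result is the mean of input
    # pixels (i, j), (i+1, j), (i, j+1), (i+1, j+1), so its centroid sits
    # half a pixel down and to the right of pixel (i, j).
    h, w = len(img), len(img[0])
    return [[(img[i][j] + img[i + 1][j] + img[i][j + 1] + img[i + 1][j + 1]) / 4.0
             for j in range(w - 1)] for i in range(h - 1)]

def decimate_5_of_6(img):
    # Keep the first 5 rows/columns of every group of 6, drop the 6th.
    keep = lambda n: [k for k in range(n) if k % 6 != 5]
    rows, cols = keep(len(img)), keep(len(img[0]))
    return [[img[i][j] for j in cols] for i in rows]

def downsample_5_over_6(img):
    return decimate_5_of_6(box2x2(img))
```

Note that, as the post says, this is a description of the filter's effect, not an efficient implementation; a real implementation would fuse the averaging and the decimation.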
26th November 2013, 21:32 | #20968
Nicolas Robidoux
Join Date: Mar 2011
Location: Montreal Canada
Posts: 269
The one thing I don't like about my (not really anything novel there, so "my" deserves quotes) proposed downsampling approach is this: if we are going to compress the downsampled image with something that plays well with "Fourier" methods (as JPEG does through its connection to the DCT), we may as well "downsample" in "(Fourier) coefficient space", which really means dropping modes and/or compressing more.
"Downsample on load" does work well with JPEG. Basically, one would not downsample directly: One would simply compress the full res image using "quality settings" sufficient to get a reasonable image when viewed at a certain smaller size. This of course takes away the "linear light downsampling" benefits, since these types of compression are generally better applied in a "perceptual color space", but it does make for a fairly elegant approach: Instead of downsampling then compressing, compress the full size image suitably aggressively.

Last edited by NicolasRobidoux; 26th November 2013 at 21:45.
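As a toy illustration of "dropping modes" (not how any broadcast codec or JPEG decoder actually downsamples), here is a 2x downsample of a single 8x8 block performed entirely in orthonormal DCT coefficient space, with all helper names my own:

```python
# Toy "downsample in (Fourier) coefficient space": forward 8-point DCT,
# keep only the 4x4 low-frequency corner, invert with a 4-point DCT.
# Illustration only; real codecs are far more sophisticated.
import math

def dct_matrix(n):
    # Orthonormal DCT-II basis as an n x n matrix.
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (j + 0.5) * k / n)
             for j in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(r) for r in zip(*a)]

def dct_downsample_8_to_4(block):
    # Forward 8-point DCT, keep the 4x4 low-frequency corner, scale by
    # sqrt(4/8) per axis (0.5 overall) so the block mean is preserved,
    # then invert with the 4-point DCT.
    c8, c4 = dct_matrix(8), dct_matrix(4)
    coeffs = matmul(matmul(c8, block), transpose(c8))
    kept = [[coeffs[i][j] * 0.5 for j in range(4)] for i in range(4)]
    return matmul(matmul(transpose(c4), kept), c4)
```

A flat block survives the round trip unchanged, and high-frequency content is simply discarded, which is exactly the "dropping modes" trade-off described above.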
26th November 2013, 22:14 | #20969
Nicolas Robidoux
Join Date: Mar 2011
Location: Montreal Canada
Posts: 269
On to the final resampling for viewing (upsampling as much as 4x, or downsampling further by as much as 1/2).

If the above "downsample by box filtering by 2, 3 or 4 in both directions" recommendation is followed, the result should be filtered for display, not merely decompressed, even if the dimensions of the viewed image are the same as the dimensions of the downsampled image. The main reason is that the box filtered samples that make up the decompressed downsampled image are not equally spaced in the original full size image. If, for example, we downsampled by a ratio of 5/6, the first five groups of 4 (2x2) pixels that were downsampled have centroids at a distance of 1 from one 2x2 group to the next (in the full size image), but the sixth retained 2x2 group is 2 away from the fifth retained one, because the "actual" sixth 2x2 group was skipped. In addition, the alignment of the downsampled image is slightly different from the original: It is still centered, but the first centroid is one half inter-pixel distance to the right and down compared to the center of the first pixel. (The last kept centroid is one half-pixel to the left and up.)

Now that we have established that the decompressed result should be filtered whether it is viewed at the same size, re-enlarged fully or partially, or further downsampled (with one exception: full, non-downsampled, resolution material viewed at full resolution, the "trivial case"), allow me to indicate how this filtering should be performed so as to preserve alignment.

First, figure out the positions, within the full size image, of the centroids of the groups of pixels that were box filtered when downsampling. Then, map out where these centroids "land" within the viewed ("final") image when transformed using the "align corners" image geometry convention (http://libvips.blogspot.ca/2011/12/t...ith-align.html). Let's assume that the pixel centers of the "final viewed image" are located at coordinates written (i,j) such that nearest pixels are at a distance of 1, and the centroids are located at coordinates labeled (U,V). We have pixel values at the centroids that were not "thrown away" when decimating. Our job is to interpolate the "known" data at the (U,V) positions to the (i,j) positions.

Construct a separable filter as follows. Let the horizontal = vertical distance between centroids in the viewed image be D. If D > 1, the downsampled image is re-enlarged. If D < 1, it is further downsampled. When reconstructing the pixel value at (i,j), we consider all centroids that are within max(3,3D) to the left, right, top and bottom; that is, all centroids that fall within the square of width = height = 2*max(3,3D) centered at (i,j). The reconstructed pixel value will be a weighted sum of the centroid values. Give a raw weight w(U,V) equal to 0 if (U,V) is the position of a centroid that was "dropped" when decimating. This is equivalent to "ignoring" such centroids in the weighted sum. Otherwise, let the raw weight w(U,V) be the usual Lanczos 3 (Sinc-windowed Sinc with lobe parameter a = 3) weight

w(U,V) = sinc(pi*(U-i)/max(1,D)) * sinc(pi*(U-i)/(3*max(1,D))) * sinc(pi*(V-j)/max(1,D)) * sinc(pi*(V-j)/(3*max(1,D)))

The raw weights need to be normalized by the sum of the (nonzero) weights (there are at most 36 when re-enlarging, more when further downsampling). Although terse, this completes the description of the filter.

-----

The above filter is separable. However, it has a rather large footprint. If this footprint is too large to be practical, the Mitchell-Netravali bicubic, another separable filter, can be used instead of Lanczos 3 to generate the raw weights. Details can be found here: http://www.imagemagick.org/Usage/filter/#mitchell.

Last edited by NicolasRobidoux; 27th November 2013 at 08:51.
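A 1-D sketch of the raw-weight computation described above (helper names are mine, not from any standard; the 2-D filter applies the same weights separably in U and V):

```python
# 1-D sketch of the raw-weight scheme described in the post.  The 2-D
# filter is separable, so the same logic is applied in both directions.
import math

def sinc(x):
    # normalized sinc: sin(pi*x)/(pi*x), with sinc(0) = 1
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def lanczos3(x, scale):
    # Lanczos 3 kernel stretched by `scale` (= max(1, D) in the post);
    # support is 3*scale on each side, matching the max(3, 3D) window.
    x /= scale
    return sinc(x) * sinc(x / 3.0) if abs(x) < 3.0 else 0.0

def reconstruct(target, centroids, D):
    """`centroids` is a list of (position, value, kept) triples; centroids
    dropped when decimating (kept == False) get raw weight 0.  Returns the
    weighted sum at `target`, normalized by the sum of nonzero weights."""
    scale = max(1.0, D)
    pairs = [(lanczos3(u - target, scale), v)
             for (u, v, kept) in centroids if kept]
    total = sum(w for w, _ in pairs)
    return sum(w * v for w, v in pairs) / total
```

Because the weights are renormalized over the surviving centroids, a "dropped" centroid simply redistributes its share of the weight over its neighbors, which is the "ignoring such centroids" behavior described above.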
26th November 2013, 22:59 | #20970
Registered Developer
Join Date: Sep 2006
Posts: 9,140
BTW, I can't imagine that the broadcaster is going to downscale, then compress, then upscale again for broadcasting. I suppose if they downscale, that will probably be the resolution they are going to broadcast in, and upscaling might then only be performed in the end user's video playback chain, depending on which resolution the end user's display has.

I would not try to "balance" downscaling and upscaling unless you know for a fact that you will always have perfect control over both steps. Instead I'd look at every step separately, trying to optimize each step as far as possible. In terms of "cheap to implement", I can say that Bicubic50 with linear light and anti-ringing performs quite well in madVR. It's noticeably faster than a simple Jinc/EWA Lanczos downscale (without linear light and without anti-ringing) would be.

Of course this is just my 2 cents. Maybe it would make sense to do some tests. We should trust our eyes more than theoretical brainstorming. E.g. ask the broadcaster to provide you with a few samples, or alternatively just take a Blu-ray movie and downscale it. Then compare how the final result looks with the suggested algorithms and pick the one which looks best. Personally, I believe choosing a sharper algorithm will have a more beneficial effect than using linear light.

Edit: One more thing: What you destroy during downsampling you can't (easily) get back later through upscaling. So the downscaling step is quite important. Choosing a soft downscaling algorithm will make it extra hard to get a nicely sharp final output image to the display, IMHO.

Last edited by madshi; 26th November 2013 at 23:07.
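For reference, and assuming "Bicubic50" denotes the Keys cardinal cubic with sharpness parameter a = -0.5 (i.e. the Catmull-Rom spline) — an assumption on my part, not a statement about madVR's internals — the kernel can be written as:

```python
# Assumption: "Bicubic50" is read here as the Keys cardinal cubic with
# a = -0.5, i.e. the Catmull-Rom spline.  This is my reading, not a
# statement about madVR's actual implementation.
def bicubic_kernel(x, a=-0.5):
    x = abs(x)
    if x < 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0
```

The kernel interpolates (it is 1 at 0 and 0 at all other integers) and the four weights at any sample position sum to 1, so no separate normalization step is needed.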
27th November 2013, 00:06 | #20971
Nicolas Robidoux
Join Date: Mar 2011
Location: Montreal Canada
Posts: 269
As usual, thank you Mathias.

-----

I was not given much time on this ("ASAP = yesterday") and my day job is keeping me very busy, so...

-----

I am getting the impression that downsampling and upsampling will be mixed and matched, and also that downsampling results are not meant to be viewed directly. Maybe you're right, and we should go for a fairly sharp downsampling filter (like Lanczos 3, say). (Side note: I think that EWA is out of the running because of computational complexity; anti-ringing may be doable. "Anti-ringing filters have been found to produce good results with sharpening filters like Lanczos." may be a good sentence to add.)

However, I am not sure everybody would like the result of re-enlarging compressed images downsampled with, say, Lanczos. Or anything with significant haloing, actually. Preventing such horrid artifacts, and avoiding changes in texture/blurriness introduced by the filter changing phase across the image, is why I feel the classical box filtering then decimate (through linear light) is appropriate, followed by a very sharp filter to "finish up" (hoping that the initial box filtering prevents the worst artifacts that can be introduced by the use of Lanczos in the final stage). This is not totally unreasonable: Sharpening (to tighten features and interfaces) sometimes looks better when applied to an image which was first lightly low-pass filtered (enough to filter out the Nyquist checkerboard).

But as you say, without taking the time to test... Well, bad advice is sometimes better than no advice.
27th November 2013, 00:10 | #20972
Nicolas Robidoux
Join Date: Mar 2011
Location: Montreal Canada
Posts: 269
There are several reasons why I am reasonably certain that downsampling through linear light and then re-enlarging (or further downsampling) through perceptual light is a good idea.

One of them is Section 6.6 of the thesis of my former student Adam Turcotte: From an accuracy viewpoint, downsampling through linear light (linear RGB with sRGB primaries) then re-enlarging through sRGB has been found to be consistently more accurate than toolchains that do everything through linear light or everything through sRGB. Of course I'm extrapolating... But there are heuristics that support this viewpoint. They are related to the intent of sigmoidization.

-----

Indeed, I am skating on thin ice here, heuristically speaking.
27th November 2013, 07:37 | #20976
Registered Developer
Join Date: Sep 2006
Posts: 9,140
@Nicolas, I think before making a recommendation you should really do some comparison tests. Trust your eyes instead of your scientific brain, while testing with real world material (not with test patterns).

One problem with giving a recommendation is that not everybody has the same priorities. Some people are more sensitive to specific artifacts (like aliasing or ringing) than others. But I think the majority of end users value sharpness very highly. And many end users don't even know what ringing is, nor are they very bothered by it. Personally, I hate ringing artifacts, but if I had to pick a cheap algorithm, I'd still pick a sharp one, even if it rings, because videos are often somewhat soft by nature, and I think a sharp resampling algorithm is what the majority of end users would likely prefer, if they had the choice. Because of that, IMHO a good choice for a cheap upscaling algorithm is probably Lanczos3 (btw, be very specific about taps, because marketing likes to count them differently than developers do; developers say Lanczos3, marketing says the algorithm uses at least 6 taps, or maybe 12 or even 36). For downscaling, Bicubic50 might do.

Don't forget that video resampling is different from image resampling. In image resampling you can magnify the scaled image and judge every minute difference under a microscope. In video resampling you have only 1/24th of a second to judge one image before the next image is displayed. And the human eye tries to track motion while watching the video, automatically reducing noise and tracking sharp edges etc.
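The tap-counting ambiguity can be made explicit (my summary of the remark above, illustrative only):

```python
# Illustration of the tap-counting ambiguity described above: the same
# "Lanczos3" resampler can honestly be called 6-, 12-, or 36-tap
# depending on whether you count per axis, per separable pass, or as a
# naive non-separable 2-D filter.
def lanczos_taps(radius=3, dims=1, separable=True):
    # Source samples touched per output sample when the sample point falls
    # strictly between source pixels (the generic, unscaled case).
    per_axis = 2 * radius
    if dims == 1:
        return per_axis
    return dims * per_axis if separable else per_axis ** dims
```

So a developer's "Lanczos3" (radius 3) is 6 taps per axis, 12 taps counted over the two separable passes, and 36 taps counted as a full 2-D footprint, which is exactly where the marketing numbers come from.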
27th November 2013, 11:13 | #20978
Registered User
Join Date: Nov 2012
Posts: 167
Since you didn't say whether the crash is in madVR, note that I'm also using MPC-HC 1.7.1.83 (nightly) with internal LAVFilters and the official XySubFilter build (3.1.0.546).
27th November 2013, 16:24 | #20979
Registered User
Join Date: Oct 2012
Posts: 7,922
If it still crashes with xy-VSFilter, then it's most likely a problem with madVR. You can try lowering the CPU queue to 20 or lower, or look for related errors in the bug tracker: http://code.google.com/p/xy-vsfilter/issues/list
27th November 2013, 19:26 | #20980
Registered Developer
Join Date: Sep 2006
Posts: 9,140
Weird, must be the drivers, I guess.

What do you mean?

Not sure what you mean? Flybar is a feature of MPC-BE, I think. If you want to have a button added for that, you need to talk to the MPC-BE developers.

Definitely not. Need/want it myself.
Tags: direct compute, dithering, error diffusion, madvr, ngu, nnedi3, quality, renderer, scaling, uhd upscaling, upsampling