30th October 2024, 22:48 | #1 | Link |
Registered User
Join Date: Jun 2024
Location: South Africa
Posts: 79
|
Dithering: Sierra-2-4A vs. Floyd-Steinberg
Just a quick question. When dithering down from a higher bit depth, is Floyd-Steinberg error diffusion better than Sierra-2-4A, or are they visually the same? Thanks.
|
31st October 2024, 01:44 | #2 | Link |
Registered User
Join Date: Feb 2002
Location: San Jose, California
Posts: 4,428
|
For 8-bit output? 1-bit?
They are very similar. Better or worse probably depends on the exact image and which aspect you are paying attention to. For something like 16 bit to 8 bit I don't think anyone would be able to notice a difference.
__________________
madVR options explained |
31st October 2024, 01:51 | #3 | Link |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 3,087
|
Good question. The answer is: it depends.
There are lots of dithering methods other than ordered dithering like Stucki and Atkinson (both available in the old Dither Tools back when 16bit stacked was the norm), however the most common ones are Floyd Steinberg and Sierra, the two you mentioned, for one simple reason: the first is integrated directly into ConvertBits() and the second is integrated within x264. Before we dig into the differences between the two, let's make an extreme example: you have an 8bit full pc range image with 0 being black and 255 being white and you wanna encode it so that each pixel is either black or white. A normal rounding would simply check whether the pixel you're trying to encode is closer to 0 or 255 and it would mark it either as black or as white. If you have a pixel that is like 82, it would be converted to black, but so would a pixel with value 44, one with value 55, one with value 61, one with value 73, one with value 96 and so on, thus leading to huge regions of black until there's a jump to white like for pixels whose values are 130, 140, 150 etc. What "error diffusion" or "dithering" does is diffusing (i.e spreading) the error of each calculation to the neighboring pixel. For instance, suppose you have a pixel whose value is 82 and we still need to make it either black or white. With dithering, we would still turn it black as it's closer to 0 than it is to 255, however we would then "remember" that the value was actually 82 steps away from black so that when we move to the neighboring pixel, which has, let's say, a similar value of 92, instead of making it black we add the error (82) to it so that 92+82 = 174 which is closer to 255 so we turn it white. Once again, we "remember" the error 'cause 174 is actually -81 steps away from the white which we will add to the value of the next pixel and so on. This creates an "alternating" pattern of black and white pixels that is much "smoother" than going from a totally black to a totally white transition of several pixels. 
In theory this pattern does a better job of mimicking a region of pixels whose values are close to one another (like a gradient). Once we finish processing a line of the image, we "forget" the error and move on to the next line starting from scratch (i.e. from an error of 0).

Of course, that's not quite how real dithering works: in my example I only shifted the error to the next pixel on the right, so in one dimension, while actual dithering algorithms are two dimensional and spread the error both horizontally and vertically. There are many ways to diffuse an error in two dimensions, but dithering algorithms always push the error forward, never backward: if you start from the top-left pixel and scan the image left to right, one line at a time, you never add errors to pixels on the left or above, because those have already been processed. Pushing errors backwards to pixels you've already quantised would just accumulate more errors. So the error always goes right and down, and the exact way it's distributed is what differentiates the algorithms; it's what makes Floyd-Steinberg and Sierra different.

Let's suppose we have the following pixels in a section of an 8bit image: Code:
 42  44 115 116 116
100 120 126 126 126

With plain rounding, every one of these pixels is closer to 0 than to 255, so the whole block would turn black: Code:
0 0 0 0 0
0 0 0 0 0

Floyd-Steinberg distributes the error across the neighboring pixels (where "Pixel" is the pixel we're starting with) like this: Code:
   0  Pixel  7/16
3/16   5/16  1/16

We start from the second pixel of the top row, 44: it rounds to black, so the error to spread is 44. Multiplying it by the coefficients:

44 * 7/16 = 19
44 * 3/16 = 8
44 * 5/16 = 14
44 * 1/16 = 3

So we're not spreading any error to the left or up (backwards), only forward: 19 goes to the right neighbor, 8 to the bottom-left, 14 to the bottom-center and 3 to the bottom-right neighbor of the pixel we started from. Our new pixels will therefore be:

42 + 0 = 42
44 + 0 = 44
115 + 19 = 134
116 + 0 = 116
116 + 0 = 116
100 + 8 = 108
120 + 14 = 134
126 + 3 = 129
126 + 0 = 126
126 + 0 = 126

so our block subject to Floyd-Steinberg error diffusion would be: Code:
 42  44 134 116 116
108 134 129 126 126

and after rounding each pixel to black or white, the dithered block becomes: Code:
0   0 255 0 0
0 255 255 0 0

Sierra 2-4A, on the other hand, distributes the error with these coefficients: Code:
   0    0  Pixel 4/16 3/16
1/16 2/16   3/16 2/16 1/16

This time let's start from the third pixel of the top row, 115: it also rounds to black, so the error to spread is 115. Multiplying it by the coefficients:

115 * 4/16 = 29
115 * 3/16 = 22
115 * 1/16 = 7
115 * 2/16 = 14
115 * 3/16 = 22
115 * 2/16 = 14
115 * 1/16 = 7

So the Sierra 2-4A error diffusion calculation for our block would be:

42 + 0 = 42
44 + 0 = 44
115 + 0 = 115
116 + 29 = 145
116 + 22 = 138
100 + 7 = 107
120 + 14 = 134
126 + 22 = 148
126 + 14 = 140
126 + 7 = 133

so our block subject to Sierra 2-4A error diffusion would be: Code:
 42  44 115 145 138
107 134 148 140 133

and after rounding each pixel to black or white: Code:
0   0   0 255 255
0 255 255 255 255

Now, after having said all that: if you're using x264, which supports internal dithering via Sierra 2-4A, feed it a 16bit planar input and let it dither down to either 10bit or 8bit (whatever your target bit depth is), so that the dithering happens where the encoder can detect those patterns and encode them more efficiently. If instead you're using an encoder that doesn't support internal dithering (like HCEnc for MPEG-2), use Floyd-Steinberg within Avisynth and feed the encoder the already dithered 8bit stream. Last edited by FranceBB; 31st October 2024 at 02:01. |
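For anyone who wants to check the arithmetic in the worked examples above, here's a small Python sketch (the names and layout are my own; only the coefficients come from the post) that spreads a single pixel's error over the example 2x5 block with each kernel:

```python
# Forward-only error-diffusion kernels as (row offset, column offset, weight).
FLOYD = [(0, 1, 7/16), (1, -1, 3/16), (1, 0, 5/16), (1, 1, 1/16)]
SIERRA_24A = [(0, 1, 4/16), (0, 2, 3/16),
              (1, -2, 1/16), (1, -1, 2/16), (1, 0, 3/16),
              (1, 1, 2/16), (1, 2, 1/16)]

def spread_error(block, y, x, err, kernel):
    """Add round(err * weight) to each forward neighbour of pixel (y, x)."""
    out = [row[:] for row in block]
    for dy, dx, w in kernel:
        ny, nx = y + dy, x + dx
        if 0 <= ny < len(out) and 0 <= nx < len(out[0]):
            out[ny][nx] += round(err * w)
    return out

block = [[42, 44, 115, 116, 116],
         [100, 120, 126, 126, 126]]

# Floyd-Steinberg, diffusing the error (44) of the second top-row pixel:
print(spread_error(block, 0, 1, 44, FLOYD))
# → [[42, 44, 134, 116, 116], [108, 134, 129, 126, 126]]

# Sierra 2-4A, diffusing the error (115) of the third top-row pixel:
print(spread_error(block, 0, 2, 115, SIERRA_24A))
# → [[42, 44, 115, 145, 138], [107, 134, 148, 140, 133]]
```

Note this only diffuses one pixel's error, like the tables do; a full dither would repeat the process for every pixel, left to right, top to bottom.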
31st October 2024, 10:13 | #5 | Link |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 3,087
|
If you're using x264/x265, feed them the 16bit planar input and let them dither. If you're using x262, dither down to 8bit in Avisynth with Floyd-Steinberg instead.
Avisynth doesn't support dithering from 32bit float. If you apply ConvertBits(bits=8, dither=1) to a 32bit float input, Avisynth first converts to 16bit planar and then applies the Floyd-Steinberg error diffusion to get to 8bit. As for x264/x265, they don't accept 32bit float input either, so either you convert to 16bit inside Avisynth and let them dither, or you convert to the target bit depth yourself.

x264's Sierra 2-4A is here: Link
Avisynth's Floyd-Steinberg is here: Link

About the Avisynth one: if you scroll down to line 355 of the source code, you're gonna see the coefficients I mentioned above. You remember the Code:
   0  Pixel  7/16
3/16   5/16  1/16

Well, here they are, at line 355: Code:
static AVS_FORCEINLINE void diffuse_floyd_f(float err, float& nextError, float* error_ptr)
{
#if defined (FS_OPTIMIZED_SERPENTINE_COEF)
  const float e1 = 0;
  const float e3 = err * (4.0f / 16);
#else
  const float e1 = err * (1.0f / 16);
  const float e3 = err * (3.0f / 16);
#endif
  const float e5 = err * (5.0f / 16);
  const float e7 = err * (7.0f / 16);

  nextError = error_ptr[direction];
  error_ptr[-direction] += e3;
  error_ptr[0] += e5;
  error_ptr[direction] = e1;
  nextError += e7;
}

Sound familiar?

7/16 is (7.0f / 16)
3/16 is (3.0f / 16)
5/16 is (5.0f / 16)
1/16 is (1.0f / 16)

and the * you see in the code multiplies the error by those coefficients, exactly like in the calculation I showed last night. |
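To see how that per-row error buffer ends up being used across a whole frame, here's a simplified Python translation of the idea (my own sketch; it drops the serpentine scanning and the `direction` machinery of the real code), dithering a 16-bit plane down to 8-bit with the Floyd-Steinberg weights:

```python
def floyd_steinberg_16_to_8(plane):
    """Dither a 16-bit plane (list of rows of ints) down to 8-bit with the
    Floyd-Steinberg weights, using a per-row error buffer similar in spirit
    to the Avisynth routine above (simplified: left-to-right scanning only)."""
    w = len(plane[0])
    out = []
    below = [0.0] * (w + 2)                      # errors pushed to the next row
    for row in plane:
        above, below = below, [0.0] * (w + 2)
        carry = 0.0                              # the 7/16 share pushed right
        out_row = []
        for x, value in enumerate(row):
            val = value + carry + above[x + 1]
            q = min(255, max(0, round(val / 257)))   # 65535 / 255 = 257
            out_row.append(q)
            err = val - q * 257
            carry = err * (7 / 16)
            below[x]     += err * (3 / 16)       # below-left
            below[x + 1] += err * (5 / 16)       # below
            below[x + 2] += err * (1 / 16)       # below-right
        out.append(out_row)
    return out

# A flat mid-grey plane (128 * 257 = 32896) stays flat:
print(floyd_steinberg_16_to_8([[32896] * 4, [32896] * 4]))
# → [[128, 128, 128, 128], [128, 128, 128, 128]]
```

On values that don't map exactly, the carried errors make neighbouring pixels alternate between the two nearest 8-bit codes, which is exactly the pattern described earlier in the thread.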
31st October 2024, 10:53 | #6 | Link |
Registered User
Join Date: Jun 2024
Location: South Africa
Posts: 79
|
Hats off to you, FranceBB! What a descriptive, thorough, and simple explanation. Before, I had a vague picture of what error diffusion does, but now I understand it considerably better. I appreciate your taking the time to write that, and I'm sure it will also be useful to others. Many thanks!
I asked because, recently, while encoding a set of anime in AV1, HEVC, and AVC, I used fmtconv's default dithering, Sierra 2-4A, after debanding, and I wondered whether Floyd-Steinberg would have been better or the same. Now, as chance would have it, I have to re-encode the anime, and thought I'd get to the bottom of this. Also, for live-action films, after colour-space conversion and tone mapping the video is dithered from 32-bit float to 8 bits (fmtconv uses Sierra; zimg/zscale uses FS). I had been piping the final, dithered result to FFmpeg. To follow your advice of sending the 16-bit data to the encoder and letting it handle the dithering, I think I'll have to switch to using the x264/5 executables directly. I stand to be corrected, but in FFmpeg I don't think the encoding library can handle the final bit-depth reduction; that seems to happen higher up in the architecture. |
31st October 2024, 11:42 | #7 | Link | |
Registered User
Join Date: Jun 2024
Location: South Africa
Posts: 79
|
Quote:
I'm using VapourSynth: fmtconv does support dithering straight from 32-bit float to whatever integer format and has a couple of algorithms, Sierra 2-4A being the default. For the anime, because the f3kdb debanding works at 16 bits, I'll try to pipe that straight to the encoders. For the films, I'll dither from 32 to 8 bits and then pipe to FFmpeg. |
|
31st October 2024, 11:58 | #8 | Link |
Registered User
Join Date: Aug 2024
Posts: 113
|
Not trying to sound like an ___ (you name it), but since you are doing video encoding: based on my own testing, the better choice is to avoid any kind of error-diffusion method. That leaves you with dmode 0, 8 and 9. That's for an 8-bit target at least; for 10-bit I just follow the 8-bit result, because the differences are less significant and harder to compare.
Because this again is like "source: trust me bro", and you only asked about error-diffusion algorithms, I'll pause here; if you want some explanations (which will probably also sound like "source: trust me bro"), I shall continue. |
31st October 2024, 12:28 | #9 | Link | |
Registered User
Join Date: Jun 2024
Location: South Africa
Posts: 79
|
Quote:
|
|
31st October 2024, 13:53 | #10 | Link | |
Registered User
Join Date: Aug 2024
Posts: 113
|
https://f3kdb.readthedocs.io/en/late...rg-dither-algo
Quote:
This f3kdb doc only mentions Floyd-Steinberg because that's the one implemented in f3kdb, but it also applies to other error diffusions. And it's NOT specific to the debanded pixels: the debanding and the dithering in f3kdb are basically two independent parts, so this actually applies to any dither filter. I quoted f3kdb as a trustworthy source, not because the debanding matters in this context. Last edited by Z2697; 31st October 2024 at 14:00. |
|
31st October 2024, 13:55 | #11 | Link | |
Broadcast Encoder
Join Date: Nov 2013
Location: Royal Borough of Kensington & Chelsea, UK
Posts: 3,087
|
Quote:
I'm glad to be playing my part in the community. I should really add this to the Wiki, but then I'd also have to find the time to cover Stucki and Atkinson, so we'll see.
|
1st November 2024, 07:49 | #12 | Link | |
Registered User
Join Date: Jun 2024
Location: South Africa
Posts: 79
|
Quote:
|
|