31st July 2013, 07:53 | #41 | Link |
Video compressionist
Join Date: Jun 2009
Location: Israel
Posts: 126
|
@Pieter3d
Thanks for your professional response. Judging by your profound comments, I guess you are Pieter K. Am I right? If so, you should remember me from JCT-VC meetings. So far only three experts (including you) have reviewed the overview. My purpose is to compile a detailed, complete, free-access presentation on HEVC, based on discussions and feedback with/from experts plus my own opinion. That's why the title says "prepared by" instead of "author". Regarding your comments:
1) VPS is optional - agreed.
2) ... mnemonics 2Nx2N etc. make sense - very confusing, especially for AVC/H.264 guys.
3) ... intra prediction follows the TU tree - I tried to explain it (e.g. in slide #35); apparently I need to add more comments.
4) ... spatial neighbor scaling for AMVP - good point. It's also worth stressing that, unlike AVC/H.264, neighbors with a different prediction direction are taken into consideration. For example, if the current block is forward-only and a neighboring one is backward-only, then the backward MVs of the neighboring block are incorporated into the AMVP process with a corresponding scaling.
5) Transform can actually be implemented with 28-bit precision - worth adding, together with the dynamic-range calculation.
6) ... columns because the coeff block is processed first by columns - correct; unlike AVC/H.264, HEVC defines a column-row order for the transform. This point is mentioned in the section "Transforms and quantization" of the IEEE paper "HEVC Complexity and Implementation Analysis".
7) Visual artefacts on large transform blocks - this isn't always the case ... do you know any heuristics to determine when the artefacts appear and when not? |
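Point 6 (the column-row transform order) can be illustrated with a toy separable transform. The 1-D kernel below is just a 2-point butterfly, not HEVC's partial-butterfly DCT, and all names are made up; the point is the traversal order. With exact arithmetic the order wouldn't matter for a separable transform, but HEVC's intermediate right-shifts and clipping between the two passes make the order normative.

```cpp
#include <array>

// Toy 1-D kernel: a 2-point butterfly, (a, b) -> (a + b, a - b).
void butterfly2(int& a, int& b) {
    int s = a + b, d = a - b;
    a = s;
    b = d;
}

// Separable 2-D transform of a 2x2 block (row-major), applying the
// 1-D kernel down the columns first and then across the rows, which
// is the order HEVC prescribes for its (much larger) transforms.
void transformColumnsThenRows(std::array<std::array<int, 2>, 2>& blk) {
    for (int x = 0; x < 2; ++x) butterfly2(blk[0][x], blk[1][x]);  // column pass
    for (int y = 0; y < 2; ++y) butterfly2(blk[y][0], blk[y][1]);  // row pass
}
```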
31st December 2013, 12:16 | #43 | Link |
Registered User
Join Date: Dec 2013
Posts: 6
|
Quantization
Hi, nice explanation. Can you also explain how quantization is done in HEVC? I am working on RDOQ in the HM reference code, and I have only a vague idea of how exactly Rate-Distortion Optimized Quantization is done in HEVC. I would be happy if someone could explain this. Thanks in advance
|
1st January 2014, 03:29 | #44 | Link |
Registered User
Join Date: Jan 2013
Location: Santa Clara CA
Posts: 114
|
Forward quantization is pretty much up to whatever encoder you write. You just have to keep in mind the way a compliant decoder performs inverse quantization:
The QP value for a CU is determined, a number between 0 and 51 for normal 8-bit sequences. Then a scale factor is derived: scale_factor = levelScale[qp%6]<<(qp/6), where levelScale = { 40, 45, 51, 57, 64, 72 }. This creates an exponential relationship between qp and scale_factor (the step size doubles every 6 QP). Then essentially the coefficients are multiplied by this value and shifted down by (bitDepth + log2TransformSize - 5). There are a few other details, but the spec is pretty easy to follow for this. RDO is a different topic though.... |
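As a sketch in code (assuming the 8-bit QP range described above and the rounding add the spec's inverse scaling uses; function names here are illustrative, not taken from the spec or HM):

```cpp
// The six base scale factors; each +6 in QP doubles the step size.
static const int levelScale[6] = { 40, 45, 51, 57, 64, 72 };

// scale_factor = levelScale[qp % 6] << (qp / 6), qp in 0..51 for 8-bit.
int scaleFactor(int qp) {
    return levelScale[qp % 6] << (qp / 6);
}

// Inverse-quantize one coefficient level: multiply by the scale
// factor, then shift down by (bitDepth + log2TransformSize - 5),
// with a rounding add of half the shift range.
int dequant(int level, int qp, int bitDepth, int log2TrSize) {
    int shift = bitDepth + log2TrSize - 5;
    return (level * scaleFactor(qp) + (1 << (shift - 1))) >> shift;
}
```

For example, a level of 10 at QP 0 in an 8x8 block (log2TrSize = 3) reconstructs as (10*40 + 32) >> 6 = 6; and scaleFactor(qp + 6) is always exactly 2 * scaleFactor(qp), which is the exponential relationship mentioned above.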
1st January 2014, 07:42 | #45 | Link |
Registered User
Join Date: Dec 2013
Posts: 6
|
Thanks Pieter. Please don't think I am deviating from the topic. As far as I know, RDO must be done for every coding unit and for each mode: for each CU in a particular mode (INTRA, INTER, etc.) we need to find the rate and distortion and evaluate the cost function J = R + (lambda)D. The mode which yields the least J is selected and is RD-optimal. My question is how the rate, distortion and lambda are estimated. I went through the HM reference code but could not follow the code flow or which algorithm is used for RDOQ. This might be a very basic question; I'd appreciate it if you could explain or point me to a link on this area.
|
3rd January 2014, 03:19 | #46 | Link |
Angel of Night
Join Date: Nov 2004
Location: Tangled in the silks
Posts: 9,560
|
Mode decision mostly boils down to "test everything, pick whatever costs the least." Since testing literally everything is stupidly slow, the complication comes from the tons of speedups used to pare down the test space: take a guess at where to start looking, compare mostly via fullpel SAD (or even low-res SAD), then only test subpixel on the decent matches, then only transform the closest candidates, then only attempt to entropy code the lowest-energy candidates, then only try trellis quant on the smallest result(s) before settling and moving on to the next block. Each step is a sieve that reduces the problem space for successively slower steps, so you can spend time where it's more important. AVC and HEVC also include a number of predictors for each block, like skip and direct modes, which need to be tested too; you can bypass everything above entirely if heuristics tell you one of the predictors is already good enough. Most encoders let you tweak how much they ignore, so you can find your own speed/quality balance.
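The final pick at the bottom of that sieve is just a Lagrangian cost comparison. A toy version, assuming the usual J = D + lambda*R form (the candidate list and its distortion/rate numbers are made up here; a real encoder fills them in from the motion search, transform, and entropy-coding estimates described above):

```cpp
#include <vector>
#include <limits>

struct Candidate {
    int mode;     // e.g. a merge index or intra direction (illustrative)
    double dist;  // distortion estimate (SSD or SATD)
    double rate;  // rate estimate in bits
};

// Return the mode of the candidate with the lowest Lagrangian cost.
int pickBestMode(const std::vector<Candidate>& cands, double lambda) {
    double bestCost = std::numeric_limits<double>::infinity();
    int bestMode = -1;
    for (const Candidate& c : cands) {
        double j = c.dist + lambda * c.rate;  // J = D + lambda*R
        if (j < bestCost) { bestCost = j; bestMode = c.mode; }
    }
    return bestMode;
}
```

The speed knobs the post describes amount to shrinking the candidate list before this loop ever runs, and to using progressively cheaper stand-ins for dist and rate at the earlier sieve stages.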
Usually intra isn't even considered unless no viable candidate has been found with basic motion estimation first, because an intra block is so much more expensive to code. One reason the HM 12.1 can reach higher PSNR than even x265's placebo mode is that it doesn't use as many shortcuts in its full mode. It doesn't bother to sieve out as much; it just tests everything within a specific range. It may take a year to encode a whole movie, but it will generate a more optimal solution for each individual picture. (Without rate control, adaptive GOP, CU-tree, or forward prediction, it does a much worse job at global optimization. But at least it completes in a year, instead of a millennium.) Last edited by foxyshadis; 3rd January 2014 at 03:22. |
24th January 2014, 18:04 | #48 | Link |
Registered User
Join Date: Jan 2013
Location: Santa Clara CA
Posts: 114
|
The encoder must generate the same picture that the receiving decoder will, because that is the picture the decoder will use as reference. If the encoder only used the source picture as reference, then differences would accumulate over time and become very noticeable. This is referred to as error drift.
DCT coefficients get quantized (their precision reduced) during encoding. Since the specification only tells you how to perform inverse quantization, the encoder may perform the forward quantization by any appropriate method. Typically some value is added to the coefficient before the quantization (usually a division of some kind) to allow for rounding. For example, if the scale factor were 10 and a coefficient were 59 before quantization, simply doing 59/10 = 5 seems like a bad result (integer division). So we can add a value: (59+5)/10 = 6. This accounts for the truncation toward 0 during integer division. |
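A minimal sketch of that forward step, with the rounding offset expressed as a fraction of the quantization step (names are illustrative; real encoders fold this into the same fixed-point scale/shift machinery as the inverse path, and this toy handles positive coefficients only):

```cpp
// Forward-quantize one coefficient with a rounding offset given as a
// fraction (offsetNum / offsetDen) of the step: 1/2 is round-to-nearest,
// smaller fractions widen the deadzone around zero.
int forwardQuant(int coeff, int step, int offsetNum, int offsetDen) {
    int offset = step * offsetNum / offsetDen;  // e.g. 10 * 1/2 = 5
    return (coeff + offset) / step;             // integer division truncates toward 0
}
```

With the numbers above, forwardQuant(59, 10, 1, 2) adds 5 and yields 6, while an offset of 0 reproduces the truncated 59/10 = 5.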
26th January 2014, 08:18 | #49 | Link |
Registered User
Join Date: Dec 2013
Posts: 6
|
Thanks Pieter. So do you mean that 5 is the rounding offset? But we could choose 2 or 3 as well, right? Basically, how do we decide which rounding offset to choose? In the original HM code, the rounding offset is constant for all DCT coefficients in a block, but I could not understand what mathematical model is used to choose a particular rounding offset. Can you give me any research paper links on this?
|
26th January 2014, 15:34 | #50 | Link |
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,251
|
It's up to the encoder to choose the "quantized" coefficients, so every encoder may use its own algorithm.
For an explanation of x264's "trellis" algorithm (and also a summary of "uniform deadzones"), please see the description here: http://akuvian.org/src/x264/trellis.txt
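The core idea in that write-up can be caricatured in a few lines: for each coefficient, instead of always rounding, compare the nearest level against the next one down using J = D + lambda*R and keep the cheaper one. Real trellis quantization tracks CABAC context state across all coefficients of the block (a Viterbi search); the rate model and names below are made up purely for illustration:

```cpp
#include <cmath>
#include <cstdlib>

// Crude stand-in for the bit cost of coding a level; a real encoder
// would query its entropy-coder state instead.
double rateBits(int level) {
    return level == 0 ? 0.5 : 2.0 + std::log2(1.0 + std::abs(level));
}

// Pick between the round-to-nearest level and the one below it
// (possibly zero) by Lagrangian cost J = D + lambda * R.
int rdQuantOne(int coeff, int step, double lambda) {
    int mag = std::abs(coeff);
    int hi = (mag + step / 2) / step;  // round-to-nearest level
    int best = hi;
    double bestJ = 1e300;
    for (int lvl = hi; lvl >= 0 && lvl >= hi - 1; --lvl) {
        double d = double(mag - lvl * step);  // reconstruction error
        double j = d * d + lambda * rateBits(lvl);
        if (j < bestJ) { bestJ = j; best = lvl; }
    }
    return coeff < 0 ? -best : best;
}
```

At a small lambda the distortion term wins and 59 at step 10 stays at level 6; crank lambda high enough and the cheaper-to-code level 5 takes over, which is the same distortion-for-bits trade the full trellis search makes block-wide.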
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ Last edited by LoRd_MuldeR; 26th January 2014 at 15:49. |
6th February 2014, 16:56 | #53 | Link |
Registered User
Join Date: Jan 2014
Posts: 2
|
Using HEVC with OpenEXR
First of all, thanks a lot, Pieter3d, for your nice explanation.
I want to use HEVC to compress a set of OpenEXR files I have. OpenEXR is a high-dynamic-range (HDR) image format. These files are 32-bit with 30 channels, i.e. an image has 30 different spectral (color) channels and a pixel in each channel is stored in 32 bits. I consider these images as frames in a 'video' and then want to encode this video using HEVC, so I can benefit from its high compression ratio. Next I want random access into this OpenEXR-HEVC-coded "video", so that I can quickly and easily read any pixel of any image in my original dataset. Obviously the pixels' bit depth and the color space of my images differ from what HEVC supports by default. As I have no experience in this field, I cannot yet tell whether there is a theoretical barrier to what I want to do, or whether it is possible to extend, say, the x265 implementation to read and encode my 'video'. Preferably I want to do this as simply as possible: add support for reading my input format plus some tweaks and changes of values here and there, not rewriting the whole encoder. Now I'm asking you, the experts, whether you see such a barrier. Any comments would be greatly appreciated. |
6th February 2014, 20:52 | #56 | Link |
Registered User
Join Date: Jan 2013
Location: Santa Clara CA
Posts: 114
|
Well, many of the coding tools in HEVC are designed for 3-component YUV (1 luma channel and 2 chroma channels). Also, the quantization operations, transforms, and motion filters are not at all designed for crazy bitdepths like 32. The current draft range extension to HEVC goes up to 12 bits per channel, and still the same 3 channels, although it does include 4:4:4, which means no chroma sub-sampling.
Your data, with 30 channels at 32 bits each, will need special purpose-built tools to compress effectively. |
12th February 2014, 04:40 | #57 | Link |
Registered User
Join Date: Feb 2014
Posts: 4
|
Hi, sorry if this is the wrong place to ask, but here goes. I have a question about the code which might be stupid, though I'm completely stuck. Say I have a 32x32 CU block; how can I find the pixels of that block from the code? I'm finding the block using:
TComPic*    pcPicTex   = pcCU->getSlice()->getTexturePic();
TComDataCU* pcColTexCU = pcPicTex->getCU( pcCU->getAddr() );
This is taken from TEncSearch.cpp of 3D-HEVC, though my question stands for any CU block. Any help would be greatly appreciated. Thanks |
12th February 2014, 06:39 | #59 | Link |
Registered User
Join Date: Feb 2014
Posts: 4
|
Hi, this is for the encoder. I want to try to implement a simple 1-D filter on each side of the block, so I need the values of the pixels in that block in order to apply this filter to its top, bottom, left and right edges. Thanks
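HM keeps the samples a few pointers deep (the TComPicYuv planes are reached from a TComPic, e.g. via getPicYuvRec()/getPicYuvOrg(), rather than from TComDataCU itself), but the addressing is ordinary stride arithmetic. A self-contained sketch with made-up names, locating a block in a flat luma buffer and running a 3-tap smoothing filter along its top edge:

```cpp
#include <vector>
#include <cstdint>

typedef int16_t Pel;  // HM likewise stores samples in a 16-bit Pel type

// Apply a [1 2 1]/4 smoothing filter along the top row of the
// size x size block whose top-left sample sits at (x0, y0) in a frame
// stored as one flat array with the given stride. Only interior
// samples are touched, so every filter tap stays inside the block.
void filterTopEdge(std::vector<Pel>& frame, int stride,
                   int x0, int y0, int size) {
    Pel* row = &frame[y0 * stride + x0];    // top-left sample of the block
    std::vector<Pel> src(row, row + size);  // copy so taps read unfiltered data
    for (int x = 1; x + 1 < size; ++x) {
        row[x] = (Pel)((src[x - 1] + 2 * src[x] + src[x + 1] + 2) >> 2);
    }
}
```

The other three edges are the same idea: the bottom row starts at (y0 + size - 1) * stride + x0, and the left/right columns step by stride instead of 1.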
|