HM 14.0 WaveFront Processing (WPP)


rudyb
9th November 2014, 03:01
Hi,

Can anyone give a very high-level explanation of what WPP does in HM for encoding? And how much does it actually improve the timing?

I read JCTVC-F274, which states:
"The average decoding time compared to anchors (sequential HM3.0) on larger sequences (classes A,B,E) is 55% with 2 decoding threads and 33% with 4 decoding threads"

I am assuming the same should be applicable for the encoder too, correct?
Does this really mean that it improves the timing by a factor of 2?
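
As a quick sanity check on the arithmetic only (a sketch, not taken from JCTVC-F274 or from HM; the helper name speedupFromTimeRatio is made up for illustration): a run time that is 55% of the anchor corresponds to a speedup of 1 / 0.55, which is a bit less than a factor of 2.

```cpp
// Hypothetical helper, not part of HM: converts "time relative to the
// sequential anchor" (e.g. 0.55 for 55%) into a speedup factor.
constexpr double speedupFromTimeRatio(double timeRatio)
{
    return 1.0 / timeRatio;
}
```

With the quoted figures, speedupFromTimeRatio(0.55) is roughly 1.8 and speedupFromTimeRatio(0.33) is roughly 3.0. Note that the quote is about decoding time, so this says nothing by itself about the encoder.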

In the encoder, I can actually set the parameter WaveFrontSynchro.

But what does this mean? For example, I set it to different values:

0, 1, 2, 5, 8, 20. When I looked at the total processing time reported at the end, the elapsed time was almost the same for every value I tried!

But when I looked at the actual file size, I noticed the following:
WaveFrontSynchro: 0 => filesize = 8922 Bytes
WaveFrontSynchro: 1 => filesize = 8931 Bytes
WaveFrontSynchro: 2 => filesize = 8931 Bytes
WaveFrontSynchro: 5 => filesize = 8931 Bytes
WaveFrontSynchro: 8 => filesize = 8931 Bytes

I can see a minor filesize increase for WaveFrontSynchro = 1, but what exactly would the timing improvement be?
Wouldn't this mean that if the expected run time was about 55% of the anchor, I should have seen a different elapsed time reported at the end?

And how come I don't see any filesize difference for the other values (2, 5, 8 and 20)?
What is happening in the HM code?

Does this mean that WPP doesn't actually work for values other than 1?

Any hint and explanation would be appreciated.

Thanks,
--Rudy

LoRd_MuldeR
9th November 2014, 03:37
A high level explanation of Wavefront Parallel Processing can be found here:
* https://sites.google.com/site/hevcwppp/
* http://www.hhi.fraunhofer.de/fields-of-competence/image-processing/research-groups/multimedia-communications/wavefronts-for-hevc-parallelism.html

And a nice illustration is available here:
http://www.parabolaresearch.com/blog/2013-12-01-hevc-wavefront-animation.html

rudyb
9th November 2014, 17:23
Thanks for the links. Those were really helpful to see what is going on. But in terms of HM, any idea how exactly that works?

Am I dialing the WPP setting correctly?
No matter what I set the WaveFrontSynchro parameter to, I get roughly the same result.
I think part of it is because even though I change WaveFrontSynchro, I don't see any difference in the number of substreams.

I think WaveFrontSynchro doesn't do what I expected it to do!

Looking inside the source code:
m_iWaveFrontSubstreams = m_iWaveFrontSynchro ? (m_iSourceHeight + m_uiMaxCUHeight - 1) / m_uiMaxCUHeight : 1;

This tells me that WaveFrontSynchro doesn't really act as an integer, but rather as a Boolean.

And that explains the result I got. When I set WaveFrontSynchro to 1, 2, 5, 8 or 20, the result doesn't change, since in my case m_iWaveFrontSubstreams always turns out to be 5 (my m_iSourceHeight is 288, and with an m_uiMaxCUHeight of 64 the formula gives (288 + 63) / 64 = 5).
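
To make the Boolean behaviour concrete, here is a minimal standalone sketch of the expression quoted above. The function name waveFrontSubstreams and the plain int parameters are made up for illustration and are not HM's API; the values 288 and 64 are assumptions (a 288-line source and a 64-sample CTU height) chosen because they reproduce the 5 substreams mentioned above.

```cpp
// Hypothetical standalone version of the quoted HM expression.
// The function and parameter names are illustrative, not HM's own.
constexpr int waveFrontSubstreams(int waveFrontSynchro, int sourceHeight, int maxCUHeight)
{
    // Any non-zero value enables WPP: one substream per row of CTUs,
    // computed as a ceiling division. Zero disables it: a single substream.
    return waveFrontSynchro ? (sourceHeight + maxCUHeight - 1) / maxCUHeight : 1;
}
```

Under those assumptions, waveFrontSubstreams(0, 288, 64) gives 1, while waveFrontSubstreams(1, 288, 64) and waveFrontSubstreams(20, 288, 64) both give 5, so only zero versus non-zero matters.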

My guess is that my stream is being broken down into 5 substreams. But does this mean HM doesn't allow me to set the number of substreams myself?
Because based on the formula above, no matter what I set the WaveFrontSynchro parameter to, m_iWaveFrontSubstreams will always be either 1 or 5!
So how would I set it to only two substreams?
By the way, shouldn't I expect to see some difference in the total elapsed time reported by HM between 1 and 5 substreams?

Does this suggest that WPP is not really working in HM?

Thanks,
--Rudy

LoRd_MuldeR
9th November 2014, 17:50
Well, I'm not an expert on HM, but the manual says:
WaveFrontSynchro (Default: False)
Enables the use of specific CABAC probabilities synchronization at the beginning of each line of CTBs in order to produce a bitstream that can be encoded or decoded using one or more cores.

So it looks like this is a boolean parameter, i.e. "0" means "False" (Disabled) and anything other than "0" means "True" (Enabled). That would also be in accordance with your code snippet ;)

To my understanding, the whole point of WPP in HEVC is to allow the rows of CTBs to be processed in parallel, including the entropy coding part (CABAC). This is achieved by starting each row of CTBs not with the entropy coder's state from the end of the preceding row (which would require the preceding row to be completed first!), but with the state saved after the second CTB of the row above, i.e. the top-right neighbor of the row's first CTB (which means a row can start as soon as the row above is two CTBs ahead, so multiple rows can be coded in parallel). This is probably exactly what the "WaveFrontSynchro" option enables.

And I assume starting each row with the state inherited from the row above, rather than with the state of its exact predecessor block, gives slightly worse compression, which would explain your results...
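
To illustrate the dependency pattern described above, here is a rough sketch of an idealized wavefront schedule. It is not HM code: the function names are made up, and it assumes every CTB takes exactly one time step, one thread is available per CTB row, and each row has at least two CTBs. The 5 x 6 grid mentioned below is a hypothetical example.

```cpp
// Idealized WPP schedule (an assumption-laden model, not HM's implementation).
// A CTB at (row, col) may start once its left neighbor and the CTB at
// (row - 1, col + 1) are finished, so each row lags the row above by two CTBs.

// Earliest time step at which the CTB at (row, col) can start.
constexpr int wavefrontStartStep(int row, int col)
{
    return col + 2 * row;
}

// Total time steps with one thread per row (valid for cols >= 2).
constexpr int wavefrontTotalSteps(int rows, int cols)
{
    return cols + 2 * (rows - 1);
}

// Total time steps when all CTBs are processed one after another.
constexpr int sequentialTotalSteps(int rows, int cols)
{
    return rows * cols;
}
```

For a hypothetical picture of 5 rows by 6 columns of CTBs, sequentialTotalSteps(5, 6) is 30 while wavefrontTotalSteps(5, 6) is 14. That gain only materializes if the rows are actually run on separate threads; setting the flag by itself only changes how the bitstream is written.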