Doom9's Forum > Video Encoding > High Efficiency Video Coding (HEVC)

Old 30th November 2013, 22:18   #221  |  Link
x265_Project
Registered User
 
Join Date: Jul 2013
Posts: 596
Quote:
Originally Posted by LoRd_MuldeR View Post
The smallest addressable unit of memory (aka "Byte") is 8-Bit on pretty much any modern computer. So if you store a value in memory, you need to use at least one byte (8-Bit). Even booleans usually take one byte! But if the data is bigger than one byte, you will need to use two bytes (16-Bit) to store the value, even if the value is only 10, 12 or 14 bits in size. The "unused" bits will usually be padded with zeros. For values bigger than 16-Bit you'd use 24-Bit or even 32-Bit. And so on...

Surely, one could "pack", for example, four 10-Bit values into 5 consecutive bytes in order to eliminate that overhead. But then these values won't be directly addressable anymore. You would need some bit-operation magic to store/read your values, which is usually too complex and too slow. So, most of the time, you simply accept the overhead of storing a 10-Bit (12-Bit, 14-Bit) value in a 16-Bit variable.
This is a very good explanation. To this I would add that using the word "storing" (as procrastinating did in the original question) might be a bit misleading. Yes, we are storing these 10, 12 or 14 bit samples in 16 bit values, but only during the few milliseconds that we are processing those video samples in a CPU or GPU.
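
To make the trade-off in the quote concrete, here is a minimal Python sketch of both layouts. The function names (pack_10bit, unpack_10bit, store_16bit) are made up for illustration and the bit order is an arbitrary choice; this is not how x265 or any particular file format lays out samples.

```python
def pack_10bit(samples):
    """Pack four 10-bit samples (0..1023) into 5 bytes, first sample in the top bits."""
    if len(samples) != 4 or any(not 0 <= s <= 1023 for s in samples):
        raise ValueError("expected four values in the range 0..1023")
    word = 0
    for s in samples:
        word = (word << 10) | s          # append 10 bits to a 40-bit integer
    return word.to_bytes(5, "big")


def unpack_10bit(packed):
    """Recover the four samples; every read needs a shift and a mask."""
    if len(packed) != 5:
        raise ValueError("expected exactly 5 bytes")
    word = int.from_bytes(packed, "big")
    return [(word >> shift) & 0x3FF for shift in (30, 20, 10, 0)]


def store_16bit(samples):
    """Store each sample in its own 16-bit word; the upper 6 bits stay zero."""
    return b"".join(s.to_bytes(2, "little") for s in samples)


samples = [1023, 0, 512, 1]
packed = pack_10bit(samples)             # 5 bytes, no padding bits
padded = store_16bit(samples)            # 8 bytes, directly addressable
```

The packed form saves 3 bytes per four samples, but no sample starts on a byte boundary any more, which is the "bit-operation magic" cost described in the quote.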

All modern video encoding standards use some form of lossless data compression (entropy coding), which converts symbols into the final bitstream. HEVC uses context-adaptive binary arithmetic coding (CABAC) to losslessly compress syntax elements into encoded bits. AVC uses CABAC or CAVLC. So, when the video is stored on a hard drive or other storage medium, we don't end up with all of those extra zeros padding our samples.

But when we are encoding or decoding video, we need to process uncompressed video samples with standard CPUs or GPUs, and for this we need to move and perform operations on the uncompressed data in standard-sized units (data words). Computer processors can process data in 8-bit, 16-bit and 32-bit words, and many can also handle 64-bit or larger words. We can also process multiple words of data with a single instruction (this is called Single Instruction, Multiple Data, or SIMD). For example, we can pack eight 8-bit data words into a single 64-bit register and perform an operation on all eight with one instruction.

When we are processing 10-bit video, we have to use 16-bit words to move and operate on these 10-bit values, so only four samples fit in that same 64-bit register and we can only process half as many samples per clock cycle. This is why we see a big performance penalty the moment we start trying to process video that has more than 8 bits/sample.
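
The arithmetic behind the "half as many" figure can be sketched in a few lines of Python. The helper names are hypothetical, and this only counts how many samples fit in one register; real throughput also depends on the instruction set and memory bandwidth.

```python
def container_bits(sample_bits):
    """Smallest standard word size (8, 16, 32 or 64 bits) that holds one sample."""
    for width in (8, 16, 32, 64):
        if sample_bits <= width:
            return width
    raise ValueError("sample does not fit in a 64-bit word")


def samples_per_register(register_bits, sample_bits):
    """How many samples one SIMD register holds once each sits in a standard word."""
    return register_bits // container_bits(sample_bits)


lanes_8bit = samples_per_register(64, 8)     # eight 8-bit samples per 64-bit register
lanes_10bit = samples_per_register(64, 10)   # 10-bit samples occupy 16-bit words
```

Note that 12-bit and 14-bit samples land in the same 16-bit words as 10-bit samples, so by this count they pay the same penalty.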

So, again... when stored in a video file, thanks to entropy encoding (CABAC, CAVLC, etc.), 10 bit/sample video is not twice the size of 8 bit/sample video. But when we are encoding, decoding or performing any intermediate processing (scaling, color space conversion, frame rate conversion, etc.) on 10 bit/sample video, we will see a big drop in performance on standard off-the-shelf CPU and GPU hardware (versus 8 bit/sample video). Of course, if you are designing an Application Specific Integrated Circuit (a hardware encoder/decoder/video processor), you can design it to operate on data of any width necessary, and so you aren't faced with the same limitation.

I hope this helps.

Tom
Old 1st December 2013, 13:38   #222  |  Link
Procrastinating
Registered User
 
Join Date: Aug 2013
Posts: 71
Reading into that, I wonder if recent advancements in FPGA technology and OpenCL could allow for the development of cost-effective ad hoc H.265 encoders/decoders with the DSP capability of a software application.
Old 2nd December 2013, 18:22   #223  |  Link
x265_Project
Registered User
 
Join Date: Jul 2013
Posts: 596
Quote:
Originally Posted by Procrastinating View Post
Reading into that, I wonder if recent advancements in FPGA technology and OpenCL could allow for the development of cost-effective ad hoc H.265 encoders/decoders with the DSP capability of a software application.
There are many good possibilities when it comes to accelerating HEVC encoding.
Old 2nd December 2013, 19:18   #224  |  Link
easyfab
Registered User
 
Join Date: Jan 2002
Posts: 327
@x265_Project

If possible, could we have a more precise SSIM number (4 or 5 decimal places), or perhaps a dB value like in x264?
Old 2nd December 2013, 23:41   #225  |  Link
x265_Project
Registered User
 
Join Date: Jul 2013
Posts: 596
Quote:
Originally Posted by easyfab View Post
@x265_Project

If possible, could we have a more precise SSIM number (4 or 5 decimal places), or perhaps a dB value like in x264?
My initial reaction is that this would be an easy change if it only involves the reporting of the number and not the calculation. I developed a similar patch (fixing the SSIM reporting in the log file) over the weekend.
Code:
diff -r 833d78aaf71e source/encoder/encoder.cpp
--- a/source/encoder/encoder.cpp Fri Nov 29 16:40:42 2013 +0530
+++ b/source/encoder/encoder.cpp Fri Nov 29 15:49:01 2013 -0800
@@ -570,7 +570,7 @@
         else
             fprintf(m_csvfpt, " -, -, -, -,");
         if (param.bEnableSsim)
-            fprintf(m_csvfpt, " %.2f,", stats.globalSsim);
+            fprintf(m_csvfpt, " %.3f,", stats.globalSsim);
         else
             fprintf(m_csvfpt, " -,");
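
On the dB figure easyfab mentions: as far as I recall, x264 derives it from the plain SSIM value as -10*log10(1 - SSIM), capped at 100 dB. A minimal Python sketch of that conversion follows; the function name is made up, and the formula is written from memory rather than copied from the x264 source, so treat it as an assumption.

```python
import math


def ssim_to_db(ssim):
    """Convert an SSIM score (at most 1.0) to dB, assuming the x264-style formula."""
    inv = 1.0 - ssim
    if inv <= 1e-10:              # identical pictures: cap instead of log10(0)
        return 100.0
    return -10.0 * math.log10(inv)


# Two scores that look identical at two decimal places
low, high = 0.980, 0.984
two_dp = ("%.2f" % low, "%.2f" % high)       # both round to "0.98"
three_dp = ("%.3f" % low, "%.3f" % high)     # "0.980" and "0.984"
db = (ssim_to_db(low), ssim_to_db(high))     # roughly 17.0 dB and 18.0 dB
```

This also shows why extra decimal places matter near the top of the scale: a difference that vanishes at two decimals is about one dB on the logarithmic scale.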
Tom
Old 3rd December 2013, 08:58   #226  |  Link
easyfab
Registered User
 
Join Date: Jan 2002
Posts: 327
Thanks Tom, that's what I want.

For my information: are there several possible ways to calculate it? It's a standard formula, right?
Old 3rd December 2013, 09:12   #227  |  Link
mandarinka
Registered User
 
Join Date: Jan 2007
Posts: 737
For comparing different encoders, you should probably use a standalone tool.
For example, x264 IIRC took some shortcuts during the calculation (not deblocking non-reference B-frames, or something like that), and it is possible x265 does something similar. Using a standalone tool gets rid of such discrepancies in any case.
Old 3rd December 2013, 22:12   #228  |  Link
fumoffu
Registered User
 
Join Date: May 2013
Posts: 90
I was wondering, are there any plans for adjustable deblock strength and threshold like in x264?
Old 3rd December 2013, 22:51   #229  |  Link
x265_Project
Registered User
 
Join Date: Jul 2013
Posts: 596
Version 0.6 released

Release Notes...

x265 0.6 is a regularly scheduled release

There have been large improvements in compression efficiency since 0.5, mostly a result of the completion of weightp and b-pyramid. There is also a large amount of new assembly code, replacing most of the compiler intrinsic functions and adding coverage for some new primitives.


= New Features =
* CLI reads input video from stdin
* Main10 profile is enabled, requires a HIGH_BIT_DEPTH build
* weightp is now complete enough to be enabled by default
* performance presets have been defined, matching x264 preset names
* b-pyramid (hierarchical B frames) now supported
* Constant Rate Factor rate control is considered stable
* Adaptive Quantization introduced (experimental)

Adaptive Quantization is still considered experimental. We are not always seeing the expected improvements to SSIM when it is enabled, and thus it is still not enabled by default.


= API Changes =
* x265_nal data members renamed
* x265_picture now has colorSpace member
* --weightp enabled by default
* default parameters now match our medium preset
* new x265_param_default_preset() method for assigning preset and tune
* new x265_param_alloc() and x265_param_free() methods for version safety
* new x265_picture_alloc() and x265_picture_free() methods for version safety

The public data structures have changed enough that apps compiled against previous versions of x265 must be recompiled to use x265 0.6. We are taking steps to add version safety to the public interface. If you use the new alloc/free methods for the param and picture structures, and use x265_param_parse() to set param values by name, you will likely not have to recompile your application to dynamically link against later releases of x265.


= New Command Options =
* --y4m overrides detection of Y4M input stream, ex: x265 --y4m - out.hevc < vid.y4m
* --version long option alias for -V
* -p/--preset sets performance preset
* -t/--tune sets parameter tuning
* --[no-]b-pyramid enabled by default
* --input-csp color space parameter, only i420 is supported in this release
* --crf constant rate factor rate control
* --aq-mode and --aq-strength

See x265 --help for more details


= Upcoming improvements =
* motion compensated weightp analysis (using lookahead data)
* CU-tree (MBtree adapted from x264)
* VBV rate control
* assembly for HIGH_BIT_DEPTH builds
Old 3rd December 2013, 23:43   #230  |  Link
foxyshadis
ангел смерти
 
Join Date: Nov 2004
Location: Lost
Posts: 9,449
Now this is good news. The rate of change has been incredible over the past few months, especially in getting assembly written, and I've seen the patches coming for CUtree already. Great work! I'll probably take this and see how it performs on a real video soon.
__________________
There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order.
Old 4th December 2013, 01:09   #231  |  Link
x265_Project
Registered User
 
Join Date: Jul 2013
Posts: 596
Quote:
Originally Posted by foxyshadis View Post
Now this is good news. The rate of change has been incredible over the past few months, especially in getting assembly written, and I've seen the patches coming for CUtree already. Great work! I'll probably take this and see how it performs on a real video soon.
Thanks Foxyshadis. The development team has definitely ramped up to a good pace, and we're very pleased with the results of their hard work. Still lots more work to do, but I think we're on track.

Tom
Old 4th December 2013, 03:28   #232  |  Link
fumoffu
Registered User
 
Join Date: May 2013
Posts: 90
By the way, one change in 0.6 that the release notes don't mention:
--rd can now take values from 1 to 6 (the previous value 2 now corresponds to 5 and 6).

From my tests, Adaptive Quantization works correctly with rd>=4.

Last edited by fumoffu; 6th December 2013 at 23:09.
Old 4th December 2013, 12:55   #233  |  Link
benwaggoner
Moderator
 
Join Date: Jan 2006
Location: Portland, OR
Posts: 3,071
Quote:
Originally Posted by x265_Project View Post
Release Notes...

x265 0.6 is a regularly scheduled release

* Adaptive Quantization introduced (experimental)

Adaptive Quantization is still considered experimental. We are not always seeing the expected improvements to SSIM when it is enabled, and thus it is still not enabled by default.
If your AQ is based on the one from x264, it is considerably more advanced and subjectively correlated than SSIM. If it is on, you will probably see a bigger gap between SSIM and PSNR. But the only really relevant way to check is subjective comparison.

From my experimenting in the last couple of weeks, it did seem to provide a significant visual improvement, although I wasn't doing comprehensive testing.
__________________
Ben Waggoner
Principal Video Specialist, Amazon Prime Video

My Compression Book
Old 4th December 2013, 13:05   #234  |  Link
LigH
German doom9/Gleitz SuMo
 
Join Date: Oct 2001
Location: Germany, rural Altmark
Posts: 6,004
SSIM is not able to measure a subjective improvement. So don't be sad about not getting a perfect score on a technical similarity metric; the metric is the less reliable value compared to an ABX test with hundreds of test subjects.
__________________

New German Gleitz board
MediaFire: x264 | x265 | VPx | AOM | Xvid
Old 4th December 2013, 15:10   #235  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,511
I'm having some difficulty using x265. I tried a 10 Bit encode using a build from x265.cc (64bit, 16bpp, 1d2d60f4eb81, mingw). Here's the command-line I used:
x265 - --input-res 1920x800 --fps 24000/1001 --input-depth 10 --crf 18 --aq-mode 1 -o test2.h265 --preset slow
Sample, log

1. I can't mux it using either l-smash or mp4box. Both suspect corruption.
2. It came out at only ~11 Mbytes for 2684 frames with horrible quality. I know I can decrease --crf further but is this really the quality expected for crf 18?
3. The log says "yuv [info]: 1920x800 24000Hz[...]". Should I have used "23.976" instead of "24000/1001"?

Last edited by sneaker_ger; 4th December 2013 at 15:19.
Old 4th December 2013, 15:25   #236  |  Link
Kurtnoise
Swallowed in the Sea
 
Join Date: Oct 2002
Location: Aix-en-Provence, France
Posts: 5,183
Quote:
Originally Posted by sneaker_ger View Post
1. I can't mux it using either l-smash or mp4box. Both suspect corruption.
With mp4box, you have to either use the hevc/hvc/265 file extension or add :FMT=HEVC to the input on your command line.

So, either
Code:
mp4box -add input.hevc output.mp4
or
Code:
mp4box -add input.h265:FMT=HEVC output.mp4
Old 4th December 2013, 15:32   #237  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,511
Thx, that does indeed fix 1.
Old 4th December 2013, 19:23   #238  |  Link
fumoffu
Registered User
 
Join Date: May 2013
Posts: 90
Quote:
Originally Posted by sneaker_ger View Post
2. It came out at only ~11 Mbytes for 2684 frames with horrible quality. I know I can decrease --crf further but is this really the quality expected for crf 18?
3. The log says "yuv [info]: 1920x800 24000Hz[...]". Should I have used "23.976" instead of "24000/1001"?
I'm pretty sure those two are related: if x265 thinks the video is meant to be played at 24000 fps ;-) it compresses the motion much, much more than usual, so everything looks like crap if you play it back at only 23.976 fps...
Old 4th December 2013, 19:32   #239  |  Link
sneaker_ger
Registered User
 
Join Date: Dec 2002
Posts: 5,511
I will test, but it sounds reasonable. I was totally blind not to notice that 24000 != 24.000. Now, when using "--fps 23.976", the log says "23Hz". I wonder if it just clips the value in the log or if the output might be wrong as well (if there are timings in the raw stream at all).
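
Both log values ("24000Hz" for "24000/1001" and "23Hz" for "23.976") are what you would get if the option string were read as a plain integer that stops at the first non-digit character. That is only a guess consistent with the two observations, not something confirmed from the x265 source. A small Python sketch with hypothetical helper names, comparing that behaviour with exact rational parsing:

```python
import re
from fractions import Fraction


def leading_int(text):
    """Imitate C's atoi(): read leading digits, stop at the first other character."""
    match = re.match(r"\s*([+-]?\d+)", text)
    return int(match.group(1)) if match else 0


def parse_fps(text):
    """Parse '24000/1001', '23.976' or '24' into an exact rational frame rate."""
    return Fraction(text)


truncated = [leading_int(s) for s in ("24000/1001", "23.976")]   # [24000, 23]
exact = parse_fps("24000/1001")                                  # Fraction(24000, 1001)
```

If that guess is right, neither spelling reaches the encoder as 23.976 fps in this build, and anything derived from the frame rate would be off for both.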
Old 4th December 2013, 19:38   #240  |  Link
Kurosu
Registered User
 
Join Date: Sep 2002
Location: France
Posts: 426
Quote:
Originally Posted by fumoffu View Post
--max-merge - what are we merging? I'm guessing partitions but I would love to know more how and when this works.
Partitions marked as merge copy the motion of a neighbouring partition as is (including one scaled from the collocated block in another frame). The different candidate motions are indexed in a specific order. Basically, this is a means of declaring an area that has the same motion.
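
A heavily simplified Python sketch of that idea: candidate motions are collected in a fixed order, duplicates and unavailable neighbours are skipped, and a merged partition then only needs to signal an index into the list. The names and the (mv_x, mv_y, ref_idx) tuple layout are made up for illustration, the real HEVC derivation has more candidate types and rules, and I am assuming --max-merge simply caps the length of this list.

```python
ZERO_MOTION = (0, 0, 0)   # (mv_x, mv_y, ref_idx), hypothetical layout


def build_merge_list(neighbours, max_merge):
    """Collect up to max_merge distinct candidate motions in scan order."""
    candidates = []
    for motion in neighbours:
        if len(candidates) == max_merge:
            break
        if motion is None:            # neighbour unavailable or intra coded
            continue
        if motion in candidates:      # same motion already listed
            continue
        candidates.append(motion)
    while len(candidates) < max_merge:
        candidates.append(ZERO_MOTION)    # pad so every index stays valid
    return candidates


def merge_index(candidates, wanted):
    """Index to signal if the wanted motion is a merge candidate, else None."""
    return candidates.index(wanted) if wanted in candidates else None


neighbours = [(4, -2, 0), None, (4, -2, 0), (1, 0, 1)]
merge_list = build_merge_list(neighbours, 3)
```

Here the partition copying (1, 0, 1) would signal index 1 instead of coding a motion vector and reference index of its own.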

Quote:
--early-skip - what are we skipping?
Probably a fast termination of the best partitioning/search if the best encoded mode for the CU is skip (akin to merged 2Nx2N partition without residual)

Quote:
--fast-cbf - what does cbf stand for again?
Coded block flag: whether there is any residual in a transform block. I guess this is a fast decision, i.e. no need to test smaller transform sizes if there are already a lot of not-coded transform blocks.
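
As a tiny illustration of the flag itself (not of the --fast-cbf decision, which is only guessed at above), a transform block is "coded" when at least one quantized coefficient is non-zero. A Python sketch with a made-up function name:

```python
def coded_block_flag(quantized_coeffs):
    """True if the transform block has any non-zero quantized coefficient."""
    return any(c != 0 for row in quantized_coeffs for c in row)


all_zero = [[0, 0], [0, 0]]      # residual quantized away entirely
has_residual = [[3, 0], [0, 0]]  # one surviving coefficient
flags = (coded_block_flag(all_zero), coded_block_flag(has_residual))
```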