Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion.

Old 14th July 2011, 18:11   #1  |  Link
patrick_
Registered User
 
Join Date: Jan 2007
Posts: 85
FFDShow with H.264 10bit support

I searched the web, but had a hard time finding an FFDShow build that supports 10-bit H.264. I finally found one included in CCCP. I hate codec packs, so I decided to replace the files of the default FFDShow installer with the ones from CCCP. I've only tested it with a single 10-bit file using MPC-HC*, but it worked.

Download FFDShow rev 3925 with H.264 10-bit support (BETA)
http://www.filesonic.com/file/144456...0711-10bit.exe

*if you use MPC-HC, don't forget to deactivate the internal filters.
Old 14th July 2011, 19:44   #2  |  Link
Hypernova
Registered User
 
Join Date: Feb 2006
Posts: 293
Thanks. I was looking for that also. I don't mind CCCP compared to bigger codec packs like K-Lite, but this is still nice.
__________________
Spec: Intel Core i5-3570K, 8g ram, Intel HD4000, Samsung U28D590 4k monitor+1080p Projector, Windows 10.
Old 14th July 2011, 20:24   #3  |  Link
clsid
Registered User
 
Join Date: Feb 2005
Posts: 5,010
Those builds are alpha quality and not stable, so they should be used with caution. For example, 9-bit H.264 will just crash.

Btw, K-Lite has small variants as well. Its latest beta version even has Blu-ray playback capability.
Old 14th July 2011, 20:58   #4  |  Link
JanWillem32
Registered User
 
 
Join Date: Oct 2010
Location: The Netherlands
Posts: 1,084
I know for a fact that quite a few people are working on this. I absolutely can't recommend usage of the 10-bit system yet with MPC-HC, until:
1.- A real ffdshow beta comes out that can output at least one of the recommended 10- or 16-bit formats for DirectShow to the player: http://msdn.microsoft.com/en-us/libr...=VS.85%29.aspx .
(Currently all output is rounded to 8-bit formats, so there's no gain by the 10-bit precision yet.)
2.- At least one of the internal renderers has been updated to become able to take in and mix the 10- or 16-bit formats.
3.- At least some quality and reliability testing has been done.
__________________
development folder, containing MPC-HC experimental tester builds, pixel shaders and more: http://www.mediafire.com/?xwsoo403c53hv
Old 14th July 2011, 21:15   #5  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,689
Quote:
Originally Posted by JanWillem32 View Post
I know for a fact that quite a few people are working on this. I absolutely can't recommend usage of the 10-bit system yet with MPC-HC, until:
1.- A real ffdshow beta comes out that can output at least one of the recommended 10- or 16-bit formats for DirectShow to the player: http://msdn.microsoft.com/en-us/libr...=VS.85%29.aspx .
(Currently all output is rounded to 8-bit formats, so there's no gain by the 10-bit precision yet.)
This is completely false. 10-bit display support isn't merely unnecessary; it's beyond useless.
Old 14th July 2011, 21:47   #6  |  Link
patrick_
Registered User
 
Join Date: Jan 2007
Posts: 85
JanWillem32, using 10 bits for compression doesn't have anything to do with 10-bit output. If it worked that way, you would need a 10-bit display as well.

clsid, I updated the installer primarily to allow x264 users to test/compare 8-bit and 10-bit output.
Old 14th July 2011, 21:51   #7  |  Link
JanWillem32
Registered User
 
 
Join Date: Oct 2010
Location: The Netherlands
Posts: 1,084
10-bit display support is something entirely different than 10-bit input to the mixer.
A 10-bit bt.601 or bt.709 encode uses limited-range Y'CbCr (luma limited to [64, 940], chroma to [64, 960]), usually with chroma sub-sampling.
A 10-bit display has a full-range [0, 1023] RGB display matrix.

In the link I gave, there's a list with the recommended replacement formats for the regular 8-bit YV12, I420/IYUV, NV12 and AYUV formats that do support more than 8-bit to feed to the mixer.

There's a lot going on in between to get the input format mixed and rendered to the output display. Even if the output of the display is just 8-bit RGB, there's no doubt that quality during mixing and rendering suffers if the input Y'CbCr format is rounded from 10- to 8-bit before even the mixer and renderer can receive the image.

By the way, there's plenty of scientific basis to why the Digital Cinema Initiative set 12-bit as a minimum requirement for both the encoding format and display capability for licensing.
There's also plenty of reasons why the studio formats store XYZ color data in a 32-bit floating-point format.

Last edited by JanWillem32; 14th July 2011 at 21:54.
Old 14th July 2011, 22:01   #8  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,689
Quote:
Originally Posted by JanWillem32 View Post
10-bit display support is something entirely different than 10-bit input to the mixer.
A 10-bit bt.601 or bt.709 encode uses limited-range Y'CbCr (luma limited to [64, 940], chroma to [64, 960]), usually with chroma sub-sampling.
A 10-bit display has a full-range [0, 1023] RGB display matrix.

In the link I gave, there's a list with the recommended replacement formats for the regular 8-bit YV12, I420/IYUV, NV12 and AYUV formats that do support more than 8-bit to feed to the mixer.

There's a lot going on in between to get the input format mixed and rendered to the output display. Even if the output of the display is just 8-bit RGB, there's no doubt that quality during mixing and rendering suffers if the input Y'CbCr format is rounded from 10- to 8-bit before even the mixer and renderer can receive the image.
It isn't rounded, it's dithered. There's a slight, but critical difference.

The effect of your display on the effectiveness of 10-bit is negligible. A 6-bit $50 LCD benefits from 10-bit just as much as the world's most expensive IPS monitor because 10-bit is about internal codec precision, not output precision.
Old 15th July 2011, 00:33   #9  |  Link
JanWillem32
Registered User
 
 
Join Date: Oct 2010
Location: The Netherlands
Posts: 1,084
How awful to use the dithering compromise right at the start of a rendering instance. I'm having enough trouble performing convolutions and other filter passes on inputs full of synthetic noise as it is.
I guess the dithering is something less than a structured 128×128 dithering-map lookup for every channel, too?
If it's about internal codec precision, please make the decoder 32-bit floating point or better on output, so I don't have to static cast every element of the mixer input to 32-bit floating-point anymore. I can handle pretty much any input quantization, only the double precision for the color management section can be a bit intense to process.
What comes out of the renderer takes at least 7 conversion passes from the decoder's output to the back buffer of the allocator-presenter. What goes on screen can hardly be called raw compared to what the decoder outputs. Depending on settings, things can look very bad; for instance, a low gamma setting of 2.0 to 2.2 is murder on darker scenes with most consumer-grade video.
Old 15th July 2011, 00:55   #10  |  Link
LoRd_MuldeR
Software Developer
 
 
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,047
Quote:
Originally Posted by JanWillem32 View Post
If it's about internal codec precision, please make the decoder 32-bit floating point or better on output, so I don't have to static cast every element of the mixer input to 32-bit floating-point anymore. I can handle pretty much any input quantization, only the double precision for the color management section can be a bit intense to process.
As far as I know, H.264 (and probably most video formats) internally uses integer math only, so floating-point output doesn't make much sense. You can just as well do the conversion yourself if you need FP math for your post-processing. Also: even if the original input source only used 8-bit precision (per color channel) and the final output is going to be 8-bit again, using 10-bit (or 12-bit) internal codec precision improves compression efficiency. Whether the decoder outputs 8-bit or 10-bit (12-bit) is the decoder's choice, i.e. we don't necessarily need "true" 10-bit (12-bit) output to benefit from "high bit-depth" H.264 video...
__________________
There was of course no way of knowing whether you were being watched at any given moment.
How often, or on what system, the Thought Police plugged in on any individual wire was guesswork.



Last edited by LoRd_MuldeR; 15th July 2011 at 01:00.
Old 15th July 2011, 01:28   #11  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,689
Quote:
Originally Posted by JanWillem32 View Post
If it's about internal codec precision, please make the decoder 32-bit floating point or better on output, so I don't have to static cast every element of the mixer input to 32-bit floating-point anymore.
a) The H.264 spec only supports up to 14-bit.
b) Floating-point math is incredibly slow relative to integer math. A decoder written with 32-bit floats would probably not be able to decode 1080p even on an overclocked 6-core Core i7.
Old 15th July 2011, 04:10   #12  |  Link
JanWillem32
Registered User
 
 
Join Date: Oct 2010
Location: The Netherlands
Posts: 1,084
When looking at the specifications, I really don't see any reason to round any output to integer, and most certainly not to 8-bit. The decoded structures are only guaranteed to be 8-bit in the 8-bit lossless profile. All lossy modes will generate in-between values, even when encoding integer input.
I've had absolutely no issues with floating-point performance, as long as you don't rely on the FPU to do the work. A nice example on how to use packed SSE (readable for non-programmers, too): http://software.intel.com/en-us/blog...-acceleration/ . It's also convenient to blend in GPU power when heavy floating-point operations are wanted, but the programming for that is very specific, and the GPU has only very little integer math power.

A bit of information why consumer formats are frustrating for studio workers:
You start off with a big, raw camera image format.
You take the camera color calibration scheme and project the input images to a nice studio format with the full XYZ color palette and plenty of quantization.
You edit in that studio format, towards the cinema format of 2048×1080, or with less height for 1:2.40 movies. (The 4K profile of 4096×2160 is still very rare.)
You make the cinema screener: the XYZ color space remains intact in the encode, only when encoding the JPEG2000 video, colors are quantized to 12- or 14-bit and the gamma is set to 2.6. The encoding profile is lossless up to 250 Mbit/s. There's no visible change from the studio master to the encode at all. Features are generally some 300 GB in size.
On approval, the final version is encoded for distribution to cinemas with the same type of encoding.
When it's time to do the blu-ray and DVD release, things get nasty.
The image is clipped to 1920 wide for blu-ray, and usually the same amount for DVD, too.
The XYZ color space is converted to the HD, PAL and NTSC color spaces, losing about 2/3 of all possible colors in the space.
The input is limited on maximum lightness (a lowpass filter), because of limitations in the HDTV and SDTV standards.
The gamma is typically set to 2.4.
With DVD, the image is scaled down.
Chroma (2 color channels, relative to grayscale) is sub-sampled to half-resolution in height and width (4:2:0).
On the encoding step, the images are heavily dithered to mask the rounding to 8-bit+limited ranges.

There are plenty of "magic" filters in use by studios when doing the encodes, but you simply can't overcome the limitations set by the encoded format. Studio or cinema video and consumer video don't look nearly the same.
If there are no consumer products that can actually decode anything better than 8-bit 4:2:0 Y'CbCr, studios hardly have a reason to use anything better. And if the rare decoder that can decode more than 8-bit just rounds/dithers it down again afterwards, studios don't have a reason either. They can apply a much better set of dithering filters to 8-bit themselves.
Considering the relative ages of the JPEG2000 and h.264 codecs, I do believe better results can be achieved with the 50 or 100 GB of space on a blu-ray.

On the other side: what I'm working on now. The mixer and renderer have to deal with images starved of input precision: by the quantization level, by the image quality loss through lossy encoding, by the horizontal/vertical resolution and by the limitations of the color space.
When writing filters for the mixing and rendering stages, I've seen poor-quality results with simple filters on even a lossless 8-bit RGB input from a BMP file. (The still-image filter in Windows supports it.)
Quantization is one of the things that can become better than it is now. The recommended list of formats to transport 10- or 16-bit data has been around for a while. Using those formats for at least the 9-bit and better images is only logical.

Last edited by JanWillem32; 15th July 2011 at 04:13. Reason: typo
Old 15th July 2011, 04:22   #13  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,689
Quote:
Originally Posted by JanWillem32 View Post
When looking at the specifications, I really don't see any reason to round any output to integer, and most certainly not to 8-bit. The decoded structures are only guaranteed to be 8-bit in the 8-bit lossless profile. All lossy modes will generate in-between values, even when encoding integer input.
Incorrect. The specification requires bit-exact decoding or decoder/encoder desync will occur, resulting in artifacts.

Quote:
Originally Posted by JanWillem32 View Post
I've had absolutely no issues with floating-point performance, as long as you don't rely on the FPU to do the work.
SSE is part of the FPU. It shares the same execution units.

Quote:
Originally Posted by JanWillem32 View Post
A nice example on how to use packed SSE (readable for non-programmers, too): http://software.intel.com/en-us/blog...-acceleration/ . It's also convenient to blend in GPU power when heavy floating-point operations are wanted, but the programming for that is very specific, and the GPU has only very little integer math power.
I've written thousands of lines of assembly code. I think I know how SIMD works.

Floating point addition in SSE is typically 3/1 (latency/invthroughput). By comparison, integer addition is typically 1/0.5, and with 16-bit integers you get twice as many per register (and four times as many with 8-bit). In the end this means the typical throughput from integer math is 4-8 times higher than floating point.

This isn't even considering the fact that integer math allows all sorts of useful shortcuts, like shifting instead of multiplication, bitmasking, and other performance tricks which are often impossible with floating point math.
Old 15th July 2011, 05:38   #14  |  Link
JanWillem32
Registered User
 
 
Join Date: Oct 2010
Location: The Netherlands
Posts: 1,084
Quote:
Originally Posted by Dark Shikari View Post
Incorrect. The specification requires bit-exact decoding or decoder/encoder desync will occur, resulting in artifacts.
Okay, so it's a "cleanup" cycle in decoding. Too bad; I had hoped for sine-wave-like transformations and convolutions like with lossy audio, which does benefit from floating point.
Quote:
Originally Posted by Dark Shikari View Post
SSE is part of the FPU. It shares the same execution units.
I mostly meant relying on the classical 1-level approach, instead of 2 or more SIMD instructions at a time.
Quote:
Originally Posted by Dark Shikari View Post
This isn't even considering the fact that integer math allows all sorts of useful shortcuts, like shifting instead of multiplication, bitmasking, and other performance tricks which are often impossible with floating point math.
Bitmasking and such is indeed tricky on floating point. I've done some GPU work, and I can say that the bit field in floating-point mode is usually too variable to pull it off correctly. Only sign and some exponent tricks work well bit-wise.
The fact that I need to rely more and more on doubles and some floats for various things does decrease my usage of bitmasking and such. The SSE example shows how to get at least some decent performance with doubles and floats on a CPU. (And it gives the regular people here at least some impression of what it involves.)
Old 15th July 2011, 05:42   #15  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,689
Quote:
Originally Posted by JanWillem32 View Post
Okay, so it's a "cleanup" cycle in decoding. Too bad; I had hoped for sine-wave-like transformations and convolutions like with lossy audio, which does benefit from floating point.
That's because audio codecs are, in terms of prediction, stuck in the 1980s: they still generally have no inter prediction. CELT and AAC Main (LTP) are the only audio codecs with inter prediction, and both do require nearly-bit-exact decoding as a result. IIRC, Long Term Prediction actually requires an implementation of 16-bit floats in order to work correctly.
Old 15th July 2011, 13:11   #16  |  Link
JanWillem32
Registered User
 
 
Join Date: Oct 2010
Location: The Netherlands
Posts: 1,084
Interesting material. It seems that many lossless image formats also use a strict step rounding mechanism of the lossy internal structure, just before error correction to lossless.
I know the half-float format very well. We currently use the D3DXFloat32To16Array function to write out to D3DFMT_A16B16G16R16F (which can also carry non-ABGR data). It's not very fast, so I'd love to replace it with something better. The structure it outputs is an unsigned __int16*, so the CPU can't really do anything useful with it, either. Once a D3DFMT_A16B16G16R16F texture is created, it goes straight into the GPU with a DMA transfer. Most GPUs nowadays don't have a half-float calculation mode (anymore); it's just a supported format to save memory bandwidth. With every texture transfer, all vertices and subpixels are converted to 32-bit float for calculation. Other types are allowed in DirectX 10 for some calculation stages, and DirectX 11 even allows doubles on the GPU. The performance is low with anything but 32-bit float. That's one of the reasons why DXVA parts are separate on the silicon of a GPU.

Anyway, a bit more on topic: I very much hope that this advancement leads to a proper ffdshow beta soon. I'm more than willing to help write and test code for a video-mixer part, so it's able to receive 10-bit or zero-padded 16-bit Y'CbCr formats. I'm looking at this from the perspective of the regular consumer and the professional, too. Once a complete system has been proven able to maintain measurably more quality than before, it becomes advertisable to the public and to professionals in general.
Old 15th July 2011, 14:50   #17  |  Link
TheRyuu
warpsharpened
 
Join Date: Feb 2007
Posts: 788
Quote:
Originally Posted by clsid View Post
For example 9-bit H.264 will just crash.
That's not going to get fixed until tryouts gets unborked.

No one uses 9-bit H.264, so I don't really think it matters that much.

Last edited by TheRyuu; 15th July 2011 at 15:27.
Old 15th July 2011, 15:32   #18  |  Link
clsid
Registered User
 
Join Date: Feb 2005
Posts: 5,010
Quote:
Originally Posted by TheRyuu View Post
That's not going to get fixed until tryouts gets unborked.

No one uses 9-bit H.264, so I don't really think it matters that much.
That part of the code is a huge mess and the people who wrote it are not around anymore, so any help is welcome. I have already done some major cleanup in swscale, so that the difference from vanilla FFmpeg is now minimal.

As long as x264 supports 9-bit, such files will be made. If ffdshow cannot be fixed to support them, then it should refuse to decode such files. Crashing is unacceptable.

Last edited by clsid; 20th July 2011 at 15:46.
Old 20th July 2011, 06:57   #19  |  Link
aegisofrime
Registered User
 
Join Date: Apr 2009
Posts: 462
Quote:
Originally Posted by patrick_ View Post
I searched the web, but had a hard time finding an FFDShow build that supports 10-bit H.264. I finally found one included in CCCP. I hate codec packs, so I decided to replace the files of the default FFDShow installer with the ones from CCCP. I've only tested it with a single 10-bit file using MPC-HC*, but it worked.

Download FFDShow rev 3925 with H.264 10-bit support (BETA)
http://www.filesonic.com/file/144456...0711-10bit.exe

*if you use MPC-HC, don't forget to deactivate the internal filters.
Thanks so much for this, it plays my 10-bit x264 encodes perfectly.

BTW, is there a link or repository where I can keep up with future revisions?
Old 20th July 2011, 15:51   #20  |  Link
SamKook
Registered User
 
Join Date: Mar 2011
Posts: 212
Quote:
Originally Posted by aegisofrime View Post
BTW, is there a link or repository where I can keep up with future revisions?
You can get ffdshow betas anywhere; just do a search on Google.

Or try here: http://www.afterdawn.com/software/au...m#all_versions

Not sure if they all have 10-bit support enabled, though; I don't have anything to test them with.

Last edited by SamKook; 20th July 2011 at 15:54.