x264 Known Hardware acceleration problems and solutions
P.J
22nd October 2008, 19:10
Is there any way to play incompatible files with DxVA?
Even tsMuxeR doesn't work. It just gives me lots of artifacts xD (5.1 -> 4.1)
Sulik
23rd October 2008, 02:40
The biggest problem with B-pyramid & DXVA is that MPC-HC does not support it (it ignores explicit reordering of re-ordered B-frames).
P.J
23rd October 2008, 18:08
What about CyberLink?
tetsuo55
23rd October 2008, 19:57
The biggest problem with B-pyramid & DXVA is that MPC-HC does not support it (it ignores explicit reordering of re-ordered B-frames).
Do you have any technical reference to support this claim?
I agree that the DXVA implementation has some problems, but that should not be one of them (besides, B-pyramid decoding breaks on consoles and some dedicated players too).
tofans
27th October 2008, 12:52
I have a few problems regarding interlaced DV encoded to x264 and playback using a GPU deinterlacer.
It seems that the GPU decoder doesn't recognize the input as being interlaced, because there is no deinterlacing applied to the image.
If I encode the same videostream using an old mainconcept encoder using the field-based or MBAFF setting, the GPU deinterlaces the video just nicely.
Is there any x264 option that I am missing here? I tried all combinations of the "--tff", "--bff", "--interlaced" and "--nal-hrd" options, but none of them seems to do the trick.
nm
27th October 2008, 13:26
I have a few problems regarding interlaced DV encoded to x264 and playback using a GPU deinterlacer.
It seems that the GPU decoder doesn't recognize the input as being interlaced, because there is no deinterlacing applied to the image.
If I encode the same videostream using an old mainconcept encoder using the field-based or MBAFF setting, the GPU deinterlaces the video just nicely.
Is there any x264 option that I am missing here? I tried all combinations of the "--tff", "--bff", "--interlaced" and "--nal-hrd" options, but none of them seems to do the trick.
Do you have an x264 build that includes the nal-hrd patch? Note that you also need to set the VBV buffer size and maximum bitrate parameters (for example --vbv-bufsize 30000 --vbv-maxrate 30000 --nal-hrd --tff) because otherwise nal-hrd is disabled and it won't write any additional information to the stream.
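The dependency described above (in the patched build, --nal-hrd is reportedly disabled unless both VBV options are set) can be checked before starting an encode. A minimal Python sketch: `missing_vbv_for_nal_hrd` is a hypothetical helper, and the flag names are taken from this thread, not verified against any particular x264 build.

```python
def missing_vbv_for_nal_hrd(args):
    """Return the VBV options that --nal-hrd relies on but that are absent.

    `args` is a list of command-line tokens as used in this thread.
    An empty list means --nal-hrd was not requested or both options are set.
    """
    if "--nal-hrd" not in args:
        return []
    required = ["--vbv-bufsize", "--vbv-maxrate"]
    return [opt for opt in required if opt not in args]


# Example: --vbv-maxrate was forgotten, so nal-hrd would be silently dropped.
cmd = ["--crf", "23.0", "--nal-hrd", "--tff", "--vbv-bufsize", "30000"]
print(missing_vbv_for_nal_hrd(cmd))  # ['--vbv-maxrate']
```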
tofans
27th October 2008, 17:08
Yes, I am pretty sure that the build includes the nal-hrd patch, otherwise "--longhelp" wouldn't offer such an option, right?
Thanks for your explanation. I included these options in my command line, but my first tests indicate that the output video is still not deinterlaced by the GPU.
Is there any way to verify that certain flags are set in the videostream?
nm
27th October 2008, 18:01
Yes, I am pretty sure that the build includes the nal-hrd patch, otherwise "--longhelp" wouldn't offer such an option, right?
Yes.
Thanks for your explanation. I included these options in my command line, but my first tests indicate that the output video is still not deinterlaced by the GPU.
Is there any way to verify that certain flags are set in the videostream?
One way is to use h264_parse from the MPEG4IP project. It is a command-line tool that parses raw H.264 streams. Search the output for pict_struct, which is 3 for TFF and 4 for BFF.
If you can't find a Windows build of h264_parse, upload a sample clip and I'll take a look at it (I have MPEG4IP tools on Linux, so I can't help you with Windows executables).
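Searching the parser output by hand can be replaced by a small script. A rough Python sketch, assuming the output consists of `name: value` (or `name=value`) text; the real h264_parse output format may differ, and `count_pic_struct` is a hypothetical helper. The value meanings follow the picture timing SEI table in the H.264 spec.

```python
import re

# Meaning of the pic_struct values most relevant here (H.264 picture timing SEI).
PIC_STRUCT = {
    0: "progressive frame",
    1: "top field",
    2: "bottom field",
    3: "top field, bottom field (TFF)",
    4: "bottom field, top field (BFF)",
}


def count_pic_struct(parser_output):
    """Count how often each pic_struct / pict_struct value appears in the text."""
    counts = {}
    # Both spellings are accepted; the trailing \b keeps
    # 'pic_struct_present_flag' from being counted as a pic_struct value.
    for match in re.finditer(r"\bpict?_struct\b\s*[:=]\s*(\d+)", parser_output):
        value = int(match.group(1))
        counts[value] = counts.get(value, 0) + 1
    return counts


sample = "pic_struct_present_flag: 1\npict_struct: 4\npict_struct: 4\n"
for value, n in sorted(count_pic_struct(sample).items()):
    print(value, PIC_STRUCT.get(value, "other"), n)
```

If the result is empty even though pic_struct_present_flag is 1, the parser build may simply not print the SEI contents.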
tofans
27th October 2008, 18:51
First of all, a big "thank you" for your support!
I uploaded a sample clip onto http://drop.io/x264sample
This is the command line I used:
--crf 23.0 --level 4.1 --ref 3 --mixed-refs --bframes 3 --weightb --filter -1:-1 --subme 2 --trellis 1 --vbv-bufsize 30000
--vbv-maxrate 30000 --merange 12 --threads auto --thread-input --progress --no-psnr --no-ssim --output "output" "input" --nal-hrd --bff
I found a win32 build of h264_parse, but only found this entry regarding pic_struct:
pic_struct_present_flag: 1
nm
27th October 2008, 19:18
--crf 23.0 --level 4.1 --ref 3 --mixed-refs --bframes 3 --weightb --filter -1:-1 --subme 2 --trellis 1 --vbv-bufsize 30000
--vbv-maxrate 30000 --merange 12 --threads auto --thread-input --progress --no-psnr --no-ssim --output "output" "input" --nal-hrd --bff
Your --nal-hrd and --bff options are in the wrong position. Move them before "input":
--crf 23.0 --level 4.1 --ref 3 --mixed-refs --bframes 3 --weightb --filter -1:-1 --subme 2 --trellis 1 --vbv-bufsize 30000
--vbv-maxrate 30000 --merange 12 --threads auto --thread-input --progress --no-psnr --no-ssim --nal-hrd --bff --output "output" "input"
Edit: well, that probably isn't a problem though. My version of h264_parse reports that everything should be fine with your sample. There are SEIs with pict_struct=4.
Next step would be to compare the SEI data of these x264 encodes to that of Mainconcept MBAFF, which you said was deinterlaced properly.
tofans
27th October 2008, 21:56
Sorry, my bad, Mainconcept with MBAFF is also a no-go.
Only field-based interlaced coding does the trick, which is sadly not available in x264.
So is this a general problem with the GPU decoder and MBAFF, or is it just a flag that doesn't get passed to the renderer?
EDIT: I have uploaded a sample clip (same location as previously) of BBCHD with MBAFF interlacing which is getting perfectly deinterlaced by the GPU. Maybe you can spot the differences between my encodes and the BBCHD stream.
nm
28th October 2008, 07:37
Sorry, my bad, Mainconcept with MBAFF is also a no-go.
Only field-based interlaced coding does the trick, which is sadly not available in x264.
So is this a general problem with the GPU decoder and MBAFF, or is it just a flag that doesn't get passed to the renderer?
I don't know much about DXVA, but I think the problem is either in the DXVA filter or the driver-side implementation. Have you tried both CyberLink's filter and the one in MPC HC?
EDIT: I have uploaded a sample clip (same location as previously) of BBCHD with MBAFF interlacing which is getting perfectly deinterlaced by the GPU. Maybe you can spot the differences between my encodes and the BBCHD stream.
An obvious difference is that the BBCHD clip is TFF while your clip was BFF. Maybe the decoder always expects MBAFF to be TFF, or simply doesn't understand that pict_struct=4 means BFF interlaced. Could you check whether this x264-encoded TFF sample (http://www.cs.helsinki.fi/u/mikkila/video/hdvtest/test-x264-1080i25-nalhrd.mkv) gets deinterlaced?
G_M_C
28th October 2008, 09:30
Is there any way to play incompatible files with DxVA?
Even tsMuxeR doesn't work. It just gives me lots of artifacts xD (5.1 -> 4.1)
Just a thought: why don't you simply go back to the source and redo your encode while adhering to the 4.1 specs? You do have the source, right?
Because that's the only way of getting it to play on DXVA ...
PS: B-pyramids seem to be tricky though, which is somewhat strange. B-pyramids can be used on BD-compatible streams, and my stand-alone Pana BD30 accepts them with no problems. Why some DXVA decoders / GPUs won't ... I don't know.
tofans
28th October 2008, 10:41
Could you check whether this x264-encoded TFF sample (http://www.cs.helsinki.fi/u/mikkila/video/hdvtest/test-x264-1080i25-nalhrd.mkv) gets deinterlaced?
Thanks for the sample.
No, sadly it doesn't. I tried it with both the internal MPC DXVA decoder and the CyberLink decoder. I also tried "DGAVCIndexNV", but there is no deinterlacing applied, even if I specifically select "use PureVideo deinterlacer".
So I guess there must be some weird flag present in the BBCHD stream which triggers the deinterlacing function of DXVA, but which is not present in the x264 encodes. :confused:
nm
28th October 2008, 12:38
So I guess there must be some weird flag present in the BBCHD stream which triggers the deinterlacing function of DXVA, but which is not present in the x264 encodes. :confused:
Perhaps. We would need to ask someone who knows the specs better (or who knows how your DXVA decoder implementation works) for advice. I couldn't spot any relevant differences in the interlacing/field order signals of these streams.
nm
30th October 2008, 09:40
Ok, neuron2 explains the problem in this thread: http://forum.doom9.org/showthread.php?t=142427
...or at least that could be one factor.
max2k2
28th February 2009, 04:32
I did not find it in the initial post of this thread, but I heard somewhere that you should disable 4x4 macroblock partitions for P-frames in x264 if you want to encode DXVA-compatible video. Is this true?
I plan to use AAC at 80 kbps adaptive bitrate in MP4 for my encode. Could this audio format cause any trouble for DXVA SAPs?
UsedUser
28th February 2009, 09:07
you should disable 4x4 macroblock partitions for P-frames in x264 if you want to encode DXVA-compatible video. Is this true?
Yes. Can't quote a source atm, but it originates with game console compatibility, and extends generally to DXVA.
I plan to use AAC at 80 kbps adaptive bitrate in MP4 for my encode. Could this audio format cause any trouble for DXVA SAPs?
I haven't seen any DXVA issues related to audio streams. The video codec doesn't get the stream until after demux. Your stream splitter sends video to the video decoder, and audio to the audio decoder.
looney
3rd June 2009, 00:45
It's the maximum decoded picture buffer that determines the limit.
1920x1080x4 = 8294400
1280x720x9 = 8294400
So 900p would have a limit of
8294400/(1600*900) = 5.76 reference frames
Why does this happen? Because hardware decoders are only going to add as much cache into the chip as necessary, otherwise you'll be wasting transistor / increasing die space. Software decoders can just allocate as much memory as they feel like since you already have hundreds of megs of ram.
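The arithmetic above can be sketched in Python. Note that the H.264 spec states the Level 4.1 limit in macroblocks (MaxDpbMbs = 32768, which corresponds to 8388608 luma samples rather than 8294400), rounds frame dimensions up to whole 16x16 macroblocks, and caps the result at 16. `max_ref_frames` is a hypothetical helper name, and the sketch assumes progressive frames.

```python
MAX_DPB_MBS_L41 = 32768  # Level 4.1 decoded picture buffer limit, in macroblocks


def max_ref_frames(width, height, max_dpb_mbs=MAX_DPB_MBS_L41):
    """Upper bound on reference frames for a progressive stream at a given level."""
    mbs_wide = (width + 15) // 16   # round up to whole macroblocks
    mbs_high = (height + 15) // 16
    return min(max_dpb_mbs // (mbs_wide * mbs_high), 16)


for w, h in [(1920, 1080), (1280, 720), (1600, 900)]:
    print(f"{w}x{h}: {max_ref_frames(w, h)} reference frames")
```

This gives 4 for 1080p and 9 for 720p, as above, and 5 for 900p, since a fractional frame cannot be stored.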
I think it has nothing to do with the GPU's on-chip cache, because GPUs don't have that much cache on die, nor is it needed when they have plenty of memory that is more than fast enough for plain 2D+t enhancement. I believe the whole issue lies in the DSP having to be equipped with enough cheap RAM, and 8M was probably the figure they came up with in 2002/3, when they were developing H.264, as a cheap enough solution for consumer electronics in a 5-year timeframe.
Yeah, thing is it would be very easy and cheap to implement an external 32 MB of RAM (it doesn't even need to be high speed) with a few leads to the main chip. With that, you can easily fit up to 15 (you'd need an extra 1.7 MB for the 16th one) reference frames for 1080p content.
I think something is wrong with that math, because a larger monolithic space can't be split into same-size fragments and leave you with fewer fragments than the multiple you get from an X times smaller space (16,18fps). Anyway, I think they will first reinvent the platform in another revision before they simply add more memory allocation to L4.1. Like that will ever happen. Can't they just add a new sublevel, L4.2-L4.3, that has the new improved buffer, with lots of new CE for sale in the next 5 years?
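Whether 32 MB holds 15 or 16 frames of 1080p depends on what is counted per frame. A rough Python sketch, assuming 8-bit video, 32 MB = 32 * 2**20 bytes, and no alignment padding or decoder working memory (all of these are assumptions, not figures from the thread); `frames_that_fit` is a hypothetical helper.

```python
def frames_that_fit(memory_bytes, width, height, bytes_per_pixel):
    """Number of whole frames of the given size that fit in memory_bytes."""
    frame_bytes = int(width * height * bytes_per_pixel)
    return memory_bytes // frame_bytes


budget = 32 * 2**20  # 32 MB

print(frames_that_fit(budget, 1920, 1080, 1.0))  # luma only: 16
print(frames_that_fit(budget, 1920, 1080, 1.5))  # 4:2:0 luma + chroma: 10
```

Counting luma only, the budget divides into about 16.18 frames; once 4:2:0 chroma is included, only 10 whole frames fit.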
tetsuo55
3rd June 2009, 07:30
I think it has nothing to do with the GPU's on-chip cache, because GPUs don't have that much cache on die, nor is it needed when they have plenty of memory that is more than fast enough for plain 2D+t enhancement. I believe the whole issue lies in the DSP having to be equipped with enough cheap RAM, and 8M was probably the figure they came up with in 2002/3, when they were developing H.264, as a cheap enough solution for consumer electronics in a 5-year timeframe.
I think something is wrong with that math, because a larger monolithic space can't be split into same-size fragments and leave you with fewer fragments than the multiple you get from an X times smaller space (16,18fps). Anyway, I think they will first reinvent the platform in another revision before they simply add more memory allocation to L4.1. Like that will ever happen. Can't they just add a new sublevel, L4.2-L4.3, that has the new improved buffer, with lots of new CE for sale in the next 5 years?
The cache is just a number they decide on (although there are guidelines as to how big it should be).
Theoretically a GPU could have a cache as large as its onboard RAM.
Nvidia recently decided to upsize the cache to L5.1 (it's a driver limitation, not a hardware one). This means that any stream will work as long as it never goes over 17 (depending on what encoding settings you used, 16 ref frames will often not work).
pokazene_maslo
28th January 2010, 16:51
Hello. I have a video with 1280*544 resolution in High@L4.1 using 12 refs. According to this thread, 8388608 / (1280 x 544) = 12.047..., so 12 reference frames should be fine, but DXVA is not working (ATI Radeon HD4850, MPC-HC 1.3.1443). The video file is certainly not older than 1 year. Can anyone explain to me why it's not working?
DarkZell666
28th January 2010, 17:13
Hello. I have a video with 1280*544 resolution in High@L4.1 using 12 refs. According to this thread, 8388608 / (1280 x 544) = 12.047..., so 12 reference frames should be fine, but DXVA is not working (ATI Radeon HD4850, MPC-HC 1.3.1443). The video file is certainly not older than 1 year. Can anyone explain to me why it's not working?
You'd be better off in the MPC-HC topic imho : http://forum.doom9.org/showthread.php?t=123537
valnar
28th January 2010, 19:41
Hello. I have a video with 1280*544 resolution in High@L4.1 using 12 refs. According to this thread, 8388608 / (1280 x 544) = 12.047..., so 12 reference frames should be fine, but DXVA is not working (ATI Radeon HD4850, MPC-HC 1.3.1443). The video file is certainly not older than 1 year. Can anyone explain to me why it's not working?
Using 12 refs no matter what is ridiculous. Lower it to about 5.
pokazene_maslo
28th January 2010, 20:13
I didn't encode it and I'm not asking whether it is useful. I'm asking why it's not working. According to the wiki page it should work: http://en.wikipedia.org/wiki/H264#Decoded_picture_buffering