x264 Known Hardware acceleration problems and solutions


tetsuo55
20th December 2007, 16:21
Below you will find the settings to create a file that should work on any hardware decoder.
The current problems are explained, and finally I explain how you can try to get an older file to work.

What settings to use to create a universally working (L4.1) file (see NOTE1):
Follow these 3 steps:

STEP 1: Determine the reference frame limit set by the DPB (decoded picture buffer):

nREF = 8388608 / (Height x Width)

Round the result down. The value can never be higher than 16, so use 16 if the result is higher (see NOTE2).

STEP 2: Decide whether you want to use B-pyramids (see NOTE3):

If B-pyramids = YES, then use nREF - 1
If B-pyramids = NO, then add "--no-b-pyramid"
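
Steps 1 and 2 can be sketched as a small calculation. This is only a minimal sketch and not part of x264 or MeGUI: the function name `max_ref_l41` and the constant name are made up for illustration, and rounding down plus a floor of one reference frame are assumptions.

```python
# Luma-sample budget used in STEP 1 (Level 4.1).
L41_DPB_SAMPLES = 8388608

def max_ref_l41(width, height, b_pyramid=False):
    """Return the highest ref value allowed by steps 1 and 2 for a frame size."""
    n_ref = L41_DPB_SAMPLES // (width * height)  # STEP 1, rounded down
    n_ref = min(n_ref, 16)                       # never higher than 16
    if b_pyramid:
        n_ref -= 1                               # STEP 2: B-pyramids cost one ref
    return max(n_ref, 1)  # assumption: always keep at least one reference frame

if __name__ == "__main__":
    for w, h in [(1920, 1080), (1280, 720), (720, 480)]:
        print(f"{w}x{h}: ref <= {max_ref_l41(w, h)} "
              f"(with B-pyramids: {max_ref_l41(w, h, b_pyramid=True)})")
```

For 1920x1080 this gives 4 without B-pyramids and 3 with them.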


STEP 3: Make sure you use these settings and never exceed the limits. Shown here are the highest allowed values (not marked) or mandatory settings (marked with *):

--VBVMaxBitrate 40000 (highest possible value)
--KeyframeInterval 24*
level-idc = 4.1*
Profile = High*



NOTE1
If a file created with these settings does not work on your hardware, either the hardware is not capable of L4.1 decoding or there is something wrong with the universal settings.
Handheld devices do not support L4.1; I have created an alternative set of settings below.
NOTE2
Currently there is a bug with DXVA (video cards) on several of the decoders: a REF value higher than 11 will not work on all systems or with all players. (If you want to help fix the bug or want more information, see the thread here: http://forum.doom9.org/showthread.php?p=1170371)
NOTE3
B-pyramids are "broken" in x264, which is why you need to reduce the ref frames by 1 to guarantee that it works, and even then it might not be stable at all. If you want to know more and/or help fix it, see this thread: http://forum.doom9.org/showthread.php?t=140223


Not all hardware devices support L4.1; these devices need L3.0 settings. (A lot of portable devices use even lower settings; please check the manual and use the correct MeGUI profile.)
Follow these 2 steps:

STEP 1: Determine the reference frame limit set by the DPB (decoded picture buffer):

nREF = 2073600 / (Height x Width)

Round the result down. The value can never be higher than 16, so use 16 if the result is higher (see NOTE2).
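
The two constants in these formulas (8388608 and 2073600) match the MaxDpbMbs limits from the H.264 level table (32768 macroblocks for Level 4.1, 8100 for Level 3.0) multiplied by the 256 luma samples in a 16x16 macroblock. The sketch below does the same calculation in macroblocks, rounding each dimension up to whole macroblocks (which is how 1080 lines become 68 macroblock rows). The names are illustrative only, and the table holds just the two levels discussed here.

```python
import math

# MaxDpbMbs per level (H.264 Table A-1), only the two levels used in this post.
MAX_DPB_MBS = {"3.0": 8100, "4.1": 32768}

def max_dpb_frames(width, height, level):
    """Frames that fit in the DPB: min(MaxDpbMbs / frame size in MBs, 16)."""
    mbs_wide = math.ceil(width / 16)   # macroblocks are 16x16 luma samples
    mbs_high = math.ceil(height / 16)  # 1080 lines are coded as 68 MB rows
    return min(MAX_DPB_MBS[level] // (mbs_wide * mbs_high), 16)

if __name__ == "__main__":
    # The constants from the two STEP 1 formulas are MaxDpbMbs * 256.
    print(MAX_DPB_MBS["4.1"] * 256, MAX_DPB_MBS["3.0"] * 256)
    for w, h in [(720, 576), (720, 480), (320, 240)]:
        print(f"L3.0 {w}x{h}: nREF = {max_dpb_frames(w, h, '3.0')}")
```

This sketch assumes progressive (frame) coding; interlaced streams count macroblock rows differently.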

STEP 2: Make sure you use these settings and never exceed the limits. Shown here are the highest/lowest allowed values (not marked) or mandatory settings (marked with *):

--vbv-bufsize 10000 (highest possible value)
--vbv-maxrate 12500 (highest possible value)
--KeyframeInterval 24*
level-idc = 3.0*
Profile = High*
--no-b-pyramid*


B-pyramids do not work on most portable devices, so it is best to disable them completely.

---

If you have a file that was already encoded but doesn't work, there are 4 options:
1. Try the file in MPC-HC, if it supports your video card
2. Use h264info to change the header to level 3.1 or level 4.1 and try your regular player
3. Use a software decoder like CoreAVC
4. Transcode the file to be hardware-level compliant

Non-working files have usually been encoded with an old version of x264, have too many ref frames for their resolution/level, or were muxed with an old version of mkvmerge.

(The previous post I had here has been archived here: http://forum.doom9.org/showthread.php?p=1170369#post1170369)

EDIT:1 (Added min-keyint 4 to the required settings thanks to the findings in this thread: http://forum.doom9.org/showthread.php?t=140135 )
EDIT:2 (min-keyint 4 removed; x264 already respects these limits by default)
EDIT:3 (Corrected the Full HD formula (the error in the old formula did not affect the ref frame calculation))

Atak_Snajpera
20th December 2007, 18:09
- slower encoding
+ very small improvements in quality

Maybe people creating the frontends for encoding could do the same so files become more hardware-compliant?

I will make necessary changes in my GUI. Thanks for info :)

Dark Shikari
20th December 2007, 21:40
Note the improvement in quality is much greater on anime/cartoon footage, where you can get a 1% increase in quality per bitrate per reference frame even at high numbers.

Atak_Snajpera
20th December 2007, 21:55
What's the point of encoding cartoons with 16 refs and not being able to watch them on PS3/X360 or a PC with hardware acceleration? In this case I choose compatibility over a small increase in quality.

akupenguin
20th December 2007, 21:58
Is it number of reference frames, or is it DPB size (i.e. a Level restriction like every other limited HW decoder under the sun)?

arfster
20th December 2007, 22:01
Is it number of reference frames, or is it DPB size (i.e. a Level restriction like every other limited HW decoder under the sun)?

Copy/pasted this from somewhere that did tests to nail down the cyberlink decoder+dxva limitations:

"1080p - Number of Reference frames must be equal to or less than 4
720p - Number of Reference frames must be equal to or less than 8
Mixed Reference Frames must be false.
B Frames must equal 2
Adaptive B frames must be false"

Apparently not meeting any of the above causes the 720p 20fps bug, or the 1080p black screen.

Dark Shikari
20th December 2007, 22:04
Copy/pasted this from somewhere that did tests to nail down the cyberlink decoder+dxva limitations:

"1080p - Number of Reference frames must be equal to or less than 4
720p - Number of Reference frames must be equal to or less than 8
Mixed Reference Frames must be false.
B Frames must equal 2
Adaptive B frames must be false"

Apparently not meeting any of the above causes the 720p 20fps bug, or the 1080p black screen.
That's a very heavy set of restrictions... :eek:

akupenguin
20th December 2007, 22:09
That's strange. If you go by memory used, 4 frames of 1080p is equivalent to 9 frames of 720p, not 8.

If Level 4.1 were the only restriction, I might recommend that people restrict refs for hardware compatibility. But non-adaptive B-frames and no mixed refs? That's too harsh. Plus it's just plain a decoder bug, not a legitimate limitation. (As a codec developer, I can confidently say that adaptive B-frames do not require any explicit support. You just implement the standard, and any sequence of frame types is decodable.)

I question the DPB limit too... 1080p 16ref takes 50MB of RAM. Any video card recent enough to decode HD h264 will come with at least 256MB. Unless the decoder's memory is separate from the video memory? In which case it seems wasteful to dedicate 5% (12MB) to only h264 decoding when it could be reused.
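
The memory figures in this post can be checked with a quick calculation. This is a rough sketch that assumes 8-bit 4:2:0 video (1.5 bytes per pixel) and ignores any padding or per-frame overhead a real decoder adds; the helper name is made up for illustration.

```python
def dpb_bytes(width, height, frames):
    """Approximate DPB size for 8-bit 4:2:0 frames (1.5 bytes per pixel)."""
    return int(width * height * 1.5) * frames

if __name__ == "__main__":
    # 4 frames of 1080p occupy exactly as much memory as 9 frames of 720p.
    print(dpb_bytes(1920, 1080, 4), dpb_bytes(1280, 720, 9))
    # 16 refs at 1080p: roughly 50 MB.
    print(round(dpb_bytes(1920, 1080, 16) / 1e6, 1), "MB")
    # The 4 frames allowed at 1080p: roughly 12 MB.
    print(round(dpb_bytes(1920, 1080, 4) / 1e6, 1), "MB")
```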

Sulik
20th December 2007, 22:15
The real restriction is the HD/BD profile & level restriction, ie: high profile, level 4.1.
For the number of references, this means that the maximum is 12288KB/(W*H*1.5):
1920x1080 -> max num_ref_frames = 4
1280x720 -> max num_ref_frames = 9
720x480 -> max num_ref_frames = 16 (max)
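
The formula above can be written out directly. This is a minimal sketch, assuming that 12288KB means 12288 * 1024 bytes and that the result is rounded down and capped at 16; the function name is illustrative. Since 12288 * 1024 / 1.5 = 8388608, it gives the same limit as dividing 8388608 by width times height.

```python
DPB_BYTES_L41 = 12288 * 1024  # 12288 KB, the Level 4.1 DPB budget from the formula

def max_num_ref_frames(width, height):
    """max num_ref_frames = 12288KB / (W * H * 1.5), rounded down, capped at 16."""
    return min(int(DPB_BYTES_L41 // (width * height * 1.5)), 16)

if __name__ == "__main__":
    for w, h in [(1920, 1080), (1280, 720), (720, 480)]:
        print(f"{w}x{h} -> max num_ref_frames = {max_num_ref_frames(w, h)}")
```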

The mixed_ref/adaptive B-frames restriction is not a HW restriction, but more likely a bug in the Cyberlink decoder (or demux<->decoder interaction problem, or possibly though unlikely, encoder compliance issue).

I can confirm that 1080p content with Adaptive B-frames + mixed_ref with the proper number of references plays just fine with HW acceleration on my 8600GT.

arfster
21st December 2007, 03:18
To make clear: I've no idea if the above list is right, it was just something I copied off a forum somewhere for my own reference for future encodes. Haven't actually tested it at all.

If anyone needs to do some tests, feel free to point to some samples and I'll play them with my 2600.

NanoBot
21st December 2007, 07:54
Hi,

at least with my 8600gts I found another restriction: I have to uncheck "b-pyramids" for the encode to get playback with full GPU decoding.

C.U. NanoBot

tetsuo55
21st December 2007, 15:31
Well, I would be willing to test encodes.

Maybe if we had a few test files that were encoded in the same format as HD DVD/Blu-ray, so they can be easily tested in all hardware-compliant software players, I could try them and see which ones break on which players.

This way we could find out where the bug really lies and file a good bug report with the responsible party.

As the issues affect both ATI and NVIDIA, I think it's safe to say that it's not the hardware.

UsedUser
22nd December 2007, 03:21
The real restriction is the HD/BD profile & level restriction, ie: high profile, level 4.1.
For the number of references, this means that the maximum is 12288KB/(W*H*1.5):
1920x1080 -> max num_ref_frames = 4
1280x720 -> max num_ref_frames = 9
720x480 -> max num_ref_frames = 16 (max)
I would agree, from my experience troubleshooting, that the profile/level compliance is the issue for DXVA.

The mixed_ref/adaptive B-frames restriction is not a HW restriction, but more likely a bug in the Cyberlink decoder (or demux<->decoder interaction problem, or possibly though unlikely, encoder compliance issue).
I would also agree that the mixed ref / adaptive b-frames issue is with the decoder, and is the source of the 20fps bug, as DXVA will work with the proper number of ref frames, but the 20fps bug persists with adaptive b-frames. I haven't tested mixed ref without adaptive b-frames to isolate it further, but the bug persists when using adaptive b-frames without mixed ref. So at a minimum, adaptive b-frames alone can cause the 20fps bug.

valnar
22nd December 2007, 05:22
As somebody who wants to use DXVA and hardware players (like the Sage HD Extender), keeping to these "standards" is important. Not every playing device has 256MB of RAM.

Since most people use front ends, as tetsuo55 pointed out, it would be great if they could give a warning or perhaps change the default recommendations.

-Robert

tetsuo55
22nd December 2007, 18:20
Well maybe the following tests will help:

-Play a b-frame encoded file with a different decoder (mainconcept and or nero)

-Encode a b-frame file with mainconcept, then try that file on cyberlink, mainconcept and nero.

arfster
22nd December 2007, 18:34
Well maybe the following tests will help:

-Play a b-frame encoded file with a different decoder (mainconcept and or nero)


Afaik neither of those supports VLD acceleration, which is when the problem pops up :-(

Arcsoft seems to support it, but I couldn't get the h264 decoder to work outside the player itself, and the player only plays certain types of video.

tetsuo55
22nd December 2007, 19:58
Afaik neither of those supports VLD acceleration, which is when the problem pops up :-(

Arcsoft seems to support it, but I couldn't get the h264 decoder to work outside the player itself, and the player only plays certain types of video.

Again stressing my point that the file needs to be in a stand-alone player type container, so mkv is obviously out of the question

Crisidelm
24th December 2007, 15:43
Afaik neither of those supports VLD acceleration, which is when the problem pops up :-(

Arcsoft seems to support it, but I couldn't get the h264 decoder to work outside the player itself, and the player only plays certain types of video.

Within the DVB software Dvbviewer the Arcsoft H.264 decoder works fine, and Dvbviewer can be used to play other files, not only for live DVB viewing...

valnar
29th December 2007, 22:38
Is there a megui profile that is recommended to not only give the best quality, but have the highest compatibility for DXVA or hardware players like the Sage HD Extender?

What would those options be for HD or DVD rips?

Robert

Atak_Snajpera
29th December 2007, 22:42
PS3 profile

valnar
29th December 2007, 23:02
PS3 profile

Easy enough! Thanks.
Robert

UsedUser
1st January 2008, 11:02
PS3 profile

Easy enough! Thanks.
Robert
I'm seeing the following settings for the PS3 profile:

--level 4.1 --ref 3 --mixed-refs --bframes 3 --b-pyramid

It includes >2 b-frames, adaptive b-frames, and b-frame pyramids, all of which have been noted for their potential to break DXVA. Unless someone has already tested them @ 1080p + num_ref_frames < 5?

tetsuo55
1st January 2008, 11:10
Someone will have to test them, but from what I can find on hardware players, it seems that simply using level 4.1 will fix the problems.

cscxk
1st January 2008, 11:28
In my tests, 1920*1080 only needs ref 3 and no b-pyramid; 1280*720 needs ref 6 and no b-pyramid. So I think the first thing to do is disable b-pyramid.

UsedUser
1st January 2008, 12:00
In my tests, 1920*1080 only needs ref 3 and no b-pyramid; 1280*720 needs ref 6 and no b-pyramid. So I think the first thing to do is disable b-pyramid.
My suspicion is that b-pyramid is involved.

B-pyramid allows b-frames to be used as reference frames for other b-frames. So, if B-pyramid is "on", then the b-frames you have set are counted in the total num_ref_frames.

You can see this formula holds (partially) true in the AVInaptic output. With B-pyramids, num_ref_frames = ref_frames + b-frames (i.e., num_ref_frames = (3 ref_frames) + (3 b-frames) = 6). However, with B-pyramids, it must allow for extra references even at 2 b-frames, as num_ref_frames = (2 ref_frames) + (2 b-frames) = 5.

I'm currently testing the hypothesis that without b-pyramids, you can push the ref_frames up to the max allowed num_ref_frames for L4.1, e.g., 4 @ 1080p, 9 @ 720p.

UsedUser
1st January 2008, 12:57
Well, I made a bit of an error in thinking the ref value could be set to the max allowed num_ref_frames, because that wouldn't allow for the B-frames. So, I have to reject that hypothesis.

However, like cscxk, I have found that, with B-pyramids off, the magic value for ref is 3. num_ref_frames = (3 ref) + (3 B-frames) + (no-B-pyramids) = 4. With num_ref_frames = 4, DXVA works @ 1080p.

I'm thinking what we need, though, is more information about how to predict num_ref_frames.

My new hypothesis is that num_ref_frames won't always equal 4 when ref = 3 and B-frames = 3, if adaptive b-frames is ON.

With adaptive b-frames the encoder decides the number of (consecutive) B-frames, and the B-frames value is then used as a MAXIMUM, rather than a forced number of B-frames. So, my prediction is that for the clips I/we have tested, only 1 B-frame is actually being used with adaptive B-frames turned on. If I'm right, then num_ref_frames will be 6 if ref = 3, B-frames = 3, and B-adapt = no.

If this is true, then to force DXVA compatibility @ 1080p, we really need the following settings to use at most 1 consecutive B-frame and limit num_ref_frames to 4:

--level 4.1 --ref 3 --mixed-refs --bframes 1 --b-adapt

UsedUser
1st January 2008, 13:12
Well, reject that hypothesis as well.

I just tested a clip using the following settings, expecting num_ref_frames = 6:

--level 4.1 --ref 3 --mixed-refs --bframes 3 --no-b-adapt

Adaptive b-frames made no difference. num_ref_frames = 4 in both cases.

So, I still don't know what the magic formula is to arrive at num_ref_frames = 4, but with any clip I've encoded thus far, using the following settings has worked and resulted in successful DXVA @ 1080p:

--level 4.1 --ref 3 --mixed-refs --bframes 3 --b-adapt --no-b-pyramid

So, the only things that need to be changed from the PS3 profile for 1080p are to disable B-pyramids and increase the bitrate to your desired quality/filesize.

As for 720p, I've tested a number of clips with num_ref_frames from 4 to 8, all of which used DXVA, but with choppy playback. I used MP4 instead of MKV, so it isn't the 20fps bug I'm seeing. I need to do more 720p testing.

CruNcher
1st January 2008, 13:44
In some tests of this problem I found that you can work around it by using direct temporal prediction instead of auto or spatial. I'm not sure why this was the case, but it worked. And yes, b-pyramid should never be used for DXVA compatibility; 2 ref frames, by the way, is also the standard in, for example, Sonic Cinevision.

Something else is strange with x264 bitstreams: since the beginning, the declared bitrate flag seems to be wrong. You can see this by renaming a .h264 bitstream to .mpv, playing it back via PowerDVD and activating the OSD: the counter shows a bitrate value that never changes and is too high most of the time. The value in the information tab is also too high; for example, if you encoded with an average of 7500 kbit/s and a max of 10000 kbit/s, it shows 36 Mbit/s, which the stream never reached. You only see this if you rename the bitstream to .mpv, or if you analyze it with Elecard Stream Analyzer, where the Bitrate Declared field is also visible. With bitstreams encoded by Nero (Ateme) and Mainconcept, this value changes according to the stream bitrate.

Now, if you run such an x264 bitstream through h264info without changing any parameter, the counter suddenly starts moving, but with extremely high values, even higher than the fixed value it showed before h264info was used. In the case above, for example, you suddenly see 2200 Mbit/s in the counter, hehe. This doesn't seem right to me.

cscxk
1st January 2008, 13:54
On my system, this plays with DXVA. Job command line:
--level 4.1 --keyint 999 --min-keyint 1 --ref 3 --mixed-refs --no-fast-pskip --bframes 8 --b-rdo --bime --weightb --direct auto --filter -3,-2 --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads auto --thread-input --progress --no-dct-decimate --no-psnr

And as you can see, b-frames is 8.

valnar
1st January 2008, 16:22
@UsedUser.

Keep those tests coming! Great work!

Robert

UsedUser
2nd January 2008, 00:54
on my system,this could play with DXVA.Job commandline:
--level 4.1 --keyint 999 --min-keyint 1 --ref 3 --mixed-refs --no-fast-pskip --bframes 8 --b-rdo --bime --weightb --direct auto --filter -3,-2 --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --me umh --threads auto --thread-input --progress --no-dct-decimate --no-psnr

and you see,b-frames is 8
I concur. B-frames can be pushed all the way up to 16 without breaking DXVA, as long as (num_ref_frames < 5 @ 1080p) or (num_ref_frames < 10 @ 720p) and B-pyramids are OFF.

valnar
2nd January 2008, 01:07
I concur. B-frames can be pushed all the way up to 16 without breaking DXVA, as long as (num_ref_frames < 5 @ 1080p) or (num_ref_frames < 10 @ 720p) and B-pyramids are OFF.

Supposedly if you keep bframes below 3 always, you add PS3 and other H.264 device compatibility too. That's what I'm looking for - the ubiquitous configuration that'll work with everything that can handle hi-def up to level 4.1 and hopefully futureproof the files.

-Robert

UsedUser
2nd January 2008, 01:49
I'm fairly confident now that the feature breaking DXVA really is compliance with Profile High@L4.1, which is to say the num_ref_frames value.

What I didn't initially understand, and still don't fully, is the formula that is used to determine the num_ref_frames value, which is distinct from the "ref" value set when encoding. The "ref" value, the "bframes" value, and the "b-pyramid" value all factor into the final "num_ref_frames" value. The "ref" value seems to contribute its value directly, while adding b-frames only contributes 1 (no matter the "b-frames" value), and b-pyramid seems to contribute 2.

For example:

ref=3 adds 3 to num_ref_frames
b-frames=3 adds 1 to num_ref_frames
b-pyramid adds 2 to num_ref_frames
num_ref_frames = 3 + 1 + 2 = 6 (2 more than are allowed @ 1080p)

It is the num_ref_frames value that must comply with High@L4.1, where (num_ref_frames < 5 @ 1080p) and (num_ref_frames < 10 @ 720p).
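
The rule of thumb described above can be sketched as follows. This only encodes the empirical observations from this post (ref counts directly, B-frames add 1, B-pyramid adds 2); it is not x264's actual logic, and the function names are made up for illustration.

```python
# High@L4.1 limits on num_ref_frames quoted in this thread.
LIMITS = {"1080p": 4, "720p": 9}

def estimate_num_ref_frames(ref, bframes, b_pyramid):
    """Empirical estimate from this post, NOT how x264 really computes it."""
    total = ref          # --ref contributes its value directly
    if bframes > 0:
        total += 1       # any number of B-frames adds 1
    if b_pyramid:
        total += 2       # B-pyramid appears to add 2
    return total

def dxva_ok(resolution, ref, bframes, b_pyramid):
    """True if the estimate stays within the limit for the resolution."""
    return estimate_num_ref_frames(ref, bframes, b_pyramid) <= LIMITS[resolution]

if __name__ == "__main__":
    print(estimate_num_ref_frames(ref=3, bframes=3, b_pyramid=True))  # 6
    print(dxva_ok("1080p", ref=3, bframes=3, b_pyramid=False))        # True
```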

B-pyramids, adaptive b-frames, mixed ref frames, weighted prediction --- all can be enabled, as long as the resulting encode stays under the limit of num_ref_frames.

The easiest way to reliably meet this requirement while encoding is to set the "ref" value one under the max allowed num_ref_frames and to turn off B-pyramids, because they may unpredictably contribute to num_ref_frames. The B-frames value can be anything you want, but with adaptive b-frames, I believe "3" is still the recommended value, unless you're encoding animation, where "8" or more could be justified.


The following encoding settings allow DXVA to work @ 1080p:

num_ref_frames=4 ref=1 b-frames=3 b-pyramid=on
--level 4.1 --ref 1 --bframes 3 --b-pyramid --b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12

num_ref_frames=4 ref=3 b-frames=3 b-pyramid=off
--level 4.1 --ref 3 --mixed-refs --bframes 3 --no-b-pyramid --b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12

num_ref_frames=4 ref=3 b-frames=3 b-pyramid=off b-adapt=off
--level 4.1 --ref 3 --mixed-refs --bframes 3 --no-b-pyramid --no-b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12

num_ref_frames=4 ref=3 b-frames=16 b-pyramid=off
--level 4.1 --ref 3 --mixed-refs --bframes 16 --no-b-pyramid --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12

num_ref_frames=4 ref=4 b-frames=0 b-pyramid=off b-adapt=off
--level 4.1 --ref 4 --mixed-refs --bframes 0 --no-b-pyramid --no-b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12


The following encoding settings BREAK DXVA @ 1080p:

num_ref_frames=5 ref=2 b-frames=2 b-pyramid=on
--level 4.1 --ref 2 --mixed-refs --bframes 2 --b-pyramid --b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12

num_ref_frames=6 ref=3 b-frames=3 b-pyramid=on
--level 4.1 --ref 3 --mixed-refs --bframes 3 --b-pyramid --b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12

num_ref_frames=5 ref=4 b-frames=3 b-pyramid=off
--level 4.1 --ref 4 --mixed-refs --bframes 3 --no-b-pyramid --b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12


The following encoding settings allow DXVA to work @ 720p:

num_ref_frames=9 ref=9 b-frames=0 b-pyramid=off b-adapt=off
--level 4.1 --ref 9 --mixed-refs --bframes 0 --no-b-pyramid --no-b-adapt --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12

num_ref_frames=9 ref=8 b-frames=3 b-pyramid=off
--level 4.1 --ref 8 --mixed-refs --bframes 3 --no-b-pyramid --b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 1

num_ref_frames=8 ref=7 b-frames=3 b-pyramid=off
--level 4.1 --ref 7 --mixed-refs --bframes 3 --no-b-pyramid --b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12

num_ref_frames=7 ref=6 b-frames=3 b-pyramid=off
--level 4.1 --ref 6 --mixed-refs --bframes 3 --no-b-pyramid --b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12

num_ref_frames=9 ref=6 b-frames=3 b-pyramid=on
--level 4.1 --ref 6 --mixed-refs --bframes 3 --b-pyramid --b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12


The following encoding settings BREAK DXVA @ 720p:

num_ref_frames=10 ref=9 b-frames=3 b-pyramid=off
--level 4.1 --ref 9 --mixed-refs --bframes 3 --b-pyramid --b-adapt --b-rdo --bime --weightb --direct auto --subme 6 --trellis 1 --analyse p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-maxrate 25000 --me umh --merange 12
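
As a cross-check, the num_ref_frames values and outcomes listed in this post can be compared against a single rule: DXVA works if and only if num_ref_frames is at most 4 @ 1080p or 9 @ 720p. This is just a sketch over the numbers reported above; the data layout and function name are illustrative.

```python
LIMITS = {1080: 4, 720: 9}  # max num_ref_frames for High@L4.1, from this thread

# (frame height, num_ref_frames, DXVA worked) as reported in the lists above.
RESULTS = [
    (1080, 4, True), (1080, 4, True), (1080, 4, True),
    (1080, 4, True), (1080, 4, True),
    (1080, 5, False), (1080, 6, False), (1080, 5, False),
    (720, 9, True), (720, 9, True), (720, 8, True),
    (720, 7, True), (720, 9, True),
    (720, 10, False),
]

def consistent(results, limits):
    """True if every outcome matches 'works iff num_ref_frames <= limit'."""
    return all((n <= limits[h]) == worked for h, n, worked in results)

if __name__ == "__main__":
    print(consistent(RESULTS, LIMITS))
```

Every reported result fits the rule, which supports the idea that num_ref_frames is the deciding value.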

cscxk
2nd January 2008, 02:41
I think you are right. "num_ref_frames=4 ref=3 b-frames=16 b-pyramid=off" is a stronger test than "num_ref_frames=4 ref=3 b-frames=3 b-pyramid=off b-adapt=off", so "b-adapt=off" is not needed. But should "num_ref_frames=4 ref=4 b-frames=0 b-pyramid=off" allow DXVA?

And is DXVA limited only by num_ref_frames and nothing else? Do you think so?

UsedUser
2nd January 2008, 05:03
"num_ref_frames=4 ref=4 b-frames=0 b-pyramid=off" should allow DxVA?
Yes --- ref=4 b-frames=0 worked with DXVA for me.

and allow dxva only be limited by num_ref_frames,not other.did you think so?
Yeah, the bottom line, I believe, is that allowing DXVA is only limited by the num_ref_frames value.

UsedUser
2nd January 2008, 05:12
One other reasonably important note --- I tested all of the encodes using the Cyberlink H.264 decoder on both an Nvidia 8800GT and an ATI HD2600PRO.

I got a gray screen during playback with every clip using the Nvidia card --- it NEVER successfully displayed a picture.

I got perfect DXVA playback with every compliant clip on the ATI card.

cscxk
2nd January 2008, 06:10
One other reasonably important note --- I tested all of the encodes using the Cyberlink H.264 decoder on both an Nvidia 8800GT and an ATI HD2600PRO.

I got a gray screen during playback with every clip using the Nvidia card --- it NEVER successfully displayed a picture.

I got perfect DXVA playback with every compliant clip on the ATI card.

I have DXVA enabled on both a G98 and a 2600pro.

CruNcher
2nd January 2008, 06:24
UsedUser, I'm using the 8800 GT (G92, VP2) too with Cyberlink's decoder, and before that a 7600 GS (G71, VP1). All compliant streams work with the latest PowerDVD decoder filter and actually give me almost 0% CPU utilization on the G92 without sound; I'm really amazed by that. I have had no problems with it yet as long as I stay within the parameters (I even tried bitstreams with over 200 Mbit/s; they did not play fluidly, but they played) (Forceware 192.21 WHQL and 192.28 Beta). But so far that is only true for raw bitstreams tested with the .mpv extension in PowerDVD.

Those bitstreams also had no playback problems in .mp4, but in .mkv they often gave me problems with the G71: for example, an .mkv where the fps suddenly drops dramatically after some seconds of playback. In my experience the frame rate keeps falling steadily from the normal playback rate; I have not experienced the constant 20 fps problem yet (but I don't watch anime or VFR material, which might cause timecode problems with Cyberlink's decoder and is mostly found in anime and MKV). Demuxing such streams out of the .mkv container and playing the bitstreams back with PowerDVD (7600 GS G71) worked for me, so something other than the hardware limitations of the stream itself must be behind this phenomenon. I asked Haali about a possible problem in the way the PowerDVD filter parses the container, but Haali has not looked into it yet, as he doesn't personally use PowerDVD and its filters.

For Nvidia, though, Cyberlink's decoder seems to be the most compatible for hardware playback, at least on WinXP. So, as hard as it sounds, for compatibility I would advise dropping Matroska .mkv (there is so much Matroska material out there that plays incorrectly with hardware playback; sometimes it just stutters for a few frames) and using the standard recommended .mp4, to hopefully avoid interoperability problems at the moment with Cyberlink's or any other decoder on the market and get correct hardware DXVA playback :(

At the moment PowerDVD is also the only choice for WinXP 8800 GT VP2 PureVideo users, as it always allows hardware-accelerated playback, even with subtitles, for both HD DVD and Blu-ray. It makes no difference which codec is used (MPEG-2, VC-1 or H.264): everything works as it should and is fully hardware accelerated to the capabilities of the chip (VC-1 has higher CPU utilization on Nvidia, as we know).

I tested every software HD DVD and Blu-ray player on the market today except Corel WinDVD 9 Plus, which isn't available yet, and all of them show problems with hardware playback on the 8800 GT. Even Arcsoft Digital Theatre 2 has problems with hardware-accelerated VC-1 playback: it doesn't show the video at all, and only software-mode playback works (and even then HD DVD menus flicker on screen). Nero Showtime generally disables hardware acceleration on XP when the medium uses subtitles (I'm not sure whether it's a bug or a limitation; I have been trying to make it known to Ahead/Nero for a long time, and by now I'm almost sure it's a limitation of their video rendering on XP, i.e. VMR9 usage).

The only ones that work with MPEG-2 and H.264 in this regard are PowerDVD and Arcsoft Digital Theatre 2. Arcsoft has the fastest BDJ engine I have seen so far, but more problems with HD DVD and VC-1 than PowerDVD (whose BDJ engine is not as fast as Arcsoft's but works, and whose menu interoperability with HD DVD and VC-1 is better: no flickering menus and full acceleration to the capabilities of the G92) :) Nevertheless, Arcsoft's player software has great potential once they have squashed these bugs, and it is already better than Showtime with regard to hardware acceleration, which it always uses, even with subtitles. See http://forum.doom9.org/showthread.php?t=133278 for an inside look into their software player :)

kumi
2nd January 2008, 06:31
@Cruncher:

A friendly tip: if you don't add any punctuation at all to your sentences, most people are not going to read past the first 2 or 3 dozen words.

It's unreadable.

foxyshadis
2nd January 2008, 07:30
So basically x264 needs a new parameter to specify the absolute maximum refs any frame can use, to keep b-frame efficiency? Sounds like a good patch if Dark Shikari is interested =D

CruNcher
2nd January 2008, 07:42
Done. Can you read and understand it now?

akupenguin
2nd January 2008, 07:51
So basically x264 needs a new parameter to specify the absolute maximum refs any frame can use, to keep b-frame efficiency?
No, that's what it already has. The new option needed is something to specify DPB size, and just let each frame use as many refs as possible within that constraint. Or I can repurpose --ref to do that: the current method was chosen because direct=temporal benefits from having the same number of L0 refs in all frames, but since direct=spatial is usually better anyway, it's no longer necessary.

CruNcher
2nd January 2008, 08:07
No, that's what it already has. The new option needed is something to specify DPB size, and just let each frame use as many refs as possible within that constraint. Or I can repurpose --ref to do that: the current method was chosen because direct=temporal benefits from having the same number of L0 refs in all frames, but since direct=spatial is usually better anyway, it's no longer necessary.

Ahhh, that would also explain why using direct=temporal worked around the slowdown problems in my previous tests :)

There is a lot to read here about this stuff, which drove me almost crazy, and about the workaround that actually worked at the time. I don't know about the situation now, as so much has changed again (Mp4box, Mkvmerge, x264, Cyberlink's decoder, Haali's splitter, Nvidia's drivers and the new VP2 (PV2)):

http://forum.doom9.org/showthread.php?t=124945
http://forum.doom9.org/showthread.php?t=127712

A lot of the MKVs I tested last month still showed problems: not the old slowdowns or black screen, but frames that begin to stutter with hardware playback. All of them use x264, so there is definitely still something wrong somewhere, and we still have an interoperability problem here (wherever it comes from).

By the way, the stream lowcomplexity-highpro-test.mp4, which showed a black screen when I tested it back then, now works directly in the latest PowerDVD player without a problem. But via Media Player Classic with Haali's or the standard MP4 splitter and Cyberlink's decoder, it just shows the first frame and does not start playing (via VMR9 with the standard video renderer I get a black screen, and I could swear that it would play with Mainconcept's splitter and Annex B output). It works without problems in any other software player, Mplayer or VLC for example, and it also plays normally if DXVA is disabled in the new PowerDVD filter. So the problem here can only be how Cyberlink's decoder handles the information from the other splitters, since in combination with its own splitter in its own player it works painlessly.

So most of these problems arise from a combination of factors that have to come together for painless hardware-accelerated playback under Windows XP: the player software and its rendering mode, the parser/splitter, and the decoder. If these 3 do not interoperate perfectly, you will experience problems with hardware-accelerated playback. I don't know if this is still the case for DXVA 2.0 under Vista, where it seems to be managed more by the platform. The sad thing is that Windows is the only solution at the moment for full hardware-accelerated playback, as even AMD doesn't want to reveal the inner workings of its UVD to the open source community yet, and most probably never will :(

Athlon 64 X2 2.8 GHz (Toledo), Windows XP SP2

lowcomplexity-highpro-test.mp4 <- it's a very simple stream, no real H.264 goodies here, even lower complexity than QuickTime
PowerDVD using Nvidia VP2 (G92) = 0%, full hardware acceleration (how much it stresses the G92 and how much power that uses is not known yet)
CoreAVC 1.6 = up to 13%
MPlayer = up to 14%
VLC = up to 18%

UsedUser
2nd January 2008, 09:24
No, that's what it already has. The new option needed is something to specify DPB size, and just let each frame use as many refs as possible within that constraint. Or I can repurpose --ref to do that: the current method was chosen because direct=temporal benefits from having the same number of L0 refs in all frames, but since direct=spatial is usually better anyway, it's no longer necessary.
What are the implications of repurposing --ref? It seems like the most expedient solution, and the most logical use of --ref, but if the underlying functionality is changed, can you foresee the repercussions for anyone expecting the old functionality?

akupenguin
2nd January 2008, 10:08
What are the implications of repurposing --ref?
It would slightly change the quality and speed of any given ref number. Not the overall quality-per-speed tradeoff, just the mapping of --ref argument to a point on the quality curve. Probably no one would notice.
Another way to think of it is: Currently refs are allocated in a way that's most convenient for the encoder. The alternative is an allocation that maximizes quality per memory use (i.e. Level constraint). If I were to implement the slightly less convenient (for me) but more efficient (for the decoder) arrangement, there's no point in making it just optional.

UsedUser
2nd January 2008, 11:27
It would slightly change the quality and speed of any given ref number. Not the overall quality-per-speed tradeoff, just the mapping of --ref argument to a point on the quality curve. Probably no one would notice.
Another way to think of it is: Currently refs are allocated in a way that's most convenient for the encoder. The alternative is an allocation that maximizes quality per memory use (i.e. Level constraint). If I were to implement the slightly less convenient (for me) but more efficient (for the decoder) arrangement, there's no point in making it just optional.
Given the constraints are being imposed by standards (i.e., High@L4.1) now implemented in hardware and closed-source commercial decoders, it seems it would make the most sense to build in at least the capability to maximize quality within the constraints. I say do it! :)

This discussion is making me curious about the effort required for such changes. Time to dig into the code a bit. :)

CruNcher
2nd January 2008, 11:38
Yep, that is actually what all the big companies currently do: they try to achieve the best visual quality within the given level constraints and the closest look and feel to the input source. But visual tweaking should come after stabilizing everything, and I'm personally not sure x264 is that hardware-stable yet. Testing this on an HD DVD or Blu-ray standalone player and its chip would, I think, reveal where the problems are still hiding (I can't test on those as I'm no SAP user), but I can test it on Nvidia's and ATI's second-generation decoder cores, and that's what I'm currently doing with the G92 and RV670 :)

audyovydeo
2nd January 2008, 12:36
What I didn't initially understand, and still don't fully, is the formula that is used to determine the num_ref_frames value, which is distinct from the "ref" value set when encoding.

UsedUser,

from an exchange I had with fsinapsi some time ago, the answer is in x264/encoder/set.c:

sps->i_num_ref_frames = X264_MIN(16, param->i_frame_reference + sps->vui.i_num_reorder_frames + param->b_bframe_pyramid);

I'm quoting his quote, I am not qualified to delve into code.
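
As a rough illustration of what the quoted line computes, here is a small Python sketch. It is not x264 code: the function and argument names are made up for this example and only stand in for the C fields, and the sample inputs are hypothetical. Only the formula itself comes from the quote.

```python
def num_ref_frames(frame_reference, num_reorder_frames, bframe_pyramid):
    """Mirror of the quoted set.c expression.

    frame_reference    -- stands in for param->i_frame_reference
                          (assumed here to be the encoder's ref setting)
    num_reorder_frames -- stands in for sps->vui.i_num_reorder_frames
    bframe_pyramid     -- stands in for param->b_bframe_pyramid (0 or 1)
    """
    return min(16, frame_reference + num_reorder_frames + int(bframe_pyramid))


# Hypothetical inputs, chosen only to show the arithmetic.
print(num_ref_frames(3, 1, 1))    # 3 + 1 + 1 = 5
print(num_ref_frames(16, 2, 1))   # the sum is 19, capped at 16
```

So the value written to the stream can be larger than the ref setting whenever the other two terms are non-zero.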

What I haven't figured out, though, is that I have always encoded my SD content to sorta-level 3.1, with num_ref_frames <= 6, and it does not seem to get accelerated.

Or maybe it does, sometimes.
Hard to figure.

cheers
audyovydeo

valnar
2nd January 2008, 15:36
It is the num_ref_frames value that must comply with High@L4.1, where (num_ref_frames < 5 @ 1080p) and (num_ref_frames < 10 @ 720p).


I think the only questions I have left are why this is the case and how it is enforced by the video cards for DXVA.

"Why" we may never know, but let me expand on "how".

How does the video card know if something is 1080p or 720p to enforce the 5 or 10 num_ref_frames limit? What is it looking for?

What if you encode the video at 721p or 1079p... or 1001p? What value determines the 5 or 10 num_ref_frames limit in the eyes of the video card? Do those numbers have to be hit exactly (720p or 1080p) for DXVA to work at all? Or is it based on a threshold and approximation? I.e., would 900p (halfway between 720 and 1080) with 7 or 8 num_ref_frames still get you hardware acceleration?

The question is mostly academic since most won't encode for those odd resolutions, but I'm curious.

-Robert

RaynQuist
2nd January 2008, 17:26
...
What value determines the 5 or 10 num_ref_frames limit in the eyes of the video card?
...
Would 900p (half-way between 720 + 1080) mean 7 or 8 num_ref_frames allow you hardware acceleration?


It's the maximum decoded picture buffer that determines the limit.
1920x1080x4 = 8294400
1280x720x9 = 8294400

So 900p would have a limit of
8294400/(1600*900) = 5.76 reference frames, i.e. 5 whole frames

Why does this happen? Because hardware decoders are only going to put as much cache into the chip as necessary; anything more wastes transistors and increases die area. Software decoders can just allocate as much memory as they feel like, since you already have hundreds of megs of RAM.
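
The arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is made up, the default budget is the 8294400 figure used in this post (the first post in the thread uses 8388608 instead), and the cap of 16 and the rounding down are taken from the first post.

```python
def max_ref_frames(width, height, dpb_budget=8294400):
    """Largest whole number of reference frames that fits the budget.

    dpb_budget is a pixel count; 8294400 is the figure from this post
    (1920 * 1080 * 4), not a value read out of the H.264 spec.
    """
    whole_frames = dpb_budget // (width * height)  # integer division rounds down
    return min(16, whole_frames)                   # never more than 16


for w, h in [(1920, 1080), (1600, 900), (1280, 720)]:
    print(f"{w}x{h}: {max_ref_frames(w, h)} reference frames")
```

Passing dpb_budget=8388608 gives the same whole-frame results for these three resolutions.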