x264 Known Hardware acceleration problems and solutions
tetsuo55
5th September 2008, 10:36
I did all my previous testing with the Cyberlink PDVD7 decoder. I will test the new Nvidia drivers with both PDVD7 and PDVD8 decoders to isolate the issue/resolution.
Previously, ref=5 @ 1920x1080 @ any bitrate would break DXVA. Why is a ref=6, 60Mbps stream required to test?
But does the new Nvidia driver work with the old PDVD7 decoder? It could be both the drivers and the decoder had fixes applied.
Question: The new Cyberlink decoder plays actual L5.1 streams, or just L4.1 streams with the IDC flag set to 5.1?
I said ref 6 and 60 Mbit/s because then we can be 100% sure that L5.1 is being decoded (L4.1 should never work with those combined).
The file has to be L5.1; this has nothing to do with the IDC flag (although it might have to be set to L4.1 to trick Cyberlink).
blizzbit
25th September 2008, 10:40
Two questions regarding the first post of this thread. :)
1. I'm assuming 8355840 = mod16_width * mod16_height * 4 = 1920 * 1088 * 4, but why?
STEP 1: Determine the REF frame in DPB limit:
8355840 / (Height X Width) = nREF
Imho
Max Ref = Pixel(YV12) / (mod16_width * mod16_height)
In this case for L4.1: Annex A of the H.264 specification shows the Max DPB in bytes for L4.1 is 12288,0 * 1024. Divided by 1,5 to convert from bytes to pixels (1,5 bytes per pixel in YV12 format) is 8388608 (instead of 8355840). So for resolution 1920 * 1088 max refs is 8388608 / (1920 * 1088) = 4 (rounded down).
For L3.0:
Max Ref = 3037,5 * 1024 / 1,5 / (720 * 576)
Max Ref = 2073600 / (720 * 576) = 5 (rounded down)
So for L3.0 it's 2073600 as described in the first post.
STEP 1: Determine the REF frame in DPB limit:
2073600 / (Height X Width) = nREF
2. In STEP2 for L3.0 encoding the highest possible value for CPB is 10000 and max bitrate is 12500.
STEP 2: Make sure use these commands and never cross the limits, shown here are the highest/lowest settings(not marked) or mandatory settings(marked with *):
--vbv-bufsize 10000 (highest possible value)
--vbv-maxrate 12500 (highest possible value)
[...]
But level limits (CPB/Max BR) for L3.0 are 10000/10000 or 12000/12000 (High Profile).
http://img409.imageshack.us/img409/9297/level3da4.th.jpg (http://img409.imageshack.us/my.php?image=level3da4.jpg)
So why 10000/12500 for L3.0@High encoding?
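A minimal sketch of the reference-frame arithmetic from question 1, in Python. The function name `max_ref_frames` and the lookup table are made up for illustration; the MaxDPB figures are the ones quoted above from Annex A, and 1.5 bytes per pixel assumes YV12.

```python
# MaxDPB per level in units of 1024 bytes, as quoted in the post above.
MAX_DPB_KIB = {"3.0": 3037.5, "4.1": 12288.0}

def max_ref_frames(max_dpb_kib, width, height, bytes_per_pixel=1.5):
    """Reference-frame limit for a frame size that is already mod16."""
    dpb_pixels = max_dpb_kib * 1024 / bytes_per_pixel
    # Round down: a partially fitting frame does not count.
    return int(dpb_pixels // (width * height))

print(max_ref_frames(MAX_DPB_KIB["4.1"], 1920, 1088))  # 4
print(max_ref_frames(MAX_DPB_KIB["3.0"], 720, 576))    # 5
```

The intermediate pixel counts are 8388608 for L4.1 and 2073600 for L3.0, matching the numbers in the post.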
Sulik
25th September 2008, 16:42
NVidia official WHQL drivers 178.13 were just released today. It now appears to fully support 1080p files with more than 4 reference frames.
Inventive Software
25th September 2008, 16:46
This being the start of the Level 5.1 support they promised? :rolleyes:
Sulik
25th September 2008, 16:56
I'm guessing it's more like allowing 'out-of-spec' level 4.1, which is what the level 5.1 content out there really is.
tetsuo55
27th September 2008, 19:07
Two questions regarding the first post of this thread. :)
1. I'm assuming 8355840 = mod16_width * mod16_height * 4 = 1920 * 1088 * 4, but why?
Imho
Max Ref = Pixel(YV12) / (mod16_width * mod16_height)
In this case for L4.1: Annex A of the H.264 specification shows the Max DPB in bytes for L4.1 is 12288,0 * 1024. Divided by 1,5 to convert from bytes to pixels (1,5 bytes per pixel in YV12 format) is 8388608 (instead of 8355840). So for resolution 1920 * 1088 max refs is 8388608 / (1920 * 1088) = 4 (rounded down).
For L3.0:
Max Ref = 3037,5 * 1024 / 1,5 / (720 * 576)
Max Ref = 2073600 / (720 * 576) = 5 (rounded down)
So for L3.0 it's 2073600 as described in the first post.
2. In STEP2 for L3.0 encoding the highest possible value for CPB is 10000 and max bitrate is 12500.
But level limits (CPB/Max BR) for L3.0 are 10000/10000 or 12000/12000 (High Profile).
http://img409.imageshack.us/img409/9297/level3da4.th.jpg (http://img409.imageshack.us/my.php?image=level3da4.jpg)
So why 10000/12500 for L3.0@High encoding?
The number you're using is probably a little more accurate than my guesstimate; it doesn't really matter for compliance calculation though. However, I updated the first post to match the number you provided, so thanks!
I'm not sure where your screenshot comes from, but I got the info from Wikipedia and doom9 experts.
http://en.wikipedia.org/wiki/H264#Levels
blizzbit
27th September 2008, 23:15
The number you're using is probably a little more accurate than my guesstimate; it doesn't really matter for compliance calculation though. However, I updated the first post to match the number you provided, so thanks!
I like your work in this thread. So thank you! :thanks:
I'm not sure where your screenshot comes from, but I got the info from Wikipedia and doom9 experts.
http://en.wikipedia.org/wiki/H264#Levels
It's from the ITU-T. H.264: Advanced video coding for generic audiovisual services. Recommendation H.264 (11/07), Table A-1 – Level limits, page 283. Take a look here (http://www.itu.int/rec/T-REC-H.264-200711-I/en). There is a PDF file you can download.
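One possible reading of the 10000/12000/12500 figures discussed above, sketched in Python. Table A-1 lists base values, and Annex A scales them by a per-profile factor. The factors used here (1000/1200 for Main and 1250/1500 for High, for the VCL and NAL HRD respectively) are quoted from memory of the specification and should be verified against the PDF linked above; the function and dictionary names are hypothetical.

```python
# Level 3 base values from Table A-1 (the screenshot discussed above).
MAX_BR = 10000
MAX_CPB = 10000

# ASSUMPTION: (VCL factor, NAL factor) per profile, recalled from Annex A.
# Verify against the specification before relying on these.
PROFILE_FACTORS = {
    "Main": (1000, 1200),
    "High": (1250, 1500),
}

def level3_limits_kbit(profile):
    """Scale the Level 3 base values by the profile factors (result in kbit)."""
    vcl, nal = PROFILE_FACTORS[profile]
    return {
        "vcl_maxrate": MAX_BR * vcl // 1000,
        "vcl_cpb": MAX_CPB * vcl // 1000,
        "nal_maxrate": MAX_BR * nal // 1000,
        "nal_cpb": MAX_CPB * nal // 1000,
    }

for profile in PROFILE_FACTORS:
    print(profile, level3_limits_kbit(profile))
```

Under these assumptions, --vbv-maxrate 12500 would sit exactly at the High@L3.0 VCL limit, while 12000 would be the Main-profile NAL figure.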
azulazules
30th September 2008, 08:59
Hi! I have trouble getting DXVA to play SD-resolution video on an HD2400 Pro PCI, while I have no problem with full HD resolutions (up to 1920 x 1088 px at 25 Hz), using MPC HC from v604 to v811.
The problematic video was created from one of my DVDs.
Here are the parameters and the calculations I made to make sure it is DXVA-compatible and the level is set up correctly.
I used x264 core 64 r979
Video resolution is: 720x432 (encoded anamorphic DAR 2.35 in MP4)
FPS: 25hz progressive
ref#: 16
bframes#: 5
bpyramid
High@3.2, CABAC, Trellis2
VBVMaxRate=20000
VBVBufSize=3200
BitRate = 587kb/sec
MVRange = 511
Computed DPB = 16*720*432*1.5 = 7464960 (max 7864320 at High@3.2, if I am right)
Computed #MB = 720*432/256 = 1215 (max 5120 at High@3.2, if I am right)
Computed #MB/sec = 1215*25 = 30375 (max 216000, if I am right)
I thought I understood how to make sure DXVA will work on my card, but it seems I am wrong somewhere.
Do you see anything wrong in the above? :confused:
Nice day!
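The three checks in this post can be reproduced with a short Python sketch. The limits are the High@3.2 values quoted in the post itself (not independently verified here), and the function name `level_check` is hypothetical.

```python
# High@3.2 limits as quoted in the post above.
MAX_DPB_BYTES = 7864320
MAX_FRAME_MBS = 5120
MAX_MBS_PER_SEC = 216000

def level_check(width, height, fps, refs):
    """Return (value, within_limit) for each of the three computed figures."""
    dpb_bytes = refs * width * height * 3 // 2   # 1.5 bytes per pixel (YV12)
    frame_mbs = (width * height) // 256          # 16x16 macroblocks per frame
    mbs_per_sec = frame_mbs * fps
    return {
        "dpb_bytes": (dpb_bytes, dpb_bytes <= MAX_DPB_BYTES),
        "frame_mbs": (frame_mbs, frame_mbs <= MAX_FRAME_MBS),
        "mbs_per_sec": (mbs_per_sec, mbs_per_sec <= MAX_MBS_PER_SEC),
    }

for name, (value, ok) in level_check(720, 432, 25, 16).items():
    print(f"{name}: {value} ({'ok' if ok else 'over limit'})")
```

All three figures come out within the quoted limits, so the level arithmetic alone does not explain the DXVA failure.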
MatMaul
30th September 2008, 12:02
it has been said a million times : ref > 11 doesn't work...
blizzbit
30th September 2008, 12:25
Hi azulazules,
Do you see anything wrong in the above? :confused:
Yep, take a look here (http://www.avsforum.com/avs-vb/showthread.php?t=972503) and here (http://www.avsforum.com/avs-vb/showthread.php?p=12704376#post12704376) (resolution: max num_ref_frames).
Everybody should be encoding to L3.1 with L3.1 set in the stream for SD (576p, 480p, or less).
Keep in mind that, despite the resolution, it is still generally recommended to only use between 1-5 reference frames, usually centering on 3.
The max reference frames can be broken down as follows. The number of reference frames is the max at the given resolution, so if your resolution is between the resolutions given, use the lower number of reference frames (i.e., 1920x816 is between 1920x720 and 1920x864, so you can have a max of 5 reference frames). B-frames count towards one reference frame, which means when encoding with x264, --ref needs to be set to one less than each value (i.e., if max num_ref_frames = 4, then --ref 3 should be used).
Max decoded picture buffer size (MaxDPB) for High@L3.1 is 6750 * 1024 bytes.
Pixel(YV12) = DPB(bytes) / 1,5
Pixel(YV12) / (mod16_width * mod16_height) = Max Ref
Max Ref =
6750 * 1024 / 1,5 / (720 * 432)
= 4608000 / 311040
= 14 (rounded down)
And don't forget this value has to be reduced for b-frames and b-pyramids (see above: 'resolution: max num_ref_frames').
Have a nice day, too! :)
azulazules
30th September 2008, 23:50
@MatMaul & @blizzbit thanks for your time and help!
I decided to dig based on your inputs and linked references to help me understand. All, sorry if the following investigation was done already.
I encoded 500 frames with x264 r979 at 1280x400 @ 23.976 Hz, Main@4.0, bframes=6, b-pyramid, and found:
– with ref#=11: Cyberlink (10% CPU) & MPC HC Decoder (11% CPU) both did DXVA
– with ref#=12: Cyberlink (10% CPU) did DXVA & MPC HC Decoder (35% CPU) fell back to software ffdshow
– with ref#=16: Cyberlink (10% CPU) did DXVA & MPC HC Decoder (40% CPU) fell back to software ffdshow
using MPC HC v746 each time to render the video.
Maybe Cyberlink did only GPU assisted decoding instead of bitstream, I don't know. But it seems that:
– (assisted?) DXVA is possible with ref# up to 16 with Cyberlink
– bitstream DXVA is possible with ref# up to 11 with MPC HC Decoder
Calculations done:
maxRef# = 12288*1024/(1280*400*1.5) = 16
where 12288 kB is the size of the DPB for an H.264 Main@4.0 compliant decoder (source: wiki x264).
X264 (fast) command line I used:
--ref ?? --aq-strength 0.88 --crf 16 --no-chroma-me --me dia --subme 1 --partitions none --trellis 0 --direct auto --bframes 6 --weightb --b-bias 0 --b-pyramid --b-adapt 2 --min-keyint 48 --keyint 480 --no-cabac --progress --no-psnr --no-ssim --level 4
So, I tend to think MPC HC bitstream DXVA is limited to 11 refs, and Cyberlink seems capable of good GPU-assisted decoding up to ref#=16, at least on the ATi HD2400 Pro.
Then, tried 720x432 @ 23.976, Main@4.0 and:
- Cyberlink did DXVA (10% CPU again)
- MPC HC bitstream decoder fell back to ffdshow software (25% CPU)
And finally, I was able to decode my 720x432 @ 25hz High@3.2 video with DXVA thanks to Cyberlink.
Basic settings used: ref#16, bframes#6, bpyramid!, trellis#2, partitions all but p4x4 with dct, mixedref.
Interesting... :) ...but at the end I guess I will stick to suggested ref#=11 anyway as I like MPC HC DXVA decode better than Cyberlink's (PDVDv7).
Nevertheless, I guess I will ask the MPC HC thread if the limit of ref#11 is due to bitstream decoding technology limitation or miscalculation of the max ref# in MPC HC.
qyqgpower
1st October 2008, 04:27
I've got trouble with SD DXVA on a Radeon; the same clip can be decoded flawlessly on an NVIDIA card. HD-DXVA is fine with both cards.
Tested GPUs:
Radeon 4850 with Catalyst 8.8 & 8.9: serious random blocky pictures, with both MPC HC Internal and Cyberlink Decoder
GeForce 9600GT with 178.13: flawless
The x264 settings are pretty conservative for SD-DXVA, in my opinion.
--crf 20 --level 3.1 --keyint 240 --min-keyint 24 --ref 5 --bframes 6 --b-adapt 2 --b-rdo --bime --weightb --direct auto --subme 7 --trellis 2 --psy-rd 0:0 --partitions p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-bufsize 14000 --vbv-maxrate 14000 --me tesa --merange 32 --threads 3 --thread-input --progress --no-psnr --no-ssim --sar 32:27
Even with the SD-DXVA profile provided with MeGUI auto-update, I had no luck getting successful SD-DXVA decoding on the 4850.
Here is a sample clip: http://www.mediafire.com/?gtnzwjzunyt
Have I missed anything with SD-DXVA encoding?
qyqgpower
1st October 2008, 05:51
I forgot to mention one more situation:
If the SD clip is encoded as interlaced, it can be played flawlessly with Cyberlink Decoder in DXVA mode on Radeon 4850.
--crf 20 --keyint 300 --min-keyint 30 --ref 5 --mixed-refs --bframes 6 --b-adapt 2 --b-pyramid --b-rdo --bime --weightb --subme 7 --trellis 2 --psy-rd 0:0 --partitions p8x8,b8x8,i4x4,i8x8 --8x8dct --vbv-bufsize 10000 --vbv-maxrate 17500 --me umh --merange 32 --threads 3 --thread-input --progress --no-psnr --no-ssim --interlaced --sar 32:27 --nal-hrd
As you can see, I didn't even specify --level, and I used --b-pyramid. While it works for the interlaced clip, the settings in the previous post are totally broken on the 4850. So this issue is a bit strange now.
azulazules
2nd October 2008, 10:47
@qyqgpower:
In your first post you set the level of your 720x480@23.976fps encode to High@3.1, and your settings look OK to me compared to the High@3.1 spec. In your second post you do not set the level (and encode interlaced), so I guess x264 then registers High@5.1 (you may want to double-check that).
Maybe you can try to set the level of your first progressive encode to High@4.1 (four dot one), as this could enable hardware-assisted video decoding with the Cyberlink h264 decoder from PDVDv7 in MPC HC.
I had the same experience as you're having and making sure the level is High@4.1 for SD solved the issue for me (using Cyberlink decoder with MPC HC).
qyqgpower
2nd October 2008, 11:29
Thank you for the reply.
x264 has changed its behavior since r915 when --level is not set. The file encoded with the second setting is marked as High@L3.2.
The main reason for SD-DXVA is: it's annoying to switch between decoders every time for different kinds of files (for example, Interlaced ? Cyberlink (DXVA deinterlacer/IVTC) : CoreAVC).
I'll try your suggestion now.
tetsuo55
2nd October 2008, 12:36
the 11 ref frame limit has several reasons.
Driver limits, hardware limits, decoder implementation limits.
The 16ref frames files should work even with b-pyramids if the encoder correctly encodes the file and the decoder correctly handles the stream.
We need programmers to help create "perfect" decoding specs for MPC-HC in bitstream mode, AND we need help adding assisted decoding. After that, everything should be ported back upstream to ffmpeg if and where possible.
qyqgpower
2nd October 2008, 13:44
--level 4.1 for SD-DXVA failed too.
I even tried my HD-DXVA settings with SD material; nothing changed, it still displays totally random blocky pictures (in a few rare cases, no blocking was shown during the entire playback).
At the same time, 9600GT is fine with all these test cases.
azulazules
2nd October 2008, 14:17
@qyqgpower:
Do you have an AGP Radeon 4850 card, if such a thing even exists? I read somewhere once that AGP's too-low bandwidth causes problems with ATi DXVA.
Also, will you give these (low on purpose) x264 settings a try for an additional test on your progressive 720x480@23.976fps video?
--ref 12 --crf 16 --no-chroma-me --me dia --subme 1 --partitions none --trellis 0 --direct auto --bframes 6 --weightb --b-bias 0 --b-pyramid --b-adapt 2 --min-keyint 24 --keyint 240 --no-cabac --progress --no-psnr --no-ssim --level 4 --vbv-maxrate 14000 --vbv-buf 3200 --mvrange 511
qyqgpower
2nd October 2008, 15:18
No, I'm using PCI-E 4850.
The above settings still produce the same blocky pictures, so I think there must be something wrong with AMD's driver, or my whole system is screwed up ;)
Sagekilla
2nd October 2008, 15:27
@MatMaul: I don't see why that would be true. The number of reference frames you can use relates to the size of your frames. Max refs for 1920x1080p = 4 refs, for 1280x720p = 9 refs, for 1280x544p = 11 refs. Once you go to an even lower resolution, like 848x480, you can use 16 refs since it would take up, more or less, the same space in the DPB as 1920x1080 w/ 4 refs or any of the other combinations.
The kicker? I've used DXVA with 16 refs on SD video. No issues.
Also, everyone please keep in mind that the issue of having to use fewer refs in the presence of B's does not exist anymore. It was fixed in an old patch a while ago that automatically reduced the number of b-refs so you wouldn't have to decrease --ref to compensate.
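The per-resolution limits can be tabulated with a small Python sketch, assuming the L4.1 MaxDPB of 12288 * 1024 bytes quoted earlier in the thread, mod16 rounding of both dimensions, YV12 (1.5 bytes per pixel), and the usual cap of 16 reference frames. The helper names are hypothetical.

```python
MAX_DPB_BYTES_L41 = 12288 * 1024  # L4.1 MaxDPB as quoted earlier in the thread
MAX_REFS = 16                     # upper bound on reference frames

def mod16(n):
    """Round up to the next multiple of 16 (macroblock size)."""
    return (n + 15) // 16 * 16

def max_refs(width, height):
    frame_bytes = mod16(width) * mod16(height) * 3 // 2  # YV12
    return min(MAX_DPB_BYTES_L41 // frame_bytes, MAX_REFS)

for w, h in [(1920, 1080), (1280, 720), (1280, 544), (848, 480)]:
    print(f"{w}x{h}: {max_refs(w, h)} refs")
```

Plain division gives 12 rather than 11 for 1280x544 under these assumptions; the lower figure in the post may leave headroom or come from a more conservative table. The other three values (4, 9 and 16) match.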
azulazules
2nd October 2008, 15:29
:( Oh...
I'll leave this link here in case it helps: http://home.comcast.net/~exdeus/ati-hd2x00/
If so, let us know.
Edit: back at home, I was able to play your video with both MPC HC Video Decoder bitstream DXVA and Cyberlink PDVD7 DXVA just fine (no blocking). :confused:
Turtleggjp
12th October 2008, 07:48
Is there any danger using more than 3 B frames with hardware acceleration? So far I have tested clips like this (--bframes 5) on a Radeon 3450 and a GeForce 8500GT and have had no trouble. However, I should have a Popcorn Hour A-110 on the way, and I hope that it won't have any trouble either. If so, I'll need to redo a bunch of my HDTV show encodes, and adjust my settings for all the ones I still have to do. Thanks!
Matt
Quark.Fusion
12th October 2008, 12:21
Can someone explain to me how the number of b-frames can be limited by hardware if they are already supported? Do they need any additional memory or affect other frames?
Turtleggjp
14th October 2008, 15:31
I wouldn't think it would matter, since all you need in memory is the frame before and the frame after the B group. The number of frames in between shouldn't matter. However, I think DVD MPEG2 video is limited to 2 B frames, and I thought I read that Blu Ray AVC video was limited to 3 B frames. Although I don't really care about making my videos Blu Ray compatible, I do want the Popcorn Hour A-110 to be able to play them. I was hoping to hear from someone that has one if they are able to play x264 encodes with more than 3 B frames.
laserfan
14th October 2008, 16:27
Although I don't really care about making my videos Blu Ray compatible, I do want the Popcorn Hour A-110 to be able to play them. I was hoping to hear from someone that has one if they are able to play x264 encodes with more than 3 B frames.
But maybe it's a crap shoot i.e. "you might get lucky" with >3 B frames, but why risk it?
I'm not on the inside with Sigma Designs (inside the PCH), but if their chipset is designed to be Blu-ray compatible then I would surely want to stick to 3 B frames myself.
Ranguvar
14th October 2008, 20:15
@Quark.Fusion: Companies dealing with video are, in general, collections of idiots.
@laserfan: Huh? "You might get lucky"? The number of b-frames allowed should be fixed across each device of the same model, and it does help to have more... Why would Sigma (logically) want to keep 3 b-frames max, instead of going all-out? I'm sure Joe consumer would be happier with more, even if it just means he can play more of his who-knows-where-they-came-from videos. Even if there's stuttering, it seems to make more sense to at least try to play everything, even if it doesn't work out so well. Of course, companies are idiots...
Turtleggjp
14th October 2008, 21:07
But maybe it's a crap shoot i.e. "you might get lucky" with >3 B frames, but why risk it?
I'm doing it because in my tests, using 5 B frames did offer a good savings on space. I would like to use this if possible.
I'm not on the inside with Sigma Designs (inside the PCH), but if their chipset is designed to be Blu-ray compatible then I would surely want to stick to 3 B frames myself.
According to the Popcorn Hour's website, it supports AVC High Profile @ Level 4.1. It does not say Blu Ray. As far as I know, there is no specific limit on number of B frames for level 4.1, other than the maximum of 16 for the AVC standard. Am I wrong?
Quark.Fusion
14th October 2008, 23:24
As Dark Shikari says: There is no max number of B-frames in any H.264 profile (except obviously Baseline). x264's limit of 16 is simply arbitrary for technical reasons.
laserfan
15th October 2008, 02:28
I'm doing it because in my tests, using 5 B frames did offer a good savings on space. I would like to use this if possible.
According to the Popcorn Hour's website, it supports AVC High Profile @ Level 4.1. It does not say Blu Ray. As far as I know, there is no specific limit on number of B frames for level 4.1, other than the maximum of 16 for the AVC standard. Am I wrong?
I have a PCH A100 and from what I can gather from the PCH forum site, whenever anyone complains about files that don't play, B-frames >3 are cited as one possible reason.
You might want to surf that forum, or join-and-post there about 5 B-frames.
Sagekilla
15th October 2008, 06:03
Ergh, how unfortunate.. Popcorn Hour looks -extremely- tempting but I wouldn't bother re-encoding my movies with 6 B's from the original Blu-ray again.
Turtleggjp
15th October 2008, 15:13
Exactly my point, except for me it's not only Blu Rays but TV shows I have captured. This means that until I know for sure (when I get the PCH, probably not until early next month) I'll be holding on to my original .ts recordings so I can re-encode from the source if needed. That's about 13GB per week. I think I can manage that.
laserfan
15th October 2008, 17:43
Well, don't shoot me if I'm wrong, but it did appear to me that this was an issue with the PCH.
I never download videos off the internet--that seems to be where most of the complaints are coming from on that forum (where anything goes it seems wrt video origins)... :scared:
UsedUser
15th October 2008, 23:06
Can someone explain to me how the number of b-frames can be limited by hardware if they are already supported? Do they need any additional memory or affect other frames?
B-frames require references, and hardware has ref / DPB size limits.
Mr VacBob
16th October 2008, 00:13
That limits B pyramid depth, not B-frames, and x264 only supports one-level-deep B-pyramid.
Sagekilla
16th October 2008, 05:37
Would using deeper levels of B-Pyramid help any?
UsedUser
17th October 2008, 00:18
That limits B pyramid depth, not B-frames, and x264 only supports one-level-deep B-pyramid.
Yeah, ya got me there. I was thinking of how enabling B-pyramid used to increase DPB size beyond the value specified by --ref. Now that we have that fix, B-pyramid doesn't cause DPB size to exceed the --ref value, and it doesn't have anything to do with the number of B-frames used, anyway.
professor_desty_nova
17th October 2008, 08:08
Yeah, ya got me there. I was thinking of how enabling B-pyramid used to increase DPB size beyond the value specified by --ref. Now that we have that fix, B-pyramid doesn't cause DPB size to exceed the --ref value, and it doesn't have anything to do with the number of B-frames used, anyway.
According to the thread B-pyramids breaking DPB limits discussion (http://forum.doom9.org/showthread.php?t=140223), it seems there are occasions where b-pyramids still break the DPB limit. Since I don't remember seeing a fix for it in the git since this discussion, if you want a 100% compliant AVC stream don't use b-pyramids for now.
UsedUser
17th October 2008, 23:10
According to the thread B-pyramids breaking DPB limits discussion (http://forum.doom9.org/showthread.php?t=140223), it seems there are occasions where b-pyramids still break the DPB limit. Since I don't remember seeing a fix for it in the git since this discussion, if you want a 100% compliant AVC stream don't use b-pyramids for now.
I believe that's a little different.
--ref used to specify an absolute number of references for any frame; now it specifies DPB size.
The problem is that with --ref=x, B-pyramid currently requires DPB=x+1.
The result isn't exceeding the DPB size, it's a broken reference to a frame that isn't in the DPB.
UsedUser
17th October 2008, 23:35
That limits B pyramid depth, not B-frames, and x264 only supports one-level-deep B-pyramid.
Yeah, ya got me there. I was thinking of how enabling B-pyramid used to increase DPB size beyond the value specified by --ref. Now that we have that fix, B-pyramid doesn't cause DPB size to exceed the --ref value, and it doesn't have anything to do with the number of B-frames used, anyway.
Disregard my concession above. I believe it does limit B-frames. Someone can correct me if I'm wrong.
The number of B-frames is related to DPB size. B-frames require references. You can't have more referenced frames than frames in the DPB. Enabling B-pyramids exacerbates the situation further, by requiring an additional frame in the DPB to satisfy all references.
Back to my original assertion:
Adding more B-frames requires more references. More references may require more frames to be stored in the DPB. DPB is limited on hardware, whether artificially by decoder profile level limits (i.e., L4.1), or by absolute quantity of memory.
MasterNobody
18th October 2008, 00:46
Adding more B-frames requires more references.
You are wrong in this statement, and that's why your conclusion is incorrect.
UsedUser
18th October 2008, 01:35
You are wrong in this statement, and that's why your conclusion is incorrect.
Ok. Care to explain?
I went back over my DXVA tests and thought it through, and it seems to hold true, but I'm not entirely sure why.
If you've got more B-frames than allowed frames in the DPB (i.e., --ref 4 --bframes 16), are decoded B-frames only allowed to reference other frames within the DPB? The number of B-frames indicates the number of B-frames per... what?
akupenguin
18th October 2008, 11:20
Conventional B-frames don't go into the DPB. They're just decoded, immediately displayed, and discarded. The decoder can do that any number of times in a row without changing state.
asarian
19th October 2008, 14:43
I'm not sure if this is the right place to ask, but will x264 support CUDA (Compute Unified Device Architecture) in the future? Right now, re-encoding a 1080p VC-1 stream can easily take > 20 hours on my quad-core (with some hefty settings). It sure would be nice if this could be done 7x faster (to follow nVidia's claim) using, say, a GeForce GTX 280.
The idea of using a GPU is good anyway, right?
Sagekilla
19th October 2008, 19:16
So theoretically, we could have a more or less motionless scene that has one keyframe followed by hundreds of B's? Like IBB(1000)P or something, and the decoder would be fine with this?
Shinigami-Sama
19th October 2008, 22:05
I'm not sure if this is the right place to ask, but will x264 support CUDA (Compute Unified Device Architecture) in the future? Right now, re-encoding a 1080p VC-1 stream can easily take > 20 hours on my quad-core (with some hefty settings). It sure would be nice if this could be done 7x faster (to follow nVidia's claim) using, say, a GeForce GTX 280.
The idea of using a GPU is good anyway, right?
discussed to death
:search:
asarian
20th October 2008, 04:39
discussed to death
:search:
That's the problem, isn't it? :) Much "YAY! Wouldn't it be nice if we had it!" or "It's in the works for x264", but nothing concrete, let alone examples on how to use it on a command-line.
Sagekilla
20th October 2008, 05:11
I'm sorry, what are you trying to say? there is no GPU acceleration for x264, everything available is easy to find using --longhelp and there's no real future for GPU based encoding for x264 at this time. It's simply not feasible right now, and it's already been tried.
Go ahead, do some searching and you'll find Dark Shikari already attempted to (and even Avail Media, I believe) port something as simple as the SAD function to the GPU, and it performed miserably compared to the CPU version because of how difficult it is to optimize for a GPU.
Yes, the idea of using a GPU is good for encoding. You basically have a massively parallel processor to work with, but it's so difficult to code for that no one is pursuing it. Plus, the commercial implementations that do exist absolutely suck both speed and quality wise. x264 is easily as fast and better quality than the commercial apps, so there's no point in trying them (they're all baseline I believe)
asarian
20th October 2008, 06:20
I'm sorry, what are you trying to say? there is no GPU acceleration for x264, everything available is easy to find using --longhelp and there's no real future for GPU based encoding for x264 at this time. It's simply not feasible right now, and it's already been tried.
Go ahead, do some searching and you'll find Dark Shikari already attempted to (and even Avail Media, I believe) port something as simple as the SAD function to the GPU, and it performed miserably compared to the CPU version because of how difficult it is to optimize for a GPU.
Yes, the idea of using a GPU is good for encoding. You basically have a massively parallel processor to work with, but it's so difficult to code for that no one is pursuing it. Plus, the commercial implementations that do exist absolutely suck both speed and quality wise. x264 is easily as fast and better quality than the commercial apps, so there's no point in trying them (they're all baseline I believe)
Well, thanks for clarifying this. I had indeed read that a guy had been hired at Avail Media to try and implement it, but I was unclear on where things are now. I can see how difficult it would be. I guess nVidia's propaganda had made me a bit too enthusiastic. :)
Again, thanks for the explanation.
Turtleggjp
21st October 2008, 02:09
Conventional B-frames don't go into the DPB. They're just decoded, immediately displayed, and discarded. The decoder can do that any number of times in a row without changing state.
This makes the most sense to me, but then what about when you enable b-pyramid? Doesn't that mean that now B frames can also be referenced, meaning they would need to be kept in the DPB?
Sagekilla
21st October 2008, 02:28
The ones that do get referenced, yes, would be put into the DPB. Not all B's are referenced though.
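The point about which frames occupy the DPB can be illustrated with a toy model in Python. This is only a sketch under simplifying assumptions: a plain sliding window with no memory-management commands, and no buffering of pictures waiting for output. The function name and the frame labels ("B" for a referenced B-frame, "b" for a non-referenced one) are chosen for illustration.

```python
from collections import deque

def dpb_trace(decode_order, dpb_size):
    """Return the DPB contents after each decoded frame (toy sliding window)."""
    dpb = deque(maxlen=dpb_size)  # oldest reference is evicted when full
    trace = []
    for index, frame_type in enumerate(decode_order):
        if frame_type != "b":     # non-referenced B-frames are never stored
            dpb.append(f"{frame_type}{index}")
        trace.append(list(dpb))
    return trace

# 1000 conventional B-frames in a row leave the DPB untouched.
plain = dpb_trace(["I", "P"] + ["b"] * 1000, dpb_size=4)
print(plain[-1])   # ['I0', 'P1']

# A referenced B-frame (b-pyramid) takes a slot and can push out an older reference.
pyramid = dpb_trace(["I", "P", "B", "b", "b"], dpb_size=2)
print(pyramid[-1])  # ['P1', 'B2']
```

In the second case the referenced B-frame displaces I0, which is one way to picture why b-pyramid was said above to need one more DPB slot than --ref alone suggests.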