Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
4th May 2008, 18:56 | #1 | Link |
Codec tester
Join Date: May 2003
Location: France
Posts: 2,484
|
CUDA Challenge for x264 ... ?
http://www.nvidia.fr/object/cuda_con...il2008_fr.html
http://www.nvidia.fr/content/EMEAI/C..._terms_fr.html
__________________
Le Sagittaire ... ;-) 1- Ateme AVC or x264 2- VP7 or RV10 only for anime 3- XviD, DivX or WMV9 |
6th May 2008, 07:00 | #5 | Link | |
Solaris: burnt by the Sun
Join Date: Oct 2004
Location: /etc/default/moo
Posts: 1,923
|
Quote:
The FX 5200, first-gen DX9! It will be nice to see this stuff make it into x264. It's definitely growing fast these days; kind of sad that the big commercial companies still can't compete. |
|
6th May 2008, 09:47 | #6 | Link |
Turkey Machine
Join Date: Jan 2005
Location: Lowestoft, UK (but visit lots of places with bribes [beer])
Posts: 1,953
|
Yeah, Shinigami, except it cheated with DX9 and fell back to DX8.1 when it felt like it!
__________________
On Discworld it is clearly recognized that million-to-one chances happen 9 times out of 10. If the hero did not overcome huge odds, what would be the point? Terry Pratchett - The Science Of Discworld |
6th May 2008, 14:04 | #7 | Link |
aka XaS
Join Date: Jun 2005
Location: France
Posts: 1,122
|
Too bad, I switched from a Geforce 8600 GTS to a Radeon HD 3870 and I regret choosing ATI, not only because of its poor drivers and performance, but now even more so because of CUDA. *Looks at his HD 3870 with an evil grin mwahahahaha*
__________________
Q9300 OC @ 3.2ghz / Asus P5E3 / 4GB PC10600 / Geforce 8600 GTS |
7th May 2008, 03:24 | #11 | Link | |
Registered User
Join Date: Jan 2004
Posts: 849
|
Quote:
__________________
Geforce GTX 260 Windows 7, 64bit, Core i7 MPC-HC, Foobar2000 |
|
7th May 2008, 17:33 | #13 | Link | |
Registered User
Join Date: Jan 2004
Posts: 849
|
Quote:
I think Sh, commercialized by RapidMind (same developers), is the cross-platform, CPU- and GPU-independent version of this (it even supports the PS3's Cell + GPU combo). Last I checked, RapidMind still offered their stuff free for non-commercial use, so maybe it would be a better code investment than CUDA? Especially since the Sh devs have been at it longer than Nvidia.
__________________
Geforce GTX 260 Windows 7, 64bit, Core i7 MPC-HC, Foobar2000 Last edited by lexor; 7th May 2008 at 17:41. |
|
7th May 2008, 21:28 | #14 | Link |
Registered User
Join Date: Mar 2002
Posts: 1,075
|
RapidMind is higher level ... with abstraction comes loss of performance, more severe on these kinds of architectures than on desktop processors, which bristle with technology to make bad code run fast. I don't think you can freely download it anymore.
|
7th May 2008, 21:48 | #15 | Link |
Registered User
Join Date: Dec 2005
Posts: 133
|
I was curious, so I read through a CUDA introduction tutorial they have on their website. It's well written and easy to understand even for programming noobs like me.
I'd really like to see this technology used in x264. Apparently they use the GPU as a giant SIMD co-processor, so it should be very useful for video, since you often have to execute the same algorithm on a huge number of pixels and/or macroblocks (especially in HD) simultaneously. But I don't know much about the x264 code, so I can't really say which parts of the encoding process would benefit the most from the GPU.
Using the CUDA API (C/C++) seems simple, but to get the computations done "10/20/30/40/50/100+ times faster" (which is what they claim), you need massively parallelizable code without divergent branching, and you then have to avoid bottlenecks caused by memory latency and PCI-express bandwidth. |
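To put the PCI-express point in rough numbers, here is a minimal sketch of the arithmetic. It assumes 8-bit YUV 4:2:0 frames and that every frame crosses the bus exactly once; the function names and the bandwidth parameter are hypothetical, made up for illustration, and no real bus figure is implied.

```cpp
#include <cassert>
#include <cstdint>

// Bytes in one 8-bit YUV 4:2:0 frame: a full-resolution luma plane plus
// two chroma planes subsampled by 2 in each direction.
// Assumes even width and height.
std::uint64_t frame_bytes_420(std::uint64_t width, std::uint64_t height) {
    assert(width % 2 == 0 && height % 2 == 0);
    return width * height + 2 * (width / 2) * (height / 2);
}

// Upper bound on frames per second when each frame crosses the bus once.
// bus_bytes_per_sec is a hypothetical, caller-supplied figure (measure your
// own system); it is not a property of any particular card.
double max_fps_over_bus(std::uint64_t width, std::uint64_t height,
                        double bus_bytes_per_sec) {
    assert(bus_bytes_per_sec > 0.0);
    return bus_bytes_per_sec /
           static_cast<double>(frame_bytes_420(width, height));
}
```

A 1920x1080 frame comes to 3,110,400 bytes under these assumptions. Calling `max_fps_over_bus(1920, 1080, rate)` with whatever sustained transfer rate you actually measure gives the ceiling that upload alone imposes, before any computation happens.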
7th May 2008, 22:29 | #16 | Link |
Registered User
Join Date: Mar 2002
Posts: 1,075
|
Highly irregular branching patterns (skip modes) and bit manipulation (quantization/entropy coding) don't suit present GPUs. IMO the only really good application at the moment is full-search ME; in the end, though, accelerated full search is still slow even if it's faster than on the CPU. BTW, because it has to do everything at single-precision floating point, it won't usually have 10x+ the performance of modern processors when you can use 8 or 16 bit on the CPU. In pure arithmetic at the usual precisions in x264 it probably ties with a quad core ... it does have a lot more bandwidth though.
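For readers unfamiliar with full-search ME, a minimal CPU reference sketch follows. It is plain C++, not CUDA and not x264's code; the names (`MotionVector`, `block_sad`, `full_search`) and the single-plane frame layout are assumptions made for illustration. Every candidate offset is scored independently with the same loop and no data-dependent branching, which is the property that makes this search a natural fit for a wide parallel machine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <vector>

// Best full-pel offset found and its cost (hypothetical type for this sketch).
struct MotionVector { int dx; int dy; unsigned sad; };

// Sum of absolute differences between a block in `cur` at (cx, cy)
// and a block in `ref` at (rx, ry). Both frames share the same stride.
unsigned block_sad(const std::vector<std::uint8_t>& cur,
                   const std::vector<std::uint8_t>& ref,
                   int stride, int cx, int cy, int rx, int ry, int block_size) {
    unsigned sad = 0;
    for (int y = 0; y < block_size; ++y) {
        for (int x = 0; x < block_size; ++x) {
            int a = cur[static_cast<std::size_t>((cy + y) * stride + (cx + x))];
            int b = ref[static_cast<std::size_t>((ry + y) * stride + (rx + x))];
            sad += static_cast<unsigned>(std::abs(a - b));
        }
    }
    return sad;
}

// Exhaustive search over every full-pel offset in [-range, range]^2.
// Candidates that would read outside the reference frame are skipped.
MotionVector full_search(const std::vector<std::uint8_t>& cur,
                         const std::vector<std::uint8_t>& ref,
                         int width, int height,
                         int bx, int by, int block_size, int range) {
    assert(cur.size() == ref.size());
    assert(cur.size() == static_cast<std::size_t>(width) * static_cast<std::size_t>(height));
    assert(bx >= 0 && by >= 0 && bx + block_size <= width && by + block_size <= height);
    MotionVector best{0, 0, std::numeric_limits<unsigned>::max()};
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            int rx = bx + dx, ry = by + dy;
            if (rx < 0 || ry < 0 || rx + block_size > width || ry + block_size > height)
                continue;
            unsigned sad = block_sad(cur, ref, width, bx, by, rx, ry, block_size);
            if (sad < best.sad) best = {dx, dy, sad};
        }
    }
    return best;
}
```

The cost is (2*range + 1)^2 SAD evaluations per block, which is why full search stays expensive even when accelerated.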
Last edited by MfA; 7th May 2008 at 22:32. |
7th May 2008, 22:33 | #17 | Link | |
x264 developer
Join Date: Sep 2005
Posts: 8,666
|
Quote:
x264 CUDA will implement a fullpel and subpel ME algorithm initially; later on we could do something like RDO with a bit-cost approximation instead of CABAC.
Wrong, CUDA supports integer math. |
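As an illustration of what a bit-cost approximation can look like, here is a minimal sketch that charges each motion vector component the length of its signed Exp-Golomb code, the code H.264 uses for motion vector differences when CABAC is off. The post does not say which approximation the x264 CUDA work would use, so treat this as one assumed option; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Length in bits of the unsigned Exp-Golomb code ue(k):
// 2 * floor(log2(k + 1)) + 1.
unsigned ue_bits(std::uint32_t k) {
    assert(k < 0xFFFFFFFFu);  // k + 1 must not overflow
    unsigned log2 = 0;
    for (std::uint32_t v = k + 1; v > 1; v >>= 1) ++log2;
    return 2 * log2 + 1;
}

// Length in bits of the signed Exp-Golomb code se(v).
// Mapping to a code number: v > 0 -> 2v - 1, v <= 0 -> -2v.
unsigned se_bits(std::int32_t v) {
    assert(v > -(1 << 30) && v < (1 << 30));
    std::uint32_t mag = v < 0 ? static_cast<std::uint32_t>(-v)
                              : static_cast<std::uint32_t>(v);
    std::uint32_t k = v > 0 ? 2 * mag - 1 : 2 * mag;
    return ue_bits(k);
}

// Approximate cost in bits of coding a motion vector difference (dx, dy).
unsigned mv_bits(int dx, int dy) {
    return se_bits(dx) + se_bits(dy);
}
```

Because the cost depends only on the value being coded and not on adaptive context state, it can be evaluated independently per candidate (or precomputed into a table), unlike a true CABAC bit count.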
|
7th May 2008, 22:40 | #20 | Link |
Registered User
Join Date: Mar 2002
Posts: 1,075
|
Seriously? Hmm, didn't notice that before ... I stand corrected.
PS. Are you quite certain it can operate on uchar4 at 4 times the throughput of fp? (I don't have a G80, unfortunately.) I would have thought they would have made a bigger deal out of that if they could do it.
PPS. I think I worded it poorly the first time. I meant they can't do SIMD inside a "thread" (operations on vectors are simply iterated over the components); everything from 8-bit to 24-bit operations runs on the same arithmetic units and at the same throughput, whereas a CPU gets a significant increase in arithmetic throughput when using lower precision.
Last edited by MfA; 7th May 2008 at 23:44. |
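The distinction in the PPS can be sketched in plain C++. The first function iterates over the four 8-bit components one at a time, which is how the post describes a uchar4 being handled inside a thread. The second does all four lanes with a few operations on one 32-bit word, standing in for what CPU SIMD gains from lower precision. Real CPU code would use vector instructions rather than this bit trick; both function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Add four packed 8-bit lanes one component at a time (wraps modulo 256).
// Four separate additions for four bytes.
std::uint32_t add_bytes_per_component(std::uint32_t a, std::uint32_t b) {
    std::uint32_t result = 0;
    for (int lane = 0; lane < 4; ++lane) {
        std::uint32_t x = (a >> (8 * lane)) & 0xFFu;
        std::uint32_t y = (b >> (8 * lane)) & 0xFFu;
        result |= ((x + y) & 0xFFu) << (8 * lane);
    }
    return result;
}

// Add the same four lanes with a single 32-bit addition.
// The low 7 bits of each lane are added normally; the top bit of each lane
// is then patched in with XOR so no carry leaks into the neighbouring lane.
std::uint32_t add_bytes_packed(std::uint32_t a, std::uint32_t b) {
    const std::uint32_t low7 = 0x7F7F7F7Fu;
    const std::uint32_t top = 0x80808080u;
    return ((a & low7) + (b & low7)) ^ ((a ^ b) & top);
}
```

Both functions return the same result for any inputs; the difference is the number of arithmetic operations spent per byte, which is the throughput gap being described.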
|
|