Old 4th May 2008, 18:56   #1  |  Link
Sagittaire
Testeur de codecs
 
 
Join Date: May 2003
Location: France
Posts: 2,484
Cuda Challenge for x264 ... ?

http://www.nvidia.fr/object/cuda_con...il2008_fr.html
http://www.nvidia.fr/content/EMEAI/C..._terms_fr.html
__________________
Le Sagittaire ... ;-)

1- Ateme AVC or x264
2- VP7 or RV10 only for anime
3- XviD, DivX or WMV9
Old 4th May 2008, 20:07   #2  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,666
Avail Media is already working on CUDA for x264
Old 4th May 2008, 21:58   #3  |  Link
Yoshiyuki Blade
Novice x264 User
 
 
Join Date: Dec 2006
Location: California
Posts: 169
I can't wait to see how that turns out. I hope my aging 8800GTX will have even more lasting value than it has already given me.
Old 4th May 2008, 22:49   #4  |  Link
lexor
Registered User
 
Join Date: Jan 2004
Posts: 849
Quote:
    Originally Posted by Yoshiyuki Blade
    I hope my aging 8800GTX

I hate you so much right now...

/me looks at his 6600gt with its poor crackling fan and sighs
__________________
Geforce GTX 260
Windows 7, 64bit, Core i7
MPC-HC, Foobar2000
Old 6th May 2008, 07:00   #5  |  Link
Shinigami-Sama
Solaris: burnt by the Sun
 
 
Join Date: Oct 2004
Location: /etc/default/moo
Posts: 1,923
Quote:
    Originally Posted by lexor
    I hate you so much right now...

    /me looks at his 6600gt with its poor crackling fan and sighs

I got all of you beat!
5200fx
first gen dx9!

It will be nice to see this nice looking stuff make it into x264.
It's definitely growing fast these days; kinda sad big commercial companies can't even compete still.
__________________
Quote:
    Originally Posted by benjust
    interlacing and telecining should have been but a memory long ago.. unfortunately still just another bizarre weapon in the industry's war on image quality.
Old 6th May 2008, 09:47   #6  |  Link
Inventive Software
Turkey Machine
 
Join Date: Jan 2005
Location: Lowestoft, UK (but visit lots of places with bribes [beer])
Posts: 1,953
Yeah, Shinigami, except it cheated with DX9 and fell back to DX8.1 when it felt like it!
__________________
On Discworld it is clearly recognized that million-to-one chances happen 9 times out of 10. If the hero did not overcome huge odds, what would be the point? Terry Pratchett - The Science Of Discworld
Old 6th May 2008, 14:04   #7  |  Link
DarkZell666
aka XaS
 
 
Join Date: Jun 2005
Location: France
Posts: 1,122
Too bad, I switched from a Geforce 8600 GTS to a Radeon HD 3870 and I'm regretting choosing ATI, not only because of its poor drivers and performance, but now even more so because of CUDA *Looks at his HD 3870 with an evil grin mwahahahaha*
__________________

Q9300 OC @ 3.2ghz / Asus P5E3 / 4GB PC10600 / Geforce 8600 GTS
Old 6th May 2008, 15:33   #8  |  Link
lucassp
Registered User
 
Join Date: Jan 2007
Location: Romania, Timisoara
Posts: 223
I think an ATi fanboy will make the same thing using CTM...it's just a matter of time.
Old 7th May 2008, 02:40   #9  |  Link
Sulik
Registered User
 
Join Date: Jan 2002
Location: San Jose, CA
Posts: 216
IIRC, CTM has been discontinued and is no longer officially supported.
Old 7th May 2008, 03:17   #10  |  Link
bob0r
Pain and suffering
 
 
Join Date: Jul 2002
Posts: 1,337
What is CUDA? I can't read French!
(And don't say Google/Wikipedia, doom9 is my bible)
Old 7th May 2008, 03:24   #11  |  Link
lexor
Registered User
 
Join Date: Jan 2004
Posts: 849
Quote:
    Originally Posted by bob0r
    What is CUDA? I can't read French!
    (And don't say Google/Wikipedia, doom9 is my bible)

http://www.nvidia.com/object/cuda_learn.html
Old 7th May 2008, 05:07   #12  |  Link
Shinigami-Sama
Solaris: burnt by the Sun
 
 
Join Date: Oct 2004
Location: /etc/default/moo
Posts: 1,923
Quote:
    Originally Posted by lexor
    http://www.nvidia.com/object/cuda_learn.html

So basically C/C++ CPU-GPU interaction/offloading for parallel computing, no?
Old 7th May 2008, 17:33   #13  |  Link
lexor
Registered User
 
Join Date: Jan 2004
Posts: 849
Quote:
    Originally Posted by Shinigami-Sama
    So basically C/C++ CPU-GPU interaction/offloading for parallel computing, no?

Yeah, from the example code they have on their site it seems they just provide an abstraction layer to programmers that unifies CPU and GPU. So you don't need to learn OpenGL or Direct3D or worry about switching between CPU and GPU targets for your code; you just write what appears to be normal C++ with some new coding conventions (function and variable names), and the compiler does the rest. And if it does it well enough, you get profit.

I think Sh, commercialized by RapidMind (same developers), is the cross-platform, CPU- and GPU-independent version of this (it even supports the PS3's Cell + GPU combo). Last I checked, RapidMind still offered their stuff for free for non-commercial use, so maybe it will be a better code investment than CUDA? Especially since the Sh devs have been at it longer than Nvidia.

Last edited by lexor; 7th May 2008 at 17:41.
Old 7th May 2008, 21:28   #14  |  Link
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
RapidMind is higher level ... with abstraction comes loss of performance, more severe with these kinds of architectures than on a desktop processor, which bristles with technology to make bad code run fast. I don't think you can freely download it anymore.
Old 7th May 2008, 21:48   #15  |  Link
BlackSharkfr
Registered User
 
Join Date: Dec 2005
Posts: 133
I was curious, so I read through a CUDA introduction tutorial they have on their website. It's well written and easy to understand even for programming noobs like me.
I'd really like to see this technology being used in x264.

Apparently they use the GPU as a giant SIMD co-processor, so it should be very useful for video, since you often have to execute the same algorithm on a huge number of pixels and/or macroblocks (especially in HD) simultaneously. I don't know much about the x264 code, though, so I can't really say which parts of the encoding process would benefit the most from the GPU.
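
To picture "the same algorithm on a huge number of macroblocks", here is a minimal plain C++ sketch (CPU only, not CUDA and not x264 code). It assumes 8-bit luma planes whose width and height are multiples of 16; `block_sad16` and `sad_map` are hypothetical names made up for this example. The point is only that every macroblock's result is independent of every other one, which is the shape of work a GPU wants.

```cpp
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper: sum of absolute differences (SAD) between the
// co-located 16x16 blocks of two 8-bit luma planes of equal size.
// 'stride' is the plane width in pixels; (bx, by) is the block index.
int block_sad16(const std::vector<uint8_t>& cur,
                const std::vector<uint8_t>& ref,
                int stride, int bx, int by) {
    int sad = 0;
    for (int y = 0; y < 16; ++y) {
        for (int x = 0; x < 16; ++x) {
            const int i = (by * 16 + y) * stride + (bx * 16 + x);
            sad += std::abs(int(cur[i]) - int(ref[i]));
        }
    }
    return sad;
}

// One result per macroblock. No iteration reads another iteration's
// output, so on a GPU each (bx, by) pair could be handed to its own
// thread or group of threads; here it is just a serial loop.
std::vector<int> sad_map(const std::vector<uint8_t>& cur,
                         const std::vector<uint8_t>& ref,
                         int width, int height) {
    const int mbw = width / 16, mbh = height / 16;
    std::vector<int> out(mbw * mbh);
    for (int by = 0; by < mbh; ++by)
        for (int bx = 0; bx < mbw; ++bx)
            out[by * mbw + bx] = block_sad16(cur, ref, width, bx, by);
    return out;
}
```

Calling `sad_map(cur, ref, width, height)` on two frames gives one zero-motion cost per macroblock, in raster order.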

Using the CUDA API (C/C++) seems simple, but in order to get the computations done "10/20/30/40/50/100+ times faster" (which is what they claim), you need massively parallelizable code without divergent branching, and you then have to avoid bottlenecks caused by memory latency and PCI-Express bandwidth.
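
A rough way to see why divergent branching hurts: the toy C++ cost model below is an assumption for illustration only, not a description of how any particular NVIDIA part schedules work. Lanes run in lockstep groups of `width`; a group whose lanes disagree on a branch has to run both sides. `path_executions` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model: lanes are grouped 'width' at a time (width must be > 0)
// and each group runs in lockstep. A group whose lanes all agree on
// the branch executes one path; a group whose lanes disagree executes
// both paths, with the inactive lanes masked off each time.
// Returns the total number of path executions over all groups.
int path_executions(const std::vector<bool>& takes_branch, std::size_t width) {
    int total = 0;
    for (std::size_t start = 0; start < takes_branch.size(); start += width) {
        const std::size_t end = std::min(start + width, takes_branch.size());
        bool any_taken = false, any_skipped = false;
        for (std::size_t i = start; i < end; ++i) {
            if (takes_branch[i]) any_taken = true;
            else any_skipped = true;
        }
        total += (any_taken && any_skipped) ? 2 : 1;
    }
    return total;
}
```

In this model, 16 lanes that all take the branch cost 2 path executions at width 8, while 16 lanes that alternate cost 4. Arranging the data so that neighbouring lanes agree brings the cost back down.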
Old 7th May 2008, 22:29   #16  |  Link
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
Highly irregular branching patterns (skip modes) and bit manipulation (quantization/entropy coding) don't suit present GPUs. IMO the only really good application at the moment is full search ME algorithms; in the end, though, accelerated full search is still slow even if it's faster than on the CPU. Because it has to do everything at single precision floating point, it won't usually have 10x+ the performance of modern processors when you can use 8 or 16 bit on the CPU, BTW. In pure arithmetic at the usual precisions in x264 it probably ties with a quad core ... it does have a lot more bandwidth though.
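
For reference, full search ME means scoring every candidate offset in a window. The sketch below is plain C++ on the CPU, assuming 8-bit luma planes and integer-pel offsets only; `MotionVector`, `sad16` and `full_search` are hypothetical names, and none of this is taken from x264. Every candidate is scored the same way with no early exit, which is why it maps well to wide parallel hardware and also why it does so much work.

```cpp
#include <climits>
#include <cstdint>
#include <cstdlib>
#include <vector>

struct MotionVector { int dx, dy, sad; };

// SAD between the 16x16 block at (cx, cy) in 'cur' and the 16x16 block
// at (rx, ry) in 'ref'. Both planes are 8-bit luma with the same stride.
static int sad16(const std::vector<uint8_t>& cur,
                 const std::vector<uint8_t>& ref,
                 int stride, int cx, int cy, int rx, int ry) {
    int sad = 0;
    for (int y = 0; y < 16; ++y)
        for (int x = 0; x < 16; ++x)
            sad += std::abs(int(cur[(cy + y) * stride + cx + x]) -
                            int(ref[(ry + y) * stride + rx + x]));
    return sad;
}

// Exhaustive search: every integer offset within +/-range is scored.
// Candidates that would read outside the reference plane are skipped.
// The block at (cx, cy) itself is assumed to lie inside the plane.
MotionVector full_search(const std::vector<uint8_t>& cur,
                         const std::vector<uint8_t>& ref,
                         int width, int height,
                         int cx, int cy, int range) {
    MotionVector best{0, 0, INT_MAX};
    for (int dy = -range; dy <= range; ++dy) {
        for (int dx = -range; dx <= range; ++dx) {
            const int rx = cx + dx, ry = cy + dy;
            if (rx < 0 || ry < 0 || rx + 16 > width || ry + 16 > height)
                continue;
            const int s = sad16(cur, ref, width, cx, cy, rx, ry);
            if (s < best.sad) best = MotionVector{dx, dy, s};
        }
    }
    return best;
}
```

With `range = 16` that is 33 x 33 = 1089 candidate blocks of 256 pixels each for a single macroblock.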

Last edited by MfA; 7th May 2008 at 22:32.
Old 7th May 2008, 22:33   #17  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,666
Quote:
    Originally Posted by MfA
    Highly irregular branching patterns (skip modes) and bit manipulation (quantization/entropy coding) don't suit present GPUs. IMO the only really good application at the moment is full search ME algorithms; in the end, though, accelerated full search is still slow even if it's faster than on the CPU.

Actually, basically everything can be reasonably done on the GPU except CABAC (which could be done, it just couldn't be parallelized).

x264 CUDA will implement a fullpel and subpel ME algorithm initially; later on we could do something like RDO with a bit-cost approximation instead of CABAC.
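
The post doesn't say which bit-cost approximation is meant. One common stand-in, shown here purely as an assumption, is to charge each motion vector difference the length of its signed Exp-Golomb code (the way H.264 codes them when CABAC is off) and add that to the distortion with a Lagrange multiplier. `se_golomb_bits`, `mv_bits`, `approx_cost` and `lambda` are hypothetical names for this sketch.

```cpp
#include <cstdint>

// Length in bits of the signed Exp-Golomb code se(v):
// v is mapped to k = 2v - 1 for v > 0, or k = -2v otherwise, and the
// code for k takes 2 * floor(log2(k + 1)) + 1 bits.
int se_golomb_bits(int v) {
    const uint64_t k = v > 0 ? 2ull * uint64_t(v) - 1ull
                             : 2ull * uint64_t(-int64_t(v));
    int log2 = 0;
    for (uint64_t t = k + 1; t > 1; t >>= 1) ++log2;
    return 2 * log2 + 1;
}

// Estimated bits for a motion vector, coded as a difference from a
// predicted vector (pred_x, pred_y).
int mv_bits(int mv_x, int mv_y, int pred_x, int pred_y) {
    return se_golomb_bits(mv_x - pred_x) + se_golomb_bits(mv_y - pred_y);
}

// Approximate rate-distortion cost: distortion plus lambda times the
// estimated bits. No arithmetic coder state is involved, so candidates
// can be costed independently of each other.
int approx_cost(int distortion, int lambda, int bits) {
    return distortion + lambda * bits;
}
```

Because the estimate depends only on the candidate itself, it avoids the serial dependency that makes CABAC hard to parallelize, at the price of being less accurate than the real coder.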
Quote:
    Originally Posted by MfA
    Because it has to do everything at single precision floating point

Wrong, CUDA supports integer math.
Old 7th May 2008, 22:34   #18  |  Link
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
It supports integer math, but it doesn't support SIMD ... it's basically just not doing the renormalization.
Old 7th May 2008, 22:36   #19  |  Link
Dark Shikari
x264 developer
 
 
Join Date: Sep 2005
Posts: 8,666
Quote:
    Originally Posted by MfA
    It supports integer math, but it doesn't support SIMD ... it's basically just not doing the renormalization.

Yes, it does support SIMD. An 8800GTX, for example, has 16 multiprocessors, each of which can perform 8 of the same arithmetic operation at the same time.
Old 7th May 2008, 22:40   #20  |  Link
MfA
Registered User
 
Join Date: Mar 2002
Posts: 1,075
Seriously? Hmm, didn't notice that before ... I stand corrected.

PS. Are you quite certain it can operate on uchar4 at 4 times the throughput of fp? (I don't have a G80, unfortunately.) I would have thought they would have made a bigger deal out of that if they could do it.

PPS. I think I worded it poorly the first time. I meant they can't do SIMD inside a "thread" (operations on vectors are simply iterated over the components); everything from 8-bit to 24-bit operations runs on the same arithmetic units and at the same throughput, whereas a CPU gets a significant increase in throughput of arithmetic ops when using lower precision.
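
The difference between iterating over the components and operating on them all at once can be shown in plain C++. This is a sketch of the idea only: real CPU SIMD uses wider registers and dedicated instructions, and the "SIMD within a register" trick below merely stands in for them. `add_u8x4_scalar` and `add_u8x4_packed` are hypothetical names.

```cpp
#include <cstdint>

// Component-by-component: four 8-bit lanes packed in a 32-bit word are
// unpacked, added one at a time (wrapping modulo 256) and repacked.
uint32_t add_u8x4_scalar(uint32_t a, uint32_t b) {
    uint32_t out = 0;
    for (int lane = 0; lane < 4; ++lane) {
        const uint32_t x = (a >> (8 * lane)) & 0xFFu;
        const uint32_t y = (b >> (8 * lane)) & 0xFFu;
        out |= ((x + y) & 0xFFu) << (8 * lane);
    }
    return out;
}

// All four lanes at once: add the low 7 bits of every lane (a carry
// cannot leave its lane), then fix up each lane's top bit with XOR.
uint32_t add_u8x4_packed(uint32_t a, uint32_t b) {
    const uint32_t low7 = (a & 0x7F7F7F7Fu) + (b & 0x7F7F7F7Fu);
    return low7 ^ ((a ^ b) & 0x80808080u);
}
```

Both functions return the same word for any input, but the packed version does the four additions with a handful of 32-bit operations. That is the kind of gain a CPU gets from narrower data types.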

Last edited by MfA; 7th May 2008 at 23:44.