Old 23rd June 2005, 23:05   #1  |  Link
ArcticFox
Registered User
 
Join Date: Mar 2002
Posts: 190
Hardware Accelerated Encoding

Are there any hardware accelerators for H.264 that aren't professional-level equipment and don't cost thousands of pounds/dollars?
Old 23rd June 2005, 23:24   #2  |  Link
acidsex
Registered User
 
Join Date: Dec 2002
Posts: 492
Quote:
Originally Posted by ArcticFox
Are there any hardware accelerators for H.264 that aren't professional-level equipment and don't cost thousands of pounds/dollars?
None that I know of but I hope Nero tries to incorporate Gridiron Xfactor for network encoding. I use this for rendering my After Effects comps and it is sweet. Last I heard, M$ was working with them to develop network encoding of WM9 HD files so it should be possible to do H.264 AVC the same way.
Old 24th June 2005, 09:13   #3  |  Link
Mutant_Fruit
Registered User
 
Join Date: Apr 2004
Posts: 287
GPU accelerated encoding would be nice, but it doesn't look like it can/will be done. Having the power of a 7800GT behind an encode would help quite a bit.
Old 24th June 2005, 09:27   #4  |  Link
Blue_MiSfit
Derek Prestegard IRL
 
Join Date: Nov 2003
Location: Los Angeles
Posts: 5,988
Actually, in theory the 7800 GTX could be programmed very effectively to help with video encoding. Programmable shaders can be used for encoding-related tasks; for example fft3dgpu, an AviSynth plugin, uses DirectX 9.0c shaders to "render" a powerful denoiser at high speed. It might not end up being all that useful for something like H.264, though. I really don't know.

-MiSfit
__________________
These are all my personal statements, not those of my employer :)
Old 24th June 2005, 09:37   #5  |  Link
bill_baroud
Registered User
 
Join Date: Feb 2002
Posts: 407
The problem with general-purpose GPU programming is that the download speed from the card to main memory over AGP (and even PCI Express now) is hella slow. Don't expect more than 30 or 40Mb/s... I'll let you do the math to work out the maximum framerate you can achieve with this sort of bandwidth.

Btw, even with an FX 5900 the processing power is interesting; it's just that NVIDIA and ATI don't want you to use the GPU for something they didn't intend (what, you wouldn't pay a lot of $$ for their crap PureVideo system!).
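
For anyone who wants to actually do that math, here is a minimal sketch. It assumes every frame is read back uncompressed, and the 720x576 frame size and YV12 (12 bits per pixel) format are illustrative assumptions, not figures from this thread. Since "40Mb/s" can be read as megabytes or megabits, both readings are computed.

```python
# Upper bound on frames per second that a given GPU-to-host readback
# bandwidth can sustain, assuming every frame is read back uncompressed.
# Assumptions (not from the thread): 720x576 frames, YV12 at 12 bits/pixel.

def max_readback_fps(bandwidth_bytes_per_s, width=720, height=576, bits_per_pixel=12):
    """Return the frame rate at which readback alone saturates the bus."""
    frame_bytes = width * height * bits_per_pixel // 8
    return bandwidth_bytes_per_s / frame_bytes

# The "40Mb/s" figure is ambiguous, so try both readings.
as_megabytes = max_readback_fps(40_000_000)      # 40 megabytes per second
as_megabits = max_readback_fps(40_000_000 / 8)   # 40 megabits per second

print(f"40 MB/s   -> {as_megabytes:.1f} fps")
print(f"40 Mbit/s -> {as_megabits:.1f} fps")
```

Under these assumptions the megabyte reading allows roughly 64 fps and the megabit reading roughly 8 fps, so which unit is meant matters a lot.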
Old 24th June 2005, 10:43   #6  |  Link
bond
Registered User
 
Join Date: Nov 2001
Posts: 9,770
Excuse my newbieness, but can't x264's existing ability to do multithreaded encoding on dual-core CPUs be called "hardware accelerated encoding"?
__________________
Between the weak and the strong one it is the freedom which oppresses and the law that liberates (Jean Jacques Rousseau)
I know, that I know nothing (Socrates)

MPEG-4 ASP FAQ | AVC/H.264 FAQ | AAC FAQ | MP4 FAQ | MP4Menu stores DVD Menus in MP4 (guide)
Ogg Theora | Ogg Vorbis
use WM9 today and get Micro$oft controlling the A/V market tomorrow for free
Old 24th June 2005, 11:15   #7  |  Link
Egladil
Registered User
 
Join Date: May 2004
Posts: 68
Quote:
Originally Posted by bill_baroud
The problem with general-purpose GPU programming is that the download speed from the card to main memory over AGP (and even PCI Express now) is hella slow. Don't expect more than 30 or 40Mb/s... I'll let you do the math to work out the maximum framerate you can achieve with this sort of bandwidth.

Btw, even with an FX 5900 the processing power is interesting; it's just that NVIDIA and ATI don't want you to use the GPU for something they didn't intend (what, you wouldn't pay a lot of $$ for their crap PureVideo system!).
This used to be true, but it is changing now. Even with AGP, three years ago, I was able to get 200 MB/s; with PCIe today, some people report getting up to 1500 MB/s.

Regards,
-Lev
Old 24th June 2005, 12:04   #8  |  Link
bill_baroud
Registered User
 
Join Date: Feb 2002
Posts: 407
Well, I was reading quite the opposite, about PCI Express at least... It can do 1.5GB/s, but that doesn't mean the GPU lets you do it. I found a link somewhere (about fast memory moves) where the guy was getting 16Mb/s (!) from his 6800.
Well, I would happily take more information on how to achieve the speeds you're talking about.

(IIRC, the 40/50Mb/s value was what tsp got with his fft3dgpu AviSynth plugin, at least at the beginning, over AGP. Likewise, I'm interested in any information on how you got 200Mb/s.)
Old 24th June 2005, 12:26   #9  |  Link
Egladil
Registered User
 
Join Date: May 2004
Posts: 68
Quote:
Originally Posted by bill_baroud
Well, I was reading quite the opposite, about PCI Express at least... It can do 1.5GB/s, but that doesn't mean the GPU lets you do it. I found a link somewhere (about fast memory moves) where the guy was getting 16Mb/s (!) from his 6800.
Well, I would happily take more information on how to achieve the speeds you're talking about.

(IIRC, the 40/50Mb/s value was what tsp got with his fft3dgpu AviSynth plugin, at least at the beginning, over AGP. Likewise, I'm interested in any information on how you got 200Mb/s.)
I was browsing the GPGPU.org forums and saw those values. The 200 MB/s I got was using OpenGL; I was doing some simple arithmetic (hardware couldn't do complex shaders in those days), and already at 200 MB/s it proved useful (in my case; this is not to be generalized). ATI cards seem far inferior to NVIDIA when it comes to readback performance with OpenGL: there you won't get more than ~200 MB/s, even with a PCIe card. I don't know what the situation is with D3D today. NVIDIA reaches 600-800 MB/s quite easily AFAIK; higher speeds can only be reached under optimal conditions.

The situation could be much better, since the numbers are still far from the theoretical peak (4 GB/s), but I have the feeling that things are improving.

Regards,
-Lev
Old 24th June 2005, 12:29   #10  |  Link
bill_baroud
Registered User
 
Join Date: Feb 2002
Posts: 407
Well, 500Mb/s is quite enough for what you can do on the GPU (motion estimation? transform?).
Old 24th June 2005, 18:38   #11  |  Link
Mutant_Fruit
Registered User
 
Join Date: Apr 2004
Posts: 287
Quote:
Originally Posted by bond
Excuse my newbieness, but can't x264's existing ability to do multithreaded encoding on dual-core CPUs be called "hardware accelerated encoding"?
Erm... no. That's like saying that since x264 can run on a Pentium 4, it's hardware accelerated.

We're talking about a piece of dedicated circuitry built specifically for encoding H.264 (or possibly other codecs), or about using a GPU to help accelerate the encoding process.
Old 24th June 2005, 23:33   #12  |  Link
708145
Professional Lemming
 
Join Date: Dec 2003
Location: Stuttgart, Germany
Posts: 359
From what I know, using GPU shaders for ME and transforms should be possible, but a big problem is that the GPU drivers "optimize" the shaders you send them.
The result can range from a speed improvement, through inaccuracies, to utter nonsense!

If anyone knows a way to circumvent this optimization, please do tell.
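
To illustrate why ME looks like a natural fit for shaders, here is a minimal CPU-side sketch of full-search block matching using the sum of absolute differences (SAD). It is not x264's search and not shader code; the function names, block size and search range are hypothetical. The point is only that every candidate offset is scored independently, which is exactly the kind of work that can run in parallel.

```python
# Full-search block matching with SAD, written with plain Python lists.
# Each (dx, dy) candidate is scored independently of all the others.

def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between an n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total

def full_search(cur, ref, bx, by, n, search_range):
    """Return (cost, dx, dy) for the best match within +/- search_range."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that would read outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + n > height or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best

# Tiny synthetic test: `cur` is `ref` shifted by 2 pixels in x and 1 in y.
ref = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
        for x in range(8)] for y in range(8)]
result = full_search(cur, ref, bx=2, by=2, n=4, search_range=2)
print(result)
```

On this synthetic input the search recovers the shift exactly, with a SAD of zero at offset (2, 1).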

bis besser,
Tobias
__________________
projects page: ELDER, SmoothD, etc.
Old 25th June 2005, 06:09   #13  |  Link
Mug Funky
interlace this!
 
Join Date: Jun 2003
Location: i'm in ur transfers, addin noise
Posts: 4,555
aren't drivers something that can be fixed with a download? there's plenty of haxx0red drivers out there, but i'm not sure if shader optimisation is something that can be hacked (and if it can, what speeds would you end up with?).

it'd be really really really really good to see GPU/whatever accelerated encoding though. AFAIK some cards already "assist" in this kind of stuff, but that's mostly for decoding i think. it'd be good to be able to encode in faster than realtime (think of doing 2-pass in the time it takes to play out!).

well, there's some clever people out there. i wish i had the maths and nunchuk skills to do it myself.
__________________
sucking the life out of your videos since 2004
Old 27th June 2005, 09:27   #14  |  Link
Inventive Software
Turkey Machine
 
Join Date: Jan 2005
Location: Lowestoft, UK (but visit lots of places with bribes [beer])
Posts: 1,953
OK, we know that the download speed is far from ideal, according to some posts. But this is fine, because we only need something that will do about 4 times the bit-rate that you encode with. With x264 and many other codecs, that bit-rate is about 10 Mbits/sec. If the max bandwidth is 40 or 50 Mbits/sec, according to some reports, that's gonna be enough.

I can see a really big problem with this, though. Until GPU speeds start coming anywhere near CPU speeds, the GPU just physically cannot keep up with the pace of the CPU. As a result, the CPU encoding speed is throttled significantly to allow the GPU to keep up. So unless GPU speeds are going to magically hike to 3 GHz in the next 6 months, I rather think that we're out of options when it comes to encoding with the GPU.
__________________
On Discworld it is clearly recognized that million-to-one chances happen 9 times out of 10. If the hero did not overcome huge odds, what would be the point? Terry Pratchett - The Science Of Discworld
Old 27th June 2005, 10:31   #15  |  Link
708145
Professional Lemming
 
Join Date: Dec 2003
Location: Stuttgart, Germany
Posts: 359
Quote:
Originally Posted by Inventive Software
OK, we know that the download speed is far from ideal, according to some posts. But this is fine, because we only need something that will do about 4 times the bit-rate that you encode with. With x264 and many other codecs, that bit-rate is about 10 Mbits/sec. If the max bandwidth is 40 or 50 Mbits/sec, according to some reports, that's gonna be enough.

I can see a really big problem with this, though. Until GPU speeds start coming anywhere near CPU speeds, the GPU just physically cannot keep up with the pace of the CPU. As a result, the CPU encoding speed is throttled significantly to allow the GPU to keep up. So unless GPU speeds are going to magically hike to 3 GHz in the next 6 months, I rather think that we're out of options when it comes to encoding with the GPU.
Huh? You lost me somewhere. Raw processing power on the GPU is already higher than that of a CPU (when comparing the instructions a GPU is optimized for).

Are you measuring speed or clock frequency here? They are NOT the same!

And a third point: your claim that a couple of Mbit/s is enough download bandwidth is only true if you do everything, including bitstream encapsulation, on the GPU. The current flexibility of the shaders is not up to this task. Therefore, for the time being, you have to download intermediate representations, which are more bandwidth hungry, and finish the encoding job on the CPU.
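
A rough sketch of how much more bandwidth hungry that can be, under loudly stated assumptions: 720x576 4:2:0 video at 25 fps, with the GPU handing back one 16-bit value per sample (for example transform coefficients). None of these figures come from a real encoder; only the 10 Mbit/s bitstream figure is taken from the quoted post.

```python
# Compare the readback rate of an intermediate representation with the
# rate of the final bitstream.
# Assumptions: 4:2:0 sampling (1.5 samples per pixel), one 16-bit value
# per sample, 720x576 at 25 fps. All of these are illustrative only.

def intermediate_mbit_per_s(width, height, fps, bytes_per_sample=2):
    """Readback rate in Mbit/s if every sample comes back from the GPU."""
    samples = width * height * 3 // 2   # luma plus subsampled chroma
    return samples * bytes_per_sample * 8 * fps / 1_000_000

final_bitstream = 10.0  # Mbit/s, the bit-rate quoted above
intermediate = intermediate_mbit_per_s(720, 576, 25)
ratio = intermediate / final_bitstream

print(f"intermediate data: {intermediate:.1f} Mbit/s")
print(f"final bitstream:   {final_bitstream:.1f} Mbit/s")
print(f"ratio:             {ratio:.1f}x")
```

With these numbers the intermediate data is about 249 Mbit/s, roughly 25 times the final bitstream, which is why a budget based on the output bit-rate falls short.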

bis besser,
Tobias
__________________
projects page: ELDER, SmoothD, etc.
Old 27th June 2005, 11:20   #16  |  Link
Inventive Software
Turkey Machine
 
Join Date: Jan 2005
Location: Lowestoft, UK (but visit lots of places with bribes [beer])
Posts: 1,953
Quote:
Huh? You lost me somewhere. Raw processing power on the GPU is already higher than that of a CPU (when comparing the instructions a GPU is optimized for).
Yeah, but it depends how you look at it. The clock frequencies of the GPU are limited to a stable 500 MHz. You need significantly more cooling to push the frequencies any higher, or you start getting instability problems.

The frequencies of GDDR3 can go as high as 1066 MHz, which is pretty fast. The RAMDACs are limited currently to about 400 MHz, which ain't bad.

I'm confused, because the Northbridge can run at about 1 GHz. I know it affects CPU speed, but does it affect the speed that the graphics card runs at? Or is that left to the PCI-E controller?
__________________
On Discworld it is clearly recognized that million-to-one chances happen 9 times out of 10. If the hero did not overcome huge odds, what would be the point? Terry Pratchett - The Science Of Discworld
Old 27th June 2005, 12:19   #17  |  Link
708145
Professional Lemming
 
Join Date: Dec 2003
Location: Stuttgart, Germany
Posts: 359
Quote:
Originally Posted by Inventive Software
Yeah, but it depends how you look at it. The clock frequencies of the GPU are limited to a stable 500 MHz. You need significantly more cooling to push the frequencies any higher, or you start getting instability problems.

The frequencies of GDDR3 can go as high as 1066 MHz, which is pretty fast. The RAMDACs are limited currently to about 400 MHz, which ain't bad.

I'm confused, because the Northbridge can run at about 1 GHz. I know it affects CPU speed, but does it affect the speed that the graphics card runs at? Or is that left to the PCI-E controller?
This is all a bit OT, but many people lack understanding of this topic, so I'll reply anyway.

Why are you so obsessed with clock frequencies?
The important thing is work done per time unit.

The parallelism of GPUs has increased dramatically in the last few years (24 parallel shaders in the 7800GTX).
If the algorithms used provide enough parallelism, it is more efficient to design a chip with a lower clock, since the overhead due to the pipeline registers is reduced. Less overhead means less power at any given compute speed.

What I'm trying to say is this:
compute speed = clock frequency * parallelism

If you compare two CPUs which have the same amount of parallelism (like P2 and K6(-2) almost had) then the one with higher clock is faster. But once the parallelism is significantly different you have to take it into account.
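
Plugging the thread's own numbers into that formula gives a feel for it. This is only a sketch: it counts one operation per shader per clock, and the CPU parallelism values (1 for scalar code, 4 for a hypothetical 4-wide SIMD loop) are assumptions for illustration, not measurements.

```python
# compute speed = clock frequency * parallelism
# GPU figures are the ones mentioned in this thread (about 500 MHz,
# 24 shaders). The CPU widths of 1 and 4 are illustrative assumptions.

def compute_speed(clock_hz, parallelism):
    """Operations per second under the simple model above."""
    return clock_hz * parallelism

gpu = compute_speed(500_000_000, 24)          # 24 shaders at 500 MHz
cpu_scalar = compute_speed(3_000_000_000, 1)  # 3 GHz, one op per clock
cpu_simd = compute_speed(3_000_000_000, 4)    # 3 GHz, 4-wide SIMD

print(f"GPU vs scalar CPU: {gpu / cpu_scalar:.1f}x")
print(f"GPU vs SIMD CPU:   {gpu / cpu_simd:.1f}x")
```

Even at one sixth of the clock, the GPU comes out 4x ahead of the scalar CPU in this model and level with the SIMD case, so clock frequency alone says very little.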

bis besser,
Tobias
__________________
projects page: ELDER, SmoothD, etc.
Old 27th June 2005, 13:46   #18  |  Link
Mutant_Fruit
Registered User
 
Join Date: Apr 2004
Posts: 287
Think about it... why do we need a graphics card: We need it to speed up graphics processing.

A CPU by itself cannot run a game at max res, max details with a good framerate (ignoring hardware directx issues). Yet a new GFX card can. Therefore it should be obvious that at graphics processing, an ATI/Nvidia card is much faster than a pentium/amd processor, even though they are clocked at a much much lower speed.
Old 27th June 2005, 15:12   #19  |  Link
bill_baroud
Registered User
 
Join Date: Feb 2002
Posts: 407
A subtle difference: they are faster when executing the kind of code they were designed for... Try to run an algorithm with a lot of branches, like collision detection, on a GPU (even though this is changing with the latest chips), and see...
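
A toy model of why branchy code hurts, assuming the hardware runs pixels in lockstep groups so that a group whose pixels disagree on a branch pays for both sides. The group size and the cycle costs are made up for illustration and do not describe any particular chip.

```python
# Toy cost model for branching on lockstep (SIMD-style) hardware.
# A group that takes only one side of an if/else pays for that side;
# a group whose lanes disagree pays for both sides.

def lockstep_cost(conditions, cost_then, cost_else, group_size):
    """Total cycles to evaluate an if/else over all lanes."""
    total = 0
    for start in range(0, len(conditions), group_size):
        group = conditions[start:start + group_size]
        if any(group):        # at least one lane takes the 'then' side
            total += cost_then
        if not all(group):    # at least one lane takes the 'else' side
            total += cost_else
    return total

coherent = [True] * 8          # every lane agrees
divergent = [True, False] * 4  # neighbouring lanes disagree

coherent_cost = lockstep_cost(coherent, 10, 10, group_size=4)
divergent_cost = lockstep_cost(divergent, 10, 10, group_size=4)
print(coherent_cost, divergent_cost)
```

In this model the divergent input costs twice as much as the coherent one for the same number of lanes, which is the penalty data-dependent branching tends to carry.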
Old 28th June 2005, 03:03   #20  |  Link
LiFe
PC Dom: Computer Support
 
Join Date: Nov 2003
Posts: 165
No one has got hardware accelerated decoding working on the MS platform yet (which is a heck of a lot simpler):

My last post there gives reference to what Apple is doing, which uses OpenGL run by the GPU to process filters on images/video in realtime for both playback and editing. Next step is hardware assisted encoding.