Digital Signal Processing on Raw Clock Speed vs SMT/Parallelism


nFury8
24th November 2004, 13:41
Seeing that x86 microprocessors are on the verge of a design shift away from MHz to multicore in pursuit of performance, and that DSP tasks like video and audio encoding have been shown to benefit more from raw clock speed, would there be a significant boost if these apps were written to run with SMT/parallelism? Or, another way of asking it: is the nature of video/audio encoding (with specific consideration for FFT/DCT algorithms) purely reliant on raw clock speed, disregarding for the moment platform-specific optimizations (i.e. HT, SSE)? Or would it benefit more from a parallel approach? I don't know jack about coding, so I'm not sure if the gist of these questions is valid from a programming point of view. Will the gurus please help shed some light on the essence of my queries? I believe you know what I'm trying to get at; it's just that I'm not sure I put the questions in a technically correct manner.

Mug Funky
24th November 2004, 17:24
i think it's the difference between building "up" and building "out".

today's fastest supercomputers are all (i think all of them are..) massively parallel. they're far cheaper, and tend to be more useful.

by their nature they can handle far more simultaneous tasks. take for example my local supercomputer, VPAC. it's just a cluster of quad 1GHz Alpha RISC boxes - four CPUs per box, 32 boxes, on a fast network.

the advantage of this is you can dial in to it (assuming you have an account) and run your program on any one (or several) of the nodes. it requires a fair bit of parallelization of your code (i don't quite get what this means - my brother's the guy that uses this rig), but it's a very efficient way to do things, and it's infinitely extensible.

the extreme example of this is those distributed processing things - you know, the ones where you download a little client proggy which churns away on spare CPU cycles. it started with SETI@home, i think.

a side-note: i never thought i'd ever see anybody yell at a supercomputer to go faster. hehe.

Wolfman
25th November 2004, 02:03
Supercomputing and SETI are not strictly the same thing; SETI uses grid/distributed computing rather than supercomputing. Video processing would appear (to me) to be a highly parallel-friendly process tho, software allowing.

"and it's infinitely extensible." = Highly scalable.. nothings infinitely extensible... not even my pedantry. :rolleyes:

nFury8
25th November 2004, 02:29
Originally posted by Wolfman
.. Video processing would appear (to me) to be a highly parallel friendly process tho, software allowing.
..

So is it valid to postulate that the clock-speed advantage only holds because of how video encoding apps are currently written? Meaning that if they were re-written for parallel processing and gained more performance, it would invalidate the notion that encoding is purely reliant on clock speed.

Mug Funky
27th November 2004, 06:34
i suppose so. look at 3d applications, and some 2d ones (after effects?). they'll do network-distributed rendering.

parallelism works best when there are multiple independent tasks that can be assigned to different CPUs and left to run from there. a render farm is simply a lot of machines running the rendering app, plus something to coordinate them - in a room with 30 computers, assign the first one frames 1, 31, 61, etc., the second frames 2, 32, 62, and so on.
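just to make that scheme concrete, a tiny python sketch (the node and frame counts are made up, and a real farm controller does this for you):

# toy sketch of the round-robin frame split described above:
# node 0 gets frames 1, 31, 61, ..., node 1 gets 2, 32, 62, ...
NUM_NODES = 30
TOTAL_FRAMES = 300

def assign_frames(total_frames, num_nodes):
    """node i gets every num_nodes-th frame, starting at frame i+1."""
    return {node: list(range(node + 1, total_frames + 1, num_nodes))
            for node in range(num_nodes)}

jobs = assign_frames(TOTAL_FRAMES, NUM_NODES)
print(jobs[0][:3])   # -> [1, 31, 61]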

of course, for filters that require knowledge of previous frames (temporal filters), this doesn't work so nicely.

Wolfman
28th November 2004, 13:18
ALL of the fastest supercomputers are multi-CPU, so I would say yes: parallelism can easily surpass sheer clock speed in the amount of work done, since virtually any real-world task consists of more than one stage, e.g. input -> process -> output (3 stages that can overlap, as in the rough sketch below). The correct software has to be used of course (SMP-aware). Even desktop CPUs are going multicore - why stop at two when four is better! Think what a quad-CPU setup will do for doom5. Or TMPGEnc.
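As a rough illustration of that input/process/output split (not any particular encoder's code - the "frames" here are just placeholder strings), three threads handing work down a pair of queues:

import queue
import threading

def reader(out_q, n_frames=100):
    # stage 1: pretend to read frames from disk
    for i in range(n_frames):
        out_q.put("frame-%d" % i)
    out_q.put(None)                      # sentinel: no more frames

def processor(in_q, out_q):
    # stage 2: pretend to filter/encode each frame
    while True:
        frame = in_q.get()
        if frame is None:
            out_q.put(None)
            break
        out_q.put(frame.upper())

def writer(in_q):
    # stage 3: pretend to write the result out
    while True:
        if in_q.get() is None:
            break

q1, q2 = queue.Queue(maxsize=8), queue.Queue(maxsize=8)
stages = [threading.Thread(target=reader, args=(q1,)),
          threading.Thread(target=processor, args=(q1, q2)),
          threading.Thread(target=writer, args=(q2,))]
for t in stages:
    t.start()
for t in stages:
    t.join()

With each stage on its own thread, a second (or third) CPU can keep working on one stage while another stage waits.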

Paazabel
8th December 2004, 01:10
The question is really one of parallelism. The PC world is going that direction, and the MHz race looks to be effectively over. That said, chips WILL continue to get faster, but it's much harder to make a quantum leap nowadays. In simple terms, going from 1 GHz to 2 GHz is a lot easier than going from 4 GHz to 8 GHz.

Video is well-suited to parallelism on many levels. You can "chop up" video into multiple segments, dispatching them to multiple processors with a bit of overlap; throw away the overlap up to the key frames (scene changes), and you can process it that much faster. Or, within single frames, you can process multiple blocks at the same time to apply compression (which helps especially with intensive algorithms, like SNOW).
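As a rough sketch of the segment approach (the keyframe positions and encode_segment() are hypothetical placeholders, not a real encoder API; cutting exactly on keyframes means there is no overlap left to throw away):

from multiprocessing import Pool

# assumed scene-change / keyframe positions; a real encoder detects these
KEYFRAMES = [0, 250, 480, 760, 1000]

def encode_segment(bounds):
    start, end = bounds
    # placeholder for running the actual encoder over frames [start, end)
    return "encoded %d-%d" % (start, end)

if __name__ == "__main__":
    segments = list(zip(KEYFRAMES[:-1], KEYFRAMES[1:]))
    with Pool() as pool:                 # one worker process per CPU by default
        pieces = pool.map(encode_segment, segments)
    # pieces come back in submission order, so they can simply be concatenated
    print(pieces)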

Whatever the hardware world does to go faster, video will be able to take advantage, somehow.

LordRPI
8th December 2004, 03:03
I think what's being done with the Sony PS3 is interesting, even though I've been made aware that last week's press releases have not been entirely accurate or representative. I've been in discussions with computer hardware gurus, and they seem to agree on the limitations of increasing clock speed; they're turning their attention towards parallelism and other clever innovations to increase work done while reducing power consumption.

I also think it would be wise to consider that algorithmic parallelism and hardware parallelism are two different things... as is the question of algorithmically removing absurd dependencies, so the hardware actually has independent work to run...
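To make the dependency point concrete, a toy example (not tied to any particular codec): the first loop carries a value between iterations, so it's stuck running in order; the second has no such dependency, so each frame could go to a different CPU.

frames = list(range(8))

# serial by nature: each result depends on the previous one
# (think of a temporal filter that needs the previous output frame)
prev = 0
serial_results = []
for f in frames:
    prev = (prev + f) // 2        # loop-carried dependency
    serial_results.append(prev)

# no dependency between iterations: trivially parallel, frame by frame
independent_results = [f * f for f in frames]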

nFury8
8th December 2004, 07:14
The thing I'm wondering about is, once multi-core processors start showing up, will developers of the specific applications that benefit greatly from parallelism find enough incentive or motivation to re-write their apps, since I assume that will also entail relevant compiler support? That looks to me like a whole lot of work.

Paazabel
8th December 2004, 13:52
It's largely going to depend on how they have written it so far. Some codecs/apps/middleware are multi-threaded, and those will benefit without much help. Those which are not may need to be revisited. Most live MPEG-4 "hardware encoders" use dual-processor boxes, and I would assume those would instantly benefit.

Think about Hyper-Threading. Most of the time, I turn it off if I know the box is going to be used to process video. Now, this is not true parallelism, mind you, but many times encoding is slower with it on than off... so there are some things true parallelism would help with where virtual parallelism is a hindrance.