Old 5th July 2007, 21:12   #1  |  Link
graysky
Registered User
 
Join Date: Sep 2004
Posts: 429
Programs that actually use all 4 cores of a quad core chip... please contribute!

Here is a short list of programs that actually use all 4 cores of a quad core chip. If you know of others, please post them to this thread, and I'll update it.

THE LIST:
Real-World Applications
Adobe Premiere Elements v3.0.2 (52-85 % of 4 cores depending on source type, filters, etc.)
AutoGK v2.40 (30-53 % of 4 cores depending on source type, filters, etc.)
Cinema 4d Rendering (>99 % of 4 cores)
Dr. DivX v2.0.0 (47-65 % of 4 cores depending on source type, filters, etc.)
DVDShrink v3.2 (~90 % of 4 cores)
Lightwave 3D (>99 % of 4 cores)
Noise Ninja v2.13 (~80 % of 4 cores when doing the noise reduction on an image)
Sony Vegas 7.0e (83-100 % of 4 cores depending on source type, filters, etc.)
TMPG XPress v4.2.3.193 (65-100 % of 4 cores depending on source type, filters, etc.)
Winrar v3.70 (~85-90 % of 4 cores on benchmark; ~75% in practice)
x264 v0.55.663 (>99 % of 4 cores when doing the 2nd pass of a 2 pass encode)

Benchmark/Distributed Computing Applications
BOINC Clients (most of them) (>99 % of 4 cores)
Folding@home SMP client (>99 % of 4 cores)
Muon1 DPAD (~85 % of 4 cores)
OCCT (>99 % of 4 cores)
Prime95 v25.3 (>99 % of 4 cores)
wprime v1.50 (>99 % of 4 cores)

Games
none that I know of yet

If you'd like to contribute an application or game, please post the following:

1) Program name
2) URL to homepage of the program
3) The CPU usage percentage shown in the Windows Task Manager at or near peak, along with a screenshot of the Task Manager.

Here's an example screen-shot of my task manager during the Noise Ninja noise reduction:


Also, please limit replies to quad-core chips only (yeah, I know this will limit the number of replies, but that is, after all, what this thread is all about).

Thanks!

Note: I don't think this thread really belongs under the software section since it's specifically about software for quad core chips.

Last edited by graysky; 18th August 2007 at 15:14.
Old 26th July 2007, 22:08   #2  |  Link
flib0
Registered User
 
Join Date: Jul 2002
Posts: 1
distributed.net client utilizes all four cores of a Quad-core CPU

1) distributed.net client
2) http://www.distributed.net/download/clients.php
3) 100 % of four cores at all times

I have attached two files: one screenshot of the task manager while the client was running on an Intel Core 2 Quad Q6600, and one screenshot of the client itself.

HTH,
flib0
Old 27th July 2007, 01:52   #3  |  Link
JohnnyMalaria
Registered User
 
Join Date: Sep 2006
Posts: 602
The licensed version of our Enosoft DV Processor will use up to 8 cores (if present) to perform those tasks that can be parallelized (e.g., proc amp functions, logo/text overlay).

You won't typically see 80+% on all cores, simply because the algorithms are extremely efficient (often needing 100 to 1000 times fewer instructions than conventional software) and the limiting factors are throughput from a live camera or disk transfer rates for file-based processing; i.e., a single frame can be processed in a fraction of the time between frames.
__________________
John Miller
Enosoft DV Processor - Free for personal use
Old 8th August 2007, 21:59   #4  |  Link
col_oddball
Registered User
 
Join Date: Jan 2007
Posts: 2
Aircrack-ng http://aircrack-ng.org/doku.php

~95% when cracking WPA (dictionary attack)

Processing 38880000 keys took ~9 hours on a Q6600 (@2.4GHz), at 1200 keys per sec.

I had an old Dell machine to try; the results were not good lol:
it took ~60 hours on a dual-processor Xeon (@ 2GHz), at 183 keys per sec (2 physical CPUs).

A Dell 620 laptop took 30 hours.
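The figures above are consistent with each other, which a quick calculation can confirm. This is only a sanity-check sketch: the key count and rates are copied from this post, the helper names are made up for illustration, and the laptop's rate is derived from its 30 hours rather than measured.

```python
KEYS = 38_880_000  # keys processed, as quoted in the post


def hours_for(keys, keys_per_sec):
    """Run time in hours for a given key count and cracking rate."""
    return keys / keys_per_sec / 3600


def rate_for(keys, hours):
    """Implied keys per second for a given key count and run time."""
    return keys / (hours * 3600)


print(f"Q6600 at 1200 keys/s:    {hours_for(KEYS, 1200):.1f} h")
print(f"Dual Xeon at 183 keys/s: {hours_for(KEYS, 183):.1f} h")
print(f"Laptop, 30 h implies:    {rate_for(KEYS, 30):.0f} keys/s")
```

By the same arithmetic, the Q6600 is roughly 6.5 times faster than the dual Xeon on this workload (1200 / 183).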

hope this helps

Oddball
Old 15th August 2007, 13:16   #5  |  Link
morph166955
Registered User
 
Join Date: Mar 2006
Posts: 443
FYI, I've pegged all 8 cores I have with x264 when doing the second pass of a 1080p encode. I don't have any screenshots because it was done in Linux, so you will just have to take my word for it I guess, but it's definitely doable.
Old 16th August 2007, 00:04   #6  |  Link
graysky
Registered User
 
Join Date: Sep 2004
Posts: 429
good to hear you finally got it working... did you ever get it to use all 8 cores under win32 or win64?
Old 16th August 2007, 02:51   #7  |  Link
morph166955
Registered User
 
Join Date: Mar 2006
Posts: 443
Nope, it's barely pegging in my own optimized, completely home-built version of Linux. I haven't even thought of loading Windows on it to see if I can get it working on that.
Old 16th August 2007, 11:14   #8  |  Link
jeffy
Registered User
 
Join Date: Jan 2007
Posts: 943
Quote:
Originally Posted by graysky View Post

THE LIST:
Real-World Applications
Adove Premiere Elements v3.0.2
typo: Adobe
Old 18th August 2007, 15:04   #9  |  Link
morph166955
Registered User
 
Join Date: Mar 2006
Posts: 443
Quote:
Originally Posted by graysky View Post
good to hear you finally got it working... did you ever get it to use all 8 cores under win32 or win64?
I just did a new diff based on the current svn for the threadpool patch that pengo created a while back. I'd love to know how it works on your quad core in comparison to a vanilla build (please don't test with one that has the aq patch; it could skew the results a little).

http://forum.doom9.org/showthread.ph...37#post1035137
Old 18th August 2007, 15:13   #10  |  Link
graysky
Registered User
 
Join Date: Sep 2004
Posts: 429
@morph: I don't have a compiler anymore, you're gonna have to post the executables... what's it supposed to do? The standard build already maxes out my cores.
Old 18th August 2007, 17:25   #11  |  Link
morph166955
Registered User
 
Join Date: Mar 2006
Posts: 443
Long story short (if you want the long story, read the entire thread from my link), x264 currently creates and destroys threads very often (this is why you might set --threads 6 for a 4-core machine). We found that when doing this with a significant number of threads (normally 6+), the time it took to create and destroy the threads, as well as to handle the mutex locking between them, was actually longer than the time each thread spent encoding (systems with fewer cores generally take longer to do the encodes, so this is almost irrelevant for them).

What the patch does is, instead of creating/destroying threads every few milliseconds, create a "thread pool" where the threads are created once at the beginning of the encode and only destroyed when the encode is complete. So while a thread may sit idle for a brief interval while x264 is loading the data for it, we're not having to wait for it to be created or destroyed at all. On my side this showed a significant boost in CPU usage (almost 200% out of the 800% possible, at 100% per core) on files that were 848x352 (my test media was the DVD of "Die Another Day") and in turn a speed boost. The CPU % difference shrank as the image got larger, until it wasn't visible at 1080p resolutions (because the vanilla build can max that out); the time difference was still apparent, although it wasn't as noticeable as before.
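To illustrate the general idea (this is not the actual patch, which is C code inside x264), here is a minimal Python sketch contrasting the two strategies. `encode_frame` is a hypothetical stand-in for the per-frame work, and the function names are made up for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def encode_frame(n):
    # Hypothetical stand-in for encoding one frame; not x264's real work.
    return n * n


def run_with_fresh_threads(frames, threads):
    """Create, start and join a new set of threads for every batch of frames."""
    results = [None] * len(frames)

    def worker(i):
        results[i] = encode_frame(frames[i])

    for start in range(0, len(frames), threads):
        batch = [threading.Thread(target=worker, args=(i,))
                 for i in range(start, min(start + threads, len(frames)))]
        for t in batch:
            t.start()
        for t in batch:
            t.join()  # these threads are discarded after each batch
    return results


def run_with_pool(frames, threads):
    """Reuse at most `threads` long-lived workers for the whole job."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(encode_frame, frames))


frames = list(range(100))
assert run_with_fresh_threads(frames, 4) == run_with_pool(frames, 4)
```

Both functions produce the same results; the difference is only in how many threads get created and torn down along the way, which is the overhead the patch is trying to avoid.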

I haven't really tested this myself on a Windows box with more than 2 cores, since I don't have one to do the tests on. I'm curious how my new media center with an E6600 (2.4GHz Core 2 Duo, for those who don't speak "Intel-Processorese") and a boatload of RAM will handle the difference. It's also running Vista-32, so that makes me even more curious as to how this will work out.

I've gotta redo my mingw/msys install right now, since it's just clogged up with lots of old libraries and things that I don't want affecting the outcome of any results from a test build. I'm heading to the post office shortly to send out some things, so I'll poke at this later on today and see what I can do.
Old 19th August 2007, 10:48   #12  |  Link
graysky
Registered User
 
Join Date: Sep 2004
Posts: 429
How is the official build of x264.exe (from x264 dot nl) built? I noticed it says mingw, so I'm assuming that's true... I will say that there are differences in the speed/efficiency at which lameenc encodes mp3 files that depend not only on the compiler, but also on the switches used to compile it. Mingw vs. ICL vs. MSVC, for example.

Can you post the console commands you used to compile it?
Old 23rd August 2007, 20:33   #13  |  Link
graysky
Registered User
 
Join Date: Sep 2004
Posts: 429
@morph: you got some exes for me or what?
Old 24th August 2007, 12:12   #14  |  Link
morph166955
Registered User
 
Join Date: Mar 2006
Posts: 443
Hey... sorry, got jammed up with work over the weekend and didn't get to finish reloading mingw/msys on my laptop. I'm going to try to finish it up this weekend and get ya something.
Old 25th August 2007, 06:04   #15  |  Link
pc_speak
Old Batch Hacker
 
Join Date: Oct 2006
Location: At Home
Posts: 78
Acronis True Image 10.0 Home
http://www.acronis.com.au/homecomputing/products/

Seriously simple & fast drive backups. Reminds me a little of how Powerquest's 'Drive Image' used to be.


Last edited by pc_speak; 25th August 2007 at 06:42.
Old 25th August 2007, 16:38   #16  |  Link
morph166955
Registered User
 
Join Date: Mar 2006
Posts: 443
Sorry for the delay... Here are your exes!

http://www.benswebs.com/public/x264/...1_no-tp04a.exe
http://www.benswebs.com/public/x264/x264_r671_tp04a.exe

I haven't had a chance to test them yet (doing so now), so if they fail let me know and I'll see what's wrong with them. I didn't compile MP4 support in, but I think for the purposes of this test, having it dump to a raw .264 (or to NUL for that matter) is probably the best option, to eliminate even the tiny amount of CPU time it takes to calculate any container headers. The one w/o the patch is a plain vanilla svn build; the other is just the vanilla build with the threadpool patch applied. These builds are also not fprofiled. I'm going to make some that are, to compare the speeds among all 4 shortly.

For the one with the threadpool patch, don't run with --threads set higher than the actual number of cores you have. Remember, the difference between the original vanilla build and this one is that the threadpool build creates its threads at the beginning and then uses them throughout the encode, whereas the vanilla one creates and destroys threads very frequently (a thread's life is on the order of 250-500ms). For the purposes of this test, I'd say that running --threads at its default of 1.5*cores on the vanilla build is probably fine, unless you have a personal preference for speed that you find works differently.

Enjoy and sorry again for the delay!

EDIT: For the purposes of testing I have also uploaded a windows version of gnu time 1.17 to my webspace at http://www.benswebs.com/public/x264/time.exe. For those unfamiliar with it, gnutime gives the average cpu usage as well as runtimes after the program exits. To use this simply do
Code:
time.exe x264_r671_tp04a.exe (options)
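
For anyone who would rather not download another exe, roughly the same measurement can be sketched in Python. This is only an approximation of what GNU time reports, not a replacement for it: the function name and the returned keys are made up for this example, and on Windows `os.times()` reports zero for child CPU time, so the CPU figures are only meaningful on Unix-like systems.

```python
import os
import subprocess
import sys
import time


def run_and_time(cmd):
    """Run cmd and report wall time, child CPU time and average CPU usage."""
    before = os.times()
    start = time.perf_counter()
    returncode = subprocess.run(cmd).returncode
    wall = time.perf_counter() - start
    after = os.times()
    # CPU seconds (user + system) consumed by child processes during the run.
    cpu = ((after.children_user - before.children_user)
           + (after.children_system - before.children_system))
    # Can exceed 100% when the child keeps more than one core busy.
    percent = 100.0 * cpu / wall if wall > 0 else 0.0
    return {"returncode": returncode, "wall_s": wall,
            "cpu_s": cpu, "cpu_percent": percent}


if __name__ == "__main__":
    # Usage: python thisscript.py x264_r671_tp04a.exe (options)
    if len(sys.argv) > 1:
        print(run_and_time(sys.argv[1:]))
```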

Last edited by morph166955; 23rd September 2007 at 17:27.
Old 26th August 2007, 11:01   #17  |  Link
graysky
Registered User
 
Join Date: Sep 2004
Posts: 429
This application has failed to start because pthreadsGC2.dll was not found.

Neither of your exes work.

EDIT: I just found ftp://sourceware.org/pub/pthreads-win32/dll-latest and placed it in the same dir as your exe's and they are running. I'll post results in a few.

Last edited by graysky; 26th August 2007 at 11:04.
Old 26th August 2007, 11:12   #18  |  Link
graysky
Registered User
 
Join Date: Sep 2004
Posts: 429
Q6600 @ 9x266 (stock)
Some results with a test avs making heavy use of plugins on a 480x480 NTSC source mpeg:

Code:
global MeGUI_darx = 4
global MeGUI_dary = 3
DGDecode_mpeg2source("E:\Incoming\test\test-new.d2v")
AssumeTFF()
Telecide(guide=1,post=2,vthresh=35) # IVTC
Decimate(quality=3) # remove dup. frames
crop( 2, 0, -10, -4)
Spline36Resize(640,480) # Spline36 (Neutral)
Pass1:
Code:
--pass 1 --bitrate 2175 --stats "E:\Incoming\test\test-NEW.stats" --bframes 3 --b-pyramid --direct auto --subme 1 --analyse none --vbv-maxrate 25000 --me dia --threads auto --thread-input --sar 4:3 --progress --no-dct-decimate --no-psnr --no-ssim --output NUL "E:\Incoming\test\test-NEW.avs"
Pass2:
Code:
--pass 2 --bitrate 2175 --stats "E:\Incoming\test\test-NEW.stats" --ref 5 --mixed-refs --no-fast-pskip --bframes 3 --b-pyramid --b-rdo --bime --weightb --direct auto --subme 6 --trellis 2 --analyse all  --8x8dct --vbv-maxrate 25000 --me umh --threads auto --thread-input --sar 4:3 --progress --no-dct-decimate --no-psnr --no-ssim --output "E:\Incoming\test\test-NEW.264" "E:\Incoming\test\test-NEW.avs"
no-tp04-a
FPS1: 81.46
FPS2: 15.65

tp04-a
FPS1: 80.14
FPS2: 15.56

The md5sums of the two output files did not match, nor did the file sizes.
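
A comparison like this can be scripted. The sketch below uses Python's `hashlib`; the helper names are made up for illustration and the file names in the usage note are placeholders, not the actual outputs from this test.

```python
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to keep memory low."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_output(path_a, path_b):
    """True only if both files have the same size and the same MD5 digest."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False  # different sizes can never hash the same content
    return md5_of(path_a) == md5_of(path_b)
```

Usage would be something like `same_output("vanilla.264", "threadpool.264")`, which returns False for outputs that differ the way these two did.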

Last edited by graysky; 26th August 2007 at 11:35.
Old 26th August 2007, 11:38   #19  |  Link
graysky
Registered User
 
Join Date: Sep 2004
Posts: 429
Q6600 @ 9x266 (stock)
Finally, some results with a test avs making heavy use of plugins on a 480x480 NTSC source mpeg:

Code:
global MeGUI_darx = 4
global MeGUI_dary = 3
DGDecode_mpeg2source("C:\work\test-new.d2v")
edeintted = AssumeTFF().SeparateFields().SelectEven().EEDI2(field=-1)
tdeintted = TDeint(edeint=edeintted,order=1)
tfm(order=1,clip2=tdeintted).tdecimate(hybrid=1)
crop( 6, 0, -10, 0)
Pass1:
Code:
--pass 1 --bitrate 2175 --stats "E:\Incoming\testencode.stats" --bframes 3 --b-pyramid --direct auto --subme 1 --analyse none --vbv-maxrate 25000 --me dia --threads auto --thread-input --sar 4:3 --progress --no-dct-decimate --no-psnr --no-ssim --output NUL "E:\Incoming\testencode.avs"
Pass2:
Code:
--pass 2 --bitrate 2175 --stats "E:\Incoming\testencode.stats" --ref 5 --mixed-refs --no-fast-pskip --bframes 3 --b-pyramid --b-rdo --bime --weightb --direct auto --subme 6 --trellis 2 --analyse all  --8x8dct --vbv-maxrate 25000 --me umh --threads auto --thread-input --sar 4:3 --progress --no-dct-decimate --no-psnr --no-ssim --output "E:\Incoming\testencode.mkv" "E:\Incoming\testencode.avs"
no-tp04-a
FPS1: 105.87
FPS2: 18.23

tp04-a
FPS1: 107.29
FPS2: 18.26

md5sums for the two files did not match nor did filesizes

Last edited by graysky; 26th August 2007 at 12:22.
Old 26th August 2007, 12:20   #20  |  Link
graysky
Registered User
 
Join Date: Sep 2004
Posts: 429
Q6600 @ 9x266 (stock settings).
Some results with a test avs doing just pure encode of a 720x480 DVD source:

Code:
global MeGUI_darx = 16
global MeGUI_dary = 9
DGDecode_mpeg2source("C:\work2\test-720.d2v")
loop(10)
Pass1:
Code:
--pass 1 --bitrate 2175 --stats "C:\work2\x.stats" --bframes 3 --b-pyramid --direct auto --subme 1 --analyse none --vbv-maxrate 25000 --me dia --threads auto --thread-input --sar 16:9 --progress --no-dct-decimate --no-psnr --no-ssim --output NUL "C:\work2\x.avs"
Pass2:
Code:
--pass 2 --bitrate 2175 --stats "C:\work2\x.stats" --ref 5 --mixed-refs --no-fast-pskip --bframes 3 --b-pyramid --b-rdo --bime --weightb --direct auto --subme 6 --trellis 2 --analyse all  --8x8dct --vbv-maxrate 25000 --me umh --threads auto --thread-input --sar 16:9 --progress --no-dct-decimate --no-psnr --no-ssim --output "C:\work2\x.mkv" "C:\work2\x.avs"
no-tp04-a
FPS1: 122.40
FPS2: 18.39

tp04-a
FPS1: 120.51
FPS2: 18.36

x264 version 0.55.663 from x264.nl
FPS1: 130.65
FPS2: 17.84
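
To put the three sets of numbers side by side, here is a small sketch that computes the relative change of the threadpool build against the vanilla build. The FPS values are copied from the three result posts above; the labels and the function name are made up for this example.

```python
# (label, vanilla no-tp04-a FPS, threadpool tp04-a FPS), copied from the posts above.
RESULTS = [
    ("Telecide/Decimate script, pass 1", 81.46, 80.14),
    ("Telecide/Decimate script, pass 2", 15.65, 15.56),
    ("EEDI2/TDeint script, pass 1", 105.87, 107.29),
    ("EEDI2/TDeint script, pass 2", 18.23, 18.26),
    ("Plain 720x480 encode, pass 1", 122.40, 120.51),
    ("Plain 720x480 encode, pass 2", 18.39, 18.36),
]


def pct_change(baseline, patched):
    """Relative change of patched versus baseline, in percent."""
    return (patched - baseline) / baseline * 100.0


for label, vanilla, pool in RESULTS:
    print(f"{label}: {pct_change(vanilla, pool):+.2f}%")
```

Every difference comes out under 2% in either direction, so on this quad core the two builds look roughly equal; with single runs it is hard to say whether differences that small are anything more than run-to-run variation.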