View Full Version : multiple encoder processes & v1.22


archaeo
25th February 2007, 18:56
Added detection and auto configuration for multiple
core processing. Now the "Multiple Encoder Processes"
setting is automatically set to match the number of
processors at startup or when clicking on the "Multiple
Encoder Processes" option. If you only have one
processor and select the option, DVD-RB will assume a
"fast" system and will set the process count to 2. You
can, however, manually override the setting by changing
the "Encode_Processes=" parameter in REBUILDER.INI (for
instance if you want to use only 2 processors for
encoding on a 4 processor system).
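
The selection rule described in that note can be sketched roughly as below. This is only an illustration of the rule as worded, not DVD-RB's actual code: the function names and the "[Options]" section name are assumptions, and only the "Encode_Processes=" key comes from the release note.

```python
import configparser
import os


def read_override(ini_text, section="Options"):
    """Return the Encode_Processes value from INI text, or None if absent.

    The section name is hypothetical; only the key name is from the note.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if parser.has_option(section, "Encode_Processes"):
        return parser.getint(section, "Encode_Processes")
    return None


def encoder_process_count(cpu_count, ini_override=None):
    """Pick the number of encoder processes to run."""
    if ini_override is not None:
        # A manual setting in the INI file wins over auto-detection.
        return ini_override
    if cpu_count <= 1:
        # Single processor with the option selected: assume a "fast" system.
        return 2
    # Otherwise match the number of processors.
    return cpu_count


if __name__ == "__main__":
    detected = os.cpu_count() or 1
    override = read_override("[Options]\nEncode_Processes=2\n")
    print(encoder_process_count(detected, override))
```

With the override line present, a 4 processor system would use 2 encoder processes; without it, it would use 4.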

A quick question on this feature update:

As I read it, DVDRB now auto-detects the number of processors on your system?

Is this feature update still mainly aimed at encoders that CAN run multiple instances (e.g. HC, and not CCE)? Is there any benefit when one is running CCE and this is set to 2?

I have a dual Pentium system, so am always interested in any updates that can improve the encoding/rebuilding times.

thanks

Boulder
25th February 2007, 19:36
You can't run multiple CCE instances; the restriction is set by CCE itself. I don't know if installing two separate copies of CCE and renaming one exe would work, though.

jdobbs
25th February 2007, 21:10
It doesn't -- I've tried it. You can, though, run CCE Basic and CCE SP at the same time... but DVD-RB will not attempt to do so.

If you look at the task manager on a dual cpu system, however, you will see that CCE uses both processors on its own -- although not necessarily at 100%.

@archaeo

You may find that running HC on your dual Pentium system is actually faster than CCE now (with HC v0.20 running two instances). I assume it's a newer system that supports SSE2 (there were some older dual Pentium motherboards created for servers).

archaeo
25th February 2007, 23:46
It doesn't -- I've tried it. You can, though, run CCE Basic and CCE SP at the same time...

Couldn't there be some incompatibility issues when doing this with these two different versions of CCE? I knew that you can't run two instances of the same version, but this idea is intriguing - I'd love to see each of my processors devoted to an individual instance of CCE, to see how much it would improve speed.

@archaeo
You may find that running HC on your dual Pentium system is actually faster than CCE now (with HC v0.20 running two instances). I assume it's a newer system that supports SSE2 (there were some older dual Pentium motherboards created for servers).


Hmmm, unfortunately, I'm still running my old dinosaur :eek: - dual PIII 1 GHz, SSE only (while I save for an E6700) - so I can't benefit from SSE2.

jdobbs
26th February 2007, 03:22
I'd probably hesitate to use CCE Basic and CCE SP on the same source, for fear that the output might change from segment to segment... although, honestly, I haven't tried it.

dragongodz
28th February 2007, 04:02
You may find that running HC on your dual Pentium system is actually faster than CCE now (with HC v0.20 running two instances).
maybe one day we will see proper speed tests with different encoders on different processors, including dual cores running 2 instances, so people could have an idea of the different speeds.

wmansir
28th February 2007, 05:01
On my Pentium D, CCE is still 50% faster than 2 instances of HC or a single instance of Procoder (which runs <10% slower than HC).

dragongodz
28th February 2007, 06:59
wmansir - it would be interesting to have some idea of QuEnc's speed as well with 2 instances since it is also included with DVD-RB of course. ;)

of course trellis and the extreme setting should not be used. these are for extreme cases only and the tiny gain from them, for the huge speed loss, is hardly worth it for normal cases.

sockeye
28th February 2007, 07:04
On my Pentium D, CCE is still 50% faster than 2 instances of HC or a single instance of Procoder (which runs <10% slower than HC).
The latest HC?
I just upgraded yesterday from a P4 640 with hyperthreading to a Pentium D 940, and installed Hank's 0.20. It is awesome compared to the single-core processor with HC 0.19. With a moderate overclock, it did 6.73 GB of animation at best quality in 90 min. flat (looks nice too). It has cut my encoding times in half.

jdobbs
28th February 2007, 12:14
Interesting. On my Opteron, CCE is still faster than two instances of HC (in BEST mode) -- but only by about (this is a guess) 20% or so. That's a vast improvement over my Athlon XP 3200+, which ran CCE in about half the time of HC (single instance).

I'd heard that the Pentium Dual Core processors actually do better using HC than AMD -- but I haven't tested it.

Boulder
28th February 2007, 12:27
I'd heard that the Pentium Dual Core processors actually do better using HC than AMD -- but I haven't tested it.
Probably due to the heavy SSE2 optimizations.

therat
28th February 2007, 12:42
How many passes does one need to do with CCE to equal BEST quality with HCEnc?

dragongodz
28th February 2007, 13:19
I'd heard that the Pentium Dual Core processors actually do better using HC than AMD -- but I haven't tested it.
its true.

Probably due to the heavy SSE2 optimizations.
throw in that it's compiled with Intel's Fortran compiler. most of us know that code compiled with an Intel compiler tends to run faster on Intel cpus. in fact with the c++ compiler the gap has gotten bigger with every new release.

How many passes does one need to do with CCE to equal BEST quality with HCEnc?
ahhh there's the rub. you cannot give a direct "these settings equal" because each encoder does things a bit differently.

writersblock29
28th February 2007, 16:32
@Jdobbs

I recently upgraded one of my processors to an AMD X64 4600+ Dual Core... and keeping in mind that I upgraded old technology (it's still a socket 939, PC3200 DDR memory), I still find CCE SP 2.5 to encode juuuust a hair faster than HC20. I decided to use HC, though, because I happen to like its quality on interlaced and low-bitrate stuff more than CCE. Anyway, a two-hour movie takes roughly one hour to encode with HC20 (using 2-passes) on the new processor, as opposed to the same movie taking just over real time and a half (2 1/5 hours on NORMAL settings, perhaps 3 hours on BEST) on my old single-core 3200+. I'm not sure how that would measure up to an Intel, but that ought to give you a fair idea of how the new HC handles work these days.

That's also using NaN's DGDecode SSE2 version... is there a reason to keep using that, BTW? I hate changing a working system, but the packaged version of DGDecode has changed in the Rebuilder installer, hasn't it?

Boulder
28th February 2007, 17:14
The "official" DGDecode version hasn't changed, IIRC it's still v1.4.5.

writersblock29
28th February 2007, 17:25
@Boulder

I see. So if it ain't broke, don't fix it, eh? Thanks for the info!

jdobbs
28th February 2007, 19:14
The only downside to the NaN version is that there was a bug related to field-based pictures (if I recall correctly) in which it could hang... but I think I've only heard of it happening twice in the last three years, so it's not like it's something that should necessarily get anyone's panties in a bunch worrying about it.

Boulder
28th February 2007, 19:40
I've had problems with one disc using NaN's DGDecode, too bad I don't remember which one it was. Switching to v1.4.5 fixed the issue.

wmansir
1st March 2007, 03:13
I ran these tests about a month ago, just after upgrading from an AMD XP to a PentiumD. The old system was a rather bloated install of XP pro, the upgrade was a fresh install of Vista Business.

Here's what I jotted down at the time. The tests weren't meant to be very accurate, just to give myself an idea of the performance difference from the upgrade and to see how encoders generally compared using dual processors, since it was my first dual system. As you can see, I was using the computer some of the time, but I noted whenever I did anything CPU/IO intensive like watching an AVI.

This system was a fresh install of Vista Business. It uses basically a stock/installer install of DVD-RB 1.21. The only real tweak is that I have overloaded AVS's bilinearResize with simpleResize.dll's fastBilinearResize.
DVD RB Encode Times
Source: approx 150 minutes, NTSC, no filters
Encoder: Procoder (unless specified)
Computer usage: varies, assume light web browsing and/or mp3 playback, CPU/IO intensive activities will be noted.

AMD XP Tbred @ 2.0 GHz, 1 GB DDR (180 MHz FSB), Nforce2, Windows XP

Prep - 5 min
Enc - 393 min (- 80 min avi playback)
RB - 18 min

Pentium D 805 @ 2.66 GHz (stock), 1 GB DDR (133 MHz FSB, Flexible), Via P880 Ultra, Vista

Prep - 5 min
Enc - 198 min
RB - 15 min

Pentium D 805 @ 3.0 GHz, 1 GB DDR (150 MHz FSB, Flexible), Via P880 Ultra, Vista

Enc - 174 min
RB - 15 min

Pentium D 805 @ 3.0 GHz, 1 GB DDR (150 MHz FSB, Flex disabled), Via P880 Ultra, Vista

Enc - 164 min ( - 40 min avi playback)

Enc - 153 min (HC enc, Best Quality, 2 instances)

Enc - 103 min (CCE 2.5, 1 instance, ~65% cpu usage)

Enc - 97 min (CCE 2.70.02, 1 instance, ~75% cpu usage)

FYI: Flexible/Flex disabled denotes the bios "flexible memory settings" option, which relaxes timings to improve compatibility. I enabled it because I was having stability issues, but the problem was resolved elsewhere.
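
For anyone who wants the relative speeds rather than raw minutes, here is a small sketch that turns the 3.0 GHz (Flex disabled) encode times above into ratios against HC. The helper name is made up for illustration. The Procoder run is left out because it included AVI playback, and the encoders were not necessarily at equivalent quality settings, so treat the ratios as rough.

```python
# Encode times in minutes, copied from the Pentium D @ 3.0 GHz
# (Flex disabled) runs listed above.
encode_minutes = {
    "HC (Best Quality, 2 instances)": 153,
    "CCE 2.5 (1 instance)": 103,
    "CCE 2.70.02 (1 instance)": 97,
}


def speed_ratio(baseline_minutes, other_minutes):
    """Return how many times faster the other run is than the baseline."""
    return baseline_minutes / other_minutes


if __name__ == "__main__":
    baseline = encode_minutes["HC (Best Quality, 2 instances)"]
    for name, minutes in encode_minutes.items():
        ratio = speed_ratio(baseline, minutes)
        print(f"{name}: {minutes} min, {ratio:.2f}x the speed of HC")
```

On these numbers CCE 2.5 comes out at roughly 1.5x the speed of two HC instances, which is where the "50% faster" figure earlier in the thread comes from.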

Rippraff
1st March 2007, 12:51
@wmansir

Maybe you should say how many passes you used with CCE. ;)
I guess two but this is only an assumption.

Cu Rippraff

wmansir
1st March 2007, 16:35
Yes, it was only 2 passes for each CCE run.

wmansir
1st March 2007, 18:42
wmansir - it would be interesting to have some idea of QuEnc's speed as well with 2 instances since it is also included with DVD-RB of course. ;)

of course trellis and the extreme setting should not be used. these are for extreme cases only and the tiny gain from them, for the huge speed loss, is hardly worth it for normal cases.

I'm going to redo all these tests to make them a bit more accurate and because I want to compare Vista to WinXP.

I'm going to reset my Pentium D 805 to its stock 2.66 GHz and try to make the DVD-RB install as base/standard as possible. I usually run my encodes between two HDs, but to make it more standard I will use a single drive to hold the source and work folder.

For a source disc I want to use something very common. I was thinking of using the original Matrix DVD, since providing the Proof of Purchase is practically a prerequisite to joining the forum, but the feature has ILVU, which complicates things. Instead I'm going to use Saving Private Ryan (DTS version, UPC: 67068-46642), movie only, DTS stripped. EDIT: I don't mean stripped from the source, but by DVD-RB. The source is a straight File rip via DVD Decrypter.

wmansir
4th March 2007, 04:53
I've rerun all my tests, both in XP and Vista. The results were about what I expected, except that my previous tests did give the HC encoder the short end of the stick, and there's something odd about ProCoder and Vista.

Anyway, here's the setup:

CPU: Pentium D 805
Memory: 512MBx2 Crucial PC3200 DDR
Board: Asrock 775dual-VSTA
BIOS: All settings default/auto/normal
Drives: C:, D: IBM Deskstar 7.2k 2 MB cache; H: Maxtor 160GB 7.2k 8 MB cache

Windows XP SP2 on C:; virtual memory 1500-3000 MB on C:; Vista on D:; source/work folders on H:

Vista: Sidebar, Aero, UAC disabled.


Software:
QuickTime 7.1.3.100 (req for ProCoder)
Canopus Procoder 2 v2.0
Cinema Craft Encoder SP v2.50.01.00
Cinema Craft Encoder SP v2.70.2.6
DVD Rebuilder v1.22 Pro
AviSynth 2.5.6.0
HCenc 0.20.0.0
QuEnc 0.72

The Results:

http://img263.imageshack.us/img263/706/chartws4.gif

All settings except HC Best were DVD-RB encoder defaults. Results are +/- 1 minute because RB only goes by the minute hand of the clock.

The funny thing was ProCoder under Vista. ProCoder is multithreaded, so one instance will use both CPUs in a multi-core system. In XP running 2 instances gained a few % in speed, but under Vista performance was significantly hurt. When I saw the original results (188 minutes) I thought perhaps Vista had run a search index crawl or auto defrag to throw off the test so I ran it again. The second time the results were even worse at 198 minutes. I really don't know what's going on, not only to make the times much worse than XP, but also so different from each other.
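
To check whether the gap between those two Vista runs could just be timer resolution, here is a quick sketch that applies the +/- 1 minute tolerance to both readings. The function name is hypothetical; the 188 and 198 minute figures are the two ProCoder runs mentioned above.

```python
def relative_change_bounds(first, second, tolerance=1.0):
    """Return (low, nominal, high) fractional change from first to second.

    Each reading is assumed to be off by at most `tolerance` minutes.
    """
    nominal = (second - first) / first
    # Smallest possible gap: first read low, second read high.
    low = ((second - tolerance) - (first + tolerance)) / (first + tolerance)
    # Largest possible gap: first read high, second read low.
    high = ((second + tolerance) - (first - tolerance)) / (first - tolerance)
    return low, nominal, high


if __name__ == "__main__":
    low, nominal, high = relative_change_bounds(188, 198)
    print(f"second run slower by {nominal:.1%} (between {low:.1%} and {high:.1%})")
```

Even at the low end the second run is several percent slower, so the difference is well outside what a one-minute clock resolution can explain.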

Taking out the ProCoder outliers, the average "Vista Penalty" was under 2%. Keep in mind this is without anything else going on. It is possible Vista is more or less efficient when multitasking, and I'm particularly interested to see the effects of Vista's new sound subsystem, so I may run some tests while playing music.

Also, just for fun I ran QuEnc and HC in single instances. QuEnc under XP took 256 minutes and HC Normal under Vista took 176 minutes.

dragongodz
4th March 2007, 05:13
wmansir - thanks for the tests.

were DVD-RB encoder defaults
so QuEnc's "high quality" setting is on, from memory. isn't that right? just confirming for clarity. :)

wmansir
4th March 2007, 07:18
The RB defaults are High Quality, Two Pass VBR without Scene Detection or Trellis.

archaeo
4th March 2007, 15:46
Very informative tests, wmansir. Thanks.
It gives me yet another reason not to run out and upgrade to Vista anytime soon.

wmansir
5th March 2007, 06:07
Actually I've been using Vista now for just about 2 months, and spending some time in XP did make me want to go back; it just felt more responsive. Unfortunately, the major reason I made a complete move to Vista was that after upgrading my motherboard and shuffling my HDs around, a fresh install of XP couldn't import/activate my dynamic disk RAID. With Vista it just worked.