UofC
5th July 2006, 19:03
CPU utilization on transcodes in REAL

While transcoding files in the 32 GB range to Real format, I am not able to fully utilize the CPU's potential. 80% usage is the best I have seen so far.

With dual Intel Xeon 3.2 GHz, AMD Opteron 275, and dual AMD Opteron 275 systems, I cannot maximize the performance. I have noted that Real Producer 10 and 11 can only use two threads or two CPUs for a single encode. I implemented RAID0 in hopes of increasing drive performance, but this changed very little. I have tried assigning CPU affinity, but this also did very little.

I googled myself half to death and found no explanation. Could anyone here provide some sage advice?

karl_lillevold
5th July 2006, 19:28
The latest Helix Producer 11 can encode a single file with up to 4 threads, that is if you have 4 virtual or real CPUs.

I wonder what may be the limiting factor here. Usually I see higher than 80% when encoding on my dual. What is your input format? If it is not I420 or YV12, color conversion or decoding may be taking place in the reader module of the encoder.

UofC
5th July 2006, 21:55
The capture is native DV, in both AVI and MOV formats; both do the same thing.

On the quad core AMD it uses less than 40% on 2 or even 3 simultaneous transcodes. A single transcode will not span more than two CPUs using Real Producer 10 or 11.

On the dual core AMD it uses 65% on both CPUs.

On the dual CPU Xeon it uses 70% on both CPUs.

bratao
6th July 2006, 02:03
Your disk is limiting the encoder performance.

A 32 GB file is too much for any disk's performance.

Sirber
6th July 2006, 02:08
68MB/s is not enough?

UofC
6th July 2006, 18:11
Good point. See, this is why I ask questions.

The hard drive averages 60MB/s. RAID0 averages 110MB/s.

The CPU averages 200MB/s or 400MB/s if the two cores double the bandwidth. (googled it for 2 GHz bandwidth)

How do you think I could get 400MB/s out of a hard drive? :) I was thinking that the CPU is not getting the data fast enough. RAID0 tests showed 100% improvement over a single disk.

As for why I cannot get Real to use more than two CPUs, I will assume that 4 CPUs is just too much power for it. There go my visions of a quad-core, RAID-wielding beast. argh argh arghh more power!

karl_lillevold
6th July 2006, 22:51
I just tested both the latest RealProducer 11 (GUI) and Helix Producer 11 (cmd line). On my dual Xeon with HT enabled, it launches 4 worker threads. Now, since this is an older HT system, it's not very effective in this configuration, but a true quad system would be, as well as a newer dual core dual CPU system.

Also, if you use the GUI, remember to disable all video preview, and audio meters... Do you use the GUI version?

EDIT: P.S. output video dimensions must be > 180 lines for the multiple threads to kick in.

UofC
7th July 2006, 15:31
I use the GUI sometimes but not for testing and such.

Interesting that you mention HT as it was the root of a major quality problem in Real Producer 10. Across 4 identical machines the video was distorted and pixelated.

After HT was turned off everything was fine.

What Xeons and motherboard do you have? I am running 3.2 GHz 800 FSB Xeons (I really wish Intel would adopt a naming convention that is easier to read) with a Supermicro motherboard.

karl_lillevold wrote:
> EDIT: P.S. output video dimensions must be > 180 lines for the multiple threads to kick in.

What do you mean by lines? Are you talking dimensions like 320x240 res?

karl_lillevold
7th July 2006, 18:24
UofC wrote:
> The hard drive averages 60MB/s. RAID0 averages 110MB/s.
>
> The CPU averages 200MB/s or 400MB/s if the two cores double the bandwidth. (googled it for 2 GHz bandwidth)
>
> How do you think I could get 400MB/s out of a hard drive? :) I was thinking that the CPU is not getting the data fast enough. RAID0 tests showed 100% improvement over a single disk.

I am not sure about this calculation. You said your source is DV. DV bitrate, including audio, is 26.5 Mbps; that is 26.5/8 MB/s ~= 3.3 MB/s. Even with a single HD at 60 MB/s, you should be able to read 60/3.3 = 18 times real-time. That is 25fps*18 = 450 fps for PAL... What is the encoding speed (fps) you are seeing?
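
For anyone who wants to plug in their own numbers, the arithmetic above can be sketched in a few lines of Python. The function name and constants are only illustrative; the 26.5 Mbps and 60 MB/s figures are the ones quoted in this thread, not measurements.

```python
# Back-of-the-envelope check: can the disk feed the encoder fast enough?
DV_BITRATE_MBPS = 26.5   # DV video + audio in megabits/s (figure from this thread)
DISK_MB_PER_S = 60.0     # single-drive average quoted in this thread
PAL_FPS = 25             # PAL frame rate


def realtime_multiple(disk_mb_per_s, bitrate_mbps):
    """How many times faster than real time the disk can deliver the stream."""
    stream_mb_per_s = bitrate_mbps / 8  # megabits -> megabytes
    return disk_mb_per_s / stream_mb_per_s


multiple = realtime_multiple(DISK_MB_PER_S, DV_BITRATE_MBPS)
max_fps = multiple * PAL_FPS  # upper bound on fps the disk alone allows
print(f"{multiple:.1f}x real time, about {max_fps:.0f} fps from disk")
```

If the encoder reports a frame rate far below that bound, the disk is unlikely to be what is holding it back.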

So I do not think the hard drive is the bottleneck in fully utilizing both CPUs. I think the decoding of the DV format is probably single-threaded, and might be the limiting factor, since the encoder has to wait for each DV video frame to be decoded, and possibly color-converted, before it can encode it.
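
A rough model of that effect, as a sketch only: assume each frame needs a single-threaded decode step followed by an encode step that splits perfectly across the worker threads, with one thread per CPU. The per-frame costs below are hypothetical numbers picked for illustration, not measurements of Producer.

```python
def cpu_utilization(decode_ms, encode_ms, threads):
    """Fraction of total CPU capacity kept busy when a single-threaded decode
    must finish before the multi-threaded encode of that frame can start."""
    wall_ms = decode_ms + encode_ms / threads  # elapsed time per frame
    work_ms = decode_ms + encode_ms            # CPU time actually spent per frame
    return work_ms / (threads * wall_ms)


# Hypothetical costs: 1 ms to decode a frame, 4 ms of total encode work.
for n in (2, 4):
    print(f"{n} threads: {cpu_utilization(1.0, 4.0, n):.1%} of CPU capacity busy")
```

With these made-up costs the model gives roughly 83% utilization on two CPUs and 62.5% on four, so a serial decode stage alone could explain utilization that drops as CPUs are added.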

UofC wrote:
> What do you mean by lines? Are you talking dimensions like 320x240 res?
Yes. 320x240 would have 240 lines.

UofC wrote:
> Interesting that you mention HT as it was the root of a major quality problem in Real Producer 10. Across 4 identical machines the video was distorted and pixelated.
>
> After HT was turned off everything was fine.
This is really strange. I have been encoding with HT without any problems, and I have no explanation for this. My threading implementation works the exact same way on actual dual CPUs as with virtual CPUs. Perhaps a hardware problem, but you say it happened on 4 identical machines.