OK, but GPUs are massively parallel, while CPUs are not. If the algorithm takes 4µs per color per thread on the CPU, then a quad-core CPU would need only about 0.26 seconds to calculate a 64^3 3D LUT (262,144 colors × 4µs ÷ 4 cores). A GPU, however, doesn't calculate just 4 colors at once, but hundreds or thousands (I'm not sure how many). So even if the GPU also needs 4µs per color, the massive parallelism means the algorithm would still run in real time without any problems. That said, I think 4µs on the CPU is very optimistic. It would probably be much slower than that.
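To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. It assumes perfect scaling across parallel lanes (no memory or scheduling overhead), and the figure of 1000 GPU lanes is purely a placeholder, since I don't know the real number. The function name `lut_build_seconds` is just made up for illustration.

```python
def lut_build_seconds(size, us_per_color, parallel_lanes):
    """Estimate the time to fill a size^3 3D LUT.

    Assumes every color costs the same and the work splits
    perfectly across `parallel_lanes` (an idealization).
    """
    colors = size ** 3                      # 64^3 = 262,144 entries
    total_us = colors * us_per_color        # total work in microseconds
    return total_us * 1e-6 / parallel_lanes # seconds of wall-clock time


single_thread = lut_build_seconds(64, 4.0, 1)
quad_core = lut_build_seconds(64, 4.0, 4)
gpu_guess = lut_build_seconds(64, 4.0, 1000)  # 1000 lanes is an assumed placeholder

print(f"1 thread:   {single_thread:.3f} s")
print(f"4 threads:  {quad_core:.3f} s")
print(f"1000 lanes: {gpu_guess * 1000:.2f} ms")
```

With these assumptions, one thread needs about 1.05 seconds, four threads about 0.26 seconds, and 1000 parallel lanes roughly 1 millisecond. If the real per-color cost is much higher than 4µs, all three numbers scale up by the same factor.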