View Single Post
Old 22nd July 2013, 12:49   #14  |  Link
TurboPascal7
Registered User
 
TurboPascal7's Avatar
 
Join Date: Jan 2010
Posts: 270
I tried avstp lately and while I definitely like the idea, current implementation has some issues.

First - library interface is a bit dated, imho. While moving from C++11 thread pool with lambda support to avstp, I had to replace code like
Code:
for (int i = 0; i < threadPool.numberOfThreads(); ++i) {
        threadPool.enqueue([=]{
            prepareBuffers_op(pSrc + (offset_+i*heightPerThread)*srcPitch, buffers_, width, heightPerThread+2, srcPitch, bufferPitch_, heightPerThread / 2 * i * bufferPitch_);
        });
    }
with
Code:
 struct RunData {
        const BYTE* pSrc;
        BYTE** buffers;
        int width;
        int height;
        int srcPitch;
        int bufferPitch;
        int bufferOffset;
        decltype(prepareBuffers_op) op;
    };

    std::vector<RunData> datas;

    for (int i = 0; i < pool.numberOfThreads(); ++i) {
        RunData d;
        d.pSrc = pSrc + (offset_+i*heightPerThread)*srcPitch;
        d.buffers = buffers_;
        d.width = width;
        d.height = heightPerThread + 2;
        d.srcPitch = srcPitch;
        d.bufferPitch = bufferPitch_;
        d.bufferOffset = heightPerThread / 2 * i * bufferPitch_;
        d.op = prepareBuffers_op;
        datas.push_back(d);
    }

    for (int i = 0; i < pool.numberOfThreads(); ++i) {
        pool.enqueue(dispatcher_, [](avstp_TaskDispatcher *dispatcher, void *userData){
            auto data = reinterpret_cast<RunData*>(userData);
            data->op(data->pSrc, data->buffers, data->width, data->height, data->srcPitch, data->bufferPitch, data->bufferOffset);
        }, &datas[i]);
    }
While lambdas support isn't something you can't live without, it would be definitely nice to have.

Second - avstp is slow. I get about 10-15% fps drop compared to a simple implementation like https://github.com/progschj/ThreadPool. I'm not sure what is the reason for this slowdown but it's here and it's really annoying. Maybe I'm doing something wrong in the code above, but I don't think it's the case.

And a personal question - why are you reimplementing a lot of things that have been available in different compilers/boost for ages? I do appreciate extremely easy compilation process of your plugins, but most of avstp code looks like a rewrite of boost to me.

-------
Okay, nevermind the performance part, but your API is just way too confusing.
When called avstp_set_threads(4), thread pool will actually have 3 threads. This is good because we also have the main thread, but I didn't find any mention of this in the documentation. At the same time avstp_get_nbr_threads (also, please rename to "number_of_threads" or "thread_count", nbr is just bad) returns 4, leading to the wrong assumption that there are actually 4 threads in the pool and hence incorrect workload distribution. You should either make this part correct at API level (return actual number of threads in the pool) or note it in the documentation.

Last edited by TurboPascal7; 24th July 2013 at 14:54. Reason: important update
TurboPascal7 is offline   Reply With Quote