PDA

View Full Version : is a lookup table for three pixel values possible??


E-Male
2nd December 2004, 18:19
i read about the advantages of lookup tables, but i got a question:
is it also possible to do a lookup table for an operation that takes three pixel values as input? (i know 256^3 is big, just asking)

"unsigned char t[256*256*256]" didn't work in a quick test i did, 256*256*15 seems to be the maximum size for an array

so there are 3 posibilities:
i'm just too much of a noob to...
a)...get it to work
b)...see that it won't work
c)...see that would work but wouldn't help speed things up

i hope one of our many experts here can help me out

thx
E-Male

Nic
2nd December 2004, 18:24
When creating an array such as
unsigned char x[250];

The array is allocated from the stack, the stack is a very finite amount of memory (I think by default it's about 1Mb in Visual Studio, but that could be wrong)

If you want a very large array, allocate it from the heap with:

unsigned char *t = new unsigned char[256*256*256];

or if in C not C++

unsigned char *t = (unsigned char*) malloc(256*256*256*sizeof(unsigned char));

Remember to "delete [] t" or "free(t)" when you're finished with the memory...

-Nic

ps
If you're feeling really adventurous you could try it with a 3 dimensional array, such as unsigned char x[5][5][5] ;) A little harder to allocate off the heap though ;)

E-Male
2nd December 2004, 18:29
thx, just the quick help i needed

tritical
2nd December 2004, 18:46
If your going to be randomly accessing that array on every pixel in an image or something like that then the calculations involved would have to be pretty intensive to get much of a speed up I think. A 256*256*256 unsigned char array is gonna be around 16 megabytes in size, which defintely wont fit into an L1 or L2 cache. It depends on the specific application of the LUT table though.

E-Male
2nd December 2004, 19:10
it's exactly 16MB
i gues i can only find out if it help by coding and benchmarking it

akupenguin
2nd December 2004, 19:16
Originally posted by Nic
If you're feeling really adventurous you could try it with a 3 dimensional array, such as unsigned char x[5][5][5] ;) A little harder to allocate off the heap though ;)
It's only a little harder.

unsigned char (*lut)[256][256] = (unsigned char (*)[256][256]) malloc(256*256*256);

or even

unsigned char (*lut)[256][256] = new char[256][256][256];

Nic
2nd December 2004, 21:02
@aku: I know that, it was a challenge for him! Now how's he gonna learn? ;)

@tritical: You're right, but it depends on what calulation the LUT replaces...E-Male will know by trying it out, which shouldn't be hard...let us know how you get on :)

@Mod: Maybe this should be in the development forum (?)

-Nic

E-Male
2nd December 2004, 22:10
i tried it with the sharpening formula from limitedsharpen
and the lookup table is noticably slower

i see a way for possible optimizing it:
we have three values: pixel, maximum and minimum (max and min of the pixel and it's 3x3 neighborhood)
now since the minimum can't be bigger than the pixel and the maximum is never smaller we have many unused values in the table
i don't know if i'm anywhere near able to take advantage of that now, but i'll have another look tomorrow

Bidoche
3rd December 2004, 12:09
a 16M lookup table is definitely a bad idea.

Even it manages to make your filter faster, which is unlikely, (and you are better off trying assembly first)
It eats memory that could be used otherwise for the frame pipeline, and I guarantee that a cache miss (or getting to swap mem on HD) hurts a lot more.

E-Male
3rd December 2004, 23:24
the questions how to optimize it, eliminating the combinations that can't happen

tritical
4th December 2004, 01:21
Well, you could allocate/decallocate the array this way and eliminate the unneeded entries. However, I really don't think this will be much faster... but I haven't tested it so I can't say for sure. It would probably be slower :).

unsigned char ***lut;
lut = (unsigned char ***)malloc(256*sizeof(unsigned char **));
for (int x=0; x<=255; ++x)
{
lut[x] = (unsigned char **)malloc((256-x)*sizeof(unsigned char *));
for (int y=x; y<=255; ++y)
{
lut[x][y-x] = (unsigned char *)malloc((x+1)*sizeof(unsigned char));
}
}
for (int x=0; x<=255; ++x)
{
for (int y=x; y<=255; ++y) free(lut[x][y-x]);
free(lut[x]);
}
free(lut);

The lookups would then be:

[current_pixel][max_value-current_pixel][low_value]

E-Male
4th December 2004, 04:21
thx
i'll see if i can put that code into my test app over the WE

Prettz
12th December 2004, 19:44
Keep in mind that a cache miss is at minimum several hundred clock cycles and you'll most likely be getting a cache miss for every single array access.

Edit: Maybe I should add in that a cache miss is at minimum several hundred clock cycles on my system. I suppose it could be much less on an A64 system with the on-board MMU and some fast CAS 2 RAM.