View Full Version : When writing our own filters question.
Dark Alchemist
5th January 2006, 17:18
How do I get at the image data? Once my src is filled with the bytes of a frame, I want to manipulate the image and leave the chroma alone, but no matter which format I try or what I do, I end up with a mess or with everything colored green.
I tried RGB32 thinking it would be easier to manipulate the individual pixels but no dice.
I am using the simplesample (15a) example but there is nothing to grab onto in it. I am looking at other developer's plugins but they all use MMX/SSE and I do not know how to code for that (I really do not want to code for that at all while I am learning how to manipulate an image).
I know not many come in here or answer these threads since most are users not programmers but if anyone can offer help I really appreciate it.
tsp
5th January 2006, 17:53
If you only want to use the luma information, a planar format (YV12) would be the easiest format to work with.
Try posting your code (or the part from SimpleSample you don't understand) and it will be easier to tell you what is wrong.
Dark Alchemist
5th January 2006, 20:20
Well, simplesample15a doesn't have anything I could see that told me what was what. All that SimpleSample did was copy source to destination and stick a box in the middle of the screen.
if (vi.IsYUY2()) {
// This code deals with YUY2 colourspace where each 4 byte sequence represents
// 2 pixels, (Y1, U, Y2 and then V).
// This colourspace is more memory efficient than RGB32 but can be more awkward to use sometimes.
// However, it can still be manipulated 32bits at a time depending on the
// type of filter you are writing
// There is no difference in code for this loop and the RGB32 code due to a coincidence :-)
// 1) YUY2 frame_width is half of an RGB32 one
// 2) But in YUY2 colourspace, a 32bit variable holds 2 pixels instead of the 1 in RGB32 colourspace.
for (h=0; h < src_height;h++) { // Loop from top line to bottom line (opposite of RGB colourspace).
for (w = 0; w < src_width/4; w+=1) { // and from leftmost double-pixel to rightmost one.
*((unsigned int *)dstp + w) = *((unsigned int *)srcp + w); // Copy 2 pixels worth of information from source to destination.
} // at a time by temporarily treating the src and dst pointers as
// 32bit (4 byte) pointers instead of 8 bit (1 byte) pointers
srcp = srcp + src_pitch; // Add the pitch (note use of pitch and not width) of one line (in bytes) to the source pointer
dstp = dstp + dst_pitch; // Add the pitch to the destination pointer.
}
// end copy src to dst
//Now draw a white square in the middle of the frame
// Normally you'd do this code within the loop above but here it is in a separate loop for clarity;
dstp = dst->GetWritePtr(); // reset the destination pointer to the top, left pixel. (YUY2 colourspace only)
dstp = dstp + (dst_height/2 - SquareSize/2)*dst_pitch; // move pointer to SquareSize/2 lines from the middle of the frame;
int woffset = dst_width/8 - SquareSize/4; // let's precalculate the width offset like we do for the lines.
for (h=0; h < SquareSize;h++) { // only scan SquareSize number of lines
for (w = 0; w < SquareSize/2; w+=1) { // only scans the middle SquareSize pixels of a line
*((unsigned int *)dstp + woffset + w) = 0x80EB80EB; // Set Y1 and Y2 to max, U and V to no colour.
} // LSB = Y1, MSB = V
dstp = dstp + dst_pitch;
}
As you can see nothing there for me to grab onto so I can understand what it is I need to know to manipulate the actual image.
hanfrunz
5th January 2006, 21:05
First you have to understand how a frame is stored in memory. In RGB32 Mode you have 4byte/pixel:
Byte1: the value for the BLUE channel (B)
Byte2: the value for the GREEN channel (G)
Byte3: the value for the RED channel (R)
Byte4: the value for the alpha channel (usually not used)
in RGB24 Mode you have 3 byte/pixel BGR, and no alpha.
The simplesample filter just copies these 4 bytes (RGB32) at one time.
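As a rough illustration of that byte layout, here is a minimal sketch that reads and writes one RGB32 pixel in a plain byte buffer. The names `Rgb32Pixel`, `readPixel` and `writePixel` are made up for this example and are not part of AviSynth. `pitch` is assumed to be the number of bytes per row, which can be larger than `width * 4`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: one RGB32 pixel in memory order (B, G, R, A).
struct Rgb32Pixel {
    std::uint8_t b, g, r, a;
};

// Read the pixel at column x, row y. pitch is the number of bytes per row.
inline Rgb32Pixel readPixel(const std::uint8_t* frame, int pitch, int x, int y) {
    assert(x >= 0 && y >= 0);
    const std::uint8_t* p = frame + y * pitch + x * 4;  // 4 bytes per pixel
    return Rgb32Pixel{p[0], p[1], p[2], p[3]};
}

// Write the pixel at column x, row y.
inline void writePixel(std::uint8_t* frame, int pitch, int x, int y, Rgb32Pixel px) {
    assert(x >= 0 && y >= 0);
    std::uint8_t* p = frame + y * pitch + x * 4;
    p[0] = px.b;
    p[1] = px.g;
    p[2] = px.r;
    p[3] = px.a;
}
```

Working one byte at a time like this is slower than copying 32 bits at once, but it makes each channel visible.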
The YUY2 mode is a little more complicated. The horizontal resolution of the chroma information is 50% of the luma information. This is called 4:2:2 chroma subsampling. So 2 pixels share the same bytes of chroma information for the "U" and "V" components:
Byte1: Y1
Byte2: U
Byte3: Y2
Byte4: V
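Given that Y1 U Y2 V order, changing only the luma means touching bytes 0 and 2 of every 4-byte group and leaving bytes 1 and 3 alone. A minimal sketch, assuming a plain byte buffer (the function name `invertLumaYUY2` and its parameters are made up for this example):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: invert the luma of a YUY2 buffer, leave chroma untouched.
// rowBytes is the number of used bytes per row (2 bytes per pixel);
// pitch is the distance in bytes from one row to the next.
inline void invertLumaYUY2(std::uint8_t* frame, int pitch, int rowBytes, int height) {
    assert(rowBytes % 4 == 0);  // rows consist of whole Y1 U Y2 V groups
    for (int y = 0; y < height; ++y) {
        std::uint8_t* row = frame + y * pitch;
        for (int x = 0; x < rowBytes; x += 4) {
            row[x]     = static_cast<std::uint8_t>(255 - row[x]);      // Y1
            row[x + 2] = static_cast<std::uint8_t>(255 - row[x + 2]);  // Y2
            // row[x + 1] (U) and row[x + 3] (V) are not modified
        }
    }
}
```

This plain `255 - Y` ignores the nominal 16..235 luma range (the 0xEB used for white in the SimpleSample code is 235), so treat it only as a demonstration of which bytes to touch.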
In YV12 it's even more complicated because the chroma resolution is 50% in both dimensions (height and width).
I recommend you start with the RGB24 or RGB32 mode. And remember the frame is stored upside down.
If you want to manipulate the frame, start with something like an invert filter.
You read the value for red and invert it: R=255-r. Do the same for G and B, then store the new values to the dst frame.
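A minimal sketch of such an invert filter on RGB32 data, assuming plain byte buffers rather than the AviSynth frame classes (the function name and parameters are invented for this example; `width` is in pixels, the pitches are in bytes):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: invert R, G and B of an RGB32 frame, copy alpha unchanged.
inline void invertRgb32(const std::uint8_t* src, int srcPitch,
                        std::uint8_t* dst, int dstPitch,
                        int width, int height) {
    assert(width >= 0 && height >= 0);
    for (int y = 0; y < height; ++y) {
        const std::uint8_t* s = src + y * srcPitch;  // start of source row
        std::uint8_t* d = dst + y * dstPitch;        // start of destination row
        for (int x = 0; x < width; ++x) {
            d[4 * x + 0] = static_cast<std::uint8_t>(255 - s[4 * x + 0]);  // blue
            d[4 * x + 1] = static_cast<std::uint8_t>(255 - s[4 * x + 1]);  // green
            d[4 * x + 2] = static_cast<std::uint8_t>(255 - s[4 * x + 2]);  // red
            d[4 * x + 3] = s[4 * x + 3];                                   // alpha
        }
    }
}
```

Because every pixel is treated the same way, the upside-down row order does not matter for this particular filter.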
Have fun writing filters!
hanfrunz
Dark Alchemist
5th January 2006, 21:30
Yes, I did go to rgb32 before I went to bed and was just as lost. All I managed to do was make green shaded people or green would flash.
Now I know to invert all I need do is ^255 but how can I manipulate the image?
Lets take this as an example of something I would like to do using audio samples as an example.
AS = Audio Sample MA = Mixed Audio
MA1 = AS1 * 0.5 + AS2 * .05
How do you do something like that in RGB32 for a frame?
Wilbert
5th January 2006, 21:48
Adjust the code above to
for (h=0; h < src_height; h++) { // Loop from bottom to top line (opposite of YUV colourspace).
for (w = 0; w < src_width; w+=3) { // and from leftmost pixel to rightmost one.
dstp[w] = srcp[w]; // copy blue
dstp[w+1] = srcp[w+1]; // copy green
dstp[w+2] = 255-srcp[w+2]; // invert red
}
srcp = srcp + src_pitch; // Add the pitch to the source pointer (go to next line).
dstp = dstp + dst_pitch; // Add the pitch to the destination pointer (go to next line).
}
and replace vi.IsYUY2() by vi.IsRGB24().
Thus for h=0 (bottom line) you adjust dstp[2], dstp[5], .... h=1 corresponds to dstp[dst_pitch+2], dstp[dst_pitch+5], .... Etc ...
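The index pattern described here can be written as a small offset function. This is only a sketch; `rgb24Offset` is a name invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: byte offset of channel c (0 = blue, 1 = green, 2 = red)
// of the pixel in column x on line h, for an RGB24 buffer with the given pitch.
inline std::size_t rgb24Offset(int h, int x, int c, int pitch) {
    assert(h >= 0 && x >= 0 && c >= 0 && c < 3 && pitch >= 0);
    return static_cast<std::size_t>(h) * static_cast<std::size_t>(pitch)
         + static_cast<std::size_t>(3 * x + c);
}
```

With a pitch of 100, the red bytes of the first two pixels on line 0 are at offsets 2 and 5, and on line 1 at 102 and 105.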
Wilbert
5th January 2006, 22:00
That made everything blue
That's because you set the blue pixels to zero. The other pixels
*(dstp + w + j) = *(srcp + w + j); // j=1,2,3
are not copied. I guess they become zero, but you can't rely on this.
Change your code to
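for (h=0; h < src_height; h++) {
for (w = 0; w < src_width; w+=4) {
*(dstp + w) = 0; // blue set to zero
*(dstp + w + 1) = *(srcp + w + 1); // copy green
*(dstp + w + 2) = *(srcp + w + 2); // copy red
*(dstp + w + 3) = *(srcp + w + 3); // copy alpha
}
srcp = srcp + src_pitch; // go to next source line
dstp = dstp + dst_pitch; // go to next destination line
}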
edit: could you stop deleting your posts :)
hanfrunz
5th January 2006, 22:00
@Dark Alchemist:
mmh you need some knowledge of programming of course. If you want to "mix" pixels like you wrote above write something that does:
1. read pixel 1 (rgb)
2. read pixel 2 (rgb)
3. calculate the mix: r=r1*0.5+r2*0.5; g=g1*0.5+g2*0.5; b=b1*0.5+b2*0.5;
4. write the calculated values to dst
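The four steps above can be sketched for one row of RGB32 pixels as follows. This is an illustration under assumptions, not code from any plugin: `mixChannel` and `mixRowRgb32` are invented names, both rows are assumed to have the same width, and the weights are passed in so that an equal mix is `0.5f, 0.5f`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: weighted mix of two channel values, clamped to 0..255.
inline std::uint8_t mixChannel(std::uint8_t a, std::uint8_t b, float wa, float wb) {
    float v = a * wa + b * wb + 0.5f;  // +0.5 so the cast below rounds to nearest
    return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, v)));
}

// Mix two RGB32 rows of `width` pixels into dst; alpha is taken from row a.
inline void mixRowRgb32(const std::uint8_t* a, const std::uint8_t* b,
                        std::uint8_t* dst, int width, float wa, float wb) {
    assert(width >= 0);
    for (int x = 0; x < width; ++x) {
        for (int c = 0; c < 3; ++c)  // blue, green, red
            dst[4 * x + c] = mixChannel(a[4 * x + c], b[4 * x + c], wa, wb);
        dst[4 * x + 3] = a[4 * x + 3];
    }
}
```

The clamp matters as soon as the weights add up to more than 1, because the sum can then exceed 255.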
I think you have to learn the basics first, before you start coding a complicated filter. Read something about digital video, filter theory, and all that stuff; get the book "digital television and hdtv" for example.
hanfrunz
Dark Alchemist
5th January 2006, 22:21
Yes, I can program but I am trying to wrap my head around the video aspect.
I already see rgb32 is much easier to work with so that will help but do you have something in mind for video filter theory? The simpler the better in my case so maybe a web site.
@wilbert: Habit cause as soon as I posted it I saw what I had done wrong and fixed it (forgot to add the +1,+2,+3).
Richard Berg
6th January 2006, 03:55
@wilbert: Habit cause as soon as I posted it I saw what I had done wrong and fixed it (forgot to add the +1,+2,+3).
So edit the post, don't delete it.
Dark Alchemist
6th January 2006, 04:45
What is wrong with this 3x3 convolve routine?
float kernel[3][3] = { {1,0,1}, {0,0,0}, {-1,-2,-1} };
for (int h=0; h < src_height; h++) { // Loop from bottom to top line (opposite of YUV colourspace).
for (int w = 0; w < src_width; w+=4) { // and from leftmost pixel to rightmost one.
const unsigned char *line = src->GetReadPtr();
float rSum = 0, gSum = 0, bSum = 0, kSum = 0;
for (int i = 0; i < 2; i++)
{
for (int j = 0; j < 2; j++)
{
float fKernel = kernel[i][j];
bSum += (float)(line[j*4]);
gSum += (float)(line[j*4+1]);
rSum += (float)(line[j*4+2]);
//add the kernel value to the kernel sum
kSum += fKernel;
}
line = line + src_pitch;
}
//if kernel sum is less than 0, reset to 1 to avoid divide by zero
if (kSum <= 0)
kSum = 1;
//divide each channel by kernel sum
rSum/=kSum;
gSum/=kSum;
bSum/=kSum;
//prevent channel overflow by clamping to 0..255
if (rSum > 255)
rSum = 255;
else if (rSum < 0)
rSum = 0;
if (gSum > 255)
gSum = 255;
else if (gSum < 0)
gSum = 0;
if (bSum > 255)
bSum = 255;
else if (bSum < 0)
bSum = 0;
dstp[w] = bSum;
dstp[w+1] = gSum;
dstp[w+2] = rSum;
}
srcp = srcp + src_pitch; // Add the pitch to the source pointer (go to next line).
dstp = dstp + dst_pitch; // Add the pitch to the destination pointer (go to next line).
}
}
// As we now are finished processing the image, we return the destination image.
return dst;
}
All I get is black, and I have been banging my head on it for 4 or 5 hours now without making any headway.
Bidoche
6th January 2006, 13:06
Flaws :
- You calculate ksum for each pixel when it's always the same result
- You don't weight pixels with your kernel, what's the point of having one then ?
- ksum = -2 in your code, incorrectly handled by if (kSum <= 0) kSum = 1;
- and FINALLY : for each output line, you use the same 3 input lines with that : const unsigned char *line = src->GetReadPtr(), no wonder all output lines are identical...
Dark Alchemist
6th January 2006, 13:40
I will look at this when I get home but the first two points of yours I don't get.
ksum could be anything depending on what kernel I throw at it and what do you mean by weighing the pixels?
Oh, as far as being the same they are just black. I should get some lit pixels but I don't.
Wilbert
6th January 2006, 14:18
bidoche means that for every pixel you determine ksum, but ksum should be independent of it. Similar for the read pointer. So start with
int kernel[3][3] = { {1,0,1}, {0,0,0}, {-1,-2,-1} };
int fKernel;
const unsigned char *line = src->GetReadPtr();
int rSum = 0, gSum = 0, bSum = 0, kSum = 0;
for (int i = 0; i < 2; i++) {
for (int j = 0; j < 2; j++) {
fKernel = kernel[i][j];
kSum += fKernel;
}
}
if (kSum == 0)
kSum = 1;
for (int h=0; h < src_height; h++) {
for (int w = 0; w < src_width; w+=4) {
...
Dark Alchemist
6th January 2006, 15:35
If you notice I am only using a 2 x 2 out of the 3 x 3 kernel but that is only because I am still testing and last night I was even down to a 1x1. heh.
So, will the fact that in yours ksum is now an integer versus my float change anything? I ask because some kernels are in float (0.707, etc...).
Wilbert
6th January 2006, 16:13
So, will the fact that in yours ksum is now an integer versus my float change anything?
No. To speed things up people try to use integer calculations, but that's not important at this stage.
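For kernels with fractional weights such as 0.707, one common way to stay with integer arithmetic is fixed point: scale each weight by 256 once, accumulate in an int, and scale back at the end. A minimal sketch of that idea follows; the names `toFixed` and `weightedSumFixed` and the choice of 8 fractional bits are assumptions for this example.

```cpp
#include <cassert>
#include <cmath>

const int kFixedShift = 8;               // weights are stored as round(w * 256)
const int kFixedOne = 1 << kFixedShift;  // 256 represents 1.0

// Convert a floating-point weight to fixed point (done once, outside the loops).
inline int toFixed(float w) {
    return static_cast<int>(std::lround(w * kFixedOne));
}

// Weighted sum of n samples with fixed-point weights, rounded to the nearest integer.
inline int weightedSumFixed(const unsigned char* samples, const int* weights, int n) {
    assert(n >= 0);
    int acc = 0;
    for (int i = 0; i < n; ++i)
        acc += samples[i] * weights[i];
    const int half = kFixedOne / 2;
    // Round symmetrically so negative sums (edge-detect kernels) behave like positive ones.
    return acc >= 0 ? (acc + half) / kFixedOne : -((-acc + half) / kFixedOne);
}
```

The result is only as precise as the 8 fractional bits allow, which is usually enough for 8-bit pixels.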
Dark Alchemist
6th January 2006, 16:17
Is there any way I can send text out to the frame? I am working blind without it, as I always use printf on my variables/constants to see what is happening, and without something similar I am going nowhere. I mostly get access violation errors, so it will help if I can see what is happening under the hood. :)
Bidoche
6th January 2006, 17:12
helper to sum kernels (modulo syntax errors)
template <typename T, int N>
T sumKernel(T const (&kernel)[N][N])
{
T result = 0;
for (int i = 0; i < N; ++i)
for (int j = 0; j < N; ++j)
result += kernel[i][j];
return result;
}
helper for saturate values (same conditions)
template <typename T, T min, T max, typename U> T saturate(U value)
{
return value < U(min) ? min
: value > U(max) ? max
: T(value);
}
Then you can use dstp[w] = saturate<BYTE, 0, 255>(bSum / kSum);
Nice, no ?
Dark Alchemist
6th January 2006, 17:20
How do you handle the window to the last pixel? Seems my routine is fubar so I rewrote and the one thing I noticed was that the access violation errors I am getting is because it is trying to read outside of the square so it can handle every last pixel.
So, let's say I have 1 line of pixels: my window will read 1 line + N (in your above example), so 1 line of pixels (width) + 3 (for a 3 x 3 kernel) pixels (or 12 bytes for an RGB32 formatted frame).
Do I simply forget about those last pixels and crop them out (bad) or is there some way to make a slightly wider (and taller) frame filled with zeros?
float rSum = 0, gSum = 0, bSum = 0, kSum = 0;
int fKernel = 0;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
fKernel = kernel[i][j];
kSum += fKernel;
}
}
if (kSum <= 0)
kSum = 1;
const unsigned char* line = src->GetReadPtr();
for (int h=0; h < src_height; h++) { // Loop from bottom to top line (opposite of YUV colourspace).
for (int w = 0; w < src_width; w+=4) // and from leftmost pixel to rightmost one.
{
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
bSum += line[(j * 4) + w]; //causes an access violation error
}
line += src_pitch;
}
bSum/=kSum;
if (bSum > 255)
bSum = 255;
else if (bSum < 0)
bSum = 0;
dstp[w] = bSum;
}
srcp = srcp + src_pitch; // Add the pitch to the source pointer (go to next line).
line = srcp;
dstp = dstp + dst_pitch; // Add the pitch to the destination pointer (go to next line).
}
Bidoche
6th January 2006, 17:36
You need to somehow handle the edges :
You can ignore them, ie you only pass the kernel where it has the space to.
But then the frame will shrink.
You can have special cases for the edges and use cropped kernels there.
You can interpolate the frame on edges and then use method 1 on the extended frame.
That's one memory violation cause you will encounter, but the real one here is that line += src_pitch; is executed 3 times per pixel; it needs to be compensated.
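One way to get the effect of an extended frame without allocating one is to clamp the coordinates, which repeats the nearest edge pixel. The sketch below also computes each row pointer from the frame base instead of advancing a shared pointer, so there is nothing to compensate afterwards. It is an illustration only; `clampInt` and `sum3x3` are invented names and the buffer is assumed to be RGB32.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

inline int clampInt(int v, int lo, int hi) {
    return std::max(lo, std::min(v, hi));
}

// Hypothetical helper: unweighted sum of one channel over the 3x3 neighbourhood
// centred on pixel (x, y). Coordinates outside the frame are clamped to the
// nearest valid pixel, so no read goes out of bounds.
inline int sum3x3(const std::uint8_t* frame, int pitch, int width, int height,
                  int x, int y, int channel) {
    assert(width > 0 && height > 0);
    int sum = 0;
    for (int dy = -1; dy <= 1; ++dy) {
        // Row pointer derived from the base every time; nothing accumulates.
        const std::uint8_t* row = frame + clampInt(y + dy, 0, height - 1) * pitch;
        for (int dx = -1; dx <= 1; ++dx)
            sum += row[clampInt(x + dx, 0, width - 1) * 4 + channel];
    }
    return sum;
}
```

The clamping costs a few comparisons per sample; splitting the frame into an interior loop and separate edge loops avoids that at the price of more code.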
Dark Alchemist
6th January 2006, 17:49
line += src_pitch;
That line is supposed to be after the pixels when it must drop down to the next line to pick up the next 3 pixel.
See this is entirely too hard to put onto paper if you ask me.
I know EXACTLY how this works but when I lay it out on the paper to write I fail miserably.
So, I suck in implementation because when I look at the code I don't see what you are seeing. All it is supposed to do is go 3 pixels right then drop to the next line. Then it is on the next line and it goes 3 pixels to the right. One more drop down to the next line and it goes over 3 pixels.
Now after doing the above line will equal the 4th line down and we have our rgb sums.
whoa, you have damn good eyes there :) because I was typing this out and noticed I forgot to make the line back to the original line we are on getting ready for the next pixel pass through all of this.
const unsigned char* line = src->GetReadPtr();
for (int h=0; h < src_height; h++) { // Loop from bottom to top line (opposite of YUV colourspace).
for (int w = 0; w < src_width; w+=4) // and from leftmost pixel to rightmost one.
{
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
bSum += line[(j * 4) + w];
}
line += src_pitch;
}
bSum/=kSum;
if (bSum > 255)
bSum = 255;
else if (bSum < 0)
bSum = 0;
dstp[w] = bSum;
line = srcp;
}
srcp = srcp + src_pitch; // Add the pitch to the source pointer (go to next line).
line = srcp;
dstp = dstp + dst_pitch; // Add the pitch to the destination pointer (go to next line).
}
is how it reads now. :) I still have my access violation though. So, I tried
for (int h=0; h < src_height-3; h++) // Loop from bottom to top line (opposite of YUV colourspace).
{
for (int w = 0; w < src_width-12; w+=4) // and from leftmost pixel to rightmost one.
{
and the crashes went away as I figured. Of course it still doesn't do anything but turn the screen white but at least the crashes are gone.
float rSum = 0, gSum = 0, bSum = 0, kSum = 0;
int fKernel = 0;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
fKernel = kernel[i][j];
kSum += fKernel;
}
}
if (kSum <= 0)
kSum = 1;
const unsigned char* line = src->GetReadPtr();
for (int h=0; h < src_height-3; h++) // Loop from bottom to top line (opposite of YUV colourspace).
{
for (int w = 0; w < src_width-12; w+=4) // and from leftmost pixel to rightmost one.
{
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
bSum += line[(j * 4) + w];
gSum += line[(j * 4) + w + 1];
rSum += line[(j * 4) + w + 2];
}
line += src_pitch;
}
bSum/=kSum;
gSum/=kSum;
rSum/=kSum;
dstp[w] = summinmax(bSum, 0, 255);
dstp[w+1] = summinmax(gSum, 0, 255);
dstp[w+2] = summinmax(rSum, 0, 255);
line = srcp;
}
srcp = srcp + src_pitch; // Add the pitch to the source pointer (go to next line).
line = srcp;
dstp = dstp + dst_pitch; // Add the pitch to the destination pointer (go to next line).
}
}
// As we now are finished processing the image, we return the destination image.
return dst;
}
Latest code that turns everything white except the first 3 lines (because rgb32 is upside down) and the last 3 pixels.
I still wonder if there is a way to get my stuff to print to me. Without it I am severely hindered.
Bidoche
6th January 2006, 19:07
Do yourself a favor, use a helper class : (simplified from 3.0 window_ptr)
struct helper //TODO: find better name
{
BYTE * ptr_;
int pitch_;
helper(BYTE * ptr, int pitch) : ptr_( ptr ), pitch_( pitch ) { }
BYTE& operator[](int x) { return ptr_[x]; }
BYTE& operator()(int x, int y) { return *(ptr_ + x + y * pitch_); }
void next() { ptr_ += pitch_; }
};
then use it like this:
helper srcLine(srcp, src_pitch); // srcp must be a non-const BYTE* for this helper
helper dstLine(dstp, dst_pitch);
for (int h = 0; h < src_height; ++h, srcLine.next(), dstLine.next())
for(int w = 0; w < src_width; w += 4)
{
//...
for(int i = 0; i < 3; ++i)
for(int j = 0; j < 3; ++j)
bSum += srcLine(w + 4*j, i);
dstLine[w] = saturate<BYTE, 0, 255>(bSum/kSum);
}
Dark Alchemist
6th January 2006, 19:45
Well, I am lost in all of this so I think this is too much for my knowledge, and lack of.
I did finally get the thing to work but my kernel never acted how it was supposed to.
I have a program that will test my kernels (written in java) and what it does and what my program does is completely different meaning my routine is borked. I just can't figure out wth is wrong (most kernels give me pure white and 1,1,1,1,1,1,1,1,1 gives me a slight blur effect).
Not much more that I can do to figure this out on my side since we are now approaching new ground for me.
Thanks to both of you for helping me. :) There is one thing I just learnt from all of this "never try to write a filter again."
Thanks again.
Bidoche
6th January 2006, 21:06
Well, you started by trying to code a convolution...
It's not the 1st pick generally.
And as I said initially you don't weigh your pixels by your kernel.
It's like you always use the { {1 1 1 } {1 1 1} {1 1 1} } kernel.
I am not surprised it's the only one that appears correct coz then kSum is correct.
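To make the point about weighting concrete, here is a small sketch that applies two kernels to the same 3x3 neighbourhood. The function names are made up for the example, and what to divide by when the kernel sum is zero or negative is left open, as it is in the thread.

```cpp
// Sum of the kernel weights. It depends only on the kernel, so it can be
// computed once, outside the pixel loops.
inline int kernelSum(const int (&k)[3][3]) {
    int sum = 0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            sum += k[i][j];
    return sum;
}

// Weighted sum of a 3x3 neighbourhood n: every sample is multiplied by the
// kernel weight at the same position before it is added.
inline int weightedSum(const int (&n)[3][3], const int (&k)[3][3]) {
    int sum = 0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            sum += n[i][j] * k[i][j];
    return sum;
}
```

For a neighbourhood with rows {10,10,10}, {10,10,10}, {50,50,50}, the all-ones kernel gives 210 with a kernel sum of 9, while { {1,0,1}, {0,0,0}, {-1,-2,-1} } gives -180 with a kernel sum of -2. Leaving out the multiplication yields 210 for every kernel, which is why only the all-ones kernel appeared to work.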
Dark Alchemist
6th January 2006, 21:10
Well, I just did it. I am so damn happy right now I could burst.
I just did an edge detect kernel and it worked. Woohoo. Just tried a gaussian blur kernel and it worked too. OMG. woohoo.
What I did was go to WordPad so I could see all of my code, and I found I had left out my kernel in one of my builds. But that was not entirely it: what was giving me the all white (ffffff) was the fact that I forgot to clear out my sums.
float rSum = 0, gSum = 0, bSum = 0, kSum = 0, fKernel = 0;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
fKernel = kernel[i][j];
kSum += fKernel;
}
}
if (kSum <= 0)
kSum = 1;
const unsigned char* line = src->GetReadPtr();
for (int h=0; h < src_height-3; h++) // Loop from bottom to top line (opposite of YUV colourspace).
{
for (int w = 0; w < src_width-12; w+=4) // and from leftmost pixel to rightmost one.
{
rSum = 0;
gSum = 0;
bSum = 0;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
fKernel = kernel[i][j];
bSum += (line[(j * 4) + w] * fKernel);
gSum += (line[(j * 4) + w + 1] * fKernel);
rSum += (line[(j * 4) + w + 2] * fKernel);
}
line += src_pitch;
}
bSum/=kSum;
gSum/=kSum;
rSum/=kSum;
dstp[w] = summinmax(bSum, 0, 255);
dstp[w+1] = summinmax(gSum, 0, 255);
dstp[w+2] = summinmax(rSum, 0, 255);
line = srcp;
}
srcp = srcp + src_pitch; // Add the pitch to the source pointer (go to next line).
line = srcp;
dstp = dstp + dst_pitch; // Add the pitch to the destination pointer (go to next line).
}
Of course I still have those untouched lines and untouched pixels that are the size of my kernel (in this simple case 3 lines and 3 pixels).
That is a tad slow even after I changed it all to integers. hmmmm.
Bidoche
6th January 2006, 23:06
I don't remember anyone accusing general convolutions of being too fast ;p
Dark Alchemist
7th January 2006, 00:45
=P true.
Oh, can you help with regard to those lines and pixels issues? Since I am effectively sliding a grid (3x3 for this) over a large window (the video frame), when I get to the side (or down at the bottom) I will have an overlap, *but* that overlap will cause an access violation.
I have been reading via google but I have yet to see anything written to handle the over lap. Best I saw was where they basically crop or discard and that is nasty for a video.
tsp
7th January 2006, 16:05
with your code you don't center the kernel on the pixel you write to meaning that if you used a 3x3 kernel like this
0 0 0
0 1 0
0 0 0
the image would be shifted.
Also you still use float values, which slows the filter down. For the border pixels you can mirror or just repeat the edge pixel.
I modified your code to look something like this:
int kernel[3][3] = { {1,0,1}, {0,0,0}, {-1,-2,-1} };
int rSum = 0, gSum = 0, bSum = 0, kSum = 0, iKernel = 0;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
iKernel = kernel[i][j];
kSum += iKernel;
}
}
if (kSum <= 0)
kSum = 1;
const unsigned char* line = src->GetReadPtr();
//Calc first line
//edge pixel
rSum = 0;
gSum = 0;
bSum = 0;
int w = 0;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
iKernel = kernel[i][j];
if(j==0)
{
bSum += (line[(j * 4) + w] * iKernel);
gSum += (line[(j * 4) + w + 1] * iKernel);
rSum += (line[(j * 4) + w + 2] * iKernel);
}
else
{
bSum += (line[(j * 4-4) + w] * iKernel);
gSum += (line[(j * 4-4) + w + 1] * iKernel);
rSum += (line[(j * 4-4) + w + 2] * iKernel);
}
}
line += src_pitch;
}
bSum/=kSum;
gSum/=kSum;
rSum/=kSum;
dstp[0] = summinmax(bSum, 0, 255);
dstp[1] = summinmax(gSum, 0, 255);
dstp[2] = summinmax(rSum, 0, 255);
line = srcp;
//center pixels
for (w = 4; w < src_width-4; w+=4) // and from leftmost pixel to rightmost one.
{
rSum = 0;
gSum = 0;
bSum = 0;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
iKernel = kernel[i][j];
bSum += (line[(j * 4-4) + w] * iKernel);
gSum += (line[(j * 4-4) + w + 1] * iKernel);
rSum += (line[(j * 4-4) + w + 2] * iKernel);
}
if(i==1)//repeats the first line
line += src_pitch;
}
bSum/=kSum;
gSum/=kSum;
rSum/=kSum;
dstp[w] = summinmax(bSum, 0, 255);
dstp[w+1] = summinmax(gSum, 0, 255);
dstp[w+2] = summinmax(rSum, 0, 255);
line = srcp;
}
//TODO: add right edge pixel
dstp = dstp + dst_pitch; // Add the pitch to the destination pointer (go to next line).
//center lines
for (int h=1; h < src_height-1; h++) // Loop from bottom to top line (opposite of YUV colourspace).
{
//edge pixel
rSum = 0;
gSum = 0;
bSum = 0;
int w = 0;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
iKernel = kernel[i][j];
if(j==0)
{
bSum += (line[(j * 4) + w] * iKernel);
gSum += (line[(j * 4) + w + 1] * iKernel);
rSum += (line[(j * 4) + w + 2] * iKernel);
}
else
{
bSum += (line[(j * 4-4) + w] * iKernel);
gSum += (line[(j * 4-4) + w + 1] * iKernel);
rSum += (line[(j * 4-4) + w + 2] * iKernel);
}
}
line += src_pitch;
}
bSum/=kSum;
gSum/=kSum;
rSum/=kSum;
dstp[0] = summinmax(bSum, 0, 255);
dstp[1] = summinmax(gSum, 0, 255);
dstp[2] = summinmax(rSum, 0, 255);
line = srcp;
//center pixels
for (w = 4; w < src_width-4; w+=4) // and from leftmost pixel to rightmost one.
{
rSum = 0;
gSum = 0;
bSum = 0;
for (int i = 0; i < 3; i++)
{
for (int j = 0; j < 3; j++)
{
iKernel = kernel[i][j];
bSum += (line[(j * 4-4) + w] * iKernel);
gSum += (line[(j * 4-4) + w + 1] * iKernel);
rSum += (line[(j * 4-4) + w + 2] * iKernel);
}
line += src_pitch;
}
bSum/=kSum;
gSum/=kSum;
rSum/=kSum;
dstp[w] = summinmax(bSum, 0, 255);
dstp[w+1] = summinmax(gSum, 0, 255);
dstp[w+2] = summinmax(rSum, 0, 255);
line = srcp;
}
//TODO: add right edge pixel
srcp = srcp + src_pitch; // Add the pitch to the source pointer (go to next line).
line = srcp;
dstp = dstp + dst_pitch; // Add the pitch to the destination pointer (go to next line).
}
//TODO: add last(edge) line
It's not tested and might/will contain bugs, but I hope it makes sense anyway.
Dark Alchemist
7th January 2006, 16:29
Yes I knew about it being shifted by 1 pixel but your way I don't understand.
if(j==0)
{
bSum += (line[(j * 4) + w] * iKernel);
gSum += (line[(j * 4) + w + 1] * iKernel);
rSum += (line[(j * 4) + w + 2] * iKernel);
}
else
{
bSum += (line[(j * 4-4) + w] * iKernel);
gSum += (line[(j * 4-4) + w + 1] * iKernel);
rSum += (line[(j * 4-4) + w + 2] * iKernel);
}
Looks to me like you are taking pixel 0 line 0 like I do but when we reach pixel 1 you are redoing pixel 0 again.
Basically I do not understand all that redundancy. Wouldn't it just be easier to have your frame be slightly bigger than the image but the frame would be zero? Seems this code is getting overly complicated now for some reason (and slower too I bet).
tsp
7th January 2006, 16:36
Exactly. That is because I can't read line -1, right? So to avoid that I just take line 0 one more time. (It would be faster just to multiply by 2, but this illustrates better how it works.) You asked how to handle the edges, and this is how you can do it.
Dark Alchemist
7th January 2006, 16:51
What if we created an artifical border of black (000) would that serve the same purpose (meaning would that work) or no?
tsp
7th January 2006, 17:04
you can do that but depending on the kernelsize and coefficients the destination border will be darker (it can look artificial).
Dark Alchemist
7th January 2006, 17:10
Well, what I was thinking was if I created the border and did the filter then removed the border what would happen?
Probably would be bad but I was just thinking it would be a way of removing redundancy.
tsp
7th January 2006, 17:27
You want to copy the src to a new frame and add a black border to avoid the extra code to handle the edge?
That would be slower because you would have to copy the entire frame before filtering. Also, because you use information from the black border, it will influence the edge of the final image. Try a simple 1D kernel (1 0 0) that shifts the frame to the right: this would result in a black left edge if you use a black border, or a copy of the leftmost pixel if you just mirror the edge. The last approach is usually less distinct.
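The (1 0 0) experiment is easy to try on a single row of numbers. The sketch below is only an illustration: `Border` and `filterRow` are names invented for it, and "repeat edge" stands in for the mirror/repeat handling discussed here.

```cpp
#include <vector>

enum class Border { Black, RepeatEdge };

// Hypothetical 1-D example: out[x] = k[0]*in[x-1] + k[1]*in[x] + k[2]*in[x+1].
// Samples outside the row are either 0 (black border) or a copy of the
// nearest edge sample.
inline std::vector<int> filterRow(const std::vector<int>& in,
                                  const int (&k)[3], Border border) {
    const int n = static_cast<int>(in.size());
    std::vector<int> out(in.size());
    for (int x = 0; x < n; ++x) {
        int acc = 0;
        for (int j = -1; j <= 1; ++j) {
            const int xi = x + j;
            int sample;
            if (xi >= 0 && xi < n)
                sample = in[xi];
            else if (border == Border::Black)
                sample = 0;
            else
                sample = in[xi < 0 ? 0 : n - 1];
            acc += k[j + 1] * sample;
        }
        out[x] = acc;
    }
    return out;
}
```

For the row 50 60 70 and the kernel (1 0 0), the black border gives 0 50 60 and the repeated edge gives 50 50 60, so only the first variant introduces a dark pixel at the left edge.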
Dark Alchemist
7th January 2006, 17:30
Ahhh. :(
I tried your code and it too shifted the image (I have 1 pixel on both sides of the image).
Since you know about all of this and I don't I am stuck. Maybe the best way is to simply cut the offending pixels out of the final image (slow I know but no way around it that I can see using the above code snippets).
http://img510.imageshack.us/img510/292/1pix0ip.jpg
Not a movie picture so not as easy to see but the 1 pixel on both sides is there (same as my code). Weird stuff.
tsp
7th January 2006, 17:48
try post a before and a after image with the simple kernel:
0 0 0
0 1 0
0 0 0
It makes it easier to see how the image is shifted.
Also my code snippet is not complete. The code for the right edge and the top line is still missing.
Dark Alchemist
7th January 2006, 17:55
You mean before with no filter then an after with the above kernel?
tsp
7th January 2006, 18:01
yes exactly.
Dark Alchemist
7th January 2006, 18:05
Without the filter (http://img366.imageshack.us/img366/7919/withoutfilter0bf.jpg)
With filter (http://img366.imageshack.us/img366/5493/withfilter4jy.jpg)
Looks fine to me now but when the video moves you can see 1 pixel on both sides. Oh, well I can always just crop.
tsp
7th January 2006, 19:16
It doesn't look like it is shifted to me. Try adding the missing code (for the right edge and the last line) and see if it helps.
Dark Alchemist
7th January 2006, 19:19
Yes, I noticed that the image did not appear to shift at all so I wonder why? What was that kernel for as it looked like it did nothing (the more I look at it the more I think it was meant to do nothing just a basic copy).
tsp
8th January 2006, 04:58
Don't worry. It should do nothing. If it changes the image, it is an error. It's just a simple test. Just look at the kernel: the current (center) pixel is multiplied by 1 and all the other pixels are ignored.
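That identity-kernel check can be turned into a small self-test. The following is a sketch of a centred 3x3 convolution on an RGB32 buffer, not the code posted in this thread: the function name, the edge-repeat border handling, and the rule "divide by the kernel sum unless it is zero" are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

inline int clampTo(int v, int lo, int hi) {
    return std::max(lo, std::min(v, hi));
}

// Centred 3x3 convolution of the B, G and R channels; alpha is copied.
// Reads outside the frame are redirected to the nearest edge pixel.
inline void convolve3x3Rgb32(const std::uint8_t* src, int srcPitch,
                             std::uint8_t* dst, int dstPitch,
                             int width, int height, const int (&kernel)[3][3]) {
    assert(width > 0 && height > 0);
    int kSum = 0;  // computed once, it does not depend on the pixel
    for (const auto& row : kernel)
        for (int w : row)
            kSum += w;
    if (kSum == 0)
        kSum = 1;  // assumption: leave the result unscaled for zero-sum kernels

    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            int acc[3] = {0, 0, 0};  // sums are reset for every output pixel
            for (int i = 0; i < 3; ++i) {
                const int sy = clampTo(y + i - 1, 0, height - 1);
                for (int j = 0; j < 3; ++j) {
                    const int sx = clampTo(x + j - 1, 0, width - 1);
                    const std::uint8_t* p = src + sy * srcPitch + sx * 4;
                    for (int c = 0; c < 3; ++c)
                        acc[c] += p[c] * kernel[i][j];  // weight by the kernel
                }
            }
            std::uint8_t* d = dst + y * dstPitch + x * 4;
            for (int c = 0; c < 3; ++c)
                d[c] = static_cast<std::uint8_t>(clampTo(acc[c] / kSum, 0, 255));
            d[3] = src[y * srcPitch + x * 4 + 3];
        }
    }
}
```

Running it with the kernel 0 0 0 / 0 1 0 / 0 0 0 should reproduce the input byte for byte; any difference points to a shift or an indexing bug.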