Tuesday, September 2, 2008

A16- Color Image Segmentation

According to Wikipedia, segmentation refers to the process of partitioning a digital image into multiple regions. The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images.

Note that for any colored image, each pixel's color is a sum of the contributions of red, green and blue. So if a pixel has an intensity value I, then it can be written as:

I = R + G + B.

Alternatively, we can also normalize these color values so that

r = R/I,
g = G/I,
b = B/I,

where r + g + b = 1. For example, a mostly red pixel with (R, G, B) = (200, 50, 50) has I = 300 and normalized values (r, g, b) ≈ (0.67, 0.17, 0.17).

In this activity, I applied the concept of color segmentation to two images. There are two ways to do this, namely parametric and non-parametric segmentation.

For parametric segmentation, I took a patch from the region of interest (ROI). I then solved for p(r) and p(g), the probability density functions of the normalized red and green values respectively. Assuming that the distribution of the pixels is Gaussian, we can write the PDF as:

p(r) = exp(-(r - mu_r)^2 / (2*sigma_r^2)) / (sigma_r*sqrt(2*pi))

and similarly for p(g), where sigma and mu are the standard deviation and mean of the corresponding colors in the patch. The joint PDF is then p(r)*p(g). I applied this resulting PDF to the image shown in Figure 1, and the result is Figure 3.

For non-parametric segmentation, we obtained a two-dimensional histogram (displayed as a 3D plot) whose x- and y-axes are the normalized red and green values of the ROI. The code used is written below. This histogram is then back-projected onto the image to segment it: each pixel is replaced by the histogram value of its chromaticity.

Figure 1. A blue ball serves as the test image

Figure 2. The ROI of the Image

Figure 3. Result of Parametric Segmentation

Figure 4. Result of Non-Parametric Segmentation

Figure 5. 3D Histogram and Chromaticity Plane

I used the following codes:

im = imread("ball.jpg");
im1 = imread("ball3.jpg");
//Extract the RGB channels of the image (im) and of the ROI patch (im1)
R = im(:,:,1);
G = im(:,:,2);
B = im(:,:,3);
R1 = im1(:,:,1);
G1 = im1(:,:,2);
B1 = im1(:,:,3);
I = R + G + B;
I1 = R1 + G1 + B1;
I(I==0) = %eps; I1(I1==0) = %eps; // added safeguard: avoid division by zero on black pixels
r = R./I;
g = G./I;
b = B./I;
r1 = R1./I1;
g1 = G1./I1;
b1 = B1./I1;

// For Parametric

// Computing the Gaussian PDF from the ROI statistics
mured = mean(r1);
mugreen = mean(g1);
sigmared = st_deviation(r1);
sigmagreen = st_deviation(g1);
pr = exp(-((r - mured).^2)/(2*sigmared^2))/(sigmared*sqrt(2*%pi));
pg = exp(-((g - mugreen).^2)/(2*sigmagreen^2))/(sigmagreen*sqrt(2*%pi));
prg = pr.*pg; // joint PDF, assuming independent r and g
imshow(prg);

// For Non-Parametric

// Computing the 2D histogram of the ROI chromaticities
binsize = 32;
RED1 = r1*(binsize-1); // scale to 0..binsize-1 so the +1 below stays within bounds
GREEN1 = g1*(binsize-1);
PDF = zeros(binsize,binsize); // define a matrix for the PDF
[m1,n1] = size(RED1);
for i = 1:m1
  for j = 1:n1
    x1 = round(RED1(i,j))+1; // x gives the red bin
    y1 = round(GREEN1(i,j))+1; // y gives the green bin
    PDF(x1,y1) = PDF(x1,y1)+1;
  end
end
plot3d(1:binsize,1:binsize,PDF);
// To write a new image based on the histogram (backprojection)
RED = r*(binsize-1);
GREEN = g*(binsize-1);
[m,n] = size(RED);
Image = zeros(size(im,1),size(im,2));
for k = 1:m
  for l = 1:n
    x = round(RED(k,l))+1;
    y = round(GREEN(k,l))+1;
    Image(k,l) = PDF(x,y); // each pixel takes the histogram value of its chromaticity
  end
end
xbasc();
imshow(Image);

Since I found that non-parametric segmentation gives better results than parametric segmentation, I applied it to the second image shown below.

Figure 6. Image 2: Bug

Figure 7. ROI for Figure 6

Figure 8. 3D Histogram for Figure 6

Figure 9. Result for non-parametric segmentation

I rate myself 10/10 for my effort in this activity.

References:
Picture from:
http://charmed-charmedimsure.blogspot.com/2007/12/blue-ball.html
http://en.wikipedia.org/wiki/Segmentation_(image_processing)

Thursday, August 28, 2008

A15 - Color Camera Processing

In photography and image processing, color balance is the global adjustment of the intensities of the colors (typically red, green, and blue). An important goal of this adjustment is to render specific colors – particularly neutral colors – correctly; hence, the general method is sometimes called gray balance, neutral balance, or white balance. Color balance changes the overall mixture of colors in an image and is used for color correction; generalized versions of color balance are used to get colors other than neutrals to also appear correct or pleasing.

Digital cameras come with white balance (WB) settings. Here are some of the white balance settings available in digital cameras:

Daylight - used when taking photos outdoors
Incandescent - used when taking photos under an incandescent bulb
Fluorescent - used when taking photos under a fluorescent bulb
Sunset - used when taking photos during sunset
Cloudy - used when taking photos on cloudy days

When you set the right WB setting on your camera, you can remove unrealistic color casts, so that objects which appear white in person are rendered white in your photo.

In this activity, I captured images under the wrong WB settings and tried to correct them using Scilab.
The first method, the White Patch algorithm, takes the RGB value of a known white part of the image and divides the entire image by this RGB value. The second, the Gray World algorithm, assumes the scene averages out to gray, so each channel is instead divided by its mean over the whole image.
I used the following codes to treat the images:

stacksize(20000000);
im = imread("leaves.jpg");
im1 = imread("leaves1.jpg");
//Reference white
Rw = mean(im1(:,:,1));
Gw = mean(im1(:,:,2));
Bw = mean(im1(:,:,3));
//Getting the RGB values of the image
R = im(:,:,1);
G = im(:,:,2);
B = im(:,:,3);
//Treating the image
new = zeros(im);
new(:,:,1) = R./Rw;
new(:,:,2) = G./Gw;
new(:,:,3) = B./Bw;
//Writing the new image
maxa = max(max(max(new)));
imnew = new./maxa;
imwrite(imnew,"newleaf.jpg");
//Gray World
Rwg = mean(im(:,:,1));
Gwg = mean(im(:,:,2));
Bwg = mean(im(:,:,3));
newg = zeros(im);
newg(:,:,1) = im(:,:,1)./Rwg;
newg(:,:,2) = im(:,:,2)./Gwg;
newg(:,:,3) = im(:,:,3)./Bwg;
maxb = max(max(max(newg)));
imnewg = newg./maxb;
imwrite(imnewg,"newgleaf.jpg");

And here are the results:
Figure 1. Photo taken under the incandescent setting; it should have been taken under the fluorescent setting.
Figure 2. Treated Image of Figure 1 (White Balanced)
Figure 3. Treated Image of Figure 1 (Gray Balanced)

The other treated image appears below.

Figure 4. Photo of leaves taken under broad daylight. (L-R) The setting used was incandescent. White balanced image of the leaves. Gray balanced image of the leaves.

Figure 5. Photo of objects taken under broad daylight. (L-R) The setting used was incandescent. White balanced image of the objects. Gray balanced image of the objects.

References:
http://en.wikipedia.org/wiki/White_balance
http://www.cambridgeincolour.com/tutorials/white-balance.htm

Thursday, August 7, 2008

A13 - Photometric Stereo

Consider an object illuminated by a light source. We assume that the intensity captured by the camera at point (x,y) is given by I(x,y). The goal of this activity is to approximate the shape of the surface of the object given these intensities. Multiple images of the surface, taken with the source at different locations v, give enough information to recover the shape. We start with the following expression:

I = vg

where I is the matrix of intensities, v is the matrix of source locations, and g is to be computed. Since I and v are given, g follows from the least-squares solution:

g = inv(v'v) v' I

What we need is the normalized version of g, which is just:

n = g/|g|

We then let z = f(x,y) be the equation of the surface. The normalized g is the surface normal, and from it the partial derivatives of f follow:

df/dx = -nx/nz

df/dy = -ny/nz

where (nx, ny, nz) are the components of n.

Finally, integrating the right-hand sides of the two equations above (done numerically with cumulative sums in the code) gives an approximation of f(x,y).

The codes are given below:

//We first load the gathered intensity of the object which is captured by the camera
loadmatfile('Photos.mat',['I1','I2','I3', 'I4']);
//Defining the matrices to be used
v = [0.085832 0.17365 0.98106; 0.085832 -0.17365 0.98106; 0.17365 0 0.98481; 0.16318 -0.34202 0.92542];
Im1 = I1(:);
Im2 = I2(:);
Im3 = I3(:);
Im4 = I4(:);
I = [Im1';Im2';Im3';Im4'];
g = inv(v'*v)*v'*I;
//To get the norm of g
for i = 1:size(g,2)
  n1(i) = sqrt(g(1,i)^2 + g(2,i)^2 + g(3,i)^2);
end;
n1 = n1 + 1e-9; // add a small epsilon once, outside the loop, to avoid division by zero
n(1,:) = g(1,:)./n1';
n(2,:) = g(2,:)./n1';
n(3,:) = g(3,:)./n1';
//Differentials
dfx = -n(1,:)./(n(3,:) + 1e-9);
dfy = -n(2,:)./(n(3,:) + 1e-9);
//Resizing
Imx = matrix(dfx,128,128);
Imy = matrix(dfy,128,128);
//Integration
intx = cumsum(Imx,2);
inty = cumsum(Imy,1);
Image = intx + inty;
mesh(Image);

The result is:

April and Rica helped me in this activity.

I rate myself 10/10 for my effort in this activity.

Monday, August 4, 2008

A11 - Camera Calibration

As the title suggests, a camera calibration process was used to relate an object's real-world coordinates to its coordinates in the captured image. A checkerboard was chosen as the object because its boxes can serve as grid lines of a 3-dimensional coordinate system. The object's image is shown in Figure 1. Note that each box has a dimension of 1”x 1” in the real world.

One must also note that we have a 3-dimensional object, so every object point can be labelled (x,y,z). On the other hand, any point in the captured image is only 2-dimensional. Linear algebra can be used to perform the transformation (x,y,z) --> (yp,zp).


Figure 1. Checker board

1. The first step is to choose the primary axes for both the object and the image. Figure 2 shows the x-, y-, and z-axes for the object.


Figure 2. Object's axes

For the image, the default image axes in Scilab are shown below.


Figure 3: Image's axes

In the Camera Calibration hand-out, it was shown that if (x,y,z) is a point in the real world and (yp,zp) is its corresponding point in the image, then we have the relations shown in Figure 4 and written out below.


Figure 4. Object to Image relation
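Written out explicitly (matching the ordering of the coefficients a(1) through a(11) in the code further down), the relations are:

yp = (a1*x + a2*y + a3*z + a4) / (a9*x + a10*y + a11*z + 1)
zp = (a5*x + a6*y + a7*z + a8) / (a9*x + a10*y + a11*z + 1)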


2. What I need are 23 point correspondences between object and image coordinates. Each point is substituted into the relations above, after which we can solve for the coefficient vector a.

Locating 23 object points is easy. Every box is 1 inch long and wide, so all I have to do is count boxes. The chosen points on the object are shown in Figure 5. The origin is labelled “1”.


Figure 5. Object points with labels


Figure 6. Object coordinates (inches)


3. What is left is to locate the corresponding image coordinates. We can load the labelled image shown in Figure 5 in Scilab, and the locate function gives us the pixel coordinates of the 23 points.


//Locate 23 arbitrary points in the checkerboard image
stacksize(20000000);
im = imread("check.jpg");
imshow(im); o = locate(1); //to locate the origin
xbasc;
imshow(im);
w = locate(23); //to locate 23 arbitrary points in the image
xbasc;


Using this, I obtained the following:


Figure 7. Image coordinates (pixels)


4. Since I want to perform the operation on a series of points, I listed these points in Scilab.

//Image Coordinates in y and z
yp = w(1,:) - o(1,1); // locate returns a 2x23 matrix, so each row can be taken at once
zp = w(2,:) - o(2,1);
//Corresponding object coordinates
x = [0 0 0 0 0 0 0 0 0 0 0 0 3 3 5 3 2 4 7 6 8 0 8];
y = [2 4 6 2 6 2 4 2 2 0 8 8 0 0 0 0 0 0 0 0 0 0 0];
z = [1 3 5 5 7 7 9 9 11 12 12 0 1 3 5 6 9 10 8 11 12 7 0];

5. I can now define the matrix to be used.

for i = 1:length(x)
  // rows 2i-1 and 2i hold the yp and zp equations for point i
  Q((2*i)-1,:) = [x(i) y(i) z(i) 1 0 0 0 0 -(yp(i)*x(i)) -(yp(i)*y(i)) -(yp(i)*z(i))];
  Q(2*i,:) = [0 0 0 0 x(i) y(i) z(i) 1 -(zp(i)*x(i)) -(zp(i)*y(i)) -(zp(i)*z(i))];
  d((2*i)-1) = yp(i);
  d(2*i) = zp(i);
end;

6. Solving for a:

a = inv(Q'*Q)*Q'*d;
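As an aside, Scilab's backslash operator solves the same least-squares problem and is generally better conditioned than explicitly forming inv(Q'*Q):

a = Q\d; // equivalent least-squares solution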


I then obtained the following values:
Figure 8. “a” values

7. To check whether the process above is accurate, we pick new object and image coordinate pairs. We substitute the object coordinates into the relations above and see how much the results deviate from the points located on the image.



Figure 9. New test coordinates (red)

//Accuracy test
xtest = [6 6 4 0 0 0];
ytest = [0 0 0 5 5 4];
ztest = [6 9 7 10 7 5];
for j = 1:6
  yptest(j) = ((a(1)*xtest(j))+(a(2)*ytest(j))+(a(3)*ztest(j))+a(4))/((a(9)*xtest(j))+(a(10)*ytest(j))+(a(11)*ztest(j))+1);
  zptest(j) = ((a(5)*xtest(j))+(a(6)*ytest(j))+(a(7)*ztest(j))+a(8))/((a(9)*xtest(j))+(a(10)*ytest(j))+(a(11)*ztest(j))+1);
end;
yzptest = [yptest,zptest];
imtest = imread("check2.jpg");
imshow(imtest);
wtest = locate(6);
dev = yzptest - wtest'; // deviation of the predicted points from the located points
devy = dev(:,1);
devz = dev(:,2);
wy = wtest(1,:);
wz = wtest(2,:);
for i = 1:6
  pdevy(i) = devy(i)*100/wy(i); // percent deviation in y
  pdevz(i) = devz(i)*100/wz(i); // percent deviation in z
end;


And for each of these points the deviation is given by:

% deviation in y:
2.1570407
0.9817405
0.5051955
-0.2694823
-0.1936041
-0.0723059

% deviation in z:
0.0671348
-0.2211321
0.1453209
-0.2033003
-0.5786263
-0.0462943


My largest deviation is 2.1570407%, which I attribute to the small graphics window and the locate function in Scilab. I therefore conclude that the process was successful.

I rate myself 9/10 for this activity, because although I put a lot of effort to finish this, I was not able to meet the deadline.


Acknowledgement:

Ed helped me in this activity. Ralph helped me in publishing this blog since I've been having internet problems this week.

Tuesday, July 22, 2008

A10 – Preprocessing Handwritten Text

I will try to enhance the handwriting in the scanned image that appears below.

Figure 1. Scanned Image

I will work on the part of this image shown in Figure 2.


Figure 2. Handwriting

To obtain the handwriting, I need to eliminate the lines present. First, I obtained the Fourier transform of the image; from this I can design a filter that removes the horizontal lines in the image.

im1 = imread("image.jpg");
f = im2gray(im1);
ft1 = fft2(f); // takes the ft of the image
ft2 = log(fftshift(abs(ft1))); // for smaller magnitude of ft
imshow(ft2, []);

Below is the Fourier transform of the image and the filter I used.


Figure 3. FT of handwriting

Figure 4. Filter

im1 = imread('image.jpg');
im2 = imread('filter.jpg');
fprint = im2gray(im1);
filter = im2gray(im2);
ft1 = fft2(fprint); // take the FT of the image
FP = log(fftshift(abs(ft1))); // for display only
F = fftshift(filter); // shift the filter to match the unshifted FT
FPF = ft1.*F; // multiply in the frequency domain (equivalent to convolution in space)
final = ifft(FPF);
last = real(final);
imshow(last, []);


I then obtained a treated image.

Figure 5. Filtered image

The next step is to convert this image to a binary image.

im1 = imread("last.jpg");
im2 = im2gray(im1);
im3 = abs(1-im2bw(im2, 0.56));
imshow(im3, []);

Figure 6. Binarized Image

To be able to read the image better, each letter in the handwriting should be one pixel thick. The dilation and erosion operators can do the trick, but the result depends on the choice of structure element, which can either enhance or destroy the letters. For this activity, I used a rectangular structure element with dimensions of 1x3.

im1 = imread("last.jpg");
im2 = im2gray(im1);
se1 = imread("structure.jpg");
se2 = im2gray(se1);
im3 = abs(1-im2bw(im2, 0.56));
se3 = im2bw(se2,0.5);
imwrite(im3, "last1.jpg");
e = erode(im3,se3,[2,2]);
imshow(e,[]);


The resulting image is:

Figure 7. Enhanced Image

Finally, I labeled each letter.

[L,n] = bwlabel(e); //labels each component
imshow(L+1, rand(n+1,3));

Separating the handwriting from the background was hard because the region of interest overlaps with the background. But in my opinion, Figures 6 and 7 are still readable and the letters can be separated; hence the attempt to enhance the image was successful.

I rate myself 10/10 for this activity because I was able to do it fast and without the help of my classmates.

Thursday, July 17, 2008

A9 - Binary Operations

For this activity, I have a grayscale image of scattered circles. The goal is to obtain the most accurate measure of the area of the circles present in the image.



Since the image is too large to deal with at once, I first divided it into 256 x 256 subimages. For each of these subimages, I used the algorithm below.


//Part A. Preparing the binary images to be used
im1 = imread('image.jpg'); // loads one of the subimages
se1 = imread('se.jpg'); // loads the structure element
im2 = im2gray(im1);
se2 = im2gray(se1);
im3 = im2bw(im2, 0.847); // 0.847 is the threshold value for this subimage
se3 = im2bw(se2, 0.5);
//Part B. Image cleaning
d = dilate(im3,se3,[2,2]);
e = erode(d,se3,[2,2]);
//Part C. Area calculation
[L,n] = bwlabel(e); //labels each connected component
for j = 1:n
  f = find(L==j);
  reg_size = size(f,'*');
  if reg_size < 400 | reg_size > 600 // remove overlapping/clumped blobs
    L(f) = 0;
  end
end
imwrite(L, 'newimage.jpg');
for i = 1:n
  v = find(L==i);
  Area = size(v,'*') // gives the area in terms of pixel count
end;

Part A. Preparing Binary Images

The image can either be truecolor or grayscale. The area we wish to obtain is in terms of the number of pixels, so it is easier if we convert each subimage to binary. In doing this, we need to know the threshold value. Although one average threshold value could be used for all the subimages, I opted to obtain a threshold value for each subimage. This ensures that no blob blends into the background, and a good threshold value also removes unwanted spots in the image.
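As a minimal sketch of how a per-subimage threshold can be picked (the histplot call and file name here are my own illustration, not part of the original procedure), one can inspect the grayscale histogram and place the threshold in the valley between the background peak and the blob peak:

im2 = im2gray(imread('image.jpg'));
histplot(256, im2(:)); // grayscale histogram; read the threshold off the valley
im3 = im2bw(im2, 0.847); // threshold chosen from the histogram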

Part B. Image Cleaning.

Aside from choosing a good threshold value, the dilation and erosion operations can be applied. Dilation can be used to fill up small holes inside the region of interest (ROI), while erosion can be used to remove unwanted spots and blots. Applied in sequence, dilation followed by erosion is the morphological closing operation, which smoothens out the boundaries of the ROI.

Part C. Area Calculation

In some parts of the image, cluttered circles overlap each other. I chose to remove these clumps so that they do not interfere with data gathering. In this case, blobs with sizes less than 400 or greater than 600 pixels are removed. Labeling is also important, since we need to output the size of each circle and not of the entire ROI. After labeling, we can simply output the area, which is just the pixel count of each circle. The data are shown below.


The numbers 1 to 12 are the assigned names of the 12 subimages. Using MS Excel, we can plot frequency vs. area as shown below.

The data above show that there are 28 circles with an area of 555.28 pixels. To verify this result, I obtained the diameters of the circles in the untreated image.

diameter = 27 (± 1) pixels

Area of a circle = pi*d^2/4 = pi*(27^2)/4 = 572.55 pixels (± 7.41%)

Hence the error is (572.55 - 555.28)/572.55 ≈ 3.0%. Thus we can conclude that the process shown above is an effective method of area computation.

Ed and Rica helped me in this activity.

I rate myself 10/10 for my effort in this activity.

Tuesday, July 15, 2008

A8 - Morphological Operations

This activity demonstrates how morphological operations, particularly dilation and erosion, can change the shape and structure of a binary image.

Let A be our image and B a structuring element. The dilation of A by B gives all points z which are translations of a reflected B such that this translated B, intersected with A, is not the empty set. The effect of dilation is to expand or elongate A in the shape of B.

On the other hand, the erosion of A by B results in all points z such that B translated by z is contained in A. The effect of erosion is to reduce the image by the shape of B.
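As a quick sanity check of these definitions, here is a minimal sketch using the same SIP dilate and erode calls as the code further down (the 5x5 test image is my own toy example, not one of the shapes in this activity): a single ON pixel dilated by a 3x3 block grows into a 3x3 block, and eroding that block with the same structure element shrinks it back to a single pixel.

A = zeros(5,5); A(3,3) = 1; // toy image: a single ON pixel
B = ones(3,3); // 3x3 structure element, origin at its center
d = dilate(A, B, [2,2]); // the point expands into a 3x3 block
e = erode(d, B, [2,2]); // the block shrinks back to the single pixel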

To illustrate the definitions given above, consider the following sample images:


Figure 1. Square (50x50 pixels)

Figure 2. Triangle (base = 50 pixels, height = 30 pixels)

Figure 3. Circle (R = 25 pixels)

Figure 4. Hollow Square (60x60 pixels, 4 pixels thick)

Figure 5. Plus Sign (50 pixels long, 8 pixels thick)


And the structure elements that will be used are shown below:



Figure 6. Structure elements


Before doing the said operations in Scilab, we first predicted the outputs of the operations by hand.


We then performed the said operations in Scilab. For the dilation operation, the code is below:

A = imread('shape.jpg');
B = imread('structureelement.jpg');
im = im2bw(A, 0.5); // converts the shape to a binary image
se = im2bw(B, 0.5); // converts the structure element to binary
imse = dilate(im,se,[1,1]); //dilation operation
imshow(imse,[]);

The [1,1] written above is the chosen position of the origin for all the structure elements except the plus sign, which has its origin at (3,3).

The results are summarized below. Note that the resulting images are also binary; the colors were added so that one can easily see the differences among them. Also, below each dilated image is its corresponding structure element.

Figure 7. Dilated images of the square.

Figure 8. Dilated images of the triangle


Figure 9. Dilated images of the circle



Figure 10. Dilated images of the hollow square



Figure 11. Dilated images of the plus sign

For the erosion operation:

A = imread('shape.jpg');
B = imread('structureelement.jpg');
im = im2bw(A, 0.5); // converts the shape to a binary image
se = im2bw(B, 0.5); // converts the structure element to binary
imse = erode(im,se,[1,1]); //erosion operation
imshow(imse,[]);

And the results are also summarized below:


Figure 12. Eroded images of the square.


Figure 13. Eroded images of the triangle.



Figure 14. Eroded images of the circle.

Figure 15. Eroded images of the hollow square.


Figure 16. Eroded images of the plus sign.

I rate myself 10/10 for this activity because it took me half my day to do the predicted images and the other half to check my results in Scilab. Also because I had a hard time convincing other people (and myself) that some of my predicted images are correct. ;)

I want to thank April, Beth and Rica for helping me in this activity.

Reference:

A8 - Morphological Operations hand-out