Aiyin's Page

Thursday, August 28, 2008

A15 - Color Camera Processing

In photography and image processing, color balance is the global adjustment of the intensities of the colors (typically red, green, and blue). An important goal of this adjustment is to render specific colors – particularly neutral colors – correctly; hence, the general method is sometimes called gray balance, neutral balance, or white balance. Color balance changes the overall mixture of colors in an image and is used for color correction; generalized versions of color balance are used to get colors other than neutrals to also appear correct or pleasing.

Digital cameras come with white balance (WB) settings. Here are the some white balance settings available in digital cameras:

Daylight - used when taking photos outdoors
Incandescent - used when taking photos under a incandescent bulb
Flourescent - used when taking photos under a flourescent bulb
Sunset - used when taking photos during sunset
Cloudy - used when taking photos during cloudy days

When you set the right WB settings in your camera, then you can remove unrealistic color casts, so that objects which appear white in person are rendered white in your photo.

In this activity, I captured images under the wrong WB settings and try to correct it using scilab.
The process includes taking the RGB value of the white part of the image and divide the entire image with this RGB value.
I used the following codes to treat the images:

stacksize(20000000);
im = imread("leaves.jpg");
im1 = imread("leaves1.jpg");
//Reference white
Rw = mean(im1(:,:,1));
Gw = mean(im1(:,:,2));
Bw = mean(im1(:,:,3));
//Getting the RGB values of the image
R = im(:,:,1);
G = im(:,:,2);
B = im(:,:,3);
//Treating the image
new = zeros(im);
new(:,:,1) = R./Rw;
new(:,:,2) = G./Gw;
new(:,:,3) = B./Bw;
//Writing the new image
maxa = max(max(max(new)));
imnew = new./maxa;
imwrite(imnew,"newleaf.jpg");
//Gray World
Rwg = mean(im(:,:,1));
Gwg = mean(im(:,:,2));
Bwg = mean(im(:,:,3));
newg = zeros(im);
newg(:,:,1) = im(:,:,1)./Rwg;
newg(:,:,2) = im(:,:,2)./Gwg;
newg(:,:,3) = im(:,:,3)./Bwg;
maxb = max(max(max(newg)));
imnewg = newg./maxb;
imwrite(imnewg,"newgleaf.jpg");

And here are the results:

Figure 1. Photo taken under incandescent setting but should be taken under flourescent setting.

Figure 2. Treated Image of Figure 1 (White Balanced)

Figure 3. Treated Image of Figure 1 (Gray Balanced)

The other treated image appears below.

Figure 4. Photo of leaves taken under broad daylight.(L-R) The setting used was incandescent. White balanced image of the leaves. Gray balanced image of the leaves.

Figure 5. Photo of objects taken under broad daylight.(L-R) The setting used was incandescent. White balanced image of the leaves. Gray balanced image of the leaves.

References:
en.wikipedia.org/wiki/White_balance
www.cambridgeincolour.com/tutorials/white-balance.htm

Thursday, August 7, 2008

A13 - Photometric Stereo

Consider an object illuminated by a source. We assume that the intensity captured by the camera at point (x,y) is given by I(x,y). The goal of this activity is to approximate the shape of the surface of the object given the intensity. Multiple images of the surfaces with the sources at different locations, which we denote as v(x,y), will give information about the shape of the surface. We start with the following expression:

I = vg

where I is the intensity, v is the source location, and g is to be computed. I and v are given so that we can express g as:

g = (inverse(v*v))v*I

What we need is the norm of g which is just given by:

n = g/abs(g)

We then let z = f(x,y) be the equation of the surface. If we equate the gradient of the surface to the norm of g, we get the following expression:

df/dx = -nx/nz

df/dy = -ny/nz

where (nx, ny, nz) is the norm of g.

Finally, integrating the RHS of the two equations above, we get an approximate of f(x,y)

The codes are given below:

//We first load the gathered intensity of the object which is captured by the camera
loadmatfile('Photos.mat',['I1','I2','I3', 'I4']);
//Defining the matrices to be used
v = [0.085832 0.17365 0.98106; 0.085832 -0.17365 0.98106; 0.17365 0 0.98481; 0.16318 -0.34202 0.92542];
Im1 = I1(:);
Im2 = I2(:);
Im3 = I3(:);
Im4 = I4(:);
I = [Im1';Im2';Im3';Im4'];
g = inv(v'*v)*v'*I;
//To get the norm of g
for i=1:size(g,2);
n1(i)=sqrt((g(1,i)**2)+(g(2,i)**2)+(g(3,i)**2));
n1 = n1 + 0.000000001;
end;
n(1,:) = g(1,:)./n1';
n(2,:) = g(2,:)./n1';
n(3,:) = g(3,:)./n1';
//Differentials
dfx= -n(1,:) ./(n(3,:)+0.000000001);
dfy= -n(2,:)./(n(3,:)+0.000000001);
//Resizing
Imx = matrix(dfx,128,128);
Imy = matrix(dfy,128,128);
//Integration
intx = cumsum(Imx,2);
inty = cumsum(Imy,1);
Image = intx + inty;
mesh(Image);

The result is:

April and Rica helped me in this activity.

I rate myself 10/10 for my effort in this activity.

Monday, August 4, 2008

A11 - Camera Calibration

As the title suggests, a camera calibration process was used to relate the object’s real world coordinate to its captured image's coordinate. A checker board was chosen to be the object because its boxes can serve as grid lines of a 3 dimensional system. The object’s image is shown in figure 1. Note that each box has a dimension of 1”x 1” in the real world.

One must also note that we have a 3 dimensional object and every object point can be labelled as (x,y,z). On the other hand, any point in the captured image is only 2 dimensional. Linear algebra can be used so that we can perform the following transformation: (x,y,z) --> (y,z).

Figure 1. Checker board

1. The first step is to choose the primary axes both for the object and the image. Figure 2 shows the x-,y-, and z-axis for the object.

Figure 2. Object's axes

For the image, the default image axes in scilab are shown below

Figure 3: Image's axes

In the Camera Calibration hand out, it was shown that if (x,y,z) is a certain point in the real world and (y,z) is its corresponding point in the image, then we have the following relations:

Figure 4. Object to Image relation

2. What I need are 23 points for both the object and the image coordinates. Each point is to be substituted to the relation above. After which we can obtain A.

Locating 23 object points is easy. Every box is 1 inch long/wide, so all I have to do is count the boxes. The chosen points in the object are shown in Figure 4. The origin is labelled “1”.

Figure 4. Object points with labels

Figure 5. Object coodinates (inches)

3. What was left to do is to locate the corresponding image coordinates. We can load the image shown in Figure 5 in scilab and this will give us the pixel coordinates of the 23 points.

//Locate 23 points arbitrary points in the checkerboard image
stacksize(20000000);
im = imread("check.jpg");
imshow(im); o = locate(1); //to locate the origin
xbasc;
imshow(im);
w = locate(23); //to locate 23 arbitrary points in the image
xbasc;

Using this, I obtained the the following:

Figure 6. Image coordinates (pixels)

4. Since I want to perform the operation in a series of points, I listed these points in scilab.

//Image Coordinates in y and z
yp = [w(1,1) w(1,2) w(1,3) w(1,4) w(1,5) w(1,6) w(1,7) w(1,8) w(1,9) w(1,10) w(1,11) w(1,12) w(1,13) w(1,14) w(1,15) w(1,16) w(1,17) w(1,18) w(1,19) w(1,20) w(1,21) w(1,22) w(1,23)] - o(1,1);
zp = [w(2,1) w(2,2) w(2,3) w(2,4) w(2,5) w(2,6) w(2,7) w(2,8) w(2,9) w(2,10) w(2,11) w(2,12) w(2,13) w(2,14) w(2,15) w(2,16) w(2,17) w(2,18) w(2,19) w(2,20) w(2,21) w(2,22) w(2,23)] - o(2,1);
//Corresponding object coordinates
x = [0 0 0 0 0 0 0 0 0 0 0 0 3 3 5 3 2 4 7 6 8 0 8];
y = [2 4 6 2 6 2 4 2 2 0 8 8 0 0 0 0 0 0 0 0 0 0 0];
z = [1 3 5 5 7 7 9 9 11 12 12 0 1 3 5 6 9 10 8 11 12 7 0];

5. I can now define the matrix to be used.

for i = 1:length(x);

Q((2*i)+1,:) = [x(i) y(i) z(i) 1 0 0 0 0 -(yp(i)*x(i)) -(yp(i)*y(i)) -(yp(i)*z(i))];

Q((2*i)+2,:) = [0 0 0 0 x(i) y(i) z(i) 1 -(zp(i)*x(i)) -(zp(i)*y(i)) -(zp(i)*z(i))];

d((2*i)+1) = yp(i);

d((2*i)+2) = zp(i);

end;

6. Solving for a:

a = inv(Q'*Q)*Q'*d;

I then obtained the following values:

Figure 7. “a” values

7. To check if the process above is accurate, we get new object and image coordinates. We substitute the object coordinates in one of our relation and see how much the result deviates from our located image.

Figure 8. New test coordinates (red)

//Accuracy test

xtest = [6 6 4 0 0 0];

ytest = [0 0 0 5 5 4];

ztest = [6 9 7 10 7 5];

for j=1:6;

yptest(j) = ((a(1)*xtest(j))+(a(2)*ytest(j))+(a(3)*ztest(j))+a(4))/((a(9)*xtest(j))+(a(10)*ytest(j))+(a(11)*ztest(j))+1);

zptest(j) = ((a(5)*xtest(j))+(a(6)*ytest(j))+(a(7)*ztest(j))+a(8))/((a(9)*xtest(j))+(a(10)*ytest(j))+(a(11)*ztest(j))+1);

end;

yzptest = [yptest,zptest];

imtest = imread("check2.jpg");

imshow(imtest);

wtest = locate(6);

dev = (yzptest - wtest')

devy = dev(:,1);

devz = dev(:,2);

wy = wtest(1,:);

wz = wtest(2,:);

for i=1:6;

%devy(i) = devy(i).*100/wy(i)

%devz(i) = devz(i).*100/wz(i)

end;

And for each of these points the deviation is given by:

% deviation in y:

2.1570407

0.9817405

0.5051955

- 0.2694823

- 0.1936041

- 0.0723059

% deviation in z:

0.0671348

- 0.2211321

0.1453209

- 0.2033003

- 0.5786263

- 0.0462943

I have 2.1570407% as my largest deviation, which I account on the small graphics window and the locate function in scilab. Therefore I can conclude that the process is successful.

I rate myself 9/10 for this activity, because although I put a lot of effort to finish this, I was not able to meet the deadline.

Acknowledgement:

Ed helped me in this activity. Ralph helped me in publishing this blog since I've been having internet problems this week.

Tuesday, July 22, 2008

A10 – Preprocessing Handwritten Text

I will try to obtain an enhanced handwriting in the scanned image that appears below.

Figure 1. Scanned Image

I will try to work on a part of this image shown in Figure 2.

Figure 2. Handwriting

To obtain the handwriting, I need to eliminate the lines present. First, I obtained the Fourier transform of the image, from here I can design a filter that can remove the horizontal lines present in the image.

im1 = imread("image.jpg");
f = im2gray(im1);
ft1 = fft2(f); // takes the ft of the image
ft2 = log(fftshift(abs(ft1))); // for smaller magnitude of ft
imshow(ft2, []);

Below is the Fourier transform of the image and the filter I used.

Figure 3. FT of handwriting

Figure 4. Filter

im1 = imread('image.jpg');
im2 = imread('filter.jpg');
fprint = im2gray(im1);
filter = im2gray(im2);
ft1 = fft2(fprint); // takes the ft of the image
FP = log(fftshift(abs(ft1))); // for smaller magnitude of ft
F = fftshift(filter);
FPF = ft1.*F; // convolve the images
final = ifft(FPF);
last = real(final);
imshow(last, [ ]);

I then obtained a treated image.

Figure 5

The next step is to convert this image to a binary image.

im1 = imread("last.jpg");
im2 = im2gray(im1);
im3 = abs(1-im2bw(im2, 0.56));
imshow(im3, []);

Figure 6. Binarized Image

To be able to read the image better, each letter in the hand writing should be one pixel thick. Dilation and erosion operator can do the trick. But this depends on the choice of structure element, which can enhance or destroy the letters. For this activity, I used a rectangular structure element with a dimension of 1x3.

im1 = imread("last.jpg");
im2 = im2gray(im1);
se1 = imread("structure.jpg");
se2 = im2gray(se1);
im3 = abs(1-im2bw(im2, 0.56));
se3 = im2bw(se2,0.5);
imwrite(im3, "last1.jpg");
e = erode(im3,se3,[2,2]);
imshow(e,[]);

The resulting image is:

Figure 7. Enhanced Image

Finally, I labeled each letter.

[L,n] = bwlabel(e); //labels each component
imshow(L+1, rand(n+1,3));

Separating the handwriting from the background was hard because the region of interest overlaps with the background. But in my opinion, Figures 6 and 7 are still readable and the letters can be separated hence the attempt to enhance the image was successful.

I rate myself 10/10 for this activity because I was able to do fast and without the help pf my classmates.

Thursday, July 17, 2008

A9 - Binary Operations

For this activity, I have a grayscale image of scattered circles. The goal is to obtain the most accurate measure of an area of a circle present in the image.

Since the image is too large to deal with, I first divide it into 256 x 256 subimages. And for each of these subimages, I used the algorithm below.

//Part A. Preparing the binary images to be used
im1 = imread('image.jpg'); // loads one of the subimages

se1 = imread('se.jpg'); // loads the structure element

im2 = im2gray(im1);
se2 = im2gray(se1);
im3 = im2bw(im2, 0.847 ); // 0.817 is the threshold value
se3 = im2bw(se2, 0.5);
// Part B. Image cleaning
d = dilate(im3,se3,[2,2]);
e = erode(d,se3,[2,2]);
//Area Calculation
[L,n] = bwlabel(e); //labels each component
for j=1:n
f = find(L==j);
reg_size = size(f,'*');
if reg_size <> 600
L(f) = 0;
end
end
imwrite(L,newimage.jpg');
for i=1:n;
v = find(L==i);
Area = size(v,2) // Gives the area in terms of pixel number
end;

Part A. Preparing Binary Images

The image can either be a true color or a grayscale image. The area we wish to obtain is in terms of the number of pixels, hence it will be easier if we convert each subimages to binary. In doing this, we need to know the threshold value. Although an average threshold value can be used in all the subimages, I opted to obtain a threshold value for each subimage. This is to ensure that each blob will not blend with the background. Plus a good threshold value can remove unwanted spots in the image.

Part B. Image Cleaning.

Aside from choosing a good threshold value, dilation and erosion operations can be applied. Dilation can be used to fill up small holes inside the region of interest or ROI while erosion can be used to remove unwanted spots and blots. Together they can smoothen out the surfaces of the ROI.

Part C. Area Calculation

In some of the blobs, cluttered circles overlapping each other are present. I chose to remove these types circles so that it will not interfere with data gathering. In this case, blobs with sizes less than 400 and greater than 600 will be removed. Labeling is also important since we need to output the size of each circle and not the entire ROI. After labeling, we can simply output the area which is just the pixel size of each circle. The data is shown below.

The numbers 1 to 12 are the assigned names of the 12 subimages. Using MS Excel, we can plot the frequency vs area as shown below.

The data above shows that there are 28 circles having an area of 555. 28 pixels. To verify this result, I obtained the diameters of each of the circle in the untreated image.

diameter = 27 (+- 1) pix

Area of a circle = 572.55 pix (+- 7.41 %)

Hence the error is 3.01 %. Thus we can conclude that the process shown above is an effective method of area computation.

Ed and Rica helped me in this activity.

I rate myself 10/10 for my effort in this activity.

Tuesday, July 15, 2008

A8 - Morphological Operations

This activity demonstrates how morphological operations, particularly dilation and erosion, can change the shape and structure of a binary image.

Let A be our image and define B as a structuring element. The dilation operation gives all the z's which are translations of a reflected B that when intersected with A is not the empty set. The effect of a dilation is to expand or elongate A in the shape of B.

On the other hand, the erosion operation results to all points z such that B translated by z is contained in A. The effect of erosion is to reduce the image by the shape of B.

To illustrate the definitions given above, consider the following sample images:

Figure1. Square (50x50 pixels)

Figure 2. Triangle ( B = 50 pixels, H = 30 pixels)

Figure 3. Circle (R = 25 pixels)

Figure 4. Hollow Square (60x60 pixels , 4 pixels thick)

Figure 5. Plus Sign ( 50 pixels long, 8 pixels thick)

And the structure elements that will be used are below:

Figure 6. Structure elements

Before doing the said operations in scilab, we first predicted the output of the operations.

We then perform the said operations in scilab. For the dilation operation, the codes are below:

A = imread('shape.jpg');

B = imread('structureelement.jpg');

im = im2bw(A, 0.5); // converts the shape to a binary image

se = im2bw(B, 0.5); // converts the structure element to binary

imse = dilate(im,se,[1,1]); //dilation operation

imshow(imse,[]);

The [1,1] written above is the chosen position of the origin for the all the structure elements except the plus sign, which has its origin at (3,3).

The results are summarized below. Note that the resulting image is also binary. The colors were added so that one can easily see the differences among them. Also, below each dilated images are their corresponding structure elements.

Figure 7. Dilated images of the square.

Figure 8. Dilated images of the triangle

Figure 9. Dilated images of the circle

Figure 10. Dilated images of the hollow square

Figure 11. Dilated images of the plus sign

For the erosion operation:

A = imread('shape.jpg');

B = imread('structureelement.jpg');

im = im2bw(A, 0.5); // converts the shape to a binary image

se = im2bw(B, 0.5); // converts the structure element to binary

imse = erode(im,se,[1,1]); //dilation operation

imshow(imse,[]);

And the results are also summarized below:

Figure 12. Eroded images of the square.

Figure 13. Eroded images of the triangle.

Figure 14. Eroded images of the circle.

Figure 15. Eroded images of the hollow square.

Figure 16. Eroded images of the plus sign.

I rate myself 10/10 for this activity because it took me half my day to do the predicted images and the other half for checking my results in scilab. Also because I had a hard time convincing other people and myself that some of my predicted images are correct. ;)

I want to thank April, Beth and Rica for helping me in this activity,

Reference:

A8 - Morphological Operations hand-out

Thursday, July 10, 2008

A7-Enhncement in the Frequency Domain

Again in this activity, we made use of the Fourier Transform (FT) of an image. An image is treated as a superposition of sinusoids. And by applying FT on an image, we can obtain its spatial frequency. Every f(x,y) in an image has a corresponding F(k,l) in the Fourier space.

A. Anamorphic property of the Fourier Transform

For this part of the activity, we have a two dimensional signal and we take its FT. The steps are below:

//Creates a signal
nx = 100;
ny = 100;
x = linspace(-1,1,nx);
y = linspace(-1,1,ny);
[X,Y] = ndgrid(x,y);
f = 4 //frequency of the signal
z = sin(2*%pi*f*X);
imshow(z,[]);

And we obtain the figure below:

Figure 1: 2D signal with a f(frequency)=4

//Displays the FT of the signal
ft=fft2(z) // Takes the FT
imshow(fftshift(abs(z)), [ ] ); // Displays the magnitude of the signal

This will then display the figure below:

Figure 2: FT of signal with f=4

It can be observed in this figure that there are two dots lying on a vertical line. This two dots correspond to the frequency of the stripes in Figure 1. Midway between the dots is the DC value of the image. The DC value corresponds to the average frequency of the image. The dots are in a vertical line because the sinusoid's intensity vary vertically than horizontally.

The figure below shows the resulting FT for a signal with varying frequency.

Figure 3. Signals with f=2, 4, 10, 20

Figure 4. FT of signals with f=2, 4, 10, 20

It can be observed from Figure 4 that the distance between the stripes of the signal decreases as the frequency increases. Again, by taking the FT of the signal, we obtain its frequency, which is f=(1/distance between stripes), represented by the two dots. Since the frequency is inversely proportional to the distance of stripes, the separation between the two dots increases.

By choosing to rotate the sinusoid by an angle of 30 degrees:

nx = 100;
ny = 100;
x = linspace(-1,1,nx);
y = linspace(-1,1,ny);
[X,Y] = ndgrid(x,y);
f = 4;
theta = 30;
z = sin(2*%pi*f*(Y*sin(theta) + X*cos(theta)));
ft=fft2(z);
imshow(fftshift(abs(ft)),[]);

Figure 5: Theta = 30 , f=4

We obtained an image tilted by an angle of 120 degrees and its FT is also tilted by the same angle. Since the sinusoid's intensity vary most at this angle, then so does its FT frequency. Hence the FT shows us the frequency of the image and the direction where it varies most.

For the pattern below,

Figure 6: Patterned image

We used:

nx = 100;
ny = 100;
x = linspace(-1,1,nx);
y = linspace(-1,1,ny);
[X,Y] = ndgrid(x,y);
f = 4;
z = sin(2*%pi*4*X).*sin(2*%pi*4*Y);
ft = fft2(z);
imshow(fftshift(abs(ft)),[]);

Because the frequency of this image varies on the x and the y directions or diagonally, then so does its frequency.

B. Fingerprints

For this avtivity, we need to enhance the image of a fingerprint. The fingerprint that will be used is below.

Figure 8. Fingerprint

im1 = imread("fingerprint.jpg");
f = im2gray(im1);
ft1 = fft2(f); // takes the ft of the image
ft2 = log(fftshift(abs(ft1))); // for smaller magnitude of ft
imshow(ft2, []);

This will display the FT of figure 8. We took the logarithm because the intensity of the Fourier image is large that we need to rescale it. Otherwise the resulting FT image will be black.

Figure 9. FT of fingerprint

From the image of the FT, we can design a filter. The goal is to let the bright part of the image through the filter to enhance the ridges.

Figure 10. Filter

After the filter is designed, we can think of them as the aperture. We then convolve the FT of the image with this filter.

im1 = imread('fingerprint.jpg');
im2 = imread('filter.jpg');
fprint = im2gray(im1);
filter = im2gray(im2);
ft1 = fft2(fprint); // takes the ft of the image
FP = log(fftshift(abs(ft1))); // for smaller magnitude of ft
F = fftshift(filter);
FPF = ft1.*F; // convolve the images
final = ifft(FPF);
last = real(final);
imshow(last, [ ]);

The result is below:

The result is obtained via trial and error. Because the goal is to obtain all the necessary information about the image and at the same time remove the unnecessary ones. I also tried using a matrix pattern but since the background of the image is uneven, using matrices to enhance the ridges also enhances the lines present in the background.

C. Lunar Image

For this part of the activity, the goal is to remove the vertical lines in the figure below.

Using a code similar to part B,

stacksize(20000000);
im1 = imread('moon.jpg');
im2 = imread('filter1.jpg');
moon = im2gray(im1);
filter = im2gray(im2);
ft1 = fft2(moon); // takes the ft of the image
M = log(fftshift(abs(ft1))); // for smaller magnitude of ft
F = fftshift(filter);
MF = ft1.*F; // convolve the images
final = ifft(FPF);
last = real(final);
imshow(last, [ ]);

The FT of the image is,

And the filter used was,

Finally, we obtained the enhanced image.

I also used the filter below,

and the resulting image is:

which in my opinion is the same as the as the other enhanced image.

I rate myself 10/10 for this activity because it took me a lot of time to figure out what to do.

Thanks Beth for helping me in this activity.

Reference:

http://homepages.inf.ed.ac.uk/rbf/HIPR2/fourier.htm