Tuesday, July 22, 2008

A10 – Preprocessing Handwritten Text

I will try to extract and enhance the handwriting in the scanned image that appears below.

Figure 1. Scanned Image

I will try to work on a part of this image shown in Figure 2.


Figure 2. Handwriting

To recover the handwriting, I need to eliminate the ruled lines. First, I obtained the Fourier transform of the image; from it, I can design a filter that removes the horizontal lines present in the image.

im1 = imread("image.jpg");
f = im2gray(im1);
ft1 = fft2(f); // takes the FT of the image
ft2 = log(fftshift(abs(ft1))); // log rescales the FT magnitude for display
imshow(ft2, []);

Below is the Fourier transform of the image and the filter I used.


Figure 3. FT of handwriting

Figure 4. Filter

im1 = imread('image.jpg');
im2 = imread('filter.jpg');
fprint = im2gray(im1);
filter = im2gray(im2);
ft1 = fft2(fprint); // takes the FT of the image
FP = log(fftshift(abs(ft1))); // log rescales the FT magnitude for display
F = fftshift(filter);
FPF = ft1.*F; // multiply in frequency space (convolution in image space)
final = ifft(FPF); // inverse FT back to image space
last = real(final);
imshow(last, [ ]);


I then obtained a treated image.

Figure 5

The next step is to convert this image to a binary image.

im1 = imread("last.jpg");
im2 = im2gray(im1);
im3 = abs(1-im2bw(im2, 0.56));
imshow(im3, []);

Figure 6. Binarized Image

To make the image easier to read, each letter in the handwriting should be one pixel thick. The dilation and erosion operators can do the trick, but the result depends on the choice of structure element, which can either enhance or destroy the letters. For this activity, I used a rectangular structure element with dimensions of 1x3.

im1 = imread("last.jpg");
im2 = im2gray(im1);
se1 = imread("structure.jpg");
se2 = im2gray(se1);
im3 = abs(1-im2bw(im2, 0.56));
se3 = im2bw(se2,0.5);
imwrite(im3, "last1.jpg");
e = erode(im3,se3,[2,2]);
imshow(e,[]);


The resulting image is:

Figure 7. Enhanced Image

Finally, I labeled each letter.

[L,n] = bwlabel(e); //labels each component
imshow(L+1, rand(n+1,3));

Separating the handwriting from the background was hard because the region of interest overlaps with the background. Still, in my opinion, Figures 6 and 7 are readable and the letters can be separated, so the attempt to enhance the image was successful.

I rate myself 10/10 for this activity because I was able to do it fast and without the help of my classmates.

Thursday, July 17, 2008

A9 - Binary Operations

For this activity, I have a grayscale image of scattered circles. The goal is to obtain the most accurate measure of the area of a circle present in the image.



Since the image is too large to process at once, I first divided it into 256 x 256 subimages. For each of these subimages, I used the algorithm below.


//Part A. Preparing the binary images to be used
im1 = imread('image.jpg'); // loads one of the subimages
se1 = imread('se.jpg'); // loads the structure element
im2 = im2gray(im1);
se2 = im2gray(se1);
im3 = im2bw(im2, 0.847); // 0.847 is the threshold value for this subimage
se3 = im2bw(se2, 0.5);
// Part B. Image cleaning
d = dilate(im3,se3,[2,2]);
e = erode(d,se3,[2,2]);
//Area Calculation
[L,n] = bwlabel(e); //labels each component
for j=1:n
f = find(L==j);
reg_size = size(f,'*');
if reg_size < 400 | reg_size > 600 then // removes blobs outside the 400-600 size window
L(f) = 0;
end
end
imwrite(L, 'newimage.jpg');
for i=1:n;
v = find(L==i);
Area = size(v,2) // Gives the area in terms of pixel number
end;

Part A. Preparing Binary Images

The image can either be a true color or a grayscale image. The area we wish to obtain is in terms of the number of pixels, so it is easier to convert each subimage to binary. In doing this, we need to know the threshold value. Although an average threshold value could be used for all the subimages, I opted to obtain a threshold value for each subimage. This ensures that each blob does not blend with the background; in addition, a good threshold value can remove unwanted spots in the image.
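To make the per-subimage thresholding concrete, here is a small sketch in plain Python (not the Scilab code used in this activity) of one common way to pick a threshold automatically: the isodata rule, which iterates t = (mean below t + mean above t)/2 until it settles. The sample pixel values are made up for illustration.

```python
# Hypothetical helper (not the author's code): the isodata rule picks a
# threshold by iterating t = (mean below t + mean above t) / 2.
def isodata_threshold(pixels, t=0.5, tol=1e-4):
    """pixels: flat list of grayscale values in [0, 1]."""
    while True:
        lo = [p for p in pixels if p <= t]
        hi = [p for p in pixels if p > t]
        if not lo or not hi:          # degenerate split: keep current t
            return t
        new_t = (sum(lo) / len(lo) + sum(hi) / len(hi)) / 2
        if abs(new_t - t) < tol:
            return new_t
        t = new_t

# A made-up subimage: dark blobs (~0.1) on a bright background (~0.9)
sub = [0.1, 0.12, 0.08, 0.9, 0.88, 0.92, 0.91, 0.11]
t = isodata_threshold(sub)
binary = [1 if p <= t else 0 for p in sub]   # 1 = blob pixel
```

Run on each subimage, this gives a threshold adapted to that subimage's own blob and background levels.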

Part B. Image Cleaning.

Aside from choosing a good threshold value, dilation and erosion operations can be applied. Dilation can be used to fill up small holes inside the region of interest (ROI), while erosion can be used to remove unwanted spots and blots. Together they can smooth out the boundaries of the ROI.
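The dilate-then-erode sequence used here is the morphological closing operation. A small pure-Python sketch (not the SIP code above; the blob and the 3x3 structure element are invented for illustration) shows how closing fills a one-pixel hole inside a blob:

```python
# Binary dilation/erosion on a 2D grid of 0/1 values, with a structure
# element given as a list of (dy, dx) offsets. Illustrative sketch only.
def dilate(img, se):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in se:
                yy, xx = y + dy, x + dx
                if 0 <= yy < h and 0 <= xx < w and img[yy][xx]:
                    out[y][x] = 1
                    break
    return out

def erode(img, se):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = 1 if all(
                0 <= y + dy < h and 0 <= x + dx < w and img[y + dy][x + dx]
                for dy, dx in se) else 0
    return out

SE = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]  # 3x3 square

# A blob with a one-pixel hole in the middle: closing fills the hole.
blob = [[0,0,0,0,0],
        [0,1,1,1,0],
        [0,1,0,1,0],
        [0,1,1,1,0],
        [0,0,0,0,0]]
closed = erode(dilate(blob, SE), SE)
```

After closing, the hole at the center is filled while the blob keeps roughly its original extent.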

Part C. Area Calculation

In some of the blobs, cluttered circles overlap each other. I chose to remove these circles so that they would not interfere with the data gathering; in this case, blobs with sizes less than 400 or greater than 600 pixels were removed. Labeling is also important, since we need to output the size of each circle and not of the entire ROI. After labeling, we can simply output the area, which is just the pixel count of each circle. The data is shown below.
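The labeling and size-filtering logic can be sketched in plain Python, with bwlabel replaced by a simple flood fill; the tiny image and the size window are made up for illustration (the activity's real window is 400-600 pixels):

```python
# Connected-component labeling (4-connectivity) plus a size filter,
# a toy stand-in for bwlabel + the reg_size loop above.
from collections import deque

def label(img):
    h, w = len(img), len(img[0])
    lab = [[0] * w for _ in range(h)]
    n = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not lab[y][x]:
                n += 1
                lab[y][x] = n
                q = deque([(y, x)])
                while q:
                    cy, cx = q.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        yy, xx = cy + dy, cx + dx
                        if 0 <= yy < h and 0 <= xx < w and img[yy][xx] and not lab[yy][xx]:
                            lab[yy][xx] = n
                            q.append((yy, xx))
    return lab, n

img = [[1,1,0,0],
       [1,1,0,1],
       [0,0,0,1],
       [1,0,0,1]]
lab, n = label(img)                                        # 3 blobs
areas = {j: sum(row.count(j) for row in lab) for j in range(1, n + 1)}
keep = {j: a for j, a in areas.items() if 2 <= a <= 4}     # toy size window
```

Blobs outside the window (here the single stray pixel) are discarded, exactly as the `L(f) = 0` step does in the Scilab version.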


The numbers 1 to 12 are the assigned names of the 12 subimages. Using MS Excel, we can plot the frequency vs area as shown below.

The data above show that 28 circles have an area of 555.28 pixels. To verify this result, I obtained the diameters of each of the circles in the untreated image.

diameter = 27 (+- 1) pix

Area of a circle = 572.55 pix (+- 7.41 %)

Hence the error is 3.01 %. Thus we can conclude that the process shown above is an effective method of area computation.
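The numbers above can be checked directly in a few lines of Python (illustration only): with a measured diameter of 27 +- 1 pixels,

```python
import math

d = 27                                  # measured diameter, in pixels
area_true = math.pi * (d / 2) ** 2      # ~572.55 square pixels
rel_unc = 2 * (1 / d) * 100             # +-1 pix on d propagates to ~7.41 % on area
error = (area_true - 555.28) / area_true * 100   # vs the measured 555.28 -> ~3.01 %
```

The relative uncertainty doubles because area goes as the square of the diameter.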

Ed and Rica helped me in this activity.

I rate myself 10/10 for my effort in this activity.

Tuesday, July 15, 2008

A8 - Morphological Operations

This activity demonstrates how morphological operations, particularly dilation and erosion, can change the shape and structure of a binary image.

Let A be our image and B a structuring element. The dilation of A by B gives all points z that are translations of the reflection of B which, when intersected with A, give a non-empty set. The effect of dilation is to expand or elongate A in the shape of B.

On the other hand, the erosion of A by B gives all points z such that B translated by z is contained in A. The effect of erosion is to shrink the image by the shape of B.
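These two definitions can be written directly as set operations on pixel coordinates. This is a plain-Python sketch (not the Scilab code used below), with a made-up 3x3 square A and a 1x2 structure element B; the erosion shortcut of scanning only points of A is valid here because B contains the origin:

```python
def dilate(A, B):
    # all z = a + b: translations of the reflection of B that intersect A
    return {(ay + by, ax + bx) for ay, ax in A for by, bx in B}

def erode(A, B):
    # all z such that B translated by z is contained in A
    # (scanning z over A is enough because B contains the origin)
    return {(ay, ax) for ay, ax in A
            if all((ay + by, ax + bx) in A for by, bx in B)}

A = {(y, x) for y in range(3) for x in range(3)}   # 3x3 square
B = {(0, 0), (0, 1)}                               # 1x2 element, origin included
grown = dilate(A, B)    # expanded along x: a 3x4 block
shrunk = erode(A, B)    # reduced along x: a 3x2 block
```

Dilation grows the square in the direction of B, and erosion shrinks it by the same shape, matching the predictions made below.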

To illustrate the definitions given above, consider the following sample images:


Figure1. Square (50x50 pixels)


Figure 2. Triangle ( B = 50 pixels, H = 30 pixels)

Figure 3. Circle (R = 25 pixels)

Figure 4. Hollow Square (60x60 pixels , 4 pixels thick)

Figure 5. Plus Sign ( 50 pixels long, 8 pixels thick)


And the structure elements that will be used are below:



Figure 6. Structure elements


Before doing the said operations in Scilab, we first predicted the outputs of the operations.


We then performed the said operations in Scilab. For the dilation operation, the code is below:

A = imread('shape.jpg');

B = imread('structureelement.jpg');

im = im2bw(A, 0.5); // converts the shape to a binary image

se = im2bw(B, 0.5); // converts the structure element to binary

imse = dilate(im,se,[1,1]); //dilation operation

imshow(imse,[]);

The [1,1] written above is the chosen position of the origin for all the structure elements except the plus sign, which has its origin at (3,3).

The results are summarized below. Note that the resulting images are also binary; the colors were added so that one can easily see the differences among them. Also, below each dilated image is its corresponding structure element.

Figure 7. Dilated images of the square.

Figure 8. Dilated images of the triangle


Figure 9. Dilated images of the circle



Figure 10. Dilated images of the hollow square



Figure 11. Dilated images of the plus sign

For the erosion operation:

A = imread('shape.jpg');

B = imread('structureelement.jpg');

im = im2bw(A, 0.5); // converts the shape to a binary image

se = im2bw(B, 0.5); // converts the structure element to binary

imse = erode(im,se,[1,1]); //erosion operation

imshow(imse,[]);

And the results are also summarized below:


Figure 12. Eroded images of the square.


Figure 13. Eroded images of the triangle.



Figure 14. Eroded images of the circle.

Figure 15. Eroded images of the hollow square.


Figure 16. Eroded images of the plus sign.

I rate myself 10/10 for this activity because it took me half my day to do the predicted images and the other half to check my results in Scilab, and because I had a hard time convincing other people (and myself) that some of my predicted images are correct. ;)

I want to thank April, Beth and Rica for helping me in this activity,

Reference:

A8 - Morphological Operations hand-out



Thursday, July 10, 2008

A7 - Enhancement in the Frequency Domain

Again in this activity, we made use of the Fourier Transform (FT) of an image. An image can be treated as a superposition of sinusoids, and by applying the FT we obtain its spatial frequencies. Every f(x,y) in an image has a corresponding F(k,l) in Fourier space.

A. Anamorphic property of the Fourier Transform

For this part of the activity, we have a two dimensional signal and we take its FT. The steps are below:

//Creates a signal
nx = 100;
ny = 100;

x = linspace(-1,1,nx);

y = linspace(-1,1,ny);
[X,Y] = ndgrid(x,y);
f = 4; //frequency of the signal
z = sin(2*%pi*f*X);
imshow(z,[]);


And we obtain the figure below:


Figure 1: 2D signal with a f(frequency)=4

//Displays the FT of the signal
ft = fft2(z); // takes the FT
imshow(fftshift(abs(ft)), []); // displays the FT magnitude

This will then display the figure below:


Figure 2: FT of signal with f=4

It can be observed in this figure that there are two dots lying on a vertical line. These two dots correspond to the frequency of the stripes in Figure 1. Midway between the dots is the DC term of the image, which corresponds to its average intensity (the zero-frequency component). The dots lie on a vertical line because the sinusoid's intensity varies vertically rather than horizontally.

The figure below shows the resulting FT for a signal with varying frequency.


Figure 3. Signals with f=2, 4, 10, 20


Figure 4. FT of signals with f=2, 4, 10, 20

It can be observed from Figure 3 that the distance between the stripes decreases as the frequency increases. Taking the FT of each signal gives its frequency, f = 1/(distance between stripes), represented by the two dots in Figure 4. Since the frequency is inversely proportional to the stripe spacing, the separation between the two dots increases with f.
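The two-dots observation can be reproduced in one dimension with a plain-Python DFT (illustration only, not the Scilab code above): the magnitude spectrum of sin(2*pi*f*n/N) is zero everywhere except at bins +f and N-f, and each peak has height N/2.

```python
import cmath, math

def dft(x):
    # direct O(N^2) discrete Fourier transform, enough for N = 100
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

N, f = 100, 4
signal = [math.sin(2 * math.pi * f * n / N) for n in range(N)]
mags = [abs(c) for c in dft(signal)]
peaks = [k for k, m in enumerate(mags) if m > 1]   # bins 4 and 96 (= -4 mod N)
```

The two peaks at +-f are exactly the two dots seen in Figure 2, and they move apart as f grows.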

By choosing to rotate the sinusoid by an angle of 30 degrees:

nx = 100;
ny = 100;

x = linspace(-1,1,nx);

y = linspace(-1,1,ny);

[X,Y] = ndgrid(x,y);
f = 4;
theta = 30*%pi/180; // sin() and cos() expect radians
z = sin(2*%pi*f*(Y*sin(theta) + X*cos(theta)));
ft=fft2(z);
imshow(fftshift(abs(ft)),[]);



Figure 5: Theta = 30 , f=4

We obtained stripes tilted by 120 degrees (perpendicular to the 30-degree direction along which the sinusoid varies), and the FT peaks are rotated by the same angle. Hence the FT shows us both the frequency of the image and the direction in which it varies most.

For the pattern below,

Figure 6: Patterned image

We used:

nx = 100;
ny = 100;

x = linspace(-1,1,nx);

y = linspace(-1,1,ny);
[X,Y] = ndgrid(x,y);
f = 4;
z = sin(2*%pi*f*X).*sin(2*%pi*f*Y); // f = 4 along both x and y

ft = fft2(z);
imshow(fftshift(abs(ft)),[]);

Because the intensity of this image varies along both the x and y directions, its FT also shows frequency peaks along both directions.

B. Fingerprints

For this activity, we need to enhance the image of a fingerprint. The fingerprint to be used is below.

Figure 8. Fingerprint

im1 = imread("fingerprint.jpg");
f = im2gray(im1);

ft1 = fft2(f); // takes the FT of the image
ft2 = log(fftshift(abs(ft1))); // log rescales the FT magnitude for display
imshow(ft2, []);


This will display the FT of Figure 8. We took the logarithm because the dynamic range of the Fourier magnitudes is so large that it needs rescaling; otherwise the resulting FT image would be almost entirely black.


Figure 9. FT of fingerprint

From the image of the FT, we can design a filter. The goal is to let the bright part of the image through the filter to enhance the ridges.


Figure 10. Filter

After the filter is designed, we can think of it as an aperture. We then multiply the FT of the image by this filter, which is equivalent to convolving the two in image space.

im1 = imread('fingerprint.jpg');
im2 = imread('filter.jpg');
fprint = im2gray(im1);
filter = im2gray(im2);
ft1 = fft2(fprint); // takes the FT of the image
FP = log(fftshift(abs(ft1))); // log rescales the FT magnitude for display

F = fftshift(filter);
FPF = ft1.*F; // multiply in frequency space (convolution in image space)
final = ifft(FPF);
last = real(final);
imshow(last, [ ]);

The result is below:

The result was obtained via trial and error, since the goal is to keep all the necessary information in the image while removing the unnecessary parts. I also tried using a matrix pattern, but since the background of the image is uneven, using matrices to enhance the ridges also enhanced the lines present in the background.
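The pipeline used above (forward FT, multiply by a 0/1 filter, inverse FT) can be sketched in one dimension in plain Python; the two-tone signal and the mask that keeps only f = 3 are invented for illustration:

```python
import cmath, math

def dft(x, sign=-1):
    # sign=-1: forward DFT; sign=+1 (divided by N outside): inverse DFT
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

N = 64
# two-tone signal: frequencies 3 and 12
x = [math.sin(2*math.pi*3*n/N) + math.sin(2*math.pi*12*n/N) for n in range(N)]
X = dft(x)                                              # forward FT
mask = [1 if k in (3, N - 3) else 0 for k in range(N)]  # keep only f = 3
Y = [Xk * m for Xk, m in zip(X, mask)]                  # multiply FT by the filter
y = [(v / N).real for v in dft(Y, sign=+1)]             # inverse FT
# y is now the f = 3 sinusoid alone; the f = 12 component is gone
```

This is the one-dimensional analogue of masking unwanted frequencies in the fingerprint's FT and inverting back.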


C. Lunar Image


For this part of the activity, the goal is to remove the vertical lines in the figure below.


Using a code similar to part B,

stacksize(20000000);
im1 = imread('moon.jpg');
im2 = imread('filter1.jpg');
moon = im2gray(im1);
filter = im2gray(im2);
ft1 = fft2(moon); // takes the FT of the image
M = log(fftshift(abs(ft1))); // log rescales the FT magnitude for display
F = fftshift(filter);
MF = ft1.*F; // multiply in frequency space (convolution in image space)
final = ifft(MF); // inverse FT back to image space
last = real(final);
imshow(last, [ ]);

The FT of the image is,

And the filter used was,


Finally, we obtained the enhanced image.

I also used the filter below,


and the resulting image is:
which in my opinion is the same as the other enhanced image.

I rate myself 10/10 for this activity because it took me a lot of time to figure out what to do.

Thanks Beth for helping me in this activity.

Reference:

http://homepages.inf.ed.ac.uk/rbf/HIPR2/fourier.htm

Tuesday, July 8, 2008

A6 - Fourier Transform Model of Image Formation

A. Familiarization with Discrete FFT

In the first part of the activity, we performed the Fourier Transform (FT) on an image of a circle.

Figure 1. circle.bmp (128x128)

I = imread('circle.bmp');

Igray = im2gray(I);
FIgray = fft2(Igray); // This produces the fourier transform of a 2D image
//This is a complex image so we use

imshow(abs(FIgray),[]); // to obtain the magnitude of the image

The result is below

Figure 2: FT of circle.bmp

Because the FFT returns a matrix with the diagonal quadrants interchanged, we must shift the quadrants back to their original positions. So we used:

imshow(fftshift(abs(FIgray)), []);

And the result is

Figure 3: FT of circle.bmp

By taking the FT of the same image twice:

imshow(abs(fft2(FIgray)));

The obtained result is the same as the image in Figure 1. This is expected because applying the FT twice returns a spatially inverted copy of the original image, and the circle is symmetric under inversion, so it appears unchanged.



Now we replace the circle image with "A".

Figure 4: A.bmp

I = imread('A.bmp');

Igray = im2gray(I);
FIgray = fft2(Igray); // produces the Fourier transform of the 2D image; the result is complex
imshow(abs(FIgray),[]); // to obtain the magnitude of the image

Figure 5: FT of A.bmp
imshow(fftshift(abs(FIgray)), []);

Figure 6: FT of A.bmp

imshow(abs(fft2(FIgray)));

Figure 7: FT of figure 6

As it turns out, when the FT is applied twice the image becomes inverted. This happens because we applied the forward transform a second time instead of the inverse transform: the double forward FT flips the spatial coordinates, so Figure 7 contains the same content as Figure 4, only inverted. The symmetric circle in the first part hid this effect, while the letter A reveals it.
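This inversion is easy to verify in one dimension: applying the forward DFT twice to a sequence of length N returns N times the sequence with its indices reversed modulo N. A plain-Python check (not part of the activity's Scilab code):

```python
import cmath

def dft(x):
    # direct discrete Fourier transform
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

x = [0, 1, 2, 3, 4, 5, 6, 7]
xx = [round((v / len(x)).real) for v in dft(dft(x))]
# xx is x flipped modulo N: [x[0], x[7], x[6], ..., x[1]]
```

In two dimensions the same identity flips both axes, which is exactly the upside-down A of Figure 7.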

B. Simulation of an Imaging Device

For the next part of the activity, two images were used and their convolution was performed. One of them serves as the object and the other as the transfer function of an imaging system. For this activity, we can think of this transfer function as the aperture of a circular lens.

Figure 8. VIP. bmp is the object

Figure 9. circle.bmp is the aperture

r=imread('circle.bmp');
a=imread('VIP.bmp');
rgray = im2gray(r);
agray = im2gray(a);
Fr = fftshift(rgray); // the circle acts directly in the Fourier plane (aperture), so only a shift is needed

Fa = fft2(agray);
FRA = Fr.*(Fa);

IRA = fft2(FRA); // inverse FFT (a second forward FFT; the image comes out inverted)
FImage = abs(IRA);
imshow(FImage, [ ]);


The result is below:
Figure 10: Convolved figure of a and r

When a smaller circle is used, the result will be:

Figure 11: Smaller Circle used

On the other hand, when a larger circle is used such that the one shown below,


the result will be:
Figure 12: Larger Circle used

The larger the aperture is, the finer the image becomes. If circle.bmp serves as an aperture, then it determines how much light reaches the image plane. This also explains why Figure 12 is the brightest image while Figure 11 is blurred.

C. Template Matching Using Correlation

This process is used for pattern recognition. One of the images serves as the pattern (template) and the other is the image in which the pattern is to be found. The goal is to match the A in the template to the A's in the given image. Here are the images:

Figure 13: Template Image


Figure 14: Image with the pattern "A"

Note that in Figure 14, there are 5 A's in the sentence. The code for this activity is written below.

a=imread('A.bmp');

b=imread('sentence.bmp');

agray = im2gray(a);

bgray = im2gray(b);
Fa = fft2(agray);
Fb = fft2(bgray);

C = conj(Fb); // conjugate of the FT of the sentence image

ab = Fa.*C; // conjugate product: correlation in image space
I = fft2(ab); // transforms back to image space (inverted)

FImage = abs(I); // magnitude, for display
imshow(FImage, [ ]);

Figure 15. Correlation image

In Figure 15, bright spots can be observed. These bright spots can be thought of as match confirmations between the images in Figures 13 and 14. Although the resulting image is not readable, the spots are distinct. There are 5 spots in Figure 15, telling us that there must be 5 A's in Figure 14.
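The conjugate-product trick can be checked in one dimension in plain Python (the template and signal below are made up): the inverse transform of Fa .* conj(Fb) peaks exactly at the shift where the pattern matches.

```python
import cmath

def dft(x, sign=-1):
    # sign=-1: forward DFT; sign=+1 (divided by N outside): inverse DFT
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]

template = [1, 1, 0, 0, 0, 0, 0, 0]   # made-up pattern at the origin
signal   = [0, 0, 0, 1, 1, 0, 0, 0]   # same pattern shifted by 3
prod = [a * b.conjugate() for a, b in zip(dft(signal), dft(template))]
corr = [(v / len(signal)).real for v in dft(prod, sign=+1)]
peak = corr.index(max(corr))          # brightest spot: shift 3
```

The bright spots in Figure 15 are the two-dimensional version of this peak, one per occurrence of the pattern.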

D. Edge Detection using Convolution Integral

For this part of the activity, several 3x3 matrices are convolved with an image to allow edge recognition. The image used is the same as in Figure 8.

pattern = [0,-1,0;-1,5,-1;0,-1,0];
a=imread('VIP.bmp');

agray = im2gray(a);
C = imcorrcoef(agray, pattern); // correlates the 3x3 matrix with the VIP image

imshow(C, [ ]);


The result is an image with sharper edges.

Figure 16: [0,-1,0;-1,5,-1;0,-1,0] for sharper edges

By using different types of matrices we have:

Figure 17: [-2,-1,0;-1,1,1;0,1,2] for an embossed image

Figure 18: [-1,-1,-1;-1,8,-1;-1,-1,-1] for line detection

Figure 19: [-1,0,0;0,-1,0;0,0,0] for edge enhancement
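What the correlation call does with these 3x3 matrices can be sketched in plain Python: slide the kernel over the image and sum the elementwise products, with zero padding at the borders. The uniform test image is made up; the kernel is the line-detection matrix from Figure 18, which gives zero on flat regions and responds only at edges.

```python
# Toy 3x3 neighborhood filter with zero padding (illustration only).
def conv3x3(img, k):
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            s = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        s += img[yy][xx] * k[dy + 1][dx + 1]
            out[y][x] = s
    return out

laplace = [[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]]   # Figure 18's kernel
flat = [[5] * 5 for _ in range(5)]                    # uniform region
resp = conv3x3(flat, laplace)
# interior of a uniform region gives 0; only the (zero-padded) borders respond
```

Swapping in the other matrices from Figures 16, 17, and 19 reproduces the sharpening, embossing, and edge-enhancement effects in the same way.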

References:
http://www.cra.org/Activities/craw/dmp/awards/2006/Bolan/DMP_Pages/filters.html
http://en.wikipedia.org/wiki