ECE 893-2 Machine Vision
Spring 2005
This course builds upon ECE847 by exposing students to fundamental concepts, issues, and algorithms in
digital image processing and computer vision. Topics include segmentation, texture, detection,
3D reconstruction, calibration, shape, and energy minimization.
The goal is to equip students with the skills and tools needed to manipulate
images, along with an appreciation for the difficulty of the problems. Students
will implement several standard algorithms, evaluate the strengths and weaknesses
of various approaches, and explore a topic of their own choosing in a course
project.
Syllabus
Week | Topic | Assignment
1 | Texture | HW1: Warm-up, due 1/21
2 | Texture | HW2: Texture, due 1/28
3 | Segmentation |
4 | Segmentation | HW3: Segmentation, due 2/11
5 | Detection |
6 | Detection | HW4: Detection, due 2/25
7 | 3D reconstruction |
8 | 3D reconstruction | HW5: 3D reconstruction, due 3/18
9 | Camera calibration |
10 | [spring break] |
11 | Range images | HW6: Calibration, due 4/1
12 | Object recognition | start projects
13 | Shape and active contours |
14 | Energy minimization |
15 | Energy minimization | projects due
In your final project, you will investigate some area of image processing or computer vision in more detail. Typically
this will involve formulating a problem, reading the literature, proposing a solution, implementing the solution,
evaluating the results, and communicating your findings. In the case of a survey project, the quality and depth of
the literature review should be increased significantly to compensate for the lack of implementation.
Project deadlines:
- 4/15/05: progress report (1 page)
- 5/2/05: final oral presentation in class during the final exam slot, 8:00-11:00am
- 5/2/05: final written report (up to 5 pages), due at noon
- 5/5/05: peer reviews of written reports and oral presentations, due at noon
Homeworks:
- HW#1 (due 1/21; email code to the grader by 11:59pm):
- Implement the floodfill algorithm in C/C++. Create a console-based
Visual C++ application that takes three command-line parameters: filename
x y (in that order), where 'filename' is the image filename and (x,y) are the
coordinates of the seed point. The application should load the image from
disk, run the algorithm, and display both the original image and the resulting
image in separate windows on the screen.
- To create a console app, follow these instructions: File -> New ->
Project -> Win32 Console Application. Give it a name and keep the checkbox on
"Create new workspace". Choose "An application that supports MFC." Now compile
and run (Build -> Build ..., and Build -> Execute, or F7 and Ctrl-F5). Under
FileView -> Source Files you will find the main cpp file. (Also, I
would recommend that you turn off Precompiled Headers: Project -> Settings
-> C/C++ -> Precompiled headers -> Not using precompiled headers. Before you
click on the radio button, though, first select All configurations in the drop-down
box so that both Debug and Release versions are affected.)
- Email the grader with the subject line "HW1" (without quotes). The
email should have the following files attached:
- username_hw1_exe (This is the precompiled executable, but with the
extension removed so that it is not blocked by the email system).
- username_hw1_zip (This is a zipped version of your entire workspace
-- with extension removed -- with everything needed to compile your code.
Do not include DLLs.)
- all the source files that you actually wrote for the homework, attached as
individual text files (there will probably be only one or two of these, or at
most three).
- The grader will test your code on quantized.pgm and on another, similar image.
- For this assignment, you are encouraged to use the Blepo computer vision
library, a preliminary version of which can be found at the location emailed to
you. A tutorial on the library will be given in class.
- No report is due for this assignment.
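For orientation, the flood-fill step in HW#1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the expected solution: it uses a hypothetical GrayImage struct rather than the Blepo image classes, assumes 4-connectivity, and treats a pixel as part of the region only if it has exactly the seed's gray level. Loading and displaying images are left out.

```cpp
#include <cstdint>
#include <stack>
#include <utility>
#include <vector>

// Hypothetical minimal grayscale image (row-major); NOT the Blepo ImgGray class.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;
    GrayImage(int w, int h, std::uint8_t v) : width(w), height(h), pixels(w * h, v) {}
    std::uint8_t& at(int x, int y) { return pixels[y * width + x]; }
};

// Recolors the 4-connected region of pixels that share the seed's gray level.
// Returns the number of pixels changed. An explicit stack is used instead of
// recursion so that large regions cannot overflow the call stack.
int FloodFill(GrayImage& img, int seed_x, int seed_y, std::uint8_t new_value) {
    if (seed_x < 0 || seed_x >= img.width || seed_y < 0 || seed_y >= img.height) return 0;
    const std::uint8_t old_value = img.at(seed_x, seed_y);
    if (old_value == new_value) return 0;  // nothing to do; also prevents an endless loop

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    std::stack<std::pair<int, int> > frontier;
    img.at(seed_x, seed_y) = new_value;  // mark on push so each pixel is pushed once
    frontier.push(std::make_pair(seed_x, seed_y));
    int changed = 1;

    while (!frontier.empty()) {
        const std::pair<int, int> p = frontier.top();
        frontier.pop();
        for (int k = 0; k < 4; ++k) {
            const int nx = p.first + dx[k];
            const int ny = p.second + dy[k];
            if (nx < 0 || nx >= img.width || ny < 0 || ny >= img.height) continue;
            if (img.at(nx, ny) != old_value) continue;
            img.at(nx, ny) = new_value;
            frontier.push(std::make_pair(nx, ny));
            ++changed;
        }
    }
    return changed;
}
```

Marking a pixel when it is pushed, rather than when it is popped, keeps each pixel from entering the stack more than once.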
- HW#2 (due 1/31; bring hardcopy report to my office by 4:30pm, email code to
grader by 4:30pm)
- Write a C++ function
SynthesizeTextureEfrosLeung(const ImgGray& texture, int out_width, int out_height, int window_width, ImgGray* out);
where
- 'texture' is the sample texture image
- 'out_width x out_height' are the dimensions of the output image
- 'window_width' is the neighborhood window width (and height, since the
window is square); this is the only parameter of the algorithm
- 'out' contains the 'out_width x out_height' image resulting from
synthesizing the texture
The function should implement the algorithm described in Efros and Leung, "Texture Synthesis
by Non-parametric Sampling", ICCV 1999. See their website for details.
- Create a console-based Visual C++ application that takes four
command-line parameters: texture_filename out_width out_height
window_width (in that order), which correspond to the input parameters of the
function. The application should load the image from
disk, run the algorithm, and display both the original image and the resulting
image in separate windows on the screen.
- Make sure your code is readable and well-organized, including comments where necessary.
- Test your algorithm on images such as those used by Efros and Leung:
texture_brick.pgm, texture_dense_weave.pgm, texture_grate.pgm, texture_mesh.pgm,
texture_ripples.pgm, texture_text.pgm, texture_thin_weave.pgm; small versions:
texture_brick_small.pgm, texture_mesh_small.pgm
- Email the grader with the subject line "HW2" (without quotes). The
email should have the following files attached:
- username_hw2_exe (This is the precompiled executable, but with the
extension removed so that it is not blocked by the email system).
- username_hw2_zip (This is a zipped version of the folder containing your entire workspace
-- with extension removed -- with everything needed to compile your code.
Be sure that unzipping the file causes a folder to be created, with the
individual files inside the folder. Do not include DLLs, and do not include the Blepo directory. Also be sure
to delete the Debug and Release directories, as well as the .opt and .ncb files,
since these files take up a lot of disk space and are automatically generated by
VC++. Your workspace should be at the same level as the inner-most Blepo
directory, so that your main file has #include "../Blepo/blepo.h".)
- all the source files that you actually wrote for the homework, attached as
individual text files (there will probably be only one or two of these, or at
most three).
- Write a report describing the problem of texture synthesis, the algorithm
you used, resulting images, and analysis of the results. There is no need
to include source code.
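One building block of non-parametric texture synthesis is comparing a partially filled output neighborhood against a candidate neighborhood from the sample texture. The sketch below shows one plausible form of that comparison: a Gaussian-weighted sum of squared differences over the already-synthesized pixels, normalized by the total weight. The function name MaskedSsd, the flat-vector window layout, and the sigma parameter are assumptions made for illustration; consult the paper for the exact formulation and for how candidates are then sampled.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Gaussian-weighted SSD between a partially filled output window ('target')
// and a fully known sample window, both window_width x window_width and
// stored row-major. Only pixels flagged in 'known' contribute, and the sum is
// normalized by the total weight of those pixels. Returns +infinity when no
// pixel of the target window is known.
// Hypothetical helper: the name and signature are illustrative only.
double MaskedSsd(const std::vector<float>& target,
                 const std::vector<bool>& known,
                 const std::vector<float>& sample,
                 int window_width,
                 double sigma) {
    const int half = window_width / 2;
    double weighted_sum = 0.0;
    double total_weight = 0.0;
    for (int r = 0; r < window_width; ++r) {
        for (int c = 0; c < window_width; ++c) {
            const std::size_t i = static_cast<std::size_t>(r) * window_width + c;
            if (!known[i]) continue;  // skip pixels not yet synthesized
            const double dx = c - half;
            const double dy = r - half;
            const double w = std::exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma));
            const double diff = static_cast<double>(target[i]) - sample[i];
            weighted_sum += w * diff * diff;
            total_weight += w;
        }
    }
    if (total_weight <= 0.0) return std::numeric_limits<double>::infinity();
    return weighted_sum / total_weight;
}
```

Normalizing by the total weight makes distances comparable between windows with different numbers of known pixels, which matters because the number of known neighbors varies as the synthesized region grows.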
- HW#3 (due 2/11; bring hardcopy report to my office by 4:30pm, email code to
grader by 4:30pm)
- Implement the split-and-merge segmentation algorithm in C++.
- Run the algorithm with a homogeneity predicate that requires the
standard deviation of the gray-level values of the pixels in a region to be no
more than some threshold. Generate results for the following image:
segmentation_holes.pgm
- Run the algorithm with a homogeneity predicate that requires the
standard deviation of the elements of the pixels' parent vectors to be no more than some
threshold. In other words, if x_i is the parent vector of pixel i, then std({x_i})
< th, where {x_i} is the set of the parent vectors of all the pixels in the
region. Compute the parent vector of a pixel by concatenating the values
at all orientations and scales of a Haar wavelet transform. Generate results for the following image:
segmentation_zebra.pgm
- Develop an executable that takes two command-line arguments: filename
mode, the first being the name of the image to segment and the second being an
integer (1 indicates texture segmentation, while 0 indicates gray-level
segmentation). The program should run the algorithm and display the
original and resulting images, along with any intermediate processing that you
think would help convey the working of the algorithm.
- Email the grader with the subject line "HW3" (without quotes). The
email should have the same attached files as the previous assignments.
- Write a report describing the problem, the algorithm
you used, resulting images, and analysis of the results. There is no need
to include source code.
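The gray-level predicate and the split phase in HW#3 can be sketched as below. This is a rough illustration under assumptions, not a reference solution: the image is a flat row-major vector rather than a Blepo image, Rect and the function names are hypothetical, and the merge phase (joining adjacent leaves whose union still satisfies the predicate) is not shown.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical axis-aligned region: top-left corner (x, y), size w x h.
struct Rect { int x, y, w, h; };

// Standard deviation of the gray levels inside region r.
double RegionStdDev(const std::vector<std::uint8_t>& img, int img_width, const Rect& r) {
    double sum = 0.0, sum_sq = 0.0;
    for (int y = r.y; y < r.y + r.h; ++y) {
        for (int x = r.x; x < r.x + r.w; ++x) {
            const double v = img[y * img_width + x];
            sum += v;
            sum_sq += v * v;
        }
    }
    const double n = static_cast<double>(r.w) * r.h;
    const double mean = sum / n;
    // Clamp at zero to guard against tiny negative values from round-off.
    return std::sqrt(std::max(sum_sq / n - mean * mean, 0.0));
}

// Split phase only: recursively quarters a region until each leaf satisfies
// the predicate stddev <= threshold (or is a single pixel).
void SplitRegion(const std::vector<std::uint8_t>& img, int img_width, const Rect& r,
                 double threshold, std::vector<Rect>* leaves) {
    if (r.w <= 0 || r.h <= 0) return;  // empty quadrant from an odd-sized split
    if ((r.w == 1 && r.h == 1) || RegionStdDev(img, img_width, r) <= threshold) {
        leaves->push_back(r);
        return;
    }
    const int hw = (r.w + 1) / 2;
    const int hh = (r.h + 1) / 2;
    SplitRegion(img, img_width, Rect{r.x, r.y, hw, hh}, threshold, leaves);
    SplitRegion(img, img_width, Rect{r.x + hw, r.y, r.w - hw, hh}, threshold, leaves);
    SplitRegion(img, img_width, Rect{r.x, r.y + hh, hw, r.h - hh}, threshold, leaves);
    SplitRegion(img, img_width, Rect{r.x + hw, r.y + hh, r.w - hw, r.h - hh}, threshold, leaves);
}
```

Recomputing the statistics from scratch at every level is simple but wasteful; precomputed sums would make the predicate much cheaper to evaluate.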
- HW#4 (due 2/25; bring hardcopy report and demo code to grader in the library
5th floor, right side (Java City side) from 3pm - 7pm)
- Using the images provided (textdoc-training.bmp
and textdoc-testing.bmp), pick a letter of the
alphabet and build a model of that letter's appearance (in the lower-case Times
font of the main body text -- do not worry about the title, heading, or figure
captions) using the training image. Then search in the testing image for
all occurrences of the letter. For this assignment, your detector may use
a simple model such as a single template built from a single example. The
output should be a colored (e.g., red) rectangle outlining each letter found in
the testing image.
- Do the same thing, but this time for the entire word
"texture" (all lowercase).
- Show receiver operating characteristic (ROC) curves for the detection
problems, with different thresholds. To make an ROC curve, vary the
threshold and measure the false positives and false negatives, plotting on a
graph. Then connect the dots. Here is a demonstration of ROC curves:
http://wise.cgu.edu/sdt/sdt.html
- Implement the split-and-merge segmentation algorithm in C++ (again).
For this assignment, your code should be efficient (able to process each image
in a few seconds). Run your code on the holes and zebra images above, as
well as the following:
segmentation_hydrant.bmp. Show results after split, and after
split-and-merge. Show results for different thresholds. Do not
implement the Haar transform.
- Write a report describing the problem, the algorithm
you used, resulting images, and analysis of the results. There is no need
to include source code.
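The threshold sweep behind the ROC curves in HW#4 can be sketched as follows. The sketch assumes that each candidate location already has a match score (higher meaning a better match) and a ground-truth label, and that a detection fires when the score is at or above the threshold. RocPoint and SweepThresholds are hypothetical names, and plotting is left to whatever tool you prefer.

```cpp
#include <cstddef>
#include <vector>

// One operating point of the detector (hypothetical struct for illustration).
struct RocPoint {
    double threshold;
    int false_positives;  // detector fired, but the letter is not there
    int false_negatives;  // letter is there, but the detector did not fire
};

// Counts the errors at each threshold. scores[i] is the match score of
// candidate i and is_target[i] is its ground-truth label. A candidate counts
// as detected when its score is >= the threshold (an assumed convention).
std::vector<RocPoint> SweepThresholds(const std::vector<double>& scores,
                                      const std::vector<bool>& is_target,
                                      const std::vector<double>& thresholds) {
    std::vector<RocPoint> curve;
    for (std::size_t t = 0; t < thresholds.size(); ++t) {
        RocPoint p = {thresholds[t], 0, 0};
        for (std::size_t i = 0; i < scores.size(); ++i) {
            const bool detected = scores[i] >= thresholds[t];
            if (detected && !is_target[i]) ++p.false_positives;
            if (!detected && is_target[i]) ++p.false_negatives;
        }
        curve.push_back(p);
    }
    return curve;
}
```

Dividing the counts by the number of non-targets and targets, respectively, turns them into rates, which makes curves from images of different sizes comparable.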
- HW#5 (due 4/1; bring hardcopy report and demo code to grader in the library
5th floor, right side (Java City side) from 3pm - 7pm)
- Implement the dynamic-programming snake algorithm, as described in Amini et
al., 1990. Include first-order and second-order derivative terms for the
internal energy (i.e., alpha and beta), and allow each point to move to one of
nine positions in its immediate vicinity. For the external energy, use the
negative of the magnitude of the gradient.
- For simplicity, restrict your implementation to work only with closed
curves.
- Start your snake from an initial curve that is larger than the object and
display its evolution over time.
- To draw a line overlaid on the image, add the following method to Figure:
    void DrawLine(int x0, int y0, int x1, int y1, COLORREF color, int width)
    {
      CClientDC dc(m_wnd);
      CPen pen(PS_SOLID, width, color), *pen_old;
      pen_old = dc.SelectObject(&pen);  // select the new pen, remembering the old one
      dc.MoveTo(CPoint(x0, y0));
      dc.LineTo(CPoint(x1, y1));
      dc.SelectObject(pen_old);         // restore the original pen
    }
  To call, use something like
    {
      Figure fig;
      fig.Draw(img);
      fig.DrawLine(0, 0, 20, 20, RGB(255,0,0), 1);
    }
  (Of course, you can modify this code to suit your needs.)
- Run the "repeatable experiment" mentioned at the end of section VI of the
Amini et al. paper. Here is the image:
synthetic_square.pgm .
- Also run your code on fruitfly.pgm . For
this case, initialize your contour to a curve that surrounds the fly.
- Try your algorithm with different parameters for alpha and beta, number and
location of points, etc.
- Write a report describing the problem, the algorithm you used, resulting
images, and analysis of the results. There is no need to include source
code.
Homework grading:
- A. Report is coherent, concise, clear, and neat, with correct
grammar and punctuation. Code works correctly the first time and
achieves good results on both images.
- B. Report adequately describes the work done, and code generally
produces good results. There are a small number of defects in either the
implementation or the writeup, but the essential components are there.
- C. Report or code is inadequate. The report contains major errors or
is illegible, the code does not produce meaningful results or does not run at
all, or instructions are not followed.
- D or F. Report or code is not attempted, is not turned in, or
contains extremely serious deficiencies.
Instructor: Stan Birchfield, 207-A Riggs Hall, 656-5912, email: stb at clemson
Grader: Sunil Guduru, email: guduru at clemson (Note: Please
use this email address, not Sunil's regular one)
Lectures: 9:05 - 9:55 MWF, 305 Riggs Hall