ECE 877 Computer Vision
Spring 2007
This course builds upon ECE 847 by exposing students to fundamental concepts, issues, and algorithms in
digital image processing and computer vision. Topics include segmentation, texture, detection,
3D reconstruction, calibration, shape, and energy minimization.
The goal is to equip students with the skills and tools needed to manipulate
images, along with an appreciation for the difficulty of the problems. Students
will implement several standard algorithms, evaluate the strengths and weaknesses
of various approaches, and explore a topic in more detail in a course
project.
Syllabus

Week | Topic                          | Assignment
1    | Shape and active contours     | HW1: Template matching, due 1/19
2    | Shape and active contours     | Quiz #1
3    | Level sets                    | HW2: Active contours, due 2/2
4    | Classification                | Quiz #2
5    | Classification                | HW3: Level sets, due 2/16
6    | Fourier transform             | Quiz #3
7    | Texture                       | HW4: Object detection, due 3/2
8    | 3D reconstruction             | Quiz #4
9    | Camera calibration            | HW5: Texture, due 3/16
10   | [break]                       | Quiz #5
11   | Model fitting                 |
12   | Tracking and filtering        | HW6: Head tracking, due 4/6
13   | Scale space and SIFT features | Quiz #6
14   | Function optimization         |
15   | Function optimization         | Projects due
See ECE 847 Readings and Resources.
In the assignments, you will implement several fundamental algorithms in C/C++,
documenting your findings in an accompanying report for each assignment.
The C/C++ languages are chosen for their fundamental importance, their ubiquity,
and their efficiency (which is crucial to image processing and computer vision).
For your convenience, you are encouraged to use the latest version of the
Blepo computer vision library.
To make grading easier, your code should do one of the following:
- #include "blepo.h" (In this case it does not matter where your blepo
directory is, because the grader can simply change the include directory
settings (Tools->Options->Directories->Include files) in Visual Studio
to find the header file automatically.)
or
- #include "../blepo/src/blepo.h" (assuming your main file is directly
inside your assignment directory). In other words, your assignment directory
should be at the same level as the blepo directory. Here is an example:
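(Directory and project names below are illustrative.)

    courses/
      blepo/
        src/
          blepo.h
      hw2/
        hw2.dsp
        main.cpp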
To turn in your assignment, send a blank email to
assign@assign.ece.clemson.edu
with the subject line "ECE877-1,#n" (without quotes), where n is the
assignment number, and cc the instructor and grader. (No one reads the body of
the email, so anything there will be ignored.) You must send this email from
your Clemson account, because the assign server is not smart enough to know who
you are if you use another account. Attach to the email a zip file containing
your report, all of your source files, and any other files needed to compile
your project: *.h, *.c, *.cpp, *.rc, *.dsp, *.dsw. (Do not include the
res, Debug, or Release directories.) Be sure your report, which may be in any
standard format (.pdf, .doc, etc.), is included in the zip file. Be sure that
the zip file is actually attached to the email rather than automatically
included in the body of the email. (This behavior has been observed in Eudora,
for example, but it can be turned off.) Also, be sure to change the extension
of your zip file (e.g., change .zip to _zip) so that the server does not block
the attachment. We cannot grade what we do not receive.
An example report is available.
Assignments:
- HW#1 (Detection)
- Using the images provided (textdoc-training.bmp
and textdoc-testing.bmp), pick a letter of the
alphabet and build a model of that letter's appearance (in the lower-case Times
font of the main body text -- do not worry about the title, heading, or figure
captions) using the training image. Then search in the testing image for
all occurrences of the letter. For this assignment, your detector may use
a simple model such as a single template built from a single example. The
output should be a colored (e.g., red) rectangle outlining each letter found in
the testing image.
- Do the same thing, but this time for the entire word
"texture" (all lowercase).
- Show receiver operating characteristic (ROC) curves for the detection
problems, with different thresholds. To make an ROC curve, vary the
threshold, measure the false positives and false negatives at each setting,
and plot the resulting points on a graph; then connect the dots. (A sketch of
a template-matching detector with a threshold appears after this list.)
Here is a demonstration of ROC curves:
http://wise.cgu.edu/sdt/sdt.html
.
- Write a report describing your approach, including your algorithms and
methodology, experimental results, and discussion. The report may be in
any standard format (.pdf, .doc, etc.).
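For reference, here is a minimal sketch of the kind of single-template detector this assignment allows, scored with the sum of squared differences (SSD). The Image struct and all function names are placeholders, not Blepo types, and grayscale input is assumed.

    // Single-template detector: slide the template over the image and keep
    // windows whose SSD score falls below a threshold.
    #include <utility>
    #include <vector>

    struct Image {
        int width, height;
        std::vector<float> pix;                 // row-major grayscale
        float at(int x, int y) const { return pix[y * width + x]; }
    };

    // Sum of squared differences between the template and the image window
    // whose top-left corner is (x, y); lower scores mean better matches.
    float Ssd(const Image& img, const Image& tmpl, int x, int y)
    {
        float sum = 0.0f;
        for (int ty = 0; ty < tmpl.height; ++ty)
            for (int tx = 0; tx < tmpl.width; ++tx) {
                float d = img.at(x + tx, y + ty) - tmpl.at(tx, ty);
                sum += d * d;
            }
        return sum;
    }

    // Return the top-left corners of all windows scoring below the threshold.
    std::vector<std::pair<int, int> > Detect(const Image& img,
                                             const Image& tmpl, float thresh)
    {
        std::vector<std::pair<int, int> > hits;
        for (int y = 0; y + tmpl.height <= img.height; ++y)
            for (int x = 0; x + tmpl.width <= img.width; ++x)
                if (Ssd(img, tmpl, x, y) < thresh)
                    hits.push_back(std::make_pair(x, y));
        return hits;
    }

An ROC curve then comes from calling Detect over a range of thresholds and, at each setting, counting false positives and false negatives against hand-labeled ground truth. (In practice, nearby hits should be merged by non-maximum suppression before counting.)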
- HW#2 (Active contours)
- Implement the dynamic-programming snake algorithm, as described in Amini et
al., 1990. Include first-order and second-order derivative terms for the
internal energy (i.e., alpha and beta), and allow each point to move to one of
nine positions in its immediate vicinity. For the external energy, use the
negative of the magnitude of the gradient. (A sketch of the core
dynamic-programming pass appears after this list.)
- For simplicity, restrict your implementation to work only with closed
curves.
- Start your snake from an initial curve that is larger than the object and
display its evolution over time.
- Run the "repeatable experiment" mentioned at the end of section VI of the
Amini et al. paper. Here is the image:
synthetic_square.pgm .
- Also run your code on fruitfly.pgm . For
this case, initialize your contour to a curve that surrounds the fly.
- Try your algorithm with different parameters for alpha and beta, number and
location of points, etc.
- Write a report describing your approach, including your algorithms and
methodology, experimental results, and discussion. The report may be in
any standard format (.pdf, .doc, etc.).
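A sketch of one dynamic-programming pass follows, simplified in two ways worth flagging: only the first-order (alpha) term is included (the beta term enlarges the DP state to pairs of candidate moves, as Amini et al. describe), and the closing term between the last and first points is omitted, so within one pass the closed curve is treated as an open chain. All names are placeholders.

    #include <cfloat>
    #include <vector>

    struct Pt { int x, y; };

    // Nine candidate moves per point: the 8-neighborhood plus staying put.
    static const int DX[9] = {-1, 0, 1, -1, 0, 1, -1, 0, 1};
    static const int DY[9] = {-1, -1, -1, 0, 0, 0, 1, 1, 1};

    static Pt Cand(Pt base, int k)
    { Pt c; c.x = base.x + DX[k]; c.y = base.y + DY[k]; return c; }

    static float Dist2(Pt a, Pt b)
    {
        float dx = (float)(a.x - b.x), dy = (float)(a.y - b.y);
        return dx * dx + dy * dy;
    }

    // One DP pass. eext[y][x] holds the external energy (e.g., negative
    // gradient magnitude); alpha weights the first-order internal term.
    // The snake is assumed to stay away from the image border.
    std::vector<Pt> SnakePass(const std::vector<Pt>& p,
                              const std::vector<std::vector<float> >& eext,
                              float alpha)
    {
        int n = (int)p.size();
        std::vector<std::vector<float> > cost(n, std::vector<float>(9));
        std::vector<std::vector<int> > from(n, std::vector<int>(9));

        for (int k = 0; k < 9; ++k) {
            Pt c = Cand(p[0], k);
            cost[0][k] = eext[c.y][c.x];
        }
        for (int i = 1; i < n; ++i)
            for (int k = 0; k < 9; ++k) {
                Pt c = Cand(p[i], k);
                float best = FLT_MAX; int arg = 0;
                for (int j = 0; j < 9; ++j) {      // best predecessor move
                    float e = cost[i-1][j] + alpha * Dist2(c, Cand(p[i-1], j));
                    if (e < best) { best = e; arg = j; }
                }
                cost[i][k] = best + eext[c.y][c.x];
                from[i][k] = arg;
            }

        // Trace back the minimum-energy set of moves.
        int k = 0;
        for (int j = 1; j < 9; ++j)
            if (cost[n-1][j] < cost[n-1][k]) k = j;
        std::vector<Pt> out(n);
        for (int i = n - 1; i > 0; --i) { out[i] = Cand(p[i], k); k = from[i][k]; }
        out[0] = Cand(p[0], k);
        return out;
    }

Iterating SnakePass until the points stop moving gives the snake's evolution; rotating which point is treated as the start between passes reduces the open-chain bias.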
- HW#3 (Level sets)
- Implement the basic level set algorithm to segment a simple image:
shapes.pgm. The implicit surface should initially
surround all the objects, and as the surface evolves, the zero level
set should conform to the boundaries of the objects. Use the gradient
magnitude of the image to determine the speed of evolution. (A sketch of
the update step appears after this list.)
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion. The report may be in any
standard format (.pdf, .doc, etc.). Also include an overview of level set
methods, the narrow band algorithm, and the fast marching method, explaining
when the latter is applicable.
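A minimal sketch of the explicit update, assuming phi is stored as a 2D float grid and the speed image has been precomputed from the image gradient magnitude. Central differences are used here for brevity; a careful implementation would use the upwind differencing scheme from the level set literature, and the sign convention for phi determines whether positive speed expands or contracts the front.

    #include <cmath>
    #include <vector>

    typedef std::vector<std::vector<float> > Grid;

    // One explicit Euler step of  d(phi)/dt = F * |grad phi|.
    void LevelSetStep(Grid& phi, const Grid& speed, float dt)
    {
        int h = (int)phi.size(), w = (int)phi[0].size();
        Grid next = phi;
        for (int y = 1; y < h - 1; ++y)
            for (int x = 1; x < w - 1; ++x) {
                float gx = 0.5f * (phi[y][x + 1] - phi[y][x - 1]);
                float gy = 0.5f * (phi[y + 1][x] - phi[y - 1][x]);
                next[y][x] = phi[y][x]
                           + dt * speed[y][x] * std::sqrt(gx * gx + gy * gy);
            }
        phi.swap(next);
    }

    // Edge-stopping speed from the image gradient magnitude: near 1 in flat
    // regions, near 0 on strong edges, so the front halts at object boundaries.
    float Speed(float gradMag) { return 1.0f / (1.0f + gradMag * gradMag); }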
- HW#4 (Object detection)
- Train and test an object detector using the code provided by OpenCV.
- The vehicle dataset vehicle_images.zip
contains 9 sequences with 200 images each. Label the images in the
datasets assigned to you, according to the email. To label your sequence,
use objectmarker.exe, in haarkit.rar (available from
http://www.iem.pw.edu.pl/~domanskj/haarkit.rar ). Use your best
judgment when labeling the data; for example, only label vehicles on the near
side of the road (but all the lanes on that side), only label vehicles greater
than a minimum size, and only label vehicles that are for the most part
unoccluded.
- Label your sequence no later than the Monday before the assignment deadline,
and send me your info.txt file so that we can share it with all the other
students. Also collect at least 20 background images by that time, so that
we can share those as well. One of these background images should be
computed as the average of your vehicle sequence, yielding an image of the road
with no vehicles. These background images can be found here:
more_background_images.zip
- Additional background images can be found here:
background_images.zip (courtesy of UIUC
researchers Agarwal, Awan, and Roth:
CarData.tar.gz).
Some of the images contain people, bicycles, motorcycles, occluded vehicles,
wheels, and animals, so use your best judgment in deciding whether to include
them.
- Divide your sequence into training and testing sets (100 images each).
Train a detector using the training set, and test the detector on the test set.
Compute an ROC curve by varying the threshold of the detector.
- Use the labeled data from other students to try different combinations.
Without computing full ROC curves of each, compare the performance of training
on one sequence while testing on another sequence. Also train using images
from two different sequences, and measure the performance.
- You may find Zhichao's tutorial on
using the OpenCV object detector useful.
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion. The report may be in any
standard format (.pdf, .doc, etc.). Provide your info.txt and cascade.xml
files, along with a project to load an image, run the detector, and display the
output. (A sketch of such a detection project appears after this list.)
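One plausible shape for the demo project, using the OpenCV 1.x C API. The calls shown (cvLoad, cvHaarDetectObjects, etc.) are standard OpenCV; the file names and parameter values are illustrative, and error checking is omitted for brevity.

    /* Load an image, run a trained cascade, draw red boxes around hits. */
    #include <cv.h>
    #include <highgui.h>

    int main()
    {
        IplImage* img = cvLoadImage("test.bmp", 1);              /* color */
        IplImage* gray = cvCreateImage(cvGetSize(img), IPL_DEPTH_8U, 1);
        cvCvtColor(img, gray, CV_BGR2GRAY);

        CvHaarClassifierCascade* cascade =
            (CvHaarClassifierCascade*)cvLoad("cascade.xml", 0, 0, 0);
        CvMemStorage* storage = cvCreateMemStorage(0);

        /* scale step 1.1, merge clusters of 3+, 24x24 minimum window */
        CvSeq* hits = cvHaarDetectObjects(gray, cascade, storage,
                                          1.1, 3, 0, cvSize(24, 24));
        int i;
        for (i = 0; i < hits->total; ++i) {
            CvRect* r = (CvRect*)cvGetSeqElem(hits, i);
            cvRectangle(img, cvPoint(r->x, r->y),
                        cvPoint(r->x + r->width, r->y + r->height),
                        CV_RGB(255, 0, 0), 2, 8, 0);
        }
        cvNamedWindow("detections", 1);
        cvShowImage("detections", img);
        cvWaitKey(0);
        return 0;
    }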
- HW#5 (Texture synthesis)
- Implement the Efros-Leung texture synthesis algorithm described in the
paper, "Texture
Synthesis by Non-parametric Sampling", ICCV 1999. See their
website for details. Create a Visual C++ application that allows the
user to select a texture, the size of the window, and the desired size of the
output image. The application should load the texture image, run the
algorithm, and display both the original image and the resulting image in
separate windows on the screen. (A sketch of the core sampling step appears
after this list.)
- Test your algorithm on images such as those used by Efros and Leung:
texture_brick.pgm,
texture_dense_weave.pgm,
texture_grate.pgm,
texture_mesh.pgm,
texture_ripples.pgm,
texture_text.pgm,
texture_thin_weave.pgm; small versions:
texture_brick_small.pgm,
texture_mesh_small.pgm
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion. The report may be in any
standard format (.pdf, .doc, etc.).
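A sketch of the core Efros-Leung sampling step. The Gaussian weighting of the window and the onion-peel fill order are omitted here, and the Gray struct and all names are placeholders.

    // For a partially filled output pixel, compare its known neighborhood
    // against every window in the sample texture (masked SSD), keep the
    // candidates within (1 + eps) of the best, and pick one at random.
    #include <cfloat>
    #include <cstdlib>
    #include <vector>

    struct Gray {
        int w, h;
        std::vector<float> v;                    // row-major grayscale
        float at(int x, int y) const { return v[y * w + x]; }
    };

    // Masked SSD between the sample window centered at (sx, sy) and the
    // output window centered at (ox, oy); 'filled' marks known output pixels.
    float MaskedSsd(const Gray& sample, int sx, int sy,
                    const Gray& out, const std::vector<bool>& filled,
                    int ox, int oy, int halfWin)
    {
        float sum = 0.0f; int count = 0;
        for (int dy = -halfWin; dy <= halfWin; ++dy)
            for (int dx = -halfWin; dx <= halfWin; ++dx) {
                int x = ox + dx, y = oy + dy;
                if (x < 0 || y < 0 || x >= out.w || y >= out.h) continue;
                if (!filled[y * out.w + x]) continue;
                float d = sample.at(sx + dx, sy + dy) - out.at(x, y);
                sum += d * d; ++count;
            }
        return count > 0 ? sum / count : 0.0f;
    }

    // Choose a value for output pixel (ox, oy) by sampling a good match.
    float SynthesizePixel(const Gray& sample, const Gray& out,
                          const std::vector<bool>& filled,
                          int ox, int oy, int halfWin, float eps)
    {
        std::vector<float> dist, val;
        float best = FLT_MAX;
        for (int sy = halfWin; sy < sample.h - halfWin; ++sy)
            for (int sx = halfWin; sx < sample.w - halfWin; ++sx) {
                float d = MaskedSsd(sample, sx, sy, out, filled,
                                    ox, oy, halfWin);
                if (d < best) best = d;
                dist.push_back(d);
                val.push_back(sample.at(sx, sy));
            }
        std::vector<float> pool;                 // near-best candidates
        for (int i = 0; i < (int)dist.size(); ++i)
            if (dist[i] <= best * (1.0f + eps)) pool.push_back(val[i]);
        return pool[std::rand() % pool.size()];
    }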
- HW#6 (Head tracking)
- Implement the head tracking algorithm described in
"An Elliptical Head Tracker" (1997) to track a person's head through a
video sequence, using the gradient around the perimeter of the head, modeled as
an ellipse. (A sketch of the ellipse score and local search appears after this
list.)
- For initialization, write code to compute the points along the boundary of a
vertically-oriented ellipse given its center point and the length of its two
axes. Fix the aspect ratio to a hard-coded value (a value between 1.2 and 1.4 is suggested). Also write code to compute the sum
of the gradient magnitude around the perimeter of an ellipse. Run this
code over all possible (x,y) positions in the first image of the sequence
(ignoring image boundaries), generating a likelihood map for the person's
location. Use a single scale determined manually. Plot the
likelihood map in one figure, and the most likely ellipse location in another
figure (overlaying the ellipse on the grayscale image).
- For tracking, write code to compute the location of the head in frame i+1 given its
location in frame i, using a local search around the current location varying x,
y, and scale. Run your tracker over the entire sequence, displaying the
best location in each image. Use pause(0.3) after each display, so that
you can see the results for each image. Evaluate the algorithm's
performance with and without constant-velocity prediction.
- Augment the code to use, instead of the gradient magnitude, the absolute
value of the dot product
of the gradient with the ellipse normal. (The equations for this
version can be found in Section 3 of
"Elliptical Head Tracking Using Intensity Gradients and Color Histograms" (1998).)
Run this tracker over the entire sequence, with constant-velocity prediction.
Evaluate the performance of the algorithm.
- Use the video sequence melissa_head.zip.
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion. The report may be in any
standard format (.pdf, .doc, etc.).
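A sketch of the gradient-magnitude ellipse score and the local search over x, y, and scale. The GradImage struct, the sampling density, and the search ranges are illustrative; the hard-coded aspect ratio follows the assignment's suggestion.

    #include <cmath>
    #include <vector>

    struct GradImage {
        int w, h;
        std::vector<float> mag;                  // gradient magnitude per pixel
        float at(int x, int y) const { return mag[y * w + x]; }
    };

    // Mean gradient magnitude around a vertically oriented ellipse with
    // center (cx, cy) and half-width b; the vertical axis is 1.2 * b.
    float EllipseScore(const GradImage& g, float cx, float cy, float b)
    {
        const float aspect = 1.2f;               // hard-coded aspect ratio
        const int N = 64;                        // perimeter samples
        float sum = 0.0f;
        for (int i = 0; i < N; ++i) {
            float t = 2.0f * 3.14159265f * i / N;
            int x = (int)(cx + b * std::cos(t) + 0.5f);
            int y = (int)(cy + aspect * b * std::sin(t) + 0.5f);
            if (x >= 0 && y >= 0 && x < g.w && y < g.h)
                sum += g.at(x, y);
        }
        return sum / N;
    }

    // Local search around the previous state over x, y, and scale;
    // returns the best-scoring state for the next frame.
    struct State { float x, y, b; };
    State Track(const GradImage& g, State prev)
    {
        State best = prev; float bestScore = -1.0f;
        for (int db = -1; db <= 1; ++db)
            for (int dy = -3; dy <= 3; ++dy)
                for (int dx = -3; dx <= 3; ++dx) {
                    State s = { prev.x + dx, prev.y + dy, prev.b + db };
                    float sc = EllipseScore(g, s.x, s.y, s.b);
                    if (sc > bestScore) { bestScore = sc; best = s; }
                }
        return best;
    }

Running EllipseScore over all (x, y) positions of the first frame gives the likelihood map the assignment asks for; constant-velocity prediction simply shifts prev before the local search.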
Grading standard:
- A. Report is coherent, concise, clear, and neat, with correct
grammar and punctuation. Code works correctly the first time and
achieves good results on both images.
- B. Report adequately
describes the work done, and code generally produces good results. There
are a small number of defects either in the implementation or the writeup, but
the essential components are there.
- C. Report or code is
inadequate. The report contains major errors or is illegible, the code
does not run or produces significantly flawed results, or instructions are
not followed.
- D or F. Report or code not attempted, not turned
in, or contains extremely serious deficiencies.
Detailed grading breakdown is available in the grading chart.
Extra credit: Contributions to the Blepo computer vision library
will earn up to 10 points extra credit on your final grade. In general, you
should expect 1 point for a major bug fix, and 2-7 points for a significant
extension to an existing function or implementation of an algorithm or set of
functions. Contributions should be cleanly written, with code-level and
user-level documentation, and a test harness. To receive extra credit, you
must meet the following deadlines:
- announce (non-binding) intention to contribute (3/16)
- get interface approval (4/6)
- turn in final code and documentation (4/27)
In your final project, you will investigate some area of image processing or computer vision in more detail. Typically
this will involve formulating a problem, reading the literature, proposing a solution, implementing the solution,
evaluating the results, and communicating your findings. In the case of a survey project, the quality and depth of
the literature review should be increased significantly to compensate for the lack of implementation.
Project deadlines:
- 4/9: team, title, and brief description
- 4/20: progress report (1 page)
- 5/1: final oral presentation in class during final exam slot, 1:00-4:00
- 5/3: final written report (approximately 5 pages)
Instructor: Stan Birchfield, 207-A Riggs Hall, 656-5912, email: stb at clemson
Grader: Zhichao Chen, 015 Riggs Hall, 650-0308, email: zhichac at clemson
Lectures: 1:25 - 2:15 MWF, 301 Riggs Hall