ECE 877 Computer Vision
Spring 2010
This course builds upon ECE 847 by exposing students to fundamental concepts, issues, and algorithms in
digital image processing and computer vision. Topics include segmentation, texture, detection,
3D reconstruction, calibration, shape, and energy minimization.
The goal is to equip students with the skills and tools needed to manipulate
images, along with an appreciation for the difficulty of the problems. Students
will implement several standard algorithms, evaluate the strengths and weaknesses
of various approaches, and explore a topic in more detail in a course
project.
Syllabus
Week | Topic                      | Assignment
1    | Shape and active contours | HW1: Template matching, due 1/15
2    | Shape and active contours | Quiz #1, 1/22
3    | Level sets                 | HW2: Active contours, due 1/29
4    | Classification             | Quiz #2, 2/5
5    | Classification             | HW3: Level sets, due 2/12
6    | Fourier transform          | Quiz #3, 2/19
7    | Texture                    | HW4: Texture, due 2/26
8    | Model fitting              | Quiz #4, 3/5
9    | Model fitting              | HW5: Hough transform, due 3/12
10   | [break]                    | Quiz #5, 3/26
11   | Multiple view geometry     | HW6: Mosaicking, due 4/2
12   | 3D reconstruction          | Quiz #6, 4/10
13   | Camera calibration         |
14   | Camera calibration         | Quiz #7, 4/24
15   | Function optimization      | Projects due
See ECE 847 Readings and Resources.
In the assignments, you will implement several fundamental algorithms in C/C++,
documenting your findings in an accompanying report for each assignment. C/C++
is chosen for its fundamental importance, ubiquity, and efficiency (which is
crucial to image processing and computer vision). For your convenience, you
are encouraged to use the latest version of the
Blepo computer vision library.
Your code must compile under VC++ 6.0 or Visual Studio 2008.
To make grading easier, your code should do one of the following:
-
#include "blepo.h" (In this case it does not matter where your blepo
directory is, because the grader can simply change the directory include
settings (Tools->Options->Directories->Include files) for Visual Studio
to automatically find the header file.)
or
-
#include "../blepo/src/blepo.h" (assuming your main file
is directly inside your directory). In other words, your assignment directory
should be at the same level as the blepo directory. Here is an example:
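(The directory and file names below are only illustrative; what matters is that your
assignment directory sits beside the blepo directory.)

    courses/
        blepo/
            src/
                blepo.h
                ...
        hw2_snakes/            <-- your assignment directory
            main.cpp           <-- contains #include "../blepo/src/blepo.h"
            hw2_snakes.dsp
            hw2_snakes.dsw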
To turn in your assignment, send an email to
assign@assign.ece.clemson.edu
(and cc the instructor and grader) with the subject line "ECE877-1,#n" (without
quotes but with the # sign), where 'n' is the assignment number. You may leave
the body of the email blank. Attach a zip file containing your report (in any
standard format such as .pdf or .doc; but not .docx), and all the files needed
to compile your project (such as *.h, *.c, *.cpp, *.rc, *.dsp, *.dsw; do not
include *.ncb, *.opt, *.plg, *.aps, or the res, Debug, or Release directories).
You must send this email from your Clemson account, because the assign server is
not smart enough to know who you are if you use another account (e.g., do not
use @g.clemson.edu). Be sure that this file is actually attached to the email
rather than being automatically included in the body of the email (Eudora, for
example, has been known to include files inline, but this behavior can be turned
off). Also, be sure to change the extension of your zip file (e.g., change
.zip to _zip) so that the server does not block the
attachment! We cannot grade what we do not receive. (Also be sure that
you're not hiding extensions for known types; in Windows Explorer, uncheck the
box "Tools > Folder Options > View > Hide extensions for known file types".)
All assignments are due at 11:59pm on the due date shown. An 8-hour grace
period is granted, so that no points will be deducted for anything submitted
before 8:00am the next morning.
In addition to submitting your report electronically, please also turn in a
hardcopy. The deadline for the electronic copy is the same as for the
code, whereas the hardcopies should be brought to the instructor by noon of the
next business day after the deadline (at the latest). Just slip it under the
door if I'm not in. No points will be deducted for printing in
black-and-white, even if the report is in color.
An example report
Assignments:
- HW#1 (Detection)
- Using the images provided (textdoc-training.bmp
and textdoc-testing.bmp), pick a letter of the
alphabet and build a model of that letter's appearance (in the lower-case Times
font of the main body text -- do not worry about the title, heading, or figure
captions) using the training image. Then search in the testing image for
all occurrences of the letter. For this assignment, your detector may use
a simple model such as a single template built from a single example. The
output should be a colored (e.g., red) rectangle outlining each letter found in
the testing image.
- Do the same thing, but this time for the entire word
"texture" (all lowercase).
- Show receiver operating characteristic (ROC) curves for the detection
problems, with different thresholds. To make an ROC curve, vary the
threshold, measure the false positives and false negatives at each setting, and plot
the resulting points on a graph. Then connect the dots. Here is a demonstration of ROC curves:
http://wise.cgu.edu/sdtmod/measures6.asp (or see
http://en.wikipedia.org/wiki/Receiver_operating_characteristic ).
- Write a report describing your approach, including your algorithms and
methodology, experimental results, and discussion.
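As a starting point for HW#1, here is a minimal sketch of template matching by
normalized cross-correlation (NCC). The Image struct and the toy data in main are
placeholders (this is not Blepo's API); drawing the output rectangles and sweeping
the threshold for the ROC curve are left to you.

    // Minimal sketch of template matching by normalized cross-correlation (NCC).
    // The Image struct and the toy data below are placeholders, not Blepo types.
    #include <vector>
    #include <cmath>
    #include <cstdio>

    struct Image {
        int width, height;
        std::vector<float> pix;                       // row-major grayscale
        float at(int x, int y) const { return pix[y * width + x]; }
    };

    // NCC score of the template placed with its top-left corner at (x0, y0).
    float ncc(const Image& img, const Image& tmpl, int x0, int y0)
    {
        int n = tmpl.width * tmpl.height;
        float sumI = 0, sumT = 0;
        for (int y = 0; y < tmpl.height; y++)
            for (int x = 0; x < tmpl.width; x++) {
                sumI += img.at(x0 + x, y0 + y);
                sumT += tmpl.at(x, y);
            }
        float meanI = sumI / n, meanT = sumT / n;
        float num = 0, varI = 0, varT = 0;
        for (int y = 0; y < tmpl.height; y++)
            for (int x = 0; x < tmpl.width; x++) {
                float a = img.at(x0 + x, y0 + y) - meanI;
                float b = tmpl.at(x, y) - meanT;
                num += a * b;  varI += a * a;  varT += b * b;
            }
        float denom = std::sqrt(varI * varT);
        return denom > 0 ? num / denom : 0;
    }

    // Report every placement whose score exceeds a threshold.  Sweeping the
    // threshold and counting false positives/negatives gives the ROC curve.
    void detect(const Image& img, const Image& tmpl, float thresh)
    {
        for (int y = 0; y + tmpl.height <= img.height; y++)
            for (int x = 0; x + tmpl.width <= img.width; x++)
                if (ncc(img, tmpl, x, y) > thresh)
                    std::printf("detection at (%d, %d)\n", x, y);   // draw a rectangle here
    }

    int main()
    {
        // Toy example: an 8x8 image with a small gradient patch used as the template.
        Image img = { 8, 8, std::vector<float>(64, 0.0f) };
        for (int y = 2; y < 5; y++)
            for (int x = 3; x < 6; x++)
                img.pix[y * 8 + x] = 255.0f - 10.0f * (x + y);
        Image tmpl = { 3, 3, std::vector<float>(9) };
        for (int y = 0; y < 3; y++)
            for (int x = 0; x < 3; x++)
                tmpl.pix[y * 3 + x] = img.at(3 + x, 2 + y);
        detect(img, tmpl, 0.99f);
        return 0;
    }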
- HW#2 (Snakes)
- Implement the dynamic-programming snake algorithm, as described in Amini et
al., 1990. Include first-order and second-order derivative terms for the
internal energy (i.e., alpha and beta), and allow each point to move to one of
nine positions in its immediate vicinity. For the external energy, use the
negative of the magnitude of the gradient.
- For simplicity, restrict your implementation to work only with closed
curves.
- Start your snake from an initial curve that is larger than the object and
display its evolution over time.
- Run the "repeatable experiment" mentioned at the end of section VI of the
Amini et al. paper on synthetic_square.pgm.
Also run the code on fruitfly.pgm, initializing your contour to a curve that surrounds the fly
(using the GrabMouseClicks function).
- Try your algorithm with different parameters for alpha and beta, number and
location of points, etc.
- Write a report describing your approach, experimental results, and discussion.
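For the HW#2 snake, the sketch below shows the core dynamic-programming recurrence
for a single pass over an open chain of points, using only the first-order (alpha)
internal energy. Handling the closed contour and the second-order (beta) term
requires the larger DP state described in Amini et al., and the external-energy
function here is a stub rather than the negative gradient magnitude.

    // Sketch of one dynamic-programming pass over an OPEN chain of snake points,
    // using only the first-order (alpha) internal energy.  Each point may move to
    // one of the 9 positions in its 3x3 neighborhood.  The second-order (beta)
    // term needs a DP state of position pairs, and the closed contour needs the
    // first point handled separately, as described in Amini et al.
    #include <vector>
    #include <cfloat>

    struct Pt { int x, y; };

    // Placeholder external energy; in the assignment this is the negative
    // gradient magnitude at (x, y).
    float externalEnergy(int x, int y) { return 0.0f; }

    float dist2(const Pt& a, const Pt& b)
    {
        float dx = float(a.x - b.x), dy = float(a.y - b.y);
        return dx * dx + dy * dy;
    }

    void dpPass(std::vector<Pt>& snake, float alpha)
    {
        const int n = (int)snake.size(), K = 9;   // candidate k -> offset (k%3-1, k/3-1)
        std::vector<std::vector<float> > cost(n, std::vector<float>(K, 0.0f));
        std::vector<std::vector<int> >   back(n, std::vector<int>(K, 0));

        for (int i = 0; i < n; i++)
            for (int k = 0; k < K; k++) {
                Pt vk = { snake[i].x + k % 3 - 1, snake[i].y + k / 3 - 1 };
                float best = 0.0f;
                int bestj = 0;
                if (i > 0) {
                    best = FLT_MAX;
                    for (int j = 0; j < K; j++) {
                        Pt vj = { snake[i-1].x + j % 3 - 1, snake[i-1].y + j / 3 - 1 };
                        float c = cost[i-1][j] + alpha * dist2(vk, vj);
                        if (c < best) { best = c; bestj = j; }
                    }
                }
                cost[i][k] = best + externalEnergy(vk.x, vk.y);
                back[i][k] = bestj;
            }

        // Backtrack from the cheapest final state and move the points.
        int k = 0;
        for (int j = 1; j < K; j++) if (cost[n-1][j] < cost[n-1][k]) k = j;
        for (int i = n - 1; i >= 0; i--) {
            snake[i].x += k % 3 - 1;
            snake[i].y += k / 3 - 1;
            k = back[i][k];
        }
    }

    int main()
    {
        Pt init[] = { {10,10}, {12,10}, {14,11}, {16,13} };
        std::vector<Pt> snake(init, init + 4);
        dpPass(snake, 1.0f);     // in practice, iterate until the contour stops moving
        return 0;
    }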
- HW#3 (Level sets)
- Implement the level set segmentation algorithm described in the Chan-Vese
2001 paper. For simplicity, let the implicit surface initially surround
the image, excluding only a narrow band of pixels near the border. As the surface evolves, the zero level
set should conform to the boundaries of the object(s).
- Be sure to periodically reinitialize the level set function using the signed
distance to the contour.
- Run your code on the fruitfly.pgm image.
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion. In your report, show
results for different initial implicit surfaces, including one that is neither
completely outside nor completely inside the image.
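For HW#3, the following is a deliberately stripped-down sketch of one Chan-Vese
iteration: it computes the two region means from the sign of phi and applies only
the data term. The curvature (length) term, the regularized Heaviside/delta
functions, and the periodic signed-distance reinitialization required by the real
algorithm are omitted, and the toy image in main is made up.

    // Deliberately simplified single Chan-Vese iteration: region means from the
    // sign of phi followed by a data-term-only update.  The curvature (length)
    // term, the regularized Heaviside/delta functions, and the periodic
    // reinitialization of phi to a signed distance function are omitted here
    // but are required in the actual algorithm.
    #include <vector>

    void chanVeseStep(const std::vector<float>& img, std::vector<float>& phi,
                      int w, int h, float dt)
    {
        // 1. Means of the two regions: inside (phi >= 0) and outside (phi < 0).
        double sumIn = 0, sumOut = 0;
        int nIn = 0, nOut = 0;
        for (int i = 0; i < w * h; i++) {
            if (phi[i] >= 0) { sumIn += img[i]; nIn++; }
            else             { sumOut += img[i]; nOut++; }
        }
        float c1 = nIn  ? float(sumIn  / nIn)  : 0.0f;   // inside mean
        float c2 = nOut ? float(sumOut / nOut) : 0.0f;   // outside mean

        // 2. Data term only: phi += dt * ( -(I - c1)^2 + (I - c2)^2 ).
        for (int i = 0; i < w * h; i++) {
            float d1 = img[i] - c1, d2 = img[i] - c2;
            phi[i] += dt * (-d1 * d1 + d2 * d2);
        }
    }

    int main()
    {
        // Toy 4x4 image: bright 2x2 square on a dark background.  phi starts
        // positive everywhere except a one-pixel border, so the initial surface
        // surrounds the image as the assignment requests.
        int w = 4, h = 4;
        std::vector<float> img(w * h, 0.0f);
        img[1*w+1] = img[1*w+2] = img[2*w+1] = img[2*w+2] = 255.0f;
        std::vector<float> phi(w * h, -1.0f);
        for (int y = 1; y < h - 1; y++)
            for (int x = 1; x < w - 1; x++)
                phi[y*w+x] = 1.0f;
        for (int it = 0; it < 10; it++)
            chanVeseStep(img, phi, w, h, 0.0001f);
        return 0;
    }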
- HW#4 (Texture synthesis)
- Implement the Efros-Leung texture synthesis algorithm described in the
paper, "Texture
Synthesis by Non-parametric Sampling", ICCV 1999. See their
website for details. Create a Visual C++ application with two modes:
- In the first mode, the application allows the
user to select a gray-level texture, the size of the window, and the desired size of the
output image. The application loads the texture image, runs the
algorithm, and displays both the original image and the resulting image in
separate windows on the screen.
- In the second mode, the application allows the user to select a gray-level
image. Any pixel in the image that has a value of zero is filled in by the
algorithm. The application displays both the original image and the resulting image in
separate windows on the screen.
- If you create a command-line application, then the syntax should be as
follows:
- elprog texture.pgm window_size output_width output_height
(for first mode)
- elprog image.pgm window_size (for second mode)
where
- 'elprog' is the name of your executable,
- the texture/image can be in any format recognizable by the software,
- 'window_size' is the width/height of the square window used in the
computation, and
- the output image is of size 'output_width' x 'output_height' for the first
mode.
The two modes are distinguished by the number of parameters passed in.
- On the other hand, if you create a Windows-based application, then make it
obvious how to select between the modes and set the parameters.
- Test your algorithm on images such as those used by Efros and Leung:
texture_brick.pgm,
texture_dense_weave.pgm,
texture_grate.pgm,
texture_mesh.pgm,
texture_ripples.pgm,
texture_text.pgm,
texture_thin_weave.pgm; small versions:
texture_brick_small.pgm,
texture_mesh_small.pgm
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion.
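For HW#4, the heart of the Efros-Leung algorithm is the neighborhood search
sketched below, which fills a single output pixel by comparing its partially
known window against every window of the sample texture and sampling uniformly
among matches within (1 + eps) of the best distance. The Gray struct and the toy
data are placeholders, and the Gaussian window weighting and the "most filled
neighbors first" visiting order are not shown.

    // Core Efros-Leung step: fill ONE output pixel by matching its partially
    // known window against every window of the sample texture, then sampling
    // uniformly among candidates within (1 + eps) of the best distance.
    #include <vector>
    #include <cstdlib>
    #include <cfloat>

    struct Gray {
        int w, h;
        std::vector<unsigned char> pix;               // row-major
        unsigned char at(int x, int y) const { return pix[y * w + x]; }
    };

    // 'filled' marks which output pixels are already known; half = window_size/2.
    unsigned char synthesizePixel(const Gray& sample, const Gray& out,
                                  const std::vector<bool>& filled,
                                  int ox, int oy, int half, float eps)
    {
        std::vector<float> dist;
        std::vector<unsigned char> value;
        float best = FLT_MAX;

        for (int sy = half; sy < sample.h - half; sy++)
            for (int sx = half; sx < sample.w - half; sx++) {
                float d = 0; int count = 0;
                for (int dy = -half; dy <= half; dy++)
                    for (int dx = -half; dx <= half; dx++) {
                        int x = ox + dx, y = oy + dy;
                        if (x < 0 || y < 0 || x >= out.w || y >= out.h) continue;
                        if (!filled[y * out.w + x]) continue;   // only known pixels count
                        float diff = float(out.at(x, y)) - float(sample.at(sx + dx, sy + dy));
                        d += diff * diff; count++;
                    }
                if (count == 0) continue;
                d /= count;                                     // normalized SSD
                dist.push_back(d);
                value.push_back(sample.at(sx, sy));
                if (d < best) best = d;
            }

        // Choose uniformly among all candidates within (1 + eps) of the best match.
        std::vector<unsigned char> candidates;
        for (size_t i = 0; i < dist.size(); i++)
            if (dist[i] <= best * (1.0f + eps))
                candidates.push_back(value[i]);
        return candidates.empty() ? 0 : candidates[std::rand() % candidates.size()];
    }

    int main()
    {
        // Toy 5x5 sample and a copy of it with a single unfilled pixel at (2,2).
        Gray sample = { 5, 5, std::vector<unsigned char>(25) };
        for (int i = 0; i < 25; i++) sample.pix[i] = (unsigned char)(i * 10);
        Gray out = sample;
        std::vector<bool> filled(25, true);
        filled[2 * 5 + 2] = false;
        out.pix[2 * 5 + 2] = synthesizePixel(sample, out, filled, 2, 2, 1, 0.1f);
        return 0;
    }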
- HW#5 (Hough transform)
- Implement the Hough transform to detect straight lines in an image, using
the edge detector of your choice.
- Detect the end points of the lines.
- Overlay the detected line segments on the original image.
- Run your code on the following images:
riggs1.pgm, riggs2.pgm, and
riggs3.pgm.
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion.
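For HW#5, a bare-bones sketch of the Hough accumulator in the (rho, theta)
parameterization is given below; edge detection, peak finding, and end-point
detection are separate steps that are not shown, and the edge list in main is
synthetic.

    // Bare-bones Hough accumulator for lines in the (rho, theta) parameterization
    // rho = x*cos(theta) + y*sin(theta).  Edge detection, peak finding, and
    // end-point detection are separate steps not shown here.
    #include <vector>
    #include <cmath>
    #include <cstdio>

    struct Edgel { int x, y; };

    std::vector<int> houghLines(const std::vector<Edgel>& edges,
                                int imgW, int imgH, int numTheta, int numRho)
    {
        const double PI = 3.14159265358979;
        double rhoMax = std::sqrt(double(imgW) * imgW + double(imgH) * imgH);
        std::vector<int> acc(numTheta * numRho, 0);     // acc[t * numRho + r]

        for (size_t i = 0; i < edges.size(); i++)
            for (int t = 0; t < numTheta; t++) {
                double theta = PI * t / numTheta;        // theta in [0, pi)
                double rho = edges[i].x * std::cos(theta) + edges[i].y * std::sin(theta);
                int r = int((rho + rhoMax) / (2 * rhoMax) * (numRho - 1) + 0.5);
                acc[t * numRho + r]++;                   // one vote per (theta, rho) bin
            }
        return acc;   // local maxima above a vote threshold give the detected lines
    }

    int main()
    {
        // Toy example: points on the horizontal line y = 7 vote into a single bin.
        std::vector<Edgel> edges;
        for (int x = 0; x < 20; x++) { Edgel e = { x, 7 }; edges.push_back(e); }
        int numTheta = 180, numRho = 200;
        std::vector<int> acc = houghLines(edges, 20, 20, numTheta, numRho);

        int best = 0;
        for (int i = 1; i < (int)acc.size(); i++) if (acc[i] > acc[best]) best = i;
        std::printf("strongest bin: theta index %d, rho index %d, votes %d\n",
                    best / numRho, best % numRho, acc[best]);
        return 0;
    }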
- HW#6 (Mosaicking)
- Create a high resolution mosaic from the following input images:
tillman.zip .
- For reference, you may want to refer to Szeliski's classic
tutorial on
image alignment and stitching. However, for our purposes, we will
greatly simplify the problem:
- Downsample the images by a factor of 8 in both directions to make it easier
to fit the result on the screen.
- Correspondence between pairs of images may be performed manually. Feel
free to use these or
these.
- Using the correspondences, solve for the homography between pairs of
overlapping images using the normalized Direct Linear Transformation algorithm (DLT).
If you want, you may use Blepo's HomographyFit or OpenCV's cvFindHomography:
http://www.seas.upenn.edu/~bensapp/opencvdocs/ref/opencvref_cv.htm .
(You should be able to call OpenCV functions directly from Blepo; simply #include
the correct header files and then the OpenCV data structures and functions will
be available. See the top of FaceDetector.cpp for an example.) For more
detail on the DLT, see
http://w3.impa.br/~lvelho/outgoing/thales/visao/ex1.pdf .
- Choose the middle image as the reference. Warp the other images to it,
and feather the pixels to reduce the effect of seams. In a separate
display, show the outlines of the warped images (see Fig. 16 on p. 44 of the
above tutorial).
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion.
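For HW#6, the sketch below covers only the first two steps of the normalized DLT:
Hartley normalization of each point set and construction of the 2n x 9 design
matrix. The null vector of that matrix (its right singular vector with smallest
singular value) must still be found with an SVD routine, e.g., Blepo's
HomographyFit or OpenCV's cvFindHomography mentioned above; the correspondences in
main are made up.

    // First two steps of the normalized DLT: (1) similarity transforms that move
    // each point set to zero mean and average distance sqrt(2), and (2) the
    // 2n x 9 matrix A whose unit null vector (found by SVD, not shown) is the
    // homography between the normalized point sets.
    #include <vector>
    #include <cmath>
    #include <cstdio>

    struct Pt2 { double x, y; };

    // Writes the 3x3 normalizing transform T (row-major) and applies it in place.
    void normalizePoints(std::vector<Pt2>& pts, double T[9])
    {
        double mx = 0, my = 0;
        for (size_t i = 0; i < pts.size(); i++) { mx += pts[i].x; my += pts[i].y; }
        mx /= pts.size();  my /= pts.size();

        double meanDist = 0;
        for (size_t i = 0; i < pts.size(); i++)
            meanDist += std::sqrt((pts[i].x - mx) * (pts[i].x - mx) +
                                  (pts[i].y - my) * (pts[i].y - my));
        meanDist /= pts.size();
        double s = std::sqrt(2.0) / meanDist;

        T[0] = s; T[1] = 0; T[2] = -s * mx;
        T[3] = 0; T[4] = s; T[5] = -s * my;
        T[6] = 0; T[7] = 0; T[8] = 1;
        for (size_t i = 0; i < pts.size(); i++) {
            pts[i].x = s * (pts[i].x - mx);
            pts[i].y = s * (pts[i].y - my);
        }
    }

    // Each correspondence (x,y) <-> (x',y') contributes two rows of A.
    void buildA(const std::vector<Pt2>& p, const std::vector<Pt2>& q,
                std::vector<double>& A)               // A is 2n x 9, row-major
    {
        A.assign(2 * p.size() * 9, 0.0);
        for (size_t i = 0; i < p.size(); i++) {
            double x = p[i].x, y = p[i].y, xp = q[i].x, yp = q[i].y;
            double* r1 = &A[(2 * i) * 9];
            double* r2 = &A[(2 * i + 1) * 9];
            // row 1: [ 0  0  0   -x  -y  -1    yp*x  yp*y  yp ]
            r1[3] = -x;  r1[4] = -y;  r1[5] = -1;  r1[6] = yp * x;  r1[7] = yp * y;  r1[8] = yp;
            // row 2: [ x  y  1    0   0   0   -xp*x -xp*y -xp ]
            r2[0] = x;   r2[1] = y;   r2[2] = 1;   r2[6] = -xp * x; r2[7] = -xp * y; r2[8] = -xp;
        }
    }

    int main()
    {
        // Four made-up correspondences; replace with your clicked points.
        Pt2 a[] = { {10,10}, {200,15}, {195,180}, {12,175} };
        Pt2 b[] = { {35,22}, {230,30}, {220,190}, {30,185} };
        std::vector<Pt2> p(a, a + 4), q(b, b + 4);

        double Tp[9], Tq[9];
        normalizePoints(p, Tp);        // normalizes the first image's points
        normalizePoints(q, Tq);        // normalizes the second image's points

        std::vector<double> A;
        buildA(p, q, A);
        std::printf("A is %d x 9; its unit null vector h (via SVD) gives Hn,\n"
                    "and the final homography is H = inv(Tq) * Hn * Tp.\n",
                    (int)(A.size() / 9));
        return 0;
    }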
Grading standard:
- A. Report is coherent, concise, clear, and neat, with correct
grammar and punctuation. Code works correctly the first time and
achieves good results on both images.
- B. Report adequately
describes the work done, and code generally produces good results. There
are a small number of defects either in the implementation or the writeup, but
the essential components are there.
- C. Report or code is
inadequate. The report contains major errors or is illegible, the code
does not run or produces significantly flawed results, or instructions are
not followed.
- D or F. Report or code not attempted, not turned
in, or contains extremely serious deficiencies.
Detailed grading breakdown is available in the grading chart.
In your final project, you will investigate some area of image processing or computer vision in more detail. Typically
this will involve formulating a problem, reading the literature, proposing a solution, implementing the solution
(using the programming language/environment of your choice),
evaluating the results, and communicating your findings. In the case of a survey project, the quality and depth of
the literature review should be increased significantly to compensate for the lack of implementation.
Project deadlines:
- 3/26: team (1 or 2 people), title, and brief description
- 4/14: progress report (1 page)
-
4/30: final oral presentation in class during
final exam slot, 3:00-5:30
- 4/30: final written report (5 pages)
To turn in your report, please send me a single email per group (do not email
the assign server) with two attachments:
- PDF file containing your 5-page report, conference format (title, authors,
abstract, introduction, method, experimental results, conclusion, references)
- PPT file containing your slides
Both files should have the same name, which should correspond somehow to your
topic. Use underscores instead of spaces. Do not send PPTX files. Example:
face_detection.pdf and face_detection.ppt. You do *not* need to send me your
code (although you may if you like).
Projects from previous years
Instructor: Stan Birchfield, 207-A Riggs Hall, 656-5912, email: stb at clemson
Office hours: MWF afternoons, or by appointment
Grader: Zhichao Chen, zhichac at clemson
Lectures: 1:25 - 2:15 MWF, 227 Riggs Hall