ECE 877 Computer Vision
Spring 2012
This course builds upon ECE 847 by exposing students to fundamental concepts, issues, and algorithms in
digital image processing and computer vision. Topics include segmentation, texture, detection,
3D reconstruction, calibration, shape, and energy minimization.
The goal is to equip students with the skills and tools needed to manipulate
images, along with an appreciation for the difficulty of the problems. Students
will implement several standard algorithms, evaluate the strengths and weaknesses
of various approaches, and explore a topic in more detail in a course
project.
Syllabus

Week | Topic                  | Assignment
-----|------------------------|---------------------------------------------
  1  | Classification         | HW1: Template matching, due 1/20
  2  | Classification         | Quiz #1, 1/23 *
  3  | Shape                  | HW2: Level sets, due 2/3
  4  | Shape                  | Quiz #2, 2/10
  5  | Texture                | HW3: Feature detection / matching, due 2/17
  6  | Texture                | Quiz #3, 2/24
  7  | Model fitting          | HW4: Mosaicking, due 3/2
  8  | Model fitting          | Quiz #4, 3/9
  9  | Camera calibration     | HW5: Two-view reconstruction, due 3/16
 10  | [break]                | Quiz #5, 3/30
 11  | Multiple view geometry | HW6: N-view reconstruction, due 4/6
 12  | Multiple view geometry | Quiz #6, 4/13
 13  | 3D reconstruction      |
 14  | 3D reconstruction      | Quiz #7, 4/27
 15  | Function optimization  | Projects due
See ECE 847 Readings and Resources.
In the assignments, you will implement several fundamental algorithms in C/C++,
documenting your findings in an accompanying report for each assignment.
C/C++ is chosen for its fundamental importance, ubiquity,
and efficiency (which is crucial to image processing and computer vision).
For your convenience, you are encouraged to use the latest version of the
Blepo computer vision library.
Your code must compile under Visual Studio 2008, Visual Studio 2010, or VC++ 6.0.
You should develop your code in Debug mode but test in Release mode before
submitting. The grader will test in Release mode. To make grading easier, your code should do one of the following:
- #include "blepo.h" (In this case it does not matter where your blepo
directory is, because the grader can simply change the directory include
settings (Tools->Options->Directories->Include files) for Visual Studio
to automatically find the header file.)
or
- #include "../blepo/src/blepo.h" (assuming your main file
is directly inside your assignment directory). In other words, your assignment
directory should be at the same level as the blepo directory.
To turn in your assignment, send an email to
assign@assign.ece.clemson.edu
(and cc the instructor and grader) with the subject line "ECE877-1,#n"
(without quotes but with the # sign), where 'n' is the assignment number.
You must send this email from your Clemson account, because the assign server is
not smart enough to know who you are if you use another account. E.g.,
do not use @g.clemson.edu. If you are using Gmail, it is not sufficient to
change the 'send mail as:' to @clemson.edu. Instead, from 'Mail Settings'
you need to go to 'Accounts and Import', 'Send mail as:', 'Send mail from
another address', type in your userid@clemson.edu, select 'Send through
clemson.edu SMTP servers', type 'smtp.clemson.edu' along with your userid and
password, select 'Secured connection using SSL', then 'Add account'.
To your email, attach a zip file containing your report (in any standard format
such as .pdf or .doc;
but not .docx),
and all the files needed to compile your project (such as .h, .c, .cpp, .rc,
.vcproj, .sln, .dsp, .dsw). Also, if you have built an MFC Windows
application (as opposed to a console-based application), include the res directory that contains the .ico and .rc2
files. But do NOT include all the
other files that Visual Studio creates automatically, such as .aps, .clw, .ncb,
.opt, .plg, .suo, or the Debug or Release
directories. When in doubt, extract your zip file to a new temporary directory
and verify that it compiles and runs.
You may leave the body of the email blank. Be sure that your zip file is actually
attached to the email rather than being automatically included in the body of
the email (Eudora, for example, has been known to include files inline, but this
behavior can be turned off). We cannot grade what we do not receive.
(Obsolete instructions that were applicable when the Clemson server used to
block .zip attachments: Also, be sure to change the extension of your
zip file (e.g., change .zip to _zip) so that the server does
not block the attachment!!! Also be sure that you're not hiding extensions for known types; in Windows
explorer, uncheck the box "Tools.Folder Options.View.Hide extensions for known
file types".)
All assignments are due at 11:59pm on the due date shown. An 8-hour grace
period is extended, so that no points will be deducted for anything submitted
before 8:00am the next morning.
Reports should be professionally written, with a title, a description of the
problem, a description of the algorithm, a detailed discussion of your
particular implementation, results, and analysis.
An example report is available. Similarly, code should
be professionally and cleanly written, making use of standard programming
practices.
Assignments:
- HW#1 (Template matching)
- Using the images provided (textdoc-training.bmp
and textdoc-testing.bmp), pick a letter of the
alphabet and build a model of that letter's appearance (in the lower-case Times
font of the main body text -- do not worry about the title, heading, or figure
captions) using the training image. Then search in the testing image for
all occurrences of the letter. For this assignment, your detector may use
a simple model such as a single template built from a single example (use the
Extract method in Blepo), and SSD (or SAD) template matching can be used to
compare the template with locations in the test image. Use non-maximal
suppression and a threshold to determine where the letter occurs. The
output should be the test image with colored (e.g., red) rectangles outlining each letter found.
- Do the same thing, but this time for the entire word
"texture" (all lowercase).
- In your report, show receiver operating characteristic (ROC) curves for the detection
problems, with different thresholds. To make an ROC curve, vary the
threshold and measure the false positives and false negatives, plotting on a
graph. Then connect the dots. Here is a demonstration of ROC curves:
http://wise.cgu.edu/sdtmod/measures6.asp (or see
http://en.wikipedia.org/wiki/Receiver_operating_characteristic ).
- Write a report describing your approach, including your algorithms and
methodology, experimental results, and discussion.
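As a rough starting point for HW1, SSD template matching can be sketched in plain C++ (no Blepo); the Image struct and function names below are illustrative, not part of any library, and a real detector would additionally threshold the scores and apply non-maximal suppression rather than keep only the single best window:

```cpp
#include <vector>
#include <cassert>

// Minimal grayscale image: row-major pixel values.
struct Image {
    int w, h;
    std::vector<int> pix;
    int at(int x, int y) const { return pix[y * w + x]; }
};

// Sum of squared differences between the template and the image
// window whose top-left corner is (x, y); lower means a better match.
long long ssd(const Image& img, const Image& tmpl, int x, int y) {
    long long s = 0;
    for (int ty = 0; ty < tmpl.h; ++ty)
        for (int tx = 0; tx < tmpl.w; ++tx) {
            long long d = img.at(x + tx, y + ty) - tmpl.at(tx, ty);
            s += d * d;
        }
    return s;
}

// Slide the template over every valid window; return the top-left
// corner of the lowest-SSD match through the output parameters.
void best_match(const Image& img, const Image& tmpl, int* bx, int* by) {
    long long best = -1;
    for (int y = 0; y + tmpl.h <= img.h; ++y)
        for (int x = 0; x + tmpl.w <= img.w; ++x) {
            long long s = ssd(img, tmpl, x, y);
            if (best < 0 || s < best) { best = s; *bx = x; *by = y; }
        }
}
```

Swapping the squared difference for an absolute difference gives SAD; for the letter detector, you would record every window whose score beats a threshold and suppress detections that are not local minima in their neighborhood.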
- HW#2 (Level sets)
- Implement the level set segmentation algorithm described in the Chan-Vese
2001
paper. For simplicity, let the implicit surface initially surround
the image, excluding only a narrow band of pixels near the border. As the surface evolves, the zero level
set should conform to the boundaries of the object(s).
- Be sure to periodically reinitialize the level set function using the signed
distance to the contour.
- Run your code on the fruitfly.pgm image and some
images of your own.
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion. In your report, show
results for different initial implicit surfaces, including one that is completely outside
the object, one that is completely inside the object, and one that is neither.
- HW#3 (Feature detection and matching)
- Implement the SURF feature detector described in Herbert Bay, Andreas Ess,
Tinne Tuytelaars, Luc van Gool, "SURF: Speeded Up Robust Features", Computer
Vision and Image Understanding (CVIU), Vol. 110, No. 3, pp. 346--359, 2008.
Simplifications:
- Only one octave (four scales within octave)
- Implement U-SURF (no rotation)
- (optional) Use simple 1D quadratic interpolation between scales; no
interpolation is needed spatially
- Note: First convert to grayscale and do all processing on the
grayscale image. Also, only detect features at the middle two scales (the
other two scales are for non-maximal suppression).
- Detect features in some pairs of these images:
tillman.zip . Choose from either of the two
resolutions: the original resolution, or downsampled by 8 in both
directions. Overlay rectangles/circles over original image to display the
SURF features detected. Color the rectangles/circles as red or blue
depending on the sign of the Laplacian (red = bright surrounded by dark, whereas
blue = dark surrounded by light, as shown in the slides)
- Extra: Display correspondences by showing the two images, the SURF
features detected, and the matches. For computing the descriptor, do not worry
about Gaussian weighting of the circle
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion.
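SURF's speed comes entirely from the integral image, which makes any box filter response cost four lookups regardless of filter size. A minimal C++ sketch of the two pieces (the padding convention of an extra zero row/column is a common choice, not mandated by the paper):

```cpp
#include <vector>
#include <cassert>

// Build a (w+1) x (h+1) integral image with an extra row and column
// of zeros, so that ii(x, y) is the sum of all pixels strictly above
// and to the left of (x, y).
std::vector<long long> integral_image(const std::vector<int>& img, int w, int h) {
    int W = w + 1;
    std::vector<long long> ii((w + 1) * (h + 1), 0);
    for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x)
            ii[(y + 1) * W + (x + 1)] = img[y * w + x]
                                      + ii[y * W + (x + 1)]
                                      + ii[(y + 1) * W + x]
                                      - ii[y * W + x];
    return ii;
}

// Sum of pixels in the half-open box [x0, x1) x [y0, y1), in O(1).
long long box_sum(const std::vector<long long>& ii, int w,
                  int x0, int y0, int x1, int y1) {
    int W = w + 1;
    return ii[y1 * W + x1] - ii[y0 * W + x1] - ii[y1 * W + x0] + ii[y0 * W + x0];
}
```

The box-filter approximations of the second-order Gaussian derivatives used in the SURF Hessian (Dxx, Dyy, Dxy) are each just a few calls to box_sum with different rectangles and signs.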
- HW#4 (Mosaicking)
- Create a mosaic from the Tillman input images (above).
- For reference, you may want to refer to Szeliski's classic
tutorial on
image alignment and stitching. However, for our purposes, we will
greatly simplify the problem:
- Use the images that are downsampled by a factor of 8 in both directions to make it easier
to fit the result on the screen
- Correspondence between pairs of images may be performed manually (feel
free to use these or
these). (Note that in a real
implementation, a feature detector/descriptor like SIFT/SURF/DAISY would be
used.)
- Using the correspondences, solve for the homography between pairs of
overlapping images using the normalized Direct Linear Transformation algorithm (DLT).
If you want, you may use Blepo's HomographyFit or OpenCV's cvFindHomography:
http://www.seas.upenn.edu/~bensapp/opencvdocs/ref/opencvref_cv.htm .
(You should be able to call OpenCV functions directly from Blepo; simply #include
the correct header files and then the OpenCV data structures and functions will
be available. See the top of FaceDetector.cpp for an example.) For more
detail on the DLT, see
http://w3.impa.br/~lvelho/outgoing/thales/visao/ex1.pdf .
- Choose the middle image as the reference. Warp the other images to it,
and feather the pixels to reduce the effect of seams. For feathering, you
may find the chamfer distance helpful. In a separate
display, show the outlines of the warped images (see Fig. 16 on p. 44 of the
above tutorial).
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion.
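To make the homography step concrete, here is a simplified C++ sketch that fits H from exactly four correspondences by fixing H[2][2] = 1 and solving the resulting 8x8 linear system with Gaussian elimination. This is NOT the normalized DLT the assignment asks for (that one normalizes the points and takes the SVD of a 2n x 9 system, which is far more stable with noisy data); it is only meant to show where the two equations per correspondence come from:

```cpp
#include <cmath>
#include <cassert>

// Fit a 3x3 homography H (row-major, H[8] fixed to 1) from exactly
// four point correspondences src[i] -> dst[i].  Each correspondence
// contributes two rows of the linear system.  Returns false if the
// points are degenerate (e.g., three collinear).
bool fit_homography(const double src[4][2], const double dst[4][2], double H[9]) {
    double A[8][9];  // augmented matrix [A | b]
    for (int i = 0; i < 4; ++i) {
        double x = src[i][0], y = src[i][1], X = dst[i][0], Y = dst[i][1];
        double r0[9] = { x, y, 1, 0, 0, 0, -X * x, -X * y, X };
        double r1[9] = { 0, 0, 0, x, y, 1, -Y * x, -Y * y, Y };
        for (int j = 0; j < 9; ++j) { A[2*i][j] = r0[j]; A[2*i+1][j] = r1[j]; }
    }
    // Gauss-Jordan elimination with partial pivoting.
    for (int c = 0; c < 8; ++c) {
        int p = c;
        for (int r = c + 1; r < 8; ++r)
            if (std::fabs(A[r][c]) > std::fabs(A[p][c])) p = r;
        if (std::fabs(A[p][c]) < 1e-12) return false;
        for (int j = 0; j < 9; ++j) { double t = A[c][j]; A[c][j] = A[p][j]; A[p][j] = t; }
        for (int r = 0; r < 8; ++r) {
            if (r == c) continue;
            double f = A[r][c] / A[c][c];
            for (int j = 0; j < 9; ++j) A[r][j] -= f * A[c][j];
        }
    }
    for (int i = 0; i < 8; ++i) H[i] = A[i][8] / A[i][i];
    H[8] = 1.0;
    return true;
}

// Apply H to (x, y) with the homogeneous divide.
void apply_homography(const double H[9], double x, double y, double* X, double* Y) {
    double w = H[6] * x + H[7] * y + H[8];
    *X = (H[0] * x + H[1] * y + H[2]) / w;
    *Y = (H[3] * x + H[4] * y + H[5]) / w;
}
```

With more than four (noisy) correspondences you stack all the rows and minimize algebraic error via the SVD, which is exactly what Blepo's HomographyFit and OpenCV's cvFindHomography do internally.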
- HW#5 (Homographies and Fundamental matrices)
- In this assignment, you will practice mapping image coordinates to metric
coordinates in a world plane. There are three parts:
- Write code to compute the homography between
this image and the ground plane, given the fact that the squares on the
floor are 16 inches on each side. Unwarp the image to a top-down / bird's
eye view, and provide an interface so that when a user clicks on two points in
the original image, the following is displayed: the two points in the
original image, the two points in the unwarped image connected by a line, and
the length (in inches) in a popup (modal) dialog box -- hint:
use AfxMessageBox(). It is okay to allow the user to click only a single
pair of points.
- Write an application that computes the fundamental matrix between a pair of
uncalibrated stereo images using the normalized Eight-point algorithm.
Then, the application should display the two images and allow the user to
repeatedly click on a point in the first image. When a point is clicked,
then the two epipolar lines (which come from the fundamental matrix) associated with that point should
be displayed. Allow at least 5 clicks. The application only needs to
work with these two images: burgher1_small.jpg and
burgher2_small.jpg, and you may hardcode
these correspondences (or your own) in
your program to simplify the problem.
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion.
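The "normalized" part of the normalized eight-point algorithm (and of the normalized DLT in HW4) is the Hartley point conditioning: translate the points so their centroid is at the origin, then scale so the average distance from the origin is sqrt(2). A C++ sketch follows; the Pt struct is illustrative, and the code assumes the points are not all coincident:

```cpp
#include <vector>
#include <cmath>
#include <cassert>

struct Pt { double x, y; };

// Hartley normalization: overwrite pts with the conditioned points
// and return the 3x3 similarity transform T (row-major) that maps
// original homogeneous points to the normalized ones.
void normalize_points(std::vector<Pt>& pts, double T[9]) {
    double cx = 0, cy = 0;
    for (size_t i = 0; i < pts.size(); ++i) { cx += pts[i].x; cy += pts[i].y; }
    cx /= pts.size(); cy /= pts.size();
    double mean_dist = 0;
    for (size_t i = 0; i < pts.size(); ++i)
        mean_dist += std::sqrt((pts[i].x - cx) * (pts[i].x - cx) +
                               (pts[i].y - cy) * (pts[i].y - cy));
    mean_dist /= pts.size();
    double s = std::sqrt(2.0) / mean_dist;  // target mean distance: sqrt(2)
    for (size_t i = 0; i < pts.size(); ++i) {
        pts[i].x = s * (pts[i].x - cx);
        pts[i].y = s * (pts[i].y - cy);
    }
    T[0] = s; T[1] = 0; T[2] = -s * cx;
    T[3] = 0; T[4] = s; T[5] = -s * cy;
    T[6] = 0; T[7] = 0; T[8] = 1;
}
```

After estimating F (or H) from the normalized points, remember to denormalize: for the fundamental matrix, F = T2^T * Fn * T1, where T1 and T2 are the transforms for the two images.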
- HW#6 (Tomasi-Kanade factorization SFM)
- Implement the Tomasi-Kanade factorization structure-from-motion (SFM)
algorithm. You are free to use KLT or other Lucas-Kanade code to track
features, or you may establish correspondence manually. Your program
should output a PLY file that can be read by MeshLab.
- Test your implementation on one or more of the following image sequences:
cube_sequence.zip,
chalk_sequence.zip,
fernow_sequence.zip,
riggs_corner_sequence.zip, holmes_sequence.zip
. You are welcome to remove frames from any sequence if you need to.
2D point locations for the cube (61 points, 17 frames) are
here.
- Hint: The SVD of the transpose is the transpose of the SVD: if A = U S V^T, then A^T = V S^T U^T.
- Write a report describing your approach, including the algorithm,
methodology, experimental results, and discussion.
Grading standard:
- A. Report is coherent, concise, clear, and neat, with correct
grammar and punctuation. Code works correctly the first time and
achieves good results on both images.
- B. Report adequately
describes the work done, and code generally produces good results. There
are a small number of defects either in the implementation or the writeup, but
the essential components are there.
- C. Report or code is
inadequate. The report contains major errors or is illegible, the code
does not run or produces significantly flawed results, or instructions are
not followed.
- D or F. Report or code not attempted, not turned
in, or contains extremely serious deficiencies.
Detailed grading breakdown is available in the grading chart.
In your final project, you will investigate some area of image processing or computer vision in more detail. Typically
this will involve formulating a problem, reading the literature, proposing a solution, implementing the solution
(using the programming language/environment of your choice),
evaluating the results, and communicating your findings. In the case of a survey project, the quality and depth of
the literature review should be increased significantly to compensate for the lack of implementation.
Project deadlines:
- 4/04: team (1 or 2 people), title, and brief description
- 4/20: progress report (1 page)
- 5/03: final written report (5 pages)
- 5/04: final oral presentation in class during final exam slot, 3:00-5:30
To turn in your report, please send me a single email per group (do not email
the assign server) with two attachments:
- PDF file containing your 5-page report, conference format (title, authors,
abstract, introduction, method, experimental results, conclusion, references)
- PPT file containing your slides
Both files should have the same name, which should correspond somehow to your
topic. Use underscores instead of spaces. Do not send PPTX files. Example:
face_detection.pdf and face_detection.ppt. You do *not* need to send me your
code (although you may if you like).
Projects from previous years
Instructor: Stan Birchfield, 209 Riggs Hall, 656-5912, email: stb at clemson
Grader: TBD
Lectures: 1:25 - 2:15 MWF, 223 Riggs Hall