ECE 847 Digital Image Processing

Fall 2008

This course introduces students to the basic concepts, issues, and algorithms in digital image processing and computer vision. Topics include image formation, projective geometry, convolution, Fourier analysis and other transforms, pixel-based processing, segmentation, texture, detection, stereo, and motion. The goal is to equip students with the skills and tools needed to manipulate images, along with an appreciation for the difficulty of the problems. Students will implement several standard algorithms, evaluate the strengths and weaknesses of various approaches, and explore a topic of their own choosing in a course project.

Syllabus

Schedule

Week	Topic	Assignment
1	Pixel-based processing	HW1: Warm-up, due 8/29
2	Pixel-based processing	Quiz #1, 9/5
3	Filters and edge detection	HW2: Pixels and regions, due 9/12
4	Filters and edge detection	Quiz #2, 9/19
5	Segmentation	HW3: Edge detection, due 9/26
6	Segmentation	Quiz #3, 10/3
7	Stereo	HW4: Segmentation, due 10/10
8	Stereo	Quiz #4, 10/17
9	Motion	HW5: Stereo matching, due 10/24
10	Motion	Quiz #5, 10/31
11	Image formation	HW6: Lucas-Kanade tracking, due 11/7
12	Projective geometry	Quiz #6, 11/14
13	Projective geometry
14	Color	Quiz #7, 12/5
15	Color	projects due

Readings and Resources

Readings to complement the lectures:

Sonka et al., Region-based shape representation and description
Robyn Owens, Mathematical morphology (dilation and erosion)
R. Fisher et al., Connected components
Bill Green, Canny edge detection tutorial
Bob Fisher et al., Canny edge detector
Michael Bach, Muller-Lyer illusion
Various authors, Split-and-merge segmentation
Serge Beucher, Watershed segmentation; Roerdink and Meijster, The watershed transform; Matlab, watershed tutorial
Sylvain Bougnoux, Learning epipolar geometry
Nikos Paragios, Level set tutorial; J. Sethian's level set page
R. Wang, various lectures
Adobe TIFF specification document (color spaces and JPEG)
AIM-DP (color spaces)
Amara Graps, Introduction to Wavelets

Computer vision in the news:

Help organizing your digital photos, CBS News, Feb. 9, 2006 (Riya)
'Silent drowning' pool girl saved by underwater cameras, Times Online, Aug. 31, 2005
Courtrooms could host virtual crime scenes, New Scientist.com, March 10, 2005
Sportvision virtual first-down markers
Basketball buddies build a computerized shot doctor, USA Today, Feb. 7, 2003 (Noah Basketball)
Automotive applications:
- Infiniti advanced lane departure warning system
- Infiniti Around View Monitor, Nissan Around View Monitor, 2007
- Chrysler automobile uses CMOS cameras for smart headlights, IEEE Spectrum, Apr. 2006 (Gentex SmartBeam)
- Lexus uses computer vision for automatic parallel parking, IEEE Spectrum, Apr. 2006 (Intelligent Parking Assist)
- Electronic vision unblocks the 'blind spot', IEEE Spectrum, Apr. 2006 (Volvo's Blind-Spot Information System)
- Car, park thyself (Toyota's automatic parking feature), CBS News, Jan. 15, 2003
Content-Aware Image Sizing

Vision in biological systems:

P. Gurney, Is our 'inverted' retina really 'bad design'?, Technical Journal, 1999
C. Wieland, Seeing back to front, Creation, 1996 (see also An eye for creation, Creation, 1996)
J. Sarfati, Can it bee?, Creation, 2003 -- honeybees using optic flow for navigation
Centeye -- obstacle avoidance using optic flow
C. Stammers, Trilobite technology, Creation, 1993
S. M. Gon, The trilobite eye
J. Sarfati, Lobster eyes: brilliant geometric design, Creation, 2001
Sight in British garden birds
Color vision in birds
P. Gurney, Our eye movements and their control: Part 1, Technical Journal, 2002
P. Gurney, Our eye movements and their control: Part 2, Technical Journal, 2003
C. Wieland, New eyes for blind cave fish, 2000
T. Wagner, Darwin vs. the eye, Creation, 1994
D. E. Stoltzmann, The specified complexity of retinal imagery, CRSQ, 43(1):4-12, June 2006
Eye Design Book -- overview of eyes in animal world
Human visual system:
- Change detection, Visual Cognition Lab, Univ. of Illinois
- Change blindness and inattentional blindness, Visual Cognition Lab, Univ. of Illinois
  - Basketball video -- Can you count the number of times the basketball is passed?

Computer vision companies:

Object Video -- Vistascape -- ioimage -- NICE systems -- Cognex -- Ojos -- Digital Persona -- TZYX --
Organic Motion -- Avid Technology -- Mobileye -- Vision Robotics -- Sportvision
more ... (an extensive list from David Lowe)

Software:

Microsoft Visual Studio Service Pack 6 download
Visual C++ quick guide
Irfanview -- free image viewer
Common VC++ 6.0 problems:
- Installation:
  - Program not available in computer labs starting Fall 2008.
  - To install on Vista, right-click on the file and select "Run as Administrator". Do this even if you are already Administrator.
  - Installation says that it is unsuccessful. Solution: Ignore the warning; sometimes it will still work just fine.
  - During installation, message that you need to update Java library then reboot. This was seen on a machine borrowed from the library. No known solution.
- Running VC++:
  - On Vista, warning says that it is not compatible. Solution: Ignore the warning.
- Compiling:
  - To get rid of warnings, #pragma warning ( disable : 4786 )
- Linking:
  - Program won't link. Make sure under Project Settings that 'Use MFC in shared DLL' is checked.
  - On Vista, program won't link. Cannot create project.exe. By default, VC++ 6.0 puts projects in a directory like C:/Program Files/Microsoft Visual Studio/MyProjects/. But on Vista, when you try to navigate to this directory, it does not exist. How VC++ can create a project that can compile (even though it does not link) in a directory that does not exist is beyond me. But the solution is simple: Create a new project in a directory that does exist.
- Running your program:
  - cannot find .dll files; maybe you put a space in your path by accident?

Additional computer vision resources

Resources for current students (restricted access, not open to the public)

Assignments

In the assignments, you will implement several fundamental algorithms in C/C++, documenting your findings in an accompanying report for each assignment. C/C++ is chosen for its fundamental importance, ubiquity, and efficiency (which is crucial to image processing and computer vision). For your convenience, you are encouraged to use the latest version of the Blepo computer vision library.

Your code must compile under VC++ 6.0. To make grading easier, your code should do one of the following:

#include "blepo.h" (In this case it does not matter where your blepo directory is, because the grader can simply change the directory include settings (Tools->Options->Directories->Include files) for Visual Studio to automatically find the header file.)
or
#include "../blepo/src/blepo.h" (assuming your main file is directly inside your directory). In other words, your assignment directory should be at the same level as the blepo directory.

To turn in your assignment, send an email to assign@assign.ece.clemson.edu (and cc the instructor and grader) with the subject line "ECE847-1,#n" (without quotes but with the # sign), where 'n' is the assignment number. You may leave the body of the email blank. Attach a zip file containing your report (in any standard format such as .pdf or .doc, but not .docx) and all the files needed to compile your project (such as *.h, *.c, *.cpp, *.rc, *.dsp, *.dsw; do not include *.ncb, *.opt, *.plg, *.aps, or the res, Debug, or Release directories). You must send this email from your Clemson account, because the assign server is not smart enough to know who you are if you use another account. (For example, do not use @g.clemson.edu.) Be sure that this file is actually attached to the email rather than being automatically included in the body of the email (Eudora, for example, has been known to include files inline, but this behavior can be turned off). Also, be sure to change the extension of your zip file (e.g., change .zip to _zip) so that the server does not block the attachment! We cannot grade what we do not receive. (Also be sure that you're not hiding extensions for known types; in Windows Explorer, uncheck the box Tools -> Folder Options -> View -> "Hide extensions for known file types".)

All assignments are due at 11:59pm on the due date shown. An 8-hour grace period is extended, so that no points will be deducted for anything submitted before 8:00am the next morning.

In addition to submitting your report electronically, please also turn in a hardcopy. The deadline for the electronic copy is the same as for the code, whereas the hardcopies should be brought to the instructor by noon of the next business day after the deadline (at the latest). Just slip it under the door if I'm not in. No points will be deducted for printing in black-and-white, even if the report is in color. An example report

Assignments:

HW#1 (Floodfill)
- Implement the floodfill algorithm in C/C++. Create an executable that allows the user to choose the filename and seed point; it is okay if you hardcode the new color. The application should load the image from disk, display the original image, run the algorithm, and display the resulting image. (The specific interface is up to you: Either use command-line parameters, such as: filename x y (in that order), where 'filename' is the image filename and (x,y) are the coordinates of the seed point; Or use a windows-based interface, such as CFileDialog for selecting the file and GrabMouseClick for getting the seed point.)
  - To create a console app in Visual C++, follow these instructions: File -> New -> Project -> Win32 Console Application. Give it a name and keep the checkbox on "Create new workspace". Choose "An application that supports MFC." Now compile and run (Build -> Build ..., and Build -> Execute, or F7 and Ctrl-F5). Under FileView -> Source Files you will find the main cpp file. (Also, I would recommend that you turn off Precompiled Headers: Project -> Settings -> C/C++ -> Precompiled headers -> Not using precompiled headers. Before you click on the radio button, though, first select All configurations in the drop down box so that both Debug and Release versions are affected.)
- The image that the grader will use to test your code is quantized.pgm and another image that is similar.
- It is okay if your code only works for grayscale images (converting color images to grayscale).
- A tutorial on the Blepo library will be given in class. You may use any part of the library except the Floodfill function itself.
- No report is due for this assignment.
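For orientation, here is a minimal sketch of one way to structure a queue-based flood fill. It assumes 4-connectivity and a grayscale image stored row-major in a std::vector; the GrayImage struct and the FloodFill signature below are hypothetical and are not Blepo's image classes or its Floodfill function.

```cpp
#include <cassert>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical minimal image type (not a Blepo class): row-major 8-bit pixels.
struct GrayImage {
  int width;
  int height;
  std::vector<unsigned char> pixels;
  unsigned char& at(int x, int y) { return pixels[y * width + x]; }
};

// Recolor the 4-connected region of constant graylevel that contains the seed.
// Returns the number of pixels that were changed.
int FloodFill(GrayImage& img, int seed_x, int seed_y, unsigned char new_color) {
  assert(seed_x >= 0 && seed_x < img.width && seed_y >= 0 && seed_y < img.height);
  const unsigned char old_color = img.at(seed_x, seed_y);
  if (old_color == new_color) return 0;  // nothing to do; also avoids an endless loop

  std::queue<std::pair<int, int> > frontier;
  img.at(seed_x, seed_y) = new_color;
  frontier.push(std::make_pair(seed_x, seed_y));
  int changed = 1;

  const int dx[4] = {1, -1, 0, 0};
  const int dy[4] = {0, 0, 1, -1};
  while (!frontier.empty()) {
    const std::pair<int, int> p = frontier.front();
    frontier.pop();
    for (int k = 0; k < 4; ++k) {
      const int x = p.first + dx[k];
      const int y = p.second + dy[k];
      if (x < 0 || x >= img.width || y < 0 || y >= img.height) continue;
      if (img.at(x, y) != old_color) continue;
      img.at(x, y) = new_color;  // recolor when queued so each pixel is queued once
      frontier.push(std::make_pair(x, y));
      ++changed;
    }
  }
  return changed;
}
```

An explicit queue is used instead of recursion so that large regions cannot overflow the call stack.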
HW#2 (Fruit classification)
- Write code to automatically detect and classify fruit on a dark background.
  - Use double graylevel thresholding to count and detect the foreground regions of the image, distinguishing them from the background.
  - Print the properties of each foreground region, including
    - zeroth-, first- and second-order moments (regular and centralized)
    - compactness
    - eccentricity (or elongatedness), using eigenvalues
    - direction, using either eigenvectors (PCA) or the moments formula
  - Classify the pieces of fruit using a combination of these properties or others that you develop. Also detect the banana stem.
- The grader will test your code on the images fruit1.pgm and fruit2.pgm (or, in BMP format, fruit1.bmp and fruit2.bmp), along with other similar images. The same algorithm parameters should be used for all objects and for both images.
- For this assignment, you may not use any Blepo functionality contained or prototyped in ImageAlgorithms.h, with the one exception of ConnectedComponents.
- Write a report describing your approach, including your algorithms and methodology, experimental results, and discussion.
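The moment-based properties listed above can be sketched as follows for a single region given as a binary mask. The RegionProps struct and ComputeRegionProps function are hypothetical names, not Blepo types. Eccentricity is taken here as sqrt(1 - lambda_min / lambda_max), which is only one of several definitions in use, and the direction uses the moments formula 0.5 * atan2(2 * mu11, mu20 - mu02). Compactness is omitted because it also requires the region perimeter.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical container for the properties of one region (not a Blepo type).
struct RegionProps {
  double m00, m10, m01;           // zeroth- and first-order regular moments
  double cx, cy;                  // centroid
  double mu20, mu02, mu11;        // second-order central moments
  double lambda_max, lambda_min;  // eigenvalues of [mu20 mu11; mu11 mu02]
  double eccentricity;            // assumed definition: sqrt(1 - lambda_min / lambda_max)
  double direction;               // radians, from the moments formula
};

// mask is row-major; nonzero entries belong to the region.
RegionProps ComputeRegionProps(const std::vector<unsigned char>& mask,
                               int width, int height) {
  assert(static_cast<int>(mask.size()) == width * height);
  RegionProps r = RegionProps();  // zero-initialize all fields

  // First pass: area and centroid.
  for (int y = 0; y < height; ++y)
    for (int x = 0; x < width; ++x)
      if (mask[y * width + x]) { r.m00 += 1.0; r.m10 += x; r.m01 += y; }
  assert(r.m00 > 0.0);  // the region must not be empty
  r.cx = r.m10 / r.m00;
  r.cy = r.m01 / r.m00;

  // Second pass: central moments about the centroid.
  for (int y = 0; y < height; ++y)
    for (int x = 0; x < width; ++x)
      if (mask[y * width + x]) {
        const double dx = x - r.cx;
        const double dy = y - r.cy;
        r.mu20 += dx * dx;
        r.mu02 += dy * dy;
        r.mu11 += dx * dy;
      }

  // Closed-form eigenvalues of the symmetric 2x2 matrix.
  const double half_trace = 0.5 * (r.mu20 + r.mu02);
  const double diff = r.mu20 - r.mu02;
  const double root = std::sqrt(0.25 * diff * diff + r.mu11 * r.mu11);
  r.lambda_max = half_trace + root;
  r.lambda_min = half_trace - root;
  r.eccentricity =
      r.lambda_max > 0.0 ? std::sqrt(1.0 - r.lambda_min / r.lambda_max) : 0.0;
  r.direction = 0.5 * std::atan2(2.0 * r.mu11, diff);
  return r;
}
```

In practice the mask for each region would come from the connected components labeling, with one call per label.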
HW#3 (Canny edge detection)
- Implement the Canny edge detector. There should be three steps to your code: gradient estimation, non-maximum suppression, and thresholding (with hysteresis). For the gradient estimation, convolve the image with the derivative of a Gaussian, rather than computing finite differences in the smoothed image. Automatically compute the threshold values based upon image statistics. Run your code on the following images: cat.pgm and cameraman.pgm. Display intermediate results (e.g., the two x- and y- gradient components, the gradient magnitude and angle, and the edges before thresholding) in separate figures, in addition to the final result.
- Implement the chamfer distance algorithm with the Manhattan distance. Compute the chamfer distance of the edges of the cherrypepsi.jpg image, then perform an exhaustive search (for simplicity, only consider locations for which the template is completely in bounds) for the best location of the cherrypepsi_template.jpg template. Convert from color to grayscale before computing the edges. Display the resulting probability map by summing the distances to the edges, and (in a separate window) overlay on the original image the rectangle corresponding to the peak.
- For this assignment, you may not use any Blepo functionality contained or prototyped in ImageAlgorithms.h, and you may not use the Gauss* or Gradient* functions prototyped in ImageOperations.h.
- Write a report describing your approach, including your algorithms and methodology, experimental results, and discussion. Be sure to show the effect of the scale parameter on the output for at least one image.
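As a rough illustration of the chamfer step, the sketch below computes the Manhattan distance from every pixel to the nearest edge pixel using the usual two-pass scan. The function name ChamferManhattan and the row-major std::vector representation are assumptions for the example, not Blepo's interface. The template search itself (summing these distances under the template's edge pixels at each location) is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// edges is row-major; nonzero entries are edge pixels.
// Returns, for every pixel, the Manhattan distance to the nearest edge pixel.
std::vector<int> ChamferManhattan(const std::vector<unsigned char>& edges,
                                  int width, int height) {
  assert(static_cast<int>(edges.size()) == width * height);
  // Larger than any Manhattan distance that can occur inside the image.
  const int kInf = width + height;
  std::vector<int> dist(edges.size());
  for (size_t i = 0; i < edges.size(); ++i) dist[i] = edges[i] ? 0 : kInf;

  // Forward pass: propagate distances from the left and upper neighbors.
  for (int y = 0; y < height; ++y)
    for (int x = 0; x < width; ++x) {
      int& d = dist[y * width + x];
      if (x > 0) d = std::min(d, dist[y * width + (x - 1)] + 1);
      if (y > 0) d = std::min(d, dist[(y - 1) * width + x] + 1);
    }

  // Backward pass: propagate distances from the right and lower neighbors.
  for (int y = height - 1; y >= 0; --y)
    for (int x = width - 1; x >= 0; --x) {
      int& d = dist[y * width + x];
      if (x < width - 1) d = std::min(d, dist[y * width + (x + 1)] + 1);
      if (y < height - 1) d = std::min(d, dist[(y + 1) * width + x] + 1);
    }
  return dist;
}
```

If the edge map is empty, every entry stays at or above kInf, which a caller can treat as "no edge found".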
HW#4 (Watershed segmentation)
- Implement the simplified Vincent-Soille marker-based watershed segmentation algorithm. The basic algorithm involves three steps: (1) Compute the magnitude of the image gradient, quantized; (2) Construct a data structure allowing fast access to all the pixels with a certain value; (3) Apply breadth-first search to flood the pixels one graylevel at a time, starting with the minimum value, assigning each pixel to either the nearest existing catchment basin or to a new catchment basin. Define the watershed pixels as those which occur at a transition between basins. The marker-based extension should use the steps presented in class to reduce oversegmentation.
- Run your code on the following images: holes.pgm and cells_small.pgm. Display the result of the algorithm at the various stages of the computation. (Due to the difficulty of thresholding these images, it is okay for your code to have a command-line (or Windows-based) switch to select two different variations of your algorithm.)
- For this assignment, you may not use any Blepo code in Watershed.cpp.
- Write a report describing your approach, including your algorithms and methodology, experimental results, and discussion.
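Step (2) above, the data structure giving fast access to all pixels of a given value, can be sketched with a counting sort over the 256 graylevels. The LevelBuckets struct and BuildLevelBuckets function are hypothetical names chosen for this example and are not taken from Blepo's Watershed.cpp. The flooding itself is not shown.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not a Blepo type): pixel indices grouped by graylevel.
struct LevelBuckets {
  std::vector<int> start;   // size 257; level g occupies pixels[start[g] .. start[g+1]-1]
  std::vector<int> pixels;  // row-major pixel indices, sorted by graylevel
};

// grad_mag holds the quantized gradient magnitude (0..255) of each pixel.
LevelBuckets BuildLevelBuckets(const std::vector<unsigned char>& grad_mag) {
  const int n = static_cast<int>(grad_mag.size());
  LevelBuckets b;
  b.start.assign(257, 0);

  // Histogram, shifted by one so that the prefix sum yields start offsets.
  for (int i = 0; i < n; ++i) ++b.start[grad_mag[i] + 1];
  for (int g = 0; g < 256; ++g) b.start[g + 1] += b.start[g];
  assert(b.start[256] == n);  // every pixel is accounted for

  // Scatter the pixel indices into their buckets, preserving scan order.
  b.pixels.resize(n);
  std::vector<int> next(b.start.begin(), b.start.begin() + 256);
  for (int i = 0; i < n; ++i) b.pixels[next[grad_mag[i]]++] = i;
  return b;
}
```

The flooding loop can then visit graylevel g by iterating over pixels[start[g]] through pixels[start[g+1]-1], without rescanning the image.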
HW#5 (Stereo matching)
- Write an application that displays two images and allows the user to specify n pairs of corresponding points by clicking in one image, then clicking in the other image. After the n pairs have been specified, compute and display the pencil of epipolar lines in both images. The intersections of the lines should yield the epipoles. Test this part of the assignment with n=20 and the following images: burgher1_small.jpg and burgher2_small.jpg.
  Modification: Instead of having the user click on the corresponding points, use these correspondences hardcoded in your program to compute the fundamental matrix and epipolar lines. But instead of displaying the pencil of epipolar lines together, you should allow the user to repeatedly click on a point in the first image. When a point is clicked, then the two epipolar lines associated with that point should be displayed. Allow at least 5 clicks.
  Hint: The resulting fundamental matrix should be approximately
  [ -0.0000   0.0002  -0.0231 ]
  [ -0.0002   *.****   *.**** ]
  [  0.0256  -*.****  -0.9973 ]
  where some of the values have been hidden so as not to give away the answer completely.
- Implement correlation-based matching of rectified stereo images. The resulting disparity map should be the same size as the two input images, although the values at the left edge will be erroneous. Match from left to right (i.e., for each window in the left image, search in the right image), so that the disparity map is with respect to the left image. Recall that a (left) disparity map D(x,y) between a left image L and a right image R that have been rectified is an array such that the pixel corresponding to L(x,y) is R(x-D(x,y), y).
  - Implement the left-to-right consistency check, retaining a value in the left disparity map only if the corresponding point in the right disparity map yields the negative of that disparity. The resulting disparity map should be valid only at the pixels that pass the consistency check; set other pixels to zero.
  - Your code should be as efficient as possible, on the order of several frames per second. (Hint: First compute the dissimilarities of all the pixels for each disparity, storing the results in an array of images; then convolve each image with a summing kernel (all ones) in both directions. Further speedup can be obtained using mmx_diff and xmm_diff in Blepo, but this is not required.)
  - Suggestion: use SAD (sum of absolute differences) to match raw intensities and use a window size of 5x5.
  - Run your code on tsukuba_left.pgm and tsukuba_right.pgm. Show the results both with and without the consistency check. What kind of errors do you notice? Now run the algorithm on lamp_left.pgm and lamp_right.pgm. What happens? Why is this image difficult?
- Take a look at the results of the latest stereo research at http://vision.middlebury.edu/stereo (click on the "Evaluation" tab). Look only at the "all" column under the Tsukuba heading. What errors do you see in the best algorithm (the one with minimum error in this column)? What does this tell you about the difficulty of the problem?
- Write a report describing your approach, including your algorithm and methodology, experimental results, and discussion.
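To make the disparity convention and the consistency check concrete, here is a deliberately slow brute-force sketch. It does not use the summing-kernel speedup from the hint, so it will not reach several frames per second. The names MatchSAD, ConsistencyCheck, and PixelClamped are hypothetical, and clamping coordinates at the image border is an assumption of this example. The right disparity map stores negative values so that a consistent pair satisfies right(x - d, y) == -d, as described above.

```cpp
#include <cassert>
#include <climits>
#include <cstdlib>
#include <vector>

typedef std::vector<unsigned char> Pixels;  // row-major 8-bit image

// Read a pixel, clamping coordinates to the image border (an assumption here).
static int PixelClamped(const Pixels& img, int w, int h, int x, int y) {
  x = x < 0 ? 0 : (x >= w ? w - 1 : x);
  y = y < 0 ? 0 : (y >= h ? h - 1 : y);
  return img[y * w + x];
}

// Signed disparity map of 'ref' matched against 'other' using SAD.
// sign = +1: ref is the left image; ref(x,y) matches other(x - d, y); stores d.
// sign = -1: ref is the right image; ref(x,y) matches other(x + d, y); stores -d.
std::vector<int> MatchSAD(const Pixels& ref, const Pixels& other, int w, int h,
                          int max_disp, int half_win, int sign) {
  assert(sign == 1 || sign == -1);
  std::vector<int> disp(w * h, 0);
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x) {
      int best_cost = INT_MAX;
      int best_d = 0;
      for (int d = 0; d <= max_disp; ++d) {
        const int xo = x - sign * d;
        if (xo < 0 || xo >= w) continue;  // candidate lies outside the other image
        int cost = 0;
        for (int dy = -half_win; dy <= half_win; ++dy)
          for (int dx = -half_win; dx <= half_win; ++dx)
            cost += std::abs(PixelClamped(ref, w, h, x + dx, y + dy) -
                             PixelClamped(other, w, h, xo + dx, y + dy));
        if (cost < best_cost) { best_cost = cost; best_d = d; }
      }
      disp[y * w + x] = sign * best_d;
    }
  return disp;
}

// Keep a left disparity only if the right map agrees; other pixels are set to zero.
std::vector<int> ConsistencyCheck(const std::vector<int>& left_disp,
                                  const std::vector<int>& right_disp,
                                  int w, int h) {
  std::vector<int> out(w * h, 0);
  for (int y = 0; y < h; ++y)
    for (int x = 0; x < w; ++x) {
      const int d = left_disp[y * w + x];
      const int xr = x - d;  // corresponding column in the right image
      if (xr >= 0 && xr < w && right_disp[y * w + xr] == -d) out[y * w + x] = d;
    }
  return out;
}
```

A 5x5 window corresponds to half_win = 2. Note that after the check a zero in the output is ambiguous between "disparity zero" and "failed the check", which follows the assignment's instruction to set rejected pixels to zero.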
HW#6 (Lucas-Kanade)
- Implement Lucas-Kanade feature point detection and tracking.
  - Detection. For each pixel in a graylevel image, construct the 2x2 covariance matrix of the gradients in the 5x5 window surrounding the pixel. Then compute the minimum eigenvalue of the gradient covariance matrix for each pixel. Detect the n most salient features, separated from each other by a distance of at least k pixels, where n=100 and k=8.
  - Tracking. For each feature, track its location from one image frame to the next by iteratively solving the Lucas-Kanade equation Zd=e, where Z is the 2x2 gradient covariance matrix and e is the 2x1 vector of gradients multiplied by the temporal derivative. Display a movie of the original images with features overlaid.
- Run your code on the following image sequences: flowergarden.zip and statue_sequence.zip, overlaying the features on the original images.
- For this assignment you may not use any of the Lucas-Kanade or KLT implementations in Blepo, or any other existing implementations of Lucas-Kanade.
- Write a report describing your approach, including your algorithm and methodology, experimental results, and discussion.

Grading standard:

A. Report is coherent, concise, clear, and neat, with correct grammar and punctuation. Code works correctly the first time and achieves good results on both images.
B. Report adequately describes the work done, and code generally produces good results. There are a small number of defects either in the implementation or the writeup, but the essential components are there.
C. Report or code is inadequate. The report contains major errors or is illegible, the code does not run or produces significantly flawed results, or instructions are not followed.
D or F. Report or code not attempted, not turned in, or contains extremely serious deficiencies.

Detailed grading breakdown is available in the grading chart.

Extra credit: Contributions to the Blepo computer vision library will earn up to 10 points extra credit on your final grade. In general, you should expect 1 point for a major bug fix, and 2-7 points for a significant extension to an existing function or implementation of an algorithm or set of functions. Contributions should be cleanly written, with code-level and user-level documentation, and a test harness. To receive extra credit, you must meet the following deadlines:

announce (non-binding) intention to contribute (10/17)
get interface approval (10/31)
turn in final code and documentation (12/5)

Projects

In your final project, you will investigate some area of image processing or computer vision in more detail. Typically this will involve formulating a problem, reading the literature, proposing a solution, implementing the solution (using the programming language/environment of your choice), evaluating the results, and communicating your findings. In the case of a survey project, the quality and depth of the literature review should be increased significantly to compensate for the lack of implementation.

Project deadlines:

10/31: team (1 or 2 people), title, and brief description
11/21: progress report (1 page)
12/8: final oral presentation in class during final exam slot, 8:00-10:30
12/10: final written report (up to 5 pages)

To turn in your report, please send me a single email per group (do not email the assign server) with two attachments:

PDF file containing your 5-page report, conference format (title, authors, abstract, introduction, method, experimental results, conclusion, references)
PPT file containing your slides

Both files should have the same name, which should correspond somehow to your topic. Use underscores instead of spaces. Do not send PPTX files. Example: face_detection.pdf and face_detection.ppt. You do *not* need to send me your code (although you may if you like).

Projects from previous years

Administrivia

Instructor: Stan Birchfield, 207-A Riggs Hall, 656-5912, email: stb at clemson
Office hours: 2:00-4:00pm, F, or by appointment
Grader: Zhichao Chen, 017 Riggs Hall, zhichac at clemson
Lectures: 12:20 - 1:10 MWF, 223 Riggs Hall
