ECE 847 Digital Image Processing

Fall 2012

This course introduces students to the basic concepts, issues, and algorithms in digital image processing and computer vision. Topics include image formation, projective geometry, convolution, Fourier analysis and other transforms, pixel-based processing, segmentation, texture, detection, stereo, and motion. The goal is to equip students with the skills and tools needed to manipulate images, along with an appreciation for the difficulty of the problems. Students will implement several standard algorithms, evaluate the strengths and weaknesses of various approaches, and explore a topic of their own choosing in a course project.

Syllabus

Schedule

Week	Topic	Assignment
1	Pixel-based processing	HW1: Floodfill, due 8/31
2	Pixel-based processing	Quiz #1, 9/7
3	Filters and edge detection	HW2: Pixels and regions, due 9/14
4	Filters and edge detection	Quiz #2, 9/21
5	Segmentation	HW3: Edge detection, due 9/28
6	Segmentation	Quiz #3, 10/5
7	Stereo	HW4: Segmentation, due 10/12
8	Stereo	Quiz #4, 10/19
9	Motion	HW5: Stereo matching, due 10/26
10	Motion	Quiz #5, 11/2
11	Image formation	HW6: Lucas-Kanade tracking, due 11/9
12	Projective geometry	Quiz #6, 11/16
13	Projective geometry	
14	Color	Quiz #7, 12/7
15	Color	projects due

Readings and Resources

Readings to complement the lectures:

Sonka et al., Region-based shape representation and description
Robyn Owens, Mathematical morphology (dilation and erosion)
R. Fisher et al., Connected components
Bill Green, Canny edge detection tutorial
Bob Fisher et al., Canny edge detector
Michael Bach, Muller-Lyer illusion
Various authors, Split-and-merge segmentation
Serge Beucher, Watershed segmentation; Roerdink and Meijster, The watershed transform; Matlab, watershed tutorial
Sylvain Bougnoux, Learning epipolar geometry
Nikos Paragios, Level set tutorial; J. Sethian's level set page
R. Wang, various lectures
Adobe TIFF specification document (color spaces and JPEG)
AIM-DP (color spaces)
Amara Graps, Introduction to Wavelets; wavelet resources

Computer vision in the news:

Help organizing your digital photos, CBS News, Feb. 9, 2006 (Riya)
'Silent drowning' pool girl saved by underwater cameras, Times Online, Aug. 31, 2005
Courtrooms could host virtual crime scenes, New Scientist.com, March 10, 2005
Sportvision virtual first-down markers
Basketball buddies build a computerized shot doctor, USA Today, Feb. 7, 2003 (Noah Basketball)
Automotive applications:
- Infiniti advanced lane departure warning system
- Infiniti Around View Monitor, Nissan Around View Monitor, 2007
- Chrysler automobile uses CMOS cameras for smart headlights, IEEE Spectrum, Apr. 2006 (Gentex SmartBeam)
- Lexus uses computer vision for automatic parallel parking, IEEE Spectrum, Apr. 2006 (Intelligent Parking Assist)
- Electronic vision unblocks the 'blind spot', IEEE Spectrum, Apr. 2006 (Volvo's Blind-Spot Information System)
- Car, park thyself (Toyota's automatic parking feature), CBS News, Jan. 15, 2003
- Mobileye EyeQ2
- Ford's Lane Keeping System
Content-Aware Image Sizing
Fly-Eye Inspired Speed Sensor
Sudoku solver: http://www.codeproject.com/KB/game/WebcamSudokuSolver.aspx

Vision in biological systems:

P. Gurney, Is our 'inverted' retina really 'bad design'?, Technical Journal, 1999
C. Wieland, Seeing back to front, Creation, 1996 (see also An eye for creation, Creation, 1996)
J. Sarfati, Can it bee?, Creation, 2003 -- honeybees using optic flow for navigation
Centeye -- obstacle avoidance using optic flow
C. Stammers, Trilobite technology, Creation, 1993
S. M. Gon, The trilobite eye
J. Sarfati, Lobster eyes: brilliant geometric design, Creation, 2001
Sight in British garden birds
Color vision in birds
P. Gurney, Our eye movements and their control: Part 1, Technical Journal, 2002
P. Gurney, Our eye movements and their control: Part 2, Technical Journal, 2003
C. Wieland, New eyes for blind cave fish, 2000
T. Wagner, Darwin vs. the eye, Creation, 1994
D. E. Stoltzmann, The specified complexity of retinal imagery, CRSQ, 43(1):4-12, June 2006
Eye Design Book -- overview of eyes in animal world
Human visual system:
- Change detection, Visual Cognition Lab, Univ. of Illinois
  - Basketball video -- Can you count the number of times the basketball is passed?

Computer vision companies:

Object Video -- Vistascape -- ioimage -- NICE systems -- Cognex -- Digital Persona -- TZYX --
Organic Motion -- Avid Technology -- Mobileye -- Vision Robotics -- Sportvision -- Kitware -- UVP
more ... (an extensive list from David Lowe)

Software:

Coding resources
Irfanview -- free image viewer

Additional computer vision resources

Resources for current students (restricted access, not open to the public)

Assignments

In the assignments, you will implement several fundamental algorithms in C/C++, documenting your findings in an accompanying report for each assignment. C/C++ is chosen for its fundamental importance, ubiquity, and efficiency (which is crucial to image processing and computer vision). For your convenience, you are encouraged to use the latest version of the Blepo computer vision library.

Your code must compile under Visual Studio 2010 or VC++ 6.0. You should develop your code in Debug mode but test in Release mode before submitting. The grader will test in Release mode. To make grading easier, your code should do one of the following:

#include "blepo.h" (In this case it does not matter where your blepo directory is, because the grader can simply change the directory include settings (Tools->Options->Directories->Include files) for Visual Studio to automatically find the header file.)
or
#include "../blepo/src/blepo.h" (assuming your main file is directly inside your directory). In other words, your assignment directory should be at the same level as the blepo directory. Here is an example:

To turn in your assignment, send an email to assign@assign.ece.clemson.edu . Be sure to do the following:

make the subject line "ECE847-1,#n" (without quotes but with the # sign), where 'n' is the assignment number.
cc the instructor and grader, so we have a record of your submission in case something is wrong with the assign server. We cannot grade what we do not receive.
send this email from your @clemson.edu account, because the assign server is not smart enough to know who you are if you use another account.
- For example, do NOT use @g.clemson.edu. If you are using Gmail, it is not sufficient to change the 'send mail as:' to @clemson.edu. Instead, either send from webmail.clemson.edu or change your Gmail settings as follows:
  - log in to your account through gmail.com (not from Clemson's Google Apps)
  - click on 'Settings -> Settings -> Accounts and Import'. Under 'Send mail as:', select 'Add another email address', type in your userid@clemson.edu, click 'Next step', then select 'Send through clemson.edu SMTP servers', type 'smtp.clemson.edu' along with your userid and password, select 'Secured connection using SSL', then 'Add account'.
attach a zip file containing all the files needed to compile your project. But do NOT check in all the other files that Visual Studio creates automatically. When in doubt, check out your code to a new temporary directory and verify that it compiles and runs. In other words,

Do include files such as .h, .c, .cpp, .rc, .vcxproj, .sln, ... (or .dsp and .dsw if using VC6.0). Also, if you have built an MFC Windows application (as opposed to a console-based application), check in the res directory that contains .ico and .rc2 files.
Do NOT include these files: .aps, .clw, .ncb, .opt, .plg, .suo, .sdf. Also, be sure to delete the Debug and Release and ipch directories.

include your report in the zip file (in any standard format such as .pdf or .doc; but NOT .docx). Reports should be professionally written, with a title, a description of the problem, a description of the algorithm, a detailed discussion of your particular implementation, results, and analysis. An example report. Similarly, code should be professionally and cleanly written, making use of standard programming practices.
the body of the email is not important and may be left blank

All assignments are due at 11:59pm on the due date shown. An 8-hour grace period is extended, so that no points will be deducted for anything submitted before 8:00am the next morning.

Assignments:

HW#1 (Floodfill)
- Implement the floodfill algorithm in C/C++. Create an executable that allows the user to choose the filename and seed point; it is okay if you hardcode the new color. The application should load the image from disk, display the original image, run the algorithm, and display the resulting image. (The specific interface is up to you: either use command-line parameters, such as: filename x y (in that order), where 'filename' is the image filename and (x,y) are the coordinates of the seed point; or use a windows-based interface, such as CFileDialog for selecting the file and GrabMouseClick for getting the seed point.)
  - To create a console app in Visual C++ 6.0, follow these instructions: File -> New -> Project -> Win32 Console Application. Give it a name and keep the checkbox on "Create new workspace". Choose "An application that supports MFC." Now compile and run (Build -> Build ..., and Build -> Execute, or F7 and Ctrl-F5). Under FileView -> Source Files you will find the main cpp file. (Also, I would recommend that you turn off Precompiled Headers: Project -> Settings -> C/C++ -> Precompiled headers -> Not using precompiled headers. Before you click on the radio button, though, first select All configurations in the drop down box so that both Debug and Release versions are affected.)
- The images that the grader will use to test your code are quantized.pgm, tillman.ppm, and others that are similar.
- Your code should work for either grayscale or color images, and it should allow the new value to be a Bgr color (just load the image into ImgBgr, and treat it like a color image).
- For simplicity, use 4-neighbor connectedness (but 8-connected is fine, too, if you want to do a little additional work).
- To make memory management easier, feel free to use std::stack or std::vector.
- A tutorial on the Blepo library will be given in class. You may use any part of the library except the Floodfill function itself.
- No report is due for this assignment.
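As a rough illustration of the stack-based approach described for HW#1, the following is a minimal sketch of 4-connected floodfill that uses std::vector as an explicit stack. The Bgr and Image types and the FloodFill4 name are hypothetical stand-ins invented for this example; they are not Blepo classes, and loading and displaying the image are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not Blepo types.
struct Bgr {
    unsigned char b, g, r;
    bool operator==(const Bgr& other) const {
        return b == other.b && g == other.g && r == other.r;
    }
};

struct Image {
    int width;
    int height;
    std::vector<Bgr> pixels;  // row-major storage, width * height entries

    Image(int w, int h, Bgr fill)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, fill) {}

    Bgr& at(int x, int y) {
        assert(x >= 0 && x < width && y >= 0 && y < height);
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Recolors the 4-connected region that contains the seed and shares the
// seed's original color. Returns the number of pixels that were changed.
int FloodFill4(Image& img, int seed_x, int seed_y, Bgr new_color) {
    if (seed_x < 0 || seed_x >= img.width || seed_y < 0 || seed_y >= img.height) {
        return 0;  // seed lies outside the image
    }
    const Bgr old_color = img.at(seed_x, seed_y);
    if (old_color == new_color) {
        return 0;  // nothing to do; also prevents an endless loop
    }

    std::vector<std::pair<int, int> > frontier;  // used as an explicit stack
    img.at(seed_x, seed_y) = new_color;          // recolor when pushing, so no pixel is pushed twice
    frontier.push_back(std::make_pair(seed_x, seed_y));
    int changed = 1;

    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    while (!frontier.empty()) {
        const std::pair<int, int> p = frontier.back();
        frontier.pop_back();
        for (int k = 0; k < 4; ++k) {
            const int nx = p.first + dx[k];
            const int ny = p.second + dy[k];
            if (nx < 0 || nx >= img.width || ny < 0 || ny >= img.height) continue;
            if (!(img.at(nx, ny) == old_color)) continue;
            img.at(nx, ny) = new_color;
            frontier.push_back(std::make_pair(nx, ny));
            ++changed;
        }
    }
    return changed;
}
```

Recoloring a pixel at the moment it is pushed, rather than when it is popped, keeps the stack from holding the same pixel more than once. Extending the sketch to 8-connectedness would only require adding the four diagonal offsets to dx and dy.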
HW#2 (Fruit classification)
- Write code to automatically detect and classify fruit on a dark background.
  - Implement double thresholding using two thresholds that you determine by trial and error, which are hardcoded in your code.
  - At any point before or during thresholding, perform noise removal (if needed) using your own combination of erosion / dilation / opening / closing.
  - Implement connected components (by repeated applications of floodfill) to detect and count the foreground regions of the graylevel image, distinguishing them from the background. Hint: Use an ImgInt rather than an ImgGray for the output labels, in case there are more than 256 regions due to noise, even if there are only a small number of objects in the image.
  - Compute the properties of each foreground region, including
    - zeroth-, first- and second-order moments (regular and centralized)
    - compactness (To compute the area, simply count the number of pixels. To compute the perimeter, apply the logical XOR to the thresholded image and the result of eroding this image with a 3x3 structuring element of all ones; the number of pixels that are set in the result is the number of 4-connected foreground boundary pixels.)
    - * eccentricity (or elongatedness), using eigenvalues
    - * direction, using either eigenvectors (PCA) or the moments formula (they are equivalent)
  - Using a combination of these properties or others that you develop, write an algorithm to automatically classify each piece of fruit into one of three categories: apple, grapefruit, and banana.
  - * Also detect the banana stem using an idea that you come up with.
- Your output should look like this:
  - One figure window should show the original image. Three additional figures should show the result of thresholding the image with the low and high thresholds, along with the output of double-thresholding. Be sure to set the title of each figure to an appropriate human-readable string that indicates what is being displayed. Feel free to display additional intermediate results in other figures if you like.
- The grader will test your code on the images fruit1.pgm and fruit2.pgm (or, in BMP format, fruit1.bmp and fruit2.bmp), along with other similar images (same scale and lighting conditions, but the image dimensions, rotation, and number of fruit instances may change). The same algorithm parameters should be used for all objects and for both images.
- For this assignment, you may use any Blepo functions in ImageOperations.h, except for the dilation and erosion functions. You may not use any Blepo functionality contained or prototyped in ImageAlgorithms.h.
  - As a debugging strategy, however, you may find it helpful to use various Blepo functions (e.g., dilation, erosion, Floodfill, ConnectedComponents) as stand-ins until you write your own versions.
- No report is due for this assignment.
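The region properties listed for HW#2 can be sketched as follows. This is only an illustrative outline under stated assumptions: the label image is a plain row-major std::vector<int> rather than a Blepo ImgInt, the RegionProperties struct and ComputeRegionProperties function are hypothetical names, and eccentricity is taken to be sqrt(1 - lambda_min / lambda_max), which is one common definition among several. The direction uses the moments formula 0.5 * atan2(2 * mu11, mu20 - mu02).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container for the properties of one labeled region.
struct RegionProperties {
    double m00, m10, m01, m11, m20, m02;  // regular moments
    double xc, yc;                        // centroid
    double mu20, mu02, mu11;              // centralized second-order moments
    double eccentricity;                  // assumed definition: sqrt(1 - lmin / lmax)
    double direction;                     // major-axis angle in radians (y points down)
};

// Computes moments of all pixels whose value equals `label` in a
// row-major label image of the given width and height.
RegionProperties ComputeRegionProperties(const std::vector<int>& labels,
                                         int width, int height, int label) {
    assert(width > 0 && height > 0);
    assert(labels.size() == static_cast<std::size_t>(width) * height);

    RegionProperties p = RegionProperties();  // all fields start at zero
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (labels[static_cast<std::size_t>(y) * width + x] != label) continue;
            p.m00 += 1.0;
            p.m10 += x;
            p.m01 += y;
            p.m11 += static_cast<double>(x) * y;
            p.m20 += static_cast<double>(x) * x;
            p.m02 += static_cast<double>(y) * y;
        }
    }
    if (p.m00 == 0.0) return p;  // empty region: leave everything at zero

    p.xc = p.m10 / p.m00;
    p.yc = p.m01 / p.m00;
    p.mu20 = p.m20 - p.xc * p.m10;
    p.mu02 = p.m02 - p.yc * p.m01;
    p.mu11 = p.m11 - p.xc * p.m01;

    // Eigenvalues of the symmetric matrix [mu20 mu11; mu11 mu02].
    const double trace = p.mu20 + p.mu02;
    const double diff = p.mu20 - p.mu02;
    const double root = std::sqrt(diff * diff + 4.0 * p.mu11 * p.mu11);
    const double lmax = 0.5 * (trace + root);
    double lmin = 0.5 * (trace - root);
    if (lmin < 0.0) lmin = 0.0;  // guard against round-off

    p.eccentricity = (lmax > 0.0) ? std::sqrt(1.0 - lmin / lmax) : 0.0;
    p.direction = 0.5 * std::atan2(2.0 * p.mu11, diff);
    return p;
}
```

With this definition, a thin straight bar has eccentricity close to 1 and a square or disk has eccentricity close to 0, which is one way to separate a banana from an apple or grapefruit. Area (m00) could then help tell the two round fruits apart, although the thresholds would have to be found by experiment.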

Grading standard:

A. Report is coherent, concise, clear, and neat, with correct grammar and punctuation. Code works correctly the first time and achieves good results on both images. All items marked (*) are implemented.
B. Report adequately describes the work done, and code generally produces good results. There are a small number of defects either in the implementation or the writeup, but the essential components are there. Many or all items marked (*) are not implemented.
C. Report or code is inadequate. The report contains major errors or is illegible, the code does not run or produces significantly flawed results, or instructions are not followed.
D or F. Report or code not attempted, not turned in, or contains extremely serious deficiencies.

Detailed grading breakdown is available in the grading chart.

Projects

In your final project, you will investigate some area of image processing or computer vision in more detail. Typically this will involve formulating a problem, reading the literature, proposing a solution, implementing the solution (using the programming language/environment of your choice), evaluating the results, and communicating your findings. In the case of a survey project, the quality and depth of the literature review should be increased significantly to compensate for the lack of implementation.

Project deadlines:

11/2: team (1 or 2 people), title, and brief description
11/23: progress report (1 page)
12/10: final oral presentation in class during final exam slot, 8:00-10:30
12/12: final written report (5 pages)

To turn in your report, please send me a single email per group (do not email the assign server) with two attachments:

PDF file containing your 5-page report, conference format (title, authors, abstract, introduction, method, experimental results, conclusion, references)
PPT file containing your slides

Both files should have the same name, which should correspond somehow to your topic. Use underscores instead of spaces. Do not send PPTX files. Example: face_detection.pdf and face_detection.ppt. You do *not* need to send me your code (although you may if you like).

Projects from previous years

Administrivia

Instructor: Stan Birchfield, 209 Riggs Hall, 656-5912, email: stb at clemson
Office hours: MWF afternoons
Grader: Brian Peasley, bpeasle at clemson
Lectures: 12:20 - 1:10 MWF, 223 Riggs Hall
