Elliptical Head Tracking
Using Intensity Gradients and Color Histograms

Stan Birchfield

An elliptical head tracker has been developed that works as follows. When a new image becomes available, the tracker performs a local search to find the best position and size for the ellipse by maximizing the sum of two terms: one involving the dot product of the image gradient with the ellipse normal, and another involving the normalized histogram intersection between the color histogram of the ellipse's interior and a previously stored, personalized color histogram model. Despite the local search, velocity prediction removes any restriction on maximum lateral image velocity. In real time, the tracker is able to reliably and automatically control the camera's pan, tilt, and zoom in order to keep the subject centered in the field of view at a desired size.
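The two terms described above can be illustrated with a minimal sketch. It assumes an axis-aligned ellipse with semi-axes `a` and `b`, a grayscale image held in a NumPy array, nearest-pixel sampling along the perimeter, and an intersection normalized by the candidate histogram's total count. The function names are hypothetical and the code is not taken from the tracker itself. How the two terms are scaled before being summed is not stated here, so the sketch leaves them separate.

```python
import numpy as np

def histogram_intersection(candidate, model):
    """Sum of bin-wise minima, divided by the candidate's total count (assumed normalization)."""
    candidate = np.asarray(candidate, dtype=float)
    model = np.asarray(model, dtype=float)
    total = candidate.sum()
    if total == 0:
        return 0.0
    return float(np.minimum(candidate, model).sum() / total)

def ellipse_normals(cx, cy, a, b, n_points=64):
    """Sample perimeter points and unit outward normals of an axis-aligned ellipse."""
    t = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
    points = np.stack([cx + a * np.cos(t), cy + b * np.sin(t)], axis=1)
    # The outward normal of x = a cos t, y = b sin t is proportional to (b cos t, a sin t).
    normals = np.stack([b * np.cos(t), a * np.sin(t)], axis=1)
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    return points, normals

def gradient_term(gray, cx, cy, a, b, n_points=64):
    """Mean magnitude of (unit normal . image gradient) around the ellipse perimeter."""
    gray = np.asarray(gray, dtype=float)
    gy, gx = np.gradient(gray)  # axis 0 is y (rows), axis 1 is x (columns)
    points, normals = ellipse_normals(cx, cy, a, b, n_points)
    xs = np.clip(np.rint(points[:, 0]).astype(int), 0, gray.shape[1] - 1)
    ys = np.clip(np.rint(points[:, 1]).astype(int), 0, gray.shape[0] - 1)
    dots = normals[:, 0] * gx[ys, xs] + normals[:, 1] * gy[ys, xs]
    return float(np.abs(dots).mean())
```

On a synthetic image of a bright disk, `gradient_term` is largest when the ellipse lies on the disk's boundary and drops to zero when the ellipse lies entirely inside the uniform interior. `histogram_intersection` returns 1.0 when every candidate bin is covered by the model.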

Compared with previous work, this head tracker is the first system that uses multiple tracking criteria to handle full 360-degree out-of-plane rotation, large scale changes, arbitrary camera movement, and multiple moving people in the background.

Below are some experiments showing the tracker's performance on people with widely different complexions, hair color, amount of hair, head shape, and type and color of shirt. Several of the people are wearing spectacles and one has a beard. Click on any of the images to download its corresponding MPEG file.

The first number in the second column of the table gives the search range; for example, 16x4x1 means +/- 16 pixels in x, +/- 4 pixels in y, and +/- 1 pixel in size. The second number gives the file's playback speed as a percentage of real time (thus, 150% means that the file plays at 1.5 times real-time speed).
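To make the search-range notation concrete, here is a minimal sketch of a local search around a predicted position. It assumes an exhaustive scan over integer offsets and a constant-velocity prediction from the two previous positions. Both are illustrative choices, since the text above says only that the search is local and that velocity prediction is used. The names `predict_center`, `local_search`, and the `score` callable are hypothetical.

```python
from itertools import product

def predict_center(prev, prev_prev):
    """Constant-velocity prediction (assumed): extrapolate the last displacement."""
    return (2 * prev[0] - prev_prev[0], 2 * prev[1] - prev_prev[1])

def local_search(score, predicted, size, search_range=(16, 4, 1)):
    """Score every integer state within +/- search_range of the prediction; keep the best."""
    rx, ry, rs = search_range
    px, py = predicted
    best_state, best_score = None, float("-inf")
    offsets = product(range(-rx, rx + 1), range(-ry, ry + 1), range(-rs, rs + 1))
    for dx, dy, ds in offsets:
        state = (px + dx, py + dy, size + ds)
        if state[2] <= 0:  # skip degenerate ellipse sizes
            continue
        value = score(*state)
        if value > best_score:
            best_state, best_score = state, value
    return best_state, best_score
```

Under this exhaustive-scan assumption, a 16x4x1 range evaluates 33 x 9 x 3 = 891 candidate states per frame, while a 4x4x1 range evaluates 9 x 9 x 3 = 243. Because the window is centered on the predicted position rather than the previous one, a subject moving at steady speed stays inside the window even when the per-frame displacement exceeds the range.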

MPEG video clip | Search range / Speed | Description
1500 KB | 4x4x1 / 150% | An extended sequence showing pan, tilt, and zoom control; occlusion; 360-degree rotation; and head tilting.
1576 KB | 4x4x1 / 150% | Another extended sequence similar to the previous one, also showing the overlapping of two faces. Notice the flesh-colored board in the background.
166 KB | 16x4x1 / 128% | Simultaneous occlusion, rotation, and translation.
347 KB | 4x4x1 / 150% | Severe occlusion of subject by another person.
171 KB | 16x4x1 / 128% | An attempt to distract the tracker with elliptical hands. On the fourth try, the tracker was distracted.
474 KB | 16x4x1 / 128% | Tilting, rotation, and zooming.
838 KB | 16x4x1 / 128% | Three villains unsuccessfully trying to steal the ellipse. Notice the relatively large search range.
396 KB | 4x4x1 / 150% | The tradeoff between the different search ranges. Red ellipse: The tracker originally ran with a 4x4x1 range but lost the subject because he accelerated too quickly. Blue ellipse: Off-line, with an 8x8x1 search range the tracker was not thrown off the subject but was distracted by another person. Green ellipse: Off-line, the tracker was successful with a 16x4x1 range.
121 KB | 16x4x1 / 96% | Jerky, back-and-forth head motion. Since the file plays in real time, it gives some indication of the tracker's actual speed.
146 KB | 16x4x1 / 128% | The subject walking behind a cubicle.
124 KB | 16x4x1 / 128% | Because the histogram model had little hair, the tracker failed when the subject rotated.
775 KB | 16x4x1 / 128% | This time the villains were more successful.
367 KB | 16x4x1 / 128% | More rotation and occlusion.
202 KB | 16x4x1 / 128% | More rotation and distraction.
188 KB | 16x4x1 / 128% | More rotation and distraction.
161 KB | 16x4x1 / 128% | More rotation.
169 KB | 16x4x1 / 128% | More rotation.

Other


This work was conducted at Autodesk Advanced Products Group, Vision Technology Center, Mountain View, California.