Elliptical Head Tracking 
Using Intensity Gradients and Color Histograms
 Stan Birchfield
An elliptical head tracker has been developed that works as follows.
When a new image becomes
available, the tracker performs a local search to find the best
position and size for the ellipse by maximizing the sum of two
terms:  one involving the dot product of the image gradient with
the ellipse normal, and another involving the normalized histogram
intersection between the color histogram of the ellipse's interior
and a previously stored, personalized color histogram model.
Although the search is local, velocity prediction removes any restriction
on the maximum lateral image velocity.  In real time, the tracker
reliably and automatically controls the camera's pan, tilt, and zoom to
keep the subject centered in the field of view at a desired size.
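The search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tracker's actual implementation: the function names are hypothetical, the intersection is normalized by the candidate histogram's total count (one common choice), the prediction is a simple constant-velocity extrapolation, and the search is assumed to visit every integer offset in the range. The gradient and color terms are passed in as caller-supplied functions because their exact form is not given here.

```python
from itertools import product


def histogram_intersection(candidate, model):
    """Intersection of two histograms with identical bins, normalized
    by the candidate's total count (an assumed normalization)."""
    if len(candidate) != len(model):
        raise ValueError("histograms must have the same number of bins")
    total = sum(candidate)
    if total == 0:
        return 0.0
    return sum(min(c, m) for c, m in zip(candidate, model)) / total


def predict_state(prev, prev_prev):
    """Constant-velocity prediction (an assumption): extrapolate x and y
    from the last two states and keep the most recent size."""
    x1, y1, s1 = prev
    x0, y0, _ = prev_prev
    return (2 * x1 - x0, 2 * y1 - y0, s1)


def local_search(predicted, search_range, gradient_score, color_score):
    """Try every (x, y, size) within the search range around the
    predicted state and return the state maximizing the sum of the
    two terms, together with its score."""
    px, py, ps = predicted
    rx, ry, rs = search_range
    best_state, best_score = None, float("-inf")
    offsets = product(range(-rx, rx + 1),
                      range(-ry, ry + 1),
                      range(-rs, rs + 1))
    for dx, dy, ds in offsets:
        state = (px + dx, py + dy, ps + ds)
        if state[2] <= 0:
            continue  # skip ellipses with a non-positive size
        score = gradient_score(state) + color_score(state)
        if score > best_score:
            best_state, best_score = state, score
    return best_state, best_score
```

In this sketch the search window is centered on the predicted state rather than on the previous one, so a subject moving at constant velocity stays inside the window regardless of speed; only a change in velocity larger than the search range can carry the subject outside it.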
Compared with previous work, this head tracker is the first system that
uses multiple tracking criteria to handle full 360-degree out-of-plane
rotation, large scale changes, arbitrary camera movement, and multiple
moving people in the background.
Below are some experiments showing the tracker's performance on
people who differ widely in complexion, hair color, amount of hair, head
shape, and type and color of shirt.  Several of the people are wearing
spectacles and one has a beard.  Click on any of the images to download
its corresponding MPEG file.
The first number in the second column of the table gives the search range,
where, for example, 16x4x1 means +/- 16 pixels in x, +/- 4 pixels
in y, and +/- 1 pixel in size.
The second number gives the file's speed as a percentage of real time (thus,
150% means that the file plays 1.5 times faster than real time).
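To make the two numbers concrete, the following sketch computes how many candidate ellipses a given search range contains and how much real time a clip covers. It assumes the tracker evaluates every integer offset within the range, which is not stated here, and the function names are hypothetical.

```python
def parse_search_range(text):
    """Parse a string such as '16x4x1' into (x, y, size) half-widths."""
    parts = text.lower().split("x")
    if len(parts) != 3:
        raise ValueError("expected a range of the form XxYxS, e.g. 16x4x1")
    return tuple(int(p) for p in parts)


def candidate_count(search_range):
    """Number of integer (x, y, size) offsets within +/- each half-width,
    assuming every offset is evaluated."""
    rx, ry, rs = search_range
    return (2 * rx + 1) * (2 * ry + 1) * (2 * rs + 1)


def real_time_duration(playback_seconds, speed_percent):
    """Seconds of real time covered by a clip that plays for
    playback_seconds at speed_percent of real time."""
    return playback_seconds * speed_percent / 100.0
```

Under that assumption a 4x4x1 range contains 9 x 9 x 3 = 243 candidates and a 16x4x1 range contains 33 x 9 x 3 = 891, while ten seconds of playback at 150% corresponds to fifteen seconds of real time.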
| MPEG video clip (file size) | Search range / Speed | Description |
|---|---|---|
| 1500 KB | 4x4x1 / 150% | An extended sequence showing pan, tilt, and zoom control; occlusion; 360-degree rotation; and head tilting. |
| 1576 KB | 4x4x1 / 150% | Another extended sequence similar to the previous one, also showing the overlapping of two faces. Notice the flesh-colored board in the background. |
| 166 KB | 16x4x1 / 128% | Simultaneous occlusion, rotation, and translation. |
| 347 KB | 4x4x1 / 150% | Severe occlusion of subject by another person. |
| 171 KB | 16x4x1 / 128% | An attempt to distract the tracker with elliptical hands. On the fourth try, the tracker was distracted. |
| 474 KB | 16x4x1 / 128% | Tilting, rotation, and zooming. |
| 838 KB | 16x4x1 / 128% | Three villains unsuccessfully trying to steal the ellipse. Notice the relatively large search range. |
| 396 KB | 4x4x1 / 150% | The tradeoff between the different search ranges. Red ellipse: the tracker originally ran with a 4x4x1 range but lost the subject because he accelerated too quickly. Blue ellipse: off-line, with an 8x8x1 search range, the tracker was not thrown off the subject but was distracted by another person. Green ellipse: off-line, the tracker was successful with a 16x4x1 range. |
| 121 KB | 16x4x1 / 96% | Jerky, back-and-forth head motion. Since the file plays in real time, it gives some indication of the tracker's actual speed. |
| 146 KB | 16x4x1 / 128% | The subject walking behind a cubicle. |
| 124 KB | 16x4x1 / 128% | Because the histogram model had little hair, the tracker failed when the subject rotated. |
| 775 KB | 16x4x1 / 128% | This time the villains were more successful. |
| 367 KB | 16x4x1 / 128% | More rotation and occlusion. |
| 202 KB | 16x4x1 / 128% | More rotation and distraction. |
| 188 KB | 16x4x1 / 128% | More rotation and distraction. |
| 161 KB | 16x4x1 / 128% | More rotation. |
| 169 KB | 16x4x1 / 128% | More rotation. |
Other
This work was conducted at Autodesk Advanced Products Group,
     Vision Technology Center, Mountain View, California.