Depth Discontinuities by Pixel-to-Pixel Stereo

Stan Birchfield and Carlo Tomasi

We have developed an algorithm to find depth discontinuities from a stereo pair of images. The algorithm earns its name from the fact that it matches individual pixels in the two images directly, without preprocessing the images or using windows, and so produces a disparity map that preserves sharp changes in disparity. One notable component is a pixel dissimilarity measure that is insensitive to image sampling. The algorithm uses dynamic programming (with a fast pruning mechanism that we developed) to match each scanline independently, followed by a fast postprocessor that cleans up the results.
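To make the two central ideas concrete, here is a minimal sketch in Python. It is not the authors' implementation. The dissimilarity follows the commonly described form of the sampling-insensitive measure: a pixel is compared against the range of linearly interpolated intensities within half a pixel of its candidate match, symmetrically in both images. The scanline matcher is a plain dynamic-programming alignment with a fixed occlusion penalty, and it omits both the pruning mechanism and the postprocessor. The names `bt_dissimilarity`, `match_scanline`, `max_disp`, and `occlusion_cost`, and the penalty value, are assumptions made for illustration.

```python
def bt_dissimilarity(left, right, xl, xr):
    """Sampling-insensitive dissimilarity between left[xl] and right[xr]."""

    def half_sided(a, xa, b, xb):
        # Interpolated intensities of b half a pixel to either side of xb
        # (neighbours are clamped at the ends of the scanline).
        centre = b[xb]
        minus = 0.5 * (centre + b[max(xb - 1, 0)])
        plus = 0.5 * (centre + b[min(xb + 1, len(b) - 1)])
        lo = min(minus, plus, centre)
        hi = max(minus, plus, centre)
        # Distance from a[xa] to the interval [lo, hi]; zero if it lies inside.
        return max(0.0, a[xa] - hi, lo - a[xa])

    return min(half_sided(left, xl, right, xr),
               half_sided(right, xr, left, xl))


def match_scanline(left, right, max_disp, occlusion_cost=20.0):
    """Align two scanlines; return one disparity per left pixel (None = occluded).

    A left pixel x with disparity d is matched to right pixel x - d.
    """
    n, m = len(left), len(right)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment of left[:i] with right[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            best, step = inf, None
            # Match left[i-1] with right[j-1] if the disparity is allowed.
            if i > 0 and j > 0 and 0 <= i - j <= max_disp:
                c = cost[i - 1][j - 1] + bt_dissimilarity(left, right, i - 1, j - 1)
                if c < best:
                    best, step = c, "match"
            # Leave left[i-1] unmatched (occluded in the right image).
            if i > 0 and cost[i - 1][j] + occlusion_cost < best:
                best, step = cost[i - 1][j] + occlusion_cost, "skip_left"
            # Leave right[j-1] unmatched (occluded in the left image).
            if j > 0 and cost[i][j - 1] + occlusion_cost < best:
                best, step = cost[i][j - 1] + occlusion_cost, "skip_right"
            cost[i][j], move[i][j] = best, step

    # Backtrack from the end of both scanlines to read off the disparities.
    disparity = [None] * n
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "match":
            disparity[i - 1] = i - j
            i, j = i - 1, j - 1
        elif step == "skip_left":
            i -= 1
        else:
            j -= 1
    return disparity
```

Because each pixel is matched on its own rather than through a window, the disparity in this sketch can change abruptly from one pixel to the next, which is the property that lets sharp depth discontinuities survive. A half-pixel shift between the two scanlines costs nothing under `bt_dissimilarity`, whereas a plain absolute difference would penalise it.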

Eight stereo pairs of images are shown below. The first six consist of high-quality images that were taken in our laboratory using a single camera that was translated roughly in the direction of the scanlines; the lens was slightly defocused to remove aliasing. The last two come from the well-known JISCT data set. Click on any of these images to see its full-sized JPEG version.

Also shown are the disparity maps and depth discontinuities computed by our algorithm. Just click to see the full-sized GIF versions. All results were obtained with the same set of parameters and were computed in just four seconds using a 333 MHz Pentium II microprocessor (630 by 480 pixels with 21 disparity levels).

Original Images (Left Image, Right Image) | Algorithm Output (Disparity Map, Depth Discontinuities) | Comments
Clean results on a difficult image (large untextured regions and a slanted, untextured surface). The table is recovered as a series of constant-disparity strips.
Similar results with a textured background. Edges along boxes are accurately recovered.
Accurate outline of objects. Errors near the bottom are due to lack of texture and specular reflections on the table.
Depth discontinuities are accurately recovered in the presence of both horizontal and vertical slant.
The middle bottle is lost because its disparity differs from that of the background by only one level, which is below our threshold of two. This thresholding problem is inherent in the task.
A mobile robot's view as it leaves a room. The edge of the recorder is recovered despite the weakness of the intensity edge.
JISCT image. The major discontinuities are correctly detected.
JISCT image. Details such as the mirror of the car and the vertical boundary between the buildings are accurately found.
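
The threshold mentioned in the comment on the lost bottle can be illustrated with a short sketch. It assumes that a depth discontinuity is declared wherever two horizontally adjacent pixels differ in disparity by at least the threshold. The function name `depth_discontinuities` is hypothetical, and reporting the index of the left pixel of each pair is a simplification chosen here, not necessarily the labelling used for the results above.

```python
def depth_discontinuities(disparity_row, threshold=2):
    """Return each index x where disparity jumps by >= threshold between x and x + 1."""
    return [
        x
        for x in range(len(disparity_row) - 1)
        if abs(disparity_row[x + 1] - disparity_row[x]) >= threshold
    ]


# A one-level object (3 -> 4 -> 3) is missed; a four-level object (3 -> 7 -> 3) is found.
row = [3, 3, 4, 4, 3, 3, 7, 7, 3]
edges = depth_discontinuities(row)
```

With a threshold of two, an object that stands only one disparity level in front of its background produces no discontinuity at all, which matches the behaviour described for the middle bottle.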