We address the problem of computing the 3-dimensional shape of an arbitrary
scene from a set of images taken at known viewpoints. Multi-camera scene
reconstruction is a natural generalization of the stereo matching problem.
However, it is much more difficult than stereo, primarily due to the
difficulty of reasoning about visibility. In this paper, we take an approach
that has yielded excellent results for stereo, namely energy minimization via
graph cuts. We first give an energy minimization formulation of the
multi-camera scene reconstruction problem. The energy that we minimize treats
the input images symmetrically, handles visibility properly, and imposes
spatial smoothness while preserving discontinuities. As the energy
function is NP-hard to minimize exactly, we give a graph cut algorithm that
computes a local minimum in a strong sense. We handle all camera
configurations where voxel coloring can be used, which is a large and natural
class. Experimental data demonstrates the effectiveness of our approach.