Stereopsis is commonly regarded as the primary way for humans to perceive depth. Although, with one eye, we can still interact well with our environment and perform highly skilled tasks using other visual cues such as occlusion and motion, the absence of stereopsis means that relative depth information between objects is essentially lost (Frisby, 1979). While humans fuse the images seen by the left and right eyes with apparent ease, the major problem that must be solved in all binocular stereo systems for machine vision, the correspondence of features, is not trivial. In this thesis, line segments and corners are chosen as the features to be matched because they typically occur at object boundaries, surface discontinuities, and across surface markings. Polygonal regions are also selected since they are known to be well-configured and are very often associated with salient structures in the image. The use of these high-level features, although it helps to reduce matching ambiguities, does not completely resolve the matching problem when the scene contains repetitive structures. The spatial relationships between feature matching pairs, enforced in the stereo matching process as proposed in this thesis, are found to provide even stronger support for correct matching pairs; as a result, incorrect matching pairs can be largely eliminated. Obtaining global and salient 3D structures is an important prerequisite for environmental modelling and understanding. While postprocessing of the 3D information obtained from stereo has been attempted (Ayache and Faugeras, 1991), the strategy presented in this thesis for retrieving salient 3D descriptions is to propagate the prominent information extracted from the 2D images to the 3D scene.
Thus, the matching of two prominent 2D polygonal regions yields a prominent 3D region, and the inter-relation between two 2D region matching pairs is passed on and taken as a relationship between two 3D regions. Humans, when observing and interacting with the environment, do not confine themselves to observing and then analysing a single image. Similarly, stereopsis can be vastly improved by introducing additional stereo image pairs. Eye, head, and body movements provide essential mobility for actively changing viewpoints, disoccluding occluded objects, avoiding obstacles, and performing any necessary tasks at hand. This thesis presents a mobile stereo vision system whose eye movements are provided by a binocular head support and stepper motors, and whose body movements are provided by a mobile platform, the Labmate. Using a viewer-centred coordinate system proposed in this thesis, the computation of the 3D information observed at each individual viewpoint, the merging of the 3D information at consecutive viewpoints for environmental reconstruction, and strategies for movement control are discussed in detail.
|Qualification||Doctor of Philosophy|
|Publication status||Unpublished - 1994|