[Truncated abstract] Vision-based human pose estimation and tracking is a popular research area that has generated a great deal of interest in the last decade. This interest is motivated by the area's many applications, including video surveillance, clinical rehabilitation, and the analysis of athlete performance. It is also non-intrusive: unlike marker-based motion capture systems, it does not require markers to be attached to the body. In this thesis, two machine learning techniques and one feature representation technique have been developed to automatically capture human motion from images and videos. The thesis is organized as a set of papers published in, or under review by, journals and international conferences.

During the last two decades there has been much work on markerless human motion capture. This thesis contributes to that body of work with three new algorithms. First, an appearance descriptor is proposed for human pose estimation from monocular images. Second, a discriminative learning-based fusion algorithm is proposed to combine shape and appearance features for human pose estimation from monocular images. Third, a hybrid discriminative and generative method that accounts for the prediction uncertainty of the discriminative model is proposed for 3D human pose tracking from both single and multiple cameras.

Shape-based features such as silhouettes, together with appearance features, are commonly used for pose estimation from monocular images with regression-based techniques. Silhouette features require a segmentation step to extract only the information pertinent to the shape of the occluding body parts, and they discard appearance information that can potentially be useful for pose estimation.
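The regression-based formulation mentioned above can be illustrated with a minimal sketch: a linear (ridge) regressor mapping a silhouette-derived shape descriptor to a pose vector. This is not the thesis's method; all data, dimensions, and the choice of ridge regression are illustrative assumptions, with synthetic features standing in for real silhouette descriptors.

```python
import numpy as np

# Hypothetical sketch: regression-based pose estimation learns a mapping
# from a shape descriptor x (e.g., a silhouette histogram) to a pose
# vector y (e.g., stacked joint angles). All data here is synthetic.
rng = np.random.default_rng(0)

n_train, d_feat, d_pose = 200, 50, 30
W_true = rng.normal(size=(d_feat, d_pose))            # unknown true mapping
X = rng.normal(size=(n_train, d_feat))                # training descriptors
Y = X @ W_true + 0.01 * rng.normal(size=(n_train, d_pose))  # noisy poses

# Ridge regression in closed form: W = (X^T X + lam*I)^{-1} X^T Y
lam = 1e-3
W = np.linalg.solve(X.T @ X + lam * np.eye(d_feat), X.T @ Y)

# At test time, a descriptor from a new image maps directly to a pose.
x_new = rng.normal(size=d_feat)
pose_estimate = x_new @ W                             # shape (d_pose,)
```

The closed-form solve is what makes such discriminative regressors fast at test time compared with generative model-fitting, at the cost of depending on the quality of the input features, which is the limitation of silhouette-only descriptors that the abstract notes.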
|Qualification|Doctor of Philosophy|
|Publication status|Unpublished - 2012|