TY - BOOK
T1 - Human motion capture in images and videos using discriminative and hybrid methods
AU - Sedai, Suman
PY - 2012
Y1 - 2012
AB - [Truncated abstract] Vision-based human pose estimation and tracking is a popular research area that has generated a great deal of interest in the last decade. This interest is motivated by its many applications, including video surveillance, clinical rehabilitation and the analysis of athlete performance. Vision-based capture is also non-intrusive and does not require markers to be attached to the body parts, as opposed to marker-based motion capture systems. In this thesis, two machine learning techniques and one feature representation technique have been developed to automatically capture human motion from images and videos. This thesis is organized as a set of papers published in and/or under review by journals or international conferences. During the last two decades there has been much work in markerless human motion capture. This thesis contributes to the existing body of work by providing three new algorithms. First, an appearance descriptor is proposed for human pose estimation from monocular images. Second, a discriminative learning-based fusion algorithm is proposed to combine shape and appearance features for human pose estimation from monocular images. Third, a hybrid discriminative and generative method that takes into account the prediction uncertainty of the discriminative model is proposed for 3D human pose tracking from both single and multiple cameras. Shape-based features such as silhouettes and appearance features are commonly used for pose estimation from monocular images using regression-based techniques. Silhouette features require a segmentation step to obtain only information pertinent to the shape of the occluding body parts, and they discard appearance information that can potentially be useful for pose estimation.
KW - 3D human motion capture
KW - Appearance descriptor
KW - Discriminative learning
KW - Generative models
KW - Hybrid method
KW - Gaussian Process
KW - Descriptor fusion
KW - Particle filter
M3 - Doctoral Thesis
ER -