TY - BOOK
T1 - Listening with your eyes: towards a practical visual speech recognition system
AU - Sui, Chao
PY - 2016
Y1 - 2016
N2 - This thesis develops visual speech feature extraction methods using state-of-the-art computer vision and machine learning techniques. First, a visual and audio-visual speech recognition system that combines grey-level and depth information is proposed. This system is expected to give the research community a new perspective on overcoming the limitations of grey-level visual speech features and on improving visual and audio-visual speech recognition accuracy. Second, several techniques for automatically learning deep visual features are introduced. Experimental results show that the proposed techniques outperform commonly used handcrafted features.
AB - This thesis develops visual speech feature extraction methods using state-of-the-art computer vision and machine learning techniques. First, a visual and audio-visual speech recognition system that combines grey-level and depth information is proposed. This system is expected to give the research community a new perspective on overcoming the limitations of grey-level visual speech features and on improving visual and audio-visual speech recognition accuracy. Second, several techniques for automatically learning deep visual features are introduced. Experimental results show that the proposed techniques outperform commonly used handcrafted features.
KW - Visual speech recognition
KW - Lipreading
KW - Planar-stereo visual information
KW - Hybrid-level visual feature
KW - Deep bottleneck feature
KW - Deep Boltzmann machine
KW - Particle swarm optimisation
KW - Marginalised stacked auto-encoder
M3 - Doctoral Thesis
ER -