This thesis develops visual speech feature extraction methods using state-of-the-art computer vision and machine learning techniques. First, it proposes a visual and audio-visual speech recognition system that combines grey-level and depth information. This system is expected to offer the research community a new way to overcome the limitations of grey-level visual speech features and to improve visual and audio-visual speech recognition accuracy. Second, it introduces several techniques for automatically learning deep visual features. Experimental results show that the proposed techniques outperform commonly used handcrafted features.
|Qualification|Doctor of Philosophy|
|Award date|31 Mar 2017|
|Publication status|Unpublished - 2016|