Hand gesture recognition has been an active area of research in recent years, with applications ranging from sign language recognition systems for the deaf to human-machine interaction. The gesture recognition process can, in general, be divided into two stages: motion sensing, which extracts useful data from hand motion, and classification, which classifies the motion sensing data as gestures. Existing vision-based gesture recognition systems extract 2-D shape and trajectory descriptors from the visual input and classify them using techniques ranging from maximum likelihood estimation to neural networks, finite state machines, Fuzzy Associative Memory (FAM) and Hidden Markov Models (HMMs).

This thesis presents the framework of the vision-based Hand Motion Understanding (HMU) system, which recognises static and dynamic Australian Sign Language (Auslan) signs by extracting and classifying 3-D hand configuration data from the visual input. The HMU system is a pioneering gesture recognition system that combines a 3-D hand tracker for motion sensing with an adaptive fuzzy expert system for classification. The HMU 3-D hand tracker extracts 3-D hand configuration data, consisting of the 21 degrees-of-freedom parameters of the hand, from visual input taken from a single viewpoint with the aid of a colour-coded glove. The tracker uses a model-based motion tracking algorithm that applies a Newton-style optimisation technique to make incremental corrections to the 3-D model parameters, re-configuring the model to fit the hand posture appearing in the images. Finger occlusions are handled to a certain extent by a prediction algorithm that recovers the hand features missing from the images. The HMU classifier then recognises the sequence of 3-D hand configuration data as a sign, using an adaptive fuzzy expert system in which the sign knowledge is used as inference rules.
The classification is performed in two stages. Firstly, for each image, the classifier recognises Auslan basic hand postures, which categorise Auslan signs much as the alphabet does in English. Secondly, the sequence of Auslan basic hand postures that appears in the image sequence is analysed and recognised as a sign. Both posture and sign recognition are performed by the same adaptive fuzzy inference engine. The HMU rule base stores 22 Auslan basic hand postures and 22 signs. For evaluation, 44 motion sequences (2 for each of the 22 signs) were recorded. Of these, 22 randomly chosen sequences (1 for each of the 22 signs) were used for testing and the rest were used for training. Before training, the HMU system correctly recognised 20 of the 22 test signs (about 91%). After training, on the same test set, it correctly recognised 21 signs (about 95%). In every failed case the system produced no output rather than an incorrect sign. The evaluation successfully demonstrated the functionality of the combined use of a 3-D hand tracker and an adaptive fuzzy expert system for vision-based sign language recognition.
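The two-stage classification described above can be illustrated with a minimal Python sketch. It is not the HMU implementation: the two-feature frames, the triangular membership functions, the use of `min` as the fuzzy AND, the 0.5 threshold, and the posture and sign names in `POSTURE_RULES` and `SIGN_RULES` are all assumptions made for illustration, and the adaptive (training) part of the fuzzy expert system is omitted.

```python
from itertools import groupby


def triangular(x, left, peak, right):
    """Triangular fuzzy membership: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical posture rules: one (left, peak, right) triple per feature.
# A real rule base would cover all 21 hand parameters; two are used here.
POSTURE_RULES = {
    "flat": [(-30, 0, 30), (-30, 0, 30)],
    "fist": [(60, 90, 120), (60, 90, 120)],
}

# Hypothetical sign rules: a sign is a sequence of basic hand postures.
SIGN_RULES = {
    ("flat", "fist"): "close",
    ("fist", "flat"): "open",
    ("flat",): "flat_static",
}


def classify_posture(frame, threshold=0.5):
    """Stage 1: return the best-matching posture for one frame, or None."""
    best_name, best_score = None, threshold
    for name, rule in POSTURE_RULES.items():
        # Fuzzy AND over features: the rule fires as strongly as its
        # weakest feature match (an assumed choice of operator).
        score = min(triangular(x, *abc) for x, abc in zip(frame, rule))
        if score > best_score:
            best_name, best_score = name, score
    return best_name


def classify_sign(frames):
    """Stage 2: map the posture sequence to a sign, or None if no rule fits."""
    postures = (classify_posture(frame) for frame in frames)
    recognised = (p for p in postures if p is not None)
    # Collapse runs of the same posture into a single entry.
    sequence = tuple(name for name, _ in groupby(recognised))
    return SIGN_RULES.get(sequence)
```

For example, `classify_sign([(5, -5), (10, 0), (85, 95), (90, 90)])` passes through the postures `flat` and `fist` and matches the hypothetical `close` rule, while a sequence that matches no rule returns `None`, mirroring the reported behaviour that failed cases produced no output.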
|Qualification||Doctor of Philosophy|
|Publication status||Unpublished - 1997|