Abstract
In this paper we present a novel audio-visual (AV) person identification system based on joint sparse representation. The video features are vectorized raw pixel values, and the audio features are i-vectors. Classification is performed by solving the joint sparsity optimization problem, and fusion is carried out using the quality (confidence) assigned to each matcher. Our experimental results on the challenging MOBIO database, using 100 subjects, show that the system based on joint sparse representation outperforms the system based on separate sparse representations for each modality. Furthermore, we show that our newly introduced quality measure improves the system's performance compared to conventionally used quality measures for sparse representation-based systems.
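To make the classification step concrete, the following is a minimal sketch of joint sparse representation classification, not the authors' implementation. It assumes the common row-sparsity formulation, in which one coefficient vector per modality is estimated over a shared set of training atoms and an l1/l2 penalty encourages both modalities to select the same atoms. The function names, the proximal-gradient solver, the parameter `lam`, and the optional `weights` argument (a stand-in for per-modality quality scores) are all illustrative assumptions; the quality measure introduced in the paper is not reproduced here.

```python
import numpy as np


def joint_sparse_codes(dicts, tests, lam=0.1, n_iter=500):
    """Proximal gradient for
    min_G 0.5 * sum_m ||y_m - D_m G[:, m]||^2 + lam * sum_j ||G[j, :]||_2.

    dicts: list of (dim_m, n_atoms) dictionaries, one per modality (assumed layout).
    tests: list of (dim_m,) test vectors, one per modality.
    """
    n_atoms = dicts[0].shape[1]
    G = np.zeros((n_atoms, len(dicts)))
    # Step size from the largest squared spectral norm among the dictionaries.
    step = 1.0 / max(np.linalg.norm(D, 2) ** 2 for D in dicts)
    for _ in range(n_iter):
        # Gradient of the least-squares term, one column per modality.
        grad = np.column_stack(
            [D.T @ (D @ G[:, m] - y) for m, (D, y) in enumerate(zip(dicts, tests))]
        )
        Z = G - step * grad
        # Row-wise soft thresholding: proximal operator of the l1/l2 penalty.
        row_norms = np.linalg.norm(Z, axis=1, keepdims=True)
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(row_norms, 1e-12))
        G = shrink * Z
    return G


def classify(dicts, tests, labels, weights=None, lam=0.1):
    """Return the class with the smallest (optionally weighted) summed residual."""
    labels = np.asarray(labels)
    G = joint_sparse_codes(dicts, tests, lam)
    if weights is None:
        weights = np.ones(len(dicts))  # equal trust in every modality
    scores = {}
    for c in np.unique(labels):
        mask = labels == c
        # Reconstruct each modality using only the atoms of class c.
        scores[c] = sum(
            w * np.linalg.norm(y - D[:, mask] @ G[mask, m]) ** 2
            for m, (D, y, w) in enumerate(zip(dicts, tests, weights))
        )
    return min(scores, key=scores.get)
```

Here `dicts` would hold the face dictionary (vectorized pixels) and the audio dictionary (i-vectors) with columns in the same subject order, and `labels` gives the subject identity of each column. Passing unequal `weights` lets a more reliable modality dominate the decision, which is one simple way quality-based fusion could be realized under these assumptions.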
Original language | English |
---|---|
Title of host publication | Proceedings of the 23rd International Conference on Pattern Recognition (ICPR) |
Editors | Eduardo Bayro-Corrochano |
Place of Publication | USA |
Publisher | IEEE, Institute of Electrical and Electronics Engineers |
Pages | 3026-3030 |
ISBN (Print) | 9781509048465 |
Publication status | Published - 2016 |
Event | 23rd International Conference on Pattern Recognition - Cancun, Mexico. Duration: 4 Dec 2016 → 8 Dec 2016 |
Conference
Conference | 23rd International Conference on Pattern Recognition |
---|---|
Abbreviated title | ICPR 2016 |
Country/Territory | Mexico |
City | Cancun |
Period | 4/12/16 → 8/12/16 |