This paper presents an approach to acoustic scene classification using local binary patterns (LBP) and a random forest (RF). The audio signal is converted to a constant-Q transform (CQT) representation, and LBP is used to extract features from this time-frequency representation. The CQT representation is divided into a number of sub-bands to obtain more localized features that capture spectral information. A random forest then selects the most important features in each band of extracted LBP features. To further improve performance, LBP features are fused at the feature level with histogram of oriented gradients (HOG) features. The proposed system achieves an accuracy of 85% on the DCASE 2016 dataset.
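The sub-band LBP step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a basic 8-neighbour, radius-1 LBP, a spectrogram stored as a list of rows with one row per frequency bin, equal-width sub-bands, and LBP computed independently inside each band. The function names (`lbp_codes`, `lbp_histogram`, `subband_lbp_features`) are hypothetical, and the CQT computation, random-forest feature selection, and HOG fusion are left out.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]

# Clockwise 8-neighbourhood around a pixel, starting at the top-left (assumed ordering).
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_codes(image: Matrix) -> List[List[int]]:
    """Return the 8-bit LBP code of every interior pixel of a 2D array."""
    rows, cols = len(image), len(image[0])
    codes = []
    for r in range(1, rows - 1):
        row_codes = []
        for c in range(1, cols - 1):
            center = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(_OFFSETS):
                # Set the bit when the neighbour is at least as large as the centre.
                if image[r + dr][c + dc] >= center:
                    code |= 1 << bit
            row_codes.append(code)
        codes.append(row_codes)
    return codes


def lbp_histogram(codes: List[List[int]], bins: int = 256) -> List[float]:
    """Normalised histogram of LBP codes (all zeros if there are no codes)."""
    hist = [0.0] * bins
    total = 0
    for row in codes:
        for code in row:
            hist[code] += 1.0
            total += 1
    return [h / total for h in hist] if total else hist


def subband_lbp_features(spectrogram: Matrix, n_bands: int) -> List[float]:
    """Split the frequency axis into n_bands and concatenate per-band LBP histograms."""
    n_rows = len(spectrogram)
    band_size = n_rows // n_bands
    features: List[float] = []
    for b in range(n_bands):
        start = b * band_size
        # The last band absorbs any leftover rows.
        stop = n_rows if b == n_bands - 1 else start + band_size
        features.extend(lbp_histogram(lbp_codes(spectrogram[start:stop])))
    return features
```

Calling `subband_lbp_features(spec, 2)` on a spectrogram with at least three frequency rows per band returns a 512-element vector, one 256-bin histogram per band. In a fuller pipeline, a vector like this would be the input to the per-band feature selection step.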
|Title of host publication||2018 IEEE International Conference on Multimedia and Expo, ICME 2018|
|Publisher||IEEE, Institute of Electrical and Electronics Engineers|
|Publication status||Published - 8 Oct 2018|
|Event||2018 IEEE International Conference on Multimedia and Expo, ICME 2018 - San Diego, United States, 23 Jul 2018 → 27 Jul 2018|