Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition

Hasan F. Zaki, F. Shafait, Ajmal Mian

    Research output: Chapter in Book/Conference paper (Conference paper)

    23 Citations (Scopus)

    Abstract

    © 2016 IEEE. Deep learning based methods have achieved unprecedented success in solving several computer vision problems involving RGB images. However, this level of success is yet to be seen on RGB-D images owing to two major challenges in this domain: training data deficiency and multi-modality input dissimilarity. We present an RGB-D object recognition framework that addresses these two key challenges by effectively embedding depth and point cloud data into the RGB domain. We employ a convolutional neural network (CNN) pre-trained on RGB data as a feature extractor for both color and depth channels and propose a rich coarse-to-fine feature representation scheme, coined Hypercube Pyramid, that is able to capture discriminatory information at different levels of detail. Finally, we present a novel fusion scheme to combine the Hypercube Pyramid features with the activations of fully connected neurons to construct a compact representation prior to classification. By employing Extreme Learning Machines (ELM) as non-linear classifiers, we show that the proposed method outperforms ten state-of-the-art algorithms for several tasks in terms of recognition accuracy on the benchmark Washington RGB-D and 2D3D object datasets by a large margin (up to 50% reduction in error rate).
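
The abstract names Extreme Learning Machines as the final classifier. As a rough illustration only, the sketch below shows the generic ELM idea: a hidden layer with fixed random weights, followed by output weights solved in closed form with regularized least squares. It is not the authors' implementation. The class name `ELMClassifier`, the hyperparameters, the synthetic data, and the plain concatenation of "RGB" and "depth" feature vectors (a stand-in for the paper's fusion scheme, which the abstract does not detail) are all assumptions made for this example.

```python
import numpy as np


class ELMClassifier:
    """Generic ELM sketch: random hidden layer plus ridge-regression readout."""

    def __init__(self, n_hidden=128, reg=1e-2, seed=0):
        self.n_hidden = n_hidden      # number of random hidden units (assumed value)
        self.reg = reg                # ridge regularization strength (assumed value)
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Non-linear projection through the fixed random layer.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-vs-all targets in {-1, +1}, one column per class.
        T = np.where(y[:, None] == self.classes_[None, :], 1.0, -1.0)
        d = X.shape[1]
        # Input weights are drawn once and never trained.
        self.W = self.rng.normal(size=(d, self.n_hidden)) / np.sqrt(d)
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Closed-form output weights: (H^T H + reg * I)^-1 H^T T.
        A = H.T @ H + self.reg * np.eye(self.n_hidden)
        self.beta = np.linalg.solve(A, H.T @ T)
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]


# Synthetic stand-ins for per-image feature vectors from the two modalities.
rng = np.random.default_rng(0)
rgb_feat = np.vstack([rng.normal(+1.0, 1.0, (100, 8)),
                      rng.normal(-1.0, 1.0, (100, 8))])
depth_feat = np.vstack([rng.normal(+1.0, 1.0, (100, 8)),
                        rng.normal(-1.0, 1.0, (100, 8))])
X = np.hstack([rgb_feat, depth_feat])   # naive concatenation, not the paper's fusion
y = np.array([0] * 100 + [1] * 100)

# Random train/test split.
idx = rng.permutation(len(y))
train, test = idx[:150], idx[150:]

clf = ELMClassifier().fit(X[train], y[train])
pred = clf.predict(X[test])
acc = float(np.mean(pred == y[test]))
print(f"held-out accuracy on synthetic data: {acc:.2f}")
```

Because only the output weights are fitted, training reduces to one linear solve, which is the usual reason given for choosing ELMs over iteratively trained classifiers.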
    Original language: English
    Title of host publication: 2016 IEEE International Conference on Robotics and Automation (ICRA)
    Editors: Allison Okamura
    Place of Publication: USA
    Publisher: IEEE, Institute of Electrical and Electronics Engineers
    Pages: 1685-1692
    Number of pages: 8
    ISBN (Print): 9781467380263
    DOI: https://doi.org/10.1109/ICRA.2016.7487310
    Publication status: Published - 2016
    Event: 2016 IEEE International Conference on Robotics and Automation: ICRA 2016 - Stockholm, Sweden
    Duration: 16 May 2016 to 21 May 2016

    Conference

    Conference: 2016 IEEE International Conference on Robotics and Automation
    Country: Sweden
    City: Stockholm
    Period: 16/05/16 to 21/05/16


  • Cite this

    Zaki, H. F., Shafait, F., & Mian, A. (2016). Convolutional hypercube pyramid for accurate RGB-D object category and instance recognition. In A. Okamura (Ed.), 2016 IEEE International Conference on Robotics and Automation (ICRA) (pp. 1685-1692). USA: IEEE, Institute of Electrical and Electronics Engineers. https://doi.org/10.1109/ICRA.2016.7487310