Integrating Geometrical Context for Semantic Labeling of Indoor Scenes using RGBD Images

    Research output: Contribution to journal › Article

    13 Citations (Scopus)
    200 Downloads (Pure)

    Abstract

    © 2015, Springer Science+Business Media New York. Inexpensive structured light sensors can capture rich information from indoor scenes, and scene labeling problems provide a compelling opportunity to make use of this information. In this paper we present a novel conditional random field (CRF) model to effectively utilize depth information for semantic labeling of indoor scenes. At the core of the model, we propose a novel and efficient plane detection algorithm which is robust to erroneous depth maps. Our CRF formulation defines local, pairwise and higher order interactions between image pixels. At the local level, we propose a novel scheme to combine energies derived from appearance, depth and geometry-based cues. The proposed local energy also encodes the location of each object class by considering the approximate geometry of a scene. For the pairwise interactions, we learn a boundary measure which defines the spatial discontinuity of object classes across an image. To model higher-order interactions, the proposed energy treats smooth surfaces as cliques and encourages all the pixels on a surface to take the same label. We show that the proposed higher-order energies can be decomposed into pairwise sub-modular energies and efficient inference can be made using the graph-cuts algorithm. We follow a systematic approach which uses structured learning to fine-tune the model parameters. We rigorously test our approach on SUN3D and both versions of the NYU-Depth database. Experimental results show that our work achieves superior performance to state-of-the-art scene labeling techniques.
    Original language: English
    Pages (from-to): 1-20
    Journal: International Journal of Computer Vision
    Volume: 117
    Issue number: 1
    Early online date: 3 Jul 2015
    DOIs: 10.1007/s11263-015-0843-8
    Publication status: Published - Mar 2016

    Fingerprint

    Labeling
    Semantics
    Pixels
    Geometry
    Labels
    Sensors
    Industry

    Cite this

    @article{f8d755fd1def47e699c1c72123c7b309,
    title = "Integrating Geometrical Context for Semantic Labeling of Indoor Scenes using RGBD Images",
    abstract = "{\textcopyright} 2015, Springer Science+Business Media New York. Inexpensive structured light sensors can capture rich information from indoor scenes, and scene labeling problems provide a compelling opportunity to make use of this information. In this paper we present a novel conditional random field (CRF) model to effectively utilize depth information for semantic labeling of indoor scenes. At the core of the model, we propose a novel and efficient plane detection algorithm which is robust to erroneous depth maps. Our CRF formulation defines local, pairwise and higher order interactions between image pixels. At the local level, we propose a novel scheme to combine energies derived from appearance, depth and geometry-based cues. The proposed local energy also encodes the location of each object class by considering the approximate geometry of a scene. For the pairwise interactions, we learn a boundary measure which defines the spatial discontinuity of object classes across an image. To model higher-order interactions, the proposed energy treats smooth surfaces as cliques and encourages all the pixels on a surface to take the same label. We show that the proposed higher-order energies can be decomposed into pairwise sub-modular energies and efficient inference can be made using the graph-cuts algorithm. We follow a systematic approach which uses structured learning to fine-tune the model parameters. We rigorously test our approach on SUN3D and both versions of the NYU-Depth database. Experimental results show that our work achieves superior performance to state-of-the-art scene labeling techniques.",
    author = "Salman Khan and Mohammed Bennamoun and Ferdous Sohel and Roberto Togneri and Imran Naseem",
    year = "2016",
    month = "3",
    doi = "10.1007/s11263-015-0843-8",
    language = "English",
    volume = "117",
    pages = "1--20",
    journal = "International Journal of Computer Vision",
    issn = "0920-5691",
    publisher = "Springer",
    number = "1",
    }

    TY - JOUR

    T1 - Integrating Geometrical Context for Semantic Labeling of Indoor Scenes using RGBD Images

    AU - Khan, Salman

    AU - Bennamoun, Mohammed

    AU - Sohel, Ferdous

    AU - Togneri, Roberto

    AU - Naseem, Imran

    PY - 2016/3

    Y1 - 2016/3

    AB - © 2015, Springer Science+Business Media New York. Inexpensive structured light sensors can capture rich information from indoor scenes, and scene labeling problems provide a compelling opportunity to make use of this information. In this paper we present a novel conditional random field (CRF) model to effectively utilize depth information for semantic labeling of indoor scenes. At the core of the model, we propose a novel and efficient plane detection algorithm which is robust to erroneous depth maps. Our CRF formulation defines local, pairwise and higher order interactions between image pixels. At the local level, we propose a novel scheme to combine energies derived from appearance, depth and geometry-based cues. The proposed local energy also encodes the location of each object class by considering the approximate geometry of a scene. For the pairwise interactions, we learn a boundary measure which defines the spatial discontinuity of object classes across an image. To model higher-order interactions, the proposed energy treats smooth surfaces as cliques and encourages all the pixels on a surface to take the same label. We show that the proposed higher-order energies can be decomposed into pairwise sub-modular energies and efficient inference can be made using the graph-cuts algorithm. We follow a systematic approach which uses structured learning to fine-tune the model parameters. We rigorously test our approach on SUN3D and both versions of the NYU-Depth database. Experimental results show that our work achieves superior performance to state-of-the-art scene labeling techniques.

    U2 - 10.1007/s11263-015-0843-8

    DO - 10.1007/s11263-015-0843-8

    M3 - Article

    VL - 117

    SP - 1

    EP - 20

    JO - International Journal of Computer Vision

    JF - International Journal of Computer Vision

    SN - 0920-5691

    IS - 1

    ER -