Feature learning and structured prediction for scene understanding

Salman Hameed Khan

    Research output: Thesis (Doctoral Thesis)

    Abstract

    When one talks about the visual comprehension ability of humans, even a young child can easily describe events happening in a scene, differentiate between scene types, identify objects present in a scene and effortlessly reason about their location and geometry. The ultimate goal of computer vision is to mimic the astounding capabilities of human vision. However, after 50 years of progress in this area, computer vision is still far from the scene understanding capabilities of a toddler. In this dissertation, we aim to extend the frontiers of computer vision by investigating robust feature learning and structured prediction frameworks for visual scene understanding. This dissertation is organized as a collection of research manuscripts that are either already published in or submitted to internationally refereed conferences and journals.

    The dissertation explores two distinct aspects of scene understanding and analysis. First, we explore improved feature representations for scene understanding tasks. We investigate both hand-crafted feature representations and representations learned automatically using deep neural networks. Second, we propose new structured prediction models to incorporate rich relationships between both low-level and high-level scene elements. More specifically, we study some of the most important sub-tasks under the umbrella of scene understanding, such as semantic labelling, geometric and volumetric reasoning, object shadow detection and removal, scene categorization, and change detection and analysis. The proposed algorithms in this dissertation pertain to different data modalities, including RGB images, RGB+Depth data, underwater imagery, dermoscopy images, synthetic images and spectral data from satellites.

    A major hurdle towards the goal of scene understanding is the limited availability of data and annotations. This dissertation also contributes towards this aspect by gathering two new datasets along with their annotations. Moreover, we present methods to deal directly with specific data-related issues, e.g., recovery of missing data, learning with only weak supervision and handling highly unbalanced data sets during model learning. Our proposed approaches show very promising results on a diverse set of scene understanding tasks. We hope that this dissertation will inspire more such efforts to realise the ultimate objective of visual scene understanding in machine vision.
    Original language: English
    Qualification: Doctor of Philosophy
    Awarding Institution: The University of Western Australia
    Award date: 15 Jun 2016
    Publication status: Unpublished - 2016

    Fingerprint

    Computer vision
    Labeling
    Semantics
    Availability
    Satellites
    Recovery
    Geometry

    Cite this

    @phdthesis{2a958cd10b924d78ade8b44d6159f5d6,
    title = "Feature learning and structured prediction for scene understanding",
    keywords = "Deep learning, Graphical models, Segmentation, Detection, Scene understanding, Learning/Inference, Geometry estimation, Classification",
    author = "Khan, {Salman Hameed}",
    year = "2016",
    language = "English",
    school = "The University of Western Australia",

    }

    Khan, SH 2016, 'Feature learning and structured prediction for scene understanding', Doctor of Philosophy, The University of Western Australia.
