Feature learning and structured prediction for scene understanding

Salman Hameed Khan

    Research output: Thesis, Doctoral Thesis

    When one talks about the visual comprehension ability of humans, even a young child can easily describe events happening in a scene, distinguish between scene types, identify objects present in a scene and effortlessly reason about their location and geometry. The ultimate goal of computer vision is to mimic the astounding capabilities of human vision. However, after 50 years of progress in this area, computer vision is still far from the scene understanding capabilities of a toddler. In this dissertation, we aim to further extend the frontiers of computer vision by investigating robust feature learning and structured prediction frameworks for visual scene understanding. This dissertation is organized as a collection of research manuscripts which are either already published in or submitted to internationally refereed conferences and journals.

    The dissertation explores two distinct aspects of scene understanding and analysis. First, we explore improved feature representations for scene understanding tasks. We investigate both hand-crafted feature representations and representations learned automatically using deep neural networks. Second, we propose new structured prediction models to incorporate rich relationships between both low-level and high-level scene elements. More specifically, we study some of the most important sub-tasks under the umbrella of scene understanding, such as semantic labelling, geometric and volumetric reasoning, object shadow detection and removal, scene categorization, and change detection and analysis. The proposed algorithms in this dissertation pertain to different data modalities including RGB images, RGB+Depth data, underwater imagery, dermoscopy images, synthetic images and spectral data from satellites.

    A major hurdle towards the goal of scene understanding is the limited availability of data and annotations. This dissertation also contributes towards this aspect by gathering two new datasets along with their annotations. Moreover, we present methods to directly deal with specific data-related issues, e.g., recovery of missing data, learning with only weak supervision and handling highly unbalanced data sets during model learning. Our proposed approaches show very promising results on a diverse set of scene understanding tasks. We hope that this dissertation will inspire more such efforts to realise the ultimate objective of visual scene understanding in machine vision.
    Original language: English
    Qualification: Doctor of Philosophy
    Awarding Institution
    • The University of Western Australia
    • Togneri, Roberto, Supervisor
    • Bennamoun, Mohammed, Supervisor
    • Sohel, Ferdous, Supervisor
    • Naseem, Imran, Supervisor
    Award date: 15 Jun 2016
    Publication status: Unpublished - 2016

