Natural Language Description of Images

Naeha Sharif

Research output: Thesis (Doctoral Thesis)

Abstract

This thesis focuses on developing and evaluating captioning models that can communicate their interpretation of the visual world in natural language. It addresses three key research gaps: the under-exploitation of the language space, the inadequate handling of rare words, and the dearth of systematic research into captioning-specific evaluation metrics. To close these gaps, it proposes linguistically aware features to improve visual interpretation, sub-word language modelling to tackle out-of-vocabulary words, and a novel framework for soft-candidate-based image captioning. It also presents state-of-the-art deterministic and learning-based evaluation metrics that capture the quality of captions at various linguistic levels.
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • The University of Western Australia
Supervisors/Advisors
  • Bennamoun, Mohammed, Supervisor
  • Liu, Wei, Supervisor
  • Shah, Syed Afaq, Supervisor
Award date: 7 Apr 2021
Publication status: Unpublished - 2020
