Abstract
This dissertation contributes to the field of video description, the task of describing videos in syntactically and semantically meaningful natural language sentences. It develops computational models and techniques that automatically generate natural language descriptions of events in videos. This emerging research field has many applications, such as human-robot interaction and vision-and-language navigation. In particular, it can assist visually impaired people by describing their surroundings, helping them read, and identifying currency, and it can go further by letting them enjoy movies through descriptions of what is happening on screen.
Original language | English |
---|---|
Qualification | Doctor of Philosophy |
Awarding Institution | |
Supervisors/Advisors | |
Award date | 24 Feb 2022 |
DOIs | |
Publication status | Unpublished - 2021 |