Abstract
The automatic evaluation of image descriptions is an intricate task, and it is highly important in the development and fine-grained analysis of captioning systems. Existing metrics for automatically evaluating image captioning systems fail to achieve a satisfactory level of correlation with human judgements at the sentence level. Moreover, these metrics, unlike humans, tend to focus on specific aspects of quality, such as n-gram overlap or semantic meaning. In this paper, we present the first learning-based metric to evaluate image captions. Our proposed framework enables us to incorporate both lexical and semantic information into a single learned metric. This results in an evaluator that takes into account various linguistic features to assess caption quality. The experiments we performed to assess the proposed metric show improvements upon the state of the art in terms of correlation with human judgements and demonstrate its superior robustness to distractions.
Original language | English |
---|---|
Title of host publication | ECCV |
Editors | Vittorio Ferrari, Cristian Sminchisescu, Yair Weiss, Martial Hebert |
Publisher | Springer |
Pages | 39-55 |
Number of pages | 17 |
ISBN (Print) | 9783030012366 |
DOIs | |
Publication status | Published - 2018 |
Event | 15th European Conference on Computer Vision - Munich, Germany; Duration: 8 Sept 2018 → 14 Sept 2018 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11212 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 15th European Conference on Computer Vision |
---|---|
Abbreviated title | ECCV 2018 |
Country/Territory | Germany |
City | Munich |
Period | 8/09/18 → 14/09/18 |
Fingerprint
Dive into the research topics of 'NNEval: Neural Network based Evaluation Metric for Image Captioning'. Together they form a unique fingerprint.
Projects
- 1 Finished
- Advanced 3D Computer Vision Algorithms for 'Find and Grasp' Future Robots
  Bennamoun, M. (Investigator 01)
  ARC Australian Research Council
  1/01/15 → 31/12/20
  Project: Research