Free-form Description Guided 3D Visual Graph Network for Object Grounding in Point Cloud

Mingtao Feng, Zhen Li, Qi Li, Liang Zhang, Xiang Dong Zhang, Guangming Zhu, Hui Zhang, Yaonan Wang, Ajmal Mian

Research output: Chapter in Book/Conference paper > Conference paper > peer-review

39 Citations (Scopus)

Abstract

3D object grounding aims to locate the most relevant target object in a raw point cloud scene based on a free-form language description. Understanding complex and diverse descriptions, and lifting them directly to a point cloud, is a new and challenging topic due to the irregular and sparse nature of point clouds. There are three main challenges in 3D object grounding: finding the main focus in the complex and diverse description; understanding the point cloud scene; and locating the target object. In this paper, we address all three challenges. Firstly, we propose a language scene graph module to capture the rich structure and long-distance phrase correlations. Secondly, we introduce a multi-level 3D proposal relation graph module to extract the object-object and object-scene co-occurrence relationships, and strengthen the visual features of the initial proposals. Lastly, we develop a description guided 3D visual graph module to encode global contexts of phrases and proposals through a node matching strategy. Extensive experiments on challenging benchmark datasets (ScanRefer [3] and Nr3D [42]) show that our algorithm outperforms the existing state of the art. Our code is available at https://github.com/PNXD/FFL-3DOG.
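The abstract mentions a node matching strategy between phrases and proposals without giving details. The following is only a rough illustration of what such a matching step could look like, not the authors' implementation: it assumes each phrase node and each 3D proposal node is already represented by a feature vector, scores pairs by cosine similarity, and picks the proposal with the highest average match weight. The function names (`match_nodes`, `ground`) and the scoring rule are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def match_nodes(phrase_feats, proposal_feats):
    """For each phrase node, return a distribution over proposal nodes.

    Assumption: similarity is plain cosine; the paper may use a learned score.
    """
    return [softmax([cosine(p, q) for q in proposal_feats]) for p in phrase_feats]

def ground(phrase_feats, proposal_feats):
    """Return the index of the proposal with the highest average match weight.

    Hypothetical scoring rule, chosen only to keep the sketch short.
    """
    weights = match_nodes(phrase_feats, proposal_feats)
    n = len(proposal_feats)
    avg = [sum(w[j] for w in weights) / len(weights) for j in range(n)]
    return max(range(n), key=avg.__getitem__)

# Toy 2-D features: two phrase nodes, three candidate proposals.
phrases = [[1.0, 0.0], [0.9, 0.1]]
proposals = [[0.0, 1.0], [1.0, 0.0], [-1.0, 0.0]]
best = ground(phrases, proposals)
```

In this toy input both phrase vectors point roughly along the first axis, so the second proposal (index 1) receives the largest weight. The released code at the URL above is the reference for the actual method.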

Original language: English
Title of host publication: Proceedings - 2021 IEEE/CVF International Conference on Computer Vision, ICCV 2021
Publisher: IEEE, Institute of Electrical and Electronics Engineers
Pages: 3702-3711
Number of pages: 10
ISBN (Electronic): 9781665428125
Publication status: Published - 2021
Event: 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021 - Virtual, Online, Canada
Duration: 11 Oct 2021 - 17 Oct 2021

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision
ISSN (Print): 1550-5499

Conference

Conference: 18th IEEE/CVF International Conference on Computer Vision, ICCV 2021
Country/Territory: Canada
City: Virtual, Online
Period: 11/10/21 - 17/10/21
