TY - JOUR
T1 - Relation Graph Network for 3D Object Detection in Point Clouds
AU - Feng, Mingtao
AU - Gilani, Syed Zulqarnain
AU - Wang, Yaonan
AU - Zhang, Liang
AU - Mian, Ajmal
PY - 2021
Y1 - 2021
N2 - Convolutional Neural Networks (CNNs) have emerged as a powerful tool for object detection in 2D images. However, their power has not been fully realised for detecting 3D objects directly in point clouds without conversion to regular grids. Moreover, existing state-of-the-art 3D object detection methods aim to recognize objects individually without exploiting their relationships during learning or inference. In this article, we first propose a strategy that associates the predictions of direction vectors with pseudo geometric centers, leading to a win-win solution for 3D bounding box candidates regression. Secondly, we propose point attention pooling to extract uniform appearance features for each 3D object proposal, benefiting from the learned direction features, semantic features and spatial coordinates of the object points. Finally, the appearance features are used together with the position features to build 3D object-object relationship graphs for all proposals to model their co-existence. We explore the effect of relation graphs on proposals' appearance feature enhancement under supervised and unsupervised settings. The proposed relation graph network comprises a 3D object proposal generation module and a 3D relation module, making it an end-to-end trainable network for detecting 3D objects in point clouds. Experiments on challenging benchmark point cloud datasets (SunRGB-D, ScanNet and KITTI) show that our algorithm performs better than existing state-of-the-art.
AB - Convolutional Neural Networks (CNNs) have emerged as a powerful tool for object detection in 2D images. However, their power has not been fully realised for detecting 3D objects directly in point clouds without conversion to regular grids. Moreover, existing state-of-the-art 3D object detection methods aim to recognize objects individually without exploiting their relationships during learning or inference. In this article, we first propose a strategy that associates the predictions of direction vectors with pseudo geometric centers, leading to a win-win solution for 3D bounding box candidates regression. Secondly, we propose point attention pooling to extract uniform appearance features for each 3D object proposal, benefiting from the learned direction features, semantic features and spatial coordinates of the object points. Finally, the appearance features are used together with the position features to build 3D object-object relationship graphs for all proposals to model their co-existence. We explore the effect of relation graphs on proposals' appearance feature enhancement under supervised and unsupervised settings. The proposed relation graph network comprises a 3D object proposal generation module and a 3D relation module, making it an end-to-end trainable network for detecting 3D objects in point clouds. Experiments on challenging benchmark point cloud datasets (SunRGB-D, ScanNet and KITTI) show that our algorithm performs better than existing state-of-the-art.
UR - http://www.scopus.com/inward/record.url?scp=85096456980&partnerID=8YFLogxK
U2 - 10.1109/TIP.2020.3031371
DO - 10.1109/TIP.2020.3031371
M3 - Article
C2 - 33085616
AN - SCOPUS:85096456980
SN - 1057-7149
VL - 30
SP - 92
EP - 107
JO - IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
JF - IEEE transactions on image processing : a publication of the IEEE Signal Processing Society
ER -