TY - JOUR
T1 - Discriminative Correspondence Estimation for Unsupervised RGB-D Point Cloud Registration
AU - Yan, Chenbo
AU - Feng, Mingtao
AU - Wu, Zijie
AU - Guo, Yulan
AU - Dong, Weisheng
AU - Wang, Yaonan
AU - Mian, Ajmal
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024/10/14
Y1 - 2024/10/14
N2 - Point cloud registration is a fundamental task of estimating the rigid transformation between two point clouds, and is regarded as a prerequisite for downstream vision tasks. Recent works have sought to address the registration problem using readily obtainable RGB-D sequences rather than relying solely on point clouds, which may not always be available. These methods typically follow a common paradigm: extracting features from the input data, estimating correspondences, and obtaining the transformation matrix through geometric fitting. However, most existing unsupervised RGB-D point cloud registration works struggle to obtain fine-grained, robust, and discriminative correspondences because they simply concatenate multimodal features, which also increases the feature dimensionality. In this work, we design a generative feature extraction module to fully leverage multimodal information, and we take a novel perspective on correspondence estimation: points in the source and target point clouds are expanded into hyperrectangle-based embeddings, and their relationships, measured by intersections in n-dimensional space, serve as the basis for estimating correspondences. Each hyperrectangle-based embedding is built upon the natural and discriminative semantics produced by the proposed generative feature extraction module, which comprises a diffusion branch, a geometric branch, and point-pixel fusion. We harness the capability of the generative model to fully exploit the complementary information of the two modalities in RGB-D frames. Furthermore, this distinctive geometric space allows intersection volumes to be computed efficiently and conditional probabilities to be modeled for estimating correspondences. Extensive experiments on the 3DMatch and ScanNet datasets demonstrate the effectiveness of the proposed method on this challenging task, outperforming state-of-the-art approaches. Our code will be released at: https://github.com/cbyan1003/DCE.
AB - Point cloud registration is a fundamental task of estimating the rigid transformation between two point clouds, and is regarded as a prerequisite for downstream vision tasks. Recent works have sought to address the registration problem using readily obtainable RGB-D sequences rather than relying solely on point clouds, which may not always be available. These methods typically follow a common paradigm: extracting features from the input data, estimating correspondences, and obtaining the transformation matrix through geometric fitting. However, most existing unsupervised RGB-D point cloud registration works struggle to obtain fine-grained, robust, and discriminative correspondences because they simply concatenate multimodal features, which also increases the feature dimensionality. In this work, we design a generative feature extraction module to fully leverage multimodal information, and we take a novel perspective on correspondence estimation: points in the source and target point clouds are expanded into hyperrectangle-based embeddings, and their relationships, measured by intersections in n-dimensional space, serve as the basis for estimating correspondences. Each hyperrectangle-based embedding is built upon the natural and discriminative semantics produced by the proposed generative feature extraction module, which comprises a diffusion branch, a geometric branch, and point-pixel fusion. We harness the capability of the generative model to fully exploit the complementary information of the two modalities in RGB-D frames. Furthermore, this distinctive geometric space allows intersection volumes to be computed efficiently and conditional probabilities to be modeled for estimating correspondences. Extensive experiments on the 3DMatch and ScanNet datasets demonstrate the effectiveness of the proposed method on this challenging task, outperforming state-of-the-art approaches. Our code will be released at: https://github.com/cbyan1003/DCE.
KW - Correspondence Estimation
KW - Feature Fusion
KW - Point Cloud Registration
KW - Unsupervised Learning
UR - http://www.scopus.com/inward/record.url?scp=85207454286&partnerID=8YFLogxK
U2 - 10.1109/TCSVT.2024.3480268
DO - 10.1109/TCSVT.2024.3480268
M3 - Article
AN - SCOPUS:85207454286
SN - 1051-8215
JO - IEEE Transactions on Circuits and Systems for Video Technology
JF - IEEE Transactions on Circuits and Systems for Video Technology
ER -