Bilibili link: https://www.bilibili.com/video/BV11RNHeVExM
C. He, Z. Zhao*, X. Zhang, H. Yu and R. Wang. RotInv-PCT: Rotation-Invariant Point Cloud Transformer via Feature Separation and Aggregation, Neural Networks, 2025, accepted.
Abstract
The widespread use of point clouds has spurred the rapid development of neural networks for point cloud processing. A crucial property of these networks is maintaining consistent output results under random rotations of the input point cloud, namely, rotation invariance. The dominant approach to achieving rotation invariance is to construct local coordinate systems for computing invariant local point cloud coordinates. However, this method neglects the relative pose relationships between local point cloud structures, leading to a decline in network performance. To address this limitation, we propose a novel Rotation-Invariant Point Cloud Transformer (RotInv-PCT). This method extracts the local abstract shape features of the point cloud using Local Reference Frames (LRFs) and explicitly computes the spatial relative pose features between local point clouds, both of which are proven to be rotation-invariant. Furthermore, to capture the long-range pose dependencies between points, we introduce an innovative Feature Aggregation Transformer (FAT) model, which seamlessly fuses the pose features with the shape features to obtain a globally rotation-invariant representation. Moreover, to manage large-scale point clouds, we utilize hierarchical random downsampling to gradually decrease the scale of point clouds, followed by feature aggregation through FAT. To demonstrate the effectiveness of RotInv-PCT, we conducted comparative experiments across various tasks and datasets, including point cloud classification on ScanObjectNN and ModelNet40, part segmentation on ShapeNet, and semantic segmentation on S3DIS and KITTI. Thanks to our provably rotation-invariant features and FAT, our method generally outperforms state-of-the-art networks. In particular, we highlight that RotInv-PCT achieved a 2% improvement in real-world point cloud classification tasks compared to the strongest baseline. Furthermore, in the semantic segmentation task, we improved the performance on the S3DIS dataset by 10% and, for the first time, realized rotation-invariant point cloud semantic segmentation on the KITTI dataset.
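To make the core idea concrete, below is a minimal NumPy sketch of the two ingredients the abstract describes: local coordinates expressed in a per-patch Local Reference Frame (shape features) and the relative pose between two patches' LRFs (pose features). This is not the paper's implementation; the PCA-based LRF, the sign-disambiguation rule, and the patch sizes are illustrative assumptions, chosen only to show why both quantities are unchanged when the whole cloud is rigidly rotated.

```python
import numpy as np

def local_reference_frame(patch):
    """PCA-based LRF for one local patch (k x 3 array of points).

    Returns the patch centroid and a 3x3 matrix whose columns are the LRF
    axes. Axis signs are disambiguated by pointing each axis toward the side
    holding more points, and the last axis is recomputed as a cross product
    so the frame is right-handed. (Illustrative choice, not the paper's.)
    """
    centroid = patch.mean(axis=0)
    centered = patch - centroid
    cov = centered.T @ centered / len(patch)
    _, eigvecs = np.linalg.eigh(cov)               # eigenvalues ascending
    axes = eigvecs[:, ::-1].copy()                 # principal axis first
    for i in range(2):                             # sign disambiguation
        if (centered @ axes[:, i] < 0).sum() > len(patch) / 2:
            axes[:, i] = -axes[:, i]
    axes[:, 2] = np.cross(axes[:, 0], axes[:, 1])  # right-handed frame
    return centroid, axes

def invariant_features(patch_i, patch_j):
    """Shape features of patch_i in its own LRF, plus the relative pose
    (rotation and translation) between the LRFs of patch_i and patch_j.
    All three are unchanged under a rigid rotation of the whole cloud."""
    c_i, R_i = local_reference_frame(patch_i)
    c_j, R_j = local_reference_frame(patch_j)
    shape_i = (patch_i - c_i) @ R_i                # local coordinates
    rel_rot = R_i.T @ R_j                          # relative orientation
    rel_trans = R_i.T @ (c_j - c_i)                # relative position in frame i
    return shape_i, rel_rot, rel_trans

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch_a, patch_b = rng.normal(size=(2, 32, 3))
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # random orthogonal matrix
    Q *= np.sign(np.linalg.det(Q))                 # force a proper rotation
    f0 = invariant_features(patch_a, patch_b)
    f1 = invariant_features(patch_a @ Q.T, patch_b @ Q.T)
    print([np.allclose(a, b) for a, b in zip(f0, f1)])  # expected: [True, True, True]
```

In the full method these per-patch shape features and inter-patch pose features are what the FAT module fuses across hierarchically downsampled point sets; the sketch only demonstrates the invariance property itself.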