基于转移分块Transformer和特征金字塔的点-体素三维目标检测方法

Point-voxel 3D object detection method based on transfer block Transformer and feature pyramid

DOI 10.12208/j.jer.20250007
Journal
Journal of Engineering Research
Year, Volume (Issue) 2025, 4(1)
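Authors
Liu Liangjie, Ren Hongqian, Sun Kai, Xie Guotao
Affiliations

1 Zhuzhou CRRC Times Software Technology Co., Ltd., Zhuzhou, Hunan
2 Xiwan Open-pit Coal Mine, CHN Energy Shaanxi Shenyan Coal Co., Ltd., Yulin, Shaanxi
3 College of Mechanical and Vehicle Engineering, Hunan University, Changsha, Hunan
4 Wuxi Intelligent Control Research Institute, Hunan University, Wuxi, Jiangsu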

Abstract
With advances in environmental perception technology, LiDAR-based 3D object detection has made significant progress. However, voxel-based 3D detectors struggle to capture rich contextual information and fine-grained features when partitioning point clouds; under occlusion and truncation in particular, detail in the original point cloud is often lost. To address these challenges, this paper proposes a new data augmentation strategy that improves the model's ability to handle incomplete point clouds, together with PV-FMRTNet, a point-voxel 3D object detection model based on a transfer block Transformer and a feature pyramid, which effectively addresses the loss of position information when point clouds are converted to voxels. In addition, a new 2D feature encoding network is designed to improve the performance of voxel-based 3D object detection. Evaluation results show that the model achieves detection accuracies of 84.30%, 61.76%, and 78.08% for cars, pedestrians, and cyclists, respectively, an average improvement of 2.08% over baseline models such as the mainstream PointPillars algorithm, demonstrating advanced accuracy and robustness.
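The position-information loss that the abstract attributes to voxelization can be illustrated with a minimal sketch. This is not PV-FMRTNet or any code from the paper: the function names (`voxelize`, `voxel_center`, `max_position_error`), the sample points, and the 0.5 m voxel size are all assumptions chosen for illustration. It only shows that once points are grouped into a voxel and represented by the voxel center, each point's exact position is recoverable only up to half the voxel diagonal.

```python
import math
from collections import defaultdict


def voxelize(points, voxel_size):
    """Group (x, y, z) points by the integer index of the voxel containing them."""
    voxels = defaultdict(list)
    for p in points:
        index = tuple(math.floor(c / voxel_size) for c in p)
        voxels[index].append(p)
    return dict(voxels)


def voxel_center(index, voxel_size):
    """Return the geometric center of the voxel with the given integer index."""
    return tuple((i + 0.5) * voxel_size for i in index)


def max_position_error(points, voxel_size):
    """Largest distance between any point and the center of its voxel."""
    worst = 0.0
    for index, members in voxelize(points, voxel_size).items():
        center = voxel_center(index, voxel_size)
        for p in members:
            worst = max(worst, math.dist(p, center))
    return worst


# Hypothetical sample points (metres); the first two share one voxel.
points = [(0.10, 0.10, 0.10), (0.30, 0.35, 0.05), (1.20, 0.40, 0.90)]
voxel_size = 0.5  # assumed voxel edge length, not a value from the paper

voxels = voxelize(points, voxel_size)
error = max_position_error(points, voxel_size)
bound = voxel_size * math.sqrt(3) / 2  # half the voxel diagonal
print(f"{len(voxels)} voxels, max error {error:.3f} m (bound {bound:.3f} m)")
```

Larger voxels reduce the number of cells to process but raise this error bound, which is one way to see why point-voxel methods retain raw point features alongside the voxel grid.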
Keywords
Autonomous driving; Deep learning; 3D object detection; Feature pyramid; Point-voxel
Funding
Pages 46-58
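  • Cite this article

Liu Liangjie, Ren Hongqian, Sun Kai, Xie Guotao. Point-voxel 3D object detection method based on transfer block Transformer and feature pyramid [J]. Journal of Engineering Research. 2025; 4(1): 46-58.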
