Journal of Jianghan University (Natural Science Edition) ›› 2025, Vol. 53 ›› Issue (5): 85-96. doi: 10.16389/j.cnki.cn42-1737/n.2025.05.010

• Artificial Intelligence •

Multi-person Pose Estimation Algorithm Based on YOLO-Pose in Occluded Scenes

HOU Shunzhi, TAO Jun*, YUAN Donghua, WU Wenjun, WEI Yifan

  1. School of Artificial Intelligence, Jianghan University, Wuhan 430056, Hubei, China
  • Published: 2025-10-22
  • Corresponding author: TAO Jun
  • About the author: HOU Shunzhi (b. 2000), male, master's student; research interests: deep learning and computer vision.
  • Funding:
    Jianghan University Graduate Training Fund (301004310001)

Abstract: Human pose estimation plays a crucial role in real-world applications such as sports training, robot behavior training, and intelligent interaction. To address the complex network structures and limited efficiency of most human pose estimation algorithms, this paper proposes YOLO-Pose-GSNS, a multi-person pose estimation algorithm based on an improved YOLO-Pose. To reduce the parameter count and computational cost and thereby make the model lightweight, the GSConv convolution module replaces the ordinary Conv convolution. The feature fusion layer is redesigned with the NAMAttention module to strengthen feature extraction, and four different detection heads are used to improve detection in occluded scenes. The SIoU loss function is introduced as the bounding-box regression loss to improve localization accuracy. In tests on the OC_Human dataset, compared with the baseline model, the improved YOLO-Pose-GSNS model is 7.4% smaller and its GFLOPs fall by 3.4% to 19.5, while precision (P), recall (R), mAP@0.5, and mAP@0.5:0.95 increase by 8.7, 13.4, 12.1, and 17.2 percentage points, respectively. The proposed YOLO-Pose-GSNS algorithm thus makes the model lightweight while improving the accuracy of multi-person pose estimation in occluded scenes.

Key words: multi-person pose estimation, YOLO-Pose, occluded scene, lightweight, NAMAttention
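
The abstract reports two kinds of change: relative reductions in percent (model size, GFLOPs) and absolute gains in percentage points (P, R, mAP). The short Python sketch below illustrates the difference and backs out the baseline GFLOPs implied by "a 3.4% drop to 19.5". The helper names are hypothetical, the 70.0% baseline mAP is an invented illustration value rather than a figure from the paper, and the implied baseline is only approximate because the reported numbers are rounded.

```python
def implied_baseline(new_value: float, relative_drop_pct: float) -> float:
    """Back out the pre-change value from the new value and a relative drop in percent."""
    return new_value / (1.0 - relative_drop_pct / 100.0)


def add_points(baseline_pct: float, gain_points: float) -> float:
    """Apply an absolute gain expressed in percentage points."""
    return baseline_pct + gain_points


def add_relative(baseline_pct: float, gain_pct: float) -> float:
    """Apply a relative gain expressed in percent of the baseline."""
    return baseline_pct * (1.0 + gain_pct / 100.0)


# Reported in the abstract: GFLOPs fall by 3.4% to 19.5.
baseline_gflops = implied_baseline(19.5, 3.4)
print(f"implied baseline GFLOPs: {baseline_gflops:.1f}")

# Hypothetical baseline mAP@0.5 of 70.0%, chosen only for illustration.
hypothetical_map = 70.0
print(f"+12.1 points:  {add_points(hypothetical_map, 12.1):.1f}%")
print(f"+12.1 percent: {add_relative(hypothetical_map, 12.1):.1f}%")
```

For the same nominal figure, a gain in percentage points is larger than a relative gain in percent whenever the baseline is below 100%, so the two readings should not be interchanged when comparing models.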
