Human Pose Estimation and Recognition with Deep-Learning CNNs, Based on YOLOv8 Pose Estimation (Pose Estimation + Object Detection + Tracking)


张建站

2026/5/12 9:53:31

10 min read


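Overview

YOLOv7 pose estimation: a fast and accurate human pose estimation model

Human pose estimation is an important task in computer vision, with applications such as action recognition, human-computer interaction, and surveillance. In recent years, deep-learning methods have achieved strong results on this task, and the YOLOv7 pose estimation model is one of the most popular of them.

Algorithm

The YOLOv7 pose estimation model extends the YOLOv7 object detector, which uses a single neural network to predict bounding boxes and class probabilities for multiple objects in an image at once. In the pose model, the network also predicts the keypoint locations of each person, and these keypoints are used to estimate that person's pose.

Network

The model is built on a deep convolutional neural network composed mainly of stacked convolutional and max-pooling layers. The network takes an input image and produces feature maps, which are then used to predict the keypoint locations of each person.

Dataset

The model is trained on large datasets such as COCO (Common Objects in Context) and MPII (Max Planck Institute for Informatics), which contain annotated images of thousands of people in a wide range of poses and environments. Training is supervised and uses data augmentation such as random scaling, rotation, and translation of the input images.

Advantages

A key advantage of the YOLOv7 pose estimation model is its combination of speed and accuracy. It can estimate the poses of multiple people in real time, which makes it suitable for applications such as human-computer interaction and surveillance. It also achieves state-of-the-art performance on benchmark datasets such as COCO and MPII, which demonstrates its accuracy and robustness.

Conclusion

In short, the YOLOv7 pose estimation model is a fast and accurate deep-learning model for human pose estimation. Its ability to estimate multiple people's poses in real time makes it useful in many applications, and its benchmark results support its effectiveness. As deep learning continues to develop, further improvements in human pose estimation can be expected, and the YOLOv7 pose estimation model is likely to play an important role in them.

Code

    # Full code is available on request via private message or QQ 1309399183.
    # The imports below follow the YOLOv7 repository layout (models/, utils/).
    # They are assumed here because the published excerpt omits them.
    import time

    import cv2
    import numpy as np
    import torch
    from torchvision import transforms

    from models.experimental import attempt_load
    from utils.datasets import letterbox
    from utils.general import non_max_suppression_kpt
    from utils.plots import colors, output_to_keypoint, plot_one_box_kpt
    from utils.torch_utils import select_device


    def run(poseweights="yolov7-w6-pose.pt", source="football1.mp4", device="cpu",
            view_img=False, save_conf=False, line_thickness=3,
            hide_labels=False, hide_conf=True):
        frame_count = 0   # number of frames processed
        total_fps = 0     # running FPS total
        time_list = []    # list to store time
        fps_list = []     # list to store FPS

        # Use the function arguments rather than a global `opt`, which the excerpt never defines.
        device = select_device(device)
        half = device.type != "cpu"

        model = attempt_load(poseweights, map_location=device)  # load model
        _ = model.eval()
        names = model.module.names if hasattr(model, "module") else model.names  # class names

        # A numeric source is a camera index; anything else is a video path.
        if source.isnumeric():
            cap = cv2.VideoCapture(int(source))
        else:
            cap = cv2.VideoCapture(source)

        if not cap.isOpened():  # video capture could not be opened
            print("Error while trying to read video. Please check path again")
            raise SystemExit()

        frame_width = int(cap.get(3))   # video frame width
        frame_height = int(cap.get(4))  # video frame height

        # Letterbox the first frame to learn the output size for the video writer.
        vid_write_image = letterbox(cap.read()[1], frame_width, stride=64, auto=True)[0]
        resize_height, resize_width = vid_write_image.shape[:2]
        out_video_name = f"{source.split('/')[-1].split('.')[0]}"
        out = cv2.VideoWriter(f"{source}_keypoint.mp4",
                              cv2.VideoWriter_fourcc(*"mp4v"), 30,
                              (resize_width, resize_height))

        while cap.isOpened():  # the original tested `cap.isOpened` without calling it
            print("Frame {} Processing".format(frame_count + 1))
            ret, frame = cap.read()  # success flag and frame
            if not ret:              # end of video or read failure: stop looping
                break

            orig_image = frame
            image = cv2.cvtColor(orig_image, cv2.COLOR_BGR2RGB)  # BGR -> RGB
            image = letterbox(image, frame_width, stride=64, auto=True)[0]
            image_ = image.copy()
            image = transforms.ToTensor()(image).unsqueeze(0)  # [h, w, c] -> [1, c, h, w]
            image = image.to(device).float()
            start_time = time.time()  # start time for FPS calculation

            with torch.no_grad():  # get predictions
                output_data, _ = model(image)

            output_data = non_max_suppression_kpt(output_data,
                                                  0.25,  # confidence threshold
                                                  0.65,  # IoU threshold
                                                  nc=model.yaml["nc"],      # number of classes
                                                  nkpt=model.yaml["nkpt"],  # number of keypoints
                                                  kpt_label=True)
            output = output_to_keypoint(output_data)

            # [b, c, h, w] -> [h, w, c] in 0..255 for display, then back to BGR for OpenCV.
            im0 = image[0].permute(1, 2, 0) * 255
            im0 = im0.cpu().numpy().astype(np.uint8)
            im0 = cv2.cvtColor(im0, cv2.COLOR_RGB2BGR)
            gn = torch.tensor(im0.shape)[[1, 0, 1, 0]]  # normalization gain whwh

            for i, pose in enumerate(output_data):  # detections per image
                if len(pose):  # skip images with no detected pose
                    for c in pose[:, 5].unique():
                        n = (pose[:, 5] == c).sum()  # detections per class
                        print("No of Objects in Current Frame : {}".format(n))

                    # Take the box and the keypoints from the same row. The original paired
                    # reversed boxes with forward-indexed keypoints, which mismatches them
                    # whenever more than one person is detected.
                    for det in reversed(pose):
                        *xyxy, conf, cls = det[:6]
                        c = int(cls)  # integer class
                        kpts = det[6:]
                        label = None if hide_labels else (
                            names[c] if hide_conf else f"{names[c]} {conf:.2f}")
                        plot_one_box_kpt(xyxy, im0, label=label, color=colors(c, True),
                                         line_thickness=line_thickness, kpt_label=True,
                                         kpts=kpts, steps=3, orig_shape=im0.shape[:2])
            # The published excerpt ends here; the remaining per-frame steps are not shown.

        # Added so that the excerpt releases its resources when the loop ends.
        cap.release()
        out.release()

Environment Setup

1. Clone the project and enter its directory. Contact me for the repository address; my_project below is a placeholder.

    git clone my_project
    cd my_project

2. On Linux, create and activate a virtual environment:

    python3 -m venv psestenv
    source psestenv/bin/activate

3. On Windows, use this instead:

    python3 -m venv psestenv
    cd psestenv
    cd Scripts
    activate
    cd ..
    cd ..

4. Upgrade pip and install the dependencies:

    pip install --upgrade pip
    pip install -r requirements.txt

Results

For more computer vision projects, see the column. If this was useful to you, feel free to message me, leave a like, or get in touch.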
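To make the row layout used by the drawing loop more concrete, here is a minimal sketch that splits one detection row into its box, confidence, class, and keypoints. It assumes the layout implied by the code above (six leading values, then keypoints in groups of steps=3: x, y, confidence) and the 17 keypoint names of the COCO annotation format. The names split_detection, Keypoint, and visible are hypothetical helpers for illustration and are not part of the YOLOv7 code base. It works on plain Python lists, so it runs without the model.

```python
from typing import List, NamedTuple, Sequence, Tuple

# Keypoint order of the COCO person-keypoint annotation format.
COCO_KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


class Keypoint(NamedTuple):
    name: str
    x: float
    y: float
    conf: float


def split_detection(
    row: Sequence[float], steps: int = 3
) -> Tuple[List[float], float, int, List[Keypoint]]:
    """Split one row [x1, y1, x2, y2, conf, cls, kpt values...] into its parts.

    Hypothetical helper: the layout is an assumption taken from how the
    article's code slices each row (first six values, then the keypoints).
    """
    if steps != 3:
        raise ValueError("this sketch only handles (x, y, conf) triplets")
    box = [float(v) for v in row[:4]]
    conf = float(row[4])
    cls = int(row[5])
    flat = list(row[6:])
    if len(flat) % steps != 0:
        raise ValueError("keypoint values must come in groups of `steps`")

    kpts = []
    for start in range(0, len(flat), steps):
        index = start // steps
        # Fall back to a generic name if there are more keypoints than COCO defines.
        name = (COCO_KEYPOINT_NAMES[index]
                if index < len(COCO_KEYPOINT_NAMES) else f"kpt_{index}")
        x, y, c = flat[start:start + steps]
        kpts.append(Keypoint(name, float(x), float(y), float(c)))
    return box, conf, cls, kpts


def visible(kpts: List[Keypoint], min_conf: float = 0.5) -> List[Keypoint]:
    """Keep only keypoints whose confidence reaches min_conf."""
    return [k for k in kpts if k.conf >= min_conf]


# Synthetic row: one box plus 17 keypoints, the first five with high confidence.
demo_row = [10.0, 20.0, 110.0, 220.0, 0.9, 0.0]
for i in range(17):
    demo_row += [float(i), float(2 * i), 0.9 if i < 5 else 0.2]

demo_box, demo_conf, demo_cls, demo_kpts = split_detection(demo_row)
print(demo_box, demo_cls, [k.name for k in visible(demo_kpts)])
```

With real model output, each row would first need converting from a tensor to a list (for example with tolist()), and the confidence threshold of 0.5 is only an illustrative choice.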
