Step-by-Step Tutorial: Converting COCO-Format JSON Labels to YOLOv5-Ready TXT Files with Python


张建站

2026/5/9 5:09:01

10 min read


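From COCO to YOLOv5: A Hands-On, From-Scratch Guide to Converting JSON Labels

When you first get into object detection, the biggest headache is often not model training itself but the format conversions during data preparation. Last week, while helping new students in our lab process the COCO dataset, I found that most tutorials online are either too terse or ship code with hidden bugs. This time we take the most direct route, from parsing the data structures to a complete Python implementation, and settle a conversion problem that looks simple but hides several traps.

1. Understanding the essential differences between the two formats

1.1 Anatomy of COCO JSON

Open any COCO-format JSON file and you will see five top-level fields, nested like Russian dolls (the # comments below are annotations for the reader, not valid JSON):

    {
      "info": {},              # dataset metadata
      "licenses": [],          # license notices
      "images": [              # list of image files
        {
          "id": 0,
          "width": 640,
          "height": 480,
          "file_name": "000001.jpg"
        }
      ],
      "annotations": [         # collection of annotations
        {
          "id": 1,
          "image_id": 0,
          "category_id": 2,
          "bbox": [x, y, width, height]
        }
      ],
      "categories": [          # category definitions
        {
          "id": 1,
          "name": "person"
        }
      ]
    }

The key point is that bbox in annotations uses the top-left corner plus width and height (XYWH), in pixels. YOLO uses the center point plus width and height, normalized by the image size. For example, on a 640×480 image like the one in the sample above:

COCO format: [100, 150, 200, 300] means the top-left corner is at (100, 150), the box is 200 pixels wide and 300 pixels high.
YOLO format: 0 0.3125 0.625 0.3125 0.625 means class 0, center at (200/640, 300/480) = (0.3125, 0.625), width 200/640 = 0.3125, height 300/480 = 0.625 (all normalized).

1.2 The rules of YOLO TXT

YOLOv5 expects label files in plain text, one object per line. The core rules:

- Normalized values: every coordinate is divided by the image width or height, giving a float between 0 and 1.
- Center-point representation: class x_center y_center width height
- File correspondence: image.jpg maps to image.txt. The two share the same file name (minus extension) and sit at the same relative path under parallel images/ and labels/ directories.

A typical directory layout:

    dataset/
    ├── images/
    │   └── train/
    │       ├── 000001.jpg
    │       └── 000002.jpg
    └── labels/
        └── train/
            ├── 000001.txt
            └── 000002.txt

2. The four core modules of the conversion

2.1 Pre-processing checklist

Before writing any code, confirm the following.

Path check:

    import os

    json_path = "coco/annotations/instances_train2017.json"
    assert os.path.exists(json_path), f"JSON file not found: {json_path}"

Data sampling check:

    import json

    with open(json_path) as f:
        data = json.load(f)

    print(f"Total images: {len(data['images'])}")
    print(f"Total annotations: {len(data['annotations'])}")
    print(f"Sample annotation: {data['annotations'][0]}")

Key fields to verify:

| Field | Required | Example value | Notes |
| --- | --- | --- | --- |
| images[].id | ✓ | 397133 | Unique identifier |
| images[].file_name | ✓ | 000000397133.jpg | Includes the extension |
| annotations[].image_id | ✓ | 397133 | ID of the matching image |
| annotations[].bbox | ✓ | [472.81, 125.66, 103.29, 85.67] | XYWH format |

2.2 The math behind the coordinate conversion

The conversion boils down to two operations.

Top-left corner to center point (COCO → YOLO):

    x_center = x + width / 2
    y_center = y + height / 2

Normalization:

    x_center / image_width
    y_center / image_height
    width / image_width
    height / image_height

In Python:

    def coco_to_yolo(bbox, img_w, img_h):
        # bbox is COCO-style [x, y, w, h] with (x, y) the top-left corner, in pixels
        x, y, w, h = bbox
        x_center = (x + w / 2) / img_w
        y_center = (y + h / 2) / img_h
        w_norm = w / img_w
        h_norm = h / img_h
        return [x_center, y_center, w_norm, h_norm]

2.3 An efficient batch-processing design

Scanning the whole annotations list for every image costs O(n) per image, which adds up quickly on a dataset with hundreds of thousands of annotations. Building dictionaries once up front makes each per-image lookup O(1):

    from collections import defaultdict

    # Map image ID -> list of its annotations
    image_annots = defaultdict(list)
    for ann in data["annotations"]:
        image_annots[ann["image_id"]].append(ann)

    # Map image ID -> image info
    image_info = {img["id"]: img for img in data["images"]}

2.4 The complete conversion script

    import json
    import os
    from collections import defaultdict
    from pathlib import Path

    def convert_coco_to_yolo(json_path, output_dir):
        with open(json_path) as f:
            data = json.load(f)

        # Map category IDs to contiguous 0-based indices
        categories = {cat["id"]: idx for idx, cat in enumerate(data["categories"])}

        # Pre-build the lookup structures
        image_annots = defaultdict(list)
        for ann in data["annotations"]:
            image_annots[ann["image_id"]].append(ann)
        image_info = {img["id"]: img for img in data["images"]}

        # Make sure the output directory exists
        os.makedirs(output_dir, exist_ok=True)

        for img_id, annots in image_annots.items():
            img = image_info[img_id]
            img_w, img_h = img["width"], img["height"]

            # Build the matching .txt file name
            txt_name = Path(img["file_name"]).stem + ".txt"
            txt_path = os.path.join(output_dir, txt_name)

            with open(txt_path, "w") as f:
                for ann in annots:
                    # Coordinate conversion
                    x, y, w, h = ann["bbox"]
                    x_center = (x + w / 2) / img_w
                    y_center = (y + h / 2) / img_h
                    w_norm = w / img_w
                    h_norm = h / img_h

                    # Look up the class index
                    class_idx = categories[ann["category_id"]]

                    # Write one line in YOLO format
                    line = f"{class_idx} {x_center:.6f} {y_center:.6f} {w_norm:.6f} {h_norm:.6f}\n"
                    f.write(line)

    # Usage example
    convert_coco_to_yolo(
        json_path="coco/annotations/instances_train2017.json",
        output_dir="coco/labels/train2017"
    )

3. Pitfalls and performance tuning

3.1 Top 5 beginner mistakes

Path traps
Wrong: writing a Windows path as a plain string, "C:\Users\name\data" (the backslashes are read as escape sequences).
Right: use a raw string, r"C:\Users\name\data", or a Path object.

Missing normalization

    # Wrong: normalization forgotten
    x_center = (x + w / 2)
    # Right
    x_center = (x + w / 2) / img_w

Non-contiguous category IDs
The original COCO IDs may have gaps (e.g. 1, 2, 4, 7). Rebuild them as contiguous indices (0, 1, 2, 3).

Bad image sizes
Some datasets contain images recorded as 0x0. Filter them out before dividing:

    if img_w == 0 or img_h == 0:
        continue

Float formatting
A plain str() conversion can produce scientific notation for small values (str(0.00001) gives '1e-05'). Use f"{value:.6f}" to fix the number of decimal places.

3.2 Advanced optimizations

Memory-friendly version (for very large datasets):

    import os
    import ijson

    def stream_convert(json_path, output_dir):
        os.makedirs(output_dir, exist_ok=True)
        with open(json_path, "rb") as f:
            # Stream the "images" array one element at a time
            image_info = {img["id"]: img for img in ijson.items(f, "images.item")}
            # Rewind before the second pass over the same file
            f.seek(0)
            for ann in ijson.items(f, "annotations.item"):
                # Note: ijson yields decimal.Decimal for non-integer numbers;
                # convert with float() before doing arithmetic.
                ...  # remaining logic as in convert_coco_to_yolo

Multi-process speedup:

    from multiprocessing import Pool

    def process_image(args):
        img_id, annots, image_info, categories = args
        # conversion logic...

    if __name__ == "__main__":
        # chunked_data: a list of argument tuples prepared beforehand
        with Pool(processes=4) as pool:
            pool.map(process_image, chunked_data)

4. Verifying the conversion result

4.1 Visual inspection tool

Draw the boxes with OpenCV to verify them:

    import cv2

    def visualize_yolo_label(img_path, label_path):
        img = cv2.imread(img_path)
        if img is None:
            raise FileNotFoundError(f"Cannot read image: {img_path}")
        h, w = img.shape[:2]

        with open(label_path) as f:
            for line in f:
                class_id, xc, yc, bw, bh = map(float, line.split())
                # Convert back to pixel coordinates
                x1 = int((xc - bw / 2) * w)
                y1 = int((yc - bh / 2) * h)
                x2 = int((xc + bw / 2) * w)
                y2 = int((yc + bh / 2) * h)
                cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)

        cv2.imshow("Validation", img)
        cv2.waitKey(0)

4.2 Automated validation script

The script below compares the number of annotations in the original JSON with the number of lines written to the TXT files. The counts match only if no annotation was filtered out during conversion.

    import json
    from pathlib import Path

    def validate_conversion(original_json, yolo_labels_dir):
        # Count the annotations in the original JSON
        with open(original_json) as f:
            data = json.load(f)
        original_count = len(data["annotations"])

        # Count the label lines across the converted TXT files
        converted_count = 0
        for txt_file in Path(yolo_labels_dir).glob("*.txt"):
            with open(txt_file) as f:
                converted_count += sum(1 for _ in f)

        assert original_count == converted_count, \
            f"Annotation count mismatch: original {original_count} != converted {converted_count}"
        print(f"Validation passed: all {original_count} annotations were converted")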
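A count check confirms that nothing was dropped, but it cannot tell whether the values themselves are sane. As a complement, here is a minimal sketch of a per-line sanity check. The function names `check_label_line` and `check_label_dir` and the `num_classes` parameter are illustrative assumptions, not part of YOLOv5 or of the script above. It flags lines that do not have five fields, class indices outside the expected range, and coordinates outside [0, 1], which usually points to a forgotten normalization.

```python
from pathlib import Path


def check_label_line(line, num_classes):
    """Return a list of problems found in one YOLO label line (empty if none)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_idx = int(parts[0])
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return ["non-numeric field"]

    problems = []
    if not 0 <= class_idx < num_classes:
        problems.append(f"class index {class_idx} out of range")
    if not all(0.0 <= v <= 1.0 for v in coords):
        problems.append("coordinate outside [0, 1] (normalization missing?)")
    # coords is [x_center, y_center, width, height]
    if coords[2] <= 0 or coords[3] <= 0:
        problems.append("non-positive width or height")
    return problems


def check_label_dir(labels_dir, num_classes):
    """Scan every .txt file and collect (file name, line number, problem) tuples."""
    report = []
    for txt_file in sorted(Path(labels_dir).glob("*.txt")):
        with open(txt_file) as f:
            for line_no, line in enumerate(f, start=1):
                if not line.strip():
                    continue  # ignore blank lines
                for problem in check_label_line(line, num_classes):
                    report.append((txt_file.name, line_no, problem))
    return report
```

Calling `check_label_dir("coco/labels/train2017", num_classes=80)` (the path and class count are placeholders for your own setup) returns an empty list when every line passes, and otherwise one tuple per problem, so the offending file and line can be opened directly.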
