Level 2 Data-Centric, 0. Visualizing annotations on images

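0. Visualizing annotations on images
1. Custom Augmentation
2. Making a validation set
3. How to use CVAT (labeling tool)
4. Resuming training (how to use a pth file)
5. How to use the public administrative document OCR dataset (AI Hub) (COCO ↔ UFO)
6. How to use the OCR data, finance and logistics dataset (AI Hub) (COCO ↔ UFO)
7. How to use CORD (Clova, Hugging Face) (COCO ↔ UFO)
8. How to concatenate JSON files
9. Training-time difference depending on the numpy and albumentations versions: 12 min → 3 min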

0. Visualizing annotations on images

import os
import json
from glob import glob
from pathlib import Path
from PIL import Image, ImageDraw

def read_json(filename: str):
    with Path(filename).open(encoding='utf8') as handle:
        ann = json.load(handle)
    return ann


img_lists = glob('../data/medical/img/test/*.jpg')  # load the images in the test image folder
img_folder = '../data/medical/img/test'  # path to the test image folder
data = read_json("../data/medical/ufo/blahblah.json")  # load the annotation JSON (UFO format)

img_lists = [os.path.basename(i) for i in img_lists]  # keep only the file names, which are the keys in the JSON

def save_vis_to_img(save_dir: str | os.PathLike = None, img_lists: list = None) -> None:
    if not os.path.exists(save_dir):
        os.makedirs(save_dir, exist_ok=True) 
    
    for i in range(len(img_lists)):
        img_json = [[k, v] for k, v in data['images'].items() if k == img_lists[i]]
        if img_json:                     # the image has an entry in the JSON
            img_path = img_json[0][0]
            img = Image.open(os.path.join(img_folder, img_path)).convert("RGB")
            draw = ImageDraw.Draw(img)
        else:
            print(f"Image not found for {img_lists[i]}")
            continue                     # skip this image; img_json[0] below would raise IndexError

        # Every annotation in the prepared dataset is a word, not a single character.
        for obj_k, obj_v in img_json[0][1]['words'].items():

            obj_name = f"{obj_k}"

            # bbox points
            pts = [(int(p[0]), int(p[1])) for p in obj_v['points']]
            pt1 = sorted(pts, key=lambda x: (x[1], x[0]))[0]

            # Masking object which not use for training.            

            draw.polygon(pts, outline=(255, 0, 0))
            draw.text(
                (pt1[0]-3, pt1[1]-12),
                obj_name,
                fill=(0, 0, 0)
            )
        img.save(os.path.join(save_dir, img_path))
        
save_vis_to_img("blahblah", img_lists) # all annotated images are saved to this folder

This draws the annotations on top of each image so that they can be inspected visually.
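The script reads the JSON as `data['images'][file_name]['words'][word_id]['points']`. As a rough sketch of that layout, the snippet below builds a tiny UFO-style file and walks it the same way. The file name, word id, and coordinates are made up for illustration, and a real UFO file may carry more keys than the ones shown here.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical minimal UFO-style annotation; only the keys the script reads are included.
sample = {
    "images": {
        "sample_0001.jpg": {      # key = image file name (made up)
            "words": {
                "0001": {         # key = word id, drawn next to the box as its label
                    "points": [[10.0, 20.0], [110.0, 20.0], [110.0, 50.0], [10.0, 50.0]]
                }
            }
        }
    }
}

# Write the sample to a temporary file and read it back, as read_json would.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample_ufo.json"
    path.write_text(json.dumps(sample), encoding="utf8")
    with path.open(encoding="utf8") as handle:
        loaded = json.load(handle)

# Collect (image name, word id, integer polygon), mirroring how the script walks the data.
polygons = [
    (name, word_id, [(int(x), int(y)) for x, y in word["points"]])
    for name, image in loaded["images"].items()
    for word_id, word in image["words"].items()
]
print(polygons)
```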

The boxes of predicted data can be checked in the same way.
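One way to compare predictions against the labels is to draw both on the same image in different colors. The sketch below assumes the predictions are stored in the same `words` layout as the ground truth, which the post does not state. `draw_words` is a hypothetical helper, the coordinates are made up, and a blank canvas stands in for a real image.

```python
from PIL import Image, ImageDraw

def draw_words(draw, words, color):
    """Outline every word polygon in `words` with `color`."""
    for word in words.values():
        pts = [(int(x), int(y)) for x, y in word["points"]]
        draw.polygon(pts, outline=color)

# Hypothetical ground-truth and predicted words for one image.
gt_words = {"0001": {"points": [[10, 20], [110, 20], [110, 50], [10, 50]]}}
pred_words = {"0001": {"points": [[14, 24], [114, 24], [114, 54], [14, 54]]}}

# A blank white canvas stands in for the real image here.
img = Image.new("RGB", (128, 64), (255, 255, 255))
draw = ImageDraw.Draw(img)
draw_words(draw, gt_words, (255, 0, 0))    # ground truth in red
draw_words(draw, pred_words, (0, 0, 255))  # predictions in blue
```

With a real image, `Image.open(...).convert("RGB")` would replace the blank canvas, and `img.save(...)` would write the overlay to disk as in the script above.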

Interestingly, even when the model reacts to such noise and draws many extra bboxes, the F1 score is often higher.

In other words, for predictions that a person would intuitively expect to score higher, quantitative evaluation with the metric often shows the opposite trend.
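A small arithmetic sketch shows how this can happen. It uses plain F1 computed from true-positive, false-positive, and false-negative counts, with made-up numbers. The matching rule that decides which boxes count as true positives is assumed to be given, and it may differ from the metric actually used in the competition.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Plain F1 from detection counts; returns 0.0 when it is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical image set with 100 ground-truth words.
# "clean": no boxes on noise, but 30 words are missed.
clean = f1_from_counts(tp=70, fp=0, fn=30)
# "noisy": 15 extra boxes on noise, but only 10 words are missed.
noisy = f1_from_counts(tp=90, fp=15, fn=10)

print(f"clean F1 = {clean:.3f}, noisy F1 = {noisy:.3f}")
```

In this example the noisy prediction gets the higher F1, because the recall it gains outweighs the precision it loses to the extra boxes, even though the clean prediction looks tidier when visualized.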

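In fact, the OCR team at Upstage reportedly inspects its entire dataset quite often. Closing the gap between human intuition and the evaluation metric is said to be important work as well. Choosing the evaluation metric matters as much as the training itself.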

Attribution, NonCommercial, NoDerivatives
