our-claude-skills/custom-skills/44-jamie-youtube-subtitle-checker/code/CLAUDE.md

# Jamie YouTube Subtitle Editor

A tool for correcting typos in SBV subtitle files and generating YouTube metadata.

## Quick Commands

```bash
# Correct typos in an SBV file
python scripts/sbv_corrector.py --input "subtitle.sbv" --output "subtitle_corrected.sbv"

# Extract chapters
python scripts/chapter_extractor.py --input "subtitle.sbv"

# Full workflow (correction + chapters + metadata)
python scripts/process_subtitle.py --input "subtitle.sbv" --output-dir "./output"
```

## Workflow

```
[Input: SBV subtitle file]
    ↓
[1. Parse SBV and extract text]
    ↓
[2. Auto-correct typos (typo_dictionary.json)]
    ↓
[3. Standardize medical terminology]
    ↓
[4. Extract chapter timestamps]
    ↓
[5. Generate YouTube metadata]
    ↓
[Output: corrected SBV + youtube_video_info.md]
```

## SBV Format

```
H:MM:SS.sss,H:MM:SS.sss
subtitle text

0:00:05.120,0:00:08.450
안녕하세요 제이미성형외과입니다
```
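
As a rough illustration of the timing line shown above, the sketch below splits an `H:MM:SS.sss,H:MM:SS.sss` line into start and end times in seconds. The helper names `sbv_time_to_seconds` and `parse_timing_line` are hypothetical and are not part of the project's scripts; the pattern assumes exactly three millisecond digits, as in the format above.

```python
import re

# Hypothetical helpers for illustration; not part of scripts/.
TIMING_RE = re.compile(
    r'^(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})$'
)

def sbv_time_to_seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert the components of H:MM:SS.sss to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000

def parse_timing_line(line: str) -> tuple[float, float]:
    """Return (start, end) in seconds for an SBV timing line."""
    match = TIMING_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an SBV timing line: {line!r}")
    parts = match.groups()
    return sbv_time_to_seconds(*parts[:4]), sbv_time_to_seconds(*parts[4:])

start, end = parse_timing_line("0:00:05.120,0:00:08.450")
print(start, end)
```

Raising `ValueError` on a malformed line is one possible design choice; a caller could also skip such blocks silently.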

## Core Functions

### SBV Parsing

```python
import re

def parse_sbv(content: str) -> list[dict]:
    """Parse SBV content into list of subtitle blocks."""
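    # Normalize line endings so files saved with CRLF still split correctly,
    # and tolerate more than one blank line between blocks
    content = content.replace('\r\n', '\n').replace('\r', '\n')
    blocks = re.split(r'\n\s*\n', content.strip())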
    subtitles = []

    for block in blocks:
        lines = block.strip().split('\n')
        if len(lines) >= 2:
            timestamp = lines[0]
            text = '\n'.join(lines[1:])

            # Extract start time
            match = re.match(r'(\d+:\d+:\d+\.\d+),', timestamp)
            start_time = match.group(1) if match else "0:00:00.000"

            subtitles.append({
                'timestamp': timestamp,
                'start': start_time,
                'text': text
            })

    return subtitles
```

### Typo Correction

```python
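import json
import re

def load_typo_dict(path: str = "references/typo_dictionary.json") -> dict:
    """Load typo correction dictionary."""
    with open(path, 'r', encoding='utf-8') as f:
        return json.load(f)

def correct_text(text: str, typo_dict: dict) -> tuple[str, list]:
    """Apply typo corrections and return corrected text with change log."""
    corrected = text
    changes = []

    for wrong, right in typo_dict.items():
        pattern = re.escape(wrong)
        # When the correction extends the typo ("이마거상" -> "이마거상술"),
        # skip occurrences that are already correct to avoid "이마거상술술"
        if right != wrong and right.startswith(wrong):
            pattern += f"(?!{re.escape(right[len(wrong):])})"
        if re.search(pattern, corrected):
            changes.append({'original': wrong, 'corrected': right})
            # A function replacement keeps backslashes in `right` literal
            corrected = re.sub(pattern, lambda _: right, corrected)

    return corrected, changes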
```

### Chapter Extraction

```python
CHAPTER_KEYWORDS = {
    "intro": ["안녕하세요", "제이미성형외과", "정기호"],
    "problem": ["고민", "걱정", "불편", "문제"],
    "explanation": ["이란", "무엇", "어떤", "방법"],
    "procedure": ["수술", "시술", "과정", "진행"],
    "benefits": ["장점", "효과", "결과", "좋은"],
    "recovery": ["회복", "관리", "주의", "후에"],
    "closing": ["상담", "문의", "감사", "추천"]
}

def extract_chapters(subtitles: list[dict]) -> list[dict]:
    """Extract chapter timestamps from subtitles."""
    chapters = []
    seen_types = set()

    for sub in subtitles:
        for chapter_type, keywords in CHAPTER_KEYWORDS.items():
            if chapter_type not in seen_types:
                if any(kw in sub['text'] for kw in keywords):
                    # Convert to M:SS format
                    start = sub['start']
                    parts = start.split(':')
                    if len(parts) == 3:
                        h, m, s = parts
                        total_min = int(h) * 60 + int(m)
                        sec = int(float(s))
                        formatted = f"{total_min}:{sec:02d}"
                    else:
                        formatted = start

                    chapters.append({
                        'time': formatted,
                        'type': chapter_type
                    })
                    seen_types.add(chapter_type)
                    break

    return chapters
```
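
The M:SS conversion inside `extract_chapters` folds hours into minutes, so a subtitle at 1:15:30 becomes `75:30`. As an alternative sketch, a hypothetical `format_chapter_time` helper (not part of the project's scripts) could switch to `H:MM:SS` once the video passes the one-hour mark, a form that YouTube descriptions also accept.

```python
def format_chapter_time(start: str) -> str:
    """Format an SBV start time (H:MM:SS.sss) as M:SS, or H:MM:SS past one hour."""
    h, m, s = start.split(':')
    hours, minutes, seconds = int(h), int(m), int(float(s))
    if hours > 0:
        return f"{hours}:{minutes:02d}:{seconds:02d}"
    return f"{minutes}:{seconds:02d}"

print(format_chapter_time("0:00:05.120"))  # 0:05
print(format_chapter_time("1:02:03.500"))  # 1:02:03
```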

### SBV Rebuilding

```python
def rebuild_sbv(subtitles: list[dict]) -> str:
    """Rebuild SBV content from subtitle blocks."""
    blocks = []
    for sub in subtitles:
        blocks.append(sub['timestamp'])
        blocks.append(sub['text'])
        blocks.append('')
    return '\n'.join(blocks)
```

### Metadata Generation

```python
def generate_metadata(chapters: list, changes: list, procedure_name: str = "") -> str:
    """Generate YouTube metadata markdown."""

    chapter_text = '\n'.join([f"{c['time']} {c['type']}" for c in chapters])

    changes_table = "| 위치 | 원본 | 수정 |\n|------|------|------|\n"
    for c in changes[:10]:  # Top 10 changes
        changes_table += f"| - | {c['original']} | {c['corrected']} |\n"

    return f"""# YouTube 영상 정보

## 추천 제목
{procedure_name} | 제이미성형외과 정기호 원장

## 챕터 (Chapters)
{chapter_text}

## 영상 설명 (Description)

⏱️ 타임스탬프
{chapter_text}

🏥 제이미성형외과
📍 서울시 강남구 압구정로 136 EHL빌딩 3층
📞 02-542-2399
🌐 https://jamie.clinic

#제이미성형외과 #{procedure_name.replace(' ', '')} #압구정성형외과

## 오타 수정 내역
{changes_table}
"""
```

## Typo Dictionary (typo_dictionary.json)

Main correction patterns:

| Category | Typo | Correction |
|----------|------|------------|
| Brand | 데이미, 재이미 | 제이미 |
| Brand | 성액과, 성현외과 | 성형외과 |
| Procedure | 쌍거풀, 쌍커풀 | 쌍꺼풀 |
| Procedure | 매물법, 메몰법 | 매몰법 |
| Procedure | 이마거상 | 이마거상술 |
| Medical | 절계, 절게 | 절개 |
| Medical | 수면 마취 | 수면마취 |
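
`correct_text` iterates over a flat `wrong -> right` mapping, so each typo in the table needs its own entry. The actual layout of `typo_dictionary.json` is not shown in this document; the sketch below assumes a grouped form (the `GROUPED` data and `flatten_typo_groups` helper are hypothetical) and flattens it into the mapping that `correct_text` expects.

```python
# Assumed grouped layout: correct form -> list of known typos.
# The real typo_dictionary.json may be organized differently.
GROUPED = {
    "제이미": ["데이미", "재이미"],
    "성형외과": ["성액과", "성현외과"],
    "쌍꺼풀": ["쌍거풀", "쌍커풀"],
    "매몰법": ["매물법", "메몰법"],
    "절개": ["절계", "절게"],
    "수면마취": ["수면 마취"],
}

def flatten_typo_groups(grouped: dict[str, list[str]]) -> dict[str, str]:
    """Turn {right: [wrong, ...]} into the flat {wrong: right} mapping."""
    flat = {}
    for right, wrongs in grouped.items():
        for wrong in wrongs:
            flat[wrong] = right
    return flat

typo_dict = flatten_typo_groups(GROUPED)
print(typo_dict["재이미"])  # 제이미
```

The 이마거상 row is left out of this sketch on purpose: the typo is a prefix of its correction, so plain substring replacement needs extra care to avoid touching text that is already correct.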

## Official Procedure Names

### Eye Surgery
- 퀵 매몰법
- 하이브리드 쌍꺼풀
- 안검하수 눈매교정술
- 눈밑지방 재배치
- 듀얼 트임 수술

### Forehead Surgery
- 내시경 이마 거상술
- 내시경 눈썹 거상술

### Anti-Aging Surgery
- 스마스 리프팅
- 자가 지방이식
- 하이푸 리프팅

## Output Files

| File | Description |
|------|-------------|
| `{filename}_corrected.sbv` | Corrected SBV subtitles |
| `youtube_video_info.md` | YouTube metadata |
| `correction_report.md` | Typo correction log |
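
The three output names above can be derived from the input file name. The following is a minimal sketch using `pathlib`; the `output_paths` helper is hypothetical, and `scripts/process_subtitle.py` may build its paths differently.

```python
from pathlib import Path

def output_paths(input_path: str, output_dir: str) -> dict[str, Path]:
    """Derive the three output file paths for a given input SBV file."""
    out = Path(output_dir)
    stem = Path(input_path).stem  # "subtitle.sbv" -> "subtitle"
    return {
        "sbv": out / f"{stem}_corrected.sbv",
        "metadata": out / "youtube_video_info.md",
        "report": out / "correction_report.md",
    }

paths = output_paths("subtitle.sbv", "./output")
print(paths["sbv"])
```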

## Quality Checklist

- [ ] Brand name "제이미성형외과" is written exactly
- [ ] Procedure names use the official names consistently
- [ ] Chapters include a 0:00 intro
- [ ] 3-5 hashtags are included
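
Three of the checklist items can be checked mechanically. The sketch below is one possible approach; `check_metadata` is a hypothetical helper that takes the chapter times and the description text as inputs. It does not cover the procedure-name item, which needs the official name list.

```python
import re

BRAND = "제이미성형외과"

def check_metadata(chapter_times: list[str], description: str) -> list[str]:
    """Return a list of checklist problems; an empty list means all checks passed."""
    problems = []
    if BRAND not in description:
        problems.append("brand name missing")
    if not chapter_times or chapter_times[0] != "0:00":
        problems.append("first chapter is not 0:00")
    # Count tokens like "#tag"; markdown headings ("## Title") are not matched
    hashtags = re.findall(r'(?<!\S)#[^\s#]+', description)
    if not 3 <= len(hashtags) <= 5:
        problems.append(f"expected 3-5 hashtags, found {len(hashtags)}")
    return problems

print(check_metadata(["0:05", "1:20"], "#제이미성형외과 #쌍꺼풀"))
```

With the SBV sample shown earlier, the first subtitle starts at 0:00:05.120, so the intro chapter would come out as `0:05` and fail the 0:00 check unless the first chapter time is pinned to `0:00`.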

## Reference Files

- `references/typo_dictionary.json` - Typo correction dictionary
- `references/chapter_patterns.md` - Chapter extraction patterns
- `references/youtube_metadata.md` - Metadata template