# Content Distiller

Analyzes raw crawled content and distills it into concise reference material. Extracts key concepts and code snippets, then assembles them into structured summaries.

## Trigger Keywords
"distill content", "summarize document", "extract key concepts", "process raw content", "create reference summary"

## Goals

1. **Compress** - Reduce token count while preserving essential information
2. **Structure** - Organize content for easy retrieval
3. **Extract** - Pull out code snippets, key concepts, patterns
4. **Annotate** - Add metadata for searchability

## Workflow

### Step 1: Load Raw Content
```bash
python scripts/load_pending.py --output pending_docs.json
```

### Step 2: Analyze Content Structure
Identify document characteristics:
- Has code blocks?
- Has headers?
- Has tables?
- Estimated tokens?
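
The checks above can be sketched in a few lines of Python. This is only an illustration, not the skill's actual analyzer: the function name `analyze_structure`, the returned keys, and the rough 4-characters-per-token estimate are all assumptions, and the input is assumed to be Markdown.

```python
import re

# Opening or closing fence made of backticks or tildes.
FENCE_RE = re.compile(r"^\s*(`{3,}|~{3,})")
# ATX-style header: one to six '#' followed by text.
HEADER_RE = re.compile(r"^#{1,6}\s+\S")
# Table delimiter row such as |---|---| (requires at least one pipe).
TABLE_RE = re.compile(r"^\s*\|?(\s*:?-{3,}:?\s*\|)+(\s*:?-{3,}:?\s*)?$")


def analyze_structure(text):
    """Report coarse structural traits of a Markdown document."""
    in_fence = False
    code_blocks = 0
    has_headers = False
    has_tables = False
    for line in text.splitlines():
        if FENCE_RE.match(line):
            if not in_fence:
                code_blocks += 1
            in_fence = not in_fence
            continue
        if in_fence:
            continue  # ignore '#' comments and pipes inside code
        if HEADER_RE.match(line):
            has_headers = True
        elif TABLE_RE.match(line):
            has_tables = True
    return {
        "has_code_blocks": code_blocks > 0,
        "has_headers": has_headers,
        "has_tables": has_tables,
        # Heuristic assumption: roughly 4 characters per token.
        "estimated_tokens": len(text) // 4,
    }
```

Tracking the fence state keeps a `# comment` inside a code block from being counted as a header. A real tokenizer would give a better token estimate than the character heuristic.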

### Step 3: Extract Key Components
```bash
python scripts/extract_components.py --doc-id 123 --output components.json
```

Extracts:
- Code snippets with language tags
- Key concepts and definitions
- Best practices
- Techniques and patterns
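
As a rough illustration of the first item, the sketch below pulls fenced code blocks out of Markdown along with their language tags. It is not the logic of `scripts/extract_components.py`, whose internals are not shown here; the function name `extract_snippets`, the output keys, and the `"text"` fallback for untagged blocks are assumptions.

```python
import re

# Fence marker (backticks or tildes) plus an optional language tag.
FENCE_RE = re.compile(r"^\s*(`{3,}|~{3,})\s*([\w+-]*)")


def extract_snippets(text):
    """Return fenced code blocks as dicts with 'language' and 'code'."""
    snippets = []
    fence = None  # opening fence marker while inside a block
    lang = ""
    body = []
    for line in text.splitlines():
        m = FENCE_RE.match(line)
        if fence is None:
            if m:
                # Assumption: untagged blocks are labelled "text".
                fence, lang, body = m.group(1), m.group(2) or "text", []
        elif (
            m
            and line.strip() == m.group(1)
            and m.group(1)[0] == fence[0]
            and len(m.group(1)) >= len(fence)
        ):
            # Closing fence: same character, at least as long, nothing else.
            snippets.append({"language": lang, "code": "\n".join(body)})
            fence = None
        else:
            body.append(line)
    return snippets
```

An unterminated block at the end of the input is dropped rather than guessed at, which seems the safer default for crawled content.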

### Step 4: Create Structured Summary
Output template:
```markdown
# {title}

**Source:** {url}
**Type:** {source_type} | **Tier:** {credibility_tier}
**Distilled:** {date}

## Executive Summary
{2-3 sentence overview}

## Key Concepts
{bulleted list with definitions}

## Techniques & Patterns
{extracted techniques with use cases}

## Code Examples
{relevant code snippets}

## Best Practices
{actionable recommendations}
```
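
One way to fill this template programmatically is sketched below. The helper names `bullets` and `render_summary` and the shape of the `meta` dict are hypothetical; the field names mirror the placeholders in the template, and the values for `source_type` and `credibility_tier` are whatever the upstream record provides.

```python
from datetime import date


def bullets(items):
    """Format an iterable of strings as a Markdown bullet list."""
    return "\n".join(f"- {item}" for item in items)


def render_summary(meta, summary, concepts, techniques, code_examples, practices):
    """Fill the structured-summary template with distilled components."""
    # Fall back to today's date when the record carries none.
    distilled = meta.get("date") or date.today().isoformat()
    parts = [
        f"# {meta['title']}",
        "",
        f"**Source:** {meta['url']}",
        f"**Type:** {meta['source_type']} | **Tier:** {meta['credibility_tier']}",
        f"**Distilled:** {distilled}",
        "",
        "## Executive Summary",
        summary,
        "",
        "## Key Concepts",
        bullets(concepts),
        "",
        "## Techniques & Patterns",
        bullets(techniques),
        "",
        "## Code Examples",
        # Each entry is expected to be an already formatted code block.
        "\n\n".join(code_examples),
        "",
        "## Best Practices",
        bullets(practices),
    ]
    return "\n".join(parts) + "\n"
```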

### Step 5: Optimize for Tokens
Target: 25-35% of the original token count. The `--target-ratio 0.30` below aims for the midpoint of that range.
```bash
python scripts/optimize_content.py --doc-id 123 --target-ratio 0.30
```
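
To make the target concrete: a 4,000-token source should distill to between 1,000 and 1,400 tokens, and a ratio of 0.30 gives a budget of 1,200. The arithmetic can be sketched as follows; the function names are illustrative and not part of `scripts/optimize_content.py`.

```python
def token_budget(original_tokens, target_ratio=0.30):
    """Token budget for the distilled document at the given ratio."""
    return round(original_tokens * target_ratio)


def compression_ratio(original_tokens, distilled_tokens):
    """Distilled size as a fraction of the original size."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return distilled_tokens / original_tokens


def within_target(ratio, low=0.25, high=0.35):
    """True when the ratio falls inside the 25-35% target band."""
    return low <= ratio <= high


budget = token_budget(4000)              # 1200
ratio = compression_ratio(4000, budget)  # 0.3
on_target = within_target(ratio)         # True
```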

### Step 6: Store Distilled Content
```bash
python scripts/store_distilled.py --doc-id 123 --content distilled.md
```

## Quality Metrics

| Metric | Target |
|--------|--------|
| Compression Ratio | 25-35% of original |
| Key Concept Coverage | ≥90% of important terms |
| Code Snippet Retention | 100% of relevant examples |
| Readability | Clear, scannable structure |
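
The coverage metric could be checked with something like the sketch below. It assumes the list of important terms is already available (for example from the extraction step) and uses plain case-insensitive substring matching, which is a simplification; `concept_coverage` and `meets_coverage_target` are hypothetical names.

```python
def concept_coverage(important_terms, distilled_text):
    """Return (coverage fraction, missing terms) for a distilled document."""
    text = distilled_text.lower()
    terms = [t.strip().lower() for t in important_terms if t.strip()]
    if not terms:
        return 1.0, []  # nothing to cover
    missing = [t for t in terms if t not in text]
    return (len(terms) - len(missing)) / len(terms), missing


def meets_coverage_target(coverage, threshold=0.90):
    """True when coverage reaches the 90% target from the table above."""
    return coverage >= threshold
```

Returning the missing terms alongside the score makes it easier to see which concepts to restore when coverage falls short.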

## Handling Refactor Requests

When `quality-reviewer` returns a `refactor` decision, rerun distillation with its instructions:
```bash
python scripts/refactor_content.py --distill-id 456 --instructions "Add more examples"
```
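
If the review result arrives as structured data, the command above could be assembled as in this sketch. The shape of the `decision` dict (`status`, `distill_id`, `instructions`) is an assumption for illustration; only the script path and the two flags come from the command shown above.

```python
def build_refactor_command(decision):
    """Build the refactor command for a review decision, or None if not needed."""
    # Hypothetical decision shape: {"status", "distill_id", "instructions"}.
    if decision.get("status") != "refactor":
        return None
    return [
        "python",
        "scripts/refactor_content.py",
        "--distill-id",
        str(decision["distill_id"]),
        "--instructions",
        decision["instructions"],
    ]
```

Returning an argument list rather than a shell string means the instructions text can contain quotes or spaces without extra escaping when it is passed to `subprocess.run`.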

## Scripts

- `scripts/load_pending.py` - Load documents pending distillation
- `scripts/extract_components.py` - Extract code, concepts, patterns
- `scripts/optimize_content.py` - Token optimization
- `scripts/store_distilled.py` - Save to database
- `scripts/refactor_content.py` - Handle refactor requests

## Integration

| From | To | Payload |
|------|-----|---------|
| content-repository | content-distiller | Raw document records |
| content-distiller | quality-reviewer | Distilled content |
| quality-reviewer | content-distiller | Refactor instructions (loop back) |