our-claude-skills/custom-skills/90-reference-curator/commands/content-distiller.md

---
description: Analyze and summarize stored documents. Extract key concepts and code snippets, and create structured content.
argument-hint: <doc-id|all-pending> [--focus keywords] [--max-tokens 2000]
allowed-tools: Read, Write, Bash, Glob, Grep
---

# Content Distiller

Analyze, summarize, and extract key information from stored documents.

## Arguments
- `<doc-id|all-pending>`: A specific document ID, or `all-pending` to process every pending document
- `--focus`: Keywords to emphasize in distillation
- `--max-tokens`: Target token count for distilled output (default: 2000)
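
The argument handling above could be sketched in bash as follows. This is a minimal sketch, not the command's actual implementation: the function name `parse_distiller_args` and the variables `TARGET`, `FOCUS`, and `MAX_TOKENS` are hypothetical, and it assumes document IDs are plain integers, as in the usage examples.

```bash
# Hypothetical helper: parses "<doc-id|all-pending> [--focus keywords] [--max-tokens N]".
# Results land in the (assumed) variables TARGET, FOCUS, and MAX_TOKENS.
parse_distiller_args() {
  TARGET=""
  FOCUS=""
  MAX_TOKENS=2000   # documented default
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --focus)
        [ "$#" -ge 2 ] || { echo "--focus needs a value" >&2; return 1; }
        FOCUS="$2"; shift 2 ;;
      --max-tokens)
        [ "$#" -ge 2 ] || { echo "--max-tokens needs a value" >&2; return 1; }
        MAX_TOKENS="$2"; shift 2 ;;
      --*)
        echo "unknown option: $1" >&2; return 1 ;;
      *)
        if [ -n "$TARGET" ]; then echo "unexpected argument: $1" >&2; return 1; fi
        TARGET="$1"; shift ;;
    esac
  done
  # Assumption: the target is either a numeric doc ID or the literal "all-pending".
  case "$TARGET" in
    all-pending) ;;
    ''|*[!0-9]*) echo "invalid target: '$TARGET'" >&2; return 1 ;;
  esac
  # The token budget must be a non-negative integer.
  case "$MAX_TOKENS" in
    ''|*[!0-9]*) echo "invalid --max-tokens: '$MAX_TOKENS'" >&2; return 1 ;;
  esac
}
```

Rejecting non-numeric IDs at this point also keeps unexpected input out of any SQL statement built from the ID later on.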

## Distillation Process

### 1. Load Raw Content
```bash
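source ~/.envrc
# DOC_ID is interpolated into the SQL below, so reject anything that is not an integer.
[[ "$DOC_ID" =~ ^[0-9]+$ ]] || { echo "DOC_ID must be numeric" >&2; exit 1; }
# Look up where the raw content for this document is stored.
mysql -u "$MYSQL_USER" -p"$MYSQL_PASSWORD" reference_library -N -e \
  "SELECT raw_content_path FROM documents WHERE doc_id = ${DOC_ID}"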
```
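
Once the query returns a path, the raw content still has to be read from disk. A minimal sketch of that step, assuming the query output has been captured in a variable and passed in as the first argument; `read_raw_content` is a hypothetical helper name, not part of the command:

```bash
# Hypothetical helper: print the raw content stored at the path returned by the query.
read_raw_content() {
  local path="$1"
  # An empty result means no row matched the document ID.
  if [ -z "$path" ]; then
    echo "no raw_content_path returned for this document" >&2
    return 1
  fi
  # Fail early if the file is missing or unreadable.
  if [ ! -r "$path" ]; then
    echo "raw content not readable: $path" >&2
    return 1
  fi
  cat -- "$path"
}
```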

### 2. Analyze Content

Extract:
- **Summary**: 2-3 sentence executive summary
- **Key Concepts**: Important terms with definitions
- **Code Snippets**: Relevant code examples
- **Structured Content**: Full distilled markdown

### 3. Output Format

```json
{
  "summary": "Executive summary of the document...",
  "key_concepts": [
    {"term": "System Prompt", "definition": "..."},
    {"term": "Context Window", "definition": "..."}
  ],
  "code_snippets": [
    {"language": "python", "description": "...", "code": "..."}
  ],
  "structured_content": "# Title\n\n## Overview\n..."
}
```
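
Before storing the result, it may help to confirm that the output really has this shape. The sketch below assumes `jq` is installed and that the JSON has been written to a file; `validate_distilled_json` is a hypothetical name, and the check covers only the four top-level fields shown above.

```bash
# Hypothetical check: succeed only if the file holds JSON with the four documented fields.
validate_distilled_json() {
  # jq -e exits non-zero when the expression evaluates to false or null,
  # and also when the input is not valid JSON.
  jq -e '
    (.summary | type == "string") and
    (.key_concepts | type == "array") and
    (.code_snippets | type == "array") and
    (.structured_content | type == "string")
  ' "$1" > /dev/null
}
```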

### 4. Store Distilled Content

```sql
INSERT INTO distilled_content
  (doc_id, summary, key_concepts, code_snippets, structured_content,
   token_count_original, token_count_distilled, distill_model, review_status)
VALUES
  (?, ?, ?, ?, ?, ?, ?, 'claude-opus-4', 'pending');
```

### 5. Calculate Metrics

- `token_count_original`: Number of tokens in the raw content
- `token_count_distilled`: Number of tokens in the distilled output
- `compression_ratio`: Auto-calculated as a percentage (`token_count_distilled / token_count_original * 100`)
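
As a worked illustration of these metrics, the sketch below estimates token counts and computes the ratio by hand. Both function names are hypothetical, and the word-count estimate is only a rough stand-in for a real tokenizer, which the source does not specify.

```bash
# Rough token estimate: whitespace-separated word count.
# Assumption: this approximates, but does not match, a real tokenizer.
estimate_tokens() {
  wc -w < "$1" | tr -d '[:space:]'
}

# Compression ratio as a percentage: distilled / original * 100.
compression_ratio() {
  local distilled="$1" original="$2"
  # Guard against division by zero and non-numeric input.
  if ! [ "$original" -gt 0 ] 2>/dev/null; then
    echo "original token count must be a positive integer" >&2
    return 1
  fi
  awk -v d="$distilled" -v o="$original" 'BEGIN { printf "%.2f\n", d / o * 100 }'
}

compression_ratio 500 2000   # prints 25.00
```

A 2000-token document distilled to 500 tokens therefore has a compression ratio of 25%, so lower values mean stronger compression.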

## Distillation Guidelines

**For Prompt Engineering Content:**
- Emphasize techniques and patterns
- Include before/after examples
- Extract actionable best practices
- Note model-specific behaviors

**For API Documentation:**
- Focus on endpoint signatures
- Include request/response examples
- Note rate limits and constraints
- Extract error handling patterns

**For Code Repositories:**
- Summarize architecture
- Extract key functions/classes
- Note dependencies
- Include usage examples

## Example Usage

```
/content-distiller 42
/content-distiller all-pending --focus "system prompts"
/content-distiller 15 --max-tokens 3000
```