--- description: Analyze and summarize stored documents. Extracts key concepts, code snippets, and creates structured content. argument-hint: [--focus keywords] [--max-tokens 2000] allowed-tools: Read, Write, Bash, Glob, Grep --- # Content Distiller Analyze, summarize, and extract key information from stored documents. ## Arguments - ``: Specific document ID or process all pending - `--focus`: Keywords to emphasize in distillation - `--max-tokens`: Target token count for distilled output (default: 2000) ## Distillation Process ### 1. Load Raw Content ```bash source ~/.envrc # Get document path mysql -u $MYSQL_USER -p"$MYSQL_PASSWORD" reference_library -N -e \ "SELECT raw_content_path FROM documents WHERE doc_id = $DOC_ID" ``` ### 2. Analyze Content Extract: - **Summary**: 2-3 sentence executive summary - **Key Concepts**: Important terms with definitions - **Code Snippets**: Relevant code examples - **Structured Content**: Full distilled markdown ### 3. Output Format ```json { "summary": "Executive summary of the document...", "key_concepts": [ {"term": "System Prompt", "definition": "..."}, {"term": "Context Window", "definition": "..."} ], "code_snippets": [ {"language": "python", "description": "...", "code": "..."} ], "structured_content": "# Title\n\n## Overview\n..." } ``` ### 4. Store Distilled Content ```sql INSERT INTO distilled_content (doc_id, summary, key_concepts, code_snippets, structured_content, token_count_original, token_count_distilled, distill_model, review_status) VALUES (?, ?, ?, ?, ?, ?, ?, 'claude-opus-4', 'pending'); ``` ### 5. Calculate Metrics - `token_count_original`: Tokens in raw content - `token_count_distilled`: Tokens in output - `compression_ratio`: Auto-calculated (distilled/original * 100) ## Distillation Guidelines **For Prompt Engineering Content:** - Emphasize techniques and patterns - Include before/after examples - Extract actionable best practices - Note model-specific behaviors **For API Documentation:** - Focus on endpoint signatures - Include request/response examples - Note rate limits and constraints - Extract error handling patterns **For Code Repositories:** - Summarize architecture - Extract key functions/classes - Note dependencies - Include usage examples ## Example Usage ``` /content-distiller 42 /content-distiller all-pending --focus "system prompts" /content-distiller 15 --max-tokens 3000 ```