Reference Curator - Complete Skill Set
This document contains all 7 skills for curating, processing, and exporting reference documentation.
Pipeline Orchestrator (Recommended Entry Point)
Coordinates the full 6-skill workflow with automated QA loop handling.
Quick Start
```
# Full pipeline from topic
curate references on "Claude Code best practices"

# From URLs (skip discovery)
curate these URLs: https://docs.anthropic.com/en/docs/prompt-caching

# With auto-approve
curate references on "MCP servers" with auto-approve and fine-tuning output
```
Configuration Options
| Option | Default | Description |
|---|---|---|
| max_sources | 10 | Maximum sources to discover |
| max_pages | 50 | Maximum pages per source |
| auto_approve | false | Auto-approve above threshold |
| threshold | 0.85 | Approval threshold |
| max_iterations | 3 | Max QA loop iterations |
| export_format | project_files | Output format |
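These defaults might be expressed in `pipeline_config.yaml` along the following lines (a sketch; the nesting under a `pipeline:` key is an assumption, only the option names and defaults come from the table above):

```yaml
pipeline:
  max_sources: 10          # Maximum sources to discover
  max_pages: 50            # Maximum pages per source
  auto_approve: false      # Auto-approve above threshold
  threshold: 0.85          # Approval threshold
  max_iterations: 3        # Max QA loop iterations
  export_format: project_files
```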
Pipeline Flow
```
[Input: Topic | URLs | Manifest]
        │
        ▼
1. reference-discovery (skip if URLs/manifest)
        │
        ▼
2. web-crawler
        │
        ▼
3. content-repository
        │
        ▼
4. content-distiller ◄─────────────┐
        │                          │
        ▼                          │
5. quality-reviewer                │
        │                          │
        ├── APPROVE → export       │
        ├── REFACTOR (max 3) ──────┤
        ├── DEEP_RESEARCH (max 2) → crawler
        └── REJECT → archive
        │
        ▼
6. markdown-exporter
```
QA Loop Handling
| Decision | Action | Max Iterations |
|---|---|---|
| APPROVE | Proceed to export | - |
| REFACTOR | Re-distill with feedback | 3 |
| DEEP_RESEARCH | Crawl more sources | 2 |
| REJECT | Archive with reason | - |
Documents exceeding iteration limits are marked needs_manual_review.
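The routing rules above can be sketched as a loop; the `distill`, `review`, `export`, `crawl_more`, and `archive` callables are hypothetical stand-ins for the corresponding skills:

```python
MAX_REFACTORS = 3
MAX_DEEP_RESEARCH = 2

def run_qa_loop(doc, distill, review, export, crawl_more, archive):
    refactors = 0
    research_rounds = 0
    distilled = distill(doc, feedback=None)
    while True:
        decision, feedback = review(distilled)
        if decision == "approve":
            return export(distilled)
        if decision == "refactor" and refactors < MAX_REFACTORS:
            refactors += 1
            distilled = distill(doc, feedback=feedback)   # re-distill with feedback
        elif decision == "deep_research" and research_rounds < MAX_DEEP_RESEARCH:
            research_rounds += 1
            doc = crawl_more(doc, feedback)               # crawl additional sources
            distilled = distill(doc, feedback=None)
        elif decision == "reject":
            return archive(doc, feedback)
        else:
            # Iteration limit exceeded for refactor/deep_research
            return "needs_manual_review"
```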
Output Summary
```
Pipeline Complete:
- Sources discovered: 5
- Pages crawled: 45
- Approved: 40
- Needs manual review: 2
- Exports: ~/reference-library/exports/
```
1. Reference Discovery
Searches for authoritative sources, validates credibility, and produces curated URL lists for crawling.
Source Priority Hierarchy
| Tier | Source Type | Examples |
|---|---|---|
| Tier 1 | Official documentation | docs.anthropic.com, docs.claude.com, platform.openai.com/docs |
| Tier 1 | Engineering blogs (official) | anthropic.com/news, openai.com/blog |
| Tier 1 | Official GitHub repos | github.com/anthropics/, github.com/openai/ |
| Tier 2 | Research papers | arxiv.org, papers with citations |
| Tier 2 | Verified community guides | Cookbook examples, official tutorials |
| Tier 3 | Community content | Blog posts, tutorials, Stack Overflow |
Discovery Workflow
Step 1: Define Search Scope
```python
search_config = {
    "topic": "prompt engineering",
    "vendors": ["anthropic", "openai", "google"],
    "source_types": ["official_docs", "engineering_blog", "github_repo"],
    "freshness": "past_year",
    "max_results_per_query": 20
}
```
Step 2: Generate Search Queries
```python
def generate_queries(topic, vendors):
    queries = []
    for vendor in vendors:
        queries.append(f"site:docs.{vendor}.com {topic}")
        queries.append(f"site:{vendor}.com/docs {topic}")
        queries.append(f"site:{vendor}.com/blog {topic}")
        queries.append(f"site:github.com/{vendor} {topic}")
    # Vendor-independent query, added once outside the loop
    queries.append(f"site:arxiv.org {topic}")
    return queries
```
Step 3: Validate and Score Sources
```python
def score_source(url, title):
    score = 0.0
    if any(d in url for d in ['docs.anthropic.com', 'docs.claude.com', 'docs.openai.com']):
        score += 0.40  # Tier 1 official docs
    elif any(d in url for d in ['anthropic.com', 'openai.com', 'google.dev']):
        score += 0.30  # Tier 1 official blog/news
    elif 'github.com' in url and any(v in url for v in ['anthropics', 'openai', 'google']):
        score += 0.30  # Tier 1 official repos
    elif 'arxiv.org' in url:
        score += 0.20  # Tier 2 research
    else:
        score += 0.10  # Tier 3 community
    return min(score, 1.0)

def assign_credibility_tier(score):
    if score >= 0.60:
        return 'tier1_official'
    elif score >= 0.40:
        return 'tier2_verified'
    else:
        return 'tier3_community'
```
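A hypothetical driver tying steps 1-3 together might score each candidate, drop low-credibility hits, tag tiers, and sort best-first; `scorer` and `tier_fn` are meant to be `score_source` and `assign_credibility_tier` from above, passed in so the sketch stays self-contained:

```python
def build_manifest(candidates, scorer, tier_fn, min_score=0.20):
    entries = []
    for url, title in candidates:
        score = scorer(url, title)
        if score < min_score:
            continue  # skip low-credibility noise below the cutoff
        entries.append({
            "url": url,
            "title": title,
            "credibility_score": score,
            "credibility_tier": tier_fn(score),
        })
    # Highest-credibility sources first
    return sorted(entries, key=lambda e: e["credibility_score"], reverse=True)
```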
Output Format
```json
{
  "discovery_date": "2025-01-28T10:30:00",
  "topic": "prompt engineering",
  "total_urls": 15,
  "urls": [
    {
      "url": "https://docs.anthropic.com/en/docs/prompt-engineering",
      "title": "Prompt Engineering Guide",
      "credibility_tier": "tier1_official",
      "credibility_score": 0.85,
      "source_type": "official_docs",
      "vendor": "anthropic"
    }
  ]
}
```
2. Web Crawler Orchestrator
Manages crawling operations using Firecrawl MCP with rate limiting and format handling.
Crawl Configuration
```yaml
firecrawl:
  rate_limit:
    requests_per_minute: 20
    concurrent_requests: 3
  default_options:
    timeout: 30000
    only_main_content: true
```
Crawl Workflow
Determine Crawl Strategy
```python
def select_strategy(url):
    if url.endswith('.pdf'):
        return 'pdf_extract'
    elif 'github.com' in url and '/blob/' in url:
        return 'raw_content'
    else:
        return 'scrape'  # default for docs sites and everything else
```
Execute Firecrawl
```python
# Single page scrape
firecrawl_scrape(
    url="https://docs.anthropic.com/en/docs/prompt-engineering",
    formats=["markdown"],
    only_main_content=True,
    timeout=30000
)

# Multi-page crawl
firecrawl_crawl(
    url="https://docs.anthropic.com/en/docs/",
    max_depth=2,
    limit=50,
    formats=["markdown"]
)
```
Rate Limiting
```python
import time
from collections import deque

class RateLimiter:
    def __init__(self, requests_per_minute=20):
        self.rpm = requests_per_minute
        self.request_times = deque()

    def wait_if_needed(self):
        now = time.time()
        # Drop timestamps older than the 60-second window
        while self.request_times and now - self.request_times[0] > 60:
            self.request_times.popleft()
        if len(self.request_times) >= self.rpm:
            wait_time = 60 - (now - self.request_times[0])
            if wait_time > 0:
                time.sleep(wait_time)
        self.request_times.append(time.time())
```
Error Handling
| Error | Action |
|---|---|
| Timeout | Retry once with 2x timeout |
| Rate limit (429) | Exponential backoff, max 3 retries |
| Not found (404) | Log and skip |
| Access denied (403) | Log, mark as failed |
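The retry policy in the table can be sketched as follows; `fetch` and its `(status, body)` return shape are hypothetical stand-ins for the actual crawl call, and the `"timeout"` status string is an assumption:

```python
import time

def fetch_with_retries(fetch, url, timeout=30.0, sleep=time.sleep):
    delay = 1.0
    for attempt in range(4):  # initial try + up to 3 retries
        status, body = fetch(url, timeout=timeout)
        if status == 200:
            return body
        if status == "timeout" and attempt == 0:
            timeout *= 2          # retry once with 2x timeout
        elif status == 429 and attempt < 3:
            sleep(delay)          # exponential backoff on rate limit
            delay *= 2
        elif status in (404, 403):
            return None           # log and skip / mark as failed
        else:
            return None
    return None
```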
3. Content Repository
Manages MySQL storage for the reference library. Handles document storage, version control, deduplication, and retrieval.
Core Operations
Store New Document:
```python
def store_document(cursor, source_id, title, url, doc_type, raw_content_path):
    sql = """
        INSERT INTO documents (source_id, title, url, doc_type, crawl_date, crawl_status, raw_content_path)
        VALUES (%s, %s, %s, %s, NOW(), 'completed', %s)
        ON DUPLICATE KEY UPDATE
            version = version + 1,
            crawl_date = NOW(),
            raw_content_path = VALUES(raw_content_path)
    """
    cursor.execute(sql, (source_id, title, url, doc_type, raw_content_path))
    return cursor.lastrowid
```
Check Duplicate:
```python
def is_duplicate(cursor, url):
    cursor.execute("SELECT doc_id FROM documents WHERE url_hash = SHA2(%s, 256)", (url,))
    return cursor.fetchone() is not None
```
Table Quick Reference
| Table | Purpose | Key Fields |
|---|---|---|
| sources | Authorized content sources | source_type, credibility_tier, vendor |
| documents | Crawled document metadata | url_hash (dedup), version, crawl_status |
| distilled_content | Processed summaries | review_status, compression_ratio |
| review_logs | QA decisions | quality_score, decision |
| topics | Taxonomy | topic_slug, parent_topic_id |
Status Values
- crawl_status: pending → completed | failed | stale
- review_status: pending → in_review → approved | needs_refactor | rejected
- decision: approve | refactor | deep_research | reject
4. Content Distiller
Transforms raw crawled content into structured, high-quality reference materials.
Distillation Goals
- Compress - Reduce token count while preserving essential information
- Structure - Organize content for easy retrieval and reference
- Extract - Pull out code snippets, key concepts, and actionable patterns
- Annotate - Add metadata for searchability and categorization
Extract Key Components
Extract Code Snippets:
````python
import re

def extract_code_snippets(content):
    pattern = r'```(\w*)\n([\s\S]*?)```'
    snippets = []
    for match in re.finditer(pattern, content):
        snippets.append({
            "language": match.group(1) or "text",
            "code": match.group(2).strip(),
            "context": get_surrounding_text(content, match.start(), 200)
        })
    return snippets
````
Extract Key Concepts:
```python
def extract_key_concepts(content, title):
    prompt = f"""
    Analyze this document and extract key concepts:

    Title: {title}
    Content: {content[:8000]}

    Return JSON with:
    - concepts: [{{"term": "...", "definition": "...", "importance": "high|medium|low"}}]
    - techniques: [{{"name": "...", "description": "...", "use_case": "..."}}]
    - best_practices: ["..."]
    """
    return claude_extract(prompt)
```
Summary Template
```markdown
# {title}

**Source:** {url}
**Type:** {source_type} | **Tier:** {credibility_tier}

## Executive Summary
{2-3 sentence overview}

## Key Concepts
{bulleted list of core concepts}

## Techniques & Patterns
{extracted techniques with use cases}

## Code Examples
{relevant code snippets}

## Best Practices
{actionable recommendations}
```
Quality Metrics
| Metric | Target |
|---|---|
| Compression Ratio | 25-35% of original |
| Key Concept Coverage | ≥90% of important terms |
| Code Snippet Retention | 100% of relevant examples |
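A minimal check of the compression target, approximating token counts by whitespace-split word counts (an assumption, not a real tokenizer):

```python
def compression_ratio(original, distilled):
    # Distilled length as a fraction of the original
    return len(distilled.split()) / max(len(original.split()), 1)

def meets_compression_target(original, distilled, low=0.25, high=0.35):
    return low <= compression_ratio(original, distilled) <= high
```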
5. Quality Reviewer
Evaluates distilled content, routes decisions, and triggers refactoring or additional research.
Review Workflow
```
[Distilled Content]
        │
        ▼
┌─────────────────┐
│ Score Criteria  │ → accuracy, completeness, clarity, PE quality, usability
└─────────────────┘
        │
        ├── ≥ 0.85    → APPROVE       → markdown-exporter
        ├── 0.60-0.84 → REFACTOR      → content-distiller (with instructions)
        ├── 0.40-0.59 → DEEP_RESEARCH → web-crawler (with queries)
        └── < 0.40    → REJECT        → archive with reason
```
Scoring Criteria
| Criterion | Weight | Checks |
|---|---|---|
| Accuracy | 0.25 | Factual correctness, up-to-date info, proper attribution |
| Completeness | 0.20 | Covers key concepts, includes examples, addresses edge cases |
| Clarity | 0.20 | Clear structure, concise language, logical flow |
| PE Quality | 0.25 | Demonstrates techniques, before/after examples, explains why |
| Usability | 0.10 | Easy to reference, searchable keywords, appropriate length |
Calculate Final Score
```python
WEIGHTS = {
    "accuracy": 0.25,
    "completeness": 0.20,
    "clarity": 0.20,
    "prompt_engineering_quality": 0.25,
    "usability": 0.10
}

def calculate_quality_score(assessment):
    return sum(
        assessment[criterion]["score"] * weight
        for criterion, weight in WEIGHTS.items()
    )
```
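To illustrate the weighting, here is a worked example with hypothetical per-criterion scores:

```python
weights = {"accuracy": 0.25, "completeness": 0.20, "clarity": 0.20,
           "prompt_engineering_quality": 0.25, "usability": 0.10}
scores = {"accuracy": 0.90, "completeness": 0.80, "clarity": 0.85,
          "prompt_engineering_quality": 0.90, "usability": 0.70}
quality = sum(scores[c] * w for c, w in weights.items())
# 0.225 + 0.16 + 0.17 + 0.225 + 0.07 = 0.85 → APPROVE (>= 0.85)
```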
Route Decision
```python
def determine_decision(score, assessment):
    if score >= 0.85:
        return "approve", None, None
    elif score >= 0.60:
        instructions = generate_refactor_instructions(assessment)
        return "refactor", instructions, None
    elif score >= 0.40:
        queries = generate_research_queries(assessment)
        return "deep_research", None, queries
    else:
        return "reject", f"Quality score {score:.2f} below minimum", None
```
Prompt Engineering Quality Checklist
- Demonstrates specific techniques (CoT, few-shot, etc.)
- Shows before/after examples
- Explains why techniques work, not just what
- Provides actionable patterns
- Includes edge cases and failure modes
- References authoritative sources
6. Markdown Exporter
Exports approved content as structured markdown files for Claude Projects or fine-tuning.
Export Structure
Nested by Topic (recommended):
```
exports/
├── INDEX.md
├── prompt-engineering/
│   ├── _index.md
│   ├── 01-chain-of-thought.md
│   └── 02-few-shot-prompting.md
├── claude-models/
│   ├── _index.md
│   └── 01-model-comparison.md
└── agent-building/
    └── 01-tool-use.md
```
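The top-level INDEX.md might be generated from the approved documents grouped by topic; this sketch assumes each document record carries `topic_slug`, `title`, and `filename` keys (the key names are assumptions):

```python
from collections import defaultdict

def generate_index(docs):
    by_topic = defaultdict(list)
    for d in docs:
        by_topic[d["topic_slug"]].append(d)
    lines = ["# Reference Library Index", ""]
    for topic in sorted(by_topic):
        lines.append(f"## {topic}")
        for d in sorted(by_topic[topic], key=lambda x: x["title"]):
            # Relative link into the topic subdirectory
            lines.append(f"- [{d['title']}]({topic}/{d['filename']})")
        lines.append("")
    return "\n".join(lines)
```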
Document File Template
```python
def generate_document_file(doc, include_metadata=True):
    content = []
    if include_metadata:
        content.append("---")
        content.append(f"title: {doc['title']}")
        content.append(f"source: {doc['url']}")
        content.append(f"vendor: {doc['vendor']}")
        content.append(f"tier: {doc['credibility_tier']}")
        content.append(f"quality_score: {doc['quality_score']:.2f}")
        content.append("---")
        content.append("")
    content.append(doc['structured_content'])
    return "\n".join(content)
```
Fine-tuning Export (JSONL)
```python
import json

def export_fine_tuning_dataset(content_list, config):
    with open('fine_tuning.jsonl', 'w') as f:
        for doc in content_list:
            sample = {
                "messages": [
                    {"role": "system", "content": "You are an expert on AI and prompt engineering."},
                    {"role": "user", "content": f"Explain {doc['title']}"},
                    {"role": "assistant", "content": doc['structured_content']}
                ],
                "metadata": {
                    "source": doc['url'],
                    "topic": doc['topic_slug'],
                    "quality_score": doc['quality_score']
                }
            }
            f.write(json.dumps(sample) + '\n')
```
Cross-Reference Generation
```python
def add_cross_references(doc, all_docs):
    related = []
    doc_concepts = set(c['term'].lower() for c in doc['key_concepts'])
    for other in all_docs:
        if other['doc_id'] == doc['doc_id']:
            continue
        other_concepts = set(c['term'].lower() for c in other['key_concepts'])
        overlap = len(doc_concepts & other_concepts)
        if overlap >= 2:
            related.append({
                "title": other['title'],
                "path": generate_relative_path(doc, other),
                "overlap": overlap
            })
    return sorted(related, key=lambda x: x['overlap'], reverse=True)[:5]
```
Integration Flow
| From | Output | To |
|---|---|---|
| pipeline-orchestrator | Coordinates all stages | All skills below |
| reference-discovery | URL manifest | web-crawler |
| web-crawler | Raw content + manifest | content-repository |
| content-repository | Document records | content-distiller |
| content-distiller | Distilled content | quality-reviewer |
| quality-reviewer (approve) | Approved IDs | markdown-exporter |
| quality-reviewer (refactor) | Instructions | content-distiller |
| quality-reviewer (deep_research) | Queries | web-crawler |
State Management
The pipeline orchestrator tracks state for resume capability:
With Database:
- pipeline_runs table tracks run status, current stage, and statistics
- pipeline_iteration_tracker table tracks QA loop iterations per document
File-Based Fallback:
```
~/reference-library/pipeline_state/run_XXX/
├── state.json       # Current stage and stats
├── manifest.json    # Discovered sources
└── review_log.json  # QA decisions
```
Resume Pipeline
To resume a paused or failed pipeline:
- Provide the run_id or state file path
- Pipeline continues from last successful checkpoint
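The file-based checkpoint can be sketched as a pair of helpers; the `stage`/`stats` field names inside state.json are assumptions beyond the "current stage and stats" noted above:

```python
import json
import os

def save_state(run_dir, stage, stats):
    # Write the checkpoint atomically enough for a single-writer pipeline
    os.makedirs(run_dir, exist_ok=True)
    with open(os.path.join(run_dir, "state.json"), "w") as f:
        json.dump({"stage": stage, "stats": stats}, f)

def load_state(run_dir):
    path = os.path.join(run_dir, "state.json")
    if not os.path.exists(path):
        return None  # no checkpoint: start from discovery
    with open(path) as f:
        return json.load(f)
```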