# Reference Curator - Claude.ai Project Knowledge

This project knowledge enables Claude to curate, process, and export reference documentation through 7 modular skills.

## Quick Start - Pipeline Orchestrator

Run the full curation workflow with a single command:

```
# Full pipeline from topic
curate references on "Claude Code best practices"

# From URLs (skip discovery)
curate these URLs: https://docs.anthropic.com/en/docs/prompt-caching

# With auto-approve
curate references on "MCP servers" with auto-approve
```

## Skills Overview

| Skill | Purpose | Trigger Phrases |
|-------|---------|-----------------|
| **pipeline-orchestrator** | Full 6-skill workflow with QA loops | "curate references", "run full pipeline", "automate curation" |
| **reference-discovery** | Search & validate authoritative sources | "find references", "search documentation", "discover sources" |
| **web-crawler** | Multi-backend crawling orchestration | "crawl URL", "fetch documents", "scrape pages" |
| **content-repository** | MySQL storage management | "store content", "save to database", "check duplicates" |
| **content-distiller** | Summarize & extract key concepts | "distill content", "summarize document", "extract key concepts" |
| **quality-reviewer** | QA scoring & routing decisions | "review content", "quality check", "assess distilled content" |
| **markdown-exporter** | Export to markdown/JSONL | "export references", "generate project files", "create markdown output" |

## Workflow

```
              ┌───────────────────────────┐
              │   pipeline-orchestrator   │  (Coordinates all stages)
              └───────────────────────────┘
                            │
        ┌───────────────────┼───────────────────┐
        ▼                   ▼                   ▼
  [Topic Input]        [URL Input]      [Manifest Input]
        │                   │                   │
        ▼                   │                   │
┌─────────────────────┐     │                   │
│ reference-discovery │ ◄───┴───────────────────┘
└─────────────────────┘   (skip if URLs/manifest)
        │
        ▼
┌─────────────────────┐
│     web-crawler     │ → Crawl (Firecrawl/Node.js/aiohttp/Scrapy)
└─────────────────────┘
        │
        ▼
┌─────────────────────┐
│ content-repository  │ → Store in MySQL
└─────────────────────┘
        │
        ▼
┌─────────────────────┐
│  content-distiller  │ → Summarize & extract  ◄────┐
└─────────────────────┘                             │
        │                                           │
        ▼                                           │
┌─────────────────────┐                             │
│  quality-reviewer   │ → QA loop                   │
└─────────────────────┘                             │
        │                                           │
        ├── REFACTOR (max 3) ───────────────────────┤
        ├── DEEP_RESEARCH (max 2) → crawler ────────┘
        │
        ▼ APPROVE
┌─────────────────────┐
│  markdown-exporter  │ → Project files / Fine-tuning
└─────────────────────┘
```

## Quality Scoring Thresholds

| Score | Decision | Action |
|-------|----------|--------|
| ≥ 0.85 | **Approve** | Ready for export |
| 0.60-0.84 | **Refactor** | Re-distill with feedback |
| 0.40-0.59 | **Deep Research** | Gather more sources |
| < 0.40 | **Reject** | Archive (low quality) |

## Source Credibility Tiers

| Tier | Source Type | Examples |
|------|-------------|----------|
| **Tier 1** | Official documentation | docs.anthropic.com, platform.openai.com/docs |
| **Tier 1** | Official engineering blogs | anthropic.com/news, openai.com/blog |
| **Tier 2** | Research papers | arxiv.org papers with citations |
| **Tier 2** | Verified community guides | Official cookbooks, tutorials |
| **Tier 3** | Community content | Blog posts, Stack Overflow |

## Files in This Project

- `INDEX.md` - This overview file
- `reference-curator-complete.md` - All 7 skills in one file (recommended)
- `01-reference-discovery.md` - Source discovery skill
- `02-web-crawler.md` - Crawling orchestration skill
- `03-content-repository.md` - Database storage skill
- `04-content-distiller.md` - Content summarization skill
- `05-quality-reviewer.md` - QA review skill
- `06-markdown-exporter.md` - Export skill
- `07-pipeline-orchestrator.md` - Full pipeline orchestration

## Usage

Upload all files to a Claude.ai Project, or upload only the skills you need. For the complete experience, upload `reference-curator-complete.md` which contains all skills in one file.

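
The Quality Scoring Thresholds table earlier in this file maps a score to a routing decision. As a minimal sketch of that mapping, the function below is a hypothetical helper, not part of any skill file. The labels `APPROVE`, `REFACTOR`, and `DEEP_RESEARCH` come from the workflow diagram; `REJECT` and the function name `route_by_score` are assumptions made for illustration. The sketch also assumes scores lie in [0, 1] and that values between the table's listed bands (for example 0.845) fall into the lower band.

```python
def route_by_score(score: float) -> str:
    """Map a quality score to a routing decision (illustrative sketch only)."""
    # Assumption: scores are normalized to the range [0, 1].
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score}")
    if score >= 0.85:
        return "APPROVE"        # ready for export
    if score >= 0.60:
        return "REFACTOR"       # re-distill with feedback
    if score >= 0.40:
        return "DEEP_RESEARCH"  # gather more sources
    return "REJECT"             # archive (low quality); label is assumed


if __name__ == "__main__":
    for s in (0.91, 0.72, 0.45, 0.10):
        print(s, route_by_score(s))
```

Checking the bands from highest to lowest keeps each condition to a single comparison and leaves no gap between adjacent bands.
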
## Pipeline Orchestrator Options

| Option | Default | Description |
|--------|---------|-------------|
| max_sources | 10 | Max sources to discover |
| max_pages | 50 | Max pages per source |
| auto_approve | false | Auto-approve above threshold |
| threshold | 0.85 | Approval threshold |
| max_iterations | 3 | Max QA loop iterations |
| export_format | project_files | Output format |
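
The options above could be modeled as a small configuration object. The sketch below is one possible representation, not the orchestrator's actual implementation: the class name `PipelineOptions`, the validation rules, and the `should_auto_approve` helper are all hypothetical. It assumes "above threshold" means a score greater than or equal to `threshold`, in line with the "≥ 0.85" approval rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineOptions:
    """Hypothetical container for the options table; defaults mirror the table."""
    max_sources: int = 10          # max sources to discover
    max_pages: int = 50            # max pages per source
    auto_approve: bool = False     # auto-approve above threshold
    threshold: float = 0.85        # approval threshold
    max_iterations: int = 3        # max QA loop iterations
    export_format: str = "project_files"  # output format

    def __post_init__(self) -> None:
        # Assumed sanity checks; the source does not specify validation rules.
        if min(self.max_sources, self.max_pages, self.max_iterations) < 1:
            raise ValueError("max_sources, max_pages, max_iterations must be >= 1")
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError("threshold must be in [0, 1]")

    def should_auto_approve(self, score: float) -> bool:
        # Assumption: a score equal to the threshold counts as approved.
        return self.auto_approve and score >= self.threshold


if __name__ == "__main__":
    opts = PipelineOptions(auto_approve=True)
    print(opts)
    print(opts.should_auto_approve(0.9))
```

A frozen dataclass keeps the defaults in one place and prevents options from changing in the middle of a run.
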