# Reference Curator - Claude.ai Project Knowledge

This project knowledge enables Claude to curate, process, and export reference documentation through 7 modular skills.

## Quick Start - Pipeline Orchestrator

Run the full curation workflow with a single command:

```
# Full pipeline from topic
curate references on "Claude Code best practices"

# From URLs (skip discovery)
curate these URLs: https://docs.anthropic.com/en/docs/prompt-caching

# With auto-approve
curate references on "MCP servers" with auto-approve
```

## Skills Overview

| Skill | Purpose | Trigger Phrases |
|-------|---------|-----------------|
| **pipeline-orchestrator** | Full 6-skill workflow with QA loops | "curate references", "run full pipeline", "automate curation" |
| **reference-discovery** | Search & validate authoritative sources | "find references", "search documentation", "discover sources" |
| **web-crawler** | Multi-backend crawling orchestration | "crawl URL", "fetch documents", "scrape pages" |
| **content-repository** | MySQL storage management | "store content", "save to database", "check duplicates" |
| **content-distiller** | Summarize & extract key concepts | "distill content", "summarize document", "extract key concepts" |
| **quality-reviewer** | QA scoring & routing decisions | "review content", "quality check", "assess distilled content" |
| **markdown-exporter** | Export to markdown/JSONL | "export references", "generate project files", "create markdown output" |

## Workflow

```
              ┌───────────────────────────┐
              │   pipeline-orchestrator   │  (Coordinates all stages)
              └───────────────────────────┘
                            │
        ┌───────────────────┼───────────────────┐
        ▼                   ▼                   ▼
  [Topic Input]        [URL Input]      [Manifest Input]
        │                   │                   │
        ▼                   │                   │
┌─────────────────────┐     │                   │
│ reference-discovery │ ◄───┴───────────────────┘
└─────────────────────┘  (skip if URLs/manifest)
        │
        ▼
┌─────────────────────┐
│ web-crawler         │ → Crawl (Firecrawl/Node.js/aiohttp/Scrapy)
└─────────────────────┘
        │
        ▼
┌─────────────────────┐
│ content-repository  │ → Store in MySQL
└─────────────────────┘
        │
        ▼
┌─────────────────────┐
│ content-distiller   │ → Summarize & extract  ◄────┐
└─────────────────────┘                             │
        │                                           │
        ▼                                           │
┌─────────────────────┐                             │
│ quality-reviewer    │ → QA loop                   │
└─────────────────────┘                             │
        │                                           │
        ├── REFACTOR (max 3) ───────────────────────┤
        ├── DEEP_RESEARCH (max 2) → crawler ────────┘
        │
        ▼ APPROVE
┌─────────────────────┐
│ markdown-exporter   │ → Project files / Fine-tuning
└─────────────────────┘
```

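The QA loop in the diagram can be sketched roughly as follows. This is a minimal illustration, not the skills' actual implementation: the `distill`, `review`, and `crawl_more` callables and the `LIMIT_REACHED` outcome are hypothetical names, and what happens once a cap is hit is an assumption, since the diagram only states the caps (3 refactors, 2 deep-research rounds).

```python
from typing import Callable, Optional, Tuple

MAX_REFACTOR = 3        # "REFACTOR (max 3)" in the diagram
MAX_DEEP_RESEARCH = 2   # "DEEP_RESEARCH (max 2)" in the diagram


def run_qa_loop(
    content: str,
    distill: Callable[[str, Optional[str]], str],      # hypothetical distiller hook
    review: Callable[[str], Tuple[str, str]],          # hypothetical reviewer hook
    crawl_more: Callable[[str], str],                  # hypothetical crawler hook
) -> Tuple[str, str]:
    """Return (outcome, distilled_text) after running the distill/review loop."""
    refactors = 0
    deep_researches = 0
    feedback: Optional[str] = None
    while True:
        distilled = distill(content, feedback)
        decision, feedback = review(distilled)
        if decision in ("APPROVE", "REJECT"):
            return decision, distilled
        if decision == "REFACTOR":
            if refactors >= MAX_REFACTOR:
                # Assumption: stop and report once the refactor cap is hit.
                return "LIMIT_REACHED", distilled
            refactors += 1
        elif decision == "DEEP_RESEARCH":
            if deep_researches >= MAX_DEEP_RESEARCH:
                # Assumption: same handling for the deep-research cap.
                return "LIMIT_REACHED", distilled
            deep_researches += 1
            content = content + "\n" + crawl_more(feedback)
        else:
            raise ValueError(f"unknown decision: {decision!r}")
```

Passing the stages in as callables keeps the sketch independent of any particular crawler or database backend.
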
## Quality Scoring Thresholds

| Score | Decision | Action |
|-------|----------|--------|
| ≥ 0.85 | **Approve** | Ready for export |
| 0.60-0.84 | **Refactor** | Re-distill with feedback |
| 0.40-0.59 | **Deep Research** | Gather more sources |
| < 0.40 | **Reject** | Archive (low quality) |

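As a rough illustration, the thresholds above map onto a simple routing function. The function name and return strings are hypothetical. The sketch also assumes scores lie in the range 0 to 1 and that each band is half-open, so a score such as 0.845 falls into Refactor; the table does not say how values between the listed bands are handled.

```python
def route_by_score(score: float) -> str:
    """Map a quality score in [0, 1] to a routing decision (illustrative only)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score >= 0.85:
        return "approve"        # ready for export
    if score >= 0.60:
        return "refactor"       # re-distill with feedback
    if score >= 0.40:
        return "deep_research"  # gather more sources
    return "reject"             # archive (low quality)
```
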
## Source Credibility Tiers

| Tier | Source Type | Examples |
|------|-------------|----------|
| **Tier 1** | Official documentation | docs.anthropic.com, platform.openai.com/docs |
| **Tier 1** | Official engineering blogs | anthropic.com/news, openai.com/blog |
| **Tier 2** | Research papers | arxiv.org papers with citations |
| **Tier 2** | Verified community guides | Official cookbooks, tutorials |
| **Tier 3** | Community content | Blog posts, Stack Overflow |

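One possible way to apply the tiers is a URL-prefix lookup, sketched below using only the example domains from the table. The `TIER_PREFIXES` mapping and `credibility_tier` function are hypothetical. Defaulting unknown sources to Tier 3 is an assumption, and the sketch does not check citations or verify community guides, which would need extra signals beyond the URL.

```python
from urllib.parse import urlparse

# Example prefixes taken from the tier table; a real list would be longer.
TIER_PREFIXES = {
    1: (
        "docs.anthropic.com",
        "platform.openai.com/docs",
        "anthropic.com/news",
        "openai.com/blog",
    ),
    2: ("arxiv.org",),
}


def credibility_tier(url: str) -> int:
    """Return 1, 2, or 3 for a URL based on host/path prefix (illustrative only)."""
    parsed = urlparse(url)
    host = parsed.netloc.lower().removeprefix("www.")  # requires Python 3.9+
    location = host + parsed.path
    for tier, prefixes in TIER_PREFIXES.items():
        for prefix in prefixes:
            # Match the prefix exactly or as a leading path segment.
            if location == prefix or location.startswith(prefix + "/"):
                return tier
    return 3  # assumption: anything unrecognized is treated as community content
```
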
## Files in This Project

- `INDEX.md` - This overview file
- `reference-curator-complete.md` - All 7 skills in one file (recommended)
- `01-reference-discovery.md` - Source discovery skill
- `02-web-crawler.md` - Crawling orchestration skill
- `03-content-repository.md` - Database storage skill
- `04-content-distiller.md` - Content summarization skill
- `05-quality-reviewer.md` - QA review skill
- `06-markdown-exporter.md` - Export skill
- `07-pipeline-orchestrator.md` - Full pipeline orchestration

## Usage

Upload all files to a Claude.ai Project, or upload only the skills you need.

For the complete experience, upload `reference-curator-complete.md`, which contains all 7 skills in one file.

## Pipeline Orchestrator Options

| Option | Default | Description |
|--------|---------|-------------|
| max_sources | 10 | Max sources to discover |
| max_pages | 50 | Max pages per source |
| auto_approve | false | Auto-approve above threshold |
| threshold | 0.85 | Approval threshold |
| max_iterations | 3 | Max QA loop iterations |
| export_format | project_files | Output format |

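For illustration, the options and defaults above could be modelled as a small dataclass. The `PipelineOptions` class and its `should_auto_approve` helper are hypothetical and not part of the skills themselves. Treating a score equal to the threshold as approved is an assumption, chosen to match the "≥ 0.85" row in the scoring table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineOptions:
    """Defaults mirror the options table; field names follow the option names."""
    max_sources: int = 10          # max sources to discover
    max_pages: int = 50            # max pages per source
    auto_approve: bool = False     # auto-approve at or above threshold
    threshold: float = 0.85        # approval threshold
    max_iterations: int = 3        # max QA loop iterations
    export_format: str = "project_files"

    def should_auto_approve(self, score: float) -> bool:
        # Assumption: auto-approval applies only when enabled and the
        # score meets the threshold.
        return self.auto_approve and score >= self.threshold


# Usage: override only what differs from the defaults.
options = PipelineOptions(auto_approve=True, max_sources=5)
```
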