feat(reference-curator): Add pipeline orchestrator and refactor skill format

Pipeline Orchestrator:
- Add 07-pipeline-orchestrator skill with code/CLAUDE.md and desktop/SKILL.md
- Add /reference-curator-pipeline slash command for full workflow automation
- Add pipeline_runs and pipeline_iteration_tracker tables to schema.sql
- Add v_pipeline_status and v_pipeline_iterations views
- Add pipeline_config.yaml configuration template
- Update AGENTS.md with Reference Curator Skills section
- Update claude-project files with pipeline documentation

Skill Format Refactoring:
- Extract YAML frontmatter from SKILL.md files to separate skill.yaml
- Add tools/ directories with MCP tool documentation
- Update SKILL-FORMAT-REQUIREMENTS.md with new structure
- Add migrate-skill-structure.py script for format conversion

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
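
The frontmatter extraction mentioned in the commit message can be illustrated with a minimal sketch. This is not the actual migrate-skill-structure.py, whose contents are not shown here. The function name `split_frontmatter` is hypothetical, and the sketch assumes frontmatter is delimited by `---` lines at the very top of SKILL.md.

```python
from typing import Tuple


def split_frontmatter(text: str) -> Tuple[str, str]:
    """Split a markdown document into (frontmatter, body).

    Assumption: frontmatter, when present, starts on the first line with
    '---' and ends at the next '---' line. Returns ("", text) otherwise.
    """
    lines = text.splitlines(keepends=True)
    if not lines or lines[0].strip() != "---":
        return "", text  # no frontmatter block at the top
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "".join(lines[1:i])
            body = "".join(lines[i + 1:]).lstrip("\n")
            return frontmatter, body
    return "", text  # unterminated block: leave the document untouched


sample = "---\nname: demo\ndescription: x\n---\n\n# Demo\nBody\n"
frontmatter, body = split_frontmatter(sample)
```

Under this sketch, `frontmatter` would be written to skill.yaml and `body` back to SKILL.md; a document without a leading `---` line is returned unchanged.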
@@ -130,37 +130,44 @@ This displays available files in `claude-project/` and optionally copies them to
 ## Architecture
 
 ```
-[Topic Input]
-    │
-    ▼
-┌─────────────────────┐
-│ reference-discovery │ → Search & validate sources
-└─────────────────────┘
-    │
-    ▼
+                  ┌──────────────────────────────┐
+                  │  reference-curator-pipeline  │  (Orchestrator)
+                  │  /reference-curator-pipeline │
+                  └──────────────────────────────┘
+                                  │
+          ┌───────────────────────┼───────────────────────┐
+          ▼                       ▼                       ▼
+    [Topic Input]            [URL Input]          [Manifest Input]
+          │                       │                       │
+          ▼                       │                       │
+┌─────────────────────┐           │                       │
+│ reference-discovery │ ◄─────────┴───────────────────────┘
+└─────────────────────┘  (skip if URLs/manifest)
+          │
+          ▼
 ┌──────────────────────────┐
 │ web-crawler-orchestrator │ → Crawl (Firecrawl/Node.js/aiohttp/Scrapy)
 └──────────────────────────┘
-    │
-    ▼
+          │
+          ▼
 ┌────────────────────┐
 │ content-repository │ → Store in MySQL
 └────────────────────┘
-    │
-    ▼
+          │
+          ▼
 ┌───────────────────┐
-│ content-distiller │ → Summarize & extract
-└───────────────────┘
-    │
-    ▼
-┌──────────────────┐
-│ quality-reviewer │ → QA loop
-└──────────────────┘
-    │
-    ├── REFACTOR → content-distiller
-    ├── DEEP_RESEARCH → web-crawler-orchestrator
-    │
-    ▼ APPROVE
+│ content-distiller │ → Summarize & extract  ◄─────┐
+└───────────────────┘                              │
+          │                                        │
+          ▼                                        │
+┌──────────────────┐                               │
+│ quality-reviewer │ → QA loop                     │
+└──────────────────┘                               │
+          │                                        │
+          ├── REFACTOR (max 3) ────────────────────┤
+          ├── DEEP_RESEARCH (max 2) → crawler ─────┘
+          │
+          ▼ APPROVE
 ┌───────────────────┐
 │ markdown-exporter │ → Project files / Fine-tuning
 └───────────────────┘
@@ -170,7 +177,35 @@ This displays available files in `claude-project/` and optionally copies them to
 
 ## User Guide
 
-### Basic Workflow
+### Full Pipeline (Recommended)
+
+Run the complete curation workflow with a single command:
+
+```
+# From topic - runs all 6 stages automatically
+/reference-curator-pipeline "Claude Code best practices" --max-sources 5
+
+# From URLs - skip discovery, start at crawler
+/reference-curator-pipeline https://docs.anthropic.com/en/docs/prompt-caching
+
+# Resume from manifest file
+/reference-curator-pipeline ./manifest.json --auto-approve
+
+# Fine-tuning dataset output
+/reference-curator-pipeline "MCP servers" --export-format fine_tuning
+```
+
+**Pipeline Options:**
+- `--max-sources 10` - Max sources to discover (topic mode)
+- `--max-pages 50` - Max pages per source to crawl
+- `--auto-approve` - Auto-approve scores above threshold
+- `--threshold 0.85` - Approval threshold
+- `--max-iterations 3` - Max QA loop iterations per document
+- `--export-format project_files` - Output format (project_files, fine_tuning, jsonl)
+
+---
+
+### Manual Workflow (Step-by-Step)
 
 **Step 1: Discover References**
 ```
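
The three invocation styles added in this hunk (topic string, URL, manifest file) imply that the orchestrator classifies its first argument. The following is a rough sketch of how that could work, not the skill's actual logic. `InputMode`, `detect_input_mode`, and `skips_discovery` are hypothetical names, and the `.json` suffix check for manifests is an assumption based on the `./manifest.json` example.

```python
import re
from enum import Enum


class InputMode(Enum):
    TOPIC = "topic"
    URLS = "urls"
    MANIFEST = "manifest"


# Anything starting with http:// or https:// is treated as a URL.
_URL_PREFIX = re.compile(r"^https?://", re.IGNORECASE)


def detect_input_mode(arg: str) -> InputMode:
    """Classify the first pipeline argument (hypothetical helper)."""
    if _URL_PREFIX.match(arg):
        return InputMode.URLS
    if arg.lower().endswith(".json"):  # assumption: manifests are .json files
        return InputMode.MANIFEST
    return InputMode.TOPIC


def skips_discovery(mode: InputMode) -> bool:
    """Per the diagram, reference-discovery runs only for topic input."""
    return mode is not InputMode.TOPIC
```

The URL check runs first so that a URL ending in `.json` is still treated as a URL rather than a local manifest.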
@@ -295,6 +330,7 @@ mysql -h $MYSQL_HOST -u $MYSQL_USER -p"$MYSQL_PASSWORD" reference_library -e "
 | 04 | content-distiller | `/content-distiller` | Summarize & extract |
 | 05 | quality-reviewer | `/quality-reviewer` | QA scoring & routing |
 | 06 | markdown-exporter | `/markdown-exporter` | Export to markdown/JSONL |
+| 07 | pipeline-orchestrator | `/reference-curator-pipeline` | Full pipeline orchestration |
 
 ---
 
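
The "QA scoring & routing" step and the capped loop in the architecture diagram (REFACTOR max 3, DEEP_RESEARCH max 2) can be sketched as a small routing function. This is illustrative only. `LoopState` and `route` are hypothetical names, and the `needs-manual-review` fallback is an assumption, since the diff does not say what happens once a cap is reached.

```python
from dataclasses import dataclass

MAX_REFACTOR = 3       # "REFACTOR (max 3)" in the diagram
MAX_DEEP_RESEARCH = 2  # "DEEP_RESEARCH (max 2)" in the diagram


@dataclass
class LoopState:
    """Per-document iteration counters (hypothetical structure)."""
    refactor_count: int = 0
    deep_research_count: int = 0


def route(decision: str, state: LoopState) -> str:
    """Return the next skill for a quality-reviewer decision."""
    if decision == "APPROVE":
        return "markdown-exporter"
    if decision == "REFACTOR":
        if state.refactor_count >= MAX_REFACTOR:
            return "needs-manual-review"  # assumed fallback, not from the source
        state.refactor_count += 1
        return "content-distiller"
    if decision == "DEEP_RESEARCH":
        if state.deep_research_count >= MAX_DEEP_RESEARCH:
            return "needs-manual-review"  # assumed fallback, not from the source
        state.deep_research_count += 1
        return "web-crawler-orchestrator"
    raise ValueError(f"unknown decision: {decision!r}")
```

Keeping the counters per document mirrors the `--max-iterations` option, which is described as a limit per document.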
@@ -435,7 +471,8 @@ mysql -h $MYSQL_HOST -u $MYSQL_USER -p"$MYSQL_PASSWORD" reference_library < shar
 │   ├── content-repository.md
 │   ├── content-distiller.md
 │   ├── quality-reviewer.md
-│   └── markdown-exporter.md
+│   ├── markdown-exporter.md
+│   └── reference-curator-pipeline.md
 │
 ├── 01-reference-discovery/
 │   ├── code/CLAUDE.md  # Claude Code directive
@@ -455,6 +492,9 @@ mysql -h $MYSQL_HOST -u $MYSQL_USER -p"$MYSQL_PASSWORD" reference_library < shar
 ├── 06-markdown-exporter/
 │   ├── code/CLAUDE.md
 │   └── desktop/SKILL.md
+├── 07-pipeline-orchestrator/
+│   ├── code/CLAUDE.md
+│   └── desktop/SKILL.md
 │
 └── shared/
     ├── schema.sql  # MySQL schema