feat(reference-curator): Add pipeline orchestrator and refactor skill format

Pipeline Orchestrator:
- Add 07-pipeline-orchestrator skill with code/CLAUDE.md and desktop/SKILL.md
- Add /reference-curator-pipeline slash command for full workflow automation
- Add pipeline_runs and pipeline_iteration_tracker tables to schema.sql
- Add v_pipeline_status and v_pipeline_iterations views
- Add pipeline_config.yaml configuration template
- Update AGENTS.md with Reference Curator Skills section
- Update claude-project files with pipeline documentation

Skill Format Refactoring:
- Extract YAML frontmatter from SKILL.md files to separate skill.yaml
- Add tools/ directories with MCP tool documentation
- Update SKILL-FORMAT-REQUIREMENTS.md with new structure
- Add migrate-skill-structure.py script for format conversion

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@@ -0,0 +1,115 @@
---
description: Orchestrates full reference curation pipeline as background task. Runs discovery → crawl → store → distill → review → export with QA loop handling.
argument-hint: <topic|urls|manifest> [--max-sources 10] [--max-pages 50] [--auto-approve] [--threshold 0.85] [--max-iterations 3] [--export-format project_files]
allowed-tools: WebSearch, WebFetch, Read, Write, Bash, Grep, Glob, Task
---

# Reference Curator Pipeline

Full-stack orchestration of the 6-skill reference curation workflow.

## Input Modes

| Mode | Input Example | Pipeline Start |
|------|---------------|----------------|
| **Topic** | `"Claude system prompts"` | reference-discovery |
| **URLs** | `https://docs.anthropic.com/...` | web-crawler (skip discovery) |
| **Manifest** | `./manifest.json` | web-crawler (resume from discovery) |
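
As a rough illustration of the table above, the input mode could be inferred from the raw input string. This is only a sketch: `detect_input_mode` and `START_STAGE` are hypothetical names, and the rule that a manifest is recognized by its `.json` suffix is an assumption, not something the command specifies.

```python
from urllib.parse import urlparse

# Hypothetical mapping from input mode to the first pipeline stage,
# mirroring the "Pipeline Start" column of the Input Modes table.
START_STAGE = {
    "topic": "reference-discovery",
    "urls": "web-crawler",
    "manifest": "web-crawler",
}


def detect_input_mode(value: str) -> str:
    """Classify a raw input string as 'urls', 'manifest', or 'topic'."""
    parsed = urlparse(value)
    # An http(s) URL with a host is treated as URL mode.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "urls"
    # Assumption: manifests are JSON files identified by their suffix.
    if value.lower().endswith(".json"):
        return "manifest"
    # Anything else is treated as a free-text topic.
    return "topic"
```

Under these assumptions, `START_STAGE[detect_input_mode("./manifest.json")]` selects the crawler stage, while a plain topic string starts at discovery.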

## Arguments

- `<input>`: Required. Topic string, URL(s), or manifest file path
- `--max-sources`: Maximum sources to discover (topic mode, default: 10)
- `--max-pages`: Maximum pages per source to crawl (default: 50)
- `--auto-approve`: Auto-approve documents whose review score is above `--threshold`
- `--threshold`: Approval score threshold (default: 0.85)
- `--max-iterations`: Max QA loop iterations per document (default: 3)
- `--export-format`: Output format: `project_files`, `fine_tuning`, `jsonl` (default: project_files)
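
The slash command itself is interpreted by the agent rather than by a real CLI parser, but the flags and defaults listed above can be sketched with Python's standard `argparse` to make them concrete. `build_parser` is a hypothetical helper; only the flag names, defaults, and export formats come from the list above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(prog="reference-curator-pipeline")
    # One topic string, one or more URLs, or a manifest path.
    parser.add_argument("input", nargs="+",
                        help="topic string, URL(s), or manifest file path")
    parser.add_argument("--max-sources", type=int, default=10,
                        help="maximum sources to discover (topic mode)")
    parser.add_argument("--max-pages", type=int, default=50,
                        help="maximum pages per source to crawl")
    parser.add_argument("--auto-approve", action="store_true",
                        help="auto-approve scores above the threshold")
    parser.add_argument("--threshold", type=float, default=0.85,
                        help="approval threshold")
    parser.add_argument("--max-iterations", type=int, default=3,
                        help="max QA loop iterations per document")
    parser.add_argument("--export-format", default="project_files",
                        choices=["project_files", "fine_tuning", "jsonl"],
                        help="output format")
    return parser


# Example: the first invocation from the usage section.
args = build_parser().parse_args(
    ["Claude Code best practices", "--max-sources", "5"]
)
```

Flags that are not passed keep their documented defaults, so `args.threshold` is `0.85` and `args.export_format` is `"project_files"` in this example.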

## Pipeline Stages

```
1. reference-discovery (topic mode only)
2. web-crawler-orchestrator
3. content-repository
4. content-distiller ◄──────────────┐
5. quality-reviewer                 │
   ├── APPROVE → export             │
   ├── REFACTOR ────────────────────┤
   ├── DEEP_RESEARCH → crawler ─────┘
   └── REJECT → archive
6. markdown-exporter
```

## QA Loop Handling

| Decision | Action | Max Iterations |
|----------|--------|----------------|
| REFACTOR | Re-distill with feedback | 3 |
| DEEP_RESEARCH | Crawl more sources, re-distill | 2 |
| Combined | Total loops per document | 5 |

After the maximum number of iterations is reached, the document is marked as `needs_manual_review`.
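
The limits in the table can be read as a small routing function. The following is a minimal sketch under that reading, not the orchestrator's actual logic: `LoopCounter` and `next_step` are hypothetical names, and the returned stage names are taken from the pipeline stage list.

```python
from dataclasses import dataclass

# Limits from the QA Loop Handling table.
MAX_REFACTOR = 3
MAX_DEEP_RESEARCH = 2
MAX_TOTAL = 5


@dataclass
class LoopCounter:
    """Per-document count of QA loops taken so far."""
    refactor: int = 0
    deep_research: int = 0

    @property
    def total(self) -> int:
        return self.refactor + self.deep_research


def next_step(decision: str, counter: LoopCounter) -> str:
    """Map a reviewer decision to the next stage, enforcing loop limits."""
    if decision == "APPROVE":
        return "export"
    if decision == "REJECT":
        return "archive"
    if decision == "REFACTOR":
        if counter.refactor >= MAX_REFACTOR or counter.total >= MAX_TOTAL:
            return "needs_manual_review"
        counter.refactor += 1
        return "content-distiller"
    if decision == "DEEP_RESEARCH":
        if counter.deep_research >= MAX_DEEP_RESEARCH or counter.total >= MAX_TOTAL:
            return "needs_manual_review"
        counter.deep_research += 1
        return "web-crawler-orchestrator"
    raise ValueError(f"unknown decision: {decision}")
```

With these limits, a document that receives a fourth `REFACTOR` decision is routed to `needs_manual_review` instead of back to the distiller.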

## Example Usage

```
# Full pipeline from topic
/reference-curator-pipeline "Claude Code best practices" --max-sources 5

# Pipeline from specific URLs (skip discovery)
/reference-curator-pipeline https://docs.anthropic.com/en/docs/build-with-claude/prompt-caching

# Resume from existing manifest
/reference-curator-pipeline ./manifest.json --auto-approve

# Fine-tuning dataset output
/reference-curator-pipeline "MCP servers" --export-format fine_tuning --auto-approve
```

## State Management

Pipeline state is saved after each stage so that an interrupted run can be resumed:

**With MySQL:**
```sql
SELECT * FROM pipeline_runs WHERE run_id = 123;
```

**File-based fallback:**
```
~/reference-library/pipeline_state/run_XXX/state.json
```
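
For the file-based fallback, saving and reloading state could look roughly like the sketch below. Several details are assumptions made for illustration only: the `run_<id>` directory naming, the `last_completed_stage` and `stats` fields, and the helper names `state_path`, `save_state`, and `load_state`. The real layout of `state.json` is not shown in this file.

```python
import json
from pathlib import Path
from typing import Optional


def state_path(base_dir: Path, run_id: int) -> Path:
    """Location of state.json for a run (directory naming is assumed)."""
    return base_dir / f"run_{run_id}" / "state.json"


def save_state(base_dir: Path, run_id: int, stage: str, stats: dict) -> None:
    """Persist the last completed stage so the run can be resumed later."""
    path = state_path(base_dir, run_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "run_id": run_id,
        "last_completed_stage": stage,  # hypothetical field name
        "stats": stats,                 # hypothetical field name
    }
    # Write to a temporary file first, then rename it into place,
    # so a crash mid-write does not leave a truncated state.json.
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    tmp.replace(path)


def load_state(base_dir: Path, run_id: int) -> Optional[dict]:
    """Return the saved state, or None if the run has no state file."""
    path = state_path(base_dir, run_id)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))
```

A caller would pass something like `Path.home() / "reference-library" / "pipeline_state"` as `base_dir` and resume from the stage after `last_completed_stage`.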

## Output

The pipeline returns a summary on completion:

```json
{
  "run_id": 123,
  "status": "completed",
  "stats": {
    "sources_discovered": 5,
    "pages_crawled": 45,
    "documents_stored": 45,
    "approved": 40,
    "refactored": 8,
    "deep_researched": 2,
    "rejected": 3,
    "needs_manual_review": 2
  },
  "exports": {
    "format": "project_files",
    "path": "~/reference-library/exports/"
  }
}
```

## See Also

- `/reference-discovery` - Run discovery stage only
- `/web-crawler` - Run crawler stage only
- `/content-repository` - Manage stored content
- `/quality-reviewer` - Run QA review only
- `/markdown-exporter` - Run export only