feat(reference-curator): Add pipeline orchestrator and refactor skill format

Pipeline Orchestrator:
- Add 07-pipeline-orchestrator skill with code/CLAUDE.md and desktop/SKILL.md
- Add /reference-curator-pipeline slash command for full workflow automation
- Add pipeline_runs and pipeline_iteration_tracker tables to schema.sql
- Add v_pipeline_status and v_pipeline_iterations views
- Add pipeline_config.yaml configuration template
- Update AGENTS.md with Reference Curator Skills section
- Update claude-project files with pipeline documentation

Skill Format Refactoring:
- Extract YAML frontmatter from SKILL.md files to separate skill.yaml
- Add tools/ directories with MCP tool documentation
- Update SKILL-FORMAT-REQUIREMENTS.md with new structure
- Add migrate-skill-structure.py script for format conversion

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 01:01:02 +07:00
parent 243b9d851c
commit d1cd1298a8
91 changed files with 2475 additions and 281 deletions


@@ -1,6 +1,87 @@
# Reference Curator - Complete Skill Set
This document contains all 7 skills for curating, processing, and exporting reference documentation.
---
# Pipeline Orchestrator (Recommended Entry Point)
Coordinates the other six skills as a single workflow, with automated handling of the QA loop.
## Quick Start
```
# Full pipeline from topic
curate references on "Claude Code best practices"

# From URLs (skip discovery)
curate these URLs: https://docs.anthropic.com/en/docs/prompt-caching

# With auto-approve
curate references on "MCP servers" with auto-approve and fine-tuning output
```
## Configuration Options
| Option | Default | Description |
|--------|---------|-------------|
| max_sources | 10 | Maximum sources to discover |
| max_pages | 50 | Maximum pages per source |
| auto_approve | false | Auto-approve above threshold |
| threshold | 0.85 | Approval threshold |
| max_iterations | 3 | Max QA loop iterations |
| export_format | project_files | Output format |
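The defaults above might be expressed in `pipeline_config.yaml` roughly as follows. This is a sketch based on the table; the actual key names and layout in the shipped template may differ.

```
# pipeline_config.yaml — sketch of the defaults listed above
max_sources: 10        # maximum sources to discover
max_pages: 50          # maximum pages per source
auto_approve: false    # auto-approve documents scoring above threshold
threshold: 0.85        # approval threshold (0.0–1.0)
max_iterations: 3      # max QA loop iterations per document
export_format: project_files
```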
## Pipeline Flow
```
[Input: Topic | URLs | Manifest]
        │
        ▼
1. reference-discovery   (skipped if URLs/manifest provided)
        │
        ▼
2. web-crawler ◄────────────────────────────┐
        │                                   │
        ▼                                   │
3. content-repository                       │
        │                                   │
        ▼                                   │
4. content-distiller ◄───────────────┐      │
        │                            │      │
        ▼                            │      │
5. quality-reviewer                  │      │
        │                            │      │
        ├── APPROVE → export         │      │
        ├── REFACTOR (max 3) ────────┘      │
        ├── DEEP_RESEARCH (max 2) ──────────┘
        └── REJECT → archive
        │ (approved documents)
        ▼
6. markdown-exporter
```
## QA Loop Handling
| Decision | Action | Max Iterations |
|----------|--------|----------------|
| APPROVE | Proceed to export | - |
| REFACTOR | Re-distill with feedback | 3 |
| DEEP_RESEARCH | Crawl more sources | 2 |
| REJECT | Archive with reason | - |
Documents exceeding iteration limits are marked `needs_manual_review`.
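The decision table and iteration limits above can be sketched as a small dispatch function. This is an illustrative sketch only: the function and action names are hypothetical, and the real orchestrator persists counts in `pipeline_iteration_tracker` rather than an in-memory dict.

```python
# Hypothetical sketch of the QA loop limits described above.
MAX_REFACTOR = 3
MAX_DEEP_RESEARCH = 2


def handle_decision(decision, counts):
    """Return the next action for a document given its iteration counts.

    counts: dict with 'refactor' and 'deep_research' tallies for the document.
    """
    if decision == "APPROVE":
        return "export"
    if decision == "REJECT":
        return "archive"
    if decision == "REFACTOR":
        if counts["refactor"] >= MAX_REFACTOR:
            return "needs_manual_review"  # iteration limit exceeded
        counts["refactor"] += 1
        return "re_distill"
    if decision == "DEEP_RESEARCH":
        if counts["deep_research"] >= MAX_DEEP_RESEARCH:
            return "needs_manual_review"
        counts["deep_research"] += 1
        return "crawl_more"
    raise ValueError(f"unknown decision: {decision}")
```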
## Output Summary
```
Pipeline Complete:
- Sources discovered: 5
- Pages crawled: 45
- Approved: 40
- Needs manual review: 2
- Exports: ~/reference-library/exports/
```
---
@@ -464,6 +545,7 @@ def add_cross_references(doc, all_docs):
| From | Output | To |
|------|--------|-----|
| **pipeline-orchestrator** | Coordinates all stages | All skills below |
| **reference-discovery** | URL manifest | web-crawler |
| **web-crawler** | Raw content + manifest | content-repository |
| **content-repository** | Document records | content-distiller |
@@ -471,3 +553,25 @@ def add_cross_references(doc, all_docs):
| **quality-reviewer** (approve) | Approved IDs | markdown-exporter |
| **quality-reviewer** (refactor) | Instructions | content-distiller |
| **quality-reviewer** (deep_research) | Queries | web-crawler |
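The happy-path handoffs in the table above can be modeled as a simple stage graph. The stage and artifact names below mirror the table but are illustrative; they are not the orchestrator's real internal identifiers.

```python
# Hypothetical stage graph mirroring the dataflow table above.
# Maps each stage to (output artifact, next stage) on the happy path;
# quality-reviewer's refactor/deep_research/reject branches are omitted.
STAGE_FLOW = {
    "reference-discovery": ("url_manifest", "web-crawler"),
    "web-crawler": ("raw_content", "content-repository"),
    "content-repository": ("document_records", "content-distiller"),
    "content-distiller": ("distilled_docs", "quality-reviewer"),
    "quality-reviewer": ("approved_ids", "markdown-exporter"),
}


def next_stage(stage):
    """Return (output artifact, next stage), or (None, None) at the end."""
    return STAGE_FLOW.get(stage, (None, None))
```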
## State Management
The pipeline orchestrator tracks state for resume capability:
**With Database:**
- `pipeline_runs` table tracks run status, current stage, statistics
- `pipeline_iteration_tracker` tracks QA loop iterations per document
**File-Based Fallback:**
```
~/reference-library/pipeline_state/run_XXX/
├── state.json # Current stage and stats
├── manifest.json # Discovered sources
└── review_log.json # QA decisions
```
## Resume Pipeline
To resume a paused or failed pipeline:
1. Provide the `run_id` or the path to its state file
2. The pipeline continues from the last successful checkpoint
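For the file-based fallback, resuming amounts to reading `state.json` from the run directory. A minimal sketch, assuming `state.json` carries `stage` and `stats` fields (these field names are an assumption, not the documented schema):

```python
import json
from pathlib import Path


def load_checkpoint(run_dir):
    """Load state.json for a pipeline run and report where to resume.

    Returns (stage, stats); stage is None if not recorded.
    """
    state_file = Path(run_dir) / "state.json"
    state = json.loads(state_file.read_text())
    return state.get("stage"), state.get("stats", {})
```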