Andrew Yim 6d7a6d7a88 feat(reference-curator): Add portable skill suite for reference documentation curation
6 modular skills for curating, processing, and exporting reference docs:
- reference-discovery: Search and validate authoritative sources
- web-crawler-orchestrator: Multi-backend crawling (Firecrawl/Node/aiohttp/Scrapy)
- content-repository: MySQL storage with version tracking
- content-distiller: Summarization and key concept extraction
- quality-reviewer: QA loop with approve/refactor/research routing
- markdown-exporter: Structured output for Claude Projects or fine-tuning

Cross-machine installation support:
- Environment-based config (~/.reference-curator.env)
- Commands tracked in repo, symlinked during install
- install.sh with --minimal, --check, --uninstall modes
- Firecrawl MCP as default (always available)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-29 00:20:27 +07:00


Reference Curator Skills - Refactoring Log

Date: 2025-01-28
Version: 2.0
Author: Claude Code (Opus 4.5)


Summary

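Restructured the Reference Curator skill suite from a flat layout into a dual-platform layout, in which each skill directory holds a code/CLAUDE.md directive and a desktop/SKILL.md file, and added an installation script that automates setup.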


Changes Made

1. Directory Restructuring

Before:

90-reference-curator/
├── SKILL.md (Skill 01)
└── mnt/user-data/outputs/reference-curator-skills/
    ├── 02-web-crawler/SKILL.md
    ├── 03-content-repository/SKILL.md
    ├── 04-content-distiller/SKILL.md
    ├── 05-quality-reviewer/SKILL.md
    └── 06-markdown-exporter/SKILL.md

After:

90-reference-curator/
├── README.md
├── install.sh                    # NEW: Installation script
├── 01-reference-discovery/
│   ├── code/CLAUDE.md           # NEW: Claude Code directive
│   └── desktop/SKILL.md
├── 02-web-crawler-orchestrator/
│   ├── code/CLAUDE.md           # NEW
│   └── desktop/SKILL.md
├── 03-content-repository/
│   ├── code/CLAUDE.md           # NEW
│   └── desktop/SKILL.md
├── 04-content-distiller/
│   ├── code/CLAUDE.md           # NEW
│   └── desktop/SKILL.md
├── 05-quality-reviewer/
│   ├── code/CLAUDE.md           # NEW
│   └── desktop/SKILL.md
├── 06-markdown-exporter/
│   ├── code/CLAUDE.md           # NEW
│   └── desktop/SKILL.md
└── shared/
    ├── schema.sql               # NEW: MySQL schema
    └── config/
        ├── db_config.yaml       # NEW
        ├── crawl_config.yaml    # NEW
        └── export_config.yaml   # NEW

2. New Files Created

| File | Purpose |
| --- | --- |
| install.sh | Interactive installation script |
| shared/schema.sql | MySQL schema (9 tables, 2 views) |
| shared/config/db_config.yaml | Database connection config |
| shared/config/crawl_config.yaml | Crawler routing config |
| shared/config/export_config.yaml | Export format config |
| */code/CLAUDE.md | Claude Code directives (6 files, one per skill) |

3. Crawler Configuration

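Defined crawler routing rules in shared/config/crawl_config.yaml. The selection logic itself (select_crawler.py) is not implemented yet; see Next Steps.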

| Crawler | Condition | Use Case |
| --- | --- | --- |
| Node.js (default) | ≤50 pages, static | Small documentation sites |
| Python aiohttp | ≤200 pages, SEO needed | Technical docs |
| Scrapy | >200 pages, multi-domain | Enterprise crawls |
| Firecrawl MCP | SPA, JS-rendered | Dynamic sites |
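
The routing conditions above could be expressed roughly as follows. This is only a sketch of what a future select_crawler.py might look like, not the project's implementation. The `SiteProfile` type, its field names, the returned crawler identifiers, and the rule precedence (JS rendering checked first, then size and domain count) are all assumptions. The table also does not say which crawler handles a 51 to 200 page site without SEO needs; the sketch sends it to aiohttp.

```python
from dataclasses import dataclass


@dataclass
class SiteProfile:
    """Hypothetical description of a crawl target (field names are assumptions)."""
    page_count: int
    js_rendered: bool = False   # SPA or otherwise JavaScript-rendered
    needs_seo: bool = False     # SEO metadata is required
    multi_domain: bool = False  # crawl spans more than one domain


def select_crawler(site: SiteProfile) -> str:
    """Pick a crawler name from the routing conditions (precedence is assumed)."""
    if site.js_rendered:
        return "firecrawl"      # SPA / JS-rendered sites
    if site.page_count > 200 or site.multi_domain:
        return "scrapy"         # large or multi-domain crawls
    if site.needs_seo or site.page_count > 50:
        return "aiohttp"        # up to 200 pages, or SEO needed
    return "nodejs"             # default: up to 50 static pages


print(select_crawler(SiteProfile(page_count=30)))
print(select_crawler(SiteProfile(page_count=120, needs_seo=True)))
```

Keeping the rules in one pure function makes the thresholds easy to unit test and, later, to load from crawl_config.yaml instead of hard-coding them.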

4. Installation Script Features

./install.sh           # Interactive installation
./install.sh --check   # Verify status
./install.sh --reset   # Reset (preserves data)

Handles:

  • Config file deployment to ~/.config/reference-curator/
  • Storage directory creation at ~/reference-library/
  • MySQL credentials setup in ~/.envrc
  • Database creation and schema application
  • Skill symlink registration in ~/.claude/skills/
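
To illustrate the last step, here is a minimal Python sketch of skill symlink registration. install.sh itself is a shell script and its actual logic is not shown in this log, so the function name `register_skills`, its parameters, and the replace-stale-links behavior are assumptions. Stripping the numeric prefix from the directory name follows the link names listed under Environment Setup.

```python
import re
from pathlib import Path


def register_skills(repo_root: Path, skills_dir: Path) -> list[Path]:
    """Symlink each NN-name/desktop folder into skills_dir under the bare name."""
    skills_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for skill in sorted(repo_root.glob("[0-9][0-9]-*/desktop")):
        # "01-reference-discovery" -> "reference-discovery"
        name = re.sub(r"^\d+-", "", skill.parent.name)
        link = skills_dir / name
        if link.is_symlink():
            link.unlink()  # replace a stale link left by an earlier install
        link.symlink_to(skill.resolve(), target_is_directory=True)
        links.append(link)
    return links
```

A call such as `register_skills(Path("90-reference-curator"), Path.home() / ".claude" / "skills")` would create one link per skill. If a regular file or directory already occupies a link name, `symlink_to` raises `FileExistsError` rather than overwriting it.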

5. MySQL Schema

Tables (9):

  • sources - Authoritative source registry
  • documents - Crawled document storage
  • distilled_content - Processed summaries
  • review_logs - QA decision history
  • topics - Topic taxonomy
  • document_topics - Document-topic mapping
  • export_jobs - Export task tracking
  • crawl_schedule - Scheduled crawl jobs
  • change_detection - Content change tracking

Views (2):

  • v_pending_reviews - Documents awaiting review
  • v_export_ready - Approved documents ready for export
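
As an illustration of how a view such as v_pending_reviews can be derived from the base tables, the sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL. The column names and the view definition are hypothetical; the actual definitions are in shared/schema.sql and may differ, for example by tracking review status on distilled_content rather than on documents.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL in this sketch
conn.executescript("""
CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE review_logs (
    id INTEGER PRIMARY KEY,
    document_id INTEGER REFERENCES documents(id),
    decision TEXT  -- e.g. 'approve', 'refactor', 'research'
);
-- Hypothetical definition: a document is pending until it has a review row.
CREATE VIEW v_pending_reviews AS
SELECT d.id, d.title
FROM documents AS d
LEFT JOIN review_logs AS r ON r.document_id = d.id
WHERE r.id IS NULL;
""")
conn.executemany(
    "INSERT INTO documents (id, title) VALUES (?, ?)",
    [(1, "Doc A"), (2, "Doc B")],
)
conn.execute(
    "INSERT INTO review_logs (document_id, decision) VALUES (1, 'approve')"
)
pending = conn.execute("SELECT id, title FROM v_pending_reviews").fetchall()
print(pending)  # [(2, 'Doc B')]
```

Only the unreviewed document appears in the view, so a reviewer can query it directly without repeating the join.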

6. Environment Setup

Files installed:

~/.config/reference-curator/
├── db_config.yaml
├── crawl_config.yaml
└── export_config.yaml

~/reference-library/
├── raw/
├── processed/
└── exports/

~/.claude/skills/
├── reference-discovery -> .../01-reference-discovery/desktop
├── web-crawler-orchestrator -> .../02-web-crawler-orchestrator/desktop
├── content-repository -> .../03-content-repository/desktop
├── content-distiller -> .../04-content-distiller/desktop
├── quality-reviewer -> .../05-quality-reviewer/desktop
└── markdown-exporter -> .../06-markdown-exporter/desktop

Deleted

  • mnt/user-data/outputs/ directory (moved to proper structure)
  • Root-level SKILL.md (moved to 01-reference-discovery/desktop/)

Verification

$ ./install.sh --check

✓ Configuration files (3/3)
✓ Storage directories (3/3)
✓ MySQL database (11 tables)
✓ Skill registrations (6/6)

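All components installed correctly.

The count of 11 presumably reflects the 9 tables plus the 2 views, since MySQL lists views alongside base tables.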
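
The file and directory parts of this check could be reproduced in Python along the following lines. This is a sketch, not the logic of install.sh: the function names `check_install` and `format_report` are assumptions, and the database and symlink checks are omitted. The expected file and directory names are the ones listed under Environment Setup.

```python
from pathlib import Path

CONFIG_FILES = ("db_config.yaml", "crawl_config.yaml", "export_config.yaml")
STORAGE_DIRS = ("raw", "processed", "exports")


def check_install(config_dir: Path, library_dir: Path) -> dict[str, tuple[int, int]]:
    """Return (found, expected) counts for config files and storage directories."""
    configs = sum((config_dir / name).is_file() for name in CONFIG_FILES)
    storage = sum((library_dir / name).is_dir() for name in STORAGE_DIRS)
    return {
        "Configuration files": (configs, len(CONFIG_FILES)),
        "Storage directories": (storage, len(STORAGE_DIRS)),
    }


def format_report(status: dict[str, tuple[int, int]]) -> str:
    """Render one line per component in the style of the --check output."""
    lines = []
    for label, (found, expected) in status.items():
        mark = "✓" if found == expected else "✗"
        lines.append(f"{mark} {label} ({found}/{expected})")
    return "\n".join(lines)
```

For example, `print(format_report(check_install(Path.home() / ".config" / "reference-curator", Path.home() / "reference-library")))` reports on the default install locations.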

Next Steps

  1. Add Python scripts to */code/scripts/ folders for automation
  2. Implement select_crawler.py for intelligent routing logic
  3. Add unit tests for database operations
  4. Create example workflows in */desktop/examples/