6 modular skills for curating, processing, and exporting reference docs:
- reference-discovery: Search and validate authoritative sources
- web-crawler-orchestrator: Multi-backend crawling (Firecrawl/Node/aiohttp/Scrapy)
- content-repository: MySQL storage with version tracking
- content-distiller: Summarization and key concept extraction
- quality-reviewer: QA loop with approve/refactor/research routing
- markdown-exporter: Structured output for Claude Projects or fine-tuning
Cross-machine installation support:
- Environment-based config (~/.reference-curator.env)
- Commands tracked in repo, symlinked during install
- install.sh with --minimal, --check, --uninstall modes
- Firecrawl MCP as default (always available)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Reference Curator Skills - Refactoring Log
Date: 2025-01-28
Version: 2.0
Author: Claude Code (Opus 4.5)
Summary
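Restructured the Reference Curator skill suite from a flat layout into a dual-platform format, where each skill has a code/ directory holding a Claude Code directive and a desktop/ directory holding its SKILL.md. Installation is automated by install.sh.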
Changes Made
1. Directory Restructuring
Before:
90-reference-curator/
├── SKILL.md (Skill 01)
└── mnt/user-data/outputs/reference-curator-skills/
    ├── 02-web-crawler/SKILL.md
    ├── 03-content-repository/SKILL.md
    ├── 04-content-distiller/SKILL.md
    ├── 05-quality-reviewer/SKILL.md
    └── 06-markdown-exporter/SKILL.md
After:
90-reference-curator/
├── README.md
├── install.sh # NEW: Installation script
├── 01-reference-discovery/
│ ├── code/CLAUDE.md # NEW: Claude Code directive
│ └── desktop/SKILL.md
├── 02-web-crawler-orchestrator/
│ ├── code/CLAUDE.md # NEW
│ └── desktop/SKILL.md
├── 03-content-repository/
│ ├── code/CLAUDE.md # NEW
│ └── desktop/SKILL.md
├── 04-content-distiller/
│ ├── code/CLAUDE.md # NEW
│ └── desktop/SKILL.md
├── 05-quality-reviewer/
│ ├── code/CLAUDE.md # NEW
│ └── desktop/SKILL.md
├── 06-markdown-exporter/
│ ├── code/CLAUDE.md # NEW
│ └── desktop/SKILL.md
└── shared/
    ├── schema.sql # NEW: MySQL schema
    └── config/
        ├── db_config.yaml # NEW
        ├── crawl_config.yaml # NEW
        └── export_config.yaml # NEW
2. New Files Created
| File | Purpose |
|---|---|
| install.sh | Interactive installation script |
| shared/schema.sql | MySQL schema (9 tables, 2 views) |
| shared/config/db_config.yaml | Database connection config |
| shared/config/crawl_config.yaml | Crawler routing config |
| shared/config/export_config.yaml | Export format config |
| */code/CLAUDE.md | 6 Claude Code directives |
3. Crawler Configuration
Implemented intelligent crawler routing:
| Crawler | Condition | Use Case |
|---|---|---|
| Node.js (default) | ≤50 pages, static | Small documentation sites |
| Python aiohttp | ≤200 pages, SEO needed | Technical docs |
| Scrapy | >200 pages, multi-domain | Enterprise crawls |
| Firecrawl MCP | SPA, JS-rendered | Dynamic sites |
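The routing table above can be sketched as a small Python function, in the spirit of the select_crawler.py script listed under Next Steps. This is only an illustrative sketch: the `SiteProfile` fields, the returned backend names, and the order in which the conditions are checked (rendering needs first, then scale) are assumptions, since the table does not say how overlapping conditions are resolved.

```python
from dataclasses import dataclass


@dataclass
class SiteProfile:
    """Hypothetical description of a crawl target (not part of the repo)."""
    page_count: int
    js_rendered: bool = False   # SPA or otherwise JS-rendered
    needs_seo: bool = False     # SEO data required
    multi_domain: bool = False  # crawl spans several domains


def select_crawler(site: SiteProfile) -> str:
    """Pick a crawler backend following the routing table.

    Assumed precedence: JS rendering wins over page count, and the
    larger-scale backend wins when two rows could apply.
    """
    if site.js_rendered:
        return "firecrawl"   # SPA, JS-rendered
    if site.page_count > 200 or site.multi_domain:
        return "scrapy"      # >200 pages, multi-domain
    if site.page_count > 50 or site.needs_seo:
        return "aiohttp"     # <=200 pages, SEO needed
    return "nodejs"          # <=50 pages, static (default)
```

For example, `select_crawler(SiteProfile(page_count=120))` falls through to the aiohttp row, while any JS-rendered site is routed to Firecrawl regardless of size.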
4. Installation Script Features
./install.sh # Interactive installation
./install.sh --check # Verify status
./install.sh --reset # Reset (preserves data)
Handles:
- Config file deployment to ~/.config/reference-curator/
- Storage directory creation at ~/reference-library/
- MySQL credentials setup in ~/.envrc
- Database creation and schema application
- Skill symlink registration in ~/.claude/skills/
5. MySQL Schema
Tables (9):
- sources - Authoritative source registry
- documents - Crawled document storage
- distilled_content - Processed summaries
- review_logs - QA decision history
- topics - Topic taxonomy
- document_topics - Document-topic mapping
- export_jobs - Export task tracking
- crawl_schedule - Scheduled crawl jobs
- change_detection - Content change tracking
Views (2):
- v_pending_reviews - Documents awaiting review
- v_export_ready - Approved documents ready for export
6. Environment Setup
Files installed:
~/.config/reference-curator/
├── db_config.yaml
├── crawl_config.yaml
└── export_config.yaml
~/reference-library/
├── raw/
├── processed/
└── exports/
~/.claude/skills/
├── reference-discovery -> .../01-reference-discovery/desktop
├── web-crawler-orchestrator -> .../02-web-crawler-orchestrator/desktop
├── content-repository -> .../03-content-repository/desktop
├── content-distiller -> .../04-content-distiller/desktop
├── quality-reviewer -> .../05-quality-reviewer/desktop
└── markdown-exporter -> .../06-markdown-exporter/desktop
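The symlink layout above follows a simple rule: each numbered skill directory is registered under its name without the numeric prefix, pointing at its desktop/ subdirectory. A minimal Python sketch of that rule follows. The actual install.sh is a shell script and may work differently; the function names and the replace-existing-symlink behavior here are assumptions for illustration.

```python
import re
from pathlib import Path

# Matches skill directories such as "01-reference-discovery".
SKILL_DIR = re.compile(r"^\d{2}-(?P<name>.+)$")


def plan_skill_links(repo_root: Path, skills_home: Path) -> dict[Path, Path]:
    """Map each symlink path to the desktop/ directory it should point at."""
    links = {}
    for entry in sorted(repo_root.iterdir()):
        match = SKILL_DIR.match(entry.name)
        target = entry / "desktop"
        # Directories without a numeric prefix (e.g. shared/) are skipped.
        if match and target.is_dir():
            links[skills_home / match.group("name")] = target
    return links


def register_skills(repo_root: Path, skills_home: Path) -> dict[Path, Path]:
    """Create the symlinks, replacing symlinks left by an earlier install."""
    skills_home.mkdir(parents=True, exist_ok=True)
    links = plan_skill_links(repo_root, skills_home)
    for link, target in links.items():
        if link.is_symlink():
            link.unlink()
        # A real file or directory at this path is not touched and raises.
        link.symlink_to(target, target_is_directory=True)
    return links
```

Calling `register_skills(Path("90-reference-curator"), Path.home() / ".claude" / "skills")` would reproduce the six links shown above, assuming the repository layout from the Directory Restructuring section.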
Deleted
- mnt/user-data/outputs/ directory (moved to proper structure)
- Root-level SKILL.md (moved to 01-reference-discovery/desktop/)
Verification
$ ./install.sh --check
✓ Configuration files (3/3)
✓ Storage directories (3/3)
✓ MySQL database (11 tables)
✓ Skill registrations (6/6)
All components installed correctly.
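The file-system portion of this check can be approximated in Python, for instance as a starting point for the automation scripts mentioned under Next Steps. This is a sketch under assumptions, not the logic of install.sh: the function names are hypothetical, the expected paths are taken from the Environment Setup section, and the MySQL check is left out because it depends on credentials and a client library.

```python
from pathlib import Path

CONFIG_FILES = ("db_config.yaml", "crawl_config.yaml", "export_config.yaml")
STORAGE_DIRS = ("raw", "processed", "exports")
SKILLS = (
    "reference-discovery",
    "web-crawler-orchestrator",
    "content-repository",
    "content-distiller",
    "quality-reviewer",
    "markdown-exporter",
)


def check_install(home: Path) -> dict[str, tuple[int, int]]:
    """Return (found, expected) counts for each file-system component."""
    config_dir = home / ".config" / "reference-curator"
    library = home / "reference-library"
    skills = home / ".claude" / "skills"
    return {
        "Configuration files": (
            sum((config_dir / name).is_file() for name in CONFIG_FILES),
            len(CONFIG_FILES),
        ),
        "Storage directories": (
            sum((library / name).is_dir() for name in STORAGE_DIRS),
            len(STORAGE_DIRS),
        ),
        # is_dir() follows symlinks, so a dangling link counts as missing.
        "Skill registrations": (
            sum((skills / name).is_dir() for name in SKILLS),
            len(SKILLS),
        ),
    }


def format_report(status: dict[str, tuple[int, int]]) -> str:
    """Render the counts in the same style as the --check output."""
    lines = []
    for label, (found, expected) in status.items():
        mark = "✓" if found == expected else "✗"
        lines.append(f"{mark} {label} ({found}/{expected})")
    return "\n".join(lines)
```

Running `print(format_report(check_install(Path.home())))` prints one line per component, with ✗ marking any component whose count falls short.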
Next Steps
- Add Python scripts to */code/scripts/ folders for automation
- Implement select_crawler.py for intelligent routing logic
- Add unit tests for database operations
- Create example workflows in */desktop/examples/