---
description: Full reference curation pipeline - discovery, crawl, store, distill, review, export with configurable depth
argument-hint: "<topic|urls|manifest> [--depth light|standard|deep|full] [--output ~/Documents/reference-library/] [--max-sources 100] [--auto-approve] [--export-format project_files]"
allowed-tools: WebSearch, WebFetch, Read, Write, Bash, Grep, Glob, Task
---
# Reference Curator Pipeline
Full-stack orchestration of the 6-skill reference curation workflow.
## Input Modes

| Mode | Input Example | Pipeline Start |
|---|---|---|
| Topic | "Claude system prompts" | `reference-discovery` |
| URLs | https://docs.anthropic.com/... | `web-crawler` (skip discovery) |
| Manifest | ./manifest.json | `web-crawler` (resume) |
## Arguments

- `<input>`: Required. Topic string, URL(s), or manifest file path
- `--depth`: Crawl depth level (default: `standard`). See Depth Levels below
- `--output`: Output directory path (default: `~/Documents/reference-library/`)
- `--max-sources`: Max sources to discover (default: 100)
- `--max-pages`: Max pages per source to crawl (default varies by depth)
- `--auto-approve`: Auto-approve scores above threshold
- `--threshold`: Approval threshold (default: 0.85)
- `--max-iterations`: Max QA loop iterations per document (default: 3)
- `--export-format`: `project_files`, `fine_tuning`, `jsonl` (default: `project_files`)
- `--include-subdomains`: Include subdomains in site mapping (default: false)
- `--follow-external`: Follow external links found in content (default: false)
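As a rough illustration of how these flags and defaults fit together, here is a minimal sketch using `argparse`. The parser itself is hypothetical (the skill does not necessarily parse arguments this way); the flag names and defaults are taken from the list above.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser mirroring the documented flags and defaults.
    p = argparse.ArgumentParser(prog="reference-curator")
    p.add_argument("input", help="Topic string, URL(s), or manifest file path")
    p.add_argument("--depth", choices=["light", "standard", "deep", "full"],
                   default="standard")
    p.add_argument("--output", default="~/Documents/reference-library/")
    p.add_argument("--max-sources", type=int, default=100)
    p.add_argument("--max-pages", type=int, default=None)  # default varies by depth
    p.add_argument("--auto-approve", action="store_true")
    p.add_argument("--threshold", type=float, default=0.85)
    p.add_argument("--max-iterations", type=int, default=3)
    p.add_argument("--export-format",
                   choices=["project_files", "fine_tuning", "jsonl"],
                   default="project_files")
    p.add_argument("--include-subdomains", action="store_true")
    p.add_argument("--follow-external", action="store_true")
    return p

args = build_parser().parse_args(["MCP servers", "--depth", "deep"])
```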
## Output Directory

**IMPORTANT:** Never store output in any `claude-skills`, `.claude/`, or skill-related directory.

The `--output` argument sets the base path for all pipeline output. If omitted, the default is `~/Documents/reference-library/`.
### Directory Structure

```
{output}/
├── {topic-slug}/                  # One folder per pipeline run
│   ├── README.md                  # Index with table of contents
│   ├── 00-page-name.md            # Individual page files
│   ├── 01-page-name.md
│   ├── ...
│   ├── {topic-slug}-complete.md   # Combined bundle (all pages)
│   └── manifest.json              # Crawl metadata
├── pipeline_state/                # Resume state (auto-managed)
│   └── run_XXX/state.json
└── exports/                       # Fine-tuning / JSONL exports
```
### Resolution Rules

- If `--output` is provided, use that path exactly
- If not provided, use `~/Documents/reference-library/`
- Before writing any files, check if the output directory exists
- If the directory does NOT exist, ask the user for permission before creating it:
  - Show the full resolved path that will be created
  - Wait for explicit user approval
  - Only then run `mkdir -p <path>`
- The topic slug is derived from the input (URL domain+path or topic string)
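The rules above can be sketched in Python. This is a hedged illustration, not the skill's actual code: `topic_slug` and `resolve_output` are hypothetical names, and the slug algorithm (strip scheme, replace non-alphanumerics with hyphens, lowercase) is one plausible reading of "URL domain+path or topic string".

```python
import os
import re

DEFAULT_OUTPUT = "~/Documents/reference-library/"

def topic_slug(source: str) -> str:
    # Derive a slug from a URL (domain + path) or a plain topic string.
    s = re.sub(r"^https?://", "", source)
    s = re.sub(r"[^a-zA-Z0-9]+", "-", s)
    return s.strip("-").lower()

def resolve_output(output_arg, source: str) -> str:
    # Use --output exactly if given, else fall back to the default base.
    base = os.path.expanduser(output_arg or DEFAULT_OUTPUT)
    return os.path.join(base, topic_slug(source))
```

A confirmation step before `os.makedirs` (mirroring the "ask before `mkdir -p`" rule) would wrap this, but how the assistant asks the user is outside this sketch.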
### Examples

```bash
# Uses default: ~/Documents/reference-library/glossary-for-wordpress/
/reference-curator https://docs.codeat.co/glossary/

# Custom path: /tmp/research/mcp-docs/
/reference-curator "MCP servers" --output /tmp/research

# Explicit home subfolder: ~/Projects/client-docs/api-reference/
/reference-curator https://api.example.com/docs --output ~/Projects/client-docs
```
## Depth Levels

### --depth light

Fast scan for quick reference. Main content only, minimal crawling.

| Parameter | Value |
|---|---|
| `onlyMainContent` | `true` |
| `formats` | `["markdown"]` |
| `maxDiscoveryDepth` | 1 |
| `max-pages` default | 20 |
| Map limit | 50 |
| `deduplicateSimilarURLs` | `true` |
| Follow sub-links | No |
| JS rendering wait | None |

Best for: Quick lookups, single-page references, API docs you already know.
### --depth standard (default)

Balanced crawl. Main content with links for cross-referencing.

| Parameter | Value |
|---|---|
| `onlyMainContent` | `true` |
| `formats` | `["markdown", "links"]` |
| `maxDiscoveryDepth` | 2 |
| `max-pages` default | 50 |
| Map limit | 100 |
| `deduplicateSimilarURLs` | `true` |
| Follow sub-links | Same-domain only |
| JS rendering wait | None |

Best for: Documentation sites, plugin guides, knowledge bases.
### --depth deep

Thorough crawl. Full page content including sidebars, nav, and related pages.

| Parameter | Value |
|---|---|
| `onlyMainContent` | `false` |
| `formats` | `["markdown", "links", "html"]` |
| `maxDiscoveryDepth` | 3 |
| `max-pages` default | 150 |
| Map limit | 300 |
| `deduplicateSimilarURLs` | `true` |
| Follow sub-links | Same-domain + linked resources |
| JS rendering wait | `waitFor: 3000` |
| `includeSubdomains` | `true` |

Best for: Complete product documentation, research material, sites with sidebars/code samples hidden behind tabs.
### --depth full

Exhaustive crawl. Everything captured including raw HTML, screenshots, and external references.

| Parameter | Value |
|---|---|
| `onlyMainContent` | `false` |
| `formats` | `["markdown", "html", "rawHtml", "links"]` |
| `maxDiscoveryDepth` | 5 |
| `max-pages` default | 500 |
| Map limit | 1000 |
| `deduplicateSimilarURLs` | `false` |
| Follow sub-links | All (same-domain + external references) |
| JS rendering wait | `waitFor: 5000` |
| `includeSubdomains` | `true` |
| Screenshots | Capture for JS-heavy pages |

Best for: Archival, migration references, preserving sites, training data collection.
### Depth Comparison

```
light    ████░░░░░░░░░░░░  Speed: fastest   Pages: ~20   Content: main only
standard ████████░░░░░░░░  Speed: fast      Pages: ~50   Content: main + links
deep     ████████████░░░░  Speed: moderate  Pages: ~150  Content: full page + HTML
full     ████████████████  Speed: slow      Pages: ~500  Content: everything + raw
```
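The four depth tables can be collapsed into a single lookup. The values below are transcribed from the tables; the dict name and key spellings (`max_pages`, `map_limit`) are illustrative, not an API the skill exposes.

```python
# Illustrative mapping of --depth levels to Firecrawl-style parameters,
# transcribed from the depth tables above.
DEPTH_PRESETS = {
    "light": {
        "onlyMainContent": True,  "formats": ["markdown"],
        "maxDiscoveryDepth": 1,   "max_pages": 20,   "map_limit": 50,
        "deduplicateSimilarURLs": True,  "waitFor": None,
        "includeSubdomains": False,
    },
    "standard": {
        "onlyMainContent": True,  "formats": ["markdown", "links"],
        "maxDiscoveryDepth": 2,   "max_pages": 50,   "map_limit": 100,
        "deduplicateSimilarURLs": True,  "waitFor": None,
        "includeSubdomains": False,
    },
    "deep": {
        "onlyMainContent": False, "formats": ["markdown", "links", "html"],
        "maxDiscoveryDepth": 3,   "max_pages": 150,  "map_limit": 300,
        "deduplicateSimilarURLs": True,  "waitFor": 3000,
        "includeSubdomains": True,
    },
    "full": {
        "onlyMainContent": False, "formats": ["markdown", "html", "rawHtml", "links"],
        "maxDiscoveryDepth": 5,   "max_pages": 500,  "map_limit": 1000,
        "deduplicateSimilarURLs": False, "waitFor": 5000,
        "includeSubdomains": True,
    },
}
```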
## Pipeline Stages

```
1. reference-discovery    (topic mode only)
2. web-crawler            ← depth controls this stage  <--+
3. content-repository                                     |
4. content-distiller  <--------+                          |
5. quality-reviewer            |                          |
   +-- APPROVE -> export       |                          |
   +-- REFACTOR ---------------+                          |
   +-- DEEP_RESEARCH -> crawler --------------------------+
   +-- REJECT -> archive
6. markdown-exporter
```
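The stage 4-5 loop above can be sketched as a small function. The verdict names, `--threshold` default, and `--max-iterations` default come from this document; `distill`, `review`, and the outcome strings are assumed stand-ins for the real skill interfaces.

```python
def qa_loop(doc, distill, review, threshold=0.85, max_iterations=3):
    """Sketch of the distill -> review loop (stages 4-5).

    `distill(doc)` returns a draft; `review(draft)` returns
    (verdict, score) where verdict is one of APPROVE / REFACTOR /
    DEEP_RESEARCH / REJECT. Both callables are assumptions here.
    """
    draft = None
    for _ in range(max_iterations):
        draft = distill(doc)
        verdict, score = review(draft)
        if verdict == "APPROVE" and score >= threshold:
            return ("export", draft)            # -> markdown-exporter
        if verdict == "DEEP_RESEARCH":
            return ("crawl_more", draft)        # back to stage 2
        if verdict == "REJECT":
            return ("archive", draft)
        # REFACTOR: loop back to the distiller and try again
    return ("archive", draft)                   # gave up after max_iterations
```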
## Crawl Execution by Depth

When executing the crawl (Stage 2), apply the depth settings to the Firecrawl tools:

### Site Mapping (firecrawl_map)

```yaml
firecrawl_map:
  url: <target>
  limit: {depth.map_limit}
  includeSubdomains: {depth.includeSubdomains}
```

### Page Scraping (firecrawl_scrape)

```yaml
firecrawl_scrape:
  url: <page>
  formats: {depth.formats}
  onlyMainContent: {depth.onlyMainContent}
  waitFor: {depth.waitFor}            # deep/full only
  excludeTags: ["nav", "footer"]      # light/standard only
```

### Batch Crawling (firecrawl_crawl) - for deep/full only

```yaml
firecrawl_crawl:
  url: <target>
  maxDiscoveryDepth: {depth.maxDiscoveryDepth}
  limit: {depth.max_pages}
  deduplicateSimilarURLs: {depth.dedup}
  scrapeOptions:
    formats: {depth.formats}
    onlyMainContent: {depth.onlyMainContent}
    waitFor: {depth.waitFor}
```
## Example Usage

```bash
# Quick scan of a single doc page
/reference-curator https://docs.example.com/api --depth light

# Standard documentation crawl (default)
/reference-curator "Glossary for WordPress" --max-sources 5

# Deep crawl capturing full page content and HTML
/reference-curator https://docs.codeat.co/glossary/ --depth deep

# Full archival crawl with all formats
/reference-curator https://docs.anthropic.com --depth full --max-pages 300

# Deep crawl with auto-approval and fine-tuning export
/reference-curator "MCP servers" --depth deep --auto-approve --export-format fine_tuning
```
## Related Sub-commands

Individual stages available at: `custom-skills/90-reference-curator/commands/`

`/reference-discovery`, `/web-crawler`, `/content-repository`, `/content-distiller`, `/quality-reviewer`, `/markdown-exporter`
## Source

Full details: `custom-skills/90-reference-curator/README.md`