| description |
|---|
| Crawl budget optimization and log analysis |
SEO Crawl Budget
Server access log analysis, bot profiling, and crawl budget waste identification.
Triggers
- "crawl budget", "log analysis", "크롤 예산"
Capabilities
- Log Parsing - Parse Nginx, Apache, CloudFront access logs (streaming for >1GB files)
- Bot Identification - Googlebot, Yeti/Naver, Bingbot, Daumoa/Kakao, and others by User-Agent
- Per-Bot Profiling - Crawl frequency, depth distribution, status codes, crawl patterns
- Waste Detection - Parameter URLs, low-value pages, redirect chains, soft 404s, duplicate URLs
- Orphan Pages - Pages listed in the sitemap but never crawled, and pages crawled but missing from the sitemap
- Optimization Plan - robots.txt suggestions, URL parameter handling, noindex recommendations
Scripts
# Parse Nginx access log
python custom-skills/32-seo-crawl-budget/code/scripts/log_parser.py \
--log-file /var/log/nginx/access.log --json
# Parse Apache log, filter by Googlebot
python custom-skills/32-seo-crawl-budget/code/scripts/log_parser.py \
--log-file /var/log/apache2/access.log --format apache --bot googlebot --json
# Parse gzipped log in streaming mode
python custom-skills/32-seo-crawl-budget/code/scripts/log_parser.py \
--log-file access.log.gz --streaming --json
# Full crawl budget analysis with sitemap comparison
python custom-skills/32-seo-crawl-budget/code/scripts/crawl_budget_analyzer.py \
--log-file access.log --sitemap https://example.com/sitemap.xml --json
# Waste identification only
python custom-skills/32-seo-crawl-budget/code/scripts/crawl_budget_analyzer.py \
--log-file access.log --scope waste --json
# Orphan page detection
python custom-skills/32-seo-crawl-budget/code/scripts/crawl_budget_analyzer.py \
--log-file access.log --sitemap https://example.com/sitemap.xml --scope orphans --json
Output
- Bot request counts, status code distribution, top crawled URLs per bot
- Crawl waste breakdown (parameter URLs, redirects, soft 404s, duplicates)
- Orphan page lists (in sitemap not crawled, crawled not in sitemap)
- Efficiency score (0-100) with optimization recommendations
- Saved to Notion SEO Audit Log (Category: Crawl Budget, Audit ID: CRAWL-YYYYMMDD-NNN)
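The two orphan-page lists in the output above reduce to a set difference between sitemap URLs and the URL paths seen in the access log. A minimal sketch (the real crawl_budget_analyzer.py additionally fetches the sitemap over HTTP and normalizes URLs before comparing):

```python
def find_orphans(sitemap_urls: set[str], crawled_paths: set[str]) -> dict[str, set[str]]:
    """Split orphans into the two categories reported in the output:
    sitemap pages bots never requested, and requested pages missing
    from the sitemap."""
    return {
        "in_sitemap_not_crawled": sitemap_urls - crawled_paths,
        "crawled_not_in_sitemap": crawled_paths - sitemap_urls,
    }
```

Pages in the first set are candidates for internal-linking fixes; pages in the second set often reveal parameter URLs or legacy paths consuming crawl budget outside the intended site structure.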