12 new skills: Keyword Strategy, SERP Analysis, Position Tracking, Link Building, Content Strategy, E-Commerce SEO, KPI Framework, International SEO, AI Visibility, Knowledge Graph, Competitor Intel, and Crawl Budget. ~20K lines of Python across 25 domain scripts. Updated skill 11 pipeline table and repo CLAUDE.md. Enhanced skill 18 local SEO workflow from jamie.clinic audit. Note: Skill 26 hreflang_validator.py pending (content filter block). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
40 lines
1.5 KiB
Markdown
---
name: seo-crawl-budget
description: |
  Crawl budget optimization and log analysis. Triggers: crawl budget, log analysis, bot crawling, Googlebot, crawl waste, orphan pages, crawl efficiency.
---
# Crawl Budget Optimizer
Analyze server access logs to identify crawl budget waste and generate optimization recommendations for search engine bots.
## Capabilities
1. **Log Analysis**: Parse Nginx/Apache/CloudFront access logs to extract bot crawl data
2. **Bot Profiling**: Per-bot behavior analysis (Googlebot, Bingbot, Naver's Yeti, Daum's Daumoa)
3. **Waste Detection**: Parameter URLs, redirect chains, soft 404s, duplicate URL variants
4. **Orphan Pages**: Pages listed in the sitemap but never crawled, and crawled pages missing from the sitemap
5. **Recommendations**: Prioritized action items for crawl budget optimization
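
As a rough illustration of capabilities 1 to 3, the sketch below parses one line in the common Nginx/Apache "combined" log format, labels the bot from its user-agent string, and flags a few simple kinds of crawl waste. The function names, the regex, the user-agent tokens, and the waste rules are all assumptions made for illustration; they are not taken from `log_parser.py` or `crawl_budget_analyzer.py`. Soft 404s and duplicate variants are left out because they cannot be judged from a single log line.

```python
import re
from urllib.parse import urlsplit

# Combined log format (assumed):
# ip - user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Lowercase substrings assumed to identify each bot's user agent.
BOT_TOKENS = {
    "Googlebot": "googlebot",
    "Bingbot": "bingbot",
    "Yeti": "yeti",
    "Daumoa": "daumoa",
}


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    return entry


def identify_bot(agent):
    """Map a user-agent string to a bot name, or None for other traffic."""
    lowered = agent.lower()
    for name, token in BOT_TOKENS.items():
        if token in lowered:
            return name
    return None


def waste_flags(entry):
    """Return hypothetical waste labels for a parsed log entry."""
    flags = []
    if urlsplit(entry["path"]).query:
        flags.append("parameter_url")
    if entry["status"] in (301, 302, 307, 308):
        flags.append("redirect")
    if entry["status"] == 404:
        flags.append("not_found")
    return flags


sample = (
    '66.249.66.1 - - [10/Jan/2025:12:00:00 +0900] '
    '"GET /products?sort=price HTTP/1.1" 301 0 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
)
entry = parse_line(sample)
print(identify_bot(entry["agent"]), waste_flags(entry))
# prints: Googlebot ['parameter_url', 'redirect']
```

User-agent strings can be spoofed, so a production analysis would likely also verify bot identity by IP range or reverse DNS before trusting these labels.
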
## Workflow
1. Parse server access log with `log_parser.py`
2. Run crawl budget analysis with `crawl_budget_analyzer.py`
3. Compare with sitemap URLs for orphan page detection
4. Optionally compare with Ahrefs page history data
5. Generate Korean-language report with recommendations
6. Save to Notion SEO Audit Log database
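
Step 3 of the workflow above comes down to two set differences between sitemap URLs and crawled URLs. A minimal sketch follows, assuming both inputs are already available as lists of URL strings. The `normalize` and `find_orphans` helpers are hypothetical names, and the normalization rule (compare by path only, ignoring host, query string, and trailing slash) is an assumption that a real analysis may need to change.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to its path without a trailing slash (query string dropped)."""
    path = urlsplit(url).path.rstrip("/")
    return path or "/"


def find_orphans(sitemap_urls, crawled_urls):
    """Compare sitemap URLs with bot-crawled URLs using set differences."""
    in_sitemap = {normalize(u) for u in sitemap_urls}
    crawled = {normalize(u) for u in crawled_urls}
    return {
        "in_sitemap_not_crawled": sorted(in_sitemap - crawled),
        "crawled_not_in_sitemap": sorted(crawled - in_sitemap),
    }


sitemap = ["https://example.com/", "https://example.com/a/", "https://example.com/b"]
crawled = ["/a", "/c?x=1"]
print(find_orphans(sitemap, crawled))
# prints: {'in_sitemap_not_crawled': ['/', '/b'], 'crawled_not_in_sitemap': ['/c']}
```
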
## Tools Used
- **Ahrefs**: `site-explorer-pages-history` for indexed page comparison
- **Notion**: Save audit report to database `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
- **WebSearch**: Current best practices and bot documentation
## Output
All reports are saved to the OurDigital SEO Audit Log with:
- Category: Crawl Budget
- Audit ID format: CRAWL-YYYYMMDD-NNN
- Content in Korean with technical English terms preserved
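
The `CRAWL-YYYYMMDD-NNN` format can be produced and checked with a few lines of Python. This sketch assumes `NNN` is a zero-padded sequence number from 1 to 999; the source does not say how the number is assigned, and `make_audit_id` is a hypothetical helper, not part of the skill's scripts.

```python
import re
from datetime import date

# Matches IDs such as CRAWL-20250110-007.
AUDIT_ID_PATTERN = re.compile(r"^CRAWL-\d{8}-\d{3}$")


def make_audit_id(audit_date, sequence):
    """Format an audit ID; `sequence` is assumed to be a counter from 1 to 999."""
    if not 1 <= sequence <= 999:
        raise ValueError("sequence must be between 1 and 999")
    return f"CRAWL-{audit_date:%Y%m%d}-{sequence:03d}"


audit_id = make_audit_id(date(2025, 1, 10), 7)
print(audit_id)  # prints: CRAWL-20250110-007
```
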