our-claude-skills/custom-skills/10-seo-technical-audit/desktop/SKILL.md
Andrew Yim d1cd1298a8 feat(reference-curator): Add pipeline orchestrator and refactor skill format
2026-01-29 01:01:02 +07:00


SEO Technical Audit

Purpose

Analyze crawlability fundamentals: robots.txt rules, XML sitemap structure, and URL accessibility. Identify issues blocking search engine crawlers.

Core Capabilities

  1. Robots.txt Analysis - Parse rules, check blocked resources
  2. Sitemap Validation - Verify XML structure, URL limits, dates
  3. URL Accessibility - Check HTTP status, redirects, broken links

MCP Tool Usage

Firecrawl for Page Data

mcp__firecrawl__scrape: Fetch robots.txt and sitemap content
mcp__firecrawl__crawl: Check the accessibility of multiple URLs

Perplexity for Best Practices

mcp__perplexity__search: Research current SEO recommendations

Workflow

1. Robots.txt Check

  1. Fetch [domain]/robots.txt using Firecrawl
  2. Parse User-agent rules and Disallow patterns
  3. Identify blocked resources (CSS, JS, images)
  4. Check for Sitemap declarations
  5. Report critical issues
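
The parsing steps above can be sketched in Python with the standard library's `urllib.robotparser`. This is a minimal illustration, not the skill's actual implementation: it assumes the robots.txt body has already been fetched (for example via Firecrawl), and the function name `audit_robots`, the probe paths, and the `example.com` domain are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def audit_robots(robots_text: str, base_url: str, probe_paths: list[str],
                 user_agent: str = "Googlebot") -> dict:
    """Parse robots.txt text and report blocked probe paths and Sitemap lines."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # A path counts as blocked when the given user agent may not fetch it.
    blocked = [p for p in probe_paths
               if not parser.can_fetch(user_agent, base_url + p)]
    return {
        # site_maps() returns None when no Sitemap: line is present.
        "sitemaps": parser.site_maps() or [],
        "blocked": blocked,
    }


# Hypothetical robots.txt content used only for illustration.
SAMPLE = """\
User-agent: *
Disallow: /assets/js/
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""

report = audit_robots(
    SAMPLE,
    "https://example.com",
    ["/assets/js/app.js", "/assets/css/site.css", "/admin/login"],
)
print(report)
```

Note that `urllib.robotparser` matches rules by simple path prefix and does not implement the `*` and `$` wildcard extensions that major search engines support, so a real audit would need additional pattern handling for such rules.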

2. Sitemap Validation

  1. Locate sitemap (from robots.txt or /sitemap.xml)
  2. Validate XML syntax
  3. Check URL count (max 50,000 URLs per sitemap file)
  4. Verify lastmod dates use W3C Datetime format (e.g. YYYY-MM-DD)
  5. For sitemap index: parse child sitemaps
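
As a rough sketch of the validation steps above, the following Python uses `xml.etree.ElementTree` to check syntax, distinguish a sitemap index from a URL set, count entries, and flag `lastmod` values that do not look like W3C Datetime. The function name `validate_sitemap`, the result keys, and the sample XML are assumptions made for illustration; the regular expression is a simplified format check and does not validate calendar ranges.

```python
import re
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS = 50_000
# Simplified W3C Datetime check: YYYY, YYYY-MM, YYYY-MM-DD,
# or a full date-time with a timezone designator.
W3C_DATETIME = re.compile(
    r"^\d{4}(-\d{2}(-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?)?)?$"
)


def validate_sitemap(xml_text: str) -> dict:
    """Return syntax status, sitemap kind, collected <loc> values, and issues."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return {"syntax": "Errors", "kind": None, "locs": [],
                "issues": [f"XML parse error: {exc}"]}

    if root.tag == NS + "sitemapindex":
        kind, entry_tag = "index", NS + "sitemap"
    elif root.tag == NS + "urlset":
        kind, entry_tag = "urlset", NS + "url"
    else:
        return {"syntax": "Valid", "kind": None, "locs": [],
                "issues": [f"unexpected root element: {root.tag}"]}

    locs, issues = [], []
    for entry in root.findall(entry_tag):
        loc = (entry.findtext(NS + "loc") or "").strip()
        if loc:
            locs.append(loc)
        else:
            issues.append("entry without <loc>")
        lastmod = entry.findtext(NS + "lastmod")
        if lastmod is not None and not W3C_DATETIME.match(lastmod.strip()):
            issues.append(f"bad lastmod '{lastmod.strip()}' for {loc}")

    if kind == "urlset" and len(locs) > MAX_URLS:
        issues.append(f"{len(locs)} URLs exceeds the {MAX_URLS} limit")
    return {"syntax": "Valid", "kind": kind, "locs": locs, "issues": issues}


# Hypothetical sitemap used only for illustration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2026-01-29</lastmod></url>
  <url><loc>https://example.com/about</loc><lastmod>29/01/2026</lastmod></url>
</urlset>"""

result = validate_sitemap(SAMPLE)
print(result["kind"], len(result["locs"]), result["issues"])
```

When `kind` is `"index"`, the returned `locs` are the child sitemap URLs, each of which would be fetched and passed through the same function.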

3. URL Accessibility Sampling

  1. Extract URLs from sitemap
  2. Sample 50-100 URLs for large sites
  3. Check HTTP status codes
  4. Identify redirects and broken links
  5. Report 4xx/5xx errors
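
One possible shape for the sampling and status tally above is sketched below. The HTTP check itself is passed in as a callable so the sketch stays independent of any particular client; in practice it might be backed by Firecrawl or an HTTP library. The names `sample_urls`, `bucket`, and `summarize`, and the stubbed status codes, are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, Optional


def sample_urls(urls: list[str], max_sample: int = 100,
                seed: Optional[int] = None) -> list[str]:
    """Return all URLs for small sites, or a random sample for large ones."""
    if len(urls) <= max_sample:
        return list(urls)
    return random.Random(seed).sample(urls, max_sample)


def bucket(status: int) -> str:
    """Map an HTTP status code to the report's status groups."""
    if 200 <= status < 300:
        return "2xx"
    if 300 <= status < 400:
        return "3xx"
    if 400 <= status < 600:
        return "4xx/5xx"
    return "other"


def summarize(urls: list[str], fetch_status: Callable[[str], int]) -> dict:
    """Check each URL with the supplied callable and tally the results."""
    counts: Counter = Counter()
    errors = []
    for url in urls:
        status = fetch_status(url)
        group = bucket(status)
        counts[group] += 1
        if group == "4xx/5xx":
            errors.append((url, status))
    return {"checked": len(urls), "counts": dict(counts), "errors": errors}


# Stubbed status codes stand in for real HTTP checks in this illustration.
fake_statuses = {
    "https://example.com/": 200,
    "https://example.com/old": 301,
    "https://example.com/gone": 404,
}
summary = summarize(sample_urls(list(fake_statuses)), fake_statuses.__getitem__)
print(summary)
```

Passing a fixed `seed` makes the sample reproducible, which can help when re-running an audit to confirm that reported errors have been fixed.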

Output Format

## Technical SEO Audit: [domain]

### Robots.txt Analysis
- Status: [Valid/Invalid/Missing]
- Sitemap declared: [Yes/No]
- Critical blocks: [List]

### Sitemap Validation
- URLs found: [count]
- Syntax: [Valid/Errors]
- Issues: [List]

### URL Accessibility (sampled)
- Checked: [count] URLs
- Success (2xx): [count]
- Redirects (3xx): [count]
- Errors (4xx/5xx): [count]

### Recommendations
1. [Priority fixes]

Common Issues

| Issue | Impact | Fix |
|-------|--------|-----|
| No sitemap in robots.txt | Medium | Add Sitemap: directive |
| Blocking CSS/JS | High | Allow Googlebot access |
| 404s in sitemap | High | Remove or fix URLs |
| Missing lastmod | Low | Add dates for freshness signals |

Limitations

  • Cannot access password-protected sitemaps
  • Large sitemaps (10,000+ URLs) require sampling
  • Does not check render-blocking issues (use Core Web Vitals skill)

Notion Output (Required)

All audit reports MUST be saved to the OurDigital SEO Audit Log database in Notion:

  • Database ID: 2c8581e5-8a1e-8035-880b-e38cefc2f3ef
  • Properties: Issue (title), Site (url), Category, Priority, Found Date, Audit ID
  • Language: Korean with English technical terms
  • Audit ID Format: [TYPE]-YYYYMMDD-NNN
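
The Audit ID format can be illustrated with a small helper. This sketch assumes the square brackets in `[TYPE]` mark a placeholder rather than literal characters, that `NNN` is a zero-padded sequence number from 001 to 999, and that `TECH` is a plausible type code; none of these details are confirmed by the skill definition, and `make_audit_id` is a hypothetical name.

```python
from datetime import date


def make_audit_id(audit_type: str, found: date, sequence: int) -> str:
    """Build an ID shaped like TYPE-YYYYMMDD-NNN (assumed interpretation)."""
    if not 1 <= sequence <= 999:
        raise ValueError("sequence must fit in three digits (1-999)")
    return f"{audit_type.upper()}-{found:%Y%m%d}-{sequence:03d}"


# "tech" is an assumed type code used only for illustration.
audit_id = make_audit_id("tech", date(2026, 1, 29), 7)
print(audit_id)  # TECH-20260129-007
```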