our-claude-skills/custom-skills/10-seo-technical-audit/code/CLAUDE.md
Andrew Yim f0a1453918 feat(skills): Add mandatory Notion output requirement to all audit skills
All SEO (10-18) and GTM (20-21) skills now require saving reports to:
- Database: OurDigital SEO Audit Log (2c8581e5-8a1e-8035-880b-e38cefc2f3ef)
- Format: Korean content with English technical terms
- Audit ID: [TYPE]-YYYYMMDD-NNN

Updated files:
- 9 SEO skills (code/CLAUDE.md + desktop/SKILL.md)
- 2 GTM skills (code/CLAUDE.md + desktop/SKILL.md)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 02:16:26 +09:00

CLAUDE.md

Overview

Technical SEO auditor for crawlability fundamentals: robots.txt validation, XML sitemap analysis, and URL accessibility checking.

Quick Start

# Install dependencies
pip install -r scripts/requirements.txt

# Robots.txt analysis
python scripts/robots_checker.py --url https://example.com

# Sitemap validation
python scripts/sitemap_validator.py --url https://example.com/sitemap.xml

# Async URL crawl (check accessibility of sitemap URLs)
python scripts/sitemap_crawler.py --sitemap https://example.com/sitemap.xml

Scripts

| Script | Purpose | Key Output |
| --- | --- | --- |
| robots_checker.py | Parse and validate robots.txt | User-agent rules, disallow patterns, sitemap declarations |
| sitemap_validator.py | Validate XML sitemap structure | URL count, lastmod dates, size limits, syntax errors |
| sitemap_crawler.py | Async check URL accessibility | HTTP status codes, response times, broken links |
| base_client.py | Shared utilities | RateLimiter, ConfigManager, BaseAsyncClient |

Robots.txt Checker

# Basic analysis
python scripts/robots_checker.py --url https://example.com

# Test specific URL against rules
python scripts/robots_checker.py --url https://example.com --test-url /admin/

# Output JSON
python scripts/robots_checker.py --url https://example.com --json

Checks performed:

  • Syntax validation
  • User-agent rule parsing
  • Disallow/Allow pattern analysis
  • Sitemap declarations
  • Critical resource access (CSS/JS/images)
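The checks listed above can be approximated with the standard library alone. The following is a minimal sketch, not the actual logic of robots_checker.py: the `analyze_robots` helper, the sample robots.txt content, and the resource paths are all hypothetical, and `urllib.robotparser` applies the first matching rule, which may differ from how a given search engine resolves Allow/Disallow conflicts.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /assets/js/
Sitemap: https://example.com/sitemap.xml
"""


def analyze_robots(text: str, base_url: str, paths: list[str],
                   resources: list[str], user_agent: str = "Googlebot") -> dict:
    """Parse robots.txt text and report path access and sitemap declarations."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())

    def allowed(path: str) -> bool:
        return parser.can_fetch(user_agent, base_url + path)

    return {
        # site_maps() returns None when no Sitemap line is declared.
        "sitemaps": parser.site_maps() or [],
        "allowed": {path: allowed(path) for path in paths},
        # CSS/JS/image paths that crawlers would not be able to fetch.
        "blocked_resources": [r for r in resources if not allowed(r)],
    }


report = analyze_robots(
    ROBOTS_TXT,
    "https://example.com",
    paths=["/", "/admin/"],
    resources=["/assets/js/app.js", "/assets/css/site.css"],
)
print(report)
```

An empty `sitemaps` list would correspond to the "Missing sitemap declaration" issue, and a non-empty `blocked_resources` list to "Blocking CSS/JS resources".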

Sitemap Validator

# Validate sitemap
python scripts/sitemap_validator.py --url https://example.com/sitemap.xml

# Include sitemap index parsing
python scripts/sitemap_validator.py --url https://example.com/sitemap_index.xml --follow-index

Validation rules:

  • XML syntax correctness
  • URL count limit (50,000 max per sitemap)
  • File size limit (50 MB max uncompressed)
  • Lastmod date format validation
  • Sitemap index structure
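As an illustration of the validation rules above, here is a minimal sketch using `xml.etree.ElementTree`. It is an assumption about how such checks could look, not the implementation of sitemap_validator.py. The `validate_sitemap` and `valid_lastmod` names and the sample sitemap are hypothetical, the lastmod check is simplified (it rejects the year-only and year-month forms that the W3C datetime format also permits), and sitemap index handling is omitted.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50 MB, uncompressed


def valid_lastmod(value: str) -> bool:
    """Accept YYYY-MM-DD or a full ISO 8601 timestamp (simplified check)."""
    try:
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return True
    except ValueError:
        return False


def validate_sitemap(xml_bytes: bytes) -> list[dict]:
    """Return a list of issues found in a single (non-index) sitemap."""
    issues = []
    if len(xml_bytes) > MAX_BYTES:
        issues.append({"type": "error", "message": "File exceeds 50 MB", "location": "file"})
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        issues.append({"type": "error", "message": f"XML syntax error: {exc}", "location": "file"})
        return issues

    urls = root.findall(f"{SITEMAP_NS}url")
    if len(urls) > MAX_URLS:
        issues.append({"type": "error", "message": "More than 50,000 URLs", "location": "file"})

    for url in urls:
        loc = (url.findtext(f"{SITEMAP_NS}loc") or "").strip()
        lastmod = url.findtext(f"{SITEMAP_NS}lastmod")
        if lastmod is None:
            issues.append({"type": "info", "message": "Missing lastmod", "location": loc})
        elif not valid_lastmod(lastmod.strip()):
            issues.append({"type": "warning", "message": "Invalid lastmod format", "location": loc})
    return issues


# Hypothetical sample sitemap with one bad date and one missing lastmod.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2025-12-22</lastmod></url>
  <url><loc>https://example.com/a</loc><lastmod>22/12/2025</lastmod></url>
  <url><loc>https://example.com/b</loc></url>
</urlset>"""

issues = validate_sitemap(SAMPLE)
for issue in issues:
    print(issue)
```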

Sitemap Crawler

# Crawl all URLs in sitemap
python scripts/sitemap_crawler.py --sitemap https://example.com/sitemap.xml

# Limit concurrent requests
python scripts/sitemap_crawler.py --sitemap https://example.com/sitemap.xml --concurrency 10

# Sample mode (check subset)
python scripts/sitemap_crawler.py --sitemap https://example.com/sitemap.xml --sample 100

Output includes:

  • HTTP status codes per URL
  • Response times
  • Redirect chains
  • Broken links (4xx, 5xx)
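The concurrency limit set by `--concurrency` can be pictured as a semaphore around each request. The sketch below assumes this pattern; it does not reproduce sitemap_crawler.py. To stay self-contained it takes the HTTP call as a parameter and uses a fake `fake_fetch` coroutine instead of a real client, so the `check_urls` helper, the result keys, and the URLs are all hypothetical. Redirect chains are not modeled.

```python
import asyncio
import time
from typing import Awaitable, Callable

# A fetch function takes a URL and returns its HTTP status code.
Fetch = Callable[[str], Awaitable[int]]


async def check_urls(urls: list[str], fetch: Fetch, concurrency: int = 10) -> list[dict]:
    """Check every URL, allowing at most `concurrency` requests in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def check(url: str) -> dict:
        async with semaphore:
            start = time.perf_counter()
            try:
                status = await fetch(url)
            except Exception:
                status = None  # Network failure: no status code available.
            elapsed = time.perf_counter() - start
        return {
            "url": url,
            "status": status,
            "response_time": elapsed,
            "broken": status is None or status >= 400,  # 4xx, 5xx, or no response
        }

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(check(url) for url in urls))


async def fake_fetch(url: str) -> int:
    """Stand-in for a real HTTP request (assumption for this sketch)."""
    await asyncio.sleep(0)
    return 404 if url.endswith("/missing") else 200


results = asyncio.run(check_urls(
    ["https://example.com/", "https://example.com/missing"],
    fake_fetch,
    concurrency=2,
))
for row in results:
    print(row["url"], row["status"], "broken" if row["broken"] else "ok")
```

In a real crawl, `fake_fetch` would be replaced by a coroutine built on an async HTTP client, with the request timeout applied there.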

Output Format

All scripts support a --json flag for structured output:

{
  "url": "https://example.com",
  "status": "valid|invalid|warning",
  "issues": [
    {
      "type": "error|warning|info",
      "message": "Description",
      "location": "Line or URL"
    }
  ],
  "summary": {}
}
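One way to produce this structure from Python is sketched below. The `Issue` and `AuditResult` classes are hypothetical, and the rule that derives `status` from the issue types (any error gives "invalid", otherwise any warning gives "warning") is an assumption; the source does not state how the scripts decide it.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Issue:
    type: str      # "error", "warning", or "info"
    message: str
    location: str  # Line or URL


@dataclass
class AuditResult:
    url: str
    issues: list[Issue] = field(default_factory=list)
    summary: dict = field(default_factory=dict)

    @property
    def status(self) -> str:
        # Assumed rule: the most severe issue type determines the status.
        types = {issue.type for issue in self.issues}
        if "error" in types:
            return "invalid"
        if "warning" in types:
            return "warning"
        return "valid"

    def to_json(self) -> str:
        payload = {
            "url": self.url,
            "status": self.status,
            "issues": [asdict(issue) for issue in self.issues],
            "summary": self.summary,
        }
        # ensure_ascii=False keeps Korean text readable in the output.
        return json.dumps(payload, indent=2, ensure_ascii=False)


result = AuditResult(
    url="https://example.com",
    issues=[Issue("warning", "Missing sitemap declaration", "robots.txt")],
    summary={"issue_count": 1},
)
print(result.to_json())
```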

Common Issues Detected

| Category | Issue | Severity |
| --- | --- | --- |
| Robots.txt | Missing sitemap declaration | Medium |
| Robots.txt | Blocking CSS/JS resources | High |
| Robots.txt | Overly broad disallow rules | Medium |
| Sitemap | URLs returning 404 | High |
| Sitemap | Missing lastmod dates | Low |
| Sitemap | Exceeds 50,000 URL limit | High |
| Sitemap | Non-canonical URLs included | Medium |

Configuration

Environment variables (optional):

# Rate limiting
CRAWL_DELAY=1.0          # Seconds between requests
MAX_CONCURRENT=20        # Async concurrency limit
REQUEST_TIMEOUT=30       # Request timeout in seconds
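A minimal sketch of reading these variables follows. The `CrawlConfig` class and `load_config` function are hypothetical (the source only names a ConfigManager in base_client.py without showing it), and treating the values above as fallback defaults is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class CrawlConfig:
    crawl_delay: float      # Seconds between requests
    max_concurrent: int     # Async concurrency limit
    request_timeout: float  # Request timeout in seconds


def load_config(env: Optional[Mapping[str, str]] = None) -> CrawlConfig:
    """Read crawl settings from the environment, with assumed defaults."""
    env = os.environ if env is None else env
    return CrawlConfig(
        crawl_delay=float(env.get("CRAWL_DELAY", "1.0")),
        max_concurrent=int(env.get("MAX_CONCURRENT", "20")),
        request_timeout=float(env.get("REQUEST_TIMEOUT", "30")),
    )


# Passing a mapping explicitly makes the loader easy to test.
config = load_config({"CRAWL_DELAY": "0.5"})
print(config)
```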

Notion Output (Required)

IMPORTANT: All audit reports MUST be saved to the OurDigital SEO Audit Log database.

Database Configuration

| Field | Value |
| --- | --- |
| Database ID | 2c8581e5-8a1e-8035-880b-e38cefc2f3ef |
| URL | https://www.notion.so/dintelligence/2c8581e58a1e8035880be38cefc2f3ef |

Required Properties

| Property | Type | Description |
| --- | --- | --- |
| Issue | Title | Report title (Korean + date) |
| Site | URL | Audited website URL |
| Category | Select | Technical SEO, On-page SEO, Performance, Schema/Structured Data, Sitemap, Robots.txt, Content, Local SEO |
| Priority | Select | Critical, High, Medium, Low |
| Found Date | Date | Audit date (YYYY-MM-DD) |
| Audit ID | Rich Text | Format: [TYPE]-YYYYMMDD-NNN |
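The required properties above map onto Notion's page property value shapes (title, url, select, date, rich_text). The sketch below shows one way such a payload might be assembled. The `make_audit_id` and `build_page` helpers are hypothetical, and so are the "ROBOTS" type code and the sample title, since the source does not list the valid [TYPE] values.

```python
import json
from datetime import date

DATABASE_ID = "2c8581e5-8a1e-8035-880b-e38cefc2f3ef"


def make_audit_id(audit_type: str, found: date, sequence: int) -> str:
    """Format an Audit ID as [TYPE]-YYYYMMDD-NNN."""
    return f"{audit_type}-{found:%Y%m%d}-{sequence:03d}"


def build_page(title: str, site_url: str, category: str, priority: str,
               found: date, audit_id: str) -> dict:
    """Build a Notion create-page payload for the SEO Audit Log database."""
    return {
        "parent": {"database_id": DATABASE_ID},
        "properties": {
            "Issue": {"title": [{"text": {"content": title}}]},
            "Site": {"url": site_url},
            "Category": {"select": {"name": category}},
            "Priority": {"select": {"name": priority}},
            "Found Date": {"date": {"start": found.isoformat()}},
            "Audit ID": {"rich_text": [{"text": {"content": audit_id}}]},
        },
    }


found = date(2025, 12, 22)
audit_id = make_audit_id("ROBOTS", found, 1)  # "ROBOTS" is an assumed type code
page = build_page(
    title="Robots.txt Audit 보고서 (2025-12-22)",
    site_url="https://example.com",
    category="Robots.txt",
    priority="High",
    found=found,
    audit_id=audit_id,
)
print(json.dumps(page, ensure_ascii=False, indent=2))
```

The JSON printed at the end has the same top-level shape as the argument passed to the MCP call: a `parent` with the database ID and a `properties` object.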

Language Guidelines

  • Write report content in Korean (한국어)
  • Keep technical English terms as-is (e.g., SEO Audit, Core Web Vitals, Schema Markup)
  • URLs and code remain unchanged

Example MCP Call

mcp-cli call notion/API-post-page '{"parent": {"database_id": "2c8581e5-8a1e-8035-880b-e38cefc2f3ef"}, "properties": {...}}'