CLAUDE.md
Overview
International SEO audit tool for optimizing multi-language and multi-region websites. It validates hreflang tags (bidirectional, self-referencing, x-default), analyzes URL structure patterns (ccTLD vs subdomain vs subdirectory), audits content parity across language versions, compares the detected content language against the declared language, and analyzes international redirect logic. It also supports Korean expansion patterns (ko→ja, ko→zh, ko→en).
Quick Start
pip install -r scripts/requirements.txt
# Hreflang validation
python scripts/hreflang_validator.py --url https://example.com --json
# Full international SEO audit
python scripts/international_auditor.py --url https://example.com --json
Scripts
| Script | Purpose | Key Output |
|---|---|---|
| hreflang_validator.py | Validate hreflang tag implementation | Hreflang errors, missing bidirectional links, x-default issues |
| international_auditor.py | Full international SEO audit | URL structure, content parity, redirect logic, language detection |
| base_client.py | Shared utilities | RateLimiter, ConfigManager, BaseAsyncClient |
Hreflang Validator
# Validate hreflang for homepage
python scripts/hreflang_validator.py --url https://example.com --json
# Validate with sitemap-based discovery
python scripts/hreflang_validator.py --url https://example.com --sitemap https://example.com/sitemap.xml --json
# Check specific pages
python scripts/hreflang_validator.py --urls-file pages.txt --json
Capabilities:
- Hreflang tag extraction from HTML head, HTTP headers, and XML sitemap
- Bidirectional validation (if page A→B, then B→A must exist)
- Self-referencing check (each page should reference itself)
- x-default tag verification
- Language/region code validation (ISO 639-1 language codes + ISO 3166-1 alpha-2 region codes)
- Conflicting hreflang detection
- Missing language version detection
- Return tag validation (confirmation links from alternate pages)
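The bidirectional and self-referencing checks listed above can be sketched as follows. This is a minimal illustration, not the logic of hreflang_validator.py: the input shape (page URL mapped to its hreflang alternates) and the function name `find_hreflang_issues` are assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical input shape: page URL -> {hreflang code: alternate URL}.
HreflangMap = Dict[str, Dict[str, str]]


def find_hreflang_issues(pages: HreflangMap) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Return (missing_return_links, missing_self_references)."""
    missing_return: List[Tuple[str, str]] = []
    missing_self: List[str] = []
    for page, alternates in pages.items():
        targets = set(alternates.values())
        # Self-referencing check: the page should list its own URL.
        if page not in targets:
            missing_self.append(page)
        for target in targets:
            if target == page:
                continue
            # Bidirectional check: the target must link back to this page.
            # Targets that were not crawled count as missing return links.
            if page not in pages.get(target, {}).values():
                missing_return.append((page, target))
    return sorted(missing_return), missing_self


sample = {
    "https://example.com/ko/": {
        "ko": "https://example.com/ko/",
        "en": "https://example.com/en/",
    },
    "https://example.com/en/": {
        "en": "https://example.com/en/",
    },
}
missing_return, missing_self = find_hreflang_issues(sample)
```

In this sample the Korean page points to the English page, but the English page does not point back, so the pair is reported as a missing return link.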
International Auditor
# Full international audit
python scripts/international_auditor.py --url https://example.com --json
# URL structure analysis
python scripts/international_auditor.py --url https://example.com --scope structure --json
# Content parity check
python scripts/international_auditor.py --url https://example.com --scope parity --json
# Korean expansion focus
python scripts/international_auditor.py --url https://example.com --korean-expansion --json
Capabilities:
- URL structure analysis (ccTLD vs subdomain vs subdirectory)
  - Recommendation engine based on business context
- Content parity audit across language versions
  - Page count comparison per language
  - Key page availability check (home, about, contact, products)
  - Content freshness comparison across languages
- Language/locale detection vs declared language
  - HTML lang attribute check
  - Content-Language header check
  - Actual content language detection
- International redirect logic audit
  - IP-based redirect detection
  - Accept-Language redirect behavior
  - Geo-redirect best practices (suggest a locale, don't force a redirect)
- Korean expansion patterns (ko→ja, ko→zh, ko→en)
  - Priority market recommendations for Korean businesses
  - CJK-specific URL encoding issues
  - Regional search engine considerations (Naver, Baidu, Yahoo Japan)
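As a rough illustration of the URL structure analysis described above, the sketch below classifies a single URL as ccTLD, subdomain, or subdirectory. The function name `classify_url_structure` and the two small code sets are assumptions for the example, and the precedence (path prefix first, then subdomain, then TLD) is one possible choice rather than the behavior of international_auditor.py.

```python
from urllib.parse import urlparse

# Illustrative subsets only; a real audit would need fuller lists.
CC_TLDS = {"kr", "jp", "cn", "de", "fr", "uk"}
LANG_CODES = {"ko", "ja", "zh", "en", "de", "fr"}


def classify_url_structure(url: str) -> str:
    """Guess how a URL encodes its locale: path, subdomain, or ccTLD."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    labels = host.split(".")
    first_segment = parsed.path.strip("/").split("/")[0].lower()
    # Accept "en" as well as "en-us" style path prefixes.
    if first_segment.split("-")[0] in LANG_CODES:
        return "subdirectory"
    # A language code as the leftmost host label, e.g. ja.example.com.
    if len(labels) > 2 and labels[0].split("-")[0] in LANG_CODES:
        return "subdomain"
    # A country-code top-level domain, e.g. example.jp or example.co.kr.
    if labels[-1] in CC_TLDS:
        return "ccTLD"
    return "unknown"


structures = {
    url: classify_url_structure(url)
    for url in (
        "https://example.com/ja/products",
        "https://ja.example.com/",
        "https://example.co.kr/",
    )
}
```

Sites that mix patterns (for example a ccTLD that also uses language subdirectories) are reported by the first matching rule here, so a fuller audit would likely record every signal it finds.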
Data Sources
| Source | Purpose |
|---|---|
| our-seo-agent CLI | Primary data source (future); use --input for pre-fetched JSON |
| WebSearch / WebFetch | Supplementary live data |
| Notion MCP | Save audit report to database |
Output Format
{
  "url": "https://example.com",
  "url_structure": "subdirectory",
  "languages_detected": ["ko", "en", "ja"],
  "hreflang_validation": {
    "total_pages_checked": 50,
    "errors": [],
    "warnings": [],
    "missing_bidirectional": [],
    "missing_self_reference": [],
    "x_default_present": true
  },
  "content_parity": {
    "ko": {"pages": 150, "freshness_score": 90},
    "en": {"pages": 120, "freshness_score": 75},
    "ja": {"pages": 80, "freshness_score": 60}
  },
  "redirect_logic": {
    "ip_based_redirect": false,
    "language_based_redirect": true,
    "is_forced": false
  },
  "score": 68,
  "timestamp": "2025-01-01T00:00:00"
}
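A report in this shape can be post-processed with a few lines of Python. The sketch below reads the `content_parity` block and expresses each language's page count as a share of the largest language version. The helper name `parity_ratios` is hypothetical, and the ratio itself is an illustrative metric that is not part of the output above.

```python
import json

# Trimmed copy of the sample report, keeping only the fields used here.
report_json = """
{
  "content_parity": {
    "ko": {"pages": 150, "freshness_score": 90},
    "en": {"pages": 120, "freshness_score": 75},
    "ja": {"pages": 80, "freshness_score": 60}
  }
}
"""


def parity_ratios(report: dict) -> dict:
    """Page count per language as a fraction of the largest language version."""
    parity = report["content_parity"]
    largest = max(stats["pages"] for stats in parity.values())
    if largest == 0:
        # Avoid dividing by zero when no pages were found at all.
        return {lang: 0.0 for lang in parity}
    return {lang: round(stats["pages"] / largest, 2) for lang, stats in parity.items()}


ratios = parity_ratios(json.loads(report_json))
print(ratios)
```

For the sample data, Japanese covers roughly half of the Korean page count, which is the kind of gap a parity audit is meant to surface.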
Notion Output (Required)
IMPORTANT: All audit reports MUST be saved to the OurDigital SEO Audit Log database.
Database Configuration
| Field | Value |
|---|---|
| Database ID | 2c8581e5-8a1e-8035-880b-e38cefc2f3ef |
| URL | https://www.notion.so/dintelligence/2c8581e58a1e8035880be38cefc2f3ef |
Required Properties
| Property | Type | Description |
|---|---|---|
| Issue | Title | Report title (Korean + date) |
| Site | URL | Audited website URL |
| Category | Select | International SEO |
| Priority | Select | Based on hreflang error count |
| Found Date | Date | Audit date (YYYY-MM-DD) |
| Audit ID | Rich Text | Format: INTL-YYYYMMDD-NNN |
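The Audit ID and Found Date formats from the table can be produced as shown in this small sketch. The function name `build_audit_id` is hypothetical, and treating NNN as a zero-padded counter from 0 to 999 is an assumption, since the table only gives the pattern.

```python
from datetime import date


def build_audit_id(audit_date: date, sequence: int) -> str:
    """Format an ID as INTL-YYYYMMDD-NNN (NNN assumed to be a zero-padded counter)."""
    if not 0 <= sequence <= 999:
        raise ValueError("sequence must fit in three digits")
    return f"INTL-{audit_date:%Y%m%d}-{sequence:03d}"


audit_date = date(2025, 1, 1)
audit_id = build_audit_id(audit_date, 1)
found_date = audit_date.isoformat()  # YYYY-MM-DD, as required for Found Date
```

How the counter is allocated per day (and whether it starts at 0 or 1) is left open here and would need to follow whatever convention the audit log already uses.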
Language Guidelines
- Report content in Korean (한국어)
- Keep technical English terms as-is (e.g., hreflang, x-default, ccTLD)
- URLs and code remain unchanged