# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Skill Overview

`ourdigital-seo-audit` is a comprehensive SEO audit skill that performs technical SEO analysis, schema validation, sitemap/robots.txt checks, and Core Web Vitals measurement. Results are exported to a Notion database.
## Architecture

```
12-ourdigital-seo-audit/
├── SKILL.md                  # Skill definition with YAML frontmatter
├── scripts/                  # Python automation scripts
│   ├── base_client.py        # Shared utilities: RateLimiter, ConfigManager
│   ├── full_audit.py         # Main orchestrator (SEOAuditor class)
│   ├── gsc_client.py         # Google Search Console API
│   ├── pagespeed_client.py   # PageSpeed Insights API
│   ├── schema_validator.py   # JSON-LD/Microdata extraction & validation
│   ├── schema_generator.py   # Generate schema markup from templates
│   ├── sitemap_validator.py  # XML sitemap validation
│   ├── sitemap_crawler.py    # Async sitemap URL crawler
│   ├── robots_checker.py     # Robots.txt parser & rule tester
│   ├── page_analyzer.py      # On-page SEO analysis
│   └── notion_reporter.py    # Notion database integration
├── templates/
│   ├── schema_templates/     # JSON-LD templates (article, faq, product, etc.)
│   └── notion_database_schema.json
├── reference.md              # API documentation
└── USER_GUIDE.md             # End-user documentation
```
### Script Relationships

```
full_audit.py (orchestrator)
├── robots_checker.py    → RobotsChecker.analyze()
├── sitemap_validator.py → SitemapValidator.validate()
├── schema_validator.py  → SchemaValidator.validate()
├── pagespeed_client.py  → PageSpeedClient.analyze()
└── notion_reporter.py   → NotionReporter.create_audit_report()

All scripts use:
└── base_client.py → ConfigManager (credentials), RateLimiter, BaseAsyncClient
```
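A rough sketch of those shared utilities follows; the constructor signatures and method names are assumptions, not the actual `base_client.py` code:

```python
import asyncio
import json
import os
import time

class ConfigManager:
    """Sketch: resolves a credential file path and loads the JSON key."""

    def __init__(self, credential_path: str):
        self.credential_path = os.path.expanduser(credential_path)

    def load_service_account(self) -> dict:
        with open(self.credential_path) as f:
            return json.load(f)

class RateLimiter:
    """Sketch: spaces calls out to stay under a requests-per-second budget."""

    def __init__(self, requests_per_second: float):
        self.min_interval = 1.0 / requests_per_second
        self._last_call = 0.0

    async def wait(self) -> None:
        # Sleep just long enough to keep the minimum interval between calls.
        elapsed = time.monotonic() - self._last_call
        if elapsed < self.min_interval:
            await asyncio.sleep(self.min_interval - elapsed)
        self._last_call = time.monotonic()
```

Each API client would call `await limiter.wait()` before every request.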
## Common Commands

### Install Dependencies

```bash
pip install -r scripts/requirements.txt
```

### Run Full SEO Audit

```bash
python scripts/full_audit.py --url https://example.com --output console
python scripts/full_audit.py --url https://example.com --output notion
python scripts/full_audit.py --url https://example.com --json
```
### Individual Script Usage

```bash
# Robots.txt analysis
python scripts/robots_checker.py --url https://example.com

# Sitemap validation
python scripts/sitemap_validator.py --url https://example.com/sitemap.xml

# Schema validation
python scripts/schema_validator.py --url https://example.com

# Schema generation
python scripts/schema_generator.py --type organization --url https://example.com

# Core Web Vitals
python scripts/pagespeed_client.py --url https://example.com --strategy mobile

# Search Console data
python scripts/gsc_client.py --site sc-domain:example.com --action summary
```
## Key Classes and Data Flow

### `AuditResult` (`full_audit.py`)

Central dataclass holding all audit findings:

- `robots`, `sitemap`, `schema`, `performance` - Raw results from each checker
- `findings: list[SEOFinding]` - Normalized issues for Notion export
- `summary` - Aggregated statistics
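As a rough sketch (field types and the summary logic here are assumptions, not the actual `full_audit.py` code):

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class AuditResult:
    # Raw results from each checker (illustrative types).
    robots: dict[str, Any] = field(default_factory=dict)
    sitemap: dict[str, Any] = field(default_factory=dict)
    schema: dict[str, Any] = field(default_factory=dict)
    performance: dict[str, Any] = field(default_factory=dict)
    # Normalized findings for Notion export, here as plain dicts.
    findings: list[dict[str, Any]] = field(default_factory=list)
    summary: dict[str, Any] = field(default_factory=dict)

    def build_summary(self) -> dict[str, Any]:
        # Aggregate statistics over the normalized findings (illustrative).
        by_priority: dict[str, int] = {}
        for f in self.findings:
            by_priority[f["priority"]] = by_priority.get(f["priority"], 0) + 1
        self.summary = {"total": len(self.findings), "by_priority": by_priority}
        return self.summary
```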
### `SEOFinding` (`notion_reporter.py`)

Standard format for all audit issues:

```python
from dataclasses import dataclass

@dataclass
class SEOFinding:
    issue: str            # Issue title
    category: str         # Technical SEO, Performance, Schema, etc.
    priority: str         # Critical, High, Medium, Low
    url: str | None       # Affected URL
    recommendation: str   # How to fix
    audit_id: str         # Groups findings from the same session
```
### `NotionReporter`

Creates findings in Notion with two modes:

- Individual pages per finding in the default database
- Summary page with a checklist table via `create_audit_report()`

Default database: `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
## Google API Configuration

Service account: `~/.credential/ourdigital-seo-agent.json`

| API | Authentication | Usage |
|---|---|---|
| Search Console | Service account | `gsc_client.py` |
| PageSpeed Insights | API key (`PAGESPEED_API_KEY`) | `pagespeed_client.py` |
| GA4 Analytics | Service account | Traffic data |

Environment variables are loaded from `~/Workspaces/claude-workspace/.env`.
## MCP Tool Integration

The skill uses MCP tools as primary data sources (Tier 1):

- `mcp__firecrawl__scrape`/`crawl` - Web page content extraction
- `mcp__perplexity__search` - Competitor research
- `mcp__notion__*` - Database operations

Python scripts are Tier 2 for Google API data collection.
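The tiering amounts to a simple fallback, sketched here with placeholder callables (neither function below exists in the skill):

```python
from typing import Callable, Optional

def fetch_content(url: str,
                  mcp_scrape: Callable[[str], Optional[str]],
                  script_fetch: Callable[[str], str]) -> str:
    """Illustrative tiered lookup: MCP tool first, Python script second."""
    content = mcp_scrape(url)      # Tier 1, e.g. mcp__firecrawl__scrape
    if content is not None:
        return content
    return script_fetch(url)       # Tier 2, e.g. a collector in scripts/
```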
## Extending the Skill

### Adding a New Schema Type

1. Add a JSON template to `templates/schema_templates/`
2. Update `REQUIRED_PROPERTIES` and `RECOMMENDED_PROPERTIES` in `schema_validator.py`
3. Add type-specific validation in `_validate_type_specific()`
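Step 2 might look roughly like this; the property tables' actual keys and values in `schema_validator.py` may differ, and the `Event` entries are hypothetical:

```python
# Illustrative shape of the validator's property tables.
REQUIRED_PROPERTIES = {
    "Article": ["headline", "author", "datePublished"],
    "FAQPage": ["mainEntity"],
}
RECOMMENDED_PROPERTIES = {
    "Article": ["image", "dateModified"],
    "FAQPage": [],
}

# Registering a hypothetical new "Event" schema type (step 2):
REQUIRED_PROPERTIES["Event"] = ["name", "startDate", "location"]
RECOMMENDED_PROPERTIES["Event"] = ["endDate", "offers", "image"]
```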
### Adding a New Audit Check

1. Create a checker class following the pattern in the existing scripts
2. Return a dataclass with a `to_dict()` method and an `issues` list
3. Add a processing method in `SEOAuditor` (`_process_*_findings`)
4. Wire it into `run_audit()` in `full_audit.py`
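A new checker following that pattern might be shaped like this; the class, its fields, and the check itself are illustrative, and a real checker would fetch and parse the live page:

```python
from dataclasses import dataclass, field, asdict

@dataclass
class HreflangResult:
    """Hypothetical result dataclass with to_dict() and an issues list."""
    url: str
    issues: list[dict] = field(default_factory=list)

    def to_dict(self) -> dict:
        return asdict(self)

class HreflangChecker:
    def analyze(self, url: str, alternates: dict[str, str]) -> HreflangResult:
        result = HreflangResult(url=url)
        # Illustrative check: every hreflang cluster should declare x-default.
        if "x-default" not in alternates:
            result.issues.append({
                "issue": "Missing x-default hreflang alternate",
                "priority": "Medium",
                "recommendation": "Add an x-default link element.",
            })
        return result
```

`SEOAuditor` would then normalize `result.issues` into `SEOFinding` items via a `_process_*_findings` method.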
## Rate Limits

| Service | Limit | Handled By |
|---|---|---|
| Firecrawl | Per plan | MCP |
| PageSpeed | 25,000/day | `RateLimiter` in `base_client.py` |
| Search Console | 1,200/min | Manual delays |
| Notion | 3 req/sec | Semaphore in reporter |
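The Notion limit, for instance, can be enforced with an `asyncio.Semaphore` plus pacing, roughly as below; this is a sketch, and the reporter's actual implementation may differ:

```python
import asyncio

class NotionThrottle:
    """Sketch: cap concurrency at 3 and hold each slot ~1/3 s (3 req/sec)."""

    def __init__(self, max_concurrent: int = 3, min_interval: float = 1 / 3):
        self._sem = asyncio.Semaphore(max_concurrent)
        self._min_interval = min_interval

    async def run(self, coro_fn, *args):
        async with self._sem:
            result = await coro_fn(*args)
            # Keep the slot occupied briefly so throughput stays in budget.
            await asyncio.sleep(self._min_interval)
            return result

async def demo() -> list[str]:
    throttle = NotionThrottle()

    async def fake_create_page(i: int) -> str:
        # Stand-in for a Notion API call.
        return f"page-{i}"

    return await asyncio.gather(
        *(throttle.run(fake_create_page, i) for i in range(5))
    )
```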