refactor(skills): Restructure skills to dual-platform architecture

Major refactoring of ourdigital-custom-skills with new numbering system:

## Structure Changes
- Each skill now has code/ (Claude Code) and desktop/ (Claude Desktop) versions
- New progressive numbering: 01-09 General, 10-19 SEO, 20-29 GTM, 30-39 OurDigital, 40-49 Jamie

## Skill Reorganization
- 01-notion-organizer (from 02)
- 10-18: SEO tools split into focused skills (technical, on-page, local, schema, vitals, gsc, gateway)
- 20-21: GTM audit and manager
- 30-32: OurDigital designer, research, presentation
- 40-41: Jamie brand editor and audit

## New Files
- .claude/commands/: Slash command definitions for all skills
- CLAUDE.md: Updated with new skill structure documentation
- REFACTORING_PLAN.md: Migration documentation
- COMPATIBILITY_REPORT.md, SKILLS_COMPARISON.md: Analysis docs

## Removed
- Old skill directories (02-05, 10-14, 20-21 old numbering)
- Consolidated into new structure with _archive/ for reference

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 01:58:24 +09:00
parent 214247ace2
commit eea49f9f8c
251 changed files with 12308 additions and 102 deletions

@@ -0,0 +1,89 @@
# CLAUDE.md
## Overview
Notion workspace management toolkit for database organization, schema migration, and bulk operations.
## Quick Start
```bash
pip install -r scripts/requirements.txt
# Schema migration
python scripts/schema_migrator.py --source [DB_ID] --target [DB_ID] --dry-run
# Async bulk operations
python scripts/async_organizer.py --database [DB_ID] --action cleanup
```
## Scripts
| Script | Purpose |
|--------|---------|
| `schema_migrator.py` | Migrate data between databases with property mapping |
| `async_organizer.py` | Async bulk operations (cleanup, restructure, archive) |
## Schema Migrator
```bash
# Dry run (preview changes)
python scripts/schema_migrator.py \
--source abc123 \
--target def456 \
--mapping mapping.json \
--dry-run
# Execute migration
python scripts/schema_migrator.py \
--source abc123 \
--target def456 \
--mapping mapping.json
```
### Mapping File Format
```json
{
"properties": {
"OldName": "NewName",
"Status": "Status"
},
"transforms": {
"Date": "date_to_iso"
}
}
```
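
A minimal sketch of how a mapping file in this shape might be applied to a single record. `apply_mapping` and the `TRANSFORMS` registry are hypothetical names, not taken from `schema_migrator.py`, and the input format assumed for `date_to_iso` (`MM/DD/YYYY`) is an assumption as well.

```python
from datetime import datetime

# Hypothetical transform registry; the real date_to_iso may accept other formats.
TRANSFORMS = {
    "date_to_iso": lambda v: datetime.strptime(v, "%m/%d/%Y").date().isoformat(),
}


def apply_mapping(record: dict, mapping: dict) -> dict:
    """Rename properties and apply named transforms to a flat record."""
    renamed = mapping.get("properties", {})
    transforms = mapping.get("transforms", {})
    out = {}
    for name, value in record.items():
        transform = transforms.get(name)  # assumed to be keyed by source name
        if transform is not None and value is not None:
            value = TRANSFORMS[transform](value)
        # Properties absent from the mapping keep their original name.
        out[renamed.get(name, name)] = value
    return out


mapping = {
    "properties": {"OldName": "NewName", "Status": "Status"},
    "transforms": {"Date": "date_to_iso"},
}
migrated = apply_mapping(
    {"OldName": "Task A", "Status": "Done", "Date": "12/22/2025"}, mapping
)
```

Keeping transforms in a name-to-function registry means the JSON file only needs to carry strings, which is one plausible reason for the format above.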
## Async Organizer
```bash
# Cleanup empty/stale pages
python scripts/async_organizer.py --database [ID] --action cleanup
# Archive old pages
python scripts/async_organizer.py --database [ID] --action archive --days 90
# Restructure hierarchy
python scripts/async_organizer.py --database [ID] --action restructure
```
## Rate Limits
| Limit | Value |
|-------|-------|
| Requests/second | 3 max |
| Items per request | 100 max |
| Retry on 429 | Exponential backoff |
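
The backoff row above can be sketched as follows. This is an illustration under stated assumptions, not the scripts' actual retry code: `call_with_backoff` is a hypothetical helper, `send` stands in for any callable that returns a `(status_code, body)` pair, and the base and cap values are arbitrary.

```python
import time


def call_with_backoff(send, max_attempts=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send is any zero-argument callable returning (status_code, body).
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Wait 1s, 2s, 4s, ... (capped) before the next attempt.
            sleep(min(cap, base * 2 ** attempt))
    return status, body
```

Passing `sleep` as a parameter keeps the helper testable without real delays. A production client would typically also honor a `Retry-After` header when the API sends one.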
## Configuration
Environment variables:
```bash
NOTION_TOKEN=secret_xxx
```
## Notes
- Always use `--dry-run` first for destructive operations
- Large operations (1000+ pages) use async with progress reporting
- Scripts implement automatic rate limiting

@@ -0,0 +1,127 @@
# CLAUDE.md
## Overview
Technical SEO auditor for crawlability fundamentals: robots.txt validation, XML sitemap analysis, and URL accessibility checking.
## Quick Start
```bash
# Install dependencies
pip install -r scripts/requirements.txt
# Robots.txt analysis
python scripts/robots_checker.py --url https://example.com
# Sitemap validation
python scripts/sitemap_validator.py --url https://example.com/sitemap.xml
# Async URL crawl (check accessibility of sitemap URLs)
python scripts/sitemap_crawler.py --sitemap https://example.com/sitemap.xml
```
## Scripts
| Script | Purpose | Key Output |
|--------|---------|------------|
| `robots_checker.py` | Parse and validate robots.txt | User-agent rules, disallow patterns, sitemap declarations |
| `sitemap_validator.py` | Validate XML sitemap structure | URL count, lastmod dates, size limits, syntax errors |
| `sitemap_crawler.py` | Check URL accessibility asynchronously | HTTP status codes, response times, broken links |
| `base_client.py` | Shared utilities | RateLimiter, ConfigManager, BaseAsyncClient |
## Robots.txt Checker
```bash
# Basic analysis
python scripts/robots_checker.py --url https://example.com
# Test specific URL against rules
python scripts/robots_checker.py --url https://example.com --test-url /admin/
# Output JSON
python scripts/robots_checker.py --url https://example.com --json
```
**Checks performed**:
- Syntax validation
- User-agent rule parsing
- Disallow/Allow pattern analysis
- Sitemap declarations
- Critical resource access (CSS/JS/images)
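
Several of these checks can be approximated with Python's standard `urllib.robotparser`. The sketch below is not how `robots_checker.py` is necessarily implemented, and the robots.txt content is made up for illustration. Note that the standard-library parser applies rules in file order (first match wins), which can differ from Google's longest-match behavior.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt content, not fetched from a real site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Test specific URLs against the parsed rules.
admin_blocked = not parser.can_fetch("*", "https://example.com/admin/")
blog_allowed = parser.can_fetch("*", "https://example.com/blog/")

# Sitemap declarations (Python 3.8+); None when the file declares none.
sitemaps = parser.site_maps()
```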
## Sitemap Validator
```bash
# Validate sitemap
python scripts/sitemap_validator.py --url https://example.com/sitemap.xml
# Include sitemap index parsing
python scripts/sitemap_validator.py --url https://example.com/sitemap_index.xml --follow-index
```
**Validation rules**:
- XML syntax correctness
- URL count limit (50,000 max per sitemap)
- File size limit (50MB max uncompressed)
- Lastmod date format validation
- Sitemap index structure
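
A minimal sketch of the first four rules using only the standard library. `validate_sitemap` is a hypothetical function name, and the lastmod check is simplified: it relies on `datetime.fromisoformat`, so valid W3C forms such as a bare year may be rejected on some Python versions.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB, uncompressed


def validate_sitemap(xml_bytes: bytes) -> list[str]:
    """Return a list of human-readable issues; empty means no issue found."""
    issues = []
    if len(xml_bytes) > MAX_BYTES:
        issues.append("File exceeds 50MB uncompressed")
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError as exc:
        return issues + [f"XML syntax error: {exc}"]
    urls = root.findall("sm:url", NS)
    if len(urls) > MAX_URLS:
        issues.append(f"{len(urls)} URLs exceeds the 50,000 limit")
    for url in urls:
        loc = url.findtext("sm:loc", default="", namespaces=NS).strip()
        lastmod = url.findtext("sm:lastmod", namespaces=NS)
        if lastmod is None:
            continue
        try:
            # Accept a trailing "Z" on Python versions that do not parse it.
            datetime.fromisoformat(lastmod.strip().replace("Z", "+00:00"))
        except ValueError:
            issues.append(f"Invalid lastmod '{lastmod}' for {loc}")
    return issues
```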
## Sitemap Crawler
```bash
# Crawl all URLs in sitemap
python scripts/sitemap_crawler.py --sitemap https://example.com/sitemap.xml
# Limit concurrent requests
python scripts/sitemap_crawler.py --sitemap https://example.com/sitemap.xml --concurrency 10
# Sample mode (check subset)
python scripts/sitemap_crawler.py --sitemap https://example.com/sitemap.xml --sample 100
```
**Output includes**:
- HTTP status codes per URL
- Response times
- Redirect chains
- Broken links (4xx, 5xx)
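
The concurrency limit can be sketched with an `asyncio.Semaphore`. This is an illustration only: `check_urls` is a hypothetical helper, and `fetch` stands in for whatever coroutine performs the HTTP request (for example an `aiohttp` call) and returns a status code.

```python
import asyncio


async def check_urls(urls, fetch, concurrency=10):
    """Run fetch(url) -> status code for every URL, at most `concurrency` at a time."""
    semaphore = asyncio.Semaphore(concurrency)

    async def check(url):
        async with semaphore:
            try:
                status = await fetch(url)
            except Exception as exc:  # network failures count as broken
                return {"url": url, "status": None, "broken": True, "error": str(exc)}
        return {"url": url, "status": status, "broken": status >= 400}

    # gather preserves input order, so results line up with urls.
    return await asyncio.gather(*(check(u) for u in urls))
```

Injecting `fetch` keeps the concurrency logic separate from the HTTP client, so it can be exercised with a stub instead of live requests.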
## Output Format
All scripts support the `--json` flag for structured output:
```json
{
"url": "https://example.com",
"status": "valid|invalid|warning",
"issues": [
{
"type": "error|warning|info",
"message": "Description",
"location": "Line or URL"
}
],
"summary": {}
}
```
## Common Issues Detected
| Category | Issue | Severity |
|----------|-------|----------|
| Robots.txt | Missing sitemap declaration | Medium |
| Robots.txt | Blocking CSS/JS resources | High |
| Robots.txt | Overly broad disallow rules | Medium |
| Sitemap | URLs returning 404 | High |
| Sitemap | Missing lastmod dates | Low |
| Sitemap | Exceeds 50,000 URL limit | High |
| Sitemap | Non-canonical URLs included | Medium |
## Configuration
Environment variables (optional):
```bash
# Rate limiting
CRAWL_DELAY=1.0 # Seconds between requests
MAX_CONCURRENT=20 # Async concurrency limit
REQUEST_TIMEOUT=30 # Request timeout seconds
```
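
A minimal sketch of reading these variables in Python. `CrawlConfig` and `load_crawl_config` are hypothetical names, and treating the values shown above as defaults is an assumption rather than documented behavior of the scripts.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CrawlConfig:
    crawl_delay: float    # seconds between requests
    max_concurrent: int   # async concurrency limit
    request_timeout: int  # request timeout in seconds


def load_crawl_config(env=None) -> CrawlConfig:
    """Read crawl settings from a mapping (os.environ when none is given)."""
    env = os.environ if env is None else env
    return CrawlConfig(
        crawl_delay=float(env.get("CRAWL_DELAY", "1.0")),
        max_concurrent=int(env.get("MAX_CONCURRENT", "20")),
        request_timeout=int(env.get("REQUEST_TIMEOUT", "30")),
    )
```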

@@ -0,0 +1,17 @@
# 10-seo-technical-audit dependencies
# Install: pip install -r requirements.txt
# Web Scraping & Parsing
lxml>=5.1.0
beautifulsoup4>=4.12.0
requests>=2.31.0
aiohttp>=3.9.0
# Async & Retry
tenacity>=8.2.0
tqdm>=4.66.0
# Environment & CLI
python-dotenv>=1.0.0
rich>=13.7.0
typer>=0.9.0

@@ -0,0 +1,94 @@
---
name: seo-technical-audit
version: 1.0.0
description: "Technical SEO auditor for crawlability fundamentals. Triggers: robots.txt, sitemap validation, crawlability, indexing check, technical SEO."
allowed-tools: mcp__firecrawl__*, mcp__perplexity__*, mcp__notion__*
---
# SEO Technical Audit
## Purpose
Analyze crawlability fundamentals: robots.txt rules, XML sitemap structure, and URL accessibility. Identify issues blocking search engine crawlers.
## Core Capabilities
1. **Robots.txt Analysis** - Parse rules, check blocked resources
2. **Sitemap Validation** - Verify XML structure, URL limits, dates
3. **URL Accessibility** - Check HTTP status, redirects, broken links
## MCP Tool Usage
### Firecrawl for Page Data
```
mcp__firecrawl__scrape: Fetch robots.txt and sitemap content
mcp__firecrawl__crawl: Check accessibility of multiple URLs
```
### Perplexity for Best Practices
```
mcp__perplexity__search: Research current SEO recommendations
```
## Workflow
### 1. Robots.txt Check
1. Fetch `[domain]/robots.txt` using Firecrawl
2. Parse User-agent rules and Disallow patterns
3. Identify blocked resources (CSS, JS, images)
4. Check for Sitemap declarations
5. Report critical issues
### 2. Sitemap Validation
1. Locate sitemap (from robots.txt or `/sitemap.xml`)
2. Validate XML syntax
3. Check URL count (max 50,000)
4. Verify lastmod date formats
5. For sitemap index: parse child sitemaps
### 3. URL Accessibility Sampling
1. Extract URLs from sitemap
2. Sample 50-100 URLs for large sites
3. Check HTTP status codes
4. Identify redirects and broken links
5. Report 4xx/5xx errors
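
Step 2 of the sampling workflow could look like the sketch below. `sample_urls` is a hypothetical helper; the cap of 100 mirrors the upper end of the range above, and de-duplicating before sampling is an added assumption.

```python
import random


def sample_urls(urls, max_sample=100, seed=None):
    """Return all URLs when the list is small, otherwise a random subset."""
    unique = list(dict.fromkeys(urls))  # drop duplicates, keep first-seen order
    if len(unique) <= max_sample:
        return unique
    # A seeded generator makes the sample reproducible between audit runs.
    return random.Random(seed).sample(unique, max_sample)
```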
## Output Format
```markdown
## Technical SEO Audit: [domain]
### Robots.txt Analysis
- Status: [Valid/Invalid/Missing]
- Sitemap declared: [Yes/No]
- Critical blocks: [List]
### Sitemap Validation
- URLs found: [count]
- Syntax: [Valid/Errors]
- Issues: [List]
### URL Accessibility (sampled)
- Checked: [count] URLs
- Success (2xx): [count]
- Redirects (3xx): [count]
- Errors (4xx/5xx): [count]
### Recommendations
1. [Priority fixes]
```
## Common Issues
| Issue | Impact | Fix |
|-------|--------|-----|
| No sitemap in robots.txt | Medium | Add `Sitemap:` directive |
| Blocking CSS/JS | High | Allow Googlebot access |
| 404s in sitemap | High | Remove or fix URLs |
| Missing lastmod | Low | Add dates for freshness signals |
## Limitations
- Cannot access password-protected sitemaps
- Large sitemaps (10,000+ URLs) require sampling
- Does not check render-blocking issues (use Core Web Vitals skill)

@@ -0,0 +1,107 @@
# CLAUDE.md
## Overview
On-page SEO analyzer for single-page optimization: meta tags, headings, links, images, and Open Graph data.
## Quick Start
```bash
pip install -r scripts/requirements.txt
python scripts/page_analyzer.py --url https://example.com
```
## Scripts
| Script | Purpose |
|--------|---------|
| `page_analyzer.py` | Analyze on-page SEO elements |
| `base_client.py` | Shared utilities |
## Usage
```bash
# Full page analysis
python scripts/page_analyzer.py --url https://example.com
# JSON output
python scripts/page_analyzer.py --url https://example.com --json
# Analyze multiple pages
python scripts/page_analyzer.py --urls urls.txt
```
## Analysis Categories
### Meta Tags
- Title tag (length, keywords)
- Meta description (length, call-to-action)
- Canonical URL
- Robots meta tag
### Heading Structure
- H1 presence and count
- Heading hierarchy (H1→H6)
- Keyword placement in headings
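
A minimal sketch of the H1 and hierarchy checks. `find_hierarchy_issues` is a hypothetical function, not part of `page_analyzer.py`, and it assumes the heading levels are supplied in document order.

```python
def find_hierarchy_issues(levels: list[int]) -> list[str]:
    """Check heading levels given in document order (1 for H1 ... 6 for H6)."""
    issues = []
    h1_count = levels.count(1)
    if h1_count == 0:
        issues.append("Missing H1")
    elif h1_count > 1:
        issues.append(f"Multiple H1 tags ({h1_count})")
    previous = 0
    for level in levels:
        # Going deeper by more than one level skips a heading (e.g. H2 -> H4).
        if previous and level > previous + 1:
            issues.append(f"Skipped level: H{previous} -> H{level}")
        previous = level
    return issues
```

Moving back up the hierarchy (for example H3 followed by H2) is treated as valid here; only downward jumps of more than one level are flagged.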
### Links
- Internal link count
- External link count
- Broken links (4xx/5xx)
- Nofollow distribution
### Images
- Alt attribute presence
- Image file sizes
- Lazy loading implementation
### Open Graph / Social
- OG title, description, image
- Twitter Card tags
- Social sharing preview
## Output
```json
{
"url": "https://example.com",
"meta": {
"title": "Page Title",
"title_length": 55,
"description": "...",
"description_length": 150,
"canonical": "https://example.com"
},
"headings": {
"h1_count": 1,
"h1_text": ["Main Heading"],
"hierarchy_valid": true
},
"links": {
"internal": 25,
"external": 5,
"broken": []
},
"issues": []
}
```
## Common Issues
| Issue | Severity | Recommendation |
|-------|----------|----------------|
| Missing H1 | High | Add single H1 tag |
| Title too long (>60) | Medium | Shorten to 50-60 chars |
| No meta description | High | Add compelling description |
| Images without alt | Medium | Add descriptive alt text |
| Multiple H1 tags | Medium | Use single H1 only |
## Dependencies
```
lxml>=5.1.0
beautifulsoup4>=4.12.0
requests>=2.31.0
python-dotenv>=1.0.0
rich>=13.7.0
```

@@ -0,0 +1,207 @@
"""
Base Client - Shared async client utilities
===========================================
Purpose: Rate-limited async operations for API clients
Python: 3.10+
"""
import asyncio
import logging
import os
from asyncio import Semaphore
from datetime import datetime
from typing import Any, Callable, TypeVar
from dotenv import load_dotenv
from tenacity import (
    retry,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type,
)

# Load environment variables
load_dotenv()

# Logging setup
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)
T = TypeVar("T")
class RateLimiter:
    """Rate limiter using token bucket algorithm."""

    def __init__(self, rate: float, per: float = 1.0):
        """
        Initialize rate limiter.

        Args:
            rate: Number of requests allowed
            per: Time period in seconds (default: 1 second)
        """
        self.rate = rate
        self.per = per
        self.tokens = rate
        self.last_update = datetime.now()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        """Acquire a token, waiting if necessary."""
        async with self._lock:
            now = datetime.now()
            elapsed = (now - self.last_update).total_seconds()
            # Refill tokens for the elapsed time, never above the bucket size
            self.tokens = min(self.rate, self.tokens + elapsed * (self.rate / self.per))
            self.last_update = now
            if self.tokens < 1:
                wait_time = (1 - self.tokens) * (self.per / self.rate)
                await asyncio.sleep(wait_time)
                self.tokens = 0
                # The token earned while sleeping was just spent; restart the
                # clock so that time is not credited again on the next call.
                self.last_update = datetime.now()
            else:
                self.tokens -= 1
class BaseAsyncClient:
    """Base class for async API clients with rate limiting."""

    def __init__(
        self,
        max_concurrent: int = 5,
        requests_per_second: float = 3.0,
        logger: logging.Logger | None = None,
    ):
        """
        Initialize base client.

        Args:
            max_concurrent: Maximum concurrent requests
            requests_per_second: Rate limit
            logger: Logger instance
        """
        self.semaphore = Semaphore(max_concurrent)
        self.rate_limiter = RateLimiter(requests_per_second)
        self.logger = logger or logging.getLogger(self.__class__.__name__)
        self.stats = {
            "requests": 0,
            "success": 0,
            "errors": 0,
            "retries": 0,
        }

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10),
        retry=retry_if_exception_type(Exception),
    )
    async def _rate_limited_request(
        self,
        coro: Callable[[], Any],
    ) -> Any:
        """Execute a request with rate limiting and retry."""
        async with self.semaphore:
            await self.rate_limiter.acquire()
            # Counted per attempt, so retried requests are counted more than once
            self.stats["requests"] += 1
            try:
                result = await coro()
                self.stats["success"] += 1
                return result
            except Exception as e:
                self.stats["errors"] += 1
                self.logger.error(f"Request failed: {e}")
                raise

    async def batch_requests(
        self,
        requests: list[Callable[[], Any]],
        desc: str = "Processing",
    ) -> list[Any]:
        """Execute multiple requests concurrently."""
        try:
            from tqdm.asyncio import tqdm
            has_tqdm = True
        except ImportError:
            has_tqdm = False

        async def execute(req: Callable) -> Any:
            try:
                return await self._rate_limited_request(req)
            except Exception as e:
                return {"error": str(e)}

        tasks = [execute(req) for req in requests]
        if has_tqdm:
            # Results arrive in completion order, not input order
            results = []
            for coro in tqdm.as_completed(tasks, total=len(tasks), desc=desc):
                result = await coro
                results.append(result)
            return results
        else:
            # gather() keeps results in input order
            return await asyncio.gather(*tasks, return_exceptions=True)

    def print_stats(self) -> None:
        """Print request statistics."""
        self.logger.info("=" * 40)
        self.logger.info("Request Statistics:")
        self.logger.info(f" Total Requests: {self.stats['requests']}")
        self.logger.info(f" Successful: {self.stats['success']}")
        self.logger.info(f" Errors: {self.stats['errors']}")
        self.logger.info("=" * 40)
class ConfigManager:
    """Manage API configuration and credentials."""

    def __init__(self):
        load_dotenv()

    @property
    def google_credentials_path(self) -> str | None:
        """Get Google service account credentials path."""
        # Prefer SEO-specific credentials, fallback to general credentials
        seo_creds = os.path.expanduser("~/.credential/ourdigital-seo-agent.json")
        if os.path.exists(seo_creds):
            return seo_creds
        return os.getenv("GOOGLE_APPLICATION_CREDENTIALS")

    @property
    def pagespeed_api_key(self) -> str | None:
        """Get PageSpeed Insights API key."""
        return os.getenv("PAGESPEED_API_KEY")

    @property
    def custom_search_api_key(self) -> str | None:
        """Get Custom Search API key."""
        return os.getenv("CUSTOM_SEARCH_API_KEY")

    @property
    def custom_search_engine_id(self) -> str | None:
        """Get Custom Search Engine ID."""
        return os.getenv("CUSTOM_SEARCH_ENGINE_ID")

    @property
    def notion_token(self) -> str | None:
        """Get Notion API token."""
        return os.getenv("NOTION_TOKEN") or os.getenv("NOTION_API_KEY")

    def validate_google_credentials(self) -> bool:
        """Validate Google credentials are configured."""
        creds_path = self.google_credentials_path
        if not creds_path:
            return False
        return os.path.exists(creds_path)

    def get_required(self, key: str) -> str:
        """Get required environment variable or raise error."""
        value = os.getenv(key)
        if not value:
            raise ValueError(f"Missing required environment variable: {key}")
        return value


# Singleton config instance
config = ConfigManager()

@@ -0,0 +1,569 @@
"""
Page Analyzer - Extract SEO metadata from web pages
===================================================
Purpose: Comprehensive page-level SEO data extraction
Python: 3.10+
Usage:
from page_analyzer import PageAnalyzer, PageMetadata
analyzer = PageAnalyzer()
metadata = analyzer.analyze_url("https://example.com/page")
"""
import json
import logging
import re
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any
from urllib.parse import urljoin, urlparse
import requests
from bs4 import BeautifulSoup
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)
logger = logging.getLogger(__name__)


@dataclass
class LinkData:
    """Represents a link found on a page."""
    url: str
    anchor_text: str
    is_internal: bool
    is_nofollow: bool = False
    link_type: str = "body"  # body, nav, footer, etc.


@dataclass
class HeadingData:
    """Represents a heading found on a page."""
    level: int  # 1-6
    text: str


@dataclass
class SchemaData:
    """Represents schema.org structured data."""
    schema_type: str
    properties: dict
    format: str = "json-ld"  # json-ld, microdata, rdfa


@dataclass
class OpenGraphData:
    """Represents Open Graph metadata."""
    og_title: str | None = None
    og_description: str | None = None
    og_image: str | None = None
    og_url: str | None = None
    og_type: str | None = None
    og_site_name: str | None = None
    og_locale: str | None = None
    twitter_card: str | None = None
    twitter_title: str | None = None
    twitter_description: str | None = None
    twitter_image: str | None = None
@dataclass
class PageMetadata:
    """Complete SEO metadata for a page."""
    # Basic info
    url: str
    status_code: int = 0
    content_type: str = ""
    response_time_ms: float = 0
    analyzed_at: datetime = field(default_factory=datetime.now)
    # Meta tags
    title: str | None = None
    title_length: int = 0
    meta_description: str | None = None
    meta_description_length: int = 0
    canonical_url: str | None = None
    robots_meta: str | None = None
    # Language
    html_lang: str | None = None
    hreflang_tags: list[dict] = field(default_factory=list)  # [{"lang": "en", "url": "..."}]
    # Headings
    headings: list[HeadingData] = field(default_factory=list)
    h1_count: int = 0
    h1_text: str | None = None
    # Open Graph & Social
    open_graph: OpenGraphData = field(default_factory=OpenGraphData)
    # Schema/Structured Data
    schema_data: list[SchemaData] = field(default_factory=list)
    schema_types_found: list[str] = field(default_factory=list)
    # Links
    internal_links: list[LinkData] = field(default_factory=list)
    external_links: list[LinkData] = field(default_factory=list)
    internal_link_count: int = 0
    external_link_count: int = 0
    # Images
    images_total: int = 0
    images_without_alt: int = 0
    images_with_alt: int = 0
    # Content metrics
    word_count: int = 0
    # Issues found
    issues: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)

    def to_dict(self) -> dict:
        """Convert to dictionary for JSON serialization."""
        return {
            "url": self.url,
            "status_code": self.status_code,
            "content_type": self.content_type,
            "response_time_ms": self.response_time_ms,
            "analyzed_at": self.analyzed_at.isoformat(),
            "title": self.title,
            "title_length": self.title_length,
            "meta_description": self.meta_description,
            "meta_description_length": self.meta_description_length,
            "canonical_url": self.canonical_url,
            "robots_meta": self.robots_meta,
            "html_lang": self.html_lang,
            "hreflang_tags": self.hreflang_tags,
            "h1_count": self.h1_count,
            "h1_text": self.h1_text,
            "headings_count": len(self.headings),
            "schema_types_found": self.schema_types_found,
            "internal_link_count": self.internal_link_count,
            "external_link_count": self.external_link_count,
            "images_total": self.images_total,
            "images_without_alt": self.images_without_alt,
            "word_count": self.word_count,
            "issues": self.issues,
            "warnings": self.warnings,
            "open_graph": {
                "og_title": self.open_graph.og_title,
                "og_description": self.open_graph.og_description,
                "og_image": self.open_graph.og_image,
                "og_url": self.open_graph.og_url,
                "og_type": self.open_graph.og_type,
            },
        }

    def get_summary(self) -> str:
        """Get a brief summary of the page analysis."""
        lines = [
            f"URL: {self.url}",
            f"Status: {self.status_code}",
            f"Title: {self.title[:50] + '...' if self.title and len(self.title) > 50 else self.title}",
            f"Description: {'✓' if self.meta_description else '✗ Missing'}",
            f"Canonical: {'✓' if self.canonical_url else '✗ Missing'}",
            f"H1: {self.h1_count} found",
            f"Schema: {', '.join(self.schema_types_found) if self.schema_types_found else 'None'}",
            f"Links: {self.internal_link_count} internal, {self.external_link_count} external",
            f"Images: {self.images_total} total, {self.images_without_alt} without alt",
        ]
        if self.issues:
            lines.append(f"Issues: {len(self.issues)}")
        return "\n".join(lines)
class PageAnalyzer:
    """Analyze web pages for SEO metadata."""

    DEFAULT_USER_AGENT = "Mozilla/5.0 (compatible; OurDigitalSEOBot/1.0; +https://ourdigital.org)"

    def __init__(
        self,
        user_agent: str | None = None,
        timeout: int = 30,
    ):
        """
        Initialize page analyzer.

        Args:
            user_agent: Custom user agent string
            timeout: Request timeout in seconds
        """
        self.user_agent = user_agent or self.DEFAULT_USER_AGENT
        self.timeout = timeout
        self.session = requests.Session()
        self.session.headers.update({
            "User-Agent": self.user_agent,
            "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
            "Accept-Language": "en-US,en;q=0.9,ko;q=0.8",
        })

    def analyze_url(self, url: str) -> PageMetadata:
        """
        Analyze a URL and extract SEO metadata.

        Args:
            url: URL to analyze

        Returns:
            PageMetadata object with all extracted data
        """
        metadata = PageMetadata(url=url)
        try:
            # Fetch page
            start_time = datetime.now()
            response = self.session.get(url, timeout=self.timeout, allow_redirects=True)
            metadata.response_time_ms = (datetime.now() - start_time).total_seconds() * 1000
            metadata.status_code = response.status_code
            metadata.content_type = response.headers.get("Content-Type", "")
            if response.status_code != 200:
                metadata.issues.append(f"HTTP {response.status_code} status")
                if response.status_code >= 400:
                    return metadata
            # Parse HTML
            soup = BeautifulSoup(response.text, "html.parser")
            base_url = url
            # Extract all metadata
            self._extract_basic_meta(soup, metadata)
            self._extract_canonical(soup, metadata, base_url)
            self._extract_robots_meta(soup, metadata)
            self._extract_hreflang(soup, metadata)
            self._extract_headings(soup, metadata)
            self._extract_open_graph(soup, metadata)
            self._extract_schema(soup, metadata)
            self._extract_links(soup, metadata, base_url)
            self._extract_images(soup, metadata)
            self._extract_content_metrics(soup, metadata)
            # Run SEO checks
            self._run_seo_checks(metadata)
        except requests.RequestException as e:
            metadata.issues.append(f"Request failed: {str(e)}")
            logger.error(f"Failed to analyze {url}: {e}")
        except Exception as e:
            metadata.issues.append(f"Analysis error: {str(e)}")
            logger.error(f"Error analyzing {url}: {e}")
        return metadata
    def _extract_basic_meta(self, soup: BeautifulSoup, metadata: PageMetadata) -> None:
        """Extract title and meta description."""
        # Title
        title_tag = soup.find("title")
        if title_tag and title_tag.string:
            metadata.title = title_tag.string.strip()
            metadata.title_length = len(metadata.title)
        # Meta description
        desc_tag = soup.find("meta", attrs={"name": re.compile(r"^description$", re.I)})
        if desc_tag and desc_tag.get("content"):
            metadata.meta_description = desc_tag["content"].strip()
            metadata.meta_description_length = len(metadata.meta_description)
        # HTML lang
        html_tag = soup.find("html")
        if html_tag and html_tag.get("lang"):
            metadata.html_lang = html_tag["lang"]

    def _extract_canonical(self, soup: BeautifulSoup, metadata: PageMetadata, base_url: str) -> None:
        """Extract canonical URL."""
        canonical = soup.find("link", rel="canonical")
        if canonical and canonical.get("href"):
            metadata.canonical_url = urljoin(base_url, canonical["href"])

    def _extract_robots_meta(self, soup: BeautifulSoup, metadata: PageMetadata) -> None:
        """Extract robots meta tag."""
        robots = soup.find("meta", attrs={"name": re.compile(r"^robots$", re.I)})
        if robots and robots.get("content"):
            metadata.robots_meta = robots["content"]
        # Also check for googlebot-specific
        googlebot = soup.find("meta", attrs={"name": re.compile(r"^googlebot$", re.I)})
        if googlebot and googlebot.get("content"):
            if metadata.robots_meta:
                metadata.robots_meta += f" | googlebot: {googlebot['content']}"
            else:
                metadata.robots_meta = f"googlebot: {googlebot['content']}"

    def _extract_hreflang(self, soup: BeautifulSoup, metadata: PageMetadata) -> None:
        """Extract hreflang tags."""
        hreflang_tags = soup.find_all("link", rel="alternate", hreflang=True)
        for tag in hreflang_tags:
            if tag.get("href") and tag.get("hreflang"):
                metadata.hreflang_tags.append({
                    "lang": tag["hreflang"],
                    "url": tag["href"]
                })

    def _extract_headings(self, soup: BeautifulSoup, metadata: PageMetadata) -> None:
        """Extract all headings."""
        for level in range(1, 7):
            for heading in soup.find_all(f"h{level}"):
                text = heading.get_text(strip=True)
                if text:
                    metadata.headings.append(HeadingData(level=level, text=text))
        # Count H1s specifically
        h1_tags = soup.find_all("h1")
        metadata.h1_count = len(h1_tags)
        if h1_tags:
            metadata.h1_text = h1_tags[0].get_text(strip=True)

    def _extract_open_graph(self, soup: BeautifulSoup, metadata: PageMetadata) -> None:
        """Extract Open Graph and Twitter Card data."""
        og = metadata.open_graph
        # Open Graph tags
        og_mappings = {
            "og:title": "og_title",
            "og:description": "og_description",
            "og:image": "og_image",
            "og:url": "og_url",
            "og:type": "og_type",
            "og:site_name": "og_site_name",
            "og:locale": "og_locale",
        }
        for og_prop, attr_name in og_mappings.items():
            tag = soup.find("meta", property=og_prop)
            if tag and tag.get("content"):
                setattr(og, attr_name, tag["content"])
        # Twitter Card tags
        twitter_mappings = {
            "twitter:card": "twitter_card",
            "twitter:title": "twitter_title",
            "twitter:description": "twitter_description",
            "twitter:image": "twitter_image",
        }
        for tw_name, attr_name in twitter_mappings.items():
            tag = soup.find("meta", attrs={"name": tw_name})
            if tag and tag.get("content"):
                setattr(og, attr_name, tag["content"])
    def _extract_schema(self, soup: BeautifulSoup, metadata: PageMetadata) -> None:
        """Extract schema.org structured data."""
        # JSON-LD
        for script in soup.find_all("script", type="application/ld+json"):
            try:
                data = json.loads(script.string)
                if isinstance(data, list):
                    for item in data:
                        self._process_schema_item(item, metadata, "json-ld")
                else:
                    self._process_schema_item(data, metadata, "json-ld")
            except (json.JSONDecodeError, TypeError):
                continue
        # Microdata (basic detection)
        for item in soup.find_all(itemscope=True):
            itemtype = item.get("itemtype", "")
            if itemtype:
                schema_type = itemtype.split("/")[-1]
                if schema_type not in metadata.schema_types_found:
                    metadata.schema_types_found.append(schema_type)
                metadata.schema_data.append(SchemaData(
                    schema_type=schema_type,
                    properties={},
                    format="microdata"
                ))

    def _process_schema_item(self, data: dict, metadata: PageMetadata, format_type: str) -> None:
        """Process a single schema.org item."""
        if not isinstance(data, dict):
            return
        schema_type = data.get("@type", "Unknown")
        if isinstance(schema_type, list):
            schema_type = schema_type[0] if schema_type else "Unknown"
        if schema_type not in metadata.schema_types_found:
            metadata.schema_types_found.append(schema_type)
        metadata.schema_data.append(SchemaData(
            schema_type=schema_type,
            properties=data,
            format=format_type
        ))
        # Process nested @graph items
        if "@graph" in data:
            for item in data["@graph"]:
                self._process_schema_item(item, metadata, format_type)
    def _extract_links(self, soup: BeautifulSoup, metadata: PageMetadata, base_url: str) -> None:
        """Extract internal and external links."""
        parsed_base = urlparse(base_url)
        base_domain = parsed_base.netloc.lower()
        for a_tag in soup.find_all("a", href=True):
            href = a_tag["href"]
            # Skip non-http links
            if href.startswith(("#", "javascript:", "mailto:", "tel:")):
                continue
            # Resolve relative URLs
            full_url = urljoin(base_url, href)
            parsed_url = urlparse(full_url)
            # Get anchor text
            anchor_text = a_tag.get_text(strip=True)[:100]  # Limit length
            # Check if nofollow
            rel = a_tag.get("rel", [])
            if isinstance(rel, str):
                rel = rel.split()
            is_nofollow = "nofollow" in rel
            # Determine if internal or external
            link_domain = parsed_url.netloc.lower()
            is_internal = (
                link_domain == base_domain or
                link_domain.endswith(f".{base_domain}") or
                base_domain.endswith(f".{link_domain}")
            )
            link_data = LinkData(
                url=full_url,
                anchor_text=anchor_text,
                is_internal=is_internal,
                is_nofollow=is_nofollow,
            )
            if is_internal:
                metadata.internal_links.append(link_data)
            else:
                metadata.external_links.append(link_data)
        metadata.internal_link_count = len(metadata.internal_links)
        metadata.external_link_count = len(metadata.external_links)

    def _extract_images(self, soup: BeautifulSoup, metadata: PageMetadata) -> None:
        """Extract image information."""
        images = soup.find_all("img")
        metadata.images_total = len(images)
        for img in images:
            alt = img.get("alt", "").strip()
            if alt:
                metadata.images_with_alt += 1
            else:
                metadata.images_without_alt += 1

    def _extract_content_metrics(self, soup: BeautifulSoup, metadata: PageMetadata) -> None:
        """Extract content metrics like word count."""
        # Remove script and style elements
        for element in soup(["script", "style", "noscript"]):
            element.decompose()
        # Get text content
        text = soup.get_text(separator=" ", strip=True)
        words = text.split()
        metadata.word_count = len(words)
    def _run_seo_checks(self, metadata: PageMetadata) -> None:
        """Run SEO checks and add issues/warnings."""
        # Title checks
        if not metadata.title:
            metadata.issues.append("Missing title tag")
        elif metadata.title_length < 30:
            metadata.warnings.append(f"Title too short ({metadata.title_length} chars, recommend 50-60)")
        elif metadata.title_length > 60:
            metadata.warnings.append(f"Title too long ({metadata.title_length} chars, recommend 50-60)")
        # Meta description checks
        if not metadata.meta_description:
            metadata.issues.append("Missing meta description")
        elif metadata.meta_description_length < 120:
            metadata.warnings.append(f"Meta description too short ({metadata.meta_description_length} chars)")
        elif metadata.meta_description_length > 160:
            metadata.warnings.append(f"Meta description too long ({metadata.meta_description_length} chars)")
        # Canonical check
        if not metadata.canonical_url:
            metadata.warnings.append("Missing canonical tag")
        elif metadata.canonical_url != metadata.url:
            metadata.warnings.append(f"Canonical points to different URL: {metadata.canonical_url}")
        # H1 checks
        if metadata.h1_count == 0:
            metadata.issues.append("Missing H1 tag")
        elif metadata.h1_count > 1:
            metadata.warnings.append(f"Multiple H1 tags ({metadata.h1_count})")
        # Image alt check
        if metadata.images_without_alt > 0:
            metadata.warnings.append(f"{metadata.images_without_alt} images missing alt text")
        # Schema check
        if not metadata.schema_types_found:
            metadata.warnings.append("No structured data found")
        # Open Graph check
        if not metadata.open_graph.og_title:
            metadata.warnings.append("Missing Open Graph tags")
        # Robots meta check
        if metadata.robots_meta:
            robots_lower = metadata.robots_meta.lower()
            if "noindex" in robots_lower:
                metadata.issues.append("Page is set to noindex")
            if "nofollow" in robots_lower:
                metadata.warnings.append("Page is set to nofollow")
def main():
    """CLI entry point for testing."""
    import argparse

    parser = argparse.ArgumentParser(description="Page SEO Analyzer")
    # Named flag so the CLI matches the documented `--url` usage
    parser.add_argument("--url", required=True, help="URL to analyze")
    parser.add_argument("--json", "-j", action="store_true", help="Output as JSON")
    args = parser.parse_args()

    analyzer = PageAnalyzer()
    metadata = analyzer.analyze_url(args.url)
    if args.json:
        print(json.dumps(metadata.to_dict(), indent=2, ensure_ascii=False))
    else:
        print("=" * 60)
        print("PAGE ANALYSIS REPORT")
        print("=" * 60)
        print(metadata.get_summary())
        print()
        if metadata.issues:
            print("ISSUES:")
            for issue in metadata.issues:
                print(f" - {issue}")
        if metadata.warnings:
            print("\nWARNINGS:")
            for warning in metadata.warnings:
                print(f" - {warning}")
        if metadata.hreflang_tags:
            print(f"\nHREFLANG TAGS ({len(metadata.hreflang_tags)}):")
            for tag in metadata.hreflang_tags[:5]:
                print(f" {tag['lang']}: {tag['url']}")
        if metadata.schema_types_found:
            print("\nSCHEMA TYPES:")
            for schema_type in metadata.schema_types_found:
                print(f" - {schema_type}")


if __name__ == "__main__":
    main()

View File

@@ -0,0 +1,6 @@
# 11-seo-on-page-audit dependencies
lxml>=5.1.0
beautifulsoup4>=4.12.0
requests>=2.31.0
python-dotenv>=1.0.0
rich>=13.7.0

View File

@@ -0,0 +1,94 @@
---
name: seo-on-page-audit
version: 1.0.0
description: "On-page SEO analyzer for meta tags, headings, links, images, and Open Graph. Triggers: on-page SEO, meta tags, title tag, heading structure, alt text."
allowed-tools: mcp__firecrawl__*, mcp__perplexity__*, mcp__notion__*
---
# SEO On-Page Audit
## Purpose
Analyze single-page SEO elements: meta tags, heading hierarchy, internal/external links, images, and social sharing tags.
## Core Capabilities
1. **Meta Tags** - Title, description, canonical, robots
2. **Headings** - H1-H6 structure and hierarchy
3. **Links** - Internal, external, broken detection
4. **Images** - Alt text, sizing, lazy loading
5. **Social** - Open Graph, Twitter Cards
## MCP Tool Usage
```
mcp__firecrawl__scrape: Extract page HTML and metadata
mcp__perplexity__search: Research SEO best practices
mcp__notion__create-page: Save audit findings
```
## Workflow
1. Scrape target URL with Firecrawl
2. Extract and analyze meta tags
3. Map heading hierarchy
4. Count and categorize links
5. Check image optimization
6. Validate Open Graph tags
7. Generate recommendations
## Checklist
### Meta Tags
- [ ] Title present (50-60 characters)
- [ ] Meta description present (120-160 characters)
- [ ] Canonical URL set
- [ ] Robots meta allows indexing
### Headings
- [ ] Single H1 tag
- [ ] Logical hierarchy (no skips)
- [ ] Keywords in H1
### Links
- [ ] No broken internal links
- [ ] External links use rel attributes
- [ ] Reasonable internal link count
### Images
- [ ] All images have alt text
- [ ] Images are appropriately sized
- [ ] Lazy loading implemented
### Open Graph
- [ ] og:title present
- [ ] og:description present
- [ ] og:image present (1200x630)
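
Two of the checklist items above (title length and "no skips" in the heading hierarchy) can be sketched as follows. This is a minimal illustration, assuming the title text and the sequence of heading levels have already been extracted from the page; the function names and the 50-60 defaults are hypothetical and not taken from the skill's scripts.

```python
def check_title_length(title: str, low: int = 50, high: int = 60) -> str:
    """Classify a title against the recommended length range (assumed 50-60)."""
    length = len(title.strip())
    if length == 0:
        return "missing"
    if length < low:
        return "too short"
    if length > high:
        return "too long"
    return "ok"


def find_heading_skips(levels: list[int]) -> list[tuple[int, int]]:
    """Return (previous, current) pairs where a heading jumps down more than one level.

    `levels` is the document-order list of heading levels, e.g. [1, 2, 3] for H1, H2, H3.
    """
    skips = []
    for previous, current in zip(levels, levels[1:]):
        if current > previous + 1:
            skips.append((previous, current))
    return skips
```

For example, an H2 followed directly by an H4 is reported as the pair `(2, 4)`, while moving back up from H4 to H2 is not treated as a skip.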
## Output Format
```markdown
## On-Page Audit: [URL]
### Meta Tags: X/5
| Element | Status | Value |
|---------|--------|-------|
### Headings: X/5
- H1: [text]
- Hierarchy: Valid/Invalid
### Links
- Internal: X
- External: X
- Broken: X
### Recommendations
1. [Priority fixes]
```
## Limitations
- Single page analysis only
- Cannot detect JavaScript-rendered content issues
- External link status requires additional crawl

View File

@@ -0,0 +1,107 @@
# CLAUDE.md
## Overview
Local SEO auditor for businesses with physical locations: NAP consistency, Google Business Profile optimization, local citations, and LocalBusiness schema validation.
## Quick Start
This skill primarily uses MCP tools (Firecrawl, Perplexity) for data collection. Scripts are helpers for validation.
```bash
# NAP consistency check (manual data input)
python scripts/nap_checker.py --business "Business Name" --address "123 Main St" --phone "555-1234"
# LocalBusiness schema validation
python scripts/local_schema_validator.py --url https://example.com
```
## Audit Components
### 1. NAP Consistency
**Name, Address, Phone** consistency across:
- Website (header, footer, contact page)
- Google Business Profile
- Local directories (Yelp, Yellow Pages, etc.)
- Social media profiles
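
A rough sketch of how NAP values from two sources might be compared, assuming each source has already been reduced to a dict with `name`, `address`, and `phone` keys. The helper names are hypothetical and this is not the logic of `nap_checker.py`, which is not shown here.

```python
import re


def normalize_phone(phone: str) -> str:
    """Keep digits only so formatting differences do not count as mismatches."""
    return re.sub(r"\D", "", phone)


def normalize_text(value: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", value.lower())
    return " ".join(cleaned.split())


def nap_matches(reference: dict[str, str], candidate: dict[str, str]) -> dict[str, bool]:
    """Compare one source's NAP against the reference, field by field."""
    return {
        "name": normalize_text(reference["name"]) == normalize_text(candidate["name"]),
        "address": normalize_text(reference["address"]) == normalize_text(candidate["address"]),
        "phone": normalize_phone(reference["phone"]) == normalize_phone(candidate["phone"]),
    }
```

This only absorbs case, punctuation, and spacing differences. Abbreviation variants such as "St" versus "Street" still count as mismatches and would need a separate mapping.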
### 2. Google Business Profile (GBP)
Optimization checklist:
- [ ] Business name matches website
- [ ] Address is complete and accurate
- [ ] Phone number is local
- [ ] Business hours are current
- [ ] Categories are appropriate
- [ ] Photos uploaded (exterior, interior, products)
- [ ] Posts are recent (within 7 days)
- [ ] Reviews are responded to
### 3. Local Citations
Priority directories to check:
- Google Business Profile
- Apple Maps
- Bing Places
- Yelp
- Facebook Business
- Industry-specific directories
### 4. LocalBusiness Schema
Required properties:
- @type (LocalBusiness or subtype)
- name
- address (PostalAddress)
- telephone
- openingHours
## Workflow
```
1. Collect NAP from client
2. Scrape website for NAP mentions
3. Search citations using Perplexity
4. Check GBP data (manual or API)
5. Validate LocalBusiness schema
6. Generate consistency report
```
## Output Format
```markdown
## Local SEO Audit: [Business Name]
### NAP Consistency Score: X/10
| Source | Name | Address | Phone | Status |
|--------|------|---------|-------|--------|
| Website | ✓ | ✓ | ✓ | Match |
| GBP | ✓ | ✗ | ✓ | Mismatch |
### GBP Optimization: X/10
- [ ] Issue 1
- [x] Completed item
### Citation Audit
- Found: X citations
- Consistent: X
- Needs update: X
### Recommendations
1. Fix address mismatch on GBP
2. Add LocalBusiness schema
```
## Common Issues
| Issue | Impact | Fix |
|-------|--------|-----|
| NAP inconsistency | High | Update all directories |
| Missing GBP categories | Medium | Add relevant categories |
| No LocalBusiness schema | Medium | Add JSON-LD markup |
| Outdated business hours | Medium | Update GBP hours |
| No review responses | Low | Respond to all reviews |
## Notes
- GBP API requires enterprise approval (use manual audit)
- Citation discovery limited to public data
- Use schema generator skill (14) for creating LocalBusiness markup

View File

@@ -0,0 +1,116 @@
---
name: seo-local-audit
version: 1.0.0
description: "Local SEO auditor for NAP consistency, Google Business Profile, citations, and LocalBusiness schema. Triggers: local SEO, Google Business Profile, GBP, NAP, citations, local rankings."
allowed-tools: mcp__firecrawl__*, mcp__perplexity__*, mcp__notion__*
---
# SEO Local Audit
## Purpose
Audit local business SEO: NAP (Name, Address, Phone) consistency, Google Business Profile optimization, local citations, and LocalBusiness schema markup.
## Core Capabilities
1. **NAP Consistency** - Cross-platform verification
2. **GBP Optimization** - Profile completeness check
3. **Citation Audit** - Directory presence
4. **Schema Validation** - LocalBusiness markup
## MCP Tool Usage
```
mcp__firecrawl__scrape: Extract NAP from website
mcp__perplexity__search: Find citations and directories
mcp__notion__create-page: Save audit findings
```
## Workflow
### 1. Gather Business Info
Collect from client:
- Business name (exact)
- Full address
- Phone number (local preferred)
- Website URL
- GBP listing URL
### 2. Website NAP Check
Scrape website for NAP mentions:
- Header/footer
- Contact page
- About page
- Schema markup
### 3. Citation Discovery
Search for business mentions:
- "[Business Name] [City]"
- Phone number search
- Address search
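
The three search patterns above could be generated like this. A small sketch only: the function name is hypothetical, and quoting each term for exact-match search is an assumption about how the queries would be phrased.

```python
def citation_queries(name: str, city: str, phone: str, address: str) -> list[str]:
    """Build exact-match search queries for citation discovery."""
    return [
        f'"{name}" {city}',  # business name plus city
        f'"{phone}"',        # phone number search
        f'"{address}"',      # address search
    ]
```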
### 4. GBP Review
Manual checklist:
- Profile completeness
- Category accuracy
- Photo presence
- Review responses
- Post recency
### 5. Schema Check
Validate LocalBusiness markup presence and accuracy.
## GBP Optimization Checklist
- [ ] Business name matches website
- [ ] Complete address with suite/unit
- [ ] Local phone number (not toll-free)
- [ ] Accurate business hours
- [ ] Primary + secondary categories set
- [ ] Business description complete
- [ ] 10+ photos uploaded
- [ ] Recent post (within 7 days)
- [ ] Reviews responded to
## Citation Priority
| Platform | Priority |
|----------|----------|
| Google Business Profile | Critical |
| Apple Maps | High |
| Bing Places | High |
| Yelp | High |
| Facebook | Medium |
| Industry directories | Medium |
## Output Format
```markdown
## Local SEO Audit: [Business]
### NAP Consistency: X/10
| Source | Name | Address | Phone |
|--------|------|---------|-------|
| Website | ✓/✗ | ✓/✗ | ✓/✗ |
| GBP | ✓/✗ | ✓/✗ | ✓/✗ |
### GBP Score: X/10
[Checklist results]
### Citations Found: X
- Consistent: X
- Inconsistent: X
### LocalBusiness Schema
- Present: Yes/No
- Valid: Yes/No
### Priority Actions
1. [Fix recommendations]
```
## Limitations
- GBP data requires manual access
- Citation discovery limited to searchable sources
- Cannot update external directories

View File

@@ -0,0 +1,113 @@
# CLAUDE.md
## Overview
Structured data validator: extract, parse, and validate JSON-LD, Microdata, and RDFa markup against schema.org vocabulary.
## Quick Start
```bash
pip install -r scripts/requirements.txt
python scripts/schema_validator.py --url https://example.com
```
## Scripts
| Script | Purpose |
|--------|---------|
| `schema_validator.py` | Extract and validate structured data |
| `base_client.py` | Shared utilities |
## Usage
```bash
# Validate page schema
python scripts/schema_validator.py --url https://example.com
# JSON output
python scripts/schema_validator.py --url https://example.com --json
# Validate local file
python scripts/schema_validator.py --file schema.json
# Check Rich Results eligibility
python scripts/schema_validator.py --url https://example.com --rich-results
```
## Supported Formats
| Format | Detection |
|--------|-----------|
| JSON-LD | `<script type="application/ld+json">` |
| Microdata | `itemscope`, `itemtype`, `itemprop` |
| RDFa | `vocab`, `typeof`, `property` |
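
To illustrate the JSON-LD detection rule in the table, here is a standard-library sketch that collects every `<script type="application/ld+json">` block from an HTML string. The dependency list suggests the actual script relies on `extruct`; the class and function names below are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect parsed JSON-LD blocks and record blocks that fail to parse."""

    def __init__(self) -> None:
        super().__init__()
        self._in_jsonld = False
        self._buffer: list[str] = []
        self.blocks: list[dict | list] = []
        self.errors: list[str] = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError as exc:
                # Syntax-level failure: keep the message for the report.
                self.errors.append(str(exc))


def extract_jsonld(html: str) -> tuple[list, list[str]]:
    """Return (parsed_blocks, syntax_errors) for one HTML document."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks, parser.errors
```

Microdata and RDFa are attribute-based and need a different traversal, which this sketch does not attempt.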
## Validation Levels
### 1. Syntax Validation
- Valid JSON structure
- Proper nesting
- No syntax errors
### 2. Schema.org Vocabulary
- Valid @type values
- Known properties
- Correct property types
### 3. Google Rich Results
- Required properties present
- Recommended properties
- Feature-specific requirements
## Schema Types Validated
| Type | Required Properties | Rich Result |
|------|---------------------|-------------|
| Article | headline, author, datePublished | Yes |
| Product | name, offers | Yes |
| LocalBusiness | name, address | Yes |
| FAQPage | mainEntity | Yes |
| Organization | name, url | Yes |
| BreadcrumbList | itemListElement | Yes |
| WebSite | name, url | Sitelinks |
## Output
```json
{
"url": "https://example.com",
"schemas_found": 3,
"schemas": [
{
"@type": "Organization",
"valid": true,
"rich_results_eligible": true,
"issues": [],
"warnings": []
}
],
"summary": {
"valid": 3,
"invalid": 0,
"rich_results_eligible": 2
}
}
```
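
The `summary` object in this output can be derived from the per-schema entries. A minimal sketch, assuming each entry carries the `valid` and `rich_results_eligible` flags shown above; the function name is hypothetical.

```python
def summarize(schemas: list[dict]) -> dict[str, int]:
    """Aggregate per-schema flags into the summary counts."""
    valid = sum(1 for schema in schemas if schema.get("valid"))
    eligible = sum(1 for schema in schemas if schema.get("rich_results_eligible"))
    return {
        "valid": valid,
        "invalid": len(schemas) - valid,
        "rich_results_eligible": eligible,
    }
```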
## Issue Severity
| Level | Description |
|-------|-------------|
| Error | Invalid schema, blocks rich results |
| Warning | Missing recommended property |
| Info | Optimization suggestion |
## Dependencies
```
extruct>=0.16.0
jsonschema>=4.21.0
rdflib>=7.0.0
lxml>=5.1.0
requests>=2.31.0
```

View File

@@ -0,0 +1,207 @@
"""
Base Client - Shared async client utilities
===========================================
Purpose: Rate-limited async operations for API clients
Python: 3.10+
"""

import asyncio
import logging
import os
from asyncio import Semaphore
from datetime import datetime
from typing import Any, Callable, TypeVar

from dotenv import load_dotenv
from tenacity import (
    retry,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type,
)

# Load environment variables
load_dotenv()

# Logging setup
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)

T = TypeVar("T")


class RateLimiter:
    """Rate limiter using token bucket algorithm."""

    def __init__(self, rate: float, per: float = 1.0):
        """
        Initialize rate limiter.

        Args:
            rate: Number of requests allowed
            per: Time period in seconds (default: 1 second)
        """
        self.rate = rate
        self.per = per
        self.tokens = rate
        self.last_update = datetime.now()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        """Acquire a token, waiting if necessary."""
        async with self._lock:
            now = datetime.now()
            elapsed = (now - self.last_update).total_seconds()
            self.tokens = min(self.rate, self.tokens + elapsed * (self.rate / self.per))
            self.last_update = now

            if self.tokens < 1:
                wait_time = (1 - self.tokens) * (self.per / self.rate)
                await asyncio.sleep(wait_time)
                # The token earned during the sleep is consumed here, so move
                # the refill clock forward; otherwise the next call would
                # credit the sleep interval a second time.
                self.tokens = 0
                self.last_update = datetime.now()
            else:
                self.tokens -= 1


class BaseAsyncClient:
    """Base class for async API clients with rate limiting."""

    def __init__(
        self,
        max_concurrent: int = 5,
        requests_per_second: float = 3.0,
        logger: logging.Logger | None = None,
    ):
        """
        Initialize base client.

        Args:
            max_concurrent: Maximum concurrent requests
            requests_per_second: Rate limit
            logger: Logger instance
        """
        self.semaphore = Semaphore(max_concurrent)
        self.rate_limiter = RateLimiter(requests_per_second)
        self.logger = logger or logging.getLogger(self.__class__.__name__)
        self.stats = {
            "requests": 0,
            "success": 0,
            "errors": 0,
            "retries": 0,
        }

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10),
        retry=retry_if_exception_type(Exception),
    )
    async def _rate_limited_request(
        self,
        coro: Callable[[], Any],
    ) -> Any:
        """Execute a request with rate limiting and retry."""
        async with self.semaphore:
            await self.rate_limiter.acquire()
            self.stats["requests"] += 1
            try:
                result = await coro()
                self.stats["success"] += 1
                return result
            except Exception as e:
                self.stats["errors"] += 1
                self.logger.error(f"Request failed: {e}")
                raise

    async def batch_requests(
        self,
        requests: list[Callable[[], Any]],
        desc: str = "Processing",
    ) -> list[Any]:
        """Execute multiple requests concurrently."""
        try:
            from tqdm.asyncio import tqdm
            has_tqdm = True
        except ImportError:
            has_tqdm = False

        async def execute(req: Callable) -> Any:
            try:
                return await self._rate_limited_request(req)
            except Exception as e:
                return {"error": str(e)}

        tasks = [execute(req) for req in requests]

        if has_tqdm:
            # Note: as_completed yields results in completion order,
            # not in the order of `requests`; gather below preserves order.
            results = []
            for coro in tqdm.as_completed(tasks, total=len(tasks), desc=desc):
                result = await coro
                results.append(result)
            return results
        else:
            return await asyncio.gather(*tasks, return_exceptions=True)

    def print_stats(self) -> None:
        """Print request statistics."""
        self.logger.info("=" * 40)
        self.logger.info("Request Statistics:")
        self.logger.info(f" Total Requests: {self.stats['requests']}")
        self.logger.info(f" Successful: {self.stats['success']}")
        self.logger.info(f" Errors: {self.stats['errors']}")
        self.logger.info("=" * 40)


class ConfigManager:
    """Manage API configuration and credentials."""

    def __init__(self):
        load_dotenv()

    @property
    def google_credentials_path(self) -> str | None:
        """Get Google service account credentials path."""
        # Prefer SEO-specific credentials, fallback to general credentials
        seo_creds = os.path.expanduser("~/.credential/ourdigital-seo-agent.json")
        if os.path.exists(seo_creds):
            return seo_creds
        return os.getenv("GOOGLE_APPLICATION_CREDENTIALS")

    @property
    def pagespeed_api_key(self) -> str | None:
        """Get PageSpeed Insights API key."""
        return os.getenv("PAGESPEED_API_KEY")

    @property
    def custom_search_api_key(self) -> str | None:
        """Get Custom Search API key."""
        return os.getenv("CUSTOM_SEARCH_API_KEY")

    @property
    def custom_search_engine_id(self) -> str | None:
        """Get Custom Search Engine ID."""
        return os.getenv("CUSTOM_SEARCH_ENGINE_ID")

    @property
    def notion_token(self) -> str | None:
        """Get Notion API token."""
        return os.getenv("NOTION_TOKEN") or os.getenv("NOTION_API_KEY")

    def validate_google_credentials(self) -> bool:
        """Validate Google credentials are configured."""
        creds_path = self.google_credentials_path
        if not creds_path:
            return False
        return os.path.exists(creds_path)

    def get_required(self, key: str) -> str:
        """Get required environment variable or raise error."""
        value = os.getenv(key)
        if not value:
            raise ValueError(f"Missing required environment variable: {key}")
        return value


# Singleton config instance
config = ConfigManager()

View File

@@ -0,0 +1,9 @@
# 13-seo-schema-validator dependencies
extruct>=0.16.0
jsonschema>=4.21.0
rdflib>=7.0.0
lxml>=5.1.0
beautifulsoup4>=4.12.0
requests>=2.31.0
python-dotenv>=1.0.0
rich>=13.7.0

View File

@@ -0,0 +1,110 @@
---
name: seo-schema-validator
version: 1.0.0
description: "Structured data validator for JSON-LD, Microdata, and RDFa. Triggers: validate schema, structured data, JSON-LD, rich results, schema.org."
allowed-tools: mcp__firecrawl__*, mcp__perplexity__*
---
# SEO Schema Validator
## Purpose
Extract and validate structured data (JSON-LD, Microdata, RDFa) against schema.org vocabulary and Google Rich Results requirements.
## Core Capabilities
1. **Extract** - Find all structured data on page
2. **Parse** - JSON-LD, Microdata, RDFa formats
3. **Validate** - Schema.org compliance
4. **Rich Results** - Google eligibility check
## MCP Tool Usage
```
mcp__firecrawl__scrape: Extract page HTML with structured data
mcp__perplexity__search: Research schema requirements
```
## Workflow
1. Scrape target URL
2. Locate structured data blocks
3. Parse each format found
4. Validate against schema.org
5. Check Rich Results eligibility
6. Report issues and recommendations
## Supported Schema Types
| Type | Required Properties | Rich Result |
|------|---------------------|-------------|
| Article | headline, author, datePublished, image | Yes |
| Product | name, offers (price, availability) | Yes |
| LocalBusiness | name, address, telephone | Yes |
| FAQPage | mainEntity (questions) | Yes |
| Organization | name, url, logo | Sitelinks |
| BreadcrumbList | itemListElement | Yes |
| WebSite | name, url, potentialAction | Sitelinks |
| Review | itemReviewed, reviewRating | Yes |
| Event | name, startDate, location | Yes |
| Recipe | name, image, recipeIngredient | Yes |
## Validation Levels
### Level 1: Syntax
- Valid JSON structure
- Proper nesting
- No parsing errors
### Level 2: Vocabulary
- Valid @type values
- Known property names
- Correct value types
### Level 3: Rich Results
- Required properties present
- Recommended properties
- Google-specific requirements
## Output Format
```markdown
## Schema Validation: [URL]
### Schemas Found: X
#### Schema 1: [Type]
- Format: JSON-LD
- Valid: Yes/No
- Rich Results Eligible: Yes/No
**Issues:**
- [Error/Warning list]
**Properties:**
| Property | Present | Valid |
|----------|---------|-------|
### Summary
- Valid: X
- Invalid: X
- Rich Results Ready: X
### Recommendations
1. [Fixes needed]
```
## Common Issues
| Issue | Severity | Fix |
|-------|----------|-----|
| Missing required property | Error | Add property |
| Invalid date format | Error | Use ISO 8601 |
| Missing @context | Error | Add schema.org context |
| No image property | Warning | Add image URL |
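
The "Invalid date format" row can be checked with the standard library. A minimal sketch with a hypothetical function name; it accepts what `datetime.fromisoformat` accepts, which is a subset of ISO 8601 and, on Python 3.10, does not include a trailing `Z`.

```python
from datetime import datetime


def is_iso8601(value: str) -> bool:
    """Return True if the value parses as an ISO 8601 date or datetime.

    Caveat: a trailing "Z" is only accepted by fromisoformat from Python 3.11.
    """
    try:
        datetime.fromisoformat(value)
        return True
    except ValueError:
        return False
```

Values such as `2024-01-15` and `2024-01-15T10:30:00+09:00` pass, while `01/15/2024` is rejected.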
## Limitations
- Cannot validate schema injected by JavaScript at render time
- Validates against schema.org vocabulary; does not cover every Google-specific feature
- Use Google Rich Results Test for final verification

View File

@@ -0,0 +1,121 @@
# CLAUDE.md
## Overview
Schema markup generator: create JSON-LD structured data from templates for various content types.
## Quick Start
```bash
pip install -r scripts/requirements.txt
# Generate Organization schema
python scripts/schema_generator.py --type organization --url https://example.com
# Generate from template
python scripts/schema_generator.py --template templates/article.json --data article_data.json
```
## Scripts
| Script | Purpose |
|--------|---------|
| `schema_generator.py` | Generate schema markup |
| `base_client.py` | Shared utilities |
## Supported Schema Types
| Type | Template | Use Case |
|------|----------|----------|
| Organization | `organization.json` | Company/brand info |
| LocalBusiness | `local_business.json` | Physical locations |
| Article | `article.json` | Blog posts, news |
| Product | `product.json` | E-commerce items |
| FAQPage | `faq.json` | FAQ sections |
| BreadcrumbList | `breadcrumb.json` | Navigation path |
| WebSite | `website.json` | Site-level info |
## Usage Examples
### Organization
```bash
python scripts/schema_generator.py --type organization \
--name "Company Name" \
--url "https://example.com" \
--logo "https://example.com/logo.png"
```
### LocalBusiness
```bash
python scripts/schema_generator.py --type localbusiness \
--name "Restaurant Name" \
--address "123 Main St, City, State 12345" \
--phone "+1-555-123-4567" \
--hours "Mo-Fr 09:00-17:00"
```
### Article
```bash
python scripts/schema_generator.py --type article \
--headline "Article Title" \
--author "Author Name" \
--published "2024-01-15" \
--image "https://example.com/image.jpg"
```
### FAQPage
```bash
python scripts/schema_generator.py --type faq \
--questions questions.json
```
## Output
Generated JSON-LD ready for insertion:
```html
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "Organization",
"name": "Company Name",
"url": "https://example.com",
"logo": "https://example.com/logo.png"
}
</script>
```
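
One way to produce the wrapper shown above from a Python dict is sketched below. The function name is hypothetical and this is not necessarily how `schema_generator.py` emits its output; the `</` escaping is an added precaution so that a string value cannot close the script element early.

```python
import json


def to_script_tag(schema: dict) -> str:
    """Serialize a schema dict as an embeddable JSON-LD script element."""
    payload = json.dumps(schema, indent=2, ensure_ascii=False)
    # "\/" is a valid JSON escape for "/", so the parsed data is unchanged.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'
```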
## Template Customization
Templates in `templates/` can be modified. Required fields are marked:
```json
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "{{REQUIRED}}",
"author": {
"@type": "Person",
"name": "{{REQUIRED}}"
},
"datePublished": "{{REQUIRED}}",
"image": "{{RECOMMENDED}}"
}
```
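
A sketch of how leftover `{{REQUIRED}}` markers might be detected after a template has been filled, assuming the filled template is loaded as nested dicts and lists. The function name and the dotted-path format are hypothetical.

```python
REQUIRED_MARKER = "{{REQUIRED}}"


def unfilled_required(node, path: str = "") -> list[str]:
    """Return dotted paths of values still set to the required marker."""
    missing: list[str] = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            missing += unfilled_required(value, child)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            missing += unfilled_required(item, f"{path}[{index}]")
    elif node == REQUIRED_MARKER:
        missing.append(path)
    return missing
```

Fields marked `{{RECOMMENDED}}` are deliberately ignored here; they could be reported as warnings with the same traversal.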
## Validation
Generated schemas are validated before output:
- Syntax correctness
- Required properties present
- Schema.org vocabulary compliance
Use skill 13 (schema-validator) for additional validation.
## Dependencies
```
jsonschema>=4.21.0
requests>=2.31.0
python-dotenv>=1.0.0
```

View File

@@ -0,0 +1,207 @@
"""
Base Client - Shared async client utilities
===========================================
Purpose: Rate-limited async operations for API clients
Python: 3.10+
"""

import asyncio
import logging
import os
from asyncio import Semaphore
from datetime import datetime
from typing import Any, Callable, TypeVar

from dotenv import load_dotenv
from tenacity import (
    retry,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type,
)

# Load environment variables
load_dotenv()

# Logging setup
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)

T = TypeVar("T")


class RateLimiter:
    """Rate limiter using token bucket algorithm."""

    def __init__(self, rate: float, per: float = 1.0):
        """
        Initialize rate limiter.

        Args:
            rate: Number of requests allowed
            per: Time period in seconds (default: 1 second)
        """
        self.rate = rate
        self.per = per
        self.tokens = rate
        self.last_update = datetime.now()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        """Acquire a token, waiting if necessary."""
        async with self._lock:
            now = datetime.now()
            elapsed = (now - self.last_update).total_seconds()
            self.tokens = min(self.rate, self.tokens + elapsed * (self.rate / self.per))
            self.last_update = now

            if self.tokens < 1:
                wait_time = (1 - self.tokens) * (self.per / self.rate)
                await asyncio.sleep(wait_time)
                # The token earned during the sleep is consumed here, so move
                # the refill clock forward; otherwise the next call would
                # credit the sleep interval a second time.
                self.tokens = 0
                self.last_update = datetime.now()
            else:
                self.tokens -= 1


class BaseAsyncClient:
    """Base class for async API clients with rate limiting."""

    def __init__(
        self,
        max_concurrent: int = 5,
        requests_per_second: float = 3.0,
        logger: logging.Logger | None = None,
    ):
        """
        Initialize base client.

        Args:
            max_concurrent: Maximum concurrent requests
            requests_per_second: Rate limit
            logger: Logger instance
        """
        self.semaphore = Semaphore(max_concurrent)
        self.rate_limiter = RateLimiter(requests_per_second)
        self.logger = logger or logging.getLogger(self.__class__.__name__)
        self.stats = {
            "requests": 0,
            "success": 0,
            "errors": 0,
            "retries": 0,
        }

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10),
        retry=retry_if_exception_type(Exception),
    )
    async def _rate_limited_request(
        self,
        coro: Callable[[], Any],
    ) -> Any:
        """Execute a request with rate limiting and retry."""
        async with self.semaphore:
            await self.rate_limiter.acquire()
            self.stats["requests"] += 1
            try:
                result = await coro()
                self.stats["success"] += 1
                return result
            except Exception as e:
                self.stats["errors"] += 1
                self.logger.error(f"Request failed: {e}")
                raise

    async def batch_requests(
        self,
        requests: list[Callable[[], Any]],
        desc: str = "Processing",
    ) -> list[Any]:
        """Execute multiple requests concurrently."""
        try:
            from tqdm.asyncio import tqdm
            has_tqdm = True
        except ImportError:
            has_tqdm = False

        async def execute(req: Callable) -> Any:
            try:
                return await self._rate_limited_request(req)
            except Exception as e:
                return {"error": str(e)}

        tasks = [execute(req) for req in requests]

        if has_tqdm:
            # Note: as_completed yields results in completion order,
            # not in the order of `requests`; gather below preserves order.
            results = []
            for coro in tqdm.as_completed(tasks, total=len(tasks), desc=desc):
                result = await coro
                results.append(result)
            return results
        else:
            return await asyncio.gather(*tasks, return_exceptions=True)

    def print_stats(self) -> None:
        """Print request statistics."""
        self.logger.info("=" * 40)
        self.logger.info("Request Statistics:")
        self.logger.info(f" Total Requests: {self.stats['requests']}")
        self.logger.info(f" Successful: {self.stats['success']}")
        self.logger.info(f" Errors: {self.stats['errors']}")
        self.logger.info("=" * 40)


class ConfigManager:
    """Manage API configuration and credentials."""

    def __init__(self):
        load_dotenv()

    @property
    def google_credentials_path(self) -> str | None:
        """Get Google service account credentials path."""
        # Prefer SEO-specific credentials, fallback to general credentials
        seo_creds = os.path.expanduser("~/.credential/ourdigital-seo-agent.json")
        if os.path.exists(seo_creds):
            return seo_creds
        return os.getenv("GOOGLE_APPLICATION_CREDENTIALS")

    @property
    def pagespeed_api_key(self) -> str | None:
        """Get PageSpeed Insights API key."""
        return os.getenv("PAGESPEED_API_KEY")

    @property
    def custom_search_api_key(self) -> str | None:
        """Get Custom Search API key."""
        return os.getenv("CUSTOM_SEARCH_API_KEY")

    @property
    def custom_search_engine_id(self) -> str | None:
        """Get Custom Search Engine ID."""
        return os.getenv("CUSTOM_SEARCH_ENGINE_ID")

    @property
    def notion_token(self) -> str | None:
        """Get Notion API token."""
        return os.getenv("NOTION_TOKEN") or os.getenv("NOTION_API_KEY")

    def validate_google_credentials(self) -> bool:
        """Validate Google credentials are configured."""
        creds_path = self.google_credentials_path
        if not creds_path:
            return False
        return os.path.exists(creds_path)

    def get_required(self, key: str) -> str:
        """Get required environment variable or raise error."""
        value = os.getenv(key)
        if not value:
            raise ValueError(f"Missing required environment variable: {key}")
        return value


# Singleton config instance
config = ConfigManager()

View File

@@ -0,0 +1,6 @@
# 14-seo-schema-generator dependencies
jsonschema>=4.21.0
requests>=2.31.0
python-dotenv>=1.0.0
rich>=13.7.0
typer>=0.9.0

View File

@@ -0,0 +1,146 @@
---
name: seo-schema-generator
version: 1.0.0
description: "Schema markup generator for JSON-LD structured data. Triggers: generate schema, create JSON-LD, add structured data, schema markup."
allowed-tools: mcp__firecrawl__*, mcp__perplexity__*
---
# SEO Schema Generator
## Purpose
Generate JSON-LD structured data markup for various content types using templates.
## Core Capabilities
1. **Organization** - Company/brand information
2. **LocalBusiness** - Physical location businesses
3. **Article** - Blog posts and news articles
4. **Product** - E-commerce products
5. **FAQPage** - FAQ sections
6. **BreadcrumbList** - Navigation breadcrumbs
7. **WebSite** - Site-level with search action
## Workflow
1. Identify content type
2. Gather required information
3. Generate JSON-LD from template
4. Validate output
5. Provide implementation instructions
## Schema Templates
### Organization
```json
{
"@context": "https://schema.org",
"@type": "Organization",
"name": "[Company Name]",
"url": "[Website URL]",
"logo": "[Logo URL]",
"sameAs": [
"[Social Media URLs]"
]
}
```
### LocalBusiness
```json
{
"@context": "https://schema.org",
"@type": "LocalBusiness",
"name": "[Business Name]",
"address": {
"@type": "PostalAddress",
"streetAddress": "[Street]",
"addressLocality": "[City]",
"addressRegion": "[State]",
"postalCode": "[ZIP]",
"addressCountry": "[Country]"
},
"telephone": "[Phone]",
"openingHours": ["Mo-Fr 09:00-17:00"]
}
```
### Article
```json
{
"@context": "https://schema.org",
"@type": "Article",
"headline": "[Title]",
"author": {
"@type": "Person",
"name": "[Author Name]"
},
"datePublished": "[YYYY-MM-DD]",
"dateModified": "[YYYY-MM-DD]",
"image": "[Image URL]",
"publisher": {
"@type": "Organization",
"name": "[Publisher]",
"logo": "[Logo URL]"
}
}
```
### FAQPage
```json
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "[Question]",
"acceptedAnswer": {
"@type": "Answer",
"text": "[Answer]"
}
}
]
}
```
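
The FAQPage template above repeats one `Question` object per entry, so it can be built from a list of question/answer pairs. A minimal sketch with a hypothetical function name, not the generator script's actual code:

```python
import json


def build_faq_page(pairs: list[tuple[str, str]]) -> dict:
    """Build a FAQPage schema dict from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


if __name__ == "__main__":
    # Example input is made up for illustration.
    faq = build_faq_page([("What are your hours?", "Mo-Fr 09:00-17:00")])
    print(json.dumps(faq, indent=2, ensure_ascii=False))
```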
### Product
```json
{
"@context": "https://schema.org",
"@type": "Product",
"name": "[Product Name]",
"image": "[Image URL]",
"description": "[Description]",
"offers": {
"@type": "Offer",
"price": "[Price]",
"priceCurrency": "[Currency]",
"availability": "https://schema.org/InStock"
}
}
```
## Implementation
Place generated JSON-LD in `<head>` section:
```html
<head>
<script type="application/ld+json">
[Generated Schema Here]
</script>
</head>
```
## Validation
After generating:
1. Use schema validator skill (13) to verify
2. Test with Google Rich Results Test
3. Monitor in Search Console
## Limitations
- Templates cover common types only
- Complex nested schemas may need manual adjustment
- Some Rich Results require additional properties

View File

@@ -0,0 +1,32 @@
{
"@context": "https://schema.org",
"@type": "{{article_type}}",
"headline": "{{headline}}",
"description": "{{description}}",
"image": [
"{{image_url_1}}",
"{{image_url_2}}"
],
"datePublished": "{{date_published}}",
"dateModified": "{{date_modified}}",
"author": {
"@type": "Person",
"name": "{{author_name}}",
"url": "{{author_url}}"
},
"publisher": {
"@type": "Organization",
"name": "{{publisher_name}}",
"logo": {
"@type": "ImageObject",
"url": "{{publisher_logo_url}}"
}
},
"mainEntityOfPage": {
"@type": "WebPage",
"@id": "{{page_url}}"
},
"articleSection": "{{section}}",
"wordCount": "{{word_count}}",
"keywords": "{{keywords}}"
}

View File

@@ -0,0 +1,24 @@
{
"@context": "https://schema.org",
"@type": "BreadcrumbList",
"itemListElement": [
{
"@type": "ListItem",
"position": 1,
"name": "{{level_1_name}}",
"item": "{{level_1_url}}"
},
{
"@type": "ListItem",
"position": 2,
"name": "{{level_2_name}}",
"item": "{{level_2_url}}"
},
{
"@type": "ListItem",
"position": 3,
"name": "{{level_3_name}}",
"item": "{{level_3_url}}"
}
]
}

View File

@@ -0,0 +1,30 @@
{
"@context": "https://schema.org",
"@type": "FAQPage",
"mainEntity": [
{
"@type": "Question",
"name": "{{question_1}}",
"acceptedAnswer": {
"@type": "Answer",
"text": "{{answer_1}}"
}
},
{
"@type": "Question",
"name": "{{question_2}}",
"acceptedAnswer": {
"@type": "Answer",
"text": "{{answer_2}}"
}
},
{
"@type": "Question",
"name": "{{question_3}}",
"acceptedAnswer": {
"@type": "Answer",
"text": "{{answer_3}}"
}
}
]
}

View File

@@ -0,0 +1,47 @@
{
"@context": "https://schema.org",
"@type": "{{business_type}}",
"name": "{{name}}",
"description": "{{description}}",
"url": "{{url}}",
"telephone": "{{phone}}",
"email": "{{email}}",
"image": "{{image_url}}",
"priceRange": "{{price_range}}",
"address": {
"@type": "PostalAddress",
"streetAddress": "{{street_address}}",
"addressLocality": "{{city}}",
"addressRegion": "{{region}}",
"postalCode": "{{postal_code}}",
"addressCountry": "{{country}}"
},
"geo": {
"@type": "GeoCoordinates",
"latitude": "{{latitude}}",
"longitude": "{{longitude}}"
},
"openingHoursSpecification": [
{
"@type": "OpeningHoursSpecification",
"dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
"opens": "{{weekday_opens}}",
"closes": "{{weekday_closes}}"
},
{
"@type": "OpeningHoursSpecification",
"dayOfWeek": ["Saturday", "Sunday"],
"opens": "{{weekend_opens}}",
"closes": "{{weekend_closes}}"
}
],
"aggregateRating": {
"@type": "AggregateRating",
"ratingValue": "{{rating}}",
"reviewCount": "{{review_count}}"
},
"sameAs": [
"{{facebook_url}}",
"{{instagram_url}}"
]
}

View File

@@ -0,0 +1,37 @@
{
"@context": "https://schema.org",
"@type": "Organization",
"name": "{{name}}",
"url": "{{url}}",
"logo": "{{logo_url}}",
"description": "{{description}}",
"foundingDate": "{{founding_date}}",
"founders": [
{
"@type": "Person",
"name": "{{founder_name}}"
}
],
"address": {
"@type": "PostalAddress",
"streetAddress": "{{street_address}}",
"addressLocality": "{{city}}",
"addressRegion": "{{region}}",
"postalCode": "{{postal_code}}",
"addressCountry": "{{country}}"
},
"contactPoint": [
{
"@type": "ContactPoint",
"telephone": "{{phone}}",
"contactType": "customer service",
"availableLanguage": ["Korean", "English"]
}
],
"sameAs": [
"{{facebook_url}}",
"{{twitter_url}}",
"{{linkedin_url}}",
"{{instagram_url}}"
]
}


@@ -0,0 +1,76 @@
{
"@context": "https://schema.org",
"@type": "Product",
"name": "{{name}}",
"description": "{{description}}",
"image": [
"{{image_url_1}}",
"{{image_url_2}}",
"{{image_url_3}}"
],
"sku": "{{sku}}",
"mpn": "{{mpn}}",
"gtin13": "{{gtin13}}",
"brand": {
"@type": "Brand",
"name": "{{brand_name}}"
},
"offers": {
"@type": "Offer",
"url": "{{product_url}}",
"price": "{{price}}",
"priceCurrency": "{{currency}}",
"priceValidUntil": "{{price_valid_until}}",
"availability": "https://schema.org/{{availability}}",
"itemCondition": "https://schema.org/{{condition}}",
"seller": {
"@type": "Organization",
"name": "{{seller_name}}"
},
"shippingDetails": {
"@type": "OfferShippingDetails",
"shippingRate": {
"@type": "MonetaryAmount",
"value": "{{shipping_cost}}",
"currency": "{{currency}}"
},
"deliveryTime": {
"@type": "ShippingDeliveryTime",
"handlingTime": {
"@type": "QuantitativeValue",
"minValue": "{{handling_min_days}}",
"maxValue": "{{handling_max_days}}",
"unitCode": "DAY"
},
"transitTime": {
"@type": "QuantitativeValue",
"minValue": "{{transit_min_days}}",
"maxValue": "{{transit_max_days}}",
"unitCode": "DAY"
}
}
}
},
"aggregateRating": {
"@type": "AggregateRating",
"ratingValue": "{{rating}}",
"reviewCount": "{{review_count}}",
"bestRating": "5",
"worstRating": "1"
},
"review": [
{
"@type": "Review",
"reviewRating": {
"@type": "Rating",
"ratingValue": "{{review_rating}}",
"bestRating": "5"
},
"author": {
"@type": "Person",
"name": "{{reviewer_name}}"
},
"reviewBody": "{{review_text}}"
}
]
}


@@ -0,0 +1,25 @@
{
"@context": "https://schema.org",
"@type": "WebSite",
"name": "{{site_name}}",
"alternateName": "{{alternate_name}}",
"url": "{{url}}",
"description": "{{description}}",
"inLanguage": "{{language}}",
"potentialAction": {
"@type": "SearchAction",
"target": {
"@type": "EntryPoint",
"urlTemplate": "{{search_url_template}}"
},
"query-input": "required name=search_term_string"
},
"publisher": {
"@type": "Organization",
"name": "{{publisher_name}}",
"logo": {
"@type": "ImageObject",
"url": "{{logo_url}}"
}
}
}


@@ -0,0 +1,117 @@
# CLAUDE.md
## Overview
Core Web Vitals analyzer using Google PageSpeed Insights API: LCP, FID, CLS, INP, TTFB, FCP measurement and recommendations.
## Quick Start
```bash
pip install -r scripts/requirements.txt
# Requires API key
export PAGESPEED_API_KEY=your_api_key
python scripts/pagespeed_client.py --url https://example.com
```
## Scripts
| Script | Purpose |
|--------|---------|
| `pagespeed_client.py` | PageSpeed Insights API client |
| `base_client.py` | Shared utilities |
## Usage
```bash
# Mobile analysis (default)
python scripts/pagespeed_client.py --url https://example.com
# Desktop analysis
python scripts/pagespeed_client.py --url https://example.com --strategy desktop
# Both strategies
python scripts/pagespeed_client.py --url https://example.com --strategy both
# JSON output
python scripts/pagespeed_client.py --url https://example.com --json
# Batch analysis
python scripts/pagespeed_client.py --urls urls.txt --output results.json
```
## Core Web Vitals Metrics
| Metric | Good | Needs Improvement | Poor |
|--------|------|-------------------|------|
| LCP (Largest Contentful Paint) | ≤2.5s | 2.5s-4s | >4s |
| FID (First Input Delay; replaced by INP as a Core Web Vital in March 2024) | ≤100ms | 100ms-300ms | >300ms |
| CLS (Cumulative Layout Shift) | ≤0.1 | 0.1-0.25 | >0.25 |
| INP (Interaction to Next Paint) | ≤200ms | 200ms-500ms | >500ms |
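
The thresholds above map directly onto a small classifier. The following is a minimal sketch, assuming values arrive in the table's units (seconds for LCP, milliseconds for FID and INP, unitless for CLS). `THRESHOLDS` and `rate_metric` are hypothetical names for illustration and are not part of `pagespeed_client.py`.

```python
# Hypothetical helper for illustration; not part of pagespeed_client.py.
# (good_max, poor_min) per metric, in the units used by the table above.
THRESHOLDS = {
    "lcp": (2.5, 4.0),    # seconds
    "fid": (100, 300),    # milliseconds
    "cls": (0.1, 0.25),   # unitless
    "inp": (200, 500),    # milliseconds
}


def rate_metric(name: str, value: float) -> str:
    """Return 'good', 'needs-improvement', or 'poor' for a metric value."""
    good_max, poor_min = THRESHOLDS[name.lower()]
    if value <= good_max:
        return "good"
    if value <= poor_min:
        return "needs-improvement"
    return "poor"


print(rate_metric("lcp", 2.1))
print(rate_metric("cls", 0.3))
```

Boundary values are treated as belonging to the better bucket, which matches the `≤` signs in the Good column.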
## Additional Metrics
| Metric | Description |
|--------|-------------|
| TTFB | Time to First Byte |
| FCP | First Contentful Paint |
| SI | Speed Index |
| TBT | Total Blocking Time |
## Output
```json
{
"url": "https://example.com",
"strategy": "mobile",
"score": 85,
"core_web_vitals": {
"lcp": {"value": 2.1, "rating": "good"},
"fid": {"value": 50, "rating": "good"},
"cls": {"value": 0.05, "rating": "good"},
"inp": {"value": 180, "rating": "good"}
},
"opportunities": [
{
"id": "render-blocking-resources",
"title": "Eliminate render-blocking resources",
"savings_ms": 1200
}
],
"diagnostics": []
}
```
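
A report in this shape can be post-processed to list the metrics that need attention. This is a sketch under assumptions: the sample values are invented for illustration, and the rating strings `"poor"` and `"needs-improvement"` are assumed (the example above only shows `"good"`). `failing_metrics` is a hypothetical helper.

```python
import json

# Illustrative report in the shape shown above; the values are made up.
report = json.loads("""
{
  "url": "https://example.com",
  "core_web_vitals": {
    "lcp": {"value": 4.6, "rating": "poor"},
    "cls": {"value": 0.05, "rating": "good"},
    "inp": {"value": 320, "rating": "needs-improvement"}
  }
}
""")


def failing_metrics(report: dict) -> list[str]:
    """Return the names of metrics whose rating is anything other than 'good'."""
    vitals = report.get("core_web_vitals", {})
    return sorted(name for name, data in vitals.items() if data.get("rating") != "good")


print(failing_metrics(report))
```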
## Configuration
Environment variables:
```bash
PAGESPEED_API_KEY=AIza... # Required for higher quotas
GOOGLE_API_KEY=AIza... # Alternative key name
```
## Rate Limits
| Tier | Limit |
|------|-------|
| No API key | 25 queries/day |
| With API key | 25,000 queries/day |
## Common Recommendations
| Issue | Fix |
|-------|-----|
| Large LCP | Optimize images, preload critical resources |
| High CLS | Set image dimensions, avoid injected content |
| Poor INP | Reduce JavaScript, optimize event handlers |
| Slow TTFB | Improve server response, use CDN |
## Dependencies
```
google-api-python-client>=2.100.0
requests>=2.31.0
python-dotenv>=1.0.0
rich>=13.7.0
```


@@ -0,0 +1,207 @@
"""
Base Client - Shared async client utilities
===========================================
Purpose: Rate-limited async operations for API clients
Python: 3.10+
"""

import asyncio
import logging
import os
from asyncio import Semaphore
from datetime import datetime
from typing import Any, Callable, TypeVar

from dotenv import load_dotenv
from tenacity import (
    retry,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type,
)

# Load environment variables
load_dotenv()

# Logging setup
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)

T = TypeVar("T")


class RateLimiter:
    """Rate limiter using token bucket algorithm."""

    def __init__(self, rate: float, per: float = 1.0):
        """
        Initialize rate limiter.

        Args:
            rate: Number of requests allowed
            per: Time period in seconds (default: 1 second)
        """
        self.rate = rate
        self.per = per
        self.tokens = rate
        self.last_update = datetime.now()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        """Acquire a token, waiting if necessary."""
        async with self._lock:
            now = datetime.now()
            elapsed = (now - self.last_update).total_seconds()
            self.tokens = min(self.rate, self.tokens + elapsed * (self.rate / self.per))
            self.last_update = now

            if self.tokens < 1:
                wait_time = (1 - self.tokens) * (self.per / self.rate)
                await asyncio.sleep(wait_time)
                # The sleep paid for this token: restart the refill clock so
                # the waited time is not credited again on the next call.
                self.last_update = datetime.now()
                self.tokens = 0
            else:
                self.tokens -= 1


class BaseAsyncClient:
    """Base class for async API clients with rate limiting."""

    def __init__(
        self,
        max_concurrent: int = 5,
        requests_per_second: float = 3.0,
        logger: logging.Logger | None = None,
    ):
        """
        Initialize base client.

        Args:
            max_concurrent: Maximum concurrent requests
            requests_per_second: Rate limit
            logger: Logger instance
        """
        self.semaphore = Semaphore(max_concurrent)
        self.rate_limiter = RateLimiter(requests_per_second)
        self.logger = logger or logging.getLogger(self.__class__.__name__)
        self.stats = {
            "requests": 0,
            "success": 0,
            "errors": 0,
            "retries": 0,
        }

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10),
        retry=retry_if_exception_type(Exception),
    )
    async def _rate_limited_request(
        self,
        coro: Callable[[], Any],
    ) -> Any:
        """Execute a request with rate limiting and retry."""
        async with self.semaphore:
            await self.rate_limiter.acquire()
            self.stats["requests"] += 1
            try:
                result = await coro()
                self.stats["success"] += 1
                return result
            except Exception as e:
                self.stats["errors"] += 1
                self.logger.error(f"Request failed: {e}")
                raise

    async def batch_requests(
        self,
        requests: list[Callable[[], Any]],
        desc: str = "Processing",
    ) -> list[Any]:
        """Execute multiple requests concurrently."""
        try:
            from tqdm.asyncio import tqdm
            has_tqdm = True
        except ImportError:
            has_tqdm = False

        async def execute(req: Callable) -> Any:
            try:
                return await self._rate_limited_request(req)
            except Exception as e:
                return {"error": str(e)}

        tasks = [execute(req) for req in requests]

        if has_tqdm:
            results = []
            for coro in tqdm.as_completed(tasks, total=len(tasks), desc=desc):
                result = await coro
                results.append(result)
            return results
        else:
            return await asyncio.gather(*tasks, return_exceptions=True)

    def print_stats(self) -> None:
        """Print request statistics."""
        self.logger.info("=" * 40)
        self.logger.info("Request Statistics:")
        self.logger.info(f"  Total Requests: {self.stats['requests']}")
        self.logger.info(f"  Successful: {self.stats['success']}")
        self.logger.info(f"  Errors: {self.stats['errors']}")
        self.logger.info("=" * 40)


class ConfigManager:
    """Manage API configuration and credentials."""

    def __init__(self):
        load_dotenv()

    @property
    def google_credentials_path(self) -> str | None:
        """Get Google service account credentials path."""
        # Prefer SEO-specific credentials, fallback to general credentials
        seo_creds = os.path.expanduser("~/.credential/ourdigital-seo-agent.json")
        if os.path.exists(seo_creds):
            return seo_creds
        return os.getenv("GOOGLE_APPLICATION_CREDENTIALS")

    @property
    def pagespeed_api_key(self) -> str | None:
        """Get PageSpeed Insights API key."""
        return os.getenv("PAGESPEED_API_KEY")

    @property
    def custom_search_api_key(self) -> str | None:
        """Get Custom Search API key."""
        return os.getenv("CUSTOM_SEARCH_API_KEY")

    @property
    def custom_search_engine_id(self) -> str | None:
        """Get Custom Search Engine ID."""
        return os.getenv("CUSTOM_SEARCH_ENGINE_ID")

    @property
    def notion_token(self) -> str | None:
        """Get Notion API token."""
        return os.getenv("NOTION_TOKEN") or os.getenv("NOTION_API_KEY")

    def validate_google_credentials(self) -> bool:
        """Validate Google credentials are configured."""
        creds_path = self.google_credentials_path
        if not creds_path:
            return False
        return os.path.exists(creds_path)

    def get_required(self, key: str) -> str:
        """Get required environment variable or raise error."""
        value = os.getenv(key)
        if not value:
            raise ValueError(f"Missing required environment variable: {key}")
        return value


# Singleton config instance
config = ConfigManager()


@@ -0,0 +1,6 @@
# 15-seo-core-web-vitals dependencies
google-api-python-client>=2.100.0
requests>=2.31.0
python-dotenv>=1.0.0
rich>=13.7.0
typer>=0.9.0
tenacity  # required by base_client.py


@@ -0,0 +1,108 @@
---
name: seo-core-web-vitals
version: 1.0.0
description: Core Web Vitals analyzer for LCP, FID, CLS, INP performance metrics. Triggers: Core Web Vitals, page speed, LCP, CLS, FID, INP, performance.
allowed-tools: mcp__firecrawl__*, mcp__perplexity__*
---
# SEO Core Web Vitals
## Purpose
Analyze Core Web Vitals performance metrics and provide optimization recommendations.
## Core Capabilities
1. **LCP** - Largest Contentful Paint measurement
2. **FID/INP** - Interactivity metrics
3. **CLS** - Cumulative Layout Shift
4. **Recommendations** - Optimization guidance
## Metrics Thresholds
| Metric | Good | Needs Work | Poor |
|--------|------|------------|------|
| LCP | ≤2.5s | 2.5-4s | >4s |
| FID | ≤100ms | 100-300ms | >300ms |
| CLS | ≤0.1 | 0.1-0.25 | >0.25 |
| INP | ≤200ms | 200-500ms | >500ms |
## Data Sources
### Option 1: PageSpeed Insights (Recommended)
Run the test in the external tool and provide the results:
- Visit: https://pagespeed.web.dev/
- Enter URL, run test
- Provide scores to skill
### Option 2: Research Best Practices
```
mcp__perplexity__search: "Core Web Vitals optimization [specific issue]"
```
## Workflow
1. Request PageSpeed Insights data from user
2. Analyze provided metrics
3. Identify failing metrics
4. Research optimization strategies
5. Provide prioritized recommendations
## Common LCP Issues
| Cause | Fix |
|-------|-----|
| Slow server response | Improve TTFB, use CDN |
| Render-blocking resources | Defer non-critical CSS/JS |
| Slow resource load | Preload LCP image |
| Client-side rendering | Use SSR/SSG |
## Common CLS Issues
| Cause | Fix |
|-------|-----|
| Images without dimensions | Add width/height attributes |
| Ads/embeds without space | Reserve space with CSS |
| Web fonts causing FOIT/FOUT | Use font-display: swap |
| Dynamic content injection | Reserve space, use transforms |
## Common INP Issues
| Cause | Fix |
|-------|-----|
| Long JavaScript tasks | Break up tasks, use web workers |
| Large DOM size | Reduce DOM nodes |
| Heavy event handlers | Debounce, optimize listeners |
| Third-party scripts | Defer, lazy load |
## Output Format
```markdown
## Core Web Vitals: [URL]
### Scores
| Metric | Mobile | Desktop | Status |
|--------|--------|---------|--------|
| LCP | Xs | Xs | Good/Poor |
| FID | Xms | Xms | Good/Poor |
| CLS | X.XX | X.XX | Good/Poor |
| INP | Xms | Xms | Good/Poor |
### Overall Score
- Mobile: X/100
- Desktop: X/100
### Priority Fixes
1. [Highest impact recommendation]
2. [Second priority]
### Detailed Recommendations
[Per-metric optimization steps]
```
## Limitations
- Requires external PageSpeed Insights data
- Lab data may differ from field data
- Some fixes require developer implementation
- Third-party scripts may be difficult to optimize


@@ -0,0 +1,122 @@
# CLAUDE.md
## Overview
Google Search Console data retriever: search analytics (rankings, CTR, impressions), sitemap status, and index coverage.
## Quick Start
```bash
pip install -r scripts/requirements.txt
# Requires service account credentials
# ~/.credential/ourdigital-seo-agent.json
python scripts/gsc_client.py --site sc-domain:example.com --action summary
```
## Scripts
| Script | Purpose |
|--------|---------|
| `gsc_client.py` | Search Console API client |
| `base_client.py` | Shared utilities |
## Configuration
Service account setup:
```bash
# Credentials file location
~/.credential/ourdigital-seo-agent.json
# Add service account email to GSC property as user
ourdigital-seo-agent@ourdigital-insights.iam.gserviceaccount.com
```
## Usage
```bash
# Performance summary (last 28 days)
python scripts/gsc_client.py --site sc-domain:example.com --action summary
# Query-level data
python scripts/gsc_client.py --site sc-domain:example.com --action queries --limit 100
# Page-level data
python scripts/gsc_client.py --site sc-domain:example.com --action pages
# Custom date range
python scripts/gsc_client.py --site sc-domain:example.com --action queries \
--start 2024-01-01 --end 2024-01-31
# Sitemap status
python scripts/gsc_client.py --site sc-domain:example.com --action sitemaps
# JSON output
python scripts/gsc_client.py --site sc-domain:example.com --action summary --json
```
## Actions
| Action | Description |
|--------|-------------|
| `summary` | Overview metrics (clicks, impressions, CTR, position) |
| `queries` | Top search queries |
| `pages` | Top pages by clicks |
| `sitemaps` | Sitemap submission status |
| `coverage` | Index coverage issues |
## Output: Summary
```json
{
"site": "sc-domain:example.com",
"date_range": "2024-01-01 to 2024-01-28",
"totals": {
"clicks": 15000,
"impressions": 500000,
"ctr": 3.0,
"position": 12.5
}
}
```
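
The `ctr` field in this sample is consistent with clicks divided by impressions, expressed as a percentage. A minimal sketch of that arithmetic follows, assuming the percentage interpretation; `ctr_percent` is a hypothetical helper, not a function from `gsc_client.py`.

```python
def ctr_percent(clicks: int, impressions: int) -> float:
    """Click-through rate as a percentage, rounded to two decimals."""
    if impressions == 0:
        return 0.0  # avoid division by zero for rows with no impressions
    return round(clicks / impressions * 100, 2)


# Totals from the sample summary above: 15,000 clicks on 500,000 impressions.
print(ctr_percent(15000, 500000))
```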
## Output: Queries
```json
{
"queries": [
{
"query": "keyword",
"clicks": 500,
"impressions": 10000,
"ctr": 5.0,
"position": 3.2
}
]
}
```
## Rate Limits
| Limit | Value |
|-------|-------|
| Queries per minute | 1,200 |
| Rows per request | 25,000 |
## Site Property Formats
| Format | Example |
|--------|---------|
| Domain property | `sc-domain:example.com` |
| URL prefix | `https://www.example.com/` |
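
The two formats can be told apart before a request is sent. The sketch below is an assumption about how such a check might look; `property_kind` is a hypothetical helper and is not taken from `gsc_client.py`.

```python
from urllib.parse import urlparse


def property_kind(site: str) -> str:
    """Classify a property string as 'domain', 'url-prefix', or 'invalid'."""
    prefix = "sc-domain:"
    if site.startswith(prefix):
        # A domain property needs a non-empty domain after the prefix.
        return "domain" if site[len(prefix):] else "invalid"
    parsed = urlparse(site)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url-prefix"
    return "invalid"


print(property_kind("sc-domain:example.com"))
print(property_kind("https://www.example.com/"))
```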
## Dependencies
```
google-api-python-client>=2.100.0
google-auth>=2.23.0
python-dotenv>=1.0.0
rich>=13.7.0
pandas>=2.1.0
```


@@ -0,0 +1,207 @@
"""
Base Client - Shared async client utilities
===========================================
Purpose: Rate-limited async operations for API clients
Python: 3.10+
"""

import asyncio
import logging
import os
from asyncio import Semaphore
from datetime import datetime
from typing import Any, Callable, TypeVar

from dotenv import load_dotenv
from tenacity import (
    retry,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type,
)

# Load environment variables
load_dotenv()

# Logging setup
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)

T = TypeVar("T")


class RateLimiter:
    """Rate limiter using token bucket algorithm."""

    def __init__(self, rate: float, per: float = 1.0):
        """
        Initialize rate limiter.

        Args:
            rate: Number of requests allowed
            per: Time period in seconds (default: 1 second)
        """
        self.rate = rate
        self.per = per
        self.tokens = rate
        self.last_update = datetime.now()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        """Acquire a token, waiting if necessary."""
        async with self._lock:
            now = datetime.now()
            elapsed = (now - self.last_update).total_seconds()
            self.tokens = min(self.rate, self.tokens + elapsed * (self.rate / self.per))
            self.last_update = now

            if self.tokens < 1:
                wait_time = (1 - self.tokens) * (self.per / self.rate)
                await asyncio.sleep(wait_time)
                # The sleep paid for this token: restart the refill clock so
                # the waited time is not credited again on the next call.
                self.last_update = datetime.now()
                self.tokens = 0
            else:
                self.tokens -= 1


class BaseAsyncClient:
    """Base class for async API clients with rate limiting."""

    def __init__(
        self,
        max_concurrent: int = 5,
        requests_per_second: float = 3.0,
        logger: logging.Logger | None = None,
    ):
        """
        Initialize base client.

        Args:
            max_concurrent: Maximum concurrent requests
            requests_per_second: Rate limit
            logger: Logger instance
        """
        self.semaphore = Semaphore(max_concurrent)
        self.rate_limiter = RateLimiter(requests_per_second)
        self.logger = logger or logging.getLogger(self.__class__.__name__)
        self.stats = {
            "requests": 0,
            "success": 0,
            "errors": 0,
            "retries": 0,
        }

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10),
        retry=retry_if_exception_type(Exception),
    )
    async def _rate_limited_request(
        self,
        coro: Callable[[], Any],
    ) -> Any:
        """Execute a request with rate limiting and retry."""
        async with self.semaphore:
            await self.rate_limiter.acquire()
            self.stats["requests"] += 1
            try:
                result = await coro()
                self.stats["success"] += 1
                return result
            except Exception as e:
                self.stats["errors"] += 1
                self.logger.error(f"Request failed: {e}")
                raise

    async def batch_requests(
        self,
        requests: list[Callable[[], Any]],
        desc: str = "Processing",
    ) -> list[Any]:
        """Execute multiple requests concurrently."""
        try:
            from tqdm.asyncio import tqdm
            has_tqdm = True
        except ImportError:
            has_tqdm = False

        async def execute(req: Callable) -> Any:
            try:
                return await self._rate_limited_request(req)
            except Exception as e:
                return {"error": str(e)}

        tasks = [execute(req) for req in requests]

        if has_tqdm:
            results = []
            for coro in tqdm.as_completed(tasks, total=len(tasks), desc=desc):
                result = await coro
                results.append(result)
            return results
        else:
            return await asyncio.gather(*tasks, return_exceptions=True)

    def print_stats(self) -> None:
        """Print request statistics."""
        self.logger.info("=" * 40)
        self.logger.info("Request Statistics:")
        self.logger.info(f"  Total Requests: {self.stats['requests']}")
        self.logger.info(f"  Successful: {self.stats['success']}")
        self.logger.info(f"  Errors: {self.stats['errors']}")
        self.logger.info("=" * 40)


class ConfigManager:
    """Manage API configuration and credentials."""

    def __init__(self):
        load_dotenv()

    @property
    def google_credentials_path(self) -> str | None:
        """Get Google service account credentials path."""
        # Prefer SEO-specific credentials, fallback to general credentials
        seo_creds = os.path.expanduser("~/.credential/ourdigital-seo-agent.json")
        if os.path.exists(seo_creds):
            return seo_creds
        return os.getenv("GOOGLE_APPLICATION_CREDENTIALS")

    @property
    def pagespeed_api_key(self) -> str | None:
        """Get PageSpeed Insights API key."""
        return os.getenv("PAGESPEED_API_KEY")

    @property
    def custom_search_api_key(self) -> str | None:
        """Get Custom Search API key."""
        return os.getenv("CUSTOM_SEARCH_API_KEY")

    @property
    def custom_search_engine_id(self) -> str | None:
        """Get Custom Search Engine ID."""
        return os.getenv("CUSTOM_SEARCH_ENGINE_ID")

    @property
    def notion_token(self) -> str | None:
        """Get Notion API token."""
        return os.getenv("NOTION_TOKEN") or os.getenv("NOTION_API_KEY")

    def validate_google_credentials(self) -> bool:
        """Validate Google credentials are configured."""
        creds_path = self.google_credentials_path
        if not creds_path:
            return False
        return os.path.exists(creds_path)

    def get_required(self, key: str) -> str:
        """Get required environment variable or raise error."""
        value = os.getenv(key)
        if not value:
            raise ValueError(f"Missing required environment variable: {key}")
        return value


# Singleton config instance
config = ConfigManager()


@@ -0,0 +1,7 @@
# 16-seo-search-console dependencies
google-api-python-client>=2.100.0
google-auth>=2.23.0
pandas>=2.1.0
python-dotenv>=1.0.0
rich>=13.7.0
typer>=0.9.0
tenacity  # required by base_client.py


@@ -0,0 +1,117 @@
---
name: seo-search-console
version: 1.0.0
description: Google Search Console data analyzer for rankings, CTR, impressions, and index coverage. Triggers: Search Console, GSC, rankings, search performance, impressions, CTR.
allowed-tools: mcp__perplexity__*, mcp__notion__*
---
# SEO Search Console
## Purpose
Analyze Google Search Console data: search performance (queries, pages, CTR, position), sitemap status, and index coverage.
## Core Capabilities
1. **Performance Analysis** - Clicks, impressions, CTR, position
2. **Query Analysis** - Top search queries
3. **Page Performance** - Best/worst performing pages
4. **Index Coverage** - Crawl and index issues
5. **Sitemap Status** - Submission and processing
## Data Collection
### Option 1: User Provides Data
Request GSC export from user:
1. Go to Search Console > Performance
2. Export data (CSV or Google Sheets)
3. Share with assistant
### Option 2: User Describes Data
User verbally provides:
- Top queries and positions
- CTR trends
- Coverage issues
## Analysis Framework
### Performance Metrics
| Metric | What It Measures | Good Benchmark |
|--------|------------------|----------------|
| Clicks | User visits from search | Trending up |
| Impressions | Search appearances | High for target keywords |
| CTR | Click-through rate | 2-5% average |
| Position | Average ranking | <10 for key terms |
### Query Analysis
Identify:
- **Winners** - High position, high CTR
- **Opportunities** - High impressions, low CTR
- **Quick wins** - Position 8-20, low effort to improve
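
These three buckets can be approximated with simple rules over query rows. The sketch below is illustrative only: the position 8-20 range comes from the list above, while the cutoffs for "high position", "high CTR", "high impressions", and "low CTR" are assumptions that should be tuned per site. `classify_query` is a hypothetical helper, and `ctr` is assumed to be a percentage.

```python
def classify_query(row: dict) -> str:
    """Bucket a query row into 'quick win', 'winner', 'opportunity', or 'other'."""
    position = row["position"]
    ctr = row["ctr"]                # assumed to be a percentage, e.g. 5.0
    impressions = row["impressions"]

    if 8 <= position <= 20:         # range taken from the list above
        return "quick win"
    if position <= 5 and ctr >= 5.0:            # assumed cutoffs
        return "winner"
    if impressions >= 1000 and ctr < 2.0:       # assumed cutoffs
        return "opportunity"
    return "other"


rows = [
    {"query": "a", "impressions": 10000, "ctr": 5.0, "position": 3.2},
    {"query": "b", "impressions": 20000, "ctr": 0.8, "position": 1.5},
    {"query": "c", "impressions": 400, "ctr": 1.0, "position": 12.0},
]
for row in rows:
    print(row["query"], classify_query(row))
```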
### Page Analysis
Categorize:
- **Top performers** - High clicks, good CTR
- **Underperformers** - High impressions, low CTR
- **Declining** - Down vs previous period
## Workflow
1. Collect GSC data from user
2. Analyze performance trends
3. Identify top queries and pages
4. Find optimization opportunities
5. Check for coverage issues
6. Provide actionable recommendations
## Output Format
```markdown
## Search Console Analysis: [Site]
### Overview (Last 28 Days)
| Metric | Value | vs Previous |
|--------|-------|-------------|
| Clicks | X | +X% |
| Impressions | X | +X% |
| CTR | X% | +X% |
| Position | X | +X |
### Top Queries
| Query | Clicks | Position | Opportunity |
|-------|--------|----------|-------------|
### Top Pages
| Page | Clicks | CTR | Status |
|------|--------|-----|--------|
### Opportunities
1. [Query with high impressions, low CTR]
2. [Page ranking 8-20 that can improve]
### Issues
- [Coverage problems]
- [Sitemap issues]
### Recommendations
1. [Priority action]
```
## Common Issues
| Issue | Impact | Fix |
|-------|--------|-----|
| Low CTR on high-impression query | Lost traffic | Improve title/description |
| Declining positions | Traffic loss | Update content, build links |
| Not indexed pages | No visibility | Fix crawl issues |
| Sitemap errors | Discovery problems | Fix sitemap XML |
## Limitations
- Requires user to provide GSC data
- API access needs service account setup
- Data has 2-3 day delay
- Limited to verified properties


@@ -0,0 +1,65 @@
# CLAUDE.md
## Overview
SEO gateway page strategist for Korean medical/service websites. Creates keyword strategies, content architecture, and technical SEO plans.
## Quick Start
```bash
pip install -r scripts/requirements.txt
# Keyword analysis
python scripts/keyword_analyzer.py --topic "눈 성형" --market "강남"
```
## Scripts
| Script | Purpose |
|--------|---------|
| `keyword_analyzer.py` | Analyze keywords, search volume, competitor gaps |
## Keyword Analyzer
```bash
# Basic analysis
python scripts/keyword_analyzer.py --topic "눈 성형"
# With location targeting
python scripts/keyword_analyzer.py --topic "눈 성형" --market "강남" --output strategy.json
# Competitor analysis
python scripts/keyword_analyzer.py --topic "눈 성형" --competitors url1,url2
```
## Output
Generates strategic document with:
- Primary keyword + monthly search volume
- LSI keywords (7-10)
- User intent distribution
- Competitor gap analysis
- Content architecture (H1-H3 structure)
- Technical SEO checklist
## Templates
See `templates/` for:
- `keyword-research-template.md`
- `content-architecture-template.md`
- `seo-checklist-template.md`
## Workflow
1. Run keyword analyzer for target topic
2. Review search volume and intent data
3. Use output to plan content architecture
4. Hand off to `18-seo-gateway-builder` for content generation
## Configuration
```bash
# Optional: API keys for enhanced data
GOOGLE_API_KEY=xxx
NAVER_API_KEY=xxx
```


@@ -281,20 +281,38 @@ Generated: {datetime.now().strftime('%Y-%m-%d %H:%M')}
 def main():
     """Main execution function"""
-    import sys
-    if len(sys.argv) < 2:
-        print("Usage: python keyword_analyzer.py '키워드'")
-        print("Example: python keyword_analyzer.py '눈 성형'")
-        sys.exit(1)
-    keyword = ' '.join(sys.argv[1:])
+    import argparse
+
+    parser = argparse.ArgumentParser(
+        description='Analyze keywords for SEO gateway page strategy',
+        formatter_class=argparse.RawDescriptionHelpFormatter,
+        epilog='''
+Examples:
+    python keyword_analyzer.py --topic "눈 성형"
+    python keyword_analyzer.py --topic "이마 성형" --market "강남"
+    python keyword_analyzer.py --topic "동안 성형" --output strategy.json
+        '''
+    )
+    parser.add_argument('--topic', '-t', required=True,
+                        help='Primary keyword to analyze (e.g., "눈 성형")')
+    parser.add_argument('--market', '-m', default=None,
+                        help='Target market/location (e.g., "강남")')
+    parser.add_argument('--output', '-o', default=None,
+                        help='Output JSON file path')
+    parser.add_argument('--competitors', '-c', default=None,
+                        help='Comma-separated competitor URLs for analysis')
+
+    args = parser.parse_args()
+
+    keyword = args.topic
+    if args.market:
+        keyword = f"{args.market} {args.topic}"
 
     print(f"Analyzing keyword: {keyword}")
     print("-" * 50)
 
     analyzer = KeywordAnalyzer(keyword)
 
     # Run analysis
     analyzer.analyze_primary_keyword()
     analyzer.generate_lsi_keywords()
@@ -302,13 +320,13 @@ def main():
     analyzer.generate_question_keywords()
     analyzer.calculate_intent_distribution()
     analyzer.generate_recommendations()
 
     # Generate and print report
     report = analyzer.generate_report()
     print(report)
 
     # Export to JSON
-    filename = analyzer.export_analysis()
+    filename = analyzer.export_analysis(args.output)
     print(f"\nAnalysis exported to: {filename}")


@@ -0,0 +1,160 @@
# Content Architecture Template
## Page Hierarchy Structure
```
[Page URL: /service-name]
├── H1: [Primary Keyword-Optimized Headline]
│ Example: "강남 눈 성형 전문의가 만드는 자연스러운 눈매"
│ Length Target: 15-25 characters
│ Keyword Placement: Primary keyword at beginning
├── Hero Section [Above Fold]
│ ├── Value Proposition (30-50 words)
│ │ └── Keywords: Primary + 1 LSI
│ ├── Trust Signals (3-5 items)
│ │ ├── Certification badges
│ │ ├── Years of experience
│ │ └── Success cases number
│ └── Primary CTA
│ └── Text: "무료 상담 신청하기"
├── H2: [Service Name] 이란? [Problem/Solution Framework]
│ Word Count: 200-300 words
│ Keywords: Primary (1x), LSI (2-3x)
│ ├── H3: 이런 고민이 있으신가요? [Pain Points]
│ │ ├── Pain point 1 (include LSI keyword)
│ │ ├── Pain point 2 (include LSI keyword)
│ │ └── Pain point 3 (include LSI keyword)
│ └── H3: [Clinic Name]의 솔루션 [Benefits]
│ ├── Benefit 1 (address pain point 1)
│ ├── Benefit 2 (address pain point 2)
│ └── Benefit 3 (address pain point 3)
├── H2: [Service Name] 종류 및 방법 [Service Categories]
│ Word Count: 400-500 words total
│ Keywords: Category-specific LSI keywords
│ ├── H3: [Sub-service 1] - [LSI Keyword Variation]
│ │ ├── Description (80-100 words)
│ │ ├── Best for (target audience)
│ │ ├── Duration & Recovery
│ │ └── CTA: "자세히 보기"
│ ├── H3: [Sub-service 2] - [LSI Keyword Variation]
│ │ └── [Same structure as above]
│ └── H3: [Sub-service 3] - [LSI Keyword Variation]
│ └── [Same structure as above]
├── H2: [Clinic Name] [Service Name]만의 차별점 [Trust & Authority]
│ Word Count: 300-400 words
│ Keywords: Brand + Primary keyword combinations
│ ├── H3: 전문 의료진 [Doctor Credentials]
│ │ ├── Doctor profile summary
│ │ ├── Specializations
│ │ └── Certifications
│ ├── H3: 검증된 시술 결과 [Success Metrics]
│ │ ├── Number statistics
│ │ ├── Success rate
│ │ └── Patient satisfaction
│ └── H3: 첨단 장비 및 시설 [Facilities]
│ ├── Equipment descriptions
│ └── Safety protocols
├── H2: [Service Name] 자주 묻는 질문 [FAQ Section]
│ Word Count: 500-700 words
│ Keywords: Long-tail question keywords
│ ├── Q1: [Long-tail keyword as question]?
│ │ └── A: [40-60 word answer, keyword in first sentence]
│ ├── Q2: [Price-related question]?
│ │ └── A: [Include "비용" LSI keyword]
│ ├── Q3: [Recovery-related question]?
│ │ └── A: [Include "회복기간" LSI keyword]
│ ├── Q4: [Side-effect question]?
│ │ └── A: [Include "부작용" LSI keyword]
│ ├── Q5: [Process question]?
│ │ └── A: [Include process-related LSI]
│ ├── Q6: [Candidacy question]?
│ │ └── A: [Include target audience keywords]
│ └── Q7: [Results duration question]?
│ └── A: [Include maintenance keywords]
├── H2: [Service Name] 시술 과정 [Process Guide]
│ Word Count: 300-400 words
│ Keywords: "과정", "단계", procedural LSI
│ ├── H3: 상담 및 검사 [Consultation]
│ ├── H3: 시술 당일 [Procedure Day]
│ ├── H3: 회복 과정 [Recovery]
│ └── H3: 사후 관리 [Aftercare]
├── H2: 실제 고객 후기 [Social Proof]
│ Word Count: 200-300 words
│ Keywords: "후기", "리뷰", satisfaction keywords
│ ├── Review snippet 1
│ ├── Review snippet 2
│ ├── Review snippet 3
│ └── Before/After gallery teaser
└── H2: 상담 예약 안내 [Conversion Section]
Word Count: 150-200 words
Keywords: CTA-related, location keywords
├── H3: 상담 예약 방법
├── H3: 오시는 길
└── H3: 문의 정보
```
## Keyword Density Map
| Section | Primary Keyword | LSI Keywords | Total Keywords |
|---------|----------------|--------------|----------------|
| Hero | 1 | 1-2 | 2-3 |
| Problem/Solution | 1 | 2-3 | 3-4 |
| Service Categories | 1-2 | 4-6 | 5-8 |
| Trust & Authority | 1 | 2-3 | 3-4 |
| FAQ | 2-3 | 5-7 | 7-10 |
| Process | 1 | 2-3 | 3-4 |
| Social Proof | 0-1 | 1-2 | 1-3 |
| Conversion | 1 | 1-2 | 2-3 |
| **Total** | **8-11** | **18-28** | **26-39** |
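
The Total row can be recomputed from the per-section ranges, which is a quick way to keep it in sync when a section's target changes. A minimal sketch, with `DENSITY_MAP` and `column_total` as hypothetical names and the ranges copied from the rows above:

```python
# (min, max) keyword occurrences per section, copied from the table above:
# section -> (primary keyword range, LSI keyword range)
DENSITY_MAP = {
    "Hero": ((1, 1), (1, 2)),
    "Problem/Solution": ((1, 1), (2, 3)),
    "Service Categories": ((1, 2), (4, 6)),
    "Trust & Authority": ((1, 1), (2, 3)),
    "FAQ": ((2, 3), (5, 7)),
    "Process": ((1, 1), (2, 3)),
    "Social Proof": ((0, 1), (1, 2)),
    "Conversion": ((1, 1), (1, 2)),
}


def column_total(column: int) -> tuple[int, int]:
    """Sum the lower and upper bounds of one column across all sections."""
    low = sum(ranges[column][0] for ranges in DENSITY_MAP.values())
    high = sum(ranges[column][1] for ranges in DENSITY_MAP.values())
    return low, high


primary = column_total(0)
lsi = column_total(1)
combined = (primary[0] + lsi[0], primary[1] + lsi[1])
print(f"Primary: {primary[0]}-{primary[1]}")
print(f"LSI: {lsi[0]}-{lsi[1]}")
print(f"Total: {combined[0]}-{combined[1]}")
```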
## Internal Linking Strategy
| From Section | To Page | Anchor Text | Purpose |
|-------------|---------|-------------|---------|
| Service Categories | Sub-service page | [Sub-service name] | Deep dive |
| FAQ | Price page | "비용 안내 페이지" | Conversion |
| Trust section | Doctor profile | "[Doctor name] 원장" | Authority |
| Process section | Consultation form | "상담 예약하기" | Conversion |
| Social proof | Gallery page | "더 많은 전후 사진" | Engagement |
## Content Length Guidelines
- **Total Page Length**: 2,000-2,500 words
- **Above Fold Content**: 100-150 words
- **Each H2 Section**: 200-500 words
- **Each H3 Subsection**: 80-150 words
- **Meta Description**: 150-160 characters
- **Image Alt Text**: 10-15 words each
## Schema Markup Requirements
```json
{
"@context": "https://schema.org",
"@type": "MedicalProcedure",
"name": "[Service Name]",
"description": "[Meta description]",
"procedureType": "Cosmetic",
"provider": {
"@type": "MedicalOrganization",
"name": "[Clinic Name]"
}
}
```
## Mobile Content Adaptation
- Reduce hero text by 30%
- Show 3 FAQs initially (expand for more)
- Simplify navigation to single-column
- Increase CTA button size
- Compress trust signals to carousel


@@ -0,0 +1,95 @@
# Keyword Research Template
## Primary Keyword Analysis
| Metric | Value | Notes |
|--------|-------|-------|
| **Primary Keyword** | [KEYWORD] | Main target keyword |
| **Monthly Search Volume** | [VOLUME] | Average monthly searches |
| **Keyword Difficulty** | [0-100] | Competition score |
| **Current Ranking** | #[POSITION] | Current SERP position |
| **Search Trend** | ↑ ↓ → | Trending direction |
## LSI Keywords Matrix
| LSI Keyword | Search Volume | Intent Type | Priority |
|------------|--------------|-------------|----------|
| [keyword 1] | [volume] | Informational | High |
| [keyword 2] | [volume] | Transactional | Medium |
| [keyword 3] | [volume] | Comparative | High |
| [keyword 4] | [volume] | Informational | Medium |
| [keyword 5] | [volume] | Transactional | Low |
| [keyword 6] | [volume] | Comparative | High |
| [keyword 7] | [volume] | Informational | Medium |
| [keyword 8] | [volume] | Navigational | Low |
| [keyword 9] | [volume] | Transactional | High |
| [keyword 10] | [volume] | Informational | Medium |
## User Intent Distribution
```
Informational (Research Phase): ___%
  - Common queries: "what is", "how to", "benefits of"
  - Content needed: Educational guides, FAQs, process explanations
Comparative (Evaluation Phase): ___%
  - Common queries: "best", "vs", "reviews", "비교" (comparison)
  - Content needed: Comparison tables, reviews, case studies
Transactional (Ready to Convert): ___%
  - Common queries: "price", "book", "consultation", "예약" (booking)
  - Content needed: CTAs, pricing, booking forms
```
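One way to fill in the percentages above is to bucket a list of search queries by the modifiers listed under each intent. This is a rough sketch, assuming naive substring matching; `INTENT_MODIFIERS`, `classify_intent`, and `intent_distribution` are hypothetical names, and the sample queries are made up.

```python
from collections import Counter

# Modifier lists taken from the "Common queries" lines above.
# Order matters: the first intent with a matching modifier wins.
INTENT_MODIFIERS = {
    "transactional": ["price", "book", "consultation", "예약"],
    "comparative": ["best", "vs", "reviews", "비교"],
    "informational": ["what is", "how to", "benefits of"],
}


def classify_intent(query: str) -> str:
    """Return the first intent whose modifier appears in the query."""
    q = query.lower()
    for intent, modifiers in INTENT_MODIFIERS.items():
        if any(m in q for m in modifiers):
            return intent
    return "unclassified"


def intent_distribution(queries: list[str]) -> dict[str, float]:
    """Return the percentage of queries per intent, rounded to one decimal."""
    if not queries:
        return {}
    counts = Counter(classify_intent(q) for q in queries)
    return {intent: round(100 * n / len(queries), 1) for intent, n in counts.items()}


# Made-up sample queries, for illustration only.
sample = [
    "what is ptosis surgery",
    "best eye clinic gangnam",
    "eye surgery price",
    "눈 성형 예약",
]
print(intent_distribution(sample))
```

Substring matching will misfire on short modifiers such as "vs" or "book" inside longer words, so treat the result as a starting estimate to review by hand.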
## Long-tail Keyword Opportunities
### Question-based Keywords
- [question keyword 1]
- [question keyword 2]
- [question keyword 3]
### Location-based Keywords
- [location] + [primary keyword]
- [location] + [primary keyword] + 잘하는곳 ("good place for")
- [location] + [primary keyword] + 추천 ("recommended")
### Modifier-based Keywords
- [primary keyword] + 비용 (cost)
- [primary keyword] + 부작용 (side effects)
- [primary keyword] + 회복기간 (recovery period)
- [primary keyword] + 전후 (before and after)
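The location-based and modifier-based patterns above are simple combinations, so candidate keywords can be enumerated. A minimal sketch, assuming space-joined phrases; the function names are hypothetical and the sample location and keyword are illustrative.

```python
from itertools import product

# Suffixes and modifiers copied from the lists above; "" means no suffix.
LOCATION_SUFFIXES = ["", "잘하는곳", "추천"]
MODIFIERS = ["비용", "부작용", "회복기간", "전후"]


def location_keywords(locations: list[str], primary_keyword: str) -> list[str]:
    """Build '[location] [primary keyword] [suffix]' for every combination."""
    return [
        " ".join(part for part in (loc, primary_keyword, suffix) if part)
        for loc, suffix in product(locations, LOCATION_SUFFIXES)
    ]


def modifier_keywords(primary_keyword: str) -> list[str]:
    """Build '[primary keyword] [modifier]' for every modifier."""
    return [f"{primary_keyword} {m}" for m in MODIFIERS]


# Illustrative inputs only.
print(location_keywords(["강남"], "눈 성형"))
print(modifier_keywords("눈 성형"))
```

The generated phrases are candidates only; search volume still has to be looked up in a keyword tool before any of them is prioritized.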
## Competitor Keyword Analysis
| Competitor | Target Keywords | Ranking Keywords | Gap Opportunities |
|------------|----------------|------------------|-------------------|
| Competitor 1 | [keywords] | [keywords] | [missing keywords] |
| Competitor 2 | [keywords] | [keywords] | [missing keywords] |
| Competitor 3 | [keywords] | [keywords] | [missing keywords] |
## Seasonal Trends
| Month | Search Volume | Events/Factors |
|-------|--------------|----------------|
| January | [volume] | New year resolutions |
| February | [volume] | [factor] |
| March | [volume] | [factor] |
| ... | ... | ... |
## Platform-Specific Keywords
### Naver-Optimized
- [Naver-specific keyword 1]
- [Naver-specific keyword 2]
### Google-Optimized
- [Google-specific keyword 1]
- [Google-specific keyword 2]
## Action Items
- [ ] Target primary keyword in H1 and title tag
- [ ] Include 3-5 LSI keywords naturally in content
- [ ] Create content matching user intent distribution
- [ ] Optimize for question-based featured snippets
- [ ] Add location modifiers for local SEO


@@ -0,0 +1,239 @@
# SEO Technical Checklist Template
## Meta Tags Optimization
### Title Tag
- [ ] Length: 50-60 characters
- [ ] Primary keyword at beginning
- [ ] Brand name at end
- [ ] Unique for each page
- [ ] Formula: `[Primary Keyword] - [Value Proposition] | [Brand]`
**Template**: `{primary_keyword} 전문 - {unique_value} | {clinic_name}`
**Example**: `눈 성형 전문 - 자연스러운 라인 | 제이미클리닉` ("Eye surgery specialists - natural-looking lines | Jamie Clinic")
### Meta Description
- [ ] Length: 150-160 characters
- [ ] Include primary keyword
- [ ] Include 1-2 LSI keywords
- [ ] Clear CTA
- [ ] Unique for each page
**Template**: `{location} {primary_keyword} 전문의가 {benefit}. {credential}. 무료상담 ☎ {phone}`
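**Example**: `강남 눈 성형 전문의가 자연스러운 눈매를 디자인합니다. 15년 경력, 10,000건 시술. 무료상담 ☎ 02-1234-5678` ("Gangnam eye surgery specialists design natural-looking eyes. 15 years of experience, 10,000 procedures. Free consultation ☎ 02-1234-5678")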
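The title and meta description templates above map directly onto string formatting. The sketch below assumes length is measured with `len()` (Unicode code points); `build_title`, `build_meta_description`, and `length_in_range` are hypothetical helpers, not functions from the repository.

```python
TITLE_TEMPLATE = "{primary_keyword} 전문 - {unique_value} | {clinic_name}"
META_TEMPLATE = (
    "{location} {primary_keyword} 전문의가 {benefit}. {credential}. "
    "무료상담 ☎ {phone}"
)


def build_title(primary_keyword: str, unique_value: str, clinic_name: str) -> str:
    """Fill the title tag template."""
    return TITLE_TEMPLATE.format(
        primary_keyword=primary_keyword,
        unique_value=unique_value,
        clinic_name=clinic_name,
    )


def build_meta_description(
    location: str, primary_keyword: str, benefit: str, credential: str, phone: str
) -> str:
    """Fill the meta description template."""
    return META_TEMPLATE.format(
        location=location,
        primary_keyword=primary_keyword,
        benefit=benefit,
        credential=credential,
        phone=phone,
    )


def length_in_range(text: str, low: int, high: int) -> bool:
    """Check a character-count limit such as 50-60 or 150-160."""
    return low <= len(text) <= high


title = build_title("눈 성형", "자연스러운 라인", "제이미클리닉")
print(title, len(title), length_in_range(title, 50, 60))
```

Korean titles built from this template tend to come out much shorter in characters than the 50-60 range, so it may make sense to run both the general limit and any platform-specific limit on the same string.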
### Open Graph Tags
```html
<meta property="og:title" content="{page_title}">
<meta property="og:description" content="{meta_description}">
<meta property="og:image" content="{featured_image_url}">
<meta property="og:url" content="{page_url}">
<meta property="og:type" content="website">
<meta property="og:locale" content="ko_KR">
```
## Header Tags Structure
- [ ] Only one H1 per page
- [ ] H1 contains primary keyword
- [ ] H2 tags for main sections (5-7)
- [ ] H3 tags for subsections
- [ ] Logical hierarchy maintained
- [ ] Keywords distributed naturally
## Content Optimization
### Keyword Density
- [ ] Primary keyword: 2-3% (20-30 times per 1000 words)
- [ ] LSI keywords: 1-2% each
- [ ] Natural placement (no stuffing)
- [ ] Synonyms and variations used
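The density figure above is occurrences divided by total words, so 20-30 occurrences per 1,000 words corresponds to 2-3%. A small sketch of that calculation, assuming whitespace-separated word counts and case-insensitive phrase matching; `keyword_density` and `within_target` are hypothetical names.

```python
import re


def keyword_density(text: str, keyword: str) -> float:
    """Return occurrences of `keyword` as a percentage of total words."""
    words = text.split()
    if not words:
        return 0.0
    # A multi-word keyword counts once per occurrence of the whole phrase.
    occurrences = len(re.findall(re.escape(keyword.lower()), text.lower()))
    return 100.0 * occurrences / len(words)


def within_target(density: float, low: float = 2.0, high: float = 3.0) -> bool:
    """Check the 2-3% primary keyword target; pass 1.0 and 2.0 for LSI keywords."""
    return low <= density <= high


# Synthetic text: 5 occurrences of the phrase in 200 words.
text = "eye surgery " * 5 + "filler " * 190
density = keyword_density(text, "eye surgery")
print(density, within_target(density))
```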
### Content Structure
- [ ] First 100 words include primary keyword
- [ ] Short paragraphs (3-4 sentences)
- [ ] Bullet points and lists
- [ ] Bold important keywords (sparingly)
- [ ] Internal links: 5-10
- [ ] External links: 2-3 (authoritative)
## Schema Markup
### Medical Procedure Schema
```json
{
  "@context": "https://schema.org",
  "@type": "MedicalProcedure",
  "name": "{procedure_name}",
  "procedureType": "Cosmetic",
  "bodyLocation": "{body_part}",
  "outcome": "{expected_outcome}",
  "preparation": "{preparation_required}",
  "followup": "{followup_care}",
  "provider": {
    "@type": "MedicalOrganization",
    "name": "{clinic_name}",
    "address": {
      "@type": "PostalAddress",
      "streetAddress": "{street}",
      "addressLocality": "{city}",
      "addressCountry": "KR"
    }
  }
}
```
### FAQ Schema
```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "{question}",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "{answer}"
    }
  }]
}
```
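Since `mainEntity` is a list, the FAQ schema above can be generated from question/answer pairs rather than filled in by hand. A minimal sketch using only the standard library; `build_faq_schema` is a hypothetical helper and the sample pair is a placeholder.

```python
import json


def build_faq_schema(faqs: list[tuple[str, str]]) -> dict:
    """Build a FAQPage JSON-LD object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }


# Placeholder content, for illustration only.
schema = build_faq_schema([("{question}", "{answer}")])
# ensure_ascii=False keeps Korean text readable in the output.
print(json.dumps(schema, ensure_ascii=False, indent=2))
```

Serializing with `json.dumps` also handles quote and newline escaping, which is easy to get wrong when pasting answers into the template manually.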
## Image Optimization
- [ ] Descriptive file names: `eye-surgery-before-after-case1.jpg`
- [ ] Alt text with keywords: `눈 성형 전후 사진 - 30대 여성 사례` ("eye surgery before-and-after photo - case of a woman in her 30s")
- [ ] Compressed file size (< 200KB)
- [ ] WebP format with fallback
- [ ] Lazy loading implemented
- [ ] Image sitemap created
## Performance Optimization
### Page Speed
- [ ] Load time < 3 seconds
- [ ] First Contentful Paint < 1.8s
- [ ] Time to Interactive < 3.8s
- [ ] Total page size < 3MB
- [ ] Requests minimized (< 50)
### Core Web Vitals
- [ ] LCP (Largest Contentful Paint) < 2.5s
- [ ] INP (Interaction to Next Paint) < 200ms (replaced FID as a Core Web Vital in March 2024)
- [ ] CLS (Cumulative Layout Shift) < 0.1
## Mobile Optimization
- [ ] Mobile-responsive design
- [ ] Viewport meta tag set
- [ ] Touch-friendly buttons (44x44px minimum)
- [ ] Readable font size (16px minimum)
- [ ] No horizontal scrolling
- [ ] Mobile page speed < 3s
## URL Structure
- [ ] SEO-friendly URL: `/eye-surgery` or `/눈-성형`
- [ ] No special characters
- [ ] Lowercase only
- [ ] Hyphens for word separation
- [ ] Under 60 characters
- [ ] Include primary keyword
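The URL rules above can be expressed as a small validator. This is a sketch under stated assumptions: the 60-character limit is applied to the path including its leading slash, and Hangul syllables are allowed because `/눈-성형` appears as a valid example; `check_url_path` is a hypothetical name.

```python
import re

# Lowercase ASCII letters, digits, and Hangul syllables, separated by single hyphens.
SLUG_PATTERN = re.compile(r"^/[a-z0-9가-힣]+(?:-[a-z0-9가-힣]+)*$")


def check_url_path(path: str, primary_keyword_slug: str) -> list[str]:
    """Return a list of rule violations; an empty list means the path passes."""
    problems = []
    if path != path.lower():
        problems.append("contains uppercase letters")
    if not SLUG_PATTERN.match(path.lower()):
        problems.append("contains characters other than letters, digits, and hyphens")
    if len(path) >= 60:
        problems.append("is 60 characters or longer")
    if primary_keyword_slug not in path.lower():
        problems.append("does not include the primary keyword")
    return problems


print(check_url_path("/eye-surgery", "eye-surgery"))
print(check_url_path("/Eye_Surgery", "eye-surgery"))
```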
## Internal Linking
| From Page | To Page | Anchor Text | Purpose |
|-----------|---------|-------------|---------|
| Gateway | Service Detail | {service_name} | Deep content |
| Gateway | Doctor Profile | {doctor_name} 원장 ("Director {doctor_name}") | Authority |
| Gateway | Pricing | 비용 안내 ("pricing guide") | Conversion |
| Gateway | Gallery | 시술 전후 사진 ("before-and-after photos") | Engagement |
| Gateway | Contact | 상담 예약 ("book a consultation") | Conversion |
## Naver-Specific Optimization
### Naver Webmaster Tools
- [ ] Site verification complete
- [ ] XML sitemap submitted
- [ ] Robots.txt configured
- [ ] Syndication feed active
- [ ] Site optimization report reviewed
### Naver SEO Elements
- [ ] Title under 30 Korean characters
- [ ] C-Rank tags implemented
- [ ] Image-to-text ratio optimized (40:60)
- [ ] Outbound links minimized
- [ ] Brand search optimization
## Tracking & Analytics
- [ ] Google Analytics 4 installed
- [ ] Google Search Console verified
- [ ] Naver Analytics installed
- [ ] Conversion tracking configured
- [ ] Event tracking for CTAs
- [ ] Heatmap tool installed
## Security & Technical
- [ ] SSL certificate active (HTTPS)
- [ ] WWW/non-WWW redirect configured
- [ ] 404 error page customized
- [ ] XML sitemap generated
- [ ] Robots.txt optimized
- [ ] Canonical URLs set
- [ ] Hreflang tags (if multi-language)
## Quality Checks
### Content Quality
- [ ] No spelling/grammar errors
- [ ] Medical information accurate
- [ ] Legal compliance verified
- [ ] Contact information correct
- [ ] CTAs working properly
### Cross-browser Testing
- [ ] Chrome (Desktop/Mobile)
- [ ] Safari (Desktop/Mobile)
- [ ] Firefox
- [ ] Samsung Internet
- [ ] Naver Whale
## Monthly Monitoring Tasks
- [ ] Keyword ranking check
- [ ] Organic traffic analysis
- [ ] Bounce rate monitoring
- [ ] Conversion rate tracking
- [ ] Competitor analysis
- [ ] Content freshness update
- [ ] Broken link check
- [ ] Page speed test
## Priority Levels
1. **Critical (Day 1)**
   - Title and meta tags
   - H1 optimization
   - Mobile responsiveness
   - Page speed < 4s
2. **High (Week 1)**
   - Schema markup
   - Internal linking
   - Image optimization
   - Content optimization
3. **Medium (Week 2-3)**
   - Naver optimization
   - FAQ implementation
   - Social proof elements
   - Analytics setup
4. **Low (Month 2)**
   - A/B testing
   - Advanced schema
   - Link building
   - Content expansion


@@ -0,0 +1,82 @@
# CLAUDE.md
## Overview
Gateway page content generator for local services. Creates SEO-optimized pages from location/service configurations.
## Quick Start
```bash
# Generate pages from config
python scripts/generate_pages.py --config config/services.json --locations config/locations.json
```
## Scripts
| Script | Purpose |
|--------|---------|
| `generate_pages.py` | Generate gateway pages from templates |
## Page Generator
```bash
# Generate all combinations
python scripts/generate_pages.py \
  --config config/services.json \
  --locations config/locations.json \
  --output ./pages

# Single service/location
python scripts/generate_pages.py \
  --service "laser_hair_removal" \
  --location "gangnam" \
  --template templates/gateway-page-medical.md
```
## Configuration Files
### services.json
```json
{
  "services": [
    {
      "id": "laser_hair_removal",
      "korean": "레이저 제모",
      "keywords": ["laser hair removal", "permanent hair removal"]
    }
  ]
}
```
### locations.json
```json
{
  "locations": [
    {
      "id": "gangnam",
      "korean": "강남",
      "full_address": "서울특별시 강남구"
    }
  ]
}
```
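"Generate all combinations" amounts to a Cartesian product of the two config files. The sketch below shows that pairing with inline copies of the example configs; it is not the code in `generate_pages.py`, and `iter_page_specs` and the `filename` pattern are assumptions made for illustration.

```python
import json
from itertools import product

# Inline copies of the two example configs above, so the sketch runs on its own.
SERVICES_JSON = """
{"services": [{"id": "laser_hair_removal", "korean": "레이저 제모",
               "keywords": ["laser hair removal", "permanent hair removal"]}]}
"""
LOCATIONS_JSON = """
{"locations": [{"id": "gangnam", "korean": "강남",
                "full_address": "서울특별시 강남구"}]}
"""


def iter_page_specs(services_cfg: dict, locations_cfg: dict):
    """Yield one page spec per (service, location) pair."""
    for service, location in product(
        services_cfg["services"], locations_cfg["locations"]
    ):
        yield {
            "service_id": service["id"],
            "location_id": location["id"],
            # Hypothetical output name; the real script may name files differently.
            "filename": f"{location['id']}-{service['id']}.md",
            "title_ko": f"{location['korean']} {service['korean']}",
        }


specs = list(iter_page_specs(json.loads(SERVICES_JSON), json.loads(LOCATIONS_JSON)))
print(specs)
```

With S services and L locations this yields S × L specs, which is worth checking before a full run since every spec becomes a page to review.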
## Templates
- `templates/gateway-page-medical.md` - Medical service template
- Supports variables: `{{service}}`, `{{location}}`, `{{brand}}`
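To illustrate how `{{service}}`, `{{location}}`, and `{{brand}}` variables could be filled, here is a standard-library sketch. It is an assumption-laden stand-in, not the generator's actual rendering code (which may use a template engine instead); `render` and the sample values are hypothetical.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template: str, variables: dict) -> str:
    """Replace every {{name}} placeholder, failing loudly on unknown names."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing template variable: {name}")
        return str(variables[name])

    return PLACEHOLDER.sub(replace, template)


# Sample values, for illustration only.
page_title = render(
    "# {{service}} in {{location}} | {{brand}}",
    {"service": "레이저 제모", "location": "강남", "brand": "Example Clinic"},
)
print(page_title)
```

Raising on a missing variable, rather than leaving the placeholder in place, keeps unfilled `{{...}}` markers from reaching published pages.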
## Output
Generates markdown files with:
- SEO-optimized title and meta
- Structured content sections
- Schema markup recommendations
- Internal linking suggestions
## Workflow
1. Get strategy from `17-seo-gateway-architect`
2. Configure services and locations
3. Run generator for content drafts
4. Review and customize output


@@ -52,10 +52,15 @@ class Brand:
 class GatewayPageGenerator:
     """Main class for generating gateway page content"""
-    def __init__(self, brand: Brand, template_path: str = "templates/"):
+    def __init__(self, brand: Brand, template_path: str = None):
         self.brand = brand
-        self.template_path = Path(template_path)
+        # Use script directory as base for template path
+        if template_path is None:
+            script_dir = Path(__file__).parent.parent
+            self.template_path = script_dir / "templates"
+        else:
+            self.template_path = Path(template_path)
         self.generated_pages = []
     def load_template(self, template_name: str) -> str:


@@ -0,0 +1,5 @@
# 18-seo-gateway-builder dependencies
jinja2>=3.1.0
pyyaml>=6.0.0
markdown>=3.5.0
python-dotenv>=1.0.0


@@ -0,0 +1,231 @@
# [Medical Service] in [Location] | [Clinic Name]
<!-- Meta Tags -->
<!--
Title: [Medical Service] in [Location] | Expert Care | [Clinic Name]
Description: Looking for professional [medical service] in [location]? [Clinic Name] offers state-of-the-art [service] with experienced doctors. ✓ Same-day appointments ✓ Insurance accepted ✓ [Unique benefit]
Canonical: https://example.com/[location]/[service-slug]/
-->
## Professional [Medical Service] Available in [Location]
Welcome to [Clinic Name], your trusted provider for [medical service] in [location]. Our medical team brings over [X years] of combined experience, utilizing the latest medical technology to ensure optimal results for our patients in the [location] area.
### Why Choose [Clinic Name] for [Medical Service] in [Location]?
Located conveniently at [specific address near landmark], our [location] clinic specializes in providing personalized [medical service] treatments tailored to each patient's unique needs. We understand the specific health concerns of [location] residents and have designed our services accordingly.
**Our [Location] Advantages:**
- 🏥 Modern facility equipped with latest [equipment type]
- 👨‍⚕️ Board-certified specialists with [certification details]
- 📍 Easy access from [nearby subway/bus stations]
- 🕐 Extended hours to accommodate busy [location] professionals
- 💳 Accept major insurance plans popular in [location]
## Understanding [Medical Service]
### What Is [Medical Service]?
[Detailed medical explanation of the service, including scientific background, FDA approvals if applicable, and medical benefits. This section should be educational while remaining accessible.]
### Who Can Benefit from [Medical Service]?
Our [medical service] treatment in [location] is ideal for patients experiencing:
- [Condition 1 with brief explanation]
- [Condition 2 with brief explanation]
- [Condition 3 with brief explanation]
- [Condition 4 with brief explanation]
## Our [Medical Service] Process in [Location]
### 1. Initial Consultation
Your journey begins with a comprehensive consultation at our [location] clinic. Our specialists will:
- Review your medical history
- Conduct necessary diagnostic tests
- Discuss your treatment goals
- Create a personalized treatment plan
### 2. Treatment Planning
Based on your consultation, we develop a customized approach that considers:
- Your specific medical condition
- Lifestyle factors common to [location] residents
- Insurance coverage options
- Optimal scheduling for your convenience
### 3. Treatment Sessions
Each [medical service] session at our [location] facility typically involves:
- Pre-treatment preparation
- The procedure itself (approximately [duration])
- Post-treatment monitoring
- Detailed aftercare instructions
### 4. Follow-up Care
We provide comprehensive follow-up support including:
- Scheduled check-ups
- 24/7 emergency hotline
- Ongoing treatment adjustments
- Long-term health monitoring
## Expected Results and Recovery
### What to Expect After [Medical Service]
Patients at our [location] clinic typically experience:
- **Immediate effects**: [Description]
- **Short-term (1-2 weeks)**: [Description]
- **Long-term (1-3 months)**: [Description]
- **Final results**: [Timeline and description]
### Recovery Timeline
- Day 1-3: [Recovery details]
- Week 1: [Recovery details]
- Week 2-4: [Recovery details]
- Month 2-3: [Recovery details]
## Safety and Credentials
### Our Medical Standards
[Clinic Name] in [location] maintains the highest medical standards:
- ✓ [Relevant medical certification]
- ✓ [Hospital affiliation if applicable]
- ✓ [Safety protocol certification]
- ✓ [Professional membership]
### Our Medical Team
**Dr. [Name], MD**
- [Medical school]
- [Residency/Fellowship]
- [Years of experience] specializing in [medical service]
- [Special recognition or research]
## Pricing and Insurance
### Insurance Coverage
We accept most major insurance plans used by [location] residents:
- [Insurance provider 1]
- [Insurance provider 2]
- [Insurance provider 3]
- [Insurance provider 4]
### Payment Options
For your convenience, we offer:
- Insurance direct billing
- Flexible payment plans
- Credit card payments
- HSA/FSA acceptance
### Transparent Pricing
Contact us for a detailed quote. Factors affecting cost include:
- Severity of condition
- Number of sessions required
- Insurance coverage level
- Additional treatments needed
## Patient Testimonials from [Location]
> "After struggling with [condition] for years, I finally found relief at [Clinic Name]. The team was professional, and the results exceeded my expectations."
> — [Patient initials], [Location] resident
> "The convenience of having such high-quality [medical service] right here in [location] made all the difference. I no longer have to travel to [other area] for treatment."
> — [Patient initials], [Nearby neighborhood]
> "Dr. [Name] took the time to explain everything thoroughly. I felt confident throughout the entire process."
> — [Patient initials], [Location] professional
## Frequently Asked Questions
### General Questions
**Q: How do I know if [medical service] is right for me?**
A: The best way to determine if you're a candidate is through a consultation at our [location] clinic. We'll evaluate your medical history, current condition, and treatment goals.
**Q: How long does [medical service] take?**
A: Treatment sessions typically last [duration], though your first visit including consultation may take [longer duration].
**Q: Is [medical service] painful?**
A: [Comfort level explanation with pain management options available]
### Location-Specific Questions
**Q: Where exactly is your [location] clinic located?**
A: We're located at [full address], just [distance] from [landmark/station]. [Parking/public transport information].
**Q: Do you have parking available?**
A: Yes, we offer [parking details specific to location].
**Q: What are your hours for the [location] clinic?**
A:
- Monday-Friday: [hours]
- Saturday: [hours]
- Sunday: [hours/closed]
### Insurance and Payment
**Q: Does insurance cover [medical service]?**
A: Coverage varies by plan. Our insurance specialists can verify your benefits before your appointment.
**Q: Do you offer payment plans?**
A: Yes, we offer flexible payment options including [specific plans available].
## Schedule Your [Medical Service] Consultation in [Location]
Ready to take the first step? Contact our [location] clinic today:
### Contact Information
📍 **Address**: [Full address]
📞 **Phone**: [Local phone number]
📧 **Email**: [location]@[clinicname].com
🌐 **Online Booking**: [URL]
### Office Hours
- **Monday-Friday**: [Hours]
- **Saturday**: [Hours]
- **Sunday**: [Hours/Closed]
- **Emergency**: [24/7 hotline if available]
### Getting Here
**By Subway**: [Detailed directions from nearest station]
**By Bus**: [Bus routes and stops]
**By Car**: [Driving directions and parking info]
---
<!-- Schema Markup -->
```json
{
  "@context": "https://schema.org",
  "@type": "MedicalClinic",
  "name": "[Clinic Name] - [Location]",
  "image": "[clinic-image-url]",
  "@id": "[page-url]",
  "url": "[website-url]",
  "telephone": "[phone-number]",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "[street]",
    "addressLocality": "[city]",
    "addressRegion": "[state/province]",
    "postalCode": "[zip]",
    "addressCountry": "KR"
  },
  "geo": {
    "@type": "GeoCoordinates",
    "latitude": [latitude],
    "longitude": [longitude]
  },
  "openingHoursSpecification": {
    "@type": "OpeningHoursSpecification",
    "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    "opens": "09:00",
    "closes": "18:00"
  },
  "medicalSpecialty": "[Medical Specialty]",
  "availableService": {
    "@type": "MedicalProcedure",
    "name": "[Medical Service]",
    "description": "[Service Description]"
  }
}
```
*Last updated: [Date] | [Clinic Name] - Professional [Medical Service] in [Location]*

Some files were not shown because too many files have changed in this diff.