Add SEO skills 33-34 and fix bugs in skills 19-34

New skills:
- Skill 33: Site migration planner with redirect mapping and monitoring
- Skill 34: Reporting dashboard with HTML charts and Korean executive reports

Bug fixes (Skill 34 - report_aggregator.py):
- Add audit_type fallback for skill identification (was only using audit_id prefix)
- Extract health scores from nested data dict (technical_score, onpage_score, etc.)
- Support subdomain matching in domain filter (blog.ourdigital.org matches ourdigital.org)
- Skip self-referencing DASH- aggregated reports
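The subdomain-matching fix above can be sketched roughly as follows. This is an illustrative helper, not the actual code in report_aggregator.py; the function name `domain_matches` is hypothetical.

```python
from urllib.parse import urlparse

def domain_matches(url: str, filter_domain: str) -> bool:
    """True if url's host equals filter_domain or is one of its subdomains,
    so blog.ourdigital.org passes a filter for ourdigital.org."""
    host = urlparse(url).netloc.lower().removeprefix("www.")
    filter_domain = filter_domain.lower().removeprefix("www.")
    return host == filter_domain or host.endswith("." + filter_domain)
```

The suffix check requires a leading dot, so an unrelated domain that merely ends with the same string (e.g. notourdigital.org) does not match.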

Bug fixes (Skill 20 - naver_serp_analyzer.py):
- Remove VIEW tab selectors (Naver retired the VIEW tab in 2026)
- Add new section detectors: books (도서), shortform (숏폼), influencer (인플루언서)

Improvements (Skill 34 - dashboard/executive report):
- Add Korean category labels for Chart.js charts (기술 SEO, 온페이지, etc.)
- Add Korean trend labels (개선 중 ↑, 안정 →, 하락 중 ↓)
- Add English→Korean issue description translation layer (20 common patterns)

Documentation improvements:
- Add Korean triggers to 4 skill descriptions (19, 25, 28, 31)
- Expand Skill 32 SKILL.md from 40→143 lines (was 6/10, added workflow, output format, limitations)
- Add output format examples to Skills 27 and 28 SKILL.md
- Add limitations sections to Skills 27 and 28
- Update README.md, CLAUDE.md, AGENTS.md for skills 33-34

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 00:01:00 +09:00
parent dbfaa883cd
commit d2d0a2d460
37 changed files with 5462 additions and 56 deletions


@@ -0,0 +1,91 @@
# SEO Migration Planner
SEO 사이트 이전 계획 및 모니터링 도구 - 사전 위험 평가, 리디렉트 매핑, 이전 후 트래픽/인덱싱 추적.
## Overview
Pre-migration risk assessment, redirect mapping, URL inventory, crawl baseline capture, and post-migration traffic/indexation monitoring for site migrations. Supports domain moves, platform changes, URL restructuring, HTTPS migrations, and subdomain consolidation.
## Dual-Platform Structure
```
33-seo-migration-planner/
├── code/ # Claude Code version
│ ├── CLAUDE.md # Action-oriented directive
│ ├── commands/
│ │ └── seo-migration-planner.md # Slash command
│ └── scripts/
│ ├── migration_planner.py # Pre-migration planning
│ ├── migration_monitor.py # Post-migration monitoring
│ ├── base_client.py # Shared async utilities
│ └── requirements.txt
├── desktop/ # Claude Desktop version
│ ├── SKILL.md # MCP-based workflow
│ ├── skill.yaml # Extended metadata
│ └── tools/
│ ├── ahrefs.md # Ahrefs MCP tools
│ ├── firecrawl.md # Firecrawl MCP tools
│ └── notion.md # Notion MCP tools
└── README.md
```
## Quick Start
### Claude Code
```bash
/seo-migration-planner https://example.com --type domain-move --new-domain https://new-example.com
```
### Python Script
```bash
pip install -r code/scripts/requirements.txt
# Pre-migration planning
python code/scripts/migration_planner.py --domain https://example.com --type domain-move --new-domain https://new-example.com --json
# Post-migration monitoring
python code/scripts/migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --baseline baseline.json --json
```
## Features
### Pre-Migration Planning
- URL inventory via Firecrawl crawl
- Ahrefs traffic/keyword/backlink baseline
- Per-URL risk scoring (0-100)
- Redirect map generation (301 mappings)
- Type-specific pre-migration checklist (Korean)
### Post-Migration Monitoring
- Pre vs post traffic comparison
- Redirect health check (broken, chains, loops)
- Indexation change tracking
- Keyword ranking monitoring
- Recovery timeline estimation
- Automated alert generation
## Migration Types
| Type | Description |
|------|-------------|
| `domain-move` | Old domain -> new domain |
| `platform` | CMS/framework migration |
| `url-restructure` | Path/slug changes |
| `https` | HTTP -> HTTPS |
| `subdomain` | Subdomain -> subfolder |
## Notion Output
Reports are saved to the OurDigital SEO Audit Log database:
- **Title**: `사이트 이전 계획 - [domain] - YYYY-MM-DD`
- **Database ID**: `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
- **Audit ID Format**: MIGR-YYYYMMDD-NNN
## Triggers
- site migration, domain move, redirect mapping
- platform migration, URL restructuring
- HTTPS migration, subdomain consolidation
- 사이트 이전, 도메인 이전, 리디렉트 매핑


@@ -0,0 +1,150 @@
# CLAUDE.md
## Overview
SEO site migration planning and monitoring tool: pre-migration risk assessment, redirect mapping, URL inventory, crawl baseline capture, and post-migration traffic/indexation monitoring. Supports domain moves, platform changes, URL restructuring, HTTPS migrations, and subdomain consolidation. Captures the full URL inventory via a Firecrawl crawl, builds traffic/keyword baselines via Ahrefs, generates redirect maps with per-URL risk scoring, and tracks post-launch recovery with automated alerts.
## Quick Start
```bash
pip install -r scripts/requirements.txt
# Pre-migration planning
python scripts/migration_planner.py --domain https://example.com --type domain-move --new-domain https://new-example.com --json
# Post-migration monitoring
python scripts/migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --baseline baseline.json --json
```
## Scripts
| Script | Purpose | Key Output |
|--------|---------|------------|
| `migration_planner.py` | Pre-migration baseline + redirect map + risk assessment | URL inventory, redirect map, risk scores, checklist |
| `migration_monitor.py` | Post-migration traffic comparison, redirect health, indexation tracking | Traffic delta, broken redirects, ranking changes, alerts |
| `base_client.py` | Shared utilities | RateLimiter, ConfigManager, BaseAsyncClient |
## Migration Planner
```bash
# Domain move planning
python scripts/migration_planner.py --domain https://example.com --type domain-move --new-domain https://new-example.com --json
# Platform migration (e.g., WordPress to headless)
python scripts/migration_planner.py --domain https://example.com --type platform --json
# URL restructuring
python scripts/migration_planner.py --domain https://example.com --type url-restructure --json
# HTTPS migration
python scripts/migration_planner.py --domain http://example.com --type https --json
# Subdomain consolidation
python scripts/migration_planner.py --domain https://blog.example.com --type subdomain --new-domain https://example.com/blog --json
```
**Capabilities**:
- URL inventory via Firecrawl crawl (capture all URLs + status codes)
- Ahrefs top-pages baseline (traffic, keywords per page)
- Redirect map generation (old URL -> new URL mapping)
- Risk scoring per URL (based on traffic + backlinks + keyword rankings)
- Pre-migration checklist generation
- Support for migration types:
- Domain move (old domain -> new domain)
- Platform change (CMS/framework swap)
- URL restructuring (path/slug changes)
- HTTPS migration (HTTP -> HTTPS)
- Subdomain consolidation (subdomain -> subfolder)
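The per-URL risk scoring described above weighs traffic, backlinks, and keyword rankings. The exact weighting lives in migration_planner.py and is not shown here; the sketch below assumes a simple weighted, normalized 0-100 score as one plausible scheme:

```python
def url_risk_score(traffic: int, referring_domains: int, keywords: int,
                   max_traffic: int, max_rd: int, max_kw: int) -> int:
    """Hypothetical 0-100 risk score: pages with more traffic, more
    referring domains, and more ranking keywords have more to lose."""
    def norm(value: int, ceiling: int) -> float:
        # Scale each metric against the site's maximum, capped at 1.0.
        return min(value / ceiling, 1.0) if ceiling else 0.0
    score = (50 * norm(traffic, max_traffic)
             + 30 * norm(referring_domains, max_rd)
             + 20 * norm(keywords, max_kw))
    return round(score)
```

A page at the site maximum on all three metrics scores 100; a page with no traffic, links, or rankings scores 0.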
## Migration Monitor
```bash
# Post-launch traffic comparison
python scripts/migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --baseline baseline.json --json
# Quick redirect health check
python scripts/migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --json
```
**Capabilities**:
- Post-launch traffic comparison (pre vs post, by page group)
- Redirect chain/loop detection
- 404 monitoring for high-value pages
- Indexation tracking (indexed pages before vs after)
- Ranking change tracking for priority keywords
- Recovery timeline estimation
- Alert generation for traffic drops >20%
## Ahrefs MCP Tools Used
| Tool | Purpose |
|------|---------|
| `site-explorer-metrics` | Current organic metrics (traffic, keywords) |
| `site-explorer-metrics-history` | Historical metrics for pre/post comparison |
| `site-explorer-top-pages` | Top performing pages for baseline |
| `site-explorer-pages-by-traffic` | Pages ranked by traffic for risk scoring |
| `site-explorer-organic-keywords` | Keyword rankings per page |
| `site-explorer-referring-domains` | Referring domains per page for risk scoring |
| `site-explorer-backlinks-stats` | Backlink overview for migration impact |
## Output Format
```json
{
"domain": "example.com",
"migration_type": "domain-move",
"baseline": {
"total_urls": 1250,
"total_traffic": 45000,
"total_keywords": 8500,
"top_pages": []
},
"redirect_map": [
{
"source": "https://example.com/page-1",
"target": "https://new-example.com/page-1",
"status_code": 301,
"priority": "critical"
}
],
"risk_assessment": {
"high_risk_urls": 45,
"medium_risk_urls": 180,
"low_risk_urls": 1025,
"overall_risk": "medium"
},
"pre_migration_checklist": [],
"timestamp": "2025-01-01T00:00:00"
}
```
## Notion Output (Required)
**IMPORTANT**: All audit reports MUST be saved to the OurDigital SEO Audit Log database.
### Database Configuration
| Field | Value |
|-------|-------|
| Database ID | `2c8581e5-8a1e-8035-880b-e38cefc2f3ef` |
| URL | https://www.notion.so/dintelligence/2c8581e58a1e8035880be38cefc2f3ef |
### Required Properties
| Property | Type | Description |
|----------|------|-------------|
| Issue | Title | Report title (Korean + date) |
| Site | URL | Target website URL |
| Category | Select | SEO Migration |
| Priority | Select | Based on risk level |
| Found Date | Date | Report date (YYYY-MM-DD) |
| Audit ID | Rich Text | Format: MIGR-YYYYMMDD-NNN |
### Language Guidelines
- Report content in Korean (한국어)
- Keep technical English terms as-is (e.g., Redirect Map, Risk Score, Traffic Baseline, Indexation)
- URLs and code remain unchanged


@@ -0,0 +1,27 @@
---
name: seo-migration-planner
description: |
SEO site migration planning and monitoring. Pre-migration risk assessment, redirect mapping,
crawl baseline, and post-migration traffic/indexation monitoring.
Triggers: site migration, domain move, redirect mapping, platform migration, URL restructuring, 사이트 이전.
allowed-tools:
- Bash
- Read
- Write
- WebFetch
- WebSearch
---
# SEO Migration Planner
Run the migration planning or monitoring workflow based on the user's request.
## Pre-Migration Planning
```bash
python custom-skills/33-seo-migration-planner/code/scripts/migration_planner.py --domain [URL] --type [TYPE] --json
```
## Post-Migration Monitoring
```bash
python custom-skills/33-seo-migration-planner/code/scripts/migration_monitor.py --domain [URL] --migration-date [DATE] --json
```


@@ -0,0 +1,172 @@
"""
Base Client - Shared async client utilities
===========================================
Purpose: Rate-limited async operations for API clients
Python: 3.10+
"""
import asyncio
import logging
import os
from asyncio import Semaphore
from datetime import datetime
from typing import Any, Callable, TypeVar
from dotenv import load_dotenv
from tenacity import (
retry,
stop_after_attempt,
wait_exponential,
retry_if_exception_type,
)
# Load environment variables
load_dotenv()
# Logging setup
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(levelname)s - %(message)s",
)
T = TypeVar("T")
class RateLimiter:
"""Rate limiter using token bucket algorithm."""
def __init__(self, rate: float, per: float = 1.0):
self.rate = rate
self.per = per
self.tokens = rate
self.last_update = datetime.now()
self._lock = asyncio.Lock()
async def acquire(self) -> None:
async with self._lock:
now = datetime.now()
elapsed = (now - self.last_update).total_seconds()
self.tokens = min(self.rate, self.tokens + elapsed * (self.rate / self.per))
self.last_update = now
if self.tokens < 1:
wait_time = (1 - self.tokens) * (self.per / self.rate)
await asyncio.sleep(wait_time)
self.tokens = 0
else:
self.tokens -= 1
class BaseAsyncClient:
"""Base class for async API clients with rate limiting."""
def __init__(
self,
max_concurrent: int = 5,
requests_per_second: float = 3.0,
logger: logging.Logger | None = None,
):
self.semaphore = Semaphore(max_concurrent)
self.rate_limiter = RateLimiter(requests_per_second)
self.logger = logger or logging.getLogger(self.__class__.__name__)
self.stats = {
"requests": 0,
"success": 0,
"errors": 0,
"retries": 0,
}
@retry(
stop=stop_after_attempt(3),
wait=wait_exponential(multiplier=1, min=2, max=10),
retry=retry_if_exception_type(Exception),
)
async def _rate_limited_request(
self,
coro: Callable[[], Any],
) -> Any:
async with self.semaphore:
await self.rate_limiter.acquire()
self.stats["requests"] += 1
try:
result = await coro()
self.stats["success"] += 1
return result
except Exception as e:
self.stats["errors"] += 1
self.logger.error(f"Request failed: {e}")
raise
async def batch_requests(
self,
requests: list[Callable[[], Any]],
desc: str = "Processing",
) -> list[Any]:
try:
from tqdm.asyncio import tqdm
has_tqdm = True
except ImportError:
has_tqdm = False
async def execute(req: Callable) -> Any:
try:
return await self._rate_limited_request(req)
except Exception as e:
return {"error": str(e)}
tasks = [execute(req) for req in requests]
if has_tqdm:
results = []
for coro in tqdm.as_completed(tasks, total=len(tasks), desc=desc):
result = await coro
results.append(result)
return results
else:
return await asyncio.gather(*tasks, return_exceptions=True)
def print_stats(self) -> None:
self.logger.info("=" * 40)
self.logger.info("Request Statistics:")
self.logger.info(f" Total Requests: {self.stats['requests']}")
self.logger.info(f" Successful: {self.stats['success']}")
self.logger.info(f" Errors: {self.stats['errors']}")
self.logger.info("=" * 40)
class ConfigManager:
"""Manage API configuration and credentials."""
def __init__(self):
load_dotenv()
@property
def google_credentials_path(self) -> str | None:
seo_creds = os.path.expanduser("~/.credential/ourdigital-seo-agent.json")
if os.path.exists(seo_creds):
return seo_creds
return os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
@property
def pagespeed_api_key(self) -> str | None:
return os.getenv("PAGESPEED_API_KEY")
@property
def notion_token(self) -> str | None:
return os.getenv("NOTION_TOKEN") or os.getenv("NOTION_API_KEY")
def validate_google_credentials(self) -> bool:
creds_path = self.google_credentials_path
if not creds_path:
return False
return os.path.exists(creds_path)
def get_required(self, key: str) -> str:
value = os.getenv(key)
if not value:
raise ValueError(f"Missing required environment variable: {key}")
return value
# Singleton config instance
config = ConfigManager()
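A typical consumer subclasses `BaseAsyncClient` and routes calls through `_rate_limited_request`, as `MigrationMonitor` does below. The following is a self-contained usage sketch; it includes a minimal stand-in for `BaseAsyncClient` so it runs on its own, whereas in the repo you would `from base_client import BaseAsyncClient` instead:

```python
import asyncio

# Minimal stand-in so this sketch is self-contained; the real class also
# adds token-bucket rate limiting, tenacity retries, and stats tracking.
class BaseAsyncClient:
    def __init__(self, max_concurrent: int = 5, requests_per_second: float = 3.0):
        self.semaphore = asyncio.Semaphore(max_concurrent)

    async def _rate_limited_request(self, coro):
        async with self.semaphore:
            return await coro()

class DemoClient(BaseAsyncClient):
    async def fetch(self, n: int) -> int:
        # Wrap the awaitable in a zero-arg callable, as the base class expects.
        return await self._rate_limited_request(
            lambda: asyncio.sleep(0, result=n * 2)
        )

async def main() -> list[int]:
    client = DemoClient(max_concurrent=2)
    return list(await asyncio.gather(*(client.fetch(i) for i in range(3))))

results = asyncio.run(main())
```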


@@ -0,0 +1,909 @@
"""
Migration Monitor - Post-Migration Traffic & Indexation Monitoring
==================================================================
Purpose: Post-migration traffic comparison, redirect health checks,
indexation tracking, ranking change monitoring, and alert generation.
Python: 3.10+
Usage:
python migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --baseline baseline.json --json
python migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --json
"""
import argparse
import asyncio
import json
import logging
import sys
from dataclasses import dataclass, field, asdict
from datetime import datetime, timedelta
from typing import Any
from urllib.parse import urlparse
from base_client import BaseAsyncClient, config
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------
@dataclass
class TrafficComparison:
"""Traffic comparison between pre- and post-migration periods."""
page_group: str = ""
pre_traffic: int = 0
post_traffic: int = 0
change_pct: float = 0.0
change_absolute: int = 0
status: str = "stable" # improved / stable / declined / critical
@dataclass
class RedirectHealth:
"""Health status of a single redirect."""
source: str = ""
target: str = ""
status_code: int = 0
chain_length: int = 0
is_broken: bool = False
final_url: str = ""
error: str = ""
@dataclass
class IndexationStatus:
"""Indexation comparison before and after migration."""
pre_count: int = 0
post_count: int = 0
change_pct: float = 0.0
missing_pages: list[str] = field(default_factory=list)
new_pages: list[str] = field(default_factory=list)
deindexed_count: int = 0
@dataclass
class RankingChange:
"""Ranking change for a keyword."""
keyword: str = ""
pre_position: int = 0
post_position: int = 0
change: int = 0
url: str = ""
search_volume: int = 0
@dataclass
class MigrationAlert:
"""Alert for significant post-migration issues."""
alert_type: str = "" # traffic_drop, redirect_broken, indexation_drop, ranking_loss
severity: str = "info" # info / warning / critical
message: str = ""
metric_value: float = 0.0
threshold: float = 0.0
affected_urls: list[str] = field(default_factory=list)
@dataclass
class MigrationReport:
"""Complete post-migration monitoring report."""
domain: str = ""
migration_date: str = ""
days_since_migration: int = 0
traffic_comparison: list[TrafficComparison] = field(default_factory=list)
redirect_health: list[RedirectHealth] = field(default_factory=list)
indexation: IndexationStatus | None = None
ranking_changes: list[RankingChange] = field(default_factory=list)
recovery_estimate: dict[str, Any] = field(default_factory=dict)
alerts: list[MigrationAlert] = field(default_factory=list)
timestamp: str = ""
errors: list[str] = field(default_factory=list)
# ---------------------------------------------------------------------------
# Monitor
# ---------------------------------------------------------------------------
class MigrationMonitor(BaseAsyncClient):
"""Monitors post-migration SEO health using Ahrefs and Firecrawl MCP tools."""
# Alert thresholds
TRAFFIC_DROP_WARNING = 0.20 # 20% drop
TRAFFIC_DROP_CRITICAL = 0.40 # 40% drop
RANKING_DROP_THRESHOLD = 5 # 5+ position drop
INDEXATION_DROP_WARNING = 0.10 # 10% indexation loss
def __init__(self):
super().__init__(max_concurrent=5, requests_per_second=2.0)
    @staticmethod
    def _extract_domain(url: str) -> str:
        """Extract the bare domain from a URL, or return it as-is if already bare."""
        if "://" in url:
            url = urlparse(url).netloc
        # removeprefix only strips a leading "www.", unlike str.replace,
        # which would also mangle hosts containing "www." elsewhere.
        return url.lower().removeprefix("www.")
async def _call_ahrefs(self, tool: str, params: dict[str, Any]) -> dict:
"""Simulate Ahrefs MCP call. In production, routed via MCP bridge."""
self.logger.info(f"Ahrefs MCP call: {tool} | params={params}")
return {"tool": tool, "params": params, "data": {}}
async def _call_firecrawl(self, tool: str, params: dict[str, Any]) -> dict:
"""Simulate Firecrawl MCP call. In production, routed via MCP bridge."""
self.logger.info(f"Firecrawl MCP call: {tool} | params={params}")
return {"tool": tool, "params": params, "data": {}}
# ------------------------------------------------------------------
# Traffic Comparison
# ------------------------------------------------------------------
async def compare_traffic(
self, domain: str, migration_date: str
) -> list[TrafficComparison]:
"""Compare traffic before and after migration date."""
domain = self._extract_domain(domain)
mig_date = datetime.strptime(migration_date, "%Y-%m-%d")
days_since = (datetime.now() - mig_date).days
# Pre-migration period: same duration before migration
pre_start = (mig_date - timedelta(days=max(days_since, 30))).strftime("%Y-%m-%d")
pre_end = (mig_date - timedelta(days=1)).strftime("%Y-%m-%d")
post_start = migration_date
post_end = datetime.now().strftime("%Y-%m-%d")
self.logger.info(
f"Comparing traffic for {domain}: "
f"pre={pre_start}..{pre_end} vs post={post_start}..{post_end}"
)
# Fetch pre-migration metrics history
pre_resp = await self._call_ahrefs(
"site-explorer-metrics-history",
{"target": domain, "date_from": pre_start, "date_to": pre_end},
)
pre_data = pre_resp.get("data", {}).get("data_points", [])
# Fetch post-migration metrics history
post_resp = await self._call_ahrefs(
"site-explorer-metrics-history",
{"target": domain, "date_from": post_start, "date_to": post_end},
)
post_data = post_resp.get("data", {}).get("data_points", [])
# Calculate averages
pre_avg_traffic = 0
if pre_data:
pre_avg_traffic = int(
sum(int(p.get("organic_traffic", 0)) for p in pre_data) / len(pre_data)
)
post_avg_traffic = 0
if post_data:
post_avg_traffic = int(
sum(int(p.get("organic_traffic", 0)) for p in post_data) / len(post_data)
)
# Overall comparison
change_pct = 0.0
if pre_avg_traffic > 0:
change_pct = ((post_avg_traffic - pre_avg_traffic) / pre_avg_traffic) * 100
status = "stable"
if change_pct > 5:
status = "improved"
elif change_pct < -40:
status = "critical"
elif change_pct < -20:
status = "declined"
comparisons = [
TrafficComparison(
page_group="Overall",
pre_traffic=pre_avg_traffic,
post_traffic=post_avg_traffic,
change_pct=round(change_pct, 2),
change_absolute=post_avg_traffic - pre_avg_traffic,
status=status,
)
]
# Fetch top pages comparison
pre_pages_resp = await self._call_ahrefs(
"site-explorer-pages-by-traffic",
{"target": domain, "limit": 50},
)
top_pages = pre_pages_resp.get("data", {}).get("pages", [])
for page in top_pages[:20]:
page_url = page.get("url", "")
page_traffic = int(page.get("traffic", 0))
# In production, would compare with baseline data
comparisons.append(
TrafficComparison(
page_group=page_url,
pre_traffic=0, # Would be populated from baseline
post_traffic=page_traffic,
change_pct=0.0,
change_absolute=0,
status="stable",
)
)
self.logger.info(
f"Traffic comparison for {domain}: "
f"pre={pre_avg_traffic:,} -> post={post_avg_traffic:,} "
f"({change_pct:+.1f}%)"
)
return comparisons
# ------------------------------------------------------------------
# Redirect Health Check
# ------------------------------------------------------------------
async def check_redirects(
self, redirect_map: list[dict[str, str]]
) -> list[RedirectHealth]:
"""Verify redirect health: check for broken redirects, chains, and loops."""
health_results: list[RedirectHealth] = []
self.logger.info(f"Checking {len(redirect_map)} redirects for health...")
for entry in redirect_map:
source = entry.get("source", "")
expected_target = entry.get("target", "")
if not source:
continue
# Use Firecrawl to check the redirect
resp = await self._call_firecrawl(
"firecrawl_scrape",
{"url": source, "formats": ["links"]},
)
result_data = resp.get("data", {})
final_url = result_data.get("final_url", "")
status_code = int(result_data.get("status_code", 0))
redirect_chain = result_data.get("redirect_chain", [])
chain_length = len(redirect_chain)
is_broken = (
status_code >= 400
or status_code == 0
or (final_url and final_url != expected_target and status_code != 301)
)
health = RedirectHealth(
source=source,
target=expected_target,
status_code=status_code,
chain_length=chain_length,
is_broken=is_broken,
final_url=final_url,
error="" if not is_broken else f"Expected {expected_target}, got {final_url} ({status_code})",
)
health_results.append(health)
broken_count = sum(1 for h in health_results if h.is_broken)
chain_count = sum(1 for h in health_results if h.chain_length > 1)
self.logger.info(
f"Redirect health check complete: "
f"{broken_count} broken, {chain_count} chains detected "
f"out of {len(health_results)} redirects"
)
return health_results
# ------------------------------------------------------------------
# Indexation Tracking
# ------------------------------------------------------------------
async def track_indexation(
self, domain: str, pre_baseline: dict[str, Any] | None = None
) -> IndexationStatus:
"""Compare indexed pages before and after migration."""
domain = self._extract_domain(domain)
self.logger.info(f"Tracking indexation for {domain}")
# Fetch current metrics
metrics_resp = await self._call_ahrefs(
"site-explorer-metrics", {"target": domain}
)
current_pages = int(metrics_resp.get("data", {}).get("pages", 0))
# Get pre-migration count from baseline
pre_count = 0
if pre_baseline:
pre_count = int(pre_baseline.get("total_urls", 0))
change_pct = 0.0
if pre_count > 0:
change_pct = ((current_pages - pre_count) / pre_count) * 100
# Fetch current top pages to detect missing ones
pages_resp = await self._call_ahrefs(
"site-explorer-top-pages", {"target": domain, "limit": 500}
)
current_page_urls = set()
for page in pages_resp.get("data", {}).get("pages", []):
url = page.get("url", "")
if url:
current_page_urls.add(url)
# Compare with baseline URL inventory
missing_pages: list[str] = []
if pre_baseline:
baseline_urls = pre_baseline.get("url_inventory", [])
for url_entry in baseline_urls:
url = url_entry if isinstance(url_entry, str) else url_entry.get("url", "")
if url and url not in current_page_urls:
missing_pages.append(url)
status = IndexationStatus(
pre_count=pre_count,
post_count=current_pages,
change_pct=round(change_pct, 2),
missing_pages=missing_pages[:100], # Cap at 100 for readability
deindexed_count=len(missing_pages),
)
self.logger.info(
f"Indexation for {domain}: "
f"pre={pre_count:,} -> post={current_pages:,} "
f"({change_pct:+.1f}%), {len(missing_pages)} missing"
)
return status
# ------------------------------------------------------------------
# Ranking Tracking
# ------------------------------------------------------------------
async def track_rankings(
self, domain: str, priority_keywords: list[str] | None = None
) -> list[RankingChange]:
"""Track ranking changes for priority keywords."""
domain = self._extract_domain(domain)
self.logger.info(f"Tracking rankings for {domain}")
# Fetch current keyword rankings
kw_resp = await self._call_ahrefs(
"site-explorer-organic-keywords",
{"target": domain, "limit": 200},
)
current_keywords = kw_resp.get("data", {}).get("keywords", [])
ranking_changes: list[RankingChange] = []
for kw_data in current_keywords:
keyword = kw_data.get("keyword", "")
# If priority keywords specified, filter
if priority_keywords and keyword.lower() not in [k.lower() for k in priority_keywords]:
continue
current_pos = int(kw_data.get("position", 0))
previous_pos = int(kw_data.get("previous_position", current_pos))
volume = int(kw_data.get("search_volume", 0))
url = kw_data.get("url", "")
change = previous_pos - current_pos # Positive = improved
ranking_changes.append(
RankingChange(
keyword=keyword,
pre_position=previous_pos,
post_position=current_pos,
change=change,
url=url,
search_volume=volume,
)
)
# Sort by absolute change (biggest drops first)
ranking_changes.sort(key=lambda r: r.change)
self.logger.info(
f"Tracked {len(ranking_changes)} keyword rankings for {domain}"
)
return ranking_changes
# ------------------------------------------------------------------
# Recovery Estimation
# ------------------------------------------------------------------
def estimate_recovery(
self, traffic_data: list[TrafficComparison], migration_type: str = "domain-move"
) -> dict[str, Any]:
"""Estimate recovery timeline based on traffic comparison data."""
overall = next(
(t for t in traffic_data if t.page_group == "Overall"), None
)
if not overall:
return {
"estimated_weeks": "unknown",
"confidence": "low",
"message": "트래픽 데이터 부족으로 회복 기간 추정 불가",
}
change_pct = overall.change_pct
# Base recovery timelines by migration type (weeks)
base_timelines = {
"domain-move": 16, # 4 months
"platform": 8, # 2 months
"url-restructure": 12, # 3 months
"https": 4, # 1 month
"subdomain": 10, # 2.5 months
}
base_weeks = base_timelines.get(migration_type, 12)
if change_pct >= 0:
# No traffic drop — recovery already achieved or in progress
return {
"estimated_weeks": 0,
"confidence": "high",
"current_recovery_pct": 100.0,
"message": "트래픽 손실 없음 — 이전 성공적으로 진행 중",
}
elif change_pct > -20:
# Minor drop — quick recovery expected
estimated_weeks = max(int(base_weeks * 0.5), 2)
confidence = "high"
recovery_pct = round(100 + change_pct, 1)
elif change_pct > -40:
# Moderate drop — standard recovery timeline
estimated_weeks = base_weeks
confidence = "medium"
recovery_pct = round(100 + change_pct, 1)
else:
# Severe drop — extended recovery
estimated_weeks = int(base_weeks * 1.5)
confidence = "low"
recovery_pct = round(100 + change_pct, 1)
return {
"estimated_weeks": estimated_weeks,
"confidence": confidence,
"current_recovery_pct": recovery_pct,
"traffic_change_pct": change_pct,
"migration_type": migration_type,
"message": (
f"현재 트래픽 {change_pct:+.1f}% 변동. "
f"예상 회복 기간: {estimated_weeks}주 (신뢰도: {confidence}). "
f"현재 회복률: {recovery_pct:.1f}%"
),
}
# ------------------------------------------------------------------
# Alert Generation
# ------------------------------------------------------------------
def generate_alerts(self, report: MigrationReport) -> list[MigrationAlert]:
"""Generate alerts for significant post-migration issues."""
alerts: list[MigrationAlert] = []
# Traffic drop alerts
for tc in report.traffic_comparison:
if tc.page_group == "Overall":
abs_change = abs(tc.change_pct) / 100.0
if tc.change_pct < 0 and abs_change >= self.TRAFFIC_DROP_CRITICAL:
alerts.append(MigrationAlert(
alert_type="traffic_drop",
severity="critical",
message=(
f"심각한 트래픽 하락: {tc.change_pct:+.1f}% "
f"(이전 전 {tc.pre_traffic:,} -> 이전 후 {tc.post_traffic:,})"
),
metric_value=tc.change_pct,
threshold=-self.TRAFFIC_DROP_CRITICAL * 100,
))
elif tc.change_pct < 0 and abs_change >= self.TRAFFIC_DROP_WARNING:
alerts.append(MigrationAlert(
alert_type="traffic_drop",
severity="warning",
message=(
f"트래픽 하락 감지: {tc.change_pct:+.1f}% "
f"(이전 전 {tc.pre_traffic:,} -> 이전 후 {tc.post_traffic:,})"
),
metric_value=tc.change_pct,
threshold=-self.TRAFFIC_DROP_WARNING * 100,
))
# Broken redirect alerts
broken_redirects = [r for r in report.redirect_health if r.is_broken]
if broken_redirects:
severity = "critical" if len(broken_redirects) > 10 else "warning"
alerts.append(MigrationAlert(
alert_type="redirect_broken",
severity=severity,
message=(
f"깨진 리디렉트 {len(broken_redirects)}건 감지. "
f"고가치 페이지의 링크 에퀴티 손실 위험."
),
metric_value=float(len(broken_redirects)),
threshold=1.0,
affected_urls=[r.source for r in broken_redirects[:20]],
))
# Redirect chain alerts
chain_redirects = [r for r in report.redirect_health if r.chain_length > 1]
if chain_redirects:
alerts.append(MigrationAlert(
alert_type="redirect_chain",
severity="warning",
message=(
f"리디렉트 체인 {len(chain_redirects)}건 감지. "
f"크롤 효율성 및 링크 에퀴티에 영향."
),
metric_value=float(len(chain_redirects)),
threshold=1.0,
affected_urls=[r.source for r in chain_redirects[:20]],
))
# Indexation drop alerts
if report.indexation:
idx = report.indexation
if idx.pre_count > 0:
idx_drop = abs(idx.change_pct) / 100.0
if idx.change_pct < 0 and idx_drop >= self.INDEXATION_DROP_WARNING:
alerts.append(MigrationAlert(
alert_type="indexation_drop",
severity="warning" if idx_drop < 0.30 else "critical",
message=(
f"인덱싱 감소: {idx.change_pct:+.1f}% "
f"(이전 전 {idx.pre_count:,} -> 이전 후 {idx.post_count:,}페이지). "
f"디인덱싱된 페이지: {idx.deindexed_count}"
),
metric_value=idx.change_pct,
threshold=-self.INDEXATION_DROP_WARNING * 100,
affected_urls=idx.missing_pages[:20],
))
# Ranking loss alerts
significant_drops = [
r for r in report.ranking_changes
if r.change < -self.RANKING_DROP_THRESHOLD and r.search_volume > 100
]
if significant_drops:
alerts.append(MigrationAlert(
alert_type="ranking_loss",
severity="warning" if len(significant_drops) < 20 else "critical",
message=(
f"주요 키워드 {len(significant_drops)}개의 순위 하락 감지 "
f"(5포지션 이상 하락, 검색량 100+)"
),
metric_value=float(len(significant_drops)),
threshold=float(self.RANKING_DROP_THRESHOLD),
affected_urls=[r.url for r in significant_drops[:20]],
))
# Sort alerts by severity
severity_order = {"critical": 0, "warning": 1, "info": 2}
alerts.sort(key=lambda a: severity_order.get(a.severity, 3))
self.logger.info(f"Generated {len(alerts)} migration alerts")
return alerts
# ------------------------------------------------------------------
# Orchestrator
# ------------------------------------------------------------------
async def run(
self,
domain: str,
migration_date: str,
baseline_file: str | None = None,
migration_type: str = "domain-move",
) -> MigrationReport:
"""Orchestrate full post-migration monitoring pipeline."""
timestamp = datetime.now().isoformat()
mig_date = datetime.strptime(migration_date, "%Y-%m-%d")
days_since = (datetime.now() - mig_date).days
report = MigrationReport(
domain=self._extract_domain(domain),
migration_date=migration_date,
days_since_migration=days_since,
timestamp=timestamp,
)
# Load baseline if provided
baseline: dict[str, Any] | None = None
redirect_map_data: list[dict[str, str]] = []
if baseline_file:
try:
with open(baseline_file, "r", encoding="utf-8") as f:
baseline_raw = json.load(f)
baseline = baseline_raw.get("baseline", baseline_raw)
redirect_map_data = [
{"source": r.get("source", ""), "target": r.get("target", "")}
for r in baseline_raw.get("redirect_map", [])
]
self.logger.info(f"Loaded baseline from {baseline_file}")
except Exception as e:
msg = f"Failed to load baseline file: {e}"
self.logger.error(msg)
report.errors.append(msg)
try:
# Step 1: Traffic comparison
self.logger.info("Step 1/5: Comparing pre/post traffic...")
report.traffic_comparison = await self.compare_traffic(
domain, migration_date
)
# Step 2: Redirect health check
if redirect_map_data:
self.logger.info("Step 2/5: Checking redirect health...")
report.redirect_health = await self.check_redirects(redirect_map_data)
else:
self.logger.info(
"Step 2/5: Skipping redirect check (no baseline redirect map)"
)
# Step 3: Indexation tracking
self.logger.info("Step 3/5: Tracking indexation changes...")
report.indexation = await self.track_indexation(domain, baseline)
# Step 4: Ranking tracking
self.logger.info("Step 4/5: Tracking keyword rankings...")
report.ranking_changes = await self.track_rankings(domain)
# Step 5: Recovery estimation
self.logger.info("Step 5/5: Estimating recovery timeline...")
report.recovery_estimate = self.estimate_recovery(
report.traffic_comparison, migration_type
)
# Generate alerts
report.alerts = self.generate_alerts(report)
self.logger.info(
f"Migration monitoring complete: "
f"{days_since} days since migration, "
f"{len(report.alerts)} alerts generated"
)
except Exception as e:
msg = f"Migration monitoring pipeline error: {e}"
self.logger.error(msg)
report.errors.append(msg)
return report
# ---------------------------------------------------------------------------
# Output helpers
# ---------------------------------------------------------------------------
def _format_text_report(report: MigrationReport) -> str:
"""Format monitoring report as human-readable text."""
lines: list[str] = []
lines.append("=" * 70)
lines.append(" SEO MIGRATION MONITORING REPORT")
lines.append(f" Domain: {report.domain}")
lines.append(f" Migration Date: {report.migration_date}")
lines.append(f" Days Since Migration: {report.days_since_migration}")
lines.append(f" Generated: {report.timestamp}")
lines.append("=" * 70)
# Alerts
if report.alerts:
lines.append("")
lines.append("--- ALERTS ---")
for alert in report.alerts:
icon = {"critical": "[!]", "warning": "[*]", "info": "[-]"}.get(
alert.severity, "[-]"
)
lines.append(f" {icon} [{alert.severity.upper()}] {alert.message}")
if alert.affected_urls:
for url in alert.affected_urls[:5]:
lines.append(f" - {url}")
if len(alert.affected_urls) > 5:
lines.append(f" ... and {len(alert.affected_urls) - 5} more")
# Traffic comparison
if report.traffic_comparison:
lines.append("")
lines.append("--- TRAFFIC COMPARISON ---")
lines.append(
f" {'Page Group':<40} {'Pre':>10} {'Post':>10} {'Change':>10} {'Status':>10}"
)
lines.append(" " + "-" * 83)
for tc in report.traffic_comparison:
group = tc.page_group[:38]
lines.append(
f" {group:<40} {tc.pre_traffic:>10,} {tc.post_traffic:>10,} "
f"{tc.change_pct:>+9.1f}% {tc.status:>10}"
)
# Redirect health
if report.redirect_health:
broken = [r for r in report.redirect_health if r.is_broken]
chains = [r for r in report.redirect_health if r.chain_length > 1]
healthy = [r for r in report.redirect_health if not r.is_broken and r.chain_length <= 1]
lines.append("")
lines.append("--- REDIRECT HEALTH ---")
lines.append(f" Total Redirects: {len(report.redirect_health):,}")
lines.append(f" Healthy: {len(healthy):,}")
lines.append(f" Broken: {len(broken):,}")
lines.append(f" Chains (>1 hop): {len(chains):,}")
if broken:
lines.append("")
lines.append(" Broken Redirects:")
for r in broken[:10]:
lines.append(f" [{r.status_code}] {r.source} -> {r.target}")
if r.error:
lines.append(f" Error: {r.error}")
# Indexation
if report.indexation:
idx = report.indexation
lines.append("")
lines.append("--- INDEXATION STATUS ---")
lines.append(f" Pre-Migration Pages: {idx.pre_count:,}")
lines.append(f" Post-Migration Pages: {idx.post_count:,}")
lines.append(f" Change: {idx.change_pct:+.1f}%")
lines.append(f" De-indexed Pages: {idx.deindexed_count:,}")
if idx.missing_pages:
lines.append("")
lines.append(" Missing Pages (top 10):")
for page in idx.missing_pages[:10]:
lines.append(f" - {page}")
# Ranking changes
if report.ranking_changes:
lines.append("")
lines.append("--- RANKING CHANGES ---")
drops = [r for r in report.ranking_changes if r.change < 0]
gains = [r for r in report.ranking_changes if r.change > 0]
lines.append(f" Total Tracked: {len(report.ranking_changes)}")
lines.append(f" Improved: {len(gains)}")
lines.append(f" Declined: {len(drops)}")
if drops:
lines.append("")
lines.append(" Biggest Drops:")
lines.append(
f" {'Keyword':<30} {'Pre':>6} {'Post':>6} {'Change':>8} {'Volume':>8}"
)
lines.append(" " + "-" * 61)
for r in drops[:15]:
kw = r.keyword[:28]
lines.append(
f" {kw:<30} {r.pre_position:>6} {r.post_position:>6} "
f"{r.change:>+7} {r.search_volume:>8,}"
)
# Recovery estimate
if report.recovery_estimate:
est = report.recovery_estimate
lines.append("")
lines.append("--- RECOVERY ESTIMATE ---")
lines.append(f" {est.get('message', 'N/A')}")
weeks = est.get("estimated_weeks", "unknown")
confidence = est.get("confidence", "unknown")
lines.append(f" Estimated Weeks: {weeks}")
lines.append(f" Confidence: {confidence}")
if report.errors:
lines.append("")
lines.append("--- ERRORS ---")
for err in report.errors:
lines.append(f" - {err}")
lines.append("")
lines.append("=" * 70)
return "\n".join(lines)
def _serialize_report(report: MigrationReport) -> dict:
"""Convert report to JSON-serializable dict."""
output: dict[str, Any] = {
"domain": report.domain,
"migration_date": report.migration_date,
"days_since_migration": report.days_since_migration,
"traffic_comparison": [asdict(t) for t in report.traffic_comparison],
"redirect_health": [asdict(r) for r in report.redirect_health],
"indexation": asdict(report.indexation) if report.indexation else None,
"ranking_changes": [asdict(r) for r in report.ranking_changes],
"recovery_estimate": report.recovery_estimate,
"alerts": [asdict(a) for a in report.alerts],
"timestamp": report.timestamp,
}
if report.errors:
output["errors"] = report.errors
return output
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="Migration Monitor - Post-migration SEO monitoring and alerting",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""\
Examples:
python migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --baseline baseline.json --json
python migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --json
""",
)
parser.add_argument(
"--domain",
required=True,
help="Domain to monitor (post-migration URL)",
)
parser.add_argument(
"--migration-date",
required=True,
help="Migration date in YYYY-MM-DD format",
)
parser.add_argument(
"--baseline",
type=str,
default=None,
help="Path to baseline JSON file from migration_planner.py",
)
parser.add_argument(
"--type",
choices=["domain-move", "platform", "url-restructure", "https", "subdomain"],
default="domain-move",
help="Migration type for recovery estimation (default: domain-move)",
)
parser.add_argument(
"--json",
action="store_true",
default=False,
help="Output in JSON format",
)
parser.add_argument(
"--output",
type=str,
default=None,
help="Save output to file path",
)
return parser.parse_args(argv)
async def async_main(args: argparse.Namespace) -> None:
monitor = MigrationMonitor()
report = await monitor.run(
domain=args.domain,
migration_date=args.migration_date,
baseline_file=args.baseline,
migration_type=args.type,
)
if args.json:
output_str = json.dumps(_serialize_report(report), indent=2, ensure_ascii=False)
else:
output_str = _format_text_report(report)
if args.output:
with open(args.output, "w", encoding="utf-8") as f:
f.write(output_str)
logger.info(f"Migration report saved to {args.output}")
else:
print(output_str)
monitor.print_stats()
def main() -> None:
args = parse_args()
asyncio.run(async_main(args))
if __name__ == "__main__":
main()
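Downstream tooling can consume the `--json` output without importing the monitor; a minimal sketch (the report shape follows `_serialize_report` above; the domain and metric values here are hypothetical):

```python
import json

# Hypothetical miniature of a --json report from migration_monitor.py.
raw = json.dumps({
    "domain": "new-example.com",
    "days_since_migration": 14,
    "alerts": [
        {"alert_type": "indexation_drop", "severity": "critical", "metric_value": -32.0},
        {"alert_type": "redirect_chain", "severity": "warning", "metric_value": 4.0},
    ],
})

report = json.loads(raw)
critical = [a for a in report["alerts"] if a["severity"] == "critical"]
print(f"{report['domain']}: {len(critical)} critical alert(s) "
      f"after {report['days_since_migration']} days")
# new-example.com: 1 critical alert(s) after 14 days
```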


@@ -0,0 +1,754 @@
"""
Migration Planner - SEO Site Migration Planning
================================================
Purpose: Pre-migration risk assessment, redirect mapping, URL inventory,
crawl baseline capture, and checklist generation for site migrations.
Python: 3.10+
Usage:
python migration_planner.py --domain https://example.com --type domain-move --new-domain https://new-example.com --json
python migration_planner.py --domain https://example.com --type platform --json
python migration_planner.py --domain https://example.com --type url-restructure --json
python migration_planner.py --domain http://example.com --type https --json
python migration_planner.py --domain https://blog.example.com --type subdomain --new-domain https://example.com/blog --json
"""
import argparse
import asyncio
import json
import logging
import sys
from dataclasses import dataclass, field, asdict
from datetime import datetime
from typing import Any
from urllib.parse import urlparse
from base_client import BaseAsyncClient, config
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------
@dataclass
class MigrationURL:
"""A single URL in the migration inventory with associated metrics."""
url: str = ""
traffic: int = 0
keywords: int = 0
backlinks: int = 0
risk_score: float = 0.0
redirect_target: str = ""
status_code: int = 200
priority: str = "low" # critical / high / medium / low
@dataclass
class MigrationBaseline:
"""Pre-migration baseline snapshot of the site."""
domain: str = ""
total_urls: int = 0
total_traffic: int = 0
total_keywords: int = 0
total_referring_domains: int = 0
top_pages: list[dict[str, Any]] = field(default_factory=list)
url_inventory: list[MigrationURL] = field(default_factory=list)
@dataclass
class RedirectMap:
"""A single redirect mapping entry."""
source: str = ""
target: str = ""
status_code: int = 301
priority: str = "low" # critical / high / medium / low
risk_score: float = 0.0
@dataclass
class RiskAssessment:
"""Aggregated risk assessment for the migration."""
high_risk_urls: int = 0
medium_risk_urls: int = 0
low_risk_urls: int = 0
overall_risk: str = "low" # critical / high / medium / low
top_risk_urls: list[dict[str, Any]] = field(default_factory=list)
risk_factors: list[str] = field(default_factory=list)
@dataclass
class MigrationPlan:
"""Complete migration plan output."""
migration_type: str = ""
domain: str = ""
new_domain: str = ""
baseline: MigrationBaseline | None = None
redirect_map: list[RedirectMap] = field(default_factory=list)
risk_assessment: RiskAssessment | None = None
pre_migration_checklist: list[dict[str, Any]] = field(default_factory=list)
timestamp: str = ""
errors: list[str] = field(default_factory=list)
# ---------------------------------------------------------------------------
# Migration types
# ---------------------------------------------------------------------------
MIGRATION_TYPES = {
"domain-move": "Domain Move (old domain -> new domain)",
"platform": "Platform Change (CMS/framework migration)",
"url-restructure": "URL Restructuring (path/slug changes)",
"https": "HTTPS Migration (HTTP -> HTTPS)",
"subdomain": "Subdomain Consolidation (subdomain -> subfolder)",
}
# ---------------------------------------------------------------------------
# Planner
# ---------------------------------------------------------------------------
class MigrationPlanner(BaseAsyncClient):
"""Plans site migrations using Firecrawl for crawling and Ahrefs for SEO data."""
def __init__(self):
super().__init__(max_concurrent=5, requests_per_second=2.0)
    @staticmethod
    def _extract_domain(url: str) -> str:
        """Extract bare domain from URL or return as-is if already bare."""
        if "://" in url:
            url = urlparse(url).netloc
        # removeprefix only strips a leading "www.", unlike str.replace,
        # which would also mangle domains containing "www." mid-string.
        return url.lower().removeprefix("www.")
@staticmethod
def _normalize_url(url: str) -> str:
"""Ensure URL has a scheme."""
if not url.startswith(("http://", "https://")):
return f"https://{url}"
return url
# ------------------------------------------------------------------
# MCP wrappers (return dicts; Claude MCP bridge fills these)
# ------------------------------------------------------------------
async def _call_ahrefs(self, tool: str, params: dict[str, Any]) -> dict:
"""Simulate Ahrefs MCP call. In production, routed via MCP bridge."""
self.logger.info(f"Ahrefs MCP call: {tool} | params={params}")
return {"tool": tool, "params": params, "data": {}}
async def _call_firecrawl(self, tool: str, params: dict[str, Any]) -> dict:
"""Simulate Firecrawl MCP call. In production, routed via MCP bridge."""
self.logger.info(f"Firecrawl MCP call: {tool} | params={params}")
return {"tool": tool, "params": params, "data": {}}
# ------------------------------------------------------------------
# URL Inventory
# ------------------------------------------------------------------
async def crawl_url_inventory(self, domain: str) -> list[MigrationURL]:
"""Crawl the site via Firecrawl to capture all URLs and status codes."""
url = self._normalize_url(domain)
self.logger.info(f"Crawling URL inventory for {url}")
resp = await self._call_firecrawl(
"firecrawl_crawl",
{"url": url, "limit": 5000, "scrapeOptions": {"formats": ["links"]}},
)
crawl_data = resp.get("data", {})
pages = crawl_data.get("pages", [])
inventory: list[MigrationURL] = []
for page in pages:
migration_url = MigrationURL(
url=page.get("url", ""),
status_code=int(page.get("status_code", 200)),
)
inventory.append(migration_url)
if not inventory:
# Fallback: create a single entry for the domain
inventory.append(MigrationURL(url=url, status_code=200))
self.logger.warning(
"Firecrawl returned no pages; created placeholder entry. "
"Verify Firecrawl MCP is configured."
)
else:
self.logger.info(f"Crawled {len(inventory)} URLs from {domain}")
return inventory
# ------------------------------------------------------------------
# Ahrefs Baseline
# ------------------------------------------------------------------
async def fetch_top_pages_baseline(
self, domain: str, limit: int = 500
) -> list[dict[str, Any]]:
"""Fetch top pages with traffic and keyword data from Ahrefs."""
domain = self._extract_domain(domain)
self.logger.info(f"Fetching top pages baseline for {domain}")
resp = await self._call_ahrefs(
"site-explorer-top-pages",
{"target": domain, "limit": limit},
)
pages_raw = resp.get("data", {}).get("pages", [])
top_pages: list[dict[str, Any]] = []
for page in pages_raw:
top_pages.append({
"url": page.get("url", ""),
"traffic": int(page.get("traffic", 0)),
"keywords": int(page.get("keywords", 0)),
"top_keyword": page.get("top_keyword", ""),
"position": int(page.get("position", 0)),
})
self.logger.info(f"Fetched {len(top_pages)} top pages for {domain}")
return top_pages
async def fetch_site_metrics(self, domain: str) -> dict[str, Any]:
"""Fetch overall site metrics from Ahrefs."""
domain = self._extract_domain(domain)
metrics_resp = await self._call_ahrefs(
"site-explorer-metrics", {"target": domain}
)
metrics = metrics_resp.get("data", {})
backlinks_resp = await self._call_ahrefs(
"site-explorer-backlinks-stats", {"target": domain}
)
backlinks = backlinks_resp.get("data", {})
return {
"organic_traffic": int(metrics.get("organic_traffic", 0)),
"organic_keywords": int(metrics.get("organic_keywords", 0)),
"referring_domains": int(backlinks.get("referring_domains", 0)),
}
async def fetch_page_backlinks(self, url: str) -> int:
"""Fetch backlink count for a specific URL."""
resp = await self._call_ahrefs(
"site-explorer-backlinks-stats", {"target": url}
)
return int(resp.get("data", {}).get("referring_domains", 0))
async def fetch_page_keywords(self, url: str) -> list[dict[str, Any]]:
"""Fetch keyword rankings for a specific URL."""
resp = await self._call_ahrefs(
"site-explorer-organic-keywords",
{"target": url, "limit": 100},
)
return resp.get("data", {}).get("keywords", [])
# ------------------------------------------------------------------
# Risk Assessment
# ------------------------------------------------------------------
def assess_url_risk(self, url_data: MigrationURL) -> float:
"""Score risk for a single URL based on traffic, backlinks, and keywords.
Risk score 0-100:
- Traffic weight: 40% (high traffic = high risk if migration fails)
- Backlinks weight: 30% (external links break if redirect fails)
- Keywords weight: 30% (ranking loss risk)
"""
# Normalize each factor to 0-100
# Traffic: 1000+ monthly visits = high risk
traffic_score = min((url_data.traffic / 1000) * 100, 100) if url_data.traffic > 0 else 0
# Backlinks: 50+ referring domains = high risk
backlinks_score = min((url_data.backlinks / 50) * 100, 100) if url_data.backlinks > 0 else 0
# Keywords: 20+ rankings = high risk
keywords_score = min((url_data.keywords / 20) * 100, 100) if url_data.keywords > 0 else 0
risk = (
traffic_score * 0.40
+ backlinks_score * 0.30
+ keywords_score * 0.30
)
return round(min(max(risk, 0), 100), 1)
def classify_priority(self, risk_score: float) -> str:
"""Classify URL priority based on risk score."""
if risk_score >= 75:
return "critical"
elif risk_score >= 50:
return "high"
elif risk_score >= 25:
return "medium"
else:
return "low"
# ------------------------------------------------------------------
# Redirect Map
# ------------------------------------------------------------------
def generate_redirect_map(
self,
url_inventory: list[MigrationURL],
migration_type: str,
new_domain: str | None = None,
) -> list[RedirectMap]:
"""Generate redirect mappings based on migration type."""
redirect_map: list[RedirectMap] = []
for url_entry in url_inventory:
source = url_entry.url
if not source:
continue
            parsed = urlparse(source)
            path = parsed.path
            # Preserve query strings so parameterized URLs redirect correctly
            query = f"?{parsed.query}" if parsed.query else ""
            # Determine target URL based on migration type
            if migration_type == "domain-move" and new_domain:
                new_parsed = urlparse(self._normalize_url(new_domain))
                target = f"{new_parsed.scheme}://{new_parsed.netloc}{path}{query}"
            elif migration_type == "https":
                # Replace only the scheme, not any "http://" inside the URL
                target = source.replace("http://", "https://", 1)
            elif migration_type == "subdomain" and new_domain:
                # e.g., blog.example.com/page -> example.com/blog/page
                new_parsed = urlparse(self._normalize_url(new_domain))
                target = f"{new_parsed.scheme}://{new_parsed.netloc}{new_parsed.path.rstrip('/')}{path}{query}"
elif migration_type == "url-restructure":
# Placeholder: URL restructuring requires custom mapping rules
# In practice, user provides a mapping CSV or pattern
target = source # Will need manual mapping
elif migration_type == "platform":
# Platform change: URLs may stay the same or change
target = source # Will need verification post-migration
else:
target = source
redirect_entry = RedirectMap(
source=source,
target=target,
status_code=301,
priority=url_entry.priority,
risk_score=url_entry.risk_score,
)
redirect_map.append(redirect_entry)
# Sort by risk score descending (highest risk first)
redirect_map.sort(key=lambda r: r.risk_score, reverse=True)
self.logger.info(
f"Generated {len(redirect_map)} redirect mappings "
f"for {migration_type} migration"
)
return redirect_map
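The subdomain-consolidation branch above reduces to a small piece of path arithmetic; a standalone sketch with hypothetical URLs:

```python
from urllib.parse import urlparse

# Mirrors the subdomain mapping above:
# blog.example.com/<path> -> example.com/blog/<path>
def map_subdomain(source: str, new_base: str) -> str:
    src = urlparse(source)
    dst = urlparse(new_base)
    # rstrip('/') avoids a double slash when new_base ends with "/"
    return f"{dst.scheme}://{dst.netloc}{dst.path.rstrip('/')}{src.path}"

print(map_subdomain("https://blog.example.com/post-1",
                    "https://example.com/blog/"))
# https://example.com/blog/post-1
```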
# ------------------------------------------------------------------
# Checklist
# ------------------------------------------------------------------
def generate_checklist(self, migration_type: str) -> list[dict[str, Any]]:
"""Generate pre-migration checklist based on migration type."""
# Common checklist items for all migration types
common_items = [
{"step": 1, "category": "Baseline", "task": "URL 인벤토리 크롤링 완료", "description": "Firecrawl로 전체 URL 목록 및 상태 코드 캡처", "status": "pending"},
{"step": 2, "category": "Baseline", "task": "트래픽 베이스라인 캡처", "description": "Ahrefs에서 페이지별 트래픽, 키워드, 백링크 데이터 수집", "status": "pending"},
{"step": 3, "category": "Baseline", "task": "Google Search Console 데이터 내보내기", "description": "현재 인덱싱 상태, 사이트맵 현황, 크롤 통계 기록", "status": "pending"},
{"step": 4, "category": "Baseline", "task": "Google Analytics 벤치마크 저장", "description": "이전 전 30일/90일 트래픽 데이터 스냅샷 저장", "status": "pending"},
{"step": 5, "category": "Redirects", "task": "Redirect 맵 생성", "description": "모든 URL에 대한 301 리디렉트 매핑 완료", "status": "pending"},
{"step": 6, "category": "Redirects", "task": "고위험 URL 우선 검증", "description": "트래픽/백링크 기준 상위 URL 리디렉트 수동 확인", "status": "pending"},
{"step": 7, "category": "Technical", "task": "robots.txt 업데이트 준비", "description": "새 도메인/구조에 맞는 robots.txt 작성", "status": "pending"},
{"step": 8, "category": "Technical", "task": "XML 사이트맵 업데이트 준비", "description": "새 URL 구조 반영한 사이트맵 생성", "status": "pending"},
{"step": 9, "category": "Technical", "task": "Canonical 태그 업데이트 계획", "description": "모든 페이지의 canonical URL이 새 주소를 가리키도록 변경", "status": "pending"},
{"step": 10, "category": "Technical", "task": "Internal link 업데이트 계획", "description": "사이트 내부 링크가 새 URL을 직접 가리키도록 변경", "status": "pending"},
{"step": 11, "category": "Monitoring", "task": "모니터링 대시보드 설정", "description": "이전 후 트래픽, 인덱싱, 리디렉트 상태 모니터링 준비", "status": "pending"},
{"step": 12, "category": "Monitoring", "task": "알림 임계값 설정", "description": "트래픽 20% 이상 하락 시 알림 설정", "status": "pending"},
]
# Type-specific items
type_specific: dict[str, list[dict[str, Any]]] = {
"domain-move": [
{"step": 13, "category": "Domain", "task": "새 도메인 DNS 설정", "description": "DNS A/CNAME 레코드 설정 및 전파 확인", "status": "pending"},
{"step": 14, "category": "Domain", "task": "Google Search Console에 새 도메인 등록", "description": "새 도메인 속성 추가 및 소유권 확인", "status": "pending"},
{"step": 15, "category": "Domain", "task": "도메인 변경 알림 (GSC Change of Address)", "description": "Search Console에서 주소 변경 도구 실행", "status": "pending"},
{"step": 16, "category": "Domain", "task": "SSL 인증서 설치", "description": "새 도메인에 유효한 SSL 인증서 설치", "status": "pending"},
],
"platform": [
{"step": 13, "category": "Platform", "task": "URL 구조 매핑 확인", "description": "새 플랫폼에서 동일한 URL 구조 유지 여부 확인", "status": "pending"},
{"step": 14, "category": "Platform", "task": "메타 태그 이전 확인", "description": "Title, Description, Open Graph 태그 동일 여부 확인", "status": "pending"},
{"step": 15, "category": "Platform", "task": "구조화된 데이터 이전", "description": "JSON-LD Schema Markup 동일 여부 확인", "status": "pending"},
{"step": 16, "category": "Platform", "task": "스테이징 환경 테스트", "description": "스테이징에서 전체 크롤링 및 리디렉트 테스트 실행", "status": "pending"},
],
"url-restructure": [
{"step": 13, "category": "URL", "task": "URL 패턴 매핑 문서화", "description": "기존 → 신규 URL 패턴 규칙 문서화", "status": "pending"},
{"step": 14, "category": "URL", "task": "정규식 리디렉트 규칙 작성", "description": "서버 레벨 리디렉트 규칙 (nginx/Apache) 작성", "status": "pending"},
{"step": 15, "category": "URL", "task": "Breadcrumb 업데이트", "description": "새 URL 구조에 맞게 Breadcrumb 네비게이션 수정", "status": "pending"},
],
"https": [
{"step": 13, "category": "HTTPS", "task": "SSL 인증서 설치 및 확인", "description": "유효한 SSL 인증서 설치 (Let's Encrypt 또는 상용 인증서)", "status": "pending"},
{"step": 14, "category": "HTTPS", "task": "Mixed Content 점검", "description": "HTTP로 로드되는 리소스 (이미지, CSS, JS) 식별 및 수정", "status": "pending"},
{"step": 15, "category": "HTTPS", "task": "HSTS 헤더 설정", "description": "Strict-Transport-Security 헤더 활성화", "status": "pending"},
],
"subdomain": [
{"step": 13, "category": "Subdomain", "task": "서브도메인 → 서브폴더 매핑", "description": "서브도메인 경로를 서브폴더 경로로 매핑", "status": "pending"},
{"step": 14, "category": "Subdomain", "task": "서버 리디렉트 규칙 설정", "description": "서브도메인에서 메인 도메인으로의 301 리디렉트 규칙", "status": "pending"},
{"step": 15, "category": "Subdomain", "task": "DNS 설정 업데이트", "description": "서브도메인 DNS 레코드 유지 (리디렉트용)", "status": "pending"},
],
}
checklist = common_items.copy()
if migration_type in type_specific:
checklist.extend(type_specific[migration_type])
self.logger.info(
f"Generated {len(checklist)} checklist items for {migration_type} migration"
)
return checklist
# ------------------------------------------------------------------
# Orchestrator
# ------------------------------------------------------------------
async def run(
self,
domain: str,
migration_type: str,
new_domain: str | None = None,
) -> MigrationPlan:
"""Orchestrate full migration planning pipeline."""
timestamp = datetime.now().isoformat()
plan = MigrationPlan(
migration_type=migration_type,
domain=self._extract_domain(domain),
new_domain=self._extract_domain(new_domain) if new_domain else "",
timestamp=timestamp,
)
try:
# Step 1: Crawl URL inventory
self.logger.info("Step 1/6: Crawling URL inventory via Firecrawl...")
url_inventory = await self.crawl_url_inventory(domain)
# Step 2: Fetch Ahrefs baseline
self.logger.info("Step 2/6: Fetching Ahrefs top pages baseline...")
top_pages = await self.fetch_top_pages_baseline(domain)
site_metrics = await self.fetch_site_metrics(domain)
# Step 3: Enrich URL inventory with Ahrefs data
self.logger.info("Step 3/6: Enriching URLs with traffic/backlink data...")
top_pages_map: dict[str, dict] = {}
for page in top_pages:
page_url = page.get("url", "")
if page_url:
top_pages_map[page_url] = page
for url_entry in url_inventory:
page_data = top_pages_map.get(url_entry.url, {})
url_entry.traffic = int(page_data.get("traffic", 0))
url_entry.keywords = int(page_data.get("keywords", 0))
# Step 4: Risk assessment per URL
self.logger.info("Step 4/6: Scoring risk per URL...")
for url_entry in url_inventory:
url_entry.risk_score = self.assess_url_risk(url_entry)
url_entry.priority = self.classify_priority(url_entry.risk_score)
# Build baseline
baseline = MigrationBaseline(
domain=self._extract_domain(domain),
total_urls=len(url_inventory),
total_traffic=site_metrics.get("organic_traffic", 0),
total_keywords=site_metrics.get("organic_keywords", 0),
total_referring_domains=site_metrics.get("referring_domains", 0),
top_pages=top_pages[:50], # Store top 50 for reference
url_inventory=url_inventory,
)
plan.baseline = baseline
# Step 5: Generate redirect map
self.logger.info("Step 5/6: Generating redirect map...")
plan.redirect_map = self.generate_redirect_map(
url_inventory, migration_type, new_domain
)
# Build risk assessment summary
high_risk = sum(1 for u in url_inventory if u.risk_score >= 75)
medium_risk = sum(1 for u in url_inventory if 25 <= u.risk_score < 75)
low_risk = sum(1 for u in url_inventory if u.risk_score < 25)
# Determine overall risk level
if high_risk > len(url_inventory) * 0.2:
overall_risk = "critical"
elif high_risk > len(url_inventory) * 0.1:
overall_risk = "high"
elif medium_risk > len(url_inventory) * 0.3:
overall_risk = "medium"
else:
overall_risk = "low"
# Top risk URLs
sorted_urls = sorted(url_inventory, key=lambda u: u.risk_score, reverse=True)
top_risk = [
{
"url": u.url,
"risk_score": u.risk_score,
"traffic": u.traffic,
"keywords": u.keywords,
"backlinks": u.backlinks,
}
for u in sorted_urls[:20]
]
# Risk factors
risk_factors: list[str] = []
if high_risk > 0:
risk_factors.append(
f"{high_risk}개 고위험 URL (트래픽/백링크 손실 위험)"
)
if baseline.total_traffic > 10000:
risk_factors.append(
f"월간 오가닉 트래픽 {baseline.total_traffic:,}회 — 이전 실패 시 큰 영향"
)
if baseline.total_referring_domains > 500:
risk_factors.append(
f"참조 도메인 {baseline.total_referring_domains:,}개 — 리디렉트 누락 시 링크 에퀴티 손실"
)
if migration_type == "domain-move":
risk_factors.append(
"도메인 변경은 가장 위험한 이전 유형 — 최소 3-6개월 회복 예상"
)
elif migration_type == "url-restructure":
risk_factors.append(
"URL 구조 변경 시 모든 내부/외부 링크 영향 — 정규식 리디렉트 필수"
)
plan.risk_assessment = RiskAssessment(
high_risk_urls=high_risk,
medium_risk_urls=medium_risk,
low_risk_urls=low_risk,
overall_risk=overall_risk,
top_risk_urls=top_risk,
risk_factors=risk_factors,
)
# Step 6: Generate checklist
self.logger.info("Step 6/6: Generating pre-migration checklist...")
plan.pre_migration_checklist = self.generate_checklist(migration_type)
self.logger.info(
f"Migration plan complete: {len(url_inventory)} URLs inventoried, "
f"{len(plan.redirect_map)} redirects mapped, "
f"overall risk: {overall_risk}"
)
except Exception as e:
msg = f"Migration planning pipeline error: {e}"
self.logger.error(msg)
plan.errors.append(msg)
return plan
# ---------------------------------------------------------------------------
# Output helpers
# ---------------------------------------------------------------------------
def _format_text_report(plan: MigrationPlan) -> str:
"""Format migration plan as human-readable text report."""
lines: list[str] = []
lines.append("=" * 70)
lines.append(" SEO MIGRATION PLAN")
lines.append(f" Domain: {plan.domain}")
if plan.new_domain:
lines.append(f" New Domain: {plan.new_domain}")
lines.append(f" Migration Type: {MIGRATION_TYPES.get(plan.migration_type, plan.migration_type)}")
lines.append(f" Generated: {plan.timestamp}")
lines.append("=" * 70)
if plan.baseline:
b = plan.baseline
lines.append("")
lines.append("--- BASELINE ---")
lines.append(f" Total URLs: {b.total_urls:,}")
lines.append(f" Organic Traffic: {b.total_traffic:,}")
lines.append(f" Organic Keywords: {b.total_keywords:,}")
lines.append(f" Referring Domains: {b.total_referring_domains:,}")
if plan.risk_assessment:
r = plan.risk_assessment
lines.append("")
lines.append("--- RISK ASSESSMENT ---")
lines.append(f" Overall Risk: {r.overall_risk.upper()}")
lines.append(f" High Risk URLs: {r.high_risk_urls:,}")
lines.append(f" Medium Risk: {r.medium_risk_urls:,}")
lines.append(f" Low Risk: {r.low_risk_urls:,}")
if r.risk_factors:
lines.append("")
lines.append(" Risk Factors:")
for factor in r.risk_factors:
lines.append(f" - {factor}")
if r.top_risk_urls:
lines.append("")
lines.append(" Top Risk URLs:")
for url_info in r.top_risk_urls[:10]:
lines.append(
f" [{url_info['risk_score']:.0f}] {url_info['url']} "
f"(traffic={url_info['traffic']:,}, kw={url_info['keywords']})"
)
if plan.redirect_map:
lines.append("")
lines.append(f"--- REDIRECT MAP ({len(plan.redirect_map)} entries) ---")
# Show top 20 by risk
for i, rmap in enumerate(plan.redirect_map[:20], 1):
lines.append(
f" {i:>3}. [{rmap.priority.upper():>8}] "
f"{rmap.source} -> {rmap.target}"
)
if len(plan.redirect_map) > 20:
lines.append(f" ... and {len(plan.redirect_map) - 20} more entries")
if plan.pre_migration_checklist:
lines.append("")
lines.append("--- PRE-MIGRATION CHECKLIST ---")
for item in plan.pre_migration_checklist:
status_marker = "[ ]" if item["status"] == "pending" else "[x]"
lines.append(
f" {status_marker} Step {item['step']}: {item['task']}"
)
lines.append(f" {item['description']}")
if plan.errors:
lines.append("")
lines.append("--- ERRORS ---")
for err in plan.errors:
lines.append(f" - {err}")
lines.append("")
lines.append("=" * 70)
return "\n".join(lines)
def _serialize_plan(plan: MigrationPlan) -> dict:
"""Convert plan to JSON-serializable dict."""
output: dict[str, Any] = {
"domain": plan.domain,
"new_domain": plan.new_domain,
"migration_type": plan.migration_type,
"baseline": None,
"redirect_map": [asdict(r) for r in plan.redirect_map],
"risk_assessment": asdict(plan.risk_assessment) if plan.risk_assessment else None,
"pre_migration_checklist": plan.pre_migration_checklist,
"timestamp": plan.timestamp,
}
if plan.baseline:
output["baseline"] = {
"domain": plan.baseline.domain,
"total_urls": plan.baseline.total_urls,
"total_traffic": plan.baseline.total_traffic,
"total_keywords": plan.baseline.total_keywords,
"total_referring_domains": plan.baseline.total_referring_domains,
"top_pages": plan.baseline.top_pages,
"url_inventory": [asdict(u) for u in plan.baseline.url_inventory],
}
if plan.errors:
output["errors"] = plan.errors
return output
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="SEO Migration Planner - Pre-migration risk assessment and redirect mapping",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""\
Examples:
python migration_planner.py --domain https://example.com --type domain-move --new-domain https://new-example.com --json
python migration_planner.py --domain https://example.com --type platform --json
python migration_planner.py --domain https://example.com --type url-restructure --json
python migration_planner.py --domain http://example.com --type https --json
python migration_planner.py --domain https://blog.example.com --type subdomain --new-domain https://example.com/blog --json
""",
)
parser.add_argument(
"--domain",
required=True,
help="Target website URL or domain to plan migration for",
)
parser.add_argument(
"--type",
required=True,
choices=["domain-move", "platform", "url-restructure", "https", "subdomain"],
help="Migration type",
)
parser.add_argument(
"--new-domain",
type=str,
default=None,
help="New domain/URL (required for domain-move and subdomain types)",
)
parser.add_argument(
"--json",
action="store_true",
default=False,
help="Output in JSON format",
)
parser.add_argument(
"--output",
type=str,
default=None,
help="Save output to file path",
)
return parser.parse_args(argv)
async def async_main(args: argparse.Namespace) -> None:
# Validate required arguments for specific types
if args.type in ("domain-move", "subdomain") and not args.new_domain:
logger.error(f"--new-domain is required for {args.type} migration type")
sys.exit(1)
planner = MigrationPlanner()
plan = await planner.run(
domain=args.domain,
migration_type=args.type,
new_domain=args.new_domain,
)
if args.json:
output_str = json.dumps(_serialize_plan(plan), indent=2, ensure_ascii=False)
else:
output_str = _format_text_report(plan)
if args.output:
with open(args.output, "w", encoding="utf-8") as f:
f.write(output_str)
logger.info(f"Migration plan saved to {args.output}")
else:
print(output_str)
planner.print_stats()
def main() -> None:
args = parse_args()
asyncio.run(async_main(args))
if __name__ == "__main__":
main()

@@ -0,0 +1,8 @@
# 33-seo-migration-planner dependencies
requests>=2.31.0
aiohttp>=3.9.0
pandas>=2.1.0
tenacity>=8.2.0
tqdm>=4.66.0
python-dotenv>=1.0.0
rich>=13.7.0

@@ -0,0 +1,171 @@
---
name: seo-migration-planner
description: |
SEO site migration planning and monitoring. Triggers: site migration, domain move, redirect mapping, platform migration, URL restructuring, HTTPS migration, subdomain consolidation, 사이트 이전, 도메인 이전, 리디렉트 매핑.
---
# SEO Migration Planner & Monitor
## Purpose
Comprehensive site migration planning and post-migration monitoring for SEO: crawl-based URL inventory, traffic/keyword baseline capture via Ahrefs, redirect map generation with per-URL risk scoring, pre-migration checklist creation, and post-launch traffic/indexation/ranking recovery tracking with automated alerts. Supports domain moves, platform changes, URL restructuring, HTTPS migrations, and subdomain consolidation.
## Core Capabilities
1. **URL Inventory** - Crawl entire site via Firecrawl to capture all URLs and status codes
2. **Traffic Baseline** - Capture per-page traffic, keywords, and backlinks via Ahrefs
3. **Redirect Map Generation** - Create old URL -> new URL mappings with 301 redirect rules
4. **Risk Scoring** - Score each URL (0-100) based on traffic, backlinks, and keyword rankings
5. **Pre-Migration Checklist** - Generate type-specific migration checklist (Korean)
6. **Post-Migration Traffic Comparison** - Compare pre vs post traffic by page group
7. **Redirect Health Check** - Detect broken redirects, chains, and loops
8. **Indexation Tracking** - Monitor indexed page count changes and missing pages
9. **Ranking Monitoring** - Track keyword position changes for priority keywords
10. **Recovery Estimation** - Estimate traffic recovery timeline based on migration type
11. **Alert Generation** - Flag traffic drops >20%, broken redirects, indexation loss
## MCP Tool Usage
### Ahrefs for SEO Baseline & Monitoring
```
mcp__ahrefs__site-explorer-metrics: Current organic metrics (traffic, keywords)
mcp__ahrefs__site-explorer-metrics-history: Historical metrics for pre/post comparison
mcp__ahrefs__site-explorer-top-pages: Top performing pages for baseline
mcp__ahrefs__site-explorer-pages-by-traffic: Pages ranked by traffic for risk scoring
mcp__ahrefs__site-explorer-organic-keywords: Keyword rankings per page
mcp__ahrefs__site-explorer-referring-domains: Referring domains for risk scoring
mcp__ahrefs__site-explorer-backlinks-stats: Backlink overview for migration impact
```
### Firecrawl for URL Inventory & Redirect Verification
```
mcp__firecrawl__firecrawl_crawl: Crawl entire site for URL inventory
mcp__firecrawl__firecrawl_scrape: Verify individual redirect health
```
### Notion for Report Storage
```
mcp__notion__notion-create-pages: Save reports to SEO Audit Log
```
### Perplexity for Migration Best Practices
```
mcp__perplexity__search: Research migration best practices and common pitfalls
```
## Workflow
### Pre-Migration Planning
1. Accept target domain, migration type, and new domain (if applicable)
2. Crawl URL inventory via Firecrawl (capture all URLs + status codes)
3. Fetch Ahrefs top pages baseline (traffic, keywords, backlinks per page)
4. Fetch site-level metrics (total traffic, keywords, referring domains)
5. Enrich URL inventory with Ahrefs traffic/backlink data
6. Score risk per URL (0-100) based on traffic weight (40%), backlinks (30%), keywords (30%)
7. Generate redirect map (old URL -> new URL) based on migration type
8. Aggregate risk assessment (high/medium/low URL counts, overall risk level)
9. Generate pre-migration checklist (common + type-specific items, in Korean)
10. Save baseline and plan to Notion
### Post-Migration Monitoring
1. Accept domain, migration date, and optional baseline JSON
2. Compare pre vs post traffic using Ahrefs metrics history
3. Check redirect health via Firecrawl (broken, chains, loops)
4. Track indexation changes (pre vs post page count, missing pages)
5. Track keyword ranking changes for priority keywords
6. Estimate recovery timeline based on traffic delta and migration type
7. Generate alerts for significant issues (traffic >20% drop, broken redirects, etc.)
8. Save monitoring report to Notion
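Step 6's recovery estimate can be approximated as a linear ramp toward the expected window. The function name, the linear ramp, and the 10-point grace band are all assumptions for illustration, not the skill's actual heuristic:

```python
def recovery_status(baseline_traffic: int, current_traffic: int,
                    days_elapsed: int, expected_weeks: int) -> dict:
    """Compare the current recovery rate against a linear ramp that
    reaches 100% at the end of the expected recovery window."""
    rate = 100 * current_traffic / baseline_traffic if baseline_traffic else 0.0
    expected = min(100.0, 100 * days_elapsed / (expected_weeks * 7))
    return {
        "recovery_pct": round(rate, 1),
        "expected_pct": round(expected, 1),
        "on_track": rate >= expected - 10,  # 10-point grace band (assumption)
    }
```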
## Output Format
### Planning Report
```markdown
## SEO 사이트 이전 계획: [domain]
### 베이스라인
- 전체 URL 수: [count]
- 오가닉 트래픽: [traffic]
- 오가닉 키워드: [keywords]
- 참조 도메인: [count]
### 위험 평가
- 전체 위험도: [HIGH/MEDIUM/LOW]
- 고위험 URL: [count]개
- 중위험 URL: [count]개
- 저위험 URL: [count]개
### 리디렉트 맵 (상위 위험 URL)
| Source URL | Target URL | Risk Score | Priority |
|------------|------------|------------|----------|
### 사전 체크리스트
- [ ] Step 1: ...
- [ ] Step 2: ...
```
### Monitoring Report
```markdown
## SEO 이전 모니터링 보고서: [domain]
### 이전일: [date] | 경과일: [N]일
### 알림
- [severity] [message]
### 트래픽 비교
| Page Group | Pre | Post | Change | Status |
|------------|-----|------|--------|--------|
### 리디렉트 상태
- 전체: [count] | 정상: [count] | 깨짐: [count] | 체인: [count]
### 인덱싱 현황
- 이전 전: [count] | 이전 후: [count] | 변화: [pct]%
### 회복 예상
- 예상 기간: [weeks]주
- 현재 회복률: [pct]%
```
## Risk Scoring Methodology
| Factor | Weight | Scale |
|--------|--------|-------|
| Traffic | 40% | 1,000+ monthly visits = high risk |
| Backlinks | 30% | 50+ referring domains = high risk |
| Keywords | 30% | 20+ keyword rankings = high risk |
### Priority Classification
| Risk Score | Priority | Action |
|------------|----------|--------|
| 75-100 | Critical | Manual redirect verification required |
| 50-74 | High | Priority redirect with monitoring |
| 25-49 | Medium | Standard redirect |
| 0-24 | Low | Batch redirect |
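A minimal sketch of the scoring above, with each factor saturating at its "high risk" threshold. Only the weights, thresholds, and priority bands come from the tables; the saturation behavior and function names are assumptions:

```python
def risk_score(traffic: int, ref_domains: int, keywords: int) -> float:
    """Per-URL risk score (0-100): traffic 40%, backlinks 30%, keywords 30%."""
    return (min(traffic / 1000, 1.0) * 40      # saturates at 1,000 monthly visits
            + min(ref_domains / 50, 1.0) * 30  # saturates at 50 referring domains
            + min(keywords / 20, 1.0) * 30)    # saturates at 20 keyword rankings

def priority(score: float) -> str:
    """Map a risk score to the priority bands in the table above."""
    if score >= 75:
        return "critical"
    if score >= 50:
        return "high"
    if score >= 25:
        return "medium"
    return "low"
```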
## Alert Thresholds
| Alert Type | Threshold | Severity |
|------------|-----------|----------|
| Traffic drop | >20% | warning; >40% critical |
| Broken redirects | >0 | warning; >10 critical |
| Redirect chains | >0 | warning |
| Indexation loss | >10% | warning; >30% critical |
| Ranking drop | >5 positions (volume 100+) | warning; >20 keywords critical |
## Limitations
- Ahrefs data has ~24h freshness lag
- Firecrawl crawl limited to 5,000 URLs per run
- Redirect chain detection depends on Firecrawl following redirects
- Recovery estimation is a heuristic based on industry averages
- URL restructuring requires manual mapping rules (no auto-pattern detection)
## Notion Output (Required)
All reports MUST be saved to OurDigital SEO Audit Log:
- **Database ID**: `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
- **Properties**: Issue (title), Site (url), Category ("SEO Migration"), Priority, Found Date, Audit ID
- **Language**: Korean with English technical terms
- **Audit ID Format**: MIGR-YYYYMMDD-NNN

@@ -0,0 +1,10 @@
name: seo-migration-planner
description: |
SEO site migration planning and monitoring. Triggers: site migration, domain move, redirect mapping, platform migration, URL restructuring, 사이트 이전.
allowed-tools:
- mcp__ahrefs__*
- mcp__firecrawl__*
- mcp__notion__*
- mcp__perplexity__*
- WebSearch
- WebFetch

@@ -0,0 +1,37 @@
# Ahrefs
> MCP tool documentation for migration planner skill
## Available Commands
- `site-explorer-metrics` - Get current organic metrics (traffic, keywords) for a domain
- `site-explorer-metrics-history` - Get historical organic metrics for pre/post comparison
- `site-explorer-top-pages` - Get top performing pages by traffic for baseline
- `site-explorer-pages-by-traffic` - Get pages ranked by organic traffic for risk scoring
- `site-explorer-organic-keywords` - Get keyword rankings per page
- `site-explorer-referring-domains` - Get referring domain list for risk scoring
- `site-explorer-backlinks-stats` - Get backlink overview for migration impact assessment
## Configuration
- Requires Ahrefs MCP server configured in Claude Desktop
- API access via `mcp__ahrefs__*` tool prefix
## Examples
```
# Get site baseline metrics
mcp__ahrefs__site-explorer-metrics(target="example.com")
# Get top pages for risk scoring
mcp__ahrefs__site-explorer-top-pages(target="example.com", limit=500)
# Get traffic history for pre/post comparison
mcp__ahrefs__site-explorer-metrics-history(target="example.com", date_from="2025-01-01")
# Get backlink stats for a specific page
mcp__ahrefs__site-explorer-backlinks-stats(target="https://example.com/important-page")
# Get keyword rankings
mcp__ahrefs__site-explorer-organic-keywords(target="example.com", limit=200)
```

@@ -0,0 +1,29 @@
# Firecrawl
> MCP tool documentation for URL inventory crawling and redirect verification
## Available Commands
- `firecrawl_crawl` - Crawl entire site to capture all URLs and status codes for migration inventory
- `firecrawl_scrape` - Scrape individual pages to verify redirect health (status codes, chains, final URL)
## Configuration
- Requires Firecrawl MCP server configured in Claude Desktop
- API access via `mcp__firecrawl__*` tool prefix
## Examples
```
# Crawl full site for URL inventory
mcp__firecrawl__firecrawl_crawl(url="https://example.com", limit=5000, scrapeOptions={"formats": ["links"]})
# Verify a redirect
mcp__firecrawl__firecrawl_scrape(url="https://old-example.com/page", formats=["links"])
```
## Notes
- Crawl limit defaults to 5,000 URLs per run
- For larger sites, run multiple crawls with path-based filtering
- Redirect verification returns status_code, final_url, and redirect_chain

@@ -0,0 +1,46 @@
# Notion
> MCP tool documentation for saving migration planning and monitoring reports
## Available Commands
- `notion-create-pages` - Create new pages in the SEO Audit Log database
- `notion-update-page` - Update existing audit entries
- `notion-query-database-view` - Query existing reports
- `notion-search` - Search across Notion workspace
## Configuration
- Database ID: `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
- All reports saved with Category: "SEO Migration"
- Audit ID format: MIGR-YYYYMMDD-NNN
## Examples
```
# Create migration planning report
mcp__notion__notion-create-pages(
database_id="2c8581e5-8a1e-8035-880b-e38cefc2f3ef",
properties={
"Issue": {"title": [{"text": {"content": "사이트 이전 계획 - example.com - 2025-01-15"}}]},
"Site": {"url": "https://example.com"},
"Category": {"select": {"name": "SEO Migration"}},
"Priority": {"select": {"name": "High"}},
"Found Date": {"date": {"start": "2025-01-15"}},
"Audit ID": {"rich_text": [{"text": {"content": "MIGR-20250115-001"}}]}
}
)
# Create post-migration monitoring report
mcp__notion__notion-create-pages(
database_id="2c8581e5-8a1e-8035-880b-e38cefc2f3ef",
properties={
"Issue": {"title": [{"text": {"content": "이전 모니터링 보고서 - new-example.com - 2025-02-01"}}]},
"Site": {"url": "https://new-example.com"},
"Category": {"select": {"name": "SEO Migration"}},
"Priority": {"select": {"name": "Critical"}},
"Found Date": {"date": {"start": "2025-02-01"}},
"Audit ID": {"rich_text": [{"text": {"content": "MIGR-20250201-001"}}]}
}
)
```