Add SEO skills 33-34 and fix bugs in skills 19-34
New skills:
- Skill 33: Site migration planner with redirect mapping and monitoring
- Skill 34: Reporting dashboard with HTML charts and Korean executive reports

Bug fixes (Skill 34 - report_aggregator.py):
- Add audit_type fallback for skill identification (was only using audit_id prefix)
- Extract health scores from nested data dict (technical_score, onpage_score, etc.)
- Support subdomain matching in domain filter (blog.ourdigital.org matches ourdigital.org)
- Skip self-referencing DASH- aggregated reports

Bug fixes (Skill 20 - naver_serp_analyzer.py):
- Remove VIEW tab selectors (removed by Naver in 2026)
- Add new section detectors: books (도서), shortform (숏폼), influencer (인플루언서)

Improvements (Skill 34 - dashboard/executive report):
- Add Korean category labels for Chart.js charts (기술 SEO, 온페이지, etc.)
- Add Korean trend labels (개선 중 ↑, 안정 →, 하락 중 ↓)
- Add English→Korean issue description translation layer (20 common patterns)

Documentation improvements:
- Add Korean triggers to 4 skill descriptions (19, 25, 28, 31)
- Expand Skill 32 SKILL.md from 40 to 143 lines (was 6/10; added workflow, output format, limitations)
- Add output format examples to Skills 27 and 28 SKILL.md
- Add limitations sections to Skills 27 and 28
- Update README.md, CLAUDE.md, AGENTS.md for skills 33-34

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
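The two report_aggregator.py fixes above (the audit_type fallback and the subdomain-aware domain filter) can be sketched roughly as below. The helper names and report shape are assumptions for illustration; the script itself is not shown in this diff.

```python
def identify_skill(report: dict) -> str:
    """Identify the source skill: prefer the audit_id prefix, fall back to audit_type."""
    audit_id = report.get("audit_id", "")
    if "-" in audit_id:
        # e.g. "CRAWL-20250115-001" -> "CRAWL"
        return audit_id.split("-", 1)[0]
    return report.get("audit_type", "unknown")


def domain_matches(report_domain: str, filter_domain: str) -> bool:
    """True for an exact match or a subdomain (blog.ourdigital.org vs ourdigital.org)."""
    report_domain = report_domain.lower().rstrip(".")
    filter_domain = filter_domain.lower().rstrip(".")
    # endswith("." + filter) avoids false positives like notourdigital.org
    return report_domain == filter_domain or report_domain.endswith("." + filter_domain)
```

The suffix check with a leading dot is the important part of the fix: a plain `endswith(filter_domain)` would also match unrelated domains that merely share a suffix.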
@@ -115,15 +115,17 @@ Task 2: general-purpose - "Implement the planned skill" # Needs Task 1 result
 ## Domain-Specific Routing
 
-### SEO Skills (11-32)
+### SEO Skills (11-34)
 
 - Use **Explore** to understand existing SEO script patterns
 - Python scripts in these skills follow `base_client.py` patterns (RateLimiter, ConfigManager, BaseAsyncClient)
 - `11-seo-comprehensive-audit` orchestrates skills 12-18 for unified audits
 - Skills 19-28 provide advanced SEO capabilities (keyword strategy, SERP analysis, position tracking, link building, content strategy, e-commerce, KPI framework, international SEO, AI visibility, knowledge graph)
 - Skills 31-32 cover competitor intelligence and crawl budget optimization
+- Skill 33 provides site migration planning (pre-migration baseline, redirect mapping, risk assessment, post-migration monitoring)
+- Skill 34 aggregates outputs from all SEO skills into executive reports, HTML dashboards, and Korean-language summaries
 - All SEO skills integrate with Ahrefs MCP tools and output to the Notion SEO Audit Log database
-- Slash commands available: `/seo-keyword-strategy`, `/seo-serp-analysis`, `/seo-position-tracking`, `/seo-link-building`, `/seo-content-strategy`, `/seo-ecommerce`, `/seo-kpi-framework`, `/seo-international`, `/seo-ai-visibility`, `/seo-knowledge-graph`, `/seo-competitor-intel`, `/seo-crawl-budget`
+- Slash commands available: `/seo-keyword-strategy`, `/seo-serp-analysis`, `/seo-position-tracking`, `/seo-link-building`, `/seo-content-strategy`, `/seo-ecommerce`, `/seo-kpi-framework`, `/seo-international`, `/seo-ai-visibility`, `/seo-knowledge-graph`, `/seo-competitor-intel`, `/seo-crawl-budget`, `/seo-migration-planner`, `/seo-reporting-dashboard`
 
 ### GTM Skills (60-69)
 
@@ -202,7 +204,7 @@ For long-running tasks, use `run_in_background: true`:
 
 ```
 # Good candidates for background execution:
-- Full skill audit across all 50 skills
+- Full skill audit across all 52 skills
 - Running Python tests on multiple skills
 - Generating comprehensive documentation
 
@@ -7,7 +7,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 **GitHub**: https://github.com/ourdigital/our-claude-skills
 
 This is a Claude Skills collection repository containing:
-- **custom-skills/**: 50 custom skills for OurDigital workflows, SEO, GTM, Jamie Brand, NotebookLM, Notion, Reference Curation, and Multi-Agent Collaboration
+- **custom-skills/**: 52 custom skills for OurDigital workflows, SEO, GTM, Jamie Brand, NotebookLM, Notion, Reference Curation, and Multi-Agent Collaboration
 - **example-skills/**: Reference examples from Anthropic's official skills repository
 - **official-skills/**: Notion integration skills (3rd party)
 - **reference/**: Skill format requirements documentation
@@ -35,7 +35,7 @@ This is a Claude Skills collection repository containing:
 | 09 | ourdigital-backoffice | Business document creation | "create proposal", "견적서" |
 | 10 | ourdigital-skill-creator | Meta skill for creating skills | "create skill", "init skill" |
 
-### SEO Tools (11-32)
+### SEO Tools (11-34)
 
 | # | Skill | Purpose | Trigger |
 |---|-------|---------|---------|
@@ -61,6 +61,8 @@ This is a Claude Skills collection repository containing:
 | 30 | seo-gateway-builder | Gateway page content | "build gateway page" |
 | 31 | seo-competitor-intel | Competitor profiling, benchmarking, threats | "competitor analysis", "competitive intel" |
 | 32 | seo-crawl-budget | Log analysis, bot profiling, crawl waste | "crawl budget", "log analysis" |
+| 33 | seo-migration-planner | Site migration planning, redirect mapping | "site migration", "domain move", "사이트 이전" |
+| 34 | seo-reporting-dashboard | Executive reports, HTML dashboards, aggregation | "SEO report", "SEO dashboard", "보고서" |
 
 ### GTM/GA Tools (60-69)
 
@@ -221,6 +223,8 @@ our-claude-skills/
 │   ├── 30-seo-gateway-builder/
 │   ├── 31-seo-competitor-intel/
 │   ├── 32-seo-crawl-budget/
+│   ├── 33-seo-migration-planner/
+│   ├── 34-seo-reporting-dashboard/
 │   │
 │   ├── 60-gtm-audit/
 │   ├── 61-gtm-manager/
@@ -2,7 +2,7 @@
 
 > **Internal R&D Repository** - This repository is restricted for internal use only.
 
-A collection of **50 custom Claude Skills** for OurDigital workflows, Jamie Plastic Surgery Clinic brand management, SEO/GTM tools, NotebookLM automation, Notion integrations, reference documentation curation, and multi-agent collaboration.
+A collection of **52 custom Claude Skills** for OurDigital workflows, Jamie Plastic Surgery Clinic brand management, SEO/GTM tools, NotebookLM automation, Notion integrations, reference documentation curation, and multi-agent collaboration.
 
 ## Quick Install
 
@@ -35,7 +35,7 @@ cd our-claude-skills/custom-skills/_ourdigital-shared
 | 09 | `ourdigital-backoffice` | Business document creation |
 | 10 | `ourdigital-skill-creator` | Meta skill for creating/managing skills |
 
-### SEO Tools (11-32)
+### SEO Tools (11-34)
 
 | # | Skill | Purpose |
 |---|-------|---------|
@@ -61,6 +61,8 @@ cd our-claude-skills/custom-skills/_ourdigital-shared
 | 30 | `seo-gateway-builder` | Gateway page content generation |
 | 31 | `seo-competitor-intel` | Competitor profiling, benchmarking, threat scoring |
 | 32 | `seo-crawl-budget` | Log analysis, bot profiling, crawl waste detection |
+| 33 | `seo-migration-planner` | Site migration planning, redirect mapping, monitoring |
+| 34 | `seo-reporting-dashboard` | Executive reports, HTML dashboards, data aggregation |
 
 ### GTM/GA Tools (60-69)
 
@@ -148,7 +150,7 @@ our-claude-skills/
 │   │
 │   ├── 00-our-settings-audit/
 │   ├── 01-10 (OurDigital core)
-│   ├── 11-32 (SEO tools)
+│   ├── 11-34 (SEO tools)
 │   ├── 60-62 (GTM/GA tools)
 │   ├── 31-32 (Notion tools)
 │   ├── 40-45 (Jamie clinic)
@@ -1,7 +1,10 @@
 ---
 name: seo-keyword-strategy
 description: |
-  Keyword strategy and research for SEO campaigns. Triggers: keyword research, keyword analysis, keyword gap, search volume, keyword clustering, intent classification.
+  Keyword strategy and research for SEO campaigns.
+  Triggers: keyword research, keyword analysis, keyword gap, search volume,
+  keyword clustering, intent classification, 키워드 전략, 키워드 분석,
+  키워드 리서치, 검색량 분석, 키워드 클러스터링.
 ---
 
 # SEO Keyword Strategy & Research
@@ -59,11 +59,11 @@ python scripts/naver_serp_analyzer.py --keywords-file keywords.txt --json
 ```
 
 **Capabilities**:
-- Naver section detection (블로그, 카페, 지식iN, 스마트스토어, 브랜드존, VIEW탭)
+- Naver section detection (블로그, 카페, 지식iN, 스마트스토어, 브랜드존, 도서, 숏폼, 인플루언서)
 - Section priority mapping (which sections appear above fold)
 - Content type distribution per section
 - Brand zone presence detection
-- VIEW tab content analysis
+- Shortform/influencer content analysis
 
 ## Ahrefs MCP Tools Used
 
@@ -80,13 +80,6 @@ NAVER_SECTION_SELECTORS: dict[str, list[str]] = {
         "type_brand",
         "sc_new.sp_brand",
     ],
-    "view_tab": [
-        "sp_view",
-        "view_widget",
-        "sc_new.sp_view",
-        "type_view",
-        "api_subject_view",
-    ],
     "news": [
         "sp_nnews",
         "news_widget",
@@ -132,6 +125,26 @@ NAVER_SECTION_SELECTORS: dict[str, list[str]] = {
         "type_ad",
         "nx_ad",
     ],
+    "books": [
+        "sp_book",
+        "sc_new.sp_book",
+        "type_book",
+        "api_subject_book",
+        "nx_book",
+    ],
+    "shortform": [
+        "sp_shortform",
+        "sc_new.sp_shortform",
+        "type_shortform",
+        "sp_shorts",
+        "type_shorts",
+    ],
+    "influencer": [
+        "sp_influencer",
+        "sc_new.sp_influencer",
+        "type_influencer",
+        "api_subject_influencer",
+    ],
 }
 
 # Section display names in Korean
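As a rough illustration of how class-name selector tokens like the ones added above can be matched against fetched Naver SERP markup, here is a minimal stdlib-only sketch. The reduced selector set and the matching logic are assumptions; the real naver_serp_analyzer.py is only partially shown in this diff.

```python
from html.parser import HTMLParser

# Small subset of the selector tokens from NAVER_SECTION_SELECTORS (illustrative).
SECTION_SELECTORS = {
    "books": ["sp_book", "type_book"],
    "shortform": ["sp_shortform", "sp_shorts"],
    "influencer": ["sp_influencer", "type_influencer"],
}


class SectionDetector(HTMLParser):
    """Collect section types whose selector token appears as a CSS class."""

    def __init__(self):
        super().__init__()
        self.found = set()

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for section, tokens in SECTION_SELECTORS.items():
            if any(token in classes for token in tokens):
                self.found.add(section)


detector = SectionDetector()
detector.feed('<section class="sc_new sp_book"></section><div class="sp_shorts"></div>')
```

Compound selectors such as `sc_new.sp_book` would additionally require both classes on the same element; the sketch only checks single tokens.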
@@ -141,13 +154,15 @@ SECTION_DISPLAY_NAMES: dict[str, str] = {
     "knowledge_in": "지식iN",
     "smart_store": "스마트스토어",
     "brand_zone": "브랜드존",
-    "view_tab": "VIEW",
     "news": "뉴스",
     "encyclopedia": "백과사전",
     "image": "이미지",
     "video": "동영상",
     "place": "플레이스",
     "ad": "광고",
+    "books": "도서",
+    "shortform": "숏폼",
+    "influencer": "인플루언서",
 }
 
 # Default headers for Naver requests
@@ -199,7 +214,6 @@ class NaverSerpResult:
     above_fold_sections: list[str] = field(default_factory=list)
     ad_count: int = 0
     dominant_section: str = ""
-    has_view_tab: bool = False
     has_place_section: bool = False
     timestamp: str = ""
 
@@ -485,7 +499,6 @@ class NaverSerpAnalyzer:
         ad_count = sum(s.item_count for s in ad_sections) if ad_sections else 0
 
         # Check special sections
-        has_view = any(s.section_type == "view_tab" for s in sections)
         has_place = any(s.section_type == "place" for s in sections)
         dominant = self._find_dominant_section(sections)
 
@@ -499,7 +512,6 @@ class NaverSerpAnalyzer:
             above_fold_sections=above_fold,
             ad_count=ad_count,
             dominant_section=dominant,
-            has_view_tab=has_view,
             has_place_section=has_place,
         )
         return result
@@ -534,7 +546,6 @@ def print_rich_report(result: NaverSerpResult) -> None:
     summary_table.add_row("Brand Zone", "Yes" if result.brand_zone_present else "No")
     if result.brand_zone_brand:
         summary_table.add_row("Brand Name", result.brand_zone_brand)
-    summary_table.add_row("VIEW Tab", "Yes" if result.has_view_tab else "No")
     summary_table.add_row("Place Section", "Yes" if result.has_place_section else "No")
     summary_table.add_row("Dominant Section", result.dominant_section or "N/A")
     console.print(summary_table)
@@ -17,7 +17,7 @@ Analyze search engine result page composition for Google and Naver. Detect SERP
|
|||||||
2. **Competitor Position Mapping** - Extract domains, positions, content types for top organic results
|
2. **Competitor Position Mapping** - Extract domains, positions, content types for top organic results
|
||||||
3. **Opportunity Scoring** - Score SERP opportunity (0-100) based on feature landscape and competition
|
3. **Opportunity Scoring** - Score SERP opportunity (0-100) based on feature landscape and competition
|
||||||
4. **Search Intent Validation** - Infer intent (informational, navigational, commercial, transactional, local) from SERP composition
|
4. **Search Intent Validation** - Infer intent (informational, navigational, commercial, transactional, local) from SERP composition
|
||||||
5. **Naver SERP Composition** - Detect sections (blog, cafe, knowledge iN, Smart Store, brand zone, VIEW tab), map section priority, analyze brand zone presence
|
5. **Naver SERP Composition** - Detect sections (blog, cafe, knowledge iN, Smart Store, brand zone, books, shortform, influencer), map section priority, analyze brand zone presence
|
||||||
|
|
||||||
## MCP Tool Usage
|
## MCP Tool Usage
|
||||||
|
|
||||||
@@ -53,7 +53,7 @@ WebFetch: Fetch Naver SERP HTML for section analysis
 
 ### 2. Naver SERP Analysis
 1. Fetch Naver search page for the target keyword
-2. Detect SERP sections (blog, cafe, knowledge iN, Smart Store, brand zone, VIEW tab, news, encyclopedia)
+2. Detect SERP sections (blog, cafe, knowledge iN, Smart Store, brand zone, news, encyclopedia, books, shortform, influencer)
 3. Map section priority (above-fold order)
 4. Check brand zone presence and extract brand name
 5. Count items per section
@@ -2,7 +2,8 @@
 name: seo-kpi-framework
 description: |
   SEO KPI and performance framework for unified metrics, health scores, ROI, and period-over-period reporting.
-  Triggers: SEO KPI, performance report, health score, SEO metrics, ROI, baseline, targets.
+  Triggers: SEO KPI, performance report, health score, SEO metrics, ROI,
+  baseline, targets, SEO 성과 지표, KPI 대시보드, SEO 성과 보고서.
 ---
 
 # SEO KPI & Performance Framework
@@ -57,6 +57,55 @@ All reports are saved to the OurDigital SEO Audit Log:
 - **Audit ID Format**: AI-YYYYMMDD-NNN
 - **Language**: Korean (technical terms in English)
 
+## Output Format
+
+```json
+{
+  "domain": "example.com",
+  "impressions": {
+    "total": 15000,
+    "trend": "increasing",
+    "period": "30d"
+  },
+  "mentions": {
+    "total": 450,
+    "positive": 320,
+    "neutral": 100,
+    "negative": 30,
+    "sentiment_score": 0.72
+  },
+  "share_of_voice": {
+    "domain_sov": 12.5,
+    "competitors": {
+      "competitor1.com": 18.3,
+      "competitor2.com": 15.1
+    }
+  },
+  "cited_pages": [
+    {"url": "https://example.com/guide", "citations": 45},
+    {"url": "https://example.com/faq", "citations": 28}
+  ],
+  "cited_domains": [
+    {"domain": "example.com", "citations": 120},
+    {"domain": "competitor1.com", "citations": 95}
+  ],
+  "recommendations": [
+    "Create more FAQ-style content for AI citation capture",
+    "Add structured data to improve AI answer extraction"
+  ],
+  "audit_id": "AI-20250115-001",
+  "timestamp": "2025-01-15T14:30:00"
+}
+```
+
+## Limitations
+
+- Requires Ahrefs Brand Radar API access (not available in basic plans)
+- AI search landscape changes rapidly; data may not reflect real-time state
+- Share of Voice metrics are relative to tracked competitor set only
+- Sentiment analysis based on AI-generated text, not user perception
+- Cannot distinguish between different AI engines (ChatGPT, Gemini, Perplexity) without Brand Radar
 
 ## Example Queries
 
 - "Analyze AI search visibility for example.com" ("example.com의 AI 검색 가시성을 분석해줘")
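For reports shaped like the JSON example above, a tiny schema check is useful before they are aggregated downstream. The required-key set below is an assumption inferred from the example, not a documented contract.

```python
# Top-level keys assumed from the example report (illustrative, not authoritative).
REQUIRED_KEYS = {
    "domain", "impressions", "mentions", "share_of_voice",
    "cited_pages", "cited_domains", "recommendations",
    "audit_id", "timestamp",
}


def validate_report(report: dict) -> list:
    """Return the sorted list of missing top-level keys; empty means valid."""
    return sorted(REQUIRED_KEYS - report.keys())
```

A non-empty return value can be logged or used to skip a malformed report during aggregation.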
@@ -1,7 +1,10 @@
 ---
 name: seo-knowledge-graph
 description: |
-  Knowledge Graph and entity SEO analysis. Triggers: knowledge panel, entity SEO, knowledge graph, PAA, FAQ schema, Wikipedia, Wikidata, brand entity.
+  Knowledge Graph and entity SEO analysis.
+  Triggers: knowledge panel, entity SEO, knowledge graph, PAA, FAQ schema,
+  Wikipedia, Wikidata, brand entity, 지식 그래프, 엔티티 SEO,
+  지식 패널, 브랜드 엔티티, 위키데이터.
 ---
 
 # Knowledge Graph & Entity SEO
@@ -69,9 +72,60 @@ All reports must be saved to the OurDigital SEO Audit Log database.
 
 Report content should be written in Korean (한국어), keeping technical English terms as-is.
 
+## Output Format
+
+```json
+{
+  "entity_name": "OurDigital",
+  "knowledge_panel": {
+    "present": false,
+    "attributes": {}
+  },
+  "entity_presence": {
+    "wikipedia": false,
+    "wikidata": false,
+    "wikidata_qid": null,
+    "naver_encyclopedia": false,
+    "naver_knowledge_in": false,
+    "google_knowledge_panel": false
+  },
+  "entity_schema": {
+    "organization_count": 2,
+    "person_count": 1,
+    "same_as_links": ["https://linkedin.com/...", "https://facebook.com/..."],
+    "same_as_count": 2,
+    "issues": [
+      "Duplicate Organization schemas with inconsistent names",
+      "Placeholder image in Organization schema",
+      "Only 2 sameAs links (recommend 6+)"
+    ]
+  },
+  "paa_questions": [],
+  "faq_schema_present": false,
+  "entity_completeness_score": 12,
+  "recommendations": [
+    "Create Wikidata entity for brand recognition",
+    "Add 4-6 more sameAs social profile links",
+    "Replace placeholder image with actual brand logo",
+    "Consolidate duplicate Organization schemas",
+    "Add FAQPage schema to relevant pages"
+  ],
+  "audit_id": "KG-20250115-001",
+  "timestamp": "2025-01-15T14:30:00"
+}
+```
+
+## Limitations
+
+- Google Knowledge Panel detection via search results is not guaranteed (personalization, location-based)
+- Direct Google scraping may be blocked (403/429); prefer WebSearch tool
+- Wikipedia/Wikidata creation requires meeting notability guidelines
+- PAA questions vary by location and device
+- Entity completeness scoring is heuristic-based
 
 ## Reference Scripts
 
 Located in `code/scripts/`:
-- `knowledge_graph_analyzer.py` -- Knowledge Panel and entity presence analysis
-- `entity_auditor.py` -- Entity SEO signals and PAA/FAQ audit
-- `base_client.py` -- Shared async client utilities
+- `knowledge_graph_analyzer.py` — Knowledge Panel and entity presence analysis
+- `entity_auditor.py` — Entity SEO signals and PAA/FAQ audit
+- `base_client.py` — Shared async client utilities
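Since the limitations above note that entity completeness scoring is heuristic-based, here is one possible shape of such a heuristic. The signal weights are illustrative assumptions, not the skill's actual formula.

```python
def completeness_score(presence: dict, same_as_count: int, faq_schema: bool) -> int:
    """Toy entity-completeness heuristic (0-100) over presence signals.

    Weights are made up for illustration; a real scorer would be calibrated.
    """
    weights = {
        "wikipedia": 25,
        "wikidata": 25,
        "google_knowledge_panel": 20,
        "naver_encyclopedia": 5,
        "naver_knowledge_in": 5,
    }
    score = sum(weight for signal, weight in weights.items() if presence.get(signal))
    score += min(same_as_count, 6) * 2  # up to 12 points for sameAs links
    if faq_schema:
        score += 8
    return min(score, 100)
```

Capping the sameAs contribution mirrors the "recommend 6+" guidance: beyond six links the marginal entity signal is assumed to flatten.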
@@ -1,7 +1,10 @@
|
|||||||
---
|
---
|
||||||
name: seo-competitor-intel
|
name: seo-competitor-intel
|
||||||
description: |
|
description: |
|
||||||
Competitor intelligence and SEO benchmarking. Triggers: competitor analysis, competitive intelligence, competitor comparison, threat assessment, market position, benchmarking.
|
Competitor intelligence and SEO benchmarking.
|
||||||
|
Triggers: competitor analysis, competitive intelligence, competitor comparison,
|
||||||
|
threat assessment, market position, benchmarking, 경쟁사 분석,
|
||||||
|
경쟁 인텔리전스, 벤치마킹, 경쟁사 비교.
|
||||||
---
|
---
|
||||||
|
|
||||||
# SEO Competitor Intelligence & Benchmarking
|
# SEO Competitor Intelligence & Benchmarking
|
||||||
|
|||||||
@@ -1,39 +1,142 @@
|
|||||||
---
|
---
|
||||||
name: seo-crawl-budget
|
name: seo-crawl-budget
|
||||||
description: |
|
description: |
|
||||||
Crawl budget optimization and log analysis. Triggers: crawl budget, log analysis, bot crawling, Googlebot, crawl waste, orphan pages, crawl efficiency.
|
Crawl budget optimization and server log analysis for search engine bots.
|
||||||
|
Triggers: crawl budget, log analysis, bot crawling, Googlebot, crawl waste,
|
||||||
|
orphan pages, crawl efficiency, 크롤 예산, 로그 분석, 크롤 최적화.
|
||||||
---
|
---
|
||||||
|
|
||||||
# Crawl Budget Optimizer
|
# Crawl Budget Optimizer
|
||||||
|
|
||||||
Analyze server access logs to identify crawl budget waste and generate optimization recommendations for search engine bots.
|
Analyze server access logs to identify crawl budget waste and generate optimization recommendations for search engine bots (Googlebot, Yeti/Naver, Bingbot, Daumoa/Kakao).
|
||||||
|
|
||||||
## Capabilities
|
## Capabilities
|
||||||
|
|
||||||
1. **Log Analysis**: Parse Nginx/Apache/CloudFront access logs to extract bot crawl data
|
### Log Analysis
|
||||||
2. **Bot Profiling**: Per-bot behavior analysis (Googlebot, Yeti, Bingbot, Daumoa)
|
- Parse Nginx combined, Apache combined, and CloudFront log formats
|
||||||
3. **Waste Detection**: Parameter URLs, redirect chains, soft 404s, duplicate URL variants
|
- Support for gzip/bzip2 compressed logs
|
||||||
4. **Orphan Pages**: Pages in sitemap but uncrawled, and crawled pages not in sitemap
|
- Streaming parser for files >1GB
|
||||||
5. **Recommendations**: Prioritized action items for crawl budget optimization
|
- Date range filtering
|
||||||
|
- Custom format via regex
|
||||||
|
|
||||||
|
### Bot Profiling
|
||||||
|
- Identify bots by User-Agent: Googlebot (and variants), Yeti (Naver), Bingbot, Daumoa (Kakao), Applebot, DuckDuckBot, Baiduspider
|
||||||
|
- Per-bot metrics: requests/day, requests/hour, unique URLs crawled
|
||||||
|
- Status code distribution per bot (200, 301, 404, 500)
|
||||||
|
- Crawl depth distribution
|
||||||
|
- Crawl pattern analysis (time of day, days of week)
|
||||||
|
- Most crawled URLs per bot
|
||||||
|
|
||||||
|
### Waste Detection
|
||||||
|
- **Parameter URLs**: ?sort=, ?filter=, ?page=, ?utm_* consuming crawl budget
|
||||||
|
- **Redirect chains**: Multiple redirects consuming crawl slots
|
||||||
|
- **Soft 404s**: 200 status pages with error/empty content
|
||||||
|
- **Duplicate URLs**: www/non-www, http/https, trailing slash variants
|
||||||
|
- **Low-value pages**: Thin content pages, noindex pages being crawled
|
||||||
|
|
||||||
|
### Orphan Page Detection
|
||||||
|
- Pages in sitemap but never crawled by bots
|
||||||
|
- Pages crawled but not in sitemap
|
||||||
|
- Crawled pages with no internal links pointing to them
|
||||||
|
|
||||||
## Workflow
|
## Workflow
|
||||||
|
|
||||||
1. Parse server access log with `log_parser.py`
|
### Step 1: Obtain Server Access Logs
|
||||||
2. Run crawl budget analysis with `crawl_budget_analyzer.py`
|
Request or locate server access logs from the target site. Supported formats:
|
||||||
3. Compare with sitemap URLs for orphan page detection
|
- Nginx: `/var/log/nginx/access.log`
|
||||||
4. Optionally compare with Ahrefs page history data
|
- Apache: `/var/log/apache2/access.log`
|
||||||
5. Generate Korean-language report with recommendations
|
- CloudFront: Downloaded from S3 or CloudWatch
|
||||||
6. Save to Notion SEO Audit Log database
|
|
||||||
|
|
||||||
## Tools Used
|
### Step 2: Parse Access Logs
|
||||||
|
```bash
|
||||||
|
python scripts/log_parser.py --log-file access.log --json
|
||||||
|
python scripts/log_parser.py --log-file access.log.gz --streaming --json
|
||||||
|
python scripts/log_parser.py --log-file access.log --bot googlebot --json
|
||||||
|
```
|
||||||
|
|
||||||
- **Ahrefs**: `site-explorer-pages-history` for indexed page comparison
|
### Step 3: Crawl Budget Analysis
|
||||||
- **Notion**: Save audit report to database `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
|
```bash
|
||||||
- **WebSearch**: Current best practices and bot documentation
|
python scripts/crawl_budget_analyzer.py --log-file access.log --sitemap https://example.com/sitemap.xml --json
|
||||||
|
python scripts/crawl_budget_analyzer.py --log-file access.log --scope waste --json
|
||||||
|
python scripts/crawl_budget_analyzer.py --log-file access.log --scope orphans --json
|
||||||
|
python scripts/crawl_budget_analyzer.py --log-file access.log --scope bots --json
|
||||||
|
```
|
||||||
|
|
||||||
## Output
|
### Step 4: Cross-Reference with Ahrefs (Optional)
|
||||||
|
Use `site-explorer-pages-history` to compare indexed pages vs crawled pages.
|
||||||
|
|
||||||
All reports are saved to the OurDigital SEO Audit Log with:
|
### Step 5: Generate Recommendations
|
||||||
- Category: Crawl Budget
|
Prioritized action items:
|
||||||
- Audit ID format: CRAWL-YYYYMMDD-NNN
|
1. robots.txt optimization (block parameter URLs, low-value paths)
|
||||||
- Content in Korean with technical English terms preserved
|
2. URL parameter handling (Google Search Console settings)
|
||||||
|
3. Noindex/nofollow for low-value pages
|
||||||
|
4. Redirect chain resolution (reduce 301 → 301 → 200 to 301 → 200)
|
||||||
|
5. Internal linking improvements for orphan pages
|
||||||
|
|
||||||
|
### Step 6: Report to Notion
|
||||||
|
Save Korean-language report to SEO Audit Log database.

## MCP Tools Used

| Tool | Purpose |
|------|---------|
| Ahrefs `site-explorer-pages-history` | Compare indexed pages with crawled pages |
| Notion | Save audit report to database |
| WebSearch | Current bot documentation and best practices |

## Output Format

```json
{
  "log_file": "access.log",
  "analysis_period": {"from": "2025-01-01", "to": "2025-01-31"},
  "total_bot_requests": 150000,
  "bots": {
    "googlebot": {
      "requests": 80000,
      "unique_urls": 12000,
      "avg_requests_per_day": 2580,
      "status_distribution": {"200": 70000, "301": 5000, "404": 3000, "500": 2000}
    },
    "yeti": {"requests": 35000},
    "bingbot": {"requests": 20000},
    "daumoa": {"requests": 15000}
  },
  "waste": {
    "parameter_urls": {"count": 5000, "pct_of_crawls": 3.3},
    "redirect_chains": {"count": 2000, "pct_of_crawls": 1.3},
    "soft_404s": {"count": 1500, "pct_of_crawls": 1.0},
    "total_waste_pct": 8.5
  },
  "orphan_pages": {
    "in_sitemap_not_crawled": [],
    "crawled_not_in_sitemap": []
  },
  "recommendations": [],
  "efficiency_score": 72,
  "timestamp": "2025-01-01T00:00:00"
}
```

## Limitations

- Requires actual server access logs (not available via standard web crawling)
- Log format auto-detection may need manual format specification for custom formats
- CloudFront logs have a different field structure than Nginx/Apache
- Large log files (>10GB) may need pre-filtering before analysis
- Bot identification relies on User-Agent strings, which can be spoofed

## Notion Output (Required)

All audit reports MUST be saved to the OurDigital SEO Audit Log:

- **Database ID**: `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
- **Category**: Crawl Budget
- **Audit ID Format**: CRAWL-YYYYMMDD-NNN
- **Language**: Korean with technical English terms (Crawl Budget, Googlebot, robots.txt)

## Reference Scripts

Located in `code/scripts/`:

- `log_parser.py` — Server access log parser with bot identification
- `crawl_budget_analyzer.py` — Crawl budget efficiency analysis
- `base_client.py` — Shared async client utilities
custom-skills/33-seo-migration-planner/README.md (new file, +91 lines)
# SEO Migration Planner

SEO site migration planning and monitoring tool: pre-migration risk assessment, redirect mapping, and post-migration traffic/indexing tracking.

## Overview

Pre-migration risk assessment, redirect mapping, URL inventory, crawl baseline capture, and post-migration traffic/indexation monitoring for site migrations. Supports domain moves, platform changes, URL restructuring, HTTPS migrations, and subdomain consolidation.

## Dual-Platform Structure

```
33-seo-migration-planner/
├── code/                              # Claude Code version
│   ├── CLAUDE.md                      # Action-oriented directive
│   ├── commands/
│   │   └── seo-migration-planner.md   # Slash command
│   └── scripts/
│       ├── migration_planner.py       # Pre-migration planning
│       ├── migration_monitor.py       # Post-migration monitoring
│       ├── base_client.py             # Shared async utilities
│       └── requirements.txt
│
├── desktop/                           # Claude Desktop version
│   ├── SKILL.md                       # MCP-based workflow
│   ├── skill.yaml                     # Extended metadata
│   └── tools/
│       ├── ahrefs.md                  # Ahrefs MCP tools
│       ├── firecrawl.md               # Firecrawl MCP tools
│       └── notion.md                  # Notion MCP tools
│
└── README.md
```

## Quick Start

### Claude Code

```bash
/seo-migration-planner https://example.com --type domain-move --new-domain https://new-example.com
```

### Python Script

```bash
pip install -r code/scripts/requirements.txt

# Pre-migration planning
python code/scripts/migration_planner.py --domain https://example.com --type domain-move --new-domain https://new-example.com --json

# Post-migration monitoring
python code/scripts/migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --baseline baseline.json --json
```

## Features

### Pre-Migration Planning

- URL inventory via Firecrawl crawl
- Ahrefs traffic/keyword/backlink baseline
- Per-URL risk scoring (0-100)
- Redirect map generation (301 mappings)
- Type-specific pre-migration checklist (Korean)

### Post-Migration Monitoring

- Pre vs. post traffic comparison
- Redirect health check (broken, chains, loops)
- Indexation change tracking
- Keyword ranking monitoring
- Recovery timeline estimation
- Automated alert generation
## Migration Types

| Type | Description |
|------|-------------|
| `domain-move` | Old domain -> new domain |
| `platform` | CMS/framework migration |
| `url-restructure` | Path/slug changes |
| `https` | HTTP -> HTTPS |
| `subdomain` | Subdomain -> subfolder |
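For a plain `domain-move`, the redirect map is mostly a host swap. A minimal sketch of generating 301 mappings from a URL inventory (the function name and fixed-301 rule are illustrative, not the planner's actual implementation):

```python
from urllib.parse import urlparse, urlunparse

def build_redirect_map(urls: list[str], new_host: str) -> list[dict]:
    """Map each old URL to the same path on the new host as a 301 redirect."""
    mapping = []
    for url in urls:
        p = urlparse(url)
        # Preserve scheme, path, query, and fragment; replace only the host.
        target = urlunparse((p.scheme, new_host, p.path, p.params, p.query, p.fragment))
        mapping.append({"source": url, "target": target, "status_code": 301})
    return mapping

rows = build_redirect_map(
    ["https://example.com/", "https://example.com/pricing?plan=pro"],
    "new-example.com",
)
print(rows[1]["target"])  # https://new-example.com/pricing?plan=pro
```

Other migration types (`url-restructure`, `subdomain`) need an explicit old-to-new path mapping instead of a host swap.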

## Notion Output

Reports are saved to the OurDigital SEO Audit Log database:

- **Title**: `사이트 이전 계획 - [domain] - YYYY-MM-DD`
- **Database ID**: `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
- **Audit ID Format**: MIGR-YYYYMMDD-NNN

## Triggers

- site migration, domain move, redirect mapping
- platform migration, URL restructuring
- HTTPS migration, subdomain consolidation
- 사이트 이전, 도메인 이전, 리디렉트 매핑
custom-skills/33-seo-migration-planner/code/CLAUDE.md (new file, +150 lines)
# CLAUDE.md

## Overview

SEO site migration planning and monitoring tool for comprehensive pre-migration risk assessment, redirect mapping, URL inventory, crawl baseline capture, and post-migration traffic/indexation monitoring. Supports domain moves, platform changes, URL restructuring, HTTPS migrations, and subdomain consolidation. Captures the full URL inventory via Firecrawl crawl, builds traffic/keyword baselines via Ahrefs, generates redirect maps with per-URL risk scoring, and tracks post-launch recovery with automated alerts.

## Quick Start

```bash
pip install -r scripts/requirements.txt

# Pre-migration planning
python scripts/migration_planner.py --domain https://example.com --type domain-move --new-domain https://new-example.com --json

# Post-migration monitoring
python scripts/migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --baseline baseline.json --json
```

## Scripts

| Script | Purpose | Key Output |
|--------|---------|------------|
| `migration_planner.py` | Pre-migration baseline + redirect map + risk assessment | URL inventory, redirect map, risk scores, checklist |
| `migration_monitor.py` | Post-migration traffic comparison, redirect health, indexation tracking | Traffic delta, broken redirects, ranking changes, alerts |
| `base_client.py` | Shared utilities | RateLimiter, ConfigManager, BaseAsyncClient |
## Migration Planner

```bash
# Domain move planning
python scripts/migration_planner.py --domain https://example.com --type domain-move --new-domain https://new-example.com --json

# Platform migration (e.g., WordPress to headless)
python scripts/migration_planner.py --domain https://example.com --type platform --json

# URL restructuring
python scripts/migration_planner.py --domain https://example.com --type url-restructure --json

# HTTPS migration
python scripts/migration_planner.py --domain http://example.com --type https --json

# Subdomain consolidation
python scripts/migration_planner.py --domain https://blog.example.com --type subdomain --new-domain https://example.com/blog --json
```

**Capabilities**:

- URL inventory via Firecrawl crawl (capture all URLs + status codes)
- Ahrefs top-pages baseline (traffic, keywords per page)
- Redirect map generation (old URL -> new URL mapping)
- Risk scoring per URL (based on traffic + backlinks + keyword rankings)
- Pre-migration checklist generation
- Support for migration types:
  - Domain move (old domain -> new domain)
  - Platform change (CMS/framework swap)
  - URL restructuring (path/slug changes)
  - HTTPS migration (HTTP -> HTTPS)
  - Subdomain consolidation (subdomain -> subfolder)
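Per-URL risk scoring can combine the Ahrefs baseline signals into a single 0-100 value. A hedged sketch of one such weighting (the weights and soft caps are illustrative assumptions, not the planner's actual formula):

```python
def url_risk_score(traffic: int, referring_domains: int, top3_keywords: int) -> int:
    """Score migration risk 0-100 from traffic, backlinks, and ranking exposure."""
    # Normalize each signal to 0-100 with a soft cap, then weight.
    t = min(traffic / 1000, 1.0) * 100          # 1k+ monthly visits = max traffic risk
    b = min(referring_domains / 50, 1.0) * 100  # 50+ referring domains = max link risk
    k = min(top3_keywords / 10, 1.0) * 100      # 10+ top-3 keywords = max ranking risk
    return round(0.5 * t + 0.3 * b + 0.2 * k)

print(url_risk_score(traffic=2500, referring_domains=12, top3_keywords=4))  # 65
```

Scores can then be bucketed (e.g., >=70 high, 40-69 medium) to populate the `risk_assessment` counts shown in the output format below.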

## Migration Monitor

```bash
# Post-launch traffic comparison against the saved baseline
python scripts/migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --baseline baseline.json --json

# Quick redirect health check (no baseline file)
python scripts/migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --json
```

**Capabilities**:

- Post-launch traffic comparison (pre vs. post, by page group)
- Redirect chain/loop detection
- 404 monitoring for high-value pages
- Indexation tracking (indexed pages before vs. after)
- Ranking change tracking for priority keywords
- Recovery timeline estimation
- Alert generation for traffic drops >20%
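The alert thresholds mirror the constants in `migration_monitor.py` (20% drop = warning, 40% drop = critical). A minimal sketch of the classification logic:

```python
def classify_traffic_alert(pre_traffic: float, post_traffic: float) -> str:
    """Classify a traffic change as ok / warning / critical using drop thresholds."""
    if pre_traffic <= 0:
        return "ok"  # no baseline to compare against
    drop = (pre_traffic - post_traffic) / pre_traffic
    if drop >= 0.40:
        return "critical"
    if drop >= 0.20:
        return "warning"
    return "ok"

print(classify_traffic_alert(10_000, 7_500))  # warning (25% drop)
print(classify_traffic_alert(10_000, 5_000))  # critical (50% drop)
```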

## Ahrefs MCP Tools Used

| Tool | Purpose |
|------|---------|
| `site-explorer-metrics` | Current organic metrics (traffic, keywords) |
| `site-explorer-metrics-history` | Historical metrics for pre/post comparison |
| `site-explorer-top-pages` | Top performing pages for baseline |
| `site-explorer-pages-by-traffic` | Pages ranked by traffic for risk scoring |
| `site-explorer-organic-keywords` | Keyword rankings per page |
| `site-explorer-referring-domains` | Referring domains per page for risk scoring |
| `site-explorer-backlinks-stats` | Backlink overview for migration impact |
## Output Format

```json
{
  "domain": "example.com",
  "migration_type": "domain-move",
  "baseline": {
    "total_urls": 1250,
    "total_traffic": 45000,
    "total_keywords": 8500,
    "top_pages": []
  },
  "redirect_map": [
    {
      "source": "https://example.com/page-1",
      "target": "https://new-example.com/page-1",
      "status_code": 301,
      "priority": "critical"
    }
  ],
  "risk_assessment": {
    "high_risk_urls": 45,
    "medium_risk_urls": 180,
    "low_risk_urls": 1025,
    "overall_risk": "medium"
  },
  "pre_migration_checklist": [],
  "timestamp": "2025-01-01T00:00:00"
}
```

## Notion Output (Required)

**IMPORTANT**: All audit reports MUST be saved to the OurDigital SEO Audit Log database.

### Database Configuration

| Field | Value |
|-------|-------|
| Database ID | `2c8581e5-8a1e-8035-880b-e38cefc2f3ef` |
| URL | https://www.notion.so/dintelligence/2c8581e58a1e8035880be38cefc2f3ef |

### Required Properties

| Property | Type | Description |
|----------|------|-------------|
| Issue | Title | Report title (Korean + date) |
| Site | URL | Target website URL |
| Category | Select | SEO Migration |
| Priority | Select | Based on risk level |
| Found Date | Date | Report date (YYYY-MM-DD) |
| Audit ID | Rich Text | Format: MIGR-YYYYMMDD-NNN |

### Language Guidelines

- Report content in Korean (한국어)
- Keep technical English terms as-is (e.g., Redirect Map, Risk Score, Traffic Baseline, Indexation)
- URLs and code remain unchanged
custom-skills/33-seo-migration-planner/code/commands/seo-migration-planner.md (new file, +27 lines)
---
name: seo-migration-planner
description: |
  SEO site migration planning and monitoring. Pre-migration risk assessment, redirect mapping,
  crawl baseline, and post-migration traffic/indexation monitoring.
  Triggers: site migration, domain move, redirect mapping, platform migration, URL restructuring, 사이트 이전.
allowed-tools:
  - Bash
  - Read
  - Write
  - WebFetch
  - WebSearch
---

# SEO Migration Planner

Run the migration planning or monitoring workflow based on the user's request.

## Pre-Migration Planning

```bash
python custom-skills/33-seo-migration-planner/code/scripts/migration_planner.py --domain [URL] --type [TYPE] --json
```

## Post-Migration Monitoring

```bash
python custom-skills/33-seo-migration-planner/code/scripts/migration_monitor.py --domain [URL] --migration-date [DATE] --json
```
custom-skills/33-seo-migration-planner/code/scripts/base_client.py (new file, +172 lines)
"""
|
||||||
|
Base Client - Shared async client utilities
|
||||||
|
===========================================
|
||||||
|
Purpose: Rate-limited async operations for API clients
|
||||||
|
Python: 3.10+
|
||||||
|
"""
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
from asyncio import Semaphore
|
||||||
|
from datetime import datetime
|
||||||
|
from typing import Any, Callable, TypeVar
|
||||||
|
|
||||||
|
from dotenv import load_dotenv
|
||||||
|
from tenacity import (
|
||||||
|
retry,
|
||||||
|
stop_after_attempt,
|
||||||
|
wait_exponential,
|
||||||
|
retry_if_exception_type,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Load environment variables
|
||||||
|
load_dotenv()
|
||||||
|
|
||||||
|
# Logging setup
|
||||||
|
logging.basicConfig(
|
||||||
|
level=logging.INFO,
|
||||||
|
format="%(asctime)s - %(levelname)s - %(message)s",
|
||||||
|
)
|
||||||
|
|
||||||
|
T = TypeVar("T")
|
||||||
|
|
||||||
|
|
||||||
|
class RateLimiter:
|
||||||
|
"""Rate limiter using token bucket algorithm."""
|
||||||
|
|
||||||
|
def __init__(self, rate: float, per: float = 1.0):
|
||||||
|
self.rate = rate
|
||||||
|
self.per = per
|
||||||
|
self.tokens = rate
|
||||||
|
self.last_update = datetime.now()
|
||||||
|
self._lock = asyncio.Lock()
|
||||||
|
|
||||||
|
async def acquire(self) -> None:
|
||||||
|
async with self._lock:
|
||||||
|
now = datetime.now()
|
||||||
|
elapsed = (now - self.last_update).total_seconds()
|
||||||
|
self.tokens = min(self.rate, self.tokens + elapsed * (self.rate / self.per))
|
||||||
|
self.last_update = now
|
||||||
|
|
||||||
|
if self.tokens < 1:
|
||||||
|
wait_time = (1 - self.tokens) * (self.per / self.rate)
|
||||||
|
await asyncio.sleep(wait_time)
|
||||||
|
self.tokens = 0
|
||||||
|
else:
|
||||||
|
self.tokens -= 1
|
||||||
|
|
||||||
|
|
||||||
|
class BaseAsyncClient:
|
||||||
|
"""Base class for async API clients with rate limiting."""
|
||||||
|
|
||||||
|
def __init__(
|
||||||
|
self,
|
||||||
|
max_concurrent: int = 5,
|
||||||
|
requests_per_second: float = 3.0,
|
||||||
|
logger: logging.Logger | None = None,
|
||||||
|
):
|
||||||
|
self.semaphore = Semaphore(max_concurrent)
|
||||||
|
self.rate_limiter = RateLimiter(requests_per_second)
|
||||||
|
self.logger = logger or logging.getLogger(self.__class__.__name__)
|
||||||
|
self.stats = {
|
||||||
|
"requests": 0,
|
||||||
|
"success": 0,
|
||||||
|
"errors": 0,
|
||||||
|
"retries": 0,
|
||||||
|
}
|
||||||
|
|
||||||
|
@retry(
|
||||||
|
stop=stop_after_attempt(3),
|
||||||
|
wait=wait_exponential(multiplier=1, min=2, max=10),
|
||||||
|
retry=retry_if_exception_type(Exception),
|
||||||
|
)
|
||||||
|
async def _rate_limited_request(
|
||||||
|
self,
|
||||||
|
coro: Callable[[], Any],
|
||||||
|
) -> Any:
|
||||||
|
async with self.semaphore:
|
||||||
|
await self.rate_limiter.acquire()
|
||||||
|
self.stats["requests"] += 1
|
||||||
|
try:
|
||||||
|
result = await coro()
|
||||||
|
self.stats["success"] += 1
|
||||||
|
return result
|
||||||
|
except Exception as e:
|
||||||
|
self.stats["errors"] += 1
|
||||||
|
self.logger.error(f"Request failed: {e}")
|
||||||
|
raise
|
||||||
|
|
||||||
|
async def batch_requests(
|
||||||
|
self,
|
||||||
|
requests: list[Callable[[], Any]],
|
||||||
|
desc: str = "Processing",
|
||||||
|
) -> list[Any]:
|
||||||
|
try:
|
||||||
|
from tqdm.asyncio import tqdm
|
||||||
|
has_tqdm = True
|
||||||
|
except ImportError:
|
||||||
|
has_tqdm = False
|
||||||
|
|
||||||
|
async def execute(req: Callable) -> Any:
|
||||||
|
try:
|
||||||
|
return await self._rate_limited_request(req)
|
||||||
|
except Exception as e:
|
||||||
|
return {"error": str(e)}
|
||||||
|
|
||||||
|
tasks = [execute(req) for req in requests]
|
||||||
|
|
||||||
|
if has_tqdm:
|
||||||
|
results = []
|
||||||
|
for coro in tqdm.as_completed(tasks, total=len(tasks), desc=desc):
|
||||||
|
result = await coro
|
||||||
|
results.append(result)
|
||||||
|
return results
|
||||||
|
else:
|
||||||
|
return await asyncio.gather(*tasks, return_exceptions=True)
|
||||||
|
|
||||||
|
def print_stats(self) -> None:
|
||||||
|
self.logger.info("=" * 40)
|
||||||
|
self.logger.info("Request Statistics:")
|
||||||
|
self.logger.info(f" Total Requests: {self.stats['requests']}")
|
||||||
|
self.logger.info(f" Successful: {self.stats['success']}")
|
||||||
|
self.logger.info(f" Errors: {self.stats['errors']}")
|
||||||
|
self.logger.info("=" * 40)
|
||||||
|
|
||||||
|
|
||||||
|
class ConfigManager:
|
||||||
|
"""Manage API configuration and credentials."""
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
load_dotenv()
|
||||||
|
|
||||||
|
@property
|
||||||
|
def google_credentials_path(self) -> str | None:
|
||||||
|
seo_creds = os.path.expanduser("~/.credential/ourdigital-seo-agent.json")
|
||||||
|
if os.path.exists(seo_creds):
|
||||||
|
return seo_creds
|
||||||
|
return os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
|
||||||
|
|
||||||
|
@property
|
||||||
|
def pagespeed_api_key(self) -> str | None:
|
||||||
|
return os.getenv("PAGESPEED_API_KEY")
|
||||||
|
|
||||||
|
@property
|
||||||
|
def notion_token(self) -> str | None:
|
||||||
|
return os.getenv("NOTION_TOKEN") or os.getenv("NOTION_API_KEY")
|
||||||
|
|
||||||
|
def validate_google_credentials(self) -> bool:
|
||||||
|
creds_path = self.google_credentials_path
|
||||||
|
if not creds_path:
|
||||||
|
return False
|
||||||
|
return os.path.exists(creds_path)
|
||||||
|
|
||||||
|
def get_required(self, key: str) -> str:
|
||||||
|
value = os.getenv(key)
|
||||||
|
if not value:
|
||||||
|
raise ValueError(f"Missing required environment variable: {key}")
|
||||||
|
return value
|
||||||
|
|
||||||
|
|
||||||
|
# Singleton config instance
|
||||||
|
config = ConfigManager()
|
custom-skills/33-seo-migration-planner/code/scripts/migration_monitor.py (new file, +909 lines)
"""
|
||||||
|
Migration Monitor - Post-Migration Traffic & Indexation Monitoring
|
||||||
|
==================================================================
|
||||||
|
Purpose: Post-migration traffic comparison, redirect health checks,
|
||||||
|
indexation tracking, ranking change monitoring, and alert generation.
|
||||||
|
Python: 3.10+
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --baseline baseline.json --json
|
||||||
|
python migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --json
|
||||||
|
"""
|
||||||
|
|
||||||
|
import argparse
|
||||||
|
import asyncio
|
||||||
|
import json
|
||||||
|
import logging
|
||||||
|
import sys
|
||||||
|
from dataclasses import dataclass, field, asdict
|
||||||
|
from datetime import datetime, timedelta
|
||||||
|
from typing import Any
|
||||||
|
from urllib.parse import urlparse
|
||||||
|
|
||||||
|
from base_client import BaseAsyncClient, config
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Data classes
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class TrafficComparison:
|
||||||
|
"""Traffic comparison between pre- and post-migration periods."""
|
||||||
|
page_group: str = ""
|
||||||
|
pre_traffic: int = 0
|
||||||
|
post_traffic: int = 0
|
||||||
|
change_pct: float = 0.0
|
||||||
|
change_absolute: int = 0
|
||||||
|
status: str = "stable" # improved / stable / declined / critical
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class RedirectHealth:
|
||||||
|
"""Health status of a single redirect."""
|
||||||
|
source: str = ""
|
||||||
|
target: str = ""
|
||||||
|
status_code: int = 0
|
||||||
|
chain_length: int = 0
|
||||||
|
is_broken: bool = False
|
||||||
|
final_url: str = ""
|
||||||
|
error: str = ""
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class IndexationStatus:
|
||||||
|
"""Indexation comparison before and after migration."""
|
||||||
|
pre_count: int = 0
|
||||||
|
post_count: int = 0
|
||||||
|
change_pct: float = 0.0
|
||||||
|
missing_pages: list[str] = field(default_factory=list)
|
||||||
|
new_pages: list[str] = field(default_factory=list)
|
||||||
|
deindexed_count: int = 0
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class RankingChange:
|
||||||
|
"""Ranking change for a keyword."""
|
||||||
|
keyword: str = ""
|
||||||
|
pre_position: int = 0
|
||||||
|
post_position: int = 0
|
||||||
|
change: int = 0
|
||||||
|
url: str = ""
|
||||||
|
search_volume: int = 0
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class MigrationAlert:
|
||||||
|
"""Alert for significant post-migration issues."""
|
||||||
|
alert_type: str = "" # traffic_drop, redirect_broken, indexation_drop, ranking_loss
|
||||||
|
severity: str = "info" # info / warning / critical
|
||||||
|
message: str = ""
|
||||||
|
metric_value: float = 0.0
|
||||||
|
threshold: float = 0.0
|
||||||
|
affected_urls: list[str] = field(default_factory=list)
|
||||||
|
|
||||||
|
|
||||||
|
@dataclass
|
||||||
|
class MigrationReport:
|
||||||
|
"""Complete post-migration monitoring report."""
|
||||||
|
domain: str = ""
|
||||||
|
migration_date: str = ""
|
||||||
|
days_since_migration: int = 0
|
||||||
|
traffic_comparison: list[TrafficComparison] = field(default_factory=list)
|
||||||
|
redirect_health: list[RedirectHealth] = field(default_factory=list)
|
||||||
|
indexation: IndexationStatus | None = None
|
||||||
|
ranking_changes: list[RankingChange] = field(default_factory=list)
|
||||||
|
recovery_estimate: dict[str, Any] = field(default_factory=dict)
|
||||||
|
alerts: list[MigrationAlert] = field(default_factory=list)
|
||||||
|
timestamp: str = ""
|
||||||
|
errors: list[str] = field(default_factory=list)
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Monitor
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
class MigrationMonitor(BaseAsyncClient):
|
||||||
|
"""Monitors post-migration SEO health using Ahrefs and Firecrawl MCP tools."""
|
||||||
|
|
||||||
|
# Alert thresholds
|
||||||
|
TRAFFIC_DROP_WARNING = 0.20 # 20% drop
|
||||||
|
TRAFFIC_DROP_CRITICAL = 0.40 # 40% drop
|
||||||
|
RANKING_DROP_THRESHOLD = 5 # 5+ position drop
|
||||||
|
INDEXATION_DROP_WARNING = 0.10 # 10% indexation loss
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
super().__init__(max_concurrent=5, requests_per_second=2.0)
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _extract_domain(url: str) -> str:
|
||||||
|
"""Extract bare domain from URL or return as-is if already bare."""
|
||||||
|
if "://" in url:
|
||||||
|
parsed = urlparse(url)
|
||||||
|
return parsed.netloc.lower().replace("www.", "")
|
||||||
|
return url.lower().replace("www.", "")
|
||||||
|
|
||||||
|
async def _call_ahrefs(self, tool: str, params: dict[str, Any]) -> dict:
|
||||||
|
"""Simulate Ahrefs MCP call. In production, routed via MCP bridge."""
|
||||||
|
self.logger.info(f"Ahrefs MCP call: {tool} | params={params}")
|
||||||
|
return {"tool": tool, "params": params, "data": {}}
|
||||||
|
|
||||||
|
async def _call_firecrawl(self, tool: str, params: dict[str, Any]) -> dict:
|
||||||
|
"""Simulate Firecrawl MCP call. In production, routed via MCP bridge."""
|
||||||
|
self.logger.info(f"Firecrawl MCP call: {tool} | params={params}")
|
||||||
|
return {"tool": tool, "params": params, "data": {}}
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Traffic Comparison
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
async def compare_traffic(
|
||||||
|
self, domain: str, migration_date: str
|
||||||
|
) -> list[TrafficComparison]:
|
||||||
|
"""Compare traffic before and after migration date."""
|
||||||
|
domain = self._extract_domain(domain)
|
||||||
|
mig_date = datetime.strptime(migration_date, "%Y-%m-%d")
|
||||||
|
days_since = (datetime.now() - mig_date).days
|
||||||
|
|
||||||
|
# Pre-migration period: same duration before migration
|
||||||
|
pre_start = (mig_date - timedelta(days=max(days_since, 30))).strftime("%Y-%m-%d")
|
||||||
|
pre_end = (mig_date - timedelta(days=1)).strftime("%Y-%m-%d")
|
||||||
|
post_start = migration_date
|
||||||
|
post_end = datetime.now().strftime("%Y-%m-%d")
|
||||||
|
|
||||||
|
self.logger.info(
|
||||||
|
f"Comparing traffic for {domain}: "
|
||||||
|
f"pre={pre_start}..{pre_end} vs post={post_start}..{post_end}"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Fetch pre-migration metrics history
|
||||||
|
pre_resp = await self._call_ahrefs(
|
||||||
|
"site-explorer-metrics-history",
|
||||||
|
{"target": domain, "date_from": pre_start, "date_to": pre_end},
|
||||||
|
)
|
||||||
|
pre_data = pre_resp.get("data", {}).get("data_points", [])
|
||||||
|
|
||||||
|
# Fetch post-migration metrics history
|
||||||
|
post_resp = await self._call_ahrefs(
|
||||||
|
"site-explorer-metrics-history",
|
||||||
|
{"target": domain, "date_from": post_start, "date_to": post_end},
|
||||||
|
)
|
||||||
|
post_data = post_resp.get("data", {}).get("data_points", [])
|
||||||
|
|
||||||
|
# Calculate averages
|
||||||
|
pre_avg_traffic = 0
|
||||||
|
if pre_data:
|
||||||
|
pre_avg_traffic = int(
|
||||||
|
sum(int(p.get("organic_traffic", 0)) for p in pre_data) / len(pre_data)
|
||||||
|
)
|
||||||
|
|
||||||
|
post_avg_traffic = 0
|
||||||
|
if post_data:
|
||||||
|
post_avg_traffic = int(
|
||||||
|
sum(int(p.get("organic_traffic", 0)) for p in post_data) / len(post_data)
|
||||||
|
)
|
||||||
|
|
||||||
|
# Overall comparison
|
||||||
|
change_pct = 0.0
|
||||||
|
if pre_avg_traffic > 0:
|
||||||
|
change_pct = ((post_avg_traffic - pre_avg_traffic) / pre_avg_traffic) * 100
|
||||||
|
|
||||||
|
status = "stable"
|
||||||
|
if change_pct > 5:
|
||||||
|
status = "improved"
|
||||||
|
elif change_pct < -40:
|
||||||
|
status = "critical"
|
||||||
|
elif change_pct < -20:
|
||||||
|
status = "declined"
|
||||||
|
|
||||||
|
comparisons = [
|
||||||
|
TrafficComparison(
|
||||||
|
page_group="Overall",
|
||||||
|
pre_traffic=pre_avg_traffic,
|
||||||
|
post_traffic=post_avg_traffic,
|
||||||
|
change_pct=round(change_pct, 2),
|
||||||
|
change_absolute=post_avg_traffic - pre_avg_traffic,
|
||||||
|
status=status,
|
||||||
|
)
|
||||||
|
]
|
||||||
|
|
||||||
|
# Fetch top pages comparison
|
||||||
|
pre_pages_resp = await self._call_ahrefs(
|
||||||
|
"site-explorer-pages-by-traffic",
|
||||||
|
{"target": domain, "limit": 50},
|
||||||
|
)
|
||||||
|
top_pages = pre_pages_resp.get("data", {}).get("pages", [])
|
||||||
|
|
||||||
|
for page in top_pages[:20]:
|
||||||
|
page_url = page.get("url", "")
|
||||||
|
page_traffic = int(page.get("traffic", 0))
|
||||||
|
# In production, would compare with baseline data
|
||||||
|
comparisons.append(
|
||||||
|
TrafficComparison(
|
||||||
|
page_group=page_url,
|
||||||
|
pre_traffic=0, # Would be populated from baseline
|
||||||
|
post_traffic=page_traffic,
|
||||||
|
change_pct=0.0,
|
||||||
|
change_absolute=0,
|
||||||
|
status="stable",
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
self.logger.info(
|
||||||
|
f"Traffic comparison for {domain}: "
|
||||||
|
f"pre={pre_avg_traffic:,} -> post={post_avg_traffic:,} "
|
||||||
|
f"({change_pct:+.1f}%)"
|
||||||
|
)
|
||||||
|
return comparisons
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Redirect Health Check
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
async def check_redirects(
|
||||||
|
self, redirect_map: list[dict[str, str]]
|
||||||
|
) -> list[RedirectHealth]:
|
||||||
|
"""Verify redirect health: check for broken redirects, chains, and loops."""
|
||||||
|
health_results: list[RedirectHealth] = []
|
||||||
|
|
||||||
|
self.logger.info(f"Checking {len(redirect_map)} redirects for health...")
|
||||||
|
|
||||||
|
for entry in redirect_map:
|
||||||
|
source = entry.get("source", "")
|
||||||
|
expected_target = entry.get("target", "")
|
||||||
|
|
||||||
|
if not source:
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Use Firecrawl to check the redirect
|
||||||
|
resp = await self._call_firecrawl(
|
||||||
|
"firecrawl_scrape",
|
||||||
|
{"url": source, "formats": ["links"]},
|
||||||
|
)
|
||||||
|
|
||||||
|
result_data = resp.get("data", {})
|
||||||
|
final_url = result_data.get("final_url", "")
|
||||||
|
status_code = int(result_data.get("status_code", 0))
|
||||||
|
redirect_chain = result_data.get("redirect_chain", [])
|
||||||
|
chain_length = len(redirect_chain)
|
||||||
|
|
||||||
|
is_broken = (
|
||||||
|
status_code >= 400
|
||||||
|
or status_code == 0
|
||||||
|
or (final_url and final_url != expected_target and status_code != 301)
|
||||||
|
)
|
||||||
|
|
||||||
|
health = RedirectHealth(
|
||||||
|
source=source,
|
||||||
|
target=expected_target,
|
||||||
|
status_code=status_code,
|
||||||
|
chain_length=chain_length,
|
||||||
|
is_broken=is_broken,
|
||||||
|
final_url=final_url,
|
||||||
|
error="" if not is_broken else f"Expected {expected_target}, got {final_url} ({status_code})",
|
||||||
|
)
|
||||||
|
health_results.append(health)
|
||||||
|
|
||||||
|
broken_count = sum(1 for h in health_results if h.is_broken)
|
||||||
|
chain_count = sum(1 for h in health_results if h.chain_length > 1)
|
||||||
|
|
||||||
|
self.logger.info(
|
||||||
|
f"Redirect health check complete: "
|
||||||
|
f"{broken_count} broken, {chain_count} chains detected "
|
||||||
|
f"out of {len(health_results)} redirects"
|
||||||
|
)
|
||||||
|
return health_results
|
||||||
|
|
||||||
|
    # ------------------------------------------------------------------
    # Indexation Tracking
    # ------------------------------------------------------------------

    async def track_indexation(
        self, domain: str, pre_baseline: dict[str, Any] | None = None
    ) -> IndexationStatus:
        """Compare indexed pages before and after migration."""
        domain = self._extract_domain(domain)

        self.logger.info(f"Tracking indexation for {domain}")

        # Fetch current metrics
        metrics_resp = await self._call_ahrefs(
            "site-explorer-metrics", {"target": domain}
        )
        current_pages = int(metrics_resp.get("data", {}).get("pages", 0))

        # Get pre-migration count from baseline
        pre_count = 0
        if pre_baseline:
            pre_count = int(pre_baseline.get("total_urls", 0))

        change_pct = 0.0
        if pre_count > 0:
            change_pct = ((current_pages - pre_count) / pre_count) * 100

        # Fetch current top pages to detect missing ones
        pages_resp = await self._call_ahrefs(
            "site-explorer-top-pages", {"target": domain, "limit": 500}
        )
        current_page_urls = set()
        for page in pages_resp.get("data", {}).get("pages", []):
            url = page.get("url", "")
            if url:
                current_page_urls.add(url)

        # Compare with baseline URL inventory
        missing_pages: list[str] = []
        if pre_baseline:
            baseline_urls = pre_baseline.get("url_inventory", [])
            for url_entry in baseline_urls:
                url = url_entry if isinstance(url_entry, str) else url_entry.get("url", "")
                if url and url not in current_page_urls:
                    missing_pages.append(url)

        status = IndexationStatus(
            pre_count=pre_count,
            post_count=current_pages,
            change_pct=round(change_pct, 2),
            missing_pages=missing_pages[:100],  # Cap at 100 for readability
            deindexed_count=len(missing_pages),
        )

        self.logger.info(
            f"Indexation for {domain}: "
            f"pre={pre_count:,} -> post={current_pages:,} "
            f"({change_pct:+.1f}%), {len(missing_pages)} missing"
        )
        return status

    # ------------------------------------------------------------------
    # Ranking Tracking
    # ------------------------------------------------------------------

    async def track_rankings(
        self, domain: str, priority_keywords: list[str] | None = None
    ) -> list[RankingChange]:
        """Track ranking changes for priority keywords."""
        domain = self._extract_domain(domain)

        self.logger.info(f"Tracking rankings for {domain}")

        # Fetch current keyword rankings
        kw_resp = await self._call_ahrefs(
            "site-explorer-organic-keywords",
            {"target": domain, "limit": 200},
        )
        current_keywords = kw_resp.get("data", {}).get("keywords", [])

        ranking_changes: list[RankingChange] = []
        for kw_data in current_keywords:
            keyword = kw_data.get("keyword", "")

            # If priority keywords specified, filter
            if priority_keywords and keyword.lower() not in [k.lower() for k in priority_keywords]:
                continue

            current_pos = int(kw_data.get("position", 0))
            previous_pos = int(kw_data.get("previous_position", current_pos))
            volume = int(kw_data.get("search_volume", 0))
            url = kw_data.get("url", "")

            change = previous_pos - current_pos  # Positive = improved

            ranking_changes.append(
                RankingChange(
                    keyword=keyword,
                    pre_position=previous_pos,
                    post_position=current_pos,
                    change=change,
                    url=url,
                    search_volume=volume,
                )
            )

        # Sort by signed change, ascending (biggest drops first)
        ranking_changes.sort(key=lambda r: r.change)

        self.logger.info(
            f"Tracked {len(ranking_changes)} keyword rankings for {domain}"
        )
        return ranking_changes

    # ------------------------------------------------------------------
    # Recovery Estimation
    # ------------------------------------------------------------------

    def estimate_recovery(
        self, traffic_data: list[TrafficComparison], migration_type: str = "domain-move"
    ) -> dict[str, Any]:
        """Estimate recovery timeline based on traffic comparison data."""
        overall = next(
            (t for t in traffic_data if t.page_group == "Overall"), None
        )

        if not overall:
            return {
                "estimated_weeks": "unknown",
                "confidence": "low",
                "message": "트래픽 데이터 부족으로 회복 기간 추정 불가",
            }

        change_pct = overall.change_pct

        # Base recovery timelines by migration type (weeks)
        base_timelines = {
            "domain-move": 16,      # 4 months
            "platform": 8,          # 2 months
            "url-restructure": 12,  # 3 months
            "https": 4,             # 1 month
            "subdomain": 10,        # 2.5 months
        }
        base_weeks = base_timelines.get(migration_type, 12)

        if change_pct >= 0:
            # No traffic drop — recovery already achieved or in progress
            return {
                "estimated_weeks": 0,
                "confidence": "high",
                "current_recovery_pct": 100.0,
                "message": "트래픽 손실 없음 — 이전 성공적으로 진행 중",
            }
        elif change_pct > -20:
            # Minor drop — quick recovery expected
            estimated_weeks = max(int(base_weeks * 0.5), 2)
            confidence = "high"
            recovery_pct = round(100 + change_pct, 1)
        elif change_pct > -40:
            # Moderate drop — standard recovery timeline
            estimated_weeks = base_weeks
            confidence = "medium"
            recovery_pct = round(100 + change_pct, 1)
        else:
            # Severe drop — extended recovery
            estimated_weeks = int(base_weeks * 1.5)
            confidence = "low"
            recovery_pct = round(100 + change_pct, 1)

        return {
            "estimated_weeks": estimated_weeks,
            "confidence": confidence,
            "current_recovery_pct": recovery_pct,
            "traffic_change_pct": change_pct,
            "migration_type": migration_type,
            "message": (
                f"현재 트래픽 {change_pct:+.1f}% 변동. "
                f"예상 회복 기간: {estimated_weeks}주 (신뢰도: {confidence}). "
                f"현재 회복률: {recovery_pct:.1f}%"
            ),
        }

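    # Worked example for estimate_recovery (a sketch, not captured output):
    # a "domain-move" migration whose "Overall" traffic change is -25.0% takes
    # the moderate-drop branch, so the returned dict carries
    # estimated_weeks=16 (the domain-move base), confidence="medium", and
    # current_recovery_pct=round(100 + (-25.0), 1) == 75.0.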
    # ------------------------------------------------------------------
    # Alert Generation
    # ------------------------------------------------------------------

    def generate_alerts(self, report: MigrationReport) -> list[MigrationAlert]:
        """Generate alerts for significant post-migration issues."""
        alerts: list[MigrationAlert] = []

        # Traffic drop alerts
        for tc in report.traffic_comparison:
            if tc.page_group == "Overall":
                abs_change = abs(tc.change_pct) / 100.0
                if tc.change_pct < 0 and abs_change >= self.TRAFFIC_DROP_CRITICAL:
                    alerts.append(MigrationAlert(
                        alert_type="traffic_drop",
                        severity="critical",
                        message=(
                            f"심각한 트래픽 하락: {tc.change_pct:+.1f}% "
                            f"(이전 전 {tc.pre_traffic:,} -> 이전 후 {tc.post_traffic:,})"
                        ),
                        metric_value=tc.change_pct,
                        threshold=-self.TRAFFIC_DROP_CRITICAL * 100,
                    ))
                elif tc.change_pct < 0 and abs_change >= self.TRAFFIC_DROP_WARNING:
                    alerts.append(MigrationAlert(
                        alert_type="traffic_drop",
                        severity="warning",
                        message=(
                            f"트래픽 하락 감지: {tc.change_pct:+.1f}% "
                            f"(이전 전 {tc.pre_traffic:,} -> 이전 후 {tc.post_traffic:,})"
                        ),
                        metric_value=tc.change_pct,
                        threshold=-self.TRAFFIC_DROP_WARNING * 100,
                    ))

        # Broken redirect alerts
        broken_redirects = [r for r in report.redirect_health if r.is_broken]
        if broken_redirects:
            severity = "critical" if len(broken_redirects) > 10 else "warning"
            alerts.append(MigrationAlert(
                alert_type="redirect_broken",
                severity=severity,
                message=(
                    f"깨진 리디렉트 {len(broken_redirects)}건 감지. "
                    f"고가치 페이지의 링크 에퀴티 손실 위험."
                ),
                metric_value=float(len(broken_redirects)),
                threshold=1.0,
                affected_urls=[r.source for r in broken_redirects[:20]],
            ))

        # Redirect chain alerts
        chain_redirects = [r for r in report.redirect_health if r.chain_length > 1]
        if chain_redirects:
            alerts.append(MigrationAlert(
                alert_type="redirect_chain",
                severity="warning",
                message=(
                    f"리디렉트 체인 {len(chain_redirects)}건 감지. "
                    f"크롤 효율성 및 링크 에퀴티에 영향."
                ),
                metric_value=float(len(chain_redirects)),
                threshold=1.0,
                affected_urls=[r.source for r in chain_redirects[:20]],
            ))

        # Indexation drop alerts
        if report.indexation:
            idx = report.indexation
            if idx.pre_count > 0:
                idx_drop = abs(idx.change_pct) / 100.0
                if idx.change_pct < 0 and idx_drop >= self.INDEXATION_DROP_WARNING:
                    alerts.append(MigrationAlert(
                        alert_type="indexation_drop",
                        severity="warning" if idx_drop < 0.30 else "critical",
                        message=(
                            f"인덱싱 감소: {idx.change_pct:+.1f}% "
                            f"(이전 전 {idx.pre_count:,} -> 이전 후 {idx.post_count:,}페이지). "
                            f"디인덱싱된 페이지: {idx.deindexed_count}건"
                        ),
                        metric_value=idx.change_pct,
                        threshold=-self.INDEXATION_DROP_WARNING * 100,
                        affected_urls=idx.missing_pages[:20],
                    ))

        # Ranking loss alerts
        significant_drops = [
            r for r in report.ranking_changes
            if r.change < -self.RANKING_DROP_THRESHOLD and r.search_volume > 100
        ]
        if significant_drops:
            alerts.append(MigrationAlert(
                alert_type="ranking_loss",
                severity="warning" if len(significant_drops) < 20 else "critical",
                message=(
                    f"주요 키워드 {len(significant_drops)}개의 순위 하락 감지 "
                    f"(5포지션 이상 하락, 검색량 100+)"
                ),
                metric_value=float(len(significant_drops)),
                threshold=float(self.RANKING_DROP_THRESHOLD),
                affected_urls=[r.url for r in significant_drops[:20]],
            ))

        # Sort alerts by severity
        severity_order = {"critical": 0, "warning": 1, "info": 2}
        alerts.sort(key=lambda a: severity_order.get(a.severity, 3))

        self.logger.info(f"Generated {len(alerts)} migration alerts")
        return alerts

    # ------------------------------------------------------------------
    # Orchestrator
    # ------------------------------------------------------------------

    async def run(
        self,
        domain: str,
        migration_date: str,
        baseline_file: str | None = None,
        migration_type: str = "domain-move",
    ) -> MigrationReport:
        """Orchestrate full post-migration monitoring pipeline."""
        timestamp = datetime.now().isoformat()
        mig_date = datetime.strptime(migration_date, "%Y-%m-%d")
        days_since = (datetime.now() - mig_date).days

        report = MigrationReport(
            domain=self._extract_domain(domain),
            migration_date=migration_date,
            days_since_migration=days_since,
            timestamp=timestamp,
        )

        # Load baseline if provided
        baseline: dict[str, Any] | None = None
        redirect_map_data: list[dict[str, str]] = []
        if baseline_file:
            try:
                with open(baseline_file, "r", encoding="utf-8") as f:
                    baseline_raw = json.load(f)
                baseline = baseline_raw.get("baseline", baseline_raw)
                redirect_map_data = [
                    {"source": r.get("source", ""), "target": r.get("target", "")}
                    for r in baseline_raw.get("redirect_map", [])
                ]
                self.logger.info(f"Loaded baseline from {baseline_file}")
            except Exception as e:
                msg = f"Failed to load baseline file: {e}"
                self.logger.error(msg)
                report.errors.append(msg)

        try:
            # Step 1: Traffic comparison
            self.logger.info("Step 1/5: Comparing pre/post traffic...")
            report.traffic_comparison = await self.compare_traffic(
                domain, migration_date
            )

            # Step 2: Redirect health check
            if redirect_map_data:
                self.logger.info("Step 2/5: Checking redirect health...")
                report.redirect_health = await self.check_redirects(redirect_map_data)
            else:
                self.logger.info(
                    "Step 2/5: Skipping redirect check (no baseline redirect map)"
                )

            # Step 3: Indexation tracking
            self.logger.info("Step 3/5: Tracking indexation changes...")
            report.indexation = await self.track_indexation(domain, baseline)

            # Step 4: Ranking tracking
            self.logger.info("Step 4/5: Tracking keyword rankings...")
            report.ranking_changes = await self.track_rankings(domain)

            # Step 5: Recovery estimation
            self.logger.info("Step 5/5: Estimating recovery timeline...")
            report.recovery_estimate = self.estimate_recovery(
                report.traffic_comparison, migration_type
            )

            # Generate alerts
            report.alerts = self.generate_alerts(report)

            self.logger.info(
                f"Migration monitoring complete: "
                f"{days_since} days since migration, "
                f"{len(report.alerts)} alerts generated"
            )

        except Exception as e:
            msg = f"Migration monitoring pipeline error: {e}"
            self.logger.error(msg)
            report.errors.append(msg)

        return report


# ---------------------------------------------------------------------------
# Output helpers
# ---------------------------------------------------------------------------

def _format_text_report(report: MigrationReport) -> str:
    """Format monitoring report as human-readable text."""
    lines: list[str] = []
    lines.append("=" * 70)
    lines.append(" SEO MIGRATION MONITORING REPORT")
    lines.append(f" Domain: {report.domain}")
    lines.append(f" Migration Date: {report.migration_date}")
    lines.append(f" Days Since Migration: {report.days_since_migration}")
    lines.append(f" Generated: {report.timestamp}")
    lines.append("=" * 70)

    # Alerts
    if report.alerts:
        lines.append("")
        lines.append("--- ALERTS ---")
        for alert in report.alerts:
            icon = {"critical": "[!]", "warning": "[*]", "info": "[-]"}.get(
                alert.severity, "[-]"
            )
            lines.append(f" {icon} [{alert.severity.upper()}] {alert.message}")
            if alert.affected_urls:
                for url in alert.affected_urls[:5]:
                    lines.append(f"   - {url}")
                if len(alert.affected_urls) > 5:
                    lines.append(f"   ... and {len(alert.affected_urls) - 5} more")

    # Traffic comparison
    if report.traffic_comparison:
        lines.append("")
        lines.append("--- TRAFFIC COMPARISON ---")
        lines.append(
            f" {'Page Group':<40} {'Pre':>10} {'Post':>10} {'Change':>10} {'Status':>10}"
        )
        lines.append(" " + "-" * 83)
        for tc in report.traffic_comparison:
            group = tc.page_group[:38]
            lines.append(
                f" {group:<40} {tc.pre_traffic:>10,} {tc.post_traffic:>10,} "
                f"{tc.change_pct:>+9.1f}% {tc.status:>10}"
            )

    # Redirect health
    if report.redirect_health:
        broken = [r for r in report.redirect_health if r.is_broken]
        chains = [r for r in report.redirect_health if r.chain_length > 1]
        healthy = [r for r in report.redirect_health if not r.is_broken and r.chain_length <= 1]

        lines.append("")
        lines.append("--- REDIRECT HEALTH ---")
        lines.append(f" Total Redirects: {len(report.redirect_health):,}")
        lines.append(f" Healthy: {len(healthy):,}")
        lines.append(f" Broken: {len(broken):,}")
        lines.append(f" Chains (>1 hop): {len(chains):,}")

        if broken:
            lines.append("")
            lines.append(" Broken Redirects:")
            for r in broken[:10]:
                lines.append(f"   [{r.status_code}] {r.source} -> {r.target}")
                if r.error:
                    lines.append(f"     Error: {r.error}")

    # Indexation
    if report.indexation:
        idx = report.indexation
        lines.append("")
        lines.append("--- INDEXATION STATUS ---")
        lines.append(f" Pre-Migration Pages: {idx.pre_count:,}")
        lines.append(f" Post-Migration Pages: {idx.post_count:,}")
        lines.append(f" Change: {idx.change_pct:+.1f}%")
        lines.append(f" De-indexed Pages: {idx.deindexed_count:,}")

        if idx.missing_pages:
            lines.append("")
            lines.append(" Missing Pages (top 10):")
            for page in idx.missing_pages[:10]:
                lines.append(f"   - {page}")

    # Ranking changes
    if report.ranking_changes:
        lines.append("")
        lines.append("--- RANKING CHANGES ---")
        drops = [r for r in report.ranking_changes if r.change < 0]
        gains = [r for r in report.ranking_changes if r.change > 0]

        lines.append(f" Total Tracked: {len(report.ranking_changes)}")
        lines.append(f" Improved: {len(gains)}")
        lines.append(f" Declined: {len(drops)}")

        if drops:
            lines.append("")
            lines.append(" Biggest Drops:")
            lines.append(
                f" {'Keyword':<30} {'Pre':>6} {'Post':>6} {'Change':>8} {'Volume':>8}"
            )
            lines.append(" " + "-" * 61)
            for r in drops[:15]:
                kw = r.keyword[:28]
                lines.append(
                    f" {kw:<30} {r.pre_position:>6} {r.post_position:>6} "
                    f"{r.change:>+7} {r.search_volume:>8,}"
                )

    # Recovery estimate
    if report.recovery_estimate:
        est = report.recovery_estimate
        lines.append("")
        lines.append("--- RECOVERY ESTIMATE ---")
        lines.append(f" {est.get('message', 'N/A')}")
        weeks = est.get("estimated_weeks", "unknown")
        confidence = est.get("confidence", "unknown")
        lines.append(f" Estimated Weeks: {weeks}")
        lines.append(f" Confidence: {confidence}")

    if report.errors:
        lines.append("")
        lines.append("--- ERRORS ---")
        for err in report.errors:
            lines.append(f" - {err}")

    lines.append("")
    lines.append("=" * 70)
    return "\n".join(lines)

def _serialize_report(report: MigrationReport) -> dict:
    """Convert report to JSON-serializable dict."""
    output: dict[str, Any] = {
        "domain": report.domain,
        "migration_date": report.migration_date,
        "days_since_migration": report.days_since_migration,
        "traffic_comparison": [asdict(t) for t in report.traffic_comparison],
        "redirect_health": [asdict(r) for r in report.redirect_health],
        "indexation": asdict(report.indexation) if report.indexation else None,
        "ranking_changes": [asdict(r) for r in report.ranking_changes],
        "recovery_estimate": report.recovery_estimate,
        "alerts": [asdict(a) for a in report.alerts],
        "timestamp": report.timestamp,
    }
    if report.errors:
        output["errors"] = report.errors
    return output


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Migration Monitor - Post-migration SEO monitoring and alerting",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""\
Examples:
  python migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --baseline baseline.json --json
  python migration_monitor.py --domain https://new-example.com --migration-date 2025-01-15 --json
""",
    )
    parser.add_argument(
        "--domain",
        required=True,
        help="Domain to monitor (post-migration URL)",
    )
    parser.add_argument(
        "--migration-date",
        required=True,
        help="Migration date in YYYY-MM-DD format",
    )
    parser.add_argument(
        "--baseline",
        type=str,
        default=None,
        help="Path to baseline JSON file from migration_planner.py",
    )
    parser.add_argument(
        "--type",
        choices=["domain-move", "platform", "url-restructure", "https", "subdomain"],
        default="domain-move",
        help="Migration type for recovery estimation (default: domain-move)",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        default=False,
        help="Output in JSON format",
    )
    parser.add_argument(
        "--output",
        type=str,
        default=None,
        help="Save output to file path",
    )
    return parser.parse_args(argv)

async def async_main(args: argparse.Namespace) -> None:
    monitor = MigrationMonitor()

    report = await monitor.run(
        domain=args.domain,
        migration_date=args.migration_date,
        baseline_file=args.baseline,
        migration_type=args.type,
    )

    if args.json:
        output_str = json.dumps(_serialize_report(report), indent=2, ensure_ascii=False)
    else:
        output_str = _format_text_report(report)

    if args.output:
        with open(args.output, "w", encoding="utf-8") as f:
            f.write(output_str)
        logger.info(f"Migration report saved to {args.output}")
    else:
        print(output_str)

    monitor.print_stats()


def main() -> None:
    args = parse_args()
    asyncio.run(async_main(args))


if __name__ == "__main__":
    main()
@@ -0,0 +1,754 @@
"""
Migration Planner - SEO Site Migration Planning
================================================
Purpose: Pre-migration risk assessment, redirect mapping, URL inventory,
         crawl baseline capture, and checklist generation for site migrations.
Python: 3.10+

Usage:
    python migration_planner.py --domain https://example.com --type domain-move --new-domain https://new-example.com --json
    python migration_planner.py --domain https://example.com --type platform --json
    python migration_planner.py --domain https://example.com --type url-restructure --json
    python migration_planner.py --domain http://example.com --type https --json
    python migration_planner.py --domain https://blog.example.com --type subdomain --new-domain https://example.com/blog --json
"""

import argparse
import asyncio
import json
import logging
import sys
from dataclasses import dataclass, field, asdict
from datetime import datetime
from typing import Any
from urllib.parse import urlparse

from base_client import BaseAsyncClient, config

logger = logging.getLogger(__name__)


# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------

@dataclass
class MigrationURL:
    """A single URL in the migration inventory with associated metrics."""
    url: str = ""
    traffic: int = 0
    keywords: int = 0
    backlinks: int = 0
    risk_score: float = 0.0
    redirect_target: str = ""
    status_code: int = 200
    priority: str = "low"  # critical / high / medium / low


@dataclass
class MigrationBaseline:
    """Pre-migration baseline snapshot of the site."""
    domain: str = ""
    total_urls: int = 0
    total_traffic: int = 0
    total_keywords: int = 0
    total_referring_domains: int = 0
    top_pages: list[dict[str, Any]] = field(default_factory=list)
    url_inventory: list[MigrationURL] = field(default_factory=list)


@dataclass
class RedirectMap:
    """A single redirect mapping entry."""
    source: str = ""
    target: str = ""
    status_code: int = 301
    priority: str = "low"  # critical / high / medium / low
    risk_score: float = 0.0


@dataclass
class RiskAssessment:
    """Aggregated risk assessment for the migration."""
    high_risk_urls: int = 0
    medium_risk_urls: int = 0
    low_risk_urls: int = 0
    overall_risk: str = "low"  # critical / high / medium / low
    top_risk_urls: list[dict[str, Any]] = field(default_factory=list)
    risk_factors: list[str] = field(default_factory=list)


@dataclass
class MigrationPlan:
    """Complete migration plan output."""
    migration_type: str = ""
    domain: str = ""
    new_domain: str = ""
    baseline: MigrationBaseline | None = None
    redirect_map: list[RedirectMap] = field(default_factory=list)
    risk_assessment: RiskAssessment | None = None
    pre_migration_checklist: list[dict[str, Any]] = field(default_factory=list)
    timestamp: str = ""
    errors: list[str] = field(default_factory=list)


# ---------------------------------------------------------------------------
# Migration types
# ---------------------------------------------------------------------------

MIGRATION_TYPES = {
    "domain-move": "Domain Move (old domain -> new domain)",
    "platform": "Platform Change (CMS/framework migration)",
    "url-restructure": "URL Restructuring (path/slug changes)",
    "https": "HTTPS Migration (HTTP -> HTTPS)",
    "subdomain": "Subdomain Consolidation (subdomain -> subfolder)",
}


# ---------------------------------------------------------------------------
# Planner
# ---------------------------------------------------------------------------

class MigrationPlanner(BaseAsyncClient):
    """Plans site migrations using Firecrawl for crawling and Ahrefs for SEO data."""

    def __init__(self):
        super().__init__(max_concurrent=5, requests_per_second=2.0)

    @staticmethod
    def _extract_domain(url: str) -> str:
        """Extract bare domain from URL or return as-is if already bare."""
        if "://" in url:
            parsed = urlparse(url)
            return parsed.netloc.lower().replace("www.", "")
        return url.lower().replace("www.", "")

    @staticmethod
    def _normalize_url(url: str) -> str:
        """Ensure URL has a scheme."""
        if not url.startswith(("http://", "https://")):
            return f"https://{url}"
        return url

    # ------------------------------------------------------------------
    # MCP wrappers (return dicts; Claude MCP bridge fills these)
    # ------------------------------------------------------------------

    async def _call_ahrefs(self, tool: str, params: dict[str, Any]) -> dict:
        """Simulate Ahrefs MCP call. In production, routed via MCP bridge."""
        self.logger.info(f"Ahrefs MCP call: {tool} | params={params}")
        return {"tool": tool, "params": params, "data": {}}

    async def _call_firecrawl(self, tool: str, params: dict[str, Any]) -> dict:
        """Simulate Firecrawl MCP call. In production, routed via MCP bridge."""
        self.logger.info(f"Firecrawl MCP call: {tool} | params={params}")
        return {"tool": tool, "params": params, "data": {}}

    # ------------------------------------------------------------------
    # URL Inventory
    # ------------------------------------------------------------------

    async def crawl_url_inventory(self, domain: str) -> list[MigrationURL]:
        """Crawl the site via Firecrawl to capture all URLs and status codes."""
        url = self._normalize_url(domain)
        self.logger.info(f"Crawling URL inventory for {url}")

        resp = await self._call_firecrawl(
            "firecrawl_crawl",
            {"url": url, "limit": 5000, "scrapeOptions": {"formats": ["links"]}},
        )

        crawl_data = resp.get("data", {})
        pages = crawl_data.get("pages", [])

        inventory: list[MigrationURL] = []
        for page in pages:
            migration_url = MigrationURL(
                url=page.get("url", ""),
                status_code=int(page.get("status_code", 200)),
            )
            inventory.append(migration_url)

        if not inventory:
            # Fallback: create a single entry for the domain
            inventory.append(MigrationURL(url=url, status_code=200))
            self.logger.warning(
                "Firecrawl returned no pages; created placeholder entry. "
                "Verify Firecrawl MCP is configured."
            )
        else:
            self.logger.info(f"Crawled {len(inventory)} URLs from {domain}")

        return inventory

    # ------------------------------------------------------------------
    # Ahrefs Baseline
    # ------------------------------------------------------------------

    async def fetch_top_pages_baseline(
        self, domain: str, limit: int = 500
    ) -> list[dict[str, Any]]:
        """Fetch top pages with traffic and keyword data from Ahrefs."""
        domain = self._extract_domain(domain)
        self.logger.info(f"Fetching top pages baseline for {domain}")

        resp = await self._call_ahrefs(
            "site-explorer-top-pages",
            {"target": domain, "limit": limit},
        )

        pages_raw = resp.get("data", {}).get("pages", [])
        top_pages: list[dict[str, Any]] = []
        for page in pages_raw:
            top_pages.append({
                "url": page.get("url", ""),
                "traffic": int(page.get("traffic", 0)),
                "keywords": int(page.get("keywords", 0)),
                "top_keyword": page.get("top_keyword", ""),
                "position": int(page.get("position", 0)),
            })

        self.logger.info(f"Fetched {len(top_pages)} top pages for {domain}")
        return top_pages

    async def fetch_site_metrics(self, domain: str) -> dict[str, Any]:
        """Fetch overall site metrics from Ahrefs."""
        domain = self._extract_domain(domain)

        metrics_resp = await self._call_ahrefs(
            "site-explorer-metrics", {"target": domain}
        )
        metrics = metrics_resp.get("data", {})

        backlinks_resp = await self._call_ahrefs(
            "site-explorer-backlinks-stats", {"target": domain}
        )
        backlinks = backlinks_resp.get("data", {})

        return {
            "organic_traffic": int(metrics.get("organic_traffic", 0)),
            "organic_keywords": int(metrics.get("organic_keywords", 0)),
|
||||||
|
"referring_domains": int(backlinks.get("referring_domains", 0)),
|
||||||
|
}
|
||||||
|
|
||||||
|
async def fetch_page_backlinks(self, url: str) -> int:
|
||||||
|
"""Fetch backlink count for a specific URL."""
|
||||||
|
resp = await self._call_ahrefs(
|
||||||
|
"site-explorer-backlinks-stats", {"target": url}
|
||||||
|
)
|
||||||
|
return int(resp.get("data", {}).get("referring_domains", 0))
|
||||||
|
|
||||||
|
async def fetch_page_keywords(self, url: str) -> list[dict[str, Any]]:
|
||||||
|
"""Fetch keyword rankings for a specific URL."""
|
||||||
|
resp = await self._call_ahrefs(
|
||||||
|
"site-explorer-organic-keywords",
|
||||||
|
{"target": url, "limit": 100},
|
||||||
|
)
|
||||||
|
return resp.get("data", {}).get("keywords", [])
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Risk Assessment
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def assess_url_risk(self, url_data: MigrationURL) -> float:
|
||||||
|
"""Score risk for a single URL based on traffic, backlinks, and keywords.
|
||||||
|
|
||||||
|
Risk score 0-100:
|
||||||
|
- Traffic weight: 40% (high traffic = high risk if migration fails)
|
||||||
|
- Backlinks weight: 30% (external links break if redirect fails)
|
||||||
|
- Keywords weight: 30% (ranking loss risk)
|
||||||
|
"""
|
||||||
|
# Normalize each factor to 0-100
|
||||||
|
# Traffic: 1000+ monthly visits = high risk
|
||||||
|
traffic_score = min((url_data.traffic / 1000) * 100, 100) if url_data.traffic > 0 else 0
|
||||||
|
|
||||||
|
# Backlinks: 50+ referring domains = high risk
|
||||||
|
backlinks_score = min((url_data.backlinks / 50) * 100, 100) if url_data.backlinks > 0 else 0
|
||||||
|
|
||||||
|
# Keywords: 20+ rankings = high risk
|
||||||
|
keywords_score = min((url_data.keywords / 20) * 100, 100) if url_data.keywords > 0 else 0
|
||||||
|
|
||||||
|
risk = (
|
||||||
|
traffic_score * 0.40
|
||||||
|
+ backlinks_score * 0.30
|
||||||
|
+ keywords_score * 0.30
|
||||||
|
)
|
||||||
|
|
||||||
|
return round(min(max(risk, 0), 100), 1)
|
||||||
|
|
||||||
|
def classify_priority(self, risk_score: float) -> str:
|
||||||
|
"""Classify URL priority based on risk score."""
|
||||||
|
if risk_score >= 75:
|
||||||
|
return "critical"
|
||||||
|
elif risk_score >= 50:
|
||||||
|
return "high"
|
||||||
|
elif risk_score >= 25:
|
||||||
|
return "medium"
|
||||||
|
else:
|
||||||
|
return "low"
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Redirect Map
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def generate_redirect_map(
|
||||||
|
self,
|
||||||
|
url_inventory: list[MigrationURL],
|
||||||
|
migration_type: str,
|
||||||
|
new_domain: str | None = None,
|
||||||
|
) -> list[RedirectMap]:
|
||||||
|
"""Generate redirect mappings based on migration type."""
|
||||||
|
redirect_map: list[RedirectMap] = []
|
||||||
|
|
||||||
|
for url_entry in url_inventory:
|
||||||
|
source = url_entry.url
|
||||||
|
if not source:
|
||||||
|
continue
|
||||||
|
|
||||||
|
parsed = urlparse(source)
|
||||||
|
path = parsed.path
|
||||||
|
|
||||||
|
# Determine target URL based on migration type
|
||||||
|
if migration_type == "domain-move" and new_domain:
|
||||||
|
new_parsed = urlparse(self._normalize_url(new_domain))
|
||||||
|
target = f"{new_parsed.scheme}://{new_parsed.netloc}{path}"
|
||||||
|
|
||||||
|
elif migration_type == "https":
|
||||||
|
target = source.replace("http://", "https://")
|
||||||
|
|
||||||
|
elif migration_type == "subdomain" and new_domain:
|
||||||
|
# e.g., blog.example.com/page -> example.com/blog/page
|
||||||
|
new_parsed = urlparse(self._normalize_url(new_domain))
|
||||||
|
target = f"{new_parsed.scheme}://{new_parsed.netloc}{new_parsed.path.rstrip('/')}{path}"
|
||||||
|
|
||||||
|
elif migration_type == "url-restructure":
|
||||||
|
# Placeholder: URL restructuring requires custom mapping rules
|
||||||
|
# In practice, user provides a mapping CSV or pattern
|
||||||
|
target = source # Will need manual mapping
|
||||||
|
|
||||||
|
elif migration_type == "platform":
|
||||||
|
# Platform change: URLs may stay the same or change
|
||||||
|
target = source # Will need verification post-migration
|
||||||
|
|
||||||
|
else:
|
||||||
|
target = source
|
||||||
|
|
||||||
|
redirect_entry = RedirectMap(
|
||||||
|
source=source,
|
||||||
|
target=target,
|
||||||
|
status_code=301,
|
||||||
|
priority=url_entry.priority,
|
||||||
|
risk_score=url_entry.risk_score,
|
||||||
|
)
|
||||||
|
redirect_map.append(redirect_entry)
|
||||||
|
|
||||||
|
# Sort by risk score descending (highest risk first)
|
||||||
|
redirect_map.sort(key=lambda r: r.risk_score, reverse=True)
|
||||||
|
|
||||||
|
self.logger.info(
|
||||||
|
f"Generated {len(redirect_map)} redirect mappings "
|
||||||
|
f"for {migration_type} migration"
|
||||||
|
)
|
||||||
|
return redirect_map
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Checklist
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
def generate_checklist(self, migration_type: str) -> list[dict[str, Any]]:
|
||||||
|
"""Generate pre-migration checklist based on migration type."""
|
||||||
|
# Common checklist items for all migration types
|
||||||
|
common_items = [
|
||||||
|
{"step": 1, "category": "Baseline", "task": "URL 인벤토리 크롤링 완료", "description": "Firecrawl로 전체 URL 목록 및 상태 코드 캡처", "status": "pending"},
|
||||||
|
{"step": 2, "category": "Baseline", "task": "트래픽 베이스라인 캡처", "description": "Ahrefs에서 페이지별 트래픽, 키워드, 백링크 데이터 수집", "status": "pending"},
|
||||||
|
{"step": 3, "category": "Baseline", "task": "Google Search Console 데이터 내보내기", "description": "현재 인덱싱 상태, 사이트맵 현황, 크롤 통계 기록", "status": "pending"},
|
||||||
|
{"step": 4, "category": "Baseline", "task": "Google Analytics 벤치마크 저장", "description": "이전 전 30일/90일 트래픽 데이터 스냅샷 저장", "status": "pending"},
|
||||||
|
{"step": 5, "category": "Redirects", "task": "Redirect 맵 생성", "description": "모든 URL에 대한 301 리디렉트 매핑 완료", "status": "pending"},
|
||||||
|
{"step": 6, "category": "Redirects", "task": "고위험 URL 우선 검증", "description": "트래픽/백링크 기준 상위 URL 리디렉트 수동 확인", "status": "pending"},
|
||||||
|
{"step": 7, "category": "Technical", "task": "robots.txt 업데이트 준비", "description": "새 도메인/구조에 맞는 robots.txt 작성", "status": "pending"},
|
||||||
|
{"step": 8, "category": "Technical", "task": "XML 사이트맵 업데이트 준비", "description": "새 URL 구조 반영한 사이트맵 생성", "status": "pending"},
|
||||||
|
{"step": 9, "category": "Technical", "task": "Canonical 태그 업데이트 계획", "description": "모든 페이지의 canonical URL이 새 주소를 가리키도록 변경", "status": "pending"},
|
||||||
|
{"step": 10, "category": "Technical", "task": "Internal link 업데이트 계획", "description": "사이트 내부 링크가 새 URL을 직접 가리키도록 변경", "status": "pending"},
|
||||||
|
{"step": 11, "category": "Monitoring", "task": "모니터링 대시보드 설정", "description": "이전 후 트래픽, 인덱싱, 리디렉트 상태 모니터링 준비", "status": "pending"},
|
||||||
|
{"step": 12, "category": "Monitoring", "task": "알림 임계값 설정", "description": "트래픽 20% 이상 하락 시 알림 설정", "status": "pending"},
|
||||||
|
]
|
||||||
|
|
||||||
|
# Type-specific items
|
||||||
|
type_specific: dict[str, list[dict[str, Any]]] = {
|
||||||
|
"domain-move": [
|
||||||
|
{"step": 13, "category": "Domain", "task": "새 도메인 DNS 설정", "description": "DNS A/CNAME 레코드 설정 및 전파 확인", "status": "pending"},
|
||||||
|
{"step": 14, "category": "Domain", "task": "Google Search Console에 새 도메인 등록", "description": "새 도메인 속성 추가 및 소유권 확인", "status": "pending"},
|
||||||
|
{"step": 15, "category": "Domain", "task": "도메인 변경 알림 (GSC Change of Address)", "description": "Search Console에서 주소 변경 도구 실행", "status": "pending"},
|
||||||
|
{"step": 16, "category": "Domain", "task": "SSL 인증서 설치", "description": "새 도메인에 유효한 SSL 인증서 설치", "status": "pending"},
|
||||||
|
],
|
||||||
|
"platform": [
|
||||||
|
{"step": 13, "category": "Platform", "task": "URL 구조 매핑 확인", "description": "새 플랫폼에서 동일한 URL 구조 유지 여부 확인", "status": "pending"},
|
||||||
|
{"step": 14, "category": "Platform", "task": "메타 태그 이전 확인", "description": "Title, Description, Open Graph 태그 동일 여부 확인", "status": "pending"},
|
||||||
|
{"step": 15, "category": "Platform", "task": "구조화된 데이터 이전", "description": "JSON-LD Schema Markup 동일 여부 확인", "status": "pending"},
|
||||||
|
{"step": 16, "category": "Platform", "task": "스테이징 환경 테스트", "description": "스테이징에서 전체 크롤링 및 리디렉트 테스트 실행", "status": "pending"},
|
||||||
|
],
|
||||||
|
"url-restructure": [
|
||||||
|
{"step": 13, "category": "URL", "task": "URL 패턴 매핑 문서화", "description": "기존 → 신규 URL 패턴 규칙 문서화", "status": "pending"},
|
||||||
|
{"step": 14, "category": "URL", "task": "정규식 리디렉트 규칙 작성", "description": "서버 레벨 리디렉트 규칙 (nginx/Apache) 작성", "status": "pending"},
|
||||||
|
{"step": 15, "category": "URL", "task": "Breadcrumb 업데이트", "description": "새 URL 구조에 맞게 Breadcrumb 네비게이션 수정", "status": "pending"},
|
||||||
|
],
|
||||||
|
"https": [
|
||||||
|
{"step": 13, "category": "HTTPS", "task": "SSL 인증서 설치 및 확인", "description": "유효한 SSL 인증서 설치 (Let's Encrypt 또는 상용 인증서)", "status": "pending"},
|
||||||
|
{"step": 14, "category": "HTTPS", "task": "Mixed Content 점검", "description": "HTTP로 로드되는 리소스 (이미지, CSS, JS) 식별 및 수정", "status": "pending"},
|
||||||
|
{"step": 15, "category": "HTTPS", "task": "HSTS 헤더 설정", "description": "Strict-Transport-Security 헤더 활성화", "status": "pending"},
|
||||||
|
],
|
||||||
|
"subdomain": [
|
||||||
|
{"step": 13, "category": "Subdomain", "task": "서브도메인 → 서브폴더 매핑", "description": "서브도메인 경로를 서브폴더 경로로 매핑", "status": "pending"},
|
||||||
|
{"step": 14, "category": "Subdomain", "task": "서버 리디렉트 규칙 설정", "description": "서브도메인에서 메인 도메인으로의 301 리디렉트 규칙", "status": "pending"},
|
||||||
|
{"step": 15, "category": "Subdomain", "task": "DNS 설정 업데이트", "description": "서브도메인 DNS 레코드 유지 (리디렉트용)", "status": "pending"},
|
||||||
|
],
|
||||||
|
}
|
||||||
|
|
||||||
|
checklist = common_items.copy()
|
||||||
|
if migration_type in type_specific:
|
||||||
|
checklist.extend(type_specific[migration_type])
|
||||||
|
|
||||||
|
self.logger.info(
|
||||||
|
f"Generated {len(checklist)} checklist items for {migration_type} migration"
|
||||||
|
)
|
||||||
|
return checklist
|
||||||
|
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
# Orchestrator
|
||||||
|
# ------------------------------------------------------------------
|
||||||
|
|
||||||
|
async def run(
|
||||||
|
self,
|
||||||
|
domain: str,
|
||||||
|
migration_type: str,
|
||||||
|
new_domain: str | None = None,
|
||||||
|
) -> MigrationPlan:
|
||||||
|
"""Orchestrate full migration planning pipeline."""
|
||||||
|
timestamp = datetime.now().isoformat()
|
||||||
|
plan = MigrationPlan(
|
||||||
|
migration_type=migration_type,
|
||||||
|
domain=self._extract_domain(domain),
|
||||||
|
new_domain=self._extract_domain(new_domain) if new_domain else "",
|
||||||
|
timestamp=timestamp,
|
||||||
|
)
|
||||||
|
|
||||||
|
try:
|
||||||
|
# Step 1: Crawl URL inventory
|
||||||
|
self.logger.info("Step 1/6: Crawling URL inventory via Firecrawl...")
|
||||||
|
url_inventory = await self.crawl_url_inventory(domain)
|
||||||
|
|
||||||
|
# Step 2: Fetch Ahrefs baseline
|
||||||
|
self.logger.info("Step 2/6: Fetching Ahrefs top pages baseline...")
|
||||||
|
top_pages = await self.fetch_top_pages_baseline(domain)
|
||||||
|
site_metrics = await self.fetch_site_metrics(domain)
|
||||||
|
|
||||||
|
# Step 3: Enrich URL inventory with Ahrefs data
|
||||||
|
self.logger.info("Step 3/6: Enriching URLs with traffic/backlink data...")
|
||||||
|
top_pages_map: dict[str, dict] = {}
|
||||||
|
for page in top_pages:
|
||||||
|
page_url = page.get("url", "")
|
||||||
|
if page_url:
|
||||||
|
top_pages_map[page_url] = page
|
||||||
|
|
||||||
|
for url_entry in url_inventory:
|
||||||
|
page_data = top_pages_map.get(url_entry.url, {})
|
||||||
|
url_entry.traffic = int(page_data.get("traffic", 0))
|
||||||
|
url_entry.keywords = int(page_data.get("keywords", 0))
|
||||||
|
|
||||||
|
# Step 4: Risk assessment per URL
|
||||||
|
self.logger.info("Step 4/6: Scoring risk per URL...")
|
||||||
|
for url_entry in url_inventory:
|
||||||
|
url_entry.risk_score = self.assess_url_risk(url_entry)
|
||||||
|
url_entry.priority = self.classify_priority(url_entry.risk_score)
|
||||||
|
|
||||||
|
# Build baseline
|
||||||
|
baseline = MigrationBaseline(
|
||||||
|
domain=self._extract_domain(domain),
|
||||||
|
total_urls=len(url_inventory),
|
||||||
|
total_traffic=site_metrics.get("organic_traffic", 0),
|
||||||
|
total_keywords=site_metrics.get("organic_keywords", 0),
|
||||||
|
total_referring_domains=site_metrics.get("referring_domains", 0),
|
||||||
|
top_pages=top_pages[:50], # Store top 50 for reference
|
||||||
|
url_inventory=url_inventory,
|
||||||
|
)
|
||||||
|
plan.baseline = baseline
|
||||||
|
|
||||||
|
# Step 5: Generate redirect map
|
||||||
|
self.logger.info("Step 5/6: Generating redirect map...")
|
||||||
|
plan.redirect_map = self.generate_redirect_map(
|
||||||
|
url_inventory, migration_type, new_domain
|
||||||
|
)
|
||||||
|
|
||||||
|
# Build risk assessment summary
|
||||||
|
high_risk = sum(1 for u in url_inventory if u.risk_score >= 75)
|
||||||
|
medium_risk = sum(1 for u in url_inventory if 25 <= u.risk_score < 75)
|
||||||
|
low_risk = sum(1 for u in url_inventory if u.risk_score < 25)
|
||||||
|
|
||||||
|
# Determine overall risk level
|
||||||
|
if high_risk > len(url_inventory) * 0.2:
|
||||||
|
overall_risk = "critical"
|
||||||
|
elif high_risk > len(url_inventory) * 0.1:
|
||||||
|
overall_risk = "high"
|
||||||
|
elif medium_risk > len(url_inventory) * 0.3:
|
||||||
|
overall_risk = "medium"
|
||||||
|
else:
|
||||||
|
overall_risk = "low"
|
||||||
|
|
||||||
|
# Top risk URLs
|
||||||
|
sorted_urls = sorted(url_inventory, key=lambda u: u.risk_score, reverse=True)
|
||||||
|
top_risk = [
|
||||||
|
{
|
||||||
|
"url": u.url,
|
||||||
|
"risk_score": u.risk_score,
|
||||||
|
"traffic": u.traffic,
|
||||||
|
"keywords": u.keywords,
|
||||||
|
"backlinks": u.backlinks,
|
||||||
|
}
|
||||||
|
for u in sorted_urls[:20]
|
||||||
|
]
|
||||||
|
|
||||||
|
# Risk factors
|
||||||
|
risk_factors: list[str] = []
|
||||||
|
if high_risk > 0:
|
||||||
|
risk_factors.append(
|
||||||
|
f"{high_risk}개 고위험 URL (트래픽/백링크 손실 위험)"
|
||||||
|
)
|
||||||
|
if baseline.total_traffic > 10000:
|
||||||
|
risk_factors.append(
|
||||||
|
f"월간 오가닉 트래픽 {baseline.total_traffic:,}회 — 이전 실패 시 큰 영향"
|
||||||
|
)
|
||||||
|
if baseline.total_referring_domains > 500:
|
||||||
|
risk_factors.append(
|
||||||
|
f"참조 도메인 {baseline.total_referring_domains:,}개 — 리디렉트 누락 시 링크 에퀴티 손실"
|
||||||
|
)
|
||||||
|
if migration_type == "domain-move":
|
||||||
|
risk_factors.append(
|
||||||
|
"도메인 변경은 가장 위험한 이전 유형 — 최소 3-6개월 회복 예상"
|
||||||
|
)
|
||||||
|
elif migration_type == "url-restructure":
|
||||||
|
risk_factors.append(
|
||||||
|
"URL 구조 변경 시 모든 내부/외부 링크 영향 — 정규식 리디렉트 필수"
|
||||||
|
)
|
||||||
|
|
||||||
|
plan.risk_assessment = RiskAssessment(
|
||||||
|
high_risk_urls=high_risk,
|
||||||
|
medium_risk_urls=medium_risk,
|
||||||
|
low_risk_urls=low_risk,
|
||||||
|
overall_risk=overall_risk,
|
||||||
|
top_risk_urls=top_risk,
|
||||||
|
risk_factors=risk_factors,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Step 6: Generate checklist
|
||||||
|
self.logger.info("Step 6/6: Generating pre-migration checklist...")
|
||||||
|
plan.pre_migration_checklist = self.generate_checklist(migration_type)
|
||||||
|
|
||||||
|
self.logger.info(
|
||||||
|
f"Migration plan complete: {len(url_inventory)} URLs inventoried, "
|
||||||
|
f"{len(plan.redirect_map)} redirects mapped, "
|
||||||
|
f"overall risk: {overall_risk}"
|
||||||
|
)
|
||||||
|
|
||||||
|
except Exception as e:
|
||||||
|
msg = f"Migration planning pipeline error: {e}"
|
||||||
|
self.logger.error(msg)
|
||||||
|
plan.errors.append(msg)
|
||||||
|
|
||||||
|
return plan
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Output helpers
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def _format_text_report(plan: MigrationPlan) -> str:
|
||||||
|
"""Format migration plan as human-readable text report."""
|
||||||
|
lines: list[str] = []
|
||||||
|
lines.append("=" * 70)
|
||||||
|
lines.append(" SEO MIGRATION PLAN")
|
||||||
|
lines.append(f" Domain: {plan.domain}")
|
||||||
|
if plan.new_domain:
|
||||||
|
lines.append(f" New Domain: {plan.new_domain}")
|
||||||
|
lines.append(f" Migration Type: {MIGRATION_TYPES.get(plan.migration_type, plan.migration_type)}")
|
||||||
|
lines.append(f" Generated: {plan.timestamp}")
|
||||||
|
lines.append("=" * 70)
|
||||||
|
|
||||||
|
if plan.baseline:
|
||||||
|
b = plan.baseline
|
||||||
|
lines.append("")
|
||||||
|
lines.append("--- BASELINE ---")
|
||||||
|
lines.append(f" Total URLs: {b.total_urls:,}")
|
||||||
|
lines.append(f" Organic Traffic: {b.total_traffic:,}")
|
||||||
|
lines.append(f" Organic Keywords: {b.total_keywords:,}")
|
||||||
|
lines.append(f" Referring Domains: {b.total_referring_domains:,}")
|
||||||
|
|
||||||
|
if plan.risk_assessment:
|
||||||
|
r = plan.risk_assessment
|
||||||
|
lines.append("")
|
||||||
|
lines.append("--- RISK ASSESSMENT ---")
|
||||||
|
lines.append(f" Overall Risk: {r.overall_risk.upper()}")
|
||||||
|
lines.append(f" High Risk URLs: {r.high_risk_urls:,}")
|
||||||
|
lines.append(f" Medium Risk: {r.medium_risk_urls:,}")
|
||||||
|
lines.append(f" Low Risk: {r.low_risk_urls:,}")
|
||||||
|
if r.risk_factors:
|
||||||
|
lines.append("")
|
||||||
|
lines.append(" Risk Factors:")
|
||||||
|
for factor in r.risk_factors:
|
||||||
|
lines.append(f" - {factor}")
|
||||||
|
if r.top_risk_urls:
|
||||||
|
lines.append("")
|
||||||
|
lines.append(" Top Risk URLs:")
|
||||||
|
for url_info in r.top_risk_urls[:10]:
|
||||||
|
lines.append(
|
||||||
|
f" [{url_info['risk_score']:.0f}] {url_info['url']} "
|
||||||
|
f"(traffic={url_info['traffic']:,}, kw={url_info['keywords']})"
|
||||||
|
)
|
||||||
|
|
||||||
|
if plan.redirect_map:
|
||||||
|
lines.append("")
|
||||||
|
lines.append(f"--- REDIRECT MAP ({len(plan.redirect_map)} entries) ---")
|
||||||
|
# Show top 20 by risk
|
||||||
|
for i, rmap in enumerate(plan.redirect_map[:20], 1):
|
||||||
|
lines.append(
|
||||||
|
f" {i:>3}. [{rmap.priority.upper():>8}] "
|
||||||
|
f"{rmap.source} -> {rmap.target}"
|
||||||
|
)
|
||||||
|
if len(plan.redirect_map) > 20:
|
||||||
|
lines.append(f" ... and {len(plan.redirect_map) - 20} more entries")
|
||||||
|
|
||||||
|
if plan.pre_migration_checklist:
|
||||||
|
lines.append("")
|
||||||
|
lines.append("--- PRE-MIGRATION CHECKLIST ---")
|
||||||
|
for item in plan.pre_migration_checklist:
|
||||||
|
status_marker = "[ ]" if item["status"] == "pending" else "[x]"
|
||||||
|
lines.append(
|
||||||
|
f" {status_marker} Step {item['step']}: {item['task']}"
|
||||||
|
)
|
||||||
|
lines.append(f" {item['description']}")
|
||||||
|
|
||||||
|
if plan.errors:
|
||||||
|
lines.append("")
|
||||||
|
lines.append("--- ERRORS ---")
|
||||||
|
for err in plan.errors:
|
||||||
|
lines.append(f" - {err}")
|
||||||
|
|
||||||
|
lines.append("")
|
||||||
|
lines.append("=" * 70)
|
||||||
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
|
||||||
|
def _serialize_plan(plan: MigrationPlan) -> dict:
|
||||||
|
"""Convert plan to JSON-serializable dict."""
|
||||||
|
output: dict[str, Any] = {
|
||||||
|
"domain": plan.domain,
|
||||||
|
"new_domain": plan.new_domain,
|
||||||
|
"migration_type": plan.migration_type,
|
||||||
|
"baseline": None,
|
||||||
|
"redirect_map": [asdict(r) for r in plan.redirect_map],
|
||||||
|
"risk_assessment": asdict(plan.risk_assessment) if plan.risk_assessment else None,
|
||||||
|
"pre_migration_checklist": plan.pre_migration_checklist,
|
||||||
|
"timestamp": plan.timestamp,
|
||||||
|
}
|
||||||
|
|
||||||
|
if plan.baseline:
|
||||||
|
output["baseline"] = {
|
||||||
|
"domain": plan.baseline.domain,
|
||||||
|
"total_urls": plan.baseline.total_urls,
|
||||||
|
"total_traffic": plan.baseline.total_traffic,
|
||||||
|
"total_keywords": plan.baseline.total_keywords,
|
||||||
|
"total_referring_domains": plan.baseline.total_referring_domains,
|
||||||
|
"top_pages": plan.baseline.top_pages,
|
||||||
|
"url_inventory": [asdict(u) for u in plan.baseline.url_inventory],
|
||||||
|
}
|
||||||
|
|
||||||
|
if plan.errors:
|
||||||
|
output["errors"] = plan.errors
|
||||||
|
|
||||||
|
return output
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# CLI
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
|
||||||
|
parser = argparse.ArgumentParser(
|
||||||
|
description="SEO Migration Planner - Pre-migration risk assessment and redirect mapping",
|
||||||
|
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||||
|
epilog="""\
|
||||||
|
Examples:
|
||||||
|
python migration_planner.py --domain https://example.com --type domain-move --new-domain https://new-example.com --json
|
||||||
|
python migration_planner.py --domain https://example.com --type platform --json
|
||||||
|
python migration_planner.py --domain https://example.com --type url-restructure --json
|
||||||
|
python migration_planner.py --domain http://example.com --type https --json
|
||||||
|
python migration_planner.py --domain https://blog.example.com --type subdomain --new-domain https://example.com/blog --json
|
||||||
|
""",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--domain",
|
||||||
|
required=True,
|
||||||
|
help="Target website URL or domain to plan migration for",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--type",
|
||||||
|
required=True,
|
||||||
|
choices=["domain-move", "platform", "url-restructure", "https", "subdomain"],
|
||||||
|
help="Migration type",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--new-domain",
|
||||||
|
type=str,
|
||||||
|
default=None,
|
||||||
|
help="New domain/URL (required for domain-move and subdomain types)",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--json",
|
||||||
|
action="store_true",
|
||||||
|
default=False,
|
||||||
|
help="Output in JSON format",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--output",
|
||||||
|
type=str,
|
||||||
|
default=None,
|
||||||
|
help="Save output to file path",
|
||||||
|
)
|
||||||
|
return parser.parse_args(argv)
|
||||||
|
|
||||||
|
|
||||||
|
async def async_main(args: argparse.Namespace) -> None:
|
||||||
|
# Validate required arguments for specific types
|
||||||
|
if args.type in ("domain-move", "subdomain") and not args.new_domain:
|
||||||
|
logger.error(f"--new-domain is required for {args.type} migration type")
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
|
planner = MigrationPlanner()
|
||||||
|
|
||||||
|
plan = await planner.run(
|
||||||
|
domain=args.domain,
|
||||||
|
migration_type=args.type,
|
||||||
|
new_domain=args.new_domain,
|
||||||
|
)
|
||||||
|
|
||||||
|
if args.json:
|
||||||
|
output_str = json.dumps(_serialize_plan(plan), indent=2, ensure_ascii=False)
|
||||||
|
else:
|
||||||
|
output_str = _format_text_report(plan)
|
||||||
|
|
||||||
|
if args.output:
|
||||||
|
with open(args.output, "w", encoding="utf-8") as f:
|
||||||
|
f.write(output_str)
|
||||||
|
logger.info(f"Migration plan saved to {args.output}")
|
||||||
|
else:
|
||||||
|
print(output_str)
|
||||||
|
|
||||||
|
planner.print_stats()
|
||||||
|
|
||||||
|
|
||||||
|
def main() -> None:
|
||||||
|
args = parse_args()
|
||||||
|
asyncio.run(async_main(args))
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,8 @@
# 33-seo-migration-planner dependencies
requests>=2.31.0
aiohttp>=3.9.0
pandas>=2.1.0
tenacity>=8.2.0
tqdm>=4.66.0
python-dotenv>=1.0.0
rich>=13.7.0
custom-skills/33-seo-migration-planner/desktop/SKILL.md (new file, 171 lines)
@@ -0,0 +1,171 @@
---
name: seo-migration-planner
description: |
  SEO site migration planning and monitoring. Triggers: site migration, domain move, redirect mapping, platform migration, URL restructuring, HTTPS migration, subdomain consolidation, 사이트 이전, 도메인 이전, 리디렉트 매핑.
---

# SEO Migration Planner & Monitor

## Purpose

Comprehensive site migration planning and post-migration monitoring for SEO: crawl-based URL inventory, traffic/keyword baseline capture via Ahrefs, redirect map generation with per-URL risk scoring, pre-migration checklist creation, and post-launch traffic/indexation/ranking recovery tracking with automated alerts. Supports domain moves, platform changes, URL restructuring, HTTPS migrations, and subdomain consolidation.

## Core Capabilities

1. **URL Inventory** - Crawl entire site via Firecrawl to capture all URLs and status codes
2. **Traffic Baseline** - Capture per-page traffic, keywords, and backlinks via Ahrefs
3. **Redirect Map Generation** - Create old URL -> new URL mappings with 301 redirect rules
4. **Risk Scoring** - Score each URL (0-100) based on traffic, backlinks, and keyword rankings
5. **Pre-Migration Checklist** - Generate type-specific migration checklist (Korean)
6. **Post-Migration Traffic Comparison** - Compare pre vs post traffic by page group
7. **Redirect Health Check** - Detect broken redirects, chains, and loops
8. **Indexation Tracking** - Monitor indexed page count changes and missing pages
9. **Ranking Monitoring** - Track keyword position changes for priority keywords
10. **Recovery Estimation** - Estimate traffic recovery timeline based on migration type
11. **Alert Generation** - Flag traffic drops >20%, broken redirects, indexation loss

## MCP Tool Usage

### Ahrefs for SEO Baseline & Monitoring
```
mcp__ahrefs__site-explorer-metrics: Current organic metrics (traffic, keywords)
mcp__ahrefs__site-explorer-metrics-history: Historical metrics for pre/post comparison
mcp__ahrefs__site-explorer-top-pages: Top performing pages for baseline
mcp__ahrefs__site-explorer-pages-by-traffic: Pages ranked by traffic for risk scoring
mcp__ahrefs__site-explorer-organic-keywords: Keyword rankings per page
mcp__ahrefs__site-explorer-referring-domains: Referring domains for risk scoring
mcp__ahrefs__site-explorer-backlinks-stats: Backlink overview for migration impact
```

### Firecrawl for URL Inventory & Redirect Verification
```
mcp__firecrawl__firecrawl_crawl: Crawl entire site for URL inventory
mcp__firecrawl__firecrawl_scrape: Verify individual redirect health
```

### Notion for Report Storage
```
mcp__notion__notion-create-pages: Save reports to SEO Audit Log
```

### Perplexity for Migration Best Practices
```
mcp__perplexity__search: Research migration best practices and common pitfalls
```
## Workflow

### Pre-Migration Planning
1. Accept target domain, migration type, and new domain (if applicable)
2. Crawl URL inventory via Firecrawl (capture all URLs + status codes)
3. Fetch Ahrefs top pages baseline (traffic, keywords, backlinks per page)
4. Fetch site-level metrics (total traffic, keywords, referring domains)
5. Enrich URL inventory with Ahrefs traffic/backlink data
6. Score risk per URL (0-100) based on traffic weight (40%), backlinks (30%), keywords (30%)
7. Generate redirect map (old URL -> new URL) based on migration type
8. Aggregate risk assessment (high/medium/low URL counts, overall risk level)
9. Generate pre-migration checklist (common + type-specific items, in Korean)
10. Save baseline and plan to Notion

### Post-Migration Monitoring
1. Accept domain, migration date, and optional baseline JSON
2. Compare pre vs post traffic using Ahrefs metrics history
3. Check redirect health via Firecrawl (broken, chains, loops)
4. Track indexation changes (pre vs post page count, missing pages)
5. Track keyword ranking changes for priority keywords
6. Estimate recovery timeline based on traffic delta and migration type
7. Generate alerts for significant issues (traffic >20% drop, broken redirects, etc.)
8. Save monitoring report to Notion
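The recovery estimate in step 6 reduces to a ratio of post- vs pre-migration organic traffic; a hedged sketch (function names are illustrative, not the script's API):

```python
def recovery_rate(pre_traffic: int, post_traffic: int) -> float:
    """Percent of baseline organic traffic currently recovered (can exceed 100)."""
    if pre_traffic <= 0:
        return 0.0  # No baseline to compare against
    return round(post_traffic / pre_traffic * 100, 1)

def traffic_drop_pct(pre_traffic: int, post_traffic: int) -> float:
    """Percent drop vs baseline; negative means post-migration growth."""
    if pre_traffic <= 0:
        return 0.0
    return round((pre_traffic - post_traffic) / pre_traffic * 100, 1)

print(recovery_rate(10000, 7200))     # → 72.0
print(traffic_drop_pct(10000, 7200))  # → 28.0
```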
|
||||||
|
|

## Output Format

### Planning Report

```markdown
## SEO 사이트 이전 계획: [domain]

### 베이스라인
- 전체 URL 수: [count]
- 오가닉 트래픽: [traffic]
- 오가닉 키워드: [keywords]
- 참조 도메인: [count]

### 위험 평가
- 전체 위험도: [HIGH/MEDIUM/LOW]
- 고위험 URL: [count]개
- 중위험 URL: [count]개
- 저위험 URL: [count]개

### 리디렉트 맵 (상위 위험 URL)
| Source URL | Target URL | Risk Score | Priority |
|------------|------------|------------|----------|

### 사전 체크리스트
- [ ] Step 1: ...
- [ ] Step 2: ...
```

### Monitoring Report

```markdown
## SEO 이전 모니터링 보고서: [domain]
### 이전일: [date] | 경과일: [N]일

### 알림
- [severity] [message]

### 트래픽 비교
| Page Group | Pre | Post | Change | Status |
|------------|-----|------|--------|--------|

### 리디렉트 상태
- 전체: [count] | 정상: [count] | 깨짐: [count] | 체인: [count]

### 인덱싱 현황
- 이전 전: [count] | 이전 후: [count] | 변화: [pct]%

### 회복 예상
- 예상 기간: [weeks]주
- 현재 회복률: [pct]%
```

## Risk Scoring Methodology

| Factor | Weight | Scale |
|--------|--------|-------|
| Traffic | 40% | 1,000+ monthly visits = high risk |
| Backlinks | 30% | 50+ referring domains = high risk |
| Keywords | 30% | 20+ keyword rankings = high risk |

### Priority Classification

| Risk Score | Priority | Action |
|------------|----------|--------|
| 75-100 | Critical | Manual redirect verification required |
| 50-74 | High | Priority redirect with monitoring |
| 25-49 | Medium | Standard redirect |
| 0-24 | Low | Batch redirect |
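The two tables above combine into a simple per-URL score. A minimal sketch of the 40/30/30 rule — the linear scaling toward each "high risk" threshold and the function names are illustrative, not the script's actual implementation:

```python
def risk_score(traffic: int, ref_domains: int, keywords: int) -> int:
    """Score a URL 0-100: traffic (40%), backlinks (30%), keywords (30%).

    Each factor scales linearly up to its high-risk threshold
    (1,000 visits, 50 referring domains, 20 keyword rankings).
    """
    traffic_part = min(traffic / 1000, 1.0) * 40
    backlink_part = min(ref_domains / 50, 1.0) * 30
    keyword_part = min(keywords / 20, 1.0) * 30
    return round(traffic_part + backlink_part + keyword_part)

def priority(score: int) -> str:
    """Map a risk score to the priority bands in the table above."""
    if score >= 75:
        return "critical"
    if score >= 50:
        return "high"
    if score >= 25:
        return "medium"
    return "low"
```
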

## Alert Thresholds

| Alert Type | Threshold | Severity |
|------------|-----------|----------|
| Traffic drop | >20% | Warning (Critical if >40%) |
| Broken redirects | >0 | Warning (Critical if >10) |
| Redirect chains | >0 | Warning |
| Indexation loss | >10% | Warning (Critical if >30%) |
| Ranking drop | >5 positions (volume 100+) | Warning (Critical if >20 keywords affected) |

## Limitations

- Ahrefs data has a freshness lag of roughly 24 hours
- Firecrawl crawls are limited to 5,000 URLs per run
- Redirect chain detection depends on Firecrawl following redirects
- Recovery estimation is a heuristic based on industry averages
- URL restructuring requires manual mapping rules (no automatic pattern detection)

## Notion Output (Required)

All reports MUST be saved to the OurDigital SEO Audit Log:

- **Database ID**: `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
- **Properties**: Issue (title), Site (url), Category ("SEO Migration"), Priority, Found Date, Audit ID
- **Language**: Korean with English technical terms
- **Audit ID Format**: MIGR-YYYYMMDD-NNN
custom-skills/33-seo-migration-planner/desktop/skill.yaml (new file, 10 lines)
@@ -0,0 +1,10 @@
name: seo-migration-planner
description: |
  SEO site migration planning and monitoring. Triggers: site migration, domain move, redirect mapping, platform migration, URL restructuring, 사이트 이전.
allowed-tools:
  - mcp__ahrefs__*
  - mcp__firecrawl__*
  - mcp__notion__*
  - mcp__perplexity__*
  - WebSearch
  - WebFetch
@@ -0,0 +1,37 @@
# Ahrefs

> MCP tool documentation for migration planner skill

## Available Commands

- `site-explorer-metrics` - Get current organic metrics (traffic, keywords) for a domain
- `site-explorer-metrics-history` - Get historical organic metrics for pre/post comparison
- `site-explorer-top-pages` - Get top performing pages by traffic for baseline
- `site-explorer-pages-by-traffic` - Get pages ranked by organic traffic for risk scoring
- `site-explorer-organic-keywords` - Get keyword rankings per page
- `site-explorer-referring-domains` - Get referring domain list for risk scoring
- `site-explorer-backlinks-stats` - Get backlink overview for migration impact assessment

## Configuration

- Requires Ahrefs MCP server configured in Claude Desktop
- API access via `mcp__ahrefs__*` tool prefix

## Examples

```
# Get site baseline metrics
mcp__ahrefs__site-explorer-metrics(target="example.com")

# Get top pages for risk scoring
mcp__ahrefs__site-explorer-top-pages(target="example.com", limit=500)

# Get traffic history for pre/post comparison
mcp__ahrefs__site-explorer-metrics-history(target="example.com", date_from="2025-01-01")

# Get backlink stats for a specific page
mcp__ahrefs__site-explorer-backlinks-stats(target="https://example.com/important-page")

# Get keyword rankings
mcp__ahrefs__site-explorer-organic-keywords(target="example.com", limit=200)
```
@@ -0,0 +1,29 @@
# Firecrawl

> MCP tool documentation for URL inventory crawling and redirect verification

## Available Commands

- `firecrawl_crawl` - Crawl entire site to capture all URLs and status codes for migration inventory
- `firecrawl_scrape` - Scrape individual pages to verify redirect health (status codes, chains, final URL)

## Configuration

- Requires Firecrawl MCP server configured in Claude Desktop
- API access via `mcp__firecrawl__*` tool prefix

## Examples

```
# Crawl full site for URL inventory
mcp__firecrawl__firecrawl_crawl(url="https://example.com", limit=5000, scrapeOptions={"formats": ["links"]})

# Verify a redirect
mcp__firecrawl__firecrawl_scrape(url="https://old-example.com/page", formats=["links"])
```

## Notes

- Crawl limit defaults to 5,000 URLs per run
- For larger sites, run multiple crawls with path-based filtering
- Redirect verification returns status_code, final_url, and redirect_chain
@@ -0,0 +1,46 @@
# Notion

> MCP tool documentation for saving migration planning and monitoring reports

## Available Commands

- `notion-create-pages` - Create new pages in the SEO Audit Log database
- `notion-update-page` - Update existing audit entries
- `notion-query-database-view` - Query existing reports
- `notion-search` - Search across the Notion workspace

## Configuration

- Database ID: `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
- All reports saved with Category: "SEO Migration"
- Audit ID format: MIGR-YYYYMMDD-NNN

## Examples

```
# Create migration planning report
mcp__notion__notion-create-pages(
    database_id="2c8581e5-8a1e-8035-880b-e38cefc2f3ef",
    properties={
        "Issue": {"title": [{"text": {"content": "사이트 이전 계획 - example.com - 2025-01-15"}}]},
        "Site": {"url": "https://example.com"},
        "Category": {"select": {"name": "SEO Migration"}},
        "Priority": {"select": {"name": "High"}},
        "Found Date": {"date": {"start": "2025-01-15"}},
        "Audit ID": {"rich_text": [{"text": {"content": "MIGR-20250115-001"}}]}
    }
)

# Create post-migration monitoring report
mcp__notion__notion-create-pages(
    database_id="2c8581e5-8a1e-8035-880b-e38cefc2f3ef",
    properties={
        "Issue": {"title": [{"text": {"content": "이전 모니터링 보고서 - new-example.com - 2025-02-01"}}]},
        "Site": {"url": "https://new-example.com"},
        "Category": {"select": {"name": "SEO Migration"}},
        "Priority": {"select": {"name": "Critical"}},
        "Found Date": {"date": {"start": "2025-02-01"}},
        "Audit ID": {"rich_text": [{"text": {"content": "MIGR-20250201-001"}}]}
    }
)
```
custom-skills/34-seo-reporting-dashboard/README.md (new file, 78 lines)
@@ -0,0 +1,78 @@
# SEO Reporting Dashboard

Comprehensive SEO report and dashboard generator: aggregates the results of every SEO skill into stakeholder reports and interactive HTML dashboards.

## Overview

Aggregates outputs from all SEO skills (11-33) into executive reports with interactive HTML dashboards, trend analysis, and Korean-language executive summaries. This is the PRESENTATION LAYER that sits on top of skill 25 (KPI Framework) and all other skill outputs.

## Relationship to Skill 25 (KPI Framework)

Skill 25 establishes KPI baselines, targets, and health scores for a single domain. Skill 34 builds on top of skill 25 by:

- Aggregating outputs from ALL SEO skills (not just KPIs)
- Generating visual HTML dashboards with Chart.js
- Producing audience-specific Korean executive summaries
- Providing cross-skill priority analysis

## Dual-Platform Structure

```
34-seo-reporting-dashboard/
├── code/                                # Claude Code version
│   ├── CLAUDE.md                        # Action-oriented directive
│   ├── commands/
│   │   └── seo-reporting-dashboard.md   # Slash command
│   └── scripts/
│       ├── base_client.py               # Shared async utilities
│       ├── report_aggregator.py         # Collect + normalize skill outputs
│       ├── dashboard_generator.py       # HTML dashboard with Chart.js
│       ├── executive_report.py          # Korean executive summary
│       └── requirements.txt
│
├── desktop/                             # Claude Desktop version
│   ├── SKILL.md                         # MCP-based workflow
│   ├── skill.yaml                       # Extended metadata
│   └── tools/
│       ├── ahrefs.md                    # Ahrefs tool docs
│       └── notion.md                    # Notion tool docs
│
└── README.md
```

## Quick Start

### Claude Code

```bash
pip install -r code/scripts/requirements.txt

# Aggregate all skill outputs
python code/scripts/report_aggregator.py --domain https://example.com --json

# Generate HTML dashboard
python code/scripts/dashboard_generator.py --report report.json --output dashboard.html

# Generate Korean executive report
python code/scripts/executive_report.py --report report.json --audience c-level --output report.md
```

## Features

- Cross-skill report aggregation (skills 11-33)
- Interactive HTML dashboard with Chart.js charts
- Korean-language executive summaries
- Audience-specific reporting (C-level, marketing, technical)
- Notion integration for reading past audits and writing reports
- Mobile-responsive dashboard layout

## Requirements

- Python 3.10+
- Dependencies: `pip install -r code/scripts/requirements.txt`
- Notion API token (for database access)
- Ahrefs API token (for fresh data pulls)

## Triggers

- SEO report, SEO dashboard, executive summary
- 보고서, 대시보드, 종합 보고서, 성과 보고서
- performance report, reporting dashboard
custom-skills/34-seo-reporting-dashboard/code/CLAUDE.md (new file, 173 lines)
@@ -0,0 +1,173 @@
# CLAUDE.md

## Overview

SEO reporting dashboard and executive report generator. Aggregates outputs from all SEO skills (11-33) into stakeholder-ready reports with interactive HTML dashboards, trend analysis, and Korean-language executive summaries. This is the PRESENTATION LAYER that sits on top of skill 25 (KPI Framework) and all other skill outputs, providing a unified view of SEO performance across all audit dimensions.

## Quick Start

```bash
pip install -r scripts/requirements.txt

# Aggregate outputs from all SEO skills
python scripts/report_aggregator.py --domain https://example.com --json

# Generate HTML dashboard
python scripts/dashboard_generator.py --report aggregated_report.json --output dashboard.html

# Generate Korean executive report
python scripts/executive_report.py --report aggregated_report.json --audience c-level --output report.md
```

## Scripts

| Script | Purpose | Key Output |
|--------|---------|------------|
| `report_aggregator.py` | Collect and normalize outputs from all SEO skills | Unified aggregated report, cross-skill health score, priority issues |
| `dashboard_generator.py` | Generate interactive HTML dashboard with Chart.js | Self-contained HTML file with charts and responsive layout |
| `executive_report.py` | Korean-language executive summary generation | Markdown report tailored to audience level |
| `base_client.py` | Shared utilities | RateLimiter, ConfigManager, BaseAsyncClient |

## Report Aggregator

```bash
# Aggregate all skill outputs for a domain
python scripts/report_aggregator.py --domain https://example.com --json

# Specify output directory to scan
python scripts/report_aggregator.py --domain https://example.com --output-dir ./audit_outputs --json

# Filter by date range
python scripts/report_aggregator.py --domain https://example.com --from 2025-01-01 --to 2025-03-31 --json

# Save to file
python scripts/report_aggregator.py --domain https://example.com --json --output report.json
```

**Capabilities**:
- Scan for recent audit outputs from skills 11-33 (JSON files or Notion entries)
- Normalize data formats across skills into a unified structure
- Merge findings by domain/date
- Compute cross-skill health scores with weighted dimensions
- Identify top-priority issues across all audits
- Build a timeline of audit history
- Support for both local file scanning and Notion database queries
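Two of the aggregator's jobs above can be sketched briefly: the domain filter is subdomain-aware (so audits for blog.ourdigital.org count toward ourdigital.org), and the cross-skill health score is a weighted average of category scores. Both helpers below are illustrative sketches, not the script's actual API:

```python
def domain_matches(audit_domain: str, target: str) -> bool:
    """Subdomain-aware filter: blog.example.com counts toward example.com."""
    return audit_domain == target or audit_domain.endswith("." + target)

def overall_health(category_scores: dict[str, int],
                   weights: dict[str, float]) -> int:
    """Weighted average of category scores; unmapped categories weigh 1.0."""
    total = sum(weights.get(c, 1.0) for c in category_scores)
    return round(sum(s * weights.get(c, 1.0)
                     for c, s in category_scores.items()) / total)
```
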

## Dashboard Generator

```bash
# Generate HTML dashboard from aggregated report
python scripts/dashboard_generator.py --report aggregated_report.json --output dashboard.html

# Custom title
python scripts/dashboard_generator.py --report aggregated_report.json --output dashboard.html --title "OurDigital SEO Dashboard"
```

**Capabilities**:
- Generate a self-contained HTML dashboard (Chart.js loaded from CDN)
- Health score gauge chart
- Traffic trend line chart
- Keyword ranking distribution bar chart
- Technical issues breakdown pie chart
- Competitor comparison radar chart
- Mobile-responsive layout with CSS grid
- Export as a single .html file (no dependencies beyond the Chart.js CDN)
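The dashboard colors its health score via CSS classes (`score-excellent` through `score-critical`, defined in the embedded template). A sketch of the mapping — the band edges here are illustrative assumptions, not the script's exact thresholds:

```python
def score_class(score: int) -> str:
    """Map a 0-100 health score to the dashboard's CSS color classes."""
    if score >= 90:
        return "score-excellent"
    if score >= 75:
        return "score-good"
    if score >= 60:
        return "score-average"
    if score >= 40:
        return "score-poor"
    return "score-critical"
```
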

## Executive Report

```bash
# C-level executive summary (Korean)
python scripts/executive_report.py --report aggregated_report.json --audience c-level --output report.md

# Marketing team report
python scripts/executive_report.py --report aggregated_report.json --audience marketing --output report.md

# Technical team report
python scripts/executive_report.py --report aggregated_report.json --audience technical --output report.md

# Output to Notion instead of a file
python scripts/executive_report.py --report aggregated_report.json --audience c-level --format notion
```

**Capabilities**:
- Korean-language executive summary generation
- Key wins and concerns identification
- Period-over-period comparison narrative
- Priority action items ranked by impact
- Stakeholder-appropriate language (non-technical for C-level)
- Support for C-level, marketing team, and technical team audiences
- Markdown output format
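Korean summaries localize trend and category labels before rendering; a minimal sketch of that mapping layer (the tables are abbreviated and the wording is an assumption based on labels used elsewhere in this skill):

```python
# Illustrative label tables; the real script may cover more categories.
TREND_LABELS_KO = {
    "improving": "개선 중 ↑",
    "stable": "안정 →",
    "declining": "하락 중 ↓",
}

CATEGORY_LABELS_KO = {
    "technical": "기술 SEO",
    "on_page": "온페이지",
}

def trend_label(trend: str) -> str:
    """Korean trend label, falling back to the raw value if unmapped."""
    return TREND_LABELS_KO.get(trend, trend)
```
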

## Ahrefs MCP Tools Used

| Tool | Purpose |
|------|---------|
| `site-explorer-metrics` | Fresh current organic metrics snapshot |
| `site-explorer-metrics-history` | Historical metrics for trend visualization |

## Output Format

```json
{
  "domain": "example.com",
  "report_date": "2025-01-15",
  "overall_health": 72,
  "health_trend": "improving",
  "skills_included": [
    {"skill_id": 11, "skill_name": "comprehensive-audit", "audit_date": "2025-01-14"},
    {"skill_id": 25, "skill_name": "kpi-framework", "audit_date": "2025-01-15"}
  ],
  "category_scores": {
    "technical": 85,
    "on_page": 70,
    "performance": 60,
    "content": 75,
    "links": 68,
    "local": 65,
    "keywords": 72,
    "competitor": 58
  },
  "top_issues": [
    {"severity": "critical", "category": "performance", "description": "CLS exceeds threshold on mobile"},
    {"severity": "high", "category": "technical", "description": "12 pages with noindex tag incorrectly set"}
  ],
  "top_wins": [
    {"category": "links", "description": "Domain Rating increased by 3 points"},
    {"category": "keywords", "description": "15 new keywords entered top 10"}
  ],
  "timeline": [
    {"date": "2025-01-15", "skill": "kpi-framework", "health_score": 72},
    {"date": "2025-01-14", "skill": "comprehensive-audit", "health_score": 70}
  ],
  "audit_id": "DASH-20250115-001",
  "timestamp": "2025-01-15T14:30:00"
}
```

## Notion Output (Required)

**IMPORTANT**: All audit reports MUST be saved to the OurDigital SEO Audit Log database.

### Database Configuration

| Field | Value |
|-------|-------|
| Database ID | `2c8581e5-8a1e-8035-880b-e38cefc2f3ef` |
| URL | https://www.notion.so/dintelligence/2c8581e58a1e8035880be38cefc2f3ef |

### Required Properties

| Property | Type | Description |
|----------|------|-------------|
| Issue | Title | Report title (Korean + date) |
| Site | URL | Audited website URL |
| Category | Select | SEO Dashboard |
| Priority | Select | Based on overall health trend |
| Found Date | Date | Report date (YYYY-MM-DD) |
| Audit ID | Rich Text | Format: DASH-YYYYMMDD-NNN |

### Language Guidelines

- Report content in Korean (한국어)
- Keep technical English terms as-is (e.g., Health Score, Domain Rating, Core Web Vitals, Chart.js)
- URLs and code remain unchanged
@@ -0,0 +1,30 @@
---
name: seo-reporting-dashboard
description: |
  SEO reporting dashboard and executive report generation. Aggregates data from all SEO skills
  into stakeholder-ready reports and interactive HTML dashboards.
  Triggers: SEO report, SEO dashboard, executive summary, 보고서, 대시보드, performance report.
allowed-tools:
  - Bash
  - Read
  - Write
  - WebFetch
  - WebSearch
---

# SEO Reporting Dashboard

## Generate HTML Dashboard

```bash
python custom-skills/34-seo-reporting-dashboard/code/scripts/dashboard_generator.py --report [JSON] --output dashboard.html
```

## Generate Executive Report (Korean)

```bash
python custom-skills/34-seo-reporting-dashboard/code/scripts/executive_report.py --report [JSON] --audience c-level --output report.md
```

## Aggregate All Skill Outputs

```bash
python custom-skills/34-seo-reporting-dashboard/code/scripts/report_aggregator.py --domain [URL] --json
```
@@ -0,0 +1,169 @@
"""
Base Client - Shared async client utilities
===========================================
Purpose: Rate-limited async operations for API clients
Python: 3.10+
"""

import asyncio
import logging
import os
from asyncio import Semaphore
from datetime import datetime
from typing import Any, Callable, TypeVar

from dotenv import load_dotenv
from tenacity import (
    retry,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type,
)

load_dotenv()

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)

T = TypeVar("T")


class RateLimiter:
    """Rate limiter using a token bucket algorithm."""

    def __init__(self, rate: float, per: float = 1.0):
        self.rate = rate
        self.per = per
        self.tokens = rate
        self.last_update = datetime.now()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        async with self._lock:
            now = datetime.now()
            elapsed = (now - self.last_update).total_seconds()
            # Refill tokens proportionally to the time elapsed, capped at rate
            self.tokens = min(self.rate, self.tokens + elapsed * (self.rate / self.per))
            self.last_update = now

            if self.tokens < 1:
                wait_time = (1 - self.tokens) * (self.per / self.rate)
                await asyncio.sleep(wait_time)
                self.tokens = 0
            else:
                self.tokens -= 1


class BaseAsyncClient:
    """Base class for async API clients with rate limiting."""

    def __init__(
        self,
        max_concurrent: int = 5,
        requests_per_second: float = 3.0,
        logger: logging.Logger | None = None,
    ):
        self.semaphore = Semaphore(max_concurrent)
        self.rate_limiter = RateLimiter(requests_per_second)
        self.logger = logger or logging.getLogger(self.__class__.__name__)
        self.stats = {
            "requests": 0,
            "success": 0,
            "errors": 0,
            "retries": 0,
        }

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10),
        retry=retry_if_exception_type(Exception),
    )
    async def _rate_limited_request(
        self,
        coro: Callable[[], Any],
    ) -> Any:
        async with self.semaphore:
            await self.rate_limiter.acquire()
            self.stats["requests"] += 1
            try:
                result = await coro()
                self.stats["success"] += 1
                return result
            except Exception as e:
                self.stats["errors"] += 1
                self.logger.error(f"Request failed: {e}")
                raise

    async def batch_requests(
        self,
        requests: list[Callable[[], Any]],
        desc: str = "Processing",
    ) -> list[Any]:
        try:
            from tqdm.asyncio import tqdm
            has_tqdm = True
        except ImportError:
            has_tqdm = False

        async def execute(req: Callable) -> Any:
            try:
                return await self._rate_limited_request(req)
            except Exception as e:
                return {"error": str(e)}

        tasks = [execute(req) for req in requests]

        if has_tqdm:
            results = []
            for coro in tqdm.as_completed(tasks, total=len(tasks), desc=desc):
                result = await coro
                results.append(result)
            return results
        else:
            return await asyncio.gather(*tasks, return_exceptions=True)

    def print_stats(self) -> None:
        self.logger.info("=" * 40)
        self.logger.info("Request Statistics:")
        self.logger.info(f"  Total Requests: {self.stats['requests']}")
        self.logger.info(f"  Successful: {self.stats['success']}")
        self.logger.info(f"  Errors: {self.stats['errors']}")
        self.logger.info("=" * 40)


class ConfigManager:
    """Manage API configuration and credentials."""

    def __init__(self):
        load_dotenv()

    @property
    def google_credentials_path(self) -> str | None:
        seo_creds = os.path.expanduser("~/.credential/ourdigital-seo-agent.json")
        if os.path.exists(seo_creds):
            return seo_creds
        return os.getenv("GOOGLE_APPLICATION_CREDENTIALS")

    @property
    def pagespeed_api_key(self) -> str | None:
        return os.getenv("PAGESPEED_API_KEY")

    @property
    def notion_token(self) -> str | None:
        return os.getenv("NOTION_TOKEN") or os.getenv("NOTION_API_KEY")

    def validate_google_credentials(self) -> bool:
        creds_path = self.google_credentials_path
        if not creds_path:
            return False
        return os.path.exists(creds_path)

    def get_required(self, key: str) -> str:
        value = os.getenv(key)
        if not value:
            raise ValueError(f"Missing required environment variable: {key}")
        return value


config = ConfigManager()
@@ -0,0 +1,745 @@
"""
Dashboard Generator - Interactive HTML SEO dashboard with Chart.js
==================================================================
Purpose: Generate a self-contained HTML dashboard from aggregated SEO
         report data, with responsive charts for health scores, traffic
         trends, keyword rankings, issue breakdowns, and competitor radar.
Python: 3.10+

Usage:
    python dashboard_generator.py --report aggregated_report.json --output dashboard.html
    python dashboard_generator.py --report aggregated_report.json --output dashboard.html --title "My SEO Dashboard"
"""

import argparse
import json
import logging
import sys
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Any

from jinja2 import Template

logger = logging.getLogger(__name__)

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)


# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------

@dataclass
class DashboardConfig:
    """Configuration for dashboard generation."""
    title: str = "SEO Reporting Dashboard"
    domain: str = ""
    date_range: str = ""
    theme: str = "light"
    chart_options: dict[str, Any] = field(default_factory=dict)


# ---------------------------------------------------------------------------
# HTML template
# ---------------------------------------------------------------------------

DASHBOARD_TEMPLATE = """<!DOCTYPE html>
<html lang="ko">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>{{ title }} - {{ domain }}</title>
<script src="https://cdn.jsdelivr.net/npm/chart.js@4.4.0/dist/chart.umd.min.js"></script>
<style>
:root {
  --bg-primary: #f8f9fa;
  --bg-card: #ffffff;
  --text-primary: #212529;
  --text-secondary: #6c757d;
  --border: #dee2e6;
  --accent: #0d6efd;
  --success: #198754;
  --warning: #ffc107;
  --danger: #dc3545;
}
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
  font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
  background: var(--bg-primary);
  color: var(--text-primary);
  line-height: 1.6;
}
.header {
  background: linear-gradient(135deg, #0d6efd 0%, #6610f2 100%);
  color: white;
  padding: 2rem;
  text-align: center;
}
.header h1 { font-size: 1.8rem; margin-bottom: 0.5rem; }
.header .meta { opacity: 0.85; font-size: 0.9rem; }
.container {
  max-width: 1400px;
  margin: 0 auto;
  padding: 1.5rem;
}
.grid {
  display: grid;
  grid-template-columns: repeat(auto-fit, minmax(320px, 1fr));
  gap: 1.5rem;
  margin-bottom: 1.5rem;
}
.grid-full {
  display: grid;
  grid-template-columns: 1fr;
  gap: 1.5rem;
  margin-bottom: 1.5rem;
}
.card {
  background: var(--bg-card);
  border-radius: 12px;
  padding: 1.5rem;
  box-shadow: 0 2px 8px rgba(0,0,0,0.06);
  border: 1px solid var(--border);
}
.card h2 {
  font-size: 1.1rem;
  color: var(--text-secondary);
  margin-bottom: 1rem;
  padding-bottom: 0.5rem;
  border-bottom: 2px solid var(--border);
}
.health-score {
  text-align: center;
  padding: 2rem;
}
.health-score .score {
  font-size: 4rem;
  font-weight: 700;
  line-height: 1;
}
.health-score .label {
  font-size: 1rem;
  color: var(--text-secondary);
  margin-top: 0.5rem;
}
.health-score .trend {
  font-size: 1.2rem;
  margin-top: 0.5rem;
  font-weight: 600;
}
.trend-improving { color: var(--success); }
.trend-stable { color: var(--warning); }
.trend-declining { color: var(--danger); }
.score-excellent { color: var(--success); }
.score-good { color: #20c997; }
.score-average { color: var(--warning); }
.score-poor { color: #fd7e14; }
.score-critical { color: var(--danger); }
.chart-container {
  position: relative;
  width: 100%;
  height: 300px;
}
.issues-list { list-style: none; }
.issues-list li {
  padding: 0.75rem;
  border-bottom: 1px solid var(--border);
  display: flex;
  align-items: flex-start;
  gap: 0.75rem;
}
.issues-list li:last-child { border-bottom: none; }
.severity-badge {
  display: inline-block;
  padding: 0.15rem 0.5rem;
  border-radius: 4px;
  font-size: 0.75rem;
  font-weight: 600;
  text-transform: uppercase;
  white-space: nowrap;
}
.severity-critical { background: #f8d7da; color: #842029; }
.severity-high { background: #fff3cd; color: #664d03; }
.severity-medium { background: #cfe2ff; color: #084298; }
.severity-low { background: #d1e7dd; color: #0f5132; }
.timeline-table {
  width: 100%;
  border-collapse: collapse;
  font-size: 0.9rem;
}
.timeline-table th {
  text-align: left;
  padding: 0.6rem;
  border-bottom: 2px solid var(--border);
  color: var(--text-secondary);
  font-weight: 600;
}
.timeline-table td {
  padding: 0.6rem;
  border-bottom: 1px solid var(--border);
}
.footer {
  text-align: center;
  padding: 2rem;
  color: var(--text-secondary);
  font-size: 0.85rem;
}
@media (max-width: 768px) {
|
||||||
|
.grid { grid-template-columns: 1fr; }
|
||||||
|
.header h1 { font-size: 1.4rem; }
|
||||||
|
.health-score .score { font-size: 3rem; }
|
||||||
|
}
|
||||||
|
</style>
|
||||||
|
</head>
|
||||||
|
<body>
|
||||||
|
<div class="header">
|
||||||
|
<h1>{{ title }}</h1>
|
||||||
|
<div class="meta">{{ domain }} | {{ report_date }} | Audit ID: {{ audit_id }}</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="container">
|
||||||
|
<!-- Health Score & Category Overview -->
|
||||||
|
<div class="grid">
|
||||||
|
<div class="card health-score">
|
||||||
|
<div class="score {{ score_class }}">{{ overall_health }}</div>
|
||||||
|
<div class="label">Overall Health Score</div>
|
||||||
|
<div class="trend trend-{{ health_trend }}">{{ trend_label }}</div>
|
||||||
|
<div class="chart-container" style="height: 200px; margin-top: 1rem;">
|
||||||
|
<canvas id="gaugeChart"></canvas>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="card">
|
||||||
|
<h2>Category Scores</h2>
|
||||||
|
<div class="chart-container">
|
||||||
|
<canvas id="categoryChart"></canvas>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Traffic & Keywords -->
|
||||||
|
<div class="grid">
|
||||||
|
<div class="card">
|
||||||
|
<h2>Health Score Timeline</h2>
|
||||||
|
<div class="chart-container">
|
||||||
|
<canvas id="timelineChart"></canvas>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<div class="card">
|
||||||
|
<h2>Issue Distribution</h2>
|
||||||
|
<div class="chart-container">
|
||||||
|
<canvas id="issuesChart"></canvas>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Competitor Radar (if data available) -->
|
||||||
|
{% if has_competitor_data %}
|
||||||
|
<div class="grid">
|
||||||
|
<div class="card">
|
||||||
|
<h2>Competitive Comparison</h2>
|
||||||
|
<div class="chart-container">
|
||||||
|
<canvas id="radarChart"></canvas>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
{% endif %}
|
||||||
|
|
||||||
|
<!-- Top Issues -->
|
||||||
|
<div class="grid-full">
|
||||||
|
<div class="card">
|
||||||
|
<h2>Top Issues ({{ issues_count }})</h2>
|
||||||
|
<ul class="issues-list">
|
||||||
|
{% for issue in top_issues %}
|
||||||
|
<li>
|
||||||
|
<span class="severity-badge severity-{{ issue.severity }}">{{ issue.severity }}</span>
|
||||||
|
<span>{{ issue.description }} <em style="color: var(--text-secondary);">({{ issue.category }})</em></span>
|
||||||
|
</li>
|
||||||
|
{% endfor %}
|
||||||
|
</ul>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<!-- Top Wins -->
|
||||||
|
{% if top_wins %}
|
||||||
|
<div class="grid-full">
|
||||||
|
<div class="card">
|
||||||
|
<h2>Top Wins ({{ wins_count }})</h2>
|
||||||
|
<ul class="issues-list">
|
||||||
|
{% for win in top_wins %}
|
||||||
|
<li>
|
||||||
|
<span class="severity-badge severity-low">WIN</span>
|
||||||
|
<span>{{ win.description }} <em style="color: var(--text-secondary);">({{ win.category }})</em></span>
|
||||||
|
</li>
|
||||||
|
{% endfor %}
|
||||||
|
</ul>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
{% endif %}
|
||||||
|
|
||||||
|
<!-- Audit Timeline Table -->
|
||||||
|
<div class="grid-full">
|
||||||
|
<div class="card">
|
||||||
|
<h2>Audit History</h2>
|
||||||
|
<table class="timeline-table">
|
||||||
|
<thead>
|
||||||
|
<tr>
|
||||||
|
<th>Date</th>
|
||||||
|
<th>Skill</th>
|
||||||
|
<th>Category</th>
|
||||||
|
<th>Score</th>
|
||||||
|
<th>Issues</th>
|
||||||
|
</tr>
|
||||||
|
</thead>
|
||||||
|
<tbody>
|
||||||
|
{% for entry in timeline %}
|
||||||
|
<tr>
|
||||||
|
<td>{{ entry.date }}</td>
|
||||||
|
<td>{{ entry.skill }}</td>
|
||||||
|
<td>{{ entry.category }}</td>
|
||||||
|
<td>{{ entry.health_score }}</td>
|
||||||
|
<td>{{ entry.issues_count }}</td>
|
||||||
|
</tr>
|
||||||
|
{% endfor %}
|
||||||
|
</tbody>
|
||||||
|
</table>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div class="footer">
|
||||||
|
Generated by SEO Reporting Dashboard (Skill 34) | {{ timestamp }}
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<script>
|
||||||
|
// --- Gauge Chart ---
|
||||||
|
const gaugeCtx = document.getElementById('gaugeChart').getContext('2d');
|
||||||
|
new Chart(gaugeCtx, {
|
||||||
|
type: 'doughnut',
|
||||||
|
data: {
|
||||||
|
datasets: [{
|
||||||
|
data: [{{ overall_health }}, {{ 100 - overall_health }}],
|
||||||
|
backgroundColor: ['{{ gauge_color }}', '#e9ecef'],
|
||||||
|
borderWidth: 0,
|
||||||
|
circumference: 180,
|
||||||
|
rotation: 270,
|
||||||
|
}]
|
||||||
|
},
|
||||||
|
options: {
|
||||||
|
responsive: true,
|
||||||
|
maintainAspectRatio: false,
|
||||||
|
cutout: '75%',
|
||||||
|
plugins: { legend: { display: false }, tooltip: { enabled: false } }
|
||||||
|
}
|
||||||
|
});
|
||||||
|
|
||||||
|
// --- Category Bar Chart ---
|
||||||
|
const catCtx = document.getElementById('categoryChart').getContext('2d');
|
||||||
|
new Chart(catCtx, {
|
||||||
|
type: 'bar',
|
||||||
|
data: {
|
||||||
|
labels: {{ category_labels | tojson }},
|
||||||
|
datasets: [{
|
||||||
|
label: 'Score',
|
||||||
|
data: {{ category_values | tojson }},
|
||||||
|
backgroundColor: {{ category_colors | tojson }},
|
||||||
|
borderRadius: 6,
|
||||||
|
borderSkipped: false,
|
||||||
|
}]
|
||||||
|
},
|
||||||
|
options: {
|
||||||
|
responsive: true,
|
||||||
|
maintainAspectRatio: false,
|
||||||
|
indexAxis: 'y',
|
||||||
|
scales: {
|
||||||
|
x: { min: 0, max: 100, grid: { display: false } },
|
||||||
|
y: { grid: { display: false } }
|
||||||
|
},
|
||||||
|
plugins: { legend: { display: false } }
|
||||||
|
}
|
||||||
|
});
|
||||||
|
|
||||||
|
// --- Timeline Line Chart ---
|
||||||
|
const timeCtx = document.getElementById('timelineChart').getContext('2d');
|
||||||
|
new Chart(timeCtx, {
|
||||||
|
type: 'line',
|
||||||
|
data: {
|
||||||
|
labels: {{ timeline_dates | tojson }},
|
||||||
|
datasets: [{
|
||||||
|
label: 'Health Score',
|
||||||
|
data: {{ timeline_scores | tojson }},
|
||||||
|
borderColor: '#0d6efd',
|
||||||
|
backgroundColor: 'rgba(13, 110, 253, 0.1)',
|
||||||
|
fill: true,
|
||||||
|
tension: 0.3,
|
||||||
|
pointRadius: 4,
|
||||||
|
pointBackgroundColor: '#0d6efd',
|
||||||
|
}]
|
||||||
|
},
|
||||||
|
options: {
|
||||||
|
responsive: true,
|
||||||
|
maintainAspectRatio: false,
|
||||||
|
scales: {
|
||||||
|
y: { min: 0, max: 100, grid: { color: '#f0f0f0' } },
|
||||||
|
x: { grid: { display: false } }
|
||||||
|
},
|
||||||
|
plugins: { legend: { display: false } }
|
||||||
|
}
|
||||||
|
});
|
||||||
|
|
||||||
|
// --- Issues Pie Chart ---
|
||||||
|
const issuesCtx = document.getElementById('issuesChart').getContext('2d');
|
||||||
|
new Chart(issuesCtx, {
|
||||||
|
type: 'pie',
|
||||||
|
data: {
|
||||||
|
labels: {{ issue_category_labels | tojson }},
|
||||||
|
datasets: [{
|
||||||
|
data: {{ issue_category_values | tojson }},
|
||||||
|
backgroundColor: [
|
||||||
|
'#dc3545', '#fd7e14', '#ffc107', '#198754',
|
||||||
|
'#0d6efd', '#6610f2', '#d63384', '#20c997',
|
||||||
|
'#0dcaf0', '#6c757d'
|
||||||
|
],
|
||||||
|
}]
|
||||||
|
},
|
||||||
|
options: {
|
||||||
|
responsive: true,
|
||||||
|
maintainAspectRatio: false,
|
||||||
|
plugins: {
|
||||||
|
legend: { position: 'right', labels: { boxWidth: 12, padding: 8 } }
|
||||||
|
}
|
||||||
|
}
|
||||||
|
});
|
||||||
|
|
||||||
|
{% if has_competitor_data %}
|
||||||
|
// --- Competitor Radar Chart ---
|
||||||
|
const radarCtx = document.getElementById('radarChart').getContext('2d');
|
||||||
|
new Chart(radarCtx, {
|
||||||
|
type: 'radar',
|
||||||
|
data: {
|
||||||
|
labels: {{ radar_labels | tojson }},
|
||||||
|
datasets: {{ radar_datasets | tojson }}
|
||||||
|
},
|
||||||
|
options: {
|
||||||
|
responsive: true,
|
||||||
|
maintainAspectRatio: false,
|
||||||
|
scales: {
|
||||||
|
r: { min: 0, max: 100, ticks: { stepSize: 20 } }
|
||||||
|
}
|
||||||
|
}
|
||||||
|
});
|
||||||
|
{% endif %}
|
||||||
|
</script>
|
||||||
|
</body>
|
||||||
|
</html>"""
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Generator
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
CATEGORY_KOREAN_LABELS: dict[str, str] = {
|
||||||
|
"technical": "기술 SEO",
|
||||||
|
"on_page": "온페이지",
|
||||||
|
"performance": "성능",
|
||||||
|
"content": "콘텐츠",
|
||||||
|
"links": "링크",
|
||||||
|
"local": "로컬 SEO",
|
||||||
|
"keywords": "키워드",
|
||||||
|
"competitor": "경쟁사",
|
||||||
|
"schema": "스키마",
|
||||||
|
"kpi": "KPI",
|
||||||
|
"search_console": "Search Console",
|
||||||
|
"ecommerce": "이커머스",
|
||||||
|
"international": "국제 SEO",
|
||||||
|
"ai_search": "AI 검색",
|
||||||
|
"entity_seo": "엔티티 SEO",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
class DashboardGenerator:
|
||||||
|
"""Generate an interactive HTML dashboard from aggregated SEO report data."""
|
||||||
|
|
||||||
|
def __init__(self):
|
||||||
|
self.template = Template(DASHBOARD_TEMPLATE)
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _score_class(score: float) -> str:
|
||||||
|
"""Return CSS class based on health score."""
|
||||||
|
if score >= 90:
|
||||||
|
return "score-excellent"
|
||||||
|
elif score >= 75:
|
||||||
|
return "score-good"
|
||||||
|
elif score >= 60:
|
||||||
|
return "score-average"
|
||||||
|
elif score >= 40:
|
||||||
|
return "score-poor"
|
||||||
|
else:
|
||||||
|
return "score-critical"
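The banded if/elif above can also be read as a threshold table; a minimal stand-alone sketch (the `SCORE_BANDS` name is illustrative, not part of the module):

```python
# Illustrative re-statement of the five score bands used by _score_class.
SCORE_BANDS = [
    (90, "score-excellent"),
    (75, "score-good"),
    (60, "score-average"),
    (40, "score-poor"),
]

def score_class(score: float) -> str:
    # The first threshold the score clears wins; below 40 is critical.
    for threshold, css_class in SCORE_BANDS:
        if score >= threshold:
            return css_class
    return "score-critical"

print(score_class(82))  # score-good
```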

    @staticmethod
    def _gauge_color(score: float) -> str:
        """Return color hex for gauge chart."""
        if score >= 90:
            return "#198754"
        elif score >= 75:
            return "#20c997"
        elif score >= 60:
            return "#ffc107"
        elif score >= 40:
            return "#fd7e14"
        else:
            return "#dc3545"

    @staticmethod
    def _category_color(score: float) -> str:
        """Return color for category bar based on score."""
        if score >= 80:
            return "#198754"
        elif score >= 60:
            return "#0d6efd"
        elif score >= 40:
            return "#ffc107"
        else:
            return "#dc3545"

    @staticmethod
    def _trend_label(trend: str) -> str:
        """Return human-readable trend label in Korean."""
        labels = {
            "improving": "개선 중 ↑",
            "stable": "안정 →",
            "declining": "하락 중 ↓",
        }
        return labels.get(trend, trend.title())

    def generate_health_gauge(self, score: float) -> dict[str, Any]:
        """Generate gauge chart data for health score."""
        return {
            "score": score,
            "remainder": 100 - score,
            "color": self._gauge_color(score),
            "class": self._score_class(score),
        }

    def generate_traffic_chart(self, traffic_data: list[dict]) -> dict[str, Any]:
        """Generate line chart data for traffic trends."""
        dates = [d.get("date", "") for d in traffic_data]
        values = [d.get("traffic", 0) for d in traffic_data]
        return {"labels": dates, "values": values}

    def generate_keyword_chart(self, keyword_data: list[dict]) -> dict[str, Any]:
        """Generate bar chart data for keyword ranking distribution."""
        labels = [d.get("range", "") for d in keyword_data]
        values = [d.get("count", 0) for d in keyword_data]
        return {"labels": labels, "values": values}

    def generate_issues_chart(
        self, issues_data: list[dict[str, Any]]
    ) -> dict[str, Any]:
        """Generate pie chart data for issue category distribution."""
        category_counts: dict[str, int] = {}
        for issue in issues_data:
            cat = issue.get("category", "other")
            category_counts[cat] = category_counts.get(cat, 0) + 1

        sorted_cats = sorted(
            category_counts.items(), key=lambda x: x[1], reverse=True
        )
        return {
            "labels": [CATEGORY_KOREAN_LABELS.get(c[0], c[0]) for c in sorted_cats],
            "values": [c[1] for c in sorted_cats],
        }
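The manual counting above is equivalent to `collections.Counter`; a self-contained sketch of the pie-chart aggregation (English labels stand in here for the Korean mapping):

```python
from collections import Counter

def issue_distribution(issues: list[dict]) -> dict[str, list]:
    # Count issues per category, most frequent category first.
    counts = Counter(issue.get("category", "other") for issue in issues)
    ordered = counts.most_common()
    return {
        "labels": [cat for cat, _ in ordered],
        "values": [n for _, n in ordered],
    }

sample = [{"category": "technical"}, {"category": "technical"}, {"category": "on_page"}]
print(issue_distribution(sample))  # {'labels': ['technical', 'on_page'], 'values': [2, 1]}
```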

    def generate_competitor_radar(
        self, competitor_data: dict[str, Any]
    ) -> dict[str, Any]:
        """Generate radar chart data for competitor comparison."""
        labels = list(competitor_data.get("dimensions", []))
        datasets = []
        colors = [
            "rgba(13, 110, 253, 0.5)",
            "rgba(220, 53, 69, 0.5)",
            "rgba(25, 135, 84, 0.5)",
        ]
        border_colors = ["#0d6efd", "#dc3545", "#198754"]

        for i, (domain, scores) in enumerate(
            competitor_data.get("scores", {}).items()
        ):
            datasets.append({
                "label": domain,
                "data": [scores.get(dim, 0) for dim in labels],
                "backgroundColor": colors[i % len(colors)],
                "borderColor": border_colors[i % len(border_colors)],
                "borderWidth": 2,
            })

        return {"labels": labels, "datasets": datasets}

    def render_html(
        self,
        report: dict[str, Any],
        config: DashboardConfig,
    ) -> str:
        """Render the full HTML dashboard from aggregated report data."""
        overall_health = report.get("overall_health", 0)
        health_trend = report.get("health_trend", "stable")

        # Category scores (with Korean labels)
        cat_scores = report.get("category_scores", {})
        category_labels = [
            CATEGORY_KOREAN_LABELS.get(k, k) for k in cat_scores.keys()
        ]
        category_values = list(cat_scores.values())
        category_colors = [self._category_color(v) for v in category_values]

        # Timeline
        timeline = report.get("timeline", [])
        timeline_dates = [e.get("date", "") for e in timeline]
        timeline_scores = [e.get("health_score", 0) for e in timeline]

        # Issues
        top_issues = report.get("top_issues", [])
        issues_chart = self.generate_issues_chart(top_issues)

        # Wins
        top_wins = report.get("top_wins", [])

        # Competitor radar
        has_competitor_data = False
        radar_labels: list[str] = []
        radar_datasets: list[dict] = []

        raw_outputs = report.get("raw_outputs", [])
        for output in raw_outputs:
            if output.get("category") == "competitor":
                has_competitor_data = True
                comp_data = output.get("data", {})
                if "comparison_matrix" in comp_data:
                    radar_result = self.generate_competitor_radar(
                        comp_data["comparison_matrix"]
                    )
                    radar_labels = radar_result["labels"]
                    radar_datasets = radar_result["datasets"]
                break

        context = {
            "title": config.title,
            "domain": config.domain or report.get("domain", ""),
            "report_date": report.get("report_date", ""),
            "audit_id": report.get("audit_id", ""),
            "timestamp": report.get("timestamp", datetime.now().isoformat()),
            "overall_health": overall_health,
            "score_class": self._score_class(overall_health),
            "health_trend": health_trend,
            "trend_label": self._trend_label(health_trend),
            "gauge_color": self._gauge_color(overall_health),
            "category_labels": category_labels,
            "category_values": category_values,
            "category_colors": category_colors,
            "timeline_dates": timeline_dates,
            "timeline_scores": timeline_scores,
            "issue_category_labels": issues_chart["labels"],
            "issue_category_values": issues_chart["values"],
            "top_issues": top_issues[:15],
            "issues_count": len(top_issues),
            "top_wins": top_wins[:10],
            "wins_count": len(top_wins),
            "timeline": timeline[:20],
            "has_competitor_data": has_competitor_data,
            "radar_labels": radar_labels,
            "radar_datasets": radar_datasets,
        }

        return self.template.render(**context)

    def save(self, html_content: str, output_path: str) -> None:
        """Save rendered HTML to a file."""
        Path(output_path).write_text(html_content, encoding="utf-8")
        logger.info(f"Dashboard saved to {output_path}")

    def run(
        self,
        report_json: str,
        output_path: str,
        title: str = "SEO Reporting Dashboard",
    ) -> str:
        """Orchestrate dashboard generation from a report JSON file."""
        # Load report data
        report_path = Path(report_json)
        if not report_path.exists():
            raise FileNotFoundError(f"Report file not found: {report_json}")

        report = json.loads(report_path.read_text(encoding="utf-8"))
        logger.info(f"Loaded report: {report.get('domain', 'unknown')}")

        # Configure
        config = DashboardConfig(
            title=title,
            domain=report.get("domain", ""),
            date_range=report.get("report_date", ""),
        )

        # Render
        html = self.render_html(report, config)
        logger.info(f"Rendered HTML dashboard ({len(html):,} bytes)")

        # Save
        self.save(html, output_path)

        return output_path


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="SEO Dashboard Generator - Interactive HTML dashboard with Chart.js",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""\
Examples:
  python dashboard_generator.py --report aggregated_report.json --output dashboard.html
  python dashboard_generator.py --report aggregated_report.json --output dashboard.html --title "My Dashboard"
""",
    )
    parser.add_argument(
        "--report",
        required=True,
        help="Path to aggregated report JSON file (from report_aggregator.py)",
    )
    parser.add_argument(
        "--output",
        required=True,
        help="Output HTML file path",
    )
    parser.add_argument(
        "--title",
        type=str,
        default="SEO Reporting Dashboard",
        help="Dashboard title (default: 'SEO Reporting Dashboard')",
    )
    return parser.parse_args(argv)


def main() -> None:
    args = parse_args()

    generator = DashboardGenerator()
    output = generator.run(
        report_json=args.report,
        output_path=args.output,
        title=args.title,
    )

    logger.info(f"Dashboard generated: {output}")


if __name__ == "__main__":
    main()
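Because `parse_args` accepts an explicit argv list, the CLI surface can be exercised without a shell; a stand-alone sketch that re-declares the same three flags so it runs on its own, without the module:

```python
import argparse

# Same three flags as parse_args above, re-declared for a self-contained demo.
parser = argparse.ArgumentParser(description="SEO Dashboard Generator")
parser.add_argument("--report", required=True)
parser.add_argument("--output", required=True)
parser.add_argument("--title", default="SEO Reporting Dashboard")

args = parser.parse_args(
    ["--report", "aggregated_report.json", "--output", "dashboard.html"]
)
print(args.title)  # SEO Reporting Dashboard
```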

@@ -0,0 +1,622 @@
"""
Executive Report - Korean-language executive summary generation
===============================================================
Purpose: Generate stakeholder-ready executive summaries in Korean from
         aggregated SEO report data, with audience-specific detail levels
         for C-level, marketing, and technical teams.
Python: 3.10+

Usage:
    python executive_report.py --report aggregated_report.json --audience c-level --output report.md
    python executive_report.py --report aggregated_report.json --audience marketing --output report.md
    python executive_report.py --report aggregated_report.json --audience technical --output report.md
    python executive_report.py --report aggregated_report.json --audience c-level --format notion
"""

import argparse
import json
import logging
import sys
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Any

logger = logging.getLogger(__name__)

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)


# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------

@dataclass
class AudienceConfig:
    """Configuration for report audience targeting."""
    level: str = "c-level"  # c-level | marketing | technical
    detail_depth: str = "summary"  # summary | moderate | detailed
    include_recommendations: bool = True
    include_technical_details: bool = False
    max_issues: int = 5
    max_recommendations: int = 5

    @classmethod
    def from_level(cls, level: str) -> "AudienceConfig":
        """Create config preset from audience level."""
        presets = {
            "c-level": cls(
                level="c-level",
                detail_depth="summary",
                include_recommendations=True,
                include_technical_details=False,
                max_issues=5,
                max_recommendations=3,
            ),
            "marketing": cls(
                level="marketing",
                detail_depth="moderate",
                include_recommendations=True,
                include_technical_details=False,
                max_issues=10,
                max_recommendations=5,
            ),
            "technical": cls(
                level="technical",
                detail_depth="detailed",
                include_recommendations=True,
                include_technical_details=True,
                max_issues=20,
                max_recommendations=10,
            ),
        }
        return presets.get(level, presets["c-level"])
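A condensed sketch of the preset lookup above, showing the fallback for unknown audience levels (the field subset and names here are illustrative):

```python
from dataclasses import dataclass

@dataclass
class Audience:
    level: str
    max_issues: int
    max_recommendations: int

PRESETS = {
    "c-level": Audience("c-level", 5, 3),
    "marketing": Audience("marketing", 10, 5),
    "technical": Audience("technical", 20, 10),
}

def from_level(level: str) -> Audience:
    # Unknown levels fall back to the most conservative (c-level) preset.
    return PRESETS.get(level, PRESETS["c-level"])

print(from_level("finance").level)  # c-level
```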


@dataclass
class ExecutiveSummary:
    """Generated executive summary content."""
    title: str = ""
    domain: str = ""
    period: str = ""
    health_score: float = 0.0
    health_trend: str = "stable"
    key_wins: list[str] = field(default_factory=list)
    key_concerns: list[str] = field(default_factory=list)
    recommendations: list[str] = field(default_factory=list)
    narrative: str = ""
    audience: str = "c-level"
    category_summary: dict[str, str] = field(default_factory=dict)
    audit_id: str = ""
    timestamp: str = ""


# ---------------------------------------------------------------------------
# Korean text templates
# ---------------------------------------------------------------------------

HEALTH_LABELS_KR = {
    "excellent": "우수",
    "good": "양호",
    "average": "보통",
    "poor": "미흡",
    "critical": "위험",
}

TREND_LABELS_KR = {
    "improving": "개선 중",
    "stable": "안정",
    "declining": "하락 중",
}

CATEGORY_LABELS_KR = {
    "technical": "기술 SEO",
    "on_page": "온페이지 SEO",
    "performance": "성능 (Core Web Vitals)",
    "content": "콘텐츠 전략",
    "links": "링크 프로필",
    "local": "로컬 SEO",
    "keywords": "키워드 전략",
    "competitor": "경쟁 분석",
    "schema": "스키마/구조화 데이터",
    "kpi": "KPI 프레임워크",
    "search_console": "Search Console",
    "ecommerce": "이커머스 SEO",
    "international": "국제 SEO",
    "ai_search": "AI 검색 가시성",
    "entity_seo": "Knowledge Graph",
}

# Common English issue descriptions -> Korean translations
ISSUE_TRANSLATIONS_KR: dict[str, str] = {
    "missing meta description": "메타 설명(meta description) 누락",
    "missing title tag": "타이틀 태그 누락",
    "duplicate title": "중복 타이틀 태그",
    "duplicate meta description": "중복 메타 설명",
    "missing h1": "H1 태그 누락",
    "multiple h1 tags": "H1 태그 다수 사용",
    "missing alt text": "이미지 alt 텍스트 누락",
    "broken links": "깨진 링크 발견",
    "redirect chain": "리다이렉트 체인 발견",
    "mixed content": "Mixed Content (HTTP/HTTPS 혼합) 발견",
    "missing canonical": "Canonical 태그 누락",
    "noindex on important page": "중요 페이지에 noindex 설정됨",
    "slow page load": "페이지 로딩 속도 저하",
    "cls exceeds threshold": "CLS(누적 레이아웃 변경) 임계값 초과",
    "lcp exceeds threshold": "LCP(최대 콘텐츠풀 페인트) 임계값 초과",
    "missing sitemap": "사이트맵 누락",
    "robots.txt blocking important pages": "robots.txt에서 중요 페이지 차단 중",
    "missing schema markup": "스키마 마크업 누락",
    "missing hreflang": "hreflang 태그 누락",
    "thin content": "콘텐츠 부족 (Thin Content)",
    "orphan pages": "고아 페이지 발견 (내부 링크 없음)",
}


def _translate_description(desc: str) -> str:
    """Translate common English issue descriptions to Korean."""
    desc_lower = desc.lower().strip()
    # Check exact match
    if desc_lower in ISSUE_TRANSLATIONS_KR:
        return ISSUE_TRANSLATIONS_KR[desc_lower]
    # Check partial match (case-insensitive replace)
    for eng, kor in ISSUE_TRANSLATIONS_KR.items():
        if eng in desc_lower:
            # Find the original-case substring and replace it
            idx = desc_lower.index(eng)
            return desc[:idx] + kor + desc[idx + len(eng):]
    return desc
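A self-contained sketch of the partial-match path: the English phrase is located case-insensitively and spliced out in place, so surrounding text such as page counts survives (two sample entries only, standing in for the full table):

```python
# Illustrative two-entry subset of the translation table.
TRANSLATIONS = {
    "missing meta description": "메타 설명(meta description) 누락",
    "broken links": "깨진 링크 발견",
}

def translate(desc: str) -> str:
    low = desc.lower()
    # Exact match first, then in-place replacement of a known phrase.
    if low.strip() in TRANSLATIONS:
        return TRANSLATIONS[low.strip()]
    for eng, kor in TRANSLATIONS.items():
        if eng in low:
            idx = low.index(eng)
            return desc[:idx] + kor + desc[idx + len(eng):]
    return desc

print(translate("12 pages with Missing Meta Description"))
# 12 pages with 메타 설명(meta description) 누락
```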
|
||||||
|
|
||||||
|
|
||||||
|
AUDIENCE_INTRO_KR = {
|
||||||
|
"c-level": "본 보고서는 SEO 성과의 핵심 지표와 비즈니스 영향을 요약한 경영진용 보고서입니다.",
|
||||||
|
"marketing": "본 보고서는 SEO 전략 실행 현황과 마케팅 성과를 분석한 마케팅팀 보고서입니다.",
|
||||||
|
"technical": "본 보고서는 SEO 기술 진단 결과와 상세 개선 사항을 포함한 기술팀 보고서입니다.",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Generator
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
class ExecutiveReportGenerator:
|
||||||
|
"""Generate Korean-language executive reports from aggregated SEO data."""
|
||||||
|
|
||||||
|
@staticmethod
|
||||||
|
def _health_grade(score: float) -> str:
|
||||||
|
"""Return health grade string."""
|
||||||
|
if score >= 90:
|
||||||
|
return "excellent"
|
||||||
|
elif score >= 75:
|
||||||
|
return "good"
|
||||||
|
elif score >= 60:
|
||||||
|
return "average"
|
||||||
|
elif score >= 40:
|
||||||
|
return "poor"
|
||||||
|
else:
|
||||||
|
return "critical"
|
||||||
|
|
||||||
|
def generate_narrative(
|
||||||
|
self,
|
||||||
|
report: dict[str, Any],
|
||||||
|
audience: AudienceConfig,
|
||||||
|
) -> str:
|
||||||
|
"""Generate Korean narrative text for the executive summary."""
|
||||||
|
domain = report.get("domain", "")
|
||||||
|
health = report.get("overall_health", 0)
|
||||||
|
trend = report.get("health_trend", "stable")
|
||||||
|
grade = self._health_grade(health)
|
||||||
|
        grade_kr = HEALTH_LABELS_KR.get(grade, grade)
        trend_kr = TREND_LABELS_KR.get(trend, trend)

        intro = AUDIENCE_INTRO_KR.get(audience.level, AUDIENCE_INTRO_KR["c-level"])

        # Build narrative paragraphs
        paragraphs = []

        # Opening
        paragraphs.append(intro)

        # Health overview
        paragraphs.append(
            f"{domain}의 전체 SEO Health Score는 **{health}/100** ({grade_kr})이며, "
            f"현재 추세는 **{trend_kr}** 상태입니다."
        )

        # Category highlights
        cat_scores = report.get("category_scores", {})
        if cat_scores:
            strong_cats = [
                CATEGORY_LABELS_KR.get(k, k)
                for k, v in cat_scores.items()
                if v >= 75
            ]
            weak_cats = [
                CATEGORY_LABELS_KR.get(k, k)
                for k, v in cat_scores.items()
                if v < 50
            ]

            if strong_cats:
                paragraphs.append(
                    f"강점 영역: {', '.join(strong_cats[:3])} 등이 양호한 성과를 보이고 있습니다."
                )
            if weak_cats:
                paragraphs.append(
                    f"개선 필요 영역: {', '.join(weak_cats[:3])} 등에서 집중적인 개선이 필요합니다."
                )

        # Skills coverage
        skills = report.get("skills_included", [])
        if skills:
            paragraphs.append(
                f"총 {len(skills)}개의 SEO 진단 도구를 통해 종합 분석을 수행하였습니다."
            )

        # C-level specific: business impact focus
        if audience.level == "c-level":
            if trend == "improving":
                paragraphs.append(
                    "전반적인 SEO 성과가 개선 추세에 있으며, 현재 전략을 유지하면서 "
                    "핵심 약점 영역에 대한 집중 투자가 권장됩니다."
                )
            elif trend == "declining":
                paragraphs.append(
                    "SEO 성과가 하락 추세를 보이고 있어, 원인 분석과 함께 "
                    "긴급한 대응 조치가 필요합니다."
                )
            else:
                paragraphs.append(
                    "SEO 성과가 안정적으로 유지되고 있으나, 경쟁 환경 변화에 대비하여 "
                    "지속적인 모니터링과 선제적 대응이 필요합니다."
                )

        # Marketing specific: channel and content focus
        elif audience.level == "marketing":
            top_issues = report.get("top_issues", [])
            content_issues = [
                i for i in top_issues if i.get("category") in ("content", "keywords")
            ]
            if content_issues:
                paragraphs.append(
                    f"콘텐츠/키워드 관련 이슈가 {len(content_issues)}건 발견되었으며, "
                    "콘텐츠 전략 수정이 권장됩니다."
                )

        # Technical specific: detailed breakdown
        elif audience.level == "technical":
            for cat, score in sorted(cat_scores.items(), key=lambda x: x[1]):
                cat_kr = CATEGORY_LABELS_KR.get(cat, cat)
                paragraphs.append(f"- {cat_kr}: {score}/100")

        return "\n\n".join(paragraphs)

    def format_wins(self, report: dict[str, Any]) -> list[str]:
        """Extract and format key wins in Korean."""
        wins = report.get("top_wins", [])
        formatted: list[str] = []

        for win in wins:
            desc = _translate_description(win.get("description", ""))
            cat = win.get("category", "")
            cat_kr = CATEGORY_LABELS_KR.get(cat, cat)

            if desc:
                formatted.append(f"[{cat_kr}] {desc}")

        return formatted

    def format_concerns(self, report: dict[str, Any]) -> list[str]:
        """Extract and format key concerns in Korean."""
        issues = report.get("top_issues", [])
        formatted: list[str] = []

        severity_kr = {
            "critical": "긴급",
            "high": "높음",
            "medium": "보통",
            "low": "낮음",
        }

        for issue in issues:
            desc = _translate_description(issue.get("description", ""))
            severity = issue.get("severity", "medium")
            cat = issue.get("category", "")
            sev_kr = severity_kr.get(severity, severity)
            cat_kr = CATEGORY_LABELS_KR.get(cat, cat)

            if desc:
                formatted.append(f"[{sev_kr}] [{cat_kr}] {desc}")

        return formatted

    def generate_recommendations(
        self,
        report: dict[str, Any],
        audience: AudienceConfig,
    ) -> list[str]:
        """Generate prioritized action items ranked by impact."""
        recommendations: list[str] = []
        cat_scores = report.get("category_scores", {})
        top_issues = report.get("top_issues", [])

        # Priority 1: Critical issues
        critical = [i for i in top_issues if i.get("severity") == "critical"]
        for issue in critical[:3]:
            cat_kr = CATEGORY_LABELS_KR.get(issue.get("category", ""), "")
            desc = _translate_description(issue.get("description", ""))
            if audience.level == "c-level":
                recommendations.append(
                    f"[긴급] {cat_kr} 영역 긴급 조치 필요 - {desc}"
                )
            else:
                recommendations.append(
                    f"[긴급] {desc} (영역: {cat_kr})"
                )

        # Priority 2: Weak categories
        weak_cats = sorted(
            [(k, v) for k, v in cat_scores.items() if v < 50],
            key=lambda x: x[1],
        )
        for cat, score in weak_cats[:3]:
            cat_kr = CATEGORY_LABELS_KR.get(cat, cat)
            if audience.level == "c-level":
                recommendations.append(
                    f"[개선] {cat_kr} 점수 {score}/100 - 전략적 투자 권장"
                )
            elif audience.level == "marketing":
                recommendations.append(
                    f"[개선] {cat_kr} ({score}/100) - 캠페인 전략 재검토 필요"
                )
            else:
                recommendations.append(
                    f"[개선] {cat_kr} ({score}/100) - 상세 진단 및 기술적 개선 필요"
                )

        # Priority 3: Maintenance for good categories
        strong_cats = [(k, v) for k, v in cat_scores.items() if v >= 75]
        if strong_cats:
            cats_kr = ", ".join(
                CATEGORY_LABELS_KR.get(k, k) for k, _ in strong_cats[:3]
            )
            recommendations.append(
                f"[유지] {cats_kr} - 현재 수준 유지 및 모니터링 지속"
            )

        # Audience-specific recommendations
        if audience.level == "c-level":
            health = report.get("overall_health", 0)
            if health < 60:
                recommendations.append(
                    "[전략] SEO 개선을 위한 전문 인력 또는 외부 에이전시 투입 검토"
                )
        elif audience.level == "marketing":
            recommendations.append(
                "[실행] 다음 분기 SEO 개선 로드맵 수립 및 KPI 설정"
            )
        elif audience.level == "technical":
            recommendations.append(
                "[실행] 기술 부채 해소 스프린트 계획 수립"
            )

        return recommendations[:audience.max_recommendations]

    def render_markdown(self, summary: ExecutiveSummary) -> str:
        """Render executive summary as a markdown document."""
        lines: list[str] = []

        # Title
        lines.append(f"# {summary.title}")
        lines.append("")

        # Meta
        audience_kr = {
            "c-level": "경영진",
            "marketing": "마케팅팀",
            "technical": "기술팀",
        }
        lines.append(f"**대상**: {audience_kr.get(summary.audience, summary.audience)}")
        lines.append(f"**도메인**: {summary.domain}")
        lines.append(f"**보고 일자**: {summary.period}")
        lines.append(f"**Audit ID**: {summary.audit_id}")
        lines.append("")

        # Health Score
        grade = self._health_grade(summary.health_score)
        grade_kr = HEALTH_LABELS_KR.get(grade, grade)
        trend_kr = TREND_LABELS_KR.get(summary.health_trend, summary.health_trend)

        lines.append("## Health Score")
        lines.append("")
        lines.append("| 지표 | 값 |")
        lines.append("|------|-----|")
        lines.append(f"| Overall Score | **{summary.health_score}/100** |")
        lines.append(f"| 등급 | {grade_kr} |")
        lines.append(f"| 추세 | {trend_kr} |")
        lines.append("")

        # Category summary
        if summary.category_summary:
            lines.append("## 영역별 점수")
            lines.append("")
            lines.append("| 영역 | 점수 |")
            lines.append("|------|------|")
            for cat, score_str in summary.category_summary.items():
                cat_kr = CATEGORY_LABELS_KR.get(cat, cat)
                lines.append(f"| {cat_kr} | {score_str} |")
            lines.append("")

        # Narrative
        lines.append("## 종합 분석")
        lines.append("")
        lines.append(summary.narrative)
        lines.append("")

        # Key wins
        if summary.key_wins:
            lines.append("## 주요 성과")
            lines.append("")
            for win in summary.key_wins:
                lines.append(f"- {win}")
            lines.append("")

        # Key concerns
        if summary.key_concerns:
            lines.append("## 주요 이슈")
            lines.append("")
            for concern in summary.key_concerns:
                lines.append(f"- {concern}")
            lines.append("")

        # Recommendations
        if summary.recommendations:
            lines.append("## 권장 조치 사항")
            lines.append("")
            for i, rec in enumerate(summary.recommendations, 1):
                lines.append(f"{i}. {rec}")
            lines.append("")

        # Footer
        lines.append("---")
        lines.append(
            "*이 보고서는 SEO Reporting Dashboard (Skill 34)에 의해 "
            f"{summary.timestamp}에 자동 생성되었습니다.*"
        )

        return "\n".join(lines)

    def run(
        self,
        report_json: str,
        audience_level: str = "c-level",
        output_path: str | None = None,
        output_format: str = "markdown",
    ) -> str:
        """Orchestrate executive report generation."""
        # Load report
        report_path = Path(report_json)
        if not report_path.exists():
            raise FileNotFoundError(f"Report file not found: {report_json}")

        report = json.loads(report_path.read_text(encoding="utf-8"))
        logger.info(f"Loaded report: {report.get('domain', 'unknown')}")

        # Configure audience
        audience = AudienceConfig.from_level(audience_level)
        logger.info(f"Audience: {audience.level} (depth: {audience.detail_depth})")

        # Build summary
        domain = report.get("domain", "")
        summary = ExecutiveSummary(
            title=f"SEO 성과 보고서 - {domain}",
            domain=domain,
            period=report.get("report_date", ""),
            health_score=report.get("overall_health", 0),
            health_trend=report.get("health_trend", "stable"),
            audit_id=report.get("audit_id", ""),
            audience=audience.level,
            timestamp=datetime.now().isoformat(),
        )

        # Category summary
        cat_scores = report.get("category_scores", {})
        summary.category_summary = {
            cat: f"{score}/100"
            for cat, score in sorted(
                cat_scores.items(), key=lambda x: x[1], reverse=True
            )
        }

        # Generate content
        summary.narrative = self.generate_narrative(report, audience)
        summary.key_wins = self.format_wins(report)[:audience.max_issues]
        summary.key_concerns = self.format_concerns(report)[:audience.max_issues]
        summary.recommendations = self.generate_recommendations(report, audience)

        # Render
        if output_format == "markdown":
            content = self.render_markdown(summary)
        elif output_format == "notion":
            # For Notion, we output markdown that can be pasted into Notion
            content = self.render_markdown(summary)
            logger.info(
                "Notion format: use MCP tools to push this markdown to Notion "
                f"database {report.get('audit_id', 'DASH-YYYYMMDD-NNN')}"
            )
        else:
            content = self.render_markdown(summary)

        # Save or print
        if output_path:
            Path(output_path).write_text(content, encoding="utf-8")
            logger.info(f"Executive report saved to {output_path}")
        else:
            print(content)

        return content

|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# CLI
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
|
||||||
|
parser = argparse.ArgumentParser(
|
||||||
|
description="SEO Executive Report - Korean-language executive summary generator",
|
||||||
|
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||||
|
epilog="""\
|
||||||
|
Examples:
|
||||||
|
python executive_report.py --report aggregated_report.json --audience c-level --output report.md
|
||||||
|
python executive_report.py --report aggregated_report.json --audience marketing --output report.md
|
||||||
|
python executive_report.py --report aggregated_report.json --audience technical --format notion
|
||||||
|
""",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--report",
|
||||||
|
required=True,
|
||||||
|
help="Path to aggregated report JSON file (from report_aggregator.py)",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--audience",
|
||||||
|
choices=["c-level", "marketing", "technical"],
|
||||||
|
default="c-level",
|
||||||
|
help="Target audience level (default: c-level)",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--output",
|
||||||
|
type=str,
|
||||||
|
default=None,
|
||||||
|
help="Output file path (prints to stdout if omitted)",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--format",
|
||||||
|
choices=["markdown", "notion"],
|
||||||
|
default="markdown",
|
||||||
|
dest="output_format",
|
||||||
|
help="Output format (default: markdown)",
|
||||||
|
)
|
||||||
|
return parser.parse_args(argv)
|
||||||
|
|
||||||
|
|
||||||
|
def main() -> None:
|
||||||
|
args = parse_args()
|
||||||
|
|
||||||
|
generator = ExecutiveReportGenerator()
|
||||||
|
generator.run(
|
||||||
|
report_json=args.report,
|
||||||
|
audience_level=args.audience,
|
||||||
|
output_path=args.output,
|
||||||
|
output_format=args.output_format,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,744 @@
"""
Report Aggregator - Collect and normalize outputs from all SEO skills
=====================================================================
Purpose: Scan for recent audit outputs from skills 11-33, normalize data
         formats, merge findings by domain/date, compute cross-skill health
         scores, and identify top-priority issues across all audits.
Python: 3.10+

Usage:
    python report_aggregator.py --domain https://example.com --json
    python report_aggregator.py --domain https://example.com --output-dir ./audit_outputs --json
    python report_aggregator.py --domain https://example.com --from 2025-01-01 --to 2025-03-31 --json
    python report_aggregator.py --domain https://example.com --json --output report.json
"""

import argparse
import asyncio
import json
import logging
import os
import sys
from dataclasses import dataclass, field, asdict
from datetime import datetime, date
from pathlib import Path
from typing import Any
from urllib.parse import urlparse

from base_client import BaseAsyncClient, config

logger = logging.getLogger(__name__)


# ---------------------------------------------------------------------------
# Skill registry
# ---------------------------------------------------------------------------

SKILL_REGISTRY = {
    11: {"name": "comprehensive-audit", "category": "comprehensive", "weight": 1.0},
    12: {"name": "technical-audit", "category": "technical", "weight": 0.20},
    13: {"name": "on-page-audit", "category": "on_page", "weight": 0.20},
    14: {"name": "core-web-vitals", "category": "performance", "weight": 0.25},
    15: {"name": "search-console", "category": "search_console", "weight": 0.10},
    16: {"name": "schema-validator", "category": "schema", "weight": 0.15},
    17: {"name": "schema-generator", "category": "schema", "weight": 0.10},
    18: {"name": "local-audit", "category": "local", "weight": 0.10},
    19: {"name": "keyword-strategy", "category": "keywords", "weight": 0.15},
    20: {"name": "serp-analysis", "category": "keywords", "weight": 0.10},
    21: {"name": "position-tracking", "category": "keywords", "weight": 0.15},
    22: {"name": "link-building", "category": "links", "weight": 0.15},
    23: {"name": "content-strategy", "category": "content", "weight": 0.15},
    24: {"name": "ecommerce-seo", "category": "ecommerce", "weight": 0.10},
    25: {"name": "kpi-framework", "category": "kpi", "weight": 0.20},
    26: {"name": "international-seo", "category": "international", "weight": 0.10},
    27: {"name": "ai-visibility", "category": "ai_search", "weight": 0.10},
    28: {"name": "knowledge-graph", "category": "entity_seo", "weight": 0.10},
    31: {"name": "competitor-intel", "category": "competitor", "weight": 0.15},
    32: {"name": "crawl-budget", "category": "technical", "weight": 0.10},
    33: {"name": "page-experience", "category": "performance", "weight": 0.10},
}

CATEGORY_WEIGHTS = {
    "technical": 0.20,
    "on_page": 0.15,
    "performance": 0.15,
    "content": 0.10,
    "links": 0.10,
    "local": 0.05,
    "keywords": 0.10,
    "competitor": 0.05,
    "schema": 0.05,
    "kpi": 0.05,
}


# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------


@dataclass
class SkillOutput:
    """Normalized output from a single SEO skill."""
    skill_id: int = 0
    skill_name: str = ""
    domain: str = ""
    audit_date: str = ""
    category: str = ""
    data: dict[str, Any] = field(default_factory=dict)
    health_score: float = 0.0
    issues: list[dict[str, Any]] = field(default_factory=list)
    wins: list[dict[str, Any]] = field(default_factory=list)
    source_file: str = ""


@dataclass
class AggregatedReport:
    """Full aggregated report from all SEO skill outputs."""
    domain: str = ""
    report_date: str = ""
    skills_included: list[dict[str, Any]] = field(default_factory=list)
    overall_health: float = 0.0
    health_trend: str = "stable"
    category_scores: dict[str, float] = field(default_factory=dict)
    top_issues: list[dict[str, Any]] = field(default_factory=list)
    top_wins: list[dict[str, Any]] = field(default_factory=list)
    timeline: list[dict[str, Any]] = field(default_factory=list)
    raw_outputs: list[dict[str, Any]] = field(default_factory=list)
    audit_id: str = ""
    timestamp: str = ""
    errors: list[str] = field(default_factory=list)


# ---------------------------------------------------------------------------
# Aggregator
# ---------------------------------------------------------------------------


class ReportAggregator(BaseAsyncClient):
    """Aggregate outputs from all SEO skills into unified reports."""

    NOTION_DB_ID = "2c8581e5-8a1e-8035-880b-e38cefc2f3ef"

    def __init__(self):
        super().__init__(max_concurrent=5, requests_per_second=2.0)

    @staticmethod
    def _extract_domain(url: str) -> str:
        """Extract bare domain from a URL, or return as-is if already bare."""
        if "://" in url:
            parsed = urlparse(url)
            return parsed.netloc.lower().replace("www.", "")
        return url.lower().replace("www.", "")

    @staticmethod
    def _generate_audit_id() -> str:
        """Generate an audit ID in DASH-YYYYMMDD-NNN format."""
        now = datetime.now()
        return f"DASH-{now.strftime('%Y%m%d')}-001"

    def scan_local_outputs(
        self,
        output_dir: str,
        domain: str | None = None,
        date_from: str | None = None,
        date_to: str | None = None,
    ) -> list[SkillOutput]:
        """Find JSON output files from other SEO skills in a directory.

        Scans for files matching patterns from skills 11-33 and parses
        them into normalized SkillOutput objects.
        """
        outputs: list[SkillOutput] = []
        output_path = Path(output_dir)

        if not output_path.exists():
            self.logger.warning(f"Output directory not found: {output_dir}")
            return outputs

        # Scan for JSON files matching skill output patterns
        json_files = list(output_path.rglob("*.json"))
        self.logger.info(f"Found {len(json_files)} JSON files in {output_dir}")

        for json_file in json_files:
            try:
                data = json.loads(json_file.read_text(encoding="utf-8"))

                # Attempt to identify which skill produced this output
                skill_output = self._identify_and_parse(data, str(json_file))
                if skill_output is None:
                    continue

                # Filter by domain if specified (supports subdomains)
                if domain and skill_output.domain:
                    target_domain = self._extract_domain(domain)
                    file_domain = skill_output.domain
                    # Match exact domain OR subdomains (e.g., blog.example.com matches example.com)
                    if file_domain != target_domain and not file_domain.endswith("." + target_domain):
                        continue

                # Filter by date range
                if date_from and skill_output.audit_date < date_from:
                    continue
                if date_to and skill_output.audit_date > date_to:
                    continue

                outputs.append(skill_output)
                self.logger.info(
                    f"Parsed output from skill {skill_output.skill_id} "
                    f"({skill_output.skill_name}): {json_file.name}"
                )

            except (json.JSONDecodeError, KeyError, TypeError) as e:
                self.logger.warning(f"Could not parse {json_file}: {e}")

        self.logger.info(f"Successfully parsed {len(outputs)} skill outputs")
        return outputs

    def _identify_and_parse(
        self, data: dict[str, Any], source_file: str
    ) -> SkillOutput | None:
        """Identify which skill produced the output and parse it."""
        skill_output = SkillOutput(source_file=source_file)

        # Strategy 1: Parse skill from audit_id prefix (e.g., KPI-20250115-001)
        audit_id = data.get("audit_id", "")
        if isinstance(audit_id, str):
            prefix_map = {
                "COMP": 11, "TECH": 12, "PAGE": 13, "CWV": 14,
                "GSC": 15, "SCHEMA": 16, "LOCAL": 18, "KW": 19,
                "SERP": 20, "RANK": 21, "LINK": 22, "CONTENT": 23,
                "ECOM": 24, "KPI": 25, "INTL": 26, "AI": 27,
                "KG": 28, "COMPET": 31, "CRAWL": 32, "MIGR": 33,
                "DASH": None,  # Skip self-referencing dashboard reports
            }
            # Check longer prefixes first so "COMPET" is not shadowed by "COMP"
            for prefix, skill_id in sorted(
                prefix_map.items(), key=lambda p: -len(p[0])
            ):
                if audit_id.startswith(prefix):
                    if skill_id is None:
                        return None  # Skip aggregated reports
                    skill_info = SKILL_REGISTRY.get(skill_id, {})
                    skill_output.skill_id = skill_id
                    skill_output.skill_name = skill_info.get("name", "unknown")
                    skill_output.category = skill_info.get("category", "unknown")
                    break

        # Strategy 2: Fallback to audit_type field (used by our-seo-agent outputs)
        if not skill_output.skill_id:
            audit_type = data.get("audit_type", "")
            if isinstance(audit_type, str) and audit_type:
                type_map = {
                    "comprehensive": 11, "technical": 12, "onpage": 13,
                    "cwv": 14, "core-web-vitals": 14,
                    "gsc": 15, "search-console": 15,
                    "schema": 16, "local": 18,
                    "keyword": 19, "serp": 20, "position": 21,
                    "link": 22, "backlink": 22,
                    "content": 23, "ecommerce": 24, "kpi": 25,
                    "international": 26, "hreflang": 26,
                    "ai-visibility": 27, "knowledge-graph": 28, "entity": 28,
                    "competitor": 31, "crawl-budget": 32, "crawl": 32,
                    "migration": 33,
                }
                skill_id = type_map.get(audit_type.lower())
                if skill_id is not None:
                    skill_info = SKILL_REGISTRY.get(skill_id, {})
                    skill_output.skill_id = skill_id
                    skill_output.skill_name = skill_info.get("name", "unknown")
                    skill_output.category = skill_info.get("category", "unknown")

        # Extract domain
        for key in ("url", "target", "domain", "site"):
            if key in data:
                skill_output.domain = self._extract_domain(str(data[key]))
                break

        # Extract health score: check top-level first, then nested data dict
        score_found = False
        for key in ("health_score", "overall_health", "score"):
            if key in data:
                try:
                    skill_output.health_score = float(data[key])
                    score_found = True
                except (ValueError, TypeError):
                    pass
                break

        if not score_found:
            nested = data.get("data", {})
            if isinstance(nested, dict):
                for key in ("technical_score", "onpage_score", "schema_score",
                            "local_seo_score", "cwv_score", "performance_score",
                            "content_score", "link_score", "keyword_score",
                            "competitor_score", "efficiency_score",
                            "health_score", "overall_score", "score"):
                    val = nested.get(key)
                    if val is not None:
                        try:
                            skill_output.health_score = float(val)
                        except (ValueError, TypeError):
                            pass
                        break

        # Extract audit date
        for key in ("audit_date", "report_date", "timestamp", "found_date"):
            if key in data:
                skill_output.audit_date = str(data[key])[:10]
                break

        if not skill_output.audit_date:
            skill_output.audit_date = date.today().isoformat()

        # Extract issues
        issues_raw = data.get("issues", data.get("critical_issues", []))
        if isinstance(issues_raw, list):
            for issue in issues_raw:
                if isinstance(issue, dict):
                    skill_output.issues.append(issue)
                elif isinstance(issue, str):
                    skill_output.issues.append({"description": issue, "severity": "medium"})

        # Extract wins / recommendations
        wins_raw = data.get("wins", data.get("top_wins", []))
        if isinstance(wins_raw, list):
            for win in wins_raw:
                if isinstance(win, dict):
                    skill_output.wins.append(win)
                elif isinstance(win, str):
                    skill_output.wins.append({"description": win})

        # Store full data
        skill_output.data = data

        # Skip if no useful data was extracted
        if not skill_output.skill_id and not skill_output.domain:
            return None

        return skill_output

    async def query_notion_audits(
        self,
        domain: str,
        date_from: str | None = None,
        date_to: str | None = None,
    ) -> list[SkillOutput]:
        """Fetch past audit entries from the Notion SEO Audit Log database.

        In production, this uses the Notion MCP tools to query the database.
        Returns normalized SkillOutput objects.
        """
        outputs: list[SkillOutput] = []
        self.logger.info(
            f"Querying Notion audits for {domain} "
            f"(db: {self.NOTION_DB_ID}, from={date_from}, to={date_to})"
        )

        # In production, this would call:
        # mcp__notion__query-database with filters for Site URL and Found Date
        # For now, return an empty list as a placeholder.
        self.logger.info(
            "Notion query is a placeholder; use MCP tools in Claude Desktop "
            "or manually provide JSON files via --output-dir."
        )

        return outputs

    def normalize_output(self, skill_output: SkillOutput) -> dict[str, Any]:
        """Normalize a skill output into a unified format."""
        return {
            "skill_id": skill_output.skill_id,
            "skill_name": skill_output.skill_name,
            "domain": skill_output.domain,
            "audit_date": skill_output.audit_date,
            "category": skill_output.category,
            "health_score": skill_output.health_score,
            "issues_count": len(skill_output.issues),
            "wins_count": len(skill_output.wins),
            "issues": skill_output.issues[:10],
            "wins": skill_output.wins[:10],
        }

    def compute_cross_skill_health(
        self, outputs: list[SkillOutput]
    ) -> tuple[float, dict[str, float]]:
        """Compute a weighted overall health score across all skills.

        Returns (overall_score, category_scores_dict).
        """
        category_scores: dict[str, list[float]] = {}

        for output in outputs:
            cat = output.category
            if cat and output.health_score > 0:
                category_scores.setdefault(cat, []).append(output.health_score)

        # Average scores per category
        avg_category: dict[str, float] = {}
        for cat, scores in category_scores.items():
            avg_category[cat] = round(sum(scores) / len(scores), 1)

        # Weighted overall score
        total_weight = 0.0
        weighted_sum = 0.0
        for cat, avg_score in avg_category.items():
            weight = CATEGORY_WEIGHTS.get(cat, 0.05)
            weighted_sum += avg_score * weight
            total_weight += weight

        overall = round(weighted_sum / total_weight, 1) if total_weight > 0 else 0.0

        return overall, avg_category

    def identify_priorities(
        self, outputs: list[SkillOutput]
    ) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
        """Identify top issues and wins across all skill outputs.

        Returns (top_issues, top_wins).
        """
        all_issues: list[dict[str, Any]] = []
        all_wins: list[dict[str, Any]] = []

        severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3}

        for output in outputs:
            for issue in output.issues:
                all_issues.append({
                    **issue,
                    "source_skill": output.skill_name,
                    "source_skill_id": output.skill_id,
                    "category": output.category,
                })

            for win in output.wins:
                all_wins.append({
                    **win,
                    "source_skill": output.skill_name,
                    "source_skill_id": output.skill_id,
                    "category": output.category,
                })

        # Sort issues by severity
        all_issues.sort(
            key=lambda i: severity_order.get(i.get("severity", "medium"), 2)
        )

        return all_issues[:20], all_wins[:20]

    def build_timeline(self, outputs: list[SkillOutput]) -> list[dict[str, Any]]:
        """Build an audit history timeline from all skill outputs."""
        timeline: list[dict[str, Any]] = []

        for output in outputs:
            timeline.append({
                "date": output.audit_date,
                "skill": output.skill_name,
                "skill_id": output.skill_id,
                "health_score": output.health_score,
                "category": output.category,
                "issues_count": len(output.issues),
            })

        # Sort by date descending
        timeline.sort(key=lambda e: e.get("date", ""), reverse=True)
        return timeline

async def run(
|
||||||
|
self,
|
||||||
|
domain: str,
|
||||||
|
output_dir: str | None = None,
|
||||||
|
date_from: str | None = None,
|
||||||
|
date_to: str | None = None,
|
||||||
|
) -> AggregatedReport:
|
||||||
|
"""Orchestrate the full report aggregation pipeline."""
|
||||||
|
target_domain = self._extract_domain(domain)
|
||||||
|
report = AggregatedReport(
|
||||||
|
domain=target_domain,
|
||||||
|
report_date=date.today().isoformat(),
|
||||||
|
audit_id=self._generate_audit_id(),
|
||||||
|
timestamp=datetime.now().isoformat(),
|
||||||
|
)
|
||||||
|
|
||||||
|
all_outputs: list[SkillOutput] = []
|
||||||
|
|
||||||
|
# Step 1: Scan local outputs
|
||||||
|
if output_dir:
|
||||||
|
self.logger.info(f"Step 1/5: Scanning local outputs in {output_dir}...")
|
||||||
|
local_outputs = self.scan_local_outputs(
|
||||||
|
output_dir, domain=target_domain,
|
||||||
|
date_from=date_from, date_to=date_to,
|
||||||
|
)
|
||||||
|
all_outputs.extend(local_outputs)
|
||||||
|
else:
|
||||||
|
self.logger.info("Step 1/5: No output directory specified, skipping local scan.")
|
||||||
|
|
||||||
|
# Step 2: Query Notion for past audits
|
||||||
|
self.logger.info("Step 2/5: Querying Notion for past audits...")
|
||||||
|
try:
|
||||||
|
notion_outputs = await self.query_notion_audits(
|
||||||
|
domain=target_domain,
|
||||||
|
date_from=date_from,
|
||||||
|
date_to=date_to,
|
||||||
|
)
|
||||||
|
all_outputs.extend(notion_outputs)
|
||||||
|
except Exception as e:
|
||||||
|
msg = f"Notion query error: {e}"
|
||||||
|
self.logger.error(msg)
|
||||||
|
report.errors.append(msg)
|
||||||
|
|
||||||
|
if not all_outputs:
|
||||||
|
self.logger.warning(
|
||||||
|
"No skill outputs found. Provide --output-dir with JSON files "
|
||||||
|
"from SEO skills 11-33, or ensure Notion audit log has entries."
|
||||||
|
)
|
||||||
|
report.errors.append("No skill outputs found to aggregate.")
|
||||||
|
return report
|
||||||
|
|
||||||
|
# Step 3: Normalize and compute health scores
|
||||||
|
self.logger.info(
|
||||||
|
f"Step 3/5: Normalizing {len(all_outputs)} skill outputs..."
|
||||||
|
)
|
||||||
|
report.skills_included = [
|
||||||
|
{
|
||||||
|
"skill_id": o.skill_id,
|
||||||
|
"skill_name": o.skill_name,
|
||||||
|
"audit_date": o.audit_date,
|
||||||
|
}
|
||||||
|
for o in all_outputs
|
||||||
|
]
|
||||||
|
report.raw_outputs = [self.normalize_output(o) for o in all_outputs]
|
||||||
|
|
||||||
|
overall_health, category_scores = self.compute_cross_skill_health(all_outputs)
|
||||||
|
report.overall_health = overall_health
|
||||||
|
report.category_scores = category_scores
|
||||||
|
|
||||||
|
# Determine health trend from timeline
|
||||||
|
scores_by_date = sorted(
|
||||||
|
[(o.audit_date, o.health_score) for o in all_outputs if o.health_score > 0],
|
||||||
|
key=lambda x: x[0],
|
||||||
|
)
|
||||||
|
if len(scores_by_date) >= 2:
|
||||||
|
older_avg = sum(s for _, s in scores_by_date[:len(scores_by_date)//2]) / max(len(scores_by_date)//2, 1)
|
||||||
|
newer_avg = sum(s for _, s in scores_by_date[len(scores_by_date)//2:]) / max(len(scores_by_date) - len(scores_by_date)//2, 1)
|
||||||
|
if newer_avg > older_avg + 3:
|
||||||
|
report.health_trend = "improving"
|
||||||
|
elif newer_avg < older_avg - 3:
|
||||||
|
report.health_trend = "declining"
|
||||||
|
else:
|
||||||
|
report.health_trend = "stable"
|
||||||
|
|
||||||
|
# Step 4: Identify priorities
|
||||||
|
self.logger.info("Step 4/5: Identifying top issues and wins...")
|
||||||
|
top_issues, top_wins = self.identify_priorities(all_outputs)
|
||||||
|
report.top_issues = top_issues
|
||||||
|
report.top_wins = top_wins
|
||||||
|
|
||||||
|
# Step 5: Build timeline
|
||||||
|
self.logger.info("Step 5/5: Building audit history timeline...")
|
||||||
|
report.timeline = self.build_timeline(all_outputs)
|
||||||
|
|
||||||
|
self.logger.info(
|
||||||
|
f"Aggregation complete: {len(all_outputs)} skills, "
|
||||||
|
f"health={report.overall_health}/100, "
|
||||||
|
f"trend={report.health_trend}, "
|
||||||
|
f"issues={len(report.top_issues)}, wins={len(report.top_wins)}"
|
||||||
|
)
|
||||||
|
|
||||||
|
return report
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# Output formatting
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def _format_text_report(report: AggregatedReport) -> str:
|
||||||
|
"""Format aggregated report as human-readable text."""
|
||||||
|
lines: list[str] = []
|
||||||
|
lines.append("=" * 70)
|
||||||
|
lines.append(" SEO REPORTING DASHBOARD - AGGREGATED REPORT")
|
||||||
|
lines.append(f" Domain: {report.domain}")
|
||||||
|
lines.append(f" Report Date: {report.report_date}")
|
||||||
|
lines.append(f" Audit ID: {report.audit_id}")
|
||||||
|
lines.append("=" * 70)
|
||||||
|
|
||||||
|
# Health score
|
||||||
|
lines.append("")
|
||||||
|
lines.append(f" Overall Health: {report.overall_health}/100 ({report.health_trend})")
|
||||||
|
lines.append("-" * 50)
|
||||||
|
|
||||||
|
# Category scores
|
||||||
|
if report.category_scores:
|
||||||
|
lines.append("")
|
||||||
|
lines.append("--- CATEGORY SCORES ---")
|
||||||
|
for cat, score in sorted(
|
||||||
|
report.category_scores.items(), key=lambda x: x[1], reverse=True
|
||||||
|
):
|
||||||
|
bar = "#" * int(score / 5) + "." * (20 - int(score / 5))
|
||||||
|
lines.append(f" {cat:<20} [{bar}] {score:.1f}/100")
|
||||||
|
|
||||||
|
# Skills included
|
||||||
|
if report.skills_included:
|
||||||
|
lines.append("")
|
||||||
|
lines.append("--- SKILLS INCLUDED ---")
|
||||||
|
for skill in report.skills_included:
|
||||||
|
lines.append(
|
||||||
|
f" [{skill['skill_id']:>2}] {skill['skill_name']:<30} "
|
||||||
|
f"({skill['audit_date']})"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Top issues
|
||||||
|
if report.top_issues:
|
||||||
|
lines.append("")
|
||||||
|
lines.append("--- TOP ISSUES ---")
|
||||||
|
for i, issue in enumerate(report.top_issues[:10], 1):
|
||||||
|
severity = issue.get("severity", "medium").upper()
|
||||||
|
desc = issue.get("description", "No description")
|
||||||
|
cat = issue.get("category", "")
|
||||||
|
lines.append(f" {i:>2}. [{severity}] ({cat}) {desc}")
|
||||||
|
|
||||||
|
# Top wins
|
||||||
|
if report.top_wins:
|
||||||
|
lines.append("")
|
||||||
|
lines.append("--- TOP WINS ---")
|
||||||
|
for i, win in enumerate(report.top_wins[:10], 1):
|
||||||
|
desc = win.get("description", "No description")
|
||||||
|
cat = win.get("category", "")
|
||||||
|
lines.append(f" {i:>2}. ({cat}) {desc}")
|
||||||
|
|
||||||
|
# Timeline
|
||||||
|
if report.timeline:
|
||||||
|
lines.append("")
|
||||||
|
lines.append("--- AUDIT TIMELINE ---")
|
||||||
|
lines.append(f" {'Date':<12} {'Skill':<25} {'Score':>8} {'Issues':>8}")
|
||||||
|
lines.append(" " + "-" * 55)
|
||||||
|
for entry in report.timeline[:15]:
|
||||||
|
lines.append(
|
||||||
|
f" {entry['date']:<12} {entry['skill']:<25} "
|
||||||
|
f"{entry['health_score']:>7.1f} {entry['issues_count']:>7}"
|
||||||
|
)
|
||||||
|
|
||||||
|
# Errors
|
||||||
|
if report.errors:
|
||||||
|
lines.append("")
|
||||||
|
lines.append("--- ERRORS ---")
|
||||||
|
for err in report.errors:
|
||||||
|
lines.append(f" - {err}")
|
||||||
|
|
||||||
|
lines.append("")
|
||||||
|
lines.append("=" * 70)
|
||||||
|
return "\n".join(lines)
|
||||||
|
|
||||||
|
|
||||||
|
def _serialize_report(report: AggregatedReport) -> dict:
|
||||||
|
"""Convert report to JSON-serializable dict."""
|
||||||
|
return {
|
||||||
|
"domain": report.domain,
|
||||||
|
"report_date": report.report_date,
|
||||||
|
"overall_health": report.overall_health,
|
||||||
|
"health_trend": report.health_trend,
|
||||||
|
"skills_included": report.skills_included,
|
||||||
|
"category_scores": report.category_scores,
|
||||||
|
"top_issues": report.top_issues,
|
||||||
|
"top_wins": report.top_wins,
|
||||||
|
"timeline": report.timeline,
|
||||||
|
"raw_outputs": report.raw_outputs,
|
||||||
|
"audit_id": report.audit_id,
|
||||||
|
"timestamp": report.timestamp,
|
||||||
|
"errors": report.errors if report.errors else None,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
# CLI
|
||||||
|
# ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
|
||||||
|
parser = argparse.ArgumentParser(
|
||||||
|
description="SEO Report Aggregator - Collect and normalize outputs from all SEO skills",
|
||||||
|
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||||
|
epilog="""\
|
||||||
|
Examples:
|
||||||
|
python report_aggregator.py --domain https://example.com --json
|
||||||
|
python report_aggregator.py --domain https://example.com --output-dir ./audit_outputs --json
|
||||||
|
python report_aggregator.py --domain https://example.com --from 2025-01-01 --to 2025-03-31 --json
|
||||||
|
""",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--domain",
|
||||||
|
required=True,
|
||||||
|
help="Target domain to aggregate reports for",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--output-dir",
|
||||||
|
type=str,
|
||||||
|
default=None,
|
||||||
|
help="Directory containing JSON outputs from SEO skills",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--from",
|
||||||
|
type=str,
|
||||||
|
default=None,
|
||||||
|
dest="date_from",
|
||||||
|
help="Start date for filtering (YYYY-MM-DD)",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--to",
|
||||||
|
type=str,
|
||||||
|
default=None,
|
||||||
|
dest="date_to",
|
||||||
|
help="End date for filtering (YYYY-MM-DD)",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--json",
|
||||||
|
action="store_true",
|
||||||
|
default=False,
|
||||||
|
help="Output in JSON format",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--output",
|
||||||
|
type=str,
|
||||||
|
default=None,
|
||||||
|
help="Save output to file path",
|
||||||
|
)
|
||||||
|
return parser.parse_args(argv)
|
||||||
|
|
||||||
|
|
||||||
|
async def async_main(args: argparse.Namespace) -> None:
|
||||||
|
aggregator = ReportAggregator()
|
||||||
|
|
||||||
|
report = await aggregator.run(
|
||||||
|
domain=args.domain,
|
||||||
|
output_dir=args.output_dir,
|
||||||
|
date_from=args.date_from,
|
||||||
|
date_to=args.date_to,
|
||||||
|
)
|
||||||
|
|
||||||
|
if args.json:
|
||||||
|
output_str = json.dumps(
|
||||||
|
_serialize_report(report), indent=2, ensure_ascii=False
|
||||||
|
)
|
||||||
|
else:
|
||||||
|
output_str = _format_text_report(report)
|
||||||
|
|
||||||
|
if args.output:
|
||||||
|
Path(args.output).write_text(output_str, encoding="utf-8")
|
||||||
|
logger.info(f"Report saved to {args.output}")
|
||||||
|
else:
|
||||||
|
print(output_str)
|
||||||
|
|
||||||
|
aggregator.print_stats()
|
||||||
|
|
||||||
|
|
||||||
|
def main() -> None:
|
||||||
|
args = parse_args()
|
||||||
|
asyncio.run(async_main(args))
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
@@ -0,0 +1,9 @@
# 34-seo-reporting-dashboard dependencies
requests>=2.31.0
aiohttp>=3.9.0
pandas>=2.1.0
tenacity>=8.2.0
tqdm>=4.66.0
python-dotenv>=1.0.0
rich>=13.7.0
jinja2>=3.1.0
custom-skills/34-seo-reporting-dashboard/desktop/SKILL.md (new file, 136 lines)
@@ -0,0 +1,136 @@
---
name: seo-reporting-dashboard
description: |
  SEO reporting dashboard and executive report generation. Aggregates data from all SEO skills
  into stakeholder-ready reports and interactive HTML dashboards.
  Triggers: SEO report, SEO dashboard, executive summary, 보고서, 대시보드, performance report, 종합 보고서.
---

# SEO Reporting Dashboard

## Purpose

Aggregate outputs from all SEO skills (11-33) into stakeholder-ready executive reports with interactive HTML dashboards, trend analysis, and Korean-language summaries. This is the PRESENTATION LAYER that sits on top of skill 25 (KPI Framework) and all other skill outputs, providing a unified view of SEO performance across all audit dimensions.

## Core Capabilities

1. **Report Aggregation** - Collect and normalize outputs from all SEO skills (11-33) into a unified data structure with cross-skill health scoring and priority issue identification
2. **Interactive Dashboard** - Generate self-contained HTML dashboards with Chart.js visualizations including health gauge, traffic trends, keyword distribution, issue breakdown, and competitor radar
3. **Executive Reporting** - Korean-language executive summary generation with audience-specific detail levels (C-level, marketing team, technical team) and prioritized action items

## MCP Tool Usage

### Ahrefs for Fresh Data Pull
```
mcp__ahrefs__site-explorer-metrics: Pull current organic metrics snapshot for dashboard
mcp__ahrefs__site-explorer-metrics-history: Pull historical metrics for trend visualization
```

### Notion for Reading Past Audits and Writing Reports
```
mcp__notion__*: Query SEO Audit Log database for past audit entries
mcp__notion__*: Save dashboard reports and executive summaries to Notion
```

### Perplexity for Context
```
mcp__perplexity__*: Enrich reports with industry benchmarks and competitor context
```

## Workflow

### Dashboard Generation
1. Accept target domain and optional date range
2. Query Notion SEO Audit Log for all past audit entries for the domain
3. Optionally pull fresh metrics from Ahrefs (site-explorer-metrics, metrics-history)
4. Normalize all skill outputs into unified format
5. Compute cross-skill health score with weighted category dimensions
6. Identify top issues (sorted by severity) and top wins across all audits
7. Build audit history timeline
8. Generate HTML dashboard with Chart.js charts:
   - Health score gauge (doughnut)
   - Category scores horizontal bar chart
   - Health score timeline line chart
   - Issue distribution pie chart
   - Competitor radar chart (if competitor data available)
9. Save HTML file and optionally push summary to Notion

### Executive Reporting
1. Load aggregated report data (from dashboard generation or JSON file)
2. Select audience level: C-level, marketing, or technical
3. Generate Korean-language narrative with:
   - Health score overview and trend
   - Category highlights (strengths and weaknesses)
   - Skills coverage summary
   - Audience-specific business impact analysis
4. Format key wins and concerns with severity and category labels
5. Generate prioritized action items ranked by impact
6. Render as markdown document
7. Optionally push to Notion SEO Audit Log
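The Korean trend labels used in step 3 map directly from the aggregator's English trend values (these labels come from this commit); the grade bands below are illustrative assumptions:

```python
# Trend labels added in this commit for the Korean executive report.
TREND_KR = {
    "improving": "개선 중 ↑",
    "stable": "안정 →",
    "declining": "하락 중 ↓",
}

def grade_kr(score: float) -> str:
    """Map a 0-100 health score to a Korean grade label.

    Thresholds here are illustrative; the skill's real bands may differ.
    """
    if score >= 80:
        return "우수 (Excellent)"
    if score >= 60:
        return "양호 (Good)"
    if score >= 40:
        return "주의 (Needs Attention)"
    return "위험 (Critical)"

print(TREND_KR["improving"], grade_kr(72))  # 개선 중 ↑ 양호 (Good)
```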

## Output Format

### HTML Dashboard
```
Self-contained HTML file with:
- Responsive CSS grid layout
- Chart.js visualizations from CDN
- Health score gauge
- Category bar chart
- Timeline line chart
- Issues pie chart
- Competitor radar chart
- Issues and wins lists
- Audit history table
```

### Executive Report (Markdown)
```markdown
# SEO 성과 보고서 - [domain]

**대상**: 경영진 / 마케팅팀 / 기술팀
**도메인**: [domain]
**보고 일자**: [date]

## Health Score
| 지표 | 값 |
|------|-----|
| Overall Score | **[score]/100** |
| 등급 | [grade_kr] |
| 추세 | [trend_kr] |

## 종합 분석
[Korean narrative...]

## 주요 성과
- [wins...]

## 주요 이슈
- [concerns...]

## 권장 조치 사항
1. [recommendations...]
```

## Audience Configurations

| Audience | Detail | Issues | Recommendations | Technical Details |
|----------|--------|--------|-----------------|-------------------|
| C-level (경영진) | Summary | Top 5 | Top 3 | No |
| Marketing (마케팅팀) | Moderate | Top 10 | Top 5 | No |
| Technical (기술팀) | Detailed | Top 20 | Top 10 | Yes |
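The table above can be mirrored as a small lookup that drives how deeply each report truncates its lists. A sketch; the dictionary keys and field names are assumptions for illustration:

```python
# Hypothetical mirror of the audience table; field names are illustrative.
AUDIENCES = {
    "c_level":   {"label": "경영진",   "issues": 5,  "recommendations": 3,  "technical": False},
    "marketing": {"label": "마케팅팀", "issues": 10, "recommendations": 5,  "technical": False},
    "technical": {"label": "기술팀",   "issues": 20, "recommendations": 10, "technical": True},
}

def trim_for_audience(issues: list[dict], audience: str) -> list[dict]:
    """Truncate the issue list to the audience's configured depth."""
    cfg = AUDIENCES.get(audience, AUDIENCES["marketing"])
    return issues[: cfg["issues"]]

print(len(trim_for_audience([{"id": i} for i in range(30)], "c_level")))  # 5
```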

## Limitations

- Aggregation depends on availability of JSON outputs from other skills
- Notion query for past audits requires MCP tools (placeholder in scripts)
- Competitor radar chart only renders if competitor intel (skill 31) data is present
- HTML dashboard requires internet access for Chart.js CDN

## Notion Output (Required)

All reports MUST be saved to OurDigital SEO Audit Log:
- **Database ID**: `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
- **Properties**: Issue (title), Site (url), Category ("SEO Dashboard"), Priority, Found Date, Audit ID
- **Language**: Korean with English technical terms
- **Audit ID Format**: DASH-YYYYMMDD-NNN
@@ -0,0 +1,9 @@
name: seo-reporting-dashboard
description: |
  SEO reporting dashboard and executive report generation. Triggers: SEO report, dashboard, executive summary, 보고서, 대시보드.
allowed-tools:
  - mcp__ahrefs__*
  - mcp__notion__*
  - mcp__perplexity__*
  - WebSearch
  - WebFetch
@@ -0,0 +1,20 @@
# Ahrefs

Tools used for pulling fresh SEO data into the reporting dashboard.

## Tools Used

| Tool | Purpose |
|------|---------|
| `mcp__ahrefs__site-explorer-metrics` | Current organic metrics snapshot (traffic, keywords, DR) |
| `mcp__ahrefs__site-explorer-metrics-history` | Historical metrics for trend charts and period comparison |

## Usage

These tools are called when the dashboard needs fresh data beyond what is available from cached skill outputs. The aggregator first checks local JSON files and Notion audit log entries, then optionally pulls current data from Ahrefs to supplement the report.

## Notes

- Ahrefs data has approximately 24-hour freshness lag
- Traffic value from Ahrefs is in cents; divide by 100 for USD
- Historical data availability depends on Ahrefs subscription tier
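The two unit conventions above are easy to get wrong when merging Ahrefs data into the report, so they are worth capturing as tiny helpers. A sketch (function names are illustrative; only the cents-to-USD rule and the ~24h lag come from the notes above):

```python
from datetime import date

def traffic_value_usd(value_cents: int) -> float:
    """Ahrefs reports traffic value in cents; convert to USD."""
    return value_cents / 100

def is_fresh(metrics_date: date, max_lag_days: int = 1) -> bool:
    """Treat a snapshot as fresh if it is within the ~24h Ahrefs lag."""
    return (date.today() - metrics_date).days <= max_lag_days

print(traffic_value_usd(123456))  # 1234.56
```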
@@ -0,0 +1,34 @@
# Notion

Notion MCP tools are used for both reading past audit data and writing dashboard reports.

## Database Configuration

| Field | Value |
|-------|-------|
| Database ID | `2c8581e5-8a1e-8035-880b-e38cefc2f3ef` |
| URL | https://www.notion.so/dintelligence/2c8581e58a1e8035880be38cefc2f3ef |

## Reading Past Audits

Query the SEO Audit Log database to retrieve historical audit entries:
- Filter by **Site** (URL property) matching the target domain
- Filter by **Found Date** for date range selection
- Retrieve **Category**, **Priority**, **Audit ID**, and page content
- Used by the report aggregator to build the unified dataset

## Writing Dashboard Reports

Save generated reports to the SEO Audit Log:
- **Issue** (Title): `SEO 대시보드 보고서 - [domain] - YYYY-MM-DD`
- **Site** (URL): Target website URL
- **Category** (Select): `SEO Dashboard`
- **Priority** (Select): Based on overall health trend
- **Found Date** (Date): Report generation date
- **Audit ID** (Rich Text): Format `DASH-YYYYMMDD-NNN`
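The write side maps the property list above onto the standard Notion API property value shapes. A sketch of the page-properties payload (the helper name is hypothetical; property names match the schema above):

```python
from datetime import date

def dashboard_page_properties(domain: str, audit_id: str) -> dict:
    """Notion page properties for a dashboard report entry.

    Value shapes follow standard Notion API conventions (title, url,
    select, date, rich_text).
    """
    today = date.today().isoformat()
    return {
        "Issue": {"title": [{"text": {
            "content": f"SEO 대시보드 보고서 - {domain} - {today}"}}]},
        "Site": {"url": f"https://{domain}"},
        "Category": {"select": {"name": "SEO Dashboard"}},
        "Found Date": {"date": {"start": today}},
        "Audit ID": {"rich_text": [{"text": {"content": audit_id}}]},
    }
```

Priority is omitted here because it is derived from the overall health trend at save time.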

## Language Guidelines

- Report content in Korean (한국어)
- Keep technical English terms as-is (e.g., Health Score, Chart.js, Domain Rating)
- URLs and code remain unchanged
@@ -115,10 +115,11 @@ skills:
 numbering:
   core: "01-09"        # Brand, blog, journal, research, etc.
   meta: "10"           # Skill creator
-  seo: "11-19"         # SEO tools
-  gtm: "20-29"         # GTM/Analytics
-  notion: "31-39"      # Notion tools
+  seo: "11-39"         # SEO tools (11-34 active, 35-39 reserved)
   jamie: "40-49"       # Jamie Clinic
+  notebooklm: "50-59"  # NotebookLM tools
+  gtm: "60-69"         # GTM/Analytics
+  reference: "90-99"   # Reference curator, multi-agent
 body_word_limit:
   min: 800