Add SEO skills 33-34 and fix bugs in skills 19-34

New skills:
- Skill 33: Site migration planner with redirect mapping and monitoring
- Skill 34: Reporting dashboard with HTML charts and Korean executive reports

Bug fixes (Skill 34 - report_aggregator.py):
- Add audit_type fallback for skill identification (was only using audit_id prefix)
- Extract health scores from nested data dict (technical_score, onpage_score, etc.)
- Support subdomain matching in domain filter (blog.ourdigital.org matches ourdigital.org)
- Skip self-referencing DASH- aggregated reports

Bug fixes (Skill 20 - naver_serp_analyzer.py):
- Remove VIEW tab selectors (removed by Naver in 2026)
- Add new section detectors: books (도서), shortform (숏폼), influencer (인플루언서)

Improvements (Skill 34 - dashboard/executive report):
- Add Korean category labels for Chart.js charts (기술 SEO, 온페이지, etc.)
- Add Korean trend labels (개선 중 ↑, 안정 →, 하락 중 ↓)
- Add English→Korean issue description translation layer (20 common patterns)

Documentation improvements:
- Add Korean triggers to 4 skill descriptions (19, 25, 28, 31)
- Expand Skill 32 SKILL.md from 40→143 lines (was 6/10, added workflow, output format, limitations)
- Add output format examples to Skills 27 and 28 SKILL.md
- Add limitations sections to Skills 27 and 28
- Update README.md, CLAUDE.md, AGENTS.md for skills 33-34

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-14 00:01:00 +09:00
parent dbfaa883cd
commit d2d0a2d460
37 changed files with 5462 additions and 56 deletions


@@ -0,0 +1,78 @@
# SEO Reporting Dashboard
Comprehensive SEO report and dashboard generation tool: aggregates results from all SEO skills into stakeholder-ready reports and an interactive HTML dashboard.
## Overview
Aggregates outputs from all SEO skills (11-33) into executive reports with interactive HTML dashboards, trend analysis, and Korean-language executive summaries. This is the PRESENTATION LAYER that sits on top of skill 25 (KPI Framework) and all other skill outputs.
## Relationship to Skill 25 (KPI Framework)
Skill 25 establishes KPI baselines, targets, and health scores for a single domain. Skill 34 builds on top of skill 25 by:
- Aggregating outputs from ALL SEO skills (not just KPIs)
- Generating visual HTML dashboards with Chart.js
- Producing audience-specific Korean executive summaries
- Providing cross-skill priority analysis
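The cross-skill health score can be sketched as a weighted average over category scores. The weights below are illustrative placeholders, not the values shipped in `report_aggregator.py`:

```python
# Illustrative sketch of the cross-skill health score.
# CATEGORY_WEIGHTS here are hypothetical; the real weights live in report_aggregator.py.
CATEGORY_WEIGHTS = {
    "technical": 0.25,
    "on_page": 0.20,
    "performance": 0.20,
    "content": 0.15,
    "links": 0.20,
}

def weighted_health(category_scores: dict[str, float]) -> int:
    """Weighted average over whichever categories the report contains."""
    total = weight_sum = 0.0
    for cat, score in category_scores.items():
        w = CATEGORY_WEIGHTS.get(cat, 0.1)  # fallback weight for unlisted categories
        total += score * w
        weight_sum += w
    return round(total / weight_sum) if weight_sum else 0

print(weighted_health({"technical": 85, "on_page": 70, "performance": 60}))  # → 73
```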
## Dual-Platform Structure
```
34-seo-reporting-dashboard/
├── code/ # Claude Code version
│ ├── CLAUDE.md # Action-oriented directive
│ ├── commands/
│ │ └── seo-reporting-dashboard.md # Slash command
│ └── scripts/
│ ├── base_client.py # Shared async utilities
│ ├── report_aggregator.py # Collect + normalize skill outputs
│ ├── dashboard_generator.py # HTML dashboard with Chart.js
│ ├── executive_report.py # Korean executive summary
│ └── requirements.txt
├── desktop/ # Claude Desktop version
│ ├── SKILL.md # MCP-based workflow
│ ├── skill.yaml # Extended metadata
│ └── tools/
│ ├── ahrefs.md # Ahrefs tool docs
│ └── notion.md # Notion tool docs
└── README.md
```
## Quick Start
### Claude Code
```bash
pip install -r code/scripts/requirements.txt
# Aggregate all skill outputs
python code/scripts/report_aggregator.py --domain https://example.com --json
# Generate HTML dashboard
python code/scripts/dashboard_generator.py --report report.json --output dashboard.html
# Generate Korean executive report
python code/scripts/executive_report.py --report report.json --audience c-level --output report.md
```
## Features
- Cross-skill report aggregation (skills 11-33)
- Interactive HTML dashboard with Chart.js charts
- Korean-language executive summaries
- Audience-specific reporting (C-level, marketing, technical)
- Notion integration for reading past audits and writing reports
- Mobile-responsive dashboard layout
## Requirements
- Python 3.10+
- Dependencies: `pip install -r code/scripts/requirements.txt`
- Notion API token (for database access)
- Ahrefs API token (for fresh data pull)
## Triggers
- SEO report, SEO dashboard, executive summary
- 보고서, 대시보드, 종합 보고서, 성과 보고서
- performance report, reporting dashboard


@@ -0,0 +1,173 @@
# CLAUDE.md
## Overview
SEO reporting dashboard and executive report generator. Aggregates outputs from all SEO skills (11-33) into stakeholder-ready reports with interactive HTML dashboards, trend analysis, and Korean-language executive summaries. This is the PRESENTATION LAYER that sits on top of skill 25 (KPI Framework) and all other skill outputs, providing a unified view of SEO performance across all audit dimensions.
## Quick Start
```bash
pip install -r scripts/requirements.txt
# Aggregate outputs from all SEO skills
python scripts/report_aggregator.py --domain https://example.com --json
# Generate HTML dashboard
python scripts/dashboard_generator.py --report aggregated_report.json --output dashboard.html
# Generate Korean executive report
python scripts/executive_report.py --report aggregated_report.json --audience c-level --output report.md
```
## Scripts
| Script | Purpose | Key Output |
|--------|---------|------------|
| `report_aggregator.py` | Collect and normalize outputs from all SEO skills | Unified aggregated report, cross-skill health score, priority issues |
| `dashboard_generator.py` | Generate interactive HTML dashboard with Chart.js | Self-contained HTML file with charts and responsive layout |
| `executive_report.py` | Korean-language executive summary generation | Markdown report tailored to audience level |
| `base_client.py` | Shared utilities | RateLimiter, ConfigManager, BaseAsyncClient |
## Report Aggregator
```bash
# Aggregate all skill outputs for a domain
python scripts/report_aggregator.py --domain https://example.com --json
# Specify output directory to scan
python scripts/report_aggregator.py --domain https://example.com --output-dir ./audit_outputs --json
# Filter by date range
python scripts/report_aggregator.py --domain https://example.com --from 2025-01-01 --to 2025-03-31 --json
# Save to file
python scripts/report_aggregator.py --domain https://example.com --json --output report.json
```
**Capabilities**:
- Scan for recent audit outputs from skills 11-33 (JSON files or Notion entries)
- Normalize data formats across skills into unified structure
- Merge findings by domain/date
- Compute cross-skill health scores with weighted dimensions
- Identify top-priority issues across all audits
- Build a timeline of audit history
- Support both local file scanning and Notion database queries
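Domain filtering supports subdomain matching (e.g. `blog.ourdigital.org` matches `ourdigital.org`). A minimal sketch of that check, with an illustrative helper name:

```python
from urllib.parse import urlparse

def matches_domain(audit_url: str, target_domain: str) -> bool:
    """True when the audited host is the target domain or one of its
    subdomains (blog.ourdigital.org matches ourdigital.org)."""
    host = urlparse(audit_url).netloc or audit_url  # accept bare domains too
    host = host.lower().removeprefix("www.")
    target = target_domain.lower().removeprefix("www.")
    return host == target or host.endswith("." + target)

print(matches_domain("https://blog.ourdigital.org/post", "ourdigital.org"))  # → True
```

The trailing-dot check is what prevents `notourdigital.org` from matching `ourdigital.org`.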
## Dashboard Generator
```bash
# Generate HTML dashboard from aggregated report
python scripts/dashboard_generator.py --report aggregated_report.json --output dashboard.html
# Custom title
python scripts/dashboard_generator.py --report aggregated_report.json --output dashboard.html --title "OurDigital SEO Dashboard"
```
**Capabilities**:
- Generate self-contained HTML dashboard (uses Chart.js from CDN)
- Health score gauge chart
- Traffic trend line chart
- Keyword ranking distribution bar chart
- Technical issues breakdown pie chart
- Competitor comparison radar chart
- Mobile-responsive layout with CSS grid
- Export as single .html file (no external dependencies)
## Executive Report
```bash
# C-level executive summary (Korean)
python scripts/executive_report.py --report aggregated_report.json --audience c-level --output report.md
# Marketing team report
python scripts/executive_report.py --report aggregated_report.json --audience marketing --output report.md
# Technical team report
python scripts/executive_report.py --report aggregated_report.json --audience technical --output report.md
# Output to Notion instead of file
python scripts/executive_report.py --report aggregated_report.json --audience c-level --format notion
```
**Capabilities**:
- Korean-language executive summary generation
- Key wins and concerns identification
- Period-over-period comparison narrative
- Priority action items ranked by impact
- Stakeholder-appropriate language (non-technical for C-level)
- Support for C-level, marketing team, and technical team audiences
- Markdown output format
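Audience-specific reporting can be sketched as a profile lookup that trims what each audience sees. The profile contents below are assumptions for illustration, not the shipped configuration:

```python
# Hypothetical audience profiles; the actual logic lives in executive_report.py.
AUDIENCE_PROFILES = {
    "c-level":   {"max_items": 3,  "technical_detail": False},
    "marketing": {"max_items": 5,  "technical_detail": False},
    "technical": {"max_items": 10, "technical_detail": True},
}

SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def select_issues(issues: list[dict], audience: str) -> list[dict]:
    """Rank issues by severity, then trim to the audience's attention budget."""
    profile = AUDIENCE_PROFILES[audience]
    ranked = sorted(issues, key=lambda i: SEVERITY_RANK.get(i.get("severity", "low"), 3))
    return ranked[: profile["max_items"]]
```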
## Ahrefs MCP Tools Used
| Tool | Purpose |
|------|---------|
| `site-explorer-metrics` | Fresh current organic metrics snapshot |
| `site-explorer-metrics-history` | Historical metrics for trend visualization |
## Output Format
```json
{
"domain": "example.com",
"report_date": "2025-01-15",
"overall_health": 72,
"health_trend": "improving",
"skills_included": [
{"skill_id": 11, "skill_name": "comprehensive-audit", "audit_date": "2025-01-14"},
{"skill_id": 25, "skill_name": "kpi-framework", "audit_date": "2025-01-15"}
],
"category_scores": {
"technical": 85,
"on_page": 70,
"performance": 60,
"content": 75,
"links": 68,
"local": 65,
"keywords": 72,
"competitor": 58
},
"top_issues": [
{"severity": "critical", "category": "performance", "description": "CLS exceeds threshold on mobile"},
{"severity": "high", "category": "technical", "description": "12 pages with noindex tag incorrectly set"}
],
"top_wins": [
{"category": "links", "description": "Domain Rating increased by 3 points"},
{"category": "keywords", "description": "15 new keywords entered top 10"}
],
"timeline": [
{"date": "2025-01-15", "skill": "kpi-framework", "health_score": 72},
{"date": "2025-01-14", "skill": "comprehensive-audit", "health_score": 70}
],
"audit_id": "DASH-20250115-001",
"timestamp": "2025-01-15T14:30:00"
}
```
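A downstream consumer of this JSON can, for example, pull out the critical items (a sketch using the sample data above):

```python
def critical_issues(report: dict) -> list[str]:
    """Descriptions of critical-severity items from an aggregated report."""
    return [i["description"] for i in report.get("top_issues", [])
            if i.get("severity") == "critical"]

# Sample matching the aggregated-report schema shown above.
sample = {"top_issues": [
    {"severity": "critical", "category": "performance",
     "description": "CLS exceeds threshold on mobile"},
    {"severity": "high", "category": "technical",
     "description": "12 pages with noindex tag incorrectly set"},
]}
print(critical_issues(sample))  # → ['CLS exceeds threshold on mobile']
```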
## Notion Output (Required)
**IMPORTANT**: All audit reports MUST be saved to the OurDigital SEO Audit Log database.
### Database Configuration
| Field | Value |
|-------|-------|
| Database ID | `2c8581e5-8a1e-8035-880b-e38cefc2f3ef` |
| URL | https://www.notion.so/dintelligence/2c8581e58a1e8035880be38cefc2f3ef |
### Required Properties
| Property | Type | Description |
|----------|------|-------------|
| Issue | Title | Report title (Korean + date) |
| Site | URL | Audited website URL |
| Category | Select | SEO Dashboard |
| Priority | Select | Based on overall health trend |
| Found Date | Date | Report date (YYYY-MM-DD) |
| Audit ID | Rich Text | Format: DASH-YYYYMMDD-NNN |
### Language Guidelines
- Report content in Korean (한국어)
- Keep technical English terms as-is (e.g., Health Score, Domain Rating, Core Web Vitals, Chart.js)
- URLs and code remain unchanged


@@ -0,0 +1,30 @@
---
name: seo-reporting-dashboard
description: |
SEO reporting dashboard and executive report generation. Aggregates data from all SEO skills
into stakeholder-ready reports and interactive HTML dashboards.
Triggers: SEO report, SEO dashboard, executive summary, 보고서, 대시보드, performance report.
allowed-tools:
- Bash
- Read
- Write
- WebFetch
- WebSearch
---
# SEO Reporting Dashboard
## Generate HTML Dashboard
```bash
python custom-skills/34-seo-reporting-dashboard/code/scripts/dashboard_generator.py --report [JSON] --output dashboard.html
```
## Generate Executive Report (Korean)
```bash
python custom-skills/34-seo-reporting-dashboard/code/scripts/executive_report.py --report [JSON] --audience c-level --output report.md
```
## Aggregate All Skill Outputs
```bash
python custom-skills/34-seo-reporting-dashboard/code/scripts/report_aggregator.py --domain [URL] --json
```


@@ -0,0 +1,169 @@
"""
Base Client - Shared async client utilities
===========================================
Purpose: Rate-limited async operations for API clients
Python: 3.10+
"""
import asyncio
import logging
import os
from asyncio import Semaphore
from datetime import datetime
from typing import Any, Callable, TypeVar
from dotenv import load_dotenv
from tenacity import (
retry,
stop_after_attempt,
wait_exponential,
retry_if_exception_type,
)
load_dotenv()
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(levelname)s - %(message)s",
)
T = TypeVar("T")
class RateLimiter:
"""Rate limiter using token bucket algorithm."""
def __init__(self, rate: float, per: float = 1.0):
self.rate = rate
self.per = per
self.tokens = rate
self.last_update = datetime.now()
self._lock = asyncio.Lock()
async def acquire(self) -> None:
async with self._lock:
now = datetime.now()
elapsed = (now - self.last_update).total_seconds()
self.tokens = min(self.rate, self.tokens + elapsed * (self.rate / self.per))
self.last_update = now
if self.tokens < 1:
wait_time = (1 - self.tokens) * (self.per / self.rate)
await asyncio.sleep(wait_time)
self.tokens = 0
else:
self.tokens -= 1
class BaseAsyncClient:
"""Base class for async API clients with rate limiting."""
def __init__(
self,
max_concurrent: int = 5,
requests_per_second: float = 3.0,
logger: logging.Logger | None = None,
):
self.semaphore = Semaphore(max_concurrent)
self.rate_limiter = RateLimiter(requests_per_second)
self.logger = logger or logging.getLogger(self.__class__.__name__)
self.stats = {
"requests": 0,
"success": 0,
"errors": 0,
"retries": 0,
}
@retry(
stop=stop_after_attempt(3),
wait=wait_exponential(multiplier=1, min=2, max=10),
retry=retry_if_exception_type(Exception),
)
async def _rate_limited_request(
self,
coro: Callable[[], Any],
) -> Any:
async with self.semaphore:
await self.rate_limiter.acquire()
self.stats["requests"] += 1
try:
result = await coro()
self.stats["success"] += 1
return result
except Exception as e:
self.stats["errors"] += 1
self.logger.error(f"Request failed: {e}")
raise
async def batch_requests(
self,
requests: list[Callable[[], Any]],
desc: str = "Processing",
) -> list[Any]:
try:
from tqdm.asyncio import tqdm
has_tqdm = True
except ImportError:
has_tqdm = False
async def execute(req: Callable) -> Any:
try:
return await self._rate_limited_request(req)
except Exception as e:
return {"error": str(e)}
tasks = [execute(req) for req in requests]
if has_tqdm:
results = []
for coro in tqdm.as_completed(tasks, total=len(tasks), desc=desc):
result = await coro
results.append(result)
return results
else:
return await asyncio.gather(*tasks, return_exceptions=True)
def print_stats(self) -> None:
self.logger.info("=" * 40)
self.logger.info("Request Statistics:")
self.logger.info(f" Total Requests: {self.stats['requests']}")
self.logger.info(f" Successful: {self.stats['success']}")
self.logger.info(f" Errors: {self.stats['errors']}")
self.logger.info("=" * 40)
class ConfigManager:
"""Manage API configuration and credentials."""
def __init__(self):
load_dotenv()
@property
def google_credentials_path(self) -> str | None:
seo_creds = os.path.expanduser("~/.credential/ourdigital-seo-agent.json")
if os.path.exists(seo_creds):
return seo_creds
return os.getenv("GOOGLE_APPLICATION_CREDENTIALS")
@property
def pagespeed_api_key(self) -> str | None:
return os.getenv("PAGESPEED_API_KEY")
@property
def notion_token(self) -> str | None:
return os.getenv("NOTION_TOKEN") or os.getenv("NOTION_API_KEY")
def validate_google_credentials(self) -> bool:
creds_path = self.google_credentials_path
if not creds_path:
return False
return os.path.exists(creds_path)
def get_required(self, key: str) -> str:
value = os.getenv(key)
if not value:
raise ValueError(f"Missing required environment variable: {key}")
return value
config = ConfigManager()


@@ -0,0 +1,745 @@
"""
Dashboard Generator - Interactive HTML SEO dashboard with Chart.js
==================================================================
Purpose: Generate a self-contained HTML dashboard from aggregated SEO
report data, with responsive charts for health scores, traffic
trends, keyword rankings, issue breakdowns, and competitor radar.
Python: 3.10+
Usage:
python dashboard_generator.py --report aggregated_report.json --output dashboard.html
python dashboard_generator.py --report aggregated_report.json --output dashboard.html --title "My SEO Dashboard"
"""
import argparse
import json
import logging
import sys
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Any
from jinja2 import Template
logger = logging.getLogger(__name__)
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(levelname)s - %(message)s",
)
# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------
@dataclass
class DashboardConfig:
"""Configuration for dashboard generation."""
title: str = "SEO Reporting Dashboard"
domain: str = ""
date_range: str = ""
theme: str = "light"
chart_options: dict[str, Any] = field(default_factory=dict)
# ---------------------------------------------------------------------------
# HTML template
# ---------------------------------------------------------------------------
DASHBOARD_TEMPLATE = """<!DOCTYPE html>
<html lang="ko">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>{{ title }} - {{ domain }}</title>
<script src="https://cdn.jsdelivr.net/npm/chart.js@4.4.0/dist/chart.umd.min.js"></script>
<style>
:root {
--bg-primary: #f8f9fa;
--bg-card: #ffffff;
--text-primary: #212529;
--text-secondary: #6c757d;
--border: #dee2e6;
--accent: #0d6efd;
--success: #198754;
--warning: #ffc107;
--danger: #dc3545;
}
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
background: var(--bg-primary);
color: var(--text-primary);
line-height: 1.6;
}
.header {
background: linear-gradient(135deg, #0d6efd 0%, #6610f2 100%);
color: white;
padding: 2rem;
text-align: center;
}
.header h1 { font-size: 1.8rem; margin-bottom: 0.5rem; }
.header .meta { opacity: 0.85; font-size: 0.9rem; }
.container {
max-width: 1400px;
margin: 0 auto;
padding: 1.5rem;
}
.grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(320px, 1fr));
gap: 1.5rem;
margin-bottom: 1.5rem;
}
.grid-full {
display: grid;
grid-template-columns: 1fr;
gap: 1.5rem;
margin-bottom: 1.5rem;
}
.card {
background: var(--bg-card);
border-radius: 12px;
padding: 1.5rem;
box-shadow: 0 2px 8px rgba(0,0,0,0.06);
border: 1px solid var(--border);
}
.card h2 {
font-size: 1.1rem;
color: var(--text-secondary);
margin-bottom: 1rem;
padding-bottom: 0.5rem;
border-bottom: 2px solid var(--border);
}
.health-score {
text-align: center;
padding: 2rem;
}
.health-score .score {
font-size: 4rem;
font-weight: 700;
line-height: 1;
}
.health-score .label {
font-size: 1rem;
color: var(--text-secondary);
margin-top: 0.5rem;
}
.health-score .trend {
font-size: 1.2rem;
margin-top: 0.5rem;
font-weight: 600;
}
.trend-improving { color: var(--success); }
.trend-stable { color: var(--warning); }
.trend-declining { color: var(--danger); }
.score-excellent { color: var(--success); }
.score-good { color: #20c997; }
.score-average { color: var(--warning); }
.score-poor { color: #fd7e14; }
.score-critical { color: var(--danger); }
.chart-container {
position: relative;
width: 100%;
height: 300px;
}
.issues-list { list-style: none; }
.issues-list li {
padding: 0.75rem;
border-bottom: 1px solid var(--border);
display: flex;
align-items: flex-start;
gap: 0.75rem;
}
.issues-list li:last-child { border-bottom: none; }
.severity-badge {
display: inline-block;
padding: 0.15rem 0.5rem;
border-radius: 4px;
font-size: 0.75rem;
font-weight: 600;
text-transform: uppercase;
white-space: nowrap;
}
.severity-critical { background: #f8d7da; color: #842029; }
.severity-high { background: #fff3cd; color: #664d03; }
.severity-medium { background: #cfe2ff; color: #084298; }
.severity-low { background: #d1e7dd; color: #0f5132; }
.timeline-table {
width: 100%;
border-collapse: collapse;
font-size: 0.9rem;
}
.timeline-table th {
text-align: left;
padding: 0.6rem;
border-bottom: 2px solid var(--border);
color: var(--text-secondary);
font-weight: 600;
}
.timeline-table td {
padding: 0.6rem;
border-bottom: 1px solid var(--border);
}
.footer {
text-align: center;
padding: 2rem;
color: var(--text-secondary);
font-size: 0.85rem;
}
@media (max-width: 768px) {
.grid { grid-template-columns: 1fr; }
.header h1 { font-size: 1.4rem; }
.health-score .score { font-size: 3rem; }
}
</style>
</head>
<body>
<div class="header">
<h1>{{ title }}</h1>
<div class="meta">{{ domain }} | {{ report_date }} | Audit ID: {{ audit_id }}</div>
</div>
<div class="container">
<!-- Health Score & Category Overview -->
<div class="grid">
<div class="card health-score">
<div class="score {{ score_class }}">{{ overall_health }}</div>
<div class="label">Overall Health Score</div>
<div class="trend trend-{{ health_trend }}">{{ trend_label }}</div>
<div class="chart-container" style="height: 200px; margin-top: 1rem;">
<canvas id="gaugeChart"></canvas>
</div>
</div>
<div class="card">
<h2>Category Scores</h2>
<div class="chart-container">
<canvas id="categoryChart"></canvas>
</div>
</div>
</div>
<!-- Traffic & Keywords -->
<div class="grid">
<div class="card">
<h2>Health Score Timeline</h2>
<div class="chart-container">
<canvas id="timelineChart"></canvas>
</div>
</div>
<div class="card">
<h2>Issue Distribution</h2>
<div class="chart-container">
<canvas id="issuesChart"></canvas>
</div>
</div>
</div>
<!-- Competitor Radar (if data available) -->
{% if has_competitor_data %}
<div class="grid">
<div class="card">
<h2>Competitive Comparison</h2>
<div class="chart-container">
<canvas id="radarChart"></canvas>
</div>
</div>
</div>
{% endif %}
<!-- Top Issues -->
<div class="grid-full">
<div class="card">
<h2>Top Issues ({{ issues_count }})</h2>
<ul class="issues-list">
{% for issue in top_issues %}
<li>
<span class="severity-badge severity-{{ issue.severity }}">{{ issue.severity }}</span>
<span>{{ issue.description }} <em style="color: var(--text-secondary);">({{ issue.category }})</em></span>
</li>
{% endfor %}
</ul>
</div>
</div>
<!-- Top Wins -->
{% if top_wins %}
<div class="grid-full">
<div class="card">
<h2>Top Wins ({{ wins_count }})</h2>
<ul class="issues-list">
{% for win in top_wins %}
<li>
<span class="severity-badge severity-low">WIN</span>
<span>{{ win.description }} <em style="color: var(--text-secondary);">({{ win.category }})</em></span>
</li>
{% endfor %}
</ul>
</div>
</div>
{% endif %}
<!-- Audit Timeline Table -->
<div class="grid-full">
<div class="card">
<h2>Audit History</h2>
<table class="timeline-table">
<thead>
<tr>
<th>Date</th>
<th>Skill</th>
<th>Category</th>
<th>Score</th>
<th>Issues</th>
</tr>
</thead>
<tbody>
{% for entry in timeline %}
<tr>
<td>{{ entry.date }}</td>
<td>{{ entry.skill }}</td>
<td>{{ entry.category }}</td>
<td>{{ entry.health_score }}</td>
<td>{{ entry.issues_count }}</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</div>
</div>
<div class="footer">
Generated by SEO Reporting Dashboard (Skill 34) | {{ timestamp }}
</div>
<script>
// --- Gauge Chart ---
const gaugeCtx = document.getElementById('gaugeChart').getContext('2d');
new Chart(gaugeCtx, {
type: 'doughnut',
data: {
datasets: [{
data: [{{ overall_health }}, {{ 100 - overall_health }}],
backgroundColor: ['{{ gauge_color }}', '#e9ecef'],
borderWidth: 0,
circumference: 180,
rotation: 270,
}]
},
options: {
responsive: true,
maintainAspectRatio: false,
cutout: '75%',
plugins: { legend: { display: false }, tooltip: { enabled: false } }
}
});
// --- Category Bar Chart ---
const catCtx = document.getElementById('categoryChart').getContext('2d');
new Chart(catCtx, {
type: 'bar',
data: {
labels: {{ category_labels | tojson }},
datasets: [{
label: 'Score',
data: {{ category_values | tojson }},
backgroundColor: {{ category_colors | tojson }},
borderRadius: 6,
borderSkipped: false,
}]
},
options: {
responsive: true,
maintainAspectRatio: false,
indexAxis: 'y',
scales: {
x: { min: 0, max: 100, grid: { display: false } },
y: { grid: { display: false } }
},
plugins: { legend: { display: false } }
}
});
// --- Timeline Line Chart ---
const timeCtx = document.getElementById('timelineChart').getContext('2d');
new Chart(timeCtx, {
type: 'line',
data: {
labels: {{ timeline_dates | tojson }},
datasets: [{
label: 'Health Score',
data: {{ timeline_scores | tojson }},
borderColor: '#0d6efd',
backgroundColor: 'rgba(13, 110, 253, 0.1)',
fill: true,
tension: 0.3,
pointRadius: 4,
pointBackgroundColor: '#0d6efd',
}]
},
options: {
responsive: true,
maintainAspectRatio: false,
scales: {
y: { min: 0, max: 100, grid: { color: '#f0f0f0' } },
x: { grid: { display: false } }
},
plugins: { legend: { display: false } }
}
});
// --- Issues Pie Chart ---
const issuesCtx = document.getElementById('issuesChart').getContext('2d');
new Chart(issuesCtx, {
type: 'pie',
data: {
labels: {{ issue_category_labels | tojson }},
datasets: [{
data: {{ issue_category_values | tojson }},
backgroundColor: [
'#dc3545', '#fd7e14', '#ffc107', '#198754',
'#0d6efd', '#6610f2', '#d63384', '#20c997',
'#0dcaf0', '#6c757d'
],
}]
},
options: {
responsive: true,
maintainAspectRatio: false,
plugins: {
legend: { position: 'right', labels: { boxWidth: 12, padding: 8 } }
}
}
});
{% if has_competitor_data %}
// --- Competitor Radar Chart ---
const radarCtx = document.getElementById('radarChart').getContext('2d');
new Chart(radarCtx, {
type: 'radar',
data: {
labels: {{ radar_labels | tojson }},
datasets: {{ radar_datasets | tojson }}
},
options: {
responsive: true,
maintainAspectRatio: false,
scales: {
r: { min: 0, max: 100, ticks: { stepSize: 20 } }
}
}
});
{% endif %}
</script>
</body>
</html>"""
# ---------------------------------------------------------------------------
# Generator
# ---------------------------------------------------------------------------
CATEGORY_KOREAN_LABELS: dict[str, str] = {
"technical": "기술 SEO",
"on_page": "온페이지",
"performance": "성능",
"content": "콘텐츠",
"links": "링크",
"local": "로컬 SEO",
"keywords": "키워드",
"competitor": "경쟁사",
"schema": "스키마",
"kpi": "KPI",
"search_console": "Search Console",
"ecommerce": "이커머스",
"international": "국제 SEO",
"ai_search": "AI 검색",
"entity_seo": "엔티티 SEO",
}
class DashboardGenerator:
"""Generate an interactive HTML dashboard from aggregated SEO report data."""
def __init__(self):
self.template = Template(DASHBOARD_TEMPLATE)
@staticmethod
def _score_class(score: float) -> str:
"""Return CSS class based on health score."""
if score >= 90:
return "score-excellent"
elif score >= 75:
return "score-good"
elif score >= 60:
return "score-average"
elif score >= 40:
return "score-poor"
else:
return "score-critical"
@staticmethod
def _gauge_color(score: float) -> str:
"""Return color hex for gauge chart."""
if score >= 90:
return "#198754"
elif score >= 75:
return "#20c997"
elif score >= 60:
return "#ffc107"
elif score >= 40:
return "#fd7e14"
else:
return "#dc3545"
@staticmethod
def _category_color(score: float) -> str:
"""Return color for category bar based on score."""
if score >= 80:
return "#198754"
elif score >= 60:
return "#0d6efd"
elif score >= 40:
return "#ffc107"
else:
return "#dc3545"
@staticmethod
def _trend_label(trend: str) -> str:
"""Return human-readable trend label in Korean."""
labels = {
"improving": "개선 중 ↑",
"stable": "안정 →",
"declining": "하락 중 ↓",
}
return labels.get(trend, trend.title())
def generate_health_gauge(self, score: float) -> dict[str, Any]:
"""Generate gauge chart data for health score."""
return {
"score": score,
"remainder": 100 - score,
"color": self._gauge_color(score),
"class": self._score_class(score),
}
def generate_traffic_chart(self, traffic_data: list[dict]) -> dict[str, Any]:
"""Generate line chart data for traffic trends."""
dates = [d.get("date", "") for d in traffic_data]
values = [d.get("traffic", 0) for d in traffic_data]
return {"labels": dates, "values": values}
def generate_keyword_chart(self, keyword_data: list[dict]) -> dict[str, Any]:
"""Generate bar chart data for keyword ranking distribution."""
labels = [d.get("range", "") for d in keyword_data]
values = [d.get("count", 0) for d in keyword_data]
return {"labels": labels, "values": values}
def generate_issues_chart(
self, issues_data: list[dict[str, Any]]
) -> dict[str, Any]:
"""Generate pie chart data for issue category distribution."""
category_counts: dict[str, int] = {}
for issue in issues_data:
cat = issue.get("category", "other")
category_counts[cat] = category_counts.get(cat, 0) + 1
sorted_cats = sorted(
category_counts.items(), key=lambda x: x[1], reverse=True
)
return {
"labels": [CATEGORY_KOREAN_LABELS.get(c[0], c[0]) for c in sorted_cats],
"values": [c[1] for c in sorted_cats],
}
def generate_competitor_radar(
self, competitor_data: dict[str, Any]
) -> dict[str, Any]:
"""Generate radar chart data for competitor comparison."""
labels = list(competitor_data.get("dimensions", []))
datasets = []
colors = [
"rgba(13, 110, 253, 0.5)",
"rgba(220, 53, 69, 0.5)",
"rgba(25, 135, 84, 0.5)",
]
border_colors = ["#0d6efd", "#dc3545", "#198754"]
for i, (domain, scores) in enumerate(
competitor_data.get("scores", {}).items()
):
datasets.append({
"label": domain,
"data": [scores.get(dim, 0) for dim in labels],
"backgroundColor": colors[i % len(colors)],
"borderColor": border_colors[i % len(border_colors)],
"borderWidth": 2,
})
return {"labels": labels, "datasets": datasets}
def render_html(
self,
report: dict[str, Any],
config: DashboardConfig,
) -> str:
"""Render the full HTML dashboard from aggregated report data."""
overall_health = report.get("overall_health", 0)
health_trend = report.get("health_trend", "stable")
# Category scores (with Korean labels)
cat_scores = report.get("category_scores", {})
category_labels = [
CATEGORY_KOREAN_LABELS.get(k, k) for k in cat_scores.keys()
]
category_values = list(cat_scores.values())
category_colors = [self._category_color(v) for v in category_values]
# Timeline
timeline = report.get("timeline", [])
timeline_dates = [e.get("date", "") for e in timeline]
timeline_scores = [e.get("health_score", 0) for e in timeline]
# Issues
top_issues = report.get("top_issues", [])
issues_chart = self.generate_issues_chart(top_issues)
# Wins
top_wins = report.get("top_wins", [])
# Competitor radar
has_competitor_data = False
radar_labels: list[str] = []
radar_datasets: list[dict] = []
raw_outputs = report.get("raw_outputs", [])
for output in raw_outputs:
if output.get("category") == "competitor":
has_competitor_data = True
comp_data = output.get("data", {})
if "comparison_matrix" in comp_data:
radar_result = self.generate_competitor_radar(
comp_data["comparison_matrix"]
)
radar_labels = radar_result["labels"]
radar_datasets = radar_result["datasets"]
break
context = {
"title": config.title,
"domain": config.domain or report.get("domain", ""),
"report_date": report.get("report_date", ""),
"audit_id": report.get("audit_id", ""),
"timestamp": report.get("timestamp", datetime.now().isoformat()),
"overall_health": overall_health,
"score_class": self._score_class(overall_health),
"health_trend": health_trend,
"trend_label": self._trend_label(health_trend),
"gauge_color": self._gauge_color(overall_health),
"category_labels": category_labels,
"category_values": category_values,
"category_colors": category_colors,
"timeline_dates": timeline_dates,
"timeline_scores": timeline_scores,
"issue_category_labels": issues_chart["labels"],
"issue_category_values": issues_chart["values"],
"top_issues": top_issues[:15],
"issues_count": len(top_issues),
"top_wins": top_wins[:10],
"wins_count": len(top_wins),
"timeline": timeline[:20],
"has_competitor_data": has_competitor_data,
"radar_labels": radar_labels,
"radar_datasets": radar_datasets,
}
return self.template.render(**context)
def save(self, html_content: str, output_path: str) -> None:
"""Save rendered HTML to a file."""
Path(output_path).write_text(html_content, encoding="utf-8")
logger.info(f"Dashboard saved to {output_path}")
def run(
self,
report_json: str,
output_path: str,
title: str = "SEO Reporting Dashboard",
) -> str:
"""Orchestrate dashboard generation from a report JSON file."""
# Load report data
report_path = Path(report_json)
if not report_path.exists():
raise FileNotFoundError(f"Report file not found: {report_json}")
report = json.loads(report_path.read_text(encoding="utf-8"))
logger.info(f"Loaded report: {report.get('domain', 'unknown')}")
# Configure
config = DashboardConfig(
title=title,
domain=report.get("domain", ""),
date_range=report.get("report_date", ""),
)
# Render
html = self.render_html(report, config)
logger.info(f"Rendered HTML dashboard ({len(html):,} bytes)")
# Save
self.save(html, output_path)
return output_path
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="SEO Dashboard Generator - Interactive HTML dashboard with Chart.js",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""\
Examples:
python dashboard_generator.py --report aggregated_report.json --output dashboard.html
python dashboard_generator.py --report aggregated_report.json --output dashboard.html --title "My Dashboard"
""",
)
parser.add_argument(
"--report",
required=True,
help="Path to aggregated report JSON file (from report_aggregator.py)",
)
parser.add_argument(
"--output",
required=True,
help="Output HTML file path",
)
parser.add_argument(
"--title",
type=str,
default="SEO Reporting Dashboard",
help="Dashboard title (default: 'SEO Reporting Dashboard')",
)
return parser.parse_args(argv)
def main() -> None:
args = parse_args()
generator = DashboardGenerator()
output = generator.run(
report_json=args.report,
output_path=args.output,
title=args.title,
)
logger.info(f"Dashboard generated: {output}")
if __name__ == "__main__":
main()
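A minimal sketch of the aggregated-report JSON shape that `DashboardGenerator.run()` consumes. The field names mirror the keys read in `render_html()` (`overall_health`, `category_scores`, `top_issues`, …); the values are illustrative, not real audit data.

```python
# Build a minimal aggregated-report JSON of the shape dashboard_generator.py
# expects from report_aggregator.py. All values below are illustrative.
import json
from pathlib import Path

report = {
    "audit_id": "DASH-20260214-001",
    "domain": "example.com",
    "report_date": "2026-02-14",
    "overall_health": 72.5,
    "health_trend": "improving",
    "category_scores": {"technical": 81, "on_page": 64, "performance": 58},
    "top_issues": [
        {"severity": "high", "category": "performance",
         "description": "LCP exceeds threshold"},
    ],
    "top_wins": [
        {"category": "technical", "description": "Sitemap submitted and indexed"},
    ],
    "timeline": [],
}

path = Path("aggregated_report.json")
path.write_text(json.dumps(report, ensure_ascii=False, indent=2), encoding="utf-8")

# The dashboard would then be generated with:
#   python dashboard_generator.py --report aggregated_report.json --output dashboard.html
loaded = json.loads(path.read_text(encoding="utf-8"))
print(loaded["overall_health"])  # → 72.5
```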


@@ -0,0 +1,622 @@
"""
Executive Report - Korean-language executive summary generation
===============================================================
Purpose: Generate stakeholder-ready executive summaries in Korean from
aggregated SEO report data, with audience-specific detail levels
for C-level, marketing, and technical teams.
Python: 3.10+
Usage:
python executive_report.py --report aggregated_report.json --audience c-level --output report.md
python executive_report.py --report aggregated_report.json --audience marketing --output report.md
python executive_report.py --report aggregated_report.json --audience technical --output report.md
python executive_report.py --report aggregated_report.json --audience c-level --format notion
"""
import argparse
import json
import logging
import sys
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Any
logger = logging.getLogger(__name__)
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s - %(levelname)s - %(message)s",
)
# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------
@dataclass
class AudienceConfig:
"""Configuration for report audience targeting."""
level: str = "c-level" # c-level | marketing | technical
detail_depth: str = "summary" # summary | moderate | detailed
include_recommendations: bool = True
include_technical_details: bool = False
max_issues: int = 5
max_recommendations: int = 5
@classmethod
def from_level(cls, level: str) -> "AudienceConfig":
"""Create config preset from audience level."""
presets = {
"c-level": cls(
level="c-level",
detail_depth="summary",
include_recommendations=True,
include_technical_details=False,
max_issues=5,
max_recommendations=3,
),
"marketing": cls(
level="marketing",
detail_depth="moderate",
include_recommendations=True,
include_technical_details=False,
max_issues=10,
max_recommendations=5,
),
"technical": cls(
level="technical",
detail_depth="detailed",
include_recommendations=True,
include_technical_details=True,
max_issues=20,
max_recommendations=10,
),
}
return presets.get(level, presets["c-level"])
@dataclass
class ExecutiveSummary:
"""Generated executive summary content."""
title: str = ""
domain: str = ""
period: str = ""
health_score: float = 0.0
health_trend: str = "stable"
key_wins: list[str] = field(default_factory=list)
key_concerns: list[str] = field(default_factory=list)
recommendations: list[str] = field(default_factory=list)
narrative: str = ""
audience: str = "c-level"
category_summary: dict[str, str] = field(default_factory=dict)
audit_id: str = ""
timestamp: str = ""
# ---------------------------------------------------------------------------
# Korean text templates
# ---------------------------------------------------------------------------
HEALTH_LABELS_KR = {
"excellent": "우수",
"good": "양호",
"average": "보통",
"poor": "미흡",
"critical": "위험",
}
TREND_LABELS_KR = {
"improving": "개선 중",
"stable": "안정",
"declining": "하락 중",
}
CATEGORY_LABELS_KR = {
"technical": "기술 SEO",
"on_page": "온페이지 SEO",
"performance": "성능 (Core Web Vitals)",
"content": "콘텐츠 전략",
"links": "링크 프로필",
"local": "로컬 SEO",
"keywords": "키워드 전략",
"competitor": "경쟁 분석",
"schema": "스키마/구조화 데이터",
"kpi": "KPI 프레임워크",
"search_console": "Search Console",
"ecommerce": "이커머스 SEO",
"international": "국제 SEO",
"ai_search": "AI 검색 가시성",
"entity_seo": "Knowledge Graph",
}
# Common English issue descriptions -> Korean translations
ISSUE_TRANSLATIONS_KR: dict[str, str] = {
"missing meta description": "메타 설명(meta description) 누락",
"missing title tag": "타이틀 태그 누락",
"duplicate title": "중복 타이틀 태그",
"duplicate meta description": "중복 메타 설명",
"missing h1": "H1 태그 누락",
"multiple h1 tags": "H1 태그 다수 사용",
"missing alt text": "이미지 alt 텍스트 누락",
"broken links": "깨진 링크 발견",
"redirect chain": "리다이렉트 체인 발견",
"mixed content": "Mixed Content (HTTP/HTTPS 혼합) 발견",
"missing canonical": "Canonical 태그 누락",
"noindex on important page": "중요 페이지에 noindex 설정됨",
"slow page load": "페이지 로딩 속도 저하",
"cls exceeds threshold": "CLS(누적 레이아웃 변경) 임계값 초과",
"lcp exceeds threshold": "LCP(최대 콘텐츠풀 페인트) 임계값 초과",
"missing sitemap": "사이트맵 누락",
"robots.txt blocking important pages": "robots.txt에서 중요 페이지 차단 중",
"missing schema markup": "스키마 마크업 누락",
"missing hreflang": "hreflang 태그 누락",
"thin content": "콘텐츠 부족 (Thin Content)",
"orphan pages": "고아 페이지 발견 (내부 링크 없음)",
}
def _translate_description(desc: str) -> str:
"""Translate common English issue descriptions to Korean."""
desc_lower = desc.lower().strip()
# Check exact match
if desc_lower in ISSUE_TRANSLATIONS_KR:
return ISSUE_TRANSLATIONS_KR[desc_lower]
# Check partial match (case-insensitive replace)
for eng, kor in ISSUE_TRANSLATIONS_KR.items():
if eng in desc_lower:
# Find the original-case substring and replace it
idx = desc_lower.index(eng)
return desc[:idx] + kor + desc[idx + len(eng):]
return desc
AUDIENCE_INTRO_KR = {
"c-level": "본 보고서는 SEO 성과의 핵심 지표와 비즈니스 영향을 요약한 경영진용 보고서입니다.",
"marketing": "본 보고서는 SEO 전략 실행 현황과 마케팅 성과를 분석한 마케팅팀 보고서입니다.",
"technical": "본 보고서는 SEO 기술 진단 결과와 상세 개선 사항을 포함한 기술팀 보고서입니다.",
}
# ---------------------------------------------------------------------------
# Generator
# ---------------------------------------------------------------------------
class ExecutiveReportGenerator:
"""Generate Korean-language executive reports from aggregated SEO data."""
@staticmethod
def _health_grade(score: float) -> str:
"""Return health grade string."""
if score >= 90:
return "excellent"
elif score >= 75:
return "good"
elif score >= 60:
return "average"
elif score >= 40:
return "poor"
else:
return "critical"
def generate_narrative(
self,
report: dict[str, Any],
audience: AudienceConfig,
) -> str:
"""Generate Korean narrative text for the executive summary."""
domain = report.get("domain", "")
health = report.get("overall_health", 0)
trend = report.get("health_trend", "stable")
grade = self._health_grade(health)
grade_kr = HEALTH_LABELS_KR.get(grade, grade)
trend_kr = TREND_LABELS_KR.get(trend, trend)
intro = AUDIENCE_INTRO_KR.get(audience.level, AUDIENCE_INTRO_KR["c-level"])
# Build narrative paragraphs
paragraphs = []
# Opening
paragraphs.append(intro)
# Health overview
paragraphs.append(
f"{domain}의 전체 SEO Health Score는 **{health}/100** ({grade_kr})이며, "
f"현재 추세는 **{trend_kr}** 상태입니다."
)
# Category highlights
cat_scores = report.get("category_scores", {})
if cat_scores:
strong_cats = [
CATEGORY_LABELS_KR.get(k, k)
for k, v in cat_scores.items()
if v >= 75
]
weak_cats = [
CATEGORY_LABELS_KR.get(k, k)
for k, v in cat_scores.items()
if v < 50
]
if strong_cats:
paragraphs.append(
f"강점 영역: {', '.join(strong_cats[:3])} 등이 양호한 성과를 보이고 있습니다."
)
if weak_cats:
paragraphs.append(
f"개선 필요 영역: {', '.join(weak_cats[:3])} 등에서 집중적인 개선이 필요합니다."
)
# Skills coverage
skills = report.get("skills_included", [])
if skills:
paragraphs.append(
f"{len(skills)}개의 SEO 진단 도구를 통해 종합 분석을 수행하였습니다."
)
# C-level specific: business impact focus
if audience.level == "c-level":
if trend == "improving":
paragraphs.append(
"전반적인 SEO 성과가 개선 추세에 있으며, 현재 전략을 유지하면서 "
"핵심 약점 영역에 대한 집중 투자가 권장됩니다."
)
elif trend == "declining":
paragraphs.append(
"SEO 성과가 하락 추세를 보이고 있어, 원인 분석과 함께 "
"긴급한 대응 조치가 필요합니다."
)
else:
paragraphs.append(
"SEO 성과가 안정적으로 유지되고 있으나, 경쟁 환경 변화에 대비하여 "
"지속적인 모니터링과 선제적 대응이 필요합니다."
)
# Marketing specific: channel and content focus
elif audience.level == "marketing":
top_issues = report.get("top_issues", [])
content_issues = [
i for i in top_issues if i.get("category") in ("content", "keywords")
]
if content_issues:
paragraphs.append(
f"콘텐츠/키워드 관련 이슈가 {len(content_issues)}건 발견되었으며, "
f"콘텐츠 전략 수정이 권장됩니다."
)
# Technical specific: detailed breakdown
elif audience.level == "technical":
for cat, score in sorted(
cat_scores.items(), key=lambda x: x[1]
):
cat_kr = CATEGORY_LABELS_KR.get(cat, cat)
paragraphs.append(f"- {cat_kr}: {score}/100")
return "\n\n".join(paragraphs)
def format_wins(self, report: dict[str, Any]) -> list[str]:
"""Extract and format key wins in Korean."""
wins = report.get("top_wins", [])
formatted: list[str] = []
for win in wins:
desc = _translate_description(win.get("description", ""))
cat = win.get("category", "")
cat_kr = CATEGORY_LABELS_KR.get(cat, cat)
if desc:
formatted.append(f"[{cat_kr}] {desc}")
return formatted
def format_concerns(self, report: dict[str, Any]) -> list[str]:
"""Extract and format key concerns in Korean."""
issues = report.get("top_issues", [])
formatted: list[str] = []
severity_kr = {
"critical": "긴급",
"high": "높음",
"medium": "보통",
"low": "낮음",
}
for issue in issues:
desc = _translate_description(issue.get("description", ""))
severity = issue.get("severity", "medium")
cat = issue.get("category", "")
sev_kr = severity_kr.get(severity, severity)
cat_kr = CATEGORY_LABELS_KR.get(cat, cat)
if desc:
formatted.append(f"[{sev_kr}] [{cat_kr}] {desc}")
return formatted
def generate_recommendations(
self,
report: dict[str, Any],
audience: AudienceConfig,
) -> list[str]:
"""Generate prioritized action items ranked by impact."""
recommendations: list[str] = []
cat_scores = report.get("category_scores", {})
top_issues = report.get("top_issues", [])
# Priority 1: Critical issues
critical = [i for i in top_issues if i.get("severity") == "critical"]
for issue in critical[:3]:
cat_kr = CATEGORY_LABELS_KR.get(issue.get("category", ""), "")
desc = _translate_description(issue.get("description", ""))
if audience.level == "c-level":
recommendations.append(
f"[긴급] {cat_kr} 영역 긴급 조치 필요 - {desc}"
)
else:
recommendations.append(
f"[긴급] {desc} (영역: {cat_kr})"
)
# Priority 2: Weak categories
weak_cats = sorted(
[(k, v) for k, v in cat_scores.items() if v < 50],
key=lambda x: x[1],
)
for cat, score in weak_cats[:3]:
cat_kr = CATEGORY_LABELS_KR.get(cat, cat)
if audience.level == "c-level":
recommendations.append(
f"[개선] {cat_kr} 점수 {score}/100 - 전략적 투자 권장"
)
elif audience.level == "marketing":
recommendations.append(
f"[개선] {cat_kr} ({score}/100) - 캠페인 전략 재검토 필요"
)
else:
recommendations.append(
f"[개선] {cat_kr} ({score}/100) - 상세 진단 및 기술적 개선 필요"
)
# Priority 3: Maintenance for good categories
strong_cats = [
(k, v) for k, v in cat_scores.items() if v >= 75
]
if strong_cats:
cats_kr = ", ".join(
CATEGORY_LABELS_KR.get(k, k) for k, _ in strong_cats[:3]
)
recommendations.append(
f"[유지] {cats_kr} - 현재 수준 유지 및 모니터링 지속"
)
# Audience-specific recommendations
if audience.level == "c-level":
health = report.get("overall_health", 0)
if health < 60:
recommendations.append(
"[전략] SEO 개선을 위한 전문 인력 또는 외부 에이전시 투입 검토"
)
elif audience.level == "marketing":
recommendations.append(
"[실행] 다음 분기 SEO 개선 로드맵 수립 및 KPI 설정"
)
elif audience.level == "technical":
recommendations.append(
"[실행] 기술 부채 해소 스프린트 계획 수립"
)
return recommendations[:audience.max_recommendations]
def render_markdown(self, summary: ExecutiveSummary) -> str:
"""Render executive summary as markdown document."""
lines: list[str] = []
# Title
lines.append(f"# {summary.title}")
lines.append("")
# Meta
audience_kr = {
"c-level": "경영진",
"marketing": "마케팅팀",
"technical": "기술팀",
}
lines.append(f"**대상**: {audience_kr.get(summary.audience, summary.audience)}")
lines.append(f"**도메인**: {summary.domain}")
lines.append(f"**보고 일자**: {summary.period}")
lines.append(f"**Audit ID**: {summary.audit_id}")
lines.append("")
# Health Score
grade = self._health_grade(summary.health_score)
grade_kr = HEALTH_LABELS_KR.get(grade, grade)
trend_kr = TREND_LABELS_KR.get(summary.health_trend, summary.health_trend)
lines.append("## Health Score")
lines.append("")
        lines.append("| 지표 | 값 |")
        lines.append("|------|-----|")
lines.append(f"| Overall Score | **{summary.health_score}/100** |")
lines.append(f"| 등급 | {grade_kr} |")
lines.append(f"| 추세 | {trend_kr} |")
lines.append("")
# Category summary
if summary.category_summary:
lines.append("## 영역별 점수")
lines.append("")
lines.append("| 영역 | 점수 |")
lines.append("|------|------|")
for cat, score_str in summary.category_summary.items():
cat_kr = CATEGORY_LABELS_KR.get(cat, cat)
lines.append(f"| {cat_kr} | {score_str} |")
lines.append("")
# Narrative
lines.append("## 종합 분석")
lines.append("")
lines.append(summary.narrative)
lines.append("")
# Key wins
if summary.key_wins:
lines.append("## 주요 성과")
lines.append("")
for win in summary.key_wins:
lines.append(f"- {win}")
lines.append("")
# Key concerns
if summary.key_concerns:
lines.append("## 주요 이슈")
lines.append("")
for concern in summary.key_concerns:
lines.append(f"- {concern}")
lines.append("")
# Recommendations
if summary.recommendations:
lines.append("## 권장 조치 사항")
lines.append("")
for i, rec in enumerate(summary.recommendations, 1):
lines.append(f"{i}. {rec}")
lines.append("")
# Footer
lines.append("---")
lines.append(
f"*이 보고서는 SEO Reporting Dashboard (Skill 34)에 의해 "
f"{summary.timestamp}에 자동 생성되었습니다.*"
)
return "\n".join(lines)
def run(
self,
report_json: str,
audience_level: str = "c-level",
output_path: str | None = None,
output_format: str = "markdown",
) -> str:
"""Orchestrate executive report generation."""
# Load report
report_path = Path(report_json)
if not report_path.exists():
raise FileNotFoundError(f"Report file not found: {report_json}")
report = json.loads(report_path.read_text(encoding="utf-8"))
logger.info(f"Loaded report: {report.get('domain', 'unknown')}")
# Configure audience
audience = AudienceConfig.from_level(audience_level)
logger.info(f"Audience: {audience.level} (depth: {audience.detail_depth})")
# Build summary
domain = report.get("domain", "")
summary = ExecutiveSummary(
title=f"SEO 성과 보고서 - {domain}",
domain=domain,
period=report.get("report_date", ""),
health_score=report.get("overall_health", 0),
health_trend=report.get("health_trend", "stable"),
audit_id=report.get("audit_id", ""),
audience=audience.level,
timestamp=datetime.now().isoformat(),
)
# Category summary
cat_scores = report.get("category_scores", {})
summary.category_summary = {
cat: f"{score}/100"
for cat, score in sorted(
cat_scores.items(), key=lambda x: x[1], reverse=True
)
}
# Generate content
summary.narrative = self.generate_narrative(report, audience)
summary.key_wins = self.format_wins(report)[:audience.max_issues]
summary.key_concerns = self.format_concerns(report)[:audience.max_issues]
summary.recommendations = self.generate_recommendations(report, audience)
# Render
if output_format == "markdown":
content = self.render_markdown(summary)
elif output_format == "notion":
# For Notion, we output markdown that can be pasted into Notion
content = self.render_markdown(summary)
logger.info(
"Notion format: use MCP tools to push this markdown to Notion "
f"database {report.get('audit_id', 'DASH-YYYYMMDD-NNN')}"
)
else:
content = self.render_markdown(summary)
# Save or print
if output_path:
Path(output_path).write_text(content, encoding="utf-8")
logger.info(f"Executive report saved to {output_path}")
else:
print(content)
return content
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="SEO Executive Report - Korean-language executive summary generator",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""\
Examples:
python executive_report.py --report aggregated_report.json --audience c-level --output report.md
python executive_report.py --report aggregated_report.json --audience marketing --output report.md
python executive_report.py --report aggregated_report.json --audience technical --format notion
""",
)
parser.add_argument(
"--report",
required=True,
help="Path to aggregated report JSON file (from report_aggregator.py)",
)
parser.add_argument(
"--audience",
choices=["c-level", "marketing", "technical"],
default="c-level",
help="Target audience level (default: c-level)",
)
parser.add_argument(
"--output",
type=str,
default=None,
help="Output file path (prints to stdout if omitted)",
)
parser.add_argument(
"--format",
choices=["markdown", "notion"],
default="markdown",
dest="output_format",
help="Output format (default: markdown)",
)
return parser.parse_args(argv)
def main() -> None:
args = parse_args()
generator = ExecutiveReportGenerator()
generator.run(
report_json=args.report,
audience_level=args.audience,
output_path=args.output,
output_format=args.output_format,
)
if __name__ == "__main__":
main()
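The partial-match behavior of `_translate_description()` is easy to miss: an exact (case-insensitive) match returns the Korean string directly, while a partial match replaces only the known English phrase and preserves the surrounding text. A self-contained sketch of that lookup strategy, with a two-entry table standing in for `ISSUE_TRANSLATIONS_KR`:

```python
# Sketch of the exact-then-partial translation lookup used above.
# TRANSLATIONS is a stand-in for ISSUE_TRANSLATIONS_KR.
TRANSLATIONS = {
    "missing h1": "H1 태그 누락",
    "broken links": "깨진 링크 발견",
}

def translate(desc: str) -> str:
    low = desc.lower().strip()
    if low in TRANSLATIONS:          # exact match wins
        return TRANSLATIONS[low]
    for eng, kor in TRANSLATIONS.items():
        if eng in low:               # partial match: splice in the Korean phrase
            idx = low.index(eng)
            return desc[:idx] + kor + desc[idx + len(eng):]
    return desc                      # unknown descriptions pass through unchanged

print(translate("Missing H1"))                # → H1 태그 누락
print(translate("12 broken links on /blog"))  # → 12 깨진 링크 발견 on /blog
```

Note that the loop returns on the first matching entry, so for overlapping phrases the dict's insertion order decides which translation applies.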


@@ -0,0 +1,744 @@
"""
Report Aggregator - Collect and normalize outputs from all SEO skills
=====================================================================
Purpose: Scan for recent audit outputs from skills 11-33, normalize data
formats, merge findings by domain/date, compute cross-skill health
scores, and identify top-priority issues across all audits.
Python: 3.10+
Usage:
python report_aggregator.py --domain https://example.com --json
python report_aggregator.py --domain https://example.com --output-dir ./audit_outputs --json
python report_aggregator.py --domain https://example.com --from 2025-01-01 --to 2025-03-31 --json
python report_aggregator.py --domain https://example.com --json --output report.json
"""
import argparse
import asyncio
import json
import logging
import os
import sys
from dataclasses import dataclass, field, asdict
from datetime import datetime, date
from pathlib import Path
from typing import Any
from urllib.parse import urlparse
from base_client import BaseAsyncClient, config
logger = logging.getLogger(__name__)
# ---------------------------------------------------------------------------
# Skill registry
# ---------------------------------------------------------------------------
SKILL_REGISTRY = {
11: {"name": "comprehensive-audit", "category": "comprehensive", "weight": 1.0},
12: {"name": "technical-audit", "category": "technical", "weight": 0.20},
13: {"name": "on-page-audit", "category": "on_page", "weight": 0.20},
14: {"name": "core-web-vitals", "category": "performance", "weight": 0.25},
15: {"name": "search-console", "category": "search_console", "weight": 0.10},
16: {"name": "schema-validator", "category": "schema", "weight": 0.15},
17: {"name": "schema-generator", "category": "schema", "weight": 0.10},
18: {"name": "local-audit", "category": "local", "weight": 0.10},
19: {"name": "keyword-strategy", "category": "keywords", "weight": 0.15},
20: {"name": "serp-analysis", "category": "keywords", "weight": 0.10},
21: {"name": "position-tracking", "category": "keywords", "weight": 0.15},
22: {"name": "link-building", "category": "links", "weight": 0.15},
23: {"name": "content-strategy", "category": "content", "weight": 0.15},
24: {"name": "ecommerce-seo", "category": "ecommerce", "weight": 0.10},
25: {"name": "kpi-framework", "category": "kpi", "weight": 0.20},
26: {"name": "international-seo", "category": "international", "weight": 0.10},
27: {"name": "ai-visibility", "category": "ai_search", "weight": 0.10},
28: {"name": "knowledge-graph", "category": "entity_seo", "weight": 0.10},
31: {"name": "competitor-intel", "category": "competitor", "weight": 0.15},
32: {"name": "crawl-budget", "category": "technical", "weight": 0.10},
    33: {"name": "site-migration", "category": "technical", "weight": 0.10},
}
CATEGORY_WEIGHTS = {
"technical": 0.20,
"on_page": 0.15,
"performance": 0.15,
"content": 0.10,
"links": 0.10,
"local": 0.05,
"keywords": 0.10,
"competitor": 0.05,
"schema": 0.05,
"kpi": 0.05,
}
# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------
@dataclass
class SkillOutput:
"""Normalized output from a single SEO skill."""
skill_id: int = 0
skill_name: str = ""
domain: str = ""
audit_date: str = ""
category: str = ""
data: dict[str, Any] = field(default_factory=dict)
health_score: float = 0.0
issues: list[dict[str, Any]] = field(default_factory=list)
wins: list[dict[str, Any]] = field(default_factory=list)
source_file: str = ""
@dataclass
class AggregatedReport:
"""Full aggregated report from all SEO skill outputs."""
domain: str = ""
report_date: str = ""
skills_included: list[dict[str, Any]] = field(default_factory=list)
overall_health: float = 0.0
health_trend: str = "stable"
category_scores: dict[str, float] = field(default_factory=dict)
top_issues: list[dict[str, Any]] = field(default_factory=list)
top_wins: list[dict[str, Any]] = field(default_factory=list)
timeline: list[dict[str, Any]] = field(default_factory=list)
raw_outputs: list[dict[str, Any]] = field(default_factory=list)
audit_id: str = ""
timestamp: str = ""
errors: list[str] = field(default_factory=list)
# ---------------------------------------------------------------------------
# Aggregator
# ---------------------------------------------------------------------------
class ReportAggregator(BaseAsyncClient):
"""Aggregate outputs from all SEO skills into unified reports."""
NOTION_DB_ID = "2c8581e5-8a1e-8035-880b-e38cefc2f3ef"
def __init__(self):
super().__init__(max_concurrent=5, requests_per_second=2.0)
@staticmethod
def _extract_domain(url: str) -> str:
"""Extract bare domain from URL or return as-is if already bare."""
if "://" in url:
parsed = urlparse(url)
return parsed.netloc.lower().replace("www.", "")
return url.lower().replace("www.", "")
@staticmethod
def _generate_audit_id() -> str:
"""Generate audit ID in DASH-YYYYMMDD-NNN format."""
now = datetime.now()
return f"DASH-{now.strftime('%Y%m%d')}-001"
def scan_local_outputs(
self,
output_dir: str,
domain: str | None = None,
date_from: str | None = None,
date_to: str | None = None,
) -> list[SkillOutput]:
"""Find JSON output files from other SEO skills in a directory.
Scans for files matching patterns from skills 11-33 and parses
them into normalized SkillOutput objects.
"""
outputs: list[SkillOutput] = []
output_path = Path(output_dir)
if not output_path.exists():
self.logger.warning(f"Output directory not found: {output_dir}")
return outputs
# Scan for JSON files matching skill output patterns
json_files = list(output_path.rglob("*.json"))
self.logger.info(f"Found {len(json_files)} JSON files in {output_dir}")
for json_file in json_files:
try:
data = json.loads(json_file.read_text(encoding="utf-8"))
# Attempt to identify which skill produced this output
skill_output = self._identify_and_parse(data, str(json_file))
if skill_output is None:
continue
# Filter by domain if specified (supports subdomains)
if domain:
target_domain = self._extract_domain(domain)
if skill_output.domain:
file_domain = skill_output.domain
# Match exact domain OR subdomains (e.g., blog.example.com matches example.com)
if file_domain != target_domain and not file_domain.endswith("." + target_domain):
continue
# Filter by date range
if date_from and skill_output.audit_date < date_from:
continue
if date_to and skill_output.audit_date > date_to:
continue
outputs.append(skill_output)
self.logger.info(
f"Parsed output from skill {skill_output.skill_id} "
f"({skill_output.skill_name}): {json_file.name}"
)
except (json.JSONDecodeError, KeyError, TypeError) as e:
self.logger.warning(f"Could not parse {json_file}: {e}")
self.logger.info(f"Successfully parsed {len(outputs)} skill outputs")
return outputs
def _identify_and_parse(
self, data: dict[str, Any], source_file: str
) -> SkillOutput | None:
"""Identify which skill produced the output and parse it."""
skill_output = SkillOutput(source_file=source_file)
# Strategy 1: Parse skill from audit_id prefix (e.g., KPI-20250115-001)
audit_id = data.get("audit_id", "")
if isinstance(audit_id, str):
prefix_map = {
"COMP": 11, "TECH": 12, "PAGE": 13, "CWV": 14,
"GSC": 15, "SCHEMA": 16, "LOCAL": 18, "KW": 19,
"SERP": 20, "RANK": 21, "LINK": 22, "CONTENT": 23,
"ECOM": 24, "KPI": 25, "INTL": 26, "AI": 27,
"KG": 28, "COMPET": 31, "CRAWL": 32, "MIGR": 33,
"DASH": None, # Skip self-referencing dashboard reports
}
            # Check longer prefixes first so e.g. COMPET- is not swallowed by COMP-
            for prefix, skill_id in sorted(prefix_map.items(), key=lambda kv: -len(kv[0])):
if audit_id.startswith(prefix):
if skill_id is None:
return None # Skip aggregated reports
skill_info = SKILL_REGISTRY.get(skill_id, {})
skill_output.skill_id = skill_id
skill_output.skill_name = skill_info.get("name", "unknown")
skill_output.category = skill_info.get("category", "unknown")
break
# Strategy 2: Fallback to audit_type field (used by our-seo-agent outputs)
if not skill_output.skill_id:
audit_type = data.get("audit_type", "")
if isinstance(audit_type, str) and audit_type:
type_map = {
"comprehensive": 11, "technical": 12, "onpage": 13,
"cwv": 14, "core-web-vitals": 14,
"gsc": 15, "search-console": 15,
"schema": 16, "local": 18,
"keyword": 19, "serp": 20, "position": 21,
"link": 22, "backlink": 22,
"content": 23, "ecommerce": 24, "kpi": 25,
"international": 26, "hreflang": 26,
"ai-visibility": 27, "knowledge-graph": 28, "entity": 28,
"competitor": 31, "crawl-budget": 32, "crawl": 32,
"migration": 33,
}
for type_key, skill_id in type_map.items():
if audit_type.lower() == type_key:
skill_info = SKILL_REGISTRY.get(skill_id, {})
skill_output.skill_id = skill_id
skill_output.skill_name = skill_info.get("name", "unknown")
skill_output.category = skill_info.get("category", "unknown")
break
# Extract domain
for key in ("url", "target", "domain", "site"):
if key in data:
skill_output.domain = self._extract_domain(str(data[key]))
break
        # Extract health score — check top-level first, then nested data dict
        score_found = False
        for key in ("health_score", "overall_health", "score"):
            if key in data:
                try:
                    skill_output.health_score = float(data[key])
                    score_found = True
                    break
                except (ValueError, TypeError):
                    continue  # value not numeric; try the next candidate key
        if not score_found:
            nested = data.get("data", {})
            if isinstance(nested, dict):
                for key in ("technical_score", "onpage_score", "schema_score",
                            "local_seo_score", "cwv_score", "performance_score",
                            "content_score", "link_score", "keyword_score",
                            "competitor_score", "efficiency_score",
                            "health_score", "overall_score", "score"):
                    val = nested.get(key)
                    if val is None:
                        continue
                    try:
                        skill_output.health_score = float(val)
                        break
                    except (ValueError, TypeError):
                        continue  # value not numeric; try the next candidate key
# Extract audit date
for key in ("audit_date", "report_date", "timestamp", "found_date"):
if key in data:
date_str = str(data[key])[:10]
skill_output.audit_date = date_str
break
if not skill_output.audit_date:
skill_output.audit_date = date.today().isoformat()
# Extract issues
issues_raw = data.get("issues", data.get("critical_issues", []))
if isinstance(issues_raw, list):
for issue in issues_raw:
if isinstance(issue, dict):
skill_output.issues.append(issue)
elif isinstance(issue, str):
skill_output.issues.append({"description": issue, "severity": "medium"})
# Extract wins / recommendations
wins_raw = data.get("wins", data.get("top_wins", []))
if isinstance(wins_raw, list):
for win in wins_raw:
if isinstance(win, dict):
skill_output.wins.append(win)
elif isinstance(win, str):
skill_output.wins.append({"description": win})
# Store full data
skill_output.data = data
# Skip if no useful data was extracted
if not skill_output.skill_id and not skill_output.domain:
return None
return skill_output
async def query_notion_audits(
self,
domain: str,
date_from: str | None = None,
date_to: str | None = None,
) -> list[SkillOutput]:
"""Fetch past audit entries from Notion SEO Audit Log database.
In production, this uses the Notion MCP tools to query the database.
Returns normalized SkillOutput objects.
"""
outputs: list[SkillOutput] = []
self.logger.info(
f"Querying Notion audits for {domain} "
f"(db: {self.NOTION_DB_ID}, from={date_from}, to={date_to})"
)
# In production, this would call:
# mcp__notion__query-database with filters for Site URL and Found Date
# For now, return empty list as placeholder
self.logger.info(
"Notion query is a placeholder; use MCP tools in Claude Desktop "
"or manually provide JSON files via --output-dir."
)
return outputs
def normalize_output(self, skill_output: SkillOutput) -> dict[str, Any]:
"""Normalize a skill output into a unified format."""
return {
"skill_id": skill_output.skill_id,
"skill_name": skill_output.skill_name,
"domain": skill_output.domain,
"audit_date": skill_output.audit_date,
"category": skill_output.category,
"health_score": skill_output.health_score,
"issues_count": len(skill_output.issues),
"wins_count": len(skill_output.wins),
"issues": skill_output.issues[:10],
"wins": skill_output.wins[:10],
}
def compute_cross_skill_health(
self, outputs: list[SkillOutput]
) -> tuple[float, dict[str, float]]:
"""Compute weighted overall health score across all skills.
Returns (overall_score, category_scores_dict).
"""
category_scores: dict[str, list[float]] = {}
for output in outputs:
cat = output.category
if cat and output.health_score > 0:
category_scores.setdefault(cat, []).append(output.health_score)
# Average scores per category
avg_category: dict[str, float] = {}
for cat, scores in category_scores.items():
avg_category[cat] = round(sum(scores) / len(scores), 1)
# Weighted overall score
total_weight = 0.0
weighted_sum = 0.0
for cat, avg_score in avg_category.items():
weight = CATEGORY_WEIGHTS.get(cat, 0.05)
weighted_sum += avg_score * weight
total_weight += weight
overall = round(weighted_sum / total_weight, 1) if total_weight > 0 else 0.0
return overall, avg_category
def identify_priorities(
self, outputs: list[SkillOutput]
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
"""Identify top issues and wins across all skill outputs.
Returns (top_issues, top_wins).
"""
all_issues: list[dict[str, Any]] = []
all_wins: list[dict[str, Any]] = []
severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3}
for output in outputs:
for issue in output.issues:
enriched = {
**issue,
"source_skill": output.skill_name,
"source_skill_id": output.skill_id,
"category": output.category,
}
all_issues.append(enriched)
for win in output.wins:
enriched = {
**win,
"source_skill": output.skill_name,
"source_skill_id": output.skill_id,
"category": output.category,
}
all_wins.append(enriched)
# Sort issues by severity
all_issues.sort(
key=lambda i: severity_order.get(
i.get("severity", "medium"), 2
)
)
return all_issues[:20], all_wins[:20]
def build_timeline(self, outputs: list[SkillOutput]) -> list[dict[str, Any]]:
"""Build an audit history timeline from all skill outputs."""
timeline: list[dict[str, Any]] = []
for output in outputs:
entry = {
"date": output.audit_date,
"skill": output.skill_name,
"skill_id": output.skill_id,
"health_score": output.health_score,
"category": output.category,
"issues_count": len(output.issues),
}
timeline.append(entry)
# Sort by date descending
timeline.sort(key=lambda e: e.get("date", ""), reverse=True)
return timeline
async def run(
self,
domain: str,
output_dir: str | None = None,
date_from: str | None = None,
date_to: str | None = None,
) -> AggregatedReport:
"""Orchestrate the full report aggregation pipeline."""
target_domain = self._extract_domain(domain)
report = AggregatedReport(
domain=target_domain,
report_date=date.today().isoformat(),
audit_id=self._generate_audit_id(),
timestamp=datetime.now().isoformat(),
)
all_outputs: list[SkillOutput] = []
# Step 1: Scan local outputs
if output_dir:
self.logger.info(f"Step 1/5: Scanning local outputs in {output_dir}...")
local_outputs = self.scan_local_outputs(
output_dir, domain=target_domain,
date_from=date_from, date_to=date_to,
)
all_outputs.extend(local_outputs)
else:
self.logger.info("Step 1/5: No output directory specified, skipping local scan.")
# Step 2: Query Notion for past audits
self.logger.info("Step 2/5: Querying Notion for past audits...")
try:
notion_outputs = await self.query_notion_audits(
domain=target_domain,
date_from=date_from,
date_to=date_to,
)
all_outputs.extend(notion_outputs)
except Exception as e:
msg = f"Notion query error: {e}"
self.logger.error(msg)
report.errors.append(msg)
if not all_outputs:
self.logger.warning(
"No skill outputs found. Provide --output-dir with JSON files "
"from SEO skills 11-33, or ensure Notion audit log has entries."
)
report.errors.append("No skill outputs found to aggregate.")
return report
# Step 3: Normalize and compute health scores
self.logger.info(
f"Step 3/5: Normalizing {len(all_outputs)} skill outputs..."
)
report.skills_included = [
{
"skill_id": o.skill_id,
"skill_name": o.skill_name,
"audit_date": o.audit_date,
}
for o in all_outputs
]
report.raw_outputs = [self.normalize_output(o) for o in all_outputs]
overall_health, category_scores = self.compute_cross_skill_health(all_outputs)
report.overall_health = overall_health
report.category_scores = category_scores
# Determine health trend from timeline
scores_by_date = sorted(
[(o.audit_date, o.health_score) for o in all_outputs if o.health_score > 0],
key=lambda x: x[0],
)
if len(scores_by_date) >= 2:
            half = len(scores_by_date) // 2
            older_avg = sum(s for _, s in scores_by_date[:half]) / max(half, 1)
            newer_avg = sum(s for _, s in scores_by_date[half:]) / max(len(scores_by_date) - half, 1)
if newer_avg > older_avg + 3:
report.health_trend = "improving"
elif newer_avg < older_avg - 3:
report.health_trend = "declining"
else:
report.health_trend = "stable"
# Step 4: Identify priorities
self.logger.info("Step 4/5: Identifying top issues and wins...")
top_issues, top_wins = self.identify_priorities(all_outputs)
report.top_issues = top_issues
report.top_wins = top_wins
# Step 5: Build timeline
self.logger.info("Step 5/5: Building audit history timeline...")
report.timeline = self.build_timeline(all_outputs)
self.logger.info(
f"Aggregation complete: {len(all_outputs)} skills, "
f"health={report.overall_health}/100, "
f"trend={report.health_trend}, "
f"issues={len(report.top_issues)}, wins={len(report.top_wins)}"
)
return report
# ---------------------------------------------------------------------------
# Output formatting
# ---------------------------------------------------------------------------
def _format_text_report(report: AggregatedReport) -> str:
"""Format aggregated report as human-readable text."""
lines: list[str] = []
lines.append("=" * 70)
lines.append(" SEO REPORTING DASHBOARD - AGGREGATED REPORT")
lines.append(f" Domain: {report.domain}")
lines.append(f" Report Date: {report.report_date}")
lines.append(f" Audit ID: {report.audit_id}")
lines.append("=" * 70)
# Health score
lines.append("")
lines.append(f" Overall Health: {report.overall_health}/100 ({report.health_trend})")
lines.append("-" * 50)
# Category scores
if report.category_scores:
lines.append("")
lines.append("--- CATEGORY SCORES ---")
for cat, score in sorted(
report.category_scores.items(), key=lambda x: x[1], reverse=True
):
bar = "#" * int(score / 5) + "." * (20 - int(score / 5))
lines.append(f" {cat:<20} [{bar}] {score:.1f}/100")
# Skills included
if report.skills_included:
lines.append("")
lines.append("--- SKILLS INCLUDED ---")
for skill in report.skills_included:
lines.append(
f" [{skill['skill_id']:>2}] {skill['skill_name']:<30} "
f"({skill['audit_date']})"
)
# Top issues
if report.top_issues:
lines.append("")
lines.append("--- TOP ISSUES ---")
for i, issue in enumerate(report.top_issues[:10], 1):
severity = issue.get("severity", "medium").upper()
desc = issue.get("description", "No description")
cat = issue.get("category", "")
lines.append(f" {i:>2}. [{severity}] ({cat}) {desc}")
# Top wins
if report.top_wins:
lines.append("")
lines.append("--- TOP WINS ---")
for i, win in enumerate(report.top_wins[:10], 1):
desc = win.get("description", "No description")
cat = win.get("category", "")
lines.append(f" {i:>2}. ({cat}) {desc}")
# Timeline
if report.timeline:
lines.append("")
lines.append("--- AUDIT TIMELINE ---")
lines.append(f" {'Date':<12} {'Skill':<25} {'Score':>8} {'Issues':>8}")
lines.append(" " + "-" * 55)
for entry in report.timeline[:15]:
lines.append(
f" {entry['date']:<12} {entry['skill']:<25} "
f"{entry['health_score']:>7.1f} {entry['issues_count']:>7}"
)
# Errors
if report.errors:
lines.append("")
lines.append("--- ERRORS ---")
for err in report.errors:
lines.append(f" - {err}")
lines.append("")
lines.append("=" * 70)
return "\n".join(lines)
def _serialize_report(report: AggregatedReport) -> dict:
"""Convert report to JSON-serializable dict."""
return {
"domain": report.domain,
"report_date": report.report_date,
"overall_health": report.overall_health,
"health_trend": report.health_trend,
"skills_included": report.skills_included,
"category_scores": report.category_scores,
"top_issues": report.top_issues,
"top_wins": report.top_wins,
"timeline": report.timeline,
"raw_outputs": report.raw_outputs,
"audit_id": report.audit_id,
"timestamp": report.timestamp,
"errors": report.errors if report.errors else None,
}
# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------
def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
parser = argparse.ArgumentParser(
description="SEO Report Aggregator - Collect and normalize outputs from all SEO skills",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""\
Examples:
python report_aggregator.py --domain https://example.com --json
python report_aggregator.py --domain https://example.com --output-dir ./audit_outputs --json
python report_aggregator.py --domain https://example.com --from 2025-01-01 --to 2025-03-31 --json
""",
)
parser.add_argument(
"--domain",
required=True,
help="Target domain to aggregate reports for",
)
parser.add_argument(
"--output-dir",
type=str,
default=None,
help="Directory containing JSON outputs from SEO skills",
)
parser.add_argument(
"--from",
type=str,
default=None,
dest="date_from",
help="Start date for filtering (YYYY-MM-DD)",
)
parser.add_argument(
"--to",
type=str,
default=None,
dest="date_to",
help="End date for filtering (YYYY-MM-DD)",
)
parser.add_argument(
"--json",
action="store_true",
default=False,
help="Output in JSON format",
)
parser.add_argument(
"--output",
type=str,
default=None,
help="Save output to file path",
)
return parser.parse_args(argv)
async def async_main(args: argparse.Namespace) -> None:
aggregator = ReportAggregator()
report = await aggregator.run(
domain=args.domain,
output_dir=args.output_dir,
date_from=args.date_from,
date_to=args.date_to,
)
if args.json:
output_str = json.dumps(
_serialize_report(report), indent=2, ensure_ascii=False
)
else:
output_str = _format_text_report(report)
if args.output:
Path(args.output).write_text(output_str, encoding="utf-8")
logger.info(f"Report saved to {args.output}")
else:
print(output_str)
aggregator.print_stats()
def main() -> None:
args = parse_args()
asyncio.run(async_main(args))
if __name__ == "__main__":
main()


@@ -0,0 +1,9 @@
# 34-seo-reporting-dashboard dependencies
requests>=2.31.0
aiohttp>=3.9.0
pandas>=2.1.0
tenacity>=8.2.0
tqdm>=4.66.0
python-dotenv>=1.0.0
rich>=13.7.0
jinja2>=3.1.0


@@ -0,0 +1,136 @@
---
name: seo-reporting-dashboard
description: |
SEO reporting dashboard and executive report generation. Aggregates data from all SEO skills
into stakeholder-ready reports and interactive HTML dashboards.
Triggers: SEO report, SEO dashboard, executive summary, 보고서, 대시보드, performance report, 종합 보고서.
---
# SEO Reporting Dashboard
## Purpose
Aggregate outputs from all SEO skills (11-33) into stakeholder-ready executive reports with interactive HTML dashboards, trend analysis, and Korean-language summaries. This is the PRESENTATION LAYER that sits on top of skill 25 (KPI Framework) and all other skill outputs, providing a unified view of SEO performance across all audit dimensions.
## Core Capabilities
1. **Report Aggregation** - Collect and normalize outputs from all SEO skills (11-33) into a unified data structure with cross-skill health scoring and priority issue identification
2. **Interactive Dashboard** - Generate self-contained HTML dashboards with Chart.js visualizations including health gauge, traffic trends, keyword distribution, issue breakdown, and competitor radar
3. **Executive Reporting** - Korean-language executive summary generation with audience-specific detail levels (C-level, marketing team, technical team) and prioritized action items
## MCP Tool Usage
### Ahrefs for Fresh Data Pull
```
mcp__ahrefs__site-explorer-metrics: Pull current organic metrics snapshot for dashboard
mcp__ahrefs__site-explorer-metrics-history: Pull historical metrics for trend visualization
```
### Notion for Reading Past Audits and Writing Reports
```
mcp__notion__*: Query SEO Audit Log database for past audit entries
mcp__notion__*: Save dashboard reports and executive summaries to Notion
```
### Perplexity for Context
```
mcp__perplexity__*: Enrich reports with industry benchmarks and competitor context
```
## Workflow
### Dashboard Generation
1. Accept target domain and optional date range
2. Query Notion SEO Audit Log for all past audit entries for the domain
3. Optionally pull fresh metrics from Ahrefs (site-explorer-metrics, metrics-history)
4. Normalize all skill outputs into unified format
5. Compute cross-skill health score with weighted category dimensions
6. Identify top issues (sorted by severity) and top wins across all audits
7. Build audit history timeline
8. Generate HTML dashboard with Chart.js charts:
- Health score gauge (doughnut)
- Category scores horizontal bar chart
- Health score timeline line chart
- Issue distribution pie chart
- Competitor radar chart (if competitor data available)
9. Save HTML file and optionally push summary to Notion
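The weighted cross-skill health score from step 5 can be sketched as follows. The weight values here are illustrative placeholders, not the actual `CATEGORY_WEIGHTS` values defined in `report_aggregator.py`; unknown categories fall back to a small default weight, matching the aggregator's behavior:

```python
# Illustrative weights -- the real values live in CATEGORY_WEIGHTS
# inside report_aggregator.py
CATEGORY_WEIGHTS = {
    "technical": 0.25,
    "onpage": 0.20,
    "content": 0.20,
    "offpage": 0.15,
    "local": 0.10,
}

def weighted_health(avg_category: dict[str, float]) -> float:
    """Weight each category average; unknown categories default to 0.05."""
    weighted_sum = sum(
        score * CATEGORY_WEIGHTS.get(cat, 0.05)
        for cat, score in avg_category.items()
    )
    total_weight = sum(CATEGORY_WEIGHTS.get(cat, 0.05) for cat in avg_category)
    return round(weighted_sum / total_weight, 1) if total_weight else 0.0
```

With two categories, e.g. `{"technical": 80.0, "content": 60.0}`, the score is the weight-normalized average rather than a plain mean, so the heavier technical dimension pulls the result toward 80.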
### Executive Reporting
1. Load aggregated report data (from dashboard generation or JSON file)
2. Select audience level: C-level, marketing, or technical
3. Generate Korean-language narrative with:
- Health score overview and trend
- Category highlights (strengths and weaknesses)
- Skills coverage summary
- Audience-specific business impact analysis
4. Format key wins and concerns with severity and category labels
5. Generate prioritized action items ranked by impact
6. Render as markdown document
7. Optionally push to Notion SEO Audit Log
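The Korean trend labels used in step 3 can be sketched as a simple lookup; the label strings follow the report format described in this skill, but the helper name and fallback behavior are assumptions, not the script's actual identifiers:

```python
# Korean trend labels for the executive report narrative
TREND_KR = {
    "improving": "개선 중 ↑",
    "stable": "안정 →",
    "declining": "하락 중 ↓",
}

def trend_label(trend: str) -> str:
    # Fall back to the raw English trend for unexpected values
    return TREND_KR.get(trend, trend)
```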
## Output Format
### HTML Dashboard
```
Self-contained HTML file with:
- Responsive CSS grid layout
- Chart.js visualizations from CDN
- Health score gauge
- Category bar chart
- Timeline line chart
- Issues pie chart
- Competitor radar chart
- Issues and wins lists
- Audit history table
```
### Executive Report (Markdown)
```markdown
# SEO 성과 보고서 - [domain]
**대상**: 경영진 / 마케팅팀 / 기술팀
**도메인**: [domain]
**보고 일자**: [date]
## Health Score
| 지표 | 값 |
|------|-----|
| Overall Score | **[score]/100** |
| 등급 | [grade_kr] |
| 추세 | [trend_kr] |
## 종합 분석
[Korean narrative...]
## 주요 성과
- [wins...]
## 주요 이슈
- [concerns...]
## 권장 조치 사항
1. [recommendations...]
```
## Audience Configurations
| Audience | Detail | Issues | Recommendations | Technical Details |
|----------|--------|--------|------------------|-------------------|
| C-level (경영진) | Summary | Top 5 | Top 3 | No |
| Marketing (마케팅팀) | Moderate | Top 10 | Top 5 | No |
| Technical (기술팀) | Detailed | Top 20 | Top 10 | Yes |
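The table above can be encoded as a config dict like the sketch below; the keys and field names are illustrative assumptions, not the script's actual identifiers:

```python
# Illustrative encoding of the audience configuration table
AUDIENCE_CONFIGS = {
    "c_level":   {"label": "경영진",   "max_issues": 5,  "max_recommendations": 3,  "technical_details": False},
    "marketing": {"label": "마케팅팀", "max_issues": 10, "max_recommendations": 5,  "technical_details": False},
    "technical": {"label": "기술팀",   "max_issues": 20, "max_recommendations": 10, "technical_details": True},
}

def slice_issues(issues: list[dict], audience: str) -> list[dict]:
    """Trim the issue list to the audience's configured depth."""
    cfg = AUDIENCE_CONFIGS[audience]
    return issues[: cfg["max_issues"]]
```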
## Limitations
- Aggregation depends on availability of JSON outputs from other skills
- Notion queries for past audits require MCP tools at runtime; the bundled script contains only a placeholder for this step
- Competitor radar chart only renders if competitor intel (skill 31) data is present
- HTML dashboard requires internet access for Chart.js CDN
## Notion Output (Required)
All reports MUST be saved to OurDigital SEO Audit Log:
- **Database ID**: `2c8581e5-8a1e-8035-880b-e38cefc2f3ef`
- **Properties**: Issue (title), Site (url), Category ("SEO Dashboard"), Priority, Found Date, Audit ID
- **Language**: Korean with English technical terms
- **Audit ID Format**: DASH-YYYYMMDD-NNN


@@ -0,0 +1,9 @@
name: seo-reporting-dashboard
description: |
SEO reporting dashboard and executive report generation. Triggers: SEO report, dashboard, executive summary, 보고서, 대시보드.
allowed-tools:
- mcp__ahrefs__*
- mcp__notion__*
- mcp__perplexity__*
- WebSearch
- WebFetch


@@ -0,0 +1,20 @@
# Ahrefs
Tools used for pulling fresh SEO data into the reporting dashboard.
## Tools Used
| Tool | Purpose |
|------|---------|
| `mcp__ahrefs__site-explorer-metrics` | Current organic metrics snapshot (traffic, keywords, DR) |
| `mcp__ahrefs__site-explorer-metrics-history` | Historical metrics for trend charts and period comparison |
## Usage
These tools are called when the dashboard needs fresh data beyond what is available from cached skill outputs. The aggregator first checks local JSON files and Notion audit log entries, then optionally pulls current data from Ahrefs to supplement the report.
## Notes
- Ahrefs data has approximately 24-hour freshness lag
- Traffic value from Ahrefs is in cents; divide by 100 for USD
- Historical data availability depends on Ahrefs subscription tier
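The cents-to-USD conversion from the notes above, as a one-liner helper (the function name is illustrative):

```python
def traffic_value_usd(value_cents: int) -> float:
    """Ahrefs reports traffic value in cents; convert to USD."""
    return round(value_cents / 100, 2)
```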


@@ -0,0 +1,34 @@
# Notion
Notion MCP tools are used for both reading past audit data and writing dashboard reports.
## Database Configuration
| Field | Value |
|-------|-------|
| Database ID | `2c8581e5-8a1e-8035-880b-e38cefc2f3ef` |
| URL | https://www.notion.so/dintelligence/2c8581e58a1e8035880be38cefc2f3ef |
## Reading Past Audits
Query the SEO Audit Log database to retrieve historical audit entries:
- Filter by **Site** (URL property) matching the target domain
- Filter by **Found Date** for date range selection
- Retrieve **Category**, **Priority**, **Audit ID**, and page content
- Used by the report aggregator to build the unified dataset
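The filters above can be sketched as a Notion-style compound filter. The property names come from the database configuration in this document, but the exact payload shape the MCP tool expects is an assumption based on the public Notion API:

```python
# Sketch of a compound filter for past-audit queries; the MCP tool may
# wrap or simplify this structure
def audit_query_filter(domain: str, date_from: str, date_to: str) -> dict:
    return {
        "and": [
            {"property": "Site", "url": {"contains": domain}},
            {"property": "Found Date", "date": {"on_or_after": date_from}},
            {"property": "Found Date", "date": {"on_or_before": date_to}},
        ]
    }
```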
## Writing Dashboard Reports
Save generated reports to the SEO Audit Log:
- **Issue** (Title): `SEO 대시보드 보고서 - [domain] - YYYY-MM-DD`
- **Site** (URL): Target website URL
- **Category** (Select): `SEO Dashboard`
- **Priority** (Select): Based on overall health trend
- **Found Date** (Date): Report generation date
- **Audit ID** (Rich Text): Format `DASH-YYYYMMDD-NNN`
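The property list above maps to a page-properties payload roughly like the sketch below; property names follow this document, while the nested value shapes are assumptions based on the public Notion API (the MCP tool may accept a flatter form):

```python
# Illustrative Notion page properties for a dashboard report entry
def report_properties(domain: str, report_date: str, audit_id: str, priority: str) -> dict:
    return {
        "Issue": {"title": [{"text": {"content": f"SEO 대시보드 보고서 - {domain} - {report_date}"}}]},
        "Site": {"url": f"https://{domain}"},
        "Category": {"select": {"name": "SEO Dashboard"}},
        "Priority": {"select": {"name": priority}},
        "Found Date": {"date": {"start": report_date}},
        "Audit ID": {"rich_text": [{"text": {"content": audit_id}}]},
    }
```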
## Language Guidelines
- Report content in Korean (한국어)
- Keep technical English terms as-is (e.g., Health Score, Chart.js, Domain Rating)
- URLs and code remain unchanged