Add SEO skills 33-34 and fix bugs in skills 19-34
New skills:
- Skill 33: Site migration planner with redirect mapping and monitoring
- Skill 34: Reporting dashboard with HTML charts and Korean executive reports

Bug fixes (Skill 34 - report_aggregator.py):
- Add audit_type fallback for skill identification (was only using audit_id prefix)
- Extract health scores from nested data dict (technical_score, onpage_score, etc.)
- Support subdomain matching in domain filter (blog.ourdigital.org matches ourdigital.org)
- Skip self-referencing DASH- aggregated reports

Bug fixes (Skill 20 - naver_serp_analyzer.py):
- Remove VIEW tab selectors (removed by Naver in 2026)
- Add new section detectors: books (도서), shortform (숏폼), influencer (인플루언서)

Improvements (Skill 34 - dashboard/executive report):
- Add Korean category labels for Chart.js charts (기술 SEO, 온페이지, etc.)
- Add Korean trend labels (개선 중 ↑, 안정 →, 하락 중 ↓)
- Add English→Korean issue description translation layer (20 common patterns)

Documentation improvements:
- Add Korean triggers to 4 skill descriptions (19, 25, 28, 31)
- Expand Skill 32 SKILL.md from 40→143 lines (was 6/10; added workflow, output format, limitations)
- Add output format examples to Skills 27 and 28 SKILL.md
- Add limitations sections to Skills 27 and 28
- Update README.md, CLAUDE.md, AGENTS.md for skills 33-34

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
custom-skills/34-seo-reporting-dashboard/code/CLAUDE.md (new file, 173 lines)
@@ -0,0 +1,173 @@
# CLAUDE.md

## Overview

SEO reporting dashboard and executive report generator. Aggregates outputs from all SEO skills (11-33) into stakeholder-ready reports with interactive HTML dashboards, trend analysis, and Korean-language executive summaries. This is the PRESENTATION LAYER that sits on top of skill 25 (KPI Framework) and all other skill outputs, providing a unified view of SEO performance across all audit dimensions.

## Quick Start

```bash
pip install -r scripts/requirements.txt

# Aggregate outputs from all SEO skills
python scripts/report_aggregator.py --domain https://example.com --json

# Generate HTML dashboard
python scripts/dashboard_generator.py --report aggregated_report.json --output dashboard.html

# Generate Korean executive report
python scripts/executive_report.py --report aggregated_report.json --audience c-level --output report.md
```

## Scripts

| Script | Purpose | Key Output |
|--------|---------|------------|
| `report_aggregator.py` | Collect and normalize outputs from all SEO skills | Unified aggregated report, cross-skill health score, priority issues |
| `dashboard_generator.py` | Generate interactive HTML dashboard with Chart.js | Self-contained HTML file with charts and responsive layout |
| `executive_report.py` | Korean-language executive summary generation | Markdown report tailored to audience level |
| `base_client.py` | Shared utilities | RateLimiter, ConfigManager, BaseAsyncClient |

## Report Aggregator

```bash
# Aggregate all skill outputs for a domain
python scripts/report_aggregator.py --domain https://example.com --json

# Specify output directory to scan
python scripts/report_aggregator.py --domain https://example.com --output-dir ./audit_outputs --json

# Filter by date range
python scripts/report_aggregator.py --domain https://example.com --from 2025-01-01 --to 2025-03-31 --json

# Save to file
python scripts/report_aggregator.py --domain https://example.com --json --output report.json
```

**Capabilities**:
- Scan for recent audit outputs from skills 11-33 (JSON files or Notion entries)
- Normalize data formats across skills into a unified structure
- Merge findings by domain/date
- Compute cross-skill health scores with weighted dimensions
- Identify top-priority issues across all audits
- Build a timeline of audit history
- Support both local file scanning and Notion database queries
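One of the aggregator's domain-filter behaviours is subdomain matching (e.g. blog.ourdigital.org matching ourdigital.org). A minimal sketch of that check, using a hypothetical `domain_matches` helper; the actual logic lives in `report_aggregator.py` and may differ:

```python
from urllib.parse import urlparse

def domain_matches(candidate: str, target: str) -> bool:
    """Return True if candidate is the target domain or one of its subdomains.

    Hypothetical helper illustrating the aggregator's domain filter.
    """
    def normalize(value: str) -> str:
        # Accept both full URLs and bare domains
        host = urlparse(value).netloc or value
        return host.lower().removeprefix("www.")

    cand, tgt = normalize(candidate), normalize(target)
    return cand == tgt or cand.endswith("." + tgt)
```

Note the `"." + tgt` suffix check: it keeps lookalike domains such as notourdigital.org from matching ourdigital.org.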
## Dashboard Generator

```bash
# Generate HTML dashboard from aggregated report
python scripts/dashboard_generator.py --report aggregated_report.json --output dashboard.html

# Custom title
python scripts/dashboard_generator.py --report aggregated_report.json --output dashboard.html --title "OurDigital SEO Dashboard"
```

**Capabilities**:
- Generate self-contained HTML dashboard (uses Chart.js from CDN)
- Health score gauge chart
- Traffic trend line chart
- Keyword ranking distribution bar chart
- Technical issues breakdown pie chart
- Competitor comparison radar chart
- Mobile-responsive layout with CSS grid
- Export as a single .html file (no external dependencies beyond the Chart.js CDN)

## Executive Report

```bash
# C-level executive summary (Korean)
python scripts/executive_report.py --report aggregated_report.json --audience c-level --output report.md

# Marketing team report
python scripts/executive_report.py --report aggregated_report.json --audience marketing --output report.md

# Technical team report
python scripts/executive_report.py --report aggregated_report.json --audience technical --output report.md

# Output to Notion instead of a file
python scripts/executive_report.py --report aggregated_report.json --audience c-level --format notion
```

**Capabilities**:
- Korean-language executive summary generation
- Key wins and concerns identification
- Period-over-period comparison narrative
- Priority action items ranked by impact
- Stakeholder-appropriate language (non-technical for C-level)
- Support for C-level, marketing team, and technical team audiences
- Markdown output format

## Ahrefs MCP Tools Used

| Tool | Purpose |
|------|---------|
| `site-explorer-metrics` | Fresh snapshot of current organic metrics |
| `site-explorer-metrics-history` | Historical metrics for trend visualization |

## Output Format

```json
{
  "domain": "example.com",
  "report_date": "2025-01-15",
  "overall_health": 72,
  "health_trend": "improving",
  "skills_included": [
    {"skill_id": 11, "skill_name": "comprehensive-audit", "audit_date": "2025-01-14"},
    {"skill_id": 25, "skill_name": "kpi-framework", "audit_date": "2025-01-15"}
  ],
  "category_scores": {
    "technical": 85,
    "on_page": 70,
    "performance": 60,
    "content": 75,
    "links": 68,
    "local": 65,
    "keywords": 72,
    "competitor": 58
  },
  "top_issues": [
    {"severity": "critical", "category": "performance", "description": "CLS exceeds threshold on mobile"},
    {"severity": "high", "category": "technical", "description": "12 pages with noindex tag incorrectly set"}
  ],
  "top_wins": [
    {"category": "links", "description": "Domain Rating increased by 3 points"},
    {"category": "keywords", "description": "15 new keywords entered top 10"}
  ],
  "timeline": [
    {"date": "2025-01-15", "skill": "kpi-framework", "health_score": 72},
    {"date": "2025-01-14", "skill": "comprehensive-audit", "health_score": 70}
  ],
  "audit_id": "DASH-20250115-001",
  "timestamp": "2025-01-15T14:30:00"
}
```
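Downstream tools can consume this schema directly. A minimal sketch (a hypothetical consumer, not one of the skill's scripts) that loads a report and pulls out the critical issues:

```python
import json

# Load an aggregated report following the schema above (trimmed for brevity)
report = json.loads("""
{
  "domain": "example.com",
  "overall_health": 72,
  "top_issues": [
    {"severity": "critical", "category": "performance", "description": "CLS exceeds threshold on mobile"},
    {"severity": "high", "category": "technical", "description": "12 pages with noindex tag incorrectly set"}
  ]
}
""")

# Filter to critical-severity issues only
critical = [i for i in report["top_issues"] if i["severity"] == "critical"]
print(f"{report['domain']}: health {report['overall_health']}, {len(critical)} critical issue(s)")
```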
## Notion Output (Required)

**IMPORTANT**: All audit reports MUST be saved to the OurDigital SEO Audit Log database.

### Database Configuration

| Field | Value |
|-------|-------|
| Database ID | `2c8581e5-8a1e-8035-880b-e38cefc2f3ef` |
| URL | https://www.notion.so/dintelligence/2c8581e58a1e8035880be38cefc2f3ef |

### Required Properties

| Property | Type | Description |
|----------|------|-------------|
| Issue | Title | Report title (Korean + date) |
| Site | URL | Audited website URL |
| Category | Select | SEO Dashboard |
| Priority | Select | Based on overall health trend |
| Found Date | Date | Report date (YYYY-MM-DD) |
| Audit ID | Rich Text | Format: DASH-YYYYMMDD-NNN |

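As a rough sketch, the property table above maps onto a Notion API page payload like this. The property value shapes follow the public Notion API; the exact property names must match the database schema, and `build_notion_properties` is a hypothetical helper, not part of the skill's scripts:

```python
def build_notion_properties(title: str, site: str, priority: str,
                            found_date: str, audit_id: str) -> dict:
    """Hypothetical helper: Notion page properties for one dashboard report."""
    return {
        "Issue": {"title": [{"text": {"content": title}}]},
        "Site": {"url": site},
        "Category": {"select": {"name": "SEO Dashboard"}},
        "Priority": {"select": {"name": priority}},
        "Found Date": {"date": {"start": found_date}},
        "Audit ID": {"rich_text": [{"text": {"content": audit_id}}]},
    }

props = build_notion_properties(
    title="SEO 대시보드 리포트 2025-01-15",
    site="https://example.com",
    priority="Medium",
    found_date="2025-01-15",
    audit_id="DASH-20250115-001",
)
```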
### Language Guidelines

- Report content in Korean (한국어)
- Keep technical English terms as-is (e.g., Health Score, Domain Rating, Core Web Vitals, Chart.js)
- URLs and code remain unchanged

@@ -0,0 +1,30 @@
---
name: seo-reporting-dashboard
description: |
  SEO reporting dashboard and executive report generation. Aggregates data from all SEO skills
  into stakeholder-ready reports and interactive HTML dashboards.
  Triggers: SEO report, SEO dashboard, executive summary, 보고서, 대시보드, performance report.
allowed-tools:
  - Bash
  - Read
  - Write
  - WebFetch
  - WebSearch
---

# SEO Reporting Dashboard

## Generate HTML Dashboard
```bash
python custom-skills/34-seo-reporting-dashboard/code/scripts/dashboard_generator.py --report [JSON] --output dashboard.html
```

## Generate Executive Report (Korean)
```bash
python custom-skills/34-seo-reporting-dashboard/code/scripts/executive_report.py --report [JSON] --audience c-level --output report.md
```

## Aggregate All Skill Outputs
```bash
python custom-skills/34-seo-reporting-dashboard/code/scripts/report_aggregator.py --domain [URL] --json
```
@@ -0,0 +1,169 @@
"""
Base Client - Shared async client utilities
===========================================
Purpose: Rate-limited async operations for API clients
Python: 3.10+
"""

import asyncio
import logging
import os
from asyncio import Semaphore
from datetime import datetime
from typing import Any, Callable, TypeVar

from dotenv import load_dotenv
from tenacity import (
    retry,
    stop_after_attempt,
    wait_exponential,
    retry_if_exception_type,
)

load_dotenv()

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)

T = TypeVar("T")


class RateLimiter:
    """Rate limiter using token bucket algorithm."""

    def __init__(self, rate: float, per: float = 1.0):
        self.rate = rate
        self.per = per
        self.tokens = rate
        self.last_update = datetime.now()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        async with self._lock:
            now = datetime.now()
            elapsed = (now - self.last_update).total_seconds()
            self.tokens = min(self.rate, self.tokens + elapsed * (self.rate / self.per))
            self.last_update = now

            if self.tokens < 1:
                wait_time = (1 - self.tokens) * (self.per / self.rate)
                await asyncio.sleep(wait_time)
                self.tokens = 0
            else:
                self.tokens -= 1

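To illustrate the token-bucket behaviour, here is a self-contained sketch that repeats the class (slightly condensed, so it runs standalone; the canonical version is in base_client.py) and times five acquisitions at 2 requests/second:

```python
import asyncio
import time
from datetime import datetime

class RateLimiter:
    """Condensed copy of base_client.RateLimiter for a standalone demo."""
    def __init__(self, rate: float, per: float = 1.0):
        self.rate, self.per, self.tokens = rate, per, rate
        self.last_update = datetime.now()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        async with self._lock:
            now = datetime.now()
            elapsed = (now - self.last_update).total_seconds()
            # Refill proportionally to elapsed time, capped at the bucket size
            self.tokens = min(self.rate, self.tokens + elapsed * (self.rate / self.per))
            self.last_update = now
            if self.tokens < 1:
                await asyncio.sleep((1 - self.tokens) * (self.per / self.rate))
                self.tokens = 0
            else:
                self.tokens -= 1

async def main() -> float:
    limiter = RateLimiter(rate=2.0)  # ~2 acquisitions per second
    start = time.monotonic()
    for _ in range(5):
        await limiter.acquire()
    return time.monotonic() - start

elapsed = asyncio.run(main())
# The first two acquires drain the full bucket; the remaining three are
# throttled, so the loop takes roughly a second in total.
```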
class BaseAsyncClient:
    """Base class for async API clients with rate limiting."""

    def __init__(
        self,
        max_concurrent: int = 5,
        requests_per_second: float = 3.0,
        logger: logging.Logger | None = None,
    ):
        self.semaphore = Semaphore(max_concurrent)
        self.rate_limiter = RateLimiter(requests_per_second)
        self.logger = logger or logging.getLogger(self.__class__.__name__)
        self.stats = {
            "requests": 0,
            "success": 0,
            "errors": 0,
            "retries": 0,
        }

    @retry(
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=2, max=10),
        retry=retry_if_exception_type(Exception),
    )
    async def _rate_limited_request(
        self,
        coro: Callable[[], Any],
    ) -> Any:
        async with self.semaphore:
            await self.rate_limiter.acquire()
            self.stats["requests"] += 1
            try:
                result = await coro()
                self.stats["success"] += 1
                return result
            except Exception as e:
                self.stats["errors"] += 1
                self.logger.error(f"Request failed: {e}")
                raise

    async def batch_requests(
        self,
        requests: list[Callable[[], Any]],
        desc: str = "Processing",
    ) -> list[Any]:
        try:
            from tqdm.asyncio import tqdm
            has_tqdm = True
        except ImportError:
            has_tqdm = False

        async def execute(req: Callable) -> Any:
            try:
                return await self._rate_limited_request(req)
            except Exception as e:
                return {"error": str(e)}

        tasks = [execute(req) for req in requests]

        if has_tqdm:
            results = []
            for coro in tqdm.as_completed(tasks, total=len(tasks), desc=desc):
                result = await coro
                results.append(result)
            return results
        else:
            return await asyncio.gather(*tasks, return_exceptions=True)

    def print_stats(self) -> None:
        self.logger.info("=" * 40)
        self.logger.info("Request Statistics:")
        self.logger.info(f"  Total Requests: {self.stats['requests']}")
        self.logger.info(f"  Successful: {self.stats['success']}")
        self.logger.info(f"  Errors: {self.stats['errors']}")
        self.logger.info("=" * 40)

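The batch pattern above (bounded concurrency plus per-request error capture, so one failure does not sink the whole batch) can be sketched standalone without the tenacity retry layer:

```python
import asyncio

async def run_batch(requests, max_concurrent: int = 5) -> list:
    """Sketch of BaseAsyncClient.batch_requests: bounded concurrency,
    errors captured per request instead of failing the batch."""
    sem = asyncio.Semaphore(max_concurrent)

    async def execute(req):
        async with sem:
            try:
                return await req()
            except Exception as e:
                return {"error": str(e)}

    # gather preserves input order
    return await asyncio.gather(*(execute(r) for r in requests))

async def ok():
    return "ok"

async def boom():
    raise RuntimeError("boom")

results = asyncio.run(run_batch([ok, boom, ok]))
# results → ["ok", {"error": "boom"}, "ok"]
```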
class ConfigManager:
    """Manage API configuration and credentials."""

    def __init__(self):
        load_dotenv()

    @property
    def google_credentials_path(self) -> str | None:
        seo_creds = os.path.expanduser("~/.credential/ourdigital-seo-agent.json")
        if os.path.exists(seo_creds):
            return seo_creds
        return os.getenv("GOOGLE_APPLICATION_CREDENTIALS")

    @property
    def pagespeed_api_key(self) -> str | None:
        return os.getenv("PAGESPEED_API_KEY")

    @property
    def notion_token(self) -> str | None:
        return os.getenv("NOTION_TOKEN") or os.getenv("NOTION_API_KEY")

    def validate_google_credentials(self) -> bool:
        creds_path = self.google_credentials_path
        if not creds_path:
            return False
        return os.path.exists(creds_path)

    def get_required(self, key: str) -> str:
        value = os.getenv(key)
        if not value:
            raise ValueError(f"Missing required environment variable: {key}")
        return value


config = ConfigManager()
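The `get_required` contract is easy to demonstrate standalone (the function body is repeated here so the sketch runs without the module's third-party imports):

```python
import os

def get_required(key: str) -> str:
    """Standalone copy of ConfigManager.get_required: return the env var's
    value, or raise ValueError naming the missing key."""
    value = os.getenv(key)
    if not value:
        raise ValueError(f"Missing required environment variable: {key}")
    return value

os.environ["NOTION_TOKEN"] = "secret-token"  # simulate a configured environment
token = get_required("NOTION_TOKEN")

try:
    get_required("MISSING_KEY_FOR_DEMO")
except ValueError as exc:
    missing_error = str(exc)
```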
@@ -0,0 +1,745 @@
|
||||
"""
|
||||
Dashboard Generator - Interactive HTML SEO dashboard with Chart.js
|
||||
==================================================================
|
||||
Purpose: Generate a self-contained HTML dashboard from aggregated SEO
|
||||
report data, with responsive charts for health scores, traffic
|
||||
trends, keyword rankings, issue breakdowns, and competitor radar.
|
||||
Python: 3.10+
|
||||
|
||||
Usage:
|
||||
python dashboard_generator.py --report aggregated_report.json --output dashboard.html
|
||||
python dashboard_generator.py --report aggregated_report.json --output dashboard.html --title "My SEO Dashboard"
|
||||
"""
|
||||
|
||||
import argparse
|
||||
import json
|
||||
import logging
|
||||
import sys
|
||||
from dataclasses import dataclass, field
|
||||
from datetime import datetime
|
||||
from pathlib import Path
|
||||
from typing import Any
|
||||
|
||||
from jinja2 import Template
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
logging.basicConfig(
|
||||
level=logging.INFO,
|
||||
format="%(asctime)s - %(levelname)s - %(message)s",
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Data classes
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@dataclass
|
||||
class DashboardConfig:
|
||||
"""Configuration for dashboard generation."""
|
||||
title: str = "SEO Reporting Dashboard"
|
||||
domain: str = ""
|
||||
date_range: str = ""
|
||||
theme: str = "light"
|
||||
chart_options: dict[str, Any] = field(default_factory=dict)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# HTML template
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
DASHBOARD_TEMPLATE = """<!DOCTYPE html>
|
||||
<html lang="ko">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>{{ title }} - {{ domain }}</title>
|
||||
<script src="https://cdn.jsdelivr.net/npm/chart.js@4.4.0/dist/chart.umd.min.js"></script>
|
||||
<style>
|
||||
:root {
|
||||
--bg-primary: #f8f9fa;
|
||||
--bg-card: #ffffff;
|
||||
--text-primary: #212529;
|
||||
--text-secondary: #6c757d;
|
||||
--border: #dee2e6;
|
||||
--accent: #0d6efd;
|
||||
--success: #198754;
|
||||
--warning: #ffc107;
|
||||
--danger: #dc3545;
|
||||
}
|
||||
* { margin: 0; padding: 0; box-sizing: border-box; }
|
||||
body {
|
||||
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
|
||||
background: var(--bg-primary);
|
||||
color: var(--text-primary);
|
||||
line-height: 1.6;
|
||||
}
|
||||
.header {
|
||||
background: linear-gradient(135deg, #0d6efd 0%, #6610f2 100%);
|
||||
color: white;
|
||||
padding: 2rem;
|
||||
text-align: center;
|
||||
}
|
||||
.header h1 { font-size: 1.8rem; margin-bottom: 0.5rem; }
|
||||
.header .meta { opacity: 0.85; font-size: 0.9rem; }
|
||||
.container {
|
||||
max-width: 1400px;
|
||||
margin: 0 auto;
|
||||
padding: 1.5rem;
|
||||
}
|
||||
.grid {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fit, minmax(320px, 1fr));
|
||||
gap: 1.5rem;
|
||||
margin-bottom: 1.5rem;
|
||||
}
|
||||
.grid-full {
|
||||
display: grid;
|
||||
grid-template-columns: 1fr;
|
||||
gap: 1.5rem;
|
||||
margin-bottom: 1.5rem;
|
||||
}
|
||||
.card {
|
||||
background: var(--bg-card);
|
||||
border-radius: 12px;
|
||||
padding: 1.5rem;
|
||||
box-shadow: 0 2px 8px rgba(0,0,0,0.06);
|
||||
border: 1px solid var(--border);
|
||||
}
|
||||
.card h2 {
|
||||
font-size: 1.1rem;
|
||||
color: var(--text-secondary);
|
||||
margin-bottom: 1rem;
|
||||
padding-bottom: 0.5rem;
|
||||
border-bottom: 2px solid var(--border);
|
||||
}
|
||||
.health-score {
|
||||
text-align: center;
|
||||
padding: 2rem;
|
||||
}
|
||||
.health-score .score {
|
||||
font-size: 4rem;
|
||||
font-weight: 700;
|
||||
line-height: 1;
|
||||
}
|
||||
.health-score .label {
|
||||
font-size: 1rem;
|
||||
color: var(--text-secondary);
|
||||
margin-top: 0.5rem;
|
||||
}
|
||||
.health-score .trend {
|
||||
font-size: 1.2rem;
|
||||
margin-top: 0.5rem;
|
||||
font-weight: 600;
|
||||
}
|
||||
.trend-improving { color: var(--success); }
|
||||
.trend-stable { color: var(--warning); }
|
||||
.trend-declining { color: var(--danger); }
|
||||
.score-excellent { color: var(--success); }
|
||||
.score-good { color: #20c997; }
|
||||
.score-average { color: var(--warning); }
|
||||
.score-poor { color: #fd7e14; }
|
||||
.score-critical { color: var(--danger); }
|
||||
.chart-container {
|
||||
position: relative;
|
||||
width: 100%;
|
||||
height: 300px;
|
||||
}
|
||||
.issues-list { list-style: none; }
|
||||
.issues-list li {
|
||||
padding: 0.75rem;
|
||||
border-bottom: 1px solid var(--border);
|
||||
display: flex;
|
||||
align-items: flex-start;
|
||||
gap: 0.75rem;
|
||||
}
|
||||
.issues-list li:last-child { border-bottom: none; }
|
||||
.severity-badge {
|
||||
display: inline-block;
|
||||
padding: 0.15rem 0.5rem;
|
||||
border-radius: 4px;
|
||||
font-size: 0.75rem;
|
||||
font-weight: 600;
|
||||
text-transform: uppercase;
|
||||
white-space: nowrap;
|
||||
}
|
||||
.severity-critical { background: #f8d7da; color: #842029; }
|
||||
.severity-high { background: #fff3cd; color: #664d03; }
|
||||
.severity-medium { background: #cfe2ff; color: #084298; }
|
||||
.severity-low { background: #d1e7dd; color: #0f5132; }
|
||||
.timeline-table {
|
||||
width: 100%;
|
||||
border-collapse: collapse;
|
||||
font-size: 0.9rem;
|
||||
}
|
||||
.timeline-table th {
|
||||
text-align: left;
|
||||
padding: 0.6rem;
|
||||
border-bottom: 2px solid var(--border);
|
||||
color: var(--text-secondary);
|
||||
font-weight: 600;
|
||||
}
|
||||
.timeline-table td {
|
||||
padding: 0.6rem;
|
||||
border-bottom: 1px solid var(--border);
|
||||
}
|
||||
.footer {
|
||||
text-align: center;
|
||||
padding: 2rem;
|
||||
color: var(--text-secondary);
|
||||
font-size: 0.85rem;
|
||||
}
|
||||
@media (max-width: 768px) {
|
||||
.grid { grid-template-columns: 1fr; }
|
||||
.header h1 { font-size: 1.4rem; }
|
||||
.health-score .score { font-size: 3rem; }
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="header">
|
||||
<h1>{{ title }}</h1>
|
||||
<div class="meta">{{ domain }} | {{ report_date }} | Audit ID: {{ audit_id }}</div>
|
||||
</div>
|
||||
|
||||
<div class="container">
|
||||
<!-- Health Score & Category Overview -->
|
||||
<div class="grid">
|
||||
<div class="card health-score">
|
||||
<div class="score {{ score_class }}">{{ overall_health }}</div>
|
||||
<div class="label">Overall Health Score</div>
|
||||
<div class="trend trend-{{ health_trend }}">{{ trend_label }}</div>
|
||||
<div class="chart-container" style="height: 200px; margin-top: 1rem;">
|
||||
<canvas id="gaugeChart"></canvas>
|
||||
</div>
|
||||
</div>
|
||||
<div class="card">
|
||||
<h2>Category Scores</h2>
|
||||
<div class="chart-container">
|
||||
<canvas id="categoryChart"></canvas>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Traffic & Keywords -->
|
||||
<div class="grid">
|
||||
<div class="card">
|
||||
<h2>Health Score Timeline</h2>
|
||||
<div class="chart-container">
|
||||
<canvas id="timelineChart"></canvas>
|
||||
</div>
|
||||
</div>
|
||||
<div class="card">
|
||||
<h2>Issue Distribution</h2>
|
||||
<div class="chart-container">
|
||||
<canvas id="issuesChart"></canvas>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Competitor Radar (if data available) -->
|
||||
{% if has_competitor_data %}
|
||||
<div class="grid">
|
||||
<div class="card">
|
||||
<h2>Competitive Comparison</h2>
|
||||
<div class="chart-container">
|
||||
<canvas id="radarChart"></canvas>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
{% endif %}
|
||||
|
||||
<!-- Top Issues -->
|
||||
<div class="grid-full">
|
||||
<div class="card">
|
||||
<h2>Top Issues ({{ issues_count }})</h2>
|
||||
<ul class="issues-list">
|
||||
{% for issue in top_issues %}
|
||||
<li>
|
||||
<span class="severity-badge severity-{{ issue.severity }}">{{ issue.severity }}</span>
|
||||
<span>{{ issue.description }} <em style="color: var(--text-secondary);">({{ issue.category }})</em></span>
|
||||
</li>
|
||||
{% endfor %}
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Top Wins -->
|
||||
{% if top_wins %}
|
||||
<div class="grid-full">
|
||||
<div class="card">
|
||||
<h2>Top Wins ({{ wins_count }})</h2>
|
||||
<ul class="issues-list">
|
||||
{% for win in top_wins %}
|
||||
<li>
|
||||
<span class="severity-badge severity-low">WIN</span>
|
||||
<span>{{ win.description }} <em style="color: var(--text-secondary);">({{ win.category }})</em></span>
|
||||
</li>
|
||||
{% endfor %}
|
||||
</ul>
|
||||
</div>
|
||||
</div>
|
||||
{% endif %}
|
||||
|
||||
<!-- Audit Timeline Table -->
|
||||
<div class="grid-full">
|
||||
<div class="card">
|
||||
<h2>Audit History</h2>
|
||||
<table class="timeline-table">
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Date</th>
|
||||
<th>Skill</th>
|
||||
<th>Category</th>
|
||||
<th>Score</th>
|
||||
<th>Issues</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
{% for entry in timeline %}
|
||||
<tr>
|
||||
<td>{{ entry.date }}</td>
|
||||
<td>{{ entry.skill }}</td>
|
||||
<td>{{ entry.category }}</td>
|
||||
<td>{{ entry.health_score }}</td>
|
||||
<td>{{ entry.issues_count }}</td>
|
||||
</tr>
|
||||
{% endfor %}
|
||||
</tbody>
|
||||
</table>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="footer">
|
||||
Generated by SEO Reporting Dashboard (Skill 34) | {{ timestamp }}
|
||||
</div>
|
||||
|
||||
<script>
|
||||
// --- Gauge Chart ---
|
||||
const gaugeCtx = document.getElementById('gaugeChart').getContext('2d');
|
||||
new Chart(gaugeCtx, {
|
||||
type: 'doughnut',
|
||||
data: {
|
||||
datasets: [{
|
||||
data: [{{ overall_health }}, {{ 100 - overall_health }}],
|
||||
backgroundColor: ['{{ gauge_color }}', '#e9ecef'],
|
||||
borderWidth: 0,
|
||||
circumference: 180,
|
||||
rotation: 270,
|
||||
}]
|
||||
},
|
||||
options: {
|
||||
responsive: true,
|
||||
maintainAspectRatio: false,
|
||||
cutout: '75%',
|
||||
plugins: { legend: { display: false }, tooltip: { enabled: false } }
|
||||
}
|
||||
});
|
||||
|
||||
// --- Category Bar Chart ---
|
||||
const catCtx = document.getElementById('categoryChart').getContext('2d');
|
||||
new Chart(catCtx, {
|
||||
type: 'bar',
|
||||
data: {
|
||||
labels: {{ category_labels | tojson }},
|
||||
datasets: [{
|
||||
label: 'Score',
|
||||
data: {{ category_values | tojson }},
|
||||
backgroundColor: {{ category_colors | tojson }},
|
||||
borderRadius: 6,
|
||||
borderSkipped: false,
|
||||
}]
|
||||
},
|
||||
options: {
|
||||
responsive: true,
|
||||
maintainAspectRatio: false,
|
||||
indexAxis: 'y',
|
||||
scales: {
|
||||
x: { min: 0, max: 100, grid: { display: false } },
|
||||
y: { grid: { display: false } }
|
||||
},
|
||||
plugins: { legend: { display: false } }
|
||||
}
|
||||
});
|
||||
|
||||
// --- Timeline Line Chart ---
|
||||
const timeCtx = document.getElementById('timelineChart').getContext('2d');
|
||||
new Chart(timeCtx, {
|
||||
type: 'line',
|
||||
data: {
|
||||
labels: {{ timeline_dates | tojson }},
|
||||
datasets: [{
|
||||
label: 'Health Score',
|
||||
data: {{ timeline_scores | tojson }},
|
||||
borderColor: '#0d6efd',
|
||||
backgroundColor: 'rgba(13, 110, 253, 0.1)',
|
||||
fill: true,
|
||||
tension: 0.3,
|
||||
pointRadius: 4,
|
||||
pointBackgroundColor: '#0d6efd',
|
||||
}]
|
||||
},
|
||||
options: {
|
||||
responsive: true,
|
||||
maintainAspectRatio: false,
|
||||
scales: {
|
||||
y: { min: 0, max: 100, grid: { color: '#f0f0f0' } },
|
||||
x: { grid: { display: false } }
|
||||
},
|
||||
plugins: { legend: { display: false } }
|
||||
}
|
||||
});
|
||||
|
||||
// --- Issues Pie Chart ---
|
||||
const issuesCtx = document.getElementById('issuesChart').getContext('2d');
|
||||
new Chart(issuesCtx, {
|
||||
type: 'pie',
|
||||
data: {
|
||||
labels: {{ issue_category_labels | tojson }},
|
||||
datasets: [{
|
||||
data: {{ issue_category_values | tojson }},
|
||||
backgroundColor: [
|
||||
'#dc3545', '#fd7e14', '#ffc107', '#198754',
|
||||
'#0d6efd', '#6610f2', '#d63384', '#20c997',
|
||||
'#0dcaf0', '#6c757d'
|
||||
],
|
||||
}]
|
||||
},
|
||||
options: {
|
||||
responsive: true,
|
||||
maintainAspectRatio: false,
|
||||
plugins: {
|
||||
legend: { position: 'right', labels: { boxWidth: 12, padding: 8 } }
|
||||
}
|
||||
}
|
||||
});
|
||||
|
||||
{% if has_competitor_data %}
|
||||
// --- Competitor Radar Chart ---
|
||||
const radarCtx = document.getElementById('radarChart').getContext('2d');
|
||||
new Chart(radarCtx, {
|
||||
type: 'radar',
|
||||
data: {
|
||||
labels: {{ radar_labels | tojson }},
|
||||
datasets: {{ radar_datasets | tojson }}
|
||||
},
|
||||
options: {
|
||||
responsive: true,
|
||||
maintainAspectRatio: false,
|
||||
scales: {
|
||||
r: { min: 0, max: 100, ticks: { stepSize: 20 } }
|
||||
}
|
||||
}
|
||||
});
|
||||
{% endif %}
|
||||
</script>
|
||||
</body>
|
||||
</html>"""
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Generator
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
CATEGORY_KOREAN_LABELS: dict[str, str] = {
|
||||
"technical": "기술 SEO",
|
||||
"on_page": "온페이지",
|
||||
"performance": "성능",
|
||||
"content": "콘텐츠",
|
||||
"links": "링크",
|
||||
"local": "로컬 SEO",
|
||||
"keywords": "키워드",
|
||||
"competitor": "경쟁사",
|
||||
"schema": "스키마",
|
||||
"kpi": "KPI",
|
||||
"search_console": "Search Console",
|
||||
"ecommerce": "이커머스",
|
||||
"international": "국제 SEO",
|
||||
"ai_search": "AI 검색",
|
||||
"entity_seo": "엔티티 SEO",
|
||||
}
|
||||
|
||||
|
||||
class DashboardGenerator:
|
||||
"""Generate an interactive HTML dashboard from aggregated SEO report data."""
|
||||
|
||||
def __init__(self):
|
||||
self.template = Template(DASHBOARD_TEMPLATE)
|
||||
|
||||
@staticmethod
|
||||
def _score_class(score: float) -> str:
|
||||
"""Return CSS class based on health score."""
|
||||
if score >= 90:
|
||||
return "score-excellent"
|
||||
elif score >= 75:
|
||||
return "score-good"
|
||||
elif score >= 60:
|
||||
return "score-average"
|
||||
elif score >= 40:
|
||||
return "score-poor"
|
||||
else:
|
||||
return "score-critical"
|
||||
|
||||
@staticmethod
|
||||
def _gauge_color(score: float) -> str:
|
||||
"""Return color hex for gauge chart."""
|
||||
if score >= 90:
|
||||
return "#198754"
|
||||
elif score >= 75:
|
||||
return "#20c997"
|
||||
elif score >= 60:
|
||||
return "#ffc107"
|
||||
elif score >= 40:
|
||||
return "#fd7e14"
|
||||
else:
|
||||
return "#dc3545"
|
||||
|
||||
@staticmethod
|
||||
def _category_color(score: float) -> str:
|
||||
"""Return color for category bar based on score."""
|
||||
if score >= 80:
|
||||
return "#198754"
|
||||
elif score >= 60:
|
||||
return "#0d6efd"
|
||||
elif score >= 40:
|
||||
return "#ffc107"
|
||||
else:
|
||||
return "#dc3545"
|
||||
|
||||
@staticmethod
|
||||
def _trend_label(trend: str) -> str:
|
||||
"""Return human-readable trend label in Korean."""
|
||||
labels = {
|
||||
"improving": "개선 중 ↑",
|
||||
"stable": "안정 →",
|
||||
"declining": "하락 중 ↓",
|
||||
}
|
||||
return labels.get(trend, trend.title())
|
||||
|
||||
def generate_health_gauge(self, score: float) -> dict[str, Any]:
|
||||
"""Generate gauge chart data for health score."""
|
||||
return {
|
||||
"score": score,
|
||||
"remainder": 100 - score,
|
||||
"color": self._gauge_color(score),
|
||||
"class": self._score_class(score),
|
||||
}
|
||||
|
||||
def generate_traffic_chart(self, traffic_data: list[dict]) -> dict[str, Any]:
|
||||
"""Generate line chart data for traffic trends."""
|
||||
dates = [d.get("date", "") for d in traffic_data]
|
||||
values = [d.get("traffic", 0) for d in traffic_data]
|
||||
return {"labels": dates, "values": values}
|
||||
|
||||
def generate_keyword_chart(self, keyword_data: list[dict]) -> dict[str, Any]:
|
||||
"""Generate bar chart data for keyword ranking distribution."""
|
||||
labels = [d.get("range", "") for d in keyword_data]
|
||||
values = [d.get("count", 0) for d in keyword_data]
|
||||
return {"labels": labels, "values": values}
|
||||
|
||||
def generate_issues_chart(
|
||||
self, issues_data: list[dict[str, Any]]
|
||||
) -> dict[str, Any]:
|
||||
"""Generate pie chart data for issue category distribution."""
|
||||
category_counts: dict[str, int] = {}
|
||||
for issue in issues_data:
|
||||
cat = issue.get("category", "other")
|
||||
category_counts[cat] = category_counts.get(cat, 0) + 1
|
||||
|
||||
sorted_cats = sorted(
|
||||
category_counts.items(), key=lambda x: x[1], reverse=True
|
||||
)
|
||||
return {
|
||||
"labels": [CATEGORY_KOREAN_LABELS.get(c[0], c[0]) for c in sorted_cats],
|
||||
"values": [c[1] for c in sorted_cats],
|
||||
}
|
||||
|
||||
def generate_competitor_radar(
|
||||
self, competitor_data: dict[str, Any]
|
||||
) -> dict[str, Any]:
|
||||
"""Generate radar chart data for competitor comparison."""
|
||||
labels = list(competitor_data.get("dimensions", []))
|
||||
datasets = []
|
||||
colors = [
|
||||
"rgba(13, 110, 253, 0.5)",
|
||||
"rgba(220, 53, 69, 0.5)",
|
||||
"rgba(25, 135, 84, 0.5)",
|
||||
]
|
||||
border_colors = ["#0d6efd", "#dc3545", "#198754"]
|
||||
|
||||
for i, (domain, scores) in enumerate(
|
||||
competitor_data.get("scores", {}).items()
|
||||
):
|
||||
datasets.append({
|
||||
"label": domain,
|
||||
"data": [scores.get(dim, 0) for dim in labels],
|
||||
"backgroundColor": colors[i % len(colors)],
|
||||
"borderColor": border_colors[i % len(border_colors)],
|
||||
"borderWidth": 2,
|
||||
})
|
||||
|
||||
return {"labels": labels, "datasets": datasets}
|
||||
|
||||
    def render_html(
        self,
        report: dict[str, Any],
        config: DashboardConfig,
    ) -> str:
        """Render the full HTML dashboard from aggregated report data."""
        overall_health = report.get("overall_health", 0)
        health_trend = report.get("health_trend", "stable")

        # Category scores (with Korean labels)
        cat_scores = report.get("category_scores", {})
        category_labels = [
            CATEGORY_KOREAN_LABELS.get(k, k) for k in cat_scores.keys()
        ]
        category_values = list(cat_scores.values())
        category_colors = [self._category_color(v) for v in category_values]

        # Timeline
        timeline = report.get("timeline", [])
        timeline_dates = [e.get("date", "") for e in timeline]
        timeline_scores = [e.get("health_score", 0) for e in timeline]

        # Issues
        top_issues = report.get("top_issues", [])
        issues_chart = self.generate_issues_chart(top_issues)

        # Wins
        top_wins = report.get("top_wins", [])

        # Competitor radar
        has_competitor_data = False
        radar_labels: list[str] = []
        radar_datasets: list[dict] = []

        raw_outputs = report.get("raw_outputs", [])
        for output in raw_outputs:
            if output.get("category") == "competitor":
                has_competitor_data = True
                comp_data = output.get("data", {})
                if "comparison_matrix" in comp_data:
                    radar_result = self.generate_competitor_radar(
                        comp_data["comparison_matrix"]
                    )
                    radar_labels = radar_result["labels"]
                    radar_datasets = radar_result["datasets"]
                break

        context = {
            "title": config.title,
            "domain": config.domain or report.get("domain", ""),
            "report_date": report.get("report_date", ""),
            "audit_id": report.get("audit_id", ""),
            "timestamp": report.get("timestamp", datetime.now().isoformat()),
            "overall_health": overall_health,
            "score_class": self._score_class(overall_health),
            "health_trend": health_trend,
            "trend_label": self._trend_label(health_trend),
            "gauge_color": self._gauge_color(overall_health),
            "category_labels": category_labels,
            "category_values": category_values,
            "category_colors": category_colors,
            "timeline_dates": timeline_dates,
            "timeline_scores": timeline_scores,
            "issue_category_labels": issues_chart["labels"],
            "issue_category_values": issues_chart["values"],
            "top_issues": top_issues[:15],
            "issues_count": len(top_issues),
            "top_wins": top_wins[:10],
            "wins_count": len(top_wins),
            "timeline": timeline[:20],
            "has_competitor_data": has_competitor_data,
            "radar_labels": radar_labels,
            "radar_datasets": radar_datasets,
        }

        return self.template.render(**context)

    def save(self, html_content: str, output_path: str) -> None:
        """Save rendered HTML to a file."""
        Path(output_path).write_text(html_content, encoding="utf-8")
        logger.info(f"Dashboard saved to {output_path}")

    def run(
        self,
        report_json: str,
        output_path: str,
        title: str = "SEO Reporting Dashboard",
    ) -> str:
        """Orchestrate dashboard generation from a report JSON file."""
        # Load report data
        report_path = Path(report_json)
        if not report_path.exists():
            raise FileNotFoundError(f"Report file not found: {report_json}")

        report = json.loads(report_path.read_text(encoding="utf-8"))
        logger.info(f"Loaded report: {report.get('domain', 'unknown')}")

        # Configure
        config = DashboardConfig(
            title=title,
            domain=report.get("domain", ""),
            date_range=report.get("report_date", ""),
        )

        # Render
        html = self.render_html(report, config)
        logger.info(f"Rendered HTML dashboard ({len(html):,} bytes)")

        # Save
        self.save(html, output_path)

        return output_path


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="SEO Dashboard Generator - Interactive HTML dashboard with Chart.js",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""\
Examples:
  python dashboard_generator.py --report aggregated_report.json --output dashboard.html
  python dashboard_generator.py --report aggregated_report.json --output dashboard.html --title "My Dashboard"
""",
    )
    parser.add_argument(
        "--report",
        required=True,
        help="Path to aggregated report JSON file (from report_aggregator.py)",
    )
    parser.add_argument(
        "--output",
        required=True,
        help="Output HTML file path",
    )
    parser.add_argument(
        "--title",
        type=str,
        default="SEO Reporting Dashboard",
        help="Dashboard title (default: 'SEO Reporting Dashboard')",
    )
    return parser.parse_args(argv)


def main() -> None:
    args = parse_args()

    generator = DashboardGenerator()
    output = generator.run(
        report_json=args.report,
        output_path=args.output,
        title=args.title,
    )

    logger.info(f"Dashboard generated: {output}")


if __name__ == "__main__":
    main()
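# Usage sketch (illustrative, not part of the shipped module): the intended
# pipeline is report_aggregator.py -> dashboard_generator.py. The file names
# below are assumptions; any aggregated-report JSON path works.
#
#   generator = DashboardGenerator()
#   generator.run(
#       report_json="aggregated_report.json",
#       output_path="dashboard.html",
#       title="SEO Reporting Dashboard",
#   )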
@@ -0,0 +1,622 @@
"""
Executive Report - Korean-language executive summary generation
===============================================================
Purpose: Generate stakeholder-ready executive summaries in Korean from
         aggregated SEO report data, with audience-specific detail levels
         for C-level, marketing, and technical teams.
Python: 3.10+

Usage:
    python executive_report.py --report aggregated_report.json --audience c-level --output report.md
    python executive_report.py --report aggregated_report.json --audience marketing --output report.md
    python executive_report.py --report aggregated_report.json --audience technical --output report.md
    python executive_report.py --report aggregated_report.json --audience c-level --format notion
"""

import argparse
import json
import logging
import sys
from dataclasses import dataclass, field
from datetime import datetime
from pathlib import Path
from typing import Any

logger = logging.getLogger(__name__)

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s - %(levelname)s - %(message)s",
)


# ---------------------------------------------------------------------------
# Data classes
# ---------------------------------------------------------------------------

@dataclass
class AudienceConfig:
    """Configuration for report audience targeting."""
    level: str = "c-level"  # c-level | marketing | technical
    detail_depth: str = "summary"  # summary | moderate | detailed
    include_recommendations: bool = True
    include_technical_details: bool = False
    max_issues: int = 5
    max_recommendations: int = 5

    @classmethod
    def from_level(cls, level: str) -> "AudienceConfig":
        """Create config preset from audience level."""
        presets = {
            "c-level": cls(
                level="c-level",
                detail_depth="summary",
                include_recommendations=True,
                include_technical_details=False,
                max_issues=5,
                max_recommendations=3,
            ),
            "marketing": cls(
                level="marketing",
                detail_depth="moderate",
                include_recommendations=True,
                include_technical_details=False,
                max_issues=10,
                max_recommendations=5,
            ),
            "technical": cls(
                level="technical",
                detail_depth="detailed",
                include_recommendations=True,
                include_technical_details=True,
                max_issues=20,
                max_recommendations=10,
            ),
        }
        return presets.get(level, presets["c-level"])


@dataclass
class ExecutiveSummary:
    """Generated executive summary content."""
    title: str = ""
    domain: str = ""
    period: str = ""
    health_score: float = 0.0
    health_trend: str = "stable"
    key_wins: list[str] = field(default_factory=list)
    key_concerns: list[str] = field(default_factory=list)
    recommendations: list[str] = field(default_factory=list)
    narrative: str = ""
    audience: str = "c-level"
    category_summary: dict[str, str] = field(default_factory=dict)
    audit_id: str = ""
    timestamp: str = ""


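# Example (values read off the presets above): from_level() returns a preset
# per audience and falls back to the c-level preset for any unknown level.
#
#   cfg = AudienceConfig.from_level("technical")
#   cfg.max_issues                  -> 20
#   cfg.include_technical_details   -> True
#   AudienceConfig.from_level("ceo").level  -> "c-level"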
# ---------------------------------------------------------------------------
# Korean text templates
# ---------------------------------------------------------------------------

HEALTH_LABELS_KR = {
    "excellent": "우수",
    "good": "양호",
    "average": "보통",
    "poor": "미흡",
    "critical": "위험",
}

TREND_LABELS_KR = {
    "improving": "개선 중",
    "stable": "안정",
    "declining": "하락 중",
}

CATEGORY_LABELS_KR = {
    "technical": "기술 SEO",
    "on_page": "온페이지 SEO",
    "performance": "성능 (Core Web Vitals)",
    "content": "콘텐츠 전략",
    "links": "링크 프로필",
    "local": "로컬 SEO",
    "keywords": "키워드 전략",
    "competitor": "경쟁 분석",
    "schema": "스키마/구조화 데이터",
    "kpi": "KPI 프레임워크",
    "search_console": "Search Console",
    "ecommerce": "이커머스 SEO",
    "international": "국제 SEO",
    "ai_search": "AI 검색 가시성",
    "entity_seo": "Knowledge Graph",
}

# Common English issue descriptions -> Korean translations
ISSUE_TRANSLATIONS_KR: dict[str, str] = {
    "missing meta description": "메타 설명(meta description) 누락",
    "missing title tag": "타이틀 태그 누락",
    "duplicate title": "중복 타이틀 태그",
    "duplicate meta description": "중복 메타 설명",
    "missing h1": "H1 태그 누락",
    "multiple h1 tags": "H1 태그 다수 사용",
    "missing alt text": "이미지 alt 텍스트 누락",
    "broken links": "깨진 링크 발견",
    "redirect chain": "리다이렉트 체인 발견",
    "mixed content": "Mixed Content (HTTP/HTTPS 혼합) 발견",
    "missing canonical": "Canonical 태그 누락",
    "noindex on important page": "중요 페이지에 noindex 설정됨",
    "slow page load": "페이지 로딩 속도 저하",
    "cls exceeds threshold": "CLS(누적 레이아웃 변경) 임계값 초과",
    "lcp exceeds threshold": "LCP(최대 콘텐츠풀 페인트) 임계값 초과",
    "missing sitemap": "사이트맵 누락",
    "robots.txt blocking important pages": "robots.txt에서 중요 페이지 차단 중",
    "missing schema markup": "스키마 마크업 누락",
    "missing hreflang": "hreflang 태그 누락",
    "thin content": "콘텐츠 부족 (Thin Content)",
    "orphan pages": "고아 페이지 발견 (내부 링크 없음)",
}


def _translate_description(desc: str) -> str:
    """Translate common English issue descriptions to Korean."""
    desc_lower = desc.lower().strip()
    # Check exact match
    if desc_lower in ISSUE_TRANSLATIONS_KR:
        return ISSUE_TRANSLATIONS_KR[desc_lower]
    # Check partial match (case-insensitive replace)
    for eng, kor in ISSUE_TRANSLATIONS_KR.items():
        if eng in desc_lower:
            # Find the original-case substring and replace it
            idx = desc_lower.index(eng)
            return desc[:idx] + kor + desc[idx + len(eng):]
    return desc


AUDIENCE_INTRO_KR = {
    "c-level": "본 보고서는 SEO 성과의 핵심 지표와 비즈니스 영향을 요약한 경영진용 보고서입니다.",
    "marketing": "본 보고서는 SEO 전략 실행 현황과 마케팅 성과를 분석한 마케팅팀 보고서입니다.",
    "technical": "본 보고서는 SEO 기술 진단 결과와 상세 개선 사항을 포함한 기술팀 보고서입니다.",
}


# ---------------------------------------------------------------------------
# Generator
# ---------------------------------------------------------------------------

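# Example (sketch): _translate_description() prefers an exact dictionary hit,
# otherwise it replaces the first matching substring in place and leaves the
# rest of the sentence untouched. Input strings here are illustrative.
#
#   _translate_description("missing canonical")
#       -> "Canonical 태그 누락"
#   _translate_description("Found broken links on 12 pages")
#       -> "Found 깨진 링크 발견 on 12 pages"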
class ExecutiveReportGenerator:
    """Generate Korean-language executive reports from aggregated SEO data."""

    @staticmethod
    def _health_grade(score: float) -> str:
        """Return health grade string."""
        if score >= 90:
            return "excellent"
        elif score >= 75:
            return "good"
        elif score >= 60:
            return "average"
        elif score >= 40:
            return "poor"
        else:
            return "critical"

    def generate_narrative(
        self,
        report: dict[str, Any],
        audience: AudienceConfig,
    ) -> str:
        """Generate Korean narrative text for the executive summary."""
        domain = report.get("domain", "")
        health = report.get("overall_health", 0)
        trend = report.get("health_trend", "stable")
        grade = self._health_grade(health)
        grade_kr = HEALTH_LABELS_KR.get(grade, grade)
        trend_kr = TREND_LABELS_KR.get(trend, trend)

        intro = AUDIENCE_INTRO_KR.get(audience.level, AUDIENCE_INTRO_KR["c-level"])

        # Build narrative paragraphs
        paragraphs = []

        # Opening
        paragraphs.append(intro)

        # Health overview
        paragraphs.append(
            f"{domain}의 전체 SEO Health Score는 **{health}/100** ({grade_kr})이며, "
            f"현재 추세는 **{trend_kr}** 상태입니다."
        )

        # Category highlights
        cat_scores = report.get("category_scores", {})
        if cat_scores:
            strong_cats = [
                CATEGORY_LABELS_KR.get(k, k)
                for k, v in cat_scores.items()
                if v >= 75
            ]
            weak_cats = [
                CATEGORY_LABELS_KR.get(k, k)
                for k, v in cat_scores.items()
                if v < 50
            ]

            if strong_cats:
                paragraphs.append(
                    f"강점 영역: {', '.join(strong_cats[:3])} 등이 양호한 성과를 보이고 있습니다."
                )
            if weak_cats:
                paragraphs.append(
                    f"개선 필요 영역: {', '.join(weak_cats[:3])} 등에서 집중적인 개선이 필요합니다."
                )

        # Skills coverage
        skills = report.get("skills_included", [])
        if skills:
            paragraphs.append(
                f"총 {len(skills)}개의 SEO 진단 도구를 통해 종합 분석을 수행하였습니다."
            )

        # C-level specific: business impact focus
        if audience.level == "c-level":
            if trend == "improving":
                paragraphs.append(
                    "전반적인 SEO 성과가 개선 추세에 있으며, 현재 전략을 유지하면서 "
                    "핵심 약점 영역에 대한 집중 투자가 권장됩니다."
                )
            elif trend == "declining":
                paragraphs.append(
                    "SEO 성과가 하락 추세를 보이고 있어, 원인 분석과 함께 "
                    "긴급한 대응 조치가 필요합니다."
                )
            else:
                paragraphs.append(
                    "SEO 성과가 안정적으로 유지되고 있으나, 경쟁 환경 변화에 대비하여 "
                    "지속적인 모니터링과 선제적 대응이 필요합니다."
                )

        # Marketing specific: channel and content focus
        elif audience.level == "marketing":
            top_issues = report.get("top_issues", [])
            content_issues = [
                i for i in top_issues if i.get("category") in ("content", "keywords")
            ]
            if content_issues:
                paragraphs.append(
                    f"콘텐츠/키워드 관련 이슈가 {len(content_issues)}건 발견되었으며, "
                    f"콘텐츠 전략 수정이 권장됩니다."
                )

        # Technical specific: detailed breakdown
        elif audience.level == "technical":
            for cat, score in sorted(
                cat_scores.items(), key=lambda x: x[1]
            ):
                cat_kr = CATEGORY_LABELS_KR.get(cat, cat)
                paragraphs.append(f"- {cat_kr}: {score}/100")

        return "\n\n".join(paragraphs)

    def format_wins(self, report: dict[str, Any]) -> list[str]:
        """Extract and format key wins in Korean."""
        wins = report.get("top_wins", [])
        formatted: list[str] = []

        for win in wins:
            desc = _translate_description(win.get("description", ""))
            cat = win.get("category", "")
            cat_kr = CATEGORY_LABELS_KR.get(cat, cat)

            if desc:
                formatted.append(f"[{cat_kr}] {desc}")

        return formatted

    def format_concerns(self, report: dict[str, Any]) -> list[str]:
        """Extract and format key concerns in Korean."""
        issues = report.get("top_issues", [])
        formatted: list[str] = []

        severity_kr = {
            "critical": "긴급",
            "high": "높음",
            "medium": "보통",
            "low": "낮음",
        }

        for issue in issues:
            desc = _translate_description(issue.get("description", ""))
            severity = issue.get("severity", "medium")
            cat = issue.get("category", "")
            sev_kr = severity_kr.get(severity, severity)
            cat_kr = CATEGORY_LABELS_KR.get(cat, cat)

            if desc:
                formatted.append(f"[{sev_kr}] [{cat_kr}] {desc}")

        return formatted

    def generate_recommendations(
        self,
        report: dict[str, Any],
        audience: AudienceConfig,
    ) -> list[str]:
        """Generate prioritized action items ranked by impact."""
        recommendations: list[str] = []
        cat_scores = report.get("category_scores", {})
        top_issues = report.get("top_issues", [])

        # Priority 1: Critical issues
        critical = [i for i in top_issues if i.get("severity") == "critical"]
        for issue in critical[:3]:
            cat_kr = CATEGORY_LABELS_KR.get(issue.get("category", ""), "")
            desc = _translate_description(issue.get("description", ""))
            if audience.level == "c-level":
                recommendations.append(
                    f"[긴급] {cat_kr} 영역 긴급 조치 필요 - {desc}"
                )
            else:
                recommendations.append(
                    f"[긴급] {desc} (영역: {cat_kr})"
                )

        # Priority 2: Weak categories
        weak_cats = sorted(
            [(k, v) for k, v in cat_scores.items() if v < 50],
            key=lambda x: x[1],
        )
        for cat, score in weak_cats[:3]:
            cat_kr = CATEGORY_LABELS_KR.get(cat, cat)
            if audience.level == "c-level":
                recommendations.append(
                    f"[개선] {cat_kr} 점수 {score}/100 - 전략적 투자 권장"
                )
            elif audience.level == "marketing":
                recommendations.append(
                    f"[개선] {cat_kr} ({score}/100) - 캠페인 전략 재검토 필요"
                )
            else:
                recommendations.append(
                    f"[개선] {cat_kr} ({score}/100) - 상세 진단 및 기술적 개선 필요"
                )

        # Priority 3: Maintenance for good categories
        strong_cats = [
            (k, v) for k, v in cat_scores.items() if v >= 75
        ]
        if strong_cats:
            cats_kr = ", ".join(
                CATEGORY_LABELS_KR.get(k, k) for k, _ in strong_cats[:3]
            )
            recommendations.append(
                f"[유지] {cats_kr} - 현재 수준 유지 및 모니터링 지속"
            )

        # Audience-specific recommendations
        if audience.level == "c-level":
            health = report.get("overall_health", 0)
            if health < 60:
                recommendations.append(
                    "[전략] SEO 개선을 위한 전문 인력 또는 외부 에이전시 투입 검토"
                )
        elif audience.level == "marketing":
            recommendations.append(
                "[실행] 다음 분기 SEO 개선 로드맵 수립 및 KPI 설정"
            )
        elif audience.level == "technical":
            recommendations.append(
                "[실행] 기술 부채 해소 스프린트 계획 수립"
            )

        return recommendations[:audience.max_recommendations]

    def render_markdown(self, summary: ExecutiveSummary) -> str:
        """Render executive summary as markdown document."""
        lines: list[str] = []

        # Title
        lines.append(f"# {summary.title}")
        lines.append("")

        # Meta
        audience_kr = {
            "c-level": "경영진",
            "marketing": "마케팅팀",
            "technical": "기술팀",
        }
        lines.append(f"**대상**: {audience_kr.get(summary.audience, summary.audience)}")
        lines.append(f"**도메인**: {summary.domain}")
        lines.append(f"**보고 일자**: {summary.period}")
        lines.append(f"**Audit ID**: {summary.audit_id}")
        lines.append("")

        # Health Score
        grade = self._health_grade(summary.health_score)
        grade_kr = HEALTH_LABELS_KR.get(grade, grade)
        trend_kr = TREND_LABELS_KR.get(summary.health_trend, summary.health_trend)

        lines.append("## Health Score")
        lines.append("")
        lines.append(f"| 지표 | 값 |")
        lines.append(f"|------|-----|")
        lines.append(f"| Overall Score | **{summary.health_score}/100** |")
        lines.append(f"| 등급 | {grade_kr} |")
        lines.append(f"| 추세 | {trend_kr} |")
        lines.append("")

        # Category summary
        if summary.category_summary:
            lines.append("## 영역별 점수")
            lines.append("")
            lines.append("| 영역 | 점수 |")
            lines.append("|------|------|")
            for cat, score_str in summary.category_summary.items():
                cat_kr = CATEGORY_LABELS_KR.get(cat, cat)
                lines.append(f"| {cat_kr} | {score_str} |")
            lines.append("")

        # Narrative
        lines.append("## 종합 분석")
        lines.append("")
        lines.append(summary.narrative)
        lines.append("")

        # Key wins
        if summary.key_wins:
            lines.append("## 주요 성과")
            lines.append("")
            for win in summary.key_wins:
                lines.append(f"- {win}")
            lines.append("")

        # Key concerns
        if summary.key_concerns:
            lines.append("## 주요 이슈")
            lines.append("")
            for concern in summary.key_concerns:
                lines.append(f"- {concern}")
            lines.append("")

        # Recommendations
        if summary.recommendations:
            lines.append("## 권장 조치 사항")
            lines.append("")
            for i, rec in enumerate(summary.recommendations, 1):
                lines.append(f"{i}. {rec}")
            lines.append("")

        # Footer
        lines.append("---")
        lines.append(
            f"*이 보고서는 SEO Reporting Dashboard (Skill 34)에 의해 "
            f"{summary.timestamp}에 자동 생성되었습니다.*"
        )

        return "\n".join(lines)

    def run(
        self,
        report_json: str,
        audience_level: str = "c-level",
        output_path: str | None = None,
        output_format: str = "markdown",
    ) -> str:
        """Orchestrate executive report generation."""
        # Load report
        report_path = Path(report_json)
        if not report_path.exists():
            raise FileNotFoundError(f"Report file not found: {report_json}")

        report = json.loads(report_path.read_text(encoding="utf-8"))
        logger.info(f"Loaded report: {report.get('domain', 'unknown')}")

        # Configure audience
        audience = AudienceConfig.from_level(audience_level)
        logger.info(f"Audience: {audience.level} (depth: {audience.detail_depth})")

        # Build summary
        domain = report.get("domain", "")
        summary = ExecutiveSummary(
            title=f"SEO 성과 보고서 - {domain}",
            domain=domain,
            period=report.get("report_date", ""),
            health_score=report.get("overall_health", 0),
            health_trend=report.get("health_trend", "stable"),
            audit_id=report.get("audit_id", ""),
            audience=audience.level,
            timestamp=datetime.now().isoformat(),
        )

        # Category summary
        cat_scores = report.get("category_scores", {})
        summary.category_summary = {
            cat: f"{score}/100"
            for cat, score in sorted(
                cat_scores.items(), key=lambda x: x[1], reverse=True
            )
        }

        # Generate content
        summary.narrative = self.generate_narrative(report, audience)
        summary.key_wins = self.format_wins(report)[:audience.max_issues]
        summary.key_concerns = self.format_concerns(report)[:audience.max_issues]
        summary.recommendations = self.generate_recommendations(report, audience)

        # Render
        if output_format == "markdown":
            content = self.render_markdown(summary)
        elif output_format == "notion":
            # For Notion, we output markdown that can be pasted into Notion
            content = self.render_markdown(summary)
            logger.info(
                "Notion format: use MCP tools to push this markdown to Notion "
                f"database {report.get('audit_id', 'DASH-YYYYMMDD-NNN')}"
            )
        else:
            content = self.render_markdown(summary)

        # Save or print
        if output_path:
            Path(output_path).write_text(content, encoding="utf-8")
            logger.info(f"Executive report saved to {output_path}")
        else:
            print(content)

        return content


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="SEO Executive Report - Korean-language executive summary generator",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""\
Examples:
  python executive_report.py --report aggregated_report.json --audience c-level --output report.md
  python executive_report.py --report aggregated_report.json --audience marketing --output report.md
  python executive_report.py --report aggregated_report.json --audience technical --format notion
""",
    )
    parser.add_argument(
        "--report",
        required=True,
        help="Path to aggregated report JSON file (from report_aggregator.py)",
    )
    parser.add_argument(
        "--audience",
        choices=["c-level", "marketing", "technical"],
        default="c-level",
        help="Target audience level (default: c-level)",
    )
    parser.add_argument(
        "--output",
        type=str,
        default=None,
        help="Output file path (prints to stdout if omitted)",
    )
    parser.add_argument(
        "--format",
        choices=["markdown", "notion"],
        default="markdown",
        dest="output_format",
        help="Output format (default: markdown)",
    )
    return parser.parse_args(argv)


def main() -> None:
    args = parse_args()

    generator = ExecutiveReportGenerator()
    generator.run(
        report_json=args.report,
        audience_level=args.audience,
        output_path=args.output,
        output_format=args.output_format,
    )


if __name__ == "__main__":
    main()
@@ -0,0 +1,744 @@
"""
Report Aggregator - Collect and normalize outputs from all SEO skills
=====================================================================
Purpose: Scan for recent audit outputs from skills 11-33, normalize data
         formats, merge findings by domain/date, compute cross-skill health
         scores, and identify top-priority issues across all audits.
Python: 3.10+

Usage:
    python report_aggregator.py --domain https://example.com --json
    python report_aggregator.py --domain https://example.com --output-dir ./audit_outputs --json
    python report_aggregator.py --domain https://example.com --from 2025-01-01 --to 2025-03-31 --json
    python report_aggregator.py --domain https://example.com --json --output report.json
"""

import argparse
import asyncio
import json
import logging
import os
import sys
from dataclasses import dataclass, field, asdict
from datetime import datetime, date
from pathlib import Path
from typing import Any
from urllib.parse import urlparse

from base_client import BaseAsyncClient, config

logger = logging.getLogger(__name__)


# ---------------------------------------------------------------------------
# Skill registry
# ---------------------------------------------------------------------------

SKILL_REGISTRY = {
    11: {"name": "comprehensive-audit", "category": "comprehensive", "weight": 1.0},
    12: {"name": "technical-audit", "category": "technical", "weight": 0.20},
    13: {"name": "on-page-audit", "category": "on_page", "weight": 0.20},
    14: {"name": "core-web-vitals", "category": "performance", "weight": 0.25},
    15: {"name": "search-console", "category": "search_console", "weight": 0.10},
    16: {"name": "schema-validator", "category": "schema", "weight": 0.15},
    17: {"name": "schema-generator", "category": "schema", "weight": 0.10},
    18: {"name": "local-audit", "category": "local", "weight": 0.10},
    19: {"name": "keyword-strategy", "category": "keywords", "weight": 0.15},
    20: {"name": "serp-analysis", "category": "keywords", "weight": 0.10},
    21: {"name": "position-tracking", "category": "keywords", "weight": 0.15},
    22: {"name": "link-building", "category": "links", "weight": 0.15},
    23: {"name": "content-strategy", "category": "content", "weight": 0.15},
    24: {"name": "ecommerce-seo", "category": "ecommerce", "weight": 0.10},
    25: {"name": "kpi-framework", "category": "kpi", "weight": 0.20},
    26: {"name": "international-seo", "category": "international", "weight": 0.10},
    27: {"name": "ai-visibility", "category": "ai_search", "weight": 0.10},
    28: {"name": "knowledge-graph", "category": "entity_seo", "weight": 0.10},
    31: {"name": "competitor-intel", "category": "competitor", "weight": 0.15},
    32: {"name": "crawl-budget", "category": "technical", "weight": 0.10},
    33: {"name": "page-experience", "category": "performance", "weight": 0.10},
}

CATEGORY_WEIGHTS = {
    "technical": 0.20,
    "on_page": 0.15,
    "performance": 0.15,
    "content": 0.10,
    "links": 0.10,
    "local": 0.05,
    "keywords": 0.10,
    "competitor": 0.05,
    "schema": 0.05,
    "kpi": 0.05,
}


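# Note (illustration of intent, not the implementation below): the category
# weights sum to 1.0, so a weighted overall health score can be formed as a
# plain weighted average over category scores.
#
#   0.20 + 0.15 + 0.15 + 0.10 + 0.10 + 0.05 + 0.10 + 0.05 + 0.05 + 0.05 == 1.0
#   overall = sum(scores[c] * CATEGORY_WEIGHTS[c] for c in CATEGORY_WEIGHTS)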
# ---------------------------------------------------------------------------
|
||||
# Data classes
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@dataclass
class SkillOutput:
    """Normalized output from a single SEO skill."""
    skill_id: int = 0
    skill_name: str = ""
    domain: str = ""
    audit_date: str = ""
    category: str = ""
    data: dict[str, Any] = field(default_factory=dict)
    health_score: float = 0.0
    issues: list[dict[str, Any]] = field(default_factory=list)
    wins: list[dict[str, Any]] = field(default_factory=list)
    source_file: str = ""


@dataclass
class AggregatedReport:
    """Full aggregated report from all SEO skill outputs."""
    domain: str = ""
    report_date: str = ""
    skills_included: list[dict[str, Any]] = field(default_factory=list)
    overall_health: float = 0.0
    health_trend: str = "stable"
    category_scores: dict[str, float] = field(default_factory=dict)
    top_issues: list[dict[str, Any]] = field(default_factory=list)
    top_wins: list[dict[str, Any]] = field(default_factory=list)
    timeline: list[dict[str, Any]] = field(default_factory=list)
    raw_outputs: list[dict[str, Any]] = field(default_factory=list)
    audit_id: str = ""
    timestamp: str = ""
    errors: list[str] = field(default_factory=list)


# ---------------------------------------------------------------------------
# Aggregator
# ---------------------------------------------------------------------------

class ReportAggregator(BaseAsyncClient):
    """Aggregate outputs from all SEO skills into unified reports."""

    NOTION_DB_ID = "2c8581e5-8a1e-8035-880b-e38cefc2f3ef"

    def __init__(self):
        super().__init__(max_concurrent=5, requests_per_second=2.0)

    @staticmethod
    def _extract_domain(url: str) -> str:
        """Extract bare domain from URL or return as-is if already bare."""
        # Use removeprefix (not replace) so "www." is only stripped from the
        # front of the host, never from the middle of a domain name.
        if "://" in url:
            parsed = urlparse(url)
            return parsed.netloc.lower().removeprefix("www.")
        return url.lower().removeprefix("www.")

    @staticmethod
    def _generate_audit_id() -> str:
        """Generate audit ID in DASH-YYYYMMDD-NNN format (sequence fixed at 001)."""
        now = datetime.now()
        return f"DASH-{now.strftime('%Y%m%d')}-001"

    def scan_local_outputs(
        self,
        output_dir: str,
        domain: str | None = None,
        date_from: str | None = None,
        date_to: str | None = None,
    ) -> list[SkillOutput]:
        """Find JSON output files from other SEO skills in a directory.

        Scans for files matching patterns from skills 11-33 and parses
        them into normalized SkillOutput objects.
        """
        outputs: list[SkillOutput] = []
        output_path = Path(output_dir)

        if not output_path.exists():
            self.logger.warning(f"Output directory not found: {output_dir}")
            return outputs

        # Scan for JSON files matching skill output patterns
        json_files = list(output_path.rglob("*.json"))
        self.logger.info(f"Found {len(json_files)} JSON files in {output_dir}")

        for json_file in json_files:
            try:
                data = json.loads(json_file.read_text(encoding="utf-8"))

                # Attempt to identify which skill produced this output
                skill_output = self._identify_and_parse(data, str(json_file))
                if skill_output is None:
                    continue

                # Filter by domain if specified (supports subdomains)
                if domain:
                    target_domain = self._extract_domain(domain)
                    if skill_output.domain:
                        file_domain = skill_output.domain
                        # Match exact domain OR subdomains
                        # (e.g., blog.example.com matches example.com)
                        if file_domain != target_domain and not file_domain.endswith("." + target_domain):
                            continue

                # Filter by date range
                if date_from and skill_output.audit_date < date_from:
                    continue
                if date_to and skill_output.audit_date > date_to:
                    continue

                outputs.append(skill_output)
                self.logger.info(
                    f"Parsed output from skill {skill_output.skill_id} "
                    f"({skill_output.skill_name}): {json_file.name}"
                )

            except (json.JSONDecodeError, KeyError, TypeError) as e:
                self.logger.warning(f"Could not parse {json_file}: {e}")

        self.logger.info(f"Successfully parsed {len(outputs)} skill outputs")
        return outputs

    def _identify_and_parse(
        self, data: dict[str, Any], source_file: str
    ) -> SkillOutput | None:
        """Identify which skill produced the output and parse it."""
        skill_output = SkillOutput(source_file=source_file)

        # Strategy 1: Parse skill from audit_id prefix (e.g., KPI-20250115-001)
        audit_id = data.get("audit_id", "")
        if isinstance(audit_id, str):
            prefix_map = {
                "COMP": 11, "TECH": 12, "PAGE": 13, "CWV": 14,
                "GSC": 15, "SCHEMA": 16, "LOCAL": 18, "KW": 19,
                "SERP": 20, "RANK": 21, "LINK": 22, "CONTENT": 23,
                "ECOM": 24, "KPI": 25, "INTL": 26, "AI": 27,
                "KG": 28, "COMPET": 31, "CRAWL": 32, "MIGR": 33,
                "DASH": None,  # Skip self-referencing dashboard reports
            }
            # Match "PREFIX-" so COMPET- audits are not shadowed by the
            # shorter COMP prefix.
            for prefix, skill_id in prefix_map.items():
                if audit_id.startswith(prefix + "-"):
                    if skill_id is None:
                        return None  # Skip aggregated reports
                    skill_info = SKILL_REGISTRY.get(skill_id, {})
                    skill_output.skill_id = skill_id
                    skill_output.skill_name = skill_info.get("name", "unknown")
                    skill_output.category = skill_info.get("category", "unknown")
                    break

        # Strategy 2: Fall back to the audit_type field (used by our-seo-agent outputs)
        if not skill_output.skill_id:
            audit_type = data.get("audit_type", "")
            if isinstance(audit_type, str) and audit_type:
                type_map = {
                    "comprehensive": 11, "technical": 12, "onpage": 13,
                    "cwv": 14, "core-web-vitals": 14,
                    "gsc": 15, "search-console": 15,
                    "schema": 16, "local": 18,
                    "keyword": 19, "serp": 20, "position": 21,
                    "link": 22, "backlink": 22,
                    "content": 23, "ecommerce": 24, "kpi": 25,
                    "international": 26, "hreflang": 26,
                    "ai-visibility": 27, "knowledge-graph": 28, "entity": 28,
                    "competitor": 31, "crawl-budget": 32, "crawl": 32,
                    "migration": 33,
                }
                skill_id = type_map.get(audit_type.lower())
                if skill_id is not None:
                    skill_info = SKILL_REGISTRY.get(skill_id, {})
                    skill_output.skill_id = skill_id
                    skill_output.skill_name = skill_info.get("name", "unknown")
                    skill_output.category = skill_info.get("category", "unknown")

        # Extract domain
        for key in ("url", "target", "domain", "site"):
            if key in data:
                skill_output.domain = self._extract_domain(str(data[key]))
                break

        # Extract health score: check top-level first, then the nested data dict
        score_found = False
        for key in ("health_score", "overall_health", "score"):
            if key in data:
                try:
                    skill_output.health_score = float(data[key])
                    score_found = True
                except (ValueError, TypeError):
                    pass
                break

        if not score_found:
            nested = data.get("data", {})
            if isinstance(nested, dict):
                for key in ("technical_score", "onpage_score", "schema_score",
                            "local_seo_score", "cwv_score", "performance_score",
                            "content_score", "link_score", "keyword_score",
                            "competitor_score", "efficiency_score",
                            "health_score", "overall_score", "score"):
                    val = nested.get(key)
                    if val is not None:
                        try:
                            skill_output.health_score = float(val)
                        except (ValueError, TypeError):
                            pass
                        break

        # Extract audit date
        for key in ("audit_date", "report_date", "timestamp", "found_date"):
            if key in data:
                skill_output.audit_date = str(data[key])[:10]
                break

        if not skill_output.audit_date:
            skill_output.audit_date = date.today().isoformat()

        # Extract issues
        issues_raw = data.get("issues", data.get("critical_issues", []))
        if isinstance(issues_raw, list):
            for issue in issues_raw:
                if isinstance(issue, dict):
                    skill_output.issues.append(issue)
                elif isinstance(issue, str):
                    skill_output.issues.append({"description": issue, "severity": "medium"})

        # Extract wins / recommendations
        wins_raw = data.get("wins", data.get("top_wins", []))
        if isinstance(wins_raw, list):
            for win in wins_raw:
                if isinstance(win, dict):
                    skill_output.wins.append(win)
                elif isinstance(win, str):
                    skill_output.wins.append({"description": win})

        # Store full data
        skill_output.data = data

        # Skip if no useful data was extracted
        if not skill_output.skill_id and not skill_output.domain:
            return None

        return skill_output

    async def query_notion_audits(
        self,
        domain: str,
        date_from: str | None = None,
        date_to: str | None = None,
    ) -> list[SkillOutput]:
        """Fetch past audit entries from the Notion SEO Audit Log database.

        In production, this uses the Notion MCP tools to query the database.
        Returns normalized SkillOutput objects.
        """
        outputs: list[SkillOutput] = []
        self.logger.info(
            f"Querying Notion audits for {domain} "
            f"(db: {self.NOTION_DB_ID}, from={date_from}, to={date_to})"
        )

        # In production, this would call mcp__notion__query-database with
        # filters for Site URL and Found Date. For now, return an empty list
        # as a placeholder.
        self.logger.info(
            "Notion query is a placeholder; use MCP tools in Claude Desktop "
            "or manually provide JSON files via --output-dir."
        )

        return outputs

    def normalize_output(self, skill_output: SkillOutput) -> dict[str, Any]:
        """Normalize a skill output into a unified format."""
        return {
            "skill_id": skill_output.skill_id,
            "skill_name": skill_output.skill_name,
            "domain": skill_output.domain,
            "audit_date": skill_output.audit_date,
            "category": skill_output.category,
            "health_score": skill_output.health_score,
            "issues_count": len(skill_output.issues),
            "wins_count": len(skill_output.wins),
            "issues": skill_output.issues[:10],
            "wins": skill_output.wins[:10],
        }

    def compute_cross_skill_health(
        self, outputs: list[SkillOutput]
    ) -> tuple[float, dict[str, float]]:
        """Compute weighted overall health score across all skills.

        Returns (overall_score, category_scores_dict).
        """
        category_scores: dict[str, list[float]] = {}

        for output in outputs:
            cat = output.category
            if cat and output.health_score > 0:
                category_scores.setdefault(cat, []).append(output.health_score)

        # Average scores per category
        avg_category: dict[str, float] = {}
        for cat, scores in category_scores.items():
            avg_category[cat] = round(sum(scores) / len(scores), 1)

        # Weighted overall score
        total_weight = 0.0
        weighted_sum = 0.0
        for cat, avg_score in avg_category.items():
            weight = CATEGORY_WEIGHTS.get(cat, 0.05)
            weighted_sum += avg_score * weight
            total_weight += weight

        overall = round(weighted_sum / total_weight, 1) if total_weight > 0 else 0.0

        return overall, avg_category

    def identify_priorities(
        self, outputs: list[SkillOutput]
    ) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
        """Identify top issues and wins across all skill outputs.

        Returns (top_issues, top_wins).
        """
        all_issues: list[dict[str, Any]] = []
        all_wins: list[dict[str, Any]] = []

        severity_order = {"critical": 0, "high": 1, "medium": 2, "low": 3}

        for output in outputs:
            for issue in output.issues:
                all_issues.append({
                    **issue,
                    "source_skill": output.skill_name,
                    "source_skill_id": output.skill_id,
                    "category": output.category,
                })

            for win in output.wins:
                all_wins.append({
                    **win,
                    "source_skill": output.skill_name,
                    "source_skill_id": output.skill_id,
                    "category": output.category,
                })

        # Sort issues by severity (critical first)
        all_issues.sort(
            key=lambda i: severity_order.get(i.get("severity", "medium"), 2)
        )

        return all_issues[:20], all_wins[:20]

    def build_timeline(self, outputs: list[SkillOutput]) -> list[dict[str, Any]]:
        """Build an audit history timeline from all skill outputs."""
        timeline: list[dict[str, Any]] = []

        for output in outputs:
            timeline.append({
                "date": output.audit_date,
                "skill": output.skill_name,
                "skill_id": output.skill_id,
                "health_score": output.health_score,
                "category": output.category,
                "issues_count": len(output.issues),
            })

        # Sort by date descending (most recent first)
        timeline.sort(key=lambda e: e.get("date", ""), reverse=True)
        return timeline

    async def run(
        self,
        domain: str,
        output_dir: str | None = None,
        date_from: str | None = None,
        date_to: str | None = None,
    ) -> AggregatedReport:
        """Orchestrate the full report aggregation pipeline."""
        target_domain = self._extract_domain(domain)
        report = AggregatedReport(
            domain=target_domain,
            report_date=date.today().isoformat(),
            audit_id=self._generate_audit_id(),
            timestamp=datetime.now().isoformat(),
        )

        all_outputs: list[SkillOutput] = []

        # Step 1: Scan local outputs
        if output_dir:
            self.logger.info(f"Step 1/5: Scanning local outputs in {output_dir}...")
            local_outputs = self.scan_local_outputs(
                output_dir, domain=target_domain,
                date_from=date_from, date_to=date_to,
            )
            all_outputs.extend(local_outputs)
        else:
            self.logger.info("Step 1/5: No output directory specified, skipping local scan.")

        # Step 2: Query Notion for past audits
        self.logger.info("Step 2/5: Querying Notion for past audits...")
        try:
            notion_outputs = await self.query_notion_audits(
                domain=target_domain,
                date_from=date_from,
                date_to=date_to,
            )
            all_outputs.extend(notion_outputs)
        except Exception as e:
            msg = f"Notion query error: {e}"
            self.logger.error(msg)
            report.errors.append(msg)

        if not all_outputs:
            self.logger.warning(
                "No skill outputs found. Provide --output-dir with JSON files "
                "from SEO skills 11-33, or ensure the Notion audit log has entries."
            )
            report.errors.append("No skill outputs found to aggregate.")
            return report

        # Step 3: Normalize and compute health scores
        self.logger.info(
            f"Step 3/5: Normalizing {len(all_outputs)} skill outputs..."
        )
        report.skills_included = [
            {
                "skill_id": o.skill_id,
                "skill_name": o.skill_name,
                "audit_date": o.audit_date,
            }
            for o in all_outputs
        ]
        report.raw_outputs = [self.normalize_output(o) for o in all_outputs]

        overall_health, category_scores = self.compute_cross_skill_health(all_outputs)
        report.overall_health = overall_health
        report.category_scores = category_scores

        # Determine health trend: compare the older half of dated scores
        # against the newer half, with a +/-3 point dead band.
        scores_by_date = sorted(
            [(o.audit_date, o.health_score) for o in all_outputs if o.health_score > 0],
            key=lambda x: x[0],
        )
        if len(scores_by_date) >= 2:
            mid = len(scores_by_date) // 2
            older_avg = sum(s for _, s in scores_by_date[:mid]) / max(mid, 1)
            newer_avg = sum(s for _, s in scores_by_date[mid:]) / max(len(scores_by_date) - mid, 1)
            if newer_avg > older_avg + 3:
                report.health_trend = "improving"
            elif newer_avg < older_avg - 3:
                report.health_trend = "declining"
            else:
                report.health_trend = "stable"

        # Step 4: Identify priorities
        self.logger.info("Step 4/5: Identifying top issues and wins...")
        report.top_issues, report.top_wins = self.identify_priorities(all_outputs)

        # Step 5: Build timeline
        self.logger.info("Step 5/5: Building audit history timeline...")
        report.timeline = self.build_timeline(all_outputs)

        self.logger.info(
            f"Aggregation complete: {len(all_outputs)} skills, "
            f"health={report.overall_health}/100, "
            f"trend={report.health_trend}, "
            f"issues={len(report.top_issues)}, wins={len(report.top_wins)}"
        )

        return report


# ---------------------------------------------------------------------------
# Output formatting
# ---------------------------------------------------------------------------

def _format_text_report(report: AggregatedReport) -> str:
    """Format aggregated report as human-readable text."""
    lines: list[str] = []
    lines.append("=" * 70)
    lines.append(" SEO REPORTING DASHBOARD - AGGREGATED REPORT")
    lines.append(f" Domain: {report.domain}")
    lines.append(f" Report Date: {report.report_date}")
    lines.append(f" Audit ID: {report.audit_id}")
    lines.append("=" * 70)

    # Health score
    lines.append("")
    lines.append(f" Overall Health: {report.overall_health}/100 ({report.health_trend})")
    lines.append("-" * 50)

    # Category scores
    if report.category_scores:
        lines.append("")
        lines.append("--- CATEGORY SCORES ---")
        for cat, score in sorted(
            report.category_scores.items(), key=lambda x: x[1], reverse=True
        ):
            bar = "#" * int(score / 5) + "." * (20 - int(score / 5))
            lines.append(f" {cat:<20} [{bar}] {score:.1f}/100")

    # Skills included
    if report.skills_included:
        lines.append("")
        lines.append("--- SKILLS INCLUDED ---")
        for skill in report.skills_included:
            lines.append(
                f" [{skill['skill_id']:>2}] {skill['skill_name']:<30} "
                f"({skill['audit_date']})"
            )

    # Top issues
    if report.top_issues:
        lines.append("")
        lines.append("--- TOP ISSUES ---")
        for i, issue in enumerate(report.top_issues[:10], 1):
            severity = issue.get("severity", "medium").upper()
            desc = issue.get("description", "No description")
            cat = issue.get("category", "")
            lines.append(f" {i:>2}. [{severity}] ({cat}) {desc}")

    # Top wins
    if report.top_wins:
        lines.append("")
        lines.append("--- TOP WINS ---")
        for i, win in enumerate(report.top_wins[:10], 1):
            desc = win.get("description", "No description")
            cat = win.get("category", "")
            lines.append(f" {i:>2}. ({cat}) {desc}")

    # Timeline
    if report.timeline:
        lines.append("")
        lines.append("--- AUDIT TIMELINE ---")
        lines.append(f" {'Date':<12} {'Skill':<25} {'Score':>8} {'Issues':>8}")
        lines.append(" " + "-" * 55)
        for entry in report.timeline[:15]:
            lines.append(
                f" {entry['date']:<12} {entry['skill']:<25} "
                f"{entry['health_score']:>7.1f} {entry['issues_count']:>7}"
            )

    # Errors
    if report.errors:
        lines.append("")
        lines.append("--- ERRORS ---")
        for err in report.errors:
            lines.append(f" - {err}")

    lines.append("")
    lines.append("=" * 70)
    return "\n".join(lines)

def _serialize_report(report: AggregatedReport) -> dict:
    """Convert report to JSON-serializable dict."""
    return {
        "domain": report.domain,
        "report_date": report.report_date,
        "overall_health": report.overall_health,
        "health_trend": report.health_trend,
        "skills_included": report.skills_included,
        "category_scores": report.category_scores,
        "top_issues": report.top_issues,
        "top_wins": report.top_wins,
        "timeline": report.timeline,
        "raw_outputs": report.raw_outputs,
        "audit_id": report.audit_id,
        "timestamp": report.timestamp,
        "errors": report.errors if report.errors else None,
    }


# ---------------------------------------------------------------------------
# CLI
# ---------------------------------------------------------------------------

def parse_args(argv: list[str] | None = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="SEO Report Aggregator - Collect and normalize outputs from all SEO skills",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""\
Examples:
  python report_aggregator.py --domain https://example.com --json
  python report_aggregator.py --domain https://example.com --output-dir ./audit_outputs --json
  python report_aggregator.py --domain https://example.com --from 2025-01-01 --to 2025-03-31 --json
""",
    )
    parser.add_argument(
        "--domain",
        required=True,
        help="Target domain to aggregate reports for",
    )
    parser.add_argument(
        "--output-dir",
        type=str,
        default=None,
        help="Directory containing JSON outputs from SEO skills",
    )
    parser.add_argument(
        "--from",
        type=str,
        default=None,
        dest="date_from",
        help="Start date for filtering (YYYY-MM-DD)",
    )
    parser.add_argument(
        "--to",
        type=str,
        default=None,
        dest="date_to",
        help="End date for filtering (YYYY-MM-DD)",
    )
    parser.add_argument(
        "--json",
        action="store_true",
        default=False,
        help="Output in JSON format",
    )
    parser.add_argument(
        "--output",
        type=str,
        default=None,
        help="Save output to file path",
    )
    return parser.parse_args(argv)

async def async_main(args: argparse.Namespace) -> None:
    aggregator = ReportAggregator()

    report = await aggregator.run(
        domain=args.domain,
        output_dir=args.output_dir,
        date_from=args.date_from,
        date_to=args.date_to,
    )

    if args.json:
        output_str = json.dumps(
            _serialize_report(report), indent=2, ensure_ascii=False
        )
    else:
        output_str = _format_text_report(report)

    if args.output:
        Path(args.output).write_text(output_str, encoding="utf-8")
        logger.info(f"Report saved to {args.output}")
    else:
        print(output_str)

    aggregator.print_stats()


def main() -> None:
    args = parse_args()
    asyncio.run(async_main(args))


if __name__ == "__main__":
    main()
@@ -0,0 +1,9 @@
# 34-seo-reporting-dashboard dependencies
requests>=2.31.0
aiohttp>=3.9.0
pandas>=2.1.0
tenacity>=8.2.0
tqdm>=4.66.0
python-dotenv>=1.0.0
rich>=13.7.0
jinja2>=3.1.0