our-claude-skills/ourdigital-custom-skills/12-ourdigital-seo-audit/reference.md
Andrew Yim 9426787ba6 feat(seo-audit): Add comprehensive SEO audit skill
Add ourdigital-seo-audit skill with:
- Full site audit orchestrator (full_audit.py)
- Google Search Console and PageSpeed API clients
- Schema.org JSON-LD validation and generation
- XML sitemap and robots.txt validation
- Notion database integration for findings export
- Core Web Vitals measurement and analysis
- 7 schema templates (article, faq, product, etc.)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 02:30:02 +09:00

OurDigital SEO Audit - API Reference

Google Search Console API

Authentication

from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ['https://www.googleapis.com/auth/webmasters.readonly']
credentials = service_account.Credentials.from_service_account_file(
    'service-account-key.json', scopes=SCOPES
)
service = build('searchconsole', 'v1', credentials=credentials)

Endpoints

Search Analytics

# Get search performance data
request = {
    'startDate': '2024-01-01',
    'endDate': '2024-12-31',
    'dimensions': ['query', 'page', 'country', 'device'],
    'rowLimit': 25000,
    'dimensionFilterGroups': [{
        'filters': [{
            'dimension': 'country',
            'expression': 'kor'
        }]
    }]
}
response = service.searchanalytics().query(
    siteUrl='sc-domain:example.com',
    body=request
).execute()
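
Because a single request returns at most 25,000 rows, larger result sets need paging with the `startRow` field. The following is a minimal sketch under that assumption; `fetch_all_rows` and `run_query` are hypothetical names, where `run_query` is any callable that takes a request body and returns the parsed response.

```python
def fetch_all_rows(run_query, base_request, page_size=25000, max_pages=100):
    """Collect rows across pages by advancing startRow until a short page arrives.

    run_query is a hypothetical callable: it receives a request body (dict)
    and returns the parsed API response (a dict that may contain 'rows').
    """
    rows = []
    for page in range(max_pages):
        # Copy the base request and set the paging fields for this page
        body = dict(base_request, rowLimit=page_size, startRow=page * page_size)
        batch = run_query(body).get('rows', [])
        rows.extend(batch)
        if len(batch) < page_size:
            break  # A short (or empty) page means there is nothing left
    return rows
```

With the client above, `run_query` could be `lambda body: service.searchanalytics().query(siteUrl='sc-domain:example.com', body=body).execute()`.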

URL Inspection

request = {
    'inspectionUrl': 'https://example.com/page',
    'siteUrl': 'sc-domain:example.com'
}
response = service.urlInspection().index().inspect(body=request).execute()

Sitemaps

# List sitemaps
sitemaps = service.sitemaps().list(siteUrl='sc-domain:example.com').execute()

# Submit sitemap
service.sitemaps().submit(
    siteUrl='sc-domain:example.com',
    feedpath='https://example.com/sitemap.xml'
).execute()

Rate Limits

  • 1,200 queries per minute per site and per user (Search Analytics); project-level quotas are higher
  • 25,000 rows max per request

PageSpeed Insights API

Authentication

import requests

API_KEY = 'your-api-key'
BASE_URL = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed'

Request Parameters

params = {
    'url': 'https://example.com',
    'key': API_KEY,
    'strategy': 'mobile',  # or 'desktop'
    'category': ['performance', 'accessibility', 'best-practices', 'seo']
}
response = requests.get(BASE_URL, params=params)

Response Structure

{
  "lighthouseResult": {
    "categories": {
      "performance": { "score": 0.85 },
      "seo": { "score": 0.92 }
    },
    "audits": {
      "largest-contentful-paint": {
        "numericValue": 2500,
        "displayValue": "2.5 s"
      },
      "cumulative-layout-shift": {
        "numericValue": 0.05
      },
      "total-blocking-time": {
        "numericValue": 150
      }
    }
  },
  "loadingExperience": {
    "metrics": {
      "LARGEST_CONTENTFUL_PAINT_MS": {
        "percentile": 2500,
        "category": "AVERAGE"
      }
    }
  }
}
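
A response shaped like the one above can be reduced to the few values an audit usually reports. This is a sketch only; `summarize_psi` is a hypothetical helper, and it assumes the keys shown in the sample (real responses contain many more audits and may omit `loadingExperience` for low-traffic URLs).

```python
LAB_AUDITS = (
    'largest-contentful-paint',
    'cumulative-layout-shift',
    'total-blocking-time',
)

def summarize_psi(result):
    """Extract category scores (0-100), lab metrics, and field percentiles."""
    lighthouse = result.get('lighthouseResult', {})
    # Category scores arrive as 0.0-1.0 floats; convert to whole percentages
    scores = {
        name: round(cat['score'] * 100)
        for name, cat in lighthouse.get('categories', {}).items()
        if cat.get('score') is not None
    }
    audits = lighthouse.get('audits', {})
    lab = {
        audit_id: audits[audit_id]['numericValue']
        for audit_id in LAB_AUDITS
        if 'numericValue' in audits.get(audit_id, {})
    }
    # Field data (real-user measurements) is optional in the response
    field = {
        name: metric.get('percentile')
        for name, metric in result.get('loadingExperience', {}).get('metrics', {}).items()
    }
    return {'scores': scores, 'lab': lab, 'field': field}
```

Typical use would be `summarize_psi(response.json())` on the `requests` response from the previous section.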

Core Web Vitals Thresholds

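| Metric | Good | Needs Improvement | Poor |
|--------|------|-------------------|------|
| LCP | ≤ 2.5s | 2.5s - 4.0s | > 4.0s |
| FID (replaced by INP as a Core Web Vital in March 2024) | ≤ 100ms | 100ms - 300ms | > 300ms |
| CLS | ≤ 0.1 | 0.1 - 0.25 | > 0.25 |
| INP | ≤ 200ms | 200ms - 500ms | > 500ms |
| TTFB | ≤ 800ms | 800ms - 1800ms | > 1800ms |
| FCP | ≤ 1.8s | 1.8s - 3.0s | > 3.0s |

LCP, CLS, and INP are the current Core Web Vitals; TTFB and FCP are supporting diagnostic metrics.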
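
The thresholds can be applied mechanically when grading measurements. The sketch below encodes the listed boundaries; `THRESHOLDS` and `rate_metric` are hypothetical names, and time-based values are assumed to be in milliseconds.

```python
# (good_upper_bound, needs_improvement_upper_bound) per metric
THRESHOLDS = {
    'LCP': (2500, 4000),   # milliseconds
    'FID': (100, 300),     # milliseconds
    'CLS': (0.1, 0.25),    # unitless score
    'INP': (200, 500),     # milliseconds
    'TTFB': (800, 1800),   # milliseconds
    'FCP': (1800, 3000),   # milliseconds
}

def rate_metric(metric, value):
    """Return 'good', 'needs improvement', or 'poor' for a measured value."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return 'good'
    if value <= needs_improvement:
        return 'needs improvement'
    return 'poor'
```

For example, `rate_metric('LCP', 2500)` falls in the good band because the boundary value itself is inclusive.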

Rate Limits

  • 25,000 queries per day (free tier)
  • Per-minute quotas also apply; check the project's quota page in Google Cloud Console for current values

Google Analytics 4 Data API

Authentication

from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import RunReportRequest

client = BetaAnalyticsDataClient()
property_id = '123456789'

Common Reports

Traffic Overview

request = RunReportRequest(
    property=f'properties/{property_id}',
    dimensions=[
        {'name': 'date'},
        {'name': 'sessionDefaultChannelGroup'}
    ],
    metrics=[
        {'name': 'sessions'},
        {'name': 'totalUsers'},
        {'name': 'screenPageViews'},
        {'name': 'bounceRate'}
    ],
    date_ranges=[{'start_date': '30daysAgo', 'end_date': 'today'}]
)
response = client.run_report(request)

Landing Pages

request = RunReportRequest(
    property=f'properties/{property_id}',
    dimensions=[{'name': 'landingPage'}],
    metrics=[
        {'name': 'sessions'},
        {'name': 'engagementRate'},
        {'name': 'conversions'}  # Newer GA4 properties expose this as 'keyEvents'
    ],
    date_ranges=[{'start_date': '30daysAgo', 'end_date': 'today'}],
    order_bys=[{'metric': {'metric_name': 'sessions'}, 'desc': True}],
    limit=100
)
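
Report responses carry header lists plus rows of values, so a small adapter makes them easier to export. This is a sketch, assuming the response exposes `dimension_headers`, `metric_headers`, and `rows` as the Python client does; `report_to_dicts` is a hypothetical helper name.

```python
def report_to_dicts(response):
    """Flatten a run_report response into one dict per row.

    Values are returned as strings, which is how the API delivers them;
    cast metrics to int or float as needed downstream.
    """
    dim_names = [header.name for header in response.dimension_headers]
    metric_names = [header.name for header in response.metric_headers]
    records = []
    for row in response.rows:
        record = {name: v.value for name, v in zip(dim_names, row.dimension_values)}
        record.update(
            {name: v.value for name, v in zip(metric_names, row.metric_values)}
        )
        records.append(record)
    return records
```

Usage would look like `rows = report_to_dicts(client.run_report(request))`.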

Useful Dimensions

  • date, dateHour
  • sessionDefaultChannelGroup
  • landingPage, pagePath
  • deviceCategory, operatingSystem
  • country, city
  • sessionSource, sessionMedium

Useful Metrics

  • sessions, totalUsers, newUsers
  • screenPageViews, engagementRate
  • averageSessionDuration
  • bounceRate, conversions

Google Trends (pytrends)

Installation

pip install pytrends

Usage

from pytrends.request import TrendReq

pytrends = TrendReq(hl='ko-KR', tz=540)

# Interest over time
pytrends.build_payload(['keyword1', 'keyword2'], timeframe='today 12-m', geo='KR')
interest_df = pytrends.interest_over_time()

# Related queries
related = pytrends.related_queries()

# Trending searches
trending = pytrends.trending_searches(pn='south_korea')

# Suggestions
suggestions = pytrends.suggestions('seo')

Rate Limits

  • No official limits, but implement delays (1-2 seconds between requests)
  • May trigger CAPTCHA with heavy usage

Custom Search JSON API

Authentication

import requests

API_KEY = 'your-api-key'
CX = 'your-search-engine-id'  # Programmable Search Engine ID
BASE_URL = 'https://www.googleapis.com/customsearch/v1'

Request

params = {
    'key': API_KEY,
    'cx': CX,
    'q': 'search query',
    'num': 10,  # 1-10
    'start': 1,  # Pagination
    'gl': 'kr',  # Country
    'hl': 'ko'   # Language
}
response = requests.get(BASE_URL, params=params)
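
Since `num` is capped at 10, collecting more results means issuing several requests with increasing `start` values. The sketch below computes those offsets; `page_starts` is a hypothetical helper, and the default cap of 100 reflects the API's limit on results per query.

```python
def page_starts(wanted, per_page=10, api_cap=100):
    """Return the 1-based 'start' values needed to collect `wanted` results."""
    wanted = min(wanted, api_cap)  # The API does not page past the cap
    return list(range(1, wanted + 1, per_page))
```

Each returned value can be assigned to `params['start']` before the request is sent; keep in mind that every page counts against the daily quota.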

Response Structure

{
  "searchInformation": {
    "totalResults": "12345",
    "searchTime": 0.5
  },
  "items": [
    {
      "title": "Page Title",
      "link": "https://example.com",
      "snippet": "Description...",
      "pagemap": {
        "metatags": [...],
        "cse_image": [...]
      }
    }
  ]
}

Rate Limits

  • 100 queries per day (free)
  • Up to 10,000 queries per day with billing enabled ($5 per 1,000 queries beyond the free 100)

Knowledge Graph Search API

Request

import requests

API_KEY = 'your-api-key'
BASE_URL = 'https://kgsearch.googleapis.com/v1/entities:search'

params = {
    'key': API_KEY,
    'query': 'entity name',
    'types': 'Organization',
    'languages': 'ko',
    'limit': 10
}
response = requests.get(BASE_URL, params=params)

Response

{
  "itemListElement": [
    {
      "result": {
        "@type": "EntitySearchResult",
        "name": "Entity Name",
        "description": "Description...",
        "@id": "kg:/m/entity_id",
        "detailedDescription": {
          "articleBody": "..."
        }
      },
      "resultScore": 1234.56
    }
  ]
}
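
To turn that response into a ranked list of candidate entities, something like the following could be used. It is a sketch based only on the fields shown above; `top_entities` and its `min_score` parameter are hypothetical.

```python
def top_entities(response, min_score=0.0):
    """Return entities from a Knowledge Graph response, highest score first."""
    entities = []
    for element in response.get('itemListElement', []):
        result = element.get('result', {})
        score = element.get('resultScore', 0.0)
        if score >= min_score:
            entities.append({
                'id': result.get('@id'),
                'name': result.get('name'),
                'description': result.get('description'),
                'score': score,
            })
    return sorted(entities, key=lambda entity: entity['score'], reverse=True)
```

Called as `top_entities(response.json(), min_score=100)`, it would drop weak matches before any further checks.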

Schema.org Reference

JSON-LD Format

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Company Name",
  "url": "https://example.com"
}
</script>
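
When auditing a page, the first step is usually pulling these script blocks out of the HTML and checking that they parse. The following is a standard-library sketch of that step, not the skill's actual validator; `JsonLdExtractor` and `extract_jsonld` are hypothetical names.

```python
import json
from html.parser import HTMLParser

class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []   # Successfully parsed JSON-LD objects
        self.errors = []  # Parse error messages for malformed blocks

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get('type') or '').lower()
        if tag == 'script' and script_type == 'application/ld+json':
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == 'script' and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads(''.join(self._buffer)))
            except json.JSONDecodeError as exc:
                self.errors.append(str(exc))

def extract_jsonld(html):
    """Return (items, errors) for all JSON-LD blocks found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    return parser.items, parser.errors
```

This only confirms that each block is valid JSON; checking required properties per schema type is a separate step.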

Common Schema Types

Organization

{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Company Name",
  "url": "https://example.com",
  "logo": "https://example.com/logo.png",
  "sameAs": [
    "https://facebook.com/company",
    "https://twitter.com/company"
  ],
  "contactPoint": {
    "@type": "ContactPoint",
    "telephone": "+82-2-1234-5678",
    "contactType": "customer service"
  }
}

LocalBusiness

{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Business Name",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Street",
    "addressLocality": "Seoul",
    "addressRegion": "Seoul",
    "postalCode": "12345",
    "addressCountry": "KR"
  },
  "geo": {
    "@type": "GeoCoordinates",
    "latitude": 37.5665,
    "longitude": 126.9780
  },
  "openingHoursSpecification": [{
    "@type": "OpeningHoursSpecification",
    "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    "opens": "09:00",
    "closes": "18:00"
  }]
}

Article/BlogPosting

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Article Title",
  "author": {
    "@type": "Person",
    "name": "Author Name"
  },
  "datePublished": "2024-01-01",
  "dateModified": "2024-01-15",
  "image": "https://example.com/image.jpg",
  "publisher": {
    "@type": "Organization",
    "name": "Publisher Name",
    "logo": {
      "@type": "ImageObject",
      "url": "https://example.com/logo.png"
    }
  }
}

Product

{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Product Name",
  "image": "https://example.com/product.jpg",
  "description": "Product description",
  "brand": {
    "@type": "Brand",
    "name": "Brand Name"
  },
  "offers": {
    "@type": "Offer",
    "price": "29900",
    "priceCurrency": "KRW",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.5",
    "reviewCount": "100"
  }
}

FAQPage

{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "Question text?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Answer text."
    }
  }]
}
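
For pages with many questions, the same structure can be generated rather than written by hand. A minimal sketch, assuming the input is a list of (question, answer) string pairs; `build_faq_page` is a hypothetical helper.

```python
import json

def build_faq_page(pairs):
    """Build a FAQPage JSON-LD dict from (question, answer) pairs."""
    return {
        '@context': 'https://schema.org',
        '@type': 'FAQPage',
        'mainEntity': [
            {
                '@type': 'Question',
                'name': question,
                'acceptedAnswer': {'@type': 'Answer', 'text': answer},
            }
            for question, answer in pairs
        ],
    }

def to_jsonld(data):
    """Serialize a JSON-LD dict, keeping non-ASCII text (e.g. Korean) readable."""
    return json.dumps(data, ensure_ascii=False, indent=2)
```

The serialized string is what goes inside the `application/ld+json` script tag.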

BreadcrumbList

{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [{
    "@type": "ListItem",
    "position": 1,
    "name": "Home",
    "item": "https://example.com/"
  }, {
    "@type": "ListItem",
    "position": 2,
    "name": "Category",
    "item": "https://example.com/category/"
  }]
}
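
Positions must be consecutive and start at 1, which is easy to get wrong by hand. The sketch below derives them from an ordered trail; `build_breadcrumbs` is a hypothetical helper that takes (name, url) pairs from the site root down to the current page.

```python
def build_breadcrumbs(trail):
    """Build a BreadcrumbList JSON-LD dict from ordered (name, url) pairs."""
    return {
        '@context': 'https://schema.org',
        '@type': 'BreadcrumbList',
        'itemListElement': [
            {'@type': 'ListItem', 'position': position, 'name': name, 'item': url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }
```

Passing `[('Home', 'https://example.com/'), ('Category', 'https://example.com/category/')]` reproduces the example above.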

WebSite (with SearchAction)

{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "name": "Site Name",
  "url": "https://example.com",
  "potentialAction": {
    "@type": "SearchAction",
    "target": {
      "@type": "EntryPoint",
      "urlTemplate": "https://example.com/search?q={search_term_string}"
    },
    "query-input": "required name=search_term_string"
  }
}

XML Sitemap Specification

Format

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/page</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>

Index Sitemap

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2024-01-15</lastmod>
  </sitemap>
</sitemapindex>

Limits

  • 50,000 URLs max per sitemap
  • 50MB uncompressed max
  • Use index for larger sites
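
Staying under the URL limit is a matter of chunking the URL list and writing one file per chunk. The following standard-library sketch shows that idea; `chunk_urls` and `build_urlset` are hypothetical names, and it emits only `<loc>` elements (no `lastmod`) and does not check the 50MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9'
MAX_URLS = 50000

def chunk_urls(urls, size=MAX_URLS):
    """Split a list of URLs into chunks that fit in one sitemap each."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

def build_urlset(urls):
    """Serialize one chunk of URLs as a sitemap document (bytes, UTF-8)."""
    urlset = ET.Element('urlset', xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, 'url')
        ET.SubElement(entry, 'loc').text = url  # Special characters are escaped on output
    return ET.tostring(urlset, encoding='utf-8', xml_declaration=True)
```

Each chunk's output file would then be listed in a sitemap index like the one shown earlier.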

Best Practices

  • Use absolute URLs
  • Include only canonical URLs
  • Keep lastmod accurate
  • Exclude noindex pages

Robots.txt Reference

Directives

# Comments start with #
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /public/

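User-agent: Googlebot
Disallow: /no-google/

# Crawl-delay is honored by some crawlers (e.g. Bingbot) but ignored by Googlebot
User-agent: Bingbot
Crawl-delay: 1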

Sitemap: https://example.com/sitemap.xml

Common User-agents

  • * - All bots
  • Googlebot - Google crawler
  • Googlebot-Image - Google Image crawler
  • Bingbot - Bing crawler
  • Yandex - Yandex crawler
  • Baiduspider - Baidu crawler

Pattern Matching

  • * - Wildcard (any sequence)
  • $ - End of URL
  • /path/ - Directory
  • /*.pdf$ - All PDFs
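
These rules can be checked programmatically by translating a pattern into a regular expression. This is a simplified sketch of the matching semantics listed above, not a full robots.txt evaluator (it ignores Allow/Disallow precedence and percent-encoding); both function names are hypothetical.

```python
import re

def robots_pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any sequence of characters; a trailing '$' anchors the
    end of the URL path. Everything else is matched literally.
    """
    anchored = pattern.endswith('$')
    if anchored:
        pattern = pattern[:-1]
    # Escape literal segments and join them with '.*' where '*' appeared
    body = '.*'.join(re.escape(part) for part in pattern.split('*'))
    return re.compile(body + ('$' if anchored else ''))

def path_matches(pattern, path):
    """Return True if the path matches the pattern from its first character."""
    return robots_pattern_to_regex(pattern).match(path) is not None
```

Patterns without `$` behave as prefixes, so `/admin/` also matches `/admin/users`.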

Testing

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

# Check if URL is allowed
can_fetch = rp.can_fetch("Googlebot", "https://example.com/page")

Error Handling

HTTP Status Codes

| Code | Meaning | Action |
|------|---------|--------|
| 200 | OK | Process response |
| 301/302 | Redirect | Follow or flag |
| 400 | Bad Request | Check parameters |
| 401 | Unauthorized | Check credentials |
| 403 | Forbidden | Check permissions |
| 404 | Not Found | Flag missing resource |
| 429 | Rate Limited | Implement backoff |
| 500 | Server Error | Retry with backoff |
| 503 | Service Unavailable | Retry later |

Retry Strategy

from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10)
)
async def make_request(url):
    # Request logic
    pass
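
Where tenacity is not available, the same policy (3 attempts, exponential wait between 2 and 10 seconds) can be approximated with the standard library. This is a synchronous sketch, not a drop-in replacement for the decorator above; `call_with_backoff` and `send` are hypothetical names, and only the retryable codes from the status table are retried.

```python
import time

RETRYABLE = {429, 500, 503}

def call_with_backoff(send, max_attempts=3, base_delay=2.0, max_delay=10.0,
                      sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is a hypothetical zero-argument callable returning (status_code, body).
    The sleep function is injectable so the delay can be skipped in tests.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts:
            return status, body
        # Delay doubles each attempt: 2s, 4s, 8s, ... capped at max_delay
        sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
```

The final response is returned even when it is still an error, so the caller decides whether to raise or record a finding.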