# OurDigital SEO Audit - API Reference

## Google Search Console API

### Authentication

```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ['https://www.googleapis.com/auth/webmasters.readonly']

credentials = service_account.Credentials.from_service_account_file(
    'service-account-key.json', scopes=SCOPES
)
service = build('searchconsole', 'v1', credentials=credentials)
```

### Endpoints

#### Search Analytics

```python
# Get search performance data
request = {
    'startDate': '2024-01-01',
    'endDate': '2024-12-31',
    'dimensions': ['query', 'page', 'country', 'device'],
    'rowLimit': 25000,
    'dimensionFilterGroups': [{
        'filters': [{
            'dimension': 'country',
            'expression': 'kor'  # ISO 3166-1 alpha-3 country code
        }]
    }]
}

response = service.searchanalytics().query(
    siteUrl='sc-domain:example.com', body=request
).execute()
```

#### URL Inspection

```python
request = {
    'inspectionUrl': 'https://example.com/page',
    'siteUrl': 'sc-domain:example.com'
}
response = service.urlInspection().index().inspect(body=request).execute()
```

#### Sitemaps

```python
# List sitemaps
sitemaps = service.sitemaps().list(siteUrl='sc-domain:example.com').execute()

# Submit sitemap (needs the read/write scope
# 'https://www.googleapis.com/auth/webmasters', not webmasters.readonly)
service.sitemaps().submit(
    siteUrl='sc-domain:example.com',
    feedpath='https://example.com/sitemap.xml'
).execute()
```

### Rate Limits

- 1,200 queries per minute per site and per user (Search Analytics)
- 25,000 rows max per request

---

## PageSpeed Insights API

### Authentication

```python
import requests

API_KEY = 'your-api-key'
BASE_URL = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed'
```

### Request Parameters

```python
params = {
    'url': 'https://example.com',
    'key': API_KEY,
    'strategy': 'mobile',  # or 'desktop'
    # requests sends a list as repeated 'category' query parameters
    'category': ['performance', 'accessibility', 'best-practices', 'seo']
}
# A Lighthouse run can take tens of seconds, so set a generous timeout
response = requests.get(BASE_URL, params=params, timeout=60)
```

### Response Structure

```json
{
  "lighthouseResult": {
    "categories": {
      "performance": { "score": 0.85 },
      "seo": { "score": 0.92 }
    },
    "audits": {
      "largest-contentful-paint": {
        "numericValue": 2500,
        "displayValue": "2.5 s"
      },
      "cumulative-layout-shift": { "numericValue": 0.05 },
      "total-blocking-time": { "numericValue": 150 }
    }
  },
  "loadingExperience": {
    "metrics": {
      "LARGEST_CONTENTFUL_PAINT_MS": {
        "percentile": 2500,
        "category": "AVERAGE"
      }
    }
  }
}
```

### Core Web Vitals Thresholds

| Metric | Good | Needs Improvement | Poor |
|--------|------|-------------------|------|
| LCP | ≤ 2.5s | 2.5s - 4.0s | > 4.0s |
| FID | ≤ 100ms | 100ms - 300ms | > 300ms |
| CLS | ≤ 0.1 | 0.1 - 0.25 | > 0.25 |
| INP | ≤ 200ms | 200ms - 500ms | > 500ms |
| TTFB | ≤ 800ms | 800ms - 1800ms | > 1800ms |
| FCP | ≤ 1.8s | 1.8s - 3.0s | > 3.0s |

INP replaced FID as a Core Web Vital in March 2024. TTFB and FCP are supporting diagnostic metrics, not Core Web Vitals.

### Rate Limits

- 25,000 queries per day (free tier)
- A per-minute quota also applies; check the API's quota page in Google Cloud Console for the current value

---

## Google Analytics 4 Data API

### Authentication

```python
from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import RunReportRequest

# Uses Application Default Credentials
# (for example, the GOOGLE_APPLICATION_CREDENTIALS environment variable)
client = BetaAnalyticsDataClient()
property_id = '123456789'
```

### Common Reports

#### Traffic Overview

```python
request = RunReportRequest(
    property=f'properties/{property_id}',
    dimensions=[
        {'name': 'date'},
        {'name': 'sessionDefaultChannelGroup'}
    ],
    metrics=[
        {'name': 'sessions'},
        {'name': 'totalUsers'},
        {'name': 'screenPageViews'},
        {'name': 'bounceRate'}
    ],
    date_ranges=[{'start_date': '30daysAgo', 'end_date': 'today'}]
)
response = client.run_report(request)
```

#### Landing Pages

```python
request = RunReportRequest(
    property=f'properties/{property_id}',
    dimensions=[{'name': 'landingPage'}],
    metrics=[
        {'name': 'sessions'},
        {'name': 'engagementRate'},
        {'name': 'conversions'}
    ],
    date_ranges=[{'start_date': '30daysAgo', 'end_date': 'today'}],
    # Top 100 landing pages by sessions, highest first
    order_bys=[{'metric': {'metric_name': 'sessions'}, 'desc': True}],
    limit=100
)
response = client.run_report(request)
```

### Useful Dimensions

- `date`, `dateHour`
- `sessionDefaultChannelGroup`
- `landingPage`, `pagePath`
- `deviceCategory`, `operatingSystem`
- `country`, `city`
- `sessionSource`, `sessionMedium`

### Useful Metrics

- `sessions`,
  `totalUsers`, `newUsers`
- `screenPageViews`, `engagementRate`
- `averageSessionDuration`
- `bounceRate`, `conversions`

---

## Google Trends API (pytrends)

### Installation

```bash
pip install pytrends
```

### Usage

```python
from pytrends.request import TrendReq

# tz is the offset in minutes as UTC minus local time (US CST is 360),
# so KST (UTC+9) is -540
pytrends = TrendReq(hl='ko-KR', tz=-540)

# Interest over time
pytrends.build_payload(['keyword1', 'keyword2'], timeframe='today 12-m', geo='KR')
interest_df = pytrends.interest_over_time()

# Related queries
related = pytrends.related_queries()

# Trending searches
trending = pytrends.trending_searches(pn='south_korea')

# Suggestions
suggestions = pytrends.suggestions('seo')
```

### Rate Limits

- No official limits (pytrends is an unofficial client), but implement delays (1-2 seconds between requests)
- Heavy usage may trigger a CAPTCHA

---

## Custom Search JSON API

### Authentication

```python
import requests

API_KEY = 'your-api-key'
CX = 'your-search-engine-id'  # Programmable Search Engine ID
BASE_URL = 'https://www.googleapis.com/customsearch/v1'
```

### Request

```python
params = {
    'key': API_KEY,
    'cx': CX,
    'q': 'search query',
    'num': 10,    # Results per page, 1-10
    'start': 1,   # Pagination: 1-based index of the first result
    'gl': 'kr',   # Country
    'hl': 'ko'    # Language
}
response = requests.get(BASE_URL, params=params)
```

### Response Structure

```json
{
  "searchInformation": {
    "totalResults": "12345",
    "searchTime": 0.5
  },
  "items": [
    {
      "title": "Page Title",
      "link": "https://example.com",
      "snippet": "Description...",
      "pagemap": {
        "metatags": [...],
        "cse_image": [...]
      }
    }
  ]
}
```

### Rate Limits

- 100 queries per day (free)
- Up to 10,000 queries per day with billing enabled ($5 per 1,000 queries)

---

## Knowledge Graph Search API

### Request

```python
import requests

API_KEY = 'your-api-key'
BASE_URL = 'https://kgsearch.googleapis.com/v1/entities:search'

params = {
    'key': API_KEY,
    'query': 'entity name',
    'types': 'Organization',
    'languages': 'ko',
    'limit': 10
}
response = requests.get(BASE_URL, params=params)
```

### Response

```json
{
  "itemListElement": [
    {
      "@type": "EntitySearchResult",
      "result": {
        "@id": "kg:/m/entity_id",
        "@type": ["Organization"],
        "name": "Entity Name",
        "description": "Description...",
        "detailedDescription": {
          "articleBody": "..."
        }
      },
      "resultScore": 1234.56
    }
  ]
}
```

---

## Schema.org Reference

### JSON-LD Format

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Company Name",
  "url": "https://example.com"
}
</script>
```

### Common Schema Types

#### Organization

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Company Name",
  "url": "https://example.com",
  "logo": "https://example.com/logo.png",
  "sameAs": [
    "https://facebook.com/company",
    "https://twitter.com/company"
  ],
  "contactPoint": {
    "@type": "ContactPoint",
    "telephone": "+82-2-1234-5678",
    "contactType": "customer service"
  }
}
```

#### LocalBusiness

```json
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Business Name",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Street",
    "addressLocality": "Seoul",
    "addressRegion": "Seoul",
    "postalCode": "12345",
    "addressCountry": "KR"
  },
  "geo": {
    "@type": "GeoCoordinates",
    "latitude": 37.5665,
    "longitude": 126.9780
  },
  "openingHoursSpecification": [{
    "@type": "OpeningHoursSpecification",
    "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    "opens": "09:00",
    "closes": "18:00"
  }]
}
```

#### Article/BlogPosting

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Article Title",
  "author": {
    "@type": "Person",
    "name": "Author Name"
  },
  "datePublished": "2024-01-01",
  "dateModified": "2024-01-15",
  "image": "https://example.com/image.jpg",
  "publisher": {
    "@type": "Organization",
    "name": "Publisher Name",
    "logo": {
      "@type": "ImageObject",
      "url": "https://example.com/logo.png"
    }
  }
}
```

#### Product

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Product Name",
  "image": "https://example.com/product.jpg",
  "description": "Product description",
  "brand": {
    "@type": "Brand",
    "name": "Brand Name"
  },
  "offers": {
    "@type": "Offer",
    "price": "29900",
    "priceCurrency": "KRW",
    "availability": "https://schema.org/InStock"
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.5",
    "reviewCount": "100"
  }
}
```

#### FAQPage

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "Question text?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Answer text."
    }
  }]
}
```

#### BreadcrumbList

```json
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [{
    "@type": "ListItem",
    "position": 1,
    "name": "Home",
    "item": "https://example.com/"
  }, {
    "@type": "ListItem",
    "position": 2,
    "name": "Category",
    "item": "https://example.com/category/"
  }]
}
```

#### WebSite (with SearchAction)

```json
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "name": "Site Name",
  "url": "https://example.com",
  "potentialAction": {
    "@type": "SearchAction",
    "target": {
      "@type": "EntryPoint",
      "urlTemplate": "https://example.com/search?q={search_term_string}"
    },
    "query-input": "required name=search_term_string"
  }
}
```

---

## XML Sitemap Specification

### Format

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/page</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
```

### Index Sitemap

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2024-01-15</lastmod>
  </sitemap>
</sitemapindex>
```

### Limits

- 50,000 URLs max per sitemap
- 50MB uncompressed max
- Use a sitemap index for larger sites

### Best Practices

- Use absolute URLs
- Include only canonical URLs
- Keep lastmod accurate
- Exclude noindex pages

---

## Robots.txt Reference

### Directives

```txt
# Comments start with #
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /public/

User-agent: Googlebot
Disallow: /no-google/
# Googlebot ignores Crawl-delay; some other crawlers (e.g. Bingbot) honor it
Crawl-delay: 1

Sitemap: https://example.com/sitemap.xml
```

### Common User-agents

- `*` - All bots
- `Googlebot` - Google crawler
- `Googlebot-Image` - Google Image crawler
- `Bingbot` - Bing crawler
- `Yandex` - Yandex crawler
- `Baiduspider` - Baidu crawler

### Pattern Matching

- `*` - Wildcard (any sequence)
- `$` - End of URL
- `/path/` - Directory
- `/*.pdf$` - All PDFs

### Testing

```python
from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()  # Fetches and parses the file

# Check if URL is allowed
can_fetch = rp.can_fetch("Googlebot", "https://example.com/page")
```

---

## Error Handling

### HTTP Status Codes

| Code | Meaning | Action |
|------|---------|--------|
| 200 | OK | Process response |
| 301/302 | Redirect | Follow or flag |
| 400 | Bad Request | Check parameters |
| 401 | Unauthorized | Check credentials |
| 403 | Forbidden | Check permissions |
| 404 | Not Found | Flag missing resource |
| 429 | Rate Limited | Implement backoff |
| 500 | Server Error | Retry with backoff |
| 503 | Service Unavailable | Retry later |

### Retry Strategy

```python
from tenacity import retry, stop_after_attempt, wait_exponential

# Up to 3 attempts, with exponential waits clamped between 2 and 10 seconds
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10)
)
async def make_request(url):
    # Request logic goes here; raise an exception to trigger a retry
    pass
```
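
The retry strategy can also be sketched with the standard library alone, which makes the status-code decision from the table above explicit. This is a minimal sketch, not the project's implementation. The names `fetch_with_retry`, `backoff_delay`, and `RETRYABLE_STATUS` are hypothetical. The retryable set (429, 500, 503) is taken from the table. The delay schedule (2s, 4s, capped at 10s) only approximates the tenacity settings and does not reproduce tenacity's exact formula.

```python
import time
from typing import Callable, Tuple

# Assumption: only the codes the status table marks for backoff/retry are retried.
RETRYABLE_STATUS = {429, 500, 503}


def backoff_delay(attempt: int, base: float = 2.0, cap: float = 10.0) -> float:
    """Seconds to wait after failed attempt `attempt` (1-based): base * 2**(attempt - 1), capped."""
    return min(cap, base * (2 ** (attempt - 1)))


def fetch_with_retry(
    fetch: Callable[[str], Tuple[int, str]],
    url: str,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call fetch(url) until it returns a non-retryable status or attempts run out.

    `fetch` is any callable returning (status_code, body); it is injected so the
    sketch stays independent of a particular HTTP client.
    """
    for attempt in range(1, max_attempts + 1):
        status, body = fetch(url)
        # Return immediately on success or on errors that a retry will not fix
        # (400, 401, 403, 404), and also when the last attempt has been used.
        if status not in RETRYABLE_STATUS or attempt == max_attempts:
            return status, body
        sleep(backoff_delay(attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. In production code the default `time.sleep` applies, and `fetch` could wrap `requests.get` and return `(response.status_code, response.text)`.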