Andrew Yim b6a478e1df feat: Add installation tool, Claude.ai export, and skill standardization (#1)
## Summary

- Add portable installation tool (`install.sh`) for cross-machine setup
- Add Claude.ai export files with proper YAML frontmatter
- Add multi-agent-guide v2.0 with consolidated framework template
- Rename `00-claude-code-setting` → `00-our-settings-audit` (avoid reserved word)
- Add YAML frontmatter to 25+ SKILL.md files for Claude Desktop compatibility

## Commits Included

- `93f604a` feat: Add portable installation tool for cross-machine setup
- `9b84104` feat: Add Claude.ai export for portable skill installation
- `f7ab973` fix: Add YAML frontmatter to Claude.ai export files
- `3fed49a` feat(multi-agent-guide): Add v2.0 with consolidated framework
- `3be26ef` refactor: Rename settings-audit skill and add YAML frontmatter

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 16:48:06 +07:00


---
name: reference-discovery
description: "Search and discover authoritative reference sources with credibility validation. Triggers: find sources, search documentation, discover references, source validation."
---

Reference Discovery

Searches for authoritative sources, validates credibility, and produces curated URL lists for crawling.

Source Priority Hierarchy

| Tier | Source Type | Examples |
|------|-------------|----------|
| Tier 1 | Official documentation | docs.anthropic.com, docs.claude.com, platform.openai.com/docs |
| Tier 1 | Engineering blogs (official) | anthropic.com/news, openai.com/blog |
| Tier 1 | Official GitHub repos | github.com/anthropics/, github.com/openai/ |
| Tier 2 | Research papers | arxiv.org, papers with citations |
| Tier 2 | Verified community guides | Cookbook examples, official tutorials |
| Tier 3 | Community content | Blog posts, tutorials, Stack Overflow |

Discovery Workflow

Step 1: Define Search Scope

search_config = {
    "topic": "prompt engineering",
    "vendors": ["anthropic", "openai", "google"],
    "source_types": ["official_docs", "engineering_blog", "github_repo"],
    "freshness": "past_year",  # past_week, past_month, past_year, any
    "max_results_per_query": 20
}
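The freshness setting can be turned into a concrete cutoff date before querying. A minimal sketch; the window lengths per setting are assumptions, not part of the skill:

```python
from datetime import datetime, timedelta

# Assumed window lengths for each freshness setting.
FRESHNESS_DAYS = {"past_week": 7, "past_month": 30, "past_year": 365, "any": None}

def freshness_cutoff(freshness, now=None):
    """Return the earliest acceptable publish date, or None for 'any'."""
    days = FRESHNESS_DAYS[freshness]
    if days is None:
        return None
    return (now or datetime.now()) - timedelta(days=days)
```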

Step 2: Generate Search Queries

For a given topic, generate targeted queries:

def generate_queries(topic, vendors):
    queries = []
    
    # Official documentation queries
    for vendor in vendors:
        queries.append(f"site:docs.{vendor}.com {topic}")
        queries.append(f"site:{vendor}.com/docs {topic}")
    
    # Engineering blog queries
    for vendor in vendors:
        queries.append(f"site:{vendor}.com/blog {topic}")
        queries.append(f"site:{vendor}.com/news {topic}")
    
    # GitHub queries
    for vendor in vendors:
        queries.append(f"site:github.com/{vendor} {topic}")
    
    # Research queries
    queries.append(f"site:arxiv.org {topic}")
    
    return queries

Step 3: Execute Searches

Use the web search tool for each query:

def execute_discovery(queries):
    results = []
    for query in queries:
        search_results = web_search(query)
        for result in search_results:
            results.append({
                "url": result.url,
                "title": result.title,
                "snippet": result.snippet,
                "query_used": query
            })
    return deduplicate_by_url(results)
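deduplicate_by_url is referenced but not defined above; a minimal first-seen-wins version could be:

```python
def deduplicate_by_url(results):
    """Drop repeat URLs, keeping the first result returned for each one."""
    seen = set()
    unique = []
    for result in results:
        if result["url"] not in seen:
            seen.add(result["url"])
            unique.append(result)
    return unique
```

The Deduplication section at the end of this document describes the fuller normalization pass applied before output.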

Step 4: Validate and Score Sources

def score_source(url, title):
    score = 0.0
    
    # Domain credibility
    if any(d in url for d in ['docs.anthropic.com', 'docs.claude.com', 'docs.openai.com']):
        score += 0.40  # Tier 1 official docs
    elif any(d in url for d in ['anthropic.com', 'openai.com', 'google.dev']):
        score += 0.30  # Tier 1 official blog/news
    elif 'github.com' in url and any(v in url for v in ['anthropics', 'openai', 'google']):
        score += 0.30  # Tier 1 official repos
    elif 'arxiv.org' in url:
        score += 0.20  # Tier 2 research
    else:
        score += 0.10  # Tier 3 community
    
    # Freshness signals (from title/snippet)
    if any(year in title for year in ['2025', '2024']):
        score += 0.20
    elif any(year in title for year in ['2023']):
        score += 0.10
    
    # Relevance signals
    if any(kw in title.lower() for kw in ['guide', 'documentation', 'tutorial', 'best practices']):
        score += 0.15
    
    return min(score, 1.0)

def assign_credibility_tier(score):
    if score >= 0.60:
        return 'tier1_official'
    elif score >= 0.40:
        return 'tier2_verified'
    else:
        return 'tier3_community'
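As a worked example: an official docs page titled "Prompt Engineering Guide 2025" scores 0.40 (Tier 1 docs domain) + 0.20 (2025 in title) + 0.15 ("guide" keyword) = 0.75, which lands in tier1_official. A quick check of that arithmetic against the tier thresholds (assign_credibility_tier is repeated here so the snippet runs standalone):

```python
def assign_credibility_tier(score):
    if score >= 0.60:
        return 'tier1_official'
    elif score >= 0.40:
        return 'tier2_verified'
    return 'tier3_community'

# 0.40 docs domain + 0.20 freshness + 0.15 relevance keyword
score = 0.40 + 0.20 + 0.15
assert round(score, 2) == 0.75
assert assign_credibility_tier(score) == 'tier1_official'
```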

Step 5: Output URL Manifest

from datetime import datetime

def create_manifest(scored_results, topic):
    manifest = {
        "discovery_date": datetime.now().isoformat(),
        "topic": topic,
        "total_urls": len(scored_results),
        "urls": []
    }
    
    for result in sorted(scored_results, key=lambda x: x['score'], reverse=True):
        manifest["urls"].append({
            "url": result["url"],
            "title": result["title"],
            "credibility_tier": result["tier"],
            "credibility_score": result["score"],
            "source_type": infer_source_type(result["url"]),
            "vendor": infer_vendor(result["url"])
        })
    
    return manifest
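infer_source_type and infer_vendor are referenced but not defined; hypothetical implementations, with domain lists mirroring the tier table above, could look like:

```python
# Hypothetical helpers; the domain lists are drawn from this document's tables.
VENDOR_DOMAINS = {
    "anthropic": ("anthropic.com", "claude.com", "github.com/anthropics"),
    "openai": ("openai.com", "github.com/openai"),
    "google": ("google.dev", "blog.google", "github.com/google"),
}

def infer_vendor(url):
    for vendor, domains in VENDOR_DOMAINS.items():
        if any(domain in url for domain in domains):
            return vendor
    return "unknown"

def infer_source_type(url):
    if "docs." in url or "/docs" in url:
        return "official_docs"
    if "github.com" in url:
        return "github_repo"
    if "arxiv.org" in url:
        return "research_paper"
    if "/blog" in url or "/news" in url:
        return "engineering_blog"
    return "community"
```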

Output Format

Discovery produces a JSON manifest for the crawler:

{
  "discovery_date": "2025-01-28T10:30:00",
  "topic": "prompt engineering",
  "total_urls": 15,
  "urls": [
    {
      "url": "https://docs.anthropic.com/en/docs/prompt-engineering",
      "title": "Prompt Engineering Guide",
      "credibility_tier": "tier1_official",
      "credibility_score": 0.85,
      "source_type": "official_docs",
      "vendor": "anthropic"
    }
  ]
}
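Before handing the manifest to the crawler, a lightweight validation pass can catch missing fields early. A sketch: the field names come from the example above, but the validator itself is an assumption, not part of the skill:

```python
REQUIRED_URL_FIELDS = {
    "url", "title", "credibility_tier",
    "credibility_score", "source_type", "vendor",
}

def validate_manifest(manifest):
    """Return a list of problems; an empty list means the manifest is usable."""
    problems = []
    for key in ("discovery_date", "topic", "total_urls", "urls"):
        if key not in manifest:
            problems.append(f"missing top-level field: {key}")
    for i, entry in enumerate(manifest.get("urls", [])):
        missing = REQUIRED_URL_FIELDS - set(entry)
        if missing:
            problems.append(f"urls[{i}] missing: {sorted(missing)}")
        score = entry.get("credibility_score")
        if not isinstance(score, (int, float)) or not 0.0 <= score <= 1.0:
            problems.append(f"urls[{i}] score out of range: {score}")
    return problems
```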

Known Authoritative Sources

Pre-validated sources for common topics:

| Vendor | Documentation | Blog/News | GitHub |
|--------|---------------|-----------|--------|
| Anthropic | docs.anthropic.com, docs.claude.com | anthropic.com/news | github.com/anthropics |
| OpenAI | platform.openai.com/docs | openai.com/blog | github.com/openai |
| Google | ai.google.dev/docs | blog.google/technology/ai | github.com/google |
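One way to keep this table machine-readable is a small registry dict (a sketch; the entries are exactly the rows above):

```python
KNOWN_SOURCES = {
    "anthropic": {
        "docs": ["docs.anthropic.com", "docs.claude.com"],
        "blog": ["anthropic.com/news"],
        "github": ["github.com/anthropics"],
    },
    "openai": {
        "docs": ["platform.openai.com/docs"],
        "blog": ["openai.com/blog"],
        "github": ["github.com/openai"],
    },
    "google": {
        "docs": ["ai.google.dev/docs"],
        "blog": ["blog.google/technology/ai"],
        "github": ["github.com/google"],
    },
}
```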

Integration

Output: URL manifest JSON → web-crawler-orchestrator

Database: Register new sources in sources table via content-repository

Deduplication

Before outputting, deduplicate URLs:

  • Normalize URLs (remove trailing slashes, query params)
  • Check against existing documents table via content-repository
  • Merge duplicate entries, keeping highest credibility score
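The first and third bullets can be sketched as follows (the exact normalization rules beyond "trailing slashes and query params" are assumptions; the check against the existing documents table goes through content-repository and is omitted here):

```python
from urllib.parse import urlsplit, urlunsplit

def normalize_url(url):
    """Lowercase scheme/host, drop query string and fragment, trim trailing slash."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path.rstrip("/"),
        "", "",  # drop query and fragment
    ))

def merge_duplicates(entries):
    """Collapse entries with the same normalized URL, keeping the highest score."""
    best = {}
    for entry in entries:
        key = normalize_url(entry["url"])
        if key not in best or entry["credibility_score"] > best[key]["credibility_score"]:
            best[key] = {**entry, "url": key}
    return list(best.values())
```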