## Summary

- Add portable installation tool (`install.sh`) for cross-machine setup
- Add Claude.ai export files with proper YAML frontmatter
- Add multi-agent-guide v2.0 with consolidated framework template
- Rename `00-claude-code-setting` → `00-our-settings-audit` (avoid reserved word)
- Add YAML frontmatter to 25+ SKILL.md files for Claude Desktop compatibility

## Commits Included

- `93f604a` feat: Add portable installation tool for cross-machine setup
- `9b84104` feat: Add Claude.ai export for portable skill installation
- `f7ab973` fix: Add YAML frontmatter to Claude.ai export files
- `3fed49a` feat(multi-agent-guide): Add v2.0 with consolidated framework
- `3be26ef` refactor: Rename settings-audit skill and add YAML frontmatter

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| name | description |
|---|---|
| content-repository | MySQL storage manager for reference library with versioning and deduplication. Triggers: store content, manage repository, document database, content storage. |
Content Repository
Manages MySQL storage for the reference library system. Handles document storage, version control, deduplication, and retrieval.
Prerequisites
- MySQL 8.0+ with utf8mb4 charset
- Config file at `~/.config/reference-curator/db_config.yaml`
- Database `reference_library` initialized with schema
Quick Reference
Connection Setup
import os
from pathlib import Path

import yaml


def get_db_config():
    """Load connection settings from the user's db_config.yaml."""
    config_path = Path.home() / ".config/reference-curator/db_config.yaml"
    with open(config_path) as f:
        config = yaml.safe_load(f)

    mysql = config['mysql']
    # Credentials from environment variables take precedence over the file.
    return {
        'host': mysql['host'],
        'port': mysql['port'],
        'database': mysql['database'],
        'user': os.environ.get('MYSQL_USER', mysql.get('user', '')),
        'password': os.environ.get('MYSQL_PASSWORD', mysql.get('password', '')),
        'charset': mysql['charset'],
    }
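The override order used in `get_db_config()` can be checked without a database or a config file. The following is a minimal sketch, assuming the same `mysql` mapping shape as above; `resolve_credentials` is a hypothetical helper introduced only for illustration and is not part of the skill.

```python
import os


def resolve_credentials(mysql_cfg, env=None):
    """Return (user, password), preferring environment variables over file values.

    Hypothetical helper: it mirrors the lookup order in get_db_config().
    """
    env = os.environ if env is None else env
    user = env.get("MYSQL_USER", mysql_cfg.get("user", ""))
    password = env.get("MYSQL_PASSWORD", mysql_cfg.get("password", ""))
    return user, password


file_cfg = {"user": "file_user", "password": "file_pw"}
# Only the user is overridden here; the password falls back to the file value.
print(resolve_credentials(file_cfg, env={"MYSQL_USER": "env_user"}))
```

Passing `env` explicitly keeps the lookup easy to test; leaving it out falls back to the real process environment.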
Core Operations
Store New Document:
def store_document(cursor, source_id, title, url, doc_type, raw_content_path):
    """Insert a document row, or bump its version if the unique key already exists."""
    sql = """
        INSERT INTO documents (source_id, title, url, doc_type, crawl_date, crawl_status, raw_content_path)
        VALUES (%s, %s, %s, %s, NOW(), 'completed', %s)
        ON DUPLICATE KEY UPDATE
            version = version + 1,
            previous_version_id = doc_id,
            crawl_date = NOW(),
            raw_content_path = VALUES(raw_content_path)
    """
    cursor.execute(sql, (source_id, title, url, doc_type, raw_content_path))
    return cursor.lastrowid
Check Duplicate:
def is_duplicate(cursor, url):
    cursor.execute("SELECT doc_id FROM documents WHERE url_hash = SHA2(%s, 256)", (url,))
    return cursor.fetchone() is not None
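When the hash is needed on the Python side (for example, to pre-filter a batch before querying), the same value can be computed with `hashlib`. This is a sketch under two assumptions not confirmed by the source: that `url_hash` stores the hex digest of the raw, unnormalized URL, and that the URL reaches MySQL as UTF-8 bytes. The function name `url_hash` is illustrative.

```python
import hashlib


def url_hash(url: str) -> str:
    """Lowercase hex SHA-256 of the UTF-8 encoded URL.

    MySQL's SHA2(str, 256) also returns a 64-character lowercase hex string,
    so the two agree when both sides hash the same bytes.
    """
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


print(url_hash("https://example.com/docs"))
```

If the schema normalizes URLs before hashing (trailing slashes, case, query order), the same normalization would have to be applied here first.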
Get Document by Topic:
def get_docs_by_topic(cursor, topic_slug, min_quality=0.80):
    """Return documents for a topic, most relevant first.

    Documents without distilled content yet (NULLs from the LEFT JOIN) are kept;
    scored content must meet min_quality.
    """
    sql = """
        SELECT d.doc_id, d.title, d.url, dc.structured_content, dc.quality_score
        FROM documents d
        JOIN document_topics dt ON d.doc_id = dt.doc_id
        JOIN topics t ON dt.topic_id = t.topic_id
        LEFT JOIN distilled_content dc ON d.doc_id = dc.doc_id
        WHERE t.topic_slug = %s
          AND (dc.review_status = 'approved' OR dc.review_status IS NULL)
          AND (dc.quality_score >= %s OR dc.quality_score IS NULL)
        ORDER BY dt.relevance_score DESC
    """
    cursor.execute(sql, (topic_slug, min_quality))
    return cursor.fetchall()
Table Quick Reference
| Table | Purpose | Key Fields |
|---|---|---|
| `sources` | Authorized content sources | source_type, credibility_tier, vendor |
| `documents` | Crawled document metadata | url_hash (dedup), version, crawl_status |
| `distilled_content` | Processed summaries | review_status, compression_ratio |
| `review_logs` | QA decisions | quality_score, decision, refactor_instructions |
| `topics` | Taxonomy | topic_slug, parent_topic_id |
| `document_topics` | Many-to-many linking | relevance_score |
| `export_jobs` | Export tracking | export_type, output_format, status |
Status Values
crawl_status: pending → completed | failed | stale
review_status: pending → in_review → approved | needs_refactor | rejected
decision (review): approve | refactor | deep_research | reject
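The status lists above read naturally as transition tables. The sketch below transcribes them literally and adds a small validator; the names `REVIEW_TRANSITIONS`, `CRAWL_TRANSITIONS`, and `can_transition` are hypothetical. Transitions the source does not list (for example, `needs_refactor` returning to `pending`, or `completed` later becoming `stale`) are deliberately left out, since the source does not state them.

```python
# Allowed transitions, transcribed literally from the status lists above.
REVIEW_TRANSITIONS = {
    "pending": {"in_review"},
    "in_review": {"approved", "needs_refactor", "rejected"},
}
CRAWL_TRANSITIONS = {
    "pending": {"completed", "failed", "stale"},
}


def can_transition(transitions, current, new):
    """Return True if `current -> new` appears in the given transition table."""
    return new in transitions.get(current, set())


print(can_transition(REVIEW_TRANSITIONS, "in_review", "approved"))
print(can_transition(REVIEW_TRANSITIONS, "pending", "approved"))
```

A check like this could run before any `UPDATE` of `review_status` or `crawl_status`, so that an out-of-order status change fails loudly instead of being written.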
Common Queries
Find Stale Documents (needs re-crawl)
SELECT d.doc_id, d.title, d.url, d.crawl_date
FROM documents d
JOIN crawl_schedule cs ON d.source_id = cs.source_id
WHERE d.crawl_date < DATE_SUB(NOW(), INTERVAL
        CASE cs.frequency
            WHEN 'daily' THEN 1
            WHEN 'weekly' THEN 7
            WHEN 'biweekly' THEN 14
            WHEN 'monthly' THEN 30
        END DAY)
  AND cs.is_enabled = TRUE;
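The same staleness rule can be expressed in Python, which is useful for unit-testing the schedule logic without a database. This is a sketch: `FREQUENCY_DAYS` and `is_stale` are hypothetical names, and the `is_enabled` check from the query is assumed to happen elsewhere.

```python
from datetime import datetime, timedelta

# Same frequency-to-days mapping as the CASE expression in the query.
FREQUENCY_DAYS = {"daily": 1, "weekly": 7, "biweekly": 14, "monthly": 30}


def is_stale(crawl_date, frequency, now=None):
    """Return True when crawl_date is older than the frequency window.

    Unknown frequencies return False: in the SQL, the CASE yields NULL,
    the comparison is NULL, and the row is filtered out.
    """
    days = FREQUENCY_DAYS.get(frequency)
    if days is None:
        return False
    now = now or datetime.now()
    return crawl_date < now - timedelta(days=days)


reference = datetime(2025, 1, 31, 12, 0)
print(is_stale(datetime(2025, 1, 20, 12, 0), "weekly", now=reference))
```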
Get Pending Reviews
SELECT dc.distill_id, d.title, d.url, dc.token_count_distilled
FROM distilled_content dc
JOIN documents d ON dc.doc_id = d.doc_id
WHERE dc.review_status = 'pending'
ORDER BY dc.distill_date ASC;
Export-Ready Content
SELECT d.title, d.url, dc.structured_content, t.topic_slug
FROM documents d
JOIN distilled_content dc ON d.doc_id = dc.doc_id
JOIN document_topics dt ON d.doc_id = dt.doc_id
JOIN topics t ON dt.topic_id = t.topic_id
JOIN review_logs rl ON dc.distill_id = rl.distill_id
WHERE rl.decision = 'approve'
AND rl.quality_score >= 0.85
ORDER BY t.topic_slug, dt.relevance_score DESC;
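Because the query orders by `topic_slug` first, its rows can be grouped per topic in a single pass. A minimal sketch, assuming a tuple-returning cursor with columns in the `SELECT` order (title, url, structured_content, topic_slug); `group_by_topic` is a hypothetical helper.

```python
from itertools import groupby
from operator import itemgetter


def group_by_topic(rows):
    """Group (title, url, structured_content, topic_slug) rows by topic_slug.

    Relies on rows arriving sorted by topic_slug, as the ORDER BY provides;
    groupby only merges adjacent rows with equal keys.
    """
    return {
        slug: [row[:3] for row in group]
        for slug, group in groupby(rows, key=itemgetter(3))
    }


rows = [
    ("Doc A", "https://example.com/a", "summary a", "caching"),
    ("Doc B", "https://example.com/b", "summary b", "caching"),
    ("Doc C", "https://example.com/c", "summary c", "indexing"),
]
for slug, docs in group_by_topic(rows).items():
    print(slug, len(docs))
```

Within each topic the original row order is preserved, so documents stay sorted by descending relevance.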
Workflow Integration
- From crawler-orchestrator: Receive URL + raw content path → `store_document()`
- To content-distiller: Query pending documents → send for processing
- From quality-reviewer: Update `review_status` based on decision
- To markdown-exporter: Query approved content by topic
Error Handling
- Duplicate URL: Silent update (version increment) via `ON DUPLICATE KEY UPDATE`
- Missing source_id: Validate against `sources` table before insert
- Connection failure: Implement retry with exponential backoff
Full Schema Reference
See references/schema.sql for complete table definitions including indexes and constraints.
Config File Template
See references/db_config_template.yaml for connection configuration template.