our-claude-skills/custom-skills/90-reference-curator/03-content-repository/desktop/SKILL.md
Andrew Yim b6a478e1df feat: Add installation tool, Claude.ai export, and skill standardization (#1)
## Summary

- Add portable installation tool (`install.sh`) for cross-machine setup
- Add Claude.ai export files with proper YAML frontmatter
- Add multi-agent-guide v2.0 with consolidated framework template
- Rename `00-claude-code-setting` → `00-our-settings-audit` (avoid reserved word)
- Add YAML frontmatter to 25+ SKILL.md files for Claude Desktop compatibility

## Commits Included

- `93f604a` feat: Add portable installation tool for cross-machine setup
- `9b84104` feat: Add Claude.ai export for portable skill installation
- `f7ab973` fix: Add YAML frontmatter to Claude.ai export files
- `3fed49a` feat(multi-agent-guide): Add v2.0 with consolidated framework
- `3be26ef` refactor: Rename settings-audit skill and add YAML frontmatter

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-03 16:48:06 +07:00

---
name: content-repository
description: |
  MySQL storage manager for reference library with versioning and deduplication.
  Triggers: store content, manage repository, document database, content storage.
---
# Content Repository
Manages MySQL storage for the reference library system. Handles document storage, version control, deduplication, and retrieval.
## Prerequisites
- MySQL 8.0+ with utf8mb4 charset
- Config file at `~/.config/reference-curator/db_config.yaml`
- Database `reference_library` initialized with schema
## Quick Reference
### Connection Setup
```python
import os
from pathlib import Path

import yaml  # PyYAML


def get_db_config():
    """Load MySQL connection settings from the reference-curator config file."""
    config_path = Path.home() / ".config/reference-curator/db_config.yaml"
    with open(config_path) as f:
        config = yaml.safe_load(f)
    mysql = config['mysql']
    # Credentials from the environment take precedence over the config file
    return {
        'host': mysql['host'],
        'port': mysql['port'],
        'database': mysql['database'],
        'user': os.environ.get('MYSQL_USER', mysql.get('user', '')),
        'password': os.environ.get('MYSQL_PASSWORD', mysql.get('password', '')),
        'charset': mysql['charset']
    }
```
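`get_db_config()` indexes `host`, `port`, `database`, and `charset` directly, so a config file missing any of them fails with a bare `KeyError`. The check below is a minimal sketch of validating the parsed config first; `missing_config_keys` and `REQUIRED_MYSQL_KEYS` are hypothetical names, not part of the skill, and the key list is inferred only from the function above.

```python
# Keys that get_db_config() reads without a fallback (inferred from the code above).
REQUIRED_MYSQL_KEYS = ("host", "port", "database", "charset")


def missing_config_keys(config):
    """Return the required keys absent from the parsed config, in a stable order."""
    mysql = (config or {}).get("mysql")
    if not isinstance(mysql, dict):
        # The whole 'mysql' section is absent or malformed.
        return ["mysql"]
    return [key for key in REQUIRED_MYSQL_KEYS if key not in mysql]
```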
### Core Operations
**Store New Document:**
```python
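def store_document(cursor, source_id, title, url, doc_type, raw_content_path):
    """Insert a document; if its unique key already exists, bump the version instead."""
    # doc_id = LAST_INSERT_ID(doc_id) makes cursor.lastrowid return the existing
    # row's id on the update path as well. This assumes doc_id is the
    # AUTO_INCREMENT key of `documents`.
    sql = """
        INSERT INTO documents (source_id, title, url, doc_type, crawl_date, crawl_status, raw_content_path)
        VALUES (%s, %s, %s, %s, NOW(), 'completed', %s)
        ON DUPLICATE KEY UPDATE
            doc_id = LAST_INSERT_ID(doc_id),
            version = version + 1,
            previous_version_id = doc_id,
            crawl_date = NOW(),
            raw_content_path = VALUES(raw_content_path)
    """
    cursor.execute(sql, (source_id, title, url, doc_type, raw_content_path))
    return cursor.lastrowid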
```
**Check Duplicate:**
```python
def is_duplicate(cursor, url):
    """Return True if a document with this URL's SHA-256 hash is already stored."""
    cursor.execute("SELECT doc_id FROM documents WHERE url_hash = SHA2(%s, 256)", (url,))
    return cursor.fetchone() is not None
```
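The lookup above hashes the URL inside MySQL with `SHA2(..., 256)`. When the hash is needed on the client side, for example to pre-filter a batch of URLs before opening a connection, `hashlib` produces the same value. This is a minimal sketch, assuming `url_hash` stores the lowercase hex digest of the URL's UTF-8 bytes exactly as `SHA2` returns it; `url_hash_hex` is a hypothetical helper name.

```python
import hashlib


def url_hash_hex(url: str) -> str:
    """Return the SHA-256 digest of the URL's UTF-8 bytes as 64 lowercase hex characters."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()
```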
**Get Documents by Topic:**
```python
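def get_docs_by_topic(cursor, topic_slug, min_quality=0.80):
    """Return documents for a topic, most relevant first.

    Distilled rows must be approved and score at least min_quality;
    documents with no distilled content yet are still returned.
    """
    sql = """
        SELECT d.doc_id, d.title, d.url, dc.structured_content, dc.quality_score
        FROM documents d
        JOIN document_topics dt ON d.doc_id = dt.doc_id
        JOIN topics t ON dt.topic_id = t.topic_id
        LEFT JOIN distilled_content dc ON d.doc_id = dc.doc_id
        WHERE t.topic_slug = %s
          AND (dc.review_status = 'approved' OR dc.review_status IS NULL)
          AND (dc.quality_score >= %s OR dc.quality_score IS NULL)
        ORDER BY dt.relevance_score DESC
    """
    # min_quality was previously accepted but never applied to the query
    cursor.execute(sql, (topic_slug, min_quality))
    return cursor.fetchall()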
```
## Table Quick Reference
| Table | Purpose | Key Fields |
|-------|---------|------------|
| `sources` | Authorized content sources | source_type, credibility_tier, vendor |
| `documents` | Crawled document metadata | url_hash (dedup), version, crawl_status |
| `distilled_content` | Processed summaries | review_status, compression_ratio |
| `review_logs` | QA decisions | quality_score, decision, refactor_instructions |
| `topics` | Taxonomy | topic_slug, parent_topic_id |
| `document_topics` | Many-to-many linking | relevance_score |
| `export_jobs` | Export tracking | export_type, output_format, status |
## Status Values
**crawl_status:** `pending` → `completed` | `failed` | `stale`
**review_status:** `pending` → `in_review` → `approved` | `needs_refactor` | `rejected`
**decision (review):** `approve` | `refactor` | `deep_research` | `reject`
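
The two status lists can be read as small state machines. The sketch below encodes them as lookup tables so an update can be checked before it is written. It assumes that `pending` is the entry state and that only the transitions implied by the lists are legal; the lists do not say whether, for instance, a `stale` document returns to `pending`, so that case is rejected here. All names are hypothetical.

```python
# Allowed next states, read off the status lists; anything unlisted is rejected.
CRAWL_TRANSITIONS = {
    "pending": {"completed", "failed", "stale"},
}
REVIEW_TRANSITIONS = {
    "pending": {"in_review"},
    "in_review": {"approved", "needs_refactor", "rejected"},
}


def can_transition(transitions, current, new):
    """Return True if moving from `current` to `new` appears in the given table."""
    return new in transitions.get(current, set())
```

For example, `can_transition(REVIEW_TRANSITIONS, "pending", "approved")` is false because a review has to pass through `in_review` first.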
## Common Queries
### Find Stale Documents (needs re-crawl)
```sql
SELECT d.doc_id, d.title, d.url, d.crawl_date
FROM documents d
JOIN crawl_schedule cs ON d.source_id = cs.source_id
WHERE d.crawl_date < DATE_SUB(NOW(), INTERVAL
        CASE cs.frequency
            WHEN 'daily' THEN 1
            WHEN 'weekly' THEN 7
            WHEN 'biweekly' THEN 14
            WHEN 'monthly' THEN 30
        END DAY)
  AND cs.is_enabled = TRUE;
```
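The same staleness rule can be applied in Python when documents are already in memory, for instance in a unit test for the scheduler. This is a minimal sketch that mirrors the `CASE` mapping in the query; `FREQUENCY_DAYS` and `is_stale` are hypothetical names, and the `is_enabled` check is left to the caller.

```python
from datetime import datetime, timedelta

# Same frequency-to-days mapping as the CASE expression in the query.
FREQUENCY_DAYS = {"daily": 1, "weekly": 7, "biweekly": 14, "monthly": 30}


def is_stale(crawl_date, frequency, now=None):
    """Return True if crawl_date is older than the schedule's interval."""
    days = FREQUENCY_DAYS.get(frequency)
    if days is None:
        # In SQL an unmatched CASE yields NULL, so the row is filtered out.
        return False
    now = now or datetime.now()
    return crawl_date < now - timedelta(days=days)
```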
### Get Pending Reviews
```sql
SELECT dc.distill_id, d.title, d.url, dc.token_count_distilled
FROM distilled_content dc
JOIN documents d ON dc.doc_id = d.doc_id
WHERE dc.review_status = 'pending'
ORDER BY dc.distill_date ASC;
```
### Export-Ready Content
```sql
SELECT d.title, d.url, dc.structured_content, t.topic_slug
FROM documents d
JOIN distilled_content dc ON d.doc_id = dc.doc_id
JOIN document_topics dt ON d.doc_id = dt.doc_id
JOIN topics t ON dt.topic_id = t.topic_id
JOIN review_logs rl ON dc.distill_id = rl.distill_id
WHERE rl.decision = 'approve'
AND rl.quality_score >= 0.85
ORDER BY t.topic_slug, dt.relevance_score DESC;
```
## Workflow Integration
1. **From crawler-orchestrator:** Receive URL + raw content path → `store_document()`
2. **To content-distiller:** Query pending documents → send for processing
3. **From quality-reviewer:** Update `review_status` based on decision
4. **To markdown-exporter:** Query approved content by topic
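
Step 3 can be sketched as a mapping from the reviewer's `decision` to a `review_status` value. The mapping below is an assumption based on the similar names in the Status Values section. `deep_research` has no obvious `review_status` counterpart there, so this sketch leaves the row unchanged for it. `apply_review_decision` is a hypothetical helper that expects a DB-API cursor.

```python
# Assumed mapping; the source lists both value sets but not how they relate.
DECISION_TO_STATUS = {
    "approve": "approved",
    "refactor": "needs_refactor",
    "reject": "rejected",
}


def apply_review_decision(cursor, distill_id, decision):
    """Update review_status for one distilled row; return the new status or None."""
    status = DECISION_TO_STATUS.get(decision)
    if status is None:
        # e.g. 'deep_research': no mapped status, so nothing is written.
        return None
    cursor.execute(
        "UPDATE distilled_content SET review_status = %s WHERE distill_id = %s",
        (status, distill_id),
    )
    return status
```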
## Error Handling
- **Duplicate URL:** Silent update (version increment) via `ON DUPLICATE KEY UPDATE`
- **Missing source_id:** Validate against `sources` table before insert
- **Connection failure:** Implement retry with exponential backoff
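
The retry advice for connection failures could look like the sketch below. It is driver-agnostic: `connect` is any zero-argument callable that opens a connection, and `retry_on` should be set to the error class of whichever MySQL driver is in use. The function name, the defaults, and the injectable `sleep` are illustrative choices, not part of the skill.

```python
import time


def connect_with_retry(connect, max_attempts=5, base_delay=0.5,
                       retry_on=(ConnectionError,), sleep=time.sleep):
    """Call connect() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return connect()
        except retry_on:
            if attempt == max_attempts - 1:
                # Out of attempts: surface the last error to the caller.
                raise
            # Waits of base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits between the five attempts are 0.5, 1, 2, and 4 seconds. Passing `sleep` as a parameter keeps the function testable without real delays.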
## Full Schema Reference
See `references/schema.sql` for complete table definitions including indexes and constraints.
## Config File Template
See `references/db_config_template.yaml` for connection configuration template.