Quality Reviewer

Evaluates distilled content for quality, routes decisions, and triggers refactoring or additional research when needed.

Review Workflow

[Distilled Content]
       │
       ▼
┌─────────────────┐
│ Score Criteria  │ → accuracy, completeness, clarity, PE quality, usability
└─────────────────┘
       │
       ▼
┌─────────────────┐
│ Calculate Total │ → weighted average
└─────────────────┘
       │
       ├── ≥ 0.85 → APPROVE → markdown-exporter
       ├── 0.60-0.84 → REFACTOR → content-distiller (with instructions)
       ├── 0.40-0.59 → DEEP_RESEARCH → web-crawler-orchestrator (with queries)
       └── < 0.40 → REJECT → archive with reason

Scoring Criteria

| Criterion | Weight | Checks |
|---|---|---|
| Accuracy | 0.25 | Factual correctness, up-to-date info, proper attribution |
| Completeness | 0.20 | Covers key concepts, includes examples, addresses edge cases |
| Clarity | 0.20 | Clear structure, concise language, logical flow |
| PE Quality | 0.25 | Demonstrates techniques, before/after examples, explains why |
| Usability | 0.10 | Easy to reference, searchable keywords, appropriate length |

Decision Thresholds

| Score Range | Decision | Action |
|---|---|---|
| ≥ 0.85 | approve | Proceed to export |
| 0.60 - 0.84 | refactor | Return to distiller with feedback |
| 0.40 - 0.59 | deep_research | Gather more sources, then re-distill |
| < 0.40 | reject | Archive, log reason |

Review Process

Step 1: Load Content for Review

def get_pending_reviews(cursor):
    """Fetch distilled content awaiting review, most credible sources first."""
    sql = """
    SELECT dc.distill_id, dc.doc_id, d.title, d.url,
           dc.summary, dc.key_concepts, dc.structured_content,
           dc.token_count_original, dc.token_count_distilled,
           s.credibility_tier
    FROM distilled_content dc
    JOIN documents d ON dc.doc_id = d.doc_id
    JOIN sources s ON d.source_id = s.source_id
    WHERE dc.review_status = 'pending'
    ORDER BY s.credibility_tier ASC, dc.distill_date ASC
    """
    cursor.execute(sql)
    return cursor.fetchall()
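
A minimal usage sketch, assuming a PostgreSQL database accessed through psycopg2 (the %s placeholders used in Step 5 follow its parameter style); the connection settings are placeholders, not part of this skill:

import psycopg2

# Hypothetical connection details; substitute your own.
conn = psycopg2.connect(dbname="curator", user="curator", host="localhost")
with conn.cursor() as cursor:
    for row in get_pending_reviews(cursor):
        distill_id, doc_id, title = row[0], row[1], row[2]
        print(f"Pending review: {title} ({distill_id})")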

Step 2: Score Each Criterion

Evaluate content against each criterion using this assessment template:

assessment_template = {
    "accuracy": {
        "score": 0.0,  # 0.00 - 1.00
        "notes": "",
        "issues": []   # Specific factual errors if any
    },
    "completeness": {
        "score": 0.0,
        "notes": "",
        "missing_topics": []  # Concepts that should be covered
    },
    "clarity": {
        "score": 0.0,
        "notes": "",
        "confusing_sections": []  # Sections needing rewrite
    },
    "prompt_engineering_quality": {
        "score": 0.0,
        "notes": "",
        "improvements": []  # Specific PE technique gaps
    },
    "usability": {
        "score": 0.0,
        "notes": "",
        "suggestions": []
    }
}
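
For illustration, a completed assessment for one document might look like this (the scores and notes are invented for the example):

example_assessment = {
    "accuracy": {"score": 0.90, "notes": "Matches current official docs", "issues": []},
    "completeness": {"score": 0.80, "notes": "No streaming example",
                     "missing_topics": ["streaming responses"]},
    "clarity": {"score": 0.85, "notes": "Well structured", "confusing_sections": []},
    "prompt_engineering_quality": {"score": 0.75, "notes": "Techniques named but not explained",
                                   "improvements": ["Explain why few-shot ordering matters"]},
    "usability": {"score": 0.90, "notes": "Good keyword coverage", "suggestions": []}
}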

Step 3: Calculate Final Score

WEIGHTS = {
    "accuracy": 0.25,
    "completeness": 0.20,
    "clarity": 0.20,
    "prompt_engineering_quality": 0.25,
    "usability": 0.10
}

def calculate_quality_score(assessment):
    return sum(
        assessment[criterion]["score"] * weight
        for criterion, weight in WEIGHTS.items()
    )
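
Applied to the example assessment above, the weighted average works out to 0.8325, which lands in the refactor band:

score = calculate_quality_score(example_assessment)
# 0.25*0.90 + 0.20*0.80 + 0.20*0.85 + 0.25*0.75 + 0.10*0.90 = 0.8325
print(f"{score:.2f}")  # 0.83 -> refactor (0.60 - 0.84)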

Step 4: Route Decision

def determine_decision(score, assessment):
    if score >= 0.85:
        return "approve", None, None
    elif score >= 0.60:
        instructions = generate_refactor_instructions(assessment)
        return "refactor", instructions, None
    elif score >= 0.40:
        queries = generate_research_queries(assessment)
        return "deep_research", None, queries
    else:
        return "reject", f"Quality score {score:.2f} below minimum threshold", None

def generate_refactor_instructions(assessment):
    """Extract actionable feedback from low-scoring criteria."""
    instructions = []
    for criterion, data in assessment.items():
        if data["score"] < 0.80:
            if data.get("issues"):
                instructions.extend(data["issues"])
            if data.get("missing_topics"):
                instructions.append(f"Add coverage for: {', '.join(data['missing_topics'])}")
            if data.get("improvements"):
                instructions.extend(data["improvements"])
    return "\n".join(instructions)

def generate_research_queries(assessment):
    """Generate search queries for content gaps."""
    queries = []
    if assessment["completeness"]["missing_topics"]:
        for topic in assessment["completeness"]["missing_topics"]:
            queries.append(f"{topic} documentation guide")
    if assessment["accuracy"]["issues"]:
        queries.append("latest official documentation verification")
    return queries

Step 5: Log Review Decision

import json

def log_review(cursor, distill_id, assessment, score, decision, instructions=None, queries=None):
    # Determine the next review round number for this distill_id
    cursor.execute(
        "SELECT COALESCE(MAX(review_round), 0) + 1 FROM review_logs WHERE distill_id = %s",
        (distill_id,)
    )
    review_round = cursor.fetchone()[0]

    sql = """
    INSERT INTO review_logs
    (distill_id, review_round, reviewer_type, quality_score, assessment,
     decision, refactor_instructions, research_queries)
    VALUES (%s, %s, 'claude_review', %s, %s, %s, %s, %s)
    """
    cursor.execute(sql, (
        distill_id, review_round, score,
        json.dumps(assessment), decision, instructions,
        json.dumps(queries) if queries else None
    ))

    # Update distilled_content status to match the decision
    status_map = {
        "approve": "approved",
        "refactor": "needs_refactor",
        "deep_research": "needs_refactor",
        "reject": "rejected"
    }
    cursor.execute(
        "UPDATE distilled_content SET review_status = %s WHERE distill_id = %s",
        (status_map[decision], distill_id)
    )
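
Putting the five steps together, a minimal review loop might look like the sketch below. The score_content callable is hypothetical: it stands in for the actual criterion-scoring step (model- or human-driven), which this document does not define.

def review_pending(cursor, score_content):
    """Run one full review pass over all pending items.

    score_content: caller-supplied function (hypothetical) that takes a
    pending row and returns a filled-in assessment dict (Step 2).
    """
    for row in get_pending_reviews(cursor):                   # Step 1
        distill_id = row[0]
        assessment = score_content(row)                       # Step 2
        score = calculate_quality_score(assessment)           # Step 3
        decision, instructions, queries = determine_decision(score, assessment)  # Step 4
        log_review(cursor, distill_id, assessment, score,
                   decision, instructions, queries)           # Step 5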

Prompt Engineering Quality Checklist

When scoring prompt_engineering_quality, verify:

  • Demonstrates specific techniques (CoT, few-shot, etc.)
  • Shows before/after examples
  • Explains why techniques work, not just what
  • Provides actionable patterns
  • Includes edge cases and failure modes
  • References authoritative sources
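
One illustrative way to turn this checklist into the prompt_engineering_quality score is an equal-weight pass/fail tally. The equal weighting is an assumption for demonstration, not a prescribed rubric:

PE_CHECKLIST = [
    "demonstrates_specific_techniques",
    "shows_before_after_examples",
    "explains_why_not_just_what",
    "provides_actionable_patterns",
    "includes_edge_cases_and_failure_modes",
    "references_authoritative_sources",
]

def pe_quality_score(checks):
    """checks: dict mapping checklist item -> bool. Equal weights assumed."""
    return sum(bool(checks.get(item)) for item in PE_CHECKLIST) / len(PE_CHECKLIST)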

Auto-Approve Rules

Tier 1 (official) sources scoring ≥ 0.80 may be auto-approved without human review, if configured:

# In export_config.yaml
quality:
  auto_approve_tier1_sources: true
  auto_approve_min_score: 0.80
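
A sketch of how this config could gate the routing step, assuming the YAML above is loaded with PyYAML (the helper name and tier encoding are illustrative):

import yaml

def should_auto_approve(score, credibility_tier, config_path="export_config.yaml"):
    # Load the quality settings; PyYAML assumed.
    with open(config_path) as f:
        quality = yaml.safe_load(f)["quality"]
    return (
        quality.get("auto_approve_tier1_sources", False)
        and credibility_tier == 1
        and score >= quality.get("auto_approve_min_score", 0.80)
    )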

Integration Points

| From | Action | To |
|---|---|---|
| content-distiller | Sends distilled content | quality-reviewer |
| quality-reviewer | APPROVE | markdown-exporter |
| quality-reviewer | REFACTOR + instructions | content-distiller |
| quality-reviewer | DEEP_RESEARCH + queries | web-crawler-orchestrator |