# Markdown Exporter

Exports approved reference content as structured markdown files for project knowledge or fine-tuning datasets.

## Trigger Keywords

"export references", "generate project files", "create markdown output", "export for fine-tuning", "build knowledge base"

## Export Types

| Type | Format | Use Case |
|------|--------|----------|
| `project_files` | Nested markdown | Claude Projects knowledge |
| `fine_tuning` | JSONL | Model fine-tuning dataset |
| `knowledge_base` | Flat markdown | Documentation |

## Workflow

### Step 1: Query Approved Content

```bash
python scripts/query_approved.py --min-score 0.80 --output approved.json
```

### Step 2: Organize by Structure

**Nested by Topic (default):**

```
exports/
├── INDEX.md
├── prompt-engineering/
│   ├── _index.md
│   ├── 01-chain-of-thought.md
│   └── 02-few-shot-prompting.md
└── claude-models/
    ├── _index.md
    └── 01-model-comparison.md
```

**Flat Structure:**

```
exports/
├── INDEX.md
├── prompt-engineering-chain-of-thought.md
└── claude-models-comparison.md
```

### Step 3: Generate Files

```bash
python scripts/export_project.py \
  --structure nested_by_topic \
  --output ~/reference-library/exports/ \
  --include-metadata
```

### Step 4: Generate INDEX

```bash
python scripts/generate_index.py --output ~/reference-library/exports/INDEX.md
```

### Step 5: Fine-tuning Export (Optional)

```bash
python scripts/export_finetuning.py \
  --output ~/reference-library/exports/fine_tuning.jsonl \
  --max-tokens 4096
```

JSONL format:

```json
{
  "messages": [
    {"role": "system", "content": "You are an expert on AI and prompt engineering."},
    {"role": "user", "content": "Explain {title}"},
    {"role": "assistant", "content": "{structured_content}"}
  ],
  "metadata": {"source": "{url}", "topic": "{topic_slug}", "quality_score": 0.92}
}
```

### Step 6: Log Export Job

```bash
python scripts/log_export.py --name "January 2025 Export" --type project_files --docs 45
```

## Cross-Reference Generation

```bash
python scripts/add_crossrefs.py --input ~/reference-library/exports/
```

Links related documents based on overlapping key concepts.

## Output Verification

After export, verify:

- [ ] All files readable and valid markdown
- [ ] INDEX.md links resolve correctly
- [ ] No broken cross-references
- [ ] Total token count matches expectation
- [ ] No duplicate content

```bash
python scripts/verify_export.py --path ~/reference-library/exports/
```

## Scripts

- `scripts/query_approved.py` - Get approved content from DB
- `scripts/export_project.py` - Main export for project files
- `scripts/export_finetuning.py` - JSONL export for fine-tuning
- `scripts/generate_index.py` - Generate INDEX.md
- `scripts/add_crossrefs.py` - Add cross-references
- `scripts/log_export.py` - Log export job to DB
- `scripts/verify_export.py` - Verify export integrity

## Configuration

```yaml
# ~/.config/reference-curator/export_config.yaml
output:
  base_path: ~/reference-library/exports/
  project_files:
    structure: nested_by_topic
    index_file: INDEX.md
    include_metadata: true
  fine_tuning:
    format: jsonl
    max_tokens_per_sample: 4096
quality:
  min_score_for_export: 0.80
```

## Integration

| From | To |
|------|-----|
| quality-reviewer (approved) | → |
| → | Project knowledge / Fine-tuning dataset |
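
As a rough illustration of the JSONL format from Step 5, the sketch below fills the message template from approved documents and writes one compact JSON object per line, which is what a `.jsonl` file expects (the pretty-printed record in Step 5 is expanded for readability). This is not the contents of `scripts/export_finetuning.py`: the function names `build_record` and `to_jsonl`, the input field names (taken from the template placeholders), and the sample documents are all assumptions made for the example.

```python
import json

SYSTEM_PROMPT = "You are an expert on AI and prompt engineering."
MIN_SCORE = 0.80  # assumed to mirror quality.min_score_for_export in the config


def build_record(doc: dict) -> dict:
    """Fill the Step 5 template from one document (field names are hypothetical)."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Explain {doc['title']}"},
            {"role": "assistant", "content": doc["structured_content"]},
        ],
        "metadata": {
            "source": doc["url"],
            "topic": doc["topic_slug"],
            "quality_score": doc["quality_score"],
        },
    }


def to_jsonl(docs: list[dict], min_score: float = MIN_SCORE) -> str:
    """Serialize documents at or above min_score, one JSON object per line."""
    lines = [
        json.dumps(build_record(doc), ensure_ascii=False)
        for doc in docs
        if doc["quality_score"] >= min_score
    ]
    return "\n".join(lines) + ("\n" if lines else "")


# Made-up sample input; real data would come from the approved-content query.
sample_docs = [
    {
        "title": "Chain-of-Thought Prompting",
        "structured_content": "Chain-of-thought prompting asks the model to reason step by step.",
        "url": "https://example.com/chain-of-thought",
        "topic_slug": "prompt-engineering",
        "quality_score": 0.92,
    },
    {
        "title": "Draft Notes",
        "structured_content": "Unreviewed notes.",
        "url": "https://example.com/draft",
        "topic_slug": "misc",
        "quality_score": 0.75,  # below the threshold, so it is skipped
    },
]

if __name__ == "__main__":
    print(to_jsonl(sample_docs), end="")
```

The sketch leaves out the `--max-tokens` limit, since enforcing it would depend on which tokenizer the real script uses.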