enrich

    fialr enrich <target> [options]

Extract text from files and run inference via the configured provider (Ollama by default, or a cloud provider via BYOK) to generate structured metadata: filename tokens, semantic tags, a one-sentence summary, and a confidence score. Tier 1 files are enriched by local AI (Ollama) by default, and their results are always routed to the review queue. Sending Tier 1 content to a cloud provider requires two-step confirmation.
Arguments
| Argument | Description |
|---|---|
| target | Directory to enrich (required) |
Options
| Option | Description |
|---|---|
| --execute | Apply enrichment metadata instead of only reporting (the default is a dry run) |
| --embed-only | Compute embeddings only, without running AI text extraction or inference. Use to backfill or recompute embeddings. |
| --yes, -y | Skip the cloud cost confirmation prompt |
| --jobs-dir PATH | Directory for job artifacts (default: .fialr/jobs) |
| --cloud-refine | Enable 2-step enrichment: local extraction → sanitization → cloud refinement. Tier 3 is refined automatically, Tier 2 is opt-in, and Tier 1 falls back to local-only unless two-step confirmation is active. |
| --sensitivity-rules PATH | Path to sensitivity.yaml (default: config/sensitivity.yaml) |
Prerequisites
Enrichment requires:
- Ollama running locally at http://localhost:11434 (default provider), or a cloud provider configured via fialr config ai
- A pulled model for Ollama (default: llama3.2, configurable under [enrichment].model)
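
Before a first run it can help to confirm both prerequisites. The sketch below is not part of fialr; it queries Ollama's `/api/tags` endpoint (which lists locally pulled models) at the default address given above. The helper names `has_model` and `check_ollama` are hypothetical.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default endpoint from the prerequisites


def has_model(tags_payload: dict, model: str) -> bool:
    """Return True if `model` appears in an Ollama /api/tags response."""
    for entry in tags_payload.get("models", []):
        name = entry.get("name", "")
        # Ollama lists models as "name:tag", for example "llama3.2:latest".
        if name == model or name.split(":", 1)[0] == model:
            return True
    return False


def check_ollama(model: str = "llama3.2", base_url: str = OLLAMA_URL) -> str:
    """Report whether Ollama is reachable and the model has been pulled."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
            payload = json.load(resp)
    except (OSError, ValueError) as exc:  # connection errors or invalid JSON
        return f"Ollama not reachable at {base_url}: {exc}"
    if has_model(payload, model):
        return f"ok: {model} is available"
    return f"model {model} not found; try: ollama pull {model}"
```

Calling `print(check_ollama())` reports one of the three states. If a different model is set under [enrichment].model, pass its name as the first argument.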
What it does
Tier restrictions
Sensitivity tiers gate access to the enrichment pipeline:
| Tier | Access |
|---|---|
| 1 (RESTRICTED) | Local AI (Ollama). Results always routed to review queue. Cloud requires two-step confirmation. |
| 2 (SENSITIVE) | Configured provider processes extracted text. Human confirmation required. |
| 3 (INTERNAL) | Full enrichment via configured provider. |
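
One way to read the table is as a small decision function. The following is an illustrative sketch only, not fialr's implementation; the `Tier`, `Access`, and `access_for` names and the `cloud_confirmed` flag are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    RESTRICTED = 1
    SENSITIVE = 2
    INTERNAL = 3


@dataclass(frozen=True)
class Access:
    provider: str             # "local" (Ollama) or "configured"
    needs_confirmation: bool  # human confirmation before processing
    force_review: bool        # results always go to the review queue


def access_for(tier: Tier, cloud_confirmed: bool = False) -> Access:
    """Map a sensitivity tier to the access described in the table."""
    if tier == Tier.RESTRICTED:
        # Tier 1 stays local unless the two-step cloud confirmation was given
        # (modelled here as a single hypothetical flag); results are reviewed
        # either way.
        provider = "configured" if cloud_confirmed else "local"
        return Access(provider, needs_confirmation=False, force_review=True)
    if tier == Tier.SENSITIVE:
        # Tier 2 uses the configured provider but needs human confirmation.
        return Access("configured", needs_confirmation=True, force_review=False)
    # Tier 3: full enrichment via the configured provider.
    return Access("configured", needs_confirmation=False, force_review=False)
```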
Text extraction
fialr extracts text from files using format-specific tools:
| Format | Extraction method |
|---|---|
| Scanned PDF | ocrmypdf + Tesseract OCR |
| Native PDF | pypdfium2 |
| Photos | piexif (EXIF metadata) |
| Audio | mutagen (ID3 tags) |
| Office documents | python-docx, openpyxl |
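
The table amounts to a dispatch on file type. A minimal sketch of that idea follows. The suffix groups and the `has_text_layer` flag are assumptions for illustration; how fialr actually detects formats, or tells scanned PDFs from native ones, is not described here.

```python
from pathlib import Path
from typing import Optional

# Hypothetical suffix-to-tool mapping; fialr's real format detection may differ.
EXTRACTORS = {
    ".jpg": "piexif",
    ".jpeg": "piexif",
    ".mp3": "mutagen",
    ".docx": "python-docx",
    ".xlsx": "openpyxl",
}


def pick_extractor(path: str, has_text_layer: bool = True) -> Optional[str]:
    """Return the extraction tool from the table for a file, or None."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        # Native PDFs carry a text layer; scanned ones need OCR first.
        return "pypdfium2" if has_text_layer else "ocrmypdf + Tesseract OCR"
    return EXTRACTORS.get(suffix)
```

A file whose suffix is not in the mapping returns `None`, which would correspond to a file skipped for having no extractable text.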
Inference
Extracted text is sent to the configured provider (Ollama on localhost by default, or a cloud provider if configured). The inference layer is abstracted behind a provider interface. The model returns structured JSON:
- Date — document subject date
- Entity — primary subject or organization
- Descriptor — semantic description
- Tags — semantic tags
- Summary — one-sentence summary
- Confidence — 0.0 to 1.0 score
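
To make the shape of that response concrete, here is a sketch of parsing and validating it. The lowercase JSON key names and the `Enrichment` and `parse_enrichment` names are assumptions; the exact schema fialr expects from the model is not given here.

```python
import json
from dataclasses import dataclass


@dataclass
class Enrichment:
    date: str          # document subject date
    entity: str        # primary subject or organization
    descriptor: str    # semantic description
    tags: list[str]    # semantic tags
    summary: str       # one-sentence summary
    confidence: float  # 0.0 to 1.0


def parse_enrichment(raw: str) -> Enrichment:
    """Parse a model response, assuming lowercase keys matching the field list."""
    data = json.loads(raw)
    confidence = float(data["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Enrichment(
        date=str(data["date"]),
        entity=str(data["entity"]),
        descriptor=str(data["descriptor"]),
        tags=[str(tag) for tag in data["tags"]],
        summary=str(data["summary"]),
        confidence=confidence,
    )
```

Rejecting an out-of-range confidence at parse time keeps a malformed response from being compared against the threshold later.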
Confidence routing
The confidence threshold (default: 0.7, configurable in fialr.toml under [enrichment].confidence_threshold) determines what happens with inference results:
- Above threshold — metadata is auto-applied to XATTRs and SQLite
- Below threshold — file is sent to the review queue with the LLM suggestion attached as a hint for manual review
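
The routing rule, combined with the Tier 1 rule from the tier table, can be sketched as a single function. This is an illustration rather than fialr's code: the `route` name and its return values are hypothetical, and sending a score exactly equal to the threshold to review is an assumption, since only "above" and "below" are specified.

```python
DEFAULT_THRESHOLD = 0.7  # documented default for [enrichment].confidence_threshold


def route(confidence: float, tier: int, threshold: float = DEFAULT_THRESHOLD) -> str:
    """Return "apply" (write to XATTRs and SQLite) or "review" (queue for a human)."""
    # Tier 1 results are always routed to the review queue.
    if tier == 1:
        return "review"
    # Assumption: a score exactly at the threshold is treated as not above it.
    return "apply" if confidence > threshold else "review"
```

With the default threshold, `route(0.9, tier=3)` returns `"apply"`, while `route(0.9, tier=1)` and `route(0.5, tier=3)` both return `"review"`.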
Post-enrichment embeddings
When embeddings are enabled ([embeddings] enabled = true in fialr.toml), enrichment automatically computes a vector embedding for each successfully enriched file. These embeddings power semantic search and similar-file discovery, and they improve future enrichment quality through adaptive corpus context.
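
As a rough picture of how embeddings support similar-file discovery, the sketch below ranks files by cosine similarity to a query vector. Cosine similarity is assumed here as a common choice; the metric, vector size, and storage fialr actually uses are not specified, and the function names are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(
    query: list[float], corpus: dict[str, list[float]], top_k: int = 3
) -> list[tuple[str, float]]:
    """Return the top_k (name, score) pairs, highest similarity first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in corpus.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]
```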
Output
Dry-run:

    enrich ~/Documents
    ────────────────────────────────────────────────────────
    enriched    623
    review       89
    skipped     135  (tier 1: 12, no text: 123)
    errors        0
    embeddings  auto

Examples

    # Dry-run enrichment (default provider: Ollama)
    fialr enrich ~/Documents

    # Apply enrichment metadata
    fialr enrich ~/Documents --execute

    # Cloud enrichment, skip cost confirmation
    fialr enrich ~/Documents --execute --yes

    # 2-step enrichment: local extraction → sanitization → cloud refinement
    fialr enrich ~/Documents --cloud-refine --execute

    # Compute embeddings only (no AI inference)
    fialr enrich ~/Documents --embed-only

    # Recompute embeddings after model change
    fialr enrich ~/Documents --embed-only --force

See also
- Enrichment guide: walkthrough of the enrichment process
- Sensitivity Tiers: how tiers control enrichment access
- scan: check sensitivity tiers before enrichment
- search: search enriched metadata with --semantic for vector similarity