Module Map

Every module in fialr has a defined location and a single responsibility. The module map below is the source of truth for the codebase structure.

fialr/
├── __main__.py # python -m fialr support
├── cli.py # unified CLI entry point
├── core/
│ ├── inventory.py # filesystem traversal, manifest generation
│ ├── classifier.py # sensitivity tiering, category suggestion
│ ├── planner.py # dry-run move plan generator
│ ├── executor.py # approved plan execution engine
│ ├── deduplicate.py # hash-based and near-duplicate detection
│ ├── rename.py # template-driven naming engine
│ ├── organize.py # schema-driven reorganization
│ └── validate.py # integrity verification against manifests
├── enrichment/
│ ├── inference.py # abstracted local inference interface (Ollama)
│ ├── extractor.py # text extraction: OCR, PDF, EXIF, ID3
│ └── enrich.py # enrichment orchestrator
├── metadata/
│ ├── xattr.py # extended attribute read/write (platform-aware)
│ ├── db.py # SQLite operations and schema
│ └── export.py # on-demand sidecar export (JSON / YAML)
├── platform/
│ ├── base.py # platform adapter interface
│ ├── macos.py # iCloud sync, APFS vault, com.fialr.* XATTRs
│ ├── linux.py # user.fialr.* XATTRs, VeraCrypt default vault
│ └── windows.py # NTFS ADS, VeraCrypt default vault
├── reporting/
│ ├── reports.py # job summary generation
│ └── exporters.py # output format adapters (JSON, Markdown, CSV)
├── plugins/
│ └── base.py # plugin interface and hook registry
└── utils/
  ├── hashing.py # BLAKE3 primary, SHA256 secondary
  ├── logging.py # structured job logging
  ├── config.py # TOML config loader and validator
  ├── output.py # ANSI-colored CLI output (Bronze/Ash palette)
  └── help.py # custom CLI help renderer (gh/cargo-style)

cli.py: unified CLI entry point. Parses arguments via argparse but renders help through a custom HelpRenderer that bypasses argparse’s default display. All user-facing output goes through the Output class to stderr; machine-readable data goes to stdout.
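The stderr/stdout split can be sketched as follows. This is a minimal illustration, not fialr's actual Output class: the function names `emit_status` and `emit_data` are hypothetical, and JSON is only assumed as the machine-readable format.

```python
import json
import sys


def emit_status(message: str) -> None:
    """Human-facing text goes to stderr so it never pollutes piped data."""
    print(message, file=sys.stderr)


def emit_data(payload: dict) -> None:
    """Machine-readable output goes to stdout as a single JSON document."""
    json.dump(payload, sys.stdout)
    sys.stdout.write("\n")
```

With this separation, a shell pipeline that consumes stdout receives only the JSON document, while progress messages remain visible in the terminal.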

core/: the pipeline modules. Each handles one phase of the workflow.

| Module | Responsibility |
| --- | --- |
| inventory.py | Traverse a directory, hash every file (BLAKE3 + SHA256), detect MIME types, apply the four-layer exclusion system, produce a manifest.json. Read-only. |
| classifier.py | Apply sensitivity rules to assign tiers (1/2/3) and suggest categories. Uses structural signals only; never reads file content for Tier 1. |
| planner.py | Read classifier output and schema, produce plan.csv with source/destination paths, proposed names, operation types, and conflict flags. Read-only. |
| executor.py | Execute an approved plan. Pre-move hash verification, move/rename, post-move hash verification, checkpoint after every N operations. Refuses to run without reviewed=true. |
| deduplicate.py | Group files by BLAKE3 hash. Select canonical copy per retention strategy. Move non-canonical copies to _dupes/. No deletions. |
| rename.py | Apply the naming convention (YYYY-MM-DD_[entity]_[descriptor]_[version].[ext]). Derive tokens from metadata, reject generic names. |
| organize.py | Schema-driven reorganization. Maps categories to directory paths using schema.yaml. |
| validate.py | Post-execution verification: check paths exist, hashes match, XATTRs are correct. |
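The executor's verify, move, verify sequence might look roughly like the sketch below. It is an assumption-laden illustration: the names `sha256_of` and `verified_move` are hypothetical, and SHA256 from the standard library stands in for BLAKE3 (fialr's primary hash) so the example runs without third-party packages.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large files are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verified_move(source: Path, destination: Path, expected: str) -> None:
    """Move a file only if its hash matches before and after the move."""
    # Pre-move check: the file must still match the hash recorded in the plan.
    if sha256_of(source) != expected:
        raise ValueError(f"pre-move hash mismatch: {source}")
    # Never overwrite an existing file at the destination.
    if destination.exists():
        raise FileExistsError(str(destination))
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(source), str(destination))
    # Post-move check: the bytes at the destination must be unchanged.
    if sha256_of(destination) != expected:
        raise ValueError(f"post-move hash mismatch: {destination}")
```

A failed pre-move check leaves the source untouched, which is what makes it safe to abort a plan partway through.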

enrichment/: local-only AI enrichment pipeline. Tier 1 files never enter this pipeline.

| Module | Responsibility |
| --- | --- |
| inference.py | Abstracted local inference interface. Calls Ollama on localhost. Returns structured JSON (date, entity, descriptor, tags, summary, confidence). Cloud endpoints cannot be configured. |
| extractor.py | Text extraction from multiple formats: ocrmypdf + Tesseract for scanned PDFs, pypdfium2 for native PDFs, piexif for EXIF, mutagen for audio, python-docx and openpyxl for Office documents. |
| enrich.py | Orchestrator. Routes files through extraction and inference. Enforces tier restrictions. Routes results above/below confidence threshold to auto-apply or review queue. |
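The orchestrator's routing rule can be sketched as below. Everything named here is hypothetical: the `Suggestion` record, the `route` function, the 0.8 default threshold, and the choice to send results exactly at the threshold to auto-apply are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    path: str
    tier: int          # sensitivity tier assigned by the classifier
    confidence: float  # confidence reported by local inference


def route(suggestions, threshold=0.8):
    """Split suggestions into (auto_apply, review) lists.

    Tier 1 items are dropped entirely, mirroring the rule that
    Tier 1 files never enter the enrichment pipeline.
    """
    auto_apply, review = [], []
    for item in suggestions:
        if item.tier == 1:
            continue
        if item.confidence >= threshold:
            auto_apply.append(item)
        else:
            review.append(item)
    return auto_apply, review
```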

metadata/: data storage and export.

| Module | Responsibility |
| --- | --- |
| db.py | SQLite operations. Schema creation, file records, path tracking, operations ledger, duplicate groups, review queue. |
| xattr.py | Extended attribute read/write. Platform-aware key prefixes (com.fialr.* on macOS, user.fialr.* on Linux). Degrades gracefully on unsupported filesystems. |
| export.py | On-demand sidecar file generation. JSON and YAML formats. Single-file and batch export. |

platform/: platform adapters. Core modules import from base.py and never contain if sys.platform checks of their own.

| Module | Responsibility |
| --- | --- |
| base.py | Abstract adapter interface. Runtime adapter selection via get_adapter(). |
| macos.py | iCloud sync detection and pause (brctl), APFS encrypted sparse bundle vault, com.fialr.* XATTR namespace. |
| linux.py | user.fialr.* XATTR namespace via pyxattr, VeraCrypt as default vault. |
| windows.py | NTFS Alternate Data Streams (limited XATTR support), VeraCrypt as default vault. |

reporting/: job output generation.

| Module | Responsibility |
| --- | --- |
| reports.py | Generate human-readable job summaries (report.md) from job logs. |
| exporters.py | Output format adapters for JSON, Markdown, and CSV. |

plugins/: plugin system.

| Module | Responsibility |
| --- | --- |
| base.py | Plugin interface using Protocol (structural typing). Hook registry for extending fialr behavior. |
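A Protocol-based plugin interface with a hook registry might look like the sketch below. The names `Plugin`, `HookRegistry`, `register`, `on`, `emit`, and the `after_rename` event are all hypothetical; the point is only that structural typing lets a plugin satisfy the interface without inheriting from anything.

```python
from collections import defaultdict
from typing import Any, Callable, Protocol, runtime_checkable


class HookRegistry:
    """Maps event names to the callbacks that plugins have registered."""

    def __init__(self) -> None:
        self._hooks: dict[str, list[Callable[[Any], Any]]] = defaultdict(list)

    def on(self, event: str, callback: Callable[[Any], Any]) -> None:
        self._hooks[event].append(callback)

    def emit(self, event: str, payload: Any) -> list[Any]:
        """Call every callback for the event and collect the results."""
        return [callback(payload) for callback in self._hooks.get(event, [])]


@runtime_checkable
class Plugin(Protocol):
    """Any object with a matching register() method counts as a plugin."""

    def register(self, hooks: HookRegistry) -> None: ...


class UppercasePlugin:
    """Example plugin; note that it does not subclass Plugin."""

    def register(self, hooks: HookRegistry) -> None:
        hooks.on("after_rename", lambda name: name.upper())
```

Usage under these assumptions: create a `HookRegistry`, call `UppercasePlugin().register(registry)`, then `registry.emit("after_rename", "a.txt")` invokes the registered callback.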

utils/: shared infrastructure.

| Module | Responsibility |
| --- | --- |
| hashing.py | BLAKE3 (primary, canonical) and SHA256 (secondary, cross-tool verification) hash computation. xxhash explicitly excluded: not cryptographically sound. |
| logging.py | Structured JSON logging for jobs. JobLogger writes to log.json in the job directory. Debug output suppressed by default, enabled with --verbose. |
| config.py | TOML config loader and validator. Reads fialr.toml, provides nested key access. |
| output.py | ANSI-colored CLI output using the Bronze/Ash brand palette. Writes to stderr. Respects NO_COLOR, FORCE_COLOR, and TTY detection. |
| help.py | Custom help renderer. Grouped, aligned, color-aware output following gh/cargo/stripe conventions. |

config/
  fialr.toml # primary runtime configuration
  schema.yaml # versioned directory schema
  sensitivity.yaml # tier classification rules and patterns

Every operation creates a job directory under .fialr/jobs/:

.fialr/jobs/{YYYY-MM-DD}_{job-name}_{uuid}/
  manifest.json # pre-execution file state snapshot
  plan.csv # proposed operations
  log.json # append-only structured operation log
  report.md # human-readable job summary
  checkpoint.json # last completed operation index for resume

| Component | Package | Purpose |
| --- | --- | --- |
| Runtime | Python 3.11+ | |
| Primary hash | blake3 | Canonical file identity |
| Secondary hash | hashlib (stdlib) | SHA256 for cross-tool verification |
| Database | sqlite3 (stdlib) | Metadata ledger and audit log |
| Inference | Ollama | Local LLM (localhost only) |
| OCR | ocrmypdf + Tesseract | Scanned PDF text extraction |
| PDF | pypdfium2 | Native PDF text extraction |
| EXIF | piexif | Photo metadata |
| Audio | mutagen | Audio metadata (ID3 tags) |
| MIME | python-magic | File type detection |
| Office | python-docx, openpyxl | Word and Excel extraction |
| Config | tomllib (stdlib) | TOML parsing |
| Schema/rules | pyyaml | YAML parsing |
| Paths | pathlib (stdlib) | All path operations |

stdlib first. Before adding a dependency, check if the standard library can do the job. Every new dependency must be logged with rationale.