
Architecture Overview

fialr is archival infrastructure designed for local-first operation. Every design decision traces back to seven ordered principles. This page documents those principles, the module architecture, the platform adapter layer, the configuration system, and the two-layer metadata architecture.

The principles are ordered by priority. When two principles conflict, the higher-ranked one wins.

| # | Principle | Implementation |
|---|-----------|----------------|
| 1 | Provenance over convenience | Append-only operation ledger. Original state always recoverable. |
| 2 | Content hash as identity | BLAKE3 primary, SHA256 secondary. Filenames and paths are mutable metadata. |
| 3 | SQLite as ledger, XATTRs as cache | SQLite is authoritative on every platform. XATTRs are silently omitted where unsupported. |
| 4 | Tier-gated enrichment | All tiers use local AI (Ollama) by default. Tier 1 cloud requires two-step confirmation (config flag + CLI flag). Tiers 2-3 support opt-in cloud (Claude API, BYOK) via fialr config ai. |
| 5 | Safety by default | Dry-run on. Explicit flag to execute. No destructive operation runs without human approval. |
| 6 | Schema as versioned document | schema.yaml carries a version field. Migrations committed before files move. |
| 7 | Portability over lock-in | SQLite, TOML, YAML, standard Python, no proprietary stores. |
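
The gating rule in principle 4 can be read as a small decision function. The sketch below is illustrative only: the function and parameter names are hypothetical, and it encodes nothing beyond what the table states (local by default, two flags for Tier 1, opt-in for Tiers 2-3).

```python
def cloud_enrichment_allowed(
    tier: int,
    cloud_opt_in: bool = False,
    tier1_config_flag: bool = False,
    tier1_cli_flag: bool = False,
) -> bool:
    """Return True if a file in `tier` may be sent to a cloud model.

    Hypothetical helper: all parameter names are assumptions for illustration.
    """
    if tier not in (1, 2, 3):
        raise ValueError(f"unknown tier: {tier}")
    if tier == 1:
        # Two-step confirmation: the config flag AND the CLI flag must both be set.
        return tier1_config_flag and tier1_cli_flag
    # Tiers 2-3: cloud is used only when the user has opted in.
    return cloud_opt_in
```

With no flags set, every tier returns False, which matches the local-first default.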

fialr processes files through a staged pipeline. Each stage can run independently or as part of a combined operation.

| Stage | Purpose | Writes to disk |
|-------|---------|----------------|
| Scan | Traverse filesystem, compute content hashes, build manifest | No (read-only) |
| Classify | Assign sensitivity tier and category to each file | No (metadata to SQLite) |
| Plan | Generate proposed moves and renames based on schema and naming convention | No (plan.csv output) |
| Execute | Apply the reviewed plan: move, rename, set XATTRs | Yes |
| Enrich | Extract text, run LLM inference, generate metadata and embeddings | Yes (metadata to SQLite) |
| Deduplicate | Identify exact and near-duplicate files, move copies to staging | Yes |

Supporting capabilities run across stages: full-text search (FTS5), semantic search (vector embeddings), encrypted vault storage, integrity validation, undo/rollback, and sidecar export.

All stages produce structured job artifacts (manifest, plan, log, report, checkpoint) for auditability and resumability. Every file operation is logged before and after execution with BLAKE3 hash verification.
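
As a rough illustration of logging before and after an operation with hash verification, the sketch below streams a file through a hash, records the operation, moves the file, and re-hashes it at the destination. It is not fialr's implementation: `content_hash`, `verified_move`, and the log entry fields are hypothetical, and SHA-256 from the standard library stands in for BLAKE3 so the example has no third-party dependency.

```python
import hashlib
import shutil
from pathlib import Path

BUFFER_SIZE = 262144  # read size in bytes; value borrowed from the sample fialr.toml


def content_hash(path: Path) -> str:
    """Stream the file through a hash (SHA-256 here, standing in for BLAKE3)."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        while chunk := fh.read(BUFFER_SIZE):
            digest.update(chunk)
    return digest.hexdigest()


def verified_move(src: Path, dst: Path, log: list[dict]) -> str:
    """Move src to dst, logging before and after, and verify the content hash."""
    before = content_hash(src)
    # Log the intent before touching the file.
    log.append({"event": "before", "src": str(src), "dst": str(dst), "hash": before})

    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dst))

    after = content_hash(dst)
    # Log the outcome, including whether the content survived unchanged.
    log.append({"event": "after", "src": str(src), "dst": str(dst),
                "hash": after, "verified": after == before})
    if after != before:
        raise RuntimeError(f"hash mismatch after moving {src} to {dst}")
    return after
```

Because the "before" entry is written first, an interrupted move still leaves a record of what was attempted.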


| Capability | macOS | Linux |
|------------|-------|-------|
| Extended attributes | com.fialr.* via xattr | user.fialr.* via os.getxattr/os.setxattr (stdlib) |
| Default vault | APFS encrypted sparse bundle (zero install) | age (one-command install) |
| MIME detection | python-magic with mimetypes fallback | python-magic with mimetypes fallback |
| Machine fingerprint | BLAKE3 hash of IOPlatformUUID via ioreg | BLAKE3 hash of /etc/machine-id |
| iCloud sync pause | brctl pause com.apple.CloudDocs | N/A |

The platform check runs in platform/base.py::get_adapter() only. All core modules receive the adapter via dependency injection.
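
A minimal sketch of that pattern follows. Only the name `get_adapter()` comes from the text above; the `PlatformAdapter` and `Tagger` classes, their fields, and the function body are assumptions made for illustration.

```python
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformAdapter:
    """Hypothetical adapter carrying the platform-specific details."""
    name: str
    xattr_prefix: str

    def xattr_key(self, attribute: str) -> str:
        return f"{self.xattr_prefix}{attribute}"


def get_adapter(platform: str = sys.platform) -> PlatformAdapter:
    """The single place that inspects the platform string."""
    if platform == "darwin":
        return PlatformAdapter(name="macos", xattr_prefix="com.fialr.")
    if platform.startswith("linux"):
        return PlatformAdapter(name="linux", xattr_prefix="user.fialr.")
    raise RuntimeError(f"unsupported platform: {platform}")


class Tagger:
    """A stand-in core module: it receives the adapter and never checks the platform."""

    def __init__(self, adapter: PlatformAdapter) -> None:
        self.adapter = adapter

    def key_for(self, attribute: str) -> str:
        return self.adapter.xattr_key(attribute)
```

Injecting the adapter keeps core modules testable on any host: a test can pass `get_adapter("linux")` while running on macOS.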


fialr uses three configuration files:

| File | Format | Purpose |
|------|--------|---------|
| fialr.toml | TOML | Primary runtime configuration: enrichment provider/model, naming rules, exclusions, vault backend, embeddings |
| schema.yaml | YAML | Versioned directory schema: category-to-directory mapping, naming pattern overrides |
| sensitivity.yaml | YAML | Tier classification rules: path patterns, filename patterns, extensions for each tier |

Configuration is loaded by the config module. The fialr config command reads, validates, and modifies fialr.toml.

```toml
[general]
hash_algorithm = "blake3"
secondary_hash = "sha256"

[inventory]
buffer_size = 262144
checkpoint_interval = 50

[exclusions]
directories = []
patterns = []

[enrichment]
provider = "ollama" # "ollama" | "claude" | "two-step"
model = "llama3.2"
cloud_model = "claude-sonnet-4-20250514"
endpoint = "http://localhost:11434"
confidence_threshold = 0.7

[naming]
pattern = "{{ date }}-{{ entity | slugify }}-{{ descriptor | slugify }}.{{ ext | lower }}"
separator = "-"
word_separator = "_"
case = "lower"

[vault]
path = ""
backend = "" # "apfs" | "age" | "veracrypt"; empty = platform default

[embeddings]
enabled = true
model = "nomic-embed-text"
dimensions = 768
```

Metadata is stored in two layers with clear authority:

The SQLite database (.fialr/fialr.db) is the source of truth for all metadata and audit history on every platform.

| Table | Primary Key | Purpose |
|-------|-------------|---------|
| files | content_hash (BLAKE3) | Canonical record per unique file |
| paths | hash + path | All current and historical paths |
| operations | op_uuid | Append-only audit ledger (non-rebuildable) |
| jobs | job_uuid | Job metadata and config snapshots |
| duplicates | group_id | Duplicate groups with canonical selection |
| review_queue | hash | Files pending human review |
| vault_entries | hash + vault_path | Files archived in encrypted vaults |
| embeddings | hash + model | Vector embeddings for semantic search and enrichment context |
| search_index | (FTS5 virtual) | Full-text search across file metadata |
| schema_meta | version | Schema migration history |

The operations table is append-only. It is the audit ledger. It is never truncated, never rebuilt, never modified after write.
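
One way to enforce an append-only table at the database level is with SQLite triggers that abort any UPDATE or DELETE. The source does not say how fialr enforces the rule, so the sketch below is an assumption; apart from `op_uuid` and `job_uuid`, the column names are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE operations (
    op_uuid      TEXT PRIMARY KEY,
    job_uuid     TEXT NOT NULL,
    action       TEXT NOT NULL,   -- hypothetical column
    content_hash TEXT NOT NULL    -- hypothetical column
);

-- Reject any attempt to change or remove a ledger row after it is written.
CREATE TRIGGER operations_no_update BEFORE UPDATE ON operations
BEGIN
    SELECT RAISE(ABORT, 'operations is append-only');
END;

CREATE TRIGGER operations_no_delete BEFORE DELETE ON operations
BEGIN
    SELECT RAISE(ABORT, 'operations is append-only');
END;
"""


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    """Create a database containing the append-only operations table."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = open_ledger()
conn.execute(
    "INSERT INTO operations VALUES (?, ?, ?, ?)",
    ("op-1", "job-1", "move", "abc123"),
)
conn.commit()
```

Triggers of this kind block row-level changes only. They do not stop someone from dropping the table or the triggers, so they complement rather than replace application-level discipline.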

Extended attributes are a derived cache layer. They are rebuilt from SQLite, never the reverse.

| Attribute | macOS Key | Linux Key |
|-----------|-----------|-----------|
| Content hash | com.fialr.hash | user.fialr.hash |
| Secondary hash | com.fialr.hash_sha256 | user.fialr.hash_sha256 |
| Sensitivity | com.fialr.sensitivity | user.fialr.sensitivity |
| Category | com.fialr.category | user.fialr.category |
| Entity | com.fialr.entity | user.fialr.entity |
| Tags | com.fialr.tags | user.fialr.tags |
| Reviewed | com.fialr.reviewed | user.fialr.reviewed |
| Original name | com.fialr.original_name | user.fialr.original_name |
| Original path | com.fialr.original_path | user.fialr.original_path |
| Job UUID | com.fialr.job_uuid | user.fialr.job_uuid |
| Enriched at | com.fialr.enriched_at | user.fialr.enriched_at |
| Exclude flag | com.fialr.exclude | user.fialr.exclude |

When XATTRs are unsupported (FAT32, exFAT, network mounts), fialr writes to SQLite only. The skip is logged. No error is raised. No functionality is lost. The system degrades gracefully because SQLite is always authoritative.
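
The fallback can be sketched as a best-effort write that logs and moves on when the filesystem rejects extended attributes. This is an illustration under assumptions: `write_xattr_cache`, the injectable `setter`, and the choice of error codes treated as "unsupported" are hypothetical. `os.setxattr` itself is real but available only on Linux builds of Python.

```python
import errno
import logging
import os
from typing import Callable

log = logging.getLogger("fialr.xattr")

# Error codes treated as "this filesystem has no xattr support" (an assumption).
UNSUPPORTED = {errno.ENOTSUP, errno.EOPNOTSUPP}


def _default_setter(path: str, key: str, value: bytes) -> None:
    if not hasattr(os, "setxattr"):  # e.g. macOS, where the stdlib lacks it
        raise OSError(errno.ENOTSUP, "os.setxattr unavailable on this platform")
    os.setxattr(path, key, value)


def write_xattr_cache(
    path: str,
    attrs: dict[str, str],
    setter: Callable[[str, str, bytes], None] = _default_setter,
) -> bool:
    """Write cache attributes; return False (and log) if xattrs are unsupported."""
    for key, value in attrs.items():
        try:
            setter(path, key, value.encode("utf-8"))
        except OSError as exc:
            if exc.errno in UNSUPPORTED:
                # Skip quietly: SQLite already holds the authoritative record.
                log.info("xattr skipped for %s: %s", path, exc)
                return False
            raise  # other failures (permissions, missing file) are real errors
    return True
```

The caller writes to SQLite first and treats the return value as informational, so a `False` result changes nothing about what the ledger records.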