Sensitivity Tiers
Not every file should be processed the same way. A tax return and a podcast episode have different risk profiles. fialr classifies every file into one of three sensitivity tiers, and the tier determines what operations are permitted — particularly how AI-assisted enrichment handles the file.
All tiers benefit from local AI (Ollama). Local processing is private by design — no data leaves the machine. The distinction is cloud access: Tier 1 files require explicit two-step confirmation before cloud AI can process them. The code enforces this, not policy.
Three-tier model
| Tier | Label | AI access | Operations |
|---|---|---|---|
| 1 | RESTRICTED | Local AI (Ollama). Cloud requires two-step confirmation. | Review queue before any operation. Encrypted vault. |
| 2 | SENSITIVE | Local AI (default). Cloud opt-in via configured provider. | Move/rename with human confirmation. |
| 3 | INTERNAL | Full enrichment via configured provider (local or cloud). | Automated above confidence threshold. |
Tier 1: RESTRICTED
Tier 1 files are enriched by local AI (Ollama) like any other tier. Text extraction and inference run locally, and results are routed to the review queue; they are never auto-applied. Cloud AI (Claude API) requires a two-step confirmation: a config flag and a CLI flag must both be active. This ensures deliberate intent without excessive friction. Classification itself relies entirely on structural signals: filename patterns, extensions, and directory context.
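The two-step confirmation can be pictured as a simple conjunction of two independent switches. The sketch below is illustrative only: `CloudRequest`, `config_allows_cloud`, `cli_allows_cloud`, and `cloud_permitted` are hypothetical names, not fialr's actual API, and treating the Tier 2 and Tier 3 opt-in as a single config switch is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudRequest:
    tier: int                  # 1 = RESTRICTED, 2 = SENSITIVE, 3 = INTERNAL
    config_allows_cloud: bool  # hypothetical config-file switch
    cli_allows_cloud: bool     # hypothetical command-line switch


def cloud_permitted(req: CloudRequest) -> bool:
    """Return True only if cloud AI may process the file."""
    if req.tier == 1:
        # Tier 1: both confirmations must be present at the same time.
        return req.config_allows_cloud and req.cli_allows_cloud
    # Tiers 2 and 3 (assumption): cloud use follows the configured opt-in.
    return req.config_allows_cloud
```

With this shape, a Tier 1 request with only one of the two switches set is rejected, so a stale config entry alone cannot send a restricted file to a cloud provider.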
What gets classified as RESTRICTED:
- Identity: passports, driver licenses, national ID, Social Security cards, biometric data
- Tax: tax returns, W-2s, 1099s, filings
- Medical: medical records, lab results, prescriptions, vaccination records, diagnoses
- Financial: account statements, routing numbers, credit reports
- Legal: wills, powers of attorney, custody agreements, restraining orders
- Crypto: private keys, seed phrases, recovery keys, keystore files (.keystore, .jks, .kdbx)
- Files in directories matching patterns like tax/, legal/, medical/, identity/, .ssh/, .gnupg/
Detection patterns: Classification looks for 36 filename patterns across these categories, 9 file extensions (.pem, .key, .p12, .pfx, .gpg, .asc, .keystore, .jks, .kdbx), and 15 directory-level path patterns. All patterns are configurable in sensitivity.yaml.
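To make the three kinds of structural signal concrete, here is a minimal sketch of how such matching could work. It is not fialr's implementation: the pattern lists are a small subset taken from this page, `is_restricted` is a hypothetical helper, and directory patterns such as `**/tax/**` are simplified to a check on parent directory names.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Small illustrative subsets; the real lists live in sensitivity.yaml.
FILENAME_PATTERNS = ["*passport*", "*w2*", "*tax-return*"]
EXTENSIONS = {".pem", ".key", ".p12", ".pfx", ".gpg", ".asc",
              ".keystore", ".jks", ".kdbx"}
DIRECTORY_NAMES = {"tax", "medical", "legal", ".ssh", ".gnupg"}


def is_restricted(path: str) -> bool:
    """Return True if any structural signal marks the path as Tier 1."""
    p = PurePosixPath(path)
    name = p.name.lower()
    # 1. Filename patterns (glob-style, matched case-insensitively).
    if any(fnmatchcase(name, pattern) for pattern in FILENAME_PATTERNS):
        return True
    # 2. File extension.
    if p.suffix.lower() in EXTENSIONS:
        return True
    # 3. Any parent directory with a sensitive name.
    return any(part.lower() in DIRECTORY_NAMES for part in p.parent.parts)
```

Note that none of these checks open the file; only the path string is inspected.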
Review queue requirement: Every Tier 1 file enters the review_queue table in SQLite. No operation — rename, move, or otherwise — executes without the file being explicitly reviewed and approved. The executor checks the reviewed flag and refuses to proceed without it.
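The executor check described above might look roughly like the following. Only the `review_queue` table name and the `reviewed` flag come from this page; the `file_path` column, the schema, and the `execute_operation` function are assumptions made for illustration.

```python
import sqlite3


def execute_operation(conn: sqlite3.Connection, file_path: str) -> str:
    """Refuse to act on a Tier 1 file unless it has been reviewed."""
    row = conn.execute(
        "SELECT reviewed FROM review_queue WHERE file_path = ?",
        (file_path,),
    ).fetchone()
    # A missing row or an unset flag both block execution.
    if row is None or not row[0]:
        raise PermissionError(f"{file_path} has not been reviewed")
    return f"executed operation on {file_path}"


# Hypothetical schema for the sketch, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE review_queue ("
    "file_path TEXT PRIMARY KEY, reviewed INTEGER NOT NULL DEFAULT 0)"
)
conn.execute(
    "INSERT INTO review_queue (file_path) VALUES (?)",
    ("tax/2024-return.pdf",),
)
```

Calling `execute_operation(conn, "tax/2024-return.pdf")` at this point raises `PermissionError`; it succeeds only after the row's `reviewed` column is set to 1.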
Tier 2: SENSITIVE
Tier 2 files default to local LLM processing (Ollama), but can optionally use a cloud provider (Claude API, BYOK) configured via fialr config ai. Regardless of provider, every operation derived from AI output requires human confirmation before execution.
What gets classified as SENSITIVE:
- Financial documents without exposed account numbers (invoices, receipts, statements)
- Employment records (offer letters, pay stubs, performance reviews)
- Personal correspondence
- Insurance documents
- Files containing names, addresses, or phone numbers
- Files in directories matching patterns like finance/, employment/, insurance/, personal/
Local LLM by default: Text extraction runs on Tier 2 files, and the extracted text is sent to the configured provider for enrichment — generating filename tokens, tags, and summaries. By default this is the local Ollama instance (no content leaves the machine). With cloud opt-in, extracted text is sent to the configured cloud provider.
Human confirmation required: The enrichment output is a suggestion. Proposed renames, category assignments, and tag sets are presented for review. Execution proceeds only after explicit approval. Below the confidence threshold, files are routed to the review queue with the LLM suggestion as a hint.
Tier 3: INTERNAL
Tier 3 files receive full automated processing via the configured provider (Ollama by default, or cloud with opt-in). The enrichment pipeline runs without confirmation when the confidence score exceeds the configured threshold.
What gets classified as INTERNAL:
- General documents: notes, drafts, articles, reference material
- Media files: photos (non-personal), audio, video
- Code and technical files
- Downloaded content, ebooks, manuals
- Files that do not match any Tier 1 or Tier 2 pattern
Full automation above confidence threshold: When the enrichment pipeline produces a result with a confidence score above confidence_threshold (configured in fialr.toml), the operation executes without human review. Below the threshold, the file is routed to the review queue.
Tier 3 is the default. Files are classified as Tier 3 when no structural signal suggests higher sensitivity.
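Taken together, the per-tier rules amount to a small routing decision. The sketch below is one way to express it, assuming a hypothetical `route` function and string labels; how a score exactly equal to the threshold is handled is not specified on this page, so sending it to review is a conservative assumption.

```python
def route(tier: int, confidence: float, threshold: float = 0.7) -> str:
    """Decide what happens to an enrichment result.

    Returns one of "review_queue", "confirm", or "auto".
    """
    if tier == 1:
        # Tier 1 results are never auto-applied.
        return "review_queue"
    if confidence <= threshold:
        # Assumption: a score equal to the threshold is treated as not exceeding it.
        return "review_queue"
    if tier == 2:
        # Tier 2 always needs explicit human approval.
        return "confirm"
    # Tier 3 above the threshold runs without human review.
    return "auto"
```

For example, `route(3, 0.9)` yields `"auto"`, while the same score on a Tier 2 file yields `"confirm"`.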
Classification signals
Tier assignment is determined by structural signals, evaluated in this order:
| Signal | Example | Priority |
|---|---|---|
| Filename patterns | passport-scan.pdf, 2024-w2.pdf | Highest — specific tokens override other signals |
| File extensions | .p12, .pem, .key, .pfx | High — cryptographic material is always Tier 1 |
| Directory heuristics | Files inside tax/, medical/, legal/ | Medium — directory context propagates to contents |
| MIME type | Application-specific types, encrypted containers | Lower — supplements other signals |
When signals conflict, the most restrictive tier (the lowest tier number) wins. A file named receipt.pdf in a medical/ directory is Tier 1, not Tier 2.
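The conflict rule, in which Tier 1 beats Tier 2 and Tier 2 beats Tier 3, reduces to taking the minimum tier number across all signals that fired. A minimal sketch, with `resolve_tier` and the signal names as hypothetical labels:

```python
from typing import Dict, Tuple

DEFAULT_TIER = 3  # Tier 3 applies when no signal fires.


def resolve_tier(signals: Dict[str, int]) -> Tuple[int, str]:
    """Pick the most restrictive tier and report which signal produced it.

    `signals` maps a signal name to the tier that signal suggests.
    """
    if not signals:
        return DEFAULT_TIER, "default"
    # The lowest tier number is the most restrictive.
    winner = min(signals, key=signals.get)
    return signals[winner], winner


# receipt.pdf inside medical/: the filename suggests Tier 2, the directory Tier 1.
tier, reason = resolve_tier({"filename": 2, "directory": 1})
```

Returning the winning signal alongside the tier keeps the reason available for logging.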
Content scanning (opt-in)
By default, classification uses structural signals only and never reads file content. For additional protection, you can enable content-signal scanning, which reads the first 64 KB of Tier 2 and Tier 3 files to detect PII patterns:
| Pattern | Detection method |
|---|---|
| Social Security numbers | Regex: XXX-XX-XXXX format |
| Credit card numbers | Luhn-valid 13–19 digit sequences |
| Exposed credentials | password= or passwd: proximity patterns |
If PII is detected, the file is escalated to Tier 1 with the reason logged. Content scanning is local only — no data leaves your machine.
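The three detection methods in the table can be sketched as follows. This is an approximation for illustration, not fialr's scanner: the regular expressions, the `scan` and `luhn_valid` helpers, and the reason labels are all assumptions.

```python
import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")
CREDENTIAL_RE = re.compile(r"(?i)\b(?:password\s*=|passwd\s*:)")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan(data: bytes, limit: int = 64 * 1024) -> list:
    """Return the PII reasons found in the first `limit` bytes."""
    text = data[:limit].decode("utf-8", errors="ignore")
    reasons = []
    if SSN_RE.search(text):
        reasons.append("ssn")
    for match in CARD_RE.finditer(text):
        if luhn_valid(re.sub(r"\D", "", match.group())):
            reasons.append("credit_card")
            break
    if CREDENTIAL_RE.search(text):
        reasons.append("credential")
    return reasons
```

A non-empty result would be the trigger for escalating the file to Tier 1, with the returned reasons written to the log.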
Enable in fialr.toml:
```toml
[sensitivity]
content_scan = true
```

Content scanning does not apply to files already classified as Tier 1, since they are already at the most restrictive tier.
Configuration
Tier classification rules are defined in sensitivity.yaml:
```yaml
tiers:
  restricted:
    filename_patterns:
      - "*passport*"
      - "*ssn*"
      - "*w2*"
      - "*1099*"
      - "*tax-return*"
      - "*medical-record*"
      - "*diagnosis*"
      - "*prescription*"
      - "*drivers-license*"
      - "*seed-phrase*"
      - "*power-of-attorney*"
      # ... 36 patterns total across identity, tax, medical,
      # financial, legal, crypto, and military categories
    extensions:
      - ".pem"
      - ".key"
      - ".p12"
      - ".pfx"
      - ".gpg"
      - ".asc"
      - ".keystore"
      - ".jks"
      - ".kdbx"
    path_patterns:
      - "**/tax/**"
      - "**/medical/**"
      - "**/legal/contracts/**"
      - "**/.ssh/**"
      - "**/.gnupg/**"
      # ... 15 patterns total

  sensitive:
    filename_patterns:
      - "*invoice*"
      - "*receipt*"
      - "*contract*"
      - "*salary*"
      - "*confidential*"
      # ... 12 patterns total
    path_patterns:
      - "**/financial/**"
      - "**/personal/**"
      - "**/hr/**"
      - "**/payroll/**"
      # ... 8 patterns total
```

The confidence threshold for Tier 3 automated processing is set in fialr.toml:
```toml
[enrichment]
confidence_threshold = 0.7
```

Files enriched with a confidence score below the threshold are routed to the review queue regardless of tier.
See the classification guide for details on running classification and reviewing results.