# deduplicate

```
fialr dedup <target> [options]
```

Identify duplicate files by BLAKE3 content hash. In dry-run mode (the default), report duplicate groups. With `--execute`, move non-canonical copies to `_dupes/`. No files are ever deleted.
## Arguments

| Argument | Description |
|---|---|
| `target` | Directory to deduplicate (required) |
## Options

| Option | Description |
|---|---|
| `--execute` | Move duplicates to `_dupes/` (not just report) |
| `--strategy STRATEGY` | Canonical selection strategy (default: `shortest-path`) |
| `--jobs-dir PATH` | Directory for job artifacts (default: `.fialr/jobs`) |
## What it does

`dedup` scans the target directory, groups files by BLAKE3 content hash, and identifies groups with more than one member. For each group, it selects one canonical copy and marks the rest as non-canonical.
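The scan-and-group step can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation; `blake2b` from the standard library stands in for BLAKE3, which requires the third-party `blake3` package.

```python
import hashlib
import os
from collections import defaultdict


def hash_file(path, chunk_size=1 << 20):
    """Stream the file through the hash so large files never sit in memory."""
    h = hashlib.blake2b()  # stand-in for BLAKE3 in this sketch
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_groups(target):
    """Return {hash: [paths]} for every hash shared by more than one file."""
    groups = defaultdict(list)
    for root, _dirs, files in os.walk(target):
        for name in files:
            path = os.path.join(root, name)
            groups[hash_file(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

Hashing content rather than comparing names or sizes is what lets identical files be found regardless of where they live in the tree.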
## Retention strategies

The `--strategy` flag controls how the canonical copy is selected:

| Strategy | Selection rule |
|---|---|
| `shortest-path` | File with the shortest path is canonical (default) |
| `oldest-mtime` | File with the oldest modification time is canonical |
| `newest-mtime` | File with the newest modification time is canonical |
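Each strategy boils down to picking the minimum of a group under some key. A hedged sketch (the function and dictionary names here are illustrative, not the CLI's internals):

```python
import os

# One sort key per documented strategy; min() under the key is canonical.
STRATEGIES = {
    "shortest-path": lambda p: len(p),
    "oldest-mtime": lambda p: os.path.getmtime(p),
    "newest-mtime": lambda p: -os.path.getmtime(p),
}


def pick_canonical(paths, strategy="shortest-path"):
    """Return the canonical path; everything else in the group is non-canonical."""
    return min(paths, key=STRATEGIES[strategy])
```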
## Staging, not deletion

When `--execute` is passed, non-canonical copies are moved to `_dupes/` inside the target directory. The `_dupes/` directory is a staging area for review. fialr never deletes files. Purging duplicates from `_dupes/` is a manual operation.
Each moved file retains full provenance in XATTRs and SQLite:

- Original path (`com.fialr.original_path`)
- Original name (`com.fialr.original_name`)
- Content hash (`com.fialr.hash`)
- Job UUID (`com.fialr.job_uuid`)
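The staging move plus provenance record might look like the following sketch. The table name and columns are hypothetical (the docs don't specify the schema), and the sketch records provenance only in SQLite for portability; the real tool also mirrors these fields into xattrs such as `com.fialr.original_path`.

```python
import os
import shutil
import sqlite3

# Hypothetical schema mirroring the four documented provenance fields.
PROVENANCE_SCHEMA = (
    "CREATE TABLE IF NOT EXISTS provenance ("
    "original_path TEXT, original_name TEXT, hash TEXT, job_uuid TEXT)"
)


def stage_duplicate(path, target, db, content_hash, job_uuid):
    """Move a non-canonical copy into <target>/_dupes/ and record where it came from."""
    dupes = os.path.join(target, "_dupes")
    os.makedirs(dupes, exist_ok=True)
    dest = os.path.join(dupes, os.path.basename(path))
    # Record provenance first, so a crash mid-move never leaves an untracked file.
    db.execute(
        "INSERT INTO provenance VALUES (?, ?, ?, ?)",
        (path, os.path.basename(path), content_hash, job_uuid),
    )
    db.commit()
    shutil.move(path, dest)
    return dest
```

Writing the provenance row before moving means the worst crash outcome is a stale record, never an orphaned file.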
## Safety invariants

- Tier 1 files are never touched. They are flagged for manual review.
- Pre-move and post-move hash verification for every file moved.
- All operations logged to the append-only audit ledger.
- Near-duplicate detection identifies version sequences (same stem, different versions) and reports them separately from exact duplicates.
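The second invariant, pre-move and post-move hash verification, can be sketched as a wrapper around the move itself (illustrative only; `blake2b` again stands in for BLAKE3):

```python
import hashlib
import shutil


def hash_file(path):
    """Streamed hash; blake2b stands in for BLAKE3 in this sketch."""
    h = hashlib.blake2b()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verified_move(src, dest):
    """Hash before the move, move, hash again, and fail loudly on any mismatch."""
    before = hash_file(src)
    shutil.move(src, dest)
    after = hash_file(dest)
    if before != after:
        raise RuntimeError(f"hash mismatch after move: {src} -> {dest}")
    return after
```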
## Output

Dry-run:

```
dedup ~/Documents
────────────────────────────────────────────────────────
total 847 unique 801 groups 19 dupes 46 space 128.4 MB reclaimable
```

Execution:

```
dedup ~/Documents --execute
────────────────────────────────────────────────────────
moved 46 skipped 0 errors 0
```

## Examples

```sh
# Find duplicates (dry-run)
fialr dedup ~/Documents

# Move duplicates to _dupes/
fialr dedup ~/Documents --execute

# Use oldest-mtime retention strategy
fialr dedup ~/Documents --execute --strategy oldest-mtime
```

## See also

- Deduplication guide — walkthrough of the deduplication process
- Content Hash as Identity — how hashes drive deduplication
- organize — run the full pipeline before deduplicating