# deduplicate

```
fialr dedup <target> [options]
```

Identify duplicate files by BLAKE3 content hash. In dry-run mode (the default), report duplicate groups. With `--execute`, move non-canonical copies to `_dupes/`. No files are ever deleted.
## Arguments

| Argument | Description |
|---|---|
| `target` | Directory to deduplicate (required) |
## Options

| Option | Description |
|---|---|
| `--execute` | Move duplicates to `_dupes/` (not just report) |
| `--strategy STRATEGY` | Canonical selection strategy (default: `shortest-path`) |
| `--jobs-dir PATH` | Directory for job artifacts (default: `.fialr/jobs`) |
## What it does

`dedup` scans the target directory, groups files by BLAKE3 content hash, and identifies groups with more than one member. For each group, it selects one canonical copy and marks the rest as non-canonical.
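The scan-and-group step can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation; `blake2b` from the standard library stands in for BLAKE3, which requires the third-party `blake3` package.

```python
import hashlib
import os
from collections import defaultdict


def hash_file(path, chunk_size=1 << 20):
    """Stream the file through the hash so large files never sit in memory."""
    h = hashlib.blake2b()  # stand-in for BLAKE3 in this sketch
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicate_groups(target):
    """Return {hash: [paths]} for every hash shared by more than one file."""
    groups = defaultdict(list)
    for root, _dirs, files in os.walk(target):
        for name in files:
            path = os.path.join(root, name)
            groups[hash_file(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}
```

Hashing content rather than comparing names or sizes is what lets identical files be found regardless of where they live in the tree.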
## Retention strategies

The `--strategy` flag controls how the canonical copy is selected:

| Strategy | Selection rule |
|---|---|
| `shortest-path` | File with the shortest path is canonical (default) |
| `oldest-mtime` | File with the oldest modification time is canonical |
| `newest-mtime` | File with the newest modification time is canonical |
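Each strategy boils down to picking the minimum of a group under some key. A hedged sketch (the function and dictionary names here are illustrative, not the CLI's internals):

```python
import os

# One sort key per documented strategy; min() under the key is canonical.
STRATEGIES = {
    "shortest-path": lambda p: len(p),
    "oldest-mtime": lambda p: os.path.getmtime(p),
    "newest-mtime": lambda p: -os.path.getmtime(p),
}


def pick_canonical(paths, strategy="shortest-path"):
    """Return the canonical path; everything else in the group is non-canonical."""
    return min(paths, key=STRATEGIES[strategy])
```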
## Staging, not deletion

When `--execute` is passed, non-canonical copies are moved to `_dupes/` inside the target directory. The `_dupes/` directory is a staging area for review. fialr never deletes files. Purging duplicates from `_dupes/` is a manual operation.
Each moved file retains full provenance in XATTRs and SQLite:

- Original path (`com.fialr.original_path`)
- Original name (`com.fialr.original_name`)
- Content hash (`com.fialr.hash`)
- Job UUID (`com.fialr.job_uuid`)
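The staging move plus provenance record might look like the following sketch. The table name and columns are hypothetical (the docs don't specify the schema), and the sketch records provenance only in SQLite for portability; the real tool also mirrors these fields into xattrs such as `com.fialr.original_path`.

```python
import os
import shutil
import sqlite3

# Hypothetical schema mirroring the four documented provenance fields.
PROVENANCE_SCHEMA = (
    "CREATE TABLE IF NOT EXISTS provenance ("
    "original_path TEXT, original_name TEXT, hash TEXT, job_uuid TEXT)"
)


def stage_duplicate(path, target, db, content_hash, job_uuid):
    """Move a non-canonical copy into <target>/_dupes/ and record where it came from."""
    dupes = os.path.join(target, "_dupes")
    os.makedirs(dupes, exist_ok=True)
    dest = os.path.join(dupes, os.path.basename(path))
    # Record provenance first, so a crash mid-move never leaves an untracked file.
    db.execute(
        "INSERT INTO provenance VALUES (?, ?, ?, ?)",
        (path, os.path.basename(path), content_hash, job_uuid),
    )
    db.commit()
    shutil.move(path, dest)
    return dest
```

Writing the provenance row before moving means the worst crash outcome is a stale record, never an orphaned file.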
## Safety invariants

- Tier 1 files are never touched. They are flagged for manual review.
- Pre-move and post-move hash verification for every file moved.
- All operations logged to the append-only audit ledger.
- Near-duplicate detection identifies version sequences (same stem, different versions) and reports them separately from exact duplicates.
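The second invariant, pre-move and post-move hash verification, can be sketched as a wrapper around the move itself (illustrative only; `blake2b` again stands in for BLAKE3):

```python
import hashlib
import shutil


def hash_file(path):
    """Streamed hash; blake2b stands in for BLAKE3 in this sketch."""
    h = hashlib.blake2b()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def verified_move(src, dest):
    """Hash before the move, move, hash again, and fail loudly on any mismatch."""
    before = hash_file(src)
    shutil.move(src, dest)
    after = hash_file(dest)
    if before != after:
        raise RuntimeError(f"hash mismatch after move: {src} -> {dest}")
    return after
```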
## Output

Dry-run:

```
dedup ~/Documents
────────────────────────────────────────────────────────
total 847 unique 801 groups 19 dupes 46 space 128.4 MB reclaimable
```

Execution:

```
dedup ~/Documents --execute
────────────────────────────────────────────────────────
moved 46 skipped 0 errors 0
```

## Examples

```sh
# Find duplicates (dry-run)
fialr dedup ~/Documents

# Move duplicates to _dupes/
fialr dedup ~/Documents --execute

# Use oldest-mtime retention strategy
fialr dedup ~/Documents --execute --strategy oldest-mtime
```

## See also

- Deduplication guide — walkthrough of the deduplication process
- Content Hash as Identity — how hashes drive deduplication
- organize — run the full pipeline before deduplicating