File-Sync Configuration
Pre-flight checklist + ignore patterns for file-sync (Syncthing/rclone/rsync/Dropbox/iCloud) — protects git working trees, virtualenvs, and build artifacts.
Overview
File-Sync Configuration captures the pre-flight rules every two-way file synchroniser needs before it ever touches the network. Load it before configuring Syncthing, rclone bisync, Dropbox/iCloud/Google Drive shared folders, periodic rsync jobs, Disk Arcana sync, or any custom sync layer. Do not load for one-way backups or CI artifact transfer — those have a different risk model.
Why It Matters (Founding Incident)
INFRA-0026 (2026-04-25): the first Syncthing .stignore had 28 patterns. It missed .venv, __pycache__, target/, and *.db, and it did not exclude nested git repos wholesale. Outcome: one materialised production sync conflict, 60+ .sync-conflict files in a week, 14 git repos with diverging working trees across hosts, and a real cross-platform breakage risk (macOS Mach-O vs Linux ELF binaries). Expanding the pattern set from 28 to 66 cut the file count from 40,361 to 2,206 (−95%).
Pre-Flight Inventory (Mandatory)
Before turning sync on, run find against the source root for every problem class — whatever the inventory surfaces must be in the ignore list before the first sync:
- Vendored / build artifacts: node_modules, .venv/venv, __pycache__, target, .next/.turbo/.nuxt, .cache/.parcel-cache, coverage/.nyc_output, dist/build/.build, DerivedData, .pytest_cache/.mypy_cache/.ruff_cache
- Nested .git directories (critical): every .git/ under the sync root
- Local DB / state files: *.db, *.sqlite/*.sqlite3, *.duckdb, *.db-journal
- Compiled binaries (cross-platform unsafe): *.so, *.dylib, *.dll, *.exe
- IDE / OS junk: .idea, .vscode, .DS_Store, Thumbs.db
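The inventory above can be sketched in Python. This is a minimal illustration that swaps find for os.walk; the class keys (build_artifacts, nested_git, and so on) and the function name inventory are hypothetical labels, while the names and globs are copied from the list.

```python
import fnmatch
import os
from collections import Counter

# Directory names per problem class, copied from the inventory list.
DIR_CLASSES = {
    "build_artifacts": {
        "node_modules", ".venv", "venv", "__pycache__", "target",
        ".next", ".turbo", ".nuxt", ".cache", ".parcel-cache",
        "coverage", ".nyc_output", "dist", "build", ".build",
        "DerivedData", ".pytest_cache", ".mypy_cache", ".ruff_cache",
    },
    "nested_git": {".git"},
    "ide_os_junk": {".idea", ".vscode"},
}

# File globs per problem class, copied from the inventory list.
FILE_CLASSES = {
    "db_state": ["*.db", "*.sqlite", "*.sqlite3", "*.duckdb", "*.db-journal"],
    "binaries": ["*.so", "*.dylib", "*.dll", "*.exe"],
    "ide_os_junk": [".DS_Store", "Thumbs.db"],
}


def inventory(root):
    """Count hits per problem class under root.

    Matched directories are counted once and not descended into,
    so files inside node_modules or .git are not counted separately.
    """
    hits = Counter()
    for _dirpath, dirnames, filenames in os.walk(root):
        kept = []
        for d in dirnames:
            cls = next((c for c, names in DIR_CLASSES.items() if d in names), None)
            if cls is None:
                kept.append(d)
            else:
                hits[cls] += 1
        dirnames[:] = kept  # prune matched directories from the walk
        for f in filenames:
            for cls, globs in FILE_CLASSES.items():
                if any(fnmatch.fnmatchcase(f, g) for g in globs):
                    hits[cls] += 1
                    break
    return hits
```

Any class with a non-zero count needs a matching ignore pattern before the first sync.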
Decision Tree: Sync Working Trees vs Git Pull
For every .git/ found inside the sync root, ask: does the second node host live edits, agents, or production runtime in this repo?
- Yes → do not sync the working tree. Exclude /path/to/repo wholesale. Update via a git pull cron job on the second node (see the arcanada-pull.sh pattern).
- No → the working tree may be synced as a read-only mirror, but still exclude .git/; each node keeps its own commit history.
Default to "yes" — almost every second node eventually becomes "active" (a new agent, a deploy script, a manual edit). Overprotection beats recovery.
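The decision tree, including its default, fits in a few lines. The sketch below is illustrative only; Policy and repo_policy are hypothetical names, and "unknown" is modelled as None.

```python
from enum import Enum


class Policy(Enum):
    EXCLUDE_REPO = "exclude the whole repo; update via git pull"
    MIRROR_WITHOUT_GIT = "sync working tree read-only; still exclude .git/"


def repo_policy(second_node_active=None):
    """Return the sync policy for one repo that contains a .git/ directory.

    second_node_active is True when the second node hosts live edits,
    agents, or production runtime in this repo; None means unknown.
    """
    # Unknown is treated as "yes": the text defaults to overprotection.
    if second_node_active is None or second_node_active:
        return Policy.EXCLUDE_REPO
    return Policy.MIRROR_WITHOUT_GIT
```

Only an explicit False yields the read-only mirror; every other answer excludes the repo.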
Reusable .stignore Template (Syncthing, INFRA-0026 v2)
The hardened template covers eight buckets:
- Project source code (separate git repos): /Projects/*/code, /Projects/Datarim/sources, /Projects/Rules of Robotics/Code
- AI agents with their own git/venv: /AI_agents/Email Agent, Screen reader, Remove-Watermark, Agent Dreamer
- Workflow / runtime state: .git, .dreamer, .meta, .claude, .githooks
- Build / deps: node_modules, dist, build, .next, .turbo, .nuxt, .cache, .parcel-cache, coverage, .nyc_output, target
- Python environments / caches: .venv, venv, __pycache__, *.pyc, .pytest_cache, .mypy_cache, .ruff_cache
- Swift build artifacts: .build, DerivedData, *.xcuserstate
- Compiled binaries: *.so, *.dylib, *.dll, *.exe, *.o, *.a
- DB / state files: *.db, *.sqlite, *.sqlite3, *.duckdb, *.db-journal, *.db-shm, *.db-wal
Plus secrets/temp/OS junk: *.tmp, *.log, .env*, .DS_Store, Thumbs.db, .Spotlight-V100, .Trashes, .fseventsd.
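One way to keep such a template maintainable is to hold the buckets as data and render the .stignore text from them. The sketch below is an assumption about tooling, not the INFRA-0026 template itself: it covers only four of the buckets (the path-specific ones are left out), and render_stignore is a hypothetical helper. It relies on Syncthing treating lines that start with // as comments.

```python
# A subset of the buckets listed above, copied verbatim.
BUCKETS = {
    "Workflow / runtime state": [".git", ".dreamer", ".meta", ".claude", ".githooks"],
    "Python environments / caches": [
        ".venv", "venv", "__pycache__", "*.pyc",
        ".pytest_cache", ".mypy_cache", ".ruff_cache",
    ],
    "Compiled binaries": ["*.so", "*.dylib", "*.dll", "*.exe", "*.o", "*.a"],
    "DB / state files": [
        "*.db", "*.sqlite", "*.sqlite3", "*.duckdb",
        "*.db-journal", "*.db-shm", "*.db-wal",
    ],
}


def render_stignore(buckets):
    """Render buckets as .stignore text: a // comment per bucket, one pattern per line."""
    lines = []
    seen = set()
    for title, patterns in buckets.items():
        lines.append(f"// {title}")
        for pattern in patterns:
            if pattern not in seen:  # skip duplicates across buckets
                seen.add(pattern)
                lines.append(pattern)
        lines.append("")  # blank line between buckets
    return "\n".join(lines).rstrip("\n") + "\n"
```

Writing the result to the folder's .stignore is left to the caller, so the same data can also feed a review or diff step first.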
Pattern Syntax Cheat-Sheet
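Syncthing (.stignore): node_modules matches at any depth, /Projects/*/code is anchored to the folder root, (?d) marks ignored files that may be deleted when they would otherwise block removal of a directory (it does not delete files that are already synced), (?i) is case-insensitive, !important.log negates. The first matching pattern wins, so put negations above the patterns they override.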
rclone: trailing / targets folders only, ** recurses across folders, /path/to/exclude/** is path-anchored.
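rsync: a trailing / restricts a pattern to directories, a leading / anchors the pattern at the root of the transfer, an unanchored pattern such as *.tmp already matches at any depth, and ** matches across path separators.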
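To make the differences concrete, the sketch below expresses one intent, "exclude this directory and everything in it", in each tool's syntax. exclude_dir is a hypothetical helper, and the outputs are a Syncthing .stignore line plus --exclude arguments for rclone and rsync.

```python
def exclude_dir(name, anchored=False):
    """Express "exclude this directory and its contents" for each tool.

    name is a directory name to match at any depth or, with anchored=True,
    a path relative to the sync root.
    """
    path = name.strip("/")
    if anchored:
        path = "/" + path  # a leading slash anchors the pattern in all three tools
    return {
        # .stignore line; note it also matches a plain file with this name
        "syncthing": path,
        # rclone: ** recurses into everything below the directory
        "rclone": f"--exclude={path}/**",
        # rsync: a trailing slash matches directories only
        "rsync": f"--exclude={path}/",
    }
```

For example, exclude_dir("node_modules") yields the unanchored forms, and exclude_dir("Projects/demo", anchored=True) yields patterns tied to the sync root (Projects/demo is a made-up path).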
Workflow for Git-Managed Repos (when file-sync is excluded)
- Cron git pull script: the recommended pattern is arcanada-pull.sh: fetch upstream, skip if local==remote, skip if branch ≠ main/master, stash local edits, ff-only pull with merge fallback, then a CLI Claude conflict-resolver fallback, alert via Ops Bot if unresolved, pop stash.
- CI/CD self-hosted runner: a GitHub Actions runner on the second node pulls on push to main (event-driven, no polling).
- Manual: the operator runs git pull on demand. Fine for rarely-updated repos.
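The skip and stash logic of the cron pattern can be sketched as a pure planning function. This is not arcanada-pull.sh, which is not shown here; plan_pull is a hypothetical name, it assumes git fetch has already run, and it leaves out the merge fallback, the conflict resolver, and the alerting.

```python
def plan_pull(branch, local_sha, remote_sha, dirty):
    """Return the git commands to run after a fetch, as argv lists.

    An empty list means "skip this run".
    """
    if branch not in ("main", "master"):
        return []  # only main/master is updated automatically
    if local_sha == remote_sha:
        return []  # already up to date
    steps = []
    if dirty:
        # park local edits so the fast-forward cannot clobber them
        steps.append(["git", "stash", "push", "--include-untracked"])
    steps.append(["git", "pull", "--ff-only"])
    if dirty:
        steps.append(["git", "stash", "pop"])
    return steps
```

Each returned argv could be passed to subprocess.run(argv, cwd=repo, check=True); keeping the plan separate from execution makes the skip rules easy to test without a real repository.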
Compliance Check
- Pre-flight inventory completed for every problem class
- Every discovered class is present in the ignore patterns
- Every nested .git/ is either fully excluded or documented as a read-only mirror
- Cross-platform binary classes (.venv, target, *.so/*.dylib/*.dll) excluded if syncing across operating systems
- DB files (*.db, *.sqlite) excluded as host-local state
- Lockdown applied (globalAnnounce=false, no public discovery, transport restricted to a private network such as Tailscale)
- Backup of pre-change config preserved (config.xml.pre-{TASK-ID})
- Runbook documented (topology, ops, rollback)
- Bidirectional smoke test executed
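The second checklist item (every discovered class is present in the ignore patterns) lends itself to a mechanical check. The sketch below is a rough aid under stated assumptions: missing_from_ignore is a hypothetical helper, it compares strings exactly rather than evaluating globs, and it understands only Syncthing-style // comments, ! negations, and (?d)/(?i) prefixes.

```python
def missing_from_ignore(discovered, ignore_text):
    """Return discovered names or globs that have no exact line in the ignore file."""
    active = set()
    for raw in ignore_text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and negations (a negation re-includes files).
        if not line or line.startswith("//") or line.startswith("!"):
            continue
        # Drop Syncthing prefixes such as (?d) or (?i) before comparing.
        while line.startswith("(?") and ")" in line:
            line = line[line.index(")") + 1:]
        active.add(line)
    return sorted(set(discovered) - active)
```

An empty result means every discovered name has a literal entry; a non-empty result lists what still needs a pattern or a manual review.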