Rollback. Branch. Share. The state model underneath every container.
Five filesystem primitives and a BTRFS copy-on-write layer make container state survive, time-travel, and move between servers — without leaving HTTP.
/hoody/storage · /hoody/databases · /ramdisk · /hoody/shares · BTRFS snapshots
/
├── hoody/
│   ├── storage/     ← persistent, per-container
│   ├── databases/   ← concurrent-safe SQLite (FUSE)
│   └── shares/      ← inter-container directory mounts
├── ramdisk/         ← RAM-backed, up to 50% of host memory
└── ...              ← standard Linux FS (ext4, POSIX)
The filesystem map.
Each path has a different persistence and concurrency story. Picking the right one is the whole mental model. Full deep-dives in the sections below.
/hoody/storage
Persistent per-container directory. Survives restarts; snapshots capture it. A regular ext4 directory — no FUSE, no concurrency safety beyond what your app provides.
/hoody/databases
FUSE mount. Multiple processes, and multiple containers on the same server, can write to the same SQLite file concurrently without 'database is locked' errors. Adopting it is a path change only: move the file from /app/data.db to /hoody/databases/data.db. The mount is host-level and is not replicated across servers.
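A minimal sketch of what "path change only" could look like in application code, using Python's standard sqlite3 module. The APP_DB_PATH environment variable, the events table, and both helper functions are hypothetical names for illustration; only the /hoody/databases/data.db path comes from the description above.

```python
import os
import sqlite3
from typing import Optional

# Path taken from the text above. APP_DB_PATH is a hypothetical override,
# not something Hoody defines.
DEFAULT_DB_PATH = "/hoody/databases/data.db"


def open_db(path: Optional[str] = None) -> sqlite3.Connection:
    """Open the SQLite file. Only the path differs between environments."""
    db_path = path or os.environ.get("APP_DB_PATH", DEFAULT_DB_PATH)
    conn = sqlite3.connect(db_path)
    # Hypothetical schema, created on first use.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (id INTEGER PRIMARY KEY, body TEXT)"
    )
    conn.commit()
    return conn


def record_event(conn: sqlite3.Connection, body: str) -> int:
    """Insert one row inside a transaction and return its row id."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO events (body) VALUES (?)", (body,))
    return cur.lastrowid
```

The application code is the same whether the file lives at /app/data.db or under the FUSE mount. Any concurrency guarantee comes from the mount as described above, not from this snippet.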
/ramdisk
RAM-backed tmpfs at 10–20 GB/s, <1µs latency. Ceiling of 50% of host memory, on-demand allocation (0 bytes used when empty). Persists through container restart, cleared on host reboot. Your usage competes with application memory.
/hoody/shares
Inter-container directory mounts via the Storage Shares API. Read-only or read-write, 1-to-1 or project-wide. Cross-server shares work automatically (full POSIX, transparent file-locking) — no mount setup. Lifecycle (accept / reject / mount / revoke) lives on /platform/control-plane.
/ (ext4)
Standard Linux filesystem everywhere else. POSIX, ext4, full semantics. Behaves like any VPS outside the /hoody/* paths.
BTRFS under it all
Copy-on-write layer beneath the disk-backed paths. Block-level snapshots, deduplication across containers. Instant creation, space-efficient restore. (Not /ramdisk — that's tmpfs in RAM.)
Copy-on-write, block-level, instant.
BTRFS stores only the blocks that changed since the snapshot point. Creating a snapshot adds a marker, not data. Cost scales with how much a container actually changes — not with how many snapshots or containers sit on the same base image.
t0 — snapshot A
Container filesystem has blocks a, b, c. Snapshot A references all three.
t1 — block b changes
Copy-on-write puts the modified data in a new block b' rather than overwriting b. The original b stays referenced by snapshot A. The container now sees a, b', c.
t2 — snapshot B
Snapshot B captures a, b', c. A and B share a and c. Only b/b' diverges. Total storage: 4 blocks, not 6.
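The t0 to t2 walkthrough can be reproduced with a toy model. This is an illustrative simulation of copy-on-write reference sharing, not how BTRFS is implemented; the CowVolume class and its methods are invented for the example.

```python
from typing import Dict, List


class CowVolume:
    """Toy copy-on-write volume: snapshots hold references to immutable blocks."""

    def __init__(self, contents: List[str]) -> None:
        self._store: Dict[int, str] = {}  # block id -> data, never overwritten
        self._next_id = 0
        self.live: List[int] = [self._alloc(data) for data in contents]
        self.snapshots: Dict[str, List[int]] = {}

    def _alloc(self, data: str) -> int:
        block_id = self._next_id
        self._store[block_id] = data
        self._next_id += 1
        return block_id

    def snapshot(self, name: str) -> None:
        # Copies only the list of references, never the block data.
        self.snapshots[name] = list(self.live)

    def write(self, index: int, data: str) -> None:
        # New data goes to a new block; the old block stays untouched.
        self.live[index] = self._alloc(data)

    def read(self, refs: List[int]) -> List[str]:
        return [self._store[block_id] for block_id in refs]

    def blocks_stored(self) -> int:
        # Count distinct blocks referenced by the live view or any snapshot.
        referenced = set(self.live)
        for refs in self.snapshots.values():
            referenced.update(refs)
        return len(referenced)


vol = CowVolume(["a", "b", "c"])
vol.snapshot("A")   # t0
vol.write(1, "b'")  # t1
vol.snapshot("B")   # t2
```

After these three steps, vol.blocks_stored() is 4: blocks a and c are shared by both snapshots, and only b and b' are stored separately.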
Snapshots capture the disk — never live memory.
However you take it — running or stopped — a Hoody snapshot captures the filesystem only: never running processes, RAM, or in-flight network connections (the stateful field is always false). Restore reverts the disk and the container boots fresh from it; processes do not resume. Need live memory and processes preserved instead? That is a separate operation — pause/resume.
Snapshot → filesystem only
Everything on disk, nothing in memory.
- Filesystem: everything written to disk in /, including /hoody/*
- Every file, package, and installed dependency on disk
- Databases and app data already flushed to disk
- Restore reverts the disk; the container boots fresh from it and processes do not resume
- The stateful field is always false, which is what makes snapshots fast, tiny, and reliable
Pause/resume → RAM frozen
The separate operation for when memory must survive.
- POST /api/v1/containers/ID/pause: the pause lifecycle operation
- Container state is frozen in RAM, suspended in place
- The resume operation brings processes back mid-execution
- Not a snapshot: it creates no restore point on disk
Restore is destructive: it overwrites the current live disk state. To keep the present, snapshot it first, then restore the target.
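The "snapshot first, then restore" ordering can be written down as a small call plan. This sketch only builds (method, path) pairs and sends nothing. The PATCH and pause paths follow the examples on this page; expanding the short form "POST /snapshots" to a container-scoped path is an assumption, and all function names are hypothetical.

```python
from typing import List, Tuple
from urllib.parse import quote

API_PREFIX = "/api/v1/containers"  # prefix as shown in the pause and PATCH examples

Call = Tuple[str, str]  # (HTTP method, path)


def create_snapshot(container_id: str) -> Call:
    # Assumption: "POST /snapshots" expands to this container-scoped path.
    return ("POST", f"{API_PREFIX}/{quote(container_id, safe='')}/snapshots")


def restore_snapshot(container_id: str, name: str) -> Call:
    return (
        "PATCH",
        f"{API_PREFIX}/{quote(container_id, safe='')}/snapshots/{quote(name, safe='')}",
    )


def pause(container_id: str) -> Call:
    # Separate lifecycle operation: freezes RAM, creates no restore point.
    return ("POST", f"{API_PREFIX}/{quote(container_id, safe='')}/pause")


def restore_keeping_present(container_id: str, target: str) -> List[Call]:
    """Restore is destructive, so capture the present state before reverting."""
    return [create_snapshot(container_id), restore_snapshot(container_id, target)]
```

Calling restore_keeping_present("ID", "v1.4.0-pre") yields the POST followed by the PATCH, in that order. Keeping the plan separate from the HTTP client makes the ordering easy to check before anything destructive runs.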
Let the agent try. Keep the undo button.
LLMs that touch auth middleware, database migrations, or broad refactors benefit most from a snapshot-before-run pattern. Cheap to take. Fast to restore. One API call in each direction.
Without snapshots
1. Agent refactors your auth middleware. First smoke tests pass.
2. You merge and deploy. Everything looks fine for days.
3. Sessions start dropping silently in production.
4. Bisecting recent agent commits takes hours because the change is buried in a large diff.
5. Rollback means reverting every merged agent PR by hand and redeploying.
With Hoody snapshots
1. Snapshot the container before the agent runs. Give it an alias like pre-auth-refactor.
2. Let the agent work. It edits files, restarts services, runs smoke tests.
3. Something looks wrong in production a week later.
4. Snapshot the broken state first, so it stays available for offline investigation (restore overwrites the live disk).
5. PATCH /snapshots/pre-auth-refactor restores the container to the pre-agent state in 5–15s.
The safety-net pattern is why every AI-assisted workflow — code generation, infrastructure refactoring, database migrations — should run inside a snapshotted container. The snapshot is cheap; the discovery cost of a bad AI change is not.
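The snapshot-before-run pattern reduces to a small piece of control flow. The sketch below takes the four steps as plain callables so it stays independent of any HTTP client; run_with_undo and its parameter names are hypothetical, and in practice take_snapshot and restore would wrap the snapshot API calls.

```python
from typing import Callable


def run_with_undo(
    take_snapshot: Callable[[], None],
    risky_work: Callable[[], None],
    smoke_test: Callable[[], bool],
    restore: Callable[[], None],
) -> bool:
    """Snapshot, run the risky step, and restore if it fails or tests fail."""
    take_snapshot()
    try:
        risky_work()
        ok = smoke_test()
    except Exception:
        # A crash during the risky step counts as a failure.
        ok = False
    if not ok:
        restore()
    return ok


# Usage with stand-in callables that just record what happened.
log = []
succeeded = run_with_undo(
    take_snapshot=lambda: log.append("snapshot"),
    risky_work=lambda: log.append("agent-run"),
    smoke_test=lambda: False,  # pretend the smoke tests failed
    restore=lambda: log.append("restore"),
)
```

The snapshot is always taken before the risky step, so the restore path exists even when the work raises halfway through.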
The workflow is a commit graph for entire machines.
Snapshot before a risky change. Iterate. If the result is good, keep going — the snapshot is cheap and expirable. If it breaks, one PATCH call reverts the container's filesystem exactly to that point — the container boots fresh from the restored disk.
t0 — baseline
POST /snapshots — tagged v1.4.0-pre
t1 — risky work
AI agent refactors, migrations run, services restart
t2 — broke something
Smoke tests fail. Need to go back.
t3 — restore
PATCH /snapshots/v1.4.0-pre — 5–15s restore
t4 — identical to t0
Filesystem matches t0 exactly; the container boots fresh from it. Zero disk drift.
PATCH /api/v1/containers/ID/snapshots/v1.4.0-pre
When SSD is the bottleneck, /ramdisk is the answer.
Up to half the host's memory, reachable as /ramdisk, allocated on-demand. It's there when you use it, disappears when you don't. Persists through container restart. Clears on host reboot.
⚠ /ramdisk usage counts against container memory. If the container has 4 GB and /ramdisk holds 3 GB, the application has 1 GB to work with. Monitor with `free -h` and keep what you write to /ramdisk small enough to leave the application headroom.
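The arithmetic in the warning can be checked from inside the container. A minimal sketch, assuming you already know the container's memory limit (it is passed in here, not discovered); both function names are hypothetical.

```python
import shutil

GIB = 1024 ** 3


def ramdisk_used_bytes(path: str = "/ramdisk") -> int:
    """Bytes currently held by the filesystem mounted at `path`."""
    return shutil.disk_usage(path).used


def app_headroom_bytes(container_limit_bytes: int, ramdisk_bytes: int) -> int:
    """Memory left for the application once /ramdisk usage is subtracted."""
    return max(container_limit_bytes - ramdisk_bytes, 0)


# The example from the warning: a 4 GiB container with 3 GiB on /ramdisk.
headroom = app_headroom_bytes(4 * GIB, 3 * GIB)
```

Here headroom equals 1 * GIB. A periodic check that compares ramdisk_used_bytes() against a budget you choose is one way to notice pressure before the application does.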
Five snapshot strategies teams actually use.
Pick one and your state discipline becomes a one-line decision, not a policy doc. Most teams run two or three of these in parallel.
1 · Pre-operation safety
Snapshot before anything destructive: migrations, AI code generation, incident response, manual hotfixes.
2 · Versioned milestones
Alias snapshots at release points — v1.4.0, v1.5.0-rc. Expiry weeks out. Instant rollback to any named version.
3 · Daily automated
Cron-snapshot with auto-expiry = self-pruning history. Seven days of yesterdays, thirty days of last months.
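A sketch of what a daily cron job might compute before calling the snapshot API. The alias format, the expires_at field name, and the seven-day default are assumptions for illustration; the page does not specify how expiry is expressed in the request.

```python
from datetime import datetime, timedelta, timezone


def daily_snapshot_plan(now: datetime, keep_days: int = 7) -> dict:
    """Build a dated alias and an expiry timestamp for one daily snapshot."""
    return {
        # Hypothetical alias convention: one snapshot per calendar day.
        "alias": f"daily-{now:%Y-%m-%d}",
        # Hypothetical field name; expiry is what makes the history self-pruning.
        "expires_at": (now + timedelta(days=keep_days)).isoformat(),
    }


plan = daily_snapshot_plan(datetime(2024, 1, 1, tzinfo=timezone.utc))
```

With a fixed keep_days, each snapshot carries its own expiry, so no separate cleanup job is needed to prune old entries.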
4 · Git-style branching
Snapshot + container copy = an alternate timeline on a different project or server. Try a risky path on the copy. If it works, rebuild the baseline there; sync is one-way so the copy is where the new truth lives.
5 · Golden-image templates
Seed a snapshot, copy-from-snapshot for every new dev container. Onboarding becomes one POST call.
Bonus · Forensic preservation
When production is compromised: snapshot the compromised state for investigation, restore production from a clean earlier snapshot, diff the two offline. Incident response without losing evidence.
What you would otherwise stitch together.
Rollback, live-memory pause/resume, concurrent-safe SQLite, cross-container shares, RAM-backed scratch — each has a traditional answer. Here's the honest side-by-side.
| Concern | Hoody Data & State | Traditional stack |
|---|---|---|
| Roll back an entire machine | PATCH /containers/ID/snapshots/NAME | Tarball + hand-redeploy + pray |
| Freeze live memory state | POST /containers/ID/pause (pause/resume) | VMware suspend + custom tooling |
| Cross-container directory share | /hoody/shares + Shares API | Run NFS or SMB server yourself |
| Concurrent SQLite writes | /hoody/databases (FUSE mount) | Rewrite your data layer on Postgres |
| RAM-backed scratch space | /ramdisk (up to 50% of host memory) | tmpfs + careful ulimits |
| Storage dedup across similar containers | BTRFS copy-on-write (built in) | rsync --link-dest, manual policy |
| Cross-server state replication | POST /containers/ID/copy + /sync | DIY rsync loops + service restart |
If you are already on a managed VM snapshot system for a specific workload, stay there for that workload. Hoody's state model earns its place when the primitive you want is actually container-level time travel.
Your state is already a commit graph. Learn to use it.
The filesystem is already there. The snapshots are already there. The mounts are already there. Spin up a container and the whole state model is live.
See also — /platform/control-plane for the snapshot and copy/sync APIs, /kit/files for cloud backends, /kit/sqlite for SQLite as HTTP.