A persistent ELIPS database is a directory. Each file inside it has a single responsibility: identity, WAL, manifest, segments, snapshot, lock, embedder.
On disk
/my_db/
├── LOCK # advisory file lock
├── IDENTITY # dimension, metric, index type
├── TEXT_EMBEDDER.manifest # embedder identity
├── wal.log # write-ahead log
├── elips.manifest # segmented mode root
├── text_embedder/
│ └── default_v1_<dim>.localembed
├── segments/
│ └── vault_<n>_<epoch>.segment
└── elips.snapshot # snapshot mode (compat)
IDENTITY
The durable source of truth for dimension, metric, and index type. Existing databases reopen with this identity; passing a conflicting value raises ConfigError.
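The reopen rule can be sketched as a comparison between the stored identity and whatever the caller passes. This is a minimal illustration, not the ELIPS implementation: `Identity`, `resolve_identity`, and the local `ConfigError` class are hypothetical stand-ins, and the metric and index names in the usage note are placeholders.

```python
from dataclasses import dataclass
from typing import Optional


class ConfigError(Exception):
    """Local stand-in for the ConfigError named in the text."""


@dataclass(frozen=True)
class Identity:
    dimension: int
    metric: str
    index_type: str


def resolve_identity(stored: Identity,
                     dimension: Optional[int] = None,
                     metric: Optional[str] = None,
                     index_type: Optional[str] = None) -> Identity:
    """Return the stored identity, rejecting any conflicting caller value."""
    requested = {"dimension": dimension, "metric": metric, "index_type": index_type}
    for field, value in requested.items():
        on_disk = getattr(stored, field)
        # Omitted arguments (None) defer to what is on disk.
        if value is not None and value != on_disk:
            raise ConfigError(
                f"{field}: database has {on_disk!r}, caller passed {value!r}")
    return stored
```

With a stored identity of `Identity(384, "cosine", "hnsw")`, calling `resolve_identity(stored, dimension=384)` returns the stored identity, while `resolve_identity(stored, metric="l2")` raises.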
Embedder manifest
TEXT_EMBEDDER.manifest records provider, model, revision, dimension, fingerprint, whether the embedder is rehydratable, and a relative artifact path when applicable. For the built-in local embedder this manifest plus the .localembed artifact is everything required to restore the same embedder on reopen.
WAL
Every mutation appends to wal.log before the in-memory vault changes. Each record is framed with a CRC32C checksum. Supported ops:
- insert: vector + payload.
- erase: id-only.
- insert_ex: full document attachment, chunk info, and embedding lineage.
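To make the framing concrete, here is a sketch of a CRC32C-checked record. The layout (little-endian length, then checksum, then payload) is an assumption for illustration only; the actual wal.log byte format is not specified here. The checksum is a plain table-driven CRC32C (Castagnoli polynomial), written out because the Python standard library only ships CRC32.

```python
import struct


def _build_table():
    # Reflected Castagnoli polynomial used by CRC32C.
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data: bytes) -> int:
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


# Hypothetical header: payload length, then CRC32C of the payload.
_HEADER = struct.Struct("<II")


def frame(payload: bytes) -> bytes:
    return _HEADER.pack(len(payload), crc32c(payload)) + payload


def unframe(buf: bytes) -> bytes:
    if len(buf) < _HEADER.size:
        raise ValueError("truncated header")
    length, checksum = _HEADER.unpack_from(buf, 0)
    payload = buf[_HEADER.size:_HEADER.size + length]
    if len(payload) != length or crc32c(payload) != checksum:
        raise ValueError("corrupt or truncated record")
    return payload
```

A single flipped bit in the payload changes the checksum, so `unframe` rejects the record instead of returning damaged data.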
Durability controls when the log flushes:
| Mode | Flush |
|---|---|
| paranoid | Flush + fsync per write. |
| standard | Flush per write. |
| relaxed | Buffer until checkpoint / close. |
| ephemeral | No WAL attached. |
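The difference between the modes comes down to which calls follow each append. The sketch below maps the four mode names onto standard file operations; `append_record` is a hypothetical helper, not an ELIPS function, and it assumes a buffered binary file object.

```python
import os
import tempfile


def append_record(log, record: bytes, mode: str) -> None:
    """Append one record and apply the flush policy for the given mode."""
    if mode == "ephemeral":
        return                      # no WAL attached, nothing is written
    log.write(record)
    if mode == "paranoid":
        log.flush()                 # user-space buffer -> OS page cache
        os.fsync(log.fileno())      # OS page cache -> storage device
    elif mode == "standard":
        log.flush()                 # survives a process crash, not power loss
    elif mode == "relaxed":
        pass                        # stays buffered until checkpoint / close
    else:
        raise ValueError(f"unknown durability mode: {mode!r}")


# Observe the buffering: a relaxed write is not yet visible in the file size.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "wal.log")
    with open(path, "ab", buffering=1 << 16) as log:
        append_record(log, b"aaaa", "relaxed")
        size_after_relaxed = os.path.getsize(path)
        append_record(log, b"bbbb", "standard")
        size_after_standard = os.path.getsize(path)
```

The trade-off is the usual one: each step down the table removes a system call from the write path and widens the window of writes that a crash can lose.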
Checkpoint & compact
checkpoint() writes the current logical state and truncates the WAL. In segmented mode it writes one fresh segment per vault, rewrites elips.manifest, and removes obsolete segment files. In snapshot mode it writes elips.snapshot.tmp, then atomically renames into place.
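The snapshot-mode step is the familiar write-then-rename pattern. A minimal sketch follows, assuming the state has already been serialized to bytes; `write_snapshot` is a hypothetical helper, and the fsync before the rename is an assumption about what a careful implementation would do rather than something stated above.

```python
import os
import tempfile


def write_snapshot(db_dir: str, state: bytes) -> str:
    final = os.path.join(db_dir, "elips.snapshot")
    tmp = final + ".tmp"
    with open(tmp, "wb") as f:
        f.write(state)
        f.flush()
        os.fsync(f.fileno())    # assumption: make the bytes durable before renaming
    # os.replace swaps the file in one step: readers see the old snapshot
    # or the new one, never a partially written file.
    os.replace(tmp, final)
    return final


with tempfile.TemporaryDirectory() as db_dir:
    write_snapshot(db_dir, b"state v1")
    path = write_snapshot(db_dir, b"state v2")
    with open(path, "rb") as f:
        final_contents = f.read()
    leftover_tmp = os.path.exists(path + ".tmp")
```

If the process dies before the rename, the previous elips.snapshot is untouched and only a stray .tmp file remains.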
compact() rebuilds every vault index from the authoritative record store, then checkpoints. It is useful after large deletions or to reset graph topology.
Recovery
Corrupt or truncated WAL tails are tolerated: replay stops at the first invalid record and preserves the valid prefix. This is what makes ungraceful shutdowns safe in practice.
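Prefix-preserving replay can be sketched as a loop that stops at the first record that fails validation. The record layout here (length, checksum, payload) is hypothetical, and `zlib.crc32` stands in for CRC32C purely to keep the example short; the stopping rule is the point.

```python
import struct
import zlib
from typing import List

# Hypothetical header: payload length, then checksum of the payload.
_HEADER = struct.Struct("<II")


def frame(payload: bytes) -> bytes:
    return _HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def replay(log: bytes) -> List[bytes]:
    """Return every valid record up to the first invalid one."""
    records = []
    offset = 0
    while offset + _HEADER.size <= len(log):
        length, checksum = _HEADER.unpack_from(log, offset)
        start = offset + _HEADER.size
        payload = log[start:start + length]
        # A short read means a truncated tail; a checksum mismatch means corruption.
        if len(payload) < length or zlib.crc32(payload) != checksum:
            break
        records.append(payload)
        offset = start + length
    return records
```

Given three framed records, chopping two bytes off the end yields the first two records, and corrupting the second record yields only the first. Nothing after the first bad record is applied, even if later bytes happen to look valid.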
Read-only mode
Read-only opens require an existing database and take a shared lock. Multiple readers coexist; no WAL writer is attached; every mutation path raises StorageError. This is the supported mode for fan-out serving and shared-reader analytics.
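The shared-versus-exclusive behavior can be illustrated with POSIX advisory locks. This sketch assumes `flock`-style locking on the LOCK file, which is one common way to implement it and not a confirmed detail of ELIPS; `acquire` is a hypothetical helper, and the example runs only on POSIX systems.

```python
import fcntl
import os
import tempfile


def acquire(db_dir: str, read_only: bool) -> int:
    """Lock the LOCK file and return its descriptor; closing it releases the lock."""
    path = os.path.join(db_dir, "LOCK")
    if read_only:
        fd = os.open(path, os.O_RDONLY)     # fails if the database does not exist
        mode = fcntl.LOCK_SH
    else:
        fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
        mode = fcntl.LOCK_EX
    try:
        fcntl.flock(fd, mode | fcntl.LOCK_NB)   # fail fast instead of waiting
    except OSError:
        os.close(fd)
        raise
    return fd


with tempfile.TemporaryDirectory() as db_dir:
    os.close(acquire(db_dir, read_only=False))  # a writer creates LOCK, then exits
    reader_a = acquire(db_dir, read_only=True)
    reader_b = acquire(db_dir, read_only=True)  # second reader is admitted
    try:
        os.close(acquire(db_dir, read_only=False))
        writer_blocked = False
    except OSError:
        writer_blocked = True                   # exclusive lock denied while readers hold it
    os.close(reader_a)
    os.close(reader_b)
```

Because the lock is advisory, it only coordinates processes that also take it; it does not stop an unrelated program from modifying the files.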