ELIPS ships two Python surfaces over the same C++ core. The low-level bindings (open / Database / Vault / Config) mirror the runtime one-to-one. The modern wrapper (connect / Engine / Arena) adds typed text-first ergonomics on top. Both speak to the same vaults, the same WAL, and the same planner.
Install & build
The bindings are built from the repository — there is no PyPI wheel yet. Build the extension and put the package on PYTHONPATH:
cmake -S . -B build -G Ninja -DELIPS_BUILD_PYTHON=ON
cmake --build build --target elips_pymodule
export PYTHONPATH=$PWD/bindings/python
Two surfaces, one core
- Low-level: open(), open_with_config(), Database, Vault, Config. Pick this when you need exact parity with the C++ runtime: full control over Durability, AccessMode, GPU settings, and the embedder.
- Modern: connect(), Engine, Arena, RecordInput, Row, Hit. Pick this for typed, text-first ingestion and retrieval.
Low-level API
import elips
db = elips.open("/tmp/elips-sdk", dimension=128, metric="cosine")
docs = db.vault("documents")
docs.place_document("alpha design note", {"kind": "design"})
docs.place_document("beta runbook", {"kind": "ops"})
hits = docs.seek_text("alpha", top=2)
print(hits[0].document.text, db.config.text_embedder_info.model)
plan = docs.explain_seek(
    [1.0, 0.0],
    top=1,
    where=elips.Filter().field("kind").equals("design"),
    has_text_component=True,
)
print(plan.strategy, plan.metadata_accelerated)
elips.open()
def open(
    path: str,
    dimension: int = 0,
    metric: str = "cosine",
    index: str = "graph",
    access_mode: str = "read_write",
    embedder = None,
    use_default_text_embedder: bool = True,
) -> Database
- path: filesystem directory or ":memory:".
- dimension: required for new databases and every in-memory open. Existing databases reuse the persisted identity.
- metric: "cosine", "euclidean", or "dot_product".
- index: "graph" (HNSW) or "exact".
- access_mode: "read_write" or "read_only"; read-only requires an existing database.
- embedder: optional Python callable or LocalEmbedderConfig.
- use_default_text_embedder: attach the built-in local embedder automatically when no explicit embedder is supplied.
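The constraints in the parameter list above can be checked before the call reaches the C++ core. The following is a minimal sketch, not part of ELIPS: `check_open_args` is a hypothetical helper, and it encodes only the rules stated above (the allowed values for metric, index, and access_mode, and the dimension requirement for ":memory:").

```python
# Hypothetical pre-flight check for elips.open() arguments.
# Only the rules documented above are encoded; ELIPS itself may enforce more.
VALID_METRICS = {"cosine", "euclidean", "dot_product"}
VALID_INDEXES = {"graph", "exact"}
VALID_ACCESS_MODES = {"read_write", "read_only"}


def check_open_args(path, dimension=0, metric="cosine", index="graph",
                    access_mode="read_write"):
    """Return a kwargs dict suitable for elips.open(), or raise ValueError."""
    if metric not in VALID_METRICS:
        raise ValueError(f"unknown metric: {metric!r}")
    if index not in VALID_INDEXES:
        raise ValueError(f"unknown index: {index!r}")
    if access_mode not in VALID_ACCESS_MODES:
        raise ValueError(f"unknown access_mode: {access_mode!r}")
    # Every in-memory open needs an explicit dimension.
    if path == ":memory:" and dimension <= 0:
        raise ValueError("in-memory opens require dimension > 0")
    return {
        "path": path,
        "dimension": dimension,
        "metric": metric,
        "index": index,
        "access_mode": access_mode,
    }
```

With the bindings on PYTHONPATH, the result could be passed straight through, for example `elips.open(**check_open_args(":memory:", dimension=128))`.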
Database
- vault(name): return a Vault, creating it lazily.
- list_vaults(): current vault names.
- begin_transaction(): atomic batched writes.
- query(eql, bindings={}): execute one EQL statement.
- checkpoint(): flush manifest/segments or snapshot and truncate the WAL.
- compact(): rebuild every vault index and checkpoint.
- close(): graceful shutdown; idempotent.
- abandon(): testing hook that suppresses checkpoint so the next open must recover from the WAL.
- config: effective Config including persisted dimension/metric/index and resolved embedder metadata.
- gpu_info() / gpu_stats(): available only in GPU builds.
Vault
# Vector-first; chunk and lineage are optional and must be built beforehand
# (see Ingestion patterns for constructing a ChunkInfo)
rid = docs.place(
    [1.0, 0.0],
    {"kind": "design"},
    document=elips.DocumentAttachment(text="alpha design note"),
    chunk=chunk,
    lineage=lineage,
)
# Text-first: requires a configured text embedder
rid = docs.place_document("alpha design note", {"kind": "design"})
# Mixed batch
docs.place_many([
    {"vector": [1.0, 0.0], "data": {"kind": "vector-only"}},
    {"text": "alpha design note", "data": {"kind": "text-first"}},
])
The query surfaces seek, seek_text, and seek_hybrid all accept top, where, and threshold. Hybrid takes an extra lexical_weight (default 0.25). Every hit returns id, distance, data, document, chunk, and lineage from the authoritative record store.
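To build intuition for lexical_weight, here is a minimal sketch of one way a hybrid score could blend a vector similarity with lexical overlap. The linear blend and the token-overlap measure are assumptions made for illustration; the formula ELIPS actually uses is not described here, and both function names are hypothetical.

```python
def lexical_overlap(query, text):
    """Fraction of distinct query tokens that also occur in the document text."""
    query_tokens = set(query.lower().split())
    if not query_tokens:
        return 0.0
    doc_tokens = set(text.lower().split())
    return len(query_tokens & doc_tokens) / len(query_tokens)


def hybrid_score(vector_similarity, query, text, lexical_weight=0.25):
    """Assumed linear blend (higher is better); not ELIPS's actual formula."""
    if not 0.0 <= lexical_weight <= 1.0:
        raise ValueError("lexical_weight must lie in [0, 1]")
    lexical = lexical_overlap(query, text)
    return (1.0 - lexical_weight) * vector_similarity + lexical_weight * lexical
```

Under this assumed blend, a weight of 0 reduces to pure vector ranking, and raising the weight favors documents whose attached text shares tokens with the query.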
Modern wrapper
engine = elips.connect(
    "/tmp/elips-modern",
    dimension=128,
    metric="cosine",
)
arena = engine.arena("documents")
keys = arena.write_many([
    elips.RecordInput(text="alpha design note", meta={"kind": "design"}),
    {"text": "beta runbook", "meta": {"kind": "ops"}},
])
rows = arena.pull(keys, include_vectors=True)
hits = arena.probe_text("alpha", top=2)
hybrid = arena.probe_hybrid([0.0, 1.0], "alpha", top=2)
Arena prefers the native core text APIs when the database config has a resolved text embedder. If the wrapper is given a Python callable instead, it falls back to Python-side embedding plus seek_hybrid. If neither exists, text-first calls raise ConfigError / ValueError; ELIPS never silently degrades to lexical-only behaviour.
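The resolution order described above can be summarized as a three-way decision. This is a sketch of the described behaviour only, not the wrapper's code: `pick_text_path` is a hypothetical name, and `ConfigError` is a local stand-in so the example runs without ELIPS installed.

```python
class ConfigError(Exception):
    """Local stand-in for the ELIPS ConfigError, used for illustration only."""


def pick_text_path(has_native_embedder, python_embedder=None):
    """Mirror the documented order: native core, then Python callable, else fail."""
    if has_native_embedder:
        # The database config has a resolved text embedder: use the core text APIs.
        return "native"
    if callable(python_embedder):
        # Embed on the Python side, then query through seek_hybrid.
        return "python_fallback"
    # No embedder at all: fail loudly rather than degrade to lexical-only search.
    raise ConfigError("no text embedder configured for text-first calls")
```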
Config
config = (
    elips.Config()
    .dimension(2)
    .metric("cosine")
    .segmented_storage(True)
    .metadata_acceleration(True)
    .auto_text_embedder(True)
)
db = elips.open_with_config("/tmp/elips-quickstart", config)
- segmented_storage(True): the default. Writes elips.manifest plus per-vault segment files.
- metadata_acceleration(True): equality and set-membership filters narrow candidates through MetadataIndex.
- auto_text_embedder(True): provisions the built-in local embedder for new databases.
- local_text_embedder(...): pin a rehydratable local embedder that ELIPS restores automatically on reopen.
- text_embedder(callable, ...): attach a Python callable embedder. ELIPS persists metadata only; reopening without the same callable makes text-first APIs raise ConfigError.
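To illustrate how equality and set-membership filters can narrow candidates before any distance computation, here is a toy inverted index over metadata. It is a conceptual sketch only: `ToyMetadataIndex` is a hypothetical class, and the internals of the real MetadataIndex are not documented here.

```python
from collections import defaultdict


class ToyMetadataIndex:
    """Maps (field, value) pairs to the record ids that carry them."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, record_id, data):
        # Index every metadata field; values must be hashable in this sketch.
        for field, value in data.items():
            self._postings[(field, value)].add(record_id)

    def equals(self, field, value):
        """Candidate ids for an equality filter such as kind = 'design'."""
        return set(self._postings.get((field, value), ()))

    def one_of(self, field, values):
        """Candidate ids for a set-membership filter: union of equality lookups."""
        candidates = set()
        for value in values:
            candidates |= self.equals(field, value)
        return candidates
```

Only the ids returned by such a lookup would then need to be scored against the query vector, which is the sense in which a filter narrows the candidate set.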
Ingestion patterns
# 1. Vector with attached document
attachment = elips.DocumentAttachment(text="gamma appendix", mime_type="text/plain")
docs.place([1.0, 0.0], {"kind": "appendix"}, document=attachment)
# 2. Text-first
docs.place_document("alpha design note", {"kind": "design"})
# 3. Explicit chunk coordinates
chunk = elips.ChunkInfo()
chunk.document_key = "doc-alpha"
chunk.ordinal = 0
chunk.char_start = 0
chunk.char_end = 17
docs.place_document("alpha design note", {"kind": "design"}, chunk=chunk)
Query & planner
# Vector
hits = docs.seek([1.0, 0.0], top=2)
# Text-first
hits = docs.seek_text("alpha", top=2)
# Hybrid (vector + lexical overlap from attached documents)
hits = docs.seek_hybrid([0.0, 1.0], "alpha", top=2, lexical_weight=0.35)
# Inspect the plan
where = elips.Filter().field("kind").equals("design")
plan = docs.explain_seek([1.0, 0.0], top=1, where=where, has_text_component=True)
print(plan.strategy.name, plan.metadata_accelerated, plan.candidate_count)
The planner always emits one of ann_index, exact_candidates, full_scan, text_probe, or hybrid_fusion, exposed identically here and in EQL.
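Because the strategy set is closed, a test suite can assert on it, for example to catch a query that unexpectedly falls back to full_scan. The sketch below assumes that plan.strategy.name yields one of the five names listed above (it lowercases the value in case the bindings expose uppercase enum names). `check_plan` is a hypothetical helper, and the plan object is a stand-in built with SimpleNamespace rather than a real explain_seek result.

```python
from types import SimpleNamespace

KNOWN_STRATEGIES = {
    "ann_index", "exact_candidates", "full_scan", "text_probe", "hybrid_fusion",
}


def check_plan(plan, forbid=("full_scan",)):
    """Return the strategy name, raising if it is unknown or forbidden."""
    name = plan.strategy.name.lower()
    if name not in KNOWN_STRATEGIES:
        raise ValueError(f"unexpected planner strategy: {name}")
    if name in forbid:
        raise RuntimeError(f"query planned as {name}")
    return name


# Stand-in for the object returned by docs.explain_seek(...).
fake_plan = SimpleNamespace(
    strategy=SimpleNamespace(name="ann_index"),
    metadata_accelerated=True,
    candidate_count=12,
)
```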
EQL from Python
rows = db.query(
    "seek in documents nearest $q top 5 where kind = \"design\" yield",
    bindings={"q": [1.0, 0.0]},
)
Text-first retrieval is not exposed through EQL; use Vault.seek_text / seek_hybrid. See EQL reference.
Transactions
with db.begin_transaction() as txn:
    v = txn.vault("documents")
    v.place([1.0, 0.0], {"tag": "a"})
    v.place([0.0, 1.0], {"tag": "b"})
# clean exit → commit; raised exception → auto-rollback
Transactions buffer place and erase calls and validate eagerly (dimension & finiteness). Commit applies operations in order, each one WAL-appended before the in-memory mutation. See Transaction engine.
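The buffering, eager validation, and log-before-apply ordering described above can be modelled in a few lines. This is a toy model for illustration, not the ELIPS transaction engine: `ToyTransaction` is hypothetical, a Python list stands in for the WAL, and a dict stands in for the in-memory state.

```python
import math


class ToyTransaction:
    """Buffers operations, validates eagerly, and logs before applying."""

    def __init__(self, dimension, wal, store):
        self.dimension = dimension
        self.wal = wal      # list used as an append-only log
        self.store = store  # dict used as the in-memory state
        self._ops = []

    def place(self, record_id, vector):
        # Eager validation: reject bad input at call time, not at commit.
        if len(vector) != self.dimension:
            raise ValueError("dimension mismatch")
        if not all(math.isfinite(x) for x in vector):
            raise ValueError("non-finite vector component")
        self._ops.append(("place", record_id, list(vector)))

    def erase(self, record_id):
        self._ops.append(("erase", record_id, None))

    def commit(self):
        # Apply in order; each operation is logged before the state changes.
        for op in self._ops:
            self.wal.append(op)
            kind, record_id, vector = op
            if kind == "place":
                self.store[record_id] = vector
            else:
                self.store.pop(record_id, None)
        self._ops.clear()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            self.commit()
        else:
            self._ops.clear()  # rollback: discard the buffer, touch nothing
        return False  # never swallow the exception
```

Because nothing reaches the log or the state until commit, a validation error raised inside the with block leaves both untouched.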
Embedders
Three options, in order of preference:
- Default local embedder: automatically attached on new databases. Persisted under text_embedder/ and restored on reopen.
- Explicit LocalEmbedderConfig: pin model/revision/path. Rehydratable.
- Python callable: full flexibility, but only metadata persists. Reopening without the same callable makes text-first APIs fail with ConfigError.
Persistence & lifecycle
- checkpoint(): write manifest+segments (or snapshot) and truncate the WAL.
- compact(): rebuild indexes, then checkpoint.
- close(): checkpoint and release locks.
- abandon(): testing hook that leaves recovery work in the WAL.
reader = elips.open("/tmp/elips-sdk", access_mode="read_only")
print(reader.vault("documents").seek_text("alpha", top=1)[0].data)
Read-only opens take a shared advisory lock and reject place, place_document, erase, rebuild_index, and compaction-driven mutation with StorageError.
Errors
- ConfigError: invalid config, dimension mismatch, missing text embedder on text-first calls, or read-only open against a missing database.
- DimensionMismatch / InvalidVector: eager-validated on every place.
- LockConflict: another writer already holds the database.
- StorageError: IO failure or mutation attempted in read-only mode.
- NotFound: missing record.
- ParseError: malformed EQL.
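Of these, LockConflict is the one most likely to be transient, since the competing writer may close shortly. One way to handle it is a bounded retry around the open call, sketched below. `open_with_retry` is a hypothetical helper, and `LockConflict` is a local stand-in so the example runs without ELIPS; in real code you would pass the exception type exported by the bindings as conflict_type.

```python
import time


class LockConflict(Exception):
    """Local stand-in for the ELIPS LockConflict, used for illustration only."""


def open_with_retry(open_fn, attempts=3, delay=0.1,
                    conflict_type=LockConflict, sleep=time.sleep):
    """Call open_fn(), retrying with linear backoff while the lock is held."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return open_fn()
        except conflict_type:
            if attempt == attempts:
                raise  # out of retries: surface the conflict to the caller
            sleep(delay * attempt)
```

A call might look like `open_with_retry(lambda: elips.open("/tmp/elips-sdk"), conflict_type=elips.LockConflict)`, assuming the exception is exported under that name. Other errors, such as ConfigError, are deliberately not retried because they do not resolve on their own.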
Typing
The package ships py.typed and a complete _core.pyi stub, including the modern wrapper classes, so IDEs and type checkers see the full public API.
Pitfalls
- Calling seek_text / place_document without a configured embedder raises ConfigError. This is by design; there is no silent fallback.
- Reopening a database that was created with a Python-callable embedder requires the same callable; without it, text-first APIs fail.
- ":memory:" opens require dimension > 0 every time.
- Only one read-write opener at a time per database directory. Use access_mode="read_only" for shared-reader serving.