Python SDK

ELIPS ships two Python surfaces over the same C++ core. The low-level bindings (open / Database / Vault / Config) mirror the runtime one-to-one. The modern wrapper (connect / Engine / Arena) adds typed, text-first ergonomics on top. Both talk to the same vaults, the same WAL, and the same planner.

Install & build

The bindings are built from the repository — there is no PyPI wheel yet. Build the extension and put the package on PYTHONPATH:

bash
cmake -S . -B build -G Ninja -DELIPS_BUILD_PYTHON=ON
cmake --build build --target elips_pymodule
export PYTHONPATH=$PWD/bindings/python
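
After the build, it can help to confirm that the interpreter actually sees the package before running any of the examples. The following is a minimal sketch using only the standard library; it assumes the importable module is named elips, as in the examples on this page, and the helper name module_available is purely illustrative.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if `name` can be located on the current import path."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for malformed names or broken module specs.
        return False


if module_available("elips"):
    print("elips is importable")
else:
    print("elips not found; check that PYTHONPATH points at bindings/python")
```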

Two surfaces, one core

  • Low-level: open(), open_with_config(), Database, Vault, Config. Pick this when you need exact parity with the C++ runtime — full control over Durability, AccessMode, GPU settings, and the embedder.
  • Modern: connect(), Engine, Arena, RecordInput, Row, Hit. Pick this for typed, text-first ingestion and retrieval.

Low-level API

python
import elips

db = elips.open("/tmp/elips-sdk", dimension=128, metric="cosine")
docs = db.vault("documents")
docs.place_document("alpha design note", {"kind": "design"})
docs.place_document("beta runbook", {"kind": "ops"})

hits = docs.seek_text("alpha", top=2)
print(hits[0].document.text, db.config.text_embedder_info.model)

plan = docs.explain_seek(
    [1.0] + [0.0] * 127,  # query vector length must match dimension=128
    top=1,
    where=elips.Filter().field("kind").equals("design"),
    has_text_component=True,
)
print(plan.strategy, plan.metadata_accelerated)

elips.open()

python
def open(
    path: str,
    dimension: int = 0,
    metric: str = "cosine",
    index: str = "graph",
    access_mode: str = "read_write",
    embedder = None,
    use_default_text_embedder: bool = True,
) -> Database

  • path — filesystem directory or ":memory:".
  • dimension — required for new databases and every in-memory open. Existing databases reuse the persisted identity.
  • metric — "cosine", "euclidean", or "dot_product".
  • index — "graph" (HNSW) or "exact".
  • access_mode — "read_write" or "read_only"; read-only requires an existing database.
  • embedder — optional Python callable or LocalEmbedderConfig.
  • use_default_text_embedder — attach the built-in local embedder automatically when no explicit embedder is supplied.
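
The parameter rules above can be restated as a small pre-flight check. This is only a model of the documented behaviour, not ELIPS code: the function name check_open_args and the database_exists flag are hypothetical, and ValueError stands in for whatever error the real open() raises (the Errors section suggests ConfigError).

```python
def check_open_args(
    path: str,
    dimension: int = 0,
    access_mode: str = "read_write",
    database_exists: bool = False,
) -> None:
    """Raise ValueError if the arguments break the rules documented for open()."""
    in_memory = path == ":memory:"
    # An in-memory database never counts as an existing one.
    existing = database_exists and not in_memory

    if access_mode not in ("read_write", "read_only"):
        raise ValueError(f"unknown access_mode: {access_mode!r}")
    if access_mode == "read_only" and not existing:
        raise ValueError("read_only requires an existing database")
    if not existing and dimension <= 0:
        raise ValueError("dimension is required for new and in-memory databases")


# A new on-disk database needs an explicit dimension.
check_open_args("/tmp/elips-sdk", dimension=128)
# An existing database reuses its persisted dimension.
check_open_args("/tmp/elips-sdk", database_exists=True)
```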

Database

  • vault(name) — return (lazily creating) a Vault.
  • list_vaults() — current vault names.
  • begin_transaction() — atomic batched writes.
  • query(eql, bindings={}) — execute one EQL statement.
  • checkpoint() — write the manifest and segments (or a snapshot), then truncate the WAL.
  • compact() — rebuild every vault index and checkpoint.
  • close() — graceful shutdown; idempotent.
  • abandon() — testing hook that suppresses checkpoint so the next open must recover from the WAL.
  • config — effective Config including persisted dimension/metric/index and resolved embedder metadata.
  • gpu_info() / gpu_stats() — available only in GPU builds.

Vault

python
# Vector-first; chunk and lineage are optional and assumed to be built beforehand
rid = docs.place(
    [1.0, 0.0],
    {"kind": "design"},
    document=elips.DocumentAttachment(text="alpha design note"),
    chunk=chunk,
    lineage=lineage,
)

# Text-first — requires a configured text embedder
rid = docs.place_document("alpha design note", {"kind": "design"})

# Mixed batch
docs.place_many([
    {"vector": [1.0, 0.0], "data": {"kind": "vector-only"}},
    {"text": "alpha design note", "data": {"kind": "text-first"}},
])

Query surfaces — seek, seek_text, seek_hybrid — all accept top, where, and threshold. Hybrid takes an extra lexical_weight (default 0.25). Every hit returns id, distance, data, document, chunk, and lineage from the authoritative record store.
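
To build intuition for what lexical_weight controls, the sketch below blends a vector similarity with a token-overlap score. The linear blend and the overlap measure are assumptions chosen for illustration; this page does not specify the formula ELIPS uses, and the function names are hypothetical.

```python
def lexical_overlap(query: str, text: str) -> float:
    """Fraction of distinct query tokens that also occur in the text."""
    query_tokens = set(query.lower().split())
    if not query_tokens:
        return 0.0
    text_tokens = set(text.lower().split())
    return len(query_tokens & text_tokens) / len(query_tokens)


def hybrid_score(
    vector_similarity: float,
    query: str,
    text: str,
    lexical_weight: float = 0.25,
) -> float:
    """Assumed linear blend: (1 - w) * vector part + w * lexical part."""
    if not 0.0 <= lexical_weight <= 1.0:
        raise ValueError("lexical_weight must lie in [0, 1]")
    lexical = lexical_overlap(query, text)
    return (1.0 - lexical_weight) * vector_similarity + lexical_weight * lexical


# A record whose document contains the query term gets a lexical boost.
with_term = hybrid_score(0.8, "alpha", "alpha design note")
without_term = hybrid_score(0.8, "alpha", "beta runbook")
print(with_term > without_term)
```

Under this model, a weight of 0 reduces to pure vector ranking and a weight of 1 to pure lexical ranking.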

Modern wrapper

python
engine = elips.connect(
    "/tmp/elips-modern",
    dimension=128,
    metric="cosine",
)
arena = engine.arena("documents")

keys = arena.write_many([
    elips.RecordInput(text="alpha design note", meta={"kind": "design"}),
    {"text": "beta runbook", "meta": {"kind": "ops"}},
])

rows = arena.pull(keys, include_vectors=True)
hits = arena.probe_text("alpha", top=2)
hybrid = arena.probe_hybrid([0.0, 1.0] + [0.0] * 126, "alpha", top=2)  # vector length matches dimension=128

Arena prefers the native core text APIs when the database config has a resolved text embedder. If the wrapper is given a Python callable instead, it falls back to Python-side embedding plus seek_hybrid. If neither exists, text-first calls raise ConfigError / ValueError — ELIPS never silently degrades to lexical-only behaviour.
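
The three-way decision described above can be sketched as a small dispatch function. This is a stand-in model rather than the wrapper's source: the ConfigError class below only imitates the ELIPS error of the same name, and resolve_text_path and its return labels are hypothetical.

```python
from typing import Callable, Optional, Sequence


class ConfigError(Exception):
    """Stand-in for the ELIPS error raised when no embedder is available."""


def resolve_text_path(
    has_native_embedder: bool,
    python_embedder: Optional[Callable[[str], Sequence[float]]] = None,
) -> str:
    """Pick how a text-first call would be served, in the documented order."""
    if has_native_embedder:
        # Preferred: the core embeds and searches natively.
        return "native"
    if python_embedder is not None:
        # Fallback: embed in Python, then run a hybrid seek with the vector.
        return "python_fallback"
    # No lexical-only degradation: fail loudly instead.
    raise ConfigError("no text embedder configured for text-first calls")


print(resolve_text_path(True))
print(resolve_text_path(False, python_embedder=lambda text: [0.0, 1.0]))
```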

Config

python
config = (
    elips.Config()
    .dimension(2)
    .metric("cosine")
    .segmented_storage(True)
    .metadata_acceleration(True)
    .auto_text_embedder(True)
)
db = elips.open_with_config("/tmp/elips-quickstart", config)

  • segmented_storage(True) — default. Writes elips.manifest + per-vault segment files.
  • metadata_acceleration(True) — equality and set-membership filters narrow candidates through MetadataIndex.
  • auto_text_embedder(True) — provisions the built-in local embedder for new databases.
  • local_text_embedder(...) — pin a rehydratable local embedder that ELIPS restores automatically on reopen.
  • text_embedder(callable, ...) — attach a Python callable embedder. ELIPS persists metadata only; reopening without the same callable makes text-first APIs raise ConfigError.

Ingestion patterns

python
# 1. Vector with attached document
attachment = elips.DocumentAttachment(text="gamma appendix", mime_type="text/plain")
docs.place([1.0, 0.0], {"kind": "appendix"}, document=attachment)

# 2. Text-first
docs.place_document("alpha design note", {"kind": "design"})

# 3. Explicit chunk coordinates
chunk = elips.ChunkInfo()
chunk.document_key = "doc-alpha"
chunk.ordinal = 0
chunk.char_start = 0
chunk.char_end = 17
docs.place_document("alpha design note", {"kind": "design"}, chunk=chunk)
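
When a document is split into several chunks, the coordinates for each one can be computed up front. The sketch below uses a plain dataclass as a stand-in for ChunkInfo with the same four fields; ChunkSpan and fixed_size_chunks are hypothetical names. It assumes char_end is exclusive, which is consistent with the 0 to 17 range used above for a 17-character string.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChunkSpan:
    """Stand-in mirroring the ChunkInfo fields used in the example above."""
    document_key: str
    ordinal: int
    char_start: int
    char_end: int  # assumed exclusive


def fixed_size_chunks(document_key: str, text: str, size: int) -> List[ChunkSpan]:
    """Split text into consecutive spans of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [
        ChunkSpan(document_key, ordinal, start, min(start + size, len(text)))
        for ordinal, start in enumerate(range(0, len(text), size))
    ]


text = "alpha design note"
for span in fixed_size_chunks("doc-alpha", text, size=10):
    # Each span could be copied onto an elips.ChunkInfo before place_document.
    print(span.ordinal, repr(text[span.char_start:span.char_end]))
```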

Query & planner

python
# Vector
hits = docs.seek([1.0, 0.0], top=2)

# Text-first
hits = docs.seek_text("alpha", top=2)

# Hybrid (vector + lexical overlap from attached documents)
hits = docs.seek_hybrid([0.0, 1.0], "alpha", top=2, lexical_weight=0.35)

# Inspect the plan
where = elips.Filter().field("kind").equals("design")
plan = docs.explain_seek([1.0, 0.0], top=1, where=where, has_text_component=True)
print(plan.strategy.name, plan.metadata_accelerated, plan.candidate_count)

The planner always emits one of ann_index, exact_candidates, full_scan, text_probe, hybrid_fusion — exposed identically here and in EQL.

EQL from Python

python
rows = db.query(
    "seek in documents nearest $q top 5 where kind = \"design\" yield",
    bindings={"q": [1.0, 0.0]},
)

Text-first retrieval is not exposed through EQL — use Vault.seek_text / seek_hybrid. See EQL reference.

Transactions

python
with db.begin_transaction() as txn:
    v = txn.vault("documents")
    v.place([1.0, 0.0], {"tag": "a"})
    v.place([0.0, 1.0], {"tag": "b"})
    # clean exit → commit; raised exception → auto-rollback

Transactions buffer place and erase calls and validate eagerly (dimension & finiteness). Commit applies operations in order, each one WAL-appended before the in-memory mutation. See Transaction engine.
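
The buffering, eager validation, and WAL-before-mutation ordering described above can be modelled in a few lines. This is a toy illustration, not the ELIPS transaction engine: BufferedTxn, the list-based WAL, and the dict-based store are all stand-ins, and ValueError replaces the real DimensionMismatch / InvalidVector errors.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Op = Tuple[str, str, Optional[List[float]]]


class BufferedTxn:
    """Toy model: buffer operations, validate eagerly, apply in order on commit."""

    def __init__(self, dimension: int, wal: List[Tuple[str, str]],
                 store: Dict[str, List[float]]) -> None:
        self.dimension = dimension
        self.wal = wal
        self.store = store
        self._ops: List[Op] = []

    def place(self, key: str, vector: Sequence[float]) -> None:
        # Eager validation: bad input is rejected before it is buffered.
        if len(vector) != self.dimension:
            raise ValueError("dimension mismatch")
        if not all(math.isfinite(x) for x in vector):
            raise ValueError("vector contains a non-finite component")
        self._ops.append(("place", key, list(vector)))

    def erase(self, key: str) -> None:
        self._ops.append(("erase", key, None))

    def commit(self) -> None:
        for op, key, vector in self._ops:
            self.wal.append((op, key))  # log first ...
            if op == "place":
                self.store[key] = vector  # ... then mutate memory
            else:
                self.store.pop(key, None)
        self._ops.clear()

    def __enter__(self) -> "BufferedTxn":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc_type is None:
            self.commit()
        else:
            self._ops.clear()  # rollback: nothing reached the WAL or the store
        return False  # never swallow the exception


wal: List[Tuple[str, str]] = []
store: Dict[str, List[float]] = {}
with BufferedTxn(2, wal, store) as txn:
    txn.place("a", [1.0, 0.0])
    txn.place("b", [0.0, 1.0])
print(sorted(store), wal)
```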

Embedders

Three options, in order of preference:

  1. Default local embedder — automatically attached on new databases. Persisted under text_embedder/ and restored on reopen.
  2. Explicit LocalEmbedderConfig — pin model/revision/path. Rehydratable.
  3. Python callable — full flexibility, but only metadata persists. Reopening without the same callable makes text-first APIs fail with ConfigError.
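
For option 3, the callable has to be reproducible across processes, since reopening requires the same callable. The sketch below builds a deterministic toy embedder from token hashes. It assumes the callable takes a single string and returns a list of floats of the database dimension; this page does not spell out the exact contract, and make_hash_embedder is a hypothetical helper. A hashing embedder like this captures no semantics and is useful mainly for tests.

```python
import hashlib
import math
from typing import Callable, List


def make_hash_embedder(dimension: int) -> Callable[[str], List[float]]:
    """Return a deterministic embedder based on signed token hashing."""
    if dimension <= 0:
        raise ValueError("dimension must be positive")

    def embed(text: str) -> List[float]:
        vec = [0.0] * dimension
        for token in text.lower().split():
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            bucket = int.from_bytes(digest[:4], "big") % dimension
            sign = 1.0 if digest[4] % 2 == 0 else -1.0
            vec[bucket] += sign
        norm = math.sqrt(sum(x * x for x in vec))
        # L2-normalise so cosine and dot-product rankings agree.
        return [x / norm for x in vec] if norm > 0 else vec

    return embed


embed = make_hash_embedder(128)
vector = embed("alpha design note")
print(len(vector), embed("alpha design note") == vector)
```

Because the output depends only on the text and the dimension, the same function can be re-created on every process start and passed back in when reopening.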

Persistence & lifecycle

  • checkpoint() — write manifest+segments (or snapshot) and truncate the WAL.
  • compact() — rebuild indexes then checkpoint.
  • close() — checkpoint and release locks.
  • abandon() — testing hook that leaves recovery work in the WAL.

python
reader = elips.open("/tmp/elips-sdk", access_mode="read_only")
print(reader.vault("documents").seek_text("alpha", top=1)[0].data)

Read-only opens take a shared advisory lock and reject place, place_document, erase, rebuild_index, and compaction-driven mutation with StorageError.

Errors

  • ConfigError — invalid config, dimension mismatch, missing text embedder on text-first calls, or read-only open against a missing database.
  • DimensionMismatch / InvalidVector — raised by the eager validation that runs on every place call.
  • LockConflict — another writer already holds the database.
  • StorageError — IO failure or mutation attempted in read-only mode.
  • NotFound — missing record.
  • ParseError — malformed EQL.
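
One way to act on LockConflict is to retry the open in read-only mode, since read-only opens share the lock. The helper below is a sketch under assumptions: open_with_reader_fallback is a hypothetical name, and the open function and the exception class are passed in as arguments because this page does not state how the error classes are exported (for example as elips.LockConflict).

```python
from typing import Any, Callable, Type


def open_with_reader_fallback(
    open_fn: Callable[..., Any],
    path: str,
    lock_error: Type[BaseException],
    **kwargs: Any,
) -> Any:
    """Try a read-write open; if another writer holds the lock, open read-only."""
    try:
        return open_fn(path, access_mode="read_write", **kwargs)
    except lock_error:
        # A writer already owns the directory; fall back to shared reading.
        return open_fn(path, access_mode="read_only", **kwargs)


# Intended use, assuming the names are exported this way:
#   db = open_with_reader_fallback(elips.open, "/tmp/elips-sdk", elips.LockConflict)
```

The handle returned from the fallback path rejects mutations with StorageError, so callers that need to write should check which mode they ended up in.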

Typing

The package ships py.typed and a complete _core.pyi stub, including the modern wrapper classes, so IDEs and type checkers see the full public API.

Pitfalls

  • Calling seek_text / place_document without a configured embedder raises ConfigError — by design, not silent fallback.
  • Reopening a database that was created with a Python-callable embedder requires the same callable; without it text-first APIs fail.
  • ":memory:" opens require dimension > 0 every time.
  • Only one read-write opener at a time per database directory. Use access_mode="read_only" for shared-reader serving.