elips/docs
Overview

Project history

ELIPS — Embedded Local Index & Persistence System — was conceived in 2024 to answer one question: what would a vector database look like if it were built like SQLite? Embedded, in-process, zero infrastructure.

Origins: built from first principles

The core was implemented from first principles in C++23. No third-party vector-search libraries (FAISS, hnswlib, nmslib) are used for the core indexing. Every subsystem — the vector type, the distance kernels, the HNSW graph, the WAL, the query language, the Python bindings — was written for ELIPS. The only external dependencies are the C++23 standard library, GoogleTest (test-only), PyBind11 (Python bindings), and optional GPU backend libraries (cuVS, MPS, oneMKL).

Per ADR-0001, the decision to use C++23 was driven by the need for predictable, allocation-controlled performance for SIMD distance kernels and graph traversal, plus first-class embeddability in both C++ and Python processes with no language runtime beyond the standard library.

Motivation

In 2024 the vector-database landscape was dominated by client-server architectures — cloud APIs, server processes, REST/gRPC/GraphQL endpoints, deployment pipelines. Powerful for large teams; disproportionate for CLIs, desktop apps, research notebooks, CI pipelines, edge devices, and single-process services. The friction of "spin up a vector database" discourages adoption of semantic search in smaller projects.

ELIPS targets that gap: vector search as a library, not a service. If SQLite replaced client-server RDBMSes for embedded use cases, ELIPS aims to do the same for vector search.

v1.0 scope

Version 1.0 ships a vertically complete prototype: every layer from the domain types through the WAL, planner, EQL, CLI, and Python bindings. The deferred list under Roadmap is intentional — each future capability has a v1.0 seam (a port, a factory, a dispatch table) so it can land additively without breaking the existing surface.

The name

The four pillars are baked into the name: embedded (in-process), local (disk on the host), indexed (HNSW for fast search), and persistent (WAL plus snapshot or segmented manifest).