
Algorithms

Indexes plug in behind IndexPort. Two ship in v1.0 — graph (HNSW) and exact — and an optional GPU family lives under src/gpu_engine/ behind GpuPort.

Distance metrics

Three metrics: cosine, euclidean, and dot. Distance kernels are dispatched through a function pointer in Metrics so AVX2 / AVX-512 / NEON variants can slot in without touching call sites. By convention, smaller distance always means more similar.
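The dispatch idea can be illustrated with a minimal Python sketch. It is not the ELIPS implementation: the `KERNELS` table and `resolve_kernel` helper are hypothetical names, and negating the dot product so that "smaller means more similar" holds is an assumption about how the convention might be met.

```python
import math
from typing import Callable, Dict, Sequence

Vector = Sequence[float]
DistanceFn = Callable[[Vector, Vector], float]


def euclidean_distance(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0  # assumption: treat a zero vector as unrelated to everything
    return 1.0 - dot / norm


def dot_distance(a: Vector, b: Vector) -> float:
    # Assumption: negate the inner product so a larger dot product
    # yields a smaller distance, matching the stated convention.
    return -sum(x * y for x, y in zip(a, b))


# Hypothetical dispatch table: call sites look a kernel up once and never
# branch on the metric again, so a faster variant can replace an entry here.
KERNELS: Dict[str, DistanceFn] = {
    "cosine": cosine_distance,
    "euclidean": euclidean_distance,
    "dot": dot_distance,
}


def resolve_kernel(metric: str) -> DistanceFn:
    return KERNELS[metric]


distance = resolve_kernel("euclidean")
print(distance([0.0, 0.0], [3.0, 4.0]))
```

Swapping in an optimised kernel then means replacing one table entry; code that calls `distance(a, b)` stays unchanged.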

HNSW (graph)

HierarchicalGraphIndex implements Hierarchical Navigable Small World (HNSW), a multi-layer proximity graph that gives sub-linear approximate nearest-neighbour (ANN) search with strong recall. Three parameters matter:

  • M / max_connections — outgoing neighbours per node. Higher = better recall, more memory.
  • ef_construction — candidate pool during insertion. Higher = better graph, slower writes.
  • ef_search — candidate pool during search. Higher = better recall, slower reads.
[Figure: three-layer HNSW graph. Layer 2: sparse hubs (a, h). Layer 1: regional (a, c, e, h). Layer 0: full population. Greedy search → climb down → refine; descend until ef_search converges.]
HNSW — sparse hubs at the top, full population at layer 0. Search starts at a hub and descends greedily, refining with ef_search at each step.
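The descent can be sketched in a few lines of Python over a hand-built toy graph. This follows the published HNSW search procedure (one greedy candidate on the upper layers, an ef_search-sized candidate pool on layer 0); whether HierarchicalGraphIndex matches it in detail is not confirmed here. The function names, the `layers` structure, and the toy data are all hypothetical.

```python
import heapq
import math


def greedy_step(vectors, neighbours, query, entry):
    """Move to a closer neighbour until none improves (upper layers)."""
    current = entry
    best = math.dist(vectors[current], query)
    improved = True
    while improved:
        improved = False
        for n in neighbours.get(current, ()):
            d = math.dist(vectors[n], query)
            if d < best:
                current, best, improved = n, d, True
    return current


def beam_search(vectors, neighbours, query, entry, ef):
    """Best-first search that keeps at most `ef` results (layer 0)."""
    d0 = math.dist(vectors[entry], query)
    visited = {entry}
    candidates = [(d0, entry)]   # min-heap: closest unexpanded node first
    results = [(-d0, entry)]     # max-heap by negation: worst kept result on top
    while candidates:
        d, node = heapq.heappop(candidates)
        if len(results) >= ef and d > -results[0][0]:
            break                # nothing left can improve the result set
        for n in neighbours.get(node, ()):
            if n in visited:
                continue
            visited.add(n)
            dn = math.dist(vectors[n], query)
            if len(results) < ef or dn < -results[0][0]:
                heapq.heappush(candidates, (dn, n))
                heapq.heappush(results, (-dn, n))
                if len(results) > ef:
                    heapq.heappop(results)
    return sorted((-neg, n) for neg, n in results)


def search(vectors, layers, query, entry, ef_search, top):
    """layers[0] is the full population; higher indexes are sparser."""
    current = entry
    for layer in reversed(layers[1:]):
        current = greedy_step(vectors, layer, query, current)
    found = beam_search(vectors, layers[0], query, current, max(ef_search, top))
    return [node for _, node in found[:top]]


# Toy data: eight points on a line; node names are arbitrary.
vectors = {name: (float(i),) for i, name in enumerate("abcdefgh")}
chain = "abcdefgh"
layer0 = {n: [m for m in (chain[i - 1] if i else None,
                          chain[i + 1] if i < 7 else None) if m]
          for i, n in enumerate(chain)}
layer1 = {"a": ["c"], "c": ["a", "e"], "e": ["c", "h"], "h": ["e"]}
layer2 = {"a": ["h"], "h": ["a"]}

hits = search(vectors, [layer0, layer1, layer2], (4.2,), "a", ef_search=3, top=2)
print(hits)
```

Raising `ef_search` widens the layer-0 pool, which is where the recall-versus-latency trade-off described above comes from.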

Exact

ExactIndex performs a brute-force scan. Use it for small vaults, ground-truth measurement when tuning recall, and unit tests. It implements the same IndexPort contract, so switching is a configuration change.
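As an illustration of the ground-truth use case, here is a minimal sketch of a brute-force scan and a recall measurement against it. It does not use the ELIPS API: `exact_seek` and `recall_at_k` are hypothetical helpers, and Euclidean distance is assumed for simplicity.

```python
import heapq
import math


def exact_seek(vectors, query, top):
    """Score every stored vector and return the keys of the `top` closest."""
    scored = ((math.dist(vec, query), key) for key, vec in vectors.items())
    return [key for _, key in heapq.nsmallest(top, scored)]


def recall_at_k(approx_keys, exact_keys):
    """Fraction of the exact top-k that the approximate result also returned."""
    if not exact_keys:
        return 1.0
    return len(set(approx_keys) & set(exact_keys)) / len(exact_keys)


vectors = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (0.0, 2.0), "d": (5.0, 5.0)}
truth = exact_seek(vectors, (0.9, 0.1), top=2)

# Stand-in for the keys an approximate index might return for the same query.
approx = ["b", "c"]
print(truth, recall_at_k(approx, truth))
```

Running the same queries through the graph index and comparing against the exact result is one way to judge whether a given ef_search is high enough.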

Metadata acceleration

MetadataIndex is an exact-match accelerator for equality and set-membership predicates. The planner consults it first; if it produces a sufficiently small candidate set, the executor switches strategy to exact_candidates over the narrow set instead of running ANN over the full population.

python
plan = docs.explain_seek(
    [1.0, 0.0],
    top=5,
    where=elips.Filter().field("kind").equals("design"),
)
print(plan.strategy, plan.metadata_accelerated, plan.candidate_count)
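The planner decision described above can be sketched as an inverted map plus a size check. This is an illustrative model only: `ToyMetadataIndex`, `choose_strategy`, the 10% threshold, and the `"ann"` label are assumptions; only the `exact_candidates` strategy name comes from this page.

```python
from collections import defaultdict


class ToyMetadataIndex:
    """Hypothetical stand-in: maps (field, value) to the set of record keys."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, key, metadata):
        for field, value in metadata.items():
            self._postings[(field, value)].add(key)

    def equals(self, field, value):
        # Copy so callers cannot mutate the stored posting set.
        return set(self._postings.get((field, value), ()))

    def is_in(self, field, values):
        matched = set()
        for value in values:
            matched |= self.equals(field, value)
        return matched


def choose_strategy(candidate_count, population, max_fraction=0.1):
    """Pick exact scoring when the filter narrows the population enough.

    The threshold is an assumption; the real planner's cut-off is not
    documented on this page.
    """
    if population and candidate_count <= max_fraction * population:
        return "exact_candidates"
    return "ann"  # placeholder label for the full-population ANN path


index = ToyMetadataIndex()
for i in range(20):
    index.add(f"rec{i}", {"kind": "design" if i == 0 else "note"})

narrow = index.equals("kind", "design")
broad = index.is_in("kind", ["design", "note"])
print(choose_strategy(len(narrow), 20), choose_strategy(len(broad), 20))
```

A selective predicate leaves so few candidates that scoring them all exactly is cheaper than a graph traversal followed by filtering.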

Hybrid fusion

seek_hybrid() combines vector distance with lexical overlap on attached DocumentAttachment text. The planner emits the hybrid_fusion strategy; fusion is score-level, not retrieval-level, so a record only needs to surface in one stage to be considered.
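To make score-level fusion concrete, the sketch below merges two independently produced score maps. It is not the seek_hybrid() implementation: the `fuse` and `lexical_overlap` helpers, the `alpha` weight, and the rule for filling in a missing score are all assumptions chosen for illustration.

```python
def lexical_overlap(query_text, doc_text):
    """Share of query terms that also appear in the document text."""
    q = set(query_text.lower().split())
    d = set(doc_text.lower().split())
    return len(q & d) / len(q) if q else 0.0


def fuse(vector_hits, lexical_hits, alpha=0.5, top=5):
    """Combine per-record scores; smaller fused value means more similar.

    vector_hits:  key -> distance (smaller is better)
    lexical_hits: key -> overlap in [0, 1] (larger is better)
    A record missing from one stage gets that stage's worst score
    (assumption), so it can still rank on the strength of the other.
    """
    keys = set(vector_hits) | set(lexical_hits)
    worst_distance = max(vector_hits.values(), default=1.0)
    fused = {}
    for key in keys:
        distance = vector_hits.get(key, worst_distance)
        overlap = lexical_hits.get(key, 0.0)
        fused[key] = alpha * distance + (1.0 - alpha) * (1.0 - overlap)
    # Sort by fused score, breaking ties by key for a stable order.
    return sorted(fused, key=lambda k: (fused[k], k))[:top]


vector_hits = {"doc1": 0.1, "doc2": 0.4}
lexical_hits = {"doc2": 1.0, "doc3": 0.5}
ranking = fuse(vector_hits, lexical_hits)
print(ranking)
```

Here `doc3` never appeared in the vector stage and `doc1` never appeared in the lexical stage, yet both are ranked, which is the practical meaning of a record only needing to surface in one stage.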

GPU family

When built with -DELIPS_GPU_ENABLED=ON, ELIPS exposes several GPU index types behind GpuPort: GpuBruteForceIndex, GpuGraphIndex, GpuIVFFlatIndex, GpuIVFPQIndex, GpuHybridIndex, and a distributed variant. Selection runs through GpuSelector and GpuDeviceManager at startup; domain code never sees a backend type. For the full engine architecture, see GPU engine.