Lesson 30
GPUs hate small launches. The DynamicBatcher exists to make them stop being small.
Configurable knobs (include/elips/gpu_engine/GpuConfig.hpp)
- window_us — how long to wait for stragglers (microseconds)
- max_batch — hard cap on coalesced queries per launch
- algorithm — brute-force, IVF-Flat, IVF-PQ, graph (CAGRA), hybrid
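To make the interaction between window_us and max_batch concrete, here is a minimal sketch of one plausible coalescing policy. It is not the DynamicBatcher implementation: `BatchKnobs` and `coalesce` are hypothetical names, and the rule assumed here is that a batch closes either when the window (measured from the first query in the batch) expires or when the cap is reached, whichever comes first. Arrival times are passed in as plain numbers so the policy can be checked without threads or a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the two batching knobs; not the real GpuConfig.
struct BatchKnobs {
    std::uint64_t window_us;  // how long the first query waits for stragglers
    std::size_t max_batch;    // hard cap on coalesced queries per launch
};

// Given arrival times in microseconds (sorted ascending), return the number
// of queries that would be coalesced into each launch.
std::vector<std::size_t> coalesce(const std::vector<std::uint64_t>& arrivals_us,
                                  const BatchKnobs& knobs) {
    assert(knobs.max_batch > 0);
    std::vector<std::size_t> launches;
    std::size_t i = 0;
    while (i < arrivals_us.size()) {
        // The window opens when the first query of the batch arrives.
        const std::uint64_t deadline = arrivals_us[i] + knobs.window_us;
        std::size_t count = 0;
        // Admit queries until the window expires or the cap is hit.
        while (i < arrivals_us.size() && count < knobs.max_batch &&
               arrivals_us[i] <= deadline) {
            ++count;
            ++i;
        }
        launches.push_back(count);
    }
    return launches;
}
```

Under this assumed policy, a larger window_us trades per-query latency for fuller launches, while max_batch bounds how much work a single launch can accumulate during a burst.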
GpuIngestionPipeline streams record vectors onto the device. GpuQuantizationPipeline trains IVF-PQ centroids and pre-encodes residuals. GpuSearchPipeline orchestrates the batched distance + top-k path. GpuProfiler instruments every stage.
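For intuition about what a "batched distance + top-k" step computes, the following CPU reference sketch scores every query in a batch against every record and keeps the k closest. It is an illustration only: `batched_top_k`, `Vec`, and `Hit` are hypothetical names, squared L2 distance is assumed for simplicity (the configured metric may differ), and the real GpuSearchPipeline would run this work on the device rather than in nested loops.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<float>;
using Hit = std::pair<float, std::size_t>;  // (squared L2 distance, record index)

// Brute-force reference: for each query, return the k nearest records,
// ordered from closest to farthest.
std::vector<std::vector<Hit>> batched_top_k(const std::vector<Vec>& queries,
                                            const std::vector<Vec>& records,
                                            std::size_t k) {
    std::vector<std::vector<Hit>> results;
    results.reserve(queries.size());
    for (const Vec& q : queries) {
        std::vector<Hit> hits;
        hits.reserve(records.size());
        for (std::size_t r = 0; r < records.size(); ++r) {
            assert(records[r].size() == q.size());  // dimensions must match
            float dist = 0.0f;
            for (std::size_t d = 0; d < q.size(); ++d) {
                const float diff = q[d] - records[r][d];
                dist += diff * diff;
            }
            hits.emplace_back(dist, r);
        }
        // Only the first `keep` entries need to be sorted.
        const std::size_t keep = std::min(k, hits.size());
        std::partial_sort(hits.begin(), hits.begin() + keep, hits.end());
        hits.resize(keep);
        results.push_back(std::move(hits));
    }
    return results;
}
```

Every query in the batch is compared against the same set of records, so processing queries together lets that shared data be reused across the batch, which is the property that makes coalescing worthwhile.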
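```cpp
auto cfg = elips::Config{}
    .dimension(768)
    .metric(elips::Metric::cosine)
    .gpu(elips::gpu::GpuConfig{}
        .policy(elips::gpu::GpuPolicy::PreferGpu)
        .algorithm(elips::gpu::Algorithm::cagra)  // graph (CAGRA) search
        .window_us(500)                           // wait up to 500 microseconds for stragglers
        .max_batch(64));                          // at most 64 coalesced queries per launch
```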