Benchmarks
Real performance numbers from Engram running on a single machine. All measurements taken with the default Xenova/all-MiniLM-L6-v2 model (384-dim), SQLite WAL mode, Node.js 22.
Test environment: Ubuntu 22.04, Intel i7-12700, 32GB RAM, NVMe SSD. Numbers will vary based on hardware. The embedding model download (~25MB) is not included in timing.
Key Metrics
| Metric | Result | Notes |
|---|---|---|
| Embedding latency | 8ms/text | Single text, all-MiniLM-L6-v2 |
| Recall latency (100 memories) | 18ms | Full 7-step pipeline |
| Store throughput | 120 mem/s | Sequential stores with auto-linking |
| Cold startup (1k memories) | 1.2s | Full DB scan + index build |
| Cached startup (1k memories) | 45ms | Index loaded from disk |
| Index file size (1k memories) | 1.5MB | 384-dim float32 vectors |
Recall Latency vs Memory Count
Full recall pipeline: embed query + vector search + graph expand + score + rank + truncate.
| Memories | p50 | p95 | p99 |
|---|---|---|---|
| 10 | 6ms | 9ms | 12ms |
| 100 | 18ms | 24ms | 32ms |
| 1,000 | 45ms | 62ms | 85ms |
| 10,000 | 180ms | 240ms | 310ms |
| 50,000 | 850ms | 1.1s | 1.4s |
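The percentiles above can be reproduced with a small timing harness. This is a minimal sketch: `benchmark()` and `percentile()` are illustrative helpers, not part of Engram's API, and the function you pass in would be your own timed recall call.

```javascript
// Nearest-rank percentile over a list of latency samples (in ms).
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const idx = Math.min(sorted.length - 1, Math.ceil((p / 100) * sorted.length) - 1);
  return sorted[Math.max(0, idx)];
}

// Times an async operation `iterations` times and summarises p50/p95/p99.
async function benchmark(fn, iterations = 100) {
  const samples = [];
  for (let i = 0; i < iterations; i++) {
    const start = performance.now();
    await fn();
    samples.push(performance.now() - start);
  }
  return {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    p99: percentile(samples, 99),
  };
}
```

Warm up the embedding model with one throwaway call before measuring, or the first sample will include model initialization.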
✓
For datasets over 10k memories, consider using the re-embedding pipeline with a smaller topK value or switching to a dedicated vector database.
Startup Performance
Cold start scans all memories from SQLite and builds the vector index. Cached start loads the persisted index from disk and only syncs new memories.
| Memories | Cold Start | Cached Start | Speedup |
|---|---|---|---|
| 100 | 120ms | 12ms | 10x |
| 1,000 | 1.2s | 45ms | 27x |
| 10,000 | 11s | 320ms | 34x |
| 50,000 | 55s | 1.5s | 37x |
Embedding Throughput
Throughput of embedBatch() for bulk operations compared with sequential embed() calls.
| Method | Throughput | Latency/item | Notes |
|---|---|---|---|
| embed() sequential | ~120/s | 8ms | Single text at a time |
| embedBatch(32) | ~400/s | 2.5ms | Batch of 32, parallel ONNX inference |
| store() full pipeline | ~80/s | 12ms | Embed + DB insert + auto-link + contradiction check |
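To get batch throughput when embedding many texts, split the input into fixed-size batches first. A sketch of that chunking step, using batch size 32 to match the table; `chunk()` is a local helper, and the commented `engram.embedBatch()` usage assumes the method shape described above:

```javascript
// Split an array into fixed-size batches for parallel ONNX inference.
function chunk(items, size) {
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Illustrative usage (assumes an async `engram.embedBatch(texts)` method):
// const vectors = [];
// for (const batch of chunk(texts, 32)) {
//   vectors.push(...await engram.embedBatch(batch));
// }
```

Larger batches amortize per-call overhead, but returns diminish once the ONNX runtime's parallelism is saturated, so benchmark your own hardware before raising the batch size.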
Storage Footprint
| Component | Per Memory | 1k Memories | 10k Memories |
|---|---|---|---|
| SQLite row (no embedding) | ~0.5 KB | 500 KB | 5 MB |
| FP16 embedding (384-dim) | 768 B | 750 KB | 7.5 MB |
| Total DB file | ~1.3 KB | 1.3 MB | 13 MB |
| Persisted index file | ~1.5 KB | 1.5 MB | 15 MB |
FP16 compression reduces embedding storage by 2x compared to full Float32. A 10k memory brain fits comfortably in ~30MB total (DB + index).
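The table rows follow directly from the vector dimensionality. A back-of-envelope estimate, where the ~0.5 KB row size and float32 index entries are assumptions taken from this page rather than values read from Engram:

```javascript
// Rough per-memory storage estimate for a 384-dim embedding.
function estimateBytesPerMemory(dims = 384) {
  const row = 512;             // ~0.5 KB SQLite row, excluding the embedding
  const fp16 = dims * 2;       // FP16 in the DB: 2 bytes per dimension (768 B)
  const indexEntry = dims * 4; // persisted index keeps float32: 4 bytes/dim
  return { row, fp16, indexEntry, total: row + fp16 + indexEntry };
}
```

At 384 dimensions this works out to roughly 2.8 KB per memory across DB and index, which is where the ~30MB figure for a 10k-memory brain comes from.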