v0.1.3 — Open Source

Universal AI Brain
for every model.

Give any AI persistent memory that survives sessions, systems, and restarts. Connect once — every tool you use shares a single, growing brain.

$ curl -fsSL https://engram.am/install.sh | bash
120+
memories/sec
~18ms
recall latency
384
embedding dim
0
API keys needed

Up and running in minutes

Four steps from zero to a fully working AI memory backend with a live REST API, 3D dashboard, and Claude Code integration.

01

Clone & install

Requires Node.js 22+ and pnpm 9+.

$ git clone https://github.com/ayvazyan10/engram
$ cd engram
$ pnpm install
02

Build all packages

Compiles core, MCP server, adapters, and the dashboard in one command.

$ pnpm turbo run build
03

Start the API server

Engram listens on port 4901. Set ENGRAM_DB_PATH to choose where memories are stored.

$ ENGRAM_DB_PATH=./engram.db node apps/server/dist/index.js
# Swagger UI → http://localhost:4901/docs
04

Connect to Claude Code

Add Engram as an MCP server — you get store_memory, recall_context and other native tools:

// ~/.claude/settings.json
{
  "mcpServers": {
    "engram": {
      "command": "npx",
      "args": ["-y", "@engram-ai-memory/mcp@latest"]
    }
  }
}

Verify the setup

Make your first API call to confirm Engram is working:

$ curl -X POST http://localhost:4901/api/memory \
    -H "Content-Type: application/json" \
    -d '{"content":"User prefers TypeScript","type":"semantic"}'
Swagger UI at localhost:4901/docs
3D dashboard — http://localhost:4901 (served from API)
Seed 67 demo memories — cd packages/core && npx tsx scripts/demo.ts

Every query,
enriched with memory.

A seven-step pipeline runs transparently on every recall — embed, search, expand, score, rank & deduplicate, inject, store & learn.

01

Embed

Query is encoded into a 384-dim semantic vector locally via ONNX Runtime. No API calls, no cost, ~8ms.
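
For illustration, this is how a 384-dim vector from that model is typically produced in JavaScript via transformers.js, which runs on ONNX Runtime WASM; Engram's internal wiring may differ:

// sketch only: standard transformers.js feature extraction, not Engram's internal code
import { pipeline } from '@xenova/transformers';

const embed = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');
const out = await embed('User prefers TypeScript', { pooling: 'mean', normalize: true });

const vector = Array.from(out.data); // 384 unit-normalized floats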

02

Search

Cosine similarity over the HNSW vector index finds the most relevant past memories in ~6ms.
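
For intuition, the underlying metric is plain cosine similarity; a simple (non-HNSW) version looks like this:

// cosine similarity between two embedding vectors (illustrative only)
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}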

03

Expand

Knowledge graph traversal (depth 2) retrieves related concepts through typed edges — even if they weren't in the vector results.
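
A rough sketch of depth-2 expansion over typed edges, using an assumed in-memory edge shape rather than Engram's actual graph storage:

// illustrative only: the edge shape and storage are assumptions
type Edge = { from: string; to: string; type: 'is_a' | 'causes' | 'contradicts' };

function expand(seeds: string[], edges: Edge[], depth = 2): Set<string> {
  const found = new Set(seeds);
  let frontier = seeds;
  for (let d = 0; d < depth; d++) {
    const next: string[] = [];
    for (const e of edges) {
      // follow outgoing edges from the current frontier
      if (frontier.includes(e.from) && !found.has(e.to)) {
        found.add(e.to);
        next.push(e.to);
      }
    }
    frontier = next;
  }
  return found;
}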

04

Score

Each candidate is ranked: 45% similarity + 25% recency + 20% importance + 10% access frequency.
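
As a sketch, the composite score is a plain weighted sum; how each component is normalized to the 0..1 range is assumed here:

// weights from the formula above; normalization of each term is an assumption
function compositeScore(m: { similarity: number; recency: number; importance: number; frequency: number }): number {
  return 0.45 * m.similarity + 0.25 * m.recency + 0.20 * m.importance + 0.10 * m.frequency;
}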

05

Rank & Deduplicate

Results sorted by composite score. Duplicates removed. Top candidates selected within the token budget.
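
A condensed sketch of this step; the dedup key and token estimate below are simplified stand-ins, not Engram's exact logic:

// illustrative only
function rank(candidates: { content: string; score: number }[], tokenBudget: number) {
  const seen = new Set<string>();
  const picked: typeof candidates = [];
  let used = 0;
  for (const c of [...candidates].sort((a, b) => b.score - a.score)) {
    if (seen.has(c.content)) continue;            // drop exact duplicates
    const cost = Math.ceil(c.content.length / 4); // rough token estimate
    if (used + cost > tokenBudget) break;         // stay within the budget
    seen.add(c.content);
    picked.push(c);
    used += cost;
  }
  return picked;
}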

06

Inject

Assembled context — grouped by type (knowledge, patterns, events) — is injected into the AI prompt.

07

Store & Learn

The AI's response is stored as a new memory. Contradictions are auto-detected. Access counts update. The brain grows.


Mirrors the human brain.

Three long-term memory types work together — just like episodic, semantic, and procedural memory in neuroscience. Old episodes consolidate into facts. Unused memories decay. Important ones strengthen.

Episodic

Timestamped events and conversations. Auto-consolidated into semantic facts when clusters form. Decay naturally over time.

e.g. "User decided to use Fastify instead of Express — 2026-03-15"

Semantic

Facts and knowledge, organized as a graph. Each concept links to related ones via typed edges (is_a, causes, contradicts).

e.g. concept: TypeScript → "typed superset of JS used for all backend code"

Procedural

Trigger → action patterns. When a situation matches, the skill fires. High-confidence procedures are protected from decay.

e.g. "When DB migration needed → run drizzle-kit generate, never drizzle-kit push"
Working memory — assembled fresh for each query from all three types. Scored, ranked, truncated to fit the token budget. Never stored.
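
As a sketch, each long-term type maps to the same endpoint used in the quickstart. The "semantic" value appears there; "episodic" and "procedural" are assumed to mirror the names above:

// illustrative only: type values beyond "semantic" are assumptions
const memories = [
  { content: 'User decided to use Fastify instead of Express', type: 'episodic' },
  { content: 'TypeScript is a typed superset of JS used for all backend code', type: 'semantic' },
  { content: 'When a DB migration is needed, run drizzle-kit generate, never drizzle-kit push', type: 'procedural' },
];
for (const m of memories) {
  await fetch('http://localhost:4901/api/memory', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(m),
  });
}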

Works with any AI tool.

One memory store. Every model. Connect via MCP, proxy, REST, CLI, or WebSocket — no vendor lock-in.

Claude Code     ──MCP──────────→ ┐
Claude Desktop  ──extension────→ │
Ollama          ──proxy────────→ │   Engram    →  SQLite / PostgreSQL
OpenClaw        ──REST─────────→ │   :4901     →  42+ endpoints
CLI             ──direct───────→ │   /neural   →  8 WS events
Any app         ──REST/WS──────→ ┘
Claude Code
MCP — 18 native tools

store, recall, search, forget, decay, contradictions, tags, webhooks, plugins — all as native Claude Code tools. Zero API calls.

Ollama
Transparent HTTP proxy

Point your client at :11435 instead of :11434. Memory context is injected automatically — zero code changes needed.
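
For illustration, an ordinary Ollama request pointed at the proxy port; the model name is just an example:

// unchanged Ollama API call, only the port differs
const res = await fetch('http://localhost:11435/api/generate', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ model: 'llama3.2', prompt: 'What stack did we pick?', stream: false }),
});
const { response } = await res.json(); // reply generated with memory context injected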

OpenClaw
REST adapter

EngramClient class or withMemory() wrapper. Drop-in persistent memory for any OpenClaw agent.
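
A hypothetical usage sketch only: EngramClient and withMemory() are the names given above, but the constructor options and wrapper signature shown here are assumptions, not the adapter's documented API.

// hypothetical sketch: names from this page, signatures assumed
import { EngramClient, withMemory } from '@engram-ai-memory/adapter-openclaw';

const engram = new EngramClient({ baseUrl: 'http://localhost:4901' }); // assumed option name

declare const agent: unknown; // your existing OpenClaw agent instance
const rememberingAgent = withMemory(agent, engram); // assumed argument order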

CLI
Terminal CLI

engram store, search, recall, stats, export, import. Pipeable output for scripting. No server required.

Any App
REST API + WebSocket

42+ endpoints, 8 WebSocket events. POST /api/recall to retrieve, POST /api/memory to store. Works from any language.
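
A minimal recall call from TypeScript, for illustration; the request body field name ("query") is an assumption about this endpoint's shape:

// retrieve memories relevant to a query (body field names are assumptions)
const res = await fetch('http://localhost:4901/api/recall', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: 'Which web framework did we choose?' }),
});
const context = await res.json(); // scored, ranked memories ready to inject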

Claude Desktop
1-click Desktop Extension

Install from Claude Desktop or Smithery — one click, zero config. The .mcpb bundle auto-installs the MCP server on first launch.

Fast enough to disappear.

Memory retrieval is so fast you'll never notice it's happening.

120+
stores/sec
Full pipeline: embed + DB + auto-link + contradiction check
~18ms
recall p50
7-step pipeline at 100 memories
~6ms
search p50
Vector cosine similarity over HNSW index
2×
compressed
FP16 embeddings (Float32→Int16)
27×
faster startup
Cached index vs cold DB scan (1k memories)
Embeddings run locally — no API, no cost, no data leaving your machine.
Uses ONNX Runtime WASM with Xenova/all-MiniLM-L6-v2 (23 MB). Vectors stored as FP16, giving 2× compression with no accuracy loss.
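
As a rough sketch, a 16-bit encoding halves a 384-dim Float32 vector from 1536 to 768 bytes; one common scheme is scaled Int16 quantization, though Engram's exact encoding may differ:

// illustrative scaled 16-bit quantization, not necessarily Engram's format
function quantize(vec: Float32Array): { data: Int16Array; scale: number } {
  const max = Math.max(...Array.from(vec).map(Math.abs)) || 1;
  const scale = max / 32767;
  const data = new Int16Array(vec.length);
  for (let i = 0; i < vec.length; i++) data[i] = Math.round(vec[i] / scale);
  return { data, scale }; // 384 dims: 1536 bytes → 768 bytes, plus one scale factor
}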

Modular by design.

Use just the core, or compose the full stack. Every package is independently versioned.

@engram-ai-memory/core (npm ↗)

Memory engine, embeddings, knowledge graph, retrieval

@engram-ai-memory/mcp (npm ↗)

MCP server — native Claude Code integration

@engram-ai-memory/cli (npm ↗)

Terminal tool — store, search, recall, export, import

@engram-ai-memory/vis (npm ↗)

Force-directed layout + animation helpers

@engram-ai-memory/server

Fastify REST API + Socket.io WebSocket server

@engram-ai-memory/web

React 3D visualization dashboard — 5 render modes

@engram-ai-memory/adapter-ollama

Transparent Ollama proxy — zero client changes

@engram-ai-memory/adapter-openclaw

OpenClaw adapter — EngramClient + withMemory()

Keep Engram free & open source.

Engram is MIT-licensed and built in the open. If it's useful to you, consider supporting continued development — every contribution helps keep the lights on.

Bitcoin
BTC
bc1qaahm…rh85kk
Solana
SOL
HKpRw1h8…BX27Xg
TON
TON
UQAkUqVS…nr2CFd
USDT
BEP20
0x34B39c…D75236
USDT
TRC20
TLnC2NGV…Ghgc75

All donations go directly to the developer. No middleman, no platform fees. Thank you for supporting open-source AI tooling. ♥

Engram — Universal AI Brain