REST API
Engram exposes a JSON HTTP API on port 4901 (configurable). All endpoints accept and return application/json. No authentication is required for local access.
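Requests throughout are plain JSON over HTTP. A small caller-side helper can keep the calls terse — this is an illustrative sketch, not a client shipped with Engram:

```javascript
// Caller-side sketch (names are illustrative; Engram documents no client library).
const ENGRAM_BASE = "http://localhost:4901"; // default port, configurable

// Build the URL and fetch init for a JSON request, returned as a pair so the
// request can be inspected or logged before it is sent.
function engramRequest(method, path, body) {
  const init = { method, headers: { "Content-Type": "application/json" } };
  if (body !== undefined) init.body = JSON.stringify(body);
  return [`${ENGRAM_BASE}${path}`, init];
}

// Usage (requires a running server):
// const [url, init] = engramRequest("POST", "/api/memory", { content: "hello" });
// const { memory } = await (await fetch(url, init)).json();
```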
http://localhost:4901

Memories

/api/memory
Store a new memory. The content is embedded asynchronously; the response returns immediately with the created record.
Request body
{
"content": "string", // required
"type": "episodic|semantic|procedural", // optional, default "episodic"
"source": "string", // optional, caller identifier
"sessionId": "string", // optional
"tags": ["string"], // optional
"metadata": {}, // optional, arbitrary object
"importance": 0.5, // optional, default: episodic 0.5, semantic 0.7, procedural 0.6
"concept": "string", // optional, label for graph node
"namespace": "string" // optional, scope memory to a namespace
}

Response 201
{
"memory": {
"id": "a1b2c3d4-...",
"type": "episodic",
"content": "...",
"summary": "...",
"importance": 0.5,
"confidence": 1.0,
"accessCount": 0,
"lastAccessedAt": null,
"eventAt": null,
"sessionId": null,
"source": "curl",
"concept": null,
"triggerPattern": null,
"actionPattern": null,
"namespace": null,
"metadata": {},
"tags": [],
"createdAt": "2025-03-21T10:00:00.000Z",
"updatedAt": "2025-03-21T10:00:00.000Z",
"archivedAt": null
},
"contradictions": {
"hasContradictions": false,
"contradictions": [],
"candidatesChecked": 3,
"latencyMs": 12
}
}

/api/memory
List stored memories with optional filtering and pagination.
Query parameters
| Param | Type | Description |
|---|---|---|
| type | string | Filter by memory type: episodic, semantic, or procedural. |
| source | string | Filter by source identifier. |
| limit | number | Max records to return. Range: 1-200, default: 50. |
| offset | number | Pagination offset. Default: 0. |
Response 200
{
"count": 1247,
"memories": [ /* memory objects */ ]
}

/api/memory/:id
Retrieve a single memory by ID. Returns 404 if not found.
Response 200
{
"id": "a1b2c3d4-...",
"type": "semantic",
"content": "...",
"summary": "...",
"importance": 0.85,
"confidence": 1.0,
"accessCount": 14,
"lastAccessedAt": "2025-03-21T...",
"eventAt": null,
"sessionId": null,
"source": "curl",
"concept": "Tech stack",
"triggerPattern": null,
"actionPattern": null,
"metadata": {},
"tags": ["tech"],
"createdAt": "2025-03-21T...",
"updatedAt": "2025-03-21T...",
"archivedAt": null
}

/api/memory/:id
Soft-delete (archive) a single memory by ID.
Response 204 No Content
Empty response body on success.
Recall
/api/recall
Context assembly — retrieves the most relevant memories and returns a formatted context string suitable for LLM prompts.
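A typical use is splicing the returned context string into an LLM prompt. A caller-side sketch — the helper name and prompt layout are illustrative, not part of Engram:

```javascript
// Splice Engram's assembled context into an LLM prompt.
// The response shape matches /api/recall; the prompt template is illustrative.
function buildPrompt(recallResponse, userMessage) {
  const context = recallResponse.context?.trim() ?? "";
  return context.length > 0
    ? `${context}\n\n---\n\nUser: ${userMessage}`
    : `User: ${userMessage}`;
}

// Usage (assumes a running server):
// const res = await fetch("http://localhost:4901/api/recall", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify({ query: "what stack does the app use?" }),
// });
// const prompt = buildPrompt(await res.json(), "Summarize our tech decisions.");
```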
Request body
{
"query": "string", // required
"maxTokens": 2000, // optional, default 2000, max 8000
"types": ["semantic"], // optional, filter by type(s)
"sources": ["cli"], // optional, filter by source(s)
"sessionId": "string", // optional
"crossNamespace": false // optional, search all namespaces
}

Response 200
{
"context": "## Relevant memories\n\n...",
"memories": [
{
"id": "a1b2c3d4-...",
"type": "semantic",
"content": "...",
"summary": "...",
"score": 0.94,
"similarity": 0.91,
"source": "cli"
}
],
"latencyMs": 18
}

/api/search
Semantic vector search — returns memories ranked by cosine similarity.
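For reference, cosine similarity is the dot product of two vectors divided by the product of their magnitudes; the `threshold` request parameter filters out results scoring below this value. A minimal sketch of the metric itself (generic, not Engram's internal code):

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|). Returns a value in [-1, 1];
// identical directions score 1, orthogonal vectors score 0.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```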
Request body
{
"query": "string", // required
"topK": 10, // optional, 1-50, default 10
"threshold": 0.3, // optional, 0-1, default 0.3
"types": ["semantic"], // optional, filter by type(s)
"sources": ["cli"], // optional, filter by source(s)
"crossNamespace": false // optional, search all namespaces
}

Response 200
{
"count": 4,
"latencyMs": 6,
"results": [
{
"id": "a1b2c3d4-...",
"type": "semantic",
"content": "...",
"summary": "...",
"importance": 0.85,
"source": "cli",
"createdAt": "2025-03-21T..."
}
]
}

/api/recall/stream
Streaming recall via Server-Sent Events — memories arrive progressively as they're found. High-confidence vector results first, then graph-expanded neighbors.
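Browsers can consume this endpoint with the built-in EventSource; from Node you may need to parse the raw SSE frames yourself. A sketch of a frame parser — the wire format is standard SSE, so this is generic rather than Engram-specific:

```javascript
// Parse one SSE frame ("event: ...\ndata: ...") into { event, data }.
// Engram sends JSON in the data field, so it is JSON.parse'd here.
function parseSSEFrame(frame) {
  let event = "message"; // SSE default when no event line is present
  const dataLines = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) event = line.slice(6).trim();
    else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
  }
  return { event, data: JSON.parse(dataLines.join("\n")) };
}

// Usage: split the response body on blank lines ("\n\n"), parse each frame,
// and stop once the "complete" event arrives.
```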
Query parameters
| Param | Type | Description |
|---|---|---|
| query | string | Required. The recall query. |
| maxTokens | number | Max context tokens. Default: 2000. |
| types | string | Comma-separated memory types. |
| sources | string | Comma-separated source filters. |
| crossNamespace | boolean | Search all namespaces. Default: false. |
SSE Events
// Phase 1 — vector search results (high confidence, immediate)
event: vector
data: {
"phase": "vector",
"memory": { "id": "...", "type": "semantic", "content": "...", "score": 0.94, "similarity": 0.91 },
"rank": 1,
"contextSoFar": "[NEURAL MEMORY CONTEXT]\n..."
}
// Phase 2 — graph-expanded neighbors (lower confidence, backfill)
event: graph
data: {
"phase": "graph",
"memory": { "id": "...", "type": "episodic", "content": "...", "score": 0.42, "similarity": 0.1 },
"rank": 5,
"contextSoFar": "..."
}
// Phase 3 — complete (final assembled context)
event: complete
data: {
"phase": "complete",
"context": "[NEURAL MEMORY CONTEXT]\n[KNOWLEDGE]\n...",
"memories": [ /* all memories in final scored order */ ],
"latencyMs": 24
}

The contextSoFar field on each chunk gives you a usable partial context at every step.

Stats
/api/health
Health check endpoint.
Response 200
{
"status": "ok",
"version": "1.0.0",
"uptime": 12345
}

/api/stats
Returns aggregate statistics about the memory store and knowledge graph.
Response 200
{
"total": 1247,
"byType": {
"episodic": 680,
"semantic": 412,
"procedural": 155
},
"bySource": {
"cli": 300,
"mcp": 947
},
"namespace": null,
"indexSize": 1247,
"graphNodes": 1247,
"graphEdges": 2843
}

Bulk operations

/api/memory/batch
Store multiple memories in a single request. Embeddings are processed in parallel.
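For large imports it can help to chunk records client-side so each request stays a manageable size. A hypothetical chunking helper — the batch size here is the caller's choice; this doc doesn't specify a server-side limit:

```javascript
// Split an array of memory records into fixed-size payloads for
// POSTing to /api/memory/batch. The chunk size is a client-side choice.
function toBatches(records, size = 50) {
  const batches = [];
  for (let i = 0; i < records.length; i += size) {
    batches.push({ memories: records.slice(i, i + size) });
  }
  return batches;
}
```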
Request body
{
"memories": [
{ "content": "...", "type": "semantic", "importance": 0.8 },
{ "content": "...", "type": "episodic" }
]
}

Response 201
{
"count": 2,
"latencyMs": 24,
"ids": ["a1b2c3d4-...", "e5f6g7h8-..."],
"contradictions": 0
}

Sessions

/api/sessions
Create a new session to group related memories.
Request body
{
"source": "string", // optional
"context": "string" // optional
}

Response 201
{
"id": "a1b2c3d4-..."
}

/api/sessions
List all sessions.

Knowledge Graph

/api/graph/:id
Get the graph neighborhood for a memory node.
Query parameters
| Param | Type | Description |
|---|---|---|
| depth | number | Traversal depth. Range: 1-4, default: 2. |
Response 200
{
"node": { /* memory object */ },
"connections": [ /* edge objects */ ],
"neighbors": [ /* memory objects */ ]
}

/api/connections
Create a graph edge between two memory nodes.
Request body
{
"sourceId": "a1b2c3d4-...", // required
"targetId": "e5f6g7h8-...", // required
"relationship": "relates_to", // required: is_a | has_property | causes | relates_to | contradicts | part_of | follows
"strength": 1.0, // optional, 0-1, default 1.0
"bidirectional": false // optional, default false
}

Response 201
{
"id": "...",
"sourceId": "a1b2c3d4-...",
"targetId": "e5f6g7h8-...",
"relationship": "relates_to",
"strength": 1.0,
"bidirectional": false
}

Error format
All errors follow a consistent shape:
// 400 Bad Request
{
"statusCode": 400,
"code": "VALIDATION_ERROR",
"error": "Bad Request",
"message": ""type" must be one of episodic, semantic, procedural"
}
// 404 Not Found
{
"statusCode": 404,
"code": "NOT_FOUND",
"error": "Not Found",
"message": "Memory a1b2c3d4-... not found"
}

Consolidation

/api/consolidate
Consolidate episodic memories into semantic summaries. Clusters similar episodes by vector similarity, merges each cluster into a single semantic memory, and archives the originals. Like sleep consolidation in the brain.
// Request
{
"minClusterSize": 3, // min episodes to form cluster (default: 3)
"threshold": 0.6 // similarity threshold (default: 0.6)
}
// Response (200)
{
"consolidated": 2,
"memories": [
{
"id": "a1b2c3d4-...",
"concept": "deployment workflow",
"content": "When deploying → run migrations first..."
}
]
}

Decay & Retention

/api/decay
Run a memory decay sweep. Evaluates all memories and archives stale ones based on the decay policy. Also triggers auto-consolidation of old episodic memories if enabled.
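The policy knobs (halfLifeDays, archiveThreshold) suggest exponential, Ebbinghaus-style decay: retention halves every half-life, and a memory becomes a candidate for archiving once its score falls below the threshold. Engram's exact scoring isn't spelled out in this doc, so treat this as an illustrative model rather than the engine's formula:

```javascript
// Illustrative Ebbinghaus-style retention: score halves every halfLifeDays.
// NOT Engram's actual implementation — a sketch of the documented knobs.
function retentionScore(ageDays, halfLifeDays = 7) {
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// With the default policy (halfLifeDays 7, archiveThreshold 0.05), an
// untouched memory would cross the archive threshold after roughly a month.
function wouldArchive(ageDays, { halfLifeDays = 7, archiveThreshold = 0.05 } = {}) {
  return retentionScore(ageDays, halfLifeDays) < archiveThreshold;
}
```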
Request body
{
"dryRun": false // optional, default false — if true, preview without modifying
}

Response 200
{
"scannedCount": 101,
"archivedCount": 12,
"archivedIds": ["a1b2c3d4-...", "..."],
"decayedCount": 34,
"protectedCount": 8,
"consolidatedCount": 2,
"newSemanticIds": ["f9a8b7c6-..."],
"durationMs": 45
}

/api/decay/policy
Get the current decay policy configuration.
Response 200
{
"halfLifeDays": 7,
"archiveThreshold": 0.05,
"decayIntervalMs": 3600000,
"batchSize": 200,
"importanceDecayRate": 0.01,
"importanceFloor": 0.05,
"consolidation": {
"enabled": true,
"minClusterSize": 3,
"similarityThreshold": 0.6,
"minEpisodicAgeMs": 86400000
},
"protectionRuleCount": 4
}

/api/decay/policy
Update the decay policy at runtime. Only the fields you include are changed — others keep their current values.
Request body
{
"halfLifeDays": 14, // optional — Ebbinghaus half-life
"archiveThreshold": 0.1, // optional — retention score floor
"decayIntervalMs": 7200000, // optional — auto-sweep interval (ms)
"importanceDecayRate": 0.005, // optional — daily importance reduction
"importanceFloor": 0.02, // optional — minimum importance
"consolidation": { // optional — auto-consolidation settings
"enabled": true,
"minClusterSize": 5,
"similarityThreshold": 0.7
}
}

Response 200
{
"message": "Decay policy updated",
"halfLifeDays": 14,
"archiveThreshold": 0.1,
"decayIntervalMs": 7200000,
...
}

After each sweep, a memory:decayed event is emitted on the /neural Socket.io namespace with the full sweep result object.

Every POST /api/memory call automatically finds the top 3 most similar existing memories (threshold 0.5) and creates bidirectional relates_to edges. The knowledge graph grows organically with every stored memory.

If no concept is provided when storing, Engram automatically extracts a short topic label (2–5 words) from the content.

Contradiction Detection
Engram automatically detects when new memories conflict with existing ones. When a memory is stored, the engine finds same-topic candidates via vector similarity, then analyzes content for negation, value changes, temporal overrides, and opposite sentiment. Detected contradictions are linked with contradicts graph edges.
/api/contradictionsList all unresolved contradictions — memory pairs linked by 'contradicts' edges where both memories are still active.
Response 200
{
"count": 1,
"contradictions": [
{
"edgeId": "e1f2g3h4-...",
"confidence": 0.637,
"metadata": { "signals": ["negation"], "suggestedStrategy": "keep_both" },
"source": {
"id": "a1b2c3d4-...",
"content": "The app does not use PostgreSQL...",
"type": "semantic",
"importance": 0.8,
"createdAt": "2026-03-24T..."
},
"target": {
"id": "e5f6g7h8-...",
"content": "The app uses PostgreSQL...",
"type": "semantic",
"importance": 0.8,
"createdAt": "2026-03-24T..."
}
}
]
}

/api/contradictions/check/:id
Check a specific memory for contradictions against the existing memory store. Useful for re-scanning older memories.
Response 200
{
"hasContradictions": true,
"contradictions": [
{
"newMemoryId": "a1b2c3d4-...",
"existingMemoryId": "e5f6g7h8-...",
"similarity": 0.872,
"confidence": 0.637,
"signals": [
{ "type": "negation", "description": "Content contains negation...", "weight": 0.75 }
],
"suggestedStrategy": "keep_newest"
}
],
"candidatesChecked": 5,
"latencyMs": 18
}

/api/contradictions/resolve
Resolve a contradiction between two memories using one of five strategies.
Request body
{
"sourceId": "a1b2c3d4-...", // required
"targetId": "e5f6g7h8-...", // required
"strategy": "keep_newest" // required: keep_newest | keep_oldest | keep_important | keep_both | manual
}

Strategies
| Strategy | Action |
|---|---|
| keep_newest | Archive the older memory, keep the newer one. |
| keep_oldest | Archive the newer memory, keep the older one. |
| keep_important | Archive the lower-importance memory. |
| keep_both | Keep both memories; the contradicts edge remains as documentation. |
| manual | No action taken — flag for human review. |
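The first three strategies reduce to choosing which of the two memories to archive. A sketch of that choice, using the createdAt and importance fields shown in the memory objects above (illustrative, not Engram's internal code):

```javascript
// Pick which of two contradicting memories to archive, per strategy.
// Returns the memory to archive, or null for keep_both / manual.
function pickArchiveTarget(a, b, strategy) {
  const older = new Date(a.createdAt) <= new Date(b.createdAt) ? a : b;
  const newer = older === a ? b : a;
  switch (strategy) {
    case "keep_newest": return older;   // archive the older one
    case "keep_oldest": return newer;   // archive the newer one
    case "keep_important":              // archive the less important one
      return a.importance <= b.importance ? a : b;
    case "keep_both":
    case "manual":
      return null;                      // nothing is archived
    default:
      throw new Error(`unknown strategy: ${strategy}`);
  }
}
```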
Response 200
{
"resolved": true,
"archivedId": "e5f6g7h8-...",
"keptId": "a1b2c3d4-..."
}

/api/contradictions/config
Get the current contradiction detection configuration.
Response 200
{
"enabled": true,
"similarityThreshold": 0.65,
"confidenceThreshold": 0.4,
"maxCandidates": 10,
"defaultStrategy": "keep_both",
"autoResolve": false
}

/api/contradictions/config
Update contradiction detection settings at runtime.
Request body
{
"enabled": true, // optional — toggle detection on/off
"similarityThreshold": 0.7, // optional — min similarity for same-topic matching
"confidenceThreshold": 0.5, // optional — min confidence to flag
"maxCandidates": 20, // optional — max candidates to evaluate per store
"defaultStrategy": "keep_newest", // optional — default auto-resolve strategy
"autoResolve": true // optional — auto-resolve using default strategy
}

When a contradiction is detected, a memory:contradiction event is emitted on the /neural namespace. When resolved, a memory:contradiction_resolved event fires.

Detection signals: negation (content negates existing statement), value_change (explicit change patterns like "switched from X to Y"), temporal_override (newer info supersedes older), and opposite_sentiment (opposing sentiment about the same subject).

Embedding Management
Engram tracks which embedding model generated each memory's vector. When you switch models, the re-embedding pipeline updates stale vectors so search quality stays consistent. Legacy memories (stored before model tracking) can be backfilled without re-embedding.
/api/embeddings/status
Get the current embedding model, dimension, and how many memories are current, stale, or legacy.
Response 200
{
"currentModel": "Xenova/all-MiniLM-L6-v2",
"currentDimension": 384,
"totalEmbedded": 1247,
"currentModelCount": 1200,
"staleCount": 0,
"legacyCount": 47,
"needsReEmbed": true
}

/api/embeddings/re-embed
Re-embed memories with the current model. Processes in batches; emits WebSocket progress events. Can be slow for large stores.
Request body
{
"onlyStale": true, // optional — only re-embed stale/legacy memories (default: true)
"batchSize": 32 // optional — batch size 1-100 (default: 32)
}

Response 200
{
"total": 47,
"processed": 47,
"failed": 0,
"failedIds": [],
"durationMs": 2340,
"model": "Xenova/all-MiniLM-L6-v2",
"message": "Re-embedded 47 memories in 2340ms"
}

/api/embeddings/backfill
Tag legacy memories (stored before model tracking) with the current model ID, without actually re-computing their embeddings. Use when you know the same model was used.
Response 200
{
"currentModel": "Xenova/all-MiniLM-L6-v2",
"currentDimension": 384,
"totalEmbedded": 1247,
"currentModelCount": 1247,
"staleCount": 0,
"legacyCount": 0,
"needsReEmbed": false,
"message": "Backfilled legacy memories with model: Xenova/all-MiniLM-L6-v2"
}

During re-embedding, embedding:progress events fire after each batch on the /neural namespace. A final embedding:complete event fires when done.

The default embedding model is Xenova/all-MiniLM-L6-v2 (384-dim). Other compatible models include Xenova/bge-small-en-v1.5 (384-dim) and Xenova/bge-base-en-v1.5 (768-dim). Set via the ENGRAM_EMBEDDING_MODEL env var.

Vector Index Management
Engram persists the in-memory vector index to disk on shutdown and reloads it on startup. This enables fast incremental startup — only new memories added since the last save need to be loaded from the database. Set the index path via ENGRAM_INDEX_PATH env var or it defaults to {dbPath}.index.
/api/index/status
Get vector index status — how it was loaded, entry count, persistence info.
Response 200
{
"loadedFrom": "disk", // "disk" | "database" | "not_loaded"
"entryCount": 1247,
"dimension": 384,
"indexPath": "/data/engram.db.index",
"indexFileExists": true,
"incrementalCount": 3, // memories added since cache
"initDurationMs": 45 // startup time
}

/api/index/rebuild
Force a full vector index rebuild from the database. Discards any cached index and rebuilds from scratch.
Response 200
{
"loadedFrom": "database",
"entryCount": 1247,
"initDurationMs": 320,
"message": "Index rebuilt: 1247 entries in 320ms"
}

/api/index/save
Force an immediate save of the vector index to disk. Normally this happens automatically on shutdown.
Response 200
{
"entryCount": 1247,
"indexPath": "/data/engram.db.index",
"indexFileExists": true,
"message": "Index saved to /data/engram.db.index"
}

Webhooks
Subscribe external systems to memory lifecycle events via HTTP callbacks. Webhooks fire asynchronously with HMAC-SHA256 signing, exponential backoff retry (3 attempts), and auto-disable after 10 consecutive failures.
/api/webhooks
Subscribe a new webhook. Returns the created subscription.
Request body
{
"url": "https://example.com/engram-hook", // required
"events": ["stored", "forgotten"], // required: stored | forgotten | decayed | consolidated | contradiction
"secret": "my-shared-secret", // optional — enables HMAC-SHA256 signing
"description": "Notify Slack on new memories" // optional
}

Response 201
{
"id": "a1b2c3d4-...",
"url": "https://example.com/engram-hook",
"events": ["stored", "forgotten"],
"active": true,
"description": "Notify Slack on new memories",
"secret": "my-shared-secret",
"failCount": 0,
"createdAt": "2026-03-25T...",
"lastTriggeredAt": null
}

/api/webhooks
List all webhook subscriptions. Use ?activeOnly=true to filter.
Response 200
{
"count": 2,
"webhooks": [ /* subscription objects */ ]
}

/api/webhooks/:id
Delete a webhook subscription.
Response 204 No Content
/api/webhooks/:id/test
Send a test event to verify the webhook endpoint is reachable.
Response 200
{
"webhookId": "a1b2c3d4-...",
"url": "https://example.com/engram-hook",
"success": true,
"statusCode": 200,
"attempts": 1
}{ "event": "stored", "timestamp": "...", "data": { ... } }. When a secret is configured, the X-Engram-Signature header contains sha256=<hmac>.stored (new memory), forgotten (archived),decayed (sweep completed), consolidated (episodes merged),contradiction (conflict detected).Tagging & Collections
Browse and filter memories by tags. Tags are stored as JSON arrays on each memory. Collections group tags by prefix (e.g. project:alpha, topic:ml).
/api/tags
Get the tag cloud — all unique tags with memory counts, sorted by count descending.
Response 200
{
"count": 8,
"tags": [
{ "tag": "typescript", "count": 42 },
{ "tag": "backend", "count": 28 },
{ "tag": "project:alpha", "count": 15 }
]
}

/api/tags/:tag
Get all memories with a specific tag. Supports pagination via ?limit and ?offset.
Response 200
{
"tag": "typescript",
"count": 3,
"memories": [ /* memory objects */ ]
}

/api/collections
Get collections — tags grouped by prefix. Tags without a colon go into the 'default' collection.
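The grouping rule is simple enough to mirror client-side: everything before the first colon names the collection, and colon-less tags fall into 'default'. An illustrative sketch of that rule (not the server's code):

```javascript
// Group { tag, count } entries into collections by prefix (text before the
// first ":"); tags without a colon go into the "default" collection.
function groupIntoCollections(tags) {
  const byPrefix = new Map();
  for (const { tag, count } of tags) {
    const i = tag.indexOf(":");
    const prefix = i === -1 ? "default" : tag.slice(0, i);
    if (!byPrefix.has(prefix)) {
      byPrefix.set(prefix, { name: prefix, prefix, tags: [], totalMemories: 0 });
    }
    const col = byPrefix.get(prefix);
    col.tags.push({ tag, count });
    col.totalMemories += count;
  }
  return [...byPrefix.values()];
}
```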
Response 200
{
"count": 3,
"collections": [
{
"name": "project",
"prefix": "project",
"tags": [
{ "tag": "project:alpha", "count": 15 },
{ "tag": "project:beta", "count": 8 }
],
"totalMemories": 23
}
]
}

/api/memory/:id/tags
Add a tag to a memory. Idempotent — adding an existing tag is a no-op.
Request body
{ "tag": "important" }Response 200
{ "id": "a1b2c3d4-...", "tags": ["existing", "important"] }/api/memory/:id/tags/:tagRemove a tag from a memory.
Response 200
{ "id": "a1b2c3d4-...", "tags": ["remaining-tag"] }Plugin System
Engram supports a formalized plugin system for community extensions. Plugins register hooks that fire at key lifecycle events: onStore, onRecall, onForget, onDecay, onStartup, onShutdown. Errors in plugins are isolated — they never break core operations.
/api/pluginsList all registered plugins with their hooks and metadata.
Response 200
{
"count": 2,
"plugins": [
{
"id": "my-org/logger",
"name": "Memory Logger",
"version": "1.0.0",
"description": "Logs all memory events to console",
"hooks": ["onStore", "onRecall", "onForget"],
"registeredAt": "2026-03-25T..."
}
]
}

/api/plugins/:id
Get a single registered plugin by ID.
/api/plugins/:id
Unregister a plugin by ID. Its hooks will no longer fire.
Response 204 No Content
A plugin provides an id (unique identifier), name, version, optional description, and a hooks object with lifecycle callbacks. Plugins run in registration order.

Hook reference: onStore (after memory stored), onRecall (after context assembled), onForget (after archive), onDecay (after sweep), onStartup (after brain init), onShutdown (before close).
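Putting the pieces together, a plugin object might look like the following. The hook argument shapes are assumptions — this reference documents which hooks exist but not what each callback receives — and the registration mechanism itself isn't shown here:

```javascript
// Hypothetical plugin matching the documented shape (id, name, version,
// description, hooks). Hook argument shapes are assumptions, not documented.
const events = [];

const loggerPlugin = {
  id: "my-org/logger",
  name: "Memory Logger",
  version: "1.0.0",
  description: "Logs all memory events",
  hooks: {
    onStore(memory) { events.push(["stored", memory.id]); },
    onForget(memory) { events.push(["forgotten", memory.id]); },
  },
};

// Hooks are fired by Engram at lifecycle events; simulated here for illustration.
loggerPlugin.hooks.onStore({ id: "a1b2" });
```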