Engram Docs v0.1.0
Reference

REST API

Engram exposes a JSON HTTP API on port 4901 (configurable). All endpoints accept and return application/json. No authentication is required for local access.

Base URL: http://localhost:4901

Memories

POST /api/memory

Store a new memory. The content is embedded asynchronously; the response returns immediately with the created record.

Request body

{
  "content":    "string",          // required
  "type":       "episodic|semantic|procedural",  // optional, default "episodic"
  "source":     "string",          // optional, caller identifier
  "sessionId":  "string",          // optional
  "tags":       ["string"],        // optional
  "metadata":   {},                // optional, arbitrary object
  "importance": 0.5,               // optional, default: episodic 0.5, semantic 0.7, procedural 0.6
  "concept":    "string",          // optional, label for graph node
  "namespace":  "string"           // optional, scope memory to a namespace
}

Response 201

{
  "memory": {
    "id":              "a1b2c3d4-...",
    "type":            "episodic",
    "content":         "...",
    "summary":         "...",
    "importance":      0.5,
    "confidence":      1.0,
    "accessCount":     0,
    "lastAccessedAt":  null,
    "eventAt":         null,
    "sessionId":       null,
    "source":          "curl",
    "concept":         null,
    "triggerPattern":  null,
    "actionPattern":   null,
    "namespace":       null,
    "metadata":        {},
    "tags":            [],
    "createdAt":       "2025-03-21T10:00:00.000Z",
    "updatedAt":       "2025-03-21T10:00:00.000Z",
    "archivedAt":      null
  },
  "contradictions": {
    "hasContradictions": false,
    "contradictions":    [],
    "candidatesChecked": 3,
    "latencyMs":         12
  }
}
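The per-type importance defaults above (episodic 0.5, semantic 0.7, procedural 0.6) can be mirrored client-side when constructing the request body. A minimal sketch — the helper name and its validation behavior are illustrative, not part of the API:

```python
import json

# Per-type importance defaults as documented for POST /api/memory.
DEFAULT_IMPORTANCE = {"episodic": 0.5, "semantic": 0.7, "procedural": 0.6}

def build_memory_payload(content, type="episodic", importance=None, **optional):
    """Build a POST /api/memory body, filling the documented defaults.

    Illustrative client-side helper; the server applies the same
    defaults when the fields are omitted.
    """
    if type not in DEFAULT_IMPORTANCE:
        raise ValueError('"type" must be one of episodic, semantic, procedural')
    payload = {
        "content": content,
        "type": type,
        "importance": DEFAULT_IMPORTANCE[type] if importance is None else importance,
    }
    payload.update(optional)  # source, sessionId, tags, metadata, concept, namespace
    return payload

body = build_memory_payload("Deploys run migrations first", type="procedural", tags=["ops"])
print(json.dumps(body))
```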
GET /api/memory

List stored memories with optional filtering and pagination.

Query parameters

Param    Type     Description
type     string   Filter by memory type: episodic, semantic, or procedural.
source   string   Filter by source identifier.
limit    number   Max records to return. Range: 1-200, default: 50.
offset   number   Pagination offset. Default: 0.

Response 200

{
  "count": 1247,
  "memories": [ /* memory objects */ ]
}
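Walking the full store means advancing offset until count is reached. A minimal paging sketch — the transport is stubbed out (fake_fetch stands in for the real HTTP call), so only the limit/offset logic is shown:

```python
def iter_memories(fetch_page, limit=50):
    """Yield every memory by walking limit/offset pages.

    `fetch_page(limit, offset)` is any callable that performs
    GET /api/memory?limit=...&offset=... and returns the parsed
    {"count": ..., "memories": [...]} response.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["memories"]
        offset += len(page["memories"])
        if not page["memories"] or offset >= page["count"]:
            return

# Stubbed transport standing in for the real HTTP call:
_store = [{"id": str(i)} for i in range(120)]

def fake_fetch(limit, offset):
    return {"count": len(_store), "memories": _store[offset:offset + limit]}

all_memories = list(iter_memories(fake_fetch, limit=50))
```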
GET /api/memory/:id

Retrieve a single memory by ID. Returns 404 if not found.

Response 200

{
  "id":              "a1b2c3d4-...",
  "type":            "semantic",
  "content":         "...",
  "summary":         "...",
  "importance":      0.85,
  "confidence":      1.0,
  "accessCount":     14,
  "lastAccessedAt":  "2025-03-21T...",
  "eventAt":         null,
  "sessionId":       null,
  "source":          "curl",
  "concept":         "Tech stack",
  "triggerPattern":  null,
  "actionPattern":   null,
  "metadata":        {},
  "tags":            ["tech"],
  "createdAt":       "2025-03-21T...",
  "updatedAt":       "2025-03-21T...",
  "archivedAt":      null
}
DELETE /api/memory/:id

Soft-delete (archive) a single memory by ID.

Response 204 No Content

Empty response body on success.


Recall

POST /api/recall

Context assembly — retrieves the most relevant memories and returns a formatted context string suitable for LLM prompts.

Request body

{
  "query":      "string",           // required
  "maxTokens":  2000,               // optional, default 2000, max 8000
  "types":      ["semantic"],       // optional, filter by type(s)
  "sources":    ["cli"],            // optional, filter by source(s)
  "sessionId":  "string",           // optional
  "crossNamespace": false            // optional, search all namespaces
}

Response 200

{
  "context": "## Relevant memories\n\n...",
  "memories": [
    {
      "id":         "a1b2c3d4-...",
      "type":       "semantic",
      "content":    "...",
      "summary":    "...",
      "score":      0.94,
      "similarity": 0.91,
      "source":     "cli"
    }
  ],
  "latencyMs": 18
}
POST /api/search

Semantic vector search — returns memories ranked by cosine similarity.

Request body

{
  "query":     "string",           // required
  "topK":      10,                 // optional, 1-50, default 10
  "threshold": 0.3,                // optional, 0-1, default 0.3
  "types":     ["semantic"],       // optional, filter by type(s)
  "sources":   ["cli"],            // optional, filter by source(s)
  "crossNamespace": false            // optional, search all namespaces
}

Response 200

{
  "count": 4,
  "latencyMs": 6,
  "results": [
    {
      "id":         "a1b2c3d4-...",
      "type":       "semantic",
      "content":    "...",
      "summary":    "...",
      "importance": 0.85,
      "source":     "cli",
      "createdAt":  "2025-03-21T..."
    }
  ]
}
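The ranking behind /api/search — cosine similarity, a threshold floor, then topK truncation — can be sketched in a few lines. This is an illustrative reimplementation, not Engram's actual index code:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec, index, top_k=10, threshold=0.3):
    """index: list of (memory_id, vector). Returns [(id, similarity)]
    sorted by similarity descending, filtered by threshold, capped at top_k."""
    scored = [(mid, cosine(query_vec, vec)) for mid, vec in index]
    hits = [(mid, s) for mid, s in scored if s >= threshold]
    hits.sort(key=lambda t: t[1], reverse=True)
    return hits[:top_k]
```

For example, with a query vector of [1, 0], a memory embedded at [1, 1] scores about 0.707 and survives the default 0.3 threshold, while an orthogonal [0, 1] scores 0.0 and is dropped.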
GET /api/recall/stream

Streaming recall via Server-Sent Events — memories arrive progressively as they're found. High-confidence vector results first, then graph-expanded neighbors.

Query parameters

Param           Type      Description
query           string    Required. The recall query.
maxTokens       number    Max context tokens. Default: 2000.
types           string    Comma-separated memory types.
sources         string    Comma-separated source filters.
crossNamespace  boolean   Search all namespaces. Default: false.

SSE Events

// Phase 1 — vector search results (high confidence, immediate)
event: vector
data: {
  "phase": "vector",
  "memory": { "id": "...", "type": "semantic", "content": "...", "score": 0.94, "similarity": 0.91 },
  "rank": 1,
  "contextSoFar": "[NEURAL MEMORY CONTEXT]\n..."
}

// Phase 2 — graph-expanded neighbors (lower confidence, backfill)
event: graph
data: {
  "phase": "graph",
  "memory": { "id": "...", "type": "episodic", "content": "...", "score": 0.42, "similarity": 0.1 },
  "rank": 5,
  "contextSoFar": "..."
}

// Phase 3 — complete (final assembled context)
event: complete
data: {
  "phase": "complete",
  "context": "[NEURAL MEMORY CONTEXT]\n[KNOWLEDGE]\n...",
  "memories": [ /* all memories in final scored order */ ],
  "latencyMs": 24
}
Use streaming recall when you want to start processing context before the full pipeline completes. The contextSoFar field on each chunk gives you a usable partial context at every step.
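Without a native EventSource client, the SSE wire format is easy to parse by hand: consecutive data: lines belong to one event and a blank line terminates it. A minimal sketch (the sample stream below is illustrative, with payloads abbreviated):

```python
import json

def parse_sse(raw):
    """Parse a Server-Sent Events stream into (event_name, payload) pairs.

    Per the SSE format, consecutive `data:` lines are joined with
    newlines and a blank line ends the event.
    """
    events, event, data_lines = [], None, []
    for line in raw.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and data_lines:
            events.append((event or "message", json.loads("\n".join(data_lines))))
            event, data_lines = None, []
    if data_lines:  # stream ended without a trailing blank line
        events.append((event or "message", json.loads("\n".join(data_lines))))
    return events

stream = (
    'event: vector\n'
    'data: {"phase": "vector", "rank": 1}\n'
    '\n'
    'event: complete\n'
    'data: {"phase": "complete", "latencyMs": 24}\n'
    '\n'
)
parsed = parse_sse(stream)
```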

Stats

GET /api/health

Health check endpoint.

Response 200

{
  "status":  "ok",
  "version": "1.0.0",
  "uptime":  12345
}
GET /api/stats

Returns aggregate statistics about the memory store and knowledge graph.

Response 200

{
  "total": 1247,
  "byType": {
    "episodic": 680,
    "semantic": 412,
    "procedural": 155
  },
  "bySource": {
    "cli": 300,
    "mcp": 947
  },
  "namespace":  null,
  "indexSize":  1247,
  "graphNodes": 1247,
  "graphEdges": 2843
}

Bulk operations

POST /api/memory/batch

Store multiple memories in a single request. Embeddings are processed in parallel.

Request body

{
  "memories": [
    { "content": "...", "type": "semantic", "importance": 0.8 },
    { "content": "...", "type": "episodic" }
  ]
}

Response 201

{
  "count": 2,
  "latencyMs": 24,
  "ids": ["a1b2c3d4-...", "e5f6g7h8-..."],
  "contradictions": 0
}

Sessions

POST /api/sessions

Create a new session to group related memories.

Request body

{
  "source":  "string",   // optional
  "context": "string"    // optional
}

Response 201

{
  "id": "a1b2c3d4-..."
}
GET /api/sessions

List all sessions.


Knowledge Graph

GET /api/graph/:id

Get the graph neighborhood for a memory node.

Query parameters

Param   Type     Description
depth   number   Traversal depth. Range: 1-4, default: 2.

Response 200

{
  "node":        { /* memory object */ },
  "connections": [ /* edge objects */ ],
  "neighbors":   [ /* memory objects */ ]
}
POST /api/connections

Create a graph edge between two memory nodes.

Request body

{
  "sourceId":      "a1b2c3d4-...",   // required
  "targetId":      "e5f6g7h8-...",   // required
  "relationship":  "relates_to",     // required: is_a | has_property | causes | relates_to | contradicts | part_of | follows
  "strength":      1.0,              // optional, 0-1, default 1.0
  "bidirectional": false             // optional, default false
}

Response 201

{
  "id":            "...",
  "sourceId":      "a1b2c3d4-...",
  "targetId":      "e5f6g7h8-...",
  "relationship":  "relates_to",
  "strength":      1.0,
  "bidirectional": false
}

Error format

All errors follow a consistent shape:

// 400 Bad Request
{
  "statusCode": 400,
  "code":       "VALIDATION_ERROR",
  "error":      "Bad Request",
  "message":    "\"type\" must be one of episodic, semantic, procedural"
}

// 404 Not Found
{
  "statusCode": 404,
  "code":       "NOT_FOUND",
  "error":      "Not Found",
  "message":    "Memory a1b2c3d4-... not found"
}

Consolidation

POST /api/consolidate

Consolidate episodic memories into semantic summaries. Clusters similar episodes by vector similarity, merges each cluster into a single semantic memory, and archives the originals. Like sleep consolidation in the brain.

// Request
{
  "minClusterSize": 3,   // min episodes to form cluster (default: 3)
  "threshold": 0.6       // similarity threshold (default: 0.6)
}

// Response (200)
{
  "consolidated": 2,
  "memories": [
    {
      "id": "a1b2c3d4-...",
      "concept": "deployment workflow",
      "content": "When deploying → run migrations first..."
    }
  ]
}

Decay & Retention

POST /api/decay

Run a memory decay sweep. Evaluates all memories and archives stale ones based on the decay policy. Also triggers auto-consolidation of old episodic memories if enabled.

Request body

{
  "dryRun": false    // optional, default false — if true, preview without modifying
}

Response 200

{
  "scannedCount": 101,
  "archivedCount": 12,
  "archivedIds": ["a1b2c3d4-...", "..."],
  "decayedCount": 34,
  "protectedCount": 8,
  "consolidatedCount": 2,
  "newSemanticIds": ["f9a8b7c6-..."],
  "durationMs": 45
}
GET /api/decay/policy

Get the current decay policy configuration.

Response 200

{
  "halfLifeDays": 7,
  "archiveThreshold": 0.05,
  "decayIntervalMs": 3600000,
  "batchSize": 200,
  "importanceDecayRate": 0.01,
  "importanceFloor": 0.05,
  "consolidation": {
    "enabled": true,
    "minClusterSize": 3,
    "similarityThreshold": 0.6,
    "minEpisodicAgeMs": 86400000
  },
  "protectionRuleCount": 4
}
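The policy fields interact roughly as follows: each memory's retention score halves every halfLifeDays, and the memory becomes an archive candidate once the score falls below archiveThreshold. The server's exact scoring formula is not specified here; this is a plausible sketch of plain Ebbinghaus-style exponential decay under those assumptions:

```python
# Sketch only: assumes pure exponential half-life decay, ignoring
# importance weighting, access counts, and protection rules.
def retention_score(age_days, half_life_days=7.0):
    """Score halves every half_life_days (1.0 at age 0)."""
    return 0.5 ** (age_days / half_life_days)

def should_archive(age_days, half_life_days=7.0, archive_threshold=0.05):
    """True once the retention score falls below the archive floor."""
    return retention_score(age_days, half_life_days) < archive_threshold
```

Under these defaults a memory crosses the 0.05 floor after a little more than four half-lives (about a month at halfLifeDays = 7).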
PUT /api/decay/policy

Update the decay policy at runtime. Only the fields you include are changed — others keep their current values.

Request body

{
  "halfLifeDays": 14,             // optional — Ebbinghaus half-life
  "archiveThreshold": 0.1,        // optional — retention score floor
  "decayIntervalMs": 7200000,     // optional — auto-sweep interval (ms)
  "importanceDecayRate": 0.005,   // optional — daily importance reduction
  "importanceFloor": 0.02,        // optional — minimum importance
  "consolidation": {              // optional — auto-consolidation settings
    "enabled": true,
    "minClusterSize": 5,
    "similarityThreshold": 0.7
  }
}

Response 200

{
  "message": "Decay policy updated",
  "halfLifeDays": 14,
  "archiveThreshold": 0.1,
  "decayIntervalMs": 7200000,
  ...
}
WebSocket event: When the auto-decay timer runs and archives memories, a memory:decayed event is emitted on the /neural Socket.io namespace with the full sweep result object.
Auto-linking: Every POST /api/memory call automatically finds the top 3 most similar existing memories (threshold 0.5) and creates bidirectional relates_to edges. The knowledge graph grows organically with every stored memory.
Auto-concepts: If no concept is provided when storing, Engram automatically extracts a short topic label (2–5 words) from the content.

Contradiction Detection

Engram automatically detects when new memories conflict with existing ones. When a memory is stored, the engine finds same-topic candidates via vector similarity, then analyzes content for negation, value changes, temporal overrides, and opposite sentiment. Detected contradictions are linked with contradicts graph edges.

GET /api/contradictions

List all unresolved contradictions — memory pairs linked by 'contradicts' edges where both memories are still active.

Response 200

{
  "count": 1,
  "contradictions": [
    {
      "edgeId":     "e1f2g3h4-...",
      "confidence": 0.637,
      "metadata":   { "signals": ["negation"], "suggestedStrategy": "keep_both" },
      "source": {
        "id": "a1b2c3d4-...",
        "content": "The app does not use PostgreSQL...",
        "type": "semantic",
        "importance": 0.8,
        "createdAt": "2026-03-24T..."
      },
      "target": {
        "id": "e5f6g7h8-...",
        "content": "The app uses PostgreSQL...",
        "type": "semantic",
        "importance": 0.8,
        "createdAt": "2026-03-24T..."
      }
    }
  ]
}
POST /api/contradictions/check/:id

Check a specific memory for contradictions against the existing memory store. Useful for re-scanning older memories.

Response 200

{
  "hasContradictions": true,
  "contradictions": [
    {
      "newMemoryId":      "a1b2c3d4-...",
      "existingMemoryId": "e5f6g7h8-...",
      "similarity":       0.872,
      "confidence":       0.637,
      "signals": [
        { "type": "negation", "description": "Content contains negation...", "weight": 0.75 }
      ],
      "suggestedStrategy": "keep_newest"
    }
  ],
  "candidatesChecked": 5,
  "latencyMs": 18
}
POST /api/contradictions/resolve

Resolve a contradiction between two memories using one of five strategies.

Request body

{
  "sourceId": "a1b2c3d4-...",   // required
  "targetId": "e5f6g7h8-...",   // required
  "strategy": "keep_newest"     // required: keep_newest | keep_oldest | keep_important | keep_both | manual
}

Strategies

Strategy        Action
keep_newest     Archive the older memory, keep the newer one.
keep_oldest     Archive the newer memory, keep the older one.
keep_important  Archive the lower-importance memory.
keep_both       Keep both memories; the contradicts edge remains as documentation.
manual          No action taken — flag for human review.

Response 200

{
  "resolved":   true,
  "archivedId": "e5f6g7h8-...",
  "keptId":     "a1b2c3d4-..."
}
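The archive/keep decision behind each strategy can be sketched as a pure function over the two memories. This is an illustrative client-side model (the function name and dict shape are assumptions), using the createdAt and importance fields from the memory objects:

```python
from datetime import datetime

def pick_archived(source, target, strategy):
    """Return (archived, kept) for a contradiction pair, per strategy.

    `source`/`target` are memory dicts with "createdAt" (ISO timestamp)
    and "importance". keep_both and manual archive nothing, so they
    return (None, None).
    """
    if strategy in ("keep_both", "manual"):
        return None, None
    by_age = lambda m: datetime.fromisoformat(m["createdAt"])
    if strategy == "keep_newest":
        older, newer = sorted((source, target), key=by_age)
        return older, newer          # archive the older memory
    if strategy == "keep_oldest":
        older, newer = sorted((source, target), key=by_age)
        return newer, older          # archive the newer memory
    if strategy == "keep_important":
        loser, winner = sorted((source, target), key=lambda m: m["importance"])
        return loser, winner         # archive the lower-importance memory
    raise ValueError(f"unknown strategy: {strategy}")
```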
GET /api/contradictions/config

Get the current contradiction detection configuration.

Response 200

{
  "enabled":             true,
  "similarityThreshold": 0.65,
  "confidenceThreshold": 0.4,
  "maxCandidates":       10,
  "defaultStrategy":     "keep_both",
  "autoResolve":         false
}
PUT /api/contradictions/config

Update contradiction detection settings at runtime.

Request body

{
  "enabled":             true,       // optional — toggle detection on/off
  "similarityThreshold": 0.7,        // optional — min similarity for same-topic matching
  "confidenceThreshold": 0.5,        // optional — min confidence to flag
  "maxCandidates":       20,         // optional — max candidates to evaluate per store
  "defaultStrategy":     "keep_newest",  // optional — default auto-resolve strategy
  "autoResolve":         true        // optional — auto-resolve using default strategy
}
WebSocket events: When a contradiction is detected on store, a memory:contradiction event is emitted on the /neural namespace. When resolved, a memory:contradiction_resolved event fires.
Signal types: The detector uses four heuristic signals — negation (content negates an existing statement), value_change (explicit change patterns like "switched from X to Y"), temporal_override (newer info supersedes older), and opposite_sentiment (opposing sentiment about the same subject).

Embedding Management

Engram tracks which embedding model generated each memory's vector. When you switch models, the re-embedding pipeline updates stale vectors so search quality stays consistent. Legacy memories (stored before model tracking) can be backfilled without re-embedding.

GET /api/embeddings/status

Get the current embedding model, dimension, and how many memories are current, stale, or legacy.

Response 200

{
  "currentModel":     "Xenova/all-MiniLM-L6-v2",
  "currentDimension": 384,
  "totalEmbedded":    1247,
  "currentModelCount": 1200,
  "staleCount":       0,
  "legacyCount":      47,
  "needsReEmbed":     true
}
POST /api/embeddings/re-embed

Re-embed memories with the current model. Processes in batches; emits WebSocket progress events. Can be slow for large stores.

Request body

{
  "onlyStale":  true,    // optional — only re-embed stale/legacy memories (default: true)
  "batchSize":  32       // optional — batch size 1-100 (default: 32)
}

Response 200

{
  "total":      47,
  "processed":  47,
  "failed":     0,
  "failedIds":  [],
  "durationMs": 2340,
  "model":      "Xenova/all-MiniLM-L6-v2",
  "message":    "Re-embedded 47 memories in 2340ms"
}
POST /api/embeddings/backfill

Tag legacy memories (stored before model tracking) with the current model ID, without actually re-computing their embeddings. Use when you know the same model was used.

Response 200

{
  "currentModel":      "Xenova/all-MiniLM-L6-v2",
  "currentDimension":  384,
  "totalEmbedded":     1247,
  "currentModelCount": 1247,
  "staleCount":        0,
  "legacyCount":       0,
  "needsReEmbed":      false,
  "message":           "Backfilled legacy memories with model: Xenova/all-MiniLM-L6-v2"
}
WebSocket events: During re-embedding, embedding:progress events fire after each batch on the /neural namespace. A final embedding:complete event fires when done.
Supported models: The default is Xenova/all-MiniLM-L6-v2 (384-dim). Other compatible models include Xenova/bge-small-en-v1.5 (384-dim) and Xenova/bge-base-en-v1.5 (768-dim). Set via ENGRAM_EMBEDDING_MODEL env var.

Vector Index Management

Engram persists the in-memory vector index to disk on shutdown and reloads it on startup. This enables fast incremental startup — only new memories added since the last save need to be loaded from the database. Set the index path via ENGRAM_INDEX_PATH env var or it defaults to {dbPath}.index.

GET /api/index/status

Get vector index status — how it was loaded, entry count, persistence info.

Response 200

{
  "loadedFrom":       "disk",         // "disk" | "database" | "not_loaded"
  "entryCount":       1247,
  "dimension":        384,
  "indexPath":        "/data/engram.db.index",
  "indexFileExists":  true,
  "incrementalCount": 3,              // memories added since cache
  "initDurationMs":   45              // startup time
}
POST /api/index/rebuild

Force a full vector index rebuild from the database. Discards any cached index and rebuilds from scratch.

Response 200

{
  "loadedFrom":       "database",
  "entryCount":       1247,
  "initDurationMs":   320,
  "message":          "Index rebuilt: 1247 entries in 320ms"
}
POST /api/index/save

Force save the vector index to disk immediately. Normally happens automatically on shutdown.

Response 200

{
  "entryCount":       1247,
  "indexPath":        "/data/engram.db.index",
  "indexFileExists":  true,
  "message":          "Index saved to /data/engram.db.index"
}
Startup optimization: With a persisted index, startup goes from O(n) full-scan to near-instant cache load + incremental delta. For 10k memories, this can reduce init time from ~2s to ~50ms.

Webhooks

Subscribe external systems to memory lifecycle events via HTTP callbacks. Webhooks fire asynchronously with HMAC-SHA256 signing, exponential backoff retry (3 attempts), and auto-disable after 10 consecutive failures.

POST /api/webhooks

Subscribe a new webhook. Returns the created subscription.

Request body

{
  "url":         "https://example.com/engram-hook",   // required
  "events":      ["stored", "forgotten"],             // required: stored | forgotten | decayed | consolidated | contradiction
  "secret":      "my-shared-secret",                  // optional — enables HMAC-SHA256 signing
  "description": "Notify Slack on new memories"       // optional
}

Response 201

{
  "id":              "a1b2c3d4-...",
  "url":             "https://example.com/engram-hook",
  "events":          ["stored", "forgotten"],
  "active":          true,
  "description":     "Notify Slack on new memories",
  "secret":          "my-shared-secret",
  "failCount":       0,
  "createdAt":       "2026-03-25T...",
  "lastTriggeredAt": null
}
GET /api/webhooks

List all webhook subscriptions. Use ?activeOnly=true to filter.

Response 200

{
  "count": 2,
  "webhooks": [ /* subscription objects */ ]
}
DELETE /api/webhooks/:id

Delete a webhook subscription.

Response 204 No Content

POST /api/webhooks/:id/test

Send a test event to verify the webhook endpoint is reachable.

Response 200

{
  "webhookId":  "a1b2c3d4-...",
  "url":        "https://example.com/engram-hook",
  "success":    true,
  "statusCode": 200,
  "attempts":   1
}
Webhook payload format: Each delivery is a POST with JSON body: { "event": "stored", "timestamp": "...", "data": { ... } }. When a secret is configured, the X-Engram-Signature header contains sha256=<hmac>.
Events: stored (new memory), forgotten (archived), decayed (sweep completed), consolidated (episodes merged), contradiction (conflict detected).
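A receiver can verify the X-Engram-Signature header before trusting a delivery. A minimal sketch, assuming the HMAC-SHA256 is computed over the raw request body bytes (the common convention for this header shape; confirm against your deployment):

```python
import hashlib
import hmac
import json

def verify_signature(secret, body_bytes, signature_header):
    """Check an X-Engram-Signature header of the form sha256=<hex hmac>.

    Uses a constant-time comparison to avoid timing side channels.
    Assumes the HMAC is taken over the raw request body.
    """
    expected = "sha256=" + hmac.new(secret.encode(), body_bytes, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

# Simulate a delivery signed with the shared secret from the example above:
body = json.dumps({"event": "stored", "timestamp": "2026-03-25T10:00:00Z", "data": {}}).encode()
sig = "sha256=" + hmac.new(b"my-shared-secret", body, hashlib.sha256).hexdigest()
```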

Tagging & Collections

Browse and filter memories by tags. Tags are stored as JSON arrays on each memory. Collections group tags by prefix (e.g. project:alpha, topic:ml).

GET /api/tags

Get the tag cloud — all unique tags with memory counts, sorted by count descending.

Response 200

{
  "count": 8,
  "tags": [
    { "tag": "typescript", "count": 42 },
    { "tag": "backend",    "count": 28 },
    { "tag": "project:alpha", "count": 15 }
  ]
}
GET /api/tags/:tag

Get all memories with a specific tag. Supports pagination via ?limit and ?offset.

Response 200

{
  "tag": "typescript",
  "count": 3,
  "memories": [ /* memory objects */ ]
}
GET /api/collections

Get collections — tags grouped by prefix. Tags without a colon go into the 'default' collection.

Response 200

{
  "count": 3,
  "collections": [
    {
      "name": "project",
      "prefix": "project",
      "tags": [
        { "tag": "project:alpha", "count": 15 },
        { "tag": "project:beta",  "count": 8 }
      ],
      "totalMemories": 23
    }
  ]
}
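The grouping rule is simple: everything before the first colon names the collection, and colon-free tags land in "default". An illustrative reimplementation of that grouping over a GET /api/tags response:

```python
from collections import defaultdict

def group_collections(tag_counts):
    """Group a tag cloud into collections by colon-prefix convention.

    `tag_counts`: list of {"tag": ..., "count": ...} entries as returned
    by GET /api/tags. Tags without a colon go into "default".
    """
    groups = defaultdict(list)
    for entry in tag_counts:
        tag = entry["tag"]
        prefix = tag.split(":", 1)[0] if ":" in tag else "default"
        groups[prefix].append(entry)
    return [
        {"name": name, "prefix": name, "tags": tags,
         "totalMemories": sum(t["count"] for t in tags)}
        for name, tags in groups.items()
    ]

cloud = [{"tag": "project:alpha", "count": 15},
         {"tag": "project:beta",  "count": 8},
         {"tag": "typescript",    "count": 42}]
collections = group_collections(cloud)
```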
POST /api/memory/:id/tags

Add a tag to a memory. Idempotent — adding an existing tag is a no-op.

Request body

{ "tag": "important" }

Response 200

{ "id": "a1b2c3d4-...", "tags": ["existing", "important"] }
DELETE /api/memory/:id/tags/:tag

Remove a tag from a memory.

Response 200

{ "id": "a1b2c3d4-...", "tags": ["remaining-tag"] }

Plugin System

Engram supports a formalized plugin system for community extensions. Plugins register hooks that fire at key lifecycle events: onStore, onRecall, onForget, onDecay, onStartup, onShutdown. Errors in plugins are isolated — they never break core operations.

GET /api/plugins

List all registered plugins with their hooks and metadata.

Response 200

{
  "count": 2,
  "plugins": [
    {
      "id":           "my-org/logger",
      "name":         "Memory Logger",
      "version":      "1.0.0",
      "description":  "Logs all memory events to console",
      "hooks":        ["onStore", "onRecall", "onForget"],
      "registeredAt": "2026-03-25T..."
    }
  ]
}
GET /api/plugins/:id

Get a single registered plugin by ID.

DELETE /api/plugins/:id

Unregister a plugin by ID. Its hooks will no longer fire.

Response 204 No Content

Plugin manifest: Each plugin provides id (unique identifier), name, version, optional description, and a hooks object with lifecycle callbacks. Plugins run in registration order.
Hook types: onStore (after memory stored), onRecall (after context assembled), onForget (after archive), onDecay (after sweep), onStartup (after brain init), onShutdown (before close).