Engram Docs v0.1.0
Reference

REST API

Engram exposes a JSON HTTP API on port 4901 (configurable). All endpoints accept and return application/json. No authentication is required for local access.

Base URL: http://localhost:4901

Memories

POST /api/memory

Store a new memory. The content is embedded asynchronously; the response returns immediately with the created record.

Request body

{
  "content":    "string",          // required
  "type":       "episodic|semantic|procedural",  // optional, default "episodic"
  "source":     "string",          // optional, caller identifier
  "sessionId":  "string",          // optional
  "tags":       ["string"],        // optional
  "metadata":   {},                // optional, arbitrary object
  "importance": 0.5,               // optional, default: episodic 0.5, semantic 0.7, procedural 0.6
  "concept":    "string",          // optional, label for graph node
  "namespace":  "string"           // optional, scope memory to a namespace
}

Response 201

{
  "memory": {
    "id":              "a1b2c3d4-...",
    "type":            "episodic",
    "content":         "...",
    "summary":         "...",
    "importance":      0.5,
    "confidence":      1.0,
    "accessCount":     0,
    "lastAccessedAt":  null,
    "eventAt":         null,
    "sessionId":       null,
    "source":          "curl",
    "concept":         null,
    "triggerPattern":  null,
    "actionPattern":   null,
    "namespace":       null,
    "metadata":        {},
    "tags":            [],
    "createdAt":       "2025-03-21T10:00:00.000Z",
    "updatedAt":       "2025-03-21T10:00:00.000Z",
    "archivedAt":      null
  },
  "contradictions": {
    "hasContradictions": false,
    "contradictions":    [],
    "candidatesChecked": 3,
    "latencyMs":         12
  }
}
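The per-type importance defaults above (episodic 0.5, semantic 0.7, procedural 0.6) can be mirrored client-side when constructing the request body. A minimal sketch — the helper name and its validation behavior are illustrative, not part of the API:

```python
import json

# Per-type importance defaults as documented for POST /api/memory.
DEFAULT_IMPORTANCE = {"episodic": 0.5, "semantic": 0.7, "procedural": 0.6}

def build_memory_payload(content, type="episodic", importance=None, **optional):
    """Build a POST /api/memory body, filling the documented defaults.

    Illustrative client-side helper; the server applies the same
    defaults when the fields are omitted.
    """
    if type not in DEFAULT_IMPORTANCE:
        raise ValueError('"type" must be one of episodic, semantic, procedural')
    payload = {
        "content": content,
        "type": type,
        "importance": DEFAULT_IMPORTANCE[type] if importance is None else importance,
    }
    payload.update(optional)  # source, sessionId, tags, metadata, concept, namespace
    return payload

body = build_memory_payload("Deploys run migrations first", type="procedural", tags=["ops"])
print(json.dumps(body))
```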
GET /api/memory

List stored memories with optional filtering and pagination.

Query parameters

Param    Type     Description
type     string   Filter by memory type: episodic, semantic, or procedural.
source   string   Filter by source identifier.
limit    number   Max records to return. Range: 1-200, default: 50.
offset   number   Pagination offset. Default: 0.

Response 200

{
  "count": 1247,
  "memories": [ /* memory objects */ ]
}
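Walking the full store means advancing offset until count is reached. A minimal paging sketch — the transport is stubbed out (fake_fetch stands in for the real HTTP call), so only the limit/offset logic is shown:

```python
def iter_memories(fetch_page, limit=50):
    """Yield every memory by walking limit/offset pages.

    `fetch_page(limit, offset)` is any callable that performs
    GET /api/memory?limit=...&offset=... and returns the parsed
    {"count": ..., "memories": [...]} response.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["memories"]
        offset += len(page["memories"])
        if not page["memories"] or offset >= page["count"]:
            return

# Stubbed transport standing in for the real HTTP call:
_store = [{"id": str(i)} for i in range(120)]

def fake_fetch(limit, offset):
    return {"count": len(_store), "memories": _store[offset:offset + limit]}

all_memories = list(iter_memories(fake_fetch, limit=50))
```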
GET /api/memory/:id

Retrieve a single memory by ID. Returns 404 if not found.

Response 200

{
  "id":              "a1b2c3d4-...",
  "type":            "semantic",
  "content":         "...",
  "summary":         "...",
  "importance":      0.85,
  "confidence":      1.0,
  "accessCount":     14,
  "lastAccessedAt":  "2025-03-21T...",
  "eventAt":         null,
  "sessionId":       null,
  "source":          "curl",
  "concept":         "Tech stack",
  "triggerPattern":  null,
  "actionPattern":   null,
  "metadata":        {},
  "tags":            ["tech"],
  "createdAt":       "2025-03-21T...",
  "updatedAt":       "2025-03-21T...",
  "archivedAt":      null
}
DELETE /api/memory/:id

Soft-delete (archive) a single memory by ID.

Response 204 No Content

Empty response body on success.


Recall

POST /api/recall

Context assembly — retrieves the most relevant memories and returns a formatted context string suitable for LLM prompts.

Request body

{
  "query":      "string",           // required
  "maxTokens":  2000,               // optional, default 2000, max 8000
  "types":      ["semantic"],       // optional, filter by type(s)
  "sources":    ["cli"],            // optional, filter by source(s)
  "sessionId":  "string",           // optional
  "crossNamespace": false            // optional, search all namespaces
}

Response 200

{
  "context": "## Relevant memories\n\n...",
  "memories": [
    {
      "id":         "a1b2c3d4-...",
      "type":       "semantic",
      "content":    "...",
      "summary":    "...",
      "score":      0.94,
      "similarity": 0.91,
      "source":     "cli"
    }
  ],
  "latencyMs": 18
}
POST /api/search

Semantic vector search — returns memories ranked by cosine similarity.

Request body

{
  "query":     "string",           // required
  "topK":      10,                 // optional, 1-50, default 10
  "threshold": 0.3,                // optional, 0-1, default 0.3
  "types":     ["semantic"],       // optional, filter by type(s)
  "sources":   ["cli"],            // optional, filter by source(s)
  "crossNamespace": false            // optional, search all namespaces
}

Response 200

{
  "count": 4,
  "latencyMs": 6,
  "results": [
    {
      "id":         "a1b2c3d4-...",
      "type":       "semantic",
      "content":    "...",
      "summary":    "...",
      "importance": 0.85,
      "source":     "cli",
      "createdAt":  "2025-03-21T..."
    }
  ]
}
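The ranking behind /api/search — cosine similarity, a threshold floor, then topK truncation — can be sketched in a few lines. This is an illustrative reimplementation, not Engram's actual index code:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def search(query_vec, index, top_k=10, threshold=0.3):
    """index: list of (memory_id, vector). Returns [(id, similarity)]
    sorted by similarity descending, filtered by threshold, capped at top_k."""
    scored = [(mid, cosine(query_vec, vec)) for mid, vec in index]
    hits = [(mid, s) for mid, s in scored if s >= threshold]
    hits.sort(key=lambda t: t[1], reverse=True)
    return hits[:top_k]
```

For example, with a query vector of [1, 0], a memory embedded at [1, 1] scores about 0.707 and survives the default 0.3 threshold, while an orthogonal [0, 1] scores 0.0 and is dropped.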
GET /api/recall/stream

Streaming recall via Server-Sent Events — memories arrive progressively as they're found. High-confidence vector results first, then graph-expanded neighbors.

Query parameters

Param           Type      Description
query           string    Required. The recall query.
maxTokens       number    Max context tokens. Default: 2000.
types           string    Comma-separated memory types.
sources         string    Comma-separated source filters.
crossNamespace  boolean   Search all namespaces. Default: false.

SSE Events

// Phase 1 — vector search results (high confidence, immediate)
event: vector
data: {
  "phase": "vector",
  "memory": { "id": "...", "type": "semantic", "content": "...", "score": 0.94, "similarity": 0.91 },
  "rank": 1,
  "contextSoFar": "[NEURAL MEMORY CONTEXT]\n..."
}

// Phase 2 — graph-expanded neighbors (lower confidence, backfill)
event: graph
data: {
  "phase": "graph",
  "memory": { "id": "...", "type": "episodic", "content": "...", "score": 0.42, "similarity": 0.1 },
  "rank": 5,
  "contextSoFar": "..."
}

// Phase 3 — complete (final assembled context)
event: complete
data: {
  "phase": "complete",
  "context": "[NEURAL MEMORY CONTEXT]\n[KNOWLEDGE]\n...",
  "memories": [ /* all memories in final scored order */ ],
  "latencyMs": 24
}
Use streaming recall when you want to start processing context before the full pipeline completes. The contextSoFar field on each chunk gives you a usable partial context at every step.
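Without a native EventSource client, the SSE wire format is easy to parse by hand: consecutive data: lines belong to one event and a blank line terminates it. A minimal sketch (the sample stream below is illustrative, with payloads abbreviated):

```python
import json

def parse_sse(raw):
    """Parse a Server-Sent Events stream into (event_name, payload) pairs.

    Per the SSE format, consecutive `data:` lines are joined with
    newlines and a blank line ends the event.
    """
    events, event, data_lines = [], None, []
    for line in raw.splitlines():
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
        elif line == "" and data_lines:
            events.append((event or "message", json.loads("\n".join(data_lines))))
            event, data_lines = None, []
    if data_lines:  # stream ended without a trailing blank line
        events.append((event or "message", json.loads("\n".join(data_lines))))
    return events

stream = (
    'event: vector\n'
    'data: {"phase": "vector", "rank": 1}\n'
    '\n'
    'event: complete\n'
    'data: {"phase": "complete", "latencyMs": 24}\n'
    '\n'
)
parsed = parse_sse(stream)
```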

Stats

GET /api/health

Health check endpoint.

Response 200

{
  "status":  "ok",
  "version": "1.0.0",
  "uptime":  12345
}
GET /api/stats

Returns aggregate statistics about the memory store and knowledge graph.

Response 200

{
  "total": 1247,
  "byType": {
    "episodic": 680,
    "semantic": 412,
    "procedural": 155
  },
  "bySource": {
    "cli": 300,
    "mcp": 947
  },
  "namespace":  null,
  "indexSize":  1247,
  "graphNodes": 1247,
  "graphEdges": 2843
}

Bulk operations

POST /api/memory/batch

Store multiple memories in a single request. Embeddings are processed in parallel.

Request body

{
  "memories": [
    { "content": "...", "type": "semantic", "importance": 0.8 },
    { "content": "...", "type": "episodic" }
  ]
}

Response 201

{
  "count": 2,
  "latencyMs": 24,
  "ids": ["a1b2c3d4-...", "e5f6g7h8-..."],
  "contradictions": 0
}

Sessions

POST /api/sessions

Create a new session to group related memories.

Request body

{
  "source":  "string",   // optional
  "context": "string"    // optional
}

Response 201

{
  "id": "a1b2c3d4-..."
}
GET /api/sessions

List all sessions.


Knowledge Graph

GET /api/graph/:id

Get the graph neighborhood for a memory node.

Query parameters

Param   Type     Description
depth   number   Traversal depth. Range: 1-4, default: 2.

Response 200

{
  "node":        { /* memory object */ },
  "connections": [ /* edge objects */ ],
  "neighbors":   [ /* memory objects */ ]
}
POST /api/connections

Create a graph edge between two memory nodes.

Request body

{
  "sourceId":      "a1b2c3d4-...",   // required
  "targetId":      "e5f6g7h8-...",   // required
  "relationship":  "relates_to",     // required: is_a | has_property | causes | relates_to | contradicts | part_of | follows
  "strength":      1.0,              // optional, 0-1, default 1.0
  "bidirectional": false             // optional, default false
}

Response 201

{
  "id":            "...",
  "sourceId":      "a1b2c3d4-...",
  "targetId":      "e5f6g7h8-...",
  "relationship":  "relates_to",
  "strength":      1.0,
  "bidirectional": false
}

Error format

All errors follow a consistent shape:

// 400 Bad Request
{
  "statusCode": 400,
  "code":       "VALIDATION_ERROR",
  "error":      "Bad Request",
  "message":    "\"type\" must be one of episodic, semantic, procedural"
}

// 404 Not Found
{
  "statusCode": 404,
  "code":       "NOT_FOUND",
  "error":      "Not Found",
  "message":    "Memory a1b2c3d4-... not found"
}

Consolidation

POST /api/consolidate

Consolidate episodic memories into semantic summaries. Clusters similar episodes by vector similarity, merges each cluster into a single semantic memory, and archives the originals. Like sleep consolidation in the brain.

// Request
{
  "minClusterSize": 3,   // min episodes to form cluster (default: 3)
  "threshold": 0.6       // similarity threshold (default: 0.6)
}

// Response (200)
{
  "consolidated": 2,
  "memories": [
    {
      "id": "a1b2c3d4-...",
      "concept": "deployment workflow",
      "content": "When deploying → run migrations first..."
    }
  ]
}

Decay & Retention

POST /api/decay

Run a memory decay sweep. Evaluates all memories and archives stale ones based on the decay policy. Also triggers auto-consolidation of old episodic memories if enabled.

Request body

{
  "dryRun": false    // optional, default false — if true, preview without modifying
}

Response 200

{
  "scannedCount": 101,
  "archivedCount": 12,
  "archivedIds": ["a1b2c3d4-...", "..."],
  "decayedCount": 34,
  "protectedCount": 8,
  "consolidatedCount": 2,
  "newSemanticIds": ["f9a8b7c6-..."],
  "durationMs": 45
}
GET /api/decay/policy

Get the current decay policy configuration.

Response 200

{
  "halfLifeDays": 7,
  "archiveThreshold": 0.05,
  "decayIntervalMs": 3600000,
  "batchSize": 200,
  "importanceDecayRate": 0.01,
  "importanceFloor": 0.05,
  "consolidation": {
    "enabled": true,
    "minClusterSize": 3,
    "similarityThreshold": 0.6,
    "minEpisodicAgeMs": 86400000
  },
  "protectionRuleCount": 4
}
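The policy fields interact roughly as follows: each memory's retention score halves every halfLifeDays, and the memory becomes an archive candidate once the score falls below archiveThreshold. The server's exact scoring formula is not specified here; this is a plausible sketch of plain Ebbinghaus-style exponential decay under those assumptions:

```python
# Sketch only: assumes pure exponential half-life decay, ignoring
# importance weighting, access counts, and protection rules.
def retention_score(age_days, half_life_days=7.0):
    """Score halves every half_life_days (1.0 at age 0)."""
    return 0.5 ** (age_days / half_life_days)

def should_archive(age_days, half_life_days=7.0, archive_threshold=0.05):
    """True once the retention score falls below the archive floor."""
    return retention_score(age_days, half_life_days) < archive_threshold
```

Under these defaults a memory crosses the 0.05 floor after a little more than four half-lives (about a month at halfLifeDays = 7).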
PUT /api/decay/policy

Update the decay policy at runtime. Only the fields you include are changed — others keep their current values.

Request body

{
  "halfLifeDays": 14,             // optional — Ebbinghaus half-life
  "archiveThreshold": 0.1,        // optional — retention score floor
  "decayIntervalMs": 7200000,     // optional — auto-sweep interval (ms)
  "importanceDecayRate": 0.005,   // optional — daily importance reduction
  "importanceFloor": 0.02,        // optional — minimum importance
  "consolidation": {              // optional — auto-consolidation settings
    "enabled": true,
    "minClusterSize": 5,
    "similarityThreshold": 0.7
  }
}

Response 200

{
  "message": "Decay policy updated",
  "halfLifeDays": 14,
  "archiveThreshold": 0.1,
  "decayIntervalMs": 7200000,
  ...
}
WebSocket event: When the auto-decay timer runs and archives memories, a memory:decayed event is emitted on the /neural Socket.io namespace with the full sweep result object.
Auto-linking: Every POST /api/memory call automatically finds the top 3 most similar existing memories (threshold 0.5) and creates bidirectional relates_to edges. The knowledge graph grows organically with every stored memory.
Auto-concepts: If no concept is provided when storing, Engram automatically extracts a short topic label (2–5 words) from the content.

Contradiction Detection

Engram automatically detects when new memories conflict with existing ones. When a memory is stored, the engine finds same-topic candidates via vector similarity, then analyzes content for negation, value changes, temporal overrides, and opposite sentiment. Detected contradictions are linked with contradicts graph edges.

GET /api/contradictions

List all unresolved contradictions — memory pairs linked by 'contradicts' edges where both memories are still active.

Response 200

{
  "count": 1,
  "contradictions": [
    {
      "edgeId":     "e1f2g3h4-...",
      "confidence": 0.637,
      "metadata":   { "signals": ["negation"], "suggestedStrategy": "keep_both" },
      "source": {
        "id": "a1b2c3d4-...",
        "content": "The app does not use PostgreSQL...",
        "type": "semantic",
        "importance": 0.8,
        "createdAt": "2026-03-24T..."
      },
      "target": {
        "id": "e5f6g7h8-...",
        "content": "The app uses PostgreSQL...",
        "type": "semantic",
        "importance": 0.8,
        "createdAt": "2026-03-24T..."
      }
    }
  ]
}
POST /api/contradictions/check/:id

Check a specific memory for contradictions against the existing memory store. Useful for re-scanning older memories.

Response 200

{
  "hasContradictions": true,
  "contradictions": [
    {
      "newMemoryId":      "a1b2c3d4-...",
      "existingMemoryId": "e5f6g7h8-...",
      "similarity":       0.872,
      "confidence":       0.637,
      "signals": [
        { "type": "negation", "description": "Content contains negation...", "weight": 0.75 }
      ],
      "suggestedStrategy": "keep_newest"
    }
  ],
  "candidatesChecked": 5,
  "latencyMs": 18
}
POST /api/contradictions/resolve

Resolve a contradiction between two memories using one of five strategies.

Request body

{
  "sourceId": "a1b2c3d4-...",   // required
  "targetId": "e5f6g7h8-...",   // required
  "strategy": "keep_newest"     // required: keep_newest | keep_oldest | keep_important | keep_both | manual
}

Strategies

Strategy        Action
keep_newest     Archive the older memory, keep the newer one.
keep_oldest     Archive the newer memory, keep the older one.
keep_important  Archive the lower-importance memory.
keep_both       Keep both memories; the contradicts edge remains as documentation.
manual          No action taken — flag for human review.

Response 200

{
  "resolved":   true,
  "archivedId": "e5f6g7h8-...",
  "keptId":     "a1b2c3d4-..."
}
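The archive/keep decision behind each strategy can be sketched as a pure function over the two memories. This is an illustrative client-side model (the function name and dict shape are assumptions), using the createdAt and importance fields from the memory objects:

```python
from datetime import datetime

def pick_archived(source, target, strategy):
    """Return (archived, kept) for a contradiction pair, per strategy.

    `source`/`target` are memory dicts with "createdAt" (ISO timestamp)
    and "importance". keep_both and manual archive nothing, so they
    return (None, None).
    """
    if strategy in ("keep_both", "manual"):
        return None, None
    by_age = lambda m: datetime.fromisoformat(m["createdAt"])
    if strategy == "keep_newest":
        older, newer = sorted((source, target), key=by_age)
        return older, newer          # archive the older memory
    if strategy == "keep_oldest":
        older, newer = sorted((source, target), key=by_age)
        return newer, older          # archive the newer memory
    if strategy == "keep_important":
        loser, winner = sorted((source, target), key=lambda m: m["importance"])
        return loser, winner         # archive the lower-importance memory
    raise ValueError(f"unknown strategy: {strategy}")
```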
GET /api/contradictions/config

Get the current contradiction detection configuration.

Response 200

{
  "enabled":             true,
  "similarityThreshold": 0.65,
  "confidenceThreshold": 0.4,
  "maxCandidates":       10,
  "defaultStrategy":     "keep_both",
  "autoResolve":         false
}
PUT /api/contradictions/config

Update contradiction detection settings at runtime.

Request body

{
  "enabled":             true,       // optional — toggle detection on/off
  "similarityThreshold": 0.7,        // optional — min similarity for same-topic matching
  "confidenceThreshold": 0.5,        // optional — min confidence to flag
  "maxCandidates":       20,         // optional — max candidates to evaluate per store
  "defaultStrategy":     "keep_newest",  // optional — default auto-resolve strategy
  "autoResolve":         true        // optional — auto-resolve using default strategy
}
WebSocket events: When a contradiction is detected on store, a memory:contradiction event is emitted on the /neural namespace. When resolved, a memory:contradiction_resolved event fires.
Signal types: The detector uses four heuristic signals — negation (content negates an existing statement), value_change (explicit change patterns like "switched from X to Y"), temporal_override (newer info supersedes older), and opposite_sentiment (opposing sentiment about the same subject).

Embedding Management

Engram tracks which embedding model generated each memory's vector. When you switch models, the re-embedding pipeline updates stale vectors so search quality stays consistent. Legacy memories (stored before model tracking) can be backfilled without re-embedding.

GET /api/embeddings/status

Get the current embedding model, dimension, and how many memories are current, stale, or legacy.

Response 200

{
  "currentModel":     "Xenova/all-MiniLM-L6-v2",
  "currentDimension": 384,
  "totalEmbedded":    1247,
  "currentModelCount": 1200,
  "staleCount":       0,
  "legacyCount":      47,
  "needsReEmbed":     true
}
POST /api/embeddings/re-embed

Re-embed memories with the current model. Processes in batches; emits WebSocket progress events. Can be slow for large stores.

Request body

{
  "onlyStale":  true,    // optional — only re-embed stale/legacy memories (default: true)
  "batchSize":  32       // optional — batch size 1-100 (default: 32)
}

Response 200

{
  "total":      47,
  "processed":  47,
  "failed":     0,
  "failedIds":  [],
  "durationMs": 2340,
  "model":      "Xenova/all-MiniLM-L6-v2",
  "message":    "Re-embedded 47 memories in 2340ms"
}
POST /api/embeddings/backfill

Tag legacy memories (stored before model tracking) with the current model ID, without actually re-computing their embeddings. Use when you know the same model was used.

Response 200

{
  "currentModel":      "Xenova/all-MiniLM-L6-v2",
  "currentDimension":  384,
  "totalEmbedded":     1247,
  "currentModelCount": 1247,
  "staleCount":        0,
  "legacyCount":       0,
  "needsReEmbed":      false,
  "message":           "Backfilled legacy memories with model: Xenova/all-MiniLM-L6-v2"
}
WebSocket events: During re-embedding, embedding:progress events fire after each batch on the /neural namespace. A final embedding:complete event fires when done.
Supported models: The default is Xenova/all-MiniLM-L6-v2 (384-dim). Other compatible models include Xenova/bge-small-en-v1.5 (384-dim) and Xenova/bge-base-en-v1.5 (768-dim). Set via ENGRAM_EMBEDDING_MODEL env var.

Vector Index Management

Engram persists the in-memory vector index to disk on shutdown and reloads it on startup. This enables fast incremental startup — only new memories added since the last save need to be loaded from the database. Set the index path via ENGRAM_INDEX_PATH env var or it defaults to {dbPath}.index.

GET /api/index/status

Get vector index status — how it was loaded, entry count, persistence info.

Response 200

{
  "loadedFrom":       "disk",         // "disk" | "database" | "not_loaded"
  "entryCount":       1247,
  "dimension":        384,
  "indexPath":        "/data/engram.db.index",
  "indexFileExists":  true,
  "incrementalCount": 3,              // memories added since cache
  "initDurationMs":   45              // startup time
}
POST /api/index/rebuild

Force a full vector index rebuild from the database. Discards any cached index and rebuilds from scratch.

Response 200

{
  "loadedFrom":       "database",
  "entryCount":       1247,
  "initDurationMs":   320,
  "message":          "Index rebuilt: 1247 entries in 320ms"
}
POST /api/index/save

Force save the vector index to disk immediately. Normally happens automatically on shutdown.

Response 200

{
  "entryCount":       1247,
  "indexPath":        "/data/engram.db.index",
  "indexFileExists":  true,
  "message":          "Index saved to /data/engram.db.index"
}
Startup optimization: With a persisted index, startup goes from O(n) full-scan to near-instant cache load + incremental delta. For 10k memories, this can reduce init time from ~2s to ~50ms.

Webhooks

Subscribe external systems to memory lifecycle events via HTTP callbacks. Webhooks fire asynchronously with HMAC-SHA256 signing, exponential backoff retry (3 attempts), and auto-disable after 10 consecutive failures.

POST /api/webhooks

Subscribe a new webhook. Returns the created subscription.

Request body

{
  "url":         "https://example.com/engram-hook",   // required
  "events":      ["stored", "forgotten"],             // required: stored | forgotten | decayed | consolidated | contradiction
  "secret":      "my-shared-secret",                  // optional — enables HMAC-SHA256 signing
  "description": "Notify Slack on new memories"       // optional
}

Response 201

{
  "id":              "a1b2c3d4-...",
  "url":             "https://example.com/engram-hook",
  "events":          ["stored", "forgotten"],
  "active":          true,
  "description":     "Notify Slack on new memories",
  "secret":          "my-shared-secret",
  "failCount":       0,
  "createdAt":       "2026-03-25T...",
  "lastTriggeredAt": null
}
GET /api/webhooks

List all webhook subscriptions. Use ?activeOnly=true to filter.

Response 200

{
  "count": 2,
  "webhooks": [ /* subscription objects */ ]
}
DELETE /api/webhooks/:id

Delete a webhook subscription.

Response 204 No Content

POST /api/webhooks/:id/test

Send a test event to verify the webhook endpoint is reachable.

Response 200

{
  "webhookId":  "a1b2c3d4-...",
  "url":        "https://example.com/engram-hook",
  "success":    true,
  "statusCode": 200,
  "attempts":   1
}
Webhook payload format: Each delivery is a POST with JSON body: { "event": "stored", "timestamp": "...", "data": { ... } }. When a secret is configured, the X-Engram-Signature header contains sha256=<hmac>.
Events: stored (new memory), forgotten (archived), decayed (sweep completed), consolidated (episodes merged), contradiction (conflict detected).
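A receiver can verify the X-Engram-Signature header before trusting a delivery. A minimal sketch, assuming the HMAC-SHA256 is computed over the raw request body bytes (the common convention for this header shape; confirm against your deployment):

```python
import hashlib
import hmac
import json

def verify_signature(secret, body_bytes, signature_header):
    """Check an X-Engram-Signature header of the form sha256=<hex hmac>.

    Uses a constant-time comparison to avoid timing side channels.
    Assumes the HMAC is taken over the raw request body.
    """
    expected = "sha256=" + hmac.new(secret.encode(), body_bytes, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature_header)

# Simulate a delivery signed with the shared secret from the example above:
body = json.dumps({"event": "stored", "timestamp": "2026-03-25T10:00:00Z", "data": {}}).encode()
sig = "sha256=" + hmac.new(b"my-shared-secret", body, hashlib.sha256).hexdigest()
```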

Tagging & Collections

Browse and filter memories by tags. Tags are stored as JSON arrays on each memory. Collections group tags by prefix (e.g. project:alpha, topic:ml).

GET /api/tags

Get the tag cloud — all unique tags with memory counts, sorted by count descending.

Response 200

{
  "count": 8,
  "tags": [
    { "tag": "typescript", "count": 42 },
    { "tag": "backend",    "count": 28 },
    { "tag": "project:alpha", "count": 15 }
  ]
}
GET /api/tags/:tag

Get all memories with a specific tag. Supports pagination via ?limit and ?offset.

Response 200

{
  "tag": "typescript",
  "count": 3,
  "memories": [ /* memory objects */ ]
}
GET /api/collections

Get collections — tags grouped by prefix. Tags without a colon go into the 'default' collection.

Response 200

{
  "count": 3,
  "collections": [
    {
      "name": "project",
      "prefix": "project",
      "tags": [
        { "tag": "project:alpha", "count": 15 },
        { "tag": "project:beta",  "count": 8 }
      ],
      "totalMemories": 23
    }
  ]
}
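The grouping rule is simple: everything before the first colon names the collection, and colon-free tags land in "default". An illustrative reimplementation of that grouping over a GET /api/tags response:

```python
from collections import defaultdict

def group_collections(tag_counts):
    """Group a tag cloud into collections by colon-prefix convention.

    `tag_counts`: list of {"tag": ..., "count": ...} entries as returned
    by GET /api/tags. Tags without a colon go into "default".
    """
    groups = defaultdict(list)
    for entry in tag_counts:
        tag = entry["tag"]
        prefix = tag.split(":", 1)[0] if ":" in tag else "default"
        groups[prefix].append(entry)
    return [
        {"name": name, "prefix": name, "tags": tags,
         "totalMemories": sum(t["count"] for t in tags)}
        for name, tags in groups.items()
    ]

cloud = [{"tag": "project:alpha", "count": 15},
         {"tag": "project:beta",  "count": 8},
         {"tag": "typescript",    "count": 42}]
collections = group_collections(cloud)
```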
POST /api/memory/:id/tags

Add a tag to a memory. Idempotent — adding an existing tag is a no-op.

Request body

{ "tag": "important" }

Response 200

{ "id": "a1b2c3d4-...", "tags": ["existing", "important"] }
DELETE /api/memory/:id/tags/:tag

Remove a tag from a memory.

Response 200

{ "id": "a1b2c3d4-...", "tags": ["remaining-tag"] }

Plugin System

Engram supports a formalized plugin system for community extensions. Plugins register hooks that fire at key lifecycle events: onStore, onRecall, onForget, onDecay, onStartup, onShutdown. Errors in plugins are isolated — they never break core operations.

GET /api/plugins

List all registered plugins with their hooks and metadata.

Response 200

{
  "count": 2,
  "plugins": [
    {
      "id":           "my-org/logger",
      "name":         "Memory Logger",
      "version":      "1.0.0",
      "description":  "Logs all memory events to console",
      "hooks":        ["onStore", "onRecall", "onForget"],
      "registeredAt": "2026-03-25T..."
    }
  ]
}
GET /api/plugins/:id

Get a single registered plugin by ID.

DELETE /api/plugins/:id

Unregister a plugin by ID. Its hooks will no longer fire.

Response 204 No Content

Plugin manifest: Each plugin provides id (unique identifier), name, version, optional description, and a hooks object with lifecycle callbacks. Plugins run in registration order.
Hook types: onStore (after memory stored), onRecall (after context assembled), onForget (after archive), onDecay (after sweep), onStartup (after brain init), onShutdown (before close).