Add memory to a Claude chatbot

Give your Claude-powered chatbot persistent, synthesized memory across sessions. Users won't have to re-introduce themselves.

Note
What you'll build: A chatbot that remembers each user's preferences, background, and recent context — personalising every response without asking the user to repeat themselves.

How it works

1. Before each turn: fetch memory

Call GET /v1/context with userId and the user's current message as q. The query steers which relevant chunks surface alongside the synthesized profile.

2. Inject into system prompt

Add the static and dynamic facts, and optionally the relevant chunks, to your Claude system prompt.

3. After each turn: ingest (fire-and-forget)

POST the conversation turn to /v1/ingest asynchronously; never block the response on it.

Full implementation

typescript chatbot.ts
import Anthropic from "@anthropic-ai/sdk";
import { randomUUID } from "crypto";

const anthropic = new Anthropic();
const ANANSI_URL = "https://anansimemory.com";
const ANANSI_KEY = process.env.ANANSI_API_KEY!;

interface Memory {
  static: string[];
  dynamic: string[];
  relevant: { content: string; similarity: number; metadata: Record<string, unknown> }[];
}

// Step 1: fetch the synthesized profile plus chunks relevant to the current message
async function getMemory(userId: string, query: string): Promise<Memory> {
  const empty: Memory = { static: [], dynamic: [], relevant: [] };
  try {
    const res = await fetch(
      `${ANANSI_URL}/v1/context?userId=${encodeURIComponent(userId)}&q=${encodeURIComponent(query)}`,
      { headers: { Authorization: `Bearer ${ANANSI_KEY}` } }
    );
    // On an error status, fall back to an empty profile rather than failing the chat
    if (!res.ok) return empty;
    return (await res.json()) as Memory;
  } catch {
    // Network failure or unparseable body: same fallback
    return empty;
  }
}

// Step 2: render the memory into a system prompt
function buildSystemPrompt(memory: Memory): string {
  const lines = ["You are a helpful assistant with persistent memory of this user."];
  if (memory.static.length) {
    lines.push("\n## About this user");
    memory.static.forEach((f) => lines.push(`- ${f}`));
  }
  if (memory.dynamic.length) {
    lines.push("\n## What they're working on right now");
    memory.dynamic.forEach((d) => lines.push(`- ${d}`));
  }
  if (memory.relevant.length) {
    lines.push("\n## Relevant context for this message");
    memory.relevant.slice(0, 3).forEach((r) => lines.push(`- ${r.content}`));
  }
  return lines.join("\n");
}

export async function chat(
  userId: string,
  userMessage: string,
  history: { role: "user" | "assistant"; content: string }[],
  sessionId?: string
) {
  // Reuse the caller's sessionId, or start a new session
  const sid = sessionId ?? randomUUID();
  const memory = await getMemory(userId, userMessage);

  const response = await anthropic.messages.create({
    model: "claude-haiku-4-5-20251001",
    system: buildSystemPrompt(memory),
    messages: [...history, { role: "user", content: userMessage }],
    max_tokens: 1024,
  });
  const reply = response.content[0].type === "text" ? response.content[0].text : "";

  // Step 3, fire-and-forget: never block the user-facing response on this
  fetch(`${ANANSI_URL}/v1/ingest`, {
    method: "POST",
    headers: { Authorization: `Bearer ${ANANSI_KEY}`, "Content-Type": "application/json" },
    body: JSON.stringify({
      userId,
      content: `User: ${userMessage}\nAssistant: ${reply}`,
      sourceType: "conversation",
      sessionId: sid,
    }),
  }).catch(() => {});

  return { reply, sessionId: sid };
}

Tip
Fire-and-forget ingest: Never await the ingest call. Awaiting it adds 50–150 ms of unnecessary latency to every response. Start the request, then return the reply without waiting for it to finish.
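
The fire-and-forget pattern can be factored into a small helper. The full implementation swallows ingest errors with an empty catch; the sketch below keeps the call non-blocking but routes failures to a handler so they stay visible. `fireAndForget` and its `onError` parameter are hypothetical names for this illustration and are not part of the Anansi API or the Anthropic SDK.

```typescript
// Hypothetical helper: start an async task without awaiting it, and route
// any failure to a handler so it never becomes an unhandled rejection.
function fireAndForget(
  task: () => Promise<unknown>,
  onError: (err: unknown) => void = (err) => console.warn("background task failed:", err)
): void {
  try {
    // Deliberately not awaited: the caller continues immediately.
    task().catch(onError);
  } catch (err) {
    // Covers a task that throws synchronously before returning a promise.
    onError(err);
  }
}
```

In chat(), the ingest request could then be wrapped as fireAndForget(() => fetch(...)) in place of the trailing .catch(() => {}).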

Using relevant chunks

The relevant array contains the top vector-search hits for the user's current message. Use it for specific factual lookups that go beyond the synthesized profile:

typescript
// Assumes `memory` is the result of getMemory() and `system` is the prompt string built so far.
// Only inject relevant chunks if they're high-confidence (similarity > 0.7)
const highConfidence = memory.relevant.filter((r) => r.similarity > 0.7);
if (highConfidence.length) {
  system += "\n\n## Directly relevant context\n";
  system += highConfidence.map((r) => `- ${r.content}`).join("\n");
}

The static and dynamic arrays are the synthesized profile — always use those. The relevant array is bonus signal — use it when the query is specific enough that raw chunks add value.
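
One way to apply this rule is a single selection step that combines the similarity threshold with a cap on how many chunks reach the prompt. The sketch below is illustrative: `selectRelevant` is a hypothetical helper, the 0.7 threshold and the cap of 3 are the values used in the snippets on this page and may need tuning, and the sort is defensive because this page does not state whether the API returns hits in ranked order.

```typescript
interface RelevantChunk {
  content: string;
  similarity: number;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: keep only confident hits, best first, capped at a limit.
function selectRelevant(
  chunks: RelevantChunk[],
  minSimilarity = 0.7,
  maxChunks = 3
): RelevantChunk[] {
  return chunks
    .filter((c) => c.similarity > minSimilarity) // filter() returns a new array
    .sort((a, b) => b.similarity - a.similarity) // so sorting does not mutate the input
    .slice(0, maxChunks);
}

const picked = selectRelevant([
  { content: "A", similarity: 0.65, metadata: {} },
  { content: "B", similarity: 0.89, metadata: {} },
  { content: "C", similarity: 0.74, metadata: {} },
]);
console.log(picked.map((c) => c.content)); // B then C; A falls below the threshold
```

When nothing clears the threshold the result is an empty array, so the prompt falls back to the static and dynamic profile alone.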

Session grouping with sessionId

Pass a sessionId to group conversation turns. Anansi uses it during synthesis to distinguish "what happened in this session" vs "long-term history".

typescript
import { randomUUID } from "crypto";

// ANANSI_URL and ANANSI_KEY are the constants defined in chatbot.ts.

// Generate once per chat session, reuse for every turn in that session
const sessionId = randomUUID(); // e.g. "3f7a2b1c-..."

// Pass it to every ingest call in the session
await fetch(`${ANANSI_URL}/v1/ingest`, {
  method: "POST",
  headers: { Authorization: `Bearer ${ANANSI_KEY}`, "Content-Type": "application/json" },
  body: JSON.stringify({
    userId: "user_abc",
    content: "User: How does BullMQ handle retries?\nAssistant: ...",
    sourceType: "conversation",
    sessionId, // same UUID for every turn in this session
  }),
});

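On a server that handles many chats at once, "generate once, reuse for every turn" needs somewhere to keep the ID between turns. A minimal sketch, assuming a single Node.js process and an application-level key for each open chat: `sessionIdFor`, `endSession`, and the `chatKey` parameter are hypothetical names, and a multi-instance deployment would need shared storage instead of an in-memory Map.

```typescript
import { randomUUID } from "crypto";

// Hypothetical in-memory registry: one sessionId per open chat.
const sessionIds = new Map<string, string>();

// Return the sessionId for this chat, creating it on the first turn.
function sessionIdFor(chatKey: string): string {
  let id = sessionIds.get(chatKey);
  if (id === undefined) {
    id = randomUUID();
    sessionIds.set(chatKey, id);
  }
  return id;
}

// Forget the ID when the chat ends so the next chat starts a new session.
function endSession(chatKey: string): void {
  sessionIds.delete(chatKey);
}
```

The value returned by sessionIdFor can be passed as the sessionId argument of chat() on every turn.
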
What the profile looks like after a few sessions

json
{
  "static": [
    "Senior engineer at a fintech startup",
    "Prefers concise answers without preamble",
    "Uses TypeScript, BullMQ, and Postgres"
  ],
  "dynamic": [
    "Debugging a webhook deduplication issue this week",
    "Asked about Stripe idempotency keys in the last session"
  ],
  "relevant": [
    {
      "content": "User: What's the BullMQ default retry delay?\nAssistant: Exponential backoff starting at 1s.",
      "similarity": 0.89,
      "metadata": { "sessionId": "3f7a2b1c-...", "timestamp": "2026-06-08T..." }
    }
  ]
}

Claude receives this before every message — zero extra work from the user, zero re-introduction across sessions.
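
Because the parsed response body is untyped at runtime, it can help to normalise it before it reaches the prompt builder. The sketch below assumes the response has the shape of the sample above and is not an official schema; `parseMemory`, `isStringArray`, and `isRelevantChunk` are hypothetical helpers introduced for illustration.

```typescript
interface RelevantChunk {
  content: string;
  similarity: number;
  metadata: Record<string, unknown>;
}

interface Memory {
  static: string[];
  dynamic: string[];
  relevant: RelevantChunk[];
}

function isStringArray(value: unknown): value is string[] {
  return Array.isArray(value) && value.every((item) => typeof item === "string");
}

function isRelevantChunk(value: unknown): value is RelevantChunk {
  if (typeof value !== "object" || value === null) return false;
  const chunk = value as Record<string, unknown>;
  return (
    typeof chunk.content === "string" &&
    typeof chunk.similarity === "number" &&
    typeof chunk.metadata === "object" &&
    chunk.metadata !== null
  );
}

// Hypothetical helper: coerce an untrusted JSON body into a Memory value.
// Missing or malformed fields become empty arrays; malformed chunks are dropped.
function parseMemory(raw: unknown): Memory {
  const body = typeof raw === "object" && raw !== null ? (raw as Record<string, unknown>) : {};
  const { static: staticFacts, dynamic, relevant } = body;
  return {
    static: isStringArray(staticFacts) ? staticFacts : [],
    dynamic: isStringArray(dynamic) ? dynamic : [],
    relevant: Array.isArray(relevant) ? relevant.filter(isRelevantChunk) : [],
  };
}
```

With a helper like this, getMemory could return parseMemory(await res.json()), so an unexpected response degrades to an empty profile instead of throwing inside buildSystemPrompt.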