Add memory to a Claude chatbot
Give your Claude-powered chatbot persistent, synthesized memory across sessions. Users won't have to re-introduce themselves.
Note
What you'll build: A chatbot that remembers each user's preferences, background, and recent context — personalising every response without asking the user to repeat themselves.
How it works
1. Before each turn: fetch memory
   Call GET /v1/context with userId and the user's current message as q. The query steers which relevant chunks surface alongside the synthesized profile.
2. Inject into system prompt
   Prepend the static, dynamic, and (optionally) relevant facts to your Claude system prompt.
3. After each turn: ingest (fire-and-forget)
   POST the conversation turn to /v1/ingest asynchronously. Never block the response on it.
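Step 1 can be sketched as a small URL builder. This is a minimal illustration that assumes only the userId and q query parameters described above; the helper name buildContextUrl is hypothetical and not part of any Anansi SDK.

```typescript
// Hypothetical helper: builds the GET /v1/context URL for one turn.
// encodeURIComponent escapes characters such as "&", "?", and spaces,
// so free-form user text cannot break the query string.
function buildContextUrl(baseUrl: string, userId: string, query: string): string {
  const params = `userId=${encodeURIComponent(userId)}&q=${encodeURIComponent(query)}`;
  return `${baseUrl}/v1/context?${params}`;
}

console.log(
  buildContextUrl("https://anansimemory.com", "user_abc", "How does BullMQ handle retries?")
);
```

Encoding matters here because q carries the raw user message, which can contain any character.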
Full implementation
typescript (chatbot.ts)
import Anthropic from "@anthropic-ai/sdk";
import { randomUUID } from "crypto";

const anthropic = new Anthropic();
const ANANSI_URL = "https://anansimemory.com";
const ANANSI_KEY = process.env.ANANSI_API_KEY!;

interface Memory {
  static: string[];
  dynamic: string[];
  relevant: { content: string; similarity: number; metadata: Record<string, unknown> }[];
}

// Fetch the synthesized profile plus the chunks relevant to the current message.
// On a non-2xx response, fall back to empty memory so the chat still works.
async function getMemory(userId: string, query: string): Promise<Memory> {
  const res = await fetch(
    `${ANANSI_URL}/v1/context?userId=${encodeURIComponent(userId)}&q=${encodeURIComponent(query)}`,
    { headers: { Authorization: `Bearer ${ANANSI_KEY}` } }
  );
  if (!res.ok) return { static: [], dynamic: [], relevant: [] };
  return res.json();
}

// Turn the memory object into a system prompt, one section per non-empty array.
function buildSystemPrompt(memory: Memory): string {
  const lines = ["You are a helpful assistant with persistent memory of this user."];
  if (memory.static.length) {
    lines.push("\n## About this user");
    memory.static.forEach((f) => lines.push(`- ${f}`));
  }
  if (memory.dynamic.length) {
    lines.push("\n## What they're working on right now");
    memory.dynamic.forEach((d) => lines.push(`- ${d}`));
  }
  if (memory.relevant.length) {
    lines.push("\n## Relevant context for this message");
    // Cap at three chunks to keep the prompt short
    memory.relevant.slice(0, 3).forEach((r) => lines.push(`- ${r.content}`));
  }
  return lines.join("\n");
}

export async function chat(
  userId: string,
  userMessage: string,
  history: { role: "user" | "assistant"; content: string }[],
  sessionId?: string
) {
  // Reuse the caller's sessionId, or start a new session on the first turn
  const sid = sessionId ?? randomUUID();
  const memory = await getMemory(userId, userMessage);

  const response = await anthropic.messages.create({
    model: "claude-haiku-4-5-20251001",
    system: buildSystemPrompt(memory),
    messages: [...history, { role: "user", content: userMessage }],
    max_tokens: 1024,
  });
  const reply = response.content[0].type === "text" ? response.content[0].text : "";

  // Fire-and-forget: never block the user-facing response on this
  fetch(`${ANANSI_URL}/v1/ingest`, {
    method: "POST",
    headers: { Authorization: `Bearer ${ANANSI_KEY}`, "Content-Type": "application/json" },
    body: JSON.stringify({
      userId,
      content: `User: ${userMessage}\nAssistant: ${reply}`,
      sourceType: "conversation",
      sessionId: sid,
    }),
  }).catch(() => {});

  return { reply, sessionId: sid };
}
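chat() returns the reply and the sessionId but leaves the history array to the caller. One way to carry history between turns is sketched below. The Turn type and the appendTurn helper are hypothetical names, and the 20-message cap is an arbitrary assumption rather than a documented limit.

```typescript
type Turn = { role: "user" | "assistant"; content: string };

// Hypothetical helper: returns a new history with the latest exchange appended,
// keeping roughly the most recent `maxMessages` entries so the prompt stays bounded.
function appendTurn(
  history: Turn[],
  userMessage: string,
  reply: string,
  maxMessages = 20
): Turn[] {
  const next: Turn[] = [
    ...history,
    { role: "user", content: userMessage },
    { role: "assistant", content: reply },
  ];
  // Drop the oldest messages in pairs so the history still starts with a user turn.
  const overflow = Math.max(0, next.length - maxMessages);
  const drop = overflow + (overflow % 2);
  return next.slice(drop);
}
```

A caller would run something like `history = appendTurn(history, userMessage, reply)` after each chat() call and pass the returned sessionId back in on the next turn.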
Tip
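Fire-and-forget ingest: Never await the ingest call. Awaiting it adds 50–150ms of unnecessary latency to every response. Start the request, attach a .catch handler so a failed ingest cannot surface as an unhandled rejection, and return the reply without waiting for it.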
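The pattern in the tip can be wrapped in a small helper so that failures are logged rather than silently dropped. This is a sketch only: fireAndForget is a hypothetical name, and the rejected promise below is a stand-in for the real POST to /v1/ingest.

```typescript
// Hypothetical helper: starts an async task without awaiting it.
// Any failure, synchronous or asynchronous, is routed to `onError`
// instead of surfacing as an unhandled promise rejection.
function fireAndForget(
  task: () => Promise<unknown>,
  onError: (err: unknown) => void = () => {}
): void {
  try {
    task().catch(onError);
  } catch (err) {
    onError(err);
  }
}

// Stand-in task; in chat() this would be the fetch call that POSTs to /v1/ingest.
fireAndForget(
  () => Promise.reject(new Error("ingest failed")),
  (err) => console.warn("memory ingest failed:", err)
);
```

The helper returns immediately, so the caller's latency does not depend on how long the ingest request takes.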
Using relevant chunks
The relevant array contains the top vector-search hits for the user's current message. Use it for specific factual lookups that go beyond the synthesized profile:
typescript
// Assumes `system` was declared with `let` and already holds the prompt
// built from the static and dynamic facts.
// Only inject relevant chunks if they're high-confidence (similarity > 0.7)
const highConfidence = memory.relevant.filter((r) => r.similarity > 0.7);
if (highConfidence.length) {
  system += "\n\n## Directly relevant context\n";
  system += highConfidence.map((r) => `- ${r.content}`).join("\n");
}
The static and dynamic arrays are the synthesized profile — always use those. The relevant array is bonus signal — use it when the query is specific enough that raw chunks add value.
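The 0.7 threshold above and the three-chunk cap used in buildSystemPrompt can be combined in one selection step. The sketch below uses a hypothetical selectRelevant helper. It sorts explicitly because this page does not state whether the API returns chunks in similarity order.

```typescript
interface RelevantChunk {
  content: string;
  similarity: number;
  metadata: Record<string, unknown>;
}

// Hypothetical helper: keeps chunks above a similarity threshold,
// strongest first, capped at `maxChunks`.
function selectRelevant(
  chunks: RelevantChunk[],
  minSimilarity = 0.7,
  maxChunks = 3
): RelevantChunk[] {
  return chunks
    .filter((c) => c.similarity > minSimilarity) // filter() returns a copy, so the input is not mutated
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, maxChunks);
}
```

Both defaults are starting points taken from the snippets on this page; tune them against your own data.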
Session grouping with sessionId
Pass a sessionId to group conversation turns. Anansi uses it during synthesis to distinguish "what happened in this session" vs "long-term history".
typescript
// Generate once per chat session, reuse for every turn in that session
const sessionId = randomUUID(); // e.g. "3f7a2b1c-..."

// Pass it to every ingest call in the session (fire-and-forget, not awaited)
fetch(`${ANANSI_URL}/v1/ingest`, {
  method: "POST",
  headers: { Authorization: `Bearer ${ANANSI_KEY}`, "Content-Type": "application/json" },
  body: JSON.stringify({
    userId: "user_abc",
    content: "User: How does BullMQ handle retries?\nAssistant: ...",
    sourceType: "conversation",
    sessionId, // same UUID for every turn in this session
  }),
}).catch(() => {});
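On a server that handles many conversations, something has to remember which sessionId belongs to which chat. A minimal in-memory sketch follows. SessionRegistry, the conversation key format, and the 30-minute idle window are all assumptions made for illustration; this page does not define when a session ends.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical in-memory registry: one sessionId per conversation key,
// replaced with a fresh one after a period of inactivity.
class SessionRegistry {
  private sessions = new Map<string, { id: string; lastSeen: number }>();
  private idleMs: number;

  constructor(idleMs: number = 30 * 60 * 1000) {
    this.idleMs = idleMs;
  }

  // Returns the existing sessionId for this conversation, or mints a new one
  // if none exists or the last turn was more than `idleMs` ago.
  getSessionId(conversationKey: string, now: number = Date.now()): string {
    const existing = this.sessions.get(conversationKey);
    if (existing && now - existing.lastSeen <= this.idleMs) {
      existing.lastSeen = now;
      return existing.id;
    }
    const id = randomUUID();
    this.sessions.set(conversationKey, { id, lastSeen: now });
    return id;
  }
}
```

A process-local Map is lost on restart and is not shared across instances, so a multi-instance deployment would need a shared store instead.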
What the profile looks like after a few sessions
json
{
  "static": [
    "Senior engineer at a fintech startup",
    "Prefers concise answers without preamble",
    "Uses TypeScript, BullMQ, and Postgres"
  ],
  "dynamic": [
    "Debugging a webhook deduplication issue this week",
    "Asked about Stripe idempotency keys in the last session"
  ],
  "relevant": [
    {
      "content": "User: What's the BullMQ default retry delay?\nAssistant: Exponential backoff starting at 1s.",
      "similarity": 0.89,
      "metadata": { "sessionId": "3f7a2b1c-...", "timestamp": "2026-06-08T..." }
    }
  ]
}
Claude receives a system prompt built from this profile before every message, so users do no extra work and never have to re-introduce themselves across sessions.
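Because res.json() is untyped at runtime, it can help to check that a response actually has this shape before building a prompt from it. The guard below is a sketch under the assumption that the three fields shown above are always present; isMemory is a hypothetical helper, and it does not inspect metadata.

```typescript
interface Memory {
  static: string[];
  dynamic: string[];
  relevant: { content: string; similarity: number; metadata: Record<string, unknown> }[];
}

const isStringArray = (v: unknown): v is string[] =>
  Array.isArray(v) && v.every((s) => typeof s === "string");

// Hypothetical runtime check that a parsed /v1/context body matches the Memory shape.
function isMemory(value: unknown): value is Memory {
  if (typeof value !== "object" || value === null) return false;
  const m = value as Record<string, unknown>;
  return (
    isStringArray(m.static) &&
    isStringArray(m.dynamic) &&
    Array.isArray(m.relevant) &&
    m.relevant.every(
      (r) =>
        typeof r === "object" &&
        r !== null &&
        typeof (r as Record<string, unknown>).content === "string" &&
        typeof (r as Record<string, unknown>).similarity === "number"
    )
  );
}
```

In getMemory, the parsed body could be passed through isMemory, falling back to the same empty memory object used for non-2xx responses when the check fails.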