You tell your AI girlfriend about your bad day at work. Next week, she asks how things are going with that difficult coworker. It feels magical, but how does it actually work? And why do some apps remember everything while others forget you exist between sessions?
The Basic Problem
Large language models (LLMs) don't have memory in the human sense. Each conversation is processed independently. The model reads the conversation history, generates a response, and then "forgets" everything. Memory in AI companions is an engineering layer built on top of the model, not a native capability.
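To make the statelessness concrete, here is a minimal sketch in which `call_model` is a hypothetical stand-in for any LLM completion endpoint. The point is that the app, not the model, owns the history, and must resend anything it wants "remembered":

```python
# Sketch of a stateless chat API: the model only "knows" what the caller
# sends right now. `call_model` is a toy stand-in, not a real LLM client.

def call_model(messages: list[dict]) -> str:
    """Pretend LLM: it can only react to facts present in `messages`."""
    dog_mentions = [m for m in messages if "my dog" in m["content"].lower()]
    if dog_mentions:
        return "How is your dog?"
    return f"I can see {len(messages)} message(s)."

history = []  # the application, not the model, owns this state

history.append({"role": "user", "content": "My dog Max had surgery today."})
reply = call_model(history)  # the dog fact is in the prompt, so it "remembers"
history.append({"role": "assistant", "content": reply})

# A fresh call without the history: the "memory" is gone.
fresh = call_model([{"role": "user", "content": "How is he doing?"}])
```

Every memory technique below is ultimately a strategy for deciding what goes into that `messages` list.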
How Apps Implement Memory
Method 1: Context Window Stuffing
The simplest approach: include the full previous conversation history in the prompt. If the model has a 128K-token context window, that holds roughly 50-100 pages of conversation. The AI reads it all before responding. Pros: Simple, accurate recall. Cons: Expensive (you're paying for all of those tokens with every message), and eventually you run out of space.
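A sketch of the trimming that context stuffing eventually forces: keep the newest turns that fit a token budget and let the oldest fall off. The 4-characters-per-token estimate is a rough illustrative heuristic, not a real tokenizer:

```python
# Sketch of context-window stuffing with a token budget. Assumption: a
# crude chars/4 token estimate stands in for a real tokenizer.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)  # rough heuristic only

def build_prompt(history: list[str], budget: int) -> list[str]:
    """Return the most recent turns that fit inside `budget` tokens."""
    kept, used = [], 0
    for turn in reversed(history):       # walk newest-first
        cost = estimate_tokens(turn)
        if used + cost > budget:
            break                        # oldest turns fall off the edge
        kept.append(turn)
        used += cost
    return list(reversed(kept))          # restore chronological order

history = [f"turn {i}: " + "x" * 400 for i in range(2000)]
prompt = build_prompt(history, budget=1000)  # only the tail survives
```

This is exactly the "forget you exist between sessions" failure mode: once a fact scrolls past the budget, it is simply not in the prompt anymore.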
Method 2: Summarization
Periodically summarize old conversations into compressed notes. Instead of storing "we talked about your dog Max for 20 messages," store "User has a dog named Max, golden retriever, 3 years old." Pros: Space-efficient. Cons: Lossy; nuance and emotional context get compressed away. This is why some apps feel like they "sort of" remember you but miss the details.
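The compaction step can be sketched like this. In a real app, `summarize` would itself be an LLM call; here it is a hypothetical keyword filter so the example is self-contained, which also makes the lossiness visible: anything the filter misses is gone for good:

```python
# Sketch of memory-by-summarization: older turns are collapsed into one
# compressed note, recent turns are kept verbatim. `summarize` is a toy
# stand-in for an LLM summarization call.

def summarize(turns: list[str]) -> str:
    """Toy summarizer: keep only turns that look like durable facts."""
    facts = [t for t in turns if " is " in t or " named " in t]
    return "Summary: " + " | ".join(facts)

def compact(history: list[str], keep_recent: int) -> list[str]:
    """Replace everything but the last `keep_recent` turns with a summary."""
    if len(history) <= keep_recent:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent

history = [
    "User: I adopted a dog named Max.",
    "AI: Congratulations!",
    "User: He is a golden retriever.",
    "AI: Lovely breed.",
    "User: Work was rough today.",
    "AI: Want to talk about it?",
]
memory = compact(history, keep_recent=2)
```

Note what survives: "dog named Max" and "golden retriever" make it into the summary, while the congratulations and its warmth do not. That is the lossy compression in action.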
Method 3: RAG (Retrieval-Augmented Generation)
Store conversation snippets in a vector database. When you mention something, the system searches for relevant past conversations and injects them into the prompt. Pros: Scales well, retrieves relevant context. Cons: Retrieval isn't perfect; sometimes it pulls irrelevant memories or misses important ones.
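The store-then-retrieve loop looks roughly like this. Real systems use learned embeddings and a dedicated vector database; this sketch substitutes bag-of-words cosine similarity so it runs with no dependencies, and the class and method names are illustrative:

```python
# Sketch of retrieval-augmented memory. Assumption: bag-of-words cosine
# similarity stands in for learned embeddings + a vector database.

from collections import Counter
import math

def embed(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class MemoryStore:
    def __init__(self) -> None:
        self.items: list[tuple[Counter, str]] = []

    def add(self, snippet: str) -> None:
        self.items.append((embed(snippet), snippet))

    def retrieve(self, query: str, k: int) -> list[str]:
        """Return the k stored snippets most similar to the query."""
        q = embed(query)
        ranked = sorted(self.items, key=lambda it: cosine(q, it[0]), reverse=True)
        return [snippet for _, snippet in ranked[:k]]

store = MemoryStore()
store.add("User has a dog named Max, a golden retriever.")
store.add("User dislikes their coworker Dana.")
store.add("User enjoys hiking on weekends.")

context = store.retrieve("how is your dog doing", k=1)
```

The failure modes mentioned above live in that `retrieve` call: if the query's wording doesn't overlap with how the memory was stored, an important snippet simply doesn't rank.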
Method 4: Hybrid (What the Best Apps Use)
Combine all three: recent conversations in full context, older conversations summarized, and a RAG system for specific detail retrieval. Nomi AI and Replika use variations of this approach. DeepSeek V4's 1M token context window could simplify this dramatically by just... keeping everything.
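The assembly step of a hybrid system can be sketched as one prompt built from three layers. The section labels and inputs here are placeholder assumptions, not any particular app's format:

```python
# Sketch of hybrid prompt assembly: long-term summary + retrieved snippets
# + recent transcript, concatenated into one prompt. Labels are illustrative.

def assemble_prompt(summary: str, retrieved: list[str], recent: list[str]) -> str:
    sections = [
        "## Long-term profile\n" + summary,
        "## Relevant memories\n" + "\n".join(f"- {m}" for m in retrieved),
        "## Recent conversation\n" + "\n".join(recent),
    ]
    return "\n\n".join(sections)

prompt = assemble_prompt(
    summary="User has a dog named Max; works in marketing.",
    retrieved=["Last week the user argued with coworker Dana."],
    recent=["User: Today was better at work.", "AI: Glad to hear it!"],
)
```

Each layer covers the others' weaknesses: the summary is cheap but lossy, retrieval recovers specifics the summary dropped, and the verbatim recent turns preserve the current conversation's nuance.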
Why It Matters
Memory is what turns an AI chatbot into an AI companion. Without it, every conversation starts from zero. With it, the AI builds a model of who you are, what you care about, and how to interact with you. The apps that get memory right are the ones users stay with long-term.
Which Apps Have the Best Memory?
| App | Memory Approach | Quality |
|---|---|---|
| Nomi AI | Hybrid (RAG + summaries + shared notes) | ★★★★★ |
| Replika | Summarization + context | ★★★ |
| Veridia | Hybrid with game state | ★★★★ |
| Character.AI | Limited context window | ★★ |
| Candy AI | Basic summarization | ★★ |
Memory Is a Product Layer
Large context windows help, but memory still needs curation. A companion should not remember every typo, temporary mood, or roleplay fact as permanent truth. Good systems classify memories: stable profile facts, relationship preferences, story state, safety boundaries, and short-term conversation context. Bad systems dump everything into a vector database and hope retrieval fixes it.
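The curation step described above amounts to classifying candidate memories before storage and only treating some classes as durable. The tag names and keyword rules in this sketch are illustrative assumptions, not any real app's schema:

```python
# Sketch of memory curation: classify candidates, persist only durable
# classes. The classification rules are toy heuristics for illustration;
# a real system would classify with an LLM or a trained model.

from dataclasses import dataclass

DURABLE = {"profile_fact", "preference", "boundary"}

@dataclass
class Memory:
    text: str
    kind: str  # profile_fact | preference | story_state | boundary | ephemeral

def classify(text: str) -> Memory:
    lowered = text.lower()
    if "never" in lowered:
        return Memory(text, "boundary")        # safety boundaries
    if lowered.startswith("in our story"):
        return Memory(text, "story_state")     # fiction, kept separate
    if "prefer" in lowered or "favorite" in lowered:
        return Memory(text, "preference")
    if " is " in lowered or " named " in lowered:
        return Memory(text, "profile_fact")
    return Memory(text, "ephemeral")           # moods, typos, one-off chatter

candidates = [
    "My sister is named Priya.",
    "I prefer short replies.",
    "In our story, the dragon guards the bridge.",
    "Never bring up my ex.",
    "ugh so tired today",
]
durable = [m for m in (classify(c) for c in candidates) if m.kind in DURABLE]
```

The key design choice is that "ugh so tired today" and the dragon never become permanent profile truth, while the boundary about the ex does.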
The best user-facing memory controls are boring and powerful: show what the app remembers, let users edit it, let users delete it, and separate fictional roleplay facts from real personal facts. That is how memory becomes trust instead of surveillance.
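Those four controls (show, edit, delete, separate fiction from fact) can be sketched as a small store. The class and method names are hypothetical, chosen only to mirror the controls listed above:

```python
# Sketch of user-facing memory controls: list, edit, and delete entries,
# with fictional roleplay facts flagged separately from real personal
# facts. API names here are illustrative, not from any real app.

class UserMemory:
    def __init__(self) -> None:
        self._entries: dict[int, dict] = {}
        self._next_id = 1

    def remember(self, text: str, fictional: bool = False) -> int:
        entry_id = self._next_id
        self._entries[entry_id] = {"text": text, "fictional": fictional}
        self._next_id += 1
        return entry_id

    def show(self, include_fictional: bool = True) -> list[tuple[int, str]]:
        """Let the user see exactly what is stored about them."""
        return [
            (i, e["text"]) for i, e in self._entries.items()
            if include_fictional or not e["fictional"]
        ]

    def edit(self, entry_id: int, new_text: str) -> None:
        self._entries[entry_id]["text"] = new_text

    def forget(self, entry_id: int) -> None:
        del self._entries[entry_id]

mem = UserMemory()
real = mem.remember("Allergic to peanuts.")
mem.remember("Is secretly a space pirate.", fictional=True)
mem.edit(real, "Severely allergic to peanuts.")
real_facts = mem.show(include_fictional=False)
```

Boring, as promised: no clever retrieval, just visibility and control, which is what makes the stored facts feel like trust rather than surveillance.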
