Observational Memory is a new type of memory system designed specifically for agentic AI systems. It takes a human-inspired approach, compressing context into meaningful observations while keeping the context window stable enough to stay compatible with prompt caching from major AI providers like Anthropic and OpenAI.
The system uses formatted text-based messages resembling logs rather than structured objects or knowledge graphs. It employs a three-date model for better temporal reasoning and emoji-based prioritization (🔴 important, 🟡 possibly important, 🟢 info only). The memory system splits the context window into two blocks, observations and raw messages, with configurable token thresholds for compression and garbage collection.
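As a rough illustration of what such a log-style entry could look like, here is a minimal sketch. The emoji priorities come from the description above, but the field names, and which three dates are tracked, are assumptions made for illustration rather than Mastra's actual format:

```ts
// Hypothetical sketch of a single observation entry. The emoji priorities
// are from the article; the field names and the specific three dates are
// assumptions for illustration, not Mastra's actual format.
type Priority = "🔴" | "🟡" | "🟢"; // important / possibly important / info only

interface Observation {
  priority: Priority;
  text: string;
  recordedAt: string; // when the observer wrote the observation (assumed)
  occurredAt: string; // when the described event happened (assumed)
  relevantAt: string; // the date the observation points to, e.g. a deadline (assumed)
}

// Observations are fed back to the model as log-like lines of text,
// not as structured objects or a knowledge graph.
function formatObservation(o: Observation): string {
  return `${o.priority} [recorded ${o.recordedAt} | occurred ${o.occurredAt} | relevant ${o.relevantAt}] ${o.text}`;
}

console.log(
  formatObservation({
    priority: "🔴",
    text: "User's production deploy is scheduled for Friday.",
    recordedAt: "2025-06-10",
    occurredAt: "2025-06-09",
    relevantAt: "2025-06-13",
  }),
);
```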
Observational Memory operates through two background agents: an observer agent that compresses raw messages into observations when they reach the default 30k-token threshold, and a reflector agent that garbage-collects unimportant observations when the observation block reaches 40k tokens. This structure enables consistent prompt caching, with full cache hits on most turns and only infrequent cache invalidation when reflection rewrites the observation block.
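A minimal sketch of that control flow follows. The 30k/40k thresholds come from the article, but every name below (`countTokens`, `observer.compress`, `reflector.gc`) is a hypothetical stand-in, not Mastra's API:

```ts
// Illustrative control flow only; thresholds are from the article, but all
// function and object names here are hypothetical stand-ins.
const OBSERVE_THRESHOLD = 30_000; // default: compress raw messages at 30k tokens
const REFLECT_THRESHOLD = 40_000; // default: garbage-collect observations at 40k tokens

// Crude token estimate (~4 chars per token) standing in for a real tokenizer.
const countTokens = (msgs: string[]) => Math.ceil(msgs.join("\n").length / 4);

// Stub background agents; in the real system these are LLM calls.
const observer = {
  compress: async (msgs: string[]) => [`🟡 summary of ${msgs.length} messages`],
};
const reflector = {
  gc: async (obs: string[]) => obs.filter((o) => o.startsWith("🔴")),
};

async function onTurnEnd(ctx: { rawMessages: string[]; observations: string[] }) {
  // Observer: once raw messages cross the threshold, compress them into
  // observations and clear the raw block. Observations are only appended,
  // so the prompt cache keeps hitting on subsequent turns.
  if (countTokens(ctx.rawMessages) >= OBSERVE_THRESHOLD) {
    ctx.observations.push(...(await observer.compress(ctx.rawMessages)));
    ctx.rawMessages.length = 0;
  }
  // Reflector: once observations cross their threshold, garbage-collect the
  // unimportant ones. Rewriting this block is the one infrequent step that
  // invalidates the prompt cache.
  if (countTokens(ctx.observations) >= REFLECT_THRESHOLD) {
    ctx.observations = await reflector.gc(ctx.observations);
  }
}
```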
The system achieves state-of-the-art results on benchmarks, scoring 94.87% on LongMemEval with gpt-5-mini and 84.23% with gpt-4o. It's designed to handle modern AI agent workloads including coding agents, browser agents using Playwright, deep research agents browsing multiple URLs, and parallelizable agents that generate large amounts of context quickly.
Observational Memory is available as an open-source implementation within Mastra's platform and is compatible with prompt caching systems from major AI providers. It's positioned as the primary memory system for Mastra users migrating from previous memory implementations like working memory and semantic recall.
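As a rough sketch of what wiring this into a Mastra agent could look like: `Agent` and `Memory` are real Mastra exports, but the observational-memory option names below are assumptions made for illustration, so consult Mastra's docs for the actual configuration keys.

```ts
// Hypothetical wiring sketch. `Agent` and `Memory` are real Mastra exports,
// but the `observational` option shape below is assumed for illustration.
import { Agent } from "@mastra/core/agent";
import { Memory } from "@mastra/memory";

const memory = new Memory({
  options: {
    // Assumed option shape: enable observational memory with the default
    // thresholds described above (observe at 30k tokens, reflect at 40k).
    observational: {
      enabled: true,
      observationThreshold: 30_000,
      reflectionThreshold: 40_000,
    },
  },
});

const agent = new Agent({
  name: "research-agent",
  instructions: "You are a deep research assistant.",
  model: "openai/gpt-5-mini", // one of the models benchmarked in the article
  memory,
});
```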
Observational Memory is designed for developers and organizations building AI agent systems, particularly those working with coding agents, browser automation agents, research agents, and parallelizable AI systems. It targets users who need to manage large context windows efficiently while maintaining compatibility with major AI providers' prompt caching systems. The system is especially valuable for teams dealing with context-heavy operations like tool call results, file scanning, command execution, and parallel browsing activities.