## Overview
Memind retrieval is designed to assemble useful agent context, not just return similar text.
Many memory systems use a simple retrieval path:
```
query -> vector search -> top-k memories
```
This works for basic recall, but it often breaks down when agents need exact terms, source context, long-running topics, time-aware memory, or higher-level understanding.
Memind provides two retrieval strategies:
| Strategy | Latency profile | Best for | Main idea |
|---|---|---|---|
| SIMPLE | Millisecond-level latency | Low-latency agents, chatbots, and frequent memory injection. | Multi-channel retrieval and fusion without the heavier deep reasoning path. |
| DEEP | Second-level latency | Complex questions, high-quality retrieval, and workflows that can trade latency for completeness. | Sufficiency checking, query expansion, graph/thread assist, and optional reranking. |
Use SIMPLE when memory should feel instant.
Use DEEP when memory should be more complete.
Both strategies can retrieve across multiple memory layers:
- Insight Tree for high-level understanding
- Memory Items for structured facts, events, directives, playbooks, and resolutions
- Raw Data captions for source-level context
- Item Graph for related entities and relationships
- Memory Threads for long-running topics and project context
- Temporal signals for time-aware retrieval
The goal is to return context that an agent can use immediately: facts, evidence, and interpretation.
## Why top-k memory retrieval breaks down
Most memory systems eventually hit the same retrieval problems.
| Problem | What happens |
|---|---|
| Semantic-only recall misses exact terms | A query about a tool name, class name, API, or project term may not retrieve the right memory. |
| Top-k returns fragments | The agent receives isolated facts but not the surrounding context. |
| Long-running topics get split apart | Related memories from the same project, workflow, or incident are not retrieved together. |
| Time-sensitive questions are weak | A query like “what did we decide last week?” is treated like a normal semantic query. |
| High-level understanding is missing | The agent retrieves what was said, but not what the system has learned. |
| Retrieval is hard to debug | Developers cannot see why a memory was returned or missed. |
Memind retrieval is designed around these failure modes.
It combines semantic search, keyword search, temporal signals, graph relationships, memory threads, Raw Data captions, and Insight Tree context into a retrieval result that is more useful for agents.
## Layered retrieval
Memind retrieval is layered.
It searches what was stored, what was understood, where it came from, how it connects, and when it happened.
| Layer | What it contributes |
|---|---|
| What was understood | Insight Tree returns stable patterns, preferences, and higher-level understanding. |
| What was stored | Memory Items return concrete facts, events, directives, playbooks, and resolutions. |
| Where it came from | Raw Data captions return source-level context behind retrieved memories. |
| How it connects | Item Graph and Memory Threads return related entities, relationships, topics, and project context. |
| When it happened | Temporal signals help when the query depends on recency or time constraints. |
This is the core difference:
Similarity search finds related text. Memind retrieval assembles usable agent context.
## Same query, different retrieval
Consider this query:
> What should I remember before writing the next Memind docs page?
A typical vector-memory system may return fragments:
- The user dislikes generic descriptions.
- The user wants implementation details.
- The user is writing Memind docs.
These facts are useful, but the agent still has to infer the broader writing strategy.
SIMPLE retrieval can return a context package:
**Insights**
- The user prefers implementation-grounded technical writing that explains product differentiation through architecture and behavior.

**Items**
- The user said the overview should not sound generic.
- The user asked to explain Raw Data captions as source-level context.
- The user wanted SIMPLE and DEEP retrieval to be distinguished by latency and quality.

**Captions**
- Documentation planning session covering Memind 0.2.0 positioning, open-source docs structure, and how to explain the retrieval system to developers.
DEEP retrieval can investigate further when the initial context is not enough:
**Initial retrieval**
- Finds documentation preferences and recent retrieval-doc feedback.

**Sufficiency check**
- Determines whether the current context is enough to answer.

**If insufficient**
- Expands the query into writing style, product differentiation, retrieval strategy, and architecture explanation.
- Routes keyword-style expansions to keyword search.
- Routes semantic or hypothetical expansions to vector search.
- Explores related graph and thread context.
- Optionally reranks final evidence.
The difference is not only the number of returned memories.
The difference is that Memind can assemble evidence, interpretation, source context, and related topic context together.
## Retrieval strategies at a glance
Memind exposes two main retrieval strategies.
| Strategy | Use it when | What it optimizes for |
|---|---|---|
| SIMPLE | The query is direct and latency matters. | Fast recall with strong multi-channel coverage. |
| DEEP | The query is complex, ambiguous, or needs broader context. | Higher-quality context through reasoning-assisted retrieval. |
A practical default is:
- Start with SIMPLE for normal agent turns.
- Use DEEP when the query needs stronger recall, better evidence, or cross-session reasoning (see the sketch below).
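As a minimal sketch of that default, the snippet below routes each query to a strategy. Only memory.retrieve and RetrievalConfig.Strategy come from the examples on this page; the complexity heuristic is an assumption invented for illustration.

```java
// Default to SIMPLE; escalate to DEEP for queries that look ambiguous
// or cross-session. The heuristic is an illustrative assumption, not
// part of the Memind API.
RetrievalConfig.Strategy chooseStrategy(String query) {
    boolean looksComplex = query.length() > 120
            || query.contains("over the last")
            || query.contains("project");
    return looksComplex ? RetrievalConfig.Strategy.DEEP
                        : RetrievalConfig.Strategy.SIMPLE;
}

var result = memory.retrieve(memoryId, query, chooseStrategy(query)).block();
```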
## SIMPLE retrieval
SIMPLE retrieval is the low-latency retrieval path.
It is designed for millisecond-level memory recall in agents and chatbots that need to retrieve memory before many responses or tool actions.
SIMPLE is not plain vector top-k. It runs multiple retrieval channels and fuses their results.
### How SIMPLE works
At a high level, SIMPLE retrieval follows this flow:
```
Query
  -> Insight vector search
  -> Item vector search
  -> Item keyword search
  -> Temporal item search
  -> Weighted fusion
  -> Memory thread assist
  -> Graph assist
  -> Raw Data caption aggregation
  -> Adaptive truncation
  -> Retrieval result
```
The main channels are:
| Channel | Role |
|---|---|
| Insight vector search | Finds high-level understanding from the Insight Tree. |
| Item vector search | Finds semantically similar memory items. |
| Item keyword search | Finds exact terms through keyword/BM25 search. |
| Temporal item search | Adds time-aware candidates when the query contains temporal intent. |
| Memory thread assist | Pulls related items from long-running topics. |
| Graph assist | Expands from direct hits to related graph-connected memory. |
After candidate retrieval, Memind merges the channels with weighted fusion, aggregates related Raw Data captions, and truncates the final result to fit the configured context budget.
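To make the fusion step concrete, here is a minimal sketch of weighted score fusion, assuming each channel produces a normalized score per candidate. The channel names and weights are illustrative assumptions; the actual weights are set through the fusion scoring configuration described later.

```java
import java.util.Map;

// Weighted fusion sketch: the fused score is the weighted sum of the
// per-channel scores. Channel names and weights are illustrative.
static double fusedScore(Map<String, Double> channelScores,
                         Map<String, Double> weights) {
    return channelScores.entrySet().stream()
            .mapToDouble(e -> weights.getOrDefault(e.getKey(), 0.0) * e.getValue())
            .sum();
}

// A candidate found by both vector and keyword search outranks one
// found by a single channel alone.
double score = fusedScore(
        Map.of("vector", 0.82, "keyword", 0.90),
        Map.of("vector", 0.5, "keyword", 0.3, "temporal", 0.2));
```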
### Why SIMPLE is useful
SIMPLE gives agents fast memory recall without relying on a heavier reasoning path.
It improves over plain vector search because each channel covers a different failure mode:
| Problem | How SIMPLE helps |
|---|---|
| Semantic search misses exact technical terms. | Keyword search can match names, APIs, tools, and code terms. |
| Keyword search misses paraphrased meaning. | Vector search can find semantically similar memory. |
| Top-k returns isolated facts. | Graph and thread assist can add related context. |
| A query depends on time. | Temporal retrieval can use occurred time and time constraints. |
| Items are too terse. | Raw Data captions add source-level context. |
| One channel dominates results. | Weighted fusion combines signals from multiple channels. |
Use SIMPLE when retrieval should be fast but still memory-aware.
### When to use SIMPLE
Use SIMPLE for:
- millisecond-level memory recall
- normal chat turns
- low-latency agent loops
- direct fact or preference lookup
- retrieving recent or obvious context
- applications where retrieval runs often
- cases where you want strong recall without extra deep-retrieval cost
Example:
```java
var result = memory.retrieve(
    memoryId,
    "What does this user prefer when writing technical docs?",
    RetrievalConfig.Strategy.SIMPLE
).block();
```
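The .block() call suggests the client returns a Reactor-style Mono. If so, agents that must not block can subscribe instead; this variant assumes that reactive return type, and injectIntoContext stands in for application code:

```java
// Non-blocking variant, assuming retrieve(...) returns a Reactor Mono.
memory.retrieve(
    memoryId,
    "What does this user prefer when writing technical docs?",
    RetrievalConfig.Strategy.SIMPLE
).subscribe(res -> injectIntoContext(res)); // injectIntoContext: app code
```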
## DEEP retrieval
DEEP retrieval is the quality-first retrieval path.
It is designed for second-level retrieval latency, where the application can spend more time to get broader evidence, better recall, and higher-quality context.
DEEP is useful for harder questions: ambiguous queries, cross-session investigations, project-level questions, or cases where the agent needs stronger evidence before acting.
DEEP does not simply increase top-k. It first checks whether the initial context is sufficient. If not, it expands the search intelligently.
### How DEEP works
At a high level, DEEP retrieval follows this flow:
```
Query
  -> Insight search + initial item retrieval
  -> Sufficiency check
     -> If sufficient:
          return insights + items + evidence
     -> If insufficient:
          typed query expansion
          -> LEX queries -> keyword search
          -> VEC / HYDE queries -> vector search
          -> multi-channel fusion
          -> memory thread assist
          -> graph assist
          -> optional rerank
  -> Raw Data caption aggregation
  -> Retrieval result
```
The key difference is the sufficiency check.
Before expanding the search, Memind looks at the initial insights, items, and Raw Data captions and asks whether the current context is enough to answer the query. If it is enough, Memind can return early. If not, it continues into deeper retrieval.
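In pseudocode, that control flow looks roughly like the sketch below. Every helper name is invented for illustration and is not a Memind API; the sketch only mirrors the flow described above.

```java
// Illustrative DEEP control flow; all helpers are invented stand-ins.
List<String> deepRetrieve(String query) {
    List<String> context = initialRetrieve(query);  // insights + items + captions
    if (isSufficient(context, query)) {
        return context;                             // return early with evidence
    }
    List<String> merged = new ArrayList<>(context);
    for (String expansion : expandQuery(query)) {   // typed query expansion
        // LEX expansions go to keyword search; VEC/HYDE go to vector search.
        merged.addAll(isLexical(expansion)
                ? keywordSearch(expansion)
                : vectorSearch(expansion));
    }
    merged = fuse(merged);                          // multi-channel fusion
    merged = threadAssist(graphAssist(merged));     // graph + thread assist
    return rerankIfEnabled(merged);                 // optional rerank
}
```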
### Why DEEP is useful
DEEP helps when the first search pass does not provide enough context.
| Capability | Why it matters |
|---|---|
| Sufficiency check | Avoids unnecessary expansion when the initial result is already enough. |
| Typed query expansion | Generates targeted follow-up queries instead of only increasing top-k. |
| LEX routing | Sends keyword-style expansions to keyword search. |
| VEC / HYDE routing | Sends semantic expansions to vector search. |
| Thread assist | Finds related context inside long-running topics. |
| Graph assist | Expands through relationships between memory items. |
| Optional rerank | Improves final ordering for complex queries. |
| Evidence output | Returns key evidence when the initial context is sufficient. |
This makes DEEP useful for retrieval quality, not just retrieval quantity.
### When to use DEEP
Use DEEP for:
- second-level retrieval where quality matters more than latency
- ambiguous user questions
- cross-session memory search
- project or investigation questions
- queries that need broader evidence
- cases where missing context is expensive
Example:
```java
var result = memory.retrieve(
    memoryId,
    "What has changed in this project direction over the last few weeks?",
    RetrievalConfig.Strategy.DEEP
).block();
```
## SIMPLE vs DEEP
Use this table as a practical guide.
| Need | Recommended strategy |
|---|---|
| Millisecond-level memory recall | SIMPLE |
| Low-latency chatbot responses | SIMPLE |
| Memory retrieval before frequent agent actions | SIMPLE |
| Direct user preference lookup | SIMPLE |
| Exact technical term or tool lookup | SIMPLE |
| Second-level retrieval is acceptable | DEEP |
| Retrieval quality matters more than latency | DEEP |
| Need more complete evidence | DEEP |
| Ambiguous or underspecified query | DEEP |
| Cross-session investigation | DEEP |
| Project-level context reconstruction | DEEP |
In most applications, SIMPLE is the default retrieval mode. Use DEEP selectively for harder questions.
## What retrieval returns
Memind retrieval returns a structured result, not just a flat list.
| Output | Meaning |
|---|---|
| insights | High-level understanding from the Insight Tree. |
| items | Structured memory items ranked by retrieval score. |
| rawData | Aggregated Raw Data captions behind retrieved items. |
| evidences | Key evidence produced by deep retrieval when available. |
| strategy | The retrieval strategy used. |
| query | The effective query used for retrieval. |
The formatted result is designed for agent context construction.
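For example, an agent can fold the layers into a single prompt block. The accessor names below mirror the output fields in the table above but are assumptions, not confirmed API:

```java
// Build agent context from a retrieval result. Accessor names mirror
// the output fields above and are assumptions, not confirmed API.
var sb = new StringBuilder();
result.insights().forEach(i -> sb.append("Insight: ").append(i).append('\n'));
result.items().forEach(i -> sb.append("Memory: ").append(i).append('\n'));
result.rawData().forEach(c -> sb.append("Source: ").append(c).append('\n'));
String agentContext = sb.toString();
```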
Each layer has a different role:
| Layer | Role |
|---|---|
| Insights | Provide interpretation. |
| Items | Provide concrete facts and memory units. |
| Raw Data captions | Provide source-level context. |
A useful retrieval result should help the agent understand both what happened and what Memind has learned from it.
## Retrieval traces
Memory retrieval can be difficult to debug. Memind provides retrieval traces so developers can inspect what happened during retrieval.
A trace can help answer questions like:
- Was SIMPLE or DEEP used?
- Which retrieval channels ran?
- Did keyword search return candidates?
- Did temporal retrieval activate?
- Did graph assist or memory-thread assist change the result?
- Did DEEP trigger query expansion?
- Was reranking applied?
- Why did a specific item appear in the final result?
This is especially useful when retrieval feels wrong. Instead of treating memory as a black box, you can inspect the retrieval path and tune the configuration.
## Configuration
Retrieval behavior can be controlled through runtime configuration.
Common configuration areas include:
| Area | Controls |
|---|---|
| Strategy | Whether to use SIMPLE or DEEP. |
| Tier limits | How many insights, items, and Raw Data captions to retrieve. |
| Fusion scoring | How vector, keyword, temporal, graph, and thread signals are weighted. |
| Temporal retrieval | Whether time-aware retrieval is enabled. |
| Graph assist | Whether related graph-connected memory can be added. |
| Memory thread assist | Whether long-running topic context can be added. |
| Deep retrieval | Sufficiency checking, query expansion, and reranking behavior. |
| Trace | Whether retrieval traces are collected for debugging. |
| Cache | Whether repeated retrieval requests can reuse cached results. |
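As a hypothetical sketch of what tuning might look like, every builder method below is an invented name derived from the table above; only RetrievalConfig.Strategy appears in this page's examples, so check the actual RetrievalConfig API:

```java
// Hypothetical configuration sketch; the builder methods are invented
// from the table above and are not confirmed Memind API.
var config = RetrievalConfig.builder()
        .strategy(RetrievalConfig.Strategy.DEEP)
        .maxInsights(3)        // tier limits
        .maxItems(10)
        .temporal(true)        // time-aware retrieval
        .graphAssist(true)     // allow graph-connected expansion
        .threadAssist(true)    // allow long-running topic context
        .rerank(true)          // deep retrieval rerank
        .trace(true)           // collect a retrieval trace for debugging
        .build();
```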
Start with the default settings, inspect retrieval traces in the Memind UI, then tune strategy, tier limits, assist behavior, and reranking only when needed.