Overview
Memory Retrieval is how your application reads useful context from Memind. Memind provides two retrieval strategies:
| Strategy | Latency profile | Best for |
|---|---|---|
| SIMPLE | Millisecond-level latency | Low-latency agents, chatbots, and frequent memory injection. |
| DEEP | Second-level latency | Complex questions, higher-quality retrieval, and workflows that can trade latency for completeness. |
Use SIMPLE when memory should feel instant.
Use DEEP when memory should be more complete.
This page focuses on how to call retrieval APIs, choose a strategy, use the result as agent context, configure retrieval, and debug retrieval behavior.
For how retrieval works internally, see Retrieval.
When to retrieve memory
Retrieve memory when the agent needs context that may not be present in the current prompt. Common moments include:
- before generating an assistant response
- before planning a task
- before calling tools
- when a user asks about past preferences, decisions, or project context
- when restoring context across sessions
- when building a prompt for a long-running agent
- when the agent needs prior tool experience or reusable playbooks
Choose a retrieval strategy
Choose the strategy based on latency and retrieval quality requirements.
| Need | Recommended strategy |
|---|---|
| Millisecond-level memory recall | SIMPLE |
| Low-latency chatbot responses | SIMPLE |
| Memory retrieval before frequent agent actions | SIMPLE |
| Direct fact or preference lookup | SIMPLE |
| Exact technical term or tool lookup | SIMPLE |
| Second-level retrieval is acceptable | DEEP |
| Retrieval quality matters more than latency | DEEP |
| Need more complete evidence | DEEP |
| Ambiguous or underspecified query | DEEP |
| Cross-session investigation | DEEP |
| Project-level context reconstruction | DEEP |
Use SIMPLE as the default.
Use DEEP selectively for harder questions where missing context is more expensive than waiting longer.
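The table above can be read as a small routing rule. The sketch below is one hypothetical way to encode it in Python. It is not part of Memind: the `Strategy` enum, the `choose_strategy` function, its parameters, and the one-second threshold are all assumptions made for illustration.

```python
from enum import Enum


class Strategy(Enum):
    """Stand-in for the two strategy names used on this page."""
    SIMPLE = "SIMPLE"
    DEEP = "DEEP"


def choose_strategy(latency_budget_ms: int,
                    ambiguous: bool = False,
                    cross_session: bool = False) -> Strategy:
    """Hypothetical router: default to SIMPLE, escalate to DEEP selectively."""
    # Assumed threshold: DEEP has second-level latency, so it is never
    # chosen when the caller cannot wait roughly one second.
    if latency_budget_ms < 1000:
        return Strategy.SIMPLE
    # Escalate only for the harder cases listed in the table.
    if ambiguous or cross_session:
        return Strategy.DEEP
    return Strategy.SIMPLE
```

A caller with a tight budget always gets SIMPLE, even for an ambiguous query, which keeps latency-sensitive loops predictable.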
Retrieve with SIMPLE
Use SIMPLE for fast memory recall.
SIMPLE is designed for low-latency agents and chatbots. It can retrieve relevant insights, memory items, and Raw Data captions without using the heavier deep-retrieval path.
Use it for:
- normal chat turns
- frequent memory injection
- direct questions
- preference lookup
- recent context recall
- latency-sensitive agent loops
Retrieve with DEEP
Use DEEP for quality-first retrieval.
DEEP is designed for complex or ambiguous queries. It can use sufficiency checking, typed query expansion, graph/thread assist, optional reranking, and evidence-backed output.
Use it for:
- cross-session investigation
- project-level questions
- ambiguous user requests
- tasks that need stronger evidence
- situations where retrieval quality matters more than latency
Use retrieval results as agent context
The easiest way to use retrieval output in an agent prompt is formattedResult().
| Section | Role |
|---|---|
| Insights | Higher-level understanding, stable preferences, learned patterns. |
| Items | Concrete memory facts, events, directives, playbooks, and resolutions. |
| Captions | Raw Data captions that provide source-level context. |
Response format
Retrieval returns a RetrievalResult.
| Field | Meaning |
|---|---|
| items | Ranked Memory Items returned by retrieval. |
| insights | Higher-level understanding from the Insight Tree. |
| rawData | Aggregated Raw Data captions behind retrieved items. |
| evidences | Key evidence produced by deep retrieval when available. |
| strategy | The retrieval strategy used. |
| query | The effective query used for retrieval. |
In short:
- items provide concrete facts
- insights provide interpretation
- rawData provides source-level context
- evidences provide supporting information for complex retrieval
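To make the role of each field concrete, the following Python sketch builds a prompt from a retrieval result and skips the memory block when nothing was retrieved. Everything here is a stand-in: `RetrievalResultStub` only mirrors some of the field names in the table, and `format_context` uses an assumed layout. It does not reproduce the actual output of formattedResult().

```python
from dataclasses import dataclass, field


@dataclass
class RetrievalResultStub:
    """Hypothetical stand-in mirroring fields from the table above."""
    items: list[str] = field(default_factory=list)
    insights: list[str] = field(default_factory=list)
    raw_data: list[str] = field(default_factory=list)  # mirrors rawData
    strategy: str = "SIMPLE"
    query: str = ""

    def is_empty(self) -> bool:
        return not (self.items or self.insights or self.raw_data)


def format_context(result: RetrievalResultStub) -> str:
    """Assumed layout: one labeled section per non-empty tier."""
    sections = [
        ("Insights", result.insights),
        ("Items", result.items),
        ("Captions", result.raw_data),
    ]
    blocks = []
    for title, lines in sections:
        if lines:
            body = "\n".join(f"- {line}" for line in lines)
            blocks.append(f"{title}:\n{body}")
    return "\n\n".join(blocks)


def build_prompt(user_message: str, result: RetrievalResultStub) -> str:
    """Prepend memory context only when retrieval returned something."""
    if result.is_empty():
        return user_message
    return f"Relevant memory:\n{format_context(result)}\n\nUser: {user_message}"
```

Placing insights first reflects the idea that insights guide behavior while items and captions supply supporting detail.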
Configure retrieval
For simple use cases, pass a strategy directly. For finer control, build a RetrievalRequest with a custom RetrievalConfig.
Retrieval covers three tiers:
| Tier | Meaning |
|---|---|
| Tier 1 | Insight retrieval. |
| Tier 2 | Memory Item retrieval. |
| Tier 3 | Raw Data caption retrieval. |
Use RetrievalConfig.simple() for a low-latency retrieval configuration.
Use RetrievalConfig.deep() for a quality-first retrieval configuration.
Configuration covers the following areas:
| Area | Controls |
|---|---|
| Strategy | Whether to use SIMPLE or DEEP. |
| Tier limits | How many insights, items, and Raw Data captions to retrieve. |
| Fusion scoring | How vector, keyword, temporal, graph, and thread signals are weighted. |
| Graph assist | Whether graph-connected memory can be added. |
| Thread assist | Whether long-running topic context can be added. |
| Temporal retrieval | Whether time-aware retrieval is used. |
| Rerank | Whether deep retrieval can rerank final candidates. |
| Timeout | How long retrieval may run. |
| Cache | Whether repeated retrieval requests can reuse cached results. |
| Trace | Whether retrieval traces are collected for debugging. |
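As an illustration of what "fusion scoring" means in the table above, the sketch below combines per-channel scores with a linear weighted sum. This is an assumption for illustration only: this page does not state which fusion formula Memind uses, and the weights, the `fuse` function, and the `rank` function are hypothetical.

```python
# Hypothetical weights; Memind's actual defaults are not stated on this page.
DEFAULT_WEIGHTS = {
    "vector": 0.5,
    "keyword": 0.2,
    "temporal": 0.1,
    "graph": 0.1,
    "thread": 0.1,
}


def fuse(signals: dict[str, float],
         weights: dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of per-channel scores; missing channels count as 0."""
    return sum(w * signals.get(name, 0.0) for name, w in weights.items())


def rank(candidates: dict[str, dict[str, float]], top_k: int) -> list[str]:
    """Return the ids of the top_k candidates by fused score."""
    ordered = sorted(candidates,
                     key=lambda cid: fuse(candidates[cid]),
                     reverse=True)
    return ordered[:top_k]
```

In this sketch, raising the keyword weight favors exact technical terms, while raising the vector weight favors semantic similarity. The tier limits in the table correspond to `top_k` here.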
Retrieve by scope or category
Use a RetrievalRequest when you want to restrict retrieval scope.
For user memory:
Debug with retrieval traces
Memory retrieval can be difficult to debug. Memind provides retrieval traces so developers can inspect what happened during retrieval. A trace can help answer:
- which strategy was used
- what query was executed
- whether cache was used
- which retrieval channels ran
- whether keyword search returned candidates
- whether temporal retrieval activated
- whether graph assist changed the result
- whether memory-thread assist changed the result
- whether DEEP triggered query expansion
- whether reranking was applied
- why a specific item appeared in the final result
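The questions above can be turned into a quick triage pass. The sketch below works on a plain dictionary with made-up keys (`cache_hit`, `channels`, `query_expanded`). The real trace object and its field names are not described on this page, so treat this as a hypothetical shape rather than Memind's trace format.

```python
def diagnose(trace: dict) -> list[str]:
    """Turn a stand-in trace dict into human-readable hints."""
    hints = []
    # A cached result can hide changes made to scoring or top-k settings.
    if trace.get("cache_hit"):
        hints.append("Result came from cache; retry with cache disabled.")
    # Flag every retrieval channel that ran but produced nothing.
    for name, count in trace.get("channels", {}).items():
        if count == 0:
            hints.append(f"Channel '{name}' returned no candidates.")
    # DEEP without expansion may explain thin results on ambiguous queries.
    if trace.get("strategy") == "DEEP" and not trace.get("query_expanded"):
        hints.append("DEEP ran without query expansion.")
    return hints
```

An empty list means none of these checks fired. The next step is then to confirm that the expected memory was extracted in the first place.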
Best practices
Start simple:
- Use SIMPLE as the default strategy.
- Use DEEP only when the query is complex or quality matters more than latency.
- Do not use DEEP for every chatbot turn unless latency is acceptable.
- Ask retrieval for the memory you need, not the whole user prompt.
- Keep the query short enough to express intent clearly.
- Preserve useful time expressions such as “last week”, “recently”, or “before the release”.
- Use formattedResult() as the default prompt context format.
- Include memory only when the result is not empty.
- Let insights guide behavior, items support facts, and captions provide source context.
- Inspect retrieval traces before changing top-k or scoring settings.
- Check whether the missing information was extracted first.
- Check Raw Data and Memory Items if retrieval cannot find expected context.
- Use user memory for user preferences, facts, and history.
- Use agent memory for directives, tool experience, playbooks, and resolutions.
- Use category filters when you know what type of memory the query needs.

