Overview
Multimodal Extraction is how your application turns real-world input into Memind memory.
Memind can extract memory from conversations, streaming messages, documents, images, audio, tool calls, and custom raw content. These inputs enter the same memory construction pipeline and can become Raw Data, Memory Items, Item Graph relationships, Memory Threads, and Insight Tree updates.
This page focuses on the extraction APIs, supported content types, plugin setup, response format, and usage patterns.
If you want to run a complete example first, start with Quickstart.
For the internal memory construction model, see Memory Construction.
Supported content types
Memind supports multiple content types through built-in conversation support and raw-data plugins.
| Content type | Examples | Support |
|---|---|---|
| Conversations | chat messages, agent turns, transcripts | Built in |
| Streaming messages | live chatbot or agent messages | Built in with buffer and commit |
| Documents | text, markdown, HTML, CSV, PDF | Document raw-data plugin |
| Images | screenshots, diagrams, photos | Image raw-data plugin |
| Audio | recordings, transcripts, voice notes | Audio raw-data plugin |
| Tool calls | tool input/output, test results, shell output | Tool-call raw-data plugin |
| Custom raw content | application-specific records, logs, domain objects | RawContent and plugin SPI |
The key idea is that Memind does not treat memory as chat history alone.
Different input types are normalized into Raw Data first, then processed into structured and connected memory.
Plugin setup
Conversation extraction is built into memind-core.
Document, image, audio, and tool-call extraction require the matching raw-data plugin dependency and runtime registration.
Maven dependencies
Add the plugin you need.
<dependency>
    <groupId>com.openmemind.ai</groupId>
    <artifactId>memind-plugin-rawdata-document</artifactId>
    <version>${memind.version}</version>
</dependency>
<dependency>
    <groupId>com.openmemind.ai</groupId>
    <artifactId>memind-plugin-rawdata-image</artifactId>
    <version>${memind.version}</version>
</dependency>
<dependency>
    <groupId>com.openmemind.ai</groupId>
    <artifactId>memind-plugin-rawdata-audio</artifactId>
    <version>${memind.version}</version>
</dependency>
<dependency>
    <groupId>com.openmemind.ai</groupId>
    <artifactId>memind-plugin-rawdata-toolcall</artifactId>
    <version>${memind.version}</version>
</dependency>
Runtime registration
Register the plugin on the Memind runtime.
var memory = Memory.builder()
    // other runtime components: chat client, store, vector, text search...
    .rawDataPlugin(new DocumentRawDataPlugin())
    .build();
You can register multiple raw-data plugins when your application accepts multiple content types.
var memory = Memory.builder()
    // other runtime components...
    .rawDataPlugin(new DocumentRawDataPlugin())
    .rawDataPlugin(new ImageRawDataPlugin())
    .rawDataPlugin(new AudioRawDataPlugin())
    .rawDataPlugin(new ToolCallRawDataPlugin())
    .build();
Spring Boot starters
Spring Boot users can use the matching starters to auto-configure raw-data plugins.
| Content type | Spring Boot starter |
|---|---|
| Documents | memind-plugin-rawdata-document-starter |
| Images | memind-plugin-rawdata-image-starter |
| Audio | memind-plugin-rawdata-audio-starter |
| Tool calls | memind-plugin-rawdata-toolcall-starter |
Extraction modes
The snippets below show the extraction entry points after the required plugin is available.
Choose the extraction mode based on how content arrives in your application.
| Mode | API | Best for |
|---|---|---|
| Batch conversation | addMessages() | You already have a complete conversation segment. |
| Streaming messages | addMessage() | Chatbot or agent messages arrive one at a time. |
| Manual commit | commit() | You want to force buffered messages into memory. |
| Raw content extraction | extract() | Documents, images, audio, tool calls, and custom content. |
| Plugin helpers | Plugin-specific helpers | Convenience APIs for supported content plugins. |
Most applications use more than one mode.
For example, a chatbot may use addMessage() during the conversation, commit() at session end, and extract() when the user uploads a document or when the agent produces important tool output.
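The routing decision above can be sketched in plain Java. This is a minimal illustration only: the input-kind strings are assumptions, and the returned names are the Memind entry points described in the mode table.

```java
// Sketch: map how content arrives in the application to the extraction
// entry point a chatbot would call. The "kind" strings are hypothetical;
// the returned names are the Memind APIs from the mode table.
class ModeRouter {
    static String route(String kind) {
        switch (kind) {
            case "chat_turn":   return "addMessage"; // streaming turns buffer first
            case "session_end": return "commit";     // flush buffered context
            case "document":
            case "image":
            case "audio":
            case "tool_call":   return "extract";    // raw-content extraction
            default:            return "extract";    // custom raw content also uses extract()
        }
    }
}
```

The point of the sketch is that one application typically touches several modes, keyed off how each piece of content arrives.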
Batch conversation extraction
Use addMessages() when you already have a complete conversation segment.
This is useful for:
- examples and tests
- importing a transcript
- processing a finished conversation
- extracting memory from a known context window
var result = memory.addMessages(
    memoryId,
    messages,
    ExtractionConfig.defaults().withLanguage("Chinese")
).block();
Memind treats the message list as one extraction unit.
The pipeline can preserve the source as Raw Data, generate captions, extract Memory Items, update graph and thread projections, and schedule Insight Tree updates depending on configuration.
Streaming message extraction
Use addMessage() when messages arrive one at a time.
This is the common mode for chatbots, coding agents, and long-running assistants.
memory.addMessage(memoryId, Message.user("Can you review this design?")).block();
memory.addMessage(memoryId, Message.assistant("The main issue is that the API boundary is unclear.")).block();
In streaming mode, Memind does not need to extract memory from every single message immediately.
Messages are first stored in the conversation buffer, and Memind decides when the accumulated context is ready to commit. When a commit boundary is reached, extraction is triggered; until then, a call may complete without producing a new extraction result.
This avoids turning every chat turn into a disconnected memory item.
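The buffer-and-commit behavior can be sketched in plain Java. The fixed size threshold here is an assumption for illustration; Memind's real commit-boundary logic is richer than a message count.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the idea behind addMessage(): messages accumulate
// in a buffer until a commit boundary (here, a simple size threshold), then
// the whole buffered segment is handed to extraction as one unit.
class BufferSketch {
    static final int COMMIT_THRESHOLD = 4; // assumption; not Memind's real boundary logic

    final List<String> buffer = new ArrayList<>();
    final List<List<String>> committedSegments = new ArrayList<>();

    /** Returns true when this call triggered a commit. */
    boolean addMessage(String message) {
        buffer.add(message);
        if (buffer.size() >= COMMIT_THRESHOLD) {
            commit();
            return true;
        }
        return false;
    }

    /** Force the current buffer into a committed segment, as commit() does. */
    void commit() {
        if (!buffer.isEmpty()) {
            committedSegments.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }
}
```

Committing an empty buffer produces nothing, which mirrors the empty extraction result described below for commit().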
Commit buffered context
Use commit() to force the current conversation buffer into memory.
This is useful when:
- a session ends
- an agent run finishes
- the application is shutting down
- you want to make sure the latest buffered context is written
var result = memory.commit(memoryId).block();
For real-time agents, a common pattern is:
memory.addMessage(memoryId, userMessage).block();
memory.addMessage(memoryId, assistantMessage).block();
// At the end of the session or run:
memory.commit(memoryId).block();
If the buffer is empty, the commit returns an empty extraction result.
Document extraction
Document extraction requires memind-plugin-rawdata-document.
Use the document plugin to extract memory from text files, Markdown, HTML, CSV, PDF, and other parser-supported document formats. The plugin parses the content, segments it into document-aware sections, generates captions, and sends the normalized content into Memind’s memory construction pipeline.
var request = DocumentExtractionRequests.document(memoryId, documentContent);
var result = memory.extract(request).block();
documentContent is created with the document plugin’s content model or parser flow. See the document plugin or Java SDK docs for construction examples.
Document extraction is useful for:
- project docs
- runbooks
- design documents
- meeting notes
- release notes
- support articles
- internal knowledge bases
Document captions are especially useful because they preserve source-level context behind extracted Memory Items.
Image extraction
Image extraction requires memind-plugin-rawdata-image.
Use the image plugin to extract memory from screenshots, diagrams, UI captures, whiteboards, photos, or other visual artifacts. The plugin can analyze image content, produce image semantics, generate captions, and pass the resulting representation into memory extraction.
var request = ImageExtractionRequests.image(memoryId, imageContent);
var result = memory.extract(request).block();
imageContent is created with the image plugin’s content model or parser flow. See the image plugin or Java SDK docs for construction examples.
Image extraction is useful when visual context matters, such as:
- UI screenshots
- architecture diagrams
- error screenshots
- visual bug reports
- whiteboard notes
- product or design references
Instead of losing visual context, Memind can convert images into source-level Raw Data that can support later memory retrieval.
Audio extraction
Audio extraction requires memind-plugin-rawdata-audio.
Use the audio plugin to extract memory from recordings, transcripts, voice notes, meetings, interviews, support calls, or agent sessions that include spoken context. Audio content can be transcribed, segmented, captioned, and processed into memory.
var request = AudioExtractionRequests.audio(memoryId, audioContent);
var result = memory.extract(request).block();
audioContent is created with the audio plugin’s content model or parser flow. See the audio plugin or Java SDK docs for construction examples.
Audio extraction can preserve:
- transcript segments
- speaker or timing metadata when available
- summarized captions
- source references
- extracted facts, events, preferences, and decisions
This lets Memind turn spoken context into searchable and retrievable memory.
Tool-call extraction
Tool-call extraction requires memind-plugin-rawdata-toolcall.
Use the tool-call plugin to extract memory from agent tool usage. Tool calls are important for agents because they capture what the agent tried, what worked, what failed, and what should be reused later.
var result = ToolCallMemories.report(memory, memoryId, toolCalls).block();
toolCalls is a list of tool-call records containing tool name, input, output, status, duration, and related metadata.
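The record shape can be sketched as follows. The field names mirror the list above, but ToolCallSketch itself is a hypothetical type, not the plugin's real record class.

```java
import java.util.List;

// Hypothetical tool-call record mirroring the fields the plugin consumes:
// tool name, input, output, status, and duration.
record ToolCallSketch(String tool, String input, String output,
                      String status, long durationMs) {

    // Failures are worth reporting alongside successes: failed attempts can
    // become "what not to retry" memory, successes can become playbooks.
    static long countFailed(List<ToolCallSketch> calls) {
        return calls.stream().filter(c -> "FAILED".equals(c.status())).count();
    }
}
```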
Tool-call extraction is useful for:
- coding agents
- research agents
- automation agents
- test and deployment agents
- long-running task agents
Tool-call memory can help Memind extract agent-scoped memory such as:
- tool experience
- durable directives
- reusable playbooks
- resolved problems
- failed attempts
- successful workflows
This is one of the main ways Memind helps agents remember their own operating experience, not only user preferences.
Custom raw content extraction
Use extract() for custom content types.
Custom raw content is useful when your application has domain-specific records that are not normal chat messages, documents, images, audio, or tool calls.
Examples include:
- internal business events
- workflow records
- logs
- incident reports
- CRM records
- product analytics events
- domain-specific artifacts
var result = memory.extract(
    memoryId,
    rawContent,
    ExtractionConfig.defaults()
).block();
For advanced use cases, implement a custom RawContent type and a raw-data processor or plugin. This lets Memind normalize your content into Raw Data before extracting memory.
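As a minimal sketch of what normalization might look like, the hypothetical domain record below flattens itself into a plain-text payload before extraction. IncidentRecord and toRawText() are illustrative assumptions; the real integration point is a custom RawContent type plus the raw-data plugin SPI.

```java
// Hypothetical domain record normalized into caption-friendly text before
// being handed to extraction. Both the record and its text format are
// assumptions for illustration.
record IncidentRecord(String id, String severity, String summary) {
    /** Flatten the record into a plain-text raw-content payload. */
    String toRawText() {
        return "incident " + id + " [" + severity + "]: " + summary;
    }
}
```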
Extraction configuration
Most applications can start with the default extraction configuration.
Use custom configuration when you need to control language, chunking, insight behavior, thread behavior, or plugin-specific processing.
| Area | Controls |
|---|---|
| Language | Extraction language and prompt behavior. |
| Chunking | Conversation, document, audio, or content-specific segmentation. |
| Captions | Raw Data caption generation. |
| Memory Items | Item extraction behavior and category handling. |
| Insight | Insight Tree construction and asynchronous insight scheduling. |
| Threads | Memory Thread derivation and enrichment. |
| Plugins | Document, image, audio, tool-call, or custom content behavior. |
Example:
var config = ExtractionConfig.defaults().withLanguage("Chinese");
var result = memory.addMessages(memoryId, messages, config).block();
Start with defaults, inspect the generated memory in Memind UI, then tune extraction behavior when needed.
Extraction result
All extraction modes return an ExtractionResult.
| Field | Meaning |
|---|---|
| memoryId | The memory identity that received the content. |
| rawDataResult | Raw Data construction result. |
| memoryItemResult | Memory Item extraction result. |
| insightResult | Insight construction result when available. |
| status | Extraction status such as SUCCESS, PARTIAL_SUCCESS, or FAILED. |
| duration | Total extraction duration. |
| errorMessage | Failure or partial-success error message. |
| insightPending | Whether insight work has been scheduled asynchronously but has not completed yet. |
insightPending is important because Insight Tree construction may be asynchronous. A successful extraction can write Raw Data and Memory Items immediately while insight construction continues in the background.
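A caller can separate "memory was written now" from "insight is still running" along these lines. The record below is a hypothetical mirror of the fields above, not the real ExtractionResult type, and the status strings are taken from the table.

```java
// Hypothetical mirror of ExtractionResult's status fields, showing how to
// distinguish synchronously written memory from pending insight work.
record ResultSketch(String status, boolean insightPending) {
    boolean memoryWritten() {
        return "SUCCESS".equals(status) || "PARTIAL_SUCCESS".equals(status);
    }
    boolean insightStillRunning() {
        return memoryWritten() && insightPending;
    }
}
```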
Inspect results in Memind UI
Use Memind UI to inspect what extraction produced.
Useful inspection points include:
| View | What to inspect |
|---|---|
| Buffers | Pending and recent conversation context. |
| Raw Data | Segments, captions, source references, metadata, and content type. |
| Memory Items | Extracted facts, preferences, events, directives, playbooks, resolutions, and foresight. |
| Item Graph | Entities, aliases, mentions, and item relationships. |
| Memory Threads | Long-running topics, projects, workflows, incidents, and decisions. |
| Insight Tree | Higher-level understanding built from extracted memory. |
| Settings | Runtime configuration that affects extraction behavior. |
If extraction does not look right, inspect Raw Data first.
Raw Data captions usually show whether the source content was segmented and understood correctly before higher-level memory was extracted.
Best practices
Use the extraction mode that matches how content arrives:
- Use addMessages() for complete conversation segments.
- Use addMessage() for real-time agents and chatbots.
- Use commit() at session end or before shutdown.
- Use extract() for documents, images, audio, tool calls, and custom raw content.
- Use plugin helpers when available.
Set up the right plugin:
- Conversation extraction works with memind-core.
- Document extraction needs the document raw-data plugin.
- Image extraction needs the image raw-data plugin.
- Audio extraction needs the audio raw-data plugin.
- Tool-call extraction needs the tool-call raw-data plugin.
Preserve source context:
- Keep useful metadata such as source client, content type, timestamps, file names, URLs, and tool names.
- Prefer source-aware content objects over plain text when the source matters.
- Inspect Raw Data captions before tuning item extraction.
Avoid fragmented memory:
- Do not force every single chat message into memory as a complete unit.
- Let streaming messages accumulate until there is enough context.
- Commit the final buffer when a run ends.
Tune gradually:
- Start with default extraction behavior.
- Review Raw Data, Memory Items, Threads, and Insights in Memind UI.
- Adjust language, chunking, plugin options, or insight behavior only when the output shows a real need.