

Overview

Multimodal Extraction is how your application turns real-world input into Memind memory. Memind can extract memory from conversations, streaming messages, documents, images, audio, tool calls, and custom raw content. These inputs enter the same memory construction pipeline and can become Raw Data, Memory Items, Item Graph relationships, Memory Threads, and Insight Tree updates. This page focuses on the extraction APIs, supported content types, plugin setup, response format, and usage patterns. If you want to run a complete example first, start with Quickstart. For the internal memory construction model, see Memory Construction.

Supported content types

Memind supports multiple content types through built-in conversation support and raw-data plugins.
Content type         Examples                                             Support
Conversations        chat messages, agent turns, transcripts              Built in
Streaming messages   live chatbot or agent messages                       Built in with buffer and commit
Documents            text, markdown, HTML, CSV, PDF                       Document raw-data plugin
Images               screenshots, diagrams, photos                        Image raw-data plugin
Audio                recordings, transcripts, voice notes                 Audio raw-data plugin
Tool calls           tool input/output, test results, shell output        Tool-call raw-data plugin
Custom raw content   application-specific records, logs, domain objects   RawContent and plugin SPI
The key idea is that Memind does not treat memory as chat history alone. Different input types are first normalized into Raw Data, then processed into structured, connected memory.
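This normalization step can be pictured with a small conceptual sketch. Everything below is invented for illustration: the RawData record and the helper names are not Memind's actual model, which lives in memind-core.

```java
import java.util.List;

// Conceptual sketch only: the RawData record and normalize helpers below are
// invented for illustration; Memind's real Raw Data model may differ.
public class NormalizeSketch {
    // A minimal source-level record: what kind of content it is, a short
    // caption preserving source context, and the normalized text the
    // extraction pipeline would work on.
    public record RawData(String contentType, String caption, String content) {}

    public static RawData fromChat(String role, String text) {
        return new RawData("conversation", role + " message", text);
    }

    public static RawData fromDocument(String fileName, String body) {
        return new RawData("document", "Document: " + fileName, body);
    }

    public static RawData fromToolCall(String tool, String output) {
        return new RawData("tool-call", "Output of " + tool, output);
    }

    public static void main(String[] args) {
        var items = List.of(
            fromChat("user", "Can you review this design?"),
            fromDocument("runbook.md", "# Incident runbook"),
            fromToolCall("mvn test", "BUILD SUCCESS"));
        // Every input, whatever its origin, now has the same shape.
        items.forEach(r -> System.out.println(r.contentType() + " | " + r.caption()));
    }
}
```

The point of the sketch is only that conversations, documents, and tool output converge on one source-level shape before item extraction begins.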

Plugin setup

Conversation extraction is built into memind-core. Document, image, audio, and tool-call extraction require the matching raw-data plugin dependency and runtime registration.

Maven dependencies

Add the Maven dependency for each plugin you need.
<dependency>
  <groupId>com.openmemind.ai</groupId>
  <artifactId>memind-plugin-rawdata-document</artifactId>
  <version>${memind.version}</version>
</dependency>
<dependency>
  <groupId>com.openmemind.ai</groupId>
  <artifactId>memind-plugin-rawdata-image</artifactId>
  <version>${memind.version}</version>
</dependency>
<dependency>
  <groupId>com.openmemind.ai</groupId>
  <artifactId>memind-plugin-rawdata-audio</artifactId>
  <version>${memind.version}</version>
</dependency>
<dependency>
  <groupId>com.openmemind.ai</groupId>
  <artifactId>memind-plugin-rawdata-toolcall</artifactId>
  <version>${memind.version}</version>
</dependency>

Runtime registration

Register the plugin on the Memind runtime.
var memory = Memory.builder()
    // other runtime components: chat client, store, vector, text search...
    .rawDataPlugin(new DocumentRawDataPlugin())
    .build();
You can register multiple raw-data plugins when your application accepts multiple content types.
var memory = Memory.builder()
    // other runtime components...
    .rawDataPlugin(new DocumentRawDataPlugin())
    .rawDataPlugin(new ImageRawDataPlugin())
    .rawDataPlugin(new AudioRawDataPlugin())
    .rawDataPlugin(new ToolCallRawDataPlugin())
    .build();

Spring Boot starters

Spring Boot users can use the matching starters to auto-configure raw-data plugins.
Content type   Spring Boot starter
Documents      memind-plugin-rawdata-document-starter
Images         memind-plugin-rawdata-image-starter
Audio          memind-plugin-rawdata-audio-starter
Tool calls     memind-plugin-rawdata-toolcall-starter
The snippets below show the extraction entry points after the required plugin is available.

Extraction modes

Choose the extraction mode based on how content arrives in your application.
Mode                     API                       Best for
Batch conversation       addMessages()             You already have a complete conversation segment.
Streaming messages       addMessage()              Chatbot or agent messages arrive one at a time.
Manual commit            commit()                  You want to force buffered messages into memory.
Raw content extraction   extract()                 Documents, images, audio, tool calls, and custom content.
Plugin helpers           Plugin-specific helpers   Convenience APIs for supported content plugins.
Most applications use more than one mode. For example, a chatbot may use addMessage() during the conversation, commit() at session end, and extract() when the user uploads a document or when the agent produces important tool output.

Extract conversation messages

Use addMessages() when you already have a complete conversation segment. This is useful for:
  • examples and tests
  • importing a transcript
  • processing a finished conversation
  • extracting memory from a known context window
var result = memory.addMessages(
    memoryId,
    messages,
    ExtractionConfig.defaults().withLanguage("Chinese")
).block();
Memind treats the message list as one extraction unit. The pipeline can preserve the source as Raw Data, generate captions, extract Memory Items, update graph and thread projections, and schedule Insight Tree updates depending on configuration.

Extract streaming messages

Use addMessage() when messages arrive one at a time. This is the common mode for chatbots, coding agents, and long-running assistants.
memory.addMessage(memoryId, Message.user("Can you review this design?")).block();

memory.addMessage(memoryId, Message.assistant("The main issue is that the API boundary is unclear."))
    .block();
In streaming mode, Memind does not need to extract memory from every single message immediately. Messages are first stored in the conversation buffer. Memind then decides when the accumulated context is ready to commit. When a commit boundary is reached, extraction is triggered. If the boundary is not ready yet, the call may complete without producing a new extraction result. This avoids turning every chat turn into a disconnected memory item.
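The buffer-and-commit behavior can be pictured with a small self-contained sketch. This is an illustration of the idea only, not Memind's internal implementation: the commit boundary here is a fixed message-count threshold chosen for the example, whereas the Memind runtime decides boundaries itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Illustration only: a toy conversation buffer that commits once enough
// context has accumulated. Memind's real commit boundaries are decided by
// the runtime, not by a fixed message count.
public class BufferSketch {
    private final List<String> buffer = new ArrayList<>();
    private final int boundary;

    public BufferSketch(int boundary) { this.boundary = boundary; }

    // Like addMessage(): buffer the message; extract only at a boundary.
    public Optional<List<String>> addMessage(String message) {
        buffer.add(message);
        return buffer.size() >= boundary ? Optional.of(commit()) : Optional.empty();
    }

    // Like commit(): flush whatever is buffered, even below the boundary.
    public List<String> commit() {
        var segment = List.copyOf(buffer);
        buffer.clear();
        return segment;
    }
}
```

Calls below the boundary return empty, mirroring how addMessage() can complete without producing a new extraction result, and committing an empty buffer yields an empty segment.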

Commit buffered context

Use commit() to force the current conversation buffer into memory. This is useful when:
  • a session ends
  • an agent run finishes
  • the application is shutting down
  • you want to make sure the latest buffered context is written
var result = memory.commit(memoryId).block();
For real-time agents, a common pattern is:
memory.addMessage(memoryId, userMessage).block();
memory.addMessage(memoryId, assistantMessage).block();

// At the end of the session or run:
memory.commit(memoryId).block();
If the buffer is empty, the commit returns an empty extraction result.

Extract documents

Document extraction requires memind-plugin-rawdata-document. Use the document plugin to extract memory from text files, Markdown, HTML, CSV, PDF, and other parser-supported document formats. The plugin parses the content, segments it into document-aware sections, generates captions, and sends the normalized content into Memind’s memory construction pipeline.
var request = DocumentExtractionRequests.document(memoryId, documentContent);

var result = memory.extract(request).block();
documentContent is created with the document plugin’s content model or parser flow. See the document plugin or Java SDK docs for construction examples. Document extraction is useful for:
  • project docs
  • runbooks
  • design documents
  • meeting notes
  • release notes
  • support articles
  • internal knowledge bases
Document captions are especially useful because they preserve source-level context behind extracted Memory Items.

Extract images

Image extraction requires memind-plugin-rawdata-image. Use the image plugin to extract memory from screenshots, diagrams, UI captures, whiteboards, photos, or other visual artifacts. The plugin can analyze image content, produce image semantics, generate captions, and pass the resulting representation into memory extraction.
var request = ImageExtractionRequests.image(memoryId, imageContent);

var result = memory.extract(request).block();
imageContent is created with the image plugin’s content model or parser flow. See the image plugin or Java SDK docs for construction examples. Image extraction is useful when visual context matters, such as:
  • UI screenshots
  • architecture diagrams
  • error screenshots
  • visual bug reports
  • whiteboard notes
  • product or design references
Rather than discarding visual context, Memind converts images into source-level Raw Data that can support later memory retrieval.

Extract audio

Audio extraction requires memind-plugin-rawdata-audio. Use the audio plugin to extract memory from recordings, transcripts, voice notes, meetings, interviews, support calls, or agent sessions that include spoken context. Audio content can be transcribed, segmented, captioned, and processed into memory.
var request = AudioExtractionRequests.audio(memoryId, audioContent);

var result = memory.extract(request).block();
audioContent is created with the audio plugin’s content model or parser flow. See the audio plugin or Java SDK docs for construction examples. Audio extraction can preserve:
  • transcript segments
  • speaker or timing metadata when available
  • summarized captions
  • source references
  • extracted facts, events, preferences, and decisions
This lets Memind turn spoken context into searchable and retrievable memory.

Extract tool calls

Tool-call extraction requires memind-plugin-rawdata-toolcall. Use the tool-call plugin to extract memory from agent tool usage. Tool calls are important for agents because they capture what the agent tried, what worked, what failed, and what should be reused later.
var result = ToolCallMemories.report(memory, memoryId, toolCalls).block();
toolCalls is a list of tool-call records containing tool name, input, output, status, duration, and related metadata. Tool-call extraction is useful for:
  • coding agents
  • research agents
  • automation agents
  • test and deployment agents
  • long-running task agents
Tool-call memory can help Memind extract agent-scoped memory such as:
  • tool experience
  • durable directives
  • reusable playbooks
  • resolved problems
  • failed attempts
  • successful workflows
This is one of the main ways Memind helps agents remember their own operating experience, not only user preferences.

Extract custom raw content

Use extract() for custom content types. Custom raw content is useful when your application has domain-specific records that are not normal chat messages, documents, images, audio, or tool calls. Examples include:
  • internal business events
  • workflow records
  • logs
  • incident reports
  • CRM records
  • product analytics events
  • domain-specific artifacts
var result = memory.extract(
    memoryId,
    rawContent,
    ExtractionConfig.defaults()
).block();
For advanced use cases, implement a custom RawContent type and a raw-data processor or plugin. This lets Memind normalize your content into Raw Data before extracting memory.
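A custom content type might look like the following sketch. Note that this is hypothetical: the RawContent interface shape, its method names, and the CrmRecordContent example are invented here to show the idea; consult the plugin SPI documentation for the real contract.

```java
// Hypothetical sketch of a custom raw-content type. The RawContent interface
// and its methods are invented for illustration; Memind's actual SPI may differ.
public class CustomContentSketch {
    public interface RawContent {
        String contentType();   // e.g. "crm-record"
        String asText();        // normalized text for the extraction pipeline
    }

    // A domain-specific record wrapped as raw content, so the pipeline can
    // treat it like any other normalized source.
    public record CrmRecordContent(String accountId, String event, String note)
            implements RawContent {
        public String contentType() { return "crm-record"; }
        public String asText() {
            return "Account " + accountId + " | " + event + ": " + note;
        }
    }

    public static void main(String[] args) {
        RawContent content = new CrmRecordContent("acct-42", "renewal", "Renewed annual plan");
        System.out.println(content.contentType() + " -> " + content.asText());
    }
}
```

The design intent is that the wrapper carries enough source context (identifiers, event type) for captions and later retrieval, instead of passing bare text.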

Configure extraction

Most applications can start with the default extraction configuration. Use custom configuration when you need to control language, chunking, insight behavior, thread behavior, or plugin-specific processing.
Area           Controls
Language       Extraction language and prompt behavior.
Chunking       Conversation, document, audio, or content-specific segmentation.
Captions       Raw Data caption generation.
Memory Items   Item extraction behavior and category handling.
Insight        Insight Tree construction and asynchronous insight scheduling.
Threads        Memory Thread derivation and enrichment.
Plugins        Document, image, audio, tool-call, or custom content behavior.
Example:
var config = ExtractionConfig.defaults().withLanguage("Chinese");

var result = memory.addMessages(memoryId, messages, config).block();
Start with defaults, inspect the generated memory in Memind UI, then tune extraction behavior when needed.

Response format

All extraction modes return an ExtractionResult.
ExtractionResult
  memoryId
  rawDataResult
  memoryItemResult
  insightResult
  status
  duration
  errorMessage
  insightPending
Field              Meaning
memoryId           The memory identity that received the content.
rawDataResult      Raw Data construction result.
memoryItemResult   Memory Item extraction result.
insightResult      Insight construction result when available.
status             Extraction status such as SUCCESS, PARTIAL_SUCCESS, or FAILED.
duration           Total extraction duration.
errorMessage       Failure or partial-success error message.
insightPending     Whether insight work has been scheduled asynchronously but has not completed yet.
insightPending is important because Insight Tree construction may be asynchronous. A successful extraction can write Raw Data and Memory Items immediately while insight construction continues in the background.
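Result-handling logic can be sketched as follows. The ExtractionResult modeled here is a stand-in record whose fields mirror the list above; the real class's accessor names may differ.

```java
// Stand-in model for illustration: a result whose insight work may still be
// running. Field names mirror the documented ExtractionResult fields, but
// this is not the real Memind class.
public class ResultSketch {
    public enum Status { SUCCESS, PARTIAL_SUCCESS, FAILED }

    public record ExtractionResult(Status status, boolean insightPending, String errorMessage) {}

    // Decide what a caller should do with an extraction result.
    public static String handle(ExtractionResult result) {
        if (result.status() == Status.FAILED) {
            return "failed: " + result.errorMessage();
        }
        // Raw Data and Memory Items are written at this point; insight
        // construction may still be running in the background.
        return result.insightPending() ? "ok, insights pending" : "ok, complete";
    }
}
```

The branch on insightPending captures the key asymmetry: a successful extraction is usable immediately even while insight work continues asynchronously.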

Inspect extracted memory

Use Memind UI to inspect what extraction produced. Useful inspection points include:
View             What to inspect
Buffers          Pending and recent conversation context.
Raw Data         Segments, captions, source references, metadata, and content type.
Memory Items     Extracted facts, preferences, events, directives, playbooks, resolutions, and foresight.
Item Graph       Entities, aliases, mentions, and item relationships.
Memory Threads   Long-running topics, projects, workflows, incidents, and decisions.
Insight Tree     Higher-level understanding built from extracted memory.
Settings         Runtime configuration that affects extraction behavior.
If extraction does not look right, inspect Raw Data first. Raw Data captions usually show whether the source content was segmented and understood correctly before higher-level memory was extracted.

Best practices

Use the extraction mode that matches how content arrives:
  • Use addMessages() for complete conversation segments.
  • Use addMessage() for real-time agents and chatbots.
  • Use commit() at session end or before shutdown.
  • Use extract() for documents, images, audio, tool calls, and custom raw content.
  • Use plugin helpers when available.
Set up the right plugin:
  • Conversation extraction works with memind-core.
  • Document extraction needs the document raw-data plugin.
  • Image extraction needs the image raw-data plugin.
  • Audio extraction needs the audio raw-data plugin.
  • Tool-call extraction needs the tool-call raw-data plugin.
Preserve source context:
  • Keep useful metadata such as source client, content type, timestamps, file names, URLs, and tool names.
  • Prefer source-aware content objects over plain text when the source matters.
  • Inspect Raw Data captions before tuning item extraction.
Avoid fragmented memory:
  • Do not force every single chat message into memory as a complete unit.
  • Let streaming messages accumulate until there is enough context.
  • Commit the final buffer when a run ends.
Tune gradually:
  • Start with default extraction behavior.
  • Review Raw Data, Memory Items, Threads, and Insights in Memind UI.
  • Adjust language, chunking, plugin options, or insight behavior only when the output shows a real need.