# Memory
Memory management enables agents to maintain context across multi-turn conversations.
> **See Also:** For detailed API signatures, see the Memory API Reference.
## Overview
MARSYS provides a memory system that:
- **Maintains Context**: Preserves conversation history
- **Supports Multiple Types**: `ConversationMemory`, `ManagedConversationMemory`, `KGMemory`
- **Handles Token Limits**: `ManagedConversationMemory` automatically manages context size
## Core Components
### Message Structure
The fundamental unit of memory:
```python
from marsys.agents.memory import Message

@dataclass
class Message:
    role: str  # user, assistant, system, tool
    content: Optional[Union[str, Dict[str, Any], List[Dict[str, Any]]]] = None
    message_id: str  # Auto-generated UUID
    name: Optional[str] = None  # Tool name or model name
    tool_calls: Optional[List[ToolCallMsg]] = None
    agent_calls: Optional[List[AgentCallMsg]] = None
    structured_data: Optional[Dict[str, Any]] = None
    images: Optional[List[str]] = None  # For vision models
    tool_call_id: Optional[str] = None  # For tool response messages
    reasoning_details: Optional[List[Dict[str, Any]]] = None  # For model thinking/reasoning traces
```
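The `images` field is not exercised anywhere else on this page. A brief sketch of a vision message (the expected image encoding, URL strings versus base64, is an assumption here):

```python
# Sketch only: whether images are URLs or base64 strings is an assumption;
# check the Memory API Reference for the expected encoding.
msg = Message(
    role="user",
    content="What is in this picture?",
    images=["https://example.com/photo.jpg"],
)
```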
#### Reasoning Details
The `reasoning_details` field preserves model thinking/reasoning traces (e.g., Gemini 3 thought signatures). This is critical for multi-turn tool calling with models that use extended thinking, as the reasoning context must be preserved across turns.
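As an illustrative sketch of carrying such a trace (the payload shapes are assumptions: real `reasoning_details` entries are provider-specific, and plain dicts stand in for the typed `ToolCallMsg` objects):

```python
from marsys.agents.memory import ConversationMemory, Message

memory = ConversationMemory()

# Hypothetical reasoning payload; actual keys depend on the provider.
msg = Message(
    role="assistant",
    content=None,
    tool_calls=[{
        "id": "call_abc",
        "type": "function",
        "function": {"name": "search", "arguments": '{"query": "weather"}'},
    }],
    reasoning_details=[{"type": "thought_signature", "signature": "..."}],
)
memory.add(message=msg)

# On the next turn, get_messages() returns this message with its
# reasoning trace intact, so the model can continue the tool loop.
messages = memory.get_messages()
```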
### ConversationMemory
Standard memory implementation for storing conversation history:
```python
from marsys.agents.memory import ConversationMemory

# Create memory with optional system prompt
memory = ConversationMemory(description="You are a helpful assistant")

# Add a message
message_id = memory.add(role="user", content="Hello")

# Or add a Message object directly
from marsys.agents.memory import Message
msg = Message(role="assistant", content="Hi there!")
memory.add(message=msg)

# Retrieve messages (returns List[Dict])
all_messages = memory.retrieve_all()
recent_messages = memory.retrieve_recent(n=5)

# Get messages for LLM (same as retrieve_all for ConversationMemory)
llm_messages = memory.get_messages()

# Other operations
memory.retrieve_by_id("message-uuid-here")
memory.retrieve_by_role("user", n=3)
memory.remove_by_id("message-uuid-here")
memory.reset_memory()  # Clears all except system message
```
**Key Methods:**

| Method | Description |
|---|---|
| `add()` | Add a message; returns `message_id` |
| `update()` | Update an existing message by ID |
| `retrieve_all()` | Get all messages as dicts |
| `retrieve_recent(n)` | Get the last `n` messages as dicts |
| `get_messages()` | Get messages for LLM consumption |
| `retrieve_by_id()` | Get a message by ID |
| `retrieve_by_role()` | Filter messages by role |
| `remove_by_id()` | Delete a message by ID |
| `reset_memory()` | Clear all messages (keeps the system prompt) |
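Of the methods above, only `update()` is not demonstrated elsewhere on this page. A minimal sketch, assuming it takes the target `message_id` plus the fields to overwrite (confirm the exact signature in the Memory API Reference):

```python
# Assumed signature: update(message_id=..., <fields to change>)
msg_id = memory.add(role="assistant", content="Draft answer")
memory.update(message_id=msg_id, content="Final answer (revised)")
```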
### ManagedConversationMemory
Advanced memory with automatic token management:
```python
from marsys.agents.memory import ManagedConversationMemory, ManagedMemoryConfig

config = ManagedMemoryConfig(
    threshold_tokens=150000,  # When to trigger compaction
    image_token_estimate=800,
)
# Derived: compaction_target_tokens = threshold_tokens * (1 - min_reduction_ratio)

memory = ManagedConversationMemory(config=config)

# Usage is identical to ConversationMemory
memory.add(role="user", content="Hello")
messages = memory.get_messages()  # Returns curated context within token budget
```
#### Active Context Compaction (ACM)
Managed memory uses an active-context policy (`active_context`) to decide how and when to reduce context:

- `mode="compaction"`: run the processor pipeline and rewrite memory
- `mode="sliding_window"`: return a recent context window without rewriting raw history
- `processor_order` (default): `["tool_truncation", "summarization", "backward_packing"]`
- `excluded_processors`: skip specific processors by name
- `min_reduction_ratio`: minimum estimated savings ratio required to run non-final processors
```python
from marsys.agents.memory import (
    ActiveContextPolicyConfig,
    ManagedMemoryConfig,
    SummarizationConfig,
    ToolTruncationConfig,
)

memory_config = ManagedMemoryConfig(
    threshold_tokens=120_000,
    active_context=ActiveContextPolicyConfig(
        mode="compaction",
        processor_order=["tool_truncation", "summarization", "backward_packing"],
        excluded_processors=[],
        min_reduction_ratio=0.4,
        tool_truncation=ToolTruncationConfig(max_tool_message_tokens=1200),
        summarization=SummarizationConfig(output_max_tokens=6000),
    ),
)
```
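For contrast, a `sliding_window` policy skips the processor pipeline entirely; this sketch reuses only the fields documented above:

```python
memory_config = ManagedMemoryConfig(
    threshold_tokens=120_000,
    # Return a recent window of context; raw history is not rewritten
    active_context=ActiveContextPolicyConfig(mode="sliding_window"),
)
```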
The compaction target is derived automatically: `compaction_target_tokens = threshold_tokens * (1 - min_reduction_ratio)`. With the configuration above (`threshold_tokens=120_000`, `min_reduction_ratio=0.4`), compaction aims for roughly 72,000 tokens.
#### Optional Separate Compaction Model
You can keep your main model for task execution and use a cheaper/faster model for memory compaction:
```python
from marsys.agents import Agent
from marsys.models import ModelConfig

agent = Agent(
    model_config=ModelConfig(
        type="api", provider="openrouter", name="anthropic/claude-opus-4.6"
    ),
    compaction_model_config=ModelConfig(
        type="api",
        provider="openrouter",
        name="anthropic/claude-haiku-4.5",
    ),
    goal="Research and synthesis",
    instruction="You are a research agent.",
)
```
### KGMemory
Knowledge graph memory that extracts facts from text:
```python
from marsys.agents.memory import KGMemory

# Requires a model for fact extraction
memory = KGMemory(model=your_model, description="Initial context")

# Add facts directly
memory.add_fact(role="user", subject="Paris", predicate="is capital of", obj="France")

# Or add text and extract facts automatically
memory.add(role="user", content="The Eiffel Tower is in Paris.")
# Facts are extracted asynchronously using the model
```
### MemoryManager
Factory class that creates the appropriate memory type:
```python
from marsys.agents.memory import MemoryManager

# Create ConversationMemory
manager = MemoryManager(
    memory_type="conversation_history",
    description="System prompt",
)

# Create ManagedConversationMemory
manager = MemoryManager(
    memory_type="managed_conversation",
    description="System prompt",
    memory_config=ManagedMemoryConfig(...),
)

# Create KGMemory
manager = MemoryManager(
    memory_type="kg",
    description="System prompt",
    model=your_model,  # Required for KG
)

# Use like the underlying memory type
manager.add(role="user", content="Hello")
messages = manager.get_messages()

# Save/load for persistence (can include additional state)
manager.save_to_file("memory.json", additional_state={"planning": {...}})
additional_state = manager.load_from_file("memory.json")
```
## Memory Events
When memory is cleared via `reset_memory()`, MARSYS emits `MemoryResetEvent`. Managed compaction can also emit `CompactionEvent` (started, completed, failed) for status channels.
```python
from marsys.coordination.event_bus import EventBus

bus = EventBus()
manager.set_event_context(agent_name="Researcher", event_bus=bus, session_id="run_123")
```
When agents run through `Orchestra`/`auto_run()`, this context is wired automatically.
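Once the context is set, the documented memory operations publish to the bus without further code. A quick sketch (calling `reset_memory()` through the manager assumes it proxies the underlying memory, per the MemoryManager section above; consuming events from `EventBus` is outside the scope of this page):

```python
manager.add(role="user", content="Hello")

# reset_memory() emits a MemoryResetEvent tagged with the agent name
# ("Researcher") and session_id ("run_123") configured above.
manager.reset_memory()
```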
## Message Addition Examples
```python
from marsys.agents.memory import ConversationMemory

memory = ConversationMemory()

# User input
memory.add(role="user", content="Analyze the quarterly sales data")

# Agent response
memory.add(role="assistant", content="I'll analyze the data for you.")

# Tool call (assistant requesting a tool)
memory.add(
    role="assistant",
    content=None,
    tool_calls=[{
        "id": "call_123",
        "type": "function",
        "function": {
            "name": "analyze_sales",
            "arguments": '{"quarter": "Q4"}'
        }
    }]
)

# Tool result
memory.add(
    role="tool",
    content='{"total_sales": 1500000, "growth": "15%"}',
    tool_call_id="call_123",
    name="analyze_sales",
)
```
## Best Practices
### 1. Use Correct Methods
```python
# CORRECT
memory.add(role="user", content="Hello")
messages = memory.retrieve_all()
memory.reset_memory()

# WRONG - these methods don't exist
# memory.add_message(...)
# memory.get_recent(...)
# memory.clear()
```
### 2. Handle Tool Results Properly
```python
# CORRECT - Link the tool result to its tool call
memory.add(
    role="tool",
    content=result,
    tool_call_id="call_123",
    name="tool_name",
)

# WRONG - No association with the originating tool call
memory.add(role="tool", content=result)
```
### 3. Use ManagedConversationMemory for Long Conversations
For conversations that may exceed token limits:
```python
from marsys.agents.memory import MemoryManager, ManagedMemoryConfig

manager = MemoryManager(
    memory_type="managed_conversation",
    memory_config=ManagedMemoryConfig(threshold_tokens=100000),
)
```
## Next Steps
- **Messages**: Understand message types and formats
- **Agents**: How agents use memory systems
- **Tools**: Tool results in memory
## Summary

MARSYS provides `ConversationMemory` for basic needs, `ManagedConversationMemory` for automatic token management, and `KGMemory` for knowledge graphs.