# Memory
Memory management enables agents to maintain context across multi-turn conversations.
> **See Also:** For detailed API signatures, see the Memory API Reference.
## Overview
MARSYS provides a memory system that:
- **Maintains Context**: Preserves conversation history
- **Supports Multiple Types**: `ConversationMemory`, `ManagedConversationMemory`, `KGMemory`
- **Handles Token Limits**: `ManagedConversationMemory` automatically manages context size
## Core Components
### Message Structure
The fundamental unit of memory:
```python
from marsys.agents.memory import Message

@dataclass
class Message:
    role: str  # user, assistant, system, tool
    content: Optional[Union[str, Dict[str, Any], List[Dict[str, Any]]]] = None
    message_id: str  # Auto-generated UUID
    name: Optional[str] = None  # Tool name or model name
    tool_calls: Optional[List[ToolCallMsg]] = None
    agent_calls: Optional[List[AgentCallMsg]] = None
    structured_data: Optional[Dict[str, Any]] = None
    images: Optional[List[str]] = None  # For vision models
    tool_call_id: Optional[str] = None  # For tool response messages
    reasoning_details: Optional[List[Dict[str, Any]]] = None  # For model thinking/reasoning traces
```
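The `images` field is not exercised anywhere else on this page. A brief sketch of a vision message (the expected image encoding, URL strings versus base64, is an assumption here):

```python
# Sketch only: whether images are URLs or base64 strings is an assumption;
# check the Memory API Reference for the expected encoding.
msg = Message(
    role="user",
    content="What is in this picture?",
    images=["https://example.com/photo.jpg"],
)
```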
#### Reasoning Details
The `reasoning_details` field preserves model thinking/reasoning traces (e.g., Gemini 3 thought signatures). This is critical for multi-turn tool calling with models that use extended thinking, as the reasoning context must be preserved across turns.
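As an illustrative sketch of carrying such a trace (the payload shapes are assumptions: real `reasoning_details` entries are provider-specific, and plain dicts stand in for the typed `ToolCallMsg` objects):

```python
from marsys.agents.memory import ConversationMemory, Message

memory = ConversationMemory()

# Hypothetical reasoning payload; actual keys depend on the provider.
msg = Message(
    role="assistant",
    content=None,
    tool_calls=[{
        "id": "call_abc",
        "type": "function",
        "function": {"name": "search", "arguments": '{"query": "weather"}'},
    }],
    reasoning_details=[{"type": "thought_signature", "signature": "..."}],
)
memory.add(message=msg)

# On the next turn, get_messages() returns this message with its
# reasoning trace intact, so the model can continue the tool loop.
messages = memory.get_messages()
```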
### ConversationMemory
Standard memory implementation for storing conversation history:
```python
from marsys.agents.memory import ConversationMemory

# Create memory with optional system prompt
memory = ConversationMemory(description="You are a helpful assistant")

# Add a message
message_id = memory.add(role="user", content="Hello")

# Or add a Message object directly
from marsys.agents.memory import Message
msg = Message(role="assistant", content="Hi there!")
memory.add(message=msg)

# Retrieve messages (returns List[Dict])
all_messages = memory.retrieve_all()
recent_messages = memory.retrieve_recent(n=5)

# Get messages for LLM (same as retrieve_all for ConversationMemory)
llm_messages = memory.get_messages()

# Other operations
memory.retrieve_by_id("message-uuid-here")
memory.retrieve_by_role("user", n=3)
memory.remove_by_id("message-uuid-here")
memory.reset_memory()  # Clears all except system message
```
**Key Methods:**

| Method | Description |
|---|---|
| `add()` | Add a message; returns `message_id` |
| `update()` | Update an existing message by ID |
| `retrieve_all()` | Get all messages as dicts |
| `retrieve_recent(n)` | Get the last `n` messages as dicts |
| `get_messages()` | Get messages for LLM consumption |
| `retrieve_by_id()` | Get a message by ID |
| `retrieve_by_role()` | Filter messages by role |
| `remove_by_id()` | Delete a message by ID |
| `reset_memory()` | Clear all messages (keeps the system prompt) |
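Of the methods above, only `update()` is not demonstrated elsewhere on this page. A minimal sketch, assuming it takes the target `message_id` plus the fields to overwrite (confirm the exact signature in the Memory API Reference):

```python
# Assumed signature: update(message_id=..., <fields to change>)
msg_id = memory.add(role="assistant", content="Draft answer")
memory.update(message_id=msg_id, content="Final answer (revised)")
```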
### ManagedConversationMemory
Advanced memory with automatic token management:
```python
from marsys.agents.memory import ManagedConversationMemory, ManagedMemoryConfig

config = ManagedMemoryConfig(
    threshold_tokens=150000,  # When to trigger compaction
    image_token_estimate=800,
)
# Derived: compaction_target_tokens = threshold_tokens * (1 - min_reduction_ratio)

memory = ManagedConversationMemory(config=config)

# Usage is identical to ConversationMemory
memory.add(role="user", content="Hello")
messages = memory.get_messages()  # Returns curated context within token budget
```
#### Active Context Compaction (ACM)
Managed memory uses an active-context policy (`active_context`) to decide how and when to reduce context:

- `mode="compaction"`: run the processor pipeline and rewrite memory
- `mode="sliding_window"`: return a recent context window without rewriting raw history
- `processor_order` (default): `["tool_truncation", "summarization", "backward_packing"]`
- `excluded_processors`: skip specific processors by name
- `min_reduction_ratio`: minimum estimated savings ratio required to run non-final processors
```python
from marsys.agents.memory import (
    ActiveContextPolicyConfig,
    ManagedMemoryConfig,
    SummarizationConfig,
    ToolTruncationConfig,
)

memory_config = ManagedMemoryConfig(
    threshold_tokens=120_000,
    active_context=ActiveContextPolicyConfig(
        mode="compaction",
        processor_order=["tool_truncation", "summarization", "backward_packing"],
        excluded_processors=[],
        min_reduction_ratio=0.4,
        tool_truncation=ToolTruncationConfig(max_tool_message_tokens=1200),
        summarization=SummarizationConfig(output_max_tokens=6000),
    ),
)
```
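For contrast, a `sliding_window` policy skips the processor pipeline entirely; this sketch reuses only the fields documented above:

```python
memory_config = ManagedMemoryConfig(
    threshold_tokens=120_000,
    # Return a recent window of context; raw history is not rewritten
    active_context=ActiveContextPolicyConfig(mode="sliding_window"),
)
```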
The compaction target is derived automatically: `compaction_target_tokens = threshold_tokens * (1 - min_reduction_ratio)`. With the configuration above (`threshold_tokens=120_000`, `min_reduction_ratio=0.4`), compaction aims for roughly 72,000 tokens.
#### Optional Separate Compaction Model
You can keep your main model for task execution and use a cheaper/faster model for memory compaction:
```python
from marsys.agents import Agent
from marsys.models import ModelConfig

agent = Agent(
    model_config=ModelConfig(
        type="api", provider="openrouter", name="anthropic/claude-opus-4.6"
    ),
    compaction_model_config=ModelConfig(
        type="api",
        provider="openrouter",
        name="anthropic/claude-haiku-4.5",
    ),
    goal="Research and synthesis",
    instruction="You are a research agent.",
)
```
### KGMemory
Knowledge graph memory that extracts facts from text:
```python
from marsys.agents.memory import KGMemory

# Requires a model for fact extraction
memory = KGMemory(model=your_model, description="Initial context")

# Add facts directly
memory.add_fact(role="user", subject="Paris", predicate="is capital of", obj="France")

# Or add text and extract facts automatically
memory.add(role="user", content="The Eiffel Tower is in Paris.")
# Facts are extracted asynchronously using the model
```
### MemoryManager
Factory class that creates the appropriate memory type:
```python
from marsys.agents.memory import MemoryManager

# Create ConversationMemory
manager = MemoryManager(
    memory_type="conversation_history",
    description="System prompt",
)

# Create ManagedConversationMemory
manager = MemoryManager(
    memory_type="managed_conversation",
    description="System prompt",
    memory_config=ManagedMemoryConfig(...),
)

# Create KGMemory
manager = MemoryManager(
    memory_type="kg",
    description="System prompt",
    model=your_model,  # Required for KG
)

# Use like the underlying memory type
manager.add(role="user", content="Hello")
messages = manager.get_messages()

# Save/load for persistence (can include additional state)
manager.save_to_file("memory.json", additional_state={"planning": {...}})
additional_state = manager.load_from_file("memory.json")
```
## Memory Events
When memory is cleared via `reset_memory()`, MARSYS emits `MemoryResetEvent`. Managed compaction can also emit `CompactionEvent` (started, completed, failed) for status channels.
```python
from marsys.coordination.event_bus import EventBus

bus = EventBus()
manager.set_event_context(agent_name="Researcher", event_bus=bus, session_id="run_123")
```
When agents run through `Orchestra`/`auto_run()`, this context is wired automatically.
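Once the context is set, the documented memory operations publish to the bus without further code. A quick sketch (calling `reset_memory()` through the manager assumes it proxies the underlying memory, per the MemoryManager section above; consuming events from `EventBus` is outside the scope of this page):

```python
manager.add(role="user", content="Hello")

# reset_memory() emits a MemoryResetEvent tagged with the agent name
# ("Researcher") and session_id ("run_123") configured above.
manager.reset_memory()
```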
## Message Addition Examples
```python
from marsys.agents.memory import ConversationMemory

memory = ConversationMemory()

# User input
memory.add(role="user", content="Analyze the quarterly sales data")

# Agent response
memory.add(role="assistant", content="I'll analyze the data for you.")

# Tool call (assistant requesting a tool)
memory.add(
    role="assistant",
    content=None,
    tool_calls=[{
        "id": "call_123",
        "type": "function",
        "function": {
            "name": "analyze_sales",
            "arguments": '{"quarter": "Q4"}'
        }
    }]
)

# Tool result
memory.add(
    role="tool",
    content='{"total_sales": 1500000, "growth": "15%"}',
    tool_call_id="call_123",
    name="analyze_sales",
)
```
## Best Practices
### 1. Use Correct Methods
```python
# CORRECT
memory.add(role="user", content="Hello")
messages = memory.retrieve_all()
memory.reset_memory()

# WRONG - these methods don't exist
# memory.add_message(...)
# memory.get_recent(...)
# memory.clear()
```
### 2. Handle Tool Results Properly
```python
# CORRECT - Link the tool result to its tool call
memory.add(
    role="tool",
    content=result,
    tool_call_id="call_123",
    name="tool_name",
)

# WRONG - No association with the originating tool call
memory.add(role="tool", content=result)
```
### 3. Use ManagedConversationMemory for Long Conversations
For conversations that may exceed token limits:
```python
from marsys.agents.memory import MemoryManager, ManagedMemoryConfig

manager = MemoryManager(
    memory_type="managed_conversation",
    memory_config=ManagedMemoryConfig(threshold_tokens=100000),
)
```
## Next Steps
- **Messages**: Understand message types and formats
- **Agents**: How agents use memory systems
- **Tools**: Tool results in memory
## Summary

MARSYS provides `ConversationMemory` for basic needs, `ManagedConversationMemory` for automatic token management, and `KGMemory` for knowledge graphs.