Specialized Agents
MARSYS provides specialized agents that extend the base Agent class with domain-specific tools and instructions. These agents are production-ready and optimized for common tasks.
Overview
Specialized agents combine:
- Domain-specific tools: Curated toolsets for specific tasks
- Scenario-based instructions: Adaptive guidance rather than rigid workflows
- Configurable capabilities: Enable/disable features based on requirements
- Security features: Built-in validation and safety mechanisms
Available Specialized Agents
BrowserAgent
Autonomous browser automation with vision-based interaction and screenshot analysis.
Best for: Web scraping, UI testing, form filling, web research, dynamic content extraction
Key Features:
- Vision-based element interaction (no selectors needed)
- Multi-mode operation (primitive, advanced)
- Screenshot analysis with multimodal models
- Auto-screenshot management with sliding window
- Session persistence: Save and load browser sessions (cookies, localStorage)
- Tab management: List, switch, and close browser tabs
    from marsys.agents import BrowserAgent

    # BrowserAgent requires async creation via create_safe()
    agent = await BrowserAgent.create_safe(
        model_config=config,
        name="WebAutomation",
        mode="advanced",  # "primitive" or "advanced"
        headless=True,
        session_path="./sessions/my_session.json",  # Optional: load existing session
    )
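The "sliding window" screenshot management listed above can be pictured as a bounded buffer that drops the oldest screenshot once a limit is reached. The sketch below is only an illustration of that idea, not the BrowserAgent implementation; `ScreenshotWindow` and `max_size` are hypothetical names introduced here.

```python
from collections import deque


class ScreenshotWindow:
    """Hypothetical helper: keep only the most recent `max_size` screenshots."""

    def __init__(self, max_size=3):
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        # A deque with maxlen discards the oldest item when a new one is
        # appended to a full buffer.
        self._shots = deque(maxlen=max_size)

    def add(self, image):
        """Store a new screenshot (raw bytes in this sketch)."""
        self._shots.append(image)

    def current(self):
        """Return the retained screenshots, oldest first."""
        return list(self._shots)


window = ScreenshotWindow(max_size=2)
for shot in (b"page-1", b"page-2", b"page-3"):
    window.add(shot)

print(window.current())  # [b'page-2', b'page-3']
```

Bounding the buffer keeps the amount of image data sent to a multimodal model roughly constant, regardless of how long the browsing session runs.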
FileOperationAgent
Intelligent file and directory operations with optional bash command execution.
Best for: Code analysis, configuration management, log processing, documentation generation
Key Features:
- Type-aware file handling (Python, JSON, PDF, Markdown, images)
- Unified diff editing with high success rate
- Content and structure search (ripgrep-based)
- Optional bash tools for complex operations
- Security: Command validation, blocked dangerous patterns, timeouts
    from marsys.agents import FileOperationAgent

    agent = FileOperationAgent(
        model_config=config,
        name="FileHelper",
        enable_bash=True,  # Enable bash commands
        allowed_bash_commands=["grep", "find", "wc"],  # Whitelist
    )
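To make the security bullet more concrete, here is a minimal sketch of how a command whitelist, blocked patterns, and a timeout might fit together. It is an assumption-based illustration, not the FileOperationAgent internals; `validate_command`, `run_validated`, and `BLOCKED_FRAGMENTS` are hypothetical names, and the blocked list is deliberately small.

```python
import shlex
import subprocess

# Illustrative only: shell metacharacters that enable chaining or redirection.
BLOCKED_FRAGMENTS = (";", "&", "|", ">", "<", "`", "$(")


def validate_command(command, allowed):
    """Return the parsed argv if the command passes validation, else raise ValueError."""
    for fragment in BLOCKED_FRAGMENTS:
        if fragment in command:
            raise ValueError(f"blocked pattern {fragment!r} in command")
    argv = shlex.split(command)  # raises ValueError on unbalanced quotes
    if not argv:
        raise ValueError("empty command")
    if argv[0] not in allowed:
        raise ValueError(f"command {argv[0]!r} is not whitelisted")
    return argv


def run_validated(command, allowed, timeout_s=10.0):
    """Run a validated command without a shell and with a hard timeout."""
    argv = validate_command(command, allowed)
    return subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout_s, check=False
    )


allowed = ["grep", "find", "wc"]
print(validate_command("grep -n TODO app.py", allowed))

for bad in ("rm -rf /tmp/x", "grep x file.txt; rm file.txt"):
    try:
        validate_command(bad, allowed)
    except ValueError as exc:
        print(f"rejected: {exc}")
```

Rejecting metacharacters before tokenizing is conservative: it also refuses harmless quoted patterns such as `"a|b"`, which is a trade-off this sketch accepts for simplicity.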
WebSearchAgent
Multi-source information gathering across web and scholarly databases.
Best for: Research, fact-checking, literature reviews, current events
Key Features:
- Multi-source search (Bing, Google, arXiv, Semantic Scholar, PubMed)
- Configurable search modes (web, scholarly, or all)
- API key validation at initialization
- Query formulation strategies
    import os

    from marsys.agents import WebSearchAgent

    agent = WebSearchAgent(
        model_config=config,
        name="Researcher",
        search_mode="all",  # "web", "scholarly", or "all"
        bing_api_key=os.getenv("BING_SEARCH_API_KEY"),
    )
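The "API key validation at initialization" feature means a misconfigured agent fails when it is constructed rather than at the first search. The sketch below shows the general pattern under stated assumptions: that the "web" and "all" modes need at least one web-search key while "scholarly" needs none. `check_search_config` and the `google_api_key` parameter are hypothetical and are not taken from the MARSYS API.

```python
# Assumption for this sketch: these modes require a web-search API key.
MODES_NEEDING_WEB_KEY = {"web", "all"}
VALID_MODES = {"web", "scholarly", "all"}


def check_search_config(search_mode, bing_api_key=None, google_api_key=None):
    """Raise ValueError early if the mode is unknown or a required key is missing."""
    if search_mode not in VALID_MODES:
        raise ValueError(f"unknown search_mode: {search_mode!r}")
    has_web_key = bool(bing_api_key or google_api_key)
    if search_mode in MODES_NEEDING_WEB_KEY and not has_web_key:
        raise ValueError(
            f"search_mode={search_mode!r} needs a Bing or Google API key"
        )


check_search_config("scholarly")  # passes: no key needed in this sketch

try:
    check_search_config("web")
except ValueError as exc:
    print(f"Configuration rejected: {exc}")
```

Failing at construction time lets the caller handle the problem in one place, for example by falling back to a mode that needs no key.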
Comparison
| Agent | Primary Use Case | Tools | API Keys Required |
|---|---|---|---|
| BrowserAgent | Web automation | Browser control | None |
| FileOperationAgent | File system operations | 6-16 tools | None |
| WebSearchAgent | Information gathering | 1-5 sources | Bing or Google (web search only) |
Common Patterns
Multi-Agent Workflow
Combine specialized agents in a topology:
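    from marsys.agents import BrowserAgent, FileOperationAgent, WebSearchAgent
    from marsys.coordination import Orchestra
    from marsys.coordination.topology.patterns import PatternConfig

    # Names are set to match the spokes listed in the topology below
    browser_agent = await BrowserAgent.create_safe(
        model_config=config, name="BrowserAgent", mode="advanced"
    )
    file_agent = FileOperationAgent(model_config=config, name="FileHelper", enable_bash=True)
    search_agent = WebSearchAgent(model_config=config, name="Researcher", search_mode="scholarly")

    topology = PatternConfig.hub_and_spoke(
        hub="Coordinator",
        spokes=["BrowserAgent", "FileHelper", "Researcher"],
    )

    result = await Orchestra.run(
        task="Research topic, scrape related websites, and analyze findings",
        topology=topology,
    )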
Agent Pools for Parallel Execution
Some specialized agents support pooling for true parallel execution:
    from marsys.agents import create_browser_agent_pool

    # Create pool of 3 browser instances
    pool = await create_browser_agent_pool(
        num_instances=3,
        model_config=config,
        mode="advanced",  # "primitive" or "advanced"
        headless=True,
    )

    # Acquire instance for task
    async with pool.acquire(branch_id="task_1") as agent:
        result = await agent.run("Navigate to example.com")
    # Pool handles instance allocation and release
Pool Benefits
Agent pools provide true parallelism with separate instances, automatic instance management, fair allocation with queuing, and proper resource cleanup.
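The benefits listed above (separate instances, queued allocation, guaranteed release) follow a common pattern that can be sketched with an `asyncio.Queue`. This is a generic illustration and makes no claim about how the MARSYS pool is built; `SimplePool` and `worker` are hypothetical names, and plain strings stand in for browser instances.

```python
import asyncio
from contextlib import asynccontextmanager


class SimplePool:
    """Hypothetical fixed-size pool of reusable instances."""

    def __init__(self, instances):
        self._free = asyncio.Queue()
        for instance in instances:
            self._free.put_nowait(instance)

    @asynccontextmanager
    async def acquire(self):
        # Callers wait here when every instance is in use.
        instance = await self._free.get()
        try:
            yield instance
        finally:
            # Always hand the instance back, even if the task raised.
            self._free.put_nowait(instance)


async def worker(pool, task_id, log):
    async with pool.acquire() as instance:
        log.append((task_id, instance))
        await asyncio.sleep(0.01)  # stand-in for real work


async def main():
    pool = SimplePool(["browser-0", "browser-1"])
    log = []
    # Four tasks share two instances; at most two run at the same time.
    await asyncio.gather(*(worker(pool, i, log) for i in range(4)))
    return log


log = asyncio.run(main())
print(log)
```

Returning the instance in a `finally` block is what gives the cleanup guarantee: an exception inside the `async with` body cannot leak an instance out of the pool.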
Best Practices
Choose the Right Agent
- Use BrowserAgent when interacting with dynamic web content, filling forms, or scraping JavaScript-rendered pages
- Use FileOperationAgent when working with local files, bash commands, or analyzing codebases
- Use WebSearchAgent when gathering information from online sources or conducting research
Configure Security Appropriately
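    # BrowserAgent: production settings (async creation via create_safe())
    browser_agent = await BrowserAgent.create_safe(
        model_config=config,
        name="WebAutomation",
        mode="advanced",  # "primitive" or "advanced"
        headless=True,  # No GUI
        timeout=30,
    )

    # FileOperationAgent: strict whitelist
    file_agent = FileOperationAgent(
        model_config=config,
        name="FileHelper",
        enable_bash=True,
        allowed_bash_commands=["grep", "find", "wc"],
    )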
Handle Missing API Keys Gracefully
    import os

    try:
        search_agent = WebSearchAgent(
            config,
            search_mode="web",
            bing_api_key=os.getenv("BING_SEARCH_API_KEY"),
        )
    except ValueError as e:
        print(f"Missing API key: {e}")
        # Fall back to scholarly-only mode
        search_agent = WebSearchAgent(config, search_mode="scholarly")
Creating Custom Specialized Agents
To create your own specialized agent:
    from marsys.agents import Agent


    class MySpecializedAgent(Agent):
        def __init__(self, model_config, **kwargs):
            # Initialize tools (_build_tools is a helper you define on this class)
            tools = self._build_tools()
            # Build instruction (_build_instruction is likewise your own helper)
            instruction = self._build_instruction()
            super().__init__(
                model_config=model_config,
                goal="Your goal here",
                instruction=instruction,
                tools=tools,
                **kwargs,
            )
Specialized Agents Ready!
MARSYS provides production-ready specialized agents for browser automation, file operations, and web search. Choose the right agent for your task and combine them in topologies for complex workflows.