AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation

Reference: Wu, Bansal, Zhang, Wu, Li, Zhu, Jiang, Zhang, Zhang, Liu, Awadallah, White, Burger, Wang (2023). Microsoft Research et al. arXiv:2308.08155v2. Source file: 2308.08155v2.pdf.

Summary

AutoGen is an open-source Microsoft framework for building LLM applications as conversations among customisable conversable agents. Each agent has a configurable back-end (LLMs, humans, tools, or a combination) and can send, receive, and react to messages. Developers compose applications by (1) defining specialised conversable agents and (2) programming their interaction patterns via natural language prompts and/or code — a paradigm the authors call conversation programming.

The framework supports diverse topologies (two-agent chat, group chat, hierarchical chat, dynamic routing), human-in-the-loop participation, and tool execution via code or function calls. Empirical studies demonstrate AutoGen on math, coding, QA, operations research, online decision-making, and entertainment tasks, showing that multi-agent conversations can exceed single-agent performance while reducing development effort.
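The send/receive/react loop described above can be pictured with a small sketch. This is a toy model, not the AutoGen API: `ToyAgent`, `fake_llm`, and `tool_executor` are hypothetical names, the "LLM" is a hard-coded function, and the "tool" only understands a tiny `a * b` expression.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]
# A back-end maps the conversation history to a reply, or None to stop.
ReplyFn = Callable[[List[Message]], Optional[str]]


@dataclass
class ToyAgent:
    """Hypothetical stand-in for a conversable agent (not the AutoGen class)."""
    name: str
    reply_fn: ReplyFn
    history: List[Message] = field(default_factory=list)

    def send(self, content: str, recipient: "ToyAgent") -> None:
        # Record the outgoing message, then hand it to the recipient.
        self.history.append({"sender": self.name, "content": content})
        recipient.receive(content, sender=self)

    def receive(self, content: str, sender: "ToyAgent") -> None:
        # Record the incoming message and react with an automatic reply.
        self.history.append({"sender": sender.name, "content": content})
        reply = self.reply_fn(self.history)
        if reply is not None:
            self.send(reply, sender)


def fake_llm(history: List[Message]) -> Optional[str]:
    # Stand-in for an LLM back-end: proposes code, then ends once it sees a result.
    if history[-1]["content"].startswith("RESULT"):
        return "TERMINATE"
    return "CODE: 6 * 7"


def tool_executor(history: List[Message]) -> Optional[str]:
    # Stand-in for a tool back-end: evaluates "CODE: a * b", ignores anything else.
    last = history[-1]["content"]
    if last.startswith("CODE: "):
        a, b = last[len("CODE: "):].split(" * ")
        return f"RESULT: {int(a) * int(b)}"
    return None  # includes the TERMINATE case, which ends the conversation


assistant = ToyAgent("assistant", fake_llm)
user_proxy = ToyAgent("user_proxy", tool_executor)
user_proxy.send("What is 6 times 7?", assistant)
```

Both agents share the same interface and differ only in their back-end function, which is the point of the uniform message abstraction. A human back-end would be a further `reply_fn` that prompts for input.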

Key Ideas

Conversable agents as the unifying abstraction — uniform message interface over LLMs, humans, and tools.
Conversation programming: defining agent capabilities + scripting their interaction patterns as the application-building paradigm.
Flexible conversation topologies: joint chat, hierarchical chat, group chat, dynamic routing.
Human-in-the-loop and tool execution as first-class participants, not special cases.
Empirical validation across six domains showing modular composition yields strong task performance.
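As a rough illustration of the group-chat and dynamic-routing topologies listed above, the sketch below keeps one shared history and lets a selector function decide who speaks next. All names (`run_group_chat`, `coder`, `reviewer`, `route`) are hypothetical, and the routing rule is hand-written here; in a real system the selection could itself be delegated to an LLM.

```python
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]
# Each agent is modelled as a function from the shared history to a reply.
AgentFn = Callable[[List[Message]], str]


def run_group_chat(
    agents: Dict[str, AgentFn],
    select_speaker: Callable[[List[Message]], Optional[str]],
    opening: Message,
    max_rounds: int = 10,
) -> List[Message]:
    """Append every reply to a shared history; a selector routes the turns."""
    history = [opening]
    for _ in range(max_rounds):
        speaker = select_speaker(history)
        if speaker is None:  # the selector decides the chat is over
            break
        history.append({"sender": speaker, "content": agents[speaker](history)})
    return history


def coder(history: List[Message]) -> str:
    # Produces a new draft version each time it is asked to speak.
    version = sum(1 for m in history if m["sender"] == "coder") + 1
    return f"draft v{version}"


def reviewer(history: List[Message]) -> str:
    # Rejects the first draft and accepts the second.
    return "approve" if history[-1]["content"] == "draft v2" else "revise"


def route(history: List[Message]) -> Optional[str]:
    # Dynamic routing: the next speaker depends on the last message.
    last = history[-1]
    if last["sender"] in ("user", "coder"):
        return "coder" if last["sender"] == "user" else "reviewer"
    return None if last["content"] == "approve" else "coder"


transcript = run_group_chat(
    {"coder": coder, "reviewer": reviewer},
    route,
    {"sender": "user", "content": "Write a hello-world script"},
)
```

Swapping `route` for a fixed rotation gives a static pattern, and nesting a `run_group_chat` call inside an agent function gives a hierarchical one.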

Connections

Conceptual Contribution

Claim: Multi-agent LLM applications are best built as conversations between conversable agents whose behaviour is programmed via a fusion of natural-language prompts and code; this abstraction unifies LLM, human, and tool participants under one message-passing model.
Mechanism: Introduces a Python framework with ConversableAgent, AssistantAgent, and UserProxyAgent classes; each received message triggers LLM inference, a request for human input, or tool/code execution; developers compose agents and steer the conversation flow through prompts and code. Case studies quantify gains over single-agent baselines.
Concepts introduced/used: LLM Agents, Multi-Agent Systems, Tool Use, Agent Communication Languages, Interoperability
Stance: framework / empirical study
Relates to: Cited by Survey Of Agent Interoperability Protocols as prior art for in-framework agent coordination that protocols like Agent-to-Agent Protocol now aim to standardise across frameworks. Its conversation-programming abstraction is a concrete instance of the communication-centric view advocated by Beyond Self-Talk - Communication-Centric Survey Of LLM Multi-Agent Systems.

Tags

#llm-agents #multi-agent-systems #framework #tool-use #conversation-programming

Beyond Self-Talk - Communication-Centric Survey Of LLM Multi-Agent Systems

Summary

This review argues that prior surveys of LLM-based Multi-Agent Systems (LLM-MAS) over-emphasise application domains and agent architectures while neglecting the communication layer that actually enables collaboration. The authors propose a two-level analytical framework separating system-level communication (architecture, goals, and protocols — how agents are organised) from system-internal communication (strategies, paradigms, objects, and content — what messages carry and how they are interpreted).

Drawing on classical communication theory’s source/channel split, they decompose LLM-MAS workflows into speaker/listener, message format, negotiation paradigm, and coordination protocol, then survey representative works under each cell. The review highlights communication efficiency, security vulnerabilities, and benchmark inadequacy as primary open problems.

Key Ideas

Communication as the missing analytical layer in LLM-MAS surveys.

Two-level framework: system-level (architecture, goal, protocol) vs system-internal (strategy, paradigm, object, content) communication.

Adoption of Shannon-style source/channel abstractions to describe LLM agent exchanges.

Brain / Perception / Action model of LLM agents as the atomic communication node.

Open issues: scalability, security of inter-agent channels, multimodal message formats, benchmarking.

Conceptual Contribution

Claim: The analytical primitive for understanding LLM-MAS is communication, not architecture; a two-level framework (system-level vs system-internal) captures how message protocol choices shape emergent collective behaviour.

Mechanism: Repurposes classical communication-theory distinctions (source/channel, architecture/content) as a taxonomy, then classifies and compares LLM-MAS workflows under each axis, exposing gaps in current designs.

Concepts introduced/used: LLM Agents, Multi-Agent Systems, Agent Communication Languages, Interoperability

Stance: survey

Relates to: Complements Survey Of Agent Interoperability Protocols by analysing communication patterns inside MAS, whereas that survey focuses on inter-agent wire protocols. Shares the communication-first lens of KQML Language And Protocol and FIPA-ACL, but reframes it for LLM agents.

Survey Of Agent Interoperability Protocols

Summary

This survey examines four emerging agent communication protocols targeting different interoperability tiers: the Model Context Protocol (MCP) for JSON-RPC tool invocation and context delivery; the Agent Communication Protocol (ACP) for REST-native multi-part performative messaging; the Agent-to-Agent Protocol (A2A) for peer-to-peer Agent-Card-based task outsourcing; and the Agent Network Protocol (ANP) for decentralized discovery using DIDs and JSON-LD.

The authors contrast architectures, discovery mechanisms, security models, and communication patterns, then recommend a phased adoption roadmap: MCP for tool access, then ACP for messaging, A2A for collaborative execution, and ANP for open marketplaces. A timeline traces the protocols' ancestry from KQML (1993) and FIPA-ACL (2000) through RAG, ReAct, and function calling to modern agent protocols.
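The Agent-Card-based outsourcing mentioned in the summary can be sketched as a lookup over advertised skills. The card below is a simplified, hypothetical subset of fields chosen for illustration (`name`, `url`, `skills` with `id` and `tags`); the `find_agent` helper and the `agents.example` URLs are assumptions, and no network discovery is modelled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Skill:
    id: str
    tags: List[str] = field(default_factory=list)


@dataclass
class AgentCard:
    """Simplified, hypothetical subset of an Agent Card's fields."""
    name: str
    url: str
    skills: List[Skill] = field(default_factory=list)


def find_agent(cards: List[AgentCard], tag: str) -> Optional[AgentCard]:
    """Return the first card that advertises a skill with the given tag."""
    for card in cards:
        if any(tag in skill.tags for skill in card.skills):
            return card
    return None


cards = [
    AgentCard("translator", "https://agents.example/translator",
              [Skill("translate-text", ["translation"])]),
    AgentCard("planner", "https://agents.example/planner",
              [Skill("plan-trip", ["travel", "planning"])]),
]

# A client with a travel task picks a peer by capability, not by name.
chosen = find_agent(cards, "travel")
```

The delegating agent would then send its task to `chosen.url`; that step depends on the protocol's task API and is left out here.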

Key Ideas

Phased adoption roadmap: MCP -> ACP -> A2A -> ANP.

MCP core primitives: Tools, Resources, Prompts, Sampling under JSON-RPC 2.0.

A2A introduces Agent Cards, Tasks, Artifacts for enterprise-scale delegation.

ANP uses DIDs and JSON-LD for decentralized, internet-scale agent discovery.

Security threats tabulated across creation/operation/update lifecycle phases.
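To make the "Tools under JSON-RPC 2.0" idea concrete, here is a minimal sketch of a tool invocation round trip. The envelope fields (`jsonrpc`, `id`, `method`, `params`, `result`, `error`) and the error codes come from JSON-RPC 2.0; the `tools/call` method name and the `content` result shape reflect my understanding of the MCP specification. The `add` tool, `make_request`, and `handle` are hypothetical, and transport, initialisation, and capability negotiation are omitted.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical tool registry; "add" is illustrative, not a real server's tool.
TOOLS: Dict[str, Callable[..., Any]] = {"add": lambda a, b: a + b}


def make_request(req_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialise a JSON-RPC 2.0 request that asks a server to run a tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": req_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


def handle(raw: str) -> str:
    """Toy server: dispatch one request and reply with a result or an error."""
    req = json.loads(raw)
    reply: Dict[str, Any] = {"jsonrpc": "2.0", "id": req.get("id")}
    tool = TOOLS.get(req.get("params", {}).get("name"))
    if req.get("method") != "tools/call":
        # -32601 is the JSON-RPC 2.0 code for "Method not found".
        reply["error"] = {"code": -32601, "message": "Method not found"}
    elif tool is None:
        # This sketch reports an unknown tool as -32602, "Invalid params".
        reply["error"] = {"code": -32602, "message": "Invalid params"}
    else:
        value = tool(**req["params"]["arguments"])
        reply["result"] = {"content": [{"type": "text", "text": str(value)}]}
    return json.dumps(reply)


response = json.loads(handle(make_request(1, "add", {"a": 2, "b": 3})))
```

The request `id` is echoed in the reply so that a client can match responses to outstanding calls, which matters once several tool calls are in flight.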

Conceptual Contribution

Claim: Modern agent interoperability is best understood as a four-tier stack (MCP for tools, ACP for messaging, A2A for delegation, ANP for open discovery) and should be adopted in that phased order.

Mechanism: Structured comparison of architectures, discovery, security, and message patterns; historical timeline rooting each protocol in KQML/FIPA-ACL ancestry; lifecycle threat table.

Concepts introduced/used: Model Context Protocol, Agent-to-Agent Protocol, Agent Network Protocol, Agent Communication Protocol, KQML, FIPA-ACL, Agent Cards, Decentralized Identifiers, JSON-RPC, Tool Use, LLM Agents

Stance: survey

Relates to: Complements the broader Survey Of AI Agent Protocols with a narrower, adoption-oriented roadmap. Its security-threat lifecycle connects directly to AI Agents Under Threat and MalTool Malicious Tool Attacks.
