Here, I present two multi-agent coordination algorithms that let decentralized robots learn
to collectively share knowledge and develop collaboration plans: Multi-agent Graph Attention
Communication (MAGIC) and Heterogeneous Policy Networks (HetNet).
Communication is a key component of successful coordination, enabling the
agents to convey
information and cooperate to collectively achieve shared goals. In high-performing human teams,
human
experts judiciously choose when to communicate and whom to communicate with, communicating only when
beneficial. We would like agents to emulate such behavior without the need to hand-define
communication protocols across agents.
We propose a novel multi-agent reinforcement learning algorithm,
Multi-Agent Graph-attentIon Communication (MAGIC), with a
graph-attention communication protocol in which we learn 1) a
Scheduler that decides when each agent should communicate and
whom it should address, and 2) a Message Processor that uses
Graph Attention Networks (GATs) over dynamic graphs to integrate
incoming communication signals.
Figure 1: MAGIC Framework
Figure 2: MAGIC in Google Research Football
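To make the two modules concrete, here is a minimal PyTorch sketch of the idea, not the released MAGIC code: a Scheduler that samples a directed communication graph with hard but differentiable edge decisions (via Gumbel-softmax), and a single-head Message Processor that runs graph attention over that graph. All module names, dimensions, and design details below are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Scheduler(nn.Module):
    """Decides when/whom to communicate: emits a directed adjacency matrix."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.query = nn.Linear(hidden_dim, hidden_dim)
        self.key = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (n_agents, hidden_dim) encoded per-agent observations.
        scores = self.query(h) @ self.key(h).T           # pairwise scores (n, n)
        # Hard but differentiable edge decisions via Gumbel-softmax sampling:
        # for each (receiver i, sender j) pair, sample "edge" vs. "no edge".
        logits = torch.stack([scores, -scores], dim=-1)  # (n, n, 2)
        adj = F.gumbel_softmax(logits, hard=True, dim=-1)[..., 0]
        return adj  # adj[i, j] = 1 means agent i attends to agent j's message

class MessageProcessor(nn.Module):
    """One single-head graph-attention round over the scheduled graph."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.attn = nn.Linear(2 * hidden_dim, 1, bias=False)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        n = h.size(0)
        m = self.proj(h)                                  # messages (n, hidden_dim)
        # Score every (receiver, sender) pair, GAT-style.
        pairs = torch.cat([m.unsqueeze(1).expand(n, n, -1),
                           m.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.attn(pairs)).squeeze(-1)    # (n, n)
        e = e.masked_fill(adj == 0, float("-inf"))
        # Rows with no incoming edges softmax to NaN; zero them out.
        weights = torch.softmax(e, dim=1).nan_to_num(0.0)
        return F.elu(weights @ m)                         # aggregated messages
```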
With our framework, we set a new state-of-the-art in communication-based
MARL by modeling the topology of
interactions among agents as a dynamic directed graph that accommodates time-varying communication
needs and accurately
captures the relations between agents. Our proposed framework emulates the features of an effective
human-human team through: 1) the Scheduler, which helps each agent decide when to communicate
and with whom, and 2) the Message Processor, which integrates and processes received messages in
preparation for decision-making. We evaluate across several domains, including the
high-dimensional Google Research Football environment.
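Wiring the two sketched modules together, one communication round might look like this (continuing the hypothetical example above; the 4-agent team and 32-dimensional encodings are arbitrary):

```python
# One communication round for a hypothetical team of 4 agents.
hidden_dim, n_agents = 32, 4
scheduler = Scheduler(hidden_dim)
processor = MessageProcessor(hidden_dim)

h = torch.randn(n_agents, hidden_dim)          # per-agent observation encodings
adj = scheduler(h)                             # time-varying directed graph
h_comm = processor(h, adj)                     # attention-aggregated messages
policy_input = torch.cat([h, h_comm], dim=-1)  # input to each agent's policy head
```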
With Heterogeneous Policy Networks (HetNet), we turn to
"heterogeneous" robots. Heterogeneity in robots' design
characteristics and roles is introduced to leverage the relative merits of different agents
and their capabilities (e.g., a ground robot vs. a UAV). We define a heterogeneous robot team as a
group of cooperative agents that are capable of performing different tasks and may have access to
different sensory information.
Figure 3: Heterogeneous Policy Networks
We categorize agents with similar state, action, and observation
spaces into the same class. In such a heterogeneous setting, communication is not straightforward,
as agents do not speak the same "language": the dependency of agents with weaker sensing
capabilities on agents with stronger ones makes an efficient communication protocol a
requirement for cooperation rather than merely an additional modeling technique for boosting performance.
HetNet is an end-to-end model with a differentiable
encoder-decoder channel that accounts for the heterogeneity of inter-class messages, "translating"
the encoded messages into a shared, intermediate language among agents of a composite team. As a
result, we can now autonomously learn behaviors that coordinate multiple robots, varying in sensor
or actuator capabilities, to accomplish objectives that a single agent could not handle alone, such
as coordinating to detect and put out a fire!
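A minimal sketch of the idea behind the encoder-decoder channel, an illustrative assumption rather than the released HetNet code: each agent class gets its own encoder into a shared latent "language" and its own decoder out of it, so heterogeneous teammates exchange messages in a common representation. The class names, dimensions, and mean-pooling aggregation below are hypothetical choices.

```python
import torch
import torch.nn as nn

class HetChannel(nn.Module):
    """Differentiable encoder-decoder channel across agent classes."""
    def __init__(self, obs_dims: dict, shared_dim: int):
        super().__init__()
        # One encoder/decoder pair per agent class (e.g., "ugv", "uav").
        self.encoders = nn.ModuleDict(
            {c: nn.Linear(d, shared_dim) for c, d in obs_dims.items()})
        self.decoders = nn.ModuleDict(
            {c: nn.Linear(shared_dim, shared_dim) for c in obs_dims})

    def forward(self, obs_by_class: dict) -> dict:
        # Encode each class's observations into the shared language ...
        latents = [torch.tanh(self.encoders[c](o))
                   for c, o in obs_by_class.items()]
        pooled = torch.cat(latents, dim=0).mean(dim=0, keepdim=True)
        # ... then each class decodes the pooled message for its own policy.
        return {c: self.decoders[c](pooled.expand(o.size(0), -1))
                for c, o in obs_by_class.items()}

# Example: a composite team of 3 ground robots and 2 UAVs with different
# observation sizes; both classes receive the translated team message.
channel = HetChannel({"ugv": 24, "uav": 10}, shared_dim=16)
msgs = channel({"ugv": torch.randn(3, 24), "uav": torch.randn(2, 10)})
```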