16891: Multi-Robot Planning and Coordination

Robotics Institute, Carnegie Mellon University, Spring 2025

Last update: 8-16-2025

More details can be found in course canvas.

Course Introduction

Overview:

The course provides a graduate-level introduction to the field of multi-robot planning and coordination from both AI and robotics perspectives. Topics for the course include multi-robot cooperative task planning, multi-robot path/motion planning, learning for coordination, coordinating robots under uncertainty, etc. The course will particularly focus on state-of-the-art Multi-Agent Path Finding algorithms that can coordinate hundreds of robots with rigorous theoretical guarantees. Current applications for these technologies will be highlighted, such as mobile robot coordination for warehouses and drone swarm control.

Textbook:

There is no assigned textbook for this class. Reference materials are provided in the course schedule as well as the lecture slides.

Course Description:

The course includes lectures, research paper presentations and discussions, and course projects. The majority of this course is a seminar-style survey of issues and approaches to planning and coordination in multi-robot systems. Although the subject area is multi-robot coordination, it is also an explicit goal of this course to advance students’ critical thinking and communication skills, which is achieved through discussions, presentations, and report writing.

Prerequisite knowledge:

There are no formal prerequisites for this class.
Informally, students should be familiar with algorithms and informed search (for example, A*). Students should also have basic knowledge of probability and optimization.

Course topics:

Each of the following 8 topics will be covered by 2-5 lectures:

Course Activities and Grading

  
Paper presentation15%
Paper reading15%
Coding assignments30%
Research project40%

Summary of reading lists and research projects from previous years can be found here: spring 2024, spring 2023.

Schedule

DateFormatTopics
01/13Lecture 0Overview
01/15Lecture 1Basics of MAPF: A*-based Optimal Methods
01/20Martin Luther King DayNo Class
01/22Lecture 2Basics of MAPF: CBS-based Optimal Methods
01/27Lecture 3Basics of MAPF: CBS-based Bounded-suboptimal Methods
01/29Lecture 4Basics of MAPF: Greedy Search-based Methods
02/03Lecture 5Basics of MAPF: Greedy Rule-based Methods
02/05Lecture 6Task Planning: Multi-Robot Task Allocation
02/10Lecture 7Task Planning: Combined Task and Path Planning
02/12Lecture 8 and Paper Discussion 1Task Planning: Recent Progress (p1,p2)
02/17Lecture 9Motion Planning: Planning and Coordination under Uncertainty
02/19Lecture 10Motion Planning: Planning with Robot Dynamics
02/24Lecture 11Motion Planning: Planning with Robot Dynamics (cont)
02/26Paper Discussion 2Motion Planning: Recent Progress (p3,p4,p5)
03/03Spring BreakNo Class
03/05Spring BreakNo Class
03/10Lecture 12Decentralized Planning: ORCA, CBF, and Shard
03/12Lecture 13Decentralized Planning: Distributed PP and Distributed CSP
03/17Paper Discussion 3Decentralized Planning: Recent Progress (p6,p7,p8)
03/19Lecture 14Lifelong and Online Planning: Task and Path Planning
03/24Lecture 15Lifelong and Online Planning: Interleaving Planning and Execution
03/26Paper Discussion 4Other Multi-Robot Planning Problems: (p9,p10,p11)
03/31Guest Lecture 1Learning for Planning and Coordination: Multi-Agent Reinforcement Learning for MAPF by Hang Ma (Simon Fraser University)
04/02Guest Lecture 2Learning for Planning and Coordination: Leveraging Learning with Heuristic Search for Multi-Agent Motion Planning by Rishi Veerapaneni (CMU)
04/07Paper Discussion 5Learning for Planning and Coordination: Recent Progress (p12,p13,p14)
04/09Guest Lecture 3From Click to Delivery: Challenges and Opportunities in Multi-Agent Planning at Amazon by Federico Pecora (Amazon Robotics)
04/14Lecture 17Applications: Multi-Arm Assembly
04/16Lecture 18Applications: Environment Optimization
04/21Project Presentation 1 
04/23Project Presentation 2 

Reading List

Book chapters:

  1. Multiple Mobile Robot Systems, Handbook of Robotics, 2016.
  2. Multi-Agent Path Finding, Encyclopedia of Robotics, 2025.

Research papers:

  1. Conflict-Based Steiner Search for Multi-Agent Combinatorial Path Finding (RSS’18)
  2. Multi-Robot Task and Motion Planning With Subtask Dependencies (RAL’20)
  3. Trajectory Planning for Heterogeneous Robot Teams (IROS’18)
  4. Scalable and Safe Multi-Agent Motion Planning with Nonlinear Dynamics and Bounded Disturbances (AAAI’21)
  5. Accelerating Search-Based Planning for Multi-Robot Manipulation by Leveraging Online-Generated Experiences (ICAPS’24)
  6. PRIMAL: Pathfinding via Reinforcement and Imitation Multi-Agent Learning (RA-L’19)
  7. Fast, On-line Collision Avoidance for Dynamic Vehicles Using Buffered Voronoi Cells (RAL’17)
  8. Distributing Collaborative Multi-Robot Planning with Gaussian Belief Propagation (RA-L’23)
  9. Offline Time-Independent Multiagent Path Planning (IJCAI’22)
  10. SOCIALMAPF: Optimal and Efficient Multi-Agent Path Finding With Strategic Agents for Social Navigation (RAL’23)
  11. CTRMs: Learning to Construct Cooperative Timed Roadmaps for Multi-agent Path Planning in Continuous Spaces (AAMAS’22)
  12. A Comprehensive Review on Leveraging Machine Learning for Multi-Agent Path Finding (IEEE Access’24)
  13. Decentralized Monte Carlo Tree Search for Partially Observable Multi-agent Pathfinding (AAAI’24)
  14. Learning a Decentralized Multi-arm Motion Planner (CoRL’20)

Student Projects

Group 1

There are many real-world applications where the use of multiple robotic manipulators is needed to increase the throughput. Examples of such scenarios can be related to assembly tasks in manufacturing units. These tasks are often repetitive given the workspace of the robot remains the same and the environment is fairly static. Constant Time Motion Planning paradigm has been introduced recently to leverage such static and structured environments for generating motion plans with a user-defined time bound guarantee. This algorithm precomputes data structures, which are then used in the online phase to find the path from start to the goal state. There has been work done in the multi-agent planning community on finding a valid fast solution for multiple manipulators working in the same workspace; however it remains largely unexplored on how to provide constant time guarantees. We propose a method to extend this framework to multiple manipulators working in the same workspace.


Group 2

We introduce CBS-RRT*, a novel framework integrating Conflict-Based Search (CBS) with RRT* for scalable multi-robot path planning in 3D. By replacing traditional A* low-level planning with RRT*, our approach efficiently handles continuous, high-dimensional spaces typical of robotic arms, while retaining CBS's optimal conflict resolution. The framework demonstrates adaptability by coordinating heterogeneous systems (drones planned via A*, arms via RRT*) in shared workspaces. Experiments show CBS-RRT* significantly enhances planning efficiency and maintains solution quality, particularly in complex scenarios with overlapping workspaces, offering a promising direction for real-world multi-robot coordination like warehouse automation.


Group 3

This paper introduces Multi-Arm Multi-Model Planning Diffusion (MAMD), an extension of the Multi-robot Multi-model planning Diffusion (MMD) framework to three-dimensional environments, focusing on arms. We adapt the algorithm to address 3D motion planning for multiple articulated manipulators, generating collision-free trajectories in shared workspaces. MAMD maintains the ability to plan for multiple robots using single-robot training data while handling the increased dimensionality of arm configurations. We demonstrate its effectiveness through simulations in various 3D environments, showing efficient coordination of multiple arms. MAMD also introduces a new workspace-time-based constraint in addition to a jointspace-time-based constraint for efficient collision avoidance.


Group 4

High-degree-of-freedom robotic manipulators have notoriously complex motion planning challenges, rising due to their vast configuration spaces. Existing planners predominantly rely on sampling-based strategies (e.g., RRT, PRM) that are efficient in practice but lack completeness and strong optimality guarantees. In this work, we attempt to formulate motion planning for a single manipulator as a multi-agent path finding (MAPF) problem, and explore using the widely-adopted Conflict-Based Search (CBS) to solve it. By conceptually treating different segments or joints of one arm as cooperating agents, we leverage CBS’s systematic conflict resolution to ensure collision-free paths with provable solution quality.


Group 5

Recent studies have converted planning trajectory optimization to a probabilistic setting, aiming to sample from the posterior distribution of trajectories so can be solved by learning the score function for diffusion processes. However, such methods require expert trajectory demonstrations and struggle in long-horizon planning. In this work, we proposed Model-Based Diffusion Optimal Control (MDOC), which develops a model-based diffusion motion planning that analytically calculates the score function through Monte Carlo Approximation, so no demonstration data is required. We also equip the method with Model Predictive Control settings to improve planning performance within long horizons for online tasks.


Group 6

A novel approach is introduced to improve Conflict-Based Search (CBS) for multi-agent pathfinding (MAPF) problems by developing a deep learning-based oracle to predict the utility of conflicts in search expansion. The proposed solution trains a transformer model to predict which conflict from a set should be chosen for use in expansion, based on the paths and conflicts encountered in prior search iterations. The model is iteratively trained, starting with a classical oracle and refining over multiple cycles. Unlike previous work, which approximates oracle performance, this method uses Monte Carlo trials to attempt to estimate the true utility of conflicts.


Group 7

This project adapts the AlphaMaze pipeline, originally designed to teach a large language model (LLM) to solve single-agent textual mazes, to address multi-agent pathfinding (MAPF) tasks in small grid environments. The approach uses supervised fine-tuning (SFT) on solver-generated trajectories to train the model to coordinate multiple agents, avoid collisions, and efficiently reach goals. Given the limited multi-step planning capabilities of LLMs, the project explores whether expert data fine-tuning alone can yield collision-free solutions that scale to small-to-medium MAPF instances. Empirical results aim to provide insights into the feasibility of using text-based LLMs for multi-agent coordination tasks.


Group 8

Unmanned Ground Vehicles (UGVs) often have difficulty exploring complex environments due to obstacles such as stairs and rough terrain. Unmanned Aerial Vehicles (UAVs) can provide a good view from above but have short flight times, limiting their effectiveness for large areas. Combining a UAV with a UGV can use the strengths of both. We propose a multi-robot exploration framework where UAV and UGV work together to create a shared map and plan their paths. Our system uses a Conflict-Based Search (CBS) approach for assigning exploration tasks and coordinates robots through a global planner (Probabilistic Roadmap and Traveling Salesman Problem) and local planners (A* search). Simulation results in an office-like environment showed that using UAV and UGV together explored the area faster and more completely than using each robot alone. Our framework improves coverage and efficiency in multi-robot exploration tasks.


Group 9

Within the spectrum of Multi-Agent Path Finding (MAPF) algorithms, there is often a trade-off between the speed/scalability of the algorithm and the solution optimality/theoretical properties it provides. This work looks at a way of breaking this trade-off through exploiting the properties of the given MAPF instance. The proposed method Decoupled Conflict Based Search (DCBS) identifies a set of non-blocking cells within the graph, assigns priorities to agents based on whether their start/goal states are blocking, and plans agents in groups in descending priority. This effectively decouples the problem into a series of smaller subproblems, which in turn makes the algorithm more scalable while maintaining completeness. Experimental results show that DCBS has faster runtime and better scalability than Priority Based Search and has almost identical solution costs.