Warm Starts and Goal-based Heuristics to Improve MARL Performance in Advanced Distributed Defence

Project Value

$420,821

DIP Contribution

$150,000

Status

Completed Project

Project Lead

A/Prof Claudia Szabo, The University of Adelaide

Collaborating Partners

The University of South Australia, The University of Adelaide, Department of Defence, DEWC

Project Summary

Multi-agent reinforcement learning (MARL) is appealing for advanced distributed defence because it can learn quickly and adapt to unforeseen situations, which occur frequently in contested and dynamic defence environments. However, MARL is expensive to train, and MARL approaches perform poorly in environments with sparse rewards, which are often encountered in Defence. Traditional search-based multi-agent planning approaches do not suffer from these limitations, but they are less adaptable to rapid changes.

This project bridges the gap between these decision-making approaches to create a novel hybrid approach that enables multi-agent systems to make effective decisions in complex and dynamic scenarios. It combines reinforcement learning and multi-agent planning to enable agents to behave cooperatively in achieving the desired system-wide goals. An initial prototype will be delivered that can then be translated into existing simulators in collaboration with DSTG.