Once humans are a part of a multi-agent team, many issues become much more complex. Machine learning methods for multi-agent systems typically employ simulations for some portion of the learning process; however, in mixed-agent teams (teams with both humans and machine agents) the simulated agents include not only those over which the learning methods have control (“intrinsic” team members) but also agents whose learning and behavior are external to, and more complex than, those of their software-based counterparts (“extrinsic” team members). Here optimality is often far less important than robustness: autonomous team members must behave in ways that lead to sufficient team performance even when some team members behave in unexpected ways. Our study focuses on two key questions: How can one effectively model and simulate extrinsic team members to aid in the learning process? And what are effective mechanisms to evaluate mixed-agent team performance and learn intrinsic agent behaviors?
We consider several methods for modeling extrinsic member behaviors at different levels of abstraction (e.g., neural networks, Bayesian belief nets). The objective is not to produce faithful models of how human team members behave, but rather to produce task-oriented behaviors that vary in abstraction and quality. The models may be adjusted by hand or by software, but this adjustment process will be independent of the machine learning process discussed below.
The behaviors of the intrinsic team members employ natural-computation-based methods built on distributed, decentralized control mechanisms (physicomimetics). High-level interaction models are designed modularly by hand, but the specific parameters encoding the behaviors are developed by co-adaptive learning methods. Early theoretical research into certain types of compositional co-evolutionary algorithms indicates that they are well suited to producing robust solutions. Our project will deepen this foundation by clarifying mixed-agent performance goals that focus on robustness, developing theoretical and empirical measures for evaluating these robustness goals, and outlining those aspects of co-adaptive learning methods that are likely to optimize such measures.
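To make the distinction between optimality and robustness concrete, the following is a minimal sketch of one possible empirical measure, not the measure this project proposes. It scores a candidate intrinsic-agent parameter by its worst-case team performance across a set of simulated extrinsic teammates and contrasts that with the average case. Everything in it is a hypothetical illustration: the payoff function `team_performance`, the single parameter `gain`, and the `teammate_biases` standing in for extrinsic behavior models.

```python
from statistics import mean


def team_performance(gain: float, teammate_bias: float) -> float:
    """Hypothetical toy payoff: highest (zero) when gain + bias equals 1."""
    return -((gain + teammate_bias - 1.0) ** 2)


def evaluate(gain: float, teammate_biases: list[float]) -> dict[str, float]:
    """Score one intrinsic parameter against every simulated teammate."""
    scores = [team_performance(gain, b) for b in teammate_biases]
    return {"mean": mean(scores), "worst": min(scores)}


def most_robust(candidate_gains: list[float], teammate_biases: list[float]) -> float:
    """Pick the candidate whose worst-case team performance is highest."""
    return max(candidate_gains, key=lambda g: evaluate(g, teammate_biases)["worst"])


def best_on_average(candidate_gains: list[float], teammate_biases: list[float]) -> float:
    """Pick the candidate whose mean team performance is highest."""
    return max(candidate_gains, key=lambda g: evaluate(g, teammate_biases)["mean"])


if __name__ == "__main__":
    # Three teammates behave as expected (bias 0.0); one behaves unexpectedly (bias 3.0).
    biases = [0.0, 0.0, 0.0, 3.0]
    candidates = [1.0, 0.0, -0.5]
    for g in candidates:
        print(g, evaluate(g, biases))
    print("best on average:", best_on_average(candidates, biases))
    print("most robust:", most_robust(candidates, biases))
```

In this toy setting the candidate tuned for the expected teammate (`gain = 1.0`) has the poorest worst case, and the worst-case criterion selects a different candidate than the average-case criterion does. A co-adaptive learner could, under these assumptions, use such a worst-case score as its fitness signal.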