The 'Agents of Chaos' Experiment: When an AI Agent Handles Secrets & Email

About this episode

Rod and Alex sit down with Chris Wendler, a postdoctoral researcher at Northeastern University, to unpack the Agents of Chaos experiment. The Northeastern team put an AI agent into real-world workflows. They gave it access to email and messaging tools, then watched what happened when the model had to make decisions, interpret instructions, and execute actions on its own.

Topics in this conversation:

How the bots were built and what tools they were given
What went wrong during red-teaming
The rogue email server incident
The mechanics of jailbreaking AI agents
What this means for deploying autonomous systems in real workplaces

For more on the experiment, including the technical writeup and demo videos, see agentsofchaos.baulab.info.

Guest

Chris Wendler is a postdoctoral researcher at Northeastern University working on AI safety and the behavior of language model agents.

Hosts

Rod Moshtagi, MPP, Harvard Kennedy School
Alex Loftus, PhD student, Northeastern University