The 'Agent of Chaos' Experiment: When an AI Agent Handles Secrets & Email
About this episode
Rod and Alex sit down with Chris Wendler, a postdoctoral researcher at Northeastern University, to unpack the Agents of Chaos experiment. The Northeastern team put an AI agent into real-world workflows. They gave it access to email and messaging tools, then watched what happened when the model had to make decisions, interpret instructions, and execute actions on its own.
Topics in this conversation:
- How the bots were built and what tools they were given
- What went wrong during red-teaming
- The rogue email server incident
- The mechanics of jailbreaking AI agents
- What this means for deploying autonomous systems in real workplaces
For more on the experiment, including the technical writeup and demo videos, see agentsofchaos.baulab.info.
Guest
Chris Wendler is a postdoctoral researcher at Northeastern University working on AI safety and the behavior of language model agents.
Hosts
- Rod Moshtagi, MPP, Harvard Kennedy School
- Alex Loftus, PhD student, Northeastern University