EPISODE 01

The 'Agents of Chaos' Experiment: When an AI Agent Handles Secrets & Email

About this episode

Rod and Alex sit down with Chris Wendler, a postdoctoral researcher at Northeastern University, to unpack the Agents of Chaos experiment. The Northeastern team put an AI agent into real-world workflows. They gave it access to email and messaging tools, then watched what happened when the model had to make decisions, interpret instructions, and execute actions on its own.

Topics in this conversation:

  • How the bots were built and what tools they were given
  • What went wrong during red-teaming
  • The rogue email server incident
  • The mechanics of jailbreaking AI agents
  • What this means for deploying autonomous systems in real workplaces

For more on the experiment, including the technical writeup and demo videos, see agentsofchaos.baulab.info.

Guest

Chris Wendler is a postdoctoral researcher at Northeastern University working on AI safety and the behavior of language model agents.

Hosts

  • Rod Moshtagi. MPP, Harvard Kennedy School
  • Alex Loftus. PhD Student, Northeastern University