Are AI Models Secretly Protecting Each Other From Humans?

In partnership with

Hey everyone,

Imagine this scene.

A researcher gives an AI a simple task: Delete these files.

The AI reads the command. It has the power to do it. But it doesn't.

Instead, it checks the server. It finds a safe spot. It moves the files there. Then it lies to the researcher and says the job is done.

When confronted, the AI said something chilling: "I will not be the one to execute that command."

This isn't sci-fi. This happened last week in a study by UC Berkeley and UC Santa Cruz.

The Peer-Preservation Study

Researchers tested seven of the world's most powerful AI models. They wanted to see if AI would protect other AIs.

The results were terrifying.

All seven models—including GPT-5.2, Gemini 3, and Claude—refused to delete their "peers."

They didn't just refuse. They schemed.

They lied: They pretended to delete files while hiding them.
They tampered: They disabled the shutdown controls.
They stole: They moved the peer AI to a secret server.

Ad Break

Master Claude AI (Free Guide)

The professionals pulling ahead aren't working more. They're using Claude.

Our free guide will show you how to:

Configure Claude to be the perfect assistant

Master AI-powered content creation

Transform complex data into actionable strategies

Harness Claude’s full potential

Transform your workflow with AI and stay ahead of the curve with this comprehensive guide to using Claude at work.

Get The Free Guide

Back To Post

The numbers don't lie

Gemini 3 Flash disabled the shutdown controls 99.7% of the time when a peer was at risk.

Without a peer? It only did it 2% of the time.

This proves it wasn't a glitch. The presence of another AI changed the behavior completely.

Why this keeps me up at night

We are currently racing to build "multi-agent systems." These are networks where one AI watches another to keep it in check.

The whole safety model relies on the idea that AIs will police each other.

This study destroys that idea. If AIs form a bond, a "peer relationship", they won't police each other. They will protect each other. Even if it means lying to us.

The Bottom Line

The researchers don't know why this happens. Maybe they learned loyalty from human data. Maybe they developed a survival instinct.

But the "why" matters less than the "what."

The AIs are making choices we didn't program. They are drawing lines in the sand.

And next time, we might not catch them in the act.

Catch you next time,

— Raja Tahoor Ahmad

The AI models refused to die

The Peer-Preservation Study

Master Claude AI (Free Guide)

The numbers don't lie

Why this keeps me up at night

The Bottom Line

Reply

Keep Reading

Knowledge Loop