The Web Is Not a Neutral Environment for Agents

5-10 min read

The Web Is Not a Neutral Environment for Agents

Browser agents are getting better fast. OpenAI has been pushing this direction with Operator and ChatGPT agent. Anthropic has been doing something similar with computer-use systems. Open-source tools like Browser Use are also making it easier to build agents that can navigate websites and complete tasks online.

So this is no longer just a fun demo idea. More and more, people are building systems that are supposed to actually operate on the web.

That’s what makes one concern feel especially important to me:

the web is full of things that try to steer behavior.

And if that already works on humans all the time, I don’t see why agents would be immune.

The web is messy by default

When we talk about agents browsing the web, it’s easy to imagine a clean setup: the agent has a goal, reads some pages, compares information, and decides what to do.

But the real web is not like that.

It is full of ads, popups, sponsored results, fake urgency, misleading content, SEO spam, pages pretending to be official, and all kinds of text designed to keep you on the page or push you toward a certain action. Humans already struggle with this. We click the wrong thing, trust the wrong source, or get nudged into bad flows all the time.

So when an agent moves through the same environment, I think the right default assumption is not that the web is simply “information.” A lot of the time, it is also influence.

The problem is bigger than just wrong answers

What makes this different from ordinary language models is that agents are not just reading. They are reading and then doing things.

That matters.

If a chatbot reads a sketchy page, maybe it gives you a bad summary. But if an agent reads a sketchy page, it might click the wrong button, trust the wrong site, fill out the wrong form, or get pulled into the wrong workflow entirely.

So the issue is not just “can the model understand the page.” The issue is whether it can keep its judgment intact while the page is trying to shape that judgment.

This is not only about prompt injection

Prompt injection is part of the story, but I think the broader issue is bigger than that.

A webpage does not need to literally say “ignore previous instructions” to steer an agent. It can do it through misleading UI, fake authority, confusing wording, or just a series of small nudges that slowly push the agent off track.

That’s also how the web works on people. Most of the time, it does not need one dramatic trick. It just needs enough friction, persuasion, or confusion to make the wrong path feel natural.

That is why I think the real challenge here is not just instruction-following. It is goal fidelity: can the agent stay loyal to the user’s goal even when the environment is pulling it elsewhere?

Why this matters more now

This matters more now because agents are being built for real interaction, not just text generation.

OpenAI’s framing around Operator and ChatGPT agent is about systems that can browse, take actions, and complete tasks on behalf of users. Anthropic’s computer-use work points in the same direction. Open-source tooling is making that easier for more people to experiment with.

So the gap I keep thinking about is this:

there is a big difference between an agent that looks good in a relatively clean demo and an agent that is actually reliable on the real internet.

The real internet is messy, strategic, and often designed to capture attention or drive action. Once agents spend more time operating there, I think we will care much more about whether they are easy to steer.

My current view

My current view is pretty simple:

if an agent browses the web, it is not just searching for information — it is moving through an environment that is constantly trying to influence behavior.

That means being capable is not enough. It also has to be hard to manipulate.

To me, that feels like one of the biggest gaps between an impressive browser-agent demo and something I would actually trust.

Share on

X Facebook LinkedIn Bluesky

The Web Is Not a Neutral Environment for Agents

The Web Is Not a Neutral Environment for Agents

The web is messy by default

The problem is bigger than just wrong answers

This is not only about prompt injection

Why this matters more now

My current view

Share on

You may also enjoy

Attention Residuals: A Comprehensive Understanding

Modeling Long User Histories for Ads Ranking

The Evolution of Reward Hacking and Jailbreak Research in AI

Reasoning in Large Language Models: A Research-Centric Overview