WebMCP Can Be Used To Hijack AI Agents, Chrome Warns via @sejournal, @martinibuster

Chrome says WebMCP creates hijacking risks for AI agents operating inside users' logged-in browser sessions. The post WebMCP Can Be Used To Hijack AI Agents, Chrome Warns appeared first on Search Engine Journal.

WebMCP Can Be Used To Hijack AI Agents, Chrome Warns via @sejournal, @martinibuster

Google Chrome is warning developers that WebMCP tools can be used to manipulate and hijack AI agents. New guidance outlines how attackers can manipulate agents operating in a user’s browser, including within their authenticated sessions. Chrome published two guides, one for web developers and another for AI agent developers.

Exploits Are Not Specific To WebMCP

The warning has two disclaimers that explain that the exploits are not specific to WebMCP but are flaws inherent in LLMs and Chrome extensions.

The first disclaimer says the threat is not unique to WebMCP. Chrome explains that AI agents can encounter malicious input from untrusted content even without WebMCP, and that the guide identifies security techniques that are especially relevant when agents use WebMCP:

“While this threat exists without WebMCP, we’ve identified some of the security techniques that are especially relevant for agents that use WebMCP.”

The second disclaimer explains that Chrome extensions with host permissions can manipulate web pages even without WebMCP:

“Extensions can use host permissions to manipulate the page by running custom JavaScript, even without WebMCP.”

Chrome published two related WebMCP security guides:

Agent security considerations for WebMCP, for AI agent developers and WebMCP tool security, for developers building WebMCP tools

Together, the two guides provide security guidance for prompt injection risks in WebMCP, including risks affecting browser-based AI agents and the tools they use.

Chrome Identifies Two Ways AI Agents Can Be Hijacked

According to Chrome’s agent security guidance, AI agents using WebMCP must defend against two primary attack vectors: malicious manifests and contaminated outputs.

Manifest
A manifest is the information that describes WebMCP tools and website functions to an AI agent. The manifest describes what the website functions are called, what they do, and what inputs they accept so that AI agents can discover and use them. Contaminated Output
A contaminated output is information returned by a WebMCP tool that contains malicious instructions.

A malicious manifest may contain prompt injection attacks hidden in tool names, descriptions, or parameters. These instructions are designed to manipulate or hijack an AI agent’s behavior.

The second attack vector, contaminated outputs, is information returned by a WebMCP tool that contains malicious instructions. Chrome warns that even trusted tools can return contaminated outputs when they include third-party content such as user comments, reviews, forum posts, or other externally supplied data.

These attacks work because large language models process instructions and data together. A model may not reliably distinguish between a user’s request and malicious instructions hidden within content it consumes. Chrome describes this as indirect prompt injection and notes that the prevalence of these attacks on the web is increasing.

Chrome Says AI Models Cannot Reliably Stop Prompt Injection

The agent security guidance states:

“LLMs treat all text, instructions and user data, as a single sequence of tokens. This means that they’re susceptible to indirect prompt injection, an inclusion of malicious instructions by an attacker. While some models include safety layers against prompt injection, the probabilistic nature of LLMs makes it impossible to guarantee safety inside the model itself.

Security researchers have repeatedly demonstrated prompt injection attacks against agentic systems that use state-of-the-art LLMs, and the prevalence of attacks on the web is increasing.”

Chrome also points to repeated demonstrations of prompt injection attacks against agentic systems and cites increasing prompt injection activity on the web.

Chrome Recommends Layered Security Controls

Instead of relying on the model to recognize malicious instructions, Chrome recommends a defense-in-depth strategy that combines deterministic controls with probabilistic safeguards. In this context, deterministic means predictable, rule-based, and binary guardrails.

Among the deterministic controls Chrome recommends are:

Setting token limits on tool responses Restricting cross-origin interactions Requiring user confirmation before actions are taken Recognizing and handling content marked as untrusted

Chrome also says limiting the web origins an agent can interact with can reduce opportunities for unauthorized actions and data exfiltration, particularly when agents operate inside authenticated user sessions.

The guidance also stresses keeping humans in the loop and treating WebMCP tools as capable of modifying state unless they are explicitly identified as read-only.

For additional protection, Chrome recommends techniques such as spotlighting untrusted content, prompt injection classifiers that scan tool descriptions and outputs, and secondary “critic” models that evaluate planned tool calls before execution.

Guidance For WebMCP Tool Developers

The tool security guidance focuses on developers building websites and applications that expose WebMCP tools to AI agents.

Chrome recommends using annotation hints that help agents understand how tool output should be handled. One example is untrustedContentHint, which can be applied when a tool returns user-generated content or externally sourced information. According to Chrome, the hint signals that the output should receive additional scrutiny.

Developers are also encouraged to use readOnlyHint for tools that do not modify state, helping agents make better decisions about when user confirmation is necessary.

Chrome’s implementation enables developers to specify trusted origins through an exposedTo setting, limiting access to approved sites. The guidance notes that even read-only tools can reveal user information and should only be shared with trusted origins.

Takeaway

The most notable aspect of the guidance is not the individual security recommendations but Chrome’s acknowledgment that prompt injection remains a fundamental challenge for AI agents.

Rather than presenting model improvements as the solution, Chrome’s guidance assumes attackers will succeed in placing malicious instructions in tool descriptions, tool outputs, and third-party content. The recommended response is a layered security architecture that combines access controls, content isolation, human oversight, monitoring, and independent validation systems.

Chrome’s guidance treats AI agent security as a shared responsibility between agent developers and tool developers across the WebMCP ecosystem.

Sources

Agent security considerations for WebMCP

WebMCP tool security

Featured Image by Shutterstock/A9 STUDIO