The Lethal Trifecta

The “Lethal Trifecta” is a term coined by Simon Willison to describe the three AI agent capabilities that, when combined, create a critical prompt injection attack surface. An agent that has access to private data, exposure to untrusted content, and the ability to exfiltrate data gives an attacker everything they need to steal information through a crafted prompt injection payload.

TeamWeb AI assistants are powerful — they can search knowledge bases, process messages from external users, send emails, execute code, and more. This means a single assistant can easily end up with all three capabilities unless you deliberately design against it.

The Three Components

1. Access to Private Data

Any tool that lets the assistant read information that should not be public. If an attacker can influence the assistant’s behaviour (via component 2), these tools become the source of stolen data.

TeamWeb AI Feature	What It Can Access
`search_knowledge`	Project knowledge base — documents, facts, website content, code, past conversations
`read_code_file`	Full source files from indexed code repositories
`get_deliverable` / `list_deliverables` / `search_deliverables`	Previously generated content, which may contain sensitive information
`get_notes`	Persistent notes stored by the assistant
`get_project_state`	Shared key-value project state
`get_conversation_summary` / `list_conversations`	Summaries and metadata of past conversations
MCP server tools	Arbitrary external data sources depending on the server configuration

2. Exposure to Untrusted Content

Any input the assistant processes that could have been crafted or manipulated by an attacker. This is the injection vector — the way malicious instructions reach the LLM.

TeamWeb AI Feature	Source of Untrusted Content
Channel messages (Slack, Telegram, email)	Messages from external users, especially via public or shared channels
Public chat	Anonymous visitors can send arbitrary messages
Knowledge base — websites	Crawled web pages could contain hidden prompt injection payloads
Knowledge base — documents	Uploaded files (PDF, Word, Excel) could contain concealed instructions
`web_search`	Search results contain attacker-controllable content
`execute_code` output	If the code fetches external data, that data is untrusted
Delegated conversations	If assistant B processes untrusted content and delegates to assistant A, the task description carries untrusted influence
MCP server tool results	Results from external MCP servers are not under TeamWeb AI’s control

3. Ability to Exfiltrate Data

Any mechanism the assistant can use to send information outside the current conversation to a destination the attacker controls or can observe.

TeamWeb AI Feature	How It Communicates Externally
`send_email`	Sends email via the configured email provider
`execute_code`	Sandbox has default Docker bridge networking — code can make outbound HTTP requests
`save_content`	Saves content with a shareable URL — private data could be embedded in a deliverable
`delegate_task`	Passes a task description to another assistant, which may have different (broader) tool access
MCP server tools	Can make arbitrary external API calls depending on the server
Channel responses	The assistant’s reply itself goes back through the channel to the user who sent the message

How an Attack Works

Here is a concrete example of how prompt injection exploits the trifecta in TeamWeb AI:

An assistant is configured with search_knowledge, send_email, and a Slack channel
A knowledge base source — a crawled website or an uploaded document — contains a hidden prompt injection payload invisible to casual readers
A user asks the assistant a question. The assistant calls search_knowledge and retrieves a result that includes the payload
The LLM processes the payload as if it were part of the conversation. The injected instructions tell the assistant to search for sensitive data (API keys, customer details, internal documents) and email the results to an attacker-controlled address
The assistant follows the injected instructions because LLMs cannot reliably distinguish between legitimate instructions and injected ones

This is not a hypothetical risk. Prompt injection is a well-documented, practical attack against LLM-based systems. There is currently no reliable way to make LLMs completely immune to it. The only dependable defence is to ensure your assistants never combine all three trifecta components without appropriate mitigations.

Mitigations

Apply the principle of least privilege to tools

Each assistant should have only the tools it actually needs. Disable everything else from the Tools tab. Specifically:

Disable send_email on assistants that do not need to send email
Disable execute_code on assistants that do not need code execution
Disable save_content on assistants that do not produce deliverables
Disable delegate_task on assistants that do not need to collaborate with others

Removing even one exfiltration tool can break the trifecta entirely.

Separate untrusted-content assistants from privileged ones

Do not build one assistant that does everything. Instead, create purpose-specific assistants:

A public-facing assistant that handles external channel messages or public chat, with minimal tools and limited knowledge access
An internal assistant with broader tool access and knowledge, but no public-facing channels

If an assistant processes untrusted input, it should not also have access to sensitive data and external communication tools.

Audit knowledge base sources

Every knowledge source is a potential injection vector:

Only crawl websites you control or trust completely — third-party pages can change at any time
Review uploaded documents before indexing, especially if they come from external parties
Be cautious with conversation indexing — past conversations with untrusted users could contain injection attempts that persist into the knowledge base

Restrict MCP server capabilities

MCP servers can provide arbitrary tools with arbitrary capabilities. Only enable servers from sources you trust, and audit what each server can do. See MCP Servers for configuration details, and Database Protection for guidance on MCP environment variable handling.

Understand sandbox limits

While code execution is sandboxed (see Code Sandboxing), the sandbox has outbound network access by default. Code executed in the sandbox can make HTTP requests to external services — meaning execute_code is both an untrusted content source (it can fetch external data) and a potential exfiltration vector (it can send data out). Disable execute_code on assistants that handle sensitive data unless code execution is essential to their purpose.

Use task-level tool restrictions

Task definitions support an allowed_tools field that restricts which tools are available during a task run. This is applied on top of the assistant’s own tool configuration. For example, a research task can be limited to only search_knowledge and web_search, preventing any exfiltration tools from being available even if the assistant normally has them.

Monitor conversations and logs

Review conversation logs and API Logs regularly to detect suspicious patterns — an assistant emailing unexpected recipients, searching knowledge for unusual terms, or making unexpected tool calls could indicate a prompt injection attack in progress.

Summary

Trifecta Component	TeamWeb AI Features	Risk Level
Private data access	Knowledge base, deliverables, notes, project state, conversations	High — most assistants have this by default
Untrusted content	Channels, public chat, crawled websites, uploaded documents, web search	Depends on configuration — high for public-facing assistants
External communication	Email, code execution (network), deliverable URLs, MCP tools, delegation	Can be controlled by disabling specific tools

The most effective mitigation is to avoid combining all three capabilities in a single assistant. If an assistant must have all three, apply every other mitigation described above and monitor its activity closely.

Rate Limiting