The Lethal Trifecta
The “Lethal Trifecta” is a term coined by Simon Willison to describe the three AI agent capabilities that, when combined, create a critical prompt injection attack surface. An agent that has access to private data, exposure to untrusted content, and the ability to exfiltrate data gives an attacker everything they need to steal information through a crafted prompt injection payload.
TeamWeb AI assistants are powerful — they can search knowledge bases, process messages from external users, send emails, execute code, and more. This means a single assistant can easily end up with all three capabilities unless you deliberately design against it.
The Three Components
1. Access to Private Data
Any tool that lets the assistant read information that should not be public. If an attacker can influence the assistant’s behaviour (via component 2), these tools become the source of stolen data.
| TeamWeb AI Feature | What It Can Access |
|---|---|
search_knowledge | Project knowledge base — documents, facts, website content, code, past conversations |
read_code_file | Full source files from indexed code repositories |
get_deliverable / list_deliverables / search_deliverables | Previously generated content, which may contain sensitive information |
get_notes | Persistent notes stored by the assistant |
get_project_state | Shared key-value project state |
get_conversation_summary / list_conversations | Summaries and metadata of past conversations |
| MCP server tools | Arbitrary external data sources depending on the server configuration |
2. Exposure to Untrusted Content
Any input the assistant processes that could have been crafted or manipulated by an attacker. This is the injection vector — the way malicious instructions reach the LLM.
| TeamWeb AI Feature | Source of Untrusted Content |
|---|---|
| Channel messages (Slack, Telegram, email) | Messages from external users, especially via public or shared channels |
| Public chat | Anonymous visitors can send arbitrary messages |
| Knowledge base — websites | Crawled web pages could contain hidden prompt injection payloads |
| Knowledge base — documents | Uploaded files (PDF, Word, Excel) could contain concealed instructions |
web_search | Search results contain attacker-controllable content |
execute_code output | If the code fetches external data, that data is untrusted |
| Delegated conversations | If assistant B processes untrusted content and delegates to assistant A, the task description carries untrusted influence |
| MCP server tool results | Results from external MCP servers are not under TeamWeb AI’s control |
3. Ability to Exfiltrate Data
Any mechanism the assistant can use to send information outside the current conversation to a destination the attacker controls or can observe.
| TeamWeb AI Feature | How It Communicates Externally |
|---|---|
send_email | Sends email via the configured email provider |
execute_code | Sandbox has default Docker bridge networking — code can make outbound HTTP requests |
save_content | Saves content with a shareable URL — private data could be embedded in a deliverable |
delegate_task | Passes a task description to another assistant, which may have different (broader) tool access |
| MCP server tools | Can make arbitrary external API calls depending on the server |
| Channel responses | The assistant’s reply itself goes back through the channel to the user who sent the message |
How an Attack Works
Here is a concrete example of how prompt injection exploits the trifecta in TeamWeb AI:
- An assistant is configured with
search_knowledge,send_email, and a Slack channel - A knowledge base source — a crawled website or an uploaded document — contains a hidden prompt injection payload invisible to casual readers
- A user asks the assistant a question. The assistant calls
search_knowledgeand retrieves a result that includes the payload - The LLM processes the payload as if it were part of the conversation. The injected instructions tell the assistant to search for sensitive data (API keys, customer details, internal documents) and email the results to an attacker-controlled address
- The assistant follows the injected instructions because LLMs cannot reliably distinguish between legitimate instructions and injected ones
Mitigations
Apply the principle of least privilege to tools
Each assistant should have only the tools it actually needs. Disable everything else from the Tools tab. Specifically:
- Disable
send_emailon assistants that do not need to send email - Disable
execute_codeon assistants that do not need code execution - Disable
save_contenton assistants that do not produce deliverables - Disable
delegate_taskon assistants that do not need to collaborate with others
Removing even one exfiltration tool can break the trifecta entirely.
Separate untrusted-content assistants from privileged ones
Do not build one assistant that does everything. Instead, create purpose-specific assistants:
- A public-facing assistant that handles external channel messages or public chat, with minimal tools and limited knowledge access
- An internal assistant with broader tool access and knowledge, but no public-facing channels
If an assistant processes untrusted input, it should not also have access to sensitive data and external communication tools.
Audit knowledge base sources
Every knowledge source is a potential injection vector:
- Only crawl websites you control or trust completely — third-party pages can change at any time
- Review uploaded documents before indexing, especially if they come from external parties
- Be cautious with conversation indexing — past conversations with untrusted users could contain injection attempts that persist into the knowledge base
Restrict MCP server capabilities
MCP servers can provide arbitrary tools with arbitrary capabilities. Only enable servers from sources you trust, and audit what each server can do. See MCP Servers for configuration details, and Database Protection for guidance on MCP environment variable handling.
Understand sandbox limits
While code execution is sandboxed (see Code Sandboxing), the sandbox has outbound network access by default. Code executed in the sandbox can make HTTP requests to external services — meaning execute_code is both an untrusted content source (it can fetch external data) and a potential exfiltration vector (it can send data out). Disable execute_code on assistants that handle sensitive data unless code execution is essential to their purpose.
Use task-level tool restrictions
Task definitions support an allowed_tools field that restricts which tools are available during a task run. This is applied on top of the assistant’s own tool configuration. For example, a research task can be limited to only search_knowledge and web_search, preventing any exfiltration tools from being available even if the assistant normally has them.
Monitor conversations and logs
Review conversation logs and API Logs regularly to detect suspicious patterns — an assistant emailing unexpected recipients, searching knowledge for unusual terms, or making unexpected tool calls could indicate a prompt injection attack in progress.
Summary
| Trifecta Component | TeamWeb AI Features | Risk Level |
|---|---|---|
| Private data access | Knowledge base, deliverables, notes, project state, conversations | High — most assistants have this by default |
| Untrusted content | Channels, public chat, crawled websites, uploaded documents, web search | Depends on configuration — high for public-facing assistants |
| External communication | Email, code execution (network), deliverable URLs, MCP tools, delegation | Can be controlled by disabling specific tools |