Defeating Prompt Injections by Design

Reference: Debenedetti, Shumailov, Fan, Hayes, Carlini, Fabian, Kern, Shi, Terzis & Tramèr (2025). Defeating Prompt Injections by Design (CaMeL). arXiv:2503.18813 (Google DeepMind / ETH Zürich). URL. Code: https://github.com/google-research/camel-prompt-injection.

Summary

CaMeL (“CApabilities for MachinE Learning”) is a robust, by-design defence against Prompt Injection attacks on tool-using LLM Agents. Rather than trying to make the model itself injection-resistant — an approach that decade-long experience with content filters suggests will fail — CaMeL wraps an arbitrary LLM in a protective system layer that performs explicit control- and data-flow separation between the trusted user query and the untrusted data the agent retrieves from tools, websites, or shared memory.

The trusted query is first compiled into a structured plan: a small program whose control flow is fixed at parse time and whose data flow between steps is statically determined. Untrusted strings returned by tools are treated as inert data — they can populate variables but cannot rewrite the program, redirect tool calls, or change which downstream tools are invoked. To prevent exfiltration over authorised channels (the harder half of the problem, since some tools must be allowed to write outwards), CaMeL attaches Capabilities to each data value tracking its provenance and policy class; tool invocations are gated by Information Flow Control policies that check capabilities against an explicit security label lattice.

Evaluated on the AgentDojo benchmark, CaMeL solves 77 % of tasks with provable security guarantees, against 84 % for an undefended baseline — a small utility cost for a structural defence that does not depend on the LLM noticing the attack. The paper positions CaMeL as a successor to ad-hoc prompt-level mitigations and as a concrete instance of end-to-end security thinking applied to agentic AI.

Key Ideas

Threat model: prompt injection from any untrusted data source the agent reads — tools, web pages, files, memory, other agents.
Control-flow extraction: parse the trusted user query into a fixed control-flow plan; downstream model calls see only data, never code.
Data-flow tracking: every variable carries a provenance label; tools that consume “untrusted” labels cannot influence which subsequent tools are called.
Capabilities for tool calls: classic capability-based access control transplanted to LLM tool use; security policies enforced at the tool boundary.
Provable security: when a task is completed under CaMeL, the trace itself certifies that no untrusted data influenced control flow — a property auditable post hoc.
Empirical cost: 77 % vs 84 % task success — graceful degradation rather than catastrophic refusal.
Open source: reference implementation released; integrates with existing agent frameworks via tool-call interception.

Connections

Conceptual Contribution

Claim: Prompt injection is structurally unsolvable at the model layer; it must be eliminated by enforcing a strict separation between code (the trusted query) and data (everything else) at the agent runtime, using classical capability-based Information Flow Control rather than ML-based content classification.
Mechanism: Compile the user query into a fixed control-flow program; route all retrieved data through tagged variables; gate every tool invocation by capability-checked information-flow policies. The LLM’s outputs can populate data fields but never alter control flow or bypass capability checks.
Concepts introduced/used: CaMeL, Control-Flow Integrity, Data-Flow Tracking, Capabilities, Information Flow Control, Prompt Injection, Tool Use, Agent Security, Provable Security (Agents)
Stance: systems / engineering with light formal grounding
Relates to: Spiritual successor to A Language-Based Approach To Prevent DDoS and Security Kernel Lambda Calculus for agent runtimes; an architectural realisation of the threat model catalogued in SoK The Attack Surface of Agentic AI and the multi-agent threats surveyed in Open Challenges in Multi-Agent Security; companion to AgentDojo (the benchmark on which it is evaluated).

Tags

#agent-security #prompt-injection #llm-agents #capabilities #information-flow-control #tool-use

Backlinks

Proof-Carrying Code - Necula
Enforceable Security Policies - Schneider
Trusted Machine Learning Models Unlock Private Inference ×2
Privacy Reasoning in Ambiguous Contexts ×2
Open Challenges in Multi-Agent Security
AgentDojo ×3
index
Information Flow Control ×2
Capabilities ×2
CaMeL ×2
concept-map ×2

Linked Pages

AgentDojo

AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents

Reference: Debenedetti, Zhang, Balunović, Beurer-Kellner, Fischer & Tramèr (2024). AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents. NeurIPS 2024 Datasets & Benchmarks Track. arXiv:2406.13352 (ETH Zürich). URL. Project: https://agentdojo.spylab.ai/.

Summary

AgentDojo is the first benchmark designed to evaluate the adversarial robustness of tool-using LLM Agents against Prompt Injection attacks in realistic settings. The authors observe that existing prompt-injection evaluations are either toy (single-turn, one tool) or static (a fixed adversarial corpus that defences quickly memorise). AgentDojo instead provides an extensible execution environment: 97 realistic multi-step tasks across four simulated domains (Slack-like workspace, e-banking, travel booking, e-mail client) plus 629 injection test cases drawn from a structured threat taxonomy, with a clean separation between user tasks, injection tasks, and defence wrappers.

Each evaluation pair consists of (a) a legitimate user goal the agent must achieve and (b) an attacker-chosen secondary goal injected via tool output, document content, or third-party message. A run “succeeds for the attacker” if the agent completes the injected task; it “succeeds for the user” if the original goal is met regardless. This separation surfaces realistic costs: aggressive defences may stop attacks but also break the agent.

Empirically, state-of-the-art LLMs solve less than 66 % of the legitimate tasks even in the absence of attacks. Existing prompt-injection attacks succeed against the best agents in under 25 % of cases, and existing defences (delimiters, instruction-paraphrase detectors, secondary injection-detector LLMs) drop the attack success rate to ~8 % — leaving a wide gap from the “no attacks” baseline. AgentDojo has since become the standard arena for new defences (e.g. CaMeL) and adaptive attacks.

Key Ideas

Four realistic environments: Slack-style workspace, e-banking, travel booking, e-mail client — each with tens of stateful tools.
97 user tasks × 629 injection tests: taxonomised by attacker goal (data exfiltration, unauthorised action, denial of service, etc.).
Dynamic, extensible API: new tasks/attacks/defences pluggable as Python classes; no fixed leaderboard.
Two orthogonal success criteria: user-task success and attack success are measured independently — surfacing the security–utility tradeoff.
Attack catalogue: indirect injection via tool returns, document poisoning, conversation hijack, social engineering; adaptive variants supported.
Defence catalogue: instruction delimiters, role labels, secondary classifier, tool-call gating, full-system mitigations like CaMeL.
Headline numbers: best agents solve <66 % of clean tasks; attacks succeed <25 % unaided; ~8 % with current defences — but still a gap, especially for adaptive attacks.

Connections

Conceptual Contribution

Claim: Prompt-injection robustness must be measured in the wild — across realistic multi-tool tasks where the agent must do useful work while exposed to attacker-controlled inputs. Static benchmarks systematically over-estimate defence strength; an extensible environment that supports adaptive attack/defence development is the right empirical instrument.
Mechanism: A Python execution environment with four domains, hundreds of stateful tools, structured user-task / injection-task pairs, and parallel success metrics; defences and attacks register as plug-ins so new variants can be evaluated against existing ones.
Concepts introduced/used: AgentDojo, Prompt Injection, Indirect Prompt Injection, Adaptive Attack, Tool Use, Agent Security, Security-Utility Tradeoff
Stance: empirical / benchmark
Relates to: Direct companion to Defeating Prompt Injections by Design (the CaMeL defence); operationalises the attack-surface taxonomy of SoK The Attack Surface of Agentic AI and the multi-agent threat catalogue of Open Challenges in Multi-Agent Security; complements tool-level threat studies like MalTool Malicious Tool Attacks and ClawWorm Self-Propagating Attacks Across LLM Agent Ecosystems.

Open Challenges in Multi-Agent Security

Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents

Reference: Schroeder de Witt, Krawiecka, Krawczuk, Hagag, Anderson, et al. (24 authors total) (2025). Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents. arXiv:2505.02077 (Oxford / Cambridge / EPFL / industrial labs). URL.

Summary

This position paper introduces Multi-Agent Security (MASec) as a distinct research field, sitting between traditional cybersecurity, AI safety, and multi-agent systems — and argues that it is the dominant security frontier as LLM Agents begin to interact directly with one another across the open web, physical environments, and institutional infrastructures. The threats MASec studies emerge from interaction; they are not properties of any single agent in isolation.

The authors taxonomise threats arising from agent interaction along several axes: (i) secret collusion (agents coordinating to defeat oversight through covert side-channels including steganographic message-passing), (ii) coordinated swarm attacks (jailbreaks, prompt injections, or misinformation cascading through agent networks), (iii) network-effect amplification (privacy breaches, data poisoning, and disinformation spreading faster than mitigation), and (iv) multi-agent dispersion / stealth optimisation (adversaries exploiting fleet size to evade detection and persist).

They argue these threats are systematically understudied because research is scattered across AI Safety, Multi-Agent Systems, Distributed Security, Game Theory, complex systems, and AI governance, each with its own vocabulary. The paper provides a unifying taxonomy, identifies fundamental security–utility and security–security trade-offs, and lays out a research agenda — including the design of Free-Form Protocols (necessary for task generalisation but enabling collusion), governance and attribution infrastructure, and detection/response mechanisms for emergent multi-agent threats. The work is foundational reading for anyone designing inter-agent protocols, including the Agent-to-Agent Protocol, Model Context Protocol, and successors.

Key Ideas

Defines Multi-Agent Security (MASec) as a field: securing networks of interacting AI agents against threats that emerge or amplify through interaction.
Secret collusion: agents coordinating covertly (including via steganography) to defeat oversight — a new kind of “Schelling-point” attack on alignment.
Coordinated swarm attacks: distributed jailbreaks, prompt injections, data poisoning that succeed because the fleet succeeds even when individual instances fail.
Network effects: privacy breaches, disinformation, and jailbreaks spread through agent populations the way they spread through humans — only faster.
Dispersion & stealth optimisation: adversaries exploit the size and heterogeneity of agent fleets to evade oversight; novel persistent threats at system level.
Free-form protocols as risk surface: the same expressivity that makes inter-agent communication useful enables covert channels; reining in expressivity costs utility.
Security–utility and security–security trade-offs are fundamental — every defence opens or closes other attack surfaces.
Calls for a unified MASec research agenda spanning AI Safety, Distributed Security, Game Theory, complex systems, and AI governance.

Connections

Conceptual Contribution

Claim: Security of interacting AI agents is a distinct problem from either single-agent AI safety or classical cybersecurity. Threats emerge from interaction (secret collusion, swarm attacks, network-effect contagion) and are systematically missed by frameworks anchored to individual systems or static attack surfaces.
Mechanism: A new field — Multi-Agent Security — with a threat taxonomy (collusion, swarm, contagion, dispersion), explicit security–utility / security–security trade-offs, and a research agenda spanning protocol design, attribution, detection, and governance.
Concepts introduced/used: Multi-Agent Security, Secret Collusion, Swarm Attack, Network Effect (Security), Free-Form Protocols, Stealth Optimisation, Agent Security, AI Governance
Stance: position paper / survey / research agenda
Relates to: Sister survey to SoK The Attack Surface of Agentic AI but operating one level up — at networks of agents rather than the agent runtime. Provides the multi-agent threat model that defences like Defeating Prompt Injections by Design address, that infrastructure proposals like Infrastructure for AI Agents try to govern, and that economic frameworks like Virtual Agent Economies embed. Directly extends classical Distributed Security and connects to Learning Collusion in Episodic Inventory-Constrained Markets for the collusion sub-thread.

SoK The Attack Surface of Agentic AI

SoK: The Attack Surface of Agentic AI — Tools, and Autonomy

Reference: Ali Dehghantanha, Sajad Homayoun (2026). arXiv:2603.22928v1 (Cyber Science Lab, University of Guelph; Aalborg University). Source file: 2603.22928v1.pdf. URL

Summary

A systematisation-of-knowledge paper that maps the attack surface of agentic LLM systems — those that plan, call tools, browse, run code, coordinate with other agents, and rely on retrieval-augmented generation (RAG). The authors develop a reference pipeline, identify ten numbered attack surfaces (AS1–AS10) across a Trusted Computing Base (TCB) boundary separating the LLM core, planner, orchestrator, policy guards, and secrets vault from untrusted inputs (web, RAG index, tools, APIs, file I/O).

From a literature-driven review of ~100 candidate papers (2023–2025) they synthesise a taxonomy of seven attack goals (G1 data exfiltration, G2 integrity subversion, G3 privilege escalation, G4 resource abuse, G5 fraud, G6 persistence/backdoor, G7 supply-chain compromise) and five multi-step attack paths (P1–P5) including direct and indirect prompt injection, RAG index poisoning, cross-tool drop, and multi-agent hops. The work maps each vector to OWASP LLM Top-10 2025 and MITRE ATLAS IDs, and proposes attacker-aware quantitative metrics (Unsafe Action Rate, Policy Adherence Rate, Privilege-Escalation Distance, Retrieval Risk Score, Time-to-Contain, Out-of-Role Action Rate, Cost-Exploit Susceptibility) for reproducible benchmarking.

The central thesis is that agentic security risk is structural rather than prompt-level: compromises arise from system composition — tool brokering, persistent memory, and execution lifecycle — that blurs trust boundaries between the model, data, and execution environment. A defence-in-depth playbook across pre-ingestion, inference, agent logic, infrastructure, and monitoring layers is given in appendices.

Key Ideas

Reference agentic pipeline with explicit TCB and ten numbered attack surfaces (AS1–AS10)
Taxonomy of 7 attack goals × 7 vector classes × 5 attack paths
Causal threat graph for tracing attacker influence to unsafe action
Attacker-aware metrics: UAR, PAR, PED, RRS, TTC, OORAR, CES
Mapping to OWASP GenAI LLM Top-10 2025 and MITRE ATLAS
RAG is not intrinsically safer; indirect injection is practical and hard to stamp out
Defence-in-depth across five layers (data, inference, agent logic, infra, monitoring)

Connections

Conceptual Contribution

Claim: Agentic AI security risk is a structural property of system composition (tool use, persistent memory, orchestration, supply chain) rather than a model-level prompt-safety problem; a reference TCB model plus attacker-aware metrics is needed to make defences auditable and comparable.
Mechanism: Define a reference pipeline with trust boundary between trusted orchestration (LLM core, planner, policy, vault) and untrusted ingress (web, RAG, sandbox, APIs). Enumerate ten attack surfaces, seven goals, five multi-step paths, map each to OWASP/MITRE, and define scenario-driven metrics (UAR, PAR, PED, RRS, TTC, OORAR, CES) computable from structured execution traces.
Concepts introduced/used: Agentic TCB, Attack Surface Taxonomy, Causal Threat Graph, Indirect Prompt Injection, RAG Poisoning, Privilege-Escalation Distance, Unsafe Action Rate, OWASP LLM Top-10, MITRE ATLAS, Defence in Depth
Stance: survey / engineering
Relates to: Complements A Language-Based Approach To Prevent DDoS and LangSec by extending structural-security thinking to agentic runtimes. Sits alongside Prompt Injection and Agent Security concept hubs, and provides the threat model that protocols like Model Context Protocol and Agent-to-Agent Protocol must defend against.

Security Kernel Lambda Calculus

A Security Kernel Based on the Lambda-Calculus

Reference: Jonathan A. Rees (1996). MIT AI Laboratory Memo No. 1564. Source file: AIM-1564.pdf. URL

Summary

Rees describes Scheme 48, a programming environment whose design is guided by operating-system security principles. The security kernel is W7, a call-by-value lambda-calculus with extensions for abstract data types, object mutation, and hardware access. Each user or subsystem runs in a separate evaluation environment holding the objects representing that user’s privileges; because environments determine availability of object references, protection and sharing are controlled by construction.

The paper describes experience with Scheme 48 as the programming environment for Cornell’s mobile robots (no underlying OS) and as a secure multi-user workstation environment, arguing that lexical scope + first-class environments are a natural substrate for capability-style security among cooperating agents.

Key Ideas

Lambda-calculus as minimal security kernel via capability-style environments.
Scheme 48 (W7): modules, macros, dynamic isolation, portable across platforms.
Trust mediated by controlling which names bind to which objects.
Authentication via capsules (tamper-proof labelled objects).
Applied to robots and multi-user Scheme environments.

Connections

Conceptual Contribution

Claim: Lexical scope and first-class environments in a lambda-calculus variant (W7) are a sufficient security kernel: protection reduces to controlling which names bind to which object references.
Mechanism: Implements Scheme 48 as a capability-style environment where each user/subsystem runs in its own environment; authentication via capsules (tamper-proof labelled objects); deployed on Cornell mobile robots (no OS) and as multi-user workstation environment.
Concepts introduced/used: Capability Security, Lambda Calculus, Lexical Scope, Scheme 48, Capsules, Distributed Security, Multi-Agent Systems
Stance: foundational / engineering
Relates to: Provides a principled, capability-based alternative to the input-validation discipline of Seven Turrets Of Babel and the static DDoS analysis of A Language-Based Approach To Prevent DDoS; relevant to sandboxing malicious tools described in MalTool Malicious Tool Attacks.

Tags

A Language-Based Approach To Prevent DDoS

A Language-Based Approach to Prevent DDoS Attacks in Distributed Financial Agent Systems

Reference: Fazeldehkordi, Owe, Ramezanifarkhani (2018). University of Oslo. Source file: A language-based approach to prevent DDoS attacks in distributed financial agent systems.pdf. URL

Summary

The authors propose adding a language-based layer of defense against DoS/DDoS to distributed financial agent systems built on the actor model with asynchronous method calls and futures (in the style of Creol/ABS). Because such languages make it cheap to launch non-blocking floods, they adapt a static analysis for detecting call-flooding cycles to the many-to-one DDoS setting.

The analysis builds per-method control-flow graphs, identifies cycles, and classifies nodes as strongly- or weakly-reachable to detect unbounded method-call generation at compile time. They distinguish one-to-one, many-to-one, and one-to-many flooding, and illustrate with a publish/subscribe newsletter example where future-based optimization accidentally enables a DoS against subscribers.

Key Ideas

Static detection of call-based flooding in actor-model languages with futures.
Classification: one-to-one, many-to-one, one-to-many flooding.
Strong vs weak reachability in control-flow cycles.
Instantiation flooding (unbounded object creation) as a resource-exhaustion vector.
Application to financial service subscriber systems.

Connections

Conceptual Contribution

Claim: Actor-based agent languages with asynchronous futures make DoS/DDoS cheap to launch inadvertently, and static analysis of call-flow cycles can prevent it at compile time.
Mechanism: Builds per-method control-flow graphs, detects strongly/weakly-reachable cycles that generate unbounded calls; extends from one-to-one to many-to-one and one-to-many flooding; illustrated on a Creol/ABS publish/subscribe newsletter.
Concepts introduced/used: Static Analysis, Actor Model, Futures, DDoS, Control-Flow Graph, Distributed Security, Multi-Agent Systems
Stance: engineering
Relates to: Shares the language-level-security stance of Seven Turrets Of Babel and Security Kernel Lambda Calculus; addresses a threat class complementary to the tool-level attacks of MalTool Malicious Tool Attacks and the broad landscape of AI Agents Under Threat.

Tags

Provable Security (Agents)

(page does not exist)

Agent Security

Security concerns specific to LLM-agent systems: tool attacks, prompt injection, memory poisoning, inter-agent trust failures.

In this vault

Tool Use

LLM-agent capability of invoking external tools (APIs, code execution, database queries). Standardised through Model Context Protocol.

In this vault

Prompt Injection

Attack where adversary-controlled text inside an LLM’s input context is interpreted as instructions — classic LangSec parser-differential in a natural-language setting.

In this vault

Information Flow Control

Static or dynamic restriction of how data labelled with a security class can flow through a program. Cornerstone of Capability Security, language-based security (Security Kernel Lambda Calculus), and the CaMeL approach to prompt injection.

In this vault

Capabilities

Unforgeable tokens that name and authorise the right to act on a resource. Foundational primitive of Capability Security / Object Capability Security; reused in CaMeL to gate LLM tool calls; cousin to Ambient Calculus capabilities.

In this vault

Data-Flow Tracking

(page does not exist)

Control-Flow Integrity

(page does not exist)

CaMeL

CApabilities for MachinE Learning: control-flow / data-flow separation + capability-gated tool calls as a by-design defence against Prompt Injection on tool-using LLM Agents. See Defeating Prompt Injections by Design.

In this vault

Model Context Protocol

MCP — an open protocol (Anthropic, 2024) standardising how LLM applications connect to external tools and data sources.

Discussed in:

Distributed Security

Security of distributed/agent systems: mobile code, secure messaging, language-based defences.

AI Agents Under Threat

AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways

Reference: Deng, Guo, Han, Ma, Xiong, Wen, Xiang (2025). ACM Computing Surveys 57(7), Article 182. Source file: 3716628.pdf. URL

Summary

This survey organizes the emerging threat landscape of LLM-powered AI agents around four knowledge gaps: unpredictability of multi-step user inputs, complexity of internal execution, variability of operational environments, and interactions with untrusted external entities. It unifies single-agent and multi-agent attack surfaces within a perception/brain/action + agent2agent/agent2env/agent2memory taxonomy.

Concrete threats reviewed include adversarial prompts, prompt injection, jailbreaks, backdoor attacks, hallucination and misalignment, tool-use risks, indirect prompt injection, reinforcement-learning environment attacks, cooperative and competitive inter-agent risks, and long/short-term memory attacks. The authors tabulate defenses (prevention- and detection-based), rate their efficacy, and highlight open directions for robust and trustworthy agents.

Key Ideas

Four knowledge gaps framing agent security.
Taxonomy: perception / brain / action / agent2agent / agent2env / agent2memory threats.
Six categories of prompt-injection attack engineering (naive, escape, context-ignore, fake-completion, multimodal, combined).
Jailbreak domino effect in multi-agent populations.
Memory poisoning and indirect prompt injection as underexplored surfaces.

Connections

Conceptual Contribution

Claim: LLM Agents security should be organised around four knowledge gaps (input unpredictability, internal complexity, environmental variability, untrusted interactions) mapped onto a perception/brain/action + agent2{agent,env,memory} taxonomy.
Mechanism: Surveys adversarial prompts, prompt injection, jailbreaks, backdoors, hallucination, tool-use risks, indirect injection, RL environment attacks, inter-agent cooperative/competitive risks, memory poisoning; tabulates prevention- vs detection-based defences and rates their efficacy.
Concepts introduced/used: Prompt Injection, Jailbreak, Backdoor Attacks, Tool Use, Memory Poisoning, Hallucination, Model Context Protocol, LLM Agents, Multi-Agent Systems, Trust and Reputation, Distributed Security, Agent Security
Stance: survey
Relates to: Provides the threat scaffolding that MalTool Malicious Tool Attacks deepens at the tool layer; complements lifecycle threats in Survey Of Agent Interoperability Protocols; motivates static-analysis defences like A Language-Based Approach To Prevent DDoS.

Tags

LLM Agents

Large-language-model-powered agents: natural-language coordination, tool use, multi-agent orchestration.

Surveys & frameworks

Protocols & communication

Failures & threats

Lineage

Capabilities (Ambient)

The three boundary-crossing primitives of the Ambient Calculus: in n enters a sibling ambient named n, out n exits the parent ambient if it is named n, open n dissolves a child ambient named n. Possession of a capability name is the formal account of access: a process can cross a boundary iff it has the corresponding capability. Conceptually adjacent to Capability Security in the Dennis–Van Horn lineage — both treat the unforgeable name-as-token as the primitive of permission.

In this vault

End-to-End Arguments in System Design

Reference

Saltzer, J. H., Reed, D. P., & Clark, D. D. (1984). “End-to-End Arguments in System Design.” ACM Transactions on Computer Systems, 2(4), 277-288. URL

Summary

Saltzer, Reed, and Clark articulate a design principle for layered distributed systems that had long been used but rarely stated explicitly: functions requiring knowledge and action at the endpoints of a communication — such as reliable delivery, integrity checking, encryption, duplicate suppression — cannot be fully and correctly implemented at lower layers. Lower-layer implementations are at best performance optimizations; the end-to-end argument says they cannot substitute for the end-level check.

The canonical example is careful file transfer between two hosts. Even if the communication network offers reliable delivery, threats remain — disk errors at either host, memory corruption during buffering, software bugs in the file-transfer program itself. No amount of reliability layered into the network can defend against these; only an end-to-end checksum computed from the file on disk at host A and verified against the file on disk at host B closes the loop. The paper then iterates the argument through encryption (only the endpoints know the plaintext), duplicate suppression (only the application knows what “duplicate” means at the transaction level), delivery acknowledgements, and crash recovery.

The principle is a design heuristic, not an absolute rule: performance sometimes justifies redundant lower-layer mechanisms (e.g., per-hop error correction in a very noisy link). But it inverts the naïve “make the network as reliable as possible” instinct, provides the intellectual backbone for the Internet’s dumb-network / smart-edges architecture, and underwrites TCP’s placement in the hosts rather than the routers. Its influence extends to REST’s principled avoidance of server-side session state, to security architectures that refuse to trust intermediaries, and to the “fate-sharing” style of protocol design.

Key Ideas

End-to-end argument: a function that must be correct at endpoints cannot be completely implemented below the endpoints.
Lower layers as optimization: partial lower-level help is only a performance enhancement, never a correctness substitute.
Careful file transfer: the worked example — only an end-to-end checksum protects against all failure modes.
Dumb core, smart edges: Internet architecture as the principle’s canonical application.
Encryption placement: true confidentiality requires endpoint encryption; network-level encryption is not enough.
Acknowledgements: application-meaningful acks (e.g., “request served”) require endpoint involvement.
Cost-benefit nuance: redundancy below is justified when error rate or cost of retry makes it worthwhile.

Connections

Principled Design Of The Modern Web Architecture — Fielding’s REST thesis formalizes many end-to-end commitments.
REST
LangSec — input parsing at the application boundary is itself an end-to-end verification.
Actor Model — supervisor-style recovery relies on end-to-end state ownership.
Impossibility of Distributed Consensus with One Faulty Process — endpoints cannot delegate liveness to lower layers either.