Trusted Capable Model Environment

A design pattern from Shumailov et al. (2025): a capable ML model run under explicit input/output constraints, Information Flow Control, and statelessness can play the role of a trusted third party in private-inference problems that classical cryptography (MPC, ZKP) cannot scale to. See Trusted Machine Learning Models Unlock Private Inference.


index

Agent Communications Vault

A curated, wikilink-connected reading vault on agent communication languages, multi-agent systems, capability security, distributed systems, and LLM agents — from McCarthy and Minsky through KQML/FIPA to modern LLM agent protocols.

Each note summarises a paper in its own words (summary, key ideas, conceptual contribution, connections) and is cross-linked to related concepts and papers, forming a navigable graph of the field.

Start with concept-map for a guided tour, or browse the map of content below.

How to contribute

The vault is a plain-text zetl wikilink graph — every note is a markdown file with [[wikilinks]]. Contributions welcome:

Clone: git clone https://github.com/anuna-cooperative/agent-comms-wiki.git
Add or edit notes as plain markdown. New paper notes should follow the structure of existing ones (Reference, Summary, Key Ideas, Connections, Conceptual Contribution, Tags).
Run zetl check to validate links, and zetl build to preview the site locally.
Open a pull request at https://github.com/anuna-cooperative/agent-comms-wiki.

See README for detailed conventions.

Map of Content

Concept Hubs

Foundational

concept-map

Conceptual Map

A guided conceptual tour through the vault. Where index lists the papers, this page lists the ideas and shows how they interlock. Every paper note now also carries a ## Conceptual Contribution section (claim / mechanism / concepts / stance / relates-to).

1. The Central Tension: What Does a Message Mean?

Agent communication’s perennial question — whose mental states does a message commit? — runs the length of this vault.

Speech Act Theory (Austin → Searle → Foundations Of Illocutionary Logic) fixes a vocabulary: illocutionary force, direction of fit, sincerity and preparatory conditions. Every ACL after this inherits it.
Mentalistic Semantics — grounding message meaning in the beliefs/intentions of sender and receiver. KQML (KQML Overview, KQML Language And Protocol, KQML as an Agent Communication Language) and FIPA-ACL adopt it.
Commitment-based Semantics / Public Semantics — the counter-move. Singh’s critique (ACL Rethinking Principles, Agent Communication Languages - Rethinking the Principles) argues mentalistic semantics is unverifiable: we cannot inspect another agent’s mind, only its public commitments. Agent Communication And Institutional Reality pushes further: every message is a declaration that alters social commitments; Searle’s “counts-as” is the operative logic.
Verifiable Semantics — Verifiable Semantics for ACLs formalises the critique by requiring grounding in program state so conformance is model-checkable. A Common Ontology Of ACLs offers a reconciliation: role-instanced public attitudes unify the two families.
Conversation Policy / Interaction Protocols — even with messages nailed down, coordination needs conversations. Coordinating Agents Using ACL Conversations (Colored Petri Nets), ACRE Agent Conversation Reasoning Engine (Dooley graphs), and An Interaction-oriented Agent Framework for Open Environments (commitment-based protocols) make the conversation first-class.

Surveys mapping this debate: The State of the Art in Agent Communication Languages, Trends in Agent Communication Language.

2. The Language Stack

Messages compose into languages, which in turn compose into protocols.

Layer	Concept	Representative papers
Content	KIF, ontology term sets	KQML Overview, Ontolingua Portable Ontology Specifications, Handbook On Ontologies
Message	Performatives / illocutions	KQML, FIPA-ACL, Foundations Of Illocutionary Logic
Conversation	Interaction Protocols	Coordinating Agents Using ACL Conversations, ACRE Agent Conversation Reasoning Engine
Transport	Facilitators, routing	KQML Language And Protocol, Model Context Protocol, Agent-to-Agent Protocol

This same stack — content / message / conversation / transport — reappears in the modern LLM-agent protocol wave: see Survey Of AI Agent Protocols and Survey Of Agent Interoperability Protocols, which place Model Context Protocol (tools), ACP, Agent-to-Agent Protocol, and Agent Network Protocol at progressively higher layers.

3. How Does Shared Language Arise?

A separate tradition asks where meaning comes from rather than what it contains.

Linguistic foundations. Three Models for the Description of Language establishes what structure a shared code must have (Chomsky hierarchy, transformational grammar). Algorithmic Information Theory - Grunwald Vitanyi provides the information-theoretic counterpart: meaning is compressed description.
Language Games. Language Games for Autonomous Robots (Steels) shows grounded lexicons self-assemble through situated interaction — no designer required. The same bootstrap appears decision-theoretically in Towards Automating the Evolution of Linguistic Competence and Toward Automated Evolution of ACLs: rational agents negotiate vocabulary when current language fails.
Emergent Communication. The deep-learning revival: Multi-Agent Cooperation and the Emergence of Natural Language, Emergence of Grounded Compositional Language in Multi-Agent Populations — neural agents in referential/signalling games evolve compositional codes. On the Pitfalls of Measuring Emergent Communication is the sharpest critique: most metrics fail to distinguish real communication from confounds; measure positive signalling and positive listening with causal interventions.
Common Business Communication Language is an analogue in the pre-ML era — an open-ended language negotiable between organisations with graceful partial-understanding fallback.
The LLM inflection point. Why AI Agents Communicate In Human Language argues natural language is exactly the wrong inter-agent medium: lossy, non-differentiable, ambiguous. The thread rejoins the ACL debate a quarter-century later.

Trusted Machine Learning Models Unlock Private Inference

Trusted Machine Learning Models Unlock Private Inference for Problems Currently Infeasible with Cryptography

Reference: Shumailov, Ramage, Meiklejohn, Kairouz, Hartmann, Balle & Bagdasarian (2025). Trusted Machine Learning Models Unlock Private Inference for Problems Currently Infeasible with Cryptography. arXiv:2501.08970 (Google Research). URL.

Summary

The paper proposes Trusted Capable Model Environments (TCMEs) — a new design point in the privacy-preserving-computation landscape, sitting between classical trusted execution environments and cryptographic protocols such as multi-party computation (MPC), homomorphic encryption, and zero-knowledge proofs. The motivating observation: capable modern ML models can plausibly play the role of the trusted third party in many private-inference scenarios that classical cryptography handles only at toy scale or not at all.

A TCME is defined by three constraints under which a capable model operates: (i) explicit input/output constraints scoping what the model is permitted to receive and emit; (ii) explicit information-flow control binding outputs to authorised data-flow channels; (iii) explicit statelessness — the model cannot retain or leak inputs across sessions. Under these constraints, even an inscrutable LLM can serve as a credible “trusted intermediary”: it computes a function of two parties’ data and reveals only the agreed output.
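
The three constraints can be made concrete with a small sketch. It is illustrative only and not taken from the paper: the `TCME` class, its fields, and `toy_model` are hypothetical names, and a plain comparison function stands in for a capable model. A frozen dataclass only blocks attribute reassignment, so the statelessness shown here is a gesture at the property, not an enforcement of it.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class TCME:
    """Hypothetical wrapper: a model callable plus the three TCME constraints."""
    model: Callable[[str, str], str]   # stand-in for a capable model (assumption)
    max_input_chars: int               # (i) input constraint
    allowed_outputs: FrozenSet[str]    # (i)/(ii) only agreed outputs may leave

    def run(self, input_a: str, input_b: str) -> str:
        # (i) Input constraint: reject anything outside the agreed scope.
        for text in (input_a, input_b):
            if len(text) > self.max_input_chars:
                raise ValueError("input exceeds the agreed size limit")
        raw = self.model(input_a, input_b)
        # (ii) Flow control: release the result only if it is an agreed output.
        if raw not in self.allowed_outputs:
            raise PermissionError("output is outside the agreed channel")
        # (iii) Statelessness: nothing is written to self; locals vanish on return.
        return raw


def toy_model(a: str, b: str) -> str:
    """Toy stand-in for the model: which party holds the larger number?"""
    x, y = int(a), int(b)
    return "A" if x > y else "B" if y > x else "EQUAL"


env = TCME(
    model=toy_model,
    max_input_chars=12,
    allowed_outputs=frozenset({"A", "B", "EQUAL"}),
)
result = env.run("1200000", "950000")
```

Here each party learns only which of the three agreed answers applies. A model that tried to echo a raw input would be stopped by the output check rather than by any property of the model itself.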

The authors argue TCMEs unlock private inference for problems where MPC is infeasible because the function is too rich, the inputs too large, or the spec too implicit (natural-language matching, fuzzy de-duplication, semantic agreement-checking). They walk through use cases — private record matching, contract negotiation, secret-keeping triage — and show that even classical cryptographic problems (private set intersection, secure multi-party comparison) admit TCME implementations that scale further than current MPC. The paper closes with the limitations: trust in TCMEs reduces to trust in the model+hardware+policy stack; statelessness must be engineered, not assumed.
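
To illustrate the private-set-intersection case, the sketch below shows the functionality a trusted intermediary would compute: both parties hand over their lists and only the overlap is returned. It is an assumption-laden toy, not the paper's construction. `trusted_intersection` and `normalise` are hypothetical names, and simple case and whitespace normalisation stands in for the fuzzy matching a capable model might perform.

```python
from typing import Iterable, Set


def normalise(item: str) -> str:
    """Crude stand-in for model-based fuzzy matching: lowercase, collapse spaces."""
    return " ".join(item.lower().split())


def trusted_intersection(party_a: Iterable[str], party_b: Iterable[str]) -> Set[str]:
    """Return only the records both parties hold; non-matching items are never output."""
    a = {normalise(x) for x in party_a}
    b = {normalise(x) for x in party_b}
    return a & b


shared = trusted_intersection(
    ["Alice Smith", "bob  jones", "Carol Wu"],
    ["alice smith", "Dan Roe", "Bob Jones"],
)
```

Neither "Carol Wu" nor "Dan Roe" appears in the result, which is the whole privacy guarantee of this toy: it holds only if the environment running the function is itself trusted.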

Key Ideas

TCME definition: a capable ML model + explicit I/O constraints + explicit information-flow control + explicit statelessness.
Trusted-third-party substitution: the model fills the trusted-third-party role that MPC otherwise has to emulate with a cryptographic protocol run among non-colluding parties.
Coverage envelope: TCMEs handle privacy problems too rich or too implicit for current MPC (semantic matching, fuzzy agreements, natural-language contracts).
Bridge to cryptography: even classical PSI/comparison protocols can be implemented as TCMEs — sometimes more efficiently.
Statelessness is engineered: memory leaks, side channels, and re-training contamination are the real attack surface, not the model logic.
Trust composition: TCME trust assumption = trust(model) ∧ trust(hardware) ∧ trust(policy enforcement).
Use cases sketched: private record matching, negotiation, triage, search over private corpora, semantic compliance checks.

Connections

Conceptual Contribution

Claim: Capable ML models, operated under explicit information-flow and statelessness constraints, can act as trusted third parties for private-inference problems that classical cryptography cannot scale to. This expands the realm of feasible privacy-preserving computation beyond MPC’s current envelope.
Mechanism: Define Trusted Capable Model Environments (TCMEs): model + explicit I/O constraints + explicit IFC + explicit statelessness. Demonstrate via use cases that TCMEs solve both novel privacy problems (semantic matching) and re-instantiate classical ones (PSI) at scales MPC cannot reach.
Concepts introduced/used: Trusted Capable Model Environment, Trusted Third Party, Information Flow Control, Private Inference, Statelessness (Privacy), Multi-Party Computation
Stance: position / architectural proposal
Relates to: Direct companion to NDAI Agreements — both treat TEE+AI or model+constraints as a substrate for previously infeasible commitment / privacy primitives. Provides the technical substrate that Privacy Reasoning in Ambiguous Contexts reasons about behaviourally and that Infrastructure for AI Agents would expose as governance infrastructure. Complementary to Defeating Prompt Injections by Design’s CaMeL: both treat the agent as a constrained reasoner whose outputs are gated by information-flow policy.

NDAI Agreements

Reference: Stephenson, Miller, Sun, Annem & Parikh (2025). NDAI Agreements. arXiv:2502.07924 (UIUC; Cornell Tech; et al.). URL.

Summary

The “NDAI agreement” — non-disclosure AI agreement — is a mechanism in which a TEE combined with an AI agent jointly stands in for a trusted human intermediary, resolving the classical disclosure–appropriation paradox of information markets first identified by Arrow (1962) and Nelson (1959). An inventor cannot reveal an idea to a potential investor without risking misappropriation; without revealing it, no efficient bargain can be struck. The result is well-known: under-disclosure, under-investment, under-licensing.

Stephenson et al. show formally — via a buyer/seller bargaining game — that delegating the disclosure-and-payment decision to a tamper-proof program running inside a TEE eliminates the hold-up problem, achieving full disclosure and an efficient ex post transfer. When the invention’s value exceeds the value a TEE can fully secure (e.g. because some leakage is unavoidable), partial disclosure still strictly improves welfare over the no-disclosure equilibrium. They then model agent error — payments or disclosures going wrong — and prove that simple safeguards (budget caps, acceptance thresholds) preserve most of the efficiency gains.
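
The role of the two safeguards can be shown with a toy settlement rule. This is a hedged illustration and not the paper's bargaining game: the `settle` function, the `Outcome` type, and the pricing rule (pay the agent's valuation, clipped to the cap) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    disclosed: bool
    payment: float


def settle(agent_valuation: float, acceptance_threshold: float, budget_cap: float) -> Outcome:
    """Hypothetical rule run inside the enclave; disclosure and payment are atomic.

    agent_valuation:      the agent's (possibly erroneous) estimate of the invention's value
    acceptance_threshold: minimum payment the seller will accept
    budget_cap:           maximum payment the buyer has authorised
    """
    payment = min(agent_valuation, budget_cap)   # cap bounds the buyer's loss
    if payment < acceptance_threshold:           # threshold bounds the seller's loss
        return Outcome(disclosed=False, payment=0.0)
    return Outcome(disclosed=True, payment=payment)


fair = settle(agent_valuation=100.0, acceptance_threshold=60.0, budget_cap=80.0)
overestimate = settle(agent_valuation=1000.0, acceptance_threshold=60.0, budget_cap=80.0)
underestimate = settle(agent_valuation=40.0, acceptance_threshold=60.0, budget_cap=80.0)
```

In this toy, an agent that wildly overvalues the invention still cannot spend more than the cap, and an agent that undervalues it triggers no disclosure at all, so the invention never leaves the enclave without acceptable payment.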

The substantive economic claim is that TEE + AI behave as an “ironclad NDA”: a credible commitment device for the disclosure problem that was previously unattainable with paper contracts (because expropriation is unverifiable) or with cryptography alone (because invention value is unbounded and the seller’s information is a complex unstructured artefact). The result links the Mechanism Design / Hold-Up Problem tradition to AI-agent infrastructure, and gives a sharp theoretical case for the economic value of trusted model environments and confidential-compute hardware as agent-economy substrates.

Key Ideas

Formalises the Arrow–Nelson information paradox / hold-up problem in a bargaining model between seller (inventor) and buyer (investor).
With TEEs + AI agents, both parties delegate disclosure and payment to tamper-proof programs that neither can subvert; this implements an ex-ante commitment device unavailable under classical contracts.
Full-disclosure efficient equilibrium under the NDAI when the invention’s value lies within what the TEE can secure.
Partial disclosure dominates no-disclosure even when full security is impossible: high-value inventions still get partially revealed in welfare-improving ways.
Models agent imperfection: errors in payment or disclosure can occur; budget caps and acceptance thresholds bound the damage and preserve most welfare.
Frames TEEs+AI as an “ironclad NDA”: a cryptographically/hardware-enforced commitment that traditional NDAs cannot match.
Policy implications for R&D Commercialisation, Technology Transfer, and inter-firm collaboration; bridges economic theory to confidential-compute hardware.

Connections

Conceptual Contribution

Claim: A trusted execution environment hosting an AI agent can serve as a credible commitment device that solves the classical Arrow–Nelson disclosure problem of information markets — achieving full disclosure and efficient transfer where paper NDAs and pure cryptography both fail.
Mechanism: A bargaining model in which TEEs+AI mediate the disclosure-and-payment decision; closed-form characterisation of equilibria for full and partial disclosure; sensitivity analysis to agent error with policy-instrument bounds (budget caps, acceptance thresholds).
Concepts introduced/used: NDAI Agreement, Trusted Execution Environment, Hold-Up Problem, Arrow Information Paradox, Mechanism Design, Commitment Device, Disclosure Game
Stance: formal economic theory with technical implications
Relates to: Companion to Trusted Machine Learning Models Unlock Private Inference — both argue that capability + trusted execution can replace previously infeasible cryptographic primitives. The economic counterpart to the engineering catalogue in Infrastructure for AI Agents; a building block for the markets imagined in Virtual Agent Economies and the information-asymmetry resolution explored in Language Models Can Reduce Asymmetry in Information Markets. Sits in the lineage of the Vickrey / mechanism-design tradition.

Information Flow Control

Static or dynamic restriction of how data labelled with a security class can flow through a program. Cornerstone of Capability Security, language-based security (Security Kernel Lambda Calculus), and the CaMeL approach to prompt injection.
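
A minimal sketch of the dynamic variant, assuming a simple three-level linear ordering of security classes. The names `Level`, `Labeled`, `combine`, and `release` are hypothetical and chosen for this example; real systems typically use richer lattices and also track implicit flows, which this toy ignores.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    """Assumed linear ordering of security classes (low to high)."""
    PUBLIC = 0
    CONFIDENTIAL = 1
    SECRET = 2


@dataclass(frozen=True)
class Labeled:
    value: str
    level: Level


def combine(a: Labeled, b: Labeled) -> Labeled:
    """Derived data carries the higher of its inputs' labels (the join)."""
    return Labeled(a.value + b.value, max(a.level, b.level))


def release(data: Labeled, sink_clearance: Level) -> str:
    """Runtime check: data may flow only to a sink cleared at or above its label."""
    if data.level > sink_clearance:
        raise PermissionError(
            f"{data.level.name} data cannot flow to a {sink_clearance.name} sink"
        )
    return data.value


greeting = Labeled("hello ", Level.PUBLIC)
key = Labeled("k3y", Level.SECRET)
mixed = combine(greeting, key)
```

Releasing `greeting` to a `PUBLIC` sink succeeds, while releasing `mixed` to the same sink raises `PermissionError`, because the secret input taints everything derived from it.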