Agent Libel

A failure mode in which an LLM agent, while operating autonomously, makes false or defamatory claims about third parties such as other agents or users.

Last changed by zetl · stable 5d · history