The TWIML AI Podcast (formerly This Week in...

Coercing LLMs to Do and Reveal (Almost) Anything with Jonas Geiping

Description

Today we're joined by Jonas Geiping, a research group leader at the ELLIS Institute, to explore his paper, "Coercing LLMs to Do and Reveal (Almost) Anything". Jonas explains how neural networks can be exploited, highlighting the risk of deploying LLM agents that interact with the real world. We discuss the role of open models in enabling security research, the challenges of optimizing over certain constraints, and the ongoing difficulties in achieving robustness in neural networks. Finally, we turn to the future of AI security and the need for a better approach to mitigating the risks posed by optimized adversarial attacks. The complete show notes for this episode can be found at twimlai.com/go/678.

More Episodes

Controlling Fusion Reactor Instability with Deep Reinforcement Learning with Aza Jalalvand

Today we're joined by Azarakhsh (Aza) Jalalvand, a research scholar at Princeton University, to discuss his work using deep reinforcement learning to control plasma instabilities in nuclear fusion reactors. Aza explains his team developed a model to detect and avoid a fatal plasma instability...

Published 04/29/24

GraphRAG: Knowledge Graphs for AI Applications with Kirk Marple

Today we're joined by Kirk Marple, CEO and founder of Graphlit, to explore the emerging paradigm of "GraphRAG," or Graph Retrieval Augmented Generation. In our conversation, Kirk digs into the GraphRAG architecture and how Graphlit uses it to offer a multi-stage workflow for ingesting,...

Published 04/22/24
