Whitepaper
Misleading Models - Testing for Deception
To build safe, trustworthy AI applications, enterprises must understand how and why LLMs may scheme and deceive. In partnership with a major LLM provider, we tested how incentives such as self-preservation or user appeasement can drive strategic deception. Download the report to learn more.
May 6, 2025

Download the Full Report
Overview
In this report, we cover:
- How LLMs strategically deceive users
- Incentives that trigger dishonest behavior
- Risks of deploying untested models