Alice AI Security Benchmark

Overview
In this report, we cover:
- Model performance on precision, recall, and FPR using real and synthetic adversarial prompts
- Multilingual detection accuracy across 13 global languages
- Emerging techniques in prompt injection and jailbreak tactics that evade standard filters
Use these findings to assess your current safety stack, then reinforce your defenses with a system built to scale. Download the report and secure your GenAI systems before attackers find the gaps.
‍

What’s New from Alice
Your LLM Has No Idea What It's Doing
Diana Kelley, CISO at Noma Security and former Cybersecurity CTO at Microsoft, joins Mo to work through the real mechanics of LLM risk: why the context window flattens the trust boundary between system instructions and user data, why that makes reliable internal guardrails essentially impossible, and why agentic AI is less a new threat category and more a stress test for the hygiene debt organizations never fully paid off.
Distilling LLMs into Efficient Transformers for Real-World AI
This technical webinar explores how we distilled the world knowledge of a large language model into a compact, high-performing transformer—balancing safety, latency, and scale. Learn how we combine LLM-based annotations and weight distillation to power real-world AI safety.
Exposing the Hidden Risks of AI Toys
AI-powered toys are entering children’s everyday lives, but new research reveals serious safety gaps. Alice testing shows how child-like interactions can lead to inappropriate content, unsafe conversations, and risky behaviors.
