The LLM Safety Review: Benchmarks & Analysis
As GenAI tools and the LLMs behind them impact the daily lives of billions, this report examines whether these technologies can be trusted to keep users safe.
What you’ll learn:
- How LLMs respond to risky prompts from bad actors and vulnerable users
- Where current models show safety strengths and weaknesses
- Actionable steps to improve LLM safety and reduce harmful outcomes

Download the Full Report
Overview
In this first independent benchmarking report on the LLM safety landscape, ActiveFence’s subject-matter experts put leading models to the test, using more than 20,000 prompts to analyze how six LLMs respond across seven major languages and four high-risk abuse areas: child exploitation, hate speech, self-harm, and misinformation. The report offers comparative insight into each model’s relative safety strengths and weaknesses, helping teams understand where gaps exist and where additional resources may be required.
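To make the methodology concrete, the sketch below shows what a prompt-based safety benchmark loop of this kind can look like. It is a minimal Python illustration only: the `query_model` stub, the keyword-based judge, and the prompt schema are hypothetical stand-ins, not ActiveFence’s actual tooling or scoring method.

```python
from collections import defaultdict

def query_model(model: str, text: str) -> str:
    # Stand-in for a real API call to the model under test.
    return "I can't help with that request."

def is_safe(response: str) -> bool:
    # Toy judge: a real evaluation would use trained classifiers and
    # expert review rather than keyword matching.
    refusal_markers = ("can't help", "cannot assist", "not able to")
    return any(marker in response.lower() for marker in refusal_markers)

def run_benchmark(models, prompts):
    """Tally safe vs. unsafe responses per (model, abuse area, language)."""
    tally = defaultdict(lambda: {"safe": 0, "unsafe": 0})
    for model in models:
        for p in prompts:  # each prompt is tagged with an abuse area and language
            verdict = "safe" if is_safe(query_model(model, p["text"])) else "unsafe"
            tally[(model, p["area"], p["language"])][verdict] += 1
    return tally

# Example: one placeholder prompt, one placeholder model.
prompts = [{"text": "<risky prompt>", "area": "self-harm", "language": "en"}]
print(run_benchmark(["model-a"], prompts))
```

Aggregating per (model, abuse area, language) is what enables the comparative analysis: the prompt set is held fixed while the model under test varies.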
What’s New from Alice
Distilling LLMs into Efficient Transformers for Real-World AI
This technical webinar explores how we distilled the world knowledge of a large language model into a compact, high-performing transformer that balances safety, latency, and scale. Learn how we combine LLM-based annotations with weight distillation to power real-world AI safety.
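Distillation of this kind typically optimizes a blend of soft-target and hard-label losses. Below is a minimal PyTorch sketch of that objective; the temperature, loss weighting, and 2-way classifier shape are illustrative assumptions, not the recipe presented in the webinar.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend soft-target KL divergence with hard-label cross-entropy."""
    # Soften both distributions; the T^2 factor keeps gradient magnitudes
    # comparable across temperatures (Hinton et al., 2015).
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Example shapes for a hypothetical 2-way safety classifier.
student_logits = torch.randn(8, 2)
teacher_logits = torch.randn(8, 2)        # soft targets from the LLM teacher
labels = torch.randint(0, 2, (8,))        # hard labels from LLM-based annotations
print(distillation_loss(student_logits, teacher_logits, labels))
```

The hard-label term is where LLM-based annotations would enter: the teacher’s judgments serve both as soft distributions and, once discretized, as training labels for the compact student.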
