Benchmark

The LLM Safety Review: Benchmarks & Analysis

As GenAI tools and the LLMs behind them impact the daily lives of billions, this report examines whether these technologies can be trusted to keep users safe.

What you’ll learn:

How LLMs respond to risky prompts from bad actors and vulnerable users
Where current models show safety strengths and weaknesses
Actionable steps to improve LLM safety and reduce harmful outcomes

Aug 1, 2023

Overview

In this first independent benchmarking report on the LLM safety landscape, ActiveFence’s subject-matter experts put leading models to the test. They used more than 20,000 prompts to analyze how six LLMs respond across seven major languages and four high-risk abuse areas: child exploitation, hate speech, self-harm, and misinformation. The report compares each model’s relative safety strengths and weaknesses, helping teams understand where gaps exist and where additional resources may be required.


