ActiveFence is now Alice
Whitepaper

The LLM Safety Review: Benchmarks & Analysis

As GenAI tools and the LLMs behind them impact the daily lives of billions, this report examines whether these technologies can be trusted to keep users safe.

What you’ll learn:

  • How LLMs respond to risky prompts from bad actors and vulnerable users
  • Where current models show safety strengths and weaknesses
  • Actionable steps to improve LLM safety and reduce harmful outcomes
Aug 1, 2023

Overview

In this first independent benchmarking report on the LLM safety landscape, ActiveFence’s subject-matter experts put leading models to the test. They used more than 20,000 prompts to analyze how six LLMs respond across seven major languages and four high-risk abuse areas: child exploitation, hate speech, self-harm, and misinformation. The report compares each model’s safety strengths and weaknesses, helping teams understand where gaps exist and where additional resources may be required.
