Validate Model Safety and Benchmark Against Competitors for Responsible Deployment

To validate its most advanced foundation model to date, Amazon engaged Alice for a manual red-teaming evaluation of Nova Premier, testing the model's readiness for safe and secure deployment.

Feb 18, 2026

Validating Foundation Model Safety for Responsible Deployment

Company Info

Industry: GenAI - LLM

About

Nova Premier is Amazon’s most advanced foundation model. It is designed for complex reasoning and serves as a distillation teacher for downstream systems.

"Through this hands-on evaluation, Alice strengthened Nova’s security posture and supported Amazon’s broader Responsible AI goals, ensuring the model could be deployed with greater confidence."

Rahul Gupta

Senior Manager, Responsible AI, Amazon AGI

AT A GLANCE

To help validate its most advanced model to date, Amazon partnered with Alice to red-team Nova Premier against high-risk prompts. The results positioned Nova as safer than its competitors, marking a major step toward secure enterprise deployment.

Challenge

Amazon aimed to rigorously validate the safety of its most capable foundation model, Nova Premier, ahead of public release. With the risks associated with advanced generative models increasing, the company sought to benchmark the model against real-world adversarial threats across critical responsible AI (RAI) categories.

Solution

Alice partnered with Amazon as a third-party red teamer to perform manual, blind evaluations of Nova Premier on Amazon Bedrock. Testing spanned prompts across Amazon’s eight RAI categories, including safety, fairness and bias, and privacy and security. Alice also benchmarked Nova Premier against other LLMs for comparison.

Impact

The collaboration demonstrated how expert-led manual red teaming complements automated testing, offering a comprehensive snapshot of model robustness.
