
Providing Confidence for Safe, On-Time Release of an In-Game, AI-Powered NPC

A leading AAA gaming studio partnered with Alice to proactively test and secure an AI-powered in-game NPC ahead of launch. Using a hybrid AI red teaming approach, Alice surfaced over 20,000 policy violations, including violations in high-risk areas such as self-harm, child safety, and prompt injection, across four languages, multiple modalities, and multiple conversation types. The findings enabled architectural improvements and policy enforcement, giving product, legal, and executive stakeholders the confidence to launch safely while maintaining the immersive gameplay experience players expect.

May 26, 2026
Company Info

Company Size

1,000+ employees

Industry

Gaming / AI-Powered Experiences

About

A top-tier AAA gaming studio with 1,000+ employees, known for high-budget blockbuster titles. As the studio integrates generative AI into core gameplay experiences, maintaining player trust, brand safety, and narrative integrity across multiple languages and modalities has become a critical product priority.
AT A GLANCE

Using a hybrid AI red teaming approach, Alice surfaced over 20,000 policy violations in high-risk areas including self-harm, child safety, and prompt injection, tested across four languages, multiple modalities, and multiple conversation types. The findings enabled architectural improvements and policy enforcement while maintaining the immersive gameplay experience players expect.

Challenge

To revolutionise player interaction, the studio set out to launch an AI-powered non-player character (NPC) capable of dynamic, natural-language conversations. But the unpredictable behaviour of large language models (LLMs) introduced significant safety risks that threatened player trust and brand reputation.

The system was complex: an agentic AI architecture with multiple LLMs orchestrated through system prompts, LLM-based judges, and real-time content filters. It had to support multi-turn conversations across four languages and multiple modalities, all while staying in character. The studio's communication policy raised the bar further, requiring every NPC interaction to be contextually appropriate and narratively aligned.

To balance creativity with control, the team needed to rigorously pressure-test the system before launch to ensure safety, maintain narrative integrity, and protect the brand.

How Alice Helped

To mitigate safety risks before launch, the studio partnered with Alice to deploy WonderBuild, Alice's purpose-built AI red teaming solution for stress-testing generative AI systems before deployment.

Alice implemented a hybrid red teaming strategy combining two complementary approaches:

Automated adversarial testing: Thousands of prompts were generated across languages, modalities, and gameplay scenarios to systematically uncover policy violations, misalignments, and edge-case failures at scale.

Manual, intelligence-led red teaming: Subject-matter experts investigated nuanced failure modes, narrative inconsistencies, and safety blind spots that automated testing alone cannot surface.
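
To make the automated side concrete, the coverage it describes can be pictured as a test matrix: one test case per combination of language, modality, conversation type, and risk area. The following is a minimal sketch under stated assumptions, not WonderBuild's implementation. The language placeholders, the modality names, the `single_turn` conversation type, and the `build_test_matrix` helper are all hypothetical; the case study does not list them.

```python
from itertools import product

# Hypothetical dimensions. The case study mentions four languages, multiple
# modalities, and multiple conversation types, but names none of them.
LANGUAGES = ["lang_a", "lang_b", "lang_c", "lang_d"]      # placeholders
MODALITIES = ["text", "voice"]                            # assumed
CONVERSATION_TYPES = ["single_turn", "multi_turn"]        # "single_turn" is assumed
RISK_AREAS = [
    "self_harm",
    "child_safety",
    "illegal_activity",
    "prompt_injection",
    "narrative_break",
]


def build_test_matrix(languages, modalities, conversation_types, risk_areas):
    """Return one test-case descriptor per combination of the four dimensions."""
    return [
        {
            "language": language,
            "modality": modality,
            "conversation_type": conversation_type,
            "risk_area": risk_area,
        }
        for language, modality, conversation_type, risk_area in product(
            languages, modalities, conversation_types, risk_areas
        )
    ]


matrix = build_test_matrix(LANGUAGES, MODALITIES, CONVERSATION_TYPES, RISK_AREAS)
print(len(matrix))  # 4 * 2 * 2 * 5 = 80 combinations
```

Each descriptor would then be expanded into many adversarial prompts, which is how a small matrix grows into thousands of test cases.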

The approach was tailored to the studio's unique architecture and communication policy, testing how the NPC performed under real-world conversational pressure while staying in character. Findings revealed vulnerabilities in critical areas including self-harm, child safety, illegal activity, prompt injection, and narrative-breaking responses. Each issue was triaged and translated into actionable, architecture-level recommendations that strengthened system integrity without sacrificing immersion.
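
The triage step above can be illustrated with a short sketch that orders findings by severity and counts them per risk area. This is an assumption-laden illustration rather than Alice's actual workflow: the `findings` records, the severity labels, and the `triage` helper are hypothetical, and the sample records are invented for demonstration, not data from the engagement.

```python
from collections import Counter

# Hypothetical severity scale; the case study does not describe one.
SEVERITY_RANK = {"high": 0, "medium": 1, "low": 2}

# Illustrative records only, not findings from the engagement.
findings = [
    {"risk_area": "narrative_break", "language": "lang_a", "severity": "low"},
    {"risk_area": "prompt_injection", "language": "lang_b", "severity": "high"},
    {"risk_area": "self_harm", "language": "lang_a", "severity": "high"},
    {"risk_area": "prompt_injection", "language": "lang_c", "severity": "medium"},
]


def triage(records):
    """Order findings from most to least severe and count them per risk area."""
    ordered = sorted(records, key=lambda record: SEVERITY_RANK[record["severity"]])
    counts = Counter(record["risk_area"] for record in records)
    return ordered, counts


ordered, counts = triage(findings)
for record in ordered:
    print(record["severity"], record["risk_area"], record["language"])
print(counts.most_common())
```

Grouping by risk area in this way is one plausible route from a long list of individual violations to a short list of architecture-level recommendations.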

The Results

Within just two weeks, Alice's WonderBuild red teaming solution delivered the clarity and confidence the studio needed to move forward.

The engagement surfaced 20,000+ policy-violating or misaligned outputs across languages, modalities, and gameplay scenarios, providing a comprehensive picture of the system's risk surface before a single player encountered it.

Key outcomes:

  • 20,000+ policy violations and misaligned outputs uncovered across four languages and multiple modalities
  • Multiple architecture-level improvements implemented directly from Alice's findings
  • Vulnerabilities identified in high-risk areas including self-harm, child safety, illegal activity, and prompt injection
  • Product, legal, and executive teams achieved shared confidence in system readiness
  • Safe, on-time launch delivered without compromising gameplay immersion

The red teaming exercises not only uncovered high-impact risks but also delivered a clear, data-driven path to remediation. As a result, the studio reinforced its safety posture while preserving the immersive, in-character experience critical to gameplay.

For teams building and launching AI-powered apps and agents, explore how WonderBuild stress-tests generative AI systems before deployment.
