ActiveFence is now Alice

Blog

SPIRE: Detecting Prompt Injection in Zero-Day Using Semantic Matching

Shiri Simon Segal

Oct 30, 2025

Our Guardrails keep GenAI safe, fast, and production-ready

Check out WonderFence

TL;DR

As AI attacks like prompt injections become more sophisticated, traditional static defenses and keyword filters are increasingly easy to bypass. Alice developed the Semantic Prompt Injection Retrieval Engine (SPIRE) to provide a more agile defense by identifying adversarial intent rather than just matching specific signatures. Instead of requiring constant model retraining, SPIRE uses a dynamic index of "adversarial fragments"—short, validated text spans known to cause harmful behavior—to detect new threats via semantic similarity. This zero-day approach allows security teams to patch vulnerabilities in minutes, catching mutated or translated versions of attacks that standard classifiers often miss.

The Expanding Attack Surface of Generative AI

Large Language Models are revolutionizing enterprise operations, but their adoption brings unprecedented security challenges. At the forefront of these challenges is prompt injection, a sophisticated attack vector that can manipulate AI systems to perform unintended actions or reveal sensitive information. Unlike traditional cybersecurity threats, prompt injections can bypass conventional security measures by exploiting the very nature of how LLMs process natural language.

Understanding the Threat Landscape

Prompt injection attacks manifest in two primary forms:

Direct Injection: Malicious instructions embedded directly in user inputs that attempt to override system prompts
Indirect Injection: Covert directives hidden in external content that the LLM processes, such as web pages, documents, or databases

The latter category is particularly concerning for enterprise environments where LLMs regularly interact with external data sources. A seemingly innocuous document could contain hidden instructions that, when processed by an LLM, could lead to unauthorized data access, policy violations, or even complete system compromise.

Traditional Detection Methods and Their Limitations

Current approaches to prompt injection detection have significant shortcomings:

Rule-Based Systems: While straightforward, these systems can be evaded by creative rephrasing or novel injection techniques
Input Sanitization: Basic filtering approaches often fail to understand semantic context, leading to both false positives and missed attacks
Pattern Matching: These methods struggle with the infinite variation possible in natural language attacks

SPIRE: A New Paradigm in Prompt Injection Detection

Our research team has developed SPIRE (Semantic Prompt Injection Recognition Engine), an approach specifically designed to address the limitations of traditional detection methods. SPIRE leverages advanced semantic analysis to detect even previously unseen injection attempts by focusing on the intent and meaning behind inputs rather than their surface-level characteristics.

Core Technical Innovation

At the heart of SPIRE is a sophisticated semantic matching system that:

Analyzes contextual relationships between legitimate prompts and potential injection attempts
Employs adversarial training techniques to build robust detection capabilities
Utilizes a multi-layer verification process that examines both semantic content and structural patterns

Key Advantages Over Traditional Methods

Semantic-based detection offers several critical improvements:

Zero-Day Detection: By focusing on semantic patterns rather than known attack signatures, SPIRE can identify novel injection attempts that would bypass traditional filters
Reduced False Positives: Deep contextual understanding allows for more accurate distinction between legitimate queries and malicious inputs
Scalability: The semantic approach scales effectively with the volume and variety of enterprise LLM interactions
Adaptive Learning: The system can be continuously updated with new semantic patterns without requiring complete retraining

Enterprise Implementation Considerations

For organizations deploying LLMs at scale, SPIRE offers several key benefits:

Seamless Integration: Designed to work with existing LLM infrastructure without requiring significant architectural changes
Customizable Security Thresholds: Organizations can calibrate detection sensitivity based on their specific risk profiles and use cases
Comprehensive Audit Trails: Detailed logging and reporting capabilities support security investigations and compliance requirements
Real-Time Protection: Minimal latency impact ensures security measures don't compromise system performance

The Path Forward: Securing the AI-Powered Enterprise

As LLMs become more deeply integrated into enterprise operations, the sophistication of prompt injection attacks will continue to evolve. SPIRE represents a significant advancement in our ability to protect these systems, but it's part of a broader security framework that must evolve alongside the threat landscape.

The battle against prompt injection requires a multi-layered approach combining advanced technical solutions, organizational best practices, and ongoing research into emerging attack vectors. By implementing semantic-based detection systems like SPIRE alongside other security measures, enterprises can significantly reduce their vulnerability to these sophisticated attacks while maintaining the benefits of LLM integration.

As we continue to develop and refine SPIRE, we're committed to sharing our findings with the security community and working towards a more secure AI-powered future. The challenges posed by prompt injection are significant, but with the right tools and approaches, we can stay ahead of this evolving threat.

SPIRE is just one part of our defense ecosystem.

Learn more about WonderFence

What’s New from Alice

The Former Google Cloud CISO's Take on AI, Agents, and What Comes Next

There's a lot of noise around AI and security right now, and not many people who can cut through it the way Phil Venables can. He was CISO at Goldman Sachs, then the first CISO for Google Cloud, and he's now a partner at Ballistic Ventures. In this episode, he tells us why attackers scaling up worries him more than the vulnerabilities themselves, what trust even means when an agent is acting in your environment, and why the answer to most of this comes back to the same fundamentals we've leaned on for years.

Listen Now

It Takes AI to Break AI: The Case for AI Red Teaming

webinar

May 25, 2026

This is some text inside of a div block.

min read

May 25, 2026

This is some text inside of a div block.

min watch

As AI systems gain autonomy, organizations need security approaches built specifically for AI behavior. Learn why AI-driven red teaming is becoming a critical defense layer.

Learn More