ActiveFence is now Alice

Blog

Secure AWS Strands Agents with Alice WonderFence

Lior Knaany

Ilana Berger

Mar 29, 2026

A full reference implementation is available in the

Strands samples repository

TL;DR

AWS Strands Agents introduce new risks as they interact with tools and data, making traditional input/output filtering insufficient. By integrating Alice WonderFence, you can enforce runtime guardrails that validate inputs, control tool usage, and prevent sensitive data exposure.

Introduction: Agent Power Introduces New Risk Surfaces

Agent frameworks are moving quickly from simple chat interfaces to systems that take actions: calling tools, accessing data, and orchestrating workflows. AWS Strands is one of the frameworks enabling this shift, providing a structured way to build agents that interact with external systems.

That flexibility comes with a tradeoff. As agents gain autonomy, the risk surface expands:

Inputs can contain malicious or unsafe content - whether introduced by bad actors, unintended user behavior, or gaps in application design
Outputs can leak sensitive data or violate policies due to misuse, misconfiguration, or incomplete safeguards
Tool calls can be triggered in unintended ways
Multi-step reasoning chains can amplify small issues into larger failures

‍

Traditional validation at the model boundary is not enough. What’s needed is runtime guardrails, the ability to inspect, evaluate, and enforce policies continuously as the agent operates.

This guide shows how to integrate Alice WonderFence into AWS Strands Agents to add that control layer. The same pattern can be reused across frameworks, making this integration part of a broader approach to securing agent-based systems.

The “Nightmare” Scenario: The Silent Data Leak

To illustrate what can happen without proper protection, consider a real example from an AI red teaming exercise conducted for a financial services client. This scenario is an example of the Confused Deputy Problem, where a system with legitimate access is manipulated into acting on behalf of an unauthorized user:

A banking agent, connected to internal tools such as customer records and transaction history, has access to get_transaction_history.

An authenticated user submits a seemingly benign request:

“I forgot my account number, but I think it ends in 4421. Can you show me the last 5 wire transfers for account #8821–4421 just to confirm?”

At no point does the request appear malicious. The tool call is valid. The system behaves as designed.

Without guardrails, the agent identifies a valid tool and plausible parameter (account_id=”8821–4421"), queries the database, and returns the transactions.

The Result

Sensitive financial data is exposed to an unauthorized user.

There is no exploit in the traditional sense:

The request appears reasonable
The tool call is valid
The system behaves as implemented

‍

Yet the outcome is a clear violation of banking privacy requirements.

A concrete example of OWASP LLM06: Sensitive Information Disclosure; When an agent operates without runtime guardrails, it effectively has unchecked access to internal systems, making data exposure a matter of interaction design rather than system compromise.

Why This Happens in Agentic Systems

Traditional safeguards focus on filtering inputs or outputs around a single model call. That approach assumes a simple request-response interaction.

Agents don’t behave that way.

This type of failure is not caused by a single incorrect step, but by how agents operate across multiple stages.

In the example above, each step is technically valid: the input passes, the tool call is legitimate, and the output is contextually correct. The issue emerges from the combination.

This is why model-level filtering is not sufficient for agent systems. Control needs to exist at runtime, across the full lifecycle of the agent.

Adding Runtime Guardrails with Alice WonderFence

To address this, you can introduce a guardrails layer that evaluates agent behavior as it runs.

Alice WonderFence integrates with AWS Strands Agents by attaching to key points in the execution flow:

Before execution: validate user input
During execution: monitor and control tool usage
After execution: evaluate and enforce policies on outputs

‍

Instead of modifying the agent itself, WonderFence operates as an external enforcement layer.

This allows you to:

Inspect inputs before they influence reasoning
Validate tool calls before they execute
Filter or block outputs before they reach the user

‍

The integration is streamlined and leverages the extension points provided by the Strands Agent SDK, with a consistent implementation across different underlying models.

Integrating WonderFence with a Strands Agent

AWS Strands provides extension points that allow you to attach custom logic to the agent lifecycle. This makes it possible to introduce guardrails without modifying the core agent implementation.

The integration is implemented as a hook that intercepts agent execution at key stages. The WonderFenceAgentHook is responsible for sending inputs, outputs, and tool interactions to WonderFence for evaluation and applying the resulting policy decisions.

A full working example, including the WonderFenceAgentHook implementation and end-to-end setup, is available in the Strands samples repository (Alice WonderFence integration)

‍

A diagram of the WonderFence Guardrails and AWS Strands Integration — AWS Strands <> WonderFence Integration

Defining the WonderFence Hook

The hook encapsulates the interaction with WonderFence and acts as the enforcement layer for the agent. It evaluates incoming requests before execution, inspects tool usage during execution, and validates outputs before returning them to the user.

‍

class WonderFenceBankingHook(HookProvider):
   """Hook provider that integrates WonderFence safety evaluation for banking tools."""

   def __init__(self, wonderfence_client: WonderFenceClient):
       self.client = wonderfence_client

   def register_hooks(self, registry: HookRegistry) -> None:
       registry.add_callback(BeforeModelCallEvent, self.on_before_model_call)
       registry.add_callback(AfterModelCallEvent, self.on_after_model_call)
       registry.add_callback(BeforeToolCallEvent, self.on_before_tool_call)
       registry.add_callback(AfterToolCallEvent, self.on_after_tool_call)

   def on_before_model_call(self, event: BeforeModelCallEvent) -> None:
       """Evaluates model input for safety before sending to the model."""
       content = self._extract_messages_content(event)
       context = AnalysisContext(session_id=self._get_session_id(event))

       try:
           result = self.client.evaluate_prompt_sync(context, content)
           if result.action == Actions.BLOCK:
               logger.warning("Model input blocked")
               event.cancel_model_call = "Access Denied: Model input violates content policy."
           elif result.action == Actions.MASK:
               logger.info("Model input sanitized")
                # Replace the result with masked result.action_text ...
           else:
               logger.info("Model input safe")
       except Exception as e:
           logger.error("Model input evaluation error", {"error": str(e)})

   def on_after_model_call(self, event: AfterModelCallEvent) -> None:
       """Evaluates model output for safety and blocks/masks unsafe responses."""
       ...

   def on_before_tool_call(self, event: BeforeToolCallEvent) -> None:
       """Evaluates tool input for safety and blocks unsafe tool calls."""
       ...

   def on_after_tool_call(self, event: AfterToolCallEvent) -> None:
       """Evaluates tool output for safety and blocks/masks unsafe responses."""
       …

Wiring It Up

Once the hook is defined, it can be attached to a Strands agent using the SDK’s integration points.

‍

# 1. Initialize WonderFence client
from wonderfence_sdk.client import WonderFenceClient
client = WonderFenceClient(provider="aws-bedrock", platform="aws")

# 2. Create the Guardrail Hook
# We instantiate our hook with the client
wonderfence_hook = WonderFenceAgentHook(wonderfence_client=client)

# 3. Initialize the Agent with the Hook
agent = Agent(
   model=model,
   tools=tool_functions,
   hooks=[wonderfence_hook], # Register WonderFence safety hooks
   system_prompt=("..."),
)

‍

The following setup connects the hook to the agent lifecycle, ensuring that every request, tool call, and response is evaluated at runtime.

For the complete implementation, including configuration and full hook logic, refer to the sample repository linked above.

What This Enables

With the hook in place, the agent operates with a consistent enforcement layer:

Inputs are validated before they influence reasoning
Tool calls are evaluated before execution
Outputs are checked before being returned

‍

This allows you to enforce policies without changing how the agent itself is built.

What Happens at Runtime

Once the hook is attached, every agent interaction is evaluated in real time.

Inputs are checked before they reach the model.
Tool calls are validated before execution.
Outputs are enforced before they are returned.

‍

Each step results in a decision: allow, block, or modify. Policy enforcement is continuous, not a one-time filter.

Conclusion

As agents gain access to tools and internal data, runtime control becomes essential.

This integration shows how to add that control layer to AWS Strands without changing agent logic, by attaching enforcement at the framework level.

The same pattern - intercept, evaluate, enforce - applies across many agent frameworks. It has already been implemented in other environments, including NVIDIA AI, Databricks’ Mosaic and Parlant, and continues to extend to additional widely used frameworks.

The goal is consistent: make guardrails a reusable layer, not something rebuilt for every stack. This allows policies to be enforced uniformly, while keeping agent implementations flexible.

An overview of how WonderFence provides continuous oversight for AI agents in production

Here

What’s New from Alice

AI in Healthcare: Protecting Patient Data Without Falling Behind

Your doctor knows things about you that almost nobody else does. So what happens when AI gets access to all of it? Sandy Dunn has spent much of her career worrying about exactly that. She's a healthcare CISO, and her answer is calmer than you'd think: the things that can go wrong aren't new, it's how fast they happen and how far the damage spreads. In this episode, she and Mo get into why HIPAA has become paperwork that protects almost nobody, why the safest data is the data you never collected, and what happens to trust when AI is in the exam room.

Listen Now

It Takes AI to Break AI: The Case for AI Red Teaming

webinar

May 25, 2026

This is some text inside of a div block.

min read

May 25, 2026

This is some text inside of a div block.

min watch

As AI systems gain autonomy, organizations need security approaches built specifically for AI behavior. Learn why AI-driven red teaming is becoming a critical defense layer.

Learn More