AI Software Engineer
About the Position
Alice’s Innovation team builds adversarial RL environments that train the world’s most advanced AI models to be safer. Our customers are the leading frontier AI labs, who use these environments for post-training reinforcement learning and safety evaluation. This is the bleeding edge of AI safety technology: the environments you build will directly shape how next-generation models learn to resist adversarial attacks.
We’re looking for a Principal Software Engineer to own the RL Gym platform end-to-end: from architecting multi-site web environments that simulate real-world attack surfaces, to optimizing our in-house orchestration harness (AgenticVerse) for high-performance delivery into customer training pipelines.
This is a builder role. You’ll lead a small team (including a dedicated web environments engineer), operating with high autonomy, moving fast from concept to working prototype to production system. You’ll interact directly with customer engineering teams to understand their infrastructure constraints and deliver environments that meet their scale and reliability requirements.
Why this role
This is one of the few roles in the industry where your code directly influences how the next generation of AI models is trained. You’ll be at the center of advancing AI safety, building systems that the world’s top labs depend on to make their models more robust. The work is technically deep, the problem space is genuinely novel, and the field is moving faster than any team can keep up with alone. There’s no playbook. You’ll write it.
What you’ll do:
Platform & performance
- Own and evolve AgenticVerse, our in-house orchestration harness that provisions and manages RL environments at scale. Focus on performance: low-latency provisioning, high concurrency, minimal overhead per environment instance
- Design and build isolated, reproducible web environments using Firecracker microVMs or Docker containers
- Architect multi-site scenarios (3-4 interconnected web applications per task) with rich interactions: drag-and-drop, file uploads, authentication flows, LLM-in-the-loop components
- Implement deterministic verifiers that evaluate agent behavior with zero ambiguity
Customer delivery
- Work directly with engineering teams at leading AI labs to integrate RL Gym environments into their training and evaluation pipelines
- Translate customer specs into working environments, iterating rapidly on feedback
- Own the technical relationship: SLAs, API contracts, integration architecture
- Adapt environment delivery formats to customer infrastructure (real-time API calls vs. offline batch, managed vs. raw artifacts)
- Build customer-facing UIs when needed (dashboards, environment configuration portals, monitoring interfaces)
Rapid prototyping
- Take ambiguous problem descriptions and produce working prototypes within days, not weeks
- Validate new environment types, interaction patterns, and verifier approaches quickly
- Build internal tooling that accelerates scenario authoring and testing
Requirements
Must have
- 8+ years of software engineering experience, with a track record of building production systems from zero
- Deep expertise in infrastructure: Linux, containers (Docker), VMs (Firecracker or similar), networking, cloud platforms (AWS strongly preferred)
- Strong Python skills and comfort with async/concurrent systems
- Experience building platforms or developer tools (not just consuming them)
- Full-stack capability: backend services, infrastructure-as-code, APIs, and frontend development (React or similar) for customer-facing interfaces
- Demonstrated ability to work autonomously with minimal specification, making sound architectural decisions under ambiguity
- Comfort working directly with external customers and translating technical constraints into engineering solutions
- English fluency (written and verbal) for customer-facing communication
Nice to have
- Experience with reinforcement learning infrastructure, training pipelines, or evaluation frameworks
- Background in security, adversarial testing, or trust & safety systems
- Familiarity with browser automation, headless browsers, or web scraping at scale
- Experience with Kubernetes operators or custom schedulers
- Prior work in a 0-to-1 environment (startup, innovation lab, or R&D team building new products)
THE CHALLENGES ALONG THE WAY
1. Being Both Strategist and Executor
One of the hardest parts of this role is that you’re both the visionary and the builder; the one drawing the map and paving the road.
That means switching between high-level strategy and hands-on experimentation daily, and doing it while bringing others along with you. There’s no playbook for this kind of work. You’re paving an unpaved road, one small experiment at a time.
2. Balancing Security and Innovation
About Alice
ActiveFence is the leading provider of security and safety solutions for online experiences, safeguarding more than 3 billion users, top foundation models, and the world’s largest enterprises and tech platforms every day.
As a trusted ally to major technology firms and Fortune 500 brands that build user-generated and GenAI products, ActiveFence empowers security, AI, and policy teams with low-latency Real-Time Guardrails and a continuous Red Teaming program that pressure-tests systems with adversarial prompts and emerging threat techniques. Powered by deep threat intelligence, unmatched harmful-content detection, and coverage of 117+ languages, ActiveFence enables organizations to deliver engaging and trustworthy experiences at global scale while operating safely and responsibly across all threat landscapes.
