Distilling LLMs into Efficient Transformers for Real-World AI
This technical webinar explores how we distilled the world knowledge of a large language model into a compact, high-performing transformer—balancing safety, latency, and scale. Learn how we combine LLM-based annotations and weight distillation to power real-world AI safety.
Watch On-Demand

Overview
Building GenAI systems is easy. But making them safe, scalable, and accurate in the real world? That’s where things get tricky.
In this technical webinar, Shiri Simon Segal, our Sr. Data Scientist, shares how her team uses dual knowledge distillation to turn large foundation models into efficient transformers that stay aligned, compliant, and abuse-aware under real-world pressure.
You’ll learn:
- Why accuracy is essential for GenAI safety
- How label-based and feature-based distillation work together
- How we use LLMs for high-quality automated annotation
- How we applied this technique to one of the most challenging abuse areas, illustrated by a real-world case study
- Practical tips for building safer, scalable models
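To make the second bullet concrete, here is a minimal sketch of how label-based and feature-based distillation can be combined into a single training loss. The function name, the `alpha`/`T` weighting scheme, and the use of NumPy are illustrative assumptions for clarity, not the team's actual implementation.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def dual_distillation_loss(student_logits, teacher_logits,
                           student_feats, teacher_feats,
                           T=2.0, alpha=0.5):
    """Combine label-based and feature-based distillation terms.

    Label-based: KL divergence between the teacher's and student's
    temperature-softened output distributions.
    Feature-based: mean squared error between intermediate representations.
    """
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1).mean()
    mse = np.mean((np.asarray(student_feats) - np.asarray(teacher_feats)) ** 2)
    # T**2 rescaling keeps soft-label gradients comparable across temperatures.
    return alpha * (T ** 2) * kl + (1 - alpha) * mse
```

When the student matches the teacher exactly, both terms vanish and the loss is zero; any mismatch in either the output distribution or the internal features raises it, which is what lets the two signals reinforce each other during training.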
Watch now and see what it takes to turn raw LLM power into safe, production-ready AI.
Meet our speakers

What’s New from Alice
"Okay, Here is How to Build a Bomb": Millions Download Dangerous LLMs
Thousands of abliterated LLMs have flooded open-source platforms with millions of downloads. These models comply with virtually any request, from bomb-making to malware, and run fully offline on consumer devices.
Your LLM Has No Idea What It's Doing
Diana Kelley, CISO at Noma Security and former Cybersecurity CTO at Microsoft, joins Mo to work through the real mechanics of LLM risk: why the context window flattens the trust boundary between system instructions and user data, why that makes reliable internal guardrails essentially impossible, and why agentic AI is less a new threat category and more a stress test for the hygiene debt organizations never fully paid off.
