Most PII leaks happen before your prompt ever hits the model — here's how to catch them at the pipeline level with deterministic ETLs and synthetic stand-ins.
The AI gold rush has created a massive blind spot: we are opening up our databases without fixing our core access problems. Whether it's an over-permissioned LLM vacuuming up customer emails or a developer casually dumping a production table into a test environment, exposing sensitive data has never been easier, or more permanent.
In this session, we'll break down the anatomy of a modern data leak. We'll look at how sensitive information actually slips into AI context windows and human hands, and how to mitigate the risk before it happens. We'll walk through practical architectural patterns, from automated masking and deterministic ETLs to synthetic data generation, so you can safely build AI features without exposing your users' data to models or unauthorized insiders.
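To give a flavor of one of the patterns above: deterministic masking replaces each sensitive value with a stable pseudonym, so joins and foreign keys still line up after the ETL while the raw value never leaves production. The sketch below is illustrative only (the key, field names, and helper functions are assumptions, not the session's actual code); it uses an HMAC so the mapping is deterministic per environment but not reversible without the key.

```python
import hmac
import hashlib

# Assumed per-environment secret; rotating it re-keys every pseudonym.
MASKING_KEY = b"example-secret-key"

def mask(value: str, field: str) -> str:
    """Deterministically pseudonymize a PII value.

    The same (field, value) pair always maps to the same token, so
    referential integrity survives the ETL; the raw value does not.
    """
    digest = hmac.new(MASKING_KEY, f"{field}:{value}".encode(), hashlib.sha256)
    return digest.hexdigest()[:16]

def mask_record(record: dict, pii_fields: set) -> dict:
    """Copy a record, masking only the fields flagged as PII."""
    return {
        k: mask(v, k) if k in pii_fields else v
        for k, v in record.items()
    }

user = {"id": 42, "email": "alice@example.com", "plan": "pro"}
masked = mask_record(user, pii_fields={"email"})
# masked["email"] is a stable 16-hex-char token; masked["plan"] is untouched.
```

Because the mapping is deterministic, the same customer email masks to the same token in every table it appears in, which is what keeps test environments and AI context windows usable without carrying real PII.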
Speaker
Daniyar Mussakulov
Engineering lead @3T Software Labs
Daniyar Mussakulov is an engineering leader with more than ten years of experience building software platforms and developing engineering teams. He focuses on improving how teams work, modernizing systems, and finding practical uses of AI to make engineers more effective. Known for leading with both technical depth and genuine empathy, Daniyar creates environments where engineers thrive, and builds the kind of foundations that scale.
Session Sponsored By
The governed data access platform for MongoDB — built for humans and AI agents alike.