Building an AI agent is easier than ever. However, moving from a local notebook to a production-grade "Agent Engine" that serves large-scale web services poses complex problems. Developers often struggle with unpredictable execution times, framework lock-in, and the sheer overhead of managing long-running agentic loops.
In this session, we dive into how to leverage a unified deployment layer that remains agnostic to your choice of agent framework. Using Ray as our reference implementation, we will demonstrate how to bridge the gap between local development and production-grade reliability.
We will explore:
- Decoupling Orchestration from Execution: How to manage complex agent state across distributed nodes without sacrificing scalability.
- Operational Excellence: Utilizing native autoscaling and intelligent traffic management to handle the unpredictable, bursty nature of agentic workloads.
- The "Agent Engine" Pattern: A blueprint for building resilient, high-throughput agent deployments designed to evolve as the AI landscape shifts.
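The decoupling idea in the first bullet can be sketched in a few lines of plain Python. This is an illustrative, framework-agnostic sketch, not Ray's actual API: the `Orchestrator` class, `run_tool` function, and thread pool stand in for what would be Ray actors and remote tasks in the reference implementation.

```python
# Sketch: decoupling orchestration (stateful agent loop) from
# execution (stateless workers). Illustrative names only; in Ray,
# the workers would be remote tasks or actors on distributed nodes.
from concurrent.futures import ThreadPoolExecutor


def run_tool(step: str) -> str:
    """Stateless executor: safe to scale out, retry, or relocate."""
    return f"result-of-{step}"


class Orchestrator:
    """Owns the agent loop and its state; never executes tools itself."""

    def __init__(self, pool: ThreadPoolExecutor):
        self.pool = pool
        self.history: list[str] = []  # agent state lives in one place

    def run(self, steps: list[str]) -> list[str]:
        for step in steps:
            # Dispatch execution elsewhere; the orchestrator only
            # sequences steps and records outcomes.
            future = self.pool.submit(run_tool, step)
            self.history.append(future.result())
        return self.history


with ThreadPoolExecutor(max_workers=4) as pool:
    results = Orchestrator(pool).run(["plan", "search", "answer"])
```

Because the executors hold no agent state, they can be scaled or restarted independently of the orchestration loop, which is what makes the pattern resilient under the bursty workloads described above.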
Speaker
Deepak Chandramouli
Senior Machine Learning Engineer @Apple, 20+ Years in Distributed Systems and Scalable Data/Compute/ML Infrastructure
Deepak Chandramouli is a Senior Machine Learning Engineer at Apple with over 20 years of experience in distributed systems and scalable data/compute/ML infrastructure. At Apple, he specializes in building robust ML compute planes that bridge the gap between research and high-scale production environments.
Speaker
Bhumik Thakkar
Senior Software Engineer @Apple, Expert in Artificial Intelligence and Large-Scale Distributed Systems
Bhumik Thakkar is a senior engineering leader specializing in artificial intelligence and large-scale distributed systems, with extensive experience building enterprise-grade AI infrastructure. He has led high-impact initiatives at global technology companies including Microsoft, Meta, and Apple, delivering scalable systems that serve billions of users worldwide. His work focuses on large language models, optimized model inference platforms, and resilient distributed architectures that power mission-critical applications at global scale.