Inference
Session
Inference
Inferencing for Enterprises
Monday Jun 1 / 01:20PM EDT
This presentation covers the areas that enterprises like JPMC consider most important when running inferencing at scale.
Dio Rettori
Head of Product for AI Infrastructure Platforms @JPMorganChase & Co, previously @Solo.io, @Red Hat, and @Pivotal Software
Session
Performance
Serving LLMs at Scale: The Hidden KV Cache Advantage
Monday Jun 1 / 11:30AM EDT
KV cache is the hidden lever behind inference cost and performance. It directly impacts GPU utilization, throughput, and Time to First Token.
Khawaja Shams
Co-Founder & CEO @Momento, previously @NASA and @Amazon