Inference

Session Inference

Monday Jun 1 / 01:20PM EDT

This presentation covers the areas that enterprises like JPMC consider most important when running inference at scale.

Dio Rettori

Head of Product for AI Infrastructure Platforms @JPMorganChase & Co, previously @Solo.io, @Red Hat, and @Pivotal Software

Session Performance

Monday Jun 1 / 11:30AM EDT

KV cache is the hidden lever behind inference cost and performance. It directly impacts GPU utilization, throughput, and Time to First Token.

Khawaja Shams

Co-Founder & CEO @Momento, previously @NASA and @Amazon