Beyond Prompting: Context Engineering for Production-Grade AI

If you're building production-grade AI applications, you may know an uncomfortable truth: reliable LLM outputs require far more than clever prompting. You must orchestrate tool calling, implement memory and retrieval pipelines, and dynamically manage context tokens while maintaining consistency across thousands of interactions.

This is where context engineering comes in. Context engineering is an emerging discipline that treats the LLM's context window as an architectural resource to be designed, optimized, and managed. In this session, we'll explore how developers can implement sophisticated context-engineering patterns that improve LLM output reliability. You'll learn what it takes to design an AI system with memory management capabilities for both short- and long-term memory, advanced retrieval-augmented generation with query compression and reranking, and efficient token management.
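To make one of these patterns concrete: a minimal sketch of token-budget management, trimming conversation history so the most recent turns fit alongside the system prompt. The function names and the rough 4-characters-per-token estimate are illustrative assumptions, not part of any specific library; production systems would use a real tokenizer.

```python
# Hypothetical sketch of token-budget management for an LLM context window.
# The ~4-chars-per-token heuristic is a simplifying assumption; real systems
# should count tokens with the model's actual tokenizer.

def estimate_tokens(text: str) -> int:
    # Rough heuristic: about 4 characters per token for English text.
    return max(1, len(text) // 4)

def fit_to_budget(system_prompt: str, history: list[str], budget: int) -> list[str]:
    """Keep the system prompt plus the most recent turns that fit the budget."""
    remaining = budget - estimate_tokens(system_prompt)
    kept: list[str] = []
    for turn in reversed(history):  # walk newest-first; recent turns matter most
        cost = estimate_tokens(turn)
        if cost > remaining:
            break  # oldest turns are dropped once the budget is exhausted
        kept.append(turn)
        remaining -= cost
    return [system_prompt] + list(reversed(kept))

context = fit_to_budget(
    "You are a helpful assistant.",
    ["old question", "old answer", "newest question about Redis"],
    budget=15,
)
```

Here the oldest turn is dropped because it no longer fits the budget, while the system prompt and the newest turns survive; swapping in summarization instead of truncation is one of the design trade-offs the session examines.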

Using a real-world example, we'll evolve a naïve LLM implementation into a fully engineered context pipeline and highlight the measurable impact on accuracy, consistency, and cost. By the end, you'll understand the architectural trade-offs behind context engineering—and why skipping it results in unreliable, inefficient AI systems.


Speaker

Ricardo Ferreira

Lead, Developer Relations @Redis, Expert in Distributed Systems, Databases, and Software Development, Previously @AWS, @Elastic, and @Confluent

Ricardo leads the developer relations team at Redis. He built a successful career in DevRel, working for companies such as AWS, Elastic, and Confluent.

Before moving into DevRel, Ricardo spent more than 20 years building deep expertise in distributed systems, databases, and software development. His career began with a decade-long focus on software engineering; he then switched gears to solution architecture, specializing in distributed systems, databases, and big data technologies.
