Data Lakes, Lakehouse & AI-Ready Data

Sessions on this topic

Four sessions from the Summit covered this topic. Each is a self-contained mini-lesson.

ANT301 (Advanced)
A practitioner's guide to data for agentic AI
In this session, gain the skills needed to deploy end-to-end agentic AI applications using your most valuable data. The session focuses on data management using approaches such as Model Context Protocol (MCP) and Retrieval Augmented Generation (RAG), and provides concepts that apply to other methods of customizing agentic AI applications. Discover best-practice architectures using AWS database services like Amazon Aurora and OpenSearch Service, along with the analytical, data processing, and streaming experiences found in SageMaker Unified Studio. Learn data lake, governance, and data quality concepts, and how Amazon Bedrock AgentCore, Bedrock Knowledge Bases, and other features tie solution components together.
ISV303 (Advanced)
From hours to minutes: SafetyCulture's journey to 90% faster analytics
Discover how SafetyCulture, the global workplace operations platform used by 70,000+ organizations, achieved a 90% reduction in daily data pipeline execution time while processing the same data volumes on Amazon Redshift. Join this session with the SafetyCulture data engineering team to learn how they transformed a complex, slow-running data warehouse into a high-performance, AI-ready analytics platform using modern lakehouse architecture principles.
DAT401 (Expert)
Real-Time DataLakes with Apache Iceberg, Amazon MSK, and Amazon S3
Learn how to optimize Apache Iceberg data lakes on Amazon S3 for cost-effectiveness while enabling real-time analytics. This session explores S3 Tables deployments, focusing on streaming data from Apache Kafka via Amazon MSK into Iceberg format. Discover practical approaches for real-time table maintenance, metadata optimization for high-velocity writes, and data compaction strategies. Implement cost-effective retention policies using S3 Lifecycle configurations while maintaining sub-minute data freshness. See how MSK's native Iceberg integration eliminates pipeline overhead, reducing latency and operational costs. Gain actionable insights for balancing streaming performance with cost optimization at scale.
DAT201 (Intermediate)
Scaling Data Analytics: Easygo's Modern Lakehouse Journey on AWS
Discover how Melbourne-based Easygo, powering Stake and Kick.com, transformed their data analytics infrastructure to process over 600,000 daily transactions and tens of millions of streaming events. Learn about their implementation of a modern lakehouse architecture combining Amazon Aurora Zero-ETL integration with Amazon Redshift, Amazon Kinesis with AWS Glue streaming, and Apache Iceberg on Amazon S3. Results include 95% faster queries, 80% fewer ingestion incidents, 9 hours weekly maintenance savings, and accelerated global expansion. Explore practical strategies for building scalable, secure data foundations delivering near real-time analytics with robust governance across regulated markets.

Non-obvious insights

One sharp, contrarian insight per session — the things teams don't think of unprompted.

RAG retrieval quality is dominated by chunking strategy, not by the embedding model. Boring but true. Spend a week on chunk size, overlap, and semantic boundaries before you spend a dollar on a fancier embedder.
ANT301 - A practitioner's guide to data for agentic AI
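To make the three knobs named above concrete (chunk size, overlap, semantic boundaries), here is a minimal sketch rather than anything shown in the session. It assumes sentences are an acceptable semantic boundary and uses a naive regex splitter. The function names and the `max_chars` and `overlap_sentences` parameters are hypothetical, chosen for illustration.

```python
import re


def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def chunk_sentences(text, max_chars=200, overlap_sentences=1):
    """Pack whole sentences into chunks that aim to stay within max_chars.

    The last `overlap_sentences` sentences of a chunk are repeated at the
    start of the next one, unless that would overflow max_chars. A single
    sentence longer than max_chars is kept whole rather than cut.
    """
    chunks, current = [], []
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Carry the tail of the finished chunk forward as overlap.
            current = current[-overlap_sentences:] if overlap_sentences else []
            if len(" ".join(current + [sentence])) > max_chars:
                current = []  # overlap would overflow, so start clean
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = "A one. B two. C three. D four."
for chunk in chunk_sentences(sample, max_chars=16, overlap_sentences=1):
    print(repr(chunk))
```

Re-running the same retrieval evaluation while varying only `max_chars` and `overlap_sentences` is one inexpensive way to test the claim on your own corpus before changing the embedding model.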

Most 90% pipeline speedups come from *removing redundant computation* (the same aggregation computed three times across three pipelines), not from faster compute. Profile before re-platforming. You may well find that half the pipelines are doing the same work.
ISV303 - From hours to minutes: SafetyCulture's journey to 90…
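One way to look for that kind of duplication is to fingerprint each aggregation step and group pipelines by fingerprint. The sketch below is illustrative only and is not SafetyCulture's method. The step schema (`source`, `group_by`, `aggregate`), the pipeline names, and the normalization rules are all assumptions. A real audit would more likely parse SQL or query plans.

```python
from collections import defaultdict


def fingerprint(step):
    """Normalize a step so equivalent aggregations compare equal.

    Assumed normalization: case-insensitive names, order-insensitive
    group-by columns, whitespace-insensitive aggregate expression.
    """
    return (
        step["source"].lower(),
        tuple(sorted(col.lower() for col in step["group_by"])),
        step["aggregate"].lower().replace(" ", ""),
    )


def find_redundant_steps(pipelines):
    """Map each fingerprint seen in more than one pipeline to those pipelines."""
    seen = defaultdict(set)
    for name, steps in pipelines.items():
        for step in steps:
            seen[fingerprint(step)].add(name)
    return {fp: sorted(names) for fp, names in seen.items() if len(names) > 1}


# Hypothetical pipeline definitions for illustration.
pipelines = {
    "daily_report": [
        {"source": "orders", "group_by": ["region", "day"], "aggregate": "SUM(amount)"},
    ],
    "finance_export": [
        {"source": "Orders", "group_by": ["day", "region"], "aggregate": "sum( amount )"},
    ],
    "ml_features": [
        {"source": "orders", "group_by": ["customer_id"], "aggregate": "COUNT(*)"},
    ],
}

for fp, names in find_redundant_steps(pipelines).items():
    print(fp, "computed in", names)
```

Any fingerprint that shows up in several pipelines is a candidate for computing once and sharing, which is worth checking before paying for faster compute.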

Related topics

Analytics, Redshift & Generative BI

Streaming & Real-Time Data

Databases & Aurora

Agentic AI