Overview
A modern data foundation unifies operational, analytical, and AI workloads on open formats so data is queryable from any engine without copies. AWS's strategy centers on Amazon S3 Tables (managed Apache Iceberg), Amazon SageMaker Lakehouse (unified access across S3, Redshift, and federated sources), AWS Glue for catalog and ETL, and Amazon DataZone for governance. With Iceberg as the open table format, you get ACID transactions, time travel, and schema evolution on low-cost S3 storage.
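To make the three table-format properties above concrete, here is a minimal in-memory sketch of snapshot-based commits. It is a toy model, not the Iceberg API: the names `ToyTable`, `Snapshot`, `append`, `add_column`, and `scan` are hypothetical and chosen only for illustration. It assumes every commit publishes a new immutable snapshot, which is the general idea behind atomic writes, time travel, and metadata-only schema changes.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Row = Dict[str, object]


@dataclass(frozen=True)
class Snapshot:
    """An immutable view of the table: its column list plus its rows."""
    snapshot_id: int
    columns: Tuple[str, ...]
    rows: Tuple[Row, ...]


class ToyTable:
    """Hypothetical toy table; every commit produces a new snapshot."""

    def __init__(self, columns: List[str]) -> None:
        # Snapshot 0 is the empty table.
        self._snapshots: List[Snapshot] = [Snapshot(0, tuple(columns), ())]

    @property
    def current(self) -> Snapshot:
        return self._snapshots[-1]

    def _commit(self, columns: Tuple[str, ...], rows: Tuple[Row, ...]) -> int:
        # The new snapshot becomes visible only once it is fully built.
        snap = Snapshot(self.current.snapshot_id + 1, columns, rows)
        self._snapshots.append(snap)
        return snap.snapshot_id

    def append(self, new_rows: List[Row]) -> int:
        cols = self.current.columns
        for row in new_rows:
            unknown = set(row) - set(cols)
            if unknown:
                # Reject the whole batch, so no partial write is ever visible.
                raise ValueError(f"unknown columns: {sorted(unknown)}")
        copied = tuple(dict(r) for r in new_rows)
        return self._commit(cols, self.current.rows + copied)

    def add_column(self, name: str) -> int:
        # Schema evolution as a metadata-only change: rows are not rewritten.
        if name in self.current.columns:
            raise ValueError(f"column already exists: {name}")
        return self._commit(self.current.columns + (name,), self.current.rows)

    def scan(self, snapshot_id: Optional[int] = None) -> List[Row]:
        # Time travel: read any earlier snapshot by its id.
        if snapshot_id is None:
            snap = self.current
        elif 0 <= snapshot_id < len(self._snapshots):
            snap = self._snapshots[snapshot_id]
        else:
            raise KeyError(f"no such snapshot: {snapshot_id}")
        # Columns added after a row was written read as None.
        return [{c: row.get(c) for c in snap.columns} for row in snap.rows]


table = ToyTable(["order_id", "amount"])
s1 = table.append([{"order_id": 1, "amount": 10.0}])
s2 = table.add_column("currency")
s3 = table.append([{"order_id": 2, "amount": 5.0, "currency": "AUD"}])

old_view = table.scan(s1)   # the table as it was after the first commit
new_view = table.scan()     # the current table, with the evolved schema
```

Reading `old_view` returns the single original row with its original two columns, while `new_view` returns both rows with `currency` filled as `None` for the older row. A real Iceberg table tracks snapshots through metadata and manifest files on object storage rather than Python lists.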
Key concepts
- Apache Iceberg — open table format with ACID semantics
- Catalog standardization (AWS Glue Data Catalog, AWS Lake Formation)
- Zero-ETL integrations between operational stores and analytics
- Data quality, lineage, and active metadata
- Fine-grained access control with Lake Formation
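
As an illustration of the fine-grained access control item above, the following sketch builds the request for a column-level `SELECT` grant in the shape accepted by the boto3 Lake Formation client's `grant_permissions` call. The helper name `column_grant_request`, the role ARN, and the database, table, and column names are all hypothetical; the code only constructs the request and does not call AWS.

```python
from typing import Any, Dict, List


def column_grant_request(
    principal_arn: str, database: str, table: str, columns: List[str]
) -> Dict[str, Any]:
    """Build keyword arguments for a column-level SELECT grant (hypothetical helper)."""
    if not columns:
        raise ValueError("at least one column is required for a column-level grant")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                # Only these columns become readable by the principal.
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


# Placeholder values, assumed for illustration only.
request = column_grant_request(
    "arn:aws:iam::123456789012:role/analyst",
    "sales",
    "orders",
    ["order_id", "amount"],
)

# With AWS credentials configured, the grant could then be applied with:
#   boto3.client("lakeformation").grant_permissions(**request)
```

Keeping the request construction separate from the AWS call makes the grant easy to review and unit test before it is applied.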
Key AWS services
- Amazon S3 Tables
- AWS Glue
- AWS Lake Formation
- Amazon DataZone
- SageMaker Lakehouse
Sessions on this topic
4 sessions from the Summit covered this topic. Each is a self-contained mini-lesson.
- ANT301 (Advanced)
A practitioner's guide to data for agentic AI
In this session, gain the skills needed to deploy end-to-end agentic AI applications using your most valuable data. This session focuses on data management using approaches like Model Context Protocol (MCP) and Retrieval Augmented Generation (RAG), and provides concepts that apply to other methods of customizing agentic AI applications. Discover best practice architectures using AWS database services like Amazon Aurora and OpenSearch Service, along with analytical, data processing, and streaming experiences found in SageMaker Unified Studio. Learn data lake, governance, and data quality concepts, and how Amazon Bedrock AgentCore, Bedrock Knowledge Bases, and other features tie solution components together.
- ISV303 (Advanced)
From hours to minutes: SafetyCulture's journey to 90% faster analytics
Discover how SafetyCulture, the global workplace operations platform used by 70,000+ organizations, achieved a 90% reduction in daily data pipeline execution time while processing the same data volumes on Amazon Redshift. Join the SafetyCulture data engineering team to learn how they transformed a complex, slow-running data warehouse into a high-performance, AI-ready analytics platform using modern lakehouse architecture principles.
- DAT401 (Expert)
Real-Time DataLakes with Apache Iceberg, Amazon MSK, and Amazon S3
Learn how to optimize Apache Iceberg data lakes on Amazon S3 for cost-effectiveness while enabling real-time analytics. This session explores S3 Tables deployments, focusing on streaming data from Apache Kafka via Amazon MSK into Iceberg format. Discover practical approaches for real-time table maintenance, metadata optimization for high-velocity writes, and data compaction strategies. Implement cost-effective retention policies using S3 Lifecycle configurations while maintaining sub-minute data freshness. See how MSK's native Iceberg integration eliminates pipeline overhead, reducing latency and operational costs. Gain actionable insights for balancing streaming performance with cost optimization at scale.
- DAT201 (Intermediate)
Scaling Data Analytics: Easygo's Modern Lakehouse Journey on AWS
Discover how Melbourne-based Easygo, powering Stake and Kick.com, transformed its data analytics infrastructure to process over 600,000 daily transactions and tens of millions of streaming events. Learn about its implementation of a modern lakehouse architecture combining Amazon Aurora Zero-ETL integration with Amazon Redshift, Amazon Kinesis with AWS Glue streaming, and Apache Iceberg on Amazon S3. Results include 95% faster queries, 80% fewer ingestion incidents, 9 hours of weekly maintenance savings, and accelerated global expansion. Explore practical strategies for building scalable, secure data foundations delivering near real-time analytics with robust governance across regulated markets.
Live updates related to this topic
Sourced via Parallel AI Monitor, a continuous web watch on 21 topical streams.
- risingwave.com high confidence Agent-native data infrastructure
RisingWave | Streaming Infrastructure for Agentic AI
Databricks launched an 'agentic data and AI stack' featuring an 'agentic data foundation,' Unity AI Gateway for multi-AI governance, and Genie Ontology as a business context graph to support AI agents.
- solutionsreview.com high confidence Agent-native data infrastructure
Analytics and Data Science News for the Week of June 19
Databricks launched an 'agentic data and AI stack' featuring an 'agentic data foundation,' Unity AI Gateway for multi-AI governance, and Genie Ontology as a business context graph to support AI agents.
- businesswire.com high confidence Agent-native data infrastructure
Unravel Data Unveils Arvix AI, the First Agentic AI Engine ...
Linux Foundation announced its intent to launch the 'Agent Name Service' to establish a trusted identity infrastructure for AI agents, providing a foundational layer for agent identification and trust in agentic ecosystems.
- huawei.com high confidence Agent-native data infrastructure
Huawei Cloud Announces Agentic AI Products, Shaping the ...
Cohesity announced Cohesity Maestro, a headless architecture for cyber resilience that utilizes the Model Context Protocol (MCP). It allows AI agents (e.g., Claude, ChatGPT, Gemini) to natively access and trigger Cohesity Data Cloud capabilities, including cyber resilience orches
- salesforce.com high confidence Agent-native data infrastructure
Salesforce and Databricks Build the Shared Foundation for Human and AI Agent Work - Salesforce
Salesforce and Databricks announced an expanded partnership on June 16, 2026, to create a shared foundation for AI agents. This includes expanded Zero Copy infrastructure with metadata-aware access controls, Federated Search for Agentforce agents to search Databricks, and the int
External links matched to this topic via topic relevance. The KB does not endorse third-party content; verify before citing.
Non-obvious insights
From the Playbook: one sharp, contrarian insight per session, the things teams don't think of unprompted.