Overview
Amazon SageMaker AI is the managed service for the full machine-learning lifecycle: data labeling, notebooks, training jobs (including distributed training on thousands of GPUs/Trainium chips), hyperparameter tuning, model registry, deployment, and monitoring. SageMaker Unified Studio brings together SageMaker, Bedrock, Glue, EMR, Redshift, and QuickSight in one workspace so data engineers, data scientists, and analysts collaborate on the same data.
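The training side of that lifecycle is driven by a small set of API calls. As a rough illustration, here is a minimal sketch of assembling a request for the SageMaker `create_training_job` API via boto3; the image URI, role ARN, bucket, instance settings, and spot timeouts below are placeholder assumptions, not values from this page.

```python
def build_training_job_request(job_name, image_uri, role_arn, s3_output,
                               instance_type="ml.m5.xlarge", use_spot=False):
    """Build the request dict for sagemaker.create_training_job (placeholder values)."""
    request = {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "OutputDataConfig": {"S3OutputPath": s3_output},
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }
    if use_spot:
        # Managed Spot Training trades interruptibility for lower cost;
        # MaxWaitTimeInSeconds bounds how long to wait for spot capacity.
        request["EnableManagedSpotTraining"] = True
        request["StoppingCondition"]["MaxWaitTimeInSeconds"] = 7200
    return request


def submit(request):
    # Requires AWS credentials; not exercised in this sketch.
    import boto3
    return boto3.client("sagemaker").create_training_job(**request)


req = build_training_job_request(
    "demo-job",
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-image:latest",
    "arn:aws:iam::123456789012:role/SageMakerRole",
    "s3://my-bucket/output/",
    use_spot=True,
)
print(req["EnableManagedSpotTraining"])
```

In practice the same request shape is usually produced through the higher-level SageMaker Python SDK rather than assembled by hand.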
Key concepts
- Training: SageMaker Training Jobs, distributed training, spot instances
- Fine-tuning and distillation for cost-effective specialization
- Inference: real-time, serverless, asynchronous, batch transform
- Model Registry, Pipelines, and MLOps automation
- Feature Store for reusable features across teams
- AWS Trainium and Inferentia for cost-optimized ML
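The four inference options listed above are usually chosen by traffic pattern, latency need, and payload size. The hypothetical helper below makes that decision concrete; the traits and the 6 MB payload cutoff are illustrative rules of thumb, not official service limits.

```python
def pick_inference_option(needs_sync_response, traffic_is_spiky_or_rare,
                          payload_mb, dataset_scoring):
    """Map workload traits to one of the four SageMaker inference options.
    Thresholds are illustrative only."""
    if dataset_scoring:
        return "batch transform"   # offline scoring of a whole dataset
    if payload_mb > 6 or not needs_sync_response:
        return "asynchronous"      # large payloads or queued processing
    if traffic_is_spiky_or_rare:
        return "serverless"        # scale-to-zero, pay per request
    return "real-time"             # steady traffic, low latency


print(pick_inference_option(True, False, 1, False))   # real-time
```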
Key AWS services
- Amazon SageMaker AI
- SageMaker Unified Studio
- AWS Trainium
- AWS Inferentia
- SageMaker JumpStart
Learn more — curated resources
Hand-picked official docs, foundational papers, and strong community guides for going deeper on this topic.
Sessions on this topic
17 sessions from the Summit covered this topic. Each is a self-contained mini-lesson.
- AIM401 (Expert)
Beyond API Dependency: Fine-tuning Cost-Effective Models on AWS
As API costs for general-purpose LLMs rise, relying solely on off-the-shelf models can quickly undermine both cost control and system reliability. In this session, we share how Nearmap moved beyond API dependency by fine-tuning and distilling domain-specific models on AWS to analyze 300 million building permits for roof modifications. We'll discuss our approach to generating and structuring training data, distilling large models into smaller, production-ready alternatives, evaluating trade-offs across model architectures, and making data-driven accuracy-versus-cost decisions before deployment. Attendees will leave with concrete patterns for shipping efficient, specialized models into production.
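The session does not spell out its distillation recipe, but the classic soft-target distillation loss such approaches typically build on fits in a few lines. This is a minimal sketch, assuming the standard Hinton-style formulation; the temperature and logits are arbitrary examples, not Nearmap's actual configuration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                          # subtract max for stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    scaled by T^2 as in the standard soft-target distillation term."""
    p = softmax(teacher_logits, temperature)   # teacher soft targets
    q = softmax(student_logits, temperature)   # student predictions
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return kl * temperature ** 2


# Identical logits give zero loss; diverging logits give a positive loss.
print(distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0]))
```

In real training this term is combined with the ordinary cross-entropy on hard labels and minimized with respect to the student's parameters.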
- ANT301 (Advanced)
A practitioner's guide to data for agentic AI
In this session, gain the skills needed to deploy end-to-end agentic AI applications using your most valuable data. This session focuses on data management using approaches like the Model Context Protocol (MCP) and Retrieval-Augmented Generation (RAG), and provides concepts that apply to other methods of customizing agentic AI applications. Discover best-practice architectures using AWS database services like Amazon Aurora and OpenSearch Service, along with the analytics, data processing, and streaming experiences found in SageMaker Unified Studio. Learn data lake, governance, and data quality concepts, and how Amazon Bedrock AgentCore, Bedrock Knowledge Bases, and other features tie solution components together.
- MAM307 (Advanced)
Modernise legacy code using fine-tuned Gen AI models
Rio Tinto's data science team saw an opportunity to preserve institutional knowledge and improve developer productivity by modernizing a legacy codebase. Rather than attempting a full system overhaul, the team focused first on adding generative AI capabilities to their critical legacy application. By using the proven, open, and trusted data foundation of AWS, the company laid the groundwork for incremental modernization without disrupting core operations. Learn about model fine-tuning against legacy codebases, Amazon Nova, SageMaker JumpStart, and AgentCore in this deep dive with AWS and Rio Tinto.
- COP302 (Advanced)
Applying AI for FinOps and FinOps for AI
Explore the intersection of AI and FinOps in this advanced session. First, discover how Kiro CLI can simplify AWS cost management by analyzing trends, explaining spend, and recommending optimizations like rightsizing and Savings Plans. Then, dive into FinOps for AI: learn how to track and control generative AI costs across Amazon EC2, Amazon SageMaker, Amazon Bedrock, and more. We'll share architecture patterns, cost-saving strategies, and real-world examples to help you build scalable, production-ready AI solutions while staying on budget. Whether you're optimizing existing workloads or launching new AI initiatives, you'll leave with practical tools to maximize value.
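On the cost-tracking side, spend across EC2, SageMaker, and Bedrock can be broken down per service with the Cost Explorer `get_cost_and_usage` API. A minimal sketch, assuming placeholder dates; only the request construction is executed here, since the call itself needs boto3 and AWS credentials.

```python
def build_cost_query(start, end, granularity="MONTHLY"):
    """Build kwargs for ce.get_cost_and_usage, grouping spend by service."""
    return {
        "TimePeriod": {"Start": start, "End": end},
        "Granularity": granularity,
        "Metrics": ["UnblendedCost"],
        "GroupBy": [{"Type": "DIMENSION", "Key": "SERVICE"}],
    }


query = build_cost_query("2025-01-01", "2025-04-01")
# With credentials configured, the call would be:
# import boto3
# response = boto3.client("ce").get_cost_and_usage(**query)
print(query["GroupBy"][0]["Key"])
```

Grouping by the SERVICE dimension is what lets a FinOps dashboard compare SageMaker, Bedrock, and EC2 line items side by side.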
- DAT402 (Expert)
Deep dive into database integrations with AWS Zero-ETL
Learn how AWS zero-ETL integrations eliminate complex data movement pipelines across multiple database engines, enabling data engineers, architects, and DBAs to reduce maintenance overhead while ensuring near real-time data availability for analytics and ML workloads. Examine the underlying architecture for supported zero-ETL integrations between Amazon Aurora, Amazon DynamoDB, and Amazon RDS sources and Amazon Redshift, Amazon SageMaker, and Amazon OpenSearch Service targets. Explore data movement options, tunable settings, and monitoring capabilities for ongoing data replication, all without traditional ETL complexity.
- DEV201 (Intermediate)
How Flybuys Built AI Governance to Accelerate Adoption at Scale
Scaling AI successfully isn't just about moving fast — it's about building the right foundations first. In this session, learn how Flybuys focused early on AI governance, steering documents, and engineering standards to enable smooth, secure AI adoption at scale. We'll explore how upfront investment in guardrails, training, and approval processes allowed teams to deploy AI capabilities faster and with confidence. You'll hear how Flybuys is embedding governance and security expectations into engineering workflows using Kiro, including standardised steering patterns, approval pathways, and controlled rollout of AI capabilities such as Powers. Attendees will gain practical insights into how slowing down early can unlock faster, safer AI delivery across the organisation.
- DAT303 (Advanced)
Explore what's new in data and AI governance with SageMaker Catalog
Join this session to learn about the latest capabilities in Amazon SageMaker Catalog that help organizations govern data and AI more effectively. We will walk through new features that make it easier to discover, govern, and securely share structured and unstructured data, models, business intelligence dashboards, and applications. You'll hear how customers are using these capabilities to improve data discovery and access, streamline compliance, and support AI initiatives.
- WPS203 (Intermediate)
Optimising Outpatient Waitlists with ML at Gold Coast Health
Deploying ML in high-stakes environments demands enterprise readiness, governance, and continuous monitoring. In this session, you'll learn how Gold Coast Health moved from pilot to production with a predictive model identifying patients unlikely to attend procedures — achieving 33% precision, more than double the 15% manual baseline — while ensuring fairness across cohorts. The session covers real-world ML architecture on Amazon SageMaker Pipelines, production monitoring including data quality, pipeline health, and drift detection, plus navigating AI governance through bias analysis and impact assessment. Whether you're in healthcare, financial services, or any regulated industry, walk away with actionable patterns for deploying responsible ML at scale.
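One common statistic behind drift detection of the kind the session mentions is the Population Stability Index (PSI), which compares a feature's binned distribution at training time with what the model sees in production. A minimal sketch, assuming illustrative bin proportions and the conventional 0.2 alert threshold rather than Gold Coast Health's actual setup.

```python
import math


def psi(expected_props, actual_props, eps=1e-6):
    """Population Stability Index between two binned distributions.
    Both inputs are lists of bin proportions summing to 1; eps guards
    against log-of-zero for empty bins."""
    return sum((a - e) * math.log((a + eps) / (e + eps))
               for e, a in zip(expected_props, actual_props))


baseline = [0.25, 0.25, 0.25, 0.25]   # bin proportions at training time
current = [0.40, 0.30, 0.20, 0.10]    # proportions observed in production
score = psi(baseline, current)
print(score > 0.2)   # PSI above ~0.2 is often treated as significant drift
```

A production monitor would compute this per feature on a schedule and raise an alert (or trigger retraining) when the score crosses the threshold.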
- FSI207 (Intermediate)
From enterprise data mesh to AI with Amazon SageMaker Unified Studio
Financial institutions are unlocking enormous value with AI agents — from personalised customer experiences to better risk decision making. But to deliver on that promise, agents need data they can find, understand, and trust. This session shows how a data mesh architecture on Amazon SageMaker Unified Studio builds that foundation: discoverable data across lines of business, business context that grounds agent responses in real meaning, quality signals that build confidence in every answer, and governed access that keeps you compliant by design. We cover domain ownership, multi-account strategies, data contracts, business glossaries, data quality, and cross-domain governance — and demonstrate how this foundation empowers agentic AI that delivers trusted, accurate results at enterprise scale.
- STP213 (Intermediate)
AI-Powered Farming: How Halter's ML Models Transform Dairy Operations
New Zealand unicorn agritech startup Halter is revolutionizing dairy farming with AI-powered smart collars that predict critical livestock events. Their machine learning models enable heat detection, calving prediction, pasture optimization, and animal behavior classification, processing data from thousands of GPS-enabled collars across remote farms. By leveraging AWS infrastructure, Halter's engineering team built scalable ML pipelines that help farmers make data-driven decisions, reduce labor costs, and improve animal welfare. Learn how Halter developed production ML models for agriculture, overcame the challenges of training on livestock data, and progressed on their journey toward managed ML services.
- STP204 (Intermediate)
How Heidi Health Fine-Tunes Speech-to-Text Models on AWS
Join Heidi Health and AWS's Generative AI Innovation Center (GenAIIC) for a behind-the-scenes look at building and deploying custom speech-to-text AI for healthcare. Learn hard-won lessons and a practical blueprint: curating domain-specific training data, fine-tuning open-weight models, validating non-deterministic outputs at scale, and shipping to production with optimized inference. Both teams share how AWS services reduced infrastructure complexity, accelerated iteration cycles, and scaled custom models across diverse real-world use cases — all while maintaining security and cost efficiency. Ideal for ML engineers, data scientists, and technical leaders exploring fine-tuning and production ML on AWS.
- ISV102 (Foundational)
From documents to voice - building AI products on AWS
How Affinda leverages Amazon Bedrock (Claude), SageMaker, EKS, and CloudFormation to deliver intelligent document processing at enterprise scale, cutting setup time and costs by 90% with 95%+ accuracy. This session will demonstrate how Affinda powers real-world AI product development, from Affinda's Intelligent Document Processing platform to the custom AI agents of Pathfindr (acquired by Affinda). The session will showcase the complete journey of building Honey Insurance's voice agent, Australia's first voice agent in financial services, and how the Affinda-AWS partnership enables rapid AI product development for enterprises.
- STP212 (Intermediate)
How Apate AI uses Amazon Bedrock and voice AI to catch scammers
Scams are a global epidemic costing businesses and consumers trillions. Apate AI turns the tables on fraudsters by deploying lifelike conversational AI agents, powered by Amazon Bedrock and speech models on Amazon SageMaker bidirectional streaming, that engage scammers in real time to detect, divert, disrupt, and decode their tactics. In this session, learn how Apate AI converts every scam interaction into actionable intelligence and how to build your own voice AI agents on AWS.
- STP216 (Intermediate)
Building AI Agents: From Open-Source Frameworks to Production-Grade
AI agents are moving from demo to deployment. Startups across ANZ are building production-grade assistants using open-source orchestration frameworks, fine-tuned foundation models, and GPU-accelerated inference on AWS and NVIDIA infrastructure. This panel explores what it actually takes to ship agentic use cases: from choosing the right models and frameworks to managing latency, cost, and reliability at scale. We'll hear from AirTree VC on where the investment thesis is heading, from NVIDIA on how accelerated compute is shaping the agent stack, and from Heidi Health on building and scaling these systems in production. Whether it's vertical agents for healthcare, customer support, or code generation, we'll focus on what's working, what's hype, and where the real startup opportunities lie in the agent ecosystem.
- IND101 (Foundational)
Test, Learn, Iterate: Amazon Connect Success
Discover how Flybuys achieved rapid contact centre transformation through early Amazon Connect adoption, using AI-powered capabilities and a disciplined Test, Learn, Iterate approach. Starting with a focused pilot, they deployed AI-driven features like intelligent routing, real-time sentiment analysis, and automated quality assurance. They progressed through Launch, Activate, and Consume phases, capturing baseline metrics, scaling through peer-led training, and continuously refining AI performance based on weekly feedback loops. The results: reduced AHT, improved CSAT, 100% AI-powered QA coverage, and measurable ROI. This demonstrates that early AI adoption delivers calculated, data-driven transformation.
- FSI202 (Intermediate)
Accelerating Payment Innovation: Spec-Driven Development with AWS Kiro
Australian Payments Plus (AP+), operator of Australia's critical payment infrastructure including eftpos, BPAY, and NPP and processor of millions of daily transactions, transformed their development practices by adopting Spec-Driven Development using AWS Kiro. AP+ manages the payment rails connecting banks, merchants, and consumers throughout Australia. Through intensive Event-Driven Architecture bootcamps and hands-on training, engineering teams now independently run development workshops every two weeks, accelerating delivery of payment platform innovations while maintaining the highest security and compliance standards required for national financial infrastructure. Learn the practical framework for building development velocity in regulated environments.
- MAE204 (Intermediate)
How Amazon Ads Creative Agent uses AWS to democratize ad creation
Media advertisers see up to 25% higher engagement when delivering custom creative to relevant audiences, yet producing quality video ads traditionally requires weeks of expensive, specialized expertise. Discover the inner workings of Amazon Ads' new AI Creative Agent and how it's transforming the creative process by automating and enhancing the generation of multi-format ads for businesses regardless of their size or creative expertise. Explore how Amazon Bedrock, custom-built ML models, GPUs, and model evaluations are used to turn conversational natural language into compelling ad creatives and full video productions with professional voiceovers, while reducing creative development time.
Live updates related to this topic LIVE
Sourced via Parallel AI Monitor — continuous web watch on 21 topical streams.
- arxiv.org Agent benchmarks & evals
ACE-Bench: Agent Configurable Evaluation with Scalable Horizons and Controllable Difficulty. Methodology: unified grid-based planning tasks where agents fill hidden slots with orthogonal controls for Scalable Horizons (H) and Controllable Difficulty (decoy budget B); tools resolv
- arxiv.org Agent benchmarks & evals
AgentGate: A Lightweight Structured Routing Engine for the Internet of Agents. Methodology: two-stage structured routing framework (action decision + structural grounding) that formulates routing as a constrained decision problem, plus a routing-oriented fine-tuning scheme with c
- arxiv.org Agent benchmarks & evals
Market-Bench: Benchmarking Large Language Models on Economic and Trade Competition. Methodology: configurable multi-agent supply-chain economic model where LLMs act as retailer agents in procurement (bids/auctions) and retail (pricing, marketing) stages; logs full trajectories (b
- benchlm.ai Agent benchmarks & evals
BenchLM.ai updated its AI Agent & Tool-Use Leaderboard (Apr 23, 2026). Methodology: An 'Agentic Score' is calculated as a weighted average of Terminal-Bench 2.0 (40%), OSWorld-Verified (35%), and BrowseComp (25%). It tracks 24 agentic benchmarks including MCP Atlas and Toolathlon
- benchlm.ai (high confidence) Agent benchmarks & evals
Agentic Benchmarks 2026: Tool Use, Browsing, Computer Use | BenchLM.ai
BenchLM.ai updated its Agentic Benchmarks leaderboard on 2026-05-11. The update introduced two new benchmarks: 1) OpenHands Index, a holistic coding-agent benchmark covering issue resolution, frontend work, greenfield development, testing, and information gathering; and 2) SWE-At
External links are matched to this topic by relevance. The KB does not endorse third-party content; verify before citing.
Non-obvious insights
From the Playbook: one sharp, contrarian insight per session — the things teams don't think of unprompted.