Overview
Serverless computing lets you build applications as small functions that run on demand. AWS Lambda popularized the model, and over 2 million customers use it. Pair Lambda with API Gateway (HTTP), EventBridge (events), SQS/SNS (messaging), and Step Functions (workflows) and you have a complete backend that scales from zero to millions of requests with no servers to patch. SnapStart reduces cold-start latency for Java, Python, and .NET functions, in many cases to under a second.
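As a rough illustration of the "small functions that run on demand" model, here is a minimal sketch of a Python Lambda handler sitting behind an API Gateway proxy integration. The function name `handler`, the `name` query parameter, and the greeting payload are illustrative assumptions and do not come from any session; only the general event and response shape follows the proxy integration format.

```python
import json


def handler(event, context):
    """Return a JSON greeting for an API Gateway proxy request."""
    # queryStringParameters is missing or None when the request has no query string.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")  # "name" is a hypothetical parameter

    # Proxy integrations expect a status code, optional headers, and a string body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local smoke test with a hand-built event; no AWS account is required.
    print(handler({"queryStringParameters": {"name": "Lambda"}}, None))
```

Because the handler is a plain function that takes a dictionary, it can be exercised locally with hand-built events before it is deployed.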
Key concepts
- Event-driven architecture and choreography vs. orchestration
- Cold start mitigation: provisioned concurrency, SnapStart
- Idempotency, retries, and dead-letter queues
- Step Functions: standard vs. express, distributed map
- Lambda Powertools — observability and best practices library
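To make the idempotency, retry, and dead-letter concepts above concrete, here is a minimal in-memory sketch. Everything in it is an assumption for illustration: the `process` and `charge` functions, the message shape, and the dictionaries and lists standing in for a durable idempotency store and a dead-letter queue. In a real deployment, retries and dead-lettering are usually configured on the queue or event source rather than written by hand.

```python
processed = {}     # stand-in for a durable idempotency store, keyed by message id
dead_letters = []  # stand-in for a dead-letter queue
charge_log = []    # records every attempted side effect, for demonstration only


def charge(order):
    """Hypothetical business operation that fails for orders flagged as bad."""
    charge_log.append(order)
    if order.get("bad"):
        raise ValueError("cannot charge order")
    return {"charged": order["amount"]}


def process(message, max_attempts=3):
    """Handle a message once per id, retrying failures before dead-lettering."""
    key = message["id"]
    # Idempotency: a redelivered message returns the stored result
    # without repeating the side effect.
    if key in processed:
        return processed[key]

    for _ in range(max_attempts):
        try:
            result = charge(message["order"])
        except ValueError:
            continue  # a real system would back off before the next attempt
        processed[key] = result
        return result

    # Retries exhausted: park the message for inspection instead of dropping it.
    dead_letters.append(message)
    return None


if __name__ == "__main__":
    msg = {"id": "m-1", "order": {"amount": 42}}
    print(process(msg))      # first delivery performs the charge
    print(process(msg))      # duplicate delivery reuses the stored result
    print(len(charge_log))   # number of times charge() actually ran
```

The design choice worth noting is that the idempotency check happens before any side effect, so a duplicate delivery is cheap and safe, while a message that keeps failing ends up in the dead-letter list rather than being retried forever.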
Key AWS services
- AWS Lambda
- AWS Step Functions
- Amazon EventBridge
- Amazon API Gateway
- Amazon SQS
Learn more — curated resources
Hand-picked official docs, foundational papers, and the best community guides for going deeper on this topic.
Sessions on this topic
11 sessions from the Summit covered this topic. Each is a self-contained mini-lesson.
- MAM306 Advanced
Adding Agentic AI to legacy apps with Amazon Bedrock AgentCore
In this code-first session, we demonstrate how to add agentic AI capabilities and augment a legacy application using Amazon Bedrock AgentCore and the Amazon Strands Agents SDK. We will explore how to build AI-powered features for a legacy application without modifying the existing backend code. We will showcase how to leverage existing APIs and Lambda functions as the backbone for your agentic AI experience. You'll learn how to execute code in isolated sandbox environments, ensuring security while accessing internal data sources with Amazon Bedrock AgentCore Code Interpreter.
- DEV207 Intermediate
Data Observability Without the Pain - Lessons from a Production System
Modern IoT platforms are inherently data platforms. Events flow through APIs, queues, AWS Lambda serverless functions, storage systems, and device networks before becoming meaningful data. When something goes wrong, tracing a single event across these distributed components quickly becomes painful, and the question shifts from _what happened_ to _where do I even start looking?_ I'll walk through three practical observability patterns drawn from building and operating a production, event-driven IoT healthcare platform on AWS that processes tens of thousands of device events daily. Using OpenTelemetry, AWS X-Ray, and Honeycomb, we'll explore techniques for gaining visibility into asynchronous event pipelines, correlating activity across services, and tracing events as they move through distributed systems. You'll leave with three concrete patterns you can apply immediately to your own event-driven data systems.
- DEV312 Advanced
Strands Agents on Lambda: Observability With Powertools & X-Ray
When a Strands Agent fails across five Lambda log streams with no correlation, debugging takes 20 minutes minimum. This session demonstrates a structured observability layer that reduces diagnosis to under two minutes. You'll learn how Lambda Powertools Tracer wraps Strands tool invocations as X-Ray subsegments, how Powertools Logger injects AgentCore session correlation IDs across invocations, and how Powertools Metrics surfaces tool retry frequency as CloudWatch alarms — before timeouts occur. The session covers three production failure classes — tool timeout, reasoning loop, and retry storm — and delivers a reusable CDK construct providing full instrumentation for any Strands Agent Lambda deployment.
- DAT301 Advanced
Powering your Agentic AI experience with AWS Streaming and Messaging
Organizations are accelerating innovation with generative AI and agentic AI use cases. This session explores how AWS streaming and messaging services such as Amazon Managed Streaming for Apache Kafka, Kinesis Data Streams, Amazon Managed Service for Apache Flink, and Amazon SQS help build intelligent, responsive applications. Discover how streaming supports real-time data ingestion and processing, while messaging ensures reliable coordination between AI agents, orchestrates workflows, and delivers critical information at scale. Learn architectural patterns that highlight how a unified approach acts on data as fast as needed, providing the reliability and scale to grow for your next generation of AI.
- DEV311 Advanced
Serverless Developer Experience: Day in a life of builder
What does it mean to be a serverless developer in the era of GenAI? What disciplines do you need to master to build cloud-native, serverless solutions today? In this session, we'll walk through a day in the life of a serverless developer and explore the core principles, architecture patterns, frameworks, and how to leverage GenAI tools to build your next-generation serverless application.
- ISV206 Intermediate
Scaling RAG to Millions of Vectors: The Squiz Story
Squiz, a global Digital Experience Platform provider, is transforming how organizations deliver conversational search experiences. By adopting Amazon S3 Vectors, Squiz reimagined its ingestion pipeline — increasing data processing speed by 50% and shifting from bespoke, always-on infrastructure to a scalable serverless model. This allows Squiz to seamlessly scale from 25,000 to millions of vectors per client, while significantly reducing costs. Hear how this shift freed engineering teams to focus on RAG innovation rather than infrastructure management, and how it powers smart video search capabilities across their platform.
- ARC302 Advanced
Secure Multi-tenant SaaS with AWS Lambda: A Tenant Isolation Deep Dive
In this session, learn about AWS Lambda's execution environment lifecycle, diving deep into how the service manages isolation at the function level, and understanding the security implications of environment reuse patterns. Learn about traditional patterns for compute isolation in multi-tenant environments, as well as explore Lambda's tenant isolation mode - a new powerful capability that enables tenant-level compute separation without operational overhead. Explore how to implement robust tenant isolation strategies, manage state across executions, and leverage Lambda's security boundaries effectively. Whether building new SaaS applications or enhancing existing ones, leave with practical knowledge to implement secure multi-tenant architectures at scale.
- ARC403 Expert
Secure Multi-tenant SaaS with AWS Lambda: A Tenant Isolation Deep Dive
In this session, learn about AWS Lambda's execution environment lifecycle, diving deep into how the service manages isolation at the function level, and understanding the security implications of environment reuse patterns. Learn about traditional patterns for compute isolation in multi-tenant environments, as well as explore Lambda's tenant isolation mode - a new powerful capability that enables tenant-level compute separation without operational overhead. Explore how to implement robust tenant isolation strategies, manage state across executions, and leverage Lambda's security boundaries effectively. Whether building new SaaS applications or enhancing existing ones, leave with practical knowledge to implement secure multi-tenant architectures at scale.
- DEV203 Intermediate
Decisions Over Diagrams: How Bell Financial Group Architects on AWS
Architecture diagrams show what you built. They don't explain why. At Bell Financial Group, every major technology choice — from landing zone design to compute platform to database engine — is captured in an Architecture Decision Document that forces honest evaluation of trade-offs. In this talk, the Head of Engineering at Bell Financial Group walks through the real decisions behind their AWS platform: why ECS Fargate beat EKS, when DynamoDB wins over relational databases, why the entire infrastructure is written in TypeScript CDK, and the deliberate constraints they place on Lambda usage. No slides full of boxes and arrows — just the reasoning, the trade-offs, and the lessons learned building a regulated financial services platform on AWS.
- DEV310 Advanced
Zero-Downtime Migration from Sydney to Auckland (ap-southeast-6)
With AWS ap-southeast-6 (Auckland) now open, New Zealand organizations can repatriate workloads from Sydney. This advanced session provides practical migration strategies minimizing downtime and eliminating data loss across every layer of your stack. You'll learn region-to-region migration patterns for storage (S3 replication, EBS snapshots, EFS cross-region transfers), databases (RDS read replicas, DynamoDB global tables, self-managed EC2 database replication), and applications (Lambda, ECS/EKS workload migration, EC2 AMI copying). Walk away with a prioritized migration playbook, realistic RTO/RPO targets, and battle-tested sequencing strategies for large-scale data transfers without extended application outages.
- INO102 Foundational
Partnering for Scale & Innovation
Discover how Sportsbet, Australia's leading sports betting platform, transformed its technology organization through strategic AWS partnerships. Learn how they built an enterprise-wide AI learning culture that achieved record-breaking certification results, and completed a serverless-first modernization that delivered significant cost savings and emissions reductions while handling massive scale. This session shares practical strategies for building executive buy-in, establishing effective AWS partnerships, and creating a culture where innovation thrives, demonstrating how to focus internal resources on competitive differentiation while partnering for technical expertise to accelerate transformation.
Live updates related to this topic
Sourced via Parallel AI Monitor, a continuous web watch on 21 topical streams.
- forbes.com (high confidence): Scaling infra for agent workloads
AWS Cuts AI Agent Setup To 3 API Calls In AgentCore Update
Waxell published a detailed framework on AI Agent Circuit Breakers, proposing automated circuit breakers implemented at the governance plane (outside agent code) to prevent runaway loops, monitor cost velocity, handle consecutive failures, and stop scope violations.
- linkedin.com: Scaling infra for agent workloads
API7.ai, Original Creator of Apache APISIX | LinkedIn
Arcjet introduced 'Guards,' a runtime security service for AI agent workflows that enables enforcement of per-user token budgets and spend limits inside agent loops and can detect prompt injection in tool results.
- virtualizationreview.com (high confidence): Scaling infra for agent workloads
How to Scale Backend Infrastructure for the Age of Agentic AI
Waxell provides a governance layer for infrastructure-layer budget enforcement that wraps LLM requests and tool calls, synchronously terminating sessions before an API call is placed once a per-session or fleet-wide token/cost ceiling is reached, preventing runaway loop scenarios
- agentmarketcap.ai (high confidence): Scaling infra for agent workloads
Agent-Native Database Architecture 2026: Why REST APIs Fail ...
Waxell provides a governance layer for infrastructure-layer budget enforcement that wraps LLM requests and tool calls, synchronously terminating sessions before an API call is placed once a per-session or fleet-wide token/cost ceiling is reached, preventing runaway loop scenarios
- axell.ai (high confidence): Scaling infra for agent workloads
AI Agent Token Budget Enforcement [2026]
Waxell provides a governance layer for infrastructure-layer budget enforcement that wraps LLM requests and tool calls, synchronously terminating sessions before an API call is placed once a per-session or fleet-wide token/cost ceiling is reached, preventing runaway loop scenarios
External links matched to this topic via topic relevance. The KB does not endorse third-party content; verify before citing.
Non-obvious insights
From the Playbook: one sharp, contrarian insight per session, the things teams don't think of unprompted.