Optimizing Lambda performance for your serverless applications

James Beswick Senior Developer Advocate, AWS Serverless @jbesw

© 2020, Web Services, Inc. or its Affiliates. About me

• James Beswick • Email: [email protected] • Twitter: @jbesw • Senior Developer Advocate – AWS Serverless • Self-confessed serverless geek • Software Developer • Product Manager • Previously: • Multiple start-up tech guy • Rackspace, USAA, Morgan Stanley, J P Morgan

© 2020, , Inc. or its Affiliates. Agenda

Memory and profiling

© 2020, Amazon Web Services, Inc. or its Affiliates. How does Lambda work?

© 2020, Amazon Web Services, Inc. or its Affiliates. Anatomy of an AWS Lambda function

Your function

Language runtime Execution environment Lambda service Compute substrate

© 2020, Amazon Web Services, Inc. or its Affiliates. Where you can impact performance…

Your function

Language runtime Execution environment Lambda service Compute substrate

© 2020, Amazon Web Services, Inc. or its Affiliates. Anatomy of an AWS Lambda function

Handler () function Event object Context object Function to be executed Data sent during Lambda Methods available to upon invocation function Invocation interact with runtime information (request ID, log group, more)

// Python // Node.js

import json const MyLib = require(‘my-package’) import mylib const myLib = new MyLib()

def lambda_handler(event, context): exports.handler = async (event, context) => { # TODO implement # TODO implement return { return { 'statusCode': 200, statusCode: 200, 'body': json.dumps('Hello World!') body: JSON.stringify('Hello from Lambda!') } } }

© 2020, Amazon Web Services, Inc. or its Affiliates. Function lifecycle – worker host

Start new Download Execution Execute Execute your code environment INIT code handler code

Full Partial Warm cold start cold start start

AWS optimization Your optimization

© 2020, Amazon Web Services, Inc. or its Affiliates. Measuring with AWS X-Ray

const AWSXRay = require(‘aws-xray-sdk-core’) const AWS = AWSXRay.captureAWS(require(‘aws-sdk’)) Profile and troubleshoot AWSXRay.captureFunc('annotations', subsegment => { serverless applications: subsegment.addAnnotation('Name', name) subsegment.addAnnotation('UserID', event.userid) • Lambda instruments }) incoming requests and can capture calls made in code • API Gateway inserts tracing header into HTTP calls and reports data back to X-Ray

© 2020, Amazon Web Services, Inc. or its Affiliates. X-Ray Trace Example

© 2020, Amazon Web Services, Inc. or its Affiliates. Three areas of performance

Latency Throughput Cost

© 2020, Amazon Web Services, Inc. or its Affiliates. Cold starts

© 2020, Amazon Web Services, Inc. or its Affiliates. Function lifecycle – a warm start

Service identifies if Request made warm execution to Lambda’s API environments is available

Yes

Complete Invoke handler invocation

© 2020, Amazon Web Services, Inc. or its Affiliates. Function lifecycle – a full cold start

Service identifies if Find available Request made Download warm execution compute to Lambda’s API customer code environments is No resource available

Complete Start execution Invoke handler Execute INIT invocation environment

© 2020, Amazon Web Services, Inc. or its Affiliates. Cold starts - Execution environment

The facts: • <1% of production workloads • Varies from <100ms to >1s

Be aware… • You cannot target warm environments • Pinging functions to keep them warm is limited

© 2020, Amazon Web Services, Inc. or its Affiliates. Cold starts - Execution environment

The facts: Cold starts occur when… • <1% of production workloads • Environment is reaped • Varies from <100ms to >1s • Failure in underlying resources • Rebalancing across Azs Be aware… • Updating code/config flushes • You cannot target warm • Scaling up environments • Pinging functions to keep them warm is limited

© 2020, Amazon Web Services, Inc. or its Affiliates. Cold starts - Execution environment

Influenced by: • Memory allocation • Size of function package • How often a function is called • Internal algorithms

AWS optimizes for this part of a cold start.

© 2020, Amazon Web Services, Inc. or its Affiliates. Lambda + VPC – Major performance improvement

 Before: 14.8 sec duration

After: 933ms duration  Read more at http://bit.ly/vpc-lambda

© 2020, Amazon Web Services, Inc. or its Affiliates. Cold starts - Static initialization

The facts:

• Code run before handler Dependencies, • Used to initialize objects, configuration establish connections, etc. information • Biggest impact on cold-starts Also occurs when… • A new execution environment is run for the first time • Scaling up

© 2020, Amazon Web Services, Inc. or its Affiliates. Cold starts - Static initialization

Influenced by: What can help… • Size of function package • Code optimization • Amount of code • Trim SDKs • Reuse connections • Amount of initialization work • Don’t load if not used • Lazily load variables The developer is responsible for • Provisioned Concurrency this part of a cold start.

© 2020, Amazon Web Services, Inc. or its Affiliates. Provisioned Concurrency on AWS Lambda

Pre-creates execution environments, running INIT code. • Mostly for latency-sensitive, interactive workloads • Improved consistency across the long tail of performance • Minimal changes to code or Lambda usage • Integrated with AWS Auto Scaling • Adds a cost factor for per concurrency provisioned but a lower duration cost for execution • This could save you money when heavily utilized

© 2020, Amazon Web Services, Inc. or its Affiliates. Function lifecycle – a Provisioned Concurrency start

Function Find available configured with Download compute Provisioned customer code resource Concurrency

Start execution Execute INIT environment

© 2020, Amazon Web Services, Inc. or its Affiliates. Function lifecycle – a Provisioned Concurrency invocation

Service identifies if Request made warm execution to Lambda’s API environment is This becomes the default for all available provisioned concurrency execution environments Yes

Complete Invoke handler invocation

© 2020, Amazon Web Services, Inc. or its Affiliates. Provisioned Concurrency – things to know

• Reduces the start time to <100ms • Can’t configure for $LATEST • Use versions/aliases • Provisioning rampup of 500 per minute • No changes to function handler code performance • Requests above provisioned concurrency follow on-demand Lambda limits and behaviors for cold- starts, bursting, pricing • Still overall account concurrency per limit region • Wide support (CloudFormation, Terraform, , Datadog, Epsagon, Lumigo, Thundra, etc.)

© 2020, Amazon Web Services, Inc. or its Affiliates. Provisioned Concurrency

Things to know: • We provision more than you request. • We still reap these environments. • There is less CPU burst than On-Demand during init

© 2020, Amazon Web Services, Inc. or its Affiliates. Provisioned Concurrency - configuration

© 2020, Amazon Web Services, Inc. or its Affiliates. Provisioned Concurrency - configuration

© 2020, Amazon Web Services, Inc. or its Affiliates. Provisioned Concurrency – Application Auto Scaling

© 2020, Amazon Web Services, Inc. or its Affiliates. Memory and profiling

© 2020, Amazon Web Services, Inc. or its Affiliates. Resources allocation

Memory  Power

© 2020, Amazon Web Services, Inc. or its Affiliates. CPU-bound example

“Compute 1,000 times all prime numbers <= 1M”

128 MB 11.722 sec $0.024628 256 MB 6.678 sec $0.028035 512 MB 3.194 sec $0.026830 1024 MB 1.465 sec $0.024638

© 2020, Amazon Web Services, Inc. or its Affiliates. Meet AWS Lambda Power Tuning

Features: • Data-driven cost and performance optimization for AWS Lambda • Available as an AWS Serverless Application Repository app • Easy to integrate with CI/CD

https://github.com/alexcasalboni/aws-lambda-power-tuning

© 2020, Amazon Web Services, Inc. or its Affiliates. Meet AWS Lambda Power Tuning

https://github.com/alexcasalboni/aws-lambda-power-tuning

© 2020, Amazon Web Services, Inc. or its Affiliates. AWS Lambda Power Tuning (input)

© 2020, Amazon Web Services, Inc. or its Affiliates. AWS Lambda Power Tuning (input)

© 2020, Amazon Web Services, Inc. or its Affiliates. AWS Lambda Power Tuning (input)

© 2020, Amazon Web Services, Inc. or its Affiliates. AWS Lambda Power Tuning (input)

© 2020, Amazon Web Services, Inc. or its Affiliates. AWS Lambda Power Tuning (input)

© 2020, Amazon Web Services, Inc. or its Affiliates. AWS Lambda Power Tuning (input)

© 2020, Amazon Web Services, Inc. or its Affiliates. AWS Lambda Power Tuning (output)

© 2020, Amazon Web Services, Inc. or its Affiliates. AWS Lambda Power Tuning (visualization)

Lowest costs

Fastest execution time

https://github.com/alexcasalboni/aws-lambda-power-tuning

© 2020, Amazon Web Services, Inc. or its Affiliates. Real-world examples

© 2020, Amazon Web Services, Inc. or its Affiliates. No-Op (trivial data manipulation <100ms)

© 2020, Amazon Web Services, Inc. or its Affiliates. CPU-bound (numpy: inverting 1500x1500 matrix)

© 2020, Amazon Web Services, Inc. or its Affiliates. CPU-bound (prime numbers)

© 2020, Amazon Web Services, Inc. or its Affiliates. CPU-bound (prime numbers – more granularity)

© 2020, Amazon Web Services, Inc. or its Affiliates. Network-bound (third-party API call)

© 2020, Amazon Web Services, Inc. or its Affiliates. Network-bound (3x DDB queries)

© 2020, Amazon Web Services, Inc. or its Affiliates. Network-bound (S3 download – 150MB)

© 2020, Amazon Web Services, Inc. or its Affiliates. Network-bound (S3 download multithread – 150MB)

© 2020, Amazon Web Services, Inc. or its Affiliates. Cost/performance patterns

https://github.com/alexcasalboni/aws-lambda-power-tuning

© 2020, Amazon Web Services, Inc. or its Affiliates. How billing rounding impacts cost optimization

Example 1 Example 2

Time = 480ms Time = 310ms Billing = 500ms Billing = 400ms

A 15% performance optimization... A 5% performance optimization... Time = 408ms Time = 294ms Billing = 500ms Billing = 300ms ... leads to a 0% cost ... leads to a 25% cost optimization optimization

© 2020, Amazon Web Services, Inc. or its Affiliates. Architecture and best practices

© 2020, Amazon Web Services, Inc. or its Affiliates. Optimization best practices

Avoid «monolithic» functions Minify/uglify production code

Reduce deployment package size Browserify/Webpack Micro/Nano services

Optimize dependencies (and imports) Lazy initialization of shared libs/objs E.g. up to 120ms faster with Node.js SDK Helps if multiple functions per file

© 2020, Amazon Web Services, Inc. or its Affiliates. Optimized dependency usage (Node.js SDK & X-Ray)

// const AWS = require('aws-sdk’) const DynamoDB = require('aws-sdk/clients/dynamodb’) // 125ms faster

// const AWSXRay = require('aws-xray-sdk’) const AWSXRay = require('aws-xray-sdk-core’) // 5ms faster

// const AWS = AWSXRay.captureAWS(require('aws-sdk’)) const dynamodb = new DynamoDB.DocumentClient() AWSXRay.captureAWSClient(dynamodb.service) // 140ms faster

© 2020, Amazon Web Services, Inc. or its Affiliates. Lazy initialization example (Python & boto3) import boto3

S3_client = None ddb_client = None def get_objects(event, context): if not s3_client: s3_client = boto3.client("s3") # business logic def get_items(event, context): if not ddb_client: ddb_client = boto3.client(”dynamodb") # business logic

© 2020, Amazon Web Services, Inc. or its Affiliates. Amazon RDS Proxy

Fully managed, highly available database proxy for Amazon RDS. Pools and shares connections to make applications more scalable, more resilient to database failures, and more secure.

© 2020, Amazon Web Services, Inc. or its Affiliates. Optimization best practices (performance/cost)

Externalize orchestration Avoid idle/sleep – delegate to Step Functions

Transform, not Transport Minimize data transfer (S3 Select, advanced filtering, EventBridge input transformer, etc.)

Discard uninteresting events asap Trigger configuration (S3 prefix, SNS filter, EventBridge content filtering, etc.)

© 2020, Amazon Web Services, Inc. or its Affiliates. Reusing connections with Keep-Alive

• For functions using http(s) requests • Use in SDK with environment variable • Or set keep-alive property in your function code • Can reduce typical DynamoDB operation from 30ms to 10ms • Available in most runtime SDKs

For more details, visit https://bit.ly/reuse-connection

© 2020, Amazon Web Services, Inc. or its Affiliates. Comparing sync and async

Service A Service B Service A SQS queue Service B

Synchronous: Asynchronous: • The caller is waiting • The caller receives ack quickly • Waiting incurs cost • Minimizes cost of waiting • Downstream slowdown affects • Queueing separates fast and entire process slow processes • Process change is complex • Process change is easy • Dependent on custom code • Uses manages services

© 2020, Amazon Web Services, Inc. or its Affiliates. Lambda Performance - Summary

Cold starts Architecture and optimization • Causes of a cold start • Best practices • VPC improvements • RDS Proxy • Provisioned Concurrency • Async and service integration

Memory and profiling • Memory is power for Lambda • AWS Lambda Power Tuning • Trade-offs of cost and speed

© 2020, Amazon Web Services, Inc. or its Affiliates. Thanks! James Beswick, AWS Serverless [email protected]

© 2020, Amazon Web Services, Inc. or its Affiliates.