Serverless Deployment of Machine Learning Models on AWS Lambda
This article is written by Lloyd Hamilton.
February 7, 2022
Lloyd is a data instructor based at CodeClan Edinburgh, where he teaches subjects ranging from data manipulation and data visualisation to machine learning. He recently wanted to understand how he could create value from his machine learning models, and this led him to the world of MLOps.
Introduction
In my previous guide, we explored the concepts and methods involved in deploying a machine learning model on AWS Elastic Beanstalk. Despite being largely automated, services like AWS Elastic Beanstalk still require the deployment of key resources such as EC2 instances and Elastic Load Balancers. Provisioned resources on AWS Elastic Beanstalk are always active, even when not required.
The concept of serverless orchestration of code moves away from the traditional implementation of cloud computing resources by eliminating infrastructure management tasks. Serverless cloud computing is an evolution of the hands-free approach to infrastructure management offered by Elastic Beanstalk, but without the provisioning or management of servers.
Serverless computing is an event-driven compute service that can run code for almost any application. Since developers do not need to manage infrastructure, serverless implementation of code has the benefit of increasing productivity as developers can spend more time writing code. Ultimately, serverless functions are stateless and are only executed when you need them. This makes them highly cost effective solutions for many applications.
In this guide, we will learn how to deploy a machine learning model as a lambda function, the serverless offering by AWS. We will first set up the working environment by integrating AWS CLI on our machine. Next, we will train a K-nearest neighbour classifier which we will deploy as a docker container. This guide will walk you through the tools you need to enable you to test your application locally before deployment as a lambda function on AWS.
Let’s begin.
Contents
- Pre-requisites
- Introduction to the MNIST data set
- Training a K-Nearest Neighbour (KNN) classifier
- Initialising AWS S3 bucket
- Deploying and testing AWS lambda functions with SAM
- AWS resource termination
- Summary
Pre-requisites
There are several pre-requisites that you will need before moving forward. This guide will require you to interact with many tools, so do spend some time fulfilling them.
- You will need an AWS account. You can sign up to the free tier which will be automatically applied on sign up.
- Some technical knowledge of navigating the command line.
- Install AWS CLI
- Set up AWS CLI
- Install AWS Serverless Application Model CLI
- Install Docker
- Python 3.9.7
- VS Code with Jupyter Extension or any of your preferred IDE.
- Poetry — Python package management tool (Read my previous post on getting set up with Poetry)
- Python libraries: scikit-learn, numpy, requests, pandas, joblib, boto3, matplotlib, pyjanitor, jupyter, ipykernel. You can install my current Python build using Poetry; alternatively, the requirements.txt file is included in the Git repository.
- The project repository for this project is linked here. The main body of code can be found in the Jupyter notebook linked here.
Overview
The aim of this guide is to walk you through the steps required to deploy a machine learning model as a lambda function on AWS. It documents the key tools required to deploy a lambda function. Here is an overview of what we will be covering in this project.
- Training a K-nearest neighbour classifier on the MNIST data set for deployment.
- Initialising a S3 bucket as a data store.
- Local testing of dockerised lambda functions with AWS Serverless Application Model (SAM).
- Deployment of a CloudFormation stack using AWS SAM.
1. Introduction to the MNIST data
For this classification project, we will be using the MNIST data set, which contains 70,000 images of handwritten digits. In this data set, each row represents an image and each column a pixel of a 28 by 28 pixel image. The MNIST data set is widely used to train classifiers and can be fetched using the helper function sklearn.datasets.fetch_openml. All data from OpenML is free to use, including all empirical data and metadata, licensed under the CC-BY licence.
All code for this project can be found in the Jupyter notebook, deploying_models.ipynb, in the GitHub repo linked here.
The code below will download the MNIST data and sample 20,000 rows. The data set has been reduced to decrease model size and build time for this project. The code will also plot the first image in the data set, which we can see is the number eight.
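As a rough sketch, a cell along these lines would download, sample and plot the data. The 20,000-row sample size follows the article; the random_state and the exact plotting style are assumptions.

```python
from sklearn.datasets import fetch_openml
import matplotlib.pyplot as plt

# Fetch the full MNIST data set (70,000 images, 784 pixel columns) from OpenML.
mnist = fetch_openml("mnist_784", version=1, as_frame=True)
features, labels = mnist.data, mnist.target

# Sample 20,000 rows to keep the model artefact and Docker image small.
sample_index = features.sample(n=20_000, random_state=42).index
features, labels = features.loc[sample_index], labels.loc[sample_index]

# Plot the first image in the sampled data set.
plt.imshow(features.iloc[0].to_numpy().reshape(28, 28), cmap="binary")
plt.title(f"Label: {labels.iloc[0]}")
plt.show()
```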
2. Training a K-Nearest Neighbors Classifier
First, we will split the data into training and test sets, then train a K-nearest neighbour classifier using the scikit-learn library.
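A minimal sketch of this step, reusing the features and labels variables from the previous snippet; the 80/20 split ratio and 5-fold cross-validation are assumptions.

```python
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Hold out a test set for the final evaluation.
train_features, test_features, train_labels, test_labels = train_test_split(
    features, labels, test_size=0.2, random_state=42
)

# Fit a K-nearest neighbour classifier and estimate accuracy with cross-validation.
knn_clf = KNeighborsClassifier()
knn_clf.fit(train_features, train_labels)
cv_scores = cross_val_score(knn_clf, train_features, train_labels, cv=5)
print(f"Mean cross-validated accuracy: {cv_scores.mean():.3f}")
```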
The model achieves a decent average accuracy of 96% from cross-validation. Let's evaluate the model's performance on the test_features data set and plot a confusion matrix with the show_cm function, as shown below.
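The show_cm helper comes from the project notebook; a rough equivalent using scikit-learn's built-in confusion matrix plotting might look like this:

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay, accuracy_score

# Evaluate on the held-out test set.
test_predictions = knn_clf.predict(test_features)
print(f"Test accuracy: {accuracy_score(test_labels, test_predictions):.3f}")

# Plot a confusion matrix (the notebook wraps this logic in its show_cm helper).
ConfusionMatrixDisplay.from_predictions(test_labels, test_predictions, cmap="Blues")
plt.show()
```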
Based on the accuracy on the test data set, we can see that our model fits the data well. We get very similar prediction accuracies when comparing the training and test sets.
Furthermore, a confusion matrix, like above, is very effective in helping visualise the gaps in the model’s performance. It will help us understand the kind of errors that the classifier is making.
The matrix indicates that there were 16 instances where the number 4 was misidentified as the number 9, and 12 instances where the number 8 was misidentified as the number 5.
Looking at the images below, it is possible to see why some of these errors may occur, as the numbers 4 and 9 do share some similar features. Likewise for the numbers 8 and 5.
This insight is not going to affect model deployment on AWS but will help guide strategies to further improve the model.
For now, we will save the model locally to be containerised as part of the lambda function using Docker.
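Saving the model is a one-liner with joblib. The file name knn_clf.joblib is an assumption, but the same name is reused in the Dockerfile sketch later on.

```python
import joblib

# Persist the trained classifier so it can be copied into the Docker image later.
joblib.dump(knn_clf, "knn_clf.joblib")
```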
3. Initialising AWS S3 Bucket
The image below illustrates the overall resource infrastructure that will need to be deployed to support our lambda function. There are three key resource requirements for our application:
- S3 Bucket to store data.
- API gateway to manage HTTP requests.
- Lambda function containing the predictive logic.
The lambda function will contain Python code that performs a prediction on the test_features data set stored in an S3 bucket. Therefore, we will first need to initialise an S3 bucket where we can host our data.
To do so, we will be interacting with AWS using the AWS Python SDK, boto3. This package contains all the dependencies we require to integrate Python projects with AWS.
Let's initialise an S3 bucket with the code below.
Note: The bucket_name has to be unique, so you will have to replace bucket_name with a name that is not already taken.
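A minimal boto3 sketch for this step, assuming an eu-west-2 (London) region and a hypothetical bucket name; in us-east-1 the CreateBucketConfiguration argument must be omitted.

```python
import boto3

# Hypothetical bucket name: replace with a globally unique name of your own.
bucket_name = "knn-mnist-deployment-demo"

s3_client = boto3.client("s3")
s3_client.create_bucket(
    Bucket=bucket_name,
    CreateBucketConfiguration={"LocationConstraint": "eu-west-2"},
)
```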
The S3 bucket will host our test_features data set, which we can call from our lambda function to perform a prediction.
To save an object currently in our workspace, we will make use of the BytesIO function from the io library. This will enable us to temporarily store the test_features data set in a file object. This file object can then be uploaded to the S3 bucket by calling the .upload_fileobj function.
The bucket variable defines the destination S3 bucket, and the key variable defines the file path within the bucket. The bucket and key variables will form part of the data payload in the POST HTTP request to our lambda function.
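A sketch of the upload step, reusing the s3_client, bucket_name and test_features objects from the earlier snippets; the key path mirrors the output shown further below.

```python
from io import BytesIO

import joblib

# Serialise test_features into an in-memory file object, then upload it to S3.
bucket = bucket_name
key = "validation/test_features.joblib"

with BytesIO() as file_obj:
    joblib.dump(test_features, file_obj)
    file_obj.seek(0)  # Rewind so upload_fileobj reads from the start.
    s3_client.upload_fileobj(file_obj, Bucket=bucket, Key=key)
```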
We can check whether the objects have been uploaded with the helper function below. list_s3_objects will list all objects in the defined bucket.
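One way such a helper might be written, as a sketch rather than the notebook's exact implementation:

```python
def list_s3_objects(bucket_name: str) -> list[str]:
    """List the keys of all objects in the given S3 bucket."""
    response = s3_client.list_objects_v2(Bucket=bucket_name)
    return [obj["Key"] for obj in response.get("Contents", [])]

print(list_s3_objects(bucket_name))
```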
Output: ['validation/test_features.joblib']
We have now successfully initialised an S3 bucket to store the test_features data. The next two key resources, the API gateway and the lambda function, will be deployed using the AWS Serverless Application Model (SAM).
4. Deploying and Testing AWS Lambda Functions with SAM
AWS SAM is an open-source framework used to build serverless applications. It streamlines the build process of serverless architecture by providing simple syntax to deploy functions, APIs or databases on AWS. SAM unifies all the tools you need to rapidly deploy serverless applications within a single YAML configuration file.
There are other options, such as the Serverless Framework, which is a great alternative. Serverless has the added advantage of being a universal cloud interface (AWS, Azure, Google Cloud) for increased versatility. However, I have personally found local integration and testing of Docker containers to be better on AWS SAM than on Serverless. I would be curious if anyone has a different opinion! Do leave a note.
Here is the overall folder structure of the current project, which can be found on GitHub here.
In the following sections, I will specifically discuss three important files:
- A .yaml file detailing the SAM configuration (template_no_auth.yaml).
- A .py file containing the code for our lambda function (lambda_predict.py).
- A Dockerfile detailing the code that containerises our lambda function.
4.1. template_no_auth.yaml
The template_no_auth.yaml file defines all the code we need to build our serverless application. You can find the official documentation for the template specification here.
Note: The current template does not include resources that perform server-side authentication of API requests. Therefore, deploying our lambda function in its current state will allow anyone with the URL to make a request to your function.
Let's take a detailed look at the template file to better understand the configurations that are being defined. I have broken it down into three sections and have linked the respective documentation for each declaration in the headers.
AWSTemplateFormatVersion
The latest template format version is 2010-09-09 and is currently the only valid value.
The AWS::Serverless-2016-10-31 declaration identifies an AWS CloudFormation template file as an AWS SAM template file and is a requirement for SAM template files.
Global variables to be used by specific resources can be defined here. The function timeout and memory size are set to 50 seconds and 5,000 MB respectively. When the specified timeout is reached, the function will stop execution. You should set the timeout value to your expected execution time to stop your function from running longer than intended. Finally, in our template we have set the OpenAPI version to 3.0.1.
The default staging value is set to dev. You can define parameter values which can be referenced throughout the YAML file.
The resources section is where we declare the specific AWS resources we require for our application. This list details the available resources you can declare in SAM.
For our project, we will be declaring the API gateway and the lambda function as resources. We will not need to declare an S3 bucket, as we have already created one for our project.
In the resources section, an API called LambdaAPI is declared. LambdaAPI has the property StageName, which takes the stage parameter.
The resources section also declares a lambda function with the name PredictFunction. To declare the lambda function as a Docker image, the PackageType variable needs to be defined as Image, and a link to a Dockerfile must be declared in the Metadata section of the YAML file.
We also specify an event that will trigger the lambda function. In this case, a POST HTTP request to the /predict endpoint of LambdaAPI will trigger the lambda function. Finally, for the lambda function to have access to S3 buckets, we have attached the AWS managed policy AmazonS3FullAccess.
In the outputs section, we declare a set of outputs to return after deploying the application with SAM. I have defined the output to return the URL of the API endpoint that invokes the lambda function.
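Putting these declarations together, a trimmed-down sketch of what template_no_auth.yaml might look like is shown below. The resource names, timeout, memory size, endpoint and policy follow the description above; the output name, description and exact Metadata values are assumptions and may differ from the repository's template.

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Transform: AWS::Serverless-2016-10-31
Description: KNN classifier served behind API Gateway as a containerised lambda function.

Globals:
  Function:
    Timeout: 50
    MemorySize: 5000
  Api:
    OpenApiVersion: '3.0.1'

Parameters:
  Stage:
    Type: String
    Default: dev

Resources:
  LambdaAPI:
    Type: AWS::Serverless::Api
    Properties:
      StageName: !Ref Stage

  PredictFunction:
    Type: AWS::Serverless::Function
    Properties:
      PackageType: Image
      Policies:
        - AmazonS3FullAccess
      Events:
        Predict:
          Type: Api
          Properties:
            RestApiId: !Ref LambdaAPI
            Path: /predict
            Method: post
    Metadata:
      Dockerfile: Dockerfile
      DockerContext: .
      DockerTag: latest

Outputs:
  LambdaApiUrl:
    Description: API Gateway endpoint URL for the predict function
    Value: !Sub "https://${LambdaAPI}.execute-api.${AWS::Region}.amazonaws.com/${Stage}/predict"
```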
4.2. lambda_predict.py
The lambda_predict.py file contains the predictive logic for our application. In general, the function will:
- Load the model.
- Download the test_features data set referenced by the bucket and key variables.
- Perform a prediction on the downloaded data set.
- Return a JSON object of the predictions as a numpy array.
The Python file also contains a logger that records the progress of the script, which significantly helps when debugging.
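A minimal sketch of what lambda_predict.py might contain, assuming the model file is baked into the image as knn_clf.joblib and the request body carries bucket and key fields; the actual file in the repository may differ in detail.

```python
import json
import logging
from io import BytesIO

import boto3
import joblib

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Code outside the handler runs once, when the container is initialised (cold start).
logger.info("Loading model...")
model = joblib.load("knn_clf.joblib")
s3_client = boto3.client("s3")


def lambda_handler(event, context):
    """Download test_features from S3, predict, and return the labels as JSON."""
    body = json.loads(event["body"])
    bucket, key = body["bucket"], body["key"]

    logger.info("Downloading %s from bucket %s", key, bucket)
    with BytesIO() as file_obj:
        s3_client.download_fileobj(bucket, key, file_obj)
        file_obj.seek(0)
        test_features = joblib.load(file_obj)

    logger.info("Running prediction on %d rows", len(test_features))
    predictions = model.predict(test_features)

    return {
        "statusCode": 200,
        "body": json.dumps({"prediction": predictions.tolist()}),
    }
```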
In addition, this is a good time to note the concept of cold starts and how they affect latency when optimising lambda functions. I have linked an article that explains this concept really well.
4.3. Dockerfile
The Dockerfile details the instructions required to containerise our lambda function as a Docker image. I will be using Python 3.9 and installing the Python dependencies using Poetry.
A key thing to note: the entry point for the Docker image is set to the lambda_handler function, which is declared in the lambda_predict.py file. This entry point defines the function to be executed on an event trigger, such as an HTTP POST request. Any code outside of the lambda_handler function that is within the same script will be executed when the container image is initialised.
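A hedged sketch of such a Dockerfile, assuming the AWS-provided Python 3.9 base image and that pyproject.toml, poetry.lock, lambda_predict.py and knn_clf.joblib sit in the build context; the repository's actual Dockerfile may differ.

```dockerfile
# AWS-provided base image for Python 3.9 lambda functions.
FROM public.ecr.aws/lambda/python:3.9

# Install Poetry and the locked project dependencies into the image.
COPY pyproject.toml poetry.lock ./
RUN pip install poetry && \
    poetry config virtualenvs.create false && \
    poetry install --no-dev --no-root

# Copy the function code and the serialised model into the lambda task root.
COPY lambda_predict.py knn_clf.joblib ${LAMBDA_TASK_ROOT}/

# Point the lambda runtime at the handler function.
CMD ["lambda_predict.lambda_handler"]
```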
4.4. Building and testing the application locally.
AWS SAM provides functionality to build and locally test applications before deployment.
1. Ensure Docker is running. In a terminal window, navigate to the project directory and build the application in SAM.
2. Locally deploy the dockerised lambda function.
3. Locally invoke the function at http://127.0.0.1:3000/predict. Your URL may differ. The commands for these steps are sketched below.
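Steps 1 and 2 map onto the following SAM CLI commands; sam local start-api serves the API on port 3000 by default, and the --template flag is only needed because the template file is not named template.yaml.

```bash
# 1. Build the application; SAM builds the Docker image declared in the template.
sam build --template template_no_auth.yaml

# 2. Start a local API Gateway emulator that serves the lambda function on port 3000.
sam local start-api
```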
Note: The bucket and key variables, which reference the test_features data set on S3, will need to be passed as part of the data payload in the POST HTTP request.
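A sketch of the local invocation using the requests library, reusing the bucket_name defined earlier; the payload field names match those assumed in the lambda_predict.py sketch.

```python
import requests

# POST the S3 location of the test_features object to the locally running function.
payload = {"bucket": bucket_name, "key": "validation/test_features.joblib"}
response = requests.post("http://127.0.0.1:3000/predict", json=payload)
print(response.json())
```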
The locally invoked lambda function performs as we expect, as we achieve identical results when compared to the previous test_features predictions.
4.5. Deploying on AWS Lambda
As easy as it was to deploy locally, SAM will also handle all the heavy lifting to deploy on AWS Lambda.
a) Build the application in SAM.
b) Deploy the application. A sketch of both commands is shown below.
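The two commands are, in essence, the same build step as before followed by a guided deployment; the --guided flag walks you through the stack name, region and ECR settings and saves them to samconfig.toml.

```bash
# a) Build the application image again so the latest code is deployed.
sam build --template template_no_auth.yaml

# b) Deploy to AWS; --guided prompts for stack name, region and ECR repository.
sam deploy --guided
```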
Follow the prompts that guide you through the deployment configuration. Most of the settings I used were the default values, with a few exceptions.
SAM will upload the latest build of your application onto a managed Amazon Elastic Container Registry (Amazon ECR) during the deployment phase.
SAM will also output a list of CloudFormation events detailing the deployment of the requested AWS resources for your application.
The final output will detail the API gateway URL to invoke the lambda function.
c) Invoke your function by replacing the URL in the code below with the URL from the output above.
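For example, with the requests library; the URL below is a placeholder, so substitute the endpoint printed in the SAM outputs.

```python
import requests

# Placeholder URL: replace with the API Gateway endpoint from the SAM output.
api_url = "https://<api-id>.execute-api.<region>.amazonaws.com/dev/predict"

payload = {"bucket": bucket_name, "key": "validation/test_features.joblib"}
response = requests.post(api_url, json=payload)
print(response.json())
```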
Congratulations! 🎉🎉 If you have reached this milestone, we have successfully deployed a KNN classifier as a lambda function on AWS.
However, as previously mentioned, the exposed API is currently not secure, and anyone with the URL can execute your function. There are many ways to secure lambda functions with API gateway; however, that is beyond the scope of this guide.
d) To terminate and delete the deployed AWS resources, use the command below, replacing [NAME_OF_STACK] with the name of your application. Documentation can be found here.
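The deletion command is likely along these lines; sam delete tears down the CloudFormation stack and the resources it created.

```bash
# Delete the deployed CloudFormation stack and all the resources it provisioned.
sam delete --stack-name [NAME_OF_STACK]
```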
Summary
The versatility of lambda functions in production cannot be overstated. API-driven execution of lambda functions, as demonstrated in this project, is one of the many event-driven ways lambda functions can be activated. In addition to being a cost-effective solution, lambda functions require less maintenance, as AWS handles the bulk of resource and infrastructure management. This gives developers more time to focus their attention elsewhere.
In this guide, we have trained, tested and deployed a machine learning model on AWS Lambda. First, a K-nearest neighbour classifier was trained on the MNIST data set. The trained model was packaged together with a lambda function containing the predictive logic, using Docker. With SAM, the dockerised container was tested locally before being deployed on AWS as a CloudFormation stack, where the model was served behind an API endpoint.
If you have reached the end of this guide, I hope you have learned something new. Leave a comment if you have any issues and I will be more than happy to help.
Please do follow me on LinkedIn, Medium or Twitter (@iLloydHamilton) for more data science-related content.
Come learn with me at CodeClan.
Watch this space.