Flask endpoint vs SageMaker endpoint


I want to build a simple web app where a person enters some parameters of a car and my machine learning model predicts the car's price from those parameters. I want to learn AWS, so I want to deploy and host everything there.

By checking websites and tutorials I identified the following steps I need to do:

  1. Collect data, train the model
  2. Build a Flask API around the pickled model to serve predictions
  3. Create a beautiful CSS/HTML front end
  4. Create a Docker image
  5. Push the Docker image to AWS ECR and upload the model artifact to S3
  6. Create a SageMaker prediction endpoint
  7. Create an API endpoint with Chalice

What I don’t understand is:

  1. Why do I need to create a SageMaker endpoint (and a Chalice endpoint) if I already have a Flask endpoint that predicts the price? Can’t I just spin up an EC2 instance that runs the Flask app and returns the prediction?
  2. Are the steps I described the most efficient way to build a web app with an ML model and deploy it to AWS?

I would be happy to hear your opinion!


There are many different architectures that would achieve what you are trying to do.

Here is one that has worked for me in achieving something similar:

1) Set up AWS S3/RDS for data storage and collection. You can use S3 both to store training data and as a place where users upload data from your web app. You can use RDS to store metadata and keep track of the items in your S3 bucket.

2) Use Elastic Beanstalk to host your web app. I’ve built a few Django apps (instead of Flask) and found them easy to integrate and deploy with Elastic Beanstalk. Elastic Beanstalk also comes with features such as load balancing and auto scaling that help you manage traffic to your website.

3) Use SageMaker to deploy your models. Once a model is deployed, your web app can use Boto3, the AWS SDK for Python, to send input data to the model's endpoint and read the predictions back.
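
To make step 3 a bit more concrete, here is a minimal sketch of how the web app might call a deployed endpoint through Boto3's `sagemaker-runtime` client. The function name `predict_price`, the endpoint name, the CSV payload, and the JSON response are all assumptions for illustration; the real formats depend on what your model container accepts and returns.

```python
import json


def predict_price(features, endpoint_name, client=None):
    """Send one row of car features to a SageMaker endpoint and return the parsed reply.

    Assumes the endpoint accepts a single CSV line and answers with JSON;
    adjust both to match your own container.
    """
    if client is None:
        # Imported lazily so the function can be exercised with a stub client.
        import boto3
        client = boto3.client("sagemaker-runtime")

    # Serialize the features as one CSV line, e.g. "2015,60000,1.6".
    payload = ",".join(str(value) for value in features)

    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=payload,
    )

    # The response body is a stream; read it fully, then decode and parse it.
    return json.loads(response["Body"].read().decode("utf-8"))
```

From a Flask or Django view you would call something like `predict_price([2015, 60000, 1.6], "car-price-endpoint")`, where the endpoint name is whatever you chose when deploying the model.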

The general idea is to split up the data, the web app, and the models into separate parts so that you can easily replace one part of your architecture with another if you find a better solution that fits.
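
One way to keep the model replaceable inside the web app is to hide it behind a small interface. The sketch below is only an illustration of that idea: the names `PricePredictor`, `LocalModelPredictor`, `RemoteEndpointPredictor`, and `quote_price` are hypothetical, and the local variant assumes a model object with a scikit-learn style `predict` method.

```python
from typing import Callable, Protocol, Sequence


class PricePredictor(Protocol):
    """Anything that can turn a row of car features into a price."""

    def predict(self, features: Sequence[float]) -> float: ...


class LocalModelPredictor:
    """Wraps an in-process model (e.g. an unpickled one) that has predict(rows)."""

    def __init__(self, model):
        self._model = model

    def predict(self, features: Sequence[float]) -> float:
        # scikit-learn style models expect a 2D input and return one value per row.
        return float(self._model.predict([list(features)])[0])


class RemoteEndpointPredictor:
    """Delegates to a callable that sends the features to a hosted model."""

    def __init__(self, invoke: Callable[[list], float]):
        self._invoke = invoke

    def predict(self, features: Sequence[float]) -> float:
        return float(self._invoke(list(features)))


def quote_price(predictor: PricePredictor, features: Sequence[float]) -> str:
    """Web-app code depends only on the interface, not on where the model runs."""
    return f"Estimated price: {predictor.predict(features):.2f}"
```

With this shape, moving from a model loaded inside the web app to one hosted elsewhere means constructing a different predictor; the view code that calls `quote_price` stays the same.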
