How to Productionize and Deploy Machine Learning Models?

by | May 14, 2024


Building a machine learning model with good predictive performance is an important skill, but it is only the beginning. The true value of a model is realized only when it is successfully deployed and used in real-world applications. Deployment skills enable data scientists to transform theoretical models into practical tools that solve actual problems.

Increasingly, organizations seek data scientists who can (1) develop sophisticated models and (2) deploy these models effectively. As a result, data scientists who can also deploy machine learning models are significantly more valuable to current and prospective employers. In fact, model deployment is no longer just a good-to-have ability; it is fast becoming a fundamental expectation in the field of data science.

Given this, we created a practical guide that (1) explains the broad process of model deployment, and (2) provides a simple code implementation to serve as a framework for those interested in deploying their own machine learning models.

Learning Objectives

Here are the learning objectives of this article:

  • Understand the difference between the research and production phases for machine learning models
  • Learn what serialization entails, and why it matters for productionizing machine learning models
  • Learn how Flask is used to productionize a machine learning model
  • Learn how to deploy the model on the cloud (with AWS as an example)

This article assumes that you are already familiar with (1) Python programming and (2) machine learning models.

The Transition from Research Phase to Production Phase

Before diving into the technicalities of model deployment, it’s crucial to understand that a machine learning project can be broken down into two phases:

  • Research Phase
  • Production Phase

For the benefit of those who are new to machine learning, the Research Phase consists of the following steps:

  • Data Collection: Involves gathering relevant data from various sources, which may include databases, APIs, or manual data entry.
  • Data Preprocessing: This step focuses on cleaning and preparing the collected data for analysis. Tasks include handling missing values, data transformation, normalization, and feature engineering.
  • Exploratory Data Analysis: This phase involves exploring the dataset to understand its characteristics and relationships between variables. Visualization techniques are often employed to identify patterns, outliers, and correlations within the data.
  • Model Selection: Involves choosing the appropriate machine learning algorithm(s) based on the problem domain, dataset characteristics, and desired outcomes. This step may also include feature selection or dimensionality reduction techniques.
  • Model Training: Utilizes the selected machine learning algorithm(s) to train the model on the prepared dataset. The model learns patterns and relationships from the training data to make predictions or classifications on new, unseen data.

Once the model is developed, we begin the Production Phase, which involves:

  • Model Evaluation: Assessing the performance of the trained model using evaluation metrics appropriate for the task at hand. Common metrics include accuracy, precision, recall, F1-score, and mean squared error, depending on whether the problem is classification or regression.
  • Model Serialization: Serialization refers to the process of saving the trained model to disk in a format that can be easily loaded and utilized by other systems or applications. This ensures the model’s portability and accessibility for deployment.
  • Creating a Deployment Environment: Setting up the infrastructure and environment necessary to deploy the trained model into production. This may involve deploying the model on cloud platforms, creating APIs for model inference, and integrating with existing systems or applications.
  • Monitoring and Updating: Once the model is deployed, it’s crucial to continuously monitor its performance and behavior in real-world scenarios. Monitoring helps detect any degradation in performance or concept drift, prompting updates or retraining of the model to maintain its accuracy and relevance over time.

Understanding the Dataset

Now that you have understood the broad process for transitioning from research phase to production phase, let’s take a look at the implementation using a very simple example. This will then allow you to focus on the core of this article – which is the deployment itself.

Wine Quality

Here we use the Wine Quality dataset from Kaggle, which contains data on various wines. This dataset is well suited to regression models, where the goal is to predict the quality score from the wine’s measured properties.

Model Training

First, we train a regression model. We will use a simple linear regression model for this example, although in practice, you might choose a more complex model (such as ensemble models) depending on the context of the problem. 
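The training step described above can be sketched as follows. This is a minimal illustration rather than the article’s original code: the function name `train_wine_model` is hypothetical, and it assumes the dataset has a numeric target column named `quality` and that every other column is a numeric feature. It also holds out a test split so the model can be evaluated before it is serialized.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split


def train_wine_model(df, target="quality", test_size=0.2, seed=42):
    """Fit a linear regression on every column except `target`.

    Returns the fitted model, the feature column order, and the
    mean squared error on the held-out test split.
    """
    X = df.drop(columns=[target])
    y = df[target]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    model = LinearRegression()
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    return model, list(X.columns), mse


# Example usage (the file name is an assumption; match it to your download):
#   wine = pd.read_csv("winequality-red.csv")
#   model, feature_names, mse = train_wine_model(wine)
#   print(f"Test MSE: {mse:.3f}")
```

Returning the feature column order is a deliberate choice: the deployed service must receive features in the same order the model was trained on, so it helps to record that order at training time.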

Model Serialization

After training, we serialize the model using pickle to prepare it for deployment. Model serialization is the process of converting a trained model into a format that can be easily stored, transferred, or deployed. It is a necessary step in productionizing the model because:

  • It allows for the preservation of the model’s state — including its learned parameters and configurations — ensuring that the model can be reused, shared, or deployed without the need to retrain it from scratch. This not only saves computational resources but also time and effort.
  • It facilitates the deployment of models in different environments. A serialized model can be easily integrated into various application ecosystems, be it on a local server or the cloud.
  • It enables consistency in model performance, as the exact trained model state can be replicated across different platforms and applications. 

Here is how we serialize our model:
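A minimal sketch of serialization with Python’s built-in `pickle` module is shown below. The helper names `save_model` and `load_model` are hypothetical; the file name `wine_model.pkl` matches the one used later in this article.

```python
import pickle
from pathlib import Path


def save_model(model, path="wine_model.pkl"):
    """Write the fitted model to disk in pickle format."""
    with Path(path).open("wb") as f:
        pickle.dump(model, f)


def load_model(path="wine_model.pkl"):
    """Load a pickled model. Only unpickle files from a source you trust."""
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

Two caveats are worth keeping in mind. Unpickling can execute arbitrary code, so never load a pickle file from an untrusted source. In addition, a pickled scikit-learn model is not guaranteed to load correctly under a different scikit-learn version, so it is safest to install the same library versions on the server as were used for training.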

Flask Application for Deployment

Next, we use Flask to wrap the model in a RESTful web service, which allows clients to interact with the model via HTTP requests. Once a model is trained and serialized, Flask can be used to build an API endpoint that accepts data inputs, passes them through the model, and returns predictions or analysis results. In layman’s terms, your client can access your machine learning model via a URL instead of you having to send them your Jupyter notebook to run each time they require a prediction.

This approach simplifies the integration of machine learning models into existing web applications and broader IT systems. Here is how we use Flask:
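A minimal sketch of such a service follows. The route `/predict`, the JSON key `features`, and the factory function `create_app` are assumptions made for illustration, not the article’s original code. The model is passed in as an argument so that the app can be exercised with any object exposing a scikit-learn style `predict` method.

```python
from flask import Flask, jsonify, request


def create_app(model):
    """Build a Flask app that serves predictions from `model`."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])
    def predict():
        # silent=True returns None instead of raising on invalid JSON.
        payload = request.get_json(silent=True)
        if not payload or "features" not in payload:
            return jsonify(error="Expected JSON with a 'features' list"), 400
        # scikit-learn expects a 2D input: one row per sample.
        prediction = model.predict([payload["features"]])
        return jsonify(quality=float(prediction[0]))

    return app


# Example wiring (assumes wine_model.pkl exists in the working directory):
#   import pickle
#   with open("wine_model.pkl", "rb") as f:
#       app = create_app(pickle.load(f))
#   app.run(port=5000)
```

The `features` list must follow the same column order that was used during training. Note also that `app.run` starts Flask’s built-in development server, which is convenient for testing but is not intended for production traffic.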

This code sets up a Flask server with a prediction endpoint. The endpoint accepts feature data through a POST request and returns the quality prediction.

Deploying the Machine Learning Model on AWS

Once we have productionized the model, the next step is to choose a reliable cloud provider and deploy the model on the cloud. In this case, we will use AWS, one of the leading cloud service providers, which offers robust and scalable options for deploying machine learning models. Here’s how you can deploy your Wine Quality prediction model on AWS.

Preparing Your Model for AWS Deployment

Before deploying, ensure your model is trained, evaluated, and serialized. The serialized model file (wine_model.pkl) will be used in the deployment process.

Step 1: Setting up an AWS Account

If you don’t already have an AWS account, create one at AWS Management Console. You’ll need to provide some basic information and payment details.

Step 2: Choosing the Right AWS Service

For model deployment, AWS offers several services. The most common ones are:

  • Amazon EC2: Suitable for a more manual setup where you control the server instances.
  • AWS Lambda and Amazon API Gateway: For a serverless architecture, allowing auto-scaling and pay-per-use pricing.
  • Amazon SageMaker: Provides an end-to-end machine learning service, which is easier to use for those unfamiliar with cloud deployments.

For this guide, let’s choose Amazon EC2 for its flexibility and control.

Step 3: Launching an EC2 Instance

  1. Log in to the AWS Management Console and navigate to the EC2 dashboard.
  2. Launch a new instance by selecting an appropriate Amazon Machine Image (AMI), like Ubuntu Server.
  3. Choose an instance type. For simple models, ‘t2.micro’ might be sufficient.
  4. Configure instance details as needed, including network and subnet settings.
  5. Add storage if the default storage is insufficient for your needs.
  6. Configure security groups to set up firewall rules. Ensure that you allow HTTP and SSH traffic.
  7. Review and launch the instance. You’ll be prompted to select or create a new key pair for SSH access. Download this key pair, as you’ll need it for SSH access.

Step 4: Deploying the Model on EC2

  1. Connect to your EC2 instance using SSH with the downloaded key pair.
  2. Set up your environment on the EC2 instance. This includes installing Python, Flask, and any other necessary libraries.
  3. Upload your serialized model file (wine_model.pkl) to the instance. You can use SCP (Secure Copy Protocol) or a similar tool.
  4. Create a Flask app similar to the one you created for local deployment. This app will use the uploaded model to make predictions.
  5. Run your Flask app on the EC2 instance. You can use nohup or a similar tool to keep the app running in the background.

Step 5: Accessing Your Deployed Model

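Once your Flask app is running on the EC2 instance, you can access it using the instance’s public DNS or IP address together with the port the app listens on (Flask’s development server defaults to port 5000). For this to work, the security group must allow inbound traffic on that port, and the app must be bound to 0.0.0.0 rather than Flask’s default of 127.0.0.1, which only accepts connections from the instance itself. The model is then deployed in the cloud and can handle requests sent over the internet.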
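To illustrate what a client call might look like, here is a small sketch using only the Python standard library. The helper names, the `/predict` path, the `features` JSON key, and port 5000 are all assumptions; adjust them to match how your Flask app is actually configured.

```python
import json
from urllib import request as urlrequest


def build_predict_request(host, features, port=5000, path="/predict"):
    """Build (but do not send) a JSON POST request for the prediction endpoint."""
    body = json.dumps({"features": list(features)}).encode("utf-8")
    return urlrequest.Request(
        url=f"http://{host}:{port}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def get_prediction(host, features, timeout=10, **kwargs):
    """Send the request and return the decoded JSON response."""
    req = build_predict_request(host, features, **kwargs)
    with urlrequest.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (replace the placeholder with your instance's public DNS or IP,
# and pass one value per feature, in training order):
#   result = get_prediction("<your-ec2-public-dns>", feature_values)
```

Separating request construction from sending makes it easy to inspect exactly what will be sent before any network traffic reaches the instance.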


As a data scientist, your job is not over once the model is deployed. After deployment, it is essential to continuously monitor and update the model. This includes tracking performance, recognizing data drift or environmental changes, and adjusting the model accordingly. This matters because real-world performance will very likely deviate from the performance you measured during the research phase.
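As one simple illustration of drift monitoring, the sketch below compares the mean of each feature in recent production data against the training data, measured in units of the training standard deviation. This is a basic heuristic chosen for illustration, not a complete monitoring solution; the function names and the default threshold of 1.0 are arbitrary assumptions.

```python
from statistics import mean, stdev


def mean_shift_scores(reference, live):
    """Per-feature shift of the live mean, in reference standard deviations.

    `reference` and `live` map feature name -> list of observed values.
    Each reference list needs at least two values.
    """
    scores = {}
    for name, ref_values in reference.items():
        shift = abs(mean(live[name]) - mean(ref_values))
        spread = stdev(ref_values)
        if spread == 0:
            # A constant reference feature: any change counts as infinite shift.
            scores[name] = 0.0 if shift == 0 else float("inf")
        else:
            scores[name] = shift / spread
    return scores


def drifted_features(reference, live, threshold=1.0):
    """Names of features whose shift score exceeds `threshold`."""
    scores = mean_shift_scores(reference, live)
    return sorted(name for name, score in scores.items() if score > threshold)
```

A flagged feature does not prove that the model has degraded, but it is a useful prompt to re-evaluate the model on recent labelled data and consider retraining.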


Deploying a machine learning model is a skill that bridges the gap between theory and practical application. This guide offers a simple framework and a brief look at the broad process of deploying a machine learning model. There is, of course, much more one can do. For instance, we can refactor the code to follow OOP principles and use CI/CD to automate deployment of the model.

Those interested in learning how to deploy machine learning models in a structured manner can check out our AI300: Deploying Machine Learning Systems to the Cloud course. Learners without Python programming or machine learning experience can check out the AI100: Python Programming & Data Visualisation and AI200: Applied Machine Learning courses, respectively.
