Deploy ML models using Flask

2 min read · Nov 28, 2020

Recently I got a coding task to develop an end-to-end solution that finds the k most frequent words across an arbitrary number of text files. This post focuses on the solution design rather than the algorithmic part of the problem.

There’s a popular saying in MLOps: if you want to deploy a model, just Flask it. I used a Flask server so that the user can upload files via a web page. The architecture is designed to be extensible. Currently only one file type (.txt) is supported, but it can be extended to include other types such as PDF, DOC and so on.
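One way to picture this extension point is a small registry that maps file extensions to reader functions. This is only a sketch: the names `READERS`, `register_reader`, and `read_text` are hypothetical and are not taken from the repository, and only a `.txt` reader is registered, mirroring the single type supported today.

```python
from pathlib import Path
from typing import Callable, Dict

# Hypothetical registry: file extension -> function that turns raw bytes into text.
READERS: Dict[str, Callable[[bytes], str]] = {}


def register_reader(extension: str):
    """Decorator that registers a reader for one file extension (e.g. ".txt")."""
    def decorator(func: Callable[[bytes], str]) -> Callable[[bytes], str]:
        READERS[extension.lower()] = func
        return func
    return decorator


@register_reader(".txt")
def read_txt(data: bytes) -> str:
    # Assumes UTF-8 input; undecodable bytes are replaced rather than raising.
    return data.decode("utf-8", errors="replace")


def read_text(filename: str, data: bytes) -> str:
    """Dispatch on the file extension; unsupported types raise ValueError."""
    extension = Path(filename).suffix.lower()
    if extension not in READERS:
        raise ValueError(f"Unsupported file type: {filename!r}")
    return READERS[extension](data)
```

Supporting PDF or DOC would then mean adding one more decorated function, without touching the dispatch code.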

The directory structure for a Flask app is pretty simple. At the top level, we have the app folder. Inside it is app.py, which is the starting point for the server. The templates folder holds the HTML pages.

app.py contains the code where the model is loaded and predictions are made. I use Counter from the standard library's collections module to get the most frequent words and the NLTK package to get the word occurrences in sentences.
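As a rough illustration of the counting step, the sketch below uses `Counter.most_common` to pick the k most frequent words. It assumes a simple regex tokenizer and a tiny hard-coded stop-word set as stand-ins for NLTK, so its output may differ from the actual app; `top_k_words` and `STOP_WORDS` are hypothetical names, not code from the repository.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Tiny illustrative stop-word set (an assumption); NLTK ships a much fuller list.
STOP_WORDS = {"a", "an", "and", "the", "is", "of", "to", "in"}


def top_k_words(texts: Iterable[str], k: int) -> List[Tuple[str, int]]:
    """Return the k most frequent non-stop words across all texts."""
    counts: Counter = Counter()
    for text in texts:
        # Lowercase, then keep runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(k)


print(top_k_words(["The cat sat.", "The cat ran to the mat."], 1))
# [('cat', 2)]
```

Keeping a single `Counter` across all files means the ranking reflects the whole upload rather than any one document.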

To make this code platform agnostic, I have created a Dockerfile for it. The Dockerfile contains the following:

FROM python:3.8.4-slim-buster

RUN apt-get update && apt-get clean

ENV LANG en_US.UTF-8
ENV LANGUAGE en_US:en
ENV LC_ALL en_US.UTF-8

COPY ./ .

RUN pip3 install -r requirements.txt
RUN python -m nltk.downloader stopwords
RUN python -m nltk.downloader punkt
RUN mkdir -p /uploads

ENV PYTHONPATH="$PYTHONPATH:/"


EXPOSE 5000
CMD python app/app.py

The code can be run using the following commands.

Build the Docker image

docker build -t name:Dockerfile .

Run the image

docker run --publish 5000:5000 --name nm name:Dockerfile

In the browser, go to
http://localhost:5000/

In the Most common field, type how many of the most frequent words you want to find. In the Choose file field, add a zip file that contains all the text documents. There should be no folders within the zip file. I have provided a sample zip file in the uploads folder, which you will find when you unzip the code.

https://github.com/pranav-kohli/most-frequent-words/
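The upload rule described above (a flat zip containing only text documents) could be enforced along these lines. This is a sketch using the standard library's `zipfile` module, not the repository's actual code; the function name `read_txt_files_from_zip` and the choice to raise `ValueError` on folders are assumptions.

```python
import io
import zipfile
from typing import Dict


def read_txt_files_from_zip(zip_bytes: bytes) -> Dict[str, str]:
    """Return {filename: text} for every .txt file at the top level of the archive."""
    documents: Dict[str, str] = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            # Reject archives that contain folders, matching the upload rule.
            if info.is_dir() or "/" in info.filename:
                raise ValueError(f"Folders are not allowed in the zip: {info.filename!r}")
            if info.filename.lower().endswith(".txt"):
                raw = archive.read(info)
                # Assumes UTF-8 text; undecodable bytes are replaced.
                documents[info.filename] = raw.decode("utf-8", errors="replace")
    return documents
```

Working on the uploaded bytes in memory avoids writing the archive to disk before validating it, at the cost of holding the whole upload in RAM.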

Improvements

Instead of a Flask server, we could use an asynchronous server such as Tornado, which can help scale up the number of requests we can process at a time.
A caching layer for requests should be added; this could be done using Flask-Caching.
Since we are using Docker images, we can scale out using a Kubernetes + Knative stack along with the Istio service mesh. Knative provides serverless capabilities along with traffic routing and load balancing on top of Kubernetes. It can scale the number of pods based on the request load and scales down to zero when there are no requests.
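To make the caching idea a little more concrete, here is a minimal in-process sketch that uses only the standard library. It is not Flask-Caching, which would take the place of the dictionary in a real deployment. The sketch assumes results can be keyed on a SHA-256 digest of the uploaded bytes together with k; `cached_top_k`, `_CACHE`, and the `compute` callback are hypothetical names.

```python
import hashlib
from typing import Callable, Dict, List, Tuple

Result = List[Tuple[str, int]]

# Hypothetical in-memory cache: (digest of upload, k) -> previously computed result.
_CACHE: Dict[Tuple[str, int], Result] = {}


def cached_top_k(upload: bytes, k: int, compute: Callable[[bytes, int], Result]) -> Result:
    """Return the stored result for an identical (upload, k) pair, computing it only once."""
    key = (hashlib.sha256(upload).hexdigest(), k)
    if key not in _CACHE:
        _CACHE[key] = compute(upload, k)
    return _CACHE[key]
```

This cache is unbounded and local to one process, so a setup with several pods would likely need eviction and a shared backend instead.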

Written by Pranav Kohli
