Traditional approaches to development are difficult to maintain when using complex machine learning models in production. Training a deep learning model on a laptop or local machine can be slow. As a result, we typically use cloud machines with more powerful hardware to both train and run our machine learning models. This is good practice because the complex computation is abstracted behind a service that other applications call over HTTP as necessary. In this tutorial, we will make a pre-trained deep learning model named Word2Vec available to other services by building a REST API from the ground up.
An Ubuntu 16.04 server instance with at least 4GB RAM. For testing and development purposes, an instance with 4GB RAM is sufficient.
Understanding of how to use the Linux operating system to create/navigate/edit folders and files
Word embeddings are a recent development in natural language processing and deep learning that has driven rapid progress in both fields. Word embeddings are essentially vectors that each correspond to a single word, such that the vectors capture the meaning of the words. This can be demonstrated by certain phenomena, such as the vector relationship king - queen = boy - girl. Word vectors are used to build everything from recommendation engines to chatbots that work with English text.
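To make the analogy concrete, here is a minimal sketch using hand-written two-dimensional vectors. The numbers are invented purely for illustration (real Word2Vec vectors are learned from text and have hundreds of dimensions), and the names toy_vectors and subtract are hypothetical.

```python
# Toy 2-D "embeddings": the first axis loosely stands for royalty,
# the second for gender. These values are made up, not taken from Word2Vec.
toy_vectors = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "boy": (0.0, 1.0),
    "girl": (0.0, -1.0),
}


def subtract(a, b):
    """Element-wise difference of two equal-length vectors."""
    return tuple(x - y for x, y in zip(a, b))


royal_offset = subtract(toy_vectors["king"], toy_vectors["queen"])
child_offset = subtract(toy_vectors["boy"], toy_vectors["girl"])

# Both differences isolate the same direction in the toy space.
print(royal_offset)  # (0.0, 2.0)
print(child_offset)  # (0.0, 2.0)
```

With learned embeddings the two offsets are only approximately equal, so in practice they are compared with a similarity measure rather than checked for exact equality.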
Word embeddings are not random; they are generated by training a neural network. A recent, powerful word embedding implementation from Google is Word2Vec, which is trained by predicting words that appear next to other words in a language. For example, for the word "cat", the neural network will predict words such as "feline". This intuition of words appearing near each other allows us to place them in vector space.
However, in practice, we tend to use models pre-trained by large organizations such as Google in order to prototype quickly and simplify deployment. In this tutorial we will download and use Google's Word2Vec pre-trained word embeddings. We can do this by running the following command in our working directory.
The word embedding model we downloaded is in the .magnitude format. This format is backed by SQLite, which lets us query the model efficiently without loading it entirely into memory, making it well suited to production servers. Since we need to be able to read the .magnitude format, we'll install the pymagnitude package. We'll also install flask to later serve the deep learning predictions made by the model.
pip3 install pymagnitude flask
We'll also record our dependencies with the following command. This creates a file named requirements.txt that lists our installed Python libraries so we can re-install them at a later time.
pip3 freeze > requirements.txt
To begin, we'll create a file named model.py to handle opening and querying the word embeddings.
Next, we'll add the following lines to model.py to import Magnitude.
from pymagnitude import Magnitude
vectors = Magnitude('GoogleNews-vectors-negative300.magnitude')
We can play around with the pymagnitude package and the deep learning model by using the query method, providing an argument for a word.
cat_vector = vectors.query('cat')
print(cat_vector)
For the core of our API, we will define a function that returns how similar in meaning two words are. This is the backbone of many deep learning solutions for things such as recommendation engines (e.g. showing content with similar words).
We can play around with this idea by using the similarity and most_similar methods.
print(vectors.similarity("cat", "dog"))
print(vectors.most_similar("cat", topn=100))
We implement the similarity calculator as follows. This function will be called by the Flask API in the next section. Note that it returns a real value between -1 and 1, with higher values indicating words that are closer in meaning.
def similarity(word1, word2):
    return vectors.similarity(word1, word2)
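To show what a similarity score between two vectors means, here is a minimal pure-Python sketch of cosine similarity. It assumes that the score returned by the model is a cosine similarity, which is an assumption about the library's internals; the helper name cosine_similarity and the example vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero, equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))   # same direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))   # unrelated: 0.0
print(cosine_similarity([1.0, 0.0], [-1.0, 0.0]))  # opposite: -1.0
```

Cosine similarity in general ranges from -1 to 1, although scores for pairs of everyday words tend to be positive.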
We'll create our server in a file named service.py with the following contents. We import Flask and request to handle our server capabilities, and we import the similarity function from the module we wrote earlier.
from flask import Flask, request
from model import similarity

app = Flask(__name__)

@app.route("/", methods=['GET'])
def welcome():
    return "Welcome to our Machine Learning REST API!"

@app.route("/similarity", methods=['GET'])
def similarity_route():
    word1 = request.args.get("word1")
    word2 = request.args.get("word2")
    return str(similarity(word1, word2))

if __name__ == "__main__":
    app.run(port=8000, debug=True)
Our server is rather bare bones, but can easily be extended by creating more routes using the @app.route decorator.
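One possible extension is to validate the query parameters before calling the model, since a request without word1 or word2 would otherwise pass None to the similarity function. The helper below is a hypothetical sketch, not part of the service above; it accepts any mapping with a get method, so a plain dictionary stands in for Flask's request.args here.

```python
def parse_words(args):
    """Return ((word1, word2), None) on success or (None, error_message)."""
    word1 = args.get("word1", "").strip()
    word2 = args.get("word2", "").strip()
    # Collect the names of any parameters that are absent or blank.
    missing = [
        name
        for name, value in (("word1", word1), ("word2", word2))
        if not value
    ]
    if missing:
        return None, "Missing query parameter(s): " + ", ".join(missing)
    return (word1, word2), None


print(parse_words({"word1": "cat", "word2": "dog"}))
print(parse_words({"word1": "cat"}))
```

Inside a route, the error message could be returned together with a 400 status code, for example by returning the tuple (error, 400).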
We can run our Flask server by running the following commands to activate our virtual environment, install our packages, and run its associated Python file.
source venv/bin/activate
pip3 install -r requirements.txt
python3 service.py
Our server will be available at localhost:8000. We can query our model at localhost:8000/similarity?word1=cat&word2=dog and view the response either in our browser or through another HTTP client.
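Another service could call the endpoint with Python's standard library, as in the sketch below. The helper names similarity_url and fetch_similarity are hypothetical, and the sketch assumes the server is running locally on port 8000 and responds with the score as plain text.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumes the port passed to app.run


def similarity_url(word1, word2):
    """Build the request URL, escaping the words for use in a query string."""
    query = urlencode({"word1": word1, "word2": word2})
    return f"{BASE_URL}/similarity?{query}"


def fetch_similarity(word1, word2):
    """Call the running API and parse the plain-text response as a float."""
    with urlopen(similarity_url(word1, word2)) as response:
        return float(response.read().decode("utf-8"))


print(similarity_url("cat", "dog"))
# With service.py running, the score can be fetched with:
# print(fetch_similarity("cat", "dog"))
```

Building the query string with urlencode keeps words containing spaces or special characters from breaking the URL.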