Machine Learning

Step-by-step tutorial to Machine Learning on Google Cloud

January 20, 2020

Hi there! If you were looking for a guide to ML (Machine Learning) on the Google Cloud platform, you have come to the right place! In this tutorial, we will take a detailed step-by-step look at Machine Learning on Google’s Cloud platform. By the end of this tutorial, you will be able to:

  • Understand Google Cloud Machine Learning engine and TensorFlow.
  • Understand the advantages of using Google Cloud ML engine.
  • Know the basics of training an ML model and using it for predictive analysis.
  • Package or compile your code and place it on Google’s cloud network.
  • Configure and request a machine learning training job.
  • Monitor an ML training job while it executes.
  • Use hyperparameter tuning to maximise an ML model’s prediction accuracy.
  • Deploy your model on the cloud to use it for predictions.
  • Know how to request online predictions from the ML models hosted with Cloud ML engine.
  • Know how to execute Google cloud ML engine batch prediction jobs.
  • Know how to migrate models from Cloud ML Beta.
  • Understand how to find and debug errors in your model.

Let us begin this tutorial without any more delay!

What is the Google Cloud ML engine?

Google Cloud ML engine is a cloud platform service that provides the tools and functions needed to run TensorFlow model training applications in the cloud. At its core is a REST API, a set of RESTful services that help you maintain ML models, manage jobs and versions, and make predictions on the models hosted on the platform.

The engine enables users to build machine learning models that work on any type and size of data. Data can be preprocessed with Google Cloud Dataflow, and the engine can access data stored in Google BigQuery, Google Cloud Storage, etc.

Why is the Google Cloud ML engine becoming a popular choice for ML?

For over fifteen years, Google has proved its mettle in internet search, cloud service platforms and cloud computing. It is now becoming popular for training machine learning models, deploying them and crunching Big Data analytics so organisations can derive meaningful insights. The reasons behind this popularity are easy to see.

The first (and stand-out) reason is Google’s end-to-end security system. The same security net that keeps Gmail and other Google apps secure protects the ML models and applications developed on Google Cloud. Therefore, users can be assured of their work’s integrity and security.

The second reason is the ‘developer-friendly’ platform infrastructure. Google employs a massive workforce of developers, which has helped it create a tailor-made solution for developers who work with cloud computing and optimise it for their specific needs. The third reason is affordability. A user is only charged for training their model and getting predictions, billed by the number of minutes of cloud computing used. Managing an ML model or application is free of charge on this platform.

With the Google Cloud platform, you can use preprocessed data from Google Cloud Dataflow, build ML models, and deploy them on the web. The platform is a custom solution aimed at developers engaged in building predictive analytics models, which makes Google's Cloud ML engine a great choice for predictive analytics. Here's why:

  • End-to-end security for ML models and other TensorFlow applications.
  • A tailor-made platform for developers, built by developers.
  • Affordable ML-based predictive analysis, with users charged only for the computing power used.
  • Big Data analytics backed by Google’s inexpensive data storage.

Let us now get started with the basics of training and predicting models on the Google cloud.

Basics of training and prediction of ML models on Google cloud ML engine

In this part, we will walk through the steps involved in building, training and obtaining predictions from Machine Learning models on the Google Cloud platform. To do so, it is important that you understand TensorFlow.

  • Understanding TensorFlow

TensorFlow is an open-source library that can be used to perform a range of computing operations using dataflow programming and is commonly used to create Machine Learning models like neural networks. It was developed by the Google Brain team and is written in Python, C++ & CUDA. Let us see how it works.

*This tutorial assumes that the reader is familiar with programming in Python, the concept of arrays and the basics of machine learning.

Functioning of TensorFlow

TensorFlow Core, the lowest level API among the many APIs offered by TensorFlow, provides the user with total programming control. That is, it allows for deep levels of control over models to ML researchers and developers. The higher level APIs lie atop the TensorFlow Core, and are typically easier to learn and use than the TensorFlow core API. These higher level APIs enable consistent workflow between different users and make repetitive tasks easier. For example, tf.estimator is a higher level API that makes it easier to manage datasets, training, estimators and inference. 

However, to work effectively with the higher-level APIs, it is essential to first understand how TensorFlow Core works, since that reveals a model's internal operations.

TensorFlow core: So what are Tensors?

In TensorFlow, ‘tensors’ represent the central unit of data. A tensor is nothing but a set of primitive values, represented by an array that can have ‘n’ number of dimensions. The number of dimensions of a tensor is represented by a tensor’s ‘rank.’

Consider these examples:

7  # rank 0 tensor: a scalar with shape []
[4., 5., 6.]  # rank 1 tensor: a vector with shape [3]
[[4., 5., 6.], [7., 8., 9.]]  # rank 2 tensor: a matrix with shape [2, 3]
[[[1., 2., 3.]], [[4., 5., 6.]]]  # rank 3 tensor with shape [2, 1, 3]
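
To make rank and shape concrete, here is a small sketch in plain Python that works them out for nested lists like the ones above. The helper names shape_of and rank_of are made up for this illustration and are not part of TensorFlow.

```python
def shape_of(value):
    """Return the shape of a nested list as a list of dimension sizes."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        # Descend into the first element to measure the next dimension.
        value = value[0] if value else None
    return shape


def rank_of(value):
    """The rank is simply the number of dimensions."""
    return len(shape_of(value))


examples = [
    7,
    [4., 5., 6.],
    [[4., 5., 6.], [7., 8., 9.]],
    [[[1., 2., 3.]], [[4., 5., 6.]]],
]
for example in examples:
    print(rank_of(example), shape_of(example))
```

The sketch assumes the nested lists are regular, i.e. every sub-list at the same depth has the same length, which is also what a tensor requires.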

Now that you know what tensors are, let's keep moving!

How to import TensorFlow?

To import TensorFlow, use the following statement in Python.

import tensorflow as tf

This command allows Python users access to all the classes, methods and symbols in TensorFlow.

Understanding the computational graph

The core TensorFlow programs can be divided into two sections:

  1. Creating the computational graph
  2. Running/executing the computational graph

A series of TensorFlow operations arranged in a graph of ‘nodes’ makes a computational graph. Every node can receive 0 or more tensors as an input. In return, every node produces a tensor as an output. There is also a node which is ‘constant.’ Such a node does not take inputs and its output value is stored internally. Let us see this with an example.

Example: Creating floating point Tensors (node1 & node2)

node1 = tf.constant(2.0, dtype=tf.float32)
node2 = tf.constant(3.0)  # also tf.float32 implicitly
print(node1, node2)

The output returned is as follows;

Tensor("Const:0", shape=(), dtype=float32) Tensor("Const_1:0", shape=(), dtype=float32)

You might have noticed that the output values 2.0 and 3.0 have not been printed. This output (2.0 and 3.0) will be produced when these nodes are evaluated. To evaluate these nodes, the computational graph must run within a session. Such a session contains the state of TensorFlow runtime and its controls. 

To run a computational graph through a session, follow this;

sess = tf.Session()
print(sess.run([node1, node2]))

The output result is as follows;

[2.0, 3.0]

The above code created a Session object and then called its run method to process the computational graph and evaluate node1 and node2. Similarly, more sophisticated computations can be created by combining Tensor nodes with operations (operations are nodes too).

Here is an example of how you can build a new graph by adding two constant nodes. 

from __future__ import print_function
node3 = tf.add(node1, node2)
print("node3:", node3)
print("sess.run(node3):", sess.run(node3))

The output of the two print statements is as follows:

node3: Tensor("Add:0", shape=(), dtype=float32)
sess.run(node3): 5.0

Great! But what is a TensorBoard?

TensorBoard is a tool provided by TensorFlow that can visualise computational graphs. A graph can also be made to receive external inputs via placeholders. A placeholder is a promise that a value will be given to a node at a later stage. Here is an example that uses placeholders;

a, b = tf.placeholder(tf.float32), tf.placeholder(tf.float32)
adder_node = a + b  # shorthand for tf.add(a, b)

The above graph can also be evaluated with multiple inputs. This is achieved by using the feed_dict argument of the run method to feed values to the placeholders. For instance;

print(sess.run(adder_node, {a: 3, b: 4.5}))
print(sess.run(adder_node, {a: [1, 3], b: [2, 4]}))

The output will be;

7.5
[3. 7.]

By adding another operation to the computational graph, we can make it more complex. For example;

add_and_triple = adder_node * 3.
print(sess.run(add_and_triple, {a: 3, b: 4.5}))

The output will be 22.5.
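
To see the 'build first, run later' idea without TensorFlow, here is a toy sketch in plain Python. It is only an illustration of the concept and not how TensorFlow is implemented. The Node class and the helper functions are invented for this example.

```python
class Node:
    """A graph node: an operation plus the nodes that feed it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value


def constant(value):
    return Node("const", value=value)


def placeholder():
    return Node("placeholder")


def add(x, y):
    return Node("add", (x, y))


def multiply(x, y):
    return Node("mul", (x, y))


def run(node, feed=None):
    """Evaluate a node, looking placeholders up in the feed dictionary."""
    feed = feed or {}
    if node.op == "const":
        return node.value
    if node.op == "placeholder":
        return feed[node]
    left, right = (run(n, feed) for n in node.inputs)
    return left + right if node.op == "add" else left * right


# Building the graph computes nothing yet.
a = placeholder()
b = placeholder()
adder_node = add(a, b)
add_and_triple = multiply(adder_node, constant(3.0))

# Running it with fed values produces the numbers.
print(run(adder_node, {a: 3, b: 4.5}))
print(run(add_and_triple, {a: 3, b: 4.5}))
```

The feed dictionary plays the role of feed_dict here: the same graph can be run again with different values without being rebuilt.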


Making the computational graph trainable;

A computational graph should be modifiable, i.e. it should be able to give new outputs for the same inputs, so it can be made ‘trainable.’ This is necessary because in ML, we want the model to accept arbitrary inputs and improve its predictions. Variables enable ‘trainable parameters’ to be added to the graph. A variable is constructed with a ‘type’ and an ‘initial value.’ Refer to the following example;

W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
x = tf.placeholder(tf.float32)
linear_model = W*x + b

Constants are initialised when you call tf.constant, and their values never change. Variables, by contrast, are not initialised when you call tf.Variable. To initialise the variables in a TensorFlow program, a special operation must be called explicitly.

Here is how;

init = tf.global_variables_initializer()
sess.run(init)

Here, init is just a handle for initialising all the global variables. All the variables remain uninitialised until sess.run(init) is called.

Now, we can evaluate a linear_model for several values of ‘x’, since it is a placeholder.

print(sess.run(linear_model, {x: [1, 2, 3, 4]}))

The output turns out as;

[0.  0.30000001  0.60000002  0.90000004]

At this point, we have created a model, but we have no idea how accurate it is. To find out, we need a ‘y’ placeholder to provide the desired values. We also need to write something called a ‘loss function.’

What on Earth is a Loss function?

Well, a loss function gives us a measure of a model’s accuracy by measuring how far it deviates from the provided data. For our model, we will employ a standard loss for linear regression, which sums the squares of the differences between the model’s outputs and the provided data.

linear_model - y creates a vector where each element is the error delta of the corresponding example. Calling tf.square squares each error delta. tf.reduce_sum then adds up all the squared errors to give a single scalar value that summarises the error across all examples.

And this is how you execute it;

y = tf.placeholder(tf.float32)
squared_deltas = tf.square(linear_model - y)
loss = tf.reduce_sum(squared_deltas)
print(sess.run(loss, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]}))

The output loss value will be 23.66.
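
As a quick sanity check, the same sum-of-squares loss can be worked out in plain Python. This is a sketch that mirrors the formula, not TensorFlow code, and the function name sum_squared_error is made up for the illustration. It evaluates the loss for the starting values W = 0.3, b = -0.3 and for W = -1, b = 1.

```python
def sum_squared_error(W, b, xs, ys):
    """Sum of squared differences between W*x + b and the target y."""
    predictions = [W * x + b for x in xs]
    return sum((p - y) ** 2 for p, y in zip(predictions, ys))


xs = [1, 2, 3, 4]
ys = [0, -1, -2, -3]

initial_loss = sum_squared_error(0.3, -0.3, xs, ys)
perfect_loss = sum_squared_error(-1.0, 1.0, xs, ys)
print(round(initial_loss, 2), perfect_loss)
```

The individual errors for the starting values are 0, 1.3, 2.6 and 3.9, whose squares add up to 23.66.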

We can also improve the loss manually by reassigning W and b to the perfect values of -1 and 1. A variable is initialised to the value passed to tf.Variable, but it can be changed afterwards using operations like tf.assign.

W= -1 and b=1 are fine parameters for our model. We can therefore change their values as follows;

fixW = tf.assign(W, [-1.])
fixb = tf.assign(b, [1.])
sess.run([fixW, fixb])
print(sess.run(loss, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]}))

The loss value now becomes 0.0.

In our model, we actually ‘knew’ the perfect values of W and b. But machine learning is about finding the right model parameters automatically. Let's take a look at how we can make that happen using TensorFlow's 'optimisers'.

Training the model using the tf.train API

Optimisers in TensorFlow minimise the loss function by slowly changing each variable. In this tutorial, we will see ‘gradient descent’, a simple optimiser in TensorFlow. There are several more advanced optimisers, but we don't really need to know them all right now. All we need is to figure out what gradient descent is and how it works.

Gradient descent modifies each variable according to the magnitude of the derivative of the loss with respect to that variable. Computing symbolic derivatives by hand is tedious and error-prone.

To save us that effort, TensorFlow can produce the derivatives automatically using the function tf.gradients. It only requires a description of the model.

Let us now see how optimisers work in a model;

optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)
sess.run(init)  # reset the values to the incorrect defaults
for i in range(1000):
    sess.run(train, {x: [1, 2, 3, 4], y: [0, -1, -2, -3]})

print(sess.run([W, b]))

The model's final parameters will be:

[array([-0.9999969], dtype=float32), array([0.99999082], dtype=float32)]
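
To see what the optimiser is doing under the hood, here is a hand-written gradient descent loop in plain Python for the same model, data, learning rate and number of steps. It is a sketch of the technique rather than TensorFlow's implementation. The function name train_linear_model is invented for this example, and the gradients are derived by hand from the sum-of-squares loss.

```python
def train_linear_model(xs, ys, W=0.3, b=-0.3, learning_rate=0.01, steps=1000):
    """Fit y = W*x + b by gradient descent on the sum of squared errors."""
    for _ in range(steps):
        errors = [W * x + b - y for x, y in zip(xs, ys)]
        # d(loss)/dW = sum(2 * error * x), d(loss)/db = sum(2 * error)
        grad_W = sum(2 * e * x for e, x in zip(errors, xs))
        grad_b = sum(2 * e for e in errors)
        W -= learning_rate * grad_W
        b -= learning_rate * grad_b
    return W, b


W, b = train_linear_model([1, 2, 3, 4], [0, -1, -2, -3])
print(W, b)
```

Each step nudges W and b a little in the direction that reduces the loss, so after 1000 steps they end up very close to -1 and 1.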

Aaaand there! We just did some real machine learning!

Through a simple linear regression model, we just used the TensorFlow core code. Needless to say, more complicated models will require more complex coding, but that's beside the point. The basics are all that matter, after all. If you can master them, the whole world of Machine Learning opens up for you.

Here is the final trainable linear regression model that we just created:

import tensorflow as tf

# Model parameters
W = tf.Variable([.3], dtype=tf.float32)
b = tf.Variable([-.3], dtype=tf.float32)
# Model input and output
x = tf.placeholder(tf.float32)
linear_model = W*x + b
y = tf.placeholder(tf.float32)

# loss
loss = tf.reduce_sum(tf.square(linear_model - y))  # sum of the squares
# optimizer
optimizer = tf.train.GradientDescentOptimizer(0.01)
train = optimizer.minimize(loss)

# training data
x_train = [1, 2, 3, 4]
y_train = [0, -1, -2, -3]
# training loop
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)  # reset values to wrong
for i in range(1000):
    sess.run(train, {x: x_train, y: y_train})

# evaluate training accuracy
curr_W, curr_b, curr_loss = sess.run([W, b, loss], {x: x_train, y: y_train})
print("W: %s b: %s loss: %s" % (curr_W, curr_b, curr_loss))

On running the above code, the output produced is;

W: [-0.9999969] b: [0.99999082] loss: 5.69997e-11

You may notice in the output above that the loss is very close to zero. If you run this program yourself, the loss may not be exactly the same. Since W and b start from the fixed values 0.3 and -0.3 rather than random ones, any difference is likely to come from numerical details such as floating-point rounding on your platform.

Simplifying Machine Learning using the ‘tf.estimator’ library

tf.estimator is a high-level TensorFlow library that simplifies the mechanics of Machine Learning. It makes the following easy;

  • Executing/running training loops on the model
  • Executing/running the evaluation loops on the model
  • Managing the datasets

Basic usage of tf.estimator

To see how tf.estimator simplifies our linear regression program, let's start by importing TensorFlow along with NumPy, which we will use to load, manipulate, and preprocess the data:

import numpy as np

import tensorflow as tf

Declare the list of features. Here, we have only one numeric feature. But, there are other types of columns that we can use.

feature_columns = [tf.feature_column.numeric_column("x", shape=[1])]

An estimator is the front end used to invoke training (fitting) and evaluation (inference). There are many predefined types like linear regression, linear classification, and many neural network classifiers and regressors. The following code shows an estimator that performs linear regression:

estimator = tf.estimator.LinearRegressor(feature_columns=feature_columns)

TensorFlow provides many helper methods to read and set up data sets. Here we use two data sets: one for training and one for evaluation. We have to tell the input function how many passes over the data (num_epochs) we want and how big each batch should be (batch_size).

x_train = np.array([1., 2., 3., 4.])
y_train = np.array([0., -1., -2., -3.])
x_eval = np.array([2., 5., 8., 1.])
y_eval = np.array([-1.01, -4.1, -7, 0.])
input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_train}, y_train, batch_size=4, num_epochs=None, shuffle=True)
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_train}, y_train, batch_size=4, num_epochs=1000, shuffle=False)
eval_input_fn = tf.estimator.inputs.numpy_input_fn(
    {"x": x_eval}, y_eval, batch_size=4, num_epochs=1000, shuffle=False)
# Invoke 1000 training steps.
estimator.train(input_fn=input_fn, steps=1000)
# Evaluate the model's performance.
train_metrics = estimator.evaluate(input_fn=train_input_fn)
eval_metrics = estimator.evaluate(input_fn=eval_input_fn)
print("train metrics: %r" % train_metrics)
print("eval metrics: %r" % eval_metrics)

The output result of our model comes out as follows.

train metrics: {'average_loss': 1.4833182e-08, 'global_step': 1000, 'loss': 5.9332727e-08}
eval metrics: {'average_loss': 0.0025353201, 'global_step': 1000, 'loss': 0.01014128}
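
The two loss figures in each line are related. Assuming 'average_loss' is the mean loss per example and 'loss' is the loss per batch of four examples (an assumption about how the estimator reports its metrics), multiplying the first by the batch size should give the second. The short check below uses only the numbers printed above. The helper name loss_matches_batch_sum is made up for this sketch.

```python
import math

batch_size = 4
train_metrics = {"average_loss": 1.4833182e-08, "loss": 5.9332727e-08}
eval_metrics = {"average_loss": 0.0025353201, "loss": 0.01014128}


def loss_matches_batch_sum(metrics, batch_size):
    """Check whether loss is (approximately) average_loss * batch_size."""
    expected = metrics["average_loss"] * batch_size
    return math.isclose(expected, metrics["loss"], rel_tol=1e-6)


print(loss_matches_batch_sum(train_metrics, batch_size))
print(loss_matches_batch_sum(eval_metrics, batch_size))
```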

Notice that the evaluation loss is higher than the training loss but still close to zero, which means the model is learning properly. Congratulations! You just picked up the basics of TensorFlow core programming.

Now, let's take a look at our objective again: using Google Cloud for Machine Learning and deploying models on the web.

Getting started with the Google Cloud ML engine

Now, let's start with creating a Machine Learning model on the Google Cloud ML engine. For the sample dataset, let's use one based on the US population and train our model to predict an individual’s income category.

Before getting started, make sure that:

  • You have a GCP account with the Cloud ML engine.
  • Your Cloud storage APIs are activated.
  • You have Cloud SDK installed and initialised.
  • You have TensorFlow & TensorBoard installed on your workstation.

Overview of our ML model in Google Cloud;

We will be building a wide and deep model for ‘Predicting the income class of a person based on his/her information,’ using Deep Neural Nets (DNNs). We will use data from the US census to train our model and generate predictions. 

First, our model will learn to recognise patterns with the training data set (what and how variables influence the income of a person). Then our model will generate predictions about the income level of a person based on the information captured.

Our model will learn from this data's complex features and generate high-level abstractions and correlations between those features. We will define our model through TensorFlow’s inbuilt DNNLinearCombinedClassifier class. Therefore, all we need to do is define the data transformations that are particular to our dataset.

Begin by downloading the sample from Git repository;

The samples can be downloaded from the Git repository on macOS or in Cloud Shell on Windows. To keep the tutorial simple, let's just take a look at how you can download a sample using Cloud Shell on Windows.

  • To download the Cloud ML engine sample zip file, use this command: wget
  • Now, unzip the file using unzip and extract the cloudml-samples-master directory.
  • Now change to the estimator directory with cd cloudml-samples-master/census/estimator and run the remaining commands from there.

*Note: Google Cloud storage and the Cloud Machine Learning engine will charge you for training, running predictions, and deploying this model.

Importing the training data;

The Google Cloud storage bucket hosts the relevant data files for ML modelling. As decided earlier, we will be using the United States Census data to build a model that predicts a person’s income category. We will require the training data file and the adult.test evaluation data file from the storage bucket.

*Note: Let's first develop, train and validate our model locally, then place it on the cloud for easier iterations and debugging.

  • Download the data you need to a local directory. Create the directory with mkdir data, then copy the files into it with gsutil -m cp gs://cloudml-public/census/data/* data/
  • Now, set the TRAIN_DATA and EVAL_DATA variables to the local file paths by using the following commands: TRAIN_DATA=$(pwd)/data/ EVAL_DATA=$(pwd)/data/adult.test.csv
  • Check the requirements.txt file for the project's dependencies, and install the ones you require. Use the following command: sudo pip install -r ../requirements.txt

The data will now be stored in the data directory in comma-separated value format. The following is an example of the data format from the file;

39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K
50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K
38, Private, 215646, HS-grad, 9, Divorced, Handlers-cleaners, Not-in-family, White, Male, 0, 0, 40, United-States, <=50K
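
If you want to peek at rows like these before training, a few lines of plain Python are enough. The sketch below parses two of the sample rows with the standard csv module. The column names in CSV_COLUMNS are an assumption based on the usual layout of the census data, so check them against the sample's trainer code before relying on them.

```python
import csv
import io

# Assumed column names; verify them against the sample's trainer code.
CSV_COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "gender",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income_bracket",
]

sample = (
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
    "50, Self-emp-not-inc, 83311, Bachelors, 13, Married-civ-spouse, "
    "Exec-managerial, Husband, White, Male, 0, 0, 13, United-States, <=50K\n"
)


def parse_rows(text):
    """Turn CSV text into a list of {column name: value} dictionaries."""
    # skipinitialspace drops the blank that follows each comma.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(CSV_COLUMNS, row)) for row in reader]


rows = parse_rows(sample)
for row in rows:
    print(row["age"], row["education"], row["income_bracket"])
```

The last column, the income bracket, is the label that our model will learn to predict.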

Time to run a local trainer for our model;

The local trainer loads your Python training code and trains the model in an environment similar to that of live Cloud ML engine training. Before running it:

  • Point a MODEL_DIR variable at an output directory with MODEL_DIR=output
  • Clear the output directory of residual data from previous training runs with rm -rf $MODEL_DIR/*

Now run the training locally using the following command;

gcloud ml-engine local train \
    --module-name trainer.task \
    --package-path trainer/ \
    --job-dir $MODEL_DIR \
    -- \
    --train-files $TRAIN_DATA \
    --eval-files $EVAL_DATA \
    --train-steps 1000 \
    --eval-steps 100

Now, our model has been trained locally. It is time to evaluate the results. For this we need to inspect the ‘summary logs.’ 

Inspecting the summary logs via TensorBoard

As we know, TensorBoard is a computational graph visualization tool. Using it, we can view quantitative metrics, TensorFlow graphs, and additional information in image forms. 

Let's now launch TensorBoard and point it at the summary logs. These logs were produced during and after the training session. Follow these steps;

  • tensorboard --logdir=$MODEL_DIR --port=8080 (Launches the TensorBoard)
  • From the ‘web preview menu’, located at the top of the command line select ‘Preview on port 8080.’ 
  • Select the ‘Accuracy’ option to see a graph of ‘how the accuracy varies as the job executes.’ 
  • Use ctrl+c on the command line to shut down TensorBoard at any time.

To inspect the summary logs of any later run using TensorBoard, just repeat what we did above. The steps are:

  • tensorboard --logdir=$MODEL_DIR --port=8080 (To launch TensorBoard)
  • From the ‘web preview menu’, select the ‘Preview on port 8080’ option. 

At this point, we have trained our ML model locally in an environment that mirrors the Cloud ML engine's execution environment.

We will now see how to set our Cloud storage bucket up. Setting it up is essential as the ML engine service accesses Cloud storage for reading and writing data during the training and model batch prediction phase.

Wait, how do I set the Cloud storage bucket up?

Follow these steps;

  • Select a unique bucket name for the Cloud storage and specify it using BUCKET_NAME="your_bucket_name"
  • For example, use the project name with -mlengine appended: PROJECT_ID=$(gcloud config list project --format "value(core.project)") followed by BUCKET_NAME=${PROJECT_ID}-mlengine
  • Check the bucket name that you chose using echo $BUCKET_NAME
  • Now, let us specify the ‘Region’ for our bucket. It must be a specific location and not a multi-region location. For example, use REGION=us-central1. This creates the variable ‘REGION’ and sets it to ‘us-central1.’
  • Create the new bucket using the following command: gsutil mb -l $REGION gs://$BUCKET_NAME

*Note: We must use the region where we plan on running the Cloud ML jobs. We used ‘us-central1’ just as an example. Check the Google Cloud documentation for a list of available regions.

The next step is to upload our data into our cloud storage bucket. Follow these steps;

  • Use gsutil to copy the two files to our storage bucket: gsutil cp -r data gs://$BUCKET_NAME/data
  • Set the TRAIN_DATA and EVAL_DATA variables to point at the files: TRAIN_DATA=gs://$BUCKET_NAME/data/ EVAL_DATA=gs://$BUCKET_NAME/data/adult.test.csv
  • Copy the JSON test file ‘test.json’ to the cloud storage bucket: gsutil cp ../test.json gs://$BUCKET_NAME/data/test.json
  • Set a TEST_JSON variable to point at that file: TEST_JSON=gs://$BUCKET_NAME/data/test.json

At this point, our model has been validated and is ready to be trained on Google Cloud. We will begin by requesting a single-instance training job on the cloud.

Requesting a single-instance training job

To run the single-instance trainer, we will use the BASIC scale tier. The initial job might take a little longer to initiate. Here is how we will do it;

  • Select a unique, identifiable name for the initial training run: JOB_NAME=census_single_1
  • Specify a directory for the output generated by the Cloud ML engine, using the job name as the output directory: OUTPUT_PATH=gs://$BUCKET_NAME/$JOB_NAME. Here, ‘OUTPUT_PATH’ points to the ‘census_single_1’ directory.

Execute the following command to run single-instance cloud training for your model. Set the --verbosity flag to DEBUG to get the full logging output, which lets you retrieve accuracy, loss and other metrics.

gcloud ml-engine jobs submit training $JOB_NAME \
--job-dir $OUTPUT_PATH \
--runtime-version 1.2 \
--module-name trainer.task \
--package-path trainer/ \
--region $REGION \
-- \
--train-files $TRAIN_DATA \
--eval-files $EVAL_DATA \
--train-steps 1000 \
--verbosity DEBUG

To view the progress of your trainer, go to Google Cloud Platform console, choose ‘ML engine’ option then select ‘Jobs’ option. 

Inspecting the output of the Cloud ML training job

For single-instance cloud training, the output is stored in Google Cloud storage. For our model, it is stored in OUTPUT_PATH. List it using this command;

gsutil ls -r $OUTPUT_PATH

The output from the single-instance cloud trainer should be similar to the output from local training.

Inspecting the Stackdriver logs;

The Stackdriver logs contain the stdout and stderr log statements produced during and after the training of the model on the cloud. They show how the training code behaves in the cloud environment.

To view the logs from your job;

  • Go to the Google Cloud Platform console. Select ‘ML engine’ option, then choose ‘Jobs’ option. 
  • Select the ‘View logs’ option. 
  • Check for the term ‘Accuracy’ in the displayed log records. 
  • To view logs in your console, run the following command- gcloud ml-engine jobs stream-logs $JOB_NAME
  • You can also view the logs on TensorBoard using the same steps we followed earlier. 

Distributed training can be run on the cloud in the same way, by submitting the job with a larger scale tier such as STANDARD_1, and the Stackdriver logs can again be used to evaluate the model's results.

It is now time to tune our model for better predictive accuracy. Let us see how we can do this.

Using the Hyperparameter tuning to improve our model’s accuracy

In Google Cloud ML, hyperparameter tuning is employed to boost a model’s prediction accuracy. The hyperparameter settings are contained in the YAML file ‘hptuning_config.yaml’. Use the --config flag to include this file in your training request. Follow these steps to tune your model and improve its accuracy;

  • Create a variable that points to the configuration file, choose a new job name and set the data paths: HPTUNING_CONFIG=../hptuning_config.yaml JOB_NAME=census_core_hptune_1 TRAIN_DATA=gs://$BUCKET_NAME/data/ EVAL_DATA=gs://$BUCKET_NAME/data/adult.test.csv
  • Specify the output path and include the job name: OUTPUT_PATH=gs://$BUCKET_NAME/$JOB_NAME

Run the following command to tune the hyperparameters for your model;

gcloud ml-engine jobs submit training $JOB_NAME \
    --stream-logs \
    --job-dir $OUTPUT_PATH \
    --runtime-version 1.2 \
    --config $HPTUNING_CONFIG \
    --module-name trainer.task \
    --package-path trainer/ \
    --region $REGION \
    --scale-tier STANDARD_1 \
    -- \
    --train-files $TRAIN_DATA \
    --eval-files $EVAL_DATA \
    --train-steps 1000 \
    --verbosity DEBUG  \
    --eval-steps 100
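
To get a feel for what hyperparameter tuning does, here is a toy sketch in plain Python. It tries a handful of learning rates on the small linear regression model from earlier and keeps the one with the lowest final loss. This is a simple grid search written for illustration only. It is not the search strategy the Cloud ML engine uses, and the function name final_loss and the candidate values are made up for the example.

```python
def final_loss(learning_rate, steps=200):
    """Train y = W*x + b for a fixed number of steps and return the loss."""
    xs, ys = [1, 2, 3, 4], [0, -1, -2, -3]
    W, b = 0.3, -0.3
    for _ in range(steps):
        errors = [W * x + b - y for x, y in zip(xs, ys)]
        W -= learning_rate * sum(2 * e * x for e, x in zip(errors, xs))
        b -= learning_rate * sum(2 * e for e in errors)
    return sum((W * x + b - y) ** 2 for x, y in zip(xs, ys))


# The hyperparameter values to try (the "search space").
candidates = [0.001, 0.005, 0.01, 0.02]
results = {lr: final_loss(lr) for lr in candidates}
best_learning_rate = min(results, key=results.get)

for lr in candidates:
    print(lr, results[lr])
print("best:", best_learning_rate)
```

A tuning job does the same kind of thing at a larger scale: it runs several training trials with different settings and reports which settings gave the best value of the chosen metric.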

There! You just created, trained and tuned a machine learning model on the Google cloud ML engine. Now is the time to deploy it on the web.

Deploying your model on the web to get predictions

To deploy your ML model on the web, follow these steps.

  • Select a name for your model- MODEL_NAME=census
  • Now, start a Cloud ML engine model- gcloud ml-engine models create $MODEL_NAME --regions=$REGION
  • Specify the job output to be used- OUTPUT_PATH=gs://$BUCKET_NAME/census_dist_1
  • Trace the full path of the exported trained model using- gsutil ls -r $OUTPUT_PATH/export
  • In the listing, find the directory ‘$OUTPUT_PATH/export/Servo/<timestamp>’ and copy its path, leaving out the ‘:’ at the end.
  • Set the environment variable ‘MODEL_BINARIES’ to that path, for example ‘MODEL_BINARIES=gs://$BUCKET_NAME/census_dist_1/export/Servo/1487877383942/’. Here, $BUCKET_NAME is our cloud storage bucket name and census_dist_1 is our output directory.

Now, you should run the following command to create the first version (v1);

gcloud ml-engine versions create v1 \
--model $MODEL_NAME \
--origin $MODEL_BINARIES \
--runtime-version 1.2

Congratulations! You just deployed your model on the cloud. Now, it is time to make some predictions.

Sending a prediction request to our deployed model

Your ML model is now deployed on the Google cloud network. Send the prediction request using the following command;

gcloud ml-engine predict \
--model $MODEL_NAME \
--version v1 \
--json-instances \

Prediction outcome by the model

The prediction response will look like this.


LOGISTIC : [0.003707568161189556]

LOGITS : [-5.593664646148682]

PROBABILITIES : [0.9962924122810364, 0.003707568161189556]
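
The three values in the response are different views of the same prediction. Assuming the model is a standard binary classifier, the logistic value is the sigmoid of the logit, and the two probabilities are one minus that value and the value itself. The plain Python sketch below checks this against the numbers shown above.

```python
import math


def sigmoid(logit):
    """Squash a raw score (logit) into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-logit))


logit = -5.593664646148682
positive = sigmoid(logit)
probabilities = [1.0 - positive, positive]

print(positive)
print(probabilities)
```

In other words, the model gives roughly a 99.6% probability to the first class and about 0.4% to the second.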

Submitting a batch prediction job request 

Batch prediction requests are used for large amounts of data. They are useful when you do not need the prediction results immediately, i.e. when there is no low-latency requirement.

Follow these coding steps to get batch predictions;

  • Choose a name for the job- JOB_NAME=census_prediction_1
  • Specify the output path- OUTPUT_PATH=gs://$BUCKET_NAME/$JOB_NAME

Run the following command to get the batch prediction;

gcloud ml-engine jobs submit prediction $JOB_NAME \
--model $MODEL_NAME \
--version v1 \
--data-format TEXT \
--region $REGION \
--input-paths $TEST_JSON \
--output-path $OUTPUT_PATH/predictions

You can check the progress of your batch prediction job by running this code;

gcloud ml-engine jobs describe $JOB_NAME

When the batch prediction job completes successfully, ‘state: SUCCEEDED’ is displayed.

Reading the output summary of the batch-prediction job

Use this command- gsutil cat $OUTPUT_PATH/predictions/prediction.results-00000-of-00001

The output will look like this;

{"probabilities": [0.9962924122810364, 0.003707568161189556], "logits": [-5.593664646148682], "classes": 0, "logistic": [0.003707568161189556]}
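
Each line of the results file is a JSON object, so it is easy to post-process. The sketch below uses plain Python to parse the line shown above and pick the class with the highest probability. The function name summarise is made up for this example.

```python
import json

line = (
    '{"probabilities": [0.9962924122810364, 0.003707568161189556], '
    '"logits": [-5.593664646148682], "classes": 0, '
    '"logistic": [0.003707568161189556]}'
)


def summarise(result_line):
    """Return the most likely class index and its probability."""
    result = json.loads(result_line)
    probabilities = result["probabilities"]
    predicted = max(range(len(probabilities)), key=probabilities.__getitem__)
    return predicted, probabilities[predicted]


predicted_class, confidence = summarise(line)
print(predicted_class, confidence)
```

For a results file with many lines, the same function can be applied to each line in turn.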

The final clean up step!

To avoid additional charges on the Cloud ML engine, you need to clean up the training and prediction data from the directories. Here is how you can do it;

  • Go to the ‘Terminal window.’ 
  • Run- gsutil rm -r gs://$BUCKET_NAME/$JOB_NAME

Upon successful cleanup, messages similar to the following will be displayed;

Removing gs://my-awesome-bucket/just-a-folder/cloud-storage.logo.png#1456530077282000...
Removing gs://my-awesome-bucket/…

Use the same command to clean up other directories that you created for this project.

Congratulations! You just successfully created, trained, tested and deployed a machine learning model on the Google cloud network.

Want to learn more about Machine Learning on different platforms? Well, all you have to do is stay tuned!