Ultimate beginner's guide to Caffe for Deep Learning

You have arrived at the right place if you’re looking for information on Caffe for Deep Learning projects. This article is a guide for beginners who are finding their way around Deep Learning with the Caffe framework. By the end of this guide, you will know:

What is Deep Learning?
What is Caffe software?
Why is Caffe a popular choice for Deep Learning?
How to install Caffe software on your machine?
How to run your first program using Caffe?

Preprocessing the data
Labeling the data
Converting images into LMDB dataset
Data augmentation
Setting up the architecture of your deep learning model
Customizing Python layers between input and output
Executing forward and backward pass for loss layers
Deploying your Deep learning network
Monitoring your deep learning model
Training your deep learning network model
Testing your deep learning network
Running the predictions
Visualization
Knowing the performance parameters of the model

Recommended resources for a deeper understanding of Caffe

Let us get started with our objective!

#1. What is Deep Learning?

Deep Learning is a subfield of Machine Learning based on learning data representations rather than on task-specific algorithms. The learning can be supervised, semi-supervised or unsupervised. Deep learning models are built from Artificial Neural Networks, which are loosely inspired by the structure and function of the human brain.

A deep learning model learns from examples (for instance, samples with positive and negative labels). The model is trained on a preprocessed dataset of specific examples, and then makes predictions based on that training.

Here is a link that you can refer to for more information on deep learning.

Let us now switch our focus to Caffe software.

#2. What is Caffe software?

Caffe is an open-source deep learning framework. It is written in C++ and offers Python and MATLAB interfaces. It has been developed by Berkeley AI Research (BAIR), with contributions from community developers.

Caffe has been designed with expression, speed, modularity, openness, and community support in mind, to enable seamless creation of Deep Learning models.

Let us look at the features that make Caffe a popular choice for Deep Learning projects.

#3. Why is Caffe a popular choice for Deep Learning?

Caffe has been designed for speed, open-source ML development, expressive architecture, and community support. These features make the Caffe framework a popular choice for building Deep Learning models. They are explained below:

Extensible Code: The Caffe framework ships with well-maintained code, algorithms, and reference models. Over time, developers and researchers have contributed many changes that make it a powerful platform for deep learning projects.
Expressive Architecture: Caffe’s expressive and modular architecture speeds up the development of machine learning applications. Models and their optimization are defined through configuration, without hard-coding. The user can switch between GPU (Graphics Processing Unit) and CPU with a single flag, train the model on a GPU machine, and then deploy it to commodity clusters or mobile devices.
Speed: Speed is another feature that makes Caffe a popular choice for Deep Learning. With a single Nvidia K40 GPU, Caffe can process over 60 million images per day. That translates to 1 millisecond/image for inference and 4 milliseconds/image for learning. More recent library versions and hardware are faster still. This makes Caffe a good candidate for deploying deep learning models at industry level as well as for research experiments.
Community support: Users of Caffe can get support through the caffe-users group and GitHub. Various startup prototypes, academic research projects, and industrial applications in vision, speech, and multimedia have been built on Caffe with the support of the Caffe community. Interestingly, the Caffe team likes to call its developers ‘Brewers’!
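
The per-image timings quoted above can be turned into a daily throughput with simple arithmetic. The sketch below is only an illustration: the function name is made up for this example, and it assumes the GPU processes images back to back for 24 hours with no I/O overhead, so the results are upper bounds.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def images_per_day(ms_per_image):
    """Images processed in 24 hours at a fixed per-image latency."""
    return SECONDS_PER_DAY * 1000 / ms_per_image


# Inference at 1 ms/image and learning at 4 ms/image.
print(images_per_day(1.0))
print(images_per_day(4.0))
```

At 1 ms/image this gives 86.4 million images per day, which is consistent with the figure of over 60 million; at 4 ms/image it gives 21.6 million.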

Now that you are familiar with some notable features of the Caffe framework that make it a powerful platform for Deep Learning, let us see how you can download and install it to get started with it.

#4. How to download and install Caffe software?

In this part, you will go through the steps involved in downloading and installing Caffe to build Deep Learning models.

Prerequisites for Using Caffe

The following prerequisites are needed to install and run the Caffe framework on your machine.

Mandatory dependencies

Caffe has several dependencies that need to be installed before using it.

CUDA

CUDA is a parallel computing platform and programming model created by Nvidia. It is required for running Caffe in GPU (Graphics Processing Unit) mode.

Install CUDA library version 7 or newer; version 6 also works.

BLAS

BLAS or Basic Linear Algebra Subprograms is a set of algorithms for performing common linear algebra operations like scalar multiplications, dot products, vector additions, matrix multiplications and linear combinations.

Install ATLAS, MKL or OpenBLAS
If you are using OpenBLAS, set BLAS := open in ‘Makefile.config’.

C++ libraries

Caffe requires the Boost C++ libraries.

Install the Boost libraries via Boost.org. The version must be 1.55 or newer.

Install the following libraries as well:

Protobuf
Glog
Gflags
Hdf5

Optional dependencies

Following are some optional dependencies that can be installed as per the user’s preference.

OpenCV

OpenCV is an open source computer vision library for commercial and academic use. It is supported by all the major operating systems and offers C, C++, Java, and Python interfaces.

Install OpenCV version 2.4 or higher via OpenCV.org

IO Libraries

You can install the input-output libraries lmdb and leveldb. Note that leveldb also requires the ‘snappy’ compression library.

cuDNN

cuDNN is Nvidia’s CUDA Deep Neural Network library for GPU-accelerated processing of deep neural networks. It provides highly tuned implementations of standard routines like normalization, pooling, forward and backward convolution, and activation layers.

Install cuDNN version 6 to accelerate Caffe in GPU mode. Install cuDNN and then uncomment the USE_CUDNN := 1 flag in ‘Makefile.config’ before building Caffe.
Doing this will speed up your Caffe models; the acceleration is automatic.
To use Caffe in CPU-only mode, i.e., without CUDA, uncomment CPU_ONLY := 1 in ‘Makefile.config’.

PyCaffe and Matcaffe dependencies

For Python Caffe, you need to install Python version 2.7 or Python version 3.3+, as well as the Boost-provided ‘boost.python’ library.
For MATLAB Caffe, you need to install MATLAB with ‘mex’ compiler.

Steps to install PyCaffe

Here are the steps to install PyCaffe (Caffe for Python) on your machine, assuming that you have installed all the prerequisites like C++, Python, CUDA and the optional dependencies you need.

The following shell script downloads the latest version of Caffe and builds it. Save it to a shell file and run it from a terminal on your Ubuntu system.

# Set up here how many cores you want to use during the installation:
NUMBER_OF_CORES=2
 
cd
sudo apt-get update
sudo DEBIAN_FRONTEND=noninteractive apt-get upgrade -y -q -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" # If you are OK with all defaults
 
sudo apt-get install -y libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev
sudo apt-get install -y --no-install-recommends libboost-all-dev
sudo apt-get install -y libatlas-base-dev
sudo apt-get install -y python-dev
sudo apt-get install -y python-pip git
 
# For Ubuntu 14.04
sudo apt-get install -y libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler
 
git clone https://github.com/LMDB/lmdb.git
cd lmdb/libraries/liblmdb
sudo make
sudo make install
 
# More pre-requisites
sudo apt-get install -y cmake unzip doxygen
sudo apt-get install -y protobuf-compiler
sudo apt-get install -y libffi-dev python-dev build-essential
sudo pip install lmdb
sudo pip install numpy
sudo apt-get install -y python-numpy
sudo apt-get install -y gfortran # required by scipy
sudo pip install scipy # required by scikit-image
sudo apt-get install -y python-scipy # in case pip failed
sudo apt-get install -y python-nose
sudo pip install scikit-image # to fix https://github.com/BVLC/caffe/issues/50
 
# Get caffe (http://caffe.berkeleyvision.org/installation.html#compilation)
cd
mkdir caffe
cd caffe
wget https://github.com/BVLC/caffe/archive/master.zip
unzip -o master.zip
cd caffe-master
 
# Prepare Python binding (pycaffe)
cd python
for req in $(cat requirements.txt); do sudo pip install $req; done
echo "export PYTHONPATH=$(pwd):\$PYTHONPATH" >> ~/.bash_profile # to be able to call "import caffe" from Python after reboot
source ~/.bash_profile # Update shell
cd ..
 
# Compile caffe and pycaffe
cp Makefile.config.example Makefile.config
sed -i 's/^# *CPU_ONLY := 1/CPU_ONLY := 1/' Makefile.config # Uncomment CPU_ONLY: build without CUDA
sudo apt-get install -y libopenblas-dev
sed -i 's/^BLAS := atlas/BLAS := open/' Makefile.config # Use OpenBLAS instead of ATLAS
# The patterns above match the option text instead of fixed line numbers,
# so they keep working if the lines in Makefile.config.example shift
echo "export OPENBLAS_NUM_THREADS=$NUMBER_OF_CORES" >> ~/.bash_profile
mkdir build
cd build
cmake ..
cd ..
make all -j$NUMBER_OF_CORES # -j sets the number of parallel compilation jobs: typically equal to the number of physical cores
make pycaffe -j$NUMBER_OF_CORES
make test
make runtest
#make matcaffe
make distribute
 
# Bonus for other work with pycaffe
sudo pip install pydot
sudo apt-get install -y graphviz
sudo pip install scikit-learn

Adapt the following environment variables to the installation directory of Caffe on your machine.

export OPENBLAS_NUM_THREADS=4
export CAFFE_ROOT=/home/david/caffe
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export PYTHONPATH=/home/david/caffe/python:$PYTHONPATH
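
Before moving on, it can help to check that Python can actually find the caffe module with these settings. The following is a minimal sketch using only the standard library; the function name is an assumption made for this example, and it only checks whether the module can be located, not whether it was built correctly.

```python
import importlib.util
import os


def check_caffe_environment(module_name="caffe"):
    """Report whether the module can be located and which variables are set."""
    return {
        "module_found": importlib.util.find_spec(module_name) is not None,
        "CAFFE_ROOT": os.environ.get("CAFFE_ROOT"),
        "PYTHONPATH": os.environ.get("PYTHONPATH"),
    }


if __name__ == "__main__":
    for name, value in check_caffe_environment().items():
        print(name, "=", value)
```

If module_found is False, the PYTHONPATH entry most likely does not point to the python directory of your Caffe installation.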

#5. How to run your first program in Caffe?

Now that you have successfully installed Caffe, it is time to build your first program in Caffe. The following are the steps that you need to follow. Let us get started!

Step 1. Preprocessing the data for Deep learning with Caffe.

To read input data, Caffe commonly uses LMDB (Lightning Memory-Mapped Database). From Python, these databases can be written and read with the lmdb package.

The dataset of images to be fed in Caffe must be stored as a blob of dimension (N,C,H,W). N represents the size of the dataset, C means the number of channels, H reflects the height of the images, and W means the width of the images. Caffe can process the image datasets as 8-bit Chars or 32-bit Floats.
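
To make the (N,C,H,W) layout concrete, the sketch below computes where a single value sits in the flat, row-major memory of such a blob. The function name and the example shape are assumptions chosen for illustration.

```python
def blob_offset(n, c, h, w, shape):
    """Flat index of element (n, c, h, w) in a row-major (N, C, H, W) blob."""
    N, C, H, W = shape
    assert 0 <= n < N and 0 <= c < C and 0 <= h < H and 0 <= w < W
    return ((n * C + c) * H + h) * W + w


# Ten RGB images of 28 x 28 pixels.
shape = (10, 3, 28, 28)
print(blob_offset(0, 0, 0, 1, shape))  # next pixel in the same row
print(blob_offset(0, 1, 0, 0, shape))  # first pixel of the second channel
print(blob_offset(1, 0, 0, 0, shape))  # first pixel of the second image
```

The width index changes fastest and the image index slowest, which is why images stored as height x width x channels have to be transposed before they are written.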

The following method, written for an LMDB helper class, stores your images and labels in an LMDB that Caffe can read.

import lmdb
import numpy
import caffe
 
def write(self, images, labels = []):
   """
   Write a single image or multiple images and the corresponding label(s).
   The images are expected to be two-dimensional NumPy arrays with
   multiple channels (if applicable).
  
   :param images: input images as list of numpy.ndarray with height x width x channels
   :type images: [numpy.ndarray]
   :param labels: corresponding labels (if applicable) as list
   :type labels: [float]
   :return: list of keys corresponding to the written images
   :rtype: [string]
   """
      
   if len(labels) > 0:
       assert len(images) == len(labels)
      
   keys = []
   env = lmdb.open(self._lmdb_path, map_size = max(1099511627776, len(images)*images[0].nbytes))
      
   with env.begin(write = True) as transaction:
       for i in range(len(images)):
           datum = caffe.proto.caffe_pb2.Datum()
           datum.channels = images[i].shape[2]
           datum.height = images[i].shape[0]
           datum.width = images[i].shape[1]
              
           assert images[i].dtype == numpy.uint8 or images[i].dtype == numpy.float64, "currently only numpy.uint8 and numpy.float64 images are supported"
              
           if images[i].dtype == numpy.uint8:
               # tobytes() replaces the deprecated tostring() (NumPy 1.9 or higher)
               datum.data = images[i].transpose(2, 0, 1).tobytes()
           else:
               datum.float_data.extend(images[i].transpose(2, 0, 1).flat)
                  
           if len(labels) > 0:
               datum.label = labels[i]
              
           key = to_key(self._write_pointer)
           keys.append(key)
              
           transaction.put(key.encode('ascii'), datum.SerializeToString())
           self._write_pointer += 1
      
   return keys
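
The write method calls to_key, which is not defined in the listing above. A minimal sketch of such a helper follows, assuming that keys are zero-padded eight-digit strings, as the docstring of the read method in the next step suggests. The actual helper in caffe-tools may differ.

```python
def to_key(index):
    """Zero-padded eight-digit key, so that string order equals numeric order."""
    return '{:08d}'.format(index)


# LMDB keeps its entries sorted by key; zero-padding preserves insertion order.
keys = [to_key(i) for i in (10, 2, 1)]
print(sorted(keys))
```

Without the padding, the key '10' would sort before '2', and the images would be read back in a different order than they were written.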

Step 2. Labeling the data

The following Python code writes a set of images together with their labels to an LMDB and reads them back.

import numpy
import tools.lmdb_io # LMDB I/O tools in caffe-tools
 
lmdb_path = 'tests/test_lmdb'
lmdb = tools.lmdb_io.LMDB(lmdb_path)
 
# Some random images in uint8:
write_images = [(numpy.random.rand(10, 10, 3)*255).astype(numpy.uint8)]*10
write_labels = [0]*10
      
lmdb.write(write_images, write_labels)
read_images, read_labels, read_keys = lmdb.read()

The read method used above loads the images, labels, and keys back from the LMDB.

def read(self):
       """
       Read the whole LMDB. The method will return the images, the labels (if
       applicable) and the keys, i.e. the eight-digit numbers stored as
       strings, as three lists.
 
       :return: read images, labels and the corresponding keys
       :rtype: ([numpy.ndarray], [int], [string])
       """
      
       images = []
       labels = []
       keys = []
       env = lmdb.open(self._lmdb_path, readonly = True)
      
       with env.begin() as transaction:
           cursor = transaction.cursor()
          
           for key, raw in cursor:
               datum = caffe.proto.caffe_pb2.Datum()
               datum.ParseFromString(raw)
              
               label = datum.label
              
               if datum.data:
                   # frombuffer replaces the deprecated binary mode of fromstring; the resulting array is read-only
                   image = numpy.frombuffer(datum.data, dtype = numpy.uint8).reshape(datum.channels, datum.height, datum.width).transpose(1, 2, 0)
               else:
                   image = numpy.array(datum.float_data).astype(numpy.float64).reshape(datum.channels, datum.height, datum.width).transpose(1, 2, 0)
              
               images.append(image)
               labels.append(label)
               keys.append(key)
      
       return images, labels, keys

Step 3. Converting images and CSV datasets

The following code converts the Iris dataset from CSV format to LMDB. Here, args.file and args.working_directory are assumed to come from a command-line argument parser.

import tools.pre_processing
import tools.lmdb_io
lmdb_converted = args.working_directory + '/lmdb_converted'
pp_in = tools.pre_processing.PreProcessingInputCSV(args.file, delimiter = ',',
                                                  label_column = 4,
                                                  label_column_mapping = {
                                                      'Iris-setosa': 0,
                                                      'Iris-versicolor': 1,
                                                      'Iris-virginica': 2
                                                  })
pp_out_converted = tools.pre_processing.PreProcessingOutputLMDB(lmdb_converted)
pp_convert = tools.pre_processing.PreProcessingNormalize(pp_in, pp_out_converted, 7.9)
pp_convert.run()   
  
print('LMDB:')
lmdb = tools.lmdb_io.LMDB(lmdb_converted)
images, labels, keys = lmdb.read()
  
for n in range(len(images)):
   print(images[n].reshape((4)), labels[n])

Step 4. Data augmentation

Data augmentation artificially expands the dataset and helps a Deep Learning model learn invariances and robustness to noise. In caffe-tools, ‘tools.data_augmentation’ provides data augmentation techniques. Here is an example of two such functions for use with PyCaffe.

def multiplicative_gaussian_noise(images, std = 0.05):
   """
   Multiply with Gaussian noise.
  
   :param images: images (or data) in Caffe format (batch_size, channels, height, width)
   :type images: numpy.ndarray
   :param std: standard deviation of Gaussian
   :type std: float
   :return: images (or data) with multiplicative Gaussian noise
   :rtype: numpy.ndarray
   """
  
   assert images.ndim == 4
   assert images.dtype == numpy.float32
  
   return numpy.multiply(images, numpy.random.randn(images.shape[0], images.shape[1], images.shape[2], images.shape[3])*std + 1)
 
def additive_gaussian_noise(images, std = 0.05):
   """
   Add Gaussian noise to the images.
  
   :param images: images (or data) in Caffe format (batch_size, channels, height, width)
   :type images: numpy.ndarray
   :param std: standard deviation of Gaussian
   :type std: float
   :return: images (or data) with additive Gaussian noise
   :rtype: numpy.ndarray
   """
  
   assert images.ndim == 4
   assert images.dtype == numpy.float32
  
   return images + numpy.random.randn(images.shape[0], images.shape[1], images.shape[2], images.shape[3])*std
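
A short usage sketch may make the effect of such a function easier to see. The variant below is not the caffe-tools implementation: it is a self-contained re-implementation with two assumptions added for illustration, a hypothetical rng parameter for reproducible noise and a cast that keeps the float32 dtype (the functions above return float64, because numpy.random.randn produces float64 values).

```python
import numpy


def multiplicative_gaussian_noise(images, std=0.05, rng=None):
    """Multiply each value by noise drawn from N(1, std^2), keeping the dtype."""
    rng = numpy.random.default_rng() if rng is None else rng
    noise = rng.standard_normal(images.shape).astype(images.dtype) * std + 1
    return images * noise


# A batch of 4 random "images" with 3 channels of 8 x 8 values.
rng = numpy.random.default_rng(0)
batch = rng.random((4, 3, 8, 8)).astype(numpy.float32)
noisy = multiplicative_gaussian_noise(batch, std=0.05, rng=rng)

# Stack the originals and the noisy copies, doubling the batch size.
augmented = numpy.concatenate([batch, noisy], axis=0)
print(augmented.shape, augmented.dtype)
```

Stacking the originals and their noisy copies in this way is what doubles the number of samples per batch.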

Step 5. Setting up the architecture of your Deep learning model.

With PyCaffe, you can programmatically define your deep learning network’s architecture. The following code is an example of a network definition for the Iris dataset.

def iris_network(lmdb_path, batch_size):
   """
   Simple network for Iris classification.
  
   :param lmdb_path: path to LMDB to use (train or test LMDB)
   :type lmdb_path: string
   :param batch_size: batch size to use
   :type batch_size: int
   :return: the network definition as string to write to the prototxt file
   :rtype: string
   """
      
   net = caffe.NetSpec()
   net.data, net.labels = caffe.layers.Data(batch_size = batch_size, backend = caffe.params.Data.LMDB,
                                            source = lmdb_path, ntop = 2)
   net.data_aug = caffe.layers.Python(net.data,
                                       python_param = dict(module = 'tools.layers', layer = 'DataAugmentationMultiplicativeGaussianNoiseLayer'))
   net.labels_aug = caffe.layers.Python(net.labels,
                                         python_param = dict(module = 'tools.layers', layer = 'DataAugmentationDoubleLabelsLayer'))
   net.fc1 = caffe.layers.InnerProduct(net.data_aug, num_output = 12,
                                       bias_filler = dict(type = 'xavier', std = 0.1),
                                       weight_filler = dict(type = 'xavier', std = 0.1))
   net.sigmoid1 = caffe.layers.Sigmoid(net.fc1)
   net.fc2 = caffe.layers.InnerProduct(net.sigmoid1, num_output = 3,
                                       bias_filler = dict(type = 'xavier', std = 0.1),
                                       weight_filler = dict(type = 'xavier', std = 0.1))
   net.score = caffe.layers.Softmax(net.fc2)
   net.loss = caffe.layers.MultinomialLogisticLoss(net.score, net.labels_aug)
      
   return net.to_proto()

Step 6. Customizing Python layers between the input and output layers.

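With PyCaffe, you can define custom layers in Python. If layers without parameters should take part in the backward pass, write force_backward: true at the top of the network definition, as follows.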

with open(train_prototxt_path, 'w') as f:
   f.write('force_backward: true\n') # For the MNIST network it is not necessary, but for illustration purposes ...
   f.write(str(mnist_network(train_lmdb_path, train_batch_size)))

A custom layer then subclasses caffe.Layer and implements setup, reshape, forward, and backward:

class TestLayer(caffe.Layer):
   """
   A test layer meant for testing purposes which actually does nothing.
   Note, however, to use the force_backward: true option in the net specification
   to enable the backward pass in layers without parameters.
   """
 
   def setup(self, bottom, top):
       """
       Checks the correct number of bottom inputs.
      
       :param bottom: bottom inputs
       :type bottom: [numpy.ndarray]
       :param top: top outputs
       :type top: [numpy.ndarray]
       """
      
       pass
 
   def reshape(self, bottom, top):
       """
       Make sure all involved blobs have the right dimension.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
      
       top[0].reshape(bottom[0].data.shape[0], bottom[0].data.shape[1], bottom[0].data.shape[2], bottom[0].data.shape[3])
      
   def forward(self, bottom, top):
       """
       Forward propagation.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
      
       top[0].data[...] = bottom[0].data
 
   def backward(self, top, propagate_down, bottom):
       """
       Backward pass.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param propagate_down:
       :type propagate_down:
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
           
       bottom[0].diff[...] = top[0].diff[...]

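Step 7. Executing forward and backward passes for loss layers

Forward and backward passes can be illustrated with a Manhattan loss layer (spelled ManhattenLoss in the code below), which is similar to the Euclidean loss layer.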

class ManhattenLoss(caffe.Layer):
   """
   Compute the Manhatten Loss.
   """
  
   def setup(self, bottom, top):
       """
       Checks the correct number of bottom inputs.
      
       :param bottom: bottom inputs
       :type bottom: [numpy.ndarray]
       :param top: top outputs
       :type top: [numpy.ndarray]
       """
          
       if len(bottom) != 2:
           raise Exception('Need two bottom inputs for Manhatten distance.')
      
   def reshape(self, bottom, top):
       """
       Make sure all involved blobs have the right dimension.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
      
       # Check bottom dimensions.
       if bottom[0].count != bottom[1].count:
           raise Exception('Inputs of both bottom inputs have to match.')
      
       # Set shape of diff to input shape.
       self.diff = numpy.zeros_like(bottom[0].data, dtype = numpy.float32)
      
       # Set output dimensions:           
       top[0].reshape(1)
  
   def forward(self, bottom, top):
       """
       Forward propagation, i.e. compute the Manhatten loss.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
      
       scores = bottom[0].data # network output
       labels = bottom[1].data.reshape(scores.shape) # labels
      
       self.diff[...] = (-1)*(scores < labels).astype(int) \
               + (scores > labels).astype(int)
      
       top[0].data[0] = numpy.sum(numpy.abs(scores - labels)) / bottom[0].num
  
   def backward(self, top, propagate_down, bottom):
       """
       Backward pass.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param propagate_down:
       :type propagate_down:
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
      
       for i in range(2):
           if not propagate_down[i]:
               continue
          
           if i == 0:
               sign = 1
           else:
               sign = -1
          
           # top[0].diff[0] holds the loss weight, so the gradient is scaled by it:
           bottom[i].diff[...] = (sign * self.diff * top[0].diff[0] / bottom[i].num).reshape(bottom[i].diff.shape)
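
The arithmetic of this layer can be checked by hand on a tiny batch. The sketch below uses plain Python lists instead of Caffe blobs, so it runs without Caffe; the function name and the sample values are made up for illustration.

```python
def manhattan_loss(scores, labels):
    """L1 loss averaged over the batch, plus the sign of (scores - labels)."""
    num = len(scores)
    total = 0.0
    signs = []
    for score_row, label_row in zip(scores, labels):
        row = []
        for s, l in zip(score_row, label_row):
            total += abs(s - l)
            # +1 where the score is too high, -1 where it is too low, 0 if equal.
            row.append((s > l) - (s < l))
        signs.append(row)
    return total / num, signs


# Two samples with two outputs each.
scores = [[1.0, 2.0], [0.0, 4.0]]
labels = [[0.0, 2.0], [1.0, 1.0]]
loss, signs = manhattan_loss(scores, labels)
print(loss)   # (1 + 0 + 1 + 3) / 2
print(signs)
```

The sign matrix corresponds to self.diff in the forward pass; the backward pass scales it by the loss weight and divides by the batch size.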

Step 8. To perform data augmentation on the input layer and the associated labels, the following two layers are coded.

First layer

class DataAugmentationDoubleLabelsLayer(caffe.Layer):
   """
   All data augmentation layers double or quadruple the number of samples per
   batch. This layer is the base layer to double or quadruple the
   labels accordingly.
   """
      
   def setup(self, bottom, top):
       """
       Checks the correct number of bottom inputs.
      
       :param bottom: bottom inputs
       :type bottom: [numpy.ndarray]
       :param top: top outputs
       :type top: [numpy.ndarray]
       """
      
       self._k = 2
 
   def reshape(self, bottom, top):
       """
       Make sure all involved blobs have the right dimension.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
      
       if len(bottom[0].shape) == 4:
           top[0].reshape(self._k*bottom[0].data.shape[0], bottom[0].data.shape[1], bottom[0].data.shape[2], bottom[0].data.shape[3])
       elif len(bottom[0].shape) == 3:
           top[0].reshape(self._k*bottom[0].data.shape[0], bottom[0].data.shape[1], bottom[0].data.shape[2])
       elif len(bottom[0].shape) == 2:
           top[0].reshape(self._k*bottom[0].data.shape[0], bottom[0].data.shape[1])
       else:
           top[0].reshape(self._k*bottom[0].data.shape[0])
      
   def forward(self, bottom, top):
       """
       Forward propagation.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
      
       batch_size = bottom[0].data.shape[0]
       if len(bottom[0].shape) == 4:
           top[0].data[0:batch_size, :, :, :] = bottom[0].data
          
           for i in range(self._k - 1):
               top[0].data[(i + 1)*batch_size:(i + 2)*batch_size, :, :, :] = bottom[0].data
       elif len(bottom[0].shape) == 3:
           top[0].data[0:batch_size, :, :] = bottom[0].data
          
           for i in range(self._k - 1):
               top[0].data[(i + 1)*batch_size:(i + 2)*batch_size, :, :] = bottom[0].data
       elif len(bottom[0].shape) == 2:
           top[0].data[0:batch_size, :] = bottom[0].data
          
           for i in range(self._k - 1):
               top[0].data[(i + 1)*batch_size:(i + 2)*batch_size, :] = bottom[0].data
       else:
           top[0].data[0:batch_size] = bottom[0].data
          
           for i in range(self._k - 1):
               top[0].data[(i + 1)*batch_size:(i + 2)*batch_size] = bottom[0].data
          
   def backward(self, top, propagate_down, bottom):
       """
       Backward pass.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param propagate_down:
       :type propagate_down:
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
           
       pass

Second layer

class DataAugmentationMultiplicativeGaussianNoiseLayer(caffe.Layer):
   """
   Multiplicative Gaussian noise.
   """
  
   def setup(self, bottom, top):
       """
       Checks the correct number of bottom inputs.
      
       :param bottom: bottom inputs
       :type bottom: [numpy.ndarray]
       :param top: top outputs
       :type top: [numpy.ndarray]
       """
      
       pass
 
   def reshape(self, bottom, top):
       """
       Make sure all involved blobs have the right dimension.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
      
       top[0].reshape(2*bottom[0].data.shape[0], bottom[0].data.shape[1], bottom[0].data.shape[2], bottom[0].data.shape[3])
      
   def forward(self, bottom, top):
       """
       Forward propagation.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
      
       batch_size = bottom[0].data.shape[0]
       top[0].data[0:batch_size, :, :, :] = bottom[0].data
       top[0].data[batch_size:2*batch_size, :, :, :] = tools.data_augmentation.multiplicative_gaussian_noise(bottom[0].data)
      
   def backward(self, top, propagate_down, bottom):
       """
       Backward pass.
      
       :param bottom: bottom inputs
       :type bottom: caffe._caffe.RawBlobVec
       :param propagate_down:
       :type propagate_down:
       :param top: top outputs
       :type top: caffe._caffe.RawBlobVec
       """
           
       pass
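
The two layers have to agree on the order of the samples: the data layer outputs the original batch followed by the noisy copies, and the label layer repeats the labels in the same order. A minimal sketch of that pairing with plain Python lists follows; the function names and the stand-in augmentation are assumptions for this example.

```python
def stack_augmented(batch, augment):
    """Return the original samples followed by their augmented copies."""
    return list(batch) + [augment(sample) for sample in batch]


def duplicate_labels(labels, k=2):
    """Repeat the whole label list k times, matching the stacked data order."""
    return list(labels) * k


data = [1.0, 2.0, 3.0]
labels = [0, 1, 2]

augmented = stack_augmented(data, lambda x: x * 1.5)  # stand-in for noise
augmented_labels = duplicate_labels(labels)

# Every sample, original or augmented, is still paired with its own label.
for sample, label in zip(augmented, augmented_labels):
    print(sample, label)
```

If the labels were interleaved instead (0, 0, 1, 1, ...), the augmented samples would be trained against the wrong classes.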

Step 9. Deploying your Deep learning model

To deploy the model, two changes to the network definition are needed. The first is to replace the LMDB input layer with a plain Input layer, and the second is to remove the loss layer.

The LMDB input layer and the loss layer in the training definition

layer {
 name: "data"
 type: "Data"
 top: "data"
 top: "label"
 transform_param {
   scale: 0.00390625
 }
 data_param {
   source: "train_lmdb"
   batch_size: 128
   backend: LMDB
 }
}
# ...
layer {
 name: "loss"
 type: "SoftmaxWithLoss"
 bottom: "fc8"
 bottom: "label"
 top: "loss"
}

The Input layer that replaces the LMDB input layer for deployment

layer {
 name: "data"
 type: "Input"
 top: "data"
 input_param { shape: { dim: 128 dim: 1 dim: 28 dim: 28 } }
}
# ...

Step 10. Training your Deep learning model using a Solver.

A solver drives the training: it repeatedly runs forward and backward passes on the network and updates the weights.

# Assuming that the solver .prototxt has already been configured including
# the corresponding training and testing network definitions (as .prototxt).
solver = caffe.SGDSolver(prototxt_solver)
 
iterations = 1000 # Depending on dataset size, batch size etc. ...
for iteration in range(iterations):
   solver.step(1) # We could also do larger steps (i.e. multiple iterations at once).
  
   # Here we could monitor the progress by testing occasionally,
   # plotting loss, error, gradients, activations etc.

The solver configuration is stored as a ‘.prototxt’ file. ‘tools.solvers’ allows you to read and write the solver configuration.

Now that your model is ready for training, it is time to monitor it.

Step 11. Monitor your Deep learning model training progress

def count_errors(scores, labels):
   """
   Utility method to count the errors given the output of the
   "score" layer and the labels.
      
   :param scores: output of score layer
   :type score: numpy.ndarray
   :param labels: labels
   :type labels: numpy.ndarray
   :return: count of errors
   :rtype: int
   """
      
   return numpy.sum(numpy.argmax(scores, axis = 1) != labels)
  
solver = caffe.SGDSolver(prototxt_solver)
callbacks = []
 
# Callback to report loss in console. Also automatically plots the loss
# and writes it to the given file. In order to silence the console,
# use plot_loss instead of report_loss.
report_loss = tools.solvers.PlotLossCallback(100, '/loss.png') # How often to report the loss and where to plot it
callbacks.append({
   'callback': tools.solvers.PlotLossCallback.report_loss,
   'object': report_loss,
   'interval': 1,
})
  
# Callback to report error in console.
# Needs to know the training set size and testing set size and
# is provided with a function count_errors to count (or calculate) the errors
# given the labels and the network output
report_error = tools.solvers.PlotErrorCallback(count_errors, training_set_size, testing_set_size,
                                              '', # may be used for saving early stopping models, uninteresting here ...
                                              'error.png') # where to plot the error
callbacks.append({
   'callback': tools.solvers.PlotErrorCallback.report_error,
   'object': report_error,
   'interval': 500,
})
 
# Callback for saving regular snapshots using the snapshot_prefix in the
# solver prototxt file.
callbacks.append({
   'callback': tools.solvers.SnapshotCallback.write_snapshot,
   'object': tools.solvers.SnapshotCallback(),
   'interval': 500,
})
  
monitoring_solver = tools.solvers.MonitoringSolver(solver)
monitoring_solver.register_callback(callbacks)
monitoring_solver.solve(args.iterations)
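
The interval entries above control how often each callback runs. As a rough illustration of that idea, and not the actual tools.solvers.MonitoringSolver implementation, the sketch below runs a loop in which every registered callback fires whenever the iteration index is a multiple of its interval. The function run_with_callbacks and the dummy step function are hypothetical names introduced for this example.

```python
# Sketch of interval-based callbacks, assuming a callback fires whenever
# the iteration index is a multiple of its 'interval'.

def run_with_callbacks(step, callbacks, iterations):
    """Call step() once per iteration and dispatch callbacks by interval."""
    for iteration in range(iterations):
        step(iteration)  # stands in for one solver step
        for entry in callbacks:
            if iteration % entry['interval'] == 0:
                entry['callback'](iteration)


# Record the iterations at which each callback was invoked.
fired = {'loss': [], 'snapshot': []}

callbacks = [
    {'callback': lambda it: fired['loss'].append(it), 'interval': 1},
    {'callback': lambda it: fired['snapshot'].append(it), 'interval': 500},
]

run_with_callbacks(lambda it: None, callbacks, 1001)
print(len(fired['loss']), fired['snapshot'])
```

With these settings the loss callback runs on every iteration, while the snapshot callback runs only at iterations 0, 500 and 1000.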

Now comes the testing part where you will judge your deep learning model’s prediction accuracy.

First, initialize the deep learning network in test mode:

net = caffe.Net(deploy_prototxt_path, caffemodel_path, caffe.TEST)

Transforming the input data is the next step.

transformer = caffe.io.Transformer({'data': (1, image.shape[2], image.shape[0], image.shape[1])})
transformer.set_transpose('data', (2, 0, 1)) # To reshape from (H, W, C) to (C, H, W) ...
transformer.set_raw_scale('data', 1/255.) # To scale to [0, 1] ...
net.blobs['data'].data[...] = transformer.preprocess('data', image)
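
To make the two transformer settings concrete, here is a small sketch of what they amount to for a single image: moving the channel axis to the front and multiplying every value by the raw scale. It uses nested Python lists in place of a numpy array so that it runs without Caffe. The helper hwc_to_chw is a hypothetical name, and the real caffe.io.Transformer can apply further steps (such as resizing or mean subtraction) that are omitted here.

```python
def hwc_to_chw(image, raw_scale):
    """Reorder an (H, W, C) nested list to (C, H, W) and scale each value."""
    height = len(image)
    width = len(image[0])
    channels = len(image[0][0])
    return [
        [[image[y][x][c] * raw_scale for x in range(width)]
         for y in range(height)]
        for c in range(channels)
    ]


# A made-up 1x2 image with 3 channels and intensities in [0, 255].
image = [[[0, 127.5, 255], [255, 0, 51]]]
chw = hwc_to_chw(image, 1 / 255.)

print(len(chw), len(chw[0]), len(chw[0][0]))  # channels, height, width
print(chw)
```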

Step 12. Running the predictions

net.forward()
scores = net.blobs['score'].data
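
Each row of scores holds one score per class for one input, so the predicted label is the index of the largest entry. The sketch below illustrates this with made-up scores and labels (both are assumptions for the example, not outputs of a real network), using plain lists so that it runs without Caffe. The helper predict_labels is a hypothetical name.

```python
def predict_labels(scores):
    """Return the index of the highest score in each row."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in scores]


# Made-up scores for three inputs and two classes.
scores = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
labels = [1, 0, 0]

predictions = predict_labels(scores)
errors = sum(1 for p, t in zip(predictions, labels) if p != t)
error_rate = errors / float(len(labels))
print(predictions, errors, error_rate)
```

This mirrors what count_errors does with numpy.argmax: only the third input is misclassified here, so one error is counted.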

Step 13. Visualization

Visualization lets you inspect the weights that your deep neural network has learned, which helps you understand the inner workings of your model. The following function renders the kernels of a convolutional layer as a grid image, with one row per kernel and one column per input channel.

def visualize_kernels(net, layer, zoom = 5):
   """
   Visualize kernels in the given convolutional layer.
  
   :param net: caffe network
   :type net: caffe.Net
   :param layer: layer name
   :type layer: string
   :param zoom: the number of pixels (in width and height) per kernel weight
   :type zoom: int
   :return: image visualizing the kernels in a grid
   :rtype: numpy.ndarray
   """
  
   num_kernels = net.params[layer][0].data.shape[0]
   num_channels = net.params[layer][0].data.shape[1]
   kernel_height = net.params[layer][0].data.shape[2]
   kernel_width = net.params[layer][0].data.shape[3]
  
   image = numpy.zeros((num_kernels*zoom*kernel_height, num_channels*zoom*kernel_width))
   for k in range(num_kernels):
       for c in range(num_channels):
           kernel = net.params[layer][0].data[k, c, :, :]
           # cv2.resize expects the target size as (width, height).
           kernel = cv2.resize(kernel, (zoom*kernel_width, zoom*kernel_height), interpolation = cv2.INTER_NEAREST)
           kernel = (kernel - numpy.min(kernel))/(numpy.max(kernel) - numpy.min(kernel))
           image[k*zoom*kernel_height:(k + 1)*zoom*kernel_height, c*zoom*kernel_width:(c + 1)*zoom*kernel_width] = kernel
  
   return image
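
The slice arithmetic used to place each kernel in the grid is easier to follow with concrete numbers. The following sketch computes where kernel k for input channel c lands in the grid image. The helper kernel_slot and the example sizes are assumptions chosen for illustration.

```python
def kernel_slot(k, c, kernel_height, kernel_width, zoom=5):
    """Return ((row_start, row_end), (col_start, col_end)) of kernel (k, c)."""
    rows = (k * zoom * kernel_height, (k + 1) * zoom * kernel_height)
    cols = (c * zoom * kernel_width, (c + 1) * zoom * kernel_width)
    return rows, cols


# Third kernel (k = 2), second input channel (c = 1), 3x3 weights, zoom 5.
rows, cols = kernel_slot(2, 1, kernel_height=3, kernel_width=3)
print(rows, cols)  # (30, 45) (15, 30)
```

Each kernel therefore occupies a block of zoom*kernel_height by zoom*kernel_width pixels, here 15 by 15.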

Step 14. Knowing the performance parameters of the model

The following helper functions expose several other properties of the network. To get the layer names,

def get_layers(net):
   """
   Get the names of the network layers that have learnable parameters.
  
   :param net: caffe network
   :type net: caffe.Net
   :return: layer names
   :rtype: [string]
   """
  
   return [layer for layer in net.params.keys()]

To copy the weights from one network to another,

def copy_weights(net_from, net_to):
   """
   Copy weights between networks.
      
   :param net_from: network to copy weights from
   :type net_from: caffe.Net
   :param net_to: network to copy weights to
   :type net_to: caffe.Net
   """
  
   # http://stackoverflow.com/questions/38511503/how-to-compute-test-validation-loss-in-pycaffe
   params = net_from.params.keys()
   for pr in params:
       net_to.params[pr][1] = net_from.params[pr][1]
       net_to.params[pr][0] = net_from.params[pr][0]

To know the batch size,

def get_batch_size(net):
   """
   Get the batch size used in the network.
  
   :param net: network
   :type net: caffe.Net
   """
      
   return net.blobs['data'].data.shape[0]

To get the loss of the current batch, run the following code.

def get_loss(net):
   """
   Gets the loss from the training net.
      
   :param net: network to get the loss
   :type net: caffe.Net
   """
      
   return net.blobs['loss'].data

You can also compute the squared gradient magnitude for each network layer by,

gradients = []
for i in range(len(net.layers)):
   # Sum the squared gradient entries over all parameter blobs of the layer;
   # layers without parameters (e.g. pooling) contribute 0.
   gradients.append(sum(numpy.sum(numpy.square(blob.diff)) for blob in net.layers[i].blobs))
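
The quantity accumulated above is a sum of squared gradient entries, so taking its square root gives the L2 norm of the layer's gradient. Here is a minimal sketch with made-up gradient values, where plain lists stand in for the diff arrays and gradient_norm is a hypothetical helper:

```python
import math


def gradient_norm(weight_diff, bias_diff):
    """L2 norm over the flattened weight and bias gradients of one layer."""
    squared = sum(g * g for g in weight_diff) + sum(g * g for g in bias_diff)
    return math.sqrt(squared)


# Made-up flattened gradients for one layer.
norm = gradient_norm([3.0, 0.0, 0.0], [4.0])
print(norm)  # 5.0
```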

#6. Recommended resources for a deeper understanding of Caffe.

The following recommended resources offer a deeper understanding of Caffe.

Installation guides

Caffe and community

GitHub repositories

Endnote

In this guide, we saw how Caffe can be used to build deep neural networks. Try building one yourself and share your experience with us in the comment box below!
