You have arrived at the right place if you’re looking for information on Caffe for Deep Learning projects. This article is a guide for beginners who are trying to find their way around Deep Learning using the Caffe framework. By the end of this guide, you will know:
- What is Deep Learning?
- What is Caffe software?
- Why is Caffe a popular choice for Deep Learning?
- How to install Caffe software on your machine?
- How to run your first program using Caffe?
- Preprocessing the data
- Labeling the data
- Converting images into LMDB dataset
- Data augmentation
- Setting up the architecture of your deep learning model
- Customizing Python layers between input and output
- Executing forward and backward pass for loss layers
- Deploying your Deep learning network
- Monitoring your deep learning model
- Training your deep learning network model
- Testing your deep learning network
- Running the predictions
- Visualization
- Knowing the performance parameters of the model
- Recommended resources for a deeper understanding of Caffe
Let us get started with our objective!
#1. What is Deep Learning?
Deep Learning is a subfield of Machine Learning based on learning data representations and making predictions from them, rather than on hand-written, task-specific algorithms. The learning can be supervised, semi-supervised or unsupervised. Deep learning models are called Artificial Neural Networks, and their design is loosely inspired by the structure and function of the human brain.
A deep learning model learns from examples (for instance, examples with positive and negative labels). The model is trained on a pre-processed dataset of such examples, and then makes predictions based on what it learned during training.
Let us now switch our focus to Caffe software.
#2. What is Caffe software?
Caffe is an open-source deep learning framework. It is written in C++ and offers command-line, Python and MATLAB interfaces. It was developed by Berkeley AI Research (BAIR), with contributions from community developers.
It has been designed with expression, speed, modularity, openness and community support in mind, to enable seamless creation of Deep Learning models.
Let us look at the features that make Caffe a popular choice for Deep Learning projects.
#3. Why is Caffe a popular choice for Deep Learning?
Caffe has been designed for speed, open development, expressive architecture and strong community support. These features make the Caffe framework a popular choice for building Deep Learning models. They are explained below:
- Extensible Code: The Caffe framework comes with well-maintained code, algorithms and reference models. Over time, developers and researchers have contributed many changes that make it a powerful platform for deep learning projects.
- Expressive Architecture: Caffe’s expressive and modular architecture speeds up the development of machine learning applications. Models and their optimization are defined through configuration, without hard-coding. You can switch between GPU (Graphics Processing Unit) and CPU by setting a single flag, train the model on a GPU machine, and then deploy it to commodity clusters or mobile devices.
- Speed: Speed is another feature that makes Caffe a popular choice for Deep Learning. With a single Nvidia K40 GPU, Caffe can process over 60 million images per day. That translates to about 1 millisecond/image for inference and 4 milliseconds/image for learning. Recent library versions and newer hardware are faster still. This makes the Caffe framework a good candidate for industry deployment as well as for research experiments.
- Community support: Users of Caffe can get support from the caffe-users group and on GitHub. Various startup prototypes, academic research projects and industrial applications in vision, speech and multimedia have been built on Caffe, powered by the support of the Caffe community. Interestingly, the Caffe team likes to call its developers ‘Brewers’!
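As a quick sanity check on the speed figures quoted in the list above, per-image latency can be converted to daily throughput and back. This is only back-of-the-envelope arithmetic, assuming a constant per-image time and no other overhead; the function names are illustrative and not part of Caffe.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def images_per_day(ms_per_image):
    """Images processed in one day at a constant per-image latency (in ms)."""
    return SECONDS_PER_DAY * 1000.0 / ms_per_image


def ms_per_image(images_per_day_count):
    """Average per-image latency (in ms) implied by a daily throughput."""
    return SECONDS_PER_DAY * 1000.0 / images_per_day_count


inference_per_day = images_per_day(1.0)  # 1 ms/image
training_per_day = images_per_day(4.0)   # 4 ms/image
latency_at_60m = ms_per_image(60e6)      # 60 million images/day

print(inference_per_day, training_per_day, latency_at_60m)
```

Under these assumptions, 1 ms/image corresponds to 86.4 million images per day and 4 ms/image to 21.6 million, so the "over 60 million images per day" figure is in line with the inference rate rather than the training rate.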
Now that you are familiar with some notable features of the Caffe framework that make it a powerful platform for Deep Learning, let us see how you can download and install it to get started with it.
#4. How to download and install Caffe software?
In this part, you will go through the steps involved in downloading and installing the Caffe software to build Deep Learning models.
Prerequisites for Using Caffe
The following prerequisites must be in place before you install the Caffe framework on your machine.
Mandatory dependencies
These dependencies need to be installed before using Caffe.
CUDA
CUDA is a parallel computing platform and programming model created by Nvidia. It is required for running Caffe in GPU (Graphics Processing Unit) mode.
- Install the CUDA library version 6 or 7+
BLAS
BLAS or Basic Linear Algebra Subprograms is a set of algorithms for performing common linear algebra operations like scalar multiplications, dot products, vector additions, matrix multiplications and linear combinations.
- Install ATLAS, MKL or OpenBLAS
- If you are using OpenBLAS, set BLAS := open in ‘Makefile.config’.
C++ libraries
Caffe depends on several C++ libraries.
- Install the Boost C++ libraries via Boost.org. The version must be 1.55 or newer.
Install the following libraries as well:
- Protobuf
- Glog
- Gflags
- Hdf5
Optional dependencies
The following optional dependencies can be installed as per the user’s preference.
OpenCV
OpenCV is an open-source computer vision library for commercial and academic use. It is supported on all the major operating systems and offers C++, Java and Python interfaces.
- Install OpenCV version 2.4 or higher via OpenCV.org
IO Libraries
You can install input-output libraries like lmdb and leveldb. Note that leveldb requires the ‘snappy’ compression library as well.
cuDNN
cuDNN is Nvidia’s CUDA Deep Neural Network library for GPU-accelerated processing of deep neural networks. It provides highly tuned implementations of standard routines like normalization, pooling, forward and backward convolution, and activation layers.
- Install cuDNN version 6 to accelerate Caffe in GPU mode, then uncomment the USE_CUDNN := 1 flag in ‘Makefile.config’ before building Caffe.
- Doing this will speed up your Caffe models; the acceleration is automatic.
- To use Caffe without a GPU, i.e., in CPU-only mode, uncomment CPU_ONLY := 1 in ‘Makefile.config’ to build Caffe without CUDA.
PyCaffe and Matcaffe dependencies
- For Python Caffe, you need Python version 2.7 or Python version 3.3+, together with the Boost-provided ‘boost.python’ library.
- For MATLAB Caffe, you need to install MATLAB with the ‘mex’ compiler.
Steps to install PyCaffe
Here are the steps to install PyCaffe (Caffe for Python) on your machine, assuming that you have installed all the prerequisites like C++, Python, CUDA and any optional dependencies you need.
The following script downloads the latest version of Caffe and builds it. Save it as a shell file and run it from a terminal on your Ubuntu system.
# Set up here how many cores you want to use during the installation:
NUMBER_OF_CORES=2

cd
sudo apt-get update
sudo DEBIAN_FRONTEND=noninteractive apt-get upgrade -y -q -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" # If you are OK with all defaults

sudo apt-get install -y libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev
sudo apt-get install -y --no-install-recommends libboost-all-dev
sudo apt-get install -y libatlas-base-dev
sudo apt-get install -y python-dev
sudo apt-get install -y python-pip git

# For Ubuntu 14.04
sudo apt-get install -y libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler

git clone https://github.com/LMDB/lmdb.git
cd lmdb/libraries/liblmdb
sudo make
sudo make install

# More pre-requisites
sudo apt-get install -y cmake unzip doxygen
sudo apt-get install -y protobuf-compiler
sudo apt-get install -y libffi-dev python-dev build-essential
sudo pip install lmdb
sudo pip install numpy
sudo apt-get install -y python-numpy
sudo apt-get install -y gfortran # required by scipy
sudo pip install scipy # required by scikit-image
sudo apt-get install -y python-scipy # in case pip failed
sudo apt-get install -y python-nose
sudo pip install scikit-image # to fix https://github.com/BVLC/caffe/issues/50

# Get caffe (http://caffe.berkeleyvision.org/installation.html#compilation)
cd
mkdir caffe
cd caffe
wget https://github.com/BVLC/caffe/archive/master.zip
unzip -o master.zip
cd caffe-master

# Prepare Python binding (pycaffe)
cd python
for req in $(cat requirements.txt); do sudo pip install $req; done
echo "export PYTHONPATH=$(pwd):$PYTHONPATH " >> ~/.bash_profile # to be able to call "import caffe" from Python after reboot
source ~/.bash_profile # Update shell
cd ..

# Compile caffe and pycaffe
cp Makefile.config.example Makefile.config
sed -i '8s/.*/CPU_ONLY := 1/' Makefile.config # Line 8: CPU only
sudo apt-get install -y libopenblas-dev
sed -i '33s/.*/BLAS := open/' Makefile.config # Line 33: to use OpenBLAS
# Note that if Makefile.config.example ever changes, these line numbers will point at the wrong lines.
# It may be more robust to append those settings at the end of Makefile.config instead.
# A plain value is exported here, since bash does not pass array values such as (2) on to child processes.
echo "export OPENBLAS_NUM_THREADS=$NUMBER_OF_CORES" >> ~/.bash_profile
mkdir build
cd build
cmake ..
cd ..
make all -j$NUMBER_OF_CORES # -j sets the number of parallel compilation jobs, typically equal to the number of physical cores
make pycaffe -j$NUMBER_OF_CORES
make test
make runtest
#make matcaffe
make distribute

# Bonus for other work with pycaffe
sudo pip install pydot
sudo apt-get install -y graphviz
sudo pip install scikit-learn
Finally, adapt the following environment variables to your own Caffe installation directory.
export OPENBLAS_NUM_THREADS=4
export CAFFE_ROOT=/home/david/caffe
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export PYTHONPATH=/home/david/caffe/python:$PYTHONPATH
#5. How to run your first program in Caffe?
Now that you have successfully installed Caffe, it is time to build your first program in Caffe. The following are the steps that you need to follow. Let us get started!
Step 1. Preprocessing the data for Deep learning with Caffe.
To read the input data, Caffe uses LMDBs, i.e., Lightning Memory-Mapped Databases. From Python, they can be written and read with the lmdb package.
The dataset of images to be fed in Caffe must be stored as a blob of dimension (N,C,H,W). N represents the size of the dataset, C means the number of channels, H reflects the height of the images, and W means the width of the images. Caffe can process the image datasets as 8-bit Chars or 32-bit Floats.
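The (N, C, H, W) layout described above can be illustrated with plain NumPy. This is a minimal sketch, assuming the images start out as height x width x channels arrays of equal size; the helper name to_blob is hypothetical and not part of Caffe.

```python
import numpy


def to_blob(images):
    """Stack equally sized (H, W, C) images into one (N, C, H, W) array."""
    batch = numpy.stack(images, axis=0)   # (N, H, W, C)
    return batch.transpose(0, 3, 1, 2)    # (N, C, H, W)


# Five dummy RGB images of height 10 and width 12, stored as 8-bit values.
images = [numpy.zeros((10, 12, 3), dtype=numpy.uint8) for _ in range(5)]
blob = to_blob(images)

print(blob.shape, blob.dtype)
```

Here N = 5, C = 3, H = 10 and W = 12, and the 8-bit data type is preserved by the transpose.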
To write your data to an LMDB that can be read by Caffe, use the following code.
import lmdb
import numpy
import caffe

def to_key(i):
    # Helper assumed here: eight-digit, zero-padded keys such as '00000042'.
    return '{:08}'.format(i)

# Method of an LMDB wrapper class that keeps the database path in
# self._lmdb_path and the next free index in self._write_pointer.
def write(self, images, labels = []):
    """
    Write a single image or multiple images and the corresponding label(s).
    The images are expected to be two-dimensional NumPy arrays with
    multiple channels (if applicable).

    :param images: input images as list of numpy.ndarray with height x width x channels
    :type images: [numpy.ndarray]
    :param labels: corresponding labels (if applicable) as list
    :type labels: [float]
    :return: list of keys corresponding to the written images
    :rtype: [string]
    """

    if len(labels) > 0:
        assert len(images) == len(labels)

    keys = []
    env = lmdb.open(self._lmdb_path, map_size = max(1099511627776, len(images)*images[0].nbytes))

    with env.begin(write = True) as transaction:
        for i in range(len(images)):
            datum = caffe.proto.caffe_pb2.Datum()
            datum.channels = images[i].shape[2]
            datum.height = images[i].shape[0]
            datum.width = images[i].shape[1]

            assert images[i].dtype == numpy.uint8 or images[i].dtype == numpy.float64, "currently only numpy.uint8 and numpy.float64 images are supported"

            if images[i].dtype == numpy.uint8:
                # Stored as raw bytes in (C, H, W) order; tobytes() needs NumPy 1.9 or higher.
                datum.data = images[i].transpose(2, 0, 1).tobytes()
            else:
                datum.float_data.extend(images[i].transpose(2, 0, 1).flat)

            if len(labels) > 0:
                datum.label = labels[i]

            key = to_key(self._write_pointer)
            keys.append(key)

            transaction.put(key.encode('ascii'), datum.SerializeToString())
            self._write_pointer += 1

    return keys
Step 2. Labeling the data. Label the preprocessed image dataset for training the Caffe model by writing the labels to the LMDB together with the images, as in the following Python code.
import numpy
import tools.lmdb_io # LMDB I/O tools in caffe-tools

lmdb_path = 'tests/test_lmdb'
lmdb = tools.lmdb_io.LMDB(lmdb_path)

# Some random images in uint8:
write_images = [(numpy.random.rand(10, 10, 3)*255).astype(numpy.uint8)]*10
write_labels = [0]*10

lmdb.write(write_images, write_labels)
read_images, read_labels, read_keys = lmdb.read()
Now it is time to read your preprocessed and labeled data back from the LMDB. The read method is implemented as follows.
def read(self):
    """
    Read the whole LMDB. The method returns the images, the labels (if
    applicable) and the keys, which are eight-digit numbers stored as strings.

    :return: read images, labels and the corresponding keys
    :rtype: ([numpy.ndarray], [int], [string])
    """

    images = []
    labels = []
    keys = []
    env = lmdb.open(self._lmdb_path, readonly = True)

    with env.begin() as transaction:
        cursor = transaction.cursor()

        for key, raw in cursor:
            datum = caffe.proto.caffe_pb2.Datum()
            datum.ParseFromString(raw)

            label = datum.label

            if datum.data:
                # uint8 images are stored as raw bytes in (C, H, W) order.
                # numpy.frombuffer returns a read-only view; copy it before modifying.
                image = numpy.frombuffer(datum.data, dtype = numpy.uint8).reshape(datum.channels, datum.height, datum.width).transpose(1, 2, 0)
            else:
                image = numpy.array(datum.float_data).astype(numpy.float64).reshape(datum.channels, datum.height, datum.width).transpose(1, 2, 0)

            images.append(image)
            labels.append(label)
            keys.append(key)

    return images, labels, keys
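The eight-digit keys mentioned in the docstring above matter because LMDB iterates over keys in lexicographic byte order by default. The following sketch illustrates the effect with plain Python strings; the helper to_key and its zero-padded format are an assumption made for illustration.

```python
def to_key(i):
    """Format an index as an eight-digit, zero-padded key (assumed format)."""
    return '{:08}'.format(i)


indices = [0, 2, 10, 100]
padded = [to_key(i) for i in indices]
unpadded = [str(i) for i in indices]

# Sorting strings mimics the lexicographic key order of the database.
print(sorted(padded))    # same order as the numeric indices
print(sorted(unpadded))  # '10' and '100' sort before '2'
```

With zero padding, reading the database with a cursor returns the samples in the order in which they were written.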
Step 3. Converting images and CSV datasets
In the following code, the Iris dataset is converted from CSV format to LMDB.
import tools.pre_processing
import tools.lmdb_io

# args holds the parsed command-line arguments (input CSV file and working directory).
lmdb_converted = args.working_directory + '/lmdb_converted'
pp_in = tools.pre_processing.PreProcessingInputCSV(args.file, delimiter = ',', label_column = 4, label_column_mapping = {
    'Iris-setosa': 0,
    'Iris-versicolor': 1,
    'Iris-virginica': 2
})
pp_out_converted = tools.pre_processing.PreProcessingOutputLMDB(lmdb_converted)
# Read from the CSV input, normalize, and write to the LMDB output.
pp_convert = tools.pre_processing.PreProcessingNormalize(pp_in, pp_out_converted, 7.9)
pp_convert.run()

print('LMDB:')
lmdb = tools.lmdb_io.LMDB(lmdb_converted)
images, labels, keys = lmdb.read()

for n in range(len(images)):
    print(images[n].reshape((4)), labels[n])
Step 4. Data augmentation
Data augmentation helps a Deep Learning model learn invariances and robustness to noise, and it artificially expands the dataset size. The ‘tools.data_augmentation’ module of caffe-tools provides data augmentation techniques. Here is an example of two augmentation functions that can be used with PyCaffe.
import numpy

def multiplicative_gaussian_noise(images, std = 0.05):
    """
    Multiply with Gaussian noise.

    :param images: images (or data) as a four-dimensional batch
    :type images: numpy.ndarray
    :param std: standard deviation of Gaussian
    :type std: float
    :return: images (or data) with multiplicative Gaussian noise
    :rtype: numpy.ndarray
    """

    assert images.ndim == 4
    assert images.dtype == numpy.float32

    # Noise factors are drawn around 1, one per element of the batch.
    return numpy.multiply(images, numpy.random.randn(images.shape[0], images.shape[1], images.shape[2], images.shape[3])*std + 1)

def additive_gaussian_noise(images, std = 0.05):
    """
    Add Gaussian noise to the images.

    :param images: images (or data) as a four-dimensional batch
    :type images: numpy.ndarray
    :param std: standard deviation of Gaussian
    :type std: float
    :return: images (or data) with additive Gaussian noise
    :rtype: numpy.ndarray
    """

    assert images.ndim == 4
    assert images.dtype == numpy.float32

    return images + numpy.random.randn(images.shape[0], images.shape[1], images.shape[2], images.shape[3])*std
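To see what such an augmentation function does to a batch, it can be tried on random data. The sketch below repeats a compact version of the additive variant so that it runs on its own; the batch shape, the fixed seed and the final cast to 32-bit floats are assumptions made for illustration.

```python
import numpy


def additive_gaussian_noise(images, std=0.05):
    """Add zero-mean Gaussian noise with standard deviation std."""
    assert images.ndim == 4
    assert images.dtype == numpy.float32
    return images + numpy.random.randn(*images.shape) * std


numpy.random.seed(0)  # fixed seed so that the example is repeatable
batch = numpy.random.rand(4, 3, 8, 8).astype(numpy.float32)

noisy = additive_gaussian_noise(batch)
unchanged = additive_gaussian_noise(batch, std=0.0)

# Empirical standard deviation of the noise that was added.
noise_std = float(numpy.std(noisy - batch))

# The result is 64-bit because randn returns 64-bit floats; Caffe blobs
# hold 32-bit floats, so cast back before copying the data into a blob.
noisy32 = noisy.astype(numpy.float32)

print(noisy.shape, noisy.dtype, noisy32.dtype, noise_std)
```

The shape of the batch is preserved, a standard deviation of zero leaves the data untouched, and the measured noise level is close to the requested std.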
Step 5. Setting up the architecture of your Deep learning model.
With PyCaffe, you can programmatically define your deep learning network’s architecture. The following code is an example of a network definition for the Iris dataset.
import caffe

def iris_network(lmdb_path, batch_size):
    """
    Simple network for Iris classification.

    :param lmdb_path: path to LMDB to use (train or test LMDB)
    :type lmdb_path: string
    :param batch_size: batch size to use
    :type batch_size: int
    :return: the network definition as string to write to the prototxt file
    :rtype: string
    """

    net = caffe.NetSpec()
    # Input: data and labels are read from the LMDB.
    net.data, net.labels = caffe.layers.Data(batch_size = batch_size, backend = caffe.params.Data.LMDB,
                                             source = lmdb_path, ntop = 2)
    # Custom Python layers (see Step 8) that augment the data and duplicate the labels.
    net.data_aug = caffe.layers.Python(net.data,
                                       python_param = dict(module = 'tools.layers', layer = 'DataAugmentationMultiplicativeGaussianNoiseLayer'))
    net.labels_aug = caffe.layers.Python(net.labels,
                                         python_param = dict(module = 'tools.layers', layer = 'DataAugmentationDoubleLabelsLayer'))
    # Two fully connected layers with a sigmoid in between.
    net.fc1 = caffe.layers.InnerProduct(net.data_aug, num_output = 12,
                                        bias_filler = dict(type = 'xavier', std = 0.1),
                                        weight_filler = dict(type = 'xavier', std = 0.1))
    net.sigmoid1 = caffe.layers.Sigmoid(net.fc1)
    net.fc2 = caffe.layers.InnerProduct(net.sigmoid1, num_output = 3,
                                        bias_filler = dict(type = 'xavier', std = 0.1),
                                        weight_filler = dict(type = 'xavier', std = 0.1))
    # Class probabilities and loss.
    net.score = caffe.layers.Softmax(net.fc2)
    net.loss = caffe.layers.MultinomialLogisticLoss(net.score, net.labels_aug)

    return net.to_proto()
Step 6. Customizing Python layers between the input and output layers.
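With PyCaffe, you can define custom layers in Python. First, write the network definition to the prototxt file with force_backward enabled, so that the backward pass also runs in layers without parameters.
# mnist_network is a network definition function analogous to iris_network above.
with open(train_prototxt_path, 'w') as f:
    f.write('force_backward: true\n') # For the MNIST network it is not necessary, but for illustration purposes ...
    f.write(str(mnist_network(train_lmdb_path, train_batch_size)))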
Then define the layer itself:
class TestLayer(caffe.Layer):
    """
    A test layer meant for testing purposes which actually does nothing.
    Note, however, to use the force_backward: true option in the net specification
    to enable the backward pass in layers without parameters.
    """

    def setup(self, bottom, top):
        """
        Checks the correct number of bottom inputs.

        :param bottom: bottom inputs
        :type bottom: [numpy.ndarray]
        :param top: top outputs
        :type top: [numpy.ndarray]
        """

        pass

    def reshape(self, bottom, top):
        """
        Make sure all involved blobs have the right dimension.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        top[0].reshape(bottom[0].data.shape[0], bottom[0].data.shape[1], bottom[0].data.shape[2], bottom[0].data.shape[3])

    def forward(self, bottom, top):
        """
        Forward propagation.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        # Identity: copy the input to the output.
        top[0].data[...] = bottom[0].data

    def backward(self, top, propagate_down, bottom):
        """
        Backward pass.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param propagate_down:
        :type propagate_down:
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        # Identity: pass the gradient through unchanged.
        bottom[0].diff[...] = top[0].diff[...]
Step 7. Executing forward and backward passes for loss layers. The following Manhattan loss layer, which is similar to the Euclidean loss layer, shows how both passes are implemented.
class ManhattenLoss(caffe.Layer):
    """
    Compute the Manhattan loss.
    """

    def setup(self, bottom, top):
        """
        Checks the correct number of bottom inputs.

        :param bottom: bottom inputs
        :type bottom: [numpy.ndarray]
        :param top: top outputs
        :type top: [numpy.ndarray]
        """

        if len(bottom) != 2:
            raise Exception('Need two bottom inputs for Manhattan distance.')

    def reshape(self, bottom, top):
        """
        Make sure all involved blobs have the right dimension.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        # Check bottom dimensions.
        if bottom[0].count != bottom[1].count:
            raise Exception('Inputs of both bottom inputs have to match.')

        # Set shape of diff to input shape.
        self.diff = numpy.zeros_like(bottom[0].data, dtype = numpy.float32)

        # Set output dimensions:
        top[0].reshape(1)

    def forward(self, bottom, top):
        """
        Forward propagation, i.e. compute the Manhattan loss.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        scores = bottom[0].data # network output
        labels = bottom[1].data.reshape(scores.shape) # labels

        # Sign of (scores - labels), kept for the backward pass.
        self.diff[...] = (-1)*(scores < labels).astype(int) \
            + (scores > labels).astype(int)

        top[0].data[0] = numpy.sum(numpy.abs(scores - labels)) / bottom[0].num

    def backward(self, top, propagate_down, bottom):
        """
        Backward pass.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param propagate_down:
        :type propagate_down:
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        for i in range(2):
            if not propagate_down[i]:
                continue

            if i == 0:
                sign = 1
            else:
                sign = -1

            # top[0].diff[0] holds the loss weight propagated from above.
            bottom[i].diff[...] = (sign * self.diff * top[0].diff[0] / bottom[i].num).reshape(bottom[i].diff.shape)
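The forward and backward computations of the loss layer above can be checked outside of Caffe. The following is a minimal NumPy sketch, assuming a loss weight of 1 and random test data; the function names are illustrative. It compares the analytic gradient, sign(scores - labels) / N, against central finite differences.

```python
import numpy


def manhattan_loss(scores, labels):
    """Sum of absolute differences, averaged over the batch size N."""
    return numpy.sum(numpy.abs(scores - labels)) / scores.shape[0]


def manhattan_gradient(scores, labels):
    """Gradient of the loss with respect to scores: sign(scores - labels) / N."""
    return numpy.sign(scores - labels) / scores.shape[0]


rng = numpy.random.RandomState(0)
scores = rng.rand(4, 3)
labels = rng.rand(4, 3)

analytic = manhattan_gradient(scores, labels)

# Central finite differences as an independent check of the gradient.
eps = 1e-6
numeric = numpy.zeros_like(scores)
for index in numpy.ndindex(*scores.shape):
    shifted = scores.copy()
    shifted[index] += eps
    plus = manhattan_loss(shifted, labels)
    shifted[index] -= 2 * eps
    minus = manhattan_loss(shifted, labels)
    numeric[index] = (plus - minus) / (2 * eps)

max_error = numpy.max(numpy.abs(analytic - numeric))
print(max_error)
```

A check of this kind is a cheap way to catch sign or scaling mistakes before a custom layer is plugged into a network. Note that the absolute value is not differentiable where a score equals its label, so the comparison only makes sense away from that point.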
Step 8. To perform data augmentation on the input layer and the associated labels, the following two layers are coded.
First layer
class DataAugmentationDoubleLabelsLayer(caffe.Layer):
    """
    All data augmentation layers double or quadruple the number of samples per
    batch. This layer is the base layer to double or quadruple the
    labels accordingly.
    """

    def setup(self, bottom, top):
        """
        Checks the correct number of bottom inputs.

        :param bottom: bottom inputs
        :type bottom: [numpy.ndarray]
        :param top: top outputs
        :type top: [numpy.ndarray]
        """

        # Factor by which the batch is enlarged.
        self._k = 2

    def reshape(self, bottom, top):
        """
        Make sure all involved blobs have the right dimension.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        # Multiply the batch dimension by k and keep all other dimensions.
        shape = bottom[0].data.shape
        top[0].reshape(self._k*shape[0], *shape[1:])

    def forward(self, bottom, top):
        """
        Forward propagation.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        batch_size = bottom[0].data.shape[0]

        # Copy the bottom blob k times along the batch dimension;
        # the ellipsis covers blobs with one to four dimensions.
        for i in range(self._k):
            top[0].data[i*batch_size:(i + 1)*batch_size, ...] = bottom[0].data

    def backward(self, top, propagate_down, bottom):
        """
        Backward pass.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param propagate_down:
        :type propagate_down:
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        pass
Second layer
class DataAugmentationMultiplicativeGaussianNoiseLayer(caffe.Layer):
    """
    Multiplicative Gaussian noise.
    """

    def setup(self, bottom, top):
        """
        Checks the correct number of bottom inputs.

        :param bottom: bottom inputs
        :type bottom: [numpy.ndarray]
        :param top: top outputs
        :type top: [numpy.ndarray]
        """

        pass

    def reshape(self, bottom, top):
        """
        Make sure all involved blobs have the right dimension.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        # The output batch is twice as large as the input batch.
        top[0].reshape(2*bottom[0].data.shape[0], bottom[0].data.shape[1], bottom[0].data.shape[2], bottom[0].data.shape[3])

    def forward(self, bottom, top):
        """
        Forward propagation.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        batch_size = bottom[0].data.shape[0]
        # First half: the original samples; second half: their noisy copies.
        top[0].data[0:batch_size, :, :, :] = bottom[0].data
        top[0].data[batch_size:2*batch_size, :, :, :] = tools.data_augmentation.multiplicative_gaussian_noise(bottom[0].data)

    def backward(self, top, propagate_down, bottom):
        """
        Backward pass.

        :param bottom: bottom inputs
        :type bottom: caffe._caffe.RawBlobVec
        :param propagate_down:
        :type propagate_down:
        :param top: top outputs
        :type top: caffe._caffe.RawBlobVec
        """

        pass
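The two layers have to stay in step: when the data batch is doubled, the labels must be duplicated in the same order. The combined effect can be sketched with plain NumPy, outside of Caffe. The function name double_batch, the seed and the toy shapes are assumptions made for illustration.

```python
import numpy


def double_batch(data, labels, std=0.05, seed=0):
    """Return the batch followed by a noisy copy, with labels duplicated to match."""
    rng = numpy.random.RandomState(seed)
    # Multiplicative Gaussian noise: every value is scaled by a factor near 1.
    noisy = data * (rng.randn(*data.shape) * std + 1)
    aug_data = numpy.concatenate([data, noisy], axis=0)
    aug_labels = numpy.concatenate([labels, labels], axis=0)
    return aug_data, aug_labels


data = numpy.ones((3, 1, 2, 2), dtype=numpy.float32)
labels = numpy.array([0, 1, 2])

aug_data, aug_labels = double_batch(data, labels)
print(aug_data.shape, aug_labels.tolist())
```

Sample i and sample i + batch_size share the same label, which is why both layers place the copies after the originals along the batch dimension.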
Step 9. Deploying your Deep learning model
To deploy the model, two changes to the network definition are needed: the LMDB input layer is replaced by a generic Input layer, and the loss layer is removed.
Training definition, with the LMDB input layer and the loss layer that have to be removed
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "train_lmdb"
    batch_size: 128
    backend: LMDB
  }
}
# ...
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "fc8"
  bottom: "label"
  top: "loss"
}
Deployment definition, with an Input layer in place of the LMDB data layer and without the loss layer
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param {
    shape: { dim: 128 dim: 1 dim: 28 dim: 28 }
  }
}
# ...
Step 10. Training your Deep learning model using a Solver.
Your Deep learning model on Caffe can be trained with the help of a Solver.
# Assuming that the solver .prototxt has already been configured including
# the corresponding training and testing network definitions (as .prototxt);
# prototxt_solver is the path to that solver file.
solver = caffe.SGDSolver(prototxt_solver)

iterations = 1000 # Depending on dataset size, batch size etc. ...
for iteration in range(iterations):
    solver.step(1) # We could also do larger steps (i.e. multiple iterations at once).

    # Here we could monitor the progress by testing occasionally,
    # plotting loss, error, gradients, activations etc.
The solver configuration itself is stored as a ‘.prototxt’ file. ‘tools.solvers’ allows you to read and write the solver configuration.
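For orientation, a solver configuration is a short text file of key-value pairs. The sketch below generates one with plain Python. The field names are standard Caffe solver parameters, while the function name solver_prototxt, the file names and all values are placeholder assumptions that have to be adapted to your dataset.

```python
def solver_prototxt(train_net, test_net, max_iter=1000, base_lr=0.01):
    """Return a minimal SGD solver configuration as prototxt text."""
    fields = [
        ('train_net', '"%s"' % train_net),        # training network definition
        ('test_net', '"%s"' % test_net),          # testing network definition
        ('test_iter', 10),                        # test batches per evaluation
        ('test_interval', 500),                   # evaluate every 500 iterations
        ('base_lr', base_lr),                     # initial learning rate
        ('lr_policy', '"fixed"'),                 # keep the learning rate constant
        ('momentum', 0.9),
        ('max_iter', max_iter),                   # total number of iterations
        ('snapshot', 500),                        # save the weights every 500 iterations
        ('snapshot_prefix', '"snapshots/iris"'),  # placeholder path
        ('solver_mode', 'CPU'),                   # or GPU
    ]
    return ''.join('%s: %s\n' % (key, value) for key, value in fields)


text = solver_prototxt('train.prototxt', 'test.prototxt')
print(text)
```

Writing the returned text to a file gives a path that can be passed to caffe.SGDSolver. The test batch size multiplied by test_iter should roughly cover the test set.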
Now that your model is ready for training, it is time to monitor it.
Step 11. Monitor your Deep learning model training progress
def count_errors(scores, labels):
    """
    Utility method to count the errors given the output of the "score" layer and
    the labels.

    :param scores: output of score layer
    :type scores: numpy.ndarray
    :param labels: labels
    :type labels: numpy.ndarray
    :return: count of errors
    :rtype: int
    """

    return numpy.sum(numpy.argmax(scores, axis = 1) != labels)

solver = caffe.SGDSolver(prototxt_solver)
callbacks = []

# Callback to report loss in console. Also automatically plots the loss
# and writes it to the given file. In order to silence the console,
# use plot_loss instead of report_loss.
report_loss = tools.solvers.PlotLossCallback(100, '/loss.png') # How often to report the loss and where to plot it
callbacks.append({
    'callback': tools.solvers.PlotLossCallback.report_loss,
    'object': report_loss,
    'interval': 1,
})

# Callback to report error in console.
# Needs to know the training set size and testing set size and
# is provided with a function count_errors to count (or calculate) the errors
# given the labels and the network output
report_error = tools.solvers.PlotErrorCallback(count_errors, training_set_size, testing_set_size,
                                               '', # may be used for saving early stopping models, uninteresting here ...
                                               'error.png') # where to plot the error
callbacks.append({
    'callback': tools.solvers.PlotErrorCallback.report_error,
    'object': report_error,
    'interval': 500,
})

# Callback for saving regular snapshots using the snapshot_prefix in the
# solver prototxt file.
callbacks.append({
    'callback': tools.solvers.SnapshotCallback.write_snapshot,
    'object': tools.solvers.SnapshotCallback(),
    'interval': 500,
})

monitoring_solver = tools.solvers.MonitoringSolver(solver)
monitoring_solver.register_callback(callbacks)
monitoring_solver.solve(args.iterations)
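The count_errors function above can be tried on a small, hand-made batch. In this sketch the scores and labels are made-up values for a three-class problem, not the output of a real network.

```python
import numpy


def count_errors(scores, labels):
    """Count the samples whose highest-scoring class differs from the label."""
    return numpy.sum(numpy.argmax(scores, axis=1) != labels)


# One row of class scores per sample.
scores = numpy.array([[0.7, 0.2, 0.1],
                      [0.1, 0.8, 0.1],
                      [0.3, 0.4, 0.3],
                      [0.2, 0.2, 0.6]])
labels = numpy.array([0, 1, 2, 2])

errors = count_errors(scores, labels)
error_rate = errors / float(len(labels))

print(errors, error_rate)
```

The predicted classes are 0, 1, 1 and 2, so only the third sample is wrong. The labels are expected as a flat array here; labels with extra singleton dimensions would broadcast against the predictions and give a wrong count, so it is safer to flatten them first, for example with labels.reshape(-1).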
Now comes the testing part, where you will judge your deep learning model’s prediction accuracy.
First, initialize the trained network in test mode:
net = caffe.Net(deploy_prototxt_path, caffemodel_path, caffe.TEST)
Transforming the input data is the next step.
# image is a single input image as an (H, W, C) NumPy array.
transformer = caffe.io.Transformer({'data': (1, image.shape[2], image.shape[0], image.shape[1])})
transformer.set_transpose('data', (2, 0, 1)) # To reshape from (H, W, C) to (C, H, W) ...
transformer.set_raw_scale('data', 1/255.) # To scale to [0, 1] ...
net.blobs['data'].data[...] = transformer.preprocess('data', image)
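The two transformer settings above amount to a transpose followed by a scaling. A rough NumPy equivalent, given here only to make the effect visible, could look as follows; the function name preprocess and the dummy 28 x 28 image are assumptions, and the real Transformer supports further options such as resizing and mean subtraction.

```python
import numpy


def preprocess(image, raw_scale=1/255.):
    """Convert an (H, W, C) image into a (1, C, H, W) float32 array scaled by raw_scale."""
    chw = image.astype(numpy.float32).transpose(2, 0, 1)  # (H, W, C) -> (C, H, W)
    return (chw * raw_scale)[numpy.newaxis, ...]          # add the batch dimension


# A dummy white grayscale image with 8-bit values.
image = numpy.full((28, 28, 1), 255, dtype=numpy.uint8)
blob = preprocess(image)

print(blob.shape, blob.min(), blob.max())
```

The result has the (N, C, H, W) layout expected by the network’s input blob, with values in [0, 1].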
Step 12. Running the predictions
net.forward()
scores = net.blobs['score'].data
Step 13. Visualization
Visualization is important for inspecting the weights that your deep neural network has learned. To understand the inner workings of your model, run the following code, which visualizes the kernels of a convolutional layer.
import cv2
import numpy

def visualize_kernels(net, layer, zoom = 5):
    """
    Visualize kernels in the given convolutional layer.

    :param net: caffe network
    :type net: caffe.Net
    :param layer: layer name
    :type layer: string
    :param zoom: the number of pixels (in width and height) per kernel weight
    :type zoom: int
    :return: image visualizing the kernels in a grid
    :rtype: numpy.ndarray
    """

    num_kernels = net.params[layer][0].data.shape[0]
    num_channels = net.params[layer][0].data.shape[1]
    kernel_height = net.params[layer][0].data.shape[2]
    kernel_width = net.params[layer][0].data.shape[3]

    # One row per kernel, one column per input channel.
    image = numpy.zeros((num_kernels*zoom*kernel_height, num_channels*zoom*kernel_width))
    for k in range(num_kernels):
        for c in range(num_channels):
            kernel = net.params[layer][0].data[k, c, :, :]
            # cv2.resize expects the target size as (width, height).
            kernel = cv2.resize(kernel, (zoom*kernel_width, zoom*kernel_height), interpolation = cv2.INTER_NEAREST)
            # Normalize each kernel to [0, 1] for display.
            kernel = (kernel - numpy.min(kernel))/(numpy.max(kernel) - numpy.min(kernel))
            image[k*zoom*kernel_height:(k + 1)*zoom*kernel_height, c*zoom*kernel_width:(c + 1)*zoom*kernel_width] = kernel

    return image
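The tiling logic of the visualization can be tried without Caffe or OpenCV. The following sketch works on a random weight array of shape (kernels, channels, height, width), uses numpy.repeat for the nearest-neighbour zoom, and adds a guard for constant kernels; the function name kernel_grid and the random weights are assumptions made for illustration.

```python
import numpy


def kernel_grid(weights, zoom=5):
    """Tile (K, C, H, W) weights into a (K*zoom*H, C*zoom*W) image with values in [0, 1]."""
    num_kernels, num_channels, height, width = weights.shape
    image = numpy.zeros((num_kernels * zoom * height, num_channels * zoom * width))
    for k in range(num_kernels):
        for c in range(num_channels):
            kernel = weights[k, c]
            span = kernel.max() - kernel.min()
            if span > 0:
                kernel = (kernel - kernel.min()) / span
            else:
                # Constant kernel: avoid dividing by zero.
                kernel = numpy.zeros_like(kernel)
            # Nearest-neighbour zoom: repeat every weight zoom times per axis.
            kernel = numpy.repeat(numpy.repeat(kernel, zoom, axis=0), zoom, axis=1)
            image[k * zoom * height:(k + 1) * zoom * height,
                  c * zoom * width:(c + 1) * zoom * width] = kernel
    return image


weights = numpy.random.RandomState(0).randn(4, 3, 5, 5)
grid = kernel_grid(weights, zoom=2)

print(grid.shape, grid.min(), grid.max())
```

The resulting array can be saved or displayed with any image library. Each kernel is normalized on its own, so brightness is only comparable within a tile, not across tiles.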
Step 14. Knowing the performance parameters of the model. The following helper functions give access to several other properties of the network.
To get the layer names:
def get_layers(net):
    """
    Get the layer names of the network.

    :param net: caffe network
    :type net: caffe.Net
    :return: layer names
    :rtype: [string]
    """

    # net.params only contains the layers that have parameters.
    return [layer for layer in net.params.keys()]
To copy the weights of the model:
def copy_weights(net_from, net_to):
    """
    Copy weights between networks.

    :param net_from: network to copy weights from
    :type net_from: caffe.Net
    :param net_to: network to copy weights to
    :type net_to: caffe.Net
    """

    # http://stackoverflow.com/questions/38511503/how-to-compute-test-validation-loss-in-pycaffe
    for pr in net_from.params.keys():
        # Copy every parameter blob (typically weights at index 0 and biases at index 1) in place.
        for i in range(len(net_from.params[pr])):
            net_to.params[pr][i].data[...] = net_from.params[pr][i].data
To get the batch size:
def get_batch_size(net):
    """
    Get the batch size used in the network.

    :param net: network
    :type net: caffe.Net
    """

    return net.blobs['data'].data.shape[0]
To get the loss on the current batch, run the following code.
def get_loss(net):
    """
    Gets the loss from the training net.

    :param net: network to get the loss
    :type net: caffe.Net
    """

    return net.blobs['loss'].data
You can also compute the squared gradient magnitude for each network layer:
gradients = []
for layer in net.layers:
    # Squared L2 norm over all parameter blobs of the layer (typically
    # weights and biases); layers without parameters contribute 0.
    gradients.append(sum(numpy.sum(numpy.multiply(blob.diff, blob.diff)) for blob in layer.blobs))
#6. Recommended resources for a deeper understanding of Caffe.
You can refer to the following recommended resources for a deeper understanding of Caffe.
Installation guides
- http://stackoverflow.com/questions/31395729/how-to-enable-multithreading-with-caffe/31396229
- https://github.com/BVLC/caffe/wiki/Install-Caffe-on-EC2-from-scratch-(Ubuntu,-CUDA-7,-cuDNN-3)
- https://github.com/BVLC/caffe/wiki/Ubuntu-16.04-or-15.10-Installation-Guide
- https://gist.github.com/titipata/f0ef48ad2f0ebc07bcb9
- https://github.com/asampat3090/caffe-ubuntu-14.04
- https://github.com/mrgloom/Caffe-snippets
GitHub repositories
- https://github.com/nitnelave/pycaffe_tutorial
- https://github.com/pulkitag/pycaffe-utils
- https://github.com/DeeperCS/pycaffe-mnist
- https://github.com/swift-n-brutal/pycaffe_utils
- https://github.com/jimgoo/caffe-oxford102
- https://github.com/ruimashita/caffe-train
- https://github.com/roseperrone/video-object-detection
- https://github.com/pecarlat/caffeTools
- https://github.com/donnemartin/data-science-ipython-notebooks
- https://github.com/jay-mahadeokar/pynetbuilder
- https://github.com/adilmoujahid/deeplearning-cats-dogs-tutorial
- https://github.com/Franck-Dernoncourt/caffe_demos
- https://github.com/koosyong/caffestud
- https://github.com/NVIDIA/DIGITS/tree/master/examples/python-layer
Endnote
In this Caffe guide we just saw how it can be used to build deep learning neural networks. Try building one yourself and share your experience with us in the comment box below!