Deep Neural Network
This page gives you an overview of our deep neural network (DNN) APIs. Our goal is to simplify machine learning programming in common use cases. You will learn how to create a DNN regressor for regression and a DNN classifier for classification. The key difference between a regressor and a classifier is the output value. The regressor predicts a continuous value in the output layer, while the classifier predicts a discrete label.
The source code of both the DNN regressor and the DNN classifier is located in include/dtc/ml/dnn.hpp and src/ml/dnn.cpp.
Preliminaries
There are a great many DNN tutorials on the web. Some great resources are the Deep Learning course by Andrew Ng, the TensorFlow beginner guide, and Machine Learning Mastery by Jason Brownlee. Here we focus on how to use our API to create DNN estimators.
Input and Output Data Types
All DNN estimators in MLCraft use the Eigen library for matrix operations. You need to convert your data into an Eigen matrix in order to use our DNN estimators. You can find a tutorial about the Eigen library here.
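For example, suppose your samples arrive in a flat std::vector in row-major order. Below is a minimal sketch of wrapping such a buffer into an Eigen matrix; the names raw, num_samples, and num_features are illustrative only.
// Two samples, each with three features, stored row by row
std::vector<float> raw {0.1f, 0.2f, 0.3f, 0.4f, 0.5f, 0.6f};
const int num_samples {2};
const int num_features {3};
// Eigen matrices are column-major by default, so map the buffer as row-major
Eigen::MatrixXf data = Eigen::Map<
  Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>
>(raw.data(), num_samples, num_features);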
DNN Regressor
The best way to learn to use our DNN regressor is to go through an example.
We create two matrices, Eigen::MatrixXf data(30, 3) and Eigen::MatrixXf gold(30, 1), and initialize them with random values. The matrix data holds 30 samples, each with three features, and the matrix gold stores the observed value of each sample. The goal is to train a DNN model to look at a sample and predict its label value.
We create a DNN regressor with three layers:
- 1st layer: input layer with size equal to the feature size
- 2nd layer: hidden layer with ten neurons followed by sigmoid activation
- 3rd layer: output layer with one neuron to generate the measurement
Next, train the model by calling the method train, which takes six arguments: data, gold, num_epochs, mini_batch_size, learning_rate, and a lambda to call at the end of each epoch. When training completes, call infer to see the result of the model.
// Create a 30 x 3 matrix to store the samples
Eigen::MatrixXf data(30, 3);
// Create a 30 x 1 matrix to store the labels.
Eigen::MatrixXf gold(30, 1);
// For simplicity, we fill matrices with random values.
data = Eigen::MatrixXf::Random(30, 3);
gold = Eigen::MatrixXf::Random(30, 1);
// Declare a DNN regressor
dtc::ml::DnnRegressor dnn;
// Add layers into the regressor
dnn.layer<dtc::ml::FullyConnectedLayer>(data.cols(), 10, dtc::ml::Activation::SIGMOID);
dnn.layer<dtc::ml::FullyConnectedLayer>(10, gold.cols());
// Parameters to train the model
const int num_epochs {30};
const int mini_batch_size {15};
const float learning_rate {0.01f};
// Start training
dnn.train(data, gold, num_epochs, mini_batch_size, learning_rate, [&, i=0] (dtc::ml::DnnRegressor& dnn) mutable {
// In each epoch, we evaluate the regressor by computing the mean square error
printf("epoch %d: mse=%.4f\n", i++, (dnn.infer(data)-gold).array().square().sum() / (2.0f*data.rows()));
});
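After training, you can feed any feature matrix with three columns to the regressor. A short sketch that reuses the training data; the variable predicted is illustrative only.
// Predict the label values of the training samples
Eigen::MatrixXf predicted = dnn.infer(data);
// Report the final mean squared error
printf("final mse=%.4f\n", (predicted - gold).array().square().sum() / (2.0f*data.rows()));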
DNN Classifier
The best way to learn to use our DNN classifier is to go through an example. We demonstrate how to create a DNN classifier for digit classification on the famous MNIST dataset. The MNIST dataset contains a set of handwritten digit images. The goal is to train a DNN model to look at an image and predict which digit (0-9) it is.
Images and labels are stored in an Eigen::MatrixXf and an Eigen::VectorXi, respectively.
We create a DNN classifier with three layers:
- 1st layer: input layer with size equal to the feature size (# pixels per image)
- 2nd layer: hidden layer with 30 neurons followed by ReLU activation
- 3rd layer: output layer with 10 neurons corresponding to 10 digits
Next, train the model by calling the method train, which takes six arguments: images, labels, num_epochs, mini_batch_size, learning_rate, and a lambda to call at the end of each epoch. When training finishes, call infer to see the result of the model.
// Load and store the images and labels in a matrix and a vector
Eigen::MatrixXf images = dtc::ml::read_mnist_image("train-images.idx3-ubyte") / 255.0;
Eigen::VectorXi labels = dtc::ml::read_mnist_label("train-labels.idx1-ubyte");
// Create a DNN classifier
dtc::ml::DnnClassifier dnn;
// Add layers into the classifier
dnn.layer<dtc::ml::FullyConnectedLayer>(images.cols(), 30, dtc::ml::Activation::RELU);
dnn.layer<dtc::ml::FullyConnectedLayer>(30, 10);
// Parameters to train the model
const int num_epochs {5};
const int mini_batch_size {64};
const float learning_rate {0.01f};
// Start training
dnn.train(images, labels, num_epochs, mini_batch_size, learning_rate, [&, i=0] (auto& dnn) mutable {
// Predict the labels of the images and count the number of correct predictions
auto c = ((dnn.infer(images) - labels).array() == 0).count();
auto t = images.rows();
dtc::cout("[Accuracy at epoch ", i++, "]: ", c, "/", t, "=", c/static_cast(t), '\n').flush();
});
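If the MNIST test files are available, you can evaluate the trained classifier the same way. The sketch below assumes the standard test file names t10k-images.idx3-ubyte and t10k-labels.idx1-ubyte sit in the working directory.
// Load the MNIST test set and normalize the pixel values
Eigen::MatrixXf test_images = dtc::ml::read_mnist_image("t10k-images.idx3-ubyte") / 255.0;
Eigen::VectorXi test_labels = dtc::ml::read_mnist_label("t10k-labels.idx1-ubyte");
// Count the correctly predicted test images
auto correct = ((dnn.infer(test_images) - test_labels).array() == 0).count();
dtc::cout("Test accuracy: ", correct, "/", test_images.rows(), '\n').flush();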
DNN Modifier
The classes DnnClassifier and DnnRegressor are created in largely the same way. The main difference is the output layer and the loss function used to generate the prediction. Here we summarize their methods and usage:
Layer
The layer method allows you to add a layer to the DNN. A DNN layer is a std::variant of the following types:
Fully connected layer
// Add a 10x5 FullyConnectedLayer with RELU as the activation function
dnn.layer<dtc::ml::FullyConnectedLayer>(10, 5, dtc::ml::Activation::RELU);
Dropout layer
// Add a DropOutLayer with keep probability = 0.5
dnn.layer<dtc::ml::DropOutLayer>(0.5);
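Layers are stacked in the order you add them. For example, here is a sketch of a classifier that places a dropout layer between two fully connected layers; the sizes are illustrative only.
// A 784-30-10 network with dropout after the hidden layer
dtc::ml::DnnClassifier net;
net.layer<dtc::ml::FullyConnectedLayer>(784, 30, dtc::ml::Activation::RELU);
net.layer<dtc::ml::DropOutLayer>(0.5);
net.layer<dtc::ml::FullyConnectedLayer>(30, 10);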
Activation
The Activation enum allows you to attach different activation functions to a layer. We currently support the following activation types:
Sigmoid
// Add a 10x5 FullyConnectedLayer with SIGMOID as the activation function
dnn.layer<dtc::ml::FullyConnectedLayer>(10, 5, dtc::ml::Activation::SIGMOID);
Tanh
// Add a 10x5 FullyConnectedLayer with TANH as the activation function
dnn.layer<dtc::ml::FullyConnectedLayer>(10, 5, dtc::ml::Activation::TANH);
Rectified linear unit (ReLU)
// Add a 10x5 FullyConnectedLayer with RELU as the activation function
dnn.layer<dtc::ml::FullyConnectedLayer>(10, 5, dtc::ml::Activation::RELU);
Leaky ReLU
// Add a 10x5 FullyConnectedLayer with leaky ReLU as the activation function
dnn.layer<dtc::ml::FullyConnectedLayer>(10, 5, dtc::ml::Activation::LEAKY_RELU);
Optimizer
The optimizer method allows you to associate different optimization methods with a DNN estimator. An optimizer is a std::variant of the following types:
AdamOptimizer (default optimizer in DNN)
// Set the optimizer to AdamOptimizer
dnn.optimizer<dtc::ml::AdamOptimizer>();
GradientDescentOptimizer
// Set the optimizer to GradientDescentOptimizer
dnn.optimizer<dtc::ml::GradientDescentOptimizer>();
AdagradOptimizer
// Set the optimizer to AdagradOptimizer
dnn.optimizer<dtc::ml::AdagradOptimizer>();
AdamaxOptimizer
// Set the optimizer to AdamaxOptimizer
dnn.optimizer<dtc::ml::AdamaxOptimizer>();
RMSpropOptimizer
// Set the optimizer to RMSpropOptimizer
dnn.optimizer<dtc::ml::RMSpropOptimizer>();
MomentumOptimizer
// Set the optimizer to MomentumOptimizer
dnn.optimizer<dtc::ml::MomentumOptimizer>();
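For example, to train the earlier regressor with momentum-based gradient descent instead of the default Adam, set the optimizer before calling train. This is a sketch; we assume the optimizer takes effect on the next train call.
// Switch from the default AdamOptimizer to MomentumOptimizer
dnn.optimizer<dtc::ml::MomentumOptimizer>();
dnn.train(data, gold, num_epochs, mini_batch_size, learning_rate, [] (dtc::ml::DnnRegressor&) {});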
Loss
The loss method allows you to set the loss function of a DNN estimator. A loss function is a std::variant of the following types:
MeanSquaredError (default in DNN regressor)
// MeanSquaredError
dnn.loss<dtc::ml::MeanSquaredError>();
MeanAbsoluteError
// MeanAbsoluteError
dnn.loss<dtc::ml::MeanAbsoluteError>();
SoftmaxCrossEntropy (default in DNN classifier)
// SoftmaxCrossEntropy
dnn.loss<dtc::ml::SoftmaxCrossEntropy>();
HuberLoss
// HuberLoss
dnn.loss<dtc::ml::HuberLoss>();
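For example, a regression task with noisy labels might favor the Huber loss over the default mean squared error. A sketch; as with the optimizer, we assume the loss is set before calling train.
// Replace the default MeanSquaredError with HuberLoss
dnn.loss<dtc::ml::HuberLoss>();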
Training
Training a DNN regressor and a DNN classifier is different in label types.
Labels in a DNN regressor are stored in Eigen::MatrixXf
,
while labels in a DNN classifier are stored in Eigen::VectorXi
.
Each row corresponds to an example's label.
Because of problem nature, we allow multiple labels in a regressor.
For DNN classifier, each label is an integer of range 0
to N-1
, where N
is the number of neurons at the last layer.
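For instance, a regressor that predicts two values per sample only needs a wider label matrix and output layer. A sketch with illustrative sizes:
// 30 samples, each with 3 features and 2 observed values
Eigen::MatrixXf X = Eigen::MatrixXf::Random(30, 3);
Eigen::MatrixXf Y = Eigen::MatrixXf::Random(30, 2);
dtc::ml::DnnRegressor reg;
reg.layer<dtc::ml::FullyConnectedLayer>(3, 10, dtc::ml::Activation::SIGMOID);
reg.layer<dtc::ml::FullyConnectedLayer>(10, 2);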
The signature of the DNN regressor's train method is as follows:
train(Eigen::MatrixXf& features, Eigen::MatrixXf& label, size_t num_epochs, size_t mini_batch_size, float learning_rate, C&& callback)
The signature of the DNN classifier's train method is as follows:
train(Eigen::MatrixXf& features, Eigen::VectorXi& label, size_t num_epochs, size_t mini_batch_size, float learning_rate, C&& callback)
Both methods take a callable object that is invoked at the end of each epoch. This is useful when you want to report per-epoch information, such as debugging and logging output, or apply a per-epoch update, such as weight decay or early stopping.
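For example, here is a sketch of a callback that tracks the best mean squared error seen so far and snapshots the model whenever it improves. The name best_mse and the snapshot file name are illustrative, and whether training can be aborted early from inside the callback is not covered here.
// Keep the best snapshot seen across epochs
float best_mse = std::numeric_limits<float>::max();
dnn.train(data, gold, num_epochs, mini_batch_size, learning_rate, [&, i=0] (dtc::ml::DnnRegressor& dnn) mutable {
  float mse = (dnn.infer(data) - gold).array().square().sum() / (2.0f*data.rows());
  if(mse < best_mse) {
    best_mse = mse;
    dnn.save("best_dnn_model");
  }
  printf("epoch %d: mse=%.4f (best=%.4f)\n", i++, mse, best_mse);
});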
Inference
The difference of inference stage between a DNN regressor and a DNN classifier is the return value.
The return value from the DNN regressor is of type Eigen::MatrixXf
,
while the return value from the DNN classifier is of type Eigen::VectorXi
.
Each row corresponds to an estimated label of the example.
For DNN classifier, each label is an integer of range 0
to N-1
, where N
is the number of neurons at the last layer.
The signature of the DNN regressor's infer method is as follows:
Eigen::MatrixXf infer(Eigen::MatrixXf& features) const
The signature of the DNN classifier's infer method is as follows:
Eigen::VectorXi infer(Eigen::MatrixXf& features) const
Save/Load Model
The methods save and load allow you to export a model to a binary file and restore an existing model from one.
// Save the model to the binary file "my_dnn_model"
dnn.save("my_dnn_model");
// Load an existing model from the binary file "my_dnn_model"
dnn.load("my_dnn_model");
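A typical round trip restores the model into a fresh estimator before inference. A sketch, assuming load restores everything infer needs:
// Restore the saved classifier and predict on new images
dtc::ml::DnnClassifier restored;
restored.load("my_dnn_model");
Eigen::VectorXi predictions = restored.infer(images);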