Deep Neural Network
This page gives you an overview of our deep neural network (DNN) APIs. Our goal is to simplify machine learning programming in common use cases. You will learn how to create a DNN regressor for regression and a DNN classifier for classification. The key difference between a regressor and a classifier is the output value. The regressor predicts a continuous value in the output layer, while the classifier predicts a discrete label.
The source code of both the DNN regressor and the DNN classifier is located in include/dtc/ml/dnn.hpp and src/ml/dnn.cpp.
Preliminaries
There are a great many DNN tutorials on the web. Some great resources are the Deep Learning course by Andrew Ng, the TensorFlow beginner guide, and Machine Learning Mastery by Jason Brownlee. Here we focus on how to use our API to create DNN estimators.
Input and Output Data Types
All DNN estimators in MLCraft use the Eigen library for matrix operations. You need to convert your data into an Eigen matrix in order to use our DNN estimators. You can find a tutorial about the Eigen library here.
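For example, suppose your samples arrive in a flat std::vector in row-major order. Below is a minimal sketch of wrapping such a buffer into an Eigen matrix; the names raw, num_samples, and num_features are illustrative only.
// Two samples, each with three features, stored row by row
std::vector<float> raw {0.1f, 0.2f, 0.3f, 0.4f, 0.5f, 0.6f};
const int num_samples {2};
const int num_features {3};
// Eigen matrices are column-major by default, so map the buffer as row-major
Eigen::MatrixXf data = Eigen::Map<
  Eigen::Matrix<float, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>
>(raw.data(), num_samples, num_features);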
DNN Regressor
The best way to learn to use our DNN regressor is to go through an example.
We create two matrices, Eigen::MatrixXf data(30, 3) and Eigen::MatrixXf gold(30, 1), and initialize them with random values. The matrix data holds 30 samples, each with three features, and the matrix gold stores the observed value of each sample. The goal is to train a DNN model to look at a sample and predict its label value.
We create a DNN regressor with three layers:
- 1st layer: input layer with size equal to the feature size
- 2nd layer: hidden layer with ten neurons followed by sigmoid activation
- 3rd layer: output layer with one neuron to generate the measurement
Next, train the model by calling the method train, which takes six arguments: data, gold, num_epochs, mini_batch_size, learning_rate, and a lambda to call at the end of each epoch. When training completes, call infer to see the result of the model.
// Create a 30 x 3 matrix to store the samples
Eigen::MatrixXf data(30, 3);
// Create a 30 x 1 matrix to store the labels.
Eigen::MatrixXf gold(30, 1);
// For simplicity, we fill matrices with random values.
data = Eigen::MatrixXf::Random(30, 3);
gold = Eigen::MatrixXf::Random(30, 1);
// Declare a DNN regressor
dtc::ml::DnnRegressor dnn;
// Add layers into the regressor
dnn.layer<dtc::ml::FullyConnectedLayer>(data.cols(), 10, dtc::ml::Activation::SIGMOID);
dnn.layer<dtc::ml::FullyConnectedLayer>(10, gold.cols());
// Parameters to train the model
const int num_epochs {30};
const int mini_batch_size {15};
const float learning_rate {0.01f};
// Start training
dnn.train(data, gold, num_epochs, mini_batch_size, learning_rate, [&, i=0] (dtc::ml::DnnRegressor& dnn) mutable {
// In each epoch, we evaluate the regressor by computing the mean square error
printf("epoch %d: mse=%.4f\n", i++, (dnn.infer(data)-gold).array().square().sum() / (2.0f*data.rows()));
});
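After training, you can feed any feature matrix with three columns to the regressor. A short sketch that reuses the training data; the variable predicted is illustrative only.
// Predict the label values of the training samples
Eigen::MatrixXf predicted = dnn.infer(data);
// Report the final mean squared error
printf("final mse=%.4f\n", (predicted - gold).array().square().sum() / (2.0f*data.rows()));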
DNN Classifier
The best way to learn to use our DNN classifier is to go through an example. We demonstrate how to create a DNN classifier for digit classification on the famous MNIST dataset. The MNIST dataset contains a set of handwritten digit images. The goal is to train a DNN model to look at an image and predict which digit (0-9) it is.
Images and labels are stored in an Eigen::MatrixXf and an Eigen::VectorXi, respectively.
We create a DNN classifier with three layers:
- 1st layer: input layer with size equal to the feature size (# pixels per image)
- 2nd layer: hidden layer with 30 neurons followed by ReLU activation
- 3rd layer: output layer with 10 neurons corresponding to 10 digits
Next, train the model by calling the method train, which takes six arguments: images, labels, num_epochs, mini_batch_size, learning_rate, and a lambda to call at the end of each epoch. When training finishes, call infer to see the result of the model.
// Load and store the images and labels in a matrix and a vector
Eigen::MatrixXf images = dtc::ml::read_mnist_image("train-images.idx3-ubyte") / 255.0;
Eigen::VectorXi labels = dtc::ml::read_mnist_label("train-labels.idx1-ubyte");
// Create a DNN classifier
dtc::ml::DnnClassifier dnn;
// Add layers into the classifier
dnn.layer<dtc::ml::FullyConnectedLayer>(images.cols(), 30, dtc::ml::Activation::RELU);
dnn.layer<dtc::ml::FullyConnectedLayer>(30, 10);
// Parameters to train the model
const int num_epochs {5};
const int mini_batch_size {64};
const float learning_rate {0.01f};
// Start training
dnn.train(images, labels, num_epochs, mini_batch_size, learning_rate, [&, i=0] (auto& dnn) mutable {
// Predict the labels of the images and count the number of correct predictions
auto c = ((dnn.infer(images) - labels).array() == 0).count();
auto t = images.rows();
dtc::cout("[Accuracy at epoch ", i++, "]: ", c, "/", t, "=", c/static_cast(t), '\n').flush();
});
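If the MNIST test files are available, you can evaluate the trained classifier the same way. The sketch below assumes the standard test file names t10k-images.idx3-ubyte and t10k-labels.idx1-ubyte sit in the working directory.
// Load the MNIST test set and normalize the pixel values
Eigen::MatrixXf test_images = dtc::ml::read_mnist_image("t10k-images.idx3-ubyte") / 255.0;
Eigen::VectorXi test_labels = dtc::ml::read_mnist_label("t10k-labels.idx1-ubyte");
// Count the correctly predicted test images
auto correct = ((dnn.infer(test_images) - test_labels).array() == 0).count();
dtc::cout("Test accuracy: ", correct, "/", test_images.rows(), '\n').flush();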
DNN Modifier
The classes DnnClassifier and DnnRegressor are created in largely the same way. The main difference is the output layer and the loss function used to generate the prediction. Here we summarize their methods and usage:
Layer
The layer method allows you to add a layer to the DNN. A DNN layer is a std::variant of the following types:
Fully connected layer
// Add a 10x5 FullyConnectedLayer with RELU as the activation function
dnn.layer<dtc::ml::FullyConnectedLayer>(10, 5, dtc::ml::Activation::RELU);
Dropout layer
// Add a DropOutLayer with keep probability = 0.5
dnn.layer<dtc::ml::DropOutLayer>(0.5);
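Layers are stacked in the order you add them. For example, here is a sketch of a classifier that places a dropout layer between two fully connected layers; the sizes are illustrative only.
// A 784-30-10 network with dropout after the hidden layer
dtc::ml::DnnClassifier net;
net.layer<dtc::ml::FullyConnectedLayer>(784, 30, dtc::ml::Activation::RELU);
net.layer<dtc::ml::DropOutLayer>(0.5);
net.layer<dtc::ml::FullyConnectedLayer>(30, 10);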
Activation
The Activation enum allows you to attach different activation functions to a layer. We currently support the following activation types:
Sigmoid
// Add a 10x5 FullyConnectedLayer with SIGMOID as the activation function
dnn.layer<dtc::ml::FullyConnectedLayer>(10, 5, dtc::ml::Activation::SIGMOID);
Tanh
// Add a 10x5 FullyConnectedLayer with TANH as the activation function
dnn.layer<dtc::ml::FullyConnectedLayer>(10, 5, dtc::ml::Activation::TANH);
Rectified linear unit (ReLU)
// Add a 10x5 FullyConnectedLayer with RELU as the activation function
dnn.layer<dtc::ml::FullyConnectedLayer>(10, 5, dtc::ml::Activation::RELU);
Leaky ReLU
// Add a 10x5 FullyConnectedLayer with leaky ReLU as the activation function
dnn.layer<dtc::ml::FullyConnectedLayer>(10, 5, dtc::ml::Activation::LEAKY_RELU);
Optimizer
The optimizer method allows you to associate different optimization methods with a DNN estimator. An optimizer is a std::variant of the following types:
AdamOptimizer (default optimizer in DNN)
// Set the optimizer to AdamOptimizer
dnn.optimizer<dtc::ml::AdamOptimizer>();
GradientDescentOptimizer
// Set the optimizer to GradientDescentOptimizer
dnn.optimizer<dtc::ml::GradientDescentOptimizer>();
AdagradOptimizer
// Set the optimizer to AdagradOptimizer
dnn.optimizer<dtc::ml::AdagradOptimizer>();
AdamaxOptimizer
// Set the optimizer to AdamaxOptimizer
dnn.optimizer<dtc::ml::AdamaxOptimizer>();
RMSpropOptimizer
// Set the optimizer to RMSpropOptimizer
dnn.optimizer<dtc::ml::RMSpropOptimizer>();
MomentumOptimizer
// Set the optimizer to MomentumOptimizer
dnn.optimizer<dtc::ml::MomentumOptimizer>();
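For example, to train the earlier regressor with momentum-based gradient descent instead of the default Adam, set the optimizer before calling train. This is a sketch; we assume the optimizer takes effect on the next train call.
// Switch from the default AdamOptimizer to MomentumOptimizer
dnn.optimizer<dtc::ml::MomentumOptimizer>();
dnn.train(data, gold, num_epochs, mini_batch_size, learning_rate, [] (dtc::ml::DnnRegressor&) {});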
Loss
The loss method allows you to set the loss function of a DNN estimator. A loss function is a std::variant of the following types:
MeanSquaredError (default in DNN regressor)
// MeanSquaredError
dnn.loss<dtc::ml::MeanSquaredError>();
MeanAbsoluteError
// MeanAbsoluteError
dnn.loss<dtc::ml::MeanAbsoluteError>();
SoftmaxCrossEntropy (default in DNN classifier)
// SoftmaxCrossEntropy
dnn.loss<dtc::ml::SoftmaxCrossEntropy>();
HuberLoss
// HuberLoss
dnn.loss<dtc::ml::HuberLoss>();
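For example, a regression task with noisy labels might favor the Huber loss over the default mean squared error. A sketch; as with the optimizer, we assume the loss is set before calling train.
// Replace the default MeanSquaredError with HuberLoss
dnn.loss<dtc::ml::HuberLoss>();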
Training
Training a DNN regressor and a DNN classifier is different in label types.
Labels in a DNN regressor are stored in Eigen::MatrixXf
,
while labels in a DNN classifier are stored in Eigen::VectorXi
.
Each row corresponds to an example's label.
Because of problem nature, we allow multiple labels in a regressor.
For DNN classifier, each label is an integer of range 0
to N-1
, where N
is the number of neurons at the last layer.
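For instance, a regressor that predicts two values per sample only needs a wider label matrix and output layer. A sketch with illustrative sizes:
// 30 samples, each with 3 features and 2 observed values
Eigen::MatrixXf X = Eigen::MatrixXf::Random(30, 3);
Eigen::MatrixXf Y = Eigen::MatrixXf::Random(30, 2);
dtc::ml::DnnRegressor reg;
reg.layer<dtc::ml::FullyConnectedLayer>(3, 10, dtc::ml::Activation::SIGMOID);
reg.layer<dtc::ml::FullyConnectedLayer>(10, 2);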
The signature of the DNN regressor's train method is as follows:
train(Eigen::MatrixXf& features, Eigen::MatrixXf& label, size_t num_epochs, size_t mini_batch_size, float learning_rate, C&& callback)
The signature of the DNN classifier's train method is as follows:
train(Eigen::MatrixXf& features, Eigen::VectorXi& label, size_t num_epochs, size_t mini_batch_size, float learning_rate, C&& callback)
Both methods take a callable object that is invoked at the end of each epoch. This is useful when you want to report per-epoch information, such as debugging and logging output, or apply a per-epoch update, such as weight decay or early stopping.
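For example, here is a sketch of a callback that tracks the best mean squared error seen so far and snapshots the model whenever it improves. The name best_mse and the snapshot file name are illustrative, and whether training can be aborted early from inside the callback is not covered here.
// Keep the best snapshot seen across epochs
float best_mse = std::numeric_limits<float>::max();
dnn.train(data, gold, num_epochs, mini_batch_size, learning_rate, [&, i=0] (dtc::ml::DnnRegressor& dnn) mutable {
  float mse = (dnn.infer(data) - gold).array().square().sum() / (2.0f*data.rows());
  if(mse < best_mse) {
    best_mse = mse;
    dnn.save("best_dnn_model");
  }
  printf("epoch %d: mse=%.4f (best=%.4f)\n", i++, mse, best_mse);
});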
Inference
The difference of inference stage between a DNN regressor and a DNN classifier is the return value.
The return value from the DNN regressor is of type Eigen::MatrixXf
,
while the return value from the DNN classifier is of type Eigen::VectorXi
.
Each row corresponds to an estimated label of the example.
For DNN classifier, each label is an integer of range 0
to N-1
, where N
is the number of neurons at the last layer.
The signature of the DNN regressor's infer method is as follows:
Eigen::MatrixXf infer(Eigen::MatrixXf& features) const
The signature of the DNN classifier's infer method is as follows:
Eigen::VectorXi infer(Eigen::MatrixXf& features) const
Save/Load Model
The methods save and load allow you to export a model to a binary file and restore an existing model from one.
// Save the model to the binary file "my_dnn_model"
dnn.save("my_dnn_model");
// Load an existing model from the binary file "my_dnn_model"
dnn.load("my_dnn_model");
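A typical round trip restores the model into a fresh estimator before inference. A sketch, assuming load restores everything infer needs:
// Restore the saved classifier and predict on new images
dtc::ml::DnnClassifier restored;
restored.load("my_dnn_model");
Eigen::VectorXi predictions = restored.infer(images);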