Deep Learning Toolbox™ Release Notes
Contents

R2019a
• 3-D Support: New layers enable deep learning with 3-D data
• Deep Learning Layers: Hyperbolic tangent and exponential linear unit activation layers

R2018b
• Network Analyzer: Visualize, analyze, and find problems in network architectures before training
• Functionality being removed or changed
• 'ValidationPatience' training option default is Inf
• ClassNames property of ClassificationOutputLayer will be removed
• 'ClassNames' option of importKerasNetwork, importCaffeNetwork, and importONNXNetwork will be removed
• Different file name for checkpoint networks

R2018a
• Pretrained Networks: Accelerate transfer learning by freezing layer weights

R2017b
• Deep Learning Layer Definition: Define new layers with learnable parameters, and specify loss functions for classification and regression output layers

R2017a
• Pretrained Models: Transfer learning with pretrained CNN models AlexNet, VGG-16, and VGG-19, and import models from Caffe (including Caffe Model Zoo)

R2016b
• Performance: Train CNNs faster when using ImageDatastore object

R2016a

R2015b

R2015a

R2014b
• Bug Fixes

R2014a
• Training panels for Neural Fitting Tool and Neural Time Series Tool Provide Choice of Training Algorithms

R2013b
• Cross-entropy performance measure for enhanced pattern recognition and classification accuracy

R2013a
• Bug Fixes

R2012b
• Faster training and simulation with computer clusters using MATLAB Distributed Computing Server

R2012a
• Bug Fixes

R2011b
• Bug Fixes

R2011a
• Bug Fixes

R2010b
• New Time Series Validation

R2010a
• Bug Fixes

R2009b
• Bug Fixes

R2009a
• Bug Fixes

R2008b
• Bug Fixes

R2008a

R2007b
• Changing Default Input Processing Functions
• Changing Default Output Processing Functions

R2007a

R2006b

R2006a
• Data Preprocessing and Postprocessing
• dividevec Automatically Splits Data
• fixunknowns Encodes Missing Data
• removeconstantrows Handles Constant Values
• mapminmax, mapstd, and processpca Are New

R14SP3
R2019a
Version: 12.1
New Features
Bug Fixes
Compatibility Considerations
For more information, see “Generate MATLAB Code from Deep Network Designer”.
Layer Initialization: Initialize layer weights and biases using
initializers or a custom function
Initialize layer weights and biases using initializers such as the Glorot initializer (also
known as the Xavier initializer), the He initializer, and orthogonal initializers. To specify
the initializer for the weights and biases of convolutional layers or fully connected layers,
use the 'WeightsInitializer' and 'BiasInitializer' name-value pairs of the
layers, respectively. To specify the initializer for the input weights, the recurrent weights,
and the biases for LSTM and BiLSTM layers, use the 'InputWeightsInitializer',
'RecurrentWeightsInitializer', and 'BiasInitializer' name-value pairs,
respectively.
• batchNormalizationLayer
• bilstmLayer
• convolution2dLayer
• convolution3dLayer
• fullyConnectedLayer
• groupedConvolution2dLayer
• lstmLayer
• transposedConv2dLayer
• transposedConv3dLayer
• wordEmbeddingLayer (Text Analytics Toolbox)
For an example showing how to compare the different initializers, see “Compare Layer
Weight Initializers”. For an example showing how to create a custom initialization
function, see “Specify Custom Weight Initialization Function”.
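As a brief sketch (the layer sizes here are arbitrary, not taken from the release notes), the initializer options can be specified when constructing the layers:
% Convolutional layer using the He weights initializer
layer1 = convolution2dLayer(3,16,'WeightsInitializer','he');
% Fully connected layer using the Glorot (Xavier) weights initializer
layer2 = fullyConnectedLayer(10,'WeightsInitializer','glorot');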
For an example showing how to create a block of layers for channel-wise separable
convolution (also known as depth-wise separable convolution), see “Create Layers for
Channel-Wise Separable Convolution”.
3-D Support: New layers enable deep learning with 3-D data
These new layers enable you to work with 3-D data:
• image3dInputLayer
• convolution3dLayer
• transposedConv3dLayer
• averagePooling3dLayer
• maxPooling3dLayer
• concatenationLayer
These existing layers are enhanced to support 3-D data in deep learning networks:
• reluLayer
• leakyReluLayer
• clippedReluLayer
• fullyConnectedLayer
• softmaxLayer
• classificationLayer
• regressionLayer
For a list of available layers, see “List of Deep Learning Layers”. For an example showing
how to train a network using 3-D data, see “3-D Brain Tumor Segmentation Using Deep
Learning”.
For more information about defining custom layers, see “Define Custom Deep Learning
Layers”. To learn how to check that the layer is valid automatically using the
checkLayer function, see “Check Custom Layer Validity”.
• activations
• classify
• predict
To retrain a network on a new classification task, follow the steps in “Train Deep Learning
Network to Classify New Images” and load the pretrained network you want to use
instead of GoogLeNet.
For more information on pretrained neural networks in MATLAB, see “Pretrained Deep
Neural Networks”.
To retrain a network on a new classification task, follow the steps in “Train Deep Learning
Network to Classify New Images” and load the pretrained network you want to use
instead of GoogLeNet.
For more information on pretrained neural networks in MATLAB, see “Pretrained Deep
Neural Networks”.
• “Classify Videos Using Deep Learning”
• “Run Multiple Deep Learning Experiments”
• “Train Network Using Out-of-Memory Sequence Data”
• “Compare Layer Weight Initializers”
• “Specify Custom Weight Initialization Function”
In previous releases, the software, by default, initializes the layer weights by sampling
from a normal distribution with a mean of zero and a variance of 0.01. To reproduce this
behavior, set the 'WeightsInitializer' option of these layers to 'narrow-normal'.
Glorot is default input weights initialization for LSTM and BiLSTM layers
Behavior change
Starting in R2019a, the software, by default, initializes the layer input weights of
lstmLayer and bilstmLayer using the Glorot initializer. This behavior helps stabilize
training and usually reduces the training time of deep networks.
In previous releases, the software, by default, initializes the layer input weights by
sampling from a normal distribution with a mean of zero and a variance of 0.01. To
reproduce this behavior, set the 'InputWeightsInitializer' option of these layers to
'narrow-normal'.
Orthogonal is default recurrent weights initialization for LSTM and BiLSTM layers
Behavior change
Starting in R2019a, the software, by default, initializes the layer recurrent weights of
LSTM and BiLSTM layers with Q, the orthogonal matrix given by the QR decomposition of
Z = QR for a random matrix Z sampled from a unit normal distribution. This behavior
helps stabilize training and usually reduces the training time of deep networks.
In previous releases, the software, by default, initializes the layer recurrent weights by
sampling from a normal distribution with a mean of zero and a variance of 0.01. To
reproduce this behavior, set the 'RecurrentWeightsInitializer' option of the layer
to 'narrow-normal'.
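For example, a minimal sketch of restoring both previous defaults for an LSTM layer (the number of hidden units is arbitrary):
layer = lstmLayer(100, ...
    'InputWeightsInitializer','narrow-normal', ...
    'RecurrentWeightsInitializer','narrow-normal');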
Custom layers have new properties NumInputs, InputNames, NumOutputs, and
OutputNames
Starting in R2019a, custom layers have the new properties NumInputs, InputNames,
NumOutputs, and OutputNames. These properties enable support for custom layers with
multiple inputs and multiple outputs.
If you use a custom layer created in R2018b or earlier, the layer cannot have any
properties named NumInputs, InputNames, NumOutputs, or OutputNames. You must
rename these properties to use the layer in R2019a and onwards.
Before R2018a, to perform custom image preprocessing for training deep learning
networks, you had to specify a custom read function using the ReadFcn property of
imageDatastore. However, reading files using a custom read function was slow because
imageDatastore did not prefetch files. Custom mini-batch datastores, introduced in
R2018a, offer an alternative way to preprocess data, but they have several drawbacks:
• In addition to specifying the preprocessing operations, you must also define properties
and methods to support reading data in batches, reading data by index, and
partitioning and shuffling data.
• You must specify a value for the NumObservations property, but this value may be ill-
defined or difficult to define in real-world applications.
• Custom mini-batch datastores are not flexible enough to support common deep
learning workflows, such as deployed workflows using GPU Coder™.
Starting in R2019a, built-in datastores natively support prefetch, shuffling, and parallel
training when reading batches of data. The transform function is the preferred way to
perform custom image preprocessing using built-in datastores. The combine function is
the preferred way to concatenate read data from multiple datastores, including
transformed datastores. Concatenated data can serve as the network inputs and expected
responses for training deep learning networks. The transform and combine functions
have several advantages over custom mini-batch datastores.
• The functions enable data preprocessing and concatenation for all types of datastores,
including imageDatastore.
• The transform function requires you to define only the data processing pipeline.
• When used on a deterministic datastore, the functions support tall data types and
MapReduce.
• The functions support deployed workflows.
For more information about custom image preprocessing, see “Preprocess Images for
Deep Learning”.
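As an illustrative sketch of this workflow (the folder names and target image size are hypothetical, and imresize requires Image Processing Toolbox), input images can be preprocessed with transform and paired with response images using combine:
% Hypothetical input and response image folders
imdsIn  = imageDatastore('inputImages');
imdsOut = imageDatastore('responseImages');
% Apply custom preprocessing to the inputs
tdsIn = transform(imdsIn,@(img) imresize(img,[224 224]));
% Pair each preprocessed input with its expected response
cds = combine(tdsIn,imdsOut);
% The combined datastore can then be passed to trainNetwork
% (layers and options defined elsewhere):
% net = trainNetwork(cds,layers,options);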
matlab.io.datastore.BackgroundDispatchable and
matlab.io.datastore.PartitionableByIndex are not recommended
Still runs
matlab.io.datastore.BackgroundDispatchable and
matlab.io.datastore.PartitionableByIndex add support for prefetching and
parallel training to custom mini-batch datastores. You can use custom mini-batch
datastores to preprocess sequence, time series, or text data, but recurrent networks such
as LSTM networks do not support prefetching or parallel and multi-GPU training.
Starting in R2019a, built-in datastores natively support prefetching and parallel training,
so custom mini-batch datastores are not recommended for custom image preprocessing.
R2018b
Version: 12.0
New Features
Bug Fixes
Compatibility Considerations
For examples, see:
Import deep learning networks and network architectures from ONNX using
importONNXNetwork and importONNXLayers.
To perform network validation during training, specify validation data using the
'ValidationData' name-value pair argument of trainingOptions. You can change
the validation frequency using the 'ValidationFrequency' name-value pair argument.
For more information, see Specify Validation Data.
For an example showing how to specify validation data for an LSTM network, see Classify
Text Data Using Deep Learning.
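A minimal sketch (the validation arrays XValidation and YValidation are assumed to exist):
options = trainingOptions('adam', ...
    'ValidationData',{XValidation,YValidation}, ...
    'ValidationFrequency',30);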
For an example showing how to assemble a network from pretrained layers, see Assemble
Network from Pretrained Keras Layers.
For an example showing how to use dilated convolutions for semantic segmentation, see
Semantic Segmentation Using Dilated Convolutions.
For an example showing how to use a custom mini-batch datastore for sequence data, see
Train Network Using Out-of-Memory Sequence Data.
To retrain a network on a new classification task, follow the steps of Train Deep Learning
Network to Classify New Images and load ResNet-18 or DenseNet-201 instead of
GoogLeNet.
To use importKerasNetwork and importKerasLayers, you must install the Deep
Learning Toolbox Importer for TensorFlow-Keras Models support package. If this support
package is not installed, the functions provide a download link.
• wordEmbeddingLayer
• roiInputLayer
• roiMaxPooling2dLayer
• regionProposalLayer
• rpnSoftmaxLayer
• rpnClassificationLayer
• rcnnBoxRegressionLayer
• weightedClassificationLayer (custom layer example)
• dicePixelClassificationLayer (custom layer example)
• The augment function can apply identical random transformations to multiple images.
Use the augment function to apply identical transformations to input and response
image pairs in a custom mini-batch datastore.
You can also use the augment function to easily visualize the transformations applied
to sample images.
• The new 'RandScale' property of imageDataAugmenter scales an image uniformly
in the vertical and horizontal directions to maintain the image aspect ratio.
• Several properties of imageDataAugmenter now support sampling over disjoint
intervals or using nonuniform probability distributions. Specify a custom sampling
function using a function handle.
• Use the example Train Deep Learning Network to Classify New Images to fine-tune
any pretrained network for a new image classification task.
• Compare Pretrained Networks
• Transfer Learning with Deep Network Designer
• Interactive Transfer Learning Using AlexNet
• Build Networks with Deep Network Designer
• Deep Learning Tips and Tricks
• Assemble Network from Pretrained Keras Layers
• List of Deep Learning Layers
• Convert Classification Network into Regression Network
• Resume Training from Checkpoint Network
• Semantic Segmentation Using Dilated Convolutions
• Image Processing Operator Approximation Using Deep Learning
• Assemble Network from Pretrained Keras Layers
• Train Network Using Out-of-Memory Sequence Data
• Denoise Speech Using Deep Learning Networks
• Classify Gender Using Long Short-Term Memory Networks
In previous releases, the default value of the 'ValidationPatience' training option is 5. To
reproduce this behavior, set the 'ValidationPatience' option in trainingOptions to 5.
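For example, a minimal sketch of restoring the previous default:
options = trainingOptions('sgdm','ValidationPatience',5);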
The ClassNames property contains a cell array of character vectors. The Classes
property contains a categorical array. To use the Classes property with functions that
require cell array input, convert the classes using the cellstr function.
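For example, a sketch assuming net is a trained classification network whose final layer has a Classes property:
classes = net.Layers(end).Classes;   % categorical array
classNames = cellstr(classes);       % cell array of character vectors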
Starting in R2018b, when saving checkpoint networks, the software assigns file names
beginning with net_checkpoint_. In previous releases, the software assigned file names
beginning with convnet_checkpoint_. For more information, see the
'CheckpointPath' option in trainingOptions.
If you have code that saves and loads checkpoint networks, then update your code to load
files with the new name.
R2018a
Version: 11.1
New Features
Bug Fixes
Compatibility Considerations
To create an LSTM network that learns from complete sequences at each time step,
include a bidirectional LSTM layer in your network by using bilstmLayer.
To create training options for the Adam or RMSProp solvers, use the trainingOptions
function. trainingOptions('adam') and trainingOptions('rmsprop') create
training options for the Adam and RMSProp solvers, respectively. To specify solver
options, use the 'GradientDecayFactor', 'SquaredGradientDecayFactor', and
'Epsilon' name-value pair arguments.
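A brief sketch (the number of hidden units and the solver parameter values are arbitrary):
% Bidirectional LSTM layer that maps a sequence to a single label
layer = bilstmLayer(100,'OutputMode','last');
% Adam training options with explicit solver parameters
options = trainingOptions('adam', ...
    'GradientDecayFactor',0.9, ...
    'SquaredGradientDecayFactor',0.999, ...
    'Epsilon',1e-8);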
Define custom data preprocessing operations by creating your own mini-batch datastore.
You can optionally add support for functionality such as shuffling during training, parallel
and multi-GPU training, and background dispatch. For more information, see Develop
Custom Mini-Batch Datastore.
Compatibility Considerations
In previous releases, you could preprocess images with resizing, rotation, reflection, and
other geometric transformations by using an augmentedImageSource. The
augmentedImageSource function now creates an augmentedImageDatastore object.
An augmentedImageDatastore behaves similarly to an augmentedImageSource, with
additional properties and methods to assist with data augmentation.
You can now use augmentedImageDatastore for both training and prediction. In the
previous release, you could use augmentedImageSource for training but not prediction.
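A sketch of the workflow, assuming imds is an imageDatastore of labeled training images and layers and options are defined elsewhere:
augmenter = imageDataAugmenter('RandXReflection',true);
auimds = augmentedImageDatastore([224 224],imds,'DataAugmentation',augmenter);
net = trainNetwork(auimds,layers,options);   % training
YPred = classify(net,auimds);                % prediction on the same datastore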
To use DAG networks for feature extraction or visualization of layer activations, use the
activations function.
targets and the predicted labels outputs. For an example, see Plot Confusion Matrix
Using Categorical Labels.
The replaceLayer function requires the Neural Network Toolbox Importer for
TensorFlow-Keras Models support package. If this support package is not installed, type
importKerasLayers or importKerasNetwork at the command line for a download link.
To download and install the networks, use the Add-On Explorer. You can also download
the networks from MathWorks Neural Network Toolbox Team. After you install the
add-ons, use the squeezenet and inceptionresnetv2 functions to load the networks,
respectively.
To retrain a network on a new classification task, follow the steps of Transfer Learning
Using GoogLeNet. Load a SqueezeNet or Inception-ResNet-v2 network instead of
GoogLeNet, and change the names of the layers that you remove and connect to match
the names of your pretrained network. For more information, see squeezenet and
inceptionresnetv2.
The analyzeNetwork function requires the Deep Learning Network Analyzer for Neural
Network Toolbox support package. To download and install the support package, use the Add-
On Explorer. You can also download the support package from MathWorks Neural
Network Toolbox Team. For more information, see analyzeNetwork.
Import deep learning networks and network architectures from ONNX using
importONNXNetwork and importONNXLayers.
To learn about options, see Scale Up Deep Learning in Parallel and in the Cloud.
• JPEG Image Deblocking Using Deep Learning
• Remove Noise from Color Image Using Pretrained Neural Network
For more examples of deep learning applications, see Deep Learning Applications and
Deep Learning GPU Code Generation.
R2017b
Version: 11.0
New Features
Bug Fixes
Compatibility Considerations
• Create a LayerGraph object using layerGraph. The layer graph specifies the
network architecture. You can create an empty layer graph and then add layers to it.
You can also create a layer graph directly from an array of network layers. The layers
in the graph are automatically connected sequentially.
• Add layers to the layer graph using addLayers and remove layers from the graph
using removeLayers.
• Connect layers of the layer graph using connectLayers and disconnect layers using
disconnectLayers.
• Plot the network architecture using plot.
• Train the network using the layer graph as the layers input argument to
trainNetwork. The trained network is a DAGNetwork object.
• Perform classification and prediction on new data using classify and predict.
For an example showing how to create and train a DAG network, see Create and Train
DAG Network for Deep Learning.
You can also load a pretrained DAG network by installing the Neural Network Toolbox
Model for GoogLeNet Network add-on. For a transfer learning example, see Transfer
Learning Using GoogLeNet. For more information, see googlenet.
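A minimal sketch of these steps (the layer sizes and names are placeholders, not taken from the release notes):
layers = [
    imageInputLayer([28 28 1],'Name','input')
    convolution2dLayer(3,16,'Padding',1,'Name','conv_1')
    reluLayer('Name','relu_1')
    additionLayer(2,'Name','add')
    fullyConnectedLayer(10,'Name','fc')
    softmaxLayer('Name','softmax')
    classificationLayer('Name','output')];
lgraph = layerGraph(layers);               % layers connected sequentially
skip = convolution2dLayer(3,16,'Padding',1,'Name','conv_skip');
lgraph = addLayers(lgraph,skip);           % add an unconnected layer
lgraph = connectLayers(lgraph,'input','conv_skip');
lgraph = connectLayers(lgraph,'conv_skip','add/in2');
plot(lgraph)                               % visualize the architecture
% net = trainNetwork(imds,lgraph,options); % returns a DAGNetwork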
LSTM networks can be used for the following types of problems:
You might want to make multiple predictions on parts of a long sequence, or might not
have the complete time series in advance. For these tasks, you can make the LSTM
network remember and forget the network state between predictions. To configure the
state of LSTM networks, use the following functions:
To perform network validation during training, specify validation data using the
'ValidationData' name-value pair argument of trainingOptions. By default, the
software validates the network every 50 training iterations by predicting the response of
the validation data and calculating the validation loss and accuracy (root mean square
error for regression networks). You can change the validation frequency using the
'ValidationFrequency' name-value pair argument.
Network training stops when the validation loss stops improving. By default, if the
validation loss is larger than or equal to the previously smallest loss five times in a row,
then network training stops. To change the number of times that the validation loss is
allowed to not decrease before training stops, use the 'ValidationPatience' name-value
pair argument.
• For an example showing how to define a PReLU layer, a layer with learnable
parameters, see Define a Layer with Learnable Parameters.
• For an example showing how to define a classification output layer and specify a loss
function, see Define a Classification Output Layer.
• For an example showing how to define a regression output layer and specify a loss
function, see Define a Regression Output Layer.
To turn on the training progress plot, use the 'Plots' name-value pair argument of
trainingOptions. For more information, see Monitor Deep Learning Training Progress.
Deep Learning Image Preprocessing: Efficiently resize and
augment image data for training
You can now preprocess images for network training with more options, including
resizing, rotation, reflection, and other geometric transformations. To train a network
using augmented images, create an augmentedImageSource and use it as an input
argument to trainNetwork. You can configure augmentation options using the
imageDataAugmenter function. For more information, see Preprocess Images for Deep
Learning.
Augmentation helps to prevent the network from overfitting and memorizing the exact
details of the training images. It also increases the effective size of the training data set
by generating new images based on the training images. For example, use augmentation
to generate new images that randomly flip the training images along the vertical axis, and
randomly translate the training images horizontally and vertically.
To resize images in other contexts, such as for prediction, classification, and network
validation during training, use imresize.
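A sketch of this workflow, assuming imds is an imageDatastore of training images and layers and options are defined elsewhere:
augmenter = imageDataAugmenter( ...
    'RandXReflection',true, ...
    'RandXTranslation',[-10 10], ...
    'RandYTranslation',[-10 10]);
source = augmentedImageSource([227 227],imds,'DataAugmentation',augmenter);
net = trainNetwork(source,layers,options);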
Compatibility Considerations
In previous releases, you could perform limited image cropping and reflection using the
DataAugmentation property of imageInputLayer. The DataAugmentation property
is not recommended. Use augmentedImageSource instead.
You can access the model using the googlenet function. If the Neural Network Toolbox
Model for GoogLeNet Network support package is not installed, then the function
provides a link to the required support package in the Add-On Explorer. GoogLeNet won
the ImageNet Large-Scale Visual Recognition Challenge in 2014. The network is smaller
and typically faster than VGG networks, and smaller and more accurate than AlexNet on
the ImageNet challenge data set. The network is a directed acyclic graph (DAG) network,
and googlenet returns the network as a DAGNetwork object. You can use this
pretrained model for classification and transfer learning. For an example, see Transfer
Learning Using GoogLeNet. For more information on pretrained neural networks in
MATLAB, see Pretrained Convolutional Neural Networks.
To retrain a network on a new classification task, follow the steps of Transfer Learning
Using GoogLeNet. Load a ResNet network instead of GoogLeNet, and change the names
of the layers that you remove and connect to match the names of the ResNet layers. To
extract the layers and architecture of the network for further processing, use
layerGraph. For more information, see resnet50 and resnet101.
To retrain the network on a new classification task, follow the steps of Transfer Learning
Using GoogLeNet. Load the Inception-v3 network instead of GoogLeNet, and change the
names of the layers that you remove and connect to match the names of the Inception-v3
layers. To extract the layers and architecture of the network for further processing, use
layerGraph. For more information, see inceptionv3.
Alternatively, you can import CNN layers from TensorFlow-Keras by using the
importKerasLayers function. This function imports the network architecture as a
Layer array or LayerGraph object. You can then specify the training options using the
trainingOptions function and train this network using the trainNetwork function.
For both importKerasNetwork and importKerasLayers, you must install the Neural
Network Toolbox Importer for TensorFlow-Keras Models add-on from the MATLAB® Add-
Ons menu.
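A minimal sketch (the model file name is hypothetical):
layers = importKerasLayers('digitsModel.h5');   % import the architecture only
options = trainingOptions('sgdm');
% net = trainNetwork(XTrain,YTrain,layers,options);   % training data assumed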
Functionality: Padding property of Convolution2dLayer, MaxPooling2dLayer, and AveragePooling2dLayer objects
Result: Warns
Use Instead: PaddingSize property of Convolution2dLayer, MaxPooling2dLayer, and AveragePooling2dLayer objects
Compatibility Considerations: Replace all instances of the Padding property with PaddingSize. When you create network layers, use the 'Padding' name-value pair argument to specify the padding. For more information, see Convolution2dLayer, MaxPooling2dLayer, and AveragePooling2dLayer.
R2017a
Version: 10.0
New Features
Bug Fixes
You can access the models using the functions alexnet, vgg16, and vgg19. These
models are SeriesNetwork objects. You can use these pretrained models for
classification and transfer learning.
You can also import other pretrained CNN models from Caffe by using the
importCaffeNetwork function. This function imports models as a SeriesNetwork
object. You can then use these models for classifying new data.
Alternatively, you can import CNN layers from Caffe by using the importCaffeLayers
function. This function imports the layer architecture as a Layer array. You can then
specify the training options using the trainingOptions function and train this network
using the trainNetwork function.
For both importCaffeNetwork and importCaffeLayers, you can install the Neural
Network Toolbox Importer for Caffe Models add-on from the MATLAB® Add-Ons menu.
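A brief sketch (the Caffe file names are placeholders):
net = alexnet;                                        % pretrained SeriesNetwork
net2 = importCaffeNetwork('deploy.prototxt','weights.caffemodel');
layers = importCaffeLayers('deploy.prototxt');        % architecture only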
Deep Learning with Cloud Instances: Train convolutional
neural networks using multiple GPUs in MATLAB and MATLAB
Distributed Computing Server for Amazon EC2
You can use MATLAB to perform deep learning in the cloud using Amazon Elastic
Compute Cloud (Amazon EC2®) with new P2 instances and data stored in the cloud. If you
do not have a suitable GPU available for faster training of a convolutional neural network,
you can use Amazon Elastic Compute Cloud instead. Try different numbers of GPUs per
machine to accelerate training. You can compare and explore the performance of multiple
deep neural network configurations to find the best tradeoff of accuracy and memory use.
Deep learning in the cloud also requires Parallel Computing Toolbox™. For details, see
Deep Learning in the Cloud.
For specifying the hardware on which to train the network, and for system requirements,
see the ExecutionEnvironment name-value pair argument of trainingOptions.
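For example, a minimal sketch of requesting multi-GPU training (assuming suitable GPUs are available):
options = trainingOptions('sgdm','ExecutionEnvironment','multi-gpu');
% net = trainNetwork(XTrain,YTrain,layers,options);   % data and layers assumed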
highlights the features your trained ConvNet has learned, helping you understand and
diagnose network behavior. For examples, see Deep Dream Images Using AlexNet and
Visualize Features of a Convolutional Neural Network.
You can also display network activations on an image to investigate features the network
has learned to identify. To try an example, see Visualize Activations of a Convolutional
Neural Network.
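As a brief sketch (the layer name 'conv1' refers to AlexNet's first convolutional layer, and imresize requires Image Processing Toolbox):
net = alexnet;
im = imresize(imread('peppers.png'),[227 227]);
act = activations(net,im,'conv1');   % activations of an early layer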
To find out what tasks you can do, see Deep Learning in MATLAB. To learn about
convolutional neural networks and how they work in MATLAB, see:
• Try Deep Learning in 10 Lines of MATLAB Code
• Create Simple Deep Learning Network for Classification
• Transfer Learning and Fine-Tuning of Convolutional Neural Networks
• Transfer Learning Using AlexNet
• Feature Extraction Using AlexNet
• Deep Dream Images Using AlexNet
• Visualize Activations of a Convolutional Neural Network
• Visualize Features of a Convolutional Neural Network
• Create Typical Convolutional Neural Networks
• Plot Training Accuracy During Network Training
• Plot Progress and Stop Training at Specified Accuracy
• Resume Training from a Checkpoint Network
• Train a Convolutional Neural Network for Regression
R2016b
Version: 9.1
New Features
Bug Fixes
Compatibility Considerations
The following methods and functions are NOT supported in deployed mode:
• Training progress dialog, nntraintool.
• genFunction and gensim to generate MATLAB code or Simulink® blocks
• view method
• nctool, nftool, nnstart, nprtool, ntstool
• Plot functions (such as plotperform, plottrainstate, ploterrhist,
plotregression, plotfit, and so on)
Compatibility Considerations
You no longer need to specify 'MatrixOnly',true for genFunction to generate code for
matrices; previously, you needed to specify 'MatrixOnly',true.
For more information about the network, see Pretrained Convolutional Neural Network.
R2016a
Version: 9.0
New Features
Bug Fixes
NOTE: This feature requires the Parallel Computing Toolbox and a CUDA-enabled
NVIDIA GPU with compute capability 3.0 or higher.
R2015b
Version: 8.4
New Features
Bug Fixes
R2015a
Version: 8.3
New Features
Bug Fixes
R2014b
Version: 8.2.1
Bug Fixes
R2014a
Version: 8.2
New Features
Bug Fixes
Training panels for Neural Fitting Tool and Neural Time Series
Tool Provide Choice of Training Algorithms
The training panels in the Neural Fitting and Neural Time Series tools now let you select
a training algorithm before clicking Train. The available algorithms are:
• Levenberg-Marquardt (trainlm)
• Bayesian Regularization (trainbr)
• Scaled Conjugate Gradient (trainscg)
For more information on using Neural Fitting, see Fit Data with a Neural Network.
For more information on using Neural Time Series, see Neural Network Time Series
Prediction and Modeling.
% Train a feedforward network with Bayesian regularization (trainbr)
[x,t] = house_dataset;
net = feedforwardnet(10,'trainbr');
[net,tr] = train(net,x,t);

% The same setup, with the maximum number of validation failures set explicitly
[x,t] = house_dataset;
net = feedforwardnet(10,'trainbr');
net.trainParam.max_fail = 6;
[net,tr] = train(net,x,t);
R2013b
Version: 8.1
New Features
Bug Fixes
Compatibility Considerations
The function genFunction generates a stand-alone MATLAB function for simulating any
trained neural network and preparing it for deployment in many scenarios:
[x,t] = house_dataset;          % setup assumed by this example
houseNet = feedforwardnet(10);  % setup assumed by this example
houseNet = train(houseNet,x,t);
y = houseNet(x);
A MATLAB function with the same interface as the neural network object is generated,
tested, and viewed.
genFunction(houseNet,'houseFcn');
y2 = houseFcn(x);
accuracy2 = max(abs(y-y2))
edit houseFcn
The new function can be compiled with the MATLAB Compiler tools (license required) to a
shared/dynamically linked library with mcc.
Next, another version of the MATLAB function is generated which supports only matrix
arguments (no cell arrays). This function is tested. Then it is used to generate a MEX-
function with the MATLAB Coder tool codegen (license required) which is also tested.
genFunction(houseNet,'houseFcn','MatrixOnly','yes');
y3 = houseFcn(x);
accuracy3 = max(abs(y-y3))
[x,t] = maglev_dataset;
maglevNet = narxnet(1:2,1:2,10);
[X,Xi,Ai,T] = preparets(maglevNet,x,{},t);
maglevNet = train(maglevNet,X,T,Xi,Ai);
[y,xf,af] = maglevNet(X,Xi,Ai);
Next, a MATLAB function is generated and tested. The function is then used to create a
shared/dynamically linked library with mcc.
genFunction(maglevNet,'maglevFcn');
[y2,xf,af] = maglevFcn(X,Xi,Ai);
accuracy2 = max(abs(cell2mat(y)-cell2mat(y2)))
mcc -W lib:libMaglev -T link:lib maglevFcn
Next, another version of the MATLAB function is generated which supports only matrix
arguments (no cell arrays). This function is tested. Then it is used to generate a MEX-
function with the MATLAB Coder tool codegen, and the result is also tested.
genFunction(maglevNet,'maglevFcn','MatrixOnly','yes');
x1 = cell2mat(X(1,:)); % Convert each input to matrix
x2 = cell2mat(X(2,:));
xi1 = cell2mat(Xi(1,:)); % Convert each input state to matrix
xi2 = cell2mat(Xi(2,:));
[y3,xf1,xf2] = maglevFcn(x1,x2,xi1,xi2);
accuracy3 = max(abs(cell2mat(y)-y3))
Enhanced Tools
The function genFunction is introduced with a new panel in the tools nftool, nctool,
nprtool and ntstool.
The advanced scripts generated on the Save Results panel of each of these tools include
an example of deploying networks with genFunction.
For more information, see Deploy Neural Network Functions.
It can be useful to simulate a trained neural network up to the present with all the known
values of a time series in open-loop mode, then switch to closed-loop mode to continue
the simulation for as many predictions into the future as are desired. It is now much
easier to do this.
Previously, openloop and closeloop transformed the neural network between those
two modes.
net = openloop(net)
net = closeloop(net)
This is still the case. However, these functions now also support the transformation of
input and layer delay state values between open- and closed-loop modes, making it easier
to switch between open-loop and closed-loop multistep prediction.
[net,xi,ai] = openloop(net,xi,ai);
[net,xi,ai] = closeloop(net,xi,ai);
Here, a neural network is trained to model the magnetic levitation system in default open-
loop mode.
[X,T] = maglev_dataset;
net = narxnet(1:2,1:2,10);
[x,xi,ai,t] = preparets(net,X,{},T);
net = train(net,x,t,xi,ai);
view(net)
Then closeloop is used to convert the network to closed-loop form for simulation.
netc = closeloop(net);
[x,xi,ai,t] = preparets(netc,X,{},T);
y = netc(x,xi,ai);
view(netc)
Now consider the case where you might have a record of the Maglev’s behavior for 20
time steps, but then want to predict ahead for 20 more time steps beyond that.
Define the first 20 steps of inputs and targets, representing the 20 time steps where the
output is known, as defined by the targets t. Then the next 20 time steps of the input are
defined, but you use the network to predict the 20 outputs, feeding back each of its
predictions to help the network perform the next prediction.
x1 = x(1:20);
t1 = t(1:20);
x2 = x(21:40);
[x,xi,ai,t] = preparets(net,x1,{},t1);
[y1,xf,af] = net(x,xi,ai);
Now the final input and layer states returned by the network are converted to closed-loop
form along with the network. The final input states xf, and layer states af, of the open-
loop network become the initial input states xi, and layer states ai, of the closed-loop
network.
[netc,xi,ai] = closeloop(net,xf,af);
Typically, preparets is used to define initial input and layer states. Since these have
already been obtained from the end of the open-loop simulation, you do not need
preparets to continue with the 20 step predictions of the closed-loop network.
[y2,xf,af] = netc(x2,xi,ai);
Note that x2 can be set to different sequences of inputs to test different scenarios for
however many time steps you would like to make predictions. For example, to predict the
magnetic levitation system’s behavior if 10 random inputs were used:
x2 = num2cell(rand(1,10));
[y2,xf,af] = netc(x2,xi,ai);
See “Softmax transfer function in output layer gives consistent class probabilities for
pattern recognition and classification” below.
First, networks created with patternnet now use the cross-entropy performance
measure (crossentropy), which frequently produces classifiers with a lower percentage
of misclassifications than those obtained using mean squared error.
Second, patternnet returns networks that use the softmax transfer function
(softmax) for the output layer instead of the tansig sigmoid transfer function. softmax
produces output vectors normalized so they sum to 1.0, which can be interpreted as class
probabilities. (tansig produces outputs in the −1 to 1 range; they do not sum to
1.0 and have to be manually normalized before being treated as consistent class
probabilities.)
Here a patternnet with 10 neurons is created, and its performance function and network
diagram are displayed.
net = patternnet(10);
net.performFcn
ans =
crossentropy
view(net)
The output layer’s transfer function is shown with the symbol for softmax.
Training the network takes advantage of the new crossentropy performance function.
Here the network is trained to classify iris flowers. The cross-entropy performance
algorithm is shown in the nntraintool algorithm section. Clicking the “Performance”
plot button shows how the network’s cross-entropy was minimized throughout the
training session.
[x,t] = iris_dataset;
net = train(net,x,t);
Simulating the network results in normalized output. Sample 150 is used to illustrate the
normalization of class membership likelihoods:
y = net(x(:,150))
y =
0.0001
0.0528
0.9471
sum(y)
The network output shows three membership probabilities, with class three by far the
most likely. Each probability value is between 0 and 1, and together they sum to 1,
indicating a 100% probability that the input x(:,150) falls into one of the three
classes.
Compatibility Considerations
If a patternnet network is used to train on target data with only one row, the network’s
output transfer function will be changed to tansig and its outputs will continue to
operate as they did before the softmax enhancement. However, the 1-of-N notation for
targets is recommended even when there are only two classes. In that case the targets
should have two rows, where each column has a 1 in the first or second row to indicate
class membership.
If you prefer the older patternnet of mean squared error performance and a sigmoid
output transfer function, you can specify this by setting those neural network object
properties. Here is how that is done for a patternnet with 10 neurons.
net = patternnet(10);
net.layers{2}.transferFcn = 'tansig';
net.performFcn = 'mse';
This feature can be especially useful for long parallel training sessions that are more
likely to be interrupted by computing resource failures and which you can stop only with
a Ctrl+C break, because the nntraintool tool (with its Stop button) is not available
during parallel training.
[x,t] = house_dataset;
net = feedforwardnet(10);
net2 = train(net,x,t,'CheckpointFile','MyCheckpoint.mat');
By default, checkpoint saves occur at most once every 60 seconds. For the short training
example above this results in only two checkpoints, one at the beginning and one at the
end of training.
The optional training argument 'CheckpointDelay' changes the frequency of saves.
For example, here the minimum checkpoint delay is set to 10 seconds, for a time-series
problem where a neural network is trained to model a levitated magnet.
[x,t] = maglev_dataset;
net = narxnet(1:2,1:2,10);
[X,Xi,Ai,T] = preparets(net,x,{},t);
net2 = train(net,X,T,Xi,Ai,'CheckpointFile','MyCheckpoint.mat','CheckpointDelay',10);
After a computer failure or training interruption, the checkpoint structure containing the
best neural network obtained before the interruption and the training record can be
reloaded. In this case the stage field value is 'Final', indicating the last save was at
the final epoch, because training completed successfully. The first epoch checkpoint is
indicated by 'First', and intermediate checkpoints by 'Write'.
load('MyCheckpoint.mat')
checkpoint =
file: '/WorkingDir/MyCheckpoint.mat'
time: [2013 3 22 5 0 9.0712]
number: 6
stage: 'Final'
net: [1x1 network]
tr: [1x1 struct]
Training can be resumed from the last checkpoint by reloading the dataset (if necessary),
then calling train with the recovered network.
[x,t] = maglev_dataset;
load('MyCheckpoint.mat');
net = checkpoint.net;
[X,Xi,Ai,T] = preparets(net,x,{},t);
net2 = train(net,X,T,Xi,Ai,'CheckpointFile','MyCheckpoint.mat','CheckpointDelay',10);
For more information, see Automatically Save Checkpoints During Neural Network
Training.
Here a feed-forward neural network is created and its input and output properties
examined.
net = feedforwardnet(10);
net.input
net.output
The net.inputs{1} notation for the input and net.outputs{2} notation for the
second layer output continue to work. The cell array notation continues to be required for
networks with multiple inputs and outputs.
net = feedforwardnet(10)
Compatibility Considerations
The efficiency properties are still supported and do not yet generate warnings, so
backward compatibility is maintained. However, the recommended way to use memory
reduction is no longer to set net.efficiency.memoryReduction. The recommended
notation since R2012b is to use optional training arguments:
[x,t] = vinyl_dataset;
net = feedforwardnet(10);
net = train(net,x,t,'Reduction',10);
Memory reduction is a way to trade off training time for lower memory requirements
when using Jacobian training such as trainlm and trainbr. The MemoryReduction
value indicates how many passes must be made to simulate the network and calculate its
gradients each epoch. The storage requirements go down as the memory reduction goes
up, although not necessarily proportionally. The default MemoryReduction is 1, which
indicates no memory reduction.
R2013a
Version: 8.0.1
Bug Fixes
R2012b
Version: 8.0
New Features
Bug Fixes
Compatibility Considerations
In Version 7, typical code for training and simulating a feed-forward neural network looks
like this:
[x,t] = house_dataset;
net = feedforwardnet(10);
view(net)
net = train(net,x,t);
y = net(x);
In Version 8.0, the above code does not need to be changed, but calculations now happen
in compiled native MEX code.
Speedups of as much as 25% over Version 7.0 have been seen on a sample system (4-core
2.8 GHz Intel i7 with 12 GB RAM).
Note that speed improvements measured on the sample system might vary significantly
from improvements measured on other systems due to different chip speeds, memory
bandwidth, and other hardware and software variations.
The following code creates, views, and trains a dynamic NARX neural network model of a
maglev system in open-loop mode.
[x,t] = maglev_dataset;
net = narxnet(1:2,1:2,10);
view(net)
[X,Xi,Ai,T] = preparets(net,x,{},t);
net = train(net,X,T,Xi,Ai);
y = net(X,Xi,Ai)
The following code measures training speed over 10 training sessions, with the training
window disabled to avoid GUI timing interference.
On the sample system, this ran three times (3x) faster in Version 8.0 than in Version 7.0.
rng(0)
[x,t] = maglev_dataset;
net = narxnet(1:2,1:2,10);
[X,Xi,Ai,T] = preparets(net,x,{},t);
net.trainParam.showWindow = false;
tic
for i=1:10
net = train(net,X,T,Xi,Ai);
end
toc
[x,t] = maglev_dataset;
net = narxnet(1:2,1:2,10);
net = closeloop(net);
view(net)
[X,Xi,Ai,T] = preparets(net,x,{},t);
net = train(net,X,T,Xi,Ai);
For this case, and most closed-loop (recurrent) network training, Version 8.0 ran the code
more than one-hundred times (100x) faster than Version 7.0.
A dramatic example of where the improved closed loop training speed can help is when
training a NARX network model of a double pendulum. By initially training the network in
open-loop mode, then in closed-loop mode with two time step sequences, then three time
step sequences, etc., a network has been trained that can simulate the system for 500
time steps in closed-loop mode. This corresponds to a 500 step ahead prediction.
[Figure: Closed-loop NARX predictions of the double-pendulum angles q1 and q2 over 500 time steps, comparing the exact model with the NN model.]
Because of the Version 8.0 MEX speedup, this only took a few hours, as opposed to the
months it would have taken in Version 7.0.
MEX code is also far more memory efficient. The amount of RAM used for intermediate
variables during training and simulation is now relatively constant, instead of growing
linearly with the number of samples. In other words, a problem with 10,000 samples
requires the same temporary storage as a problem with only 100 samples.
This memory efficiency means larger problems can be trained on a single computer.
Compatibility Considerations
For very large networks, MEX code might fall back to MATLAB code. If this happens and
memory availability becomes an issue, use the 'reduction' option to implement
memory reduction. The reduction number indicates the number of passes to make
through the data for each calculation. Each pass calculates with a fraction of the data,
and the results are combined after all passes are complete. This trades off lower memory
requirements for longer calculation times.
net = train(net,x,t,'reduction',10);
y = net(x,'reduction',10);
Previously, memory reduction was specified by setting a network property:
net.efficiency.memoryReduction = N;
This continues to work in Version 8.0, but it is recommended that you update your code to
use the 'reduction' option for train and network simulation. Additional name-value
pair arguments are the standard way to indicate calculation options.
Note that, during training, the calculations of network outputs, performance, gradient, and
Jacobian are parallelized, while the main training code remains on one worker.
matlabpool open
numWorkers = matlabpool('size')
If calling matlabpool produces an error, it might be that Parallel Computing Toolbox is
not available.
[x,t] = house_dataset;
net = feedforwardnet(10);
net = train(net,x,t,'useParallel','yes');
y = sim(net,x,'useParallel','yes');
On the sample system with a pool of four cores, typical speedups have been between 3x
and 3.7x. Using more than four cores might produce faster speeds. For more information,
see Parallel and GPU Computing.
To train and simulate with a GPU, set the 'useGPU' option to 'yes'. Use the gpuDevice
command to get information on your GPU.
gpuInfo = gpuDevice
Speedups on the sample system with an nVidia GTX 470 GPU card have been between 3x
and 7x, but might increase as GPUs continue to improve.
You can also use multiple GPUs. If you set both 'useParallel' and 'useGPU' to
'yes', any worker associated with a unique GPU will use that GPU, and other workers
will use their CPU core. It is not efficient to share GPUs between workers, as that would
require them to perform their calculations in sequence instead of in parallel.
numWorkers = matlabpool('size')
numGPUs = gpuDeviceCount
[x,t] = house_dataset;
net = feedforwardnet(10);
net.trainFcn = 'trainscg';
net = train(net,x,t,'useParallel','yes','useGPU','yes');
y = sim(net,x,'useParallel','yes','useGPU','yes');
Tests with three GPU workers and one CPU worker on the sample system have seen 3x or
higher speedup. Depending on the size of the problem, and how much it uses the capacity
of each GPU, adding GPUs might increase speed or might simply increase the size of
problem that can be run.
In some cases, training with both GPUs and CPUs can result in slower speeds than just
training with the GPUs, because the CPUs might not keep up with the GPUs. In this case,
set 'useGPU' to 'only' and only GPU workers will be used.
[x,t] = house_dataset;
net = feedforwardnet(10);
net = train(net,x,t,'useParallel','yes','useGPU','only');
y = sim(net,x,'useParallel','yes','useGPU','only');
This is done by loading the Composite sequentially. For instance, here the sub-datasets
are loaded from files as they are distributed:
Xc = Composite;
Tc = Composite;
for i=1:10
data = load(['dataset' num2str(i)])
Xc{i} = data.x;
Tc{i} = data.t;
clear data
end
This technique allows for training with datasets of any size, limited only by the available
RAM across an entire cluster.
n = -10:0.01:10;
a1 = elliotsig(n);
a2 = tansig(n);
h = plot(n,a1,n,a2);
legend(h,'ELLIOTSIG','TANSIG','Location','NorthWest')
To set up a neural network to use the elliotsig transfer function, change each tansig
layer’s transfer function with its transferFcn property. For instance, here a network
using elliotsig is created, viewed, trained, and simulated:
[x,t] = house_dataset;
net = feedforwardnet(10);
view(net) % View TANSIG network
net.layers{1}.transferFcn = 'elliotsig';
view(net) % View ELLIOTSIG network
net = train(net,x,t);
y = net(x)
n = rand(1000,1000);
tic, for i=1:100, a = elliotsig(n); end, elliotsigTime = toc
tic, for i=1:100, a = tansig(n); end, tansigTime = toc
speedup = tansigTime / elliotsigTime
However, because of the different shape, elliotsig might not result in faster training
than tansig. It might require more training steps. For simulation, elliotsig is always
faster.
For more information, see Parallel and GPU Computing.
This is done using the Parallel Computing Toolbox function Composite. Composite data is
data spread across a parallel pool of MATLAB workers.
For instance, if a parallel pool is open with four workers, data can be distributed as
follows:
[x,t] = house_dataset;
Xc = Composite;
Tc = Composite;
Xc{1} = x(:, 1:150); % First 150 samples of x
Tc{1} = t(:, 1:150); % First 150 samples of t
Xc{2} = x(:, 151:300); % Second 150 samples of x
Tc{2} = t(:, 151:300); % Second 150 samples of t
Xc{3} = x(:, 301:403); % Third 103 samples of x
Tc{3} = t(:, 301:403); % Third 103 samples of t
Xc{4} = x(:, 404:506); % Fourth 103 samples of x
Tc{4} = t(:, 404:506); % Fourth 103 samples of t
When you call train, the 'useParallel' option is not needed, because train
automatically trains in parallel when using Composite data.
net = train(net,Xc,Tc);
If you want workers 1 and 2 to use GPU devices 1 and 2, while workers 3 and 4 use CPUs,
set up data for workers 1 and 2 using nndata2gpu inside an spmd clause.
spmd
if labindex <= 2
Xc = nndata2gpu(Xc);
Tc = nndata2gpu(Tc);
end
end
The function nndata2gpu takes a neural network matrix or cell array time series data
and converts it to a properly sized gpuArray on the worker’s GPU. This involves
transposing the matrices, padding the columns so their first elements are memory
aligned, and combining matrices, if the data was a cell array of matrices. To reverse the
process for outputs returned after simulation with gpuArray data, use gpu2nndata to
convert them back to a regular matrix or a cell array of matrices.
As with 'useParallel', the data type removes the need to specify 'useGPU'. Training
and simulation automatically recognize that two of the workers have gpuArray data and
employ their GPUs accordingly.
net = train(net,Xc,Tc);
This way, any variation in speed or memory limitations between workers can be
accounted for by putting differing numbers of samples on those workers.
Set the 'showResources' option to 'yes' to check what resources are actually being
used, as opposed to requested for use, when training and simulating.
[x,t] = house_dataset;
net = feedforwardnet(10);
net2 = train(net,x,t,'showResources','yes');
y = net2(x,'showResources','yes');
Computing Resources:
MEX on PCWIN64
net2 = train(net,x,t,'useParallel','yes','showResources','yes');
y = net2(x,'useParallel','yes','showResources','yes');
Computing Resources:
Worker 1 on Computer1, MEX on PCWIN64
Worker 2 on Computer1, MEX on PCWIN64
Worker 3 on Computer1, MEX on PCWIN64
Worker 4 on Computer1, MEX on PCWIN64
net2 = train(net,x,t,'useGPU','yes','showResources','yes');
y = net2(x,'useGPU','yes','showResources','yes');
Computing Resources:
GPU device 1, TypeOfCard
net2 = train(net,x,t,'useParallel','yes','useGPU','yes',...
'showResources','yes');
y = net2(x,'useParallel','yes','useGPU','yes','showResources','yes');
Computing Resources:
Worker 1 on Computer1, GPU device 1, TypeOfCard
Worker 2 on Computer1, GPU device 2, TypeOfCard
Worker 3 on Computer1, MEX on PCWIN64
Worker 4 on Computer1, MEX on PCWIN64
net2 = train(net,x,t,'useParallel','yes','useGPU','only',...
'showResources','yes');
y = net2(x,'useParallel','yes','useGPU','only','showResources','yes');
Computing Resources:
Worker 1 on Computer1, GPU device 1, TypeOfCard
Worker 2 on Computer1, GPU device 2, TypeOfCard
In Version 8.0 the related functions for neural network processing are in package folders,
so each local function has its own file.
For instance, in Version 7.0 the function tansig contained a large switch statement and
several local functions. In Version 8.0 there is a root function tansig, along with several
package functions in the folder /toolbox/nnet/nnet/nntransfer/+tansig/.
+tansig/activeInputRange.m
+tansig/apply.m
+tansig/backprop.m
+tansig/da_dn.m
+tansig/discontinuity.m
+tansig/forwardprop.m
+tansig/isScalar.m
+tansig/name.m
+tansig/outputRange.m
+tansig/parameterInfo.m
+tansig/simulinkParameters.m
+tansig/type.m
Each transfer function has its own package with the same set of package functions. For
lists of processing, weight, net input, transfer, performance, and distance functions, each
of which has its own package, type the following:
help nnprocess
help nnweight
help nnnetinput
help nntransfer
help nnperformance
help nndistance
The calling interfaces for training functions are updated for the new calculation modes
and parallel support. Normally, training functions would not be called directly, but
indirectly by train, so this is unlikely to require any code changes.
Compatibility Considerations
Due to the new package organization for processing, weight, net input, transfer,
performance and distance functions, any custom functions of these types will need to be
updated to conform to this new package system before they will work with Version 8.0.
See the main functions and package functions for mapminmax, dotprod, netsum,
tansig, mse, and dist for examples of this new organization. Any of these functions and
its package functions may be used as a template for new or updated custom functions.
Due to the new calling interfaces for training functions, any custom backpropagation
training function will need to be updated to work with Version 8.0. See trainlm and
trainscg for examples that can be used as templates for any new or updated custom
training function.
R2012a
Version: 7.0.3
Bug Fixes
R2011b
Version: 7.0.2
Bug Fixes
R2011a
Version: 7.0.1
Bug Fixes
R2010b
Version: 7.0
New Features
Bug Fixes
Compatibility Considerations
New Time Series GUI and Tools
The new ntstool function opens a wizard GUI that allows time series problems to be
solved with three kinds of neural networks: NARX networks (neural auto-regressive with
external input), NAR networks (neural auto-regressive), and time delay neural networks.
It follows a similar format to the neural fitting (nftool), clustering (nctool), and pattern
recognition (nprtool) tools.
Network diagrams shown in the Neural Time Series Tool, Neural Training Tool, and with
the view(net) command, have been improved to show tap delay lines in front of
weights, the sizes of inputs, layers and outputs, and the time relationship of inputs and
outputs. Open loop feedback outputs and inputs are indicated with matching tab and
indents in their respective blocks.
The Save Results panel of the Neural Network Time Series Tool allows you to generate
both a Simple Script, which demonstrates how to get the same results as were obtained
with the wizard, and an Advanced Script, which provides an introduction to more
advanced techniques.
The Train Network panel of the Neural Network Time Series Tool introduces four new
plots, which you can also access from the Network Training Tool and the command line.
ploterrhist(errors)
The dynamic response can be plotted, with colors indicating how targets were assigned to
training, validation and test sets across timesteps. (Dividing data by timesteps and other
criteria, in addition to by sample, is a new feature described in “New Time Series
Validation” on page 18-9.)
plotresponse(targets,outputs)
ploterrcorr(errors)
plotinerrcorr(inputs,errors)
Simpler time series neural network creation is provided for NARX and time-delay
networks, and a new function creates NAR networks. All the network diagrams shown
here are generated with the command view(net).
Several new data sets provide sample problems that can be solved with these networks.
These data sets are also available within the ntstool GUI and the command line.
[x, t] = simpleseries_dataset;
[x, t] = simplenarx_dataset;
[x, t] = exchanger_dataset;
[x, t] = maglev_dataset;
[x, t] = ph_dataset;
[x, t] = pollution_dataset;
[x, t] = refmodel_dataset;
[x, t] = robotarm_dataset;
[x, t] = valve_dataset;
The preparets function formats input and target time series for time series networks, by
shifting the inputs and targets as needed to fill initial input and layer delay states. This
function simplifies what is normally a tricky data preparation step that must be
customized for details of each kind of network and its number of delays.
[x, t] = simplenarx_dataset;
net = narxnet(1:2, 1:2, 10);
[xs, xi, ai, ts] = preparets(net, x, {}, t);
net = train(net, xs, ts, xi, ai);
y = net(xs, xi, ai)
The output-to-input feedback of NARX and NAR networks (or custom time series network
with output-to-input feedback loops) can be converted between open- and closed-loop
modes using the two new functions closeloop and openloop.
The total delay through a network can be adjusted with the two new functions
removedelay and adddelay. Removing a delay from a NARX network whose minimum
input and feedback delay is 1, so that its minimum delay becomes 0, allows the network
to predict the next target value one timestep ahead of when that value is expected.
net = removedelay(net)
net = adddelay(net)
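A sketch of this early-prediction workflow, again continuing the narxnet example above
(it assumes the variables x and t from that example):
nets = removedelay(net);                          % minimum delay drops from 1 to 0
[xs2, xi2, ai2, ts2] = preparets(nets, x, {}, t); % re-prepare data for the shifted network
ys = nets(xs2, xi2, ai2);                         % each output now arrives one timestep early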
The new function catsamples allows you to combine multiple time series into a single
neural network data variable. This is useful for creating input and target data from
multiple input and target time series.
In the case where the time series are not the same length, the shorter time series can be
padded with NaN values. NaN indicates "don't care" (or, equivalently, "don't know")
inputs and targets, and has no effect during simulation and training. Alternatively, the
shorter series can be padded with any other value, such as zero.
x = catsamples(x1, x2, x3, 'pad', 0)
Many other new and updated data-handling functions make it easier to manipulate
neural network time series data. For a full list, type:
help nndatafun
New Time Series Validation
Normally, data is divided into training, validation, and test sets by sample. However,
many time series problems involve only a single time series. To support validation in this
case, you can set the new divideMode property to divide data up by timestep. This is the
default setting for narxnet and other time series networks.
net.divideMode = 'time'
You can also set this property manually to divide targets across both sample and
timestep, across all target values (that is, across sample, timestep, and output element),
or to perform no data division at all.
net.divideMode = 'sampletime'
net.divideMode = 'all'
net.divideMode = 'none'
When the feedback mode of an output is set to 'closed', the output-to-input feedback is
implemented with internal feedback: the associated feedback input is removed from the
network, and the output properties change as follows:
net.outputs{i}.feedbackInput = [];
net.outputs{i}.feedbackMode = 'closed'
Another output property keeps track of the proper closed-loop delay, when a network is in
open-loop mode. Normally this property has this setting:
net.outputs{i}.feedbackDelay = 0
However, if a delay is removed from the network, it is updated to 1, to indicate that the
network's output is actually one timestep ahead of its inputs, and must be delayed by 1 if
it is to be converted to closed-loop form.
net.outputs{i}.feedbackDelay = 1
You can define error weights by sample, output element, time step, or network output:
ew = [1.0 0.5 0.7 0.2]; % Weighting errors across 4 samples
ew = [0.1; 0.5; 1.0]; % ... across 3 output elements
ew = {0.1 0.2 0.3 0.5 1.0}; % ... across 5 timesteps
ew = {1.0; 0.5}; % ... across 2 network outputs
Error weights can also be defined across any combination of these dimensions. For
example, here errors are weighted across two time series (i.e., two samples) over four
timesteps:
ew = {[0.5 0.4], [0.3 0.5], [1.0 1.0], [0.7 0.5]};
In the general case, error weights can have exactly the same dimension as targets, where
each target has an associated error weight.
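A sketch of passing error weights to train, continuing the narxnet example above and
assuming the prepared data xs, ts, xi, and ai from that example; the weight values here
are illustrative only.
ew = num2cell(linspace(0.5, 1, numel(ts)));  % one scalar weight per timestep, later steps count more
net = train(net, xs, ts, xi, ai, ew);        % error weights are an optional final argument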
Some performance functions are now obsolete, as their functionality has been
implemented as options within the four remaining performance functions: mse, mae, sse,
and sae.
Regularization, previously provided by separate regularized performance functions, is
now implemented with a regularization property.
% Any value between the default 0 and 1.
net.performParam.regularization
The error normalization implemented in msne and msnereg is now implemented with a
normalization property.
% Either 'normalized', 'percent', or the default 'none'.
net.performParam.normalization
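A short sketch of using these parameters (the values are illustrative only):
[x, t] = simplefit_dataset;
net = fitnet(10);                            % mse is the default performance function
net.performParam.regularization = 0.1;       % add a small amount of regularization
net.performParam.normalization = 'percent';  % normalize errors as percentages
net = train(net, x, t);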
Compatibility Considerations
The old performance functions and old performance argument lists continue to work as
before, but they are no longer recommended.
gensim has new options for generating neural network systems in Simulink:
Name - the system name
SampleTime - the sample time
Additional options, shown in the example below, include InputMode, OutputMode, and
SolverMode.
For instance, here a NARX network is created and set up in MATLAB to use workspace
inputs and outputs.
[x, t] = simplenarx_dataset;
net = narxnet(1:2, 1:2, 10);
[xs, xi, ai, ts] = preparets(net, x, {}, t);
net = train(net, xs, ts, xi, ai);
net = closeloop(net);
[sysName, netName] = gensim(net, 'InputMode', 'workspace', ...
'OutputMode', 'workspace', 'SolverMode', 'discrete');
Simulink neural network blocks now allow initial conditions for input and layer delays to
be set directly by double-clicking the neural network block. setsiminit and
getsiminit provide command-line control for setting and getting the input and layer
delay states of a neural network Simulink block.
References to functions throughout the online documentation and command-line help now
link directly to their function pages.
help feedforwardnet
Network objects displayed at the command line now also include links. For example:
net = feedforwardnet(10);
Subobjects of the network, such as inputs, layers, outputs, biases, weights, and parameter
lists, also display with links.
net.inputs{1}
net.layers{1}
net.outputs{2}
net.biases{1}
net.inputWeights{1, 1}
net.trainParam
The training tool nntraintool and the wizard GUIs nftool, nprtool, nctool, and
ntstool provide numerous hyperlinks to documentation.
Here, the error gradient is calculated for a newly created and configured feedforward
network:
net = feedforwardnet(10);
[x, t] = simplefit_dataset;
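One way to complete this calculation is sketched below; it assumes the defaultderiv
function and its 'dperf_dwb' derivative mode, which are not shown in the surrounding
text.
net = configure(net, x, t);                   % size the inputs and outputs for this data
gwb = defaultderiv('dperf_dwb', net, x, t)    % gradient of performance with respect to weights and biases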
New network creation functions provide a simpler calling interface than the functions
they replace. For example, compare the new feedforwardnet with the old newff:
% New function
net = feedforwardnet(hiddenSizes, trainingFcn)
% Old function
net = newff(x, t, hiddenSizes, transferFcns, trainingFcn, ...
learningFcn, performanceFcn, inputProcessingFcns, ...
outputProcessingFcns, dataDivisionFcn)
The new functions (and the old functions they replace) are:
feedforwardnet (newff)
cascadeforwardnet (newcf)
competlayer (newc)
distdelaynet (newdtdnn)
elmannet (newelm)
fitnet (newfit)
layrecnet (newlrn)
linearlayer (newlin)
lvqnet (newlvq)
narxnet (newnarx, newnarxsp)
patternnet (newpr)
perceptron (newp)
selforgmap (newsom)
timedelaynet (newtdnn)
A new network's inputs and outputs are created with size zero. They are then configured
for data when train is called, or earlier by optionally calling the new configure
function:
net = configure(net, x, t)
Unconfigured networks can be saved and later configured for many different problems.
The unconfigure function sets a configured network's inputs and outputs back to size
zero, so that the network can later be configured for other data.
net = unconfigure(net)
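A brief sketch of this workflow; the sample data sets used here (simplefit_dataset and
bodyfat_dataset) are assumed to be available among the toolbox data sets.
[x1, t1] = simplefit_dataset;
net = feedforwardnet(10);
net = configure(net, x1, t1);   % inputs and outputs sized for the first problem
net = unconfigure(net);         % sizes reset to zero; the network can be reused
[x2, t2] = bodyfat_dataset;
net = configure(net, x2, t2);   % the same network, configured for different data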
Compatibility Considerations
Old functions continue working as before, but are no longer recommended.
Improved GUIs
The neural fitting nftool, pattern recognition nprtool, and clustering nctool GUIs
have been updated with links back to the nnstart GUI. They give the option of
generating either simple or advanced scripts in their last panel. They also ask for
confirmation when closing if a script has not been generated or the results have not been
saved.
Memory reduction, which lowers the memory required for training calculations at the
cost of longer computation time, is now set with a new property. The default is 1, for no
memory reduction. Setting it to 2 or higher splits the calculations into that many parts.
net.efficiency.memoryReduction
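For example, a quick sketch of requesting memory reduction before training:
net = feedforwardnet(10);
net.efficiency.memoryReduction = 2;   % split the calculations into two parts
[x, t] = simplefit_dataset;
net = train(net, x, t);               % trains with less memory, at some cost in speed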
Compatibility Considerations
The trainlm and trainbr training parameter MEM_REDUC is now obsolete.
References to it will need to be updated. Code referring to it will generate a warning.
New sample data sets are provided with the toolbox. For example:
help simplefit_dataset
[x, t] = simplefit_dataset;
For a full list of the sample data sets, type:
help nndatasets
The argument lists for training functions, such as trainlm and traingd, have been
updated to match train. The argument list for the adapt function adaptwb has been
updated, as have the argument lists for the layer and network initialization functions
initlay, initnw, and initwb.
Compatibility Considerations
Any custom functions of these types, or code which calls these functions manually, will
need to be updated.
19
R2010a
Version: 6.0.4
Bug Fixes
20
R2009b
Version: 6.0.3
Bug Fixes
21
R2009a
Version: 6.0.2
Bug Fixes
22
R2008b
Version: 6.0.1
Bug Fixes
23
R2008a
Version: 6.0
New Features
Bug Fixes
Compatibility Considerations
Training a network now automatically opens a window that shows training progress. The
window also includes buttons for plots associated with the network being trained. These
buttons launch the plots during or after training. If the plots are open during training,
they update every epoch, resulting in animations that make understanding network
performance much easier.
The training window can be opened and closed at the command line as follows:
nntraintool
nntraintool('close')
• plotperform—Plot performance.
• plottrainstate—Plot training state.
Compatibility Considerations
To turn off the new training window and display command-line output (which was the
default display in previous versions), use these two training parameters:
net.trainParam.showWindow = false;
net.trainParam.showCommandLine = true;
The newpr function creates a pattern recognition network at the command line. Pattern
recognition networks are feed-forward networks that solve problems with Boolean or 1-of-
N targets and have confusion (plotconfusion) and receiver operating characteristic
(plotroc) plots associated with them.
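A minimal sketch of creating such a network; it assumes the iris_dataset sample data
set, which has 1-of-N class targets.
[p, t] = iris_dataset;           % 4 inputs, 3-class 1-of-N targets
net = newpr(p, t, 20);           % pattern recognition network with 20 hidden neurons
net = train(net, p, t);
plotconfusion(t, sim(net, p));   % confusion plot of targets versus network outputs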
The new confusion function calculates the true/false, positive/negative results from
comparing network output classification with target classes.
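Continuing the sketch above:
y = sim(net, p);
[c, cm] = confusion(t, y);   % c is the fraction misclassified, cm is the confusion matrix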
The newsom function for creating self-organizing maps has new calling conventions that
produce faster results.
Compatibility Considerations
You can still call the newsom function using conventions from earlier versions of the
toolbox, but using its new calling conventions gives you faster results.
Network diagrams appear in all the Neural Network Toolbox graphical interfaces. In
addition, you can open a network diagram viewer of any network from the command line
by typing
view(net)
The nftool wizard has been updated to use newfit, for simpler operation, to include
the new network diagrams, and to include sample data sets. It now allows a Simulink
block version of the trained network to be generated from the final results panel.
Compatibility Considerations
The code generated by nftool is different from the code generated in previous versions.
However, the code generated by earlier versions still operates correctly.
24
R2007b
Version: 5.1
New Features
Bug Fixes
Compatibility Considerations
The following network creation functions have new input arguments that simplify
network creation:
• newcf
• newff
• newdtdnn
• newelm
• newfftd
• newlin
• newlrn
• newnarx
• newnarxsp
For detailed information about each function, see the corresponding reference pages.
• You can now specify input and target data values directly. In the previous release, you
specified input ranges and the size of the output layer instead.
• The new syntax automates preprocessing, data division, and postprocessing of data.
For example, to create a two-layer feed-forward network with 20 neurons in its hidden
layer for a given matrix of input vectors p and target vectors t, you can now use newff
with the following arguments:
net = newff(p,t,20);
This command also sets properties of the network such that the functions sim and train
automatically preprocess inputs and targets, and postprocess outputs.
In the previous release, you had to use the following three commands to create the same
network:
pr = minmax(p);
s2 = size(t,1);
net = newff(pr,[20 s2]);
Compatibility Considerations
Your existing code still works but might produce a warning that you are using obsolete
syntax.
At the command line, the new syntax for network-creation functions automates
preprocessing, postprocessing, and data-division operations.
For example, the following code returns a network that automatically preprocesses the
inputs and targets and postprocesses the outputs:
net = newff(p,t,20);
net = train(net,p,t);
y = sim(net,p);
To create the same network in a previous release, you used the following longer code:
[p1,ps1] = removeconstantrows(p);
[p2,ps2] = mapminmax(p1);
[t1,ts1] = mapminmax(t);
pr = minmax(p2);
s2 = size(t1,1);
net = newff(pr,[20 s2]);
net = train(net,p2,t1);
y1 = sim(net,p2);
y = mapminmax('reverse',y1,ts1);
The default input processFcns functions returned with a new network are as follows:
net.inputs{1}.processFcns = ...
{'fixunknowns','removeconstantrows', 'mapminmax'}
The elements of processParams are set to the default values of the fixunknowns,
removeconstantrows, and mapminmax functions.
The default output processFcns functions returned with a new network include the
following:
net.outputs{2}.processFcns = {'removeconstantrows','mapminmax'}
These defaults process outputs by removing rows with constant values across all samples
and mapping the values to the interval [-1 1].
sim and train automatically process inputs and targets using the input and output
processing functions, respectively. sim and train also reverse-process network outputs
as specified by the output processing functions.
For more information about processing input, target, and output data, see “Multilayer
Networks and Backpropagation Training” in the Neural Network Toolbox User's Guide.
You can change the default processing functions either by specifying optional processing
function arguments with the network-creation function, or by changing the value of
processFcns after creating your network.
You can also modify the default parameters for each processing function by changing the
elements of the processParams properties.
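For example, a sketch of mapping inputs to the range [0, 1] instead of the default
[-1, 1]; it assumes a network created as above, with mapminmax as the third entry in the
default input processFcns list.
net.inputs{1}.processParams{3}.ymin = 0;
net.inputs{1}.processParams{3}.ymax = 1;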
After you create a network object (net), you can use the following input properties to
view and modify the automatic processing settings:
• net.inputs{1}.processFcns—Cell array of processing function names
• net.inputs{1}.processParams—Cell array of processing function parameters
The following input properties are automatically set and you cannot change them:
• net.inputs{1}.processSettings—Cell array of processing settings
• net.inputs{1}.processedRange—Ranges of example input vectors after
processing
• net.inputs{1}.processedSize—Number of input elements after processing
After you create a network object (net), you can use the following output properties to
view and modify the automatic processing settings:
• net.outputs{2}.processFcns—Cell array of processing function names
• net.outputs{2}.processParams—Cell array of processing function parameters
Note These output properties require a network that has the output layer as the
second layer.
The following new output properties are automatically set and you cannot change them:
• net.outputs{2}.processSettings—Cell array of processing settings
• net.outputs{2}.processedRange—Ranges of example target vectors after processing
• net.outputs{2}.processedSize—Number of target elements after processing
Automated data division occurs during network creation in the Network/Data Manager
GUI, Neural Network Fitting Tool GUI, and at the command line.
At the command line, to create and train a network with early stopping that uses 20% of
samples for validation and 20% for testing, you can use the following code:
net = newff(p,t,20);
net = train(net,p,t);
Previously, you entered the following code to accomplish the same result:
pr = minmax(p);
s2 = size(t,1);
net = newff(pr,[20 s2]);
[trainV,validateV,testV] = dividevec(p,t,0.2,0.2);
[net,tr] = train(net,trainV.P,trainV.T,[],[],validateV,testV);
For more information about data division, see “Multilayer Networks and Backpropagation
Training” in the Neural Network Toolbox User's Guide.
Network creation functions return the following default data division properties:
• net.divideFcn = 'dividerand'
• net.divideParam.trainRatio = 0.6;
• net.divideParam.valRatio = 0.2;
• net.divideParam.testRatio = 0.2;
Calling train on the network object net divides the set of input and target vectors into
three sets, such that 60% of the vectors are used for training, 20% for validation, and
20% for independent testing.
You can override default data division settings by either supplying the optional data
division argument for a network-creation function, or by changing the corresponding
property values after creating the network.
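For example, a sketch of changing the division ratios after creating the network (the
values are illustrative only):
net.divideParam.trainRatio = 0.8;
net.divideParam.valRatio = 0.1;
net.divideParam.testRatio = 0.1;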
After creating a network, you can view and modify the data division behavior using the
following new network properties:
• net.divideFcn—Name of the data division function
• net.divideParam—Parameters of the data division function
The function gensim now generates neural networks in Simulink that use the new
processing blocks.
Compatibility Considerations
Several properties are now obsolete; use the new processing and data division properties
described above instead.
25
R2007a
Version: 5.0.2
R2006b
Version: 5.0.1
R2006a
Version: 5.0
New Features
Compatibility Considerations
Both focused and distributed time-delay neural networks are now supported. Continue to
use the newfftd function to create focused time-delay neural networks. To create
distributed time-delay neural networks, use the newdtdnn function.
To create parallel NARX configurations, use the newnarx function. To create series-
parallel NARX networks, use the newnarxsp function. The sp2narx function lets you
convert a NARX network from the series-parallel configuration, which is useful for
training, to the parallel configuration.
Use the newlrn function to create layer recurrent networks (LRN networks). LRN
networks are useful for solving some of the more difficult problems in filtering and
modeling applications.
Custom Networks
The training functions in Neural Network Toolbox are enhanced to let you train arbitrary
custom dynamic networks that model complex dynamic systems. For more information
about working with these networks, see the Neural Network Toolbox documentation.
To open the Neural Network Fitting Tool, type the following at the MATLAB prompt:
nftool
dividevec Automatically Splits Data
The dividevec function facilitates dividing your data into three distinct sets to be used
for training, cross validation, and testing, respectively. Previously, you had to split the
data manually.
fixunknowns Encodes Missing Data
The fixunknowns function encodes missing values in your data so that they can be
processed in a meaningful and consistent way during network training. To reverse this
preprocessing operation and return the data to its original state, call fixunknowns again
with 'reverse' as the first argument.
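A small sketch (the data values here are illustrative only):
x = [1 2 NaN 4; 5 NaN 7 8];          % example data with unknown (NaN) values
[y, ps] = fixunknowns(x);            % encode the unknowns for training
x2 = fixunknowns('reverse', y, ps);  % recover the original representation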
mapminmax, mapstd, and processpca Are New
The mapminmax, mapstd, and processpca functions are new and perform data
preprocessing and postprocessing operations.
Compatibility Considerations
Several data preprocessing functions are now obsolete. Use the new functions
(mapminmax, mapstd, and processpca) instead.
Each new function is more efficient than its obsolete predecessors because it
accomplishes both preprocessing and postprocessing of the data. For example, previously
you used premnmx to process a matrix, and then postmnmx to return the data to its
original state. In this release, you accomplish both operations using mapminmax; to
return the data to its original state, you call mapminmax again with 'reverse' as the
first argument:
mapminmax('reverse',Y,PS)
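A short sketch of the full round trip (the data values are illustrative only):
x = [1 2 3 4; 10 20 30 40];          % example data
[y, ps] = mapminmax(x);              % map each row to the range [-1, 1]
x2 = mapminmax('reverse', y, ps);    % recover the original data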
The following derivative functions are now obsolete:
ddotprod
dhardlim
dhardlms
dlogsig
dmae
dmse
dmsereg
dnetprod
dnetsum
dposlin
dpurelin
dradbas
dsatlin
dsatlins
dsse
dtansig
dtribas
Compatibility Considerations
To calculate a derivative in this version, you must pass a derivative argument to the
function. For example, to calculate the derivative of the hyperbolic tangent sigmoid
transfer function output A with respect to its input N, use this syntax:
A = tansig(N,FP)
dA_dN = tansig('dn',N,A,FP)
Here, the argument 'dn' requests the derivative to be calculated.
28
R14SP3
Version: 4.0.6