TensorLayer Documentation
Release 1.11.1
TensorLayer contributors
1 User Guide
1.1 Installation
1.2 Tutorials
1.3 Examples
1.4 Contributing
1.5 Get Involved in Research
1.6 FAQ
2 API Reference
2.1 API - Activations
2.2 API - Array Operations
2.3 API - Cost
2.4 API - Data Pre-Processing
2.5 API - Distributed Training
2.6 API - Files
2.7 API - Iteration
2.8 API - Layers
2.9 API - Models
2.10 API - Natural Language Processing
2.11 API - Optimizers
2.12 API - Reinforcement Learning
2.13 API - Utility
2.14 API - Visualization
2.15 API - Database
Note: If you have problems reading the docs online, you can download the repository from GitHub and open /docs/_build/html/index.html to read the docs offline. The _build folder can be generated in docs by running make html.
User Guide
The TensorLayer user guide explains how to install TensorFlow, CUDA and cuDNN, how to build and train neural
networks using TensorLayer, and how to contribute to the library as a developer.
1.1 Installation
TensorLayer has some prerequisites that need to be installed first, including TensorFlow, numpy and matplotlib. For GPU support, CUDA and cuDNN are required.
If you run into any trouble, please check the TensorFlow installation instructions, which cover installing TensorFlow on a range of operating systems including macOS, Linux and Windows, or ask for help at tensorlayer@gmail.com or in the FAQ.
TensorLayer is built on top of the Python version of TensorFlow, so please install Python first.
Note: We highly recommend Python 3 instead of Python 2 for the sake of future compatibility.
A Python installation that includes the pip command for installing additional modules is recommended. Besides, a virtual environment via virtualenv can help you manage Python packages.
Taking Python 3 on Ubuntu as an example, to install Python together with pip, run the following commands:
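sudo apt-get install python3
sudo apt-get install python3-pip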
To build a virtual environment and install dependencies into it, run the following commands. (You can also skip to Step 3 and let TensorLayer install the prerequisites automatically.)
virtualenv env
env/bin/pip install matplotlib
env/bin/pip install numpy
env/bin/pip install scipy
env/bin/pip install scikit-image
env/bin/pip list
After that, you can run a Python script using the virtual environment's Python as follows.
env/bin/python *.py
The installation instructions for TensorFlow are documented in detail on the TensorFlow website. However, there are some things that need to be considered. For example, TensorFlow officially supports GPU acceleration on Linux, macOS and Windows at present.
Warning: For ARM processor architecture, you need to install TensorFlow from source.
The simplest way to install TensorLayer is as follows; it will also install numpy and matplotlib automatically:
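pip install tensorlayer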
However, if you want to modify or extend TensorLayer, you can download the repository from GitHub and install it as follows:
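git clone https://github.com/tensorlayer/tensorlayer.git
cd tensorlayer
pip install -e .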
This command runs setup.py to install TensorLayer. The -e flag means editable: you can edit the source code in the tensorlayer folder and import the edited TensorLayer.
Thanks to NVIDIA's support, training a fully connected network on a GPU may be 10 to 20 times faster than training it on a CPU; convolutional networks may be around 50 times faster. This requires an NVIDIA GPU with CUDA and cuDNN support.
CUDA
The TensorFlow website also explains how to install CUDA and cuDNN; please see TensorFlow GPU Support.
Download and install the latest CUDA, which is available from the NVIDIA website:
• CUDA download and install
If CUDA is set up correctly, the following command should print some GPU information on the terminal:
python -c "import tensorflow"
cuDNN
Apart from CUDA, NVIDIA also provides a library for common neural network operations that especially speeds up Convolutional Neural Networks (CNNs). Again, it can be obtained from NVIDIA after registering as a developer (it takes a while):
Download and install the latest cuDNN, which is available from the NVIDIA website:
• cuDNN download and install
To install it, copy the *.h files to /usr/local/cuda/include and the lib* files to /usr/local/cuda/lib64.
TensorLayer is built on top of the Python version of TensorFlow, so please install Python first. Note: We highly recommend installing Anaconda. The lowest supported Python version is 3.5.
Anaconda download
GPU support
Thanks to NVIDIA's support, training a fully connected network on a GPU may be 10 to 20 times faster than training it on a CPU; convolutional networks may be around 50 times faster. This requires an NVIDIA GPU with CUDA and cuDNN support.
You should install Microsoft Visual Studio (VS) before installing CUDA. The lowest supported version is VS2010; we recommend installing VS2013 or VS2015. CUDA 7.5 supports VS2010, VS2012 and VS2013. CUDA 8.0 also supports VS2015.
2. Installing CUDA
Download and install the latest CUDA, which is available from the NVIDIA website:
CUDA download
We do not recommend modifying the default installation directory.
3. Installing cuDNN
The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. Download and extract the latest cuDNN, which is available from the NVIDIA website:
cuDNN download
After extracting cuDNN, you will get three folders (bin, lib, include). These folders should then be copied into the CUDA installation directory. (The default installation directory is C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0.)
Installing TensorLayer
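Install the stable version from PyPI:
pip install tensorlayer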
Test
import tensorlayer
If no error occurs and the expected output is displayed, the GPU version has been installed successfully.
1.1.6 Issue
If you get an error output when importing tensorlayer, please read the FAQ.
1.2 Tutorials
For deep learning, this tutorial walks you through building handwritten digit classifiers using the MNIST dataset, arguably the "Hello World" of neural networks. For reinforcement learning, we will let the computer learn to play the Pong game from the original screen inputs. For natural language processing, we start from word embedding and then describe language modeling and machine translation.
This tutorial includes a modularized implementation of Google's TensorFlow Deep Learning tutorial, so you can read the TensorFlow Deep Learning tutorial at the same time [en] [cn].
Note: For experts: read the source code of InputLayer and DenseLayer and you will understand how TensorLayer works. After that, we recommend reading the code on GitHub directly.
The tutorial assumes that you are somewhat familiar with neural networks and TensorFlow (the library which TensorLayer is built on top of). You can learn the basics of neural networks from the Deep Learning Tutorial.
For a slower-paced introduction to artificial neural networks, we recommend Convolutional Neural Networks for Visual Recognition by Andrej Karpathy et al. and Neural Networks and Deep Learning by Michael Nielsen.
To learn more about TensorFlow, have a look at the TensorFlow tutorial. You will not need all of it, but a basic understanding of how TensorFlow works is required to use TensorLayer. If you're new to TensorFlow, we recommend going through that tutorial first.
sess = tf.InteractiveSession()
# prepare data
X_train, y_train, X_val, y_val, X_test, y_test = \
tl.files.load_mnist_dataset(shape=(-1,784))
# define placeholder
x = tf.placeholder(tf.float32, shape=[None, 784], name='x')
y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')
# evaluation
tl.utils.test(sess, network, acc, X_test, y_test, x, y_, batch_size=None, cost=cost)
In the first part of the tutorial, we will just run the MNIST example that’s included in the source distribution of
TensorLayer. The MNIST dataset contains 60000 handwritten digits that are commonly used for training various
image processing systems. Each digit is 28x28 pixels in size.
We assume that you have already run through the Installation. If you haven't done so already, get a copy of the source tree of TensorLayer and navigate to its folder in a terminal window. Then run the tutorial_mnist.py example script:
python tutorial_mnist.py
If everything is set up correctly, you will get an output like the following:
learning_rate: 0.000100
batch_size: 128
The example script allows you to try different models, including Multi-Layer Perceptron, Dropout, Dropconnect,
Stacked Denoising Autoencoder and Convolutional Neural Network. Select different models from if __name__
== '__main__':.
main_test_layers(model='relu')
main_test_denoise_AE(model='relu')
main_test_stacked_denoise_AE(model='relu')
main_test_cnn_layer()
Let’s now investigate what’s needed to make that happen! To follow along, open up the source code.
Preface
The first thing you might notice is that besides TensorLayer, we also import numpy and tensorflow:
import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import set_keep
import numpy as np
import time
As we know, TensorLayer is built on top of TensorFlow; it is meant as a supplement that helps with some tasks, not as a replacement. You will always mix TensorLayer with some vanilla TensorFlow code. set_keep is used to access the placeholders of keeping probabilities when using the Denoising Autoencoder.
Loading data
The first piece of code defines a function load_mnist_dataset(). Its purpose is to download the MNIST dataset
(if it hasn’t been downloaded yet) and return it in the form of regular numpy arrays. There is no TensorLayer involved
at all, so for the purpose of this tutorial, we can regard it as:
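X_train, y_train, X_val, y_val, X_test, y_test = \
    tl.files.load_mnist_dataset(shape=(-1, 784))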
X_train.shape is (50000, 784), to be interpreted as: 50,000 images and each image has 784 pixels.
y_train.shape is simply (50000,), a vector of the same length as X_train giving an integer class label for each image – namely, the digit between 0 and 9 depicted in the image (according to the human annotator who drew that digit).
For the Convolutional Neural Network example, MNIST can be loaded as a 4D version as follows:
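X_train, y_train, X_val, y_val, X_test, y_test = \
    tl.files.load_mnist_dataset(shape=(-1, 28, 28, 1))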
X_train.shape is (50000, 28, 28, 1), which represents 50,000 images with 1 channel, 28 rows and 28 columns each. The channel is one because these are greyscale images, where every pixel has only one value.
This is where TensorLayer steps in. It allows you to define an arbitrarily structured neural network by creating and
stacking or merging layers. Since every layer knows its immediate incoming layers, the output layer (or output layers)
of a network double as a handle to the network as a whole, so usually this is the only thing we will pass on to the rest
of the code.
As mentioned above, tutorial_mnist.py supports four types of models, implemented via easily exchangeable functions with the same interface. First, we'll define a function that creates a Multi-Layer Perceptron (MLP) of a fixed architecture, explaining all the steps in detail. We'll then implement a Denoising Autoencoder (DAE); after that we will stack all Denoising Autoencoders and fine-tune them with supervision. Finally, we'll show how to create a Convolutional Neural Network (CNN). In addition, there is a simple example for the MNIST dataset in tutorial_mnist_simple.py and a CNN example for the CIFAR-10 dataset in tutorial_cifar10_tfrecord.py.
The first script, main_test_layers(), creates an MLP of two hidden layers of 800 units each, followed by a
softmax output layer of 10 units. It applies 20% dropout to the input data and 50% dropout to the hidden layers.
To feed data into the network, TensorFlow placeholders need to be defined as follows. The None here means the network will accept input data of arbitrary batch size after compilation. x is used to hold the X_train data and y_ is used to hold the y_train data. If you know the batch size beforehand and do not need this flexibility, you should give the batch size here – especially for convolutional layers, this can allow TensorFlow to apply some optimizations.
The foundation of each neural network in TensorLayer is an InputLayer instance representing the input data that
will subsequently be fed to the network. Note that the InputLayer is not tied to any specific data yet.
Before adding the first hidden layer, we'll apply 20% dropout to the input data. This is realized via a DropoutLayer instance (see the sketch below).
Note that the first constructor argument is the incoming layer and the second argument is the keeping probability for the activation values. Now we'll proceed with the first fully-connected hidden layer of 800 units by stacking a DenseLayer.
Again, the first constructor argument means that we're stacking network on top of network. n_units simply gives the number of units for this fully-connected layer. act takes an activation function, several of which are defined in tensorflow.nn and tensorlayer.activation. Here we've chosen the rectifier, so we'll obtain ReLUs. We'll now add dropout of 50%, another 800-unit dense layer and 50% dropout again.
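Putting these steps together, here is a minimal sketch of the stack so far (layer names follow tutorial_mnist.py; exact arguments may differ slightly):
network = tl.layers.InputLayer(x, name='input')
network = tl.layers.DropoutLayer(network, keep=0.8, name='drop1')   # 20% dropout on the input
network = tl.layers.DenseLayer(network, n_units=800, act=tf.nn.relu, name='relu1')
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop2')   # 50% dropout
network = tl.layers.DenseLayer(network, n_units=800, act=tf.nn.relu, name='relu2')
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop3')   # 50% dropout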
Finally, we'll add the fully-connected output layer, whose n_units equals the number of classes. Note that the softmax is implemented internally in tf.nn.sparse_softmax_cross_entropy_with_logits() to speed up computation, so we use identity in the last layer; see tl.cost.cross_entropy() for more details.
network = tl.layers.DenseLayer(network,
n_units=10,
act = tf.identity,
name='output')
As mentioned above, each layer is linked to its incoming layer(s), so we only need the output layer(s) to access a
network in TensorLayer:
y = network.outputs
y_op = tf.argmax(tf.nn.softmax(y), 1)
cost = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=y_))
Here, network.outputs is the 10 identity outputs of the network (in one-hot format), and y_op is the integer output representing the class index, while cost is the cross-entropy between the targets and the predicted labels.
The Autoencoder is an unsupervised learning model which is able to extract representative features; it has become widely used for learning generative models of data and for greedy layer-wise pre-training. For the vanilla Autoencoder, see the Deep Learning Tutorial.
The script main_test_denoise_AE() implements a Denoising Autoencoder with a corrosion rate of 50%. The Autoencoder can be defined as follows, where an Autoencoder is represented by a DenseLayer:
recon_layer1 = tl.layers.ReconLayer(network,
x_recon=x,
n_units=784,
act=tf.nn.sigmoid,
name='recon_layer1')
To train the DenseLayer, simply run ReconLayer.pretrain(). If using a denoising Autoencoder, the name of the corrosion layer (a DropoutLayer) needs to be specified as follows. To save the feature images, set save to True. There are many kinds of pre-training metrics for different architectures and applications. For the sigmoid activation, the Autoencoder can be implemented using KL divergence, while for the rectifier, L1 regularization of the activation outputs can make the outputs sparse. So the default behaviour of ReconLayer only provides KLD and cross-entropy for the sigmoid activation function, and L1 of activation outputs plus mean-squared-error for the rectifying activation function. We recommend modifying ReconLayer to implement your own pre-training metric.
recon_layer1.pretrain(sess,
x=x,
X_train=X_train,
X_val=X_val,
denoise_name='denoising1',
n_epoch=200,
batch_size=128,
print_freq=10,
save=True,
save_name='w1pre_')
In addition, the script main_test_stacked_denoise_AE() shows how to stack multiple Autoencoders into one network and then fine-tune them.
Finally, the main_test_cnn_layer() script creates two CNN layers and max pooling stages, a fully-connected
hidden layer and a fully-connected output layer. More CNN examples can be found in other examples, like
tutorial_cifar10_tfrecord.py.
The remaining part of the tutorial_mnist.py script copes with setting up and running a training loop over the
MNIST dataset by using cross-entropy only.
Dataset iteration
An iteration function synchronously iterates over two numpy arrays of input data and targets, respectively, in mini-batches of a given number of items. More iteration functions can be found in tensorlayer.iterate.
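A minimal sketch of a training loop built on tl.iterate.minibatches (assuming x, y_, train_op and network are defined as above):
for X_batch, y_batch in tl.iterate.minibatches(X_train, y_train, batch_size=128, shuffle=True):
    feed_dict = {x: X_batch, y_: y_batch}
    feed_dict.update(network.all_drop)   # enable dropout during training
    sess.run(train_op, feed_dict=feed_dict)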
Loss and update expressions
y = network.outputs
y_op = tf.argmax(tf.nn.softmax(y), 1)
cost = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=y_))
More costs or regularization terms can be applied here. For example, to apply max-norm on the weight matrices, we can add the following line:
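# a sketch: apply max-norm to the two hidden-layer weight matrices
cost = cost + tl.cost.maxnorm_regularizer(1.0)(network.all_params[0]) + \
              tl.cost.maxnorm_regularizer(1.0)(network.all_params[2])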
Depending on the problem you are solving, you will need different loss functions; see tensorlayer.cost for more. Apart from using network.all_params to get the variables, we can also use tl.layers.get_variables_with_name to get specific variables by string name.
Having the model and the loss function, we create the update expression/operation for training the network. TensorLayer does not provide many optimizers; we use TensorFlow's optimizers instead:
train_params = network.all_params
train_op = tf.train.AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999,
epsilon=1e-08, use_locking=False).minimize(cost, var_list=train_params)
For training the network, we feed data and the keeping probabilities into the feed_dict.
feed_dict = {x: X_train_a, y_: y_train_a}
feed_dict.update( network.all_drop )
sess.run(train_op, feed_dict=feed_dict)
For validation and testing, we use a slightly different approach. All Dropout, DropConnect and Corrosion layers need to be disabled; we use tl.utils.dict_to_one to set all values in network.all_drop to 1.
dp_dict = tl.utils.dict_to_one( network.all_drop )
feed_dict = {x: X_test_a, y_: y_test_a}
feed_dict.update(dp_dict)
err, ac = sess.run([cost, acc], feed_dict=feed_dict)
What Next?
We also have a more advanced image classification example in tutorial_cifar10_tfrecord.py. Please read the code and notes, and figure out how to generate more training data and what local response normalization is. After that, try to implement Residual Network (hint: you may want to use Layer.outputs).
In the second part of the tutorial, we will run the Deep Reinforcement Learning example which is introduced by
Karpathy in Deep Reinforcement Learning: Pong from Pixels.
python tutorial_atari_pong.py
Before running the tutorial code, you need to install the OpenAI Gym environment, which is a popular benchmark for Reinforcement Learning. If everything is set up correctly, you will get an output like the following:
[2016-07-12 09:31:59,760] Making new env: Pong-v0
[TL] InputLayer input_layer (?, 6400)
[TL] DenseLayer relu1: 200, relu
[TL] DenseLayer output_layer: 3, identity
param 0: (6400, 200) (mean: -0.000009 median: -0.000018 std: 0.017393)
param 1: (200,) (mean: 0.000000 median: 0.000000 std: 0.000000)
param 2: (200, 3) (mean: 0.002239 median: 0.003122 std: 0.096611)
param 3: (3,) (mean: 0.000000 median: 0.000000 std: 0.000000)
num of params: 1280803
layer 0: Tensor("Relu:0", shape=(?, 200), dtype=float32)
layer 1: Tensor("add_1:0", shape=(?, 3), dtype=float32)
episode 0: game 0 took 0.17381s, reward: -1.000000
episode 0: game 1 took 0.12629s, reward: 1.000000 !!!!!!!!
episode 0: game 2 took 0.17082s, reward: -1.000000
episode 0: game 3 took 0.08944s, reward: -1.000000
This example allows the neural network to learn to play the Pong game from the screen inputs, just like human behaviour. The neural network plays against a fake AI player and learns to beat it. After training for 15,000 episodes, the neural network can win 20% of the games; at 20,000 episodes it wins 35% of the games. We can see that the neural network learns faster and faster as it has more winning data to train on. If you run it for 30,000 episodes, it never loses.
render = False
resume = False
Set render to True if you want to display the game environment. When you run the code again, you can set resume to True; the code will then load the existing model and continue training based on it.
Pong Game
To understand Reinforcement Learning, we let the computer learn how to play the Pong game from the original screen inputs. Before we start, we highly recommend going through the famous blog post Deep Reinforcement Learning: Pong from Pixels, which is a minimalistic implementation of Deep Reinforcement Learning using python-numpy and the OpenAI Gym environment.
python tutorial_atari_pong.py
Policy Network
In Deep Reinforcement Learning, the Policy Network is just a Deep Neural Network: it is our player (or "agent") that outputs actions telling us what to do (move UP or DOWN). In Karpathy's code, he only defined 2 actions, UP and DOWN, using a single sigmoid output. To make our tutorial more generic, we define 3 actions, UP, DOWN and STOP (do nothing), using 3 softmax outputs.
When our agent is playing Pong, it calculates the probabilities of the different actions and then draws a sample (action) from this distribution. As the actions are represented by 1, 2 and 3, but the softmax outputs start from 0, we calculate the label value by subtracting 1.
prob = sess.run(
sampling_prob,
feed_dict={states_batch_pl: x}
)
# action. 1: STOP 2: UP 3: DOWN
action = np.random.choice([1,2,3], p=prob.flatten())
...
ys.append(action - 1)
Policy Gradient
Policy gradient methods are end-to-end algorithms that directly learn policy functions mapping states to actions. An
approximate policy could be learned directly by maximizing the expected rewards. The parameters of a policy function
(e.g. the parameters of a policy network used in the pong example) could be trained and learned under the guidance of
the gradient of expected rewards. In other words, we can gradually tune the policy function via updating its parameters,
such that it will generate actions from given states towards higher rewards.
An alternative to policy gradient is Deep Q-Learning (DQN). It is based on Q-Learning, which tries to learn a value function (called the Q function) mapping states and actions to some value. DQN employs a deep neural network to represent the Q function as a function approximator. Training is done by minimizing temporal-difference errors. A neurobiologically inspired mechanism called "experience replay" is typically used along with DQN to help improve the stability issues caused by the use of a non-linear function approximator.
You can check the following papers to gain a better understanding of Reinforcement Learning:
• Reinforcement Learning: An Introduction. Richard S. Sutton and Andrew G. Barto
• Deep Reinforcement Learning. David Silver, Google DeepMind
• UCL Course on RL
The most successful applications of Deep Reinforcement Learning in recent years include DQN with experience replay to play Atari games, and AlphaGo, which for the first time beat world-class professional Go players. AlphaGo used the policy gradient method to train its policy network, similar to the Pong example.
• Atari - Playing Atari with Deep Reinforcement Learning
• Atari - Human-level control through deep reinforcement learning
• AlphaGO - Mastering the game of Go with deep neural networks and tree search
Dataset iteration
In Reinforcement Learning, we consider a final decision as an episode. In the Pong game, an episode is a few dozen games, because the games go up to a score of 21 for either player. The batch size is then how many episodes we consider for one model update. In the tutorial, we train a 2-layer policy network with 200 hidden units using RMSProp on batches of 10 episodes.
The loss in a batch relates to all the outputs of the Policy Network, all the actions we took, and the corresponding discounted rewards. We first compute the loss of each action by multiplying its discounted reward by the cross-entropy between the network's output and the true action. The final loss in a batch is the sum of the losses of all actions.
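A sketch of this loss, assuming the placeholder names below and the helper tl.rein.cross_entropy_reward_loss from tensorlayer.rein:
actions_pl = tf.placeholder(tf.int32, shape=[None])             # actions taken in the batch
discount_rewards_pl = tf.placeholder(tf.float32, shape=[None])  # discounted rewards
# cross-entropy of each action, weighted by its discounted reward, summed over the batch
loss = tl.rein.cross_entropy_reward_loss(
    logits=network.outputs, actions=actions_pl, rewards=discount_rewards_pl)
train_op = tf.train.RMSPropOptimizer(learning_rate=1e-4, decay=0.99).minimize(loss)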
What Next?
The tutorial above shows how you can build your own agent, end-to-end. While it has reasonable quality, the default
parameters will not give you the best agent model. Here are a few things you can improve.
First of all, instead of a conventional MLP model, we can use CNNs to capture the screen information better, as Playing Atari with Deep Reinforcement Learning describes.
Also, the default parameters of the model are not tuned. You can try changing the learning rate, decay, or initializing
the weights of your model in a different way.
Finally, you can try the model on different tasks (games) and try other reinforcement learning algorithms in the Examples.
In this part of the tutorial, we train a matrix for words, where each word can be represented by a unique row vector in the matrix. In the end, similar words will have similar vectors. Then, as we plot the words onto a two-dimensional plane, similar words end up clustering near each other.
python tutorial_word2vec_basic.py
Word Embedding
We highly recommend reading Colah's blog Word Representations to understand why we want to use a vector representation and how to compute the vectors. (For Chinese readers, a translated version is available.) More details about word2vec can be found in Word2vec Parameter Learning Explained.
Basically, training an embedding matrix is unsupervised learning. As every word is identified by a unique ID, which is the row index of the embedding matrix, a word can be converted into a vector that better represents its meaning. For example, there seems to be a constant male-female difference vector: woman - man = queen - king, which means one dimension in the vector represents gender.
The model can be created as follows.
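A sketch of the model, following tutorial_word2vec_basic.py (the hyper-parameter values are illustrative):
train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
emb_net = tl.layers.Word2vecEmbeddingInputlayer(
    inputs=train_inputs,
    train_labels=train_labels,
    vocabulary_size=50000,
    embedding_size=128,
    num_sampled=64,              # negative samples for the NCE loss
    name='word2vec_layer')
cost = emb_net.nce_cost          # the layer exposes the NCE loss
train_op = tf.train.AdagradOptimizer(1.0).minimize(cost)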
Word2vec uses Negative Sampling and the Skip-Gram model for training. Noise-Contrastive Estimation (NCE) loss can help to reduce the computation of the loss. Skip-Gram inverts contexts and targets, trying to predict each context word from its target word. We use tl.nlp.generate_skip_gram_batch to generate training data as follows; see tutorial_generate_text.py.
data_index = 0
while (step < num_steps):
batch_inputs, batch_labels, data_index = tl.nlp.generate_skip_gram_batch(
data=data, batch_size=batch_size, num_skips=num_skips,
skip_window=skip_window, data_index=data_index)
feed_dict = {train_inputs : batch_inputs, train_labels : batch_labels}
_, loss_val = sess.run([train_op, cost], feed_dict=feed_dict)
At the end of training the embedding matrix, we save the matrix and the corresponding dictionaries. Then, next time, we can restore the matrix and dictionaries as follows (see main_restore_embedding_layer() in tutorial_generate_text.py).
vocabulary_size = 50000
embedding_size = 128
model_file_name = "model_word2vec_50k_128"
batch_size = None
tl.nlp.save_vocab(count, name='vocab_'+model_file_name+'.txt')
load_params = tl.files.load_npz(name=model_file_name+'.npz')
x = tf.placeholder(tf.int32, shape=[batch_size])
y_ = tf.placeholder(tf.int32, shape=[batch_size, 1])
emb_net = tl.layers.EmbeddingInputlayer(
inputs = x,
vocabulary_size = vocabulary_size,
embedding_size = embedding_size,
name ='embedding_layer')
tl.layers.initialize_global_variables(sess)
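Finally, the saved parameters can be assigned into the new embedding layer (a sketch, assuming load_params[0] holds the embedding matrix):
tl.files.assign_params(sess, [load_params[0]], emb_net)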
Penn TreeBank (PTB) dataset is used in many LANGUAGE MODELING papers, including “Empirical Evaluation
and Combination of Advanced Language Modeling Techniques”, “Recurrent Neural Network Regularization”. It
consists of 929k training words, 73k validation words, and 82k test words. It has 10k words in its vocabulary.
The PTB example shows how to train a recurrent neural network on a challenging task of language modeling. Given a sentence "I am from Imperial College London", the model can learn to predict "Imperial College London" from "from Imperial College". In other words, it predicts the next word in a text given a history of previous words. In the example above, num_steps (sequence length) is 3.
python tutorial_ptb_lstm.py
The script provides three settings (small, medium, large), where a larger model has better performance. You can
choose different settings in:
flags.DEFINE_string(
"model", "small",
"A type of model. Possible options are: small, medium, large.")
The PTB example shows that RNNs are able to model language, but this example does not do anything practically interesting. However, you should read through this example and "Understand LSTM" to grasp the basics of RNNs. After that, you will learn how to generate text, how to achieve language translation, and how to build a question answering system using RNNs.
We personally think Andrej Karpathy's blog is the best material to understand Recurrent Neural Networks; after reading it, Colah's blog can help you understand LSTM networks [chinese], which can solve the Problem of Long-Term Dependencies. We will not describe the theory of RNNs further, so please read through these blogs before you go on.
The model in PTB example is a typical type of synced sequence input and output, which was described by Karpathy
as “(5) Synced sequence input and output (e.g. video classification where we wish to label each frame of the video).
Notice that in every case there are no pre-specified constraints on the lengths of sequences because the recurrent
transformation (green) can be applied as many times as we like.”
The model is built as follows. Firstly, we transfer the words into word vectors by looking up an embedding matrix. In
this tutorial, there is no pre-training on the embedding matrix. Secondly, we stack two LSTMs together using dropout
between the embedding layer, LSTM layers, and the output layer for regularization. In the final layer, the model
provides a sequence of softmax outputs.
The first LSTM layer outputs [batch_size, num_steps, hidden_size] for stacking another LSTM after it. The second LSTM layer outputs [batch_size*num_steps, hidden_size] for stacking a DenseLayer after it. Then the DenseLayer computes the softmax outputs of each example (n_examples = batch_size*num_steps).
To understand the PTB tutorial, you can also read TensorFlow PTB tutorial.
(Note that TensorLayer supports DynamicRNNLayer after v1.1, so you can set the input/output dropouts and the number of RNN layers in one single layer.)
network = tl.layers.EmbeddingInputlayer(
inputs = x,
vocabulary_size = vocab_size,
embedding_size = hidden_size,
E_init = tf.random_uniform_initializer(-init_scale, init_scale),
name ='embedding_layer')
if is_training:
network = tl.layers.DropoutLayer(network, keep=keep_prob, name='drop1')
network = tl.layers.RNNLayer(network,
cell_fn=tf.contrib.rnn.BasicLSTMCell,
    cell_init_args={'forget_bias': 0.0, 'state_is_tuple': True},
    n_hidden=hidden_size,
    initializer=tf.random_uniform_initializer(-init_scale, init_scale),
    n_steps=num_steps,
    return_last=False,
    name='basic_lstm_layer1')
Dataset iteration
The batch_size can be seen as the number of concurrent computations we are running. As the following example shows, the first batch learns the sequence information from items 0 to 9; the second batch learns the sequence information from items 10 to 19. So it ignores the information flow from item 9 to item 10! Only if we set batch_size = 1 will it consider all the information from items 0 to 20.
The meaning of batch_size here is not the same as the batch_size in the MNIST example. In the MNIST
example, batch_size reflects how many examples we consider in each iteration, while in the PTB example,
batch_size is the number of concurrent processes (segments) for accelerating the computation.
Some information will be ignored if batch_size > 1, however, if your dataset is “long” enough (a text corpus
usually has billions of words), the ignored information would not affect the final result.
In the PTB tutorial, we set batch_size = 20, so we divide the dataset into 20 segments. At the beginning of each epoch, we initialize (reset) the 20 RNN states for the 20 segments to zero, then go through the 20 segments separately. An example of generating training data is as follows:
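A minimal sketch using tl.iterate.ptb_iterator:
train_data = [i for i in range(20)]
for batch in tl.iterate.ptb_iterator(train_data, batch_size=2, num_steps=3):
    x, y = batch
    # x is the input; y is the target, i.e. x shifted one step ahead, e.g.
    # x = [[0 1 2] [10 11 12]]   y = [[1 2 3] [11 12 13]]
    print(x, y)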
Note: This example can also be considered as pre-training of the word embedding matrix.
For updating, truncated backpropagation clips the values of gradients by the ratio of the sum of their norms, so as to make the learning process tractable.
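A sketch of this update, assuming cost, lr and max_grad_norm are defined as in the PTB script (tf.clip_by_global_norm is the standard TensorFlow helper):
tvars = tf.trainable_variables()
grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars), max_grad_norm)
optimizer = tf.train.GradientDescentOptimizer(lr)
train_op = optimizer.apply_gradients(zip(grads, tvars))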
In addition, if the epoch index is greater than max_epoch, we decrease the learning rate by multiplying it by lr_decay.
At the beginning of each epoch, all states of the LSTMs need to be reset (initialized) to zero. Then, after each iteration, the LSTMs' states are updated, so the new LSTM states (final states) need to be assigned as the initial states of the next iteration:
# set all states to zero states at the beginning of each epoch
state1 = tl.layers.initialize_rnn_state(lstm1.initial_state)
state2 = tl.layers.initialize_rnn_state(lstm2.initial_state)
for step, (x, y) in enumerate(tl.iterate.ptb_iterator(train_data,
batch_size, num_steps)):
feed_dict = {input_data: x, targets: y,
lstm1.initial_state: state1,
lstm2.initial_state: state2,
}
# For training, enable dropout
feed_dict.update( network.all_drop )
# use the new states as the initial state of next iteration
_cost, state1, state2, _ = sess.run([cost,
lstm1.final_state,
lstm2.final_state,
train_op],
feed_dict=feed_dict
)
costs += _cost; iters += num_steps
Predicting
After training the model, when we predict the next output, we no longer consider the number of steps (sequence length); i.e. batch_size and num_steps are set to 1. Then we can output the next words one by one, instead of predicting a sequence of words from a sequence of words.
input_data_test = tf.placeholder(tf.int32, [1, 1])
targets_test = tf.placeholder(tf.int32, [1, 1])
...
network_test, lstm1_test, lstm2_test = inference(input_data_test,
is_training=False, num_steps=1, reuse=True)
...
cost_test = loss_fn(network_test.outputs, targets_test, 1, 1)
...
print("Evaluation")
# Testing
# go through the test set step by step, it will take a while.
start_time = time.time()
costs = 0.0; iters = 0
# reset all states at the beginning
state1 = tl.layers.initialize_rnn_state(lstm1_test.initial_state)
state2 = tl.layers.initialize_rnn_state(lstm2_test.initial_state)
for step, (x, y) in enumerate(tl.iterate.ptb_iterator(test_data,
batch_size=1, num_steps=1)):
feed_dict = {input_data_test: x, targets_test: y,
lstm1_test.initial_state: state1,
lstm2_test.initial_state: state2,
}
_cost, state1, state2 = sess.run([cost_test,
lstm1_test.final_state,
lstm2_test.final_state],
feed_dict=feed_dict
    )
    costs += _cost; iters += 1
print("Test Perplexity: %.3f" % np.exp(costs / iters))
What Next?
Now you have understood synced sequence input and output. Let's think about many-to-one (sequence input and one output), so that the LSTM is able to predict the next word "English" from "I am from London, I speak ..".
Please read and understand the code of tutorial_generate_text.py. It shows you how to restore a pre-trained
Embedding matrix and how to learn text generation from a given context.
Karpathy’s blog : “(3) Sequence input (e.g. sentiment analysis where a given sentence is classified as expressing
positive or negative sentiment). “
On the Examples page, we provide many examples, including Seq2seq, different types of Adversarial Learning, Reinforcement Learning, and more.
For more information on what you can do with TensorLayer, just continue reading through readthedocs. Finally, the reference section lists and explains the following modules:
layers (tensorlayer.layers),
activation (tensorlayer.activation),
natural language processing (tensorlayer.nlp),
reinforcement learning (tensorlayer.rein),
cost expressions and regularizers (tensorlayer.cost),
load and save files (tensorlayer.files),
helper functions (tensorlayer.utils),
visualization (tensorlayer.visualize),
iteration functions (tensorlayer.iterate),
preprocessing functions (tensorlayer.prepro),
command line interface (tensorlayer.cli).
1.3 Examples
1.3.1 Basics
• Multi-layer perceptron (MNIST). Classification with dropout using iterator, see method1 (use placeholder) and
method2 (use reuse).
• Denoising Autoencoder (MNIST). Classification task, see tutorial_mnist_autoencoder_cnn.py.
• Stacked Denoising Autoencoder and Fine-Tuning (MNIST). An MLP classification task, see tutorial_mnist_autoencoder_cnn.py.
• Convolutional Network (MNIST). Classification task, see tutorial_mnist_autoencoder_cnn.py.
• Convolutional Network (CIFAR-10). Classification task, see tutorial_cifar10.py and tutorial_cifar10_tfrecord.py.
• TensorFlow dataset API for object detection, see here.
• Merge TF-Slim into TensorLayer. tutorial_inceptionV3_tfslim.py.
• Merge Keras into TensorLayer. tutorial_keras.py.
• Data augmentation with TFRecord. Effective way to load and pre-process data, see tutorial_tfrecord*.py and
tutorial_cifar10_tfrecord.py.
• Data augmentation with Dataset API. Effective way to load and pre-process data, see tutorial_cifar10_datasetapi.py.
• Data augmentation with TensorLayer. See tutorial_image_preprocess.py (for quick test only).
• Float 16 half-precision model, see tutorial_mnist_float16.py.
• Transparent distributed training. mnist by luomai.
1.3.2 Vision
• Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization, see examples.
• ArcFace: Additive Angular Margin Loss for Deep Face Recognition, see InsightFace.
• BinaryNet. Model compression, see mnist cifar10.
• Ternary Weight Network. Model compression, see mnist cifar10.
• DoReFa-Net. Model compression, see mnist cifar10.
• QuanCNN. Model compression, see mnist cifar10.
• Wide ResNet (CIFAR) by ritchieng.
• Spatial Transformer Networks by zsdonghao.
• U-Net for brain tumor segmentation by zsdonghao.
• Variational Autoencoder (VAE) for (CelebA) by yzwxx.
• Variational Autoencoder (VAE) for (MNIST) by BUPTLdy.
• Image Captioning - Reimplementation of Google’s im2txt by zsdonghao.
• DCGAN (CelebA). Generating images by Deep Convolutional Generative Adversarial Networks by zsdonghao.
• Generative Adversarial Text to Image Synthesis by zsdonghao.
• Unsupervised Image to Image Translation with Generative Adversarial Networks by zsdonghao.
• Recurrent Neural Network (LSTM). Apply multiple LSTMs to the PTB dataset for language modeling, see tutorial_ptb_lstm_state_is_tuple.py.
• Word Embedding (Word2vec). Train a word embedding matrix, see tutorial_word2vec_basic.py.
• Restore Embedding matrix. Restore a pre-train embedding matrix, see tutorial_generate_text.py.
• Text Generation. Generates new text scripts, using LSTM network, see tutorial_generate_text.py.
• Chinese Text Anti-Spam by pakrchen.
• Chatbot in 200 lines of code for Seq2Seq.
• FastText Sentence Classification (IMDB), see tutorial_imdb_fasttext.py by tomtung.
1.3.7 Miscellaneous
1.4 Contributing
TensorLayer is a major ongoing research project at the Data Science Institute, Imperial College London. The goal of the project is to develop a compositional language in which complex learning systems can be built through the composition of neural network modules.
Numerous contributors come from various backgrounds, such as Tsinghua University, Carnegie Mellon University, University of Technology of Compiègne, Google, Microsoft and Bloomberg.
There are many features awaiting contribution, such as Maxout, Neural Turing Machine, Attention and TensorLayer Mobile.
You can easily open a Pull Request (PR) on GitHub; every little step counts and will be credited. As an open-source project, we highly welcome and value contributions!
If you are interested in working with us, please contact us at: tensorlayer@gmail.com.
The TensorLayer project was started by Hao Dong at Imperial College London in June 2016.
It is actively developed and maintained by the following people (in alphabetical order):
• Akara Supratak (@akaraspt) - https://akaraspt.github.io
• Fangde Liu (@fangde) - http://fangde.github.io/
• Guo Li (@lgarithm) - https://lgarithm.github.io
• Hao Dong (@zsdonghao) - https://zsdonghao.github.io
• Jonathan Dekhtiar (@DEKHTIARJonathan) - https://www.jonathandekhtiar.eu
• Luo Mai (@luomai) - http://www.doc.ic.ac.uk/~lm111/
• Simiao Yu (@nebulaV) - https://nebulav.github.io
Numerous other contributors can be found in the Github Contribution Graph.
If you have a new method or example related to deep learning or reinforcement learning, you are welcome to contribute.
• Provide your layer or example, so everyone can use it.
• Explain how it would work, and link to a scientific paper if applicable.
• Keep the scope as narrow as possible, to make it easier to implement.
Report bugs
Report bugs at the GitHub, we normally will fix it in 5 hours. If you are reporting a bug, please include:
• your TensorLayer, TensorFlow and Python version.
• steps to reproduce the bug, ideally reduced to a few Python commands.
• the results you obtain, and the results you expected instead.
If you are unsure whether the behavior you experience is a bug, or if you are unsure whether it is related to TensorLayer
or TensorFlow, please just ask on our mailing list first.
Fix bugs
Look through the GitHub issues for bug reports. Anything tagged with "bug" is open to whoever wants to implement it. If you discover a bug in TensorLayer that you can fix yourself, by all means feel free to just implement the fix without reporting it first.
Write documentation
Whenever you find something not explained well, misleading, glossed over or just wrong, please update it! The Edit
on GitHub link on the top right of every documentation page and the [source] link for every documented entity in the
API reference will help you to quickly locate the origin of any text.
Edit on GitHub
As a very easy way of just fixing issues in the documentation, use the Edit on GitHub link on the top right of a
documentation page or the [source] link of an entity in the API reference to open the corresponding source file in
GitHub, then click the Edit this file link to edit the file in your browser and send us a Pull Request. All you need for
this is a free GitHub account.
For any more substantial changes, please follow the steps below to setup TensorLayer for development.
Documentation
The documentation is generated with Sphinx. To build it locally, run the following commands:
cd docs
make html
If you want to re-generate the whole docs, run the following commands:
cd docs
make clean
make html
Testing
TensorLayer has a code coverage of 100%, which has proven very helpful in the past, but also creates some duties:
• Whenever you change any code, you should test whether it breaks existing features by just running the test
scripts.
• Every bug you fix indicates a missing test case, so a proposed bug fix should come with a new test that fails
without your fix.
When you’re satisfied with your addition, the tests pass and the documentation looks good without any markup errors,
commit your changes to a new branch, push that branch to your fork and send us a Pull Request via GitHub’s web
interface.
All these steps are nicely explained on GitHub: https://guides.github.com/introduction/flow/
When filing your Pull Request, please include a description of what it does to help us review it. If it is fixing an open issue, say issue #123, add Fixes #123, Resolves #123 or Closes #123 to the description text, so GitHub will close it when your request is merged.
1.5 Get Involved in Research
Data science is therefore by nature at the core of all modern transdisciplinary scientific activities, as it involves the whole life cycle of data, from acquisition and exploration to analysis and communication of the results. Data science is not only concerned with the tools and methods to obtain, manage and analyse data: it is also about extracting value from data and translating it from asset to product.
Launched on 1st April 2014, the Data Science Institute at Imperial College London aims to enhance Imperial's excellence in data-driven research across its faculties by fulfilling the following objectives:
• To act as a focal point for coordinating data science research at Imperial College by facilitating access to funding, engaging with global partners, and stimulating cross-disciplinary collaboration.
• To develop data management and analysis technologies and services for supporting data-driven research in the College.
• To promote the training and education of the new generation of data scientists by developing and coordinating new degree courses, and conducting public outreach programmes on data science.
• To advise the College on data strategy and policy by providing world-class data science expertise.
• To enable the translation of data science innovation through close collaboration with industry and supporting commercialization.
The Data Science Institute is housed in purpose-built facilities in the heart of the Imperial College campus in South Kensington. Such a central location provides excellent access to collaborators across the College and across London.
If you are interested in working with us, please check our vacancies and other ways to get involved, or feel free to contact us.
1.6 FAQ
No matter what stage you are at, we recommend spending just 10 minutes reading the source code of TensorLayer and the Understand Layer / Your Layer pages on this website; you will find that the abstract methods are very simple. Reading the source code helps you better understand TensorFlow and allows you to implement your own methods easily. For discussion, we recommend Gitter, Help Wanted Issues, the QQ group and the WeChat group.
Beginner
For people who are new to deep learning, the contributors have provided a number of tutorials on this website; these tutorials will guide you through autoencoders, convolutional neural networks, recurrent neural networks, word embedding, deep reinforcement learning, and more. If you already understand the basics of deep learning, we recommend skipping the tutorials, reading the example code on GitHub, and then implementing an example from scratch.
Engineer
For people from industry, the contributors have provided many format-consistent examples covering computer vision, natural language processing and reinforcement learning. Besides, many TensorFlow users have already implemented product-level examples, including image captioning, semantic/instance segmentation, machine translation and chatbots, which can be found online. It is worth noting that Tf-Slim, a wrapper especially for computer vision, can be connected with TensorLayer seamlessly. Therefore, you may be able to find examples that can be used in your project.
Researcher
For people from academia, TensorLayer was originally developed by PhD students who were facing issues with other libraries when implementing novel algorithms. Installing TensorLayer in editable mode is recommended, so you can extend your methods within TensorLayer. For research related to images, such as image captioning and visual QA, you may find it very helpful to use the existing Tf-Slim pre-trained models with TensorLayer (a special layer for connecting Tf-Slim is provided).
You may need to get the list of variables you want to update; TensorLayer provides two ways to get this list. The first way is to use the all_params attribute of a network; by default, it stores the variables in order. You can print the variable information via tl.layers.print_all_variables(train_only=True) or network.print_params(details=False). To choose which variables to update, you can do as follows:
train_params = network.all_params[3:]
The second way is to get the variables by a given name. For example, if you want to get all variables whose layer name contains dense, you can do as follows:
train_params = tl.layers.get_variables_with_name('dense', train_only=True, printable=True)
After you get the variable list, you can define your optimizer as follows, so as to update only part of the variables:
train_op = tf.train.AdamOptimizer(0.001).minimize(cost, var_list=train_params)
1.6.3 Logging
TensorLayer adopts the Python logging module to log running information. The logging module prints logs to the console by default. If you want to configure the logging module, please follow its manual.
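For example, the standard Python logging configuration (not a TensorLayer-specific API) can redirect the logs to a file:
import logging
# write logs to a file instead of the console, keeping INFO and above
logging.basicConfig(filename='tensorlayer.log', level=logging.INFO)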
1.6.4 Visualization
If you run a script via SSH, you may sometimes encounter the following error.
_tkinter.TclError: no display name and no $DISPLAY environment variable
If this happens, run sudo apt-get install python3-tk, or import matplotlib and call matplotlib.use('Agg') before import tensorlayer as tl. Alternatively, add the following code to the top of visualize.py or to your own code.
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
To use all the new features of TensorLayer, you need to install the master version from GitHub. Before that, make sure you have git installed.
[stable version] pip install tensorlayer
[master version] pip install git+https://github.com/tensorlayer/tensorlayer.git
pip install -e .
Note that tl.files.load_npz() can only load npz models saved by tl.files.save_npz(). If you have a model from elsewhere that you want to load into your TensorLayer network, you can first arrange your parameters into a list in order, then use tl.files.assign_params() to load the parameters into your TensorLayer model.
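A sketch, assuming W1, b1, W2, b2 are numpy arrays holding your parameters in layer order:
params = [W1, b1, W2, b2]
tl.files.assign_params(sess, params, network)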
API Reference
If you are looking for information on a specific function, class or method, this part of the documentation is for you.
To keep TensorLayer simple, we minimize the number of activation functions as much as we can, so we encourage you to use TensorFlow's functions. TensorFlow provides tf.nn.relu, tf.nn.relu6, tf.nn.elu, tf.nn.softplus, tf.nn.softsign and so on. More TensorFlow official activation functions can be found here. For parametric activations, please read the layer APIs.
The shortcut of tensorlayer.activation is tensorlayer.act.
Customizing activation functions in TensorLayer is very easy. The following example implements an activation that multiplies its input by 2. For more complex activations, the TensorFlow API will be required.
def double_activation(x):
return x * 2
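Such a function can then be passed to a layer's act argument like any built-in activation (a sketch):
network = tl.layers.DenseLayer(network, n_units=100, act=double_activation, name='double')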
leaky_relu(x[, alpha, name])             leaky_relu() can be used through its shortcut: tl.act.lrelu().
leaky_relu6(x[, alpha, name])            leaky_relu6() can be used through its shortcut: tl.act.lrelu6().
leaky_twice_relu6(x[, alpha_low, ...])   leaky_twice_relu6() can be used through its shortcut: tl.act.ltrelu6().
ramp(x[, v_min, v_max, name])            Ramp activation function.
swish(x[, name])                         Swish function.
2.1.2 Ramp
tensorlayer.activation.ramp(x, v_min=0, v_max=1, name=None)
Ramp activation function: the output is clipped between v_min and v_max.
2.1.3 Leaky ReLU
tensorlayer.activation.leaky_relu(x, alpha=0.2, name='leaky_relu')
This function is a modified version of ReLU, introducing a nonzero gradient for negative input. Introduced by the paper: Rectifier Nonlinearities Improve Neural Network Acoustic Models [A. L. Maas et al., 2013]
The function returns the following results:
• When x < 0: f(x) = alpha * x.
• When x >= 0: f(x) = x.
Parameters
• x (Tensor) – Support input type float, double, int32, int64, uint8, int16, or
int8.
• alpha (float) – Slope.
• name (str) – The function name (optional).
Examples
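A sketch using the tl.act.lrelu shortcut inside a layer definition:
>>> net = tl.layers.DenseLayer(net, n_units=100, act=lambda x: tl.act.lrelu(x, 0.2), name='dense')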
References
• Rectifier Nonlinearities Improve Neural Network Acoustic Models [A. L. Maas et al., 2013]
2.1.4 Leaky ReLU6
tensorlayer.activation.leaky_relu6(x, alpha=0.2, name='leaky_relu6')
This activation combines leaky ReLU with ReLU6, keeping a small slope for negative inputs while capping the output at 6.
Parameters
• x (Tensor) – Support input type float, double, int32, int64, uint8, int16, or
int8.
• alpha (float) – Slope.
• name (str) – The function name (optional).
Examples
References
• Rectifier Nonlinearities Improve Neural Network Acoustic Models [A. L. Maas et al., 2013]
• Convolutional Deep Belief Networks on CIFAR-10 [A. Krizhevsky, 2010]
2.1.5 Twice Leaky ReLU6
tensorlayer.activation.leaky_twice_relu6(x, alpha_low, alpha_high, name)
Parameters
• x (Tensor) – Support input type float, double, int32, int64, uint8, int16, or
int8.
• alpha_low (float) – Slope for x < 0: f(x) = alpha_low * x.
• alpha_high (float) – Slope for x > 6: f(x) = 6 + alpha_high * (x - 6).
• name (str) – The function name (optional).
Examples
References
• Rectifier Nonlinearities Improve Neural Network Acoustic Models [A. L. Maas et al., 2013]
• Convolutional Deep Belief Networks on CIFAR-10 [A. Krizhevsky, 2010]
2.1.6 Swish
tensorlayer.activation.swish(x, name=’swish’)
Swish function.
See Swish: a Self-Gated Activation Function.
Parameters
• x (Tensor) – input.
• name (str) – function name (optional).
Returns A Tensor in the same type as x.
Return type Tensor
2.1.7 Sign
tensorlayer.activation.sign(x)
Sign function.
Clip and binarize the tensor using the straight-through estimator (STE) for the gradient; usually used for quantizing values in Binarized Neural Networks: https://arxiv.org/abs/1602.02830.
Parameters x (Tensor) – input.
Examples
References
• Rectifier Nonlinearities Improve Neural Network Acoustic Models, Maas et al. (2013) http://web.stanford.edu/~awni/papers/relu_hybrid_icml2013_final.pdf
• BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1, Courbariaux et al. (2016) https://arxiv.org/abs/1602.02830
tensorlayer.activation.hard_tanh(x, name=’htanh’)
Hard tanh activation function.
This is a ramp function with a lower bound of -1 and an upper bound of 1; the shortcut is htanh.
Parameters
• x (Tensor) – input.
• name (str) – The function name (optional).
Returns A Tensor in the same type as x.
Return type Tensor
tensorlayer.activation.pixel_wise_softmax(x, name=’pixel_wise_softmax’)
Return the softmax outputs of images; every pixel has multiple labels, and the values for a pixel sum to 1.
Examples
References
• tf.reverse
See tensorlayer.layers.
alphas(shape, alpha_value[, name]) Creates a tensor with all elements set to alpha_value.
alphas_like(tensor, alpha_value[, name, . . . ]) Creates a tensor with all elements set to alpha_value.
tl.alphas
Examples
>>> tl.alphas([2, 3], tf.int32) # [[alpha, alpha, alpha], [alpha, alpha, alpha]]
tl.alphas_like
Examples
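A sketch mirroring the tl.alphas example above:
>>> t = tf.constant([[1, 2, 3], [4, 5, 6]])
>>> tl.alphas_like(t, 0.5)  # [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]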
To keep TensorLayer simple, we minimize the number of cost functions as much as we can, so we encourage you to use TensorFlow's functions. For example, you can implement L1, L2 and sum regularization with tf.nn.l2_loss, tf.contrib.layers.l1_regularizer, tf.contrib.layers.l2_regularizer and tf.contrib.layers.sum_regularizer; see the TensorFlow API.
TensorLayer provides a simple way to create your own cost function. Take the MLP below for example: the network parameters will be [W1, b1, W2, b2, W_out, b_out], and you can apply L2 regularization to the weight matrices of the first two layers as follows.
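A sketch (the 0.001 regularization scale is illustrative):
cost = tl.cost.cross_entropy(y, y_, name='cost')
cost = cost + tf.contrib.layers.l2_regularizer(0.001)(network.all_params[0]) + \
              tf.contrib.layers.l2_regularizer(0.001)(network.all_params[2])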
Besides, TensorLayer provides an easy way to get all variables by a given name, so you can also apply L2 regularization to selected weights as follows.
l2 = 0
for w in tl.layers.get_variables_with_name('W_conv2d', train_only=True, printable=False):
    l2 += tf.contrib.layers.l2_regularizer(1e-4)(w)
cost = tl.cost.cross_entropy(y, y_) + l2
Regularization of Weights
After initializing the variables, the information about the network parameters can be observed using network.print_params().
tl.layers.initialize_global_variables(sess)
network.print_params()
The output of the network is network.outputs, so the cross entropy can be defined as follows. Besides, to regularize the weights, network.all_params contains all parameters of the network. In this case, network.all_params = [W1, b1, W2, b2, Wout, bout], according to params 0 to 5 shown by network.print_params(). Max-norm regularization on W1 and W2 can then be performed as follows.
max_norm = 0
for w in tl.layers.get_variables_with_name('W', train_only=True, printable=False):
    max_norm += tl.cost.maxnorm_regularizer(1)(w)
cost = tl.cost.cross_entropy(y, y_) + max_norm
In addition, all of TensorFlow's regularizers, like tf.contrib.layers.l2_regularizer, can be used with TensorLayer.
The instance method network.print_layers() prints the outputs of all layers in order. To regularize activation outputs, you can use network.all_layers, which contains the outputs of all layers. To apply an L1 penalty on the activations of the first hidden layer, simply add tf.contrib.layers.l1_regularizer(lambda_l1)(network.all_layers[1]) to the cost function.
network.print_layers()
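For instance, a minimal sketch of the activation penalty just described (lambda_l1 is an illustrative coefficient; network, y and y_ come from the MLP above):

>>> lambda_l1 = 1e-4
>>> act_penalty = tf.contrib.layers.l1_regularizer(lambda_l1)(network.all_layers[1])
>>> cost = tl.cost.cross_entropy(y, y_) + act_penalty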
Examples
References
Parameters
• output (Tensor) – Tensor with type of float32 or float64.
• target (Tensor) – The target distribution, in the same format as output.
• epsilon (float) – A small value to avoid the output being zero.
• name (str) – An optional name to attach to this function.
References
• ericjang-DRAW
References
tensorlayer.cost.normalized_mean_square_error(output, target,
name=’normalized_mean_squared_error_loss’)
Return the TensorFlow expression of normalized mean-square-error of two distributions.
Parameters
• output (Tensor) – 2D, 3D or 4D tensor i.e. [batch_size, n_feature], [batch_size, height,
width] or [batch_size, height, width, channel].
• target (Tensor) – The target distribution, in the same format as output.
• name (str) – An optional name to attach to this function.
Examples
References
• Wiki-Dice
References
• Wiki-Dice
Notes
• IoU cannot be used as a training loss; people usually use the dice coefficient for training, and IoU and hard-dice for evaluation.
Examples
Examples
>>> batch_size = 64
>>> vocab_size = 10000
>>> embedding_size = 256
>>> input_seqs = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name="input")
tensorlayer.cost.cosine_similarity(v1, v2)
Cosine similarity [-1, 1].
Parameters v1, v2 (Tensor) – Tensors with the same shape [batch_size, n_feature].
References
• Wiki.
Maxnorm
tensorlayer.cost.maxnorm_regularizer(scale=1.0)
Max-norm regularization returns a function that can be used to apply max-norm regularization to weights.
More about max-norm, see wiki-max norm. The implementation follows TensorFlow contrib.
Parameters scale (float) – A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns
Return type A function with signature mn(weights, name=None) that applies max-norm regularization.
Raises ValueError : If scale is outside of the range [0.0, 1.0] or if scale is not a float.
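A usage sketch following the max-norm example earlier in this section (cost is assumed to be an existing loss tensor):

>>> mn = tl.cost.maxnorm_regularizer(scale=1.0)
>>> for w in tl.layers.get_variables_with_name('W', train_only=True, printable=False):
...     cost += mn(w)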
Special
tensorlayer.cost.li_regularizer(scale, scope=None)
Li regularization removes the neurons of the previous layer. The i represents inputs. Returns a function that can be used to apply group li regularization to weights. The implementation follows TensorFlow contrib.
Parameters
• scale (float) – A scalar multiplier Tensor. 0.0 disables the regularizer.
• scope (str) – An optional scope name for this function.
Returns
Return type A function with signature li(weights, name=None) that applies li regularization.
Raises ValueError : if scale is outside of the range [0.0, 1.0] or if scale is not a float.
tensorlayer.cost.lo_regularizer(scale)
Lo regularization removes the neurons of the current layer. The o represents outputs. Returns a function that can be used to apply group lo regularization to weights. The implementation follows TensorFlow contrib.
Parameters scale (float) – A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns
Return type A function with signature lo(weights, name=None) that applies lo regularization.
Raises ValueError : If scale is outside of the range [0.0, 1.0] or if scale is not a float.
tensorlayer.cost.maxnorm_o_regularizer(scale)
Max-norm output regularization removes the neurons of the current layer. Returns a function that can be used to apply max-norm regularization to each column of the weight matrix. The implementation follows TensorFlow contrib.
Parameters scale (float) – A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns
Return type A function with signature mn_o(weights, name=None) that applies max-norm output regularization.
Raises ValueError : If scale is outside of the range [0.0, 1.0] or if scale is not a float.
tensorlayer.cost.maxnorm_i_regularizer(scale)
Max-norm input regularization removes the neurons of the previous layer. Returns a function that can be used to apply max-norm regularization to each row of the weight matrix. The implementation follows TensorFlow contrib.
Parameters scale (float) – A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns
Return type A function with signature mn_i(weights, name=None) that applies max-norm input regularization.
Raises ValueError : If scale is outside of the range [0.0, 1.0] or if scale is not a float.
Image augmentation is a critical step in deep learning. Though TensorFlow provides tf.image, image augmentation often remains a key bottleneck. tf.image has three limitations:
• Real-world visual tasks such as object detection, segmentation, and pose estimation must cope with image meta-data (e.g., coordinates). These data are beyond tf.image, which processes images as tensors.
• tf.image operators break the pure Python programming experience (i.e., users have to use tf.py_func in order to call image functions written in Python); however, frequent use of tf.py_func slows down TensorFlow, making it hard for users to balance flexibility and performance.
• The tf.image API is inflexible. Image operations are performed in a fixed order, which makes them hard to optimize jointly. More importantly, sequential image operations can significantly reduce the quality of images, thus affecting training accuracy.
TensorLayer addresses these limitations by providing a high-performance image augmentation API in Python. This API is based on affine transformation and cv2.warpAffine. It allows you to combine multiple image processing functions into a single matrix operation. This combined operation is executed by the fast cv2 library, offering a 78x performance improvement (observed in openpose-plus, for example). The following example illustrates the rationale behind this tremendous speed-up.
Example
The source code of the complete examples can be found here. The following is a typical Python program that applies rotation, shifting, flipping, zooming and shearing to an image one operation at a time:
image = tl.vis.read_image('tiger.jpeg')
# ... apply rotation, shifting, flipping, zooming and shearing sequentially (omitted) ...
tl.vis.save_image(xx, '_result_slow.png')
However, by leveraging affine transformation, image operations can be combined into one:
# 2. Combine matrices
# NOTE: operations are applied in a reversed order (i.e., rotation is performed first)
M_combined = M_shift.dot(M_zoom).dot(M_shear).dot(M_flip).dot(M_rotate)
# 3. Convert the matrix from Cartesian coordinates (the origin in the middle of image)
# to image coordinates (the origin on the top-left of image)
transform_matrix = tl.prepro.transform_matrix_offset_center(M_combined, x=w, y=h)
tl.vis.save_image(result, '_result_fast.png')
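Putting the pieces together, here is a complete sketch of the combined pipeline. affine_rotation_matrix, affine_horizontal_flip_matrix, affine_zoom_matrix, transform_matrix_offset_center and affine_transform_cv2 are documented below; affine_shift_matrix and affine_shear_matrix (and their parameter names) are assumptions here, so check tl.prepro for the exact signatures:

>>> import tensorlayer as tl
>>> image = tl.vis.read_image('tiger.jpeg')
>>> h, w, _ = image.shape
>>> # 1. Create the individual affine transform matrices
>>> M_rotate = tl.prepro.affine_rotation_matrix(angle=20)
>>> M_flip = tl.prepro.affine_horizontal_flip_matrix(prob=1)
>>> M_shift = tl.prepro.affine_shift_matrix(wrg=0.1, hrg=0, h=h, w=w)  # assumed signature
>>> M_shear = tl.prepro.affine_shear_matrix(x_shear=0.2, y_shear=0)    # assumed signature
>>> M_zoom = tl.prepro.affine_zoom_matrix(zoom_range=0.8)
>>> # 2. Combine them (applied in reversed order) and offset to image coordinates
>>> M_combined = M_shift.dot(M_zoom).dot(M_shear).dot(M_flip).dot(M_rotate)
>>> transform_matrix = tl.prepro.transform_matrix_offset_center(M_combined, x=w, y=h)
>>> # 3. Transform the image once for all operations
>>> result = tl.prepro.affine_transform_cv2(image, transform_matrix)
>>> tl.vis.save_image(result, '_result_fast.png')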
The following figure illustrates the rationale behind combined affine transformation.
Using combined affine transformation has two key benefits. First, it allows you to leverage a pure Python API to achieve orders of magnitude of speed-up in image augmentation, and thus prevents data pre-processing from becoming a bottleneck in training. Second, performing sequential image transformations requires multiple image interpolations, which produces low-quality input images. In contrast, a combined transformation performs the interpolation only once, and thus preserves the content of the image. The following figure illustrates these two benefits:
The major reason combined affine transformation is fast is its lower computational complexity. Assume we have k affine transformations T1, ..., Tk, where each Ti can be represented by a 3x3 matrix. The sequential transformation can be represented as y = Tk (... T1(x)), and its time complexity is O(k N), where N is the cost of applying one transformation to image x; N is linear in the size of x. For the combined transformation y = (Tk ... T1)(x), the time complexity is O(27(k - 1) + N) = max{O(27k), O(N)} = O(N) (assuming 27k << N), where 27 = 3^3 is the cost of combining two transformations.
tensorlayer.prepro.affine_rotation_matrix(angle=(-20, 20))
Create an affine transform matrix for image rotation. NOTE: In OpenCV, x is width and y is height.
Parameters angle (int/float or tuple of two int/float) –
Degree to rotate, usually -180 ~ 180.
• int/float, a fixed angle.
• tuple of 2 floats/ints, randomly sample a value as the angle between these 2 values.
Returns An affine transform matrix.
Return type numpy.array
tensorlayer.prepro.affine_horizontal_flip_matrix(prob=0.5)
Create an affine transformation matrix for image horizontal flipping. NOTE: In OpenCV, x is width and y is
height.
Parameters prob (float) – Probability to flip the image. 1.0 means always flip.
Returns An affine transform matrix.
Return type numpy.array
tensorlayer.prepro.affine_vertical_flip_matrix(prob=0.5)
Create an affine transformation for image vertical flipping. NOTE: In OpenCV, x is width and y is height.
Parameters prob (float) – Probability to flip the image. 1.0 means always flip.
Returns An affine transform matrix.
Return type numpy.array
tensorlayer.prepro.affine_zoom_matrix(zoom_range=(0.8, 1.1))
Create an affine transform matrix for zooming/scaling an image’s height and width. OpenCV format, x is width.
Parameters zoom_range (float or tuple of 2 floats) –
The zooming/scaling ratio; greater than 1 means larger.
– float, a fixed ratio.
– tuple of 2 floats, randomly sample a value as the ratio between these 2 values.
Returns An affine transform matrix.
Return type numpy.array
tensorlayer.prepro.affine_respective_zoom_matrix(w_range=0.8, h_range=1.1)
Get an affine transform matrix for zooming/scaling where height and width are changed independently. OpenCV format, x is width.
Parameters
• w_range (float or tuple of 2 floats) –
The zooming/scaling ratio of width, greater than 1 means larger.
– float, a fixed ratio.
– tuple of 2 floats, randomly sample a value as the ratio between 2 values.
• h_range (float or tuple of 2 floats) –
The zooming/scaling ratio of height, greater than 1 means larger.
– float, a fixed ratio.
– tuple of 2 floats, randomly sample a value as the ratio between 2 values.
tensorlayer.prepro.transform_matrix_offset_center(matrix, y, x)
Convert the matrix from Cartesian coordinates (the origin in the middle of image) to Image coordinates (the
origin on the top-left of image).
Parameters
• matrix (numpy.array) – Transform matrix.
• x and y (int) – Size of the image.
Returns The transform matrix.
Return type numpy.array
Examples
tensorlayer.prepro.affine_transform_keypoints(coords_list, transform_matrix)
Transform keypoint coordinates according to a given affine transform matrix. OpenCV format, x is width.
Note that for pose estimation tasks, flipping requires maintaining the left and right body information; this function does not swap left and right keypoints, so please use tl.prepro.keypoint_random_flip for flipping.
Parameters
• coords_list (list of list of tuple/list) – The coordinates e.g., the key-
point coordinates of every person in an image.
• transform_matrix (numpy.array) – Transform matrix, OpenCV format.
Examples
>>> # 4. then we can transform the image once for all transformations
>>> result = tl.prepro.affine_transform_cv2(image, transform_matrix)  # 76 times faster
2.4.2 Images
Examples
References
Rotation
Examples
Examples
Crop
Flip
Shift
Shear
References
• Affine transformation
Shear V2
References
• Affine transformation
Swirl
• clip (boolean) – Whether to clip the output to the range of values of the input image.
This is enabled by default, since higher order interpolation may produce values outside the
given input range.
• preserve_range (boolean) – Whether to keep the original range of values. Other-
wise, the input image is converted according to the conventions of img_as_float.
• is_random (boolean) –
If True, random swirl. Default is False.
– random center = [(0 ~ x.shape[0]), (0 ~ x.shape[1])]
– random strength = [0, strength]
– random radius = [1e-10, radius]
– random rotation = [-rotation, rotation]
Returns A processed image.
Return type numpy.array
Examples
Elastic transform
Examples
References
• Github.
• Kaggle
Zoom
Respective Zoom
Brightness
References
• skimage.exposure.adjust_gamma
• chinese blog
Change saturation.
– if is_random=False, one float number; smaller than one means less saturation.
– if is_random=True, a tuple of two float numbers, (min, max).
• is_random (boolean) – If True, randomly change illumination. Default is False.
Returns A processed image.
Return type numpy.array
Examples
Random
Non-random
RGB to HSV
tensorlayer.prepro.rgb_to_hsv(rgb)
Input RGB image [0~255] return HSV image [0~1].
Parameters rgb (numpy.array) – An image with values between 0 and 255.
Returns A processed image.
Return type numpy.array
HSV to RGB
tensorlayer.prepro.hsv_to_rgb(hsv)
Input HSV image [0~1] return RGB image [0~255].
Parameters hsv (numpy.array) – An image with values between 0.0 and 1.0
Returns A processed image.
Return type numpy.array
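A round-trip sketch between the two color spaces:

>>> import numpy as np
>>> import tensorlayer as tl
>>> rgb = np.random.randint(0, 256, size=(100, 100, 3))  # values in [0, 255]
>>> hsv = tl.prepro.rgb_to_hsv(rgb)   # values in [0, 1]
>>> rgb2 = tl.prepro.hsv_to_rgb(hsv)  # back to [0, 255]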
Adjust Hue
• hout (float) –
The scale value for adjusting hue.
– If is_offset is False, set all hue values to this value. 0 is red; 0.33 is green; 0.66 is blue.
– If is_offset is True, add this value as the offset to the hue channel.
• is_offset (boolean) – Whether hout is added on HSV as offset or not. Default is True.
• is_clip (boolean) – If the resulting HSV value is smaller than 0, set it to 0. Default is True.
• is_random (boolean) – If True, randomly change hue. Default is False.
Returns A processed image.
Return type numpy.array
Examples
Random: add a random value between -0.2 and 0.2 as the offset to all hue values.
References
• tf.image.random_hue.
• tf.image.adjust_hue.
• StackOverflow: Changing image hue with python PIL.
Resize
References
• scipy.misc.imresize
Examples
Random
Non-random
Normalization
Examples
Notes
When samplewise_center and samplewise_std_normalization are True:
• For a greyscale image, every pixel is subtracted and divided by the mean and std of the whole image.
• For an RGB image, every pixel is subtracted and divided by the mean and std of that pixel, i.e. the mean and std of each pixel become 0 and 1.
tensorlayer.prepro.featurewise_norm(x, mean=None, std=None, epsilon=1e-07)
Normalize every pixel by the same given mean and std, which are usually computed from all examples.
Parameters
• x (numpy.array) – An image with dimension of [row, col, channel] (default).
• mean (float) – Value for subtraction.
• std (float) – Value for division.
• epsilon (float) – A small positive value for dividing the standard deviation.
Returns A processed image.
Return type numpy.array
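A minimal sketch; X_train is assumed to be an array of all training images, and the same mean/std are reused for every example:

>>> mean, std = X_train.mean(), X_train.std()  # computed once over all examples
>>> x = tl.prepro.featurewise_norm(x, mean=mean, std=std, epsilon=1e-7)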
Channel shift
Noise
tensorlayer.prepro.drop(x, keep=0.5)
Randomly set some pixels to zero by a given keeping probability.
Parameters
• x (numpy.array) – An image with dimension of [row, col, channel] or [row, col].
• keep (float) – The keeping probability in (0, 1); the lower it is, the more values will be set to zero.
Returns A processed image.
Return type numpy.array
References
PIL Image.fromarray
Find contours
• fully_connected (str) – Either low or high. Indicates whether array elements below
the given level value are to be considered fully-connected (and hence elements above the
value will only be face connected), or vice-versa. (See notes below for details.)
• positive_orientation (str) – Either low or high. Indicates whether the output contours will produce positively-oriented polygons around islands of low- or high-valued elements. If low, then contours will wind counter-clockwise around elements below the iso-value. Alternately, this means that low-valued elements are always on the left of the contour.
Returns Each contour is an ndarray of shape (n, 2), consisting of n (row, column) coordinates along
the contour.
Return type list of (n,2)-ndarrays
Points to Image
Binary dilation
tensorlayer.prepro.binary_dilation(x, radius=3)
Return fast binary morphological dilation of an image. see skimage.morphology.binary_dilation.
Parameters
• x (2D array) – A binary image.
• radius (int) – The radius of the mask.
Returns A processed binary image.
Return type numpy.array
Greyscale dilation
tensorlayer.prepro.dilation(x, radius=3)
Return greyscale morphological dilation of an image, see skimage.morphology.dilation.
Parameters
• x (2D array) – A greyscale image.
• radius (int) – The radius of the mask.
Returns A processed greyscale image.
Return type numpy.array
Binary erosion
tensorlayer.prepro.binary_erosion(x, radius=3)
Return binary morphological erosion of an image, see skimage.morphology.binary_erosion.
Parameters
• x (2D array) – A binary image.
• radius (int) – The radius of the mask.
Returns A processed binary image.
Return type numpy.array
Greyscale erosion
tensorlayer.prepro.erosion(x, radius=3)
Return greyscale morphological erosion of an image, see skimage.morphology.erosion.
Parameters
• x (2D array) – A greyscale image.
• radius (int) – The radius of the mask.
Returns A processed greyscale image.
Return type numpy.array
import tensorlayer as tl

# resize
im_resize, coords = tl.prepro.obj_box_imresize(image,
    coords=ann_list[idx][1], size=[300, 200], is_rescale=True)
tl.vis.draw_boxes_and_labels_to_image(im_resize, ann_list[idx][0],
    coords, [], classes, True, save_name='_im_resize.png')

# crop
im_crop, clas, coords = tl.prepro.obj_box_crop(image, ann_list[idx][0],
    ann_list[idx][1], wrg=200, hrg=200,
    is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_crop, clas, coords, [],
    classes, True, save_name='_im_crop.png')

# shift
im_shift, clas, coords = tl.prepro.obj_box_shift(image, ann_list[idx][0],
    ann_list[idx][1], wrg=0.1, hrg=0.1,
    is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_shift, clas, coords, [],
    classes, True, save_name='_im_shift.png')

# zoom
im_zoom, clas, coords = tl.prepro.obj_box_zoom(image, ann_list[idx][0],
    ann_list[idx][1], zoom_range=(1.3, 0.7),
    is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_zoom, clas, coords, [],
    classes, True, save_name='_im_zoom.png')
In practice, you may want to use a threading method to process batches of images as follows.
import tensorlayer as tl
import random

batch_size = 64
im_size = [416, 416]
n_data = len(imgs_file_list)
jitter = 0.2

def _data_pre_aug_fn(data):
    im, ann = data
    clas, coords = ann
    ## change image brightness, contrast and saturation randomly
    im = tl.prepro.illumination(im, gamma=(0.5, 1.5),
        contrast=(0.5, 1.5), saturation=(0.5, 1.5), is_random=True)
    ## flip randomly
    im, coords = tl.prepro.obj_box_horizontal_flip(im, coords,
        is_rescale=True, is_center=True, is_random=True)
    ## randomly resize and crop image, it can have the same effect as random zoom
    tmp0 = random.randint(1, int(im_size[0]*jitter))
    tmp1 = random.randint(1, int(im_size[1]*jitter))
    im, coords = tl.prepro.obj_box_imresize(im, coords,
        [im_size[0]+tmp0, im_size[1]+tmp1], is_rescale=True,
        interp='bicubic')
    im, clas, coords = tl.prepro.obj_box_crop(im, clas, coords,
        wrg=im_size[1], hrg=im_size[0], is_rescale=True,
        is_center=True, is_random=True)
    ## ... (remaining steps omitted, see the complete example) ...

# threading process
data = tl.prepro.threading_data([_ for _ in zip(b_images, b_ann)],
    _data_pre_aug_fn)
b_images2 = [d[0] for d in data]
b_ann = [d[1] for d in data]
tensorlayer.prepro.obj_box_coord_rescale(coord=None, shape=None)
Scale down one coordinate of one image from pixel units to the ratio of the image size, i.e. in the range of [0, 1]. It is the reverse process of obj_box_coord_scale_to_pixelunit.
Parameters
• coord (list of 4 int or None) – One coordinate of one image, e.g. [x, y, w, h].
• shape (list of 2 int or None) – For [height, width].
Returns New bounding box.
Return type list of 4 numbers
Examples
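For example (mirroring the plural version below):

>>> coord = tl.prepro.obj_box_coord_rescale(coord=[30, 40, 50, 50], shape=[100, 100])
>>> print(coord)
[0.3, 0.4, 0.5, 0.5]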
tensorlayer.prepro.obj_box_coords_rescale(coords=None, shape=None)
Scale down a list of coordinates from pixel units to the ratio of the image size, i.e. in the range of [0, 1].
Parameters
• coords (list of list of 4 ints or None) – For coordinates of more than one image, e.g. [[x, y, w, h], [x, y, w, h], ...].
• shape (list of 2 int or None) – For [height, width].
Returns A list of new bounding boxes.
Return type list of list of 4 numbers
Examples
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50], [10, 10, 20, 20]],
˓→shape=[100, 100])
>>> print(coords)
[[0.3, 0.4, 0.5, 0.5], [0.1, 0.1, 0.2, 0.2]]
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50]], shape=[50, 100])
>>> print(coords)
[[0.3, 0.8, 0.5, 1.0]]
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50]], shape=[100, 200])
>>> print(coords)
[[0.15, 0.4, 0.25, 0.5]]
tensorlayer.prepro.obj_box_coord_scale_to_pixelunit(coord, shape=None)
Convert one coordinate [x, y, w (or x2), h (or y2)] in ratio format to image coordinate format. It is the reverse
process of obj_box_coord_rescale.
Parameters
• coord (list of 4 float) – One coordinate of one image [x, y, w (or x2), h (or y2)] in ratio format, i.e. value range [0, 1].
• shape (tuple of 2 or None) – For [height, width].
Returns New bounding box.
Return type list of 4 numbers
Examples
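For example, reversing the rescale example above (exact rounding behaviour may differ):

>>> coord = tl.prepro.obj_box_coord_scale_to_pixelunit([0.3, 0.4, 0.5, 0.5], shape=[100, 100])
>>> # roughly [30, 40, 50, 50]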
tensorlayer.prepro.obj_box_coord_centroid_to_upleft_butright(coord,
to_int=False)
Convert one coordinate [x_center, y_center, w, h] to [x1, y1, x2, y2] in up-left and bottom-right format.
Parameters
• coord (list of 4 int/float) – One coordinate.
• to_int (boolean) – Whether to convert output as integer.
Returns New bounding box.
Return type list of 4 numbers
Examples
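For example:

>>> coord = tl.prepro.obj_box_coord_centroid_to_upleft_butright([30, 40, 20, 20])
>>> print(coord)
[20, 30, 40, 50]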
tensorlayer.prepro.obj_box_coord_upleft_butright_to_centroid(coord)
Convert one coordinate [x1, y1, x2, y2] to [x_center, y_center, w, h]. It is the reverse process of
obj_box_coord_centroid_to_upleft_butright.
Parameters coord (list of 4 int/float) – One coordinate.
Returns New bounding box.
Return type list of 4 numbers
tensorlayer.prepro.obj_box_coord_centroid_to_upleft(coord)
Convert one coordinate [x_center, y_center, w, h] to [x, y, w, h]. It is the reverse process of
obj_box_coord_upleft_to_centroid.
Parameters coord (list of 4 int/float) – One coordinate.
Returns New bounding box.
Return type list of 4 numbers
tensorlayer.prepro.obj_box_coord_upleft_to_centroid(coord)
Convert one coordinate [x, y, w, h] to [x_center, y_center, w, h]. It is the reverse process of
obj_box_coord_centroid_to_upleft.
Parameters coord (list of 4 int/float) – One coordinate.
Returns New bounding box.
Return type list of 4 numbers
tensorlayer.prepro.parse_darknet_ann_str_to_list(annotations)
Input a string in the format of "class, x, y, w, h", return a list-of-list format.
Parameters annotations (str) – The annotations in darknet format "class, x, y, w, h ...." separated by "\n".
Returns List of bounding box.
Return type list of list of 4 numbers
tensorlayer.prepro.parse_darknet_ann_list_to_cls_box(annotations)
Parse darknet annotation format into two lists for class and bounding box.
Input list of [[class, x, y, w, h], . . . ], return two list of [class . . . ] and [[x, y, w, h], . . . ].
Parameters annotations (list of list) – A list of class and bounding boxes of images
e.g. [[class, x, y, w, h], . . . ]
Returns
• list of int – List of class labels.
• list of list of 4 numbers – List of bounding box.
Examples
>>> print(coords)
[[0.8, 0.4, 0.3, 0.3], [0.9, 0.5, 0.2, 0.3]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[0.2, 0.4, 0.3, 0.3]], is_rescale=True, is_center=False, is_random=False)
>>> print(coords)
[[0.5, 0.4, 0.3, 0.3]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[20, 40, 30, 30]], is_rescale=False, is_center=True, is_random=False)
>>> print(coords)
[[80, 40, 30, 30]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[20, 40, 30, 30]], is_rescale=False, is_center=False, is_random=False)
>>> print(coords)
[[50, 40, 30, 30]]
Examples
>>> print(coords)
[[40, 80, 60, 60], [20, 40, 40, 40]]
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30]], size=[40, 100], is_rescale=False)
>>> print(coords)
[[20, 20, 30, 15]]
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30]], size=[60, 150], is_rescale=False)
>>> print(coords)
[[30, 30, 45, 22]]
2.4.4 Keypoints
Returns
Return type preprocessed image, annos, mask
If the image is smaller than min_size, padding is used to make the shape match min_size. The height and width of the image will be changed together; the scale (aspect ratio) is not changed.
Parameters
• image (3 channel image) – The given image for augmentation.
• annos (list of list of floats) – The keypoints annotation of people.
• mask (single channel image or None) – The mask if available.
• min_size (tuple of two int) – The minimum size of height and width.
• zoom_range (tuple of two floats) – The minimum and maximum factor to zoom in or out, e.g. (0.5, 1) means zoom out by 1~2 times.
• pad_val (int/float, or tuple of int or random function) – The
three padding values for RGB channels respectively.
Returns
Return type preprocessed image, annos, mask
2.4.5 Sequence
Padding
Examples
Remove Padding
tensorlayer.prepro.remove_pad_sequences(sequences, pad_id=0)
Remove padding.
Parameters
• sequences (list of list of int) – All sequences where each row is a sequence.
• pad_id (int) – The pad ID.
Returns The processed sequences.
Return type list of list of int
Examples
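For example:

>>> sequences = [[1, 2, 3, 0, 0], [4, 5, 0, 0, 0]]
>>> tl.prepro.remove_pad_sequences(sequences, pad_id=0)
[[1, 2, 3], [4, 5]]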
Process
Examples
Add Start ID
Examples
For Seq2seq
Add End ID
tensorlayer.prepro.sequences_add_end_id(sequences, end_id=888)
Add a special end token (id) at the end of each sequence.
Parameters
• sequences (list of list of int) – All sequences where each row is a sequence.
• end_id (int) – The end ID.
Returns The processed sequences.
Return type list of list of int
Examples
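For example:

>>> sequences = [[1, 2, 3], [4, 5, 6, 7]]
>>> tl.prepro.sequences_add_end_id(sequences, end_id=999)
[[1, 2, 3, 999], [4, 5, 6, 7, 999]]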
Examples
Get Mask
tensorlayer.prepro.sequences_get_mask(sequences, pad_val=0)
Return mask for sequences.
Parameters
• sequences (list of list of int) – All sequences where each row is a sequence.
• pad_val (int) – The pad value.
Returns The mask.
Return type list of list of int
Examples
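For example (only trailing pad values are masked out):

>>> sentences_ids = [[4, 0, 5, 3, 0, 0], [5, 3, 9, 4, 9, 0]]
>>> tl.prepro.sequences_get_mask(sentences_ids, pad_val=0)
[[1 1 1 1 0 0]
 [1 1 1 1 1 0]]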
Trainer
Examples
See tutorial_mnist_distributed_trainer.py.
A collection of helper functions to work with datasets: load benchmark datasets, save and restore models, and save and load variables.
MNIST
Returns X_train, y_train, X_val, y_val, X_test, y_test – Return the split training/validation/test sets respectively.
Return type tuple
Examples
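A minimal loading sketch:

>>> X_train, y_train, X_val, y_val, X_test, y_test = tl.files.load_mnist_dataset(shape=(-1, 784))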
Fashion-MNIST
Examples
CIFAR-10
• path (str) – The path that the data is downloaded to; default is data/cifar10/.
• plotable (boolean) – Whether to plot some image examples; default is False.
Examples
References
• CIFAR website
• Data download link
• https://teratail.com/questions/28932
SVHN
tensorlayer.files.load_cropped_svhn(path=’data’, include_extra=True)
Load Cropped SVHN.
The Cropped Street View House Numbers (SVHN) Dataset contains 32x32x3 RGB images. Digit ‘1’ has label
1, ‘9’ has label 9 and ‘0’ has label 0 (the original dataset uses 10 to represent ‘0’), see ufldl website.
Parameters
• path (str) – The path that the data is downloaded to.
• include_extra (boolean) – If True (default), add extra images to the training set.
Returns X_train, y_train, X_test, y_test – Return the split training/test sets respectively.
Return type tuple
Examples
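A minimal loading sketch:

>>> X_train, y_train, X_test, y_test = tl.files.load_cropped_svhn(path='data', include_extra=False)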
tensorlayer.files.load_ptb_dataset(path=’data’)
Load Penn TreeBank (PTB) dataset.
It is used in many LANGUAGE MODELING papers, including “Empirical Evaluation and Combination of
Advanced Language Modeling Techniques”, “Recurrent Neural Network Regularization”. It consists of 929k
training words, 73k validation words, and 82k test words. It has 10k words in its vocabulary.
Parameters path (str) – The path that the data is downloaded to; default is data/ptb/.
Returns
• train_data, valid_data, test_data (list of int) – The training, validating and testing data in
integer format.
• vocab_size (int) – The vocabulary size.
Examples
References
Notes
• If you want to get the raw data, see the source code.
tensorlayer.files.load_matt_mahoney_text8_dataset(path=’data’)
Load Matt Mahoney’s dataset.
Download a text file from Matt Mahoney’s website if not present, and make sure it’s the right size. Extract the
first file enclosed in a zip file as a list of words. This dataset can be used for Word Embedding.
Parameters path (str) – The path that the data is downloaded to; default is data/mm_test8/.
Returns The raw text data e.g. [. . . . ‘their’, ‘families’, ‘who’, ‘were’, ‘expelled’, ‘from’,
‘jerusalem’, . . . ]
Return type list of str
Examples
IMDB
• skip_top (int) – Top most frequent words to ignore (they will appear as oov_char value
in the sequence data).
• maxlen (int) – Maximum sequence length. Any longer sequence will be truncated.
• seed (int) – Seed for reproducible data shuffling.
• start_char (int) – The start of a sequence will be marked with this character. Set to 1
because 0 is usually the padding character.
• oov_char (int) – Words that were cut out because of the num_words or skip_top limit
will be replaced with this character.
• index_from (int) – Index actual words with this index and higher.
Examples
References
Nietzsche
tensorlayer.files.load_nietzsche_dataset(path=’data’)
Load Nietzsche dataset.
Parameters path (str) – The path that the data is downloaded to; default is data/nietzsche/.
Returns The content.
Return type str
Examples
tensorlayer.files.load_wmt_en_fr_dataset(path=’data’)
Load WMT‘15 English-to-French translation dataset.
It will download the data from the WMT‘15 Website (10^9-French-English corpus), and the 2013 news test
from the same site as development set. Returns the directories of training data and test data.
Parameters path (str) – The path that the data is downloaded to; default is data/wmt_en_fr/.
References
Notes
Flickr25k
Examples
Flickr1M
Returns a list of images by a given tag from the Flickr1M dataset; it will download Flickr1M from the official website the first time you use it.
Parameters
• tag (str or None) –
What images to return.
– If you want to get images with tag, use string like ‘dog’, ‘red’, see Flickr Search.
– If you want to get all images, set to None.
• size (int) – An integer between 1 and 10. 1 means 100k images, ..., 5 means 500k images, 10 means all 1 million images. Default is 10.
• path (str) – The path that the data is downloaded to; default is data/flickr1M/.
• n_threads (int) – The number of threads used to read images.
• printable (boolean) – Whether to print information when reading images; default is False.
Examples
CycleGAN
tensorlayer.files.load_cyclegan_dataset(filename=’summer2winter_yosemite’,
path=’data’)
Load images from CycleGAN’s database, see this link.
Parameters
• filename (str) – The dataset you want, see this link.
• path (str) – The path that the data is downloaded to; default is data/cyclegan.
Examples
CelebA
tensorlayer.files.load_celebA_dataset(path=’data’)
Load the CelebA dataset.
Returns a list of image paths.
Parameters path (str) – The path that the data is downloaded to; default is data/celebA/.
VOC 2007/2012
Examples
>>> idx = 26
>>> print(classes)
['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']
>>> print(imgs_file_list[idx])
data/VOC/VOC2012/JPEGImages/2007_000423.jpg
>>> print(n_objs_list[idx])
2
>>> print(imgs_ann_file_list[idx])
data/VOC/VOC2012/Annotations/2007_000423.xml
>>> print(objs_info_list[idx])
14 0.173 0.461333333333 0.142 0.496
14 0.828 0.542666666667 0.188 0.594666666667
>>> ann = tl.prepro.parse_darknet_ann_str_to_list(objs_info_list[idx])
>>> print(ann)
[[14, 0.173, 0.461333333333, 0.142, 0.496], [14, 0.828, 0.542666666667, 0.188, 0.594666666667]]
>>> c, b = tl.prepro.parse_darknet_ann_list_to_cls_box(ann)
>>> print(c, b)
[14, 14] [[0.173, 0.461333333333, 0.142, 0.496], [0.828, 0.542666666667, 0.188, 0.594666666667]]
References
MPII
tensorlayer.files.load_mpii_pose_dataset(path=’data’, is_16_pos_only=False)
Load MPII Human Pose Dataset.
Parameters
• path (str) – The path that the data is downloaded to.
• is_16_pos_only (boolean) – If True, only return people containing 16 pose keypoints (usually used for single-person pose estimation).
Returns
• img_train_list (list of str) – The image directories of training data.
• ann_train_list (list of dict) – The annotations of training data.
• img_test_list (list of str) – The image directories of testing data.
• ann_test_list (list of dict) – The annotations of testing data.
Examples
References
Google Drive
tensorlayer.files.download_file_from_google_drive(ID, destination)
Download file from Google Drive.
See tl.files.load_celebA_dataset for example.
Parameters
• ID (str) – The file ID on Google Drive.
• destination (str) – The destination path to save the file.
TensorFlow provides the .ckpt file format to save and restore models, but we suggest using the standard Python file format .npz to save models for the sake of cross-platform compatibility.
Examples
Notes
If you get session issues, you can change value.eval() to value.eval(session=sess).
References
tensorlayer.files.load_npz(path=”, name=’model.npz’)
Load the parameters of a Model saved by tl.files.save_npz().
Parameters
• path (str) – Folder path to .npz file.
• name (str) – The name of the .npz file.
Returns A list of parameters in order.
Return type list of array
Examples
• See tl.files.save_npz
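A save/load round-trip sketch (network and sess are assumed to exist):

>>> tl.files.save_npz(network.all_params, name='model.npz', sess=sess)
>>> params = tl.files.load_npz(name='model.npz')
>>> tl.files.assign_params(sess, params, network)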
References
Examples
• See tl.files.save_npz
References
Examples
• See tl.files.save_npz
tensorlayer.files.load_and_assign_npz_dict(name=’model.npz’, sess=None)
Restore the parameters saved by tl.files.save_npz_dict().
Parameters
• name (str) – The name of the .npz file.
• sess (Session) – TensorFlow Session.
• save_dir (str) – The path / file directory to the ckpt, default is checkpoint.
• var_list (list of tensor) – The parameters / variables (tensor) to be saved. If
empty, save all global variables (default).
• is_latest (boolean) – Whether to load the latest ckpt; if False, load the ckpt with the name given by mode_name.
• printable (boolean) – Whether to print all parameters information.
Examples
tensorlayer.files.save_any_to_npy(save_dict=None, name=’file.npy’)
Save variables to .npy file.
Parameters
• save_dict (dictionary) – The variables to be saved.
• name (str) – File name.
Examples
tensorlayer.files.load_npy_to_any(path=”, name=’file.npy’)
Load .npy file.
Parameters
• path (str) – Path to the file (optional).
• name (str) – File name.
Examples
• see tl.files.save_any_to_npy()
tensorlayer.files.file_exists(filepath)
Check whether a file exists, given the file path.
tensorlayer.files.folder_exists(folderpath)
Check whether a folder exists, given the folder path.
Delete file
tensorlayer.files.del_file(filepath)
Delete a file by given file path.
Delete folder
tensorlayer.files.del_folder(folderpath)
Delete a folder by given folder path.
Read file
tensorlayer.files.read_file(filepath)
Read a file and return a string.
Examples
tensorlayer.files.load_folder_list(path=”)
Return a list of folders in a folder, given a folder path.
Parameters path (str) – A folder path.
tensorlayer.files.exists_or_mkdir(path, verbose=True)
Check a folder by a given name; if it does not exist, create the folder and return False; if the directory exists, return True.
Parameters
• path (str) – A folder path.
• verbose (boolean) – If True (default), prints results.
Returns True if the folder already exists; otherwise, creates the folder and returns False.
Return type boolean
Examples
>>> tl.files.exists_or_mkdir("checkpoints/train")
Download or extract
Examples
>>> tl.files.maybe_download_and_extract(filename='train-images-idx3-ubyte.gz',
...                                     working_directory='data/',
...                                     url_source='http://yann.lecun.com/exdb/mnist/')
>>> tl.files.maybe_download_and_extract(filename='ADEChallengeData2016.zip',
...                                     working_directory='data/',
...                                     url_source='http://sceneparsing.csail.mit.edu/data/',
...                                     extract=True)
2.6.5 Sort
tensorlayer.files.natural_keys(text)
Sort a list of strings with numbers in human order.
Examples
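For example, using natural_keys as a sort key:

>>> l = ['im1.jpg', 'im31.jpg', 'im11.jpg', 'im21.jpg', 'im03.jpg', 'im05.jpg']
>>> l.sort(key=tl.files.natural_keys)
>>> print(l)
['im1.jpg', 'im03.jpg', 'im05.jpg', 'im11.jpg', 'im21.jpg', 'im31.jpg']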
References
• link
tensorlayer.files.npz_to_W_pdf(path=None, regx=’w1pre_[0-9]+\\.(npz)’)
Convert the first weight matrix of .npz file to .pdf by using tl.visualize.W().
Parameters
• path (str) – A folder path to npz files.
• regx (str) – Regx for the file name.
Examples
Convert the first weight matrix of a w1_pre...npz file to w1_pre...pdf.
Data iteration.
Examples
>>> y = np.asarray([0, 1, 2, 3, 4, 5])
>>> for batch in tl.iterate.minibatches(inputs=X, targets=y, batch_size=2, shuffle=False):
>>>     print(batch)
(array([['a', 'a'], ['b', 'b']], dtype='<U1'), array([0, 1]))
(array([['c', 'c'], ['d', 'd']], dtype='<U1'), array([2, 3]))
(array([['e', 'e'], ['f', 'f']], dtype='<U1'), array([4, 5]))
Notes
If you have two inputs and one label and want to shuffle them together, e.g. X1 (1000, 100), X2 (1000, 80) and Y (1000, 1), you can stack them together (np.hstack((X1, X2))) into (1000, 180) and feed the result to inputs. After getting a batch, you can split it back into X1 and X2, as sketched below.
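A sketch of that pattern (shapes as in the note; names are illustrative):

>>> import numpy as np
>>> X1 = np.random.rand(1000, 100)
>>> X2 = np.random.rand(1000, 80)
>>> y = np.random.randint(0, 2, size=(1000,))
>>> X = np.hstack((X1, X2))  # (1000, 180)
>>> for batch in tl.iterate.minibatches(inputs=X, targets=y, batch_size=32, shuffle=True):
...     x_batch, y_batch = batch
...     x1_batch, x2_batch = x_batch[:, :100], x_batch[:, 100:]  # split back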
Sequence iteration 1
Examples
>>> print(batch)
(array([['a', 'a'], ['b', 'b'], ['b', 'b'], ['c', 'c']], dtype='<U1'), array([0, 1, 1, 2]))
(array([['c', 'c'], ['d', 'd'], ['d', 'd'], ['e', 'e']], dtype='<U1'), array([2, 3, 3, 4]))
Many to One
>>> Y = np.asarray([0, 1, 2, 3, 4, 5])
>>> for batch in tl.iterate.seq_minibatches(inputs=X, targets=Y, batch_size=2, seq_length=num_steps, stride=1):
>>>     x, y = batch
>>>     if return_last:
>>>         tmp_y = y.reshape((-1, num_steps) + y.shape[1:])
>>>         y = tmp_y[:, -1]
>>>     print(x, y)
[['a' 'a']
 ['b' 'b']
 ['b' 'b']
 ['c' 'c']] [1 2]
[['c' 'c']
 ['d' 'd']
 ['d' 'd']
 ['e' 'e']] [3 4]
Sequence iteration 2
Examples
[[ 0. 1. 2.] [ 10. 11. 12.]] [[ 20. 21. 22.] [ 30. 31. 32.]]
[[ 3. 4. 5.] [ 13. 14. 15.]] [[ 23. 24. 25.] [ 33. 34. 35.]]
[[ 6. 7. 8.] [ 16. 17. 18.]] [[ 26. 27. 28.] [ 36. 37. 38.]]
Notes
• Hint: if the input data are images, you can modify the source code data = np.zeros([batch_size, batch_len]) to data = np.zeros([batch_size, batch_len, inputs.shape[1], inputs.shape[2], inputs.shape[3]]).
Examples
[[ 3  4  5]   <-- 1st batch input    (2nd subset/iteration)
 [13 14 15]]  <-- 2nd batch input
[[ 4  5  6]   <-- 1st batch target
 [14 15 16]]  <-- 2nd batch target
[[ 6  7  8]                          (3rd subset/iteration)
 [16 17 18]]
[[ 7  8  9]
 [17 18 19]]
TensorLayer provides rich layer implementations tailored for various benchmarks and domain-specific problems. In addition, we also support transparent access to native TensorFlow parameters. For example, we provide not only layers for local response normalization, but also layers that allow users to apply tf.nn.lrn on network.outputs. More functions can be found in the TensorFlow API.
These functions help you to reuse parameters for different inference graphs, and to get a list of parameters by a given name. About TensorFlow parameter sharing, click here.
Examples
Print variables
tensorlayer.layers.print_all_variables(train_only=False)
Print information of trainable or all variables, without using tl.layers.initialize_global_variables(sess).
Parameters train_only (boolean) –
Whether to print trainable variables only.
• If True, print the trainable variables.
• If False, print all variables.
Initialize variables
tensorlayer.layers.initialize_global_variables(sess)
Initialize the global variables of TensorFlow.
sess = tf.InteractiveSession()
y = network.outputs
y_op = tf.argmax(tf.nn.softmax(y), 1)
train_params = network.all_params
tl.layers.initialize_global_variables(sess)
network.print_params()
network.print_layers()
In addition, network.all_drop is a dictionary that stores the keeping probabilities of all noise layers. In the above network, they represent the keeping probabilities of the dropout layers.
For training, you can enable all dropout layers; for evaluating and testing, you can disable them, as sketched below.
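A sketch of both cases, assuming x, y_ and the data arrays come from the MNIST example; tl.utils.dict_to_one sets all keeping probabilities to 1:

# enable noise layers for training
feed_dict = {x: X_train_batch, y_: y_train_batch}
feed_dict.update(network.all_drop)

# disable noise layers for evaluating/testing
dp_dict = tl.utils.dict_to_one(network.all_drop)
feed_dict = {x: X_test, y_: y_test}
feed_dict.update(dp_dict)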
For more details, please read the MNIST examples in the example folder.
A Simple Layer
To implement a custom layer in TensorLayer, you have to write a Python class that subclasses Layer and implements the outputs expression.
The following is an example implementation of a layer that multiplies its input by 2:
class DoubleLayer(Layer):
    def __init__(
            self,
            prev_layer=None,
            name='double_layer',
    ):
        # manage layer (fixed)
        super(DoubleLayer, self).__init__(prev_layer=prev_layer, name=name)
        # operation (customized)
        self.outputs = self.inputs * 2
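A usage sketch of the custom layer above (x is an assumed input placeholder):

x = tf.placeholder(tf.float32, shape=[None, 100])
net = tl.layers.InputLayer(x, name='input')
net = DoubleLayer(net, name='double')  # outputs = inputs * 2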
Before creating your own TensorLayer layer, let's have a look at the Dense layer. It creates a weight matrix and a bias vector if they do not exist, and then implements the output expression. At the end, for a layer with parameters, we also append the parameters into all_params.
class MyDenseLayer(Layer):
    def __init__(
            self,
            prev_layer=None,
            n_units=100,
            act=tf.nn.relu,
            name='simple_dense',
    ):
        # manage layer (fixed)
        super(MyDenseLayer, self).__init__(prev_layer=prev_layer, act=act, name=name)
        # operation (customized)
        n_in = int(self.inputs._shape[-1])
        with tf.variable_scope(name) as vs:
            # create new parameters
            W = tf.get_variable(name='W', shape=(n_in, n_units))
            b = tf.get_variable(name='b', shape=(n_units,))
            # tensor operation
            self.outputs = self._apply_activation(tf.matmul(self.inputs, W) + b)
count_params()
Return the number of parameters of this network.
get_all_params()
Return the parameters in a list of arrays.
Examples
• Define model
• Get information
>>> print(n)
Last layer is: DenseLayer (d2) [None, 80]
>>> n.print_layers()
[TL] layer 0: d1/Identity:0 (?, 80) float32
[TL] layer 1: d2/Identity:0 (?, 80) float32
>>> n.print_params(False)
[TL] param 0: d1/W:0 (100, 80) float32_ref
[TL] param 1: d1/b:0 (80,) float32_ref
[TL] param 2: d2/W:0 (80, 80) float32_ref
[TL] param 3: d2/b:0 (80,) float32_ref
[TL] num of params: 14560
>>> n.count_params()
14560
>>> for l in n:
>>> print(l)
Tensor("d1/Identity:0", shape=(?, 80), dtype=float32)
Tensor("d2/Identity:0", shape=(?, 80), dtype=float32)
Input Layer
Parameters
• inputs (placeholder or tensor) – The input of a network.
• name (str) – A unique layer name.
Examples
Examples
References
tensorflow/examples/tutorials/word2vec/word2vec_basic.py
Examples
(8, 50)
References
• [1] Iyyer, M., Manjunatha, V., Boyd-Graber, J., & Daumé III, H. (2015). Deep Unordered Composition Rivals Syntactic Methods for Text Classification. In Association for Computational Linguistics.
• [2] Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2016). Bag of Tricks for Efficient Text Classification.
Examples
(8, 50)
PReLU Layer
References
PReLU6 Layer
Parameters
References
PTReLU6 Layer
References
Simplified Convolutions
For users not familiar with TensorFlow, the following simplified functions may be easier to use. We will provide more simplified functions later; if you are comfortable with TensorFlow, the professional APIs may suit you better.
Conv1d
Examples
Conv2d
Examples
Simplified Deconvolutions
For users not familiar with TensorFlow, the following simplified functions may be easier to use. We will provide more simplified functions later; if you are comfortable with TensorFlow, the professional APIs may suit you better.
DeConv2d
DeConv3d
Expert Convolutions
Conv1dLayer
Conv2dLayer
Notes
• shape = [h, w, the number of output channels of the previous layer, the number of output channels]
• the number of output channels of a layer is its last dimension.
Examples
With TensorLayer
Conv3dLayer
Examples
Expert Deconvolutions
DeConv2dLayer
Notes
Examples
>>> batch_size = 64
>>> inputs = tf.placeholder(tf.float32, [batch_size, 100], name='z_noise')
>>> net_in = tl.layers.InputLayer(inputs, name='g/in')
>>> net_h0 = tl.layers.DenseLayer(net_in, n_units=8192,
...                               W_init=tf.random_normal_initializer(stddev=0.02),
...                               act=None, name='g/h0/lin')
>>> print(net_h0.outputs._shape)
(64, 8192)
>>> net_h0 = tl.layers.ReshapeLayer(net_h0, shape=(-1, 4, 4, 512), name='g/h0/reshape')
>>> print(net_h0.outputs._shape)
(64, 4, 4, 512)
>>> net_h1 = tl.layers.DeConv2dLayer(net_h0,
...                                  shape=(5, 5, 256, 512),
...                                  output_shape=(batch_size, 8, 8, 256),
...                                  strides=(1, 2, 2, 1),
...                                  act=None, name='g/h1/decon2d')
>>> net_h1 = tl.layers.BatchNormLayer(net_h1, act=tf.nn.relu, is_train=is_train, name='g/h1/batch_norm')
>>> print(net_h1.outputs._shape)
(64, 8, 8, 256)
U-Net
>>> ....
>>> conv10 = tl.layers.Conv2dLayer(conv9, act=tf.nn.relu,
...                                shape=(3, 3, 1024, 1024), strides=(1, 1, 1, 1), padding='SAME',
...                                W_init=w_init, b_init=b_init, name='conv10')
>>> print(conv10.outputs)
(batch_size, 32, 32, 1024)
>>> deconv1 = tl.layers.DeConv2dLayer(conv10, act=tf.nn.relu,
...                                   shape=(3, 3, 512, 1024), strides=(1, 2, 2, 1),
...                                   output_shape=(batch_size, 64, 64, 512),
DeConv3dLayer
Atrous (De)Convolutions
AtrousConv1dLayer
AtrousConv2dLayer
AtrousDeConv2dLayer
Deformable Convolutions
DeformableConv2d
Examples
>>> offset2 = tl.layers.Conv2d(net, 18, (3, 3), (1, 1), act=act, padding='SAME', name='offset2')
References
Notes
Depthwise Convolutions
DepthwiseConv2d
Parameters
• prev_layer (Layer) – Previous layer.
• filter_size (tuple of int) – The filter size (height, width).
• stride (tuple of int) – The stride step (height, width).
• act (activation function) – The activation function of this layer.
• padding (str) – The padding algorithm type: “SAME” or “VALID”.
• dilation_rate (tuple of 2 int) – The dilation rate in which we sample input
values across the height and width dimensions in atrous convolution. If it is greater than 1,
then all values of strides must be 1.
• depth_multiplier (int) – The number of channels to expand to.
• W_init (initializer) – The initializer for the weight matrix.
• b_init (initializer or None) – The initializer for the bias vector. If None, skip
bias.
• W_init_args (dictionary) – The arguments for the weight matrix initializer.
• b_init_args (dictionary) – The arguments for the bias vector initializer.
• name (str) – A unique layer name.
Examples
References
• tflearn’s grouped_conv_2d
• keras’s separableconv2d
Group Convolutions
GroupConv2d
Separable Convolutions
SeparableConv1d
• n_filter (int) – The dimensionality of the output space (i.e. the number of filters in the
convolution).
• filter_size (int) – Specifying the spatial dimensions of the filters. Can be a single
integer to specify the same value for all spatial dimensions.
• strides (int) – Specifying the stride of the convolution. Can be a single integer to spec-
ify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible
with specifying any dilation_rate value != 1.
• padding (str) – One of “valid” or “same” (case-insensitive).
• data_format (str) – One of channels_last (default) or channels_first. The ordering of
the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height,
width, channels) while channels_first corresponds to inputs with shape (batch, channels,
height, width).
• dilation_rate (int) – Specifying the dilation rate to use for dilated convolution. Can
be a single integer to specify the same value for all spatial dimensions. Currently, specifying
any dilation_rate value != 1 is incompatible with specifying any stride value != 1.
• depth_multiplier (int) – The number of depthwise convolution output channels for
each input channel. The total number of depthwise convolution output channels will be
equal to num_filters_in * depth_multiplier.
• depthwise_init (initializer) – Initializer for the depthwise convolution kernel.
• pointwise_init (initializer) – Initializer for the pointwise convolution kernel.
• b_init (initializer) – Initializer for the bias vector. If None, ignore bias in the pointwise part only.
• name (str) – A unique layer name.
SeparableConv2d
SubPixel Convolutions
SubpixelConv1d
Examples
References
SubpixelConv2d
Examples
>>> # this example shows how to set n_out_channel
>>> import numpy as np
>>> import tensorflow as tf
>>> import tensorlayer as tl
>>> x = np.random.rand(2, 16, 16, 4)
>>> X = tf.placeholder("float32", shape=(2, 16, 16, 4), name="X")
>>> net = tl.layers.InputLayer(X, name='input')
>>> net = tl.layers.SubpixelConv2d(net, scale=2, n_out_channel=1, name='subpixel_conv2d')
References
• Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural
Network
Dense Layer
Examples
With TensorLayer
>>> W = tf.Variable(
... tf.random_uniform([n_in, n_units], -1.0, 1.0), name='W')
>>> b = tf.Variable(tf.zeros(shape=[n_units]), name='b')
>>> y = tf.nn.relu(tf.matmul(inputs, W) + b)
Notes
If the layer input has more than two axes, it needs to be flattened by using FlattenLayer.
Examples
References
Examples
Examples
Tile layer
Examples
TF-Slim Layer
TF-Slim models can be connected into TensorLayer. All of Google's pre-trained models can be used easily; see Slim-model.
Notes
• As TF-Slim stores the layers in a dictionary, the all_layers in this network are not in order! Fortunately, the all_params are in order.
Examples
>>> # net 1
>>> net_1 = tl.layers.DenseLayer(net_in, n_units=800, act=tf.nn.relu, name='net1/relu1')
>>> # multiplexer
>>> net_mux = tl.layers.MultiplexerLayer(layers=[net_0, net_1], name='mux')
>>> network = tl.layers.ReshapeLayer(net_mux, shape=(-1, 800), name='reshape')
>>> network = tl.layers.DropoutLayer(network, keep=0.5, name='drop3')
>>> # output layer
>>> network = tl.layers.DenseLayer(network, n_units=10, act=None, name='output')
2D UpSampling
2D DownSampling
• prev_layer (Layer) – Previous layer with 4-D Tensor in the shape of (batch, height,
width, channels) or 3-D Tensor in the shape of (height, width, channels).
• size (tuple of int/float) – (height, width) scale factor or new size of height and
width.
• is_scale (boolean) – If True (default), the size is the scale factor; otherwise, the size
are numbers of pixels of height and width.
• method (int) –
The resize method selected through the index. The default index is 0, which is ResizeMethod.BILINEAR.
Lambda Layer
Examples
Non-parametric case
Examples
Concat Layer
Examples
ElementWise Layer
Examples
>>> net.print_params(False)
[TL] param 0: net_0/W:0 (784, 500) float32_ref
[TL] param 1: net_0/b:0 (500,) float32_ref
[TL] param 2: net_1/W:0 (784, 500) float32_ref
[TL] param 3: net_1/b:0 (500,) float32_ref
>>> net.print_layers()
[TL] layer 0: net_0/Relu:0 (?, 500) float32
[TL] layer 1: net_1/Relu:0 (?, 500) float32
[TL] layer 2: minimum:0 (?, 500) float32
Examples
Since local response normalization does not have any weights or arguments, you can also apply tf.nn.lrn directly on network.outputs.
Batch Normalization
References
• Source
• stackoverflow
Instance Normalization
Layer Normalization
Group Normalization
Switch Normalization
References
Notes
Examples
1D Zero padding
2D Zero padding
3D Zero padding
Examples
• see Conv2dLayer.
1D Max pooling
1D Mean pooling
2D Max pooling
2D Mean pooling
3D Max pooling
3D Mean pooling
Examples
Examples
Examples
Examples
Examples
Examples
This is an experimental API package for building Quantized Neural Networks. We are using matrix multiplication rather than add-minus and bit-count operations at the moment. Therefore, these APIs will not speed up inference; for production, you can train the model via TensorLayer and deploy it in a customized C/C++ implementation (we may provide an extra C/C++ binary-net framework that can load models from TensorLayer).
Note that these experimental APIs may change in the future.
Sign
Scale
Binary (De)Convolutions
BinaryConv2d
Examples
...
>>> net = tl.layers.SignLayer(net)
>>> net = tl.layers.BinaryConv2d(net, 64, (5, 5), (1, 1), padding='SAME', name='bcnn2')
TernaryDenseLayer
Ternary Convolutions
TernaryConv2d
Examples
...
>>> net = tl.layers.SignLayer(net)
>>> net = tl.layers.TernaryConv2d(net, 64, (5, 5), (1, 1), padding='SAME', name='bcnn2')
DoReFa Convolutions
DorefaConv2d
Examples
...
>>> net = tl.layers.SignLayer(net)
>>> net = tl.layers.DorefaConv2d(net, 64, (5, 5), (1, 1), padding='SAME', name='bcnn2')
QuanDenseLayer
QuanDenseLayerWithBN
Quantization Convolutions
Quantization
Examples
...
>>> net = tl.layers.QuanConv2d(net, 64, (5, 5), (1, 1), padding='SAME', act=tf.nn.relu, name='qcnn2')
QuanConv2dWithBN
Examples
All recurrent layers can implement any type of RNN cell by feeding in different cell functions (LSTM, GRU, etc.).
RNN layer
initial_state
Tensor or StateTuple –
The initial state of this layer.
• In practice, you can set your state at the beginning of each epoch or iteration according to your training procedure.
batch_size
int or Tensor – An integer if the batch_size can be computed statically; otherwise, a tensor for a dynamic batch size.
Examples
• For CNN+LSTM
Notes
Input dimension should be rank 3: [batch_size, n_steps, n_features]; if not, please see ReshapeLayer.
References
Bidirectional layer
Notes
Input dimension should be rank 3: [batch_size, n_steps, n_features]. If not, please see ReshapeLayer. For prediction, the sequence length has to be the same as in training, while for a normal RNN we can use a sequence length of 1 for prediction.
References
Source
Recurrent Convolution
class tensorlayer.layers.ConvRNNCell
Abstract object representing a Convolutional RNN Cell.
These operations are usually used inside the Dynamic RNN layer; they can compute the sequence lengths for different situations and get the last RNN outputs by indexing.
Output indexing
tensorlayer.layers.advanced_indexing_op(inputs, index)
Advanced indexing for sequences; returns the outputs by given sequence lengths. When returning the last output, DynamicRNNLayer uses it to get the last outputs with the sequence lengths.
Parameters
• inputs (tensor for data) – With shape of [batch_size, n_step(max), n_features]
Examples
References
• Modified from TFlearn (the original code is used for fixed length rnn), references.
tensorlayer.layers.retrieve_seq_length_op(data)
An op to compute the length of a sequence from an input of shape [batch_size, n_step(max), n_features]; it can be used when the features of the padding (on the right hand side) are all zeros.
Parameters data (tensor) – [batch_size, n_step(max), n_features] with zero padding on right
hand side.
Examples
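A minimal sketch (output values follow from the zero-padding rule above; a session is assumed):

>>> import numpy as np
>>> data = np.asarray([[[1], [2], [0], [0], [0]],
...                    [[1], [2], [3], [0], [0]],
...                    [[1], [2], [6], [1], [0]]])
>>> op = tl.layers.retrieve_seq_length_op(data)
>>> sess.run(op)  # [2, 3, 4]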
References
tensorlayer.layers.retrieve_seq_length_op2(data)
An op to compute the length of a sequence from an input of shape [batch_size, n_step(max)]; it can be used when the features of the padding (on the right hand side) are all zeros.
Parameters data (tensor) – [batch_size, n_step(max)] with zero padding on right hand side.
Examples
tensorlayer.layers.retrieve_seq_length_op3(data, pad_val=0)
Return a tensor for the sequence length, if the input is tf.string.
Get Mask
tensorlayer.layers.target_mask_op(data, pad_val=0)
Return a tensor for the mask, if the input is tf.string.
RNN Layer
• return_seq_2d (boolean) –
Only consider this argument when return_last is False
– If True, return 2D Tensor [n_example, n_hidden], for stacking DenseLayer after it.
– If False, return 3D Tensor [n_example/n_steps, n_steps, n_hidden], for stacking multiple RNNs after it.
• dynamic_rnn_init_args (dictionary) – The arguments for tf.nn.
dynamic_rnn.
• name (str) – A unique layer name.
outputs
tensor – The output of this layer.
final_state
tensor or StateTuple –
The final state of this layer.
• When state_is_tuple is False, it is the final hidden and cell states, states.get_shape() = [?, 2 *
n_hidden].
• When state_is_tuple is True, it stores two elements: (c, h).
• In practice, you can get the final state after each iteration during training, then feed it to the initial
state of next iteration.
initial_state
tensor or StateTuple –
The initial state of this layer.
• In practice, you can set your state at the beginning of each epoch or iteration according to your training procedure.
batch_size
int or tensor – An integer if the batch_size can be computed statically; otherwise, a tensor for a dynamic batch size.
sequence_length
a tensor or array – The sequence lengths computed by Advanced Opt or the given sequence lengths,
[batch_size]
Notes
Input dimension should be rank 3: [batch_size, n_steps(max), n_features]; if not, please see ReshapeLayer.
Examples
Synced sequence input and output, for loss function see tl.cost.cross_entropy_seq_with_mask.
>>> input_seqs = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name="input")
>>> net = tl.layers.EmbeddingInputlayer(
...     inputs=input_seqs,
...     vocabulary_size=vocab_size,
...     embedding_size=embedding_size,
...     name='embedding')
>>> net = tl.layers.DynamicRNNLayer(net,
...     cell_fn=tf.contrib.rnn.BasicLSTMCell,
...     n_hidden=embedding_size,
...     dropout=(0.7 if is_train else None),
...     sequence_length=tl.layers.retrieve_seq_length_op2(input_seqs),
...     return_last=False,    # for encoder, set to True
...     return_seq_2d=True,   # stack DenseLayer or compute cost after it
...     name='dynamicrnn')
>>> net = tl.layers.DenseLayer(net, n_units=vocab_size, name="output")
References
• Wild-ML Blog
• dynamic_rnn.ipynb
• tf.nn.dynamic_rnn
• tflearn rnn
• tutorial_dynamic_rnn.py
Bidirectional Layer
The sequence length of each row of input data, see Advanced Ops for Dynamic RNN.
fw(bw)_initial_state
tensor or StateTuple –
The initial state of this layer.
• In practice, you can set your state at the beginning of each epoch or iteration according to your
training procedure.
batch_size
int or tensor – An integer if the batch size can be computed statically; otherwise, a tensor for the dynamic batch
size.
sequence_length
a tensor or array – The sequence lengths computed by Advanced Opt or the given sequence lengths,
[batch_size].
Notes
Input dimension should be rank 3 : [batch_size, n_steps(max), n_features]. If not, please see ReshapeLayer.
References
• Wild-ML Blog
• bidirectional_rnn.ipynb
Sequence to Sequence
Simple Seq2Seq
Parameters
• net_encode_in (Layer) – Encode sequences, [batch_size, None, n_features].
• net_decode_in (Layer) – Decode sequences, [batch_size, None, n_features].
• cell_fn (TensorFlow cell function) –
A TensorFlow core RNN cell
– see RNN Cells in TensorFlow
– Note TF1.0+ and TF1.0- are different
outputs
tensor – The output of RNN decoder.
initial_state_encode
tensor or StateTuple – Initial state of RNN encoder.
initial_state_decode
tensor or StateTuple – Initial state of RNN decoder.
final_state_encode
tensor or StateTuple – Final state of RNN encoder.
final_state_decode
tensor or StateTuple – Final state of RNN decoder.
Notes
Examples
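A minimal sketch of wiring up the layer; the vocabulary size, hidden size, shared embedding scope and cell choice below are illustrative assumptions:

>>> batch_size = 32
>>> encode_seqs = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name="encode_seqs")
>>> decode_seqs = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name="decode_seqs")
>>> with tf.variable_scope("embedding") as vs:
...     net_encode = tl.layers.EmbeddingInputlayer(encode_seqs, vocabulary_size=10000, embedding_size=200, name='embed')
...     vs.reuse_variables()
...     net_decode = tl.layers.EmbeddingInputlayer(decode_seqs, vocabulary_size=10000, embedding_size=200, name='embed')
>>> net_out = tl.layers.Seq2Seq(net_encode, net_decode,
...     cell_fn=tf.contrib.rnn.BasicLSTMCell, n_hidden=200,
...     encode_sequence_length=tl.layers.retrieve_seq_length_op2(encode_seqs),
...     decode_sequence_length=tl.layers.retrieve_seq_length_op2(decode_seqs),
...     n_layer=1, return_seq_2d=True, name='seq2seq')
>>> net_out = tl.layers.DenseLayer(net_out, n_units=10000, act=tf.identity, name='output')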
>>> y = tf.nn.softmax(net_out.outputs)
>>> net_out.print_params(False)
Flatten Layer
We often apply DenseLayer, RNNLayer, ConcatLayer, etc. on top of a flatten layer. [batch_size,
mask_row, mask_col, n_mask] —> [batch_size, mask_row * mask_col * n_mask]
Parameters
• prev_layer (Layer) – Previous layer.
• name (str) – A unique layer name.
Examples
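A sketch flattening a 4D image batch; the shapes are illustrative:

>>> x = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])
>>> net = tl.layers.InputLayer(x, name='input')
>>> net = tl.layers.FlattenLayer(net, name='flatten')
>>> print(net.outputs.get_shape().as_list())
[None, 784]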
Reshape Layer
Examples
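A sketch reshaping flat vectors back into images; the shapes are illustrative:

>>> x = tf.placeholder(tf.float32, shape=[None, 784])
>>> net = tl.layers.InputLayer(x, name='input')
>>> net = tl.layers.ReshapeLayer(net, shape=[-1, 28, 28, 1], name='reshape')
>>> print(net.outputs.get_shape().as_list())
[None, 28, 28, 1]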
Transpose Layer
Examples
2D Affine Transformation
References
References
Notes
Stack Layer
Examples
Unstack Layer
Examples
Flatten tensor
tensorlayer.layers.flatten_reshape(variable, name=’flatten’)
Reshapes a high-dimensional input into a batch of flat vectors.
[batch_size, mask_row, mask_col, n_mask] —> [batch_size, mask_row x mask_col x n_mask]
Parameters
• variable (TensorFlow variable or tensor) – The variable or tensor to be
flattened.
• name (str) – A unique layer name.
Returns Flatten Tensor
Return type Tensor
Examples
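A sketch on a plain tensor; the shapes are illustrative:

>>> x = tf.placeholder(tf.float32, [None, 7, 7, 32])
>>> y = tl.layers.flatten_reshape(x, name='flatten')
>>> print(y.get_shape().as_list())
[None, 1568]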
tensorlayer.layers.clear_layers_name()
DEPRECATED FUNCTION
Warning: THIS FUNCTION IS DEPRECATED: It will be removed after 2018-06-30. Instructions
for updating: TensorLayer relies on TensorFlow to check naming.
tensorlayer.layers.initialize_rnn_state(state, feed_dict=None)
Returns the initialized RNN state. The inputs are LSTMStateTuple or State of RNNCells, and an optional
feed_dict.
Parameters
• state (RNN state.) – The TensorFlow’s RNN state.
• feed_dict (dictionary) – Initial RNN state; if None, returns zero state.
Returns The TensorFlow’s RNN state.
Return type RNN state
tensorlayer.layers.list_remove_repeat(x)
Remove the repeated items in a list and return the processed list. You may need it to create merged layers such as
Concat, Elementwise, etc.
Parameters x (list) – Input
Returns The list after removing its repeated items
Return type list
Examples
>>> l = [2, 3, 4, 2, 3]
>>> l = list_remove_repeat(l)
[2, 3, 4]
tensorlayer.layers.merge_networks(layers=None)
Merge all parameters, layers and dropout probabilities to a Layer. The output of the returned network is the first
network in the list.
Parameters layers (list of Layer) – Merge all parameters, layers and dropout probabilities to
the first layer in the list.
Returns The network after merging all parameters, layers and dropout probabilities to the first net-
work in the list.
Examples
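A sketch merging two branches; the placeholders and layer names are illustrative:

>>> x1 = tf.placeholder(tf.float32, [None, 784])
>>> x2 = tf.placeholder(tf.float32, [None, 784])
>>> n1 = tl.layers.DenseLayer(tl.layers.InputLayer(x1, name='in1'), 80, name='d1')
>>> n2 = tl.layers.DenseLayer(tl.layers.InputLayer(x2, name='in2'), 80, name='d2')
>>> net = tl.layers.merge_networks([n1, n2])   # net carries the output of n1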
TensorLayer provides many pretrained models, you can easily use the whole or a part of the pretrained models via
these APIs.
2.9.1 VGG16
Examples
Extract features with VGG16 and Train a classifier with 100 classes
>>> x = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> # get VGG without the last layer
>>> vgg = tl.models.VGG16(x, end_with='fc2_relu')
>>> # add one more layer
>>> net = tl.layers.DenseLayer(vgg, 100, name='out')
Reuse model
>>> x1 = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> x2 = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> # get VGG without the last layer
>>> vgg1 = tl.models.VGG16(x1, end_with='fc2_relu')
>>> # reuse the parameters of vgg1 with different input
>>> vgg2 = tl.models.VGG16(x2, end_with='fc2_relu', reuse=True)
>>> # restore pre-trained VGG parameters (as they share parameters, we don't need to restore vgg2)
>>> vgg1.restore_params(sess)
2.9.2 VGG19
Examples
Extract features with VGG19 and Train a classifier with 100 classes
>>> x = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> # get VGG without the last layer
>>> vgg = tl.models.VGG19(x, end_with='fc2_relu')
>>> # add one more layer
>>> net = tl.layers.DenseLayer(vgg, 100, name='out')
Reuse model
2.9.3 SqueezeNetV1
Examples
Reuse model
2.9.4 MobileNetV1
Examples
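A sketch following the same pattern as the VGG models above; restore_params is assumed to behave as for VGG16:

>>> x = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> # get the whole model
>>> net = tl.models.MobileNetV1(x)
>>> # restore pre-trained parameters
>>> sess = tf.InteractiveSession()
>>> net.restore_params(sess)
>>> # inference
>>> probs = tf.nn.softmax(net.outputs)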
Reuse model
Examples
Setting num_skips=2, skip_window=1 uses the words to the right and left. Similarly, num_skips=4,
skip_window=2 means using the nearby 4 words.
>>> print(batch)
[2 2 3 3 4 4 5 5]
>>> print(labels)
[[3]
...
Simple sampling
tensorlayer.nlp.sample(a=None, temperature=1.0)
Sample an index from a probability array.
Parameters
• a (list of float) – List of probabilities.
• temperature (float or None) –
The higher the temperature, the more uniform the distribution. When a = [0.1, 0.2, 0.7]:
– temperature = 0.7, the distribution will be sharpened: [0.05048273, 0.13588945,
0.81362782]
– temperature = 1.0, the distribution will be the same: [0.1, 0.2, 0.7]
– temperature = 1.5, the distribution will be flattened: [0.16008435, 0.25411807,
0.58579758]
– If None, it will be np.argmax(a)
Notes
• No matter what the temperature and input list are, the sum of all probabilities will be one. Even if the input
list is [1, 100, 200], the sum of all probabilities will still be one.
• For a large vocabulary size, choose a higher temperature or tl.nlp.sample_top to avoid errors.
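For example:

>>> a = [0.1, 0.2, 0.7]
>>> idx = tl.nlp.sample(a, temperature=0.7)   # sharper, strongly favours index 2
>>> idx = tl.nlp.sample(a, temperature=None)  # deterministic, same as np.argmax(a)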
tensorlayer.nlp.sample_top(a=None, top_k=10)
Sample from top_k probabilities.
Parameters
• a (list of float) – List of probabilities.
• top_k (int) – Number of candidates to be considered.
Vocabulary class
Examples
>>> a 969108
>>> <S> 586368
>>> </S> 586368
>>> . 440479
>>> on 213612
...
Process sentence
Examples
Notes
Create vocabulary
Examples
Pre-process sentences
Create vocabulary
Creating vocabulary.
Total words: 8
Words in vocabulary: 8
Wrote vocabulary file: vocab.txt
tensorlayer.nlp.simple_read_words(filename=’nietzsche.txt’)
Read context from file without any preprocessing.
Parameters filename (str) – A file path (like .txt file)
Returns The context in a string.
Return type str
Read file
tensorlayer.nlp.read_words(filename=’nietzsche.txt’, replace=None)
Read the context of a file as a list of words.
For customized read_words method, see tutorial_generate_text.py.
Parameters
• filename (str) – a file path.
• replace (list of str) – replace original string by target string.
Returns The context in a list (split using space).
Return type list of str
tensorlayer.nlp.read_analogies_file(eval_file=’questions-words.txt’, word2id=None)
Reads through an analogy question file and returns it in ID format.
Parameters
• eval_file (str) – The file name.
• word2id (dictionary) – a dictionary that maps word to ID.
Returns A [n_examples, 4] numpy array containing the analogy question’s word IDs.
Return type numpy.array
Examples
>>> print(analogy_questions)
[[ 3068 1248 7161 1581]
[ 3068 1248 28683 5642]
[ 3068 1248 3878 486]
...,
tensorlayer.nlp.build_vocab(data)
Build vocabulary.
Given the context in list format, return the vocabulary, which is a dictionary mapping words to IDs, e.g. {'campbell':
2587, 'atlantic': 2247, 'aoun': 6746, ...}
Parameters data (list of str) – The context in list format
Returns A dictionary that maps each word to a unique ID, e.g. {'campbell': 2587, 'atlantic': 2247, 'aoun': 6746, ...}
Return type dictionary
References
• tensorflow.models.rnn.ptb.reader
Examples
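A sketch using the read helper documented below; the file name is illustrative:

>>> data = tl.nlp.read_words('nietzsche.txt')
>>> word_to_id = tl.nlp.build_vocab(data)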
tensorlayer.nlp.build_reverse_dictionary(word_to_id)
Given a dictionary that maps words to integer IDs, returns a reverse dictionary that maps an ID to a word.
Parameters word_to_id (dictionary) – that maps word to ID.
Returns A dictionary that maps IDs to words.
Return type dictionary
tensorlayer.nlp.build_words_dataset(words=None, vocabulary_size=50000, printable=True, unk_key='UNK')
Build the words dictionary and replace rare words with the unk_key token. The most common word has the smallest ID.
Parameters
• words (list of str or byte) – The context in list format.
• vocabulary_size (int) – The maximum vocabulary size, limiting the vocabulary size.
Then the script replaces rare words with the unk_key token.
• printable (boolean) – Whether to print the read vocabulary size of the given words.
• unk_key (str) – Represents unknown words.
Returns
• data (list of int) – The context in a list of ID.
• count (list of tuple and list) –
Pair words and IDs.
– count[0] is a list : the number of rare words
– count[1:] are tuples : the number of occurrence of each word
– e.g. [[‘UNK’, 418391], (b’the’, 1061396), (b’of’, 593677), (b’and’, 416629), (b’one’,
411764)]
• dictionary (dictionary) – It is word_to_id that maps word to ID.
• reverse_dictionary (a dictionary) – It is id_to_word that maps ID to word.
Examples
References
• tensorflow/examples/tutorials/word2vec/word2vec_basic.py
Save vocabulary
tensorlayer.nlp.save_vocab(count=None, name=’vocab.txt’)
Save the vocabulary to a file so the model can be reloaded.
Parameters count (a list of tuple and list) – count[0] is a list : the number of rare
words, count[1:] are tuples : the number of occurrence of each word, e.g. [[‘UNK’, 418391],
(b’the’, 1061396), (b’of’, 593677), (b’and’, 416629), (b’one’, 411764)]
Examples
tensorlayer.nlp.words_to_word_ids(data=None, word_to_id=None, unk_key='UNK')
Convert a list of strings (words) to IDs.
Examples
References
• tensorflow.models.rnn.ptb.reader
tensorlayer.nlp.word_ids_to_words(data, id_to_word)
Convert a list of integers to strings (words).
Parameters
• data (list of int) – The context in list format.
• id_to_word (dictionary) – a dictionary that maps ID to word.
Returns A list of string or byte to represent the context.
Return type list of str
Examples
Word Tokenization
Examples
References
Data file is assumed to contain one sentence per line. Each sentence is tokenized and digits are normalized (if
normalize_digits is set). Vocabulary contains the most-frequent tokens up to max_vocabulary_size. We write it
to vocabulary_path in a one-token-per-line format, so that the token in the first line gets id=0, the token in the second
line gets id=1, and so on.
Parameters
• vocabulary_path (str) – Path where the vocabulary will be created.
• data_path (str) – Data file that will be used to create vocabulary.
• max_vocabulary_size (int) – Limit on the size of the created vocabulary.
• tokenizer (function) – A function to use to tokenize each data sentence. If None,
basic_tokenizer will be used.
• normalize_digits (boolean) – If True, all digits are replaced by 0.
• _DIGIT_RE (regular expression function) – Default is re.
compile(br"\d").
• _START_VOCAB (list of str) – The pad, go, eos and unk token, default is
[b"_PAD", b"_GO", b"_EOS", b"_UNK"].
References
tensorlayer.nlp.initialize_vocabulary(vocabulary_path)
Initialize vocabulary from file, return the word_to_id (dictionary) and id_to_word (list).
We assume the vocabulary is stored one item per line, so a file with the lines dog and cat will result in a vocabulary
{"dog": 0, "cat": 1}, and this function will also return the reversed vocabulary ["dog", "cat"].
Parameters vocabulary_path (str) – Path to the file containing the vocabulary.
Returns
• vocab (dictionary) – a dictionary that maps word to ID.
• rev_vocab (list of str) – a list that maps ID to word.
Examples
References
2.10.9 Metrics
BLEU
Examples
References
• Google/seq2seq/metric/bleu
TensorLayer provides rich layer implementations tailored for various benchmarks and domain-specific problems. In
addition, we also support transparent access to native TensorFlow parameters. For example, we provide not only layers
for local response normalization, but also layers that allow users to apply tf.nn.lrn on network.outputs. More
functions can be found in the TensorFlow API.
TensorLayer provides a simple API and tools to ease research and development and to reduce the time to production.
Therefore, we provide the latest state-of-the-art optimizers that work with TensorFlow.
Reinforcement Learning.
Examples
Examples
Log weight
Examples
fit(sess, network, train_op, cost, X_train, ...) – Train a given non-time-series network with the given cost function, training data, batch_size, n_epoch, etc.
test(sess, network, acc, X_test, y_test, x, ...) – Test a given non-time-series network with the given test data and metric.
predict(sess, network, X, x, y_op[, batch_size]) – Return the prediction results of a given non-time-series network.
evaluation([y_test, y_predict, n_classes]) – Input the predicted results, target results and the number of classes; return the confusion matrix, per-class F1-score, accuracy and macro F1-score.
class_balancing_oversample([X_train, ...]) – Input the features and labels; return the features and labels after oversampling.
get_random_int([min_v, max_v, number, seed]) – Return a list of random integers for the given range and quantity.
Training
Parameters
• sess (Session) – TensorFlow Session.
• network (TensorLayer layer) – the network to be trained.
• train_op (TensorFlow optimizer) – The optimizer for training e.g.
tf.train.AdamOptimizer.
• X_train (numpy.array) – The input of training data
• y_train (numpy.array) – The target of training data
• x (placeholder) – For inputs.
• y (placeholder) – For targets.
• acc (TensorFlow expression or None) – Metric for accuracy or others. If None,
would not print the information.
• batch_size (int) – The batch size for training and evaluating.
• n_epoch (int) – The number of training epochs.
• print_freq (int) – Print the training information every print_freq epochs.
• X_val (numpy.array or None) – The input of validation data. If None, would not
perform validation.
• y_val (numpy.array or None) – The target of validation data. If None, would not
perform validation.
• eval_train (boolean) – Whether to evaluate the model during training. If X_val and
y_val are not None, it reflects whether to evaluate the model on training data.
• tensorboard_dir (string) – path to log dir, if set, summary data will be stored to
the tensorboard_dir/ directory for visualization with tensorboard. (default None) Also runs
tl.layers.initialize_global_variables(sess) internally in fit() to setup the summary nodes.
• tensorboard_epoch_freq (int) – How many epochs between storing tensorboard
checkpoint for visualization to log/ directory (default 5).
• tensorboard_weight_histograms (boolean) – If True updates tensorboard
data in the logs/ directory for visualization of the weight histograms every tensor-
board_epoch_freq epoch (default True).
• tensorboard_graph_vis (boolean) – If True stores the graph in the tensorboard
summaries saved to log/ (default True).
Examples
See tutorial_mnist_simple.py
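A typical call, following tutorial_mnist_simple.py; the placeholders x, y_ and the ops cost, train_op, acc are assumed to be defined as in that tutorial:

>>> tl.utils.fit(sess, network, train_op, cost, X_train, y_train, x, y_,
...              acc=acc, batch_size=500, n_epoch=200, print_freq=5,
...              X_val=X_val, y_val=y_val, eval_train=False)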
Notes
If tensorboard_dir is not None, global_variables_initializer will be run inside the fit function in order to initialize
the automatically generated summary nodes used for tensorboard visualization; thus the effect of any
tf.global_variables_initializer().run() issued before the fit() call will be overridden.
Evaluation
Examples
See tutorial_mnist_simple.py
Prediction
Examples
See tutorial_mnist_simple.py
>>> y = network.outputs
>>> y_op = tf.argmax(tf.nn.softmax(y), 1)
>>> print(tl.utils.predict(sess, network, X_test, x, y_op))
Examples
Examples
One X
Two X
Examples
tensorlayer.utils.dict_to_one(dp_dict)
Input a dictionary; return a dictionary in which all items are set to one.
Used to disable dropout, dropconnect layers and so on.
Parameters dp_dict (dictionary) – The dictionary contains key and number, e.g. keeping
probabilities.
Examples
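For example, to disable all dropout layers during testing; a sketch, assuming network.all_drop is the dropout-probability dictionary of a TL network:

>>> dp_dict = tl.utils.dict_to_one(network.all_drop)  # set all keeping probabilities to 1
>>> feed_dict = {x: X_test, y_: y_test}
>>> feed_dict.update(dp_dict)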
tensorlayer.utils.list_string_to_dict(string)
Inputs ['a', 'b', 'c'], returns {'a': 0, 'b': 1, 'c': 2}.
Flatten a list
tensorlayer.utils.flatten_list(list_of_list)
Input a list of lists; return a flat list containing all the items.
Parameters list_of_list (a list of list) –
Examples
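For example:

>>> tl.utils.flatten_list([[1, 2, 3], [4, 5], [6]])
[1, 2, 3, 4, 5, 6]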
tensorlayer.utils.exit_tensorflow(sess=None, port=6006)
Close the TensorFlow session, TensorBoard and Nvidia processes if available.
Parameters
• sess (Session) – TensorFlow Session.
• port (int) – The TensorBoard port you want to close, 6006 by default.
tensorlayer.utils.open_tensorboard(log_dir=’/tmp/tensorflow’, port=6006)
Open Tensorboard.
Parameters
• log_dir (str) – Directory where your tensorboard logs are saved
• port (int) – TensorBoard port you want to open, 6006 is tensorboard default
tensorlayer.utils.clear_all_placeholder_variables(printable=True)
Clears all the placeholder variables of keep prob, including keeping probabilities of all dropout, denoising,
dropconnect etc.
Parameters printable (boolean) – If True, print all deleted variables.
tensorlayer.utils.set_gpu_fraction(gpu_fraction=0.3)
Set the GPU memory fraction for the application.
Parameters gpu_fraction (float) – Fraction of GPU memory, (0 ~ 1]
References
TensorFlow provides TensorBoard to visualize the model, activations etc. Here we provide more functions for data
visualization.
tensorlayer.visualize.read_image(image, path=”)
Read one image.
Parameters
• image (str) – The image file name.
• path (str) – The image folder path.
Returns The image.
Return type numpy.array
tensorlayer.visualize.save_image(image, image_path=’_temp.png’)
Save an image.
Parameters
• image (numpy array) – [w, h, c]
• image_path (str) – path
Examples
References
Examples
References
Examples
Visualize weights
Examples
Image by matplotlib
Examples
Images by matplotlib
Examples
Examples
This is the alpha version of the database management system. If you have any trouble, please ask for help at
tensorlayer@gmail.com.
TensorLayer is designed for real world production, capable of large scale machine learning applications. TensorLayer
database is introduced to address the many data management challenges in the large scale machine learning projects,
such as:
1. Finding training data from an enterprise data warehouse.
2. Loading large datasets that are beyond the storage limitation of one computer.
3. Managing different models with version control, and comparing them (e.g. accuracy).
4. Automating the process of training, evaluating and deploying machine learning models.
With the TensorLayer system, we introduce this database technology to address the challenges above.
The database management system is designed with the following three principles in mind.
Everything is Data
Data warehouses can store and capture the entire machine learning development process. The data can be categorized
as:
1. Dataset: This includes all the data used for training, validation and prediction. The labels can be manually
specified or generated by model prediction.
2. Model architecture: The database includes a table that stores different model architectures, enabling users to
reuse many existing model development efforts.
3. Model parameters: This database stores all the model parameters of each epoch in the training step.
4. Tasks: A project usually includes many small tasks. Each task contains the necessary information such as hyper-
parameters for training or validation. For a training task, typical information includes the training data, the model
parameters, the model architecture, and how many epochs the training task has. Validation, testing and inference are
also supported by the task system.
5. Loggings: The logs store all the metrics of each machine learning model, such as the time stamp, loss and
accuracy of each batch or epoch.
The TensorLayer database is in principle a keyword-based search engine. Each model, parameter, or training data is
assigned many tags. The storage system organizes data into two layers: the index layer and the blob layer. The index
layer stores all the tags and references to the blob storage. The index layer is implemented based on NoSQL document
database such as MongoDB. The blob layer stores videos, medical images or label masks in large chunk size, which
is usually implemented based on a file system. Our database is based on MongoDB. The blob system is based on the
GridFS while the indexes are stored as documents.
Within the database framework, any entity within the data warehouse, such as the data, model or tasks, is specified
by the database query language. A query is more space-efficient for storage and can specify multiple objects in a
concise way. Another advantage of such a design is a highly flexible software system: many systems can be implemented
by simply rewriting different components, and many new applications can be implemented just by updating the query,
without modifying any application code.
2.15.2 Preparation
In principle, the database can be implemented by any document oriented NoSQL database system. The existing
implementation is based on MongoDB. Further implementations on other databases will be released depending on the
progress. It will be straightforward to port our database system to Google Cloud, AWS and Azure. The following
tutorials are based on the MongoDB implementation.
The installation instructions for MongoDB can be found in the MongoDB Docs. There are also many managed MongoDB
services from Amazon or GCP, such as MongoDB Atlas. Users can also use Docker, which is a powerful tool for
deploying software. After installing MongoDB, a MongoDB management tool with a graphical user interface will be
extremely useful. Users can also install Studio 3T (MongoChef), a powerful user interface tool for MongoDB
that is free for non-commercial use (studio3t).
2.15.3 Tutorials
As with MongoDB management tools, an IP address and port number are required for connecting to the database. To
distinguish different projects, the database instances take a project_name argument. In the following sketch, we
connect to MongoDB on a local machine with the IP localhost and port 27017 (the default port number
of MongoDB).
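A minimal connection sketch; the class name TensorHub and its argument names are assumptions based on this release's tensorlayer.db module:

import tensorlayer as tl

# connect to a local MongoDB and select a project namespace
db = tl.db.TensorHub(
    ip='localhost', port=27017, dbname='temp',
    username=None, password='password', project_name='tutorial'
)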
Dataset management
You can save a dataset into the database and allow all machines to access it. Apart from the dataset key, you can
also insert custom arguments such as version and description, for better managing the datasets (see the example
below). Note that all saving functions automatically save a timestamp, allowing you to load data, models and tasks
by timestamp.
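For example, mirroring the save_dataset API documented below:

db.save_dataset([X_train, y_train, X_test, y_test], 'mnist', version='1.0', description='this is a tutorial')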
After saving the dataset, others can access the dataset as follows:
dataset = db.find_dataset('mnist')
dataset = db.find_dataset('mnist', version='1.0')
If you have multiple datasets that use the same dataset key, you can get all of them as follows:
datasets = db.find_all_datasets('mnist')
Model management
Save the model architecture and parameters into the database (see the example below). The model architecture is
represented by a TL graph, and the parameters are stored as a list of arrays.
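For example, as in the save_model API documented below:

db.save_model(net, accuracy=0.8, loss=2.3, name='second_model')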
If there are many models, you can use MongoDB's sort method to find the model you want. To get the newest or
oldest model, you can sort by time:
## newest model
net = db.find_top_model(sess=sess, sort=[("time", -1)])
## oldest model
net = db.find_top_model(sess=sess, sort=[("time", 1)])
If you save the model along with its accuracy, you can get the model with the best accuracy as follows:
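A sketch, assuming the models were saved with an accuracy field as above:

net = db.find_top_model(sess=sess, sort=[("accuracy", -1)])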
db.delete_model()
If you want to specify which model to delete, pass the matching arguments.
db.save_training_log(accuracy=0.33)
db.save_training_log(accuracy=0.44)
db.delete_training_log(accuracy=0.33)
db.delete_training_log()
db.delete_validation_log()
db.delete_testing_log()
Task distribution
A project usually consists of many tasks such as hyperparameter selection. To make it easier, we can distribute these
tasks to several GPU servers. A task consists of a task script, hyperparameters, the desired result and a status.
A task distributor can push both dataset and tasks into a database, allowing task runners on GPU servers to pull and
run. The following is an example that pushes 3 tasks with different hyper parameters.
## push tasks into database, then allow other servers pull tasks to run
db.create_task(
    task_name='mnist', script='task_script.py', hyper_parameters=dict(n_units1=800, n_units2=800),
    saved_result_keys=['test_accuracy'], description='800-800'
)
db.create_task(
    task_name='mnist', script='task_script.py', hyper_parameters=dict(n_units1=600, n_units2=600),
    saved_result_keys=['test_accuracy'], description='600-600'
)
db.create_task(
    task_name='mnist', script='task_script.py', hyper_parameters=dict(n_units1=400, n_units2=400),
    saved_result_keys=['test_accuracy'], description='400-400'
)
## you can get the model and result from database and do some analysis at the end
The task runners on GPU servers can monitor the database and run tasks as soon as they become available. In the
task script, we can save the final model and results to the database; this allows task distributors to get the
desired model and results.
Example codes
See here.
Examples
Returns True for success, False for failure.
Return type boolean
Examples
Examples
>>> db.delete_tasks()
delete_testing_log(**kwargs)
Deletes testing log.
Parameters kwargs (logging information) – Find items to delete; leave it empty to
delete all logs.
Examples
• see save_training_log.
delete_training_log(**kwargs)
Deletes training log.
Parameters kwargs (logging information) – Find items to delete; leave it empty to
delete all logs.
Examples
Examples
• see save_training_log.
find_datasets(dataset_name=None, **kwargs)
Finds and returns all datasets from the database which match the requirement. In some cases, the data in
a dataset can be stored separately for better management.
Parameters
• dataset_name (str) – The name/key of dataset.
• kwargs (other events) – Other events, such as description, author, etc. (optional).
Returns params
Return type the parameters, return False if nothing found.
find_top_dataset(dataset_name=None, sort=None, **kwargs)
Finds and returns a dataset from the database which matches the requirement.
Parameters
• dataset_name (str) – The name of dataset.
• sort (List of tuple) – PyMongo sort argument; search “PyMongo find one sorting”
and collection level operations for more details.
• kwargs (other events) – Other events, such as description, author, etc. (optional).
Examples
Save dataset
>>> db.save_dataset([X_train, y_train, X_test, y_test], 'mnist', description='this is a tutorial')
Get dataset
>>> dataset = db.find_top_dataset('mnist')
>>> datasets = db.find_datasets('mnist')
Returns dataset – Return False if nothing found.
Return type the dataset or False
find_top_model(sess, sort=None, model_name=’model’, **kwargs)
Finds and returns a model architecture and its parameters from the database which matches the require-
ment.
Parameters
• sess (Session) – TensorFlow session.
• sort (List of tuple) – PyMongo sort argument; search “PyMongo find one sorting”
and collection level operations for more details.
• model_name (str or None) – The name/key of model.
• kwargs (other events) – Other events, such as name, accuracy, loss, step number,
etc. (optional).
Examples
• see save_model.
Returns network – Note that, the returned network contains all information of the document
(record), e.g. if you saved accuracy in the document, you can get the accuracy by using
net._accuracy.
Return type TensorLayer layer
Examples
Monitor the database and pull tasks to run
>>> while True:
>>>     print("waiting task from distributor")
>>>     db.run_top_task(task_name='mnist', sort=[("time", -1)])
>>>     time.sleep(1)
Returns True for success, False for failure.
Return type boolean
save_dataset(dataset=None, dataset_name=None, **kwargs)
Saves one dataset into the database; a timestamp will be added automatically.
Parameters
• dataset (any type) – The dataset you want to store.
• dataset_name (str) – The name of dataset.
• kwargs (other events) – Other events, such as description, author, etc. (optional).
Examples
Save dataset
>>> db.save_dataset([X_train, y_train, X_test, y_test], 'mnist', description='this is a tutorial')
Get dataset
>>> dataset = db.find_top_dataset('mnist')
Returns True if saving succeeded, otherwise False.
Return type boolean
save_model(network=None, model_name=’model’, **kwargs)
Save the model architecture and parameters into the database; a timestamp will be added automatically.
Parameters
• network (TensorLayer layer) – TensorLayer layer instance.
• model_name (str) – The name/key of model.
• kwargs (other events) – Other events, such as name, accuracy, loss, step number,
etc. (optional).
Examples
Save model architecture and parameters into database.
>>> db.save_model(net, accuracy=0.8, loss=2.3, name='second_model')
Load one model with parameters from database (run this in another script).
>>> net = db.find_top_model(sess=sess, accuracy=0.8, loss=2.3)
Find and load the latest model.
>>> net = db.find_top_model(sess=sess, sort=[("time", pymongo.DESCENDING)])
>>> net = db.find_top_model(sess=sess, sort=[("time", -1)])
Find and load the oldest model.
>>> net = db.find_top_model(sess=sess, sort=[("time", pymongo.ASCENDING)])
>>> net = db.find_top_model(sess=sess, sort=[("time", 1)])
Get model information.
>>> net._accuracy
... 0.8
Returns True for success, False for failure.
Return type boolean
save_testing_log(**kwargs)
Saves the testing log, timestamp will be added automatically.
Parameters kwargs (logging information) – Events, such as accuracy, loss, step number,
etc.
Examples
save_training_log(**kwargs)
Saves the training log, timestamp will be added automatically.
Parameters kwargs (logging information) – Events, such as accuracy, loss, step number,
etc.
Examples
save_validation_log(**kwargs)
Saves the validation log, timestamp will be added automatically.
Parameters kwargs (logging information) – Events, such as accuracy, loss, step number,
etc.
Examples
Command-line Reference
The tensorlayer.cli module provides a command-line tool for some common tasks.
3.1.1 tl train
Usage
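A typical invocation might look like the following; this is a sketch, with example.py standing in for your trainer script and the flags corresponding to the Notes below:

tl train example.py           # train with default settings
tl train -p 2 example.py      # use 2 parameter servers
tl train -c 4 example.py      # CPU-only parallel training with 4 trainers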
Command-line Arguments
Notes
A parallel training program would require multiple parameter servers to help parallel trainers to exchange intermediate
gradients. The best number of parameter servers is often proportional to the size of your model as well as the number
of CPUs available. You can control the number of parameter servers using the -p parameter.
If you have a single computer with massive CPUs, you can use the -c parameter to enable CPU-only parallel training.
The reason we are not supporting GPU-CPU co-training is because GPU and CPU are running at different speeds.
Using them together in training would incur stragglers.
Python Module Index
tensorlayer.activation, 37
tensorlayer.array_ops, 42
tensorlayer.cli, 243
tensorlayer.cli.train, 243
tensorlayer.cost, 45
tensorlayer.db, 237
tensorlayer.distributed, 92
tensorlayer.files, 94
tensorlayer.iterate, 112
tensorlayer.layers, 115
tensorlayer.models, 201
tensorlayer.nlp, 205
tensorlayer.optimizers, 218
tensorlayer.prepro, 52
tensorlayer.rein, 219
tensorlayer.utils, 221
tensorlayer.visualize, 227