TensorLayer Documentation
Release 1.11.1
TensorLayer contributors
1 User Guide
1.1 Installation
1.2 Tutorials
1.3 Examples
1.4 Contributing
1.5 Get Involved in Research
1.6 FAQ
2 API Reference
2.1 API - Activations
2.2 API - Array Operations
2.3 API - Cost
2.4 API - Data Pre-Processing
2.5 API - Distributed Training
2.6 API - Files
2.7 API - Iteration
2.8 API - Layers
2.9 API - Models
2.10 API - Natural Language Processing
2.11 API - Optimizers
2.12 API - Reinforcement Learning
2.13 API - Utility
2.14 API - Visualization
2.15 API - Database
Note: If you have problems reading the docs online, you can download the repository from GitHub and open /docs/_build/html/index.html to read the docs offline. The _build folder can be generated in docs by running make html.
User Guide
The TensorLayer user guide explains how to install TensorFlow, CUDA and cuDNN, how to build and train neural
networks using TensorLayer, and how to contribute to the library as a developer.
1.1 Installation
TensorLayer has some prerequisites that need to be installed first, including TensorFlow, numpy and matplotlib. For GPU support, CUDA and cuDNN are required.
If you run into any trouble, please check the TensorFlow installation instructions, which cover installing TensorFlow on a range of operating systems including macOS, Linux and Windows, or ask for help at tensorlayer@gmail.com or in the FAQ.
TensorLayer is built on top of the Python version of TensorFlow, so please install Python first.
Note: We highly recommend Python 3 instead of Python 2 for the sake of future compatibility.
A Python installation that includes the pip command for installing additional modules is recommended. Besides, a virtual environment via virtualenv can help you manage Python packages.
Taking Python 3 on Ubuntu as an example, to install Python together with pip, run the following commands:
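sudo apt-get install python3
sudo apt-get install python3-pip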
To build a virtual environment and install dependencies into it, run the following commands. (You can also skip to Step 3 and let TensorLayer install the prerequisites automatically.)
virtualenv env
env/bin/pip install matplotlib
env/bin/pip install numpy
env/bin/pip install scipy
env/bin/pip install scikit-image
env/bin/pip list
After that, you can run a Python script using the virtual environment's Python as follows.
env/bin/python *.py
The installation instructions for TensorFlow are documented in detail on the TensorFlow website. However, there are some things that need to be considered. For example, TensorFlow officially supports GPU acceleration on Linux, macOS and Windows at present.
Warning: For ARM processor architecture, you need to install TensorFlow from source.
The simplest way to install TensorLayer is as follows; it will also install numpy and matplotlib automatically:
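pip install tensorlayer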
However, if you want to modify or extend TensorLayer, you can download the repository from GitHub and install it as follows:
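git clone https://github.com/tensorlayer/tensorlayer.git
cd tensorlayer
pip install -e .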
This command runs setup.py to install TensorLayer. The -e flag means editable: you can edit the source code in the tensorlayer folder and import the edited TensorLayer.
Thanks to NVIDIA's support, training a fully connected network on a GPU may be 10 to 20 times faster than training it on a CPU; convolutional networks may be around 50 times faster. This requires an NVIDIA GPU with CUDA and cuDNN support.
CUDA
The TensorFlow website also explains how to install CUDA and cuDNN; please see TensorFlow GPU Support.
Download and install the latest CUDA, which is available from the NVIDIA website:
• CUDA download and install
If CUDA is set up correctly, the following command should print some GPU information on the terminal:
python -c "import tensorflow"
cuDNN
Apart from CUDA, NVIDIA also provides a library for common neural network operations that especially speeds up Convolutional Neural Networks (CNNs). Again, it can be obtained from NVIDIA after registering as a developer (it takes a while):
Download and install the latest cuDNN, which is available from the NVIDIA website:
• cuDNN download and install
To install it, copy the *.h files to /usr/local/cuda/include and the lib* files to /usr/local/cuda/lib64.
TensorLayer is built on top of the Python version of TensorFlow, so please install Python first. Note: We highly recommend installing Anaconda. The lowest supported Python version is 3.5.
Anaconda download
GPU support
Thanks to NVIDIA's support, training a fully connected network on a GPU may be 10 to 20 times faster than training it on a CPU; convolutional networks may be around 50 times faster. This requires an NVIDIA GPU with CUDA and cuDNN support.
You should install Microsoft Visual Studio (VS) before installing CUDA. The lowest supported version is VS2010; we recommend installing VS2013 or VS2015. CUDA 7.5 supports VS2010, VS2012 and VS2013. CUDA 8.0 also supports VS2015.
2. Installing CUDA
Download and install the latest CUDA, which is available from the NVIDIA website:
CUDA download
We do not recommend modifying the default installation directory.
3. Installing cuDNN
The NVIDIA CUDA® Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. Download and extract the latest cuDNN, which is available from the NVIDIA website:
cuDNN download
After extracting cuDNN, you will get three folders (bin, lib, include). These folders should then be copied into the CUDA installation directory. (The default installation directory is C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v8.0.)
Installing TensorLayer
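Install the stable version from PyPI:
pip install tensorlayer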
Test
import tensorlayer
If no error occurs and the expected output is displayed, the GPU version has been installed successfully.
1.1.6 Issue
If you get an error output when importing tensorlayer, please read the FAQ.
1.2 Tutorials
For deep learning, this tutorial walks you through building handwritten digit classifiers using the MNIST dataset, arguably the "Hello World" of neural networks. For reinforcement learning, we will let the computer learn to play the Pong game from the original screen inputs. For natural language processing, we start from word embedding and then describe language modeling and machine translation.
This tutorial includes a modularized implementation of Google's TensorFlow Deep Learning tutorial, so you can read the TensorFlow Deep Learning tutorial at the same time [en] [cn].
Note: For experts: read the source code of InputLayer and DenseLayer and you will understand how TensorLayer works. After that, we recommend reading the code on GitHub directly.
The tutorial assumes that you are somewhat familiar with neural networks and TensorFlow (the library which TensorLayer is built on top of). You can learn the basics of neural networks from the Deep Learning Tutorial.
For a slower-paced introduction to artificial neural networks, we recommend Convolutional Neural Networks for Visual Recognition by Andrej Karpathy et al. and Neural Networks and Deep Learning by Michael Nielsen.
To learn more about TensorFlow, have a look at the TensorFlow tutorial. You will not need all of it, but a basic understanding of how TensorFlow works is required to use TensorLayer. If you're new to TensorFlow, we recommend going through that tutorial first.
sess = tf.InteractiveSession()
# prepare data
X_train, y_train, X_val, y_val, X_test, y_test = \
tl.files.load_mnist_dataset(shape=(-1,784))
# define placeholder
x = tf.placeholder(tf.float32, shape=[None, 784], name='x')
y_ = tf.placeholder(tf.int64, shape=[None, ], name='y_')
# evaluation
tl.utils.test(sess, network, acc, X_test, y_test, x, y_, batch_size=None, cost=cost)
In the first part of the tutorial, we will just run the MNIST example that’s included in the source distribution of
TensorLayer. The MNIST dataset contains 60000 handwritten digits that are commonly used for training various
image processing systems. Each digit is 28x28 pixels in size.
We assume that you have already run through the Installation. If you haven't done so already, get a copy of the source tree of TensorLayer and navigate to its folder in a terminal window. Then run the tutorial_mnist.py example script:
python tutorial_mnist.py
If everything is set up correctly, you will get an output like the following:
learning_rate: 0.000100
batch_size: 128
The example script allows you to try different models, including Multi-Layer Perceptron, Dropout, Dropconnect,
Stacked Denoising Autoencoder and Convolutional Neural Network. Select different models from if __name__
== '__main__':.
main_test_layers(model='relu')
main_test_denoise_AE(model='relu')
main_test_stacked_denoise_AE(model='relu')
main_test_cnn_layer()
Let’s now investigate what’s needed to make that happen! To follow along, open up the source code.
Preface
The first thing you might notice is that besides TensorLayer, we also import numpy and tensorflow:
import tensorflow as tf
import tensorlayer as tl
from tensorlayer.layers import set_keep
import numpy as np
import time
As we know, TensorLayer is built on top of TensorFlow; it is meant as a supplement that helps with some tasks, not as a replacement. You will always mix TensorLayer with some vanilla TensorFlow code. set_keep is used to access the placeholders of keeping probabilities when using the Denoising Autoencoder.
Loading data
The first piece of code defines a function load_mnist_dataset(). Its purpose is to download the MNIST dataset
(if it hasn’t been downloaded yet) and return it in the form of regular numpy arrays. There is no TensorLayer involved
at all, so for the purpose of this tutorial, we can regard it as:
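X_train, y_train, X_val, y_val, X_test, y_test = \
    tl.files.load_mnist_dataset(shape=(-1, 784))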
X_train.shape is (50000, 784), to be interpreted as: 50,000 images and each image has 784 pixels.
y_train.shape is simply (50000,), a vector of the same length as X_train giving an integer class label for each image – namely, the digit between 0 and 9 depicted in the image (according to the human annotator who drew that digit).
For the Convolutional Neural Network example, MNIST can be loaded as a 4D version as follows:
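X_train, y_train, X_val, y_val, X_test, y_test = \
    tl.files.load_mnist_dataset(shape=(-1, 28, 28, 1))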
X_train.shape is (50000, 28, 28, 1), which represents 50,000 images with 1 channel, 28 rows and 28 columns each. The channel is one because these are greyscale images, where every pixel has only one value.
This is where TensorLayer steps in. It allows you to define an arbitrarily structured neural network by creating and
stacking or merging layers. Since every layer knows its immediate incoming layers, the output layer (or output layers)
of a network double as a handle to the network as a whole, so usually this is the only thing we will pass on to the rest
of the code.
As mentioned above, tutorial_mnist.py supports four types of models, implemented via easily exchangeable functions with the same interface. First, we'll define a function that creates a Multi-Layer Perceptron (MLP) of a fixed architecture, explaining all the steps in detail. We'll then implement a Denoising Autoencoder (DAE); after that we will stack all Denoising Autoencoders and fine-tune them with supervision. Finally, we'll show how to create a Convolutional Neural Network (CNN). In addition, there is a simple example for the MNIST dataset in tutorial_mnist_simple.py and a CNN example for the CIFAR-10 dataset in tutorial_cifar10_tfrecord.py.
The first script, main_test_layers(), creates an MLP of two hidden layers of 800 units each, followed by a
softmax output layer of 10 units. It applies 20% dropout to the input data and 50% dropout to the hidden layers.
To feed data into the network, TensorFlow placeholders need to be defined as follows. The None here means the network will accept input data of arbitrary batch size after compilation. x is used to hold the X_train data and y_ is used to hold the y_train data. If you know the batch size beforehand and do not need this flexibility, you should give the batch size here – especially for convolutional layers, this can allow TensorFlow to apply some optimizations.
The foundation of each neural network in TensorLayer is an InputLayer instance representing the input data that
will subsequently be fed to the network. Note that the InputLayer is not tied to any specific data yet.
Before adding the first hidden layer, we'll apply 20% dropout to the input data. This is realized via a DropoutLayer instance (see the sketch below).
Note that the first constructor argument is the incoming layer and the second argument is the keeping probability for the activation values. Now we'll proceed with the first fully-connected hidden layer of 800 units by stacking a DenseLayer.
Again, the first constructor argument means that we're stacking network on top of network. n_units simply gives the number of units for this fully-connected layer. act takes an activation function, several of which are defined in tensorflow.nn and tensorlayer.activation. Here we've chosen the rectifier, so we'll obtain ReLUs. We'll now add dropout of 50%, another 800-unit dense layer and 50% dropout again.
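Putting these steps together, here is a minimal sketch of the stack so far (layer names follow tutorial_mnist.py; exact arguments may differ slightly):
network = tl.layers.InputLayer(x, name='input')
network = tl.layers.DropoutLayer(network, keep=0.8, name='drop1')   # 20% dropout on the input
network = tl.layers.DenseLayer(network, n_units=800, act=tf.nn.relu, name='relu1')
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop2')   # 50% dropout
network = tl.layers.DenseLayer(network, n_units=800, act=tf.nn.relu, name='relu2')
network = tl.layers.DropoutLayer(network, keep=0.5, name='drop3')   # 50% dropout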
Finally, we'll add the fully-connected output layer, whose n_units equals the number of classes. Note that the softmax is implemented internally in tf.nn.sparse_softmax_cross_entropy_with_logits() to speed up computation, so we use identity in the last layer; see tl.cost.cross_entropy() for more details.
network = tl.layers.DenseLayer(network,
n_units=10,
act = tf.identity,
name='output')
As mentioned above, each layer is linked to its incoming layer(s), so we only need the output layer(s) to access a
network in TensorLayer:
y = network.outputs
y_op = tf.argmax(tf.nn.softmax(y), 1)
cost = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=y_))
Here, network.outputs is the 10 identity outputs of the network (in one-hot format), and y_op is the integer output representing the class index, while cost is the cross-entropy between the targets and the predicted labels.
The Autoencoder is an unsupervised learning model which is able to extract representative features; it has become widely used for learning generative models of data and for greedy layer-wise pre-training. For the vanilla Autoencoder, see the Deep Learning Tutorial.
The script main_test_denoise_AE() implements a Denoising Autoencoder with a corrosion rate of 50%. The Autoencoder can be defined as follows, where an Autoencoder is represented by a DenseLayer:
recon_layer1 = tl.layers.ReconLayer(network,
x_recon=x,
n_units=784,
act=tf.nn.sigmoid,
name='recon_layer1')
To train the DenseLayer, simply run ReconLayer.pretrain(). If using a denoising Autoencoder, the name of the corrosion layer (a DropoutLayer) needs to be specified as follows. To save the feature images, set save to True. There are many kinds of pre-training metrics for different architectures and applications. For the sigmoid activation, the Autoencoder can be implemented using KL divergence, while for the rectifier, L1 regularization of the activation outputs can make the outputs sparse. So the default behaviour of ReconLayer only provides KLD and cross-entropy for the sigmoid activation function, and L1 of activation outputs plus mean-squared-error for the rectifying activation function. We recommend modifying ReconLayer to implement your own pre-training metric.
recon_layer1.pretrain(sess,
x=x,
X_train=X_train,
X_val=X_val,
denoise_name='denoising1',
n_epoch=200,
batch_size=128,
print_freq=10,
save=True,
save_name='w1pre_')
In addition, the script main_test_stacked_denoise_AE() shows how to stack multiple Autoencoders into one network and then fine-tune them.
Finally, the main_test_cnn_layer() script creates two CNN layers and max pooling stages, a fully-connected
hidden layer and a fully-connected output layer. More CNN examples can be found in other examples, like
tutorial_cifar10_tfrecord.py.
The remaining part of the tutorial_mnist.py script copes with setting up and running a training loop over the
MNIST dataset by using cross-entropy only.
Dataset iteration
An iteration function synchronously iterates over two numpy arrays of input data and targets, respectively, in mini-batches of a given number of items. More iteration functions can be found in tensorlayer.iterate.
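A minimal sketch of a training loop built on tl.iterate.minibatches (assuming x, y_, train_op and network are defined as above):
for X_batch, y_batch in tl.iterate.minibatches(X_train, y_train, batch_size=128, shuffle=True):
    feed_dict = {x: X_batch, y_: y_batch}
    feed_dict.update(network.all_drop)   # enable dropout during training
    sess.run(train_op, feed_dict=feed_dict)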
Loss and update expressions
y = network.outputs
y_op = tf.argmax(tf.nn.softmax(y), 1)
cost = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y, labels=y_))
More costs or regularization terms can be applied here. For example, to apply max-norm on the weight matrices, we can add the following line:
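# a sketch: apply max-norm to the two hidden-layer weight matrices
cost = cost + tl.cost.maxnorm_regularizer(1.0)(network.all_params[0]) + \
              tl.cost.maxnorm_regularizer(1.0)(network.all_params[2])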
Depending on the problem you are solving, you will need different loss functions; see tensorlayer.cost for more. Apart from using network.all_params to get the variables, we can also use tl.layers.get_variables_with_name to get specific variables by string name.
Having the model and the loss function, we create the update expression/operation for training the network. TensorLayer does not provide many optimizers; we use TensorFlow's optimizers instead:
train_params = network.all_params
train_op = tf.train.AdamOptimizer(learning_rate, beta1=0.9, beta2=0.999,
epsilon=1e-08, use_locking=False).minimize(cost, var_list=train_params)
For training the network, we feed data and the keeping probabilities into the feed_dict.
feed_dict = {x: X_train_a, y_: y_train_a}
feed_dict.update( network.all_drop )
sess.run(train_op, feed_dict=feed_dict)
For validation and testing, we use a slightly different approach. All Dropout, DropConnect and Corrosion layers need to be disabled; we use tl.utils.dict_to_one to set all values in network.all_drop to 1.
dp_dict = tl.utils.dict_to_one( network.all_drop )
feed_dict = {x: X_test_a, y_: y_test_a}
feed_dict.update(dp_dict)
err, ac = sess.run([cost, acc], feed_dict=feed_dict)
What Next?
We also have a more advanced image classification example in tutorial_cifar10_tfrecord.py. Please read the code and notes, and figure out how to generate more training data and what local response normalization is. After that, try to implement Residual Network (hint: you may want to use Layer.outputs).
In the second part of the tutorial, we will run the Deep Reinforcement Learning example which is introduced by
Karpathy in Deep Reinforcement Learning: Pong from Pixels.
python tutorial_atari_pong.py
Before running the tutorial code, you need to install the OpenAI Gym environment, which is a popular benchmark for Reinforcement Learning. If everything is set up correctly, you will get an output like the following:
[2016-07-12 09:31:59,760] Making new env: Pong-v0
[TL] InputLayer input_layer (?, 6400)
[TL] DenseLayer relu1: 200, relu
[TL] DenseLayer output_layer: 3, identity
param 0: (6400, 200) (mean: -0.000009 median: -0.000018 std: 0.017393)
param 1: (200,) (mean: 0.000000 median: 0.000000 std: 0.000000)
param 2: (200, 3) (mean: 0.002239 median: 0.003122 std: 0.096611)
param 3: (3,) (mean: 0.000000 median: 0.000000 std: 0.000000)
num of params: 1280803
layer 0: Tensor("Relu:0", shape=(?, 200), dtype=float32)
layer 1: Tensor("add_1:0", shape=(?, 3), dtype=float32)
episode 0: game 0 took 0.17381s, reward: -1.000000
episode 0: game 1 took 0.12629s, reward: 1.000000 !!!!!!!!
episode 0: game 2 took 0.17082s, reward: -1.000000
episode 0: game 3 took 0.08944s, reward: -1.000000
This example allows the neural network to learn to play the Pong game from the screen inputs, just like human behaviour. The neural network plays against a fake AI player and learns to beat it. After training for 15,000 episodes, the neural network can win 20% of the games; at 20,000 episodes it wins 35% of the games. We can see that the neural network learns faster and faster as it has more winning data to train on. If you run it for 30,000 episodes, it never loses.
render = False
resume = False
Set render to True if you want to display the game environment. When you run the code again, you can set resume to True; the code will then load the existing model and continue training based on it.
Pong Game
To understand Reinforcement Learning, we let the computer learn how to play the Pong game from the original screen inputs. Before we start, we highly recommend going through the famous blog post Deep Reinforcement Learning: Pong from Pixels, which is a minimalistic implementation of Deep Reinforcement Learning using python-numpy and the OpenAI Gym environment.
python tutorial_atari_pong.py
Policy Network
In Deep Reinforcement Learning, the Policy Network is just a Deep Neural Network: it is our player (or "agent") that outputs actions telling us what to do (move UP or DOWN). In Karpathy's code, he only defined 2 actions, UP and DOWN, using a single sigmoid output. To make our tutorial more generic, we define 3 actions, UP, DOWN and STOP (do nothing), using 3 softmax outputs.
When our agent is playing Pong, it calculates the probabilities of the different actions and then draws a sample (action) from this distribution. As the actions are represented by 1, 2 and 3, but the softmax outputs start from 0, we calculate the label value by subtracting 1.
prob = sess.run(
sampling_prob,
feed_dict={states_batch_pl: x}
)
# action. 1: STOP 2: UP 3: DOWN
action = np.random.choice([1,2,3], p=prob.flatten())
...
ys.append(action - 1)
Policy Gradient
Policy gradient methods are end-to-end algorithms that directly learn policy functions mapping states to actions. An
approximate policy could be learned directly by maximizing the expected rewards. The parameters of a policy function
(e.g. the parameters of a policy network used in the pong example) could be trained and learned under the guidance of
the gradient of expected rewards. In other words, we can gradually tune the policy function via updating its parameters,
such that it will generate actions from given states towards higher rewards.
An alternative to policy gradient is Deep Q-Learning (DQN). It is based on Q-Learning, which tries to learn a value function (called the Q function) mapping states and actions to some value. DQN employs a deep neural network to represent the Q function as a function approximator. Training is done by minimizing temporal-difference errors. A neurobiologically inspired mechanism called "experience replay" is typically used along with DQN to help improve the stability issues caused by the use of a non-linear function approximator.
You can check the following papers to gain a better understanding of Reinforcement Learning:
• Reinforcement Learning: An Introduction. Richard S. Sutton and Andrew G. Barto
• Deep Reinforcement Learning. David Silver, Google DeepMind
• UCL Course on RL
The most successful applications of Deep Reinforcement Learning in recent years include DQN with experience replay to play Atari games, and AlphaGo, which for the first time beat world-class professional Go players. AlphaGo used the policy gradient method to train its policy network, similar to the Pong example.
• Atari - Playing Atari with Deep Reinforcement Learning
• Atari - Human-level control through deep reinforcement learning
• AlphaGO - Mastering the game of Go with deep neural networks and tree search
Dataset iteration
In Reinforcement Learning, we consider a final decision as an episode. In the Pong game, an episode is a few dozen games, because the games go up to a score of 21 for either player. The batch size is then how many episodes we consider for one model update. In the tutorial, we train a 2-layer policy network with 200 hidden units using RMSProp on batches of 10 episodes.
The loss in a batch relates to all the outputs of the Policy Network, all the actions we took, and the corresponding discounted rewards. We first compute the loss of each action by multiplying its discounted reward by the cross-entropy between the network's output and the true action. The final loss in a batch is the sum of the losses of all actions.
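A sketch of this loss, assuming the placeholder names below and the helper tl.rein.cross_entropy_reward_loss from tensorlayer.rein:
actions_pl = tf.placeholder(tf.int32, shape=[None])             # actions taken in the batch
discount_rewards_pl = tf.placeholder(tf.float32, shape=[None])  # discounted rewards
# cross-entropy of each action, weighted by its discounted reward, summed over the batch
loss = tl.rein.cross_entropy_reward_loss(
    logits=network.outputs, actions=actions_pl, rewards=discount_rewards_pl)
train_op = tf.train.RMSPropOptimizer(learning_rate=1e-4, decay=0.99).minimize(loss)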
What Next?
The tutorial above shows how you can build your own agent, end-to-end. While it has reasonable quality, the default
parameters will not give you the best agent model. Here are a few things you can improve.
First of all, instead of a conventional MLP model, we can use CNNs to capture the screen information better, as Playing Atari with Deep Reinforcement Learning describes.
Also, the default parameters of the model are not tuned. You can try changing the learning rate, decay, or initializing
the weights of your model in a different way.
Finally, you can try the model on different tasks (games) and try other reinforcement learning algorithms in the Examples.
In this part of the tutorial, we train a matrix for words, where each word can be represented by a unique row vector in the matrix. In the end, similar words will have similar vectors. Then, as we plot the words onto a two-dimensional plane, similar words end up clustering near each other.
python tutorial_word2vec_basic.py
Word Embedding
We highly recommend reading Colah's blog Word Representations to understand why we want to use a vector representation and how to compute the vectors. (For Chinese readers, a translated version is available.) More details about word2vec can be found in Word2vec Parameter Learning Explained.
Basically, training an embedding matrix is unsupervised learning. As every word is identified by a unique ID, which is the row index of the embedding matrix, a word can be converted into a vector that better represents its meaning. For example, there seems to be a constant male-female difference vector: woman - man = queen - king, which means one dimension in the vector represents gender.
The model can be created as follows.
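A sketch of the model, following tutorial_word2vec_basic.py (the hyper-parameter values are illustrative):
train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
emb_net = tl.layers.Word2vecEmbeddingInputlayer(
    inputs=train_inputs,
    train_labels=train_labels,
    vocabulary_size=50000,
    embedding_size=128,
    num_sampled=64,              # negative samples for the NCE loss
    name='word2vec_layer')
cost = emb_net.nce_cost          # the layer exposes the NCE loss
train_op = tf.train.AdagradOptimizer(1.0).minimize(cost)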
Word2vec uses Negative Sampling and the Skip-Gram model for training. Noise-Contrastive Estimation (NCE) loss can help to reduce the computation of the loss. Skip-Gram inverts contexts and targets, trying to predict each context word from its target word. We use tl.nlp.generate_skip_gram_batch to generate training data as follows; see tutorial_generate_text.py.
data_index = 0
while (step < num_steps):
batch_inputs, batch_labels, data_index = tl.nlp.generate_skip_gram_batch(
data=data, batch_size=batch_size, num_skips=num_skips,
skip_window=skip_window, data_index=data_index)
feed_dict = {train_inputs : batch_inputs, train_labels : batch_labels}
_, loss_val = sess.run([train_op, cost], feed_dict=feed_dict)
At the end of training the embedding matrix, we save the matrix and the corresponding dictionaries. Then, next time, we can restore the matrix and dictionaries as follows (see main_restore_embedding_layer() in tutorial_generate_text.py).
vocabulary_size = 50000
embedding_size = 128
model_file_name = "model_word2vec_50k_128"
batch_size = None
tl.nlp.save_vocab(count, name='vocab_'+model_file_name+'.txt')
load_params = tl.files.load_npz(name=model_file_name+'.npz')
x = tf.placeholder(tf.int32, shape=[batch_size])
y_ = tf.placeholder(tf.int32, shape=[batch_size, 1])
emb_net = tl.layers.EmbeddingInputlayer(
inputs = x,
vocabulary_size = vocabulary_size,
embedding_size = embedding_size,
name ='embedding_layer')
tl.layers.initialize_global_variables(sess)
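Finally, the saved parameters can be assigned into the new embedding layer (a sketch, assuming load_params[0] holds the embedding matrix):
tl.files.assign_params(sess, [load_params[0]], emb_net)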
Penn TreeBank (PTB) dataset is used in many LANGUAGE MODELING papers, including “Empirical Evaluation
and Combination of Advanced Language Modeling Techniques”, “Recurrent Neural Network Regularization”. It
consists of 929k training words, 73k validation words, and 82k test words. It has 10k words in its vocabulary.
The PTB example shows how to train a recurrent neural network on a challenging task of language modeling. Given a sentence "I am from Imperial College London", the model can learn to predict "Imperial College London" from "from Imperial College". In other words, it predicts the next word in a text given a history of previous words. In the example above, num_steps (sequence length) is 3.
python tutorial_ptb_lstm.py
The script provides three settings (small, medium, large), where a larger model has better performance. You can
choose different settings in:
flags.DEFINE_string(
"model", "small",
"A type of model. Possible options are: small, medium, large.")
The PTB example shows that RNNs are able to model language, but this example does not do anything practically interesting. However, you should read through this example and "Understand LSTM" to grasp the basics of RNNs. After that, you will learn how to generate text, how to achieve language translation, and how to build a question answering system using RNNs.
We personally think Andrej Karpathy's blog is the best material to understand Recurrent Neural Networks; after reading it, Colah's blog can help you understand LSTM networks [chinese], which can solve the Problem of Long-Term Dependencies. We will not describe the theory of RNNs further, so please read through these blogs before you go on.
The model in PTB example is a typical type of synced sequence input and output, which was described by Karpathy
as “(5) Synced sequence input and output (e.g. video classification where we wish to label each frame of the video).
Notice that in every case there are no pre-specified constraints on the lengths of sequences because the recurrent
transformation (green) can be applied as many times as we like.”
The model is built as follows. Firstly, we transfer the words into word vectors by looking up an embedding matrix. In
this tutorial, there is no pre-training on the embedding matrix. Secondly, we stack two LSTMs together using dropout
between the embedding layer, LSTM layers, and the output layer for regularization. In the final layer, the model
provides a sequence of softmax outputs.
The first LSTM layer outputs [batch_size, num_steps, hidden_size] for stacking another LSTM after it. The second LSTM layer outputs [batch_size*num_steps, hidden_size] for stacking a DenseLayer after it. Then the DenseLayer computes the softmax outputs of each example (n_examples = batch_size*num_steps).
To understand the PTB tutorial, you can also read TensorFlow PTB tutorial.
(Note that TensorLayer supports DynamicRNNLayer after v1.1, so you can set the input/output dropouts and the number of RNN layers in one single layer.)
network = tl.layers.EmbeddingInputlayer(
inputs = x,
vocabulary_size = vocab_size,
embedding_size = hidden_size,
E_init = tf.random_uniform_initializer(-init_scale, init_scale),
name ='embedding_layer')
if is_training:
network = tl.layers.DropoutLayer(network, keep=keep_prob, name='drop1')
network = tl.layers.RNNLayer(network,
cell_fn=tf.contrib.rnn.BasicLSTMCell,
    cell_init_args={'forget_bias': 0.0, 'state_is_tuple': True},
    n_hidden=hidden_size,
    initializer=tf.random_uniform_initializer(-init_scale, init_scale),
    n_steps=num_steps,
    return_last=False,
    name='basic_lstm_layer1')
Dataset iteration
The batch_size can be seen as the number of concurrent computations we are running. As the following example shows, the first batch learns the sequence information from items 0 to 9; the second batch learns the sequence information from items 10 to 19. So it ignores the information flow from item 9 to item 10! Only if we set batch_size = 1 will it consider all the information from items 0 to 20.
The meaning of batch_size here is not the same as the batch_size in the MNIST example. In the MNIST
example, batch_size reflects how many examples we consider in each iteration, while in the PTB example,
batch_size is the number of concurrent processes (segments) for accelerating the computation.
Some information will be ignored if batch_size > 1, however, if your dataset is “long” enough (a text corpus
usually has billions of words), the ignored information would not affect the final result.
In the PTB tutorial, we set batch_size = 20, so we divide the dataset into 20 segments. At the beginning of each epoch, we initialize (reset) the 20 RNN states for the 20 segments to zero, then go through the 20 segments separately. An example of generating training data is as follows:
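A minimal sketch using tl.iterate.ptb_iterator:
train_data = [i for i in range(20)]
for batch in tl.iterate.ptb_iterator(train_data, batch_size=2, num_steps=3):
    x, y = batch
    # x is the input; y is the target, i.e. x shifted one step ahead, e.g.
    # x = [[0 1 2] [10 11 12]]   y = [[1 2 3] [11 12 13]]
    print(x, y)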
Note: This example can also be considered as pre-training of the word embedding matrix.
For updating, truncated backpropagation clips the values of gradients by the ratio of the sum of their norms, so as to make the learning process tractable.
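A sketch of this update, assuming cost, lr and max_grad_norm are defined as in the PTB script (tf.clip_by_global_norm is the standard TensorFlow helper):
tvars = tf.trainable_variables()
grads, _ = tf.clip_by_global_norm(tf.gradients(cost, tvars), max_grad_norm)
optimizer = tf.train.GradientDescentOptimizer(lr)
train_op = optimizer.apply_gradients(zip(grads, tvars))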
In addition, if the epoch index is greater than max_epoch, we decrease the learning rate by multiplying it by lr_decay.
At the beginning of each epoch, all states of the LSTMs need to be reset (initialized) to zero. Then, after each iteration, the LSTMs' states are updated, so the new LSTM states (final states) need to be assigned as the initial states of the next iteration:
# set all states to zero states at the beginning of each epoch
state1 = tl.layers.initialize_rnn_state(lstm1.initial_state)
state2 = tl.layers.initialize_rnn_state(lstm2.initial_state)
for step, (x, y) in enumerate(tl.iterate.ptb_iterator(train_data,
batch_size, num_steps)):
feed_dict = {input_data: x, targets: y,
lstm1.initial_state: state1,
lstm2.initial_state: state2,
}
# For training, enable dropout
feed_dict.update( network.all_drop )
# use the new states as the initial state of next iteration
_cost, state1, state2, _ = sess.run([cost,
lstm1.final_state,
lstm2.final_state,
train_op],
feed_dict=feed_dict
)
costs += _cost; iters += num_steps
Predicting
After training the model, when we predict the next output, we no longer consider the number of steps (sequence length); i.e. batch_size and num_steps are set to 1. Then we can output the next words one by one, instead of predicting a sequence of words from a sequence of words.
input_data_test = tf.placeholder(tf.int32, [1, 1])
targets_test = tf.placeholder(tf.int32, [1, 1])
...
network_test, lstm1_test, lstm2_test = inference(input_data_test,
is_training=False, num_steps=1, reuse=True)
...
cost_test = loss_fn(network_test.outputs, targets_test, 1, 1)
...
print("Evaluation")
# Testing
# go through the test set step by step, it will take a while.
start_time = time.time()
costs = 0.0; iters = 0
# reset all states at the beginning
state1 = tl.layers.initialize_rnn_state(lstm1_test.initial_state)
state2 = tl.layers.initialize_rnn_state(lstm2_test.initial_state)
for step, (x, y) in enumerate(tl.iterate.ptb_iterator(test_data,
batch_size=1, num_steps=1)):
feed_dict = {input_data_test: x, targets_test: y,
lstm1_test.initial_state: state1,
lstm2_test.initial_state: state2,
}
_cost, state1, state2 = sess.run([cost_test,
lstm1_test.final_state,
lstm2_test.final_state],
feed_dict=feed_dict
    )
    costs += _cost; iters += 1
print("Test Perplexity: %.3f" % np.exp(costs / iters))
What Next?
Now you have understood synced sequence input and output. Let's think about many-to-one (sequence input and one output), so that the LSTM is able to predict the next word "English" from "I am from London, I speak ..".
Please read and understand the code of tutorial_generate_text.py. It shows you how to restore a pre-trained
Embedding matrix and how to learn text generation from a given context.
Karpathy’s blog : “(3) Sequence input (e.g. sentiment analysis where a given sentence is classified as expressing
positive or negative sentiment). “
On the Examples page, we provide many examples, including Seq2seq, different types of Adversarial Learning, Reinforcement Learning, and more.
For more information on what you can do with TensorLayer, just continue reading through readthedocs. Finally, the reference section lists and explains the following modules:
layers (tensorlayer.layers),
activation (tensorlayer.activation),
natural language processing (tensorlayer.nlp),
reinforcement learning (tensorlayer.rein),
cost expressions and regularizers (tensorlayer.cost),
load and save files (tensorlayer.files),
helper functions (tensorlayer.utils),
visualization (tensorlayer.visualize),
iteration functions (tensorlayer.iterate),
preprocessing functions (tensorlayer.prepro),
command line interface (tensorlayer.cli).
1.3 Examples
1.3.1 Basics
• Multi-layer perceptron (MNIST). Classification with dropout using iterator, see method1 (use placeholder) and
method2 (use reuse).
• Denoising Autoencoder (MNIST). Classification task, see tutorial_mnist_autoencoder_cnn.py.
• Stacked Denoising Autoencoder and Fine-Tuning (MNIST). An MLP classification task, see tutorial_mnist_autoencoder_cnn.py.
• Convolutional Network (MNIST). Classification task, see tutorial_mnist_autoencoder_cnn.py.
• Convolutional Network (CIFAR-10). Classification task, see tutorial_cifar10.py and tutorial_cifar10_tfrecord.py.
• TensorFlow dataset API for object detection, see here.
• Merge TF-Slim into TensorLayer. tutorial_inceptionV3_tfslim.py.
• Merge Keras into TensorLayer. tutorial_keras.py.
• Data augmentation with TFRecord. Effective way to load and pre-process data, see tutorial_tfrecord*.py and
tutorial_cifar10_tfrecord.py.
• Data augmentation with Dataset API. Effective way to load and pre-process data, see tutorial_cifar10_datasetapi.py.
• Data augmentation with TensorLayer. See tutorial_image_preprocess.py (for quick test only).
• Float 16 half-precision model, see tutorial_mnist_float16.py.
• Transparent distributed training. mnist by luomai.
1.3.2 Vision
• Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization, see examples.
• ArcFace: Additive Angular Margin Loss for Deep Face Recognition, see InsightFace.
• BinaryNet. Model compression, see mnist cifar10.
• Ternary Weight Network. Model compression, see mnist cifar10.
• DoReFa-Net. Model compression, see mnist cifar10.
• QuanCNN. Model compression, see mnist cifar10.
• Wide ResNet (CIFAR) by ritchieng.
• Spatial Transformer Networks by zsdonghao.
• U-Net for brain tumor segmentation by zsdonghao.
• Variational Autoencoder (VAE) for (CelebA) by yzwxx.
• Variational Autoencoder (VAE) for (MNIST) by BUPTLdy.
• Image Captioning - Reimplementation of Google’s im2txt by zsdonghao.
• DCGAN (CelebA). Generating images by Deep Convolutional Generative Adversarial Networks by zsdonghao.
• Generative Adversarial Text to Image Synthesis by zsdonghao.
• Unsupervised Image to Image Translation with Generative Adversarial Networks by zsdonghao.
• Recurrent Neural Network (LSTM). Apply multiple LSTMs to the PTB dataset for language modeling, see tutorial_ptb_lstm_state_is_tuple.py.
• Word Embedding (Word2vec). Train a word embedding matrix, see tutorial_word2vec_basic.py.
• Restore Embedding matrix. Restore a pre-train embedding matrix, see tutorial_generate_text.py.
• Text Generation. Generates new text scripts, using LSTM network, see tutorial_generate_text.py.
• Chinese Text Anti-Spam by pakrchen.
• Chatbot in 200 lines of code for Seq2Seq.
• FastText Sentence Classification (IMDB), see tutorial_imdb_fasttext.py by tomtung.
1.3.7 Miscellaneous
1.4 Contributing
TensorLayer is a major ongoing research project at the Data Science Institute, Imperial College London. The goal of the project is to develop a compositional language in which complex learning systems can be built through the composition of neural network modules.
Numerous contributors come from various backgrounds, such as Tsinghua University, Carnegie Mellon University, University of Technology of Compiègne, Google, Microsoft and Bloomberg.
There are many features awaiting contribution, such as Maxout, Neural Turing Machine, Attention and TensorLayer Mobile.
You can easily open a Pull Request (PR) on GitHub; every little step counts and will be credited. As an open-source project, we highly welcome and value contributions!
If you are interested in working with us, please contact us at: tensorlayer@gmail.com.
The TensorLayer project was started by Hao Dong at Imperial College London in June 2016.
It is actively developed and maintained by the following people (in alphabetical order):
• Akara Supratak (@akaraspt) - https://akaraspt.github.io
• Fangde Liu (@fangde) - http://fangde.github.io/
• Guo Li (@lgarithm) - https://lgarithm.github.io
• Hao Dong (@zsdonghao) - https://zsdonghao.github.io
• Jonathan Dekhtiar (@DEKHTIARJonathan) - https://www.jonathandekhtiar.eu
• Luo Mai (@luomai) - http://www.doc.ic.ac.uk/~lm111/
• Simiao Yu (@nebulaV) - https://nebulav.github.io
Numerous other contributors can be found in the Github Contribution Graph.
If you have a new method or example related to deep learning or reinforcement learning, you are welcome to contribute.
• Provide your layer or example, so everyone can use it.
• Explain how it would work, and link to a scientific paper if applicable.
• Keep the scope as narrow as possible, to make it easier to implement.
Report bugs
Report bugs at the GitHub, we normally will fix it in 5 hours. If you are reporting a bug, please include:
• your TensorLayer, TensorFlow and Python version.
• steps to reproduce the bug, ideally reduced to a few Python commands.
• the results you obtain, and the results you expected instead.
If you are unsure whether the behavior you experience is a bug, or if you are unsure whether it is related to TensorLayer
or TensorFlow, please just ask on our mailing list first.
Fix bugs
Look through the GitHub issues for bug reports. Anything tagged with "bug" is open to whoever wants to implement it. If you discover a bug in TensorLayer that you can fix yourself, by all means feel free to just implement the fix without reporting it first.
Write documentation
Whenever you find something not explained well, misleading, glossed over or just wrong, please update it! The Edit
on GitHub link on the top right of every documentation page and the [source] link for every documented entity in the
API reference will help you to quickly locate the origin of any text.
Edit on GitHub
As a very easy way of just fixing issues in the documentation, use the Edit on GitHub link on the top right of a
documentation page or the [source] link of an entity in the API reference to open the corresponding source file in
GitHub, then click the Edit this file link to edit the file in your browser and send us a Pull Request. All you need for
this is a free GitHub account.
For any more substantial changes, please follow the steps below to setup TensorLayer for development.
Documentation
The documentation is generated with Sphinx. To build it locally, run the following commands:
cd docs
make html
If you want to re-generate the whole docs, run the following commands:
cd docs
make clean
make html
Testing
TensorLayer has a code coverage of 100%, which has proven very helpful in the past, but also creates some duties:
• Whenever you change any code, you should test whether it breaks existing features by just running the test
scripts.
• Every bug you fix indicates a missing test case, so a proposed bug fix should come with a new test that fails
without your fix.
When you’re satisfied with your addition, the tests pass and the documentation looks good without any markup errors,
commit your changes to a new branch, push that branch to your fork and send us a Pull Request via GitHub’s web
interface.
All these steps are nicely explained on GitHub: https://guides.github.com/introduction/flow/
When filing your Pull Request, please include a description of what it does to help us review it. If it is fixing an open issue, say issue #123, add Fixes #123, Resolves #123 or Closes #123 to the description text, so GitHub will close it when your request is merged.
1.5 Get Involved in Research
Data science is therefore by nature at the core of all modern transdisciplinary scientific activities, as it involves the whole life cycle of data, from acquisition and exploration to analysis and communication of the results. Data science is not only concerned with the tools and methods to obtain, manage and analyse data: it is also about extracting value from data and translating it from asset to product.
Launched on 1st April 2014, the Data Science Institute at Imperial College London aims to enhance Imperial's excellence in data-driven research across its faculties by fulfilling the following objectives:
• To act as a focal point for coordinating data science research at Imperial College by facilitating access to funding, engaging with global partners, and stimulating cross-disciplinary collaboration.
• To develop data management and analysis technologies and services for supporting data-driven research in the College.
• To promote the training and education of the new generation of data scientists by developing and coordinating new degree courses, and conducting public outreach programmes on data science.
• To advise the College on data strategy and policy by providing world-class data science expertise.
• To enable the translation of data science innovation through close collaboration with industry and supporting commercialization.
The Data Science Institute is housed in purpose-built facilities in the heart of the Imperial College campus in South Kensington. Such a central location provides excellent access to collaborators across the College and across London.
If you are interested in working with us, please check our vacancies and other ways to get involved, or feel free to contact us.
1.6 FAQ
No matter what stage you are at, we recommend spending just 10 minutes reading the source code of TensorLayer and the Understand Layer / Your Layer pages on this website; you will find that the abstract methods are very simple. Reading the source code helps you better understand TensorFlow and allows you to implement your own methods easily. For discussion, we recommend Gitter, Help Wanted Issues, the QQ group and the WeChat group.
Beginner
For people who are new to deep learning, the contributors have provided a number of tutorials on this website; these tutorials will guide you through autoencoders, convolutional neural networks, recurrent neural networks, word embedding, deep reinforcement learning, and more. If you already understand the basics of deep learning, we recommend skipping the tutorials, reading the example code on GitHub, and then implementing an example from scratch.
Engineer
For people from industry, the contributors have provided many format-consistent examples covering computer vision, natural language processing and reinforcement learning. Besides, many TensorFlow users have already implemented product-level examples, including image captioning, semantic/instance segmentation, machine translation and chatbots, which can be found online. It is worth noting that Tf-Slim, a wrapper especially for computer vision, can be connected with TensorLayer seamlessly. Therefore, you may be able to find examples that can be used in your project.
Researcher
For people from academia, TensorLayer was originally developed by PhD students who were facing issues with other libraries when implementing novel algorithms. Installing TensorLayer in editable mode is recommended, so you can extend your methods within TensorLayer. For research related to images, such as image captioning and visual QA, you may find it very helpful to use the existing Tf-Slim pre-trained models with TensorLayer (a special layer for connecting Tf-Slim is provided).
You may need to get the list of variables you want to update; TensorLayer provides two ways to get this list. The first way is to use the all_params attribute of a network; by default, it stores the variables in order. You can print the variable information via tl.layers.print_all_variables(train_only=True) or network.print_params(details=False). To choose which variables to update, you can do as follows:
train_params = network.all_params[3:]
The second way is to get the variables by a given name. For example, if you want to get all variables whose layer name contains dense, you can do as follows:
train_params = tl.layers.get_variables_with_name('dense', train_only=True, printable=True)
After you get the variable list, you can define your optimizer as follows, so as to update only part of the variables:
train_op = tf.train.AdamOptimizer(0.001).minimize(cost, var_list=train_params)
1.6.3 Logging
TensorLayer adopts the Python logging module to log running information. The logging module prints logs to the console by default. If you want to configure the logging module, please follow its manual.
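For example, the standard Python logging configuration (not a TensorLayer-specific API) can redirect the logs to a file:
import logging
# write logs to a file instead of the console, keeping INFO and above
logging.basicConfig(filename='tensorlayer.log', level=logging.INFO)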
1.6.4 Visualization
If you run a script via SSH, you may sometimes encounter the following error.
_tkinter.TclError: no display name and no $DISPLAY environment variable
If this happens, run sudo apt-get install python3-tk, or import matplotlib and call matplotlib.use('Agg') before import tensorlayer as tl. Alternatively, add the following code to the top of visualize.py or to your own code.
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
To use all the new features of TensorLayer, you need to install the master version from GitHub. Before that, make sure you have git installed.
[stable version] pip install tensorlayer
[master version] pip install git+https://github.com/tensorlayer/tensorlayer.git
pip install -e .
Note that tl.files.load_npz() can only load npz models saved by tl.files.save_npz(). If you have a model from elsewhere that you want to load into your TensorLayer network, you can first arrange your parameters into a list in order, then use tl.files.assign_params() to load the parameters into your TensorLayer model.
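A sketch, assuming W1, b1, W2, b2 are numpy arrays holding your parameters in layer order:
params = [W1, b1, W2, b2]
tl.files.assign_params(sess, params, network)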
API Reference
If you are looking for information on a specific function, class or method, this part of the documentation is for you.
To keep TensorLayer simple, we minimize the number of activation functions as much as we can, so we encourage you to use TensorFlow's functions. TensorFlow provides tf.nn.relu, tf.nn.relu6, tf.nn.elu, tf.nn.softplus, tf.nn.softsign and so on. More TensorFlow official activation functions can be found here. For parametric activations, please read the layer APIs.
The shortcut of tensorlayer.activation is tensorlayer.act.
Customizing activation functions in TensorLayer is very easy. The following example implements an activation that multiplies its input by 2. For more complex activations, the TensorFlow API will be required.
def double_activation(x):
return x * 2
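Such a function can then be passed to a layer's act argument like any built-in activation (a sketch):
network = tl.layers.DenseLayer(network, n_units=100, act=double_activation, name='double')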
leaky_relu(x[, alpha, name])             leaky_relu() can be used through its shortcut: tl.act.lrelu().
leaky_relu6(x[, alpha, name])            leaky_relu6() can be used through its shortcut: tl.act.lrelu6().
leaky_twice_relu6(x[, alpha_low, ...])   leaky_twice_relu6() can be used through its shortcut: tl.act.ltrelu6().
ramp(x[, v_min, v_max, name])            Ramp activation function.
swish(x[, name])                         Swish function.
2.1.2 Ramp
tensorlayer.activation.ramp(x, v_min=0, v_max=1, name=None)
Ramp activation function: the output is clipped between v_min and v_max.
2.1.3 Leaky ReLU
tensorlayer.activation.leaky_relu(x, alpha=0.2, name='leaky_relu')
This function is a modified version of ReLU, introducing a nonzero gradient for negative input. Introduced by the paper: Rectifier Nonlinearities Improve Neural Network Acoustic Models [A. L. Maas et al., 2013]
The function returns the following results:
• When x < 0: f(x) = alpha * x.
• When x >= 0: f(x) = x.
Parameters
• x (Tensor) – Support input type float, double, int32, int64, uint8, int16, or
int8.
• alpha (float) – Slope.
• name (str) – The function name (optional).
Examples
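A sketch using the tl.act.lrelu shortcut inside a layer definition:
>>> net = tl.layers.DenseLayer(net, n_units=100, act=lambda x: tl.act.lrelu(x, 0.2), name='dense')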
References
• Rectifier Nonlinearities Improve Neural Network Acoustic Models [A. L. Maas et al., 2013]
2.1.4 Leaky ReLU6
tensorlayer.activation.leaky_relu6(x, alpha=0.2, name='leaky_relu6')
This activation combines leaky ReLU with ReLU6, keeping a small slope for negative inputs while capping the output at 6.
Parameters
• x (Tensor) – Support input type float, double, int32, int64, uint8, int16, or
int8.
• alpha (float) – Slope.
• name (str) – The function name (optional).
Examples
References
• Rectifier Nonlinearities Improve Neural Network Acoustic Models [A. L. Maas et al., 2013]
• Convolutional Deep Belief Networks on CIFAR-10 [A. Krizhevsky, 2010]
2.1.5 Twice Leaky ReLU6
tensorlayer.activation.leaky_twice_relu6(x, alpha_low, alpha_high, name)
Parameters
• x (Tensor) – Support input type float, double, int32, int64, uint8, int16, or
int8.
• alpha_low (float) – Slope for x < 0: f(x) = alpha_low * x.
• alpha_high (float) – Slope for x > 6: f(x) = 6 + alpha_high * (x - 6).
• name (str) – The function name (optional).
Examples
References
• Rectifier Nonlinearities Improve Neural Network Acoustic Models [A. L. Maas et al., 2013]
• Convolutional Deep Belief Networks on CIFAR-10 [A. Krizhevsky, 2010]
2.1.6 Swish
tensorlayer.activation.swish(x, name=’swish’)
Swish function.
See Swish: a Self-Gated Activation Function.
Parameters
• x (Tensor) – input.
• name (str) – function name (optional).
Returns A Tensor in the same type as x.
Return type Tensor
2.1.7 Sign
tensorlayer.activation.sign(x)
Sign function.
Clip and binarize the tensor using the straight-through estimator (STE) for the gradient; usually used for quantizing values in Binarized Neural Networks: https://arxiv.org/abs/1602.02830.
Parameters x (Tensor) – input.
Examples
References
• Rectifier Nonlinearities Improve Neural Network Acoustic Models, Maas et al. (2013) http://web.stanford.edu/~awni/papers/relu_hybrid_icml2013_final.pdf
• BinaryNet: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1, Courbariaux et al. (2016) https://arxiv.org/abs/1602.02830
tensorlayer.activation.hard_tanh(x, name=’htanh’)
Hard tanh activation function.
This is a ramp function with a lower bound of -1 and an upper bound of 1; the shortcut is htanh.
Parameters
• x (Tensor) – input.
• name (str) – The function name (optional).
Returns A Tensor in the same type as x.
Return type Tensor
tensorlayer.activation.pixel_wise_softmax(x, name=’pixel_wise_softmax’)
Return the softmax outputs of images; every pixel has multiple labels, and the values for a pixel sum to 1.
Examples
References
• tf.reverse
See tensorlayer.layers.
alphas(shape, alpha_value[, name]) Creates a tensor with all elements set to alpha_value.
alphas_like(tensor, alpha_value[, name, . . . ]) Creates a tensor with all elements set to alpha_value.
tl.alphas
Examples
>>> tl.alphas([2, 3], tf.int32) # [[alpha, alpha, alpha], [alpha, alpha, alpha]]
tl.alphas_like
Examples
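A sketch mirroring the tl.alphas example above:
>>> t = tf.constant([[1, 2, 3], [4, 5, 6]])
>>> tl.alphas_like(t, 0.5)  # [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]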
To keep TensorLayer simple, we minimize the number of cost functions as much as we can, so we encourage you to use TensorFlow's functions. For example, you can implement L1, L2 and sum regularization with tf.nn.l2_loss, tf.contrib.layers.l1_regularizer, tf.contrib.layers.l2_regularizer and tf.contrib.layers.sum_regularizer; see the TensorFlow API.
TensorLayer provides a simple way to create your own cost function. Take the MLP below for example: the network parameters will be [W1, b1, W2, b2, W_out, b_out], and you can apply L2 regularization to the weight matrices of the first two layers as follows.
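A sketch (the 0.001 regularization scale is illustrative):
cost = tl.cost.cross_entropy(y, y_, name='cost')
cost = cost + tf.contrib.layers.l2_regularizer(0.001)(network.all_params[0]) + \
              tf.contrib.layers.l2_regularizer(0.001)(network.all_params[2])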
Besides, TensorLayer provides an easy way to get all variables by a given name, so you can also apply L2 regularization to selected weights as follows.
l2 = 0
for w in tl.layers.get_variables_with_name('W_conv2d', train_only=True, printable=False):
    l2 += tf.contrib.layers.l2_regularizer(1e-4)(w)
cost = tl.cost.cross_entropy(y, y_) + l2
Regularization of Weights
After initializing the variables, the information about the network parameters can be observed using network.print_params().
tl.layers.initialize_global_variables(sess)
network.print_params()
The output of the network is network.outputs, so the cross entropy can be defined as follows. Besides, to regularize the weights, network.all_params contains all parameters of the network. In this case, network.all_params = [W1, b1, W2, b2, Wout, bout], according to params 0 to 5 shown by network.print_params(). Max-norm regularization on W1 and W2 can then be performed as follows.
max_norm = 0
for w in tl.layers.get_variables_with_name('W', train_only=True, printable=False):
    max_norm += tl.cost.maxnorm_regularizer(1)(w)
cost = tl.cost.cross_entropy(y, y_) + max_norm
In addition, all of TensorFlow's regularizers, like tf.contrib.layers.l2_regularizer, can be used with TensorLayer.
The instance method network.print_layers() prints the outputs of all layers in order. To regularize activation outputs, you can use network.all_layers, which contains the outputs of all layers. To apply an L1 penalty on the activations of the first hidden layer, simply add tf.contrib.layers.l1_regularizer(lambda_l1)(network.all_layers[1]) to the cost function.
network.print_layers()
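For instance, a minimal sketch of the activation penalty just described (lambda_l1 is an illustrative coefficient; network, y and y_ come from the MLP above):

>>> lambda_l1 = 1e-4
>>> act_penalty = tf.contrib.layers.l1_regularizer(lambda_l1)(network.all_layers[1])
>>> cost = tl.cost.cross_entropy(y, y_) + act_penalty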
Examples
References
Parameters
• output (Tensor) – Tensor with type of float32 or float64.
• target (Tensor) – The target distribution, in the same format as output.
• epsilon (float) – A small value to avoid the output being zero.
• name (str) – An optional name to attach to this function.
References
• ericjang-DRAW
References
tensorlayer.cost.normalized_mean_square_error(output, target,
name=’normalized_mean_squared_error_loss’)
Return the TensorFlow expression of normalized mean-square-error of two distributions.
Parameters
• output (Tensor) – 2D, 3D or 4D tensor i.e. [batch_size, n_feature], [batch_size, height,
width] or [batch_size, height, width, channel].
• target (Tensor) – The target distribution, in the same format as output.
• name (str) – An optional name to attach to this function.
Examples
References
• Wiki-Dice
References
• Wiki-Dice
Notes
• IoU cannot be used as a training loss; people usually use the dice coefficient for training, and IoU and hard-dice for evaluation.
Examples
Examples
>>> batch_size = 64
>>> vocab_size = 10000
>>> embedding_size = 256
>>> input_seqs = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name="input")
tensorlayer.cost.cosine_similarity(v1, v2)
Cosine similarity [-1, 1].
Parameters v1, v2 (Tensor) – Tensors with the same shape [batch_size, n_feature].
References
• Wiki.
Maxnorm
tensorlayer.cost.maxnorm_regularizer(scale=1.0)
Max-norm regularization returns a function that can be used to apply max-norm regularization to weights.
More about max-norm, see wiki-max norm. The implementation follows TensorFlow contrib.
Parameters scale (float) – A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns
Return type A function with signature mn(weights, name=None) that applies max-norm regularization.
Raises ValueError : If scale is outside of the range [0.0, 1.0] or if scale is not a float.
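A usage sketch following the max-norm example earlier in this section (cost is assumed to be an existing loss tensor):

>>> mn = tl.cost.maxnorm_regularizer(scale=1.0)
>>> for w in tl.layers.get_variables_with_name('W', train_only=True, printable=False):
...     cost += mn(w)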
Special
tensorlayer.cost.li_regularizer(scale, scope=None)
Li regularization removes the neurons of the previous layer. The i represents inputs. Returns a function that can be used to apply group li regularization to weights. The implementation follows TensorFlow contrib.
Parameters
• scale (float) – A scalar multiplier Tensor. 0.0 disables the regularizer.
• scope (str) – An optional scope name for this function.
Returns
Return type A function with signature li(weights, name=None) that applies li regularization.
Raises ValueError : if scale is outside of the range [0.0, 1.0] or if scale is not a float.
tensorlayer.cost.lo_regularizer(scale)
Lo regularization removes the neurons of the current layer. The o represents outputs. Returns a function that can be used to apply group lo regularization to weights. The implementation follows TensorFlow contrib.
Parameters scale (float) – A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns
Return type A function with signature lo(weights, name=None) that applies lo regularization.
Raises ValueError : If scale is outside of the range [0.0, 1.0] or if scale is not a float.
tensorlayer.cost.maxnorm_o_regularizer(scale)
Max-norm output regularization removes the neurons of the current layer. Returns a function that can be used to apply max-norm regularization to each column of the weight matrix. The implementation follows TensorFlow contrib.
Parameters scale (float) – A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns
Return type A function with signature mn_o(weights, name=None) that applies max-norm output regularization.
Raises ValueError : If scale is outside of the range [0.0, 1.0] or if scale is not a float.
tensorlayer.cost.maxnorm_i_regularizer(scale)
Max-norm input regularization removes the neurons of the previous layer. Returns a function that can be used to apply max-norm regularization to each row of the weight matrix. The implementation follows TensorFlow contrib.
Parameters scale (float) – A scalar multiplier Tensor. 0.0 disables the regularizer.
Returns
Return type A function with signature mn_i(weights, name=None) that applies max-norm input regularization.
Raises ValueError : If scale is outside of the range [0.0, 1.0] or if scale is not a float.
Image augmentation is a critical step in deep learning. Though TensorFlow provides tf.image, image augmentation often remains a key bottleneck. tf.image has three limitations:
• Real-world visual tasks such as object detection, segmentation, and pose estimation must cope with image meta-data (e.g., coordinates). These data are beyond tf.image, which processes images as tensors.
• tf.image operators break the pure Python programming experience (i.e., users have to use tf.py_func in order to call image functions written in Python); however, frequent use of tf.py_func slows down TensorFlow, making it hard for users to balance flexibility and performance.
• The tf.image API is inflexible. Image operations are performed in a fixed order, which makes them hard to optimize jointly. More importantly, sequential image operations can significantly reduce the quality of images, thus affecting training accuracy.
TensorLayer addresses these limitations by providing a high-performance image augmentation API in Python. This API is based on affine transformation and cv2.warpAffine. It allows you to combine multiple image processing functions into a single matrix operation. This combined operation is executed by the fast cv2 library, offering a 78x performance improvement (observed in openpose-plus, for example). The following example illustrates the rationale behind this tremendous speed-up.
Example
The source code of the complete examples can be found here. The following is a typical Python program that applies rotation, shifting, flipping, zooming and shearing to an image one operation at a time:
image = tl.vis.read_image('tiger.jpeg')
# ... apply rotation, shifting, flipping, zooming and shearing sequentially (omitted) ...
tl.vis.save_image(xx, '_result_slow.png')
However, by leveraging affine transformation, image operations can be combined into one:
# 2. Combine matrices
# NOTE: operations are applied in a reversed order (i.e., rotation is performed first)
M_combined = M_shift.dot(M_zoom).dot(M_shear).dot(M_flip).dot(M_rotate)
# 3. Convert the matrix from Cartesian coordinates (the origin in the middle of image)
# to image coordinates (the origin on the top-left of image)
transform_matrix = tl.prepro.transform_matrix_offset_center(M_combined, x=w, y=h)
tl.vis.save_image(result, '_result_fast.png')
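Putting the pieces together, here is a complete sketch of the combined pipeline. affine_rotation_matrix, affine_horizontal_flip_matrix, affine_zoom_matrix, transform_matrix_offset_center and affine_transform_cv2 are documented below; affine_shift_matrix and affine_shear_matrix (and their parameter names) are assumptions here, so check tl.prepro for the exact signatures:

>>> import tensorlayer as tl
>>> image = tl.vis.read_image('tiger.jpeg')
>>> h, w, _ = image.shape
>>> # 1. Create the individual affine transform matrices
>>> M_rotate = tl.prepro.affine_rotation_matrix(angle=20)
>>> M_flip = tl.prepro.affine_horizontal_flip_matrix(prob=1)
>>> M_shift = tl.prepro.affine_shift_matrix(wrg=0.1, hrg=0, h=h, w=w)  # assumed signature
>>> M_shear = tl.prepro.affine_shear_matrix(x_shear=0.2, y_shear=0)    # assumed signature
>>> M_zoom = tl.prepro.affine_zoom_matrix(zoom_range=0.8)
>>> # 2. Combine them (applied in reversed order) and offset to image coordinates
>>> M_combined = M_shift.dot(M_zoom).dot(M_shear).dot(M_flip).dot(M_rotate)
>>> transform_matrix = tl.prepro.transform_matrix_offset_center(M_combined, x=w, y=h)
>>> # 3. Transform the image once for all operations
>>> result = tl.prepro.affine_transform_cv2(image, transform_matrix)
>>> tl.vis.save_image(result, '_result_fast.png')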
The following figure illustrates the rationale behind combined affine transformation.
Using combined affine transformation has two key benefits. First, it allows you to leverage a pure Python API to achieve orders of magnitude of speed-up in image augmentation, and thus prevents data pre-processing from becoming a bottleneck in training. Second, performing sequential image transformations requires multiple image interpolations, which produces low-quality input images. In contrast, a combined transformation performs the interpolation only once, and thus preserves the content of the image. The following figure illustrates these two benefits:
The major reason combined affine transformation is fast is its lower computational complexity. Assume we have k affine transformations T1, ..., Tk, where each Ti can be represented by a 3x3 matrix. The sequential transformation can be represented as y = Tk (... T1(x)), and its time complexity is O(k N), where N is the cost of applying one transformation to image x; N is linear in the size of x. For the combined transformation y = (Tk ... T1)(x), the time complexity is O(27(k - 1) + N) = max{O(27k), O(N)} = O(N) (assuming 27k << N), where 27 = 3^3 is the cost of combining two transformations.
tensorlayer.prepro.affine_rotation_matrix(angle=(-20, 20))
Create an affine transform matrix for image rotation. NOTE: In OpenCV, x is width and y is height.
Parameters angle (int/float or tuple of two int/float) –
Degree to rotate, usually -180 ~ 180.
• int/float, a fixed angle.
• tuple of 2 floats/ints, randomly sample a value as the angle between these 2 values.
Returns An affine transform matrix.
Return type numpy.array
tensorlayer.prepro.affine_horizontal_flip_matrix(prob=0.5)
Create an affine transformation matrix for image horizontal flipping. NOTE: In OpenCV, x is width and y is
height.
Parameters prob (float) – Probability to flip the image. 1.0 means always flip.
Returns An affine transform matrix.
Return type numpy.array
tensorlayer.prepro.affine_vertical_flip_matrix(prob=0.5)
Create an affine transformation for image vertical flipping. NOTE: In OpenCV, x is width and y is height.
Parameters prob (float) – Probability to flip the image. 1.0 means always flip.
Returns An affine transform matrix.
Return type numpy.array
tensorlayer.prepro.affine_zoom_matrix(zoom_range=(0.8, 1.1))
Create an affine transform matrix for zooming/scaling an image’s height and width. OpenCV format, x is width.
Parameters zoom_range (float or tuple of 2 floats) –
The zooming/scaling ratio; greater than 1 means larger.
– float, a fixed ratio.
– tuple of 2 floats, randomly sample a value as the ratio between these 2 values.
Returns An affine transform matrix.
Return type numpy.array
tensorlayer.prepro.affine_respective_zoom_matrix(w_range=0.8, h_range=1.1)
Get an affine transform matrix for zooming/scaling where height and width are changed independently. OpenCV format, x is width.
Parameters
• w_range (float or tuple of 2 floats) –
The zooming/scaling ratio of width, greater than 1 means larger.
– float, a fixed ratio.
– tuple of 2 floats, randomly sample a value as the ratio between 2 values.
• h_range (float or tuple of 2 floats) –
The zooming/scaling ratio of height, greater than 1 means larger.
– float, a fixed ratio.
– tuple of 2 floats, randomly sample a value as the ratio between 2 values.
tensorlayer.prepro.transform_matrix_offset_center(matrix, y, x)
Convert the matrix from Cartesian coordinates (the origin in the middle of image) to Image coordinates (the
origin on the top-left of image).
Parameters
• matrix (numpy.array) – Transform matrix.
• x and y (int) – Size of the image.
Returns The transform matrix.
Return type numpy.array
Examples
tensorlayer.prepro.affine_transform_keypoints(coords_list, transform_matrix)
Transform keypoint coordinates according to a given affine transform matrix. OpenCV format, x is width.
Note that for pose estimation tasks, flipping requires maintaining the left and right body information; this function does not swap left and right keypoints, so please use tl.prepro.keypoint_random_flip for flipping.
Parameters
• coords_list (list of list of tuple/list) – The coordinates e.g., the key-
point coordinates of every person in an image.
• transform_matrix (numpy.array) – Transform matrix, OpenCV format.
Examples
>>> # 4. then we can transform the image once for all transformations
>>> result = tl.prepro.affine_transform_cv2(image, transform_matrix)  # 76 times faster
2.4.2 Images
Examples
References
Rotation
Examples
Examples
Crop
Flip
Shift
Shear
References
• Affine transformation
Shear V2
References
• Affine transformation
Swirl
• clip (boolean) – Whether to clip the output to the range of values of the input image.
This is enabled by default, since higher order interpolation may produce values outside the
given input range.
• preserve_range (boolean) – Whether to keep the original range of values. Other-
wise, the input image is converted according to the conventions of img_as_float.
• is_random (boolean) –
If True, random swirl. Default is False.
– random center = [(0 ~ x.shape[0]), (0 ~ x.shape[1])]
– random strength = [0, strength]
– random radius = [1e-10, radius]
– random rotation = [-rotation, rotation]
Returns A processed image.
Return type numpy.array
Examples
Elastic transform
Examples
References
• Github.
• Kaggle
Zoom
Respective Zoom
Brightness
References
• skimage.exposure.adjust_gamma
• chinese blog
Change saturation.
– if is_random=False, one float number; smaller than one means less saturation.
– if is_random=True, a tuple of two float numbers, (min, max).
• is_random (boolean) – If True, randomly change illumination. Default is False.
Returns A processed image.
Return type numpy.array
Examples
Random
Non-random
RGB to HSV
tensorlayer.prepro.rgb_to_hsv(rgb)
Input RGB image [0~255] return HSV image [0~1].
Parameters rgb (numpy.array) – An image with values between 0 and 255.
Returns A processed image.
Return type numpy.array
HSV to RGB
tensorlayer.prepro.hsv_to_rgb(hsv)
Input HSV image [0~1] return RGB image [0~255].
Parameters hsv (numpy.array) – An image with values between 0.0 and 1.0
Returns A processed image.
Return type numpy.array
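A round-trip sketch between the two color spaces:

>>> import numpy as np
>>> import tensorlayer as tl
>>> rgb = np.random.randint(0, 256, size=(100, 100, 3))  # values in [0, 255]
>>> hsv = tl.prepro.rgb_to_hsv(rgb)   # values in [0, 1]
>>> rgb2 = tl.prepro.hsv_to_rgb(hsv)  # back to [0, 255]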
Adjust Hue
• hout (float) –
The scale value for adjusting hue.
– If is_offset is False, set all hue values to this value. 0 is red; 0.33 is green; 0.66 is blue.
– If is_offset is True, add this value as the offset to the hue channel.
• is_offset (boolean) – Whether hout is added on HSV as offset or not. Default is True.
• is_clip (boolean) – If the resulting HSV value is smaller than 0, set it to 0. Default is True.
• is_random (boolean) – If True, randomly change hue. Default is False.
Returns A processed image.
Return type numpy.array
Examples
Random: add a random value between -0.2 and 0.2 as the offset to all hue values.
References
• tf.image.random_hue.
• tf.image.adjust_hue.
• StackOverflow: Changing image hue with python PIL.
Resize
References
• scipy.misc.imresize
Examples
Random
Non-random
Normalization
Examples
Notes
When samplewise_center and samplewise_std_normalization are True:
• For a greyscale image, every pixel is subtracted and divided by the mean and std of the whole image.
• For an RGB image, every pixel is subtracted and divided by the mean and std of that pixel, i.e. the mean and std of each pixel become 0 and 1.
tensorlayer.prepro.featurewise_norm(x, mean=None, std=None, epsilon=1e-07)
Normalize every pixel by the same given mean and std, which are usually computed from all examples.
Parameters
• x (numpy.array) – An image with dimension of [row, col, channel] (default).
• mean (float) – Value for subtraction.
• std (float) – Value for division.
• epsilon (float) – A small positive value for dividing the standard deviation.
Returns A processed image.
Return type numpy.array
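A minimal sketch; X_train is assumed to be an array of all training images, and the same mean/std are reused for every example:

>>> mean, std = X_train.mean(), X_train.std()  # computed once over all examples
>>> x = tl.prepro.featurewise_norm(x, mean=mean, std=std, epsilon=1e-7)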
Channel shift
Noise
tensorlayer.prepro.drop(x, keep=0.5)
Randomly set some pixels to zero by a given keeping probability.
Parameters
• x (numpy.array) – An image with dimension of [row, col, channel] or [row, col].
• keep (float) – The keeping probability in (0, 1); the lower it is, the more values will be set to zero.
Returns A processed image.
Return type numpy.array
References
PIL Image.fromarray
Find contours
• fully_connected (str) – Either low or high. Indicates whether array elements below
the given level value are to be considered fully-connected (and hence elements above the
value will only be face connected), or vice-versa. (See notes below for details.)
• positive_orientation (str) – Either low or high. Indicates whether the output contours will produce positively-oriented polygons around islands of low- or high-valued elements. If low, then contours will wind counter-clockwise around elements below the iso-value. Alternately, this means that low-valued elements are always on the left of the contour.
Returns Each contour is an ndarray of shape (n, 2), consisting of n (row, column) coordinates along
the contour.
Return type list of (n,2)-ndarrays
Points to Image
Binary dilation
tensorlayer.prepro.binary_dilation(x, radius=3)
Return fast binary morphological dilation of an image. see skimage.morphology.binary_dilation.
Parameters
• x (2D array) – A binary image.
• radius (int) – The radius of the mask.
Returns A processed binary image.
Return type numpy.array
Greyscale dilation
tensorlayer.prepro.dilation(x, radius=3)
Return greyscale morphological dilation of an image, see skimage.morphology.dilation.
Parameters
• x (2D array) – A greyscale image.
• radius (int) – The radius of the mask.
Returns A processed greyscale image.
Return type numpy.array
Binary erosion
tensorlayer.prepro.binary_erosion(x, radius=3)
Return binary morphological erosion of an image, see skimage.morphology.binary_erosion.
Parameters
• x (2D array) – A binary image.
• radius (int) – The radius of the mask.
Returns A processed binary image.
Return type numpy.array
Greyscale erosion
tensorlayer.prepro.erosion(x, radius=3)
Return greyscale morphological erosion of an image, see skimage.morphology.erosion.
Parameters
• x (2D array) – A greyscale image.
• radius (int) – The radius of the mask.
Returns A processed greyscale image.
Return type numpy.array
import tensorlayer as tl

# resize
im_resize, coords = tl.prepro.obj_box_imresize(image,
    coords=ann_list[idx][1], size=[300, 200], is_rescale=True)
tl.vis.draw_boxes_and_labels_to_image(im_resize, ann_list[idx][0],
    coords, [], classes, True, save_name='_im_resize.png')

# crop
im_crop, clas, coords = tl.prepro.obj_box_crop(image, ann_list[idx][0],
    ann_list[idx][1], wrg=200, hrg=200,
    is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_crop, clas, coords, [],
    classes, True, save_name='_im_crop.png')

# shift
im_shift, clas, coords = tl.prepro.obj_box_shift(image, ann_list[idx][0],
    ann_list[idx][1], wrg=0.1, hrg=0.1,
    is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_shift, clas, coords, [],
    classes, True, save_name='_im_shift.png')

# zoom
im_zoom, clas, coords = tl.prepro.obj_box_zoom(image, ann_list[idx][0],
    ann_list[idx][1], zoom_range=(1.3, 0.7),
    is_rescale=True, is_center=True, is_random=False)
tl.vis.draw_boxes_and_labels_to_image(im_zoom, clas, coords, [],
    classes, True, save_name='_im_zoom.png')
In practice, you may want to use a threading method to process batches of images as follows.
import tensorlayer as tl
import random

batch_size = 64
im_size = [416, 416]
n_data = len(imgs_file_list)
jitter = 0.2

def _data_pre_aug_fn(data):
    im, ann = data
    clas, coords = ann
    ## change image brightness, contrast and saturation randomly
    im = tl.prepro.illumination(im, gamma=(0.5, 1.5),
        contrast=(0.5, 1.5), saturation=(0.5, 1.5), is_random=True)
    ## flip randomly
    im, coords = tl.prepro.obj_box_horizontal_flip(im, coords,
        is_rescale=True, is_center=True, is_random=True)
    ## randomly resize and crop image, it can have the same effect as random zoom
    tmp0 = random.randint(1, int(im_size[0]*jitter))
    tmp1 = random.randint(1, int(im_size[1]*jitter))
    im, coords = tl.prepro.obj_box_imresize(im, coords,
        [im_size[0]+tmp0, im_size[1]+tmp1], is_rescale=True,
        interp='bicubic')
    im, clas, coords = tl.prepro.obj_box_crop(im, clas, coords,
        wrg=im_size[1], hrg=im_size[0], is_rescale=True,
        is_center=True, is_random=True)
    ## ... (remaining steps omitted, see the complete example) ...

# threading process
data = tl.prepro.threading_data([_ for _ in zip(b_images, b_ann)],
    _data_pre_aug_fn)
b_images2 = [d[0] for d in data]
b_ann = [d[1] for d in data]
tensorlayer.prepro.obj_box_coord_rescale(coord=None, shape=None)
Scale down one coordinate of one image from pixel units to the ratio of the image size, i.e. in the range of [0, 1]. It is the reverse process of obj_box_coord_scale_to_pixelunit.
Parameters
• coord (list of 4 int or None) – One coordinate of one image, e.g. [x, y, w, h].
• shape (list of 2 int or None) – For [height, width].
Returns New bounding box.
Return type list of 4 numbers
Examples
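For example (mirroring the plural version below):

>>> coord = tl.prepro.obj_box_coord_rescale(coord=[30, 40, 50, 50], shape=[100, 100])
>>> print(coord)
[0.3, 0.4, 0.5, 0.5]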
tensorlayer.prepro.obj_box_coords_rescale(coords=None, shape=None)
Scale down a list of coordinates from pixel units to the ratio of the image size, i.e. in the range of [0, 1].
Parameters
• coords (list of list of 4 ints or None) – For coordinates of more than one image, e.g. [[x, y, w, h], [x, y, w, h], ...].
• shape (list of 2 int or None) – For [height, width].
Returns A list of new bounding boxes.
Return type list of list of 4 numbers
Examples
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50], [10, 10, 20, 20]],
˓→shape=[100, 100])
>>> print(coords)
[[0.3, 0.4, 0.5, 0.5], [0.1, 0.1, 0.2, 0.2]]
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50]], shape=[50, 100])
>>> print(coords)
[[0.3, 0.8, 0.5, 1.0]]
>>> coords = obj_box_coords_rescale(coords=[[30, 40, 50, 50]], shape=[100, 200])
>>> print(coords)
[[0.15, 0.4, 0.25, 0.5]]
tensorlayer.prepro.obj_box_coord_scale_to_pixelunit(coord, shape=None)
Convert one coordinate [x, y, w (or x2), h (or y2)] in ratio format to image coordinate format. It is the reverse
process of obj_box_coord_rescale.
Parameters
• coord (list of 4 float) – One coordinate of one image [x, y, w (or x2), h (or y2)] in ratio format, i.e. value range [0, 1].
• shape (tuple of 2 or None) – For [height, width].
Returns New bounding box.
Return type list of 4 numbers
Examples
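For example, reversing the rescale example above (exact rounding behaviour may differ):

>>> coord = tl.prepro.obj_box_coord_scale_to_pixelunit([0.3, 0.4, 0.5, 0.5], shape=[100, 100])
>>> # roughly [30, 40, 50, 50]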
tensorlayer.prepro.obj_box_coord_centroid_to_upleft_butright(coord,
to_int=False)
Convert one coordinate [x_center, y_center, w, h] to [x1, y1, x2, y2] in up-left and bottom-right format.
Parameters
• coord (list of 4 int/float) – One coordinate.
• to_int (boolean) – Whether to convert output as integer.
Returns New bounding box.
Return type list of 4 numbers
Examples
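For example:

>>> coord = tl.prepro.obj_box_coord_centroid_to_upleft_butright([30, 40, 20, 20])
>>> print(coord)
[20, 30, 40, 50]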
tensorlayer.prepro.obj_box_coord_upleft_butright_to_centroid(coord)
Convert one coordinate [x1, y1, x2, y2] to [x_center, y_center, w, h]. It is the reverse process of
obj_box_coord_centroid_to_upleft_butright.
Parameters coord (list of 4 int/float) – One coordinate.
Returns New bounding box.
Return type list of 4 numbers
tensorlayer.prepro.obj_box_coord_centroid_to_upleft(coord)
Convert one coordinate [x_center, y_center, w, h] to [x, y, w, h]. It is the reverse process of
obj_box_coord_upleft_to_centroid.
Parameters coord (list of 4 int/float) – One coordinate.
Returns New bounding box.
Return type list of 4 numbers
tensorlayer.prepro.obj_box_coord_upleft_to_centroid(coord)
Convert one coordinate [x, y, w, h] to [x_center, y_center, w, h]. It is the reverse process of
obj_box_coord_centroid_to_upleft.
Parameters coord (list of 4 int/float) – One coordinate.
Returns New bounding box.
Return type list of 4 numbers
tensorlayer.prepro.parse_darknet_ann_str_to_list(annotations)
Input a string in the format of "class, x, y, w, h", return a list-of-list format.
Parameters annotations (str) – The annotations in darknet format "class, x, y, w, h ...." separated by "\n".
Returns List of bounding box.
Return type list of list of 4 numbers
tensorlayer.prepro.parse_darknet_ann_list_to_cls_box(annotations)
Parse darknet annotation format into two lists for class and bounding box.
Input list of [[class, x, y, w, h], . . . ], return two list of [class . . . ] and [[x, y, w, h], . . . ].
Parameters annotations (list of list) – A list of class and bounding boxes of images
e.g. [[class, x, y, w, h], . . . ]
Returns
• list of int – List of class labels.
• list of list of 4 numbers – List of bounding box.
Examples
>>> print(coords)
[[0.8, 0.4, 0.3, 0.3], [0.9, 0.5, 0.2, 0.3]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[0.2, 0.4, 0.3, 0.3]], is_rescale=True, is_center=False, is_random=False)
>>> print(coords)
[[0.5, 0.4, 0.3, 0.3]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[20, 40, 30, 30]], is_rescale=False, is_center=True, is_random=False)
>>> print(coords)
[[80, 40, 30, 30]]
>>> im, coords = obj_box_left_right_flip(im, coords=[[20, 40, 30, 30]], is_rescale=False, is_center=False, is_random=False)
>>> print(coords)
[[50, 40, 30, 30]]
Examples
>>> print(coords)
[[40, 80, 60, 60], [20, 40, 40, 40]]
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30]], size=[40, 100], is_rescale=False)
>>> print(coords)
[[20, 20, 30, 15]]
>>> _, coords = obj_box_imresize(im, coords=[[20, 40, 30, 30]], size=[60, 150], is_rescale=False)
>>> print(coords)
[[30, 30, 45, 22]]
2.4.4 Keypoints
Returns
Return type preprocessed image, annos, mask
If the image is smaller than min_size, padding is used to make the shape match min_size. The height and width of the image will be changed together; the scale (aspect ratio) is not changed.
Parameters
• image (3 channel image) – The given image for augmentation.
• annos (list of list of floats) – The keypoints annotation of people.
• mask (single channel image or None) – The mask if available.
• min_size (tuple of two int) – The minimum size of height and width.
• zoom_range (tuple of two floats) – The minimum and maximum factor to zoom in or out, e.g. (0.5, 1) means zoom out by 1~2 times.
• pad_val (int/float, or tuple of int or random function) – The
three padding values for RGB channels respectively.
Returns
Return type preprocessed image, annos, mask
2.4.5 Sequence
Padding
Examples
Remove Padding
tensorlayer.prepro.remove_pad_sequences(sequences, pad_id=0)
Remove padding.
Parameters
• sequences (list of list of int) – All sequences where each row is a sequence.
• pad_id (int) – The pad ID.
Returns The processed sequences.
Return type list of list of int
Examples
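For example:

>>> sequences = [[1, 2, 3, 0, 0], [4, 5, 0, 0, 0]]
>>> tl.prepro.remove_pad_sequences(sequences, pad_id=0)
[[1, 2, 3], [4, 5]]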
Process
Examples
Add Start ID
Examples
For Seq2seq
Add End ID
tensorlayer.prepro.sequences_add_end_id(sequences, end_id=888)
Add a special end token (id) at the end of each sequence.
Parameters
• sequences (list of list of int) – All sequences where each row is a sequence.
• end_id (int) – The end ID.
Returns The processed sequences.
Return type list of list of int
Examples
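For example:

>>> sequences = [[1, 2, 3], [4, 5, 6, 7]]
>>> tl.prepro.sequences_add_end_id(sequences, end_id=999)
[[1, 2, 3, 999], [4, 5, 6, 7, 999]]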
Examples
Get Mask
tensorlayer.prepro.sequences_get_mask(sequences, pad_val=0)
Return mask for sequences.
Parameters
• sequences (list of list of int) – All sequences where each row is a sequence.
• pad_val (int) – The pad value.
Returns The mask.
Return type list of list of int
Examples
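For example (only trailing pad values are masked out):

>>> sentences_ids = [[4, 0, 5, 3, 0, 0], [5, 3, 9, 4, 9, 0]]
>>> tl.prepro.sequences_get_mask(sentences_ids, pad_val=0)
[[1 1 1 1 0 0]
 [1 1 1 1 1 0]]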
Trainer
Examples
See tutorial_mnist_distributed_trainer.py.
A collection of helper functions to work with datasets: load benchmark datasets, save and restore models, and save and load variables.
MNIST
Returns X_train, y_train, X_val, y_val, X_test, y_test – Return the split training/validation/test sets respectively.
Return type tuple
Examples
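A minimal loading sketch:

>>> X_train, y_train, X_val, y_val, X_test, y_test = tl.files.load_mnist_dataset(shape=(-1, 784))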
Fashion-MNIST
Examples
CIFAR-10
• path (str) – The path that the data is downloaded to; default is data/cifar10/.
• plotable (boolean) – Whether to plot some image examples; default is False.
Examples
References
• CIFAR website
• Data download link
• https://teratail.com/questions/28932
SVHN
tensorlayer.files.load_cropped_svhn(path=’data’, include_extra=True)
Load Cropped SVHN.
The Cropped Street View House Numbers (SVHN) Dataset contains 32x32x3 RGB images. Digit ‘1’ has label
1, ‘9’ has label 9 and ‘0’ has label 0 (the original dataset uses 10 to represent ‘0’), see ufldl website.
Parameters
• path (str) – The path that the data is downloaded to.
• include_extra (boolean) – If True (default), add extra images to the training set.
Returns X_train, y_train, X_test, y_test – Return the split training/test sets respectively.
Return type tuple
Examples
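A minimal loading sketch:

>>> X_train, y_train, X_test, y_test = tl.files.load_cropped_svhn(path='data', include_extra=False)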
tensorlayer.files.load_ptb_dataset(path=’data’)
Load Penn TreeBank (PTB) dataset.
It is used in many LANGUAGE MODELING papers, including “Empirical Evaluation and Combination of
Advanced Language Modeling Techniques”, “Recurrent Neural Network Regularization”. It consists of 929k
training words, 73k validation words, and 82k test words. It has 10k words in its vocabulary.
Parameters path (str) – The path that the data is downloaded to; default is data/ptb/.
Returns
• train_data, valid_data, test_data (list of int) – The training, validating and testing data in
integer format.
• vocab_size (int) – The vocabulary size.
Examples
References
Notes
• If you want to get the raw data, see the source code.
tensorlayer.files.load_matt_mahoney_text8_dataset(path=’data’)
Load Matt Mahoney’s dataset.
Download a text file from Matt Mahoney’s website if not present, and make sure it’s the right size. Extract the
first file enclosed in a zip file as a list of words. This dataset can be used for Word Embedding.
Parameters path (str) – The path that the data is downloaded to; default is data/mm_test8/.
Returns The raw text data e.g. [. . . . ‘their’, ‘families’, ‘who’, ‘were’, ‘expelled’, ‘from’,
‘jerusalem’, . . . ]
Return type list of str
Examples
IMDB
• skip_top (int) – Top most frequent words to ignore (they will appear as oov_char value
in the sequence data).
• maxlen (int) – Maximum sequence length. Any longer sequence will be truncated.
• seed (int) – Seed for reproducible data shuffling.
• start_char (int) – The start of a sequence will be marked with this character. Set to 1
because 0 is usually the padding character.
• oov_char (int) – Words that were cut out because of the num_words or skip_top limit
will be replaced with this character.
• index_from (int) – Index actual words with this index and higher.
Examples
References
Nietzsche
tensorlayer.files.load_nietzsche_dataset(path=’data’)
Load Nietzsche dataset.
Parameters path (str) – The path that the data is downloaded to; default is data/nietzsche/.
Returns The content.
Return type str
Examples
tensorlayer.files.load_wmt_en_fr_dataset(path=’data’)
Load WMT‘15 English-to-French translation dataset.
It will download the data from the WMT‘15 Website (10^9-French-English corpus), and the 2013 news test
from the same site as development set. Returns the directories of training data and test data.
Parameters path (str) – The path that the data is downloaded to; default is data/wmt_en_fr/.
References
Notes
Flickr25k
Examples
Flickr1M
Returns a list of images by a given tag from the Flickr1M dataset; it will download Flickr1M from the official website the first time you use it.
Parameters
• tag (str or None) –
What images to return.
– If you want to get images with tag, use string like ‘dog’, ‘red’, see Flickr Search.
– If you want to get all images, set to None.
• size (int) – An integer between 1 and 10. 1 means 100k images, ..., 5 means 500k images, 10 means all 1 million images. Default is 10.
• path (str) – The path that the data is downloaded to; default is data/flickr1M/.
• n_threads (int) – The number of threads used to read images.
• printable (boolean) – Whether to print information when reading images; default is False.
Examples
CycleGAN
tensorlayer.files.load_cyclegan_dataset(filename=’summer2winter_yosemite’,
path=’data’)
Load images from CycleGAN’s database, see this link.
Parameters
• filename (str) – The dataset you want, see this link.
• path (str) – The path that the data is downloaded to; default is data/cyclegan.
Examples
CelebA
tensorlayer.files.load_celebA_dataset(path=’data’)
Load the CelebA dataset.
Returns a list of image paths.
Parameters path (str) – The path that the data is downloaded to; default is data/celebA/.
VOC 2007/2012
Examples
>>> idx = 26
>>> print(classes)
['aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']
>>> print(imgs_file_list[idx])
data/VOC/VOC2012/JPEGImages/2007_000423.jpg
>>> print(n_objs_list[idx])
2
>>> print(imgs_ann_file_list[idx])
data/VOC/VOC2012/Annotations/2007_000423.xml
>>> print(objs_info_list[idx])
14 0.173 0.461333333333 0.142 0.496
14 0.828 0.542666666667 0.188 0.594666666667
>>> ann = tl.prepro.parse_darknet_ann_str_to_list(objs_info_list[idx])
>>> print(ann)
[[14, 0.173, 0.461333333333, 0.142, 0.496], [14, 0.828, 0.542666666667, 0.188, 0.594666666667]]
>>> c, b = tl.prepro.parse_darknet_ann_list_to_cls_box(ann)
>>> print(c, b)
[14, 14] [[0.173, 0.461333333333, 0.142, 0.496], [0.828, 0.542666666667, 0.188, 0.594666666667]]
References
MPII
tensorlayer.files.load_mpii_pose_dataset(path=’data’, is_16_pos_only=False)
Load MPII Human Pose Dataset.
Parameters
• path (str) – The path that the data is downloaded to.
• is_16_pos_only (boolean) – If True, only return people containing 16 pose keypoints (usually used for single-person pose estimation).
Returns
• img_train_list (list of str) – The image directories of training data.
• ann_train_list (list of dict) – The annotations of training data.
• img_test_list (list of str) – The image directories of testing data.
• ann_test_list (list of dict) – The annotations of testing data.
Examples
References
Google Drive
tensorlayer.files.download_file_from_google_drive(ID, destination)
Download file from Google Drive.
See tl.files.load_celebA_dataset for example.
Parameters
• ID (str) – The file ID on Google Drive.
• destination (str) – The destination path to save the file.
TensorFlow provides the .ckpt file format to save and restore models, but we suggest using the standard Python file format .npz to save models for the sake of cross-platform compatibility.
Examples
Notes
If you get session issues, you can change value.eval() to value.eval(session=sess).
References
tensorlayer.files.load_npz(path=”, name=’model.npz’)
Load the parameters of a Model saved by tl.files.save_npz().
Parameters
• path (str) – Folder path to .npz file.
• name (str) – The name of the .npz file.
Returns A list of parameters in order.
Return type list of array
Examples
• See tl.files.save_npz
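A save/load round-trip sketch (network and sess are assumed to exist):

>>> tl.files.save_npz(network.all_params, name='model.npz', sess=sess)
>>> params = tl.files.load_npz(name='model.npz')
>>> tl.files.assign_params(sess, params, network)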
References
Examples
• See tl.files.save_npz
References
Examples
• See tl.files.save_npz
tensorlayer.files.load_and_assign_npz_dict(name=’model.npz’, sess=None)
Restore the parameters saved by tl.files.save_npz_dict().
Parameters
• name (str) – The name of the .npz file.
• sess (Session) – TensorFlow Session.
• save_dir (str) – The path / file directory to the ckpt, default is checkpoint.
• var_list (list of tensor) – The parameters / variables (tensor) to be saved. If
empty, save all global variables (default).
• is_latest (boolean) – Whether to load the latest ckpt; if False, load the ckpt with the name given by mode_name.
• printable (boolean) – Whether to print all parameters information.
Examples
tensorlayer.files.save_any_to_npy(save_dict=None, name=’file.npy’)
Save variables to .npy file.
Parameters
• save_dict (dictionary) – The variables to be saved.
• name (str) – File name.
Examples
tensorlayer.files.load_npy_to_any(path=”, name=’file.npy’)
Load .npy file.
Parameters
• path (str) – Path to the file (optional).
• name (str) – File name.
Examples
• see tl.files.save_any_to_npy()
tensorlayer.files.file_exists(filepath)
Check whether a file exists, given the file path.
tensorlayer.files.folder_exists(folderpath)
Check whether a folder exists, given the folder path.
Delete file
tensorlayer.files.del_file(filepath)
Delete a file by given file path.
Delete folder
tensorlayer.files.del_folder(folderpath)
Delete a folder by given folder path.
Read file
tensorlayer.files.read_file(filepath)
Read a file and return a string.
Examples
tensorlayer.files.load_folder_list(path=”)
Return a list of folders in a folder, given a folder path.
Parameters path (str) – A folder path.
tensorlayer.files.exists_or_mkdir(path, verbose=True)
Check a folder by a given name; if it does not exist, create the folder and return False; if the directory exists, return True.
Parameters
• path (str) – A folder path.
• verbose (boolean) – If True (default), prints results.
Returns True if the folder already exists; otherwise, creates the folder and returns False.
Return type boolean
Examples
>>> tl.files.exists_or_mkdir("checkpoints/train")
Download or extract
Examples
>>> tl.files.maybe_download_and_extract(filename='train-images-idx3-ubyte.gz',
...                                     working_directory='data/',
...                                     url_source='http://yann.lecun.com/exdb/mnist/')
>>> tl.files.maybe_download_and_extract(filename='ADEChallengeData2016.zip',
...                                     working_directory='data/',
...                                     url_source='http://sceneparsing.csail.mit.edu/data/',
...                                     extract=True)
2.6.5 Sort
tensorlayer.files.natural_keys(text)
Sort a list of strings with numbers in human order.
Examples
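For example, using natural_keys as a sort key:

>>> l = ['im1.jpg', 'im31.jpg', 'im11.jpg', 'im21.jpg', 'im03.jpg', 'im05.jpg']
>>> l.sort(key=tl.files.natural_keys)
>>> print(l)
['im1.jpg', 'im03.jpg', 'im05.jpg', 'im11.jpg', 'im21.jpg', 'im31.jpg']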
References
• link
tensorlayer.files.npz_to_W_pdf(path=None, regx=’w1pre_[0-9]+\\.(npz)’)
Convert the first weight matrix of .npz file to .pdf by using tl.visualize.W().
Parameters
• path (str) – A folder path to npz files.
• regx (str) – Regx for the file name.
Examples
Convert the first weight matrix of a w1_pre...npz file to w1_pre...pdf.
Data iteration.
Examples
>>> y = np.asarray([0, 1, 2, 3, 4, 5])
>>> for batch in tl.iterate.minibatches(inputs=X, targets=y, batch_size=2, shuffle=False):
>>>     print(batch)
(array([['a', 'a'], ['b', 'b']], dtype='<U1'), array([0, 1]))
(array([['c', 'c'], ['d', 'd']], dtype='<U1'), array([2, 3]))
(array([['e', 'e'], ['f', 'f']], dtype='<U1'), array([4, 5]))
Notes
If you have two inputs and one label and want to shuffle them together, e.g. X1 (1000, 100), X2 (1000, 80) and Y (1000, 1), you can stack them together (np.hstack((X1, X2))) into (1000, 180) and feed the result to inputs. After getting a batch, you can split it back into X1 and X2, as sketched below.
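A sketch of that pattern (shapes as in the note; names are illustrative):

>>> import numpy as np
>>> X1 = np.random.rand(1000, 100)
>>> X2 = np.random.rand(1000, 80)
>>> y = np.random.randint(0, 2, size=(1000,))
>>> X = np.hstack((X1, X2))  # (1000, 180)
>>> for batch in tl.iterate.minibatches(inputs=X, targets=y, batch_size=32, shuffle=True):
...     x_batch, y_batch = batch
...     x1_batch, x2_batch = x_batch[:, :100], x_batch[:, 100:]  # split back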
Sequence iteration 1
Examples
>>> print(batch)
(array([['a', 'a'], ['b', 'b'], ['b', 'b'], ['c', 'c']], dtype='<U1'), array([0, 1, 1, 2]))
(array([['c', 'c'], ['d', 'd'], ['d', 'd'], ['e', 'e']], dtype='<U1'), array([2, 3, 3, 4]))
Many to One
>>> Y = np.asarray([0, 1, 2, 3, 4, 5])
>>> for batch in tl.iterate.seq_minibatches(inputs=X, targets=Y, batch_size=2, seq_length=num_steps, stride=1):
>>>     x, y = batch
>>>     if return_last:
>>>         tmp_y = y.reshape((-1, num_steps) + y.shape[1:])
>>>         y = tmp_y[:, -1]
>>>     print(x, y)
[['a' 'a']
 ['b' 'b']
 ['b' 'b']
 ['c' 'c']] [1 2]
[['c' 'c']
 ['d' 'd']
 ['d' 'd']
 ['e' 'e']] [3 4]
Sequence iteration 2
Examples
[[ 0. 1. 2.] [ 10. 11. 12.]] [[ 20. 21. 22.] [ 30. 31. 32.]]
[[ 3. 4. 5.] [ 13. 14. 15.]] [[ 23. 24. 25.] [ 33. 34. 35.]]
[[ 6. 7. 8.] [ 16. 17. 18.]] [[ 26. 27. 28.] [ 36. 37. 38.]]
Notes
• Hint: if the input data are images, you can modify the source code data = np.zeros([batch_size, batch_len]) to data = np.zeros([batch_size, batch_len, inputs.shape[1], inputs.shape[2], inputs.shape[3]]).
Examples
[[ 3  4  5]   <-- 1st batch input    (2nd subset/iteration)
 [13 14 15]]  <-- 2nd batch input
[[ 4  5  6]   <-- 1st batch target
 [14 15 16]]  <-- 2nd batch target
[[ 6  7  8]                          (3rd subset/iteration)
 [16 17 18]]
[[ 7  8  9]
 [17 18 19]]
TensorLayer provides rich layer implementations tailored for various benchmarks and domain-specific problems. In addition, we also support transparent access to native TensorFlow parameters. For example, we provide not only layers for local response normalization, but also layers that allow users to apply tf.nn.lrn on network.outputs. More functions can be found in the TensorFlow API.
These functions help you to reuse parameters for different inference graphs, and to get a list of parameters by a given name. About TensorFlow parameter sharing, click here.
Examples
Print variables
tensorlayer.layers.print_all_variables(train_only=False)
Print information of trainable or all variables, without using tl.layers.initialize_global_variables(sess).
Parameters train_only (boolean) –
Whether to print trainable variables only.
• If True, print the trainable variables.
• If False, print all variables.
Initialize variables
tensorlayer.layers.initialize_global_variables(sess)
Initialize the global variables of TensorFlow.
sess = tf.InteractiveSession()
y = network.outputs
y_op = tf.argmax(tf.nn.softmax(y), 1)
train_params = network.all_params
tl.layers.initialize_global_variables(sess)
network.print_params()
network.print_layers()
In addition, network.all_drop is a dictionary that stores the keeping probabilities of all noise layers. In the above network, they represent the keeping probabilities of the dropout layers.
For training, you can enable all dropout layers; for evaluating and testing, you can disable them, as sketched below.
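A sketch of both cases, assuming x, y_ and the data arrays come from the MNIST example; tl.utils.dict_to_one sets all keeping probabilities to 1:

# enable noise layers for training
feed_dict = {x: X_train_batch, y_: y_train_batch}
feed_dict.update(network.all_drop)

# disable noise layers for evaluating/testing
dp_dict = tl.utils.dict_to_one(network.all_drop)
feed_dict = {x: X_test, y_: y_test}
feed_dict.update(dp_dict)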
For more details, please read the MNIST examples in the example folder.
A Simple Layer
To implement a custom layer in TensorLayer, you have to write a Python class that subclasses Layer and implements the outputs expression.
The following is an example implementation of a layer that multiplies its input by 2:
class DoubleLayer(Layer):
    def __init__(
            self,
            prev_layer=None,
            name='double_layer',
    ):
        # manage layer (fixed)
        super(DoubleLayer, self).__init__(prev_layer=prev_layer, name=name)
        # operation (customized)
        self.outputs = self.inputs * 2
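A usage sketch of the custom layer above (x is an assumed input placeholder):

x = tf.placeholder(tf.float32, shape=[None, 100])
net = tl.layers.InputLayer(x, name='input')
net = DoubleLayer(net, name='double')  # outputs = inputs * 2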
Before creating your own TensorLayer layer, let's have a look at the Dense layer. It creates a weight matrix and a bias vector if they do not exist, and then implements the output expression. At the end, for a layer with parameters, we also append the parameters into all_params.
class MyDenseLayer(Layer):
    def __init__(
            self,
            prev_layer=None,
            n_units=100,
            act=tf.nn.relu,
            name='simple_dense',
    ):
        # manage layer (fixed)
        super(MyDenseLayer, self).__init__(prev_layer=prev_layer, act=act, name=name)
        # operation (customized)
        n_in = int(self.inputs._shape[-1])
        with tf.variable_scope(name) as vs:
            # create new parameters
            W = tf.get_variable(name='W', shape=(n_in, n_units))
            b = tf.get_variable(name='b', shape=(n_units,))
            # tensor operation
            self.outputs = self._apply_activation(tf.matmul(self.inputs, W) + b)
count_params()
Return the number of parameters of this network.
get_all_params()
Return the parameters in a list of arrays.
Examples
• Define model
• Get information
>>> print(n)
Last layer is: DenseLayer (d2) [None, 80]
>>> n.print_layers()
[TL] layer 0: d1/Identity:0 (?, 80) float32
[TL] layer 1: d2/Identity:0 (?, 80) float32
>>> n.print_params(False)
[TL] param 0: d1/W:0 (100, 80) float32_ref
[TL] param 1: d1/b:0 (80,) float32_ref
[TL] param 2: d2/W:0 (80, 80) float32_ref
[TL] param 3: d2/b:0 (80,) float32_ref
[TL] num of params: 14560
>>> n.count_params()
14560
>>> for l in n:
>>> print(l)
Tensor("d1/Identity:0", shape=(?, 80), dtype=float32)
Tensor("d2/Identity:0", shape=(?, 80), dtype=float32)
Input Layer
Parameters
• inputs (placeholder or tensor) – The input of a network.
• name (str) – A unique layer name.
Examples
Examples
References
tensorflow/examples/tutorials/word2vec/word2vec_basic.py
Examples
(8, 50)
References
• [1] Iyyer, M., Manjunatha, V., Boyd-Graber, J., & Daumé III, H. (2015). Deep Unordered Composition Rivals Syntactic Methods for Text Classification. In Association for Computational Linguistics.
• [2] Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2016). Bag of Tricks for Efficient Text Classification.
Examples
(8, 50)
PReLU Layer
References
PReLU6 Layer
Parameters
References
PTReLU6 Layer
References
Simplified Convolutions
For users not familiar with TensorFlow, the following simplified functions may be easier to use. We will provide more simplified functions later; if you are comfortable with TensorFlow, the professional APIs may suit you better.
Conv1d
Examples
Conv2d
Examples
Simplified Deconvolutions
For users not familiar with TensorFlow, the following simplified functions may be easier to use. We will provide more simplified functions later; if you are comfortable with TensorFlow, the professional APIs may suit you better.
DeConv2d
DeConv3d
Expert Convolutions
Conv1dLayer
Conv2dLayer
Notes
• shape = [h, w, the number of output channels of the previous layer, the number of output channels]
• the number of output channels of a layer is its last dimension.
Examples
With TensorLayer
Conv3dLayer
Examples
Expert Deconvolutions
DeConv2dLayer
Notes
Examples
>>> batch_size = 64
>>> inputs = tf.placeholder(tf.float32, [batch_size, 100], name='z_noise')
>>> net_in = tl.layers.InputLayer(inputs, name='g/in')
>>> net_h0 = tl.layers.DenseLayer(net_in, n_units=8192,
...                               W_init=tf.random_normal_initializer(stddev=0.02),
...                               act=None, name='g/h0/lin')
>>> print(net_h0.outputs._shape)
(64, 8192)
>>> net_h0 = tl.layers.ReshapeLayer(net_h0, shape=(-1, 4, 4, 512), name='g/h0/reshape')
>>> print(net_h0.outputs._shape)
(64, 4, 4, 512)
>>> net_h1 = tl.layers.DeConv2dLayer(net_h0,
...                                  shape=(5, 5, 256, 512),
...                                  output_shape=(batch_size, 8, 8, 256),
...                                  strides=(1, 2, 2, 1),
...                                  act=None, name='g/h1/decon2d')
>>> net_h1 = tl.layers.BatchNormLayer(net_h1, act=tf.nn.relu, is_train=is_train, name='g/h1/batch_norm')
>>> print(net_h1.outputs._shape)
(64, 8, 8, 256)
U-Net
>>> ....
>>> conv10 = tl.layers.Conv2dLayer(conv9, act=tf.nn.relu,
...                                shape=(3, 3, 1024, 1024), strides=(1, 1, 1, 1), padding='SAME',
...                                W_init=w_init, b_init=b_init, name='conv10')
>>> print(conv10.outputs)
(batch_size, 32, 32, 1024)
>>> deconv1 = tl.layers.DeConv2dLayer(conv10, act=tf.nn.relu,
...                                   shape=(3, 3, 512, 1024), strides=(1, 2, 2, 1),
...                                   output_shape=(batch_size, 64, 64, 512),
DeConv3dLayer
Atrous (De)Convolutions
AtrousConv1dLayer
AtrousConv2dLayer
AtrousDeConv2dLayer
Deformable Convolutions
DeformableConv2d
Examples
>>> offset2 = tl.layers.Conv2d(net, 18, (3, 3), (1, 1), act=act, padding='SAME', name='offset2')
References
Notes
Depthwise Convolutions
DepthwiseConv2d
Parameters
• prev_layer (Layer) – Previous layer.
• filter_size (tuple of int) – The filter size (height, width).
• stride (tuple of int) – The stride step (height, width).
• act (activation function) – The activation function of this layer.
• padding (str) – The padding algorithm type: “SAME” or “VALID”.
• dilation_rate (tuple of 2 int) – The dilation rate in which we sample input
values across the height and width dimensions in atrous convolution. If it is greater than 1,
then all values of strides must be 1.
• depth_multiplier (int) – The number of channels to expand to.
• W_init (initializer) – The initializer for the weight matrix.
• b_init (initializer or None) – The initializer for the bias vector. If None, skip
bias.
• W_init_args (dictionary) – The arguments for the weight matrix initializer.
• b_init_args (dictionary) – The arguments for the bias vector initializer.
• name (str) – A unique layer name.
Examples
References
• tflearn’s grouped_conv_2d
• keras’s separableconv2d
Group Convolutions
GroupConv2d
Separable Convolutions
SeparableConv1d
• n_filter (int) – The dimensionality of the output space (i.e. the number of filters in the
convolution).
• filter_size (int) – Specifying the spatial dimensions of the filters. Can be a single
integer to specify the same value for all spatial dimensions.
• strides (int) – Specifying the stride of the convolution. Can be a single integer to spec-
ify the same value for all spatial dimensions. Specifying any stride value != 1 is incompatible
with specifying any dilation_rate value != 1.
• padding (str) – One of “valid” or “same” (case-insensitive).
• data_format (str) – One of channels_last (default) or channels_first. The ordering of
the dimensions in the inputs. channels_last corresponds to inputs with shape (batch, height,
width, channels) while channels_first corresponds to inputs with shape (batch, channels,
height, width).
• dilation_rate (int) – Specifying the dilation rate to use for dilated convolution. Can
be a single integer to specify the same value for all spatial dimensions. Currently, specifying
any dilation_rate value != 1 is incompatible with specifying any stride value != 1.
• depth_multiplier (int) – The number of depthwise convolution output channels for
each input channel. The total number of depthwise convolution output channels will be
equal to num_filters_in * depth_multiplier.
• depthwise_init (initializer) – Initializer for the depthwise convolution kernel.
• pointwise_init (initializer) – Initializer for the pointwise convolution kernel.
• b_init (initializer) – Initializer for the bias vector. If None, ignore bias in the pointwise part only.
• name (str) – A unique layer name.
SeparableConv2d
SubPixel Convolutions
SubpixelConv1d
Examples
References
SubpixelConv2d
Examples
>>> # this example shows how to set n_out_channel
>>> import numpy as np
>>> import tensorflow as tf
>>> import tensorlayer as tl
>>> x = np.random.rand(2, 16, 16, 4)
>>> X = tf.placeholder("float32", shape=(2, 16, 16, 4), name="X")
>>> net = tl.layers.InputLayer(X, name='input')
>>> net = tl.layers.SubpixelConv2d(net, scale=2, n_out_channel=1, name='subpixel_conv2d')
References
• Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural
Network
Dense Layer
Examples
With TensorLayer
>>> W = tf.Variable(
... tf.random_uniform([n_in, n_units], -1.0, 1.0), name='W')
>>> b = tf.Variable(tf.zeros(shape=[n_units]), name='b')
>>> y = tf.nn.relu(tf.matmul(inputs, W) + b)
Notes
If the layer input has more than two axes, it needs to be flattened by using FlattenLayer.
Examples
References
Examples
Examples
Tile layer
Examples
TF-Slim Layer
TF-Slim models can be connected into TensorLayer. All of Google's pre-trained models can be used easily; see Slim-model.
Notes
• As TF-Slim stores the layers in a dictionary, the all_layers in this network are not in order! Fortunately, the all_params are in order.
Examples
>>> # net 1
>>> net_1 = tl.layers.DenseLayer(net_in, n_units=800, act=tf.nn.relu, name='net1/relu1')
>>> # multiplexer
>>> net_mux = tl.layers.MultiplexerLayer(layers=[net_0, net_1], name='mux')
>>> network = tl.layers.ReshapeLayer(net_mux, shape=(-1, 800), name='reshape')
>>> network = tl.layers.DropoutLayer(network, keep=0.5, name='drop3')
>>> # output layer
>>> network = tl.layers.DenseLayer(network, n_units=10, act=None, name='output')
2D UpSampling
2D DownSampling
• prev_layer (Layer) – Previous layer with 4-D Tensor in the shape of (batch, height,
width, channels) or 3-D Tensor in the shape of (height, width, channels).
• size (tuple of int/float) – (height, width) scale factor or new size of height and
width.
• is_scale (boolean) – If True (default), the size is the scale factor; otherwise, the size
are numbers of pixels of height and width.
• method (int) –
The resize method selected through the index. The default index is 0, which is ResizeMethod.BILINEAR.
Lambda Layer
Examples
Non-parametric case
Examples
Concat Layer
Examples
ElementWise Layer
Examples
>>> net.print_params(False)
[TL] param 0: net_0/W:0 (784, 500) float32_ref
[TL] param 1: net_0/b:0 (500,) float32_ref
[TL] param 2: net_1/W:0 (784, 500) float32_ref
[TL] param 3: net_1/b:0 (500,) float32_ref
>>> net.print_layers()
[TL] layer 0: net_0/Relu:0 (?, 500) float32
[TL] layer 1: net_1/Relu:0 (?, 500) float32
[TL] layer 2: minimum:0 (?, 500) float32
Examples
Since local response normalization does not have any weights or arguments, you can also apply tf.nn.lrn directly on network.outputs.
Batch Normalization
References
• Source
• stackoverflow
Instance Normalization
Layer Normalization
Group Normalization
Switch Normalization
References
Notes
Examples
1D Zero padding
2D Zero padding
3D Zero padding
Examples
• see Conv2dLayer.
1D Max pooling
1D Mean pooling
2D Max pooling
2D Mean pooling
3D Max pooling
3D Mean pooling
Examples
Examples
Examples
Examples
Examples
Examples
This is an experimental API package for building Quantized Neural Networks. We are using matrix multiplication rather than add-minus and bit-count operations at the moment. Therefore, these APIs will not speed up inference; for production, you can train the model via TensorLayer and deploy it in a customized C/C++ implementation (we may provide an extra C/C++ binary-net framework that can load models from TensorLayer).
Note that these experimental APIs may change in the future.
Sign
Scale
Binary (De)Convolutions
BinaryConv2d
Examples
...
>>> net = tl.layers.SignLayer(net)
>>> net = tl.layers.BinaryConv2d(net, 64, (5, 5), (1, 1), padding='SAME', name='bcnn2')
TernaryDenseLayer
Ternary Convolutions
TernaryConv2d
Examples
...
>>> net = tl.layers.SignLayer(net)
>>> net = tl.layers.TernaryConv2d(net, 64, (5, 5), (1, 1), padding='SAME', name='bcnn2')
DoReFa Convolutions
DorefaConv2d
Examples
...
>>> net = tl.layers.SignLayer(net)
>>> net = tl.layers.DorefaConv2d(net, 64, (5, 5), (1, 1), padding='SAME', name='bcnn2')
QuanDenseLayer
QuanDenseLayerWithBN
Quantization Convolutions
Quantization
Examples
...
>>> net = tl.layers.QuanConv2d(net, 64, (5, 5), (1, 1), padding='SAME', act=tf.nn.relu, name='qcnn2')
QuanConv2dWithBN
Examples
All recurrent layers can implement any type of RNN cell by feeding in different cell functions (LSTM, GRU, etc.).
RNN layer
initial_state
Tensor or StateTuple –
The initial state of this layer.
• In practice, you can set your state at the beginning of each epoch or iteration according to your training procedure.
batch_size
int or Tensor – An integer if the batch_size can be computed statically; otherwise, a tensor for a dynamic batch size.
Examples
• For CNN+LSTM
Notes
Input dimension should be rank 3: [batch_size, n_steps, n_features]; if not, please see ReshapeLayer.
References
Bidirectional layer
Notes
Input dimension should be rank 3: [batch_size, n_steps, n_features]. If not, please see ReshapeLayer. For prediction, the sequence length has to be the same as in training, while for a normal RNN we can use a sequence length of 1 for prediction.
References
Source
Recurrent Convolution
class tensorlayer.layers.ConvRNNCell
Abstract object representing a Convolutional RNN Cell.
These operations are usually used inside the Dynamic RNN layer; they can compute the sequence lengths for different situations and get the last RNN outputs by indexing.
Output indexing
tensorlayer.layers.advanced_indexing_op(inputs, index)
Advanced indexing for sequences; returns the outputs by given sequence lengths. When returning the last output, DynamicRNNLayer uses it to get the last outputs with the sequence lengths.
Parameters
• inputs (tensor for data) – With shape of [batch_size, n_step(max), n_features]
Examples
References
• Modified from TFlearn (the original code is used for fixed length rnn), references.
tensorlayer.layers.retrieve_seq_length_op(data)
An op to compute the length of a sequence from an input of shape [batch_size, n_step(max), n_features]; it can be used when the features of the padding (on the right hand side) are all zeros.
Parameters data (tensor) – [batch_size, n_step(max), n_features] with zero padding on right
hand side.
Examples
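A minimal sketch (output values follow from the zero-padding rule above; a session is assumed):

>>> import numpy as np
>>> data = np.asarray([[[1], [2], [0], [0], [0]],
...                    [[1], [2], [3], [0], [0]],
...                    [[1], [2], [6], [1], [0]]])
>>> op = tl.layers.retrieve_seq_length_op(data)
>>> sess.run(op)  # [2, 3, 4]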
References
tensorlayer.layers.retrieve_seq_length_op2(data)
An op to compute the length of a sequence from an input of shape [batch_size, n_step(max)]; it can be used when the features of the padding (on the right hand side) are all zeros.
Parameters data (tensor) – [batch_size, n_step(max)] with zero padding on right hand side.
Examples
tensorlayer.layers.retrieve_seq_length_op3(data, pad_val=0)
Return a tensor for the sequence length, if the input is tf.string.
Get Mask
tensorlayer.layers.target_mask_op(data, pad_val=0)
Return a tensor for the mask, if the input is tf.string.
RNN Layer
• return_seq_2d (boolean) –
Only consider this argument when return_last is False
– If True, return 2D Tensor [n_example, n_hidden], for stacking DenseLayer after it.
– If False, return 3D Tensor [n_example/n_steps, n_steps, n_hidden], for stacking multiple RNNs after it.
• dynamic_rnn_init_args (dictionary) – The arguments for tf.nn.
dynamic_rnn.
• name (str) – A unique layer name.
outputs
tensor – The output of this layer.
final_state
tensor or StateTuple –
The final state of this layer.
• When state_is_tuple is False, it is the final hidden and cell states, states.get_shape() = [?, 2 *
n_hidden].
• When state_is_tuple is True, it stores two elements: (c, h).
• In practice, you can get the final state after each iteration during training, then feed it to the initial
state of next iteration.
initial_state
tensor or StateTuple –
The initial state of this layer.
• In practice, you can set your state at the beginning of each epoch or iteration according to your training procedure.
batch_size
int or tensor – An integer if the batch_size can be computed statically; otherwise, a tensor for a dynamic batch size.
sequence_length
a tensor or array – The sequence lengths computed by Advanced Opt or the given sequence lengths,
[batch_size]
Notes
Input dimension should be rank 3: [batch_size, n_steps(max), n_features]; if not, please see ReshapeLayer.
Examples
Synced sequence input and output, for loss function see tl.cost.cross_entropy_seq_with_mask.
>>> input_seqs = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name="input")
>>> net = tl.layers.EmbeddingInputlayer(
...     inputs=input_seqs,
...     vocabulary_size=vocab_size,
...     embedding_size=embedding_size,
...     name='embedding')
>>> net = tl.layers.DynamicRNNLayer(net,
...     cell_fn=tf.contrib.rnn.BasicLSTMCell,
...     n_hidden=embedding_size,
...     dropout=(0.7 if is_train else None),
...     sequence_length=tl.layers.retrieve_seq_length_op2(input_seqs),
...     return_last=False,    # for encoder, set to True
...     return_seq_2d=True,   # stack DenseLayer or compute cost after it
...     name='dynamicrnn')
>>> net = tl.layers.DenseLayer(net, n_units=vocab_size, name="output")
References
• Wild-ML Blog
• dynamic_rnn.ipynb
• tf.nn.dynamic_rnn
• tflearn rnn
• tutorial_dynamic_rnn.py
Bidirectional Layer
The sequence length of each row of input data, see Advanced Ops for Dynamic RNN.
fw(bw)_initial_state
tensor or StateTuple –
The initial state of this layer.
• In practice, you can set your state at the beginning of each epoch or iteration according to your
training procedure.
batch_size
int or tensor – An integer if the batch size can be computed statically; otherwise, a tensor for the dynamic batch
size.
sequence_length
a tensor or array – The sequence lengths computed by Advanced Opt or the given sequence lengths,
[batch_size].
Notes
Input dimension should be rank 3 : [batch_size, n_steps(max), n_features]. If not, please see ReshapeLayer.
References
• Wild-ML Blog
• bidirectional_rnn.ipynb
Sequence to Sequence
Simple Seq2Seq
Parameters
• net_encode_in (Layer) – Encode sequences, [batch_size, None, n_features].
• net_decode_in (Layer) – Decode sequences, [batch_size, None, n_features].
• cell_fn (TensorFlow cell function) –
A TensorFlow core RNN cell
– see RNN Cells in TensorFlow
– Note TF1.0+ and TF1.0- are different
outputs
tensor – The output of RNN decoder.
initial_state_encode
tensor or StateTuple – Initial state of RNN encoder.
initial_state_decode
tensor or StateTuple – Initial state of RNN decoder.
final_state_encode
tensor or StateTuple – Final state of RNN encoder.
final_state_decode
tensor or StateTuple – Final state of RNN decoder.
Notes
Examples
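A minimal sketch of wiring up the layer; the vocabulary size, hidden size, shared embedding scope and cell choice below are illustrative assumptions:

>>> batch_size = 32
>>> encode_seqs = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name="encode_seqs")
>>> decode_seqs = tf.placeholder(dtype=tf.int64, shape=[batch_size, None], name="decode_seqs")
>>> with tf.variable_scope("embedding") as vs:
...     net_encode = tl.layers.EmbeddingInputlayer(encode_seqs, vocabulary_size=10000, embedding_size=200, name='embed')
...     vs.reuse_variables()
...     net_decode = tl.layers.EmbeddingInputlayer(decode_seqs, vocabulary_size=10000, embedding_size=200, name='embed')
>>> net_out = tl.layers.Seq2Seq(net_encode, net_decode,
...     cell_fn=tf.contrib.rnn.BasicLSTMCell, n_hidden=200,
...     encode_sequence_length=tl.layers.retrieve_seq_length_op2(encode_seqs),
...     decode_sequence_length=tl.layers.retrieve_seq_length_op2(decode_seqs),
...     n_layer=1, return_seq_2d=True, name='seq2seq')
>>> net_out = tl.layers.DenseLayer(net_out, n_units=10000, act=tf.identity, name='output')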
>>> y = tf.nn.softmax(net_out.outputs)
>>> net_out.print_params(False)
Flatten Layer
We often apply DenseLayer, RNNLayer, ConcatLayer, etc. on top of a flatten layer. [batch_size,
mask_row, mask_col, n_mask] —> [batch_size, mask_row * mask_col * n_mask]
Parameters
• prev_layer (Layer) – Previous layer.
• name (str) – A unique layer name.
Examples
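A sketch flattening a 4D image batch; the shapes are illustrative:

>>> x = tf.placeholder(tf.float32, shape=[None, 28, 28, 1])
>>> net = tl.layers.InputLayer(x, name='input')
>>> net = tl.layers.FlattenLayer(net, name='flatten')
>>> print(net.outputs.get_shape().as_list())
[None, 784]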
Reshape Layer
Examples
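A sketch reshaping flat vectors back into images; the shapes are illustrative:

>>> x = tf.placeholder(tf.float32, shape=[None, 784])
>>> net = tl.layers.InputLayer(x, name='input')
>>> net = tl.layers.ReshapeLayer(net, shape=[-1, 28, 28, 1], name='reshape')
>>> print(net.outputs.get_shape().as_list())
[None, 28, 28, 1]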
Transpose Layer
Examples
2D Affine Transformation
References
References
Notes
Stack Layer
Examples
Unstack Layer
Examples
Flatten tensor
tensorlayer.layers.flatten_reshape(variable, name=’flatten’)
Reshapes a high-dimensional input into a batch of flat vectors.
[batch_size, mask_row, mask_col, n_mask] —> [batch_size, mask_row x mask_col x n_mask]
Parameters
• variable (TensorFlow variable or tensor) – The variable or tensor to be
flattened.
• name (str) – A unique layer name.
Returns Flatten Tensor
Return type Tensor
Examples
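A sketch on a plain tensor; the shapes are illustrative:

>>> x = tf.placeholder(tf.float32, [None, 7, 7, 32])
>>> y = tl.layers.flatten_reshape(x, name='flatten')
>>> print(y.get_shape().as_list())
[None, 1568]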
tensorlayer.layers.clear_layers_name()
DEPRECATED FUNCTION
Warning: THIS FUNCTION IS DEPRECATED: It will be removed after 2018-06-30. Instructions
for updating: TensorLayer relies on TensorFlow to check naming.
tensorlayer.layers.initialize_rnn_state(state, feed_dict=None)
Returns the initialized RNN state. The inputs are LSTMStateTuple or State of RNNCells, and an optional
feed_dict.
Parameters
• state (RNN state.) – The TensorFlow’s RNN state.
• feed_dict (dictionary) – Initial RNN state; if None, returns zero state.
Returns The TensorFlow’s RNN state.
Return type RNN state
tensorlayer.layers.list_remove_repeat(x)
Remove the repeated items in a list and return the processed list. You may need it to create merged layers such as
Concat, Elementwise, etc.
Parameters x (list) – Input
Returns The list after removing its repeated items
Return type list
Examples
>>> l = [2, 3, 4, 2, 3]
>>> l = list_remove_repeat(l)
[2, 3, 4]
tensorlayer.layers.merge_networks(layers=None)
Merge all parameters, layers and dropout probabilities to a Layer. The output of the returned network is the first
network in the list.
Parameters layers (list of Layer) – Merge all parameters, layers and dropout probabilities to
the first layer in the list.
Returns The network after merging all parameters, layers and dropout probabilities to the first net-
work in the list.
Examples
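A sketch merging two branches; the placeholders and layer names are illustrative:

>>> x1 = tf.placeholder(tf.float32, [None, 784])
>>> x2 = tf.placeholder(tf.float32, [None, 784])
>>> n1 = tl.layers.DenseLayer(tl.layers.InputLayer(x1, name='in1'), 80, name='d1')
>>> n2 = tl.layers.DenseLayer(tl.layers.InputLayer(x2, name='in2'), 80, name='d2')
>>> net = tl.layers.merge_networks([n1, n2])   # net carries the output of n1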
TensorLayer provides many pretrained models, you can easily use the whole or a part of the pretrained models via
these APIs.
2.9.1 VGG16
Examples
Extract features with VGG16 and Train a classifier with 100 classes
>>> x = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> # get VGG without the last layer
>>> vgg = tl.models.VGG16(x, end_with='fc2_relu')
>>> # add one more layer
>>> net = tl.layers.DenseLayer(vgg, 100, name='out')
Reuse model
>>> x1 = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> x2 = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> # get VGG without the last layer
>>> vgg1 = tl.models.VGG16(x1, end_with='fc2_relu')
>>> # reuse the parameters of vgg1 with different input
>>> vgg2 = tl.models.VGG16(x2, end_with='fc2_relu', reuse=True)
>>> # restore pre-trained VGG parameters (as they share parameters, we don't need to restore vgg2)
>>> vgg1.restore_params(sess)
2.9.2 VGG19
Examples
Extract features with VGG19 and Train a classifier with 100 classes
>>> x = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> # get VGG without the last layer
>>> vgg = tl.models.VGG19(x, end_with='fc2_relu')
>>> # add one more layer
>>> net = tl.layers.DenseLayer(vgg, 100, name='out')
Reuse model
2.9.3 SqueezeNetV1
Examples
Reuse model
2.9.4 MobileNetV1
Examples
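A sketch following the same pattern as the VGG models above; restore_params is assumed to behave as for VGG16:

>>> x = tf.placeholder(tf.float32, [None, 224, 224, 3])
>>> # get the whole model
>>> net = tl.models.MobileNetV1(x)
>>> # restore pre-trained parameters
>>> sess = tf.InteractiveSession()
>>> net.restore_params(sess)
>>> # inference
>>> probs = tf.nn.softmax(net.outputs)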
Reuse model
Examples
Setting num_skips=2, skip_window=1 uses the words to the right and left. Similarly, num_skips=4,
skip_window=2 means using the nearby 4 words.
>>> print(batch)
[2 2 3 3 4 4 5 5]
>>> print(labels)
[[3]
...
Simple sampling
tensorlayer.nlp.sample(a=None, temperature=1.0)
Sample an index from a probability array.
Parameters
• a (list of float) – List of probabilities.
• temperature (float or None) –
The higher the temperature, the more uniform the distribution. When a = [0.1, 0.2, 0.7]:
– temperature = 0.7, the distribution will be sharpened: [0.05048273, 0.13588945,
0.81362782]
– temperature = 1.0, the distribution will be the same: [0.1, 0.2, 0.7]
– temperature = 1.5, the distribution will be flattened: [0.16008435, 0.25411807,
0.58579758]
– If None, it will be np.argmax(a)
Notes
• No matter what the temperature and input list are, the sum of all probabilities will be one. Even if the input
list is [1, 100, 200], the sum of all probabilities will still be one.
• For a large vocabulary size, choose a higher temperature or tl.nlp.sample_top to avoid errors.
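For example:

>>> a = [0.1, 0.2, 0.7]
>>> idx = tl.nlp.sample(a, temperature=0.7)   # sharper, strongly favours index 2
>>> idx = tl.nlp.sample(a, temperature=None)  # deterministic, same as np.argmax(a)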
tensorlayer.nlp.sample_top(a=None, top_k=10)
Sample from top_k probabilities.
Parameters
• a (list of float) – List of probabilities.
• top_k (int) – Number of candidates to be considered.
Vocabulary class
Examples
>>> a 969108
>>> <S> 586368
>>> </S> 586368
>>> . 440479
>>> on 213612
...
Process sentence
Examples
Notes
Create vocabulary
Examples
Pre-process sentences
Create vocabulary
Creating vocabulary.
Total words: 8
Words in vocabulary: 8
Wrote vocabulary file: vocab.txt
tensorlayer.nlp.simple_read_words(filename=’nietzsche.txt’)
Read context from file without any preprocessing.
Parameters filename (str) – A file path (like .txt file)
Returns The context in a string.
Return type str
Read file
tensorlayer.nlp.read_words(filename=’nietzsche.txt’, replace=None)
Read the context of a file as a list of words.
For customized read_words method, see tutorial_generate_text.py.
Parameters
• filename (str) – a file path.
• replace (list of str) – replace original string by target string.
Returns The context in a list (split using space).
Return type list of str
tensorlayer.nlp.read_analogies_file(eval_file=’questions-words.txt’, word2id=None)
Reads through an analogy question file and returns it in ID format.
Parameters
• eval_file (str) – The file name.
• word2id (dictionary) – a dictionary that maps word to ID.
Returns A [n_examples, 4] numpy array containing the analogy question’s word IDs.
Return type numpy.array
Examples
>>> print(analogy_questions)
[[ 3068 1248 7161 1581]
[ 3068 1248 28683 5642]
[ 3068 1248 3878 486]
...,
tensorlayer.nlp.build_vocab(data)
Build vocabulary.
Given the context in list format, return the vocabulary, which is a dictionary mapping words to IDs, e.g. {'campbell':
2587, 'atlantic': 2247, 'aoun': 6746, ...}
Parameters data (list of str) – The context in list format
Returns A dictionary that maps each word to a unique ID, e.g. {'campbell': 2587, 'atlantic': 2247, 'aoun': 6746, ...}
Return type dictionary
References
• tensorflow.models.rnn.ptb.reader
Examples
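A sketch using the read helper documented below; the file name is illustrative:

>>> data = tl.nlp.read_words('nietzsche.txt')
>>> word_to_id = tl.nlp.build_vocab(data)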
tensorlayer.nlp.build_reverse_dictionary(word_to_id)
Given a dictionary that maps words to integer IDs, returns a reverse dictionary that maps an ID to a word.
Parameters word_to_id (dictionary) – that maps word to ID.
Returns A dictionary that maps IDs to words.
Return type dictionary
tensorlayer.nlp.build_words_dataset(words=None, vocabulary_size=50000, printable=True, unk_key='UNK')
Build the words dictionary and replace rare words with the unk_key token. The most common word has the smallest ID.
Parameters
• words (list of str or byte) – The context in list format.
• vocabulary_size (int) – The maximum vocabulary size, limiting the vocabulary size.
Then the script replaces rare words with the unk_key token.
• printable (boolean) – Whether to print the read vocabulary size of the given words.
• unk_key (str) – Represents unknown words.
Returns
• data (list of int) – The context in a list of ID.
• count (list of tuple and list) –
Pair words and IDs.
– count[0] is a list : the number of rare words
– count[1:] are tuples : the number of occurrence of each word
– e.g. [[‘UNK’, 418391], (b’the’, 1061396), (b’of’, 593677), (b’and’, 416629), (b’one’,
411764)]
• dictionary (dictionary) – It is word_to_id that maps word to ID.
• reverse_dictionary (a dictionary) – It is id_to_word that maps ID to word.
Examples
References
• tensorflow/examples/tutorials/word2vec/word2vec_basic.py
Save vocabulary
tensorlayer.nlp.save_vocab(count=None, name=’vocab.txt’)
Save the vocabulary to a file so the model can be reloaded.
Parameters count (a list of tuple and list) – count[0] is a list : the number of rare
words, count[1:] are tuples : the number of occurrence of each word, e.g. [[‘UNK’, 418391],
(b’the’, 1061396), (b’of’, 593677), (b’and’, 416629), (b’one’, 411764)]
Examples
tensorlayer.nlp.words_to_word_ids(data=None, word_to_id=None, unk_key='UNK')
Convert a list of strings (words) to IDs.
Examples
References
• tensorflow.models.rnn.ptb.reader
tensorlayer.nlp.word_ids_to_words(data, id_to_word)
Convert a list of integers to strings (words).
Parameters
• data (list of int) – The context in list format.
• id_to_word (dictionary) – a dictionary that maps ID to word.
Returns A list of string or byte to represent the context.
Return type list of str
Examples
Word Tokenization
Examples
References
Data file is assumed to contain one sentence per line. Each sentence is tokenized and digits are normalized (if
normalize_digits is set). Vocabulary contains the most-frequent tokens up to max_vocabulary_size. We write it
to vocabulary_path in a one-token-per-line format, so that the token in the first line gets id=0, the token in the second
line gets id=1, and so on.
Parameters
• vocabulary_path (str) – Path where the vocabulary will be created.
• data_path (str) – Data file that will be used to create vocabulary.
• max_vocabulary_size (int) – Limit on the size of the created vocabulary.
• tokenizer (function) – A function to use to tokenize each data sentence. If None,
basic_tokenizer will be used.
• normalize_digits (boolean) – If True, all digits are replaced by 0.
• _DIGIT_RE (regular expression function) – Default is re.
compile(br"\d").
• _START_VOCAB (list of str) – The pad, go, eos and unk token, default is
[b"_PAD", b"_GO", b"_EOS", b"_UNK"].
References
tensorlayer.nlp.initialize_vocabulary(vocabulary_path)
Initialize vocabulary from file, return the word_to_id (dictionary) and id_to_word (list).
We assume the vocabulary is stored one item per line, so a file with the lines dog and cat will result in a vocabulary
{"dog": 0, "cat": 1}, and this function will also return the reversed vocabulary ["dog", "cat"].
Parameters vocabulary_path (str) – Path to the file containing the vocabulary.
Returns
• vocab (dictionary) – a dictionary that maps word to ID.
• rev_vocab (list of str) – a list that maps ID to word.
Examples
References
2.10.9 Metrics
BLEU
Examples
References
• Google/seq2seq/metric/bleu
TensorLayer provides rich layer implementations tailored for various benchmarks and domain-specific problems. In
addition, we also support transparent access to native TensorFlow parameters. For example, we provide not only layers
for local response normalization, but also layers that allow users to apply tf.nn.lrn on network.outputs. More
functions can be found in the TensorFlow API.
TensorLayer provides a simple API and tools to ease research and development and to reduce the time to production.
Therefore, we provide the latest state-of-the-art optimizers that work with TensorFlow.
Reinforcement Learning.
Examples
Examples
Log weight
Examples
fit(sess, network, train_op, cost, X_train, ...) – Train a given non-time-series network with the given cost function, training data, batch_size, n_epoch, etc.
test(sess, network, acc, X_test, y_test, x, ...) – Test a given non-time-series network with the given test data and metric.
predict(sess, network, X, x, y_op[, batch_size]) – Return the prediction results of a given non-time-series network.
evaluation([y_test, y_predict, n_classes]) – Input the predicted results, target results and the number of classes; return the confusion matrix, per-class F1-score, accuracy and macro F1-score.
class_balancing_oversample([X_train, ...]) – Input the features and labels; return the features and labels after oversampling.
get_random_int([min_v, max_v, number, seed]) – Return a list of random integers for the given range and quantity.
Training
Parameters
• sess (Session) – TensorFlow Session.
• network (TensorLayer layer) – the network to be trained.
• train_op (TensorFlow optimizer) – The optimizer for training e.g.
tf.train.AdamOptimizer.
• X_train (numpy.array) – The input of training data
• y_train (numpy.array) – The target of training data
• x (placeholder) – For inputs.
• y (placeholder) – For targets.
• acc (TensorFlow expression or None) – Metric for accuracy or others. If None,
would not print the information.
• batch_size (int) – The batch size for training and evaluating.
• n_epoch (int) – The number of training epochs.
• print_freq (int) – Print the training information every print_freq epochs.
• X_val (numpy.array or None) – The input of validation data. If None, would not
perform validation.
• y_val (numpy.array or None) – The target of validation data. If None, would not
perform validation.
• eval_train (boolean) – Whether to evaluate the model during training. If X_val and
y_val are not None, it reflects whether to evaluate the model on training data.
• tensorboard_dir (string) – path to log dir, if set, summary data will be stored to
the tensorboard_dir/ directory for visualization with tensorboard. (default None) Also runs
tl.layers.initialize_global_variables(sess) internally in fit() to setup the summary nodes.
• tensorboard_epoch_freq (int) – How many epochs between storing tensorboard
checkpoint for visualization to log/ directory (default 5).
• tensorboard_weight_histograms (boolean) – If True updates tensorboard
data in the logs/ directory for visualization of the weight histograms every tensor-
board_epoch_freq epoch (default True).
• tensorboard_graph_vis (boolean) – If True stores the graph in the tensorboard
summaries saved to log/ (default True).
Examples
See tutorial_mnist_simple.py
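A typical call, following tutorial_mnist_simple.py; the placeholders x, y_ and the ops cost, train_op, acc are assumed to be defined as in that tutorial:

>>> tl.utils.fit(sess, network, train_op, cost, X_train, y_train, x, y_,
...              acc=acc, batch_size=500, n_epoch=200, print_freq=5,
...              X_val=X_val, y_val=y_val, eval_train=False)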
Notes
If tensorboard_dir is not None, global_variables_initializer will be run inside the fit function in order to initialize
the automatically generated summary nodes used for tensorboard visualization; thus the effect of any
tf.global_variables_initializer().run() issued before the fit() call will be overridden.
Evaluation
Examples
See tutorial_mnist_simple.py
Prediction
Examples
See tutorial_mnist_simple.py
>>> y = network.outputs
>>> y_op = tf.argmax(tf.nn.softmax(y), 1)
>>> print(tl.utils.predict(sess, network, X_test, x, y_op))
Examples
Examples
One X
Two X
Examples
tensorlayer.utils.dict_to_one(dp_dict)
Input a dictionary; return a dictionary in which all items are set to one.
Used to disable dropout, dropconnect layers and so on.
Parameters dp_dict (dictionary) – The dictionary contains key and number, e.g. keeping
probabilities.
Examples
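For example, to disable all dropout layers during testing; a sketch, assuming network.all_drop is the dropout-probability dictionary of a TL network:

>>> dp_dict = tl.utils.dict_to_one(network.all_drop)  # set all keeping probabilities to 1
>>> feed_dict = {x: X_test, y_: y_test}
>>> feed_dict.update(dp_dict)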
tensorlayer.utils.list_string_to_dict(string)
Inputs ['a', 'b', 'c'], returns {'a': 0, 'b': 1, 'c': 2}.
Flatten a list
tensorlayer.utils.flatten_list(list_of_list)
Input a list of lists; return a flat list containing all the items.
Parameters list_of_list (a list of list) –
Examples
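For example:

>>> tl.utils.flatten_list([[1, 2, 3], [4, 5], [6]])
[1, 2, 3, 4, 5, 6]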
tensorlayer.utils.exit_tensorflow(sess=None, port=6006)
Close the TensorFlow session, TensorBoard and Nvidia processes if available.
Parameters
• sess (Session) – TensorFlow Session.
• port (int) – The TensorBoard port you want to close, 6006 by default.
tensorlayer.utils.open_tensorboard(log_dir=’/tmp/tensorflow’, port=6006)
Open Tensorboard.
Parameters
• log_dir (str) – Directory where your tensorboard logs are saved
• port (int) – TensorBoard port you want to open, 6006 is tensorboard default
tensorlayer.utils.clear_all_placeholder_variables(printable=True)
Clears all the placeholder variables of keep prob, including keeping probabilities of all dropout, denoising,
dropconnect etc.
Parameters printable (boolean) – If True, print all deleted variables.
tensorlayer.utils.set_gpu_fraction(gpu_fraction=0.3)
Set the GPU memory fraction for the application.
Parameters gpu_fraction (float) – Fraction of GPU memory, (0 ~ 1]
References
TensorFlow provides TensorBoard to visualize the model, activations etc. Here we provide more functions for data
visualization.
tensorlayer.visualize.read_image(image, path=”)
Read one image.
Parameters
• image (str) – The image file name.
• path (str) – The image folder path.
Returns The image.
Return type numpy.array
tensorlayer.visualize.save_image(image, image_path=’_temp.png’)
Save an image.
Parameters
• image (numpy array) – [w, h, c]
• image_path (str) – path
Examples
References
Examples
References
Examples
Visualize weights
Examples
Image by matplotlib
Examples
Images by matplotlib
Examples
Examples
This is the alpha version of the database management system. If you have any trouble, please ask for help at
tensorlayer@gmail.com.
TensorLayer is designed for real world production, capable of large scale machine learning applications. TensorLayer
database is introduced to address the many data management challenges in the large scale machine learning projects,
such as:
1. Finding training data from an enterprise data warehouse.
2. Loading large datasets that are beyond the storage limitation of one computer.
3. Managing different models with version control, and comparing them (e.g. accuracy).
4. Automating the process of training, evaluating and deploying machine learning models.
With the TensorLayer system, we introduce this database technology to address the challenges above.
The database management system is designed with the following three principles in mind.
Everything is Data
Data warehouses can store and capture the entire machine learning development process. The data can be categorized
as:
1. Dataset: This includes all the data used for training, validation and prediction. The labels can be manually
specified or generated by model prediction.
2. Model architecture: The database includes a table that stores different model architectures, enabling users to
reuse many existing model development efforts.
3. Model parameters: This database stores all the model parameters of each epoch in the training step.
4. Tasks: A project usually includes many small tasks. Each task contains the necessary information such as hyper-
parameters for training or validation. For a training task, typical information includes the training data, the model
parameters, the model architecture, and how many epochs the training task has. Validation, testing and inference are
also supported by the task system.
5. Loggings: The logs store all the metrics of each machine learning model, such as the time stamp, loss and
accuracy of each batch or epoch.
The TensorLayer database is in principle a keyword-based search engine. Each model, parameter, or training data is
assigned many tags. The storage system organizes data into two layers: the index layer and the blob layer. The index
layer stores all the tags and references to the blob storage. The index layer is implemented based on NoSQL document
database such as MongoDB. The blob layer stores videos, medical images or label masks in large chunk size, which
is usually implemented based on a file system. Our database is based on MongoDB. The blob system is based on the
GridFS while the indexes are stored as documents.
Within the database framework, any entity within the data warehouse, such as the data, model or tasks, is specified
by the database query language. A query is more space-efficient for storage and can specify multiple objects in a
concise way. Another advantage of such a design is a highly flexible software system: many systems can be implemented
by simply rewriting different components, and many new applications can be implemented just by updating the query,
without modifying any application code.
2.15.2 Preparation
In principle, the database can be implemented by any document oriented NoSQL database system. The existing
implementation is based on MongoDB. Further implementations on other databases will be released depending on the
progress. It will be straightforward to port our database system to Google Cloud, AWS and Azure. The following
tutorials are based on the MongoDB implementation.
The installation instructions for MongoDB can be found in the MongoDB Docs. There are also many managed MongoDB
services from Amazon or GCP, such as MongoDB Atlas. Users can also use Docker, which is a powerful tool for
deploying software. After installing MongoDB, a MongoDB management tool with a graphical user interface will be
extremely useful. Users can also install Studio 3T (MongoChef), a powerful user interface tool for MongoDB
that is free for non-commercial use (studio3t).
2.15.3 Tutorials
As with MongoDB management tools, an IP address and port number are required for connecting to the database. To
distinguish different projects, the database instances take a project_name argument. In the following sketch, we
connect to MongoDB on a local machine with the IP localhost and port 27017 (the default port number
of MongoDB).
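A minimal connection sketch; the class name TensorHub and its argument names are assumptions based on this release's tensorlayer.db module:

import tensorlayer as tl

# connect to a local MongoDB and select a project namespace
db = tl.db.TensorHub(
    ip='localhost', port=27017, dbname='temp',
    username=None, password='password', project_name='tutorial'
)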
Dataset management
You can save a dataset into the database and allow all machines to access it. Apart from the dataset key, you can
also insert custom arguments such as version and description, for better managing the datasets (see the example
below). Note that all saving functions automatically save a timestamp, allowing you to load data, models and tasks
by timestamp.
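For example, mirroring the save_dataset API documented below:

db.save_dataset([X_train, y_train, X_test, y_test], 'mnist', version='1.0', description='this is a tutorial')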
After saving the dataset, others can access the dataset as follows:
dataset = db.find_dataset('mnist')
dataset = db.find_dataset('mnist', version='1.0')
If you have multiple datasets that use the same dataset key, you can get all of them as follows:
datasets = db.find_all_datasets('mnist')
Model management
Save the model architecture and parameters into the database (see the example below). The model architecture is
represented by a TL graph, and the parameters are stored as a list of arrays.
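For example, as in the save_model API documented below:

db.save_model(net, accuracy=0.8, loss=2.3, name='second_model')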
If there are many models, you can use MongoDB's sort method to find the model you want. To get the newest or
oldest model, you can sort by time:
## newest model
net = db.find_top_model(sess=sess, sort=[("time", -1)])
## oldest model
net = db.find_top_model(sess=sess, sort=[("time", 1)])
If you save the model along with its accuracy, you can get the model with the best accuracy as follows:
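A sketch, assuming the models were saved with an accuracy field as above:

net = db.find_top_model(sess=sess, sort=[("accuracy", -1)])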
db.delete_model()
If you want to specify which model to delete, pass the matching arguments.
db.save_training_log(accuracy=0.33)
db.save_training_log(accuracy=0.44)
db.delete_training_log(accuracy=0.33)
db.delete_training_log()
db.delete_validation_log()
db.delete_testing_log()
Task distribution
A project usually consists of many tasks such as hyperparameter selection. To make it easier, we can distribute these
tasks to several GPU servers. A task consists of a task script, hyperparameters, the desired result and a status.
A task distributor can push both dataset and tasks into a database, allowing task runners on GPU servers to pull and
run. The following is an example that pushes 3 tasks with different hyper parameters.
## push tasks into database, then allow other servers pull tasks to run
db.create_task(
    task_name='mnist', script='task_script.py', hyper_parameters=dict(n_units1=800, n_units2=800),
    saved_result_keys=['test_accuracy'], description='800-800'
)
db.create_task(
    task_name='mnist', script='task_script.py', hyper_parameters=dict(n_units1=600, n_units2=600),
    saved_result_keys=['test_accuracy'], description='600-600'
)
db.create_task(
    task_name='mnist', script='task_script.py', hyper_parameters=dict(n_units1=400, n_units2=400),
    saved_result_keys=['test_accuracy'], description='400-400'
)
## you can get the model and result from database and do some analysis at the end
The task runners on GPU servers can monitor the database and run tasks as soon as they become available. In the
task script, we can save the final model and results to the database; this allows task distributors to get the
desired model and results.
Example codes
See here.
Examples
Returns True for success, False for failure.
Return type boolean
Examples
Examples
>>> db.delete_tasks()
delete_testing_log(**kwargs)
Deletes testing log.
Parameters kwargs (logging information) – Find items to delete; leave it empty to
delete all logs.
Examples
• see save_training_log.
delete_training_log(**kwargs)
Deletes training log.
Parameters kwargs (logging information) – Find items to delete; leave it empty to
delete all logs.
Examples
Examples
• see save_training_log.
find_datasets(dataset_name=None, **kwargs)
Finds and returns all datasets from the database which match the requirement. In some cases, the data in
a dataset can be stored separately for better management.
Parameters
• dataset_name (str) – The name/key of dataset.
• kwargs (other events) – Other events, such as description, author, etc. (optional).
Returns params
Return type the parameters, return False if nothing found.
find_top_dataset(dataset_name=None, sort=None, **kwargs)
Finds and returns a dataset from the database which matches the requirement.
Parameters
• dataset_name (str) – The name of dataset.
• sort (List of tuple) – PyMongo sort argument; search “PyMongo find one sorting”
and collection level operations for more details.
• kwargs (other events) – Other events, such as description, author, etc. (optional).
Examples
Save dataset
>>> db.save_dataset([X_train, y_train, X_test, y_test], 'mnist', description='this is a tutorial')
Get dataset
>>> dataset = db.find_top_dataset('mnist')
>>> datasets = db.find_datasets('mnist')
Returns dataset – Return False if nothing found.
Return type the dataset or False
find_top_model(sess, sort=None, model_name=’model’, **kwargs)
Finds and returns a model architecture and its parameters from the database which matches the require-
ment.
Parameters
• sess (Session) – TensorFlow session.
• sort (List of tuple) – PyMongo sort argument; search “PyMongo find one sorting”
and collection level operations for more details.
• model_name (str or None) – The name/key of model.
• kwargs (other events) – Other events, such as name, accuracy, loss, step number,
etc. (optional).
Examples
• see save_model.
Returns network – Note that, the returned network contains all information of the document
(record), e.g. if you saved accuracy in the document, you can get the accuracy by using
net._accuracy.
Return type TensorLayer layer
Examples
Monitor the database and pull tasks to run
>>> while True:
>>>     print("waiting task from distributor")
>>>     db.run_top_task(task_name='mnist', sort=[("time", -1)])
>>>     time.sleep(1)
Returns True for success, False for failure.
Return type boolean
save_dataset(dataset=None, dataset_name=None, **kwargs)
Saves one dataset into the database; a timestamp will be added automatically.
Parameters
• dataset (any type) – The dataset you want to store.
• dataset_name (str) – The name of dataset.
• kwargs (other events) – Other events, such as description, author, etc. (optional).
Examples
Save dataset
>>> db.save_dataset([X_train, y_train, X_test, y_test], 'mnist', description='this is a tutorial')
Get dataset
>>> dataset = db.find_top_dataset('mnist')
Returns True if saving succeeded, otherwise False.
Return type boolean
save_model(network=None, model_name=’model’, **kwargs)
Save the model architecture and parameters into the database; a timestamp will be added automatically.
Parameters
• network (TensorLayer layer) – TensorLayer layer instance.
• model_name (str) – The name/key of model.
• kwargs (other events) – Other events, such as name, accuracy, loss, step number,
etc. (optional).
Examples
Save model architecture and parameters into database.
>>> db.save_model(net, accuracy=0.8, loss=2.3, name='second_model')
Load one model with parameters from database (run this in another script).
>>> net = db.find_top_model(sess=sess, accuracy=0.8, loss=2.3)
Find and load the latest model.
>>> net = db.find_top_model(sess=sess, sort=[("time", pymongo.DESCENDING)])
>>> net = db.find_top_model(sess=sess, sort=[("time", -1)])
Find and load the oldest model.
>>> net = db.find_top_model(sess=sess, sort=[("time", pymongo.ASCENDING)])
>>> net = db.find_top_model(sess=sess, sort=[("time", 1)])
Get model information.
>>> net._accuracy
... 0.8
Returns True for success, False for failure.
Return type boolean
save_testing_log(**kwargs)
Saves the testing log, timestamp will be added automatically.
Parameters kwargs (logging information) – Events, such as accuracy, loss, step number,
etc.
Examples
save_training_log(**kwargs)
Saves the training log, timestamp will be added automatically.
Parameters kwargs (logging information) – Events, such as accuracy, loss, step number,
etc.
Examples
save_validation_log(**kwargs)
Saves the validation log, timestamp will be added automatically.
Parameters kwargs (logging information) – Events, such as accuracy, loss, step number,
etc.
Examples
Command-line Reference
The tensorlayer.cli module provides a command-line tool for some common tasks.
3.1.1 tl train
Usage
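A typical invocation might look like the following; this is a sketch, with example.py standing in for your trainer script and the flags corresponding to the Notes below:

tl train example.py           # train with default settings
tl train -p 2 example.py      # use 2 parameter servers
tl train -c 4 example.py      # CPU-only parallel training with 4 trainers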
Command-line Arguments
Notes
A parallel training program would require multiple parameter servers to help parallel trainers to exchange intermediate
gradients. The best number of parameter servers is often proportional to the size of your model as well as the number
of CPUs available. You can control the number of parameter servers using the -p parameter.
If you have a single computer with massive CPUs, you can use the -c parameter to enable CPU-only parallel training.
The reason we are not supporting GPU-CPU co-training is because GPU and CPU are running at different speeds.
Using them together in training would incur stragglers.
Python Module Index
tensorlayer.activation, 37
tensorlayer.array_ops, 42
tensorlayer.cli, 243
tensorlayer.cli.train, 243
tensorlayer.cost, 45
tensorlayer.db, 237
tensorlayer.distributed, 92
tensorlayer.files, 94
tensorlayer.iterate, 112
tensorlayer.layers, 115
tensorlayer.models, 201
tensorlayer.nlp, 205
tensorlayer.optimizers, 218
tensorlayer.prepro, 52
tensorlayer.rein, 219
tensorlayer.utils, 221
tensorlayer.visualize, 227