Tensorflow Object Detection Api Tutorial PDF
Lyudmil Vladimirov
1 Installation
1.1 General Remarks
1.2 Install Anaconda Python 3.6 (Optional)
1.3 TensorFlow Installation
1.3.1 TensorFlow CPU
1.3.1.1 Create a new Conda virtual environment (Optional)
1.3.1.2 Install TensorFlow CPU for Python
1.3.1.3 Test your Installation
1.3.2 TensorFlow GPU
1.3.2.1 Install CUDA Toolkit
1.3.2.2 Install CUDNN
1.3.2.3 Set Your Environment Variables
1.3.2.4 Update your GPU drivers
1.3.2.5 Create a new Conda virtual environment
1.3.2.6 Install TensorFlow GPU for Python
1.3.2.7 Test your Installation
1.4 TensorFlow Models Installation
1.4.1 Install Prerequisites
1.4.2 Downloading the TensorFlow Models
1.4.3 Adding necessary Environment Variables
1.4.4 Protobuf Installation/Compilation
1.4.5 Test your Installation
1.5 LabelImg Installation
1.5.1 Create a new Conda virtual environment
1.5.2 Downloading labelImg
1.5.3 Installing dependencies and compiling package
1.5.4 Test your installation
3.5 Configuring a Training Pipeline
3.6 Training the Model
3.7 Monitor Training Job Progress using TensorBoard
3.8 Exporting a Trained Inference Graph
4 Common issues
4.1 Python crashes - TensorFlow GPU
4.2 Cleaning up Nvidia containers (TensorFlow GPU)
4.3 labelImg saves annotation files with .xml.xml extension
TensorFlow setup Documentation
This is a step-by-step tutorial/guide to setting up and using TensorFlow’s Object Detection API to perform
object detection in images/video.
The software tools which we shall use throughout this tutorial are listed in the table below:
Note: Although this tutorial is based on Windows 10, most steps (excluding the setting of environment variables) should
also apply to Ubuntu 16.04.
CHAPTER 1
Installation
• There are two different variations of TensorFlow that you might wish to install, depending on whether you would
like TensorFlow to run on your CPU or GPU, namely TensorFlow CPU and TensorFlow GPU. I will proceed
to document both and you can choose which one you wish to install.
• If you wish to install both TensorFlow variants on your machine, ideally you should install each variant under a
different (virtual) environment. If you attempt to install both TensorFlow CPU and TensorFlow GPU, without
making use of virtual environments, you will either end up failing, or when we later start running code there
will always be an uncertainty as to which variant is being used to execute your code.
• To ensure that we have no package conflicts and/or that we can install several different versions/variants of
TensorFlow (e.g. CPU and GPU), it is generally recommended to use a virtual environment of some sort. For
the purposes of this tutorial we will be creating and managing our virtual environments using Anaconda, but
you are welcome to use the virtual environment manager of your choice (e.g. virtualenv).
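To remove the uncertainty mentioned above, you can ask TensorFlow which devices it can see from inside the active environment. The following is only a minimal sketch: the helper name visible_devices is made up for illustration, while device_lib.list_local_devices() is TensorFlow's own utility. If TensorFlow is not installed in the current environment, the helper simply returns an empty list.

```python
def visible_devices():
    """Return the names of the devices TensorFlow reports for this environment.

    Returns an empty list if TensorFlow cannot be imported here.
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return []
    return [device.name for device in device_lib.list_local_devices()]


# A GPU entry in the printed list suggests the GPU variant is the one in use.
print(visible_devices())
```

Running this once in each environment should make it clear which variant that environment provides.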
Although having Anaconda is not a requirement in order to install and use TensorFlow, I suggest doing so, due to its
intuitive way of managing packages and setting up new virtual environments. Anaconda is a pretty useful tool, not
only for working with TensorFlow, but in general for anyone working in Python, so if you haven’t had a chance to
work with it, now is a good time to start.
• Go to https://www.anaconda.com/download/
• Download Anaconda Python 3.6 version
• If disk space is an issue for your machine, you could install the minified version of Anaconda (i.e. Miniconda).
• When prompted for a “Destination Folder” you can choose whichever you wish, but I generally tend to use
C:\Anaconda3, to keep things simple. Putting Anaconda under C:\Anaconda3 also ensures that you
don’t get the awkward `Destination Folder` contains spaces warning.
As mentioned in the Remarks section, there exist two generic variants of TensorFlow, which utilise different hardware
on your computer to run their computationally heavy Machine Learning algorithms. The simplest to install, but also in
most cases the slowest in terms of performance, is TensorFlow CPU, which runs directly on the CPU of your machine.
Alternatively, if you own a (compatible) Nvidia graphics card, you can take advantage of the available CUDA cores to
speed up the computations performed by TensorFlow, in which case you should follow the guidelines for installing
TensorFlow GPU.
Getting set up with an installation of TensorFlow CPU can be done in 3 simple steps.
• The above will create a new virtual environment with name tensorflow_cpu
• Now let’s activate the newly created virtual environment by running the following in the Anaconda Prompt
window:
activate tensorflow_cpu
Once you have activated your virtual environment, the name of the environment should be displayed within brackets
at the beginning of your cmd path specifier, e.g.:
(tensorflow_cpu) C:\Users\sglvladi>
• Open a new Anaconda/Command Prompt window and activate the tensorflow_cpu environment (if you have not
done so already)
• Once open, type the following on the command line:
• Open a new Anaconda/Command Prompt window and activate the tensorflow_cpu environment (if you have not
done so already)
• Start a new Python interpreter session by running:
python
• If the above code shows an error, then check to make sure you have activated the tensorflow_cpu environment
and that TensorFlow CPU was successfully installed within it in the previous step.
• Then run the following:
• Once the above is run, if you see a print-out similar (but not identical) to the one below, it means that you could
benefit from installing TensorFlow by building the sources that correspond to your specific CPU. Everything
should still run as normal, just slower than if you had built TensorFlow from source.
˓→Your CPU supports instructions that this TensorFlow binary was not
• Finally, for the sake of completing the test as described by TensorFlow themselves (see here), let’s run the
following:
>>> print(sess.run(hello))
b'Hello, TensorFlow!'
The installation of TensorFlow GPU is slightly more involved than that of TensorFlow CPU, mainly due to the need
to install the relevant graphics and CUDA drivers. There’s a nice Youtube tutorial (see here), explaining how to
install TensorFlow GPU. Although it describes different versions of the relevant components (including TensorFlow
itself), the installation steps are generally the same as in this tutorial.
Before proceeding to install TensorFlow GPU, you need to make sure that your system can satisfy the following
requirements:
Prerequisites
Nvidia GPU (GTX 650 or newer)
CUDA Toolkit v9.0
CuDNN v7.0.5
Anaconda with Python 3.6 (Optional)
• Go to https://developer.nvidia.com/rdp/cudnn-download
• Create a user profile if needed and log in
• Select cuDNN v7.0.5 (Feb 28, 2018), for CUDA 9.0
• Go to http://www.nvidia.com/Download/index.aspx
• Select your GPU version to download
• Install the driver
• The above will create a new virtual environment with name tensorflow_gpu
• Now let’s activate the newly created virtual environment by running the following in the Anaconda Prompt
window:
activate tensorflow_gpu
Once you have activated your virtual environment, the name of the environment should be displayed within brackets
at the beginning of your cmd path specifier, e.g.:
(tensorflow_gpu) C:\Users\sglvladi>
• Open a new Anaconda/Command Prompt window and activate the tensorflow_gpu environment (if you have not
done so already)
• Once open, type the following on the command line:
• Open a new Anaconda/Command Prompt window and activate the tensorflow_gpu environment (if you have not
done so already)
• Start a new Python interpreter session by running:
python
• If the above code shows an error, then check to make sure you have activated the tensorflow_gpu environment
and that tensorflow_gpu was successfully installed within it in the previous step.
• Then run the following:
• Once the above is run, you should see a print-out similar (but not identical) to the one below:
˓→GeForce GTX 770, pci bus id: 0000:02:00.0, compute capability: 3.0)
• Finally, for the sake of completing the test as described by TensorFlow themselves (see here), let’s run the
following:
>>> print(sess.run(hello))
b'Hello, TensorFlow!'
Now that you have installed TensorFlow, it is time to install the models used by TensorFlow to do its magic.
Building on the assumption that you have just created your new virtual environment (whether that’s
tensorflow_cpu, tensorflow_gpu or whatever other name you might have used), there are some packages which need to
be installed before installing the models.
Prerequisite packages
Name          Tutorial version-build
pillow        5.0.0-py36h0738816_0
lxml          4.2.0-py36heafd4d3_0
jupyter       1.0.0-py36_4
matplotlib    2.2.2-py36h153e9ff_0
opencv        3.3.1-py36h20b85fd_1
where <package_name> can be replaced with the name of the package, and optionally the package version can be
specified by adding the optional specifier =<version> after <package_name>.
Alternatively, if you don’t want to use Anaconda you can install the packages using pip:
• Create a new folder under a path of your choice and name it TensorFlow. (e.g.
C:\Users\sglvladi\Documents\TensorFlow).
• From your Anaconda/Command Prompt cd into the TensorFlow directory.
• To download the models you can either use Git to clone the TensorFlow Models repo inside the TensorFlow
folder, or you can simply download it as a ZIP and extract its contents inside the TensorFlow folder. To
keep things consistent, in the latter case you will have to rename the extracted folder models-master to
models. [1]
• You should now have a single folder named models under your TensorFlow folder, which contains another
4 folders as such:
TensorFlow
    models
        official
        research
        samples
        tutorials
[1] The latest repo commit when writing this tutorial is da903e0.
Since a lot of the scripts we will use require packages from TensorFlow\models\research\object_detection
to be run, I have found that it’s convenient to add this folder to our environment variables.
For Linux users, this can be done by either adding to ~/.bashrc or running the following code:
export PYTHONPATH=$PYTHONPATH:<PATH_TO_TF>/TensorFlow/models/research/object_detection
For Windows users, the following folder must be added to your Path environment variable (See Set Your Environment
Variables):
• <PATH_TO_TF>\TensorFlow\models\research\object_detection
For whatever reason, some of the TensorFlow packages that we will need for object detection do not come
pre-installed with our TensorFlow installation.
For Linux users ONLY, the Installation docs suggest that you either run, or add to ~/.bashrc file, the following
command, which adds these packages to your PYTHONPATH:
# From tensorflow/models/research/
export PYTHONPATH=$PYTHONPATH:`pwd`:`pwd`/slim
For Windows, the approach that I found works best is to simply add the following folders to your Path environment
variable (See also Set Your Environment Variables):
• <PATH_TO_TF>\TensorFlow\models\research\slim
• <PATH_TO_TF>\TensorFlow\models\research\slim\datasets
• <PATH_TO_TF>\TensorFlow\models\research\slim\deployment
• <PATH_TO_TF>\TensorFlow\models\research\slim\nets
• <PATH_TO_TF>\TensorFlow\models\research\slim\preprocessing
• <PATH_TO_TF>\TensorFlow\models\research\slim\scripts
where <PATH_TO_TF> stands for the absolute path to the folder that contains your TensorFlow folder (e.g. <PATH_TO_TF> =
C:\Users\sglvladi\Documents if TensorFlow resides within your Documents folder).
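Since changes to environment variables only show up in a newly opened prompt, it can be handy to check which of the folders listed above are actually visible. The following is a minimal sketch using only the Python standard library; the helper name missing_entries is made up for illustration, and the folder location is just the example path used in this tutorial, so adjust it to your own <PATH_TO_TF>.

```python
import os


def missing_entries(required, variable_value, sep=os.pathsep):
    """Return the folders in `required` that are absent from a PATH-style string."""
    present = {os.path.normcase(os.path.normpath(p))
               for p in variable_value.split(sep) if p}
    return [p for p in required
            if os.path.normcase(os.path.normpath(p)) not in present]


# Example location only; replace with your own <PATH_TO_TF>.
research = os.path.join("C:\\Users\\sglvladi\\Documents",
                        "TensorFlow", "models", "research")
required = [
    os.path.join(research, "object_detection"),
    os.path.join(research, "slim"),
]

# Look in both variables mentioned above: PYTHONPATH (Linux) and Path (Windows).
current = os.pathsep.join([os.environ.get("PYTHONPATH", ""),
                           os.environ.get("PATH", "")])
for folder in missing_entries(required, current):
    print("Not found in PYTHONPATH/Path:", folder)
```

If nothing is printed, all the listed folders were found in the current session.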
The TensorFlow Object Detection API uses Protobufs to configure model and training parameters. Before the
framework can be used, the Protobuf libraries must be downloaded and compiled.
This should be done as follows:
• Head to the protoc releases page
• Download the latest *-win32.zip release (e.g. protoc-3.5.1-win32.zip)
• Create a folder in C:\Program Files and name it Google Protobuf.
• Extract the contents of the downloaded *-win32.zip, inside C:\Program Files\Google Protobuf
• Add C:\Program Files\Google Protobuf\bin to your Path environment variable (see Set Your
Environment Variables)
• In a new Anaconda/Command Prompt [2], cd into TensorFlow/models/research/ directory and run the
following command:
# From TensorFlow/models/research/
protoc object_detection/protos/*.proto --python_out=.
If you are on Windows and using Protobuf 3.5 or later, the wildcard will not work and you have to run this in the
command prompt instead:
# From TensorFlow/models/research/
for /f %i in ('dir /b object_detection\protos\*.proto') do protoc object_detection\protos\%i --python_out=.
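If you would rather not depend on the shell's wildcard handling at all, the same per-file loop can be written in Python. This is only a sketch under a few assumptions: protoc is already reachable through your Path, the function names protoc_commands and compile_protos are made up for illustration, and the folder you pass in is your own TensorFlow/models/research directory.

```python
import glob
import os
import subprocess


def protoc_commands(research_dir):
    """Build one protoc command per .proto file, mirroring the loop above."""
    pattern = os.path.join(research_dir, "object_detection", "protos", "*.proto")
    commands = []
    for proto in sorted(glob.glob(pattern)):
        # protoc is run from research_dir, so pass paths relative to it.
        relative = os.path.relpath(proto, research_dir)
        commands.append(["protoc", relative, "--python_out=."])
    return commands


def compile_protos(research_dir):
    """Run every command from inside research_dir; raises if protoc fails."""
    for command in protoc_commands(research_dir):
        subprocess.check_call(command, cwd=research_dir)
```

Calling compile_protos with the path to your research folder should then have the same effect as the command-prompt loop, one file at a time.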
• Open a new Anaconda/Command Prompt window and activate the tensorflow_gpu environment (if you have not
done so already)
• cd into TensorFlow\models\research\object_detection and run the following command:
# From TensorFlow/models/research/object_detection
jupyter notebook
• This should start a new jupyter notebook server on your machine and you should be redirected to a new
tab of your default browser.
• Once there, simply follow sentdex’s Youtube video to ensure that everything is running smoothly.
• If, when you try to run In [11]:, Python crashes, have a look at the Anaconda/Command Prompt window
you used to run the jupyter notebook service and check for a line similar (maybe identical) to the one
below:
˓→runtime CuDNN library: 7101 (compatibility version 7100) but source was
• If the above line is present in the printed debugging output, it means that you have not installed the correct version of
the cuDNN libraries. In this case, re-do the Install CUDNN step, making sure you install cuDNN
v7.0.5.
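The numbers in that log line are cuDNN version codes rather than plain version strings. As a rough aid for reading them, the sketch below assumes the encoding used by the cuDNN 7.x headers (major * 1000 + minor * 100 + patch level); decode_cudnn_version is a made-up helper name.

```python
def decode_cudnn_version(code):
    """Turn a cuDNN 7.x version code such as 7101 into a dotted version string."""
    major, rest = divmod(code, 1000)
    minor, patch = divmod(rest, 100)
    return "%d.%d.%d" % (major, minor, patch)


print(decode_cudnn_version(7101))  # 7.1.1
print(decode_cudnn_version(7005))  # 7.0.5
```

Under that assumption, a runtime code of 7101 means cuDNN 7.1.1 is installed, whereas this tutorial expects v7.0.5.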
To deal with the fact that labelImg (on Windows) requires the use of pyqt4, while tensorflow 1.6 (and
possibly other packages) require pyqt5, we will create a new virtual environment in which to run labelImg.
[2] NOTE: You MUST open a new Anaconda/Command Prompt for the changes in the environment variables to take effect.
• The above will create a new virtual environment with name labelImg
• Now let’s activate the newly created virtual environment by running the following in the Anaconda Prompt
window:
activate labelImg
Once you have activated your virtual environment, the name of the environment should be displayed within brackets
at the beginning of your cmd path specifier, e.g.:
(labelImg) C:\Users\sglvladi>
• Inside your TensorFlow folder, create a new directory, name it addons and then cd into it.
• To download the package you can either use Git to clone the labelImg repo inside the TensorFlow\addons
folder, or you can simply download it as a ZIP and extract its contents inside the TensorFlow\addons
folder. To keep things consistent, in the latter case you will have to rename the extracted folder
labelImg-master to labelImg.
• You should now have a folder named addons\labelImg under your TensorFlow folder, so that the
directory tree looks as follows:
TensorFlow
    addons
        labelImg
    models
        official
        research
        samples
        tutorials
• Open a new Anaconda/Command Prompt window and activate the labelImg environment (if you have not
done so already)
• cd into TensorFlow\addons\labelImg and run the following commands:
• Open a new Anaconda/Command Prompt window and activate the labelImg environment (if you have not
done so already)
• cd into TensorFlow\addons\labelImg and run the following command:
python labelImg.py
# or
python labelImg.py [IMAGE_PATH] [PRE-DEFINED CLASS FILE]
CHAPTER 2
Below is an example that uses your camera to generate a video stream, on which you
can then perform object detection.
import numpy as np
import os
import six.moves.urllib as urllib
import sys
import tarfile
import tensorflow as tf
import zipfile
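import cv2
# Requires TensorFlow/models/research to be importable (e.g. via PYTHONPATH)
from object_detection.utils import label_map_util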
MODEL_NAME = 'ssd_inception_v2_coco_2017_11_17'
MODEL_FILE = MODEL_NAME + '.tar.gz'
DOWNLOAD_BASE = 'http://download.tensorflow.org/models/object_detection/'
# Path to frozen detection graph. This is the actual model that is used for the object detection.
# List of the strings that is used to add correct label for each box.
# Download Model
opener = urllib.request.URLopener()
opener.retrieve(DOWNLOAD_BASE + MODEL_FILE, MODEL_FILE)
tar_file = tarfile.open(MODEL_FILE)
for file in tar_file.getmembers():
    file_name = os.path.basename(file.name)
    if 'frozen_inference_graph.pb' in file_name:
        tar_file.extract(file, os.getcwd())

label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(
    label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

# Helper code
def load_image_into_numpy_array(image):
    (im_width, im_height) = image.size
    return np.array(image.getdata()).reshape(
        (im_height, im_width, 3)).astype(np.uint8)

# Detection
with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        while True:
            # Display output
            cv2.imshow('object detection', cv2.resize(image_np, (800, 600)))
If you have followed the tutorial, you should by now have a folder TensorFlow, placed under <PATH_TO_TF>
(e.g. C:\Users\sglvladi\Documents), with the following directory tree:
TensorFlow
    addons
        labelImg
    models
        official
        research
        samples
        tutorials
Now create a new folder under TensorFlow and call it workspace. It is within the workspace that we will store
all our training set-ups. Now let’s go under workspace and create another folder named training_demo. Now our
directory structure should be as so:
TensorFlow
    addons
        labelImg
    models
        official
        research
        samples
        tutorials
    workspace
        training_demo
The training_demo folder shall be our training folder, which will contain all files related to our model training. It
is advisable to create a separate training folder each time we wish to train a different model. The typical structure for
training folders is shown below.
training_demo
    annotations
    images
        test
        train
    pre-trained-model
    training
    README.md
Here’s an explanation for each of the folders/files shown in the above tree:
• annotations: This folder will be used to store all *.csv files and the respective TensorFlow *.record
files, which contain the list of annotations for our dataset images.
• images: This folder contains a copy of all the images in our dataset, as well as the respective *.xml files
produced for each one, once labelImg is used to annotate objects.
– images\train: This folder contains a copy of all images, and the respective *.xml files, which will
be used to train our model.
– images\test: This folder contains a copy of all images, and the respective *.xml files, which will be
used to test our model.
• pre-trained-model: This folder will contain the pre-trained model of our choice, which shall be used as
a starting checkpoint for our training job.
• training: This folder will contain the training pipeline configuration file *.config, as well as a *.pbtxt
label map file and all files generated during the training of our model.
• README.md: This is an optional file which provides some general information regarding the training conditions
of our model. It is not used by TensorFlow in any way, but it generally helps when you have a few training folders
and/or you are revisiting a trained model after some time.
If you do not understand most of the things mentioned above, no need to worry, as we’ll see how all the files are
generated further down.
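If you expect to create several training folders, the skeleton described above can also be generated with a few lines of Python. This is just a sketch: create_training_folder is a made-up helper name, and the optional README.md is left for you to add by hand.

```python
import os


def create_training_folder(workspace, name="training_demo"):
    """Create the folder skeleton described above and return its root path."""
    root = os.path.join(workspace, name)
    for sub in ("annotations",
                os.path.join("images", "test"),
                os.path.join("images", "train"),
                "pre-trained-model",
                "training"):
        # exist_ok=True makes the call safe to repeat on an existing folder.
        os.makedirs(os.path.join(root, sub), exist_ok=True)
    return root
```

For instance, calling create_training_folder with the path of your TensorFlow\workspace folder would produce the training_demo tree shown earlier.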
To annotate images we will be using the labelImg package. If you haven’t installed the package yet, then have a look
at LabelImg Installation.
• Once you have collected all the images to be used to train and test your model (ideally more than 100 per class), place
them inside the folder training_demo\images.
• Open a new Anaconda/Command Prompt window and cd into TensorFlow\addons\labelImg.
• If (as suggested in LabelImg Installation) you created a separate Conda environment for labelImg then go
ahead and activate it by running:
activate labelImg
• A File Explorer dialog window should open, which points to the training_demo\images folder.
• Press the “Select Folder” button, to start annotating your images.
Once open, you should see a window similar to the one below:
I won’t be covering how to use labelImg here, but you can have a look at labelImg’s repo for more
details. A nice Youtube video demonstrating how to use labelImg is also available here. What is important is that
once you annotate all your images, a set of new *.xml files, one for each image, should be generated inside your
training_demo\images folder.
Once you have finished annotating your image dataset, it is a general convention to use only part of it for training, and
to keep the rest for testing purposes. Typically, the ratio is 90%/10%, i.e. 90% of the images are used for training and
the remaining 10% is kept for testing, but you can choose whatever ratio suits your needs.
Once you have decided how you will be splitting your dataset, copy all training images, together with their
corresponding *.xml files, and place them inside the training_demo\images\train folder. Similarly, copy all
testing images, with their *.xml files, and paste them inside training_demo\images\test.
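Copying the files by hand gets tedious for larger datasets, so here is a minimal sketch of how the split could be automated. It makes a few assumptions: all images and their *.xml files sit directly inside one folder, the train and test subfolders live under that same folder, and images are recognised by the extensions listed in the code. The function name split_dataset and its parameters are made up for illustration.

```python
import os
import random
import shutil


def split_dataset(images_dir, test_ratio=0.1, seed=0,
                  extensions=(".jpg", ".jpeg", ".png")):
    """Copy images (and their .xml annotations, if present) into train/ and test/."""
    images = sorted(f for f in os.listdir(images_dir)
                    if f.lower().endswith(extensions))
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(images)
    n_test = int(round(len(images) * test_ratio))
    subsets = {"test": images[:n_test], "train": images[n_test:]}
    for subset, files in subsets.items():
        target = os.path.join(images_dir, subset)
        os.makedirs(target, exist_ok=True)
        for image in files:
            shutil.copy(os.path.join(images_dir, image), target)
            xml = os.path.splitext(image)[0] + ".xml"
            if os.path.exists(os.path.join(images_dir, xml)):
                shutil.copy(os.path.join(images_dir, xml), target)
    return subsets
```

With the default test_ratio of 0.1 this gives the 90%/10% split mentioned above; the files are copied rather than moved, so the originals stay in place.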
TensorFlow requires a label map, which maps each of the used labels to an integer value. This label map is
used by both the training and detection processes.
Below I show an example label map (e.g. label_map.pbtxt), assuming that our dataset contains 2 labels, dogs
and cats:
item {
    id: 1
    name: 'cat'
}
item {
    id: 2
    name: 'dog'
}
Label map files have the extension .pbtxt and should be placed inside the training_demo\annotations
folder.
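For datasets with more than a couple of classes, typing the label map by hand is error-prone, so it may be easier to generate it. The following is a small sketch that produces text in the same layout as the example above, with ids starting at 1 as in that example; label_map_text is a made-up helper name.

```python
def label_map_text(labels):
    """Build label map text with one item per label; ids start at 1."""
    items = []
    for index, name in enumerate(labels, start=1):
        items.append("item {\n    id: %d\n    name: '%s'\n}\n" % (index, name))
    return "\n".join(items)


print(label_map_text(["cat", "dog"]))
```

The returned string can then be written to a file such as label_map.pbtxt.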
Now that we have generated our annotations and split our dataset into the desired training and testing subsets, it is
time to convert our annotations into the so-called TFRecord format.
There are two steps in doing so:
• Converting the individual *.xml files to a unified *.csv file for each dataset.
• Converting the *.csv files of each dataset to *.record files (TFRecord format).
Before we proceed to describe the above steps, let’s create a directory where we can store some scripts. Under the
TensorFlow folder, create a new folder TensorFlow\scripts, which we can use to store some useful scripts.
To make things even tidier, let’s create a new folder TensorFlow\scripts\preprocessing, where we shall
store scripts that we can use to preprocess our training inputs. Below is our TensorFlow directory tree structure, up
to now:
TensorFlow
    addons
        labelImg
    models
        official
        research
        samples
        tutorials
    scripts
        preprocessing
    workspace
        training_demo
To do this we can write a simple script that iterates through all *.xml files in the
training_demo\images\train and training_demo\images\test folders, and generates a *.csv for
each of the two.
Here is an example script that allows us to do just that:
"""
Usage:
# Create train data:
python xml_to_csv.py -i [PATH_TO_IMAGES_FOLDER]/train -o [PATH_TO_ANNOTATIONS_FOLDER]/train_labels.csv
"""
import os
import glob
import argparse
import xml.etree.ElementTree as ET

import pandas as pd


def xml_to_csv(path):
    """Iterates through all .xml files (generated by labelImg) in a given directory
    and combines them in a single Pandas dataframe.

    Parameters:
    ----------
    path : {str}
        The path containing the .xml files
    Returns
    -------
    Pandas DataFrame
        The produced dataframe
    """
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            # Fields are read by position, following the element order of the .xml files
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height',
                   'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df


def main():
    # Initiate argument parser
    parser = argparse.ArgumentParser(
        description="Sample TensorFlow XML-to-CSV converter")
    parser.add_argument("-i",
                        "--inputDir",
                        help="Path to the folder where the input .xml files are stored",
                        type=str)
    parser.add_argument("-o",
                        "--outputFile",
                        help="Name of output .csv file (including path)", type=str)
    args = parser.parse_args()

    # Default to the current directory when no input folder is given
    if args.inputDir is None:
        args.inputDir = os.getcwd()

    assert os.path.isdir(args.inputDir)

    xml_df = xml_to_csv(args.inputDir)
    xml_df.to_csv(
        args.outputFile, index=None)
    print('Successfully converted xml to csv.')


if __name__ == '__main__':
    main()
• Create a new file with name xml_to_csv.py under TensorFlow\scripts\preprocessing, open it,
paste the above code inside it and save.
• Install the pandas package:
# For example
# python xml_to_csv.py -i C:\Users\sglvladi\Documents\TensorFlow\workspace\training_demo\images\train -o C:\Users\sglvladi\Documents\TensorFlow\workspace\training_demo\annotations\train_labels.csv
# python xml_to_csv.py -i C:\Users\sglvladi\Documents\TensorFlow\workspace\training_demo\images\test -o C:\Users\sglvladi\Documents\TensorFlow\workspace\training_demo\annotations\test_labels.csv
Once the above is done, there should be 2 new files under the training_demo\annotations folder, named
test_labels.csv and train_labels.csv, respectively.
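The script above reads each field by its position inside the *.xml file (member[0], member[4][0] and so on), which relies on the element order that labelImg writes. As a sketch of a variant that is less sensitive to that order, the fields can be looked up by tag name instead. The sample annotation below is made up for illustration and follows the Pascal VOC layout; rows_from_annotation is a hypothetical helper, not part of the tutorial's script.

```python
import xml.etree.ElementTree as ET

# A small annotation in the Pascal VOC layout (all values are made up).
SAMPLE = """<annotation>
  <filename>cat1.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""


def rows_from_annotation(root):
    """Yield (filename, width, height, class, xmin, ymin, xmax, ymax) per object."""
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        yield (filename, width, height, obj.findtext("name"),
               int(box.findtext("xmin")), int(box.findtext("ymin")),
               int(box.findtext("xmax")), int(box.findtext("ymax")))


print(list(rows_from_annotation(ET.fromstring(SAMPLE))))
# [('cat1.jpg', 640, 480, 'cat', 10, 20, 110, 220)]
```

The tuples come out in the same column order as the *.csv files, so they could be fed to the same DataFrame construction.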
Now that we have obtained our *.csv annotation files, we will need to convert them into TFRecords. Below is an
example script that allows us to do just that:
"""
Usage:
"""
import os
import io
import pandas as pd
import tensorflow as tf

# Requires TensorFlow/models/research to be importable (e.g. via PYTHONPATH)
from object_detection.utils import dataset_util

flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
flags.DEFINE_string('label', '', 'Name of class label')
# Declared so that the --img_path option used in the commands below is accepted
flags.DEFINE_string('img_path', '', 'Path to images')
FLAGS = flags.FLAGS

    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []

    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example


def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    # Fall back to ./images when --img_path is not given
    path = FLAGS.img_path or os.path.join(os.getcwd(), 'images')
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))


if __name__ == '__main__':
    tf.app.run()
--img_path=<PATH_TO_IMAGES_FOLDER> --output_path=<PATH_TO_ANNOTATIONS_FOLDER>/train.record
--img_path=<PATH_TO_IMAGES_FOLDER> --output_path=<PATH_TO_ANNOTATIONS_FOLDER>/test.record
# For example
# python generate_tfrecord.py --label=ship --csv_input=C:\Users\sglvladi\Documents\TensorFlow\workspace\training_demo\annotations\train_labels.csv --output_path=C:\Users\sglvladi\Documents\TensorFlow\workspace\training_demo\annotations\train.record --img_path=C:\Users\sglvladi\Documents\TensorFlow\workspace\training_demo\images\train
# ...demo\annotations\test_labels.csv --output_path=C:\Users\sglvladi\Documents\TensorFlow\workspace\training_demo\annotations\test.record --img_path=C:\Users\sglvladi\Documents\TensorFlow\workspace\training_demo\images\test
Once the above is done, there should be 2 new files under the training_demo\annotations folder, named
test.record and train.record, respectively.
For the purposes of this tutorial we will not be creating a training job from scratch, but rather we will go through
how to reuse one of the pre-trained models provided by TensorFlow. If you would like to train an entirely new model,
you can have a look at TensorFlow’s tutorial.
The model we shall be using in our examples is the ssd_inception_v2_coco model, since it provides a relatively
good trade-off between detection performance and speed. There are, however, a number of other models you can use, all of
which are listed in TensorFlow’s detection model zoo. More information about the detection performance, as well as
reference times of execution, for each of the available pre-trained models can be found here.
First of all, we need to get ourselves the sample pipeline configuration file for the specific model we wish to re-
train. You can find the specific file for the model of your choice here. In our case, since we shall be using the
ssd_inception_v2_coco model, we shall be downloading the corresponding ssd_inception_v2_coco.config file.
Apart from the configuration file, we also need to download the latest pre-trained NN for the model we wish to use.
This can be done by simply clicking on the name of the desired model in the tables found in TensorFlow’s detection
model zoo. Clicking on the name of your model should initiate a download for a *.tar.gz file.
Once the *.tar.gz file has been downloaded, open it using a decompression program of your choice (e.g. 7zip,
WinZIP, etc.). Next, open the folder that you see when the compressed folder is opened (typically it will have the
same name as the compressed folder, without the *.tar.gz extension), and extract its contents inside the folder
training_demo\pre-trained-model.
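If you prefer to script this step, the extraction can also be done with Python's standard `tarfile` module. The following is a minimal sketch, not the tutorial's own method: the function name `extract_inner_folder` is hypothetical, and it assumes the archive contains a single top-level folder whose contents should end up directly inside the destination folder.

```python
import shutil
import tarfile
from pathlib import Path, PurePosixPath


def extract_inner_folder(archive_path, dest_dir):
    """Extract the files under the archive's top-level folder into dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = []
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive.getmembers():
            parts = PurePosixPath(member.name).parts
            # Skip the top-level folder entry and anything that is not a regular file.
            if len(parts) < 2 or not member.isfile():
                continue
            # Refuse paths that try to climb out of the destination folder.
            if ".." in parts:
                continue
            relative = PurePosixPath(*parts[1:])
            target = dest.joinpath(*relative.parts)
            target.parent.mkdir(parents=True, exist_ok=True)
            source = archive.extractfile(member)
            with source, open(target, "wb") as out:
                shutil.copyfileobj(source, out)
            extracted.append(str(relative))
    return sorted(extracted)
```

For example, `extract_inner_folder("downloaded_model.tar.gz", "pre-trained-model")`, run from inside training_demo, would place the checkpoint files directly under pre-trained-model. The archive name here is a placeholder for whatever file you downloaded.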
Now that we have downloaded and extracted our pre-trained model, let’s have a look at the changes that we shall need
to apply to the downloaded *.config file (the relevant lines are annotated with comments):
model {
ssd {
num_classes: 1 # Set this to the number of different label classes
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
}
}
similarity_calculator {
iou_similarity {
}
}
anchor_generator {
ssd_anchor_generator {
num_layers: 6
min_scale: 0.2
max_scale: 0.95
aspect_ratios: 1.0
aspect_ratios: 2.0
aspect_ratios: 0.5
aspect_ratios: 3.0
aspect_ratios: 0.3333
reduce_boxes_in_lowest_layer: true
}
}
image_resizer {
fixed_shape_resizer {
height: 300
width: 300
}
}
box_predictor {
convolutional_box_predictor {
min_depth: 0
max_depth: 0
num_layers_before_predictor: 0
use_dropout: false
dropout_keep_probability: 0.8
kernel_size: 3
box_code_size: 4
apply_sigmoid_to_scores: false
conv_hyperparams {
...
train_config: {
batch_size: 12 # Increase/decrease this value depending on the available memory (higher values require more memory and vice versa)
optimizer {
rms_prop_optimizer: {
learning_rate: {
exponential_decay_learning_rate {
initial_learning_rate: 0.004
decay_steps: 800720
decay_factor: 0.95
}
}
momentum_optimizer_value: 0.9
decay: 0.9
epsilon: 1.0
}
}
fine_tune_checkpoint: "pre-trained-model/model.ckpt" # Path to extracted files of pre-trained model
from_detection_checkpoint: true
# Note: The below line limits the training process to 200K steps, which we
# empirically found to be sufficient enough to train the pets dataset. This
# effectively bypasses the learning rate schedule (the learning rate will
# never decay). Remove the below line to train indefinitely.
num_steps: 200000
data_augmentation_options {
random_horizontal_flip {
}
}
data_augmentation_options {
ssd_random_crop {
}
}
}
train_input_reader: {
tf_record_input_reader {
input_path: "annotations/train.record" # Path to training TFRecord file
}
label_map_path: "annotations/label_map.pbtxt" # Path to label map file
...
eval_config: {
num_examples: 8000
# Note: The below line limits the evaluation process to 10 evaluations.
# Remove the below line to evaluate indefinitely.
max_evals: 10
}
eval_input_reader: {
tf_record_input_reader {
input_path: "annotations/test.record" # Path to testing TFRecord
}
label_map_path: "annotations/label_map.pbtxt" # Path to label map file
shuffle: false
num_readers: 1
}
Once the above changes have been applied to our config file, go ahead and save it under training_demo/training.
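The same edits can be applied programmatically if you need to repeat them. The sketch below treats the config as plain text and replaces the first occurrence of a key with a regular expression. It does not use the Object Detection API's own config tooling. `set_config_value` is a hypothetical helper, and the values in `sample` are illustrative placeholders rather than a copy of the real file. Because it only touches the first match, keys that appear twice (such as input_path and label_map_path) would need extra handling.

```python
import re


def set_config_value(config_text, key, value):
    """Replace the value of the first `key: value` entry found in config_text."""
    pattern = re.compile(
        rf'^(\s*{re.escape(key)}:\s*)("[^"]*"|[^\s#]+)', re.MULTILINE
    )
    # A function replacement avoids backslash-escaping problems in Windows paths.
    new_text, count = pattern.subn(lambda m: m.group(1) + value, config_text, count=1)
    if count == 0:
        raise KeyError(f"{key} not found in config")
    return new_text


if __name__ == "__main__":
    # Illustrative fragment only; the real file is much longer.
    sample = (
        "model {\n"
        "  ssd {\n"
        "    num_classes: 90 # Set this to the number of different label classes\n"
        "  }\n"
        "}\n"
        "train_config: {\n"
        "  batch_size: 24\n"
        '  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"\n'
        "}\n"
    )
    patched = set_config_value(sample, "num_classes", "1")
    patched = set_config_value(patched, "batch_size", "12")
    patched = set_config_value(
        patched, "fine_tune_checkpoint", '"pre-trained-model/model.ckpt"'
    )
    print(patched)
```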
Before we begin training our model, let’s go and copy the TensorFlow/models/research/object_detection/train.py
script and paste it straight into our training_demo folder. We will need this script in order to train our model.
Now, to initiate a new training job, cd inside the training_demo folder and type the following:
Once the training process has been initiated, you should see a series of print-outs similar to the one below (plus/minus
some warnings):
If you ARE NOT seeing a print-out similar to that shown above, and/or the training job crashes after a few seconds,
then have a look at the issues and proposed solutions under the Common issues section, to see if you can find a
solution. Alternatively, you can try the issues section of the official TensorFlow Models repo.
If you ARE observing a similar output to the above, then CONGRATULATIONS, you have successfully started your
first training job. Now you may very well treat yourself to a cold beer, as waiting on the training to finish is likely
to take a while. Following what people have said online, it seems advisable to let your model reach a TotalLoss
of 2 or lower (ideally 1 or lower) if you want to achieve “fair” detection results. Obviously, a lower TotalLoss is
better; however, a very low TotalLoss should be avoided, as the model may end up overfitting the dataset, meaning
that it will perform poorly when applied to images outside the dataset. To monitor TotalLoss, as well as a number
of other metrics, while your model is training, have a look at Monitor Training Job Progress using TensorBoard.
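The rule of thumb above can be written down as a simple check. The sketch below is not part of TensorFlow: `loss_has_reached_target` is a hypothetical helper, and it assumes you have collected recent TotalLoss values yourself (for example by copying them from the training print-outs). Averaging over a window is a choice made here to smooth out step-to-step noise, and the window size of 50 is arbitrary.

```python
def loss_has_reached_target(losses, target=2.0, window=50):
    """Return True when the mean of the last `window` losses is at or below target."""
    if len(losses) < window:
        # Not enough history yet to make a stable judgement.
        return False
    recent = losses[-window:]
    return sum(recent) / window <= target


if __name__ == "__main__":
    # Made-up values for illustration: loss drops from 5.0 to 1.5.
    history = [5.0] * 50 + [1.5] * 50
    print(loss_has_reached_target(history))
```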
Training times can be affected by a number of factors such as:
• The computational power of your hardware (either CPU or GPU): Obviously, the more powerful your PC
is, the faster the training process.
• Whether you are using the TensorFlow CPU or GPU variant: In general, even when compared to the best
CPUs, almost any GPU will yield much faster training and detection speeds. As a matter of fact, when I
first started I was running TensorFlow on my Intel i7-5930k (6/12 cores @ 4GHz, 32GB RAM) and was
getting step times of around 12 sec/step. After installing TensorFlow GPU and training the very same
model, using the same dataset and config files, on an EVGA GTX-770 (1536 CUDA-cores @ 1GHz, 2GB
VRAM), I was down to 0.9 sec/step! That is a roughly 13-fold increase in speed, using a “low/mid-end”
graphics card, when compared to a “mid/high-end” CPU.
• How big the dataset is: The higher the number of images in your dataset, the longer it will take for the
model to reach satisfactory levels of detection performance.
• The complexity of the objects you are trying to detect: Obviously, if your objective is to track a black ball
over a white background, the model will converge to satisfactory levels of detection pretty quickly. If on
the other hand, for example, you wish to detect ships in ports, using Pan-Tilt-Zoom cameras, then training
will be a much more challenging and time-consuming process, due to the high variability of the shape and
size of ships, combined with a highly dynamic background.
• And many, many, many more...
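To get a feel for what these step times mean in practice, the sketch below multiplies the step times quoted above by the num_steps value of 200000 from the config. It assumes the step time stays constant for the whole run, which is a simplification, and `estimated_training_hours` is a hypothetical helper.

```python
def estimated_training_hours(num_steps, seconds_per_step):
    """Total wall-clock training time in hours, assuming a constant step time."""
    return num_steps * seconds_per_step / 3600.0


if __name__ == "__main__":
    num_steps = 200000  # num_steps value from the config file
    for label, step_time in (("CPU", 12.0), ("GPU", 0.9)):
        hours = estimated_training_hours(num_steps, step_time)
        print(f"{label}: {hours:.0f} hours ({hours / 24:.1f} days)")
    # Ratio of the two quoted step times.
    print(f"Speed-up: {12.0 / 0.9:.1f}x")
```

Under these assumptions the full 200000 steps would take about 667 hours (roughly 28 days) at 12 sec/step, against about 50 hours at 0.9 sec/step.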
A very nice feature of TensorFlow is that it allows you to continuously monitor and visualise a number of different
training/detection performance metrics while your model is being trained. The specific tool that allows us to do all
that is TensorBoard.
To start a new TensorBoard server, follow these steps:
activate tensorflow_gpu
tensorboard --logdir=training\
The above command will start a new TensorBoard server, which (by default) listens on port 6006 of your machine.
Assuming that everything went well, you should see a print-out similar to the one below (plus/minus some warnings):
Once this is done, go to your browser and type http://YOUR-PC:6006 in your address bar, following which you
should be presented with a dashboard similar to the one shown below (maybe less populated if your model has just
started training):
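If the page does not load, it can help to first check whether anything is listening on the port at all. The following is a minimal sketch using only Python's standard `socket` module. `is_port_open` is a hypothetical helper, and `localhost` is assumed to be the machine on which TensorBoard was started.

```python
import socket


def is_port_open(host="localhost", port=6006, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on port 6006")
    else:
        print("Nothing is listening on port 6006; is TensorBoard running?")
```

A successful connection only shows that some process is using the port; it does not prove that the process is TensorBoard.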
Once your training job is complete, you need to extract the newly trained inference graph, which will later be used to
perform the object detection. This can be done as follows:
• Open a new Anaconda/Command Prompt
• Activate your TensorFlow conda environment (if you have one), e.g.:
activate tensorflow_gpu
Common issues
Below is a list of common issues encountered while using TensorFlow for object detection.
If you are using TensorFlow GPU and, a few seconds after you run a Python object detection script (e.g. Test your
Installation), Windows reports that Python has crashed, have a look at the Anaconda/Command Prompt window you
used to run the script and check for a line similar (maybe identical) to the one below:
˓→CuDNN library: 7101 (compatibility version 7100) but source was compiled
If the above line is present in the debugging output, it means that you have not installed the correct version of the
cuDNN libraries. In this case, re-do the Install CUDNN step, making sure you install cuDNN v7.0.5.
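The numbers in that message are integer-encoded cuDNN versions. The sketch below decodes them, assuming the cuDNN 7.x convention of major * 1000 + minor * 100 + patch level. Other major releases may use a different encoding, and `decode_cudnn_version` is a hypothetical helper.

```python
def decode_cudnn_version(number):
    """Split a cuDNN 7.x integer version into (major, minor, patch)."""
    major, rest = divmod(number, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


if __name__ == "__main__":
    for encoded in (7101, 7100, 7005):
        major, minor, patch = decode_cudnn_version(encoded)
        print(f"{encoded} -> v{major}.{minor}.{patch}")
```

Under this assumption, 7101 in the message corresponds to cuDNN v7.1.1 being loaded at runtime, whereas the v7.0.5 release that this tutorial asks for would be reported as 7005.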
Sometimes, when terminating a TensorFlow training process, the Nvidia containers associated with the process are not
cleanly terminated. This can lead to bogus errors when we try to run a new TensorFlow process.
Some known issues caused by the above are presented below:
• Failure to restart training of a model. Look for the following errors in the debugging output:
˓→Memcpy failed
To solve such issues in Windows, open a Task Manager window, look for tasks named NVIDIA Container
and kill them by selecting them and clicking the End Task button at the bottom left corner of the window.
If the issue persists, then you’re probably running out of memory. Try closing down anything else that might be eating
up your GPU memory (e.g. YouTube videos, webpages etc.).
At the time of writing this document, I haven’t managed to identify why this might be happening. I have joined a
GitHub issue, to which you can refer in case there are any updates.
One way I managed to fix the issue was by clicking on the “Change Save Dir” button and selecting the directory where
the annotation files should be stored. By doing so, you should no longer get a pop-up dialog when you click “Save”
(or Ctrl+s), but you can always check if the file was saved by looking at the bottom left corner of labelImg.