This document summarizes TensorFlow's APIs, beginning with an overview of the low-level API using computational graphs and sessions. It then discusses higher-level APIs like Keras, TensorFlow Datasets for input pipelines, and Estimators which hide graph and session details. Datasets improve training speed by up to 300% by enabling parallelism. Estimators resemble scikit-learn and separate model definition from training, making code more modular and reusable. The document provides examples of using Datasets and Estimators with TensorFlow.
Report
Share
Report
Share
1 of 23
Download to read offline
More Related Content
A Tour of Tensorflow's APIs
1. A Tour of TensorFlow’s APIs
Dean R. Wyatte
Google Developer Group Boulder
@drwyatte
August 21, 2018
3. About me
• Data Scientist/Machine Learning Engineer (prototype to production)
• TensorFlow user for ~2 years
• Recently transitioned to TensorFlow higher-level APIs
4. TensorFlow low-level API
• In the beginning, there was the computational graph
• Lazy evaluation + automatic differentiation + Google backing = ML Win
• The TensorFlow session is required for interacting with the graph
• Resource allocation, inter-device communication, etc.
• The entire system was flexible and powerful, but verbose and
complex, likely intimidating would-be users
7. TensorFlow warm-up:
Anatomy of a TensorFlow script
def yield_batches(data_dir):
# load/preprocess data
yield images, labels
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(100):
train_loss, test_loss, n_train, n_test = 0
for batch_images, batch_labels in yield_batches(train_data_dir):
l, _ = sess.run([loss, optimizer], feed_dict={images: batch_images, labels: batch_labels})
n_train += len(batch_images)
train_loss += l
for batch_images, batch_labels in yield_batches(test_data_dir):
l = sess.run(loss, feed_dict={images: batch_images, labels: batch_labels})
n_test += len(batch_images)
test_loss += l
print('Train loss: {} Test loss: {}'.format(train_loss/n_train, test_loss/n_test))
Training
loop
Data generator
8. On Keras
• High-level neural network API “for humans”
• Originally built on top of Theano, also
supports TensorFlow and CNTK backends
• Can now import directly from TensorFlow
Theano TensorFlow CNTK
10. TensorFlow higher-level APIs
• Try to satisfy simplicity/flexibility
tradeoff
• Provide low-level benefits while still
enabling user to focus on the model
• This talk primarily covers Datasets and
Estimators
• Rapid development over last ~1 year
• Mostly backward compatible across
releases, but conventions subject to change
11. Datasets
• Standard data loading and preprocessing blocks model
• High, consistent GPU utilization can vastly speed up training
def yield_batches(data_dir):
# load/preprocess data
yield images, labels
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
for epoch in range(100):
train_loss, test_loss, n_train, n_test = 0
for batch_images, batch_labels in yield_batches(data_dir):
l, _ = sess.run([loss, optimizer],
feed_dict={images: batch_images,
labels: batch_labels})
12. Datasets
• TensorFlow Datasets provide simple API for designing input pipelines
decoupled from model without having to write threaded data
generator
• Part of core API since TensorFlow 1.4 (November 2017)
• Functionality previously provided by QueueRunner, now deprecated
(incompatible with eager execution)
https://www.tensorflow.org/guide/datasets
13. Dataset example
def load_image(filename):
buffer = tf.read_file(filename)
image = tf.image.decode_png(buffer, channels=3)
image = tf.cast(image, tf.float32)
return image
def input_fn(data_dir, batch_size):
filenames = glob.glob(os.path.join(data_dir, '**', '*.png'), recursive=True)
classnames = [os.path.basename(os.path.dirname(filename)) for filename in filenames]
lookup = {v: k for k, v in enumerate(sorted(set(classnames)))}
labels = [lookup[classname] for classname in classnames]
dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
dataset = dataset.shuffle(buffer_size=1000000)
dataset = dataset.map(lambda filename, label: (load_image(filename), label))
dataset = dataset.batch(batch_size)
dataset = dataset.prefetch(1)
iterator = dataset.make_one_shot_iterator()
images, labels = iterator.get_next()
return images, labels
Ensure at least one batch enqueued
14. Dataset best practices
• In practice, Datasets can increase
training speed by 200-300%
• prefetch overlaps the work of
a producer and consumer
• map accepts a
num_parallel_calls
argument for producer
parallelism
https://www.tensorflow.org/performance/datasets_performance
15. Estimators
• Directly managing graph and session
can lead to boilerplate, inefficient
code
• See Keras, Datasets
• TensorFlow Estimators expose
interface that hides these details
while remaining flexible and efficient
• Other benefits: Distributed training,
pre-made estimators
TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks
https://arxiv.org/abs/1708.02637
16. Estimators
• Part of core API since TensorFlow 1.1 (April 2017), but still maturing
as TensorFlow evolves
• Resemble scikit-learn estimators (uniform methods, separation of
initialization and learning)
https://www.tensorflow.org/guide/estimators
17. Estimator example
def model_fn(features, labels, mode):
W_conv1 = tf.Variable(tf.random_normal([3, 3, 3, 64]))
b_conv1 = tf.Variable(tf.zeros([64]))
W_conv2 = tf.Variable(tf.random_normal([3, 3, 64, 128]))
b_conv2 = tf.Variable(tf.zeros([128]))
W_fc = tf.Variable(tf.random_normal([56*56*128, 4096]))
b_fc = tf.Variable(tf.zeros([4096]))
W_probs = tf.Variable(tf.random_normal([4096, 1000]))
b_probs = tf.Variable(tf.zeros([1000]))
conv1 = tf.nn.relu(tf.nn.conv2d(features, W_conv1, strides=[1, 1, 1, 1], padding='SAME') + b_conv1)
pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding=‘SAME')
conv2 = tf.nn.relu(tf.nn.conv2d(pool1, W_conv2, strides=[1, 1, 1, 1], padding='SAME') + b_conv2)
pool2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
fc = tf.nn.relu(tf.matmul(tf.reshape(pool2, [-1, 56*56*128]), W_fc) + b_fc)
probs = tf.nn.softmax(tf.matmul(fc, W_probs) + b_probs)
loss = tf.reduce_mean(-tf.reduce_sum(labels*tf.log(probs), axis=1))
optimizer = tf.train.GradientDescentOptimizer(0.01)
train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
estimator = tf.estimator.Estimator(model_fn=model_fn)
for epoch in range(100):
estimator.train(...)
Note that we no longer define placeholders.
Features (inputs) and labels are tensors
provided by caller
Estimators return an EstimatorSpec
with ops required for training,
evaluation, etc.
Manages session
18. Estimators and Datasets: Better together
• estimator.train|evaluate|predict are unary methods
accepting a function that returns two tensors, inputs and target
• Or a Dataset
def input_fn(data_dir, ...)
dataset = tf.data.Dataset(...)
dataset = dataset.map(...).batch(batch_size).prefetch(1)
return dataset
def model_fn(features, labels, mode):
model = ...
loss = ...
train_op = ...
return EstimatorSpec(mode=mode, loss=loss, train_op=train_op)
estimator = Estimator(model_fn=model_fn)
for epoch in range(100):
estimator.train(lambda: input_fn(train_data_dir, ...))
estimator.evaluate(lambda: input_fn(test_data_dir, ...))
20. Distributed training with Estimators and Datasets
• Early stage feature to abstract away details of distributed training
• https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/distribute
• DistributionStrategy specifies the nature of distribution
• Currently supports model-replication with synchronous updates (MirroredStrategy)
distribution = tf.contrib.distribute.MirroredStrategy(['/device:GPU:0', '/device:GPU:1'])
config = tf.estimator.RunConfig(train_distribute=distribution)
estimator = tf.estimator.Estimator(model_fn=model_fn, config=config)
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmp0ar31ejr', '_tf_random_seed': None,
'_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600,
'_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000,
'_log_step_count_steps': 100, '_train_distribute':
<tensorflow.contrib.distribute.python.mirrored_strategy.MirroredStrategy object at 0x7f161d4b7b00>,
'_device_fn': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec
object at 0x7f161d4b7c18>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master':
'', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Device is available but not used by distribute strategy: /device:CPU:0
INFO:tensorflow:Configured nccl all-reduce.
INFO:tensorflow:batch_all_reduce invoked for batches size = 14 with algorithm = nccl, num_packs = 1,
agg_small_grads_max_bytes = 0 and agg_small_grads_max_group = 10
21. TensorFlow Serving
• Set of APIs for defining how a model should handle
requests without requiring changes to server
architecture
• tensorflow-model-server is a lightweight C++ wrapper
over TensorFlow session that handles API requests via
gRPC
inputs_info = tf.saved_model.utils.build_tensor_info(inputs_tensor)
inputs_def = {tf.saved_model.signature_constants.CLASSIFY_INPUTS: inputs_info}
outputs_info = tf.saved_model.utils.build_tensor_info(outputs_tensor)
outputs_def = {tf.saved_model.signature_constants.CLASSIFY_OUTPUT_SCORES: outputs_def}
signature = (
tf.saved_model.signature_def_utils.build_signature_def(
inputs=inputs_def,
outputs=outputs_def,
method_name=tf.saved_model.signature_constants.CLASSIFY_METHOD_NAME)
)
...
* For Estimators: https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators
22. Summary
• TensorFlow high-level APIs allow users to focus on the model without
sacrificing too much flexibility
• Datasets enable users to design performant input pipelines
• Estimators hide the details of the session and reduce boilerplate
The future
• High-level APIs are usable now, but still maturing as TensorFlow evolves
• Compatibility with Keras models via existing methods or conversion to Estimators
• Distributed training with minimal code changes
• Compatibility with eager execution
23. Thank You
Dean R. Wyatte
Google Developer Group Boulder
@drwyatte
August 21, 2018