A Tour of TensorFlow’s APIs
Dean R. Wyatte
Google Developer Group Boulder
@drwyatte
August 21, 2018
TensorFlow today
(Figure: the TensorFlow API stack, including eager execution; see https://www.tensorflow.org/guide/premade_estimators)
About me
• Data Scientist/Machine Learning Engineer (prototype to production)
• TensorFlow user for ~2 years
• Recently transitioned to TensorFlow higher-level APIs
TensorFlow low-level API
• In the beginning, there was the computational graph
• Lazy evaluation + automatic differentiation + Google backing = ML Win
• The TensorFlow session is required for interacting with the graph
• Resource allocation, inter-device communication, etc.
• The entire system was flexible and powerful, but also verbose and complex, likely intimidating to would-be users
TensorFlow warm-up: Anatomy of a TensorFlow script
import tensorflow as tf
images = tf.placeholder(tf.float32, [None, 224, 224, 3])
labels = tf.placeholder(tf.float32, [None, 1000])
W_conv1 = tf.Variable(tf.random_normal([3, 3, 3, 64]))
b_conv1 = tf.Variable(tf.zeros([64]))
W_conv2 = tf.Variable(tf.random_normal([3, 3, 64, 128]))
b_conv2 = tf.Variable(tf.zeros([128]))
W_fc = tf.Variable(tf.random_normal([56*56*128, 4096]))
b_fc = tf.Variable(tf.zeros([4096]))
W_probs = tf.Variable(tf.random_normal([4096, 1000]))
b_probs = tf.Variable(tf.zeros([1000]))
conv1 = tf.nn.relu(tf.nn.conv2d(images, W_conv1, strides=[1, 1, 1, 1], padding='SAME') + b_conv1)
pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
conv2 = tf.nn.relu(tf.nn.conv2d(pool1, W_conv2, strides=[1, 1, 1, 1], padding='SAME') + b_conv2)
pool2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
fc = tf.nn.relu(tf.matmul(tf.reshape(pool2, [-1, 56*56*128]), W_fc) + b_fc)
probs = tf.nn.softmax(tf.matmul(fc, W_probs) + b_probs)
loss = tf.reduce_mean(-tf.reduce_sum(labels*tf.log(probs), axis=1))
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
(Figure: model diagram — Image → Convolution → Subsampling → Convolution → Subsampling → Fully Connected → Probabilities — and the resulting TensorFlow graph)
TensorFlow warm-up: Anatomy of a TensorFlow script
def yield_batches(data_dir):
    # load/preprocess data (see the sketch after this slide)
    yield images, labels

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(100):
        train_loss, test_loss, n_train, n_test = 0, 0, 0, 0
        for batch_images, batch_labels in yield_batches(train_data_dir):
            l, _ = sess.run([loss, optimizer], feed_dict={images: batch_images, labels: batch_labels})
            n_train += len(batch_images)
            train_loss += l
        for batch_images, batch_labels in yield_batches(test_data_dir):
            l = sess.run(loss, feed_dict={images: batch_images, labels: batch_labels})
            n_test += len(batch_images)
            test_loss += l
        print('Train loss: {} Test loss: {}'.format(train_loss/n_train, test_loss/n_test))
Callouts: data generator (yield_batches), training loop (session block)
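For reference, a minimal sketch of what the stubbed-out yield_batches generator might look like, assuming PNG images stored in one subdirectory per class (the same layout the later Dataset example uses) and Pillow/NumPy for decoding; this is an illustration, not the speaker's actual code:

import glob
import os

import numpy as np
from PIL import Image

def yield_batches(data_dir, batch_size=32, num_classes=1000):
    filenames = sorted(glob.glob(os.path.join(data_dir, '**', '*.png'), recursive=True))
    classnames = [os.path.basename(os.path.dirname(f)) for f in filenames]
    lookup = {name: i for i, name in enumerate(sorted(set(classnames)))}
    for start in range(0, len(filenames), batch_size):
        batch_files = filenames[start:start + batch_size]
        # Decode and resize each image on the CPU, entirely outside the graph
        images = np.stack([
            np.asarray(Image.open(f).convert('RGB').resize((224, 224)), dtype=np.float32)
            for f in batch_files])
        # One-hot encode the directory-name labels
        labels = np.zeros([len(batch_files), num_classes], dtype=np.float32)
        for i, f in enumerate(batch_files):
            labels[i, lookup[os.path.basename(os.path.dirname(f))]] = 1.0
        yield images, labels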
On Keras
• High-level neural network API “for humans”
• Originally built on top of Theano; also supports TensorFlow and CNTK backends
• Can now be imported directly from TensorFlow as tf.keras
(Diagram: Keras running on Theano, TensorFlow, or CNTK backends)
TensorFlow Keras
import tensorflow as tf
images = tf.placeholder(tf.float32, [None, 224, 224, 3])
labels = tf.placeholder(tf.float32, [None, 1000])
W_conv1 = tf.Variable(tf.random_normal([3, 3, 3, 64]))
b_conv1 = tf.Variable(tf.zeros([64]))
W_conv2 = tf.Variable(tf.random_normal([3, 3, 64, 128]))
b_conv2 = tf.Variable(tf.zeros([128]))
W_fc = tf.Variable(tf.random_normal([56*56*128, 4096]))
b_fc = tf.Variable(tf.zeros([4096]))
W_probs = tf.Variable(tf.random_normal([4096, 1000]))
b_probs = tf.Variable(tf.zeros([1000]))
conv1 = tf.nn.relu(tf.nn.conv2d(images, W_conv1, strides=[1, 1, 1, 1], padding='SAME') + b_conv1)
pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
conv2 = tf.nn.relu(tf.nn.conv2d(pool1, W_conv2, strides=[1, 1, 1, 1], padding='SAME') + b_conv2)
pool2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
fc = tf.nn.relu(tf.matmul(tf.reshape(pool2, [-1, 56*56*128]), W_fc) + b_fc)
probs = tf.nn.softmax(tf.matmul(fc, W_probs) + b_probs)
loss = tf.reduce_mean(-tf.reduce_sum(labels*tf.log(probs), axis=1))
optimizer = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
...
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.models import Model
images = Input(shape=[224, 224, 3])
conv1 = Conv2D(64, [3, 3], padding='same', activation='relu')(images)
pool1 = MaxPooling2D([2, 2], padding='same')(conv1)
conv2 = Conv2D(128, [3, 3], padding='same', activation='relu')(pool1)
pool2 = MaxPooling2D([2, 2], padding='same')(conv2)
fc = Dense(4096, activation='relu')(Flatten()(pool2))
probs = Dense(1000, activation='softmax')(fc)
optimizer = SGD(lr=0.01)
model = Model(images, probs)
model.compile(optimizer, loss='categorical_crossentropy')
model.fit(train_images, train_labels,
          validation_data=[test_images, test_labels])
TensorFlow higher-level APIs
• Try to satisfy the simplicity/flexibility tradeoff
• Provide low-level benefits while still enabling the user to focus on the model
• This talk primarily covers Datasets and Estimators
• Rapid development over the last ~1 year
• Mostly backward compatible across releases, but conventions are subject to change
Datasets
• Standard (feed_dict-style) data loading and preprocessing blocks the model while each batch is prepared
• High, consistent GPU utilization can vastly speed up training
def yield_batches(data_dir):
    # load/preprocess data
    yield images, labels

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for epoch in range(100):
        train_loss, test_loss, n_train, n_test = 0, 0, 0, 0
        for batch_images, batch_labels in yield_batches(data_dir):
            l, _ = sess.run([loss, optimizer],
                            feed_dict={images: batch_images,
                                       labels: batch_labels})
Datasets
• TensorFlow Datasets provide a simple API for designing input pipelines decoupled from the model, without having to write a threaded data generator
• Part of the core API since TensorFlow 1.4 (November 2017)
• Functionality previously provided by QueueRunner, now deprecated (incompatible with eager execution)
https://www.tensorflow.org/guide/datasets
Dataset example
import glob
import os

import tensorflow as tf

def load_image(filename):
    buffer = tf.read_file(filename)
    image = tf.image.decode_png(buffer, channels=3)
    image = tf.cast(image, tf.float32)
    return image

def input_fn(data_dir, batch_size):
    filenames = glob.glob(os.path.join(data_dir, '**', '*.png'), recursive=True)
    classnames = [os.path.basename(os.path.dirname(filename)) for filename in filenames]
    lookup = {v: k for k, v in enumerate(sorted(set(classnames)))}
    labels = [lookup[classname] for classname in classnames]
    dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
    dataset = dataset.shuffle(buffer_size=1000000)
    dataset = dataset.map(lambda filename, label: (load_image(filename), label))
    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(1)
    iterator = dataset.make_one_shot_iterator()
    images, labels = iterator.get_next()
    return images, labels
prefetch(1) ensures at least one batch is always enqueued
Dataset best practices
• In practice, Datasets can increase training speed by 200-300%
• prefetch overlaps the work of a producer and consumer
• map accepts a num_parallel_calls argument for producer parallelism (see the sketch below)
https://www.tensorflow.org/performance/datasets_performance
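A hedged sketch of how the earlier input_fn could apply these two recommendations; the helper name make_dataset and the value num_parallel_calls=4 are illustrative assumptions, not from the talk:

import tensorflow as tf

def make_dataset(filenames, labels, batch_size, parse_fn):
    # parse_fn maps (filename, label) -> (image, label), e.g. load_image above
    dataset = tf.data.Dataset.from_tensor_slices((filenames, labels))
    dataset = dataset.shuffle(buffer_size=len(filenames))
    # Parallelize the producer side (decoding/preprocessing)
    dataset = dataset.map(parse_fn, num_parallel_calls=4)
    dataset = dataset.batch(batch_size)
    # Keep one batch ready so the consumer (typically the GPU) never waits on input
    return dataset.prefetch(1)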
Estimators
• Directly managing the graph and session can lead to boilerplate, inefficient code
• See Keras, Datasets
• TensorFlow Estimators expose an interface that hides these details while remaining flexible and efficient
• Other benefits: distributed training, pre-made estimators (see the sketch after the reference below)
TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks
https://arxiv.org/abs/1708.02637
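As an aside, a hedged sketch of a pre-made estimator; the feature shape, layer sizes, and random toy data are illustrative assumptions, not from the talk:

import numpy as np
import tensorflow as tf

# Pre-made estimator: a fully-connected classifier over flat numeric features
feature_columns = [tf.feature_column.numeric_column('x', shape=[64])]
classifier = tf.estimator.DNNClassifier(feature_columns=feature_columns,
                                        hidden_units=[128, 64],
                                        n_classes=10)

# Toy input_fn built from in-memory NumPy arrays
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={'x': np.random.rand(1000, 64).astype(np.float32)},
    y=np.random.randint(0, 10, size=1000),
    batch_size=32, num_epochs=None, shuffle=True)

classifier.train(input_fn=train_input_fn, steps=100)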
Estimators
• Part of the core API since TensorFlow 1.1 (April 2017), but still maturing as TensorFlow evolves
• Resemble scikit-learn estimators (uniform methods, separation of initialization and learning)
https://www.tensorflow.org/guide/estimators
Estimator example
def model_fn(features, labels, mode):
    W_conv1 = tf.Variable(tf.random_normal([3, 3, 3, 64]))
    b_conv1 = tf.Variable(tf.zeros([64]))
    W_conv2 = tf.Variable(tf.random_normal([3, 3, 64, 128]))
    b_conv2 = tf.Variable(tf.zeros([128]))
    W_fc = tf.Variable(tf.random_normal([56*56*128, 4096]))
    b_fc = tf.Variable(tf.zeros([4096]))
    W_probs = tf.Variable(tf.random_normal([4096, 1000]))
    b_probs = tf.Variable(tf.zeros([1000]))
    conv1 = tf.nn.relu(tf.nn.conv2d(features, W_conv1, strides=[1, 1, 1, 1], padding='SAME') + b_conv1)
    pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv2 = tf.nn.relu(tf.nn.conv2d(pool1, W_conv2, strides=[1, 1, 1, 1], padding='SAME') + b_conv2)
    pool2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    fc = tf.nn.relu(tf.matmul(tf.reshape(pool2, [-1, 56*56*128]), W_fc) + b_fc)
    probs = tf.nn.softmax(tf.matmul(fc, W_probs) + b_probs)
    loss = tf.reduce_mean(-tf.reduce_sum(labels*tf.log(probs), axis=1))
    optimizer = tf.train.GradientDescentOptimizer(0.01)
    train_op = optimizer.minimize(loss, global_step=tf.train.get_global_step())
    return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

estimator = tf.estimator.Estimator(model_fn=model_fn)
for epoch in range(100):
    estimator.train(...)
Note that we no longer define placeholders: features (inputs) and labels are tensors provided by the caller. Estimators return an EstimatorSpec with the ops required for training, evaluation, etc. The Estimator manages the session.
Estimators and Datasets: Better together
• estimator.train|evaluate|predict are unary methods accepting a function that returns two tensors (inputs and targets), or a Dataset
def input_fn(data_dir, ...):
    dataset = tf.data.Dataset(...)
    dataset = dataset.map(...).batch(batch_size).prefetch(1)
    return dataset

def model_fn(features, labels, mode):
    model = ...
    loss = ...
    train_op = ...
    return EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

estimator = Estimator(model_fn=model_fn)
for epoch in range(100):
    estimator.train(lambda: input_fn(train_data_dir, ...))
    estimator.evaluate(lambda: input_fn(test_data_dir, ...))
Resulting graph
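A hedged concrete version of the sketch above, reusing input_fn and model_fn from the earlier Dataset and Estimator examples; the model_dir and batch size are illustrative assumptions:

# Assumes model_fn and input_fn defined as in the earlier examples
estimator = tf.estimator.Estimator(model_fn=model_fn, model_dir='/tmp/cnn_estimator')
for epoch in range(100):
    estimator.train(lambda: input_fn(train_data_dir, batch_size=32))
    estimator.evaluate(lambda: input_fn(test_data_dir, batch_size=32))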
Distributed training with Estimators and Datasets
• Early-stage feature to abstract away the details of distributed training
• https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/distribute
• DistributionStrategy specifies the nature of distribution
• Currently supports model-replication with synchronous updates (MirroredStrategy)
distribution = tf.contrib.distribute.MirroredStrategy(['/device:GPU:0', '/device:GPU:1'])
config = tf.estimator.RunConfig(train_distribute=distribution)
estimator = tf.estimator.Estimator(model_fn=model_fn, config=config)
INFO:tensorflow:Using config: {'_model_dir': '/tmp/tmp0ar31ejr', '_tf_random_seed': None,
'_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600,
'_session_config': None, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000,
'_log_step_count_steps': 100, '_train_distribute':
<tensorflow.contrib.distribute.python.mirrored_strategy.MirroredStrategy object at 0x7f161d4b7b00>,
'_device_fn': None, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec
object at 0x7f161d4b7c18>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master':
'', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
INFO:tensorflow:Device is available but not used by distribute strategy: /device:CPU:0
INFO:tensorflow:Configured nccl all-reduce.
INFO:tensorflow:batch_all_reduce invoked for batches size = 14 with algorithm = nccl, num_packs = 1,
agg_small_grads_max_bytes = 0 and agg_small_grads_max_group = 10
TensorFlow Serving
• Set of APIs for defining how a model should handle requests without requiring changes to server architecture
• tensorflow-model-server is a lightweight C++ wrapper over a TensorFlow session that handles API requests via gRPC
inputs_info = tf.saved_model.utils.build_tensor_info(inputs_tensor)
inputs_def = {tf.saved_model.signature_constants.CLASSIFY_INPUTS: inputs_info}
outputs_info = tf.saved_model.utils.build_tensor_info(outputs_tensor)
outputs_def = {tf.saved_model.signature_constants.CLASSIFY_OUTPUT_SCORES: outputs_info}
signature = (
    tf.saved_model.signature_def_utils.build_signature_def(
        inputs=inputs_def,
        outputs=outputs_def,
        method_name=tf.saved_model.signature_constants.CLASSIFY_METHOD_NAME)
)
...
* For Estimators: https://www.tensorflow.org/guide/saved_model#using_savedmodel_with_estimators
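For Estimators specifically, a hedged export sketch using the TensorFlow 1.x export API; the placeholder name 'images', the export path, and the assumption that the model_fn reads its input from a features dict (features['images']) are all illustrative, not from the talk:

import tensorflow as tf

def serving_input_receiver_fn():
    # Raw float32 image batch fed directly by the serving client
    images = tf.placeholder(tf.float32, [None, 224, 224, 3], name='images')
    return tf.estimator.export.ServingInputReceiver(
        features={'images': images}, receiver_tensors={'images': images})

# Writes a SavedModel that tensorflow-model-server can load over gRPC
estimator.export_savedmodel('/tmp/exported_model', serving_input_receiver_fn)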
Summary
• TensorFlow high-level APIs allow users to focus on the model without sacrificing too much flexibility
• Datasets enable users to design performant input pipelines
• Estimators hide the details of the session and reduce boilerplate
The future
• High-level APIs are usable now, but still maturing as TensorFlow evolves
• Compatibility with Keras models via existing methods or conversion to Estimators (see the sketch below)
• Distributed training with minimal code changes
• Compatibility with eager execution
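A minimal sketch of the Keras-to-Estimator conversion path, assuming the compiled Keras model from the earlier slide and the tf.keras.estimator.model_to_estimator helper:

import tensorflow as tf

# `model` is the compiled tf.keras Model from the earlier example
estimator = tf.keras.estimator.model_to_estimator(keras_model=model)
# The converted estimator exposes the standard interface, e.g.:
# estimator.train(input_fn=lambda: input_fn(train_data_dir, batch_size=32))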
Thank You
Dean R. Wyatte
Google Developer Group Boulder
@drwyatte
August 21, 2018
