Convolutional Neural Networks in Python: Master Data Science and Machine Learning With Modern Deep Learning in Python
Introduction
This is the 3rd part in my Data Science and Machine Learning series on
Deep Learning in Python. At this point, you already know a lot about neural
networks and deep learning, including not just the basics like
backpropagation, but how to improve it using modern techniques like
momentum and adaptive learning rates. You've already written deep neural
networks in Theano and TensorFlow, and you know how to run code using
the GPU.
This book is all about how to use deep learning for computer vision using
convolutional neural networks. These are the state of the art when it comes
to image classification and they beat vanilla deep networks at tasks like
MNIST.
In this course we are going to up the ante and look at the Street View House
Number (SVHN) dataset - which uses larger color images at various angles
- so things are going to get tougher both computationally and in terms of the
difficulty of the classification task. But we will show that convolutional
neural networks, or CNNs, are capable of handling the challenge!
All the materials used in this book are FREE. You can download and install
Python, Numpy, Scipy, Theano, and TensorFlow with pip or easy_install.
As a preview, the output function will look a lot like what you have already seen:
y = softmax(relu(X.dot(W1)).dot(W2))
The way they are trained is exactly the same as before, so all your skills
with backpropagation, etc. carry over.
Chapter 1: Review of Feedforward Neural Networks
Predict
We know that for neural networks the predict function is also called the
feedforward action, and this is simply the dot product and a nonlinear
function on each layer of the neural network.
e.g. z1 = s(w0x), z2 = s(w1z1), z3 = s(w2z2), y = s(w3z3)
We know that the output is a sigmoid for binary classification and a softmax
for classification with more than 2 classes.
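As a quick refresher, here is a minimal NumPy sketch of that feedforward calculation for a single hidden layer (the shapes and helper functions here are purely illustrative, not taken from the book's code):

import numpy as np

def sigmoid(a):
    return 1 / (1 + np.exp(-a))

def softmax(a):
    expa = np.exp(a)
    return expa / expa.sum(axis=1, keepdims=True)

# toy sizes: N samples, D inputs, M hidden units, K classes
N, D, M, K = 5, 4, 3, 2
X = np.random.randn(N, D)
W1 = np.random.randn(D, M)
W2 = np.random.randn(M, K)

Z = sigmoid(X.dot(W1))   # hidden layer activations
Y = softmax(Z.dot(W2))   # output probabilities; each row sums to 1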
Train
W ← W - learning_rate * dJ/dW
We know that libraries like Theano and TensorFlow will calculate the
gradient for us, which can get very complicated the more layers there are.
You’ll be thankful for this feature of neural networks when you see that the
output function becomes even more complex when we incorporate
convolution (although the derivation is still do-able and I would recommend
trying for practice).
At this point you should be familiar with how the cost function J is derived
from the likelihood and how we might not calculate J over the entire
training data set but rather in batches to improve training time.
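As a reminder, the batch idea is just a loop over slices of the training data; a minimal sketch (assuming Xtrain and Ytrain are your training arrays, with the actual gradient step left as a comment):

batch_sz = 500
n_batches = len(Xtrain) / batch_sz   # Python 2 integer division
for j in xrange(n_batches):
    Xbatch = Xtrain[j*batch_sz:(j+1)*batch_sz]
    Ybatch = Ytrain[j*batch_sz:(j+1)*batch_sz]
    # compute dJ/dW on this batch only, then update:
    # W = W - learning_rate * dJdW(Xbatch, Ybatch, W)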
When we work with images you know that an image is really a 2-D array of
data, and that if we have a color image we have a 3-D array of data where
one extra dimension is for the red, green, and blue channels.
In the past, we’ve flattened this array into a vector, which is the usual input
into a neural network, so for example a 28 x 28 image becomes a 784
vector, and a 3 x 32 x 32 image becomes a 3072 dimensional vector.
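In NumPy that flattening is just a reshape; a small sketch with illustrative data:

import numpy as np

img = np.random.randn(28, 28)           # a grayscale MNIST-sized image
x = img.reshape(784)                    # 28 * 28 = 784 dimensional vector

color_img = np.random.randn(3, 32, 32)  # 3 color channels, 32 x 32 pixels
x2 = color_img.reshape(3072)            # 3 * 32 * 32 = 3072 dimensional vector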
In this book, we are going to keep the dimensions of the original image for
a portion of the processing.
This book will use the MNIST dataset (handwritten digits) and the Street View
House Number (SVHN) dataset.
The SVHN dataset is a much harder problem than MNIST since the images
are in color, the digits can be at an angle and in different styles or fonts, and
the dimensionality is much larger.
All of the code for this book can be found at:
https://github.com/lazyprogrammer/machine_learning_examples
in the folder: cnn_class
If you’ve already checked out this repo then simply do a “git pull” since this
code will be on the master branch.
I would highly recommend NOT just running this code but using it as a
backup if yours doesn’t work, and try to follow along with the code
examples by typing them out yourself to build muscle memory.
Once you have the machine_learning_examples repo you’ll want to create a
folder adjacent to the cnn_class folder called large_files if you haven’t
already done that for a previous class.
Download the SVHN data into large_files; you'll want to get the files under
"format 2", which are the cropped digits.
Note that these are MATLAB binary data files, so we’ll need to use the
Scipy library to load them, which I’m sure you have heard of if you’re
familiar with the Numpy stack.
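Loading these files is a one-liner with Scipy. A minimal sketch, assuming you have already downloaded train_32x32.mat into large_files:

from scipy.io import loadmat

train = loadmat('../large_files/train_32x32.mat')
print train['X'].shape   # (32, 32, 3, N): height, width, color channel, sample index
print train['y'].shape   # (N, 1): labels from 1 to 10 (we subtract 1 later to get 0 to 9)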
Chapter 2: Convolution
In this chapter I’m going to give you guys a crash course in convolution. If
you really want to dig deep on this topic you’ll want to take a course on
signal processing or linear systems.
So what is convolution?
Think of your favorite audio effect (suppose that’s the “echo”). An echo is
simply the same sound bouncing back at you in the future, but with less
volume. We’ll see how we can do that mathematically later.
All effects can be thought of as filters, like the one I’ve shown here, and
they are often drawn in block diagrams. In machine learning and statistics
these are sometimes called kernels.
I'm representing our audio signal by a triangle. Remember that we want to
do 2 things: we want to hear this audio signal in the future, which is basically
a shift to the right, and this audio signal should be lower in amplitude than
the original.
For any general filter, there wouldn’t be this restriction on the weights. The
weights themselves would define the filter.
Written out, a 1-D convolution looks like this:
y(n) = sum over m of w(m) * x(n - m)
You can think of it as "sliding" the filter across the signal: for each output
position n, the filter weights w(m) multiply the signal values around that
position as m varies.
I want to emphasize that it doesn’t matter if we slide the filter across the
signal, or if we slide the signal across the filter, since they would give us the
same result.
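Here is a small sketch of the echo idea in NumPy, using np.convolve and a made-up filter: one impulse at delay 0 with weight 1 (the original sound) and one at a later delay with weight 0.5 (a quieter copy in the future):

import numpy as np

x = np.random.randn(16000)     # pretend this is 1 second of audio sampled at 16 kHz

delay = 8000                   # half a second
w = np.zeros(delay + 1)
w[0] = 1.0                     # the original signal
w[delay] = 0.5                 # the echo: shifted to the right, lower in amplitude

y = np.convolve(x, w)          # 'full' convolution
print len(x), len(w), len(y)   # len(y) == len(x) + len(w) - 1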
In two dimensions the same idea applies:
y(n, m) = sum over i and j of w(i, j) * x(n - i, m - j)
You can see from this formula that this just does the convolution
independently in each direction. I've got some pseudocode here to
demonstrate how you might write this in code, but notice there's a problem:
when i > n or j > m, the index into x goes out of bounds.
def convolve(x, w):
    y = np.zeros(x.shape)
    for n in xrange(x.shape[0]):
        for m in xrange(x.shape[1]):
            for i in xrange(w.shape[0]):
                for j in xrange(w.shape[1]):
                    y[n,m] += w[i,j]*x[n-i,m-j]
    return y
What that tells us is that the shape of Y is actually BIGGER than X: the
"full" output has size N + K - 1 in each direction, where N is the signal size
and K is the filter size. Sometimes we just ignore these extra parts and
consider Y to be the same size as X. You'll see when we get to Theano and
TensorFlow how we can control the way the size of the output is determined.
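If you want to check this yourself, scipy.signal.convolve2d lets you choose how the edges are handled through its mode argument; a small sketch with toy sizes:

import numpy as np
from scipy.signal import convolve2d

x = np.random.randn(100, 100)
w = np.random.randn(5, 5)

print convolve2d(x, w, mode='full').shape    # (104, 104): N + K - 1 in each direction
print convolve2d(x, w, mode='same').shape    # (100, 100): cropped to the input size
print convolve2d(x, w, mode='valid').shape   # (96, 96):  N - K + 1, no edge effects at all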
Gaussian Blur
The idea is the same as we did with the sound echo. We’re going to take a
signal and spread it out.
But this time instead of having predefined delays we are going to spread out
the signal in the shape of a 2-dimensional Gaussian.
Here is the definition of the filter:
W = np.zeros((20, 20))
for i in xrange(20):
    for j in xrange(20):
        # fall off with squared distance from the center of the filter
        dist = (i - 9.5)**2 + (j - 9.5)**2
        W[i, j] = np.exp(-dist / 50.)
Here is the full code to load an image, blur it, and display the results (it assumes you have an image file called 'lena.png' in your working directory):

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from scipy.signal import convolve2d

img = mpimg.imread('lena.png')
plt.imshow(img)
plt.show()

# make it B&W
bw = img.mean(axis=2)
plt.imshow(bw, cmap='gray')
plt.show()

# create the Gaussian filter
W = np.zeros((20, 20))
for i in xrange(20):
    for j in xrange(20):
        dist = (i - 9.5)**2 + (j - 9.5)**2
        W[i, j] = np.exp(-dist / 50.)

# let's see what the filter looks like
plt.imshow(W, cmap='gray')
plt.show()

out = convolve2d(bw, W)
plt.imshow(out, cmap='gray')
plt.show()

# what's that weird black stuff on the edges? let's check the size of output
print out.shape

# we can also just make the output the same size as the input
out = convolve2d(bw, W, mode='same')
plt.imshow(out, cmap='gray')
plt.show()
print out.shape
Edge Detection
Edge detection is another important operation in computer vision. If you
just want to see the code that’s already been written, check out the file
https://github.com/lazyprogrammer/machine_learning_examples/blob/master/cnn_class/edge.py from GitHub.
Now I’m going to introduce the Sobel operator. The Sobel operator is
defined for 2 directions, X and Y, and they approximate the gradient at each
point of the image. Let’s call them Hx and Hy.
Hx = np.array([
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
], dtype=np.float32)

Hy = np.array([
    [-1, -2, -1],
    [0, 0, 0],
    [1, 2, 1],
], dtype=np.float32)
Here is the full code:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from scipy.signal import convolve2d

img = mpimg.imread('lena.png')

# make it B&W
bw = img.mean(axis=2)

# Sobel operator - approximate gradient in X dir
Hx = np.array([
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
], dtype=np.float32)

# Sobel operator - approximate gradient in Y dir
Hy = np.array([
    [-1, -2, -1],
    [0, 0, 0],
    [1, 2, 1],
], dtype=np.float32)

Gx = convolve2d(bw, Hx)
plt.imshow(Gx, cmap='gray')
plt.show()

Gy = convolve2d(bw, Hy)
plt.imshow(Gy, cmap='gray')
plt.show()

# Gradient magnitude
G = np.sqrt(Gx*Gx + Gy*Gy)
plt.imshow(G, cmap='gray')
plt.show()
The Takeaway
So what is the takeaway from all these examples of convolution? Now you
know that there are SOME filters that help us detect features - so perhaps it
would be possible to just do a convolution inside the neural network and use
gradient descent to find the best filters.
Chapter 3: The Convolutional Neural Network
All of the networks we’ve seen so far have one thing in common: all the
nodes in one layer are connected to all the nodes in the next layer. This is
the “standard” feedforward neural network. With convolutional neural
networks you will see how that changes.
Consider an image where the object we care about (say, a dog) has been
shifted from the center toward a corner. Our system should still be able to
recognize that there is a dog in there somewhere.
We call this "translational invariance".
Question to think about: How can we ensure the neural network has
“rotational invariance?” What other kinds of invariances can you think of?
Downsampling
Another important operation we’ll need before we build the convolutional
neural network is downsampling. So remember our audio sample where we
did an echo - that was a 16kHz sample. Why 16kHz? Because this is
adequate for representing voices.
The telephone has a sampling rate of 8kHz - that’s why voices sound
muffled over the phone.
For images, we just want to know if after we did the convolution, was a
feature present in a certain area of the image. We can do that by
downsampling the image, or in other words, changing its resolution.
So for example, we would downsample an image by converting it from
32x32 to 16x16, and that would mean we downsampled by a factor of 2 in
both the horizontal and vertical direction.
There are a couple of ways of doing this: one is called maxpooling, which
means we take a 2x2 or 3x3 (or any other size) block and just output the
maximum value in that block.
Another way is average pooling - this means taking the average value over
the block. We will just use maxpooling in our code.
Theano has a function for this:
theano.tensor.signal.downsample.max_pool_2d
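Just to make the operation concrete, here is a plain NumPy sketch of (2, 2) maxpooling done by reshaping; this is only a sanity check, not the code we will actually use:

import numpy as np

img = np.random.randn(28, 28)

# group the image into non-overlapping 2x2 blocks and keep the max of each block
pooled = img.reshape(14, 2, 14, 2).max(axis=(1, 3))
print pooled.shape   # (14, 14)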
The simplest convolutional net is just the kind I showed you in the
introduction to this book. It does not even need to incorporate
downsampling.
Z = relu(conv(X, W1))
Y = softmax(Z.dot(W2))
As stated previously, you could then train this simply by doing gradient
descent.
Exercise: Try this on MNIST. How well does it perform? Better or worse
than a fully-connected MLP?
Now we are finally at the point where I can describe the layout of a typical
convolutional neural network, specifically the LeNet flavor.
You will see that it is just a matter of joining up the operations we have
already discussed.
So in the first layer, you take the image and keep all the colors and the
original shape, meaning you don't flatten it (i.e. it remains (3 x W x H)).
You then convolve the image with a set of filters and downsample the result,
which gives you a stack of smaller feature maps.
Finally, you flatten these features into a vector and you put it into a regular,
fully connected neural network like the ones we've been talking about.
Note that you can have arbitrarily many convolution + pool layers, and
more fully connected layers.
Some networks have only convolution. The design is up to you.
Technicalities
4-D tensor inputs: The dimension of the inputs is a 4-D tensor, and it’s
pretty easy to see why. The image already takes up 3 dimensions, since we
have height, width, and color. The 4th dimension is just the number of
samples (i.e. for batch training).
4-D tensor filters / kernels: You might be surprised to learn that the kernels
are ALSO 4-D tensors. Now why is this? Well, in the LeNet model you have
multiple kernels per image, and each kernel has a separate set of weights for
each input channel. The layer that comes out of the convolution is called a
feature map, and the number of feature maps is equal to the number of
kernels. So basically you can think of it as: each kernel extracts a different
feature and places it onto its own feature map. Example:
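With the Theano ordering we will use shortly, the first filter bank might have shape (20, 3, 5, 5): 20 kernels (so 20 output feature maps), 3 input channels (color), and 5x5 filters. A small sketch of the resulting shapes, assuming theano.tensor.nnet.conv2d:

import numpy as np
import theano
import theano.tensor as T
from theano.tensor.nnet import conv2d

X = T.tensor4('X', dtype='float64')   # (batch_size, num_channels, height, width)
W = T.tensor4('W', dtype='float64')   # (num_feature_maps, num_channels, filter_h, filter_w)
f = theano.function([X, W], conv2d(X, W))

x = np.random.randn(1, 3, 32, 32)
w = np.random.randn(20, 3, 5, 5)
print f(x, w).shape                   # (1, 20, 28, 28): one 28x28 feature map per kernel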
We’ll see that in TensorFlow the dimensions of the kernels are going to be
different from Theano.
Another thing to note is that the shapes of our filters are usually MUCH
smaller than the image itself. What this means is that the same tiny filter
gets applied across the entire image. This is the idea of weight sharing.
By sharing these weights we're introducing fewer parameters into the model,
and this is going to help us generalize better since, as you know from my
previous courses, when you have TOO many parameters you'll end up
overfitting.
In the schematic above, we assume a pooling size of (2, 2), which is what
we will also use in the code. This fits our data nicely because both 28
(MNIST) and 32 (SVHN) can be divided by 2 twice evenly.
Training a CNN
It’s ridiculous how many people take my courses or read my books and ask
things like, “But, but, … what about X modern technique?”
Look familiar?
People have been using convolution since the 1700s. LeCun himself
published his paper in 1998.
You too can be a deep learning researcher. Just try different things. Be
creative. Use backprop. Easy, right?
Chapter 4: Sample Code in Theano
So the first thing you might be wondering after learning about convolution
and downsampling is - does Theano have functions for these? And of
course the answer is yes.
In the LeNet we always do the convolution followed by pooling, so we just
call it convpool.
def convpool(X, W, b, poolsize=(2, 2)):
    conv_out = conv2d(input=X, filters=W)
    pooled_out = downsample.max_pool_2d(
        input=conv_out,
        ds=poolsize,
        ignore_border=True
    )
    # add the bias (one per feature map) and apply the nonlinearity
    return relu(pooled_out + b.dimshuffle('x', 0, 'x', 'x'))
def rearrange(X):
    # input is (32, 32, 3, N) as stored in the .mat file
    # output is (N, 3, 32, 32) as Theano expects
    N = X.shape[-1]
    out = np.zeros((N, 3, 32, 32), dtype=np.float32)
    for i in xrange(N):
        for j in xrange(3):
            out[i, j, :, :] = X[:, :, j, i]
    return out
It's also good to keep track of the size of each matrix as each operation is
done. You'll see that Theano and TensorFlow treat the edges of the
convolution output a little differently by default, and the order of the
dimensions is also different.
W1_shape = (20, 3, 5, 5) # (num_feature_maps, num_color_channels, filter_width, filter_height)
W1_init = np.random.randn(*W1_shape)
b1_init = np.zeros(W1_shape[0])

W2_shape = (50, 20, 5, 5) # (num_feature_maps, old_num_feature_maps, filter_width, filter_height)
W2_init = np.random.randn(*W2_shape)
b2_init = np.zeros(W2_shape[0])

W3_init = np.random.randn(W2_shape[0]*5*5, M)
b3_init = np.zeros(M)
W4_init = np.random.randn(M, K)
b4_init = np.zeros(K)
Note that the bias is the same size as the number of feature maps.
Also note that this filter is a 4-D tensor, which is different from the filters
we were working with previously, which were 1-D and 2-D filters.
So the OUTPUT of that first conv_pool operation will also be a 4-D tensor.
The first dimension of course will be the batch size. The second is now no
longer color, but the number of feature maps, which after the first stage
would be 20. The next 2 are the dimensions of the new image after
conv_pooling, which is 32 - 5 + 1, which is 28, and then divided by 2 which
is 14.
In the next stage, we’ll use a filter of size 50 x 20 x 5 x 5. This means that
we now have 50 feature maps. So the output of this will have the first 2
dimensions as batch_size and 50. And then next 2 dimensions will be the
new image after conv_pooling, which will be 14 - 5 + 1, which is 10, and
then divided by 2 which is 5.
In the next stage we pass everything into a vanilla, fully-connected ANN,
which we’ve used before. Of course this means we have to flatten our
output from the previous layer from 4-dimensions to 2-dimensions.
Since that image was 5x5 and had 50 feature maps, the new flattened
dimension will be 50x5x5.
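If you like, you can sanity-check these numbers with a tiny helper of my own (not part of the book's code) that applies the 'valid' convolution and pooling arithmetic:

def convpool_outsize(img_size, filter_size=5, pool_size=2):
    # 'valid' convolution shrinks the image, then pooling divides it
    return (img_size - filter_size + 1) / pool_size

print convpool_outsize(32)                     # 14
print convpool_outsize(convpool_outsize(32))   # 5
print 50 * 5 * 5                               # 1250 flattened features per sample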
Now that we have all the initial weights and operations we need, we can
compute the output of the neural network. So we do the convpool twice,
and then notice this flatten() operation before I do the dot product. That’s
because Z2, after convpooling, will still be an image.
# forward pass
Z1 = convpool(X, W1, b1)
Z2 = convpool(Z1, W2, b2)
Z3 = relu(Z2.flatten(ndim=2).dot(W3) + b3)
pY = T.nnet.softmax(Z3.dot(W4) + b4)
But if you call flatten() by itself it'll turn the whole thing into a 1-D array,
which we don't want. Luckily, Theano provides a parameter that lets us
control how much to flatten: ndim=2 means the result keeps 2 dimensions,
so the first dimension (the batch) is preserved and everything after it is
collapsed into the second.
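A quick sketch to confirm what flatten(ndim=2) does to the shape (toy values only):

import numpy as np
import theano
import theano.tensor as T

A = T.tensor4('A', dtype='float64')
f = theano.function([A], A.flatten(ndim=2))

a = np.random.randn(100, 50, 5, 5)
print f(a).shape   # (100, 1250): the batch dimension is kept, the rest are collapsed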
The full Theano code is as follows:

import numpy as np
import theano
import theano.tensor as T
import matplotlib.pyplot as plt

from datetime import datetime
from scipy.io import loadmat
from theano.tensor.nnet import conv2d
from theano.tensor.signal import downsample

def error_rate(p, t):
    return np.mean(p != t)
def relu(a):
    return a * (a > 0)
def y2indicator(y):
    N = len(y)
    ind = np.zeros((N, 10))
    for i in xrange(N):
        ind[i, y[i]] = 1
    return ind
def convpool(X, W, b, poolsize=(2, 2)):
    conv_out = conv2d(input=X, filters=W)
    pooled_out = downsample.max_pool_2d(
        input=conv_out,
        ds=poolsize,
        ignore_border=True
    )
    return relu(pooled_out + b.dimshuffle('x', 0, 'x', 'x'))

def init_filter(shape, poolsz):
    w = np.random.randn(*shape) / np.sqrt(np.prod(shape[1:]) + shape[0]*np.prod(shape[2:]) / np.prod(poolsz))
    return w.astype(np.float32)
def rearrange(X):
    N = X.shape[-1]
    out = np.zeros((N, 3, 32, 32), dtype=np.float32)
    for i in xrange(N):
        for j in xrange(3):
            out[i, j, :, :] = X[:, :, j, i]
    return out
train = loadmat('../large_files/train_32x32.mat')
test = loadmat('../large_files/test_32x32.mat')
Xtrain = rearrange(train['X'])
Ytrain = train['y'].flatten() - 1
del train
Ytrain_ind = y2indicator(Ytrain)
Xtest = rearrange(test['X'])
Ytest = test['y'].flatten() - 1
del test
Ytest_ind = y2indicator(Ytest)
max_iter = 8
print_period = 10
lr = np.float32(0.00001)
reg = np.float32(0.01)
mu = np.float32(0.99)
N = Xtrain.shape[0]
batch_sz = 500
n_batches = N / batch_sz
M = 500
K = 10
poolsz = (2, 2)
# after conv will be of dimension 32 - 5 + 1 = 28
# after downsample 28 / 2 = 14
W1_shape = (20, 3, 5, 5)
W1_init = init_filter(W1_shape, poolsz)
b1_init = np.zeros(W1_shape[0], dtype=np.float32)
# after the second conv: 14 - 5 + 1 = 10
# after downsample 10 / 2 = 5
W2_shape = (50, 20, 5, 5)
W2_init = init_filter(W2_shape, poolsz)
b2_init = np.zeros(W2_shape[0], dtype=np.float32)
W3_init = np.random.randn(W2_shape[0]*5*5, M) / np.sqrt(W2_shape[0]*5*5 + M)
b3_init = np.zeros(M, dtype=np.float32)
W4_init = np.random.randn(M, K) / np.sqrt(M + K)
b4_init = np.zeros(K, dtype=np.float32)
X = T.tensor4('X', dtype='float32')
Y = T.matrix('T')
W1 = theano.shared(W1_init, 'W1')
b1 = theano.shared(b1_init, 'b1')
W2 = theano.shared(W2_init, 'W2')
b2 = theano.shared(b2_init, 'b2')
W3 = theano.shared(W3_init.astype(np.float32), 'W3')
b3 = theano.shared(b3_init, 'b3')
W4 = theano.shared(W4_init.astype(np.float32), 'W4')
b4 = theano.shared(b4_init, 'b4')
# forward pass
Z1 = convpool(X, W1, b1)
Z2 = convpool(Z1, W2, b2)
Z3 = relu(Z2.flatten(ndim=2).dot(W3) + b3)
pY = T.nnet.softmax(Z3.dot(W4) + b4)

cost = -(Y * T.log(pY)).sum() + reg*sum((p*p).sum() for p in [W1, b1, W2, b2, W3, b3, W4, b4])
prediction = T.argmax(pY, axis=1)

# momentum changes: dW1, db1, ..., dW4, db4 are zero-initialized shared variables,
# and update_W1, ..., update_db4 below are the usual momentum update expressions
train = theano.function(
    inputs=[X, Y],
    updates=[
        (W1, update_W1),
        (b1, update_b1),
        (W2, update_W2),
        (b2, update_b2),
        (W3, update_W3),
        (b3, update_b3),
        (W4, update_W4),
        (b4, update_b4),
        (dW1, update_dW1),
        (db1, update_db1),
        (dW2, update_dW2),
        (db2, update_db2),
        (dW3, update_dW3),
        (db3, update_db3),
        (dW4, update_dW4),
        (db4, update_db4),
    ],
)
# create another function for this because we want it over the whole dataset
get_prediction = theano.function(
    inputs=[X, Y],
    outputs=[cost, prediction],
)
t0 = datetime.now()
LL = []
for i in xrange(max_iter):
    for j in xrange(n_batches):
        Xbatch = Xtrain[j*batch_sz:(j*batch_sz + batch_sz),]
        Ybatch = Ytrain_ind[j*batch_sz:(j*batch_sz + batch_sz),]
        train(Xbatch, Ybatch)
        if j % print_period == 0:
            cost_val, prediction_val = get_prediction(Xtest, Ytest_ind)
            err = error_rate(prediction_val, Ytest)
            print "Cost / err at iteration i=%d, j=%d: %.3f / %.3f" % (i, j, cost_val, err)
            LL.append(cost_val)
print "Elapsed time:", (datetime.now() - t0)
plt.plot(LL)
plt.show()
if __name__ == '__main__':
main()
Chapter 5: Sample Code in TensorFlow
The full code for this chapter can be found at:
https://github.com/lazyprogrammer/machine_learning_examples/blob/master/cnn_class/cnn_tf.py
This is the ConvPool in TensorFlow. It’s almost the same as what we did
with Theano except that the conv2d() function takes in a new parameter
called strides.
def convpool(X, W, b):
    # just assume pool size is (2, 2) because we need to augment it with 1s
    conv_out = tf.nn.conv2d(X, W, strides=[1, 1, 1, 1], padding='SAME')
    conv_out = tf.nn.bias_add(conv_out, b)
    pool_out = tf.nn.max_pool(conv_out, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    return pool_out
In the past we just assumed that we had to drag the filter along every point
of the signal, but in fact we can move with any size step we want, and that’s
what stride is. We’re also going to use the padding parameter to control the
size of the output.
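As a small illustrative sketch (with made-up shapes) of how the padding choice affects the output size of tf.nn.conv2d:

import numpy as np
import tensorflow as tf

x = tf.constant(np.random.randn(1, 32, 32, 3), dtype=tf.float32)    # (batch, height, width, channels)
w = tf.constant(np.random.randn(5, 5, 3, 20), dtype=tf.float32)     # (filter_h, filter_w, in, out)

same = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME')     # output stays 32 x 32
valid = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='VALID')   # output shrinks to 28 x 28
print same.get_shape()    # (1, 32, 32, 20)
print valid.get_shape()   # (1, 28, 28, 20)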
Remember that the bias is a 1-D vector, and we used the dimshuffle
function in Theano to add it to the convolution output. Here we can just use
a function that TensorFlow built called bias_add().
Next we call the max_pool() function. Notice that the ksize parameter is
kind of like the poolsize parameter we had with Theano, but it’s now 4-D
instead of 2-D. We just add ones at the ends. Notice that this function
ALSO takes in a strides parameter, meaning we can max_pool at EVERY
step, but we’ll just use non-overlapping sub-images like we did previously.
The next step is to rearrange the inputs. Remember that convolution in
Theano is not the same as convolution in TensorFlow. That means we have
to adjust not only the input dimensions but the filter dimensions as well.
The only change with the inputs is that the color now comes last.
def rearrange(X):
    N = X.shape[-1]
    out = np.zeros((N, 32, 32, 3), dtype=np.float32)
    for i in xrange(N):
        for j in xrange(3):
            out[i, :, :, j] = X[:, :, j, i]
    return out
This is great and allows for a lot of flexibility, but I hit a snag during
development: my RAM started swapping when I did this. If you haven't
noticed yet, the SVHN data is pretty big - about 73k training samples.
So one way around this is to make the shapes constant, which you’ll see
later. That means we’ll always have to pass in batch_sz number of samples
each time, which means the total number of samples we use has to be a
multiple of it. In the code I used exact numbers but you can also calculate it
using the data.
X = tf.placeholder(tf.float32, shape=(batch_sz, 32, 32, 3), name='X')
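If you would rather compute the cutoff from the data instead of hard-coding it, a sketch like the following works (my own snippet, assuming Xtrain, Ytrain, Ytrain_ind, and batch_sz are defined as in the full code):

# keep only as many samples as fit into whole batches
n_keep = (len(Xtrain) / batch_sz) * batch_sz   # Python 2 integer division
Xtrain = Xtrain[:n_keep]
Ytrain = Ytrain[:n_keep]
Ytrain_ind = Ytrain_ind[:n_keep]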
Just to reinforce this idea, the filter is going to be in a different order than
before. So now the dimensions of the image filter come first, then the
number of color channels, then the number of feature maps.
# (filter_width, filter_height, num_color_channels, num_feature_maps)
W3_init = np.random.randn(W2_shape[-1]*8*8, M) / np.sqrt(W2_shape[-1]*8*8 + M)
For the vanilla ANN portion, also notice that the outputs of the convolution
are now a different size. So now it’s 8 instead of 5.
For the forward pass, the first 2 parts are the same as Theano.
One thing that’s different is TensorFlow objects don’t have a flatten method,
so we have to use reshape.
Z2_shape = Z2.get_shape().as_list()
Z2r = tf.reshape(Z2, [Z2_shape[0], np.prod(Z2_shape[1:])])
Luckily this is pretty straightforward EVEN when you pass in None for the
input shape parameter: you can just pass -1 to reshape and it will be
calculated automatically. But as you can imagine, this will make your
computation take longer.
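For example (toy shapes, just to show the -1 behavior):

import numpy as np
import tensorflow as tf

Z2 = tf.constant(np.random.randn(500, 8, 8, 50), dtype=tf.float32)
Z2r = tf.reshape(Z2, [-1, 8*8*50])   # the -1 dimension is inferred (here it must be 500)

with tf.Session() as session:
    print session.run(Z2r).shape     # (500, 3200)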
The last step is to calculate the output just before the softmax. Remember
that with TensorFlow the cost function requires the logits without
softmaxing, so we won’t do the softmax at this point.
The full code is as follows:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

from datetime import datetime
from scipy.io import loadmat
def error_rate(p, t):
    return np.mean(p != t)

def y2indicator(y):
    N = len(y)
    ind = np.zeros((N, 10))
    for i in xrange(N):
        ind[i, y[i]] = 1
    return ind
def convpool(X, W, b):
    conv_out = tf.nn.conv2d(X, W, strides=[1, 1, 1, 1], padding='SAME')
    conv_out = tf.nn.bias_add(conv_out, b)
    pool_out = tf.nn.max_pool(conv_out, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    return pool_out
def init_filter(shape, poolsz):
    w = np.random.randn(*shape) / np.sqrt(np.prod(shape[:-1]) + shape[-1]*np.prod(shape[:-2]) / np.prod(poolsz))
    return w.astype(np.float32)
def rearrange(X):
    # input is (32, 32, 3, N); output is (N, 32, 32, 3) for TensorFlow
    N = X.shape[-1]
    out = np.zeros((N, 32, 32, 3), dtype=np.float32)
    for i in xrange(N):
        for j in xrange(3):
            out[i, :, :, j] = X[:, :, j, i]
    return out
train = loadmat('../large_files/train_32x32.mat')
test = loadmat('../large_files/test_32x32.mat')

Xtrain = rearrange(train['X'])
Ytrain = train['y'].flatten() - 1
print len(Ytrain)
del train
Ytrain_ind = y2indicator(Ytrain)
Xtest = rearrange(test['X'])
Ytest = test['y'].flatten() - 1
del test
Ytest_ind = y2indicator(Ytest)
print_period = 10
N = Xtrain.shape[0]
batch_sz = 500
n_batches = N / batch_sz
Xtrain = Xtrain[:73000,]
Ytrain = Ytrain[:73000]
Xtest = Xtest[:26000,]
Ytest = Ytest[:26000]
Ytest_ind = Ytest_ind[:26000,]
# initialize weights
M = 500
K = 10
poolsz = (2, 2)
W1_shape = (5, 5, 3, 20) # (filter_width, filter_height, num_color_channels, num_feature_maps)
W1_init = init_filter(W1_shape, poolsz)
b1_init = np.zeros(W1_shape[-1], dtype=np.float32)
W2_shape = (5, 5, 20, 50) # (filter_width, filter_height, old_num_feature_maps, num_feature_maps)
W2_init = init_filter(W2_shape, poolsz)
b2_init = np.zeros(W2_shape[-1], dtype=np.float32)
W3_init = np.random.randn(W2_shape[-1]*8*8, M) / np.sqrt(W2_shape[-1]*8*8 + M)
b3_init = np.zeros(M, dtype=np.float32)
W4_init = np.random.randn(M, K) / np.sqrt(M + K)
b4_init = np.zeros(K, dtype=np.float32)
# using None as the first shape element takes up too much RAM unfortunately
X = tf.placeholder(tf.float32, shape=(batch_sz, 32, 32, 3), name='X')
T = tf.placeholder(tf.float32, shape=(batch_sz, K), name='T')
W1 = tf.Variable(W1_init.astype(np.float32))
b1 = tf.Variable(b1_init.astype(np.float32))
W2 = tf.Variable(W2_init.astype(np.float32))
b2 = tf.Variable(b2_init.astype(np.float32))
W3 = tf.Variable(W3_init.astype(np.float32))
b3 = tf.Variable(b3_init.astype(np.float32))
W4 = tf.Variable(W4_init.astype(np.float32))
b4 = tf.Variable(b4_init.astype(np.float32))
Z1 = convpool(X, W1, b1)
Z2 = convpool(Z1, W2, b2)
Z2_shape = Z2.get_shape().as_list()
Z2r = tf.reshape(Z2, [Z2_shape[0], np.prod(Z2_shape[1:])])
Z3 = tf.nn.relu(tf.matmul(Z2r, W3) + b3)
Yish = tf.matmul(Z3, W4) + b4

cost = tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(Yish, T))
# any of TensorFlow's built-in optimizers will work here; RMSProp is one choice
train_op = tf.train.RMSPropOptimizer(0.0001, decay=0.99, momentum=0.9).minimize(cost)

predict_op = tf.argmax(Yish, 1)
t0 = datetime.now()
LL = []
init = tf.initialize_all_variables()
with tf.Session() as session:
    session.run(init)
    for i in xrange(max_iter):
        for j in xrange(n_batches):
            Xbatch = Xtrain[j*batch_sz:(j*batch_sz + batch_sz),]
            Ybatch = Ytrain_ind[j*batch_sz:(j*batch_sz + batch_sz),]
            if len(Xbatch) == batch_sz:
                session.run(train_op, feed_dict={X: Xbatch, T: Ybatch})
                if j % print_period == 0:
                    # due to RAM limitations we need to have a fixed size input
                    # so we compute the test cost and predictions batch by batch
                    test_cost = 0
                    prediction = np.zeros(len(Xtest))
                    for k in xrange(len(Xtest) / batch_sz):
                        Xtestbatch = Xtest[k*batch_sz:(k*batch_sz + batch_sz),]
                        Ytestbatch = Ytest_ind[k*batch_sz:(k*batch_sz + batch_sz),]
                        test_cost += session.run(cost, feed_dict={X: Xtestbatch, T: Ytestbatch})
                        prediction[k*batch_sz:(k*batch_sz + batch_sz)] = session.run(predict_op, feed_dict={X: Xtestbatch})
                    err = error_rate(prediction, Ytest)
                    print "Cost / err at iteration i=%d, j=%d: %.3f / %.3f" % (i, j, test_cost, err)
                    LL.append(test_cost)
print "Elapsed time:", (datetime.now() - t0)
plt.plot(LL)
plt.show()
if __name__ == '__main__':
main()
Conclusion
I really hope you had as much fun reading this book as I did making it.
Do you want to learn more about deep learning? Perhaps online courses are
more your style. I happen to have a few of them on Udemy.
A lot of the material in this book is covered in this course, but you get to see
me derive the formulas and write the code live:
The background and prerequisite knowledge for deep learning and neural
networks can be found in my class “Data Science: Deep Learning in
Python” (officially known as “part 1” of the series). In this course I teach
you the feedforward mechanism of a neural network (which I assumed you
already knew for this book), and how to derive the training algorithm called
backpropagation (which I also assumed you knew for this book):
Data Science: Deep Learning in Python
https://udemy.com/data-science-deep-learning-in-python
https://kdp.amazon.com/amazon-dp-action/us/bookshelf.marketplacelink/B01CVJ19E8
Are you comfortable with this material, and you want to take your deep
learning skillset to the next level? Then my follow-up Udemy course on
deep learning is for you. Similar to the previous book, I take you through the
basics of Theano and TensorFlow - creating functions, variables, and
expressions, and build up neural networks from scratch. I teach you about
ways to accelerate the learning process, including batch gradient descent,
momentum, and adaptive learning rates. I also show you live how to create
a GPU instance on Amazon AWS EC2, and prove to you that training a
neural network with GPU optimization can be orders of magnitude faster
than on your CPU.
https://kdp.amazon.com/amazon-dp-action/us/bookshelf.marketplacelink/B01D7GDRQ2
https://www.udemy.com/data-science-linear-regression-in-python
If you are interested in learning about how machine learning can be applied
to language, text, and speech, you’ll want to check out my course on
Natural Language Processing, or NLP:
Data Science: Natural Language Processing in Python
https://www.udemy.com/data-science-natural-language-processing-in-python
https://www.udemy.com/sql-for-marketers-data-analytics-data-science-big-data
Finally, I am always giving out coupons and letting you know when you
can get my stuff for free. But you can only do this if you are a current
student of mine! Here are some ways I notify my students about coupons
and free giveaways:
My newsletter, which you can sign up for at http://lazyprogrammer.me (it
comes with a free 6-week intro to machine learning course)
My Twitter, https://twitter.com/lazy_scientist