Python Numpy Array Tutorial
Today’s post focuses on the NumPy array. This NumPy tutorial will not
only show you what NumPy arrays actually are and how you can install
NumPy, but you’ll also learn how to make arrays (even when your data
comes from files!), how broadcasting works, how you can ask for help,
how to manipulate your arrays and how to visualize them.
What Is A Python Numpy Array?
If you want to know even more about NumPy arrays and the other data
structures that you will need in your data science journey, consider taking
a look at DataCamp’s Intro to Python for Data Science, which has a
chapter on NumPy.
As the name gives away, a NumPy array is the central data structure
of the numpy library. The library’s name is short for “Numeric
Python” or “Numerical Python”.
In other words, NumPy is the core Python library for
scientific computing. It contains a collection of tools and
techniques that can be used to solve mathematical models
of problems in science and engineering on a computer. One of these tools is a
high-performance multidimensional array object, a powerful data
structure for efficient computation on arrays and matrices. To work with
these arrays, there is a large collection of high-level mathematical functions
that operate on these matrices and arrays.
# Print the array
print(my_array)
# Print the 2d array
print(my_2d_array)
# Print the 3d array
print(my_3d_array)
You see that, in the example above, the data are integers. The array holds
and represents any regular data in a structured way.
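For reference, here is a minimal sketch of how three arrays like the ones printed above could be defined. The values of `my_2d_array` and `my_3d_array` match the ones that appear later in this tutorial; the values of the 1-D `my_array` are an assumption made purely for illustration.

```python
import numpy as np

# 1-D array (values assumed for illustration)
my_array = np.array([1, 2, 3, 4], dtype=np.int64)

# 2-D array: 2 rows, 4 columns
my_2d_array = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=np.int64)

# 3-D array: 2 blocks, each with 2 rows and 4 columns
my_3d_array = np.array([[[1, 2, 3, 4], [5, 6, 7, 8]],
                        [[1, 2, 3, 4], [9, 10, 11, 12]]], dtype=np.int64)

print(my_array.shape)     # (4,)
print(my_2d_array.shape)  # (2, 4)
print(my_3d_array.shape)  # (2, 2, 4)
```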
However, you should know that, on a structural level, an array is basically
nothing but pointers. It’s a combination of a memory address, a data type,
a shape and strides:
The data pointer indicates the memory address of the first byte in
the array,
The data type or dtype pointer describes the kind of elements that
are contained within the array,
The shape indicates the shape of the array, and
The strides are the number of bytes that should be skipped in
memory to go to the next element.
Or, in other words, an array contains information about the raw data, how
to locate an element and how to interpret an element.
You can easily test this by exploring the numpy array attributes:
# Print out the buffer object that points to the start of the array's data
print(my_2d_array.data)
# Print out the shape of `my_2d_array`
print(my_2d_array.shape)
# Print out the data type of `my_2d_array`
print(my_2d_array.dtype)
# Print out the strides of `my_2d_array`
print(my_2d_array.strides)
You see that now, you get a lot more information: for example, the data
type that is printed out is ‘int64’, the signed 64-bit integer type. This is a lot
more detailed! That also means that the array takes up 64
bytes in memory (as each integer takes up 8 bytes and you have an array of 8
integers). The strides of the array tell us that you have to skip 8 bytes (one
value) to move to the next column, but 32 bytes (4 values) to get to the
same position in the next row. As such, the strides for the array will be
(32,8).
Note that if you set the data type to int32, the strides tuple that you get
back will be (16, 4): you still need to move one value to get to the next
column and 4 values to get to the same position in the next row. The only
thing that has changed is that each integer takes up 4 bytes instead of 8.
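The byte arithmetic described above can be checked directly. The following is a small sketch that rebuilds the 2-D array from this section (the variable names `a64` and `a32` are chosen here only for illustration) and compares the item size and strides for 64-bit and 32-bit integers.

```python
import numpy as np

# The 2-D array from this section: 2 rows, 4 columns of 64-bit integers
a64 = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=np.int64)
print(a64.itemsize)  # 8 bytes per element
print(a64.nbytes)    # 64 bytes in total (8 elements * 8 bytes)
print(a64.strides)   # (32, 8)

# The same values stored as 32-bit integers
a32 = a64.astype(np.int32)
print(a32.itemsize)  # 4 bytes per element
print(a32.strides)   # (16, 4)
```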
The array that you see above is, as its name already suggests, a 2-dimensional
array: you have rows and columns. The rows are indicated as
“axis 0”, while the columns are “axis 1”. The number of axes
goes up with the number of dimensions: in 3-D arrays, of
which you have also seen an example in the previous code chunk, you’ll
have an additional “axis 2”. Note that a 1-D array has only a single axis,
axis 0, so the distinction between axes only matters for arrays
that have at least 2 dimensions.
These axes will come in handy later when you’re manipulating the shape
of your NumPy arrays.
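To get a feeling for what axis 0 and axis 1 mean in practice, here is a small sketch that sums a 2-D array along each axis. Summing is used only as an illustration; the array values and variable names are assumptions.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

# Summing along axis 0 collapses the rows: one result per column
col_sums = a.sum(axis=0)
print(col_sums)  # [ 6  8 10 12]

# Summing along axis 1 collapses the columns: one result per row
row_sums = a.sum(axis=1)
print(row_sums)  # [10 26]
```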
If you still need to set up your environment, you must be aware that there
are two major ways of installing NumPy on your pc: with the help of
Python wheels or the Anaconda Python distribution.
Note that recent versions of Python 3 come with pip, so double check if
you have it and if you do, upgrade it before you install NumPy:
pip install pip --upgrade
pip --version
Next, you can go here or here to get your NumPy wheel. After you have
downloaded it, navigate to the folder on your pc that stores it through the
terminal and install it:
pip install "numpy-1.9.2rc1+mkl-cp34-none-win_amd64.whl"
import numpy
numpy.__version__
The two last lines allow you to verify that you have installed NumPy and
check the version of the package.
The good thing about getting the Anaconda Python distribution is that you
don’t need to worry too much about separately installing NumPy or any of
the major packages that you’ll be using for your data analyses, such as
pandas, scikit-learn, etc.
Especially if you’re very new to Python, programming or
terminals, it can really come as a relief that Anaconda already includes 100
of the most popular Python, R and Scala packages for data science. But
also for more seasoned data scientists, Anaconda is the way to go if you
want to get started quickly on tackling data science problems.
Some exercises have been included below so that you can already practice
how it’s done before you start on your own!
To make a numpy array, you can just use the np.array() function. All
you need to do is pass a list to it and optionally, you can also specify the
data type of the data. If you want to know more about the possible data
types that you can pick, go here or consider taking a brief look at
DataCamp’s NumPy cheat sheet.
Don’t forget that, in order to work with the np.array() function, you
need to make sure that the numpy library is present in your environment.
The NumPy library follows an import convention: when you import this
library, you import it as np. By doing this,
you’ll make sure that other Pythonistas understand your code more easily.
In the following example you’ll create the my_array array that you have
already played around with above:
# Import `numpy` as `np`
import numpy as np
# Make the array `my_array`
my_array = np.array([[1,2,3,4], [5,6,7,8]], dtype=np.int64)
# Print `my_array`
print(my_array)
If you would like to know more about how to make lists, go here.
However, sometimes you don’t know what data you want to put in your
array or you want to import data into a numpy array from another source.
In those cases, you’ll make use of initial placeholders or functions to load
data from text into arrays, respectively.
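As an illustration of such initial placeholders, here is a short sketch using a few array-creation functions that NumPy provides. Which functions the original example covered is an assumption; the shapes and values below are chosen only for demonstration.

```python
import numpy as np

ones = np.ones((3, 4))                        # 3x4 array filled with ones
zeros = np.zeros((2, 3, 4), dtype=np.int16)   # 2x3x4 array of 16-bit integer zeros
rand = np.random.random((2, 2))               # random floats in the interval [0, 1)
empty = np.empty((3, 2))                      # uninitialized array, contents are arbitrary
full = np.full((2, 2), 7)                     # 2x2 array filled with the value 7
evenly_stepped = np.arange(10, 25, 5)         # start 10, stop before 25, step 5
evenly_spaced = np.linspace(0, 2, 9)          # 9 values from 0 to 2, both included
identity = np.eye(3)                          # 3x3 identity matrix

print(evenly_stepped)  # [10 15 20]
print(evenly_spaced)
```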
Tip: play around with the above functions so that you understand how
they work!
With what you have seen up until now, you won’t really be able to do
much. Make use of some specific functions to load data from your files,
such as loadtxt() or genfromtxt().
Let’s say you have the following text files with data:
# This is your data in the text file
x, y, z = np.loadtxt('data.txt',
                     skiprows=1,
                     unpack=True)
In the code above, you use loadtxt() to load the data in your
environment. You see that the first argument that the function takes is the
text file data.txt. Next, there are some specific arguments: you skip the
first row with skiprows=1 and you return the columns as
separate arrays with unpack=True. This means that the values in
column Value1 will be put in x, and so on.
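To try this without a file on disk, here is a minimal sketch that feeds loadtxt() an in-memory text buffer. The column names and values are hypothetical stand-ins for the contents of data.txt.

```python
import io
import numpy as np

# Hypothetical file contents: one header row, three whitespace-separated columns
text = io.StringIO(
    "Value1 Value2 Value3\n"
    "0.1 1.0 10.0\n"
    "0.2 2.0 20.0\n"
    "0.3 3.0 30.0\n"
)

# Skip the header row and unpack the columns into separate arrays
x, y, z = np.loadtxt(text, skiprows=1, unpack=True)

print(x)  # [0.1 0.2 0.3]
print(z)  # [10. 20. 30.]
```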
my_array2 = np.genfromtxt('data2.txt',
                          skip_header=1,
                          filling_values=-999)
You see that here, you resort to genfromtxt() to load the data. In this
case, you have to handle some missing values that are indicated by
the 'MISSING' strings. Since the genfromtxt() function converts
character strings in numeric columns to nan, you can convert these values
to other ones by specifying the filling_values argument. In this case,
you choose to set the value of these missing values to -999.
If, by any chance, you have values that don’t get converted
to nan by genfromtxt(), there’s always the missing_values argument
that allows you to specify what the missing values of your data exactly are.
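The following sketch shows filling_values at work on a small in-memory example. The data is hypothetical: it is comma-separated and marks the missing entry with an empty field rather than with a 'MISSING' string, so it only approximates the file discussed above.

```python
import io
import numpy as np

# Hypothetical comma-separated data with one empty (missing) field
text = io.StringIO(
    "Value1,Value2,Value3\n"
    "1.0,,3.0\n"
    "4.0,5.0,6.0\n"
)

# Skip the header row and replace missing entries with -999
data = np.genfromtxt(text, delimiter=",", skip_header=1, filling_values=-999)

print(data)
```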
Tip: check out this page to see what other arguments you can add to
import your data successfully.
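You now might wonder what the difference between these two functions
really is. In short, loadtxt() expects complete, regularly structured data,
while genfromtxt() offers more options for dealing with missing values.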
x = np.arange(0.0,5.0,1.0)
np.savetxt('test.out', x, delimiter=',')
Remember that np.arange() creates a NumPy array of evenly-spaced
values. The third value that you pass to this function is the step value.
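A quick way to convince yourself that the file contains what you expect is to load it again. The sketch below writes to a temporary directory instead of the working directory; that choice, and the variable names, are assumptions made for the example.

```python
import os
import tempfile
import numpy as np

x = np.arange(0.0, 5.0, 1.0)  # [0. 1. 2. 3. 4.]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.out")
    # Save the array as text, then read it back in
    np.savetxt(path, x, delimiter=",")
    loaded = np.loadtxt(path, delimiter=",")

print(np.array_equal(x, loaded))  # True
```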
There are, of course, other ways to save your NumPy arrays to text files.
Check out the functions in the table below if you want to get your data to
binary files or archives:
save() Save an array to a binary file in NumPy .npy format
For more information or examples of how you can use the above functions
to save your data, go here or make use of one of the help functions that
NumPy has to offer to get to know more instantly!
Are you not sure what these NumPy help functions are?
No worries! You’ll learn more about them in one of the next sections!
# Print the number of `my_array`'s dimensions
print(my_array.ndim)
# Print the number of `my_array`'s elements
print(my_array.size)
# Print information about `my_array`'s memory layout
print(my_array.flags)
# Print the length of one array element in bytes
print(my_array.itemsize)
# Print the total consumed bytes by `my_array`'s elements
print(my_array.nbytes)
These are almost all the attributes that an array can have.
Don’t worry if you don’t feel that all of them are useful for you at this
point. This is fairly normal because, just like you read in the previous
section, you’ll only get to worry about memory when you’re working with
large data sets.
Also note that, besides the attributes, you also have some other ways of
gaining more information on and even tweaking your array slightly:
# Print the length of `my_array`
print(len(my_array))
# Convert `my_array` to floats; `astype()` returns a new array and leaves `my_array` unchanged
my_array.astype(float)
Now that you have made your array, either by making one yourself with
the np.array() or one of the initial placeholder functions, or by loading
in your data through the loadtxt() or genfromtxt() functions, it’s
time to look more closely into the second key element that really defines
the NumPy library: scientific computing.
However, there are some rules if you want to use broadcasting. And, before you
already sigh, you’ll see that these “rules” are very simple and kind of
straightforward!
# Import `numpy` as `np`
import numpy as np
# Initialize `x`
x = np.ones((3,4))
# Check shape of `x`
print(x.shape)
# Initialize `y`
y = np.random.random((3,4))
# Check shape of `y`
print(y.shape)
# Add `x` and `y`
x + y
# Import `numpy` as `np`
import numpy as np
# Initialize `x`
x = np.ones((3,4))
# Check shape of `x`
print(x.shape)
# Initialize `y`
y = np.arange(4)
# Check shape of `y`
print(y.shape)
# Subtract `x` and `y`
x - y
Note that if the dimensions are not compatible, you will get
a ValueError.
Tip: also test what the size of the resulting array is after you have done the
computations! You’ll see that the size is actually the maximum size along
each dimension of the input arrays.
In other words, you see that the result of x-y gives an array with
shape (3,4): y had a shape of (4,) and x had a shape of (3,4). The
maximum size along each dimension of x and y is taken to make up the
shape of the new, resulting array.
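The two cases described above, compatible and incompatible shapes, can be sketched as follows. The array `bad` and the variable `error_message` are names introduced here only for illustration.

```python
import numpy as np

x = np.ones((3, 4))
y = np.arange(4)            # shape (4,)

# Compatible: the trailing dimensions match (4 and 4)
result = x - y
print(result.shape)         # (3, 4)

# Incompatible: the trailing dimensions differ (4 and 3)
bad = np.arange(3)          # shape (3,)
error_message = None
try:
    x - bad
except ValueError as err:
    error_message = str(err)

print(error_message)
```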
# Import `numpy` as `np`
import numpy as np
# Initialize `x` and `y`
x = np.ones((3,4))
y = np.random.random((5,1,4))
# Add `x` and `y`
x + y
You see that, even though x and y seem to have somewhat different
dimensions, the two can be added together.
Since you have seen above that dimensions are also compatible if one of
them is equal to 1, you see that these two arrays are indeed good
candidates for broadcasting!
What you will notice is that in the dimension where y has size 1 and the
other array has a size greater than 1 (that is, 3), y behaves as if
it were copied along that dimension.
Note that the shape of the resulting array will again be the maximum size
along each dimension of x and y: the shape of the result will
be (5,3,4).
In short, if you want to make use of broadcasting, you will rely a lot on the
shape and dimensions of the arrays with which you’re working.
If the dimensions are not compatible, you’ll have to fix this by
manipulating your array! You’ll see how to do
this in one of the next sections.
How Do Array Mathematics Work?
You’ve seen that broadcasting is handy when you’re doing arithmetic
operations. In this section, you’ll discover some of the functions that you
can use to do mathematics with arrays.
You can also easily do exponentiation and take the square root of your
arrays with np.exp() and np.sqrt(), or calculate the sines or cosines of
your array with np.sin() and np.cos(). Lastly, it’s also useful to
mention that you can calculate the natural logarithm
with np.log() or calculate the dot product by applying dot() to your
array.
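Here is a small sketch of these functions in action. The values of `x` and `y` are assumptions, chosen so that the results are easy to verify by hand.

```python
import numpy as np

x = np.array([[1.0, 4.0], [9.0, 16.0]])
y = np.array([[2.0, 2.0], [3.0, 4.0]])

# Element-wise square root
roots = np.sqrt(x)
print(roots)       # [[1. 2.] [3. 4.]]

# Element-wise exponential and natural logarithm are each other's inverse
round_trip = np.log(np.exp(x))

# Element-wise sine and cosine
sines, cosines = np.sin(x), np.cos(x)

# Dot product (matrix product for 2-D arrays)
product = x.dot(y)
print(product)     # [[14. 18.] [66. 82.]]
```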
Just a tip: make sure to first check out the arrays that have been loaded for
this exercise!
# Add `x` and `y`
np.add(x,y)
# Subtract `x` and `y`
np.subtract(x,y)
# Multiply `x` and `y`
np.multiply(x,y)
# Divide `x` and `y`
np.divide(x,y)
# Calculate the remainder of `x` and `y`
np.remainder(x,y)
Remember how broadcasting works? Check out the dimensions and the
shapes of both x and y in your IPython shell. Are the rules of broadcasting
respected?
a.mean()        Mean
np.median(b)    Median
Besides all of these functions, you might also find it useful to know that
there are mechanisms that allow you to compare array elements. For
example, if you want to check whether the elements of two arrays are the
same, you might use the == operator. To check whether the array elements
are smaller or bigger, you use the < or > operators.
However, you can also compare entire arrays with each other! In this case,
you use the np.array_equal() function. Just pass in the two arrays that
you want to compare with each other and you’re done.
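The difference between element-wise comparison and whole-array comparison can be sketched like this. The arrays `a` and `b` are made-up examples.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 5, 3])

# Element-wise comparisons return a boolean array
equal_mask = (a == b)
smaller_mask = (a < b)
print(equal_mask)    # [ True False  True]
print(smaller_mask)  # [False  True False]

# Whole-array comparison returns a single boolean
same_as_b = np.array_equal(a, b)
same_as_copy = np.array_equal(a, a.copy())
print(same_as_b)     # False
print(same_as_copy)  # True
```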
Note that, besides comparing, you can also perform logical operations on
your arrays. You can start
with np.logical_or(), np.logical_not() and np.logical_and().
These basically work like your typical OR, NOT and AND logical
operations.
In the simplest example, you use OR to see whether at least one of the two
array elements is 1. If both of
them are 0, you’ll get False. You would use AND to see whether both
elements are 1 and NOT to see whether an element differs
from 1.
# `a` AND `b`
np.logical_and(a, b)
# `a` OR `b`
np.logical_or(a, b)
# NOT `a` (`np.logical_not()` takes a single input array)
np.logical_not(a)
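To see the three logical functions on concrete data, consider this sketch. The arrays `a` and `b` are assumptions that cover all four combinations of 0 and 1.

```python
import numpy as np

a = np.array([1, 1, 0, 0])
b = np.array([1, 0, 1, 0])

both = np.logical_and(a, b)    # True only where both elements are nonzero
either = np.logical_or(a, b)   # True where at least one element is nonzero
not_a = np.logical_not(a)      # True where the element of `a` is zero

print(both)    # [ True False False False]
print(either)  # [ True  True  True False]
print(not_a)   # [False False  True  True]
```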
Subsetting, slicing and indexing NumPy arrays is very similar to performing
these operations on Python
lists. If you want to check out the similarities for yourself, or if you want a
more elaborate explanation, you might consider checking out
DataCamp’s Python list tutorial.
If you have no clue at all on how these operations work, it suffices for now
to know these two basic things:
You use square brackets [] as the index operator, and
Generally, you pass integers to these square brackets, but you can
also put a colon : or a combination of the colon with integers in it to
designate the elements/rows/columns you want to select.
Besides these two points, the easiest way to see how this all fits
together is by looking at some examples of subsetting:
# Select the element at index 1
print(my_array[1])
# Select the element at row 1 column 2
print(my_2d_array[1][2])
# Select the element at row 1 column 2
print(my_2d_array[1,2])
# Select the element at row 1, column 2 of the block at index 1
print(my_3d_array[1,1,2])
Something a little bit more advanced than subsetting, if you will, is slicing.
Here, you consider not just particular values of your arrays, but you go to
the level of rows and columns. You’re basically working with “regions” of
data instead of pure “locations”.
You can see what is meant with this analogy in these code examples:
# Select items at index 0 and 1
print(my_array[0:2])
# Select items at row 0 and 1, column 1
print(my_2d_array[0:2,1])
# Select items at row 1
# This is the same as saying `my_3d_array[1,:,:]`
print(my_3d_array[1,...])
Lastly, there’s also indexing. When it comes to NumPy, there are boolean
indexing and advanced or “fancy” indexing.
(In case you’re wondering, this is true NumPy jargon, I didn’t make the
last one up!)
# Try out a simple example
print(my_array[my_array<2])
# Specify a condition
bigger_than_3 = (my_3d_array >= 3)
# Use the condition to index our 3d array
print(my_3d_array[bigger_than_3])
Note that, to specify a condition, you can also make use of the logical
operators | (OR) and & (AND). If you wanted to rewrite the condition
above in such a way (which would be inefficient, but I demonstrate it here
for educational purposes :)), you would get bigger_than_3 =
(my_3d_array > 3) | (my_3d_array == 3).
With the arrays that have been loaded in, there aren’t too many
possibilities, but with arrays that contain for example, names or capitals,
the possibilities could be endless!
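As a sketch of how such combined conditions behave, the example below checks that the | version selects the same elements as the >= version, and adds an & condition for comparison. The array matches the 3-D array used in this tutorial; the names `combined`, `direct` and `in_range` are introduced here for illustration.

```python
import numpy as np

my_3d_array = np.array([[[1, 2, 3, 4], [5, 6, 7, 8]],
                        [[1, 2, 3, 4], [9, 10, 11, 12]]], dtype=np.int64)

# OR: greater than 3, or equal to 3
combined = my_3d_array[(my_3d_array > 3) | (my_3d_array == 3)]
direct = my_3d_array[my_3d_array >= 3]
print(np.array_equal(combined, direct))  # True

# AND: at least 3 and at most 6
in_range = my_3d_array[(my_3d_array >= 3) & (my_3d_array <= 6)]
print(in_range)  # [3 4 5 6 3 4]
```

Note that the parentheses around each condition are needed, because | and & bind more tightly than the comparison operators.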
When it comes to fancy indexing, what you basically do is the
following: you pass a list or an array of integers to specify the order of the
subset of rows you want to select out of the original array.
# Select elements at (1,0), (0,1), (1,2) and (0,0)
print(my_2d_array[[1, 0, 1, 0],[0, 1, 2, 0]])
# Select a subset of the rows and columns
print(my_2d_array[[1, 0, 1, 0]][:,[0,1,2,0]])
Now, the second statement might seem to make less sense to you at first
sight. This is normal. It might make more sense if you break it down:
the first part, [[1, 0, 1, 0]], selects rows 1, 0, 1 and 0 of my_2d_array,
in that order, which gives you the following result:
array([[5, 6, 7, 8],
       [1, 2, 3, 4],
       [5, 6, 7, 8],
       [1, 2, 3, 4]])
What the second part, namely, [:,[0,1,2,0]], does is tell NumPy that you
want to keep all the rows of this result, but that you want to change
the order of the columns around a bit. You want to display the
columns 0, 1, and 2 as they are right now, but you want to repeat
column 0 as the last column instead of displaying column number 3.
This will give you the following result:
array([[5, 6, 7, 5],
       [1, 2, 3, 1],
       [5, 6, 7, 5],
       [1, 2, 3, 1]])
You just make use of the specific help functions that numpy offers to set
you on your way, np.lookfor() and np.info():
You see, both functions have their advantages and disadvantages, but
you’ll see for yourself why both of them can be useful: try them out for
yourself in the DataCamp Light code chunk below!
# Look up info on `mean` with `np.lookfor()`
# (note: `np.lookfor()` prints its results itself and was removed in NumPy 2.0)
np.lookfor("mean")
# Get info on data types with `np.info()`
np.info(np.ndarray.dtype)
Note that you indeed need to know that dtype is an attribute of ndarray.
Also, make sure that you don’t forget to put np in front of the modules,
classes or terms you’re asking information about, otherwise you will get
an error message like this:
Traceback (most recent call last):
You now know how to ask for help, and that’s a good thing. The next topic
that this NumPy tutorial covers is array manipulation.
Not that you can’t master this topic on your own, quite the contrary!
But some of the functions might raise questions: what is the
difference between resizing and reshaping?
And what is the difference between stacking your arrays horizontally and
vertically?
The next section is all about answering these questions, but if you ever feel
in doubt, feel free to use the help functions that you have just seen to
quickly get up to speed.
Below are some of the most common manipulations that you’ll be doing.
# Print `my_2d_array`
print(my_2d_array)
# Transpose `my_2d_array`
print(np.transpose(my_2d_array))
# Or use `T` to transpose `my_2d_array`
print(my_2d_array.T)
Tip: if the visual comparison between the array and its transposed version
is not entirely clear, inspect the shape of the two arrays to make sure that
you understand why the dimensions are permuted.
Note that there are two ways to transpose. Both do the same; there isn’t
too much difference. You do have to take into account that T is more
of a convenience attribute and that you have a lot more flexibility
with np.transpose(). That’s why it’s recommended to make use of this
function if you want to pass more arguments.
All is well when you transpose arrays that are bigger than one dimension,
but what happens when you just have a 1-D array? Will there be any
effect, you think?
Try it out for yourself in the code chunk below. Your 1-D array has
already been loaded in:
# Print `my_array`
print(my_array)
# Transpose `my_array`
print(np.transpose(my_array))
# Or use `T` to transpose `my_array`
print(my_array.T)
What you can do if the arrays don’t have the same dimensions, is resize
your array. np.resize() returns a new array that has the shape that you
passed to it. If you pass your original array
together with the new dimensions, and if that new array is larger than the
one that you originally had, the new array will be filled with copies of the
original array that are repeated as many times as is needed.
However, if you call the resize() method on the array itself and you pass the
new shape to it, the array is resized in place and the new positions are
filled with zeros.
# Print the shape of `x`
print(x.shape)
# Resize `x` to ((6,4))
np.resize(x, (6,4))
# Try out this as well
x.resize((6,4))
# Print out `x`
print(x)
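The difference between the np.resize() function and the resize() method is easier to see with values that are not all the same. The sketch below uses a made-up array `a`; `refcheck=False` is passed only so that the in-place resize also works in interactive shells that may hold extra references to the array.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)   # [[0 1 2] [3 4 5]]

# np.resize() returns a new array and repeats the original values
repeated = np.resize(a, (4, 3))
print(repeated)
# [[0 1 2]
#  [3 4 5]
#  [0 1 2]
#  [3 4 5]]

# The resize() method changes the array in place and pads with zeros
b = a.copy()
b.resize((4, 3), refcheck=False)
print(b)
# [[0 1 2]
#  [3 4 5]
#  [0 0 0]
#  [0 0 0]]
```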
Besides resizing, you can also reshape your array. This means that you
give a new shape to an array without changing its data. The key to
reshaping is to make sure that the total size of the new array is unchanged.
If you take the example of array x that was used above, which has a size of
3 X 4 or 12, you have to make sure that the new array also has a size of 12.
Psst… If you want to calculate the size of an array with code, make sure to
use the size attribute: x.size or x.reshape((2,6)).size:
# Print the size of `x` to see what's possible
print(x.size)
# Reshape `x` to (2,6)
print(x.reshape((2,6)))
# Flatten `x`
z = x.ravel()
# Print `z`
print(z)
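The rule that the total size must stay the same can be sketched as follows: a shape with 12 elements works, a shape with 10 elements does not. The variable names are chosen here for illustration.

```python
import numpy as np

x = np.ones((3, 4))
print(x.size)                   # 12

# 2 * 6 = 12 elements: reshaping works
reshaped = x.reshape((2, 6))
print(reshaped.shape)           # (2, 6)

# Flattening keeps all 12 elements in one dimension
flat = x.ravel()
print(flat.shape)               # (12,)

# 5 * 2 = 10 elements: reshaping fails
reshape_failed = False
try:
    x.reshape((5, 2))
except ValueError:
    reshape_failed = True
print(reshape_failed)           # True
```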
If all else fails, you can also append an array to your original one or insert
or delete array elements to make sure that your dimensions fit with the
other array that you want to use for your computations.
Another operation that you might keep handy when you’re changing the
shape of arrays is ravel(). This function allows you to flatten your
arrays. This means that if you ever have 2D, 3D or n-D arrays, you can
just use this function to flatten it all out to a 1-D array.
Check how it’s done in the code chunk below. Don’t forget that you can
always check which arrays are loaded in by typing, for
example, my_array in the IPython shell and pressing ENTER.
# Append a 1D array to your `my_array`
new_array = np.append(my_array, [7, 8, 9, 10])
# Print `new_array`
print(new_array)
# Append an extra column to your `my_2d_array`
new_2d_array = np.append(my_2d_array, [[7], [8]], axis=1)
# Print `new_2d_array`
print(new_2d_array)
# Insert `5` at index 1
np.insert(my_array, 1, 5)
# Delete the value at index 1
np.delete(my_array,[1])
# Concatenate `my_array` and `x`
print(np.concatenate((my_array,x)))
# Stack arrays row-wise
print(np.vstack((my_array, my_2d_array)))
# Stack arrays row-wise
print(np.r_[my_resized_array, my_2d_array])
# Stack arrays horizontally
print(np.hstack((my_resized_array, my_2d_array)))
# Stack arrays column-wise
print(np.column_stack((my_resized_array, my_2d_array)))
# Stack arrays column-wise
print(np.c_[my_resized_array, my_2d_array])
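Since the arrays in the chunk above were preloaded, here is a self-contained sketch with two small 2-D arrays (made up for the example) that shows which of these functions give the same result.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Row-wise: the rows of `b` are placed below the rows of `a`
stacked_rows = np.vstack((a, b))
print(stacked_rows.shape)    # (4, 2)

# Column-wise: the columns of `b` are placed next to the columns of `a`
stacked_cols = np.hstack((a, b))
print(stacked_cols.shape)    # (2, 4)

# For 2-D inputs, these alternatives produce the same arrays
same_rows = np.array_equal(np.r_[a, b], stacked_rows)
same_cols = (np.array_equal(np.c_[a, b], stacked_cols)
             and np.array_equal(np.column_stack((a, b)), stacked_cols))
print(same_rows, same_cols)  # True True
```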
When you have joined arrays, you might also want to split them at some
point. Just like you can stack them horizontally and vertically, you can
also split them horizontally and vertically. You use np.hsplit() and np.vsplit(),
respectively:
# Split `my_stacked_array` horizontally into 2 equal parts
print(np.hsplit(my_stacked_array, 2))
# Split `my_stacked_array` vertically into 2 equal parts
print(np.vsplit(my_stacked_array, 2))
What you need to keep in mind when you’re using both of these split
functions is probably the shape of your array. Let’s take the above case as
an example: my_stacked_array has a shape of (2,8). If you want to
select the index at which you want the split to occur, you have to keep the
shape in mind.
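To make the role of the shape concrete, here is a sketch with a stand-in array of shape (2,8). The values are hypothetical; only the shape follows the example above. Passing an integer splits into that many equal parts, while passing a list splits at the given indices.

```python
import numpy as np

# Hypothetical stand-in with the same shape as `my_stacked_array`
my_stacked_array = np.arange(16).reshape(2, 8)

# Two equal parts along the columns: each part has shape (2, 4)
left, right = np.hsplit(my_stacked_array, 2)
print(left.shape, right.shape)    # (2, 4) (2, 4)

# Two equal parts along the rows: each part has shape (1, 8)
top, bottom = np.vsplit(my_stacked_array, 2)
print(top.shape, bottom.shape)    # (1, 8) (1, 8)

# Split before column indices 2 and 6
first, middle, last = np.hsplit(my_stacked_array, [2, 6])
print(first.shape, middle.shape, last.shape)  # (2, 2) (2, 4) (2, 2)
```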
With np.histogram()
Contrary to what the function name might suggest,
the np.histogram() function doesn’t draw the histogram, but it does
compute the number of array values that fall within each bin. This will
determine the area that each bar of your histogram takes up.
What you pass to the np.histogram() function then is first the input
data or the array that you’re working with. The array will be flattened
when the histogram is computed.
# Import `numpy` as `np`
import numpy as np
# Initialize your array
my_3d_array = np.array([[[1,2,3,4], [5,6,7,8]], [[1,2,3,4], [9,10,11,12]]], dtype=np.int64)
# Pass the array to `np.histogram()`
print(np.histogram(my_3d_array))
# Specify the bin edges
print(np.histogram(my_3d_array, bins=range(0,13)))
You’ll see that as a result, the histogram will be computed: the first array
lists the number of elements of your array that fall within each bin, while
the second array lists the edges of the bins. In the first call, these are
the default bins that are used if you don’t specify any bins.
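The two returned arrays can be unpacked and inspected separately. In the sketch below, the names `counts` and `edges` are chosen for illustration; note that the last bin includes its right edge, which is why it counts both 11 and 12.

```python
import numpy as np

my_3d_array = np.array([[[1, 2, 3, 4], [5, 6, 7, 8]],
                        [[1, 2, 3, 4], [9, 10, 11, 12]]], dtype=np.int64)

# 13 edges (0 through 12) define 12 bins
counts, edges = np.histogram(my_3d_array, bins=range(0, 13))

print(counts)  # [0 2 2 2 2 1 1 1 1 1 1 2]
print(edges)   # [ 0  1  2  3  4  5  6  7  8  9 10 11 12]
```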
There are still some other arguments that you can specify that can
influence the histogram that is computed. You can find all of them here.
But what is the point of computing such a histogram if you can’t visualize
it?
Visualization is a piece of cake with the help of Matplotlib, and you don’t
need np.histogram() to compute the histogram. plt.hist() does this
by itself when you pass it the (flattened) data and the bins:
import numpy as np
import matplotlib.pyplot as plt
# Pass the flattened array and the range of bins
plt.hist(my_3d_array.ravel(), bins=range(0,13))
# Show the plot
plt.show()
The above code will then give you the following (basic) histogram:
Using np.meshgrid()
Another way to (indirectly) visualize your array is by
using np.meshgrid(). The problem that you face with arrays is that you
need 2-D arrays of x and y coordinate values. With the above function,
you can create a rectangular grid out of an array of x values and an array
of y values: the np.meshgrid() function takes two 1D arrays and
produces two 2D matrices corresponding to all pairs of (x, y) in the two
arrays. Then, you can use these matrices to make all sorts of plots.
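What "all pairs of (x, y)" means is easiest to see with very small inputs. The arrays below are made up for the example.

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([10, 20])

# Each output has one row per y value and one column per x value
xs, ys = np.meshgrid(x, y)

print(xs)
# [[0 1 2]
#  [0 1 2]]
print(ys)
# [[10 10 10]
#  [20 20 20]]
```

Reading `xs` and `ys` at the same position gives one (x, y) pair of the grid, for example (2, 20) at row 1, column 2.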
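import numpy as np
import matplotlib.pyplot as plt
# Create an array (the range of values here is assumed for illustration)
points = np.arange(-5, 5, 0.01)
# Make a meshgrid
xs, ys = np.meshgrid(points, points)
z = np.sqrt(xs ** 2 + ys ** 2)
plt.imshow(z, cmap=plt.cm.gray)
plt.colorbar()
plt.show()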
You have covered a lot of ground, so now you have to make sure to retain
the knowledge that you have gained. Don’t forget to get your copy of
DataCamp’s NumPy cheat sheet to support you in doing this!
After all this theory, it’s also time to get some more practice with the
concepts and techniques that you have learned in this tutorial. One way to
do this is to go back to the scikit-learn tutorial and start experimenting
further with the data arrays that are used to build machine learning
models.
If this is not your cup of tea, check again whether you have downloaded
Anaconda. Then, get started with NumPy arrays in Jupyter with
this Definitive Guide to Jupyter Notebook. Also make sure to check
out this Jupyter Notebook, which also guides you through data analysis in
Python with NumPy and some other libraries in the interactive data
science environment of the Jupyter Notebook.
https://www.datacamp.com/community/tutorials/python-numpy-tutorial