Introduction - IPython Interactive Computing and Visualization Cookbook - INTRODUCTION
Introduction
In this introduction, we will give a broad overview of Python, IPython, Jupyter, and the scientific
Python ecosystem.
What is Python?
In the last twenty years, Python has been increasingly used for scientific computing and data
analysis. Competing platforms include commercial software such as MATLAB,
Maple, Mathematica, Excel, SPSS, SAS, and others. Competing open-source platforms include
Julia, R, Octave, and Scilab. These tools are dedicated to scientific computing, whereas Python is a
general-purpose programming language that was not initially designed for scientific computing.
However, a wide ecosystem of tools has been developed to bring Python to the level of these
other scientific computing systems. Today, the main advantage of Python, and one of the main
reasons why it is so popular, is that it brings scientific computing features to a general-purpose
language that is used in many research areas and industries. This makes the transition from
research to production much easier.
What is IPython?
IPython is a Python library that was originally meant to improve the default interactive console
provided by Python, and to make it scientist-friendly. In 2011, ten years after the first release of
IPython, the IPython Notebook was introduced. This web-based interface to IPython combines
https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781785888632/1/ch01lvl1sec10/introduction 1/7
14/08/2019 Introduction - IPython Interactive Computing and Visualization Cookbook - Second Edition
code, text, mathematical expressions, inline plots, interactive figures, widgets, graphical
interfaces, and other rich media within a standalone sharable web document. This platform
provides an ideal gateway to interactive scientific computing and data analysis. IPython has
become essential to researchers, engineers, data scientists, and teachers and their students.
What is Jupyter?
Within a few years, IPython gained tremendous popularity among the scientific and engineering
communities. The Notebook started to support more and more programming languages beyond
Python. In 2014, the IPython developers announced the Jupyter project, an initiative created to
improve the implementation of the Notebook and make it language-agnostic by design. The name
of the project reflects the importance of three of the main scientific computing languages
supported by the Notebook: Julia, Python, and R.
SciPy is the name of a Python package for scientific computing, but it also refers, more generally,
to the collection of all Python tools that have been developed to bring scientific computing
features to Python.
In the late 1990s, Travis Oliphant and others started to build efficient tools to deal with numerical
data in Python: Numeric, Numarray, and finally, NumPy. SciPy, which implements many numerical
computing algorithms, was also created on top of NumPy. In the early 2000s, John Hunter created
Matplotlib to bring scientific graphics to Python. At the same time, Fernando Perez created
IPython to improve interactivity and productivity in Python. In the late 2000s, Wes McKinney
created pandas for the manipulation and analysis of numerical tables and time series. Since then,
hundreds of engineers and researchers have worked collaboratively on this platform to make SciPy one
of the leading open-source platforms for scientific computing and data science.
Note
Many of the SciPy tools are supported by NumFOCUS, a nonprofit that was created as a legal structure to
promote the sustainable development of the ecosystem. NumFOCUS is supported by several large
companies including Microsoft, IBM, and Intel.
SciPy has its own conferences, too: SciPy (in the US) and EuroSciPy (in Europe) (see
https://conference.sci).
What are some of the main changes in the SciPy ecosystem since the first edition of this book,
published in 2014? Here is a very brief selection.
Note
Feel free to skip this section if you are new to the platform.
The latest version of IPython at the time of writing is IPython 6.0, released in April 2017. It is the
first version of IPython that is no longer compatible with Python 2. This decision allowed the
developers to make the internal code simpler and to make better use of the new features of the
language.
IPython now has a web-based Terminal interface that can be used along with notebooks.
Keyboard shortcuts can be edited directly from the Notebook interface. Multiple cells can be
selected and copy/pasted between notebooks. There is a new restart-and-run-all button and a
find-and-replace option in the Notebook. See
http://ipython.readthedocs.io/en/stable/whatsnew/version6.html for more details.
NumPy, whose latest version, 1.13, was released in June 2017, now supports the @ matrix
multiplication operator between matrices (matrix multiplication was previously accessible via the
np.dot() function). Operations such as a + b + c use less memory and are faster on some systems
(temporary elision). The new np.block() function lets one define block matrices. The new
np.stack() function joins a sequence of arrays along a new axis. See
https://docs.scipy.org/doc/numpy-1.13.0/release.html for more details.
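The NumPy features mentioned above can be illustrated with a minimal sketch. The arrays a and b are made-up values chosen for illustration only.

```python
import numpy as np

# Two small made-up matrices for illustration.
a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

# The @ operator computes the matrix product, like np.dot() for 2D arrays.
product = a @ b
same_as_dot = np.array_equal(product, np.dot(a, b))

# np.block() assembles a larger matrix from a nested list of blocks.
zeros = np.zeros((2, 2), dtype=int)
blocks = np.block([[a, zeros],
                   [zeros, b]])

# np.stack() joins arrays along a NEW axis: two (2, 2) arrays give (2, 2, 2).
stacked = np.stack([a, b], axis=0)

print(product)
print(blocks.shape)
print(stacked.shape)
```

Note that np.stack() differs from np.concatenate(), which joins arrays along an existing axis.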
SciPy 1.0 was released in October 2017. For the developers, the 1.0 version means that the library
has reached some stability and maturity after 16 years of development. See
https://docs.scipy.org/doc/scipy/reference/release.html for more details.
Matplotlib, version 2.1 of which was released in October 2017, has improved styling and a
much better default color palette, with the viridis colormap replacing jet. See
https://github.com/matplotlib/matplotlib/releases for more details.
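To compare the two colormaps side by side, one could draw the same image twice, as in the following sketch. The data (a 2D Gaussian bump) is made up for illustration, and the non-interactive Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np

# Made-up data for illustration: a 2D Gaussian bump.
x = np.linspace(-2, 2, 50)
xx, yy = np.meshgrid(x, x)
z = np.exp(-(xx ** 2 + yy ** 2))

# Draw the same image with the newer default (viridis) and the old one (jet).
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
im1 = ax1.imshow(z, cmap="viridis")
ax1.set_title("viridis")
im2 = ax2.imshow(z, cmap="jet")
ax2.set_title("jet")

# Call fig.savefig() with a file name of your choice to write the figure to disk.
plt.close(fig)
```

In a Jupyter notebook, the backend line and the plt.close() call can be dropped so that the figure is displayed inline.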
pandas 0.21 was released in October 2017. pandas now supports categorical data. Several
features have been deprecated in recent years, including the .ix syntax and Panels
(which may be replaced with the xarray library). See
https://pandas.pydata.org/pandas-docs/stable/release.html for more details.
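As a brief sketch of these pandas changes, the following example converts a column to the categorical type and uses .loc and .iloc, the label-based and position-based indexers that cover the use cases of .ix. The table contents are made up for illustration.

```python
import pandas as pd

# A small made-up table for illustration.
df = pd.DataFrame(
    {"city": ["Paris", "Lyon", "Paris", "Nice"],
     "temp": [21.0, 24.5, 19.0, 27.0]},
    index=["a", "b", "c", "d"],
)

# Convert a string column to the categorical dtype.
df["city"] = df["city"].astype("category")
categories = list(df["city"].cat.categories)

# .loc selects by label, .iloc selects by integer position.
by_label = df.loc["b", "temp"]
by_position = df.iloc[1, 1]

print(categories)
print(by_label, by_position)
```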
Anaconda comes with Python, IPython, Jupyter, NumPy, SciPy, pandas, Matplotlib, and almost all
of the other scientific packages we will be using in this book. The list of all packages is available at
https://docs.anaconda.com/anaconda/packages/pkg-docs.
Note
Miniconda is a light version of Anaconda with only Python and a few other essential packages. You can install
only the packages you need one by one using the conda package manager of Anaconda.
In this book, we won't cover the various other ways of installing a scientific Python distribution.
The Anaconda website should give you all the instructions to install Anaconda on your system. To
install new packages, you can use the conda package manager that comes with Anaconda. For
example, to install the ipyparallel package (which is currently not installed by default in
Anaconda), type conda install ipyparallel in a system shell.
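After installing a package, one way to check from Python that it can be imported is sketched below. The helper name is_installed is hypothetical and introduced only for this illustration; it relies on the standard library function importlib.util.find_spec().

```python
import importlib.util


def is_installed(module_name):
    """Return True if the import system can find a top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Report the status of a few packages; the result depends on your environment.
for name in ["numpy", "ipyparallel"]:
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

This only tells whether the module can be found, not which version is installed.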
Note
A short introduction to system shells is given in the Learning the basics of the Unix shell section of
Chapter 2, Best Practices in Interactive Computing.
Note
GitHub is a commercial service that provides free and paid hosting for software repositories. It is one of the
most popular platforms for open source collaborative development.
pip is the Python package manager. Contrary to conda, pip works with any Python distribution, not
just with Anaconda. Packages installable by pip are stored on the Python Package Index (PyPI)
available at https://pypi.python.org/pypi.
Almost all Python packages available in conda are also available in pip, but the inverse is not true.
In practice, if a package is not available in conda or conda-forge, it should be available with pip
install somepackage. conda packages typically include binaries compiled for the most common
platforms, whereas that is not necessarily the case with pip packages. pip packages may contain
source code that has to be compiled locally (which requires that a compatible compiler be installed
and configured), but they may also contain compiled binaries.
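When several Python installations coexist on a system, a common way to make sure pip installs into the interpreter you are actually running is to invoke it as python -m pip. The following sketch builds that command from Python; the helper names pip_install_command and pip_install are hypothetical, and somepackage is the placeholder name used above, not a real package.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the pip command bound to the currently running interpreter."""
    return [sys.executable, "-m", "pip", "install", package]


def pip_install(package):
    """Run pip for the current interpreter; raises if the install fails."""
    subprocess.check_call(pip_install_command(package))


# Only display the command here; call pip_install() to actually run it.
print(" ".join(pip_install_command("somepackage")))
```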
References
Python on Wikipedia at https://en.wikipedia.org/wiki/Python_%28programming_language%29
JupyterCon at https://conferences.oreilly.com/jupyter/jup-ny