Lec 1 Introduction to Python

Uploaded by

gaurshetty
Introduction to Python

In this lecture
 Data science
 Tools for data science
 History of Python
 Python IDEs

Python for Data Science 2


Introduction
 We live in a world that’s drowning in
data
 Data is generated from various sources
◦ Websites track every user’s every click
◦ Your smartphone is building up a record of
your location
◦ Sensors from electronic devices record
real time information
◦ E-commerce websites collect purchasing
habits



Data science
 Interdisciplinary field that brings
together computer science,
statistics and mathematics to
extract useful insights from data

 Analyzing data and generating
insights from it aids in arriving at
better business decisions



Popular tools used in data science
 Data pre-processing and analysis
◦ Python, R, Microsoft Excel, SAS, SPSS

 Data exploration and visualization
◦ Tableau, QlikView, Microsoft Excel

 Parallel and distributed computing
in case of big data
◦ Apache Spark, Apache Hadoop



Evolution of Python
 Python was developed by
Guido van Rossum in the late
eighties at the ‘National
Research Institute for
Mathematics and Computer
Science’ (CWI) in the Netherlands
 Python Editions
◦ First public release (0.9.0) - 1991
◦ Python 1.0 - 1994
◦ Python 2.0 - 2000
◦ Python 3.0 - 2008 (Python 3.7 was the
latest at the time of this lecture)
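
Because several editions of Python exist side by side, it can help to check which one is actually running. The sketch below uses only the standard `sys` module. The helper names `describe_version` and `is_python3` are made up for this illustration.

```python
import sys


def describe_version(info=sys.version_info):
    """Return a short label such as 'Python 3.7' for a version tuple."""
    major, minor = info[0], info[1]
    return f"Python {major}.{minor}"


def is_python3(info=sys.version_info):
    """Return True when the version tuple belongs to the Python 3 series or later."""
    return info[0] >= 3


# Report the interpreter that is executing this script.
print(describe_version())
print("Python 3 series:", is_python3())
```

Passing an explicit tuple, for example `describe_version((2, 7, 18))`, lets the same helpers describe a version other than the running one.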



Advantages of using Python
 Python has several features that
make it well suited for data
science

 Open source and community
development
◦ Developed under an Open Source
Initiative approved license, making it
free to use and distribute, even
commercially
Advantages of using Python
 Syntax is simple to read and
write

 Libraries designed for specific data
science tasks

 Integrates well with the majority of
cloud platform service providers
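
As a rough illustration of the first two points, the sketch below summarizes a small list of numbers in a few readable lines. It relies only on the standard library `statistics` module as a stand-in for dedicated data science libraries, and the click counts are invented for the example.

```python
import statistics

# Hypothetical daily click counts, made up purely for illustration.
clicks = [120, 135, 128, 160, 142]

mean_clicks = statistics.mean(clicks)
median_clicks = statistics.median(clicks)
busiest_day = clicks.index(max(clicks)) + 1  # days are numbered from 1

print("Mean clicks:", mean_clicks)
print("Median clicks:", median_clicks)
print("Busiest day:", busiest_day)
```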



Integrated development
environment (IDE)
 Software application consisting of a
cohesive unit of tools required for
development
 Designed to simplify software
development
 Utilities provided by IDEs include
tools for managing, compiling,
deploying and debugging software



Features of IDE
 IDE should centralize three key
tools that form the crux of
software development
◦ Source code editor
◦ Compiler
◦ Debugger
 Additional features
◦ Syntax and error highlighting
◦ Code completion
◦ Version control



Commonly used IDEs
 Spyder
 PyCharm
 Jupyter Notebook
 Atom



Spyder
 Supported across Linux, Mac
OS X and Windows platforms
 Available as open source
version
 Bundled with Anaconda
distribution, which ships with
many widely used Python libraries
 Developed for Python and
specifically data science
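
Which libraries a given installation actually provides can be checked from within Python. The sketch below is one possible way to do that with the standard `importlib.util.find_spec` function. The helper name `is_installed` and the list of library names are assumptions chosen for illustration, not a statement of what any particular distribution contains.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Example names only; which libraries matter depends on the project.
for name in ["numpy", "pandas", "matplotlib"]:
    status = "available" if is_installed(name) else "missing"
    print(name, "->", status)
```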





Spyder
 Features include
◦ Code editor with robust syntax
and error highlighting
◦ Code completion and navigation
◦ Debugger
◦ Integrated documentation

 Interface similar to MATLAB
and RStudio



PyCharm
 Supported across Linux, Mac OS
X and Windows platforms
 Available as community (free
open source) and professional
(paid) version
 Supports only Python
 Bundled with Anaconda
distribution, which ships with
many widely used Python libraries
◦ Can also be installed separately





PyCharm
 Features include
◦ Code editor provides syntax
and error highlighting
◦ Code completion and
navigation
◦ Unit testing
◦ Debugger
◦ Version control
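
The unit testing feature runs tests written with a framework such as the standard library's `unittest`. The sketch below shows roughly what such a test looks like. The function `average` and the class `AverageTests` are hypothetical examples, and the suite is run from code here rather than through any IDE.

```python
import unittest


def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("average() needs at least one value")
    return sum(values) / len(values)


class AverageTests(unittest.TestCase):
    def test_typical_values(self):
        self.assertEqual(average([2, 4, 6]), 4)

    def test_empty_list_is_rejected(self):
        with self.assertRaises(ValueError):
            average([])


# Load and run the tests programmatically so the script keeps running afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AverageTests)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```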



Jupyter Notebook
 Web application that allows
creation and manipulation of
documents called ‘notebooks’
 Supported across Linux, Mac OS
X and Windows platforms
 Available as open source version



Jupyter Notebook

Source-https://jupyter.org/



Jupyter Notebook
 Bundled with Anaconda
distribution or can be installed
separately
 Supports Julia, Python, R and
Scala
 Consists of an ordered collection of
input and output cells that contain
code, text, plots, etc.
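
The idea of an ordered collection of cells can be sketched as a plain Python dictionary. Notebook files are stored as JSON, and the field names below follow the version 4 notebook format as commonly documented. The cell contents are invented, and this is an illustration rather than a complete or validated notebook.

```python
import json

# Minimal notebook-like structure: an ordered list of cells plus format metadata.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",  # narrative text
            "metadata": {},
            "source": ["# Sales summary\n", "Narrative text goes here."],
        },
        {
            "cell_type": "code",  # input code; outputs are filled in when it runs
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["total = 2 + 3\n", "print(total)"],
        },
    ],
}

# The order of the list is the order in which cells appear in the notebook.
cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)

# Serialize to JSON text, the form in which notebook files are stored on disk.
as_text = json.dumps(notebook, indent=1)
```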


Jupyter Notebook
 Allows sharing of code and
narrative text through output
formats like PDF, HTML etc.
◦ Education and presentation
tool
 Lacks most of the features of
a good IDE



Atom
 Open source text and source code
editor
 Supported across Linux, Mac OS X
and Windows platforms
 Supports Python, PHP, Java etc.
 Well suited for developers
 Enables users to install plug-ins or
packages
◦ Packages can be installed for code
completion and debugging



Atom

Source-https://atom.io/
How to choose the best IDE?
 Requirements
 Working with different IDEs
helps us understand our own
requirements
 In this course, Spyder will be
used



Summary
 Popular tools used in data science
 Evolution of Python
 Integrated development environment
◦ Spyder
◦ PyCharm
◦ Jupyter Notebook
◦ Atom



THANK YOU
