101 - Introducing DataFrames - Python
1. Introducing DataFrames
Hi, I'm Richie. I'll be your tour guide through the world of pandas.
3. Course outline
We'll start by talking about DataFrames, which form the core of pandas. In chapter 2, we'll discuss
aggregating data to gather insights. In chapter 3, you'll learn all about slicing and indexing to
subset DataFrames. Finally, you'll visualize your data, deal with missing data, and read data into a
DataFrame. Let's dive in.
5. pandas is popular
pandas has millions of users, with PyPI recording about 14 million downloads in December 2019.
This represents almost the entire Python data science community!
Source: https://pypistats.org/packages/pandas
6. Rectangular data
There are several ways to store data for analysis, but rectangular data, sometimes called "tabular
data", is the most common form. In this example, with dogs, each observation, or each dog, is a
row, and each variable, or each dog property, is a column. pandas is designed to work with
rectangular data like this.
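To make the row-and-column idea concrete, here is a minimal sketch that builds a small dogs table. The dog names, breeds, and measurements are made-up sample values, and the column names are assumptions chosen for illustration, not the course's actual dataset.

```python
import pandas as pd

# Hypothetical sample data: each dictionary is one observation (one dog),
# and each key is one variable (one dog property).
dog_records = [
    {"name": "Bella", "breed": "Labrador", "height_cm": 56, "weight_kg": 25.0},
    {"name": "Charlie", "breed": "Poodle", "height_cm": 43, "weight_kg": 23.0},
    {"name": "Lucy", "breed": "Chow Chow", "height_cm": 46, "weight_kg": 22.0},
]

# Each record becomes a row; each key becomes a column.
dogs = pd.DataFrame(dog_records)

print(dogs)
print(dogs.shape)  # (number of rows, number of columns)
```

Here the shape has one entry for the dogs (rows) and one for their properties (columns), which is what "rectangular" means in practice.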
7. pandas DataFrames
In pandas, rectangular data is represented as a DataFrame object. Every programming language
used for data analysis has something similar to this. R also has data frames, while SQL has
database tables. Every value within a column has the same data type, for example text or numeric,
but different columns can contain different data types.
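One way to see the per-column data types is the dtypes attribute. The sketch below uses a made-up dogs table with assumed column names. The exact type labels printed can vary between pandas versions, so no expected output is shown.

```python
import pandas as pd

# Hypothetical sample data with one text column and two numeric columns.
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy"],
    "height_cm": [56, 43, 46],        # whole numbers
    "weight_kg": [25.0, 23.0, 22.0],  # decimal numbers
})

# dtypes lists one data type per column.
print(dogs.dtypes)

# Helper functions in pandas.api.types check a column's type
# without depending on the exact label.
print(pd.api.types.is_integer_dtype(dogs["height_cm"]))
print(pd.api.types.is_float_dtype(dogs["weight_kg"]))
```

Each column reports a single type, while the table as a whole mixes text and numeric columns.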
The describe method computes some summary statistics for numerical columns, like mean and
median. "count" is the number of non-missing values in each column. describe is good for a quick
overview of numeric variables, but if you want more control, you'll see how to perform more specific
calculations later in the course.
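As a rough sketch of what describe returns, the example below runs it on a made-up dogs table with one missing height. The data and column names are assumptions for illustration. Note that the median appears in the row labeled "50%".

```python
import pandas as pd

# Hypothetical sample data; the last dog has a missing height.
dogs = pd.DataFrame({
    "name": ["Bella", "Charlie", "Lucy", "Cooper"],
    "height_cm": [56, 43, 46, None],
    "weight_kg": [25.0, 23.0, 22.0, 29.0],
})

# By default, describe summarizes only the numeric columns,
# so the text column "name" is left out.
summary = dogs.describe()
print(summary)

# "count" ignores missing values, so the two columns differ here.
print(summary.loc["count"])

# A more specific calculation on a single column.
print(dogs["height_cm"].mean())
```

The result of describe is itself a DataFrame, so individual statistics can be picked out by row label and column name.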
https://campus.datacamp.com/courses/data-manipulation-with-pandas/transforming-data?ex=1