Introduction To Data Visualization With Seaborn Chapter3

Count plots and bar plots
INTRODUCTION TO DATA VISUALIZATION WITH SEABORN

Erin Case
Data Scientist
Categorical plots
Examples: count plots, bar plots

Involve a categorical variable

Comparisons between groups
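
As a rough sketch of what a count plot tallies, the plain-Python snippet below counts observations per category. The `responses` list is made-up illustration data, not the survey data used in these slides.

```python
from collections import Counter

# Hypothetical survey answers (illustration only, not the real masculinity_data)
responses = ["Very", "Somewhat", "Very", "Not very", "Somewhat", "Very"]

# A count plot draws one bar per category, with height equal to this tally
counts = Counter(responses)

for category, n in counts.items():
    print(category, n)
```

Each printed pair corresponds to one bar: the category on the axis and the count as the bar height.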

catplot()
Used to create categorical plots

Same advantages of relplot()

Easily create subplots with col= and row=
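
A minimal sketch of the col= argument, assuming a small made-up DataFrame (`toy_tips`) in place of the real tips data. It should draw one count-plot panel per value of "time".

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to see the figure window
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the tips data (illustration only)
toy_tips = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat", "Sun", "Sun"],
    "time": ["Lunch", "Dinner", "Lunch", "Dinner",
             "Lunch", "Dinner", "Lunch", "Dinner"],
})

# col="time" asks catplot() for one subplot per distinct value of "time"
g = sns.catplot(x="day",
                data=toy_tips,
                kind="count",
                col="time")

# catplot() returns a FacetGrid; its axes array holds one entry per panel
n_panels = g.axes.size
print(n_panels)

plt.show()
```

Swapping col= for row= would stack the panels vertically instead of placing them side by side.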

countplot() vs. catplot()
import matplotlib.pyplot as plt
import seaborn as sns

sns.countplot(x="how_masculine",
              data=masculinity_data)

plt.show()

countplot() vs. catplot()
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="how_masculine",
            data=masculinity_data,
            kind="count")

plt.show()

Changing the order
import matplotlib.pyplot as plt
import seaborn as sns

category_order = ["No answer",
                  "Not at all",
                  "Not very",
                  "Somewhat",
                  "Very"]

sns.catplot(x="how_masculine",
            data=masculinity_data,
            kind="count",
            order=category_order)

plt.show()

Bar plots
Displays the mean of a quantitative variable per category

import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="day",
            y="total_bill",
            data=tips,
            kind="bar")

plt.show()
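
To make the bar heights concrete, here is a small plain-Python sketch of the per-category mean. The `bills` records are invented for illustration and are not taken from the tips data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (day, total_bill) records, illustration only
bills = [("Thur", 10.0), ("Thur", 20.0),
         ("Fri", 15.0), ("Fri", 25.0), ("Fri", 20.0)]

# Group the quantitative values by category
by_day = defaultdict(list)
for day, total_bill in bills:
    by_day[day].append(total_bill)

# Each bar's height is the mean of its group
bar_heights = {day: mean(values) for day, values in by_day.items()}
print(bar_heights)
```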

Confidence intervals
Lines show 95% confidence intervals for the mean

Shows uncertainty about our estimate

Assumes our data is a random sample
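
Seaborn estimates these intervals by bootstrapping. The sketch below is a simplified percentile bootstrap in plain Python, not seaborn's exact implementation. The helper name `bootstrap_ci` and the `sample` values are assumptions made for illustration.

```python
import random
from statistics import mean

def bootstrap_ci(values, n_boot=1000, level=95, seed=0):
    """Approximate percentile-bootstrap interval for the mean (illustrative helper)."""
    rng = random.Random(seed)
    # Resample with replacement and record the mean of each resample
    boot_means = sorted(
        mean(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    # Cut off an equal share of resampled means in each tail
    tail = (100 - level) / 2 / 100
    low = boot_means[round(tail * n_boot)]
    high = boot_means[round((1 - tail) * n_boot) - 1]
    return low, high

# Hypothetical bill amounts, illustration only
sample = [12.5, 18.0, 21.3, 9.8, 30.1, 16.4, 24.9, 14.2]
low, high = bootstrap_ci(sample)
print(mean(sample), low, high)
```

A wider interval signals more uncertainty about the mean, which usually happens with small or highly variable samples.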

Turning off confidence intervals
import matplotlib.pyplot as plt
import seaborn as sns

# ci=None hides the error bars; seaborn 0.12+ deprecates ci in favor of errorbar=None
sns.catplot(x="day",
            y="total_bill",
            data=tips,
            kind="bar",
            ci=None)

plt.show()

Changing the orientation
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="total_bill",
            y="day",
            data=tips,
            kind="bar")

plt.show()

Let's practice!
Creating a box plot

What is a box plot?
Shows the distribution of quantitative data

See median, spread, skewness, and outliers

Facilitates comparisons between groups

How to create a box plot
import matplotlib.pyplot as plt
import seaborn as sns

g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box")

plt.show()

Change the order of categories
import matplotlib.pyplot as plt
import seaborn as sns

g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box",
                order=["Dinner",
                       "Lunch"])

plt.show()

Omitting the outliers using `sym`
import matplotlib.pyplot as plt
import seaborn as sns

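# sym="" hides the outlier markers; showfliers=False is an alternative
# if your seaborn version does not accept sym
g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box",
                sym="")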

plt.show()

Changing the whiskers using `whis`
By default, the whiskers extend to 1.5 * the interquartile range

Make them extend to 2.0 * IQR: whis=2.0

Show the 5th and 95th percentiles: whis=[5, 95]

Show min and max values: whis=[0, 100]
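
The default 1.5 * IQR rule can be sketched in plain Python as follows. The `bills` values are invented for illustration, and the quartiles use linear interpolation, which may differ slightly from other quartile conventions.

```python
from statistics import quantiles

# Hypothetical bill amounts with one unusually large value, illustration only
bills = [8.0, 10.0, 12.0, 13.0, 15.0, 16.0, 18.0, 20.0, 48.0]

# Quartiles define the box; the IQR is the box height
q1, med, q3 = quantiles(bills, n=4, method="inclusive")
iqr = q3 - q1

# With whis=1.5, points beyond these fences are drawn as outliers
whis = 1.5
lower_fence = q1 - whis * iqr
upper_fence = q3 + whis * iqr

# Whiskers stop at the most extreme observations still inside the fences
inside = [b for b in bills if lower_fence <= b <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)
outliers = [b for b in bills if b < lower_fence or b > upper_fence]

print(q1, q3, iqr, whisker_low, whisker_high, outliers)
```

Raising `whis` to 2.0 widens the fences, so fewer points are flagged as outliers.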

Changing the whiskers using `whis`
import matplotlib.pyplot as plt
import seaborn as sns

g = sns.catplot(x="time",
                y="total_bill",
                data=tips,
                kind="box",
                whis=[0, 100])

plt.show()

Let's practice!
Point plots

What are point plots?
Points show mean of quantitative variable

Vertical lines show 95% confidence intervals

Line plot: average level of nitrogen dioxide over time
Point plot: average restaurant bill, smokers vs. non-smokers

Point plots vs. line plots
Both show:

Mean of quantitative variable

95% confidence intervals for the mean

Differences:

Line plot has quantitative variable (usually time) on x-axis

Point plot has categorical variable on x-axis

Point plots vs. bar plots
Both show:

Mean of quantitative variable

95% confidence intervals for the mean


Creating a point plot
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="age",
            y="masculinity_important",
            data=masculinity_data,
            hue="feel_masculine",
            kind="point")

plt.show()

Disconnecting the points
import matplotlib.pyplot as plt
import seaborn as sns

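# join=False removes the connecting lines; seaborn 0.13+ deprecates join
# in favor of linestyle="none"
sns.catplot(x="age",
            y="masculinity_important",
            data=masculinity_data,
            hue="feel_masculine",
            kind="point",
            join=False)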

plt.show()

Displaying the median
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="smoker",
            y="total_bill",
            data=tips,
            kind="point")

plt.show()

Displaying the median
import matplotlib.pyplot as plt
import seaborn as sns
from numpy import median

sns.catplot(x="smoker",
            y="total_bill",
            data=tips,
            kind="point",
            estimator=median)

plt.show()
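
One common reason to switch the estimator is that the median is less sensitive to extreme values than the mean. A small plain-Python sketch, using invented `bills` values rather than the tips data:

```python
from statistics import mean, median

# Hypothetical bill amounts with one large outlier, illustration only
bills = [10.0, 12.0, 13.0, 15.0, 50.0]

# The single large bill pulls the mean upward but leaves the median unchanged
bill_mean = mean(bills)
bill_median = median(bills)

print(bill_mean, bill_median)
```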

Customizing the confidence intervals
import matplotlib.pyplot as plt
import seaborn as sns

sns.catplot(x="smoker",
            y="total_bill",
            data=tips,
            kind="point",
            capsize=0.2)

plt.show()

Turning off confidence intervals
import matplotlib.pyplot as plt
import seaborn as sns

# ci=None hides the error bars; seaborn 0.12+ deprecates ci in favor of errorbar=None
sns.catplot(x="smoker",
            y="total_bill",
            data=tips,
            kind="point",
            ci=None)

plt.show()

Let's practice!