The document provides an introduction to statistical analysis concepts including:
- Descriptive statistics such as mean, median and mode are used to summarize sample data. Inferential statistics make inferences about populations from samples using probabilities.
- A population is all possible observations, a sample is a subset of observations, and probabilities can be calculated using sample size and number of observations of interest.
- Sets, subsets, unions, intersections and complements are defined and used to calculate probabilities of events. Independent events have probabilities of intersection equal to the product of individual probabilities. Examples demonstrate probability calculations.
Weekly Course Objectives
● What exactly are statistics?
● Descriptive vs. inferential statistics
● What are probabilities?
● Go over sampling terms like population, sample, representative sample, etc.
● Review set notation, unions, and intersections.

Statistics are everywhere!
● Believe it or not, you collect statistics every day!
● Statistics is the collection and analysis of information, especially for the purpose of making inferences from a sample.
● For example:
○ Trying a pizza from a new restaurant and saying it’s “bad” or “the restaurant is bad” is a statistic.
○ Finding the average height of people in a grocery store in a brand new town and then claiming the whole town is full of tall people is a statistic.
○ Attending 2 classes of a statistics course and saying the class is hard is also a statistic.

Descriptive Statistics
● We have been using descriptive statistics so far in the course.
● Descriptive statistics are those which involve summarizing or organizing data.
● For example:
○ The mean, median and mode are descriptive statistics because they organize / summarize data in meaningful ways.
○ We can visualize our summaries of data using boxplots, scatter plots and histograms, so these plots are descriptive tools as well.
● When building an analysis, descriptive statistics are vital for understanding not only your data, but also your claims about it!

Inferential Statistics
● Later in this course, we will explore inferential statistics.
● Inferential statistics are those which involve making inferences from our data using some probabilistic approach.
● For example:
○ A confidence interval is a range computed from a sample that would capture the true population parameter in a stated proportion (for example, 95%) of samples if we collected a sample over and over again.
○ Hypothesis testing measures a claim made about the data (average height is greater than 150 cm, average grades are less than 70%, etc.).
● Descriptive statistics describe what you see; inferential statistics attempt to infer based on your data.

Descriptive vs. Inferential
● Descriptive statistics attempt to make summaries of your sample.
● Inferential statistics attempt to make inferences about the whole population.

Populations vs samples - why do we even?
● A population is every single observation that falls under the experiment you want to run.
● A sample is a subset of observations from your population.
● For example, the heights of the students in 1 class of a first year statistics course would be a sample. The population would be every first year statistics student (within some demographic).
● It’s very expensive to collect information - some companies even sell information from surveys (look at SurveyMonkey)!
● Getting adequate data for your study is critical, but sometimes data isn’t easy to come by (for example, weather behavior during a solar eclipse).
● We ideally want a representative sample, which is a subset of the population that has the same characteristics as the population.

Populations and samples - a visual
[Figure: populations and samples]

Sets
● People are often interested in collections and assortments of objects, and their relative proportions.
● A set is a collection of objects, or “stuff”. These objects could be countries, cities, years, numbers, words, letters, hats, people, whatever your heart desires.
● Naturally, we define elements as the individual objects that make up the set. An element could be a country, city, year, number, word, letter, hat, person, etc.
● For example:
○ {a,t,i,n,d,e,r}
○ {0,1,2,3,5,8,13,21}
○ {every student that goes to Queen’s College}
○ ∅
● The last example is called the null set, aka the empty set, written ∅ or {}. What does it mean?...

Null Sets
● A shopping bag is an object to carry things; an empty bag is a bag with nothing inside it.
● To elaborate on this idea, look at the set A = {5}.
○ In plain English, set A is a set containing the element 5.
○ Hence, 5 is not just a number, but it is an element of the set.
● In a similar way, an empty set is not nothing; it’s just a set with no elements. (Note that {∅} is different: it is a set with one element, the empty set.)
● The moral of this tricky idea is that a set is like a container, and the elements are the objects or stuff we put in it.
● When you start thinking of sets as “containers”, you can see why the null set is a subset of every set: it has no elements, so there is nothing in it that could be missing from the other set.
● A more depressing way of thinking is that “there is nothingness in everything”.

Universal & complementary sets
● A universal set contains every possible object for the scenario or experiment you are looking at. It is typically denoted by U.
● A sample set is the smaller part of the population's observations that your experiment actually works with, denoted by S. We typically work with samples!
● For example:
○ If we spoke about numbers, the universal set could be U = {every possible number} and S = {all even numbers}.
○ If we spoke about provinces in Canada, the universal set could be U = {all provinces} and S = {western provinces}.
● The complement of a set is everything in S that the set does not contain. It is typically denoted by a prime, as in A’.
● For example:
○ If S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} and A = {2,4,6,3} then A’ = {1,5,7,8,9,10}.

Subsets
● A subset is a group of observations contained within another group that is at least as large.
● In mathematical words, set B is a subset of set A when everything inside B is also in A. This is typically denoted as B ⊆ A.
● For example, say set A = {1,2,4}. List all possible subsets.
● ∅, {1}, {2}, {4}, {1,2}, {1,4}, {2,4}, {1,2,4}.
● Note that order doesn’t matter: the sets {1,2} and {2,1} are exactly the same.
● What about this: A = {math, english, science}?

Unions and Intersections
● The union of two (or more) sets is the set of all distinct observations that appear in at least one of them, denoted by ∪.
● For example, if A = {1,3,7,9} and B = {2,3,5,7,8} then A ∪ B = {1,2,3,5,7,8,9}.
● The intersection of two sets is the set of observations that occur in both, denoted by ∩.
● For example, if A = {1,3,7,9} and B = {2,3,5,7,8} then A ∩ B = {3,7}.
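
The set ideas above (complements, subsets, unions, intersections) can be tried out directly. The following is a minimal sketch, assuming Python's built-in `set` type; the helper name `all_subsets` and the variable names are made up for illustration and are not part of the course material.

```python
from itertools import combinations

# Sample set S and a set A inside it (values from the complement example).
S = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
A = {2, 4, 6, 3}

# Complement of A relative to S: everything in S that is not in A.
A_complement = S - A


def all_subsets(items):
    """Return every subset of `items`, from the empty set up to the full set."""
    elements = sorted(items)
    return [
        set(combo)
        for size in range(len(elements) + 1)
        for combo in combinations(elements, size)
    ]


subsets_of_124 = all_subsets({1, 2, 4})

# Union and intersection, using the sets from the union/intersection example.
X = {1, 3, 7, 9}
Y = {2, 3, 5, 7, 8}
union_XY = X | Y          # in X, in Y, or in both
intersection_XY = X & Y   # in both X and Y

print(sorted(A_complement))     # [1, 5, 7, 8, 9, 10]
print(len(subsets_of_124))      # 8
print(set() <= X)               # True: the empty set is a subset of X
print(sorted(union_XY))         # [1, 2, 3, 5, 7, 8, 9]
print(sorted(intersection_XY))  # [3, 7]
```

A set with 3 elements has 2 × 2 × 2 = 8 subsets, because each element is either in or out; the same helper can be used on {math, english, science}.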

Number of objects!
● The last bit of information we need to get closer to probabilities is the number of elements in a set, which is denoted by n().
● For example, if we had set A = {1,4,7}, then n(A) = 3.
● If we know everything in our sample, say S, and every outcome is equally likely, then we can find the probability of observing an event, denoted by P().
● Ex: the probability of event A occurring, when A = {1,4,7} and S = {1,2,3,4,5,6,7,8,9,10}, is P(A) = n(A) / n(S) = 3/10 = 0.3.
● Remember, probabilities can only be between 0 and 1 (inclusive).
● We will talk more about probabilities next class!

Inclusion-Exclusion rule!
● In general, there is a very interesting property in statistics called the inclusion-exclusion rule. It is given as follows: n(A ∪ B) = n(A) + n(B) − n(A ∩ B)
● We subtract n(A ∩ B) because the elements in both sets were counted twice, once in n(A) and once in n(B).
● For example, if A = {1,3,7,9} and B = {2,3,5,7,8}, then:
○ n(A) = 4
○ n(B) = 5
○ n(A ∩ B) = n({3,7}) = 2
○ n(A ∪ B) = n({1,2,3,5,7,8,9}) = 7 = 4 + 5 − 2

DeMorgan’s Law
● DeMorgan’s Law states: the complement of the union of two sets A and B equals the intersection of A’ and B’, and the complement of the intersection of A and B equals the union of A’ and B’.
● In symbols: (A ∪ B)’ = A’ ∩ B’ and (A ∩ B)’ = A’ ∪ B’
[Figure: visual of DeMorgan’s Law]

Independent Events
● Two events are independent if the probability of their intersection is the product of their individual probabilities.
● In other words, P(A ∩ B) = P(A) × P(B).
● For example, if P(A) = 1/3, P(B) = 2/7, and P(A ∩ B) = 2/21, then A and B are independent events, since 1/3 × 2/7 = 2/21.
● Intuitively, knowledge that one occurred does not affect the chance the other occurs - i.e., these events don’t depend on each other.
● When events don’t impact one another, the probability both occur will be the product of each individual one.
● For example, whether it rains in Hamilton and whether the Toronto Raptors win the 2023 championship can reasonably be treated as independent.

Examples
● Given S = {1, 2, 4, 6, 8, 19, 23, 100, 3} and A = {2, 6, 8}, find P(A).
● Given S = {1, 2, 4, 6, 8, 19, 23, 100, 3} and A = {2, 6, 8}, find P(A’).
● Given S = {1, 2, 4, 6, 8, 19, 23, 100, 3}, A = {2, 6, 8}, B = {2, 4, 6, 23, 100, 3}, find P(A ∩ B).
● Find P(A’ ∩ B).
● Find P(A ∪ B’).
● Find P((A ∪ B)’).
● Are events A and B independent?

Answers
● P(A) = 3/9 = 1/3
● P(A’) = 1 − 1/3 = 2/3
● P(A ∩ B) = 2/9
● P(A’ ∩ B) = 4/9
● P(A ∪ B’) = 5/9
● P((A ∪ B)’) = 2/9
● Yes, since P(A) × P(B) = 1/3 × 6/9 = 2/9 = P(A ∩ B)
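
The answers above can be checked by counting elements. This is a minimal sketch, assuming every outcome in S is equally likely so that P(E) = n(E) / n(S); the function names `prob` and `complement` are hypothetical helpers, and `fractions.Fraction` from the Python standard library is used to keep exact fractions instead of decimals.

```python
from fractions import Fraction

# Sets from the Examples slide.
S = {1, 2, 4, 6, 8, 19, 23, 100, 3}
A = {2, 6, 8}
B = {2, 4, 6, 23, 100, 3}


def prob(event, sample=S):
    """P(event) = n(event) / n(sample), assuming equally likely outcomes."""
    return Fraction(len(event & sample), len(sample))


def complement(event, sample=S):
    """Everything in the sample that is not in the event."""
    return sample - event


p_a = prob(A)                             # P(A)
p_not_a = prob(complement(A))             # P(A')
p_a_and_b = prob(A & B)                   # P(A ∩ B)
p_not_a_and_b = prob(complement(A) & B)   # P(A' ∩ B)
p_a_or_not_b = prob(A | complement(B))    # P(A ∪ B')
p_not_a_or_b = prob(complement(A | B))    # P((A ∪ B)')

# Inclusion-exclusion: n(A ∪ B) = n(A) + n(B) - n(A ∩ B)
inclusion_exclusion_holds = len(A | B) == len(A) + len(B) - len(A & B)

# DeMorgan's Law, both forms.
de_morgan_holds = (
    complement(A | B) == complement(A) & complement(B)
    and complement(A & B) == complement(A) | complement(B)
)

# Independence: P(A ∩ B) equals P(A) × P(B).
independent = p_a_and_b == p_a * prob(B)

print(p_a, p_not_a, p_a_and_b, p_not_a_and_b, p_a_or_not_b, p_not_a_or_b)
# 1/3 2/3 2/9 4/9 5/9 2/9
print(inclusion_exclusion_holds, de_morgan_holds, independent)
# True True True
```

Comparing exact fractions avoids the rounding issues that can appear when testing independence with floating-point decimals.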