Introduction to Statistical Hypothesis Testing in R

Uploaded by

divyanavranshchauhan


A statistical hypothesis is an assumption that a researcher makes about a population, usually about the value of one of its parameters. The assumption does not have to be true.

Hypothesis testing is a formal procedure for checking such an assumption against data. Ideally the check would take the entire population into account, but this is rarely practical, so the test relies on a random sample drawn from the population. On the basis of the sample results, you either reject or fail to reject the hypothesis.

In short, hypothesis testing is a type of statistical analysis in which you put your assumptions about a population parameter to the test. It is often used to assess whether a relationship between two variables is supported by the data.

Let's discuss a few real-life examples of statistical hypotheses:

• A teacher assumes that 60% of his college's students come from upper-middle-class families.
• A doctor believes that 3D (Diet, Dose, and Discipline) is 90% effective for diabetic patients.

A hypothesis test involves two competing hypotheses:
• Null Hypothesis: Hypothesis testing is carried out to test the validity of a claim or assumption made about the larger population. The claim being tested is known as the null hypothesis and is denoted by H0.
• Alternative Hypothesis: The alternative hypothesis is the one considered valid if the null hypothesis is rejected. The evidence for it is the sample data and the statistical computations that accompany it. The alternative hypothesis is denoted by H1 or Ha.

Let's take the example of a coin. We want to decide whether the coin is unbiased. Since the null hypothesis refers to the natural state of an event, under the null hypothesis heads and tails are equally likely, so a coin tossed many times should show roughly equal numbers of heads and tails. The alternative hypothesis negates the null hypothesis and states that the numbers of heads and tails differ significantly.
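
The coin example can be sketched in R with an exact binomial test. This is only an illustration: the toss counts below are made up, and binom.test() is one reasonable choice of test for this kind of data, not something the notes above prescribe.

```r
# Hypothetical data: 100 tosses, 58 of which came up heads.
n_tosses <- 100
n_heads <- 58

# H0: P(heads) = 0.5 (unbiased coin); H1: P(heads) != 0.5.
coin_test <- binom.test(n_heads, n_tosses, p = 0.5, alternative = "two.sided")

coin_test$p.value   # compare against the chosen significance level
coin_test$estimate  # observed proportion of heads
```

A small p-value would count as evidence that the coin is biased; a large one means the data are compatible with an unbiased coin.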

Simple and Composite Hypothesis Testing

Depending on how precisely a hypothesis specifies the population parameter, you can classify statistical hypotheses into two types.

Simple Hypothesis: A simple hypothesis specifies an exact value for the parameter.

Composite Hypothesis: A composite hypothesis specifies a range of values.

Example:

A company is claiming that their average sales for this quarter are
1000 units. This is an example of a simple hypothesis.

Suppose the company claims that the sales are in the range of 900 to
1000 units. Then this is a case of a composite hypothesis.

One-Tailed and Two-Tailed Hypothesis Testing

The One-Tailed test, also called a directional test, places the critical region on one side of the sampling distribution. If the test statistic falls into this region, the null hypothesis is rejected in favor of the alternate hypothesis.

In a one-tailed test, the critical region is one-sided, meaning the test checks whether the sample statistic is either greater than or less than a specific value, but not both.

In a Two-Tailed test, the critical region is two-sided: the test checks whether the sample statistic falls above or below a range of values.

If the statistic falls outside this range, that is, in either tail, the null hypothesis is rejected in favor of the alternate hypothesis.

Example:

Suppose H0: mean = 50 and H1: mean not equal to 50

According to the H1, the mean can be greater than or less than 50.
This is an example of a Two-tailed test.

In a similar manner, if H0: mean >= 50, then H1: mean < 50

Here H1 states that the mean is less than 50, so the critical region lies in one tail only. This is a One-tailed test.
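
As a rough sketch, both versions of this example can be run with t.test() by changing the alternative argument. The sample here is simulated purely for illustration; its size, mean and standard deviation are assumptions.

```r
set.seed(1)  # make the simulated sample reproducible
x <- rnorm(30, mean = 48, sd = 5)  # hypothetical sample

# Two-tailed: H0: mean = 50, H1: mean != 50
two_tailed <- t.test(x, mu = 50, alternative = "two.sided")

# One-tailed: H0: mean >= 50, H1: mean < 50
one_tailed <- t.test(x, mu = 50, alternative = "less")

two_tailed$p.value
one_tailed$p.value
```

Both tests use the same t statistic; only the region of the distribution that counts as "extreme" differs, which is why the two p-values differ.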

Level of Significance
The alpha value is a criterion for determining whether a test statistic is
statistically significant. In a statistical test, Alpha represents an
acceptable probability of a Type I error. Because alpha is a
probability, it can be anywhere between 0 and 1.

In practice, the most commonly used alpha values are 0.01, 0.05, and
0.1, which represent a 1%, 5%, and 10% chance of a Type I error,
respectively (i.e. rejecting the null hypothesis when it is in fact
correct).
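
One way to see what alpha means is a small simulation, sketched below under assumed settings (2000 simulated samples of size 20 from a normal population). Every sample is drawn with the null hypothesis true, so each rejection is a Type I error.

```r
set.seed(42)
alpha <- 0.05
n_sims <- 2000

# Each simulated sample is drawn with true mean 0, so H0: mean = 0 holds.
p_values <- replicate(n_sims, t.test(rnorm(20, mean = 0), mu = 0)$p.value)

# Proportion of simulations in which a true H0 was rejected.
type1_rate <- mean(p_values < alpha)
type1_rate
```

The resulting proportion should land close to alpha, which is what "a 5% chance of a Type I error" refers to.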

P-Value
A p-value is the probability, computed under the assumption that the null hypothesis is true, of observing a difference at least as large as the one found in the sample. As the p-value decreases, the statistical significance of the observed difference increases. If the p-value is below the chosen significance level, you reject the null hypothesis.

Consider an example in which you test whether a new advertising campaign has changed the product's sales.

The null hypothesis states that there is no change in sales due to the new advertising campaign. The p-value is not the probability that this null hypothesis is true. If the p-value is 0.30, then a difference at least as large as the observed one would occur about 30% of the time even if the campaign had no effect. If the p-value is 0.03, such a difference would occur only about 3% of the time if the campaign had no effect.

As you can see, the lower the p-value, the stronger the evidence against the null hypothesis and in favor of the alternate hypothesis, which states that the new advertising campaign causes an increase or decrease in sales.
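
To make the definition concrete, the sketch below recomputes a p-value by hand from the test statistic and compares it with the one reported by t.test(). The sales figures are invented for illustration, and a two-sample Welch test is assumed as the analysis.

```r
# Hypothetical weekly sales before and after the campaign.
before <- c(102, 98, 110, 95, 105, 99, 101, 97)
after  <- c(108, 104, 113, 101, 109, 106, 103, 105)

# H0: no difference in mean sales; H1: mean sales differ.
res <- t.test(after, before)   # Welch two-sample t-test by default

# The same p-value from the test statistic: probability, under H0, of a
# t value at least as extreme as the one observed.
p_manual <- 2 * pt(abs(res$statistic), df = res$parameter, lower.tail = FALSE)

res$p.value
unname(p_manual)
```

The two numbers agree, which shows that the p-value is a tail probability of the test statistic under the null hypothesis.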

Hypothesis Testing in R
Statisticians use hypothesis testing to formally decide whether a hypothesis should be rejected. Hypothesis testing is conducted in the following manner:
1. State the Hypotheses: State the null and alternative hypotheses.
2. Formulate an Analysis Plan: Choose the test, the significance level, and the decision rule before looking at the results.
3. Analyze Sample Data: Calculate and interpret the test statistic, as described in the analysis plan.
4. Interpret Results: Apply the decision rule described in the analysis plan.
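
The four steps might look as follows in R. This is a minimal sketch: the sales figures, the hypothesized mean of 1000 units, and the choice of a one-sample t-test at the 0.05 level are all assumptions made for the example.

```r
# 1. State the hypotheses: H0: mean = 1000, H1: mean != 1000.
mu0 <- 1000

# 2. Analysis plan: one-sample t-test, significance level 0.05,
#    reject H0 when the p-value is at most alpha.
alpha <- 0.05

# 3. Analyze the sample data (hypothetical sales figures).
sales <- c(985, 1010, 990, 1005, 970, 1020, 995, 980, 1000, 975)
res <- t.test(sales, mu = mu0)

# 4. Interpret the result by applying the decision rule.
decision <- if (res$p.value <= alpha) "reject H0" else "fail to reject H0"
decision
```

Fixing alpha and the decision rule in step 2, before computing anything, keeps the interpretation in step 4 from being adjusted to fit the result.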

Hypothesis testing ultimately uses a p-value to weigh the strength of the evidence, in other words, what the data tell you about the population. The p-value ranges between 0 and 1. It can be interpreted in the following way:
• A small p-value (typically ≤ 0.05) indicates strong evidence against the null hypothesis, so you reject it, accepting a 5% risk of a Type I error.
• A large p-value (> 0.05) indicates weak evidence against the null hypothesis, so you fail to reject it.
A p-value very close to the cutoff (0.05) is considered to be marginal and could go either way.

                     TRUE (Actual)                    FALSE (Actual)
TRUE (Predicted)     True Positive                    False Positive (Type I Error)
FALSE (Predicted)    False Negative (Type II Error)   True Negative

True Positive:
Interpretation: You predicted positive and it's true.
You predicted that a woman is pregnant and she actually is.

True Negative:
Interpretation: You predicted negative and it's true.
You predicted that a man is not pregnant and he actually is not.

False Positive: (Type 1 Error)
Interpretation: You predicted positive and it's false.
You predicted that a man is pregnant but he actually is not.

False Negative: (Type 2 Error)
Interpretation: You predicted negative and it's false.
You predicted that a woman is not pregnant but she actually is.
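
The four outcomes can be counted in R by cross-tabulating predictions against actual values. The two logical vectors below are invented for illustration only.

```r
# Hypothetical predictions and actual outcomes (TRUE = positive).
actual    <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE)
predicted <- c(TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, FALSE)

# Cross-tabulate predicted against actual outcomes.
confusion <- table(Predicted = predicted, Actual = actual)
confusion

true_pos  <- sum(predicted & actual)
false_pos <- sum(predicted & !actual)   # Type I errors
false_neg <- sum(!predicted & actual)   # Type II errors
true_neg  <- sum(!predicted & !actual)
```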

Decision Errors in R
Two types of error can occur in hypothesis testing:
• Type I Error: A Type I error occurs when the researcher rejects a null hypothesis that is true. The significance level expresses the probability of a Type I error and is represented by the symbol α (alpha).
• Type II Error: Failing to reject a false null hypothesis H0 is referred to as a Type II error. The probability of a Type II error is represented by the symbol β (beta). The power of the test, 1 − β, is the probability of correctly rejecting a false null hypothesis.
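
The link between power and the Type II error rate can be illustrated with power.t.test() from R's stats package. The design values below (20 observations per group, a true difference of one standard deviation) are assumptions chosen only for the example.

```r
# Hypothetical design: two groups of 20, true difference of 1 standard deviation.
pw <- power.t.test(n = 20, delta = 1, sd = 1, sig.level = 0.05,
                   type = "two.sample", alternative = "two.sided")

power <- pw$power   # probability of rejecting a false H0
beta  <- 1 - power  # probability of a Type II error
c(power = power, beta = beta)
```

Increasing n or the true difference delta raises the power and therefore lowers beta.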

Using the Student’s T-test in R

The Student's T-test is a method for comparing two samples. It can be used to determine whether the sample means differ. This is a parametric test, and the data should be approximately normally distributed.
R can handle the various versions of the T-test using the t.test() command. The command covers one- and two-sample tests as well as paired tests.
Listed below are the arguments used with t.test() and their explanations:
• t.test(data.1, data.2): The basic method of applying a t-test is to compare two vectors of numeric data.
• var.equal = FALSE: If var.equal is set to TRUE, the variances are treated as equal and the standard (pooled) test is carried out. If it is set to FALSE (the default), the variances are treated as unequal and the Welch two-sample test is carried out.
• mu = 0: If a one-sample test is carried out, mu indicates the mean against which the sample should be tested.
• alternative = "two.sided": Sets the alternative hypothesis. The default is "two.sided", but "greater" or "less" can also be specified. You can abbreviate the value.
• conf.level = 0.95: Sets the confidence level of the interval (default = 0.95).
• conf.level = 0.99: With a 99% confidence level, the matching decision rule is to reject H0 if p < 0.01 and otherwise fail to reject it.
• paired = FALSE: If set to TRUE, a matched-pair T-test is carried out.
• t.test(y ~ x, data, subset): The data can be specified as a formula of the form response ~ predictor. In this case, the data should be named and a subset of the predictor variable can be specified.
• subset = predictor %in% c("sample.1", "sample.2"): If the data is in the form response ~ predictor, the subset instruction selects the two samples to compare from the predictor column of the data.
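
The arguments above can be tried out as in the following sketch. The vectors data.1 and data.2 hold made-up measurements; PlantGrowth is a data set that ships with R and is used here only to show the formula and subset form.

```r
# Two numeric vectors (hypothetical measurements).
data.1 <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.2)
data.2 <- c(6.2, 5.9, 6.8, 6.1, 6.5, 6.0, 6.4)

welch   <- t.test(data.1, data.2)                    # Welch test (var.equal = FALSE)
pooled  <- t.test(data.1, data.2, var.equal = TRUE)  # classic Student's t-test
one_smp <- t.test(data.1, mu = 5, alternative = "greater", conf.level = 0.99)
paired  <- t.test(data.1, data.2, paired = TRUE)     # matched pairs

# Formula interface with a subset, using the built-in PlantGrowth data set.
by_group <- t.test(weight ~ group, data = PlantGrowth,
                   subset = group %in% c("ctrl", "trt1"))

welch$p.value
by_group$p.value
```

PlantGrowth has three groups, so the subset argument is what reduces the comparison to the two samples that a t-test needs.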
