Lab 8 - Shell

This document describes a lab assignment on testing population proportions. The learning objectives are to test whether a population proportion is equal to a given value and to test whether two population proportions are equal. The document provides exercises to import datasets and use R functions like prop.test() to perform proportion tests. Hypotheses are stated and test decisions are made based on p-values and confidence intervals.
Lab 8 - Proportion Testing

Mansi Kumari (7908159)

2023-03-22

Learning Objectives

By the end of this lab, you should have a grasp of the following concepts:

• How to test whether a population proportion is equal to a given value.


• How to test whether two population proportions are equal to each other.

Instructions

To complete this worksheet, add code as needed into the R code chunks given below. Do not delete the
question text. All text should be in complete English sentences. Be sure to change the author of this file to
reflect your name and student number.
To properly see the questions, knit this .Rmd file to .pdf and view the output. You will have a link in your
email that takes you to the Crowdmark submission page. Once you have completed the worksheet, knit it
to .pdf and upload your output to Crowdmark.

Exercises
Import the React1000 dataset, which contains various measurements on a sample of 1000 Grade 12 students
across the United States, including their Region, Gender, Age, Handedness, Height (cm), Foot Length (cm),
and Armspan (cm).

React1000 <- read.csv("~/Downloads/LAB8/React1000.csv")

Use str to see the format of the dataset.

str(React1000)

## 'data.frame': 1000 obs. of 7 variables:
## $ Region : chr "CA" "PA" "CO" "PA" ...
## $ Gender : chr "Female" "Female" "Female" "Male" ...
## $ Age : num 16 17 17 17 17 16 17 17 17 17 ...
## $ Handed : chr "Right-Handed" "Right-Handed" "Left-Handed" "Right-Handed" ...
## $ Height : int 164 163 157 175 175 177 163 179 151 186 ...
## $ Footlength: num 25 23 21 25 24 27 24 27.5 19 25 ...
## $ Armspan : num 165 158 156 182 180 ...

In 1992, a well-known study estimated that 11.1% of Americans aged 10 to 86 are left- or mixed-handed
(LMH). Suppose that we wish to test at the α = 0.01 level whether the proportion of Americans who are
LMH has changed since this estimate, using React1000 as our sample.
Give the hypotheses for this test.

H0 : p = 0.111 vs Ha : p ≠ 0.111

Use the table function to find the number of students in this sample who are LMH.

table(React1000$Handed)

##
## Ambidextrous Left-Handed Right-Handed
## 44 80 876
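As a small sketch, the LMH count and the sample proportion can be derived from the table rather than typed in by hand. The counts below are copied from the output above so that the snippet runs without the CSV file. Treating "Ambidextrous" as mixed-handed is the convention this worksheet uses (44 + 80 = 124). The names `counts`, `x.lmh` and `p.hat` are illustrative only.

```r
# Counts copied from the table() output above; with the data loaded,
# counts <- table(React1000$Handed) would supply the same values.
counts <- c("Ambidextrous" = 44, "Left-Handed" = 80, "Right-Handed" = 876)

n     <- sum(counts)                                   # sample size
x.lmh <- unname(counts["Ambidextrous"] + counts["Left-Handed"])  # LMH count
p.hat <- x.lmh / n                                     # sample proportion of LMH

c(n = n, x.lmh = x.lmh, p.hat = p.hat)
```

These are the values 124, 1000 and 0.124 that appear in the calculations that follow.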

Calculate the test statistic for this test.

z.stat <- (0.124 - 0.111)/sqrt((0.111 * 0.889)/1000)
z.stat

## [1] 1.308673

Use pnorm to calculate the p-value for this test.

2 * pnorm(-z.stat)

## [1] 0.1906453

What is your decision regarding this test?
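Since the p-value (0.1906) is greater than the level of significance (α = 0.01), we fail to reject H0. There is insufficient evidence that the proportion of Americans who are LMH has changed from 11.1%.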
Repeat the above test using the prop.test function.

prop.test(124, 1000, 0.111, alternative = "two.sided",
          correct = FALSE)

##
## 1-sample proportions test without continuity correction
##
## data: 124 out of 1000, null probability 0.111
## X-squared = 1.7126, df = 1, p-value = 0.1906
## alternative hypothesis: true p is not equal to 0.111
## 95 percent confidence interval:
## 0.1050000 0.1458777
## sample estimates:
## p
## 0.124
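The hand calculation and prop.test agree because, without the continuity correction, the X-squared statistic of the one-sample test is the square of the z statistic. The following sketch checks this for the numbers used above. It uses `-abs(z.stat)` in the p-value so that the same line would also work if the statistic were negative, which is a small generalisation of the `2 * pnorm(-z.stat)` call in this worksheet.

```r
x  <- 124     # LMH students in the sample
n  <- 1000    # sample size
p0 <- 0.111   # proportion under H0

p.hat   <- x / n
z.stat  <- (p.hat - p0) / sqrt(p0 * (1 - p0) / n)   # z statistic under H0
p.value <- 2 * pnorm(-abs(z.stat))                  # two-sided p-value

res <- prop.test(x, n, p = p0, alternative = "two.sided", correct = FALSE)

# The two rows of each comparison should match
c(z.squared = z.stat^2, X.squared = unname(res$statistic))
c(p.by.hand = p.value, p.prop.test = res$p.value)
```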

Use the prop.test function to produce a 99% confidence interval for the true proportion of American citizens
who are LMH.

prop.test(124, 1000, 0.111, alternative = "two.sided",
          conf.level = 0.99, correct = FALSE)

##
## 1-sample proportions test without continuity correction
##
## data: 124 out of 1000, null probability 0.111
## X-squared = 1.7126, df = 1, p-value = 0.1906
## alternative hypothesis: true p is not equal to 0.111
## 99 percent confidence interval:
## 0.09960635 0.15335021
## sample estimates:
## p
## 0.124
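The interval reported by prop.test is not the simple p̂ ± z·SE interval. With `correct = FALSE`, the one-sample interval is the Wilson score interval, obtained by inverting the score test. The sketch below reproduces the 99% interval by hand under that assumption; the variable names (`centre`, `half.width`, `wilson`) are illustrative.

```r
x <- 124
n <- 1000
conf.level <- 0.99

p.hat <- x / n
z     <- qnorm(1 - (1 - conf.level) / 2)   # two-sided critical value

# Wilson score interval: centre and half-width
centre     <- (p.hat + z^2 / (2 * n)) / (1 + z^2 / n)
half.width <- z * sqrt(p.hat * (1 - p.hat) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
wilson     <- c(centre - half.width, centre + half.width)

res <- prop.test(x, n, p = 0.111, conf.level = conf.level, correct = FALSE)

# Both rows should show the same endpoints
rbind(by.hand = wilson, prop.test = as.numeric(res$conf.int))
```

Note that the null value passed as `p` affects only the test, not the interval.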

Exercise: Load in the Company500 dataset. This dataset contains various measurements on
a sample of 500 employees from a large company, including their age bracket (Age.Bracket:
either over or under 40), employment status (Status: either salaried or hourly), department,
and earnings bracket. Use the table function to obtain a count of how many employees are
hourly vs. salaried.

Company500 <- read.csv("~/Downloads/LAB8/Company500.csv")
table(Company500$Status)

##
## Hourly Salaried
## 351 149

Exercise: Perform a test at the 5% level of significance to determine whether the proportion
of employees who are salaried is below one-third, which is known to be the rate in a competing
company. Give whether you reject or fail to reject H0 .

prop.test(149, 500, 1/3, alternative = "less",
          correct = FALSE)

##
## 1-sample proportions test without continuity correction
##
## data: 149 out of 500, null probability 1/3
## X-squared = 2.809, df = 1, p-value = 0.04687
## alternative hypothesis: true p is less than 0.3333333
## 95 percent confidence interval:
## 0.000000 0.332659
## sample estimates:
## p
## 0.298

Since the p-value (0.04687) is below the level of significance (0.05), we reject H0. We have sufficient evidence at the 5% level of significance that the proportion of employees who are salaried is less than one-third, the rate at the competing company.
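For a one-sided alternative the p-value is a single tail area rather than twice a tail. As a rough check of the test above, the z statistic and lower-tail p-value can be computed directly from the counts 149 and 500 taken from the table output. This is a sketch under the same normal approximation that prop.test uses with `correct = FALSE`.

```r
x  <- 149    # salaried employees in the sample
n  <- 500    # sample size
p0 <- 1/3    # proportion under H0

p.hat   <- x / n
z.stat  <- (p.hat - p0) / sqrt(p0 * (1 - p0) / n)
p.value <- pnorm(z.stat)   # lower tail, because Ha is p < p0

res <- prop.test(x, n, p = p0, alternative = "less", correct = FALSE)

c(z = z.stat, p.by.hand = p.value, p.prop.test = res$p.value)
```

The statistic is negative because the sample proportion 0.298 lies below one-third, which is the direction the alternative hypothesis points.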
Exercise: Calculate a 99% interval for the true proportion of employees who are salaried at
this company. Print out the confidence interval below.

prop.test(149, 500, 1/3, alternative = "two.sided", conf.level = 0.99,
          correct = FALSE)

##
## 1-sample proportions test without continuity correction
##
## data: 149 out of 500, null probability 1/3
## X-squared = 2.809, df = 1, p-value = 0.09374
## alternative hypothesis: true p is not equal to 0.3333333
## 99 percent confidence interval:
## 0.2482371 0.3530537
## sample estimates:
## p
## 0.298

c( 0.2482371, 0.3530537)

## [1] 0.2482371 0.3530537

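The 99% confidence interval for the true proportion of salaried employees is [0.2482, 0.3531]. The two-sided p-value (0.09374) is above the 1% level of significance, so we would fail to reject H0 : p = 1/3 at that level, which is consistent with 1/3 lying inside the interval.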
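Instead of retyping the endpoints, the interval can be read from the object that prop.test returns. The sketch below assumes only the counts shown above; `res` and `ci` are illustrative names.

```r
res <- prop.test(149, 500, p = 1/3, alternative = "two.sided",
                 conf.level = 0.99, correct = FALSE)

ci <- res$conf.int           # numeric vector of length 2 with a conf.level attribute
round(as.numeric(ci), 4)     # endpoints rounded to four decimals
attr(ci, "conf.level")       # confidence level stored with the interval
```

Pulling the interval from the result avoids transcription errors when the numbers are reused in a written answer.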
The 1992 study referenced earlier also found that the proportion of boys who are LMH is greater than
the proportion of girls who are LMH. Suppose that we wish to test this on our sample, at the 2% level of
significance.
Give the hypotheses for this test.

H0 : pM = pF vs Ha : pM > pF

Use the table function to compare counts of LMH by gender:

table(React1000$Gender,React1000$Handed)

##
## Ambidextrous Left-Handed Right-Handed
## Female 13 40 464
## Male 31 40 412

Use table to count the number of girls and boys in this sample:

table(React1000$Gender)

##
## Female Male
## 517 483

Use prop.test to conduct this test.

prop.test(c(71, 53), c(483, 517), alternative = "greater",
          correct = FALSE)

##
## 2-sample test for equality of proportions without continuity correction
##
## data: c(71, 53) out of c(483, 517)
## X-squared = 4.5489, df = 1, p-value = 0.01647
## alternative hypothesis: greater
## 95 percent confidence interval:
## 0.01007626 1.00000000
## sample estimates:
## prop 1 prop 2
## 0.1469979 0.1025145
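The two-sample statistic can also be checked by hand. Under H0 the two groups share one proportion, so the standard error uses the pooled estimate (71 + 53)/(483 + 517). The sketch below takes the counts from the tables above (LMH = Ambidextrous + Left-Handed for each gender) and compares the result with prop.test; the variable names are illustrative.

```r
x <- c(Male = 71, Female = 53)     # LMH counts: 31 + 40 and 13 + 40
n <- c(Male = 483, Female = 517)   # group sizes

p.hat   <- x / n
p.pool  <- sum(x) / sum(n)                               # pooled proportion under H0
se.pool <- sqrt(p.pool * (1 - p.pool) * sum(1 / n))      # pooled standard error
z.stat  <- unname((p.hat["Male"] - p.hat["Female"]) / se.pool)
p.value <- pnorm(z.stat, lower.tail = FALSE)             # upper tail, Ha: pM > pF

res <- prop.test(x, n, alternative = "greater", correct = FALSE)

c(z.squared = z.stat^2, X.squared = unname(res$statistic))
c(p.by.hand = p.value, p.prop.test = res$p.value)
```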

What is your decision regarding this test?


Since the p-value (0.01647) is less than the level of significance (0.02), we reject H0. There is sufficient evidence at the 2% level to conclude that the proportion of boys who are LMH is greater than the proportion of girls who are LMH.
Use prop.test to create a confidence interval for the difference pM − pF .

prop.test(c(71, 53), c(483, 517), alternative = "two.sided", conf.level = 0.98,
          correct = FALSE)

##
## 2-sample test for equality of proportions without continuity correction
##
## data: c(71, 53) out of c(483, 517)
## X-squared = 4.5489, df = 1, p-value = 0.03294

## alternative hypothesis: two.sided
## 98 percent confidence interval:
## -0.004179282 0.093146127
## sample estimates:
## prop 1 prop 2
## 0.1469979 0.1025145

Print out the confidence interval below.


The 98% confidence interval for pM − pF is [-0.0042, 0.0931].
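Unlike the test, the interval for a difference of proportions does not pool the two samples: each group contributes its own variance term. As a sketch, the 98% interval can be reproduced with the unpooled standard error, assuming the same counts as above and no continuity correction. The names `d.hat`, `se` and `wald` are illustrative.

```r
x <- c(71, 53)     # LMH counts: boys, girls
n <- c(483, 517)   # group sizes
conf.level <- 0.98

p.hat <- x / n
d.hat <- p.hat[1] - p.hat[2]                 # estimate of pM - pF
se    <- sqrt(sum(p.hat * (1 - p.hat) / n))  # unpooled standard error
z     <- qnorm(1 - (1 - conf.level) / 2)
wald  <- d.hat + c(-1, 1) * z * se

res <- prop.test(x, n, alternative = "two.sided",
                 conf.level = conf.level, correct = FALSE)

rbind(by.hand = wald, prop.test = as.numeric(res$conf.int))
```

Because the interval and the test use different standard errors, an interval that contains 0 and a test decision do not always line up exactly.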
Exercise: Using the Company500 dataset, use the table function to create a table comparing the
ages and employment status of the employees in this sample.

table(Company500$Age.Bracket,Company500$Status)

##
## Hourly Salaried
## Over 40 105 59
## Under 40 246 90
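The success counts and group sizes for a two-sample test can be read from this table instead of being added up by hand. The sketch below rebuilds the table from the printed counts so that it runs without the CSV file. With the data loaded, `tab <- table(Company500$Age.Bracket, Company500$Status)` would play the same role. The names `tab`, `x` and `n` are illustrative.

```r
# Two-way table rebuilt from the output above (filled column by column)
tab <- matrix(c(105, 246, 59, 90), nrow = 2,
              dimnames = list(Age.Bracket = c("Over 40", "Under 40"),
                              Status      = c("Hourly", "Salaried")))

x <- tab["Over 40", ]   # employees over 40 within each status group
n <- colSums(tab)       # total employees in each status group

rbind(over.40 = x, total = n)
```

The vectors `x` and `n` are in the form prop.test expects for comparing the proportion over 40 between hourly and salaried workers.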

Exercise: Perform a test at the 1% level of significance to determine whether the proportion of
employees who are over 40 differs between the salaried and hourly workers. Mention whether
you reject or fail to reject H0 .

prop.test(c(105, 59), c(351, 149), alternative = "two.sided", conf.level = 0.99,
          correct = FALSE)

##
## 2-sample test for equality of proportions without continuity correction
##
## data: c(105, 59) out of c(351, 149)
## X-squared = 4.4492, df = 1, p-value = 0.03492
## alternative hypothesis: two.sided
## 99 percent confidence interval:
## -0.21771465 0.02405894
## sample estimates:
## prop 1 prop 2
## 0.2991453 0.3959732

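Since the p-value (0.03492) is above the 1% level of significance, we fail to reject H0. There is insufficient evidence that the proportion of employees who are over 40 differs between hourly and salaried workers.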
