
Assignment 1

The document is a statistics assignment that analyzes data on penguin bill lengths from three islands in the Palmer Archipelago over three seasons. It includes exploratory analysis of penguin counts per season and of the bill length distribution. Key findings are that penguin counts increased each season, bill lengths were bimodally distributed between approximately 32.1mm and 61.8mm (the histogram range) with a mean of 43.92mm, and a 95% confidence interval for the mean bill length was calculated as 43.32mm to 44.52mm. Limitations in applying this interval to a single outlying measurement are discussed.

Uploaded by

Anika O'Connell

Name: Anika O’Connell-Temple ID number: 20023303

Applied Statistics 161.111 S3 2022


Assignment 1

Due date: Friday 6th January 2023

Total marks: 30 Assessment value: 17%

Background

The population of interest is the Pygoscelis penguins nesting on the islands within the Palmer Archipelago.
The sample data came from a study in which researchers randomly selected three typical islands in the
Palmer Archipelago. They took measurements on all penguins found
nesting on these three islands during three breeding seasons.

The data in the Excel file penguins_2022_S3.xlsx include information on species (Adélie, Gentoo
and Chinstrap), sex (female or male, determined from a blood test), bill length and bill depth
(both measured in mm), which season the data were collected in, and which island the penguin
was found on.

Use the data to answer the following questions in the spaces provided.

Use Excel and incorporate the output into your answers. You can re-size the answer spaces.
When you have finished your assignment, save it as a PDF and upload it to the Stream Assignment 1 Dropbox.

Part A: Exploratory analysis of species [6 marks]

A1: Use Excel to produce a table of counts of penguins for the different seasons.
[2 marks]
Season         Number of penguins
Season 1       109
Season 2       114
Season 3       119
Grand Total    342


A2: Use Excel to draw an appropriate graph to display the table you created above. [2 marks]

[Chart: "Number of penguins per season"; x-axis categories Season 1, Season 2, Season 3; y-axis ticks from 104 to 120]

A3: What do the graph and table tell you about the numbers of penguins found in the different seasons?
[2 marks]
The graph and table show that the fewest penguins (109) were found in the first season and the most (119)
in the third, with the count rising by five each season.

Part B: Exploratory analysis of penguin bill length [14 marks]

B1: Use Excel to draw a boxplot of bill lengths of all the penguins in the sample. [2 marks]

B2: What does your boxplot tell you about the distribution of bill lengths in the sample? [2 marks]
Centre: the median is approximately 45mm
Spread: the IQR is approximately 9mm (48-39)
Shape: the median is near the middle of the box and close to the mean, which suggests that the
distribution is roughly symmetric.
Outliers: There are no outliers

B3: Use Excel to draw a histogram of bill lengths of all the penguins in the sample. [2 marks]

B4: What does your histogram tell you about the distribution of bill lengths in the sample? [2 marks]
Centre: the tallest bin of the histogram is 48.3mm to 51mm.
Spread: the bill lengths fall in bins spanning approximately 32.1mm to 61.8mm, a range of roughly 29.7mm.
Shape: the distribution is bi-modal.
Outliers: there don’t appear to be any outliers.

B5: Use Excel to calculate the numerical summaries of bill lengths. Fill in the table with the values rounded
sensibly. [2 marks]

Statistic             Bill length (mm)
Mean                  43.92
Standard deviation    5.46
Minimum               32.1
Lower Quartile        39.2
Median                44.45
Upper Quartile        48.5
Maximum               59.6
Sample size           342
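
The spread measures implied by this table can be checked with a short calculation. The sketch below uses only the reported five-number summary; the 1.5 x IQR fence rule is an assumed convention for flagging boxplot outliers, not something stated in the assignment.

```python
# Reported summary statistics for bill length (mm), copied from the table.
minimum, lower_q, median, upper_q, maximum = 32.1, 39.2, 44.45, 48.5, 59.6

# Spread measures derived from the five-number summary.
data_range = maximum - minimum
iqr = upper_q - lower_q

# Assumed outlier rule: the common 1.5 x IQR boxplot convention.
lower_fence = lower_q - 1.5 * iqr
upper_fence = upper_q + 1.5 * iqr

# An outlier would show up as a minimum or maximum beyond a fence.
has_outliers = minimum < lower_fence or maximum > upper_fence

print(f"Range: {data_range:.1f} mm")
print(f"IQR: {iqr:.1f} mm")
print(f"Fences: {lower_fence:.2f} mm to {upper_fence:.2f} mm")
print(f"Values beyond the fences: {has_outliers}")
```

Under that assumed rule the fences sit at about 25.25mm and 62.45mm, so both the minimum and the maximum lie inside them, which agrees with the "no outliers" reading of the boxplot.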

B6: What do your summary statistics tell you about the distribution of bill lengths in the sample? [2 marks]
The summary statistics suggest that bill lengths are not normally distributed.

Part C: Confidence interval for mean bill length [10 marks]

C1: Calculate a 95% confidence interval for the mean length of bills of penguins in the population. To get full
marks you must show your working for the following: [4 marks]


Standard error = s/√n = 5.46/√342 ≈ 0.30


Confidence interval = 43.92 ± 0.60
Lower limit = 43.32
Upper limit = 44.52
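
The figures above can be reproduced from the reported mean, standard deviation and sample size. This is a sketch under two assumptions: that the margin of 0.60 came from multiplying the rounded standard error by 2 (inferred from 0.60 / 0.30, not stated in the working), and, for comparison only, that a normal critical value is an acceptable stand-in for the t value at this sample size.

```python
import math
from statistics import NormalDist

# Reported sample statistics for bill length (mm).
mean, sd, n = 43.92, 5.46, 342

# Standard error of the mean: s / sqrt(n).
se = sd / math.sqrt(n)

# Route 1 (assumed to match the reported working): standard error rounded
# to 2 decimal places, then a multiplier of 2.
se_rounded = round(se, 2)
margin_rounded = 2 * se_rounded
ci_rounded = (mean - margin_rounded, mean + margin_rounded)

# Route 2 (comparison only): unrounded standard error with the normal
# critical value for 95% confidence.
z = NormalDist().inv_cdf(0.975)
margin_exact = z * se
ci_exact = (mean - margin_exact, mean + margin_exact)

print(f"Standard error: {se:.4f}")
print(f"Rounded route:  {ci_rounded[0]:.2f} to {ci_rounded[1]:.2f}")
print(f"Unrounded route: {ci_exact[0]:.2f} to {ci_exact[1]:.2f}")
```

The rounded route gives 43.32 to 44.52, matching the limits above. The unrounded route is slightly narrower (about 43.34 to 44.50), which shows how much of the interval width comes from rounding.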

C2: Write a sentence to interpret your confidence interval in context. [2 marks]


I am 95% confident that the interval from 43.32 to 44.52 millimetres captures the true mean bill length of
Pygoscelis penguins nesting on the islands of the Palmer Archipelago.

C3: Two conditions (normality of the sampling distribution and representativeness of the sample) need to be
satisfied for the confidence interval to be valid.
a) How do we know that the normality condition is met? Explain. [1 mark]
The sample size (342) is much larger than 30, so by the Central Limit Theorem the sampling
distribution of the mean will be approximately normal. For samples smaller than about 30 the
approximation cannot be relied on, and the sampling distribution will only be normal if the population is normal.
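
The Central Limit Theorem argument can be illustrated with a small simulation. This is a minimal sketch assuming a hypothetical bimodal population (an even mix of two normal distributions centred at 39mm and 48mm, each with standard deviation 3mm). These parameters are illustrative assumptions, not estimates from the penguin data.

```python
import random
import statistics

random.seed(161111)  # fixed seed so the run is repeatable


def draw_bill_length():
    """One draw from a hypothetical bimodal population (not the real data)."""
    centre = 39.0 if random.random() < 0.5 else 48.0
    return random.gauss(centre, 3.0)


def sample_mean(n):
    """Mean of one simulated sample of n bill lengths."""
    return statistics.fmean([draw_bill_length() for _ in range(n)])


n = 342  # same sample size as the assignment
means = [sample_mean(n) for _ in range(2000)]

centre = statistics.fmean(means)
spread = statistics.stdev(means)

# For a normal distribution, about 95% of values lie within 1.96 standard
# deviations of the centre, so check that share among the sample means.
within = sum(abs(m - centre) <= 1.96 * spread for m in means) / len(means)

print(f"Mean of sample means: {centre:.2f}")
print(f"SD of sample means:   {spread:.3f}")
print(f"Share within 1.96 SD: {within:.3f}")
```

Even though each simulated bill length comes from a two-peaked population, the share of sample means within 1.96 standard deviations should land close to 0.95, which is what a normal sampling distribution predicts.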

b) Discuss problems with the representative condition. [2 marks]


We don’t always know how a sample was selected, and whether the sample is representative depends almost
entirely on that information.

C4: A fishing vessel finds a penguin caught in a fishing net near the Palmer Archipelago. The scientist on
board reports this penguin has a bill length of 32.5mm. Does your confidence interval suggest that the scientist
has made a mistake with their measurement? Explain. [1 mark]

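No. A 95% confidence interval is meant to contain the true population mean; it does not mean that every
penguin in the population will have a bill length within its limits. Individual bill lengths vary far more
than the mean does (the sample minimum was 32.1mm), and outliers are always possible.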
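
The distinction between an interval for the mean and the spread of individual penguins can be checked numerically. The sketch below uses only values reported earlier in the assignment; treating the sample minimum and maximum as a rough guide to plausible individual values is an assumption made for illustration.

```python
# Values reported earlier in the assignment (all in mm).
mean, sd = 43.92, 5.46
ci_lower, ci_upper = 43.32, 44.52
sample_min, sample_max = 32.1, 59.6

reported = 32.5  # bill length measured by the scientist on the vessel

# The confidence interval describes the population mean, not individuals.
in_ci = ci_lower <= reported <= ci_upper

# Assumed rough check for an individual: does it fall inside the range of
# bill lengths already observed in the sample?
in_sample_range = sample_min <= reported <= sample_max

# Distance from the sample mean in standard deviations.
z_score = (reported - mean) / sd

print(f"Inside the 95% CI for the mean: {in_ci}")
print(f"Inside the observed sample range: {in_sample_range}")
print(f"Standard deviations from the mean: {z_score:.2f}")
```

The measurement falls outside the confidence interval but inside the observed range, roughly 2.1 standard deviations below the sample mean, so it is unusual without being implausible.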
