Exploratory Data Analysis - NOTES
One of the main uses of exploratory data analysis is to detect mistakes in the data.
NB: Before we start EDA in R, we will look at some important concepts that will help us in the analysis (see the accompanying PDF notes).
###################################################
Why use R
i. Statistical computing
########################################################
R Basics Sessions
R and R-Studio
RStudio is an integrated development environment (IDE) that provides features to make using and managing R much easier.
1. Getting help in R
To get help on specific topics, we can use the help() function along with the topic we want to look up, or the ? shortcut:
help(Syntax)
?Syntax
2. Operators in R
3. !, &, | - Logical Operators
4. ~ - Model Formulae
6. : - Creating Sequences
a. Arithmetic Operators
Operator Description
+ addition
- subtraction
* multiplication
/ division
^ or ** exponentiation
b. Relational Operators
Operator Description
< less than
<= less than or equal to
> greater than
>= greater than or equal to
== exactly equal to
!= not equal to
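A few of these operators at the R console (the values here are arbitrary, chosen only for illustration):
2 + 3 * 4      # 14: multiplication happens before addition
2 ^ 3          # 8: exponentiation
5 == 5         # TRUE
5 != 3         # TRUE
!(5 > 3)       # FALSE: logical negation
TRUE & FALSE   # FALSE: logical AND
1:10           # the sequence 1 2 3 ... 10
y ~ x          # a model formula (used later with lm())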
EXPLORATORY DATA ANALYSIS
We will now start looking at exploratory data analysis in R.
1. Measures of Location.
i) Measures of Central Tendency
Measures that indicate the approximate center of a distribution are called measures of central tendency. Central tendency tells us how the group of data is clustered around the central value of the distribution. The common measures are:
Arithmetic Mean
Geometric Mean
Harmonic Mean
Median
Mode
Arithmetic Mean
The arithmetic mean is simply the average of the numbers and represents the central value of the data distribution. It is calculated by adding all the values and then dividing by the number of observations.
Formula: x̄ = (x1 + x2 + … + xn) / n = (Σxi) / n
Example:
# defining a vector
x <- c(3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23)
# print the mean
mean(x)
or
y <- mean(x)
or
print(mean(x))
Press Ctrl + R (Ctrl + Enter in RStudio) to run the code, or click Run at the top of the script editor.
Output:
[1] 21.5
# computing the mean manually
W <- sum(x)           # sum of all the values
MeanX <- sum(x) / 14  # divide by the number of observations
or
n <- length(x)        # number of observations
XMean <- sum(x) / n
or
Xmean <- W / n
Let's begin by looking at a simple example with a dataset that comes pre-loaded in your version of R, called cars, by Ezekiel (1930). These data give the speed of cars and the distances taken to stop. If we were to compute the mean for cars$speed (the variable speed in our dataset cars), we would simply sum the values in the speed column and divide by 50.
(4 + 4 + 7 + 7 + 8 + 9 + 10 + 10 + 10 + 11 + 11 + 12 + 12 + 12 + 12 + 13 + 13 + 13 + 13 + 14
+ 14 + 14 + 14 + 15 + 15 + 15 + 16 + 16 + 17 + 17 + 17 + 18 + 18 + 18 + 18 + 19 + 19 + 19
+ 20 + 20 + 20 + 20 + 20 + 22 + 23 + 24 + 24 + 24 + 24 + 25) / 50
But this data is too big to calculate the mean manually. To work with a data set that is pre-loaded in R, we first view it:
View(cars)
or
cars
and then compute the mean:
sumofspeed <- sum(cars$speed)
sumofspeed / 50
## [1] 15.4
or
sum(cars$speed) / length(cars$speed)
## [1] 15.4
mean(cars$speed)
## [1] 15.4
N/B:
Computing the mean for the cars data worked out nicely because there were no missing
values or NAs. If there were NAs we would be able to omit those from our calculations. For
example,
mean(cars$speed, na.rm=TRUE)
## [1] 15.4
While the mean is not a new concept to you, there’s some notation that is important for you
to understand.
n. Used to refer to the sample size: the number of observations (rows) that we are working with.
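In R, the sample size can be obtained directly; for the cars data:
length(cars$speed)   # number of observations in the speed column
## [1] 50
nrow(cars)           # number of rows in the data frame
## [1] 50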
EXERCISE
1. Geometric Mean
2. Harmonic Mean
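As a hint, one possible way (a sketch, not the only approach) to compute these in base R for the vector x defined earlier (the geometric mean requires positive values):
# geometric mean: the n-th root of the product of the values
exp(mean(log(x)))
# harmonic mean: the reciprocal of the mean of the reciprocals
1 / mean(1 / x)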
The median
The median is another measure of central tendency: the middle value in a set of ordered observations. For cars$speed, we can sort the values in ascending order using the sort() function.
Example 1
# Defining vector
x <- c(3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23)
# Print Median
median(x)
# output
21.5
Example 2
sort(cars$speed)
## [1] 4 4 7 7 8 9 10 10 10 11 11 12 12 12 12 13 13 13 13 14 14 14 14 15 15
## [26] 15 16 16 17 17 17 18 18 18 18 19 19 19 20 20 20 20 20 22 23 24 24 24 24 25
Since there are 50 observations, the middle values are at positions 25 and 26; both are 15, so the median is 15. If the value at position 25 had been 14 and the value at position 26 had been 15, we would take the average of the two (14.5).
median(cars$speed)
## [1] 15
The Mode
The mode of a set of observations is the value that occurs most frequently. There is no standard function in R that computes the mode, but you can build a simple frequency table with table() and pick out the value(s) with the highest count, as in the examples below.
# Defining vector
x <- c(3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29, 56, 37, 45, 1, 25, 8)
y <- table(x)
print(y)
# Mode of x: the value(s) with the highest frequency
m <- names(y)[y == max(y)]
# Print mode
print(m)
Output:
1 3 5 7 8 12 13 14 20 23 25 29 37 39 40 45 56
1 1 1 1 1 1 1 1 1 4 1 1 1 1 1 1 2
[1] "23"
# Defining vector (with two more 56s added)
x <- c(3, 7, 5, 13, 20, 23, 39, 23, 40, 23, 14, 12, 56, 23, 29, 56, 37, 45, 1, 25, 8, 56, 56)
y <- table(x)
print(y)
# Mode of x
m <- names(y)[y == max(y)]
# Print mode
print(m)
Output:
1 3 5 7 8 12 13 14 20 23 25 29 37 39 40 45 56
1 1 1 1 1 1 1 1 1 4 1 1 1 1 1 1 4
[1] "23" "56"
The data are now bimodal: both 23 and 56 occur four times.
table(cars$speed)
##
## 4 7 8 9 10 11 12 13 14 15 16 17 18 19 20 22 23 24 25
## 2 2 1 1 3 2 4 4 4 3 2 3 4 3 5 1 1 4 1
You can also compute the mode of cars$speed using the following algorithm:
modeforcars <- table(cars$speed)
names(modeforcars)[modeforcars == max(modeforcars)]
## [1] "20"
Exercise 2
1. Find the Mean, Median and Mode using mtcars dataset pre-loaded in R.
ii) Measures of Partition: Quartiles, Deciles and Percentiles. These three divide a sorted data set into four, ten and one hundred equal parts, respectively.
a) Quartiles
There are three quartiles of an observation variable. The first quartile, or lower
quartile, is the value that cuts off the first 25% of the data when it is sorted in
ascending order. The second quartile, or median, is the value that cuts off the first
50%. The third quartile, or upper quartile, is the value that cuts off the first 75%.
Example
Find the quartiles of the eruption durations in the data set faithful.
Solution
We apply the quantile() function to the eruption durations:
quantile(faithful$eruptions)
##      0%     25%     50%     75%    100%
##  1.6000  2.1627  4.0000  4.4543  5.1000
Answer
The first, second and third quartiles of the eruption duration are 2.1627, 4.0000 and 4.4543 minutes respectively.
b) Deciles
In statistics, deciles are numbers that split a dataset into ten groups of equal
frequency. The first decile is the point where 10% of all data values lie below it. The
second decile is the point where 20% of all data values lie below it, and so on. We can calculate deciles in R using the quantile() function, as shown below.
Example
Calculate Deciles in R
The following code shows how to create a dataset with 20 values and then calculate its deciles:
#create dataset
data <- c(56, 58, 64, 67, 68, 73, 78, 83, 84, 88, 89, 90, 91, 92, 93, 93, 94, 95, 97, 99)
#calculate deciles
quantile(data, probs = seq(0.1, 0.9, by = 0.1))
Output
##  10%  20%  30%  40%  50%  60%  70%  80%  90%
## 63.4 67.8 76.5 83.6 88.5 90.4 92.3 93.2 95.2
c) Percentiles
The nth percentile of a dataset is the value that cuts off the first n percent of the data values when they are sorted in ascending order. For example, the 90th percentile of a dataset is the value that cuts off the bottom 90% of the data values from the top 10%.
One of the most commonly used percentiles is the 50th percentile, which represents the
median value of a dataset: this is the value at which 50% of all data values fall below.
What score does a student need to earn on a particular test to be in the top 10% of
scores? To answer this, we would find the 90th percentile of all scores, which is the
value that separates the bottom 90% of values from the top 10%.
What heights encompass the middle 50% of heights for students at a particular
school? To answer this, we would find the 75th percentile of heights and 25th
percentile of heights, which are the two values that determine the upper and lower bounds of the middle 50%.
To Calculate Percentiles in R
We can easily calculate percentiles in R using the quantile() function, which uses the following syntax: quantile(x, probs), where x is a numeric vector and probs is a vector of probabilities between 0 and 1.
Example
Find the 32nd, 57th and 98th percentiles of the eruption durations in the data set faithful.
Solution
We apply the quantile() function to compute the percentiles of eruptions with the desired percentage ratios:
quantile(faithful$eruptions, c(0.32, 0.57, 0.98))
##    32%    57%    98%
## 2.3952 4.1330 4.9330
Answer
The 32nd, 57th and 98th percentiles of the eruption duration are 2.3952, 4.1330 and 4.9330
minutes respectively.
Exercise
1. Find the 17th, 43rd, 67th and 85th percentiles of the eruption waiting periods in
faithful.
2. Measures of Spread / Dispersion
Spread is the degree of scatter or variation of the variable about the central value.
i) The range
ii) Inter-Quartile range
iii) Quartile Deviation also called semi Inter-Quartile range
iv) Mean Absolute Deviation
v) Variance
vi) Standard deviation
In addition to computing measures of central tendency, another summary statistic we’d like to
compute is variability. How spread out are the data? How far from the mean and median do the values tend to fall?
Range
The range of a variable is the largest value minus the smallest value. We can compute the
largest value using the max() function and the smallest value using the min() function. In the cars data:
min(cars$speed)
## [1] 4
max(cars$speed)
## [1] 25
R has an even better function, range( ) that outputs the minimum and maximum value in a
vector
range(cars$speed)
## [1] 4 25
Interquartile range
The interquartile range is similar to the range, but instead of calculating the difference between the largest and smallest values, you calculate the difference between the 25th and 75th percentiles (the first and third quartiles). We can calculate the interquartile range (IQR) using IQR(). This is the range spanned by the middle half of the data; in other words, the 75th percentile minus the 25th percentile.
IQR(cars$speed)
## [1] 7
quantile(cars$speed)
##   0%  25%  50%  75% 100%
##    4   12   15   19   25
quantile(cars$speed, c(0.25, 0.75))
## 25% 75%
##  12  19
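The quartile deviation (semi-interquartile range) and the mean absolute deviation listed at the start of this section can be computed from the same building blocks; a minimal sketch on cars$speed:
# Quartile deviation: half the interquartile range
IQR(cars$speed) / 2
# Mean absolute deviation: average absolute distance from the mean
mean(abs(cars$speed - mean(cars$speed)))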
Variance
The variance is a numerical measure of how the data values are dispersed around the mean. It measures how far a set of numbers is spread out. (A variance of zero indicates that all the values are identical.) A non-zero variance is always positive: a small variance indicates that the data points tend to be very close to the mean (expected value), while a high variance indicates that the data points are spread far from the mean and from each other.
The variance of a dataset X is sometimes written as Var(X); for a sample it is commonly denoted s². The formula for the sample variance is:
s² = Σ(xi − x̄)² / (n − 1)
var(cars$speed)
## [1] 27.95918
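To connect the formula to R's output, the sample variance can also be checked by hand:
n <- length(cars$speed)
sum((cars$speed - mean(cars$speed))^2) / (n - 1)
## [1] 27.95918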
Standard deviation
The square root of the variance is the standard deviation. The formula for the sample standard deviation is:
s = √s² = √( Σ(xi − x̄)² / (n − 1) )
sqrt(var(cars$speed))
## [1] 5.287644
sd(cars$speed)
## [1] 5.287644
Skew and kurtosis are two more descriptive statistics that you may encounter.
Skew
Skewness is a measure of symmetry. If there are more extremely large values than extremely
small ones, the data can be described as positively skewed. If the data tend to have a lot of
extreme small values and not many extremely large values then the data is considered
negatively skewed. As a rule, negative skewness indicates that the mean of the data values is less than the median, and the data distribution is left-skewed. Positive skewness indicates that the mean of the data values is larger than the median, and the data distribution is right-skewed.
Figure: From left to right: Positive skew, no skew, and negative skew
We can compute the skew by using a function called skew( ) from the psych package.
library(psych)
skew(cars$speed)
## [1] -0.1105533
Kurtosis
Kurtosis is the measure of the pointiness of the data. Intuitively, the kurtosis is a measure of
the peakedness of the data distribution. We can see how fat or thin the tails of a distribution
are relative to a normal distribution. Negative kurtosis indicates a flat data distribution, which is said to be platykurtic. Positive kurtosis indicates a peaked distribution, which is said to be leptokurtic. Incidentally, the normal distribution has zero (excess) kurtosis and is said to be mesokurtic.
We can compute the kurtosis by using a function called kurtosi( ) from the psych package.
kurtosi(cars$speed)
## [1] -0.6730924
Where do you think cars$speed falls? Let's plot it (see the figure below).
There’s an easier way to compute some measures of central tendency and variability using
the summary( ) function. The summary function provides the min( ), max( ), median( ),
mean( ), the 75% and 25% quantiles. To compute all these measures for a single variable
type:
summary(cars$speed)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##     4.0    12.0    15.0    15.4    19.0    25.0
summary(cars)
##      speed           dist
##  Min.   : 4.0   Min.   :  2.00
##  1st Qu.:12.0   1st Qu.: 26.00
##  Median :15.0   Median : 36.00
##  Mean   :15.4   Mean   : 42.98
##  3rd Qu.:19.0   3rd Qu.: 56.00
##  Max.   :25.0   Max.   :120.00
A similar function to the summary( ) function is the describe( ) function in the psych
package. This function is useful when your data are interval or ratio scale. Unlike the
summary ( ) function, it calculates the descriptive statistics for any type of variable you
give it. It also includes other measures that we discussed earlier such as the trimmed mean
(default is 10%), skew, kurtosis, and range. n is the sample size (or the number of non-
missing values)
describe(cars)
There are more advanced functions to compute descriptive statistics by group using the psych
package. One such function is describeBy( ). You can specify a grouping variable. Let’s say
we wanted to obtain descriptive statistics separately for each grouping of data. For example,
we could group our data by the different speeds in cars. We could use speed as our grouping
variable as follows:
describeBy(cars, group=cars$speed)
Bivariate Data
So far we have confined our discussion to distributions involving only one variable. Sometimes, in practical applications, we come across data where each item of the set comprises the values of two or more variables. Such paired observations can be written as:
(x1, y1), (x2, y2), ..., (xn, yn)
Three common tools for exploring bivariate data are:
1. Scatter Diagrams.
2. Correlation.
3. Regression.
1. Scatter Diagrams.
A scatter diagram is a tool for analysing relationships between two variables. One variable is
plotted on the horizontal axis and the other is plotted on the vertical axis. The pattern of their
intersecting points can graphically show relationship patterns. Most often a scatter diagram is used to explore whether a relationship exists between the two variables.
i) The basic function is plot(x, y), where x and y are numeric vectors giving the coordinates of the points to plot.
# Simple Scatterplot
x<-c(1,2,3,4,5,6,7)
y<-c(2,4,6,8,10,12,14)
plot(x,y)
Plot 2
attach(mtcars)
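A minimal sketch for this second plot, assuming the weight (wt) and fuel-efficiency (mpg) columns of mtcars as the two variables:
# Scatter plot of car weight against miles per gallon
plot(wt, mpg,
     main = "mtcars: weight vs. fuel efficiency",
     xlab = "Car weight (1000 lbs)",
     ylab = "Miles per gallon",
     pch = 19)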
Scatter diagrams will generally show one of six possible correlations between the variables:
i) Strong Positive Correlation The value of Y clearly increases as the value of X increases.
ii) Strong Negative Correlation The value of Y clearly decreases as the value of X
increases.
iii) Weak Positive Correlation The value of Y increases slightly as the value of X
increases.
iv) Weak Negative Correlation The value of Y decreases slightly as the value of X
increases.
v) Complex Correlation The value of Y seems to be related to the value of X, but the relationship is not easily determined.
vi) No Correlation There is no apparent relationship between the two variables.
2. Correlation
Correlation is a statistical method to measure the relationship between the two quantitative
variables.
The correlation coefficient (r) measures the strength and direction of (linear) relationship
between the two quantitative variables. r can range from +1 (perfect positive correlation) to -1 (perfect negative correlation).
Positive values of r indicate a positive relationship and vice versa. The higher the absolute value of r, the stronger the correlation. If the value of r is 0, there is no linear relationship between the two variables.
The below table suggests the interpretation of r at different absolute values. These cut-offs
are arbitrary and should be used judiciously while interpreting the dataset.
Note: a) Most of the time, "correlation coefficient" refers to Pearson's r unless otherwise specified. b) The appropriate correlation method depends on the underlying data types, the sample size, and whether the relationship between the variables is linear or non-linear.
Pearson correlation (r) measures the linear dependence between two variables (x and y). It is also known as a parametric correlation test because it depends on the distribution of the data. It is computed as:
r = Σ(xi − x̄)(yi − ȳ) / √( Σ(xi − x̄)² · Σ(yi − ȳ)² )
where x̄ and ȳ are the means of x and y.
cor.test() tests for association/correlation between paired samples. It returns both the correlation coefficient and the significance level (p-value) of the test.
Example 1
# correlation of vectors in R
x <- c(0,1,1,2,3,5,8,13,21,34)
y <- log(x+1)
cor(x,y)
Example 2
x <- c(0,1,1,2,3,5,8,13,21,34)
y <- log(x+1)
cor(x,y,method="pearson")
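cor.test() is described above but not demonstrated; a minimal sketch using the built-in cars data:
# Test whether speed and stopping distance are correlated
cor.test(cars$speed, cars$dist, method = "pearson")
# The output includes the estimated correlation, a t statistic,
# the p-value and a 95% confidence interval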
3. Regression
Regression analysis, in general sense, means the estimation or prediction of the unknown
value of one variable from the known value of the other variable.
Regression analysis can be thought of as the flip side of correlation: it has to do with finding the equation of the kind of straight line you were just looking at in the scatter diagrams.
Suppose we have a sample of size n with two sets of measures, denoted by x and y. We can predict the values of y from the values of x by using the equation of a straight line:
y = a + bx
Not every problem can be solved with the same algorithm. In this case, linear regression
assumes that there exists a linear relationship between the response variable and the
explanatory variables. This means that you can fit a line between the two (or more variables).
In this particular example, you can calculate the height of a child if you know her age: height = a + b × age.
In this case, “a” and “b” are called the intercept and the slope respectively. With the same
example, “a” or the intercept, is the value from where you start measuring. Newborn babies
with zero months are not zero centimeters necessarily; this is the function of the intercept.
The slope measures the change of height with respect to the age in months. In general, for
every month older the child is, his or her height will increase by "b".
Linear Regression in R
Y = mX + c, where
m = slope of the line
c = Y-intercept
require("datasets")
data("iris")
str(iris)
head(iris)
Linear Models
Since simple linear regression involves a single response and a single predictor, let's take the Sepal.Width attribute as our target (Y) and the Sepal.Length attribute as our predictor (X) to find out whether there is any linear relationship between them.
Example 1
y <- c(1, 2, 3, 4, 5, 6, 7)
x <- c(2, 5, 6, 8, 9, 10, 18)
M <- lm(x ~ y)
summary(M)
Example 2
Using the iris attributes described above (X = Sepal.Length as the predictor and Y = Sepal.Width as the target):
X <- iris$Sepal.Length   # predictor
Y <- iris$Sepal.Width    # target
head(X)
model1 <- lm(Y ~ X)
model1
The fitted line is approximately:
Y = 3.41895 - 0.06188X
Interpretation: the intercept 3.41895 is the predicted value of Y (Sepal.Width) when X = 0, and the slope -0.06188 means that, on average, Y decreases by about 0.062 for every one-unit increase in X (Sepal.Length).
Example 3
Model2 <- lm(Petal.Width ~ Petal.Length, data = iris)
summary(Model2)
The model fitted here is lm(Petal.Width ~ Petal.Length).
Multiple R-squared: 0.9271, i.e. 0.9271 * 100 = 92.71%, implying that 92.71% of the variability in Y (Petal.Width) is explained by X (Petal.Length).
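As a sketch of how the fitted model from Example 3 can be used for prediction (the petal lengths below are arbitrary illustration values):
# Predict petal width for a few hypothetical petal lengths
new_data <- data.frame(Petal.Length = c(1.5, 4.0, 6.0))
predict(Model2, newdata = new_data)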
PLOTS/GRAPHICS