Lab 1- Basic functions in R and plotting

The document provides an overview of basic functions in R, including arithmetic operations, vector manipulation, and data plotting techniques. It covers creating vectors, calculating statistical measures like mean and standard deviation, and generating various plots such as histograms, scatter plots, and boxplots. Additionally, it explains the use of the scan() function for data input and highlights the importance of using vectors for functions like mean.

Uploaded by

robinson.m
Lab 1- Basic Functions in R and Plotting

1. Basic Arithmetic Operations:


# Addition
x <- 5
y <- 3
z <- x + y
z

# Addition
5+3
# Output: -------

# Subtraction
10 - 4
# Output: -------

# Multiplication
6*2
# Output: -------

# Division
15 / 3
# Output: -------

Addition, subtraction, multiplication and division with vectors


Addition with scalar
> c(2,3,5,7) + 10
# Output: -------

Addition with data vectors


> c(2,3,5,7) + c(-2,-3, -5, 8)
# Output: -------

> c(2,3,5,7) + c(-2,-3)


# Output: -------

> c(2,3,5,7) + c(-2,-3, -5)


Warning message:
In c(2, 3, 5, 7) + c(-2, -3, -5) :
longer object length is not a multiple of shorter object length
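
The vector additions above rely on R's recycling rule: the shorter vector is reused from its start until it matches the longer one, and a warning appears only when the longer length is not a whole multiple of the shorter. A minimal sketch of that rule follows; the names `a`, `b`, `recycled`, `same_result` and `partial` are illustrative and not part of the lab.

```r
# Recycling: the shorter vector is reused from its start as often as needed.
a <- c(2, 3, 5, 7)
b <- c(-2, -3)

# Build the recycled version of b by hand and compare with R's own result.
recycled <- rep(b, length.out = length(a))   # -2 -3 -2 -3
same_result <- identical(a + b, a + recycled)
print(same_result)

# Length 4 is not a multiple of length 3, so R still computes a result
# but also issues a warning; suppressWarnings() hides it here.
partial <- suppressWarnings(a + c(-2, -3, -5))
print(length(partial))
```

The result always has the length of the longer vector, whether or not a warning is issued.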

Subtraction with scalar


> c(12,13,15,17) - 10
# Output: -------

Multiplication with scalar


> c(2,3,5,7) * 10
# Output: -------

Division with scalar

> c(12,13,15,17) / 10
# Output: -------

2. Working with Vectors:


# Create a vector
x <- c(2, 4, 6, 8, 10)

# Calculate mean and standard deviation


mean_x <- mean(x)   # equivalent to mean_x = mean(x)

sd_x <- sd(x)

mean_x
# Output: -------

sd_x
# Output: -------
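
As a check on what sd() reports, the sample standard deviation can be rebuilt from its definition: the square root of the sum of squared deviations from the mean, divided by n - 1. The names `n`, `deviations` and `sd_manual` are introduced only for this sketch.

```r
x <- c(2, 4, 6, 8, 10)

n <- length(x)
deviations <- x - mean(x)                       # distance of each value from the mean
sd_manual <- sqrt(sum(deviations^2) / (n - 1))  # sample standard deviation

# all.equal() compares numbers while allowing for floating-point rounding
print(isTRUE(all.equal(sd_manual, sd(x))))
```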

3. Plotting of data sets


We will start working in a script file.
Go to R.
Open a new script file.
Save it in a desired location.
Start typing the commands in the new script file.
Commands in the script file can be run using Ctrl+R.

Creating table from data set

1. x=c("yes","no","no","yes","yes")   # Gives a vector of characters

table(x)   # Makes a table out of the given data set
# Output: -------

2. This can be done for numerical data also

y=c(1,1,2,2,4,5,6,7,8,9,8,9,5,5,6,7,8,11,12,10,11,10,10)   # Gives a vector of numerals

table(y)   # Makes a table out of the given data set

# Output: -------
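
Once a table has been built, its counts can be looked up by name and turned into proportions. The following is a small sketch using the yes/no vector from above; `counts` and `proportions` are names chosen here for illustration.

```r
x <- c("yes", "no", "no", "yes", "yes")
counts <- table(x)

# A table behaves like a named vector of counts.
print(names(counts))       # the distinct values, in sorted order
print(counts[["yes"]])     # how many times "yes" occurs

# Dividing by the total gives relative frequencies.
proportions <- counts / sum(counts)
print(proportions[["no"]])
```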

Creating and Plotting Vectors:


# Create a vector
data <- c(10, 15, 7, 22, 9)
# Plotting histogram
hist(data, main="Histogram of Data", xlab="Values", col="skyblue")
# Output: -------

Plotting Examples:

# Scatter plot
x <- c(1, 2, 3, 4, 5)
y <- c(3, 5, 7, 9, 11)
plot(x, y, main="Scatter Plot", xlab="X-axis", ylab="Y-axis", col="blue")
# Output: -------

data <- c(10, 15, 7, 22, 9, 12, 18, 25, 14)


hist(data, main="Histogram of Data", xlab="Values", col="skyblue")
# Output: -------

# Boxplot
data <- c(10, 15, 7, 22, 9, 12, 18, 25, 14)
boxplot(data, main="Boxplot of Data", ylab="Values", col="green")
# Output: -------

# Scatter Plot
x <- c(1, 2, 3, 4, 5)
y <- c(3, 5, 7, 9, 11)

# Scatter plot
plot(x, y, main="Scatter Plot", xlab="X-axis", ylab="Y-axis", col="red")
# Output: -------

Power operations, Integer and Modulo divisions

> 2^3 # Command for power operator


# Output: -------

> 2**3 # Command for power operator


# Output: -------

> 2^0.5 # Command for power operator


# Output: -------

> 2**0.5 # Command for power operator


# Output: -------

> 2^-0.5 # Command for power operator


# Output: -------

Note: To compute a power, you can use either operator (^ or **). Both give the same
answer.

Power operator with scalar

> c(2,3,5,7)^2 # command: application to a vector

# Output: -------

Power operation with vector

> c(2,3,5,7)^c(2,3) # !!ATTENTION! Observe the operation


# Output: -------

> c(1,2,3,4,5,6)^c(2,3,4) # command: application to a vector with vector

# Output: -------

> c(2,3,5,7)^c(2,3,4) # gives a warning message

Warning message:
In c(2, 3, 5, 7)^c(2, 3, 4) :
longer object length is not a multiple of shorter object length

Integer division with scalar


Integer Division: Division in which the fractional part (remainder) is discarded
Operator: %/%

> 2 %/% 2
# Output: -------

> 5 %/% 2
# Output: -------

> 7 %/% 3
# Output: -------

> c(2,3,5,7) %/% 2


# Output: -------

> c(2,3,5,7) %/% c(2,3)


# Output: -------

> c(2,3,5) %/% c(2,3)


# Output: -------

Warning message:
In c(2, 3, 5)%/%c(2, 3) :
longer object length is not a multiple of shorter object length

Modulo Division (x mod y) with scalars

Modulo Division: The modulo operation finds the remainder after division of one number
by another.

Operator: %%

> 2 %% 2
# Output: -------

> 3 %% 2
# Output: -------

> 7 %% 3
# Output: -------

> 7 %% 4
# Output: -------

> c(2,3,5,7) %% 2
# Output: -------

> c(2,3,5,7) %% c(2,3)
# Output: -------

> c(2,3,5) %% c(2,3)


# Output: -------

Warning message:
In c(2, 3, 5)%%c(2, 3) :
longer object length is not a multiple of shorter object length
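
Integer division and modulo division fit together: the quotient times the divisor, plus the remainder, gives back the original number. The sketch below checks this for one of the vectors used above; the names `quotient`, `remainder`, `rebuilt` and `is_even` are illustrative.

```r
x <- c(2, 3, 5, 7)
y <- 2

quotient  <- x %/% y   # integer division: whole-number part of x / y
remainder <- x %% y    # modulo: what is left over

# Every x can be rebuilt from its quotient and remainder.
rebuilt <- quotient * y + remainder
print(identical(rebuilt, x))

# A common use of %%: a value is even exactly when x %% 2 is 0.
is_even <- x %% 2 == 0
print(is_even)
```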

Maximum
Function: max

> max(1.2, 3.4, -7.8)


# Output: -------

> max( c(1.2, 3.4, -7.8) )


# Output: -------

Minimum
Function: min

> min(1.2, 3.4, -7.8)
# Output: -------

> min( c(1.2, 3.4, -7.8) )


# Output: -------

Arithmetic mean
Function: mean

> mean(2, 3, 4)
# Output: 2

> mean(c(2, 3, 4))


# Output: 3

Note: The two commands above give different results. The difference lies in how the
`mean()` function interprets its arguments:

Case 1: mean(2, 3, 4)
i. Here, you are passing multiple arguments directly to the `mean()` function.
ii. However, the `mean()` function does **not** take multiple individual numbers as
data. It expects a single vector (e.g., c(2, 3, 4)).
iii. Only the first argument, 2, is taken as the input vector. The other values (3, 4) are
matched to other parameters of `mean()` (trim and na.rm) and are not treated as data. As a
result, the mean of just 2 is computed, which is `2`.

Case 2: mean(c(2, 3, 4))


i. Here, the numbers 2, 3, 4 are combined into a single vector using c(2, 3, 4).
ii. The mean() function correctly calculates the mean of all the elements in the vector:
$\frac{2 + 3 + 4}{3} = 3$

Key Difference:
i. mean(2, 3, 4): Treats only the first argument as data, so the mean of `2` is returned.
ii. mean(c(2, 3, 4)): Combines the numbers into a vector and computes the mean
correctly.

Correct Usage:
i. Always use a vector for the `mean()` function if you want to compute the mean of
multiple values, like this: mean(c(2, 3, 4))
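
The two cases can be placed side by side in a short script. This is only a sketch; `wrong`, `right` and `values` are names introduced here, and the fourth value 11 is made up for illustration.

```r
# Only the first argument is data; 3 and 4 are matched to other parameters.
wrong <- mean(2, 3, 4)

# Wrapping the numbers in c() passes them as one vector.
right <- mean(c(2, 3, 4))

print(wrong)
print(right)

# The same vector form works however many values there are.
values <- c(2, 3, 4, 11)
print(sum(values) / length(values) == mean(values))
```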

scan() function

The scan() function in R reads data either interactively from the console or from an
external file. In the console, you can use scan() to type in numbers, characters, or other
types of data directly; an empty line ends the input.

> y=c(1,1,2,2,4,5,6,7,8,9,8,9,5,5,6,7,8,11,12,10,11,10,10)
> table(y)
y
 1  2  4  5  6  7  8  9 10 11 12
 2  2  1  3  2  2  3  2  3  2  1
> y=scan()
1: 1 2 4
4:
Read 3 items
> print(y)
[1] 1 2 4
> table(y)
y
1 2 4
1 1 1
> y=scan()
1: 1 2 3 4 5
6:
Read 5 items
> print(y)
[1] 1 2 3 4 5
> table(y)
y
1 2 3 4 5
1 1 1 1 1

Read Data from a File:

scan() can be used to read data from an external text file.


Example:
x <- scan("data.txt")
Here, data.txt is a text file that contains numeric data (to read character data, also
pass what = ""). Values can be on separate lines or separated by spaces.
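
To try this without an existing data.txt, a temporary file can be written first and then read back. The following is a sketch under that assumption; the file contents and the name `path` are made up for illustration.

```r
# Create a small throwaway file so the example does not depend on data.txt.
path <- tempfile(fileext = ".txt")
writeLines(c("10 15 7", "22", "9"), path)

# Values may share a line or sit on separate lines; scan() reads them all.
x <- scan(path, quiet = TRUE)   # quiet = TRUE suppresses the "Read N items" message
print(x)
print(mean(x))

unlink(path)   # remove the temporary file
```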

Read Character Data:


You can use the what argument to specify the type of data you want to input.
For example, what = "" tells scan() to read character data instead of numeric data.
> scan(what = "")
1: alice bob charlie
4:
Read 3 items
[1] "alice" "bob" "charlie"
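
Inside a script, where nothing can be typed at the console, the same input can be supplied through scan()'s text argument. A minimal sketch follows; `names_read` and `numbers_read` are illustrative names.

```r
# text = supplies the input as a string instead of reading from the console.
names_read <- scan(text = "alice bob charlie", what = "", quiet = TRUE)
print(names_read)

# Without what = "", scan() expects numbers.
numbers_read <- scan(text = "1 2 4", quiet = TRUE)
print(numbers_read)
```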

Common Applications

i. Reading small datasets interactively for quick analysis.
ii. Testing scripts or models with manually entered data.
iii. Quickly importing numeric or character data from simple text files.

Limitations
i. scan() is better suited for simple and unstructured data.
ii. For complex datasets, use read.table() or read.csv() instead, as they handle
structured tabular data more efficiently.
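
For comparison, here is a sketch of read.csv() on a small table. The table is supplied inline through the text argument so that the example runs on its own; the column names and values are made up.

```r
# A small comma-separated table, supplied inline via text = for illustration.
csv_text <- "name,score
alice,90
bob,85
charlie,78"

scores <- read.csv(text = csv_text)

# Each column keeps its own name and type, which scan() does not provide.
str(scores)
print(mean(scores$score))
```

With a real file, the file name would be passed as the first argument in place of text.
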
Note:

1. barplot(x) plots the bar graph of the data x
2. pie(x) plots the pie chart of the data set
3. pie(table(x)) plots the pie chart from the table
4. hist(x) plots the histogram of the data set x
5. boxplot(x) gives the box plot of the data set x
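
For categorical data such as the yes/no vector used earlier, barplot() and pie() need the counts from table() rather than the raw character values. The sketch below assumes a null graphics device (pdf(NULL)) so that no plot file is written; in an interactive session that line and dev.off() can be dropped.

```r
x <- c("yes", "no", "no", "yes", "yes")
counts <- table(x)

pdf(NULL)   # assumption: draw to a null device so no plot file is written

# For categorical data, plot the counts from table() rather than the raw values.
bar_positions <- barplot(counts, main = "Bar graph of x", col = "skyblue")
pie(counts, main = "Pie chart of x")

dev.off()
```
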

The boxplot is used to summarize data succinctly, quickly displaying whether the data are
symmetric or have suspected outliers. It is based on the 5-number summary. In its simplest
usage, the boxplot has a box with lines at the lower hinge (basically Q1), the median, the
upper hinge (basically Q3), and whiskers which extend to the min and max. To showcase
possible outliers, a convention is adopted to limit the whiskers to a length of 1.5 times
the box length. Any values beyond that are plotted as individual points. These may further
be marked differently if they are more than 3 box lengths away. Thus boxplots allow us
to check quickly for lack of symmetry (the shape looks unbalanced) and outliers (data points
beyond the whiskers).
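
The 1.5 box-length convention can be worked through by hand and compared with what R reports. This is a sketch, not the lab's own example: it reuses the data vector from the plotting section with one made-up extreme value (60) appended, and the names `five`, `box_length`, `upper_limit`, `lower_limit` and `flagged` are illustrative.

```r
data <- c(10, 15, 7, 22, 9, 12, 18, 25, 14, 60)   # 60 is a made-up extreme value

five <- fivenum(data)          # min, lower hinge, median, upper hinge, max
box_length <- five[4] - five[2]

# Whiskers may reach at most 1.5 box lengths beyond the hinges.
upper_limit <- five[4] + 1.5 * box_length
lower_limit <- five[2] - 1.5 * box_length
flagged <- data[data > upper_limit | data < lower_limit]
print(flagged)

# boxplot.stats() applies the same convention and reports the points in $out.
print(boxplot.stats(data)$out)
```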
