Lab 1- Basic functions in R and plotting
# Addition
5+3
# Output: -------
# Subtraction
10 - 4
# Output: -------
# Multiplication
6*2
# Output: -------
# Division
15 / 3
# Output: -------
mean_x
# Output: -------
sd_x
# Output: -------
# Output: -------
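The vector behind mean_x and sd_x is not shown above. As a minimal sketch, assuming a hypothetical numeric vector x (the values below are made up for illustration and are not the lab's data), the two summaries could be computed with the base functions mean() and sd():

```r
# Hypothetical data; the lab's actual vector is not shown in the handout
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

mean_x <- mean(x)  # arithmetic mean
sd_x <- sd(x)      # sample standard deviation (divides by n - 1)

mean_x
sd_x
```

Note that sd() gives the sample standard deviation, so it will differ slightly from a formula that divides by n.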
# Boxplot
data <- c(10, 15, 7, 22, 9, 12, 18, 25, 14)
boxplot(data, main="Boxplot of Data", ylab="Values", col="green")
# Output: -------
# Scatter Plot
x <- c(1, 2, 3, 4, 5)
y <- c(3, 5, 7, 9, 11)
# Draw y against x
plot(x, y, main="Scatter Plot", xlab="X-axis", ylab="Y-axis", col="red")
# Output: -------
Note: To compute a power, you can use either operator (^ or **). Both give the
same answer.
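A quick check of the note above, using example values chosen here for illustration:

```r
# Both operators raise the left operand to the power on the right
a <- 2^3
b <- 2**3   # R treats ** as another spelling of ^

a
b
identical(a, b)

# With a vector on the left, the scalar exponent applies to every element
squares <- c(2, 3, 5, 7)^2
squares
```

Most R code uses ^, so that form is usually the safer habit.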
Power operator with scalar
# Output: -------
# Output: -------
Warning message:
longer object length is not a multiple of
shorter object length in: c(2,3,5,7)^c(2,3,4)
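The warning above comes from R's recycling rule: when two vectors differ in length, the shorter one is repeated until it matches the longer one, and R warns if the longer length is not a whole multiple of the shorter. A small sketch of what happens in that call (the names base_vals, expo, and recycled are introduced here only for illustration):

```r
base_vals <- c(2, 3, 5, 7)
expo <- c(2, 3, 4)

# R recycles expo to c(2, 3, 4, 2) and warns because 4 is not a multiple of 3
result <- suppressWarnings(base_vals^expo)

# The same computation with the recycling written out by hand (no warning)
recycled <- rep(expo, length.out = length(base_vals))
manual <- base_vals^recycled

result
recycled
isTRUE(all.equal(result, manual))
```

The warning does not stop the calculation; it only signals that the lengths probably do not match what was intended.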
> 2 %/% 2
# Output: -------
> 5 %/% 2
# Output: -------
> 7 %/% 3
# Output: -------
Warning message:
In c(2, 3, 5)%/%c(2, 3) :
longer object length is not a multiple of shorter object length
Modulo Division: The modulo operation finds the remainder after dividing one number
by another.
Operator: %%
> 2 %% 2
# Output: -------
> 3 %% 2
# Output: -------
> 7 %% 3
# Output: -------
> 7 %% 4
# Output: -------
> c(2,3,5,7) %% 2
# Output: -------
> c(2,3,5) %% c(2,3)
# Output: -------
Warning message:
In c(2, 3, 5)%%c(2, 3) :
longer object length is not a multiple of shorter object length
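Integer division (%/%) and modulo (%%) are two halves of the same calculation: the quotient and the remainder. A short sketch, with variable names chosen here for illustration, that checks the relationship x = (x %/% y) * y + (x %% y) and shows the recycling behind the warning above:

```r
x <- c(2, 3, 5, 7)
y <- 2

quotient <- x %/% y    # whole-number part of each division
remainder <- x %% y    # what is left over

quotient
remainder

# Quotient and remainder together reconstruct the original values
rebuilt <- quotient * y + remainder
all(rebuilt == x)

# Lengths 3 and 2: c(2, 3) is recycled to c(2, 3, 2), which triggers the warning
q_recycled <- suppressWarnings(c(2, 3, 5) %/% c(2, 3))
r_recycled <- suppressWarnings(c(2, 3, 5) %% c(2, 3))
q_recycled
r_recycled
```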
Maximum
Function: max
Minimum
Function: min
> min(1.2, 3.4, -7.8)
# Output: -------
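The same three values can also be passed to max(), and range() returns both extremes at once. A brief sketch (the name v is chosen here for illustration); unlike mean(), both max() and min() accept either several separate numbers or a single vector:

```r
largest <- max(1.2, 3.4, -7.8)    # largest of the three values
smallest <- min(1.2, 3.4, -7.8)   # smallest of the three values

largest
smallest

v <- c(1.2, 3.4, -7.8)
max(v)      # same result when the values are in one vector
range(v)    # smallest and largest together
```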
Arithmetic mean
Function: mean
> mean(2, 3, 4)
# Output: 2
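Note: mean(2, 3, 4) and mean(c(2, 3, 4)) give different results. The difference lies
in how the `mean()` function interprets its arguments:
Case 1: mean(2, 3, 4)
i. Here, you are passing several separate arguments directly to the `mean()` function.
ii. However, the `mean()` function does **not** treat separate numbers as data. It
expects the data as a single vector (e.g., c(2, 3, 4)).
iii. The first argument, 2, is taken as the input vector. The other two values (3, 4) are
not treated as data; they are matched by position to the trim and na.rm arguments.
As a result, the mean of just 2 is computed, which is `2`.
Case 2: mean(c(2, 3, 4))
i. Here, c() first combines the three numbers into a single vector.
ii. The `mean()` function receives that vector as its data, so it returns
(2 + 3 + 4) / 3 = `3`.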
Key Difference:
i. mean(2, 3, 4): Treats only the first argument as data, so the mean of `2` is returned.
ii. mean(c(2, 3, 4)): Combines the numbers into a vector and computes the mean
correctly.
Correct Usage:
i. Always use a vector for the `mean()` function if you want to compute the mean of
multiple values, like this: mean(c(2, 3, 4))
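The two calls can be compared side by side. In this sketch the variable names are chosen for illustration:

```r
wrong <- mean(2, 3, 4)      # only 2 is used as data
right <- mean(c(2, 3, 4))   # all three values are averaged

wrong
right

# Storing the values in a vector first avoids the mistake
values <- c(2, 3, 4)
mean(values)
```

R raises no error or warning for the first call, which is why this mistake is easy to miss.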
scan() function
The scan() function in R reads data either interactively from the console or from an
external file. At the console, you can use scan() to type in numbers directly; pressing
Enter on an empty line ends the input. Other types of data, such as characters, can be
read by setting the what argument (for example, what = character()).
> y=c(1,1,2,2,4,5,6,7,8,9,8,9,5,5,6,7,8,11,12,10,11,10,10)
> table(y)
y
 1  2  4  5  6  7  8  9 10 11 12
 2  2  1  3  2  2  3  2  3  2  1
> y=scan()
1: 1 2 4
4:
Read 3 items
> print(y)
[1] 1 2 4
> table(y)
y
1 2 4
1 1 1
> y=scan()
1: 1 2 3 4 5
6:
Read 5 items
> print(y)
[1] 1 2 3 4 5
> table(y)
y
1 2 3 4 5
1 1 1 1 1
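For a reproducible version of the interactive sessions above, scan() also accepts a text argument, so the typed values can be supplied as a string instead of at the prompt. A minimal sketch (quiet = TRUE only suppresses the "Read n items" message; the character example is an addition for illustration):

```r
# Same values as typed at the 1: prompt in the first session
y <- scan(text = "1 2 4", quiet = TRUE)
y

# Frequency of each distinct value
counts <- table(y)
counts

# Character data needs what = character()
words <- scan(text = "red green blue", what = character(), quiet = TRUE)
words
```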
Common Applications
Limitations
i. scan() is better suited for simple and unstructured data.
ii. For complex datasets, use read.table() or read.csv() instead, as they handle
structured tabular data more efficiently.
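To illustrate the second point, here is a small sketch that reads a two-column table with read.csv(). The column names and values are invented for the example, and the text argument stands in for a real file path:

```r
# Hypothetical CSV content; in practice this would come from a file
csv_text <- "name,score
A,10
B,15
C,7"

scores <- read.csv(text = csv_text)

scores               # a data frame with two named columns
nrow(scores)         # number of rows read
mean(scores$score)   # columns can be used by name
```

With scan(), the same input would come back as one flat vector, and the link between each name and its score would have to be rebuilt by hand.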
Note:
The boxplot summarizes data succinctly, quickly showing whether the data are
symmetric or contain suspected outliers. It is based on the five-number summary. In its
simplest usage, the boxplot has a box with lines at the lower hinge (basically Q1), the
median, and the upper hinge (basically Q3), and whiskers which extend to the minimum
and maximum. To showcase possible outliers, a convention is adopted to limit each
whisker to at most 1.5 times the box length. Any data beyond that are plotted as individual
points. These may further be marked differently if they lie more than 3 box lengths away.
Thus the boxplot allows us to check quickly for asymmetry (the shape looks unbalanced)
and outliers (data points beyond the whiskers).
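The quantities described in this note can be inspected directly. The sketch below reuses the data vector from the boxplot example and adds one extreme value (60, made up for illustration) to show how a suspected outlier is flagged. fivenum() and boxplot.stats() are base R functions; the names data_out and stats_out are introduced here only for the example:

```r
data <- c(10, 15, 7, 22, 9, 12, 18, 25, 14)

# Five-number summary: minimum, lower hinge, median, upper hinge, maximum
fivenum(data)

# No value lies more than 1.5 box lengths beyond the hinges
boxplot.stats(data)$out

# Add a made-up extreme value and look again
data_out <- c(data, 60)
stats_out <- boxplot.stats(data_out)
stats_out$stats   # whisker ends and hinges; the upper whisker stops at 25
stats_out$out     # 60 is reported as a suspected outlier
```

Calling boxplot(data_out) would draw the same information, with 60 shown as a separate point above the upper whisker.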