Control Flow - Looping
Looping in R
1. for loop
The for loop is used to iterate over a sequence (like a vector or list) and execute a block of code for
each element in the sequence.
for (variable in sequence) {
  # code to be executed
}
• variable: A variable that takes the next value from the sequence in each iteration.
for (i in 1:5) {
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
number <- 5
factorial <- 1
for (i in 1:number) {
  factorial <- factorial * i
}
print(paste("The factorial of", number, "is", factorial))
Here, a for loop is used to calculate the sum of the elements of a vector.
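The sum calculation described above can be sketched as follows. The vector `values` and the variable `total` are hypothetical names chosen for illustration; the original listing is not present in this copy.

```r
# Hypothetical input vector, chosen only for illustration
values <- c(4, 8, 15, 16, 23, 42)

total <- 0
for (v in values) {
  total <- total + v   # add the current element to the running total
}

print(total)
```

In practice the built-in sum(values) gives the same result; the loop is shown only to illustrate how the running total is accumulated.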
Example 6: Calculating mean and standard deviation for all numerical variables
Means:
print(means)
cat("\nStandard Deviations:\n")
Standard Deviations:
print(sds)
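Only the printing lines of Example 6 survive in this copy. The following is a minimal sketch of how the two vectors might be built with a for loop. It assumes mtcars as the dataset (the source does not name it here), and `col_name` and `column` are illustrative names.

```r
data <- mtcars   # assumption: the dataset is not named in the surviving lines

means <- numeric()
sds <- numeric()

for (col_name in names(data)) {
  column <- data[[col_name]]
  if (is.numeric(column)) {
    # store each statistic under the column's name
    means[col_name] <- mean(column, na.rm = TRUE)
    sds[col_name] <- sd(column, na.rm = TRUE)
  }
}

cat("Means:\n")
print(means)
cat("\nStandard Deviations:\n")
print(sds)
```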
2. Nested loop
Imagine you’re handling multiple datasets, and within each dataset, you have several variables. If you
need to compute specific statistics for each variable across all datasets, a nested loop becomes an
efficient solution. The outer loop can iterate over datasets, while the inner loop handles each variable
within the current dataset.
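The outer/inner structure described above might look like the sketch below. The two small data frames are hypothetical stand-ins for real datasets, and `results` and `key` are illustrative names.

```r
# Two small hypothetical data frames standing in for "multiple datasets"
datasets_list <- list(
  first  = data.frame(x = c(1, 2, 3), y = c(10, 20, 30)),
  second = data.frame(a = c(2, 4, 6, 8))
)

results <- list()
for (dataset_name in names(datasets_list)) {     # outer loop: one pass per dataset
  current <- datasets_list[[dataset_name]]
  for (col_name in names(current)) {             # inner loop: one pass per variable
    key <- paste(dataset_name, col_name, sep = ".")
    results[[key]] <- mean(current[[col_name]], na.rm = TRUE)
  }
}

print(results)
```

The inner loop runs to completion for every iteration of the outer loop, so the total number of iterations is the sum of the column counts across all datasets.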
https://sta334.s3.ap-southeast-1.amazonaws.com/Loop+statement/Looping.html
10/26/23, 4:36 PM Control Flow - Looping
Example 7: Finding mean, sd and median for mtcars and iris dataset
if (measure == "mean") {
  value <- mean(datasets_list[[dataset_name]][[col_name]], na.rm = TRUE)
} else if (measure == "sd") {
  value <- sd(datasets_list[[dataset_name]][[col_name]], na.rm = TRUE)
} else if (measure == "median") {
  value <- median(datasets_list[[dataset_name]][[col_name]], na.rm = TRUE)
}
Column: mpg
--------------------------------------------------
mean : 20.09
sd : 6.03
median : 19.2
Column: disp
--------------------------------------------------
mean : 230.72
sd : 123.94
median : 196.3
Column: hp
--------------------------------------------------
mean : 146.69
sd : 68.56
median : 123
Column: drat
--------------------------------------------------
mean : 3.6
sd : 0.53
median : 3.7
Column: wt
--------------------------------------------------
mean : 3.22
sd : 0.98
median : 3.33
Column: qsec
--------------------------------------------------
mean : 17.85
sd : 1.79
median : 17.71
Column: Sepal.Length
--------------------------------------------------
mean : 5.84
sd : 0.83
median : 5.8
Column: Sepal.Width
--------------------------------------------------
mean : 3.06
sd : 0.44
median : 3
Column: Petal.Length
--------------------------------------------------
mean : 3.76
sd : 1.77
median : 4.35
Column: Petal.Width
--------------------------------------------------
mean : 1.2
sd : 0.76
median : 1.3
3. while loop
The while loop in R is used to execute a block of code repeatedly as long as a specified condition is
TRUE. It is particularly useful when the number of iterations is not known beforehand.
while (condition) {
  # code to be executed
}
condition: A logical expression that is evaluated before the execution of the loop’s body. The loop
runs as long as the condition is TRUE.
i <- 1
while (i <= 5) {
  print(i)
  i <- i + 1
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
i <- 1
column_names <- names(data)
col_num <- 1
row_num <- 1
col_num <- 1
row_num <- 1
missing_data_positions <- list()
col_num <- 1
missing_counts <- numeric()
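The lines above are the opening assignments of several while-loop examples whose bodies are missing from this copy. As a minimal sketch of the kind of loop they appear to set up, the following counts missing values per column with a while loop. It assumes `data` is the airquality dataset, which the surviving lines do not confirm.

```r
data <- airquality   # assumption: `data` is not defined in the surviving lines
column_names <- names(data)

col_num <- 1
missing_counts <- numeric()

while (col_num <= ncol(data)) {
  # count the NA entries in the current column and store them by column name
  missing_counts[column_names[col_num]] <- sum(is.na(data[[col_num]]))
  col_num <- col_num + 1   # advance the counter, otherwise the loop never ends
}

print(missing_counts)
```

Unlike a for loop, the while loop does not advance the counter on its own, so the increment at the end of the body is what guarantees termination.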
4. repeat loop
The repeat loop in R is used to execute a block of code indefinitely until a break statement is
encountered. It is useful in situations where the number of iterations is not known beforehand, and the
loop should continue until a specific condition is met.
repeat {
  # code to be executed
  if (condition) {
    break
  }
}
condition: A logical expression. If TRUE, the break statement is executed, and the loop is terminated.
set.seed(123456)
repeat {
  number <- runif(1)  # Generate a random number between 0 and 1
  print(number)
  if (number > 0.9) {
    break
  }
}
[1] 0.7977843
[1] 0.7535651
[1] 0.3912557
[1] 0.3415567
[1] 0.3612941
[1] 0.1983447
[1] 0.534858
[1] 0.09652624
[1] 0.9878469
Example 15: Calculating mean for each numeric variable in airquality dataset
col_num <- 1
means <- numeric()
repeat {
  # Check if all columns have been processed
  if (col_num > ncol(data)) {
    break
  }
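The listing above is cut off in this copy. A minimal sketch of how such a repeat loop might be completed is shown below. It assumes `data` is airquality, as the example title suggests; the loop body after the break check is not from the original.

```r
data <- airquality   # assumption: `data` is not defined in the surviving lines
col_num <- 1
means <- numeric()

repeat {
  # stop once every column has been processed
  if (col_num > ncol(data)) {
    break
  }
  # mean of the current column, ignoring missing values
  means[names(data)[col_num]] <- mean(data[[col_num]], na.rm = TRUE)
  col_num <- col_num + 1
}

print(round(means, 2))
```

Placing the break check at the top of the body means the loop also behaves correctly for a data frame with zero columns.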
Example 16: Calculating mean for each numeric variable in mtcars dataset
col_num <- 1
means <- numeric()
repeat {
  # Check if all columns have been processed
  if (col_num > ncol(data)) {
    break
  }
  # Since all columns in mtcars are numeric, we can directly calculate the mean
  mean_val <- mean(data[, col_num], na.rm = TRUE)
Example 17: Calculating mean and standard deviation for each numeric variable in
mtcars dataset
repeat {
  # Check if all columns have been processed
  if (col_num > ncol(data)) {
    break
  }
col_num <- 1
missing_counts <- numeric()
repeat {
  # Check if all columns have been processed
  if (col_num > ncol(data)) {
    break
  }
5. next statement
In R, the next statement is used within loop structures to skip the current iteration and proceed to the
next iteration of the loop. It is useful for bypassing specific conditions within a loop without exiting
the entire loop.
condition: A logical expression. If TRUE, the next statement is executed, and the current iteration is
skipped.
for (i in 1:10) {
  if (i %% 2 == 0) {
    next
  }
  print(i)
}
[1] 1
[1] 3
[1] 5
[1] 7
[1] 9
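The same skipping pattern can be combined with a repeat loop over data frame columns. The sketch below is an illustration rather than one of the original examples: it assumes the iris dataset so that one column (Species, a factor) is actually skipped, and `column` and `col_name` are hypothetical names.

```r
data <- iris   # assumption: chosen because it contains one non-numeric column
col_num <- 1
means <- numeric()

repeat {
  # stop once every column has been processed
  if (col_num > ncol(data)) {
    break
  }
  column <- data[[col_num]]
  col_name <- names(data)[col_num]
  col_num <- col_num + 1   # advance BEFORE any next, or the loop would never end

  if (!is.numeric(column)) {
    next                   # skip non-numeric columns such as Species
  }
  means[col_name] <- mean(column, na.rm = TRUE)
}

print(means)
```

In a repeat or while loop, next jumps straight back to the top of the body, so the counter has to be incremented before next is reached. A for loop advances on its own and does not have this pitfall.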
Example 20: Calculate mean for all numerical variables in airquality dataset
col_num <- 1
means <- numeric()
repeat {
  # Check if all columns have been processed
  if (col_num > ncol(data)) {
    break
  }
Example 21: Calculate mean for all numerical variables in mtcars dataset
col_num <- 1
means <- numeric()
repeat {
  # Check if all columns have been processed
  if (col_num > ncol(data)) {
    break
  }
Example 22: Calculate mean and standard deviation for all numerical variables in
mtcars dataset
col_num <- 1
means <- numeric()
repeat {
  # Check if all columns have been processed
  if (col_num > ncol(data)) {
    break
  }
col_num <- 1
missing_counts <- numeric()
repeat {
  # Check if all columns have been processed
  if (col_num > ncol(data)) {
    break
  }
Exercise
b. With a while loop, find the range (minimum and maximum values) of each numeric variable in the
iris dataset.
c. Employ a repeat loop to count the number of unique species in the iris dataset.
a. Convert the datasets::state.x77 matrix into a dataframe and introduce missing values.
b. Using a for loop, detect columns that have missing values and report the count.
c. Implement a while loop to replace missing values in the dataframe with the mean of their
respective columns.
d. Utilize a repeat loop to compute the standard deviation for each numeric column in the dataframe,
and use the next statement to skip over columns that have more than 10 missing values.
b. Implement a while loop to normalize each numeric column in the trees dataset (subtract mean
and divide by standard deviation).
c. Using a repeat loop, count the number of rows in the trees dataset where the Volume exceeds 1.5.
Terminate the loop once you’ve scanned all rows.
b. Utilize a while loop to compute the median Assault rate across states.
c. Implement a repeat loop to find the average UrbanPop value. If the average exceeds 65, break out
of the loop and print a message indicating high urban population.
d. In a loop of your choice, iterate over each column and compute the sum. Use the next statement to
skip over the Rape column.