Introduction To R: General Lines
Kevyn Stefanelli
General lines
This guide shows how to use the software R for the “Financial Econometrics with R” course.
The R commands appear in the grey areas of the file, while the related explanations and useful notes appear
outside them. A separate R script containing only the code with comments is also available. You can
copy-paste the commands from the R script to the R console, or type them directly there.
Copy-pasting is fast, but typing is the best way to learn.
Important: Copy-pasting from Word or PDF documents sometimes produces errors in R. Pay attention
when you copy quotation marks (", ”), dashes (-), and other special characters. If an error occurs, paste the
code in R and retype these characters.
The guide is divided into sections, each covering a specific topic together with the related exercises.
The exercises have different levels of difficulty: (∗) “easy”, (∗∗) “intermediate”, (∗∗∗) “advanced”.
In order to pass the exam you must be able to complete all the (∗) “easy” exercises. The (∗∗) “intermediate”
and the (∗∗∗) “advanced” exercises will progressively improve your final grade.
You must bring your personal computer to class to do live exercises.
The exam consists of two different tests:
1. Short Multiple Choice exam (30 mins): you will have 10 questions to be completed in 30 minutes.
Your questions are chosen randomly from a set of questions of the same difficulty. There will be a time
window open for 24 hours during which you can start and complete the test. This test accounts for
40% of the final grade for the Introduction to R course. You will be asked to upload the script you
used to make calculations.
2. Long Multiple Choice exam (4 hours): you will be provided with a dataset and a list of questions
to be completed. Your questions are chosen randomly from a set of questions of the same difficulty.
Again, you will be asked to provide the entire R script used to answer these questions. This test
accounts for 60% of the final grade of the Introduction to R evaluation.
Further details on exam dates and procedures will be provided soon. In both tests you can use all the material
provided, but you must complete them on your own.
How to download R
You can download R from the Comprehensive R Archive Network (CRAN) following this link:
https://cran.r-project.org/
You can find a video tutorial on how to download R on the Blackboard page of the course. Alternatively,
you can follow the instructions on the CRAN page for your specific operating system (Windows, macOS, or
Linux).
[Screenshots of the CRAN download pages for macOS, Windows, and Linux are not reproduced here.]
The download will start automatically.
Introduction to R
R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It
is open source software, which means that it is completely free to use and is developed, updated, and
optimized by the R community. Once we run R we will see a new window with the symbol “>”, which means
that R is ready to receive instructions.
Basic commands
R is a calculator:
# R basic operations
6+4
3-2
2*4
9/2
1.2^2
However, it is better to work in an R script than directly in the R console. A script is a text file where
you can save your code and then analyze, re-execute, or modify it later. To open a new script in R:
• Windows: File –> New Script
• Mac: File –> New Rd Document/New Document
Once you have written one or more lines of code in an R script, you can execute them in the console as follows:
1. Select the line(s).
2. Press
• on Windows: CTRL + R
• on Mac: cmd + Enter
You can also add lines that will not be executed by R. You just need to add “#” at the beginning
of the line(s). These lines are usually called “comments”.
ADVICE FOR YOUR CAREER (and for any other course you will take): always write as many comments as
possible in your code. They will be extremely useful when you use the code again in the future and
have forgotten why/how you wrote it!
When you finish working with R you can save your script on your computer to open it the next time.
R Objects
R is an object-oriented language: we can store “things” (values) in objects. For example, we
can assign the value 5 to the letter “x” and, from then on, when we type “x” R returns 5. All these objects
(or variables) are contained in a space called the Global Environment.
In a variable we can store different types of values. Among others:
1. numeric (integer or double): 1, 2, 3, 4, 0.5, -2.3;
2. character (strings): "Kevyn", "Rome", "Nice";
3. boolean (logical): TRUE (or just T), FALSE (or just F).
How do we define (assign a value to) a variable?
# assign to x the value 2 (to comment the script start a line with #)
x = 2
# assign to y the value -1.4
y <- -1.4
# alternatively
4 -> z
# print (visualize) them
x
y
z
# define a as a character variable
a='Hey! How are you?'
a
We can also remove some (or all) of them. To remove objects, the function rm() is available:
# remove x
rm("x")
# remove everything
rm(list = ls())
Vectors in R
We can define a vector of values and store it into a variable. A vector is just a collection of values stored
in the same object. Vectors in R are introduced by “c()” and their elements are separated by a comma as
follows:
# define a vector containing five numbers
myfirstvector = c(0,2.4,1.1,-1.3,-0.6)
myfirstvector
# define a vector containing three characters
ID = c("Kevyn", "Stefanelli","Rome")
ID
Note: If a vector contains at least one character element, the whole vector is converted to the character type.
To build a sequence with a given step (the distance between two consecutive values of the sequence) we can
use the function seq(), which works as follows:
seq(): It requires three inputs:
1. from: the starting point.
2. to: the ending point.
3. (optional) “by” to determine the step, or “length.out” to determine how many (equispaced) points we
want between the starting and the ending point.
Note: for the first two inputs there is no need to write “from” and “to” (R matches them by position), while
for the third input (or parameter) we must write “by” or “length.out” explicitly.
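The two ways of calling seq() described above can be sketched as follows. This is a minimal illustration; the names s1 and s2 and the chosen end points are assumptions, not part of the original guide.

```r
# sequence from 0 to 2 with step 0.5 ("from" and "to" are matched by position)
s1 = seq(0, 2, by = 0.5)
s1
# five equispaced points between 0 and 1
s2 = seq(0, 1, length.out = 5)
s2
```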
We can define a vector by replicating a value/string n times using the function rep():
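A short sketch of rep() follows. The object names r1, r2, r3 and the replicated values are chosen here only for illustration.

```r
# replicate the number 3 four times
r1 = rep(3, times = 4)
r1
# replicate a string twice
r2 = rep("R", times = 2)
r2
# replicate a whole vector three times
r3 = rep(c(1, 2), times = 3)
r3
```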
We can fill a vector with random numbers using several functions, including runif(). This function creates a
vector of n random numbers drawn from a Uniform distribution on the interval [a, b], with
a ≤ b. Under the Uniform distribution every number in this interval is equally likely to be
drawn. The syntax of the function is the following:
runif(n, min = a, max = b)
Example: define a random vector of 10 elements between 0 and 100 from a Uniform distribution.
# define a random vector of 10 elements between 0 and 100 from a Uniform distribution
randv = runif(10, min = 0, max = 100)
Note: the default values of min and max in runif() are 0 and 1. Hence, if we specify only n, this function
returns a vector of values uniformly drawn between 0 and 1.
# define a random vector of 100 elements between 0 and 1 from a Uniform distribution
randv = runif(100)
IMPORTANT: runif(), like any other simulating function, returns new values every time you execute it, because
it provides random numbers. To obtain the same values as in a previous execution of the code,
you can fix the seed of the random number generator with the function set.seed(), which requires an
integer as input (e.g., set.seed(1234)).
For example:
# first randv:
randv = runif(10)
randv
# second randv:
randv = runif(10)
randv
Conversely, using set.seed():
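A minimal sketch of the effect of set.seed(): resetting the same seed before each draw reproduces the same numbers. The seed 1234 is the one suggested in the text; the names randv1 and randv2 are assumptions.

```r
# first draw with the seed fixed
set.seed(1234)
randv1 = runif(10)
# second draw after resetting the same seed
set.seed(1234)
randv2 = runif(10)
# the two vectors coincide
identical(randv1, randv2)
```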
To draw from a Normal distribution we can use the function rnorm(n, mean, sd), which returns n random
values from a Normal distribution with mean equal to mean and standard deviation (√variance) equal to sd.
For example:
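The original example is not reproduced in this copy of the guide. A minimal sketch, assuming we want 5 draws from a Normal distribution with mean 10 and standard deviation 2 (values and the name normv chosen here for illustration), could be:

```r
# fix the seed so that the draw is reproducible
set.seed(1234)
# 5 random values from a Normal distribution with mean 10 and sd 2
normv = rnorm(5, mean = 10, sd = 2)
normv
```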
We can round our results to a fixed number of decimals (e.g., one, two, three, . . . ) or define (a vector of)
integer numbers (rounding to the nearest unit) using the function round() as follows:
round(x, k)
which rounds the object x (a number, a vector, . . . ) to k decimal places. If we do not specify k, it rounds x to
the closest integer (k = 0 by default).
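As a quick illustration of round(), with a vector x whose values are chosen here only as an example:

```r
x = c(3.14159, 2.71828, 1.7)
# round to the second decimal
x2 = round(x, 2)
x2
# round to the closest integer (k = 0 by default)
x0 = round(x)
x0
```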
R built-in functions
R contains a large number of native (built-in) functions. They are also objects, which contain one (or more)
actions. In other words, we can use them to make calculations, draw plots, or define particular objects. They
are characterized by a name followed by round brackets “()”. Inside the brackets we insert the inputs (the
objects to be processed) and the function returns an output (the result of the process).
We have already seen a couple of them:
1. c(): create a vector. Inputs are the vector elements.
2. seq(): generate regular sequences.
3. rep(): replicate values.
4. . . .
Once you have installed R, you have a lot of built-in functions ready to use. For example:
• min(x): returns the minimum of the input values x.
• max(x): returns the maximum of the input values x.
• mean(x): returns the arithmetic mean of the input values x.
• median(x): returns the median of the input values x.
• sd(x): returns the standard deviation of the input values x.
• var(x): computes the variance (covariance) of the input vector (matrix/vectors) x.
• abs(x): computes the absolute value of the input values x.
• exp(x): computes the exponential of the input values x ($e^x$).
• log(x): computes the natural logarithm of the input values x.
• sqrt(x): computes the square root of the input values x.
• sort(x): sorts the vector x in ascending/descending order, or in alphabetical order (for characters);
the default is ascending.
R has an online manual which contains all the information about these and other functions. To consult
the manual we can type:
• ?function_name: returns the help page of the function, with the sections Description, Arguments,
Details, See Also, and Examples.
• ??word: returns all the manual pages in which "word" appears.
Now, define a vector of 10 observations and try these functions:
# define a vector
vector10 = c(0,-2,-1,-5,3,1,-8,2,9,4)
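A possible way to try some of the functions listed above on vector10 is sketched below. The selection of functions and the name sorted10 are choices made here, not prescribed by the guide.

```r
vector10 = c(0,-2,-1,-5,3,1,-8,2,9,4)
min(vector10)      # smallest value
max(vector10)      # largest value
mean(vector10)     # arithmetic mean
median(vector10)   # median
sd(vector10)       # sample standard deviation
abs(vector10)      # absolute values
# sort in ascending order (default) and in descending order
sorted10 = sort(vector10)
sorted10
sort(vector10, decreasing = TRUE)
```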
R Packages
R is (again) open source software. Hence everyone can write their own functions (including us, as we will
see later) and publish them in the R repository (the large online warehouse of functions from which
we download R). These functions are collected in packages (like warehouse boxes), each comprising a list of
functions often related to a particular topic (e.g. graphics, statistics, . . . ).
Every function we have used until now is contained in one of the pre-installed packages (e.g. base, utils, . . . ),
but when we do something more specific we have to download and install new packages. A package needs to be
installed only once: afterwards, we can use it by simply loading it in our R session, without installing it again.
We can do everything with just two lines of R code. Try to install and load MASS, a package containing a
large number of datasets to use:
# install the package MASS
install.packages("MASS")
# now MASS is on our computer.
# We have just to load (say to R that we need) MASS, because we are going to use it:
library("MASS")
# To see all the functions contained in a package:
help(package = "MASS")
# To see all the packages currently loaded in our R session:
search()
R "homemade" functions
We can write our own functions, which are algorithms (from the simplest to the most complex) that do
exactly what we ask them to do. Basically, we write a function that, given one value (or a sequence of values),
called input(s), returns one (or more) value(s), called output(s). The syntax is straightforward:
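The syntax itself is not reproduced in this copy of the guide. A minimal sketch of the general form of a function definition in R, using a hypothetical function rect_area introduced here only for illustration, is:

```r
# general form:
# function_name = function(input1, input2, ...) { actions }
# hypothetical example: area of a rectangle
rect_area = function(base, height) {
  area = base * height   # action performed on the inputs
  area                   # the last evaluated expression is the output
}
rect_area(2, 3)
```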
Illustrative Examples:
# define sum_of_2 as a function that given two numbers a and b returns their sum
sum_of_2 = function(a,b) {a+b}
Note: Now, our function is stored in the Global Environment for further use with the name “sum_of_2”.
2. Write a function that calculates exp(x+4) for a user-specified x and evaluate it for the values 2 and -4.
(*)
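The definition of the function does not appear in this copy of the guide. One possible definition, assuming the name exp4 used in the evaluation lines, is:

```r
# define exp4 as a function that, given x, returns exp(x + 4)
exp4 = function(x) {
  exp(x + 4)
}
```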
# evaluate it
exp4(2)
exp4(-4)
3. The following equation represents the probability density function (pdf) of a Normal distribution:
$$f(x, m, s) = \frac{1}{\sqrt{2\pi s^2}} \exp\left(-\frac{1}{2}\left(\frac{x-m}{s}\right)^2\right),$$
where x, m, s are user-specified numbers (x numeric, m the sample mean and s the sample standard deviation).
Write a function called “fNormal” that calculates this pdf and evaluate it for x=2, m=-3, and
s=1.5. (**)
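The function definition is missing from this copy of the guide. A possible sketch that follows the formula above term by term is given below; the comparison with the built-in density dnorm() is an extra sanity check added here, not part of the exercise.

```r
# pdf of a Normal distribution with mean m and standard deviation s
fNormal = function(x, m, s) {
  1 / sqrt(2 * pi * s^2) * exp(-0.5 * ((x - m) / s)^2)
}
# sanity check against R's built-in Normal density
all.equal(fNormal(2, -3, 1.5), dnorm(2, mean = -3, sd = 1.5))
```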
fNormal(2,-3,1.5)
4. Given m = 2 and s = 1, create x as the vector of the integers from 1 to 100 and evaluate the
fNormal function defined in the previous step at each point of the vector x. (**)
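Since arithmetic in R is vectorized, the function can be applied to the whole vector at once. A sketch, repeating a possible definition of fNormal so that the block runs on its own (the name fx is an assumption):

```r
# possible definition of fNormal (Normal pdf with mean m and sd s)
fNormal = function(x, m, s) 1 / sqrt(2 * pi * s^2) * exp(-0.5 * ((x - m) / s)^2)
# the integers from 1 to 100
x = 1:100
# evaluate the pdf at every element of x in one call
fx = fNormal(x, m = 2, s = 1)
length(fx)
# position of the largest density value (the mean, m = 2)
which.max(fx)
```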
Note: If the body of the function contains only one command we can also omit the {}.
5. Focus on the built-in sd() function.
The built-in function sd() computes the sample standard deviation (based on the unbiased estimate of the
variance). How does it differ from the standard deviation of the population?
Table 1: Difference between sample and population SD. n and N are the size of the sample and of the
population, respectively, and x̄ represents the sample mean.
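The table itself is not reproduced in this copy. As a sketch, the two standard definitions it refers to can be written as follows, where the symbol $\mu$ for the population mean is an assumption introduced here (the caption only names n, N and x̄):

```latex
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2}
\quad\text{(sample SD)},
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i-\mu\right)^2}
\quad\text{(population SD)}.
```

When the same N observations are treated as the whole population, the two quantities differ only by the factor $\sqrt{(N-1)/N}$.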
So, if we need the population standard deviation, also called the biased SD, we cannot use the built-in
function directly. Let’s see our alternatives through the following exercise:
Then:
# define the vector w
w = c(1,-1,5,6,1,-6,8,9,1,3)
# define N as the length of w
N = length(w)
# compute the population SD by rescaling the sample SD:
sd_pop2 = sd(w)*(sqrt(N-1)/sqrt(N))
# see the difference with the result of sd()
sd(w)
sd_pop2
5. Write a function that, given the (unbiased) sample standard deviation as input, returns the (biased)
population standard deviation (as in Exercise 5 above). (**)
6. Define two vectors u and g, both of length 10, where u is the result of a draw from a Uniform distribution
on the interval [0,10], while g is the result of a draw from a Normal distribution with mean 5 and
standard deviation equal to 0.5. Then, use the function defined in the first exercise to compute
the difference between u and g. Set the random number generator seed equal to "092021". (***)
Matrices
In R we can define a matrix using the function matrix().
We can define a matrix in different ways.
Define (initialize) a variable A as a 3x2 matrix:
$$A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \end{pmatrix}$$
1st method: fill the matrix with the vector c(1, 2, 3, 4, 5, 6) and specify (at least) the number of rows or
columns of A. Once we specify the number of rows, R automatically knows the number of columns, and vice versa.
Finally, we have to specify how R has to fill the matrix (by rows or by columns).
Summarizing, we define in R a generic matrix M as:
M = matrix(data, nrow, ncol, byrow)
In the case of our matrix A we write:
A = matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2, byrow = TRUE)
So we choose as “data” a vector containing the sequence of numbers from 1 to 6, then we specify the
number of rows and columns (only one is strictly necessary), and finally the way in which we want to fill the
matrix, so we specify byrow = TRUE (or just T). Then:
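The definition of A described above can be sketched as follows. The second matrix, A_bycol, is an extra comparison added here to show what happens with the default filling by column.

```r
# define A, filling the matrix by row
A = matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2, byrow = TRUE)
A
# same data filled by column (byrow = FALSE is the default)
A_bycol = matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2)
A_bycol
```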
Let’s create B, a 3x3 square matrix containing the values from 1 to 9, filled by column.
Note: by default “byrow” is set to “FALSE”, so we need to specify it only if we want to fill the matrix by
row:
# define a matrix B containing the integers from 1 to 9
B = matrix(data=1:9,nrow=3)
B
# it works fine without specifying the number of columns and the way to fill the matrix
2nd method: Matrices are just a collection of (row/column) vectors. Hence we can define a matrix by binding
(by row or by column) previously created vectors. In order to do that, we use the functions rbind() (to bind
vectors by row) and cbind() (to bind vectors by column). They require as inputs just the collection of vectors
of the same length to bind together.
Example:
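Let v1 = (2, 0, 1), v2 = (1, 0, 3), v3 = (−1, −1, −1) be column vectors. We want to define C and D as
$$C = \begin{pmatrix} 2 & 1 & -1 \\ 0 & 0 & -1 \\ 1 & 3 & -1 \end{pmatrix}, \qquad D = \begin{pmatrix} 2 & 0 & 1 \\ 1 & 0 & 3 \\ -1 & -1 & -1 \end{pmatrix}$$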
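# define the three vectors
v1 = c(2,0,1)
v2 = c(1,0,3)
v3 = c(-1,-1,-1)
# bind them by column (C) and by row (D)
C = cbind(v1,v2,v3)
D = rbind(v1,v2,v3)
C
D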
Element position in vectors and matrices
We can select/extract an element from a vector/matrix by indicating its position. To select an element of an
object we need to use square brackets, “[ ]”. In particular:
• Vectors have one dimension, so to extract the element in position k from the vector v:
v[k]
• Matrices have two dimensions (rows and columns), so we need to specify the row(s) and column(s)
numbers in the square brackets ("[ ]"), separated by a comma. To extract the element in row i and
column j from the matrix M:
M[i, j]
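A short sketch of these rules, using the vector w and the matrix E that the guide defines in its own listing (the positions extracted are chosen here only as examples):

```r
w = c(1,-1,7,2,0)
E = matrix(c(1,0,4,5,2,1,-1,0,-1,-2,3,-1,2,0,-3,0), nrow = 4)
# third element of w
w[3]
# element in row 2 and column 3 of E
E[2, 3]
# leaving a position empty selects everything: first row, second column
E[1, ]
E[, 2]
```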
# define w and E
w = c(1,-1,7,2,0)
E = matrix(c(1,0,4,5,2,1,-1,0,-1,-2,3,-1,2,0,-3,0), nrow = 4)
# define G as a new matrix equal to E without its third column
G = E[,-3]
G
# define H as a new matrix equal to E without its first and fourth rows
H = E[-c(1,4),]
H
There are two other useful tools to work with the dimension/position of the elements of vectors and matrices:
• length(x): returns the length of the object (vector) x.
• dim(x): returns the dimension of the object (matrix, and other) x.
Example:
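The original example is missing from this copy. A minimal sketch using the objects w and E defined earlier in this section (the names len_w and dim_E are assumptions):

```r
w = c(1,-1,7,2,0)
E = matrix(c(1,0,4,5,2,1,-1,0,-1,-2,3,-1,2,0,-3,0), nrow = 4)
# number of elements of the vector w
len_w = length(w)
len_w
# number of rows and columns of the matrix E
dim_E = dim(E)
dim_E
```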
Basic matrix calculus with R
In R we can do math operations (sum, difference, product, etc.) between a matrix and a scalar simply as
follows. Let A be a matrix (or a vector; a column vector is a matrix with 1 column) and b a scalar.
Operation                 R Command
Sum/Difference: A ± b     A+b, A-b
Product: A · b            A*b
Division: A / b           A/b
Given two matrices A and B, the following table reports the most common mathematical matrix operations.
Operation                 R Command
Sum: A + B                A+B
Product: A · B            A%*%B
Determinant: det(A)       det(A)
Inverse: A^{-1}           solve(A)
Transpose: A'             t(A)
Main diagonal: diag(A)    diag(A)
Example
Compute:
1. A + A
2. A · B
3. det(B)
4. A^{-1}
5. A'
# define the three objects A, B and c
A = matrix(c(rep(1,3), 0:2,c(1,-1,1)),nrow=3,byrow = T)
B = matrix(1:9, nrow=3,byrow = F)
c=3
A;B;c
# 1
A+A
# 2
A%*%B
# 3
det(B)
# 4
solve(A)
# 5
t(A)
5. A · B'
6. (A − B) · c
7. diag(B) · c
8. A[2, 3] · B[3, 3]
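One possible way to compute these four quantities is sketched below, reusing the definitions of A, B and c from the listing above. The result names r5 to r8 are assumptions; note that naming a scalar c does not prevent calling the function c().

```r
A = matrix(c(rep(1,3), 0:2, c(1,-1,1)), nrow = 3, byrow = TRUE)
B = matrix(1:9, nrow = 3, byrow = FALSE)
c = 3
# 5: matrix product between A and the transpose of B
r5 = A %*% t(B)
# 6: element-wise difference, then product by the scalar c
r6 = (A - B) * c
# 7: main diagonal of B times c
r7 = diag(B) * c
# 8: product of two single elements
r8 = A[2, 3] * B[3, 3]
r5; r6; r7; r8
```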
Conditions in R
The comparison between two or more objects is important in every programming language. The following
table summarizes the main mathematical comparison operators and how to express them in R.
Condition    R
<            <
>            >
≤            <=
≥            >=
=            ==
≠            !=
and          &
or           |
Note: The output of a comparison between two (or more) objects is a variable of type boolean (i.e.,
TRUE/FALSE).
Exercise:
Write the R code to answer the following questions about the matrix A:
1. Is the element in position [1,1] greater than that in position [4,1]? (*)
2. Is the element in position [3,2] different from that in position [2,2]? (*)
3. Is the element in position [1,2] equal to that in position [4,2] and that in position [4,3]? (**)
4. Is the element in position [3,1] lower than that in position [3,3] or than that in position [4,2]? (**)
5. Is the sum of the first row of A lower than that of the second column of A? (***)
6. Is the minimum of the last column of A greater than or equal to the maximum of the last row of A?
(***)
# define A
A = matrix(c(1,0,4,5,-1,-3,4,3,2,1,-6,2,0,0,2,4), ncol=4, byrow = T)
# 1
A[1,1] > A[4,1]
# 2
A[3,2] != A[2,2]
# 3
A[1,2] == A[4,2] & A[1,2] == A[4,3]
# 4
A[3,1] < A[3,3] | A[3,1] < A[4,2]
# 5
sum(A[1,]) < sum(A[,2])
# 6
min(A[,ncol(A)]) >= max(A[nrow(A),])
Dataframes
A dataframe is a “special” matrix which can contain more information and has several desirable
features. Each column of a dataframe (or dataset) has a name and represents a variable, i.e. an observed
feature of the statistical units (which are reported in rows).
We can create our first dataframe starting from a list of vectors. We select a class of 20 people coming from
different countries of the world. For each of them we collect name, age, height, nationality, gender,
and the final grade in the math exam.
# define a vector of names
names = c("Andrew","Anna","Alice","Antony",
"Barbara", "Brian","Boris","Barney",
"Claudia","Cliff","Cecilia","Clara",
"David","Dora","Denise","Donatello",
"Emma","Elise","Esteban","Elon")
# define a vector of ages
ages = c(20,22,27,25,18,22,26,21,19,24,
27,23,22,19,23,28,22,24,25,19)
# define a vector of heights
heights = c(180,170,155,175,150,197,178,182,183,170,
175,178,170,160,175,194,180,165,172,183)
# define a vector of nationalities
nationalities = c("France","Scotland","Italy","Poland",
"France","India","UK","Poland",
"Italy","Scotland","UK","France",
"Mexico","USA","France","Germany",
"USA","France","Spain","Poland")
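The step that combines the vectors into a dataframe is not reproduced in this copy. A minimal sketch with the built-in function data.frame() is given below; to keep it short it uses only the first four students and only the four variables listed above (the gender and grades vectors are not available here). The dataframe name class is the one used later in the guide.

```r
# first four students only, taken from the vectors defined above
names = c("Andrew","Anna","Alice","Antony")
ages = c(20,22,27,25)
heights = c(180,170,155,175)
nationalities = c("France","Scotland","Italy","Poland")
# bind the vectors as columns of a dataframe: one row per student
class = data.frame(names, ages, heights, nationalities)
class
# structure of the dataframe: variable names and types
str(class)
```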
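We obtain the dataset class (the printed table is not reproduced here).
We can extract a variable from a dataframe in two equivalent ways:
1. dataframe_name$variable_name
2. dataframe_name[,'variable_name']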
For example:
# print the names of the units
class$names
# or
class[,'names']
We can add a variable to our dataset by calling the dataset and defining a variable inside it as follows:
We want to add the variable HStudy, which indicates the number of hours spent studying in the last week by
our students. So:
Note: the new variable must have the same number of elements as the other variables in the dataset.
We can also add new variables as transformations of pre-existing variables in the dataset. For example, we
can add a new variable called “MinStudy” which expresses the variable HStudy (currently measured in hours)
in minutes. In formula:
MinStudy = HStudy · 60
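The HStudy values used in the guide are not available in this copy, so the sketch below uses a hypothetical three-student dataframe, class_demo, with invented hours, only to show the mechanics of adding and transforming a variable.

```r
# hypothetical mini dataframe (the hours are invented for illustration)
class_demo = data.frame(names = c("Andrew", "Anna", "Alice"))
# add a new variable: one value per row of the dataframe
class_demo$HStudy = c(10, 5, 8)
# add a transformation of an existing variable: hours -> minutes
class_demo$MinStudy = class_demo$HStudy * 60
class_demo
```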
We can also export the dataset (or a matrix) we have created to a ‘.csv’ file (comma separated values) to use
it in Excel.
# export our dataframe
write.csv(class, file="class.csv")
Solutions:
3. The median grades and 4. the highest and the lowest grades.
Again, we know how to compute median, maximum and minimum. Then:
Note: to compute the relative frequencies we divide by the total number of observations, i.e. nrow(class):
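As a sketch of this computation, the block below works directly on the nationalities vector defined earlier, so the total number of observations is length(nationalities), which plays the role of nrow(class). The names abs_freq and rel_freq are assumptions.

```r
nationalities = c("France","Scotland","Italy","Poland",
                  "France","India","UK","Poland",
                  "Italy","Scotland","UK","France",
                  "Mexico","USA","France","Germany",
                  "USA","France","Spain","Poland")
# absolute frequencies: number of students per country
abs_freq = table(nationalities)
abs_freq
# relative frequencies: divide by the total number of observations
rel_freq = abs_freq / length(nationalities)
rel_freq
```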
5. Define two subsamples separating females and males. Which group has the highest mean grade?
We can split the dataset into two subsets according to a specific condition. In this case, each unit will belong
to the new “female” or “male” datasets according to the variable “gender”. To express a condition we refer
to those operators presented in Table 4. In particular, for each dataset we select only the rows where the
expressed condition is TRUE.
# Female = 1, Male = 0.
female = class[class$gender==1,]
male = class[class$gender==0,]
In words:
We select all the rows (the first position in the square brackets, [], before the “,”) where the condition (female:
class$gender==1, and male: class$gender==0) is TRUE, and then we select all the columns (because we do
not specify one or more variables after the “,” and before the closing “]”).
Now we have two different datasets and we can compute their average grades separately as follows:
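The gender and grades variables of the guide are not available in this copy, so the sketch below uses a hypothetical four-row dataframe, class_demo, with invented values, only to show how the two means are computed.

```r
# hypothetical data (invented for illustration): Female = 1, Male = 0
class_demo = data.frame(gender = c(1, 0, 1, 0),
                        grades = c(28, 25, 30, 27))
# split the rows according to the condition on gender
female = class_demo[class_demo$gender == 1, ]
male = class_demo[class_demo$gender == 0, ]
# average grade in each subset
mean_f = mean(female$grades)
mean_m = mean(male$grades)
mean_f; mean_m
```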
Note:
Sometimes we need matrices and not dataframes (e.g., for matrix calculus). To transform a dataframe (or a
subset of it) into a matrix: as.matrix(dataframe_name).
# 7. Transform the last three columns of the dataset class into a matrix called Grades
Grades = as.matrix(class[,(ncol(class)-2):ncol(class)])
Take Home Exercises
Write the R code to answer the following questions.
1. Which is the most represented country in the class? (*)
2. Do the mean and the median of the variable "grades" coincide? (*)
3. What is the proportion of females in the class? (*)
4. Is the median age of the class over 24? (*)
5. Where does Emma come from? (**)
6. Is David older than Brian? (**)
7. Are the Polish students taller than the Scottish students, on average? (***)
8. Which is the name of the tallest person in the class? (***)
Import data
We can import vectors or matrices from external sources. We identify three main types of
files (data vectors/dataframes):
Example
# Import data in R
# Alternatively, we can see where R is working (in which working directory (WD))
getwd()
# and then set the working directory where the dataset is contained
setwd("C:\\Users\\...\\WD") # Windows
setwd("/Users/.../WD") # Mac
# and then just read the file without specifying the WD
data = read.csv('dataname.csv')
R for graphical representations
R is a powerful tool for graphical representations. It contains a large list of (customizable) functions to
represent our data. We can see its potential with the following demo:
demo(graphics)
Scatter plots
plot(): This function provides a scatterplot of our data. It takes as input two vectors of coordinates (x and
y; when only one vector is supplied, it is plotted against its index) and it is fully customizable according to
our preferences. For example, by specifying the extra (i.e. optional) input "type" we can decide how to draw
our scatterplot. In particular:
We can also customize our plot by adding a title through the extra input main. For other graphical parameters
and plenty of examples type ?plot in the R console.
• cex: the dimension of the points (0.5x, 1x, 1.2x, 2x, ...).
• main: a string that indicates the title of the plot.
• xlab/ylab: specify the x and the y-labels.
• col: the color of points.
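The parameters listed above can be combined in a single call. A sketch using the ages and heights vectors defined in the Dataframes section follows; the title, axis labels and color are choices made here for illustration.

```r
ages = c(20,22,27,25,18,22,26,21,19,24,
         27,23,22,19,23,28,22,24,25,19)
heights = c(180,170,155,175,150,197,178,182,183,170,
            175,178,170,160,175,194,180,165,172,183)
# scatterplot of heights against ages, drawn with points (type = "p")
plot(ages, heights,
     type = "p",             # "p" = points, "l" = lines, "b" = both
     main = "Age vs height", # title of the plot
     xlab = "Age",           # x-axis label
     ylab = "Height",        # y-axis label
     col = "blue",           # color of the points
     cex = 1.2)              # points 1.2 times the default size
```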
We can represent frequency distributions using either a bar plot (barplot()) or a pie chart (pie()) as follows:
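The original listing is not reproduced in this copy. A possible sketch, based on the frequency table of the nationalities vector defined in the Dataframes section (titles and color are assumptions):

```r
nationalities = c("France","Scotland","Italy","Poland",
                  "France","India","UK","Poland",
                  "Italy","Scotland","UK","France",
                  "Mexico","USA","France","Germany",
                  "USA","France","Spain","Poland")
# frequency distribution of the nationalities
freq = table(nationalities)
# bar plot: one bar per country (las = 2 rotates the axis labels)
barplot(freq, main = "Nationalities in the class", col = "lightblue", las = 2)
# pie chart of the same distribution
pie(freq, main = "Nationalities in the class")
```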
Note: You can choose the colors you prefer. In this case we have 10 nationalities, hence you can specify
a character vector of 10 colors. Furthermore, there are predefined color palettes. To see all the colors
available in R: http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf
Histograms
We use a histogram when we represent continuous variables. Let’s represent the distribution of heights in
our dataset class:
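The original listing is missing here. A minimal sketch with the built-in function hist(), applied to the heights vector that forms the heights column of class (title, label and color are assumptions):

```r
heights = c(180,170,155,175,150,197,178,182,183,170,
            175,178,170,160,175,194,180,165,172,183)
# histogram of the heights; hist() also returns the bins and counts it used
h = hist(heights,
         main = "Distribution of heights",
         xlab = "Height",
         col = "lightgreen")
# number of observations falling in each bin
h$counts
```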
Take Home Exercises
Load the dataset “iris”, which is already present in R, and answer the following questions:
1. Represent the distribution of the length of the flowers' petals ("Petal.Length") through a histogram.
2. Draw a pie chart of the variable "Species".
3. Make a scatter plot which shows the relation between Petal.Length and Petal.Width.
Feel free to provide basic plots (∗) or to customize them (∗∗)/(∗∗∗).
Control Flow Statements
We can ask R to return a result according to particular conditions, or to launch an iterative process (repeat a
set of actions) specifying an exit rule (when to stop). For this type of procedure we need conditional
statements and loops.
Conditional Statements
We can ask R to do something only if a condition is satisfied, and also to do something else if the
same condition is not met. The syntax is straightforward:
• if (condition) do something
• if (condition) do something else do something else
• if (condition1) do something else if (condition2) do something different else do something else
(multiple conditions; you can add as many conditions as you want).
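The definition of the function even_odd used in the example is not reproduced in this copy. One possible definition is sketched below; the returned strings "even" and "odd" are an assumption.

```r
# return "even" if n is divisible by 2, "odd" otherwise
even_odd = function(n) {
  if (n %% 2 == 0) {
    "even"
  } else {
    "odd"
  }
}
```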
Examples:
# conditional statements on R
even_odd(3); even_odd(4)
# %% returns the remainder of the division of a number by a divisor (here 2).
Take Home Exercises
1. Your vending machine
Define a catalogue of four goods (e.g. “Coffee”, “Cappuccino”, “Tea”, and “Hot Water”) and assign them
the relative prices (e.g. 1, 1.50, 0.80, 0.05). Using the conditional statements, write a function that given a
string of character containing one of the four goods returns its price. If the input is different from those four
products the function returns "Sorry, product not available.".
Try to order a Coffee, a Tea and a Ginseng.
(Remember: R is case sensitive: a ≠ A)
2. Saving when I do shopping
The following table contains a list of grocery shops which sell tomatoes, potatoes, zucchini, and eggplants,
along with their prices per kg.
Loops
Loops allow us to repeat the same set of actions/procedures n times. We can specify how many times R
must repeat this set of actions according to:
[Figure: flowchart of a loop. START: set i = 1, do something, then check whether i < n. If YES, set
i = i + 1 and repeat; if NO, END. Here i is the counter and n is the number of iterations / exit condition.]
for
The syntax
Note: Again, we do not need {} when we ask R for only one action.
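The syntax is not reproduced in this copy. A minimal sketch of a for loop, with and without {}, using a hypothetical vector squares introduced here for illustration:

```r
# general form: for (counter in sequence) { actions }
# one action only: {} can be omitted
for (i in 1:5) print(i)
# several actions: {} are required
squares = numeric(5)        # vector of five zeros to be filled
for (i in 1:5) {
  squares[i] = i^2          # store the square of the counter
  print(squares[i])
}
squares
```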
Example
Ex1: The grocery shop at the end of the street is interested in evaluating the number of watermelons sold in
the last week. The owner records these quantities in a vector called “wm” as follows:
In particular, he is satisfied when he manages to sell at least 5 watermelons in a day. Write a function in R
that returns the total number of days on which the shop sold at least 5 watermelons.
Solution:
Our strategy consists in counting how many elements of wm are not lower than 5.
step1: define an empty vector s that records the successes:
$$s[i] = \begin{cases} 1, & \text{if } wm[i] \geq 5 \\ 0, & \text{if } wm[i] < 5 \end{cases}$$
with i = {1, 2, . . . , 7} the index of the for loop. Then, we sum the elements of the new vector s and return
the number of days on which the shop sold at least 5 watermelons.
wm_affair = function(vector){
s = c() # define an empty vector, REQUIRED if we want to fill it in a for loop
n = length(vector) # define n as the length of the input vector
# for loop
for (i in 1:n) {
if (vector[i]>=5) s[i] = 1 else s[i]=0
}
# the for loop ends here, when i reaches n. Then, s is filled.
result = sum(s)
result
}
While
Syntax
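The syntax is not reproduced in this copy. A minimal sketch of a while loop, which repeats the actions as long as the condition is TRUE (the counter i and the limit 5 are chosen here for illustration):

```r
# general form: while (condition) { actions }
i = 1
while (i <= 5) {
  print(i)     # action to repeat
  i = i + 1    # update the counter, otherwise the loop never stops
}
# the loop stops as soon as the condition becomes FALSE
i
```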
Examples
Ex1: The owner of the grocery shop continues tracking the sales of watermelons, but the summer is nearly
over. His wm vector now contains 30 days of observations:
His objective is the same: to see on how many days he sold at least 5 watermelons. But this time he wants to
consider only the first 20 days, because they are the only reliable ones for his analysis. Write a code in R to
see how many times he manages to sell at least 5 watermelons in the first 20 days.
Solution
# Write a code in R to see how many times he manages to sell
# at least 5 watermelons in the first 20 days.
wm_tot = c(10,5,4,6,8,11,3,1,5,8,1,5,7,4,9,7,2,3,9,6,2,1,2,3,4,1,0,0,2,1)
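The rest of the solution is not reproduced in this copy. One possible way to complete it with a while loop is sketched below; the names count and i are assumptions, and the last line is an equivalent vectorized check added here.

```r
wm_tot = c(10,5,4,6,8,11,3,1,5,8,1,5,7,4,9,7,2,3,9,6,2,1,2,3,4,1,0,0,2,1)
count = 0   # number of days with at least 5 watermelons sold
i = 1       # day counter
# the loop stops after the 20th day
while (i <= 20) {
  if (wm_tot[i] >= 5) count = count + 1
  i = i + 1
}
count
# equivalent check without a loop
sum(wm_tot[1:20] >= 5)
```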
$$\text{FlowerType}_i = \begin{cases} \text{Type1}, & \text{if } \text{Sepal.Length}_i \geq 5.4 \\ \text{Type2}, & \text{otherwise} \end{cases}$$