R Tutorial
To start R on Windows, go to the directory "R\R3.2.2\bin\i386" under the Windows Program Files folder and run Rgui.exe.
Double-clicking the R icon brings up the R GUI, which provides the R console used for R programming.
myString <- "Hello, World!"
print(myString)
Here the first statement defines a string variable myString and assigns it the string "Hello, World!".
The next statement uses print() to print the value stored in the variable myString.
Save the above code in a file named test.R and execute it at the Linux command prompt as shown below.
The syntax remains the same on Windows and other systems.
$ Rscript test.R
The frequently used R objects are:
Vectors
Lists
Matrices
Arrays
Factors
Data Frames
The simplest of these objects is the vector object and there are six data types of these atomic
vectors, also termed as six classes of vectors. The other R-Objects are built upon the atomic
vectors.
Variables
Introduction
A variable provides named storage that our programs can manipulate. A variable in R
can store an atomic vector, a group of atomic vectors or a combination of many R objects. A valid
variable name consists of letters, numbers and the dot or underscore characters. The variable
name must start with a letter, or with a dot that is not followed by a number.
Example: var_name% is invalid because it contains the character '%'. Only dot (.) and underscore (_) are
allowed.
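The naming rule above can be checked programmatically. The sketch below relies on base R's make.names(), which rewrites a string into a syntactically valid name; a string that comes back unchanged was already valid. The helper name is_valid_name and the candidate names are hypothetical, introduced here only for illustration.

```r
# Hypothetical helper: a name is valid if make.names() leaves it unchanged.
is_valid_name <- function(x) make.names(x) == x

# Illustrative candidates (assumed, not from the original text).
candidates <- c("var_name2.", "var_name%", "2var_name",
                ".var_name", ".2var_name", "_var_name")

validity <- is_valid_name(candidates)
names(validity) <- candidates
print(validity)
```

Only "var_name2." and ".var_name" pass: the others contain '%', start with a digit or an underscore, or start with a dot followed by a digit.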
Multiple Assignment
> name <- "Carmen"; n1 <- 10; n2 <- 100; m <- 0.5
Displaying Contents of a Variable
The value of a variable can be displayed simply by typing its name at the command
prompt.
~ From E. Paradis
> n <- 15
> n
[1] 15
> 5 -> n
> n
[1] 5
> x <- 1
> X <- 10
> x
[1] 1
> X
[1] 10
> n <- 10 + 2
> n
[1] 12
> n <- 3 + rnorm(1)
> n
[1] 2.208807
The function ls.str displays details about the objects in memory. Its option pattern can be used in the same way as with ls.
Another useful option of ls.str is max.level, which specifies the level of detail for the display of composite objects.
By default, ls.str displays the details of all objects in memory, including the columns of data frames, matrices and lists,
which can result in a very long display. We can avoid displaying all these details with the option max.level = -1:
> M <- data.frame(n1, n2, m)
> ls.str(pat = "M")
M : 'data.frame': 1 obs. of 3 variables:
$ n1: num 10
$ n2: num 100
$ m : num 0.5
> ls.str(pat="M", max.level=-1)
M : 'data.frame': 1 obs. of 3 variables:
To list all the variables currently available in the workspace, use the ls() function.
print(ls())
Note: the output depends on which variables are declared in your environment.
The ls() function can use patterns to match the variable names.
print(ls(pattern = "var"))
Variables whose names start with a dot (.) are hidden. They can be listed by passing the argument
"all.names = TRUE" to the ls() function.
print(ls(all.names = TRUE))
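As a minimal sketch of the three forms of ls() described above, the example below works in a fresh environment so that the result does not depend on your workspace. The variable names var_x, var_y, other and .hidden are assumptions made up for the illustration.

```r
# A fresh environment keeps the example independent of the workspace.
env <- new.env()
assign("var_x", 1, envir = env)
assign("var_y", 2, envir = env)
assign("other", 3, envir = env)
assign(".hidden", 4, envir = env)   # hidden: the name starts with a dot

visible_names  <- ls(envir = env)                    # hidden names are skipped
matching_names <- ls(envir = env, pattern = "var")   # names containing "var"
all_names      <- ls(envir = env, all.names = TRUE)  # includes ".hidden"

print(visible_names)
print(matching_names)
print(all_names)
```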
Online Help
The online help of R gives very useful information on how to use the functions.
Help is available directly for a given function, for instance:
> ?lm
will display, within R, the help page for the function lm() (linear model).
The commands help(lm) and help("lm") have the same effect.
R Operators
Arithmetic Operators
Relational Operators
Logical Operators
Assignment Operators
Miscellaneous Operators
Later
Data Objects of R
Introduction
R works with objects which are, of course, characterized by their names and their
content, but also by attributes which specify the kind of data represented by an
object.
All objects have two intrinsic attributes: mode and length.
There are four main modes: numeric, character, complex, and logical (FALSE or
TRUE).
Other modes exist but they do not represent data, for instance function or
expression.
> x <- 1
> mode(x)
[1] "numeric"
> length(x)
[1] 1
> A <- "Gomphotherium"; compar <- TRUE; z <- 1i
> mode(A); mode(compar); mode(z)
[1] "character"
[1] "logical"
[1] "complex"
Missing Data
Whatever the mode, missing data are represented by NA (not available).
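A minimal sketch of how NA behaves in practice, using made-up values: is.na() locates missing entries, and summary functions such as mean() return NA unless told to drop them with na.rm = TRUE.

```r
# A numeric vector with one missing observation (illustrative values).
heights <- c(1.62, NA, 1.80)

is_missing <- is.na(heights)                 # which entries are missing
mean_all   <- mean(heights)                  # NA: one value is unknown
mean_known <- mean(heights, na.rm = TRUE)    # mean of the known values only

# NA can appear in vectors of other modes as well.
tags <- c("a", NA)

print(is_missing)
print(mean_all)
print(mean_known)
print(mode(tags))
```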
Exponential Notation
A very large numeric value can be specified with an exponential notation:
> N <- 2.1e23
> N
[1] 2.1e+23
Non-finite Numbers
R correctly represents non-finite numeric values, such as plus or minus infinity with Inf and -Inf, or
values which are not numbers with NaN (not a number).
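The following sketch shows how these values arise in ordinary arithmetic and how they can be detected with the base functions is.finite() and is.nan(). The variable names are chosen only for the illustration.

```r
pos_inf <- 5 / 0            # division by zero gives Inf
neg_inf <- -5 / 0           # and -Inf for a negative numerator
not_num <- pos_inf - pos_inf  # Inf - Inf is undefined, so the result is NaN

values <- c(1, pos_inf, neg_inf, not_num)
print(values)
print(is.finite(values))    # only the ordinary number 1 is finite
print(is.nan(values))       # only the last element is NaN
```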
The following table gives an overview of the type of objects representing data.
Vector
To create a vector with more than one element, use the c() function, which
combines the elements into a vector.
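A short sketch of c() with illustrative values (the variable name apple is an assumption):

```r
# Combine three character values into one vector.
apple <- c("red", "green", "yellow")

print(apple)
print(class(apple))    # the class of the vector
print(length(apple))   # the number of elements
```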
Lists
A list is an R-object which can contain elements of many different types, such as vectors,
functions and even other lists.
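As a minimal sketch, the list below holds a numeric vector, a single number, a function and a nested list. The contents are illustrative assumptions.

```r
# A list mixing a vector, a number, a function and another list.
list1 <- list(c(2, 5, 3), 21.3, sin, list("nested", TRUE))

print(length(list1))           # number of top-level elements
print(class(list1[[1]]))       # class of the first element
print(list1[[3]](0))           # call the stored function sin at 0
print(list1[[4]][[1]])         # reach into the nested list
```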
Matrices
A matrix is a two-dimensional rectangular data set. It can be created using a vector input to the
matrix function.
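A small sketch of the matrix function with made-up input: the vector is laid out in 2 rows and 3 columns, filled row by row because byrow = TRUE.

```r
# Build a 2 x 3 character matrix from a vector, filling by rows.
M <- matrix(c("a", "a", "b", "c", "b", "a"), nrow = 2, ncol = 3, byrow = TRUE)

print(M)
print(dim(M))     # number of rows and columns
print(M[2, 1])    # element in row 2, column 1
```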
Arrays
While matrices are confined to two dimensions, arrays can have any number of dimensions. The
array function takes a dim argument which specifies the size of each dimension. For example,
an array created with dim = c(3, 3, 2) holds two 3x3 matrices.
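A minimal sketch of such an array, with illustrative values that are recycled to fill all 18 cells:

```r
# Two 3 x 3 matrices stacked along a third dimension.
a <- array(c("green", "yellow"), dim = c(3, 3, 2))

print(dim(a))        # 3 rows, 3 columns, 2 matrices
print(a[, , 1])      # the first 3 x 3 matrix
print(a[2, 1, 2])    # row 2, column 1 of the second matrix
```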
Factors
Factors are R-objects which are created from a vector. A factor stores the
vector along with the distinct values of its elements as labels (levels).
The labels are always character, irrespective of whether the input vector is
numeric, character, Boolean, etc. Factors are useful in statistical
modeling.
Factors are created using the factor() function. The nlevels() function gives
the count of levels.
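As a sketch with made-up data, the factor below is built from a character vector of colours; levels() and nlevels() report the distinct values.

```r
# Illustrative input vector with repeated values.
apple_colors <- c("green", "green", "yellow", "red", "red", "red", "green")

factor_apple <- factor(apple_colors)

print(factor_apple)
print(levels(factor_apple))    # the distinct values, stored as character labels
print(nlevels(factor_apple))   # the count of levels
```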
Data Frames
Data frames are tabular data objects. Unlike in a matrix, each
column of a data frame can contain a different mode of data. The first column can be
numeric while the second column can be character and the third column can be
logical. A data frame is a list of vectors of equal length.
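A minimal sketch of a data frame with columns of different modes. The column names and values are assumptions for illustration; stringsAsFactors = FALSE is set explicitly because the default differs between R versions.

```r
# Each column is a vector of length 3, but the modes differ.
BMI <- data.frame(
  gender = c("Male", "Male", "Female"),
  height = c(152, 171.5, 165),
  smoker = c(TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE   # keep character columns as character
)

print(BMI)
print(sapply(BMI, class))   # the class of each column
print(is.list(BMI))         # a data frame is a list of equal-length vectors
```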
Generating Data
Regular Sequence
The operator ':' has priority over the arithmetic operators within an expression, so the
command
> 1:10-1
generates the numbers from 0 to 9:
[1] 0 1 2 3 4 5 6 7 8 9
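The effect of this precedence rule can be seen by comparing the expression with and without parentheses. This is a small sketch; the variable names are chosen for the illustration.

```r
# ':' binds more tightly than '-', so this is (1:10) - 1.
without_parens <- 1:10 - 1

# Parentheses force the subtraction first, giving 1:9.
with_parens <- 1:(10 - 1)

print(without_parens)
print(with_parens)
```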
The function seq can generate sequences of real numbers:
> seq(1, 5, 0.5)
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
where the first number indicates the beginning of the sequence, the second one the
end, and the third one the increment to be used to generate the sequence.
It is also possible, if one wants to enter some data on the keyboard, to use the
function scan with simply the default options:
> z <- scan()
1: 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
10:
Read 9 items
> z
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0
The function rep creates a vector with all its elements identical:
> rep(1, 30)
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
The function sequence creates a series of sequences of integers, each ending with one of the
numbers given as arguments:
> sequence(4:5)
[1] 1 2 3 4 1 2 3 4 5
> sequence(4:7)
[1] 1 2 3 4 1 2 3 4 5 1 2 3 4 5 6 1 2 3 4 5 6 7
> sequence(c(10,5))
[1] 1 2 3 4 5 6 7 8 9 10 1 2 3 4 5
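The function gl (generate levels) generates regular series of factors. In gl(k, n), k is the
number of levels and n is the number of replications of each level: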
> gl(3, 5)
[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3
Levels: 1 2 3
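Both rep and gl accept further arguments. The sketch below, with illustrative inputs, shows the times and each arguments of rep and the labels argument of gl.

```r
# Repeat the whole vector twice, or each element twice.
repeated_whole <- rep(c("a", "b"), times = 2)
repeated_each  <- rep(c("a", "b"), each = 2)

# Two levels, three replications each, with custom labels.
sex <- gl(2, 3, labels = c("Male", "Female"))

print(repeated_whole)
print(repeated_each)
print(sex)
```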
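Functions
Creating a function
Syntax:
function_name <- function(arg_1, arg_2, ...) {
   Function body
}
Example
# Create a function that prints the squares of the numbers from 1 to a.
new.function <- function(a) {
   for(i in 1:a) {
      b <- i^2
      print(b)
   }
}
Calling a Function
Example
# Call the function, supplying 6 as the argument.
new.function(6)
Arguments can be supplied by position or by name:
# Create a function with three arguments.
new.function <- function(a, b, c) {
   result <- a * b + c
   print(result)
}
# Call the function by position of arguments.
new.function(5,3,11)
# Call the function by names of the arguments.
new.function(a = 11, b = 5, c = 3)
[1] 26
[1] 58
Arguments can also have default values, which are used when the call omits them:
# Create a function with default argument values.
new.function <- function(a = 3, b = 6) {
   result <- a * b
   print(result)
}
# Call the function without giving any argument.
new.function()
# Call the function with new values for the arguments.
new.function(9,5)
[1] 18
[1] 45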
R Strings
Any value written within a pair of single quotes or double quotes in R is treated as a string. Internally, R
stores every string within double quotes, even when it is created with single quotes.
Notes:
The quotes at the beginning and end of a string must be both double quotes or both single
quotes. They cannot be mixed.
Double quotes can be inserted into a string starting and ending with single quotes.
Single quotes can be inserted into a string starting and ending with double quotes.
Double quotes cannot be inserted into a string starting and ending with double quotes unless they are escaped with a backslash (\").
Single quotes cannot be inserted into a string starting and ending with single quotes unless they are escaped with a backslash (\').
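The quoting rules above can be sketched as follows. The strings are illustrative; the last one uses a backslash escape, a standard R feature that the notes above do not otherwise show.

```r
single_quoted <- 'Start and end with single quotes'
single_inside <- "a single quote ' inside double quotes"
double_inside <- 'a double quote " inside single quotes'
escaped       <- "an escaped \" double quote"

# print() shows the stored form, always delimited by double quotes.
print(single_quoted)
print(double_inside)

# cat() writes the raw characters, without delimiters or escapes.
cat(single_inside, "\n")
cat(escaped, "\n")
```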
String Manipulation
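Strings in R are combined using the paste() function, which takes any number of arguments to be
combined together.
sep is the separator placed between the arguments; by default it is a single space.
collapse, when given, joins the elements of the resulting vector into a single string, separated by the
collapse value. Neither option affects the spaces within the words of one string.
Example
a <- "Hello"
b <- 'How'
c <- "are you?"
print(paste(a,b,c))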
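A short sketch of how sep and collapse change the result of paste(). The input strings and variable names are assumptions chosen for the illustration.

```r
first  <- "Hello"
second <- "How"
third  <- "are you?"

joined_default <- paste(first, second, third)              # separated by spaces
joined_dash    <- paste(first, second, third, sep = "-")   # custom separator

# With vector arguments, paste() works element by element;
# collapse then joins the pieces into one string.
collapsed <- paste(c("x", "y", "z"), 1:3, sep = "", collapse = ", ")

print(joined_default)
print(joined_dash)
print(collapsed)
```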
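Numbers and strings can be formatted to a specific style using the format() function:
format(x, digits, nsmall, scientific, width, justify = c("left", "right", "centre", "none"))
x is the vector input.
digits is the number of significant digits to display.
nsmall is the minimum number of digits to the right of the decimal point.
scientific is set to TRUE to display scientific notation.
width indicates the minimum width to be displayed, padding with blanks at the beginning.
justify controls the alignment of strings: left, right or centre.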
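Example
# Number of digits displayed. The last digit is rounded off.
result <- format(23.123456789, digits = 9)
print(result)
Output
[1] "23.1234568"
# The minimum number of digits to the right of the decimal point.
result <- format(23.47, nsmall = 5)
print(result)
Output
[1] "23.47000"
# format() treats everything as a string.
result <- format(6)
print(result)
Output
[1] "6"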
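As a minimal sketch of the width, justify and scientific arguments of format(), with input values that are illustrative assumptions rather than taken from the original:

```r
# Numbers are padded with blanks at the beginning to reach the width.
padded_number <- format(13.7, width = 6)

# Strings can be aligned within the width.
left_text    <- format("Hello", width = 8, justify = "left")
centred_text <- format("Hello", width = 8, justify = "centre")

# Scientific notation for a numeric vector.
sci <- format(c(6, 13.14521), scientific = TRUE)

print(padded_number)
print(left_text)
print(centred_text)
print(sci)
```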
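Other string functions:
nchar(x) counts the number of characters in a string, including spaces.
toupper(x) converts the characters of a string to upper case.
tolower(x) converts the characters of a string to lower case.
substring(x,first,last) extracts the part of a string between the positions first and last.
Example
# Extract the characters from the 5th to the 7th position.
result <- substring("Extract", 5, 7)
print(result)
[1] "act"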
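The remaining functions can be sketched on a single illustrative string (the value of x is an assumption):

```r
x <- "Changing To Upper"

char_count <- nchar(x)            # spaces are counted too
upper      <- toupper(x)
lower      <- tolower(x)
first_word <- substring(x, 1, 8)  # characters 1 through 8

print(char_count)
print(upper)
print(lower)
print(first_word)
```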
R Vectors
Vectors are the most basic R data objects and there are six types of atomic vectors. They are
logical, integer, double, complex, character and raw.
Creating Vectors
Creating Single Element Vectors
Even when you write just one value in R, it becomes a vector of length 1 and belongs to one of
the above vector types.
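# Atomic vector of type character.
print("abc")
Output
[1] "abc"
# Atomic vector of type double.
print(12.5)
Output
[1] 12.5
# Atomic vector of type integer.
print(63L)
Output
[1] 63
# Atomic vector of type logical.
print(TRUE)
Output
[1] TRUE
# Atomic vector of type complex.
print(2+3i)
Output
[1] 2+3i
# Atomic vector of type raw.
print(charToRaw('hello'))
Output
[1] 68 65 6c 6c 6f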
Creating Multiple Element Vectors
A vector with several elements can be created as a sequence with the colon operator:
v <- 5:13
print(v)
Output
[1] 5 6 7 8 9 10 11 12 13
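# Creating a sequence from 6.6 to 12.6.
v <- 6.6:12.6
print(v)
Output
[1] 6.6 7.6 8.6 9.6 10.6 11.6 12.6
# If the final element specified does not belong to the sequence then it is discarded.
v <- 3.8:11.4
print(v)
Output
[1] 3.8 4.8 5.8 6.8 7.8 8.8 9.8 10.8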
The seq function creates a sequence with a chosen increment:
print(seq(5, 9, by = 0.4))
Output
[1] 5.0 5.4 5.8 6.2 6.6 7.0 7.4 7.8 8.2 8.6 9.0
The non-character values are coerced to character type if one of the elements is a character.
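A minimal sketch of this coercion, using illustrative values: the number and the logical value are both converted to character strings.

```r
# Mixing character, numeric and logical values in one vector.
s <- c("apple", "red", 5, TRUE)

print(s)
print(class(s))   # every element is now character
```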
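Elements of a vector are accessed by indexing with [ ]. Indexing starts at position 1.
A negative index drops that element from the result. TRUE/FALSE or 0/1 can also be used for indexing.
# Accessing vector elements using position.
t <- c("Sun","Mon","Tue","Wed","Thurs","Fri","Sat")
u <- t[2]
print(u)
u <- t[c(2,3,6)]
print(u)
Output
[1] "Mon"
[1] "Mon" "Tue" "Fri"
# Accessing vector elements using logical indexing.
v <- t[c(TRUE,FALSE,FALSE,FALSE,FALSE,TRUE,FALSE)]
print(v)
Output
[1] "Sun" "Fri"
# Accessing vector elements using negative indexing.
x <- t[c(-2,-5)]
print(x)
Output
[1] "Sun" "Tue" "Wed" "Fri" "Sat"
# Accessing vector elements using 0/1 indexing.
y <- t[c(0,0,0,0,0,0,1)]
print(y)
Output
[1] "Sun"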
Vector Manipulation
Vector Arithmetic