
Programming R - 3

Uploaded by

hafsulli123
Copyright
© © All Rights Reserved

Vectors

 The c() function can be used to create vectors of objects by concatenating
things together.
 The vector() function creates an empty vector of a given mode and length.
For example, vector(mode = "numeric", length = 10) produces a vector of ten
zeroes. With mode = "logical" the elements are FALSE, FALSE, FALSE, ... and
with mode = "character" they are empty strings "", "", "", ...
 When different objects are mixed in a vector, coercion occurs so that every
element in the vector is of the same class.
 Combining a numeric object with a character object creates a
character vector, because numbers can usually be easily represented as
strings.
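The points above about vector() and coercion can be tried out with a short sketch. The variable names are illustrative only and do not come from the notes.

```r
# vector() fills a new vector with the default value for the requested mode
v_num <- vector(mode = "numeric", length = 3)    # 0 0 0
v_lgl <- vector(mode = "logical", length = 3)    # FALSE FALSE FALSE
v_chr <- vector(mode = "character", length = 3)  # "" "" ""

# Mixing classes in c() coerces every element to one common class
mixed_chr  <- c(1.7, "a")    # character: "1.7" "a"
mixed_num  <- c(TRUE, 2)     # numeric:   1 2
mixed_chr2 <- c("a", TRUE)   # character: "a" "TRUE"

class(mixed_chr)
class(mixed_num)
class(mixed_chr2)
```

Coercion always moves toward the more general class (logical, then numeric, then character), which is why the number 1.7 ends up as the string "1.7" in the first case.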
Matrix
 Matrices are vectors with a dimension attribute. The dimension attribute is itself an integer vector of length
2 (number of rows, number of columns).
 Matrices are constructed column-wise, so entries can be thought of starting in the “upper left” corner and
running down the columns.
> m <- matrix(1:6, nrow = 2, ncol = 3)
> m
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
Matrices can also be created directly from vectors by adding a dimension attribute.
> m <- 1:10
> m
 [1]  1  2  3  4  5  6  7  8  9 10
> dim(m) <- c(2, 5)
> m
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    7    9
[2,]    2    4    6    8   10
 The dimensions of m can be checked with dim(m); attributes(m) lists all of its attributes, including dim.
 Examples of R object attributes
 • names, dimnames
 • dimensions (e.g. matrices, arrays)
 • class (e.g. integer, numeric)
 • length
 • other user-defined attributes/metadata
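As a rough illustration of the attribute types listed above, the sketch below inspects a matrix and a named vector. The attribute name "units" is made up for the example.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)
attributes(m)   # a list with one component, $dim
dim(m)          # 2 3
length(m)       # 6

x <- c(a = 1, b = 2)
names(x)        # "a" "b"

# A user-defined attribute; "units" is a hypothetical name chosen here
attr(x, "units") <- "cm"
attributes(x)   # now contains both names and units
```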
 Matrices can be created by column-binding or row-binding with the cbind() and rbind() functions.
> x <- 1:3
> y <- 10:12
> cbind(x, y)
     x  y
[1,] 1 10
[2,] 2 11
[3,] 3 12
> rbind(x, y)
  [,1] [,2] [,3]
x    1    2    3
y   10   11   12
 Column names and row names can be set separately using the colnames() and
rownames() functions.
> m <- matrix(1:4, nrow = 2, ncol = 2) ## start from a 2 x 2 matrix so two names fit each dimension
> colnames(m) <- c("h", "f")
> rownames(m) <- c("x", "z")

Lists
 Lists are a special type of vector that can contain elements of different
classes.
 Lists can be explicitly created using the list() function, which takes an
arbitrary number of arguments.
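A minimal sketch of the two points above: list() accepts any number of arguments, and the elements keep their own classes instead of being coerced. The values are arbitrary.

```r
# Four elements of four different classes; no coercion takes place
x <- list(1, "a", TRUE, 1 + 4i)
length(x)                    # 4
classes <- sapply(x, class)  # "numeric" "character" "logical" "complex"

# An empty list of a given length can be made with vector()
empty <- vector("list", length = 3)
length(empty)                # 3
```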

Factors
 Factors are used to represent categorical data and can be unordered or
ordered.
 One can think of a factor as an integer vector where each integer has a label.
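The "integer vector with labels" view can be made concrete with a small sketch. The data values and level names below are invented for illustration.

```r
x <- factor(c("yes", "yes", "no", "yes", "no"))
levels(x)         # "no" "yes" (sorted alphabetically by default)
codes <- as.integer(x)   # underlying integer codes: 2 2 1 2 1
table(x)          # counts per level

# The level order can be set explicitly with the levels argument
x2 <- factor(c("yes", "yes", "no", "yes", "no"), levels = c("yes", "no"))
codes2 <- as.integer(x2) # 1 1 2 1 2

# An ordered factor supports comparisons between levels
sizes <- factor(c("low", "high", "medium"),
                levels = c("low", "medium", "high"), ordered = TRUE)
sizes[1] < sizes[2]      # TRUE, since low < high
```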
Data Frames
 Data frames are used to store tabular data in R. They are an important type
of object in R and are used in a variety of statistical modeling applications.
 Data frames are represented as a special type of list where every element of
the list has to have the same length. Each element of the list can be thought
of as a column and the length of each element of the list is the number of
rows.
 Unlike matrices, data frames can store different classes of objects in each
column. Matrices must have every element be the same class (e.g. all
integers or all numeric).
 Data frames can be converted to a matrix by calling data.matrix(). While it
might seem that the as.matrix() function should be used to coerce a data
frame to a matrix, almost always, what you want is the result of
data.matrix().
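The difference between data.matrix() and as.matrix() shows up as soon as a data frame has a non-numeric column. The sketch below uses a small made-up data frame; the column names are assumptions for the example.

```r
df <- data.frame(id  = 1:3,
                 grp = factor(c("a", "b", "a")),
                 ok  = c(TRUE, FALSE, TRUE))

is.list(df)        # TRUE: a data frame is a special kind of list
nrow(df)           # 3 rows (the length of each column)
ncol(df)           # 3 columns (the number of list elements)
sapply(df, class)  # each column keeps its own class

dm <- data.matrix(df)  # numeric matrix: factor -> integer codes, logical -> 0/1
am <- as.matrix(df)    # character matrix, because of the factor column
```

With data.matrix() the result stays numeric and can go straight into matrix arithmetic, whereas as.matrix() here turns every entry into a string.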
Subsetting R Objects
There are three operators that can be used to extract subsets of R objects.
 The [ operator always returns an object of the same class as the original. It
can be used to select multiple elements of an object.
 The [[ operator is used to extract elements of a list or a data frame. It can
only be used to extract a single element and the class of the returned object
will not necessarily be a list or data frame.
 The $ operator is used to extract elements of a list or data frame by literal
name. Its semantics are similar to those of [[.
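Since all three operators also apply to data frames, here is a short sketch of how their return values differ. The data frame and its column names are invented for illustration.

```r
df <- data.frame(a = 1:3, b = c("x", "y", "z"))

one_col_df <- df["a"]    # [ keeps the class: a data frame with one column
col_vec    <- df[["a"]]  # [[ extracts the element itself: an integer vector
same_vec   <- df$a       # $ extracts by literal name, like [[

class(one_col_df)
class(col_vec)
```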
 Vectors are basic objects in R and they can be subsetted using the [ operator.
> x <- c("a", "b", "c", "c", "d", "a")
> x[1] ## Extract the first element
[1] "a"
> x[2] ## Extract the second element
[1] "b"
 The [ operator can be used to extract multiple elements of a vector by passing the
operator an integer sequence. Here we extract the first four elements of the
vector.
> x[1:4]
[1] "a" "b" "c" "c"
 The sequence does not have to be in order; you can specify any arbitrary integer
vector.
> x[c(1, 3, 4)]
[1] "a" "c" "c"
 We can also pass a logical sequence to the [ operator to extract elements of a
vector that satisfy a given condition.
 For example, here we want the elements of x that come lexicographically after
the letter “a”.
> u <- x > "a"
> u
[1] FALSE TRUE TRUE TRUE TRUE FALSE
> x[u]
[1] "b" "c" "c" "d"
 A more compact way to do this is to skip the creation of a logical vector
and subset the vector directly with the logical expression.
> x[x > "a"]
[1] "b" "c" "c" "d"
 Subsetting a Matrix: Matrices can be subsetted in the usual way with (i,j) type
indices. Here, we create a simple 2 × 3 matrix with the matrix function.
> x <- matrix(1:6, 2, 3)
> x
     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
 We can access the (1, 2) or the (2, 1) element of this matrix using the appropriate
indices.
> x[1, 2]
[1] 3
> x[2, 1]
[1] 2
 Indices can also be missing. This behavior is used to access entire rows or columns
of a matrix.
> x[1, ] ## Extract the first row
[1] 1 3 5
> x[, 2] ## Extract the second column
[1] 3 4
Dropping matrix dimensions
 By default, when a single element of a matrix is retrieved, it is returned as a
vector of length 1 rather than a 1 × 1 matrix. Often, this is exactly what we want,
but this behavior can be turned off by setting drop = FALSE.
> x <- matrix(1:6, 2, 3)
> x[1, 2]
[1] 3
> x[1, 2, drop = FALSE]
     [,1]
[1,]    3
 Similarly, when we extract a single row or column of a matrix, R by default drops the dimension of
length 1, so instead of getting a 1 × 3 matrix after extracting the first row, we get a vector of length
3. This behavior can similarly be turned off with the drop = FALSE option.
> x <- matrix(1:6, 2, 3)
> x[1, ]
[1] 1 3 5
> x[1, , drop = FALSE]
     [,1] [,2] [,3]
[1,]    1    3    5
 Subsetting Lists: Lists in R can be subsetted using all three of the operators mentioned earlier.
> x <- list(foo = 1:4, bar = 0.6)
> x
$foo
[1] 1 2 3 4
$bar
[1] 0.6
 The [[ operator can be used to extract single elements from a list. Here we
extract the first element of the list.
> x[[1]]
[1] 1 2 3 4
 The [[ operator can also use named indices so that you don’t have to
remember the exact ordering of every element of the list. You can also use
the $ operator to extract elements by name.
> x[["bar"]]
[1] 0.6
> x$bar
[1] 0.6
 Notice you don’t need the quotes when you use the $ operator.
 One thing that differentiates the [[ operator from the $ is that the [[ operator
can be used with computed indices. The $ operator can only be used with
literal names.
> x <- list(foo = 1:4, bar = 0.6, baz = "hello")
> name <- "foo"
>
> ## computed index for "foo"
> x[[name]]
[1] 1 2 3 4
>
> ## element "name" doesn’t exist! (but no error here)
> x$name
NULL
>
> ## element "foo" does exist
> x$foo
[1] 1 2 3 4
 The [[ operator can take an integer sequence if you want to extract a nested
element of a list.
> x <- list(a = list(10, 12, 14), b = c(3.14, 2.81))
>
> ## Get the 3rd element of the 1st element
> x[[c(1, 3)]]
[1] 14
>
> ## Same as above
> x[[1]][[3]]
[1] 14
>
> ## 1st element of the 2nd element
> x[[c(2, 1)]]
[1] 3.14
 The [ operator can be used to extract multiple elements from a list. For
example, to extract the first and third elements of a list, you
would do the following:
> x <- list(foo = 1:4, bar = 0.6, baz = "hello")
> x[c(1, 3)]
$foo
[1] 1 2 3 4
$baz
[1] "hello"
 Note that x[c(1, 3)] is NOT the same as x[[c(1, 3)]].
 Remember that the [ operator always returns an object of the same class as
the original. Since the original object was a list, the [ operator returns a list.
In the above code, we returned a list with two elements (the first and the
third).
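To see why x[c(1, 3)] and x[[c(1, 3)]] differ, the following sketch applies both to the same list used above. The variable names sub_list and nested are illustrative.

```r
x <- list(foo = 1:4, bar = 0.6, baz = "hello")

# Single brackets: a list holding the 1st and 3rd elements
sub_list <- x[c(1, 3)]

# Double brackets: recursive indexing, i.e. the 3rd element of the 1st element
nested <- x[[c(1, 3)]]   # same as x[[1]][[3]], which is 3

class(sub_list)   # "list"
class(nested)     # "integer"
```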
Reading datasets
 read.table, read.csv, for reading tabular data
 readLines, for reading lines of a text file
 source, for reading in R code files (inverse of dump)
 dget, for reading in R code files (inverse of dput)
 load, for reading in saved workspaces
 unserialize, for reading single R objects in binary form
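A small sketch of two of the reading functions listed above. To keep it self-contained it reads from an in-memory string rather than a file on disk; the CSV content is made up.

```r
csv_text <- "name,score\nana,90\nbo,85"

# read.csv() accepts the data as a string through its text argument
scores <- read.csv(text = csv_text)
str(scores)

# readLines() reads raw lines; here from a text connection instead of a file
con <- textConnection(csv_text)
raw_lines <- readLines(con)
close(con)
raw_lines   # "name,score" "ana,90" "bo,85"
```

With a real file, the file path would simply replace the text argument or the connection.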
Writing datasets

There are analogous functions for writing data to files:
 write.table, for writing tabular data to text files (e.g. CSV) or connections
 writeLines, for writing character data line-by-line to a file or connection
 dump, for dumping a textual representation of multiple R objects
 dput, for outputting a textual representation of an R object
 save, for saving an arbitrary number of R objects in binary format (possibly
compressed) to a file.
 serialize, for converting an R object into a binary format for outputting to a
connection (or file).
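The pairing between writing and reading functions can be checked with a round trip. This is only a sketch: it writes to temporary files created with tempfile(), and the data frame is invented for the example.

```r
df <- data.frame(id = 1:3, label = c("a", "b", "c"))

# write.table / read.csv: tabular text
csv_file <- tempfile(fileext = ".csv")
write.table(df, csv_file, sep = ",", row.names = FALSE)
df_csv <- read.csv(csv_file)

# dput / dget: textual representation of a single R object
dput_file <- tempfile()
dput(df, file = dput_file)
df_dget <- dget(dput_file)

# save / load: binary format; load() restores objects under their saved names
rda_file <- tempfile(fileext = ".rda")
save(df, file = rda_file)
restored <- new.env()
load(rda_file, envir = restored)
df_loaded <- restored$df

unlink(c(csv_file, dput_file, rda_file))  # clean up the temporary files
```

Loading into a separate environment, as done here, avoids silently overwriting an existing object named df in the workspace.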
