
R Programming Language Unit01

R is an open-source programming language primarily used for statistical computing and data analysis, developed by Ross Ihaka and Robert Gentleman. It features a command-line interface, is platform-independent, and has a vast library of packages available through CRAN. R is widely used in data science, machine learning, and by major tech companies, although it has some limitations such as slower performance compared to other languages.

Uploaded by

Chaya Anu
Copyright
© All Rights Reserved

R Programming Language – Introduction

R is an open-source programming language that is widely used as a
statistical software and data analysis tool. R generally comes with a
command-line interface and is available on widely used platforms such as
Windows, Linux, and macOS.
It was designed by Ross Ihaka and Robert Gentleman at the University of
Auckland, New Zealand, and is currently developed by the R Core Team. R is
an implementation of the S programming language, combined with lexical
scoping semantics inspired by Scheme. The project was conceived in 1992,
with an initial version released in 1995 and a stable version (1.0.0)
in 2000.

Why R Programming Language?

• R is a leading tool for machine learning, statistics, and data analysis.
Objects, functions, and packages can easily be created in R.
• It is a platform-independent language, so it runs on all major operating
systems.
• It is a free, open-source language, so anyone can install it in any
organization without purchasing a license.
• R is not only a statistics package; it also integrates with other
languages (C, C++), so you can easily interact with many data sources and
statistical packages.
• R has a vast community of users, and it keeps growing.
• R is one of the most requested programming languages in the Data
Science job market.
Features of R Programming Language
Statistical Features of R:
• Basic Statistics: The most common basic statistics terms are the mean,
mode, and median. These are all known as “Measures of Central
Tendency.” R makes it easy to measure central tendency.
• Static graphics: R is rich with facilities for creating and developing
interesting static graphics. R contains functionality for many plot types
including graphic maps, mosaic plots, biplots, and the list goes on.
• Probability distributions: Probability distributions play a vital role in
statistics, and R makes it easy to handle various types of probability
distribution such as the Binomial, Normal, and Chi-squared distributions
and many more.
• Data analysis: R provides a large, coherent, and integrated collection of
tools for data analysis.
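The statistical features listed above can be tried directly in base R. The
following is a minimal sketch using made-up sample values. Note that base R
has no built-in function for the statistical mode (the mode() function
reports an object's storage mode), so the helper stat_mode below is a
hypothetical name introduced only for illustration.

```r
# Made-up sample data, for illustration only
x <- c(2, 4, 4, 5, 7, 9, 4)

# Measures of central tendency
x_mean   <- mean(x)     # arithmetic mean
x_median <- median(x)   # middle value of the sorted data

# Hypothetical helper: the most frequent value in a vector
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[which.max(counts)])
}
x_mode <- stat_mode(x)

# Probability distributions: a density and a cumulative probability
p_three_heads <- dbinom(3, size = 5, prob = 0.5)  # P(X = 3), Binomial(5, 0.5)
p_below_zero  <- pnorm(0)                         # P(Z <= 0), standard Normal

print(c(mean = x_mean, median = x_median, mode = x_mode))
print(c(p_three_heads, p_below_zero))
```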
Programming Features of R:
• R Packages: One of the major features of R is the wide availability of
libraries. R has CRAN (Comprehensive R Archive Network), a repository
holding more than 10,000 packages.
• Distributed Computing: Distributed computing is a model in which
components of a software system are shared among multiple computers
to improve efficiency and performance. Two packages used for distributed
programming in R, ddR and multidplyr, were released in November 2015.
Programming in R:
Since R is syntactically similar to other widely used languages, it is
easy to learn and code in R. Programs can be written in any of the widely
used IDEs such as RStudio, Rattle, Tinn-R, etc. After writing the program,
save the file with the extension .r. To run the program, use the following
command on the command line:
Rscript file_name.r
Example:
 R

# R program to print Welcome to GFG!

# Below line will print "Welcome to GFG!"

cat("Welcome to GFG!")
Output:
Welcome to GFG!
Advantages of R:
• R is one of the most comprehensive statistical analysis packages, and
new techniques and concepts often appear first in R.
• R is open source, so you can run it anywhere and at any time.
• R is cross-platform: it runs on GNU/Linux, Windows, and macOS.
• In R, everyone is welcome to provide new packages, bug fixes, and code
enhancements.
Disadvantages of R:
• The quality of some R packages is less than perfect.
• R commands give little thought to memory management, so R may
consume all available memory.
• There is no official support channel, so there is nobody to complain to
if something doesn’t work.
• R can be much slower than other programming languages such as Python
and MATLAB, depending on the workload.
Applications of R:
• We use R for Data Science. It offers a broad variety of libraries related
to statistics and provides an environment for statistical computing and
design.
• Many quantitative analysts use R as their programming tool; it helps
with data importing and cleaning.
• R is widely used by data analysts and research programmers, and it
serves as a fundamental tool in finance.
• Tech companies such as Google, Facebook, Bing, Twitter, Accenture,
Wipro, and many more use R.
Fundamentals of R
Basic Syntax in R Programming
R is a popular language for statistical computing and data analysis, with
the support of over 10,000 free packages in the CRAN repository. Like any
other programming language, R has a specific syntax that is important to
understand if you want to make use of its powerful features. This article
assumes R is already installed on your machine. We will be using RStudio,
but you can also use the R command prompt by typing the following command
in the command line.
$ R
This will launch the interpreter. Now let’s write a basic Hello World
program to get started.
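A minimal sketch of such a program could look like the following. The
greeting text is an assumption chosen for illustration.

```r
# Print a greeting to the console
print("Hello World!")
```

At the interactive prompt the same line can be typed directly; print()
shows the value prefixed with [1], the index of the first element.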
A program in R is made up of three things: Variables, Comments, and
Keywords. Variables are used to store data, Comments are used to
improve code readability, and Keywords are reserved words that hold a
specific meaning to the interpreter.

Variables in R
Previously, we wrote all our code inside print(), but that gives us no way
to refer to the values later to perform further operations. This problem is
solved by variables which, as in any other programming language, are names
given to reserved memory locations that can store any type of data. In R,
assignment can be written in three ways:
1. = (Simple Assignment)
2. <- (Leftward Assignment)
3. -> (Rightward Assignment)

Example output:
"Simple Assignment"
"Leftward Assignment!"
"Rightward Assignment"
The rightward assignment is less common and can be confusing for some
programmers, so it is generally recommended to use the <- or = operator
for assigning values in R.
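The three assignment forms listed above can be sketched as follows. The
variable names var1, var2, and var3 are illustrative assumptions; the
strings match the example output shown earlier.

```r
# Simple assignment
var1 = "Simple Assignment"

# Leftward assignment
var2 <- "Leftward Assignment!"

# Rightward assignment: the value comes first, the name last
"Rightward Assignment" -> var3

print(var1)
print(var2)
print(var3)
```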
Comments in R
Comments are a way to improve your code’s readability and are only
meant for the reader, so the interpreter ignores them. R supports only
single-line comments, but multi-line comments can be emulated with a
simple trick, such as wrapping the text in a string inside an if (FALSE)
block. Single-line comments are written by placing # at the beginning of
the statement.
Example:

Output:
[1] "This is fun!"
From the above output, we can see that both comments were ignored by the
interpreter.
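A minimal sketch that produces the output above, assuming the multi-line
trick is a string placed inside an if (FALSE) block. The block is never
executed, but its contents must still be valid R, which is why the text is
wrapped in quotes.

```r
# This is a single-line comment
print("This is fun!")

# Emulated multi-line comment: the block below is never executed
if (FALSE) {
  "This text spans
   several lines and is skipped at run time"
}
```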
Keywords in R
Keywords are words reserved by the language because they have a special
meaning; a keyword therefore can’t be used as a variable name, function
name, etc. We can view these keywords by using either help(reserved) or
?reserved.

• if, else, repeat, while, function, for, in, next, and break are used for
control-flow statements and for declaring user-defined functions.
• The remaining ones are constants; for example, TRUE and FALSE are the
boolean constants.
• NaN denotes a Not a Number value, and NULL denotes an undefined
(empty) value.
• Inf is used for infinity values.
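The constants above, and the rule that a reserved word cannot be assigned
to, can be checked with a short sketch. The variable name outcome and the
messages are illustrative assumptions.

```r
# Special constants
print(is.nan(0 / 0))     # 0/0 yields NaN
print(1 / 0 == Inf)      # dividing a positive number by zero yields Inf
print(is.null(NULL))     # NULL is the empty object

# Attempting to assign to a reserved word fails with an error
outcome <- tryCatch(
  {
    eval(parse(text = "TRUE <- 5"))
    "assignment succeeded"
  },
  error = function(e) "assignment rejected"
)
print(outcome)
```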

Comments in R
Comments are plain English sentences written in a program to explain what
it does or what a piece of code is supposed to do. More specifically, they
carry information that the programmer should be aware of and that has
nothing to do with the logic of the code. They are completely ignored by
the interpreter and are thus never reflected in the output.
The question arises: how does the interpreter know whether a given
statement is a comment or not?
The answer is pretty simple. Every language uses a symbol to denote a
comment, and when the interpreter encounters this symbol it can
differentiate between a comment and a statement.
Comments are generally used for the following purposes:
• Code readability
• Explanation of the code or metadata of the project
• Preventing execution of code
• Including resources

Types of Comments
There are generally three types of comments supported by languages,
namely:
• Single-line Comments: comments that need only one line.
• Multi-line Comments: comments that require more than one line.
• Documentation Comments: comments that are usually drafted for a
quick documentation look-up.

Single-Line Comments in R
Single-line comments are comments that require only one line. They are
usually drafted to explain what a single line of code does or what it is
supposed to produce so that it can help someone referring to the source
code.
Just like Python single-line comments, any statement starting with “#” is a
comment in R.
Syntax:
# comment statement
Example 1:

# geeksforgeeks
The above code, when executed, produces no output, because R treats the
statement as a comment and the interpreter ignores the line.
Example 2:

# R program to add two numbers

# Assigning values to variables

a <- 9

b <- 4

# Printing sum

print(a + b)

Output:
[1] 13

R Operators
Operators are symbols that direct the interpreter to perform various kinds
of operations on operands. Operators carry out mathematical, logical, and
decision operations on complex numbers, integers, and numerics given as
input operands.
R supports five main kinds of operators. In this article, we will see the
various types of operators in the R programming language and their usage.
Types of operators in R
• Arithmetic Operators
• Logical Operators
• Relational Operators
• Assignment Operators
• Miscellaneous Operators
Arithmetic Operators
Arithmetic operations in R simulate various math operations, like addition,
subtraction, multiplication, division, and modulo using the specified operator
between operands, which may be either scalar values, complex numbers, or
vectors. The R operators are performed element-wise at the corresponding
positions of the vectors.

Addition operator (+)


The values at the corresponding positions of both operands are added.
Consider the following R operator snippet to add two vectors:

a <- c (1, 0.1)

b <- c (2.33, 4)

print (a+b)

Output : 3.33 4.10


Subtraction Operator (-)
The second operand values are subtracted from the first. Consider the
following R operator snippet to subtract two variables:

 R

a <- 6

b <- 8.4

print (a-b)

Output : -2.4
Multiplication Operator (*)
Corresponding elements of vectors, or single numbers, are multiplied
using the ‘*’ operator.

 R
B= c(4,4)

C= c(5,5)

print (B*C)

Output : 20 20
Division Operator (/)
The first operand is divided by the second operand with the use of the ‘/’
operator.

 R

a <- 10

b <- 5

print (a/b)

Output : 2
Power Operator (^)
The first operand is raised to the power of the second operand.

 R

a <- 4

b <- 5

print(a^b)

Output : 1024
Modulo Operator (%%)
The remainder of the first operand divided by the second operand is
returned.

 R
list1<- c(2, 22)

list2<-c(2,4)

print(list1 %% list2)

Output : 0 2

The following R code illustrates the usage of all Arithmetic R operators.

 R

# R program to illustrate

# the use of Arithmetic operators

vec1 <- c(0, 2)

vec2 <- c(2, 3)

# Performing operations on Operands

cat ("Addition of vectors :", vec1 + vec2, "\n")

cat ("Subtraction of vectors :", vec1 - vec2, "\n")

cat ("Multiplication of vectors :", vec1 * vec2, "\n")

cat ("Division of vectors :", vec1 / vec2, "\n")

cat ("Modulo of vectors :", vec1 %% vec2, "\n")

cat ("Power operator :", vec1 ^ vec2)

Output
Addition of vectors : 2 5
Subtraction of vectors : -2 -1
Multiplication of vectors : 0 6
Division of vectors : 0 0.6666667
Modulo of vectors : 0 2
Power operator : 0 8
Logical Operators
Logical operations in R perform element-wise decision operations between
the operands, each of which evaluates to a TRUE or FALSE boolean value.
Any non-zero numeric value, real or complex, is treated as TRUE, and zero
is treated as FALSE.
Element-wise Logical AND operator (&)
Returns True if both the operands are True.

 R

list1 <- c(TRUE, 0.1)

list2 <- c(0,4+3i)

print(list1 & list2)

Output : FALSE TRUE


Element-wise Logical OR operator (|)
Returns True if either of the operands is True.

 R

list1 <- c(TRUE, 0.1)

list2 <- c(0,4+3i)

print(list1|list2)

Output : TRUE TRUE


NOT operator (!)
A unary operator that negates the status of the elements of the operand.

 R
list1 <- c(0,FALSE)

print(!list1)

Output : TRUE TRUE


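Logical AND operator (&&)
Returns TRUE if both operands are TRUE. Unlike &, it works on single
values: since R 4.3.0, passing a vector longer than one element is an
error (older versions silently used only the first element), so the first
elements are selected explicitly here.

 R

list1 <- c(TRUE, 0.1)

list2 <- c(0,4+3i)

print(list1[1] && list2[1])

Output : FALSE
Compares just the first elements of both the vectors.
Logical OR operator (||)
Returns TRUE if either of the operands is TRUE. Like &&, it expects single
values.

 R

list1 <- c(TRUE, 0.1)

list2 <- c(0,4+3i)

print(list1[1] || list2[1])

Output : TRUE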
The following R code illustrates the usage of all Logical Operators in R:

 R

# R program to illustrate
# the use of Logical operators
vec1 <- c(0, 2)
vec2 <- c(TRUE, FALSE)

# Performing operations on Operands
cat("Element wise AND :", vec1 & vec2, "\n")
cat("Element wise OR :", vec1 | vec2, "\n")

# && and || need single values, so the first elements are compared
cat("Logical AND :", vec1[1] && vec2[1], "\n")
cat("Logical OR :", vec1[1] || vec2[1], "\n")
cat("Negation :", !vec1)

Output
Element wise AND : FALSE FALSE
Element wise OR : TRUE TRUE
Logical AND : FALSE
Logical OR : TRUE
Negation : TRUE FALSE

Relational Operators
The relational operators in R carry out comparison operations between the
corresponding elements of the operands. Returns a boolean TRUE value if
the first operand satisfies the relation compared to the second. A TRUE
value is always considered to be greater than the FALSE.
Less than (<)
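Returns TRUE if the corresponding element of the first operand is less than
that of the second operand. Else returns FALSE. Because the vectors below
mix logical, numeric, and character values, c() converts every element to
character, so the elements are compared as strings.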

 R
list1 <- c(TRUE, 0.1,"apple")

list2 <- c(0,0.1,"bat")

print(list1<list2)

Output : FALSE FALSE TRUE


Less than equal to (<=)
Returns TRUE if the corresponding element of the first operand is less than
or equal to that of the second operand. Else returns FALSE.

 R

list1 <- c(TRUE, 0.1, "apple")
list2 <- c(TRUE, 0.1, "bat")

# c() has already coerced every element to character;
# as.character() only makes the conversion explicit
list1_char <- as.character(list1)
list2_char <- as.character(list2)

# Compare character strings
print(list1_char <= list2_char)

Output : TRUE TRUE TRUE


Greater than (>)
Returns TRUE if the corresponding element of the first operand is greater
than that of the second operand. Else returns FALSE.

 R
list1 <- c(TRUE, 0.1, "apple")
list2 <- c(TRUE, 0.1, "bat")

# The elements are already character strings, so compare them directly
print(list1 > list2)

Output : FALSE FALSE FALSE


Greater than equal to (>=)
Returns TRUE if the corresponding element of the first operand is greater or
equal to that of the second operand. Else returns FALSE.

 R

list1 <- c(TRUE, 0.1, "apple")
list2 <- c(TRUE, 0.1, "bat")

# The elements are already character strings, so compare them directly
print(list1 >= list2)

Output : TRUE TRUE FALSE


Not equal to (!=)
Returns TRUE if the corresponding element of the first operand is not equal
to the second operand. Else returns FALSE.

 R

list1 <- c(TRUE, 0.1,'apple')

list2 <- c(0,0.1,"bat")

print(list1!=list2)

Output : TRUE FALSE TRUE


The following R code illustrates the usage of all Relational Operators in R:

 R
# R program to illustrate

# the use of Relational operators

vec1 <- c(0, 2)

vec2 <- c(2, 3)

# Performing operations on Operands

cat ("Vector1 less than Vector2 :", vec1 < vec2, "\n")

cat ("Vector1 less than equal to Vector2 :", vec1 <= vec2, "\n")

cat ("Vector1 greater than Vector2 :", vec1 > vec2, "\n")

cat ("Vector1 greater than equal to Vector2 :", vec1 >= vec2, "\n")

cat ("Vector1 not equal to Vector2 :", vec1 != vec2, "\n")

Output
Vector1 less than Vector2 : TRUE TRUE
Vector1 less than equal to Vector2 : TRUE TRUE
Vector1 greater than Vector2 : FALSE FALSE
Vector1 greater than equal to Vector2 : FALSE FALSE
Vector1 not equal to Vector2 : TRUE TRUE

Assignment Operators
Assignment operators in R are used to assign values to various data
objects. The objects may be integers, vectors, or functions. These values
are then stored under the assigned variable names. There are two kinds of
assignment operators: left and right.
Left Assignment (<- or <<- or =)
Assigns the value on the right to the name on the left. <<- assigns in an
enclosing (typically the global) environment instead of the current one.
vec1 = c("ab", TRUE)

print (vec1)

Output : "ab" "TRUE"


Right Assignment (-> or ->>)
Assigns the value on the left to the name on the right.

 R

c("ab", TRUE) ->> vec1

print (vec1)

Output : "ab" "TRUE"


The following R code illustrates the usage of all Assignment Operators in R:

 R

# R program to illustrate

# the use of Assignment operators

vec1 <- c(2:5)

c(2:5) ->> vec2

vec3 <<- c(2:5)

vec4 = c(2:5)

c(2:5) -> vec5

# Performing operations on Operands


cat ("vector 1 :", vec1, "\n")

cat("vector 2 :", vec2, "\n")

cat ("vector 3 :", vec3, "\n")

cat("vector 4 :", vec4, "\n")

cat("vector 5 :", vec5)

Output
vector 1 : 2 3 4 5
vector 2 : 2 3 4 5
vector 3 : 2 3 4 5
vector 4 : 2 3 4 5
vector 5 : 2 3 4 5
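At the top level all of these operators behave alike, which is why the five
vectors above are identical. The difference between <- and <<- only shows
inside a function. The following is a minimal sketch; the names counter,
bump_local, and bump_global are illustrative assumptions.

```r
counter <- 0

# `<-` inside a function creates a new local variable
bump_local <- function() {
  counter <- counter + 1
  counter
}

# `<<-` updates the variable in the enclosing environment
bump_global <- function() {
  counter <<- counter + 1
  counter
}

local_result <- bump_local()   # returns 1; the outer counter is untouched
after_local  <- counter
bump_global()                  # the outer counter is incremented
after_global <- counter

cat("after bump_local  :", after_local, "\n")
cat("after bump_global :", after_global, "\n")
```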

Miscellaneous Operators
These are operators that do not fit the categories above, such as
membership testing and matrix multiplication.
%in% Operator
Checks whether an element belongs to a vector and returns the boolean
value TRUE if the value is present, else FALSE.

 R

val <- 0.1

list1 <- c(TRUE, 0.1,"apple")

print (val %in% list1)

Output : TRUE
Checks for the value 0.1 in the specified vector. It exists; therefore,
TRUE is printed.
%*% Operator
This operator performs matrix multiplication. The number of columns of the
first matrix must be equal to the number of rows of the second matrix. In
the example below, a matrix is multiplied by its transpose, which is
obtained by interchanging rows and columns. Multiplying a matrix by its
transpose always produces a square matrix.

 R

mat = matrix(c(1,2,3,4,5,6),nrow=2,ncol=3)

print (mat)

print( t(mat))

pro = mat %*% t(mat)

print(pro)

Output :
     [,1] [,2] [,3]    #original matrix of order 2x3
[1,]    1    3    5
[2,]    2    4    6
     [,1] [,2]         #transposed matrix of order 3x2
[1,]    1    2
[2,]    3    4
[3,]    5    6
     [,1] [,2]         #product matrix of order 2x2
[1,]   35   44
[2,]   44   56
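%*% is not limited to a matrix and its transpose. The following is a
minimal sketch with two made-up matrices, A and B, whose shapes are
compatible; it also shows that incompatible shapes raise an error.

```r
# Illustrative matrices: A is 2x3, B is 3x2
A <- matrix(1:6, nrow = 2, ncol = 3)
B <- matrix(c(1, 0, 1, 0, 1, 1), nrow = 3, ncol = 2)

# Valid because ncol(A) equals nrow(B); the result is 2x2
product <- A %*% B
print(product)

# A 2x3 matrix cannot be multiplied by another 2x3 matrix
bad <- tryCatch(A %*% A, error = function(e) "non-conformable")
print(bad)
```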
The following R code illustrates the usage of all Miscellaneous Operators in
R:

 R

# R program to illustrate
# the use of Miscellaneous operators
mat <- matrix(1:4, nrow = 1, ncol = 4)
print("Matrix elements using : ")
print(mat)

product = mat %*% t(mat)
print("Product of matrices")
print(product)

cat("does 1 exist in prod matrix :", "1" %in% product)

Output
[1] "Matrix elements using : "
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[1] "Product of matrices"
     [,1]
[1,]   30
does 1 exist in prod matrix : FALSE

R Data Types
Different forms of data that can be saved and manipulated are defined and
categorized using data types in computer languages, including R. Each R
data type has unique properties and associated operations.
What are R Data types?
R Data types are used in computer programming to specify the kind of data
that can be stored in a variable. For effective memory consumption and
precise computation, the right data type must be selected. Each R data type
has its own set of regulations and restrictions.
Data Types in R Programming Language
Each variable in R has an associated data type. Each R-Data Type requires
different amounts of memory and has some specific operations which can be
performed over it. R Programming language has the following basic R-data
types and the following table shows the data type and the values that each
data type can take.
Basic Data Type   Values                                Example
Numeric           Set of all real numbers               numeric_value <- 3.14
Integer           Set of all integers, Z                integer_value <- 42L
Logical           TRUE and FALSE                        logical_value <- TRUE
Complex           Set of complex numbers                complex_value <- 1 + 2i
Character         “a”, “b”, “c”, …, “@”, “#”, “$”, …,   character_value <- "Hello Geeks"
                  “1”, “2”, … etc.
Raw               as.raw()                              single_raw <- as.raw(255)
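The example values from the table can be inspected in one pass. This is a
minimal sketch; the names values and types are illustrative assumptions.

```r
# The example values from the table above
values <- list(
  numeric   = 3.14,
  integer   = 42L,
  logical   = TRUE,
  complex   = 1 + 2i,
  character = "Hello Geeks",
  raw       = as.raw(255)
)

# class() reports the high-level type, typeof() the internal storage type
types <- data.frame(
  class  = sapply(values, class),
  typeof = sapply(values, typeof)
)
print(types)
```

The two columns agree for every row except the numeric one, where class()
reports "numeric" and typeof() reports "double".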

Numeric Data type in R

Decimal values are called numerics in R. Numeric is the default data type
for numbers in R: if you assign a decimal value to a variable x as follows,
x will be of numeric type. Numeric values are stored as double-precision
floating-point numbers.
# A simple R program

# to illustrate Numeric data type

# Assign a decimal value to x

x = 5.6

# print the class name of variable

print(class(x))

# print the type of variable

print(typeof(x))

Output
[1] "numeric"
[1] "double"
Even if an integer is assigned to a variable y, it is still saved as a numeric
value.

 R

# A simple R program

# to illustrate Numeric data type

# Assign an integer value to y


y = 5

# print the class name of variable

print(class(y))

# print the type of variable

print(typeof(y))

Output
[1] "numeric"
[1] "double"
When R stores a number in a variable, it stores it as a “double” (a
double-precision floating-point value) unless told otherwise. This means
that a value such as 5 here is stored as the double 5, with a type of
double and a class of numeric, not as an integer. That y is not an integer
can be confirmed with the is.integer() function.
 R

# A simple R program

# to illustrate Numeric data type

# Assign a integer value to y

y = 5

# is y an integer?
print(is.integer(y))

Output
[1] FALSE
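A few consequences of storing numbers as doubles can be checked directly.
This is a minimal sketch; all variable names are illustrative assumptions.

```r
y <- 5

# 5 and 5.00 are the same double value; no decimal places are added
same_value <- identical(y, 5.00)

# A double can hold a whole number without being of integer type
whole_but_double <- (y == as.integer(y)) && !is.integer(y)

# Floating-point arithmetic is approximate, so compare with a tolerance
exact_match <- (0.1 + 0.2 == 0.3)
close_match <- isTRUE(all.equal(0.1 + 0.2, 0.3))

print(c(same_value, whole_but_double, exact_match, close_match))
```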

Integer Data type in R

R supports the integer data type for whole numbers. Integers are stored as
32-bit values, so the largest representable integer is 2147483647. You can
create a value of, or convert a value to, the integer type using
the as.integer() function. You can also append a capital ‘L’ as a suffix
to denote that a particular value is of the integer data type.
 R

# A simple R program

# to illustrate integer data type

# Create an integer value

x = as.integer(5)

# print the class name of x

print(class(x))

# print the type of x

print(typeof(x))
# Declare an integer by appending an L suffix.

y = 5L

# print the class name of y

print(class(y))

# print the type of y

print(typeof(y))

Output
[1] "integer"
[1] "integer"
[1] "integer"
[1] "integer"
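Because integers have a fixed range, integer arithmetic can overflow. The
following is a minimal sketch of that behaviour; the variable names are
illustrative assumptions, and suppressWarnings() is used only to keep the
overflow warning out of the output.

```r
largest <- .Machine$integer.max    # largest representable integer

# Adding 1L stays in integer arithmetic and overflows to NA
overflowed <- suppressWarnings(largest + 1L)

# Adding the double 1 promotes the result to double, so nothing overflows
promoted <- largest + 1

print(largest)
print(overflowed)
print(class(promoted))
```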

Logical Data type in R

R has logical data types that take either a value of true or false. A logical
value is often created via a comparison between variables. Boolean values,
which have two possible values, are represented by this R data type: FALSE
or TRUE

 R

# A simple R program

# to illustrate logical data type


# Sample values

x = 4

y = 3

# Comparing two values

z = x > y

# print the logical value

print(z)

# print the class name of z

print(class(z))

# print the type of z

print(typeof(z))

Output
[1] TRUE
[1] "logical"
[1] "logical"

Complex Data type in R

R supports complex data types that are set of all the complex numbers. The
complex data type is to store numbers with an imaginary component.
 R

# A simple R program

# to illustrate complex data type

# Assign a complex value to x

x = 4 + 3i

# print the class name of x

print(class(x))

# print the type of x

print(typeof(x))

Output
[1] "complex"
[1] "complex"

Character Data type in R

R supports the character data type, which covers all letters and special
characters. It stores character values, or strings, which can contain
letters, numbers, and symbols. The easiest way to mark a value as
character type is to wrap it in single or double quotes.

 R

# A simple R program
# to illustrate character data type

# Assign a character value to char

char = "Geeksforgeeks"

# print the class name of char

print(class(char))

# print the type of char

print(typeof(char))

Output
[1] "character"
[1] "character"
There are several tasks that can be done using R data types. Let’s
understand each task with its action and the syntax for doing the task along
with an R code to illustrate the task.
Raw data type in R
To store and work with data at the byte level in R, use the raw data type.
By representing a sequence of unprocessed bytes, it enables low-level
operations on binary data. Here is an example of R’s raw data type:

 R

# Create a raw vector


x <- as.raw(c(0x1, 0x2, 0x3, 0x4, 0x5))

print(x)

Output:
[1] 01 02 03 04 05
Five elements make up this raw vector x, each of which represents a raw
byte value.
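Raw vectors are also what R uses when text is converted to bytes. The
following is a minimal sketch using the base functions charToRaw() and
rawToChar(); the variable names are illustrative assumptions.

```r
# Text to bytes and back again
bytes <- charToRaw("R!")      # one byte per ASCII character
text  <- rawToChar(bytes)

# Raw values convert to integers in the range 0 to 255
as_numbers <- as.integer(bytes)

print(bytes)
print(text)
print(as_numbers)
```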

Find data type of an object in R

To find the data type of an object, use the class() function, passing the
object as its argument.
Syntax
class(object)
Example
 R

# A simple R program

# to find data type of an object

# Logical

print(class(TRUE))

# Integer

print(class(3L))
# Numeric

print(class(10.5))

# Complex

print(class(1+2i))

# Character

print(class("12-04-2020"))

Output
[1] "logical"
[1] "integer"
[1] "numeric"
[1] "complex"
[1] "character"

Type verification

To verify that an object is of a given data type, use a function whose name
is the prefix “is.” followed by the data type, and pass it the object you
have to verify.
Syntax:
is.data_type(object)
Example
 R

# A simple R program

# Verify if an object is of a certain datatype


# Logical

print(is.logical(TRUE))

# Integer

print(is.integer(3L))

# Numeric

print(is.numeric(10.5))

# Complex

print(is.complex(1+2i))

# Character

print(is.character("12-04-2020"))

print(is.integer("a"))

print(is.numeric(2+3i))

Output
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] TRUE
[1] FALSE
[1] FALSE

Coerce or convert the data type of an object to another

The process of altering the data type of an object to another type is
referred to as coercion or data type conversion. This is a common operation
in many programming languages, used to reshape data and perform various
computations. Coercion is normally performed automatically by the language
when required, whereas conversion is requested explicitly by the
programmer.
In R, coercion can show up in a variety of ways depending on the context.
In some circumstances it is implicit, which means that the language changes
one type to another without the programmer having to request it expressly.
Explicit conversion uses the as.data_type() functions.
Syntax
as.data_type(object)
Note: Not all conversions are possible; an attempt that fails returns an
“NA” value.
Example
 R

# A simple R program

# convert data type of an object to another

# Logical

print(as.numeric(TRUE))
# Integer

print(as.complex(3L))

# Numeric

print(as.logical(10.5))

# Complex

print(as.character(1+2i))

# Not possible

print(as.numeric("12-04-2020"))

Output
[1] 1
[1] 3+0i
[1] TRUE
[1] "1+2i"
[1] NA
Warning message:
In print(as.numeric("12-04-2020")) : NAs introduced by
coercion
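The example above covers explicit conversion. Implicit coercion, described
earlier in this section, can be sketched as follows; the variable names are
illustrative assumptions.

```r
# Combining types in c() coerces everything to the most flexible type:
# logical -> integer -> numeric -> complex -> character
mixed_num  <- c(TRUE, 2L, 3.5)       # becomes numeric
mixed_char <- c(TRUE, 2L, "three")   # becomes character

# Arithmetic on logicals treats TRUE as 1 and FALSE as 0
true_count <- sum(c(TRUE, FALSE, TRUE))

print(class(mixed_num))
print(mixed_char)
print(true_count)
```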

Data Structures in R Programming


A data structure is a particular way of organizing data in a computer so
that it can be used effectively. The idea is to reduce the space and time
complexities of different tasks. Data structures in R programming are tools
for holding multiple values.
R’s base data structures are often organized by their dimensionality (1D,
2D, or nD) and by whether they are homogeneous (all elements must be of the
same type) or heterogeneous (the elements may be of different types). This
gives rise to the six data structures that are most frequently used in data
analysis.
The most essential data structures used in R include:
• Vectors
• Lists
• Dataframes
• Matrices
• Arrays
• Factors

Vectors

A vector is an ordered collection of basic data types of a given length.
The key point is that all the elements of a vector must be of the same
data type, i.e., vectors are homogeneous data structures. Vectors are
one-dimensional data structures.
Example:
 R

# R program to illustrate Vector

# Vectors(ordered collection of same data type)

X = c(1, 3, 5, 7, 8)

# Printing those elements in console

print(X)

Output:
[1] 1 3 5 7 8
Lists

A list is a generic object consisting of an ordered collection of objects.
Lists are heterogeneous data structures, and they are also one-dimensional.
A list can hold vectors, matrices, character strings, functions, and so on.
Example:
 R

# R program to illustrate a List

# The first attribute is a numeric vector containing the
# employee IDs, created using the 'c' command
empId = c(1, 2, 3, 4)

# The second attribute is the employee name,
# which is a character vector
empName = c("Debi", "Sandeep", "Subham", "Shiba")

# The third attribute is the number of employees,
# which is a single numeric variable
numberOfEmp = 4

# We can combine all these three different data types into a list
# containing the details of employees using the list command
empList = list(empId, empName, numberOfEmp)

print(empList)

Output:
[[1]]
[1] 1 2 3 4

[[2]]
[1] "Debi" "Sandeep" "Subham" "Shiba"

[[3]]
[1] 4
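Elements of a list can be given names and extracted individually. The
following is a minimal sketch; the element names ids, names, and count are
illustrative assumptions and are not part of the example above.

```r
# Named version of the employee list
empList <- list(
  ids   = c(1, 2, 3, 4),
  names = c("Debi", "Sandeep", "Subham", "Shiba"),
  count = 4
)

second_name <- empList[["names"]][2]   # [[ ]] extracts the element itself
same_ids    <- empList$ids             # $ is shorthand for [[ ]] with a name
sub_list    <- empList["count"]        # [ ] returns a shorter list

print(second_name)
print(class(same_ids))
print(class(sub_list))
```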

Dataframes

Dataframes are generic data objects of R that are used to store tabular
data. They are among the most popular data objects in R programming because
data is comfortable to read in tabular form. They are two-dimensional,
heterogeneous data structures, implemented as lists of vectors of equal
length.
Data frames have the following constraints placed upon them:
• A data frame must have column names, and every row should have a
unique name.
• Each column must have the same number of items.
• Each item in a single column must be of the same data type.
• Different columns may have different data types.
To create a data frame we use the data.frame() function.
Example:
 R

# R program to illustrate dataframe

# A vector which is a character vector

Name = c("Amiya", "Raj", "Asish")

# A vector which is a character vector

Language = c("R", "Python", "Java")

# A vector which is a numeric vector

Age = c(22, 25, 45)

# To create dataframe use data.frame command

# and then pass each of the vectors

# we have created as arguments

# to the function data.frame()

df = data.frame(Name, Language, Age)


print(df)

Output:
Name Language Age
1 Amiya R 22
2 Raj Python 25
3 Asish Java 45
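The constraints listed above can be observed directly. The following is a
minimal sketch that reuses two of the columns from the example; the
variable names column_types, shape, and unequal are illustrative
assumptions.

```r
df <- data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Age  = c(22, 25, 45)
)

# Each column keeps its own type; dim() gives rows and columns
column_types <- sapply(df, class)
shape        <- dim(df)

# Columns of unequal length violate the constraints and raise an error
unequal <- tryCatch(
  data.frame(Name = c("Amiya", "Raj", "Asish"), Age = c(22, 25)),
  error = function(e) "differing number of rows"
)

print(column_types)
print(shape)
print(unequal)
```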

Matrices

A matrix is a rectangular arrangement of numbers in rows and columns. Rows
run horizontally and columns run vertically. Matrices are two-dimensional,
homogeneous data structures.
To create a matrix in R, use the matrix() function. Its first argument is the
vector of elements, followed by the number of rows and the number of
columns you want in the matrix. An important point to remember is that,
by default, matrices are filled in column-wise order.
Example:
 R

# R program to illustrate a matrix

A = matrix(
# Taking sequence of elements

c(1, 2, 3, 4, 5, 6, 7, 8, 9),

# No of rows and columns

nrow = 3, ncol = 3,

# By default matrices are

# in column-wise order

# So this parameter decides

# how to arrange the matrix

byrow = TRUE

)

print(A)

Output:
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9

Arrays

Arrays are the R data objects which store the data in more than two
dimensions. Arrays are n-dimensional data structures. For example, if we
create an array of dimensions (2, 3, 3) then it creates 3 rectangular matrices
each with 2 rows and 3 columns. They are homogeneous data structures.
Now, let’s see how to create arrays in R. To create an array in R you need to
use the function called array(). The arguments to this array() are the set of
elements in vectors and you have to pass a vector containing the dimensions
of the array.
Example:
 R

# R program to illustrate an array

A = array(

# Taking sequence of elements

c(1, 2, 3, 4, 5, 6, 7, 8),

# Creating two rectangular matrices

# each with two rows and two columns

dim = c(2, 2, 2)

)

print(A)

Output:
, , 1

[,1] [,2]
[1,] 1 3
[2,] 2 4
, , 2

[,1] [,2]
[1,] 5 7
[2,] 6 8

Factors

Factors are data objects used to categorize data and store it as levels.
They are useful for storing categorical data and can hold both strings and
integers. They suit columns with a small set of unique values, such as
“TRUE” or “FALSE”, or “MALE” or “FEMALE”. They are widely used in
data analysis for statistical modeling.
To create a factor in R, use the factor() function. Its argument is the
vector to be categorized.
Example:
 R

# R program to illustrate factors

# Creating factor using factor()

fac = factor(c("Male", "Female", "Male",

"Male", "Female", "Male", "Female"))

print(fac)

Output:
[1] Male Female Male Male Female Male Female
Levels: Female Male
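To see what "storing data as levels" means in practice, the sketch below inspects the factor from the example with three base R functions: levels(), as.integer(), and table(). It is an illustration only, not part of the original example.

```r
fac = factor(c("Male", "Female", "Male",
               "Male", "Female", "Male", "Female"))

# The distinct categories, sorted alphabetically by default
print(levels(fac))

# Internally each value is stored as an integer code into levels(fac)
print(as.integer(fac))

# Count how many times each level occurs
print(table(fac))
```

The integer codes are why factors are compact and why modeling functions can treat them as categories rather than free text.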
R Vectors
R vectors are similar to arrays in the C language: they hold multiple data
values of the same type. One key difference is that vector indexing in R
starts from 1, not 0. We can create numeric vectors, character vectors,
and logical vectors.


Types of R vectors
Vectors are of different types which are used in R. Following are some of the
types of vectors:
Numeric vectors: Numeric vectors are those which contain numeric values
such as integer, float, etc.
 R

# R program to create numeric Vectors

# creation of vectors using c() function.

v1<- c(4, 5, 6, 7)
# display type of vector

typeof(v1)

# by using 'L' we can specify that we want integer values.

v2<- c(1L, 4L, 2L, 5L)

# display type of vector

typeof(v2)

Output:
[1] "double"
[1] "integer"
Character vectors: Character vectors in R contain alphanumeric values and
special characters.
 R

# R program to create Character Vectors

# by default numeric values

# are converted into characters

v1<- c('geeks', '2', 'hello', 57)

# Displaying type of vector


typeof(v1)

Output:
[1] "character"
Logical vectors: Logical vectors in R contain Boolean values such as
TRUE and FALSE, and may also contain NA, which marks a missing value.
 R

# R program to create Logical Vectors

# Creating logical vector

# using c() function

v1<- c(TRUE, FALSE, TRUE, NA)

# Displaying type of vector

typeof(v1)

Output:
[1] "logical"
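The three vector types above interact when values are mixed, as the character vector example hinted. The sketch below shows the coercion order used by c() and the difference between NA and NULL; the variable name v is chosen only for illustration.

```r
# Mixing types in c() coerces everything to the most general type:
# logical < integer < double < character
print(typeof(c(TRUE, 2L)))     # logical + integer
print(typeof(c(2L, 3.5)))      # integer + double
print(typeof(c(3.5, "a")))     # double + character

# NA marks a missing value and still occupies a position
v <- c(TRUE, FALSE, TRUE, NA)
print(length(v))
print(is.na(v))

# NULL, by contrast, is an empty object and adds no element
print(length(c(1, NULL, 2)))
```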

Creating a vector

There are different ways of creating R vectors. Generally, we use the c() function to
combine different elements together.

 R

# R program to create Vectors


# we can use the c function

# to combine the values as a vector.

# By default the type will be double

X<- c(61, 4, 21, 67, 89, 2)

cat('using c function', X, '\n')

# seq() function for creating

# a sequence of continuous values.

# length.out defines the length of vector.

Y<- seq(1, 10, length.out = 5)

cat('using seq() function', Y, '\n')

# use':' to create a vector

# of continuous values.

Z<- 2:7

cat('using colon', Z)

Output:
using c function 61 4 21 67 89 2
using seq() function 1 3.25 5.5 7.75 10
using colon 2 3 4 5 6 7
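Besides c(), seq() with length.out, and the colon operator, base R offers a few other common constructors. The sketch below shows seq() with a step size, rep(), and vector(); the variable names S, R1, R2, and V are illustrative.

```r
# seq() with a step size instead of a target length
S <- seq(1, 10, by = 3)
cat('using seq() with by', S, '\n')

# rep() repeats a value or a whole vector
R1 <- rep(5, times = 3)
cat('using rep() with times', R1, '\n')
R2 <- rep(c(1, 2), each = 2)
cat('using rep() with each', R2, '\n')

# vector() pre-allocates a vector of a given type and length
V <- vector("numeric", length = 3)
cat('using vector()', V, '\n')
```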
Length of R vector

 R

# Create a numeric vector

x <- c(1, 2, 3, 4, 5)

# Find the length of the vector

length(x)

# Create a character vector

y <- c("apple", "banana", "cherry")

# Find the length of the vector

length(y)

# Create a logical vector

z <- c(TRUE, FALSE, TRUE, TRUE)

# Find the length of the vector

length(z)

Output:
> length(x)
[1] 5

> length(y)
[1] 3

> length(z)
[1] 4

Accessing R vector elements

Accessing elements of a vector means selecting one or more individual
elements by position. There are many ways to access the elements of a
vector. The most common is the subscript operator ‘[ ]’.
Note: Vectors in R use 1-based indexing, unlike languages such as C and
Python, which start at 0.
 R

# R program to access elements of a Vector

# accessing elements with an index number.

X<- c(2, 5, 18, 1, 12)

cat('Using Subscript operator', X[2], '\n')

# by passing a range of values

# inside the vector index.

Y<- c(4, 8, 2, 1, 17)

cat('Using combine() function', Y[c(4, 1)], '\n')

Output:
Using Subscript operator 5
Using combine() function 1 4
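The subscript operator also accepts negative indices, logical conditions, and names. The following sketch reuses the vector X from the example; the element names 'a' to 'e' are made up for illustration.

```r
X <- c(2, 5, 18, 1, 12)

# A negative index drops that position and returns the rest
cat('Without 2nd element', X[-2], '\n')

# A logical condition keeps the elements for which it is TRUE
cat('Elements greater than 4', X[X > 4], '\n')

# Elements can be named and then accessed by name
names(X) <- c('a', 'b', 'c', 'd', 'e')
cat('Element named c', X['c'], '\n')
```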
Modifying a R vector

Modifying a vector means assigning new values to one or more of its
elements. There are different ways to modify a vector:

 R

# R program to modify elements of a Vector

# Creating a vector

X<- c(2, 7, 9, 7, 8, 2)

# modify a specific element

X[3] <- 1

X[2] <-9

cat('subscript operator', X, '\n')

# Modify a range of positions at once.

X[1:5]<- 0

cat('Range indexing', X, '\n')

# Keep only selected positions,

# in the order given.


X<- X[c(3, 2, 1)]

cat('combine() function', X)

Output:
subscript operator 2 9 1 7 8 2
Range indexing 0 0 0 0 0 2
combine() function 0 0 0
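A vector can also be modified through a logical condition, so that only the elements matching the condition are replaced. This is a small sketch using the same starting vector as the example above.

```r
X <- c(2, 7, 9, 7, 8, 2)

# Replace every element equal to 7 with 0
X[X == 7] <- 0
cat('after logical replacement', X, '\n')

# Conditions can be combined with & (and) or | (or)
X[X > 1 & X < 9] <- 1
cat('after combined condition', X, '\n')
```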

Deleting a R vector

Deleting a vector means removing all of its elements. This can be done
by assigning NULL to it.

 R

# R program to delete a Vector

# Creating a Vector

M<- c(8, 10, 2, 5)

# set NULL to the vector

M<- NULL

cat('Output vector:\n')

# print() shows NULL; cat() prints nothing for a NULL value

print(M)

Output:
Output vector:
NULL
Sorting elements of a R Vector

The sort() function sorts the values of a vector in ascending order by
default, or in descending order when decreasing = TRUE.

 R

# R program to sort elements of a Vector

# Creation of Vector

X<- c(8, 2, 7, 1, 11, 2)

# Sort in ascending order

A<- sort(X)

cat('ascending order', A, '\n')

# sort in descending order

# by setting decreasing as TRUE

B<- sort(X, decreasing = TRUE)

cat('descending order', B)

Output:
ascending order 1 2 2 7 8 11
descending order 11 8 7 2 2 1
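Two related base R functions are often used alongside sort(): order(), which returns the positions that would sort the vector, and rev(), which reverses it. The sketch below uses the same vector X; idx is an illustrative name.

```r
X <- c(8, 2, 7, 1, 11, 2)

# order() returns the positions that would sort the vector
idx <- order(X)
cat('sorting positions', idx, '\n')

# Indexing with those positions gives the same result as sort(X)
cat('sorted via order()', X[idx], '\n')

# rev() reverses a vector
cat('reversed', rev(X), '\n')
```

order() is useful when a second vector (for example, names) has to be rearranged in the same way as the sorted values.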

R – Lists
A list in R is a generic object consisting of an ordered collection of objects.
Lists are one-dimensional, heterogeneous data structures. A list can hold
vectors, matrices, characters, functions, and so on.
A list is a vector with heterogeneous data elements. A list in R is created
with the list() function, and its elements are accessed by index value.
In R, list indexing starts at 1 rather than 0, unlike many other
programming languages.
Creating a List
To create a list in R, use the list() function. In other
words, a list is a generic vector containing other objects. To illustrate how a
list looks, we build a list of employee details with three attributes:
the IDs, the employee names, and the number of employees.
Example:

# R program to create a List

# The first attributes is a numeric vector

# containing the employee IDs which is created

# using the command here

empId = c(1, 2, 3, 4)

# The second attribute is the employee name

# which is created using this line of code here

# which is the character vector

empName = c("Debi", "Sandeep", "Subham", "Shiba")


# The third attribute is the number of employees

# which is a single numeric variable.

numberOfEmp = 4

# We can combine all these three different

# data types into a list

# containing the details of employees

# which can be done using a list command

empList = list(empId, empName, numberOfEmp)

print(empList)

Output:
[[1]]
[1] 1 2 3 4

[[2]]
[1] "Debi" "Sandeep" "Subham" "Shiba"

[[3]]
[1] 4
Accessing components of a list
We can access components of an R list in two ways.
 Access components by names: All the components of a list can be
named, and we can use those names to access the components of the R
list using the $ operator.
Example:
 R

# R program to access

# components of a list

# Creating a list by naming all its components

empId = c(1, 2, 3, 4)

empName = c("Debi", "Sandeep", "Subham", "Shiba")

numberOfEmp = 4

empList = list(

"ID" = empId,

"Names" = empName,

"Total Staff" = numberOfEmp

)

print(empList)

# Accessing components by names

cat("Accessing name components using $ command\n")

print(empList$Names)

Output:
$ID
[1] 1 2 3 4
$Names
[1] "Debi" "Sandeep" "Subham" "Shiba"

$`Total Staff`
[1] 4

Accessing name components using $ command


[1] "Debi" "Sandeep" "Subham" "Shiba"
 Access components by indices: We can also access the components
of an R list using indices. To access a top-level component, use the
double bracket operator “[[ ]]”. To access an inner-level element of
that component, follow the double brackets with a single bracket “[ ]”.
Example:
 R

# R program to access

# components of a list

# Creating a list by naming all its components

empId = c(1, 2, 3, 4)

empName = c("Debi", "Sandeep", "Subham", "Shiba")

numberOfEmp = 4

empList = list(

"ID" = empId,

"Names" = empName,

"Total Staff" = numberOfEmp


)

print(empList)

# Accessing a top level components by indices

cat("Accessing name components using indices\n")

print(empList[[2]])

# Accessing a inner level components by indices

cat("Accessing Sandeep from name using indices\n")

print(empList[[2]][2])

# Accessing another inner level components by indices

cat("Accessing 4 from ID using indices\n")

print(empList[[1]][4])

Output:
$ID
[1] 1 2 3 4

$Names
[1] "Debi" "Sandeep" "Subham" "Shiba"

$`Total Staff`
[1] 4
Accessing name components using indices
[1] "Debi" "Sandeep" "Subham" "Shiba"
Accessing Sandeep from name using indices
[1] "Sandeep"
Accessing 4 from ID using indices
[1] 4
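The difference between single and double brackets on a list is easy to miss, so a short sketch may help. It rebuilds the same employee list; sub_list and component are names introduced only for this illustration.

```r
empList = list(
  "ID" = c(1, 2, 3, 4),
  "Names" = c("Debi", "Sandeep", "Subham", "Shiba"),
  "Total Staff" = 4
)

# Single brackets return a (shorter) list
sub_list = empList[2]
print(class(sub_list))

# Double brackets return the component itself
component = empList[[2]]
print(class(component))

# Double brackets also accept a name, including names with spaces
print(empList[["Total Staff"]])
```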
Modifying components of a list
An R list can also be modified by accessing its components and replacing
them with the values you want.
Example:
 R

# R program to edit

# components of a list

# Creating a list by naming all its components

empId = c(1, 2, 3, 4)

empName = c("Debi", "Sandeep", "Subham", "Shiba")

numberOfEmp = 4

empList = list(

"ID" = empId,

"Names" = empName,

"Total Staff" = numberOfEmp

)

cat("Before modifying the list\n")


print(empList)

# Modifying the top-level component

empList$`Total Staff` = 5

# Modifying inner level component

empList[[1]][5] = 5

empList[[2]][5] = "Kamala"

cat("After modified the list\n")

print(empList)

Output:
Before modifying the list
$ID
[1] 1 2 3 4

$Names
[1] "Debi" "Sandeep" "Subham" "Shiba"

$`Total Staff`
[1] 4

After modified the list


$ID
[1] 1 2 3 4 5
$Names
[1] "Debi" "Sandeep" "Subham" "Shiba" "Kamala"

$`Total Staff`
[1] 5
Concatenation of lists
Two R lists can be concatenated using the c() function, which combines
its arguments into a single list.
Syntax:
list = c(list, list1)
list = the original list
list1 = the new list
Example:
 R

# R program to concatenate

# two lists

# Creating a list by naming all its components

empId = c(1, 2, 3, 4)

empName = c("Debi", "Sandeep", "Subham", "Shiba")

numberOfEmp = 4

empList = list(

"ID" = empId,

"Names" = empName,

"Total Staff" = numberOfEmp


)

cat("Before concatenation of the new list\n")

print(empList)

# Creating another list

empAge = list("Age" = c(34, 23, 18, 45))

# Concatenation of the two lists using c()

empList = c(empList, empAge)

cat("After concatenation of the new list\n")

print(empList)

Output:
Before concatenation of the new list
$ID
[1] 1 2 3 4

$Names
[1] "Debi" "Sandeep" "Subham" "Shiba"

$`Total Staff`
[1] 4

After concatenation of the new list
$ID
[1] 1 2 3 4

$Names
[1] "Debi" "Sandeep" "Subham" "Shiba"

$`Total Staff`
[1] 4

$Age
[1] 34 23 18 45
Deleting components of a list
To delete components of an R list, access those components with a
negative index. The negative sign indicates that the component is to be
dropped from the result.
Example:
 R

# R program to delete

# components of a list

# Creating a list by naming all its components

empId = c(1, 2, 3, 4)

empName = c("Debi", "Sandeep", "Subham", "Shiba")

numberOfEmp = 4

empList = list(

"ID" = empId,

"Names" = empName,

"Total Staff" = numberOfEmp

)

cat("Before deletion the list is\n")

print(empList)
# Deleting a top level components

cat("After Deleting Total staff components\n")

print(empList[-3])

# Deleting a inner level components

cat("After Deleting sandeep from name\n")

print(empList[[2]][-2])

Output:
Before deletion the list is
$ID
[1] 1 2 3 4

$Names
[1] "Debi" "Sandeep" "Subham" "Shiba"

$`Total Staff`
[1] 4

After Deleting Total staff components


$ID
[1] 1 2 3 4

$Names
[1] "Debi" "Sandeep" "Subham" "Shiba"

After Deleting sandeep from name


[1] "Debi" "Subham" "Shiba"
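The example above only prints the list without the chosen components; empList itself is unchanged. To remove a component permanently, the result has to be assigned back. The following is a minimal sketch of two common ways to do this.

```r
empList = list(
  "ID" = c(1, 2, 3, 4),
  "Names" = c("Debi", "Sandeep", "Subham", "Shiba"),
  "Total Staff" = 4
)

# Assigning NULL to a component removes it from the list
empList$`Total Staff` = NULL
print(names(empList))

# Reassigning the result of negative indexing also removes elements
empList[[2]] = empList[[2]][-2]
print(empList$Names)
```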
Merging list
We can merge R lists by combining them into a single list with the c() function.

 R

# Create two lists.

lst1 <- list(1,2,3)

lst2 <- list("Sun","Mon","Tue")

# Merge the two lists.

new_list <- c(lst1,lst2)

# Print the merged list.

print(new_list)

Output:
[[1]]
[1] 1

[[2]]
[1] 2

[[3]]
[1] 3

[[4]]
[1] "Sun"

[[5]]
[1] "Mon"

[[6]]
[1] "Tue"
Converting List to Vector
To convert an R list to a vector, we first create a list
and then flatten it into a vector with the unlist() function.

 R

# Create lists.

lst <- list(1:5)

print(lst)

# Convert the lists to vectors.

vec <- unlist(lst)

print(vec)

Output:
[[1]]
[1] 1 2 3 4 5

[1] 1 2 3 4 5
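Because a vector must be homogeneous, unlist() coerces mixed components to a common type. The sketch below illustrates this with two made-up lists, mixed and named.

```r
# A list holding a number, a string, and a logical value
mixed <- list(1, "a", TRUE)

vec <- unlist(mixed)
print(vec)
print(typeof(vec))

# Names in the list are carried over to the vector
named <- list(x = 1, y = 2)
print(unlist(named))
```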
R List to matrix
To convert a list to a matrix, we first flatten the list into a vector with
unlist() and then pass that vector to the matrix() function.

 R

# Defining list
lst1 <- list(list(1, 2, 3),

list(4, 5, 6))

# Print list

cat("The list is:\n")

print(lst1)

cat("Class:", class(lst1), "\n")

# Convert list to matrix

mat <- matrix(unlist(lst1), nrow = 2, byrow = TRUE)

# Print matrix

cat("\nAfter conversion to matrix:\n")

print(mat)

cat("Class:", class(mat), "\n")

Output:
The list is:
[[1]]
[[1]][[1]]
[1] 1

[[1]][[2]]
[1] 2

[[1]][[3]]
[1] 3

[[2]]
[[2]][[1]]
[1] 4

[[2]][[2]]
[1] 5

[[2]][[3]]
[1] 6

Class: list

After conversion to matrix:


[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
Class: matrix

R – Array
Arrays are data storage structures defined by a fixed number of
dimensions, and their elements are stored contiguously.
One-dimensional arrays are called vectors, with length
as their only dimension. Two-dimensional arrays are called matrices,
consisting of fixed numbers of rows and columns. All elements of an
array have the same data type. The array() function takes a vector as
input and arranges its values according to the requested dimensions.
Creating an Array
An array in R can be created with the array() function. A vector of
elements is passed to array() along with the required dimensions.
Syntax:
array(data, dim = c(nrow, ncol, nmat), dimnames = names)
where,
nrow : Number of rows
ncol : Number of columns
nmat : Number of matrices of dimensions nrow * ncol
dimnames : Default value = NULL.
Otherwise, a list has to be specified with one component per dimension.
Each component is either NULL or a vector whose length equals
the dim value of the corresponding dimension.
Uni-Dimensional Array
A vector is a uni-dimensional array, which is specified by a single dimension,
its length. A vector can be created using the c() function, to which the
values are passed as arguments.
Example:

vec1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

print (vec1)

# cat is used to concatenate

# strings and print it.

cat ("Length of vector : ", length(vec1))

Output:
[1] 1 2 3 4 5 6 7 8 9
Length of vector : 9
Multi-Dimensional Array
A two-dimensional matrix is an array specified by a fixed number of rows and
columns, with every element of the same data type. Arrays with more
dimensions are created by passing the values and the dimensions to the
array() function.
Example:
# arranges data from 2 to 13

# in two matrices of dimensions 2x3

arr = array(2:13, dim = c(2, 3, 2))

print(arr)

Output:
, , 1

[,1] [,2] [,3]


[1,] 2 4 6
[2,] 3 5 7

, , 2

[,1] [,2] [,3]


[1,] 8 10 12
[2,] 9 11 13
Vectors of different lengths can also be fed as input into the array() function
by combining them with c(). The total number of elements in all the vectors
combined should equal the number of elements in the array; otherwise R
recycles or truncates the data to fit. The elements are arranged
in the order in which they are specified.
Example:

vec1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

vec2 <- c(10, 11, 12)

# elements are combined into a single vector,

# vec1 elements followed by vec2 elements.


arr = array(c(vec1, vec2), dim = c(2, 3, 2))

print (arr)

Output:
, , 1

[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6

, , 2

[,1] [,2] [,3]
[1,] 7 9 11
[2,] 8 10 12
Naming of Arrays
The row names, column names, and matrix names are each specified as a
vector whose length equals the number of rows, columns, and matrices
respectively. By default, the rows, columns, and matrices are identified
by their index values.

row_names <- c("row1", "row2")

col_names <- c("col1", "col2", "col3")

mat_names <- c("Mat1", "Mat2")

# the naming of the various elements

# is specified in a list and

# fed to the function

arr = array(2:13, dim = c(2, 3, 2),

dimnames = list(row_names,

col_names, mat_names))

print (arr)

Output:
, , Mat1

col1 col2 col3
row1 2 4 6
row2 3 5 7

, , Mat2

col1 col2 col3
row1 8 10 12
row2 9 11 13
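Names do not have to be supplied when the array is created. As a sketch, the same names can be attached afterwards with dimnames(), and dim() can be used to confirm the shape.

```r
arr = array(2:13, dim = c(2, 3, 2))

# Names can also be attached after the array has been created
dimnames(arr) = list(c("row1", "row2"),
                     c("col1", "col2", "col3"),
                     c("Mat1", "Mat2"))

# dim() reports the size of each dimension
print(dim(arr))

# dimnames() returns the list of names, one component per dimension
print(dimnames(arr)[[3]])

# Names and positions refer to the same elements
print(arr["row2", "col3", "Mat2"])
```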
Accessing arrays
The arrays can be accessed by using indices for different dimensions
separated by commas. Different components can be specified by any
combination of elements’ names or positions.
Accessing Uni-Dimensional Array
The elements can be accessed by using indexes of the corresponding
elements.

vec <- c(1:10)

# accessing entire vector

cat ("Vector is : ", vec, "\n")

# accessing elements

cat ("Third element of vector is : ", vec[3])

Output:
Vector is : 1 2 3 4 5 6 7 8 9 10
Third element of vector is : 3
Accessing entire matrices

vec1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

vec2 <- c(10, 11, 12)

row_names <- c("row1", "row2")

col_names <- c("col1", "col2", "col3")

mat_names <- c("Mat1", "Mat2")

arr = array(c(vec1, vec2), dim = c(2, 3, 2),

dimnames = list(row_names,

col_names, mat_names))

# accessing matrix 1 by index value

print ("Matrix 1")

print (arr[,,1])

# accessing matrix 2 by its name

print ("Matrix 2")


print(arr[,,"Mat2"])

Output:
[1] "Matrix 1"
col1 col2 col3
row1 1 3 5
row2 2 4 6
[1] "Matrix 2"
col1 col2 col3
row1 7 9 11
row2 8 10 12
Accessing specific rows and columns of matrices
Rows and columns can also be accessed by both names as well as indices.

vec1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

vec2 <- c(10, 11, 12)

row_names <- c("row1", "row2")

col_names <- c("col1", "col2", "col3")

mat_names <- c("Mat1", "Mat2")

arr = array(c(vec1, vec2), dim = c(2, 3, 2),

dimnames = list(row_names,

col_names, mat_names))

# accessing the 1st column of matrix 1 by index values

print ("1st column of matrix 1")


print (arr[, 1, 1])

# accessing the 2nd row of matrix 2 by names

print ("2nd row of matrix 2")

print(arr["row2",,"Mat2"])

Output:
[1] "1st column of matrix 1"
row1 row2
1 2
[1] "2nd row of matrix 2"
col1 col2 col3
8 10 12
Accessing elements individually
Elements can be accessed by using both the row and column numbers or
names.

vec1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

vec2 <- c(10, 11, 12)

row_names <- c("row1", "row2")

col_names <- c("col1", "col2", "col3")

mat_names <- c("Mat1", "Mat2")

arr = array(c(vec1, vec2), dim = c(2, 3, 2),

dimnames = list(row_names, col_names, mat_names))


# accessing an element of matrix 1 by row index and column name

print ("2nd row 3rd column matrix 1 element")

print (arr[2, "col3", 1])

# accessing an element of matrix 2 by names only

print ("2nd row 1st column element of matrix 2")

print(arr["row2", "col1", "Mat2"])

Output:
[1] "2nd row 3rd column matrix 1 element"
[1] 6
[1] "2nd row 1st column element of matrix 2"
[1] 8
Accessing subset of array elements
A smaller subset of the array elements can be accessed by defining a range
of row or column limits.

row_names <- c("row1", "row2")

col_names <- c("col1", "col2", "col3", "col4")

mat_names <- c("Mat1", "Mat2")

arr = array(1:16, dim = c(2, 4, 2),

dimnames = list(row_names, col_names, mat_names))

# print both rows of columns 2 and 3 of matrix 1


print (arr[, c(2, 3), 1])

Output:
col2 col3
row1 3 5
row2 4 6
Adding elements to array
Elements can be appended at different positions in an array. The
sequence of elements is retained in order of their addition. Adding new
elements takes O(n) time, where n is the length of the array. The length
of the array increases by the number of elements added. R provides
several built-in ways to add new values:
 c(vector, values): The c() function appends values to the end of
the array. Multiple values can be added together.
 append(vector, values): This function allows values to be inserted
at any position in the vector. By default, it adds the values at the
end.
append(vector, values, after = length(vector)) inserts the new values after
the position given in the last argument.
 Using the length of the array:
Elements can be added at index length + x, where x > 0.

# creating a uni-dimensional array

x <- c(1, 2, 3, 4, 5)

# addition of element using c() function

x <- c(x, 6)

print ("Array after 1st modification ")

print (x)
# addition of element using append function

x <- append(x, 7)

print ("Array after 2nd modification ")

print (x)

# adding elements after computing the length

len <- length(x)

x[len + 1] <- 8

print ("Array after 3rd modification ")

print (x)

# adding on length + 3 index

x[len + 3]<-9

print ("Array after 4th modification ")

print (x)

# an 'after' value beyond the current length simply appends at the end

print ("Array after 5th modification")

x <- append(x, c(10, 11, 12), after = length(x)+3)

print (x)
# adds new elements after 3rd index

print ("Array after 6th modification")

x <- append(x, c(-1, -1), after = 3)

print (x)


Output:
 [1] "Array after 1st modification "
 [1] 1 2 3 4 5 6
 [1] "Array after 2nd modification "
 [1] 1 2 3 4 5 6 7
 [1] "Array after 3rd modification "
 [1] 1 2 3 4 5 6 7 8
 [1] "Array after 4th modification "
 [1] 1 2 3 4 5 6 7 8 NA 9
 [1] "Array after 5th modification"
 [1] 1 2 3 4 5 6 7 8 NA 9 10 11 12
 [1] "Array after 6th modification"
 [1] 1 2 3 -1 -1 4 5 6 7 8 NA 9 10 11 12
 The original length of the array was 7, and after third modification
elements are present till the 8th index value. Now, at the fourth
modification, when we add element 9 at the tenth index value, the R’s
inbuilt function automatically adds NA at the missing value positions.
At 5th modification, the array of elements [10, 11, 12] are added
beginning from the 11th index.
At 6th modification, array [-1, -1] is appended after the third position in the
array.
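The behaviour of the after argument of append() at its two extremes can be sketched as follows. The names front and at_end are illustrative only.

```r
x <- c(1, 2, 3)

# after = 0 inserts the new values at the front
front <- append(x, c(-1, 0), after = 0)
print(front)

# an 'after' value at or beyond the length simply appends at the end
at_end <- append(x, 4, after = 10)
print(at_end)
```

Unlike assignment to an index past the end, append() never introduces NA padding.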
Removing Elements from Array
Elements can be removed from arrays in R, either one at a time or several
together. A logical condition is written inside the index brackets: the
array values that satisfy the condition are retained and the rest are
removed. The comparison is based on the array values, not their positions.
Multiple conditions can be combined to remove a range of elements. Another
way to remove elements is with the %in% operator, which returns TRUE for
each element found in a given set; negating it keeps only the elements
outside that set.

# creating an array of length 9

m <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

print ("Original Array")

print (m)

# remove a single value element:3 from array

m <- m[m != 3]

print ("After 1st modification")

print (m)

# removing elements based on a condition:

# keep values that are greater than 2

# and less than or equal to 8

m <- m[m>2 & m<= 8]

print ("After 2nd modification")

print (m)

# remove sequence of elements using another array


remove <- c(4, 6, 8)

# check which element satisfies the remove property

print (m %in% remove)

print ("After 3rd modification")

print (m [! m %in% remove])

Output:
[1] "Original Array"
[1] 1 2 3 4 5 6 7 8 9
[1] "After 1st modification"
[1] 1 2 4 5 6 7 8 9
[1] "After 2nd modification"
[1] 4 5 6 7 8
[1] TRUE FALSE TRUE FALSE TRUE
[1] "After 3rd modification"
[1] 5 7
At the 1st modification, all element values not equal to 3 are
retained. At the 2nd modification, the elements greater than 2 and at most
8 are retained and the rest are removed. At the 3rd modification, the
elements for which %in% returns FALSE are printed, since the condition is
negated with the NOT operator.
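The examples above remove elements by value. Removal by position is also possible with negative indices, and which() converts a condition into positions. A brief sketch, with illustrative variable names:

```r
m <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# Remove the element at position 3
by_position <- m[-3]
print(by_position)

# Remove the elements at positions 1 to 4
by_range <- m[-(1:4)]
print(by_range)

# which() converts a condition into positions
positions <- which(m > 6)
print(positions)
print(m[-positions])
```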
Updating Existing Elements of Array
The elements of the array can be updated with new values by assignment of
the desired index of the array with the modified value. The changes are
retained in the original array. If the index value to be updated is within the
length of the array, then the value is changed, otherwise, the new element is
added at the specified index. Multiple elements can also be updated at once,
either with the same element value or multiple values in case the new values
are specified as a vector.

# creating an array of length 9


m <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

print ("Original Array")

print (m)

# updating single element

m[1] <- 0

print ("After 1st modification")

print (m)

# updating sequence of elements

m[7:9] <- -1

print ("After 2nd modification")

print (m)

# updating two indices with two different values

m[c(2, 5)] <- c(-1, -2)

print ("After 3rd modification")

print (m)
# this adds a new element to the array

m[10] <- 10

print ("After 4th modification")

print (m)

Output:
[1] "Original Array"
[1] 1 2 3 4 5 6 7 8 9
[1] "After 1st modification"
[1] 0 2 3 4 5 6 7 8 9
[1] "After 2nd modification"
[1] 0 2 3 4 5 6 -1 -1 -1
[1] "After 3rd modification"
[1] 0 -1 3 4 -2 6 -1 -1 -1
[1] "After 4th modification"
[1] 0 -1 3 4 -2 6 -1 -1 -1 10
At 2nd modification, the elements at indexes 7 to 9 are updated with -1 each.
At 3rd modification, the second element is replaced by -1 and fifth element
by -2 respectively. At 4th modification, a new element is added since 10th
index is greater than the length of the array.
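Updates can also be driven by a condition rather than by fixed positions. The sketch below starts from the final vector of the example above and uses replace(), a base R function that returns a modified copy; m2 is an illustrative name.

```r
m <- c(0, -1, 3, 4, -2, 6, -1, -1, -1, 10)

# Replace every negative value with 0
m[m < 0] <- 0
print(m)

# replace() returns a modified copy and leaves m unchanged
m2 <- replace(m, c(1, 2), 99)
print(m2)
print(m)
```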

R – Matrices
A matrix is a rectangular arrangement of numbers in rows and columns.
Rows run horizontally and columns run vertically. In R programming,
matrices are two-dimensional, homogeneous data structures.
Creating a Matrix
To create a matrix in R, use the matrix() function. Its first argument is
the vector of elements, followed by the number of rows and the number of
columns you want in the matrix.
Note: By default, matrices are filled in column-wise order.
 R

# R program to create a matrix

A = matrix(

# Taking sequence of elements

c(1, 2, 3, 4, 5, 6, 7, 8, 9),

# No of rows

nrow = 3,

# No of columns

ncol = 3,
# By default matrices are in column-wise order

# So this parameter decides how to arrange the matrix

byrow = TRUE

)

# Naming rows

rownames(A) = c("a", "b", "c")

# Naming columns

colnames(A) = c("c", "d", "e")

cat("The 3x3 matrix:\n")

print(A)

Output:
The 3x3 matrix:
c d e
a 1 2 3
b 4 5 6
c 7 8 9
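The effect of the byrow argument is easiest to see by filling the same data both ways. The sketch below uses two illustrative matrices, B and C.

```r
# Default: fill column by column
B = matrix(1:6, nrow = 2, ncol = 3)
print(B)

# byrow = TRUE: fill row by row
C = matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
print(C)
```

In B the first column holds 1 and 2, while in C the first row holds 1, 2, and 3.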
Creating special matrices
R allows the creation of several special types of matrices through the
arguments passed to matrix() and related functions.
 Matrix where all rows and columns are filled by a single constant
‘k’:
To create such an R matrix the syntax is given below:
Syntax: matrix(k, m, n)
Parameters:
k: the constant
m: no of rows
n: no of columns

Example:
 R

# R program to illustrate

# special matrices

# Matrix having 3 rows and 3 columns

# filled by a single constant 5

print(matrix(5, 3, 3))

Output:
[,1] [,2] [,3]
[1,] 5 5 5
[2,] 5 5 5
[3,] 5 5 5
 Diagonal matrix:
A diagonal matrix is a matrix in which the entries outside the main
diagonal are all zero. To create such an R matrix the syntax is given below:
Syntax: diag(k, m, n)
Parameters:
k: the constants/array
m: no of rows
n: no of columns
Example:
 R

# R program to illustrate
# special matrices

# Diagonal matrix having 3 rows and 3 columns

# filled by array of elements (5, 3, 3)

print(diag(c(5, 3, 3), 3, 3))

Output:
[,1] [,2] [,3]
[1,] 5 0 0
[2,] 0 3 0
[3,] 0 0 3
 Identity matrix:
An identity matrix is a square matrix in which all the elements of the
principal diagonal are ones and all other elements are zeros. To create
such an R matrix the syntax is given below:
Syntax: diag(k, m, n)
Parameters:
k: 1
m: no of rows
n: no of columns
Example:
 R

# R program to illustrate

# special matrices

# Identity matrix having

# 3 rows and 3 columns


print(diag(1, 3, 3))

Output:
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 1 0
[3,] 0 0 1
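The diag() function has two further uses worth knowing. As a sketch: called with a single integer it builds an identity matrix of that size, and called on an existing matrix it extracts the main diagonal. I3 is an illustrative name.

```r
# A single integer n gives the n x n identity matrix
I3 = diag(3)
print(I3)

# Applied to an existing matrix, diag() extracts the main diagonal
A = matrix(1:9, nrow = 3, byrow = TRUE)
print(diag(A))
```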
Matrix metrics
Matrix metrics are the basic facts about a matrix once it has been created:
 What is the dimension of the matrix?
 How many rows are there in the matrix?
 How many columns are in the matrix?
 How many elements are there in the matrix?
These are the questions we generally want to answer.
Example:
 R

# R program to illustrate

# matrix metrics

# Create a 3x3 matrix

A = matrix(

c(1, 2, 3, 4, 5, 6, 7, 8, 9),

nrow = 3,

ncol = 3,

byrow = TRUE

)
cat("The 3x3 matrix:\n")

print(A)

cat("Dimension of the matrix:\n")

print(dim(A))

cat("Number of rows:\n")

print(nrow(A))

cat("Number of columns:\n")

print(ncol(A))

cat("Number of elements:\n")

print(length(A))

# OR

print(prod(dim(A)))

Output:
The 3x3 matrix:
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
Dimension of the matrix:
[1] 3 3
Number of rows:
[1] 3
Number of columns:
[1] 3
Number of elements:
[1] 9
[1] 9
Accessing elements of a Matrix
We can access elements of an R matrix using the same convention that is
followed in data frames: the matrix name is followed by square brackets
with a comma inside. The value before the comma selects rows and the
value after the comma selects columns.
Let’s illustrate this with some simple R code.
Accessing rows:
 R

# R program to illustrate

# accessing rows in a matrix

# Create a 3x3 matrix

A = matrix(

c(1, 2, 3, 4, 5, 6, 7, 8, 9),

nrow = 3,

ncol = 3,

byrow = TRUE

)
cat("The 3x3 matrix:\n")

print(A)

# Accessing first and second row

cat("Accessing first and second row\n")

print(A[1:2, ])

Output:
The 3x3 matrix:
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
[3,] 7 8 9
Accessing first and second row
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
Accessing columns:
 R

# R program to illustrate

# accessing columns in a matrix

# Create a 3x3 matrix


A = matrix(

c(1, 2, 3, 4, 5, 6, 7, 8, 9),

nrow = 3,

ncol = 3,

byrow = TRUE

)

cat("The 3x3 matrix:\n")

print(A)

# Accessing first and second column

cat("Accessing first and second column\n")

print(A[, 1:2])

Output:
The 3x3 matrix:
[, 1] [, 2] [, 3]
[1, ] 1 2 3
[2, ] 4 5 6
[3, ] 7 8 9

Accessing first and second column


[, 1] [, 2]
[1, ] 1 2
[2, ] 4 5
[3, ] 7 8
Accessing elements of a R matrix:
 R

# R program to illustrate
# accessing a single entry of a matrix

# Create a 3x3 matrix
A = matrix(
  c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  nrow = 3,
  ncol = 3,
  byrow = TRUE
)
cat("The 3x3 matrix:\n")
print(A)

# Accessing 2 (row 1, column 2)
print(A[1, 2])

# Accessing 6 (row 2, column 3)
print(A[2, 3])

Output:
The 3x3 matrix:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[1] 2
[1] 6
Accessing Submatrices:
We can access the submatrix in a matrix using the colon(:) operator.
 R

# R program to illustrate
# access submatrices in a matrix

# Create a 3x3 matrix
A = matrix(
  c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  nrow = 3,
  ncol = 3,
  byrow = TRUE
)
cat("The 3x3 matrix:\n")
print(A)

cat("Accessing the first three rows and the first two columns\n")
print(A[1:3, 1:2])

Output:
The 3x3 matrix:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
Accessing the first three rows and the first two columns
     [,1] [,2]
[1,]    1    2
[2,]    4    5
[3,]    7    8
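The colon operator only selects adjacent rows or columns. As a minimal sketch, the same bracket convention also accepts an index vector built with c() or a logical vector; the variable names corners, keep, and lower_rows are chosen here purely for illustration.

```r
# Same 3x3 matrix as in the examples above
A <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3, byrow = TRUE)

# Rows 1 and 3 with columns 1 and 3 (non-adjacent indices via c())
corners <- A[c(1, 3), c(1, 3)]
print(corners)

# A logical vector selects the rows where the condition is TRUE:
# here, the rows whose first-column value is greater than 1
keep <- A[, 1] > 1
lower_rows <- A[keep, ]
print(lower_rows)
```

The logical form is convenient when the rows of interest depend on the data rather than on fixed positions.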
Modifying Elements of a Matrix
In R you can modify the elements of the matrices by a direct assignment.
Example:
 R

# R program to illustrate
# editing elements of a matrix

# Create a 3x3 matrix
A = matrix(
  c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  nrow = 3,
  ncol = 3,
  byrow = TRUE
)
cat("The 3x3 matrix:\n")
print(A)

# Editing the 3rd row and 3rd column element
# from 9 to 30
# by direct assignment
A[3, 3] = 30

cat("After editing the matrix\n")
print(A)

Output:
The 3x3 matrix:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
After editing the matrix
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8   30
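Direct assignment is not limited to a single cell. The sketch below, using illustrative values that are not from the original text, replaces a whole row and then overwrites every element that satisfies a condition.

```r
# Same 3x3 matrix as in the examples above
A <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3, byrow = TRUE)

# Replace the entire second row in one assignment
A[2, ] <- c(40, 50, 60)

# Overwrite every element greater than 8 with 0
A[A > 8] <- 0

print(A)
```

After both assignments the second row and the bottom-right element are zero, while the remaining elements are unchanged.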
Matrix Concatenation
Matrix concatenation refers to appending rows or columns to an existing
R matrix.

Concatenation of a row:
The concatenation of a row to a matrix is done using rbind().
 R

# R program to illustrate
# concatenation of a row to a matrix

# Create a 3x3 matrix
A = matrix(
  c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  nrow = 3,
  ncol = 3,
  byrow = TRUE
)
cat("The 3x3 matrix:\n")
print(A)

# Creating another 1x3 matrix
B = matrix(
  c(10, 11, 12),
  nrow = 1,
  ncol = 3
)
cat("The 1x3 matrix:\n")
print(B)

# Add a new row using rbind()
C = rbind(A, B)

cat("After concatenation of a row:\n")
print(C)

Output:
The 3x3 matrix:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
The 1x3 matrix:
     [,1] [,2] [,3]
[1,]   10   11   12
After concatenation of a row:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
[4,]   10   11   12
Concatenation of a column:
The concatenation of a column to a matrix is done using cbind().
 R

# R program to illustrate
# concatenation of a column to a matrix

# Create a 3x3 matrix
A = matrix(
  c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  nrow = 3,
  ncol = 3,
  byrow = TRUE
)
cat("The 3x3 matrix:\n")
print(A)

# Creating another 3x1 matrix
B = matrix(
  c(10, 11, 12),
  nrow = 3,
  ncol = 1,
  byrow = TRUE
)
cat("The 3x1 matrix:\n")
print(B)

# Add a new column using cbind()
C = cbind(A, B)

cat("After concatenation of a column:\n")
print(C)

Output:
The 3x3 matrix:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
The 3x1 matrix:
     [,1]
[1,]   10
[2,]   11
[3,]   12
After concatenation of a column:
     [,1] [,2] [,3] [,4]
[1,]    1    2    3   10
[2,]    4    5    6   11
[3,]    7    8    9   12
Dimension inconsistency: Make sure the dimensions of the matrices are
consistent before you concatenate them. rbind() requires the same number of
columns, and cbind() requires the same number of rows.
 R

# R program to illustrate
# dimension inconsistency in matrix concatenation

# Create a 3x3 matrix
A = matrix(
  c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  nrow = 3,
  ncol = 3,
  byrow = TRUE
)
cat("The 3x3 matrix:\n")
print(A)

# Creating another 1x3 matrix
B = matrix(
  c(10, 11, 12),
  nrow = 1,
  ncol = 3
)
cat("The 1x3 matrix:\n")
print(B)

# This will give an error
# because of dimension inconsistency
C = cbind(A, B)

cat("After concatenation of a column:\n")
print(C)

Output:
The 3x3 matrix:
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
The 1x3 matrix:
     [,1] [,2] [,3]
[1,]   10   11   12
Error in cbind(A, B) : number of rows of matrices must match (see arg 2)
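One way to avoid this error is to compare the dimensions before binding. The following is a minimal sketch, not the only approach: it checks the row counts with nrow() and, because B happens to have as many columns as A has rows, transposes B with t() so that it can be attached as a column. The names can_cbind and D are illustrative.

```r
A <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3, byrow = TRUE)
B <- matrix(c(10, 11, 12), nrow = 1, ncol = 3)

# cbind() needs matching row counts
can_cbind <- nrow(A) == nrow(B)
print(can_cbind)

# B is 1x3, so its transpose t(B) is 3x1 and matches the rows of A
if (!can_cbind && ncol(B) == nrow(A)) {
  D <- cbind(A, t(B))
} else {
  D <- cbind(A, B)
}
print(D)
```

Whether transposing is the right fix depends on what B is meant to represent; if B is really a new row, rbind(A, B) is the appropriate call.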
Deleting rows and columns of a Matrix
To delete a row or a column, index that row or column and put a negative
sign before the index. The negative sign indicates that the row or column
should be left out of the result.
Row deletion:
 R

# R program to illustrate
# row deletion in a matrix

# Create a 3x3 matrix
A = matrix(
  c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  nrow = 3,
  ncol = 3,
  byrow = TRUE
)
cat("Before deleting the 2nd row\n")
print(A)

# 2nd-row deletion
A = A[-2, ]

cat("After deleting the 2nd row\n")
print(A)

Output:
Before deleting the 2nd row
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
After deleting the 2nd row
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    7    8    9
Column deletion:
 R

# R program to illustrate
# column deletion in a matrix

# Create a 3x3 matrix
A = matrix(
  c(1, 2, 3, 4, 5, 6, 7, 8, 9),
  nrow = 3,
  ncol = 3,
  byrow = TRUE
)
cat("Before deleting the 2nd column\n")
print(A)

# 2nd-column deletion
A = A[, -2]

cat("After deleting the 2nd column\n")
print(A)

Output:
Before deleting the 2nd column
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
After deleting the 2nd column
     [,1] [,2]
[1,]    1    3
[2,]    4    6
[3,]    7    9
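Negative indices can also be combined, and there is one subtlety worth knowing: when only a single row or column is left, R simplifies the result to a plain vector unless drop = FALSE is given. A short sketch with illustrative variable names:

```r
A <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3, byrow = TRUE)

# Delete the first row and the third column in one step
D <- A[-1, -3]
print(D)

# Deleting rows 1 and 3 leaves one row, which collapses to a vector
B <- A[-c(1, 3), ]
print(dim(B))        # NULL, because B is no longer a matrix

# drop = FALSE keeps the result as a 1x3 matrix
C <- A[-c(1, 3), , drop = FALSE]
print(dim(C))
```

Keeping the matrix structure matters when later code calls nrow(), ncol(), or rbind() on the result.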

R Factors
Factors in R Programming Language are data structures used to represent
categorical data and store it as a set of levels.
They are stored as integers, with a corresponding label for every unique
integer. Although factors may look similar to character vectors, they are
internally integers, so care must be taken when using them as strings. A
factor accepts only a restricted number of distinct values. For example, a
data field such as gender may contain values only from female, male, or
transgender. In this example, all the possible cases are known beforehand
and are predefined. These distinct values are known as levels. When a factor
is created without explicit levels, its levels are the unique values of the
input, sorted alphabetically by default.
Arguments of factor() in R Language
 x: The vector that needs to be converted into a factor.
 levels: The set of distinct values that the input vector x may take.
 labels: An optional character vector of labels for the levels.
 exclude: The values to be excluded when forming the set of levels.
 ordered: A logical flag that decides whether the levels are ordered.
 nmax: An upper bound on the number of levels.
Creating a Factor in R Programming Language
The command used to create or modify a factor in R language is factor(),
with a vector as input.
The two steps to creating an R factor:

 Creating a vector
 Converting the vector created into a factor using function factor()
Example: Let us create a factor gender with levels female and male.
 R

# Creating a vector

x <-c("female", "male", "male", "female")

print(x)

# Converting the vector x into a factor

# named gender

gender <-factor(x)

print(gender)

Output
[1] "female" "male" "male" "female"
[1] female male male female
Levels: female male
Levels can also be predefined by the programmer.

 R

# Creating a factor with levels defined by programmer

gender <- factor(c("female", "male", "male", "female"),


levels = c("female", "transgender", "male"));

gender

Output
[1] female male male female
Levels: female transgender male
Further one can check the levels of a factor by using function levels().
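As a brief sketch of what levels() reports, the code below reuses the factor with programmer-defined levels from above and also shows nlevels() and the integer codes stored behind the labels.

```r
gender <- factor(c("female", "male", "male", "female"),
                 levels = c("female", "transgender", "male"))

# The levels, in the order they were defined
print(levels(gender))

# The number of levels (includes the unused level "transgender")
print(nlevels(gender))

# The integer codes stored internally: female = 1, male = 3
print(as.integer(gender))
```

The codes follow the order of the levels, not the order in which values appear in the data, which is why male is stored as 3 here.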
Checking for a Factor in R
The function is.factor() is used to check whether the variable is a factor and
returns “TRUE” if it is a factor.
 R

gender <- factor(c("female", "male", "male", "female"));

print(is.factor(gender))

Output
[1] TRUE
Function class() is also used to check whether the variable is a factor and if
true returns “factor”.
 R

gender <- factor(c("female", "male", "male", "female"));

class(gender)

Output
[1] "factor"
Accessing elements of a Factor in R
Like we access elements of a vector, the same way we access the elements
of a factor. If gender is a factor then gender[i] would mean accessing an
ith element in the factor.
Example
 R
gender <- factor(c("female", "male", "male", "female"));

gender[3]

Output
[1] male
Levels: female male
More than one element can be accessed at a time.
Example
 R

gender <- factor(c("female", "male", "male", "female"));

gender[c(2, 4)]

Output
[1] male female
Levels: female male
An element can be excluded with a negative index.
Example
 R

gender <- factor(c("female", "male", "male", "female" ));

gender[-3]

Output
[1] female male female
Levels: female male
 First, we create a factor vector gender with four elements: “female”,
“male”, “male”, and “female”.
 Then, we use the square brackets [-3] to subset the vector and remove
the third element, which is “male”.
 The output is the remaining elements of the gender vector, which are
“female”, “male”, and “female”. The output also shows the levels of the
factor, which are “female” and “male”.
Modification of a Factor in R
After a factor is formed, its components can be modified, but the new values
that are assigned must be among the predefined levels.
Example
 R

gender <- factor(c("female", "male", "male", "female" ));

gender[2]<-"female"

gender

Output
[1] female female male female
Levels: female male
To select all the elements of the factor gender except the ith element,
use gender[-i]. To assign a value that is not among the predefined levels,
first add that value to the levels.
Example
 R

gender <- factor(c("female", "male", "male", "female" ));

# add new level

levels(gender) <- c(levels(gender), "other")

gender[3] <- "other"

gender

Output
[1] female male other female
Levels: female male other
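For contrast, the sketch below shows what happens when the level is not added first: R stores NA for the element and issues a warning. The warning is suppressed here only so that the snippet runs quietly; the variable name bad is illustrative.

```r
gender <- factor(c("female", "male", "male", "female"))

# "other" is not a level of this factor, so the assignment yields NA
bad <- gender
suppressWarnings(bad[3] <- "other")

print(bad)
print(is.na(bad[3]))   # TRUE
print(levels(bad))     # levels are unchanged
```

This silent replacement by NA is a common source of missing values, so checking levels() before assigning is a reasonable habit.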
Factors in Data Frame
The data frame is similar to a 2D array, with the columns containing all the
values of one variable and the rows having one set of values from every
column. There are four things to remember about data frames:
 Column names are compulsory and cannot be empty.
 Unique names should be assigned to each row.
 Columns commonly hold factor, numeric, or character data (other types,
such as logical, are also allowed).
 The same number of data items must be present in each column.
In versions of R before 4.0.0, data.frame() automatically converted character
columns into factors. Since R 4.0.0 the default is stringsAsFactors = FALSE,
so character columns stay as character vectors unless stringsAsFactors = TRUE
is passed or the column is converted with factor().
We can create a data frame and check if its column is a factor.
Example
 R

age <- c(40, 49, 48, 40, 67, 52, 53)
salary <- c(103200, 106200, 150200,
            10606, 10390, 14070, 10220)
gender <- c("male", "male", "transgender",
            "female", "male", "female", "transgender")

# stringsAsFactors = TRUE converts the character column into a factor
employee <- data.frame(age, salary, gender, stringsAsFactors = TRUE)

print(employee)
print(is.factor(employee$gender))

Output
  age salary      gender
1  40 103200        male
2  49 106200        male
3  48 150200 transgender
4  40  10606      female
5  67  10390        male
6  52  14070      female
7  53  10220 transgender
[1] TRUE
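A column can also be converted to a factor after the data frame has been created. The following minimal sketch uses a small made-up data frame (the values are illustrative only) and table() to count the rows per level.

```r
# A small data frame with a plain character column
employee <- data.frame(
  age = c(40, 49, 48),
  gender = c("male", "female", "male"),
  stringsAsFactors = FALSE
)

# Convert the character column into a factor explicitly
employee$gender <- factor(employee$gender)
print(is.factor(employee$gender))

# Count how many rows fall into each level
counts <- table(employee$gender)
print(counts)
```

Converting explicitly makes the code behave the same way regardless of the R version's default for stringsAsFactors.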

R – Data Frames
R Programming Language is an open-source programming language that is
widely used as a statistical software and data analysis tool. Data Frames in
R Language are generic data objects of R that are used to store tabular
data. Data frames can also be interpreted as matrices where each column of
a matrix can be of different data types. R DataFrame is made up of three
principal components, the data, rows, and columns.

Create Dataframe in R Programming Language


To create an R data frame use data.frame() command and then pass each
of the vectors you have created as arguments to the function.
Example:
 R

# R program to create dataframe

# creating a data frame
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)

# print the data frame
print(friend.data)

Output:
  friend_id friend_name
1         1      Sachin
2         2      Sourav
3         3      Dravid
4         4      Sehwag
5         5       Dhoni
Get the Structure of the R – Data Frame
One can get the structure of the R data frame using str() function in R. It can
display even the internal structure of large lists which are nested. It provides
one-liner output for the basic R objects letting the user know about the object
and its constituents.
Example:
 R

# R program to get the
# structure of the data frame

# creating a data frame
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)

# using str()
# str() prints the structure itself and returns NULL invisibly,
# so it does not need to be wrapped in print()
str(friend.data)

Output:
'data.frame':   5 obs. of  2 variables:
 $ friend_id  : int  1 2 3 4 5
 $ friend_name: chr  "Sachin" "Sourav" "Dravid" "Sehwag" ...
Summary of data in the R data frame
In the R data frame, a statistical summary of the data can be obtained by
applying the summary() function. It is a generic function that produces
summaries of many kinds of objects, including the results of model fitting
functions. The function invokes particular methods which depend on the class
of the first argument.
Example:
 R

# R program to get the
# summary of the data frame

# creating a data frame
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)

# using summary()
print(summary(friend.data))

Output:
   friend_id friend_name
 Min.   :1   Length:5
 1st Qu.:2   Class :character
 Median :3   Mode  :character
 Mean   :3
 3rd Qu.:4
 Max.   :5
Extract Data from Data Frame
Extracting data from an R data frame means accessing its rows or
columns. One can extract a specific column from an R data frame using its
column name.

Example:
 R

# R program to extract
# data from the data frame

# creating a data frame
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)

# Extracting friend_name column
result <- data.frame(friend.data$friend_name)
print(result)

Output:
  friend.data.friend_name
1                  Sachin
2                  Sourav
3                  Dravid
4                  Sehwag
5                   Dhoni
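The example above extracts a column. As a minimal sketch of the row side, the same square-bracket convention used for matrices also works on data frames; the variable names second_row, late_ids, and names_only are illustrative.

```r
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

# Extract the second row (all columns)
second_row <- friend.data[2, ]
print(second_row)

# Extract the rows whose friend_id is greater than 3
late_ids <- friend.data[friend.data$friend_id > 3, ]
print(late_ids)

# Extract one column by name as a plain vector
names_only <- friend.data[["friend_name"]]
print(names_only)
```

Row extraction returns a data frame, whereas [[ ]] (like $) returns the column as a vector.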
Expand Data Frame in R Language
A data frame in R can be expanded by adding new columns and rows to the
already existing R data frame.
Example:
 R

# R program to expand
# the data frame

# creating a data frame
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav",
                  "Dravid", "Sehwag",
                  "Dhoni"),
  stringsAsFactors = FALSE
)

# Expanding data frame
friend.data$location <- c("Kolkata", "Delhi",
                          "Bangalore", "Hyderabad",
                          "Chennai")
resultant <- friend.data

# print the modified data frame
print(resultant)

Output:
  friend_id friend_name  location
1         1      Sachin   Kolkata
2         2      Sourav     Delhi
3         3      Dravid Bangalore
4         4      Sehwag Hyderabad
5         5       Dhoni   Chennai
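The example above adds a column. Adding a row can be sketched with rbind(), provided the new row is a data frame with the same column names. The new entry below (id 6, name "Kohli") is a made-up value used only for illustration.

```r
friend.data <- data.frame(
  friend_id = c(1:5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  stringsAsFactors = FALSE
)

# The new row must have the same column names as the data frame
new_row <- data.frame(
  friend_id = 6L,
  friend_name = "Kohli",   # hypothetical entry for illustration
  stringsAsFactors = FALSE
)

# Append the row
expanded <- rbind(friend.data, new_row)
print(expanded)
```

If the column names differ, rbind() stops with an error rather than guessing how to align the columns.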
In R, one can perform various types of operations on a data frame,
such as accessing rows and columns, selecting a subset of the data
frame, editing data frames, and deleting rows and columns.
Please refer to DataFrame Operations in R to know about all types of
operations that can be performed on a data frame.
Remove Rows and Columns
Rows and columns can also be removed from an already existing R data
frame.

 R

library(dplyr)

# Create a data frame
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai")
)

# Remove a row with friend_id = 3
data <- subset(data, friend_id != 3)

# Remove the 'location' column
data <- select(data, -location)

# print the modified data frame
print(data)

Output:
  friend_id friend_name
1         1      Sachin
2         2      Sourav
4         4      Sehwag
5         5       Dhoni

In the above code, we first created a data frame called data with three
columns: friend_id, friend_name, and location. To remove a row
with friend_id equal to 3, we used the subset() function and specified the
condition friend_id != 3. This removed the row with friend_id equal to 3.
To remove the location column, we used the select() function and
specified -location. The – sign indicates that we want to remove
the location column. The resulting data frame data will have only two
columns: friend_id and friend_name.
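The same result can be reached without the dplyr package. The following is a sketch of a base R alternative, not a replacement for the approach above: a logical row index removes the row, and assigning NULL removes the column.

```r
data <- data.frame(
  friend_id = c(1, 2, 3, 4, 5),
  friend_name = c("Sachin", "Sourav", "Dravid", "Sehwag", "Dhoni"),
  location = c("Kolkata", "Delhi", "Bangalore", "Hyderabad", "Chennai")
)

# Keep only the rows whose friend_id is not 3
data <- data[data$friend_id != 3, ]

# Assigning NULL to a column removes it
data$location <- NULL

print(data)
```

This version has no package dependency, which can be convenient in short scripts.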

Classes in R Programming
Classes and Objects are basic concepts of Object-Oriented Programming
that revolve around real-life entities. Everything in R is an object.
An object is simply a data structure that has some methods and attributes.
A class is just a blueprint or a sketch of these objects. It represents the set
of properties or methods that are common to all objects of one type.
Unlike most other programming languages, R has three class systems.
These are S3, S4, and Reference Classes.
S3 Class
S3 is the simplest yet the most popular OOP system and it lacks formal
definition and structure. An object of this type can be created by just adding
an attribute to it. Following is an example to make things more clear:
Example:

# create a list with required components
movieList <- list(name = "Iron man", leadActor = "Robert Downey Jr")

# give a name to your class
class(movieList) <- "movie"

movieList

Output:
$name
[1] "Iron man"

$leadActor
[1] "Robert Downey Jr"

attr(,"class")
[1] "movie"
In the S3 system, methods don’t belong to the class. They belong to generic
functions. This means that, unlike in C++ or Java, methods are not declared
inside a class definition. Instead, we define what a generic function (for
example print()) does when it is applied to our objects. Until such a method
is defined, print() falls back to the default list output:

print(movieList)

Output:
$name
[1] "Iron man"

$leadActor
[1] "Robert Downey Jr"

attr(,"class")
[1] "movie"
Example: Creating a user-defined print function

# now let us write our method
print.movie <- function(obj) {
  cat("The name of the movie is", obj$name, ".\n")
  cat(obj$leadActor, "is the lead actor.\n")
}

# printing the object now dispatches to print.movie()
print(movieList)

Output:
The name of the movie is Iron man .
Robert Downey Jr is the lead actor.
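Besides adding methods to existing generics such as print(), a new generic function can be defined with UseMethod(). In the sketch below, the generic describe() and its methods are hypothetical names invented for illustration; only UseMethod() and structure() are part of base R.

```r
# An S3 object of class "movie", built in one step with structure()
movieList <- structure(
  list(name = "Iron man", leadActor = "Robert Downey Jr"),
  class = "movie"
)

# A new generic: dispatch is decided by the class of the first argument
describe <- function(obj, ...) UseMethod("describe")

# Method used for objects of class "movie"
describe.movie <- function(obj, ...) {
  paste(obj$name, "stars", obj$leadActor)
}

# Fallback used for any other class
describe.default <- function(obj, ...) {
  "No description available"
}

print(describe(movieList))
print(describe(42))
```

The method name follows the pattern generic.class, which is how S3 finds the right function to call.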
S4 Class
Programmers coming from languages like C++ or Java might find S3 very
different from their usual idea of classes, as it lacks the structure that
classes are supposed to provide. S4 improves on S3 in that its classes have
a formal definition, which gives a proper structure to its objects.
Example:

library(methods)

# definition of S4 class

setClass("movies", slots=list(name="character", leadActor = "character"))

# creating an object using new() by passing class name and slot values

movieList <- new("movies", name="Iron man", leadActor = "Robert Downey Jr")


movieList

Output:
An object of class "movies"
Slot "name":
[1] "Iron man"
Slot "leadActor":
[1] "Robert Downey Jr"
As shown in the above example, setClass() is used to define a class
and new() is used to create the objects.
The concept of methods in S4 is similar to S3, i.e., they belong to generic
functions. Before a method is defined, displaying the object uses the
default output:

movieList

Output:
An object of class "movies"
Slot "name":
[1] "Iron man"

Slot "leadActor":
[1] "Robert Downey Jr"
Example:

# using setMethod to set a method
setMethod("show", "movies",
  function(object) {
    cat("The name of the movie is", object@name, ".\n")
    cat(object@leadActor, "is the lead actor.\n")
  }
)

movieList

Output:
The name of the movie is Iron man .
Robert Downey Jr is the lead actor.
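The formal structure of S4 can be seen when working with slots. The sketch below re-declares the movies class so that it runs on its own, reads and updates a slot with the @ operator, and shows that a value of the wrong type is rejected; the variable names m and res are illustrative.

```r
library(methods)

setClass("movies", slots = list(name = "character", leadActor = "character"))
m <- new("movies", name = "Iron man", leadActor = "Robert Downey Jr")

# Slots are read and written with the @ operator
print(m@name)
m@name <- "Iron Man"

# is() checks the class of an object
print(is(m, "movies"))

# A slot value of the wrong type is rejected when the object is created
res <- tryCatch(
  new("movies", name = 123, leadActor = "Robert Downey Jr"),
  error = function(e) "rejected"
)
print(res)
```

This type checking is the main practical difference from S3, where any list can be given any class attribute.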
Reference Class
Reference Class is an improvement over S4 Class. Here the methods belong
to the classes. These are very similar to the object-oriented classes of
other languages.
Defining a Reference class is similar to defining S4 classes. We
use setRefClass() instead of setClass() and “fields” instead of “slots”.
Example:

library(methods)

# setRefClass returns a generator

movies <- setRefClass("movies", fields = list(name = "character",

leadActor = "character", rating = "numeric"))

#now we can use the generator to create objects

movieList <- movies(name = "Iron Man",

leadActor = "Robert downey Jr", rating = 7)


movieList

Output:
Reference class object of class "movies"
Field "name":
[1] "Iron Man"
Field "leadActor":
[1] "Robert downey Jr"
Field "rating":
[1] 7
Now let us see how to add some methods to our class with an example.
Example

library(methods)

movies <- setRefClass("movies",
  fields = list(name = "character",
                leadActor = "character", rating = "numeric"),
  methods = list(
    increment_rating = function() {
      rating <<- rating + 1
    },
    decrement_rating = function() {
      rating <<- rating - 1
    }
  )
)

movieList <- movies(name = "Iron Man",
                    leadActor = "Robert downey Jr", rating = 7)

# print the value of rating
movieList$rating

# increment and then print the rating
movieList$increment_rating()
movieList$rating

# decrement and print the rating
movieList$decrement_rating()
movieList$rating

Output:
[1] 7
[1] 8
[1] 7
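Because Reference Class objects are modified in place, plain assignment does not create a second object. The following sketch, with a reduced set of fields and illustrative variable names, contrasts assignment with the copy() method that every Reference Class object provides.

```r
library(methods)

movies <- setRefClass("movies",
  fields = list(name = "character", rating = "numeric"),
  methods = list(
    increment_rating = function() {
      rating <<- rating + 1
    }
  )
)

a <- movies(name = "Iron Man", rating = 7)

b <- a             # b refers to the same object as a
c2 <- a$copy()     # c2 is an independent copy

a$increment_rating()

print(a$rating)    # 8
print(b$rating)    # 8, because b and a are the same object
print(c2$rating)   # 7, the copy is unaffected
```

This differs from ordinary R values such as vectors and lists, which behave as if copied on assignment.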

Coercion in R Programming
Converting an object from one type or class to another on request is known
as explicit coercion. It is performed with the as.* family of functions, such
as as.character(), as.numeric(), and as.logical().
Difference between conversion, coercion and cast:
Normally, a conversion that happens implicitly is referred to as coercion,
and one that is requested explicitly is known as casting. Conversion covers
both kinds, coercion and casting.
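The implicit case can be sketched in a few lines: when values of different types are combined, R silently converts them to the most general type involved. The variable names are illustrative.

```r
# Mixing numeric, character, and logical values: everything becomes character
v1 <- c(1, "a", TRUE)
print(v1)

# Mixing numeric and logical values: TRUE becomes 1
v2 <- c(1.5, TRUE)
print(v2)

# Logical values used in arithmetic are treated as integers
s <- TRUE + TRUE
print(s)

# The explicit counterpart: the conversion is requested by the programmer
e <- as.integer("42")
print(e)
```

Implicit conversions never raise an error, which is convenient but can hide mistakes such as a number stored as text.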
Explicit coercion to character
Base R provides the function as.character() for this purpose (base R has no
as.string() function).
If v is a vector that needs to be converted into character, it can be
converted as:
 as.character(v)

 R

# Creating a vector
x <- c(0, 1, 0, 3)

# Converting it to character type
as.character(x)

Output:
[1] "0" "1" "0" "3"
Explicit coercion to numeric and logical
They are all as.* functions with only one parameter, that is, the vector
which is to be converted.

Function      Description
as.logical    Converts the value to logical type.
               If 0 is present then it is converted to FALSE
               Any other number is converted to TRUE
as.integer    Converts the object to integer type
as.double     Converts the object to double precision type
as.complex    Converts the object to complex type
as.list       Converts a vector (or other list-like object) to a list

 R

# Creating a vector
x <- c(0, 1, 0, 3)

# Checking its class
class(x)

# Converting it to integer type
as.integer(x)

# Converting it to double type
as.double(x)

# Converting it to logical type
as.logical(x)

# Converting it to a list
as.list(x)

# Converting it to complex numbers
as.complex(x)

Output:
[1] "numeric"
[1] 0 1 0 3
[1] 0 1 0 3
[1] FALSE TRUE FALSE TRUE
[[1]]
[1] 0

[[2]]
[1] 1

[[3]]
[1] 0
[[4]]
[1] 3

[1] 0+0i 1+0i 0+0i 3+0i


Producing NAs
NAs are produced during explicit coercion when R cannot convert a value to
the requested type.

 R

# Creating a character vector
x <- c("q", "w", "c")

as.numeric(x)
as.logical(x)

Output:
[1] NA NA NA
Warning message:
NAs introduced by coercion
[1] NA NA NA

Plotting of Data in R Programming – plot() Function
plot() function in R Programming Language is a generic function for
plotting. It can be used to create basic graphs of different types.
Syntax: plot(x, y, type)
Parameters
 x and y: coordinates of points to plot
 type: the type of graph to create
Result: draws a plot of the requested type on the active graphics device
Plotting Data in R Programming Language
Example 1: Draw Points using plot() Function in R
 R

plot(3, 4)

Output:

Example 2: Draw Multiple Points


 R

plot(c(1, 3, 4), c(4, 5 , 8))

Output:
Example 3: Sequences of Points

 R

plot(1:20)

Output:
Example 4: R program to plot a graph

 R

# Values for x and y axis

x <- 1:5; y = x * x

# Using plot() function

plot(x, y, type = "l")

plot(x, y, type = "h")

Output:

In the above example type=“l” stands for lines graph, type=“h” stands for
‘histogram’ like vertical lines.
Example 5: R program to plot a different graph
 R

# R program to plot a graph

# Creating x and y-values
x <- 1:5; y = x * x

# Using plot function
plot(x, y, type = "b")
plot(x, y, type = "s")
plot(x, y, type = "p")

Output:
In the above example:
 type = "b" stands for both, meaning points are drawn and connected by lines
 type = "s" indicates stair steps
 type = "p" draws points (the default)
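To compare the type values side by side, the sketch below draws all five types mentioned above into one figure. Writing to a PDF file in the temporary directory is an assumption made here so that the code runs without a display; the file name plot_types.pdf and the loop variable plot_type are illustrative. In an interactive session the pdf() and dev.off() lines can be dropped.

```r
x <- 1:5
y <- x * x

# Open a PDF graphics device (an assumption, so the sketch runs headless)
out_file <- file.path(tempdir(), "plot_types.pdf")
pdf(out_file)

# Arrange the plots in a grid of 2 rows and 3 columns
par(mfrow = c(2, 3))

# Draw the same data once per plot type
for (plot_type in c("p", "l", "b", "h", "s")) {
  plot(x, y, type = plot_type, main = paste("type =", plot_type))
}

# Close the device so the file is written
dev.off()
```

The par(mfrow = ...) setting applies only to the device that is open when it is called, so it does not affect later plots on a new device.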
Example:

 R

# Generate some data
x <- 1:10
y <- x^2

# Create a plot of the data
plot(x, y, type = "l", col = "red", lwd = 2, main = "Quadratic Function",
     xlab = "x", ylab = "y", xlim = c(0, 11), ylim = c(0, 100))

Output:
