R programming unit 1
UNIT - 1
R Programming
R is a software environment used for statistical analysis and graphical representation of data. R
also supports modular programming through functions.
What is R Programming
"R is an interpreted computer programming language which was created by Ross Ihaka and
Robert Gentleman at the University of Auckland, New Zealand." The R Development Core
Team currently develops R. It is also a software environment used to analyze statistical
information, graphical representation, reporting, and data modeling. R is the implementation
of the S programming language, combined with lexical scoping semantics.
R supports not only branching and looping but also modular programming using functions. To
improve efficiency, R can integrate with procedures written in C, C++, .NET, Python, and
FORTRAN.
Today, R is one of the most important tools used by researchers, data analysts,
statisticians, and marketers for retrieving, cleaning, analyzing, visualizing, and presenting data.
History of R Programming
The history of R goes back about 20 to 30 years. R was developed by Ross Ihaka and Robert
Gentleman at the University of Auckland, New Zealand, and the R Development Core Team
currently develops it. The name of the language is taken from the first letter of the first names of
both developers. The project was conceived in 1992. The initial version was released in 1995, and
in 2000 the first stable version, 1.0.0, was released.
The following table shows the version, release date, and description of selected R releases:
Version   Release date   Description
0.49      1997-04-23     R's source was released for the first time, and CRAN (Comprehensive R Archive Network) was started.
2.13      2011-04-14     Added a function that rapidly converts code to byte code.
3.5       2018-04-23     Added new features such as a compact internal representation of integer sequences and a new serialization format.
Features of R programming
R is a domain-specific programming language designed for data analysis. It has some
unique features which make it very powerful, the most important arguably being the notion of
vectors. Vectors allow us to perform a complex operation on a set of values in a single
command. The main features of R programming are:
1. It is a simple and effective programming language which has been well developed.
2. It is data analysis software.
3. It is a well-designed, easy, and effective language which supports user-defined functions, loops,
conditionals, and various I/O facilities.
4. It has a consistent and integrated set of tools for data analysis.
5. R contains a suite of operators for different types of calculation on arrays, lists, and vectors.
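The vector feature mentioned above can be illustrated with a minimal sketch. The prices and the 10% discount are made-up values used only for illustration:

```r
# Hypothetical prices; one command applies the discount to every element.
prices <- c(120, 80, 45.5, 300)
discounted <- prices * 0.9   # element-wise multiplication, no explicit loop

# Aggregate functions also work on the whole vector at once.
total <- sum(discounted)
print(discounted)
print(total)
```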
An important task in data science is the way we deal with the data: importing, cleaning, feature
engineering, and feature selection. It should be our primary focus. A data scientist's job is to
understand the data, manipulate it, and find the best approach. For machine learning, the best
algorithms can be implemented with R. Keras and TensorFlow allow us to build advanced machine
learning models. R also has a package for XGBoost, one of the most successful algorithms in Kaggle
competitions.
R communicates with other languages and can call Python, Java, and C++. The big data world
is also accessible to R: we can connect R to databases and to big data platforms such as Spark or Hadoop.
In brief, R is a great tool to investigate and explore data. Elaborate analyses such as
clustering, correlation, and data reduction are done with R.
Comparison of R and Python
Overview
R: The R Development Core Team currently develops R. R is also a software environment used to analyze statistical information, graphical representation, reporting, and data modeling.
Python: Python was released in 1991. It has a very simple and clean code syntax. It emphasizes code readability, and debugging is also simpler and easier in Python.
Specialties for data science
R: R packages have advanced techniques which are very useful for statistical work. The CRAN task views list many useful R packages. These packages cover everything from Psychometrics to Genetics to Finance.
Python: For finding outliers in a data set, both R and Python are equally good. But for developing a web service that allows people to upload datasets and find outliers, Python is better.
Functionalities
R: For data analysis, R has inbuilt functionalities.
Python: Most of the data analysis functionalities are not inbuilt. They are available through packages like NumPy and Pandas.
Key domains of application
R: Data visualization is a key aspect of analysis. R packages such as ggplot2, ggvis, lattice, etc. make data visualization easier.
Python: Python is better for deep learning because Python packages such as Caffe, Keras, OpenNN, etc. allow the development of deep neural networks in a very simple way.
Availability of packages
R: There are hundreds of packages and ways to accomplish the needed data science tasks.
Python: Python has a few main packages such as viz, scikit-learn, and Pandas for data analysis and machine learning.
Applications of R
R is used in many real-world applications. Some of the popular organizations and products that
use it are as follows:
o Facebook
o Google
o Twitter
o HRDAG
o Sunlight Foundation
o RealClimate
o NDAA
o XBOX ONE
o ANZ
o FDA
Advantages
1) Open Source
An open-source language is a language we can work with without any need for a license or a
fee. R is an open-source language. We can contribute to the development of R by optimizing its
packages, developing new ones, and resolving issues.
2) Platform Independent
R is a platform-independent (cross-platform) programming language, which means its
code can run on all operating systems. R enables programmers to develop software for several
competing platforms by writing a program only once. R runs easily on Windows, Linux,
and Mac.
7) Statistics
R is mainly known as the language of statistics. This is the main reason why R is preferred over
other programming languages for the development of statistical tools.
8) Continuously Growing
R is a constantly evolving programming language: it changes and develops over time. R stays
state of the art by releasing updates whenever new features are added.
Disadvantages
1) Data Handling
In R, objects are stored in physical memory, in contrast with other programming languages
like Python. R uses more memory than Python because it requires the entire data set to be held
in memory at once. It is therefore not an ideal option when we deal with Big Data.
2) Basic Security
R lacks basic security features, which are an essential part of most programming languages such as
Python. Because of this, R has many restrictions; for example, it cannot be embedded in a web application.
3) Complicated Language
R is a complicated language with a steep learning curve. People who don't have
prior knowledge or programming experience may find it difficult to learn R.
4) Weak Origin
A major disadvantage of R is that it does not have support for dynamic or 3D graphics. The reason
behind this is its origin: it shares its origin with a much older programming language, S.
5) Lesser Speed
The R programming language is much slower than other programming languages such as MATLAB
and Python, and R packages are also slower in comparison.
In R, algorithms are spread across different packages. Programmers who have no prior
knowledge of packages may find it difficult to implement algorithms.
Syntax of R Programming
R is a very popular programming language which is widely used in data analysis.
Its syntax is quite simple. "Hello World!" is the basic program in every language, so we will
use it to understand the syntax of R programming. We can write our code either at the command
prompt or in an R script file.
R Command Prompt
To work at the R command prompt, the R environment must already be installed on our system.
After installation, we can start the R command prompt by typing R in the Windows command
prompt. When we press Enter after typing R, it launches the interpreter, and we get a prompt at
which we can write our program.
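A minimal sketch of such a session at the R prompt, assuming the variable is named string:

```r
# Assign the text to a variable named string, then print it.
string <- "Hello World!"
print(string)   # prints [1] "Hello World!"
```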
In the above code, the first statement defines a string variable string, where we assign a string
"Hello World!". The next statement print() is used to print the value which is stored in the variable
string.
R Script File
The R script file is another way to write our programs; we then execute those scripts at the
command prompt with the help of the R interpreter known as Rscript. We make a text file, write
our code in it, and save it with the .R extension as:
Demo.R
To execute this file in Windows and other operating systems, the process remains the same:
we pass the file name to Rscript at the command prompt.
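A minimal sketch of what Demo.R might contain and how it could be run. The script contents are an assumption, reusing the Hello World program from the command prompt section:

```r
# Demo.R: a minimal script (contents assumed for illustration)
string <- "Hello World!"
print(string)

# Run the script from the operating system's command prompt, not from inside R:
#   Rscript Demo.R
```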
Comments
In R programming, comments are programmer-readable explanations in the source code of an
R program. The purpose of adding comments is to make the source code easier to understand.
Comments are ignored by the R interpreter.
R has only single-line comments, which start with the # character; it does not support multi-line
comments. If we want a multi-line comment, we can place the text inside a block that is never
executed, such as if (FALSE) { ... }.
Single-line comment
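A single-line comment starts with the # character. The sketch below, with an illustrative variable x, also shows the if (FALSE) workaround for multi-line comments:

```r
# This is a single-line comment; R ignores everything after the # sign.
x <- 5  # a comment can also follow code on the same line

# A block wrapped in if (FALSE) is parsed but never executed,
# so it can stand in for a multi-line comment.
if (FALSE) {
  "This text is never evaluated,
   even though it spans several lines."
}
print(x)
```

The text inside the if (FALSE) block must still be valid R, which is why it is written as a string here.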
Data Types in R Programming
In R, there are several data types such as integer, character, etc. The operating system allocates
memory based on the data type of the variable and decides what can be stored in the reserved
memory.
There are the following data types which are used in R programming:
Data type   Example              Description
Logical     TRUE, FALSE          A special data type for data with only two possible values, which can be construed as true/false.
Integer     3L, 66L, 2346L       Here, L tells R to store the value as an integer.
Complex     z = 1+2i, t = 7+3i   A complex value in R has a real part and an imaginary part, where i is the imaginary unit.
variable_logical <- TRUE
cat(variable_logical, "\n")
cat("The data type of variable_logical is", class(variable_logical), "\n\n")
When we execute the above program, it gives the following output:
TRUE
The data type of variable_logical is logical
R provides the following data structures:
1. Atomic vector
2. List
3. Array
4. Matrices
5. Data Frame
6. Factors
Vectors
A vector is the most basic data structure in R. There are six types of atomic vectors: logical,
integer, double, complex, character, and raw. "A
vector is a collection of elements which is most commonly of mode character, integer, logical
or numeric". Vectors can be created using the c() function.
nv <- c(1, 2, 3, 4, 5)
cv <- c("apple", "banana", "cherry")
A vector can be one of the following two types:
1. Atomic vector
2. Lists
List
In R, a list is a container. Unlike an atomic vector, a list is not restricted to a single mode:
it can contain a mixture of data types. Lists are also known as generic vectors because the
elements of a list can be any type of R object. "A list is a special type of vector in which each
element can be a different type."
We can create a list with the help of list() or as.list(). We can use vector() to create an empty
list of a required length.
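The three ways of creating a list mentioned above can be sketched as follows; the variable names and values are illustrative:

```r
# list() builds a list from its arguments; the elements may differ in type.
mixed <- list(1L, "two", TRUE, c(4.5, 5.5))

# as.list() turns each element of a vector into a separate list element.
from_vec <- as.list(1:3)

# vector("list", n) creates an empty list of length n (every element is NULL).
empty <- vector("list", 3)

print(length(mixed))
print(length(from_vec))
print(empty)
```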
Arrays
Arrays are another type of data object, which can store data in more than two dimensions.
"An array is a collection of a similar data type with contiguous memory
allocation." For example, if we create an array of dimension (2, 3, 4), it creates four rectangular
matrices of two rows and three columns each.
In R, an array is created with the help of the array() function. This function takes a vector as
input and uses the values in the dim parameter to create an array.
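The (2, 3, 4) array described above can be sketched as follows, using the numbers 1 to 24 as illustrative data:

```r
# 24 values arranged as four 2 x 3 matrices.
arr <- array(1:24, dim = c(2, 3, 4))

print(dim(arr))      # 2 3 4
print(arr[, , 1])    # the first 2 x 3 matrix
print(arr[2, 3, 4])  # last element of the last matrix: 24
```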
Matrices
A matrix is an R object in which the elements are arranged in a two-dimensional rectangular layout.
A matrix contains elements of the same atomic type. For mathematical calculations, we
typically use a matrix containing numeric elements. A matrix is created with the help of the matrix()
function in R.
Syntax
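A minimal sketch of a typical matrix() call; the data values are illustrative:

```r
# matrix(data, nrow, ncol, byrow): byrow = TRUE fills the matrix row by row
# (the default, byrow = FALSE, fills it column by column).
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

print(m)
print(dim(m))    # 2 3
print(m[2, 1])   # element in row 2, column 1: 4
print(t(m))      # transpose: a 3 x 2 matrix
```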
Data Frames
A data frame is a two-dimensional, array-like structure, or we can say it is a table in which each
column contains the values of one variable and each row contains one set of values from each column.
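A minimal sketch of a data frame, using made-up student records (all names and values are hypothetical):

```r
# Each argument becomes a column; all columns must have the same length.
students <- data.frame(
  name   = c("Asha", "Ravi", "Meena"),
  marks  = c(82, 91, 77),
  passed = c(TRUE, TRUE, TRUE)
)

str(students)            # structure: column names and types
print(students$marks)    # one column, returned as a vector
print(students[2, ])     # one row, returned as a data frame
```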
Factors
Factors are data objects that are used to categorize data and store it as levels. Factors can
store both strings and integers. They are useful for columns that have a limited number of unique
values, and they are very useful in data analysis for statistical modeling.
Factors are created with the help of the factor() function, which takes a vector as an input parameter.
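A minimal sketch of creating a factor from a character vector; the size labels are illustrative:

```r
sizes <- c("small", "large", "medium", "small", "large")
f <- factor(sizes)

print(levels(f))      # the distinct categories, sorted alphabetically
print(as.integer(f))  # each value stored as the index of its level
print(table(f))       # how many times each level occurs
```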
Variables in R Programming
Variables are used to store information to be manipulated and referenced in an R program.
An R variable can store an atomic vector, a group of atomic vectors, or a combination of many R
objects.
A language like C++ is statically typed, but R is dynamically typed, which means it checks the
data type when the statement is run. A valid variable name contains letters, numbers, dots, and
underscore characters. A variable name should start with a letter, or with a dot that is not
followed by a number.
Variable name        Validity   Reason
var_name, var.name   Valid      The name contains letters and an underscore or a dot and starts with a letter. A name can also start with a dot, but the dot must not be followed by a number; otherwise the name is invalid.
var_name%            Invalid    In R, we can't use any special character in a variable name except dot and underscore.
.2var_name           Invalid    A variable name cannot start with a dot which is followed by a digit.
var_name2            Valid      The name contains letters, a number, and an underscore, and starts with a letter.
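One way to check these rules is sketched below. It relies on base R's make.names(), which rewrites a string into a syntactically valid name; a name is treated as valid when make.names() leaves it unchanged. The helper is_valid_name is a hypothetical name introduced here for illustration:

```r
# Hypothetical helper: TRUE when the string is already a valid R name.
is_valid_name <- function(x) make.names(x) == x

candidates <- c("var_name", "var.name", "var_name%", ".2var_name", "var_name2")
valid <- is_valid_name(candidates)
print(data.frame(name = candidates, valid = valid))
```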
Assignment of variable
In R programming, there are three operators which we can use to assign values to a variable:
the leftward (<-), rightward (->), and equal-to (=) operators.
Two functions are used to print the value of a variable: print() and cat(). The
cat() function combines multiple values into a continuous print output.
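A minimal sketch of the three assignment operators and the two printing functions; the variable names and values are illustrative:

```r
x <- 10    # leftward assignment
20 -> y    # rightward assignment
z = 30     # equal-to assignment

print(x)                                    # print() shows one object
cat("x =", x, "y =", y, "z =", z, "\n")     # cat() joins several values on one line
```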
When we execute the above code in our R command prompt, it will give us the following output:
R is a dynamically typed language, which means that we can change the data type
of the same variable again and again in our program. Because of this dynamic nature, a variable is
not declared with any data type; it gets its data type from the R object which is assigned to
it.
We can check the data type of the variable with the help of the class() function. Let's see an
example:
variable_y <- 124
cat("The data type of variable_y is", class(variable_y), "\n")
variable_y <- 133L
cat("Next, the data type of variable_y becomes", class(variable_y), "\n")
When we execute the above code in our R command prompt, it will give us the following output:
The data type of variable_y is numeric
Next, the data type of variable_y becomes integer
Keywords in R Programming
if else repeat
NaN NA NA_integer_
Operators in R
In R programming, there are different types of operators, and each operator performs a different
task. For data manipulation, there are also some advanced operators such as the model formula and
list indexing operators. R supports the following types of operators:
1. Arithmetic Operators
2. Relational Operators
3. Logical Operators
4. Assignment Operators
5. Miscellaneous Operators
Arithmetic Operators
Arithmetic operators are the symbols used to represent arithmetic operations. The
operators act on each element of the vector. R supports various arithmetic operators, for example:
6. %/%  This operator performs integer division of the first vector by the second and returns the quotient.
a <- c(2, 3.3, 4)
b <- c(11, 5, 3)
print(a %/% b)
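For context, a short sketch applying R's other arithmetic operators to the same two vectors; every operator works element by element:

```r
a <- c(2, 3.3, 4)
b <- c(11, 5, 3)

print(a + b)    # addition
print(a - b)    # subtraction
print(a * b)    # multiplication
print(a / b)    # division
print(a %% b)   # remainder (modulus)
print(a %/% b)  # integer division: 0 0 1
print(a ^ 2)    # exponentiation
```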
Relational Operators
A relational operator is a symbol which defines some kind of relation between two entities. These
include numerical equalities and inequalities. A relational operator compares each element of the
first vector with the corresponding element of the second vector. The result of each comparison
is a Boolean value. R supports the following relational operators:
1. >  For each position, this operator returns TRUE when the element in the first vector is greater than the corresponding element of the second vector.
a <- c(1, 3, 5)
b <- c(2, 4, 6)
print(a > b)
It will give us the following output:
[1] FALSE FALSE FALSE
2. <  For each position, this operator returns TRUE when the element in the first vector is less than the corresponding element of the second vector.
a <- c(1, 9, 5)
b <- c(2, 4, 6)
print(a < b)
It will give us the following output:
[1] TRUE FALSE TRUE
3. <=  For each position, this operator returns TRUE when the element in the first vector is less than or equal to the corresponding element of the second vector.
a <- c(1, 3, 5)
b <- c(2, 3, 6)
print(a <= b)
It will give us the following output:
[1] TRUE TRUE TRUE
4. >=  For each position, this operator returns TRUE when the element in the first vector is greater than or equal to the corresponding element of the second vector.
a <- c(1, 3, 5)
b <- c(2, 3, 6)
print(a >= b)
It will give us the following output:
[1] FALSE TRUE FALSE
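R also provides the equality (==) and inequality (!=) operators, which work element by element in the same way. The sketch below shows them together with all() and any(), which reduce an element-wise result to a single value:

```r
a <- c(1, 3, 5)
b <- c(2, 3, 6)

print(a == b)        # FALSE TRUE FALSE
print(a != b)        # TRUE FALSE TRUE

# Reduce an element-wise comparison to a single TRUE or FALSE.
print(all(a <= b))   # TRUE: every element of a is <= the matching element of b
print(any(a == b))   # TRUE: at least one pair is equal
```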
Logical Operators
The logical operators allow a program to make a decision on the basis of multiple conditions.
Each operand is treated as a condition which evaluates to TRUE or FALSE, and the values of the
conditions determine the overall value of the expression op1 operator op2.
Logical operators are applicable to vectors whose type is logical, numeric, or complex.
The operators & and | compare each element of the first vector with the corresponding element of
the second vector.
1. &  This operator is known as the Logical AND operator. It takes each element of both vectors in turn and returns TRUE if both elements are TRUE.
a <- c(3, 0, TRUE, 2+2i)
b <- c(2, 4, TRUE, 2+3i)
print(a & b)
It will give us the following output:
[1] TRUE FALSE TRUE TRUE
2. |  This operator is called the Logical OR operator. It takes each element of both vectors in turn and returns TRUE if one of them is TRUE.
a <- c(3, 0, TRUE, 2+2i)
b <- c(2, 4, TRUE, 2+3i)
print(a | b)
It will give us the following output:
[1] TRUE TRUE TRUE TRUE
4. &&  This operator compares two single values and gives TRUE only if both are TRUE. Since R 4.3.0 both operands must have length one, so we select the first element of each vector; older versions used the first element automatically.
a <- c(3, 0, TRUE, 2+2i)
b <- c(2, 4, TRUE, 2+3i)
print(a[1] && b[1])
It will give us the following output:
[1] TRUE
5. ||  This operator compares two single values and gives TRUE if one of them is TRUE. As with &&, both operands must have length one since R 4.3.0.
a <- c(3, 0, TRUE, 2+2i)
b <- c(2, 4, TRUE, 2+3i)
print(a[1] || b[1])
It will give us the following output:
[1] TRUE
Assignment Operators
An assignment operator is used to assign a new value to a variable. In R, these operators are used
to assign values to vectors. There are the following types of assignment operators:
1. <- or = or <<-  These operators are known as left assignment operators.
a <- c(3, 0, TRUE, 2+2i)
b <<- c(2, 4, TRUE, 2+3i)
d = c(1, 2, TRUE, 2+3i)
print(a)
print(b)
print(d)
2. -> or ->>  These operators are known as right assignment operators.
c(3, 0, TRUE, 2+2i) -> a
c(2, 4, TRUE, 2+3i) ->> b
print(a)
print(b)
TRUE/FALSE
The TRUE and FALSE keywords represent the Boolean values true and false. If a
given condition is true, the interpreter returns TRUE; otherwise it returns FALSE.
NULL
In R, NULL represents the null object. NULL is used to represent an undefined or absent value, for
example the result of an expression or function that returns nothing. It is different from NA,
which marks a missing value inside a vector.
Example:
as.null(list(a = 1, b = "c"))
Output:
NULL
Inf and NaN
The is.finite() and is.infinite() functions return a logical vector of the same length as their
argument, indicating which elements are finite or infinite.
Inf and -Inf are positive and negative infinity. NaN stands for 'Not a Number'. NaN applies to
numeric values and to the real and imaginary parts of complex values, but not to the values
of integer vectors.
Usage
is.finite(x)
is.infinite(x)
is.nan(x)

Inf
NaN
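A short sketch of these functions on an illustrative vector that mixes ordinary, infinite, NaN, and missing values:

```r
x <- c(1, Inf, -Inf, NaN, NA)

print(is.finite(x))    # TRUE only for the ordinary number 1
print(is.infinite(x))  # TRUE for Inf and -Inf
print(is.nan(x))       # TRUE only for NaN (NA is not NaN)

print(1 / 0)           # Inf
print(0 / 0)           # NaN
```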
NA
NA is a logical constant of length 1 that contains a missing value indicator. It can be coerced to
any other vector type except raw. There are also the constants NA_integer_,
NA_real_, NA_complex_, and NA_character_, which belong to the other atomic vector types
that support missing values.
Usage
NA
is.na(x)
anyNA(x, recursive = FALSE)

## S3 method for class 'data.frame'
is.na(x)
is.na(x) <- value
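A short sketch of how these functions behave on an illustrative vector with one missing value:

```r
scores <- c(10, NA, 30)

print(is.na(scores))               # FALSE TRUE FALSE
print(anyNA(scores))               # TRUE
print(mean(scores))                # NA: a missing value propagates
print(mean(scores, na.rm = TRUE))  # 20: missing values removed first

# The replacement form marks chosen positions as missing.
marked <- scores
is.na(marked) <- 3
print(marked)                      # 10 NA NA
```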
R Vector
A vector is a basic data structure which plays an important role in R programming.
In R, a sequence of elements which share the same data type is known as a vector. A vector supports
the logical, integer, double, character, complex, and raw data types. The elements contained
in a vector are known as its components. We can check the type of a vector with the help of
the typeof() function.
The length is an important property of a vector. The length of a vector is the number of elements
in it, and it is obtained with the help of the length() function.
Vectors are classified into two kinds: atomic vectors and lists. They have three common
properties: type (typeof()), length (length()), and attributes (attributes()).
There is only one difference between atomic vectors and lists. In an atomic vector, all the elements
are of the same type, but in a list, the elements can be of different data types. In this section, we
discuss only atomic vectors. Lists are discussed in the next topic.
All arguments of the c() function are restricted to a common data type, which is the type of the
returned value. There are various other ways to create a vector in R, which are as follows:
We can create a vector with the help of the colon operator. The syntax of the colon operator is:
z <- x:y
Example:
a <- 4:-10
a
Output
[1] 4 3 2 1 0 -1 -2 -3 -4 -5 -6 -7 -8 -9 -10
In R, we can create a vector with the help of the seq() function, which creates a
sequence of elements as a vector. The seq() function is used in two ways: by setting the step size
with the 'by' parameter or by specifying the length of the vector with the 'length.out' parameter.
Example:
seq_vec <- seq(1, 4, by = 0.5)
seq_vec
class(seq_vec)
Output
[1] 1.0 1.5 2.0 2.5 3.0 3.5 4.0
[1] "numeric"
Example:
seq_vec <- seq(1, 4, length.out = 6)
seq_vec
class(seq_vec)
Output
[1] 1.0 1.6 2.2 2.8 3.4 4.0
[1] "numeric"
Numeric vector
The decimal values are known as numeric data types in R. If we assign a decimal value to any
variable d, then this d variable will become a numeric type. A vector which contains numeric
elements is known as a numeric vector.
Example:
d <- 45.5
num_vec <- c(10.1, 10.2, 33.2)
d
num_vec
class(d)
class(num_vec)
Output
[1] 45.5
[1] 10.1 10.2 33.2
[1] "numeric"
[1] "numeric"
Integer vector
A numeric value without a fractional part can be stored as integer data. R stores integers as
32-bit (4-byte) values. There are two ways to assign an integer value to
a variable: by using the as.integer() function or by appending L to the value.
Example:
d <- as.integer(5)
e <- 5L
int_vec <- c(1, 2, 3, 4, 5)
int_vec <- as.integer(int_vec)
int_vec1 <- c(1L, 2L, 3L, 4L, 5L)
class(d)
class(e)
class(int_vec)
class(int_vec1)
Output
[1] "integer"
[1] "integer"
[1] "integer"
[1] "integer"
Character vector
In R, text is stored as character data. There are two ways to create a
character value: by using the as.character() function or by typing a string between double
quotes ("") or single quotes ('').
Example:
d <- 'shubham'
e <- "Arpita"
f <- 65
f <- as.character(f)
d
e
f
char_vec <- c(1, 2, 3, 4, 5)
char_vec <- as.character(char_vec)
char_vec1 <- c("shubham", "arpita", "nishka", "vaishali")
char_vec
char_vec1
class(d)
class(e)
class(f)
class(char_vec)
class(char_vec1)
Output
[1] "shubham"
[1] "Arpita"
[1] "65"
[1] "1" "2" "3" "4" "5"
[1] "shubham" "arpita" "nishka" "vaishali"
[1] "character"
[1] "character"
[1] "character"
[1] "character"
[1] "character"
Repeating Values:
You can create a vector by repeating values using the rep() function. The following example
creates a vector with five 1s:
Ex:
r <- rep(1, times = 5)
r
Output:
[1] 1 1 1 1 1
Vector Functions:
Vector Length
To find out how many items a vector has, use the length() function:
Ex:
fruits <- c("banana", "apple", "orange")
length(fruits)
Output:
[1] 3
Sort a Vector:
To sort items in a vector alphabetically or numerically, use the sort() function:
Ex:
fruits <- c("banana", "apple", "orange")
n <- c(5, 6, 1, 8)
sort(fruits)
sort(n)
Output:
[1] "apple" "banana" "orange"
[1] 1 5 6 8
Change an item:
To change the value of a specific item, refer to its index number:
Ex:
fruits <- c("banana", "apple", "orange")
fruits[2] <- "Mango"
fruits
Output:
[1] "banana" "Mango" "orange"
We can access the elements of a vector with the help of vector indexing. Indexing denotes the
position where a value in a vector is stored. Indexing can be performed with integer,
character, or logical values.
On an integer index, indexing is performed in the same way as in C, C++, and Java.
There is only one difference: in C, C++, and Java the indexing starts from 0, but in R the
indexing starts from 1. Like in other programming languages, we perform indexing by specifying an
integer value in square brackets [] next to our vector.
Example:
seq_vec <- seq(1, 4, length.out = 6)
seq_vec
seq_vec[2]
Output
[1] 1.0 1.6 2.2 2.8 3.4 4.0
[1] 1.6
In character vector indexing, we assign a unique key (name) to each element of the vector. Each
key identifies one element, which can then be accessed very easily. Let's see an example to
understand how it is performed.
Example:
char_vec <- c("shubham" = 22, "arpita" = 23, "vaishali" = 25)
char_vec
char_vec["arpita"]
Output
 shubham   arpita vaishali
      22       23       25
arpita
    23
In R, various operations can be performed on vectors. We can add, subtract, multiply,
or divide two or more vectors. R plays an important role in data science, where such
operations are required for data manipulation. The following types of operations
are performed on vectors.
1) Combining vectors
The c() function is used not only to create a vector but also to combine vectors.
Combining two or more vectors forms a new vector which contains all the elements of each
vector. Let us see an example of how the c() function combines vectors.
Example:
p <- c(1, 2, 4, 5, 7, 8)
q <- c("shubham", "arpita", "nishka", "gunjan", "vaishali", "sumit")
r <- c(p, q)
r
Because a vector can hold only one data type, the numbers in p are converted to character strings in r.
2) Arithmetic operations
We can perform all the arithmetic operations on vectors. The arithmetic operations are performed
member-by-member on vectors. We can add, subtract, multiply, or divide two vectors. Let us see an
example to understand how arithmetic operations are performed on vectors.
Example:
a <- c(1, 3, 5, 7)
b <- c(2, 4, 6, 8)
a + b
a - b
a * b
a / b
a %% b
Output
[1] 3 7 11 15
[1] -1 -1 -1 -1
[1] 2 12 30 56
[1] 0.5000000 0.7500000 0.8333333 0.8750000
[1] 1 3 5 7
3) Numeric Index
In R, we specify the index between square brackets [ ] to select an element. If the index
is negative, R returns all the values except the one at the specified position. For
example, specifying [-3] prompts R to take the absolute value of -3 and drop the
element which occupies that index. An index beyond the length of the vector returns NA.
Example:
q <- c("shubham", "arpita", "nishka", "gunjan", "vaishali", "sumit")
q[2]
q[-4]
q[15]
Output
[1] "arpita"
[1] "shubham" "arpita" "nishka" "vaishali" "sumit"
[1] NA
4) Range Indexes
A range index is used to slice a vector to form a new vector. For slicing, we use the colon (:)
operator. Range indexes are very helpful in situations involving a large vector. Let us see an
example of how slicing is done with the help of the colon operator to form a new vector.
Example:
q <- c("shubham", "arpita", "nishka", "gunjan", "vaishali", "sumit")
b <- q[2:5]
b
Output
[1] "arpita"   "nishka"   "gunjan"   "vaishali"
5) Out-of-order Indexes
In R, the index vector can be out of order. Below is an example in which a vector is sliced with
the order of the first and second values reversed.
Example:
q <- c("shubham", "arpita", "nishka", "gunjan", "vaishali", "sumit")
b <- q[2:5]
q[c(2, 1, 3, 4, 5, 6)]
Output
[1] "arpita"   "shubham"  "nishka"   "gunjan"   "vaishali" "sumit"
z = c("TensorFlow", "PyTorch")
z
Output
[1] "TensorFlow" "PyTorch"
Once our vector of characters is created, we name the first vector member as "Start" and the second
member as "End" as:
names(z) = c("Start", "End")
z
Output
       Start          End
"TensorFlow"    "PyTorch"
We can then retrieve a member by its name:
z["Start"]
Output
       Start
"TensorFlow"
We can reverse the order with the help of the character string index vector.
z[c("End", "Start")]
Output
         End        Start
   "PyTorch" "TensorFlow"
Applications of vectors
1. In machine learning, vectors are used for principal component analysis. They are extended to
eigenvalues and eigenvectors and then used for performing decomposition in vector spaces.
2. The inputs provided to a deep learning model are in the form of vectors. These vectors
consist of standardized data which is supplied to the input layer of the neural network.
3. Vectors are used in the development of support vector machine algorithms.
4. Vector operations are utilized in neural networks for tasks like image recognition and
text processing.
R Lists
In R, lists are the second type of vector. Lists are R objects which contain elements of
different types such as numbers, vectors, strings, and other lists. A list can also contain a
function or a matrix as an element. In other words, a list is a generic vector which contains
other objects, with components of mixed data types.
Example
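A sketch of code that would produce the output shown next; the variable names are assumptions:

```r
# Three vectors of different types (names assumed for illustration).
vec <- c(3, 4, 5, 6)
char_vec <- c("shubham", "nishka", "gunjan", "sumit")
logic_vec <- c(TRUE, FALSE, FALSE, TRUE)

# A list can hold all three, each keeping its own type.
out_list <- list(vec, char_vec, logic_vec)
print(out_list)
```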
Output:
[[1]]
[1] 3 4 5 6
[[2]]
[1] "shubham" "nishka" "gunjan" "sumit"
[[3]]
[1] TRUE FALSE FALSE TRUE
List Functions:
R provides various functions for working with lists, including:
length(): Returns the number of elements in a list.
names(): Returns or sets the names of the elements in a list.
str(): Displays the structure of a list, showing its elements and data types.
unlist():Converts a list to a vector by flattening it.
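A short sketch of these four functions on a small, made-up list (the element names and values are hypothetical):

```r
person <- list(name = "Asha", age = 30L, scores = c(80, 90))

print(length(person))   # 3 elements
print(names(person))    # "name" "age" "scores"
str(person)             # compact view of each element and its type

# unlist() flattens the list; mixed types are coerced to a common type,
# here character, and the scores become two separate elements.
flat <- unlist(person)
print(flat)
```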
Lists creation
The process of creating a list is similar to creating a vector. In R, a vector is created with the
help of the c() function. Similarly, the list() function is used to create a list.
A list avoids the main restriction of a vector, its single data type: we can add elements of
different data types to a list.
Syntax
list()
Example
list_1 <- list(1, 2, 3)
list_2 <- list("Shubham", "Arpita", "Vaishali")
list_3 <- list(c(1, 2, 3))
list_4 <- list(TRUE, FALSE, TRUE)
list_1
list_2
list_3
list_4
Output:
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[1]]
[1] "Shubham"
[[2]]
[1] "Arpita"
[[3]]
[1] "Vaishali"
[[1]]
[1] 1 2 3
[[1]]
[1] TRUE
[[2]]
[1] FALSE
[[3]]
[1] TRUE
list_data <- list("Shubham", "Arpita", c(1, 2, 3, 4, 5), TRUE, FALSE, 22.5, 12L)
print(list_data)
In the above example, the list() function creates a list with character, logical, numeric, and
vector elements. It gives the following output:
Output:
[[1]]
[1] "Shubham"
[[2]]
[1] "Arpita"
[[3]]
[1] 1 2 3 4 5
[[4]]
[1] TRUE
[[5]]
[1] FALSE
[[6]]
[1] 22.5
[[7]]
[1] 12
Giving a name to list elements
R provides an easy way of accessing elements: giving a name to each element of a
list. By assigning names to the elements, we can access them easily. There are only three
steps to print the list data corresponding to the names:
1. Create a list.
2. Assign names to the list elements with the help of the names() function.
3. Print the list data.
Let us see an example to understand how we can give names to the list elements.
Example
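A sketch of the three steps, reconstructed to match the output shown next; the variable name list_data is an assumption:

```r
# Step 1: create a list holding a vector, a matrix, and a nested list.
list_data <- list(
  c("Shubham", "Nishka", "Gunjan"),
  matrix(c(40, 80, 60, 70, 90, 80), nrow = 2),
  list("BCA", "MCA", "B. tech.")
)

# Step 2: give a name to each element.
names(list_data) <- c("Students", "Marks", "Course")

# Step 3: print the list; each element is shown under its name.
print(list_data)
```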
Output:
$Students
[1] "Shubham" "Nishka" "Gunjan"
$Marks
     [,1] [,2] [,3]
[1,]   40   60   90
[2,]   80   70   80
$Course
$Course[[1]]
[1] "BCA"
$Course[[2]]
[1] "MCA"
$Course[[3]]
[1] "B. tech."
Accessing List Elements
R provides two ways to access the elements of a list. The first is indexing, performed in the
same way as for a vector. The second is access by name, which is possible only with a named
list; we cannot access the elements of a list using names if the list is unnamed.
Let us see an example of both methods to understand how they are used to access list elements.
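A sketch of both access methods on an assumed named list; it is not necessarily the exact code that produced the output below:

```r
# An assumed named list for illustration.
list_data <- list(
  Student = c("Shubham", "Arpita", "Nishka"),
  Marks   = matrix(c(40, 80, 60, 70, 90, 80), nrow = 2),
  Course  = list("BCA", "MCA", "B.tech")
)

# Access by index: [ ] returns a sub-list, [[ ]] returns the element itself.
print(list_data[1])
print(list_data[[1]])

# Access by name: works only because the list is named.
print(list_data["Student"])
print(list_data$Student)
print(list_data[["Course"]][[2]])   # "MCA"
```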
Output:
[[1]]
[1] "Shubham" "Arpita" "Nishka"
[[1]]
[[1]][[1]]
[1] "BCA"
[[1]][[2]]
[1] "MCA"
[[1]][[3]]
[1] "B.tech"
Output:
$Student
[1] "Shubham" "Arpita" "Nishka"
$Student
[1] "Shubham" "Arpita" "Nishka"
$Marks
[,1] [,2] [,3]
[1,] 40 60 90
[2,] 80 70 80
$Course
$Course[[1]]
[1] "BCA"
$Course[[2]]
[1] "MCA"
$Course[[3]]
[1] "B. tech."
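Both access methods can be sketched as follows. This is an illustration under assumed variable
names, not the original listing. Single brackets return a sub-list, while double brackets and $
return the element itself, which is why the printed forms differ.

```r
list_data <- list(c("Shubham", "Arpita", "Nishka"),
                  matrix(c(40, 80, 60, 70, 90, 80), nrow = 2),
                  list("BCA", "MCA", "B.tech"))
# Method 1: access by position
first_as_list <- list_data[1]     # single brackets return a sub-list
first_value   <- list_data[[1]]   # double brackets return the element itself
print(first_as_list)
print(first_value)
# Method 2: access by name (requires a named list)
names(list_data) <- c("Student", "Marks", "Course")
print(list_data["Student"])   # sub-list, printed with its name
print(list_data$Marks)        # the matrix itself
```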
Manipulation of list elements
R allows us to add, delete, or update elements of a list. An element at any position can be updated
by assigning a new value to it, and removed by assigning NULL to it. A new element is usually
added by assigning to the index just past the end of the list. Let us see an example of how to add,
delete, and update the elements of a list.
Example
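A sketch consistent with the output that follows, assuming a three-element named list like the one
used in the earlier examples:

```r
list_data <- list(c("Shubham", "Arpita", "Nishka"),
                  matrix(c(40, 80, 60, 70, 90, 80), nrow = 2),
                  list("BCA", "MCA", "B.tech"))
names(list_data) <- c("Student", "Marks", "Course")
# Add a new element at the end of the list
list_data[4] <- "Moradabad"
print(list_data[4])
# Remove the element again by assigning NULL
list_data[4] <- NULL
print(list_data[4])
# Update the third element by overwriting it
list_data[3] <- "Masters of computer applications"
print(list_data[3])
```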
Output:
[[1]]
[1] "Moradabad"
$<NA>
NULL
$Course
[1] "Masters of computer applications"
You can also remove list items with a negative index. The following example creates a new list
without the "apple" item:
Ex:
lst <- list("apple", "banana", "cherry")
nl <- lst[-1]
nl
Output:
[[1]]
[1] "banana"
[[2]]
[1] "cherry"
A drawback of lists is that arithmetic operations cannot be applied to a list directly. To overcome
this drawback, R provides the unlist() function, which converts a list into a vector so that its
elements can be used in further computation.
The unlist() function takes a list as its argument and returns a vector. Let us see an example of
how the unlist() function is used in R.
Example
# Creating lists.
list1 <- list(1:5)
print(list1)
list2 <- list(10:14)
print(list2)
# Converting the lists to vectors.
v1 <- unlist(list1)
v2 <- unlist(list2)
print(v1)
print(v2)
# Adding the vectors.
result <- v1 + v2
print(result)
Output:
[[1]]
[1] 1 2 3 4 5
[[1]]
[1] 10 11 12 13 14
[1] 1 2 3 4 5
[1] 10 11 12 13 14
[1] 11 13 15 17 19
Merging Lists
R allows us to combine two or more lists into one list. When the lists are passed to the list()
function as arguments, it returns a list whose elements are the original lists, so the result is
nested. To obtain a single flat list that contains all the elements, the lists can be concatenated
with c() instead. Let us see an example of how lists are combined with list().
Example
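A sketch consistent with the output that follows. The names Even_list, Odd_list, merged.list, and
flat.list are assumptions.

```r
Even_list <- list(2, 4, 6, 8, 10)
Odd_list <- list(1, 3, 5, 7, 9)
# list() keeps each input list as one element of the result (a nested list)
merged.list <- list(Even_list, Odd_list)
print(merged.list)
# c() concatenates the elements into one flat list instead
flat.list <- c(Even_list, Odd_list)
```

Here merged.list has two elements, each a list of five numbers, while flat.list has ten elements.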
Output:
[[1]]
[[1]][[1]]
[1] 2
[[1]][[2]]
[1] 4
[[1]][[3]]
[1] 6
[[1]][[4]]
[1] 8
[[1]][[5]]
[1] 10
[[2]]
[[2]][[1]]
[1] 1
[[2]][[2]]
[1] 3
[[2]][[3]]
[1] 5
[[2]][[4]]
[1] 7
[[2]][[5]]
[1] 9
R Arrays
In R, arrays are data objects that can store data in more than two dimensions. An array is created
with the array() function, which takes a vector as input and uses the values of the dim parameter
to shape it into an array.
For example, an array of dimension (2, 3, 4) consists of 4 rectangular matrices, each with 2 rows
and 3 columns.
R Array Syntax
The syntax of R arrays is as follows:
array_name <- array(data, dim = c(row_size, column_size, matrices), dimnames = dim_names)
data
The data is the first argument of the array() function. It is the input vector given to the array.
matrices
This value defines the number of matrices (layers) that the array consists of.
row_size
This value defines the number of rows in each matrix.
column_size
This value defines the number of columns in each matrix.
dim_names
This list, passed to the dimnames parameter, replaces the default names of the rows, columns, and
matrices.
How to create?
In R, array creation is quite simple: we create an array from vectors with the array() function.
Within an array, the data is stored as a stack of matrices. There are only two steps to create an
array:
1. Create the input vectors.
2. Pass the vectors to the array() function together with the dim parameter.
Let us see an example of how an array is built from vectors with the array() function.
Example
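A minimal sketch of the two steps, using the vectors that appear later in this section. The result
name res is an assumption.

```r
# Step 1: create two vectors of different lengths
vec1 <- c(1, 3, 5)
vec2 <- c(10, 11, 12, 13, 14, 15)
# Step 2: pass them to array(); the 9 values fill the first 3 x 3 matrix
# and are recycled to fill the second one
res <- array(c(vec1, vec2), dim = c(3, 3, 2))
print(res)
```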
Output
,,1
[,1] [,2] [,3]
[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15
,,2
[,1] [,2] [,3]
[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15
Naming rows and columns
In R, we can give names to the rows, columns, and matrices of an array. This is done with the
dimnames parameter of the array() function.
Naming the rows and columns is optional. The names only make the rows and columns easier to
tell apart.
Below is an example in which we create an array from two vectors and give names to its rows,
columns, and matrices.
Example
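A sketch of naming the dimensions. The row, column, and matrix names follow the named output
shown later in this section, and the variable names are assumptions.

```r
vec1 <- c(1, 3, 5)
vec2 <- c(10, 11, 12, 13, 14, 15)
# One character vector of names per dimension
row_names <- c("Row1", "Row2", "Row3")
col_names <- c("Col1", "Col2", "Col3")
matrix_names <- c("Matrix1", "Matrix2")
# dimnames takes a list with one entry per dimension, in dimension order
res <- array(c(vec1, vec2), dim = c(3, 3, 2),
             dimnames = list(row_names, col_names, matrix_names))
print(res)
```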
Output
, , Matrix1
, , Matrix2
As in C or C++, the elements of an array are accessed by index, with one index per dimension.
Unlike C and C++, R indices start at 1, and leaving an index empty selects everything along that
dimension.
Let us see an example of how the elements of an array are accessed by index.
Example
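A sketch of index-based access on the named array from the previous example. The calls are chosen
to be consistent with the output that follows, and the variable names are assumptions.

```r
vec1 <- c(1, 3, 5)
vec2 <- c(10, 11, 12, 13, 14, 15)
res <- array(c(vec1, vec2), dim = c(3, 3, 2),
             dimnames = list(c("Row1", "Row2", "Row3"),
                             c("Col1", "Col2", "Col3"),
                             c("Matrix1", "Matrix2")))
print(res)
print(res[3, , 2])    # third row of the second matrix
print(res[1, 3, 1])   # element in row 1, column 3 of the first matrix
print(res[, , 2])     # the whole second matrix
```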
Output
, , Matrix1
     Col1 Col2 Col3
Row1    1   10   13
Row2    3   11   14
Row3    5   12   15
, , Matrix2
     Col1 Col2 Col3
Row1    1   10   13
Row2    3   11   14
Row3    5   12   15
Col1 Col2 Col3
   5   12   15
[1] 13
     Col1 Col2 Col3
Row1    1   10   13
Row2    3   11   14
Row3    5   12   15
Manipulation of elements
An array is made up of matrices stacked along additional dimensions, so operations on the elements
of an array are carried out by accessing the elements of those matrices.
Example
#Creating two vectors of different lengths
vec1 <- c(1, 3, 5)
vec2 <- c(10, 11, 12, 13, 14, 15)
#Taking the vectors as input to the array1
res1 <- array(c(vec1, vec2), dim = c(3, 3, 2))
print(res1)
#Creating two vectors of different lengths
vec1 <- c(8, 4, 7)
vec2 <- c(16, 73, 48, 46, 36, 73)
#Taking the vectors as input to the array2
res2 <- array(c(vec1, vec2), dim = c(3, 3, 2))
print(res2)
#Creating matrices from these arrays
mat1 <- res1[, , 2]
mat2 <- res2[, , 2]
#Adding the matrices element-wise (assumed continuation; the source listing is cut off here)
res3 <- mat1 + mat2
print(res3)
Output
,,1
[,1] [,2] [,3]
[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15
,,2
[,1] [,2] [,3]
[1,] 1 10 13
[2,] 3 11 14
[3,] 5 12 15
,,1
[,1] [,2] [,3]
[1,] 8 16 46
[2,] 4 73 36
[3,] 7 48 73
,,2
[,1] [,2] [,3]
[1,] 8 16 46
[2,] 4 73 36
[3,] 7 48 73
R Matrix
In R, a two-dimensional rectangular data set is known as a matrix. A matrix is created by passing
an input vector to the matrix() function. On R matrices, we can perform addition, subtraction,
multiplication, and division.
In an R matrix, the elements are arranged in a fixed number of rows and columns. The elements
are usually numbers, but whatever the type, all the elements must share a common basic type.
Example
matrix1 <- matrix(c(11, 13, 15, 12, 14, 16), nrow = 2, ncol = 3, byrow = TRUE)
matrix1
Output
     [,1] [,2] [,3]
[1,]   11   13   15
[2,]   12   14   16
As with vectors and lists, R provides a function for creating matrices: the matrix() function, which
plays an important role in data analysis. The syntax of the matrix() function in R is as follows:
Example
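The general call form is matrix(data, nrow, ncol, byrow, dimnames). The sketch below illustrates
these arguments. The names P, Q, row_names, and col_names are assumptions chosen for
illustration.

```r
# data: input vector; nrow/ncol: shape; byrow: fill row by row when TRUE;
# dimnames: list holding the row names and the column names
row_names <- c("row1", "row2", "row3", "row4")
col_names <- c("col1", "col2", "col3")
P <- matrix(5:16, nrow = 4, ncol = 3, byrow = TRUE,
            dimnames = list(row_names, col_names))
print(P)
# With byrow = FALSE (the default) the same data fills column by column
Q <- matrix(5:16, nrow = 4, ncol = 3)
print(Q)
```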
Output
As in C and C++, the elements of a matrix are accessed by index. There are three ways to access
the elements of a matrix:
1. Access the single element in the nth row and mth column.
2. Access all the elements in the nth row.
3. Access all the elements in the mth column.
Let us see an example of each of these three ways.
Example
Output
[1] 12
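The three ways of access can be sketched as follows. The matrix P and its dimension names are
assumptions, chosen so that the first call returns the value 12 shown above.

```r
P <- matrix(5:16, nrow = 4, byrow = TRUE,
            dimnames = list(c("row1", "row2", "row3", "row4"),
                            c("col1", "col2", "col3")))
print(P[3, 2])   # element in the 3rd row and 2nd column
print(P[3, ])    # all elements of the 3rd row
print(P[, 2])    # all elements of the 2nd column
```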
R allows us to modify a matrix. There are several methods of modifying a matrix, which are as
follows:
The first method is to assign a single element at a particular position. Assigning a new value to
that position replaces the old value with the new one. The basic syntax is as follows:
matrix[n, m] <- y
Here, n and m are the row and column of the element, respectively, and y is the value assigned to
that position.
Example
Output
row3 11 12 13
row4 14 15 16
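A sketch of single-element assignment. The matrix P and the new value 20 are assumptions for
illustration.

```r
P <- matrix(5:16, nrow = 4, byrow = TRUE,
            dimnames = list(c("row1", "row2", "row3", "row4"),
                            c("col1", "col2", "col3")))
# Replace the element in row 3, column 2; all other elements are unchanged
P[3, 2] <- 20
print(P)
```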
The third method of matrix modification is the addition of rows and columns with the rbind() and
cbind() functions, which add a row and a column, respectively. Let us see an example of how the
cbind() and rbind() functions work.
Example 1
Output
row4 14 15 16
17 18 19
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
[1,] 5 8 11 14 6 9 12 15 7 10 13 16
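A sketch of rbind() and cbind() on a small matrix. The matrix P, the added values, and the result
names are assumptions for illustration.

```r
P <- matrix(5:16, nrow = 4, byrow = TRUE,
            dimnames = list(c("row1", "row2", "row3", "row4"),
                            c("col1", "col2", "col3")))
# rbind() appends a row; its length must match the number of columns
with_row <- rbind(P, c(17, 18, 19))
print(with_row)
# cbind() appends a column; its length must match the number of rows
with_col <- cbind(P, col4 = c(20, 21, 22, 23))
print(with_col)
```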
Matrix operations
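Let us see an example of how mathematical operations are performed on matrices. Note that *
and / work element by element.
Example 1
#R and S are assumed example matrices; their definitions are missing from the source listing
R <- matrix(c(5:16), nrow = 4, ncol = 3)
S <- matrix(c(1:12), nrow = 4, ncol = 3)
#Addition
sum <- R + S
print(sum)
#Subtraction
sub <- R - S
print(sub)
#Multiplication (element-wise)
mul <- R * S
print(mul)
#Multiplication by constant
mul1 <- R * 12
print(mul1)
#Division (element-wise)
div <- R / S
print(div)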
Output
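Matrix Transposition:
The transpose of a matrix is obtained with the t() function:
result <- t(matrix1)
Matrix Inversion:
The inverse of a square, non-singular matrix is obtained with the solve() function. The 2 x 3 matrix1
defined earlier is not square, so solve() fails on it; square_matrix below stands for any square,
invertible matrix:
inverse <- solve(square_matrix)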
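A short sketch of both operations. The 2 x 2 matrix square_matrix is a hypothetical example chosen
because it is invertible.

```r
matrix1 <- matrix(c(11, 13, 15, 12, 14, 16), nrow = 2, ncol = 3, byrow = TRUE)
transposed <- t(matrix1)   # rows become columns: the result is 3 x 2
print(transposed)
# Hypothetical square, invertible matrix
square_matrix <- matrix(c(2, 1, 1, 3), nrow = 2)
inverse <- solve(square_matrix)
# A matrix multiplied by its inverse gives the identity matrix (up to rounding)
print(square_matrix %*% inverse)
```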
Applications of matrix
1. In geology, matrices are used to record survey data and to plot graphs and statistics, and they
support studies in different fields.
2. A matrix is a representation method that helps in plotting common survey data.
3. In robotics and automation, matrices are the basic elements used to describe robot movements.
4. In economics, matrices are mainly used in calculating gross domestic product, and they also help
in calculating the capability of goods and products.
5. In computer-based applications, matrices play a crucial role in creating realistic-seeming
motion.
R Data Frame
A data frame is a two-dimensional, array-like structure or table in which each column contains the
values of one variable and each row contains one set of values from each column. A data frame is
a special case of a list in which every component has the same length.
Put simply, a data frame stores a data table as a list of vectors of equal length. A matrix can
contain only one type of data, but a data frame can contain different data types such as numeric,
character, and factor.
o Rectangular structure: Data frames are two-dimensional structures with rows and
columns forming a rectangular shape. All columns must have the same number of rows,
which makes them suitable for structured datasets.
o Column names: The column names should be non-empty.
o Row names: The row names should be unique.
o Data types: The data stored in a data frame can be of factor, numeric, or character type.
o Each column contains the same number of data items.
In R, data frames are created with the data.frame() function. This function takes vectors of any
type, such as numeric, character, or integer. In the example below, we create a data frame that
contains an employee id (integer vector), employee name (character vector), salary (numeric
vector), and starting date (Date vector).
Example
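A sketch of the data frame described above, with the values taken from the outputs in this section.
The name emp.data follows the listings further below.

```r
# Each argument becomes one column; all vectors must have the same length
emp.data <- data.frame(
  employee_id = c(1:5),
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.30, 515.20, 611.00, 729.00, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
print(emp.data)
```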
Output
  employee_id employee_name    sal starting_date
1           1       Shubham 623.30    2012-01-01
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
Getting the structure of R Data Frame
In R, we can inspect the structure of a data frame. R provides a built-in function called str(), which
displays the number of observations and variables together with the type and first few values of
each column. In the example below, we create a data frame from vectors of different data types and
display its structure.
Example
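A minimal sketch, assuming the employee data frame emp.data used throughout this section:

```r
emp.data <- data.frame(
  employee_id = c(1:5),
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.30, 515.20, 611.00, 729.00, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Display the structure: dimensions, column names, column types, first values
str(emp.data)
```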
Output
Data can be extracted from a data frame by column name, by row index, or by row and column
index together. Let us see an example of each.
Example
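A sketch of extraction by column name, consistent with the output that follows. The result name
final is taken from the listings below.

```r
emp.data <- data.frame(
  employee_id = c(1:5),
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.30, 515.20, 611.00, 729.00, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Extracting specific columns by name with the $ operator
final <- data.frame(emp.data$employee_id, emp.data$sal)
print(final)
```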
Output
  emp.data.employee_id emp.data.sal
1                    1       623.30
2                    2       515.20
3                    3       611.00
4                    4       729.00
5                    5       843.25
Example
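# Creating the data frame
emp.data <- data.frame(
employee_id = c(1:5),
employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
sal = c(623.30, 515.20, 611.00, 729.00, 843.25),
starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15", "2014-05-11",
"2015-03-27")),
stringsAsFactors = FALSE
)
# Extracting first row from a data frame
final <- emp.data[1,]
print(final)
# Extracting last two rows from a data frame
final <- emp.data[4:5,]
print(final)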
Output
Example
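A sketch of extraction by row and column index together, consistent with the output that follows:

```r
emp.data <- data.frame(
  employee_id = c(1:5),
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.30, 515.20, 611.00, 729.00, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# Rows 2 and 3, columns 1 (employee_id) and 4 (starting_date)
final <- emp.data[c(2, 3), c(1, 4)]
print(final)
```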
Output
employee_id starting_date
2 2 2013-09-23
3 3 2014-11-15
Modification in Data Frame
R allows us to modify a data frame. As with matrices, a data frame is modified through
re-assignment. We can not only add rows and columns but also delete them; adding rows and
columns expands the data frame.
We can
1. Add a column by binding a new column vector, under a new column name, with the cbind()
function.
2. Add rows by building new rows with the same structure as the existing data frame and binding
them with the rbind() function.
3. Delete a column by assigning NULL to it.
4. Delete rows by re-assigning the data frame without them.
Let us see an example of how the rbind() function works and how a data frame is modified.
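The four operations listed above can be sketched as follows. The dept column and the sixth
employee are hypothetical values added for illustration, so the printed result differs from the
output shown below.

```r
emp.data <- data.frame(
  employee_id = c(1:5),
  employee_name = c("Shubham", "Arpita", "Nishka", "Gunjan", "Sumit"),
  sal = c(623.30, 515.20, 611.00, 729.00, 843.25),
  starting_date = as.Date(c("2012-01-01", "2013-09-23", "2014-11-15",
                            "2014-05-11", "2015-03-27")),
  stringsAsFactors = FALSE
)
# 1. Add a column with cbind() (hypothetical department values)
emp.data <- cbind(emp.data, dept = c("IT", "HR", "IT", "Ops", "HR"))
# 2. Add a row with rbind(); the new row must have the same columns
new_row <- data.frame(employee_id = 6L, employee_name = "Vaishali", sal = 547.00,
                      starting_date = as.Date("2015-09-01"), dept = "IT",
                      stringsAsFactors = FALSE)
emp.data <- rbind(emp.data, new_row)
# 3. Delete a column by assigning NULL to it
emp.data$dept <- NULL
# 4. Delete the first row by re-assignment
emp.data <- emp.data[-1, ]
print(emp.data)
```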
Output
  employee_id employee_name    sal starting_date
1           1       Shubham 623.30    2012-01-01
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
  employee_id employee_name    sal starting_date
2           2        Arpita 515.20    2013-09-23
3           3        Nishka 611.00    2014-11-15
4           4        Gunjan 729.00    2014-05-11
5           5         Sumit 843.25    2015-03-27
  employee_id employee_name    sal
1           1       Shubham 623.30
2           2        Arpita 515.20
3           3        Nishka 611.00
4           4        Gunjan 729.00
5           5         Sumit 843.25
Special Values:
NA (Not Available): NA represents a missing value. It indicates the absence of a value and is
widely used in data analysis to mark missing data points.
Ex:
v <- c(1, 2, 3)
v
length(v) <- 4
v
Output:
[1] 1 2 3
[1] 1 2 3 NA
NaN (Not a Number): NaN represents an undefined or unrepresentable result of a numerical
calculation. It arises when a mathematical operation does not produce a valid numeric value.
Ex:
0/0
Output:
[1] NaN
Inf and -Inf (Positive and Negative Infinity): Inf represents positive infinity and -Inf represents
negative infinity. These values arise when a result lies beyond the representable range.
Ex:
2^1024
-2^1024
Output:
[1] Inf
[1] -Inf
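The special values can be detected with the base R functions is.na(), is.nan(), and is.finite(). A
short sketch with a hypothetical vector v:

```r
v <- c(1, NA, 0/0, 1/0)   # an ordinary number, NA, NaN, and Inf
print(is.na(v))       # TRUE for NA and also for NaN
print(is.nan(v))      # TRUE only for NaN
print(is.finite(v))   # TRUE only for ordinary numbers
# Many functions can skip missing values with na.rm = TRUE
print(mean(c(1, 2, NA), na.rm = TRUE))
```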
R factors
A factor is a data structure used for fields that take only a predefined, finite number of values,
that is, for categorical variables. Factors are data objects used to categorize data and store it as
levels. A factor can be built from both integer and string values, and it is useful for columns that
have a limited number of unique values.
Internally, a factor stores unique integers, each associated with a label. The predefined set of
values is known as the levels, and by default R sorts the levels in alphabetical order.
Attributes of a factor
The factor() function accepts the following arguments, which determine the attributes of the factor:
a. x
It is the input vector which is to be transformed into a factor.
b. levels
It is an input vector that represents the set of unique values which are taken by x.
c. labels
It is a character vector of labels for the levels.
d. exclude
It is used to specify the values which we want to be excluded when the levels are formed.
e. ordered
It is a logical value which determines whether the levels are ordered.
f. nmax
It is used to specify an upper bound on the number of levels.
R provides the factor() function to convert a vector into a factor. The syntax of the factor()
function is as follows:
factor_data <- factor(vector)
Example
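A minimal sketch of converting a character vector into a factor. The data values are assumptions
drawn from the names used in this section.

```r
data <- c("Shubham", "Nishka", "Arpita", "Nishka", "Shubham")
print(is.factor(data))          # a plain character vector is not a factor
# Applying the factor function
factor_data <- factor(data)
print(factor_data)
print(is.factor(factor_data))
print(levels(factor_data))      # unique values, sorted alphabetically by default
```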
Output
Example
# Creating a vector as input (values taken from the output below)
data <- c("Shubham", "Nishka", "Arpita", "Nishka", "Shubham", "Sumit", "Nishka",
"Shubham", "Sumit", "Arpita", "Sumit")
# Applying the factor function
factor_data <- factor(data)
#Printing all elements of factor
print(factor_data)
#Accessing 4th element of factor
print(factor_data[4])
#Accessing 5th and 7th element
print(factor_data[c(5, 7)])
#Accessing all elements except 4th one
print(factor_data[-4])
#Accessing elements using logical vector
print(factor_data[c(TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, TRUE)])
Output
[1] Shubham Nishka Arpita Nishka Shubham Sumit Nishka Shubham Sumit
[10] Arpita Sumit
Levels: Arpita Nishka Shubham Sumit
[1] Nishka
Levels: Arpita Nishka Shubham Sumit
[1] Shubham Nishka Arpita Shubham Sumit Nishka Shubham Sumit Arpita
[10] Sumit
Levels: Arpita Nishka Shubham Sumit
Example
# Creating a vector as input (assumed to be the same data as in the previous example)
data <- c("Shubham", "Nishka", "Arpita", "Nishka", "Shubham", "Sumit", "Nishka",
"Shubham", "Sumit", "Arpita", "Sumit")
# Applying the factor function.
factor_data <- factor(data)
#Printing all elements of factor
print(factor_data)
#Change 4th element of factor with "Arpita"
factor_data[4] <- "Arpita"
print(factor_data)
#Change 4th element of factor with "Gunjan"
factor_data[4] <- "Gunjan" # cannot assign values outside levels: gives a warning and NA
print(factor_data)
#Adding the value to the level
levels(factor_data) <- c(levels(factor_data), "Gunjan") #Adding new level
factor_data[4] <- "Gunjan"
print(factor_data)
Output
R provides the gl() function to generate factor levels. This function takes three arguments: n, k,
and labels. Here, n and k are integers that indicate how many levels we want and how many
times each level is repeated.
gl(n, k, labels)
1. n indicates the number of levels.
2. k indicates the number of replications.
3. labels is a vector of labels for the resulting factor levels.
Example
gen_factor <- gl(3, 5, labels = c("BCA", "MCA", "B.Tech"))
gen_factor
Output
[1] BCA BCA BCA BCA BCA MCA MCA MCA MCA MCA
[11] B.Tech B.Tech B.Tech B.Tech B.Tech
Levels: BCA MCA B.Tech
An object is simply a collection of data (variables) and methods (functions). A class is a
blueprint for that object.
Class System in R
While most programming languages have a single class system, R has three class systems:
S3 Class
S4 Class
Reference Class
S3 Class in R
The S3 class is the most popular class system in the R programming language. Most of the classes
that come predefined in R are of this type.
To create an S3 object, we first create a list with various components and then assign it a class
with the class() function. For example,
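A minimal sketch of this pattern, using the component values shown in the output below and the
class name Student_Info used in the explanation that follows:

```r
# Create a list with three components
student1 <- list(name = "John", age = 21, GPA = 3.5)
# Assigning a class name turns the list into an S3 object of that class
class(student1) <- "Student_Info"
print(student1)
```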
Output
$name
[1] "John"
$age
[1] 21
$GPA
[1] 3.5
attr(,"class")
[1] "Student_Info"
In the above example, we have created a list named student1 with three components. Notice how
the class is created: class(student1) <- "Student_Info".
Here, Student_Info is the name of the class. To create an object of this class, we have passed
the student1 list inside class() and assigned the class name to it.
The result is an object of the Student_Info class named student1.
S4 Class in R
The S4 class system is an improvement over S3. S4 classes have a formally defined structure, which
helps make objects of the same class look more or less alike.
An S4 class is defined with the setClass() function. In the example below, we define a class named
Student_Info with three slots (member variables): name, age, and GPA.
To create an object, we use the new() function. Inside new(), we provide the name of the class,
"Student_Info", and a value for each of the three slots. This creates the object named student1.
Example: S4 Class in R
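A minimal sketch of the definition and object creation described above. The slot types are
assumptions inferred from the values.

```r
library(methods)   # S4 support; usually attached already
# Define the class with three typed slots
setClass("Student_Info",
         representation(name = "character", age = "numeric", GPA = "numeric"))
# Create an object by naming the class and giving a value for every slot
student1 <- new("Student_Info", name = "John", age = 21, GPA = 3.5)
print(student1)
```

Slots are read with the @ operator, for example student1@age.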
Output
An object of class "Student_Info"
Slot "name":
[1] "John"
Slot "age":
[1] 21
Slot "GPA":
[1] 3.5
Here, we have created an S4 class named Student_Info using the setClass() function and an object
named student1 using the new() function.
Reference Class in R
Reference classes were introduced later than the other two. They are more similar to the
object-oriented programming found in other major programming languages.
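# Define the reference class; setRefClass() returns the generator function
Student_Info <- setRefClass("Student_Info", fields = list(name = "character", age = "numeric", GPA = "numeric"))
# Student_Info() is our generator function which can be used to create new objects
student1 <- Student_Info(name = "John", age = 21, GPA = 3.5)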
Output
In the above example, we have created a reference class named Student_Info using
the setRefClass() function.
And we have used our generator function Student_Info() to create a new object student1 .
S3 Class: Objects are created by setting the class attribute
S4 Class: Objects are created using new()
Reference Class: Objects are created using generator functions
Coercion:
Coercion is the process of converting data from one data type to another, typically so that
operations and calculations between different data types work correctly.
Coercion can be either implicit (automatic) or explicit (user-defined).
Implicit Coercion: In many cases, R performs coercion implicitly when it is needed to make an
expression or operation work. For example, when you add an integer and a numeric (double) value,
R coerces the integer to numeric so that the data types match.
x <- 5L
y <- 3.5
result <- x + y
result
Output: [1] 8.5
Explicit Coercion: You can explicitly coerce data types using functions like:
Function Description
as.numeric It converts its argument to the numeric type
as.character It converts its argument to the character type
as.list It converts a vector or other object to a list
This allows you to convert data from one type to another based on your requirements.
x <- "5"
y <- as.numeric(x)
y
Output: [1] 5
Example:
The following program prints the values and data types of these variables to demonstrate coercion:
#Create a numeric variable
nvar <- 42
#Coerce the numeric variable to a character
cvar <- as.character(nvar)
#Coerce the character variable back to a numeric
nv <- as.numeric(cvar)
#Print the results
cat("Original Numeric Variable:", nvar, "\n")
cat("Coerced to Character:", cvar, "\n")
cat("Coerced back to Numeric:", nv, "\n")
#Check the data types
cat("Data type of numeric_var:", class(nvar), "\n")
cat("Data type of character_var:", class(cvar), "\n")
cat("Data type of numeric_var coerced:", class(nv), "\n")
Output:
Original Numeric Variable: 42
Coerced to Character: 42
Coerced back to Numeric: 42
Data type of numeric_var: numeric
Data type of character_var: character
Data type of numeric_var coerced: numeric
Plotting
R has a number of built-in tools for basic graph types such as histograms, scatter plots, bar charts,
boxplots, and more. Before going through the different types, we start with plot(), a generic
function for plotting x-y data.
Ex:
plot(2, 4)
In the above example, we have used the plot() function to plot one point on a graph.
Here,
2 - specifies the point on the x-axis
4 - specifies the point on the y-axis
To plot multiple points, pass two vectors to plot():
# create a vector x
x <- c(2, 4, 6, 8)
# create a vector y
y <- c(1, 3, 5, 7)
plot(x, y)
In the above example, we plot multiple points on a graph using the plot() function and R
vectors.
Here, we have passed two vectors, x and y, to plot() to plot multiple points.
The first items of x and y, i.e. 2 and 1, plot the first point on the graph, the second items
of x and y plot the second point, and so on.
Note: Make sure both vectors contain the same number of points.
Scatter Plot:
Scatter plots are useful for visualizing the relationship and distribution of data points and for
identifying patterns, clusters, or outliers. A scatter plot is drawn by calling plot() with the
following arguments:
x <- c(1, 2, 3, 4, 5)
x and y are numeric vectors representing the data to be plotted on the x-axis and y-axis, respectively.
pch = 19 sets the type of point used in the plot (a filled circle).
main, xlab, and ylab are used to set the plot's title and axis labels.
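A sketch of a complete scatter plot call using these arguments. The y values and the label texts are
hypothetical.

```r
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 3, 6, 5)   # hypothetical y values
# pch = 19 draws filled circles; main, xlab, and ylab set the title and axis labels
plot(x, y, pch = 19, main = "Scatter Plot", xlab = "X-Axis", ylab = "Y-Axis")
```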
Plot Labels:
The plot() function also accepts other parameters, such as main, xlab, and ylab, to customize the
graph with a main title and labels for the x-axis and y-axis.
plot(1:10, main = "My Graph", xlab = "The x-axis", ylab = "The y-axis")
Line Plot:
A line plot in R displays data points connected by lines. It is a useful visualization for showing
trends and changes in data over time or across a continuous variable.
plot(x, y, type="l", lwd=2, col="red", main="Line Plot", xlab="X-Axis", ylab="Y-Axis")
Bar Chart:
A bar chart, also known as a bar plot or bar graph, is a common type of data visualization in R used to
represent categorical data. It displays data using rectangular bars, where the length of each bar is
proportional to the value it represents.
barplot(y, names.arg=x, col="green", main="Bar Chart", xlab="X-axis", ylab="Y-axis")
Histogram:
A histogram in R is a graphical representation of the distribution of a continuous or discrete dataset. It is a
valuable tool for visualizing the frequency or density of data within specified intervals, or bins.
hist(y, col="purple", main="Histogram", xlab="Value", ylab="Frequency")
Pie Chart
A pie chart is a circular statistical graphic which is divided into slices to illustrate numerical
proportion. Pie charts represent data visually as fractional parts of a whole, which can be an
effective communication tool.
expenditure <- c(600, 300, 150, 100, 200) # hypothetical monthly amounts, one per label
pie(expenditure,
main = "Monthly Expenditure Breakdown",
labels = c("Housing", "Food", "Clothes", "Entertainment", "Other")
)
Boxplot:
A boxplot, also known as a box-and-whisker plot, is a graphical representation of the distribution
of a dataset. It displays the median, the quartiles, and potential outliers.
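A minimal sketch with the base R boxplot() function. The scores vector is hypothetical, and it
includes one extreme value so that an outlier point appears on the plot.

```r
scores <- c(40, 60, 90, 80, 70, 80, 150)   # hypothetical values; 150 is extreme
boxplot(scores, main = "Boxplot", ylab = "Value", col = "orange")
# The statistics behind the plot: quartiles, median, and points flagged as outliers
print(summary(scores))
print(boxplot.stats(scores)$out)
```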