Basics of R Programming

R is a programming language and software environment for statistical analysis and graphical display of data. It allows importing, cleaning, transforming, modeling and visualizing data. R can be used for statistical techniques like statistical tests, classification, clustering and data reduction. R is open-source, works on multiple platforms and has a large community for support. The RStudio IDE provides an integrated development environment for working with R with features to help support the development life cycle of R code, including writing, testing, and debugging code. Common data structures in R include vectors, lists, matrices and data frames which are used to store and manipulate different types of data.

Uploaded by

varadkarpragati
Copyright
© All Rights Reserved
Basics of R programming

Unit 1
What Is R?
● R refers to two things. There is R, the programming language, and R, the
piece of software that you use to run programs written in R.
● R (the language) was created in the early 1990s by Ross Ihaka and Robert
Gentleman, then both working at the University of Auckland.
● R (the software) is a GNU project, reflecting its status as open source
software
● R is an interpreted language (sometimes called a scripting language),
which means that your code doesn’t need to be compiled before you run
it.
● R is often used for statistical computing and graphical presentation to
analyze and visualize data.
Why Use R?
● It is a great resource for data analysis, data visualization, data science
and machine learning
● It provides many statistical techniques (such as statistical tests,
classification, clustering and data reduction)
● It is easy to draw graphs in R, like pie charts, histograms, box plot, scatter
plot, etc
● It works on different platforms (Windows, Mac, Linux)
● It is open-source and free
● It has a large community support
● It has many packages (libraries of functions) that can be used to solve
different problems
Installing R and RStudio
● R (the software) is a GNU project, reflecting its status as important free and
open source software
● RStudio is an R-specific IDE. That means that you lose the ability to code
(easily) in multiple languages, but you do get some features especially for
R.
● Go to the website https://www.rstudio.com/categories/rstudio-ide/
RStudio
● The RStudio IDE layout is divided into 4 parts:
1. The R console
● This is the “interpreter” and runs your code in real time (as opposed to
needing to compile your code and then run it).
● It interprets whatever you write into the console to
➔ Perform basic calculations such as 2 + 2
➔ Assign values to a variable
➔ Apply a function
● This is where code is executed
● Code is not saved on your disk when entered into the console
● If you want to save your code, then use an R script.
2. The R Scripts
● An R analysis script allows you to store your code in a static document that
you can save.
● A script allows you to
➔ Save your code and to share with others (reproducibility)
➔ Try things out interactively and then add/modify to your code in the script
● You can add R code and comments to script files
● You can run the code from your script by highlighting the code and
pressing CMD+Enter (Mac) or Ctrl+Enter (Windows).
● In .R files (R scripts), code is saved on your disk.
3. Environments
● In this section, you can find
➔ Workspace/environment tab, which tells you what objects are in R and what
exists in memory/what is loaded/what you have read in.
➔ History tab which shows previous commands you have run. This is useful
for debugging your code, but don’t rely on it as a script.
4. In the bottom right hand corner, there are several tabs which include
➔ Files - shows the files on your computer in the directory you are working in
➔ Viewer - can view data or R objects
➔ Help - shows help documentation for R commands
➔ Plots - shows plots generated in your R sessions. Can see current and
previous plots, save, and export them to png/pdf formats.
➔ Packages - list of R packages you have installed
Basics of R programming(Using console)
Using the R console as a calculator
In R, we can type our code directly into the console and the result is displayed immediately.
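The console can be tried with a few lines like the following. This is only a small sketch; the variable name total is made up for illustration.

```r
# Typing an expression at the console prints its value immediately.
2 + 2              # a basic calculation
7 - 3
total <- 6 * 7     # assign a value to a (hypothetical) variable
root <- sqrt(total + 7)   # apply a function to a value
print(root)
```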
Basics of R programming(Using Script)
● It is generally better to write your code in a script, so that you can easily
save, edit, and share your code.
● Create an R script by going to File > New File > R script or using the button
that looks like a sheet of paper with a plus in a green circle on the upper left
hand corner of RStudio and selecting R script.
● Comments in R
Comments can be used to explain R code and to make it more readable. They can
also be used to prevent execution when testing alternative code.
Comments start with a #. When executing code, R ignores everything from the #
to the end of the line.
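A short sketch of the three common uses of comments; the variable greeting is a hypothetical name.

```r
# This whole line is a comment and is ignored by R.
greeting <- "Hello World!"   # a comment can follow code on the same line
# print("not run")           # commenting out a line prevents its execution
print(greeting)
```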
● Variables in R
1. Variables are containers for storing data values.
2. R uses = or <- to assign values to a variable name. Note that variable
names are case sensitive, so x and X are different variables in R.
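As an illustration of assignment and case sensitivity, a minimal sketch with made-up variable names might look like this:

```r
name <- "John"   # assignment with <-
age = 40         # assignment with = also works here
x <- 1
X <- 2           # X is a different variable from x
print(name)
print(age)
print(x == X)    # FALSE, because x and X hold different values
```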
1. You can also concatenate, or join, two or more elements, by using the
paste() function.
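A small sketch of paste(), using hypothetical variable names. By default the pieces are joined with a space; the sep argument changes the separator.

```r
text1 <- "R is"
text2 <- "awesome"
sentence <- paste(text1, text2)        # joined with a space by default
label <- paste("item", 1, sep = "_")   # sep sets a different separator
print(sentence)
print(label)
```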
1. A variable can have a short name (like x and y) or a more descriptive name
(age, carname, total_volume). Rules for R variables are:
2. A variable name must start with a letter or a period (.) and can be a
combination of letters, digits, periods (.) and underscores (_).
3. If it starts with a period (.), it cannot be followed by a digit.
4. A variable name cannot start with a number or underscore (_)
5. Variable names are case-sensitive (age, Age and AGE are three different
variables)
6. Reserved words cannot be used as variables (TRUE, FALSE, NULL, if...)
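The rules above can be checked with a sketch like the following. All names are invented for illustration; the invalid ones are left commented out because they would stop the script with a syntax error. The last lines use make.names(), which returns a name unchanged only if it is already syntactically valid.

```r
# Valid names
myvar   <- "John"
my_var  <- "John"
myVar   <- "John"
.myvar  <- "John"
my.var2 <- "John"

# Invalid names (each would be an error if uncommented)
# 2myvar  <- "John"   # starts with a digit
# _my_var <- "John"   # starts with an underscore
# TRUE    <- "John"   # reserved word

candidates <- c("my_var", "2myvar", "_my_var")
is_valid <- make.names(candidates) == candidates
print(is_valid)
```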
● R Data Types
1. Variables can store data of different types, and different types can do
different things.
2. In R, variables do not need to be declared with any particular type, and can
even change type after they have been set
3. Basic data types in R can be divided into the following types:
numeric - Eg. 10.5, 55, 787
integer - Eg. 1L, 55L, 100L, where the letter "L" declares this as an integer
character (a.k.a. string) - Eg. "k", "R is exciting", "FALSE", "11.5"
logical (a.k.a. boolean) - Eg. TRUE or FALSE
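The class() function reports the type of a value, which gives a quick way to try the types listed above. The variable names in this sketch are hypothetical.

```r
x <- 10.5
y <- 55L
s <- "R is exciting"
b <- TRUE
types <- c(class(x), class(y), class(s), class(b))
print(types)

# A variable can change type after it has been set.
x <- "now a string"
print(class(x))
```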
● R Math
1. In R, you can use operators to perform common mathematical operations
on numbers.
2. The standard mathematical operators are +, -, *, /, and ^.
Built-in Math Functions
● R also has many built-in math functions that allow you to perform
mathematical tasks on numbers.
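A brief sketch of the operators and a few of the built-in functions (max(), min(), sqrt(), abs(), ceiling() and floor()); the selection of functions is ours, not necessarily the one shown on the original slide.

```r
# Operators
sum_ab   <- 10 + 5
diff_ab  <- 10 - 5
prod_ab  <- 10 * 5
quot_ab  <- 10 / 5
power_ab <- 2 ^ 3

# Built-in math functions
largest  <- max(5, 10, 15)
smallest <- min(5, 10, 15)
root     <- sqrt(16)
absolute <- abs(-4.7)
up       <- ceiling(1.4)   # round up to the nearest integer
down     <- floor(1.4)     # round down to the nearest integer
print(c(power_ab, largest, smallest, root, absolute, up, down))
```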
R Input & outputs
How to Read User Input in R?
1. readline() function
● Read the input given by the user in the terminal with the readline() function.
● It returns the input as a character string, so numbers must be converted, for example with as.numeric().
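A minimal sketch of readline(). Because readline() only waits for input in an interactive session, the example falls back to a stand-in value ("21", an assumption made purely so the sketch also runs as a script).

```r
if (interactive()) {
  answer <- readline(prompt = "Enter your age: ")
} else {
  answer <- "21"   # stand-in value used when no user is present
}
print(class(answer))        # the input is always a character string
age <- as.numeric(answer)   # convert before doing arithmetic
print(age + 1)
```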
2. scan() function
● Use the scan() function to read user input.
● By default, this function reads numeric values and returns a numeric vector. Other types can be read by setting the what argument.
● If a non-numeric input is given while reading numeric values, the function gives an error.
● To stop taking user input, press the “Enter” key twice (that is, enter an empty line).
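In a console, scan() with no arguments waits for typed values. So that the sketch below runs without a user, it passes the input through the text argument instead; the values themselves are made up.

```r
# Reads numeric values by default and returns a numeric vector.
nums <- scan(text = "3 5 8", quiet = TRUE)
print(nums)
print(sum(nums))

# Setting 'what' lets scan() read other types, here character strings.
words <- scan(text = "apple pear", what = character(), quiet = TRUE)
print(words)
```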
How to Display Output in R?
1. print() function
● Use the print() function to display output in the terminal.
● The print() function is a generic function. This means that it has many
different methods for the different types of objects it may need to print.
● The function takes an object as its argument.
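A small sketch showing print() applied to different kinds of objects; each call is dispatched to the method that fits the object.

```r
print("Hello")                 # a character string
print(3.14)                    # a number
print(c(1, 2, 3))              # a vector
print(data.frame(x = 1:2))     # a data frame is printed as a table
```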
2. cat() function
● Use the cat() function to display strings.
● The cat() function concatenates all of its arguments, separated by a space by default, and
prints the result. Unlike print(), it does not add a newline at the end.
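A short sketch of cat() with a hypothetical variable lang. The "\n" is added by hand because cat() does not end the line on its own.

```r
lang <- "R"
cat("Hello from", lang, "\n")       # arguments are joined with spaces
cat("a", "b", "c", sep = "-")       # sep changes the separator
cat("\n")
```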
R Data Structures
● A data structure is a specialized format for organizing, processing,
retrieving and storing data.
● Some data structures in R are as follows:
1. Vectors
2. Lists
3. Matrices
4. Arrays
5. Data frames
R Data Structures-Vectors=>Creating a vector
1. Vectors
● A vector is simply a list of items that are of the same type.
● To combine the list of items to a vector, use the c() function and separate the items by a
comma.
● Vectors can be created using c(), vector() and scan() functions.
● Vectors cannot
● Eg1. A vector of characters

● Eg2. A vector of numbers


● Eg3. Vector with numerical values in a sequence

● Eg4.Vector of logical values
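The four kinds of vector in Eg1 to Eg4 could be created as in this sketch; the values and variable names are illustrative only.

```r
fruits     <- c("banana", "apple", "orange")   # vector of characters
numbers    <- c(1, 2, 3)                       # vector of numbers
one_to_ten <- 1:10                             # numerical values in a sequence
flags      <- c(TRUE, FALSE, TRUE, FALSE)      # vector of logical values
print(fruits)
print(one_to_ten)
```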

● The vector() function creates a vector of a specified type and length. Each of
the values in the result is zero, FALSE, an empty string, or whatever the
equivalent of “nothing” is for that type.
Example of the vector() function
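A minimal sketch of vector() for three types, showing the "empty" value each one is filled with. The variable names are hypothetical.

```r
zeros  <- vector("numeric", 3)     # filled with 0
falses <- vector("logical", 2)     # filled with FALSE
blanks <- vector("character", 2)   # filled with empty strings
print(zeros)
print(falses)
print(blanks)
```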


Generating Sequenced Vectors
● Create a vector with numerical values in a sequence with the : operator

● To make bigger or smaller steps in a sequence, use the seq() function. The seq()
function has three parameters: from is where the sequence starts, to is where the
sequence stops, and by is the interval of the sequence.
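A sketch of both ways to generate a sequence; the start, stop and step values are chosen only for illustration.

```r
up_by_one   <- 1:10                             # 1, 2, ..., 10
down_by_one <- 5:1                              # the : operator also counts down
steps <- seq(from = 0, to = 100, by = 20)       # 0, 20, 40, 60, 80, 100
print(up_by_one)
print(down_by_one)
print(steps)
```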
Vectors=> length of vector and sorting a vector
● To find out how many items a vector has, use the length() function.

● To sort items in a vector alphabetically or numerically, use the sort()
function
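A small sketch of length() and sort() on made-up vectors. The decreasing argument of sort() reverses the order.

```r
fruits  <- c("banana", "apple", "orange", "mango", "lemon")
numbers <- c(13, 3, 5, 7, 20, 2)

n_fruits       <- length(fruits)
sorted_fruits  <- sort(fruits)                       # alphabetical order
sorted_numbers <- sort(numbers)                      # ascending order
reversed       <- sort(numbers, decreasing = TRUE)   # descending order
print(n_fruits)
print(sorted_fruits)
print(sorted_numbers)
```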
Vectors=> Accessing vector items
● You can access the vector items by referring to its index number inside
brackets [ ]. The first item has index 1, the second item has index 2, and
so on.

● You can also access multiple elements by referring to different index
positions with the c() function
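A sketch of indexing with an illustrative vector. The last line uses a negative index, which is an addition to the text above: it returns every item except the one at that position.

```r
fruits <- c("banana", "apple", "orange", "mango", "lemon")
first  <- fruits[1]           # a single item; indexing starts at 1
picked <- fruits[c(1, 3)]     # several items at once
rest   <- fruits[-1]          # everything except the first item
print(first)
print(picked)
```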
Vectors=> Changing vector items
● To change the value of a specific item, refer to the index number:
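For instance, with a hypothetical fruits vector:

```r
fruits <- c("banana", "apple", "orange")
fruits[1] <- "pear"   # replace the first item
print(fruits)
```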
R Data Structures - LISTS => Creating lists
● A list in R can contain many different data types inside it.
● A list is a collection of data which is ordered and changeable.
● To create a list, use the list() function.
● Lists are often called “recursive vectors” as you can store a list inside
another list.
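A minimal sketch of a list holding several data types, followed by a list stored inside another list. All names and values are illustrative.

```r
mixed <- list("apple", 42, TRUE, c(1, 2, 3))   # different types in one list
print(length(mixed))

# A list inside a list (hypothetical student record)
student <- list(name = "Asha", marks = list(maths = 90, science = 85))
print(student$marks$maths)
```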
LISTS => Access & Change Lists items
● Access Lists
Access a list item by referring to its index number inside brackets. The first item has index
1, the second item has index 2, and so on.
● Change Item Value
To change the value of a specific item, refer to its index number.
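A sketch of access and change on a made-up list. Note one detail that the text does not spell out: single brackets return a list containing the item, while double brackets return the item itself.

```r
thislist <- list("apple", "banana", "cherry")

as_sublist <- thislist[1]     # a list of length 1
as_item    <- thislist[[1]]   # the item itself, here a character string
print(as_item)

thislist[[1]] <- "blackcurrant"   # change the first item
print(thislist[[1]])
```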
LISTS => Access List Item
● To access multiple items of a list, refer to a range or a vector of index positions inside the brackets, for example with the : operator or the c() function
LISTS => Change Item Value & Get length
● To change the value of a specific item, refer to the index number:


● To find out how many items a list has, use the length() function
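For example, with an illustrative list:

```r
thislist <- list("apple", "banana", "cherry")
n_items <- length(thislist)   # number of top-level items in the list
print(n_items)
```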
LISTS => Add Item
● To add an item to the end of the list, use the append() function
● To add an item to the right of a specified index, add "after = index
number" in the append() function
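A sketch of both forms of append() on a made-up list. append() returns a new list and leaves the original unchanged, so the result is stored in new (hypothetical) variables.

```r
thislist <- list("apple", "banana", "cherry")

at_end <- append(thislist, "orange")              # added after the last item
in_middle <- append(thislist, "orange", after = 2)   # added after index 2
print(unlist(at_end))
print(unlist(in_middle))
```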
LISTS => Remove Item
● Method 1: Remove an item from the list using a negative index (-)
● Method 2: Remove an item by assigning NULL to it
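Both methods, sketched on an illustrative list. Method 1 returns a new list without the item, while Method 2 modifies the list in place.

```r
thislist <- list("apple", "banana", "cherry")

# Method 1: a negative index returns the list without that item
without_first <- thislist[-1]
print(unlist(without_first))

# Method 2: assigning NULL deletes the item from the list itself
thislist[[2]] <- NULL
print(unlist(thislist))
```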
LISTS => Join Two Lists
● There are several ways to join, or concatenate, two or more lists in R.
● The most common way is to use the c() function, which combines two
elements together:
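A minimal sketch with two hypothetical lists:

```r
list1 <- list("a", "b", "c")
list2 <- list(1, 2, 3)
joined <- c(list1, list2)   # one list containing the items of both
print(length(joined))
```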
LISTS => List traversal
● You can loop through the list items by using a for loop
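A short sketch of a for loop over an illustrative list. The visited vector is a hypothetical helper that records the order in which items are seen.

```r
thislist <- list("apple", "banana", "cherry")
visited <- character(0)
for (item in thislist) {
  print(item)                    # runs once per list item
  visited <- c(visited, item)
}
```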
Matrices=> Creating
● A collection of values in rows and columns is called a matrix.
● It is a two dimensional data set in which all values must be of the same data type.
● A column is a vertical representation of data, while a row is a horizontal
representation of data.
● A matrix can be created with the matrix() function. Specify the nrow and ncol
parameters to get the number of rows and columns:
● You can also create a matrix with strings
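A sketch of a numeric matrix and a matrix of strings, with made-up values. matrix() fills the cells column by column unless told otherwise.

```r
numbers <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, ncol = 2)
print(numbers)

fruits <- matrix(c("apple", "banana", "cherry", "orange"), nrow = 2, ncol = 2)
print(fruits)
```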
Matrices=> Access Items
● You can access the items by using [ ] brackets. The first number "1" in the
bracket specifies the row-position, while the second number "2" specifies the
column-position
● The whole row can be accessed if you specify a comma after the number in the
bracket:
● The whole column can be accessed if you specify a comma before the number
in the bracket
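The three kinds of access described above, sketched on an illustrative matrix of strings:

```r
fruits <- matrix(c("apple", "banana", "cherry", "orange"), nrow = 2, ncol = 2)

one_item   <- fruits[1, 2]   # row 1, column 2
second_row <- fruits[2, ]    # comma after the number: the whole row
second_col <- fruits[, 2]    # comma before the number: the whole column
print(one_item)
print(second_row)
print(second_col)
```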
Matrices=> Access More Than One Row & Column
● More than one row and column can be accessed if you use the c() function
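A small sketch on a made-up 3 x 3 matrix:

```r
m <- matrix(1:9, nrow = 3, ncol = 3)
two_rows <- m[c(1, 2), ]    # rows 1 and 2, all columns
two_cols <- m[, c(1, 3)]    # all rows, columns 1 and 3
print(two_rows)
print(two_cols)
```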
Matrices=> Add Rows and Columns
● Use the cbind() function to add additional columns in a Matrix. The new
column must have as many values as the matrix has rows.
● Use the rbind() function to add additional rows in a Matrix. The new
row must have as many values as the matrix has columns.
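Both functions, sketched on an illustrative 2 x 3 matrix. Each returns a new matrix, so the results are stored under hypothetical names.

```r
m <- matrix(1:6, nrow = 2, ncol = 3)

wider  <- cbind(m, c(7, 8))        # new column: one value per row
taller <- rbind(m, c(9, 10, 11))   # new row: one value per column
print(dim(wider))
print(dim(taller))
```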
Matrices=> Remove Rows and Columns
● Use the c() function with negative index positions inside [ ] to remove rows and columns in a Matrix
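A sketch of removal by negative indexing on a made-up matrix. The result is a new, smaller matrix.

```r
m <- matrix(1:9, nrow = 3, ncol = 3)
smaller <- m[-c(1), -c(1)]   # drop the first row and the first column
print(smaller)
```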
Matrices=> Check if an Item Exists, length of matrix
● To find out if a specified item is present in a matrix, use the %in% operator

Length of matrix
Use the length() function to find the total number of cells in a Matrix (the number of rows multiplied by the number of columns)
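For example, with an illustrative matrix of strings:

```r
fruits <- matrix(c("apple", "banana", "cherry", "orange"), nrow = 2, ncol = 2)
has_apple <- "apple" %in% fruits   # TRUE if the item occurs anywhere
has_kiwi  <- "kiwi" %in% fruits
n_cells   <- length(fruits)        # rows multiplied by columns
print(has_apple)
print(n_cells)
```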
Matrices=> Number of Rows and Columns, Loop through Matrix
● Use the dim() function to find the number of rows and columns in a Matrix:

Loop through matrix
● You can loop through a Matrix using a for loop. The loop will start at the first row, moving right
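A sketch of dim() and a nested for loop over an illustrative matrix. The outer loop walks the rows and the inner loop walks the columns, so each row is visited from left to right; visited is a hypothetical helper that records the order.

```r
fruits <- matrix(c("apple", "banana", "cherry", "orange"), nrow = 2, ncol = 2)
print(dim(fruits))   # number of rows, then number of columns

visited <- character(0)
for (r in seq_len(nrow(fruits))) {
  for (k in seq_len(ncol(fruits))) {
    print(fruits[r, k])
    visited <- c(visited, fruits[r, k])
  }
}
```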
Matrices=> Combine two Matrices
● Use the rbind() or cbind() function to combine two or more matrices
together
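A minimal sketch with two made-up 2 x 2 matrices:

```r
m1 <- matrix(1:4, nrow = 2, ncol = 2)
m2 <- matrix(5:8, nrow = 2, ncol = 2)

stacked      <- rbind(m1, m2)   # m2 placed below m1
side_by_side <- cbind(m1, m2)   # m2 placed to the right of m1
print(stacked)
print(side_by_side)
```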
Arrays=> Creating Arrays
● A collection of values of similar data type
● Compared to matrices, arrays can have more than two dimensions.
● Use the array() function to create an array, and the dim parameter to
specify the dimensions
● Syntax:
array_name <- array(data, dim = c(row_size, column_size, matrices), dimnames = dim_names)
1. data - The data is the first argument in the array() function. It is an input
vector which is given to the array.
2. dim - A vector giving the size of each dimension. For a three-dimensional array it takes 3 values:
- row_size: defines the number of rows in each matrix of the array.
- column_size: defines the number of columns in each matrix of the array.
- matrices: defines the number of matrices in the array. In R, a multi-dimensional array
consists of several matrices.
3. dimnames - Used to change the default names of rows and columns. It is given as a list.
● Eg1. An array with one dimension with values ranging from 1 to 10

● Eg2. An array with more than one dimension
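Eg1 and Eg2 could look like the following sketch. The values 1 to 24 and the sizes 4, 3 and 2 are chosen only for illustration.

```r
one_dim <- array(1:10)                      # one dimension, values 1 to 10
print(one_dim)

multi <- array(1:24, dim = c(4, 3, 2))      # 2 matrices of 4 rows and 3 columns
print(multi)

# dimnames replaces the default row, column and matrix labels
named <- array(1:8, dim = c(2, 2, 2),
               dimnames = list(c("r1", "r2"), c("c1", "c2"), c("m1", "m2")))
print(named["r1", "c2", "m1"])
```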


Arrays=> Access Array Items
● Access the array elements by referring to the index position. Use the [ ]
brackets to access the desired elements from an array
● The syntax is as follows:
array[row position, column position, matrix level]
● You can also access a whole row or column from a matrix in an array by leaving
the other position empty. The c() function can be used to select the rows or columns.
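A sketch of array access on an illustrative 4 x 3 x 2 array:

```r
multi <- array(1:24, dim = c(4, 3, 2))

one_item  <- multi[2, 3, 2]      # row 2, column 3, second matrix
first_row <- multi[c(1), , 1]    # whole first row of the first matrix
first_col <- multi[, c(1), 1]    # whole first column of the first matrix
print(one_item)
print(first_row)
print(first_col)
```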
Arrays=> check if item exists, no of rows & columns & length of array
● To find out if a specified item is present in an array, use the %in% operator

● Use the dim() function to find the number of rows and columns in an array

● Use the length() function to find the total number of elements in an array
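The three checks, sketched on an illustrative array:

```r
multi <- array(1:24, dim = c(4, 3, 2))

has_two  <- 2 %in% multi     # TRUE if the value occurs anywhere in the array
has_99   <- 99 %in% multi
sizes    <- dim(multi)       # rows, columns, number of matrices
n_values <- length(multi)    # total number of elements
print(sizes)
print(n_values)
```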


Arrays=> Traverse Array Items
● Loop through the array items by using a for loop
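A minimal sketch that visits every element of a small made-up array and adds the values up; total is a hypothetical accumulator.

```r
small <- array(1:8, dim = c(2, 2, 2))
total <- 0
for (value in small) {
  print(value)              # elements are visited in storage order
  total <- total + value
}
print(total)
```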
Data Frames=> Creating Data frames
● Data frames are data displayed in a table format.
● Data frames can also be interpreted as matrices where each column
can be of a different data type.
● Use the data.frame() function to create a data frame
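A sketch of data.frame() with invented column names and values; note that the columns have different types.

```r
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),   # character column
  Pulse    = c(100, 150, 120),                    # numeric column
  Duration = c(60, 30, 45)
)
print(Data_Frame)
```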
Data Frames=> Summarize the Data
● Use the summary() function to summarize the data from a Data Frame
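For example, on a small made-up data frame, summary() reports statistics such as the minimum, mean and maximum for each numeric column:

```r
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse    = c(100, 150, 120),
  Duration = c(60, 30, 45)
)
overview <- summary(Data_Frame)
print(overview)
```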
Data Frames=> Add Rows & columns
● Use the rbind() function to add new rows in a Data Frame
● Use the cbind() function to add new columns in a Data Frame
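A sketch of adding one row and one column to an illustrative data frame. The new row must supply the same columns, and the new column must have one value per row. The Steps column is a hypothetical addition.

```r
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse    = c(100, 150, 120),
  Duration = c(60, 30, 45)
)

new_row  <- data.frame(Training = "Strength", Pulse = 110, Duration = 110)
more_rows <- rbind(Data_Frame, new_row)

more_cols <- cbind(Data_Frame, Steps = c(1000, 6000, 2000))
print(more_rows)
print(more_cols)
```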
Data Frames=> Number of Rows & columns
● Use the dim() function to find the number of rows and columns in a Data
Frame

● You can also use the ncol() function to find the number of columns and nrow()
to find the number of rows
Data Frames=> Data Frame Length
● Use the length() function to find the number of columns in a Data Frame
(similar to ncol())
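These size functions can be compared side by side on a made-up data frame:

```r
Data_Frame <- data.frame(
  Training = c("Strength", "Stamina", "Other"),
  Pulse    = c(100, 150, 120),
  Duration = c(60, 30, 45)
)
sizes  <- dim(Data_Frame)      # rows, then columns
n_rows <- nrow(Data_Frame)
n_cols <- ncol(Data_Frame)
n_len  <- length(Data_Frame)   # same as the number of columns
print(sizes)
```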
Data Frames=> Combining Data Frames
● Use the rbind() function to combine two or more data frames in R vertically
● Use the cbind() function to combine two or more data frames in R
horizontally
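A sketch of both directions with invented data frames. For rbind() the data frames need the same columns; for cbind() they need the same number of rows.

```r
df1 <- data.frame(Training = c("Strength", "Stamina"), Pulse = c(100, 150))
df2 <- data.frame(Training = c("Other", "Strength"), Pulse = c(120, 110))
df3 <- data.frame(Duration = c(60, 30))

vertical   <- rbind(df1, df2)   # rows of df2 placed below df1
horizontal <- cbind(df1, df3)   # columns of df3 placed beside df1
print(vertical)
print(horizontal)
```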
