
Introduction To R

The document introduces R programming language. It states that R is a software environment for statistical analysis, graphics, and reporting. It was created by Ross Ihaka and Robert Gentleman at the University of Auckland. R is useful for statistical computation, data visualization, and predictive analysis. It provides an integrated suite of tools for data manipulation, calculation, and graphical displays. The document also discusses how to get started with R, its basic data types and structures, and how to get online help in R.

Introduction to R Programming Language

Olatunbosun R O (200L Student)

Department of Statistics
University of Ibadan

February 6, 2023

Olatunbosun R O (University of Ibadan) Introduction to R Programming Language February 6, 2023 1 / 28


What is R?

• R is a software environment for statistical analysis, graphical
representation and reporting.
• R was created by Ross Ihaka and Robert Gentleman at the University
of Auckland, New Zealand, and is currently developed by the R
Development Core Team.
• The language was named R after the first letter of the first names
of its two authors (Robert Gentleman and Ross Ihaka), and partly as
a play on the name of the Bell Labs language S.
• R is an implementation of the object-oriented statistical programming
language S.
• R is free and easy to learn.
• R runs on all major platforms: Windows, macOS and UNIX/Linux.



Why use R?

When learning data science, many people struggle with choosing which
programming languages to learn. There are many programming languages
available for data science, like R, Python, SAS, Java, and more. There are
many data science software packages to learn, such as SPSS Statistics,
SAS, Minitab, Tableau, Power BI and more.

I recommend learning R for statistics because it was developed for
statistics in the first place. Python is a general-purpose programming
language, so you can develop complete applications and software with it.



R programming is very strong in statistics, so it is ideal for data
exploration or data understanding using descriptive statistics, inferential
statistics, regression analysis, and data visualizations. R is also ideal for
modeling because you can use statistical learning like regressions for
predictive analytics.



The R programming language is best used for the following:
• STATISTICAL COMPUTATION
  • Descriptive statistics
  • Data sampling
  • Correlation analysis
  • Hypothesis testing
  • Statistical inference
• DATA VISUALIZATION
  • Interactive and visual analytics
  • Dashboards
• PREDICTIVE ANALYSIS
  • Pandemic trend
  • Understanding human language
  • Recognizing visual objects



The R environment

R is an integrated suite of software facilities for data manipulation,
calculation and graphical display. It includes
• An effective data handling and storage facility.
• A suite of operators for calculations on arrays, in particular matrices.
• A large, coherent, integrated collection of intermediate tools for data
analysis.
• Graphical facilities for data analysis and display.
• A well-developed, simple and effective programming language which
includes conditionals, loops, user-defined recursive functions and input
and output facilities.



Getting Started with R and online help system

Starting R
• The beauty of R is that it is free, open-source software, so anyone
can use it without a license fee, and it is available online.
• To obtain R for Windows (or Mac), go to the Comprehensive R
Archive Network (CRAN) at http://www.r-project.org or
http://www.cran.r-project.org, then download and install it.
• Once you have installed R, there will be an icon on your desktop.
Double-click it and R will start up. When it does, a window should
appear on your screen.
• The ">" is the prompt symbol displayed by R. This is R's way of telling
you that it is ready for you to type a command.



• If a command is not complete at the end of a line, R displays a
different prompt, by default "+", on the second and subsequent lines
and continues to read input until the command is syntactically
complete.
• To see the list of available datasets, call the data() function with
no arguments: > data()



Getting online help in R
• R has a built-in help facility.
• To get more information on any specific named function, e.g. sqrt
or mean, the command is:
  • > help(sqrt) or
  • > ?sqrt
• For a feature specified by special characters, the argument must be
enclosed in single or double quotes, e.g. "[[".
  • > help("[[")
• This is also necessary for a few words with syntactic meaning,
including if, for, while and function.
  • > help("for")



Command and Execution

Technically, R is an expression language with a very simple syntax. Users
are expected to type inputs (commands) into R in the console window.
Commands
• Consist of expressions or assignments
• Are separated by a semicolon (;) or by a newline
• Can be grouped together using braces ({ and })
• Expressions and commands in R are case-sensitive, e.g. X and x do
not refer to the same variable
• Command lines do not need to be terminated by any special character,
unlike SAS, where each statement ends with a semicolon.
• Anything following the hash (#) character is ignored by R as a comment
• You can use the arrow keys on the keyboard to scroll back to previous
commands.
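
The command rules above can be tried out in a short sketch. The variable names (x, X, a, b, total) are made up for illustration and do not come from the slides.

```r
# Two commands on one line, separated by a semicolon
x <- 5; X <- 10

# R is case-sensitive: x and X are different variables
x == X          # FALSE

# Braces group several commands into one compound expression;
# the value of the group is the value of its last expression
total <- {
  a <- x + 1
  b <- X + 1
  a + b
}
total           # 17
```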



The set of variables/symbols which can be used in R:
• A variable name can be created using letters, digits, the dot (.)
and the underscore (_) character. E.g. Wt.male, Var_name, Var, . . .
• A variable name must start with a letter, or with a dot that is not
followed by a digit. It must not start with a digit or an underscore.
E.g. 2var_name, 1var, .1x, 1.x . . . are not valid.
• Avoid names that are already used by the system, e.g. c, q, t, C, F,
I, T, diff, df, pt.
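
One possible way to check the naming rules above is base R's make.names(), which rewrites a string into a syntactically valid name. The candidate names are the ones listed on this slide, on the assumption that the underscore in Var_name and 2var_name was lost in the original layout. The helper variable names are illustrative only.

```r
# Candidate names from the slide (underscores assumed)
candidates <- c("Wt.male", "Var_name", "Var", "2var_name", "1var", ".1x", "1.x")

# A name is syntactically valid when make.names() leaves it unchanged
is_valid <- make.names(candidates) == candidates

# Show each candidate next to the result of the check
data.frame(name = candidates, valid = is_valid)
```

The first three names pass the check and the last four fail. Note that this only tests syntax: names such as c, t or df are valid, but using them hides existing R functions.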



Basic Data Types in R
There are 5 basic data types in R
• Numeric: The most common data type in R is numeric. A variable
or a series is stored as numeric data if its values are numbers, with
or without decimals, e.g. 3.7, 5, 6, 1.231.
• Integer: Integer data are a special case of numeric data, e.g. the
number of children in a family: 1L, 3L, 6L (the L tells R to store
the value as an integer).
• Complex: Complex numbers with real and imaginary parts, e.g. 1+4i.
• Logical: A logical variable is a variable with only two values: TRUE
or FALSE.
• Character: The character data type is used when storing text, known
as strings in R. The simplest way to store data in character
format is to put quotes around the piece of text, e.g. char = "male".



• If you want to force any kind of data to be stored as character, you
can do it by using the command: as.character()
• Note that everything inside " " or ' ' will be considered as character,
whether or not it looks like text. E.g. '7.5' will be saved as
character.
• Furthermore, as soon as there is at least one character value in a
vector, the whole vector will be considered as character.
E.g. in char_name = c("text", 1.237, 4), the whole vector will be
stored as character.
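
The five basic types and the coercion rule above can be inspected with class(). This is a small sketch. The variable names are illustrative, and char_name assumes the slide's "char name" was written with an underscore.

```r
# One value of each basic type, with the class R reports for it
types <- c(
  class(3.7),      # "numeric"
  class(3L),       # "integer"
  class(1 + 4i),   # "complex"
  class(TRUE),     # "logical"
  class("male")    # "character"
)
types

# Forcing a number to be stored as text
as_text <- as.character(7.5)   # "7.5"

# A single character value turns the whole vector into character
char_name <- c("text", 1.237, 4)
class(char_name)               # "character"
char_name                      # "text" "1.237" "4"
```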



Data structures in R
R has many data structures. These include:
• Vector: A vector is the most common and basic data structure in R
and is pretty much the workhorse of R, e.g. c(3,4,5) or c("male",
"female").
• List: A list is an R object which can contain many different types of
elements, like vectors, functions and even another list, e.g.
list(c(1,2,3), 21.3, sin, c(2,4,1))
• Matrix: A matrix is a two-dimensional rectangular data set. It can be
created using a vector input to the matrix function.
• Data Frames: Data frames are tabular data objects. Unlike in a matrix,
in a data frame each column can contain a different mode of data.
The first column could be numeric while the second column could be
character and the third column could be logical. A data frame is a list
of vectors of equal length. Data frames are created using the
"data.frame()" function.



• Array: While matrices are confined to two dimensions, arrays can have
any number of dimensions. The array function takes a dim attribute which
creates the required number of dimensions.
• Factor: Factors are data objects which are used to categorize the data
and store it as levels. They can store both strings and integers. They
are useful in data analysis for statistical modeling. They are created
using the "factor()" function by taking a vector as input.
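
A minimal sketch of the data structures described above, one object per structure. All object names and values are made up for illustration. stringsAsFactors = FALSE is passed explicitly so that the text column stays character on older R versions as well.

```r
v <- c(3, 4, 5)                              # vector
l <- list(c(1, 2, 3), 21.3, sin, c(2, 4, 1)) # list holding vectors, a number and a function
m <- matrix(1:6, nrow = 2)                   # 2 x 3 matrix built from a vector
a <- array(1:24, dim = c(2, 3, 4))           # three-dimensional array

# Data frame: columns of equal length but different modes
df <- data.frame(
  age = c(21, 35),
  sex = c("male", "female"),
  student = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)
col_classes <- sapply(df, class)   # numeric, character, logical

# Factor: categories stored as levels
f <- factor(c("male", "female", "male"))
levels(f)                          # "female" "male"
dim(a)                             # 2 3 4
```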



Variable Assignment
A variable can be assigned values using the leftward (<-), rightward (->)
and equal-to (=) operators. The value of a variable can be printed using
"print()" or by simply typing the variable name.
> # Assignment using equal operator.
> var1 = c(1, 2, 3)
> var1
[1] 1 2 3

> # Assignment using leftward operator.
> var2 <- c("Rasheed", "Olatunbosun")
> var2
[1] "Rasheed" "Olatunbosun"
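
The transcript above covers the equal-to and leftward operators. The rightward operator and print() could be sketched as follows. The name var3 is an assumption that simply continues the slide's naming.

```r
# Assignment using rightward operator: the value is on the left,
# the variable name on the right
c(TRUE, FALSE) -> var3

# Printing explicitly and by typing the name give the same display
print(var3)
var3
```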



R Operators

OPERATOR   NAME                          EXAMPLE    OUTPUT
+          Addition                      12 + 8     20
-          Subtraction                   12 - 8     4
*          Multiplication                12 * 8     96
/          Division                      12 / 8     1.5
^          Exponent                      12 ^ 2     144
%%         Modulus                       12 %% 8    4
%/%        Integer Division              12 %/% 8   1
==         Equal                         12 == 5    FALSE
!=         Not Equal                     12 != 5    TRUE
> / <      Greater/Less than             12 > 8     TRUE
>=         Greater than or equal to      12 >= 8    TRUE
<=         Less than or equal to         8 <= 12    TRUE
:          Creating a series of numbers  1:3        1 2 3
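
The arithmetic rows of the table can be checked directly in the console. The sketch below also shows how integer division and modulus fit together. The names a, b and rebuilt are illustrative.

```r
a <- 12
b <- 8

a * b      # 96
a / b      # 1.5
a ^ 2      # 144
a %% b     # 4  (remainder)
a %/% b    # 1  (whole-number quotient)

# Quotient and remainder together reconstruct the original number
rebuilt <- (a %/% b) * b + a %% b
rebuilt == a   # TRUE

1:3        # 1 2 3
```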
Objects; Vectors; Generating Sequence

Objects
• The entities that R creates and manipulates are known as objects.
• These may be variables, arrays of numbers, character strings,
functions or more general structures built from such components.
• R keeps any object you create in the workspace.
• To list the objects you have created in a session, use either of the
following commands: > objects() or > ls().
• To remove all the objects in R, type: > rm(list = ls(all = TRUE)).
• To remove specific objects, use: > rm(x, y). Only objects x
and y will be removed.
• To quit the R program, use the close (X) button in the window or
the command q().
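
A small sketch of listing and removing objects as described above. The object names are hypothetical. The command that clears the whole workspace is shown only as a comment so that running the sketch does not delete anything else.

```r
# Create three objects
height_cm <- 172
city <- "Ibadan"
is_student <- TRUE

before <- ls()          # names of the objects currently defined
rm(height_cm, city)     # remove only these two objects
after <- ls()

# Names present before the rm() call but not after it
removed <- setdiff(before, after)
removed

# To clear everything instead (not run here):
# rm(list = ls(all = TRUE))
```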



Vectors
• A vector is the simplest type of object in R. It is simply a list of
items that are of the same type.
• Vectors can be created with the c() (combine) function.
• There are 3 main types of vectors in R.
  • Numeric Vectors
  • Character Vectors
  • Logical Vectors



• Numeric Vectors: A numeric vector is a single entity consisting of an
ordered collection of numbers. E.g. to set up a numeric vector X
consisting of the 5 numbers 10, 6, 3, 6, 22, we use either of the
following commands:
X = c(10, 6, 3, 6, 22)
X

X <- c(10, 6, 3, 6, 22)
X



• Character Vectors: Character values, or strings, are used for storing
text. A string is surrounded by either single quotation marks or double
quotation marks, but is printed using double quotes (or sometimes
without quotes). "hello" is the same as 'hello'. For example, we
create two vector variables called Department and Prog_Language:
Department = c("Statistics", "Mathematics", "Economics")
Department

Prog_Language = c('R', 'Python', 'Java', 'CSS')
Prog_Language



• Logical Vectors: A logical vector is a vector whose elements are
TRUE, FALSE, or NA. TRUE and FALSE are often abbreviated as T
and F respectively. Logical vectors are commonly generated by the
comparison operators <, >, <=, >=, == and !=.
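
As a rough illustration of how a comparison produces a logical vector, including an NA element, consider the sketch below. The scores data and the pass mark of 50 are invented for the example.

```r
# Hypothetical exam scores, one of them missing
scores <- c(45, 72, NA, 88)

# Comparing a numeric vector with a number gives a logical vector
passed <- scores >= 50
passed                               # FALSE TRUE NA TRUE

# TRUE counts as 1 when summed; na.rm = TRUE skips the missing value
n_passed <- sum(passed, na.rm = TRUE)
n_passed                             # 2

# which() keeps only the positions that are TRUE
passing_scores <- scores[which(passed)]
passing_scores                       # 72 88
```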



Logical Operators
• &: Element-wise logical AND operator. It returns TRUE if both
elements are TRUE.
• |: Element-wise logical OR operator. It returns TRUE if at least one
of the elements is TRUE.
• &&: Logical AND operator. It returns TRUE if both statements are
TRUE.
• ||: Logical OR operator. It returns TRUE if at least one of the
statements is TRUE.
• !: Logical NOT operator. It returns FALSE if the statement is TRUE,
and TRUE if the statement is FALSE.
Note: The logical operators && and || work on single values and give a
single value as output. In R versions before 4.3.0 they considered only
the first element of each vector. From R 4.3.0 onward, giving them a
vector with more than one element is an error.



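Example: Consider x = c(9,12,FALSE) and y = c(12,3,TRUE), then print
• x & y
• x | y
• x[1] && y[1]
• x[1] || y[1]
(From R 4.3.0, && and || accept only single values, so the first
elements are selected explicitly.)
x = c(9, 12, FALSE)
y = c(12, 3, TRUE)
x & y
x | y
x[1] && y[1]
x[1] || y[1]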
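
For reference, here is a sketch of what these expressions evaluate to. The result names are illustrative. Because x and y mix numbers with logical values, FALSE and TRUE are stored as 0 and 1, and any non-zero number counts as TRUE in a logical operation.

```r
x <- c(9, 12, FALSE)   # stored as the numbers 9, 12, 0
y <- c(12, 3, TRUE)    # stored as the numbers 12, 3, 1

and_elementwise <- x & y        # TRUE TRUE FALSE
or_elementwise  <- x | y        # TRUE TRUE TRUE

# && and || need single values, so take the first elements
and_first <- x[1] && y[1]       # TRUE
or_first  <- x[1] || y[1]       # TRUE

# ! flips every element
not_and <- !and_elementwise     # FALSE FALSE TRUE
```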



Generating Sequence
R has a number of ways of generating sequences of numbers, including:
• Using the colon (:) operator with numeric data
# Sequence
2:8

12:4

7.2:13.2

2 * 1:10

2 * (1:10)
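
The sketch below stores the results of the colon expressions above so they can be inspected. The variable names are illustrative. It also shows that the colon operator is evaluated before multiplication, which is why the last two commands give the same result.

```r
up   <- 2:8        # 2 3 4 5 6 7 8
down <- 12:4       # counts down from 12 to 4 in steps of 1
frac <- 7.2:13.2   # starts at 7.2 and goes up in steps of 1

length(down)       # 9
length(frac)       # 7

# ':' binds more tightly than '*', so the brackets change nothing
same <- identical(2 * 1:10, 2 * (1:10))
same               # TRUE
```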



• Using the sequence function seq()
seq(1:10)

seq(1, 10)

seq(1, 10, by = 2)

seq(from = 1, to = 10, by = 2)

# A decreasing sequence needs a negative step
seq(from = 10, to = 1, by = -2)
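
A brief sketch of what the seq() calls produce. The variable names are illustrative, and the length.out example is an addition that is not on the slide. It uses seq()'s length.out argument to ask for a fixed number of evenly spaced values.

```r
odd_up   <- seq(from = 1, to = 10, by = 2)    # 1 3 5 7 9
even_dn  <- seq(from = 10, to = 1, by = -2)   # 10 8 6 4 2

# Fix the number of points instead of the step size
quarters <- seq(0, 1, length.out = 5)         # 0 0.25 0.5 0.75 1

# seq(1:10) and seq(1, 10) both give the numbers 1 to 10
all(seq(1:10) == seq(1, 10))                  # TRUE
```

Note that the sequence stops at the last value that does not pass "to", so 10 itself is not part of odd_up.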



Matrix

• In R, matrices are an extension of numeric or character vectors
with dimensions: the number of rows and columns.
• As with vectors, all the elements of a matrix must be of the same
data type.
• A matrix is created using the matrix() function.
• The basic syntax for creating a matrix in R is:
matrix(data, nrow, ncol, byrow, dimnames)
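
As a minimal sketch of the syntax above, the following call fills a 2 x 3 matrix row by row and names its rows and columns. The data values and the names r1, r2, a, b, c are made up for illustration.

```r
m <- matrix(
  data = 1:6,          # the values to place in the matrix
  nrow = 2,            # number of rows
  ncol = 3,            # number of columns
  byrow = TRUE,        # fill row by row (the default fills column by column)
  dimnames = list(c("r1", "r2"), c("a", "b", "c"))
)
m
#    a b c
# r1 1 2 3
# r2 4 5 6

dim(m)          # 2 3
m["r2", "b"]    # 5
```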


