R Programming 101 Part 1

Uploaded by

PavaniPaladugu
R Programming 101: Nuts and Bolts


Jocelyn Mara
Discipline of Sport and Exercise Science
R
• A free programming language and software environment
• Primarily used for statistical computing and graphics
• Uses a command-line interface for most tasks
• RStudio provides a graphical interface (but still relies heavily on the command line)
• Runs on Windows, macOS and Linux
• Users can use the built-in functions or create their own

https://www.r-project.org
First step

Download and install R and RStudio using the guide provided
RStudio
The Prompt >

• Informally stands for “what’s next”


• R is waiting for you to give it some instructions
Calculations in R
• We can use R as a calculator
> 4 + 3
[1] 7

> 20 / 5
[1] 4

> 5 * 4
[1] 20

> 64 - 57
[1] 7

> 8 ^ 2
[1] 64
Value assignment
• The <- symbol is the assignment operator

> x <- 7
> print(x)
[1] 7
> x
[1] 7

• The [1] shows that x is a vector and that 7 is its first element


Value assignment

> x <- 1:20

> x
[1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

• The : operator is used to create integer sequences


Value assignment
> x <- 7
> x + 3
[1] 10
> x + y
Error: object 'y' not found
> y <- 3
> x + y
[1] 10
Objects
• Anything we manipulate/analyse/encounter in R is an object
• Single values (e.g. x <- 7)
• Vectors and the structures built on them (e.g. numeric vectors, matrices, data frames)
• Plots
Object Classes
Classes in R describe the type of values within an object
• Numeric (real numbers, e.g. 2.73)
• Integer (whole numbers, e.g. 2, 7, 68)
• Factor (categories, e.g. 1 = Male, 2 = Female)
• Logical (TRUE/FALSE)
• Character (e.g. "Hello World")
• Complex (e.g. 2 + 1i)
Object Classes
• If you want a number to be stored as an integer, add the suffix ‘L’ (e.g. 7L)
Object Classes
• If you want to check the class of an object you can use the class function

> class(x)
[1] "numeric"
> class(y)
[1] "integer"
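
The slides do not show how x and y were created for the class() check above, so the following is only a minimal sketch of the L suffix. The object names a and b are placeholders chosen for illustration.

```r
# A number typed without a suffix is stored as numeric (double precision)
a <- 7
class(a)   # "numeric"

# Adding the L suffix stores the same value as an integer
b <- 7L
class(b)   # "integer"

# The two objects still compare as equal in value
a == b     # TRUE
```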
Vectors
• Vectors are objects which contain multiple values of the same class (with
the exception of a list and dataframe)

> x <- rnorm(n = 20)

> x
[1] 1.31 2.35 0.91 -1.06 -0.54 0.86 -0.15 1.19 -1.40 -0.60 -1.44 -1.70 2.02
[14] -0.50 1.72 0.23 -0.61 -2.78 1.41 -1.57

> class(x)
[1] "numeric"
Vectors

> y <- x > 0

> y
[1] TRUE TRUE TRUE FALSE FALSE TRUE FALSE TRUE FALSE FALSE FALSE
[12] FALSE TRUE FALSE TRUE TRUE FALSE FALSE TRUE FALSE

> class(y)
[1] "logical"
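
One common use of a logical vector like this is counting or picking out the values that met the condition. The sketch below uses a few fixed numbers instead of rnorm() so the result is reproducible; the names scores and positive are made up for the example.

```r
# Fixed values (hypothetical) so the output does not change between runs
scores <- c(1.31, -1.06, 0.86, -0.15)

# Comparing a numeric vector with 0 gives one TRUE/FALSE per element
positive <- scores > 0
class(positive)     # "logical"

# TRUE counts as 1 and FALSE as 0, so sum() counts the TRUE values
sum(positive)       # 2

# Indexing with a logical vector keeps only the elements marked TRUE
scores[positive]    # 1.31 0.86
```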
Creating Vectors
• Use the c() function to create vectors (combine values)

> x <- c(12.3, 27.8)
> x
[1] 12.3 27.8

> x <- c(TRUE, FALSE, TRUE)
> x
[1] TRUE FALSE TRUE

> x <- c("This", "is", "Fun!")
> x
[1] "This" "is" "Fun!"
Mixing Classes
• When values of different classes are mixed in a vector, coercion occurs so
that every element in the vector is of the same class

> x <- c(12.3, TRUE, "foo")
> class(x)
[1] "character"
> x
[1] "12.3" "TRUE" "foo"

> x <- c(TRUE, 1.7, FALSE)
> class(x)
[1] "numeric"
> x
[1] 1.0 1.7 0.0
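
The two examples above follow a general ordering: when classes are mixed, R coerces towards the more general one (logical, then integer, then numeric, then character). A short sketch of that ordering:

```r
# logical + integer -> integer (TRUE becomes 1L)
class(c(TRUE, 2L))          # "integer"

# integer + numeric -> numeric
class(c(2L, 3.5))           # "numeric"

# numeric + character -> character
class(c(3.5, "a"))          # "character"

# Because TRUE is 1 and FALSE is 0, arithmetic on logicals works too
sum(c(TRUE, FALSE, TRUE))   # 2
```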
Explicit Coercion
• Objects can be explicitly coerced from one class to another using the as.*
functions
> x <- 0:6
> class(x)
[1] "integer"
> as.numeric(x)
[1] 0 1 2 3 4 5 6
> as.logical(x)
[1] FALSE TRUE TRUE TRUE TRUE TRUE TRUE
> as.character(x)
[1] "0" "1" "2" "3" "4" "5" "6"
Explicit Coercion
• A coercion that doesn’t make sense will result in NAs

> x <- c("a", "b", "c")

> as.numeric(x)
[1] NA NA NA
Warning message:
NAs introduced by coercion
> as.logical(x)
[1] NA NA NA

• as.numeric( ) warns when it introduces NAs; as.logical( ) returns NA without a warning
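
As a small sketch of what explicit coercion does with partly valid input (the vector x here is made up for the example), only the elements that cannot be converted become NA:

```r
# Two strings look like numbers, one does not
x <- c("12", "7.5", "abc")

# suppressWarnings() hides the "NAs introduced by coercion" warning
nums <- suppressWarnings(as.numeric(x))
nums          # 12.0  7.5   NA
is.na(nums)   # FALSE FALSE TRUE

# as.logical() only understands a few spellings such as "TRUE", "false", "T"
as.logical(c("TRUE", "false", "T", "yes"))   # TRUE FALSE TRUE NA
```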
Matrices
• A matrix is a vector with a dimensions attribute (nrow, ncol)
• Here x is the vector of 20 random values created with rnorm( ) earlier

> mat <- matrix(x, nrow = 5, ncol = 4)

> mat
      [,1]  [,2]  [,3]  [,4]
[1,]  1.31  0.86 -1.44  0.23
[2,]  2.35 -0.15 -1.70 -0.61
[3,]  0.91  1.19  2.02 -2.78
[4,] -1.06 -1.40 -0.50  1.41
[5,] -0.54 -0.60  1.72 -1.57
Matrices
• Use the dim function to check the dimensions of a matrix

> mat <- matrix(x, nrow = 5, ncol = 4)

> dim(mat)

[1] 5 4
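
The matrix above was filled column by column, which is the default. A minimal sketch of that behaviour, using 1:6 rather than random values so the layout is easy to follow (m_col, m_row and v are placeholder names):

```r
# Default: values fill down the first column, then the second, and so on
m_col <- matrix(1:6, nrow = 2, ncol = 3)
m_col[1, ]   # first row: 1 3 5

# byrow = TRUE fills across the first row instead
m_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
m_row[1, ]   # first row: 1 2 3

# A matrix really is a vector plus a dim attribute:
# setting dim on a plain vector gives the same object as matrix()
v <- 1:6
dim(v) <- c(2, 3)
identical(v, m_col)   # TRUE
```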
Dataframes
• Are lists of equal-length columns, with variable names and row names stored as attributes

• Arranged with each column as a variable and each row a case

> df <- as.data.frame(mat)

> df
     V1    V2    V3    V4
1  1.31  0.86 -1.44  0.23
2  2.35 -0.15 -1.70 -0.61
3  0.91  1.19  2.02 -2.78
4 -1.06 -1.40 -0.50  1.41
5 -0.54 -0.60  1.72 -1.57
Dataframes
• Can contain different classes

• But all values within a column (variable) have the same class

> df
Subject Position Distance
1 Centre 1200
2 Back 1759
3 Forward 1680
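
The slide does not show how this data frame was built, so the following is only one way it could be created. It assumes Subject is an ID column holding 1 to 3, and it sets stringsAsFactors explicitly because the default differs between R versions.

```r
# Each argument becomes one column; all columns must have the same length
df <- data.frame(
  Subject  = 1:3,                              # assumed ID column
  Position = c("Centre", "Back", "Forward"),
  Distance = c(1200, 1759, 1680),
  stringsAsFactors = FALSE                     # keep Position as character
)

# Different columns can have different classes
class(df$Subject)    # "integer"
class(df$Position)   # "character"
class(df$Distance)   # "numeric"

dim(df)              # 3 rows, 3 columns
```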
Dataframes
• Use the dim function to check dimensions
• Use the names function to check variable names

> dim(df)

[1] 5 4

> names(df)

[1] "V1" "V2" "V3" "V4"


Lists
• Vectors that can contain elements of different classes

> x <- list(c(17.1, 23.2), TRUE, "a")


> x
[[1]]
[1] 17.1 23.2

[[2]]
[1] TRUE

[[3]]
[1] "a"
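
The double brackets in the printed output are also how individual list elements are pulled out. A short sketch using the same list:

```r
x <- list(c(17.1, 23.2), TRUE, "a")

# Double brackets return the element itself
x[[1]]             # 17.1 23.2
class(x[[1]])      # "numeric"

# Single brackets return a smaller list containing that element
class(x[1])        # "list"

# Each element keeps its own class
sapply(x, class)   # "numeric" "logical" "character"
length(x)          # 3
```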
Attributes
• Names (variable names or dim names)
• Dimensions (nrow, ncol)
• Length (n values if vector with no dim or a matrix, ncol if dataframe)
• Class
Attributes
• Use the attributes function to check the attributes of an object

> attributes(df)

$names
[1] "V1" "V2" "V3" "V4"

$row.names
[1] 1 2 3 4 5

$class
[1] "data.frame"
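
To illustrate how length and dimensions differ between a matrix and a data frame, here is a small sketch. It uses 1:20 in place of the random values, so mat and df here are stand-ins for the objects on the earlier slides.

```r
mat <- matrix(1:20, nrow = 5, ncol = 4)

# A matrix carries a dim attribute, and its length is the number of values
attributes(mat)$dim   # 5 4
length(mat)           # 20

# A data frame's length is its number of columns
df <- as.data.frame(mat)
length(df)            # 4
dim(df)               # 5 4
names(df)             # "V1" "V2" "V3" "V4"
```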
Missing Values
• Missing values represented by NA

• NaN (not a number) is used for undefined mathematical operations (e.g. 0/0)

• is.na( ) is used to test if there are missing values in an object

• is.nan( ) is used to test for NaN

• A NaN value is also NA, but an NA is not NaN


Missing Values
> x <- c(1, 2, NA, 10, 3)
> is.na(x)
[1] FALSE FALSE TRUE FALSE FALSE
> is.nan(x)
[1] FALSE FALSE FALSE FALSE FALSE
> y <- c(1, 2, NaN, NA, 4)
> is.na(y)
[1] FALSE FALSE TRUE TRUE FALSE
> is.nan(y)
[1] FALSE FALSE TRUE FALSE FALSE
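
Missing values matter in practice because they propagate through calculations. A minimal sketch using the same x as above, together with the na.rm argument of mean( ):

```r
x <- c(1, 2, NA, 10, 3)

# Count the missing values
sum(is.na(x))           # 1

# Any calculation involving NA returns NA by default
mean(x)                 # NA

# na.rm = TRUE drops the missing values first: (1 + 2 + 10 + 3) / 4
mean(x, na.rm = TRUE)   # 4

# NaN counts as missing, but NA does not count as NaN
is.na(0 / 0)            # TRUE
is.nan(NA)              # FALSE
```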
Basic Functions
So far in this lesson I’ve used:

• class( )
• rnorm( )
• c( )
• as.numeric( )
• as.logical( )
• as.character( )
• as.data.frame( )
• matrix( )
• dim( )
• attributes( )
• is.na( )
Basic Functions
Rather than doing this to find the mean...

> (1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9 + 10) / 10
[1] 5.5

... I can do this:

> x <- 1:10
> mean(x)
[1] 5.5
Basic Functions
Some other examples:

• sd( )
• min( )
• max( )
• median( )
• range( )
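
As a quick sketch, here are those functions applied to the numbers 1 to 10:

```r
x <- 1:10

min(x)      # 1
max(x)      # 10
range(x)    # smallest and largest value: 1 10
median(x)   # 5.5
sd(x)       # sample standard deviation, roughly 3.03
```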
Basic Functions
Functions have this format:

function-name(arg 1, arg 2, ...)

Example:
mean(x, trim = 0, na.rm = FALSE)

• mean: the function name, which describes what we’re doing
• x: the object we’re applying the function to
• trim = 0, na.rm = FALSE: other arguments
Function Arguments
• Functions have named arguments which sometimes have default values

mean(x, trim = 0, na.rm = FALSE)

• If I just typed mean(x), this would be equivalent to mean(x, trim = 0, na.rm = FALSE)
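
Defaults only apply until you override them. A small sketch with a made-up vector containing one extreme value, showing what changing trim does:

```r
x <- c(1, 2, 3, 4, 100)

# With the defaults (trim = 0), every value is used
mean(x)               # 22

# trim = 0.2 drops the lowest and highest 20% of values before averaging,
# so here only 2, 3 and 4 are used
mean(x, trim = 0.2)   # 3
```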


Argument Matching
• Function arguments can be matched by position or by name

• E.g. the following calls are all equivalent:

> mydata <- 1:20


> mean(x = mydata, trim = 0, na.rm = FALSE)
> mean(mydata, 0, FALSE)
> mean(na.rm = FALSE, trim = 0, x = mydata)
> mean(mydata, trim = 0, FALSE)
> mean(mydata)

But mixing positional and named matching too freely makes code harder to read, so keep it simple
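
One way to check that the calls really are equivalent is to store the results and compare them. The variable names below are placeholders.

```r
mydata <- 1:20

by_name     <- mean(x = mydata, trim = 0, na.rm = FALSE)  # all arguments named
by_position <- mean(mydata, 0, FALSE)                     # matched by position
reordered   <- mean(na.rm = FALSE, trim = 0, x = mydata)  # named, any order
defaults    <- mean(mydata)                               # defaults for the rest

by_name                           # 10.5
identical(by_name, by_position)   # TRUE
identical(by_name, reordered)     # TRUE
identical(by_name, defaults)      # TRUE
```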


Function Arguments
• To see the arguments for a function you can use the args( ) function

> args(lm)
function (formula, data, subset, weights, na.action, method = "qr", model = TRUE, x =
FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL, offset, ...)
Function Arguments
• You can also use ?function-name to see more information about the function and its
arguments

> ?mean
The “...” Argument
• Generic functions use “...” so that extra arguments can be passed on to the
functions they call

> x <- rnorm(n = 20)


> y <- rnorm(n = 20)
> args(plot)
function (x, y, ...)
NULL
> plot(x, y)
> plot(x, y, col = "red")


The “...” Argument
• The “...” argument is also necessary when the number of arguments
passed to the function is not known in advance

> args(paste)
function (..., sep = " ", collapse = NULL)
NULL
> paste("This", "is", "Fun", sep = " ", collapse = NULL)
[1] "This is Fun"
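
The same mechanism is available in functions you write yourself. The sketch below defines two hypothetical helpers, centre( ) and count_args( ), which are not part of R or of these slides: the first forwards extra arguments to mean( ), the second accepts any number of arguments.

```r
# Hypothetical helper: subtracts the mean, passing any extras on to mean()
centre <- function(x, ...) {
  x - mean(x, ...)
}

centre(c(1, 2, 3))                     # -1 0 1
centre(c(1, 2, NA, 3), na.rm = TRUE)   # -1 0 NA 1

# Hypothetical helper: "..." collects however many arguments are supplied
count_args <- function(...) {
  length(list(...))
}

count_args("This", "is", "Fun")        # 3
```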
The “...” Argument
• The catch – any arguments coming after the “...” must be explicitly named

> paste("This", "is", "Fun"," ", NULL)


[1] "This is Fun "
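
A smaller sketch of the same catch, using a visible separator so the difference is easy to see:

```r
# Unnamed: "-" is swallowed by "..." and pasted as one more piece of text
paste("a", "b", "-")         # "a b -"

# Named: "-" is matched to sep and used as the separator
paste("a", "b", sep = "-")   # "a-b"
```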
Summary
• Value assignment

• Objects, classes, attributes

• Missing values

• Basic functions and their arguments
