Introduction to R

Adrian Rohit Dass

Institute of Health Policy, Management, and
Evaluation
Canadian Centre for Health Economics
University of Toronto

September 17th, 2021

Outline
• Why use R?
• R Basics
• R for Database Management
– Reading-in data, merging datasets, reshaping, recoding variables, subsetting data, etc.
• R for Statistical Analysis
– Descriptive and Regression Analysis
• Other topics in R
– Tidyverse
– Parallel Processing
– R Studio
• R Markdown
• Applied Example
• R Resources
Learning Curves of Various Software
Packages

Source: https://sites.google.com/a/nyu.edu/statistical-software-guide/summary
Summary of Various Statistical
Software Packages

Source: https://sites.google.com/a/nyu.edu/statistical-software-guide/summary
Goals of Today’s Talk
• Provide an overview of the use of R for database
management
– By doing so, we can hopefully lower the learning curve
of R, thereby allowing us to take advantage of its “very
strong” data manipulation capabilities
• Provide an overview of the use of R for statistical
analysis
– This includes descriptive analysis (means, standard
deviations, frequencies, etc.) as well as regression
analysis
– R contains a wide number of pre-canned routines that
we can use to implement the method we’d like to use
Part I
R Basics
[Screenshot: the R Command Window and Syntax Window]
Programming Language
• Programming language in R is generally object
oriented
– Roughly speaking, this means that data, variables,
vectors, matrices, characters, arrays, etc. are treated
as “objects” of a certain “class” that are created
throughout the analysis and stored by name.
– We then apply “methods” for certain “generic
functions” to these objects
• Case sensitive (like most statistical software
packages), so be careful
Classes in R
• In R, every object has a class
– For example, character variables are given the
class of factor or character, whereas numeric
variables are given the class of numeric or integer
• Classes determine how objects are handled by
generic functions. For example:
– the mean(x) function will work for integers but not
for factors or characters - which generally makes
sense for these types of variables
Packages available (and loaded) in R by
default
Package Description
base Base R functions (and datasets before R 2.0.0).
compiler R byte code compiler (added in R 2.13.0).
datasets Base R datasets (added in R 2.0.0).
grDevices Graphics devices for base and grid graphics (added in R 2.0.0).
graphics R functions for base graphics.
grid A rewrite of the graphics layout capabilities, plus some support for interaction.
methods Formally defined methods and classes for R objects, plus other programming tools, as described in the Green Book.
parallel Support for parallel computation, including by forking and by sockets, and random-number generation (added in R 2.14.0).
splines Regression spline functions and classes.
stats R statistical functions.
stats4 Statistical functions using S4 classes.
tcltk Interface and language bindings to Tcl/Tk GUI elements.
tools Tools for package development and administration.
utils R utility functions.
Source: https://cran.r-project.org/doc/FAQ/R-FAQ.html

For database management, we usually won’t need to load or install any additional packages,
although we might need the “foreign” package (available in R by default, but not initially loaded)
if we’re working with a dataset from another statistical program (SPSS, SAS, STATA, etc.)
Packages in R
• Functions in R are stored in packages
– For example, the function for OLS (lm) is accessed via
the “stats” package, which is available in R by default
– Only when a package is loaded will its contents be
available. The full list of packages is not loaded by
default for computational efficiency
– Some packages in R are not installed (and thus not
loaded) by default, meaning that we will have to
install the packages that we need beforehand, and
then load them later on
Packages in R (Continued)
• To load a package, type library(packagename)
– Ex: To load the foreign package, I would type library(foreign) before
running any routines that require this package
• To install a package in R:
– Type install.packages("packagename") in command window
– For example, the package for panel data econometrics is plm in R. So, to
install the plm package, I would type install.packages("plm").
• Note that, although installed, a package will not be loaded by default
(i.e. when opening R). So, you’ll need library(package) at the top of
your code (or at least sometime before the package is invoked).
– Some packages will draw upon functions in other packages, so those
packages will need to be installed as well. install.packages() installs
these dependent packages automatically
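A minimal sketch of the install-then-load workflow. The `splines` package is used here only because it ships with R but is not attached at start-up, so the example runs without an internet connection; the `install.packages()` call is left commented out for that reason.

```r
# install.packages("plm")    # run once per machine; needs an internet connection

# splines ships with R but is not attached when R starts
loaded_before <- "package:splines" %in% search()   # typically FALSE in a fresh session

library(splines)             # load (attach) the package

loaded_after <- "package:splines" %in% search()
loaded_after                 # TRUE once the package is loaded
```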
Some Basic Operations in R
• Q: If x = 5, and y = 10, and z = x + y, what is the value of z?
• Let’s get R to do this for us:

• In this example, we really only used the ‘+’ operator,
but note that ‘-’, ‘/’, ‘*’, ‘^’, etc. work the way they
usually do for scalar operations
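The question above can be typed into R as follows (a short sketch; the extra lines simply show the other scalar operators):

```r
x = 5
y = 10
z = x + y
z          # 15

# The other arithmetic operators behave as expected for scalars
y - x      # 5
x * y      # 50
y / x      # 2
x ^ 2      # 25
```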
Some Basic Operations in R
• Now suppose we created the following vectors:
A = (1, 2, 3)    B = (2, 4, 6)
• What is A + B?
– In R, c() is used to combine values into a vector or list. Since
we have multiple values, we need to use it here

• Note that ‘+’, ‘-’, ‘/’, ‘*’, ‘^’ perform element-wise
calculations when applied to vectors. So, vectors should be
the same length (otherwise R recycles the shorter vector).
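The vector sum above can be sketched in R like this, with a few other element-wise operators shown for comparison:

```r
A = c(1, 2, 3)
B = c(2, 4, 6)

A + B    # 3 6 9
A * B    # 2 8 18   (element-wise, not a dot product)
B / A    # 2 2 2
A ^ 2    # 1 4 9
```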
Working with Matrices in R
• A matrix with typical element (i,j) takes the following form:

(1,1) (1,2) (1,3)

(2,1) (2,2) (2,3)
(3,1) (3,2) (3,3)

• Where i = row number and j = column number

• In R, the general formula for extracting elements (i.e. single
entry, rows, columns) is as follows:
– matrixname[row #, column #]
• If we leave the terms in the brackets blank (or leave out the
whole bracket term) R will spit out the whole matrix
Working with Matrices in R (Continued)
• Example: Suppose we had the following matrix:
1 4 7
2 5 8
3 6 9

• To create this matrix in R, type:

> matrix = matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow=3, ncol=3)
• Extract the element in row #2, column #3
> matrix[2,3]
8
• Extract the second row
> matrix[2,]
2 5 8
• Extract the last two columns (since we require multiple columns, we need to use c() here)
> matrix[,c(2,3)]
4 7
5 8
6 9
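The matrix steps above, collected into one runnable sketch. The object is called `mat` here (an assumption made for readability, so that it is not confused with the `matrix()` function); the talk names it `matrix`.

```r
# matrix() fills column by column by default
mat = matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, ncol = 3)
mat

mat[2, 3]          # single element: 8
mat[2, ]           # second row: 2 5 8
mat[, c(2, 3)]     # last two columns: a 3 x 2 matrix with columns (4, 5, 6) and (7, 8, 9)
```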
Working with Matrices in R (Continued)
• Example: Suppose now we had the following vector, with typical element
‘i':
1
2
3

• Extract the third element of the vector

> vector[3]
3
• Suppose the 2nd element should be 5, not 2. How do we correct this
value?
> vector[2] = 5
> vector
1
5
3
But wait a minute…
• Q: If this is a tutorial on the use of R for database
management/statistical analysis, then why are we
learning about vectors/matrices?
• A: The way we work with data in R is very
similar/identical to how we work with
vectors/matrices
– This is different from other statistical software
packages, which may be a contributing factor to the
“high” learning curve in R
• The importance of vector/matrix operations
will become clearer as we move on
Part II
R for Database Management
Reading Data into R
What format is the data in?
• Data from Comma Separated Values File (.csv)
– Package: utils
– Formula: read.csv(file, header = TRUE, sep = ",", quote = "\"", dec = ".", fill =
TRUE, comment.char = "", ...)
• Data from Excel File (.xlsx)
– Package: xlsx
– Formula: read.xlsx(file, sheetIndex, sheetName=NULL, rowIndex=NULL,
startRow=NULL, endRow=NULL, colIndex=NULL, as.data.frame=TRUE,
header=TRUE, colClasses=NA, keepFormulas=FALSE, encoding="unknown", ...)
• Data from STATA (.dta)
– Package: foreign
– Formula: read.dta(file, convert.dates = TRUE, convert.factors = TRUE, missing.type =
FALSE, convert.underscore = FALSE, warn.missing.labels = TRUE)

Other Formats: See package “foreign”

https://cran.r-project.org/web/packages/foreign/foreign.pdf
Reading Data into R
Examples:
• CSV file with variable names at top
– data = read.csv("C:/Users/adrianrohitdass/Documents/R Tutorial/data.csv")
• CSV file with no variable names at top
– data = read.csv("C:/Users/adrianrohitdass/Documents/R Tutorial/data.csv", header=F)
• STATA data file (12 or older)
– library(foreign)
– data = read.dta("C:/Users/adrianrohitdass/Documents/R Tutorial/data.dta")
• STATA data file (13 or newer)
– library(readstata13)
– data = read.dta13("C:/Users/adrianrohitdass/Documents/R Tutorial/data.dta")
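To try the CSV examples without the original files, the sketch below writes a small made-up dataset (`toy`, hypothetical) to a temporary file and reads it back, once with and once without a header row:

```r
# Hypothetical toy data written to a temporary file, standing in for data.csv
toy  <- data.frame(id = 1:3, age = c(55, 63, 65))
path <- tempfile(fileext = ".csv")
write.csv(toy, path, row.names = FALSE)

data <- read.csv(path)            # header = TRUE is the default
colnames(data)                    # "id" "age"

# Same values, but saved without a header row
write.table(toy, path, sep = ",", row.names = FALSE, col.names = FALSE)
data_nohdr <- read.csv(path, header = FALSE)
colnames(data_nohdr)              # R supplies the names "V1" "V2"
```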
Comparison and Logical Operators

Operator   Description                 Example
=          Assign a value              x = 5
==         Equal to                    sex == 1
!=         Not equal to                LHIN != 5
>          Greater than                income > 5000
<          Less than                   healthcost < 5000
>=         Greater than or equal to    income >= 5000
<=         Less than or equal to       healthcost <= 5000
&          And                         sex == 1 & age > 50
|          Or                          LHIN == 1 | LHIN == 5
Referring to Variables in a Dataset
• Suppose I had data stored in “mydata” (i.e an
object created to store the data read-in from a
.csv by R). To refer to a specific variable in the
dataset, I could type
mydata$varname
– mydata: name of dataset
– varname: name of variable in dataset
– ‘$’ is used in R to extract named elements from a list
Creating a new variable/object
• No specific command to generate new
variables (in contrast to STATA’s “gen” and
“egen” commands)
– x = 5 generates a 1x1 scalar called “x” that is equal
to 5
– data$age = year - data$dob creates a new
variable “age” in the dataset “data” that is equal
to the year minus the person’s date of birth (let’s say in
years)
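A short sketch of creating a new variable by assignment. The data frame and the value of `year` are made up for illustration; `dob` is taken to be the year of birth, as in the text above.

```r
# Made-up dataset: dob holds the year of birth
data = data.frame(id = 1:3, dob = c(1950, 1962, 1971))
year = 2021

# Assigning to data$age adds the column if it does not exist yet
data$age = year - data$dob
data$age     # 71 59 50
```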
Looking at Data
• Display the first or last few entries of a dataset:
– Package: utils
– View entire dataset in separate window
• View(x, title)
– First few elements of dataset (default is 6):
• head(x, n, …)
– Last few elements of dataset (default is 6):
• tail(x, n, …)
• List of column names in dataset
– Package: base
– Formula: colnames(x)
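These commands can be tried on `mtcars`, a dataset built into R (used here only as a stand-in for your own data):

```r
# View(mtcars)           # opens a separate viewer window (interactive sessions only)

first_rows = head(mtcars)          # first 6 rows by default
nrow(first_rows)                   # 6

last_rows = tail(mtcars, n = 3)    # last 3 rows
nrow(last_rows)                    # 3

colnames(mtcars)                   # "mpg" "cyl" "disp" ...
```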
Missing Values
Missing Values are listed as “NA” in R
• Count number of NA’s in column
sum(is.na(x))
• Recode Certain Values as NA (i.e. non
responses coded as -1)
x[x==-1] = NA
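A brief sketch of both steps on a made-up vector in which -1 marks a non-response:

```r
x = c(3, -1, 7, -1, 2)    # made-up responses; -1 = non-response

x[x == -1] = NA           # recode -1 as missing
x                         # 3 NA 7 NA 2

n_missing = sum(is.na(x)) # is.na() gives TRUE/FALSE; sum() counts the TRUEs
n_missing                 # 2
```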
Renaming Variables (Columns)
A few different ways to do this:
• To rename the ‘ith’ column in a dataset
– colnames(data)[i] = "My Column Name"
• Can be cumbersome, especially if you don’t know the column # of the
column you want to rename (just its original name)
• Alternative:
– colnames(data)[which(colnames(data) == "R1482600")] = "race"
• colnames(data) grabs the column names from the specified dataset
• which(...) is a look-up that returns the column #
• "race" is the new column name
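Both renaming approaches, sketched on a made-up data frame (the column `R0000100` and the name `score` are hypothetical additions for illustration):

```r
data = data.frame(ID = 1:2, R1482600 = c(1, 3), R0000100 = c(10, 20))

# Rename by position (here the 3rd column)
colnames(data)[3] = "score"

# Rename by looking up the original name
colnames(data)[which(colnames(data) == "R1482600")] = "race"

colnames(data)    # "ID" "race" "score"
```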
Subsetting Data
• Subsetting can be used to restrict the sample in the dataset, create
a smaller dataset with fewer variables, or both
• Recall: extracting elements from a matrix in R
• matrixname[row #, column #]
• What’s the difference between a matrix and a dataset?
– Both have row elements
• Typically the individual records in a dataset
– Both have column elements
• Typically the different variables in the dataset
• If we think of our dataset as a matrix, then the concept of
subsetting in R becomes a lot easier to digest
Subsetting Data (Continued)
Examples:
• Restrict sample to those with age >=50
> datas1 = data[data$age >=50,]
• Create a smaller dataset with just ID, age, and
height
> datas2 = data[, c("ID", "age", "height")]
• Create a smaller dataset with just ID, age, and
height; with age >=50
> datas3 = data[data$age>=50, c("ID", "age", "height")]
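The three subsetting examples can be run on a made-up dataset like this (values and the extra `weight` column are hypothetical):

```r
data = data.frame(ID     = 1:4,
                  age    = c(45, 50, 62, 38),
                  height = c(170, 165, 180, 175),
                  weight = c(70, 60, 85, 72))

datas1 = data[data$age >= 50, ]                           # rows only
datas2 = data[, c("ID", "age", "height")]                 # columns only
datas3 = data[data$age >= 50, c("ID", "age", "height")]   # rows and columns

dim(datas1)   # 2 4
dim(datas2)   # 4 3
dim(datas3)   # 2 3
```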
Recoding Variables in R
• Usually done with a few lines of code using comparison
and logical operators
• Ex: Suppose we had the following for age:
> data$age
19 20 25 30 45 55
• If we wanted to create a categorical variable for age
(say, <20, 20-39, 40-59), we could do the following:
> data$agecat[data$age <20] = 1
> data$agecat[data$age >=20 & data$age <40] = 2
> data$agecat[data$age >=40 & data$age <60] = 3
> data$agecat
1 2 2 2 3 3
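The recoding steps above as a self-contained sketch. Initialising `agecat` to NA first is an addition made here so that any age outside the three bands is explicitly left missing.

```r
data = data.frame(age = c(19, 20, 25, 30, 45, 55))

data$agecat = NA                                       # start with all missing
data$agecat[data$age < 20] = 1
data$agecat[data$age >= 20 & data$age < 40] = 2
data$agecat[data$age >= 40 & data$age < 60] = 3

data$agecat    # 1 2 2 2 3 3
```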
Merging Datasets
Suppose we had the following 2 datasets:
Data1
Id Age Income
1 55 49841.65
2 63 46884.78
3 65 45550.87
4 69 26254.15
5 52 22044.73

Data2
Id Health Care Cost
1 188.1965
2 172.2420
3 102.8355
4 150.2247

Our first dataset contains some data on age and income, but not health care
costs to the public system. Dataset 2 contains this data, but was not initially
available to us. It also doesn’t have age or income.

The common element between the two datasets is “Id”, which uniquely identifies
the same individuals across the two datasets.

Note that, for some reason, individual 5 does not have a reported health care
cost
Merging Datasets (Continued)
• Command: merge
– Package: base
• For our example:
– Datam = merge(Data1, Data2, by="Id", all=T)
• by="Id": unique identifier across datasets
• all=T: optional, but default is F, meaning those who can’t be
matched will be excluded
– Resulting Dataset
– Resulting Dataset
Datam
Id Age Income Health Care Cost
1 55 49841.65 188.1965
2 63 46884.78 172.2420
3 65 45550.87 102.8355
4 69 26254.15 150.2247
5 52 22044.73 NA
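The merge above can be reproduced with the values from the two tables. The column name `HealthCareCost` (no spaces) is an assumption made so that the name is a valid R identifier.

```r
Data1 = data.frame(Id     = 1:5,
                   Age    = c(55, 63, 65, 69, 52),
                   Income = c(49841.65, 46884.78, 45550.87, 26254.15, 22044.73))
Data2 = data.frame(Id             = 1:4,
                   HealthCareCost = c(188.1965, 172.2420, 102.8355, 150.2247))

# all = TRUE keeps unmatched records (Id 5 gets NA for HealthCareCost)
Datam = merge(Data1, Data2, by = "Id", all = TRUE)
Datam

# With the default (all = FALSE), Id 5 is dropped
Datam_matched = merge(Data1, Data2, by = "Id")
nrow(Datam_matched)    # 4
```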
Part II

R for Statistical Analysis

Descriptive Statistics in R
• Mean
– Package: base
– Formula: mean(x, trim = 0, na.rm = FALSE, ...)
• Standard Deviation
– Package: stats
– Formula: sd(x, na.rm = FALSE)
• Correlation
– Package: stats
– Formula: cor(x, y = NULL, use = "everything", method
= c("pearson", "kendall", "spearman"))
Descriptive Statistics (Example)
• Suppose we had the following data column in
R (transposed to fit on slide):
– Vector = c(5,5,6,4)
• What is the mean of the vector?
• In R, I would type
> mean(Vector)
5
Descriptive Statistics (Example)
• Suppose now we had the following:
– Vector = c(5,5,6,4,NA)
• What is the mean of the vector?
• In R, I would type
> mean(Vector)
NA
• Why did I get a mean of NA?
– Our vector included a missing value, so R couldn’t
compute the mean as is.
• To remedy this, I would type
> mean(Vector, na.rm=T)
5
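The same example in runnable form, extended to sd() and cor(). The second vector, `Other`, is made up purely to have something to correlate with.

```r
Vector = c(5, 5, 6, 4, NA)

m_raw   = mean(Vector)                  # NA, because of the missing value
m_clean = mean(Vector, na.rm = TRUE)    # 5
s_clean = sd(Vector, na.rm = TRUE)      # about 0.816

# cor() handles missing values through its 'use' argument instead of na.rm
Other = c(1, 2, 3, 4, 5)                # made-up second variable
r = cor(Vector, Other, use = "complete.obs")
r                                       # about -0.316
```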
Tabulations in R
• Tabulations of categorical/ordinal variables can be done with
R’s table command:
– Package: base
– Formula: table(..., exclude = if (useNA == "no") c(NA, NaN), useNA =
c("no", "ifany", "always"), dnn = list.names(...), deparse.level = 1)
Ex: Table Sex Variable, with extra column for missing values (if
any)
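A sketch of that tabulation on a made-up `sex` variable containing one missing value:

```r
sex = c("F", "M", "F", NA, "M", "F")    # made-up data

tab_default = table(sex)                # missing values are dropped by default
tab_default                             # F: 3, M: 2

tab = table(sex, useNA = "ifany")       # adds an <NA> column when NAs are present
tab                                     # F: 3, M: 2, <NA>: 1
```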
Graphing Data in R
• Generic X-Y Plotting
– Package: graphics
– Formula: plot(x, y, ...)
Example:
plot(cost.data$income,cost.data$totexp)

• Plotting with ggplot() function

– Package: ggplot2
– Formula: ggplot(data = NULL, mapping = aes(), ..., environment =
parent.frame())
Example:
ggplot(cost.data, aes(x=income, y=totexp)) + geom_point()
Resulting Graph (Generic)
Resulting Graph (ggplot2)

See https://github.com/rstudio/cheatsheets/raw/master/data-visualization.pdf for ggplot cheatsheet
Ordinary Least Squares
• The estimator of the regression intercept and slope(s) that minimizes the sum of
squared residuals (Stock and Watson, 2007).

– Package: stats
– Formula: lm(formula, data, subset, weights, na.action, method = "qr", model =
TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, contrasts = NULL,
offset, ...)

Examples:

Regression of “total health care expenditure” on “age, gender, household income,
supplementary insurance status (insurance beyond Medicare), physical and activity
limitations and the total number of chronic conditions” using dataset “cost.data” from
Medical Expenditure Panel Survey (65+)

ols.costdata = lm(totexp ~ age + female + income + suppins + phylim + actlim + totchr,
data = cost.data)

Online Help File

https://stat.ethz.ch/R-manual/R-devel/library/stats/html/lm.html
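Since `cost.data` is not bundled with R, the sketch below fits the same kind of model on the built-in `mtcars` dataset instead. The model (`mpg` on `wt` and `hp`) and the object name `ols.cars` are stand-ins chosen for illustration, not the talk's example.

```r
# mtcars ships with R and stands in for cost.data here
ols.cars = lm(mpg ~ wt + hp, data = mtcars)

summary(ols.cars)      # coefficient table, R-squared, F-statistic
coef(ols.cars)         # intercept and the two slopes
nobs(ols.cars)         # 32 observations used
```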
Ordinary Least Squares

Example adapted from Jones (2013) Applied Health Economics

Post-Estimation
Package: lmtest
• Breusch-Pagan test for heteroskedasticity.
bptest(formula, varformula = NULL, studentize = TRUE, data = list())
• Ramsey’s RESET test for functional form.
resettest(formula, power = 2:3, type = c("fitted", "regressor", "princomp"),
data = list())
Package: car
• Variance Inflation Factor (VIF)
vif(model)
Package: sandwich
• Heteroskedasticity-Consistent Covariance Matrix Estimation
coeftest(ols.costdata, vcovHC(ols.costdata, type = "HC1"))
Notes: need to combine with lmtest coeftest() command, and use type =
"HC1" to get the same results as STATA’s “robust” command
Extracting Beta coefficients, standard
errors, etc. from model
• A couple of ways to do this, but most of the information we’re after is stored in the
coefficients object returned from summary:

• The above is a matrix, so we can get the information we need through column
extractions:
– Beta coefficients: summary(ols.costdata)$coefficients[,1]
– Standard errors: summary(ols.costdata)$coefficients[,2]
– T-value: summary(ols.costdata)$coefficients[,3]
– P-value: summary(ols.costdata)$coefficients[,4]
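A sketch of these extractions, again using a stand-in model on the built-in `mtcars` data (the names `ols.cars`, `betas`, and `se` are hypothetical):

```r
ols.cars = lm(mpg ~ wt + hp, data = mtcars)    # stand-in model

coefs = summary(ols.cars)$coefficients         # a numeric matrix
colnames(coefs)    # "Estimate" "Std. Error" "t value" "Pr(>|t|)"

betas = coefs[, 1]            # beta coefficients
se    = coefs[, 2]            # standard errors
tvals = coefs[, 3]            # t-values (equal to betas / se)
pvals = coefs[, 4]            # p-values

coefs[, "Std. Error"]         # columns can also be extracted by name
```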
Residuals vs Fitted Values
• For Residuals vs Fitted Values (RVFV) Plot, use generic plot() function on
regression object. First plot is RVFV
• Formula: plot(ols.costdata, 1)

*The other 5 plots are: Normal Q-Q, Scale-Location, Cook’s distance, Residuals vs
Leverage, and Cook’s distance vs Leverage
Models for Binary Outcomes
• R does not come with different programs for binary outcomes. Instead, it
utilizes a unifying framework of generalized linear models (GLMs) and a
single fitting function, glm() (Kleiber & Zeileis (2008))

Package: stats
Formula: glm(formula, family = gaussian, data, weights, subset, na.action,
start = NULL, etastart, mustart, offset, control = list(...), model = TRUE,
method = "glm.fit", x = FALSE, y = TRUE, contrasts = NULL, ...)

• For binary outcomes, we specify family = binomial(link = "logit") or
family = binomial(link = "probit")
• Can be extended to count data as well (family = poisson)

Online help: https://stat.ethz.ch/R-manual/R-devel/library/stats/html/glm.html
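A minimal sketch of probit and logit fits with glm(). It uses the built-in `mtcars` data, with the binary variable `am` (manual transmission) as the outcome; this is a stand-in for illustration, not the arrest example from the talk.

```r
# Binary outcome: am (1 = manual, 0 = automatic); regressor: wt (weight)
probit.cars = glm(am ~ wt, family = binomial(link = "probit"), data = mtcars)
logit.cars  = glm(am ~ wt, family = binomial(link = "logit"),  data = mtcars)

summary(probit.cars)$coefficients   # estimates, std. errors, z values, p-values
coef(logit.cars)

# Predicted probabilities for the first few cars
head(predict(probit.cars, type = "response"))
```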
Models for Binary Outcomes
Example: Probit Analysis: factors associated with being arrested
Instrumental Variables
A way to obtain a consistent estimator of the unknown
coefficients of the population regression function when the
regressor, X, is correlated with the error term, u. (Stock and
Watson, 2007).

Package: AER
Formula: ivreg(formula, instruments, data, subset, na.action,
weights, offset, contrasts = NULL, model = TRUE, y = TRUE, x =
FALSE, ...)

Online documentation: https://cran.r-project.org/web/packages/AER/AER.pdf
IV Example
Example: Determinants of Income (As a function of Health)

Prints out F-test for Weak Instruments, Hausman Test Statistic (vs OLS) and
Sargan’s Test for Over-identifying Restrictions (if more than one instrument
is used)
Other Regression Models
• Panel Data Econometrics
– Package: plm
– https://cran.r-project.org/web/packages/plm/vignettes/plm.pdf
• Linear and Generalized Linear Mixed Effects Models
– Package: lme4
– https://cran.r-project.org/web/packages/lme4/lme4.pdf
• Quantile Regression
– Package: quantreg
– https://cran.r-project.org/web/packages/quantreg/quantreg.pdf
Part III
Other topics in R
Tidyverse
Tidyverse
From Tidyverse website:
“The tidyverse is an opinionated collection of R packages
designed for data science. All packages share an underlying
design philosophy, grammar, and data structures…tidyverse
makes data science faster, easier and more fun”

Source: https://www.tidyverse.org

• Packages within tidyverse: ggplot2, dplyr, tidyr, readr, purrr,
tibble, stringr, and forcats

• To get, type: install.packages("tidyverse") in R console

Tidyverse (Continued)
Package: dplyr
• Description: provides a flexible grammar of data
manipulation.
• Example Commands:
– Restrict sample to those with age >=50
• subdata = filter(data, age>=50)
– Create a smaller dataset with just ID, age, and height
• subdata = select(data, ID, age, height)
– Create a smaller dataset with just ID, age, and height;
with age >=50
• subdata = data %>%
filter(age>=50) %>%
select(ID, age, height)
Tidyverse (Continued)
Package: dplyr
• Example Commands (continued):
– Create new variable (age) in existing dataset
• data = mutate(data, age = year - dob)
– Rename a variable in a dataset (new name = old
name)
• data = rename(data, race = R1482600)
• https://cran.r-project.org/web/packages/dplyr/dplyr.pdf
Tidyverse (Continued)
Other (selected) packages in Tidyverse:
• Package: readr
– Description: The goal of 'readr' is to provide a fast and
friendly way to read rectangular data (like 'csv', 'tsv', and
'fwf')
– https://cran.r-project.org/web/packages/readr/readr.pdf
• Package: tidyr
– Description: Tools for reshaping data, extracting values out
of string columns, and working with missing values
– https://cran.r-project.org/web/packages/tidyr/tidyr.pdf
Parallel Processing
Parallel Processing in R
• Parallel computing: From Wikipedia: “Parallel computing is a type of
computation in which many calculations or the execution of
processes are carried out simultaneously. Large problems can often
be divided into smaller ones, which can then be solved at the same
time.”
– See here for more:
https://en.wikipedia.org/wiki/Parallel_computing
• Modern day computers typically contain:
– Single-core
– Multicore (Dual, Quad, Hexa, Octo, etc.)
• May also contain hyperthreading
Parallel Processing in R (Continued)
• Parallel processing can be used in many
situations, including:
– Bootstrapping
– Microsimulation models
– Monte Carlo experiments
– Probabilistic Sensitivity Analysis
• By utilizing parallel processing, we can
significantly speed up the processing time of
our calculations
Parallel Processing in R (Continued)
• There are many packages to perform parallel processing in R, including
• parallel
– Available in R by default
– Handles large chunks of computations in parallel
– https://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf

• doParallel
– “parallel backend” for the “foreach” package
– provides a mechanism needed to execute foreach loops in parallel
– https://cran.r-project.org/web/packages/doParallel/vignettes/gettingstartedParallel.pdf
Example: Monte Carlo Experiment
Example: Monte Carlo Experiment
(Continued)

Notice we changed %dopar% to %do% to run everything
through a single core
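The code from the talk's Monte Carlo slides is not reproduced in this text. As a rough sketch of the same idea, the example below uses the base `parallel` package rather than foreach with %dopar%, so it needs no extra installation. The replication function `one_rep`, the 200 replications, and the choice of 2 worker processes are all assumptions made for illustration.

```r
library(parallel)    # ships with R

# One Monte Carlo replication: the sample mean of n draws from N(0, 1)
one_rep = function(i, n = 100) mean(rnorm(n))

# Serial version (single core)
set.seed(123)
serial = sapply(1:200, one_rep)

# Parallel version on 2 worker processes; detectCores() reports what is available
cl = makeCluster(2)
clusterSetRNGStream(cl, iseed = 123)       # reproducible parallel random numbers
par_res = parSapply(cl, 1:200, one_rep)
stopCluster(cl)                            # always release the workers

# Both sets of estimates should be centred near the true mean of 0
c(serial = mean(serial), parallel = mean(par_res))
```

For a task this small the parallel version may well be slower because of the cost of starting the workers; the gain appears when each replication is expensive.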
R Studio
What is R Studio?
From R Studio Website:
• An integrated development environment (IDE) for R.
Includes:
– A console
– Syntax highlighting editor
– Tools for plotting, history, debugging, and workspace
management
• Can think of it as a more user friendly version of R
• A free version is available as well
• For more information, see https://www.rstudio.com
[Screenshot of the R Studio window, with panes labelled: Syntax Window; List of datasets/variables; Command/Results Window; Files, plots, packages, help, and viewer]
R Markdown
(In R Studio)
What is R Markdown?
From R Markdown website:
“R Markdown provides an authoring framework for data science. You
can use a single R Markdown file to both
• save and execute code
• generate high quality reports that can be shared with an audience”
Source: https://rmarkdown.rstudio.com/lesson-1.html

With R Markdown, you can render to a variety of formats, which
includes PDF (uses LaTeX) and Microsoft Word

To create an R Markdown file, go to File > New File > R Markdown

[Screenshot of an R Markdown file in R Studio (2 pages), annotated as follows]
• “Knit”, or generate document
• Global options for document (echoing of R code, loading packages, etc.)
• # for document sections
• R code chunk for output (summary of “cars” data)
• R code chunk for output (to insert a plot available in R memory)
Tips for Outputting In MS Word
Output Option
• The word_document2 (Bookdown) and rdocx_document (Officedown) formats are
generally superior to word_document (default in R Markdown), particularly for
automatic numbering of figures/tables, and cross-referencing of figures/tables.
• The rdocx_document format lets you easily switch between landscape and portrait
Tables
• The default knitr::kable() function works, but the flextable() function (flextable package)
creates “pretty” tables with a large amount of flexibility (customize cell padding and
column widths, table footnotes, long tables, etc.)
Figures
• Use knitr::include_graphics(filepath) to include previously saved figures in the
document
References
• Default reference style is Chicago. Visit Zotero Style Repository to search for additional
Citation Style Language (CSL) files (Vancouver, APA, journal specific styles, etc.). Can
modify existing reference style, which may be necessary for certain journals
(https://editor.citationstyles.org/about/)
• Add citations with markdown syntax by typing [@cite] or @cite.
• Store references in plain text BibTeX database (*.bib)
• Can also look up and insert citations with the Insert Citations dialog in the Visual
Editor, by clicking the @ symbol in the toolbar or by clicking Insert > Citation
Document formatting
• To modify font sizes, text alignment, etc., need to create a reference style document
following these instructions: https://rmarkdown.rstudio.com/articles_docx.html

Please also see the R Markdown cheat sheet:
https://github.com/rstudio/cheatsheets/raw/master/rmarkdown-2.0.pdf
Applied Example
• Analysis of Health Expenditure Data in Jones et al.
(2013) Chapter Three
• The data covers the medical expenditures of US citizens
aged 65 years and older who qualify for health care
under Medicare.
– Outcome of interest is total annual health care
expenditures (measured in US dollars).
– Other key variables are age, gender, household income,
supplementary insurance status (insurance beyond
Medicare), physical and activity limitations and the total
number of chronic conditions.
• Data can be downloaded from here (mus03data.dta):
https://www.stata-press.com/data/musr.html
R Markdown Code From Example
---
title: "Untitled"
output: word_document
---

```{r setup, include=FALSE}

knitr::opts_chunk$set(echo = FALSE)
```

# Regression Results

```{r regresults}
load("cost.data.results.RData")
knitr::kable(cost.data.results)
```

# Plot

```{r plot}
knitr::include_graphics("RVFV.jpg")
```
Conclusions
• R has extremely powerful database management
capabilities
– Is fully capable of performing the same sort of tasks as
commercial software programs
– Can be enhanced through Tidyverse package for a more user
friendly experience
• R is very capable of statistical analysis
– Is fully capable of calculating summary statistics and performing
regression analysis right out of the box
– Can install additional packages to perform other sorts of
analysis, depending on the research question of the user
– Performance can be improved by the use of parallel processing
• R, and the additional packages available to enhance the use
of R, are available free of charge
R Resources
R Online Resources
• A list of R packages is contained here:
https://cran.r-project.org/web/packages/available_packages_by_date.html
• By clicking on a particular package, you’ll be
taken to a page with more details, as well as a link
to download the documentation
• Typing help(topic) in R pulls up a brief help file
with syntax and examples, but the online manuals
contain more detail
R Online Resources
• UCLA Institute for Digital Research and
Education
– List of topics and R resources (getting started, data
examples, etc.) can be found here:
http://www.ats.ucla.edu/stat/r/
• RStudio Cheatsheets
– https://www.rstudio.com/resources/cheatsheets/
Other R Resources
1. Kleiber, C., & Zeileis, A. (2008). Applied econometrics with
R. Springer Science & Business Media.
• Great reference for the applied researcher wanting to use R for
econometric analysis. Includes R basics, linear regression
model, panel data models, binary outcomes, etc.
2. Jones, A. M., Rice, N., d'Uva, T. B., & Balia, S.
(2013). Applied health economics. Routledge.
• Excellent reference for applied health economics. Examples
are all performed using STATA, but foreign package should help
here.
3. CRAN Task View: Econometrics
• A listing of the statistical models used in econometrics, as well
as the R package(s) needed to perform them. Available at:
https://cran.r-project.org/view=Econometrics
Thanks for Listening
Good luck with R!

40 R Programming Interview Questions & Answers For All Levels - DataCamp
No ratings yet
40 R Programming Interview Questions & Answers For All Levels - DataCamp
22 pages
R Programming
100% (8)
R Programming
60 pages
TI-Nspire Programming - TI-Basic Developer
No ratings yet
TI-Nspire Programming - TI-Basic Developer
14 pages
#02 R Basics
No ratings yet
#02 R Basics
30 pages
Module 1: Unit - 1.1: Introduction To Analytics or R Programming
No ratings yet
Module 1: Unit - 1.1: Introduction To Analytics or R Programming
26 pages
R Programming Slides
No ratings yet
R Programming Slides
73 pages
R Programming for Data Science QB
No ratings yet
R Programming for Data Science QB
21 pages
QB Samplealllllll Hemu
No ratings yet
QB Samplealllllll Hemu
19 pages
R Programming 2 MARKS
No ratings yet
R Programming 2 MARKS
12 pages
STATS LAB Basics of R PDF
No ratings yet
STATS LAB Basics of R PDF
77 pages
Part I: Introductory Materials: Introduction To R
No ratings yet
Part I: Introductory Materials: Introduction To R
25 pages
Introduction to R
No ratings yet
Introduction to R
23 pages
R Tutorial
No ratings yet
R Tutorial
100 pages
IDS UNIT 3 NOTES CSM & CSD
No ratings yet
IDS UNIT 3 NOTES CSM & CSD
24 pages
Solution Manual for Using Multivariate Statistics 7th Edition Barbara G. Tabachnick, Linda S. Fidell pdf download
100% (2)
Solution Manual for Using Multivariate Statistics 7th Edition Barbara G. Tabachnick, Linda S. Fidell pdf download
40 pages
Basic of R Language: Jarno Tuimala
No ratings yet
Basic of R Language: Jarno Tuimala
41 pages
R - II UNIT
No ratings yet
R - II UNIT
10 pages
Introduction To R: Arin Basu MD MPH Dataanalytics
No ratings yet
Introduction To R: Arin Basu MD MPH Dataanalytics