Unit 1 Notes R Programming
● R Installation:
○ R can be downloaded from the Comprehensive R Archive Network
(CRAN) website.
○ Installation involves selecting the appropriate version for your
operating system (Windows, macOS, Linux) and following the
installation instructions.
● RStudio Installation:
○ RStudio is an integrated development environment (IDE) for R,
providing a more user-friendly interface.
○ Download the RStudio Desktop installer from the Posit website (Posit is the
company formerly named RStudio) and install it after installing R.
○ RStudio includes a console, syntax-highlighting editor, and tools for
plotting, history, and workspace management.
Basic Operations
● Data Types:
○ Numeric: Real numbers, stored as double-precision floating point by
default (e.g., 3.14; a plain 5 is also numeric, not integer).
○ Integer: Whole numbers, declared by appending an L to the number
(e.g., 5L).
○ Character: Text or string data, enclosed in quotes (e.g., "Hello").
○ Logical: Boolean values, either TRUE or FALSE.
○ Factor: Categorical data used for representing variables with a fixed
number of unique values.
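As a quick check of the types listed above, the following sketch creates one value of each type and inspects it with class(). The variable names and values are made up for illustration.

```r
# One illustrative value per basic type (names and values are made up)
x_num <- 3.14                              # numeric (stored as double)
x_int <- 5L                                # integer, note the L suffix
x_chr <- "Hello"                           # character
x_lgl <- TRUE                              # logical
x_fct <- factor(c("low", "high", "low"))   # factor with two levels

class(x_num)    # "numeric"
class(x_int)    # "integer"
class(x_chr)    # "character"
class(x_lgl)    # "logical"
class(x_fct)    # "factor"
levels(x_fct)   # "high" "low" (levels are sorted alphabetically by default)
class(5)        # "numeric": without the L suffix, 5 is stored as a double
```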
● Variables:
○ Variables store data values and are assigned using <- or =.
○ Naming conventions: Start with a letter, can contain letters, numbers,
underscores, and periods, but not spaces or other special characters.
Names are case-sensitive (x and X are different variables).
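A minimal sketch of assignment with both operators; the variable names and numbers are hypothetical.

```r
# Both operators assign a value; <- is the more common style in R code
total_score <- 42
max_score = 50

# Variables can be used in later expressions
percent_score <- total_score / max_score * 100
print(percent_score)   # 84
```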
Exploring Data Structures
● Vectors: One-dimensional arrays whose elements all share one type (numeric, character, or logical).
○ Created using the c() function (e.g., c(1, 2, 3)).
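The sketch below shows a few common vector operations: creation, indexing, element-wise arithmetic, and logical subsetting. The data are made up for illustration.

```r
scores <- c(90, 75, 82)               # numeric vector
students <- c("Ann", "Bob", "Cy")     # character vector
passed <- scores > 80                 # logical vector: TRUE FALSE TRUE

length(scores)      # 3
scores[2]           # 75 (indexing starts at 1, not 0)
scores * 2          # 180 150 164 (arithmetic is element-wise)
students[passed]    # "Ann" "Cy" (subset with a logical vector)

# Mixing types in c() coerces every element to a common type
mixed <- c(1, "a", TRUE)
class(mixed)        # "character"
```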
● If-Else Statements:
Syntax:
if (condition) {
  # code to execute if condition is TRUE
} else {
  # code to execute if condition is FALSE
}
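A concrete instance of the if-else pattern, using a hypothetical temperature reading. The ifelse() call at the end is the vectorised counterpart from base R, shown here as an aside for working with whole vectors.

```r
temperature <- 31   # hypothetical reading in degrees Celsius

if (temperature > 30) {
  label <- "hot"
} else if (temperature > 15) {
  label <- "mild"
} else {
  label <- "cold"
}
print(label)   # "hot"

# if () tests a single condition; ifelse() tests every element of a vector
readings <- c(10, 20, 35)
labels <- ifelse(readings > 30, "hot", "not hot")
print(labels)  # "not hot" "not hot" "hot"
```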
● Loops:
For Loop: Repeats code once for each element of a sequence.
for (i in 1:5) {
  print(i)
}
While Loop: Repeats code while a condition is true.
while (condition) {
  # code to execute
}
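To make the two loop forms concrete, here is a small sketch with made-up values: the for loop accumulates a sum, and the while loop counts how many halvings it takes for a number to drop below 1.

```r
# for loop: add up the integers 1 to 5
total <- 0
for (i in 1:5) {
  total <- total + i
}
print(total)   # 15

# while loop: halve the value until it is below 1
value <- 20
steps <- 0
while (value >= 1) {
  value <- value / 2
  steps <- steps + 1
}
print(steps)   # 5
```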
● Functions:
Syntax:
function_name <- function(arg1, arg2) {
  # function body
  return(result)
}
Example:
add <- function(x, y) {
  return(x + y)
}
add(3, 5) # Returns 8
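Functions can also take default argument values and return the last evaluated expression without an explicit return(). The helper power() below is a hypothetical example, not part of base R.

```r
# power() is a hypothetical helper; exponent defaults to 2
power <- function(base, exponent = 2) {
  base^exponent   # the last evaluated expression is returned implicitly
}

power(3)                  # 9  (uses the default exponent)
power(2, exponent = 5)    # 32 (overrides the default by name)
squares <- sapply(1:4, power)
print(squares)            # 1 4 9 16
```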
Importing Data into R from Various Sources
● CSV Files:
○ Imported using read.csv() function.
○ Example: data <- read.csv("path/to/file.csv").
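Because "path/to/file.csv" is only a placeholder, the sketch below first writes a small made-up data frame to a temporary file and then reads it back, so it can be run without any existing file.

```r
# Made-up data written to a temporary CSV file
students <- data.frame(name = c("Ann", "Bob"), score = c(90, 75))
csv_path <- tempfile(fileext = ".csv")
write.csv(students, csv_path, row.names = FALSE)

# Read the file back and inspect the result
data <- read.csv(csv_path)
str(data)     # column names and types
head(data)    # first rows
nrow(data)    # 2

unlink(csv_path)   # remove the temporary file
```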
● Excel Files:
○ Imported using the read_excel() function from the readxl package (load it first with library(readxl)).
○ Example: data <- read_excel("path/to/file.xlsx").
● Databases:
○ Connected using DBI and RSQLite packages.
○ Example:
library(DBI)
conn <- dbConnect(RSQLite::SQLite(), "path/to/database.sqlite")
data <- dbGetQuery(conn, "SELECT * FROM table_name")
dbDisconnect(conn)
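The same connect, query, disconnect sequence can be tried without a database file by using an in-memory SQLite database. This is a sketch that assumes the DBI and RSQLite packages are installed; the table name and its contents are made up.

```r
library(DBI)

# ":memory:" asks SQLite for a temporary in-memory database
conn <- dbConnect(RSQLite::SQLite(), ":memory:")

# Create a small table from made-up data, then query it with SQL
dbWriteTable(conn, "students",
             data.frame(name = c("Ann", "Bob"), score = c(90, 75)))
high <- dbGetQuery(conn, "SELECT name FROM students WHERE score > 80")

dbDisconnect(conn)   # always close the connection when done
print(high$name)     # "Ann"
```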
● dplyr Package:
○ Provides a set of functions designed to simplify data manipulation.
○ Key functions:
■ filter(): Subset rows based on conditions.
■ select(): Select columns by name.
■ mutate(): Create new variables.
■ summarise(): Summarize data with aggregate functions.
■ arrange(): Sort data by specified variables.
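The five functions above are usually chained with the pipe operator %>%. The sketch below assumes dplyr is installed and uses a made-up sales table; group_by() is an additional dplyr function used here so that summarise() produces one row per region.

```r
library(dplyr)

# Made-up data for illustration
sales <- data.frame(
  region = c("North", "South", "North", "South", "East"),
  units  = c(10, 4, 6, 8, 3),
  price  = c(2, 5, 2, 5, 4)
)

summary_tbl <- sales %>%
  filter(units > 3) %>%                      # drops the East row
  select(region, units, price) %>%           # keep only these columns
  mutate(revenue = units * price) %>%        # add a computed column
  group_by(region) %>%
  summarise(total_revenue = sum(revenue)) %>%
  arrange(desc(total_revenue))               # largest revenue first

print(summary_tbl)   # South (60) is listed before North (32)
```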
● tidyr Package:
○ Designed for reshaping data.
○ Key functions:
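■ gather(): Converts wide data to long format (superseded by pivot_longer() in tidyr 1.0.0 and later, but still available).
■ spread(): Converts long data to wide format (superseded by pivot_wider() in tidyr 1.0.0 and later, but still available).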
■ separate(): Splits a column into multiple columns.
■ unite(): Combines multiple columns into one.
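A short sketch of the four functions listed above, assuming tidyr is installed. The data frames are made up: one reshapes exam scores between wide and long form, the other splits and rejoins a date column.

```r
library(tidyr)

# Wide data: one column per subject (made-up scores)
wide <- data.frame(id = c(1, 2), math = c(90, 70), art = c(85, 95))

# Wide to long: one row per id/subject pair (4 rows)
long <- gather(wide, key = "subject", value = "score", math, art)

# Long back to wide: one column per subject again
back <- spread(long, key = subject, value = score)

# Split one column into three, then combine them again
dates <- data.frame(date = c("2024-01-15", "2024-02-20"))
parts <- separate(dates, date, into = c("year", "month", "day"), sep = "-")
rejoined <- unite(parts, "date", year, month, day, sep = "-")
```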
● Base R Graphics:
○ Provides basic plotting functions for data visualization.
○ Common functions:
■ plot(): General plotting function.
■ hist(): Creates histograms.
■ boxplot(): Creates box plots.
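Each of these functions can be tried on mtcars, a data set that ships with R. The titles and axis labels below are illustrative choices.

```r
# Scatter plot of fuel economy against weight
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "MPG vs. weight")

# Histogram of fuel economy; hist() also returns the bin counts invisibly
h <- hist(mtcars$mpg, main = "Distribution of MPG", xlab = "Miles per gallon")

# Box plots of fuel economy, one box per cylinder count
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Cylinders", ylab = "Miles per gallon")
```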
● ggplot2:
○ A powerful and flexible package for creating advanced visualizations.
○ Based on the Grammar of Graphics.
○ Common functions:
■ ggplot(): Initializes a ggplot object.
■ geom_point(): Creates scatter plots.
■ geom_bar(): Creates bar charts.
■ geom_line(): Creates line plots.
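In ggplot2, a plot is built by adding layers to a ggplot() object with +. The sketch below assumes ggplot2 is installed and uses mtcars (shipped with R) and economics (shipped with ggplot2); the choice of columns is illustrative.

```r
library(ggplot2)

# Scatter plot: one point per car
p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Bar chart: number of cars per cylinder count
p_bar <- ggplot(mtcars, aes(x = factor(cyl))) +
  geom_bar()

# Line plot: unemployment over time
p_line <- ggplot(economics, aes(x = date, y = unemploy)) +
  geom_line()

print(p_scatter)   # a plot is drawn when it is printed
```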