Unit 1 Notes R Programming

R programming language

Uploaded by

Fallen Angel
Copyright: © All Rights Reserved

Introduction to R Programming:

Understanding the Basics of R Programming Language

● Definition and Purpose: R is a powerful language and environment for statistical
computing and graphics. It is designed for data analysis and visualization.
● History: Developed in the early 1990s by Ross Ihaka and Robert Gentleman at
the University of Auckland, New Zealand.
● Key Features:
○ Open-source and free.
○ Comprehensive statistical analysis and graphical capabilities.
○ Extensive community support and a large number of packages for
various applications.

Installing R and RStudio

● R Installation:
○ R can be downloaded from the Comprehensive R Archive Network
(CRAN) website.
○ Installation involves selecting the appropriate version for your
operating system (Windows, macOS, Linux) and following the
installation instructions.
● RStudio Installation:
○ RStudio is an integrated development environment (IDE) for R,
providing a more user-friendly interface.
○ Download the RStudio installer from the RStudio website and install it
after installing R.
○ RStudio includes a console, syntax-highlighting editor, and tools for
plotting, history, and workspace management.
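
Once both are installed, a quick check from the R console can confirm that the setup works. The following is a minimal sketch; the package name in the commented lines is only an illustration, and installing it requires internet access.

```r
# Store and print the version string of the running R installation
version_text <- R.version.string
print(version_text)

# Show the platform and the packages attached in this session
print(sessionInfo())

# CRAN packages are installed once, then loaded in each session that uses them.
# install.packages("ggplot2")   # illustrative; uncomment to install
# library(ggplot2)
```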

Basic Operations

● Arithmetic Operations: Basic mathematical computations can be performed
using operators like + (addition), - (subtraction), * (multiplication), / (division),
and ^ (exponentiation).
● Relational Operations: Comparisons between values using == (equal to), != (not
equal to), > (greater than), < (less than), >= (greater than or equal to), and <=
(less than or equal to).
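● Logical Operations: Combine or negate logical values using & (AND), | (OR), and
! (NOT). These work element by element on vectors; && and || compare single
values and are typically used in if conditions.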
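
The three operator groups above can be tried directly in the console. This is a small sketch; the variable names and values are chosen only for illustration.

```r
a <- 7
b <- 2

# Arithmetic operators
sum_ab   <- a + b   # 9
diff_ab  <- a - b   # 5
prod_ab  <- a * b   # 14
quot_ab  <- a / b   # 3.5
power_ab <- a ^ b   # 49

# Relational operators return logical values
is_equal   <- a == b   # FALSE
is_greater <- a > b    # TRUE

# Logical operators work element by element on vectors
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)
and_result <- x & y   # TRUE FALSE FALSE
or_result  <- x | y   # TRUE TRUE TRUE
not_result <- !x      # FALSE TRUE FALSE

print(c(sum_ab, diff_ab, prod_ab, quot_ab, power_ab))
print(and_result)
```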

Data Types and Variables in R

● Data Types:
○ Numeric: Represents real numbers, stored as double-precision floating-point
values by default (e.g., 3.14; a plain 5 is also numeric, not integer).
○ Integer: Whole numbers, declared by appending an L to the number
(e.g., 5L).
○ Character: Text or string data, enclosed in quotes (e.g., "Hello").
○ Logical: Boolean values, either TRUE or FALSE.
○ Factor: Categorical data used for representing variables with a fixed
number of unique values.

● Variables:
○ Variables store data values and are assigned using <- or =.
○ Naming conventions: Start with a letter (or a period not followed by a digit);
can contain letters, numbers, periods, and underscores, but not spaces or
other special characters. Names are case-sensitive.
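
A short sketch of the data types listed above, using class() to inspect each value. The variable names are illustrative only.

```r
price    <- 19.99      # numeric (double)
count    <- 5L         # integer, note the L suffix
greeting <- "Hello"    # character
is_valid <- TRUE       # logical
size     <- factor(c("small", "large", "small"))  # factor with two levels

class(price)      # "numeric"
class(count)      # "integer"
class(greeting)   # "character"
class(is_valid)   # "logical"
class(size)       # "factor"
levels(size)      # "large" "small" (levels are sorted alphabetically)

# A number written without the L suffix is numeric, not integer
class(5)          # "numeric"
```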

Exploring Data Structures

● Vectors: One-dimensional, homogeneous sequences that hold numeric, character,
or logical data. All elements of a vector share one type.
○ Created using c() function (e.g., c(1, 2, 3)).

● Matrices: Two-dimensional, homogeneous data structures with rows and
columns.
○ Created using matrix() function (e.g., matrix(1:6, nrow=2, ncol=3)).

● Arrays: Multi-dimensional, homogeneous data structures.
○ Created using array() function (e.g., array(1:8, dim=c(2, 2, 2))).

● Lists: Ordered collections that can hold different types of elements.
○ Created using list() function (e.g., list(name="John", age=25,
scores=c(90, 85, 92))).

● Data Frames: Two-dimensional, heterogeneous data structures similar to tables
in a database. Each column holds a single type, but columns may differ in type.
○ Created using data.frame() function (e.g., data.frame(name=c("Alice",
"Bob"), age=c(25, 30))).

Control Structures

● If-Else Statements:

Allows conditional execution of code based on whether a condition is true
or false.

Syntax:

if (condition) {
  # code to execute if condition is TRUE
} else {
  # code to execute if condition is FALSE
}
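
As a concrete instance of the if-else syntax, here is a small sketch with a hypothetical score and pass mark. It also shows ifelse(), the base R function that applies the same kind of test to every element of a vector.

```r
score <- 72   # hypothetical value

# if/else tests a single condition
if (score >= 50) {
  result <- "pass"
} else {
  result <- "fail"
}
print(result)   # "pass"

# ifelse() applies the test to each element of a vector
scores  <- c(35, 72, 50)
results <- ifelse(scores >= 50, "pass", "fail")
print(results)  # "fail" "pass" "pass"
```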

● Loops:

For Loop: Iterates over a sequence of values.

for (i in 1:5) {
  print(i)
}

While Loop: Repeats code while a condition is true.

while (condition) {
  # code to execute
}
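
Both loop forms can be seen in a short runnable sketch. The variable names and the limit of 100 are arbitrary choices for illustration.

```r
# for loop: add up the numbers 1 to 5
total <- 0
for (i in 1:5) {
  total <- total + i
}
print(total)   # 15

# while loop: keep doubling until the value reaches at least 100
n <- 1
while (n < 100) {
  n <- n * 2
}
print(n)       # 128
```

The while loop needs its condition to become false eventually; here n grows on every pass, so the loop stops after seven doublings.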

Functions and Their Usage in R

● Defining Functions: Functions encapsulate code for reuse and modularity.

Syntax:

function_name <- function(arg1, arg2) {
  # function body
  return(result)
}

Example:

add <- function(x, y) {
  return(x + y)
}

add(3, 5) # Returns 8
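
Two further features are worth a sketch: default argument values and calling by argument name. The function rectangle_area() below is hypothetical and exists only to illustrate them. In R, the value of the last evaluated expression is returned, so an explicit return() is optional.

```r
# height has a default value, so it may be omitted in a call
rectangle_area <- function(width, height = 1) {
  width * height   # value of the last expression is returned
}

area_full    <- rectangle_area(3, 5)                   # 15
area_default <- rectangle_area(4)                      # 4, height defaults to 1
area_named   <- rectangle_area(height = 2, width = 6)  # 12, order is free when named

print(c(area_full, area_default, area_named))
```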

Importing Data into R from Various Sources

● CSV Files:
○ Imported using read.csv() function.
○ Example: data <- read.csv("path/to/file.csv").

● Excel Files:
○ Imported using the read_excel() function from the readxl package, which
must be installed and loaded first.
○ Example: library(readxl); data <- read_excel("path/to/file.xlsx").

● Databases:
○ Connected using DBI and RSQLite packages.
○ Example:
    library(DBI)
    conn <- dbConnect(RSQLite::SQLite(), "path/to/database.sqlite")
    data <- dbGetQuery(conn, "SELECT * FROM table_name")
    dbDisconnect(conn)
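
To try the CSV route without an existing file, the following sketch writes a small data frame to a temporary file and reads it back. It uses base R only; the sample data and variable names are made up for illustration.

```r
# Create a temporary CSV file from a small, made-up data frame
csv_path    <- tempfile(fileext = ".csv")
sample_data <- data.frame(name = c("Alice", "Bob"), age = c(25, 30))
write.csv(sample_data, csv_path, row.names = FALSE)

# Read the file back in and inspect the result
data <- read.csv(csv_path)
str(data)    # column types detected by read.csv()
head(data)   # first rows of the imported data
nrow(data)   # 2

# Remove the temporary file
unlink(csv_path)
```

Checking the result with str() right after an import is a cheap way to catch columns that were read with an unexpected type.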

Data Manipulation Using dplyr and tidyr Packages

● dplyr Package:
○ Provides a set of functions designed to simplify data manipulation.
○ Key functions:
■ filter(): Subset rows based on conditions.
■ select(): Select columns by name.
■ mutate(): Create new variables.
■ summarise(): Summarize data with aggregate functions.
■ arrange(): Sort data by specified variables.
● tidyr Package:
○ Designed for reshaping data.
○ Key functions:
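■ gather(): Converts wide data to long format (superseded by
pivot_longer() in tidyr 1.0.0 and later).
■ spread(): Converts long data to wide format (superseded by
pivot_wider() in tidyr 1.0.0 and later).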
■ separate(): Splits a column into multiple columns.
■ unite(): Combines multiple columns into one.
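
The functions above are usually chained with the pipe operator %>%. The sketch below assumes the dplyr and tidyr packages are installed; the scores data frame is made up for illustration. It uses gather() and spread() as listed here, although recent tidyr versions recommend pivot_longer() and pivot_wider() for new code.

```r
library(dplyr)
library(tidyr)

# Made-up example data
scores <- data.frame(
  student = c("Alice", "Bob", "Cara"),
  math    = c(90, 60, 75),
  art     = c(80, 70, 85)
)

# dplyr: keep rows, add a column, pick columns, sort
top <- scores %>%
  filter(math >= 70) %>%
  mutate(total = math + art) %>%
  select(student, total) %>%
  arrange(desc(total))
print(top)   # Alice (170), then Cara (160)

# dplyr: collapse the table to a single summary row
avg <- scores %>% summarise(mean_math = mean(math))
print(avg)   # mean_math is 75

# tidyr: wide to long, then back to wide
long <- gather(scores, key = "subject", value = "score", math, art)
wide <- spread(long, key = subject, value = score)
print(long)  # 6 rows: one per student and subject
print(wide)  # 3 rows: one per student
```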

Data Visualization with Base R Graphics and ggplot2

● Base R Graphics:
○ Provides basic plotting functions for data visualization.
○ Common functions:
■ plot(): General plotting function.
■ hist(): Creates histograms.
■ boxplot(): Creates box plots.

● ggplot2:
○ A powerful and flexible package for creating advanced visualizations.
○ Based on the Grammar of Graphics.
○ Common functions:
■ ggplot(): Initializes a ggplot object.
■ geom_point(): Creates scatter plots.
■ geom_bar(): Creates bar charts.
■ geom_line(): Creates line plots.
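
A brief sketch of both plotting systems, using the mtcars dataset that ships with R. The ggplot2 part assumes that package is installed; the titles and axis labels are illustrative choices.

```r
library(ggplot2)

# Base R graphics: each call draws a complete plot
plot(mtcars$wt, mtcars$mpg,
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     main = "Fuel economy vs. weight")
hist(mtcars$mpg, xlab = "Miles per gallon", main = "Distribution of mpg")
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Cylinders", ylab = "Miles per gallon")

# ggplot2: build the plot in layers, then print it
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon",
       title = "Fuel economy vs. weight")
print(p)
```

In ggplot2 the plot is an object, so layers such as geom_line() can be added to p with + before printing it.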
