Functions Vs Scripts and Datasets

Uploaded by

Kakashi Hatake
Copyright
© All Rights Reserved

Functions vs Scripts and Datasets

answer the questions based on R

Q) i. What are the advantages of using functions over scripts?


ii. Suppose you have two datasets A and B.
Dataset A has the following data: 6 7 8 9
Dataset B has the following data: 1 2 4 5.
Which function is used to combine the data from both datasets into dataset C.
Demonstrate the function with the input values and write the output.

i) Advantages of Using Functions Over Scripts

Functions offer several advantages over scripts in R:

1. Code Reusability: Functions can be reused in multiple parts of your program, reducing
redundancy and saving time.
2. Modularity: Functions allow you to break your code into smaller, manageable chunks, making it
easier to read and debug.
3. Ease of Maintenance: When code is organized into functions, making changes or fixing issues
becomes simpler without affecting unrelated parts of the code.
4. Parameterization: Functions can accept arguments, allowing you to handle different inputs
dynamically.
5. Improved Readability: Functions encapsulate specific tasks, making the overall program structure
cleaner and more understandable.
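
As a minimal sketch of reusability and parameterization, the hypothetical helper `percent_change()` below (not part of the original notes) wraps a calculation that a plain script would have to repeat for every pair of values:

```r
# Hypothetical helper: percentage change from an old value to a new value.
percent_change <- function(old, new) {
  (new - old) / old * 100
}

# The same function handles different inputs without copying code.
percent_change(50, 75)                 # 50
percent_change(c(10, 20), c(15, 10))   # 50 -50
```

If the formula ever changes, only the function body needs editing, which is the maintenance benefit listed above.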

ii) Combining Two Datasets into a Single Dataset

To combine datasets in R, you can use the `c()` function, which concatenates values into a single vector.

Example

Given:

Dataset A: `6, 7, 8, 9`
Dataset B: `1, 2, 4, 5`

You can combine these using `c()` as follows:

# Define datasets
A <- c(6, 7, 8, 9)
B <- c(1, 2, 4, 5)

Printed using Save ChatGPT as PDF, powered by PDFCrowd HTML to PDF API. 1/25
# Combine datasets
C <- c(A, B)

# Print the combined dataset
print(C)

Output:

[1] 6 7 8 9 1 2 4 5

Q) List and explain various functions that allow users to handle data in R workspace with
appropriate examples.

Functions to Handle Data in R Workspace

In R, the workspace is where all your objects, such as variables, data frames, vectors, and functions,
reside during your session. Below is a list of essential functions to handle data in the R workspace, along
with explanations and examples:

1. `ls()`

Description: Lists all objects currently in the workspace.


Example:

x <- 5
y <- c(1, 2, 3)
ls()

Output:

[1] "x" "y"

2. `rm()`

Description: Removes specified objects from the workspace.


Example:

rm(x) # Removes object 'x'
ls()

Output:

[1] "y"

3. `rm(list = ls())`

Description: Clears the entire workspace by removing all objects.


Example:

rm(list = ls())
ls()

Output:

character(0)

4. `get()`

Description: Retrieves the value of an object by its name.


Example:

z <- 10
get("z")

Output:

[1] 10

5. `assign()`

Description: Assigns a value to an object dynamically.


Example:

assign("new_var", 42)
print(new_var)

Output:

[1] 42

6. `exists()`

Description: Checks whether an object exists in the workspace.


Example:

exists("new_var")

Output:

[1] TRUE

7. `save()`

Description: Saves specific objects to a file for later use.


Example:

y <- c(1, 2, 3) # Recreate y, since the workspace was cleared in an earlier example
save(y, file = "my_data.RData")

8. `load()`

Description: Loads objects from a saved file back into the workspace.

Example:

load("my_data.RData")
print(y)

9. `save.image()`

Description: Saves the entire workspace to a file.


Example:

save.image(file = "workspace.RData")

10. `history()`

Description: Displays the command history of the current session.


Example:

history()

11. `gc()`

Description: Invokes garbage collection to free up memory.


Example:

gc()

12. `objects()`

Description: Similar to `ls()`, lists all objects in the workspace.


Example:

objects()

13. `str()`

Description: Displays the structure of an object.


Example:

str(y)

Output:

num [1:3] 1 2 3

14. `summary()`

Description: Provides a summary of an object, such as statistical measures for a dataset.


Example:

summary(y)

Output:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    1.0     1.5     2.0     2.0     2.5     3.0

15. `dir()`

Description: Lists files in the current working directory.


Example:

dir()

These functions allow users to efficiently manage, organize, and retrieve data in the R workspace.
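
To see several of these functions working together, here is a small round-trip sketch. The object name `scores` and the temporary file path are assumptions made for illustration; `tempfile()` is used so that no existing file is overwritten.

```r
# Round trip: save an object, remove it, then load it back.
scores <- c(1, 2, 3)                     # hypothetical object
path <- tempfile(fileext = ".RData")     # temporary file path

save(scores, file = path)
rm(scores)
removed <- !exists("scores", inherits = FALSE)   # TRUE: the object is gone

load(path)
restored <- exists("scores", inherits = FALSE)   # TRUE: the object is back
print(scores)
```

The `inherits = FALSE` argument restricts `exists()` to the current environment, so the check is not affected by objects of the same name in attached packages.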

Q) Consider the following data frame given below:

course id class marks
1      11 1     53
2      12 2     56
3      13 1     75
4      14 2     48
5      15 1     69
6      16 2     84

i. Create a subset of course less than 3 by using [ ] brackets and demonstrate the output.
ii. Create a subset where the course column is less than 3 or the class equals 2 by using the
subset() function and demonstrate the output.

Q) List and discuss various types of data structures in R.

1. Subsetting the Data Frame

Given the data frame:

df <- data.frame(
course = c(1, 2, 3, 4, 5, 6),
id = c(11, 12, 13, 14, 15, 16),
class = c(1, 2, 1, 2, 1, 2),
marks = c(53, 56, 75, 48, 69, 84)
)

i) Subset of `course < 3` using `[ ]` Brackets

We can subset rows where the `course` column is less than 3:

subset1 <- df[df$course < 3, ]
print(subset1)

Output:

  course id class marks
1      1 11     1    53
2      2 12     2    56

ii) Subset with `course < 3` OR `class == 2` using `subset()`

The `subset()` function allows conditional filtering with readable syntax:

subset2 <- subset(df, course < 3 | class == 2)
print(subset2)

Output:

  course id class marks
1      1 11     1    53
2      2 12     2    56
4      4 14     2    48
6      6 16     2    84
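
The two approaches can express the same filter. The sketch below rebuilds the data frame from the question and checks that `[ ]` with a combined condition selects the same rows as `subset()`; the names `by_brackets` and `by_subset` are illustrative only.

```r
df <- data.frame(
  course = c(1, 2, 3, 4, 5, 6),
  id     = c(11, 12, 13, 14, 15, 16),
  class  = c(1, 2, 1, 2, 1, 2),
  marks  = c(53, 56, 75, 48, 69, 84)
)

# With [ ], every column must be prefixed with df$.
by_brackets <- df[df$course < 3 | df$class == 2, ]

# With subset(), column names are looked up inside the data frame.
by_subset <- subset(df, course < 3 | class == 2)

all.equal(by_brackets, by_subset)   # TRUE
rownames(by_subset)                 # "1" "2" "4" "6"
```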

2. Various Types of Data Structures in R

R supports several data structures, each suited for different types of data. Here’s a discussion of the
main types:

1. Vectors

Description: A one-dimensional data structure that holds elements of the same type (numeric,
character, logical, etc.).
Example:

vec <- c(1, 2, 3, 4)
print(vec)

Output:

[1] 1 2 3 4

2. Matrices

Description: Two-dimensional arrays where all elements must be of the same type.
Example:

mat <- matrix(1:9, nrow = 3)
print(mat)

Output:

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

3. Arrays

Description: Data structures that can have any number of dimensions, with elements of the
same type; a matrix is the two-dimensional case.
Example:

arr <- array(1:8, dim = c(2, 2, 2))
print(arr)

Output:

, , 1

     [,1] [,2]
[1,]    1    3
[2,]    2    4

, , 2

     [,1] [,2]
[1,]    5    7
[2,]    6    8

4. Data Frames

Description: Two-dimensional structures that can hold data of different types in columns.
Example:

df <- data.frame(name = c("A", "B"), age = c(25, 30))
print(df)

Output:

  name age
1    A  25
2    B  30

5. Lists

Description: A versatile data structure that can hold elements of different types, including other
lists.
Example:

lst <- list(name = "John", age = 25, marks = c(85, 90))
print(lst)

Output:

$name
[1] "John"

$age
[1] 25

$marks
[1] 85 90

6. Factors

Description: Used to represent categorical data with fixed values (levels).


Example:

gender <- factor(c("Male", "Female", "Male"))
print(gender)

Output:

[1] Male   Female Male
Levels: Female Male

7. Tibble

Description: A modern version of a data frame with enhanced features.

Example (requires the `tibble` package):

library(tibble)
tib <- tibble(name = c("A", "B"), age = c(25, 30))
print(tib)

Output:

# A tibble: 2 × 2
  name    age
  <chr> <dbl>
1 A        25
2 B        30

8. NULL

Description: Represents an empty or undefined value.


Example:

x <- NULL
print(x)

Output:

NULL

These data structures provide flexibility in data handling and are foundational to programming in R.
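
The "same type" versus "different types" distinction above can be checked directly. This is a small sketch using only base R; the object names `mixed_vec`, `mixed_list`, and `m` are made up for the example.

```r
# A vector holds one type, so mixed input is coerced to a common type.
mixed_vec <- c(1, "a", TRUE)
class(mixed_vec)            # "character"

# A list keeps each element's own type.
mixed_list <- list(1, "a", TRUE)
sapply(mixed_list, class)   # "numeric" "character" "logical"

# A matrix is a two-dimensional array.
m <- matrix(1:6, nrow = 2)
dim(m)                      # 2 3
is.array(m)                 # TRUE
```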

i. The following table shows the number of units of different products sold on
different days:

Product         Monday  Tuesday  Wednesday  Thursday  Friday
Bread           12      3        5          20        23
Milk            11      21       27         18        9
Cola Cans       20      10       1          33        6
Chocolate bars  12      6        7          4         13
Detergent       5       8        12         12        15

Create five sample numeric vectors from this data.
ii. Name and explain the operators used to form data subsets in R.
ii. Name and explain the operators used to form data subsets in R.

i) Creating Numeric Vectors from the Data

The given table is:

Product         Monday  Tuesday  Wednesday  Thursday  Friday
Bread           12      3        5          20        23
Milk            11      21       27         18        9
Cola Cans       20      10       1          33        6
Chocolate Bars  12      6        7          4         13
Detergent       5       8        12         12        15

Creating Numeric Vectors

Here, we define one vector for each product:

# Bread sales
bread <- c(12, 3, 5, 20, 23)

# Milk sales
milk <- c(11, 21, 27, 18, 9)

# Cola Cans sales
cola_cans <- c(20, 10, 1, 33, 6)

# Chocolate Bars sales
chocolate_bars <- c(12, 6, 7, 4, 13)

# Detergent sales
detergent <- c(5, 8, 12, 12, 15)

# Print vectors to verify
print(bread)
print(milk)
print(cola_cans)
print(chocolate_bars)
print(detergent)

Output:

[1] 12  3  5 20 23
[1] 11 21 27 18  9
[1] 20 10  1 33  6
[1] 12  6  7  4 13
[1]  5  8 12 12 15
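
A possible refinement, sketched here under the assumption that the five values run from Monday to Friday as in the table, is to attach the day names to a vector so that values can be looked up by day. The `days` vector is introduced only for this example.

```r
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")

bread <- c(12, 3, 5, 20, 23)
names(bread) <- days         # label each value with its day

bread["Thursday"]            # named element with value 20
sum(bread)                   # 63
names(which.max(bread))      # "Friday"
```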

ii) Operators Used to Form Data Subsets in R

R provides several operators for subsetting data. These can be applied to vectors, matrices, data frames,
and lists.

1. Indexing by Position

Use square brackets `[ ]` to specify the position of the elements you want to extract.
Example:

vec <- c(10, 20, 30, 40)
vec[2] # Second element

Output:

[1] 20

2. Indexing by Logical Vectors

Use logical conditions to extract elements that satisfy a condition.


Example:

vec <- c(10, 20, 30, 40)
vec[vec > 20] # Elements greater than 20

Output:

[1] 30 40

3. Indexing by Names

Use named elements for subsetting.


Example:

sales <- c(Monday = 10, Tuesday = 20, Wednesday = 30)
sales["Tuesday"]

Output:

Tuesday
     20

4. Using the `subset()` Function

Extracts rows or columns of a data frame based on a condition.


Example:

df <- data.frame(product = c("Bread", "Milk"), sales = c(12, 15))
subset(df, sales > 12)

Output:

  product sales
2    Milk    15

5. The `$` Operator

Used to extract a column from a data frame or a list.


Example:

df <- data.frame(product = c("Bread", "Milk"), sales = c(12, 15))
df$sales

Output:

[1] 12 15

6. Indexing Matrices

Specify row and column positions using `[row, column]`.


Example:

mat <- matrix(1:9, nrow = 3)
mat[1, 2] # First row, second column

Output:

[1] 4

7. Using `%in%`

Extracts elements that match specified values.


Example:

vec <- c(10, 20, 30, 40)
vec[vec %in% c(20, 30)]

Output:

[1] 20 30

8. Negative Indexing

Excludes specific elements by their positions.


Example:

vec <- c(10, 20, 30, 40)
vec[-2] # Exclude the second element

Output:

[1] 10 30 40

These operators make subsetting in R versatile and powerful.
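
For lists, base R also provides the double-bracket operator `[[ ]]`, which the list above does not cover. The sketch below contrasts it with `[ ]` and `$`, using a hypothetical list named `student`:

```r
student <- list(name = "John", marks = c(85, 90))

student["marks"]          # single brackets return a list of length one
class(student["marks"])   # "list"

student[["marks"]]        # double brackets return the element itself: 85 90
student$marks[2]          # $ followed by a position: 90
```

In short, `[ ]` keeps the container, while `[[ ]]` and `$` reach inside it.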

The provided table includes data for the number of units of various products sold on different days.
Here's how to address the questions:

a) Create Five Sample Numeric Vectors from the Data

From the table, each row corresponds to a product's sales over the days of the week. We can create a
numeric vector for each product:

# Numeric vectors for each product
bread <- c(12, 3, 5, 20, 23) # Sales of Bread
milk <- c(11, 21, 27, 18, 9) # Sales of Milk
cola_cans <- c(20, 10, 1, 33, 6) # Sales of Cola Cans
chocolate_bars <- c(12, 6, 7, 4, 13) # Sales of Chocolate Bars
detergent <- c(5, 8, 12, 12, 15) # Sales of Detergent

# Printing the vectors
print(bread)
print(milk)
print(cola_cans)
print(chocolate_bars)
print(detergent)

Output:

[1] 12  3  5 20 23
[1] 11 21 27 18  9
[1] 20 10  1 33  6
[1] 12  6  7  4 13
[1]  5  8 12 12 15

b) Operators Used to Form Data Subsets in R

Here is an explanation of the operators with examples for subsetting:

1. `[ ]` Brackets

Used to extract specific elements by their index position.


Example:

vec <- c(1, 2, 3, 4, 5)
vec[1:3] # Extracts first three elements

Output:

[1] 1 2 3

2. `$` Operator

Used to access specific columns in a data frame by name.


Example:

df <- data.frame(product = c("Bread", "Milk"), sales = c(12, 21))
df$sales # Access 'sales' column

Output:

[1] 12 21

3. `subset()` Function

Allows filtering rows based on conditions.


Example:

df <- data.frame(product = c("Bread", "Milk"), sales = c(12, 21))
subset(df, sales > 15) # Rows where sales > 15

Output:

  product sales
2    Milk    21

4. Logical Indexing

Selects elements based on logical conditions.


Example:

vec <- c(12, 21, 15, 18)
vec[vec > 15] # Extract elements greater than 15

Output:

[1] 21 18

5. Named Indexing

Subset elements using their names.


Example:

vec <- c(Monday = 12, Tuesday = 21, Wednesday = 15)
vec["Tuesday"] # Access value for 'Tuesday'

Output:

Tuesday
     21

6. Negative Indexing

Excludes elements by their positions.


Example:

vec <- c(1, 2, 3, 4, 5)
vec[-c(1, 3)] # Exclude first and third elements

Output:

[1] 2 4 5

7. `%in%` Operator

Selects elements that match specified values.


Example:

vec <- c("Bread", "Milk", "Cola")
vec[vec %in% c("Milk", "Cola")] # Selects "Milk" and "Cola"

Output:

[1] "Milk" "Cola"

8. Multiple Conditions

Combine conditions using logical operators (`&`, `|`).


Example:

vec <- c(12, 21, 15, 18)
vec[vec > 15 & vec < 20] # Elements greater than 15 and less than 20

Output:

[1] 18

These operators provide robust and flexible options for filtering and extracting data in R.
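
The same operators can be combined on a data frame. The following is a minimal sketch with a made-up data frame `sales`; `stringsAsFactors = FALSE` is set explicitly so the product column stays a character vector in older R versions as well.

```r
sales <- data.frame(
  product = c("Bread", "Milk", "Cola"),
  units   = c(12, 21, 15),
  stringsAsFactors = FALSE
)

# Logical condition on rows, column chosen by name.
sales[sales$units > 12, "product"]   # "Milk" "Cola"

# %in% combined with a comparison using &.
top <- sales[sales$product %in% c("Bread", "Milk") & sales$units > 15, ]
top$product                          # "Milk"
```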

Q) Describe applications of data visualization.

Q) Consider the following data frame given below:

course id class marks
1      11 1     53
2      12 2     56
3      13 1     75
4      14 2     48
5      15 1     69
6      16 2     84

i. Create a subset of course less than 5 by using [ ] brackets and demonstrate the output.
ii. Create a subset where the course column is less than 4 or the class equals 1 by using the
subset() function and demonstrate the output.

Q) i. Write a script to create a dataset named data1 in R containing the following text.
Text: 2, 3, 4, 5, 6.7, 7, 8.1, 9
ii. Explain the various functions provided by R to combine different sets of data.

Q) Applications of Data Visualization

Data visualization has widespread applications across various domains. Some key applications include:

1. Business Intelligence

Helps companies analyze sales trends, customer behavior, and financial performance.
Example: Dashboards displaying monthly sales performance.

2. Scientific Research

Visualizes experimental results, trends, and patterns in scientific data.


Example: Scatter plots to analyze the correlation between variables.

3. Healthcare Analytics

Tracks patient health trends and identifies patterns in disease outbreaks.


Example: Heatmaps to display COVID-19 case density across regions.

4. Education

Assists in understanding complex concepts by presenting data visually.


Example: Using bar charts to illustrate student performance trends.

5. Social Media and Marketing

Analyzes audience engagement, campaign performance, and demographics.


Example: Pie charts to visualize the age distribution of social media users.

6. Government and Policy-Making

Helps policymakers understand societal issues and monitor programs.


Example: Maps to visualize unemployment rates across states.

7. Big Data Analytics

Simplifies the presentation of vast and complex datasets.


Example: Using interactive dashboards for web traffic analysis.

Q) Data Frame Subsetting

Data Frame

Given data:

Course ID Class Marks
1      11 1     53
2      12 2     56
3      13 1     75
4      14 2     48
5      15 1     69
6      16 2     84

i) Subset of `course` Less than 5 Using `[ ]`

# Create the data frame
data <- data.frame(
course = c(1, 2, 3, 4, 5, 6),
id = c(11, 12, 13, 14, 15, 16),
class = c(1, 2, 1, 2, 1, 2),
marks = c(53, 56, 75, 48, 69, 84)
)

# Subset where course < 5
subset1 <- data[data$course < 5, ]

# Print the subset
print(subset1)

Output:

  course id class marks
1      1 11     1    53
2      2 12     2    56
3      3 13     1    75
4      4 14     2    48

ii) Subset Where `course` < 4 or `class` = 1 Using `subset()`

# Subset using conditions
subset2 <- subset(data, course < 4 | class == 1)

# Print the subset
print(subset2)

Output:

  course id class marks
1      1 11     1    53
2      2 12     2    56
3      3 13     1    75
5      5 15     1    69

Q) Create Dataset and Combine Data

i) Creating a Dataset

# Create a dataset named data1
data1 <- c(2, 3, 4, 5, 6.7, 7, 8.1, 9)

# Print the dataset
print(data1)

Output:

[1] 2.0 3.0 4.0 5.0 6.7 7.0 8.1 9.0

ii) Functions to Combine Datasets

R provides several functions to combine datasets:

1. `c()`:
Combines elements into a vector.
Example:

vec <- c(1, 2, 3, 4)

2. `rbind()`:
Combines datasets by rows.
Example:

df1 <- data.frame(a = c(1, 2), b = c(3, 4))
df2 <- data.frame(a = c(5, 6), b = c(7, 8))
combined <- rbind(df1, df2)

3. `cbind()`:
Combines datasets by columns.
Example:

vec1 <- c(1, 2, 3)
vec2 <- c(4, 5, 6)
combined <- cbind(vec1, vec2)

4. `merge()`:
Merges two data frames based on common columns or row names.
Example:

df1 <- data.frame(ID = c(1, 2), Value = c(10, 20))
df2 <- data.frame(ID = c(2, 3), Value = c(30, 40))
merged <- merge(df1, df2, by = "ID")

5. `list()`:
Combines objects into a list.
Example:

lst <- list(vec1 = c(1, 2), vec2 = c(3, 4))

Each function has specific use cases depending on the structure and requirement of the data.
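
To make the differences concrete, the sketch below applies `rbind()`, `cbind()`, and `merge()` to the same two data frames and inspects the shape of each result. The names `inner` and `outer` are illustrative, and `all = TRUE` is shown as one optional way to keep unmatched rows.

```r
df1 <- data.frame(ID = c(1, 2), Value = c(10, 20))
df2 <- data.frame(ID = c(2, 3), Value = c(30, 40))

dim(rbind(df1, df2))   # 4 2: rows stacked
dim(cbind(df1, df2))   # 2 4: columns placed side by side

# merge() keeps only IDs present in both data frames by default.
inner <- merge(df1, df2, by = "ID")
names(inner)           # "ID" "Value.x" "Value.y"
nrow(inner)            # 1

# all = TRUE keeps unmatched IDs and fills the gaps with NA.
outer <- merge(df1, df2, by = "ID", all = TRUE)
outer$ID               # 1 2 3
```

Because both inputs have a `Value` column, `merge()` adds the `.x` and `.y` suffixes to tell them apart.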
