Applied Data Science: Machine Problem No. 1: Data Structures
A. The following scores were obtained by a chemical engineering graduate during the recently held licensure
exams:
Day 1: Physical and Chemical Principles 62%
Day 2: Chemical Engineering Principles 81%
Day 3: General Engineering Principles 95%
The examinee’s final rating is a weighted average of the three scores: Day 1: 30%, Day 2: 40%, Day 3: 30%.
Using the variables day1, day2 and day3 for the scores, write R code to compute the examinee’s final rating
(use the variable rating). Use the paste() function to obtain an output of the following form:
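The required output form is not reproduced in this text, so the following is only a minimal sketch of the weighted-average calculation. The wording of the printed sentence and the helper name rating_text are assumptions, not part of the problem statement.

```r
# Scores from the problem statement, in percent
day1 <- 62
day2 <- 81
day3 <- 95

# Weighted average with weights 30%, 40% and 30%
rating <- 0.30 * day1 + 0.40 * day2 + 0.30 * day3

# paste() joins the pieces into one string; sep = "" avoids extra spaces.
# The sentence wording here is an assumption (the required form is not shown).
rating_text <- paste("The examinee's final rating is ", rating, "%.", sep = "")
print(rating_text)
```

Because the three weights sum to 1, the result stays on the same percent scale as the individual scores.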
B. The following profits/losses were recorded by your café for the past week:
Day         Coffee profits/losses   Tea profits/losses
Monday      Profit 14,000           Loss 8,000
Tuesday     Loss 5,000              Loss 3,000
Wednesday   Profit 2,000            Profit 10,000
Thursday    Loss 8,000              Loss 6,000
Friday      Profit 18,000           Profit 11,000
Saturday    Profit 23,000           Profit 5,000
Create separate vectors for coffee and tea. Be sure to label the elements with the names of the days. Write
code to determine and display: (a) the total daily profits/losses; (b) the total profits/losses for the week; (c)
the total profits/losses for coffee and for tea; (d) the profits/losses for coffee on Friday; (e) the average
profits/losses for tea during the midweek (Wednesday and Thursday); (f) the days on which coffee brings a profit.
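One possible approach to items (a) to (f) is sketched below using named numeric vectors. It assumes that profits are stored as positive numbers and losses as negative numbers; all variable names other than coffee and tea are illustrative.

```r
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")

# Assumption: profits are positive, losses are negative
coffee <- c(14000, -5000, 2000, -8000, 18000, 23000)
tea    <- c(-8000, -3000, 10000, -6000, 11000, 5000)
names(coffee) <- days
names(tea)    <- days

# (a) total per day: vectors are added element by element
daily_total <- coffee + tea

# (b) total for the whole week
week_total <- sum(daily_total)

# (c) totals for each product
coffee_total <- sum(coffee)
tea_total    <- sum(tea)

# (d) select a single element by its name
coffee_friday <- coffee["Friday"]

# (e) mean over a selection of named elements
tea_midweek_mean <- mean(tea[c("Wednesday", "Thursday")])

# (f) a logical comparison picks the days with a positive value
coffee_profit_days <- names(coffee[coffee > 0])

print(daily_total)
print(week_total)
print(coffee_total)
print(tea_total)
print(coffee_friday)
print(tea_midweek_mean)
print(coffee_profit_days)
```

Naming the elements lets each selection refer to a day directly instead of relying on its position in the vector.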
C. Analyze the box office performance of the Star Wars trilogies. The following table shows the box office
revenues obtained by the first trilogy per region:
Create a matrix for the data; do not forget to label the rows with the movie titles and the columns with the
regions.
Determine the total revenue for each of the movies. Add a column for the worldwide box office figures using
the cbind() function.
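The revenue table is not reproduced in this text, so the sketch below uses placeholder figures only. The row labels ("Movie 1" to "Movie 3"), the column label "non-US", and every number are hypothetical stand-ins to be replaced with the values from the table.

```r
# Hypothetical placeholder figures (NOT the actual table), in millions of USD
revenue <- c(100, 50,
              80, 40,
              60, 30)

# byrow = TRUE fills the matrix one movie (row) at a time
first_trilogy <- matrix(
  revenue,
  nrow = 3,
  byrow = TRUE,
  dimnames = list(c("Movie 1", "Movie 2", "Movie 3"),  # placeholder titles
                  c("US", "non-US"))                   # placeholder regions
)

# Total revenue per movie is the sum across each row
worldwide <- rowSums(first_trilogy)

# cbind() appends the totals as a new column named after the variable
first_trilogy_all <- cbind(first_trilogy, worldwide)
print(first_trilogy_all)
```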
Create another matrix for the data for the next three movies as follows:
Determine: (a) the total box office revenue per film; (b) the total box office revenue of the entire saga; (c) the
average US revenue for the first two movies; (d) the number of tickets sold per region, assuming that each
ticket costs $5.
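Items (a) to (d) could be approached as sketched below. Both data tables are missing from this text, so all figures, movie labels, and the "non-US" column are hypothetical placeholders. The sketch also assumes that "the first two movies" means the first two rows of the combined saga matrix.

```r
regions <- c("US", "non-US")  # "non-US" is a placeholder label

# Hypothetical placeholder figures (NOT the actual tables), in millions of USD
first_trilogy <- matrix(c(100, 50, 80, 40, 60, 30), nrow = 3, byrow = TRUE,
                        dimnames = list(c("Movie 1", "Movie 2", "Movie 3"), regions))
second_trilogy <- matrix(c(90, 110, 70, 95, 85, 120), nrow = 3, byrow = TRUE,
                         dimnames = list(c("Movie 4", "Movie 5", "Movie 6"), regions))

# Stack the two trilogies into one matrix with six rows
saga <- rbind(first_trilogy, second_trilogy)

# (a) total revenue per film
per_film <- rowSums(saga)

# (b) total revenue of the entire saga
saga_total <- sum(saga)

# (c) average US revenue for the first two movies (rows 1 and 2)
us_mean_first_two <- mean(saga[1:2, "US"])

# (d) tickets sold: revenue divided by the ticket price, element by element,
#     then summed per region (units follow the revenue units, here millions)
ticket_price <- 5
tickets <- saga / ticket_price
tickets_per_region <- colSums(tickets)

print(per_film)
print(saga_total)
print(us_mean_first_two)
print(tickets_per_region)
```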
D. Create a data frame named heroes_df to contain the following data on some graphic novel characters.
Select and print out the following: (a) complete data for Iron Man; (b) first three values of hair color; (c) data
of deceased hero/es; (d) heroes with blue eyes; (e) all heroes in order of first appearance; (f) Marvel heroes;
(g) heroes with more than 2,000 appearances in the novels.
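The character table is not reproduced in this text, so the sketch below builds heroes_df from placeholder rows. The column names, the hero names other than Iron Man, and every value are hypothetical and only illustrate the selection syntax for items (a) to (g).

```r
# Hypothetical placeholder data (NOT the actual table)
heroes_df <- data.frame(
  name             = c("Iron Man", "Hero B", "Hero C", "Hero D"),
  hair             = c("Black", "Blond", "Brown", "Red"),
  eye              = c("Blue", "Green", "Blue", "Brown"),
  alive            = c(TRUE, FALSE, TRUE, TRUE),
  first_appearance = c(1963, 1940, 1975, 1962),
  publisher        = c("Marvel", "Other", "Marvel", "Other"),
  appearances      = c(2500, 1200, 300, 4000),
  stringsAsFactors = FALSE
)

# (a) all columns for one hero: filter rows, keep every column
iron_man <- heroes_df[heroes_df$name == "Iron Man", ]

# (b) first three values of one column
first_three_hair <- heroes_df$hair[1:3]

# (c) deceased heroes: rows where alive is FALSE
deceased <- heroes_df[!heroes_df$alive, ]

# (d) heroes with blue eyes
blue_eyed <- heroes_df[heroes_df$eye == "Blue", ]

# (e) order() returns the row positions sorted by first appearance
by_first_appearance <- heroes_df[order(heroes_df$first_appearance), ]

# (f) and (g) subset() is an alternative to bracket indexing
marvel   <- subset(heroes_df, publisher == "Marvel")
frequent <- subset(heroes_df, appearances > 2000)

print(iron_man)
print(first_three_hair)
print(deceased)
print(blue_eyed)
print(by_first_appearance)
print(marvel)
print(frequent)
```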
Create a list named shining_list that contains all three variables given above. Use the names
moviename, actors, reviews.
a. Print out the vector representing the actors.
b. Print out the second element of the vector representing the actors.
c. Add the year 1980 to the list; print out the contents of the final list.
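The three variables referred to above are not reproduced in this text, so the sketch below defines placeholder versions of them. The variable names movie_title, cast and review_table, and all of their contents, are hypothetical; only the list name shining_list, the element names, and the year 1980 come from the problem statement.

```r
# Hypothetical placeholder variables (NOT the originals)
movie_title  <- "The Shining"                       # assumed title
cast         <- c("Actor A", "Actor B", "Actor C")  # placeholder names
review_table <- data.frame(
  score  = c(4.5, 4.0),
  source = c("Site 1", "Site 2"),
  stringsAsFactors = FALSE
)

# A list can hold elements of different types under chosen names
shining_list <- list(moviename = movie_title,
                     actors    = cast,
                     reviews   = review_table)

# a. the whole actors vector
print(shining_list$actors)

# b. the second element of the actors vector
print(shining_list$actors[2])

# c. assigning to a new name appends an element to the list
shining_list$year <- 1980
print(shining_list)
```

An equivalent way to add the year is c(shining_list, year = 1980), which returns a new list rather than modifying the existing one.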