Assignment Question
Employee Dataset
For this assignment, you are asked to explore the application of data analytics
techniques to the provided dataset. You must study data problems related to the
dataset, giving special consideration to the unique properties of the problem domain, and
test one or more techniques on it.
Your analysis should be deep and detailed, and it must go further than what has
already been covered in this course. You must adopt the data Exploration, Manipulation,
Transformation and Visualization concepts to guide you through the solution process. It is
very important to explain and justify the techniques that have been chosen.
You may also need to pre-process your data to get it into an appropriate format. The
assignment should involve a number of techniques, categorized under different criteria, with
a detailed exploration of the commands used for each criterion. Outline the findings,
analyze them, and justify them with an appropriate graph. A supporting document
is also needed to present the graphs and code, using R programming concepts.
This assignment will help you to explore and analyse a set of data and reconstruct it into
meaningful representations for decision making.
3.0 TYPE
Individual Assignment
This dataset contains data on the staff of an organization that could reveal
hidden issues in human resources management. The human resources department manager
has assigned you to analyse the given dataset to identify hidden problems in the
organization and provide meaningful insights for decision making.
Techniques
Explore the dataset using the data exploration, manipulation, transformation, and
visualization techniques covered in the course. As an additional feature, you must also
explore further concepts that can improve the results of the analysis.
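As a rough illustration of the four technique families, the sketch below applies each one to a tiny made-up data frame. The column names (`department`, `age`, `status`) and all values are assumptions for illustration only; they are not the columns of the assignment dataset.

```r
# Toy data; column names and values are hypothetical, not the real dataset.
staff <- data.frame(
  department = c("Sales", "Sales", "IT", "IT", "HR", "HR"),
  age        = c(24, 51, 38, 45, 29, 62),
  status     = c("ACTIVE", "TERMINATED", "ACTIVE",
                 "ACTIVE", "TERMINATED", "TERMINATED"),
  stringsAsFactors = FALSE
)

# Exploration: inspect the structure and basic statistics.
str(staff)
summary(staff$age)

# Manipulation: keep only the rows for terminated staff.
terminated <- staff[staff$status == "TERMINATED", ]

# Transformation: derive an age band from the numeric age.
staff$age_band <- cut(staff$age,
                      breaks = c(0, 30, 50, Inf),
                      labels = c("<=30", "31-50", ">50"))

# Visualization: terminations per department as a bar chart.
term_counts <- table(terminated$department)
barplot(term_counts,
        main = "Terminated staff by department (toy data)",
        xlab = "Department", ylab = "Count")
```

In a real submission each step would be justified in the report, and the choice of plot would depend on the question being asked.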
The dataset provided for this assignment relates to employees' job information and
attrition. It contains 18 columns and 49654 rows. The dataset includes the personal details
of the staff, job department, position, location, working status, and reason for termination.
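One way to confirm that an import matches this description is sketched below. The helper name `check_shape`, the file name in the commented lines, and the assumption that the data arrives as a CSV file with a header row are all hypothetical; adjust them to the file you actually receive.

```r
# Hypothetical helper: stop with an error if the data frame does not
# have the expected number of rows and columns.
check_shape <- function(df, expected_rows, expected_cols) {
  stopifnot(nrow(df) == expected_rows, ncol(df) == expected_cols)
  invisible(df)
}

# Self-contained demonstration on a small temporary CSV file.
demo_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3,
                     status = c("ACTIVE", "ACTIVE", "TERMINATED")),
          demo_path, row.names = FALSE)
demo <- read.csv(demo_path, stringsAsFactors = FALSE)
check_shape(demo, expected_rows = 3, expected_cols = 2)

# For the assignment data (the file name is an assumption):
# employees <- read.csv("employee_dataset.csv", stringsAsFactors = FALSE)
# check_shape(employees, expected_rows = 49654, expected_cols = 18)
# colSums(is.na(employees))   # count missing values per column
```

Checking the shape first makes it easier to notice a wrong separator or a truncated file before any analysis is run.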
6.0 DELIVERABLES:
The complete RScript (source code) and report must be submitted to APU
Learning Management System (Moodle).
RScript (Program Code):
o Name the file with your name and TP number.
o Start your program with two comment lines giving your name and TP number.
For example:
# NAME
# TP123456
o For each question example, give an id and explain what you want to discover.
For example:
# Question 1: Why do staff leave the company?
# Analysis 1-1: Find the relationship between job position and attrition…
# Analysis 1-2: Find the relationship between job age and ….
# Analysis 1-3: Find the relationship between …
o For each extra feature example, give an id and provide the explanation.
# Extra feature 1
# comments about the extra feature
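Put together, a script that follows these conventions might look like the sketch below. The data frame, its column names (`position`, `status`), and the choice of `prop.table()` as an extra feature are placeholders for illustration; only the comment layout is taken from the instructions above.

```r
# NAME
# TP123456

# Placeholder data; a real script would import the provided dataset instead.
staff <- data.frame(
  position = c("Cashier", "Cashier", "Manager", "Cashier"),
  status   = c("TERMINATED", "ACTIVE", "ACTIVE", "TERMINATED"),
  stringsAsFactors = FALSE
)

# Question 1: Why do staff leave the company?

# Analysis 1-1: Find the relationship between job position and attrition.
attrition_by_position <- table(staff$position, staff$status)
print(attrition_by_position)

# Extra feature 1
# prop.table() converts the counts to row-wise proportions, so positions
# with different head counts can be compared directly.
attrition_rate <- prop.table(attrition_by_position, margin = 1)
print(round(attrition_rate, 2))
```

Giving every question, analysis, and extra feature its own id in the comments lets the report refer to the matching code by number.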
B) Contents:
o Introduction and assumptions (if any)
o Data import / Cleaning / pre-processing / transformation
o Each question must start on a separate page and contain:
    Analysis techniques: data exploration / manipulation / visualization
    Screenshot of source code with the explanation.
    Screenshot of output/plot with the explanation.
    Outline of the findings based on the results obtained.
o The extra feature explanation must be on a separate page and contain:
    Screenshot of source code with the explanation.
    Screenshot of output/plot with the explanation.
    Explanation of how adding this extra feature can improve the results.
C) Conclusion
D) References
The report must be set in 12pt Times New Roman. Full source code
must not be included in the report. The report must be typed and
clearly printed.
You may source algorithms and information from the Internet or
books. Proper referencing of the resources should be evident in the
document.
All references must be made using the APA (American
Psychological Association) referencing style as shown below:
The theory was first propounded in 1970 (Larsen, A.E. 1971), but
since then has been refuted; M.K. Larsen (1983) is among those most
energetic in their opposition……….
/**
* Following source code obtained from (Danang, S.N. 2002)
*/
int noshape=2;
noshape=GetShape();