Data Analytics Syllabus
● R (Caret, RWeka)
Course Description:
This course is designed for students who have no previous knowledge of data
analytics but wish to acquire these skills in a short period of time. These students
will learn how to analyze large data sets and identify patterns that will improve
any company’s or organization’s decision-making process. After completing the
course, they will be able to:
At the end of the program, you will have a professional portfolio of projects
and real experience with data analysis that will give you the necessary confidence
to be successful as a Data Analyst.
Course Objectives:
Almost every company and organization collects data about their operations to
better understand how to make internal improvements. As the amount of data
collected increases, it becomes more difficult to analyze this data manually. There is a
growing tendency at bigger companies to automate the collection of large
quantities of data (Big Data) to discover behavior patterns and better understand
their internal processes.
The analysis of collected data (Data Mining) has several applications, including reducing
the amount of time needed to make decisions and cutting error margins. The
predictive capabilities of data analytics can improve a company’s marketing, help it
understand customer behavior, and prevent fraud. Data Mining is therefore applicable to
every department in any company.
Course Pre-requisites:
Those interested in Data Analytics should have some prior experience in (or be
willing to work with):
This course is well suited to those with a degree in the Social or Natural Sciences,
Engineering, or Mathematics.
Course Grading:
● attendance (40%)
● programming exercises and deliverables at the end of every task (40%)
● final presentation (20%)
Per our methodology, there will be no exams and no lectures. Students commit
to attending class, working on their assigned tasks, and delivering them
via our online platform according to the program schedule.
Course Details:
Week 1-3 In this course students will be working for Blackwell Electronics as
data analysts. The students' job is to use data mining and
machine-learning techniques to investigate the patterns in
Blackwell's sales data and provide insight into customer buying
trends and preferences. The inferences students draw from the
patterns in the data will help the business make data-driven
decisions about sales and marketing activities and understand the
In this course, students will use both Weka and the R statistical
programming language augmented with machine learning packages
to predict which of the potential new products the sales team is
considering adding to Blackwell's current product mix will be the
most profitable. Next, students will create a model to predict which
brand of computer products Blackwell customers prefer based on
customer demographics collected from a marketing survey. Finally,
students will present to management, explaining their insights and
suggestions for data mining process improvements.
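
The brand-preference task described above can be illustrated with a minimal sketch. It is written in Python using only the standard library, although the course itself uses Weka and R. The survey data, the feature names (age, salary), the brand labels "A" and "B", and the choice of a k-nearest-neighbours classifier are all assumptions made for illustration; they are not the course's actual data set or method.

```python
import random


def make_survey(n, seed=0):
    """Generate a synthetic survey of (age, salary) -> preferred brand.

    Every value here is made up for illustration only.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.uniform(20, 80)
        salary = rng.uniform(20_000, 150_000)
        # Hypothetical rule: higher earners prefer brand "A".
        brand = "A" if salary > 85_000 else "B"
        # Flip roughly 10% of the labels to imitate noisy survey answers.
        if rng.random() < 0.1:
            brand = "B" if brand == "A" else "A"
        rows.append(((age, salary), brand))
    return rows


def scale(features, mins, maxs):
    """Min-max scale a feature tuple so that no single feature dominates."""
    return tuple((x - lo) / (hi - lo) for x, lo, hi in zip(features, mins, maxs))


def knn_predict(train, query, k=5):
    """Return the majority label among the k training rows closest to query."""
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )[:k]
    votes = [label for _, label in nearest]
    return max(sorted(set(votes)), key=votes.count)


def evaluate(n=400, k=5, seed=0):
    """Train on 75% of the survey and return accuracy on the held-out 25%."""
    rows = make_survey(n, seed)
    split = int(0.75 * n)
    train_raw, test_raw = rows[:split], rows[split:]
    # Compute the scaling ranges from the training rows only.
    columns = list(zip(*(features for features, _ in train_raw)))
    mins = [min(column) for column in columns]
    maxs = [max(column) for column in columns]
    train = [(scale(f, mins, maxs), label) for f, label in train_raw]
    correct = sum(
        knn_predict(train, scale(f, mins, maxs), k) == label
        for f, label in test_raw
    )
    return correct / len(test_raw)


if __name__ == "__main__":
    print(f"Held-out accuracy: {evaluate():.2f}")
```

Holding back part of the survey and scoring predictions only on those unseen rows gives an estimate of how the model might behave on new customers; accuracy measured on the training rows themselves would tend to be optimistic.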
Week 7-12 You will learn the algorithmic and organizational skills required to
scale data analysis to large server farms, computing clouds, and the
web, including an understanding of the design and implementation
differences between single-computer and cloud-scale programs,
analytics, and data processing. You will also gain a deep knowledge
of predictive data analysis, ranging from discovering patterns and
correlations in data to making predictions and estimating their
accuracy. As part of this process, you will master fundamentals of
These skills are applicable to every big data project and will enable
you to present yourself as a key asset in solving business problems
that require data analysis. After successfully completing this course,
you will be able to:
Designed by:
This program was designed in collaboration with Dr. Jaime Carbonell and Dr.
Ravi Starzl of Carnegie Mellon University.
Dr. Jaime Carbonell is the Director of the Language Technologies Institute and
Allen Newell Professor of Computer Science at Carnegie Mellon University, where
he pioneered the PhD and MS degrees in Language Technologies. His current
research includes machine learning, computational proteomics, data mining
(primarily in healthcare and finance), text mining and machine translation. Dr.
Carbonell has served on multiple governmental advisory committees such as the
Human Genome Committee of the National Institutes of Health, the Oak Ridge
National Laboratory Scientific Advisory Board, the National Institute of
Standards and Technology Interactive Systems Scientific Advisory Board, and the
German Research Center for Artificial Intelligence (DFKI) Scientific Advisory Board. He has
published more than 300 papers and books, and supervised more than 45 PhD
dissertations. He received SB degrees in Physics and Mathematics from MIT, and
MS and PhD degrees in Computer Science from Yale University.
Dr. Ravi Starzl is a Systems Scientist in the Language Technologies Institute at
Carnegie Mellon University. He is an expert in the computational analysis and
modeling of complex information-driven systems, with experience in such diverse
domains as biological systems, financial systems, and Internet topologies. Dr.
Starzl has extensive experience with the computational and mathematical
methods integral to the effective acquisition, management, and utilization of
extremely large amounts of information. Having led several massive data analysis
projects with data sets as large as 100+TB, Dr. Starzl is fluent in the methods of Big
Data analytics, and is an active researcher in the area of parallelization of machine
learning methods for Big Data. He received his doctoral degree in Language and
Information Technologies from the School of Computer Science at Carnegie
Mellon University in 2012. In addition to his research at CMU, Dr. Starzl develops
and teaches classes on the topics of Big Data, biotechnology, and advanced
software development. He has also participated in the founding, growth, and sale
of several biotechnology and high-tech startups.
Programme methodology:
You will be working in a team, as is usually the case in the world of work. A
comprehensive range of supporting material and the help needed to complete the
tasks are available online. You will be supervised by a tutor, who is always
available to answer any questions or clear up any doubts, and who will assess your
performance and advise on the project "deliverables".
Tutors:
At the start of the course you are assigned a tutor who is a professional expert.
The tutor will foster teamwork and promote discussion of issues, helping you find
solutions and resolve the difficulties of the project by drawing on your own
resources.
They will give you feedback on the “deliverables” for each project at every
stage, so that you can continually refine them, learning from your mistakes and
achieving the mastery needed for each task.