This basic introduction to data science covers big data and machine learning, including supervised and unsupervised machine learning. The presentation also covers the Hadoop tool and its landscape. It will help you decide where to start your career in data science and outlines the skills you need to build a career in the data science industry.
Data Science Introduction by Emerging India Analytics
2. Contents
About Us & Nasscom Initiatives
What is Data Analytics & How is it Used?
Types of Analytics and Life Cycle
Application of ML in the Industry
Big Data - Hadoop Development & Landscape
Where to Start - Data Analytics Skills Needed & Career Prospects
Product Offerings
Course Curriculum
Contact Information
3. Speakers:
Mrs. Rakhi Singh, Delivery Head (NASSCOM Certified Trainer)
Mr. Neeraj Gehlot, Lead - Marketing Science Group, Annalect
Mr. Mayank Jain, Big Data Developer and Analyst
Mr. Kapil Sharma, Center Head cum Trainer (Certified by Northwestern University)
4. About Us
Emerging India is promoted by professionals from IITs and IIMs, MBAs, and experts from the education and IT industries. We are one of India's fastest-growing analytics/IT consulting and training companies. We offer services in both the consulting and training domains, including NASSCOM-certified professional programs (designed to bridge the gap between academia and industry) and Data Analytics/Cyber Security/IoT/Robotics/AI/Blockchain consulting solutions. We are also a proud NASSCOM member and a NASSCOM SSC Licensed Training Partner for the northern region of India. As a NASSCOM licensed training partner, Emerging India is taking NASSCOM SSC initiatives to the next level in the field of Data Analytics to enhance the technical skills of students and working professionals.
Vision
To become a leading consulting and training provider in the fields of Data Analytics, Machine Learning, and Big Data in India and overseas.
Mission
To create value for our customers by providing consulting services, and to impart high-quality training and skill enhancement programs for employability.
5. What is Analytics?
Data on its own is useless unless you can make sense of it!
Analytics is the scientific process of transforming data into insight for making better decisions, offering new opportunities for a competitive advantage.
6. ANALYTICS LIFE CYCLE
Business Understanding
- Gathering problem information
- Defining the goal to solve the problem
- Defining expected output
- Defining hypothesis
- Defining analysis methodology
- Measuring the business value
Data Understanding
- Define variables to support hypothesis
- Cleaning & transforming the data
- Create longitudinal data/trend data
- Ingesting additional data if needed
- Build analytical data mart
Data Mining & Modeling
- Defining target variable
- Splitting data for training and validating the model
- Defining analysis time frame for training and validation
- Correlation analysis and variable selection
- Selecting right data mining algorithm
- Do validation by measuring accuracy, sensitivity, and model lift
- Data mining and modeling is an iterative process
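The validation step in the Data Mining & Modeling stage names accuracy, sensitivity, and model lift. The following is a minimal Python sketch of how these three measures could be computed on a validation set. The function names, the 0.5 score threshold, the top-20% cut-off for lift, and the toy data are all illustrative assumptions and do not come from the slides.

```python
def accuracy(actual, predicted):
    """Share of records where the predicted class equals the actual class."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def sensitivity(actual, predicted):
    """Share of actual positives (1s) that the model also predicted as positive."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    return true_pos / sum(actual)


def lift(actual, scores, top_fraction):
    """Response rate in the top-scored fraction divided by the overall response rate."""
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * top_fraction))
    top_rate = sum(a for _, a in ranked[:top_n]) / top_n
    overall_rate = sum(actual) / len(actual)
    return top_rate / overall_rate


# Hypothetical validation data: 1 = responded, 0 = did not respond.
actual = [1, 0, 1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.35, 0.2, 0.1, 0.05, 0.6]
# Assumed decision rule: a score of 0.5 or more is predicted as a responder.
predicted = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(actual, predicted)
sens = sensitivity(actual, predicted)
top_lift = lift(actual, scores, top_fraction=0.2)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} lift@20%={top_lift:.2f}")
```

A lift above 1 in the top-scored group means the model ranks likely responders ahead of the average customer, which is the property the campaign example later in the deck relies on.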
7. ANALYTICS LIFE CYCLE
Model Interpretation
- Describe the importance of each variable
- Visualize the overall model, for example by creating a decision tree
- Define business action based on the model result
Model Operationalization
- Define the model scoring period
- Integrate model result with execution system (campaign system, CRM, etc.)
- Create an operational process that is timely, consistent, and efficient
Model Monitoring
- Create monitoring process for model evaluation
- Evaluate the model based on real-world results
- Monitor and evaluate the business impact
Analytics and modeling is an iterative process. A data model will become obsolete and needs to evolve to accommodate changes in behavior.
8. Learning Objectives
• Why Machine Learning?
• What is Machine Learning?
• What is Supervised Learning?
• Applications of Supervised Learning?
• What is Unsupervised Learning?
• Applications of Unsupervised Learning?
9. Why Machine Learning?
• Why Machine Learning?
  • Everyone likes to know the future
  • Adapt and learn fast with changing scenarios
  • Act fast with changing data
• What is Machine Learning?
  • An algorithm that learns from data, identifies patterns in the data, and stores the learnings in the form of a model
  • The model is applied to predict on new data
  • It can quickly change, refresh, and enhance the model with changing data and newer datasets
10. Simple Business Scenario
Scenario
Let us assume you are working in a bank and the Chief Marketing Officer wishes to run a campaign to promote a financial product, say, an investment product.
Based on business filters, you have an eligible contactable base of 1,000,000 customers.
The cost of targeting each customer is Rs. 10/-.
It is expected that an incremental 0.5% of customers will purchase the investment product because of the campaign.
The expected revenue per customer who purchases the product is Rs. 2500/-.
13. Analytics Based Approach
High Response Segment: 25% of base, with an expected conversion rate of 1.3%
Medium Response Segment: 25% of base, with an expected conversion rate of 0.4%
Low Response Segment: 50% of base, with an expected conversion rate of 0.15%
14. Analytics Based ROMI
Segment | # Customers (A) | Exp. Conv. Rate (B) | # Conversions (C = A * B) | Cost of Targeting (D = A * 10) | Exp. Revenue (E = C * 2500) | Profit (F = E - D) | ROMI (G = F / D)
High Response Segment | 250,000 | 1.3% | 3,250 | 2,500,000 | 8,125,000 | 5,625,000 | 225%
Medium Response Segment | 250,000 | 0.4% | 1,000 | 2,500,000 | 2,500,000 | 0 | 0%
Low Response Segment | 500,000 | 0.15% | 750 | 5,000,000 | 1,875,000 | -3,125,000 | -ve
Total | 1,000,000 | 0.5% | 5,000 | 10,000,000 | 12,500,000 | 2,500,000 | 25%
Note: Cost of targeting per customer: INR 10/-; expected revenue per convert: INR 2500/-
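The ROMI table can be reproduced with a few lines of arithmetic. The sketch below uses only the segment sizes, conversion rates, cost per customer, and revenue per convert stated in the slides. The function and variable names are illustrative assumptions.

```python
COST_PER_CUSTOMER = 10       # INR, from the scenario
REVENUE_PER_CONVERT = 2500   # INR, from the scenario

# (segment, number of customers A, expected conversion rate B) from the table
SEGMENTS = [
    ("High", 250_000, 0.013),
    ("Medium", 250_000, 0.004),
    ("Low", 500_000, 0.0015),
]


def romi_row(customers, conv_rate):
    """Compute columns C to G of the table for one segment."""
    conversions = round(customers * conv_rate)    # C = A * B
    cost = customers * COST_PER_CUSTOMER          # D = A * 10
    revenue = conversions * REVENUE_PER_CONVERT   # E = C * 2500
    profit = revenue - cost                       # F = E - D
    return {
        "customers": customers,
        "conversions": conversions,
        "cost": cost,
        "revenue": revenue,
        "profit": profit,
        "romi": profit / cost,                    # G = F / D
    }


rows = {name: romi_row(customers, rate) for name, customers, rate in SEGMENTS}

# The Total row sums the additive columns and recomputes ROMI from the totals.
total = {
    key: sum(row[key] for row in rows.values())
    for key in ("customers", "conversions", "cost", "revenue", "profit")
}
total["romi"] = total["profit"] / total["cost"]

for name, row in {**rows, "Total": total}.items():
    print(f"{name:<7} profit={row['profit']:>10,} ROMI={row['romi']:.1%}")
```

Run this way, the Low Response Segment's ROMI, shown as "-ve" in the table, works out to -62.5%.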
15. Recommendation to CMO
Your recommendation to the CMO:
• Target only the High Response Segment
Benefits of your strategy:
A) It will reduce marketing cost by 75%
B) It will increase profits by 125%
C) 9X increase in ROMI
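The three benefits follow from comparing a campaign that targets the full base with one that targets only the High Response Segment. The short check below takes the cost and profit figures from the ROMI table. The dictionary and variable names are illustrative assumptions.

```python
# Cost and profit (INR) taken from the ROMI table.
full_base = {"cost": 10_000_000, "profit": 2_500_000}   # "Total" row
high_only = {"cost": 2_500_000, "profit": 5_625_000}    # "High Response Segment" row


def romi(campaign):
    """Return on marketing investment: profit divided by cost."""
    return campaign["profit"] / campaign["cost"]


cost_reduction = 1 - high_only["cost"] / full_base["cost"]        # benefit A
profit_increase = high_only["profit"] / full_base["profit"] - 1   # benefit B
romi_multiple = romi(high_only) / romi(full_base)                 # benefit C

print(f"Marketing cost reduced by {cost_reduction:.0%}")
print(f"Profit increased by {profit_increase:.0%}")
print(f"ROMI multiplied by {romi_multiple:.0f}x")
```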
17. Machine Learning Technique Categories
Supervised learning is the machine learning task of finding a function from labeled data.
Labeled data is a dataset that has one or more independent variables and a dependent variable.
Unsupervised learning is the machine learning task of exploring the data to derive inferences or insights from the dataset.
The “Target Variable” or “Labeled Class” is not present in an unsupervised learning dataset.
18. Supervised Learning
• Supervised Learning Techniques
  • Classification
  • Regression
Flow: DATA (Input Attributes + Desired Output) -> Supervised Learning Technique -> Predictive Model -> Predicted Output
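The flow from labeled data to a predictive model to a predicted output can be illustrated with a very small classifier. The sketch below uses a nearest-centroid classifier, chosen only because it fits in a few lines. The slides do not prescribe this technique, and the function names, the two input attributes, and the toy data are hypothetical.

```python
def train(inputs, labels):
    """Learn a model from labeled data: one centroid (mean input vector) per class."""
    sums, counts = {}, {}
    for x, y in zip(inputs, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(model, x):
    """Predict the class whose centroid is closest to the new input."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


# Hypothetical labeled data: two scaled input attributes per customer
# and the desired output (whether the customer responded to a past offer).
inputs = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
labels = ["Respond", "Respond", "Not-Respond", "Not-Respond"]

model = train(inputs, labels)                 # predictive model
likely = predict(model, [0.7, 0.75])          # predicted output for a new customer
unlikely = predict(model, [0.2, 0.2])
print(likely, unlikely)
```

Here `train` plays the role of the supervised learning technique, the returned dictionary is the predictive model, and `predict` produces the predicted output for customers the model has not seen.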
20. Application of Supervised Learning
• Assume you are working in a bank (say MyBank)
• The Chief Marketing Officer has assigned you the task of growing the Personal Loans Portfolio by cross-selling the loans to existing customers
• Data on past promotional campaigns and offers sent to the customers, their behavioural data, and which customers took the loan is all available to you
• This is an example where Supervised Learning can be applied
22. Some Examples of Supervised Learning Applications
Industry / Vertical | Supervised Learning Technique Applications | Labeled Class
HR | To predict whether a good employee is likely to resign or not | Resign / Not-Resign
Telecom | To classify customers who are likely to be churners | Churn / Not-Churn
Retail / Ecommerce | To find potential customers from the churned base who can be won back | Win-back Yes / No
Banking | To build a model that assigns a probability that a customer will take a product / service | Respond / Not-Respond
Insurance | To build a model to assess the likelihood of a customer not renewing his / her policy | Lapse / Not-Lapse
23. Application of Unsupervised Learning
• Assume you are working in a retail company
• You have 1 Mn loyalty members
• You have been asked to segment them based on their buying behaviour pattern
• This is an example of an Unsupervised Learning application
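Segmenting members without any labeled class is commonly done with a clustering algorithm. The sketch below is a minimal k-means implementation in plain Python, offered as one possible approach and not as the method the slides prescribe. The two behaviour attributes, the six toy members, the choice of two segments, and the simple "first k points" initialisation are all assumptions made for illustration.

```python
def kmeans(points, k, max_iter=100):
    """Group points into k clusters by repeatedly assigning each point to its
    nearest centroid and then moving each centroid to the mean of its members."""
    centroids = [list(p) for p in points[:k]]  # assumed deterministic start
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical loyalty members: (store visits per month, average basket value).
members = [(1, 20), (2, 25), (1, 22), (10, 200), (11, 210), (9, 190)]

segments, centres = kmeans(members, k=2)
print(segments)
print(centres)
```

On real loyalty data the attributes would normally be scaled to comparable ranges first, since an attribute with large values (basket value here) otherwise dominates the distance calculation.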
36. Tools Covered in Program
The program is developed keeping in mind the needs of an evolving analytics industry that requires individuals to be “JOB-READY”.
37. Key Program Highlights
• 300 hours of exhaustive instructor-led training
• Course content created by 23 leading companies in collaboration with Nasscom
• Assessment and online certification by Nasscom
• Government of India approved certificate
• Nasscom SSC official online study material
• Certified candidates will be provided 100% placement assistance
• Globally recognized certificate
• Training delivered by Nasscom-certified and experienced trainers
• Real-world projects and case studies
• Education loan facility
40. Short Term Courses - Data Science & Big Data
• Data Science with R, Python and ML
• Data Science with R and ML
• Data Science with Python and ML
• Big Data with Hadoop & Spark
• Data Visualization with Tableau
• Big Data with Hadoop
• Big Data with Spark
42. Get in Touch with Us
We would be glad to hear from you!
Our Location
H-196, 304, IInd Floor, Sector 63, Noida - 201301
Our Phone
+91 120-4169097
+91 8860599698
Email / Website
info@emergingindiagroup.com
https://www.emergingindiagroup.com
Editor's Notes
Business Analytics refers to the practice of investigating past business performance, using data and statistical models, in order to develop new insights into and understanding of future business performance. It makes extensive use of statistical and quantitative analysis, explanatory and predictive modeling, and fact-based management to drive decision-making.
Big Data cannot be converted into an asset unless it is analyzed and insights are mined from it. This is where Big Data Analytics comes into the picture.
Big Data Analytics is the process of mining useful information (i.e. relevant and useful insights from raw data) from the plethora of data being generated, in order to make smart business decisions. (This is how the word “information” differs from the word “data”, another pair of words that are used interchangeably.)
Analytics is a process of discovering, interpreting, and communicating meaningful patterns in data. It denotes a person's skill in gathering and using data to generate insights that lead to fact-based decision making.
Data-driven analytics provides opportunities to transform broad areas such as business, healthcare, and government. The application of data-driven analytics is especially valuable in areas rich with recorded information.
Analytics relies on the simultaneous application of statistics, computer programming, and operations research to measure performance.
Analytics typically favours data visualization for communicating insight.
Analytics also helps organizations use the business data they generate. It helps them describe, predict, and enhance their business performance.