Data Science Introduction by Emerging India Analytics
Contents
• About Us & Nasscom Initiatives
• What is Data Analytics & How is it Used?
• Types of Analytics and Life Cycle
• Application of ML in the Industry
• Big Data: Hadoop Development & Landscape
• Where to Start: Data Analytics Skills Needed & Career Prospects
• Product Offerings
• Course Curriculum
• Contact Information
Speakers:
• Mrs. Rakhi Singh, Delivery Head (NASSCOM Certified Trainer)
• Mr. Neeraj Gehlot, Lead-Marketing Science Group, Annalect
• Mr. Mayank Jain, Big Data Developer and Analyst
• Mr. Kapil Sharma, Center-Head cum Trainer (Certified by Northwestern University)
Vision
To become a leading consulting and training provider in the field of Data Analytics, Machine Learning, and Big Data in India and overseas.
Mission
To create value for our customers by providing consulting services, and to impart high-quality training and skill-enhancement programs for employability.
About Us
Emerging India is promoted by professionals from IITs and IIMs, MBAs, and experts from the education and IT industries. We are one of India's fastest-growing analytics/IT consulting and training companies. We offer services in both consulting and training, including NASSCOM-certified professional programs (designed to bridge the gap between academia and industry) and Data Analytics/Cyber Security/IoT/Robotics/AI/Blockchain consulting solutions. We are also a proud NASSCOM member and a NASSCOM SSC Licensed Training Partner for the northern region of India. As a NASSCOM licensed training partner, Emerging India is taking NASSCOM SSC initiatives to the next level in the field of Data Analytics to enhance the technical skills of students and working professionals.
What is Analytics?
Data on its own is useless unless you can make sense of it!
WHAT IS ANALYTICS?
The scientific process of transforming data into insight for making better decisions, offering
new opportunities for a competitive advantage
ANALYTICS LIFE CYCLE
Business Understanding
- Gathering problem information
- Defining the goal to solve the problem
- Defining the expected output
- Defining the hypothesis
- Defining the analysis methodology
- Measuring the business value
Data Understanding
- Defining variables to support the hypothesis
- Cleaning & transforming the data
- Creating longitudinal data/trend data
- Ingesting additional data if needed
- Building the analytical data mart
Data Mining & Modeling
- Defining the target variable
- Splitting data for training and validating the model
- Defining the analysis time frame for training and validation
- Correlation analysis and variable selection
- Selecting the right data mining algorithm
- Validating by measuring accuracy, sensitivity, and model lift
- Data mining and modeling is an iterative process
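The validation step of the Data Mining & Modeling stage can be sketched in a few lines of plain Python. This is a minimal illustration, not the deck's own method: the data, the 0.5 score cut-off, and the function names are hypothetical, and "lift" is assumed here to mean the response rate in the top-scored group divided by the overall response rate.

```python
def split(rows, train_percent=70):
    """Split rows into a training part and a validation part, keeping order."""
    cut = len(rows) * train_percent // 100
    return rows[:cut], rows[cut:]

def accuracy(actual, predicted):
    """Share of cases where the prediction matches the actual label."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def sensitivity(actual, predicted):
    """True positive rate: share of actual positives predicted as positive."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    return true_pos / sum(actual)

def lift(actual, scores, top_percent=10):
    """Response rate in the top-scored group divided by the overall rate."""
    ranked = sorted(zip(scores, actual), reverse=True)
    top_n = max(1, len(ranked) * top_percent // 100)
    top_rate = sum(a for _, a in ranked[:top_n]) / top_n
    return top_rate / (sum(actual) / len(actual))

# Hypothetical validation data: 1 = responded, 0 = did not respond.
actual = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.3, 0.6, 0.35, 0.05, 0.15]
predicted = [1 if s >= 0.5 else 0 for s in scores]  # assumed cut-off of 0.5

train_rows, validation_rows = split(list(range(10)))
print(accuracy(actual, predicted))
print(sensitivity(actual, predicted))
print(lift(actual, scores, top_percent=20))
```

A lift well above 1 in the top-scored group suggests the model ranks likely responders ahead of the rest, which is what a targeted campaign relies on.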
ANALYTICS LIFE CYCLE (continued)
Model Interpretation
- Describe the importance of each variable
- Visualize the overall model, for example by creating a decision tree
- Define business actions based on the model result
Model Operationalization
- Define the model scoring period
- Integrate the model result with the execution system (campaign system, CRM, etc.)
- Create an operational process that is timely, consistent, and efficient
Model Monitoring
- Create a monitoring process for model evaluation
- Evaluate the model based on real-world results
- Monitor and evaluate the business impact
Analytics and modeling is an iterative process. A data model will become obsolete over time and needs to evolve to accommodate changes in behavior.
Learning Objectives
• Why Machine Learning?
• What is Machine Learning?
• What is Supervised Learning?
• Applications of Supervised Learning
• What is Unsupervised Learning?
• Applications of Unsupervised Learning
Why Machine Learning?
• Why Machine Learning?
  • Everyone likes to know the future
  • Adapt and learn fast as scenarios change
  • Act fast as data changes
• What is Machine Learning?
  • An algorithm that learns from data, identifies patterns in the data, and stores what it has learned in the form of a Model
  • The Model is applied to predict on new data
  • The Model can be quickly changed, refreshed, and enhanced as data changes and newer datasets arrive
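The learn, predict, refresh cycle described above can be illustrated with a very small model. The sketch below assumes a straight-line model fitted by ordinary least squares; the monthly sales figures and the function names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean

def fit_line(xs, ys):
    """Learn a Model (slope, intercept) from data by ordinary least squares."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def predict(model, x):
    """Apply the stored Model to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical data: month number and sales.
months = [1, 2, 3, 4]
sales = [12.0, 14.0, 16.0, 18.0]

model = fit_line(months, sales)        # learn from the data
forecast = predict(model, 5)           # predict on new data

# A new observation arrives, so the Model is refreshed on the larger dataset.
months.append(5)
sales.append(25.0)
refreshed_model = fit_line(months, sales)
refreshed_forecast = predict(refreshed_model, 6)
print(forecast, refreshed_forecast)
```

The stored Model here is just two numbers; refreshing it means re-running the fit once the new data is in.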
Simple Business Scenario
Scenario
Let us assume you are working in a bank and the Chief Marketing Officer wishes to run a campaign to promote a financial product, say, an investment product.
Based on business filters, you have an eligible contactable base of 1,000,000 customers.
The cost of targeting each customer is Rs. 10/-.
It is expected that an incremental 0.5% of customers will purchase the investment product because of the campaign.
The expected revenue per customer who purchases the product is Rs. 2,500/-.
Campaign Return on Marketing Investment without Analytical Approach
• Target Customer Base: 1,000,000
• Cost of Targeting per Customer: INR 10/-
• Cost of Campaign = 1,000,000 * 10 = INR 10,000,000 = 10 Mn
• Expected Incremental Conversion Rate: 0.5%
• Expected Incremental Conversions = 1,000,000 * 0.5% = 5,000
• Expected Revenue per Convert: INR 2,500/-
• Expected Incremental Revenue = 5,000 * 2,500 = INR 12,500,000 = 12.5 Mn
• Expected Profit = 12.5 Mn – 10 Mn = 2.5 Mn
Campaign ROMI
Return on Marketing Investment (ROMI) = (Revenue – Cost) / Cost = (12.5 – 10) / 10 = 25%
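The ROMI arithmetic for the untargeted campaign can be reproduced with a short function. This is a minimal sketch using the figures from the scenario; the function and variable names are illustrative assumptions.

```python
def campaign_romi(customers, cost_per_customer, conversion_rate, revenue_per_convert):
    """Return cost, revenue, profit and ROMI for a campaign."""
    cost = customers * cost_per_customer
    conversions = customers * conversion_rate
    revenue = conversions * revenue_per_convert
    profit = revenue - cost
    return {
        "cost": cost,
        "conversions": conversions,
        "revenue": revenue,
        "profit": profit,
        "romi": profit / cost,  # ROMI = (Revenue - Cost) / Cost
    }

# Scenario figures: 1,000,000 customers, INR 10 each, 0.5% conversion, INR 2,500 per convert.
baseline = campaign_romi(1_000_000, 10, 0.005, 2500)
print(f"ROMI: {baseline['romi']:.0%}")
```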
Analytics Based Approach
• High Response Segment: 25% of base, with an expected conversion rate of 1.3%
• Medium Response Segment: 25% of base, with an expected conversion rate of 0.4%
• Low Response Segment: 50% of base, with an expected conversion rate of 0.15%
Analytics Based ROMI
| Segment | # Customers (A) | Exp. Conv. Rate (B) | # Conversions (C = A * B) | Cost of Targeting, INR (D = A * 10) | Exp. Revenue, INR (E = C * 2500) | Profit, INR (F = E – D) | ROMI (G = F / D) |
| High Response Segment | 250,000 | 1.3% | 3,250 | 2,500,000 | 8,125,000 | 5,625,000 | 225% |
| Medium Response Segment | 250,000 | 0.4% | 1,000 | 2,500,000 | 2,500,000 | 0 | 0% |
| Low Response Segment | 500,000 | 0.15% | 750 | 5,000,000 | 1,875,000 | -3,125,000 | -62.5% |
| Total | 1,000,000 | 0.5% | 5,000 | 10,000,000 | 12,500,000 | 2,500,000 | 25% |
Note: Cost of Targeting per customer: INR 10/-; Expected Revenue per Convert: INR 2,500/-
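The segment-level figures follow directly from the column formulas (C = A * B, D = A * 10, E = C * 2500, F = E – D, G = F / D). The sketch below recomputes them. Conversion rates are written in basis points (1.3% = 130) purely to keep the arithmetic in whole numbers, and all names are illustrative assumptions.

```python
COST_PER_CUSTOMER = 10        # INR, from the note
REVENUE_PER_CONVERT = 2500    # INR, from the note

# (segment, customers, expected conversion rate in basis points)
SEGMENTS = [
    ("High", 250_000, 130),
    ("Medium", 250_000, 40),
    ("Low", 500_000, 15),
]

def segment_row(name, customers, rate_bps):
    """Apply the table's column formulas to one segment."""
    conversions = customers * rate_bps // 10_000   # C = A * B
    cost = customers * COST_PER_CUSTOMER           # D = A * 10
    revenue = conversions * REVENUE_PER_CONVERT    # E = C * 2500
    profit = revenue - cost                        # F = E - D
    return {"segment": name, "customers": customers, "conversions": conversions,
            "cost": cost, "revenue": revenue, "profit": profit,
            "romi": profit / cost}                 # G = F / D

rows = [segment_row(*segment) for segment in SEGMENTS]

# The Total row sums the additive columns and recomputes ROMI from the sums.
total = {key: sum(row[key] for row in rows)
         for key in ("customers", "conversions", "cost", "revenue", "profit")}
total["romi"] = total["profit"] / total["cost"]

for row in rows:
    print(f"{row['segment']:<7} profit={row['profit']:>10,} ROMI={row['romi']:.1%}")
print(f"Total   profit={total['profit']:>10,} ROMI={total['romi']:.1%}")
```

The Low Response Segment loses INR 3,125,000, a ROMI of -62.5%, which is why dropping it improves the overall result.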
Recommendation to CMO
Your recommendation to the CMO:
• Target only the High Response Segment
Benefits of your strategy (compared with targeting the full base):
A) Marketing cost falls by 75% (INR 10 Mn to INR 2.5 Mn)
B) Profit rises by 125% (INR 2.5 Mn to INR 5.625 Mn)
C) ROMI rises 9X (25% to 225%)
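The three benefits can be checked by comparing the full-base campaign with a campaign aimed only at the High Response Segment. This is a small sketch using the cost and profit figures from the slides; the variable names are illustrative.

```python
def pct_change(old, new):
    """Relative change from old to new, as a fraction."""
    return (new - old) / old

# Full-base campaign versus High Response Segment only (INR).
full_cost, full_profit = 10_000_000, 2_500_000
high_cost, high_profit = 2_500_000, 5_625_000

cost_change = pct_change(full_cost, high_cost)        # marketing cost
profit_change = pct_change(full_profit, high_profit)  # profit
romi_multiple = (high_profit / high_cost) / (full_profit / full_cost)

print(f"Cost change:   {cost_change:.0%}")
print(f"Profit change: {profit_change:.0%}")
print(f"ROMI multiple: {romi_multiple:.0f}X")
```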
Supervised vs Unsupervised Learning
Machine Learning Techniques: Categories
• Supervised learning is the Machine Learning task of learning a function from labeled data
• Labeled data is a dataset that has one or more independent variables and a dependent variable
• Unsupervised learning is the Machine Learning task of exploring the data to derive inferences and insights from the dataset
• The “Target Variable” or “Labeled Class” is not present in an unsupervised learning dataset
Supervised Learning
• Supervised Learning Techniques
  • Classification
  • Regression
[Diagram: Data (input attributes and desired output) → Supervised Learning Technique → Predictive Model → Predicted Output]
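The flow from input attributes and desired output, through a learning technique, to a predictive model and a predicted output can be sketched with a very simple classifier. The example below uses 1-nearest-neighbour as a stand-in technique; the customer attributes, labels, and function names are hypothetical.

```python
import math

def train(input_attributes, desired_output):
    """For 1-nearest-neighbour, the 'model' is simply the stored labeled data."""
    return list(zip(input_attributes, desired_output))

def predict(model, attributes):
    """Return the label of the stored example closest to the new attributes."""
    nearest = min(model, key=lambda pair: math.dist(pair[0], attributes))
    return nearest[1]

# Hypothetical labeled data: (age, average balance in INR thousands) -> took the loan?
input_attributes = [(25, 20), (30, 35), (45, 150), (50, 180)]
desired_output = ["No", "No", "Yes", "Yes"]

model = train(input_attributes, desired_output)
predicted_output = predict(model, (48, 160))
print(predicted_output)
```

Because the labels are class names, this is a classification task; replacing them with numeric values and averaging the nearest neighbours would turn it into a regression task.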
Unsupervised Learning
• Unsupervised Learning Techniques
  • Dimension reduction techniques such as PCA and Factor Analysis
  • Clustering
  • Association Analysis
[Diagram: Input Data → Unsupervised Learning Technique → Output]
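Clustering, one of the techniques listed above, groups records without any target variable. The following is a minimal k-means sketch in plain Python. The customer data is hypothetical, and starting the centroids at the first k points is a simplification chosen to keep the example deterministic.

```python
import math

def kmeans(points, k, iterations=10):
    """Group points into k clusters by repeatedly reassigning and re-centering."""
    centroids = list(points[:k])  # simplification: start from the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (visits per month, average basket value in INR hundreds).
customers = [(1, 2), (9, 10), (2, 1), (8, 9), (1, 1), (10, 9)]
centroids, clusters = kmeans(customers, k=2)
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

No label is supplied anywhere; the two groups emerge from the input data alone, and interpreting them (for example as occasional versus frequent shoppers) is left to the analyst.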
Application of Supervised Learning
• Assume you are working in a bank (say MyBank)
• The Chief Marketing Officer has assigned you the task of growing the Personal Loans Portfolio by cross-selling loans to existing customers
• Data on past promotional campaigns and offers sent to customers, their behavioural data, and which customers took the loan is all available to you
• This is an example where Supervised Learning can be applied
Marketing Modeling Dataset
• Sample Predictive Modeling Dataset
Some Examples of Supervised Learning Applications
| Industry / Vertical | Supervised Learning Technique Application | Labeled Class |
| HR | To predict whether a good employee is likely to resign or not | Resign / Not-Resign |
| Telecom | To classify customers who are likely to be churners | Churn / Not-Churn |
| Retail / Ecommerce | To find potential customers from the churned base who can be won back | Win-back Yes / No |
| Banking | To build a model that assigns each customer a probability of taking a product / service | Respond / Not-Respond |
| Insurance | To build a model to assess the likelihood of a customer not renewing his / her policy | Lapse / Not-Lapse |
Application of Unsupervised Learning
• Assume you are working in a retail company
• You have 1 Mn loyalty members
• You have been asked to segment them based on their buying behaviour pattern
• This is an example of an Unsupervised Learning application
Clustering Modeling Dataset
• Sample Clustering Modeling Dataset
Big Data – Hadoop Development
• What is Big Data
• Why Big Data
• Scope of Big Data
• Industry of Big Data
• Technologies in Big Data
HADOOP
Storage (Base Layer)
• HDFS - Hadoop Distributed File System
Processing
• MapReduce
• Hive
• Sqoop
• Pig
• HBase
• Spark
HADOOP LANDSCAPE
WHERE TO START
LET'S GET OUR HANDS DIRTY
SKILLS NEEDED
DOMAIN KNOWLEDGE
Most Demanded course
Nasscom Data Science Program
Course Designed & Created by SIG
Members
Tools Covered in Program
The program is developed keeping in mind the needs of an evolving analytics industry that requires individuals to be “JOB-READY”.
Key Program Highlights
• 300 hours of exhaustive instructor-led training
• Course content created by 23 leading companies in collaboration with Nasscom
• Assessment and online certification by Nasscom
• Government of India approved certificate
• Nasscom SSC official online study material
• Certified candidates will be provided 100% placement assistance
• Globally recognized certificate
• Training delivered by Nasscom-certified and experienced trainers
• Real-world projects and case studies
• Education loan facility
NASSCOM CERTIFICATE SAMPLE
Short Term Courses
Big Data Analytics Programs
Short Term Courses - Data Science & Big Data
• DATA SCIENCE WITH R, PYTHON AND ML
• DATA SCIENCE WITH R AND ML
• DATA SCIENCE WITH PYTHON AND ML
• BIG DATA WITH HADOOP & SPARK
• DATA VISUALIZATION WITH TABLEAU
• BIG DATA WITH HADOOP
• BIG DATA WITH SPARK
Questions
Get in Touch with Us
We would be glad to hear from you!
Our Location: H-196, 304, IInd Floor, Sector 63, Noida – 201301
Our Phone: +91 120-4169097, +91 8860599698
Email / Website: info@emergingindiagroup.com, https://www.emergingindiagroup.com


Editor's Notes

  1. Business Analytics refers to the practice of investigating past business performance using data and statistical models in order to develop new insights into, and an understanding of, future business performance. It makes extensive use of statistical and quantitative analysis, explanatory and predictive modeling, and fact-based management to drive decision-making. Big Data cannot be converted into an asset unless it is analyzed and insights are mined from it. This is where Big Data Analytics comes into the picture: it is the process of mining useful information (i.e. relevant and useful insights from raw data) from the plethora of data being generated in order to make smart business decisions. (This is how the word “information” differs from the word “data”, another pair of words that are used interchangeably.) Analytics is the process of discovering, interpreting, and communicating meaningful patterns in data. It also denotes a person's skill in gathering and using data to generate insights that lead to fact-based decision making. Data-driven analytics offers opportunities to transform broad areas of business, healthcare, government, etc., and is especially valuable in areas rich with recorded information. Analytics relies on the simultaneous application of statistics, computer programming, and operations research to measure performance, and tends to favour data visualization when communicating insight. It also helps organizations use the business data they generate to describe, predict, and improve their business performance.