Data Science
Introduction and Administration
Plan
What is data science?
• Mashup of disciplines
Course requirements
• Homepage, contact details
• Syllabus
• Grade, exam, homework assignments
1. Why are you here?
Introduction: Media Buzz
Data Scientists are in high demand
Also in Academia
Demand will outpace the supply
Israel
Pays well
2. What is data science?
Technology and rising expectations
Data Science
New Discipline
Few, if any, textbooks or courses cover the discipline as a whole
Compare to Software Engineering/Computer Science in the 1970s and 1980s
Data Science is what data scientists do
Why are data science and data scientists needed?
Development of enabling technology
Rising expectations from customers
2. What is data science?
Technological developments
Declining cost of storage
Declining cost of computing
Surpassing the brain
More data can be stored and processed
Value of Big Data
Devices vs. People
Internet of Things
Next frontier: IoT
2. What is data science?
Rising expectations
Cognitive Computing
People expect systems to behave like humans
Be Adaptive
Learn as information and goals change
Be Interactive
Interact easily with people and other systems
Be Contextual
Understand meaning, exploit additional sources of information
Need to process large quantities of uncertain data of
different types (text, speech, sensors, images etc.)
Cognitive and Data Science
People want their systems/devices to
behave smarter
Personal devices
Industrial systems
More data to acquire and analyze using
more complex algorithms and technologies
3. What is data science?
Some examples
Example I: Marketing
Predicting Lifetime Value (LTV)
what for: if you can predict the characteristics
of high LTV customers, this supports customer
segmentation, identifies upsell opportunities and
supports other marketing initiatives
usage: can be both an online algorithm and a static
report showing the characteristics of high LTV customers
Example II: Logistics
Demand forecasting
How many of which items will you need, and where
will you need them? (Enables lean inventory and
prevents out-of-stock situations.)
revenue impact: supports growth and militates
against revenue leakage
usage: online algorithm and static report
Example III: Healthcare
Survival analysis
Analyze survival statistics for different patient attributes
(age, blood type, gender, etc.) and treatments
Medication (dosage) effectiveness
Analyze the effects of administering different types and dosages
of medication for a disease
Readmission risk
Predict the risk of readmission based on patient
attributes, medical history, diagnosis and treatment
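To make the survival analysis item above concrete, here is a minimal sketch of one common technique, the Kaplan-Meier estimator. The slides do not name a specific method, so the choice of estimator, the function name `kaplan_meier`, and the toy patient data are all assumptions made for illustration.

```python
def kaplan_meier(durations, observed):
    """Estimate survival probabilities with the Kaplan-Meier product formula.

    durations: follow-up time for each patient (hypothetical toy data).
    observed:  1 if the event (e.g. death) was seen, 0 if the patient was censored.
    Returns a list of (time, survival_probability) at each time with an event.
    """
    data = sorted(zip(durations, observed))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all patients that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            events += data[i][1]
            leaving += 1
            i += 1
        if events:
            # S(t) is multiplied by (1 - d/n) at each event time.
            survival *= 1 - events / n_at_risk
            curve.append((t, survival))
        # Patients with an event and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve


# Illustrative data only: five patients, two of them censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
for time, prob in curve:
    print(f"t={time}: S(t)={prob:.2f}")
```

Running the estimator separately for each patient group (for example by age band or treatment) gives curves that can be compared, which is one way to approach the per-attribute analysis the slide describes.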
Example IV: Wearable Health and Fitness
Example V: Brain Computer Interface
2. What is data science?
A Mashup of disciplines
PREDICTION OF FUTURE MOVEMENTS IN THE STOCK MARKET
• What is the next move of the S&P 500?
PREDICTING INSURANCE PURCHASE
• Will a potential customer purchase?
MARKETING OF ORANGE JUICE
• Which brand will a customer buy?
Few More Disclaimers
Very inaccurate explanation
Statistics: take a sample (data), answer questions about the
process that produced this sample
Is it a normal distribution? Estimate its mean.
Machine Learning: take a sample (data), build a model to answer
questions about future samples
Given a sample of named faces, design a model for naming a new, unseen face.
Data Mining: mine a huge data store for interesting patterns or relationships
Given a DB of transactions, apply tools and algorithms to find frequent product
bundles
Data Science: do whatever is necessary to extract value from the data
Use data to improve book sales: mine patterns, engineer recommender
systems, suggest improvements, estimate impact
No clear-cut boundaries!
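Two of the examples above are small enough to sketch in a few lines of Python. The snippet below is only an illustration under assumed toy data: the numbers, the product names, the helper `frequent_pairs`, and the `min_support` threshold are hypothetical and do not come from the course material. It uses pair counting as a simple stand-in for frequent-bundle mining.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

# Statistics: estimate the mean of the process that produced a sample.
sample = [4.8, 5.1, 5.0, 4.9, 5.2]  # made-up measurements
estimated_mean = mean(sample)

# Data mining: find product pairs that are frequently bought together.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "eggs"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]


def frequent_pairs(baskets, min_support):
    """Return product pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


pairs = frequent_pairs(transactions, min_support=2)
print("estimated mean:", round(estimated_mean, 2))
print("frequent pairs:", pairs)
```

Real transaction databases are far too large for brute-force pair counting over every basket, which is why dedicated algorithms and tools exist for this task.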
Disclaimer: Math in the course
All computations are performed by the computer
You are responsible for interpreting the numbers
So you’ll have to understand the logic behind the numbers
You’ll see a significant number of formulas during the course
Mostly arithmetic, matrices and probability
You are not expected to memorize or derive each
formula (with exceptions), but you are expected to
understand its meaning and use