Data drives decisions and actions. Machine learning uses data to build models that can predict unknown data. The machine learning process involves getting data, preparing it by extracting features, training a model on known data and labels, and evaluating the model's performance on predicting labels of unknown data. These trained models can then be deployed as web services to power applications.
13. Blobs and Tables
Hadoop (HDInsight)
Relational DB (Azure SQL DB)
Data Clients
Model is now a web service that is callable
Monetize the API through our marketplace
API
Integrated development environment for Machine Learning
ML STUDIO
16. Classify a news article as (politics, sports, technology, health, …)
Politics Sports Tech Health
Using known data, develop a model to predict unknown data.
17. Using known data, develop a model to predict unknown data.
Documents Labels
Tech
Health
Politics
Politics
Sports
Documents consist of unstructured text. Machine learning typically assumes a more structured format of examples.
Process the raw data
18. Using known data, develop a model to predict unknown data.
Documents Features Labels
Tech
Health
Politics
Politics
Sports
Process each data instance to represent it as a feature vector
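A minimal sketch of turning unstructured documents into feature vectors, assuming a simple bag-of-words (word count) representation. The slide does not name a specific featurization, so the technique and the helper names (tokenize, build_vocabulary, to_feature_vector) are illustrative assumptions, and the two sample documents are made up.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Map every distinct token in the corpus to a fixed column index."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: index for index, tok in enumerate(tokens)}


def to_feature_vector(text, vocabulary):
    """Count how often each vocabulary word occurs in the text.

    The vocabulary dict was built in sorted order, so iterating over it
    yields the columns in index order.
    """
    counts = Counter(tokenize(text))
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical sample documents, not taken from the slides.
documents = ["The team won the match", "New phone released"]
vocabulary = build_vocabulary(documents)
vectors = [to_feature_vector(doc, vocabulary) for doc in documents]
```

Every document ends up as a vector of the same length (one column per vocabulary word), which is the structured format a learning algorithm expects.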
19. Known data
Data instance, e.g.
{40, (180, 82), (11, 7), 70, …} : Healthy
Features: Age, Height/Weight, Blood Pressure, Heart Rate
Label: Healthy
Feature Vector
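The data instance on this slide can be sketched as a small record that is flattened into a numeric feature vector. The class and field names below are assumptions made for illustration, and only the four features shown on the slide are included (the elided ones are left out).

```python
from typing import List, NamedTuple, Tuple


class DataInstance(NamedTuple):
    """One known data instance: its features plus its true label."""
    age: int
    height_weight: Tuple[int, int]
    blood_pressure: Tuple[int, int]
    heart_rate: int
    label: str


def flatten_features(instance: DataInstance) -> List[float]:
    """Flatten the nested features into one flat numeric vector."""
    return [
        float(instance.age),
        float(instance.height_weight[0]),
        float(instance.height_weight[1]),
        float(instance.blood_pressure[0]),
        float(instance.blood_pressure[1]),
        float(instance.heart_rate),
    ]


# Values taken from the slide's example instance.
example = DataInstance(40, (180, 82), (11, 7), 70, "Healthy")
features = flatten_features(example)
```

The label stays separate from the feature vector: the features are the model's input, the label is what it learns to predict.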
20. Using known data, develop a model to predict unknown data.
Documents Labels
Tech
Health
Politics
Politics
Sports
Training data
Train the Model
Feature Vectors
Base Model
Adjust Parameters
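A rough sketch of what "train the model by adjusting parameters" can look like. The slide does not name an algorithm, so a perceptron is used here purely for illustration. It assumes two classes encoded as +1 and -1, and the tiny training set is made up.

```python
def train_perceptron(vectors, labels, epochs=10, learning_rate=1.0):
    """Start from a base model (all-zero parameters) and adjust the
    parameters whenever a training example is misclassified."""
    weights = [0.0] * len(vectors[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(vectors, labels):  # y is +1 or -1
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if score > 0 else -1
            if predicted != y:
                # Move the decision boundary toward the correct side.
                weights = [w + learning_rate * y * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * y
    return weights, bias


def predict(weights, bias, x):
    """Predict +1 or -1 for a single feature vector."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else -1


# Hypothetical, linearly separable training data.
train_vectors = [[2, 0], [3, 1], [0, 2], [1, 3]]
train_labels = [1, 1, -1, -1]
weights, bias = train_perceptron(train_vectors, train_labels)
```

Other algorithms adjust their parameters differently, but the loop of "predict, compare with the known label, adjust" is the common idea.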
21. Known data with true labels
Tech
Health
Politics
Politics
Sports
Model’s Performance: difference between “True Labels” and “Predicted Labels”
True labels
Tech
Health
Politics
Politics
Sports
Predicted labels
Train the Model
Split
Detach
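One simple way to quantify the difference between true and predicted labels is accuracy, the fraction of instances where the two agree. This is a sketch under assumptions: the slide does not name a metric, and the predicted labels below are hypothetical (only the true labels come from the slide).

```python
def accuracy(true_labels, predicted_labels):
    """Return the fraction of predictions that match the true labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(1 for t, p in zip(true_labels, predicted_labels) if t == p)
    return matches / len(true_labels)


true_labels = ["Tech", "Health", "Politics", "Politics", "Sports"]
# Hypothetical model output with one mistake (the fourth document).
predicted_labels = ["Tech", "Health", "Politics", "Sports", "Sports"]
score = accuracy(true_labels, predicted_labels)
```

With four of the five predictions matching, the score is 0.8.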
A) Main concepts to cover for Data Science:
Regression
Classification
Clustering
Recommendation
B) Building programmable components in Azure ML experiments
C) Working with Azure ML studio
Put simply, it is the science of data
Basic definition:
Machine learning develops algorithms for making predictions (in the statistical sense) from data
Learning models from available training data, to make good predictions on unseen test data
Highlight the keywords: Known Data, Model, Unknown Data and the Prediction (statistically…)
Go to https://www.projectoxford.ai/demo/visions#Analysis and analyze this image to find its accent color
A ready-to-use ML API published on the Azure Datamarket place. A showcase of a production ML solution in action (it can be used, e.g., on shopping websites to analyze user reviews of a product).
Test some review comments from Expedia or other online shopping sites yourself. Copy and paste user comments to see the analysis result. A higher percentage in the sentiment result means the comment is positive; a low percentage means it is negative.
https://text-analytics-demo.azurewebsites.net/
Azure ML aims to transform your data into intelligent action
Let's look at some of the features of ML Studio
Stages of developing an end-to-end Azure ML solution: first you need an AML workspace, then you build a model in AML Studio, then you publish it as a web service and as an app (see the text analytics demo in prior slides)
With the cloud, you can bring in data sources with the ease of a drop-down or drop your on-premises data set into the built-in storage space. Users can then model in our development environment – Machine Learning Studio – where we’re offering R, Python and SQLite as first-class citizens in addition to our world-class Microsoft algorithms.
The second issue – and often the primary one – is putting finished work into production in a way others can use. We’ve heard from many data scientists that they model in R on a Linux stack but then have to hand over their work to developers who need to translate that into another language to actually make it work. This time-consuming and unnecessary process has been eliminated with our system: with a click, the model is transformed into a web service endpoint that can run over any data, anywhere, and connect to any solution or client.
Next, not only can this model be put into production for your company, it can be made available for the world on our Machine Learning Marketplace. Microsoft hosts your solution and markets it for you, while you have the freedom to brand and monetize as you see fit. We also offer a number of Microsoft solutions here.
A brief overview of some keywords used in ML
The ratio of test to training data is not fixed. It depends on the model, data size, data quality, etc.
The split is random (i.e. which 80% portion of the data goes into training is chosen at random).
There is no rule that says to split into 80% and 20%; it might be 50% and 50%, etc., depending on the problem, model, and case.
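A minimal sketch of a random split, assuming the data fits in a list. The function name, the default 80% training fraction, and the fixed seed are illustrative choices, not rules.

```python
import random


def train_test_split(instances, train_fraction=0.8, seed=0):
    """Shuffle the instances and cut them into a training and a test part."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(instances)
    # A seeded generator keeps the random split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# 80/20 split of ten hypothetical instances, then a 50/50 split.
train, test = train_test_split(range(10), train_fraction=0.8)
half_a, half_b = train_test_split(range(10), train_fraction=0.5)
```

Changing train_fraction is all it takes to try a different ratio, and changing seed gives a different random portion of the data.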
If you are not happy with the model’s performance, then adjust the parameters and re-train the model, or change the algorithm (training approach) behind the method.
+ Notes from the book: AzureMachineLearning – AzureFundamentals
Many examples of predictive analytics can be found literally everywhere today in our society:
Spam/junk email filters: These are based on the content, headers, origins, and even user behaviors (for example, always delete emails from this sender).
Mortgage applications: Typically, your mortgage loan and credit worthiness is determined by advanced predictive analytic algorithm engines.
Various forms of pattern recognition: These include optical character recognition (OCR) for routing your daily postal mail, speech recognition on your smart phone, and even facial recognition for advanced security systems.
Life insurance: Examples include calculating mortality rates, life expectancy, premiums, and payouts.
Medical insurance: Insurers attempt to determine future medical expenses based on historical medical claims and similar patient backgrounds.
Liability/property insurance: Companies can analyze coverage risks for automobile and home owners based on demographics.
Credit card fraud detection: This process is based on usage and activity patterns. In the past year, the number of credit card transactions has topped 1 billion. The popularity of contactless payments via near-field communications (NFC) has also increased dramatically over the past year due to smart phone integration.
Airline flights: Airlines calculate fees, schedules, and revenues based on prior air travel patterns and flight data.
Web search page results: Predictive analytics help determine which ads, recommendations, and display sequences to render on the page.
Predictive maintenance: This is used with almost everything we can monitor: planes, trains, elevators, cars, and yes, even data centers.
Health care: Predictive analytics are in widespread use to help determine patient outcomes and future care based on historical data and pattern matching across similar patient data sets.
More samples on: https://azure.microsoft.com/en-us/documentation/articles/machine-learning-algorithm-choice/
Mention Classification, Regression, etc.