Slides from my talk at Big Data Conference 2018 in Vilnius
Doing data science today is far more difficult than it will be in the next 5-10 years. Sharing, collaborating on data science workflows in painful, pushing models into production is challenging.
Let’s explore what Azure provides to ease Data Scientists’ pains. What tools and services can we choose based on a problem definition, skillset or infrastructure requirements?
In this talk, you will learn about Azure Machine Learning Studio, Azure Databricks, Data Science Virtual Machines and Cognitive Services, with all the perks and limitations.
2. Quiz
Microsoft Machine Learning Server
Machine Learning for .NET
Azure Machine Learning Service
Azure Machine Learning Studio
Azure Databricks
Data Science Virtual Machine
SQL Server Machine Learning Services
Azure Cognitive Services
3. Inspiration for the talk
One thing about Microsoft - they have
multiple ways to solve the same problem
6. So what do you mean by saying
“Making Data Scientists Productive in Azure”?
17. Azure Cognitive Services - Summary
Key benefits:
• Minimal development effort
• Easy integration via HTTP REST
• Built-in support with other Azure services
• Containers support
18. Azure Cognitive Services - Summary
Key benefits:
• Minimal development effort
• Easy integration via HTTP REST
• Built-in support with other Azure services
• Containers support
Considerations:
• Limited customization allowed
• Limited support for Non-English languages
19. ML.NET
What is it?
An open source and cross-platform ML framework
What can you do with it?
Create custom ML models using C# or F#
without leaving the .NET ecosystem
22. ML.NET - Summary
Key benefits:
• Powers products like Microsoft Defender, Outlook, Bing, PowerBI
• Seamlessly integrates ML into .NET apps
• AutoML functionality
• Leverage TensorFlow or ONNX
Considerations:
• Limited support for other ML libraries
24. Azure Machine Learning Studio
What is it?
Drag-and-drop visual interface for ML
What can you do with it?
Build, experiment, and deploy models using
pre-configured algorithms
30. Azure Machine Learning Studio - Summary
Key benefits:
• Interactive visual interface
• Built-in Jupyter Notebooks for data exploration
• Direct deployment of trained models as web services
• Built-in support for other Azure services
31. Azure Machine Learning Studio - Summary
Key benefits:
• Interactive visual interface
• Built-in Jupyter Notebooks for data exploration
• Direct deployment of trained models as web services
• Built-in support for other Azure services
Considerations:
• Limited scalability (the maximum size of a training dataset is 10 GB)
• Online only
• Limited support for custom Python/R code
34. Azure Machine Learning Studio
Service
What is it?
Managed cloud service for ML
What can you do with it?
Train, deploy and manage models in Azure using
Python and CLI
38. Azure Machine Learning Service -
Compute Targets
• Your local computer
• Linux VM in Azure
• Azure Batch AI Cluster
• Azure Databricks
• Azure Container Instance
• Apache Spark for HDInsight
45. Azure Machine Learning Service - Summary
Key benefits:
• Central management of scripts and run history
• Run model training scripts locally, and then scale out to the cloud
• Deployment and management of models to the cloud or edge devices
• Start development locally (offline)
46. Azure Machine Learning Service - Summary
Key benefits:
• Central management of scripts and run history
• Run model training scripts locally, and then scale out to the cloud
• Deployment and management of models to the cloud or edge devices
• Start development locally (offline)
Considerations:
• Python only
• Requires some familiarity with the model management model
47. Bradley
• Data Scientist / Engineer
• Apache Spark / SQL / Python /
Scala
• Wants to spend more time
outdoors than exploring beta
version tools
Build an enterprise data lake
and data science environment
48. Azure Databricks
What is it?
Spark-based analytics platform
What can you do with it?
Build and deploy models and data workflows
54. Azure Databricks - Overview
Collaborative Workspace
• Notebooks
• User access
• Git integration
Deploy Jobs & Workflows
• Job scheduler
• Notifications & logs
• Multi-stage pipelines
Databricks Runtime
• Apache Spark
• Rest APIs
• Libraries
Security
• Single sign-on (SSO)
• Access control list (ACL)
• Secrets via Key Vault
56. Azure Databricks - Summary
Key benefits:
• Probably the most mature development environment for ML on the
Azure platform
• Nicely integrated with other Azure services
57. Azure Databricks - Summary
Key benefits:
• Probably the most mature development environment for ML on the
Azure platform
• Nicely integrated with other Azure services
Considerations:
• Online only
59. Data Science Virtual Machine
What is it?
A virtual machine with pre-installed data science tools
What can you do with it?
Develop ML solutions in a pre-configured environment
61. Azure Data Science Virtual Machine - Summary
Key benefits:
• Probably the most complete development environment for ML on the Azure platform
• Reduced time to install, manage, and troubleshoot data science tools and frameworks
• Virtual machine options include highly scalable GPU images
• A dedicated geospatial with ArcGIS distribution
Considerations:
• Online only
• You need to take care of VM management
63. Rick
• Specializes in R
• Not allowed to push data to
Azure
Create personalized treatment
based on individual health data
64. Microsoft Machine Learning Service
Server
What is it?
Cross-platform standalone server for predictive
analysis
What can you do with it?
Build and deploy models with R and Python
65. Microsoft Machine Learning Server - Overview
• A new name for Microsoft R Server
• Install on Windows / Linux / Hadoop cluster
• Deploy models as web services packaged as container images
• Satisfy security and compliance needs of any enterprise
67. Microsoft Machine Learning Server - Summary
Key benefits:
• Built on a legacy of Microsoft R Server and Revolution R Enterprise
• Advanced security options
• Deploy R and Python models as web services
68. Microsoft Machine Learning Server - Summary
Key benefits:
• Built on a legacy of Microsoft R Server and Revolution R Enterprise
• Advanced security options
• Deploy R and Python models as web services
Considerations:
• You need to deploy and manage Machine Learning Server in your
enterprise
69. SQL Server Machine Learning Services
What is it?
A built-in SQL Server feature to support machine
learning
What can you do with it?
Execute Python and R scripts with relational data
71. SQL Server Machine Learning Services - Summary
Key benefits:
• Run your scripts where the data resides and eliminate transfer of
the data across the network to another server
• Encapsulate predictive logic in a database function
72. SQL Server Machine Learning Services - Summary
Key benefits:
• Run your scripts where the data resides and eliminate transfer of
the data across the network to another server
• Encapsulate predictive logic in a database function
Considerations:
• Assumes a SQL Server database as the data tier for your application
• Limited scalability
• Long list of known issues
73. Quiz
Azure Cognitive Services
Machine Learning for .NET
Azure Machine Learning Studio
Azure Machine Learning Service
Azure Databricks
Data Science Virtual Machine
Microsoft Machine Learning Server
SQL Server Machine Learning Services
90. Welcome to
Vilnius Data Platform Meetup
• Making Data Scientists Productive in Azure
by Valdas Maksimavičius
• Building Churn Prediction Model Using Azure
Databricks, Sklearn and MLflow by Tomas Lukas Komar
• Snacks & Networking
Agenda: