Machine Learning Services with SQL Server 2017
MARK TABLADILLO PH.D.
LEAD DATA SCIENTIST, MICROSOFT
JULY 31, 2017
Microsoft
https://marketrealist.imgix.net/uploads/2017/07/Microsoft-Shares-Are-at-an-All-Time-High-2017-07-24.jpg?w=660&fit=max&auto=format
Microsoft and Open Source
• SQL Server 2017 on Linux
• Nearly 1/3 of Virtual Machines (IaaS) on Azure are Linux
  https://news.microsoft.com/bythenumbers/azure-virtual
• Purchase of RevolutionR
• R Distribution → Microsoft R Client
• R inside Azure Machine Learning, Power BI, SQL Server, Jupyter
• Python inside Azure Machine Learning, SQL Server, Jupyter
• Cloud Shell in Azure (preview): yes, we mean Bash
  https://azure.microsoft.com/en-us/features/cloud-shell/
• Microsoft now the leading contributor on GitHub
Focus
1) To describe major features of this technology for technology managers;
2) To outline use cases for architects; and
3) To provide demos for developers and data scientists.
SQL Server 2017
MAJOR FEATURES
Gartner Review October 2016
SQL Server on Linux
Possible with Drawbridge
Over 1M Docker Downloads
Whitepaper on Linux
https://info.microsoft.com/SQL
Server-on-Linux-Open-source-
enterprise-environment.html
Video: Overview of SQL Server on Linux
https://channel9.msdn.com/events/connect/2016/101
Microsoft Release Acronyms
• CTP: Community Technology Preview
• RC: Release Candidate
Versions of Microsoft SQL Server
• https://docs.microsoft.com/en-us/sql/sql-server/editions-and-components-of-sql-server-2017
• Enterprise
• Many data scientists will use the free developer version (not intended for production)
• Since we are still at RC (Release Candidate):
• Free 180-day evaluation version (Enterprise equivalent)
• Windows Docker image
• Linux Docker image
• https://www.microsoft.com/en-us/evalcenter/evaluate-sql-server-2017-ctp
Data Science & AI Certifications
https://borntolearn.mslearn.net/b/weblog/posts/microsoft-introduces-several-new-data-management-amp-analytics-certifications
Team Data Science Process
• https://github.com/Azure/Microsoft-TDSP
What is R?
#1 language for advanced analytics
2.5M+ users
Open, biggest ecosystem
• A statistics programming language
• Data analysis & visualization capabilities
• Majority of data scientists use R
• Thriving user groups worldwide
• Vibrant open source community
• 10,000+ free algorithms in CRAN
• New and recent grads use it
• Strong ties to academia feed ever-growing machine learning capabilities
• Constantly innovating
But open source R is not enterprise class
R usage growth (Rexer Data Miner Survey, 2007-2015)
• 76% of analytic professionals report using R
• 36% select R as their primary tool
• Inadequate modeling performance
• Lack of commercial support
• Complex deployment processes
• Limited data scale
Our data science tool that allows you
to do high performance analytics on
production data, running locally on
your computer.
https://microsoft.github.io/r-server-loan-chargeoff/index.html
https://docs.microsoft.com/en-us/sql/advanced-analytics/getting-started-with-machine-learning-services
O(16)N
OPERATIONALIZATION
Classified as Microsoft Confidential
• Turn R analytics into web services in one line of code
• Swagger-based REST APIs, easy to consume from any programming language, including R
• Deploy the web service server to any platform: Windows, SQL, Linux/Hadoop
• On-prem or in cloud
• Fast scoring, real time and batch
• Scaling to a grid for powerful computing with load balancing
• Diagnostic and capacity evaluation tools
• Enterprise authentication: AD/LDAP or AAD
• Secure connection: HTTPS with SSL/TLS 1.2
• Enterprise-grade high availability
[Diagram: operationalizing R analytics with Microsoft R Server]
• Easy setup: Microsoft R Server configured for operationalizing R analytics
  ▪ In-cloud or on-prem
  ▪ Adding nodes to scale
  ▪ High availability & load balancing
  ▪ Remote execution server
• Easy deployment: data scientist, Microsoft R Client (mrsdeploy package), publishService
• Easy integration: developer
• Easy consumption: data scientist, Microsoft R Client (mrsdeploy package)
Build the model first
Deploy as a web service instantly
Function            Description
publishService      Publish a predictive function as a web service
deleteService       Delete a web service
getService          Get a web service
listServices        List the published web services
serviceOption       Retrieve, set, and list the service options
updateService       Update a web service
Data scientist: generate Swagger docs for web services
# Run the following code in R
# (api is the service object, e.g. as returned by publishService or getService)
swagger <- api$swagger()
cat(swagger, file = "swagger.json", append = FALSE)
Developer: run Swagger tools to generate code
Popular Swagger tools: AutoRest or Code Generator
AutoRest.exe -CodeGenerator CSharp -Modeler Swagger -Input swagger.json -Namespace Mynamespace
Developer: write a few lines of code to consume the service
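To illustrate what a Swagger document gives a client developer, the following minimal Python sketch lists the operations described in a Swagger 2.0 document. The embedded document, the service name loanScoring, its path, and the operationId are hypothetical stand-ins for the contents of swagger.json; a real file would be read from disk with json.load instead.

```python
import json

# Hypothetical, heavily trimmed Swagger 2.0 document standing in for swagger.json.
SWAGGER_TEXT = """
{
  "swagger": "2.0",
  "info": {"title": "loanScoring", "version": "v1.0.0"},
  "paths": {
    "/api/loanScoring/v1.0.0": {
      "post": {"operationId": "scoreLoan", "summary": "Score one loan"}
    }
  }
}
"""

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options"}


def list_operations(swagger_doc):
    """Return (METHOD, path, operationId) tuples for every operation found."""
    operations = []
    for path, item in sorted(swagger_doc.get("paths", {}).items()):
        for method, details in sorted(item.items()):
            if method.lower() in HTTP_METHODS:
                operations.append(
                    (method.upper(), path, details.get("operationId", ""))
                )
    return operations


swagger_doc = json.loads(SWAGGER_TEXT)
operations = list_operations(swagger_doc)
for method, path, operation_id in operations:
    print(method, path, operation_id)
```

Code generators such as AutoRest read the same paths and operation entries to produce typed client methods, which is why the developer only needs the swagger.json file and not the R code behind the service.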
Share / reuse R code and functions
• Not just models: a data scientist can share any functional code as a service.
• Other data scientists can explore the repository to reuse those functions.
Enable model management capabilities
• A predictive web service = “Model” + “Prediction Script”
• R Server hosts all those services → central repository of models
• Each service has a version tag → model version control
• All versions are active → model rollback (to any version)
• A service can be accessed by any authorized user →
  • Model reuse
  • Model validation and monitoring by QA team
After a service is published, I can test right away whether it works as expected.
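The model management ideas on this slide (a version tag per service, all versions staying active, rollback to any version) can be sketched with a toy in-memory registry. This is an illustration in Python only, not how R Server stores services; the class ServiceRegistry, its methods, and the riskScore service are hypothetical names.

```python
class ServiceRegistry:
    """Toy in-memory registry: every published version stays available."""

    def __init__(self):
        self._services = {}  # name -> {version: scoring function}
        self._default = {}   # name -> version served by default

    def publish(self, name, version, predict):
        self._services.setdefault(name, {})[version] = predict
        self._default[name] = version  # newest publish becomes the default

    def roll_back(self, name, version):
        if version not in self._services.get(name, {}):
            raise KeyError(f"{name} has no version {version}")
        self._default[name] = version

    def call(self, name, *args, version=None):
        chosen = version if version is not None else self._default[name]
        return self._services[name][chosen](*args)


registry = ServiceRegistry()
registry.publish("riskScore", "v1", lambda income: 0.5)
registry.publish("riskScore", "v2", lambda income: 0.25 if income > 50000 else 0.75)

latest = registry.call("riskScore", 60000)                # served by v2
pinned = registry.call("riskScore", 60000, version="v1")  # older version still active
registry.roll_back("riskScore", "v1")
after_rollback = registry.call("riskScore", 60000)        # default is v1 again
print(latest, pinned, after_rollback)
```

Because old versions are never removed, a QA team can call a pinned version for validation while production traffic keeps using the default.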
▪ Built-in remote execution functions in R Client/R Server
▪ Generate a diff report to reconcile local and remote
▪ Execute .R scripts or interactive R commands
▪ Results come back to the local session
▪ Generate working snapshots for resume and reuse
▪ IDE agnostic
R Client (mrsdeploy package)
R Server configured to remotely execute R scripts (supports Windows Server, Linux Server, Hadoop)
▪ Execute R scripts
▪ Snapshot remote environment
▪ Log out of remote server
▪ Log in to remote server
▪ Generate diff report
▪ Reconcile environment
Snapshot Functions
createSnapshot      Create a snapshot of the remote session (workspace and working directory)
loadSnapshot        Load a snapshot from the server into the remote session (workspace and working directory)
listSnapshots       Get a list of snapshots for the current user
downloadSnapshot    Download a snapshot from the server
deleteSnapshot      Delete a snapshot from the server

Remote Objects Management
listRemoteFiles     Get a list of files in the working directory of the remote R session
deleteRemoteFile    Delete a file from the working directory of the remote R session
getRemoteFile       Copy a file from the working directory of the remote R session
putLocalFile        Copy a file from the local machine to the working directory of the remote R session
getRemoteObject     Get an object from the remote R session
putLocalObject      Put an object from the local R session and load it into the remote R session
getRemoteWorkspace  Take all objects from the remote R session and load them into the local R session
putLocalWorkspace   Take all objects from the local R session and load them into the remote R session

Remote Connection
remoteLogin         Remote login to the R Server with AD or admin credentials
remoteLoginAAD      Remote login to the R Server using Azure AD
remoteLogout        Log out of the remote session on the DeployR Server

Remote Execution
remoteExecute       Remote execution of either R code or an R script
remoteScript        Wrapper function for remote script execution
diffLocalRemote     Generate a 'diff' report between local and remote
pause               Pause the remote connection and return to the local session
resume              Return the user to the 'REMOTE >' command prompt
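The idea behind a local/remote diff report can be sketched as a comparison of two package inventories. The Python below is an illustration only: what diffLocalRemote actually compares and how it formats its report are not shown in the slides, and the function diff_environments and both inventories are hypothetical.

```python
def diff_environments(local, remote):
    """Compare two {package: version} mappings and group the differences."""
    report = {"only_local": [], "only_remote": [], "version_mismatch": []}
    for name in sorted(set(local) | set(remote)):
        if name not in remote:
            report["only_local"].append(name)
        elif name not in local:
            report["only_remote"].append(name)
        elif local[name] != remote[name]:
            report["version_mismatch"].append((name, local[name], remote[name]))
    return report


# Hypothetical package inventories for the local and remote sessions.
local_packages = {"ggplot2": "2.2.1", "dplyr": "0.7.2", "glmnet": "2.0-10"}
remote_packages = {"ggplot2": "2.2.1", "dplyr": "0.5.0", "RevoScaleR": "9.1.0"}

report = diff_environments(local_packages, remote_packages)
print(report)
```

A report like this tells the data scientist which packages to install or upgrade on either side before a script that ran locally is executed remotely.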
[Diagram: Prepare, Model, Operationalize, on-prem and in the cloud. Labels: SQL 2016; R & ScaleR models; CRAN R models; AzureML web services; R Server VMs; T-SQL/stored procedure; R Server; deploy to SQL Server 2016; deploy to Hadoop / Linux Server / Windows Server.]
[Diagram: Prepare, Model, Operationalize, on-prem. Labels: R & ScaleR models; R models.]
[Diagram: Prepare, Model, Operationalize, on-prem. Labels: SQL, HDFS; R & ScaleR models; R Server; T-SQL/stored procedure.]
Product               Platforms                     Modeling                         Operationalization
R Server for Windows  Windows Server                2012 - 2016                      Same as modeling
R Server for Linux    Red Hat Enterprise Linux      6.x and 7.x                      7.x
R Server for Linux    SUSE Enterprise               SLES 11                          Will be supported in a future release
R Server for Linux    Ubuntu                        14.04 LTS, 16.04 LTS             Same as modeling
R Server for Linux    CentOS                        6.x and 7.x                      7.x
R Server for Hadoop   Red Hat and SUSE Enterprise   RHEL 6.x and 7.x, SUSE SLES 11   RHEL 7.x
• Easily scale up a single server to a grid to handle more concurrent requests
• Load balancing across compute nodes
• A shared pool of warmed-up R shells to improve scoring performance
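Why a pool of warmed-up shells helps scoring can be sketched in a few lines of Python: the start-up cost is paid once, before traffic arrives, and each request borrows an idle shell and returns it afterwards. WarmShell and ShellPool are hypothetical stand-ins, and the doubling "model" is a placeholder, not anything R Server does.

```python
import queue


class WarmShell:
    """Stand-in for an R shell whose expensive start-up has already been paid."""

    def __init__(self, shell_id):
        self.shell_id = shell_id
        self.requests_served = 0

    def score(self, x):
        self.requests_served += 1
        return 2 * x  # placeholder for a real model prediction


class ShellPool:
    def __init__(self, size):
        self._idle = queue.Queue()
        for shell_id in range(size):  # warm every shell before traffic arrives
            self._idle.put(WarmShell(shell_id))

    def score(self, x):
        shell = self._idle.get()      # blocks while every shell is busy
        try:
            return shell.score(x)
        finally:
            self._idle.put(shell)     # hand the shell back for reuse


pool = ShellPool(size=2)
results = [pool.score(x) for x in range(5)]
print(results)
```

The pool size caps how many requests are scored at once on a node; adding nodes to the grid adds more pools behind the load balancer.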
• Health check of node configuration
• Get system status
• Trace R code execution
• Trace service execution
• Evaluate grid capacity
• Simulate traffic per service
• Configure with number of concurrent threads or latency thresholds
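A capacity evaluation of this kind can be approximated as: send a fixed number of requests from a fixed number of concurrent threads, then count how many exceed a latency threshold. The Python sketch below does that against a stub; fake_service, simulate, and all numbers are hypothetical, and a real test would call the published web service instead of sleeping.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_service(payload):
    """Stand-in for one web service call; sleeps briefly to mimic scoring time."""
    time.sleep(0.01)
    return payload


def timed_call(payload):
    start = time.perf_counter()
    fake_service(payload)
    return time.perf_counter() - start


def simulate(threads, requests, latency_threshold_s):
    """Fire `requests` calls from `threads` workers and summarise the latencies."""
    with ThreadPoolExecutor(max_workers=threads) as executor:
        latencies = list(executor.map(timed_call, range(requests)))
    slow = sum(1 for latency in latencies if latency > latency_threshold_s)
    return {
        "requests": len(latencies),
        "max_latency_s": max(latencies),
        "over_threshold": slow,
    }


summary = simulate(threads=4, requests=20, latency_threshold_s=1.0)
print(summary)
```

Repeating the run with increasing thread counts until over_threshold becomes non-zero gives a rough estimate of how much concurrent traffic one configuration can absorb.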
• Seamless integration with authentication solutions: LDAP/AD/AAD
• Secure connection: HTTPS encrypted by TLS 1.2/SSL
• Compliance with Microsoft Security Development Lifecycle
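On the consuming side, a client can insist on TLS 1.2 or newer when it connects over HTTPS. The following is a minimal client-side sketch using Python's standard ssl module; it says nothing about how the server itself is configured, and passing the context to an HTTP client is left out because no service URL appears in the slides.

```python
import ssl

# Default context: verifies the server certificate and the host name.
context = ssl.create_default_context()

# Refuse to negotiate anything older than TLS 1.2.
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.minimum_version, context.verify_mode, context.check_hostname)
```

A context built this way can be handed to standard-library HTTP clients, for example through the context argument of urllib.request.urlopen.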
• Server-level HA: introduce multiple web nodes for active-active backup/recovery via a load balancer
• Data store HA: leverage the HA capabilities of an enterprise-grade database (SQL Server or Postgres)
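Server-level HA with several active web nodes amounts to: if one node cannot serve a request, the load balancer sends it to the next one. The Python sketch below shows that failover logic with in-process stubs; the node names, NodeDown, make_node, and dispatch are hypothetical, and a production load balancer would also do health checks and traffic distribution.

```python
class NodeDown(Exception):
    """Raised by a node that cannot serve the request."""


def make_node(name, healthy):
    def handle(request):
        if not healthy:
            raise NodeDown(name)
        return f"{name} handled {request}"
    return handle


def dispatch(nodes, request, start=0):
    """Try each node once, beginning at `start`, and return the first success."""
    for offset in range(len(nodes)):
        node = nodes[(start + offset) % len(nodes)]
        try:
            return node(request)
        except NodeDown:
            continue  # fall through to the next node
    raise RuntimeError("no healthy web node available")


nodes = [make_node("web-1", healthy=False), make_node("web-2", healthy=True)]
response = dispatch(nodes, "score#1")
print(response)
```

Because every web node is active, losing one node reduces capacity but does not interrupt service, provided the shared data store is itself highly available.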
Connect
• LinkedIn
• SlideShare
• Twitter @marktabnet
Abstract
SQL Server 2017 introduces Machine Learning Services with two independent technologies: R and Python. The purpose of this presentation is 1) to describe major features of this technology for technology managers; 2) to outline use cases for architects; and 3) to provide demos for developers and data scientists.
