The document discusses Microsoft technologies that can be used for data science, including SQL Server, Azure ML, Cortana Intelligence Suite, and R Server. It provides definitions of key terms like data science, machine learning, and data mining. It also shares links to resources for learning about Microsoft's data science tools and platforms.
Microsoft Technologies for Data Science 201612
1. Microsoft Technologies for Data Science
Mark Tabladillo, Ph.D.
Lead Data Scientist (Architect)
Microsoft
December 2016: SQL Saturday BI Atlanta, GA
5. Terms Definition
Data Science
Machine Learning
Data Mining
Applied Statistics
the automated or semi-automated process of discovering patterns in data
Applied scientific method
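As a small illustration of what "discovering patterns in data" can look like in practice, the sketch below counts which pairs of items co-occur most often across a set of transactions. It is a minimal, generic example and not taken from the slides: the function name `frequent_pairs`, the `min_support` threshold, and the toy basket data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Return item pairs that co-occur in at least `min_support` transactions."""
    counts = Counter()
    for items in transactions:
        # Deduplicate and sort so each pair is counted once per transaction
        # and always appears in the same order.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Hypothetical toy data, invented for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "jam"],
    ["bread", "milk", "jam"],
]
patterns = frequent_pairs(baskets, min_support=3)
print(patterns)
```

Raising `min_support` keeps only the stronger patterns; this brute-force pair count is meant for small examples and would likely need a pruning strategy on larger data.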
13. Technology Choices
SQL SERVER ANALYSIS SERVICES: Enterprise; Business Intelligence
EXCEL ADD-IN FOR SSAS: Office 365; Office 2013 or Higher x64
SEMANTIC SEARCH: Enterprise; Business Intelligence; Standard; Web; Express with Advanced Services
MICROSOFT AZURE ML: Free (Size Limited); Paid (Web Service): Experiment + Query
F#: Open Source
SQL SERVER R SERVICES: SQL Server 2016 or higher
25. Time in Seconds vs. Number of Documents
(2011 – K. Mukerjee, T. Porter, S. Gherman – Microsoft)
http://users.cis.fiu.edu/~lzhen001/activities/KDD2011Program/docs/p213.pdf
30. Features
Editions compared: Microsoft R Open (R Distribution, Free); Microsoft R Client (Free); Microsoft R Server (Commercial)
Big Data
Microsoft R Open: In-memory bound. Can only process datasets that fit into the available memory.
Microsoft R Client: In-memory bound. Can process datasets that fit into the available memory. Operates on large volumes when connected to R Server.
Microsoft R Server: Disk scalability. Operates on bigger volumes & factors.
Speed of Analysis
Microsoft R Open: Multi-threaded when MKL is installed for non-ScaleR functions.
Microsoft R Client: Multi-threaded with MKL for non-ScaleR functions. Up to 2 threads for ScaleR functions with a local compute context.
Microsoft R Server: Full parallel threading & processing.
Enterprise Readiness
Microsoft R Open: Community support.
Microsoft R Client: Community support.
Microsoft R Server: Commercial support.
Analytic Breadth & Depth
Microsoft R Open: 8000+ open source packages.
Microsoft R Client: Leverage & optimize open source R packages plus 'Big Data'-ready ScaleR packages.
Microsoft R Server: Leverage & optimize open source R packages plus 'Big Data'-ready + Multithreaded ready ScaleR packages.
Commercial Viability
Microsoft R Open: Risk of deployment to open source.
Microsoft R Client: Free for everyone.
Microsoft R Server: Commercial licenses.
DeployR Enterprise
Microsoft R Open: Not available.
Microsoft R Client: Not available.
Microsoft R Server: Included.
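The contrast between "in-memory bound" and "disk scalability" can be illustrated with a generic chunked computation: instead of loading a whole dataset, read it a few rows at a time and keep only running totals. The sketch below is written in Python and is not how ScaleR is implemented; the function `chunked_mean`, the `chunk_size` parameter, and the sample data are hypothetical and exist only to show the idea.

```python
import csv
import io


def chunked_mean(rows, column, chunk_size=2):
    """Compute the mean of `column` while holding at most `chunk_size` values."""
    total, count, chunk = 0.0, 0, []
    for row in rows:
        chunk.append(float(row[column]))
        if len(chunk) == chunk_size:
            # Fold the finished chunk into the running totals, then discard it.
            total += sum(chunk)
            count += len(chunk)
            chunk = []
    # Fold in any rows left over in a final, partial chunk.
    total += sum(chunk)
    count += len(chunk)
    if count == 0:
        raise ValueError("no rows to average")
    return total / count


# Hypothetical data; io.StringIO stands in for a large file on disk.
data = io.StringIO("x\n1\n2\n3\n4\n5\n")
result = chunked_mean(csv.DictReader(data), "x")
print(result)
```

Because only the running sum, the count, and one chunk are held at a time, memory use depends on `chunk_size` rather than on the size of the file.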
31. Microsoft R Server Editions
Columns: Edition; Description; Install ScaleR (Doc link); Get Started (Doc link)
R Server for Hadoop: Scale your analysis transparently by distributing work across nodes without complex programming. Install ScaleR: Doc. Get Started: Doc.
R Server for Teradata DB: Run advanced analytics in-database for seamless data analysis. Install ScaleR: Doc. Get Started: Doc.
R Server for Linux: Bring predictive and prescriptive analytics power to your Linux environments. Install ScaleR: Doc. Get Started: Doc.