Abishay Rao
Partner Engineer - Google Cloud
Democratizing AI/ML with GCP
/abishayrao
Every company is talking about AI/ML today
[Figure: Company earnings call mentions of “Cloud”, “Big data”, “Artificial intelligence” and “Machine Learning”, in annual mentions. Two panels: IT companies (2007 to 2017) and sum of other industries (2008 to 2017). Source: AI Index Report 2018]
When most companies talk about AI/ML in their transformation journey, you hear terms like…
PROOF OF CONCEPT, PILOT, A/B TEST, FEASIBILITY STUDY
[Word cloud of related terms, including EXPERIMENT, TRIAL, PRELIMINARY DATA, EXPLORATION, INVESTIGATION, VERIFICATION, REVIEW and CONFIRMATION]
We are at the peak of AI’s hype
Research: ~90 AI papers per day
Reality: 12% of enterprises have deployed AI in production [1]
[1] McKinsey Global Institute, Artificial Intelligence: The Next Digital Frontier
[Figure: ML arXiv papers by year, 2009 to 2018, plotted against a Moore’s Law growth rate (2x/2 years). Axes: ML arXiv papers; relative to 2009 ML arXiv papers]
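The Moore's Law reference curve on this chart (2x every 2 years) can be worked out directly. The following is a minimal sketch, assuming the curve is normalized to 1.0 in 2009 as the "relative to 2009" axis label suggests; the function name is illustrative and not from the talk.

```python
def moores_law_factor(year: int, base_year: int = 2009, doubling_years: float = 2.0) -> float:
    """Growth factor relative to base_year for a quantity that doubles every doubling_years."""
    return 2.0 ** ((year - base_year) / doubling_years)


# Print the reference curve for a few years on the chart's x-axis.
for year in (2009, 2011, 2013, 2015, 2017, 2018):
    print(year, round(moores_law_factor(year), 1))
```

Under that assumption the reference curve reaches roughly 22.6x its 2009 value by 2018.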
AI will be synonymous with software
70’s: Decision trees. Engineers applied decision trees to drive machine outcomes.
90’s: Modeling and simulation. Faster computers and software paved the way to apply statistics to drive superior outcomes.
10’s: Deep Learning. Deep Learning ushered the possibility to solve previously unsolvable problems.
Now: Embedded Intelligence. Businesses view AI as an integral part of product development and operational efficiency.
2019.
Empower every industry to transform their business with AI
Builders
10,000X of deep learning researchers
2M ML experts
23M developers
100X M business users
How can I make a faster impact on the business?
How do I spend less time preparing data?
How do I get my models to production faster and manage their lifecycle?
How do I build and deploy my models flexibly (on-premises, GCP)?
How do I collaborate with all users?
GCP ML Stack
[Diagram; labels in reading order: Notebooks, Data Services, Models, APIs, KF Pipelines, VM Images, Reference Architectures; Storage, Compute, Preprocessing; Managed Services; Deep Learning Virtual Machines; AI Platform Training & Prediction; Tooling/SDKs; Notebook; On Prem infra; Kubeflow Core Services; DIY; AutoML; Kubeflow Fairing (Hybrid SDK); Kubeflow Pipelines; GCP Infra Options; BigQuery ML; AI Hub; K8; GKE; Educational Materials; BETA; Open Source Frameworks]
Accelerate your ML development with our unified, open and fully-managed architecture
Endpoint clients (user & device data): Web, IoT, Mobile
Ingest: PubSub or Apache Kafka
Transform: Apache Beam / Dataflow or Apache Spark
Analyze: BigQuery, AI Platform, Kubeflow
Visualize (data consumers): Data Studio or 3rd-party BI Tools
Data warehousing for AI-driven business
90’s: Data warehouses. From 1st-gen EDWs, increased data collection and analysis have helped build more data-driven businesses.
00’s: BI foundations. Data warehousing formed the foundation of reporting and business intelligence.
Now: Cloud data warehousing. BigQuery represents a fundamentally different approach to cloud data warehousing.
Next: AI foundations. We’re working to make BigQuery the foundation for organizations that will leverage machine intelligence in their businesses.
Google BigQuery forms the AI foundation
Democratize data insights: Break data silos, power apps, add read-only data sets & make query results accessible to anyone
Automate data delivery: Automated data transfer to extract data from your systems & shared data with federated querying across any Google service
Build the foundation for AI: Enterprise Data Warehouse stores the most valuable data for your company & brings AI capabilities without replicating data into storage
Tee up real-time insights: Analyze real-time business events by automatic data ingestion, which is immediately available to query in your data warehouse
Google BigQuery
Google Cloud Platform’s enterprise data warehouse for analytics
● Petabyte-scale storage and queries
● Encrypted, durable and highly available
● Real-time analytics on streaming data
● Convenience of standard SQL
● Fully managed and serverless
Days to months to create an ML model
Export data
TensorFlow or scikit-learn (only an expert data scientist can do this)
● Export small amounts of data from BQ
● Create frames of data for use with TensorFlow
● Build model
● Go back and get more data to create features, and improve performance
● Repeat. It’s hard, so you stop after a few iterations
Regression in Excel/Sheets
● Export small amounts of data from BQ
● Run linear regression
● Get a model with low accuracy due to small data for training
● Go back and get more data to create new features, and improve performance
● Repeat. It’s hard, so you stop after a few iterations
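The "run linear regression" step of the spreadsheet workflow is an ordinary least-squares fit on a small exported sample. Here is a minimal sketch in plain Python; the data points are made up for illustration and do not come from the talk.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical small export: month index vs. revenue.
months = [1, 2, 3, 4, 5]
revenue = [10.0, 12.5, 13.0, 16.5, 18.0]
slope, intercept = fit_line(months, revenue)
print(f"revenue ~= {slope:.2f} * month + {intercept:.2f}")
```

With only a handful of exported rows, the fitted line is sensitive to each point, which is the accuracy problem the slide describes.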
BigQuery ML
1. Execute ML initiatives without moving data from BigQuery
2. Iterate on models in SQL in BigQuery to increase development speed
3. Automate common ML tasks and hyperparameter tuning
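To give a feel for what iterating on models in SQL can look like, here is a minimal sketch that assembles a BigQuery ML CREATE MODEL statement as a string. The dataset, table, model and label names are hypothetical placeholders, not taken from the talk, and the statement is only printed, not sent to BigQuery.

```python
def create_model_sql(model: str, model_type: str, label: str, source_table: str) -> str:
    """Return a BigQuery ML CREATE MODEL statement as a string (not executed here)."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', input_label_cols = ['{label}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


# All names below are hypothetical placeholders.
print(create_model_sql(
    model="my_dataset.revenue_model",
    model_type="linear_reg",
    label="revenue",
    source_table="my_dataset.training_data",
))
```

Changing the SELECT clause (for example, adding a derived feature column) and re-running the statement is one way to iterate without exporting data.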
Through two lines of SQL
● Leverage BigQuery’s processing power to build a model
● Auto-tuned learning rate
● Auto-split of data into training and test
● Null imputation
● Standardization of numeric features
● One-hot encoding of strings
● Class imbalance handling
Behind the scenes
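The slides do not describe how BigQuery ML implements these steps. As a plain-Python illustration of what three of the terms in the list conventionally mean (null imputation, standardization, one-hot encoding), here is a minimal sketch; mean imputation and population standard deviation are assumptions chosen for simplicity.

```python
from statistics import mean, pstdev


def impute_nulls(values):
    """Replace None with the mean of the non-null values (assumed strategy)."""
    present = [v for v in values if v is not None]
    fill = mean(present)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def one_hot(values):
    """Map each string to a 0/1 vector over the sorted distinct categories."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


# Hypothetical raw columns with a missing value and a string feature.
ages = impute_nulls([20, None, 40])
print(ages)
print(standardize(ages))
print(one_hot(["NL", "US", "NL"]))
```

The point of the slide is that the analyst does not write any of this by hand; the sketch only makes the vocabulary concrete.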
Making ML accessible for all audiences
Use cases and skills, by audience:
Data Scientist: TensorFlow and CloudML Engine
● Build and deploy state-of-the-art custom models
● Requires deep understanding of ML and programming
Data Analyst: BigQuery ML
● Build and deploy custom models using SQL
● Requires only basic understanding of ML
Developer: AutoML and CloudML APIs
● Build and deploy Google-provided models for standard use cases
● Requires almost no ML knowledge
Supported features
● Standard SQL and UDFs within the ML queries
● Linear regression (forecasting)
● Multinomial logistic regression (classification)
● K-means clustering (segmentation)
● Model evaluation functions for standard metrics, including the ROC curve
● Model weight inspection
● Feature distribution analysis through standard functions
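The evaluation and inspection features are exposed as table functions that can be queried like any other table. A minimal sketch that prints one statement per feature follows; the model name is a hypothetical placeholder, and nothing is sent to BigQuery.

```python
# Hypothetical model name; substitute your own dataset.model.
MODEL = "my_dataset.churn_model"


def inspection_queries(model: str) -> dict:
    """Return BigQuery ML inspection statements keyed by purpose (strings only)."""
    return {
        "metrics": f"SELECT * FROM ML.EVALUATE(MODEL `{model}`)",
        "roc_curve": f"SELECT * FROM ML.ROC_CURVE(MODEL `{model}`)",
        "weights": f"SELECT * FROM ML.WEIGHTS(MODEL `{model}`)",
        "feature_info": f"SELECT * FROM ML.FEATURE_INFO(MODEL `{model}`)",
    }


for purpose, sql in inspection_queries(MODEL).items():
    print(f"-- {purpose}\n{sql};\n")
```

Note that the ROC curve only applies to classification models such as logistic regression.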
Model evaluation in the BigQuery UI
Available through your favorite BI Platform
Looker integration with BigQuery ML
● Explore data
● Create BigQuery table for model creation
● Create the model
● Evaluate model using a standard dashboard
Operationalize the ML workflow
● Easily JOIN predictions into existing dashboards
● Alerts and scheduling
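Joining predictions into an existing dashboard query can be sketched as follows. This is a minimal illustration that builds the SQL as a string; every table, model, key and label name is a hypothetical placeholder, and it assumes the key column is present in the features table so that ML.PREDICT passes it through alongside the predicted column.

```python
def predictions_join_sql(model: str, features_table: str, base_table: str,
                         key: str, label: str) -> str:
    """Return SQL that joins ML.PREDICT output onto an existing table by key."""
    return (
        f"SELECT b.*, p.predicted_{label}\n"
        f"FROM `{base_table}` AS b\n"
        f"JOIN (\n"
        f"  SELECT {key}, predicted_{label}\n"
        f"  FROM ML.PREDICT(MODEL `{model}`, (SELECT * FROM `{features_table}`))\n"
        f") AS p\n"
        f"USING ({key})"
    )


# All names below are hypothetical placeholders.
print(predictions_join_sql(
    model="my_dataset.ltv_model",
    features_table="my_dataset.customer_features",
    base_table="my_dataset.customers",
    key="customer_id",
    label="ltv",
))
```

A BI tool could then treat the result like any other query, which is what makes alerts and scheduled refreshes on predictions straightforward.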
The possibilities are endless
Retail
● Optimize inventory
● Forecast revenue
● Enable product recommendations
● Optimize staff promotions
Marketing
● Predict customer lifetime value
● Predict funnel conversion
● Personalize ads, email, webpage content
Industrial and IoT
● Forecast demand for parking, traffic, utilities, personnel
● Predict maintenance needs
● Prevent equipment downtime
Media / gaming
● Personalize content
● Predict game difficulty
● Predict player lifetime value
BQML DEMO!
Thank you!
Abishay Rao
Partner Engineer - Google Cloud
/abishayrao

Democratizing AI/ML with GCP - Abishay Rao (Google) at GoDataFest 2019
