Democratizing AI/ML with GCP - Abishay Rao (Google) at GoDataFest 2019
Every company today is talking about AI/ML, but when most companies talk about AI/ML in their transformation journey, you hear terms like Proof of Concept, Feasibility Study, Pilot, and A/B Test. We are at the peak of AI's hype, yet only 12% of enterprises have deployed AI in production. Google aims to make big data processing available to everyone, and the possibilities of BigQuery ML are endless: marketing, retail, industrial and IoT, media, gaming, and so forth.
2. Every company is talking about AI/ML today
[Figure: two line charts of annual mentions in company earnings calls of "Cloud", "Big data", "Artificial intelligence" and "Machine Learning". One panel covers IT companies (2007-2017), the other the sum of other industries (2008-2017).]
Source: AI Index Report 2018
3. When most companies talk about AI/ML in their transformation journey, you hear terms like…
PROOF OF CONCEPT
PILOT
A/B TEST
FEASIBILITY STUDY
[Word cloud of related terms, including EXPERIMENT, TRIAL, INVESTIGATION, EXPLORATION, DISCOVERY, EVIDENCE, PRELIMINARY DATA and QUANTITATIVE RESULTS]
4. We are at the peak of AI’s hype
Research: ~90 AI papers per day
Reality: 12% of enterprises have deployed AI in production [1]
[1] McKinsey Global Institute, Artificial Intelligence: The Next Digital Frontier
[Figure: ML arXiv papers per year (2009-2018), in absolute numbers and relative to 2009, compared with a Moore's Law growth rate (2x/2 years)]
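The Moore's Law reference line in the chart (2x every 2 years, relative to 2009) can be reproduced with a few lines of arithmetic. This is only a sketch of that reference curve; the function name and defaults are illustrative, and it says nothing about the actual paper counts, which are not recoverable from the transcript.

```python
def moores_law_factor(year: int, base_year: int = 2009,
                      doubling_years: float = 2.0) -> float:
    """Growth factor relative to base_year for a quantity
    that doubles every `doubling_years` years."""
    return 2.0 ** ((year - base_year) / doubling_years)


# Print the reference curve for a few years on the chart's x-axis.
for year in (2009, 2011, 2013, 2015, 2017, 2018):
    print(year, round(moores_law_factor(year), 1))
```

Under these assumptions the reference line reaches roughly 22.6x by 2018, which sits inside the chart's 0-35 relative axis.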
5. Decision trees (70's): Engineers applied decision trees to drive machine outcomes.
Modeling and simulation (90's): Faster computers and software paved the way to apply statistics to drive superior outcomes.
Deep Learning (10's): Deep Learning ushered in the possibility of solving previously unsolvable problems.
Embedded Intelligence (Now): Businesses view AI as an integral part of product development and operational efficiency.
AI will be synonymous with software
6. 2019.
Empower every industry to transform their business with AI
10,000X of deep learning researchers
2M ML experts
23M developers
100X M business users
Builders
7. How can I make a faster impact on the business?
How do I spend less time preparing data?
How do I get my models to production faster and manage their lifecycle?
How do I build and deploy my models flexibly (on-premises, GCP)?
How do I collaborate with all users?
8. GCP ML Stack
[Diagram. Labels: AI Hub; Notebooks, Data Services, Models, APIs, KF Pipelines, VM Images, Reference Architectures, Educational Materials, ...; Managed Services; AI Platform Training & Prediction; AutoML; BigQuery ML; Deep Learning Virtual Machines; Notebook; Tooling/SDKs; Kubeflow Fairing (Hybrid SDK); Kubeflow Pipelines; DIY; Kubeflow Core Services; Storage, Compute, Preprocessing; On Prem infra; GCP Infra Options; K8; GKE; Open Source Frameworks; BETA]
9. Accelerate your ML development with our unified, open and fully-managed architecture
Endpoint clients: Web, IoT, Mobile
User & device data
Ingest: PubSub or Apache Kafka
Transform: Apache Beam / Dataflow or Apache Spark
Analyze: BigQuery, AI Platform, Kubeflow
Visualize: Data Studio or 3rd-party BI Tools
Data consumers
10. Data warehouses (90's): From 1st-gen EDWs, increased data collection and analysis have helped build more data-driven businesses.
BI foundations (00's): Data warehousing formed the foundation of reporting and business intelligence.
Cloud data warehousing (Now): BigQuery represents a fundamentally different approach to cloud data warehousing.
AI foundations (Next): We're working to make BigQuery the foundation for organizations that will leverage machine intelligence in their businesses.
Data warehousing for AI-driven business
11. Google BigQuery forms the AI foundation
Automate data delivery: Automated data transfer to extract data from your systems & shared data with federated querying across any Google service
Democratize data insights: Break data silos, power apps, add read-only data sets & make query results accessible to anyone
Build the foundation for AI: Enterprise Data Warehouse stores the most valuable data for your company & brings AI capabilities without replicating data into storage
Tee up real-time insights: Analyze real-time business events by automatic data ingestion, which is immediately available to query in your data warehouse
12. Google BigQuery
Google Cloud Platform's enterprise data warehouse for analytics
Petabyte-scale storage and queries
Encrypted, durable and highly available
Real-time analytics on streaming data
Convenience of standard SQL
Fully managed and serverless
13. Days to months to create an ML model
(1) Export data
(2) Regression in Excel/Sheets
Export small amounts of data from BQ
Run linear regression
Get a model with low accuracy due to small data for training
Go back and get more data to create new features, and improve performance
Repeat. It's hard, so you stop after a few iterations
(3) TensorFlow or scikit-learn
Only an expert data scientist can do this
Export small amounts of data from BQ
Create frames of data for use with TensorFlow
Build model
Go back and get more data to create features, and improve performance
Repeat. It's hard, so you stop after a few iterations
14. BigQuery ML
(1) Execute ML initiatives without moving data from BigQuery
(2) Iterate on models in SQL in BigQuery to increase development speed
(3) Automate common ML tasks and hyperparameter tuning
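As a rough illustration of iterating on a model in SQL, the sketch below builds a BigQuery ML `CREATE MODEL` statement as a Python string. The dataset, model, table and label names are hypothetical and do not come from the talk; the statement is only printed here, not executed.

```python
def create_model_sql(model: str, model_type: str,
                     label: str, source_table: str) -> str:
    """Return a BigQuery ML CREATE MODEL statement as a string.

    All identifiers passed in are illustrative placeholders.
    """
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS (model_type = '{model_type}', "
        f"input_label_cols = ['{label}']) AS\n"
        f"SELECT * FROM `{source_table}`"
    )


# Hypothetical names: a revenue forecast trained on a table in my_dataset.
sql = create_model_sql(
    model="my_dataset.revenue_model",
    model_type="linear_reg",
    label="revenue",
    source_table="my_dataset.training_data",
)
print(sql)
```

Changing the feature set then amounts to editing the `SELECT` and re-running the statement. One way to submit it would be the `google-cloud-bigquery` client (`bigquery.Client().query(sql).result()`), assuming credentials and a project are configured.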
15. Through two lines of SQL
Behind the scenes
● Leverage BigQuery's processing power to build a model
● Auto-tuned learning rate
● Auto-split of data into training and test
● Null imputation
● Standardization of numeric features
● One-hot encoding of strings
● Class imbalance handling
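To make three of these preprocessing steps concrete, here is a minimal pure-Python sketch of null imputation, standardization and one-hot encoding. It illustrates the general techniques only; it is not BigQuery ML's implementation, and the helper names and sample data are invented for the example.

```python
from statistics import mean, pstdev


def impute_nulls(values):
    """Replace None with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def one_hot(values):
    """Map each string to a 0/1 vector over the sorted distinct categories."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


# Invented sample data: one numeric column with a null, one string column.
ages = [20, None, 40]
countries = ["NL", "DE", "NL"]

print(standardize(impute_nulls(ages)))
print(one_hot(countries))
```

The point of the slide is that the analyst does not write any of this by hand: the equivalent work happens automatically when the model is created.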
16. Making ML accessible for all audiences
Audiences: Developer, Data Analyst, Data Scientist
Use cases and skills
TensorFlow and CloudML Engine
● Build and deploy state-of-the-art custom models
● Requires deep understanding of ML and programming
BigQuery ML
● Build and deploy custom models using SQL
● Requires only basic understanding of ML
AutoML and CloudML APIs
● Build and deploy Google-provided models for standard use cases
● Requires almost no ML knowledge
17. Supported features
● Standard SQL and UDFs within the ML queries
● Linear regression (forecasting)
● Multinomial logistic regression (classification)
● K-means clustering (segmentation)
● Model evaluation functions for standard metrics, including the ROC curve
● Model weight inspection
● Feature distribution analysis through standard functions
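The evaluation and inspection features listed above are exposed as SQL table functions in BigQuery ML. The sketch below maps each one to a query string; the mapping keys, the helper name and the model name are illustrative assumptions, and the queries are only printed, not run.

```python
# BigQuery ML table functions for inspecting a trained model.
# The dictionary keys are labels chosen for this example.
INSPECTION_FUNCTIONS = {
    "evaluate": "ML.EVALUATE",          # standard evaluation metrics
    "roc_curve": "ML.ROC_CURVE",        # logistic regression models only
    "weights": "ML.WEIGHTS",            # model weight inspection
    "feature_info": "ML.FEATURE_INFO",  # per-feature statistics
}


def inspection_sql(kind: str, model: str) -> str:
    """Return a SELECT over the chosen inspection function for `model`."""
    function = INSPECTION_FUNCTIONS[kind]
    return f"SELECT * FROM {function}(MODEL `{model}`)"


# Hypothetical model name.
for kind in INSPECTION_FUNCTIONS:
    print(inspection_sql(kind, "my_dataset.churn_model"))
```

Because the results come back as ordinary tables, they can be filtered, joined or charted with the same tools used for any other query.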
20. Looker integration with BigQuery ML
Explore data
Create BigQuery table for model creation
Create the model
Evaluate model using a standard dashboard
Operationalize the ML workflow
Easily JOIN predictions into existing dashboards
Alerts and scheduling
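Joining predictions into an existing dashboard comes down to a SQL join between the output of `ML.PREDICT` and the table the dashboard already reads. The sketch below builds such a query as a string. It is an assumption about the underlying idea, not the SQL Looker generates, and every table, key and label name is hypothetical.

```python
def predictions_join_sql(model: str, input_table: str,
                         dashboard_table: str, key: str, label: str) -> str:
    """Return a query joining ML.PREDICT output onto a dashboard table.

    ML.PREDICT names its output column predicted_<label> and passes the
    input columns through, so the join key is available on both sides.
    """
    return (
        f"SELECT d.*, p.predicted_{label}\n"
        f"FROM ML.PREDICT(MODEL `{model}`, TABLE `{input_table}`) AS p\n"
        f"JOIN `{dashboard_table}` AS d USING ({key})"
    )


# Hypothetical names: add a predicted lifetime value to a customer table.
query = predictions_join_sql(
    model="my_dataset.ltv_model",
    input_table="my_dataset.customer_features",
    dashboard_table="my_dataset.customer_dashboard",
    key="customer_id",
    label="lifetime_value",
)
print(query)
```

A query like this can be scheduled so the dashboard always shows predictions from the latest model.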
21. The possibilities are endless
Retail
● Optimize inventory
● Forecast revenue
● Enable product recommendations
● Optimize staff promotions
Marketing
● Predict customer lifetime value
● Predict funnel conversion
● Personalize ads, email, webpage content
Industrial and IoT
● Forecast demand for parking, traffic, utilities, personnel
● Predict maintenance needs
● Prevent equipment downtime
Media / gaming
● Personalize content
● Predict game difficulty
● Predict player lifetime value