Open, Secure & Transparent AI
Pipelines
Nick Pentreath, Principal Engineer IBM
About
@MLnick on Twitter & Github
Principal Engineer, IBM
CODAIT - Center for Open-Source Data
& AI Technologies
Machine Learning & AI
Apache Spark committer & PMC
Author of Machine Learning with Spark
Various conferences & meetups
Center for Open Source
Data and AI Technologies
Watson West Building
505 Howard St.
San Francisco, California
CODAIT aims to make AI solutions dramatically
easier to create, deploy, and manage in the
enterprise.
Relaunch of the IBM Spark Technology Center
(STC) to reflect expanded mission.
We contribute to foundational open source
software across the enterprise AI lifecycle.
36 open-source developers!
Improving Enterprise AI Lifecycle in Open Source
CODAIT
codait.org
What is AI?
What about Machine Learning
/ Deep Learning?
Learn from data to make predictions
Learn from historical data to make
predictions about the future
Applied Machine Learning
Learn from historical data to make
predictions about the future, in order
to make decisions
Intelligent Systems
Source: https://deepmind.com/blog/alphastar-mastering-real-time-strategy-game-starcraft-ii/
Automated decision-making
Continual learning (new data & feedback)
Adapting to environment
Memory & generalization
Trust in AI
The Machine Learning
Workflow
Perception
In reality the workflow spans
teams …
… and tools …
… and is a small (but critical!)
piece of the puzzle
*Source: Hidden Technical Debt in Machine Learning Systems
Training
* Logos trademarks of their respective projects
Fabric for Deep Learning
https://developer.ibm.com/open/projects/fabric-for-deep-learning-ffdl/
https://github.com/IBM/FfDL
• Fabric for Deep Learning, or FfDL (pronounced ‘fiddle’), aims to make Deep
Learning easily accessible to Data Scientists and AI developers.
• FfDL provides a consistent way to train and visualize Deep Learning jobs
across multiple frameworks such as TensorFlow, Caffe, PyTorch and Keras.
FfDL
Community Partners
FfDL is one of InfoWorld’s 2018 Best of Open
Source Software Award winners for machine
learning and deep learning!
AI Deployment
What, Where, How?
• What are you deploying?
• What is a “model”?
• Where are you deploying?
• Target environment
• Batch, streaming, real-time?
• How are you deploying?
• “devops” deployment mechanism
• Serving framework
We will talk mostly about the what
What is a “model”?
Pipelines, not Models
• Deploying just the model part of the
workflow is not enough
• Entire pipeline must be deployed
• Data transform
• Feature extraction & pre-processing
• ML model itself
• Prediction transformation
• Technically even ETL is part of the
pipeline!
• Pipelines in open-source frameworks
• scikit-learn
• Spark ML pipelines
• TensorFlow Transform
• pipeliner (R)
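The point that the whole pipeline, not just the model, is the deployable artifact can be sketched in plain Python. The `Scaler`, `ThresholdModel` and `Pipeline` classes below are hypothetical stand-ins, not any particular framework's API:

```python
# Minimal sketch: a deployable "model" is really a chain of fitted stages.
# Scaler and ThresholdModel are hypothetical stand-ins for real
# pre-processing and model components.

class Scaler:
    """Feature pre-processing: learns min/max from training data."""
    def fit(self, xs):
        self.lo, self.hi = min(xs), max(xs)
        return self
    def transform(self, xs):
        span = (self.hi - self.lo) or 1.0
        return [(x - self.lo) / span for x in xs]

class ThresholdModel:
    """Toy model: predicts 1 when the scaled feature exceeds a threshold."""
    def __init__(self, threshold=0.5):
        self.threshold = threshold
    def fit(self, xs):
        return self
    def predict(self, xs):
        return [1 if x > self.threshold else 0 for x in xs]

class Pipeline:
    """Fit each stage in order; serve the whole chain, not just the model."""
    def __init__(self, stages):
        self.stages = stages
    def fit(self, xs):
        for stage in self.stages[:-1]:
            xs = stage.fit(xs).transform(xs)
        self.stages[-1].fit(xs)
        return self
    def predict(self, xs):
        for stage in self.stages[:-1]:
            xs = stage.transform(xs)
        return self.stages[-1].predict(xs)

pipe = Pipeline([Scaler(), ThresholdModel()]).fit([0.0, 10.0, 20.0])
print(pipe.predict([15.0]))  # scaling happens inside the deployed artifact
```

If only `ThresholdModel` were deployed, raw inputs would hit the threshold unscaled and predictions would silently be wrong; that is the training/serving skew the slide warns about.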
Challenges
• Proliferation of formats
• Open source, open standard: PMML, PFA, ONNX
• Open-source, non-standard: MLeap, Spark,
TensorFlow, Keras, PyTorch, Caffe, …
• Proprietary formats: lock-in, not portable
• Lack of standardization leads to custom
solutions
• Where standards exist, limitations lead to
custom extensions, eliminating the benefits
• Need to manage and bridge many different:
• Languages - Python, R, Notebooks, Scala / Java / C
• Frameworks – too many to count!
• Dependencies
• Versions
• Performance characteristics can be highly
variable across these dimensions
• Friction between teams
• Data scientists & researchers – latest & greatest
• Production – stability, control, minimize changes,
performance
• Business – metrics, business impact, product must
always work!
Containers for ML
Pipelines
23
Containers for ML Deployment
• But …
• What goes in the container is most
important
• Performance can be highly variable across
language, framework, version
• Requires devops knowledge, CI /
deployment pipelines, good practices
• Does not solve the issue of standardization
• Formats
• APIs exposed
• A serving framework is still required on top
• Container-based deployment has
significant benefits
• Repeatability
• Ease of configuration
• Separation of concerns – focus on what, not
how
• Allow data scientists & researchers to use
their language / framework of choice
• Container frameworks take care of (certain)
monitoring, fault tolerance, HA, etc.
IBM Developer
Model Asset eXchange
http://ibm.biz/model-exchange
Free, open-source deep learning models.
Wide variety of domains.
Multiple deep learning frameworks.
Vetted and tested code and IP.
Build and deploy a web service in 30 seconds.
Start training on Fabric for Deep Learning (FfDL) or
Watson Machine Learning in minutes.
MAX Model Metadata
Open Standards for Model Deployment
Predictive Model Markup
Language (PMML)
• Shortcomings
• Cannot represent arbitrary programs / analytic
applications
• Flexibility comes from custom plugins => lose
benefits of standardization
• Data Mining Group (DMG)
• Model interchange format in XML with
operators
• Widely used and supported; open standard
• Spark support lacking natively but 3rd party
projects available: jpmml-sparkml
• Comprehensive support for Spark ML components
(perhaps surprisingly!)
• Watch SPARK-11237
• Other exporters include scikit-learn, R,
XGBoost and LightGBM
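For a sense of the format: a PMML document is XML built from DMG-defined model elements. An abridged linear regression model might look like this (field names and coefficients are made up for illustration):

```xml
<PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3">
  <Header/>
  <DataDictionary numberOfFields="2">
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x1"/>
      <MiningField name="y" usageType="target"/>
    </MiningSchema>
    <!-- y = 1.5 + 0.8 * x1 -->
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x1" coefficient="0.8"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
```

Any PMML-aware scoring engine can evaluate this without the training framework, which is the portability argument; the shortcoming is that anything outside the predefined elements needs custom extensions.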
Portable Format for Analytics
• Shortcomings
• PFA is still young and needs to gain adoption
• Production robustness and performance at
scale?
• Limitations of PFA – especially for deep learning
applications
• A standard can move slowly in terms of new
features, fixes and enhancements
• PFA is being championed by the DMG
• PFA consists of:
• JSON serialization format
• AVRO schemas for data types
• Encodes functions (actions) that are applied to inputs to
create outputs with a set of built-in functions and
language constructs (e.g. control-flow, conditionals)
• Essentially a mini functional math language + schema
specification
• Portability across languages, frameworks, runtimes and versions
https://github.com/CODAIT/aardpfark
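To make the "mini functional math language + schema" concrete, here is a minimal PFA document in the style of the specification's hello-world example: Avro input/output types plus an action that adds 1 to each input:

```json
{
  "input": "double",
  "output": "double",
  "action": [
    {"+": ["input", 1]}
  ]
}
```

A PFA-compliant engine in any language can execute this document, which is where the cross-language, cross-version portability comes from.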
Open Neural Network Exchange (ONNX)
• Shortcomings
• Relatively poor support for “traditional” ML or
general language constructs (currently)
• String / categorical processing
• Datetime, collection operators
• Intermediate variables
• Evolving standard – coverage of operators (e.g.
TensorFlow)
• Championed by Facebook & Microsoft
• Protobuf serialization format
• Describes computation graph (including
operators)
• In this way the serialized graph is “self-describing”
similarly to PFA
• More focused on Deep Learning / tensor
operations
• Baked into PyTorch 1.0.0 / Caffe2 as the
serialization & interchange format
• ONNX-ML covers “traditional” ML operators
https://github.com/onnx
Monitoring & Feedback
DBG / May 10, 2018 / © 2018 IBM Corporation
Monitoring
Software metrics
Traditional software monitoring: latency, throughput, resource usage, etc.
Model metrics
Traditional ML evaluation measures, plus bias, robustness, explainability
Business metrics
Impact of predictions on business outcomes
• Additional revenue - e.g. uplift from recommender
• Cost savings – e.g. value of fraud prevented
• Metrics implicitly influencing these – e.g. user
engagement
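The three monitoring layers above can be computed in one pass over a log of served predictions. Everything below (field layout, numbers) is illustrative, not a real serving log format:

```python
import statistics

# Illustrative log of served predictions:
# (latency in ms, predicted label, true label once feedback arrives,
#  revenue attributed to the prediction).
served = [
    (12.0, 1, 1, 5.0),
    (15.5, 0, 0, 0.0),
    (11.2, 1, 0, 0.0),
    (40.1, 1, 1, 5.0),
]

# Software metrics: traditional monitoring of the serving system itself.
latencies = [s[0] for s in served]
software = {"p50_latency_ms": statistics.median(latencies),
            "max_latency_ms": max(latencies)}

# Model metrics: traditional ML evaluation (here, accuracy on feedback).
model = {"accuracy": sum(p == t for _, p, t, _ in served) / len(served)}

# Business metrics: impact of predictions on outcomes (here, revenue).
business = {"revenue": sum(r for *_, r in served)}

print(software, model, business)
```

In practice each layer is owned by a different team (ops, data science, business), which is why all three need explicit instrumentation.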
Feedback
Adapt
An intelligent system must automatically
learn from & adapt to the world around it
Continual learning
Retraining, online learning,
reinforcement learning
Feedback loops
Explicit: models create or directly
influence their own training data
Implicit: predictions influence behavior in
longer-term or indirect ways
Humans in the loop
Data → Transform → Train → Deploy → Feedback (and back to Data)
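Continual learning in its simplest form is an online update: each piece of feedback adjusts the model in place instead of triggering a full retrain. A minimal sketch with a hypothetical running-mean "model":

```python
# Sketch of continual learning: a model that updates incrementally as
# feedback arrives, rather than retraining from scratch (illustrative only).

class OnlineMeanPredictor:
    """Predicts the running mean of observed outcomes; each piece of
    feedback updates the model state in place."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def predict(self):
        return self.mean

    def feedback(self, outcome):
        # Incremental mean update: no stored history needed.
        self.n += 1
        self.mean += (outcome - self.mean) / self.n

m = OnlineMeanPredictor()
for outcome in [2.0, 4.0, 6.0]:
    m.feedback(outcome)
print(m.predict())  # 4.0
```

The same update-in-place shape underlies real online learners (e.g. SGD on streaming data); the caution from the slide applies: when the model's own predictions influence which outcomes it observes, this loop can reinforce its own biases.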
AI Fairness & Transparency
AI Fairness 360
Toolbox:
Fairness metrics (30+)
Fairness metric
explanations
Bias mitigation
algorithms (10)
AIF360
The AIF360 toolkit is an open-source library to help
detect and remove bias in machine learning
models.
The AI Fairness 360 Python package includes
a comprehensive set of metrics for datasets
and models to test for biases, explanations for
these metrics, and algorithms to mitigate bias
in datasets and models.
https://github.com/IBM/AIF360
https://developer.ibm.com/patterns/ensuring-fairness-when-processing-loan-applications/
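One of the simplest metrics in this family, disparate impact (the ratio of favorable-outcome rates between unprivileged and privileged groups, also among AIF360's metrics), can be computed directly. The data below is made up for illustration:

```python
# Disparate impact: ratio of favorable-outcome rates between the
# unprivileged and privileged groups. 1.0 means parity; the common
# "80% rule" flags ratios below 0.8. Outcome data here is made up.

def favorable_rate(outcomes):
    """Fraction of individuals receiving the favorable outcome (1)."""
    return sum(outcomes) / len(outcomes)

privileged   = [1, 1, 1, 0, 1]   # 80% favorable
unprivileged = [1, 0, 0, 1, 0]   # 40% favorable

di = favorable_rate(unprivileged) / favorable_rate(privileged)
print(round(di, 2))  # 0.5 -> well below 0.8, flagged as potential bias
```

AIF360 packages 30+ such metrics plus mitigation algorithms behind a common dataset abstraction; this sketch just shows what one of them measures.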
AIF360 Demo: http://aif360.mybluemix.net
Defending machine
learning systems
Adversarial Attacks
Sources: Explaining and Harnessing Adversarial Examples;
Robust Physical-World Attacks on Deep Learning Visual Classification
IBM Adversarial Robustness
Toolbox
ART
ART is a library dedicated to adversarial
machine learning. Its purpose is to allow rapid
crafting and analysis of attack and defense
methods for machine learning models. The
Adversarial Robustness Toolbox provides an
implementation for many state-of-the-art
methods for attacking and defending
classifiers.
https://github.com/IBM/adversarial-robustness-toolbox
https://developer.ibm.com/patterns/integrate-adversarial-attacks-model-training-pipeline/
Toolbox
Evasion attacks (11)
Defenses (9)
Detection methods for
adversarial samples &
poisoning attacks
Robustness metrics
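The canonical evasion attack, FGSM (from "Explaining and Harnessing Adversarial Examples" above), perturbs the input by epsilon times the sign of the loss gradient. A pure-Python sketch on a toy linear classifier, not ART's actual API:

```python
# FGSM sketch on a toy linear classifier: score(x) = w . x,
# predict class 1 when the score is positive. For this model the
# gradient of the score w.r.t. x is just w, so flipping a positive
# prediction means stepping against sign(w), scaled by eps.

def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def fgsm(w, x, eps):
    """One FGSM step pushing the score of x downward."""
    sign = [1.0 if wi > 0 else -1.0 if wi < 0 else 0.0 for wi in w]
    return [xi - eps * si for xi, si in zip(x, sign)]

w = [1.0, -2.0]
x = [0.3, 0.1]                 # score > 0 -> class 1
x_adv = fgsm(w, x, eps=0.2)    # small perturbation per dimension
print(score(w, x), score(w, x_adv))
```

A perturbation of at most 0.2 per feature flips the sign of the score, i.e. the predicted class, which is exactly the fragility the toolbox's attacks probe and its defenses harden against.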
Getting CLEVER
Cross Lipschitz Extreme Value for nEtwork Robustness (CLEVER):
• Published by IBM Research
• Attack-agnostic measure
• Efficient computation
https://medium.com/@MITIBMLab/getting-clever-er-expanding-the-scope-of-a-robustness-metric-for-neural-networks-81c6c6ecb
https://arxiv.org/abs/1801.10578
https://github.com/IBM/CLEVER-Robustness-Score
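Heavily simplified, the idea behind CLEVER: sample points near the input, estimate the local Lipschitz constant from gradient magnitudes, and bound the minimum distortion needed to change the prediction. The real score fits an extreme-value distribution to the samples; this sketch just takes the max, on a toy 1-D function:

```python
import random

def f(x):
    """Toy score margin at input x (made-up function for illustration)."""
    return x * x

def df(x):
    """Analytic gradient of f."""
    return 2 * x

def lipschitz_estimate(x0, radius, samples=1000, seed=0):
    """Max sampled gradient magnitude in a ball around x0: a crude
    stand-in for CLEVER's extreme-value estimate."""
    rng = random.Random(seed)
    return max(abs(df(x0 + rng.uniform(-radius, radius)))
               for _ in range(samples))

x0 = 1.0
L = lipschitz_estimate(x0, radius=0.5)
# Attack-agnostic robustness bound: no perturbation smaller than
# margin / L can change the prediction (for a true Lipschitz constant L).
print(f(x0) / L)
```

Because the bound depends only on the function's local smoothness, not on any particular attack, the measure is attack-agnostic, which is the property the slide highlights.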
ART Demo: https://art-demo.mybluemix.net/
Wrapping Up
CODAIT: End-to-end Enterprise AI
in Open Source
codait.org
twitter.com/MLnick
github.com/MLnick
developer.ibm.com
FfDL
Sign up for IBM Cloud and try Watson Studio!
https://ibm.biz/Bd2Gb7
MAX
Thank you.
