Open, Secure & Transparent AI
Pipelines
Nick Pentreath, Principal Engineer IBM
About
@MLnick on Twitter & Github
Principal Engineer, IBM
CODAIT - Center for Open-Source Data
& AI Technologies
Machine Learning & AI
Apache Spark committer & PMC
Author of Machine Learning with Spark
Various conferences & meetups
Center for Open Source
Data and AI Technologies
Watson West Building
505 Howard St.
San Francisco, California
CODAIT aims to make AI solutions dramatically
easier to create, deploy, and manage in the
enterprise.
Relaunch of the IBM Spark Technology Center
(STC) to reflect expanded mission.
We contribute to foundational open source
software across the enterprise AI lifecycle.
36 open-source developers!
Improving Enterprise AI Lifecycle in Open Source
CODAIT
codait.org
What is AI?
What about Machine Learning
/ Deep Learning?
Learn from data to make predictions
Learn from historical data to make
predictions about the future
Applied Machine Learning
Learn from historical data to make
predictions about the future, in order
to make decisions
Intelligent Systems
Source: https://deepmind.com/blog/alphastar-mastering-real-time-strategy-game-starcraft-ii/
Automated decision-making
Continual learning (new data & feedback)
Adapting to environment
Memory & generalization
Trust in AI
The Machine Learning
Workflow
Perception
In reality the workflow spans
teams …
… and tools …
… and is a small (but critical!)
piece of the puzzle
*Source: Hidden Technical Debt in Machine Learning Systems
Training
* Logos trademarks of their respective projects
Fabric for Deep Learning
https://developer.ibm.com/open/projects/fabric-for-deep-learning-ffdl/
https://github.com/IBM/FfDL
• Fabric for Deep Learning, or FfDL (pronounced ‘fiddle’), aims to make Deep
Learning easily accessible to Data Scientists and AI developers.
• FfDL provides a consistent way to train and visualize Deep Learning jobs
across multiple frameworks such as TensorFlow, Caffe, PyTorch and Keras.
FfDL
Community Partners
FfDL is one of InfoWorld’s 2018 Best of Open
Source Software Award winners for machine
learning and deep learning!
AI Deployment
What, Where, How?
• What are you deploying?
• What is a “model”?
• Where are you deploying?
• Target environment
• Batch, streaming, real-time?
• How are you deploying?
• “devops” deployment mechanism
• Serving framework
We will talk mostly about the what
What is a “model”?
Pipelines, not Models
• Deploying just the model part of the
workflow is not enough
• Entire pipeline must be deployed
• Data transform
• Feature extraction & pre-processing
• ML model itself
• Prediction transformation
• Technically even ETL is part of the
pipeline!
• Pipelines in open-source frameworks
• scikit-learn
• Spark ML pipelines
• TensorFlow Transform
• pipeliner (R)
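The point that the whole pipeline, not just the model, is the deployable artifact can be sketched in plain Python. The `Scaler`, `ThresholdModel` and `Pipeline` classes below are hypothetical stand-ins, not any particular framework's API:

```python
# Minimal sketch: a deployable "model" is really a chain of fitted stages.
# Scaler and ThresholdModel are hypothetical stand-ins for real
# pre-processing and model components.

class Scaler:
    """Feature pre-processing: learns min/max from training data."""
    def fit(self, xs):
        self.lo, self.hi = min(xs), max(xs)
        return self
    def transform(self, xs):
        span = (self.hi - self.lo) or 1.0
        return [(x - self.lo) / span for x in xs]

class ThresholdModel:
    """Toy model: predicts 1 when the scaled feature exceeds a threshold."""
    def __init__(self, threshold=0.5):
        self.threshold = threshold
    def fit(self, xs):
        return self
    def predict(self, xs):
        return [1 if x > self.threshold else 0 for x in xs]

class Pipeline:
    """Fit each stage in order; serve the whole chain, not just the model."""
    def __init__(self, stages):
        self.stages = stages
    def fit(self, xs):
        for stage in self.stages[:-1]:
            xs = stage.fit(xs).transform(xs)
        self.stages[-1].fit(xs)
        return self
    def predict(self, xs):
        for stage in self.stages[:-1]:
            xs = stage.transform(xs)
        return self.stages[-1].predict(xs)

pipe = Pipeline([Scaler(), ThresholdModel()]).fit([0.0, 10.0, 20.0])
print(pipe.predict([15.0]))  # scaling happens inside the deployed artifact
```

If only `ThresholdModel` were deployed, raw inputs would hit the threshold unscaled and predictions would silently be wrong; that is the training/serving skew the slide warns about.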
Challenges
• Proliferation of formats
• Open source, open standard: PMML, PFA, ONNX
• Open-source, non-standard: MLeap, Spark,
TensorFlow, Keras, PyTorch, Caffe, …
• Proprietary formats: lock-in, not portable
• Lack of standardization leads to custom
solutions
• Where standards exist, limitations lead to
custom extensions, eliminating the benefits
• Need to manage and bridge many different:
• Languages - Python, R, Notebooks, Scala / Java / C
• Frameworks – too many to count!
• Dependencies
• Versions
• Performance characteristics can be highly
variable across these dimensions
• Friction between teams
• Data scientists & researchers – latest & greatest
• Production – stability, control, minimize changes,
performance
• Business – metrics, business impact, product must
always work!
Containers for ML
Pipelines
23
Containers for ML Deployment
• But …
• What goes in the container is most
important
• Performance can be highly variable across
language, framework, version
• Requires devops knowledge, CI /
deployment pipelines, good practices
• Does not solve the issue of standardization
• Formats
• APIs exposed
• A serving framework is still required on top
• Container-based deployment has
significant benefits
• Repeatability
• Ease of configuration
• Separation of concerns – focus on what, not
how
• Allow data scientists & researchers to use
their language / framework of choice
• Container frameworks take care of (certain)
monitoring, fault tolerance, HA, etc.
IBM Developer
Model Asset eXchange
http://ibm.biz/model-exchange
Free, open-source deep learning models.
Wide variety of domains.
Multiple deep learning frameworks.
Vetted and tested code and IP.
Build and deploy a web service in 30 seconds.
Start training on Fabric for Deep Learning (FfDL) or
Watson Machine Learning in minutes.
MAX Model Metadata
Open Standards for Model Deployment
Predictive Model Markup
Language (PMML)
• Shortcomings
• Cannot represent arbitrary programs / analytic
applications
• Flexibility comes from custom plugins => lose
benefits of standardization
• Data Mining Group (DMG)
• Model interchange format in XML with
operators
• Widely used and supported; open standard
• Spark support lacking natively but 3rd party
projects available: jpmml-sparkml
• Comprehensive support for Spark ML components
(perhaps surprisingly!)
• Watch SPARK-11237
• Other exporters include scikit-learn, R,
XGBoost and LightGBM
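For a sense of the format: a PMML document is XML built from DMG-defined model elements. An abridged linear regression model might look like this (field names and coefficients are made up for illustration):

```xml
<PMML version="4.3" xmlns="http://www.dmg.org/PMML-4_3">
  <Header/>
  <DataDictionary numberOfFields="2">
    <DataField name="x1" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x1"/>
      <MiningField name="y" usageType="target"/>
    </MiningSchema>
    <!-- y = 1.5 + 0.8 * x1 -->
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x1" coefficient="0.8"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
```

Any PMML-aware scoring engine can evaluate this without the training framework, which is the portability argument; the shortcoming is that anything outside the predefined elements needs custom extensions.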
Portable Format for Analytics
• Shortcomings
• PFA is still young and needs to gain adoption
• Production robustness and performance at
scale?
• Limitations of PFA – especially for deep learning
applications
• A standard can move slowly in terms of new
features, fixes and enhancements
• PFA is being championed by the DMG
• PFA consists of:
• JSON serialization format
• AVRO schemas for data types
• Encodes functions (actions) that are applied to inputs to
create outputs with a set of built-in functions and
language constructs (e.g. control-flow, conditionals)
• Essentially a mini functional math language + schema
specification
• Portability across languages, frameworks, runtimes and versions
https://github.com/CODAIT/aardpfark
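To make the "mini functional math language + schema" concrete, here is a minimal PFA document in the style of the specification's hello-world example: Avro input/output types plus an action that adds 1 to each input:

```json
{
  "input": "double",
  "output": "double",
  "action": [
    {"+": ["input", 1]}
  ]
}
```

A PFA-compliant engine in any language can execute this document, which is where the cross-language, cross-version portability comes from.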
Open Neural Network Exchange (ONNX)
• Shortcomings
• Relatively poor support for “traditional” ML or
general language constructs (currently)
• String / categorical processing
• Datetime, collection operators
• Intermediate variables
• Evolving standard – coverage of operators (e.g.
TensorFlow)
• Championed by Facebook & Microsoft
• Protobuf serialization format
• Describes computation graph (including
operators)
• In this way the serialized graph is “self-describing”
similarly to PFA
• More focused on Deep Learning / tensor
operations
• Baked into PyTorch 1.0.0 / Caffe2 as the
serialization & interchange format
• ONNX-ML covers “traditional” ML operators
https://github.com/onnx
Monitoring & Feedback
DBG / May 10, 2018 / © 2018 IBM Corporation
Monitoring
Software metrics
Traditional software monitoring: latency, throughput, resource usage, etc.
Model metrics
Traditional ML evaluation measures, plus bias, robustness, explainability
Business metrics
Impact of predictions on business outcomes
• Additional revenue - e.g. uplift from recommender
• Cost savings – e.g. value of fraud prevented
• Metrics implicitly influencing these – e.g. user
engagement
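The three monitoring layers above can be computed in one pass over a log of served predictions. Everything below (field layout, numbers) is illustrative, not a real serving log format:

```python
import statistics

# Illustrative log of served predictions:
# (latency in ms, predicted label, true label once feedback arrives,
#  revenue attributed to the prediction).
served = [
    (12.0, 1, 1, 5.0),
    (15.5, 0, 0, 0.0),
    (11.2, 1, 0, 0.0),
    (40.1, 1, 1, 5.0),
]

# Software metrics: traditional monitoring of the serving system itself.
latencies = [s[0] for s in served]
software = {"p50_latency_ms": statistics.median(latencies),
            "max_latency_ms": max(latencies)}

# Model metrics: traditional ML evaluation (here, accuracy on feedback).
model = {"accuracy": sum(p == t for _, p, t, _ in served) / len(served)}

# Business metrics: impact of predictions on outcomes (here, revenue).
business = {"revenue": sum(r for *_, r in served)}

print(software, model, business)
```

In practice each layer is owned by a different team (ops, data science, business), which is why all three need explicit instrumentation.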
Feedback
Adapt
An intelligent system must automatically
learn from & adapt to the world around it
Continual learning
Retraining, online learning,
reinforcement learning
Feedback loops
Explicit: models create or directly
influence their own training data
Implicit: predictions influence behavior in
longer-term or indirect ways
Humans in the loop
Data → Transform → Train → Deploy → Feedback (and back to Data)
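Continual learning in its simplest form is an online update: each piece of feedback adjusts the model in place instead of triggering a full retrain. A minimal sketch with a hypothetical running-mean "model":

```python
# Sketch of continual learning: a model that updates incrementally as
# feedback arrives, rather than retraining from scratch (illustrative only).

class OnlineMeanPredictor:
    """Predicts the running mean of observed outcomes; each piece of
    feedback updates the model state in place."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def predict(self):
        return self.mean

    def feedback(self, outcome):
        # Incremental mean update: no stored history needed.
        self.n += 1
        self.mean += (outcome - self.mean) / self.n

m = OnlineMeanPredictor()
for outcome in [2.0, 4.0, 6.0]:
    m.feedback(outcome)
print(m.predict())  # 4.0
```

The same update-in-place shape underlies real online learners (e.g. SGD on streaming data); the caution from the slide applies: when the model's own predictions influence which outcomes it observes, this loop can reinforce its own biases.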
AI Fairness & Transparency
AI Fairness 360
Toolbox:
Fairness metrics (30+)
Fairness metric
explanations
Bias mitigation
algorithms (10)
AIF360
The AIF360 toolkit is an open-source library to help
detect and remove bias in machine learning
models.
The AI Fairness 360 Python package includes
a comprehensive set of metrics for datasets
and models to test for biases, explanations for
these metrics, and algorithms to mitigate bias
in datasets and models.
https://github.com/IBM/AIF360
https://developer.ibm.com/patterns/ensuring-fairness-when-processing-loan-applications/
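One of the simplest metrics in this family, disparate impact (the ratio of favorable-outcome rates between unprivileged and privileged groups, also among AIF360's metrics), can be computed directly. The data below is made up for illustration:

```python
# Disparate impact: ratio of favorable-outcome rates between the
# unprivileged and privileged groups. 1.0 means parity; the common
# "80% rule" flags ratios below 0.8. Outcome data here is made up.

def favorable_rate(outcomes):
    """Fraction of individuals receiving the favorable outcome (1)."""
    return sum(outcomes) / len(outcomes)

privileged   = [1, 1, 1, 0, 1]   # 80% favorable
unprivileged = [1, 0, 0, 1, 0]   # 40% favorable

di = favorable_rate(unprivileged) / favorable_rate(privileged)
print(round(di, 2))  # 0.5 -> well below 0.8, flagged as potential bias
```

AIF360 packages 30+ such metrics plus mitigation algorithms behind a common dataset abstraction; this sketch just shows what one of them measures.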
AIF360 Demo: http://aif360.mybluemix.net
Defending machine
learning systems
Adversarial Attacks
Sources: Explaining and Harnessing Adversarial Examples;
Robust Physical-World Attacks on Deep Learning Visual Classification
IBM Adversarial Robustness
Toolbox
ART
ART is a library dedicated to adversarial
machine learning. Its purpose is to allow rapid
crafting and analysis of attack and defense
methods for machine learning models. The
Adversarial Robustness Toolbox provides an
implementation for many state-of-the-art
methods for attacking and defending
classifiers.
https://github.com/IBM/adversarial-robustness-toolbox
https://developer.ibm.com/patterns/integrate-adversarial-attacks-model-training-pipeline/
Toolbox
Evasion attacks (11)
Defenses (9)
Detection methods for
adversarial samples &
poisoning attacks
Robustness metrics
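The canonical evasion attack, FGSM (from "Explaining and Harnessing Adversarial Examples" above), perturbs the input by epsilon times the sign of the loss gradient. A pure-Python sketch on a toy linear classifier, not ART's actual API:

```python
# FGSM sketch on a toy linear classifier: score(x) = w . x,
# predict class 1 when the score is positive. For this model the
# gradient of the score w.r.t. x is just w, so flipping a positive
# prediction means stepping against sign(w), scaled by eps.

def score(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def fgsm(w, x, eps):
    """One FGSM step pushing the score of x downward."""
    sign = [1.0 if wi > 0 else -1.0 if wi < 0 else 0.0 for wi in w]
    return [xi - eps * si for xi, si in zip(x, sign)]

w = [1.0, -2.0]
x = [0.3, 0.1]                 # score > 0 -> class 1
x_adv = fgsm(w, x, eps=0.2)    # small perturbation per dimension
print(score(w, x), score(w, x_adv))
```

A perturbation of at most 0.2 per feature flips the sign of the score, i.e. the predicted class, which is exactly the fragility the toolbox's attacks probe and its defenses harden against.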
Getting CLEVER
Cross Lipschitz Extreme Value for nEtwork Robustness (CLEVER):
• Published by IBM Research
• Attack-agnostic measure
• Efficient computation
https://medium.com/@MITIBMLab/getting-clever-er-expanding-the-scope-of-a-robustness-metric-for-neural-networks-81c6c6ecb
https://arxiv.org/abs/1801.10578
https://github.com/IBM/CLEVER-Robustness-Score
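Heavily simplified, the idea behind CLEVER: sample points near the input, estimate the local Lipschitz constant from gradient magnitudes, and bound the minimum distortion needed to change the prediction. The real score fits an extreme-value distribution to the samples; this sketch just takes the max, on a toy 1-D function:

```python
import random

def f(x):
    """Toy score margin at input x (made-up function for illustration)."""
    return x * x

def df(x):
    """Analytic gradient of f."""
    return 2 * x

def lipschitz_estimate(x0, radius, samples=1000, seed=0):
    """Max sampled gradient magnitude in a ball around x0: a crude
    stand-in for CLEVER's extreme-value estimate."""
    rng = random.Random(seed)
    return max(abs(df(x0 + rng.uniform(-radius, radius)))
               for _ in range(samples))

x0 = 1.0
L = lipschitz_estimate(x0, radius=0.5)
# Attack-agnostic robustness bound: no perturbation smaller than
# margin / L can change the prediction (for a true Lipschitz constant L).
print(f(x0) / L)
```

Because the bound depends only on the function's local smoothness, not on any particular attack, the measure is attack-agnostic, which is the property the slide highlights.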
ART Demo: https://art-demo.mybluemix.net/
Wrapping Up
CODAIT: End-to-end Enterprise AI
in Open Source
codait.org
twitter.com/MLnick
github.com/MLnick
developer.ibm.com
FfDL
Sign up for IBM Cloud and try Watson Studio!
https://ibm.biz/Bd2Gb7
MAX
Thank you.
