AI Expedition
Steering Business Innovation on IBM Z
and LinuxONE
Tabari Alexander
Colton Cox
Joy Deng
Purvi Patel
Andrew Sica
Shin (Kelly) Yang
Artificial Intelligence
Data and AI
Redguide
Executive overview
Enterprises today are confronted with a spectrum of challenges in the AI domain, ranging from
complex technical considerations to fundamental decisions about where to start implementing AI.
Navigating this landscape requires not only strategic thinking but also adept decision making
as more enterprises strive to unlock the full potential of AI for their business.
This IBM Redbooks® Redguide publication introduces validated strategies that ensure
effective and successful outcomes for AI implementation in enterprise workloads. These
approaches include cost-effective entry points with robust data management capabilities
across IBM Z® and LinuxONE, all tailored and seamlessly integrated into enterprise
AI workloads.
A key driver behind the sudden importance of AI in the world today is that enterprises need to
process vast amounts of data with speed and precision to extract insights that drive business
results. In some specific use cases, organizations need to glean real-time business insights
from their mission-critical enterprise workloads, which reside on IBM Z and LinuxONE.
Embrace optimized open source technology that brings many popular AI frameworks and
tools natively and seamlessly into your enterprise. Train your model anywhere and deploy the
model on IBM Z and LinuxONE for inferencing to use the industry’s first integrated on-chip AI
accelerator designed for high-speed, latency-optimized inferencing. AI inferencing can be run
at scale with low latency, embedding AI into every transaction with no impact on service-level
agreements (SLAs), which empowers the enterprise to make informed decisions in real-time.
This IBM Redguide publication walks you through how to navigate your AI adventure on
IBM Z and LinuxONE. First, we cover infusing AI into critical business applications. Second,
we discuss cost-effective entry points for using AI on IBM Z. Finally, we show how to manage
data across platforms for use on IBM Z with a feature store. Whether it is fraud detection or
anti-money laundering, find out ways to easily consume AI capabilities with industry common
skills to integrate AI into your enterprise today.
Many mission-critical and core business applications have been running on the mainframe for
decades. A study by the IBM Institute for Business Value (IBV) reveals that nearly 7 in 10 IT
executives affirm that mainframe-based applications are central to their business and technology
strategies, and 68% of respondents assert that mainframe systems are central to their hybrid
cloud strategy.2
With all the data and applications that run on IBM Z and LinuxONE, enterprises can use AI to
solve complex business problems and drive customer delight. AI thrives on data, and
organizations have accumulated and saved heaps of data from their transactions over the
years. Hence, businesses can generate more value from their data by infusing intelligence
into applications.
1 https://newsroom.ibm.com/2024-01-10-Data-Suggests-Growth-in-Enterprise-Adoption-of-AI-is-Due-to-Widespread-Deployment-by-Early-Adopters
2 https://www.ibm.com/blog/new-study-reveals-why-mainframe-application-modernization-is-key-to-accelerating-digital-transformation/
The recently available IBM z16™ processor has an industry-first on-chip AI accelerator that is
designed for high-speed and latency-optimized inferencing. It accommodates 300 billion
inference requests per day with a 1-millisecond response time. It delivers consistent response
times with optimized inferencing that scales with workloads and scores every transaction
while still meeting the most stringent application SLAs. Instead of getting insights after the
transaction occurs, IBM z16 helps organizations create value with accelerated AI insights that
are applied to each transaction, in real-time, before the transaction completes.
In addition to the optimized inferencing, colocating AI applications with the data on IBM Z
systems provides data gravity benefits. IBM Z servers process approximately 30 billion
transactions each day, up to 19 billion of which are encrypted, so analytics is colocated,
securely, with the data. Tight integration of AI with the data and core business applications that
reside on IBM Z allows organizations to leverage the quality of service expected from the IBM Z
platform: resiliency, 99.99999% availability, and IBM Z flagship security for enterprise data.
Let's revisit the use case of fraud detection in online transactions. Due to SLA requirements,
most banks run fraud detection algorithms only on a fraction of transactions in real-time.
Many of these transactions trigger fraud detection analysis on a post-transaction basis,
drastically limiting the ability to detect fraud and avoid losses. However, with IBM z16 on-chip
AI accelerator, banks can score 100% of transactions in real-time, and get response times
within application SLAs, to detect and prevent fraudulent transactions. This results in
significant cost savings and improved customer satisfaction. Celent estimates that scoring
every transaction on the z16 processor can potentially reduce banking, card, and payments
fraud losses by US $161 billion globally.3
IBM Machine Learning for z/OS (MLz, formerly known as Watson Machine Learning for z/OS
(WMLz)) is an enterprise machine learning solution that runs on the IBM Z platform. This
end-to-end machine learning platform enables organizations to infuse machine learning (ML)
and deep learning (DL) models into transactional workloads running on IBM Z to derive
real-time business insights at scale. An easy-to-use web user interface allows you to build and
train your ML/DL models on any platform using your framework of choice, including MLz itself,
and easily import them into MLz with a single click. From there, the models can be deployed on
IBM Z into your most demanding transactional workloads to drive business value in every
transaction, without impacting application SLAs or user experience. Tight integration of data
and AI provides transactional affinity, a key in achieving low latency.
3 https://www.ibm.com/downloads/cas/DOXY3Q94
The Administration dashboard in MLz can manage and monitor models for issues such as
bias and drift, which helps simplify the effort required to maintain production-level
model accuracy.
In addition to REST API calls to the scoring service, MLz also provides:
- A scoring service that is integrated with the native IBM CICS® runtime through the
  ANLSCORE program, which can be called through a CICS LINK command. This can be
  used to easily modernize CICS COBOL applications to infuse AI. This solution yields
  optimized performance because the scoring server is colocated with the transaction within
  the CICS runtime itself.
- A scoring service through the IBM WebSphere® Optimized Local Adapters (WOLA)
  interface for IMS COBOL programs. This enables optimized performance for in-transaction
  inferencing in a high-volume IMS transaction environment.
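For the REST path mentioned above, the following Python sketch assembles an online scoring request. The endpoint path, deployment ID, and feature names are hypothetical; consult your MLz deployment's generated REST documentation for the exact API shape.

```python
import json
from urllib import request

def build_scoring_request(base_url, deployment_id, features):
    """Build the URL and JSON payload for an online scoring call.

    The path and payload shape here are illustrative, not the
    documented MLz API; adapt them to your deployment.
    """
    url = f"{base_url}/scoring/online/{deployment_id}"
    payload = json.dumps([features]).encode("utf-8")
    return request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Build (but do not send) a request to score one transaction; a real
# application would pass this to urllib.request.urlopen with a timeout.
req = build_scoring_request(
    "https://mlz.example.com:443/iml/v2",   # hypothetical host and path
    "fraud-model-01",                       # hypothetical deployment ID
    {"amount": 182.50, "merchant_id": 4417, "hour": 23},
)
```

Keeping request construction separate from transport, as above, also makes the payload easy to unit test without a live scoring server.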
IBM provides a full-featured machine learning platform: train anywhere, including on IBM Z,
and readily deploy those models into IBM z/OS® applications, colocated with enterprise
transaction data and business logic, for in-transaction scoring in near real-time without impact
to application SLAs. It takes advantage of the security, performance, and resiliency of the
IBM Z platform to deliver business insights when and where they are needed, at the point of
impact, in real-time.
First, it is worth recognizing that there is no single way to measure cost-effectiveness with a
technology investment; it can relate to various explicit and implicit costs, including the areas
of licensing costs, program management, and return-on-investment (ROI).
How much are you willing to invest in new enterprise software?
While software licensing costs might be one of the largest explicit expenditures
undertaken by an enterprise IT team, licensing also provides assurances that may be critical
for success: for example, certainty that the technology has the necessary capacity, and
access to lines of support and documentation. In the case of enterprise AI, this may also
enable faster time-to-market for a first-time use case, with the necessary infrastructure
ready for immediate use to support model deployment. Alternatively, your team may be
seeking a more flexible solution that allows for a more situation-specific approach to
your AI use case.
4 Source: https://www.oreilly.com/radar/ai-adoption-in-the-enterprise-2022/
Python AI Toolkit for IBM z/OS
Python AI Toolkit for IBM z/OS is a product delivering industry-leading AI Python packages for
availability on z/OS that also leverages IBM supply chain security. This ensures that the
Python packages have been scanned and vetted for vulnerabilities that might compromise the
safety of the operational environment, which may help to mitigate the concerns of warier
organizations who hesitate to tread into open source. With a familiar, flexible, and agile
delivery experience, you can access a range of 180+ packages that are built for data science
and AI use cases, including:
- matplotlib: A comprehensive library for creating static, animated, and interactive
  visualizations in Python.
- pandas: Fast, flexible, and expressive data structures that are designed to make
  working with 'relational' or 'labeled' data both easy and intuitive.
- XGBoost: An optimized distributed gradient boosting library that is designed to be highly
  efficient, flexible, and portable.
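As a small illustration of the data preparation these packages support, the following sketch uses pandas to derive per-card aggregates of the kind a gradient boosting model such as XGBoost might consume. The column names and values are invented for the example.

```python
import pandas as pd

# A handful of card transactions, as they might be extracted from
# transactional data on z/OS (values are illustrative).
tx = pd.DataFrame(
    {
        "card_id": [1, 1, 2, 2, 2],
        "amount": [25.0, 300.0, 40.0, 35.0, 500.0],
        "is_foreign": [0, 1, 0, 0, 1],
    }
)

# Per-card aggregates often fed to a fraud model; a model such as
# XGBoost would train on a matrix built from columns like these.
features = tx.groupby("card_id").agg(
    mean_amount=("amount", "mean"),
    foreign_rate=("is_foreign", "mean"),
)
print(features)
```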
Consisting of serving frameworks, machine learning and deep learning frameworks, and
deployment options, all sitting atop the power of the IBM Z Integrated Accelerator for AI, the
AI Toolkit enables you to utilize and deploy AI frameworks with confidence.
Relief can come in the form of specialty processors available for IBM Z and LinuxONE
machines that are dedicated to running certain specialized workloads alongside normal
operations, including workloads for many common artificial intelligence use cases. One such
processor is the IBM Z Integrated Information Processor (zIIP).
Examples of AI software that can utilize zIIPs to achieve optimized hardware costs include:
- ONNX: Library calls and compiled AI models on Open Neural Network Exchange (ONNX),
  where the ONNX operators are defined to run directly on z/OS, are eligible for zIIP
  workloads.
- Python: Up to 70% of Python AI and ML workloads are considered eligible for zIIP.
- IBM Z AI Data Embedding library for z/OS: When invoked by using the Java native
  application programming interface, the IBM Z AI Data Embedding library for z/OS is
  considered eligible for zIIP.
- Machine Learning for z/OS: The availability of zIIP engines allows all training and
  inferencing conducted with MLz to be eligible for zIIP. You can also deploy Apache Spark
  on IBM zIIPs to run analytics and machine learning on large, complex data sets.
The IBM Integrated Facility for Linux (IFL) is a processor that is dedicated to Linux workloads
on IBM Z and LinuxONE. It allows for the reduction of operational, software, facility, and
energy expenses with a high Linux server density. Similar to zIIPs, it does not increase
charges for IBM Z software running on “standard” processors, nor does it affect the MSU
rating or the IBM Z systems model designation.
An example from the banking industry concerns the issue of fraud detection. One US bank
found themselves unable to score all their transactions in real time with their existing
off-platform scoring engine due to latency. Without the ability to scale fraud detection, 80% of
all transactions were unscored for fraud, resulting in millions of dollars exposed to fraud
annually.
To mitigate this issue, the bank architected a fraud detection and prevention solution running
on IBM Z that utilizes a Python-based machine learning model and LightGBM, a gradient
boosting framework that increases model efficiency and reduces memory usage. With this
solution, the client achieved 100% real-time scoring, ultimately saving more than
US $20 million annually in exposure risk.
Such a use case can be optimized with the Integrated Accelerator for AI on the IBM z16 and
LinuxONE Emperor 4, colocating the model with transactional systems.
For US banks, where fraud losses average 9.3¢ per $100 transacted, an advanced
inferencing model on the IBM z16 could reduce losses to 3.7¢, a 60% improvement.5
5 https://www.ibm.com/downloads/cas/DOXY3Q94
Example architecture: Fraud detection with AI Toolkit for IBM Z
While there are multiple ways to achieve real-time fraud detection with z/OS applications, one
approach you may consider utilizes the AI Toolkit for IBM Z, allowing access to a suite of
lightweight, free-to-download tools and runtime packages. Coupled with zIIP eligibility
through IBM z/OS Container Extensions (zCX) and use of the IBM Z Integrated Accelerator
for AI, this architecture may offer a rapid path toward a return on investment.
The challenges with data are not only limited to data quality and quantity. Data privacy and
security must also be considered end to end for any enterprise use case, and the ability to
govern data used both for training and inference is a key benefit of products such as
IBM Cloud Pak for Data.
Another production challenge that is commonly overlooked is the availability of additional input
data (features) that does not reside on the platform. This is critical when dealing with a
real-time use case that has strict SLA requirements. Some common architectures can be
used to address the issue, including the use of a feature store that enables the use of
pre-transformed features in real-time.
In several cases, this exploration may result in the creation of new features that
augment the set of existing data points. These new features supplement the available
real-time data, enabling the AI model to produce more accurate insights. Common examples
include:
- Recent historical events (transactions), which produce a more accurate prediction for use
  cases involving a sequence of events. This enables models like long short-term memory
  (LSTM) networks to build up a state (or context) based on recent events.
- Aggregated or enriching features based on historical data.
Consider a real-time credit card fraud detection scenario. In practice, as credit card
transactions move through authorization processing, there is a set of “new” data that is
associated with that specific transaction. This data includes a set of features that are
associated with the cardholder, the merchant processing the request, as well as the
transaction details (date, time, amount, and so on).
While some of these features may be critical input for the AI model detecting fraud, a data
scientist may find they are able to improve accuracy by using additional data based on
historical transactions. For example, short term (30 day) averages of the card holder
transaction value, or a recent history of the geographies where their transactions originated,
can enable the AI model to take into account recent behaviors. This additional context can
help produce a more accurate prediction.
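A trailing-window aggregate of this kind can be sketched in a few lines of Python; the function name and data shapes below are illustrative, not part of any product API.

```python
from datetime import date, timedelta

def rolling_average(history, as_of, window_days=30):
    """Average transaction amount over the trailing window.

    `history` is a list of (date, amount) pairs for one cardholder.
    """
    cutoff = as_of - timedelta(days=window_days)
    recent = [amt for d, amt in history if cutoff < d <= as_of]
    return sum(recent) / len(recent) if recent else 0.0

history = [
    (date(2024, 1, 2), 40.0),
    (date(2024, 1, 20), 60.0),
    (date(2023, 11, 5), 900.0),   # outside the 30-day window, excluded
]
avg = rolling_average(history, as_of=date(2024, 1, 31))
print(avg)  # 50.0: only the two January transactions qualify
```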
These additional features can be preprocessed and transformed before model execution.
Many clients store these preprocessed features in an existing AI feature store, a repository
that is commonly used to make preprocessed data available for reuse.
As these use cases progress toward production, challenges may arise, particularly in
implementations that also carry strict application SLA requirements. In these cases, a key
benefit of deploying AI alongside a business application on IBM Z is keeping latency low and
improving scalability; calling out to an off-platform feature store to gather additional
historical data can reintroduce intolerable latency and scalability challenges.
The good news is there are strategies to address these concerns and bring outside data back
to the platform.
Leveraging feature store on IBM Z for low latency inference use cases
Bringing data back to IBM Z can enable more accurate predictions, and the benefits of doing
so can be substantial.
Colocating a high performing feature store on IBM Z alongside the AI inference service and
business application provides for the most cohesive solution.
In this context, the feature store serves as a localized, high performing data store containing
the pre-transformed set of features. The data store technology that is used should be capable
of serving real-time requests with low single digit millisecond response times while scaling to
meet demands.
In a typical end-to-end solution flow for a real-time use case, the inference service would
expose an API that is invoked by the business application. The business application would
provide the required raw transaction data on the API request.
When invoked, the API would pass this data to a scoring handler which, based on key data
from the transaction, queries the feature store to extract relevant pre-transformed features as
shown in step 1. The scoring service then builds the AI model input tensor, which is now
inclusive of raw data and the pre-transformed features. Finally, the inference request is issued
in step 2. An overview of this architecture is shown in Figure 2.
Figure 2 Leveraging feature store on IBM Z for low latency inferencing architecture overview
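The two steps in the flow above can be sketched as follows, using a plain dictionary as a stand-in for the feature store (Hazelcast or Redis in a real deployment) and a toy threshold function as a stand-in for the model; all keys and feature names are invented for the example.

```python
# Stand-in for a low-latency feature store keyed by cardholder.
feature_store = {
    "card:1001": {"avg_30d": 52.75, "foreign_rate": 0.1},
}

MODEL_INPUT_ORDER = ["amount", "avg_30d", "foreign_rate"]

def build_input_tensor(raw_tx, store):
    """Step 1: look up pre-transformed features by a key from the
    transaction, then merge them with the raw data into the model's
    expected input order."""
    stored = store[f"card:{raw_tx['card_id']}"]
    merged = {**raw_tx, **stored}
    return [merged[name] for name in MODEL_INPUT_ORDER]

def score(raw_tx, store, model):
    """Step 2: issue the inference request on the combined tensor."""
    return model(build_input_tensor(raw_tx, store))

# A toy "model" that flags amounts far above the 30-day average.
toy_model = lambda t: 1 if t[0] > 5 * t[1] else 0
result = score({"card_id": 1001, "amount": 400.0}, feature_store, toy_model)
print(result)  # 1: 400.0 exceeds 5 * 52.75
```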
Various specific technologies can be leveraged to create a feature store, both on z/OS and for
Linux on IBM Z environments. Many clients have chosen to use in-memory databases for a
real-time use case feature store. Potential technologies include Hazelcast, Redis, and
Red Hat Data Grid, all of which can scale to satisfy high rates of parallel read requests with
the qualities of service needed for online scoring solutions. These Linux-based options may be
deployed in z/OS environments by using zCX.
As mentioned earlier in this chapter, in addition to storing these additional data points, it is
advantageous to preprocess them in advance as well; that is, transform them into the format
required by the AI model. This further decreases the processing overhead during an
inference request.
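A minimal sketch of this write-time preprocessing, assuming a simple z-score transform and invented population statistics: values are standardized once when written, so the inference path reads model-ready numbers with no per-request transformation.

```python
def standardize(value, mean, std):
    """Transform a raw value into the z-scored form the model expects."""
    return (value - mean) / std

# Population statistics are illustrative; in practice they come from
# the training data and are versioned alongside the model.
POPULATION = {"avg_30d": {"mean": 80.0, "std": 40.0}}

def store_features(store, key, raw_features):
    """Write path: transform once, before storing."""
    store[key] = {
        name: standardize(v, **POPULATION[name])
        for name, v in raw_features.items()
    }

store = {}
store_features(store, "card:1001", {"avg_30d": 120.0})
print(store["card:1001"]["avg_30d"])  # 1.0: (120 - 80) / 40
```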
In one example scenario, real-time inferencing is invoked during a financial institution’s
clearing and settlement processing for fraud detection. As each batch request is processed, a
scoring service is invoked (“Real-time AI inferencing”). These requests are processed by an
inference service that is deployed to Red Hat OpenShift Container Platform hosted in zCX. As
shown previously, the inference service gathers additional data from the feature store before
processing the inference request.
Summary
Today, artificial intelligence presents a huge opportunity to turn data into actionable insights,
amplify human capabilities, decrease risk, and increase return on assets by achieving
breakthrough innovations. To remain competitive, the adoption of AI is not so much
a choice for organizations as it is a necessity. By embracing AI, organizations can gain a
competitive edge, improve operational efficiencies, enhance customer experiences, and
deliver innovative solutions. This IBM Redguide discussed AI infusion into mission-critical
applications, different cost-effective entry points for leveraging AI on IBM Z, and data
management across platforms for use on IBM Z with a feature store.
Initiating the AI journey on IBM Z and LinuxONE may be challenging and full of uncertainties
but there are many steps that can be taken. No matter where you are on your AI journey, you
can participate in a Discovery Workshop tailored and hosted by the AI on IBM Z Solutions
Team at no charge. In this workshop, you get hands-on experience with the IBM team,
exploring use cases through design thinking sessions and ideating on possible architectures.
Build out a proof-of-concept with foundational capabilities and develop a practical
implementation plan for a clear view of how AI can be leveraged on IBM Z and LinuxONE. For
more information, email aionz@us.ibm.com.
For teams that are further along in their AI exploration with developed use cases and who are
eager to dive into more specifics, there are many options to explore potential collaborations.
Run highly optimized, no-charge libraries like TensorFlow, TensorFlow Serving, Snap ML,
IBM Z Deep Learning Compiler and Triton Inference Server natively on IBM Z and LinuxONE.
The AI Toolkit offers optional IBM Elite Support, and IBM Secure Engineering vets and scans
each package for security vulnerabilities in compliance with industry regulations. Find out
more here:
https://www.ibm.com/products/ai-toolkit-for-z-and-linuxone
Reach out with any questions. IBM is ready to help you and your organization embark on
this pathway toward AI on IBM Z and LinuxONE.
Authors
This guide was produced by a team of specialists from around the world working with the
IBM Redbooks team.
Tabari Alexander is a Senior Technical Staff Member for IBM Z AI and Analytics, where he
works on bridging AI acceleration capabilities with the IBM Z software ecosystem. He has a
master's degree in Computer Science from Columbia University with a focus in machine
learning, and nearly 20 years of development experience within IBM Z. Prior to his
involvement in the AI space, Tabari was the Product Owner for PDSE, where he brought forth
PDSE encryption and zEDC compression of PDSE data sets.
Colton Cox is the manager of the AI on IBM Z design team and oversees efforts to drive
human-centric practices across a portfolio of AI and data science products. In his 5 years at
IBM, he has worked across multiple products, including the z/TPF operating system, the
IBM Z Security and Compliance Center, and Machine learning for IBM z/OS. He holds a
bachelor’s degree in English from SUNY Oneonta and a master’s degree in Information
Design & Strategy from Northwestern University. His areas of expertise include design
leadership, content design, and storytelling in technical domains.
Joy Deng is an enterprise product manager for AI on IBM Z based in Raleigh, North Carolina.
She has 5 years of experience in tech product management, and before that had experience
in market research, strategy, and operations finance across CPG and retail. She holds a
bachelor's degree in Marketing and Psychology from Washington University in St. Louis and
also a Master of Business Administration from Duke University Fuqua School of Business
with concentrations in Strategy and Tech Management. Her areas of expertise include
customer-centered product design, and launching data and AI offerings.
Purvi Patel is a senior client engagement leader on the IBM AI on Z Solutions team, where
she is leading the development team. She spearheads the initiative to infuse AI into clients’
mission-critical workloads. She holds a master’s degree in Computer Science from New York
Institute of Technology and has over 25 years of experience in core mainframe technologies.
She has a passion for problem solving, possesses a talent for thinking creatively in
high-stress situations, and thrives on client interactions. Previously, she was the chief
product owner for z/OS Diagnostics Aids, where she designed and implemented many
solutions in SVC and stand-alone dumps. Most recently, she took on the challenge of securing
sensitive personal information in diagnostic dumps through the IBM Data Privacy for
Diagnostics product. Purvi holds many patents in various areas of
operating systems and was recognized with the Outstanding Technical Achievement Award
by IBM.
Andrew Sica is an IBM Senior Technical Staff Member and Chief Architect of the AI on IBM Z
and LinuxONE Solutions team. In his nearly 24 years at IBM, Andrew has led several
innovative platform initiatives, with areas of expertise including AI infrastructure, operating
system development, and platform economics. In Andrew’s current role, he works extensively
with IBM customers who are interested in leveraging AI to improve their business insights.
Shin (Kelly) Yang is an AI on IBM Z Product Manager based in Poughkeepsie, NY. She has
8 years of experience in product management and is responsible for the strategy and
development of products for the AI on IBM Z organization. She holds a bachelor’s degree in
Computer Science and an MBA from Clarkson University. Her areas of expertise span from AI
technology to business with ACs in Management and Leadership, and Supply Chain
Management.
Find out more about the residency program, browse the residency index, and apply online at:
ibm.com/redbooks/residencies.html
Notices
This information was developed for products and services offered in the US. This material might be available
from IBM in other languages. However, you may be required to own a copy of the product or product version in
that language in order to access it.
IBM may not offer the products, services, or features discussed in this document in other countries. Consult
your local IBM representative for information on the products and services currently available in your area. Any
reference to an IBM product, program, or service is not intended to state or imply that only that IBM product,
program, or service may be used. Any functionally equivalent product, program, or service that does not
infringe any IBM intellectual property right may be used instead. However, it is the user’s responsibility to
evaluate and verify the operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter described in this document. The
furnishing of this document does not grant you any license to these patents. You can send license inquiries, in
writing, to:
IBM Director of Licensing, IBM Corporation, North Castle Drive, MD-NC119, Armonk, NY 10504-1785, US
This information could include technical inaccuracies or typographical errors. Changes are periodically made
to the information herein; these changes will be incorporated in new editions of the publication. IBM may make
improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time
without notice.
Any references in this information to non-IBM websites are provided for convenience only and do not in any
manner serve as an endorsement of those websites. The materials at those websites are not part of the
materials for this IBM product and use of those websites is at your own risk.
IBM may use or distribute any of the information you provide in any way it believes appropriate without
incurring any obligation to you.
The performance data and client examples cited are presented for illustrative purposes only. Actual
performance results may vary depending on specific configurations and operating conditions.
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products and cannot confirm the
accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the
capabilities of non-IBM products should be addressed to the suppliers of those products.
Statements regarding IBM’s future direction or intent are subject to change or withdrawal without notice, and
represent goals and objectives only.
This information contains examples of data and reports used in daily business operations. To illustrate them
as completely as possible, the examples include the names of individuals, companies, brands, and products.
All of these names are fictitious and any similarity to actual people or business enterprises is entirely
coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which illustrate programming
techniques on various operating platforms. You may copy, modify, and distribute these sample programs in
any form without payment to IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating platform for which the sample
programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore,
cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are
provided “AS IS”, without warranty of any kind. IBM shall not be liable for any damages arising out of your use
of the sample programs.
The following terms are trademarks or registered trademarks of International Business Machines Corporation,
and might also be trademarks or registered trademarks in other countries.
CICS®, IBM®, IBM Telum®, IBM Z®, IBM z16™, Redbooks®, Redbooks (logo)®, WebSphere®, z16™, z/OS®
The registered trademark Linux® is used pursuant to a sublicense from the Linux Foundation, the exclusive
licensee of Linus Torvalds, owner of the mark on a worldwide basis.
Java, and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its
affiliates.
Red Hat, OpenShift are trademarks or registered trademarks of Red Hat, Inc. or its subsidiaries in the United
States and other countries.
Other company, product, or service names may be trademarks or service marks of others.
REDP-5713-00
ISBN 0738434507
Printed in U.S.A.
ibm.com/redbooks