
A Business Guide To Modern Predictive Analytics


What’s inside
Why this guide?
The big picture
Why predictive analytics and AI matter
The tipping point for AI adoption
How can AI augment your business?
Climbing the AI ladder
What are your solution options?
Taking the next step
Key takeaways
Why combine decision optimization?
Glossary

Why this guide?

Modern predictive analytics is about using machine-generated
predictions with human insight to drive business forward.

In business, foresight is everything. If you can predict what will
happen next, you can do the following tasks:
–– Make smarter decisions
–– Get to market faster
–– Disrupt your competitors

Modern predictive analytics can empower your business to augment
historical data with real-time insights, then harness the
combination to predict and shape your future.

Predictive analytics is a key milestone on the analytics journey:
a point of confluence, where classical statistical analysis
techniques meet the new world of artificial intelligence (AI).

According to Forrester Research, enterprises have reached a point
where they can begin combining machine learning with knowledge
engineering. Augmenting data with human wisdom will dramatically
accelerate the development of AI applications.

This guide will help your business perform the following actions:

–– Navigate the modern predictive analytics landscape
–– Identify opportunities to grow and enhance your use of AI
–– Empower both data science teams and business stakeholders
to deliver value, fast

→ Back to Table of Contents 3


The big picture

As the AI revolution takes hold, businesses are increasingly
asking their data science teams to tackle the big questions:

1 How do our customers behave?
2 Why are our markets fluctuating?
3 What makes our business strategies succeed or fail?
4 What will happen next?
5 How are the projects funded?
6 Where are the buying centers?

As a result, data scientists are expected to do much more than
work on one-off research projects. They need to find repeatable,
automated ways to provide real-time insights for day-to-day
decision-making.

To meet these expectations, data science leaders not only need
to be able to explain the potential of modern predictive analytics
technologies to business stakeholders; they also need to deliver
the results.

The ability to define and execute a successful data science
strategy will be one of the key differentiators between leaders
and followers in the years ahead.

This is no simple task. Building up your data science capabilities
will involve the following activities:

–– Attracting and retaining a disparate team of skilled specialists
–– Empowering them to collaborate seamlessly
–– Putting sound governance structures in place to ensure
that predictive models can always be trusted by the business

Above all, data science and business teams need to find
new ways to collaborate effectively. These methods include
understanding what predictive analytics can do and identifying
the areas where AI will drive business advantage.



Why predictive analytics and AI matter

$77.6 billion will be spent on cognitive and AI systems by 2022
(Source: IDC)

Predictive analytics is not a new concept. Statisticians have
been using decision trees and linear and logistic regression
for years to help businesses correlate and classify their data
and make predictions.

What’s new is that the scope of predictive analytics has
broadened. Breakthroughs in machine learning and deep
learning have opened up opportunities to use predictive
models in areas that have been impractical for most
business investments until now.

Enterprises are seeing an unprecedented confluence of intuitive
tools, new predictive techniques and hybrid cloud deployment
models that are making predictive analytics more accessible
than before.

This situation has created a tipping point. For the first time,
organizations of all sizes can do the following activities:

–– Embed predictive analytics into their business processes
–– Harness AI at scale
–– Extract value from previously unexplored “dark data”,
including everything from raw text to geolocational information

If you can evolve from departmental, small-group AI projects
and advance toward an enterprise data science platform,
your organization stands to gain significant competitive
advantage. Those who don’t seize the opportunity risk falling
behind the curve.



The tipping point for AI adoption

What types of data can be analyzed?

Before:
Primarily relational data at scale; other types of data
require ad hoc research projects.

Now:
Relational data, semi-structured documents, text, sensor data
and more; both historical and real-time analytics are possible
at scale.

What tooling is available?

Before:
Disparate, incompatible tools that require multiple handovers
between teams with different expertise.

Now:
A blend of drag-and-drop interfaces and open source notebooks
that make collaboration between teams more convenient.

What analytical techniques can be used?

Before:
Basic statistical techniques such as logistic and linear regression.

Now:
Statistical techniques augmented with state-of-the-art machine
learning and deep learning algorithms.

How do enterprises deploy analytics applications?

Before:
Applications and analytics are tied to data on on-premises
servers and data warehouse appliances, reducing opportunities
for anytime, anywhere analytics.

Now:
Hybrid, multi-cloud deployments help push analytics to wherever
data resides, while combining on-premises security with flexibility
and scalability.



How can enterprises integrate analytics into their business
processes?

Before:
Generate static reports for manual analysis by business experts.

Now:
Seamlessly embed predictive models into new apps and
enterprise applications.

How do enterprises implement governance?

Before:
Ad hoc adherence to policies at departmental level,
with minimal visibility or traceability.

Now:
A coherent governance and security framework enables
enterprise-wide policies to be enforced at scale.

How do enterprises inject artificial intelligence into modern
applications?

Before:
A total disconnect between application development and data
science teams means each deployment is a custom process.

Now:
The data science lifecycle is designed to create
a standardized, repeatable process for AI integration.

How can enterprises progress on their analytics journey?

Before:
Each step from descriptive to predictive and prescriptive
analytics requires separate tools, skills and investment.

Now:
An integrated platform supports analytic progression,
simplifies onboarding and grows with you as your needs
change and skills develop.



How can AI augment your business?

Which business functions are leading business investment
in AI systems?
46% sales and marketing
40% customer support
(Source: Forrester Research)

In theory, adopting a modern approach to predictive analytics
should be straightforward. The technology is no longer an obstacle,
and better tooling is lowering the barriers to entry significantly.

However, in practice, delivering value can still be a challenge. It’s
especially easy for business stakeholders to get caught up in the
hype around AI and have unrealistic expectations of what data
science can achieve.

Defining use cases

The first task for data science and business leaders is to work
together to identify concrete, practical use cases where modern
predictive analytics can deliver value.

Some use cases may be generally applicable across most
industries, such as the following examples:

–– Product recommendation and “next best action”
models for sales and marketing teams
–– Contact center automation for customer support teams

Other use cases may be specific to a particular industry,
department or even team within a business. These tend to
be more difficult to execute, but they have a greater potential
to unlock unique competitive advantages.



General use cases

When a business begins investing in a new technology, it often
makes sense to pick the lowest-hanging fruit first.

Predictive analytics is no different. Several use cases are widely
applicable across industries, and vendors have already developed
general-purpose, prepackaged models and services.

These services can be an excellent starting point for businesses
that want to transform data science from a research function
into an embedded part of day-to-day operations. They are easy
to deploy, require minimal custom development and deliver
value quickly.

Some of the most common cross-industry use cases for
modern predictive analytics include:

–– Increasing cross- and up-selling with personalized real-time
recommendations and offers
–– Boosting loyalty by anticipating customer churn and
intervening to prevent it
–– Optimizing offerings by listening to voices of customers
and anticipating future needs
–– Enhancing marketing with targeted, personalized campaigns
–– Minimizing inventory costs and improving resource
management with accurate forecasting
–– Improving productivity by allocating the right employees
to the right jobs at the right time and creating accurate
labor forecasts
–– Reducing maintenance costs by anticipating faults before
they occur
–– Mitigating risk with accurate customer credit scoring
–– Detecting fraud by identifying suspicious behavior patterns
–– Unlocking new business models by addressing untapped
demands and integrating prediction into modern apps

Contact center optimization

Handling unpredictable volumes of customer calls, emails,
SMS and chat messages is a challenge for many customer
service teams.

Intelligent chatbots are a powerful and cost-effective way
to take the pressure off employees and reduce wait times
for customers. These chatbots use the following features
to understand customer inquiries:

–– AI-powered speech recognition
–– Natural language processing
–– Content analytics to explore the company’s knowledge
base and find helpful answers, without needing
human intervention



Industry-specific use cases

Innovative organizations across many industries are already
investing in building their own predictive models to solve specific
business problems. The following sections highlight just a few of the
potential applications for AI and predictive analytics across several
major industries.

Commercial banking

Commercial banks use predictive analytics for the following tasks:

–– Assess market and counterparty risk on trades
–– Assess credit risk for loan applications
–– Detect fraudulent transactions in real time
–– Harness predictive modeling to accelerate
loan approval processes

Insurance

Insurers use predictive analytics for the following tasks:

–– Detect fraudulent claims
–– Optimize quotes and premiums by assessing
relevant risks for each applicant
–– Predict hazardous weather events to reduce
auto insurance claims

Energy and utilities

Utilities use predictive analytics for the following tasks:

–– Manage vast networks of physical assets
–– Forecast production and demand patterns
–– Predict outages before they happen
–– Plan for supply and demand



Government

Governments rely on accurate statistics to inform policy-making
across many areas, including the following use cases for predictive
analytics:

–– Detect benefit fraud
–– Predict usage patterns for public services
–– Optimize waste management and traffic flows

Manufacturing

Manufacturers use predictive analytics for the following tasks:

–– Keep production lines running smoothly by modeling
product quality and detecting defects
–– Optimize warehouse management and logistics
–– Develop sensors for autonomous vehicles by using
machine learning models

Retail

Retailers use predictive analytics for the following tasks:

–– Manage customer loyalty programs
–– Boost cross- and up-selling by making targeted
recommendations based on customer profiles
and sophisticated propensity models
–– Enable accurate demand forecasting

Food

The food industry uses predictive analytics for the following tasks:

–– Automate data collection and analysis on food health
–– Predict and warn of potential health outbreaks to enable
rapid intervention
–– Protect companies’ sensitive data, making it safe
for competitors to collaborate



Healthcare

Healthcare organizations can use statistical modeling techniques
for the following tasks:

–– Monitor streams of data from ECGs and other medical devices
–– Predict when a patient’s condition may change
–– Perform medical research
–– Analyze streams of patient data in real time

Retail banking

Retail banks use predictive analytics for the following tasks:

–– Enhance customer satisfaction through faster credit scoring
–– Combine flexibility with robustness and security through
hybrid cloud infrastructure
–– Cut costs and accelerate development due
to innovative architecture

Transportation

Transportation and logistics companies use predictive analytics
for the following tasks:

–– Optimize route planning
–– Enable predictive maintenance for vehicles
–– Optimize supply chain operations

Education

Education institutions use predictive analytics for the
following tasks:

–– Predict student achievement and retention
–– Identify students who need extra support
to reach their goals
–– Strengthen donor relationships
–– Track student movements to help reduce absenteeism



Climbing the AI ladder

Achieving success with modern predictive analytics is a
journey. It’s important to pitch AI strategy at the right level
for a business, taking both technical and organizational
maturity into account. Data science and business leaders
need to work together to define the best and fastest way
to deliver business value.

From the technical perspective, you can visualize AI
maturity as a ladder. The first step on the ladder is data
collection, because without data, you won’t have anything
to analyze or model. The next step is data organization.
Add metadata for governance and discoverability, to ensure
that the right data is always available to the data scientists
who need it.

[Figure: The AI ladder, from the top rung to the bottom]
Infuse - Operationalize AI with trust and transparency
Analyze - Scale insights with AI everywhere
Organize - Create a trusted analytics foundation
Collect - Make data simple and accessible
Data of every type, no matter where it lives

While data collection and organization are important topics,
they’re beyond the scope of this guide. Instead, let’s focus
on helping you climb the following top two levels of the ladder:

–– Analyzing data by building, training and testing
predictive models
–– Infusing AI into operations by deploying those
models into production as part of your applications



What are your solution options?

[Figure: The AI portfolio from IBM]
Interact with pre-built AI services: Watson application services
Build: Watson Studio
Deploy: Watson Machine Learning
Catalog: Watson Knowledge Catalog
Manage: Watson OpenScale
AI open source frameworks
Unify on a multicloud data platform: IBM Cloud Private for Data

The AI portfolio from IBM offers everything you need to reach
the top rungs of the AI ladder.

Pre-built AI services such as Watson Assistant and Watson
Visual Recognition help you address common use cases quickly
and efficiently, delivering value fast.

When you’re ready to start developing your own AI solutions,
Watson Studio and Watson Machine Learning provide seamless
workflows for building, training and deploying predictive models.
These solutions empower you by harnessing both state-of-the-art
IBM tools and the best open source AI frameworks.

Watson Knowledge Catalog provides robust data governance and
discoverability for models and data, while Watson OpenScale helps
you monitor and manage models in real time, boosting accuracy,
increasing explainability and mitigating bias.

IBM Cloud™ Private for Data unifies access to all these capabilities
and provides a powerful multicloud data platform.

The IBM Data Science Premium add-on for IBM Cloud Private for Data
provides additional data science productivity capabilities, such
as SPSS Modeler and Decision Optimization, to accelerate time
to value and increase the chance that your AI/ML project succeeds.



[Figure: The data science lifecycle as three interacting sub-cycles]
Build (Watson Studio): data exploration, data preparation, model
development. Rebuild models, improve performance and mitigate bias.
Run (Watson Machine Learning): deployment, model management,
retraining. Easily deploy models for online, batch or streaming
deployments.
Manage (Watson OpenScale): fairness and explainability, inputs for
continuous evolution, business KPIs and production metrics. Monitor
and orchestrate models served with Watson Machine Learning.

The top two steps of the AI ladder are Analyze and Infuse. To reach these
steps, organizations must help data scientists and business stakeholders
work together effectively at every stage of the data science lifecycle.

The complete lifecycle can be visualized as the following three
sub-cycles that interact with each other:

Build
Data scientists explore business data to identify interesting
features, then prepare well-structured data sets that are used
to design predictive models.

Run
Operations teams train, test, deploy and manage the models,
and retrain them when necessary.

Manage
Business experts monitor the models’ runtime performance, look
for any signs of bias or need for explanation, provide feedback
and notify the data science team when the models need retraining.



Taking the next step

Depending on their level of progress on the AI ladder, businesses
may have different requirements based upon the level of predictive
analytics adoption across their organization.

Starting out

When businesses begin building their data science capabilities,
they often start with ad hoc projects, developing models to answer
specific questions or support research projects. With solutions
such as Watson Studio Desktop, data scientists can work 24x7
on their own computers or laptops and sync up with a wider team
when needed.

Growing up

When data science is adopted widely, different departments need
to deploy their models, connect them to data sources and infuse
them into production applications. Watson Studio and Watson
Machine Learning make it easier for departmental data science
and IT teams to collaborate across this lifecycle.

Going enterprise-scale

Once AI is embedded into business-critical processes, building
a central platform is vital in order to manage and govern models
and data. IBM Cloud Private for Data can provide the infrastructure
and tools required for a comprehensive, multicloud platform that
acts as a single point of control.

Getting practical

Whether you’re a data scientist or a business leader, the best
way to learn how the modern predictive analytics portfolio from
IBM can transform your business is to experience it for yourself.
Try one of the following tutorials to get started:

Perform a machine learning exercise
Dive into machine learning by performing an exercise
in IBM Watson Studio using Apache SystemML.

Create a scoring model to predict heart rate failure
Use IBM Watson Studio to build a predictive model with IBM
Watson Machine Learning.

Predict equipment failure using IoT sensor data
See how IBM Watson Studio can analyze multivariate Internet
of Things (IoT) sensor data and predict equipment failure.

Analyze open medical datasets to gain insights
Use IBM Watson Studio to run machine learning classifiers and
compare the outputs with evaluating measures.

Shape and refine raw data
Work with IBM Data Refinery to prepare large data sets for
predictive analysis.



Key takeaways

The modern predictive analytics portfolio from IBM offers
the following benefits, which data science and business leaders
can use to help seize competitive advantage in the age of AI:

Scale

–– Reduce operational workload and costs by automating
data science and data engineering tasks
–– Train, test and deploy models seamlessly across multiple
enterprise applications
–– Extend common data science capabilities across hybrid,
multicloud environments

Speed

–– Accelerate development by harnessing pre-built applications
and pre-trained models
–– Deliver value faster by helping data science and business
teams collaborate
–– Streamline model building by combining state-of-the-art
IBM and open source software

Simplicity

–– Take advantage of a central platform to manage the entire
data science lifecycle
–– Standardize development and deployment processes
–– Create a single framework for data governance and security
across the organization

Watson Studio helps businesses focus on solving problems and
identifying opportunities.

Watson Machine Learning empowers businesses to deploy and manage
models to give the results they need fast.



Why combine decision optimization with predictive analytics?

IBM Decision Optimization is a prescriptive analytics solution
that enables highly data-intensive industries to make better
decisions and achieve business goals by solving complex
optimization problems. Business leaders use this tool
to use resources more efficiently, including but
not limited to the following activities:

–– Inventory flow for supply chain
–– Workforce scheduling
–– Routing of transportation

This solution works well with predictive analytics by using the
predictive outcomes of machine learning applications to provide
optimized outcomes. Machine learning provides insights on
the future based on observations given by users. With machine
learning, you know the answer, and you train the machine
how to find that answer.

Decision optimization lets you take the next step and act on
that information. With decision optimization, while you don’t
know the answer, you do know a lot about what is a good and
bad answer. You take the output from your machine learning
and specify an action for decision optimization to make, which
can include optimization rules and constraints to achieve
business goals.

Following that action, decision optimization returns answers
to deliver value to the business, such as actionable items and
recommendations for change. By performing this activity, decision
optimization enhances what predictive analytics can offer you.

The solution lets teams combine optimization and machine
learning techniques with model management, deployment
and other data science capabilities to develop optimal solutions
that improve operational efficiency.
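
As a rough illustration of how a prediction can feed an optimization
step, the sketch below takes forecast demand figures (made-up numbers
standing in for the output of a machine learning model) and decides how
to split a limited stock between stores so that expected profit is as
high as possible. It uses plain Python rather than IBM Decision
Optimization, and the Store class, the allocate_stock function and all
values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Store:
    name: str
    predicted_demand: int   # units; assumed to come from a forecasting model
    margin_per_unit: float  # profit earned on each unit sold


def allocate_stock(stores, total_stock):
    """Share a limited stock between stores, highest margin first.

    With a single shared stock limit and a demand cap per store,
    this greedy rule gives the most profitable allocation.
    """
    allocation = {store.name: 0 for store in stores}
    remaining = total_stock
    for store in sorted(stores, key=lambda s: s.margin_per_unit, reverse=True):
        units = min(store.predicted_demand, remaining)
        allocation[store.name] = units
        remaining -= units
    return allocation


def expected_profit(stores, allocation):
    """Profit if every allocated unit sells, as the forecast suggests."""
    return sum(s.margin_per_unit * allocation[s.name] for s in stores)


# Hypothetical forecasts and margins, for illustration only.
stores = [
    Store("north", predicted_demand=40, margin_per_unit=3.0),
    Store("south", predicted_demand=70, margin_per_unit=5.0),
    Store("east", predicted_demand=30, margin_per_unit=4.0),
]

plan = allocate_stock(stores, total_stock=100)
print(plan)                           # {'north': 0, 'south': 70, 'east': 30}
print(expected_profit(stores, plan))  # 470.0
```

The prediction says what demand is likely; the optimization step
decides what to do about it under a constraint. Real scheduling and
routing problems involve many interacting constraints and typically
need a dedicated optimization solver rather than a simple rule like
this one.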



Glossary

Algorithms are sets of rules that define a sequence of operations
that can be applied to data to solve a particular problem.
In a data science context, the term encompasses a huge range
of techniques, including the following:

–– Decision trees and regression models
–– Autoregressive Moving Average (ARMA), Autoregressive
Integrated Moving Average (ARIMA) and exponential smoothing
–– Transfer functions with predictors and outlier detection
–– Ensemble and hierarchical models
–– Support vector machines and temporal causal modeling
–– Time series and spatial AR for spatiotemporal prediction
–– Generative adversarial networks (GANs) and reinforcement
learning

Your data science platform should give you easy access to all
these powerful algorithms.

Artificial intelligence (AI) is the ability of computer systems
to interpret and learn from data. The term is most commonly
used to describe systems built using machine learning or deep
learning models. AI techniques can be used to enable computers
to solve a wide range of problems that were previously
considered intractable.

Bias is a common issue when designing, training and testing
models that can lead to inaccurate predictions. Mitigating
bias by monitoring and auditing models during runtime
is an increasingly important topic as businesses seek to
adopt AI more widely.

Classification models aim to put data points into categories by
comparing them with a set of data points that have already been
categorized. The result is a discrete value, meaning one of a limited
list of options, rather than a score. For example, a classification
model can give a yes or no answer on whether customers are likely
to make a purchase or if they are a bad credit risk. Classification
models can be built using various techniques, including decision
trees and logistic regression.

Content analytics is the analysis of unstructured data in
documents of various formats, including text, images, audio
and video files. Machine learning techniques can greatly
accelerate the analysis of large repositories of content that
would previously have required workers hundreds or thousands
of hours to review and classify.

Data science is a wide-ranging discipline that unifies aspects
of statistics, data analysis and machine learning to harness data
to solve business problems.

Deep learning is a branch of machine learning that uses neural
networks with large numbers of hidden layers. These highly
sophisticated networks are used in cutting-edge fields
such as computer vision, machine translation and
speech recognition.

Training a deep neural network is extremely computationally
intensive, typically requiring clusters of machines with high-
performance processors. A hybrid cloud platform such as IBM
Watson Studio or IBM Cloud Private for Data can make this kind
of infrastructure more accessible and affordable for companies
of all sizes.
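
The deep learning entry above refers to hidden layers. As a minimal
sketch of what that means, the Python below passes one input through
two small hidden layers to produce a score between 0 and 1. The weights
are hand-picked and untrained, the function names are illustrative
only, and real networks are far larger and are normally built with a
dedicated framework.

```python
import math


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum of the inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Activation used in the hidden layers: negative values become zero."""
    return [max(0.0, v) for v in values]


def sigmoid(z):
    """Squash a number into the range 0 to 1 so it reads as a score."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_layers, output_layer):
    """Pass the data from the input layer through each hidden layer in turn."""
    activations = inputs
    for weights, biases in hidden_layers:
        activations = relu(dense(activations, weights, biases))
    out_weights, out_biases = output_layer
    return sigmoid(dense(activations, out_weights, out_biases)[0])


# Hand-picked, untrained weights (assumptions for illustration only).
hidden_layers = [
    ([[0.5, -0.2], [0.1, 0.4]], [0.0, 0.1]),   # hidden layer 1: 2 neurons
    ([[0.3, 0.8], [-0.6, 0.2]], [0.05, 0.0]),  # hidden layer 2: 2 neurons
]
output_layer = ([[1.0, -1.0]], [0.0])          # a single output neuron

score = forward([1.0, 2.0], hidden_layers, output_layer)
print(round(score, 3))
```

Training is the process of adjusting those weights until the scores
match known examples. Repeating that adjustment across many layers and
very large data sets is what makes deep learning so computationally
demanding.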



Deployment is the process of integrating a model into your
business applications and running that model against real-world
data. Making and moving the model through test, staging and
production environments requires collaboration between your
data science, application development and IT operations teams.

It can be challenging to integrate open source data science
tools with the organization’s existing continuous integration and
deployment pipeline. To avoid manual deployments with multiple
handovers between teams, a coherent data science platform with
automated deployment capabilities can be a major advantage.

Development of predictive models involves the use of traditional
statistical techniques or machine learning algorithms to create and
refine models by training and testing them against your data sets.

The development process is highly iterative; you may need to
train dozens or even hundreds of models to achieve the level of
accuracy you require. That’s why automating the workflows around
model development and training can deliver huge value.

Explainability is an important attribute of any system that uses
predictive models to make recommendations and assist business
decision-making. When a predictive model is seen as complex and
mysterious, it can be difficult to convince business stakeholders,
regulators and customers to trust its output. The advanced
runtime monitoring and logging capabilities of Watson OpenScale
provide context around each decision, making AI models
transparent and auditable.

Exploration of data is an important part of the model building
process. This activity aims to reveal interesting features in a given
data set, uncover hidden relationships and highlight use cases
where predictive modeling could deliver business value.

During the exploration phase, it’s critical to exercise data science
skills and business knowledge to define questions you want
to answer and outcomes you want to predict. This may result
in an iterative cycle of preparation and exploration until you
have fully explored the domain and have the data in the right
shape to proceed.

Geospatial analytics is the analysis of geographic data such as
latitude and longitude, postal codes and addresses. This analysis
is extremely useful for solving many kinds of practical data science
problems. A modern data science platform should make it easy to
detect, parse and calculate geospatial information, and offer easy
integration with mapping tools to visualize the results.

Inference in artificial intelligence applies logical rules to
the knowledge base to draw conclusions in the presence of
uncertainty. With inference, users get a prediction that is simplified,
compressed and optimized for runtime performance.

Linear regression is a statistical process that uses one or more
independent variables to explain or predict a value or score.
Examples include the number of SKUs of a product sold in a given
week or the percentage risk of a customer closing their account.

Logistic regression is a statistical process used in predicting
outcomes. The process differs from linear regression in that the
dependent variable has only a limited number of possible values
rather than infinite possibilities. Users employ logistic regression
when the response falls into categories, including ordered
categories such as first, second, third and so on.

Machine learning uses statistical techniques to derive
sophisticated predictive models and algorithms from large
data sets, without requiring explicit programming.



Typically, you start this iterative process by dividing a data set into Open source software has become an increasingly dominant
two subsets for training and testing. You train your models against paradigm in many areas of statistical modeling and machine
the training set and test their performance against the testing set learning. Languages like R, Python and Scala, big data
with dozens or hundreds of variations to assess their predictions’ architectures such as Apache Hadoop and Spark, and machine
accuracy. By running this process and basing the next generation
of variations on the best performers from each iteration, the model
gradually learns and improves performance.

The main approaches to machine learning can be divided into
two categories: supervised and unsupervised learning.

Management of models is vital to ensure that they remain
accurate over time. Retraining models regularly to take new data
into account is critical, so model development, implementation,
deployment and management should form a continuous cycle.

This management can be difficult to achieve with disparate open
source tools. Using an end-to-end data science platform can avoid
gaps in the process. The platform also can ensure that appropriate
teams will be notified immediately and can take rapid action
whenever a model’s performance begins to degrade.

Natural language processing (NLP) is a field of AI that focuses
primarily on enabling computers to analyze unstructured textual
data. Common use cases include speech recognition, natural
language understanding and sentiment analysis.

Neural networks provide a framework for training models that
enables complex interaction between many machine learning
algorithms to help identify optimal models.

The structure of interconnecting neurons in the brains of humans
and other animals inspired the structure of artificial neural
networks. Layers connect the artificial neurons. Data traverses
the structure from the input layer through one or more hidden
layers to the output layer. During this traversal, mathematical
functions transform the data into a prediction whose accuracy
you can assess.

learning frameworks like TensorFlow and Spark MLlib,
are all major players in the world of predictive analytics and
data science.

Open source frameworks often focus on developing high-quality
tools that target specific parts of the data science process, such
as model development or training. As a result, they often leave the
end user responsible for integrating all the tools together into a
coherent workflow. This task can be a problem when you are trying
to scale predictive analytics across the enterprise and embed AI
into business processes.

Predictive analytics uses historical data to model a specific
domain or problem and isolate the key factors that have driven
specific outcomes in the past. Models built using this process
predict likely future outcomes from new data.

Predictive analytics can encompass a wide range of techniques,
from classical statistical modeling to machine learning algorithms.

Predictive models are algorithms that map an input, meaning a
piece of data, such as a database record, text sample or image,
to an output or prediction. Outputs are typically either continuous
variables, such as a number or percentage, or discrete categories,
such as “yes” or “no.” There are two major types of predictive
models: regression models and classification models.
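
As a rough illustration of the neural networks entry above (and of a predictive model mapping an input to an output), the sketch below passes one input through a single hidden layer to an output layer. It is a minimal sketch, not a trained model: the weights, the biases, the sigmoid activation and the names `dense` and `predict` are all assumptions chosen for illustration and do not come from this guide.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One layer: each neuron applies the activation to a weighted sum of its inputs plus a bias."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


# Hypothetical, hand-picked parameters. In practice these are learned during training.
HIDDEN_W = [[0.5, -0.2], [0.3, 0.8]]  # two hidden neurons, two inputs each
HIDDEN_B = [0.0, -0.1]
OUTPUT_W = [[1.2, -0.7]]              # one output neuron, two hidden inputs
OUTPUT_B = [0.05]


def predict(features):
    """Input layer -> hidden layer -> output layer, returning a score between 0 and 1."""
    hidden = dense(features, HIDDEN_W, HIDDEN_B, sigmoid)
    output = dense(hidden, OUTPUT_W, OUTPUT_B, sigmoid)
    return output[0]


score = predict([1.0, 2.0])
# Turn the continuous score into a discrete category.
label = "yes" if score >= 0.5 else "no"
print(score, label)
```

The score is a continuous output, and thresholding it at 0.5 (an arbitrary cut-off used here) turns it into a discrete “yes” or “no” category.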



Preparation of data is one of the first steps in the data science
process. Most projects start by refining data sets to ensure that
the quality is high enough to bear the weight of detailed analysis.

In many cases, your source data may need to be cleaned and
transformed into a format that is more amenable to modeling and
analysis. If you’re building a machine learning model, you may also
need to invest in manually labeling the data for use
in supervised learning.

Regression models are useful when you have a data set that
contains multiple variables and want to analyze the relationship
between them. Specifically, regression models can reveal how
one specific variable is likely to change when other variables
are altered.

Linear regression can be used to predict a value or score. Examples
include the number of SKUs of a product sold in a given week
and the percentage risk of a customer closing their account.

Statistical modeling is a domain of mathematics that involves
the creation of models based on probabilistic assumptions about
a set of data. Businesses have used statistical models to analyze
important features of their data sets and identify correlations that
can be used to classify data or generate predictions.

Supervised learning is a method of training a machine learning
model using a data set where the data has already been correctly
labeled. The model produces an output variable (typically
a category or a value), so its accuracy can easily be assessed
by comparing the output to the known label. Linear regression,
random forests and support vector machines are all popular
examples of supervised learning algorithms, and most predictive
models are built using these techniques.

Testing predictive models, along with training them, is essential
for determining how accurate they are. Predictive
models need to be tested continuously to improve accuracy.
If a model fails, analysts must identify the root cause, then retrain
and retest to improve the model.

Text analytics measures unstructured content using linguistic
rules, natural language processing and machine learning. This
process reviews data in much the same way as a human
brain does, but at a faster rate. With text analytics, you obtain more
insights and discoveries from unstructured content, which makes
up approximately 90 percent of all data.

Training predictive models is a key element of machine learning,
deep learning and other AI processes to determine which data is
useful. A model trained to give accurate predictions can be used
to score real-time data. Models must be retrained periodically to
adjust for changing behavior patterns.

Unsupervised learning is a method of training machine learning
models with unlabeled data. The aim is typically to model and
highlight interesting patterns or structures within the data.
Clustering and association problems are common domains for
unsupervised learning, for example finding interesting new ways
to segment customers or identify similarities between them.

Visualization is the process of representing data graphically,
often using charts and diagrams. To understand data, humans
need to be able to visualize it. This process is important both
when presenting your results to business stakeholders and when
exploring a new data set during the early stages of a project.

Your predictive analytics platform should provide an intuitive
graphical interface with visualization tools. These features help
you to start making sense of even the largest data sets in minutes.
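
To make the linear regression, supervised learning, training and testing entries above more concrete, here is a minimal sketch that fits a straight line to labeled examples and then measures its error on held-out examples. The data values, the single-feature setup and the names `fit_line` and `mean_absolute_error` are assumptions made up for illustration. A real project would more likely use a statistics or machine learning library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_absolute_error(model, xs, ys):
    """Average absolute gap between the model's predictions and the known labels."""
    slope, intercept = model
    return sum(abs((slope * x + intercept) - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up labeled data: x could be weekly promotion spend, y the units sold that week.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [12.0, 14.0, 16.0, 18.0]

# Held-out labeled data that the model never sees during training.
test_x = [5.0, 6.0]
test_y = [20.5, 21.5]

model = fit_line(train_x, train_y)                            # training
test_error = mean_absolute_error(model, test_x, test_y)       # testing
print(model, test_error)
```

Keeping the test examples separate from the training examples is what lets the error figure say something about new data. If the error grows over time, that is one signal that the model may need retraining.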



© Copyright IBM Corporation 2019

IBM Corporation
New Orchard Road
Armonk, NY 10504

Produced in the United States of America


March 2019

IBM, the IBM logo, ibm.com, IBM Cloud, IBM SPSS Modeler and IBM Watson are trademarks of International Business
Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of
IBM or other companies. A current list of IBM trademarks is available on the web at “Copyright and trademark
information” at www.ibm.com/legal/copytrade.shtml.

This document is current as of the initial date of publication and may be changed by IBM at any time. Not all offerings
are available in every country in which IBM operates.

The performance data discussed herein is presented as derived under specific operating conditions. Actual results
may vary. THE INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT ANY WARRANTY, EXPRESS OR
IMPLIED, INCLUDING WITHOUT ANY WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE
AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT. IBM products are warranted according to the terms
and conditions of the agreements under which they are provided.

The client is responsible for ensuring compliance with laws and regulations applicable to it. IBM does not provide
legal advice or represent or warrant that its services or products will ensure that the client is in compliance with
any law or regulation.

Statement of Good Security Practices: IT system security involves protecting systems and information through
prevention, detection and response to improper access from within and outside your enterprise. Improper access
can result in information being altered, destroyed, misappropriated or misused or can result in damage to or misuse
of your systems, including for use in attacks on others. No IT system or product should be considered completely
secure and no single product, service or security measure can be completely effective in preventing improper use
or access. IBM systems, products and services are designed to be part of a lawful, comprehensive security approach,
which will necessarily involve additional operational procedures, and may require other systems, products or services
to be most effective. IBM DOES NOT WARRANT THAT ANY SYSTEMS, PRODUCTS OR SERVICES ARE IMMUNE FROM,
OR WILL MAKE YOUR ENTERPRISE IMMUNE FROM, THE MALICIOUS OR ILLEGAL CONDUCT OF ANY PARTY.
