
THE FUNDAMENTALS

OF MACHINE LEARNING
TABLE OF CONTENTS

WHAT IS MACHINE LEARNING?

BRIEF HISTORY OF MACHINE LEARNING

HOW IT WORKS

MACHINE LEARNING TECHNIQUES

THE IMPORTANCE OF THE HUMAN ELEMENT

WHO'S USING IT?

CHALLENGES AND HESITATIONS

THE FUTURE OF MACHINE LEARNING

CONTRIBUTORS

WHAT IS MACHINE LEARNING?

Whether we realize it or not, machine learning is something we encounter on a daily basis. While the technology is not new, with the rise of artificial intelligence (AI) and the digital age, it is becoming increasingly important to understand what it is, how it differs from AI, and the major role it will play in the future. This whitepaper will discuss all of the above, and explore different types of machine learning, how they work, and how a majority of industries are utilizing them.

First and foremost, it's important to understand exactly what machine learning is and how it differs from AI. In its simplest form, machine learning is a set of algorithms that learn from data and/or experience, rather than being explicitly programmed. Each task requires a different set of algorithms, and these algorithms detect patterns in order to perform the task.

HOW IS IT DIFFERENT FROM ARTIFICIAL INTELLIGENCE?

While the two are interconnected, machine learning and artificial intelligence are different. It's easiest to think of machine learning as the underlying technology of AI. The goal of AI is to imitate and mimic human behavior, and machine learning gives us the mathematical tools that allow us to do that. With the help of machine learning algorithms, AI can understand language and conduct a conversation, continually learning and improving itself based on experience. So machine learning, like a human, learns from data so that it can perform a higher-level function.

HOW DOES MACHINE LEARNING IMPACT OUR DAILY LIVES?

If you asked someone on the street if they have ever heard of or utilized machine learning, their answer would probably be no. What they don't know is that they've probably encountered it numerous times—just in one day.

When you ask Siri what the weather forecast is, that's machine learning. When you Google search something at work to help you do your job better or more efficiently, you can thank machine learning. Another everyday example is our spam folders—a machine learning algorithm determines which emails are inbox-worthy, and which are spam and don't deserve attention. Similarly, when Netflix suggests a show you should watch based on your preferences, it's getting the suggestion from an algorithm.

From TV suggestions to self-driving cars, machine learning is subtly in the background of almost all that we do. These algorithms, and machine learning as a whole, are intended to improve and radically simplify our lives. According to Srinivas Bangalore, Director of Research and Technology at Interactions, "good machine learning should not be in your face. It should be behind the scenes, tracking, and helping achieve goals much more quickly and efficiently."
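The spam-folder example can be made concrete with a tiny classifier. The sketch below is a minimal naive Bayes filter over an invented handful of word counts; the words, labels, and smoothing choice are illustrative assumptions, not any production spam system:

```python
import math
from collections import Counter

# Toy training set of (words, label) pairs; the words and labels are invented.
train = [
    (["win", "money", "now"], "spam"),
    (["free", "money", "offer"], "spam"),
    (["meeting", "tomorrow", "agenda"], "ham"),
    (["project", "report", "attached"], "ham"),
]

def train_counts(examples):
    """Count how often each word appears in each class, and how large each class is."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for words, label in examples:
        word_counts[label].update(words)
        class_counts[label] += 1
    return word_counts, class_counts

def classify(words, word_counts, class_counts):
    """Return the class with the highest log-probability under naive Bayes."""
    vocab = {w for counts in word_counts.values() for w in counts}
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        total = sum(word_counts[label].values())
        # Log prior for the class, plus add-one-smoothed log likelihood per word.
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for w in words:
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

word_counts, class_counts = train_counts(train)
print(classify(["free", "money", "offer"], word_counts, class_counts))  # prints: spam
```

The model never sees a rule like "money means spam"; it simply learns that those words occur more often in the spam examples, which is the pattern-detection idea described above.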

BRIEF HISTORY OF
MACHINE LEARNING

From the 1950s to now, machine learning has developed significantly. Below is a brief history of machine learning within the AI field. It shows how the algorithms described above were motivated by the need to solve very simple automation tasks, such as recognizing spoken words or written digits, and how AT&T showed strong leadership in this process.

THE TURING TEST 1950
In 1950, Alan Turing created the "Turing Test" to determine whether or not a computer was capable of real intelligence. In order to pass the test, the computer had to be able to fool a human into believing it was also human.

THE FIRST COMPUTER PROGRAM 1952
Arthur Samuel created the first implementation of machine learning, the game of checkers, in 1952. The computer improved at the game the more it played by determining which moves resulted in winning strategies, and incorporating those strategies into its play.

NEURAL NETWORKS FOR COMPUTERS 1957
Frank Rosenblatt designed the first neural network for computers in 1957, which was meant to simulate the thought process of the human brain.

"NEAREST NEIGHBOR" ALGORITHM 1967
The "nearest neighbor" algorithm was written in 1967, allowing computers to begin recognizing basic patterns. It could be used, for example, to map routes for traveling salesmen.

EXPLANATION BASED LEARNING 1981
EBL, or Explanation Based Learning, was created in 1981 by Gerald Dejong. This concept allowed a computer to analyze training data and create a general rule it could follow by discarding unimportant data.

MACHINE LEARNING RESEARCH GROUP 1985
Researchers from AT&T created the first research group for machine learning in 1985. They also began a series of machine learning meetings that eventually turned into NIPS, the leading conference on machine learning. This group was representative of the early machine learning community, breaking away from a computer science field still mostly interested in expert systems. These theoreticians were confronted with real-world problems where machines had to replace humans in recognizing noisy written digits: mainly check amounts and zip codes.

AUTOMATED SPEECH RECOGNITION 1992
In 1992, Jay Wilpon (SVP of Natural Language Research at Interactions) and a team of researchers at AT&T deployed the first nationwide automated speech recognition (ASR) system, using a machine learning approach called Hidden Markov Models (HMMs). It saved billions of dollars in operating costs by spotting things like collect calls.

SUPPORT VECTOR MACHINES 1992
Researchers at AT&T invented Support Vector Machines (SVMs) in 1992, a technique that revolutionized large-scale classification because of its predictable performance.

CONVOLUTIONAL NEURAL NETWORK 1996
Patrick Haffner (Lead Inventive Scientist at Interactions) and researchers from AT&T proposed the first convolutional neural network (CNN) in 1996, with a large-scale application to check recognition. The influence of this technology was not appreciated until 10 years later, when it was rebranded as deep learning; in the meantime, machine learning researchers focused on another technique developed by the same group at AT&T: Support Vector Machines.

THE ADABOOST ALGORITHM 1997
In 1997, another group of researchers from AT&T invented the Adaboost algorithm. It allowed unstructured data to be handled through decision trees, making it wildly popular across a wide range of applications.

NATURAL LANGUAGE UNDERSTANDING 2001
AT&T deployed natural language understanding in Interactive Voice Response (IVR) systems in 2001, combining three of its machine learning technologies: SVMs, HMMs, and Adaboost.

DEEP LEARNING 2006
The concept of deep learning was successfully promoted, increasing the power and accuracy of neural networks.

DEEP NEURAL NETWORKS 2011
A group of researchers began to work on deep neural networks (DNNs) in 2011, and new algorithms were discovered that made it possible to train a model on millions of examples, outcompeting techniques previously used in computer vision and speech recognition. Large DNNs trained on massive amounts of data also allowed ASR to reach 'super-human' performance in controlled settings.

MODERN APPLICATIONS OF MACHINE LEARNING

GOOGLE AND FACEBOOK UTILIZE MACHINE LEARNING 2014
In 2014, Google and Facebook made machine learning the pivotal technology of their businesses. In both companies, machine learning was led by ex-AT&T researchers.

MACHINE LEARNING AND CUSTOMER CARE 2015
In 2015, Interactions acquired AT&T's Watson and the AT&T speech and language research team. Combined with its award-winning Adaptive Understanding™ technology, Interactions delivers unprecedented accuracy in understanding that helps enterprises revolutionize their customer care experience.

MACHINE LEARNING AND SOCIAL MEDIA 2017
Acquired by Interactions in 2017, Digital Roots provides companies with AI-based social media tools. Its technology allows brands to quickly filter, respond to, and interact with followers on social media.

[Timeline graphic: Brief History of Machine Learning, 1950–2017, summarizing the milestones above.]
HOW IT WORKS

The overall goal of machine learning is to build models that imitate and generalize from data. These models need to learn how to discriminate between certain things to achieve a desired end result. Simply put, machine learning uses a variety of techniques, and algorithms within those techniques, to reach a specific goal.

RECOGNIZING PATTERNS

Machine learning learns from data, and uses that data to recognize patterns. Jay Wilpon, Senior Vice President of Natural Language Research at Interactions, describes how machine learning works with an analogy of how algorithms decipher the difference between types of fruit. For instance, let's assume someone handed you an orange and a grapefruit, and you've never seen them before. How do you tell them apart? They're both round, but the grapefruit is slightly bigger. You could then determine that size is one feature that can separate the two. Now, let's say someone hands you an apple. While the shapes are similar, this fruit is red, triggering you to realize that color is another potential differentiator. Finally, someone gives you a banana... now you can add shape as another characterization.

This simple analogy is similar to how machine learning works. The job of machine learning is not only to recognize that what it's being handed is fruit, but also to make sure that it is not calling a grapefruit a banana, and vice versa.
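The fruit analogy maps directly onto a simple classifier: each fruit becomes a vector of features (size, color, shape), and a new fruit gets the label of its closest known example. Here is a minimal nearest-neighbor sketch; the feature values are invented purely for illustration:

```python
import math

# Hypothetical feature vectors: (diameter in cm, redness 0-1, elongation 0-1).
# The numbers are invented to illustrate the idea, not measured.
examples = [
    ((8.0, 0.1, 0.1), "orange"),
    ((11.0, 0.1, 0.1), "grapefruit"),
    ((8.0, 0.9, 0.2), "apple"),
    ((4.0, 0.0, 0.9), "banana"),
]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(fruit):
    """Label a new fruit with the label of its single nearest example (1-NN)."""
    return min(examples, key=lambda ex: distance(ex[0], fruit))[1]

print(classify((10.5, 0.1, 0.1)))  # big and round: prints grapefruit
print(classify((4.2, 0.0, 0.85)))  # small and elongated: prints banana
```

Just as in the analogy, each new feature (redness, elongation) gives the classifier another dimension along which to tell a grapefruit from a banana.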

MACHINE LEARNING TECHNIQUES

It's important to remember that machine learning is not one-size-fits-all. Different algorithms, and different techniques within those algorithms, are used to build a model that is appropriate for the application. Below we discuss a number of commonly used machine learning techniques.

WHICH TECHNIQUE IS BEST?

Machine learning is not a concrete set of algorithms used across the board. Depending on what you are trying to achieve, different technologies and different algorithms can be used. But how do you know when to use which technology and/or algorithm? The answer relies heavily on the type of data, and the amount of data, that is available.

SUPERVISED LEARNING

Whether or not data has been labeled determines whether learning is supervised or unsupervised. Supervised learning uses human-labeled data, and is commonly used when past data can predict likely events. In other words, it applies when the desired output for each input is known. The algorithm is given a set of inputs along with the corresponding correct outputs, and learns by comparing its actual output with the correct outputs to find errors. Once it finds the errors, it can modify the model accordingly.

Classification

Classification, which falls under supervised learning, can be defined as trying to predict an output given an input. Classification takes an unknown group of entities and works to assign them to larger known groups. To learn, it requires a set of labeled examples, such as images, text, or speech. As the number of classes grows, the data required to train a classifier to high accuracy can be large, reaching thousands or even millions of examples. While classification typically targets simple categories, it can be extended to situations where the target is a structure or a sequence, as in natural language processing.

UNSUPERVISED LEARNING

Unsupervised learning uses unlabeled data. In this situation, the machine discovers new patterns without knowing any prior data or information. This type of learning works well with clustering, in which data is grouped into sets of similar items.

REINFORCEMENT LEARNING

Inspired by the psychological idea of reinforcement behavior, reinforcement learning is the idea of learning by doing. A machine can determine an ideal outcome by trial and error. Over time, it learns to choose the actions that result in the desirable outcome. This type of learning is often used in applications such as gaming, navigation, and more.
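As a concrete illustration of unsupervised learning, the sketch below clusters unlabeled 2-D points with a bare-bones k-means loop: no labels are given, yet the algorithm discovers the two groups on its own. The data points and parameters are invented for the example:

```python
import random

def kmeans(points, k, iters=20, seed=0):
    """Bare-bones k-means: repeatedly assign each point to its nearest center,
    then move each center to the mean of the points assigned to it."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            clusters[nearest].append(p)
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old center if a cluster ends up empty
                centers[i] = tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
    return centers

# Unlabeled 2-D points forming two obvious groups; the values are made up.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centers = sorted(kmeans(data, k=2))
print(centers)  # one center settles near (1, 1), the other near (8, 8)
```

This is the clustering idea mentioned above: the algorithm is never told what the groups mean, only that similar points belong together.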

NEURAL NETWORKS

Deep neural networks (DNNs), also known as artificial neural networks (ANNs), represent a set of techniques used to build powerful learning systems. Unlike algorithms such as SVMs and Adaboost, they add a number of "hidden" layers that are used to extract intermediate representations. While invented in the 1980s, DNNs took off after 2010 thanks to powerful parallel hardware and easy-to-use open source software. DNNs cover a huge range of different neural architectures, the best known being:

• Recurrent Neural Networks (RNN) - A network whose neurons send feedback signals to each other
• Convolutional Neural Networks (CNN) - A feed-forward ANN typically applied to visual and image recognition
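To make the idea of "hidden" layers concrete, here is a minimal forward pass through a network with one hidden layer, written in plain Python. The weights are hand-picked rather than learned, chosen so that the network computes an XOR-like function, which a network without a hidden layer cannot represent:

```python
import math

def forward(x, W1, b1, W2, b2):
    """Forward pass: hidden = relu(W1·x + b1), output = sigmoid(W2·hidden + b2)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]   # the "hidden" intermediate representation
    z = sum(w * h for w, h in zip(W2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-z))      # squash the output into (0, 1)

# Hand-picked (not learned) weights: the output is high only when
# exactly one of the two inputs is 1, i.e. XOR-like behavior.
W1 = [[1.0, 1.0], [1.0, 1.0]]
b1 = [-0.5, -1.5]
W2 = [2.0, -8.0]
b2 = -0.5

for x in [[0, 0], [1, 0], [0, 1], [1, 1]]:
    print(x, round(forward(x, W1, b1, W2, b2), 2))  # above 0.5 only for [1,0] and [0,1]
```

The two hidden units act as intermediate detectors (roughly "at least one input is on" and "both inputs are on"), and the output layer combines them, which is the layered representation-building the section describes.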

THE IMPORTANCE OF
THE HUMAN ELEMENT

Regardless of how intelligent technology can be, at the end of the day it will never be perfect. Humans can accelerate the process of understanding by teaching the technology in real time. For example, if machine learning comes across a piece of data it cannot understand, a human can intervene and tell the technology what it is, making it more accurate the next time it comes across that same piece of data. Technology does not have the same level of understanding as a human, and adding humans to the machine learning process can assist with decision making and allow the technology to become more self-aware. This human touch can personify machine learning, make it easier to relate to, and, in turn, make it less intimidating.

Aside from making it more personalized, when humans and robots work together, the results are truly exceptional and accurate. Humans can become involved in the process in a few ways. First, they can assist with labeling the data that will be fed into the machine learning model; second, they can help machine learning predict and correct inaccuracies, which leads to more accurate end results.

Interactions understands the crucial role the human element plays in artificial intelligence, which is why we've focused on integrating human intelligence into our technology. Our proprietary Adaptive Understanding™ technology combines speech recognition, natural language processing, and Human Assisted Understanding to provide our customers, and their customers, conversational and engaging self-service. This enables continuous improvement and learning in live applications.

WHO’S USING IT?

As previously mentioned, we encounter machine learning on a daily basis, whether we realize it or not. Aside from our day-to-day lives, industries from retail to government and more are depending on machine learning to get things done. Below is a short list of how different industries are utilizing machine learning. It is not a complete list, as dozens of industries are using machine learning in a vast number of ways.

FINANCE

With their quantitative nature, banking and finance are ideal applications for machine learning. The technology is being used in dozens of ways industry-wide; here are a few of the most common:

Fraud - Machine learning algorithms can analyze an enormous number of transactions at a time and learn a person's typical spending patterns. If an unusual transaction is made, the system can reject it and flag potential fraud.

Trading floors - With its ability to efficiently assess data and patterns, machine learning can assist with quick decision-making in real time.

Credit and risk management - Assessing credit risk is typically labor intensive and prone to human error. With machine learning, certain algorithms can help provide mitigation recommendations.

UTILITIES

Utility companies can utilize machine learning in a number of ways, including uncovering hidden energy patterns, learning customers' energy behaviors, and more.
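The fraud idea above, learning a customer's typical spending pattern and flagging departures from it, can be sketched as a simple statistical anomaly check. Real fraud systems are far more sophisticated; the spending history and threshold here are invented for illustration:

```python
import statistics

def is_unusual(amount, history, threshold=3.0):
    """Flag a transaction whose amount is more than `threshold` standard
    deviations away from the customer's historical mean spend."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)
    return abs(amount - mean) > threshold * stdev

# Invented spending history, in dollars, for one hypothetical customer.
history = [12.50, 40.00, 23.00, 18.75, 35.00, 27.50, 15.00, 30.00]

print(is_unusual(25.00, history))   # a typical purchase: prints False
print(is_unusual(950.00, history))  # far outside the pattern: prints True
```

The "model" here is just a mean and a standard deviation learned from past transactions, but the principle is the same one the section describes: learn what normal looks like, then flag what doesn't fit.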

HEALTHCARE

Diagnoses - Machine learning can analyze data and identify trends or red flags within patients, potentially leading to earlier diagnoses and better treatments.

Patient information - Data can be collected from a patient's device to assess their health in real time.

Drug discovery - Given machine learning's ability to detect patterns within data, scientists are able to better predict drug side effects and the results of drug experiments without actually performing them.

MARKETING AND SALES

Personalization - Machine learning allows online brands to suggest and advertise things you may like based on your browsing and search history. Brands use their collected data to give customers a unique and personalized experience.

OIL AND GAS

Energy sources - By analyzing different minerals in the ground, machine learning offers the potential to discover new energy sources.

Streamlining oil distribution - Algorithms work to make oil distribution more efficient and cost-effective.

Reservoir modeling - Certain machine learning techniques can focus on the optimization of hydraulic fracturing, reservoir simulation, and more.

TRANSPORTATION

Efficient transportation - Analysis of data can identify patterns and trends that make routes more efficient for public transportation, delivery companies, and more.

CHALLENGES AND HESITATIONS

While machine learning has proved to have a profound impact across all
industries, there are still uncertainties and challenges regarding the technology.

INTELLIGENT ASSISTANT, NOT OVERLORD

First and foremost is the fear that technology will overcome humans. As we
discussed, technology is not perfect, and often needs the assistance of humans
to ensure accuracy. However, there is still a lot of fear and uncertainty regarding
the power of technology and its ability to become smarter than we are. At its
core, AI is a set of mathematical equations and algorithms that require human
training. This means that AI, and machine learning, are only as smart as we teach
them to be. When applied properly, AI is a perfect assistant to help humans
become more productive. Technology is not here to overcome or overpower us, but rather to assist us and improve our quality of life.

ISSUES WITH UNLABELED DATA

A more technical issue with both machine learning and artificial intelligence is the technology's ability to handle unlabeled data. Because machine learning relies on data to learn, it naturally requires a large amount of labeled data to work most efficiently. However, there are many cases where data isn't readily available or is unlabeled, which makes creating algorithms more challenging. With ongoing research and new advancements, we're training these systems to become smarter and reach human-level accuracy, so that one day unlabeled data will be just as useful as labeled data.

THE FUTURE
OF MACHINE
LEARNING

While the technologies behind machine learning and AI seem futuristic in themselves, this is only the beginning. Many machine learning experts suspect that these systems will be as smart as, if not smarter than, us within the next 30-50 years.

But as for the near future, experts expect we will continue to collect more and more data that, in turn, will improve the accuracy of our machine learning systems. With more data, better algorithms, and improved accuracy, the possibilities of this technology in the future are endless.

CONTRIBUTORS

JAY WILPON
SVP, NATURAL LANGUAGE RESEARCH, INTERACTIONS
With more than 150 published papers and patents in speech and natural language
research to his name, Jay Wilpon is one of the world’s pioneers and a chief evangelist
for speech and natural language technologies and services.

During his career, Jay has been a leading innovator for a number of industry-defining
voice enabled services, including AT&T’s How May I Help You service – the first
nationwide deployment of a true human-like spoken language understanding service.
Jay and his team are addressing the key challenges in speech, natural language
processing and multimodal dialog systems.

Jay has previously been awarded the distinguished honor of IEEE Fellow for his
leadership in the development of automatic speech recognition algorithms. For
pioneering leadership in the creation and deployment of speech recognition-based
services in the telephone network, Jay has also been awarded the honor of AT&T Fellow.

DAVID THOMSON
VP, SPEECH RESEARCH, INTERACTIONS
As Vice President of Speech Research, David manages Interactions’ R&D teams to
further Interactions’ goal to redefine the speech technology industry. David is at the
forefront of Interactions’ objective to create the most accurate, fastest, and highest
quality speech solutions. Prior to joining Interactions, David spent five years with
AT&T Labs, where he was responsible for the development of technology from speech
research. He has held senior executive-level positions at SpinVox, SpeechPhone, and
Fonix. He also spent 18 years at Lucent Technologies (now Alcatel-Lucent), where he developed voice-activated systems that have handled over 20 billion calls. David has published 30 research papers and secured 11 patents in natural language research.

SRINIVAS BANGALORE
DIRECTOR OF RESEARCH AND TECHNOLOGY, INTERACTIONS
Dr. Srinivas Bangalore is currently the Director of Research and Technology at
Interactions. After receiving his PhD in Computer Science from The University of
Pennsylvania, he became a Principal Research Scientist at AT&T Labs—Research.
Dr. Bangalore has worked on many areas of Natural Language Processing including
Spoken Language Translation, Multimodal Understanding, Language Generation
and Question-Answering. He has co-edited three books on Supertagging, Natural
Language Generation, and Language Translation. He has authored over 100 research publications and holds over 100 patents in these areas. He has been awarded the Morris and Dorothy Rubinoff Award for an outstanding dissertation; the AT&T Outstanding Mentor Award, in recognition of his support of and dedication to the AT&T Labs Mentoring Program; and the AT&T Science & Technology Medal for technical leadership and innovative contributions in Spoken Language Technology and Services. He has served on the editorial boards of the Computational Linguistics journal and the Computer Speech and Language journal, and on program committees for a number of ACL and IEEE speech conferences.

PATRICK HAFFNER
LEAD INVENTIVE SCIENTIST, INTERACTIONS
Dr. Patrick Haffner has worked on machine learning algorithms since 1988. With
Yann LeCun, he was one of the pioneers in applying Neural Networks to speech and
image recognition, and led the deployment of the first NN used for an automation task
(check reading). With AT&T Labs Research, he was an expert in the learning algorithms
that enable data engineers to efficiently train machines using real world data, for
tasks ranging from language understanding to network monitoring. He was also an
expert advisor to the European Union for their funding programs on machine learning
and cognitive sciences. Dr. Haffner is a Lead Inventive Scientist at Interactions with
responsibility for managing the ever increasing variety of machine learning techniques
and software that an AI-driven company needs to use.

MICHAEL JOHNSTON
DIRECTOR OF RESEARCH AND INNOVATION, INTERACTIONS
Dr. Michael Johnston has over 25 years of experience in speech and language
technology. His research lies at the intersection of Natural Language Processing,
human-computer interaction, and spoken and multimodal dialog. More specifically,
his work focuses on the development of language and dialog processing techniques
that support spoken and multimodal interaction and the application of these to the
creation of novel systems and services. Dr. Johnston has over 50 technical papers and
32 patents in speech and language processing. Before joining Interactions, he held
positions at AT&T Labs Research, Oregon Graduate Institute and Brandeis University.
He is a member of the board of AVIOS and an editor and chair for the W3C EMMA multimodal standard.

ABOUT INTERACTIONS

Interactions provides Intelligent Virtual Assistants that seamlessly combine Artificial Intelligence and human understanding to enable businesses and consumers to engage in productive conversations. With flexible products and solutions designed to meet the growing demand for unified, multichannel customer care, Interactions is delivering significant cost savings and unprecedented customer experience for some of the largest brands in the world. Founded in 2004, Interactions is headquartered in Franklin, Massachusetts, with additional offices in Indiana, New Jersey and New York.

For more information about Interactions, contact us:

866.637.9049

