Understanding AI Technology
Acknowledgments
The author would like to thank the following individuals for their assistance reviewing earlier drafts
of this document:
Disclaimer
The views expressed in this document are those of the author alone and do not necessarily reflect
the position of the Department of Defense or the United States Government.
Website: https://www.ai.mil/
Twitter: @DoDJAIC
LinkedIn: https://www.linkedin.com/company/dod-joint-artificial-intelligence-center/
It is hard for me to describe the steep slope of the learning curve I faced when I
started the Project Maven journey over three years ago. While in many ways I still
consider myself an Artificial Intelligence neophyte today, what I knew about the
subject back then could barely fill the first few lines of a single page in my trusty
notebook. My journey of discovery since then has been challenging, to say the
least. I only wish Greg Allen's guide to "Understanding AI Technology" had been
available to me in late 2016 as we embarked on our first AI/ML pilot project for ISR
full-motion video analysis.
EXECUTIVE SUMMARY
Many officials throughout the Department of Defense are asked to make
decisions about AI before they have an appropriate understanding of the
technology’s basics. This guide will help.
The DoD AI Strategy defines AI as “the ability of machines to perform tasks that
normally require human intelligence.” This definition includes decades-old DoD
AI, such as aircraft autopilots, missile guidance, and signal processing systems.
Though many AI technologies are old, there have been legitimate technological
breakthroughs over the past ten years that have greatly increased the diversity of
applications where AI is practical, powerful, and useful. Most of the breakthroughs
and excitement about AI in the past decade have focused on Machine Learning
(ML), which is a subfield of AI. Machine Learning is closely related to statistics and
allows machines to learn from data.
The best way to understand Machine Learning AI is to contrast it with an older
approach to AI, Handcrafted Knowledge Systems. Handcrafted Knowledge
Systems are AI that use traditional, rules-based software to codify subject matter
knowledge of human experts into a long series of programmed “if given x input,
then provide y output” rules. For example, the AI chess system Deep Blue, which
defeated the world chess champion in 1997, was developed in collaboration
between computer programmers and human chess grandmasters. The
programmers wrote (literally typed by hand) a computer code algorithm that
considered many potential moves and countermoves and reflected rules for
strong chess play given by human experts.
Machine Learning systems are different in that their “knowledge” is not
programmed by humans. Rather, their knowledge is learned from data: a
Machine Learning algorithm runs on a training dataset and produces an AI
model. To a large extent, Machine Learning systems program themselves. Even
so, humans are still critical in guiding this learning process. Humans choose
algorithms, format data, set learning parameters, and troubleshoot problems.
Machine Learning has been around a long time, but it was previously almost
always expensive, complicated, and low-performing, so there were
comparatively few applications and organizations for which it was a good fit.
Thanks to the ever-increasing availability of massive datasets, massive computing
power (both from using GPU chips as accelerators and from the cloud), open
source code libraries, and software development frameworks, the performance
and practicality of using Machine Learning AI systems has increased dramatically.
There are four different families of Machine Learning algorithms, which differ
based on aspects of the data they train on. It is important to understand the
different families because knowing which family an AI system will use has
implications for effectively enabling and managing the system’s development.
1) Supervised Learning uses example data that has been labeled by human
“supervisors.” Supervised Learning can deliver incredible performance, but getting
sufficient labeled data can be difficult, time-consuming, and expensive.
2) Unsupervised Learning uses data but doesn’t require labels for the data. It has
lower performance than Supervised Learning for many applications, but it
can also be used to tackle problems where Supervised Learning isn’t viable.
3) Semi-Supervised Learning uses both labeled and unlabeled data and has a
mix of the pros and cons of Supervised and Unsupervised learning.
4) Reinforcement Learning has autonomous AI agents that gather their own
data and improve based on their trial and error interaction with the
environment. It shows a lot of promise in basic research, but so far
Reinforcement Learning has been harder to use in the real world. Regardless,
technology firms have many noteworthy, real-world success stories.
Deep Learning (Deep Neural Networks) is a powerful Machine Learning
technique that can be applied to any of the four above families. It provides the
best performance for many applications. However, the technical details are less
important for those not on the engineering staff or directly overseeing the
procurement of these systems. What matters most for program management is
whether or not the system uses Machine Learning, and whether or not the
selected algorithm requires labeled data.
Systems using Machine Learning software can provide very high levels of
performance. However, Machine Learning software has failure modes – both from
accidents and from adversaries – that are distinct from those of traditional
software. Program managers, system developers, test and evaluation personnel,
and system operators all need to be familiar with these failure modes to ensure
safe, secure, and reliable performance of AI systems.
There are multiple steps to developing an operational Machine Learning AI
system. Usually, the biggest challenges relate to getting sufficient high-quality
training data. System performance is directly tied to data quantity, quality, and
representativeness.
Organizations should not pursue using AI for its own sake. Rather, they should have
specific metrics for organizational performance and productivity that they are
seeking to improve. Merely developing a high-performing AI model will not by
itself improve organizational productivity. The model has to be integrated into
operational technology systems, organizational processes, and staff workflows.
Almost always, there will be some changes needed to existing processes to take
full advantage of the AI model’s capability. Adding AI technology without revising
processes will deliver only a tiny fraction of the potential improvements, if any.
Finally, traditional project management wisdom still applies. Many AI projects fail
not because of the technology, but because of a failure to properly set
expectations, integrate with legacy systems, and train operational personnel.
Purpose: By now, nearly all DoD officials understand that the rise of AI is an
important technology trend with significant implications for national security, but
many struggle to give simple and accurate answers to basic questions like:
• What is AI?
• How does AI work?
• Why is now an important time for AI?
• What are the different types of Machine Learning? How do they differ?
• What are Neural Networks and Deep Learning?
• What are the steps of building and operating AI systems?
• What are the limitations and risks of using AI systems?
WHAT IS AI?
The DoD AI Strategy states that “AI refers to the ability of machines to perform
tasks that normally require human intelligence.” This definition sounds so obvious
that many are puzzled by its simplicity. In fact, however, this definition is very similar
to the ones used by many leading AI textbooks and leading researchers. The first
thing to note about this definition is that AI is an extremely broad field, one that
covers not only the breakthroughs of the past few years, but also the
achievements of the first electronic computers dating back to the 1940s.
When a person or company claims that their system “uses AI,” they most likely
mean that it is using Machine Learning, which is a far cry from an autonomous
intelligence equal to or greater than human intellect in all
categories. Still, recent progress in Machine Learning is a big deal, with
implications for nearly every industry, including defense and intelligence. The
easiest way to understand Machine Learning systems, however, is by contrasting
them with Handcrafted Knowledge systems, so this paper will begin there.
Handcrafted Knowledge AI
Handcrafted Knowledge Systems are the older of the two AI approaches, nearly
as old as electronic computers. At their core, they are merely software developed
in cooperation between computer programmers and human domain subject
matter experts. Handcrafted Knowledge Systems attempt to translate human
knowledge into programmed sets of rules that computers can use to process
information. In other words, the “intelligence” of the Handcrafted Knowledge
System is merely a very long list of rules in the form of “if given x input, then provide
y output.” When hundreds or thousands or millions of these domain-specific rules
are combined successfully – into “the program” – the result is a machine that can
seem quite smart and can also be very useful.
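To make this concrete, the sketch below shows a toy handcrafted-knowledge rule set in Python. The maintenance scenario, thresholds, and function name are invented purely for illustration; they do not come from any real expert system.

```python
# A toy handcrafted-knowledge "expert system" for triaging aircraft
# maintenance alerts. Every rule below was authored by a human expert;
# the program contains no learning, only fixed "if x, then y" logic.
def triage_alert(engine_temp_c: float, vibration_g: float, oil_psi: float) -> str:
    if engine_temp_c > 900 or oil_psi < 20:
        return "ground immediately"              # expert rule 1
    if engine_temp_c > 800 and vibration_g > 1.5:
        return "inspect within 10 flight hours"  # expert rule 2
    if vibration_g > 2.0:
        return "inspect within 50 flight hours"  # expert rule 3
    return "normal operation"                    # default rule

print(triage_alert(engine_temp_c=950, vibration_g=0.4, oil_psi=60))
# -> ground immediately
```

A real Expert System would contain thousands or millions of such rules, but the structure is the same: the "intelligence" lives entirely in rules that humans typed by hand.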
Both tax preparation AI systems and Deep Blue are a specific type of Handcrafted
Knowledge AI known as an Expert System. Another type of Handcrafted
Knowledge AI is a Feedback Control System, which uses human-authored rules to
compute system output based on sensor measurement inputs. Feedback Control
Systems have been in widespread use by the Department of Defense for
decades. Aircraft autopilots, missile guidance, and electromagnetic signal
processing systems are all examples of Feedback Control Systems.
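The Feedback Control idea can be sketched in a few lines. The proportional controller below is a deliberately simplified illustration; the setpoint, gain, and aircraft response model are hypothetical, not drawn from any real autopilot.

```python
# Toy proportional feedback controller holding an altitude setpoint.
# The control "rule" (command = gain * error) is authored and tuned by
# humans -- nothing here is learned from data.
def simulate_altitude_hold(target_ft: float, start_ft: float, steps: int) -> float:
    gain = 0.3                            # human-tuned gain (illustrative)
    altitude = start_ft
    for _ in range(steps):
        error = target_ft - altitude      # sensor measurement vs. target
        climb_cmd = gain * error          # handcrafted control rule
        altitude += climb_cmd             # simplified aircraft response
    return altitude

final = simulate_altitude_hold(target_ft=10000, start_ft=9000, steps=30)
print(round(final))   # converges toward the 10,000 ft setpoint
```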
Machine Learning AI
The key difference between a Handcrafted Knowledge System and a Machine
Learning system is where each obtains its knowledge. Rather than having their
knowledge be provided by humans in the form of hand-programmed rules,
Machine Learning systems generate their own rules. For Machine Learning
systems, humans provide the system with training data. By running a human-
generated algorithm on the training dataset, the Machine Learning system
generates the rules such that it can receive input x and provide correct output y.
In other words, the system learns from examples (training data), rather than being
explicitly programmed. This is why data is so vital in the context of AI. Data is the
main raw material out of which high-performing Machine Learning AI systems are
built. For this reason, the quality, quantity, representativeness and diversity of data
will directly impact the operational performance of the ML system. Algorithms and
computing hardware are also important, but nearly all ML systems run on
commodity computing hardware, and nearly all of the best algorithms are freely
available worldwide. Hence, having enough of the right data tends to be the key.
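A minimal illustration of "learning from data": rather than a human typing the rule y = 2x + 1, a simple least-squares algorithm recovers it from example pairs. The dataset and function below are invented for this sketch.

```python
# A minimal sketch of "learning from data". Instead of a human hand-coding
# the rule y = 2x + 1, a least-squares algorithm recovers it from
# example (x, y) pairs.
def fit_line(examples):
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    # Ordinary least squares: slope and intercept computed from the data.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    var = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept        # this pair IS the trained "model"

training_data = [(0, 1), (1, 3), (2, 5), (3, 7)]   # hidden rule: y = 2x + 1
a, b = fit_line(training_data)
print(a, b)   # -> 2.0 1.0
```

The same division of labor holds at any scale: humans supply the algorithm and the data; the rule itself is generated by the learning process.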
At this stage, many readers may ask themselves, “so what? Why is Machine
Learning important?”
The reason is that there are many applications where task automation would be
useful, but where human programming of all of the software rules to implement
automation is either impractical or genuinely impossible. Sometimes human
experts are unable to fully translate their intuitive decision-making into fixed rules.
Further, for a surprisingly large subset of these applications, the performance of
Machine Learning systems is very high, much higher than was ever achieved with
Handcrafted Knowledge Systems or indeed by human experts. This does not
mean that Handcrafted Knowledge systems are obsolete. For many applications
they remain the cheapest and/or highest performing approach.
Of course, more data only helps if the data is relevant to your desired
application. If you’re trying to develop a better aircraft autopilot, then a
bunch of consumer loan application data isn’t going to help, no matter how
much you have. In general, training data needs to match the real-world
operational data very, very closely to train a high-performing AI model.
Despite their huge potential, AI solutions are not a great fit for all types of
problems. If you have an application where you think using AI could be beneficial,
knowing whether or not any particular system that is claiming to use “AI” is using
Machine Learning is important for several reasons. For one thing, Machine
Learning works differently from traditional software, and it has different strengths
and weaknesses too. Moreover, Machine Learning tends to break and fail in
different ways. A basic understanding of these strengths, weaknesses, and failure
modes can help you understand whether or not your particular problems are a
good fit for a Machine Learning AI solution.
WHAT ARE THE DIFFERENT TYPES OF MACHINE LEARNING? HOW DO THEY DIFFER?
Like Artificial Intelligence, Machine Learning is also an umbrella term, and there
are four different broad families of Machine Learning algorithms. There are also
many different subcategories and combinations under these four major families,
but a good understanding of these four broad families will be sufficient for the
vast majority of DoD employees, including senior leaders in non-technical roles.
1) Supervised Learning algorithms train on a dataset in which each example input
is paired with its correct associated output. For example, if the goal of the AI system is to
correctly classify the objects in different images as either “cat” or “dog,” the
labeled training data would have image examples paired with the correct
classification label. Supervised Learning systems can also be used for identifying
the correct labels of continuous numerical outputs. For example, “given this wing
shape input, predict the output air drag coefficient.”
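As an illustrative sketch of Supervised Learning, the toy classifier below learns from labeled examples. The animal features and the simple nearest-neighbor method are invented for this example, not drawn from any fielded system.

```python
# Toy supervised learning: a 1-nearest-neighbor classifier. Each training
# example pairs an input (weight in kg, ear length in cm -- illustrative
# features) with a label supplied by a human "supervisor".
labeled_data = [
    ((4.0, 7.5), "cat"), ((5.0, 6.8), "cat"),
    ((20.0, 10.0), "dog"), ((30.0, 12.0), "dog"),
]

def classify(features):
    # Predict the label of the closest labeled training example.
    def dist2(example):
        (x1, y1), _ = example
        return (x1 - features[0]) ** 2 + (y1 - features[1]) ** 2
    _, label = min(labeled_data, key=dist2)
    return label

print(classify((4.5, 7.0)))    # -> cat
print(classify((25.0, 11.0)))  # -> dog
```

Note that four labeled examples are nowhere near enough in practice; as discussed above, thousands of labeled examples per category is a more realistic requirement.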
Many Supervised Learning systems can achieve extremely high performance, but
they require very large labeled datasets to do so. Using image classification as an
example, a common rule of thumb is that the algorithm needs at least 5,000
labeled examples of each category in order to produce an AI model with decent
performance. Acquiring all of this labeled data can be easy or very difficult,
depending upon the application. In the case of facial recognition algorithms,
most companies use paid humans to manually label images. In the case of online
shopping recommendation engines, the customers are actually providing the
data labels through the normal course of their shopping. The data inputs are the
recommended items displayed to the customers and the customer’s profile
information, while the outputs are the actual purchases made or not made. This
is one of the major reasons why internet companies were at the forefront of the
adoption of Machine Learning AI: their users were constantly producing valuable
datasets – both labeled and unlabeled – and the online environment allowed for
rapid experimentation with Machine Learning-enabled analysis and automation.
Note that pre-labeled data is only required for the training data that the algorithm
uses to train the AI model. The AI model in operational use with new data will be
generating its own labels, the accuracy of which will depend on the AI’s training.
If the training data set was sufficiently large, high quality, and representative of
the diversity present in the operational environment, then the performance of the
AI model in generating these labels can be at or above human performance.
Reinforcement Learning differs from the other three families in several key ways:
1) Data is gathered by the AI agent itself in the course of its interaction with
the environment and perceiving state changes. For example, an AI agent
playing a digital game of chess makes moves and perceives changes in
the board based on its moves.
2) The rewards are input data received by the agent when certain criteria are
satisfied. For example, a Reinforcement Learning AI agent in chess will
make many moves before each win or loss. These criteria are typically
unknown to the agent at the outset of training.
3) Rewards often contain only partial information. A reward like a win in chess
conveys that some inputs must have been good, but it doesn’t clearly
signal which inputs were good and which were not.
4) The system learns a policy for selecting actions so as to maximize its
receipt of cumulative rewards.
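The characteristics above can be sketched with a toy tabular Q-learning agent in a five-cell "corridor" environment. The environment, reward scheme, and hyperparameters are all invented for illustration and are far simpler than any real Reinforcement Learning application.

```python
import random

# Toy Reinforcement Learning: tabular Q-learning in a five-cell corridor.
# The agent starts in cell 0 and receives a reward of +1 only upon
# reaching cell 4; it discovers this through trial and error, not from
# any labeled data.
random.seed(0)
n_states = 5
actions = (1, -1)                          # move right or left
q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.2      # illustrative hyperparameters

for _ in range(500):                       # training episodes
    state = 0
    while state != n_states - 1:
        # Explore occasionally; otherwise exploit the best known action.
        if random.random() < epsilon:
            action = random.choice(actions)
        else:
            action = max(actions, key=lambda a: q[(state, a)])
        next_state = min(max(state + action, 0), n_states - 1)
        reward = 1.0 if next_state == n_states - 1 else 0.0
        best_next = max(q[(next_state, a)] for a in actions)
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state

# The learned policy: in every non-terminal cell, the best action is +1.
policy = [max(actions, key=lambda a: q[(s, a)]) for s in range(n_states - 1)]
print(policy)   # -> [1, 1, 1, 1]
```

The agent is never told that "move right" is good; the policy emerges entirely from the rewards it stumbles into during training.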
Reinforcement Learning works very well for games and simulations because the
system automatically generates its own training data, which only costs the price
of running computational hardware for the algorithm and simulation. For
example, AlphaGo, a Reinforcement Learning system focused on the board
game Go, played more than 4.9 million games against itself over three days (an
average of roughly nineteen completed games per second) in order to learn how
to play the game at a world-champion level. In real life, a single Go game takes
roughly an hour.
Neural Networks are a specific category of algorithms that are very loosely
inspired by biological neurons in the brain. Deep Neural Networks (a.k.a. Deep
Learning) merely refers to those Neural Networks that have many layers of
connected neurons in sequence (“deep” referring to the number of layers).
Though Neural Networks are most strongly associated with Supervised Learning,
Deep Learning can, with the right architecture, also be applied to Unsupervised,
Semi-Supervised, and Reinforcement Learning.
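For a concrete (if drastically simplified) picture of "layers of connected neurons in sequence," the sketch below passes an input through three layers. The weights are arbitrary placeholders invented for this example; in a real network they would be learned from training data.

```python
# Toy forward pass through a small "deep" network: three layers of
# neurons in sequence. Each neuron computes a weighted sum of all its
# inputs, adds a bias, and applies a simple nonlinearity (ReLU).
def relu(x):
    return max(0.0, x)

def layer(inputs, weights, biases):
    return [relu(sum(w * i for w, i in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def forward(x):
    h1 = layer(x,  [[0.5, -0.2], [0.1, 0.8]], [0.0, 0.1])   # hidden layer 1
    h2 = layer(h1, [[1.0, -0.5], [0.3, 0.7]], [0.1, 0.0])   # hidden layer 2
    out = layer(h2, [[0.6, 0.4]], [0.0])                    # output layer
    return out[0]

print(forward([1.0, 2.0]))
```

"Deep" simply means many such layers; production networks stack dozens or hundreds of them, with millions of learned weights in place of the handful shown here.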
Neural Networks have been around since the late 1950s, but training Deep Neural
Networks only became practical around the 2006 timeframe. Since then, the
previously mentioned trends – more data, more computing power, improved
algorithms, and improved open source code libraries – have had an especially
large impact on improving Deep Learning performance. Since 2012,
competitions around important AI performance benchmarks have routinely
been won by systems that make use of Neural Networks and Deep
Learning.
While it is very important for engineers and developers of DoD AI systems and DoD
technical leaders to have a good understanding of Neural Networks and how
they work, a granular understanding of Neural Networks is overkill and a
distraction for most DoD senior leaders. Everything stated earlier in this paper
about the general categories of Machine Learning and different types of
Machine Learning applies to Neural Networks as well. And knowing whether
your Machine Learning system uses Neural Networks or another algorithm,
such as decision trees, won't have many important implications for how you run your
program. In general, AI program managers should care about the performance
of the AI system and the types of data required in order to guarantee that
performance. For many applications, achieving the required performance will
require using Neural Networks. For others, it won’t. Focusing on using the most
advanced algorithm is important for many parts of the basic research community.
For those in positions involving policy, applied R&D, and operations, factors like
feasibility, performance, and reliability are more important.
Figure 6. Deep Learning’s Place in AI – Using the DARPA “AI Waves” Framework
There is one important exception, however. Neural Networks differ from other
types of Machine Learning algorithms in that they tend to have low explainability.
The system can generate a prediction or other output, and testing can provide
evidence suggesting that these predictions have high accuracy, but it is very
difficult to understand or explain the specific causal mechanisms by which the
Neural Network arrived at its prediction, even for top AI experts. This “explainability
problem” is often described as a problem for all of AI, but it is primarily a problem
for Neural Networks and Deep Learning. Many other types of Machine Learning
algorithms – for example decision trees – have very high explainability.
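The explainability contrast can be illustrated with a toy decision tree: the learned model can be printed back as human-readable rules, something a Neural Network's millions of weights do not permit. The tree, features, and thresholds below are hypothetical, standing in for a tree that would in practice be learned from data.

```python
# Why decision trees score high on explainability: the trained model can
# be rendered directly as a chain of human-readable rules. Each internal
# node is (feature, threshold, left_subtree, right_subtree); leaves are
# class labels.
tree = ("wingspan_m", 30.0,
        ("speed_kts", 400.0, "transport", "fighter"),
        "airliner")

def predict(tree, sample, trace):
    if isinstance(tree, str):          # leaf: return the class label
        return tree
    feature, threshold, left, right = tree
    branch = sample[feature] <= threshold
    trace.append(f"{sample[feature]} {'<=' if branch else '>'} {threshold} ({feature})")
    return predict(left if branch else right, sample, trace)

trace = []
label = predict(tree, {"wingspan_m": 12.0, "speed_kts": 900.0}, trace)
print(label)                 # -> fighter
print(" AND ".join(trace))   # the full, human-readable reason for the answer
```

A Neural Network making the same prediction would offer no comparable trace: its "reasons" are distributed across the numeric weights of the whole network.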
WHAT ARE THE STEPS OF BUILDING AND OPERATING MACHINE LEARNING SYSTEMS?
By now you should have a good understanding of what the different families of
Machine Learning algorithms are and how they work. Figure 7 shows the major
steps of actually generating Machine Learning models for operational use.
The AI model will also likely have to be integrated with existing systems and pass
through a suitable testing and evaluation process. Having a high performing AI
model by itself, however, is not enough to deliver a positive impact on
organizational productivity. The most significant organizational productivity
enhancements require not just enhanced technical performance, but also
operational processes and staff workflow changes that effectively take
advantage of the enhanced performance.
More broadly, there are not yet widely agreed upon safety and reliability
standards for the development, testing, and operation of Machine Learning
systems. Some methods that have proved critical in ensuring safety and reliability
of traditional software – such as formal verification – are not currently available
for use on Machine Learning systems. Moreover, some Machine Learning failure
modes are not fully understood even at the basic research level. Despite the
current challenges, Machine Learning AI systems are already (for some cases)
safer and better performing than what they replace. With additional future R&D
and improved program management standards, Machine Learning will also be
reliably used in a much more diverse set of applications, including safety-critical
ones. When this positive scenario is realized, it will not be because AI systems are
inherently safe and secure – no technology is – but because the responsible
stakeholders took the necessary steps to make AI safe and secure. For the DoD,
there are very promising developments on this issue. In February 2020, the DoD
officially adopted five Ethical Principles for Artificial Intelligence. The DoD also
established an Executive Steering Group with representation from each of the
Armed Services and major DoD components to make recommendations for
improvements in all aspects of DoD AI Policy and operational usage.
CONCLUSION
Talented human capital and access to AI experts are critical factors for success
in DoD’s AI strategy. However, the basics of AI technology can be understood by
anyone who devotes the time to learn. The concepts in this document provide a
technical overview that will be adequate for the vast majority of senior leaders to
understand what would be required to adopt and utilize AI for their organization.
Those who want to go further and learn more are encouraged to do so. This
document can serve as a useful jumping off point to more advanced and
domain-specific subjects. Some recommendations for further reading are
provided on the next page. Before moving on, however, readers would be wise
to double down and ensure they have a rock-solid understanding of the basics.