
CHAPTER 1

Introduction
Chapter 1 establishes the foundation for the book.
It describes what the book will achieve, who the book is intended for, why machine
learning (ML) is important, why Java makes sense, and how you can deploy Java ML
solutions.
The chapter includes the following:

• A review of all of the terminology of AI and its subfields, including machine learning

• Why ML is important and why Java is a good choice for implementation

• Setup instructions for the most popular development environments

• An introduction to ML-Gates, a development methodology for ML

• The business case for ML and monetization strategies

• Why this book does not cover deep learning, and why that is a good thing

• When and why you may need deep learning

• How to think creatively when exploring ML solutions

• An overview of key ML findings

1.1 Terminology
As artificial intelligence and machine learning have seen a surge in popularity, there has
arisen a lot of confusion with the associated terminology. It seems that everyone uses the
terms differently and inconsistently.

© Mark Wickham 2018
M. Wickham, Practical Java Machine Learning, https://doi.org/10.1007/978-1-4842-3951-3_1

Some quick definitions for some of the abbreviations used in the book:

• Artificial intelligence (AI): Anything that pretends to be smart.

• Machine learning (ML): A generic term that includes the subfields of deep learning (DL) and classic machine learning (CML).

• Deep learning (DL): A class of machine learning algorithms that utilize neural networks.

• Reinforcement learning (RL): A supervised learning style that receives feedback, but not necessarily for each input.

• Neural networks (NN): A computer system modeled on the human brain and nervous system.

• Classic machine learning (CML): A term that more narrowly defines the set of ML algorithms that excludes the deep learning algorithms.

• Data mining (DM): Finding hidden patterns in data, a task typically performed by people.

• Machine learning gate (MLG): The book presents a development methodology called ML-Gates. The gate numbers start at ML-Gate 5 and conclude at ML-Gate 0. MLG3, for example, is the abbreviation for ML-Gate 3 of the methodology.

• Random forest (RF) algorithm: A learning method for classification, regression, and other tasks that operates by constructing decision trees at training time.

• Naive Bayes (NB) algorithm: A family of “probabilistic classifiers” based on applying Bayes’ theorem with strong (naive) independence assumptions between the features.

• K-nearest neighbors (KNN) algorithm: A non-parametric method used for classification and regression, where the input consists of the k closest training examples in the feature space.

• Support vector machine (SVM) algorithm: A supervised learning model with an associated learning algorithm that analyzes data for classification and regression.
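As a concrete illustration of the KNN definition above, here is a toy k-nearest neighbors classifier in plain Java. The data points and labels are invented for illustration; the book's real implementations use library tools such as Weka, introduced later.

```java
import java.util.Arrays;
import java.util.Comparator;

public class KnnSketch {
    // Classify a query point by majority vote among its k closest training points.
    // Labels are assumed to be 0 or 1 for this two-class sketch.
    static int classify(double[][] train, int[] labels, double[] query, int k) {
        Integer[] idx = new Integer[train.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        // Sort training indices by Euclidean distance to the query point.
        Arrays.sort(idx, Comparator.comparingDouble(i -> dist(train[i], query)));
        int votes = 0;
        for (int i = 0; i < k; i++) votes += labels[idx[i]];
        return votes * 2 > k ? 1 : 0; // majority vote among the k neighbors
    }

    // Euclidean distance between two points in the feature space.
    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    }

    public static void main(String[] args) {
        // Two invented clusters: class 0 near (1,1), class 1 near (8,8).
        double[][] train = {{1, 1}, {1, 2}, {2, 1}, {8, 8}, {8, 9}, {9, 8}};
        int[] labels     = { 0,      0,      0,      1,      1,      1    };
        System.out.println(classify(train, labels, new double[]{1.5, 1.5}, 3)); // class 0
        System.out.println(classify(train, labels, new double[]{8.5, 8.5}, 3)); // class 1
    }
}
```

Note how little machinery is involved: KNN stores the training data and defers all work to prediction time, which is why it is called a "lazy" learner.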


Much of the confusion stems from the various factions or “domains” that use these
terms. In many cases, they created the terms and have been using them for decades
within their domain.
Table 1-1 shows the domains that have historically claimed ownership of each of the
terms. The terms are not new. Artificial intelligence is a general term. AI first appeared
back in the 1970s.

Table 1-1. AI Definitions and Domains

• Statistics (domain: math departments): Quantifies the data. DM, ML, and DL all use statistics to make decisions.

• Artificial intelligence (AI) (domain: historical, marketing, trending): The study of how to create intelligent agents. Anything that pretends to be smart. We program a computer to behave as an intelligent agent. It does not have to involve learning or induction.

• Data mining (DM) (domain: business world, business intelligence): Explains and recognizes meaningful patterns. Unsupervised methods. Discovers the hidden patterns in your data that can be used by people to make decisions. A complete commercial process flow, often on large data sets (Big Data).

• Machine learning (ML) (domain: academic departments): A large branch within AI in which we build models to predict outcomes. Uses algorithms and has a well-defined objective. We generalize existing knowledge to new data. It’s about learning a model to classify objects.

• Deep learning (DL) (domain: trending): Applies neural networks for ML. Pattern recognition is an important task.

The definitions in Table 1-1 represent my consolidated understanding after reading a vast amount of research and speaking with industry experts. You can find huge philosophical debates online supporting or refuting these definitions.


Do not get hung up on the terminology. Usage of the terms often comes down to the
domain perspective of the entity involved. A mathematics major who is doing research on DL
algorithms will describe things differently than a developer who is trying to solve a problem
by writing application software. The following is a key distinction from the definitions:
Data mining is all about humans discovering the hidden patterns in data,
while machine learning automates the process and allows the computer
to perform the work through the use of algorithms.
It is helpful to think about each of these terms in context of “infrastructure” and
“algorithms.” Figure 1-1 shows a graphical representation of these relationships. Notice
that statistics are the underlying foundation, while “artificial intelligence” on the right-hand side includes everything within each of the additional subfields of DM, ML, and DL.
Machine learning is all about the practice of selecting and applying
algorithms to our data.
I will discuss algorithms in detail in Chapter 3. The algorithms are the secret sauce
that enables the machine to find the hidden patterns in our data.

Figure 1-1. Artificial intelligence subfield relationships


1.2 Historical
The term “artificial intelligence” is hardly new. It has actually been in use since the 1970s.
A quick scan of reference books will provide a variety of definitions that have in fact
changed over the decades. Figure 1-2 shows a representation of 1970s AI, a robot named
Shakey, alongside a representation of what it might look like today.

Figure 1-2. AI, past and present

Most historians agree that there have been a couple of “AI winters.” They represent
periods of time when AI fell out of favor for various reasons, something akin to a
technological “ice age.” They are characterized by a trend that begins with pessimism
in the research community, followed by pessimisms in the media, and finally followed
by severe cutbacks in funding. These periods, along with some historical context, are
summarized in Table 1-2.

Table 1-2. History of AI and “Winter” Periods


Period Context

1974 The UK parliament publishes research concluding that AI algorithms would grind to a halt on “real
world” problems. This setback triggers global funding cuts including at DARPA. The crisis
is blamed on “unrealistic predictions” and “increasing exaggeration” of the technology.
1977 AI WINTER 1
1984-1987 Enthusiasm for AI spirals out of control in the 1980s, leading to another collapse of the
billion-dollar AI industry.
1990 AI WINTER 2 as AI again reaches a low-water mark.
2002 AI researcher Rodney Brooks complains that “there is a stupid myth out there that AI
has failed.”
2005 Ray Kurzweil proclaims, “Many observers still think that the AI winter was the end of
the story ... yet today many thousands of applications are deeply embedded in the
infrastructure of every industry.”
2010 AI becomes widely used and well funded again. Machine learning gains prominence.

It is important to understand why these AI winters happened. If we are going to make an investment to learn and deploy ML solutions, we want to be certain another AI winter is not imminent.
Is another AI winter on the horizon? Some people believe so, and they raise three
possibilities:

• Blame it on statistics: AI is headed in the wrong direction because of its heavy reliance on statistical techniques. Recall from Figure 1-1 that statistics are the foundation of AI and ML.

• Machines run amok: Top researchers suggest another AI winter could happen because misuse of the technology will lead to its demise. In 2015, an open letter calling for a ban on the development and use of autonomous weapons was signed by Elon Musk, Stephen Hawking, Steve Wozniak, and 3,000 AI and robotics researchers.


• Fake data: Data is the fuel for machine learning (more about this in
Chapter 2). Proponents of this argument suggest that ever increasing
entropy will continue to degrade global data integrity to a point where
ML algorithms will become invalid and worthless. This is a relevant
argument in 2018. I will discuss the many types of data in Chapter 2.

It seems that another AI winter is not likely in the near future because ML is so
promising and because of the availability of high-quality data with which we can fuel it.
Much of our existing data today is not high quality, but we can mitigate this risk by
retaining control of the source data our models will rely upon.
Cutbacks in government funding caused the previous AI winters. Today, private
sector funding is enormous. Just look at some of the VC funding being raised by AI
startups. Similar future cutbacks in government support would no longer have a
significant impact. For ML, it seems the horse is out of the barn for good this time around.

1.3 Machine Learning Business Case


Whether you are a freelance developer or you work for a large organization with vast
resources available, you must consider the business case before you start to apply
valuable resources to ML deployments.

Machine Learning Hype


ML is certainly not immune from hype. The book preface listed some of the recent hype
in the media. The goal of this book is to help you overcome the hype and implement real
solutions for problems.
ML and DL are not the only recent technology developments that suffer from
excessive hype. Each of the following technologies has seen some recent degree of hype:

• Virtual reality (VR)

• Augmented reality (AR)

• Bitcoin


• Blockchain

• Connected home

• Virtual assistants

• Internet of Things (IoT)

• 3D movies

• 4K television
• Machine learning (ML)

• Deep learning (DL)

Some technologies become widespread and commonly used, while others simply
fade away. Recall that just a few short years ago 3D movies were expected to totally
overtake traditional films for cinematic release. It did not happen.
It is important for us to continue to monitor the ML and DL technologies closely.
It remains to be seen how things will play out, but ultimately, we can convince ourselves
about the viability of these technologies by experimenting with them, building, and
deploying our own applications.

Challenges and Concerns


Table 1-3 lists some of the top challenges and concerns highlighted by IT executives
when asked what worries them the most when considering ML and DL initiatives. As
with any IT initiative, there is an opportunity cost associated with implementing it, and
the benefit derived from the initiative must outweigh the opportunity cost, that is, the
cost of forgoing another potential opportunity by proceeding with AI/ML.
Fortunately, there are mitigation strategies available for each of the concerns. These
strategies, summarized below, are even available to small organizations and individual
freelance developers.


Table 1-3. Machine Learning Concerns and Mitigation Strategies

• Cost of IT infrastructure: Leverage cloud service providers such as Google GCP, Amazon AWS, or Microsoft Azure.

• Not enough experienced staff: Even if we cannot hire data scientists, ML requires developers to start thinking like data scientists. This does not mean we suddenly require mathematics PhDs. Organizations can start by adopting a data-first methodology such as ML-Gates, presented later in this chapter.

• Cost of data or analytics platform: There are many very expensive data science platforms; however, we can start with classic ML using free open source software and achieve impressive results.

• Insufficient data quality: There exists a great deal of low-quality data. We can mitigate by relying less on “social” data and instead focusing on data we can create ourselves. We can also utilize data derived from sensors that should be free of such bias.

• Insufficient data quantity: Self-generated data or sensor data can be produced at higher scale by controlling sampling intervals. Integrating data into the project at the early stages should be part of the ML methodology.

Using the above mitigation strategies, developers can produce some potentially
groundbreaking ML software solutions with a minimal learning curve investment. It is a
great time to be a software developer.
Next, I will take a closer look at ML data science platforms. Such platforms can help
us with the goal of monetizing our machine learning investments. The monetization
strategies can further alleviate some of these challenges and concerns.

Data Science Platforms


If you ask business leaders about their top ML objectives, you will hear variations of the
following:

• Improve organizational efficiency

• Make predictive insights into future scenarios or outcomes

• Gain a competitive advantage by using AI/ML

• Monetize AI/ML


Regardless of whether you are an individual or freelance developer, monetization is one of the most important objectives.
Regardless of organizational size, monetizing ML solutions requires two
building blocks: deploying a data science platform, and following a ML
development methodology.
When it comes to the data science platforms, there are myriad options. It is helpful
to think about them by considering a “build vs. buy” decision process. Table 1-4 shows
some of the typical questions you should ask when making the decision. The decisions
shown are merely guidelines.

Table 1-4. Data Science Platform: Build vs. Buy Decision

• Is there a package that exactly solves your problem? Yes: buy.

• Is there a package that solves many of your requirements? This is the common case, and there is no easy answer. Undetermined.

• Is there an open source package you can consider? Yes: build.

• Is the package too difficult to implement? Yes: buy.

• Does your well-defined problem require deep learning? No: maybe build.

• Is analytics a critical differentiator for your business? Yes: maybe build.

• Is your analytics scenario unique? Yes: build.

• Is a new kind of data available? Yes: build.

• Does your domain require you to be agile? Yes: build.

• Do you have access to the data science talent your problem requires? (Do not sell yourself or your staff short; many developers pick up data science skills quickly.) No: buy.

So what does it actually mean to “buy” a data science platform? Let’s consider an
example.


You wish to create a recommendation engine for visitors to your website. You
would like to use machine learning to build and train a model using historical product
description data and customer purchase activity on your website. You would then like
to use the model to make real-time recommendations for your site visitors. This is a
common ML use case. You can find offerings from all of the major vendors to help you
implement this solution. Even though you will be “building” your own model using
the chosen vendor’s product, you are actually “buying” the solution from the provider.
Table 1-5 shows how the pricing might break down for this project for several of the
cloud ML providers.

Table 1-5. Example ML Cloud Provider Pricing (sources: https://cloud.google.com/ml-engine/docs/pricing, https://aws.amazon.com/aml/pricing/, https://azure.microsoft.com/en-us/pricing/details/machine-learning-studio/)

• Google Cloud ML Engine: model building fees $0.27 per hour (standard machine); batch predictions $0.09 per node hour; real-time predictions $0.30 per node hour.

• Amazon Machine Learning (AML): model building fees $0.42 per hour; batch predictions $0.10 per 1,000 predictions; real-time predictions $0.0001 per prediction.

• Microsoft Azure ML Studio: model building fees $10 per month plus $1 per hour (standard); batch and real-time predictions $100 per month, including 100,000 transactions (API).

In this example, you accrue costs because of the compute time required to build your
model. With very large data sets and construction of deep learning models, these costs
become significant.
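Using the Amazon rates from Table 1-5, a quick back-of-the-envelope calculation shows how these charges accumulate. The workload numbers here (20 build hours, one million batch predictions) are invented for illustration:

```java
public class MlCloudCostSketch {
    // Estimated cost for a hypothetical workload on Amazon Machine Learning,
    // using the Table 1-5 rates: $0.42 per model building hour and
    // $0.10 per 1,000 batch predictions.
    static double estimate(double buildHours, long batchPredictions) {
        double buildCost = buildHours * 0.42;
        double batchCost = (batchPredictions / 1000.0) * 0.10;
        return buildCost + batchCost;
    }

    public static void main(String[] args) {
        // 20 build hours ($8.40) plus 1,000,000 batch predictions ($100.00)
        System.out.printf("Estimated cost: $%.2f%n", estimate(20, 1_000_000));
    }
}
```

Even this modest example makes the point: prediction volume, not model building, often dominates the bill.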
Another common example of “buying” an ML solution is accessing a prebuilt model
using a published API. You can use this method for image detection or natural language
processing where huge models exist which you can leverage simply by calling the API
with your input details, typically using JSON. You will see how to implement this trivial
case later in the book. In this case, most of the service providers charge by the number of
API calls over a given time period.
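As a sketch of what such an API call involves, the snippet below assembles a JSON request body in plain Java. The field names ("image", "maxResults") are hypothetical placeholders, not a real vendor schema; consult your provider's API reference for the actual format.

```java
public class VisionApiPayloadSketch {
    // Build a JSON request body for a hosted image-classification API.
    // The schema here is invented for illustration: a base64-encoded image
    // plus the maximum number of labels to return.
    static String buildPayload(String base64Image, int maxResults) {
        return "{\"image\":\"" + base64Image + "\",\"maxResults\":" + maxResults + "}";
    }

    public static void main(String[] args) {
        String payload = buildPayload("aGVsbG8=", 5);
        System.out.println(payload);
        // In practice you would POST this body to the provider's endpoint,
        // for example with java.net.http.HttpClient, and parse the JSON response.
    }
}
```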


So what does it mean to “build” a data science platform? Building in this case
refers to acquiring a software package that will provide the building blocks needed to
implement your own AI or ML solution.
The following list shows some of the popular data science platforms:

• MathWorks: Creators of the legendary MATLAB package, MathWorks is a long-time player in the industry.

• SAP: The large database player has a complete big data services and consulting business.

• IBM: IBM offers the Watson Studio and IBM Data Science Platform products.

• Microsoft: Microsoft Azure provides a full spectrum of data and analytics services and resources.

• KNIME: KNIME Analytics is a Java-based, open, intuitive, integrative data science platform.

• RapidMiner: A commercial Java-based solution.

• H2O.ai: A popular open source data science and ML platform.

• Dataiku: A collaborative data science platform that allows users to prototype, deploy, and run at scale.

• Weka: The Java-based solution you will explore extensively in this book.

The list includes many of the popular data science platforms, and most of them are
commercial data science platforms. The keyword is commercial. You will take a closer
look at RapidMiner later in the book because it is Java based. The other commercial
solutions are full-featured and have a range of pricing options from license-based to
subscription-based pricing.
The good news is you do not have to make a capital expenditure in order to build a
data science platform because there are some open source alternatives available. You
will take a close look at the Weka package in Chapter 3. Whether you decide to build
or buy, open source alternatives like Weka are a very useful way to get started because
they allow you to build your solution while you are learning, without locking you into an
expensive technology solution.


ML Monetization
One of the best reasons to add ML into your projects is increased potential to monetize.
You can monetize ML in two ways: directly and indirectly.

• Indirect monetization: Making ML a part of your product or service.

• Direct monetization: Selling ML capabilities to customers who in turn apply them to solve particular problems or create their own products or services.

Table 1-6 highlights some of the ways you can monetize ML.

Table 1-6. ML Monetization Approaches

• AIaaS (direct): AI as a Service, such as Salesforce Einstein or IBM Watson.

• MLaaS (direct): ML as a Service, such as the Google, Amazon, or Microsoft examples in Table 1-5.

• Model API (indirect): You can create models and then publish an API that allows others to use your model to make their own predictions, for example.

• NLPaaS (direct): NLP as a Service. Chatbots such as Apple Siri, Microsoft Cortana, or Amazon Echo/Alexa; companies such as Nuance Communications, Speechmatics, and Vocapia.

• Integrated ML (indirect): You can create a model that helps solve your problem and integrate that model into your project or app.

Many of the direct strategies employ DL approaches. In this book, the focus is mainly
on the indirect ML strategies. You will implement several integrated ML apps later in the
book. This strategy is indirect because the ML functionality is not visible to your end user.
Customers are not going to pay more just because you include ML in your
application. However, if you can solve a new problem or provide them capability that
was not previously available, you greatly improve your chances to monetize.
There is not much debate about the rapid growth of AI and ML. Table 1-7 shows
estimates from Bank of America Merrill Lynch and Transparency Market Research. Both
firms show a double-digit compound annual growth rate, or CAGR. This impressive
CAGR is consistent with all the hype previously discussed.

Table 1-7. AI and ML Explosive Growth

• Bank of America Merrill Lynch (AI): US$58 bn in 2015 to US$153 bn in 2020; CAGR 27%.

• Transparency Market Research (ML): US$1.07 bn in 2016 to US$19.86 bn in 2025; CAGR 38%.

These CAGRs represent impressive growth. Some of the growth is attributed to DL;
however, you should not discount the possible opportunities available to you with CML,
especially for mobile devices.
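CAGR follows directly from a start value, an end value, and the number of years: CAGR = (end/start)^(1/years) − 1. The snippet below checks the Transparency Market Research row (2016 to 2025 spans nine years):

```java
public class CagrSketch {
    // Compound annual growth rate: (end / start)^(1 / years) - 1
    static double cagr(double start, double end, int years) {
        return Math.pow(end / start, 1.0 / years) - 1;
    }

    public static void main(String[] args) {
        // Transparency Market Research: US$1.07 bn (2016) to US$19.86 bn (2025)
        double rate = cagr(1.07, 19.86, 9);
        System.out.printf("CAGR: %.0f%%%n", rate * 100); // roughly 38%
    }
}
```

The computed value is about 38%, matching the figure quoted in Table 1-7.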

The Case for Classic Machine Learning on Mobile


Classic machine learning is not a very commonly used term. I will use the term to
indicate that we are excluding deep learning. Figure 1-3 shows the relationship. These
two approaches employ different algorithms, and I will discuss them in Chapter 4.
This book is about implementing CML for widely available computing devices
using Java. In a sense, we are going after the “low-hanging fruit.” CML is much easier to
implement than DL, but many of the functions we can achieve are no less astounding.

Figure 1-3. Classic machine learning relationship diagram


There is a case for mastering the tools of CML before attempting to create DL
solutions. Table 1-8 highlights some of the key differences between development and
deployment of CML and DL solutions.

Table 1-8. Comparison of Classic Machine Learning and Deep Learning

Algorithms
• CML: Algorithms are mostly commoditized. You do not need to spend a lot of time choosing the best algorithm or tweaking algorithms. Algorithms are easier to interpret and understand.
• DL: There is a lot of new research behind neural network algorithms. A lot of theory is involved, and a lot of tweaking is required to find the best algorithm for your application.

Data requirements
• CML: Modest amounts of data are required. You can generate your own data in certain applications.
• DL: Huge amounts of data are required to train DL models. Most entities lack sufficient data to create their own DL models.

Performance
• CML: Sufficient performance for many mobile, web app, or embedded device environments.
• DL: The recent growth in deep learning neural network algorithms is largely due to their ability to outperform CML algorithms.

Language
• CML: Many Java tools are available, both open source and commercial.
• DL: Most DL libraries and tools are Python or C++ based, with the exception of the Java-based DL4J. Java wrappers are often available for some of the popular C++ DL engines.

Model creation
• CML: Model size can be modest. It is possible to create models on desktop environments and easy to embed them in mobile or embedded devices.
• DL: Model size can be huge. It is difficult to embed models into mobile apps, and large CPU/GPU resources are required to create models.

Typical use cases
• CML: Regression, clustering, classification, and specific use cases for your data.
• DL: Image classification, speech, computer vision, playing games, self-driving cars, pattern recognition, sound synthesis, art creation, photo classification, anomaly (fraud) detection, behavior analysis, recommendation engines, translation, natural language processing, and facial recognition.

Monetization
• CML: Indirect ML.
• DL: Model APIs, MLaaS.

For mobile devices and embedded devices, CML makes a lot of sense. CML outperforms DL for smaller data sets, as shown on the left side of the chart in Figure 1-6.
It is possible to create CML models with a single modern CPU in a reasonable
amount of time. CML started on the desktop. It does not require huge compute resources
such as multiple CPU/GPU, which is often the case when building DL solutions.
The interesting opportunity arises when you build your models on the desktop and then deploy them to the mobile device, either directly or through an API interface. Figure 1-4 shows a breakdown of funding by AI category according to Venture Scanning.


Figure 1-4. Funding by AI category

The data show that ML for mobile apps has approximately triple the funding of the
next closest area, NLP. The categories included show that many of the common DL
fields, such as computer vision, NLP, speech, and video recognition, have been included
as a specific category. This allows us to assume that a significant portion of the ML apps
category is classic machine learning.


1.4 Deep Learning


I will not cover deep learning in this book because we can accomplish so much more
easily with CML. However, in this short section I will cover a few key points of DL to help
identify when CML might not be sufficient to solve an ML problem.

Figure 1-5. Machine learning red pill/blue pill metaphor

Morpheus described the dilemma we face when pursuing ML in the motion picture
“The Matrix” (see also Figure 1-5):
“You take the blue pill, the story ends; you wake up in your bed and believe
whatever you want to believe. You take the red pill, you stay in Wonderland,
and I show you how deep the rabbit hole goes.”
Deep learning is a sort of Wonderland. It is responsible for all of the hype we have in
the field today. However, it has achieved that hype for a very good reason.
You will often hear it stated that DL operates at scale. What does this mean exactly?
It is a performance argument, and performance is obviously very important. Figure 1-6
shows a relationship between performance and data set size for CML and DL.


Figure 1-6. Deep learning operating at scale

The chart shows that CML slightly outperforms DL for smaller data set sizes. The
question is, how small is small? When we design ML apps, we need to consider on which
side of the point of inflection the data set size resides. There is no easy answer. If there
were, we would place the actual numbers on the x-axis scale. It depends on your specific
situation and you will need to make the decision about which approach to use when you
design the solution.
Fortunately, we have tools that enable us to define the performance of our CML
models. In Chapters 4 and 5, you will look at how to employ the Weka workbench to show
you if increasing your data set size actually leads to increased performance of the model.

Identifying DL Applications
Deep learning has demonstrated superior results versus CML in many specific areas, including speech, natural language processing, computer vision, playing games, self-driving cars, pattern recognition, sound synthesis, art creation, photo classification, anomaly (fraud) detection, recommendation engines, behavior analysis, and translation, just to name a few.
As you gain experience with ML, you begin to develop a feel for when a project is a
good candidate for DL.


Deep networks work well when

• Simpler CML models are not achieving the accuracy you desire.

• You have complex pattern matching requirements.

• You have the dimension of time in your data (sequences).

If you do decide to pursue a DL solution, you can consider the following deep
network architectures:
• Unsupervised pre-trained networks (UPN), including deep belief networks (DBN) and generative adversarial networks (GAN)

• Convolutional neural networks (CNN)

• Recurrent neural networks (RNN), including long short-term memory (LSTM)

• Recursive neural networks

I will talk more about algorithms in Chapter 4. When designing CML solutions,
you can start by identifying the algorithm class of CML you are pursuing, such as
classification or clustering. Then you can easily experiment with algorithms within the
class to find the best solution. In DL, it is not as simple. You need to match your data to
specific network architectures, a topic that is beyond the scope of this book.
While building deep networks is more complicated and resource intensive, as
described in Table 1-8, tuning deep networks is equally challenging. This is because,
regardless of the DL architecture you choose, you define deep learning networks using neural networks that are composed of a large number of parameters, layers, and weights. There are many methods used to tune these networks, including the methods in Table 1-9.

Table 1-9. Tuning Methods for DL Networks

• Backpropagation
• Stochastic gradient descent
• Learning rate decay
• Dropout
• Max pooling
• Batch normalization
• Long short-term memory
• Skip-gram
• Continuous bag of words
• Transfer learning
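Most of these methods are configured inside a DL framework rather than hand-coded, but one is simple enough to illustrate directly. Learning rate decay shrinks the optimizer's step size as training progresses; a common form is exponential decay, sketched below with arbitrary constants chosen for illustration:

```java
public class LearningRateDecaySketch {
    // Exponential learning rate decay: lr(t) = lr0 * decay^t,
    // where t is the epoch number and 0 < decay < 1.
    static double decayedRate(double lr0, double decay, int epoch) {
        return lr0 * Math.pow(decay, epoch);
    }

    public static void main(String[] args) {
        double lr0 = 0.1, decay = 0.9; // arbitrary illustration constants
        for (int epoch = 0; epoch <= 3; epoch++) {
            System.out.printf("epoch %d: lr = %.4f%n",
                    epoch, decayedRate(lr0, decay, epoch));
        }
    }
}
```

Early epochs take large steps toward a good region of the parameter space; later epochs take small steps to settle into it.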


As the table suggests, DL is complicated. The AI engines available for DL try to simplify the process. Table 1-10 shows many of the popular AI engines that include DL libraries. In this book, you will focus on CML solutions for Java developers.
When you create DL solutions there are not as many Java tools and libraries
available. DL4J and Spark ML are the two most common Java-based packages that can
handle DL. DL4J is built from the ground up with DL in mind, whereas the popular
Spark open source project has recently added some basic DL capabilities. Some of the
excellent C++ libraries do provide Java wrappers, such as Apache MXNet and OpenCV.

Table 1-10. AI Engines with Deep Learning Libraries

• Theano (Python): Powerful general-purpose tool for mathematical programming. Developed to facilitate deep learning. High-level language and compiler for GPU.

• TensorFlow (C++ and Python; Google): Library for all types of numerical computation associated with deep learning. Heavily inspired by Theano. Data flow graphs represent the ways multi-dimensional arrays (tensors) communicate.

• CNTK (C++): Computational Network Toolkit. Released by Microsoft Research under a permissive license.

• Caffe (C++ and Python; Facebook support): Clean and extensible design. Based on the AlexNet that won the 2012 ImageNet challenge.

• DL4J (Java; Skymind): Java-based open source deep learning library (Apache 2.0 license). Uses a multi-dimensional array class with linear algebra and matrix manipulation.

• Torch (C): Open source scientific computing framework optimized for use with GPUs.

• Spark MLlib (Java): A fast and general engine for large-scale distributed data processing. MLlib is the machine learning library. Huge user base; DL support is growing.

• Apache MXNet (C++ with Java wrapper): Open source Apache project used by AWS. State-of-the-art models: CNN and LSTM. Scalable. Founded by the University of Washington and Carnegie Mellon University.

• Keras (Python): Powerful, easy-to-use library for developing and evaluating DL models. Combines the best of Theano and TensorFlow.

• OpenCV (C++ with Java wrapper): Open source computer vision library that can be integrated for Android.


While it is entirely possible that DL can solve your unique problem, this book
encourages you to think about solving your problem, at least initially, by using
CML. The bottom line, before we move on to ML methodology and some of the technical
setup topics, is the following:
Deep learning is amazing, but in this book, we resist the temptation and
favor classic machine learning, simply because there are so many equally
amazing things it can accomplish with far less trouble.
In the rest of the book, we will choose the blue pill and stay in the comfortable
simulated reality of the matrix with CML.

1.5 ML-Gates Methodology


Perhaps the biggest challenge of producing ML applications is training yourself to think
differently about the design and architecture of the project. You need a new data-driven
methodology. Figure 1-7 introduces the ML-Gates, a methodology that uses six gates
to help organize CML and DL development projects. Each project begins with ML-Gate 6
and proceeds to completion at ML-Gate 0. The gates count down, leading to the eventual
launch or deployment of the ML project.


Figure 1-7. ML-Gates, a machine learning development methodology

As developers, we write a lot of code. When we take on new projects, we often just
start coding until we reach the deliverable product. With this approach, we typically end
up with heavily coded apps.
With ML, we want to flip that methodology on its head. We are instead trying to achieve
data-heavy apps with minimal code. Minimally coded apps are much easier to support.

ML-Gate 6: Identify the Well-Defined Problem


It all starts with a well-defined problem. You need to think a bit more narrowly in this
phase than you do when undertaking traditional non-ML projects. This can result in
creating ML modules that you integrate into the larger system.
To illustrate this, let’s consider an example project with client requirements.
For the project, you map the client requirements to well-defined ML solutions.
Table 1-11 shows the original client requirements mapped to the ML models.


Table 1-11. Mapping Requirements to ML Solutions

• Initial client requirement R1: Create a shopping app so customers inside the physical store will have an enhanced user experience.
Well-defined ML solution: You identify the need for an in-store location-based solution to make this app useful. You can use a clever CML approach to achieve this. More about the implementation at the end of Chapter 6.

• Initial client requirement R2: Implement a loyalty program for shoppers who use the app to help increase sales.
Well-defined ML solution: Loyalty programs are all about saving and recalling customer data. You can build an ML model using product inventory data and customer purchase history data to recommend products to customers, resulting in an enhanced user experience.

In this example, the client wants an in-store shopping app. These are perfectly valid
requirements, but these high-level requirements do not represent well-defined ML
problems. Your client has “expressed” a need to provide an “enhanced user experience.”
What does that really mean? To create an ML solution, you need to think about the
unexpressed or latent needs of the client.
The right column shows how to map the expressed requirements to well-defined ML
solutions. In this case, you are going to build two separate ML models. You are going to
need data for these models, and that leads you to ML-Gate 5.

ML-Gate 5: Acquire Sufficient Data


Data is the key to any successful ML app. In MLG5, you need to acquire the data. Notice that
this happens well before you write any code. There are several approaches for acquiring
data; I will discuss the following options in detail in Chapter 2:

• Purchase the data from a third party.

• Use publicly available data sets.

• Use your own data set.

• Generate new static data yourself.

• Stream data from a real-time source.


ML-Gate 4: Process/Clean/Visualize the Data


Once you have a well-defined problem and sufficient data, it is time to architect your
solution. The next three gates cover this activity. In MLG4, you need to process, clean,
and then visualize your data.
MLG4 is all about preparing your data for model construction. You need to
consider issues such as missing values, normalization, relevance, format, data
types, and data quantity.
Visualization is an important aspect because you strive to be accountable for your
data. Data that is not properly preprocessed can lead to errors when you apply CML or
DL algorithms to the data. For this reason, MLG4 is very important. The old saying about
garbage in, garbage out is something you must avoid.
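For example, normalization can be sketched in plain Java as a min-max scaling pass. This is a simplified illustration with made-up data; real projects would typically use the preprocessing utilities of whichever ML library you adopt.

```java
// Minimal sketch of min-max normalization, one of the MLG4 data
// preparation techniques mentioned above. The input values are
// made-up example data.
public class Normalize {
    static double[] minMax(double[] values) {
        double min = Double.MAX_VALUE, max = -Double.MAX_VALUE;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double[] scaled = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            scaled[i] = (values[i] - min) / (max - min); // maps into [0, 1]
        }
        return scaled;
    }

    public static void main(String[] args) {
        double[] data = {10, 20, 40}; // made-up example values
        for (double v : minMax(data)) {
            System.out.println(v);
        }
    }
}
```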

ML-Gate 3: Generate a Model


With your data prepared, MLG3 is where you actually create the model. At MLG3, you
will make the initial decision on which algorithm to use.
In Chapter 4, I will cover the Java-based CML environments that can generate models.
I will cover how to create models and how to measure the performance of your models.
One of the powerful design patterns you will use is to build models offline for later use
in Java projects. Chapter 5 will cover the import and export of "prebuilt" models.
At MLG3, you also must consider version control and updating approaches for your
models. This aspect of managing models is just as important as managing code updates
in non-ML software development.

ML-Gate 2: Test/Refine the Model


With the initial model created, MLG2 allows you to test and refine the model. It is here
that you are checking the performance of the model to confirm that it will meet your
prediction requirements.
Inference is the process of using the model to make predictions. During this process,
you may find that you need to tweak or optimize the chosen algorithm. You may find that
you need to change your initial algorithm of choice. You might even discover that CML is
not providing the desired results and you need to consider a DL approach.
Passing ML-Gate 2 indicates that the model is ready, and it is time to move on to
MLG1 to integrate the model.
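As a simple illustration of the kind of performance check performed at MLG2, the sketch below computes classification accuracy by comparing predictions against known labels. The arrays are made-up example data; real model evaluation is covered with the book's tools in Chapter 4.

```java
// Illustrative MLG2-style check: percentage of predictions that match
// the known labels. The two arrays are made-up example data.
public class AccuracyDemo {
    public static void main(String[] args) {
        int[] predicted = {1, 0, 1, 1, 0};
        int[] actual    = {1, 0, 0, 1, 0};
        int correct = 0;
        for (int i = 0; i < actual.length; i++) {
            if (predicted[i] == actual[i]) {
                correct++;
            }
        }
        // Accuracy as a percentage of correct predictions.
        System.out.println(100.0 * correct / actual.length);
    }
}
```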


ML-Gate 1: Integrate the Model


At MLG1, it is time to write actual production code. Notice how far back in the
methodology you have pushed the actual code writing. The good news is that you will
not have to write as much code as you normally do because the trained model you have
created will accomplish much of the heavy lifting.
Much of the code you need to write at MLG1 handles the “packaging” of the model.
Later in this chapter, I will discuss potential target environments that can also affect how
the model needs to be packaged.
Typically, you create CML models at MLG3/4 with training data and then utilize the
model to make predictions. At MLG1, you might write additional code to acquire new
real-time data to feed into the model to output a prediction. In Chapter 6, you will see
how to gather sensor data from devices to feed into the model.
MLG1 is where you realize the coding time savings. It usually takes only a few lines
of code to open a prebuilt model and make a new prediction.
This phase of the methodology also includes system testing of the solution.
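To illustrate why MLG1 needs so little code, the following self-contained sketch uses plain Java serialization with a made-up LinearModel class standing in for a prebuilt model. The class names and file handling here are hypothetical; a real project would use the model import/export facilities of its ML library, as covered in Chapter 5.

```java
import java.io.*;

// Stand-in for a model "trained" offline at MLG3 (hypothetical class).
class LinearModel implements Serializable {
    private static final long serialVersionUID = 1L;
    double weight = 2.0, bias = 1.0;
    double predict(double x) { return weight * x + bias; }
}

public class ModelLoadDemo {
    public static void main(String[] args) throws Exception {
        // Simulate MLG3: save the trained model to disk.
        File f = File.createTempFile("model", ".ser");
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(f))) {
            out.writeObject(new LinearModel());
        }

        // MLG1: a few lines of code open the prebuilt model and predict.
        LinearModel model;
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(f))) {
            model = (LinearModel) in.readObject();
        }
        System.out.println(model.predict(3.0)); // 2.0 * 3.0 + 1.0
    }
}
```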

ML-Gate 0: Deployment
At MLG0, it is time for deployment of the completed ML solution. You have several
options to deploy your solution because of the cross-platform nature of Java, including

• Release a mobile app through an app store such as Google Play.

• Ship a standalone software package to your clients.

• Provide the software online through web browser access.

• Provide API access to your solution.

Regardless of how you deploy your ML solutions, the important thing to
remember at MLG0 is that "ship it and forget it" is wrong.
When we create models, we have to recognize that they should not become static
entities that never change. We need a mechanism to update them and keep them
relevant. ML models help us to avoid the downside of code-heavy apps, but instead we
must effectively manage the models we create so they do not become outdated.


Methodology Summary
You now have covered the necessary background on CML, and you have a methodology
you can use for creating CML applications.
You have probably heard the saying, "When all you have is a hammer, everything looks
like a nail." After becoming proficient in CML and adopting a data-driven methodology,
you soon discover that most problems have an elegant ML solution for at least some
aspect of the problem.
Next, you will look at the setup required for Java projects in the book, as well as one
final key ingredient for ML success: creative thinking.

1.6 The Case for Java


There is always a raging debate about which programming language is the best, which
language you should learn, what's the best language for kids to start coding in, which
languages are dying, which new languages represent the future of programming, and so on.
Java is certainly a big part of these debates. There are many who question the ability
of Java to meet the requirements of a modern developer. Each programming language
has its own strengths and weaknesses.
Exercises in Programming Style by Cristina Videira Lopes is interesting because
the author solves a common programming problem in a huge variety of programming styles
while highlighting the strengths and weaknesses of each. The book illustrates that we can
use any language to solve a given problem. As programmers, we need to find the best
approach given the constraints of the chosen language. Java certainly has its pros and
cons, and next I will review some reasons why Java works well for CML solutions.

Java Market
Java has been around since 1995 when it was first released by Sun Microsystems,
which was later acquired by Oracle. One of the benefits of this longevity is the market
penetration it has achieved. The Java market share (Figure 1-8) is the single biggest
reason to target the Java language for CML applications.


Figure 1-8. Java market

Java applications compile to bytecode and can run on any Java virtual machine
(JVM) regardless of computer architecture. It is one of the most popular languages with
many millions of developers, particularly for client-server web applications.
When you install Java, Oracle is quick to point out that three billion devices run Java.
It is an impressive claim. If we drill down deeper into the numbers, they do seem to be
justified. Table 1-12 shows some more granular detail of the device breakdown.

Table 1-12. Devices Running Java

• Desktops running Java: 1.1 billion
• JRE downloads each year: 930 million
• Mobile phones running Java: 3 billion
• Blu-ray players: 100% run Java
• Java Cards: 1.4 billion manufactured each year
• Proprietary boxes: Unknown number of devices, including set-top boxes, printers, web cams, game consoles, car navigation systems, lottery terminals, parking meters, VoIP phones, utility meters, industrial controls, etc.


The explosion of Android development and the release of Java 8 helped Java to gain
some of its market dominance.
Java’s massive scale is the main reason I prefer it as the language of choice
for CML solutions. Developers only need to master one language to produce
working CML solutions and deploy them to a huge target audience.
For your target environments, the focus will be on the following three areas that
make up the majority of installed Java devices:

• Desktops running Java: This category includes personal computers
that can run standalone Java programs or browsers on those
computers that can run Java applets.

• Mobile phones running Java: Android mobile devices make up a large
part of this category, which also includes low-cost feature phones. One
of the key findings of ML is the importance of data, and the mobile
phone is arguably the greatest data collection device ever created.

• Java Cards: This category represents the smallest of the Java
platforms. Java Cards allow Java applets to run on embedded devices.
Device manufacturers are responsible for integrating embedded Java;
it is not available for download or installation by consumers.

Java Versions
Oracle supplies the Java programming language for end users and for developers:

• JRE (Java Runtime Environment) is for end users who wish to install
Java so they can run Java applications.

• JDK (Java SE Developer Kit) Includes the JRE plus additional tools for
developing, debugging, and monitoring Java applications.

There are four platforms of the Java programming language:

• Java Platform, Standard Edition (Java SE)

• Java Platform, Enterprise Edition (Java EE), is built on top of Java SE


and includes tools for building network applications such as JSON,
Java Servlet, JavaMail, and WebSocket. Java EE is developed and
released under the Java Community Process.


• Java Platform, Micro Edition (Java ME), is a small-footprint virtual
machine for running applications on small devices.

• JavaFX is for creating rich internet applications using a lightweight API.

All of the Java platforms consist of a Java Virtual Machine (JVM) and an application
programming interface (API).
Table 1-13 summarizes the current Java releases.

Table 1-13. Latest Supported Java Releases

• Java 8 SE (build 171): Currently supported long-term-support (LTS) version. Introduces lambda expressions. The latest update was 171.

• Java 10 SE (10.0.1): Currently supported rapid-release version. Released March 20, 2018. Includes 12 new major features.

• Android SDK: Alternative Java software platform used for developing Android apps. Includes its own extensive GUI system and mobile device libraries. Android does not provide the full Java SE standard library. The Android SDK supports Java 6 and some Java 7 features.

The most recent versions of Java have addressed some of the areas where the
language was lagging behind some of the newer, more trendy languages. Notably,
Java 8 includes the far-reaching feature known as the lambda expression along with
a new operator (->) and a new syntax element. Lambda expressions add functional
programming features and can help to simplify and reduce the amount of code required
to create certain constructs.
In this book, you will not be using lambda expressions, nor will you use any of the
many new features added to the language in Java 10. Nonetheless, it is best to run with
the latest updates on either the long-term support release of Java 8 or the currently
supported rapid release of Java 10.
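As a brief illustration of the Java 8 syntax mentioned above, the following snippet uses a lambda expression with the -> operator to filter a list of values. It is an illustrative example (the values are made up), not a construct used in the book's projects.

```java
import java.util.Arrays;
import java.util.List;

public class LambdaDemo {
    public static void main(String[] args) {
        // Made-up example data: prediction confidences.
        List<Double> predictions = Arrays.asList(0.91, 0.42, 0.77, 0.13);

        // Java 8 lambda with the -> operator: filter and count in one
        // expression, replacing a loop or anonymous class.
        long confident = predictions.stream()
                .filter(p -> p >= 0.75)
                .count();

        System.out.println(confident);
    }
}
```

Before Java 8, the same filter would have required an explicit loop or an anonymous inner class, which is the code-reduction benefit lambdas provide.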


If you are looking for a comprehensive Java book, Java: The Complete Reference,
Tenth Edition (Oracle Press), which weighs in at over 1,300 pages, is an excellent choice. It
covers all things Java. When it comes to Java performance tuning, Java Performance by
Charlie Hunt and Binu John is the 720-page definitive guide for getting the most out of
Java performance.

Installing Java
Before installing Java, you should first uninstall all older versions of Java from your
system. Keeping old versions of Java on your system is a security risk. Uninstalling older
versions ensures that Java applications will run with the latest security and performance
environment.
The main Java page and links for all the platform downloads are available at the
following URLs:

https://java.com/en/
https://java.com/en/download/manual.jsp

Java is available for any platform you require. Once you decide which download you
need, proceed to download and install. For the projects in this book, it is recommended
that you install the latest stable release of Java 8 SE. For the Android projects, allow Android
Studio to manage your Java release. Android Studio will typically use the latest stable
release of Java 7 until the Android team adds support for Java 8.
Figure 1-9 shows the main Java download page.
Figure 1-10 shows the Java installation.
Figure 1-11 shows the completion of the installation.


Figure 1-9. Downloading Java


Figure 1-10. Installing Java

Figure 1-11. Successful Java SE installation

Java Performance
Steve Jobs once famously said about Java, “It’s this big, heavyweight ball and chain.”
Of course, Apple was never a big fan of the language. One result of, or perhaps the
reason for, Java's longevity is the support and improvements added to the language over
the years. The latest versions of Java offer far more features and performance than the
early versions.


One of the reasons developers have been hesitant to choose Java for ML solutions is
the concern over performance.
Asking which language is “faster” or offers better performance is not really a useful
question. It all depends, of course. The performance of a language depends on its
runtime, the OS, and the actual code. When developers ask, “Which language offers the
best performance for machine learning?” we really should be asking, “Which platform
should I use to accomplish the training and building of machine learning models the
most quickly and easily?”
Creating ML models using algorithms is CPU intensive, especially for DL
applications. This book is about Java, but if you research ML, you know that Python
and C++ are also very popular languages for ML. Creating a fair comparison of the
three languages for ML is not easy, but many researchers have tried to do this and you
can learn from their findings. Since ML is algorithm-based, they often try to choose a
standard algorithm and then implement a comparison with other variables being equal,
such as CPU and operating system.
Java performance is the hardest of the languages to measure because of several
factors including unoptimized code, Java’s JIT compilation approach, and the famous
Java garbage collection. Also, keep in mind that Java and Python may rely on wrappers to
C++ libraries for the actual heavy lifting.
Table 1-14 shows a high-level summary of the performance of a mathematical
algorithm implemented in three different languages on the same CPU and operating system.
To learn more about the underlying research used in the summary, refer to these sources:

• Program Speed, Wikipedia: https://en.wikipedia.org/wiki/Java_performance

• A Google research paper comparing the performance of C++, Java, Scala, and the Go programming language: https://days2011.scala-lang.org/sites/days2011/files/ws3-1-Hundt.pdf

• Comparative Study of Six Programming Languages: https://arxiv.org/ftp/arxiv/papers/1504/1504.00693.pdf

• Ivan Zahariev's blog: https://blog.famzah.net/2016/02/09/cpp-vs-python-vs-perl-vs-php-performance-benchmark-2016/


Table 1-14. Language Performance Comparison - Mathematical Algorithms

• C++ (baseline): Compiles to native code, so it comes in first.

• Java 8 (15% slower than C++): Java produces bytecode for platform independence. Java's "kryptonite" has been its garbage collection (GC) overhead. There have been many improvements made to the Java GC algorithms over the years.

• Kotlin (15+% slower than C++): Kotlin also produces Java Virtual Machine (JVM) bytecode. Typically, Kotlin is as fast as Java.

• Python (55% slower than C++): Python has high-level data types and dynamic typing, so the runtime has to work harder than Java's.

Table 1-14 is certainly not an exhaustive performance benchmark, but it does provide
some insight into relative performance, with possible explanations for the differences.
When you create prebuilt models for your ML solutions, it is more important to focus
on data quality and algorithm selection than on the programming language. You should
use the programming language that most easily and accurately allows you to express the
problem you are trying to solve.
Java skeptics frequently ask, “Is Java a suitable programming language for
implementing deep learning?” The short answer: absolutely! It has sufficient
performance, and all the required math and statistical libraries are available. Earlier
in the chapter, I listed DL4J as the main Java package written entirely in Java. DL4J is
a fantastic package and its capabilities rival all of the large players in DL. Bottom line:
With multi-node computing available to us in the cloud, we have the option to easily add
more resources to computationally intensive operations. Scalability is one of the great
advantages provided by cloud-based platforms I will discuss in Chapter 2.

1.7 Development Environments


There are many IDEs available to Java developers. Table 1-15 shows the most popular
choices for running Java on the desktop or device. There are also some online browser-based
cloud Java IDEs, such as Codenvy, Eclipse Che, and Koding, which I will not cover.


Table 1-15. Java IDE Summary

• Android Studio: Android-specific development environment from Google. It has become the de facto IDE for Android. Offers a huge number of useful development and debugging tools.

• IntelliJ IDEA: Full-featured, professional IDE with an annual fee. Many developers love IntelliJ. Android Studio was based on IntelliJ.

• Eclipse: Free open source IDE (Eclipse Public License). Supports Git. Huge number of plugins available.

• BlueJ: Lightweight development environment. Comes packaged with the Raspberry Pi.

• NetBeans: Free open source IDE, alternative to Eclipse. The open source project is moving to Apache, which should increase its popularity.

The book uses two development environments for the projects, depending on the
target platform:

• Google's Android Studio helps developers create apps for mobile
devices running Android.

• The Eclipse IDE is for Java projects that do not target Android mobile
devices. This includes Java programs that target the desktop, the
browser, or non-Android devices such as the Raspberry Pi.

Android Studio
Google makes it easy to get started with Android Studio. The latest stable release is
version 3.1.2, available April 2018. The download page is https://developer.android.com/studio/.
Figure 1-12 shows the available platforms. Note that the files and disk requirements
are large. The download for 64-bit Windows is over 700MB.


Figure 1-12. Android Studio downloads

Android Studio has really been improving over the last couple of years. The full-featured
development environment for Android includes

• Kotlin version 1.2.30

• Performance tools

• Real-time network profiler

• Visual GUI layout editor

• Instant Run

• Fast emulator

• Flexible Gradle build system

• Intelligent code editor

Figure 1-13 shows the Android Studio installation setup.


Figure 1-13. Android Studio install

Figure 1-14 shows the Android Studio opening banner, including the
current version, 3.1.2.

Figure 1-14. Android Studio Version 3.1.2

Android Studio uses the SDK Manager to manage SDK packages, which are
available for download and required to compile and release your app
for a specific Android version. The most recent SDK release is Android 8.1 (API level 27),
also known as Oreo. Figure 1-15 shows the Android SDK Manager.


Figure 1-15. Android Studio SDK Manager

Always keep an eye out for updates to both Android Studio and the SDK
platforms you use. Google frequently releases updates, and you want your
development environment to stay current.
This is especially important for mobile development when end users are constantly
buying the latest devices.

Eclipse
Android mobile apps are a big part of our CML strategy, but not the only target audience
we have available to us. For non-Android projects, we need a more appropriate
development environment.
Eclipse is the versatile IDE available from the Eclipse Foundation. The download
page is https://eclipse.org/downloads. Eclipse is available for all platforms. The
most recent release is Oxygen.3a (version 4.7.3a). Like Android, Eclipse release names
proceed through the alphabet, and Eclipse is also currently at "O."


Similar to the options available for the Java distributions, developers can choose either

• Eclipse IDE for Java EE Developers (includes extra tools for web
apps), or

• Eclipse IDE for Java Developers

The latter is sufficient for the projects in this book. Figure 1-16 shows the Eclipse IDE
for Java Developers installation banner.

Figure 1-16. Eclipse install

Eclipse makes it easy to get started with your Java projects. Once installed, you will
have the option to

• Create new projects.

• Import projects from existing source code.


• Check out or clone projects from the Git source code control system.

The Git checkout feature is very useful, and you can use that option to get started
quickly with the book projects. Figure 1-17 shows the Eclipse IDE for Java Developers
startup page with the various options.


Figure 1-17. Eclipse IDE for Java developers

One of the big advantages of Eclipse is the huge number of plugins available. There
are plugins for almost every imaginable integration. Machine learning is no exception.
Once you get a feel for the types of ML projects you are producing, you may find the
Eclipse plugins in Table 1-16 to be useful. For the book projects, you will use a basic
Eclipse installation without plugins.


Table 1-16. Eclipse IDE Machine Learning Related Plugins

• AWS Toolkit: Helps Java developers integrate AWS services into their Java projects.

• Google Cloud Tools: Google-sponsored open source plugin that supports the Google Cloud Platform. Cloud Tools for Eclipse enables you to create, import, edit, build, run, and debug in the Google cloud.

• Microsoft Azure Toolkit: The Azure Toolkit for Eclipse allows you to create, develop, configure, test, and deploy lightweight, highly available, and scalable Java web apps.

• R for Data Science: Eclipse has several plugins to support the R statistical language.

• Eclipse IoT: 80 plugins available.

• Eclipse SmartHome: 47 plugins available.

• Quant Components: Open source framework for financial time series and algorithmic trading.

It is important to keep your Eclipse environment up to date. Figure 1-18 shows
the Eclipse startup banner with the current version. Just as with your Java installation,
Android Studio, and Android SDK platforms, always keep your Eclipse IDE up to date.

Figure 1-18. Eclipse IDE for Java developers


NetBeans IDE

The NetBeans IDE is an alternative for Java developers who do not want to use Eclipse.
The download page is https://netbeans.org/downloads.
Eclipse has gained more users over the years, but NetBeans still has its supporters.
Recently, Oracle announced that it would turn over NetBeans to the Apache Foundation
for future support. Fans of NetBeans see this as a positive development because now the
long-time supporters of NetBeans will be able to continue its development.
I will not be using NetBeans in the book, but you are free to do so. The projects
should import easily. It is an IDE worth keeping an eye on in the future. Figure 1-19
shows the NetBeans main page.

Figure 1-19. NetBeans IDE



1.8 Competitive Advantage


Earlier in this chapter, you developed a strategy to deploy CML apps for Java-based
devices. You also established a methodology, the ML-Gates, for data-driven
development. The goal is to create a competitive advantage and monetize your ML
solutions. Achieving this goal takes more than just using the development tools that are
readily available to everyone.
This section will discuss two additional ingredients needed to help create a
competitive advantage when designing ML solutions:

• Creative thinking

• Bridging domains

One of the key success factors when trying to create ML solutions is creativity. You
need to think out of the box. It is a cliché, but it often takes a slightly different perspective
to discover a unique ML solution.

Standing on the Shoulders of Giants


If you visit the mathematics, computer science, or physics departments of your local
college or university, you will find academic research papers plastered on the corridor
walls. Upon closer look, you will find that many of these works focus on machine
learning. If you search online, you will also find many of these papers.
PhD students in mathematics or statistics usually author these papers. They typically
spend months or even years on the particular topic they are exploring. These papers are
often difficult for developers to understand. Sometimes we may only grasp a fraction of
the content. However, these papers are a very useful resource in our search for creative
ideas.
Academic research papers can provide valuable ideas for content and
approaches we can utilize in our machine learning apps.
Leveraging the findings of these researchers could potentially help you identify a
solution, or save you a lot of time. If you find a relevant research paper, do not be afraid
to reach out to the author. In most cases, they are not software developers, and you could
form an interesting partnership.


Bridging Domains
Everybody has access to the technologies in this book. How can we differentiate
ourselves? Recall from Table 1-1 at the beginning of this chapter that ML terminology
originates from different domains. Figure 1-20 shows a graphical view of the domains.
As developers, we approach the problem from the technology domain. With our toolkits,
we occupy a unique position, allowing us to produce Java ML solutions that lie at the
intersection of the domains.

Figure 1-20. Domain relationships

Businesses have the data, the capital ($) to deploy, and many problems they need to
solve. The scientists have the algorithms. As Java developers, we can position ourselves
at the intersection and produce ML solutions. Developers who can best understand
the business problem, connect the problem to the available data, and apply the most
appropriate algorithm will be in the best position for monetization.


1.9 Chapter Summary


I have covered quite a few broad topic areas in this chapter. A quick review of the key
findings follows. Keep them in mind as you proceed through the rest of the book.

Key Findings
1. Adopt a data-driven methodology.

2. "Set it and forget it" is wrong. You need to update models frequently to reflect changes in the underlying data.

3. Adopt a data-driven methodology like the ML-Gates.

4. Always start with a clearly defined problem.

5. DL is not required to produce amazing solutions. You can use CML techniques, which are far easier to build and implement for many real-world scenarios.

6. DL can operate at scale. The more data you can feed to the model, the more accurate it becomes.

7. CML performs better for smaller data sets.

8. Think creatively to gain a competitive advantage.

9. Scientific research papers can provide an excellent source of ideas.

10. Think across domains. Bridge the gap between the technology, business, and science domains.
