Association: An association rule learning problem is where you want to discover rules that describe large portions of your data, such
as people that buy X also tend to buy Y.
Association rules allow you to establish associations amongst data objects inside large datasets by identifying
relationships between variables in a given dataset, as in market basket analysis and recommendation engines.
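As a rough illustration of the "people that buy X also tend to buy Y" idea, the sketch below computes two common rule measures, support and confidence, for one candidate rule. The basket contents and the helper name `rule_metrics` are made up for illustration and are not from the source.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n_total = len(baskets)
    n_antecedent = sum(1 for b in baskets if antecedent in b)
    n_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    support = n_both / n_total                      # how often X and Y occur together
    confidence = n_both / n_antecedent if n_antecedent else 0.0  # how often Y follows X
    return support, confidence

# Hypothetical market baskets (illustrative data only).
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "jam"},
    {"milk"},
]

support, confidence = rule_metrics(baskets, "bread", "butter")
print(f"bread -> butter: support={support:.2f}, confidence={confidence:.2f}")
```

A rule is usually kept only if both values exceed thresholds chosen by the analyst.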
Labels
A label is the thing we're predicting: the y variable in simple linear regression. The label could be the future price of wheat, the kind of
animal shown in a picture, the meaning of an audio clip, or just about anything.
Features
A feature is an input variable: the x variable in simple linear regression. A simple machine learning project might use a single feature,
while a more sophisticated machine learning project could use millions of features, specified as x1, x2, ..., xN.
In the spam detector example, the features could include the following:
•words in the email text
•sender's address
•time of day the email was sent
•whether the email contains the phrase "one weird trick"
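A minimal sketch of how the spam-detector features listed above might be computed from one raw email. The function name `extract_features`, the feature names, and the sample email are hypothetical.

```python
def extract_features(email_text, sender, hour_sent):
    """Turn one raw email into a dictionary of features (x1..xN)."""
    lowered = email_text.lower()
    return {
        "word_count": len(lowered.split()),            # derived from the email text
        "sender_domain": sender.split("@")[-1],        # derived from the sender's address
        "hour_sent": hour_sent,                        # time of day the email was sent
        "has_one_weird_trick": "one weird trick" in lowered,
    }

# Hypothetical example email.
features = extract_features(
    "Lose weight with this One Weird Trick", "promo@example.com", 3
)
print(features)
```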
Models
A model defines the relationship between features and label.
For example, a spam detection model might associate certain features
strongly with "spam". Let's highlight two phases of a model's life:
•Training means creating or learning the model. That is, you show the
model labeled examples and enable the model to gradually learn the
relationships between features and label.
•Inference means applying the trained model to unlabeled examples.
That is, you use the trained model to make useful predictions (y'). For
example, during inference, you can predict median HouseValue for new
unlabeled examples.
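The two phases can be sketched with simple linear regression on a single feature. This is only an illustration under assumed data: the labeled examples below are invented, and `train` and `predict` are hypothetical helper names.

```python
def train(examples):
    """Training: learn slope w and intercept b from labeled (x, y) examples."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    var_x = sum((x - mean_x) ** 2 for x, _ in examples)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b

def predict(model, x):
    """Inference: apply the trained model to an unlabeled example."""
    w, b = model
    return w * x + b

labeled_examples = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # invented data following y = 2x + 1
model = train(labeled_examples)
y_prime = predict(model, 4.0)   # prediction y' for a new, unlabeled x
print(y_prime)
```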
If you have a smaller amount of data that is clearly labelled for training,
opt for supervised learning. Unsupervised learning is generally a better
fit for large datasets where labels are not available.
A child playing with toys can arrange them by identifying patterns
based on colors, shapes, sizes, or just based on their interests. A child who
discovers new ways to cluster the toys without needing external
supervision is doing something similar to unsupervised learning.
Unsupervised learning identifies hidden patterns and relationships in an
unlabeled dataset by grouping data into clusters or by association.
An example of association is a recommendation system.
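Grouping unlabeled data into clusters can be sketched with a small one-dimensional k-means loop. The toy sizes and the starting centers are invented, and k-means is just one possible clustering technique, chosen here for brevity.

```python
def k_means_1d(values, centers, iterations=10):
    """Cluster numbers around the given starting centers (no labels used)."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            # Assign each value to its nearest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters

toy_sizes = [1.0, 1.2, 0.8, 5.0, 5.5, 4.5]   # hypothetical toy sizes
centers, clusters = k_means_1d(toy_sizes, centers=[0.0, 6.0])
print(centers, clusters)
```

The small and large toys end up in separate groups even though no example was ever labeled "small" or "large".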
The two main characteristics of reinforcement learning (RL) are trial-and-error
search and delayed rewards (similar to delayed gratification).
Reinforcement learning is applied in Robotics, Self-driving cars,
evaluating trading strategies and adaptive controls.
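Trial-and-error search with a delayed reward can be sketched with tabular Q-learning on a tiny corridor. Everything here is an assumption for illustration: the four-state environment, the reward of 1 at the goal, and the values of `ALPHA` and `GAMMA`. Q-learning is one RL algorithm among many.

```python
import random

N_STATES = 4            # states 0..3; state 3 is the goal
ACTIONS = (-1, +1)      # index 0 = move left, index 1 = move right
ALPHA, GAMMA = 0.5, 0.9 # learning rate and discount factor (assumed values)

def step(state, action):
    """Toy environment: the only reward arrives when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

def q_learning(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(2)                      # trial and error: act randomly
            next_state, reward = step(state, ACTIONS[a])
            # The delayed reward is propagated backwards through the discounted target.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q

q = q_learning()
policy = ["left" if row[0] > row[1] else "right" for row in q[:-1]]
print(policy)
```

Early states receive no immediate reward, yet their action values still come to favor moving toward the goal.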
A software engineering perspective on engineering machine learning
systems: State of the art and challenges
Introduction
ML algorithms, which have been around for many decades, have empowered
software-intensive systems to provide additional beneficial functionalities. Some of the
remarkable tasks successfully tackled by ML algorithms include
autonomous driving, social network analysis, natural language processing, image
recognition, and recommendation. Compelling examples of ML systems can be
seen in various sectors, including finance, healthcare, and manufacturing.
Despite several promising examples, 47% of AI projects remain prototypes due to
the lack of tools to develop and maintain a production-grade AI system,
according to Gartner Research.
ML systems engineering in real-world settings is challenging since it adds
additional complexity to engineering "traditional" software. We have separate
bodies of knowledge for engineering ML capabilities and engineering traditional
software.
On the other hand, ML capabilities are generally served as parts of larger
software-intensive systems (besides embedded software in robots or vehicles).
Therefore, we need a holistic view of engineering software-intensive systems with
ML capabilities (ML systems) in real-world settings. Many researchers from
software engineering (SE) and ML have stated the need for such a holistic
view.
The Software Engineering for Machine Learning Applications (SEMLA)
international symposium (Khomh et al., 2018) was arranged to bring
together researchers and practitioners in SE and ML to explore the
challenges and implications of engineering ML systems. In 2018,
two main topics were addressed intensively in SEMLA.
(1) How can software development teams incorporate ML related
activities into existing software processes?
(2) What new roles, artifacts, and activities would be required to
develop ML systems?
Realizing Artificial Intelligence Synergies in Software Engineering
(RAISE).
Christian Kästner at Carnegie Mellon University started to deliver a
course called "Software Engineering for AI-Enabled Systems", which
takes an SE perspective on building software systems with a significant
ML component.
One such event is QCon.ai, which aims to bring SE and ML practitioners together
to exchange experiences and thoughts on all aspects of SE for ML.
This study aims to present the state of the art on software engineering
for ML systems.
This paper can serve as a starting point to obtain such a holistic view
and a repository of papers to explore this topic.
The goal of this paper is to summarize the state-of-the-art and identify
challenges when engineering ML systems.
Artificial intelligence (AI) is the name of the field involving the efforts for
building intelligent agents.
Intelligent agents perceive their environment and try to achieve their
goals by acting autonomously.
Machine learning (ML) is a subfield of AI, which tries to acquire
knowledge by extracting patterns from raw data and solve some
problems using this knowledge. Deep Learning (DL) is a subfield of ML
that focuses on creating large neural network models capable of making
accurate data-driven decisions.
DL has emerged from research in AI and ML and is particularly suited to
contexts where large datasets are available and the data is complex.
ML , since first time since the invention of FORTRAN and LISP’’
Based on the dataset representation and the approach to defining the
candidate models and final model (or function), three different
categories of ML are identified: supervised, unsupervised, and
reinforcement learning. In supervised learning, an ML model (or a
function mapping inputs to outputs) is constructed using a training
dataset with labels. Classification and regression problems are typical
examples of supervised learning. In unsupervised learning, a function to
describe a hidden structure is inferred from unlabeled data. Common
problems for unsupervised learning are clustering and association rule
learning. In reinforcement learning, an agent learns from a series of
reinforcements, i.e., rewards and punishments. Reinforcement learning
is widely used in video games.
Traditional software and machine learning systems
Engineering traditional software (or conventional software; Druffel and
Little, 1990) is about the implementation of programs (arithmetic and
logic operations, a sequence of if-then-else rules, etc.) explicitly by
engineers in the form of source code. In contrast, in ML systems,
ML algorithms search through a large space of candidate programs,
driven by training experience, to find a program that optimizes the
performance metric (i.e., fulfills the requirements). In other words, they learn
a function that maps from inputs to outputs.
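The contrast can be sketched as follows: a hand-written rule versus a search over candidate rules scored against training examples. The threshold family, the training data, and the names `handwritten_rule` and `learn_rule` are assumptions made for this illustration; real ML algorithms search far larger spaces.

```python
def handwritten_rule(n_links):
    # Traditional software: the engineer encodes the decision explicitly.
    return n_links > 5

def learn_rule(examples, candidate_thresholds):
    """ML-style: search the candidate programs for the one with the best metric."""
    def accuracy(threshold):
        correct = sum((x > threshold) == label for x, label in examples)
        return correct / len(examples)
    best = max(candidate_thresholds, key=accuracy)   # first best-scoring candidate
    return best, accuracy(best)

# Hypothetical training experience: (number of links in an email, is_spam).
examples = [(0, False), (1, False), (2, False), (3, True), (4, True), (7, True)]

best_threshold, best_accuracy = learn_rule(examples, range(0, 8))
handwritten_accuracy = sum(
    handwritten_rule(x) == label for x, label in examples
) / len(examples)
print(best_threshold, best_accuracy, handwritten_accuracy)
```

Here each candidate "program" is the function `x > threshold`, and accuracy on the examples plays the role of the performance metric.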
DL is applicable when large enough datasets are available and is particularly useful for
tasks in complex high-dimensional domains such as face recognition and
machine translation.
Traditional web service or a mobile application development.
SE for ML and ML for SE
The ML community focuses on algorithms and their performance, whereas
the SE community focuses on implementing and deploying
software-intensive systems.
SE for ML refers to addressing various SE tasks for engineering ML
systems, i.e., designing, developing, and maintaining ML-enabled
software systems.
ML for SE refers to applying or adapting AI technologies to address
various SE tasks, such as software fault prediction (Malhotra, 2015),
code smell detection, and reusability metrics prediction.
This paper focuses on SE for ML by systematically reviewing the SE
literature on engineering ML systems,
for evaluating and improving the quality of ML systems, ML systems’
safety and security, testing ML systems, good and bad SE design patterns
for ML systems, standardize ML system development processes…
Research methods
Systematic mapping study (SMS)
Systematic literature review (SLR)
Multiple linear regression (MLR), also known simply as multiple
regression, is a statistical technique that uses several explanatory
variables to predict the outcome of a response variable.
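A minimal sketch of multiple linear regression using the normal equations, written with the standard library only. The data is invented so that the true coefficients are known, and the helper names are hypothetical. In practice a numerical library would normally be used instead of hand-written elimination.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                       # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_multiple_regression(rows, targets):
    """Least-squares fit of y = b0 + b1*x1 + ... + bk*xk."""
    design = [[1.0] + list(r) for r in rows]           # prepend the intercept column
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Invented data: two explanatory variables, response y = 1 + 2*x1 + 3*x2.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
targets = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coefficients = fit_multiple_regression(rows, targets)
print(coefficients)
```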
A. Goal
An SLR approach was adopted to synthesize the knowledge of engineering
ML systems from an SE perspective.
The scope and goal of this study were formulated using the Goal-Question-
Metric approach as follows. Analyze the state-of-the-art in engineering
machine learning systems for the purpose of exploration and analysis with
respect to the reported challenges; proposed solutions; the intensity of the
research in the area; the research methods from the point of view of
software engineering researchers in the context of software engineering.
RQ1. What research methods were used?
RQ2. What application scenarios and datasets were used in experiments and case studies?
RQ3. Which challenges and solutions for engineering ML systems were raised by SE
researchers? A challenge may concern requirements engineering, design, software
development and tools, testing and quality, maintenance and configuration
management, software engineering process and management, or
organizational aspects.
Some examples related to testing are as follows: (1) "Existing testing
methodologies always fail to include rare inputs in the testing dataset and
exhibit low neuron coverage." (Guo et al., 2018); (2) "Deep neural networks
lack an explicit control-flow structure, making it impossible to apply to them
traditional software testing criteria such as code coverage."; (3) "Unlike
software bugs, model bugs cannot be easily fixed by directly modifying
models."
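To make the "neuron coverage" idea in the first quote concrete, the sketch below uses one simple definition: the fraction of neurons whose activation exceeds a threshold for at least one test input. The one-layer ReLU network, its weights, and the inputs are all invented, and the cited papers may define coverage differently.

```python
# Hypothetical one-layer network with three ReLU neurons and two inputs.
WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
BIASES = [0.0, 0.0, 0.0]

def relu_layer(inputs):
    """Return the activation of every neuron for one input vector."""
    return [
        max(0.0, sum(w * x for w, x in zip(ws, inputs)) + b)
        for ws, b in zip(WEIGHTS, BIASES)
    ]

def neuron_coverage(test_inputs, threshold=0.0):
    """Fraction of neurons activated above the threshold by at least one input."""
    activated = set()
    for x in test_inputs:
        for i, activation in enumerate(relu_layer(x)):
            if activation > threshold:
                activated.add(i)
    return len(activated) / len(WEIGHTS)

common_inputs = [[1.0, 2.0], [3.0, 0.5]]
with_rare_input = common_inputs + [[-1.0, -1.0]]   # a rare, all-negative input

print(neuron_coverage(common_inputs), neuron_coverage(with_rare_input))
```

The third neuron only fires for the rare input, so a test set without it leaves that neuron unexercised.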
B. Primary study selection
1. Inclusion and exclusion criteria
To identify the relevant primary studies, the following criteria were applied:
(1) The paper should be about an ML system or component; (2) The paper should
mention at least one challenge on requirements engineering, design, software
development and tools, testing…
2. DB search
I started by applying the database (DB) search method to
identify relevant primary studies. I used the following widely used online databases:
ACM, IEEE Xplore, ScienceDirect, Springer, Google Scholar, and Wiley.
I used two search strings to query the online databases: Query 1: "software
engineering" AND "machine learning"; Query 2: "software engineering" AND "deep
learning".
3. Backward and forward snowballing
To ensure the inclusion of relevant primary
studies, I conducted backward and forward snowballing.
4. Manual search
To enrich the primary study pool, I conducted a manual search in
two top SE conference proceedings (ICSE and ESEC/FSE).
5. De-duplication
I manually entered the metadata of primary studies (e.g., title,
abstract, keywords, publication year, and venue).
6. Quality assessment
To assess the quality of primary studies, I used the quality assessment criteria.
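The two search strings and the de-duplication step can be sketched in code. The notes describe entering metadata manually, so this is only an automated approximation under assumptions: the records are invented, titles are matched instead of full text, and duplicates are detected by normalized title.

```python
# The two queries from the DB search step, as pairs of terms joined by AND.
QUERIES = [
    ("software engineering", "machine learning"),
    ("software engineering", "deep learning"),
]

def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return " ".join(cleaned.split())

def matches_any_query(text):
    normalized = normalize(text)
    return any(all(term in normalized for term in query) for query in QUERIES)

def deduplicate(records):
    """Keep the first record for each normalized title."""
    seen, unique = set(), []
    for record in records:
        key = normalize(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

records = [  # hypothetical search hits from two databases
    {"title": "Software Engineering for Machine Learning", "source": "db_a"},
    {"title": "software engineering for machine-learning", "source": "db_b"},
    {"title": "Software Engineering Meets Deep Learning", "source": "db_a"},
    {"title": "Compiler Design Basics", "source": "db_b"},
]

pool = deduplicate([r for r in records if matches_any_query(r["title"])])
print([r["title"] for r in pool])
```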
C. Data extraction
After selecting the primary studies, I started with the data extraction
phase. I formed an initial data extraction form (Table 6) based on my
RQs.
D. Data synthesis and reporting
I conducted open coding (Miles and Huberman, 1994) to analyze the
challenges and solutions.
Results
Dealing with new types of quality attributes
• Use a model and a measurement method to agree with customers on quality attributes
• Use baselines for specifying quality attributes when possible (such as human performance for a safety-critical task)
Dealing with models
• Use a model-driven-development-based, platform-agnostic framework to generate DL-library-specific code (such as DARVIZ)
• Use a visual tool to develop DL models (such as DeepVisual)
• Use a verification tool (such as NEURODIFF) to develop robust and resource-efficient ML systems involving compressed ML models
• Use a visual tool to support engineers in understanding the structure of neural network models (such as NeuralVis)
• Use a tool (such as FeatureNET) to generate and evaluate DL models
Dealing with the development environment, tools, and infrastructure
• Use Docker images with all desired software pre-installed to prevent discrepancies among development, quality assurance, and production environments
• Develop and use extensions on current IDEs for ML system development, such as Azure ML for Visual Studio Code
• Use a tool (such as DARVIZ) for model abstraction to provide interoperability across platforms
• Use unified APIs (ML API) hiding the complexity of ML libraries
Dealing with data
• Use data verification tools
Testing: Designing test cases
• Use a method/tool for testing DL models, such as DeepEvolution, DeepCT, or DeepConcolic
• Utilize metamorphic testing to test DL systems
• Use a method/tool to generate fault-revealing inputs, such as DeepJanus
• Use a method/tool for generating test inputs for autonomous cars
• Reduce test data volume using an approach such as DeepReduce or PACE
Testing: Evaluating test cases
• Use a coverage criterion appropriate for ML/DL models
• Consider other criteria when the neuron coverage criterion is not sufficient, such as those proposed in DeepGini and DeepImportance
• Measure the quality of test data
Testing: Preparing test data
• Use a tool (such as DLFuzz) to generate adversarial examples without manual labeling effort
• Develop an infrastructure for automated data collection and labeling
• Consider using simulators to generate test data when appropriate
Testing: Executing tests
• Use a tool to execute tests (such as ModelKB)
• Use a differential testing framework to detect potentially inconsistent behavior of ML models in different settings
Testing: Evaluating test results
• Use metamorphic relations to tackle the oracle problem
• Use combinatorial testing to tackle the oracle problem
• Use a program synthesis technique to tackle the oracle problem
Testing: Debugging and fixing
• Use an approach/tool to debug and fix DL models, such as DeepFault, LAMP, MODE, or DARVIZ
• Use an approach to find and localize bugs in DL libraries, such as CRADLE or LEMON
Testing: Automating tests
• Use a systematic technique to automate test case generation, such as DeepTest
Maintenance and configuration management: Dealing with configuration management of data and ML models
• Use a tool for data and ML model configuration management (such as ModelKB)
• Use Software Composition Analysis (SCA) tools to discover all related components of ML frameworks
• Use a single platform (like Kubernetes) to manage all components (data pipelines, ML models, code logic, etc.) of an ML system
Maintenance and configuration management: Dealing with the history of experiments
• Use a proper software architecture suitable for troubleshooting
Maintenance and configuration management: Dealing with re-training and re-deployment
• Apply the canary release approach for risky ML model re-deployments
• Retrain ML models using input data deviating from the distribution of training data
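Several of the testing solutions above rely on metamorphic relations to work around the oracle problem: instead of knowing the correct output, the test checks a relation that must hold between outputs of related inputs. The sketch below is an assumption-laden illustration with an invented bag-of-words spam scorer, not any of the tools named in the table.

```python
import random

# Hypothetical "model": a bag-of-words spam scorer with invented weights.
SPAM_WEIGHTS = {"free": 2.0, "winner": 1.5, "trick": 1.0}

def spam_score(words):
    return sum(SPAM_WEIGHTS.get(w, 0.0) for w in words)

def check_permutation_relation(words, trials=20, seed=0):
    """Relation 1: reordering the words must not change the score."""
    rng = random.Random(seed)
    base = spam_score(words)
    for _ in range(trials):
        shuffled = words[:]
        rng.shuffle(shuffled)
        if abs(spam_score(shuffled) - base) > 1e-9:
            return False
    return True

def check_monotonic_relation(words, extra_word="free"):
    """Relation 2: adding a spam-associated word must not lower the score."""
    return spam_score(words + [extra_word]) >= spam_score(words)

email = ["you", "are", "a", "winner", "claim", "free", "prize"]
permutation_ok = check_permutation_relation(email)
monotonic_ok = check_monotonic_relation(email)
print(permutation_ok, monotonic_ok)
```

Neither check needs to know whether the email really is spam, which is the point of using such relations as a test oracle.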
State of the art
The level of knowledge and development achieved in a technique, science,
etc., especially at present. The adjective form is state-of-the-art (used before a noun).
The state of the art (sometimes cutting edge or leading edge) refers
to the highest level of general development, as of a device, technique, or
scientific field achieved at a particular time. However, in some contexts
it can also refer to a level of development reached at any particular time
as a result of the