1 Introduction
The recent advances of machine learning (ML) in dealing with sophisticated data patterns and the increasing availability of embedded hardware for accelerating ML have triggered interest in studying and implementing industrial Artificial Intelligence of Things (AIoT) [6], which integrates artificial intelligence (AI) with the Internet of Things (IoT) edge. AIoT systems have distributed,
in situ inference and decision capabilities to avoid the handicaps encountered when transmitting data to remote central servers for decision making. However, there is no one-size-fits-all AIoT system that can be used for all industrial applications. The designs and implementations of the AIoT systems in general need to be highly customized based on the specific objectives, operational procedures, and practical constraints of the industrial processes. Many task-specific designs such as the configuration and training of the used ML models still require substantial work to achieve the objectives. The main challenges often come from the deviations of the real-world conditions from the assumptions made by the relevant research. Specifically, the relevant research in general needs a set of clearly defined assumptions to render a satisfactory level of rigor in addressing a specific problem while isolating other problems, but real-world tasks in industrial practices face many coupled problems. Therefore, the design of a working industrial AIoT system requires holistic considerations with many inputs from the domain experts and technicians.
Despite the heterogeneity of industrial AIoT systems, the systematic description of an effort that designs and implements an AIoT system for a specific industrial application can provide insights into understanding the potential challenges that would be faced by other AIoT system designs. In this technical note, we present our recent work of designing an AIoT-based quality control (QC) system that provides an essential function to maintain high-quality products in the manufacturing systems. Specifically, the system aims at improving the QC of the ink cartridge manufacturing lines at the factories of HP Inc. (referred to as HP for short in this technical note). This development includes the key elements of AIoT, including sensing, data analytics, design and deployment of embedded ML models at the IoT computing edge, as well as decision support with associated reasoning for machine health prognostics. We present the motivation, the details of our system design, and the experiences learned from this work that can be useful to the design and implementation of other industrial AIoT systems.
Our target application is HP’s ink extraction testing (IET), which is destructive and accelerated testing on randomly selected samples of the manufactured ink cartridges. It is the final QC procedure which aims at detecting any defective batch in which the ink cartridges’ performance deviates from the specification. In particular, the IET machine (referred to as tester for short in this technical note) extracts the ink from the tested cartridge at a prescribed rate, which is much faster than those on printers, and records the liquid pressure of the ink throughout the course. The profile curve of the liquid pressure versus the volume of the extracted ink provides rich information regarding the performance of the tested ink cartridge. Thus, the match between the recorded profile and a preset template profile is the main criterion to pass the test. The alarms due to detected mismatch are further classified manually by trained technicians. Depending on the manual classification results, further QC actions will be taken. Although IET is critical to all HP’s ink cartridge manufacturing lines, the factories’ current IET procedure faces two main challenges as follows.
First, it is desirable to solidify the technicians’ experience-based approach of manually classifying alarms as a computable classifier for the purpose of QC consistency and knowledge transfer. However, the pressure profiles exhibit a significant degree of variability and the technicians’ manual classification incorporates extensive domain knowledge regarding the internals of the ink cartridges, which may be descriptive and not quantifiable. The attempt to convert the manual classification approach into a computable rule-based classifier results in many questions of how to properly define the features, configure the rules, and set the thresholds.
Second, the operations of the tester inevitably introduce uncertainties that result in false alarms. For example, from the technicians’ experiences, the formation of air bubbles in the tester’s ink tubes is one of the major factors causing false alarms, because a bubble with a sufficiently large volume affects the liquid pressure measurement. Performing a tube flush before each test can largely resolve the issue, but it significantly reduces the testing throughput. From the historical records, the overall alarm rate of the deployed testers is about 30 times the defect rate of the manufactured ink cartridges, suggesting most alarms are false. For quality assurance, upon any alarm, the factories’ current practice is to flush the tester’s tube and perform the destructive test on an additional ink cartridge sample to reconfirm the technician’s manual classification result. Thus, it is desirable to have an approach that can reliably identify false alarms and avoid unnecessary tests.
To address the above two challenges, we designed and implemented an AIoT system that classifies the tester’s alarms into product-induced (i.e., true alarms) and tester-induced (i.e., false alarms). The primary design goal is to achieve high recall and precision in identifying the product-induced and tester-induced alarms. Specifically, our AIoT system has four main components. First, the ML-based profile classifier captures the product engineers’ experiences in classifying the alarms. Second, we develop a heuristic-based anomaly detection (AD) approach that classifies the pressure profiles based on domain knowledge of the patterns contained in the profiles. Third, based on a key observation that the air bubbles are often formed at the joint of the tester’s ink tubes, we deploy a smart camera at the joint and design convolutional neural network (CNN) and computer vision (CV) algorithms that run on the camera to detect and estimate the presence and volume of air bubbles. Fourth, we develop a tester assessment approach that applies statistical learning to estimate the probability that a tester is faulty based on the historical alarm classification results. The outcome supports the decision process of whether maintenance activities should be performed for the concerned tester.
We have deployed our AIoT system in HP’s manufacturing lines. Through controlled experiments, our heuristic-based AD approach achieves a recall of 95.2% in detecting the defective ink cartridges. Moreover, the smart camera can correctly detect the presence of air bubbles in 94% of the testing images. In summary, this technical note presents the design and evaluation processes of the AIoT system and discusses the key experiences and lessons learned from the whole course of the work, which can be useful to the development of other industrial AIoT systems.
The remainder of this technical note is organized as follows. Section 2 reviews related work. Section 3 presents the background of IET and overviews our AIoT system. Sections 4–6 present the designs of ML-based profile classifiers, heuristic-based AD approach, and smart camera, respectively. Section 7 presents deployment and evaluation of the system integrating the components in Sections 4–6. Section 8 presents the statistical learning-based tester assessment. Section 9 discusses the experiences and learned lessons. Section 10 concludes this technical note.
2 Related Work
Challenges in deploying ML and AIoT in Industries: Industrial AIoT is the combination of AI and industrial IoT to improve the level of automation in analyzing and creating useful insights from the industrial sensor data [12]. Deploying an industrial AIoT system often faces challenges in making decisions on the design and implementation of IoT hardware infrastructures (e.g., edge, fog, and cloud) and software components (e.g., ML models) based on the specific objectives and practical constraints of the industrial processes. A number of studies [1, 2, 7, 8, 9] have investigated practical challenges and provided some insights on deploying industrial AIoT systems. Alkhabbas et al. [1] conducted a survey that distributed a questionnaire containing 14 questions about the deployment decisions of IoT systems. Their findings, based on the responses of 66 IoT system designers from 18 countries, show that reliability, performance, security, and cost are the four main factors affecting the designers’ decisions on deploying IoT systems. The studies [2, 7, 8, 9] discuss practical challenges and lessons learned from deploying ML algorithms for various applications. For instance, with experiences in designing analytics platforms at Twitter, Lin and Ryaboy [9] observe that, as the first step, data scientists often spend much effort in understanding and cleansing the collected data before they can design ML models. Budd et al. [2] identify that the lack of training data labels is a key challenge in designing ML models for medical image analysis. As presented in [7], practical ML systems often employ simple ML models such as random forests, decision trees, and shallow neural networks to shorten the deployment time and gain better interpretability. For instance, Haldar et al. [7] report that in the process of applying deep ML models for AirBnB search, after several unsuccessful attempts with complex neural networks, they finally deployed a simple neural network model to simplify the deployment process while providing reasonably good performance. In addition, Hazelwood et al. [8] discuss several key factors that drive the decisions on designing ML models for data center infrastructures at Facebook. Similar to the above studies, this technical note presents our experiences and lessons learned from the design and implementation of an industrial AIoT system. As our work considers different specific objectives, operational procedures, and practical constraints, this technical note provides new insights.
QC in production processes: QC is a set of procedures for determining whether a product meets a predefined set of quality criteria or the customer’s requirements [16]. It also provides the information to determine the need for corrective actions in the manufacturing process. AIoT technologies have been adopted to improve the QC of manufacturing lines. For instance, at Siemens’ electronics plant in Amberg, Germany [14], various ML models and edge computing are used to design a predictive model-based QC framework for testing the quality of printed circuit boards (PCBs). The framework helps improve the recall in detecting defective PCBs and reduce testing overheads. In this technical note, we present the work to develop an industrial AIoT system for improving the QC of the ink cartridge manufacturing lines at HP’s factories.
Our prior work [18] has presented the design of the first three components of the developed AIoT system, i.e., ML-based profile classifiers, heuristic-based AD approach, and smart camera. Based on [18], we make the following new contributions in this article. First, Section 4.2 presents a new profile classification approach based on ensemble learning and Section 4.3 presents a new set of experiments driven by historical data to evaluate all the ML-based profile classifiers incorporated with resampling for addressing the data imbalance issue. Second, Section 8 presents the fourth newly designed component of the statistical learning-based tester assessment approach and the related evaluation.
3 Background, Motivation, and System Overview
In this section, we present the background of the IET and discuss its current problems in practice. Then, we overview the design of our AIoT system for improving the IET.
3.1 IET Background and Problem Statement
As discussed in Section 1, the IET is the final QC process of the ink cartridge manufacturing. Specifically, a number of randomly selected ink cartridge samples are tested using the tester. The tester can run six ink cartridges simultaneously. Figure 1 illustrates how the tubes connect a tested ink cartridge, a stepper motor pump, and a pressure sensor. A transparent plastic Y-joint is used to join the tubes. A workstation computer of the tester controls the stepper motor pump to extract ink from the ink cartridge at a steady volume rate for a certain time duration. Meanwhile, a liquid pressure sensor continuously measures the pressure in the tube and reports the readings to the workstation computer. The resulting curve of the measured liquid pressure versus the volume of the extracted ink is a profile of the tested ink cartridge. The ink cartridges of different models have distinct profiles. Figure 2 shows profile samples of a certain ink cartridge model.
The tester adopts a bound-based detector to assess a measured profile against a template profile with an upper bound and a lower bound. The template profile is defined based on the specification of the ink cartridge. The bound-based detector classifies a profile as normal if the profile lies completely within the belt area between the two bounds; otherwise, the tester classifies the profile as abnormal. To achieve high recall in capturing defective cartridges, the factories’ current practice is to impose stringent bounds. As a result, the tester generates alarms frequently. As mentioned in Section 1, many alarms are actually false. This is because the pressure measurements can be noisy and biased.
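The bound check described above can be illustrated with a minimal sketch. It assumes that the measured profile and both bounds are sampled at the same extracted-volume points; the function name, the template values, and the fixed margin are hypothetical and are not taken from the deployed tester.

```python
from typing import Sequence


def bound_based_detector(profile: Sequence[float],
                         lower: Sequence[float],
                         upper: Sequence[float]) -> str:
    """Return 'normal' if every sample lies within the belt, else 'abnormal'."""
    if not (len(profile) == len(lower) == len(upper)):
        raise ValueError("profile and bounds must have the same length")
    inside = all(lo <= p <= hi for p, lo, hi in zip(profile, lower, upper))
    return "normal" if inside else "abnormal"


# Hypothetical template profile (pressure per volume step) and margin.
template = [-1.0, -1.2, -1.5, -2.0]
margin = 0.3
lower = [t - margin for t in template]
upper = [t + margin for t in template]

print(bound_based_detector([-1.1, -1.3, -1.4, -2.1], lower, upper))
print(bound_based_detector([-1.1, -1.3, -1.0, -2.1], lower, upper))
```

Because a single out-of-belt sample is enough to raise an alarm, tightening the margin raises recall but also makes the detector more sensitive to measurement noise and bias.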
Specifically, the pressure sensing is subject to both endogenous and exogenous noises. Endogenous noises are mainly from the thermal noises of the pressure sensor and the random control errors of the stepper motor pump. Exogenous noises are mainly caused by vibrations and blockage of the ink tubes. The vibration is caused by the movements of nearby human operators and bulky manufacturing machines, while the blockage is caused by the hardening ink residue trapped within the tube. In addition, the tester is subject to the following biases. An improper manual insertion of the tested ink cartridge onto the tester may cause loss of back pressure of the cartridge and deviation from the template profile. An air bubble formed in the tester’s ink tubes with a sufficiently large volume can also affect the pressure sensing.
In the current protocol of the factories, the alarm-triggering profiles will be further classified manually by the technicians into false positives (i.e., tester-induced) and true positives (i.e., product-induced). The manual classifications are based on the technicians’ knowledge received during training and also their own experiences. As such, the classification results may lack high confidence and consistency. To ensure that there is no doubt regarding the QC result of a tested batch, the technicians may need to perform maintenance of the tester and conduct destructive tests with additional samples. A common maintenance performed is to flush the tubes with water to purge out ink and air bubbles at the end of every test. However, the frequent maintenance reduces the IET throughput significantly; the additional destructive tests increase the cost. Therefore, it is desirable to develop a system that can reliably and consistently classify the alarms generated by the bound-based detector, such that all or part of the unnecessary tester maintenance and additional destructive tests can be avoided.
3.2 AIoT System Overview
In this work, we follow the progressive system development methodology to design and implement an AIoT system to replace the factories’ current practice of manually classifying the alarm-triggering profiles into normal and abnormal profiles. During the whole course of designing our AIoT system, we have developed four main components as follows.
(1) ML-based profile classifiers: We design and train several ML-based classifiers to classify the profiles. The training processes are based on historical profiles labeled by the product engineers. Specifically, we design multiple classifiers based on supervised, semi-supervised, and unsupervised ML models. Each classifier takes different features as input to classify a profile. Ensemble methods are also used to integrate the results of the multiple classifiers.
(2) Heuristic-based AD: The ML-based classifiers face the challenge of a limited and imbalanced training dataset. Thus, we also develop a heuristic approach which considers the profile classification as an AD problem. The profiles of good ink cartridges, albeit measured in the presence of noises and biases, should be detected as normal; the profiles of defective cartridges should be detected as abnormal.
(3) Smart camera: From the technicians’ experiences, the formation of an air bubble at the Y-joint of the ink tubes can affect the pressure measurement, which likely leads to false alarms. We design a smart camera system to monitor the Y-joint. It runs a CNN to detect air bubbles and a CV algorithm to estimate the volume of the bubbles. The results are used to assist the profile classifier or the AD algorithm in deciding the nature of any alarm generated by the tester.
(4) Statistical learning-based tester assessment: We develop a tester assessment approach that leverages statistical learning to estimate the probability that a tester is faulty based on the historical alarm classification results. The estimated probability can support making decisions on whether maintenance activities should be performed for the concerned tester. With the assessment support, more false alarms can be prevented proactively.
All computing for the profile classification and bubble detection is executed on a Raspberry Pi single-board computer deployed close to the sensors generating data. Specifically, the Pi is connected directly with the camera and tester to receive the captured images and measured pressure profiles.
5 AD-based Pressure Profile Classifiers
As evaluated in Section 4, the developed ML-based profile classifiers show limitations in achieving high accuracy due to the limited training dataset. In this section, we develop a heuristic approach which treats the profile classification as an AD problem. Specifically, our approach considers the abnormal profiles as outliers which do not follow the expected pattern of the normal profiles. Upon a new profile, a distance-based similarity score between itself and the normal profiles is calculated. The profile is considered abnormal if the score is lower than the threshold. This AD approach provides good interpretability in that it gives information for understanding the classification results. In this section, we present four categories of false alarms and then describe the AD approach.
5.1 Categories of Alarm-triggering Normal Profiles
As mentioned in Section 3, the liquid pressure measurements are subject to various biases due to the human operators and the tester deviations. The biases can cause different patterns of the normal profiles that trigger the bound-based detector. From the product engineers’ domain knowledge and experiences, the normal profiles can be divided into four categories as follows.
Miss-configuration profiles are caused by setting a wrong reference point by the human operator at the beginning of the test. With the wrong reference point, the measured profiles have a similar pattern to the profiles of good ink cartridges. However, they are shifted beyond the belt area between the two bounds of the template profile which is used by the tester to classify the profiles into normal and abnormal. As a result, these miss-configuration profiles trigger false alarms.
Miss-calibration profiles are caused by configuring a wrong gain to scale the sensor’s raw readings to the pressure unit in the calibration process of the pressure sensor.
No-cartridge profiles are collected when the ink cartridges are not inserted properly onto the tester. Without the ink from the cartridge, the motor pump of the tester pulls the air through the tube only. Under this condition, the measured pressure profile is nearly a flat line.
Tube-blocking profiles are measured when the ink tubes are blocked by air bubbles or ink residue. Specifically, the tube-blocking profiles have a liquid pressure drop in the early stage of the extraction due to presence of the air bubbles inside the tube. Then, they quickly increase and recover to the pattern which is similar to a shift-up variation of the normal profile.
5.2 Anomaly Detection
From the technicians’ experiences, the last phase of the profiles often includes pressure measurement fluctuations caused by over extraction, in which the tester’s motor pump still operates when the internal valve of the ink cartridge is already closed. The air gaps traveling through the tube introduce measurement fluctuations that can trigger the bound-based detector. Thus, our AD algorithm excludes such fluctuations from the input profile. Moreover, our experiments in Section 7 show that the over extraction has a strong correlation with the presence of air bubbles in the tube. Thus, we use the presence of an air bubble as an indicator to determine whether the measurement fluctuations are caused by over extraction. Lastly, we apply data analytics methods to extract the features of the normal profiles that are used to distinguish the abnormal profiles as outliers. Specifically, we check whether a testing profile belongs to any of the four categories presented in Section 5.1. If yes, it is normal; otherwise, it is abnormal. The details of the check are as follows.
For the miss-configuration, no-cartridge, and tube-blocking categories, we use the mean subtraction method to normalize the original profile by subtracting its average from its pressure measurements. Dynamic time warping (DTW) distances [3] between all pairs of normalized training profiles in the normal profile category \(i\) are calculated. We define \(\gamma _i\) as the detection threshold for category \(i\) and \(\gamma _i = \mu + 3\sigma\), where \(\mu\) and \(\sigma\) are the mean and standard deviation of the calculated DTW distances. Upon a new profile, we first calculate the DTW distances between itself and all training profiles of the category \(i\). If the mean of the calculated distances is less than \(\gamma _i\), the profile is considered normal in the category \(i\).
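The DTW-based check for one normal category can be sketched as follows. This is a minimal illustration under stated assumptions rather than the deployed implementation: it uses a textbook DTW recurrence with an absolute-difference local cost, the population standard deviation for \(\sigma\), and hypothetical function names and toy profiles.

```python
from itertools import combinations
from statistics import mean, pstdev


def mean_subtract(profile):
    """Normalize a profile by subtracting its average from every sample."""
    avg = mean(profile)
    return [p - avg for p in profile]


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with absolute difference as local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def fit_category(training_profiles):
    """Return normalized training profiles and gamma_i = mu + 3 * sigma."""
    normalized = [mean_subtract(p) for p in training_profiles]
    distances = [dtw_distance(a, b) for a, b in combinations(normalized, 2)]
    # Assumption: sigma is the population standard deviation of pairwise distances.
    gamma = mean(distances) + 3 * pstdev(distances)
    return normalized, gamma


def is_normal_in_category(profile, normalized_training, gamma):
    """A profile is normal if its mean DTW distance to the category is below gamma."""
    x = mean_subtract(profile)
    distances = [dtw_distance(x, t) for t in normalized_training]
    return mean(distances) < gamma


# Toy training profiles for one hypothetical category.
training = [
    [0.0, 1.0, 2.0, 1.0, 0.0],
    [0.0, 1.2, 2.0, 1.0, 0.0],
    [0.0, 1.0, 2.2, 1.0, 0.0],
]
normalized_training, gamma = fit_category(training)

shifted = [3.0, 4.0, 5.0, 4.0, 3.0]   # same shape, constant offset
erratic = [0.0, 3.0, -3.0, 3.0, 0.0]  # does not follow the category pattern
print(is_normal_in_category(shifted, normalized_training, gamma))
print(is_normal_in_category(erratic, normalized_training, gamma))
```

Mean subtraction is what lets a profile with a constant offset, as in the miss-configuration category, still match the training profiles of that category.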
For the miss-calibration category, we use a scale matching method to extract profile features. Each training profile is equally divided into 10 segments and the maximum among the pressure measurements of each segment is determined. The mean and variance of the maximum over the same segment across all training profiles are calculated. For a new profile, we first determine the maximum of its 10 segments, and then compute their scale with respect to the mean and variance obtained from the training profiles. The profile is considered normal if all scales of its 10 segments fall within a suitable range between each other. If the profile is considered normal by the above scale matching approach, we additionally perform the DTW distance-based AD process to confirm whether the profile is normal.
9 Experiences and Learned Lessons
As a systematic attempt to develop an industrial AIoT system for improving the QC of ink cartridge manufacturing, our research has generated experiences and learned lessons that the future industrial practices can consider. The experiences and lessons are summarized as follows.
(1) Classifiers vs. heuristics: In the early stage of our system development, we considered the problem of dividing the profiles into normal and abnormal classes as a classification problem. However, the four ML-based classifiers could not achieve high accuracy in the deployment. The main reason is the limited and imbalanced training dataset, which is also related to the second challenge that we will discuss shortly. Then, we investigated the characteristics of the normal and abnormal profiles. Specifically, the tester often induces stable biases and noises to the pressure measurement of all tested ink cartridges over a certain period of time. The profiles of defective ink cartridges are rare ones which do not follow the pattern of the profiles of good cartridges under the tester-induced noises and biases. Thus, we further designed a heuristic approach that considers the profile classification as an AD problem. Our evaluation results based on the controlled tests show that the AD approach outperforms the ML-based profile classifiers. From our experience, the quality of the training data is crucial to the development of effective ML classifiers. It is often very difficult to achieve satisfactory performance if the data is limited or includes high-variance noises and biases. In such cases, simpler, heuristic solutions (e.g., the AD approach in our case) can be more effective.
(2) Curse from data labeling: The recent advances of ML classifiers are mainly owing to the availability of big labeled training data and standardized hardware acceleration. For the tasks that humans are good at, creating big labeled training datasets is feasible. Manual labeling services (e.g., Google’s [5]) are now established. However, data labeling is very challenging for developing an industrial AIoT system. Such labeling processes cannot be performed by ordinary people based on their instinct and/or basic knowledge. Instead, they require experts’ experience and prior knowledge. In our work, relabeling the pressure profiles is highly non-trivial and requires collaboration with the tester domain experts. In particular, the experts sometimes lack high confidence and consistency in assigning labels to high-variance profiles. This can be solved if they can access meta information about the internals of the tested ink cartridges and the tester’s parameters. However, this meta information was not collected in the historical database. Even if the meta information is available, frequently referring to the detailed meta information inevitably adds overhead to the relabeling process. Eventually, we could only relabel a limited number of profile samples, which led to the poor performance of our ML-based profile classifiers. The use of ML classifiers in our AIoT system is limited to the bubble detection, which is a task that an ordinary person can complete after receiving some simple guidance. From this experience, it is reasonable to argue that the success of applying ML classification to an industrial task highly depends on the availability of sufficient labeled data.
(3) System challenges: Sensor inconsistency and deviation pose challenges for the deployment of industrial AIoT systems in practice. In our system, we use a camera to capture images to train the CNN to detect the air bubbles. A light source was used to provide stable and sufficient illumination for the camera to capture the training images. Then, the trained CNN was deployed to six sets of cameras. However, the trained CNN did not show the same performance on them. This is because the quality of the captured images differs across the six cameras due to the deviation in installation and working condition of the cameras and light sources. Figure 6(a) shows two images captured by two camera sets. We can see that they have different illumination conditions, which affect the performance of the CNN. Moreover, the illumination condition of a certain camera can drift over time due to wear and tear of the light source. Figure 6(b) presents two images captured by the same camera set at the beginning of the deployment and three months later. The light intensity of the light source is weakened. As a result, the CNN cannot correctly detect the air bubbles in the images captured under weakened lighting conditions. Although the dimming was caused by keeping the light on all the time, a practice that was then replaced with on-demand switch-on, long-term wear and tear is inevitable. This calls for new research to obviate the negative impacts of sensor inconsistency and deviation on the performance of AIoT systems. The method proposed in [10] may be promising to address the issues. Specifically, we can model the relationship between the images captured by different cameras or under different controlled illumination levels. Then, we can use the modeled relationship to augment the training dataset. As such, the trained CNN can have the capability to deal with different cameras and illumination levels.
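The augmentation idea can be illustrated with a deliberately simple stand-in. The sketch below is not the method of [10]; it assumes that the relationship between a reference illumination and a weaker one can be approximated by a linear map (gain and offset) on grayscale intensities, fitted by least squares from paired pixels. All names and pixel values are hypothetical.

```python
from statistics import mean


def fit_linear_illumination(reference_pixels, target_pixels):
    """Least-squares fit of target = gain * reference + offset over paired pixels."""
    mx, my = mean(reference_pixels), mean(target_pixels)
    var = sum((x - mx) ** 2 for x in reference_pixels)
    if var == 0:
        raise ValueError("reference pixels must not all be identical")
    cov = sum((x - mx) * (y - my) for x, y in zip(reference_pixels, target_pixels))
    gain = cov / var
    offset = my - gain * mx
    return gain, offset


def augment(image, gain, offset):
    """Map a training image to the modeled illumination, clipped to [0, 255]."""
    return [min(255.0, max(0.0, gain * p + offset)) for p in image]


# Paired grayscale pixels of the same scene under reference and dimmed lighting
# (toy values chosen for illustration only).
reference = [50.0, 100.0, 150.0, 200.0]
dimmed = [35.0, 60.0, 85.0, 110.0]
gain, offset = fit_linear_illumination(reference, dimmed)

training_image = [0.0, 100.0, 255.0]
print(augment(training_image, gain, offset))
```

Augmented copies produced this way would be added to the original training images, so that the CNN sees both illumination conditions during training. A real deployment would likely need a richer model than a single global gain and offset.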