It is vital to detect the safety state and identify faults of the battery pack for the safe operation of electric vehicles. The voltage faults such as over-voltage and under-voltage imply more serious battery faults including short-circuit and thermal runaway. The voltage abnormal fluctuation is a warning signal of short-circuit, over-voltage and under-voltage. This paper proposes a scheme of three-layer fault detection method for lithium-ion batteries based on statistical analysis. The first layer fault detection is based on the thresholds of over-charge and over-discharge of a battery pack. In the second layer, confidence interval estimation is applied to identify risky cells. In the third layer, correlation and variability of all cells in one battery pack are analyzed by using an improved K-means method to identify abnormal voltage fluctuation over a certain period. The validity and feasibility of the proposed method are verified by real vehicle data from the National Big Data Alliance of New Energy Vehicles.

Keywords: Electric vehicle; Lithium-ion battery; Fault diagnosis; Three-layer detection; Confidence interval; Improved K-means
* Corresponding authors at: Department of Mathematics, University of Manchester, Manchester M13 9PL, UK (Y. Han). Collaborative Innovation Center for Electric
Z. Sun et al. Applied Energy 307 (2022) 118172
The power battery faults that trigger thermal runaway (TR) mainly include over-charge, over-discharge, internal short-circuit, and external short-circuit, the root causes of which are electrical abuse, thermal abuse, mechanical abuse, and the interaction between them [6]. To cope with TR, the most intuitive way is to study the triggering mechanism and propagation process of the battery system under different abuse conditions. Faults such as battery short circuit [7], over-charge [8] and over-discharge [9] may cause side reactions in a battery pack, such as solid electrolyte interface (SEI) decomposition, cathode decomposition reaction, electrolyte decomposition reaction, and so on [10]. Lithium-ion batteries may experience abnormal changes of voltage and current and an abrupt rise of temperature during a thermal runaway process [11,12]. Therefore, many researchers diagnose faults by using temperature and voltage data.

Remarkable endeavors have been dedicated to fault diagnosis of batteries. The existing methods are usually divided into two categories: 1) electrochemical model-based and 2) non-model-based [13].

The electrochemical model-based method utilizes battery models to generate residuals and determines whether the system is faulty through residual evaluation [14]. According to how the residual is generated, it is divided into state estimation [15] and parameter estimation [16]. The equivalent circuit model (ECM) is currently the most widely used of the electrochemical models due to its excellent predictive performance. Sidhu et al. [17] proposed extended Kalman filters (EKF) to estimate the terminal voltage of the ECM and generate the residual. Multiple model adaptive estimation (MMAE) is then used to diagnose over-charge and over-discharge faults. Liu et al. [18] adopted a sensor fault detection scheme using an adaptive extended Kalman filter (AEKF) and the second-order ECM. The electrochemical and thermal models have also been used together. Feng et al. [19] built a 3D electrochemical-thermal model to detect the internal short circuit of a large format lithium-ion battery. However, electrochemical-thermal models can hardly meet the needs of online fault diagnosis because of restrictions on the complexity and accuracy of the model [20].

The non-model-based methods, including statistical methods [21-23], machine learning [5,24-26] and signal processing methods [27-31], can diagnose faulty cells and do not require an accurate battery model. Statistical methods are widely employed to depict the battery fault. The methods in [21-23] require that the data obey a Gaussian distribution, which was not verified in the previous literature. Yao et al. diagnosed battery connection faults by a wavelet neural network [5] and a grid search support vector machine [26]. Li et al. [24] combined the long short-term memory neural network (LSTM) and the ECM to predict voltage. The level of voltage abnormality was then defined by the difference between the estimated voltage and the true voltage. The disadvantage of these methods is that they all require a large sample size and a high computation cost.

The entropy-based approach is one of the signal processing methods, and it has been applied to the field of battery fault diagnosis with the advantage of evaluating the similarity of patterns in time series. Through battery connection fault experiments, Shannon entropy was employed to identify cells with abnormal internal resistance and fault voltage [27,28]. Hong et al. [29] applied the improved entropy method to capture over-voltage faults in actual EVs. To identify more types of faults, Shang et al. [30] proposed a modified sample entropy method to diagnose short-circuit and open-circuit faults. Although these methods can be used to diagnose battery system faults and analyze fault levels, they cannot capture real thermal runaway accidents in a timely manner, and thus some fault information is either missed or incorrectly recorded [32]. Furthermore, signal processing methods based on the correlation coefficient have been widely utilized to extract useful features from sample data. Xia [31] and Li [33] put forward the Pearson correlation coefficient (PCC) and the interclass correlation coefficient (ICC) of adjacent battery cells, respectively, to detect short circuit faults. Xia et al. [34] also proposed an interleaved voltage measurement topology for robust differentiation between sensor and cell faults. Based on the same measurement topology, Kang et al. [35] proposed an improved correlation coefficient method to detect multiple faults of a battery pack system, such as short circuit fault, connection fault and sensor fault. These methods are applicable to nonlinear systems, but not suitable for systems with multiple and interactive components. In addition, no criterion is available for choosing the time-window length used to select cells' voltage, which affects diagnostic results directly and leads to poor system robustness for different EVs.

The voltage sensors and the temperature sensors are laid out differently in the actual battery system of EVs due to limitations of space and manufacturing cost. Usually, the voltage values of all cells can be obtained from individual voltage sensors, but not every cell has a temperature sensor [6]. Therefore, the fault diagnosis method proposed in this paper is based on voltage data. This paper proposes a three-layer fault diagnosis scheme of a lithium-ion battery system for EVs. All EV data are collected from the National Big Data Alliance of New Energy Vehicles (NDANEV) in Beijing. Firstly, maximum voltage and minimum voltage are monitored to prevent over-charge and over-discharge. Secondly, the battery cells in a poor health state are screened based on confidence interval estimation. Thirdly, the correlation and variability of voltage curves are measured. Battery cells are then classified based on the correlation and variability parameters using the K-means algorithm. To improve the calculation efficiency of the K-means, the cell with the highest risk is initialized as the clustering center of the abnormal category. Then, the faulty battery cell can be located by the evaluation parameters of the clustering results. The main contributions of this paper are as follows.

(1) We use the Box-Cox transformation approach to normalize the raw voltage data and guarantee that the updated dataset used in the following analysis is Normally distributed, so that confidence interval estimation can be applied to quantify the risk of cells.

(2) The computation process has been significantly simplified since we choose the cell of the max risk as the clustering center of the K-means method.
(3) The methodology proposed in this paper does not require high-cost hardware, such as voltage sensors and circuits of a battery pack, and is therefore more feasible in practice.

(4) The effectiveness of our fault diagnosis approach has been verified through real-world EV data with fault occurrence and thermal runaway accidents.

The content of this paper is structured as follows: Section 2 presents the data source and processing. In Section 3, the detailed flow of the battery fault diagnosis method is proposed. To test the validity of the method, Section 4 discusses two different fault problems of lithium-ion batteries. The conclusions are drawn from this research in Section 5.

2. Data source

Accidents of EVs can weaken the confidence of users and become the main obstacle in the large-scale application of EVs. In order to ensure the safety of EVs, an effective system called China's big data platform for EVs is built to monitor and analyze data. The architecture of the big data platform has been introduced in previous work [36]. The process of the EV data collection used in this work is illustrated in Fig. 1. We utilize the datasets of the NDANEV supported by the big data platform. The data of the battery system include temperature, current, voltage and state of charge (SOC). It is very common to have invalid values and missing values in the original data. Data cleaning, including deduplication and deletion, is an important step in the entire data analysis process.

3. Diagnosis method

A battery system is an extremely complicated and non-linear system. Due to technological limitations in battery manufacturing and usage, the inconsistency of li-ion battery packs is unavoidable. Usually, cells with a more consistent state are grouped into one battery pack. However, battery cells work under complicated charge and discharge conditions for a long time, so the uneven loading current, thermal distribution and depth of discharge (DOD) will further aggravate the inconsistency of the initial parameters such as capacity, internal resistance, self-discharge rate and terminal voltage [37]. During the operation of the battery system, the current is the same for every cell of a series-connected battery pack, and the changing trend of the terminal voltage is theoretically consistent across cells. If a cell has faults, its voltage trend will be different from the others. Thus, through the correlation coefficient and detrended fluctuation analysis, cluster analysis will be used to identify faulty cells. A flow chart of the proposed method is illustrated in Fig. 2. The first layer algorithm is applied every time. Three conditions need to be met before layer 2 and layer 3 are activated. Firstly, abnormal behavior has not been detected by the first layer. Secondly, the number of sampling points in the charge and discharge process needs to be greater than 50. Thirdly, the maximum voltage range in charging/discharging cycles is more than 50 mV.

3.1. The first layer algorithm

The first layer is to prevent over-charge and over-discharge of a battery pack. The cell with the highest voltage reaches the charging cut-off voltage first. Conversely, the cell with the lowest voltage reaches the discharging cut-off voltage first. If the charge/discharge cut-off condition is determined based on the total voltage, individual cells are prone to over-charge or over-discharge faults. Therefore, the two cells with maximum voltage and minimum voltage need to be monitored.

The voltage matrix of all cells from the time t0 to te is represented as

$$
U^{t_1 - t_e} =
\begin{bmatrix}
u_1^{t_1} & \cdots & u_n^{t_1} \\
\vdots & u_i^{t_j} & \vdots \\
u_1^{t_e} & \cdots & u_n^{t_e}
\end{bmatrix}
\tag{1}
$$

where $u_i^{t_j}$ is the voltage of cell i at time $t_j$, j = 1,2,…,e, and n represents the number of all cells. Let $u_{\max}^{t}$, $u_{\min}^{t}$ represent the maximum voltage and
[…] (2)

The thresholds $\delta_{cha}$ and $\delta_{dis}$ of the charge and discharge cut-off voltages are determined according to the battery pack test.

3.2. The second layer algorithm

In the second layer, confidence interval estimation is applied to identify risky cells. Currently, the 3σ screening strategy is widely applied to Normally distributed data [23]. However, a normality test is an essential step before employing the 3σ principle to screen the fault data. The Kolmogorov-Smirnov test (K-S test) is used to test whether data conform to normality assumptions. In this paper, the significance level of the K-S test was 0.05. The voltage data from a normal EV, a total of 398,487 sampling points, were used for the K-S test. The number of samples that conformed to a normal distribution was 128,516, accounting for 32%, while the number of samples that did not conform to a normal distribution was 269,971, accounting for 68%. As a result, improving the normality of the distribution is a key step. In this paper, the Box-Cox transformation is adopted to normalize data [38]. Its form can be expressed as:

$$
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0 \\
\log y, & \lambda = 0
\end{cases}
\tag{3}
$$

where y > 0. Let $y = (y_1, y_2, \ldots, y_n)$ be the data on which the Box-Cox transformation is to be applied. After the Box-Cox transformation of formula (3), the variable y is expressed as

$$
y^{(\lambda)} = \left( y_1^{(\lambda)}, \cdots, y_n^{(\lambda)} \right)
\tag{4}
$$

The value of λ can be estimated by the maximum likelihood method, which can be expressed as:

$$
L(\lambda) = -\frac{n}{2} \cdot \ln\left[ \sum_{i=1}^{n} \frac{\left( y_i^{(\lambda)} - \bar{y}^{(\lambda)} \right)^2}{n} \right] + (\lambda - 1) \cdot \sum_{i=1}^{n} \ln y_i
\tag{5}
$$

where $\bar{y}^{(\lambda)} = \frac{1}{n} \sum_{i=1}^{n} y_i^{(\lambda)}$. For each λ, the corresponding L(λ) can be obtained. According to the relationship between λ and L(λ), the λ with the largest L(λ) is the optimal value. Voltage data are randomly selected at 19:10:31, 2018-05-17, and their frequency distribution histogram and the maximum likelihood value at different λ are shown in Fig. 3. The optimal λ is −20. According to formulas (3) to (5), the normalized voltage data are expressed as:

$$
U_{(\lambda)}^{t_1 - t_e} =
\begin{bmatrix}
u_{1,(\lambda)}^{t_1} & \cdots & u_{n,(\lambda)}^{t_1} \\
\vdots & u_{i,(\lambda)}^{t_j} & \vdots \\
u_{1,(\lambda)}^{t_e} & \cdots & u_{n,(\lambda)}^{t_e}
\end{bmatrix}
\tag{6}
$$

where $u_{i,(\lambda)}^{t_j}$ is the normalized value of voltage for cell i at time $t_j$.

The two-step 95% confidence intervals with Z = 1.96 are used to obtain the risky cell. The first step is to reduce the influence of extreme values on the mean. The second step is to find the data center for criterion building. The steps are as follows:

Step 1: Remove the data of the normalized matrix $U_{(\lambda)}^{t_1 - t_e}$ that lie outside the $1.96\,\sigma_{(1)}^{t_j}$ range at each time $t_j$, and then construct a new normalized data matrix:

$$
V_{(\lambda)}^{t_1 - t_e} =
\begin{bmatrix}
v_{1,(\lambda)}^{t_1} & \cdots & v_{n,(\lambda)}^{t_1} \\
\vdots & v_{i,(\lambda)}^{t_j} & \vdots \\
v_{1,(\lambda)}^{t_e} & \cdots & v_{n,(\lambda)}^{t_e}
\end{bmatrix}
\tag{7}
$$

Step 2: Compute the mean $\mu_{(2)}^{t_j}$ and standard deviation $\sigma_{(2)}^{t_j}$ of all voltage data at each time $t_j$. The risky cell is counted from the time t0 to te by:

$$
F^{t_1 - t_e} = [F_1, \cdots, F_i]
\tag{8}
$$

where $F_i = \sum_{j=1}^{e} f_i^{t_j}$ and

$$
f_i^{t_j} =
\begin{cases}
1, & \left| u_{i,(\lambda)}^{t_j} - \mu_{(2)}^{t_j} \right| \geq 1.96 \cdot \sigma_{(2)}^{t_j} \\
0, & \left| u_{i,(\lambda)}^{t_j} - \mu_{(2)}^{t_j} \right| < 1.96 \cdot \sigma_{(2)}^{t_j}
\end{cases}
$$

Then, the risky frequency of all cells is

$$
p^{t_1 - t_e} = [p_1, \cdots, p_i]
\tag{9}
$$

where $p_i$ is the risky frequency of cell i, and $p_i = F_i / \sum_{i=1}^{n} F_i$. The cell with the maximum of $p^{t_1 - t_e}$ carries the highest risk.

3.3. The third layer algorithm

From the perspective of consistency evaluation, the cells with poor consistency can be given by the second layer, but the cells with abnormal voltage fluctuations in the time dimension are difficult to obtain effectively. Therefore, it is necessary to screen voltage anomalies in the time dimension. The clustering method is used for outlier detection. If the number of data samples in some clusters is much smaller than that of other clusters, and the features of the data in these clusters are also very different from other clusters, the sample points in these clusters are considered outliers. As a common unsupervised machine learning algorithm, the K-means algorithm is improved for faulty cell identification. In this section, the clustering algorithm takes two input characteristics: the correlation parameter and the variability parameter.
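Before moving on to the clustering features, the second-layer screening of Section 3.2 can be illustrated with a small sketch. It is not the authors' implementation: the function names `box_cox`, `box_cox_loglik` and `risky_frequency` are hypothetical, λ is passed in rather than fitted, the population standard deviation is used, and the statistics of Step 2 are assumed to be computed from the values retained in Step 1. The demo voltages are made up.

```python
import math
from statistics import mean, pstdev


def box_cox(y, lam):
    """Box-Cox transform of one positive value, following Eq. (3)."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def box_cox_loglik(ys, lam):
    """Log-likelihood L(lambda) of Eq. (5); the preferred lambda maximizes it."""
    n = len(ys)
    t = [box_cox(y, lam) for y in ys]
    t_bar = mean(t)
    variance = sum((x - t_bar) ** 2 for x in t) / n
    return -n / 2 * math.log(variance) + (lam - 1) * sum(math.log(y) for y in ys)


def risky_frequency(voltages, lam, z=1.96):
    """Two-step confidence-interval screening.

    voltages: list of rows, one row per sampling time, one value per cell.
    Returns (F, p): per-cell risky counts (Eq. 8) and risky frequencies (Eq. 9).
    """
    n_cells = len(voltages[0])
    counts = [0] * n_cells
    for row in voltages:
        u = [box_cox(v, lam) for v in row]
        # Step 1: discard values outside mean +/- z*sigma at this sampling time.
        mu1, sigma1 = mean(u), pstdev(u)
        kept = [x for x in u if abs(x - mu1) <= z * sigma1]
        # Step 2 (assumed reading): recompute centre and spread from the kept values.
        mu2, sigma2 = mean(kept), pstdev(kept)
        if sigma2 == 0:
            continue  # all retained cells identical, nothing to flag
        for i, x in enumerate(u):
            if abs(x - mu2) >= z * sigma2:
                counts[i] += 1
    total = sum(counts)
    freqs = [c / total if total else 0.0 for c in counts]
    return counts, freqs


# Hypothetical demo data: 30 sampling times, 10 cells, cell index 3 sits 50 mV low.
rows = []
for t in range(30):
    row = [3.80 + 0.001 * ((2 * i + 3 * t) % 5 - 2) for i in range(10)]
    row[3] -= 0.05
    rows.append(row)

counts, freqs = risky_frequency(rows, lam=0)
print("highest-risk cell index:", freqs.index(max(freqs)))
```

In this toy data set only the low cell is ever flagged, so it receives the whole risky frequency; with real data the frequency is spread over several cells and the maximum is taken as the highest-risk cell.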
Table 1
Specification of battery packs for models 1, 2 and 3.
Item                            Model 1   Model 2   Model 3
Battery type                    NCM       NCM       NCM
Total energy (kWh)              43.5      53.6      52.56
Battery pack structure          1P95S     1P98S     1P96S
Charge cut-off voltage (V)      4.25      4.25      4.25
Discharge cut-off voltage (V)   3         3         3

Table 2
Detailed information of studied vehicles.
Vehicle number   Model             Fault categories    Fault causes
No.1             Vehicle model 1   Sudden fault        Short circuit
No.2             Vehicle model 1   Sudden fault        Short circuit
No.3             Vehicle model 2   Progressive fault   Poor voltage consistency
No.4             Vehicle model 3   Sudden fault        Short circuit
No.5 ~ 14        Vehicle model 1   Normal              /

… where cov(x, z) is the covariance of variables x and z, $\sigma_x$ and $\sigma_z$ are the standard deviations of variables x and z, and $\bar{x}$ and $\bar{z}$ are the means of variables x and z.

In this paper, the similarity between two voltage curves is measured by the PCC. The variables x and z express the cell voltages of cell i and cell m during sampling time t0 to te. Thus, the PCC of cell i to all other cells is expressed as:

$$
\rho^{t_0 - t_e} =
\begin{bmatrix}
\rho_{1,2} & \cdots & \rho_{1,n} \\
\vdots & \rho_{i,m} & \vdots \\
\rho_{n,1} & \cdots & \rho_{n,n-1}
\end{bmatrix}
\tag{11}
$$

where $\rho_{i,m}$ is the PCC of cell i to cell m and $i \neq m$. The maximum PCC for each cell is expressed as:

$$
R = [r_1, r_2, \cdots, r_n]
\tag{12}
$$

where $r_n$ represents the maximum PCC of cell n to other cells.

… regarded as the center of one cluster.

The ratio of the abnormal cells to the normal cells is defined as $k_a$:

$$
k_a = \frac{n_{abnormal}}{n_{normal}} \cdot 100\%
\tag{17}
$$

where $n_{abnormal}$ and $n_{normal}$ represent the numbers of abnormal and normal cells. The Euclidean distance between the two cluster centers is

$$
d = \mathrm{euc}(\gamma_0, \gamma_1)
\tag{18}
$$

If $k_a$ and d are greater than $k_s$ and $d_s$, k is equal to 2 and the number of abnormal cells will be given. Otherwise, there are no abnormal cells, and k is equal to 1. Through a large amount of verification on other real-world operation data by using the trial-and-error method, the safety thresholds $k_s$ and $d_s$ can be determined as $k_s$ = 80% and $d_s$ = 11.

4. Results of abnormal voltage detection
Fig. 4. The changing curve of (a) current and SOC, (b) temperature and (c) voltages (cell 1 to cell 95) for faulty vehicle 1.

Fig. 5. (a) The risky count of every day and (b) the risky frequency of all cells from July 1 to July 5.

… temperature and cell voltages for July 2019 are shown in Fig. 4. It should be pointed out that the current is negative during charging and positive during discharging in Fig. 4a. The maximum discharge current and charging current are 195.3 A and −144.4 A respectively. At the 1420220th sampling point (14:27:22, 2019-07-05) in Fig. 4b, the maximum temperature increases rapidly and reaches 50 ℃. The voltage data of cells No.1 ~ 95 for faulty vehicle 1 are shown in Fig. 4c. The red dashed line and dash-dotted line represent the upper limit voltage of 4.25 V and the lower limit voltage of the battery of 2.8 V. At the 1420221st sampling point (14:27:32, 2019-07-05), the voltage of No.17 dropped to 0.003 V (less than 2.8 V), so the over-discharge alarm can be given in the first layer. Before the 1420201st sampling point, the voltage of all cells does not exceed the charge/discharge cut-off voltage in Table 1, so the first layer algorithm cannot raise any alarm.

Using the two-step 95% confidence intervals, every risky cell of the faulty vehicle No.1 at each time can be counted. Taking the data of the battery pack from July 1 to July 5 before the fire accident as an example, the frequency of risky cells is counted per day in Fig. 5a. The frequency on July 2 is clearly the highest of the five days. The risky frequency of each cell in these five days is shown in Fig. 5b. The top three risk frequencies are 0.36, 0.35 and 0.23, and the corresponding battery cell numbers are No.76, No.67 and No.26. This indicates that there is poor consistency within the battery pack. The No.76 cell is taken as a cluster center because it has the highest risky frequency.

Fig. 6. Distance between two cluster centers for (a) charging cycle and (c) discharging cycle; ratio between the two categories for (b) charging cycle and (d) discharging cycle of faulty vehicle No.1. (f) The standardized correlation and variability of cell voltages during the 632nd charging cycle.

Fig. 6 shows the fault diagnosis result based on the improved K-means of cell voltages for all charging and discharging cycles. It can be observed that the distance d between the two cluster centers of the 632nd charging cycle is the largest at 13.2, and the distance d for other cycles is less than 11 in Fig. 6a and Fig. 6c. There are three alarms in which the ratio $k_a$ between the two categories exceeds 80%, which occur in the 189th, 614th and 632nd charging cycles respectively in Fig. 6b. Since the result of the 632nd charging cycle satisfies the two conditions $k_a$ > 80% and d > 11 at the same time, it is considered an abnormal segment. As shown in Fig. 6c and Fig. 6d, $k_a$ exceeds 80% at the 633rd discharging cycle, but the distance d between the two cluster centers is less than 11 in all discharging cycles. Therefore, k should be adjusted to 1 and no abnormal fragments are selected from the discharging processes. Voltage data are selected from the 1419923rd to the 1420220th sampling point during the 632nd charging cycle. The corresponding time is from 13:37:42, July 5, 2019, to 14:27:22, July 5, 2019. As shown in Fig. 6f, the cluster centers of the improved K-means method using standardized correlation and variability are C1(−9.6, 8.8) and C2(0.1, −0.1). The Euclidean distance between them is 13.2. The No.17 faulty cell can be accurately identified, and it takes 5 seconds to complete the calculation. The selected 632nd cycle contains 299 sampling points. The values of $k_a$ and d during the 632nd charging cycle can be calculated once the data length is greater than 50. Fig. 7 shows that the No.17 cell can be identified from the 260th sampling point to the 299th sampling point.

Fig. 7. The value of $k_a$ and d of data length from charging start to different endpoints during the 632nd charging cycle.

Fig. 8. Distance between two cluster centers for (a) charging cycle and (c) discharging cycle; the ratio between the two categories for (b) charging cycle and (d) discharging cycle of faulty vehicle No.2.

The faulty vehicle No.2 has the same battery structure as faulty vehicle No.1. Its data include 841 charging cycles and 1030 discharging cycles. Its whole operating time is from June 28, 2018, to December 19, 2019. The fault source of faulty vehicle No.2 is the No.1 cell, which shows a sudden voltage drop during three discharging processes. It is worth noting that the voltages of all cells during the faults do not reach the charge and discharge cut-off voltages. Fig. 8 shows the result of the improved K-means method at different cycles of the faulty vehicle No.2. Because the maximum of d is 6.1 and the maximum of $k_a$ is 37% in Fig. 8a and Fig. 8b, no faulty cell is given during the charging process in the third layer. During the discharging process in Fig. 8c and Fig. 8d, the No.1 cell was screened and identified in the 148th, 291st and 606th cycles.

Fig. 9a to 9c show the voltage curves of the above three discharging processes. In Fig. 9a, a one-off fault takes place and results in a sudden voltage drop of 0.53 V in the No.1 cell voltage curve at the 250th sampling point. At the 407th sampling point in the 291st discharging cycle, the No.1 cell voltage drops from 3.94 V to 3.65 V in Fig. 9b. Compared with the faults during the 148th and 291st cycles, the 606th cycle shows the largest voltage drop, 0.57 V, in Fig. 9c. The No.1 cell voltage recovers rapidly after the one-off fault due to self-repair and internal circuit equalization. In Fig. 9d to 9f, the highest risk cells calculated by the second layer are 83, 83 and 81 respectively. These three cells are respectively used as the initial clustering centers of the abnormal clusters in the 148th, 291st and 606th cycles. The correlation and variability for the three discharging processes are shown in Fig. 9g, 9h and 9i. The distance between the two centers is greater than 11 for all three discharging processes, and the ratio of the abnormal cells to the normal cells is 94%. The calculation times are 4.9 s, 4.8 s and 5.5 s, all within 10 s. In this case, the assumption that all cells are divided into two categories holds, and the final clustering center of the abnormal cluster is the No.1 cell. The proposed method can accurately identify the faulty cell.
Fig. 9. Voltage of cell 1 to cell 95 for faulty vehicle No.2 at (a) 148th, (b) 291st, (c) 606th discharging cycle; the risky frequency of cell 1 to cell 95 for faulty vehicle No.2 at (d) 148th, (e) 291st, (f) 606th discharging cycle; the correlation and variability at (g) 148th, (h) 291st, (i) 606th discharging cycle.

Fig. 10. (a) Distance between two cluster centers and (b) ratio between the two categories for charging cycle of faulty vehicle No.3.

Fig. 11. Distance between two cluster centers for (a) charging cycle and (c) discharging cycle; ratio between the two categories for (b) charging cycle and (d) discharging cycle of vehicle No.5.

4.2. Progressive fault diagnosis

The faulty vehicle No.3 has no thermal runaway but has poor voltage consistency due to the large self-discharge rate of the No.71 cell. It is a different model from faulty vehicle No.1, and the battery pack is composed of 98 cells in series. The selected data cover February 2019 to July 2020 with a total of 2,417,233 sampling points. The maximum voltage range is 0.64 V at the 1150645th sampling point on September 13, 2019. A battery equalization strategy is adopted on October 1, 2019, and the voltage range is reduced from 0.32 V to below 0.1 V. However, the battery gradually deteriorates until the battery pack is replaced on November 24, 2019.

The voltage data of the whole life cycle are within the normal range, so the first layer will not give an alarm. Fig. 10 presents the fault diagnosis result based on the improved K-means of cell voltages for all charging cycles. There are 83 alarms in a total of 504 charging processes when d > 11 and $k_a$ > 80% in Fig. 10a and Fig. 10b. It can be observed that the distance d between the two clusters has four stages: stage 1 (A to B), stage 2 (B to C), stage 3 (C to D) and stage 4 (D to E). In stage 1, the distance d shows a gradually increasing trend from the 100th cycle to the 166th charging cycle, and the maximum d is 13.9. As the running mileage of faulty vehicle No.3 increases, the cycle performance of No.71 gradually deteriorates. In stage 2, the distance d decreases from 13.3 to 2.1, which indicates that the battery equalization is effective. In stage 3, the distance d reaches 13.9 from the 167th to the 179th cycle. This demonstrates that the battery is again in a very poor state of health. In stage 4, no abnormal voltage fluctuation alarms occur after changing the battery pack of faulty vehicle No.3.
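Every cycle analysed above is first screened by the first layer and by the three activation conditions of Section 3 before the second and third layers run. A minimal sketch of that gating is given below. The function names are hypothetical, the default cut-off voltages are the values of Table 1, the strict inequalities are an assumption (the first-layer condition is not reproduced in this text), and the "maximum voltage range" is read as the largest spread between cells at any sampling time of the cycle.

```python
def first_layer_alarm(cell_voltages, charge_cutoff=4.25, discharge_cutoff=3.0):
    """Return 'over-charge', 'over-discharge' or None for one sampling time.

    Only the maximum and minimum cell voltages are inspected.
    """
    if max(cell_voltages) > charge_cutoff:
        return "over-charge"
    if min(cell_voltages) < discharge_cutoff:
        return "over-discharge"
    return None


def layers_2_and_3_enabled(cycle, min_points=50, min_range_v=0.050, **cutoffs):
    """Check the three activation conditions for one charge/discharge cycle.

    cycle: list of rows, one row of cell voltages (in volts) per sampling time.
    """
    # Condition 1: the first layer has not detected abnormal behaviour.
    if any(first_layer_alarm(row, **cutoffs) for row in cycle):
        return False
    # Condition 2: more than 50 sampling points in the cycle.
    if len(cycle) <= min_points:
        return False
    # Condition 3 (assumed reading): the largest cell-to-cell spread at any
    # sampling time exceeds 50 mV.
    max_range = max(max(row) - min(row) for row in cycle)
    return max_range > min_range_v


# Hypothetical three-cell cycles, for illustration only.
normal_cycle = [[3.80, 3.86, 3.82] for _ in range(60)]
short_cycle = normal_cycle[:40]
tight_cycle = [[3.80, 3.82, 3.81] for _ in range(60)]
faulty_cycle = normal_cycle + [[3.80, 0.003, 3.82]]

print(layers_2_and_3_enabled(normal_cycle))
print(first_layer_alarm(faulty_cycle[-1]))
```

For a pack such as that of vehicle No.3, whose voltages stay within the cut-off limits over the whole life cycle, the first condition always passes, so whether the later layers run depends only on the cycle length and the voltage spread.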
Fig. 12. (a) Voltage of cell 1 to cell 96, (b) distance between two cluster centers and (c) ratio between the two categories of faulty vehicle No. 4.
Fig. 13. (a) Voltage curve, (b) clustering parameters of faulty vehicle No.3.

5.1. Reliability analysis based on normal vehicles

Vehicles No.5 ~ 14 are used to verify the reliability of the proposed method. The normal vehicle No.5 is selected to demonstrate the fault diagnosis effect of the method proposed in this article. Its data are from May 28, 2018 to December 27, 2019. The total operating mileage is 75681 km. There are 528 charging cycles and 624 discharging cycles. The first layer will not give an alarm because all voltages are greater than the discharge cut-off voltage and less than the charge cut-off voltage. Fig. 11 shows the distance between the two cluster centers and the ratio for different charging and discharging cycles. As shown in Fig. 11a and Fig. 11c, the maximum distances between the two centers over the charging and discharging cycles of the normal vehicle No.5 are 5.3 and 4.5 respectively, so all distances between the two centers are less than 11. As shown in Fig. 11b and Fig. 11d, the maximum ratios between the two categories over the charging and discharging cycles of the normal vehicle No.5 are 22.8% and 12.6% respectively. Since $k_a$ > 80% and d > 11 cannot be satisfied, the proposed method indicates that there are no faulty cells. The diagnosis result shows that the fault diagnosis method has high accuracy.

5.2. Robustness analysis based on faulty vehicles

The final discharging process of the faulty vehicle No.4 is selected to verify the robustness of the proposed method, that is, that it can be applied to other vehicles. The voltage data of faulty vehicle No.4 are collected over 162 time points. As shown in Fig. 12a, the incident investigation by professional personnel found that the thermal runaway of faulty vehicle No.4 was triggered by the No.72 cell (solid orange line) at the 162nd sampling point. The distance between the two cluster centers and the ratio between the two categories are shown in Fig. 12b and Fig. 12c, from which it can be seen that the distance d exceeds 10 and the ratio $k_a$ reaches 95% at the 121st sampling point. The red dotted line marks the 122nd sampling point, at which the faulty cell is found by $k_a$ > 80% and d > 11. The first layer will give an over-discharge alarm when the voltage of the No.72 cell falls below the lower discharge threshold voltage (3 V) at the 145th sampling point. At the 122nd sampling point, the voltage is still within the safe range, i.e., it does not exceed the upper and lower voltage thresholds, so the abnormality is not picked up by the first layer model. By using the third layer, however, this faulty cell can be accurately identified by our model. If no measures are taken and the voltage data continue to be monitored after the 122nd point, the abnormal voltage is picked up by the first layer model at the 145th sampling point, which is twenty seconds later than the initial detection by the third layer model at the 122nd point.

Fig. 13a shows the voltage of all cells of faulty vehicle No.3 during the 97th charging process with a sampling frequency of 0.1 Hz. The voltage data are from 15:30:31 on July 22, 2019, to 16:55:01 on July 14, 2019. By using the proposed method, two clustering centers C1(−9.7, 8.7) and C2(0.1, −0.1) can be obtained, with $k_a$ = 97% and d = 13.17. The clustering yields two types, and the No.71 cell can be identified as a faulty cell in Fig. 13b. The common K-means calculation time is 0.13 s with 8-step calculations to find the centroid, while the improved method only costs 0.04 s with 2-step calculations. Therefore, the proposed method is more efficient. Meanwhile, the two types are automatically merged into one type when there are no alarms, in which case all cells are normal. This realizes the automatic adjustment of k.

6. Conclusions

This paper has proposed a three-layer fault diagnosis method for the battery system. In the first layer, over-charge and over-discharge of cells can be prevented by monitoring both maximum and minimum voltages. In the second layer, confidence interval estimation based on the voltage at each time serves as a secondary analysis of the poor health state of the battery pack. The cell most frequently selected by our fault screening scheme is considered to carry the highest fault risk. In the third layer, the two parameters of correlation and variability are adopted to capture the abnormal voltage fluctuation. The improved K-means method is used to identify faulty cells based on two diagnosis rules, namely the center distance and the ratio of points of the two categories. The faulty cell with the highest risk in the second layer is taken as the center of the faulty cluster, and this procedure shortens the calculation time significantly, around 3 times better than available methods in the previous literature. The proposed method is capable of accurately detecting different types of faults, such as progressive faults and sudden faults. The effectiveness, reliability and robustness of the proposed fault diagnosis approach have been verified through real-world EV data with fault occurrence and thermal runaway accidents.

Author contribution

Zhenyu Sun: Conceptualization, Methodology, Software, Writing – original draft. Yang Han: Writing – review & editing, Supervision. Zhenpo Wang: Resources, Data curation, Project administration. Yong Chen: Writing – review & editing. Peng Liu: Writing – review & editing, Supervision. Zian Qin: Review & editing. Zhaosheng Zhang: Review & editing. Zhiqiang Wu: Data processing, Formal analysis. Chunbao Song: Data processing.
Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work is supported by the China Scholarship Council (Contract 202006030153), and the Ministry of Science and Technology of the People's Republic of China (Grant No. 2019YFE0107900).
Appendix A
