A Flexible State-of-Health Prediction Scheme for Lithium-Ion Battery Packs With Long Short-Term Memory Network and Transfer Learning

The application of machine learning-based state-of-health (SOH) prediction is hindered by the large demand for training data. To overcome this limitation, a flexible and easily transferable SOH prediction scheme for lithium-ion battery packs is developed. First, the charging duration over a predefined voltage range is used as the health feature to quantify capacity degradation. Then, the long short-term memory (LSTM) network and transfer learning (TL) with a fine-tuning strategy are combined to constitute the cell mean model (CMM) for SOH prediction with partial training data. Next, to evaluate the SOH inconsistencies among cells, the LSTM model is employed as the cell difference model (CDM), and the minimum estimated value of the CDM is identified to determine the pack SOH. The experimental results reveal that, even when only the first 360 cycles, about 40% of the whole 904-cycle data set, are used for model training, the resulting estimation algorithm can still predict SOH with an error of less than 3%, thus remarkably reducing the amount of training data and mitigating the computational burden of model training. In addition, the favorable validation results on different types of lithium-ion batteries further demonstrate the extendibility of the proposed strategy.


I. INTRODUCTION
Due to the massive consumption of fossil fuel and pressing concerns over carbon emissions, high-efficiency and environment-friendly electric vehicles (EVs) have become the mainstream of transportation electrification [1], [2]. In EVs, lithium-ion batteries have been widely deployed as the main energy source, in the form of a pack consisting of tens to thousands of cells connected in series, parallel, or mixed topologies [3], [4]. However, as operation time accumulates, battery performance degrades gradually, including capacity decline, resistance increase, and aggravated inconsistency among cells [5]. It is, therefore, critical to reliably estimate the health status of lithium-ion batteries, referred to as state-of-health (SOH) prediction [6]. Accurate knowledge of SOH not only improves the estimation of state of charge (SOC) but also facilitates the safe operation of batteries [7].
Battery SOH represents the ratio of maximum discharging capacity over the rated value [8]. Generally, the prediction methods of SOH can be sorted into three groups: direct calibration methods, filter-based methods, and machine learning-based methods [9]. Direct calibration methods determine battery SOH via specific experimental operations, such as full discharge of the battery after a complete charge [10]. This kind of method is simple and easy to implement. Nevertheless, it is an open-loop approach highly depending on the acquisition accuracy of current and voltage; furthermore, it is impractical to fully charge and discharge the batteries equipped in EVs. To cope with the restrictions of direct calibration methods, filter-based methods are developed, and they treat SOH as one parameter waiting to be identified in the model or one state variable that needs to be estimated in the built observer [11]. Ordinary filter algorithms, such as the extended Kalman filter (EKF) [12], the particle filter [13], and the H-infinity filter [14], have been intensively applied for online SOH estimation. However, the performance of these methods highly relies on the accuracy and robustness of the established battery model or observer.
Machine-learning-based methods have been progressively exploited to predict SOH due to their strong data processing and nonlinear fitting capabilities [15]. These methods usually employ different machine learning algorithms to capture the degradation trend from measurements and generate health features representing SOH variation [16]. Tian et al. [17] reveal that the partial differential temperature variation is related to SOH degradation, and then, the support vector regression (SVR) is exploited to predict SOH based on temperature variation. In [18], the Gaussian process regression (GPR) is introduced to map the relationship between partial capacity increase and SOH. In addition to SVR and GPR methods, the random forest regression (RFR) [19], the radial basis function neural network (RBFNN) [20], the relevance vector machine (RVM) [21], and the long short-term memory (LSTM) network [22] are also successfully leveraged to estimate SOH. Among these machine learning algorithms, the LSTM network, as a typical representative of recurrent neural networks (RNNs), can capture long-term dependence information through its specially designed gate structure and overcome the drawbacks of gradient explosion and gradient vanishing in conventional RNNs. Consequently, the LSTM network is widely adopted in state prediction due to its superior performance in capturing the long-term relationship of historical information [23].
Although the abovementioned approaches can achieve satisfactory SOH prediction, a large amount of training data is indispensable, which inevitably increases the duration of the off-line test. Moreover, when the battery type or the health feature is changed, the predetermined model needs to be reconstructed, or the model parameters need to be retrained, which complicates the algorithm design. Richardson et al. [24] apply GPR to estimate SOH based on two different types of batteries. However, when the GPR model trained on data set 1 is transplanted to estimate the battery SOH described in data set 2, repetitive operations, including reselecting the model parameters and retraining the model, need to be conducted once again. Similar difficulties are also encountered in [19], where the relative incremental capacity within three different voltage ranges is considered a health feature to characterize the SOH variation. However, the built model needs to be retrained with massive cycle data after the voltage range varies. In real-time applications, doing so is time-consuming, material-consuming, and labor-intensive. How to mitigate the burdensome training job or reduce the required amount of data when constructing the SOH model based on machine learning algorithms needs to be further investigated.
Another critical concern for SOH estimation is the inconsistency among battery cells, which is unavoidable in production and usage, especially for EV applications. To the best of our knowledge, there is still a lack of efficient ways to characterize the SOH of battery packs on the basis of the cells' SOH. To diagnose the health state of lithium-ion battery packs, Zheng et al. [25] apply a multistage constant current (CC) charging strategy to accelerate battery pack degradation, and a dual GPR framework is advanced to simultaneously forecast SOH and remaining useful life (RUL). However, the whole pack is treated as an integrated unit without considering the inconsistency among cells. In [26], to identify the inconsistency of cell SOC, a cell mean model (CMM) and a cell difference model (CDM) are, respectively, constructed to describe the overall SOC of the pack and the difference between the cell SOC and the mean value. Nevertheless, few methods have been devoted to the estimation of cell SOH inconsistency in battery packs. It is, therefore, imperative to design a straightforward and efficient pack SOH estimation method to fill this gap in existing research.
Aiming to promote the potential of machine learning-based methods and investigate an efficient pack SOH prediction algorithm, a flexible and transferable SOH prediction scheme is proposed for lithium-ion battery packs. Concretely, an LSTM network integrating a fully connected layer is constructed to learn the long-term dependence of battery cell SOH according to the features generated from partial charging voltage data. On this basis, a transfer learning (TL) method, which improves the learning ability by transferring information from a related domain, is introduced for easy adaptation to the variation of feature voltages existing in different batteries, thereby significantly mitigating the burdensome training task. In addition, to fully consider the inconsistency among battery cells, the CMM and CDM are, respectively, constructed to model the mean SOH level of the pack and the difference of cell SOH, thereby efficiently identifying the pack SOH. Eight battery cells and three types of batteries are tested to verify the effectiveness of the proposed method. Furthermore, more in-depth validations on two different types of lithium-ion batteries justify the extendibility and flexibility of the proposed algorithm.
The remainder of this article is structured as follows. The main methodologies, including LSTM and TL, are introduced in Section II. The experimental tests and the generation process of health features are described in Section III. Next, the prediction procedure of pack SOH is detailed in Section IV. After that, Section V analyzes and discusses the experimental validation results. Finally, Section VI draws the main conclusions.

II. METHODOLOGIES
The objective of this study is to construct a uniform scheme to forecast SOH with partial training data based on LSTM and TL. To attain it, the theoretical basis of both algorithms is detailed first.

A. Long Short-Term Memory Network
The LSTM network, as an extended form of RNN, exploits memory units in place of conventional hidden nodes to avert gradient vanishing or explosion [27]. Fig. 1 exhibits the schematic of LSTM, which includes an input gate, an output gate, and a forget gate [28], [29]. They are in charge of receiving data, outputting estimation, and eliminating unnecessary information, respectively. Another key part of LSTM is the state cell, whose main function is to store a summary of previous input sequences. In addition, the structure of LSTM for regression consists of five layers, i.e., the input layer, the LSTM layer, the dropout layer, the fully connected layer, and the regression output layer [30]. The basic functions of LSTM can be formulated as
$$
\begin{aligned}
i_t &= \mathrm{sigm}\left(W_{xi} x_t + W_{hi} h_{t-1} + b_i\right) \\
f_t &= \mathrm{sigm}\left(W_{xf} x_t + W_{hf} h_{t-1} + b_f\right) \\
o_t &= \mathrm{sigm}\left(W_{xo} x_t + W_{ho} h_{t-1} + b_o\right) \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tanh\left(W_{xC} x_t + W_{hC} h_{t-1} + b_C\right) \\
h_t &= o_t \odot \tanh\left(C_t\right)
\end{aligned}
\tag{1}
$$
where $i$, $f$, $o$, $C$, and $h$ denote the input gate, the forget gate, the output gate, the state cell, and the output data, respectively; $x_t$ is the input at step $t$; $b$ is the bias parameter; $W_{x\cdot}$ and $W_{h\cdot}$, respectively, represent the weight matrices for the input and the previous output; $\odot$ indicates the elementwise product; sigm expresses the sigmoid activation function; and tanh is the hyperbolic tangent function. In (1), the information amount for the cell memory to update, forget, and output its state is determined by the first three equations, and the state cell and output are determined by the last two recursive equations. Herein, the root-mean-square error (RMSE) is applied to construct the loss function, as
$$
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(\mathrm{SOH}_{\mathrm{Ref},k} - \mathrm{SOH}_{\mathrm{Pre},k}\right)^{2}}
\tag{2}
$$
where $\mathrm{SOH}_{\mathrm{Ref}}$ and $\mathrm{SOH}_{\mathrm{Pre}}$ denote the reference value and the predicted value of SOH, and $N$ means the data size of SOH.
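As an illustration of the gate structure described above, the following is a minimal NumPy sketch of one LSTM time step and of the RMSE loss. It assumes the standard LSTM formulation; the function names, the parameter dictionary layout, and the random initialization are hypothetical choices made for this example and are not taken from the article.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid, written 'sigm' in the text."""
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in, n_hidden, seed=0):
    """Hypothetical initializer: gate name -> (W_x, W_h, b)."""
    rng = np.random.default_rng(seed)
    return {g: (rng.normal(0.0, 0.1, (n_hidden, n_in)),
                rng.normal(0.0, 0.1, (n_hidden, n_hidden)),
                np.zeros(n_hidden)) for g in "ifoc"}

def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step under the standard formulation (assumed)."""
    def affine(name):
        W_x, W_h, b = params[name]
        return W_x @ x_t + W_h @ h_prev + b

    i_t = sigmoid(affine("i"))          # input gate
    f_t = sigmoid(affine("f"))          # forget gate
    o_t = sigmoid(affine("o"))          # output gate
    c_tilde = np.tanh(affine("c"))      # candidate cell content
    c_t = f_t * c_prev + i_t * c_tilde  # state cell update (elementwise)
    h_t = o_t * np.tanh(c_t)            # output
    return h_t, c_t

def rmse(soh_ref, soh_pre):
    """Root-mean-square error between reference and predicted SOH."""
    soh_ref = np.asarray(soh_ref, dtype=float)
    soh_pre = np.asarray(soh_pre, dtype=float)
    return float(np.sqrt(np.mean((soh_ref - soh_pre) ** 2)))
```

In a regression setting such as the one described here, the output `h_t` of the last step would be passed through a dropout layer and a fully connected layer to produce the SOH estimate.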

B. Transfer Learning
Existing machine learning methods usually assume that the data obey the same distribution and require a large amount of labeled data to train the model. However, in real applications, the domains of the source data set and the target data set are usually different, and the limited labeled data in the target domain hinder the adoption of machine learning methods. To reuse the knowledge and skills learned from previous tasks and transfer them to novel domains/tasks, TL based on deep neural networks has been proposed and has attracted wide attention [31]. As an efficient data mining framework, TL can transfer the data or features from previous tasks (named the source domain hereinafter) to facilitate the model construction of the target task (called the target domain), thus remedying the information loss caused by data insufficiency. To compensate for the data loss in degradation prediction and improve the SOH estimation precision, a flexible scheme for SOH prediction of different types of batteries is presented by integrating LSTM and TL. Among all TL solutions, fine-tuning a pretrained model on a new data set is the most popular approach to knowledge transfer in the machine learning field [32]. On this account, it is adopted for SOH estimation in this study, and the schematic of TL with the fine-tuning strategy is exhibited in Fig. 2.
The network consists of five layers, where the first layer is the input layer, and the last layer is the output layer. The input on the left and right networks includes, respectively, the source data and the target data. First, the network is pretrained through the source data to obtain the base model. Then, the target data are set as the input variable, and the TL with fine-tuning strategy is activated to adjust the parameters of one middle layer, while the parameters of other layers remain unchanged. In this manner, a new transfer model is constructed. Since only partial parameters of the base model are adjusted, the training speed is much faster than that of retraining the whole model. Simultaneously, the proposed method can cooperatively transfer the cycle data and the formerly trained model of source battery to the newly built model of target batteries for easier SOH prediction. Its strong learning capability can bridge the serviceable information implied in the data and model between two domains to facilitate the prediction model construction of the target domain, thus mitigating the requirement of aging data and obviously improving the training efficiency.
After detailing the principles of LSTM and TL, the related experiments and SOH prediction based on the proposed methods will be investigated.

III. EXPERIMENTS AND FEATURE GENERATION
This section details the aging experiment procedures and specifies the feature generation from the test data for SOH estimation.

A. Aging Experiment
In this study, three different sources of battery data are considered: 1) the nickel cobalt manganese (NCM) data set from our laboratory; 2) the lithium cobalt oxide (LCO) data set from the Center for Advanced Life Cycle Engineering (CALCE), which is an open-access repository supplied by the University of Maryland [33]; and 3) the commercial lithium iron phosphate (LFP) battery data set from the Massachusetts Institute of Technology (MIT) and Stanford University [34]. These three data sets are named data set 1, data set 2, and data set 3, respectively. The specifications of these three batteries are tabulated in Table I. The laboratory batteries are cycled repeatedly according to a CC charge/discharge strategy at room temperature. During the charging process, a 2-A current is imposed in the CC charging step, and 4.15 V is sustained in the constant voltage (CV) charging stage until the current drops to the predefined cutoff value. Then, after a 5-min rest, the test batteries are discharged at a 1-C current, where C denotes the value of the rated capacity in ampere-hours (Ah), to the lower cutoff voltage. The batteries in data set 2 are commercial prismatic cells with a nominal capacity of 1100 mAh. Three cells, referred to as 2-1, 2-2, and 2-3, are aged with a cycling setting similar to that of data set 1, and the charging current is set to 0.55 A. More detailed test steps can be found in [35]. For data set 3, the charging protocol and the chemistry of the batteries are different from those of the first two data sets. The cells are charged with a two-step fast-charging strategy, the C1(Q1)-C2 scheme, in which C1 and C2, respectively, indicate the first and second step currents, and Q1 denotes the SOC at which the current switches from C1 to C2. The second charging step is terminated at 80% SOC, followed by a 1-C CC-CV charge. Finally, a 4-C current is imposed to discharge the battery until its voltage reaches the minimum cutoff value [36]. The experimental data from the aging test are employed to extract battery health features, train the model, and test the developed battery SOH estimation algorithm. First, the extraction process of health features will be introduced.

B. Feature Generation
As discussed above, it is critical to find a representative health feature to properly assess the aging state. The health feature needs to be easy to obtain, simple to calculate, and robust to interference. In EV applications, since driving conditions and driving demands are time-varying and stochastic, the discharging process of lithium-ion batteries is usually stochastic and unforeseeable [36], and it is difficult to apply the discharge data for SOH estimation. Nonetheless, the charging process is generally uniform, as the battery is usually charged from the grid according to specific instructions. Hence, the charging voltage and capacity profiles are designated for feature generation and SOH prediction. Taking data set 1 as an example, the charging voltage profiles under different cycles are plotted in Fig. 3(a), from which it can be seen that the charging time decreases as the cycle number increases. Intuitively, the charging time for a fixed voltage range will be shortened when the battery capacity degrades. On this account, the duration for a fixed voltage range can be deemed a health feature to quantify SOH variation. A full charging voltage profile under the CC scheme is presented in Fig. 3(b), where $t_1$ and $t_2$ denote the moments when the voltage reaches the preset low and high thresholds during the battery charging process, and $X$ indicates the feature vector. According to Fig. 3(b), the detailed feature generation procedures can be summarized as follows.
1) Execute the predefined CC charge strategy for a certain duration and measure the voltage throughout. Since a full charging cycle starting from empty is difficult to acquire in practice, a subset of the acquired points is applied as the input representing the information of one cycle. As such, a definite voltage range based on the test data is marked, and the low and high boundaries are denoted as $V_1$ and $V_2$, respectively.
2) As batteries age, polarization becomes more severe in the form of increased internal resistance, and consequently, the CC charging time decreases. On this account, the charging time over a certain voltage range can be used to measure the battery degradation. To be specific, for each charging curve, the duration $T_i$ in the designated voltage range, i.e., $V_1$ to $V_2$, is considered as a health feature, which can not only capture the degradation of capacity and the increment of resistance but also evaluate the inconsistency of battery cells in a pack to some extent.
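The two steps above can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes a sampled, rising CC charging curve given as parallel lists of time (s) and voltage (V), uses linear interpolation between samples, and the function names are hypothetical.

```python
def crossing_time(times, volts, v_target):
    """Time at which a rising voltage curve first reaches v_target.

    Linear interpolation is used between neighboring samples.
    """
    for k in range(1, len(volts)):
        v0, v1 = volts[k - 1], volts[k]
        if v1 > v0 and v0 <= v_target <= v1:
            frac = (v_target - v0) / (v1 - v0)
            return times[k - 1] + frac * (times[k] - times[k - 1])
    raise ValueError("voltage curve never crosses the target value")

def charging_duration(times, volts, v_low, v_high):
    """Health feature: time spent between v_low and v_high during CC charging."""
    if v_low >= v_high:
        raise ValueError("v_low must be smaller than v_high")
    t1 = crossing_time(times, volts, v_low)
    t2 = crossing_time(times, volts, v_high)
    return t2 - t1

# Hypothetical curves: the aged cell sweeps the same voltages in less time.
volts = [3.5, 3.6, 3.7, 3.8, 3.9]
fresh_times = [0.0, 100.0, 200.0, 300.0, 400.0]
aged_times = [0.8 * t for t in fresh_times]

T_fresh = charging_duration(fresh_times, volts, 3.6, 3.8)
T_aged = charging_duration(aged_times, volts, 3.6, 3.8)
```

With these made-up curves, the aged cell yields a shorter duration than the fresh one, which is the monotonic trend the feature relies on. Note that the partial charge must start below the lower threshold for the feature to be defined.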
In the next step, the SOH prediction scheme will be designed according to the proposed strategy and the identified health features.

IV. BATTERY PACK SOH PREDICTION SCHEME
To avoid overuse of any individual cell in a pack, the minimum SOH among the cells is defined as the SOH of the battery pack. Due to imperfect fabrication techniques in the manufacturing process and inconsistent heat distribution inside the battery pack, the aging speed differs from cell to cell. In this section, a two-part SOH estimation model incorporating the CMM and CDM is developed, in which the CMM describes the mean SOH performance among cells, and the CDM mainly accounts for the difference between the cell SOH and the mean value.

A. Construction of Cell Mean Model
The block diagram of the CMM is depicted in Fig. 4. First, the health features are extracted by the method defined in Section III, and then, the base model is built by the LSTM method to map the relationship between health features and SOH. Finally, the TL method with the fine-tuning strategy is applied to establish the autonomous learning model and attain SOH prediction based on the reduced training data and the varying feature range. The whole prediction scheme mainly includes three steps.
1) Feature Extraction: Partial voltage data are acquired from the voltage curves during the CC charging process to generate the first health feature coefficient, and the specified charging interval for the preset voltage range is calculated as the second health feature factor. Both factors are selected as the input of the base model.
2) Base Model Construction: According to the off-line test results, part of the measurement data is selected as the training data set, and the remaining data are used for validation. The ratio for partitioning the data between training and testing will be discussed in Section V. Then, the LSTM network with five layers is implemented to construct the base model. As is well known, increasing the number of layers complicates the calculation process and raises the computational intensity. However, when the number of layers is too small, the model accuracy deteriorates. Through optimization, the number of layers is set to five as a tradeoff between prediction accuracy and calculation complexity. It is known that ambient temperature and charging/discharging current have significant influences on the aging of lithium-ion batteries; however, it is quite difficult to fully account for the numerous influences of different operating conditions in the long-term prediction of SOH. Hence, in the proposed scheme, a one-step-ahead prediction method is used to predict battery SOH. That is, no matter how much the battery has degraded, it is not necessary to know the previous capacity; as long as the specific characteristic parameter, i.e., the health feature, is extracted, the trained model can be employed to predict battery SOH efficiently. In addition, although only the current health feature is extracted to estimate the SOH, the long-term dependence is incorporated into the SOH estimation through the long-term memory characteristics of the LSTM algorithm. In this manner, the up-to-date aging state of lithium-ion batteries can be estimated comprehensively, and the impact of the working environment on battery degradation can be considered to some extent.
3) Transfer Model Construction: To ensure the flexibility of the proposed SOH prediction method, the TL with fine-tuning strategy is exploited to adjust the parameters of the base model. In this scheme, only the parameters of the bottom two layers, i.e., the fully connected layer and regression output layer, will be updated, and the parameters of other layers will remain unchanged. Note that, as the model parameters need to be adjusted according to the new features or battery types, the labeled data need to be prepared for retraining. In this manner, the transfer model is constructed.
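The fine-tuning idea in step 3 can be illustrated with a deliberately small NumPy example. It is a toy stand-in, not the LSTM base model of the article: a one-hidden-layer regressor is pretrained on synthetic "source" data, and then only its output layer is updated on a few synthetic "target" samples while the hidden layer stays frozen. All data, class names, and hyperparameters below are hypothetical.

```python
import numpy as np

class TinyRegressor:
    """Toy stand-in for the base model: tanh hidden layer + linear output layer."""

    def __init__(self, n_hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 1.0, (1, n_hidden))  # hidden-layer weights
        self.b1 = rng.normal(0.0, 1.0, n_hidden)       # hidden-layer biases
        self.W2 = np.zeros(n_hidden)                   # output-layer weights
        self.b2 = 0.0                                  # output-layer bias

    def predict(self, x):
        x = np.asarray(x, dtype=float).reshape(-1, 1)
        return np.tanh(x @ self.W1 + self.b1) @ self.W2 + self.b2

    def mse(self, x, y):
        return float(np.mean((self.predict(x) - np.asarray(y, dtype=float)) ** 2))

    def fit(self, x, y, lr=0.05, epochs=2000, train_hidden=True):
        """Gradient descent on half the mean squared error.

        With train_hidden=False only the output layer is updated,
        which mimics fine-tuning with the earlier layer frozen.
        """
        x = np.asarray(x, dtype=float).reshape(-1, 1)
        y = np.asarray(y, dtype=float)
        n = len(y)
        for _ in range(epochs):
            H = np.tanh(x @ self.W1 + self.b1)
            err = H @ self.W2 + self.b2 - y
            grad_W2 = H.T @ err / n
            grad_b2 = err.mean()
            if train_hidden:
                # Backpropagate through tanh before the output layer changes.
                dH = np.outer(err, self.W2) * (1.0 - H ** 2)
                self.W1 -= lr * (x.T @ dH) / n
                self.b1 -= lr * dH.mean(axis=0)
            self.W2 -= lr * grad_W2
            self.b2 -= lr * grad_b2
        return self

# Hypothetical normalized data: x is the duration feature, y is SOH.
x_src = np.linspace(0.0, 1.0, 50)
y_src = 0.70 + 0.30 * x_src              # synthetic source battery
x_tgt = np.linspace(0.0, 1.0, 10)
y_tgt = 0.75 + 0.20 * x_tgt ** 1.5       # synthetic target battery, few samples

model = TinyRegressor().fit(x_src, y_src)            # pretrain the base model
frozen_W1, frozen_b1 = model.W1.copy(), model.b1.copy()
mse_before = model.mse(x_tgt, y_tgt)
model.fit(x_tgt, y_tgt, train_hidden=False)          # fine-tune the output layer only
mse_after = model.mse(x_tgt, y_tgt)
```

Because only the output-layer parameters change during the second call, the update is a small convex problem, which is one intuition for why fine-tuning needs fewer labeled target samples and less training time than retraining the whole network.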

B. Construction of Cell Difference Model
Here, the SOH of the first cell in the pack is assumed as the CMM. The feature difference $\Delta T_i$ and the SOH difference $\Delta \mathrm{SOH}_i$ between the $i$th cell and the mean value can be formulated as
$$
\begin{cases}
\Delta T_i = T_i - T_{\mathrm{mean}} \\
\Delta \mathrm{SOH}_i = \mathrm{SOH}_i - \mathrm{SOH}_{\mathrm{mean}}
\end{cases}
\tag{3}
$$
where $T_i$ denotes the duration in the designated voltage range of cell $i$, and $T_{\mathrm{mean}}$ indicates the duration in the designated voltage range of the mean cell. $\Delta \mathrm{SOH}_i$ is negative when $\mathrm{SOH}_i$ is lower than the mean value $\mathrm{SOH}_{\mathrm{mean}}$ and positive when it is higher.
To map the hidden relationship between $\Delta T_i$ and $\Delta \mathrm{SOH}_i$, LSTM is employed to establish the CDM, in which $\Delta T_i$ and $\Delta \mathrm{SOH}_i$ are assigned as the input and output, respectively. After attaining the mean value and the differences, the minimum difference under the same cycle is selected. Then, the minimum cell SOH, i.e., the pack SOH, can be deduced by the inverse transformation of (3). During the estimation process, only the duration for a preset voltage range needs to be determined online, and it is easy to acquire. Meanwhile, the real capacity or SOH of the training set needs to be known in advance for model training, and the real capacity of the test set is treated as the reference to evaluate the model performance.
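The final combination step can be sketched as follows. The sketch assumes that each cell difference is defined as the cell SOH minus the mean SOH, in line with the sign convention stated above (negative when the cell lies below the mean); the function names and the numbers are hypothetical.

```python
def pack_soh(soh_mean, soh_diffs):
    """Pack SOH as the mean-model SOH plus the most negative cell difference."""
    if not soh_diffs:
        raise ValueError("at least one cell difference is required")
    return soh_mean + min(soh_diffs)

def weakest_cell(soh_diffs):
    """Index of the cell with the lowest estimated SOH."""
    return min(range(len(soh_diffs)), key=soh_diffs.__getitem__)

# Hypothetical outputs for one cycle: CMM gives the mean, CDM gives the differences.
soh_mean = 0.90
soh_diffs = [0.01, -0.03, 0.02, 0.00]

soh_pack = pack_soh(soh_mean, soh_diffs)
idx = weakest_cell(soh_diffs)
```

In this made-up cycle, the second cell (index 1) sits 0.03 below the mean, so it sets the pack SOH.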
In this manner, the SOH inconsistency within the battery pack can be well evaluated by the proposed CDM, and the proposed method can effectively distinguish the cell inconsistency and identify the battery pack SOH state. Next, the experimental validation will be conducted and discussed.

V. VALIDATION AND ANALYSIS
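The performance of the proposed scheme for SOH prediction is evaluated through extensive comparisons among different predictors under various operating conditions. To demonstrate the feasibility and advancement of the proposed method, four representative tests are conducted, including the performance comparison of different predictors in the base model with training and testing data sets of different lengths, the portability over other voltage ranges and battery types, and validations on battery packs. The evaluation criteria include average absolute error (AAE), maximum absolute error (MAE), and RMSE; the AAE and MAE are formulated as
$$
\begin{aligned}
\mathrm{AAE} &= \frac{1}{N}\sum_{k=1}^{N}\left|\mathrm{SOH}_{\mathrm{Ref},k} - \mathrm{SOH}_{\mathrm{Pre},k}\right| \\
\mathrm{MAE} &= \max_{1 \le k \le N}\left|\mathrm{SOH}_{\mathrm{Ref},k} - \mathrm{SOH}_{\mathrm{Pre},k}\right|
\end{aligned}
\tag{4}
$$
where $\mathrm{SOH}_{\mathrm{Ref},k}$ and $\mathrm{SOH}_{\mathrm{Pre},k}$ are the reference and predicted SOH of the $k$th sample, and $N$ is the data size.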
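For concreteness, the two criteria named above can be computed as in the short sketch below. Note that MAE here denotes the maximum absolute error, as defined in this section, not the mean absolute error; the function names and sample values are hypothetical.

```python
def average_absolute_error(soh_ref, soh_pre):
    """AAE: mean of the absolute prediction errors."""
    errors = [abs(r - p) for r, p in zip(soh_ref, soh_pre)]
    return sum(errors) / len(errors)

def maximum_absolute_error(soh_ref, soh_pre):
    """MAE as used in this article: largest absolute prediction error."""
    return max(abs(r - p) for r, p in zip(soh_ref, soh_pre))

# Hypothetical reference and predicted SOH values.
soh_ref = [1.00, 0.90, 0.80]
soh_pre = [0.98, 0.90, 0.83]

aae = average_absolute_error(soh_ref, soh_pre)
mae = maximum_absolute_error(soh_ref, soh_pre)
```

By construction the AAE can never exceed the MAE, which is why a small AAE can coexist with an occasional large error in the results that follow.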

A. Comparison of Different Methods on Base Model
Based on the experimental data, the feasibility of the proposed base model is first validated in terms of prediction on data set 1. Another three commonly used SOH predictors, including backpropagation NN (BP-NN), SVM, and least-squares SVM (LS-SVM), are employed to conduct the estimation for performance comparison. The influence of voltage range selection is also addressed in this part. In the three commonly used predictors, the voltage range of 3.6-3.8 V is applied to extract the health feature. In the proposed method, three different ranges are considered for health feature generation. The first two are single voltage intervals, and the third combines two intervals, as presented in Table II. For each prediction scenario, the first 70% of aging profiles are applied for model training, and all four methods are applied to forecast the same degradation data. Moreover, to fully exploit the advantages of each network, the grid search method is leveraged to find the optimal parameters of each model.
The prediction performance of these four methods is compared with the reference curve, and the results are shown in Fig. 5 and Table II. As can be seen, the capacity degradation trend is generally consistent, and no obvious rapid aging process emerges. Note that, because capacity recovery occurs due to the intermittent experiment, the capacity degradation curve is not monotonically decreasing. From Fig. 5 and Table II, it can be observed that all four approaches can effectively track the reference curve during the entire lifetime. The AAEs of the BP-NN, SVM, and LS-SVM are 0.43%, 0.50%, and 0.43%, respectively; in contrast, the largest AAE of the proposed scheme is less than 0.43%. After removing the estimation results in the initial stage, the MAE of the proposed method is much lower than that of the other three algorithms, except when two voltage ranges are selected to extract the health feature. In addition, the RMSEs of the BP-NN, SVM, and LS-SVM algorithms are 0.63%, 0.69%, and 0.60%, respectively, which are clearly higher than that of the proposed method. The running time of the proposed method is evaluated on a desktop computer equipped with an Intel Xeon E3-1230 (3.30 GHz) processor and 32 GB of memory. The calculation time of the proposed method is 3.02 s when using data set 1 to predict SOH. Since SOH is estimated only once per cycle, the average running time per cycle for data set 1 is 3.34 ms. Compared with the duration of one cycle experiment, which lasts for a number of hours, this short estimation time can fully meet the requirements of online implementation. By comparing the prediction results, it can be concluded that different voltage ranges used as health features lead to different estimation performances. It is also clear that the developed method with input data from the 3.6-3.8 V range achieves better prediction performance than with the other voltage ranges. In addition, model inputs with two voltage ranges may increase the model instability and the computational complexity. Therefore, we can infer that the proposed algorithm with five layers and the health feature extracted from a single voltage range is feasible and reasonable.
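As a quick consistency check on the timing figures reported above, and assuming the 3.02 s covers one SOH estimate for each of the 904 cycles of data set 1, the per-cycle average works out as follows:

```latex
\bar{t}_{\mathrm{cycle}} = \frac{3.02\ \mathrm{s}}{904\ \text{cycles}} \approx 3.34 \times 10^{-3}\ \mathrm{s} = 3.34\ \mathrm{ms}
```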

B. Performance Evaluation on Different Voltage Ranges
In previous discussions, an arbitrarily selected voltage range is used to construct the base model. To validate the adaptability of the proposed scheme, the voltage range for health feature extraction is modified to 3.5-4.0 V and 3.5-4.15 V, significantly deviating from the preset voltage range of the base model. Fig. 6 displays the results for different voltage ranges by the proposed method and the single-LSTM method. In this case, the model trained on the first 360 cycles (40% of the experimental data) is utilized to predict the SOH of the remaining 544 cycles. From Fig. 6, it can be found that the predicted SOH is, in general, less accurate than before, and the MAE becomes larger, due to the less suitable voltage range and the smaller training data set. Even so, the estimation performance of the proposed method is still acceptable. However, the SOH curves obtained by the proposed method and the single-LSTM method show different trends: the prediction profile of the former tracks the degradation curve more accurately during the whole process, whereas the latter only follows the reference curve during training and gradually moves away from the true value during the test.
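The split described above is chronological rather than random: the earliest cycles train the model and all later cycles are held out. A minimal sketch of such a split, using the cycle counts quoted in the text and a hypothetical helper name, is given below.

```python
def chronological_split(cycles, n_train):
    """Use the first n_train cycles for training and the rest for testing."""
    if not 0 < n_train < len(cycles):
        raise ValueError("n_train must leave at least one cycle on each side")
    return cycles[:n_train], cycles[n_train:]

# Cycle indices 1..904, with the first 360 cycles used for training.
cycles = list(range(1, 905))
train_cycles, test_cycles = chronological_split(cycles, 360)
train_share = 100.0 * len(train_cycles) / len(cycles)
```

Keeping the split in time order means the test set contains only states of degradation that the model has never seen, which is the situation faced in practice.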
Similar to before, the proposed approach achieves more robust results with less training data and varying feature ranges. Fig. 6(c) shows the boxplot of the prediction errors across all the voltage ranges based on the proposed and individual LSTM methods, and the red lines represent the median value of the error. We can find that the boxplots of the LSTM- and TL-based method for different voltage ranges are flatter, showing that the error distribution of this method is more concentrated. Moreover, the median error line of the LSTM- and TL-based method is closer to zero. The maximum error of the proposed method occurs in the initial cycle, which is included in the training set. However, the maximum error of the individual LSTM method appears in the test set and is higher than 9%. It can be concluded that the proposed scheme outperforms the single-LSTM approach. It can also be observed from the above analysis that the proposed method achieves precise SOH estimation with an MAE of 3.2%. Moreover, this method can reduce the amount of training data by about 30% (about 270 cycles). From the perspective of balancing data preparation and accuracy, it is acceptable to sacrifice slight accuracy in exchange for a large reduction in test data and testing time.

C. Performance Evaluation on Other Batteries
To verify the adaptability and scalability of the proposed scheme to other types of batteries, data sets 2 and 3 specified in Table I are applied to estimate the SOH via the proposed method, and the performance of data set 2 is presented in Fig. 7. Compared with data set 1, the capacity recovery phenomenon of data set 2 is more obvious. With the help of TL with fine-tuning strategy, the model is trained by the data of the first 100 cycles (occupying 18%-20% of the aging data), which is determined through trial and error. The remaining cycles are applied for validation, and the selected voltage interval for feature generation ranges from 3.8 to 4.1 V. Although only 100-cycle data are leveraged to train the model, the developed method leads to satisfactory performance during the entire aging process. In the worst case, the maximum AAE, MAE, and RMSE are 0.82%, 3.9%, and 1.07%, respectively. Note that the MAEs of the cells 2-1 and 2-2 exceed 3%, and it can be seen from Fig. 7(b), (e), and (h) that most of the errors can be restricted within 2.8%. Concretely, the error distribution can be schematically summarized in Fig. 7(c), (f), and (i). As can be found, the zero (or near) error emerges frequently. With the increase in the absolute value of error, the error occurrence probability decreases gradually, being in line with the Gaussian distribution with the mean value of 0.04%, 0.3%, and 0.006% and the variance of 0.0002, 0.00008, and 0.00006. The detailed validation results reveal that the distribution of error is reasonable and manifest that the proposed method can accurately and stably estimate battery SOH, even in the case of obvious size reduction of the training data set. Moreover, the proposed method can effectively reduce the degradation experiment by 285 cycles, which will cost around 855 h in this case. 
This, in turn, highlights the strong timeliness and high efficiency of the proposed framework when it is leveraged to predict the SOH of different batteries, manifesting its adaptability and extendibility.
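The error indexes quoted above can be reproduced from a reference SOH sequence and a predicted one. The following is a minimal sketch, assuming that AAE denotes the average absolute error and MAE the maximum absolute error (an interpretation consistent with the ordering of the reported values, AAE < RMSE < MAE); the function name and the sample values are hypothetical and are not taken from the experiments.

```python
import math
from statistics import fmean, pvariance


def soh_error_metrics(reference, predicted):
    """Return error indexes for SOH predictions (all in the same unit as the inputs).

    Assumption: AAE = average absolute error, MAE = maximum absolute error.
    """
    if not reference or len(reference) != len(predicted):
        raise ValueError("reference and predicted must be non-empty and equal in length")
    errors = [p - r for r, p in zip(reference, predicted)]
    abs_errors = [abs(e) for e in errors]
    return {
        "aae": fmean(abs_errors),                          # average absolute error
        "mae": max(abs_errors),                            # maximum absolute error
        "rmse": math.sqrt(fmean([e * e for e in errors])),
        "mean": fmean(errors),                             # mean of the signed error
        "variance": pvariance(errors),                     # variance of the signed error
    }


# Hypothetical SOH values (fractions of nominal capacity), for illustration only.
reference = [1.00, 0.98, 0.96, 0.94]
predicted = [1.00, 0.99, 0.95, 0.97]
metrics = soh_error_metrics(reference, predicted)
```

The signed-error mean and variance are the two parameters of the Gaussian fit used to describe the error histograms, so a near-zero mean together with a small variance corresponds to the concentrated distributions discussed above.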
After the comparison based on data set 2, data set 3 is further employed to examine the performance of the proposed method. The cathode material of data set 3 is LFP, which is different from the first two types of batteries. Here, only one cell is chosen for validation, and the voltage range of 2.5-3.5 V is selected for health feature generation. The prediction results are sketched in Fig. 8, from which we can find that the general degradation trend resembles a polynomial curve, and hence, an efficient polynomial fitting method may achieve preferable estimation with the cycle number as the input. However, the zoomed-in view in Fig. 8(a) shows that local partial capacity recovery emerges, making it difficult to accurately estimate the SOH by a simple polynomial fitting method. In addition, the SOH value with respect to cycle number is close to a polynomial curve only because the battery in data set 3 is fully charged and discharged during the experiment. In practice, full charge and discharge operations are rarely encountered all the time. Once partial charge and discharge behaviors occur, the number of cycles will increase, while the SOH degradation rate will be slower, and in this case, the polynomial curve fitting method cannot work well. Furthermore, as the research target of this article is to investigate a general and efficient SOH prediction algorithm that can adapt to different types of batteries with a reduced training data size, the polynomial fitting method and the related settings based on data set 3 are not applicable to data sets 1 and 2. Instead, the LSTM network and TL with a fine-tuning strategy based on data set 1 can be transplanted to predict the SOH of data set 3 with preferable and robust estimation. The single-LSTM algorithm is again employed as the baseline to compare the required size of training data. In the two networks, except for the training data, the other parameters remain the same.
Through repetitive optimization, the first 80% (1076 cycles) and 60% (611 cycles) of aging data are, respectively, applied for training the single-LSTM and the present algorithm, and all the data are used to assess the prediction performance. As can be found, the prediction results do not show a distinct difference, and both methods can provide precise predictions. Concretely, the RMSE and AAE by the LSTM are 0.27% and 0.16%, and those by the proposed method remain close, at 0.36% and 0.24%. Nonetheless, the MAE by the proposed method is 0.09%, much lower than that by the LSTM; moreover, the data set size for training the proposed method is 20% less than that for training the single-LSTM algorithm. In other words, with comparable estimation results, 465 cycle tests can be saved by the proposed method in this study, and most of the estimation error can still be controlled within 2.8%, which is only 0.92% higher than that of the base model. Thus, it can be summarized that the proposed algorithm can be transplanted from NCM batteries to LCO and LFP batteries, manifesting its superior extendable capability. The only task that needs to be conducted is to adjust the parameters of the fully connected layer and the output layer. In this manner, the feasibility of the proposed algorithm when applied to different types of batteries is verified.
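The fine-tuning step described above, in which only the fully connected and output layers are adjusted while the remaining layers are kept, can be illustrated with a deliberately simplified sketch. It is not the network used in this study: a small fixed feature map stands in for the pretrained LSTM layers, a single linear layer stands in for the trainable head, and all weights, data, and names are hypothetical.

```python
import math

# Stand-in for the pretrained LSTM layers: fixed (frozen) weights that map the
# health feature to hidden features. The values are illustrative assumptions.
FROZEN_WEIGHTS = ((1.5, -0.5), (-2.0, 1.0))


def frozen_features(x):
    """Hidden features from the frozen base; never updated during fine-tuning."""
    return [math.tanh(w * x + b) for w, b in FROZEN_WEIGHTS]


def predict(head, x):
    """Trainable head: a linear output layer on top of the frozen features."""
    weights, bias = head
    return sum(w * f for w, f in zip(weights, frozen_features(x))) + bias


def mse(head, xs, ys):
    """Mean squared error of the head on the given samples."""
    return sum((predict(head, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fine_tune(head, xs, ys, lr=0.1, epochs=500):
    """Plain gradient descent on the head parameters only."""
    weights, bias = list(head[0]), head[1]
    n = len(xs)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            feats = frozen_features(x)
            err = sum(w * f for w, f in zip(weights, feats)) + bias - y
            for i, f in enumerate(feats):
                grad_w[i] += 2.0 * err * f / n
            grad_b += 2.0 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


# Hypothetical target-battery data: normalized charging-duration feature vs. SOH.
target_x = [0.0, 0.25, 0.5, 0.75, 1.0]
target_y = [0.80, 0.85, 0.90, 0.95, 1.00]

source_head = ([0.0, 0.0], 0.9)  # head as inherited from the source battery
tuned_head = fine_tune(source_head, target_x, target_y)
```

Because the base is frozen, only three head parameters are fitted here, which mirrors why a smaller portion of the aging data can suffice when the model is transplanted to a new battery type.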

D. Performance Evaluation on Battery Pack
Four cells connected in series to form a small pack are tested under the same environmental conditions. These four cells have the same properties as those in data set 1 and are referred to as cells 1-1, 1-2, 1-3, and 1-4, respectively. The acquired data are utilized to identify the SOH by the proposed pack estimation algorithm. In this study, the estimation results of cell 1-1 are used to construct the CMM. Note that the validation in Part A is based on the experiment of cell 1-1, and hence, the estimation results of cell 1-1 are not repeated here. The estimation curves of CDM for cells 1-2, 1-3, and 1-4 are depicted in Fig. 9(a), (d), and (g). It can be clearly found that the CDM of each cell evolves along a different trend. In the first 200 cycles, the differences are not obvious, and the maximum SOH difference is less than 2%. However, as the battery ages, the difference of CDM becomes more and more obvious; by the 440th cycle, the maximum difference exceeds 3%.
The prediction results of cell SOH are sketched in Fig. 9(b), (e), and (h), and their corresponding errors are plotted in Fig. 9(c), (f), and (i) and summarized in Table III, where "LSTM" means the SOH obtained by the single-LSTM algorithm. As can be seen, the prediction curves of the proposed method are closer to the reference profiles than those of the single LSTM. The prediction error of the proposed method for all the cells lies closer to the horizontal axis. Also, the MAE is lower than that of the single-LSTM method, as portrayed in Fig. 9(c), (f), and (i). Moreover, due to the long-term memory characteristics of the single LSTM, its initial state fluctuates noticeably, causing a larger SOH estimation error. By contrast, owing to the LSTM-TL method and CDM, the SOH prediction results of the proposed method do not show a similar phenomenon.
The diagnosis results of battery pack SOH for the single LSTM and the proposed method are shown in Fig. 10 and Table III. As can be seen, the prediction value of the single LSTM is mostly below the reference value, while the estimation value of the proposed method oscillates around the reference. The proposed method thus yields a more reasonable variation track of pack SOH than the single-LSTM method. According to Table III, the evaluation indexes reveal that the proposed method performs more accurately than the single LSTM. To be specific, the AAE, MAE, and RMSE of the proposed method are less than two-thirds, one-third, and three-fifths of those of the single-LSTM method, respectively.
To sum up, the proposed LSTM integrating the TL algorithm and CDM can achieve satisfactory pack SOH estimation. As such, the abuse of battery packs can be effectively avoided by reliably diagnosing the pack SOH, and safe operation of batteries can be guaranteed. Moreover, the prediction procedure of the proposed scheme is executed during the charging phase, indicating its potential for easy application in practice. In addition, in the proposed pack SOH prediction scheme, the LSTM model is employed as the CDM, and the minimum estimation value of CDM is identified to determine the pack SOH. For a battery pack with many cells (for example, tens to thousands), the CDM values of all cells are first compared, and the minimum CDM is selected. Then, the estimation method is exploited to predict the SOH of the entire battery pack. From this point of view, applying the proposed scheme to a pack with more cells increases only the number of CDM comparisons, and thus, the estimation performance of pack SOH by the proposed algorithm is expected to remain the same as that reported in this study.
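The pack-level selection described above can be sketched as follows. The sketch assumes that each cell SOH is represented as the CMM estimate plus that cell's CDM estimate, so that the pack SOH equals the CMM estimate plus the minimum CDM; this additive form, the function name, and the numerical values are assumptions made for illustration.

```python
def pack_soh(cmm_soh, cdm_by_cell):
    """Return (weakest cell label, pack SOH) for one cycle.

    cmm_soh: SOH estimated by the cell mean model (CMM).
    cdm_by_cell: mapping from cell label to the cell difference model (CDM)
        estimate, i.e. that cell's SOH deviation from the CMM (assumed additive).
    """
    if not cdm_by_cell:
        raise ValueError("at least one CDM estimate is required")
    # The cell with the minimum CDM value limits the pack.
    weakest_cell = min(cdm_by_cell, key=cdm_by_cell.get)
    return weakest_cell, cmm_soh + cdm_by_cell[weakest_cell]


# Hypothetical values for a single cycle of a four-cell series pack.
cell, soh = pack_soh(0.90, {"1-2": 0.01, "1-3": -0.02, "1-4": 0.0})
```

The selection is a single pass over the cells, so its cost grows linearly with the number of cells in the pack.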

VI. CONCLUSION
The knowledge of battery pack SOH is of vital importance for safe operation. Machine learning-based methods require a large amount of training data to capture the hidden nonlinear relationships of lithium-ion batteries for SOH prediction. In this study, a flexible LSTM-based battery pack SOH prediction scheme is proposed in combination with the TL strategy. The relative charging time for a preset voltage range is extracted as the health feature to represent the degradation state of lithium-ion battery cells. Then, an improved LSTM network integrating TL is proposed to establish the CMM and is fully substantiated by comparing it with the commonly used SOH prediction algorithms. The validation results based on the proposed model manifest that the cell SOH error is less than 3% with 70% training data. Moreover, the flexibility of the proposed TL algorithm is validated on different voltage ranges and different types of batteries, and the training data can be reduced by 20%-40% without degrading the estimation performance, thereby justifying the remarkable contribution of TL in reducing the training data amount. Based on the developed CDM, the SOH inconsistency existing among battery cells can be effectively taken into account, and the pack SOH is properly determined by the CMM and the minimum value of CDM. With the proposed cell and pack SOH prediction methods, the pack SOH can reasonably follow the variation trend of the minimum SOH of cells, which, in turn, helps to avoid abusive operation more efficiently.
Although a small power battery pack is employed as the validation object in this study, the proposed scheme is also feasible for predicting the SOH of larger battery packs, which will be our validation focus in the next step. In the future, it is imperative to explore the aging mechanism and SOH diagnosis of lithium-ion batteries from the perspective of electrochemical reactions, as well as the long-term prediction of battery SOH, to improve the estimation precision. In addition, the impact of temperature on the performance variation and SOH prediction of lithium-ion batteries also needs to be further investigated.