Code for: Realistic Multi-Fault Diagnostics of Millions-Scale Li-ion batteries with Rapid Unsupervised Learning

Xie, Shaohua

doi:10.5281/zenodo.18327925

Published January 21, 2026 | Version v1

Software Open


Xie, Shaohua¹

1. Harbin Institute of Technology


Abstract

The rapid deployment of battery swapping stations necessitates scalable and reliable fault diagnosis, yet massive, sparse operational data and scarce labeled samples make this challenging. Here, we report a rapid unsupervised learning framework for realistic multi-fault diagnosis in million-scale battery fleets. Our approach employs a double-layer mechanism. First, we rapidly screen for abnormal devices by extracting features from voltage-envelope sequences. Subsequently, we pinpoint faulty cells and types using an enhanced two-stage unsupervised clustering combined with rule-based fault tracing. The framework is validated on a production dataset of over 128,000 devices, achieving 97.33% device-layer and 99.66% cell-layer accuracy. Laboratory tests on recalled batteries further confirm the detection of low-capacity and micro-short-circuit faults. These results demonstrate scalability and robustness under sparse-data conditions, enabling reliable operations for large-scale energy storage systems.
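
The device-layer screening idea can be illustrated with a minimal sketch. The abstract does not state which envelope features or thresholds the framework uses, so everything below is an assumption made for illustration: the envelope is taken as the per-timestep maximum and minimum cell voltage, the only feature is the mean envelope spread, and devices are flagged with a median/MAD robust z-score. The function names, the toy fleet, and the 3.5 cutoff are hypothetical and are not taken from the released code.

```python
from statistics import mean, median


def envelope_spread(cell_voltages):
    """Mean gap between the upper and lower voltage envelope.

    cell_voltages: list of time steps, each a list of per-cell voltages (V).
    The envelope is simply the max and min cell voltage at each step
    (an assumption; the exact features are not given in this README).
    """
    return mean(max(step) - min(step) for step in cell_voltages)


def screen_devices(fleet, cutoff=3.5):
    """Return sorted IDs of devices whose envelope spread is unusually large.

    fleet: dict mapping device ID -> cell_voltages (see envelope_spread).
    cutoff: hypothetical threshold on the modified z-score.
    """
    spreads = {dev: envelope_spread(v) for dev, v in fleet.items()}
    med = median(spreads.values())
    # Median absolute deviation: a scale estimate that outliers barely affect.
    mad = median(abs(s - med) for s in spreads.values())
    if mad == 0:
        return []  # no measurable variation across the fleet
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    # The test is one-sided: only unusually wide envelopes are flagged.
    return sorted(
        dev for dev, s in spreads.items()
        if 0.6745 * (s - med) / mad > cutoff
    )


# Toy fleet: four consistent devices and one with a sagging cell.
fleet = {
    "dev_a": [[3.70, 3.71, 3.70], [3.65, 3.66, 3.65]],
    "dev_b": [[3.70, 3.70, 3.71], [3.66, 3.65, 3.66]],
    "dev_c": [[3.71, 3.70, 3.70], [3.65, 3.65, 3.67]],
    "dev_d": [[3.70, 3.72, 3.70], [3.66, 3.66, 3.65]],
    "dev_e": [[3.70, 3.71, 3.40], [3.65, 3.66, 3.30]],
}
flagged = screen_devices(fleet)
```

A median-based score is used in this sketch because a handful of faulty devices would otherwise inflate a mean and standard deviation computed over the same fleet. The cell-layer clustering and rule-based fault tracing described in the abstract are not covered here.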

Description

This project provides an implementation of a diagnostic framework with the following workflow:

1. Data: Partial raw data samples are stored in the `data/` folder.

2. processedData: Cleaned and transformed data are placed in `processedData/`.

3. Code: All scripts and notebooks for model construction, parameter tuning, performance evaluation, and results analysis are in `code/`.

4. Results: Outputs from the framework are saved in `Result/`.
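
Before running anything, it may help to confirm that the extracted archive has the layout described above. The following is a small sketch, not part of the released code: the folder names are taken from this README, while the helper `missing_dirs` is a hypothetical name introduced here.

```python
from pathlib import Path

# Folder names as listed in this README; the helper itself is hypothetical.
EXPECTED_DIRS = ("data", "processedData", "code", "Result")


def missing_dirs(project_root):
    """Return the expected top-level folders that are absent under project_root."""
    root = Path(project_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling `missing_dirs(".")` from the project root returns an empty list when all four folders are present.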

Usage

1. Navigate to the `code/` folder.

2. Run the `dataProcess.ipynb` and `dataReorganization.ipynb` notebooks, in that order, to transform the raw data into processed data.

3. To generate results, execute `processPredefinedDtaset.ipynb` for the predefined dataset or `processFullDataset.ipynb` for the full dataset.

4. Find outputs in the `Result/` folder.
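
The steps above can also be run without opening the notebooks interactively. The sketch below assumes that Jupyter and `nbconvert` are installed (they are not listed under Requirements) and that the notebooks run unattended from inside `code/`; neither assumption is confirmed by this README. The notebook filenames are copied verbatim from the steps above, and the helper names are hypothetical.

```python
import subprocess

# Notebook names copied verbatim from the usage steps.
PREP_NOTEBOOKS = ["dataProcess.ipynb", "dataReorganization.ipynb"]
RESULT_NOTEBOOKS = {
    "predefined": "processPredefinedDtaset.ipynb",
    "full": "processFullDataset.ipynb",
}


def build_commands(dataset="predefined"):
    """Build one `jupyter nbconvert` command per notebook, in run order."""
    notebooks = PREP_NOTEBOOKS + [RESULT_NOTEBOOKS[dataset]]
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", nb]
        for nb in notebooks
    ]


def run_pipeline(dataset="predefined", code_dir="code"):
    """Execute the notebooks in order, stopping at the first failure."""
    for cmd in build_commands(dataset):
        # check=True raises CalledProcessError if a notebook fails to execute.
        subprocess.run(cmd, cwd=code_dir, check=True)
```

For example, `run_pipeline("full")` would run the two preprocessing notebooks and then `processFullDataset.ipynb`. Note that `--inplace` overwrites each notebook with its executed version.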

Requirements

- Python 3.8+

- Common libraries: `numpy`, `pandas`, `scikit-learn`, `matplotlib`, `seaborn` (add others as needed).
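
A quick environment check along these lines can catch missing dependencies before the notebooks are opened. This is a sketch rather than part of the released code: the helper names are hypothetical, and the module list only covers the libraries named above, so any additional packages the notebooks import would need to be added.

```python
import sys
from importlib.util import find_spec

# Import names for the libraries listed above.
# Note that scikit-learn is imported as `sklearn`.
REQUIRED_MODULES = ["numpy", "pandas", "sklearn", "matplotlib", "seaborn"]


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if find_spec(name) is None]


def python_ok(minimum=(3, 8)):
    """True if the running interpreter meets the stated minimum version."""
    return sys.version_info >= minimum
```

If `missing_modules()` returns a non-empty list, install the corresponding packages (using `scikit-learn` as the package name for `sklearn`) before running the notebooks.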

Files


Files (857.3 kB)

Name	Size
Code_for_Realistic_Multi_Fault_Diagnostics_of_Millions_Scale_Li_ion_batteries_with_Rapid_Unsupervised_Learning.rar (md5:5e6c2d212bb1083c1d0edb8a51a4bec3)	856.3 kB
readme.md (md5:a699b95c9fdd6e2be9f3daf654386a4a)	1.0 kB

