Published August 27, 2025 | Version v1
Software Open

scGEN

Description

# scGEN: Single-Cell Gene-Aware Embedded Network

## Introduction

Recent advancements in single-cell RNA sequencing have greatly enhanced our ability to dissect cellular heterogeneity. However, unsupervised clustering often struggles to identify transitional or developmental boundary cells, because existing methods rely on highly variable genes without considering expression levels and therefore overlook subtle but crucial signals.

To address this challenge, we developed **scGEN** (single-cell Gene-aware Embedded Network), which captures complex relationships among cells. scGEN employs adaptive feature weighting and iterative fine-tuning to prioritize ambiguous or transitional cells with overlapping transcriptional profiles.

### Key Features
- Adaptive feature weighting for better cell type identification
- Iterative fine-tuning to capture transitional cell states
- Superior performance on ambiguous cell classification
- Enhanced detection of subtle biological differences

### Performance
Evaluation across eight distinct scRNA-seq datasets demonstrated that scGEN consistently outperformed nine leading clustering approaches. Additionally, scGEN refined the classification of ~10% ambiguous cells and uncovered biologically significant differences, providing a more comprehensive view of cellular heterogeneity in the human fetal pituitary than existing methods.

## Installation

```bash
git clone https://github.com/hurlab/scGEN.git
cd scGEN
```

## Data Preparation

scGEN accepts input data in `.mat` (MATLAB) format. You can convert your data to this format in MATLAB with the provided `csv2mat.m` script.

## Usage

### Step-by-step workflow:

1. **Select HVGs**: Use the `hvgs2csv.py` script in the scGEN directory to filter the normalized data down to the top 2000 highly variable genes.

2. **Create .mat file**: Use the `csv2mat.m` file in the scGEN directory to create a `.mat` file in MATLAB.

3. **Place your data**: Put your `.mat` file in the `dataset` folder under the scGEN directory.

4. **Run scGEN**: Execute the main training script:
   ```bash
   python3 train.py
   ```
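
The HVG selection in step 1 can be illustrated with a minimal standard-library sketch. It ranks genes by plain population variance across cells, which is an assumption made for illustration; `hvgs2csv.py` may use a different (for example dispersion-based) criterion. The function name `top_variable_genes` and its arguments are hypothetical.

```python
import statistics


def top_variable_genes(expr, gene_names, n_top=2000):
    """Return the names of the n_top genes with the largest variance.

    expr is a list of rows (one per cell); each row holds one normalized
    expression value per gene, in the same order as gene_names.
    NOTE: plain variance ranking is an assumption, not necessarily the
    criterion implemented in hvgs2csv.py.
    """
    n_genes = len(gene_names)
    variances = []
    for j in range(n_genes):
        column = [row[j] for row in expr]
        variances.append(statistics.pvariance(column))
    # Rank gene indices from most to least variable (stable for ties).
    ranked = sorted(range(n_genes), key=lambda j: variances[j], reverse=True)
    # Keep the selected genes in their original column order.
    keep = sorted(ranked[:n_top])
    return [gene_names[j] for j in keep]


# Toy matrix: 3 cells x 3 genes (values are illustrative only).
toy_expr = [[1.0, 0.0, 5.0], [1.0, 2.0, 5.0], [1.0, 4.0, 9.0]]
selected = top_variable_genes(toy_expr, ["A", "B", "C"], n_top=2)
print(selected)
```

Gene `A` is constant across the toy cells, so it is the one dropped when `n_top=2`.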

### Data Download
You can download example datasets from: https://zenodo.org/uploads/16949673

## Hyperparameter Configuration

scGEN utilizes two key hyperparameters:
- **α**: Balances the contributions of the Regularized ZINB loss and the structure-guided hard-sample contrastive loss functions
- **γ**: Adjusts the attention weight assigned to hard samples in the learning process

### Best-performing Parameters by Dataset

Based on extensive parameter sensitivity analyses (α: 0.01-100, γ: 1-5), the optimal parameters for benchmark datasets are:

| Dataset      | γ (gamma) | α (alpha) |
|--------------|-----------|-----------|
| Bell         | 1         | 1         |
| hrvatin_B1   | 1         | 1         |
| hrvatin_B2   | 1         | 1         |
| pbmc3k       | 4         | 0.1       |
| Savas        | 4         | 0.1       |
| Scala        | 2         | 1         |
| Schwalbe     | 4         | 100       |
| zhang        | 4         | 10        |
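
For scripting, the table can be kept as a small lookup that falls back to α=1, γ=1 for datasets it does not list. This is only a convenience sketch: the names `BEST_PARAMS` and `params_for` are hypothetical and not part of scGEN, and how the values are passed to `train.py` is not shown here.

```python
# Best-performing settings copied from the table above.
BEST_PARAMS = {
    "Bell": {"gamma": 1, "alpha": 1},
    "hrvatin_B1": {"gamma": 1, "alpha": 1},
    "hrvatin_B2": {"gamma": 1, "alpha": 1},
    "pbmc3k": {"gamma": 4, "alpha": 0.1},
    "Savas": {"gamma": 4, "alpha": 0.1},
    "Scala": {"gamma": 2, "alpha": 1},
    "Schwalbe": {"gamma": 4, "alpha": 100},
    "zhang": {"gamma": 4, "alpha": 10},
}

# Starting point for datasets that are not in the table.
DEFAULT_PARAMS = {"gamma": 1, "alpha": 1}


def params_for(dataset):
    """Return a copy of the (gamma, alpha) settings for a dataset name."""
    return dict(BEST_PARAMS.get(dataset, DEFAULT_PARAMS))


print(params_for("pbmc3k"))
print(params_for("my_new_dataset"))
```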

### Parameter Tuning Guidelines

1. **Start with default parameters**: α=1, γ=1
2. **If results are unsatisfactory**:
   - Adjust γ for better hard-sample mining
   - Modify α based on dataset complexity
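
One way to organize such a search is to enumerate candidate (α, γ) pairs with the defaults first. The sketch below assumes log-spaced α values and integer γ values inside the ranges explored in the sensitivity analysis (α: 0.01-100, γ: 1-5); the exact grid points and the helper name `candidate_settings` are assumptions, not part of scGEN.

```python
import itertools

# Assumed grid points inside the reported ranges (alpha: 0.01-100, gamma: 1-5).
ALPHAS = [0.01, 0.1, 1, 10, 100]
GAMMAS = [1, 2, 3, 4, 5]


def candidate_settings(alphas=ALPHAS, gammas=GAMMAS, default=(1, 1)):
    """Return (alpha, gamma) pairs to try, with the default pair first."""
    grid = list(itertools.product(alphas, gammas))
    rest = [pair for pair in grid if pair != default]
    return [default] + rest


for alpha, gamma in candidate_settings()[:3]:
    print(f"alpha={alpha}, gamma={gamma}")
```

Each pair would then be applied to a training run and compared on the reported metrics; how the values are set in `train.py` depends on the script itself.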

## Output and Results

The output file `result.csv` contains performance metrics (ACC, NMI, ARI, and F1) for each dataset across 20 runs, including the two best-performing seeds with their average and standard deviation values.
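
Per-dataset averages and standard deviations can be recomputed from per-run rows with the standard library. The sketch below assumes a layout with one row per run and columns named `dataset`, `ACC`, `NMI`, `ARI`, and `F1`; these column names are hypothetical, so adjust them to the actual header of `result.csv`. The sample values are toy numbers, not scGEN results.

```python
import csv
import io
import statistics

METRICS = ("ACC", "NMI", "ARI", "F1")


def summarize(csv_file, dataset_col="dataset"):
    """Return {dataset: {metric: (mean, std)}} from one-row-per-run CSV data.

    The column names are assumptions about the file layout.
    """
    runs = {}
    for row in csv.DictReader(csv_file):
        per_metric = runs.setdefault(row[dataset_col], {m: [] for m in METRICS})
        for m in METRICS:
            per_metric[m].append(float(row[m]))
    summary = {}
    for name, per_metric in runs.items():
        summary[name] = {
            # Sample standard deviation needs at least two runs.
            m: (statistics.mean(v), statistics.stdev(v) if len(v) > 1 else 0.0)
            for m, v in per_metric.items()
        }
    return summary


# Toy input standing in for result.csv (illustrative values only).
sample = io.StringIO(
    "dataset,seed,ACC,NMI,ARI,F1\n"
    "toy,0,0.5,0.4,0.3,0.5\n"
    "toy,1,0.7,0.6,0.5,0.7\n"
)
for metric, (mean, std) in summarize(sample)["toy"].items():
    print(f"{metric}: mean={mean:.3f}, std={std:.3f}")
```

To run it on a real file, pass an open handle instead of the toy input, for example `summarize(open("result.csv", newline=""))`.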

## Contact

For questions or issues, please contact guokai8@gmail.com or open an issue on GitHub.
