Complementary CNN and Transformer Architectures for HBV Genotype Classification and Genotype-Boundary Interpretation
Description
This record contains the preprint manuscript and full reproducibility package for a study comparing convolutional neural network (CNN) and Transformer architectures for hepatitis B virus (HBV) full-genome genotype classification under a leakage-controlled, alignment-aware framework. The CNN reached 99.47% held-out accuracy versus 86.77% for the Transformer (paired McNemar exact p = 1.19 x 10^-7); MAFFT-anchored population analysis identified 11 strict and 238 modal B-C divergent positions, 977 positions conserved across A/B/C, and candidate genotyping marker panels. The package includes curated Colab notebooks and analysis scripts, the deduplicated GenBank accession list, the reference alignment and train/validation/test split with SHA-256 manifests, figures, supplementary tables, and source result tables. Trained model weights are reproducible from the notebooks under fixed seeds and available on request.
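The paired McNemar exact test quoted above depends only on the held-out samples where the two models disagree. The following is a minimal sketch of that calculation using the standard two-sided exact binomial form. It is not taken from the package's notebooks, and the function name `mcnemar_exact` and its arguments `b` and `c` are illustrative assumptions.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: samples the first model classified correctly and the second did not.
    c: samples the second model classified correctly and the first did not.
    Under the null hypothesis, each discordant pair falls on either side
    with probability 0.5, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        # No disagreements: there is no evidence of a difference.
        return 1.0
    k = min(b, c)
    # Probability of a split at least as lopsided as the observed one, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the tail for a two-sided test, capped at 1.
    return min(1.0, 2.0 * tail)


if __name__ == "__main__":
    # Hypothetical counts for illustration only; the record does not list them.
    print(mcnemar_exact(24, 0))
```

As an arithmetic check, 24 discordant samples all favouring one model give 2 / 2^24 ≈ 1.19 x 10^-7, which would be consistent with the reported p-value. The record itself does not state the discordant counts, so that split is an inference rather than a documented result.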
Files
HBV-VISLM_paperA_26Jun3.pdf
Total size: 8.0 MB
| MD5 checksum | Size |
|---|---|
| md5:e4b455c8e5af52775f886cc608bad031 | 2.5 MB |
| md5:6e99fc7b901789ea6bc7b4acfb578964 | 5.4 MB |
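The MD5 checksums listed for the two files can be used to confirm that a download completed intact. The sketch below hashes a local file and checks the digest against the two published values. This listing does not show which checksum belongs to which file name, so the code tests membership in the set rather than assuming a mapping. The helper names `md5_of` and `matches_published` are hypothetical.

```python
import hashlib

# MD5 digests exactly as listed in this record's file table.
PUBLISHED_MD5 = {
    "e4b455c8e5af52775f886cc608bad031",
    "6e99fc7b901789ea6bc7b4acfb578964",
}


def md5_of(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path) -> bool:
    """True if the file's MD5 equals either checksum published in the record."""
    return md5_of(path) in PUBLISHED_MD5
```

A call such as `matches_published("HBV-VISLM_paperA_26Jun3.pdf")` would return `True` only for a byte-identical copy of one of the two deposited files. MD5 is adequate here for detecting transfer corruption. The SHA-256 manifests inside the package are the stronger integrity check for the alignment and data split.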
Additional details
Dates
- Issued: 2026-06-03. Initial public Zenodo release of the Paper A preprint and reproducibility package.