Published June 3, 2026 | Version 1.0
Preprint Open

Complementary CNN and Transformer Architectures for HBV Genotype Classification and Genotype-Boundary Interpretation

  • 1. Liver Research Initiative

Description

This record contains the preprint manuscript and the full reproducibility package for a study comparing convolutional neural network (CNN) and Transformer architectures for full-genome genotype classification of hepatitis B virus (HBV) under a leakage-controlled, alignment-aware framework. The CNN reached 99.47% held-out accuracy versus 86.77% for the Transformer (paired exact McNemar test, p = 1.19 x 10^-7). MAFFT-anchored population analysis identified 11 strict and 238 modal B-C divergent positions and 977 positions conserved across genotypes A, B, and C, and yielded candidate genotyping marker panels. The package includes curated Colab notebooks and analysis scripts, the deduplicated GenBank accession list, the reference alignment and train/validation/test split with SHA-256 manifests, figures, supplementary tables, and source result tables. Trained model weights can be reproduced from the notebooks under fixed seeds and are also available on request.
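For readers unfamiliar with the paired comparison reported above, the following is a minimal sketch of a two-sided exact McNemar test on two classifiers evaluated on the same held-out samples. It is not taken from the reproducibility package: the function names `discordant_counts` and `mcnemar_exact_p` and all example labels are illustrative assumptions, and the study's actual discordant counts are not stated in this record.

```python
from math import comb


def discordant_counts(y_true, pred_a, pred_b):
    """Count samples where exactly one of the two models is correct.

    Returns (b, c): b = model A correct and model B wrong,
                    c = model A wrong and model B correct.
    """
    b = c = 0
    for truth, a, other in zip(y_true, pred_a, pred_b):
        a_ok = a == truth
        other_ok = other == truth
        if a_ok and not other_ok:
            b += 1
        elif other_ok and not a_ok:
            c += 1
    return b, c


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the discordant counts.

    Under the null hypothesis the smaller discordant count follows
    Binomial(n = b + c, p = 0.5); the two-sided p-value is twice the
    lower tail probability, capped at 1.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical labels for illustration only (not data from the study).
y_true = ["A", "B", "C", "B", "C", "A"]
cnn    = ["A", "B", "C", "B", "C", "A"]
trf    = ["A", "C", "C", "C", "B", "A"]
b, c = discordant_counts(y_true, cnn, trf)
p_value = mcnemar_exact_p(b, c)
```

Only the discordant pairs enter the test, so samples that both models classify correctly (or both misclassify) do not affect the p-value.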

Files

HBV-VISLM_paperA_26Jun3.pdf

Files (8.0 MB)

Checksum                              Size
md5:e4b455c8e5af52775f886cc608bad031  2.5 MB
md5:6e99fc7b901789ea6bc7b4acfb578964  5.4 MB
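The checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `file_digest` and `matches_checksum` and the file name in the usage comment are assumptions for illustration and are not part of the package. The same helper works for the SHA-256 manifests mentioned in the description if the expected value is given with a `sha256:` prefix.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="md5", chunk_size=1 << 20):
    """Return the hex digest of a file, read in chunks to limit memory use."""
    h = hashlib.new(algorithm)
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against an 'algorithm:hexdigest' string.

    `expected` must carry the algorithm prefix, as in the listing
    above, e.g. 'md5:e4b455c8e5af52775f886cc608bad031'.
    """
    algorithm, _, hexdigest = expected.partition(":")
    return file_digest(path, algorithm) == hexdigest.lower()


# Usage (hypothetical local file name; substitute the downloaded file):
# ok = matches_checksum("downloaded_file.bin",
#                       "md5:e4b455c8e5af52775f886cc608bad031")
```

MD5 is adequate for detecting accidental corruption during transfer; it is not a guarantee against deliberate tampering.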

Additional details

Dates

Issued
2026-06-03
Initial public Zenodo release of the Paper A preprint and reproducibility package.