Published June 3, 2026 | Version v1
Dataset
Open
Handwritten Document Dataset Splits
Authors/Creators
Description
NorHAND-mini is a randomly sampled subset of the original NorHAND dataset created for controlled experiments in handwritten text recognition.
The subset contains:
- Training: 350 pages (10,490 lines)
- Validation: 50 pages (1,576 lines)
- Test: 50 pages (1,491 lines)
Total:
- 450 pages
- 13,557 text lines
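As a quick consistency check, the per-split counts listed above can be summed and compared with the stated totals. The dictionary layout and variable names below are illustrative assumptions; only the numbers come from this description.

```python
# Per-split (pages, lines) counts copied from the description above.
# The dictionary structure itself is only an illustrative assumption.
split_counts = {
    "train": (350, 10490),
    "validation": (50, 1576),
    "test": (50, 1491),
}

total_pages = sum(pages for pages, _ in split_counts.values())
total_lines = sum(lines for _, lines in split_counts.values())

print(total_pages, total_lines)  # 450 13557
```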
The split was generated by random sampling at the page level, so all text lines from a given page fall in the same partition; this preserves writer and page integrity. The exact train/validation/test identifiers used in our experiments are provided to ensure full reproducibility.
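For readers unfamiliar with page-level sampling, the idea can be sketched as follows. This is a minimal illustration only, not the procedure used to create NorHAND-mini: the function name `split_pages`, the seed, and the page identifiers are all hypothetical, and running it will not reproduce the published split. Use the identifiers in the release for that.

```python
import random


def split_pages(page_ids, n_train=350, n_val=50, n_test=50, seed=0):
    """Draw disjoint train/validation/test sets of page IDs at random.

    Sampling whole pages (rather than individual lines) guarantees that
    all lines of a page end up in the same partition.
    """
    n_total = n_train + n_val + n_test
    if n_total > len(page_ids):
        raise ValueError("not enough pages for the requested split sizes")
    rng = random.Random(seed)  # hypothetical seed, for repeatability only
    # Sort first so the result does not depend on the input order.
    sampled = rng.sample(sorted(page_ids), n_total)
    return {
        "train": sampled[:n_train],
        "validation": sampled[n_train:n_train + n_val],
        "test": sampled[n_train + n_val:],
    }


# Hypothetical page identifiers standing in for the full source dataset.
pages = [f"page_{i:04d}" for i in range(1000)]
splits = split_pages(pages)
print({name: len(ids) for name, ids in splits.items()})
```

Line-level counts per partition then follow from the number of lines on each sampled page, which is why the three partitions differ in line count even where their page counts match.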
This release contains only the split definitions and sample identifiers. Original images and annotations remain subject to the licensing and distribution terms of the NorHAND dataset.
Files
| Name | Size | Checksum |
|---|---|---|
| splits.zip | 2.6 kB | md5:3c1657318b04fbf3aea2add7db5a3340 |
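After downloading, the archive can be checked against the MD5 checksum listed for splits.zip. The sketch below uses only the Python standard library; the local path `splits.zip` and the helper names are assumptions, and nothing is assumed about the archive's internal layout.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record for splits.zip.
EXPECTED_MD5 = "3c1657318b04fbf3aea2add7db5a3340"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected


if __name__ == "__main__":
    archive = Path("splits.zip")  # assumed download location
    if archive.exists():
        print("checksum OK" if verify_archive(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```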