Published February 25, 2026 | Version 2

A dataset for 19th-century French land registry pages classification

  • 1. Laboratoire en Sciences et Technologies de l'Information Géographique pour la ville intelligente et les territoires durables
  • 2. EPITA
  • 3. Centre de Recherches Historiques

Description

Overview of the initial register page classification dataset

This dataset was prepared to train classification models for pages of 19th-century French land registries. It covers only pages from initial registers (états de sections in French). The images were provided by the Val-de-Marne Archives and relate to two former French départements, the Seine and the Seine-et-Oise, which surrounded Paris before 1968. These registers use three main layouts, all of which are represented here.

The dataset contains 1046 single pages (1016 in version 1), split into training, validation, and test sets with stratification by page type.

It is structured according to the YOLO classification format.
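As a quick sanity check after download, the folder tree can be counted and compared with the tables in this record. The sketch below assumes the usual YOLO classification layout (one folder per split, one sub-folder per class, images inside); the list of image extensions is also an assumption, not something stated in this record.

```python
from collections import Counter
from pathlib import Path

# Assumed image extensions; adjust to whatever the archive actually contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def count_images(dataset_root):
    """Return {split: Counter({class_name: n_images})} for a root/split/class/image tree."""
    counts = {}
    root = Path(dataset_root)
    for split_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        per_class = Counter()
        for class_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            # Count only files whose extension looks like an image.
            per_class[class_dir.name] = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
        counts[split_dir.name] = per_class
    return counts
```

Calling `count_images` on the extracted dataset folder and summing each split's counter should reproduce the split sizes listed in this record if the layout assumption holds.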

The metadata_v2.json file gives more information on the pages of the dataset (version 2).

The additionnal_pages.json file describes the pages added in version 2.

📁 Usage

This dataset is intended for training a YOLO-v11 classification model.
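A minimal training sketch with the Ultralytics Python package could look as follows. It is not the authors' training script: the pretrained checkpoint name `yolo11n-cls.pt`, the epoch count, and the image size are illustrative assumptions, and `dataset_root` is a hypothetical path to the extracted dataset folder.

```python
def build_train_args(dataset_root, epochs=20, imgsz=224):
    """Collect training arguments; epochs and imgsz are illustrative defaults only."""
    return {"data": str(dataset_root), "epochs": epochs, "imgsz": imgsz}


def train_classifier(dataset_root, weights="yolo11n-cls.pt", **overrides):
    """Fine-tune a YOLO11 classification checkpoint on the dataset folder."""
    # Imported lazily so the helper above stays usable without the third-party package.
    from ultralytics import YOLO

    model = YOLO(weights)  # assumed checkpoint; any YOLO11 "-cls" weights would do
    args = {**build_train_args(dataset_root), **overrides}
    return model.train(**args)
```

For example, `train_classifier("path/to/dataset", epochs=50)` would train on the train split and validate on the val split, assuming the folder follows the layout Ultralytics expects for classification.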

Differences with version 1

Version 2 extends the dataset by 30 pages. The train subset is exactly the same as in version 1. In the val and test subsets, pages were manually added to the least represented classes (ets_couv and ets_resume) to strengthen the evaluation.

📊 Dataset Splits

Split Count Diff. with V1
Train 745 =
Val 137 +14
Test 164 +16
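The version 1 split sizes are not listed explicitly, but they follow from the table by subtraction. The short check below uses only the numbers given in this record; the variable names are illustrative.

```python
# Split sizes in version 2 and pages added relative to version 1 (from the table).
V2_COUNTS = {"train": 745, "val": 137, "test": 164}
ADDED_IN_V2 = {"train": 0, "val": 14, "test": 16}

# Version 1 sizes are the version 2 sizes minus the additions.
v1_counts = {split: V2_COUNTS[split] - ADDED_IN_V2[split] for split in V2_COUNTS}

total_v2 = sum(V2_COUNTS.values())
total_v1 = sum(v1_counts.values())

print(v1_counts)            # {'train': 745, 'val': 123, 'test': 148}
print(total_v1, total_v2)   # 1016 1046
```

The totals agree with the 1016 and 1046 pages quoted in the description, and the difference is the 30 added pages.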

🏙️ Grouped by municipality

Communes are distinct across splits, which is useful for testing geographic generalization, since land registers were produced for each municipality under the supervision of its département.

Train Set

Commune Count
BONNEUIL 8
BRY 28
CHARENTON 20
CHEVILLY 4
CHOISY 16
CRETEIL 10
FONTENAY 8
FRESNES 20
GENTILLY 68
IVRY 72
JOINVILLE 28
LEPERREUX 24
LEPLESSIS 52
LHAY 31
MAISONSALFORT 20
MAROLLES 24
NOGENT 40
ORMESSON 24
SAINTMANDE 48
SAINTMAURICE 12
SANTENY 8
SUCY 8
VALENTON 12
VILLENEUVESTG 8
VILLIERS 16
VINCENNES 76
VITRY 60

Val Set

Commune Count
ARCUEIL 22
CHAMPIGNY 8
MANDRES 20
PERIGNY 16
RUNGIS 20
SAINTMAUR 15
VILLECRESNES 12
VILLEJUIF 24

Test Set

Commune Count
ABLON 8
ALFORTVILLE 33
BOISSY 16
LIMEIL 14
NOISEAU 16
ORLY 32
THIAIS 21
VILLENEUVELEROI 24
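The claim that no commune appears in more than one split can be checked directly from the three tables. The sketch below copies the commune counts from this record; the helper name `shared_communes` is illustrative.

```python
TRAIN = {
    "BONNEUIL": 8, "BRY": 28, "CHARENTON": 20, "CHEVILLY": 4, "CHOISY": 16,
    "CRETEIL": 10, "FONTENAY": 8, "FRESNES": 20, "GENTILLY": 68, "IVRY": 72,
    "JOINVILLE": 28, "LEPERREUX": 24, "LEPLESSIS": 52, "LHAY": 31,
    "MAISONSALFORT": 20, "MAROLLES": 24, "NOGENT": 40, "ORMESSON": 24,
    "SAINTMANDE": 48, "SAINTMAURICE": 12, "SANTENY": 8, "SUCY": 8,
    "VALENTON": 12, "VILLENEUVESTG": 8, "VILLIERS": 16, "VINCENNES": 76,
    "VITRY": 60,
}
VAL = {
    "ARCUEIL": 22, "CHAMPIGNY": 8, "MANDRES": 20, "PERIGNY": 16,
    "RUNGIS": 20, "SAINTMAUR": 15, "VILLECRESNES": 12, "VILLEJUIF": 24,
}
TEST = {
    "ABLON": 8, "ALFORTVILLE": 33, "BOISSY": 16, "LIMEIL": 14,
    "NOISEAU": 16, "ORLY": 32, "THIAIS": 21, "VILLENEUVELEROI": 24,
}


def shared_communes(*splits):
    """Return the communes that occur in more than one of the given splits."""
    seen, shared = set(), set()
    for split in splits:
        names = set(split)
        shared |= seen & names  # anything already seen in an earlier split
        seen |= names
    return shared


print(shared_communes(TRAIN, VAL, TEST))  # set()
print(sum(TRAIN.values()), sum(VAL.values()), sum(TEST.values()))  # 745 137 164
```

The per-commune counts also add up to the split sizes, so the commune tables and the split table are mutually consistent.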

🔠 Stratification by type of page 

Train Set

Class Count
ets_couv 15
ets_recap_inter 198
ets_resume 30
ets_tab_p1 413
ets_tab_p2 89

Val Set

Class Count Diff. with V1
ets_couv 9 +6
ets_recap_inter 38 =
ets_resume 11 +8
ets_tab_p1 57 =
ets_tab_p2 22 =

Test Set

Class Count Diff. with V1
ets_couv 11 +10
ets_recap_inter 27 =
ets_resume 10 +6
ets_tab_p1 86 =
ets_tab_p2 30 =
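The class tables show a marked imbalance, which is easier to see as proportions. The sketch below computes the share of each class within each split from the counts in this record; the function name `class_shares` is illustrative.

```python
CLASS_COUNTS = {
    "train": {"ets_couv": 15, "ets_recap_inter": 198, "ets_resume": 30,
              "ets_tab_p1": 413, "ets_tab_p2": 89},
    "val": {"ets_couv": 9, "ets_recap_inter": 38, "ets_resume": 11,
            "ets_tab_p1": 57, "ets_tab_p2": 22},
    "test": {"ets_couv": 11, "ets_recap_inter": 27, "ets_resume": 10,
             "ets_tab_p1": 86, "ets_tab_p2": 30},
}


def class_shares(counts):
    """Return {split: {class: fraction of that split's pages}}."""
    shares = {}
    for split, per_class in counts.items():
        total = sum(per_class.values())
        shares[split] = {name: n / total for name, n in per_class.items()}
    return shares


for split, per_class in class_shares(CLASS_COUNTS).items():
    line = ", ".join(f"{name}: {share:.1%}" for name, share in per_class.items())
    print(f"{split}: {line}")
```

In the train split, ets_couv accounts for 15 of 745 pages and ets_tab_p1 for 413 of 745, so per-class metrics are likely to be more informative than overall accuracy when evaluating a model on this dataset.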

🎨 Color Distribution

Digitisation colour mode of the images (colour or greyscale)

Split COLOR GREY
train 473 272
val 50 87
test 71 93

Type of register

AV_1822_NB : Register produced before 1822
AP_1822 : Register produced after 1822 (except the renovation campaign of the Seine département)
RECTIFICATION_1835 : Renovation campaign of the Seine département (circa 1835-1840)

Split AV_1822_NB AP_1822 RECTIFICATION_1835
train 212 302 231
val 64 0 73
test 74 36 54
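The colour and register-type tables can be cross-checked against the split sizes, and the same pass reveals which register types are absent from a split. The sketch below uses only the numbers in this record; the helper names are illustrative.

```python
SPLIT_SIZES = {"train": 745, "val": 137, "test": 164}

COLOUR = {
    "train": {"COLOR": 473, "GREY": 272},
    "val": {"COLOR": 50, "GREY": 87},
    "test": {"COLOR": 71, "GREY": 93},
}
REGISTER_TYPE = {
    "train": {"AV_1822_NB": 212, "AP_1822": 302, "RECTIFICATION_1835": 231},
    "val": {"AV_1822_NB": 64, "AP_1822": 0, "RECTIFICATION_1835": 73},
    "test": {"AV_1822_NB": 74, "AP_1822": 36, "RECTIFICATION_1835": 54},
}


def inconsistent_splits(table, sizes):
    """Return the splits whose row does not sum to the expected split size."""
    return [split for split, row in table.items() if sum(row.values()) != sizes[split]]


def empty_cells(table):
    """Return (split, tag) pairs that have no pages at all."""
    return [(split, tag) for split, row in table.items() for tag, n in row.items() if n == 0]


print(inconsistent_splits(COLOUR, SPLIT_SIZES))         # []
print(inconsistent_splits(REGISTER_TYPE, SPLIT_SIZES))  # []
print(empty_cells(REGISTER_TYPE))                       # [('val', 'AP_1822')]
```

Both tables sum to the split sizes. The val split contains no AP_1822 pages, so validation results cannot be broken down for that register type; the test split covers all three.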

 

Files

Files (7.3 GB)

Checksum Size
md5:62e248170a4bfff2f149ae3b6ec8af87 36.3 kB
md5:3bc0ec81e54ef70074b69f4fe2f94aa4 7.3 GB
md5:ddeeaa33c1e88f4a4d9f3db5d027c897 1.2 MB
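The MD5 checksums listed for the files can be used to verify a download. The sketch below is one way to do this with the Python standard library; the file path passed in is hypothetical, and which checksum belongs to which file name is not stated in this listing, so the pairing has to be taken from the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain '<hex>'."""
    expected_hex = expected.strip().lower().split(":", 1)[-1]
    return md5_of_file(path) == expected_hex
```

For instance, `matches_checksum("downloaded_file", "md5:3bc0ec81e54ef70074b69f4fe2f94aa4")` would return True only if the downloaded file is byte-identical to the 7.3 GB file in this record.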

Additional details

Additional titles

Translated title (French)
Dataset pour la classification automatique de pages issues des registres d'états de sections du cadastre napoléonien

Related works

Is derived from
Dataset: https://archives.valdemarne.fr/recherches/archives-en-ligne/cadastre-napoleonien (URL)
Is referenced by
Conference proceeding: 10.1007/978-3-032-05409-8_24 (DOI)
Is supplement to
Dataset: 10.5281/zenodo.18799034 (DOI)

Funding

Agence de l'innovation de défense
Institut national de l'information géographique et forestière

Dates

Created
2025-02-26