Dataset Open Access

CASSMIR

Thibault Le Corre

Project leader(s)
Thibault Le Corre
Project member(s)
Ronan Ysebaert; Pierre Le Brun; Timothée Giraud; Jean-Baptiste Durand

New version 2.0.0 with major changes

For free and complete information about the CASSMIR datasets, please visit our website (in French).

The CASSMIR database (Contribution to the Spatial and Sociological Analysis of Residential Real Estate Markets) is a set of spatial and population datasets on the housing property market of the Paris metropolitan area, from 1996 to 2018. The indicators in the CASSMIR database cover four "thematic areas of investigation": prices, socio-demographic profiles of buyers and sellers, purchasing regimes and types of property transfer, and types of real estate. These indicators characterize spatial units at three scales (communal level, 1 km grid and 200 m grid) and population groups of buyers and sellers broken down by social, generational and gender criteria. The database is produced by matching and aggregating individual data from two original databases: a database on real estate transactions (BIEN database) and a database on first-time buyer investments (PTZ database). CASSMIR delivers aggregated data (nearly 350 variables) in open access for non-commercial use.

This repository consists of seven files.

"CASSMIR_SpatialDataBase" is a GeoPackage file that contains all the data aggregated to the spatial units of reference. It is composed of three layers, one per geographical scale of aggregation: the communal level, a grid with cells of one kilometer on each side, and a grid with cells of two hundred meters on each side.
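Since the layer names are not given in this description, it can help to list them before loading anything. The sketch below relies only on the fact that a GeoPackage is an SQLite database whose standard `gpkg_contents` table registers every layer; the function name `list_gpkg_layers` and the local file path are assumptions made for illustration.

```python
import sqlite3
from pathlib import Path


def list_gpkg_layers(path):
    """Return (table_name, data_type, srs_id) for each layer of a GeoPackage."""
    # Open read-only so that a wrong path raises an error instead of
    # silently creating an empty database file.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        # gpkg_contents is the registry table defined by the GeoPackage standard.
        rows = con.execute(
            "SELECT table_name, data_type, srs_id "
            "FROM gpkg_contents ORDER BY table_name"
        ).fetchall()
    finally:
        con.close()
    return rows
```

Calling `list_gpkg_layers("CASSMIR_SpatialDataBase.gpkg")` on a downloaded copy should return one row per layer, which can then be opened with any GIS library or desktop tool.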

"CASSMIR_GroupesPopDataBase" is a .csv file that contains all the data aggregated to the population groups of reference. There are three types of population groups: groups defined by the social position of the buyers/sellers (social group), groups defined by the age group of the buyers/sellers (generational group), and groups defined by the sex of the buyers/sellers (gender group).

Two metadata files (.csv) list the metadata of the indicators made available. They are systematically structured as follows:

  • Id_var: the identifier of the variable contained in "CASSMIR_SpatialDataBase" or "CASSMIR_GroupesPopDataBase"
  • Unite d'observation des variables descriptives: the unit of observation described by the variable (prices, buyers, sellers...)
  • Type d'information: details on the type of information
  • Label: a description of the contents of the variable
  • Indicator_Group: the group of indicators to which the variable relates (prices, socio-demographic indicators of buyers and sellers...)
  • Unit: the unit of measurement of the variable
  • Spatial_Availability: the availability of the variable in the spatial database (communes, 1 km grid and 200 m grid)
  • GroupesPop_Availability: the availability of the variable in the population groups database (social, generational, gender)
  • Data_Source: the main origin of the data (INSEE, BIEN and/or PTZ)
  • Remarques: possible remarks on the construction of the variable
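With nearly 350 variables, the metadata files are the practical entry point for finding an indicator. The following is a minimal sketch, assuming the column names listed above appear verbatim in the CSV header; the delimiter and encoding are assumptions to adjust after inspecting the file, and the helper names `load_metadata` and `variables_in_group` are hypothetical.

```python
import csv


def load_metadata(path, delimiter=",", encoding="utf-8"):
    """Index the rows of a metadata file by their Id_var value."""
    # delimiter and encoding are assumptions; check the downloaded file.
    with open(path, newline="", encoding=encoding) as f:
        return {row["Id_var"]: row for row in csv.DictReader(f, delimiter=delimiter)}


def variables_in_group(metadata, keyword):
    """Return the sorted Id_var values whose Indicator_Group mentions keyword."""
    keyword = keyword.lower()
    return sorted(
        var_id
        for var_id, row in metadata.items()
        if keyword in (row.get("Indicator_Group") or "").lower()
    )
```

For example, `variables_in_group(load_metadata("CASSMIR_SpatialMetadata.csv"), "prices")` would return the identifiers of the price indicators, whose `Label` and `Unit` can then be looked up in the same dictionary.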

"BIENSampleForTest" and "PTZSampleForTest" are two text files (listed below as BIENSampleForTest.csv and PTZSampleForTest.txt) that provide a sample of individual data from each of the original databases. All data were anonymized and the values randomized. These two files are intended solely for reproducing the processing stages that lead to the production of the CASSMIR files ("CASSMIR_SpatialDataBase" or "CASSMIR_GroupesPopDataBase") and cannot be used in any other way.

"LEXIQUE" is a glossary of terms used to name the variables (.csv).

The creation of the database was funded by the National Research Agency (ANR WIsDHoM https://anr.fr/Projet-ANR-18-CE41-0004).

All CASSMIR documentation (in French) and R code are accessible via the GitLab repository at the following address: https://gitlab.huma-num.fr/tlecorre/cassmir.git

METADATA:

  • Licence

This dataset is released under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license. You are free to copy, distribute, transmit, and adapt the data, provided that you give credit to the CASSMIR database and specify the original source of the data. If you modify or use the data in other derivative works, you may distribute them only under the same license. You may not make commercial use of this database, nor may you use it for any purpose other than scientific research.

  • Citation standard

- Figures: (CC - CASSMIR database, indicator(s) constructed from XXX data)
- Bibliography: Publications that use the CASSMIR database must reference the dataset and the data paper.

Dataset: Le Corre T., 2020, CASSMIR (Version 2.0.0) [Data set], Zenodo. http://doi.org/10.5281/zenodo.4497219

Data paper: Le Corre T., 2021, "Une base de données pour étudier vingt années de dynamiques du marché immobilier en Île-de-France", Cybergeo.

  • Data paper title

"Une base de données pour étudier vingt années de dynamiques du marché immobilier en Île-de-France"

  • Author

Thibault Le Corre

  • Keywords

Housing market, data base, Île-de-France, spatio-temporal dynamics

  • Related Publication

DOI

  • Language

French

  • Time Period Covered

The time period covered by the indicators in the database depends on the data sources used, respectively:
For data from BIEN: 1996, 1999, 2003-2012, 2015, 2018
For data from PTZ: 1996-2016
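Because the two sources do not cover the same years, it is worth checking which source can document a given year before comparing indicators. The sketch below simply encodes the two coverage lists stated above; the names `BIEN_YEARS`, `PTZ_YEARS` and `sources_for` are introduced here for illustration only.

```python
# Years covered by each source, as stated in this description.
BIEN_YEARS = {1996, 1999, *range(2003, 2013), 2015, 2018}
PTZ_YEARS = set(range(1996, 2017))  # 1996 to 2016 inclusive

# Years for which indicators from both sources exist.
COMMON_YEARS = sorted(BIEN_YEARS & PTZ_YEARS)


def sources_for(year):
    """Return the names of the sources that cover the given year."""
    sources = []
    if year in BIEN_YEARS:
        sources.append("BIEN")
    if year in PTZ_YEARS:
        sources.append("PTZ")
    return sources
```

With these lists, `sources_for(2018)` returns only `["BIEN"]` and `sources_for(2000)` only `["PTZ"]`, so joint analyses are limited to the years in `COMMON_YEARS`.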

Kind of data

Nature of data submitted

  • vector: Vector data

  • grid: Gridded data

  • code: programming code (see the website or GitLab of the project)

Coordinate Reference System (CRS): EPSG:2154, RGF93 / Lambert-93.

Data Sources

Base BIEN

Base PTZ

Geographical Coverage

Île-de-France region

Geographical Unit

Municipalities and grid cells (1 km side grid and 200 m side grid) in which real estate transactions were recorded

Geographic Bounding Box

- Xmin : 586421.7
- Xmax : 741205.6
- Ymin : 6780020
- Ymax : 6905324
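The bounding box is expressed in the projected coordinates of the dataset (meters, EPSG:2154), not in longitude and latitude. As a rough sketch, a point already projected to Lambert-93 can be screened against it as follows; the constant `BBOX` and the helpers `in_bbox` and `bbox_size_km` are hypothetical names.

```python
# Bounding box from this description, in meters (EPSG:2154):
# (xmin, ymin, xmax, ymax)
BBOX = (586421.7, 6780020.0, 741205.6, 6905324.0)


def in_bbox(x, y, bbox=BBOX):
    """Return True if the Lambert-93 point (x, y) lies inside the bounding box."""
    xmin, ymin, xmax, ymax = bbox
    return xmin <= x <= xmax and ymin <= y <= ymax


def bbox_size_km(bbox=BBOX):
    """Return the (width, height) of the bounding box in kilometers."""
    xmin, ymin, xmax, ymax = bbox
    return (xmax - xmin) / 1000.0, (ymax - ymin) / 1000.0
```

The extent is about 155 km wide and 125 km high. A point that fails `in_bbox` is either outside the study area or still expressed in another coordinate system, such as WGS 84 degrees.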

Type of article

Data Paper

Files (886.8 MB)

  • BIENSampleForTest.csv (9.1 MB), md5:a56ea2640ace7c84a1080e32f7a15e92
  • CASSMIR_GroupesPopDataBase.csv (1.0 MB), md5:a8f57c0e76c8c95a4c4933dbfbd794f1
  • CASSMIR_PopMetadata.csv (65.1 kB), md5:e3436f3a8cb7f73397bf6d09e3fdbeb7
  • CASSMIR_SpatialDataBase.gpkg (873.0 MB), md5:53eb19d31b04d78d99608bf09cb2a5a0
  • CASSMIR_SpatialMetadata.csv (41.2 kB), md5:1b31bbc3db98a50ab275172fb932a280
  • LEXIQUE.csv (3.1 kB), md5:765412c38b090f19e38551733f70ca42
  • PTZSampleForTest.txt (3.5 MB), md5:4db806c5a0cc66fc18521ebfe2e0cf97
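The MD5 checksums listed above can be used to confirm that a download, in particular the 873 MB GeoPackage, arrived intact. A minimal sketch using Python's standard `hashlib` module follows; the helper names `md5_of` and `is_intact` are hypothetical, and the files are assumed to sit in the current directory under their published names.

```python
import hashlib

# Checksums copied from the file list of this record.
EXPECTED_MD5 = {
    "BIENSampleForTest.csv": "a56ea2640ace7c84a1080e32f7a15e92",
    "CASSMIR_GroupesPopDataBase.csv": "a8f57c0e76c8c95a4c4933dbfbd794f1",
    "CASSMIR_PopMetadata.csv": "e3436f3a8cb7f73397bf6d09e3fdbeb7",
    "CASSMIR_SpatialDataBase.gpkg": "53eb19d31b04d78d99608bf09cb2a5a0",
    "CASSMIR_SpatialMetadata.csv": "1b31bbc3db98a50ab275172fb932a280",
    "LEXIQUE.csv": "765412c38b090f19e38551733f70ca42",
    "PTZSampleForTest.txt": "4db806c5a0cc66fc18521ebfe2e0cf97",
}


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Chunked reads keep memory use low for the large GeoPackage.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected_md5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of(path) == expected_md5.lower()
```

For example, `is_intact("LEXIQUE.csv", EXPECTED_MD5["LEXIQUE.csv"])` should return `True` for an uncorrupted copy of the glossary.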