Published May 27, 2020 | Version V1.0
Dataset Open

Dairy database for prediction of main environmental challenges to resilience and efficiency in cattle production systems at regional resolution


The dairy database comprises average values for a wide range of variables (110 or 119), available in 4 worksheets: BasicFarmType (18 rows), DetailedFarmType (10 rows), ClimateClass+BasicFarmType (100 rows), NUTS+DetailedFarmType (1452 rows). Data are omitted when the sample size (n) is below 15, as per the confidentiality agreement under FADN data use rules.
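The averaging and confidentiality rule described above can be sketched as follows. This is a minimal illustration only, not the processing actually used for the database: the record layout, the field names (`farm_type`, `milk_yield`) and the function `group_averages` are hypothetical; only the threshold of 15 comes from the description.

```python
from collections import defaultdict
from statistics import mean

MIN_SAMPLE = 15  # threshold stated in the description: values are omitted when n < 15


def group_averages(records, group_key, value_key, min_n=MIN_SAMPLE):
    """Return {group: (n, average)}; the average is None when n is below min_n."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec[value_key])
    return {
        group: (len(values), mean(values) if len(values) >= min_n else None)
        for group, values in groups.items()
    }


# Hypothetical farm records: 20 farms of type "A", 5 farms of type "B".
farms = (
    [{"farm_type": "A", "milk_yield": 7000.0 + i} for i in range(20)]
    + [{"farm_type": "B", "milk_yield": 6500.0} for _ in range(5)]
)
summary = group_averages(farms, "farm_type", "milk_yield")
```

In this sketch the group with 5 farms keeps its sample size but has no reported average, which mirrors how cells with too few farms would appear as omitted values in a worksheet.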

A combined farm characterisation database was constructed using two major data sources: the Farm Accountancy Data Network (FADN) and the Gridded Agro-Meteorological Data in Europe (AGRI4CAST). The database initially constructed was further enhanced through the addition of forage and crop yield data from the Agro-Ecological Zones (AEZ) methodology database developed by the Food and Agriculture Organization of the United Nations (FAO) and the International Institute for Applied Systems Analysis (IIASA) (FAO, 2012). The data was processed and is presented in D1.2 as two databases (dairy and beef), as averages for a wide range of variables at basic or detailed farm type level, and at NUTS2 regional scale.

Detailed FADN data (anonymised individual farm data) was requested for all ruminant and mixed farm types over 10 years, and the most recent data available at the time of the request (2011-2013) was used for the analysis. Following receipt, the data (~250k farms) was compiled into two consistent datasets, one for dairy farms (141,961) and one for beef farms (54,417). Each dataset comprises some values taken directly from the FADN data, but also a large number of calculated variables that express dairy or beef enterprise performance per animal, per unit of output product or per hectare. These values were calculated according to the respective dairy and beef enterprise allocation methodologies described by FADN. Further economic and structural variables have been calculated as necessary, as described in GenTORE D1.1 (Quiédeville et al., 2019).

For each farm within the dataset, the structural, production and economic data from FADN is supplemented with meteorological data. The daily meteorological data was downloaded from the AGRI4CAST database web portal at NUTS2 scale. For each NUTS2 region, data was available for a number of weather stations. This large dataset was processed through scripts in Stata to generate annual values for a wide range of climatic variables, including the Temperature Humidity Index (THI) and indicators of drought and seasonality of weather. Furthermore, the altitude values per weather station allowed weather station data to be sub-grouped by altitude zone (aligned with the values available in the FADN dataset).
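As an illustration of how daily station data can be reduced to annual THI values, the sketch below uses one commonly cited THI formulation (NRC, 1971). The description does not state which THI formula, threshold or summary statistics were used, so the formula choice, the threshold of 72 and the names `thi` and `annual_thi_summary` are all assumptions; the original processing was done in Stata, not Python.

```python
from statistics import mean


def thi(temp_c, rel_humidity_pct):
    """THI after NRC (1971); assumed here, the source does not name a formula."""
    return (1.8 * temp_c + 32) - (0.55 - 0.0055 * rel_humidity_pct) * (1.8 * temp_c - 26)


def annual_thi_summary(daily, threshold=72.0):
    """Summarise (temperature in deg C, relative humidity in %) pairs for one station-year.

    The threshold is a hypothetical heat-stress cut-off chosen for illustration.
    """
    values = [thi(t, rh) for t, rh in daily]
    return {
        "mean_thi": mean(values),
        "max_thi": max(values),
        "days_above": sum(v > threshold for v in values),
    }


# Two hypothetical days: one hot and moderately humid, one cool and humid.
example = annual_thi_summary([(30.0, 50.0), (10.0, 80.0)])
```

A real run would pass a full year of daily observations per weather station and then average stations within each NUTS2 region and altitude zone.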

Using a Latent Class Analysis (LCA) process, the meteorological data was analysed to identify consistent environmental regions in Europe. Selected climatic variables, together with altitude zone, were used to statistically identify differing zones and to classify each NUTS2 region to a zone, resulting in 6 lowland zones and 3 upland zones (above 600m). The LCA process improved on an earlier method of manually overlaying the Metzger et al. (2005) pedo-climatic zone allocation, with which its results correlate closely. Therefore, for each farm in the dairy and beef datasets, meteorological and environmental zone data was allocated on a NUTS2 by altitude zone basis. This dataset has subsequently been assessed and submitted as papers: Quiédeville et al. (submitted May 2020) and Grovermann et al. (submitted May 2020).
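The allocation of station-level values to farms on a NUTS2 by altitude zone basis can be sketched as a grouped average followed by a lookup. This is a hedged illustration, not the project's code: the region code `XX01`, the field names and the helper functions are hypothetical, and the handling of stations at exactly 300 m or 600 m is an assumption, since the description gives only the zone labels.

```python
from collections import defaultdict
from statistics import mean


def altitude_zone(altitude_m):
    """Map a station altitude to a zone label (boundary handling is assumed)."""
    if altitude_m < 300:
        return "0-300m"
    if altitude_m < 600:
        return "300-600m"
    return "600m+"


def station_means(stations, value_key):
    """Average a station variable within each (NUTS2 region, altitude zone) cell."""
    cells = defaultdict(list)
    for s in stations:
        cells[(s["nuts2"], altitude_zone(s["altitude_m"]))].append(s[value_key])
    return {cell: mean(values) for cell, values in cells.items()}


def attach_climate(farms, cell_values, value_key):
    """Copy each farm record, adding the value of its cell (None if no station data)."""
    return [
        {**f, value_key: cell_values.get((f["nuts2"], f["altitude_zone"]))}
        for f in farms
    ]


# Hypothetical stations and farms in a made-up region "XX01".
stations = [
    {"nuts2": "XX01", "altitude_m": 120, "annual_thi": 60.0},
    {"nuts2": "XX01", "altitude_m": 250, "annual_thi": 62.0},
    {"nuts2": "XX01", "altitude_m": 900, "annual_thi": 50.0},
]
farms = [
    {"farm_id": 1, "nuts2": "XX01", "altitude_zone": "0-300m"},
    {"farm_id": 2, "nuts2": "XX01", "altitude_zone": "300-600m"},
]
cells = station_means(stations, "annual_thi")
enriched = attach_climate(farms, cells, "annual_thi")
```

The same lookup pattern would apply to the environmental zone label produced by the LCA, keyed on the same (region, altitude zone) pair.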

The GAEZ forage and crop yield data was downloaded from the GAEZ data portal for a baseline and two future climate prediction periods: Baseline (1961-2000), 2020s (2011-2040) and 2050s (2041-2070), for the Hadley CM3 model and IPCC scenario A (the most extreme scenario). See: Zonal statistics were applied to the GAEZ layers, using the raster package in R, to aggregate the data to NUTS2 region and altitude zone (0-300m, 300-600m, 600m+). The result is an average yield[1] for various forages and crops for each altitude zone in each NUTS2 region, for both the baseline and the future climate scenario. This data allows further analysis of future impacts on cattle farming, both at a regional scale and by farm type or system, which may be affected differently (Moakes et al. in preparation).
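The zonal averaging, including the exclusion of zero-yield pixels mentioned in footnote [1], can be illustrated with a small pure-Python sketch. The original work used the raster package in R; the grids, the region code `XX01` and the function `zonal_nonzero_mean` below are hypothetical and only show the logic of the calculation.

```python
from collections import defaultdict


def zonal_nonzero_mean(yield_grid, zone_grid):
    """Mean yield per zone, skipping zero-yield pixels and pixels with no zone (None)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for yield_row, zone_row in zip(yield_grid, zone_grid):
        for y, zone in zip(yield_row, zone_row):
            if zone is None or y == 0:
                continue  # unsuitable area or outside any region
            sums[zone] += y
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in counts}


# Hypothetical 2 x 3 yield raster and a matching raster of (NUTS2, altitude zone) labels.
yield_grid = [
    [0.0, 4.0, 6.0],
    [2.0, 0.0, 8.0],
]
zone_grid = [
    [("XX01", "0-300m"), ("XX01", "0-300m"), ("XX01", "600m+")],
    [("XX01", "0-300m"), None, ("XX01", "600m+")],
]
result = zonal_nonzero_mean(yield_grid, zone_grid)
```

Excluding zeros raises the average relative to a plain zonal mean, so values computed this way describe yield on suitable land only rather than across the whole zone.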

All variable processing from FADN data is shown in the Annex, as performed in Stata software.


[1] The mean was calculated over non-zero yield pixels in order to exclude unsuitable areas from the average.


Files (4.1 MB)


Additional details


GenTORE – Genomic management Tools to Optimise Resilience and Efficiency 727213
European Commission