There is a newer version of the record available.

Published September 30, 2022 | Version 1.0.0
Dataset Open

AgrImOnIA: Open Access dataset correlating livestock and air quality in the Lombardy region, Italy


Project manager:

  • 1. University of Bergamo


OBSOLETE VERSION - This is not the latest update of the dataset; we strongly suggest using the latest version.

The AgrImOnIA dataset is a comprehensive dataset relating air quality and livestock (expressed as the density of bovines and swine bred), along with weather and other variables. It represents the first step of the AgrImOnIA project. Its purpose is to make it possible to assess the impact of agriculture on air quality in Lombardy through statistical techniques capable of highlighting the relationship between the livestock sector and air pollutant concentrations.

This dataset is a collection of estimated daily values for measurements across several dimensions: air quality, meteorology, emissions, livestock animals and land use. The data cover Lombardy and the surrounding area for 2016-2021, inclusive. The surrounding area is obtained by applying a 0.3° buffer to the Lombardy borders.

Several aggregation and interpolation methods are used to estimate the measurements for every day.

For more details see the paper:

A. Fassò, J. Rodeschini, A. Fusta Moro, Q. Shaboviq, P. Maranzano, M. Cameletti, F. Finazzi, N. Golini, R. Ignaccolo, and P. Otto (2022). Agrimonia: a dataset on livestock, meteorology and air quality in the Lombardy region, Italy. arXiv preprint, arXiv:2210.10604.

The files in the folder are:

Agrimonia_Dataset.csv(.Rdata,.mat) which is built by joining the daily time series related to the AQ, WE, EM, LI and LA variables. To simplify access to the variables in the Agrimonia dataset, each variable name starts with the dimension of the variable, i.e., the names of the variables related to the AQ dimension start with 'AQ_'. This file is also archived in the .mat and .Rdata formats for MATLAB and R, respectively.

Metadata_Agrimonia.csv which provides further information on the sources used, the variables imported, the transformations applied, and the Agrimonia variables.

Metadata_AQ_imputation_uncertainty.csv which contains the daily uncertainty estimates of the AQ observations that were imputed to mitigate missing data in the hourly time series.

Metadata_LA_CORINE_labels.csv which contains the label and the description associated with each CORINE Land Cover (CLC) class.

Metadata_monitoring_network_registry.csv which contains all details about the AQ monitoring stations used to build the dataset. Information about the pollutant stations includes station type, municipality code, environment type, altitude, pollutants sampled and other information. Each row represents a single sensor.

Metadata_LA_SIARL_labels.csv which contains the label and the description associated with the SIARL class.
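
The prefix convention described for Agrimonia_Dataset.csv makes it straightforward to pick out all variables of one dimension. The following is a minimal sketch using only the Python standard library. The column names in the sample (Time, AQ_pm10, WE_temp_2m) are hypothetical placeholders chosen for illustration and may not match the actual file; only the five prefixes are taken from the description above.

```python
import csv
import io
from collections import defaultdict

# The five dimension prefixes named in the file description.
DIMENSIONS = ("AQ", "WE", "EM", "LI", "LA")


def columns_by_dimension(header):
    """Group column names by the prefix that precedes the first underscore."""
    groups = defaultdict(list)
    for name in header:
        prefix, sep, _ = name.partition("_")
        # Columns without a known dimension prefix are collected under "other".
        key = prefix if sep and prefix in DIMENSIONS else "other"
        groups[key].append(name)
    return dict(groups)


# Hypothetical three-column excerpt standing in for the real CSV file.
sample = io.StringIO(
    "Time,AQ_pm10,WE_temp_2m\n"
    "2016-01-01,41.0,3.2\n"
)
reader = csv.DictReader(sample)
groups = columns_by_dimension(reader.fieldnames)

# Keep only the air-quality columns of every row.
aq_rows = [{name: row[name] for name in groups["AQ"]} for row in reader]

print(groups)
print(aq_rows)
```

To apply the same idea to the real data, the in-memory sample could be replaced with `open("Agrimonia_Dataset.csv", newline="")`, assuming the file has a single header row.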

The dataset can be reproduced using the code available at the GitHub page:



Files (190.0 MB)

Size
114.6 MB
26.0 MB
19.0 MB
23.6 kB
30.2 MB
4.0 kB
475 Bytes
187.9 kB