There is a newer version of the record available.

Published June 26, 2023 | Version 1.00
Dataset Restricted

BRAINTEASER ALS and MS Datasets

Description

BRAINTEASER (Bringing Artificial Intelligence home for a better care of amyotrophic lateral sclerosis and multiple sclerosis) is a data science project that seeks to exploit the value of big data, including those related to health, lifestyle habits, and environment, to support patients with Amyotrophic Lateral Sclerosis (ALS) and Multiple Sclerosis (MS) and their clinicians. Taking advantage of cost-efficient sensors and apps, BRAINTEASER will integrate large, clinical datasets that host both patient-generated and environmental data.

As part of its activities, BRAINTEASER organized two open evaluation challenges on Intelligent Disease Progression Prediction (iDPP), iDPP@CLEF 2022 and iDPP@CLEF 2023, co-located with the Conference and Labs of the Evaluation Forum (CLEF).

The goal of iDPP@CLEF is to design and develop an evaluation infrastructure for AI algorithms able to:

  • better describe disease mechanisms;
  • stratify patients according to their phenotype assessed throughout the disease evolution;
  • predict disease progression in a probabilistic, time dependent fashion.

The iDPP@CLEF challenges relied on retrospective ALS and MS patient data made available by the clinical partners of the BRAINTEASER consortium. The datasets contain data about 2,204 ALS patients (static variables, ALSFRS-R questionnaires, spirometry tests, environmental/pollution data) and 1,792 MS patients (static variables, EDSS scores, evoked potentials, relapses, MRIs).

In more detail, the BRAINTEASER retrospective datasets were derived by merging existing datasets collected by the clinical centers involved in the BRAINTEASER Project.

  • The ALS dataset was obtained by merging and homogenising the Piemonte and Valle d’Aosta Registry for Amyotrophic Lateral Sclerosis (PARALS, Chiò et al., 2017) and the Lisbon ALS clinic (CENTRO ACADÉMICO DE MEDICINA DE LISBOA, Centro Hospitalar Universitário de Lisboa-Norte, Hospital de Santa Maria, Lisbon, Portugal) dataset. Both datasets were initiated in 1995 and are currently maintained by researchers of the ALS Regional Expert Centre (CRESLA), University of Turin, and of the CENTRO ACADÉMICO DE MEDICINA DE LISBOA-Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa. They include demographic and clinical data, comprising both static and dynamic variables.
  • The MS dataset was obtained from the Pavia MS clinical dataset, which was started in 1990 and contains demographic and clinical information that is continuously updated by the researchers of the Institute, and from the Turin MS clinic dataset (Department of Neurosciences and Mental Health, Neurology Unit 1, Città della Salute e della Scienza di Torino).
  • Retrospective environmental data are accessible at various scales at the individual subject level. Thus, environmental data have been retrieved at different scales:
    • to gather macroscale air pollution data, we leveraged data from public monitoring stations covering the whole territory of the involved countries, namely the European Air Quality Portal;
    • data from a network of air quality sensors (PurpleAir - Outdoor Air Quality Monitor / PurpleAir PA-II) installed at different points of the city of Pavia (Italy) were extracted as well.
    In both cases, the environmental data were already publicly available. To merge environmental data with individual subject locations, we relied on postcodes (the postcode of the pollutant monitoring station and the postcode of the subject's address). Data were merged following an anonymization procedure based on hash keys. Environmental exposure trajectories were pre-processed and aggregated to avoid fine temporal and spatial granularity, so that individual exposure information cannot disclose personal addresses.
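The postcode-based merge described above can be illustrated with a minimal Python sketch. It is not the procedure actually used by the project: the record does not specify the hashing scheme or the aggregation level, so the salted SHA-256 key, the monthly averaging, the function names, and all sample values below are assumptions made purely for illustration.

```python
import hashlib
from collections import defaultdict
from statistics import mean

# Assumption: a secret salt held by the data provider (hypothetical value).
SALT = b"hypothetical-secret-salt"


def hash_key(postcode: str) -> str:
    """Return a salted SHA-256 digest that stands in for the raw postcode."""
    normalized = postcode.strip().upper().encode("utf-8")
    return hashlib.sha256(SALT + normalized).hexdigest()


def monthly_exposure(readings):
    """Average (postcode, 'YYYY-MM-DD', value) readings per hashed key and month."""
    buckets = defaultdict(list)
    for postcode, day, value in readings:
        # day[:7] keeps only 'YYYY-MM', discarding the fine temporal detail.
        buckets[(hash_key(postcode), day[:7])].append(value)
    return {key: mean(values) for key, values in buckets.items()}


def link_subjects(subjects, exposure):
    """Attach monthly exposure to each subject through the shared hashed key."""
    linked = {}
    for subject_id, postcode in subjects:
        key = hash_key(postcode)
        linked[subject_id] = {
            month: value for (k, month), value in exposure.items() if k == key
        }
    return linked


# Made-up station readings and subjects, for illustration only.
station_readings = [
    ("27100", "2020-01-03", 30.0),
    ("27100", "2020-01-17", 40.0),
    ("27100", "2020-02-02", 20.0),
    ("10126", "2020-01-05", 50.0),
]
subjects = [("S1", "27100"), ("S2", "10126")]

exposure = monthly_exposure(station_readings)
linked = link_subjects(subjects, exposure)
print(linked["S1"])  # {'2020-01': 35.0, '2020-02': 20.0}
```

In this sketch the join happens only on the hashed key, so the linked table never carries a raw postcode, and the monthly averaging plays the role of the coarser temporal granularity mentioned above.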

 

The datasets are shared in two formats:

  • RDF (serialized in Turtle) modeled according to the BRAINTEASER Ontology (BTO);
  • CSV, as shared during the iDPP@CLEF 2022 and 2023 challenges, split into training and test.

Each format corresponds to a specific folder in the datasets, where a dedicated README file provides further details on the datasets. Note that the ALS dataset is split into multiple ZIP files due to the size of the environmental data.
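As a rough illustration of working with the CSV format, the sketch below parses a training and a test split and checks that no patient appears in both. The column names (`patient_id`, `sex`, `onset_age`) and the inline sample rows are hypothetical; the actual file names and columns are described in the README files shipped with the datasets.

```python
import csv
import io

# Hypothetical stand-ins for a training and a test CSV file.
TRAIN_CSV = """patient_id,sex,onset_age
P001,F,61
P002,M,54
"""
TEST_CSV = """patient_id,sex,onset_age
P003,F,70
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def patient_ids(rows, id_column="patient_id"):
    """Collect the set of patient identifiers found in the given rows."""
    return {row[id_column] for row in rows}


train_rows = read_rows(TRAIN_CSV)
test_rows = read_rows(TEST_CSV)

# A clean split should share no patients between training and test.
overlap = patient_ids(train_rows) & patient_ids(test_rows)
print(len(train_rows), len(test_rows), sorted(overlap))
```

With real files, `read_rows` could be replaced by opening the file with `open(path, newline="")` and passing the handle to `csv.DictReader`.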

 

The BRAINTEASER Data Sharing Policy section below reports the details for requesting access to the datasets.

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Request access

If you would like to request access to these files, please fill out the form below.

You need to satisfy these conditions in order for this request to be accepted:

BRAINTEASER Data Sharing

Sharing research data is a necessary component of research, encouraging connection and collaboration between researchers, which can result in important new findings within the field. To promote broad, transparent and responsible data sharing, the BRAINTEASER Project has developed the BRAINTEASER Data Sharing Policy reported in this guideline, which sets out, within the constraints of funders' and regulatory requirements, when specific conditions of access should be put in place. The data sharing policy was developed in accordance with the General Data Protection Regulation (GDPR, EU Regulation 2016/679), which provides a number of bases for sharing personal information.

 

Data Usage Agreement (DUA)

By downloading this data, you acknowledge and agree to abide by the terms set forth in the Data Use Agreement (DUA). These terms are designed to ensure the responsible and ethical use of the dataset and include specific conditions regarding confidentiality, permitted use, and data security. Your access to the data is contingent upon acceptance of these terms. The following articles outline the key provisions of the DUA to which you must adhere:

  1. The providing partners within the BRAINTEASER Project Consortium retain the property of the Data.
  2. The receiving subject agrees that the data:
    1. will not be transferred to anyone else within the receiving subject's organisation without the prior written consent of the BRAINTEASER Project Committee members;
    2. will not be disclosed to a third party;
    3. will be used only in compliance with applicable EU treaties, laws and regulations, as they are amended from time to time, and only after securing related reviews and approvals as such amended treaties, laws and regulations require.
  3. The receiving subject will not attempt to deanonymise the data or identify any individual or Data Subject included in the data.
  4. The receiving subject will exercise all reasonable and prudent care to avoid disclosure of the identity of any individual or institution referenced in the data, in any publication or other communication.
  5. Should the receiving subject inadvertently identify any individual or Data Subject included in the data, they will neither record this fact nor share the identification of that individual with any other person, nor will they attempt to contact the individual themselves, and they will immediately inform the respective BRAINTEASER Project Committee members.
  6. Any suspected or actual breach of data security or violation of these terms must be reported to the BRAINTEASER Project Committee immediately, along with a detailed description of the breach and corrective actions taken.
  7. The receiving subject will not attempt to contact any of the donors or Subjects included in the data supplied.
  8. The receiving subject will exercise all reasonable and prudent care to maintain the physical and electronic security of the data.
  9. The receiving subject shall procure that its Authorized Users are made aware and will be bound by terms similar to those in this Agreement so that each of the Authorized Users complies with the relevant duties, obligations and restrictions imposed on the receiving subject by this Agreement. Any act or omission of any such Authorized User which, if it had been committed or omitted by a receiving subject, would have been a breach of this Agreement, will be deemed a breach of this Agreement by that receiving subject.
  10. Upon completion of the project or termination of this Agreement, the receiving subject agrees to securely delete or return all copies of the data, including backups, unless otherwise authorized in writing.
  11. The receiving subject agrees to appropriately acknowledge the BRAINTEASER Project and relevant data providers in any publication or presentation resulting from the use of the data, according to the citation guidelines provided.

 

Request for use of BRAINTEASER datasets

Please inform us about your intended use of BRAINTEASER datasets by sending an email to

brainteaser-data@dei.unipd.it

Doing so will help us to keep track of ongoing research initiatives and allow us to facilitate collaboration of researchers, whenever possible. If you would like additional results, please submit a short, informal research proposal.

 

Citations in publications

When you report results that use publicly available BRAINTEASER project data in any way, it is our policy that you:

  • Acknowledge the BRAINTEASER Project Consortium by:
    • Listing the “Brainteaser Project Consortium” among the co-authors
      OR
    • Including the following statement in the acknowledgements:
      The authors would like to thank the ‘Brainteaser Project Consortium’
  • Cite the relevant publication of the original results.

 

Clinical Data

Summary statistics derived from environmental and clinical data registered with or without sensor recordings

 

A - Studies using retrospective clinical data

Retrospective clinical data derive from extensive, long-term data collection carried out for ALS by Prof. A. Chiò and colleagues in Turin and Prof. M. De Carvalho and colleagues in Lisbon. Retrospective data on MS patients were also collected over many years by Dr. P. Cavalla and colleagues in Turin and Prof. R. Bergamaschi and colleagues in Pavia.

To obtain these datasets, the researcher should send a request for access to the data together with a detailed and structured study proposal, which will be evaluated by the BRAINTEASER Project Data Committee in order to understand the purposes of the requesting research group. After the decision and authorisation, the requesting research group will receive all the information and data. The next step, following the analysis and any results, is a review and validation process carried out by the BRAINTEASER Project Data Committee. Requests on topics currently under analysis by the members of this Consortium will be declined due to conflict of interest.

Depending on the aims of the proposed work and the amount of data to be used, the Committee may request that the project and its members be included in the publication, either as main authors or as the “Brainteaser Project Consortium” among the co-authors.

The inclusion request will be communicated by the BRAINTEASER Project Data Committee before the dataset is delivered to external researchers.

 

B - Studies using prospective clinical data from the Project

All the participants in the BRAINTEASER Project will be included in the paper as part of the “Brainteaser Project Consortium”. An updated list of all participants will be provided periodically. Based on the different involvement in data analysis, interpretation and writing, a list of main authors will also be defined for each specific paper in addition to the “Brainteaser Project Consortium”.

BRAINTEASER Project Committee members

  • Roberto Bergamaschi, University of Pavia, Italy
  • Maria Fernanda Cabrera-Umpierrez, Technical University of Madrid, Spain
  • Adriano Chiò, University of Turin, Italy
  • Arianna Dagliati, University of Pavia, Italy
  • Mamede De Carvalho, University of Lisbon, Portugal
  • Barbara Di Camillo, University of Padua, Italy
  • Nicola Ferro, University of Padua, Italy
  • Jose Manuel Garcia Dominguez, Gregorio Marañon Hospital in Madrid, Spain
  • Sara C. Madeira, University of Lisbon, Portugal
  • José Luis Muñoz Blanco, Gregorio Marañon Hospital in Madrid, Spain

 


Additional details

Funding

European Commission
BRAINTEASER - BRinging Artificial INTelligencE home for a better cAre of amyotrophic lateral sclerosis and multiple SclERosis 101017598

References

  • Chiò A, Mora G, Moglia C, Manera U, Canosa A, Cammarosano S, Ilardi A, Bertuzzo D, Bersano E, Cugnasco P, Grassano M, Pisano F, Mazzini L, Calvo A (2017). Piemonte and Valle d'Aosta Register for ALS (PARALS). Secular Trends of Amyotrophic Lateral Sclerosis: The Piemonte and Valle d'Aosta Register. JAMA Neurol., 74(9):1097-1104. doi: 10.1001/jamaneurol.2017.1387
  • Bergamaschi R, Monti MC, Trivelli L, Mallucci G, Gerosa L, Pisoni E, Montomoli C. (2021). PM2.5 exposure as a risk factor for multiple sclerosis. An ecological study with a Bayesian mapping approach. Environ Sci Pollut Res Int., 28(3):2804-2809, doi: 10.1007/s11356-020-10595-5
  • Bergamaschi R, Monti MC, Trivelli L, Introcaso VP, Mallucci G, Borrelli P, Gerosa L, Montomoli C. (2020). Increased prevalence of multiple sclerosis and clusters of different disease risk in Northern Italy. Neurol Sci., 41(5):1089-1095, doi: 10.1007/s10072-019-04205-7
  • Guazzo, A., Trescato, I., Longato, E., Hazizaj, E., Dosso, D., Faggioli, G., Di Nunzio, G. M., Silvello, G., Vettoretti, M., Tavazzi, E., Roversi, C., Fariselli, P., Madeira, S. C., de Carvalho, M., Gromicho, M., Chiò, A., Manera, U., Dagliati, A., Birolo, G., Aidos, H., Di Camillo, B., and Ferro, N. (2022). Intelligent Disease Progression Prediction: Overview of iDPP@CLEF 2022. In Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Thirteenth International Conference of the CLEF Association (CLEF 2022), pages 395–422. Lecture Notes in Computer Science (LNCS) 13390, Springer, Heidelberg, Germany. doi: 10.1007/978-3-031-13643-6_25
  • Faggioli, G., Guazzo, A., Marchesin, S., Menotti, L., Trescato, I., Aidos, H., Bergamaschi, R., Birolo, G., Cavalla, P., Chiò, A., Dagliati, A., de Carvalho, M., Di Nunzio, G. M., Fariselli, P., García Dominguez, J. M., Gromicho, M., Longato, E., Madeira, S. C., Manera, U., Silvello, G., Tavazzi, E., Tavazzi, E., Vettoretti, M., Di Camillo, B., and Ferro, N. (2023). Intelligent Disease Progression Prediction: Overview of iDPP@CLEF 2023. In Experimental IR Meets Multilinguality, Multimodality, and Interaction. Proceedings of the Fourteenth International Conference of the CLEF Association (CLEF 2023). Lecture Notes in Computer Science (LNCS), Springer, Heidelberg, Germany.