Dataset of Spanish Mammographic Reports with BI-RADS Classification
Authors/Creators
- Vázquez Noguera, José Luis (Researcher)1, 2
- Gómez Adorno, Helena (Project leader)3
- Torres Hurtado, Alejandro (Data curator)3
- Mello Román, Julio César (Project member)4
- Fleitas Alvarez, Enrique Javier (Data curator)5
- Espinola Schulze, Federico Fernando (Project member)5
- Garcia Torres, Miguel (Project member)5
- Méndez Gaona, Carlos Domingo (Project member)5
- Gardel Sotomayor, Pedro Esteban (Project member)6
- Zaracho Amarilla, Norma Elizabeth (Data curator)7
- Gamorra Esquivel, Oxades Wilfrido (Project member)7
Description
This dataset contains 4,357 reports of mammographic studies in Spanish, collected from several medical units in Paraguay. It aims to help address the shortage of public datasets for natural language processing applied to radiological reports.
The dataset comprises 15 variables that capture the key information in each mammographic report. The full text of each report is included, and each of its sections is also provided separately: clinical observations, diagnostic conclusions, and follow-up recommendations. Each report also carries the BI-RADS classification assigned to it. The remaining variables are metadata: a unique identifier, the year and month of the report, and patient information such as age, the patient's reason for the study, last menstrual period, type of hormonal therapy received, family history, and number of children.
Because the data were not generated artificially, the dataset represents a real-world scenario. Researchers can use it to replicate results from articles in the area, as well as to develop and test new models and algorithms, specifically for classification under the BI-RADS system.
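As a minimal sketch of how the file might be explored, the following Python reads the CSV with the standard library and tallies the number of reports per BI-RADS category. The column name `birads` and the UTF-8 encoding are assumptions made for illustration only; the actual header names should be checked in `BIRADS_radiology_reports.csv` before use.

```python
import csv
from collections import Counter
from typing import TextIO


def birads_distribution(handle: TextIO, column: str = "birads") -> Counter:
    """Count reports per BI-RADS category.

    `column` is a hypothetical header name; replace it with the real one.
    """
    reader = csv.DictReader(handle)
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in {reader.fieldnames}")
    # Missing values are counted under the empty string.
    return Counter((row[column] or "").strip() for row in reader)


def load_distribution(path: str, column: str = "birads") -> Counter:
    # newline="" lets the csv module handle line breaks inside quoted
    # fields, which free-text report sections may contain.
    with open(path, newline="", encoding="utf-8") as handle:
        return birads_distribution(handle, column)
```

Calling `load_distribution("BIRADS_radiology_reports.csv")` would then return a `Counter` keyed by category label, which gives a quick view of class imbalance before training a classifier.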
Files
BIRADS_radiology_reports.csv
| Checksum | Size |
|---|---|
| md5:5c86e1318d4cae3d260560f77a131e2b | 7.3 MB |
| md5:3b28ca7a2d75552e475889f41810092a | 2.0 kB |
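The MD5 checksums listed above can be used to confirm that a download is intact. The sketch below is one way to do that with Python's standard `hashlib`; the helper names `md5_of_file` and `matches_checksum` are hypothetical, and which checksum belongs to which file is not stated in the listing, so the pairing must be confirmed on the record page.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file against a checksum given as 'md5:<hex>' or bare hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum("BIRADS_radiology_reports.csv", "md5:5c86e1318d4cae3d260560f77a131e2b")` would check the CSV, assuming the 7.3 MB entry is that file.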
Additional details
Dates
- Collected: 2019-01 (start of the data collection)
- Collected: 2024-08 (end of the data collection)