Published November 27, 2025 | Version v1.0.0
Dataset Open

Mammo-MX dataset: An X-ray mammography dataset for computer-aided diagnosis of breast cancer

  • 1. Science and Engineering Division, University of Guanajuato, Leon Campus, Mexico
  • 2. Unidad de Investigación en Epidemiología Clínica, OOAD Guanajuato Instituto Mexicano del Seguro Social, Mexico

Description

The Mammo-MX database is one of the first comprehensive datasets of Mexican patients. It contains mammography studies conducted during screening campaigns between 2023 and 2024 at the breast clinic in Jalisco and includes 3,368 patients with a total of 13,659 mammographic examinations. All studies are labeled with the radiologist-assigned BI-RADS category, ranging from category 0 (inconclusive studies requiring additional imaging or supplementary examinations) to category 6 (patients with confirmed malignant findings), and with a breast density classification. The database is intended as a resource for developing and testing breast cancer detection systems that can be deployed in hospitals as diagnostic support tools for radiologists.

The project's content is organized within the "Mammo-MX_dataset" folder, which contains all 13,659 mammograms from the 3,368 patients. Inside this folder, six ZIP files labeled B0, B1, B3, B4, B5, and B6 each hold the studies of one BI-RADS category assigned by expert radiologists. Because of the large volume of studies categorized as BI-RADS 2, the B2 category is split into seven separate files to facilitate data management and access.

Each mammogram file is named using a structured format. The name begins with a six-digit patient identifier, followed by a letter indicating breast laterality ("R" for the right breast, "L" for the left) and then the projection view, either craniocaudal (CC) or mediolateral-oblique (MLO). The final two fields give the BI-RADS category (B0-B6) and the breast density level (D1-D4), both determined by experienced radiologists.
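The naming scheme can be decoded programmatically. The following is a minimal sketch, not the authors' tooling. The description gives the order of the fields but not how they are separated, so the pattern below assumes an optional "_", "-" or space between fields and an optional file extension. The names parse_mammo_name and MammoName, and the sample file names in the usage note, are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: <6-digit id> R|L CC|MLO B0-B6 D1-D4, with an optional
# separator ("_", "-" or space) between fields and an optional extension.
# The real separator is not stated in the dataset description.
_NAME_RE = re.compile(
    r"^(?P<patient>\d{6})[ _-]?"
    r"(?P<side>[RL])[ _-]?"
    r"(?P<view>CC|MLO)[ _-]?"
    r"B(?P<birads>[0-6])[ _-]?"
    r"D(?P<density>[1-4])"
    r"(?:\.[A-Za-z0-9]+)?$"
)


class MammoName(NamedTuple):
    patient_id: str   # six-digit identifier, kept as text to preserve leading zeros
    laterality: str   # "R" or "L"
    view: str         # "CC" or "MLO"
    birads: int       # 0 to 6
    density: int      # 1 to 4


def parse_mammo_name(filename: str) -> Optional[MammoName]:
    """Return the decoded fields, or None if the name does not fit the assumed layout."""
    m = _NAME_RE.match(filename)
    if m is None:
        return None
    return MammoName(
        m["patient"], m["side"], m["view"], int(m["birads"]), int(m["density"])
    )
```

For example, parse_mammo_name("000123_R_MLO_B2_D3.dcm") yields patient "000123", right breast, MLO view, BI-RADS 2, density 2 replaced by the parsed value 3 only if the real files follow this layout; names that do not match return None, which makes it easy to spot a wrong separator assumption early.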

All mammograms are stored in DICOM format. In addition to the pixel matrix representation of the image, each file includes metadata such as:

  • Age: Patient’s age
  • Laterality: Right (R) or left (L) breast
  • Projection view: CC or MLO
  • BI-RADS: BI-RADS classification of the breast
  • Breast density: Density classification according to BI-RADS criteria

Technical acquisition parameters are also embedded in the DICOM files, including:

  • Source-to-patient distance (mm): Distance from the X-ray source to the patient
  • Exposure (mAs): Amount of radiation measured in milliampere-seconds
  • Compression force (N): Force applied during mammography, measured in Newtons
  • Relative X-ray exposure: Normalized measure of radiation dose during imaging
  • Entrance dose (mGy): Radiation dose at the breast surface, measured in milligray
  • Exposure time (ms): Duration of exposure during image acquisition, measured in milliseconds
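The fields listed above can be collected from a DICOM header along the following lines. This is a sketch under stated assumptions: the dataset description does not say which DICOM attributes hold each value, so the mapping below uses the standard DICOM keywords that usually carry them. BI-RADS and breast density have no obvious standard attribute and are left out. The function names are hypothetical, and reading a file requires the third-party pydicom package.

```python
# Label -> standard DICOM attribute keyword.
# ASSUMPTION: this mapping is a guess based on the DICOM data dictionary,
# not something confirmed by the dataset description.
FIELD_KEYWORDS = {
    "age": "PatientAge",
    "laterality": "ImageLaterality",
    "view": "ViewPosition",
    "source_to_patient_mm": "DistanceSourceToPatient",
    "exposure_mAs": "Exposure",
    "compression_force_N": "CompressionForce",
    "relative_xray_exposure": "RelativeXRayExposure",
    "entrance_dose_mGy": "EntranceDoseInmGy",
    "exposure_time_ms": "ExposureTime",
}


def summarize_header(ds):
    """Collect the listed fields from ds; attributes that are absent map to None.

    ds may be a pydicom Dataset or any object with a dict-style .get(key).
    """
    return {label: ds.get(keyword) for label, keyword in FIELD_KEYWORDS.items()}


def read_header(path):
    """Read one DICOM file without loading pixel data and summarize its header."""
    import pydicom  # third-party; imported lazily so the rest works without it

    ds = pydicom.dcmread(path, stop_before_pixels=True)
    return summarize_header(ds)
```

Skipping the pixel data keeps a scan over thousands of files fast when only header values are needed. A None in the result is a hint that a given attribute is stored under a different keyword in these files.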

All this information, along with additional technical parameters extracted from the DICOM headers, is compiled and organized in the file Metadata.csv. Further acquisition-related parameters such as X-ray tube voltage (kVp), anode target material, pixel spacing, and other acquisition settings can also be retrieved from the DICOM files, offering deeper insight into the imaging protocol and system configuration.
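Because Metadata.csv gathers the header values in one place, simple summaries do not require opening the DICOM files. The sketch below counts the values of one column; the column names of Metadata.csv are not given in the description, so the column is passed in as a parameter, and both tally_column and the "BI-RADS" column name in the usage note are hypothetical.

```python
import csv
from collections import Counter


def tally_column(csv_path, column):
    """Count how often each value occurs in one column of a CSV file."""
    # "utf-8-sig" also accepts files that start with a byte order mark.
    with open(csv_path, newline="", encoding="utf-8-sig") as fh:
        reader = csv.DictReader(fh)
        if column not in (reader.fieldnames or []):
            raise KeyError(f"column {column!r} not found; available: {reader.fieldnames}")
        return Counter(row[column] for row in reader)
```

A call such as tally_column("Metadata.csv", "BI-RADS") would give the number of rows per category, provided the file has a column with that name. The KeyError lists the actual header so the right name can be picked.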

A related data paper, "An X-ray Mammography Dataset for Computer-Aided Diagnosis of Breast Cancer", is in press and gives a comprehensive description of the dataset: Blanca Olivia Murillo-Ortiz et al. 2025, Mach. Learn.: Sci. Technol., in press, https://doi.org/10.1088/2632-2153/ae275c

Files

Metadata.csv

Files (74.6 GB)

Checksum                              Size
md5:ab939d1fd5151d5962f933d61046a9ab  117.8 MB
md5:5c2ae315be42821d7eb0dd3bd35afa6b  5.8 GB
md5:eaca14492caea6ec6018918bce3f839a  9.2 GB
md5:df93d75854d37b2ff3842bca54023a70  8.1 GB
md5:1b67340453578960fb7ba4a8b4b98384  7.8 GB
md5:dfeb8cd302b7fb5d7ecf032a914b73e8  7.8 GB
md5:78d86860acf7b49f228736d43e7c8b13  7.6 GB
md5:d8fe085de134e54e4a71044cc59c8851  8.2 GB
md5:54b47dd2dcfdda358609f4ae61d9a5f7  6.2 GB
md5:38bf96c5e77ef58dced7fa8a5656d234  3.0 GB
md5:aa6815f7af9b3bd7363fe3afc890a5e8  6.7 GB
md5:42f60e1dbced3f6228b0bc853c2aaf32  3.7 GB
md5:2a86030c08556fecb0bbae721803a3a8  401.2 MB
md5:aad9a12b988a684df831ec1338ed616f  1.2 MB
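With downloads of several gigabytes each, it is worth comparing every file against its listed md5 value before unpacking. The following sketch uses only the Python standard library; the function names are hypothetical, and the file is read in chunks so that a multi-gigabyte archive does not have to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as "md5:ab93...a9ab"."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A mismatch usually means an interrupted or corrupted download, in which case fetching the file again is the simplest fix.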

Additional details

Additional titles

Subtitle
An X-ray mammography dataset for computer-aided diagnosis of breast cancer

Related works

Is described by
Journal article: 10.1088/2632-2153/ae275c (DOI)

Funding

Mexican Social Security Institute
Call for cross-cutting research networks from the Mexican Social Security Institute 2025-16-4

Dates

Created
2025-11-27