BREAST MASSES DATASET WITH PRECISELY ANNOTATED SEQUENTIAL MAMMOGRAMS

General Information

This dataset consists of 100 pairs of mammograms from two temporally sequential screening rounds. Specifically, it includes the prior and recent mammograms, with two mammographic views, for each patient. It is a complete dataset for the detection and classification of breast masses using sequential mammograms. It contains normal (BI-RADS 1), benign (BI-RADS 2), and biopsy-confirmed malignant (BI-RADS 6) cases. For each mammogram, an image with precise annotation of each individual mass, made by two expert radiologists, is provided.

Description

This dataset includes 100 pairs of full-field mammograms from various local hospitals. Women (40 to 81 years of age) with either no masses (normal population), BI-RADS benign masses (benign population), or malignant masses (malignant population) in their recent mammograms were randomly selected. A normal or BI-RADS benign prior mammogram (average interval of 2.4 years) is included for each patient. The collection of the data was approved by the appropriate Institutional Review Board (Cyprus National Bioethics Committee #ΕΕΒΚ ΕΠ 2020.01.144), and informed consent was collected retrospectively.

For every participant, two mammographic views of the breast, the Cranio-Caudal (CC) and the Medio-Lateral Oblique (MLO), from two sequential screening rounds are included (a total of 400 images). Two clinicians (a radiologist with 27 years of experience and a consultant breast surgeon with 10 years of experience) identified the eligible patients according to specific criteria. Two radiologists (with 6 and 5 years of experience) marked the location of each mass (BI-RADS benign or suspicious). Subsequently, suspicious cases were biopsied, and histopathologic analysis confirmed their malignant nature.
Fifty percent of the population had no findings or only BI-RADS benign findings in both the first round of screening and the most recent mammogram (35 with no visible masses, the normal population, and 15 with only BI-RADS benign masses, the benign population). The remaining 50% of the patients had normal or BI-RADS benign priors, but at least one new biopsy-confirmed malignant mass in the most recent mammographic views. The size of each mammographic view is 4096 x 3328 pixels, in 8-bit DICOM format.

Dataset

In total, there are two items that should be downloaded: (1) the zip folder 'Dataset.zip' containing the data, and (2) the Excel file 'Description.xlx' with all the necessary information for each patient. Below are more details about each item.

1) Description.xlx

This Excel file contains the important information for each patient, such as age, BI-RADS breast density, BI-RADS classification, etc. It includes four sheets:

a) 'General_Info', which contains a table describing the overall characteristics of the population and mammographic examinations.

b) 'Normal_cases', which contains the information for each normal case: the folder number (1-35; each folder contains the complete data of one patient, so 35 normal cases correspond to 35 folders), which breast is included (Right/Left), age (at the time of the recent mammogram), BI-RADS category for breast density, BI-RADS category for classification, years of the available mammograms, and the years between screenings.

c) 'Benign_cases', which contains the information for each benign case: the folder number (1-15; each folder contains the complete data of one patient, so 15 benign cases correspond to 15 folders), which breast is included (Right/Left), age (at the time of the recent mammogram), BI-RADS category for breast density, BI-RADS category for classification, years of the available mammograms, and the years between screenings.
d) 'Malignant_cases', which contains the information for each malignant case: the folder number (1-50; each folder contains the complete data of one patient, so 50 malignant cases correspond to 50 folders), which breast is included (Right/Left), age (at the time of the recent mammogram), BI-RADS category for breast density, BI-RADS category for classification, dates of the available mammograms, the years between screenings, and the biopsy results.

2) Dataset.zip

This zip folder contains all the mammograms along with the ground truth images for each patient. It is divided into three sub-folders: 'Normal_cases', 'Benign_cases', and 'Malignant_cases'.

a) Normal_cases

This folder contains the 35 normal cases. Each patient has their own folder (35 patients, 35 folders), with the name of the folder indicating the patient number (folder ‘1’ has all the information for patient 1). Inside each folder there are 8 files: the four mammograms in DICOM format (CC_prior, CC_recent, MLO_prior, and MLO_recent), and the corresponding ground truth image for each mammogram in .jpg format (CC_prior_GT, CC_recent_GT, MLO_prior_GT, and MLO_recent_GT). In these cases, the ground truth images are simply the .jpg versions of the DICOM files, since there are no masses.

b) Benign_cases

This folder contains the 15 BI-RADS benign cases. Each patient has their own folder (15 patients, 15 folders), with the name of the folder indicating the patient number (folder ‘1’ has all the information for patient 1). Inside each folder there are 8 files: the four mammograms in DICOM format (CC_prior, CC_recent, MLO_prior, and MLO_recent), and the corresponding ground truth image for each mammogram in .jpg format (CC_prior_GT, CC_recent_GT, MLO_prior_GT, and MLO_recent_GT). The ground truth images were annotated by the radiologists. The BI-RADS benign masses are outlined in blue.

c) Malignant_cases

This folder contains the 50 malignant cases.
Each patient has their own folder (50 patients, 50 folders), with the name of the folder indicating the patient number (folder ‘1’ has all the information for patient 1). Inside each folder there are 8 files: the four mammograms in DICOM format (CC_prior, CC_recent, MLO_prior, and MLO_recent), and the corresponding ground truth image for each mammogram in .jpg format (CC_prior_GT, CC_recent_GT, MLO_prior_GT, and MLO_recent_GT). The ground truth images were annotated by the radiologists. The BI-RADS benign masses are outlined in blue and the suspicious masses in red. When there is more than one suspicious mass, a green bounding box indicates the suspicious mass that was biopsy-confirmed as malignant.

Name description

‘CC_prior’ – prior CC mammographic view in DICOM format
‘CC_prior_GT’ – prior CC ground truth image in .jpg format
‘CC_recent’ – recent CC mammographic view in DICOM format
‘CC_recent_GT’ – recent CC ground truth image in .jpg format
‘MLO_prior’ – prior MLO mammographic view in DICOM format
‘MLO_prior_GT’ – prior MLO ground truth image in .jpg format
‘MLO_recent’ – recent MLO mammographic view in DICOM format
‘MLO_recent_GT’ – recent MLO ground truth image in .jpg format

ACKNOWLEDGMENT

The publication of this paper is supported by the European Union’s Horizon 2020 research and innovation programme under grant agreement No 739551 (KIOS CoE) and the Government of the Republic of Cyprus through the Cyprus Deputy Ministry of Research, Innovation and Digital Policy.
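
Example: enumerating the expected files

The folder layout and file names described under 'Dataset.zip' can be expressed as a small Python sketch that builds the eight expected paths for a patient folder. This is only an illustration, not part of the dataset. The helper names (expected_files, all_mammograms), the extraction root 'Dataset', and the '.dcm' extension are assumptions: the description gives the file names and states that the mammograms are in DICOM format, but it does not state the extension of the DICOM files.

```python
from pathlib import Path

# Sub-folder names and case counts as stated in this description.
CASE_COUNTS = {"Normal_cases": 35, "Benign_cases": 15, "Malignant_cases": 50}
VIEWS = ("CC", "MLO")
ROUNDS = ("prior", "recent")


def expected_files(root, category, patient, dicom_ext=".dcm"):
    """Return the 8 paths expected in one patient folder.

    dicom_ext is an assumption: the description does not state the
    extension of the DICOM files, so adjust it to match the archive.
    """
    if category not in CASE_COUNTS:
        raise ValueError(f"unknown category: {category}")
    if not 1 <= patient <= CASE_COUNTS[category]:
        raise ValueError(f"patient {patient} out of range for {category}")
    folder = Path(root) / category / str(patient)
    paths = []
    for view in VIEWS:
        for screening_round in ROUNDS:
            stem = f"{view}_{screening_round}"
            paths.append(folder / f"{stem}{dicom_ext}")  # mammogram (DICOM)
            paths.append(folder / f"{stem}_GT.jpg")      # ground truth image
    return paths


def all_mammograms(root, dicom_ext=".dcm"):
    """List the expected DICOM path of every mammogram (no ground truth)."""
    return [
        path
        for category, count in CASE_COUNTS.items()
        for patient in range(1, count + 1)
        for path in expected_files(root, category, patient, dicom_ext)
        if not path.name.endswith("_GT.jpg")
    ]
```

Calling all_mammograms("Dataset") should yield 400 paths (100 patients, 4 mammograms each), matching the total stated in the Description. The functions only construct paths. Checking that the files exist (Path.exists) or reading the pixel data with a DICOM reader such as pydicom is left to the user.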