Published December 15, 2023 | Version COCO
Dataset Open

ARCADE: Automatic Region-based Coronary Artery Disease diagnostics using x-ray angiography imagEs Dataset

Description

ARCADE: Automatic Region-based Coronary Artery Disease diagnostics using x-ray angiography imagEs Dataset Phase 2 consists of two folders, each containing 300 images together with their annotations.

ARCADE: Automatic Region-based Coronary Artery Disease diagnostics using x-ray angiography imagEs Dataset Phase 1 consists of two datasets of X-ray coronary angiography (XCA) images, one for each of the two tasks of the ARCADE challenge. The first task includes a total of 1200 coronary vessel tree images, divided into train (1000) and validation (200) groups. The training images are accompanied by annotations depicting the division of the heart into 26 different regions based on the Syntax Score methodology [1]. Similarly, the second task includes a different set of 1200 images, split in the same train-val proportion, with annotated regions containing atherosclerotic plaques. This dataset, carefully annotated by medical experts, enables scientists to actively contribute towards the advancement of an automated risk assessment system for patients with CAD.

The dataset structure is as follows. The top-level directories "syntax" and "stenosis" contain the files for the two dataset objectives, namely: i) vessel branch classification according to the SYNTAX methodology; and ii) stenosis detection. Each of these directories holds 3 subsets of the dataset: "train", "val", and "test". Each subset folder in turn contains 2 lower-level directories, "images" and "annotations". The "images" folder holds images in ".png" format, extracted from DICOM recordings. The "annotations" folder holds a single ".JSON" file named after its subset, i.e. "train.JSON", "val.JSON", or "test.JSON".
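The layout described above can be sketched as a small path helper. This is a minimal illustration only: the function name `subset_paths` and the root folder name "arcade" are assumptions made for the example, and the extension case of the annotation file on disk may differ from the ".JSON" spelling used in this description.

```python
from pathlib import Path

# Directory names taken from the description above.
OBJECTIVES = ("syntax", "stenosis")
SUBSETS = ("train", "val", "test")


def subset_paths(root, objective, subset):
    """Return (images_dir, annotation_file) for one objective/subset pair.

    `root` is wherever the archive was extracted (hypothetical location).
    The ".JSON" extension follows the description; adjust it if the files
    on disk use a different case.
    """
    if objective not in OBJECTIVES:
        raise ValueError(f"unknown objective: {objective!r}")
    if subset not in SUBSETS:
        raise ValueError(f"unknown subset: {subset!r}")
    base = Path(root) / objective / subset
    return base / "images", base / "annotations" / f"{subset}.JSON"


# Example: enumerate every expected annotation file under an assumed root.
for objective in OBJECTIVES:
    for subset in SUBSETS:
        images_dir, annotation_file = subset_paths("arcade", objective, subset)
        print(images_dir.as_posix(), annotation_file.as_posix())
```

Checking `annotation_file.exists()` before loading is a cheap way to confirm that the extracted archive matches this layout.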

The ".JSON" file contains three top-level fields: "images", "categories", and "annotations". Each entry of the "images" field contains the unique "id" of the image in the dataset, its "width" and "height" in pixels, and the "file_name" sub-field, which contains specific information about the image. Each entry of the "categories" field contains a unique "id" from 1 to 26 and a "name" relating it to the SYNTAX descriptions. Each entry of the "annotations" field contains a unique "id" of the annotation, an "image_id" value relating it to a specific image from the "images" field, and a "category_id" relating it to a specific category from the "categories" field. The "segmentation" sub-field contains the coordinates of the mask edge points in "XYXY" format (alternating x and y values). Bounding box coordinates are given in the "bbox" field in "XYWH" format, where the first 2 values are the x and y coordinates of the left-most and top-most points of the segmentation mask. The width and height of the bounding box are the differences between the right-most and bottom-most points, respectively, and those first two values. Finally, the "area" field provides the area of the bounding box, calculated as the area of a rectangle.
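The relationship between "segmentation", "bbox", and "area" described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' tooling: the function names are hypothetical, the sample annotation is made up rather than taken from the dataset, and the polygon is assumed to be stored as a flat list of alternating x and y values nested inside an outer list, as is usual in COCO-style files.

```python
def bbox_from_segmentation(points):
    """Compute an XYWH box from a flat [x1, y1, x2, y2, ...] point list."""
    xs = points[0::2]  # every even index is an x coordinate
    ys = points[1::2]  # every odd index is a y coordinate
    x_min, y_min = min(xs), min(ys)
    width = max(xs) - x_min    # right-most point minus left-most point
    height = max(ys) - y_min   # bottom-most point minus top-most point
    return [x_min, y_min, width, height]


def bbox_area(bbox):
    """Area of the bounding rectangle, as described for the "area" field."""
    _, _, width, height = bbox
    return width * height


# Made-up annotation entry for illustration only (not real dataset values).
annotation = {
    "id": 1,
    "image_id": 1,
    "category_id": 5,
    "segmentation": [[10, 20, 40, 20, 40, 60, 10, 60]],
}

polygon = annotation["segmentation"][0]  # assumed nesting: list of polygons
bbox = bbox_from_segmentation(polygon)
area = bbox_area(bbox)
print(bbox, area)
```

Comparing the recomputed values against the stored "bbox" and "area" fields of a few annotations is a simple consistency check after loading a file with `json.load`.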

 

The corresponding Dataset Article will be provided later. 

[1] Syntax score segment definitions. https://syntaxscore.org/index.php/tutorial/definitions/14-appendix-i-segment-definitions

Files

Files (451.6 MB)

Name: arcade.zip
Size: 451.6 MB
md5:c5b1973ade06f7dff210f878161e1a76

Additional details

Dates

Available
2023-06-01