Published November 20, 2025 | Version v1.0
Dataset Open

ThrombUS+ Ultrasound Dataset #1 - Compression Ultrasound Videos - Testing Set

  • 1. Athena Research and Innovation Center In Information Communication & Knowledge Technologies
  • 2. Democritus University of Thrace

Description

Venous thromboembolism conditions, including deep vein thrombosis (DVT), are the third most common cause of vascular mortality worldwide after heart attack and stroke. Prompt diagnosis of DVT is essential to decrease the risk of fatal complications. Machine learning (ML) models have emerged as a valuable tool for assisting prompt diagnosis. By performing pixel-wise segmentation in ultrasound videos, these models assess vein compressibility, an indicator of whether DVT is present. Training such models requires an enormous amount of effort to create the ground truth datasets. This dataset therefore aims to foster the development of AI models for DVT detection from ultrasound videos without the need for exhaustive pixel-wise annotations, reducing the burden of manual labeling while still enabling robust and clinically relevant predictions.

The dataset consists of compression ultrasound video scans of lower limbs collected during a multi-center cohort study in European hospitals (Greece, France, Italy and Lithuania [https://clinicaltrials.gov/study/NCT06989255]). Patients suspected of DVT were scanned using conventional ultrasound machines, after giving informed consent, according to a dedicated scanning protocol. All hospitals were granted ethics approval by their respective local ethics committees prior to enrollment and data collection.
 
Compression videos are recorded at four anatomical sites. The first anatomical site is near the inguinal ligament, where the common femoral vein (CFV) and the common femoral artery (CFA) are visualized. The second compression video is recorded a few centimeters distally, where the great saphenous vein (GS) is visible, branching off the CFV. The third compression video is acquired in the middle of the thigh, where the femoral vein (FV) is visible. The video from the fourth anatomical site is recorded below the knee, where compression of the popliteal vein (PV) is recorded. Videos were collected from the left (L), right (R) or both limbs, based on clinical symptoms.
 
All videos were anonymized with open-source software prior to uploading to the eCRF platform. All personal DICOM attributes were removed, the videos were cropped to the desired dimensions, and finally a filename-based tag was added to each DICOM file to represent the anatomical site visualized (CFV, GS, FV, PV) and the laterality of the limb (L, R).
 
All DICOM files and a structured report filled in by the medical expert were stored in the eCRF platform, per patient. The structured report is based on the compression interpretation and diagnosis made by the medical expert for each available video recording. In the structured report, medical experts rate the compressibility of the visualized veins (No (0), Partial (1), Yes (2)) and the presence of thrombosis (Yes (1) or No (0)). The provided dataset was created by combining the filename-based tags with the structured report, which assesses compressibility and the presence of thrombosis. This information can serve as ground truth annotations of the compression ultrasound recordings for training artificial intelligence and machine learning models. In addition, the test.csv file includes the age (years), sex, height (cm), weight (kg) and thigh circumference (cm) of each participant, which can be used as additional model inputs.
 
The .zip archive has the following structure:

Thrombus_ultrasound_dataset_1_testing/
├── testing/
│    ├── file1.dcm
│    ├── file2.dcm
│    └── .....
└── test.csv
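
After extracting the archive, the layout shown above can be sanity-checked against the metadata file. The following is a minimal sketch, not part of the dataset itself. It assumes that the CSV header for the file name column is literally `File Name` and that its values include the `.dcm` extension. Both are assumptions about the file contents, and `check_layout` is a hypothetical helper name.

```python
import csv
from pathlib import Path


def check_layout(root, name_column="File Name"):
    """Compare the DICOM files on disk with the file names listed in test.csv.

    Returns (missing, unlisted): names listed in the CSV that have no file
    under testing/, and files under testing/ that the CSV does not mention.
    The column name and the presence of the .dcm extension in the CSV
    values are assumptions, not documented facts.
    """
    root = Path(root)
    on_disk = {p.name for p in (root / "testing").glob("*.dcm")}
    # utf-8-sig reads plain UTF-8 and also tolerates a leading byte-order mark.
    with open(root / "test.csv", newline="", encoding="utf-8-sig") as handle:
        listed = {row[name_column].strip() for row in csv.DictReader(handle)}
    return sorted(listed - on_disk), sorted(on_disk - listed)
```

Calling `check_layout("Thrombus_ultrasound_dataset_1_testing")` on the extracted folder should return two empty lists if every listed file is present and no extra files exist.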
 
Please refer to https://doi.org/10.5281/zenodo.17659415 for the training dataset.
 

Columns

The columns of test.csv are:

  • File Name - The filename of the anonymized DICOM file containing the compression ultrasound video.
  • SubjectID - The ID of the participant the file belongs to. Multiple files can have the same SubjectID, indicating the same participant.
  • Age - Participant's age (years).
  • Sex - Participant's sex (F for female, M for male).
  • Height - Participant's height (cm).
  • Weight - Participant's weight (kg).
  • Thigh Circumference - Participant's thigh circumference (cm).
  • Limb - The limb the video refers to. L for left limb, R for right limb. [EMPTY]
  • Anatomical Site - The anatomical site at which the veins are visualised. CFV: common femoral vein, GS: great saphenous vein, FV: femoral vein, PV: popliteal vein. [EMPTY]
  • Compressibility - Indicator of vein compressibility. 2: fully, 1: partially, 0: none. [EMPTY]
  • Thrombosis - Indicates the presence of thrombosis. 0: no thrombosis, 1: thrombosis. [EMPTY]
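
The column codes above can be decoded into readable labels when loading the file. The sketch below uses only the Python standard library. It assumes the CSV headers match the column names listed above exactly and that codes are stored as plain integers such as `2`. Both are assumptions. The function names (`decode_row`, `group_by_subject`, `load_records`) are hypothetical. Blank cells, such as those in the columns marked [EMPTY], are mapped to `None`.

```python
import csv

# Code tables taken from the column descriptions above.
LIMB = {"L": "left", "R": "right"}
SITE = {
    "CFV": "common femoral vein",
    "GS": "great saphenous vein",
    "FV": "femoral vein",
    "PV": "popliteal vein",
}
COMPRESSIBILITY = {"0": "none", "1": "partial", "2": "full"}
THROMBOSIS = {"0": "no thrombosis", "1": "thrombosis"}


def decode_row(row):
    """Turn one CSV row (a dict of strings) into a record with readable labels."""

    def cell(name):
        # Blank or missing cells become None.
        value = (row.get(name) or "").strip()
        return value or None

    def number(name):
        value = cell(name)
        return float(value) if value is not None else None

    return {
        "file_name": cell("File Name"),
        "subject_id": cell("SubjectID"),
        "age_years": number("Age"),
        "sex": cell("Sex"),
        "height_cm": number("Height"),
        "weight_kg": number("Weight"),
        "thigh_circumference_cm": number("Thigh Circumference"),
        # dict.get returns None for blank cells and for unrecognised codes.
        "limb": LIMB.get(cell("Limb")),
        "anatomical_site": SITE.get(cell("Anatomical Site")),
        "compressibility": COMPRESSIBILITY.get(cell("Compressibility")),
        "thrombosis": THROMBOSIS.get(cell("Thrombosis")),
    }


def group_by_subject(rows):
    """Group decoded records by SubjectID, since one participant can have several files."""
    groups = {}
    for record in map(decode_row, rows):
        groups.setdefault(record["subject_id"], []).append(record)
    return groups


def load_records(csv_path):
    """Read a metadata CSV and return the decoded records grouped by participant."""
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        return group_by_subject(csv.DictReader(handle))
```

Grouping by SubjectID keeps all videos of one participant together, which matters when results are reported per patient rather than per video.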

Files

Thrombus_ultrasound_dataset_1_testing.zip (2.5 GB)
md5:a9e77c3a8f63b7c93115c38a39ef7c61
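
The MD5 checksum listed above can be used to confirm that the download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `archive_is_intact` are hypothetical, and the local path passed to them is whatever location the archive was saved to.

```python
import hashlib

# Checksum as listed for Thrombus_ultrasound_dataset_1_testing.zip.
EXPECTED_MD5 = "a9e77c3a8f63b7c93115c38a39ef7c61"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for a multi-gigabyte archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

For example, `archive_is_intact("Thrombus_ultrasound_dataset_1_testing.zip")` returns `False` if the file was truncated or altered during download.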

Additional details

Funding

European Commission
ThrombUS - ThrombUS+: Wearable Continuous Point-of-Care Monitoring, Risk Estimation and Prevention for Deep Vein Thrombosis 101137227