POLARIS: A High-contrast Polarimetric Imaging Benchmark Dataset for Exoplanetary Disk Representation Learning
Description
The POLARIS dataset is built from a decade of polarimetric observations (2014–2024) conducted with the SPHERE instrument on the Very Large Telescope (VLT). Specifically, it includes all public polarized light observations obtained using the IRDIS instrument, retrieved from the ESO Science Archive. These raw observations were uniformly preprocessed using a modified version of the IRDAP pipeline to generate high-quality Polarimetric Differential Imaging (PDI) products.
The dataset consists of the following components:
-
96 labeled PDI-postprocessed polarimetric images (1024 × 1024 pixels), each annotated as either a target (with circumstellar disk structures) or a reference (with no detectable disk structures). This subset is approximately 3.18 GB in size.
-
813 unlabeled PDI-postprocessed polarimetric images (2014–2023), each derived from sequences of preprocessed exposures in total intensity light. This component occupies approximately GB. The PDI-postprocessed polarimetric images for 2024 will be added in a forthcoming version, bringing the total to 921 unlabeled polarized images.
-
206 RDI preprocessed exposure sequences for downstream imputation, each corresponding to a labeled reference and composed of the original preprocessed exposures in total intensity light. The data are organized by year, and each archive file is named after the year it covers. Each sequence contains 4n images (where n is the number of exposure cycles) at a resolution of 1024 × 1024 pixels per frame. This component totals approximately 38 GB and spans 2014–2024.
- All preprocessed exposure sequences, spanning 2014–2024, with 4n images per sequence (where n is the number of exposure cycles) and a resolution of 1024 × 1024 pixels per frame. Because of its large volume (exceeding 300 GB), this component is hosted at the following permanent Dropbox link: Dropbox (POLARIS exposures), https://www.dropbox.com/scl/fo/5bgfwb7d5gozo6k6pl5rc/AGJkkHLPGUIVlDRXssSAZpQ?rlkey=j0vm2xkbl0imgfvbcuba8e6lr&st=6ij3n0wz&dl=0
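As a rough illustration of the 4n-frame layout described for the exposure sequences, the sketch below regroups a stack of frames into n cycles of four. It assumes the frames are stored cycle by cycle, which the description does not state; the function name `split_into_cycles` and the tiny synthetic stack are hypothetical and stand in for real data.

```python
import numpy as np

FRAMES_PER_CYCLE = 4  # from the description: each sequence holds 4n frames


def split_into_cycles(frames: np.ndarray) -> np.ndarray:
    """Reshape a (4n, H, W) stack into (n, 4, H, W).

    Assumption: frames are ordered cycle by cycle. Check the actual
    ordering in the FITS headers before relying on this grouping.
    """
    if frames.ndim != 3:
        raise ValueError(f"expected a 3-D stack (frames, H, W), got {frames.shape}")
    num_frames, height, width = frames.shape
    if num_frames == 0 or num_frames % FRAMES_PER_CYCLE != 0:
        raise ValueError(
            f"frame count {num_frames} is not a positive multiple of {FRAMES_PER_CYCLE}"
        )
    num_cycles = num_frames // FRAMES_PER_CYCLE
    return frames.reshape(num_cycles, FRAMES_PER_CYCLE, height, width)


# Synthetic stand-in for a sequence with n = 3 cycles (small frames for brevity).
demo_stack = np.zeros((12, 8, 8), dtype=np.float32)
print(split_into_cycles(demo_stack).shape)  # (3, 4, 8, 8)
```

Rejecting stacks whose length is not a multiple of four is a cheap way to catch truncated or partially downloaded sequences early.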
All files are provided in standard .fits format, following astronomical data conventions. The labeled PDI images support supervised learning tasks such as classification or domain adaptation, while the exposure sequences and unlabeled samples enable studies in imputation, denoising, self-supervised learning, or contrastive representation learning. The dataset will continue to expand as additional SPHERE observations are released to the public.
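One possible way to read a .fits file into an array is with the `astropy.io.fits` module, sketched below. The helper `load_fits_array`, the choice of HDU index 0, and the synthetic file name are assumptions made for illustration; the HDU layout of the POLARIS files is not specified here, so inspect a real file with `fits.info(path)` first.

```python
import os
import tempfile

import numpy as np
from astropy.io import fits


def load_fits_array(path: str, hdu_index: int = 0) -> np.ndarray:
    """Return the data of one HDU as a native float32 array.

    hdu_index=0 (the primary HDU) is an assumption, not a documented
    property of the POLARIS files.
    """
    with fits.open(path) as hdul:
        data = hdul[hdu_index].data
        if data is None:
            raise ValueError(f"HDU {hdu_index} of {path} holds no data")
        # Copy so the array remains valid after the file is closed.
        return np.array(data, dtype=np.float32)


# Round trip on a small synthetic image, so the sketch runs without the dataset.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "synthetic_pdi.fits")  # hypothetical name
    fits.PrimaryHDU(np.ones((16, 16), dtype=np.float32)).writeto(demo_path)
    image = load_fits_array(demo_path)
    print(image.shape, image.dtype)  # (16, 16) float32
```

Casting to float32 on load keeps memory use predictable when many 1024 × 1024 frames are batched for training.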
Files
POLARIS_no_disk_ref_2017.zip
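Since the archives are named by year, a small helper can recover the year from a file name and list the FITS members of a downloaded archive. This is a minimal sketch using only the Python standard library; it assumes every archive name ends in `_<year>.zip`, as the single listed file does, and the function names are hypothetical.

```python
import re
import zipfile
from typing import List, Optional

# Assumed naming pattern, inferred from "POLARIS_no_disk_ref_2017.zip".
_YEAR_SUFFIX = re.compile(r"_(\d{4})\.zip$")


def archive_year(filename: str) -> Optional[int]:
    """Return the four-digit year at the end of an archive name, or None."""
    match = _YEAR_SUFFIX.search(filename)
    return int(match.group(1)) if match else None


def list_fits_members(zip_path: str) -> List[str]:
    """List the FITS files inside an archive without extracting it."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if name.lower().endswith((".fits", ".fits.gz"))
        )


print(archive_year("POLARIS_no_disk_ref_2017.zip"))  # 2017
```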
Additional details
Dates
- Created: 2025-05-15