There is a newer version of the record available.

Published May 15, 2025 | Version 1.0
Dataset Open

POLARIS: A High-contrast Polarimetric Imaging Benchmark Dataset for Exoplanetary Disk Representation Learning

Description

The POLARIS dataset is built from a decade of polarimetric observations (2014–2024) conducted with the SPHERE instrument on the Very Large Telescope (VLT). Specifically, it includes all public polarized-light observations obtained with SPHERE's IRDIS subsystem, retrieved from the ESO Science Archive. These raw observations were uniformly preprocessed using a modified version of the IRDAP pipeline to generate high-quality Polarimetric Differential Imaging (PDI) products.

The dataset consists of four main components:

  1. 96 labeled PDI-postprocessed polarimetric images (1024 × 1024 pixels), each annotated as either a target (with circumstellar disk structures) or a reference (with no detectable disk structures). This subset is approximately 3.18 GB in size.

  2. 813 unlabeled PDI-postprocessed polarimetric images (2014–2023), each derived from sequences of preprocessed exposures in total intensity light. This component occupies approximately GB. The PDI-postprocessed polarimetric images for 2024 will be added in a new version. In total, there will then be 921 unlabeled polarimetric images.

  3. 206 RDI preprocessed exposure sequences used for downstream imputation, each corresponding to a labeled reference and composed of the original preprocessed exposures in total intensity light. The data are organized by year, with each archive file named according to its corresponding year. Each sequence contains 4n images (where n is the number of exposure cycles), with a resolution of 1024 × 1024 pixels per frame. This component totals approximately 38 GB (2014–2024).

  4. All preprocessed exposure sequences, spanning 2014–2024. Each sequence consists of 4n images (where n is the number of exposure cycles), with each frame at a resolution of 1024 × 1024 pixels. Because of its large volume (exceeding 300 GB), this component is hosted via the following permanent Dropbox link (POLARIS exposures): https://www.dropbox.com/scl/fo/5bgfwb7d5gozo6k6pl5rc/AGJkkHLPGUIVlDRXssSAZpQ?rlkey=j0vm2xkbl0imgfvbcuba8e6lr&st=6ij3n0wz&dl=0
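The "4n images per sequence" layout described for the exposure sequences can be illustrated with a short sketch. It assumes that frames are stored cycle by cycle in acquisition order and that pixels are 32-bit floats; neither is stated in the record, and the function names are hypothetical.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")

FRAMES_PER_CYCLE = 4  # from the "4n images per sequence" description
FRAME_SIDE = 1024     # frames are 1024 x 1024 pixels


def split_into_cycles(frames: Sequence[Frame]) -> List[Sequence[Frame]]:
    """Split a sequence of 4n frames into n consecutive groups of 4.

    Assumption: frames are stored cycle by cycle, in acquisition order.
    """
    if len(frames) % FRAMES_PER_CYCLE != 0:
        raise ValueError(
            f"expected a multiple of {FRAMES_PER_CYCLE} frames, got {len(frames)}"
        )
    return [
        frames[i:i + FRAMES_PER_CYCLE]
        for i in range(0, len(frames), FRAMES_PER_CYCLE)
    ]


def sequence_size_bytes(n_cycles: int, bytes_per_pixel: int = 4) -> int:
    """Rough pixel-data size of one sequence (headers excluded).

    Assumption: 4 bytes per pixel, i.e. 32-bit floats.
    """
    return FRAMES_PER_CYCLE * n_cycles * FRAME_SIDE * FRAME_SIDE * bytes_per_pixel
```

Under these assumptions each exposure cycle contributes 16 MiB of pixel data, which gives a quick way to estimate how many cycles fit in memory at once.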

All files are provided in standard .fits format, following astronomical data conventions. The labeled PDI images support supervised learning tasks such as classification or domain adaptation, while the exposure sequences and unlabeled samples enable studies in imputation, denoising, self-supervised learning, or contrastive representation learning. The dataset will continue to expand as additional SPHERE observations are released to the public.
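Since all files are in .fits format, it can be useful to inspect a file's primary header before loading the full image. In practice `astropy.io.fits` is the usual tool for this; the dependency-free sketch below only relies on the fixed FITS layout (80-character header cards in 2880-byte blocks, terminated by an END card). The function name is hypothetical, and the keywords actually present in POLARIS files are not documented in this record.

```python
from typing import Dict

CARD_LEN = 80     # each FITS header card is 80 ASCII characters
BLOCK_LEN = 2880  # headers are padded to multiples of 2880 bytes


def read_primary_header(path: str) -> Dict[str, str]:
    """Return a keyword -> raw value string mapping for the primary header.

    Simplification: everything after the first "/" is treated as a comment,
    so quoted string values that contain "/" are not handled correctly.
    """
    header: Dict[str, str] = {}
    with open(path, "rb") as fh:
        while True:
            block = fh.read(BLOCK_LEN)
            if len(block) < BLOCK_LEN:
                raise ValueError("truncated FITS header (no END card found)")
            for start in range(0, BLOCK_LEN, CARD_LEN):
                card = block[start:start + CARD_LEN].decode("ascii")
                keyword = card[:8].strip()
                if keyword == "END":
                    return header
                # Value cards carry "= " in columns 9-10; skip COMMENT etc.
                if card[8:10] == "= ":
                    header[keyword] = card[10:].split("/", 1)[0].strip()
```

For a 1024 × 1024 image one would expect `NAXIS1` and `NAXIS2` to both read `1024`, which makes this a cheap sanity check before committing to a full load.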

Files

POLARIS_no_disk_ref_2017.zip

Files (5.4 GB)

md5:2a8f2846c6aa73632159d2b185307a3d  1.6 GB
md5:7a39862732fc20f969ac7ed3503db562  9.7 kB
md5:c4df9b15bd481509c8534386e4f60a70  3.4 GB
md5:8da466514f3d94961b8eb036e2b8fff3  403.1 MB
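The MD5 checksums in the file listing can be used to confirm that a multi-gigabyte download arrived intact. The following is a minimal sketch using Python's standard `hashlib` module; the function names are hypothetical, and the file is read in chunks so that large archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex MD5 digest of a file, reading 1 MiB at a time."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a listed checksum such as "md5:2a8f...".

    The optional "md5:" prefix used in the file listing is stripped first.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing(local_path, "md5:2a8f2846c6aa73632159d2b185307a3d")` would check the 1.6 GB file, where `local_path` is wherever that archive was saved.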

Additional details

Dates

Created
2025-05-15

Software

Programming language
Python