Published February 28, 2023 | Version v4
Dataset Open

Synthetically generated clouds on ground-based solar observations

  • 1. Université de Toulon, Aix Marseille Univ, CNRS, LIS, Marseille, France
  • 2. Observatoire de Paris/PSL, Paris, France

Description

A dataset of Ca II and H-alpha images taken at the Paris Meudon Observatory. Synthetically generated cloud coverage has been applied to clean images, creating (cloudy, clean) pairs that can be used to train cloud-removal algorithms.

Data description

The Ca II and H-α synthetic datasets comprise 319 and 367 pairs of shadow/shadow-free images respectively, split into 223/96 and 256/111 training/testing pairs.
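The split sizes quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the `SPLITS` dictionary and the `train_fraction` helper are hypothetical names introduced here, and only the counts come from the description. Both splits work out to roughly 70% training and 30% testing.

```python
# Pair counts as stated in the data description.
# The dictionary layout and key names are illustrative assumptions.
SPLITS = {
    "Ca II": {"train": 223, "test": 96, "total": 319},
    "H-alpha": {"train": 256, "test": 111, "total": 367},
}


def train_fraction(split):
    """Return the fraction of pairs used for training, after a consistency check."""
    # The training and testing counts should add up to the stated total.
    if split["train"] + split["test"] != split["total"]:
        raise ValueError("train + test does not match the stated total")
    return split["train"] / split["total"]


if __name__ == "__main__":
    for name, split in SPLITS.items():
        print(f"{name}: {train_fraction(split):.1%} of pairs used for training")
```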

Listed here are two zip archives:

  1. filament-bounding-boxes.zip -- bounding boxes of filaments that were used to compute the patched metrics.
  2. synthetic-clouds.zip -- the cloudy input/clean output images that are used to train machine learning algorithms.

A PyTorch dataset class is available that handles downloading, importing, and using this dataset. The code is in the GitHub repository: https://github.com/jaypmorgan/cloud-removal

Pre-processing routines

To generate this set of data, we have applied a series of pre-processing routines. These are:

  1. Correct determination of the solar limb (source code can be found at: https://gitlab.lis-lab.fr/presage/solar-limb-detection).
  2. Scaling the solar disk to 420 pixels, and centring it at 511.5 pixels in the x and y dimensions.
  3. Setting background values outside the solar disk to 0.
  4. Normalising the disk intensity values into the range of 0-1.
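Steps 2 to 4 above could be sketched with NumPy roughly as follows. This is not the authors' implementation. It assumes the limb detection from step 1 has already produced a disk centre (`cx`, `cy`) and radius, that "420 pixels" refers to the disk radius, that the output frame is 1024 x 1024 (suggested by the 511.5 centre), and that normalisation is a min-max rescaling of on-disk values. The function name `standardise_disk` and the nearest-neighbour resampling are illustrative choices.

```python
import numpy as np


def standardise_disk(image, cx, cy, radius,
                     out_size=1024, out_radius=420.0, out_centre=511.5):
    """Rescale and centre the solar disk, zero the background, normalise to 0-1.

    `cx`, `cy`, `radius` describe the detected limb in the input image (pixels).
    The output size and the meaning of "420 pixels" (taken here as the disk
    radius) are assumptions, not details confirmed by the dataset description.
    """
    # Coordinate grids of the output frame (rows are y, columns are x).
    ys, xs = np.mgrid[0:out_size, 0:out_size].astype(np.float64)

    # Step 2: inverse mapping from each output pixel to its source pixel.
    scale = radius / out_radius
    src_x = (xs - out_centre) * scale + cx
    src_y = (ys - out_centre) * scale + cy

    # Nearest-neighbour lookup, clipped to valid source indices.
    ix = np.clip(np.rint(src_x).astype(int), 0, image.shape[1] - 1)
    iy = np.clip(np.rint(src_y).astype(int), 0, image.shape[0] - 1)
    out = image[iy, ix].astype(np.float64)

    # Step 3: set everything outside the solar disk to 0.
    on_disk = (xs - out_centre) ** 2 + (ys - out_centre) ** 2 <= out_radius ** 2
    out[~on_disk] = 0.0

    # Step 4: min-max normalise the on-disk intensities into [0, 1].
    disk = out[on_disk]
    lo, hi = disk.min(), disk.max()
    out[on_disk] = (disk - lo) / (hi - lo) if hi > lo else 0.0
    return out
```

Nearest-neighbour sampling keeps the sketch dependency-free; an interpolating resampler would give smoother results on real observations.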

Files

Files (17.6 GB)

Checksum Size
md5:988950e29c155688b341e622a83faf37 39.6 kB
md5:05017e08da4ae31f7b36c5464aa69b16 8.6 GB
md5:3e94b737def97fe70ba4938205ca64a5 9.0 GB
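The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `verify_download` are hypothetical, and the local file path is whatever name you saved the archive under.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use low for multi-gigabyte archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 matches the expected checksum."""
    # Accept checksums written with or without the "md5:" prefix.
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `verify_download(path, "md5:988950e29c155688b341e622a83faf37")` checks a local copy of the 39.6 kB file against its listed checksum.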