Published June 30, 2020 | Version 3.0.0
Dataset Open

Penobscot Interpretation Dataset

Description

In recent years, machine learning and deep learning algorithms have flourished in applications such as image classification and segmentation and object detection and recognition, among many others. This was possible, in part, because datasets like ImageNet, with more than 14 million labeled images, were created and made publicly available, giving researchers a common ground on which to compare their advances and extend the state of the art. Although interest in machine learning is growing in the geosciences as well, we will only achieve a significant impact in our community if we collaborate to build such a common basis. This is even more difficult in the Oil & Gas industry, where confidentiality and commercial interests often hinder the sharing of datasets with others. In this letter, we present the Penobscot interpretation dataset, our contribution to the development of machine learning in geosciences, and more specifically in seismic interpretation. The Penobscot 3D seismic dataset was acquired on the Scotian Shelf, offshore Nova Scotia, Canada. The data is publicly available and comprises pre- and post-stack data, 5 horizons, and well logs from 2 wells. However, for the dataset to be of practical use for our tasks, we had to reinterpret the seismic data, generating 7 horizons that separate different seismic facies intervals. The interpreted horizons were used to generate more than 100,000 labeled images for inlines and crosslines. To demonstrate the utility of our dataset, we present the results of two experiments.

Dataset contents

  • Crosslines:
    • Classes: 7
    • Slices used for training: 289
    • Records per class for training: 16706
    • Slices used for testing: 192
    • Records per class for testing: 1000
  • Inlines:
    • Classes: 7
    • Slices used for training: 358
    • Records per class for training: 14988
    • Slices used for testing: 238
    • Records per class for testing: 1000
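The counts above can be cross-checked with a little arithmetic. The sketch below is only illustrative: it assumes that the total number of records per orientation is the per-class count multiplied by the number of classes (that is, that the classes are balanced, as the "records per class" wording suggests), and the dictionary keys and the `summarize` helper are names invented for this example, not part of the dataset.

```python
# Counts copied from the "Dataset contents" list above.
# The key names are hypothetical and chosen only for this sketch.
CONTENTS = {
    "crosslines": {
        "classes": 7,
        "train_slices": 289,
        "train_per_class": 16706,
        "test_slices": 192,
        "test_per_class": 1000,
    },
    "inlines": {
        "classes": 7,
        "train_slices": 358,
        "train_per_class": 14988,
        "test_slices": 238,
        "test_per_class": 1000,
    },
}


def summarize(entry):
    """Derive total record counts and the train share of the slices.

    Assumes balanced classes: total records = classes * records per class.
    """
    total_slices = entry["train_slices"] + entry["test_slices"]
    return {
        "train_records": entry["classes"] * entry["train_per_class"],
        "test_records": entry["classes"] * entry["test_per_class"],
        "train_slice_fraction": entry["train_slices"] / total_slices,
    }


for name, entry in CONTENTS.items():
    summary = summarize(entry)
    print(
        f"{name}: {summary['train_records']} training records, "
        f"{summary['test_records']} test records, "
        f"{summary['train_slice_fraction']:.1%} of slices used for training"
    )
```

Under that assumption, each orientation has more than 100,000 training records, which is consistent with the figure quoted in the description, and roughly 60% of the slices in each orientation are used for training.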

Files

dataset-log.txt

Files (2.3 GB)

Size        Checksum
1.9 kB      md5:d1fd8bdc6fa7818401326819c9130387
2.3 GB      md5:e53af020d42f49dba7a1a5988eccc829
36.2 MB     md5:42c104fafbb8e79695ae23527a91ee78
633.3 kB    md5:3595ff238e927171ae6065532c77a7aa
10.6 MB     md5:ed415cd77672fb31a4f17c2b58fd67a3
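
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_listing` and the file name `downloaded_file.bin` are placeholders introduced for this example, since the listing does not show which file name belongs to which checksum.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks.

    Chunked reading keeps memory use low for multi-gigabyte files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a file against a checksum written as 'md5:<hex digest>'."""
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Hypothetical local file name; replace it with the file you downloaded
    # and pair it with the matching checksum from the listing.
    candidate = Path("downloaded_file.bin")
    if candidate.exists():
        ok = matches_listing(candidate, "md5:e53af020d42f49dba7a1a5988eccc829")
        print("checksum OK" if ok else "checksum mismatch")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not meant as a security guarantee.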

Additional details