There is a newer version of the record available.

Published July 31, 2018 | Version 2.0.0
Dataset Open

Penobscot Interpretation Dataset

Description

In recent years, machine learning and deep learning algorithms have flourished in applications such as image classification and segmentation, and object detection and recognition, among many others. This was possible, in part, because datasets like ImageNet, with more than 14 million labeled images, were created and made publicly available, giving researchers a common ground on which to compare their advances and extend the state of the art. Although interest in machine learning is also growing in the geosciences, we will only achieve a significant impact in our community if we collaborate to build such a common basis. This is even more difficult in the Oil & Gas industry, where confidentiality and commercial interests often hinder the sharing of datasets with others. In this letter, we present the Penobscot interpretation dataset, our contribution to the development of machine learning in geosciences, and more specifically in seismic interpretation. The Penobscot 3D seismic dataset was acquired on the Scotian Shelf, offshore Nova Scotia, Canada. The data is publicly available and comprises pre- and post-stack data, 5 horizons, and well logs from 2 wells. However, for the dataset to be of practical use for our tasks, we had to reinterpret the seismic data, generating 7 horizons that separate different seismic facies intervals. The interpreted horizons were used to generate more than 100,000 labeled images for inlines and crosslines. To demonstrate the utility of our dataset, results of two experiments are presented.

Dataset contents

  • Crosslines:
    • Classes: 7
    • Slices used for training: 289
    • Records per class for training: 16706
    • Slices used for testing: 192
    • Records per class for testing: 1000
  • Inlines:
    • Classes: 7
    • Slices used for training: 358
    • Records per class for training: 14988
    • Slices used for testing: 238
    • Records per class for testing: 1000
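
The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming that "records per class" means every one of the 7 classes has exactly that many records, so the total is classes × records per class; the dictionary keys and the `summarize` helper are illustrative names, not part of the dataset.

```python
# Figures copied from the "Dataset contents" list; key names are illustrative.
CONTENTS = {
    "crosslines": {
        "classes": 7,
        "train_slices": 289,
        "train_per_class": 16706,
        "test_slices": 192,
        "test_per_class": 1000,
    },
    "inlines": {
        "classes": 7,
        "train_slices": 358,
        "train_per_class": 14988,
        "test_slices": 238,
        "test_per_class": 1000,
    },
}


def summarize(entry):
    """Derive totals, assuming the same record count for every class."""
    total_slices = entry["train_slices"] + entry["test_slices"]
    return {
        "train_records": entry["classes"] * entry["train_per_class"],
        "test_records": entry["classes"] * entry["test_per_class"],
        "train_slice_fraction": entry["train_slices"] / total_slices,
    }


SUMMARY = {name: summarize(entry) for name, entry in CONTENTS.items()}

for name, stats in SUMMARY.items():
    print(
        f"{name}: {stats['train_records']} training records, "
        f"{stats['test_records']} testing records, "
        f"{stats['train_slice_fraction']:.0%} of slices used for training"
    )
```

Under that assumption, each orientation on its own yields more than 100,000 training records, and roughly 60% of the slices in each orientation are used for training.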

Files

crosslines.zip

Files (2.2 GB)

  • md5:7bbe432052fe41c6009d9437fd0929b8 (892.7 MB)
  • md5:42c104fafbb8e79695ae23527a91ee78 (36.2 MB)
  • md5:0553676ef48879f590378cafc12d165d (898.4 MB)
  • md5:12f142cb33af55c3b447401ebd81aba1 (2.9 MB)
  • md5:8dbd99da742ac9c6f9b63f8c6f925f6d (161.6 MB)
  • md5:955e2f9afb01878df2f71f0074736e42 (167.8 MB)
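
Each file in this listing carries an MD5 checksum, which can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib`; the helper names and the local path `downloaded_file.zip` are hypothetical, and because the listing does not say which checksum belongs to which file name, the pairing in the usage example is only a placeholder.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Hypothetical local path; substitute the file you actually downloaded
    # and the checksum shown next to it on the record page.
    local_copy = Path("downloaded_file.zip")
    listed = "md5:7bbe432052fe41c6009d9437fd0929b8"
    if local_copy.exists():
        status = "OK" if matches_checksum(local_copy, listed) else "MISMATCH"
        print(f"{local_copy}: {status}")
    else:
        print(f"{local_copy} not found; nothing to verify")
```

Reading in chunks keeps memory use flat even for the files close to 900 MB. MD5 is adequate here for detecting accidental corruption, but it is not a safeguard against deliberate tampering.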

Additional details