Presentation Open Access

Looking up the AI maturity curve in E&P; opportunities, challenges and the impact on geoscience work

Eirik Larsen; Stephen Purves; Dimitris Economou; Behzad Alaei

For three decades, machine learning (ML) has been capable of inferring lithology, sedimentary facies, porosity, and fluid saturation as functions of wireline logs. Now, ML is moving from R&D projects into the toolbox of the generalist, transforming the subsurface workflow. In addition to being fueled by algorithmic development, data, and high-performance compute, this transformation is enabled by the emergence of data analytics platforms that facilitate: i) practical use of ML methods by the generalist geoscientist, ii) integration of data analytics with structured data in databases, iii) semi-automated data management, quality control and quality improvement, and iv) tracking of data provenance, enabling reproducible scientific workflows.

On a regional scale, we can now train supervised ML models with well-log data (as features) and data from core and/or from physics models (as target labels). We can efficiently condition large well data sets in order to enable ML prediction of rock and fluid properties at scale. We can measure prediction accuracies using a cross-validation approach with blind testing against all wells in the dataset. The data types we can predict include porosity, permeability, lithology, sedimentary facies, source-rock properties, and fluid saturation, among others.
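One reading of "blind testing against all wells" is leave-one-well-out cross-validation: each well is held out in turn, a model is trained on the remaining wells, and the error is measured on the held-out well. The sketch below illustrates that idea only. The synthetic wells, the function names, and the single-feature least-squares model are assumptions made for illustration; the labels come from the standard density-porosity relation with assumed quartz matrix (2.65 g/cm3) and water (1.0 g/cm3) densities, standing in for a physics-model target.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept.

    A deliberately simple stand-in for whatever regressor is actually used.
    """
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def leave_one_well_out(wells):
    """wells maps well name -> list of (feature, label) samples.

    Returns well name -> mean absolute error when that well is the blind well.
    """
    scores = {}
    for blind in wells:
        # Train on every sample that does not belong to the blind well.
        train = [s for name, samples in wells.items() if name != blind for s in samples]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        errors = [abs(slope * x + intercept - y) for x, y in wells[blind]]
        scores[blind] = mean(errors)
    return scores


def make_synthetic_wells(n_wells=5, n_samples=50, seed=0):
    """Hypothetical data: bulk density (feature) and noisy density porosity (label)."""
    rng = random.Random(seed)
    wells = {}
    for i in range(n_wells):
        samples = []
        for _ in range(n_samples):
            density = rng.uniform(2.0, 2.65)  # g/cm3
            porosity = (2.65 - density) / (2.65 - 1.0) + rng.gauss(0, 0.01)
            samples.append((density, porosity))
        wells[f"well_{i}"] = samples
    return wells


wells = make_synthetic_wells()
scores = leave_one_well_out(wells)
for name, mae in scores.items():
    print(f"{name}: blind-test MAE = {mae:.4f}")
```

Splitting by well rather than by sample keeps neighbouring samples from the same well out of both the training and test sets, so the score reflects performance on a well the model has not seen.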


On a local scale, we can train supervised ML models with partial-stack seismic data (features) and rock- and fluid-property data from wells (labels). We can use deep convolutional neural networks to predict rock- and fluid-property cubes based on upscaled versions of the inferred property logs. Wells within the bounds of 3D surveys can be used for blind cross-validation, allowing network hyperparameters to be tuned and model performance to be assessed.
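The upscaling step mentioned above can be sketched as a block average that resamples a finely sampled property log onto a coarser grid closer to seismic resolution. This is a minimal illustration under stated assumptions: the function name, the 0.5 m log sampling, the 4 m cell size, and the plain arithmetic mean are all hypothetical choices, and the abstract does not say which averaging scheme is used.

```python
from statistics import mean


def upscale_log(depths, values, top, base, step):
    """Block-average a log onto regular cells of thickness `step` between top and base.

    Returns a list of (cell_centre_depth, mean_value); mean_value is None
    for cells that contain no log samples.
    """
    n_cells = int(round((base - top) / step))
    cells = [[] for _ in range(n_cells)]
    for depth, value in zip(depths, values):
        index = int((depth - top) // step)
        if 0 <= index < n_cells:
            cells[index].append(value)
    return [
        (top + (i + 0.5) * step, mean(cell) if cell else None)
        for i, cell in enumerate(cells)
    ]


# Hypothetical porosity log sampled every 0.5 m: alternating 4 m thick beds.
depths = [1000 + 0.5 * i for i in range(40)]
values = [0.10 if (i // 8) % 2 == 0 else 0.30 for i in range(40)]

upscaled = upscale_log(depths, values, top=1000.0, base=1020.0, step=4.0)
for centre, value in upscaled:
    print(f"{centre:7.1f} m  porosity = {value:.2f}")
```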


In order to provide stratigraphic and structural context to the predicted rock- and fluid-property data, we can use automated seismic interpretation techniques to interpret stratigraphic units and faults from seismic data. We use fully convolutional deep networks for fault interpretation, and deep encoder-decoder networks such as SegNet for stratigraphic interpretation. These techniques classify 3D post-stack seismic datasets, achieving a high level of consistency based on a relatively small number of expert-labelled regions.

Since geological interpretation and prediction are typically based on sparse and low-resolution data and are inherently uncertain, we can apply methods such as Bayesian neural networks to determine model uncertainty for automatic seismic interpretation. We can efficiently integrate scenario analysis with ML modelling to construct multiple models based on variations of the input data. So, in addition to applying a data-driven approach, we can now start to make uncertainty analysis fundamental to everything that we do as geoscientists.
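The idea of building multiple models from variations of the input data can be illustrated with a bootstrap ensemble: the training set is resampled many times, one model is fitted per resample, and the spread of the predictions serves as a rough uncertainty measure. This is not a Bayesian neural network; it is a much simpler stand-in chosen to keep the sketch short. The synthetic density-porosity data, the function names, and the least-squares model are assumptions for illustration only.

```python
import random
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def bootstrap_predictions(samples, x_new, n_models=200, seed=0):
    """Fit one model per bootstrap resample and return each model's prediction at x_new."""
    rng = random.Random(seed)
    predictions = []
    while len(predictions) < n_models:
        resample = [rng.choice(samples) for _ in samples]
        xs = [x for x, _ in resample]
        if len(set(xs)) < 2:
            continue  # degenerate resample: a line cannot be fitted
        slope, intercept = fit_line(xs, [y for _, y in resample])
        predictions.append(slope * x_new + intercept)
    return predictions


# Hypothetical sparse data set: 15 samples covering a narrow density range.
data_rng = random.Random(1)
samples = []
for _ in range(15):
    density = data_rng.uniform(2.10, 2.40)  # g/cm3
    porosity = (2.65 - density) / 1.65 + data_rng.gauss(0, 0.02)
    samples.append((density, porosity))

inside = bootstrap_predictions(samples, x_new=2.25)   # within the sampled range
outside = bootstrap_predictions(samples, x_new=2.60)  # extrapolation
spread_inside = stdev(inside)
spread_outside = stdev(outside)
print(f"prediction spread inside data range:  {spread_inside:.4f}")
print(f"prediction spread outside data range: {spread_outside:.4f}")
```

The ensemble spread grows when predicting away from the sampled range, which is the kind of behaviour an uncertainty-aware workflow is meant to expose.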

The rich subsurface data and metadata available in national data repositories and company databases can serve, and are now starting to serve, as resources for training machine learning models at scale. When we can use machine learning technology to build models at scale, using well and seismic data, we can start to piece together the data-driven puzzle needed to define and characterise known and potential hydrocarbon accumulations. We can now begin to: i) leverage machine learning in play screening, using all the available log, core and seismic data, ii) apply machine learning models to reveal missed pay intervals, which have frequently been the inception of successful discoveries, and iii) identify and characterise prospects, discoveries and fields. These tasks typically require multiple ML models applied to multiple data types, and the use of technology to achieve true data-driven integration.


ML technology paired with solid data science practice: i) facilitates integration of data and disciplines, ii) enables geoscientists to exceed current best practice with the ML tools available today, and iii) paves the way to the "new" best practice, which is integrated data science and geoscience. What the future holds is more automation and more reliable data analytics platforms. Geo/data scientists will spend less time on the hands-on data science needed to make systems run, and more time on being creative, searching for and characterising opportunities to build the basis for successful data-driven exploration and production decisions.

Files (11.2 MB)
EAGE ML Workshop Kuala Lumpur 2019.pdf (11.2 MB)


Cite as