Published January 31, 2024 | Version v1
Journal article Open

A Generic Approach to Entity Resolution Mechanisms for Big Data on Real World Match Problems in the Global Oil and Gas Sector

  • 1. Mathematical Science Department, Bauchi State University Gadau, Bauchi State, Nigeria

Description

Abstract:

The global oil and gas industry faces complex challenges. Oil prices are dropping due to OPEC production levels, the US oil boom, and other factors. Many experts believe that oil prices will remain low for years, at an equilibrium of around $40-50 (Blumberg, 2018; Walls and Zheng, 2018; Azar, 2019), although the 2019 oil price is expected to average $65, with a further decline to $62 by 2020 (Amadeo, 2019; Kasim, 2019). In addition, newly commercial resources are extremely expensive to develop, as massive capital investments are required. This research intends to develop a comprehensive entity resolution framework that can search across multiple databases with disparate forms, process large amounts of data very quickly, efficiently resolve multiple entities into one, and find hidden connections without human intervention. Putting in place a system to manage these entities will help to assign resources not only better but also more quickly. Although the necessary information is mostly already available within oil and gas companies, it is spread across different company areas and applications. Entity resolution will help to aggregate these data, identify and exploit connections between entities, and offer holistic, all-in-one information that can help to identify and deal with potential risks. We therefore present an evaluation of existing entity resolution implementations on challenging real-world match tasks. We consider approaches both with and without machine learning for finding a suitable parameterization and combination of similarity functions. In addition to approaches from the research community, we also consider a state-of-the-art commercial entity resolution implementation. Our results indicate significant quality and efficiency differences between the approaches.
We also find that some challenging resolution tasks, such as matching product entities from the OPEC database, are not sufficiently solved by conventional approaches based on the similarity of attribute values.
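
To make the notion of matching by "similarity of attribute values" concrete, the following is a minimal Python sketch of a conventional, non-learning matcher that combines two similarity functions with fixed weights and a decision threshold. It is an illustration only, not the framework or any implementation evaluated in the article: the record fields (`name`, `country`), the weights, the threshold, and all function names are hypothetical assumptions chosen for the example.

```python
from difflib import SequenceMatcher

# Hypothetical parameterization (not taken from the article): attribute
# weights and a match threshold that a learning-based approach would tune.
WEIGHTS = {"name": 0.7, "country": 0.3}
THRESHOLD = 0.75


def string_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], using difflib's ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of whitespace-separated, lowercased tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def record_similarity(r1: dict, r2: dict) -> float:
    """Weighted combination of per-attribute similarities."""
    # For the name, take the better of two similarity functions.
    name_sim = max(
        string_similarity(r1["name"], r2["name"]),
        token_jaccard(r1["name"], r2["name"]),
    )
    # For the country, use exact (case-insensitive) equality.
    country_sim = 1.0 if r1["country"].lower() == r2["country"].lower() else 0.0
    return WEIGHTS["name"] * name_sim + WEIGHTS["country"] * country_sim


def is_match(r1: dict, r2: dict, threshold: float = THRESHOLD) -> bool:
    """Declare two records the same entity if the score reaches the threshold."""
    return record_similarity(r1, r2) >= threshold


if __name__ == "__main__":
    # Made-up example records, for illustration only.
    a = {"name": "Acme Petroleum Ltd", "country": "Nigeria"}
    b = {"name": "ACME Petroleum Limited", "country": "nigeria"}
    c = {"name": "Zeta Refining Co", "country": "Brazil"}
    print(is_match(a, b), is_match(a, c))
```

A matcher of this kind depends entirely on how similar the attribute strings are, which is one reason product-style records with sparse or inconsistent attribute values can remain hard to resolve.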

Files

77-3101-2024.pdf
