Published January 24, 2019 | Version v1
Conference paper Open

Hot Spot Analysis over Big Trajectory Data

Description

Hot spot analysis is the problem of identifying statistically significant spatial clusters in an underlying data set. In this paper, we study hot spot analysis for massive trajectory data of moving objects, which has many real-life applications in different domains, especially in the analysis of vast repositories of historical spatio-temporal traces (cars, vessels, aircraft). To identify hot spots, we propose an approach that relies on the Getis-Ord statistic, which has been used successfully in the past for point data. Since trajectory data is more than just a collection of individual points, we formulate the problem of trajectory hot spot analysis using the Getis-Ord statistic. We propose a parallel and scalable algorithm for this problem, called THS, which provides an exact solution and can operate on very large data sets. Moreover, we introduce an approximate algorithm (aTHS) that avoids exhaustive computation and trades off accuracy for efficiency in a controlled manner. In essence, we provide a method that quantifies the maximum induced error in the approximation in relation to the achieved computational savings. We implement our algorithms in Apache Spark and demonstrate the scalability and efficiency of our approach using a large, historical, real-life trajectory data set of vessels sailing in the Eastern Mediterranean over a period of three years.
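For readers unfamiliar with the Getis-Ord statistic mentioned above, the following is a minimal sketch of the standard Gi* score on a small two-dimensional grid. It is not the THS or aTHS algorithm from the paper: the function name `getis_ord_gi_star`, the `radius` parameter, the binary neighbourhood weights, and the plain 2D grid of cell values are all assumptions made for illustration. The paper works with trajectory data and runs on Apache Spark, neither of which is reflected here.

```python
import math


def getis_ord_gi_star(grid, radius=1):
    """Return the Gi* z-score for every cell of a 2D grid of numeric values.

    Assumption: binary weights, w_ij = 1 if cell j lies within `radius`
    cells of cell i (including i itself), and 0 otherwise.
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)

    # Global mean and (population) standard deviation of the attribute.
    mean = sum(values) / n
    variance = sum(v * v for v in values) / n - mean * mean
    std = math.sqrt(max(variance, 0.0))
    if std == 0:
        # No variation at all: no cell can stand out, so report zeros.
        return [[0.0] * cols for _ in range(rows)]

    scores = []
    for r in range(rows):
        row_scores = []
        for c in range(cols):
            weighted_sum = 0.0  # sum_j w_ij * x_j
            weight_count = 0    # sum_j w_ij (equals sum_j w_ij^2 for 0/1 weights)
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    weighted_sum += grid[rr][cc]
                    weight_count += 1
            numerator = weighted_sum - mean * weight_count
            spread = (n * weight_count - weight_count ** 2) / (n - 1)
            denominator = std * math.sqrt(spread)
            row_scores.append(numerator / denominator if denominator > 0 else 0.0)
        scores.append(row_scores)
    return scores


# Toy input: a 5x5 grid with a dense 3x3 block in the middle.
grid = [
    [0, 0, 0, 0, 0],
    [0, 10, 10, 10, 0],
    [0, 10, 10, 10, 0],
    [0, 10, 10, 10, 0],
    [0, 0, 0, 0, 0],
]
scores = getis_ord_gi_star(grid)
```

On this toy input the centre cell receives the highest positive score and the corner cells receive negative scores. Large positive values indicate candidate hot spots, and the scores are usually read as z-scores when judging statistical significance.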

Files

BIG DATA Nikitopoulos.pdf

535.9 kB
md5:de4272ec11b67975f5283c75fb21b534

Additional details

Funding

MASTER – Multiple ASpects TrajEctoRy management and analysis 777695
European Commission
datACRON – Big Data Analytics for Time Critical Mobility Forecasting 687591
European Commission
Track and Know – Big Data for Mobility Tracking Knowledge Extraction in Urban Areas 780754
European Commission