Scalable Distributed Trajectory Clustering Using Apache Spark

Stefanopoulos, Stamatis; Akasiadis, Charilaos; Pelekis, Nikos; Zissis, Dimitrios

doi:10.5281/zenodo.10475337

Published March 2023 | Version v1

Conference paper Open


1. University of Peloponnese
2. National Centre of Scientific Research Demokritos
3. University of Piraeus
4. University of the Aegean

Trajectory clustering is an important problem in which position data of mobile objects, such as vehicles and vessels, is analyzed to extract knowledge that supports a wide range of management tasks. Recently, a vast increase in the production of data-gathering devices has allowed data to be collected in much larger volumes. This challenges existing clustering algorithms, which by design are not always able to handle large datasets. In particular, TRACLUS, one of the best-known trajectory clustering algorithms, is a generalization of DBSCAN to trajectory line segments. However, because of the iterative approach and the repeated use of a spatial index that it inherits from DBSCAN, TRACLUS’s performance degrades as datasets grow, and it can be extremely slow in some cases. To tackle this shortcoming, we propose a distributed implementation of TRACLUS, built on Apache Spark, that can operate on very large datasets by applying different types of partitioning to the input data. Results from an empirical evaluation on real-world trajectories show that our distributed variant achieves improved runtime performance and clustering efficiency.
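
The abstract does not say which partitioning schemes the paper uses, so the following is only an illustrative sketch of the general idea: split the line segments into spatial partitions, then run a DBSCAN-style clustering independently inside each one. It is written in plain Python rather than Spark, and every name in it (`grid_partition`, `dbscan_segments`, `cluster_partitions`, `cell_size`, `eps`, `min_lns`) is hypothetical. The distance function is a deliberate simplification (distance between segment midpoints), not the TRACLUS segment distance.

```python
from collections import defaultdict
from math import hypot

# A segment is ((x1, y1), (x2, y2)). All names below are hypothetical.

def midpoint(seg):
    (x1, y1), (x2, y2) = seg
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def grid_partition(segments, cell_size):
    """Group segments by the uniform grid cell that contains their midpoint."""
    parts = defaultdict(list)
    for seg in segments:
        mx, my = midpoint(seg)
        parts[(int(mx // cell_size), int(my // cell_size))].append(seg)
    return dict(parts)

def segment_distance(a, b):
    """Simplified stand-in for a segment distance: midpoint-to-midpoint."""
    (ax, ay), (bx, by) = midpoint(a), midpoint(b)
    return hypot(ax - bx, ay - by)

def dbscan_segments(segments, eps, min_lns):
    """DBSCAN over segments. Returns one label per segment; -1 marks noise."""
    labels = [None] * len(segments)

    def neighbours(i):
        # Brute-force neighbourhood query (includes i itself).
        return [j for j in range(len(segments))
                if segment_distance(segments[i], segments[j]) <= eps]

    cluster = 0
    for i in range(len(segments)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_lns:
            labels[i] = -1  # provisional noise; may become a border segment
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core: border segment
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_lns:  # j is a core segment: keep expanding
                queue.extend(nb)
        cluster += 1
    return labels

def cluster_partitions(segments, cell_size, eps, min_lns):
    """Cluster each grid partition independently; labels are local to a cell."""
    return {cell: dbscan_segments(part, eps, min_lns)
            for cell, part in grid_partition(segments, cell_size).items()}
```

In a Spark setting, the per-cell step would typically run as one task per partition. Note that in this sketch a cluster that straddles a cell boundary is split in two, so a real distributed implementation would need overlapping partitions or a merge step, which is not shown here.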

Files

Name	Size
BMDA_2023_paper_4347.pdf (md5:d623d4925e231d066c8a39e5d8805092)	4.1 MB

Additional details

European Commission
VesselAI - ENABLING MARITIME DIGITALIZATION BY EXTREME-SCALE ANALYTICS, AI AND DIGITAL TWINS 957237

