Published November 2, 2016 | Version v1
Conference paper | Open access

BigCAB: Distributed Hot Spot Analysis over Big Spatio-temporal Data using Apache Spark (GIS Cup)

Description

Hot spot analysis is the problem of identifying statistically significant spatial clusters in an underlying data set. In this paper, we target hot spot analysis of massive spatio-temporal data, which calls for a parallel and scalable solution that operates on data distributed over a set of nodes. We propose an algorithm, called BigCAB and implemented in Apache Spark, that solves the problem in a parallel and scalable way. Our experiments on real data representing taxi trips demonstrate both the efficiency and the favorable scaling properties of our algorithm.

Files

Name: BigCAB_GisCup_0.pdf
Size: 284.4 kB
md5: 5621fae4e9d45b478d89bdf6bc33d1c6