What your Fitbit Says about You: De-anonymizing Users in Lifelogging Datasets
Description
Recently, there has been a significant surge in lifelogging experiments, in which the activity of a few participants
is monitored for several days through fitness trackers. Data from such experiments can be aggregated
in datasets and released to the research community. To protect the privacy of the participants, fitness datasets
are typically anonymized by removing personal identifiers such as names, e-mail addresses, etc. However,
although seemingly sound, such straightforward approaches are not sufficient. In this paper, we demonstrate
how an adversary can still de-anonymize individuals in lifelogging datasets. We show that users’ privacy can
be compromised by two approaches: (i) through the inference of physical parameters such as gender, height,
and weight; and/or (ii) via the daily routine of participants. Both methods rely solely on fitness data such as
steps, burned calories, and covered distance to obtain insights on the users in the dataset. We train several
inference models and leverage them to de-anonymize users in public lifelogging datasets. Combining our two
approaches, we achieve a 93.5% re-identification rate among participants. Furthermore, we reach a 100% success rate
for people with highly distinct physical attributes (e.g., very tall, overweight, etc.).
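To illustrate the first approach, the following is a minimal, hypothetical sketch of attribute-based re-identification: it estimates a user's height from average stride length (distance divided by steps, using the common stride ≈ 0.414 × height heuristic) and matches each anonymized record to the known candidate with the closest height. All data, the heuristic, and the matching rule are illustrative assumptions, not the paper's actual models.

```python
# Hypothetical sketch: re-identify anonymized fitness records by matching an
# inferred physical attribute (height) against a list of known candidates.
# The stride-to-height factor 0.414 is a rough walking heuristic, assumed here.

def infer_height_cm(steps, distance_m):
    """Estimate height in cm from average stride length (stride ~ 0.414 * height)."""
    stride_m = distance_m / steps
    return stride_m / 0.414 * 100

def reidentify(anon_records, candidates):
    """Match each anonymized record to the candidate with the closest known height.

    anon_records: {record_id: (avg_daily_steps, avg_daily_distance_m)}
    candidates:   {person_name: known_height_cm}
    """
    matches = {}
    for rec_id, (steps, distance_m) in anon_records.items():
        estimated = infer_height_cm(steps, distance_m)
        best = min(candidates, key=lambda name: abs(candidates[name] - estimated))
        matches[rec_id] = best
    return matches

# Toy example: per-user daily averages (steps, metres covered), all invented.
anon = {"user_a": (9000, 6700), "user_b": (11000, 7500)}
known = {"Alice": 180.0, "Bob": 165.0}  # known heights in cm
print(reidentify(anon, known))  # → {'user_a': 'Alice', 'user_b': 'Bob'}
```

In the paper's setting, this nearest-neighbour step would be replaced by trained inference models and combined with daily-routine matching, but the sketch shows why steps, calories, and distance alone can leak identity.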
Files
kazlouski_SECRYPT_2022.pdf (7.1 MB)
md5:c83356c36936e253b57c0c19f05f715a