Conference paper Open Access

Extracting Provenance Metadata from Privacy Policies

Pandit, Harshvardhan J.; O'Sullivan, Declan; Lewis, Dave

Privacy policies are legal documents that describe activities over personal data such as its collection, usage, processing, sharing, and storage. Expressing this information as provenance metadata can aid legal accountability as well as the modelling of data usage in real-world use cases. In this paper, we describe our early work on the identification, extraction, and representation of provenance information within privacy policies. We discuss the adoption of entity extraction approaches that use concepts and keywords defined by the GDPRtEXT resource, together with the annotated privacy policy corpus from the UsablePrivacy project. We use the previously published GDPRov ontology (an extension of PROV-O) to model the provenance information extracted from privacy policies.
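
As a rough illustration of the idea in the abstract, the following minimal sketch matches keywords against policy sentences and records each match as a PROV-O style activity. It is not the method used in the paper. The keyword lists, the `ex:` identifiers, and the function names are hypothetical and are not taken from GDPRtEXT or GDPRov; only the `prov:Activity`, `prov:Entity`, and `prov:used` terms come from PROV-O.

```python
import re

# Hypothetical keyword lists for illustration only; NOT taken from GDPRtEXT.
ACTIVITY_KEYWORDS = {
    "collection": ["collect", "gather"],
    "usage": ["use"],
    "sharing": ["share", "disclose"],
    "storage": ["store", "retain"],
}


def extract_activities(policy_text):
    """Return (sentence, activity_label) pairs for sentences that match a keyword."""
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", policy_text) if s.strip()]
    matches = []
    for sentence in sentences:
        lowered = sentence.lower()
        for label, keywords in ACTIVITY_KEYWORDS.items():
            # Match a keyword at the start of a word so "collects" also counts.
            if any(re.search(r"\b" + re.escape(k), lowered) for k in keywords):
                matches.append((sentence, label))
    return matches


def to_triples(matches):
    """Represent matches as (subject, predicate, object) tuples using PROV-O terms."""
    triples = []
    if matches:
        # Assumption: all activities act on one generic personal-data entity.
        triples.append(("ex:personalData", "rdf:type", "prov:Entity"))
    for i, (sentence, label) in enumerate(matches):
        activity = f"ex:activity{i}"  # hypothetical identifier scheme
        triples.append((activity, "rdf:type", "prov:Activity"))
        triples.append((activity, "rdfs:label", label))
        triples.append((activity, "prov:used", "ex:personalData"))
    return triples


policy = "We collect your email address. We share it with partners. Contact us anytime."
matches = extract_activities(policy)
triples = to_triples(matches)
```

Plain tuples keep the sketch free of dependencies. A fuller version would presumably emit RDF and replace the generic activity labels with the corresponding GDPRov concepts.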
