Published November 8, 2022 | Version v1
Thesis Open

Web Scraping for a Database of Court Decision Related Documents

  • 1. University of Bern

Contributors

  • 1. University of Bern

Description

Reports of Swiss court rulings are anonymized to protect the privacy of the parties involved. The Swiss national research project “Open Justice vs. Privacy” strives to automate the re-identification of anonymized reports of court rulings using natural language processing. Achieving this goal requires a database of documents related to court decisions. So far, despite the increasing amount and availability of data, no data sets of external documents related to Swiss federal court rulings have been collected and processed. This research project aims to provide such a database, drawing on five promising online data sources. The sources were then analyzed, and the documents, along with the metadata provided, were scraped. After being structured in JavaScript Object Notation (JSON) format, the results were analyzed in the context of the overarching research project and quantitatively evaluated by examining the text similarity between the resulting data and the court rulings.
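
The last two steps described above (structuring a scraped document as JSON and comparing its text with a court ruling) could look roughly like the following minimal sketch. It is an illustration under stated assumptions, not the thesis's implementation: the record fields (`source`, `url`, `title`, `text`), the sample texts, and the choice of bag-of-words cosine similarity are all hypothetical, since the description does not name the schema or the similarity measure used.

```python
import json
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between the bag-of-words count vectors of two texts."""
    counts_a = Counter(tokenize(text_a))
    counts_b = Counter(tokenize(text_b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text has no meaningful similarity
    return dot / (norm_a * norm_b)


# Hypothetical record layout: the field names and values are illustrative only.
document = {
    "source": "example-source",
    "url": "https://example.org/article/1",
    "title": "Placeholder title",
    "text": "The court dismissed the appeal against the cantonal decision.",
}

# Placeholder ruling text, invented for this example.
ruling_text = "The appeal against the cantonal decision is dismissed."

# Serialize the scraped document as JSON and score it against the ruling.
record_json = json.dumps(document, ensure_ascii=False, indent=2)
score = cosine_similarity(document["text"], ruling_text)
```

A word-count cosine measure is used here only because it needs nothing beyond the standard library; any other text similarity measure could be substituted without changing the record structure.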

Files

thesis.pdf

Files (1.0 MB)

md5:81e537a6d386d941350291e9b3572419