Published October 3, 2023 | Version 1
Dataset Open

JusBrasilRec: A large-scale dataset of user sessions for recommendations on the legal domain

  • 1. State University of Maringá and Federal University of Amazonas and Jusbrasil
  • 2. Federal University of Amazonas and Jusbrasil
  • 3. Federal University of Campina Grande
  • 4. Federal University of Amazonas

Description

The proliferation of legal documents in various formats, dispersed across multiple courts, makes it hard for users to find precise matches for their information needs. Despite notable advances in legal information retrieval systems, research into legal recommender systems remains limited. One plausible reason for this scarcity is the lack of large, publicly accessible datasets or benchmarks.

Jusbrasil (https://www.jusbrasil.com.br) is known as the largest legal search portal in Brazil. It provides an online environment where users can find the legal documents that best match their information needs. With millions of user interactions with billions of documents containing different artifacts related to law in Brazil, Jusbrasil serves as a large-scale test bed for advancing research in the still underexplored area of legal recommender systems.

Therefore, we collected and released JusBrasilRec, a dataset containing user sessions from Jusbrasil for recommendations on the legal domain. We also computed and released a TF-IDF matrix built from the textual content of the documents in Jusbrasil. The following files are available for download as part of JusBrasilRec:

  • jusbrasilrec_dataset.zip: a compressed archive containing the user sessions;
  • jusbrasilrec_tfidf_matrix.zip: a compressed archive containing the TF-IDF matrix;
  • readme.txt: a text file explaining the content and format of the previous files.
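
For readers unfamiliar with TF-IDF, the following is a minimal Python sketch of one common weighting variant (relative term frequency multiplied by the logarithm of the inverse document frequency). It is an illustration only: the exact variant, the tokenization, and the storage format of the released matrix are described in readme.txt and may differ. The function name tfidf and the toy corpus are hypothetical and are not taken from the dataset.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: weight = (count / doc length) * ln(N / df),
    where N is the number of documents and df is the number of
    documents that contain the term.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return weights


# Toy corpus (hypothetical, not taken from JusBrasilRec).
docs = [
    ["habeas", "corpus", "appeal"],
    ["appeal", "labor", "court"],
    ["labor", "contract", "appeal"],
]
weights = tfidf(docs)
```

In this toy corpus, a term that occurs in every document, such as "appeal", receives a weight of zero, while a term that occurs in only one document receives the highest weight.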

How to cite the dataset: Marcos Aurélio Domingues, Edleno Silva de Moura, Leandro Balby Marinho and Altigran da Silva. A Large Scale Benchmark for Session-based Recommendations on the Legal Domain. Artificial Intelligence and Law. 2023.

Files

Files (6.2 GB)

Size, MD5 checksum
269.2 MB, md5:e1494e101c2b16f06af4dd068a31892f
6.0 GB, md5:83eb6b38e03797a936b773ed8794c34e
2.2 kB, md5:3610f2be670e09b0ce9ec1d126ef8213
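
The MD5 checksums listed above can be used to check that a download completed without corruption. The following is a minimal Python sketch that uses the standard hashlib module and reads the file in chunks, so that multi-gigabyte archives do not need to fit in memory. The function names md5_of_file and matches_checksum are hypothetical helpers introduced for illustration.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read 1 MiB at a time until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 digest with a listed value such as 'md5:<hex>'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as matches_checksum("readme.txt", "md5:<hex>") returns True when the local file matches; pass the checksum shown for the corresponding file on the record page.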