Published September 27, 2021 | Version 1.0
Dataset | Open Access

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

  • 1. University of Copenhagen
  • 2. Universität Hamburg
  • 3. Bucerius Law School
  • 4. CodeX, Stanford Law School
  • 5. Athens University of Economics and Business
  • 6. Illinois Tech – Chicago Kent College of Law
  • 7. University of Sheffield

Description

This benchmark dataset is published with the article: 

Ilias Chalkidis, Abhik Jana, Dirk Hartung, Michael Bommarito, Ion Androutsopoulos, Daniel Martin Katz, and Nikolaos Aletras. 2021. LexGLUE: A Benchmark Dataset for Legal Language Understanding in English. arXiv preprint.

Short Description

Inspired by the recent widespread use of the GLUE multi-task benchmark NLP dataset (Wang et al., 2018), the subsequent more difficult SuperGLUE (Wang et al., 2019), other previous multi-task NLP benchmarks (Conneau and Kiela, 2018; McCann et al., 2018), and similar initiatives in other domains (Peng et al., 2019), we introduce LexGLUE, a benchmark dataset to evaluate the performance of NLP methods in legal tasks. LexGLUE is based on seven existing legal NLP datasets (a minimal loading sketch follows the list):

  • ECtHR Task A (Chalkidis et al., 2019)
  • ECtHR Task B (Chalkidis et al., 2021a)
  • SCOTUS (Spaeth et al., 2020)
  • EUR-LEX (Chalkidis et al., 2021b)
  • LEDGAR (Tuggener et al., 2020)
  • UNFAIR-ToS (Lippi et al., 2019)
  • CaseHOLD (Zheng et al., 2021)

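For quick experimentation, the seven tasks can also be loaded programmatically. The sketch below assumes the LexGLUE tasks are mirrored on the Hugging Face Hub under the dataset ID "lex_glue", with one configuration per task; the Zenodo record itself ships the raw archives listed under Files.

    # Minimal loading sketch (assumes a Hugging Face Hub mirror named "lex_glue").
    from datasets import load_dataset

    # One configuration per LexGLUE task (names assumed to mirror the list above).
    TASKS = ["ecthr_a", "ecthr_b", "scotus", "eurlex", "ledgar", "unfair_tos", "case_hold"]

    # Load a single task and inspect its splits and the fields of one training example.
    ecthr_a = load_dataset("lex_glue", "ecthr_a")
    print(ecthr_a)                      # splits of the task (e.g. train/validation/test)
    print(ecthr_a["train"][0].keys())   # fields of one example, e.g. "text" and "labels"
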
Files

Total size: 310.2 MB

  • 30.4 MB (md5:46f494c25d83ccdaa9a5057068d2c773)
  • 32.9 MB (md5:b74661e9470b5dcd5edc9c0b814be064)
  • 125.4 MB (md5:f52be1f8cb765186f5fff0c4ea869a30)
  • 16.3 MB (md5:0aed3de1660d212401cb5e4f5bc8bd57)
  • 104.8 MB (md5:557a0f42093c31ad58bd7bf559a22878)
  • 511.3 kB (md5:6e2e50a770b6aafe8ffe813a42aa9955)
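A downloaded archive can be checked against the md5 checksums listed above. This is a minimal sketch: the file name "archive.zip" is a hypothetical placeholder, since the listing shows checksums and sizes but not file names.

    import hashlib

    # First checksum from the file listing above.
    EXPECTED_MD5 = "46f494c25d83ccdaa9a5057068d2c773"

    def md5sum(path, chunk_size=1 << 20):
        """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
        digest = hashlib.md5()
        with open(path, "rb") as fh:
            for chunk in iter(lambda: fh.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # "archive.zip" is a placeholder for whichever archive was downloaded.
    if md5sum("archive.zip") == EXPECTED_MD5:
        print("Checksum OK")
    else:
        print("Checksum mismatch: re-download the file")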