There is a newer version of the record available.

Published December 28, 2021 | Version v2
Dataset Open

Dataset and additional files/softwares required for the paper "LeSICiN: A Heterogeneous Graph-based Approach for Automatic Legal Statute Identification from Indian Legal Documents"

  • 1. Indian Institute of Technology, Kharagpur

Description

This dump contains all files and software required to run the code for the paper "LeSICiN: A Heterogeneous Graph-based Approach for Automatic Legal Statute Identification from Indian Legal Documents". The code itself is available at https://github.com/Law-AI/LeSICiN.

LeSICiN is a deep neural network for the task of Legal Statute Identification that also uses the graph properties of the document-statute citation network for training and prediction.

There are three datasets: train, dev and test. Each is a .jsonl file with one instance dict per line; each instance dict contains the unique id, the list of sentences and the cited labels of that instance. A fourth file, secs.jsonl, stores the text of all the statutes in a similar format.
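As a minimal sketch of how such files can be read, the snippet below parses a .jsonl file line by line and collects the keys that actually occur in the instance dicts. The key names are not stated in this description, so the sketch discovers them instead of assuming them; the helper names `read_jsonl` and `field_names` are hypothetical and not part of the LeSICiN code.

```python
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path: str) -> Iterator[dict]:
    """Yield one instance dict per non-empty line of a .jsonl file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def field_names(path: str, limit: int = 100) -> set:
    """Collect the keys seen in the first `limit` instances.

    Useful for checking the real schema before relying on any key name.
    """
    keys = set()
    for index, instance in enumerate(read_jsonl(path)):
        if index >= limit:
            break
        keys.update(instance.keys())
    return keys
```

For example, calling `field_names` on the downloaded train file (or on secs.jsonl) shows which keys hold the id, the sentences and the labels, without loading the whole file into memory.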

schemas.json lists the metapath schemas for fact and section type nodes, while type_map.json maps the id of each node to its type (Act/Chapter/Topic/Section/Fact).
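A quick sanity check on the node-type mapping might look like the sketch below. It assumes that type_map.json holds a single JSON object of the form `{node_id: type_name}`; that layout is an assumption drawn from the description above, and the function names are hypothetical.

```python
import json
from collections import Counter


def load_type_map(path: str) -> dict:
    """Load the node-id to node-type mapping (assumed to be one JSON object)."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def count_node_types(type_map: dict) -> Counter:
    """Count how many nodes fall under each type (Act, Chapter, Topic, ...)."""
    return Counter(type_map.values())


def nodes_of_type(type_map: dict, node_type: str) -> list:
    """Return the ids of all nodes with the given type, in file order."""
    return [node_id for node_id, kind in type_map.items() if kind == node_type]
```

Running `count_node_types(load_type_map("type_map.json"))` gives a per-type node count that can be compared against the statistics reported in the paper.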

label_tree.json and citation_network.json list the edges for the two parts of the network, with each edge given as a 3-tuple ('source id', 'relationship type', 'target id').
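The edge lists can be turned into an adjacency structure along the following lines. This is a sketch that assumes each file contains a top-level JSON array of 3-element arrays; the actual layout and the relationship type names should be checked against the files, and the helper names here are hypothetical.

```python
import json
from collections import defaultdict


def load_edges(path: str) -> list:
    """Load edges as (source id, relationship type, target id) tuples.

    Assumes the file is a JSON array of 3-element arrays.
    """
    with open(path, encoding="utf-8") as handle:
        raw = json.load(handle)
    return [tuple(edge) for edge in raw]


def build_adjacency(edges: list) -> dict:
    """Map each source id to its outgoing (relationship type, target id) pairs."""
    adjacency = defaultdict(list)
    for source, relation, target in edges:
        adjacency[source].append((relation, target))
    return dict(adjacency)
```

Because both files share the same 3-tuple format, the two edge lists can be concatenated before calling `build_adjacency` to obtain the combined heterogeneous graph.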

"ils2v.bin" is the pretrained sent2vec vectorizer, which generates a 200-dim vector for each sentence.

Files (2.8 GB)

Checksum                              Size
md5:db4e5039f338cda668edaa7433ce9370  40.0 MB
md5:a8f6ac51b3c1b8631483fc8b7d8fd7f8  31.3 MB
md5:237b1776f5a535eb3f079bfd0ea0b200  85.3 MB
md5:59026797f302c05f6b1b15586c1486a1  2.3 GB
md5:910a4c9f55f387d93c060a73f7d9e4b1  36.0 kB
md5:b669b98a334191953aa4832404813796  3.8 kB
md5:b2768e06377ab032df2d8f8c323886ff  46.2 kB
md5:09a554405680830c6d3818320eb055af  106.1 MB
md5:a1994d1aaa8ae7a987fcb1c2e2dc52f1  319.8 MB
md5:23659e56bc1efa72639e2a81d20fc4c7  1.0 MB
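
The md5 checksums listed above can be used to confirm that a download completed without corruption. The sketch below uses Python's standard hashlib module and reads the file in chunks, so the 2.3 GB file does not need to fit in memory; the function names are hypothetical.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path: str, expected: str) -> bool:
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For instance, `matches_checksum(local_path, "md5:59026797f302c05f6b1b15586c1486a1")` should return True for the file listed at 2.3 GB, where `local_path` is wherever that file was saved.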