Presentation Open Access

International Legal Data in Action: Ideas and Applications for the ICJ and PCIJ (FGV Direito Rio Workshop)

Fobbe, Seán



There can be no Data Science without data. However, the availability of legal data sets, particularly corpora in international law, has been rather limited until recently. In this presentation I discuss two new, high-quality international legal data sets for the International Court of Justice (ICJ) and the Permanent Court of International Justice (PCIJ). These corpora collect the full texts and metadata for all majority and minority opinions of both Courts from 1922 to 2022 and are fully compatible with each other. The ICJ corpus is updated twice a year with new data.

I further outline research questions and methods to put this international legal data into action. These include doctrinal analysis, citation analysis, social network analysis and geospatial analysis of full texts and metadata. References to relevant papers provide an introduction to the research problems and methodologies for researchers who wish to tackle them. The presentation closes with a call for adherence to the replication standard through the publication of source code and data.


Download Data Sets


Academic Paper

Fobbe, S. (2022). Introducing Twin Corpora of Decisions for the International Court of Justice (ICJ) and the Permanent Court of International Justice (PCIJ). Journal of Empirical Legal Studies, 19(2), 491-524.



This presentation was delivered online on 16 March 2023 during Panel II of the Fundação Getulio Vargas (FGV) Workshop "Transforming the Role of International Courts and Tribunals in a New Era of Adjudication". The event was streamed live on YouTube and recordings can be viewed here:

Recording of Panel I:

Recording of Panel II:


About the speaker

Seán Fobbe specializes in international law, human rights and legal data science. His practical legal work and academic research focus on international human rights law, cultural rights, humanitarian law and international criminal law, with a particular emphasis on the protection of cultural heritage in Iraq and the prosecution of atrocity crimes committed by Da'esh (also known as ISIS, ISIL or the Islamic State). As a data scientist, his interests lie in natural language processing (NLP) and machine learning in the legal domain, quantitative peace research, as well as data engineering and statistical computing with the R programming language.



                   All versions    This version
Views              209             209
Downloads          3,832           3,832
Data volume        1.3 GB          1.3 GB
Unique views       190             190
Unique downloads   3,823           3,823

