Dataset Open Access

Corpus of Decisions: Permanent Court of International Justice (CD-PCIJ)

Fobbe, Sean



The Corpus of Decisions: Permanent Court of International Justice (CD-PCIJ) collects and presents for the first time in human- and machine-readable formats all documents of PCIJ Series A, B and A/B of the Permanent Court of International Justice (PCIJ). Among these are judgments, advisory opinions, orders, appended minority opinions, annexes, applications instituting proceedings and requests for an advisory opinion. The International Court of Justice, the successor of the PCIJ, has kindly made available these documents on its website.

The Permanent Court of International Justice (PCIJ) was the primary judicial organ of the League of Nations, which existed from 1920 to 1946 and was the ill-fated predecessor of the United Nations. Nonetheless, as the first international court with general thematic jurisdiction, the PCIJ influenced international law in profound ways that are still felt today. Every lawyer who sets out on the path of international law encounters epoch-defining opinions such as the Lotus and Factory at Chorzów decisions, but the Court's lesser-known jurisprudence and the appended minority opinions offer many more ideas and legal principles which are seldom appreciated today.

This data set is designed to be complementary to and fully compatible with the Corpus of Decisions: International Court of Justice (CD-ICJ), which is also available open access.



A peer-reviewed academic paper entitled 'Introducing Twin Corpora of Decisions for the International Court of Justice (ICJ) and the Permanent Court of International Justice (PCIJ)', which describes the construction and relevance of the data set, was published open access in the Journal of Empirical Legal Studies (JELS). It is also available in print at JELS 2022, Vol. 19, No. 2, pp. 491-524.

If you use the data set for academic work, please cite both the JELS paper and the precise version of the data set you used for your analysis.


NEW in Version 1.1.0

  • Full recompilation of data set
  • CHANGELOG and README converted to external markdown files
  • Display of version number on Codebook and Compilation Report title pages fixed; semantic versioning is now displayed correctly
  • The ZIP archive of source files includes the TEX files
  • Config file converted to TOML format
  • All R packages are version-controlled with {renv}
  • Data set creation process cleans up all files from previous runs before a new data set is created
  • Redundant color removed from violin plots



The CD-PCIJ will only be updated if errors are discovered, enhancements are developed or in the unlikely event that the Court publishes additional documents within the collection ambit of the data set (PCIJ Series A, B and A/B).

Notifications regarding new and updated data sets will be published on my academic website or via Mastodon.


Recommended Variants

Target Audience          Recommended Variant
Practitioners            PDF_ENHANCED_MajorityOpinions
Traditional Scholars     PDF_ENHANCED_FULL
Quantitative Analysts    CSV_TESSERACT_FULL


Please refer to the Codebook regarding the relative merits of each variant. All variants are available in either English or French. Unless you have very specific needs, you should only use the variants denoted 'ENHANCED' or 'TESSERACT' for serious work.
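The recommendations above can be expressed as a small lookup. The following is a minimal sketch, not part of the data set's own tooling: it assumes that variant names always consist of three underscore-separated parts (format, processing, scope), which is inferred only from the three names listed here. The `Variant` type and both helper functions are hypothetical names introduced for illustration; the Codebook remains the authoritative description of the variants.

```python
from typing import NamedTuple

# Recommendations copied from the "Recommended Variants" table.
RECOMMENDED = {
    "Practitioners": "PDF_ENHANCED_MajorityOpinions",
    "Traditional Scholars": "PDF_ENHANCED_FULL",
    "Quantitative Analysts": "CSV_TESSERACT_FULL",
}


class Variant(NamedTuple):
    """Hypothetical three-part view of a variant name."""
    file_format: str  # e.g. "PDF" or "CSV"
    processing: str   # e.g. "ENHANCED" or "TESSERACT"
    scope: str        # e.g. "FULL" or "MajorityOpinions"


def parse_variant(name: str) -> Variant:
    """Split a variant name on underscores (assumed FORMAT_PROCESSING_SCOPE)."""
    parts = name.split("_")
    if len(parts) != 3:
        raise ValueError(f"unexpected variant name: {name!r}")
    return Variant(*parts)


def suitable_for_serious_work(name: str) -> bool:
    """True if the variant is denoted 'ENHANCED' or 'TESSERACT'."""
    return parse_variant(name).processing in {"ENHANCED", "TESSERACT"}


if __name__ == "__main__":
    for audience, name in RECOMMENDED.items():
        print(audience, parse_variant(name), suitable_for_serious_work(name))
```

All three recommended variants pass the `suitable_for_serious_work` check, which matches the advice to rely on the 'ENHANCED' or 'TESSERACT' variants.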




Key Metrics

Version: 1.1.0

Temporal Coverage: 22 May 1922 – 26 February 1940

Documents: 259 (English) / 261 (French)

Tokens: 1,296,536 (English) / 1,262,184 (French)

Formats: PDF, TXT, CSV
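As a quick sanity check on the figures above, the average document length and the span of the temporal coverage can be derived with simple arithmetic. The sketch below uses only the numbers stated in the Key Metrics; the variable and function names are illustrative, and the averages are rough means that say nothing about how document lengths are distributed.

```python
from datetime import date

# Figures copied from the Key Metrics of version 1.1.0.
METRICS = {
    "English": {"documents": 259, "tokens": 1_296_536},
    "French": {"documents": 261, "tokens": 1_262_184},
}
COVERAGE_START = date(1922, 5, 22)
COVERAGE_END = date(1940, 2, 26)


def mean_tokens_per_document(language: str) -> float:
    """Average number of tokens per document for one language."""
    m = METRICS[language]
    return m["tokens"] / m["documents"]


def coverage_days() -> int:
    """Number of days between the first and last document dates."""
    return (COVERAGE_END - COVERAGE_START).days


if __name__ == "__main__":
    for language in METRICS:
        print(f"{language}: {mean_tokens_per_document(language):.1f} tokens per document")
    print(f"Coverage: {coverage_days()} days (about {coverage_days() / 365.25:.1f} years)")
```

On these figures, the mean comes to roughly 5,000 tokens per document in English and slightly fewer in French.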


Source Code and Compilation Report

With every compilation of the full data set, an extensive Compilation Report is created as a professionally typeset PDF (comparable to the Codebook). The Compilation Report includes the Source Code, comments and explanations of design decisions, relevant computational results, exact timestamps and a table of contents with clickable internal hyperlinks to each section. The Compilation Report and Source Code are published under the same DOI.

For details of the construction and validation of the data set please refer to the Compilation Report.



This data set has been created by Mr Seán Fobbe using documents available on the website of the International Court of Justice. It is a personal academic initiative and is not associated with or endorsed by the International Court of Justice or the United Nations.

The Court accepts no responsibility or liability arising out of my use, or that of third parties, of the documents and information produced, used or published on the Zenodo website. Neither the Court nor its staff members nor its contractors may be held responsible or liable for the consequences, financial or otherwise, resulting from the use of these documents and information.


Academic Publications (Fobbe)

Website —

Open Data —

Code Repository —

Regular Publications —



Did you discover any errors? Do you have suggestions on how to improve the data set? You can either post these to the Issue Tracker on GitHub or write me an e-mail.


Files (1.6 GB)