Published May 6, 2022 | Version v1
Dataset Open

Embeddings models for Buddhist Sanskrit: Evaluation Datasets

  • 1. Mangalam Research Center
  • 2. Jožef Stefan Institute

Description

Evaluation datasets used for the study published as "Embeddings models for Buddhist Sanskrit" in the LREC 2022 proceedings. The record contains a semantic similarity dataset and an analogy dataset, as well as the published study and a ReadMe file with the guidelines used for scoring semantic similarity and some notes about the manual scoring task.

The evaluation datasets were prepared by Ligeia Lugli, Bruno Galasek-Hul, Luis Quiñones and Jai Paranjape.

Notes

This study was funded by an NEH Digital Advancement Grant level 2 (HAA-277246-21).

Files

AnalogyTask.csv

Files (177.4 kB)

Checksum Size
md5:bdd250c335064d99407704377b217e45 1.1 kB
md5:a1bb0ff6d051b843247ef97a211285fe 171.5 kB
md5:c0d63a3d9452f998e45f80c8f6356402 2.3 kB
md5:93fecfb64d9e02759bb29859a7b9c812 2.5 kB