Embeddings models for Buddhist Sanskrit: Evaluation Datasets

Lugli, Ligeia; Martinc, Matej; Pelicon, Andraž; Pollak, Senja

doi:10.5281/zenodo.6523884

Published May 6, 2022 | Version v1

Dataset Open

1. Mangalam Research Center
2. Jožef Stefan Institute

Evaluation datasets used in the study published as "Embeddings models for Buddhist Sanskrit" in the LREC 2022 proceedings. The record contains a semantic similarity dataset and an analogy dataset, together with the published study and a ReadMe file that gives the guidelines used for scoring semantic similarity and some notes on the manual scoring task.

The evaluation datasets were prepared by Ligeia Lugli, Bruno Galasek-Hul, Luis Quiñones and Jai Paranjape.

Notes

This study was funded by an NEH Digital Advancement Grant, level 2 (HAA-277246-21).

Files

Files (177.4 kB)

Name	MD5	Size
AnalogyTask.csv	bdd250c335064d99407704377b217e45	1.1 kB
Lugli_Martinc_Pelicon_Pollak_LREC2022_BuddhistSanskritEmbeddings.pdf	a1bb0ff6d051b843247ef97a211285fe	171.5 kB
ReadMe.txt	c0d63a3d9452f998e45f80c8f6356402	2.3 kB
SemanticSimilarityDataset_Lugli2022.csv	93fecfb64d9e02759bb29859a7b9c812	2.5 kB
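
The MD5 digests listed for each file can be used to check that a download arrived intact. The following is a minimal sketch, assuming the four files have been saved under their original names in a single local directory. The helper names `md5_of_file` and `verify_downloads` are illustrative only and are not part of the record.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing of this record.
EXPECTED_MD5 = {
    "AnalogyTask.csv": "bdd250c335064d99407704377b217e45",
    "Lugli_Martinc_Pelicon_Pollak_LREC2022_BuddhistSanskritEmbeddings.pdf":
        "a1bb0ff6d051b843247ef97a211285fe",
    "ReadMe.txt": "c0d63a3d9452f998e45f80c8f6356402",
    "SemanticSimilarityDataset_Lugli2022.csv": "93fecfb64d9e02759bb29859a7b9c812",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so that large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Assumption: the files were downloaded into the current directory.
    for name, status in verify_downloads().items():
        print(f"{status:8} {name}")
```

A status of "mismatch" suggests a truncated or altered download. In that case, fetching the file again from the record is the simplest remedy.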

	All versions	This version
Views	322	322
Downloads	57	57
Data volume	3.5 MB	3.5 MB

Resource type

Dataset

Publisher

Zenodo

Languages

Sanskrit

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: May 10, 2022
Modified: July 16, 2024
