Published April 7, 2025 | Version v1
Dataset Open

T-REx Star 1.0

  • 1. University of Zurich
  • 2. Dublin City University

Description

T-REx Star provides star-topology subgraphs from Wikidata for each entity that appears as a subject in T-REx. Each entity’s local subgraph is stored as JSON, contains up to 100 neighbors ranked by PageRank, and records metadata for both nodes (Q-ID, English label, PageRank) and edges (P-ID, relation label). The JSON structure can be loaded into graph libraries such as NetworkX for further graph-based processing or embedding.
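As a minimal sketch of loading one star subgraph into NetworkX: the record layout below (keys `subject`, `neighbors`, `qid`, `label`, `pagerank`, `pid`, `relation`) and the subject-to-neighbor edge direction are assumptions made for illustration, not the dataset's documented schema, and the PageRank values are placeholders. Adjust the key names to match the actual files.

```python
import json

import networkx as nx

# Hypothetical record: key names are assumptions, not the documented schema.
# The PageRank numbers are placeholders, not values from the dataset.
record_json = """
{
  "subject": {"qid": "Q42", "label": "Douglas Adams", "pagerank": 0.5},
  "neighbors": [
    {"qid": "Q145", "label": "United Kingdom", "pagerank": 0.9,
     "pid": "P27", "relation": "country of citizenship"},
    {"qid": "Q5", "label": "human", "pagerank": 0.8,
     "pid": "P31", "relation": "instance of"}
  ]
}
"""


def star_to_graph(record):
    """Build a star graph: the subject at the center, one edge per neighbor."""
    graph = nx.MultiDiGraph()
    center = record["subject"]
    graph.add_node(center["qid"], label=center["label"], pagerank=center["pagerank"])
    for neighbor in record["neighbors"]:
        graph.add_node(
            neighbor["qid"], label=neighbor["label"], pagerank=neighbor["pagerank"]
        )
        # The P-ID is used as the edge key so that parallel relations
        # between the same pair of entities stay distinct.
        graph.add_edge(
            center["qid"], neighbor["qid"],
            key=neighbor["pid"], relation=neighbor["relation"],
        )
    return graph


graph = star_to_graph(json.loads(record_json))
```

A `MultiDiGraph` is used here because two Wikidata entities can be linked by more than one property; a plain `DiGraph` would silently keep only the last such edge.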

Crucially, T-REx Star aligns with T-REx Bite and Tri-REx by using a consistent partitioning scheme. Every entity that serves as a subject in any of the three datasets appears as a subject in exactly one split (train, validation, or test). Entities may nonetheless appear as objects in multiple splits if they are neighbors of different subjects. This consistency keeps subject entities seen during training from reappearing as subjects at evaluation time, which is important for fair comparisons of LLM performance across training and evaluation sets.
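The subject-level partitioning described above can be checked with a few lines of Python. This is a sketch only: the function name `find_subject_leaks` is hypothetical, and the Q-IDs are placeholders rather than entities taken from the dataset. In practice the sets would be filled with the subject IDs read from each split.

```python
from itertools import combinations


def find_subject_leaks(splits):
    """Return subject IDs shared between any two splits, keyed by split pair."""
    leaks = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(splits.items(), 2):
        shared = set(ids_a) & set(ids_b)
        if shared:
            leaks[(name_a, name_b)] = shared
    return leaks


# Placeholder Q-IDs for illustration only; not taken from the dataset.
splits = {
    "train": {"Q1", "Q2", "Q3"},
    "validation": {"Q4"},
    "test": {"Q5", "Q6"},
}
leaks = find_subject_leaks(splits)
```

An empty result means no subject occurs in more than one split. Only subject IDs should be passed in, since objects are allowed to recur across splits.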

Files (19.7 GB)

Checksum                               Size
md5:0cd77a6988ab36f2d8c296be96e9f092   19.4 GB
md5:22d4c021ce5a7a98fdf4cf6204a22683   331.8 MB
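Given the size of the larger file, it may be worth verifying a download against the MD5 checksums listed above. The sketch below hashes a file in chunks so that it does not need to fit in memory; the helper names are hypothetical, and the local file path is left to the caller because file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex
```

For example, `matches_checksum(local_path, "md5:0cd77a6988ab36f2d8c296be96e9f092")` would return `True` only if the file at `local_path` (a path you choose) is an intact copy of the 19.4 GB file.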