Published May 21, 2026 | Version 1.0.1
Dataset Open

GraphOntology Dataset

  • 1. École Polytechnique Fédérale de Lausanne

Contributors

  • 1. École Polytechnique Fédérale de Lausanne

Description

The EPFL Graph Platform is an open-source data infrastructure designed for academic institutions. It organizes educational and institutional data into a semantically interconnected knowledge graph, making it accessible through a graph-based search engine and an LLM-powered chatbot. The platform is composed of five core services: Graph Registry, Graph AI, Graph Ontology, Graph Search, and Graph Chat.

This dataset primarily supports the Graph Ontology service and plays a foundational role in the platform’s semantic capabilities. It includes:

  • A concepts graph constructed from over 40,000 Wikipedia pages, covering a broad spectrum of academic and scientific topics.

  • A category tree that defines hierarchical relationships between concepts and categories, enabling structured navigation and inference.

  • The semantic backbone used to detect, disambiguate, and link entities and keywords across diverse data sources such as course descriptions, lecture slides, research publications, and lab affiliations.
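As a rough illustration of how a category tree can support navigation and inference, the sketch below walks from a concept up to the root and finds the most specific category shared by two concepts. The `PARENT` mapping, the node names, and both function names are hypothetical toy values chosen for this example; the dataset's actual schema and file layout are not described here and may differ.

```python
# Hypothetical toy data, not taken from the dataset:
# each node (concept or category) maps to its parent category.
PARENT = {
    "Quantum computing": "Quantum information science",
    "Quantum cryptography": "Quantum information science",
    "Quantum information science": "Physics",
    "Graph theory": "Mathematics",
    "Physics": "Science",
    "Mathematics": "Science",
    "Science": None,
}


def path_to_root(node, parent):
    """Return the chain of categories above `node`, nearest first."""
    path = []
    seen = {node}  # guards against accidental cycles in the input
    current = parent.get(node)
    while current is not None and current not in seen:
        path.append(current)
        seen.add(current)
        current = parent.get(current)
    return path


def lowest_common_category(a, b, parent):
    """Return the most specific category that contains both `a` and `b`."""
    ancestors_a = set(path_to_root(a, parent))
    for category in path_to_root(b, parent):
        if category in ancestors_a:
            return category
    return None


print(path_to_root("Quantum computing", PARENT))
print(lowest_common_category("Quantum computing", "Graph theory", PARENT))
```

A shared ancestor close to both concepts suggests they are closely related, which is one simple way a hierarchy like this could feed search or recommendation features.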

These structures are central to the platform’s ability to perform entity recognition, enable semantic search, and power AI-based recommendations. They allow users to search for academic topics (e.g., "quantum computing") and receive an integrated view of relevant content within the institution, from courses and lectures to researchers, labs, and scholarly output.

Files

README.md

Files (15.7 GB)

Checksum                               Size
md5:702b4aa5cc6be6e7c2cbdac0ab4f0399   4.3 GB
md5:7f2230669a26f631b3935a1568268e46   103 Bytes
md5:23fa73699ebd536798eaeb38c564a2d2   11.3 GB
md5:571e66edc077308f8b92b4e252238db0   99 Bytes
md5:c04d8844c5dbcbfe66bcde1474c155c2   3.9 kB
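Because the largest files are several gigabytes, it can be worth checking a download against its listed md5 checksum before use. The following is a minimal sketch using Python's standard `hashlib` module; the function names and the example file path are placeholders, since the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_checksum("downloaded_file", "md5:702b4aa5cc6be6e7c2cbdac0ab4f0399")` would return `True` only if the local file (placeholder name) is byte-for-byte identical to the 4.3 GB file published with that checksum.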

Additional details

Software

Repository URL
https://github.com/epflgraph/graphontology
Programming language
Python
Development Status
Active