Published November 11, 2025 | Version v6
Dataset Open

SciMKG: A Multimodal Knowledge Graph for Science Education with Text, Image, Video and Audio

  • 1. Beijing Normal University

Description

🧠 SciMKG: A Multimodal Knowledge Graph for Science Education with Text, Image, Video and Audio


SciMKG is a large-scale multimodal educational knowledge graph (MEKG) for Chinese K-12 science education, covering biology, physics, and chemistry. It uses large language models (LLMs) to automatically extract and align concepts from educational materials in four modalities (text, images, videos, and audio), supporting structured understanding of educational content.

🔍 Key Features

Automated Multimodal Construction: Introduces an Extraction–Verification–Integration–Augmentation pipeline that incrementally extracts and refines disciplinary concepts.

Cross-Modal Alignment: Ensures semantic consistency across modalities through shared structural and semantic representations.

Large-Scale Coverage:
  • 1,356 knowledge points
  • 34,630 multimodal concepts
  • 403,400 relational triples
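
To make the idea of relational triples over multimodal concepts concrete, here is a minimal sketch of how such triples might be held in memory and looked up by concept. It is an illustration only: the actual file layout and schema of SciMKG are not described on this page, and the `Triple` type, the relation names (`has_image`, `has_video`, `related_to`), and every identifier in the sample data are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """One (head, relation, tail) statement in a knowledge graph."""
    head: str
    relation: str
    tail: str


# Hypothetical sample data; not taken from the SciMKG dataset.
triples = [
    Triple("photosynthesis", "has_image", "img_0001.png"),
    Triple("photosynthesis", "has_video", "vid_0001.mp4"),
    Triple("photosynthesis", "related_to", "chloroplast"),
    Triple("chloroplast", "has_image", "img_0002.png"),
]


def index_by_head(items):
    """Group tails by head concept and then by relation name."""
    index = defaultdict(lambda: defaultdict(list))
    for triple in items:
        index[triple.head][triple.relation].append(triple.tail)
    return index


index = index_by_head(triples)
# All resources and neighbours attached to one concept:
for relation, tails in index["photosynthesis"].items():
    print(relation, tails)
```

Indexing by head concept keeps per-concept lookups cheap, which matters once the triple count reaches the hundreds of thousands; a graph database or an RDF store would be the usual choice beyond a quick exploration.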

Files

Example.zip

Files (5.5 GB)

Checksum Size
md5:42ba1f64364b888fec957175a7a7395f 347.6 kB
md5:cc7e142c1685636bfbbe47b3dfa6606f 131.3 kB
md5:afe8af58643089eb998c7fd15ff1ef98 5.5 GB
md5:2e5e271558660793f80fc9defda5405c 17.3 MB
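
Since the largest archive is 5.5 GB, it can be worth checking a download against its listed MD5 checksum before unpacking it. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_checksum` are illustrative, and the file path is whatever name the download was saved under locally (file names are not listed here).

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat even for multi-gigabyte files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Call `matches_checksum(local_path, "md5:...")` with the checksum listed for the file in question; a `False` result usually indicates a truncated or corrupted download.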

Additional details

Software

Repository URL
https://github.com/kg-bnu/SciMKG
Programming language
Python
Development Status
Active