Published June 13, 2024 | Version v1
Dataset Open

Minecraft MCQ Datasets

Description

Overview

This repository contains two evaluation datasets for assessing models' domain-specific knowledge of Minecraft. Because questions about Minecraft are often open-ended, conventional evaluation metrics and common benchmarks may not be fully suitable. To address this, we used GPT-4 to create two multiple-choice question (MCQ) datasets: one organized by content theme and one built from Minecraft Wiki keywords.

Datasets

1. Multi-Theme MCQ Dataset

This dataset contains 1,050 multiple-choice questions that test a wide range of Minecraft knowledge across six content themes:

  • Game Basics: Blocks and items, survival mechanics, etc.
  • World Exploration: Biomes, terrain and landforms, etc.
  • Mobs and Interactions: Characteristics of mobs, combat system, trading, and villagers.
  • Survival Skills: Resource gathering, crafting and production, farming, and animal husbandry.
  • Building and Creativity: Building styles, techniques, interior decoration, and Redstone mechanics.
  • Special Dimensions: The Nether, The End, adventure, and exploration.

2. Wiki-Based MCQ Dataset

This dataset is derived from Minecraft Wiki pages so that it aligns closely with documented in-game knowledge. It contains 2,083 multiple-choice questions based on 615 Minecraft-related keywords, or about 3.4 questions per keyword on average.

Dataset Generation Process

Multi-Theme MCQ Dataset

  1. Summarization: Summarize key themes within Minecraft.
  2. Keyword Listing: Identify relevant keywords for each theme.
  3. Question Generation: Use GPT-4 to generate questions based on these keywords, ensuring a balanced distribution of difficulty levels (easy, medium, hard).
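The "balanced distribution of difficulty levels" in step 3 can be illustrated with a small sketch. This is not the procedure the authors used, which the description does not specify; it simply shows one way to plan an even split by cycling through the three levels. The function name `assign_difficulties` and the example keywords are hypothetical.

```python
from collections import Counter
from itertools import cycle

# The three difficulty levels named in the description.
DIFFICULTIES = ("easy", "medium", "hard")


def assign_difficulties(keywords, levels=DIFFICULTIES):
    """Pair each keyword with a level, cycling so counts differ by at most one."""
    return list(zip(keywords, cycle(levels)))


# Hypothetical keywords, used only to illustrate the idea.
example_keywords = ["creeper", "redstone", "nether portal",
                    "villager", "biome", "furnace"]
plan = assign_difficulties(example_keywords)
counts = Counter(level for _, level in plan)
print(counts)
```

Each planned (keyword, level) pair could then be turned into a generation request; with six keywords and three levels, every level is used twice.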

Wiki-Based MCQ Dataset

  1. Keyword Identification: List important Minecraft-related keywords.
  2. Question Generation: Use GPT-4 to generate questions based on these keywords and the information on corresponding Minecraft Wiki pages, ensuring the objectivity and accuracy of the MCQs.

Usage

To use the datasets, download the corresponding files from this repository. Each dataset is provided in multiple-choice format, with a difficulty level and a theme or keyword given for its questions.
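As a rough illustration of how a downloaded file might be used for evaluation, the sketch below reads CSV rows and computes accuracy per group. The column names `answer` and `difficulty` are assumptions, since the description does not list the actual headers; inspect the first row of the file and adjust them. The sample rows are made up and stand in for a real file.

```python
import csv
import io
from collections import defaultdict

# Assumed column names; check the header row of the downloaded CSV and adjust.
ANSWER_COL = "answer"
GROUP_COL = "difficulty"


def read_mcq(fileobj):
    """Read MCQ rows from an open CSV file into a list of dicts."""
    return list(csv.DictReader(fileobj))


def accuracy_by_group(rows, predictions,
                      answer_col=ANSWER_COL, group_col=GROUP_COL):
    """Return {group: fraction correct}; predictions[i] answers rows[i]."""
    if len(rows) != len(predictions):
        raise ValueError("need exactly one prediction per question")
    correct = defaultdict(int)
    total = defaultdict(int)
    for row, pred in zip(rows, predictions):
        group = row[group_col]
        total[group] += 1
        # Compare option letters case-insensitively, ignoring stray spaces.
        if pred.strip().upper() == row[answer_col].strip().upper():
            correct[group] += 1
    return {g: correct[g] / total[g] for g in total}


# Tiny in-memory stand-in for a downloaded file (made-up rows, not dataset content).
sample = io.StringIO(
    "question,answer,difficulty\n"
    "Q1,A,easy\n"
    "Q2,C,easy\n"
    "Q3,B,hard\n"
)
rows = read_mcq(sample)
scores = accuracy_by_group(rows, ["A", "B", "b"])
print(scores)
```

For a real file, pass `open(path, newline="", encoding="utf-8")` to `read_mcq` in place of the `io.StringIO` sample, and supply one predicted option per question from the model under test.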

Details

License

These datasets are made available under the Creative Commons Attribution 4.0 International License.

DOI

10.5281/zenodo.11583478

Files

Multi-Theme MCQ Dataset.csv

Files (1.2 MB)

  • 430.7 kB, md5:2d690480175798a139019e6da8a758f8
  • 762.3 kB, md5:a2ad9c56189b72d1d10ec2e42270ac08
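
After downloading, a file can be checked against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_listing` are illustrative, and because the listing does not say which checksum belongs to which file, the check only tests membership in the listed set.

```python
import hashlib

# MD5 checksums as listed in the Files section of this record.
LISTED_MD5 = {
    "2d690480175798a139019e6da8a758f8",
    "a2ad9c56189b72d1d10ec2e42270ac08",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """True if the file's MD5 equals one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5
```

For example, `matches_listing("Multi-Theme MCQ Dataset.csv")` should return `True` for an intact download saved under that name in the current directory.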