Published June 14, 2022 | Version 1.0.0
Dataset Open

MineDojo Internet Knowledge Base (Reddit)

  • 1. NVIDIA
  • 2. Caltech
  • 3. Stanford
  • 4. Columbia
  • 5. SJTU
  • 6. NVIDIA, UT Austin
  • 7. NVIDIA, Caltech


Project website:



We collect 340K+ Reddit posts and 6.6M comments from the r/Minecraft subreddit. The posts ask how to solve specific tasks, showcase architecture and achievements in image and video snippets, and discuss general tips and tricks for players of all skill levels. Large language models can be fine-tuned on this Reddit corpus to internalize Minecraft-specific concepts and develop sophisticated strategies.
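As a rough illustration of how a thread might be prepared for language-model fine-tuning, the sketch below flattens one post and its highest-scored comments into a single text sample. The `Post` and `Comment` classes, their field names (`title`, `selftext`, `body`, `score`), and the `post_to_training_text` helper are all hypothetical, introduced only for this example. They do not describe the actual layout of the released files, which should be checked before adapting this.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Comment:
    # Hypothetical fields; not the dataset's confirmed schema.
    body: str
    score: int = 0


@dataclass
class Post:
    # Hypothetical fields; not the dataset's confirmed schema.
    title: str
    selftext: str = ""
    comments: List[Comment] = field(default_factory=list)


def post_to_training_text(post: Post, max_comments: int = 3) -> str:
    """Flatten one post and its top-scored comments into one text sample."""
    parts = [f"Title: {post.title.strip()}"]
    if post.selftext.strip():
        parts.append(f"Post: {post.selftext.strip()}")
    # Drop empty comments, then keep the highest-scored ones.
    kept = [c for c in post.comments if c.body.strip()]
    top = sorted(kept, key=lambda c: c.score, reverse=True)[:max_comments]
    for c in top:
        parts.append(f"Comment: {c.body.strip()}")
    return "\n".join(parts)


if __name__ == "__main__":
    example = Post(
        title="How do I find diamonds?",
        comments=[
            Comment("Mine at deep levels.", score=5),
            Comment("Use an iron pickaxe.", score=7),
        ],
    )
    print(post_to_training_text(example, max_comments=2))
```

Ranking by score is only one possible filter; length limits, deduplication, or removal of deleted comments may matter more in practice.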

Check out our paper!


  title = {MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge},
  author = {Linxi Fan and Guanzhi Wang and Yunfan Jiang and Ajay Mandlekar and Yuncong Yang and Haoyi Zhu and Andrew Tang and De-An Huang and Yuke Zhu and Anima Anandkumar},
  year = {2022},
  journal = {arXiv preprint arXiv:2206.08853}




Files (13.0 MB)

Name Size
990.2 kB
12.0 MB