Published November 7, 2023 | Version v1
Conference proceeding Open

Draw Me Like My Triples: Leveraging Generative AI for Wikidata Image Completion

  • 1. Deutsches Forschungszentrum für Künstliche Intelligenz
  • 2. Ca' Foscari University of Venice
  • 3. Freie Universität Berlin
  • 4. University of Bologna
  • 5. Université Côte d'Azur
  • 6. University of Oxford
  • 7. King's College London

Description

Humans are critical for the creation and maintenance of high-quality Knowledge Graphs (KGs). However, creating and maintaining large KGs through human effort alone does not scale, especially for multimedia contributions (e.g. images), which are hard to find and reuse on the Web and expensive for humans to create from scratch. Therefore, we leverage generative AI to create images for Wikidata items that lack them. Our approach takes the knowledge contained in the Wikidata triples of items describing fictional characters and uses a T5 model fine-tuned on the WDV dataset to generate natural-text descriptions of those items that have no image. We use these descriptions as prompts for a transformer-based text-to-image model, Stable Diffusion v2.1, to generate plausible candidate images for Wikidata image completion. We design and implement quantitative and qualitative approaches to evaluate the plausibility of our methods, including a survey that assesses the quality of the generated images.
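The pipeline described above (triples, then a text description, then an image prompt) can be illustrated with a minimal sketch. It is not the authors' implementation: the template-based `verbalize` function is a simple stand-in for the fine-tuned T5 verbalizer, the example triples are hypothetical, and the `generate_image` helper assumes the Hugging Face `diffusers` library and the `stabilityai/stable-diffusion-2-1` checkpoint, neither of which is named in the record.

```python
from typing import List, Tuple

# (subject label, property label, object label)
Triple = Tuple[str, str, str]


def verbalize(triples: List[Triple]) -> str:
    """Turn the triples of one item into a short text description.

    Assumption: this template is a stand-in for the fine-tuned T5 model
    mentioned in the abstract; it only concatenates property/object pairs.
    """
    if not triples:
        raise ValueError("need at least one triple")
    subject = triples[0][0]
    # Keep only the facts that describe the first triple's subject.
    facts = [f"{prop} {obj}" for subj, prop, obj in triples if subj == subject]
    return f"{subject}: " + "; ".join(facts) + "."


def generate_image(prompt: str):
    """Hypothetical image step, assuming `diffusers` and `torch` are installed.

    The import is local so the verbalization step runs without these packages.
    """
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2-1")
    return pipe(prompt).images[0]


# Hypothetical triples for an invented fictional character.
example_triples: List[Triple] = [
    ("Captain Example", "instance of", "fictional human"),
    ("Captain Example", "occupation", "sailor"),
    ("Captain Example", "hair color", "red"),
]

prompt = verbalize(example_triples)
print(prompt)
```

Calling `generate_image(prompt)` would then produce one candidate image for the item. Whether such a candidate is plausible is the question the quantitative and qualitative evaluation in the paper addresses.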

Files

Draw Me Like My Triples: Leveraging Generative AI for Wikidata Image Completion.pdf

Additional details

Funding

European Commission
Polifonia – Polifonia: a digital harmoniser for musical heritage knowledge 101004746