Published May 27, 2026 | Version v1
Presentation Open

Multilingualism and AI, a Common(s) Challenge

  • 1. Université de Montréal

Description

The large volume of publications currently observed in scholarly communication has led neither to greater bibliodiversity nor to increased visibility of research outputs published in languages other than English. For at least the past two decades, English has been the main language of scholarly communication across all academic disciplines. In 2025, scientific papers published in English accounted for over 75% of papers indexed in the OpenAlex bibliometric database, while French, for example, represented less than 3% (Céspedes et al., 2025).

The rapid advancement of generative artificial intelligence (GenAI) is seen by many as an opportunity to redress this major imbalance. Accordingly, since 2024, France, Canada, and Quebec have been conceptualizing several projects with a view to building a shared, open infrastructure that promotes ethical and transparent use of generative AI while increasing the discoverability of research outputs published in French. The main components expected of such an infrastructure include a repository of linguistic and terminological resources and a language model trained on a corpus of scholarly and cultural documents made available under open licenses.

The French Science Commons open corpus, the first deliverable of this initiative, was released in an early version in March 2026. The other components of the shared infrastructure still lack the governance and structural foundations required for their implementation and therefore remain largely conceptual for now. Nevertheless, this conceptualization process has helped advance a shared reflection on multilingualism in science, within a framework of binational Francophone collaboration aligned with the internationalized research environment.

This short paper aims to share with the members of the OPERAS community the questions, challenges, and opportunities that these efforts have brought to light: for example, the opportunities and challenges of generative AI in a research context, the benefits of AI for multilingualism and bibliodiversity, and the conditions necessary for developing a diverse, accessible, and common scientific environment for producing and communicating science in all languages.

Files

Fiorini & Paquin_Multilingualism and AI, a Common(s) Challenge.pdf

Files (625.7 kB)