Growth of non‐English‐language literature on biodiversity conservation

Abstract English is widely recognized as the language of science, and English‐language publications (ELPs) are rapidly increasing. It is often assumed that the number of non‐ELPs is decreasing. This assumption contributes to the underuse of non‐ELPs in conservation science, practice, and policy, especially at the international level. However, the number of conservation articles published in different languages is poorly documented. Using local and international search systems, we searched for scientific articles on biodiversity conservation published from 1980 to 2018 in English and 15 non‐English languages. We compared the growth rate in publications across languages. In 12 of the 15 non‐English languages, published conservation articles significantly increased every year over the past 39 years, at a rate similar to English‐language articles. The other three languages showed contrasting results, depending on the search system. Since the 1990s, conservation science articles in most languages increased exponentially. The variation in the number of non‐English‐language articles identified among the search systems differed markedly (e.g., for simplified Chinese, 11,148 articles returned with local search system and 803 with Scopus). Google Scholar and local literature search systems returned the most articles for 11 and 4 non‐English languages, respectively. However, the proportion of peer‐reviewed conservation articles published in non‐English languages was highest in Scopus, followed by Web of Science and local search systems, and lowest in Google Scholar. About 20% of the sampled non‐English‐language articles provided no title or abstract in English; thus, in theory, they were undiscoverable with English keywords. Possible reasons for this include language barriers and the need to disseminate research in countries where English is not widely spoken. Given the known biases in statistical methods and study characteristics between English‐ and non‐English‐language studies, non‐English‐language articles will continue to play an important role in improving the understanding of biodiversity and its conservation.

languages, published conservation articles significantly increased every year over the past 39 years, at a rate similar to English-language articles. The other three languages showed contrasting results, depending on the search system. Since the 1990s, conservation science articles in most languages increased exponentially. The variation in the number of non-English-language articles identified among the search systems differed markedly (e.g., for simplified Chinese, 11,148 articles returned with local search system and 803 with Scopus). Google Scholar and local literature search systems returned the most articles for 11 and 4 non-English languages, respectively. However, the proportion of peerreviewed conservation articles published in non-English languages was highest in Scopus, followed by Web of Science and local search systems, and lowest in Google Scholar. About 20% of the sampled non-English-language articles provided no title or abstract in English; thus, in theory, they were undiscoverable with English keywords. Possible reasons for this include language barriers and the need to disseminate research in countries where English is not widely spoken. Given the known biases in statistical methods and study characteristics between English-and non-English-language studies, non-English-language articles will continue to play an important role in improving the understanding of biodiversity and its conservation. KEYWORDS biodiversity information, evidence synthesis, global biodiversity databases, languages, language barrier, publication bias, ; ; ; ; ;

INTRODUCTION
Nearly, 77% of Earth's land and 87% of its ocean have been modified owing to human activities (Ceballos et al., 2015;Chowdhury et al., 2021;Venter et al., 2016). To effectively conserve the remaining biodiversity, making the best use of available scientific knowledge is essential (Sutherland et al., 2004;Tilman et al., 2017;Fraixedas et al., 2022). However, there is a substantial geographical bias in the availability of scientific knowledge on biodiversity conservation, especially for scientific knowledges published in English (Hickisch et al., 2019;Wilson et al., 2016). Such biases seriously hinder the understanding of local biodiversity and the development and implementation of effective conservation actions and policies in regions with limited scientific information, particularly the Global South (e.g., Blicharska et al., 2017;Toomey, 2016;Nuñez et al., 2021).
Despite the current underuse of non-English-language scientific knowledge at the global level (Lynch et al., 2021), many articles are published in languages other than English. For example, nearly 36% of conservation articles published in 2014 were in non-English languages (Amano et al., 2016). Most information on endemic and threatened bat species in Japan is only available in Japanese (Preble et al., 2021). Similarly, scientific articles on China's Belt and Road Initiative, a continental-scale infrastructure development that has potentially disastrous consequences for biodiversity in the region (Lechner et al., 2018), are often available only in simplified Chinese (Teo et al., 2020).
The majority of research published in non-English languages often remains unrecognized by scientists and decision makers; consequently, it is underused in global conservation planning and decisions (Fabian et al., 2019). Ignoring the vast amount of scientific knowledge available only in non-English languages could have serious consequences for conservation science, policies, and practices for the following three reasons. First, non-English-language studies provide local knowledge generated by field practitioners, who often find it challenging to publish their work in English if they are non-native English speakers (Amano et al., 2016). Second, because there can be a systematic difference in statistical results between English-and non-Englishlanguage studies, ignoring non-English-language studies can bias the outcomes of evidence syntheses, such as meta-analyses (Konno et al., 2020). Finally, non-English-language studies can provide important scientific knowledge on conservation in areas and for species where little or even no English-language evidence is available . The role of non-Englishlanguage studies could be especially crucial in countries with emerging or developing economies, which are also the areas in which most biodiversity hotspots occur (Cincotta et al., 2000). English is often not widely spoken in these countries, and difficulties in publishing in English are considered key barriers for conservation scientists and practitioners in such countries (e.g., Müller & Opgenoorth, 2014;Nuñez et al., 2019;Valenzuela-Toro & Viglino, 2021).
Since the end of World War II, English has become the lingua franca of science, and even non-native English speakers have increasingly been publishing their work in English (Fung, 2008;Gordon, 2012;López-Navarro et al., 2015). It is often assumed that scientific articles are being published less frequently in non-English languages (e.g., Montgomery, 2013;Baethge, 2013;Di Bitetti & Ferreras, 2017), and this partly contributes to the current underestimation of the importance of non-English-language literature in conservation science, policies, and practices . However, temporal changes in the number of conservation articles published in different languages are not well documented (Amano et al., 2016). We sought to investigate the changes over time in the number of scientific articles on biodiversity conservation published in 16 languages. Most previous investigations on the availability of non-English-language conservation studies have been conducted based on literature searches in either Web of Science or Scopus, although most non-English-language journals are not indexed in these search systems Nuñez & Amano, 2021). Amano et al. (2016) estimated the number of conservation-related scientific documents by language, but only for a single year (2014) and based only on a search in Google Scholar. Although Google Scholar supports literature searches in most major languages and thus effectively identifies non-English-language articles, it misses some important literature Gusenbauer & Haddaway, 2020). A more promising approach for estimating changes in the number of conservation-related articles in different languages is to search each language's literature in multiple literature search systems, including local-language-specific databases (Karlsson et al., 2007). However, no such attempt has been made to date.
To determine the temporal trends in published articles on biodiversity conservation, we compiled the number of articles published annually in English and in 15 non-English languages by searching both local (for non-English languages) and international (for English and non-English languages) literature search systems. We then estimated and compared the rate of change in the number of conservation articles among the languages to test the common assumption that the number of published non-English-language articles has decreased, which implies they play a less important role in conservation.

METHODS
To determine how the number of conservation articles published in different languages has changed over time, we searched multiple literature databases and search platforms and compared the number of published articles per year in each language. We selected the timeframe of 1980−2018 (39 years) because the number of articles on biodiversity and conservation started increasing substantially in the 1980s (e.g., Soulé, 1985). We searched scientific documents for the keywords biodiversity AND conservation in a set of international and local literature databases and search platforms in 16 languages (English, Spanish, Portuguese, simplified Chinese, traditional Chinese, French, Italian, German, Japanese, Korean, Swedish, Polish, Turkish, Russian, Persian, and Dutch). These 16 languages are the national languages of the 20 countries that produced the most scientific and technical journal articles in 2009 (Amano et al., 2016). The international search systems, we used were Google Scholar, Web of Science, and Scopus.

Searches in Web of Science
When searching for published articles by year in Web of Science, we used the following databases: Web of Science Core Collection, Current Contents Connect, Data Citation Index, Derwent Innovations Index, KCI-Korean Journal Database, MEDLINE, Russian Science Citation Index, and SciELO Citation Index. We limited the year accordingly , searched with TS = (biodiversity AND conservation) (i.e., using English keywords), and recorded the number of articles in each year for each language using the information provided in the search results about the articles' languages. The Web of Science search was conducted on 4 December 2020.

Searches in Google Scholar
For Google Scholar, we used the translation of the same two English keywords in 15 non-English languages (translated keywords in Appendix S1). We included at least one native speaker of each language in the language search. All had a bachelor's degree, but often had higher research (i.e., master's or doctorate) degrees, in ecology or conservation (Table 1). Searches were restricted to websites written in each language (apart from Swedish, Russian, and Persian because these options were unavailable [Amano et al., 2016]). We limited the year accordingly (e.g., 1980−1980, 1981−1981, … 2018−2018) and recorded the number of articles in each year for each language. The search on Google Scholar was conducted from 5 November 2019 to 13 November 2019 in Brisbane, Australia. Although search results on Google Scholar may vary depending on the location of searches, we assumed that this issue would not affect our estimates of temporal trends in the number of articles published in each language, given that all searches were conducted in the same location.

Searches in Scopus
Similarly, for Scopus, we limited the year accordingly, searched with TITLE-ABS-KEY (biodiversity and conservation) (i.e., with English keywords), and recorded the number of articles per year for each language based on the information provided in the search results. The search on Scopus was conducted from 14 November 2019 to 23 November 2019. Neither Web of Science nor Scopus distinguish between simplified and traditional Chinese, so we used search results for Chinese in general for these two search systems.

Searches in local literature search systems
Because international literature search systems are unlikely to fully include non-English-language articles, we also identified the most comprehensive and relevant literature search system in each of the 15 non-English languages, based on discussions with the native speakers of each language involved in this study. In cases where we found more than one non-English-language search system for a language, we adopted the most comprehensive search system based on discussions with each native speaker. We identified 12 local literature search systems covering 13 non-English languages, but we could not find any for the other two languages: Swedish and Italian (Table 1). In each literature search system, we searched with the translation of the same two keywords, biodiversity AND conservation, in each language (translated keywords in Appendix S1), selected the year range (or manually collated the number of articles by year for Turkish), and recorded the total number of articles published each year (see Table 1 for the date of each search). Our preliminary searches in Polish and German identified many irrelevant articles; thus, the translation of biodiversity was used only for Polish and German.

Assessment of relevance, quality, and visibility of non-English-language articles
Because we used only two keywords, search results may include articles not relevant to biodiversity conservation. Further, the quality and visibility of the articles identified may also differ among literature search systems and languages. Therefore, we also investigated the relevance (i.e., whether each article was actually about biodiversity conservation), quality (whether each article was peer reviewed), and visibility (whether each article provided the title and abstract in English) of the non-English-language articles identified in each literature search system in each language. The assessment was conducted by native speakers of each language who were involved in this study.
The assessment was based on a subset of articles sampled from the 2018 search results for each literature search system in each language. Using the 'sample.size.prop' function in R package samplingbook (Manitz et al., 2020), we first determined the sample size for each literature search system in each language (assuming the expected proportion was p = 0.5, a finite small population correction of the total number of articles identified in 2018 on the system in the language, precision [e] of 0.1, and confidence level of 0.95).
We then determined the number of articles to sample from each search result as follows. If the sample size estimated from the R function was ≥50, the estimated sample size was the number of articles to sample from the search results of the search system. If the estimated sample size was <50, 50 articles from the search result, or all articles if there were fewer than 50 articles were sampled.
Next, we extracted the determined number of articles from the search result for 2018 at regular intervals. For example, if the search identified 1000 articles and the number of articles to sample was 88, we sampled the first article and one in every 11 articles (i.e., 1000/88 = 11) from the search result. Google Scholar shows only the first 1000 articles identified. Therefore, if the total number of articles identified was >1000, we divided 1000 by the number of articles to sample to identify the interval of our samples. For example, if the total number of articles identified on Google Scholar was 20,000 and the number of articles to sample was 96, we sampled the first paper and one in every 10 papers (1000/96 = 10) from the search results.
For each of the articles sampled as described above, we read at least the title and abstract of the article, and the main text if needed, and recorded the following information: whether the article was actually about biodiversity conservation (i.e., articles that mentioned biodiversity or its conservation in any part of the article); the language of the main text; whether an English title was provided; whether an English abstract was provided; article type (peer-reviewed journal articles, books, theses, or other); and the URL (web link).

Analyses
To estimate the rate of change in the number of articles on biodiversity conservation, we fitted generalized linear models with a negative binomial distribution with the glmmTMB package (Brooks et al., 2017) in R 4.0.4 (R Core Team, 2021). We used the number of articles published in each language per year as the response variable and the published year (centered) as the explanatory variable. Because some languages seemed to show a nonlinear pattern in changes in the number of articles over time, we compared models with and without the quadratic year term based on Akaike information criterion. To deal with the zero-inflated nature of the data, we removed data from the years prior to the first year in which each language had at least one published study, and we included only languages with at least 10 years of published studies in the analyses. The estimated coefficient of the year term from the model without the quadratic term was used as the rate of change in the number of conservation articles. We used the ggplot2 (Wickham, 2011) package for visualization. Due to the small sample size, we could not estimate the coefficients for quadratic models in some languagedatabase combinations. For this, we discuss only the linear models in the main text and provide output of the quadratic model in Appendix S3.

RESULTS
Searches on Google Scholar identified the largest number of non-English-language articles on biodiversity conservation across the 16 languages (a total of 97,014 articles), whereas those on Web of Science (10,226 articles) and Scopus (4483 articles) resulted in a much smaller number of non-English-language articles (Figure 1). Searches on the 11 local search systems for 13 languages identified 42,331 articles, but this number is still much smaller than that based on Google Scholar. When comparing search results within each language, searches on Google Scholar identified many more articles than other search systems for most languages except Portuguese, Spanish, Japanese, and Russian, for which searches on the local search systems identified the largest number of articles (Figure 1).
In most languages and on most search systems, conservation articles increased exponentially in the late 1990s ( Figure 2) (results from Web of Science and Scopus in Appendix S2). The number of conservation articles published each year has increased significantly over the past 39 years for most languages (Figure 3 & Appendix S3). For Spanish, Portuguese, German, Korean, and Russian, the number of conservation articles published each year increased at a rate similar to, or even faster than, English-language articles (Figure 2). However, the number of published conservation articles identified through Google Scholar declined in the early to mid-2010s in English and some non-English languages (Figure 2). This result was corroborated by the better performance of the model with the quadratic year term (Appendix S2). Web of Science and Scopus tended to show a lower rate of increase than Google Scholar and local search systems, and they showed no growth for Italian, Swedish, Turkish, Dutch, and Persian (Figure 2 & Appendix S2). The number of articles published annually in Italian, Turkish, and Dutch showed a nonsignificant change based on Scopus and Web of Science, but showed a significant increase based on Google Scholar (and the local search system for Turkish; Appendix S3).
Although the percentage of relevant studies was generally high (> 75%) across search systems for most languages (e.g., over 86% for simplified Chinese), it varied substantially among search systems and languages. Scopus showed the highest mean percentage of relevant studies across languages (91%), followed by Web of Science (89%), local literature search systems (68%), and Google Scholar (66%) (Figure 4a). The most relevant articles identified on Scopus and Web of Science were peer-reviewed studies (both 91% on average). In comparison, the proportion of peer-reviewed relevant articles was considerably lower in both Google Scholar and local literature search systems (26% and 42%, respectively, on average) (Figure 4a).
Most peer-reviewed relevant articles identified on Scopus and Web of Science provided the title and abstract in English. In contrast, a considerable proportion of the peer-reviewed relevant articles identified on Google Scholar (39% on average) and local literature search systems (25%) provided neither the title nor the abstract in English, making them, in theory, undetectable when searching with English keywords (Figure 4b).

DISCUSSION
The common assumption that the number of non-Englishlanguage publications is decreasing was not supported by our results. We found that the number of non-English-language articles on biodiversity conservation published each year actually been increased over the past 39 years in 12 of the 15 languages. For Dutch, Italian, and Turkish, there was a significant increase in the number of published articles based on Google Scholar or local search systems and a nonsignificant change based on Web of Science and Scopus. Given a sizable number of non-English-language journals and articles are indexed by neither Web of Science nor Scopus (Figure 1), this result indicates that the number of conservation articles in Dutch, Italian, and Turkish may have also increased, or at least remained constant, over the same period. The reason for the slower growth of articles in these three languages is unknown, but our results indicated that speakers of these three languages may especially prefer publishing their work in English, compared with speakers of the other 12 languages.
This result suggests that an increasing number of people, presumably those whose first language is not English, choose to publish their conservation work in English and a non-English FIGURE 1 Number of biodiversity conservation articles published in 2018 in 16 languages based on searches for the keywords biodiversity AND conservation (translated into each language) (translated keywords in Appendix S1) in three international (Google Scholar, Scopus, and Web of Science) and 11 local (see Table 1) search systems (S, simplified; T, traditional). Because neither Scopus nor Web of Science distinguish between simplified and traditional Chinese, search results for Chinese in general from the two search systems are shown under simplified Chinese  Percentage (a) of nonpeer-reviewed and peer-reviewed studies that mentioned biodiversity conservation (one study represents one published article) identified in four types of literature search systems for 15 non-English languages in 2018 and (b) of peer-reviewed, relevant, non-English-language studies with neither title nor abstract available in English, with only title available in English, with only abstract available in English, and with title and abstract available in English (S, simplified; T, traditional). Data are unavailable in some language-search system combinations because local search engines for the language could not be identified or the search in the search system did not identify any article in the language. The order of languages on the x-axis is based on the total number of publications in 2018 identified using Google Scholar language. This choice may stem from one of the following reasons. First, language barriers, which cause difficulties in publishing in English, may force non-native English speakers to publish their work in their first language. Science written by non-native English speakers tends to be considered of low quality (Politzer-Ahles et al., 2020) and is often rejected by English-language journals (Ramírez-Castañeda, 2020) simply because of the writing style. Non-native English speakers, especially those in the early stages of their career and those with low socioeconomic status who are disproportionately affected by language barriers (due to high costs of learning English as a second language and using professional editing services [Ramírez-Castañeda, 2020]), may consequently turn to journals published in their first language as more promising publication outlets. Second, non-native English speakers may believe that the topic of their work is not of international importance (e.g., because it is about species and ecosystems that are specific to their country). Thus, they conclude that their results are not substantial enough to be published in international journals, leading to submission to and publication in local non-English-language journals. Finally, people may also deliberately choose to publish their work in a non-English language for constructive reasons. For example, the publication of science only in English represents a barrier to the use of scientific knowledge by decision makers (Amano et al., 2016). Recognizing this language barrier, scientists may choose to publish their work in the language of the country, where they aim to disseminate their work to decision makers and the general public. Both types of language barriers (i.e., language barriers to publications and to the use of science in decision-making) continue to be pervasive issues in countries where English is not widely spoken, which may explain the increasing trend of non-English-language articles on biodiversity conservation.
We observed a marked variation in the number of non-English-language articles identified among the different literature search systems. Overall, Google Scholar identified the largest number of non-English-language articles for most languages, followed by local search systems. In contrast, Scopus and Web of Science identified less than one-twentieth and oneninth the number of non-English-language articles identified on Google Scholar, respectively. The percentages of peer-reviewed relevant articles were much lower in Google Scholar than Scopus and Web of Science, which was also reported by . However, even when focusing only on peerreviewed relevant articles, Google Scholar seems to identify many more non-English-language articles than Scopus and Web of Science (Appendix S4). Further, it is now increasingly recognized that gray literature, which is often not peer reviewed, also plays an important role in environmental evidence synthesis (Haddaway & Bayliss, 2015). Google Scholar allowed us to identify such potentially important gray literature, whereas Scopus and Web of Science do not normally index gray literature (blue bars in Figure 4a). Local search systems identified far more articles than Google Scholar for some languages, such as Portuguese, Spanish, Japanese, and Russian. This finding indicates that the most effective search system for identifying non-English-language literature varies among languages.
We also found that searching with non-English-language keywords on Google Scholar and local literature search systems identified articles with neither the title nor the abstract available in English (approximately 20% of all articles identified), many of which were not discoverable with English keywords on Scopus and Web of Science. Overall, Google Scholar seems to be the most effective system in many languages, but using multiple search systems, especially Google Scholar and local search systems, is important for identifying important non-Englishlanguage articles.
One potential limitation of our study is that we searched non-English articles on a very broad topic (anything related to biodiversity and conservation) with a simple search string. The patterns we identified could be different for articles on more specific topics, although a recent study also showed that peer-reviewed studies on the effectiveness of conservation actions are being published at an increasing rate in many languages . Future studies should reassess how the availability of non-English-language literature might vary among, for example, taxonomic groups (e.g., birds vs. insects) and research topics (e.g., studies on ecological threats, the state of biodiversity, and the effectiveness of conservation intervention) to better understand where non-English-language literature can fill knowledge gaps that are not covered by the English literature.
Second, we did not consider the potential duplication of articles among languages. Some authors whose first language is not English may publish the same work in both their first language (e.g., as a thesis) and English (e.g., as an academic journal article) or the same paper may be found with searches in multiple languages. Considering that Google Scholar also covers gray literature, the results based on Google Scholar searches could be especially vulnerable to the effect of such duplicated articles. However, this limitation is unlikely to undermine our conclusion that the number of non-English-language articles is increasing because the results were consistent across languages and search systems, including the Web of Science and Scopus (Figure 3 & Appendix S3), which is essentially restricted to academic journal papers.
Third, we assessed only 15 non-English languages following Amano et al. (2016). We are aware that many languages were not covered in this study, although the languages assessed likely cover a large number of the languages spoken in the Global North and South. Finally, although we assessed the proportion of peer-reviewed articles to measure the quality of studies published in each language, study quality depends on many other factors, such as study design and sample size (Christie et al., 2020). Thus, our results do not necessarily indicate that high-quality research has been increasingly published in non-English languages. A recent study showed that non-English-language studies adopt less robust study designs than English-language studies in biodiversity conservation . In healthcare, previous studies comparing study quality between languages have reported mixed results, and it is thus recommended that "studies published in languages other than English [should not] be generally excluded [from evidence syntheses] for the reason of study quality" (Song et al., 2010). Testing differences in conservation and ecology study quality between languages is a priority for future studies.
We found that the number of non-English-language conservation articles increased in most languages. Recent studies have also shown between-language bias in study characteristics and statistical results (Konno et al., 2020;Amano et al., 2021). Focusing on meta-analysis in ecology and conservation, Konno et al. (2020) demonstrated that excluding non-Englishlanguage studies can cause substantial changes in overall mean effect sizes and even their direction (Konno et al., 2020). Amano et al. (2021) identified non-English-language studies providing evidence of the effectiveness of conservation actions, especially in areas and for species with little or even no English-language evidence. Together, these findings indicate that the importance of non-English-language articles in ecological evidence syntheses is not diminishing. To derive robust conclusions in evidence syntheses and correctly inform conservation actions and policies, synthesizing the best available knowledge in an unbiased manner is essential. This seems possible only by incorporating scientific knowledge that has long been accumulating and continues to be produced in languages other than English.
Scientific literature written in non-English languages has largely been ignored at the international level (Lynch et al., 2021), but our results indicate that conservation science, practices, and policies can greatly benefit from facilitating the better use of non-English-language literature. This could be achieved through, for example, involving native speakers of major non-English languages in evidence syntheses to search and screen literature in multiple languages or by integrating machine translation into automated or semiautomated evidence synthesis tools. Well-known international literature search systems, such as Web of Science and Scopus, and other search systems that fully include non-English-language literature should be used. Using Google Scholar, together with local, language-specific search systems, such as those listed in Table 1, would be a good starting point for such multilanguage literature searches. Increasing the visibility of non-English-language literature would also be effective. When publishing in non-English languages, authors should make sure to provide an English title and abstract or upload preprints on international repositories. Similarly, creating a list of non-English-language journals in the relevant discipline may also be useful (e.g., see a list of non-English-language journals in ecology and conservation https://translatesciences. com/resources/#non-english-journals).
Our results show that despite the increased use of English as a common language of science, language continues to be a factor in biodiversity conservation. Although overcoming the issue of language barriers is a challenging process, addressing this issue can boost understanding of life on Earth and improve decision-making for conservation.