There is a newer version of this record available.

Dataset Open Access

Suitability Map of COVID-19 Virus Spread

Gianpaolo Coro

This image reports a Maximum Entropy model that estimates suitable locations for COVID-19 spread, i.e. places that could favour the spread of the virus just in terms of environmental parameters.

The model was trained only on locations in Italy that reported a rate of new infections higher than the geometric mean of all Italian infection rates. The following environmental parameters were used, which are correlated with those used by other studies:

  • Average Annual Surface Air Temperature in 2018 (NASA)
  • Average Annual Precipitation in 2018 (NASA)
  • CO2 emission (natural+artificial) averaged between January 1979 and December 2013 (Copernicus Atmosphere Monitoring Service)
  • Elevation (NOAA ETOPO2)
  • Population per 0.5° cell (NASA Gridded Population of the World)
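
The selection of training locations described above (rates higher than the geometric mean of all rates) can be illustrated with a minimal sketch. The function names and the example rates are hypothetical and are not taken from the attached files; the sketch only assumes that one positive infection rate is available per location.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed in log space."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("need at least one value, and all values must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def select_training_sites(rates):
    """Return the names of locations whose rate exceeds the geometric mean."""
    threshold = geometric_mean(list(rates.values()))
    return sorted(name for name, rate in rates.items() if rate > threshold)


# Hypothetical rates, for illustration only (not real data).
example_rates = {"A": 1.0, "B": 10.0, "C": 100.0, "D": 1000.0}
print(select_training_sites(example_rates))
```

The geometric mean is less sensitive to a few very large rates than the arithmetic mean, so it gives a lower threshold when the rates are strongly right-skewed.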

A higher resolution map, the model file (in ASC format) and all parameters used are also attached.
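
As a rough guide to reading the attached model file, the sketch below parses a grid in the common ESRI ASCII raster layout (header lines such as `ncols`, `nrows`, `cellsize`, `NODATA_value`, followed by rows of values). That the attached ASC file follows exactly this layout is an assumption, and `parse_asc` is a hypothetical helper name.

```python
def parse_asc(text):
    """Parse an ESRI ASCII grid held in a string.

    Returns (header, rows): header maps lower-cased keys to floats,
    rows is a list of lists with None in place of NODATA cells.
    """
    lines = [line.split() for line in text.strip().splitlines() if line.strip()]
    header = {}
    i = 0
    # Header lines are "key value" pairs whose first token starts with a letter.
    while i < len(lines) and lines[i][0][0].isalpha():
        header[lines[i][0].lower()] = float(lines[i][1])
        i += 1
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    nodata = header.get("nodata_value")
    values = [float(tok) for line in lines[i:] for tok in line]
    if len(values) != ncols * nrows:
        raise ValueError("grid size does not match ncols * nrows")
    rows = [
        [None if v == nodata else v for v in values[r * ncols:(r + 1) * ncols]]
        for r in range(nrows)
    ]
    return header, rows


# Tiny made-up grid, for illustration only.
sample = """ncols 2
nrows 2
xllcorner -180
yllcorner -90
cellsize 0.5
NODATA_value -9999
0.1 0.2
-9999 0.4
"""
header, rows = parse_asc(sample)
print(header["cellsize"], rows)
```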

The model indicates the highest correlation with infection rate for CO2 around 0.03 g C m^-2 day^-1, for Temperature around 11.8 °C, and for Precipitation around 0.3 kg m^-2 s^-1, whereas Elevation and Population density are poorly correlated with infection rate.

One interesting result is that the model indicates, among others, the Hubei region in China as a high-probability location, and Iran (around Teheran) as a suitable location for the virus's spread, even though the model was not trained on these regions, i.e. it had no information about the actual spread there.


A risk score was calculated for each country/region reported by the JHU monitoring system. This score is calculated as the summed normalised probability in the populated locations divided by their total surface. It represents how much the zone could potentially foster the virus's spread.
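
The risk score definition can be read as the following minimal sketch. It assumes that each grid cell of a zone carries an already normalised probability, a population count and a surface area, and that "populated" means a population above a chosen minimum; the field names and the `min_population` parameter are hypothetical.

```python
def risk_score(cells, min_population=0.0):
    """Summed probability over populated cells divided by their total area.

    Each cell is a dict with keys "probability" (assumed already
    normalised), "population" and "area" (all hypothetical field names).
    """
    populated = [c for c in cells if c["population"] > min_population]
    total_area = sum(c["area"] for c in populated)
    if total_area == 0:
        return 0.0  # no populated surface: score defined as zero here
    return sum(c["probability"] for c in populated) / total_area


# Made-up cells for one zone, for illustration only.
zone = [
    {"probability": 0.8, "population": 100, "area": 2.0},
    {"probability": 0.4, "population": 0, "area": 2.0},  # unpopulated, ignored
    {"probability": 0.2, "population": 50, "area": 2.0},
]
print(risk_score(zone))
```

Dividing by the populated surface makes scores comparable between large and small zones, at the cost of ignoring how the population is distributed inside each cell.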

We assessed the reliability of this score by selecting the countries/regions that reported the highest rates of infection. These zones were selected as those with a rate higher than the upper confidence limit of a log-normal distribution of the rates.
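
One possible reading of this selection rule is sketched below. The text does not state how the upper confidence limit was computed, so the sketch assumes it is exp(mu + z * sigma), where mu and sigma are the mean and sample standard deviation of the log rates and z is the two-sided normal quantile for the chosen confidence level. The function names, the default level of 0.95 and the example rates are all assumptions.

```python
import math
from statistics import NormalDist, mean, stdev


def lognormal_upper_limit(rates, confidence=0.95):
    """Upper limit exp(mu + z * sigma) of a log-normal fitted to the rates.

    This formula is an assumed interpretation, not taken from the record.
    """
    if len(rates) < 2 or any(r <= 0 for r in rates):
        raise ValueError("need at least two rates, all positive")
    logs = [math.log(r) for r in rates]
    mu, sigma = mean(logs), stdev(logs)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.exp(mu + z * sigma)


def high_rate_zones(rates_by_zone, confidence=0.95):
    """Names of zones whose rate exceeds the assumed upper limit."""
    limit = lognormal_upper_limit(list(rates_by_zone.values()), confidence)
    return sorted(z for z, r in rates_by_zone.items() if r > limit)


# Made-up rates, for illustration only.
example = {"X": 1.0, "Y": 10.0, "Z": 100.0}
print(high_rate_zones(example, confidence=0.5))
```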

The agreement between the two maps (covid_high_rate_vs_high_risk.png, where violet dots indicate high infection rates and countries' colours indicate estimated high risk score) is the following:

Accuracy (overall percentage of correctly predicted high-rate zones): 77.25%
Kappa (agreement between the two maps): 0.46 (Good, according to Fleiss' interpretation of the score)
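
For readers who want to reproduce this kind of comparison, the sketch below computes overall accuracy and a kappa statistic from two binary labellings of the same zones (observed high rate versus estimated high risk). It assumes Cohen's kappa for two raters and two classes; the record does not state which kappa variant was used, and the input lists here are made up.

```python
def accuracy_and_kappa(observed, predicted):
    """Overall agreement and Cohen's kappa for two binary labellings."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)
    tn = sum(1 for o, p in pairs if not o and not p)
    fp = sum(1 for o, p in pairs if not o and p)
    fn = sum(1 for o, p in pairs if o and not p)
    p_observed = (tp + tn) / n
    # Agreement expected by chance from the marginal totals of each map.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    if p_chance == 1.0:
        return p_observed, float("nan")  # kappa undefined: one class only
    return p_observed, (p_observed - p_chance) / (1.0 - p_chance)


# Made-up labels: 1 = high rate / high risk, 0 = otherwise.
acc, kappa = accuracy_and_kappa([1, 1, 0, 0], [1, 0, 0, 0])
print(acc, kappa)
```

Kappa corrects the raw agreement for the agreement expected by chance, which is why it can be moderate even when accuracy looks high.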

This assessment indicates that our map can be used to estimate the risk that a given country will have a high rate of infection, and that the influence of environmental parameters on the virus's spread should be further investigated.


This experiment was done using the DataMiner cloud computing system of the D4Science e-Infrastructure and the BiodiversityLab Virtual Research Environment.
  • Coro, G., Panichi, G., Scarponi, P., & Pagano, P. (2017). Cloud computing in a distributed e‐infrastructure using the web processing service standard. Concurrency and Computation: Practice and Experience, 29(18), e4219.
