Dataset Open Access

Suitability Map of COVID-19 Virus Spread

Gianpaolo Coro

DataCite XML Export

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="" xmlns="" xsi:schemaLocation="">
  <identifier identifierType="DOI">10.5281/zenodo.3725831</identifier>
      <creatorName>Gianpaolo Coro</creatorName>
    <title>Suitability Map of COVID-19 Virus Spread</title>
    <subject>Maximum Entropy</subject>
    <subject>Carbon Dioxide</subject>
    <subject>Corona virus</subject>
    <date dateType="Issued">2020-03-20</date>
  <resourceType resourceTypeGeneral="Dataset"/>
    <alternateIdentifier alternateIdentifierType="url"></alternateIdentifier>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsVersionOf">10.5281/zenodo.3719140</relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf"></relatedIdentifier>
    <relatedIdentifier relatedIdentifierType="URL" relationType="IsPartOf"></relatedIdentifier>
    <rights rightsURI="">Creative Commons Attribution 4.0 International</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
    <description descriptionType="Abstract">&lt;p&gt;This image&amp;nbsp;reports a Maximum Entropy model that&amp;nbsp;estimates &lt;em&gt;suitable &lt;/em&gt;locations for COVID-19 spread, i.e. places that could favour the spread of the virus just in terms of environmental parameters.&lt;/p&gt;

&lt;p&gt;The model was trained just on locations in &lt;em&gt;Italy &lt;/em&gt;that have reported a rate of new infections higher than the geometric mean of all Italian infection rates. The following environmental parameters were used, which are correlated to those used by other studies:&lt;/p&gt;

&lt;ul&gt;
	&lt;li&gt;Average Annual Surface Air Temperature in 2018 (NASA)&lt;/li&gt;
	&lt;li&gt;Average Annual Precipitation in 2018 (NASA)&lt;/li&gt;
	&lt;li&gt;CO2 emission (natural+artificial) averaged between January 1979 and&amp;nbsp;December 2013 (Copernicus Atmosphere Monitoring Service)&lt;/li&gt;
	&lt;li&gt;Elevation (NOAA ETOPO2)&lt;/li&gt;
	&lt;li&gt;Population per 0.5&amp;deg; cell (NASA Gridded Population of the World)&lt;/li&gt;
&lt;/ul&gt;
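
&lt;p&gt;The training-set selection rule (keep only locations whose rate of new infections is higher than the geometric mean of all rates) can be illustrated with a minimal sketch. The function names and the example rates are hypothetical placeholders, not values or code from this dataset, and the sketch assumes strictly positive rates.&lt;/p&gt;

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive values, computed in log space."""
    if not values or not all(v > 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def select_high_rate(rates):
    """Return the names whose rate is strictly higher than the geometric mean."""
    threshold = geometric_mean(list(rates.values()))
    return sorted(name for name, rate in rates.items() if rate > threshold)


# Hypothetical infection rates per location (placeholders, not real data).
example_rates = {"A": 0.5, "B": 2.0, "C": 8.0, "D": 32.0}
high_rate = select_high_rate(example_rates)
print(high_rate)
```

&lt;p&gt;The geometric mean is less sensitive than the arithmetic mean to a few very large rates, which is one plausible reason to use it as a threshold on skewed data.&lt;/p&gt;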

&lt;p&gt;A higher resolution map, the model file (in ASC format) and all parameters used are also attached.&lt;/p&gt;

&lt;p&gt;The model indicates the highest correlation with&amp;nbsp;&lt;em&gt;infection rate&lt;/em&gt; for CO2 around 0.03 g C m^&amp;minus;2 day^&amp;minus;1, for Temperature around 11.8 &amp;deg;C, and for Precipitation around 0.3 kg m^&amp;minus;2 s^&amp;minus;1, whereas Elevation and Population density are&amp;nbsp;poorly correlated with &lt;em&gt;infection rate&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;One interesting result is that the model indicates, among others, the Hubei region in China as a high-probability location&lt;/strong&gt;, &lt;strong&gt;and Iran (around Teheran) as a suitable location for the virus&amp;#39;s spread, even though the model was not trained on these regions, i.e. it had no information about the actual spread there.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Evaluation: &lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A &lt;em&gt;risk score&lt;/em&gt; was calculated for&amp;nbsp;each country/region reported by the JHU&amp;nbsp;monitoring system (&lt;a href=""&gt;&lt;/a&gt;). This score is the summed normalised probability&amp;nbsp;in the populated locations divided by their total surface area. It represents how much the zone could potentially foster&amp;nbsp;the virus&amp;#39;s spread.&lt;/p&gt;

&lt;p&gt;We assessed the reliability of this score by selecting the countries/regions that reported the &lt;em&gt;highest rates of infection&lt;/em&gt;. These zones were selected&amp;nbsp;as those with a rate higher than the upper confidence limit of a log-normal distribution of the rates.&lt;/p&gt;

&lt;p&gt;The agreement between the two maps (&lt;a href=""&gt;covid_high_rate_vs_high_risk.png&lt;/a&gt;, where violet dots indicate &lt;em&gt;high infection rates &lt;/em&gt;and countries&amp;#39; colours indicate estimated &lt;em&gt;high risk score&lt;/em&gt;) is the following:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Accuracy &lt;/strong&gt;(overall percentage of correctly predicted high-rate zones):&amp;nbsp;&lt;strong&gt;77.25%&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Kappa &lt;/strong&gt;(agreement between the two maps): &lt;strong&gt;0.46&lt;/strong&gt; (Good, according to Fleiss&amp;#39; interpretation of the score)&amp;nbsp;&lt;/p&gt;
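
&lt;p&gt;For readers who want to recompute agreement figures of this kind, the following is a minimal sketch that derives accuracy and a kappa statistic from two binary classifications of the same zones (observed high rate versus estimated high risk). It assumes that accuracy means the overall fraction of zones on which the two maps agree and that the kappa is Cohen&amp;#39;s kappa for two raters; the record does not state the exact formulas used. The function name and the example labels are hypothetical and do not reproduce the figures reported above.&lt;/p&gt;

```python
def accuracy_and_kappa(observed, predicted):
    """Return (accuracy, Cohen's kappa) for two equal-length boolean sequences.

    observed[i]  is True when zone i reported a high infection rate.
    predicted[i] is True when zone i has a high estimated risk score.
    """
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    pairs = list(zip(observed, predicted))

    # Observed agreement: zones where both maps say "high" or both say "not high".
    both_high = sum(1 for o, p in pairs if o and p)
    both_low = sum(1 for o, p in pairs if not o and not p)
    accuracy = (both_high + both_low) / n

    # Agreement expected by chance, from the marginal frequencies of each map.
    obs_high = sum(1 for o in observed if o) / n
    pred_high = sum(1 for p in predicted if p) / n
    expected = obs_high * pred_high + (1 - obs_high) * (1 - pred_high)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (accuracy - expected) / (1 - expected)
    return accuracy, kappa


# Hypothetical labels for four zones (placeholders, not real data).
example_observed = [True, True, False, False]
example_predicted = [True, False, False, False]
example_accuracy, example_kappa = accuracy_and_kappa(example_observed, example_predicted)
print(example_accuracy, example_kappa)
```

&lt;p&gt;Kappa corrects the raw agreement for the agreement expected by chance, which is why it can be much lower than accuracy when one class dominates.&lt;/p&gt;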

&lt;p&gt;&lt;strong&gt;This assessment demonstrates that our map can be used to estimate the risk that a given country will have a high rate of infection, and indicates that the influence of environmental parameters on the virus&amp;#39;s spread should be further investigated.&lt;/strong&gt;&lt;/p&gt;</description>

    <description descriptionType="Other">This experiment was done using the DataMiner cloud computing system of the D4Science e-Infrastructure and the BiodiversityLab Virtual Research Environment</description>
    <description descriptionType="Other">{"references": ["Coro, G., Panichi, G., Scarponi, P., &amp; Pagano, P. (2017). Cloud computing in a distributed e\u2010infrastructure using the web processing service standard. Concurrency and Computation: Practice and Experience, 29(18), e4219."]}</description>