Published March 3, 2026 | Version v1
Dataset Open

Data for the analysis of mosquito bites in Catalonia (2022-2024)

  • 1. Institut Químic de Sarrià

Description

This project aims to analyse mosquito bite patterns across Catalonia using real-world data from multiple open sources. Mosquito activity is shaped by environmental and demographic conditions, and understanding these relationships can help identify risk factors and support public health initiatives. 
Our central hypothesis is that bite frequency varies with environmental variables such as temperature and humidity, as well as demographic characteristics like population density and geographic region. We also examine whether certain mosquito species are reported more frequently than others. 
To address these questions, we built a complete data-processing pipeline that integrates, cleans, and analyses information from several open datasets. 
The primary source of bite information, Mosquito Alert, relies on user-reported bites and sightings. These reports are not uniformly distributed across Catalonia and are concentrated in the Barcelona metropolitan area. This uneven spatial distribution can bias the dataset and limits how representative the observations are. A more accurate and comprehensive analysis would require a targeted data-collection campaign on mosquito bites and sightings in under-represented regions.

Codebook

Variable         Description

id               Unique user identifier (UUID) of the person who reported the mosquito bite
code             Four-digit code used to identify the person who reported the mosquito bite (????)
date             Date when the bite was reported (dd/mm/yyyy)
b_longitude      Longitude of the bite (decimal degrees)
b_latitude       Latitude of the bite (decimal degrees)
b_altitude       Altitude of the bite (m)
b_province       Province where the bite was reported
b_county         County where the bite was reported
b_municipality   City or town where the bite was reported
b_m_population   Population of the municipality
b_count          Number of bites in each entry
b_location       Bite location (outdoors, inside building, inside car, don’t know)
b_time           Part of the day when the bite was registered (Morning, midday, afternoon, night)
prob_tiger       Probability that the mosquito is a tiger mosquito
prob_culex       Probability that the mosquito is a Culex mosquito
s_id             Nearest weather station with data available
s_altitude       Altitude of the weather station (m)
s_temp           Mean temperature registered at the station (°C)
s_humidity       Mean humidity registered at the station (%)
distance         Haversine distance between the bite location and the weather station (km)
alt_diff         Altitude difference between the bite and station coordinates (m)
temp_correction  Temperature deviation (°C)
temp_adjusted    Corrected temperature (°C)
temp_range       Temperature intervals
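
The derived columns distance, alt_diff, temp_correction and temp_adjusted can be illustrated with a minimal Python sketch. The haversine formula itself is standard. The altitude correction is an assumption: this record does not state how temp_correction was computed, so the sketch uses a constant lapse rate of 6.5 °C per 1000 m and takes alt_diff as bite altitude minus station altitude. The function names and constants are hypothetical and are not taken from mosquito_analysis.ipynb.

```python
import math

EARTH_RADIUS_KM = 6371.0       # mean Earth radius
LAPSE_RATE_C_PER_M = 0.0065    # ASSUMPTION: standard-atmosphere lapse rate, not confirmed by the dataset


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def adjust_temperature(s_temp, b_altitude, s_altitude,
                       lapse_rate=LAPSE_RATE_C_PER_M):
    """Return (alt_diff, temp_correction, temp_adjusted).

    ASSUMPTION: alt_diff = bite altitude - station altitude, and the
    temperature falls linearly with altitude at `lapse_rate` (°C per m).
    """
    alt_diff = b_altitude - s_altitude
    temp_correction = -lapse_rate * alt_diff
    temp_adjusted = s_temp + temp_correction
    return alt_diff, temp_correction, temp_adjusted
```

Under these assumptions, haversine_km(0.0, 0.0, 0.0, 1.0) returns about 111.19 km (one degree of longitude at the equator), and a bite at 600 m matched to a station at 100 m reading 20 °C gives an alt_diff of 500 m, a correction of about -3.25 °C and an adjusted temperature of about 16.75 °C. The values actually stored in the dataset may follow a different correction.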

Files

mosquito_analysis.ipynb

Files (970.9 kB)

Name Size
md5:1e8c87fc72175f463a8177ac3d839aa8
102.6 kB
md5:21b7953342fa327bbdffd7f36a2410b2
868.4 kB

Additional details

Software

Programming language
Jupyter Notebook, Python, CSV