Substantial short-term demand fluctuations are common in the tourism industry. Therefore, tourism companies such as accommodation, transportation, catering, and leisure facilities have a vital interest in precise forecasts of the number of customers.
We present a novel forecasting dataset for tourism, covering four Swiss businesses (one accommodation, two transportation, and one indoor leisure), all located in the same tourist region. It spans a total of ten years, starting in 2007 and ending in 2016. The dataset allows the use of cross-series information and includes explanatory variables such as calendar effects, event data, and weather forecast information.
Machine learning (ML) practitioners, statisticians, and experts of tourism sectors are invited to investigate our dataset for new insights on short-term forecasting for industries in tourism.
The Algorithmic Business Research Lab (ABIZ) of the Lucerne University of Applied Sciences and Arts, Switzerland, researches ML algorithms for businesses to support industry partners in developing business models and services based on complex algorithms, as well as in the resulting digital transformation. In a joint effort with the Institute of Tourism (ITW) of the School of Business of the Lucerne University of Applied Sciences and Arts, Switzerland, we provide the present tourism dataset as part of a publication.
Contacts and further information can be found at http://www.abiz.ch/.
The dataset comprises 3653 days of customer numbers of four Swiss companies in the tourism sector. There are 556 feature columns, four target columns, and two mask columns, 562 columns in total.
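The stated dimensions can be cross-checked with a few lines of Python. The sketch below assumes the series runs from 1 January 2007 through 31 December 2016; the description only gives the years, so these exact dates are an assumption, although they are consistent with the 3653-day figure.

```python
from datetime import date

# Assumption: the series covers the full calendar years 2007 through 2016.
start = date(2007, 1, 1)
end = date(2016, 12, 31)

# Inclusive day count: ten years of 365 days plus the leap days
# of 2008, 2012, and 2016.
n_days = (end - start).days + 1

# Column counts as stated in the description.
n_features, n_targets, n_masks = 556, 4, 2
n_columns = n_features + n_targets + n_masks

print(n_days, n_columns)
```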
The customer volume data have daily resolution and contain, at worst, minor gaps of a few days over a common period of ten years, starting in 2007 and ending in 2016. The missing values for one transportation company and the indoor leisure company are masked, and the masks are available as indicator variables.
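When training or evaluating a model, the mask columns can be used to exclude days without an observed target. The following is a minimal sketch on toy data using pandas. The column names are hypothetical (the published names are pseudonyms), and both the encoding of the mask (1 marks a missing day) and the placeholder value 0 on masked days are assumptions that should be checked against the feature descriptions.

```python
import pandas as pd

# Toy frame standing in for the real data; all column names are hypothetical.
df = pd.DataFrame(
    {
        # Assumption: masked days carry a placeholder value of 0.
        "target_leisure": [120.0, 0.0, 95.0, 0.0, 110.0],
        # Assumption: 1 marks a day whose target value is missing.
        "mask_leisure": [0, 1, 0, 1, 0],
    },
    index=pd.date_range("2007-01-01", periods=5, freq="D"),
)

# Keep only the days on which the target was actually observed.
observed = df.loc[df["mask_leisure"] == 0, "target_leisure"]

mean_observed = observed.mean()            # masked days excluded
mean_naive = df["target_leisure"].mean()   # placeholders included

print(mean_observed, mean_naive)
```

Ignoring the mask would pull the average towards the placeholder value, so error metrics and summary statistics are best computed on the observed days only.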
The dataset contains both numerical and categorical features.
The weather forecast data consist of information about temperature, sunshine, precipitation, and wind, forecast up to three days in advance. Note that the weather forecast model is updated regularly, and therefore many features do not cover the entire period. In particular, the categorical weather summary annotations created by meteorologists are provided only for the last year.
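Because many weather features cover only part of the period, it can help to measure each column's coverage before selecting model inputs. The sketch below uses pandas on toy data. The column names are hypothetical, and it assumes that days outside a feature's coverage appear as missing values (NaN), which may differ from the encoding used in the published files.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the weather features; names are hypothetical.
idx = pd.date_range("2016-12-26", periods=6, freq="D")
df = pd.DataFrame(
    {
        "temp_day1": [1.0, 2.0, 0.5, -1.0, 0.0, 1.5],
        # Assumption: days before the feature became available are NaN.
        "wind_day3": [np.nan, np.nan, np.nan, 12.0, 9.0, 14.0],
    },
    index=idx,
)

# Fraction of days on which each feature is available.
coverage = df.notna().mean()

# Features that are available over the entire period.
full_coverage = coverage[coverage == 1.0].index.tolist()

# First day on which each feature has a value.
first_valid = {col: df[col].first_valid_index() for col in df.columns}

print(coverage)
print(full_coverage)
```

Features with partial coverage can then be dropped, restricted to the sub-period they span, or handled by a model that tolerates missing inputs.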
In the published version of the dataset, feature names are replaced with pseudonyms, but descriptions are given to identify feature groups with similar meaning. The content available for download consists of