Journal article Open Access

Generating reliable tourist accommodation statistics: Bootstrapping regression model for overdispersed long-tailed data

Van Truong, Nguyen; Shimizu, Tetsuo; Kurihara, Takeshi; Choi, Sunkyung

Purpose: Few studies have applied count data analysis to tourist accommodation data. This study investigates the characteristics of such data and seeks the best-fitting models for estimating population totals.

Methods: Using data on 10,503 hotels obtained from a nationwide Japanese survey, the bootstrap resampling method was applied to re-randomise the data. Training and test sets were derived by randomly splitting each bootstrap sample. Six count models were fitted to the training set and validated against the test set. Bootstrap distributions for parameters of significance were used for model evaluation.
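
The resample-then-split procedure described in the Methods can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the guest counts are synthetic (a hypothetical mix of zeros and lognormal draws, not the survey data), and a simple expansion estimator (training-set mean scaled to the number of hotels) stands in for the six count regression models. The function names `simulate_guest_counts` and `bootstrap_total_estimates` are invented for this example.

```python
import random
import statistics


def simulate_guest_counts(n_hotels, rng):
    """Synthetic zero-heavy, long-tailed counts (illustrative, not survey data)."""
    counts = []
    for _ in range(n_hotels):
        if rng.random() < 0.3:
            counts.append(0)  # hotel reporting no guests
        else:
            # A lognormal draw gives a long right tail; int() truncates to a count.
            counts.append(int(rng.lognormvariate(3.0, 1.2)))
    return counts


def bootstrap_total_estimates(counts, train_fraction, n_boot, rng):
    """Return one population-total estimate per bootstrap replicate."""
    n = len(counts)
    n_train = max(1, round(train_fraction * n))
    estimates = []
    for _ in range(n_boot):
        sample = rng.choices(counts, k=n)  # resample with replacement
        rng.shuffle(sample)                # random split of the bootstrap sample
        train = sample[:n_train]           # the remainder would be the test set
        # Assumption: expansion estimator instead of a fitted count model.
        estimates.append(statistics.fmean(train) * n)
    return estimates


rng = random.Random(42)
counts = simulate_guest_counts(2000, rng)
true_total = sum(counts)

se_by_fraction = {}
for fraction in (0.05, 0.85):
    estimates = bootstrap_total_estimates(counts, fraction, 200, rng)
    se_by_fraction[fraction] = statistics.stdev(estimates)
    bias = statistics.fmean(estimates) - true_total
    print(f"train={fraction:.0%}  bootstrap SE={se_by_fraction[fraction]:.0f}  "
          f"bias={bias:.0f}")
```

The spread of the bootstrap estimates serves as the standard error, and the gap between their mean and the known total gives a rough bias estimate. In this toy setting the standard error shrinks as the training fraction grows.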

Results: The outcome variable (number of guests) was found to be heterogeneous, overdispersed and long-tailed, with excess zero counts. The hurdle negative binomial and zero-inflated negative binomial models outperformed the other models. The accuracy (se) of the estimated total number of guests ranged from 3.7 to 0.4 as the training sets increased from 5% to 85%. The results appear slightly overestimated.
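
Overdispersion and excess zeros of the kind reported here are commonly checked against a Poisson benchmark, under which the variance equals the mean and the expected share of zeros is exp(-mean). The sketch below shows that check on synthetic counts. The data-generating choices and the helper name `dispersion_summary` are assumptions made for this example and are not taken from the study.

```python
import math
import random
import statistics


def dispersion_summary(counts):
    """Compare sample moments and zero share with a Poisson benchmark."""
    mean = statistics.fmean(counts)
    variance = statistics.variance(counts)
    return {
        "mean": mean,
        "variance": variance,
        # A ratio near 1 is consistent with Poisson; far above 1 suggests overdispersion.
        "dispersion_ratio": variance / mean,
        "zero_share": counts.count(0) / len(counts),
        # Share of zeros a Poisson distribution with the same mean would produce.
        "poisson_zero_share": math.exp(-mean),
    }


# Synthetic counts (assumption): 30% zeros, otherwise a truncated lognormal draw.
rng = random.Random(0)
counts = [
    0 if rng.random() < 0.3 else int(rng.lognormvariate(3.0, 1.2))
    for _ in range(5000)
]

summary = dispersion_summary(counts)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```

A dispersion ratio well above 1 together with an observed zero share above the Poisson value is the usual motivation for negative binomial, hurdle and zero-inflated specifications.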

Implications: The findings indicate that integrating the bootstrap resampling method with count regression provides a statistical tool for generating reliable tourist accommodation statistics. Using the bootstrap would help to detect and correct estimation bias.

SUBMITTED: MAR. 2019, REVISION SUBMITTED: OCT. 2019, 2nd REVISION SUBMITTED: JAN. 2020, ACCEPTED: MAR. 2020, REFEREED ANONYMOUSLY, PUBLISHED ONLINE: 30 MAY 2020