Published June 25, 2015 | Version v1
Dataset Open

Google Trends and Wikipedia Page Views

  • 1. Toyohashi University of Technology

Description

Abstract (our paper)

The frequency of a web search keyword generally reflects the degree of public interest in a particular subject matter. Search logs are therefore useful resources for trend analysis. However, access to search logs is typically restricted to search engine providers. In this paper, we investigate whether search frequency can be estimated from a different, openly available resource such as Wikipedia page views. We found that frequently searched keywords have remarkably high correlations with Wikipedia page views. This suggests that Wikipedia page views can be an effective tool for determining popular global web search trends.

Data

personal-name.txt.gz:
Each line has four columns: the first is the Wikipedia article id, the second is the search keyword, the third is the Wikipedia article title, and the fourth is the total number of page views from 2008 to 2014.
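
The four-column layout described above can be read with a short Python sketch. The description does not state the delimiter, encoding, or whether a header row is present, so this sketch assumes tab-separated UTF-8 text with no header; the names KeywordRecord, read_keywords, and load_keywords are illustrative and not part of the data set.

```python
import csv
import gzip
import io
from typing import Iterator, NamedTuple


class KeywordRecord(NamedTuple):
    article_id: str   # column 1: Wikipedia article id
    keyword: str      # column 2: search keyword
    title: str        # column 3: Wikipedia article title
    total_views: int  # column 4: total page views, 2008 to 2014


def read_keywords(stream: io.TextIOBase) -> Iterator[KeywordRecord]:
    """Yield one record per four-column line (tab delimiter is an assumption)."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:
            continue  # skip blank or malformed lines
        article_id, keyword, title, total = row
        yield KeywordRecord(article_id, keyword, title, int(total))


def load_keywords(path: str) -> list[KeywordRecord]:
    """Read a gzip-compressed keyword file such as personal-name.txt.gz."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(read_keywords(handle))
```

If the files turn out to use a different delimiter, only the delimiter argument needs to change.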

personal-name_data_google-trends.txt.gz, personal-name_data_wikipedia.txt.gz:
Each line has six columns: the first is the collection period, the second is the source (Google or Wikipedia), the third is the Wikipedia article id, the fourth is the search keyword, the fifth is the date, and the sixth is the value (the search trend for Google, the page view count for Wikipedia).
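
As a rough illustration of how the two six-column files could be compared, the sketch below groups values by collection period and article id, then computes a Pearson correlation over the dates both sources share. It assumes tab-separated rows and that the date strings in the two files are directly comparable; neither is stated in the description. Pearson correlation is used here only as an example and is not necessarily the measure used in the paper. The function names are illustrative.

```python
import csv
import math
from collections import defaultdict
from typing import Iterable


def read_series(lines: Iterable[str]) -> dict[tuple[str, str], dict[str, float]]:
    """Map (period, article id) -> {date: value} from six-column rows."""
    series: dict[tuple[str, str], dict[str, float]] = defaultdict(dict)
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 6:
            continue  # skip blank or malformed lines
        period, _source, article_id, _keyword, date, value = row
        series[(period, article_id)][date] = float(value)
    return series


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient; NaN if either series is constant."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equally long series with at least 2 points")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def correlate(google: dict[str, float], wikipedia: dict[str, float]) -> float:
    """Correlate two {date: value} series over the dates they share."""
    dates = sorted(google.keys() & wikipedia.keys())
    return pearson([google[d] for d in dates], [wikipedia[d] for d in dates])
```

In practice, read_series would be fed the text lines of each decompressed file (for example via gzip.open in text mode), and correlate would be called once per (period, article id) key present in both results.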

Publication

This data set was created for our study. If you make use of this data set, please cite:
Mitsuo Yoshida, Yuki Arase, Takaaki Tsunoda, Mikio Yamamoto. Wikipedia Page View Reflects Web Search Trend. Proceedings of the 2015 ACM Web Science Conference (WebSci '15). no.65, pp.1-2, 2015.
http://dx.doi.org/10.1145/2786451.2786495
http://arxiv.org/abs/1509.02218 (author-created version)

Note

The raw Wikipedia page view data are available at the following page:
http://dumps.wikimedia.org/other/pagecounts-raw/

Files

Files (79.3 MB)

Checksum Size
md5:a78ab33caae84e05e9d7af757de0a037 78.8 kB
md5:22f533eabd676c3a47a7b5ea0fbdf536 2.4 MB
md5:1a7a18141cdfc0cbcc11e0f1f72a1871 6.0 MB
md5:9880ed76768196d2eec9a505a0a07e31 162.1 kB
md5:3f712e58a1a9f323bedaf3231c05ca33 4.0 MB
md5:b35738a81b3e0ff183e8bd67a2ac5ae2 12.5 MB
md5:2dde92c42ff01d75f4d31d0c883d616d 220.9 kB
md5:c05bb495e61c5858eaa39b8b77016318 6.6 MB
md5:f145cc986fcc1107552004379aa6f077 17.7 MB
md5:e4857aeef900e7aaaad543ab2dd93d4b 180.0 kB
md5:988d69454b1dd1c7be32ff5d7c21c5b9 10.2 MB
md5:2aed7230dda24cee0d84d3584a78f60e 19.4 MB
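
The md5 values listed above can be used to check a downloaded file. The following is a minimal sketch using Python's standard hashlib module; the helper names md5_of_file and matches_listing are illustrative. The listing does not show which file name belongs to which checksum, so the pairing has to be taken from the download page itself.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:a78ab33c...'."""
    return md5_of_file(path) == listed.removeprefix("md5:").lower()
```

For example, matches_listing("personal-name.txt.gz", "md5:...") would return True only if the local copy has the listed checksum, where "md5:..." stands for whichever entry corresponds to that file.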

Additional details

References

  • Mitsuo Yoshida, Yuki Arase, Takaaki Tsunoda, Mikio Yamamoto. Wikipedia Page View Reflects Web Search Trend. Proceedings of the 2015 ACM Web Science Conference (WebSci '15). no.65, pp.1-2, 2015.
  • Mitsuo Yoshida, Yuki Arase, Takaaki Tsunoda, Mikio Yamamoto. Wikipedia Page View Analysis for Search Trend Prediction. Proceedings of the Annual Conference of Japanese Society for Artificial Intelligence (in Japanese). vol.29, no.2I1-1, pp.1-4, 2015.