Dataset Open Access

Google Trends and Wikipedia Page Views

Yoshida, Mitsuo

Abstract (our paper)

The frequency of a web search keyword generally reflects the degree of public interest in a particular subject matter. Search logs are therefore useful resources for trend analysis. However, access to search logs is typically restricted to search engine providers. In this paper, we investigate whether search frequency can be estimated from a different resource, such as Wikipedia page views, which are published as open data. We found that frequently searched keywords correlate remarkably highly with Wikipedia page views. This suggests that Wikipedia page views can be an effective tool for determining popular global web search trends.

Data

personal-name.txt.gz:
The first column is the Wikipedia article ID, the second is the search keyword, the third is the Wikipedia article title, and the fourth is the total number of page views from 2008 to 2014.
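
The four-column layout described above can be read with a few lines of Python. This is a minimal sketch, not a loader supplied with the data set: it assumes the file is tab-separated UTF-8 text without a header row, which the description does not state, and the names KeywordEntry and read_keyword_file are hypothetical.

```python
import gzip
from typing import NamedTuple


class KeywordEntry(NamedTuple):
    """One row of a keyword list file (hypothetical helper type)."""
    article_id: str   # column 1: Wikipedia article ID
    keyword: str      # column 2: search keyword
    title: str        # column 3: Wikipedia article title
    total_views: int  # column 4: total page views, 2008 to 2014


def read_keyword_file(path, delimiter="\t"):
    """Read a keyword list such as personal-name.txt.gz.

    Assumption: tab-separated columns and no header row. Pass another
    delimiter if the file turns out to use a different separator.
    """
    entries = []
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split(delimiter)
            if len(fields) < 4:
                continue  # skip blank or incomplete lines
            entries.append(
                KeywordEntry(fields[0], fields[1], fields[2], int(fields[3]))
            )
    return entries
```

Calling read_keyword_file("personal-name.txt.gz") on a downloaded copy would return a list that can then be sorted by total_views, for example to find the most viewed articles.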

personal-name_data_google-trends.txt.gz, personal-name_data_wikipedia.txt.gz:
The first column is the collection period, the second is the source (Google or Wikipedia), the third is the Wikipedia article ID, the fourth is the search keyword, the fifth is the date, and the sixth is the value (search trend or page view count).
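
As an illustration of how the two six-column files might be compared, the sketch below loads both and computes a Pearson correlation for one article over the (period, date) pairs present in both sources. Several things here are assumptions rather than facts from this record: the tab delimiter, the absence of a header row, that dates in the two files are directly comparable strings, and the choice of Pearson correlation, which is used only as an example and is not necessarily the measure used in the paper. The function names are hypothetical.

```python
import gzip
import math
from collections import defaultdict


def read_series(path, delimiter="\t"):
    """Return {article_id: {(period, date): value}} for one data file.

    Assumption: tab-separated columns, no header row, numeric sixth column.
    """
    series = defaultdict(dict)
    with gzip.open(path, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\r\n").split(delimiter)
            if len(fields) < 6:
                continue  # skip blank or incomplete lines
            period, _source, article_id, _keyword, date, value = fields[:6]
            series[article_id][(period, date)] = float(value)
    return series


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists, or None if undefined."""
    n = len(xs)
    if n < 2 or n != len(ys):
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant series has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def correlate(trends, views, article_id):
    """Correlate two sources over the (period, date) keys they share."""
    a = trends.get(article_id, {})
    b = views.get(article_id, {})
    common = sorted(set(a) & set(b))
    return pearson([a[k] for k in common], [b[k] for k in common])
```

If the two sources use different date granularities, no keys will match and correlate returns None; in that case the page views would first need to be aggregated to the granularity of the search trend values.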

Publication

This data set was created for our study. If you make use of this data set, please cite:
Mitsuo Yoshida, Yuki Arase, Takaaki Tsunoda, Mikio Yamamoto. Wikipedia Page View Reflects Web Search Trend. The 2015 ACM Web Science conference (WebSci15). Oxford, UK, June 28 - July 1, 2015.
http://dx.doi.org/10.1145/2786451.2786495
http://arxiv.org/abs/1509.02218 (author-created version)

Note

The raw Wikipedia page view data are available on the following page.
http://dumps.wikimedia.org/other/pagecounts-raw/

Name                                     Size      MD5
cartoon.txt.gz                           78.8 kB   a78ab33caae84e05e9d7af757de0a037
cartoon_data_google-trends.txt.gz        2.4 MB    22f533eabd676c3a47a7b5ea0fbdf536
cartoon_data_wikipedia.txt.gz            6.0 MB    1a7a18141cdfc0cbcc11e0f1f72a1871
comic.txt.gz                             162.1 kB  9880ed76768196d2eec9a505a0a07e31
comic_data_google-trends.txt.gz          4.0 MB    3f712e58a1a9f323bedaf3231c05ca33
comic_data_wikipedia.txt.gz              12.5 MB   b35738a81b3e0ff183e8bd67a2ac5ae2
movie.txt.gz                             220.9 kB  2dde92c42ff01d75f4d31d0c883d616d
movie_data_google-trends.txt.gz          6.6 MB    c05bb495e61c5858eaa39b8b77016318
movie_data_wikipedia.txt.gz              17.7 MB   f145cc986fcc1107552004379aa6f077
personal-name.txt.gz                     180.0 kB  e4857aeef900e7aaaad543ab2dd93d4b
personal-name_data_google-trends.txt.gz  10.2 MB   988d69454b1dd1c7be32ff5d7c21c5b9
personal-name_data_wikipedia.txt.gz      19.4 MB   2aed7230dda24cee0d84d3584a78f60e
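
The md5 values listed with each file can be used to check that a download is intact. The following is a small sketch using Python's standard hashlib module; the function names md5_of_file and verify_download are hypothetical, and the file path is whatever location the file was saved to.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """True if the file's MD5 matches the listed value (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, verify_download("personal-name.txt.gz", "e4857aeef900e7aaaad543ab2dd93d4b") should return True for an uncorrupted copy of that file. The checksum is computed over the compressed .gz file as downloaded, not over its decompressed contents.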
  • Mitsuo Yoshida, Yuki Arase, Takaaki Tsunoda, Mikio Yamamoto. Wikipedia Page View Analysis for Search Trend Prediction. Proceedings of the Annual Conference of Japanese Society for Artificial Intelligence (in Japanese). vol.29, no.2I1-1, pp.1-4, Hokkaido, JAPAN, May 20 - June 2, 2015.
