Google Trends and Wikipedia Page Views
Description
Abstract (our paper)
The frequency of a web search keyword generally reflects the degree of public interest in a particular subject matter. Search logs are therefore useful resources for trend analysis. However, access to search logs is typically restricted to search engine providers. In this paper, we investigate whether search frequency can be estimated from a different resource such as Wikipedia page views of open data. We found frequently searched keywords to have remarkably high correlations with Wikipedia page views. This suggests that Wikipedia page views can be an effective tool for determining popular global web search trends.
Data
personal-name.txt.gz:
Each line has four columns: the Wikipedia article ID, the search keyword, the Wikipedia article title, and the total number of page views from 2008 to 2014.
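The four-column layout described above could be read along the following lines. This is only a minimal sketch: the record does not state the column delimiter, the text encoding, or whether a header row is present, so tab-separated UTF-8 text without a header is assumed here. The names `NameRecord`, `read_name_records`, and `read_name_file` are hypothetical helpers introduced for illustration.

```python
import csv
import gzip
import io
from typing import Iterable, Iterator, List, NamedTuple


class NameRecord(NamedTuple):
    article_id: str   # column 1: Wikipedia article id
    keyword: str      # column 2: search keyword
    title: str        # column 3: Wikipedia article title
    total_views: int  # column 4: total page views, 2008 to 2014


def read_name_records(lines: Iterable[str]) -> Iterator[NameRecord]:
    """Parse lines, ASSUMING tab-separated columns and no header row."""
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 4:
            continue  # skip anything that does not match the four-column layout
        article_id, keyword, title, total = row
        yield NameRecord(article_id, keyword, title, int(total))


def read_name_file(path: str) -> List[NameRecord]:
    """Read a gzip-compressed file such as personal-name.txt.gz (UTF-8 assumed)."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(read_name_records(handle))


# Usage with made-up sample data (not taken from the data set):
sample = io.StringIO("123\texample keyword\tExample_Title\t4567\n")
records = list(read_name_records(sample))
print(records[0].keyword, records[0].total_views)
```

If the files turn out to use a different delimiter or include a header line, the `delimiter` argument and the `int(total)` conversion are the two places to adjust.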
personal-name_data_google-trends.txt.gz, personal-name_data_wikipedia.txt.gz:
Each line has six columns: the collection period, the source (Google or Wikipedia), the Wikipedia article ID, the search keyword, the date, and the value (search trend or page view count).
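As a rough illustration of how the six-column files might be used, the sketch below groups values into per-article series and computes a Pearson correlation between two series on the dates they share. It is not necessarily the procedure used in the paper. Tab-separated columns are an assumption, as are the literal source labels and date strings in the sample; `read_series` and `pearson_on_shared_dates` are hypothetical helper names.

```python
import csv
import io
import math
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Series = Dict[str, float]      # date string -> value
Key = Tuple[str, str, str]     # (period, source, article id)


def read_series(lines: Iterable[str]) -> Dict[Key, Series]:
    """Group values by (period, source, article id), ASSUMING tab-separated columns."""
    grouped: Dict[Key, Series] = defaultdict(dict)
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 6:
            continue  # skip lines that do not match the six-column layout
        period, source, article_id, _keyword, date, value = row
        grouped[(period, source, article_id)][date] = float(value)
    return dict(grouped)


def pearson_on_shared_dates(a: Series, b: Series) -> float:
    """Pearson correlation of two series, using only dates present in both."""
    dates = sorted(set(a) & set(b))
    if len(dates) < 2:
        raise ValueError("need at least two shared dates")
    xs = [a[d] for d in dates]
    ys = [b[d] for d in dates]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Usage with made-up rows; the labels "p", "Google", "Wikipedia" and the
# date format are placeholders, not values confirmed from the files.
sample = io.StringIO(
    "p\tGoogle\t1\tkw\t2008-01-06\t10\n"
    "p\tGoogle\t1\tkw\t2008-01-13\t20\n"
    "p\tGoogle\t1\tkw\t2008-01-20\t30\n"
    "p\tWikipedia\t1\tkw\t2008-01-06\t100\n"
    "p\tWikipedia\t1\tkw\t2008-01-13\t200\n"
    "p\tWikipedia\t1\tkw\t2008-01-20\t300\n"
)
series = read_series(sample)
r = pearson_on_shared_dates(series[("p", "Google", "1")],
                            series[("p", "Wikipedia", "1")])
print(round(r, 6))
```

Matching on exact date strings only works if both sources report values at the same granularity; if they do not, the series would need to be resampled to a common interval first.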
Publication
This data set was created for our study. If you make use of this data set, please cite:
Mitsuo Yoshida, Yuki Arase, Takaaki Tsunoda, Mikio Yamamoto. Wikipedia Page View Reflects Web Search Trend. Proceedings of the 2015 ACM Web Science Conference (WebSci '15). no.65, pp.1-2, 2015.
http://dx.doi.org/10.1145/2786451.2786495
http://arxiv.org/abs/1509.02218 (author-created version)
Note
The raw Wikipedia page view data are available at the following page.
http://dumps.wikimedia.org/other/pagecounts-raw/
Files
(79.3 MB)
Checksum | Size |
---|---|
md5:a78ab33caae84e05e9d7af757de0a037 | 78.8 kB |
md5:22f533eabd676c3a47a7b5ea0fbdf536 | 2.4 MB |
md5:1a7a18141cdfc0cbcc11e0f1f72a1871 | 6.0 MB |
md5:9880ed76768196d2eec9a505a0a07e31 | 162.1 kB |
md5:3f712e58a1a9f323bedaf3231c05ca33 | 4.0 MB |
md5:b35738a81b3e0ff183e8bd67a2ac5ae2 | 12.5 MB |
md5:2dde92c42ff01d75f4d31d0c883d616d | 220.9 kB |
md5:f145cc986fcc1107552004379aa6f077 | 17.7 MB |
md5:c05bb495e61c5858eaa39b8b77016318 | 6.6 MB |
md5:e4857aeef900e7aaaad543ab2dd93d4b | 180.0 kB |
md5:988d69454b1dd1c7be32ff5d7c21c5b9 | 10.2 MB |
md5:2aed7230dda24cee0d84d3584a78f60e | 19.4 MB |
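A downloaded file can be checked against the `md5:` values in the file listing. The following is a minimal sketch using Python's standard `hashlib`; `md5_hex` and `matches_listing` are hypothetical helper names, and the local path passed to `matches_listing` is whatever name the file was saved under.

```python
import hashlib
import io
from typing import BinaryIO


def md5_hex(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: str, listed: str) -> bool:
    """Compare a local file with a listed value such as 'md5:a78a...'."""
    expected = listed.split(":", 1)[-1].strip().lower()  # drop the 'md5:' prefix
    with open(path, "rb") as handle:
        return md5_hex(handle) == expected


# Usage on in-memory bytes (a stand-in for a downloaded file):
print(md5_hex(io.BytesIO(b"hello")))
```

Reading in chunks keeps memory use flat, which matters little for files of this size but costs nothing.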
Additional details
References
- Mitsuo Yoshida, Yuki Arase, Takaaki Tsunoda, Mikio Yamamoto. Wikipedia Page View Reflects Web Search Trend. Proceedings of the 2015 ACM Web Science Conference (WebSci '15). no.65, pp.1-2, 2015.
- Mitsuo Yoshida, Yuki Arase, Takaaki Tsunoda, Mikio Yamamoto. Wikipedia Page View Analysis for Search Trend Prediction. Proceedings of the Annual Conference of Japanese Society for Artificial Intelligence (in Japanese). vol.29, no.2I1-1, pp.1-4, 2015.