Published September 2, 2015 | Version v1
Journal article Open

K-Means Clustering For Segment Web Search Results

Description

Clustering is a powerful technique for segmenting relevant data into different levels. This study proposes a K-means clustering method for clustering the web search results returned by search engines. To represent documents we used the vector space model, and we used cosine similarity to measure the similarity between the user query and the search results. As an improvement to K-means clustering, we used the distortion curve method to identify the optimal initial number of clusters.
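
The steps named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`tf_vector`, `cosine`, `kmeans`), the raw term-frequency weighting, the length normalisation before clustering, the squared Euclidean distance inside K-means, and the toy documents are all assumptions made for illustration. The distortion curve is taken here to be the sum of squared distances from each document to its assigned centroid, computed for each candidate number of clusters.

```python
import math
import random
from collections import Counter


def tf_vector(text, vocab):
    """Vector space model: term-frequency vector over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[t]) for t in vocab]


def cosine(u, v):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def normalize(v):
    """Scale a vector to unit length (assumed preprocessing step)."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v] if n else list(v)


def sqdist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sqdist(p, centroids[j]))
            for p in points]


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style K-means. Returns (centroids, labels, distortion)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    labels = assign(points, centroids)
    distortion = sum(sqdist(p, centroids[l]) for p, l in zip(points, labels))
    return centroids, labels, distortion


# Hypothetical search-result snippets and user query.
docs = [
    "jaguar car speed engine",
    "jaguar car dealer price",
    "jaguar cat jungle habitat",
    "jaguar cat prey jungle",
]
query = "jaguar car"

vocab = sorted({w for d in docs for w in d.lower().split()})
doc_vecs = [tf_vector(d, vocab) for d in docs]
query_vec = tf_vector(query, vocab)

# Similarity between the query and each search result.
similarities = [cosine(query_vec, v) for v in doc_vecs]

# Distortion for k = 1 .. number of documents; the "elbow" of this
# curve is the candidate number of clusters.
points = [normalize(v) for v in doc_vecs]
distortions = [kmeans(points, k)[2] for k in range(1, len(points) + 1)]
```

Plotting `distortions` against k and choosing the point where the curve stops dropping sharply gives the number of clusters to use. A real system would likely replace the raw term counts with a weighting such as TF-IDF and use a better seeding strategy than uniform random sampling.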

Files

K-Means_Clustering_For_Segment_Web_Search_Results.pdf

Files (379.4 kB)
