Published September 21, 2011 | Version 10009
Journal article Open

Indonesian News Classification using Support Vector Machine

Description

Digital news covering a wide variety of topics is abundant on the internet. The problem is to classify each news item into its appropriate category so that users can find relevant news quickly. A classifier engine assigns each news item to its category automatically. This research employs the Support Vector Machine (SVM) to classify Indonesian news. SVM is a robust method for binary classification. Its core step is the construction of an optimal separating hyperplane between the two classes. For multiclass problems, a mechanism called one-against-one combines the results of the binary classifiers. Documents were taken from the Indonesian digital news site www.kompas.com. The experiment showed a promising result, with an accuracy of 85%. The system is therefore feasible for Indonesian news classification.
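
To illustrate the one-against-one mechanism mentioned in the description, the following is a minimal sketch, not the authors' implementation. It assumes one linear decision function per pair of classes and combines the pairwise decisions by majority vote, so k classes need k(k-1)/2 binary classifiers. The category names, the three-dimensional feature vectors, and the weights are hypothetical and hand-picked for illustration; in practice each pairwise function would come from training an SVM on the documents of the two classes involved.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, Dict, Sequence, Tuple

# A binary decision function returns a signed score: a positive score favours
# the first class of the pair, any other score favours the second class.
DecisionFn = Callable[[Sequence[float]], float]


def linear_decision(weights: Sequence[float], bias: float) -> DecisionFn:
    """Build a linear decision function f(x) = w . x + b (a separating plane)."""
    def decide(x: Sequence[float]) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return decide


def one_against_one_predict(
    x: Sequence[float],
    classes: Sequence[str],
    pairwise: Dict[Tuple[str, str], DecisionFn],
) -> str:
    """Combine the pairwise binary decisions by majority vote."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        score = pairwise[(a, b)](x)
        votes[a if score > 0 else b] += 1
    # Ties are broken by the order of `classes` (an arbitrary choice here).
    return max(classes, key=lambda c: votes[c])


# Hypothetical categories and hand-picked (untrained) weights.
# Each feature is an assumed count of terms typical of one category.
CLASSES = ["economy", "sports", "technology"]
PAIRWISE = {
    ("economy", "sports"): linear_decision([1.0, -1.0, 0.0], 0.0),
    ("economy", "technology"): linear_decision([1.0, 0.0, -1.0], 0.0),
    ("sports", "technology"): linear_decision([0.0, 1.0, -1.0], 0.0),
}

if __name__ == "__main__":
    # "sports" wins two of the three pairwise contests for this vector.
    print(one_against_one_predict([0.0, 3.0, 1.0], CLASSES, PAIRWISE))  # sports
```

Majority voting is only one way to combine pairwise results; the description does not state which combination rule the system uses, so the voting rule and the tie-breaking order here are assumptions.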

Files

10009.pdf

Files (163.5 kB)

md5:d5130861fd52477634d3d2495f404580

Additional details

References

  • W. S. Maulsby, "Getting in News", in Mondry, 2008, pp. 132-133
  • A. Z. Arifin, and A. N. Setiono, "Klasifikasi Dokumen Berita Kejadian Berbahasa Indonesia dengan Algoritma Single Pass Clustering", Institut Teknologi Sepuluh Nopember (ITS), Surabaya. http://mail.itssby.edu/~agusza/SITIAKlasifikasiEvent.pdf.
  • I. Saputra, "Analisa Dan Implementasi Klasifikasi Berita Berbahasa Indonesia Menggunakan Metode Naive Bayes" (Analysis and Implementation of Classification Indonesian News With Naive Bayes Method), Institut Teknologi Telkom, Bandung.
  • M. Srinivas, and A. H. Sung. "Feature Selection for Intrusion Detection Using Neural Networks and Support Vector Machines", in Journal of Department of Computer Science, MIT. USA, 2003.
  • Y. Yang, and X. Liu, "A Re-examination of Text Categorization Methods", Proceedings of SIGIR-99, 22nd ACM International Conference on Research and Development in Information Retrieval, 1999, pp. 42-49.
  • F. Z. Tala, "A Study of Stemming Effects on Information Retrieval in Bahasa Indonesia", Master of Logic Project, Institute for Logic, Language and Computation, Universiteit van Amsterdam, The Netherlands, 2003. www.illc.uva.nl/Publications/ResearchReports/MoL-200302.text.pdf.
  • J. C. Platt, "Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines", Microsoft Research, 1998.
  • N. Cristianini, and J. Shawe-Taylor, "An Introduction to Support Vector Machines", Cambridge, UK: Cambridge University Press, 2000.