Indonesian News Classification using Support Vector Machine
Creators
Description
Digital news covering a wide variety of topics is abundant on the internet. The problem is to assign each news item to its appropriate category so that users can find relevant news quickly. A classifier engine is used to sort news automatically into the respective categories. This research employs a Support Vector Machine (SVM) to classify Indonesian news. SVM is a robust method for binary classification. At its core, SVM constructs an optimal separating hyperplane between the two classes. For the multiclass problem, a mechanism called one-against-one is used to combine the binary classification results. Documents were taken from the Indonesian digital news site www.kompas.com. The experiment showed promising results, with an accuracy of 85%. The system is therefore feasible for Indonesian news classification.
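The one-against-one mechanism can be illustrated with a minimal sketch: one binary classifier is built for every pair of categories (k(k-1)/2 classifiers for k categories), each casts a vote, and the category with the most votes wins. Everything concrete below is an assumption made for illustration. The category names, the Indonesian keywords, and the hand-set weights are hypothetical, and the simple keyword scorer only stands in for a trained binary SVM; it is not the method used in the paper.

```python
from collections import Counter
from itertools import combinations

# Hypothetical categories; the abstract does not list the real ones.
CATEGORIES = ["economy", "sports", "technology"]

# Hypothetical hand-set term weights. A trained SVM would learn its own
# decision function from labelled documents instead.
TERM_WEIGHTS = {
    "economy": {"bank": 1.0, "inflasi": 1.0, "rupiah": 1.0},
    "sports": {"gol": 1.0, "liga": 1.0, "pemain": 1.0},
    "technology": {"aplikasi": 1.0, "internet": 1.0, "ponsel": 1.0},
}


def score(tokens, category):
    """Sum the weights of the tokens that are known for this category."""
    weights = TERM_WEIGHTS[category]
    return sum(weights.get(token, 0.0) for token in tokens)


def pairwise_decision(tokens, first, second):
    """Stand-in for one binary classifier: pick one of two categories.

    Ties go to `first`, which is an arbitrary choice for this sketch.
    """
    return first if score(tokens, first) >= score(tokens, second) else second


def one_against_one(tokens, categories=CATEGORIES):
    """Run every pairwise classifier and return the majority-vote winner."""
    votes = Counter(
        pairwise_decision(tokens, first, second)
        for first, second in combinations(categories, 2)
    )
    winner = votes.most_common(1)[0][0]
    return winner, votes


label, votes = one_against_one("pemain liga mencetak gol".split())
print(label, dict(votes))
```

With three categories there are three pairwise decisions. In a real system each `pairwise_decision` call would be replaced by a binary SVM trained only on documents from the two categories involved.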
Files
10009.pdf (163.5 kB, md5:d5130861fd52477634d3d2495f404580)