Development for performance of Porter stemmer algorithm

Manhal Elias Polus; Thekra Abbas

doi:10.15587/1729-4061.2021.225362

Published February 26, 2021 | Version v1

Journal article Open


1. Al-Mustansiriyah University

The Porter stemmer algorithm is a widely used and essential tool for natural language processing in the area of information access. Stemming strips the final morphological and diacritical endings from English words to reduce them to their root form, called the stem or root, in the primary text processing stage. In other words, it is a linguistic process that extracts the main part of a word, which may be close to the related root. Text classification is a major task in extracting relevant information from a large volume of data. In this paper, we suggest ways to improve a version of the Porter algorithm in order to overcome its limitations and to save time and memory by reducing the size of the words. The system uses the improved Porter stemming technique for word pruning. It performs cognitive-inspired computing to discover morphologically related words from the corpus without any human intervention or language-specific knowledge. The improved Porter algorithm is compared with the original stemmer. It shows better performance and enables more accurate information retrieval (IR).
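To make the idea of suffix stripping concrete, the sketch below implements only step 1a of the original Porter (1980) stemmer, which handles plural endings. It is an illustration of the general technique, not the improved algorithm proposed in the paper, and the function name `porter_step_1a` is a hypothetical choice made for this example.

```python
def porter_step_1a(word: str) -> str:
    """Apply only step 1a of the original Porter (1980) stemmer.

    Assumes `word` is already lowercase. This is NOT the paper's
    improved stemmer, and it is not the full Porter algorithm.
    """
    # Rules are tried in order; the first (longest) matching suffix wins.
    rules = [
        ("sses", "ss"),  # caresses -> caress
        ("ies", "i"),    # ponies   -> poni
        ("ss", "ss"),    # caress   -> caress (protects the double s)
        ("s", ""),       # cats     -> cat
    ]
    for suffix, replacement in rules:
        if word.endswith(suffix):
            return word[: len(word) - len(suffix)] + replacement
    return word


if __name__ == "__main__":
    for w in ["caresses", "ponies", "caress", "cats", "stem"]:
        print(w, "->", porter_step_1a(w))
```

Note that "ponies" becomes "poni" rather than "pony": a stem only needs to be shared by related word forms, it does not have to be a dictionary word. The full Porter algorithm applies several further steps after this one.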

Files

Development for performance of Porter stemmer algorithm.pdf (2.0 MB, md5:74a38b3c1c1740ad86e008f031b70218)


