Published October 25, 2007 | Version 5974
Journal article Open

Enhancing Retrieval Effectiveness of Malay Documents by Exploiting Implicit Semantic Relationship between Words

Description

Phrases have a long history in information retrieval, particularly in commercial systems. Exploiting implicit semantic relationships between words in the form of base noun phrases (BaseNP) has yielded significant improvements in precision in many IR studies. Our research focuses on linguistic phrases, which are language dependent. Our results show that using BaseNP can improve retrieval performance even though more than 62% of word formation in the Malay language is based on derivational affixes and suffixes.
