Determining the Gender of Korean Names for Pronoun Generation

Seong-Bae Park; Hee-Geun Yoon

doi:10.5281/zenodo.1061842

Published August 23, 2007 | Version 5867

Journal article Open

Classifying the gender of names correctly is an important task in Korean-English machine translation. When a sentence consists of two or more clauses and its only explicit subject is a proper noun, the gender of that proper noun must be known to translate the sentence correctly, because singular pronouns are marked for gender in English but not in Korean. More generally, this task extends to the gender classification of Korean names in general. This paper proposes a statistical method for the problem. By treating a name simply as a sequence of syllables, statistics for each name can be gathered from a collection of names. In an evaluation, the proposed method improves accuracy over simple lookup in the collection: the lookup method achieves 64.11% accuracy, while the proposed method achieves 81.49%. This suggests that the proposed method is better suited to the gender classification of Korean names.
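As a rough illustration of the syllable-based idea, the following Python sketch scores a given name under per-gender syllable statistics. The abstract does not specify the model, so the naive Bayes formulation with add-one smoothing, the function names `train` and `classify`, and the six toy names are all assumptions made for illustration, not the authors' method or data.

```python
import math
from collections import Counter

GENDERS = ("M", "F")

# Hypothetical toy collection of (given name, gender) pairs, for illustration only.
TOY_NAMES = [
    ("민준", "M"), ("준호", "M"), ("성호", "M"),
    ("지영", "F"), ("민지", "F"), ("영희", "F"),
]


def train(names):
    """Collect per-gender syllable counts from (given_name, gender) pairs."""
    syll_counts = {g: Counter() for g in GENDERS}
    class_counts = Counter()
    vocab = set()
    for name, gender in names:
        class_counts[gender] += 1
        # Each precomposed Hangul character is one syllable.
        for syll in name:
            syll_counts[gender][syll] += 1
            vocab.add(syll)
    return syll_counts, class_counts, vocab


def classify(name, model):
    """Return the gender with the higher smoothed log-probability."""
    syll_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    best, best_score = None, -math.inf
    for gender in GENDERS:
        # Add-one smoothing on the class prior (assumed, not from the paper).
        score = math.log((class_counts[gender] + 1) / (total + len(GENDERS)))
        # Add-one smoothing on syllables, with one extra slot for unseen ones.
        denom = sum(syll_counts[gender].values()) + len(vocab) + 1
        for syll in name:
            score += math.log((syll_counts[gender][syll] + 1) / denom)
        if score > best_score:
            best, best_score = gender, score
    return best


model = train(TOY_NAMES)
lookup = dict(TOY_NAMES)  # simple lookup baseline over the same collection
```

With this toy collection, `lookup.get("지희")` returns `None` because the name is not listed, whereas `classify("지희", model)` can still return a label from the syllables it shares with listed names. This is the kind of gap between lookup and syllable statistics that the abstract describes, though the accuracy figures above come from the authors' own data and method.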
