Title: Ensembles of classifiers for Chinese word sense disambiguation
Authors: Wu, Yunfang
Wang, Miao
Jin, Peng
Yu, Shiwen
Affiliation: School of Electronic Engineering and Computer Science, Peking University, Beijing 100871, China
School of Software and Microelectronics, Peking University, Beijing 102600, China
Issue Date: 2008
Publisher: Jisuanji Yanjiu yu Fazhan/Computer Research and Development
Citation: Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2008, 45(8), 1354-1361.
Abstract: Word sense disambiguation has long been a central concern in natural language processing, and classifier ensembles are one of four current directions in machine learning research. This paper presents a systematic study of classifier ensembles for Chinese word sense disambiguation. Nine combining strategies are evaluated: product, average, max, min, majority voting, rank-based voting, weighted voting, weighted probability, and best single combining. Of these, product, average, and max had not been applied to word sense disambiguation in previous work. Support vector machine, naive Bayes, and decision tree are the three component classifiers. All three use the same four kinds of features: bag of words, words with position, parts of speech with position, and 2-gram collocations. Experiments are conducted on two datasets: the first consists of 18 ambiguous words selected from a Chinese semantic corpus, and the second is the multilingual Chinese-English lexical sample task at SemEval-2007. The results show that the average, product, and max strategies, applied here for the first time to Chinese word sense disambiguation, exceed the accuracy of the best single classifier (the support vector machine) and also outperform the other six combining methods.
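Five of the nine combining strategies named in the abstract (product, average, max, min, and majority voting) can be illustrated with a minimal Python sketch. It assumes the textbook form of each fixed rule, applied to per-classifier probability distributions over the same ordered sense list; the abstract does not give the paper's exact formulations, so this is not the authors' implementation. The function name `combine` and the probability values are hypothetical, chosen only for illustration. The weighted and rank-based strategies are left out because they depend on weights and rankings that the abstract does not describe.

```python
from collections import Counter
from math import prod


def combine(prob_lists, rule):
    """Pick a sense index by combining classifier outputs with a fixed rule.

    prob_lists: one list of probabilities per classifier, all over the
                same ordered set of senses.
    rule: "product", "average", "max", "min", or "majority".
    """
    n_senses = len(prob_lists[0])

    if rule == "majority":
        # Each classifier votes for its own most probable sense.
        votes = Counter(
            max(range(n_senses), key=p.__getitem__) for p in prob_lists
        )
        # Ties go to the sense that received its first vote earliest.
        return votes.most_common(1)[0][0]

    reducers = {
        "product": prod,
        "average": lambda xs: sum(xs) / len(xs),
        "max": max,
        "min": min,
    }
    reduce_fn = reducers[rule]
    # Score each sense by reducing its probabilities across classifiers.
    scores = [reduce_fn([p[s] for p in prob_lists]) for s in range(n_senses)]
    return max(range(n_senses), key=scores.__getitem__)


# Hypothetical outputs for one ambiguous word with three senses.
outputs = [
    [0.50, 0.40, 0.10],  # stand-in for the support vector machine
    [0.10, 0.60, 0.30],  # stand-in for naive Bayes
    [0.45, 0.35, 0.20],  # stand-in for the decision tree
]

for rule in ("product", "average", "max", "min", "majority"):
    print(rule, combine(outputs, rule))
```

With these made-up numbers, two of the three classifiers narrowly prefer sense 0, so majority voting selects sense 0, while the four probability-based rules select sense 1 because one classifier assigns sense 0 a very low probability. This shows only how the rules can disagree; it says nothing about which rule is more accurate.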
Appears in Collections: School of Electronics Engineering and Computer Science (信息科学技术学院)

License: See PKU IR operational policies.