Title: Chinese abbreviation identification using abbreviation-template features and context information
Authors: Sun, Xu; Wang, Houfeng
Affiliation: Peking Univ, Dept Comp Sci & Technol, Sch Elect Engn & Comp Sci, Beijing 100871, Peoples R China.
Issue Date: 2006
Citation: Computer Processing of Oriental Languages, Proceedings: Beyond the Orient: The Research Challenges Ahead. Vol. 4285, pp. 245-255.
Abstract: Chinese abbreviations are frequently used without being defined, which poses considerable difficulty for NLP. In this study, the definition-independent abbreviation identification problem is proposed and addressed as a classification task in which abbreviation candidates are classified as either 'abbreviation' or 'non-abbreviation' according to the posterior probability. To identify new abbreviations from existing ones, our solution adds generalization capability to the abbreviation lexicon by replacing words with word classes, thereby creating abbreviation-templates. An SVM model using abbreviation-template features as well as context information is employed as the classifier. Evaluation on a raw Chinese corpus yields encouraging performance. Our experiments further demonstrate improvement after integrating morphological analysis, substring analysis and person name identification.
URI: http://hdl.handle.net/20.500.11897/293532
ISSN: 0302-9743
DOI: 10.1007/11940098_26
Indexed: EI; CPCI-S(ISTP)
Appears in Collections: School of Electronics Engineering and Computer Science (信息科学技术学院)

License: See PKU IR operational policies.