Title: Predicting Chinese abbreviations with minimum semantic unit and global constraints
Authors: Zhang, Longkai
Li, Li
Wang, Houfeng
Sun, Xu
Affiliation: Key Laboratory of Computational Linguistics, Peking University, Ministry of Education, China
Issue Date: 2014
Citation: 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014. Doha, Qatar.
Abstract: We propose a new Chinese abbreviation prediction method which can incorporate rich local information while generating the abbreviation globally. Different to previous character tagging methods, we introduce the minimum semantic unit, which is more fine-grained than character but more coarse-grained than word, to capture word level information in the sequence labeling framework. To solve the 'character duplication' problem in Chinese abbreviation prediction, we also use a substring tagging strategy to generate local substring tagging candidates. We use an integer linear programming (ILP) formulation with various constraints to globally decode the final abbreviation from the generated candidates. Experiments show that our method outperforms the state-of-the-art systems, without using any extra resource. © 2014 Association for Computational Linguistics.
Appears in Collections: Key Laboratory of Computational Linguistics, Ministry of Education

License: See PKU IR operational policies.