Title | Coarse-grained candidate generation and fine-grained re-ranking for Chinese abbreviation prediction |
Authors | Zhang, Longkai; Wang, Houfeng; Sun, Xu |
Affiliation | Key Laboratory of Computational Linguistics, Peking University, Ministry of Education, China |
Issue Date | 2014 |
Citation | 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014. Doha, Qatar. |
Abstract | Correctly predicting abbreviations given the full forms is important in many natural language processing systems. In this paper we propose a two-stage method to find the corresponding abbreviation given its full form. We first use the contextual information given a large corpus to get abbreviation candidates for each full form and get a coarse-grained ranking through graph random walk. This coarse-grained rank list fixes the search space inside the top-ranked candidates. Then we use a similarity sensitive re-ranking strategy which can utilize the features of the candidates to give a fine-grained re-ranking and select the final result. Our method achieves good results and outperforms the state-of-the-art systems. One advantage of our method is that it only needs weak supervision and can get competitive results with fewer training data. The candidate generation and coarse-grained ranking is totally unsupervised. The re-ranking phase can use a very small amount of training data to get a reasonably good result. © 2014 Association for Computational Linguistics. |
URI | http://hdl.handle.net/20.500.11897/327361 |
Indexed | EI |
Appears in Collections: | Key Laboratory of Computational Linguistics, Ministry of Education |
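
The abstract's first stage (a coarse-grained ranking of abbreviation candidates through a graph random walk over contextual information) can be illustrated with a minimal sketch. This is not the authors' implementation: the bipartite term/context graph, the restart probability, the function name `random_walk_rank`, and the toy co-occurrence counts are all assumptions made for illustration, and the fine-grained re-ranking stage is not shown.

```python
def random_walk_rank(cooc, full_form, restart=0.15, iters=50):
    """Rank candidate terms by a random walk with restart from `full_form`.

    cooc maps each term (the full form or a candidate abbreviation) to a
    dict {context_word: co-occurrence count}. All names here are hypothetical.
    """
    # Build an undirected bipartite graph: terms ("t") vs. context words ("c").
    graph = {}
    for term, contexts in cooc.items():
        for ctx, weight in contexts.items():
            graph.setdefault(("t", term), {})[("c", ctx)] = weight
            graph.setdefault(("c", ctx), {})[("t", term)] = weight

    source = ("t", full_form)
    prob = {node: 0.0 for node in graph}
    prob[source] = 1.0

    for _ in range(iters):
        nxt = {node: 0.0 for node in graph}
        for node, p in prob.items():
            total = sum(graph[node].values())
            # Move along edges in proportion to their weights.
            for nbr, weight in graph[node].items():
                nxt[nbr] += (1.0 - restart) * p * weight / total
        # With probability `restart`, jump back to the full form.
        nxt[source] += restart
        prob = nxt

    # Keep only candidate terms (not the full form, not context words).
    candidates = [(t, p) for (kind, t), p in prob.items()
                  if kind == "t" and t != full_form]
    return sorted(candidates, key=lambda pair: -pair[1])


# Toy, made-up counts: the full form and two candidate abbreviations.
toy = {
    "北京大学": {"学生": 3, "教授": 2, "校园": 2},
    "北大": {"学生": 2, "教授": 2, "校园": 1},
    "京大": {"学生": 1, "日本": 2},
}
ranked = random_walk_rank(toy, "北京大学")
for term, score in ranked:
    print(term, round(score, 4))
```

In this toy graph the candidate that shares more context words with the full form receives more probability mass and is ranked first; a second stage would then re-rank only the top candidates using candidate features.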