Title | Learning abbreviations from Chinese and English terms by modeling non-local information |
Authors | Sun, Xu; Okazaki, Naoaki; Tsujii, Jun'ichi; Wang, Houfeng |
Affiliation | Key Laboratory of Computational Linguistics, Peking University, Ministry of Education, China; Graduate School of Information Sciences, Tohoku University, Japan; Microsoft Research Asia, Beijing, China |
Issue Date | 2013 |
Publisher | ACM Transactions on Asian Language Information Processing |
Citation | ACM Transactions on Asian Language Information Processing, 2013, 12(2). |
Abstract | The present article describes a robust approach for abbreviating terms. First, in order to incorporate non-local information into abbreviation generation tasks, we present both implicit and explicit solutions: the latent variable model and the label encoding with global information. Although the two approaches compete with one another, we find they are also highly complementary. We propose a combination of the two approaches, and we show that the proposed method outperforms all of the existing methods on abbreviation generation datasets. In order to reduce the computational complexity of learning non-local information, we further present an online training method, which can arrive at the objective optimum with accelerated training speed. We used a Chinese newswire dataset and an English biomedical dataset for experiments. Experiments revealed that the proposed abbreviation generator with non-local information achieved the best results for both the Chinese and English languages. © 2013 ACM. |
URI | http://hdl.handle.net/20.500.11897/410460 |
ISSN | 1530-0226 |
DOI | 10.1145/2461316.2461317 |
Indexed | EI |
Appears in Collections: | Key Laboratory of Computational Linguistics, Ministry of Education |