Title: Predicting Chinese abbreviations from definitions: An empirical learning approach using support vector regression
Authors: Sun, Xu
Wang, Hou-Feng
Wang, Bo
Affiliation: Peking Univ, Sch Elect Engn & Comp Sci, Inst Computat Linguist, Beijing 100871, Peoples R China.
Univ Tokyo, Dept Comp Sci, Grad Sch Informat Sci & Technol, Tokyo 1130033, Japan.
Keywords: statistical natural language processing
abbreviation prediction
support vector regression
word clustering
DICTIONARY
Issue Date: 2008
Publisher: Journal of Computer Science and Technology (English edition)
Citation: JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY. 2008, 23(4), 602-611.
Abstract: In Chinese, phrases and named entities play a central role in information retrieval. Abbreviations, however, make keyword-based approaches less effective. This paper presents an empirical learning approach to Chinese abbreviation prediction. In this study, each abbreviation is taken as a reduced form of the corresponding definition (expanded form), and the abbreviation prediction is formalized as a scoring and ranking problem among abbreviation candidates, which are automatically generated from the corresponding definition. By employing Support Vector Regression (SVR) for scoring, we can obtain multiple abbreviation candidates together with their SVR values, which are used for candidate ranking. Experimental results show that the SVR method performs better than the popular heuristic rule of abbreviation prediction. In addition, in abbreviation prediction, the SVR method outperforms the hidden Markov model (HMM).
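Illustrative sketch (not part of the record): the abstract's pipeline of generating candidates from the definition, scoring each one, and ranking by score might look roughly like the Python below. Everything here is an assumption made for illustration. The function names `generate_candidates`, `rank_candidates`, and `toy_score` are hypothetical, candidates are assumed to be order-preserving character subsequences of the definition, and `toy_score` is a hand-written stand-in for a trained SVR model, not the paper's features or regressor.

```python
from itertools import combinations
from typing import Callable, List, Tuple


def generate_candidates(definition: str, min_len: int = 2) -> List[str]:
    """Enumerate abbreviation candidates for a definition.

    Assumption: a candidate keeps a subset of the definition's characters
    in their original order and is strictly shorter than the definition.
    """
    seen = set()
    candidates = []
    for k in range(min_len, len(definition)):
        for idx in combinations(range(len(definition)), k):
            cand = "".join(definition[i] for i in idx)
            if cand not in seen:  # skip duplicates from repeated characters
                seen.add(cand)
                candidates.append(cand)
    return candidates


def toy_score(definition: str, candidate: str) -> float:
    """Hypothetical stand-in for a trained SVR's predicted value.

    Rewards keeping the first character and penalizes length. This is NOT
    the scoring model described in the paper.
    """
    keeps_first = 1.0 if candidate[0] == definition[0] else 0.0
    return keeps_first - 0.1 * len(candidate)


def rank_candidates(
    definition: str, score: Callable[[str, str], float]
) -> List[Tuple[str, float]]:
    """Score every candidate and sort from highest to lowest score."""
    scored = [(c, score(definition, c)) for c in generate_candidates(definition)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Print the three highest-scoring candidates under the toy scorer.
    for cand, value in rank_candidates("北京大学", toy_score)[:3]:
        print(cand, round(value, 2))
```

In a real system the `score` callable would wrap a regression model trained on definition/abbreviation pairs; the ranking step itself stays the same.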
URI: http://hdl.handle.net/20.500.11897/292073
ISSN: 1000-9000
DOI: 10.1007/s11390-008-9156-5
Indexed: SCI(E)
EI
China Science and Technology Core Journals (ISTIC)
Chinese Science Citation Database (CSCD)
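Appears in Collections: School of Electronics Engineering and Computer Science
Key Laboratory of Computational Linguistics, Ministry of Education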

License: See PKU IR operational policies.