Title: A unified model for cross-domain and semi-supervised named entity recognition in Chinese Social Media
Authors: He, Hangfeng; Sun, Xu
Affiliation: MOE Key Laboratory of Computational Linguistics, Peking University; School of Electronics Engineering and Computer Science, Peking University, China
Issue Date: 2017
Publisher: 31st AAAI Conference on Artificial Intelligence, AAAI 2017
Citation: 31st AAAI Conference on Artificial Intelligence, AAAI 2017. 2017, 3216-3222.
Abstract: Named entity recognition (NER) in Chinese social media is important but difficult because of its informality and strong noise. Previous methods only focus on in-domain supervised learning which is limited by the rare annotated data. However, there are enough corpora in formal domains and massive in-domain unannotated texts which can be used to improve the task. We propose a unified model which can learn from out-of-domain corpora and in-domain unannotated texts. The unified model contains two major functions. One is for cross-domain learning and another for semi-supervised learning. Cross-domain learning function can learn out-of-domain information based on domain similarity. Semi-Supervised learning function can learn in-domain unannotated information by self-training. Both learning functions outperform existing methods for NER in Chinese social media. Finally, our unified model yields nearly 11% absolute improvement over previously published results. Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
Appears in Collections: School of Electronics Engineering and Computer Science (信息科学技术学院)

Files in This Work
There are no files associated with this item.

License: See PKU IR operational policies.