Title: CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark
Authors: Zhang, Ningyu
Chen, Mosha
Bi, Zhen
Liang, Xiaozhuan
Li, Lei
Shang, Xin
Yin, Kangping
Tan, Chuanqi
Xu, Jian
Huang, Fei
Si, Luo
Ni, Yuan
Xie, Guotong
Sui, Zhifang
Chang, Baobao
Zong, Hui
Yuan, Zheng
Li, Linfeng
Yan, Jun
Zan, Hongying
Zhang, Kunli
Tang, Buzhou
Chen, Qingcai
Affiliation: Zhejiang Univ, AZFT Joint Lab Knowledge Engine, Hangzhou, Peoples R China
Alibaba Grp, Hangzhou, Peoples R China
Zhejiang Univ, Sch Math Sci, Hangzhou, Peoples R China
Pingan Hlth Technol, Hong Kong, Peoples R China
Ping An Hlth Cloud Co Ltd, Hong Kong, Peoples R China
Ping An Int Smart City Technol Co Ltd, Hong Kong, Peoples R China
Peking Univ, Key Lab Computat Linguist, Minist Educ, Beijing, Peoples R China
Tongji Univ, Sch Life Sci & Technol, Shanghai, Peoples R China
Tsinghua Univ, Beijing, Peoples R China
Yidu Cloud Technol Inc, Beijing, Peoples R China
Zhengzhou Univ, Sch Informat Engn, Zhengzhou, Peoples R China
Harbin Inst Technol Shenzhen, Shenzhen, Peoples R China
Peng Cheng Lab, Shenzhen, Peoples R China
Philips Res China, Shanghai, Peoples R China
Issue Date: 2022
Publisher: PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS)
Abstract: With the development of biomedical language understanding benchmarks, Artificial Intelligence applications are widely used in the medical field. However, most benchmarks are limited to English, which makes it challenging to replicate many of the successes in English for other languages. To facilitate research in this direction, we collect real-world biomedical data and present the first Chinese Biomedical Language Understanding Evaluation (CBLUE) benchmark: a collection of natural language understanding tasks including named entity recognition, information extraction, clinical diagnosis normalization, and an associated online platform for model evaluation, comparison, and analysis. To establish evaluation on these tasks, we report empirical results with the current 11 pre-trained Chinese models, and experimental results show that state-of-the-art neural models perform far worse than the human ceiling.
URI: http://hdl.handle.net/20.500.11897/654032
ISBN: 978-1-955917-21-6
Indexed: CPCI-SSH(ISSHP)
CPCI-S(ISTP)
Appears in Collections: Key Laboratory of Computational Linguistics, Ministry of Education

License: See PKU IR operational policies.