Title: Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
Authors: Xu, Runxin
Luo, Fuli
Zhang, Zhiyuan
Tan, Chuanqi
Chang, Baobao
Huang, Songfang
Huang, Fei
Affiliation: Peking Univ, Key Lab Computat Linguist, MOE, Beijing, Peoples R China
Alibaba Grp, Hangzhou, Peoples R China
Issue Date: 2021
Publisher: 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021)
Abstract: Recent pretrained language models extend from millions to billions of parameters. Thus the need to fine-tune an extremely large pretrained model with a limited training corpus arises in various downstream tasks. In this paper, we propose a straightforward yet effective fine-tuning technique, CHILD-TUNING, which updates a subset of parameters (called child network) of large pretrained models via strategically masking out the gradients of the non-child network during the backward process. Experiments on various downstream tasks in GLUE benchmark show that CHILD-TUNING consistently outperforms the vanilla fine-tuning by 1.5 to 8.6 average score among four different pretrained models, and surpasses the prior fine-tuning techniques by 0.6 to 1.3 points. Furthermore, empirical results on domain transfer and task transfer show that CHILD-TUNING can obtain better generalization performance by large margins.
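The gradient-masking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the child network is chosen, so the random Bernoulli mask below, the plain SGD update, and the names `sample_child_mask`, `masked_sgd_step`, and `keep_prob` are all assumptions made for illustration.

```python
import random


def sample_child_mask(n, keep_prob, rng):
    """Return a 0/1 mask of length n; 1 marks a parameter in the child network.

    Assumption: the child network is picked at random with probability
    keep_prob per parameter. The abstract does not specify the selection rule.
    """
    return [1 if rng.random() < keep_prob else 0 for _ in range(n)]


def masked_sgd_step(params, grads, child_mask, lr=0.1):
    """Apply one SGD step in which masked-out gradients are zeroed.

    Parameters outside the child network (mask 0) keep their old value.
    """
    return [p - lr * g * m for p, g, m in zip(params, grads, child_mask)]


# Toy example with four scalar "parameters" and hand-written gradients.
rng = random.Random(0)
params = [0.5, -1.0, 2.0, 0.0]
grads = [0.1, 0.2, -0.3, 0.4]
mask = sample_child_mask(len(params), keep_prob=0.5, rng=rng)
updated = masked_sgd_step(params, grads, mask)
```

In a real training loop the same effect would be obtained by multiplying each parameter's gradient tensor by its mask before the optimizer step; the list-based version here only shows the arithmetic.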
URI: http://hdl.handle.net/20.500.11897/657195
ISBN: 978-1-955917-09-4
Indexed: EI
CPCI-SSH(ISSHP)
CPCI-S(ISTP)
Appears in Collections: Key Laboratory of Computational Linguistics, Ministry of Education


License: See PKU IR operational policies.