Title: Training Simplification and Model Simplification for Deep Learning: A Minimal Effort Back Propagation Method
Authors: Sun, Xu
Ren, Xuancheng
Ma, Shuming
Wei, Bingzhen
Li, Wei
Xu, Jingjing
Wang, Houfeng
Zhang, Yi
Affiliation: Peking Univ, MOE Key Lab Computat Linguist, Sch Elect Engn & Comp Sci, Beijing 100871, Peoples R China
Peking Univ, Ctr Data Sci, Beijing Inst Big Data Res, Beijing 100871, Peoples R China
Peking Univ, Sch EECS, MOE Key Lab Computat Linguist, Beijing 100871, Peoples R China
Keywords: NEURAL-NETWORKS
Issue Date: 1-Feb-2020
Publisher: IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING
Abstract: We propose a simple yet effective technique to simplify the training and the resulting model of neural networks. In back propagation, only a small subset of the full gradient is computed to update the model parameters. The gradient vectors are sparsified so that only the top-$k$ elements (in terms of magnitude) are kept. As a result, only $k$ rows or columns (depending on the layout) of the weight matrix are modified, leading to a linear reduction in the computational cost. Based on the sparsified gradients, we further simplify the model by eliminating the rows or columns that are seldom updated, which reduces the computational cost of both training and decoding, and can accelerate decoding in real-world applications. Surprisingly, experimental results demonstrate that most of the time we only need to update fewer than 5 percent of the weights at each back propagation pass. More interestingly, the accuracy of the resulting models is actually improved rather than degraded, and a detailed analysis is given. The model simplification results show that the model can be adaptively simplified, often by around 9x, without any loss of accuracy or even with improved accuracy.
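To make the abstract's top-$k$ back-propagation idea concrete, the following is a minimal NumPy sketch of a sparsified backward pass for a single linear layer. It is an illustration under assumed names and shapes, not the authors' implementation: the function `meprop_linear_backward`, the per-example top-$k$ selection, and the toy dimensions are all assumptions for the example.

```python
import numpy as np

def meprop_linear_backward(grad_output, x, W, k):
    """Backward pass of a linear layer y = x @ W, keeping only the
    top-k output-gradient components per example. A sketch of the
    technique described in the abstract, not the paper's actual code.

    grad_output: (batch, out_dim) gradient of the loss w.r.t. y
    x:           (batch, in_dim)  layer input saved from the forward pass
    W:           (in_dim, out_dim) weight matrix
    k:           number of gradient components to keep per example
    """
    sparse_grad = np.zeros_like(grad_output)
    for i, g in enumerate(grad_output):
        # Indices of the k largest-magnitude components of this example's
        # output gradient; everything else is dropped (set to zero).
        top = np.argpartition(np.abs(g), -k)[-k:]
        sparse_grad[i, top] = g[top]
    # Only the columns of W whose output index was kept by some example
    # receive nonzero updates, so the backward matrix products touch
    # roughly a k/out_dim fraction of the full computation.
    grad_W = x.T @ sparse_grad   # (in_dim, out_dim), mostly-zero columns
    grad_x = sparse_grad @ W.T   # gradient propagated to the layer below
    return grad_W, grad_x

# Toy usage: a 4-example batch through an 8 -> 16 layer, keeping k = 2 of 16.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 16))
grad_y = rng.normal(size=(4, 16))
grad_W, grad_x = meprop_linear_backward(grad_y, x, W, k=2)
print((grad_W != 0).any(axis=0).sum(), "of 16 columns of W updated")
```

The model-simplification step described in the abstract would then, presumably, track how often each row or column is updated over training and remove those that are seldom touched, shrinking the weight matrix for both training and decoding; the sketch above only covers the training-simplification half.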
URI: http://hdl.handle.net/20.500.11897/585352
ISSN: 1041-4347
DOI: 10.1109/TKDE.2018.2883613
Indexed: SCI(E); Scopus; EI
Appears in Collections: School of Electronics Engineering and Computer Science
MOE Key Laboratory of Computational Linguistics
Other Research Institutes

Files in This Work
There are no files associated with this item.

License: See PKU IR operational policies.