Unsupervised Network Quantization via Fixed-Point Factorization.
Journal
IEEE Transactions on Neural Networks and Learning Systems
ISSN: 2162-2388
Abbreviated title: IEEE Trans Neural Netw Learn Syst
Country: United States
NLM ID: 101616214
Publication information
Date of publication: Jun 2021
History:
pubmed: 25 Jul 2020
medline: 25 Jul 2020
entrez: 25 Jul 2020
Status: ppublish
Abstract
The deep neural network (DNN) has achieved remarkable performance in a wide range of applications at the cost of huge memory and computational complexity. Fixed-point network quantization has emerged as a popular acceleration and compression method but still suffers from severe performance degradation when extremely low-bit quantization is used. Moreover, current fixed-point quantization methods rely heavily on supervised retraining with large amounts of labeled training data, while labeled data are hard to obtain in real-world applications. In this article, we propose an efficient framework, namely, the fixed-point factorized network (FFN), to turn all weights into ternary values, i.e., {-1, 0, 1}. We highlight that the proposed FFN framework can achieve negligible accuracy degradation even without any supervised retraining on labeled data. Note that the activations can be easily quantized into an 8-bit format; thus, the resulting networks require only low-bit fixed-point additions, which are significantly more efficient than 32-bit floating-point multiply-accumulate operations (MACs). Extensive experiments on large-scale ImageNet classification and object detection on MS COCO show that the proposed FFN can achieve more than 20× compression and remove most of the multiply operations with comparable accuracy. Codes are available on GitHub at https://github.com/wps712/FFN.
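The abstract describes the combination of ternary {-1, 0, 1} weights with 8-bit activations so that inference reduces to low-bit fixed-point additions. The snippet below is a toy NumPy sketch of that general idea only; the magnitude-threshold ternarization and per-tensor scales are illustrative assumptions and are not the paper's fixed-point factorization algorithm (see the linked GitHub repository for the actual method).

```python
# Toy sketch (not the FFN algorithm): ternary weights + 8-bit activations,
# so the inner product needs only integer additions plus a final rescale.
import numpy as np

def ternarize_weights(w, threshold_ratio=0.7):
    """Map weights to {-1, 0, +1} codes with one per-tensor scale (heuristic)."""
    delta = threshold_ratio * np.mean(np.abs(w))        # illustrative threshold
    t = np.where(np.abs(w) > delta, np.sign(w), 0.0)    # ternary codes
    nz = np.abs(w[t != 0])
    alpha = nz.mean() if nz.size else 0.0               # per-tensor scale
    return t.astype(np.int8), alpha

def quantize_activations_uint8(x):
    """Uniform 8-bit quantization of non-negative (e.g., post-ReLU) activations."""
    scale = x.max() / 255.0 if x.max() > 0 else 1.0
    q = np.clip(np.round(x / scale), 0, 255).astype(np.uint8)
    return q, scale

# Usage: one "layer" computed with integer arithmetic only, then rescaled.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
x = np.maximum(rng.normal(size=(8,)), 0).astype(np.float32)

t, alpha = ternarize_weights(w)
qx, sx = quantize_activations_uint8(x)

y_int = t.astype(np.int32) @ qx.astype(np.int32)        # additions/subtractions only
y = alpha * sx * y_int                                   # rescale back to float
print(np.linalg.norm(y - w @ x) / np.linalg.norm(w @ x)) # relative approximation error
```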
Identifiers
pubmed: 32706647
doi: 10.1109/TNNLS.2020.3007749
Publication types
Journal Article
Language
eng
Citation subset
IM