Unsupervised Network Quantization via Fixed-Point Factorization.
Journal
IEEE Transactions on Neural Networks and Learning Systems
ISSN: 2162-2388
Abbreviated title: IEEE Trans Neural Netw Learn Syst
Country: United States
NLM ID: 101616214
Publication information
Date of publication: Jun 2021
History:
pubmed: 25 Jul 2020
medline: 25 Jul 2020
entrez: 25 Jul 2020
Status: ppublish
Abstract
The deep neural network (DNN) has achieved remarkable performance in a wide range of applications at the cost of huge memory and computational complexity. Fixed-point network quantization has emerged as a popular acceleration and compression method but still suffers from severe performance degradation when extremely low-bit quantization is used. Moreover, current fixed-point quantization methods rely heavily on supervised retraining with large amounts of labeled training data, while labeled data are hard to obtain in real-world applications. In this article, we propose an efficient framework, namely, the fixed-point factorized network (FFN), to turn all weights into ternary values, i.e., {-1, 0, 1}. We highlight that the proposed FFN framework can achieve negligible accuracy degradation even without any supervised retraining on labeled data. Note that the activations can be easily quantized into an 8-bit format; thus, the resulting networks require only low-bit fixed-point additions, which are significantly more efficient than 32-bit floating-point multiply-accumulate operations (MACs). Extensive experiments on large-scale ImageNet classification and object detection on MS COCO show that the proposed FFN can achieve more than 20× compression and remove most of the multiply operations with comparable accuracy. Codes are available on GitHub at https://github.com/wps712/FFN.
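The abstract describes the combination of ternary {-1, 0, 1} weights with 8-bit activations so that inference reduces to low-bit fixed-point additions. The snippet below is a toy NumPy sketch of that general idea only; the magnitude-threshold ternarization and per-tensor scales are illustrative assumptions and are not the paper's fixed-point factorization algorithm (see the linked GitHub repository for the actual method).

```python
# Toy sketch (not the FFN algorithm): ternary weights + 8-bit activations,
# so the inner product needs only integer additions plus a final rescale.
import numpy as np

def ternarize_weights(w, threshold_ratio=0.7):
    """Map weights to {-1, 0, +1} codes with one per-tensor scale (heuristic)."""
    delta = threshold_ratio * np.mean(np.abs(w))        # illustrative threshold
    t = np.where(np.abs(w) > delta, np.sign(w), 0.0)    # ternary codes
    nz = np.abs(w[t != 0])
    alpha = nz.mean() if nz.size else 0.0               # per-tensor scale
    return t.astype(np.int8), alpha

def quantize_activations_uint8(x):
    """Uniform 8-bit quantization of non-negative (e.g., post-ReLU) activations."""
    scale = x.max() / 255.0 if x.max() > 0 else 1.0
    q = np.clip(np.round(x / scale), 0, 255).astype(np.uint8)
    return q, scale

# Usage: one "layer" computed with integer arithmetic only, then rescaled.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
x = np.maximum(rng.normal(size=(8,)), 0).astype(np.float32)

t, alpha = ternarize_weights(w)
qx, sx = quantize_activations_uint8(x)

y_int = t.astype(np.int32) @ qx.astype(np.int32)        # additions/subtractions only
y = alpha * sx * y_int                                   # rescale back to float
print(np.linalg.norm(y - w @ x) / np.linalg.norm(w @ x)) # relative approximation error
```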
Identifiers
pubmed: 32706647
doi: 10.1109/TNNLS.2020.3007749
Publication types
Journal Article
Language
eng
Citation subset
IM