Intermediate-grained kernel elements pruning with structured sparsity.

Keywords: Deep neural networks; Model pruning; Regularization; Sparse accelerator

Journal

Neural networks : the official journal of the International Neural Network Society
ISSN: 1879-2782
Abbreviated title: Neural Netw
Country: United States
NLM ID: 8805018

Publication information

Publication date:
07 Sep 2024
History:
received: 26 Mar 2024
revised: 15 Aug 2024
accepted: 05 Sep 2024
medline: 15 Sep 2024
pubmed: 15 Sep 2024
entrez: 14 Sep 2024
Status: ahead of print

Abstract

Neural network pruning offers a promising route for deploying neural networks on embedded or mobile devices with limited resources. Although current structured strategies are unconstrained by specific hardware architectures during forward inference, the loss in classification accuracy of structured methods exceeds tolerable levels at common pruning rates. This motivates us to develop a technique that achieves high pruning rates with a small loss in accuracy while retaining the general nature of structured pruning. In this paper, we propose a new pruning method, KEP (Kernel Elements Pruning), which compresses deep convolutional neural networks by assessing the significance of the elements in each kernel plane and removing unimportant ones. The method applies a controllable regularization penalty to constrain unimportant elements via a prior-knowledge mask, yielding a compact model. For forward inference, we introduce a sparse convolution operation, different from the sliding window, that eliminates invalid zero calculations, and we verify its effectiveness for further deployment on FPGA. Extensive experiments demonstrate the effectiveness of KEP on two datasets, CIFAR-10 and ImageNet. Specifically, with few indexes of non-zero weights introduced, KEP significantly improves over the latest structured methods in terms of parameter and floating-point operation (FLOPs) reduction, and performs well on large datasets.
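The abstract describes pruning individual elements within each kernel plane (rather than whole filters or channels) by masking unimportant elements and penalizing them toward zero. The paper's exact criterion and formulation are not given here; the following NumPy sketch only illustrates the general idea, assuming a magnitude-based importance score and an L2 penalty on masked-out elements. Function names, the `keep_ratio` parameter, and the choice of criterion are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def kernel_element_mask(weights, keep_ratio=0.5):
    """Binary mask keeping the largest-magnitude elements inside each
    k x k kernel plane (assumed importance criterion, not the paper's).

    weights: array of shape (out_channels, in_channels, k, k)
    """
    out_ch, in_ch, k, _ = weights.shape
    planes = np.abs(weights).reshape(out_ch * in_ch, k * k)
    n_keep = max(1, int(round(keep_ratio * k * k)))
    mask = np.zeros_like(planes)
    # indices of the n_keep largest-magnitude elements per plane
    top = np.argsort(planes, axis=1)[:, -n_keep:]
    np.put_along_axis(mask, top, 1.0, axis=1)
    return mask.reshape(weights.shape)

def masked_penalty(weights, mask, lam=1e-3):
    """Controllable regularization applied only where the prior-knowledge
    mask marks elements as unimportant (mask == 0), pushing them to zero."""
    return lam * np.sum(((1.0 - mask) * weights) ** 2)

# Toy usage: keep 4 of the 9 elements in every 3x3 kernel plane.
w = np.random.randn(4, 3, 3, 3)
m = kernel_element_mask(w, keep_ratio=4 / 9)
penalty = masked_penalty(w, m)  # added to the training loss during fine-tuning
```

After training with such a penalty, the masked-out elements are near zero and can be dropped, leaving a sparse kernel whose remaining non-zero positions need only a few indexes, which matches the abstract's note about sparse convolution skipping invalid zero calculations.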

Identifiers

pubmed: 39276589
pii: S0893-6080(24)00632-4
doi: 10.1016/j.neunet.2024.106708

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

106708

Copyright information

Copyright © 2024 Elsevier Ltd. All rights reserved.

Conflict of interest statement

Declaration of competing interest: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors

Peng Zhang (P)

School of Computer Science and Technology, Xidian University, No. 2 South Taibai Road, Xi'an, 710071, PR China. Electronic address: pezhang@stu.xidian.edu.cn.

Liang Zhao (L)

School of Computer Science and Technology, Xidian University, No. 2 South Taibai Road, Xi'an, 710071, PR China. Electronic address: lzhao@xidian.edu.cn.

Cong Tian (C)

School of Computer Science and Technology, Xidian University, No. 2 South Taibai Road, Xi'an, 710071, PR China. Electronic address: ctian@mail.xidian.edu.cn.

Zhenhua Duan (Z)

School of Computer Science and Technology, Xidian University, No. 2 South Taibai Road, Xi'an, 710071, PR China. Electronic address: zhhduan@mail.xidian.edu.cn.

MeSH classifications