Toward Software-Equivalent Accuracy on Transformer-Based Deep Neural Networks With Analog Memory Devices.
BERT
DNN
PCM
RRAM
Transformer
analog accelerators
in-memory computing
Journal
Frontiers in computational neuroscience
ISSN: 1662-5188
Abbreviated title: Front Comput Neurosci
Country: Switzerland
NLM ID: 101477956
Publication information
Publication date:
2021
History:
received: 2021-03-03
accepted: 2021-05-14
entrez: 2021-07-22
pubmed: 2021-07-23
medline: 2021-07-23
Status:
epublish
Abstract
Recent advances in deep learning have been driven by ever-increasing model sizes, with networks growing to millions or even billions of parameters. Such enormous models call for fast and energy-efficient hardware accelerators. We study the potential of Analog AI accelerators based on Non-Volatile Memory, in particular Phase Change Memory (PCM), for software-equivalent accurate inference of natural language processing applications. We demonstrate a path to software-equivalent accuracy for the GLUE benchmark on BERT (Bidirectional Encoder Representations from Transformers), by combining noise-aware training to combat inherent PCM drift and noise sources, together with reduced-precision digital attention-block computation down to INT6.
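The two ingredients named in the abstract, weight noise and reduced-precision (INT6) digital computation, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `quantize_symmetric` and `noisy_weights`, the per-tensor symmetric quantization scheme, and the Gaussian noise model with a relative standard deviation of 0.02 are all assumptions chosen for illustration. PCM drift is not modeled, and only a single noisy forward pass is shown rather than a full noise-aware training loop.

```python
import numpy as np

def quantize_symmetric(x, bits=6):
    """Fake-quantize x onto a signed `bits`-bit grid (assumed per-tensor, symmetric)."""
    qmax = 2 ** (bits - 1) - 1                 # 31 for INT6
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -qmax, qmax)
    # Return the dequantized values, the integer codes, and the scale.
    return q * scale, q.astype(np.int8), scale

def noisy_weights(w, rel_sigma, rng):
    """Add Gaussian noise scaled by the largest weight magnitude (hypothetical noise model)."""
    sigma = rel_sigma * float(np.max(np.abs(w)))
    return w + rng.normal(0.0, sigma, size=w.shape)

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(4, 8))          # toy weight matrix
x = rng.normal(0.0, 1.0, size=(8,))            # toy activation vector

x_q, x_int, x_scale = quantize_symmetric(x, bits=6)
y_ideal = w @ x                                # full-precision reference
y_noisy = noisy_weights(w, 0.02, rng) @ x_q    # noisy weights, INT6 activations
```

In a noise-aware training setup, fresh noise of this kind would typically be drawn in each forward pass so that the learned weights become tolerant to it; comparing `y_noisy` with `y_ideal` gives a rough feel for the combined perturbation on a single matrix-vector product.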
Identifiers
pubmed: 34290595
doi: 10.3389/fncom.2021.675741
pmc: PMC8287521
Publication types
Journal Article
Languages
eng
Pagination
675741
Copyright information
Copyright © 2021 Spoon, Tsai, Chen, Rasch, Ambrogio, Mackin, Fasoli, Friz, Narayanan, Stanisavljevic and Burr.
Conflict of interest statement
The authors were employed by IBM Research.