An analog-AI chip for energy-efficient speech recognition and transcription.

Journal

Nature

ISSN: 1476-4687

Abbreviated title: Nature

Country: England

NLM ID: 0410462

Publication information

Publication date:
Aug 2023

History:

received: 13 Dec 2022

accepted: 16 Jun 2023

medline: 25 Aug 2023

pubmed: 24 Aug 2023

entrez: 23 Aug 2023

Status: ppublish

Abstract

Models of artificial intelligence (AI) that have billions of parameters can achieve high accuracy across a range of tasks

Identifiers

DOI: 10.1038/s41586-023-06337-5

PMID: 37612392

PMCID: PMC10447234

PII: 10.1038/s41586-023-06337-5

Publication types

Journal Article

Languages

eng

Pagination

768-775

References

Vaswani, A. et al. Attention is all you need. In NIPS17: Proc. 31st Conference on Neural Information Processing Systems (eds. von Luxburg, U. et al.) 6000–6010 (Curran Associates, 2017).

Chan, W. et al. SpeechStew: simply mix all available speech recognition data to train one large neural network. Preprint at https://arxiv.org/abs/2104.02133 (2021).

Ambrogio, S. et al. Equivalent-accuracy accelerated neural-network training using analogue memory. Nature 558, 60–67 (2018).

doi: 10.1038/s41586-018-0180-5 pubmed: 29875487

Narayanan, P. et al. Fully on-chip MAC at 14 nm enabled by accurate row-wise programming of PCM-based weights and parallel vector-transport in duration-format. IEEE Trans. Electron Devices 68, 6629–6636 (2021).

Khaddam-Aljameh, R. et al. HERMES-core—a 1.59-TOPS/mm

doi: 10.1109/JSSC.2022.3140414

Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).

doi: 10.1038/s41586-020-1942-4 pubmed: 31996818

Wan, W. et al. A compute-in-memory chip based on resistive random-access memory. Nature 608, 504–512 (2022).

doi: 10.1038/s41586-022-04992-8 pubmed: 35978128 pmcid: 9385482

Better Machine Learning for Everyone. ML Commons https://mlcommons.org (2023).

LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521, 436–444 (2015).

doi: 10.1038/nature14539 pubmed: 26017442

Dahl, G. E., Yu, D., Deng, L. & Acero, A. Context-dependent pre-trained deep neural networks for large-vocabulary speech recognition. IEEE Trans. Audio Speech Lang. Process. 20, 30–42 (2011).

doi: 10.1109/TASL.2011.2134090

Graves, A., Fernández, S., Gomez, F. & Schmidhuber, J. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In ICML ’06: Proc. 23rd International Conference on Machine Learning (eds Cohen, W. & Moore, A.) 369–376 (ACM, 2006).

Graves, A. Sequence transduction with recurrent neural networks. Preprint at https://arxiv.org/abs/1211.3711 (2012).

Graves, A., Mohamed, A.-R. & Hinton, G. Speech recognition with deep recurrent neural networks. In Proc. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing 6645–6649 (IEEE, 2013).

Bahdanau, D., Cho, K. & Bengio, Y. Neural machine translation by jointly learning to align and translate. Preprint at https://arxiv.org/abs/1409.0473 (2014).

Hsu, W.-N. et al. HuBERT: self-supervised speech representation learning by masked prediction of hidden units. IEEE/ACM Trans. Audio Speech Lang. Process. 29, 3451–3460 (2021).

doi: 10.1109/TASLP.2021.3122291

Gulati, A. et al. Conformer: convolution-augmented transformer for speech recognition. Preprint at https://arxiv.org/abs/2005.08100 (2020).

Panayotov, V., Chen, G., Povey, D. & Khudanpur, S. Librispeech: an ASR corpus based on public domain audio books. In 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 5206–5210 (IEEE, 2015).

Godfrey, J., Holliman, E. & McDaniel, J. SWITCHBOARD: telephone speech corpus for research and development. In ICASSP-92: Proc. International Conference on Acoustics, Speech and Signal Processing 517–520 (IEEE, 1992).

Gholami, A., Yao, Z., Kim, S., Mahoney, M. W. & Keutzer, K. AI and memory wall. RiseLab Medium https://medium.com/riselab/ai-and-memory-wall-2cb4265cb0b8 (2021).

Jain, S. et al. A heterogeneous and programmable compute-in-memory accelerator architecture for analog-AI using dense 2-D mesh. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 31, 114–127 (2023).

Chen, G., Parada, C. & Heigold, G. Small-footprint keyword spotting using deep neural networks. In 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 4087–4091 (2014).

Zhang, Y., Suda, N., Lai, L. & Chandra, V. Hello edge: keyword spotting on microcontrollers. Preprint at https://arxiv.org/abs/1711.07128 (2018).

Gokmen, T., Rasch, M. J. & Haensch, W. The marriage of training and inference for scaled deep learning analog hardware. In 2019 IEEE International Electron Devices Meeting (IEDM) 22.3.1–22.3.4 (2019).

Spoon, K. et al. Toward software-equivalent accuracy on transformer-based deep neural networks with analog memory devices. Front. Comput. Neurosci. 15, 675741 (2021).

doi: 10.3389/fncom.2021.675741 pubmed: 34290595 pmcid: 8287521

Kariyappa, S. et al. Noise-resilient DNN: tolerating noise in PCM-based AI accelerators via noise-aware training. IEEE Trans. Electron Devices 68, 4356–4362 (2021).

doi: 10.1109/TED.2021.3089987

Joshi, V. et al. Accurate deep neural network inference using computational phase-change memory. Nat. Commun. 11, 2473 (2020).

doi: 10.1038/s41467-020-16108-9 pubmed: 32424184 pmcid: 7235046

Macoskey, J., Strimel, G. P., Su, J. & Rastrow, A. Amortized neural networks for low-latency speech recognition. Preprint at https://arxiv.org/abs/2108.01553 (2021).

Fasoli, A. et al. Accelerating inference and language model fusion of recurrent neural network transducers via end-to-end 4-bit quantization. In Proc. Interspeech 2022 2038–2042 (2022).

Ding, S. et al. 4-bit conformer with native quantization aware training for speech recognition. Proc. Interspeech 2022 1711–1715 (2022).

Sun, X. et al. Ultra-low precision 4-bit training of deep neural networks. Adv. Neural Inf. Process. Syst. 33, 1796–1807 (2020).

Lavizzari, S., Ielmini, D., Sharma, D. & Lacaita, A. L. Reliability impact of chalcogenide-structure relaxation in phase-change memory (PCM) cells—part II: physics-based modeling. IEEE Trans. Electron Devices 56, 1078–1085 (2009).

doi: 10.1109/TED.2009.2016398

Biswas, A. & Chandrakasan, A. P. Conv-RAM: an energy-efficient SRAM with embedded convolution computation for low-power CNN-based machine learning applications. In Proc. 2018 IEEE International Solid-State Circuits Conference (ISSCC) 488–490 (IEEE, 2018).

Chang, H.-Y. et al. AI hardware acceleration with analog memory: microarchitectures for low energy at high speed. IBM J. Res. Dev. 63, 8:1–8:14 (2019).

doi: 10.1147/JRD.2019.2934050

Jiang, H., Li, W., Huang, S. & Yu, S. A 40nm analog-input ADC-free compute-in-memory RRAM macro with pulse-width modulation between sub-arrays. In 2022 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits) 266–267 (IEEE, 2022).

Jia, H. et al. A programmable neural-network inference accelerator based on scalable in-memory computing. In 2021 IEEE International Solid-State Circuits Conference (ISSCC) 236–238 (IEEE, 2021).

Dong, Q. et al. A 351TOPS/W and 372.4GOPS compute-in-memory SRAM macro in 7nm FinFET CMOS for machine-learning applications. In 2020 IEEE International Solid-State Circuits Conference (ISSCC) 242–244 (IEEE, 2020).

Chih, Y.-D. et al. An 89TOPS/W and 16.3TOPS/mm

Su, J.-W. et al. A 28nm 384kb 6T-SRAM computation-in-memory macro with 8b precision for AI edge chips. In 2021 IEEE International Solid-State Circuits Conference (ISSCC) 250–252 (IEEE, 2021).

Yoon, J.-H. et al. A 40nm 64Kb 56.67TOPS/W read-disturb-tolerant compute-in-memory/digital RRAM macro with active-feedback-based read and in-situ write verification. In 2021 IEEE International Solid-State Circuits Conference (ISSCC) 404–406 (IEEE, 2021).

Xue, C.-X. et al. A 22nm 4Mb 8b-precision ReRAM computing-in-memory macro with 11.91 to 195.7TOPS/W for tiny AI edge devices. In 2021 IEEE International Solid-State Circuits Conference (ISSCC) 245–247 (IEEE, 2021).

Marinella, M. J. et al. Multiscale co-design analysis of energy, latency, area, and accuracy of a ReRAM analog neural training accelerator. IEEE J. Emerg. Select. Topics Circuits Syst. 8, 86–101 (2018).

doi: 10.1109/JETCAS.2018.2796379

Authors

S Ambrogio

P Narayanan

A Okazaki

A Fasoli

C Mackin

K Hosokawa

A Nomura

T Yasuda

A Chen

A Friz

M Ishii

J Luquin

Y Kohda

N Saulnier

K Brew

S Choi

I Ok

T Philip

V Chan

C Silvestre

I Ahsan

V Narayanan

H Tsai

G W Burr
