Detecting Glaucoma from Fundus Photographs Using Deep Learning without Convolutions: Transformer for Improved Generalization.

Abbreviations: AI, artificial intelligence; AUROC, area under the receiver operating characteristic curve; CI, confidence interval; CNN, convolutional neural network; DL, deep learning; DeiT, Data-efficient image Transformer; LAG, Large-Scale Attention-Based Glaucoma; OHTS, Ocular Hypertension Treatment Study; POAG, primary open-angle glaucoma; SoTA, state-of-the-art; VF, visual field; ViT, Vision Transformer

Keywords: Deep learning; Fundus photographs; Glaucoma detection; Vision Transformers

Journal

Ophthalmology science

ISSN: 2666-9145

Abbreviated title: Ophthalmol Sci

Country: Netherlands

NLM ID: 9918230896206676

Publication information

Publication date:
Mar 2023

History:

received: 17 06 2022

revised: 04 10 2022

accepted: 12 10 2022

entrez: 22 12 2022

pubmed: 23 12 2022

medline: 23 12 2022

Status: epublish

Abstract

Purpose: To compare the diagnostic accuracy and explainability of a Vision Transformer deep learning technique, Data-efficient image Transformer (DeiT), with those of ResNet-50, both trained on fundus photographs from the Ocular Hypertension Treatment Study (OHTS) to detect primary open-angle glaucoma (POAG), and to identify the salient areas of the photographs most important for each model's decision-making process.

Design: Evaluation of a diagnostic technology.

Participants: Overall, 66 715 photographs from 1636 OHTS participants and 5 additional external datasets comprising 16 137 photographs of healthy and glaucoma eyes.

Methods: Data-efficient image Transformer models were trained to detect 5 ground-truth OHTS POAG classifications: OHTS end point committee POAG determinations because of disc changes (model 1), visual field (VF) changes (model 2), or either disc or VF changes (model 3), and Reading Center determinations based on disc (model 4) and VFs (model 5). The best-performing DeiT models were compared with ResNet-50 models on OHTS and 5 external datasets.

Main outcome measures: Diagnostic performance was compared using the area under the receiver operating characteristic curve (AUROC) and sensitivities at fixed specificities. The explainability of the DeiT and ResNet-50 models was compared by evaluating the attention maps derived directly from DeiT against 3 gradient-weighted class activation map strategies.

Results: Compared with our best-performing ResNet-50 models, the DeiT models demonstrated similar performance on the OHTS test sets for all 5 ground-truth POAG labels; AUROC ranged from 0.82 (model 5) to 0.91 (model 1). Data-efficient image Transformer AUROC was consistently higher than ResNet-50 AUROC on the 5 external datasets. For example, AUROC for the main OHTS end point (model 3) was between 0.08 and 0.20 higher in the DeiT models than in the ResNet-50 models. The saliency maps from the DeiT models highlight localized areas of the neuroretinal rim, suggesting important rim features for classification. The same maps in the ResNet-50 models show a more diffuse, generalized distribution around the optic disc.

Conclusions: Vision Transformers have the potential to improve generalizability and explainability in deep learning models for detecting eye disease and possibly other medical conditions that rely on imaging for clinical diagnosis and management.
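The two diagnostic metrics named in the abstract, AUROC and sensitivity at a fixed specificity, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the pairwise (Mann-Whitney) formulation of AUROC, the rule that a score at or above the threshold counts as a positive call, and the toy labels and scores are all assumptions made for illustration.

```python
# Illustrative sketch only; names and toy data are hypothetical, not from the study.

def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(labels, scores, target_specificity):
    """Highest sensitivity over all thresholds whose specificity is at least
    the target. Assumption: score >= threshold means a positive call."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    best = 0.0  # a threshold above every score gives specificity 1, sensitivity 0
    for threshold in sorted(set(scores)):
        specificity = sum(1 for n in neg if n < threshold) / len(neg)
        if specificity >= target_specificity:
            sensitivity = sum(1 for p in pos if p >= threshold) / len(pos)
            best = max(best, sensitivity)
    return best


# Toy data: 4 healthy eyes (label 0) and 4 glaucoma eyes (label 1).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9]

print(auroc(labels, scores))                             # 0.9375
print(sensitivity_at_specificity(labels, scores, 1.0))   # 0.75
print(sensitivity_at_specificity(labels, scores, 0.75))  # 1.0
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; a rank-based computation would be the usual choice for datasets of the size reported here.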

Identifiers

DOI: 10.1016/j.xops.2022.100233

PMID: 36545260

PMC: PMC9762193

PII: S2666-9145(22)00122-1

Publication types

Journal Article

Languages

eng

Pagination

100233

Authors

Rui Fan (R)

Kamran Alipour (K)

Christopher Bowd (C)

Mark Christopher (M)

Nicole Brye (N)

James A Proudfoot (JA)

Michael H Goldbaum (MH)

Akram Belghith (A)

Christopher A Girkin (CA)

Massimo A Fazio (MA)

Jeffrey M Liebmann (JM)

Robert N Weinreb (RN)

Michael Pazzani (M)

David Kriegman (D)

Linda M Zangwill (LM)
