Deep Learning Pitfall: Impact of Novel Ultrasound Equipment Introduction on Algorithm Performance and the Realities of Domain Adaptation.

artificial intelligence; deep learning; domain shift; inferior vena cava; pediatrics; point of care ultrasound

Journal

Journal of ultrasound in medicine : official journal of the American Institute of Ultrasound in Medicine
ISSN: 1550-9613
Abbreviated title: J Ultrasound Med
Country: England
NLM ID: 8211547

Publication information

Publication date:
Apr 2022
History:
received: 11 04 2021
revised: 03 05 2021
accepted: 17 05 2021
pubmed: 17 6 2021
medline: 16 3 2022
entrez: 16 6 2021
Status: ppublish

Abstract

To test how introducing novel ultrasound equipment into a clinical setting affects deep learning (DL) algorithm performance.

Researchers introduced prospectively obtained inferior vena cava (IVC) videos from a similar patient population, acquired with novel ultrasound equipment, to challenge a previously validated DL algorithm (trained on a common point of care ultrasound [POCUS] machine) that assesses IVC collapse. Twenty-one new videos were obtained for each novel ultrasound machine. The videos were analyzed for complete collapse by the algorithm and by 2 blinded POCUS experts. Cohen's kappa was calculated for agreement between the 2 POCUS experts and the DL algorithm.

Previous testing showed substantial agreement between algorithm and experts, with Cohen's kappa of 0.78 (95% CI 0.49-1.0) and 0.66 (95% CI 0.31-1.0) on new patient data using the same ultrasound equipment. Challenged with higher image quality (IQ) POCUS cart ultrasound videos, algorithm performance declined, with kappa values of 0.31 (95% CI 0.19-0.81) and 0.39 (95% CI 0.11-0.89), showing fair agreement. Algorithm performance plummeted on a lower-IQ smartphone device, with kappa values of -0.09 (95% CI -0.95 to 0.76) and 0.09 (95% CI -0.65 to 0.82), respectively, showing less agreement than would be expected by chance. The 2 POCUS experts had near-perfect agreement regarding IVC collapse, with a kappa value of 0.88 (95% CI 0.64-1.0).

Performance of this previously validated DL algorithm worsened when faced with ultrasound studies from 2 novel ultrasound machines. Performance was much worse on images from a lower-IQ hand-held device than from a superior cart-based device.
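For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's kappa can be computed for two raters. It is not the authors' code. The function name `cohens_kappa` and the example labels are hypothetical and chosen only for illustration; they are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of the two
    # raters' marginal frequencies for that label.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for 10 videos (1 = complete collapse, 0 = no collapse).
# These are illustrative values only, not data from the study.
expert = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
algorithm = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
kappa = cohens_kappa(expert, algorithm)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the raters agree on 8 of 10 videos (observed agreement 0.8) while chance agreement is 0.5, so kappa comes out at about 0.6. A kappa near 0 means agreement no better than chance, and a negative value means agreement below chance.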

Identifiers

pubmed: 34133034
doi: 10.1002/jum.15765

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

855-863

Copyright information

© 2021 American Institute of Ultrasound in Medicine.

Authors

Michael Blaivas (M)

Department of Medicine, University of South Carolina School of Medicine, Columbia, South Carolina, USA.
Department of Emergency Medicine, St. Francis Hospital, Columbus, Georgia, USA.

Laura N Blaivas (LN)

Michigan State University, East Lansing, Michigan, USA.

James W Tsung (JW)

Department of Emergency Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA.
