Diagnostic performance of convolutional neural network-based Tanner-Whitehouse 3 bone age assessment system.
Artificial intelligence (AI)
Tanner-Whitehouse 3 method (TW3 method)
bone age
convolutional neural network (CNN)
Journal
Quantitative imaging in medicine and surgery
ISSN: 2223-4292
Abbreviated title: Quant Imaging Med Surg
Country: China
NLM ID: 101577942
Publication information
Publication date:
Mar 2020
History:
entrez: 10 Apr 2020
pubmed: 10 Apr 2020
medline: 10 Apr 2020
Status:
ppublish
Abstract
Abstract sections
BACKGROUND
Bone age reflects a child's true growth and developmental status and therefore plays a critical role in evaluating growth and endocrine disorders. This study established and validated an optimized Tanner-Whitehouse 3 artificial intelligence (TW3-AI) bone age assessment (BAA) system based on a convolutional neural network (CNN).
METHODS
A data set of 9,059 clinical radiographs of the left hand was obtained from the picture archiving and communication system (PACS) between January 2012 and December 2016. Of these, 8,005/9,059 (88%) samples formed the training set for model implementation, 804/9,059 (9%) samples formed the validation set for parameter optimization, and the remaining 250/9,059 (3%) samples were used to verify the accuracy and reliability of the model against 4 experienced endocrinologists and 2 experienced radiologists. The overall variation between the reviewers and the AI in the TW3-metacarpophalangeal, radius, ulna and short bones (RUS) score and the TW3-Carpal bone score was assessed with Bland-Altman (BA) plots, and the variation for each individual bone (13 RUS + 7 Carpal) was assessed with the kappa test. The time taken by the model and by the reviewers was also compared.
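The two agreement statistics named in the methods can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names, the toy values, and the choice of unweighted Cohen's kappa are all assumptions, since the abstract does not say which kappa variant or software was used.

```python
from collections import Counter
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) of agreement for paired values."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def cohen_kappa(r1, r2):
    """Unweighted Cohen's kappa for two raters' categorical labels."""
    n = len(r1)
    observed = sum(x == y for x, y in zip(r1, r2)) / n
    c1, c2 = Counter(r1), Counter(r2)
    # Chance agreement from the product of each rater's marginal frequencies.
    expected = sum(c1[k] * c2[k] for k in c1) / (n * n)
    return (observed - expected) / (1 - expected)


# Toy values, purely illustrative (NOT taken from the study).
ai_age = [8.0, 10.5, 12.0, 6.5]          # hypothetical bone ages in years
reviewer_age = [8.5, 10.0, 12.5, 6.0]
bias, lower, upper = bland_altman(ai_age, reviewer_age)

ai_stage = ["A", "B", "B", "C"]          # hypothetical per-bone maturity stages
reviewer_stage = ["A", "B", "C", "C"]
kappa = cohen_kappa(ai_stage, reviewer_stage)
```

Bland-Altman limits suit the continuous overall scores, while kappa suits the categorical per-bone ratings, which matches the "respectively" pairing in the methods.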
RESULTS
The performance of the TW3-AI model was highly consistent with the reviewers' overall estimation, with a root mean square (RMS) of 0.50 years. The accuracy of the BAA of the TW3-AI model was better than that of the reviewers' estimates. Further analysis revealed that human interpretations were most inconsistent for the capitate, the hamate, and the first distal and fifth middle phalanges in males, and for the capitate, the trapezoid, and the third and fifth middle phalanges in females. The average image processing time of the TW3-AI model was 1.5 ± 0.2 s, which was significantly shorter than that of manual interpretation.
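For readers unfamiliar with the RMS figure, one common reading is the root mean square of the paired differences between model and reference bone ages. The sketch below assumes that definition; the abstract does not spell out the formula, and the function name and sample values are hypothetical.

```python
import math


def rms_difference(pred, ref):
    """Root mean square of paired differences (same units as the inputs)."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))


# Toy values in years, purely illustrative (NOT taken from the study).
model_age = [8.0, 10.0, 12.0, 6.0]
reference_age = [8.3, 9.6, 12.0, 6.0]
rms = rms_difference(model_age, reference_age)
```

Because the differences are squared before averaging, a few large disagreements raise the RMS more than many small ones.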
CONCLUSIONS
The CNN-based TW3 BAA system was accurate and time-saving, and it showed better stability than assessments made by experienced experts.
Identifiers
pubmed: 32269926
doi: 10.21037/qims.2020.02.20
pii: qims-10-03-657
pmc: PMC7136746
Publication types
Journal Article
Languages
eng
Pagination
657-667
Copyright information
2020 Quantitative Imaging in Medicine and Surgery. All rights reserved.
Conflict of interest statement
Conflicts of Interest: The authors have no conflicts of interest to declare.