Integrated Natural Language Processing and Machine Learning Models for Standardizing Radiotherapy Structure Names.

TG-263; machine learning; natural language processing; nomenclature standardization; quality assurance; radiotherapy structure names; text categorization

Journal

Healthcare (Basel, Switzerland)
ISSN: 2227-9032
Abbreviated title: Healthcare (Basel)
Country: Switzerland
NLM ID: 101666525

Publication information

Publication date:
30 Apr 2020
History:
received: 26 Feb 2020
revised: 18 Apr 2020
accepted: 24 Apr 2020
entrez: 6 May 2020
pubmed: 6 May 2020
medline: 6 May 2020
Status: epublish

Abstract

The lack of standardized structure names in radiotherapy (RT) data limits interoperability, data sharing, and the ability to perform big data analysis. To standardize radiotherapy structure names, we developed an integrated natural language processing (NLP) and machine learning (ML) based system that maps physician-given structure names to the American Association of Physicists in Medicine (AAPM) Task Group 263 (TG-263) standard names. The dataset consisted of 794 prostate and 754 lung cancer patients across the 40 radiation therapy centers managed by the Veterans Health Administration (VA). Additionally, data from the Radiation Oncology department at Virginia Commonwealth University (VCU) were collected to serve as a test set. Domain experts identified nine prostate and ten lung organ-at-risk (OAR) structures as anatomically significant and manually labeled them according to the TG-263 standard; the remaining structures were labeled as Non_OAR. We experimented with six classification algorithms and three feature vector methods, and the final model was built with the fastText algorithm. Multiple validation techniques were used to assess the robustness of the proposed methodology. The macro-averaged F1 score was used as the main evaluation metric. The model achieved an F1 score of 0.97 on prostate structures and 0.99 on lung structures from the VA dataset. The model also performed well on the test (VCU) dataset, achieving an F1 score of 0.93 on prostate structures and 0.95 on lung structures. In this work, we demonstrate that NLP and ML based approaches can be used to standardize physician-given RT structure names with high fidelity. This standardization can support big data analytics in the radiation therapy domain using population-derived datasets, including standardization of the treatment planning process, clinical decision support systems, treatment quality improvement programs, and hypothesis-driven clinical research.
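
As an illustration of the evaluation metric named in the abstract, the following is a minimal sketch of a macro-averaged F1 computation: the F1 score is calculated per class and then averaged with equal weight, so rare structure classes count as much as the frequent Non_OAR class. The function name `macro_f1` and the example labels are assumptions chosen for illustration; they are not taken from the authors' code or data.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    classes = set(y_true) | set(y_pred)
    per_class = []
    for c in classes:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical labels for illustration only (not from the VA or VCU datasets)
y_true = ["Bladder", "Rectum", "Non_OAR", "Non_OAR", "Bladder"]
y_pred = ["Bladder", "Rectum", "Non_OAR", "Bladder", "Bladder"]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

In this toy example the per-class F1 values are 0.8 (Bladder), 1.0 (Rectum), and 2/3 (Non_OAR), so the single misclassified Non_OAR structure lowers the macro average more than it would lower plain accuracy.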

Identifiers

pubmed: 32365973
pii: healthcare8020120
doi: 10.3390/healthcare8020120
pmc: PMC7348919

Publication types

Journal Article

Languages

eng

Authors

Khajamoinuddin Syed (K)

Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA.

William Sleeman IV (WS)

Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA.
Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA.

Kevin Ivey (K)

Department of Computer Science, University of Virginia, Charlottesville, VA 22904, USA.

Michael Hagan (M)

Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA.
National Radiation Oncology Program, Department of Veteran Affairs Richmond, VA 23249, USA.

Jatinder Palta (J)

Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA.
National Radiation Oncology Program, Department of Veteran Affairs Richmond, VA 23249, USA.

Rishabh Kapoor (R)

Department of Radiation Oncology, Virginia Commonwealth University, Richmond, VA 23298, USA.
National Radiation Oncology Program, Department of Veteran Affairs Richmond, VA 23249, USA.

Preetam Ghosh (P)

Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA.

MeSH classifications