Text-based predictions of COVID-19 diagnosis from self-reported chemosensory descriptions.


Journal

Communications Medicine
ISSN: 2730-664X
Abbreviated title: Commun Med (Lond)
Country: England
NLM ID: 9918250414506676

Publication information

Publication date:
27 Jul 2023
History:
received: 19 Jan 2023
accepted: 19 Jul 2023
medline: 28 Jul 2023
pubmed: 28 Jul 2023
entrez: 27 Jul 2023
Status: epublish

Abstract

There is a prevailing view that humans' capacity to use language to characterize sensations like odors or tastes is poor, providing an unreliable source of information. Here, we developed a machine learning method based on Natural Language Processing (NLP) using Large Language Models (LLMs) to predict COVID-19 diagnosis solely based on text descriptions of acute changes in chemosensation, i.e., smell, taste and chemesthesis, caused by the disease. The dataset of more than 1500 subjects was obtained from survey responses early in the COVID-19 pandemic, in Spring 2020. When predicting COVID-19 diagnosis, our NLP model performs comparably (AUC ROC ~ 0.65) to models based on self-reported changes in function collected via quantitative rating scales. Further, our NLP model could attribute importance to individual words when performing the prediction; sentiment and descriptive words such as "smell", "taste", and "sense" had strong contributions to the predictions. In addition, adjectives describing specific tastes or smells such as "salty", "sweet", "spicy", and "sour" also contributed considerably to predictions. Our results show that the description of perceptual symptoms caused by a viral infection can be used to fine-tune an LLM to correctly predict and interpret the diagnostic status of a subject. In the future, similar models may have utility for patient verbatims from online health portals or electronic health records.

Abstract sections

BACKGROUND
There is a prevailing view that humans' capacity to use language to characterize sensations like odors or tastes is poor, providing an unreliable source of information.
METHODS
Here, we developed a machine learning method based on Natural Language Processing (NLP) using Large Language Models (LLMs) to predict COVID-19 diagnosis solely based on text descriptions of acute changes in chemosensation, i.e., smell, taste and chemesthesis, caused by the disease. The dataset of more than 1500 subjects was obtained from survey responses early in the COVID-19 pandemic, in Spring 2020.
RESULTS
When predicting COVID-19 diagnosis, our NLP model performs comparably (AUC ROC ~ 0.65) to models based on self-reported changes in function collected via quantitative rating scales. Further, our NLP model could attribute importance to individual words when performing the prediction; sentiment and descriptive words such as "smell", "taste", and "sense" had strong contributions to the predictions. In addition, adjectives describing specific tastes or smells such as "salty", "sweet", "spicy", and "sour" also contributed considerably to predictions.
CONCLUSIONS
Our results show that the description of perceptual symptoms caused by a viral infection can be used to fine-tune an LLM to correctly predict and interpret the diagnostic status of a subject. In the future, similar models may have utility for patient verbatims from online health portals or electronic health records.
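
The AUC ROC of about 0.65 reported in the results can be read as the probability that a randomly chosen COVID-positive respondent receives a higher model score than a randomly chosen COVID-negative one (0.5 is chance, 1.0 is perfect separation). The following is a minimal sketch of that pairwise definition in plain Python. The function name `auc_roc` and the toy labels and scores are hypothetical illustrations, not the authors' code or data.

```python
from itertools import product


def auc_roc(labels, scores):
    """Pairwise AUC ROC: chance that a positive outscores a negative.

    labels: iterable of 0/1 ground-truth diagnoses (1 = positive).
    scores: iterable of model scores, higher meaning "more likely positive".
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy example (not data from the study).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.5, 0.2]
print(round(auc_roc(toy_labels, toy_scores), 3))  # 6 of 9 pairs ranked correctly -> 0.667
```

The pairwise form is quadratic in the number of respondents, which is acceptable for a dataset of roughly 1500 subjects; rank-based implementations give the same value more efficiently.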

Other abstracts

Type: plain-language-summary (eng)
Early in the COVID-19 pandemic, people who were infected with SARS-CoV-2 reported changes in smell and taste. To better study these symptoms of SARS-CoV-2 infections and potentially use them to identify infected patients, a survey was undertaken in various countries asking people about their COVID-19 symptoms. One part of the questionnaire asked people to describe the changes in smell and taste they were experiencing. We developed a computational program that could use these responses to correctly distinguish people who had tested positive for SARS-CoV-2 infection from people without SARS-CoV-2 infection. This approach could allow rapid identification of people infected with SARS-CoV-2 from descriptions of their sensory symptoms and be adapted to identify people infected with other viruses in the future.

Identifiers

pubmed: 37500763
doi: 10.1038/s43856-023-00334-5
pii: 10.1038/s43856-023-00334-5
pmc: PMC10374642

Publication types

Journal Article

Languages

eng

Pagination

104

Copyright information

© 2023. The Author(s).

Authors

Hongyang Li (H)

Health Care and Life Sciences, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA.

Richard C Gerkin (RC)

School of Life Sciences, Arizona State University, Tempe, AZ, USA.
Osmo, Cambridge, MA, USA.

Alyssa Bakke (A)

Department of Food Science, The Pennsylvania State University, University Park, PA, USA.

Raquel Norel (R)

Health Care and Life Sciences, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA.

Guillermo Cecchi (G)

Health Care and Life Sciences, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA.

Christophe Laudamiel (C)

Department of Scent Engineering, DreamAir LLC, New York, NY, USA.

Masha Y Niv (MY)

The Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, Rehovot, Israel.

Kathrin Ohla (K)

Department of Food Science, The Pennsylvania State University, University Park, PA, USA.
Science & Research, dsm-firmenich, Satigny, Switzerland.

John E Hayes (JE)

Department of Food Science, The Pennsylvania State University, University Park, PA, USA.

Valentina Parma (V)

Monell Chemical Senses Center, Philadelphia, PA, USA.

Pablo Meyer (P)

Health Care and Life Sciences, IBM T.J. Watson Research Center, Yorktown Heights, NY, USA. pmeyerr@us.ibm.com.

MeSH classifications