Investigating Topic Modeling Techniques to Extract Meaningful Insights in Italian Long COVID Narration.

Keywords: BERTopic; LDA; narrative medicine; text mining; topic modeling

Journal

Biotech (Basel (Switzerland))
ISSN: 2673-6284
Abbreviated title: BioTech (Basel)
Country: Switzerland
NLM ID: 9918383086206676

Publication information

Publication date:
03 Sep 2022
History:
received: 29 07 2022
revised: 29 08 2022
accepted: 31 08 2022
entrez: 22 9 2022
pubmed: 23 9 2022
medline: 23 9 2022
Status: epublish

Abstract

Through an adequate survey of the history of the disease, Narrative Medicine (NM) aims to allow the definition and implementation of an effective, appropriate, and shared treatment path. The present study compares different topic modeling techniques, such as Latent Dirichlet Allocation (LDA) and topic modeling based on the BERT transformer (BERTopic), for extracting meaningful insights from Italian narratives of the COVID-19 pandemic. The main focus was the characterization of writings on Post-acute Sequelae of COVID-19 (PASC) as opposed to non-PASC writings (those by health professionals and general reflections on COVID-19), modeled as a semi-supervised task. The results show that the BERTopic-based approach outperforms the LDA-based approach, grouping 97.26% of the analyzed documents in the same cluster and reaching an overall accuracy of 91.97%.
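The two figures reported above (share of documents grouped in the same cluster, and overall accuracy) can be illustrated with a minimal sketch. The abstract does not state how the authors computed them, so the definitions below are assumptions: the cluster share is taken as the fraction of documents of one class that land in that class's most frequent topic, and accuracy is taken after mapping each topic to the majority label of its documents. The function names and the toy data are hypothetical and are not drawn from the study.

```python
from collections import Counter


def dominant_cluster_share(topics, labels, target):
    """Fraction of `target`-labeled documents that fall in their most common topic.

    Assumed definition for illustration; not taken from the paper.
    """
    target_topics = [t for t, y in zip(topics, labels) if y == target]
    if not target_topics:
        raise ValueError(f"no documents with label {target!r}")
    _, count = Counter(target_topics).most_common(1)[0]
    return count / len(target_topics)


def majority_vote_accuracy(topics, labels):
    """Accuracy obtained by assigning each topic the majority label of its documents.

    Assumed definition for illustration; not taken from the paper.
    """
    if not labels:
        raise ValueError("no documents")
    label_counts_by_topic = {}
    for t, y in zip(topics, labels):
        label_counts_by_topic.setdefault(t, Counter())[y] += 1
    # Documents matching their topic's majority label count as correct.
    correct = sum(c.most_common(1)[0][1] for c in label_counts_by_topic.values())
    return correct / len(labels)


# Toy topic assignments (e.g. produced by LDA or BERTopic) and class labels.
toy_topics = [0, 0, 0, 1, 1, 1, 1, 0]
toy_labels = ["PASC", "PASC", "PASC", "non-PASC",
              "non-PASC", "non-PASC", "PASC", "non-PASC"]

pasc_share = dominant_cluster_share(toy_topics, toy_labels, "PASC")
accuracy = majority_vote_accuracy(toy_topics, toy_labels)
print(f"PASC documents in dominant cluster: {pasc_share:.2%}")
print(f"Majority-vote accuracy: {accuracy:.2%}")
```

Under these assumed definitions, the same two functions could be applied to the topic assignments of either model, which would make the comparison between the LDA-based and BERTopic-based approaches like for like.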

Identifiers

pubmed: 36134915
pii: biotech11030041
doi: 10.3390/biotech11030041
pmc: PMC9496775

Publication types

Journal Article

Languages

eng

Authors

Ileana Scarpino (I)

Department of Medical and Surgical Sciences, University "Magna Græcia", 88100 Catanzaro, Italy.

Chiara Zucco (C)

Department of Medical and Surgical Sciences, University "Magna Græcia", 88100 Catanzaro, Italy.
Data Analytics Research Center, University "Magna Græcia", 88100 Catanzaro, Italy.

Rosarina Vallelunga (R)

Department of Medical and Surgical Sciences, University "Magna Græcia", 88100 Catanzaro, Italy.

Francesco Luzza (F)

Department of Health Sciences, University "Magna Græcia", 88100 Catanzaro, Italy.

Mario Cannataro (M)

Department of Medical and Surgical Sciences, University "Magna Græcia", 88100 Catanzaro, Italy.
Data Analytics Research Center, University "Magna Græcia", 88100 Catanzaro, Italy.

MeSH classifications