Fuzzy Matching for Symptom Detection in Tweets: Application to Covid-19 During the First Wave of the Pandemic in France.

Content analysis; Covid-19; fuzzy matching; social media; symptoms

Journal

Studies in health technology and informatics
ISSN: 1879-8365
Abbreviated title: Stud Health Technol Inform
Country: Netherlands
NLM ID: 9214582

Publication information

Publication date:
27 May 2021
History:
entrez: 27 May 2021
pubmed: 28 May 2021
medline: 1 June 2021
Status: ppublish

Abstract

Exhaustive automatic detection of symptoms in social media posts is made difficult by colloquial expressions, misspellings and inflected word forms. Detecting self-reported symptoms is of major importance for emerging diseases such as Covid-19. In this study, we aimed to (1) develop a fuzzy matching algorithm to detect symptoms in tweets, (2) establish a comprehensive list of Covid-19-related symptoms and (3) evaluate fuzzy matching for Covid-19-related symptom detection in French tweets. The Covid-19-related symptom list was built by aggregating different data sources. French Covid-19-related tweets were automatically extracted using a dedicated data broker during the first wave of the pandemic in France. The fuzzy matching parameters were fine-tuned using all symptoms from MedDRA and then evaluated on a subset of 5000 Covid-19-related tweets in French for the detection of symptoms from our Covid-19-related list. Fuzzy matching improved detection, adding 42% more correct matches with 81% precision.
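
The abstract does not specify the matching algorithm or its parameters, so the following is only a minimal sketch of how fuzzy matching of a symptom list against a tweet could work. It assumes a character-level similarity ratio from Python's standard `difflib` module, accent stripping as normalization, and a hypothetical threshold of 0.85; the function names `normalize` and `fuzzy_find` and the sample symptom list are illustrative and are not taken from the study.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and strip accents so 'Fièvre' and 'fievre' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fuzzy_find(tweet: str, symptoms: list[str], threshold: float = 0.85):
    """Return (symptom, matched_text, score) for each symptom whose best
    matching token window in the tweet scores at or above the threshold.

    The threshold is a hypothetical value, not one reported by the authors.
    """
    tokens = re.findall(r"\w+", normalize(tweet))
    hits = []
    for symptom in symptoms:
        target = normalize(symptom)
        n = len(target.split())  # compare windows with the same word count
        best = None
        for i in range(len(tokens) - n + 1):
            window = " ".join(tokens[i:i + n])
            score = SequenceMatcher(None, window, target).ratio()
            if score >= threshold and (best is None or score > best[2]):
                best = (symptom, window, score)
        if best is not None:
            hits.append(best)
    return hits


if __name__ == "__main__":
    # Illustrative tweet with a missing accent and a misspelling.
    tweet = "J'ai de la fievre et une grosse touxx depuis hier"
    for symptom, text, score in fuzzy_find(tweet, ["fièvre", "toux", "perte de goût"]):
        print(symptom, text, round(score, 2))
```

In this sketch, lowering the threshold recovers more misspelled or inflected forms at the cost of more false matches, which is the trade-off the reported precision figure reflects.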

Identifiers

pubmed: 34042803
pii: SHTI210308
doi: 10.3233/SHTI210308

Publication types

Journal Article

Languages

eng

Pagination

896-900

Authors

Carole Faviez (C)

Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université de Paris, F-75006, Paris, France.

Pierre Foulquié (P)

Kap Code, Paris, France.

Xiaoyi Chen (X)

Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université de Paris, F-75006, Paris, France.

Adel Mebarki (A)

Kap Code, Paris, France.

Sophie Quennelle (S)

Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université de Paris, F-75006, Paris, France.
M3C-Necker, Hôpital Necker-Enfants Malades, AP-HP, F-75015, Paris, France.

Nathalie Texier (N)

Kap Code, Paris, France.

Sandrine Katsahian (S)

Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université de Paris, F-75006, Paris, France.
Hôpital européen Georges Pompidou, Unité d'épidémiologie et de recherche clinique, AP-HP, F-75015, Paris, France.

Stéphane Schuck (S)

Kap Code, Paris, France.

Anita Burgun (A)

Centre de Recherche des Cordeliers, Sorbonne Université, INSERM, Université de Paris, F-75006, Paris, France.
Hôpital Necker-Enfants Malades, Département d'informatique médicale, AP-HP, F-75015, Paris, France.
PaRis Artificial Intelligence Research InstitutE (PRAIRIE), France.
