Automated Construction of Lexicons to Improve Depression Screening With Text Messages.
Journal
IEEE journal of biomedical and health informatics
ISSN: 2168-2208
Abbreviated title: IEEE J Biomed Health Inform
Country: United States
NLM ID: 101604520
Publication information
Publication date:
06 2023
History:
medline: 8 June 2023
pubmed: 1 September 2022
entrez: 31 August 2022
Status:
ppublish
Abstract
Given that depression is one of the most prevalent mental illnesses, developing effective and unobtrusive diagnosis tools is of great importance. Recent work that screens for depression with text messages leverages models relying on lexical category features. Given the colloquial nature of text messages, the performance of these models may be limited by formal lexicons. We thus propose a strategy to automatically construct alternative lexicons that contain more relevant and colloquial terms. Specifically, we generate 36 lexicons from fiction, forum, and news corpora. These lexicons are then used to extract lexical category features from the text messages. We utilize machine learning models to compare the depression screening capabilities of these lexical category features. Out of our 36 constructed lexicons, 14 achieved statistically significantly higher average F1 scores than the pre-existing formal lexicon and the basic bag-of-words approach. In comparison to the pre-existing lexicon, our best performing lexicon increased the average F1 scores by 10%. We thus confirm our hypothesis that less formal lexicons can improve the performance of classification models that screen for depression with text messages. By providing our automatically constructed lexicons, we aid future machine learning research that leverages less formal text.
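As an illustration of what "lexical category features" can mean in practice, the following is a minimal sketch, not the authors' implementation. It assumes a lexicon is a mapping from category names to sets of terms and that each feature is the fraction of a participant's tokens falling in that category. The toy lexicon, the tokenizer, and the function name `lexical_category_features` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy lexicon (category -> colloquial terms); not taken from the paper.
LEXICON = {
    "sadness": {"sad", "cry", "ugh", "bummed"},
    "fatigue": {"tired", "exhausted", "meh"},
}


def lexical_category_features(messages, lexicon):
    """Return, per category, the share of tokens that belong to that category."""
    # Simple lowercase tokenizer (an assumption; real pipelines may differ).
    tokens = [t for m in messages for t in re.findall(r"[a-z']+", m.lower())]
    total = len(tokens)
    counts = Counter()
    for tok in tokens:
        for category, terms in lexicon.items():
            if tok in terms:
                counts[category] += 1
    # Normalize by token count so long and short message histories are comparable.
    return {c: (counts[c] / total if total else 0.0) for c in lexicon}


features = lexical_category_features(["So tired today, ugh", "I'm bummed"], LEXICON)
print(features)
```

Feature vectors of this kind, one per participant, could then be passed to any standard classifier. Swapping in a different lexicon changes only the `lexicon` argument, which is what makes comparing lexicons straightforward.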
Identifiers
pubmed: 36044503
doi: 10.1109/JBHI.2022.3203345
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Languages
eng
Citation subsets
IM