Acoustic and Facial Features From Clinical Interviews for Machine Learning-Based Psychiatric Diagnosis: Algorithm Development.
Keywords: audiovisual patterns; bipolar disorder; diagnostic prediction; facial analysis; machine learning; psychiatry; schizophrenia spectrum disorders; speech analysis; symptom prediction
Journal
JMIR Mental Health
ISSN: 2368-7959
Abbreviated title: JMIR Ment Health
Country: Canada
NLM ID: 101658926
Publication information
Publication date: 24 Jan 2022
History:
received: 1 Oct 2020
revised: 29 Apr 2021
accepted: 1 Dec 2021
entrez: 24 Jan 2022
pubmed: 25 Jan 2022
medline: 25 Jan 2022
Status: epublish
Abstract
BACKGROUND
In contrast to all other areas of medicine, psychiatry is still nearly entirely reliant on subjective assessments such as patient self-report and clinical observation. The lack of objective information on which to base clinical decisions can contribute to reduced quality of care. Behavioral health clinicians need objective and reliable patient data to support effective targeted interventions.
OBJECTIVE
We aimed to investigate whether reliable inferences (psychiatric signs, symptoms, and diagnoses) can be extracted from audiovisual patterns in recorded evaluation interviews of participants with schizophrenia spectrum disorders and bipolar disorder.
METHODS
We obtained audiovisual data from 89 participants (mean age 25.3 years; male: 48/89, 53.9%; female: 41/89, 46.1%): individuals with schizophrenia spectrum disorders (n=41), individuals with bipolar disorder (n=21), and healthy volunteers (n=27). We developed machine learning models based on acoustic and facial movement features extracted from participant interviews to predict diagnoses and detect clinician-coded neuropsychiatric symptoms, and we assessed model performance using area under the receiver operating characteristic curve (AUROC) in 5-fold cross-validation.
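As a rough illustration of the evaluation scheme described above, and not the authors' actual pipeline, the following Python sketch estimates AUROC with 5-fold cross-validation on a participant-by-feature matrix; X_features, y_diagnosis, and the random-forest classifier are placeholder assumptions for demonstration.

```python
# Minimal sketch, assuming a precomputed feature matrix; not the study's code.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X_features = rng.normal(size=(89, 40))      # hypothetical acoustic + facial features per participant
y_diagnosis = rng.integers(0, 2, size=89)   # hypothetical labels, e.g., 1 = schizophrenia spectrum, 0 = bipolar

aurocs = []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X_features, y_diagnosis):
    clf = RandomForestClassifier(n_estimators=200, random_state=0)
    clf.fit(X_features[train_idx], y_diagnosis[train_idx])
    scores = clf.predict_proba(X_features[test_idx])[:, 1]
    aurocs.append(roc_auc_score(y_diagnosis[test_idx], scores))

print(f"Mean AUROC across 5 folds: {np.mean(aurocs):.2f}")
```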
RESULTS
The model successfully differentiated between schizophrenia spectrum disorders and bipolar disorder (AUROC 0.73) when aggregating face and voice features. Facial action units including cheek-raising muscle (AUROC 0.64) and chin-raising muscle (AUROC 0.74) provided the strongest signal for men. Vocal features, such as energy in the frequency band 1 to 4 kHz (AUROC 0.80) and spectral harmonicity (AUROC 0.78), provided the strongest signal for women. Lip corner-pulling muscle signal discriminated between diagnoses for both men (AUROC 0.61) and women (AUROC 0.62). Several psychiatric signs and symptoms were successfully inferred: blunted affect (AUROC 0.81), avolition (AUROC 0.72), lack of vocal inflection (AUROC 0.71), asociality (AUROC 0.63), and worthlessness (AUROC 0.61).
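To make one of the vocal features named above concrete, the sketch below shows one way to approximate the proportion of spectral energy in the 1 to 4 kHz band with librosa; the file name and analysis parameters are assumptions, and this is not the feature-extraction code used in the study.

```python
# Minimal sketch of band-energy estimation from an audio file; parameters are illustrative.
import numpy as np
import librosa

y, sr = librosa.load("interview_audio.wav", sr=16000)        # hypothetical recording
stft = np.abs(librosa.stft(y, n_fft=1024, hop_length=256))   # magnitude spectrogram
freqs = librosa.fft_frequencies(sr=sr, n_fft=1024)

band = (freqs >= 1000) & (freqs <= 4000)                      # 1-4 kHz bins
band_energy = (stft[band, :] ** 2).sum(axis=0)                # per-frame energy in the band
total_energy = (stft ** 2).sum(axis=0) + 1e-10                # avoid division by zero
relative_band_energy = band_energy / total_energy

print(f"Mean proportion of spectral energy in 1-4 kHz: {relative_band_energy.mean():.3f}")
```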
CONCLUSIONS
This study represents an advance in efforts to capitalize on digital data to improve diagnostic assessment and supports the development of a new generation of clinical tools employing acoustic and facial data analysis.
Identifiers
PubMed: 35072648
PII: v9i1e24699
DOI: 10.2196/24699
PMC: PMC8822433
Publication types
Journal Article
Languages
English
Pagination
e24699
Copyright information
©Michael L Birnbaum, Avner Abrami, Stephen Heisig, Asra Ali, Elizabeth Arenare, Carla Agurto, Nathaniel Lu, John M Kane, Guillermo Cecchi. Originally published in JMIR Mental Health (https://mental.jmir.org), 24.01.2022.