Improving early diagnosis of rare diseases using Natural Language Processing in unstructured medical records: an illustration from Dravet syndrome.
Data mining
Dravet syndrome
Early diagnosis
Natural Language Processing
Rare Diseases
Journal
Orphanet journal of rare diseases
ISSN: 1750-1172
Abbreviated title: Orphanet J Rare Dis
Country: England
NLM ID: 101266602
Publication information
Publication date: 13 July 2021
History:
received: 13 January 2021
accepted: 27 June 2021
entrez: 14 July 2021
pubmed: 15 July 2021
medline: 6 August 2021
Status: epublish
Abstract
Abstract sections
BACKGROUND
The growing use of electronic health records (EHRs) is promoting the application of data mining in healthcare. A promising use of big data in this field is to develop models that support early diagnosis and help establish natural history. Dravet syndrome (DS) is a rare developmental and epileptic encephalopathy that usually begins in the first year of life with febrile seizures (FS). Diagnosis is often delayed beyond the age of 2 years because DS at onset is difficult to differentiate from FS. We aimed to explore whether some clinical terms (concepts) are used significantly more often in the electronic narrative medical reports of individuals with DS before the age of 2 years than in those of individuals with FS. Such concepts would allow earlier detection of patients with DS and therefore earlier referral to expert centers that can provide early diagnosis and care.
METHODS
Data were collected from the Necker Enfants Malades Hospital using a document-based data warehouse, Dr Warehouse, which employs Natural Language Processing (NLP), a computer technology for processing written text. Using the Unified Medical Language System (UMLS) Metathesaurus, phenotype concepts can be recognized in medical reports. We selected individuals with DS (DS Cohort) and individuals with FS (FS Cohort) whose diagnosis was confirmed after the age of 4 years. A phenome-wide analysis evaluated the statistical associations between the phenotypes of DS and FS, based on concepts found in reports produced before the age of 2 years, using a series of logistic regressions.
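The phenome-wide step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy concept sets, the Bonferroni correction, and the rule of skipping concepts with an empty cell in the 2x2 table are all assumptions made for illustration. For each concept, it fits a logistic regression of cohort membership (DS = 1, FS = 0) on the presence of the concept in reports written before the age of 2 years.

```python
import math

def fit_logistic(x, y, max_iter=50, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson (x, y are 0/1 lists).

    Returns the slope b1 (log odds ratio) and its Wald standard error.
    """
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0  # gradient and information matrix
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det   # Newton step for b0
        d1 = (h00 * g1 - h01 * g0) / det   # Newton step for b1
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b1, math.sqrt(h00 / det)

def phewas(ds_patients, fs_patients):
    """One logistic regression per concept; inputs are lists of concept sets."""
    patients = ds_patients + fs_patients
    y = [1] * len(ds_patients) + [0] * len(fs_patients)
    rows = []
    for concept in sorted(set().union(*patients)):
        x = [int(concept in p) for p in patients]
        cells = [sum(1 for xi, yi in zip(x, y) if (xi, yi) == c)
                 for c in ((1, 1), (0, 1), (1, 0), (0, 0))]
        if min(cells) == 0:
            # Assumption: skip concepts with an empty cell, because the
            # unpenalized fit does not converge under complete separation.
            continue
        b1, se = fit_logistic(x, y)
        p_value = math.erfc(abs(b1 / se) / math.sqrt(2))  # two-sided Wald test
        rows.append({"concept": concept, "odds_ratio": math.exp(b1),
                     "p_value": p_value})
    for row in rows:
        # Assumption: Bonferroni correction over the concepts actually tested.
        row["p_adjusted"] = min(1.0, row["p_value"] * len(rows))
    return sorted(rows, key=lambda r: r["p_value"])

# Hypothetical toy data: one set of recognized concepts per patient.
DS_REPORTS = [{"febrile seizure", "status epilepticus", "ataxia"},
              {"febrile seizure", "status epilepticus", "ataxia"},
              {"febrile seizure", "status epilepticus"},
              {"febrile seizure"}]
FS_REPORTS = [{"febrile seizure", "status epilepticus"},
              {"febrile seizure", "ataxia"},
              {"febrile seizure"}, {"febrile seizure"},
              {"febrile seizure"}, {"febrile seizure"}]

if __name__ == "__main__":
    for row in phewas(DS_REPORTS, FS_REPORTS):
        print(row)
```

Skipping concepts with an empty cell discards exactly the concepts that occur in only one cohort, so a real analysis would more likely use a penalized or exact method for those cases. The toy cohorts are far too small to yield meaningful p-values and only show the shape of the computation.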
RESULTS
We found a significantly higher representation of concepts related to seizure phenotypes that distinguish DS from FS in the early phases, namely the marked recurrence of complex febrile convulsions (long-lasting and/or with focal signs) and other seizure types. Some typical early-onset non-seizure concepts also emerged, relating to neurodevelopment and gait disorders.
CONCLUSIONS
Narrative medical reports of individuals younger than 2 years with FS contain specific concepts linked to DS diagnosis, which can be automatically detected by NLP-based software. This approach could represent an innovative and sustainable methodology to shorten the time to diagnosis of DS and could be transposed to other rare diseases.
Identifiers
pubmed: 34256808
doi: 10.1186/s13023-021-01936-9
pii: 10.1186/s13023-021-01936-9
pmc: PMC8278630
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
309
Copyright information
© 2021. The Author(s).