Improving early diagnosis of rare diseases using Natural Language Processing in unstructured medical records: an illustration from Dravet syndrome.


Journal

Orphanet journal of rare diseases
ISSN: 1750-1172
Abbreviated title: Orphanet J Rare Dis
Country: England
NLM ID: 101266602

Publication information

Publication date:
13 07 2021
History:
received: 13 01 2021
accepted: 27 06 2021
entrez: 14 7 2021
pubmed: 15 7 2021
medline: 6 8 2021
Status: epublish

Abstract

BACKGROUND
The growing use of Electronic Health Records (EHRs) is promoting the application of data mining in healthcare. A promising use of big data in this field is to develop models that support early diagnosis and help establish natural history. Dravet Syndrome (DS) is a rare developmental and epileptic encephalopathy that commonly begins in the first year of life with febrile seizures (FS). Diagnosis is often delayed until after 2 years of age because DS is difficult to differentiate from FS at onset. We aimed to explore whether some clinical terms (concepts) are used significantly more often in the electronic narrative medical reports of individuals with DS before the age of 2 years than in those of individuals with FS. Such concepts would allow earlier detection of patients with DS and earlier referral to expert centers that can provide early diagnosis and care.
METHODS
Data were collected from the Necker Enfants Malades Hospital using a document-based data warehouse, Dr Warehouse, which employs Natural Language Processing (NLP), a computer technology for processing written information. Using the Unified Medical Language System (UMLS) Metathesaurus, phenotype concepts can be recognized in medical reports. We selected individuals with DS (DS Cohort) and individuals with FS (FS Cohort) whose diagnosis was confirmed after the age of 4 years. A phenome-wide analysis was performed with a series of logistic regressions to evaluate the statistical associations between the phenotypes of DS and FS, based on concepts found in the reports produced before the age of 2 years.
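As an illustration only, the phenome-wide analysis described above can be sketched as one logistic regression per concept, with cohort membership (DS vs FS) as the outcome and concept presence as the single predictor. This is not the authors' implementation or the Dr Warehouse API: the function names `fit_logistic` and `phewas`, the toy patients, the Wald test, and the Bonferroni correction are all assumptions made for this example, and concepts mentioned by every patient or by none are simply skipped.

```python
import math

def fit_logistic(x, y, n_iter=25):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; return (b1, se_b1)."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p              # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                  # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b1, math.sqrt(h00 / det)   # Wald standard error of b1

def phewas(ds_patients, fs_patients):
    """Run one univariate logistic regression per concept (hypothetical helper)."""
    patients = ds_patients + fs_patients
    y = [1] * len(ds_patients) + [0] * len(fs_patients)
    testable = []
    for concept in sorted(set().union(*patients)):
        x = [1 if concept in p else 0 for p in patients]
        if 0 < sum(x) < len(x):       # skip concepts with no variation
            testable.append((concept, x))
    results = []
    for concept, x in testable:
        b1, se = fit_logistic(x, y)
        p_value = math.erfc(abs(b1 / se) / math.sqrt(2))  # two-sided Wald test
        results.append({
            "concept": concept,
            "odds_ratio": math.exp(b1),
            "p": p_value,
            # Bonferroni correction is an assumption of this sketch
            "p_bonferroni": min(1.0, p_value * len(testable)),
        })
    return sorted(results, key=lambda r: r["p"])

# Invented toy data: each patient is the set of concepts found before age 2.
FS, SE, GAIT, OM = "febrile seizure", "status epilepticus", "gait disorder", "otitis media"
ds_cohort = [{FS, SE, GAIT}, {FS, SE, GAIT}, {FS, SE, GAIT, OM},
             {FS, SE, OM}, {FS, SE}, {FS}]
fs_cohort = [{FS, SE}, {FS, GAIT}, {FS, OM}, {FS, OM}, {FS, OM}, {FS}]

results = phewas(ds_cohort, fs_cohort)
for r in results:
    print(f'{r["concept"]}: OR={r["odds_ratio"]:.2f}, p={r["p"]:.3f}')
```

With a single binary predictor, the fitted coefficient equals the log odds ratio of the corresponding 2x2 table, which makes the toy output easy to check by hand. A real analysis would also need to handle concepts absent from one cohort (where the fit does not converge) and any covariates the study design requires.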
RESULTS
We found a significantly higher representation of concepts related to seizure phenotypes that distinguish DS from FS in the early phases, namely the major recurrence of complex febrile convulsions (long-lasting and/or with focal signs) and other seizure types. Some typical early-onset non-seizure concepts also emerged, relating to neurodevelopment and gait disorders.
CONCLUSIONS
Narrative medical reports of individuals younger than 2 years with FS contain specific concepts linked to DS diagnosis, which can be automatically detected by software exploiting NLP. This approach could represent an innovative and sustainable methodology to shorten the time to diagnosis of DS and could be transposed to other rare diseases.

Identifiers

pubmed: 34256808
doi: 10.1186/s13023-021-01936-9
pii: 10.1186/s13023-021-01936-9
pmc: PMC8278630

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

309

Copyright information

© 2021. The Author(s).


Authors

Tommaso Lo Barco (T)

Department of Pediatric Neurology, Necker-Enfants Malades Hospital, APHP, Centre de Référence Épilepsies Rares, Member of ERN EPICARE, Université de Paris, Paris, France.
Child Neuropsychiatry, Department of Surgical Sciences, Dentistry, Gynecology and Pediatrics, University of Verona, Verona, Italy.

Mathieu Kuchenbuch (M)

Department of Pediatric Neurology, Necker-Enfants Malades Hospital, APHP, Centre de Référence Épilepsies Rares, Member of ERN EPICARE, Université de Paris, Paris, France.
Imagine Institute, INSERM, UMR 1163, Université de Paris, 75015, Paris, France.

Nicolas Garcelon (N)

Imagine Institute, INSERM, UMR 1163, Université de Paris, 75015, Paris, France.

Antoine Neuraz (A)

Université de Paris, Paris, France.
INSERM, UMR1138, Centre de Recherche Des Cordeliers, Paris, France.
Department of Medical Informatics, University Hospital Necker-Enfants Malades, APHP, Paris, France.

Rima Nabbout (R)

Department of Pediatric Neurology, Necker-Enfants Malades Hospital, APHP, Centre de Référence Épilepsies Rares, Member of ERN EPICARE, Université de Paris, Paris, France. rima.nabbout@aphp.fr.
Imagine Institute, INSERM, UMR 1163, Université de Paris, 75015, Paris, France. rima.nabbout@aphp.fr.
Université de Paris, Paris, France. rima.nabbout@aphp.fr.
