Cluster analysis and visualisation of electronic health records data to identify undiagnosed patients with rare genetic diseases.


Journal

Scientific reports
ISSN: 2045-2322
Abbreviated title: Sci Rep
Country: England
NLM ID: 101563288

Publication information

Publication date:
01 Mar 2024
History:
received: 01 Nov 2023
accepted: 23 Feb 2024
medline: 01 Mar 2024
pubmed: 01 Mar 2024
entrez: 29 Feb 2024
Status: epublish

Abstract

Rare genetic diseases affect 5-8% of the population but are often undiagnosed or misdiagnosed. Electronic health records (EHR) contain large amounts of data, which provide opportunities for analysis and mining. Data mining, in the form of cluster analysis and visualisation, was performed on a database containing deidentified health records of 1.28 million patients across 3 major hospitals in Singapore, in a bid to improve the diagnostic process for patients who are living with an undiagnosed rare disease, specifically focusing on Fabry disease and familial hypercholesterolaemia (FH). On a baseline of 4 patients, we identified 2 additional patients with a potential diagnosis of Fabry disease, suggesting a potential 50% increase in diagnosis. Similarly, we identified >12,000 individuals who fulfil the clinical and laboratory criteria for FH but had not been diagnosed previously. This proof-of-concept study showed that it is possible to perform mining on EHR data, albeit with some challenges and limitations.

Identifiers

pubmed: 38424111
doi: 10.1038/s41598-024-55424-8
pii: 10.1038/s41598-024-55424-8

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

5056

Grants

Agency: National Medical Research Council, Singapore
ID: NMRC/CSAINV21jun-0003

Copyright information

© 2024. The Author(s).

Authors

Daniel Moynihan (D)

Curtin University, Perth, Australia.

Sean Monaco (S)

Health Catalyst, Utah, USA.

Teck Wah Ting (TW)

Genetics Service, Department of Paediatrics, KK Women's and Children's Hospital, 100 Bukit Timah Road, Singapore, 229899, Singapore.
SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore.

Kaavya Narasimhalu (K)

SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore.
Department of Neurology, National Neuroscience Institute (Singapore General Hospital), Singapore, Singapore.

Jenny Hsieh (J)

SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore.
Department of Internal Medicine, Singapore General Hospital, Singapore, Singapore.

Sylvia Kam (S)

Genetics Service, Department of Paediatrics, KK Women's and Children's Hospital, 100 Bukit Timah Road, Singapore, 229899, Singapore.
SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore.

Jiin Ying Lim (JY)

Genetics Service, Department of Paediatrics, KK Women's and Children's Hospital, 100 Bukit Timah Road, Singapore, 229899, Singapore.
SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore.

Weng Khong Lim (WK)

SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore.
SingHealth Duke-NUS Institute of Precision Medicine, Singapore, Singapore.
Cancer & Stem Cell Biology Program, Duke-NUS Medical School, Singapore, Singapore.
Laboratory of Genome Variation Analytics, Genome Institute of Singapore, Singapore, Singapore.

Sonia Davila (S)

SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore.
SingHealth Duke-NUS Institute of Precision Medicine, Singapore, Singapore.

Yasmin Bylstra (Y)

SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore.
SingHealth Duke-NUS Institute of Precision Medicine, Singapore, Singapore.

Iswaree Devi Balakrishnan (ID)

SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore.
National Heart Centre Singapore, Singapore, Singapore.

Mark Heng (M)

SingHealth Office of Insights and Analytics, Singapore, Singapore.

Elian Chia (E)

SingHealth Office of Insights and Analytics, Singapore, Singapore.

Khung Keong Yeo (KK)

National Heart Centre Singapore, Singapore, Singapore.

Bee Keow Goh (BK)

Data Analytics Office, KK Women's and Children's Hospital, Singapore, Singapore.

Ritu Gupta (R)

Curtin University, Perth, Australia.

Tele Tan (T)

Curtin University, Perth, Australia.

Gareth Baynam (G)

Rare Care Centre, Perth Children's Hospital, Perth, WA, Australia.
Western Australian Register of Developmental Anomalies, Perth, WA, Australia.

Saumya Shekhar Jamuar (SS)

Genetics Service, Department of Paediatrics, KK Women's and Children's Hospital, 100 Bukit Timah Road, Singapore, 229899, Singapore. saumya.s.jamuar@singhealth.com.sg.
SingHealth Duke-NUS Genomic Medicine Centre, Singapore, Singapore.
SingHealth Duke-NUS Institute of Precision Medicine, Singapore, Singapore.

MeSH classifications