Genetic variations analysis for complex brain disease diagnosis using machine learning techniques: opportunities and hurdles.
Brain disease
Deep learning
Genetic analysis
Machine learning
Microarrays
Single nucleotide polymorphism (SNP)
Journal
PeerJ. Computer science
ISSN: 2376-5992
Abbreviated title: PeerJ Comput Sci
Country: United States
NLM ID: 101660598
Publication information
Publication date:
2021
History:
received: 2021-04-12
accepted: 2021-08-05
entrez: 2021-10-07
pubmed: 2021-10-08
medline: 2021-10-08
Status:
epublish
Abstract
BACKGROUND AND OBJECTIVES: This paper presents an in-depth review of state-of-the-art genetic variation analysis for discovering the genes associated with complex genetic disorders of the brain. We first introduce the genetic analysis of complex brain diseases, genetic variation, and DNA microarrays. The review then focuses on the machine learning methods available for classifying complex brain diseases, covering datasets, preprocessing, feature selection and extraction, and classification strategies. In particular, we concentrate on single nucleotide polymorphisms (SNPs), which offer the highest resolution for genomic fingerprinting when tracking disease genes. The study then gives an overview of applications to specific diseases, including autism spectrum disorder, brain cancer, and Alzheimer's disease (AD). It argues that, despite significant recent developments in the analysis and treatment of genetic disorders, elucidating causative mutations remains a considerable challenge, especially when implementing genetic analysis in clinical practice. Finally, the review critically discusses the applicability of genetic variation analysis to complex brain disease identification and highlights future challenges.
METHODS: We followed a literature survey methodology to obtain data from academic databases. Inclusion and exclusion criteria were defined, and articles were selected in three stages. In addition, the principal machine learning methods used to classify the disease are presented in more detail for each stage.
RESULTS: Machine learning based on SNPs was found to be widely used to address problems of genetic variation in complex gene-related diseases.
CONCLUSIONS: Despite significant developments in the diagnosis and treatment of genetic diseases over the past two decades, in a large percentage of cases the causative mutation still cannot be determined and a final genetic diagnosis remains elusive. Variations in the genes related to brain disorders therefore need to be detected in the early stages of disease.
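The abstract names a general workflow (SNP data, preprocessing, feature selection, classification) without fixing any particular method. The following is a minimal sketch of one way such a workflow could look, not the method of the review or of any study it surveys. It assumes additive genotype encoding (minor-allele counts), a Pearson chi-square score for ranking SNPs, and a nearest-centroid classifier. All function names, the toy genotypes, and the labels are hypothetical and invented for illustration only.

```python
from collections import Counter

def encode_genotype(genotype, minor_allele):
    """Additive encoding: number of minor-allele copies (0, 1, or 2)."""
    return sum(1 for allele in genotype if allele == minor_allele)

def encode_matrix(genotypes, minor_alleles):
    """Encode every individual (row) across all SNPs (columns)."""
    return [[encode_genotype(g, m) for g, m in zip(row, minor_alleles)]
            for row in genotypes]

def chi_square(column, labels):
    """Pearson chi-square statistic of the genotype-by-label contingency table."""
    n = len(labels)
    genotype_counts = Counter(column)
    label_counts = Counter(labels)
    observed = Counter(zip(column, labels))
    stat = 0.0
    for g, g_n in genotype_counts.items():
        for y, y_n in label_counts.items():
            expected = g_n * y_n / n
            stat += (observed[(g, y)] - expected) ** 2 / expected
    return stat

def rank_snps(matrix, labels):
    """Return SNP column indices, most strongly associated first."""
    n_snps = len(matrix[0])
    scores = [chi_square([row[j] for row in matrix], labels)
              for j in range(n_snps)]
    return sorted(range(n_snps), key=lambda j: scores[j], reverse=True)

def fit_centroids(matrix, labels, columns):
    """Mean encoded genotype per class, restricted to the selected SNPs."""
    counts = Counter(labels)
    sums = {}
    for row, y in zip(matrix, labels):
        acc = sums.setdefault(y, [0.0] * len(columns))
        for i, j in enumerate(columns):
            acc[i] += row[j]
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, row, columns):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def distance(centroid):
        return sum((row[j] - centroid[i]) ** 2 for i, j in enumerate(columns))
    return min(centroids, key=lambda y: distance(centroids[y]))

# Made-up toy data: 6 individuals x 3 SNPs; label 1 = case, 0 = control.
MINOR_ALLELES = ["A", "T", "G"]
GENOTYPES = [
    ["AA", "CT", "CC"],
    ["AG", "CC", "CG"],
    ["AA", "CT", "CC"],
    ["GG", "CT", "CG"],
    ["GG", "CC", "CC"],
    ["GG", "CT", "CG"],
]
LABELS = [1, 1, 1, 0, 0, 0]

matrix = encode_matrix(GENOTYPES, MINOR_ALLELES)
ranking = rank_snps(matrix, LABELS)      # [0, 2, 1] for this toy data
top = ranking[:1]                        # keep only the best-ranked SNP
centroids = fit_centroids(matrix, LABELS, top)
print(ranking)
print(predict(centroids, [1, 0, 0], top))
```

Real SNP studies involve far more markers than samples, so a practical pipeline would also need quality control, correction for multiple testing, and held-out evaluation, none of which this sketch attempts.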
Identifiers
pubmed: 34616886
doi: 10.7717/peerj-cs.697
pii: cs-697
pmc: PMC8459785
Publication types
Journal Article
Languages
eng
Pagination
e697
Copyright information
© 2021 Ahmed et al.
Conflict of interest statement
The authors declare that they have no competing interests.