KinOrtho: a method for mapping human kinase orthologs across the tree of life and illuminating understudied kinases.


Journal

BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194

Publication information

Publication date:
18 Sep 2021
History:
received: 10 Apr 2021
accepted: 06 Sep 2021
entrez: 19 Sep 2021
pubmed: 20 Sep 2021
medline: 22 Sep 2021
Status: epublish

Abstract

BACKGROUND: Protein kinases are one of the largest druggable families of signaling proteins and are involved in various human diseases, including cancers and neurodegenerative disorders. Despite their clinical relevance, nearly 30% of the 545 human protein kinases remain highly understudied. Comparative genomics is a powerful approach for predicting and investigating the functions of understudied kinases. However, incomplete knowledge of kinase orthologs across fully sequenced kinomes severely limits the application of comparative genomics approaches for illuminating understudied kinases. Here, we introduce KinOrtho, a query- and graph-based orthology inference method that combines full-length and domain-based approaches to map one-to-one kinase orthologs across 17,000 species.
RESULTS: Using multiple metrics, we show that KinOrtho performed better than existing methods in identifying kinase orthologs across evolutionarily divergent species and eliminated potential false positives by flagging sequences without a proper kinase domain for further evaluation. We demonstrate the advantage of using domain-based approaches for identifying domain fusion events, highlighting a case between an understudied serine/threonine kinase, TAOK1, and a metabolic kinase, PIK3C2A, with high co-expression in human cells. We also identify evolutionary fission events involving the understudied OBSCN kinase domains, further highlighting the value of domain-based orthology inference approaches. Using KinOrtho-defined orthologs, Gene Ontology annotations, and machine learning, we propose putative biological functions of several understudied kinases, including the role of TP53RK in cell cycle checkpoint(s), the involvement of TSSK3 and TSSK6 in acrosomal vesicle localization, and potential functions for the ULK4 pseudokinase in neuronal development.
CONCLUSIONS: In sum, KinOrtho presents a novel query-based tool to identify one-to-one orthologous relationships across thousands of proteomes. We exploit KinOrtho here to identify kinase orthologs and show that its well-curated kinome ortholog set can serve as a valuable resource for illuminating understudied kinases. The KinOrtho framework can be extended to any protein family of interest.
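
The abstract does not spell out how KinOrtho derives one-to-one orthologs, so the following is only a minimal sketch of a generic reciprocal-best-hit approach, a common way to obtain one-to-one pairs from similarity scores. It is not KinOrtho's implementation. The protein identifiers, the scores, and the function names (`best_hits`, `reciprocal_best_hits`, `flag_missing_domain`) are all hypothetical and chosen for illustration; the last function loosely mirrors the idea of flagging candidates that lack a kinase domain for further evaluation.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical similarity scores (for example, alignment bit scores),
# keyed by (source protein, target protein). All values are toy data.
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each source protein to its single highest-scoring target."""
    best: Dict[str, Tuple[str, float]] = {}
    for (src, dst), score in scores.items():
        if src not in best or score > best[src][1]:
            best[src] = (dst, score)
    return {src: dst for src, (dst, _) in best.items()}


def reciprocal_best_hits(forward: Scores, backward: Scores) -> List[Tuple[str, str]]:
    """Keep (query, target) pairs that are each other's best hit."""
    fwd = best_hits(forward)
    bwd = best_hits(backward)
    return sorted((q, t) for q, t in fwd.items() if bwd.get(t) == q)


def flag_missing_domain(
    pairs: List[Tuple[str, str]], has_kinase_domain: Set[str]
) -> List[Tuple[str, str, bool]]:
    """Annotate each pair with whether the target carries a kinase domain."""
    return [(q, t, t in has_kinase_domain) for q, t in pairs]


# Toy example: two query kinases searched against one target proteome.
forward: Scores = {
    ("human_K1", "mouse_K1"): 950.0,
    ("human_K1", "mouse_K2"): 600.0,
    ("human_K2", "mouse_K3"): 700.0,
    ("human_K2", "mouse_K2"): 50.0,
}
backward: Scores = {
    ("mouse_K1", "human_K1"): 940.0,
    ("mouse_K2", "human_K1"): 610.0,
    ("mouse_K3", "human_K2"): 690.0,
}

pairs = reciprocal_best_hits(forward, backward)
flagged = flag_missing_domain(pairs, has_kinase_domain={"mouse_K1"})
print(flagged)
```

In this toy run, `mouse_K2` is dropped because its best hit (`human_K1`) already pairs reciprocally with `mouse_K1`, and the `human_K2`/`mouse_K3` pair is kept but marked `False` so that it can be reviewed rather than silently accepted.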

Identifiers

pubmed: 34537014
doi: 10.1186/s12859-021-04358-3
pii: 10.1186/s12859-021-04358-3
pmc: PMC8449880

Chemical substances

Proteins 0
Protein Kinases EC 2.7.-
Protein Serine-Threonine Kinases EC 2.7.11.1
Ulk4 protein, human EC 2.7.11.1

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

446

Grants

Agency: NIGMS NIH HHS
ID: R25 GM109435
Country: United States
Agency: NCI NIH HHS
ID: U01 CA239106
Country: United States
Agency: NIGMS NIH HHS
ID: R25GM109435
Country: United States

Copyright information

© 2021. The Author(s).

Authors

Liang-Chin Huang (LC)

Institute of Bioinformatics, University of Georgia, 120 Green St., Athens, GA, 30602, USA.

Rahil Taujale (R)

Institute of Bioinformatics, University of Georgia, 120 Green St., Athens, GA, 30602, USA.

Nathan Gravel (N)

PREP@UGA, University of Georgia, 500 D.W. Brooks Drive, Athens, GA, 30602, USA.

Aarya Venkat (A)

Department of Biochemistry and Molecular Biology, University of Georgia, 120 Green St., Athens, GA, 30602, USA.

Wayland Yeung (W)

Institute of Bioinformatics, University of Georgia, 120 Green St., Athens, GA, 30602, USA.

Dominic P Byrne (DP)

Department of Biochemistry and Systems Biology, University of Liverpool, Crown St, Liverpool, UK.

Patrick A Eyers (PA)

Department of Biochemistry and Systems Biology, University of Liverpool, Crown St, Liverpool, UK.

Natarajan Kannan (N)

Institute of Bioinformatics, University of Georgia, 120 Green St., Athens, GA, 30602, USA. nkannan@uga.edu.
Department of Biochemistry and Molecular Biology, University of Georgia, 120 Green St., Athens, GA, 30602, USA. nkannan@uga.edu.
