NCLcomparator: systematically post-screening non-co-linear transcripts (circular, trans-spliced, or fusion RNAs) identified from various detectors.

Keywords: Alignment ambiguity; Circular RNA; Gene fusion; Non-co-linear RNA; RNA-seq; Trans-spliced RNA

Journal

BMC bioinformatics
ISSN: 1471-2105
Abbreviated title: BMC Bioinformatics
Country: England
NLM ID: 100965194

Publication information

Publication date:
03 Jan 2019
History:
received: 22 11 2017
accepted: 21 12 2018
entrez: 5 1 2019
pubmed: 5 1 2019
medline: 14 2 2019
Status: epublish

Abstract

Non-co-linear (NCL) transcripts consist of exonic sequences that are topologically inconsistent with the reference genome, either in an intragenic fashion (circular or intragenic trans-spliced RNAs) or in an intergenic fashion (fusion or intergenic trans-spliced RNAs). On the basis of RNA-seq data, numerous NCL event detectors have been developed and have detected thousands of NCL events in diverse species. However, the identification results differ greatly among detectors, indicating a considerable proportion of false positives in the detected NCL events. Although several helpful guidelines for evaluating the performance of NCL event detectors have been provided, a systematic guideline for assessing the NCL events identified by existing tools has not been available. We developed a software tool, NCLcomparator, for systematically post-screening the intragenic or intergenic NCL events identified by various NCL detectors. NCLcomparator first examines whether the input NCL events are potentially false positives derived from ambiguous alignments (i.e., the NCL events have an alternative co-linear explanation or multiple matches against the reference genome). To evaluate the reliability of the identified NCL events, we define the NCL score (NCL [...] NCLcomparator provides useful guidelines, requiring as input only the NCL events identified by various detectors and the corresponding paired-end RNA-seq data, to help users select potentially high-confidence NCL events for further functional investigation. The software thus helps to facilitate future studies into NCL events, shedding light on the fundamental biology of this important but understudied class of transcripts. NCLcomparator is freely accessible at https://github.com/TreesLab/NCLcomparator .
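
As a rough illustration of the screening idea described in the abstract, the following minimal Python sketch tallies which detectors report each NCL junction and sets aside junctions with an ambiguous alignment (an alternative co-linear explanation or multiple genome matches). It is not NCLcomparator's implementation: the function names, the junction tuple layout, the detector labels, and the alignment evidence in `hits` are all hypothetical, and the alignment step itself is assumed to have been run elsewhere.

```python
from collections import defaultdict


def tally_support(calls):
    """Map each NCL junction to the sorted list of detectors that reported it."""
    support = defaultdict(set)
    for detector, junctions in calls.items():
        for junction in junctions:
            support[junction].add(detector)
    return {junction: sorted(names) for junction, names in support.items()}


def flag_ambiguous(junction_hits):
    """Return junctions with a co-linear match or more than one genome match.

    junction_hits maps a junction to (has_colinear_match, n_genome_matches).
    Both values are assumed to come from an external aligner run (not shown).
    """
    return {
        junction
        for junction, (colinear, n_matches) in junction_hits.items()
        if colinear or n_matches > 1
    }


# Hypothetical junctions: (donor chromosome, donor position,
#                          acceptor chromosome, acceptor position)
calls = {
    "detectorA": {("chr1", 1000, "chr1", 500), ("chr2", 300, "chr5", 900)},
    "detectorB": {("chr1", 1000, "chr1", 500)},
}
# Hypothetical alignment evidence for each junction.
hits = {
    ("chr1", 1000, "chr1", 500): (False, 1),
    ("chr2", 300, "chr5", 900): (False, 3),
}

support = tally_support(calls)
ambiguous = flag_ambiguous(hits)
# Keep only the junctions that were not flagged as ambiguous.
kept = {j: names for j, names in support.items() if j not in ambiguous}
```

In this toy input the intergenic junction is dropped because it matches several loci, while the intragenic junction is kept with both detectors recorded as supporting it.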

Identifiers

pubmed: 30606103
doi: 10.1186/s12859-018-2589-0
pii: 10.1186/s12859-018-2589-0
pmc: PMC6318855

Chemical substances

RNA 63231-63-0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

3

Grants

Agency: Ministry of Science and Technology, Taiwan
ID: MOST 103-2628-B-001-001-MY4
Agency: Ministry of Science and Technology, Taiwan
ID: MOST 107-2311-B-001-046

Authors

Chia-Ying Chen (CY)

Genomics Research Center, Academia Sinica, Taipei, 11529, Taiwan.

Trees-Juen Chuang (TJ)

Genomics Research Center, Academia Sinica, Taipei, 11529, Taiwan. trees@gate.sinica.edu.tw.
