Genome annotation with long RNA reads reveals new patterns of gene expression and improves single-cell analyses in an ant brain.

Keywords: 3′ UTR annotation; Alternative splicing; Ants; Genome annotation; Harpegnathos saltator; Iso-Seq; Long-read RNA-seq; Single-cell sequencing

Journal

BMC Biology
ISSN: 1741-7007
Abbreviated title: BMC Biol
Country: England
NLM ID: 101190720

Publication information

Publication date:
2021-11-27
History:
received: 2021-06-03
accepted: 2021-11-10
entrez: 2021-11-28
pubmed: 2021-11-29
medline: 2022-03-17
Status: epublish

Abstract

Functional genomic analyses rely on high-quality genome assemblies and annotations. Highly contiguous genome assemblies have become available for a variety of species, but accurate and complete annotation of gene models, inclusive of alternative splice isoforms and transcription start and termination sites, remains difficult with traditional approaches. Here, we utilized full-length isoform sequencing (Iso-Seq), a long-read RNA sequencing technology, to obtain a comprehensive annotation of the transcriptome of the ant Harpegnathos saltator. The improved genome annotations include additional splice isoforms and extended 3' untranslated regions for more than 4000 genes. Reanalysis of RNA-seq experiments using these annotations revealed several genes with caste-specific differential expression and tissue- or caste-specific splicing patterns that were missed in previous analyses. The extended 3' untranslated regions afforded great improvements in the analysis of existing single-cell RNA-seq data, resulting in the recovery of the transcriptomes of 18% more cells. The deeper single-cell transcriptomes obtained with these new annotations allowed us to identify additional markers for several cell types in the ant brain, as well as genes differentially expressed across castes in specific cell types. Our results demonstrate that Iso-Seq is an efficient and effective approach to improve genome annotations and maximize the amount of information that can be obtained from existing and future genomic datasets in Harpegnathos and other organisms.

Identifiers

pubmed: 34838024
doi: 10.1186/s12915-021-01188-w
pii: 10.1186/s12915-021-01188-w
pmc: PMC8626913

Chemical substances

3' Untranslated Regions 0

Publication types

Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Pagination

254

Grants

Agency: NIMH NIH HHS
ID: DP2MH107055
Country: United States
Agency: NIA NIH HHS
ID: R01 AG071818
Country: United States
Agency: NIMH NIH HHS
ID: R21 MH123841
Country: United States
Agency: NIA NIH HHS
ID: R01 AG058762
Country: United States

Copyright information

© 2021. The Author(s).

Authors

Emily J Shields (EJ)

Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA.
Department of Urology and Institute of Neuropathology, Medical Center-University of Freiburg, Faculty of Medicine, University of Freiburg, Freiburg, Germany.

Masato Sorida (M)

Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA.

Lihong Sheng (L)

Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA.

Bogdan Sieriebriennikov (B)

Department of Biology, New York University, New York, NY, USA.
Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY, USA.

Long Ding (L)

Department of Biology, New York University, New York, NY, USA.

Roberto Bonasio (R)

Epigenetics Institute and Department of Cell and Developmental Biology, University of Pennsylvania Perelman School of Medicine, Philadelphia, PA, USA. roberto@bonasiolab.org.
