CapTrap-seq: a platform-agnostic and quantitative approach for high-fidelity full-length RNA sequencing.


Journal

Nature communications
ISSN: 2041-1723
Abbreviated title: Nat Commun
Country: England
NLM ID: 101528555

Publication information

Publication date:
27 Jun 2024
History:
received: 12 07 2023
accepted: 10 06 2024
medline: 28 6 2024
pubmed: 28 6 2024
entrez: 27 6 2024
Status: epublish

Abstract

Long-read RNA sequencing is essential to produce accurate and exhaustive annotation of eukaryotic genomes. Despite advancements in throughput and accuracy, achieving reliable end-to-end identification of RNA transcripts remains a challenge for long-read sequencing methods. To address this limitation, we develop CapTrap-seq, a cDNA library preparation method, which combines the Cap-trapping strategy with oligo(dT) priming to detect 5' capped, full-length transcripts. In our study, we evaluate the performance of CapTrap-seq alongside other widely used RNA-seq library preparation protocols in human and mouse tissues, employing both ONT and PacBio sequencing technologies. To explore the quantitative capabilities of CapTrap-seq and its accuracy in reconstructing full-length RNA molecules, we implement a capping strategy for synthetic RNA spike-in sequences that mimics the natural 5' cap formation. Our benchmarks, incorporating the Long-read RNA-seq Genome Annotation Assessment Project (LRGASP) data, demonstrate that CapTrap-seq is a competitive, platform-agnostic RNA library preparation method for generating full-length transcript sequences.

Identifiers

pubmed: 38937428
doi: 10.1038/s41467-024-49523-3
pii: 10.1038/s41467-024-49523-3

Chemical substances

RNA 63231-63-0
RNA Caps 0

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

5278

Copyright information

© 2024. The Author(s).

Authors

Sílvia Carbonell-Sala (S)

Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain.

Tamara Perteghella (T)

Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain.
Universitat Pompeu Fabra, Barcelona, Catalonia, Spain.

Julien Lagarde (J)

Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain.
Flomics Biotech, SL, Carrer de Roc Boronat 31, 08005, Barcelona, Catalonia, Spain.

Hiromi Nishiyori (H)

Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences (IMS), Yokohama, Kanagawa, Japan.

Emilio Palumbo (E)

Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain.

Carme Arnan (C)

Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain.

Hazuki Takahashi (H)

Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences (IMS), Yokohama, Kanagawa, Japan.

Piero Carninci (P)

Laboratory for Transcriptome Technology, RIKEN Center for Integrative Medical Sciences (IMS), Yokohama, Kanagawa, Japan.
Human Technopole, Milan, Italy.

Barbara Uszczynska-Ratajczak (B)

Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain. barbara.uszczynska@gmail.com.
Department of Computational Biology of Noncoding RNA, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland. barbara.uszczynska@gmail.com.

Roderic Guigó (R)

Centre for Genomic Regulation (CRG), the Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain. roderic.guigo@crg.cat.
Universitat Pompeu Fabra, Barcelona, Catalonia, Spain. roderic.guigo@crg.cat.
