Extensible benchmarking of methods that identify and quantify polyadenylation sites from RNA-seq data.

(alternative) polyadenylation; benchmarking; RNA-seq; bioinformatics; community initiative

Journal

bioRxiv: the preprint server for biology
Abbreviated title: bioRxiv
Country: United States
NLM ID: 101680187

Publication information

Publication date:
26 Jun 2023
History:
pubmed: 10 Jul 2023
medline: 10 Jul 2023
entrez: 10 Jul 2023
Status: epublish

Abstract

The rate at which data are generated and analysis methods emerge makes it increasingly difficult to keep track of the methods' domains of applicability, assumptions, and limitations and, consequently, of the efficacy and precision with which they solve specific tasks. There is therefore an increasing need for benchmarks and for infrastructure that supports continuous method evaluation. APAeval is an international community effort, organized by the RNA Society in 2021, to benchmark tools that identify alternative polyadenylation (APA) sites and quantify their usage from short-read, bulk RNA-sequencing (RNA-seq) data. Here, we reviewed 17 tools and benchmarked eight on their ability to perform APA identification and quantification, using a comprehensive set of RNA-seq experiments comprising real, synthetic, and matched 3'-end sequencing data. To support continuous benchmarking, we have incorporated the results into the OpenEBench online platform, which allows for seamless extension of the set of methods, metrics, and challenges. We envisage that our analyses will assist researchers in selecting appropriate tools for their studies. Furthermore, the containers and reproducible workflows generated in the course of this project can be readily deployed and extended in the future to evaluate new methods or datasets.

Identifiers

pubmed: 37425672
doi: 10.1101/2023.06.23.546284
pmc: PMC10327023

Publication types

Preprint

Languages

eng

Grants

Agency: NHLBI NIH HHS
ID: F31 HL162546
Country: United States
Agency: NIGMS NIH HHS
ID: R01 GM128096
Country: United States
Agency: NIGMS NIH HHS
ID: R01 GM147739
Country: United States
Agency: NLM NIH HHS
ID: R01 LM013437
Country: United States

Comments and corrections

Type: UpdateIn

Authors

Sam Bryce-Smith (S)

UCL Queen Square Motor Neuron Disease Centre, Department of Neuromuscular Diseases, UCL Queen Square Institute of Neurology, UCL, London, UK.

Dominik Burri (D)

Biozentrum, University of Basel, Basel, Switzerland.
Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Matthew R Gazzara (MR)

Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA.

Christina J Herrmann (CJ)

Biozentrum, University of Basel, Basel, Switzerland.
Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Weronika Danecka (W)

Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh, United Kingdom.

Christina M Fitzsimmons (CM)

Laboratory of Cell Biology, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, USA.

Yuk Kei Wan (YK)

Genome Institute of Singapore, Buona Vista, Singapore.
National University of Singapore, Kent Ridge, Singapore.

Farica Zhuang (F)

Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, USA.

Mervin M Fansler (MM)

Tri-Institutional Program in Computational Biology and Medicine, Weill Cornell Graduate Studies, New York, NY, USA.
Cancer Biology and Genetics, Sloan-Kettering Institute, MSKCC, New York, NY, USA.

José M Fernández (JM)

Barcelona Supercomputing Center, Barcelona, Spain.
Spanish National Bioinformatics Institute (INB/ELIXIR-ES).

Meritxell Ferret (M)

Barcelona Supercomputing Center, Barcelona, Spain.
Spanish National Bioinformatics Institute (INB/ELIXIR-ES).

Asier Gonzalez-Uriarte (A)

Barcelona Supercomputing Center, Barcelona, Spain.
Spanish National Bioinformatics Institute (INB/ELIXIR-ES).

Samuel Haynes (S)

Institute for Cell Biology, School of Biological Sciences, The University of Edinburgh, Edinburgh, United Kingdom.

Chelsea Herdman (C)

Department of Neurobiology, University of Utah, Utah, USA.

Alexander Kanitz (A)

Biozentrum, University of Basel, Basel, Switzerland.
Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Maria Katsantoni (M)

Biozentrum, University of Basel, Basel, Switzerland.
Swiss Institute of Bioinformatics, Lausanne, Switzerland.

Federico Marini (F)

Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI) - University Medical Center of the Johannes Gutenberg University Mainz, Germany.

Euan McDonnel (E)

Leeds Institute for Data Analytics, School of Molecular and Cellular Biology, University of Leeds, United Kingdom.

Ben Nicolet (B)

Department of Hematopoiesis, Sanquin Research, Landsteiner Laboratory, Amsterdam UMC, University of Amsterdam, and Oncode Institute, Amsterdam, The Netherlands.

Chi-Lam Poon (CL)

Weill Cornell Medicine, New York, NY, USA.

Gregor Rot (G)

Swiss Institute of Bioinformatics, Lausanne, Switzerland.
Institute of Molecular Life Sciences, Zurich, Switzerland.

Leonard Schärfen (L)

Department of Molecular Biophysics & Biochemistry, Yale University, New Haven CT, USA.

Pin-Jou Wu (PJ)

Center for Plant Molecular Biology (ZMBP), University of Tübingen, Germany.

Yoseop Yoon (Y)

Department of Microbiology and Molecular Genetics, School of Medicine, University of California Irvine, Irvine, California, USA.

Yoseph Barash (Y)

Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, USA.
Department of Computer and Information Science, School of Engineering, University of Pennsylvania, Philadelphia, USA.

Mihaela Zavolan (M)

Biozentrum, University of Basel, Basel, Switzerland.
Swiss Institute of Bioinformatics, Lausanne, Switzerland.

MeSH classifications