A single-cell RNA-sequencing training and analysis suite using the Galaxy framework.

Keywords: 10x; Galaxy; Web; high-performance computing; resources; scRNA; single-cell; training

Journal

GigaScience
ISSN: 2047-217X
Abbreviated title: Gigascience
Country: United States
NLM ID: 101596872

Publication information

Publication date:
2020-10-20
History:
received: 2020-06-22
revised: 2020-08-30
entrez: 2020-10-20
pubmed: 2020-10-21
medline: 2021-10-26
Status: ppublish

Abstract

BACKGROUND
The vast ecosystem of single-cell RNA-sequencing tools has until recently been plagued by an excess of diverging analysis strategies, inconsistent file formats, and compatibility issues between different software suites. The uptake of 10x Genomics datasets has begun to calm this diversity, and the bioinformatics community leans once more towards the large computing requirements and the statistically driven methods needed to process and understand these ever-growing datasets.
RESULTS
Here we outline several Galaxy workflows and learning resources for single-cell RNA-sequencing, with the aim of providing a comprehensive analysis environment paired with a thorough user learning experience that bridges the knowledge gap between the computational methods and the underlying cell biology. The Galaxy reproducible bioinformatics framework provides tools, workflows, and trainings that not only enable users to perform 1-click 10x preprocessing but also empower them to demultiplex raw sequencing from custom tagged and full-length sequencing protocols. The downstream analysis supports a range of high-quality interoperable suites separated into common stages of analysis: inspection, filtering, normalization, confounder removal, and clustering. The teaching resources cover concepts from computer science to cell biology. Access to all resources is provided at the singlecell.usegalaxy.eu portal.
CONCLUSIONS
The reproducible and training-oriented Galaxy framework provides a sustainable high-performance computing environment for users to run flexible analyses on both 10x and alternative platforms. The tutorials from the Galaxy Training Network along with the frequent training workshops hosted by the Galaxy community provide a means for users to learn, publish, and teach single-cell RNA-sequencing analysis.

Identifiers

pubmed: 33079170
pii: 5931798
doi: 10.1093/gigascience/giaa102
pmc: PMC7574357

Chemical substances

RNA 63231-63-0

Publication types

Journal Article; Research Support, Non-U.S. Gov't

Languages

eng

Citation subsets

IM

Grants

Agency: Biotechnology and Biological Sciences Research Council
ID: BBS/E/T/000PR9817
Country: United Kingdom

Copyright information

© The Author(s) 2020. Published by Oxford University Press GigaScience.

Authors

Mehmet Tekman (M)

Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany.

Bérénice Batut (B)

Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany.

Alexander Ostrovsky (A)

Department of Biology, Johns Hopkins University, Mudd Hall 144, 3400 N. Charles Street, Baltimore, MD 21218, USA.

Christophe Antoniewski (C)

ARTbio, Sorbonne Université, CNRS FR 3631, Inserm US 037, Paris, France.
Institut de Biologie Paris Seine, 9 Quai Saint-Bernard Université Pierre et Marie Curie, Campus Jussieu, Bâtiments A-B-C, 75005 Paris, France.

Dave Clements (D)

Department of Biology, Johns Hopkins University, Mudd Hall 144, 3400 N. Charles Street, Baltimore, MD 21218, USA.

Fidel Ramirez (F)

Boehringer Ingelheim International GmbH, Binger Strasse 173, 55216 Ingelheim am Rhein, Biberach, Germany.

Graham J Etherington (GJ)

Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK.

Hans-Rudolf Hotz (HR)

Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, 4058 Basel, Switzerland.
SIB Swiss Institute of Bioinformatics, Maulbeerstrasse 66, 4058 Basel, Switzerland.

Jelle Scholtalbers (J)

European Molecular Biology Laboratory, Genome Biology Unit, Meyerhofstraße 1, 69117 Heidelberg, Germany.

Jonathan R Manning (JR)

European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK.

Lea Bellenger (L)

ARTbio, Sorbonne Université, CNRS FR 3631, Inserm US 037, Paris, France.

Maria A Doyle (MA)

Research Computing Facility, Peter MacCallum Cancer Centre, Melbourne, 305 Grattan Street, Victoria 3000, Australia.
Sir Peter MacCallum Department of Oncology, The University of Melbourne, Victoria 3010, Australia.

Mohammad Heydarian (M)

Department of Biology, Johns Hopkins University, Mudd Hall 144, 3400 N. Charles Street, Baltimore, MD 21218, USA.

Ni Huang (N)

European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK.
Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.

Nicola Soranzo (N)

Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, UK.

Pablo Moreno (P)

European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK.

Stefan Mautner (S)

Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany.

Irene Papatheodorou (I)

European Molecular Biology Laboratory, European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK.

Anton Nekrutenko (A)

Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA.

James Taylor (J)

Department of Biology, Johns Hopkins University, Mudd Hall 144, 3400 N. Charles Street, Baltimore, MD 21218, USA.

Daniel Blankenberg (D)

Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, 9500 Euclid Avenue, NB21 Cleveland, OH 44195, USA.

Rolf Backofen (R)

Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany.

Björn Grüning (B)

Department of Bioinformatics, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany.
