Quality control recommendations for RNASeq using FFPE samples based on pre-sequencing lab metrics and post-sequencing bioinformatics metrics.


Journal

BMC medical genomics
ISSN: 1755-8794
Abbreviated title: BMC Med Genomics
Country: England
NLM ID: 101319628

Publication information

Publication date:
2022-09-16
History:
received: 2022-03-07
accepted: 2022-09-12
entrez: 2022-09-16
pubmed: 2022-09-17
medline: 2022-09-21
Status: epublish

Abstract

BACKGROUND
Formalin-fixed, paraffin-embedded (FFPE) tissues have many advantages for identification of risk biomarkers, including wide availability and potential for extended follow-up endpoints. However, RNA derived from archival FFPE samples has limited quality. Here we identified parameters that determine which FFPE samples have the potential for successful RNA extraction, library preparation, and generation of usable RNA-seq data.
METHODS
We optimized library preparation protocols designed for use with FFPE samples using seven FFPE and fresh-frozen replicate pairs, and tested the optimized protocols on a study set of 130 FFPE biopsies from women with benign breast disease. Metrics from RNA extraction and library preparation procedures were collected and compared with bioinformatics sequencing summary statistics. Finally, a decision tree model was built to learn the relationship between pre-sequencing lab metrics and QC pass/fail status as determined by bioinformatics metrics.
RESULTS
Samples that failed bioinformatics QC tended to have a low median sample-wise correlation within the cohort (Spearman correlation < 0.75), a low number of reads mapped to gene regions (< 25 million), or a low number of detectable genes (fewer than 11,400 genes detected with TPM > 4). The median RNA concentration and pre-capture library Qubit values for QC-failed samples were 18.9 ng/ul and 2.08 ng/ul, respectively, which were significantly lower than those of QC-passed samples (40.8 ng/ul and 5.82 ng/ul). We built a decision tree model based on input RNA concentration and input library Qubit values, which achieved an F score of 0.848 in predicting the QC status (pass/fail) of FFPE samples.
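The three bioinformatics criteria above can be sketched as a simple per-sample check. This is a minimal illustration, not the authors' pipeline: the abstract only says that failed samples "tended to" fall below these values, so using them as hard cut-offs is an assumption, as is reading the gene-count criterion as "fewer than 11,400". The names `SampleMetrics`, `count_detected_genes`, and `qc_failure_reasons` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Values quoted in the abstract; treating them as hard cut-offs is an assumption.
MIN_MEDIAN_SPEARMAN = 0.75          # median sample-wise Spearman correlation
MIN_GENE_MAPPED_READS = 25_000_000  # reads mapped to gene regions
MIN_DETECTED_GENES = 11_400         # genes with TPM above the detection cut-off
TPM_DETECTION_CUTOFF = 4.0


@dataclass
class SampleMetrics:
    """Post-sequencing summary metrics for one sample (hypothetical container)."""
    median_spearman: float
    gene_mapped_reads: int
    tpm_values: Sequence[float]


def count_detected_genes(tpm_values: Sequence[float]) -> int:
    """Count genes whose TPM is strictly above the detection cut-off."""
    return sum(1 for tpm in tpm_values if tpm > TPM_DETECTION_CUTOFF)


def qc_failure_reasons(metrics: SampleMetrics) -> List[str]:
    """Return the criteria a sample misses; an empty list means QC pass."""
    reasons = []
    if metrics.median_spearman < MIN_MEDIAN_SPEARMAN:
        reasons.append("low_correlation")
    if metrics.gene_mapped_reads < MIN_GENE_MAPPED_READS:
        reasons.append("low_gene_mapped_reads")
    if count_detected_genes(metrics.tpm_values) < MIN_DETECTED_GENES:
        reasons.append("low_detected_genes")
    return reasons


# Made-up example samples, for illustration only.
good_sample = SampleMetrics(0.90, 30_000_000, [5.0] * 12_000 + [1.0] * 8_000)
poor_sample = SampleMetrics(0.60, 10_000_000, [5.0] * 100)
```

Returning the list of missed criteria, rather than a single boolean, keeps the reason for each failure visible when reviewing a cohort.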
CONCLUSIONS
We provide a bioinformatics quality control recommendation for FFPE samples from breast tissue by evaluating bioinformatic and sample metrics. Our results suggest a minimum concentration of 25 ng/ul FFPE-extracted RNA for library preparation and 1.7 ng/ul pre-capture library output to achieve adequate RNA-seq data for downstream bioinformatics analysis.
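The recommended minima can be turned into a pre-sequencing screen and scored against bioinformatics QC labels, as in the sketch below. It is a two-threshold rule built from the numbers in the conclusions, not the fitted decision tree from the study, whose structure the abstract does not give. The F score is assumed here to be the F1 score, and the function names and example values are hypothetical.

```python
from typing import Sequence

# Minimum values suggested in the conclusions (ng/ul).
MIN_RNA_NG_PER_UL = 25.0      # FFPE-extracted RNA going into library preparation
MIN_LIBRARY_NG_PER_UL = 1.7   # pre-capture library output (Qubit)


def predicted_pass(rna_ng_per_ul: float, library_ng_per_ul: float) -> bool:
    """Predict QC pass when both lab metrics meet the suggested minima."""
    return (rna_ng_per_ul >= MIN_RNA_NG_PER_UL
            and library_ng_per_ul >= MIN_LIBRARY_NG_PER_UL)


def f1_score(actual: Sequence[bool], predicted: Sequence[bool]) -> float:
    """F1 = harmonic mean of precision and recall for the 'pass' class."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(1 for a, p in pairs if a and p)
    false_pos = sum(1 for a, p in pairs if not a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Made-up (RNA, library) measurements and made-up QC labels, for illustration.
lab_metrics = [(40.8, 5.82), (18.9, 2.08), (30.0, 1.0), (26.0, 2.0)]
actual_pass = [True, False, True, True]
predictions = [predicted_pass(rna, lib) for rna, lib in lab_metrics]
score = f1_score(actual_pass, predictions)
```

In practice the RNA check would be applied before library preparation and the Qubit check afterwards, so that samples unlikely to yield usable data are dropped at the earliest possible step.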

Identifiers

pubmed: 36114500
doi: 10.1186/s12920-022-01355-0
pii: 10.1186/s12920-022-01355-0
pmc: PMC9479231

Chemical substances

Biomarkers 0
Formaldehyde 1HG84L3525
RNA 63231-63-0

Publication types

Journal Article
Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Pagination

195

Grants

Agency: NCI NIH HHS
ID: P30 CA015083
Country: United States
Agency: NCI NIH HHS
ID: P50 CA116201
Country: United States
Agency: NCI NIH HHS
ID: R01 CA187112
Country: United States

Copyright information

© 2022. The Author(s).

Authors

Yuanhang Liu (Y)

Department of Quantitative Health Sciences, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA.

Aditya Bhagwate (A)

Department of Quantitative Health Sciences, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA.

Stacey J Winham (SJ)

Department of Quantitative Health Sciences, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA.

Melissa T Stephens (MT)

Genomics and Bioinformatics Core Facility, 019 Galvin Life Sciences Center, University of Notre Dame, Notre Dame, IN, 46556, USA.

Brent W Harker (BW)

Genomics and Bioinformatics Core Facility, 019 Galvin Life Sciences Center, University of Notre Dame, Notre Dame, IN, 46556, USA.

Samantha J McDonough (SJ)

Department of Laboratory Medicine and Pathology, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA.

Melody L Stallings-Mann (ML)

Department of Neuroscience, Mayo Clinic, 4500 San Pablo Road, Jacksonville, FL, 32224, USA.

Ethan P Heinzen (EP)

Department of Quantitative Health Sciences, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA.

Robert A Vierkant (RA)

Department of Quantitative Health Sciences, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA.

Tanya L Hoskin (TL)

Department of Quantitative Health Sciences, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA.

Marlene H Frost (MH)

Department of Medical Oncology, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA.

Jodi M Carter (JM)

Department of Laboratory Medicine and Pathology, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA.

Michael E Pfrender (ME)

Department of Biological Sciences, 109B Galvin Life Science Center, University of Notre Dame, Notre Dame, IN, 46556, USA.

Laurie Littlepage (L)

Department of Chemistry and Biochemistry, Harper Cancer Research Center, University of Notre Dame, Notre Dame, IN, 46556, USA.

Derek C Radisky (DC)

Department of Cancer Biology, Mayo Clinic, 4500 San Pablo Road, Jacksonville, FL, 32224, USA.

Julie M Cunningham (JM)

Department of Laboratory Medicine and Pathology, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA.

Amy C Degnim (AC)

Department of Surgery, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA.

Chen Wang (C)

Department of Quantitative Health Sciences, Mayo Clinic, 200 1st Street SW, Rochester, MN, 55905, USA. Email: wang.chen@mayo.edu.
