Before and After: Comparison of Legacy and Harmonized TCGA Genomic Data Commons' Data.
DNA methylation
The Cancer Genome Atlas
human reference genome
mRNA expression
microRNA expression
quality control
somatic copy number alteration
somatic mutation
Journal
Cell systems
ISSN: 2405-4720
Abbreviated title: Cell Syst
Country: United States
NLM ID: 101656080
Publication information
Publication date:
24 07 2019
History:
received: 19 01 2019
revised: 18 03 2019
accepted: 13 06 2019
entrez: 26 7 2019
pubmed: 26 7 2019
medline: 31 7 2020
Status:
ppublish
Abstract
We present a systematic analysis of the effects of synchronizing a large-scale, deeply characterized, multi-omic dataset to the current human reference genome, using updated software, pipelines, and annotations. For each of 5 molecular data platforms in The Cancer Genome Atlas (TCGA), namely mRNA and miRNA expression, single nucleotide variants, DNA methylation, and copy number alterations, comprehensive sample-, gene-, and probe-level studies were performed to quantify the degree of similarity between the 'legacy' GRCh37 (hg19) TCGA data and its GRCh38 (hg38) version as 'harmonized' by the Genomic Data Commons. We offer gene lists to elucidate differences that remained after controlling for confounders, and strategies to mitigate their impact on biological interpretation. Our results demonstrate that the hg19 and hg38 TCGA datasets are very highly concordant, promote informed use of either legacy or harmonized omics data, and provide a rubric that encourages similar comparisons as new data emerge and reference data evolve.
Identifiers
pubmed: 31344359
pii: S2405-4712(19)30201-7
doi: 10.1016/j.cels.2019.06.006
pmc: PMC6707074
mid: NIHMS1535521
Chemical substances
MicroRNAs
0
Publication types
Comparative Study
Journal Article
Research Support, N.I.H., Extramural
Languages
eng
Citation subsets
IM
Pagination
24-34.e10
Grants
Agency: NCI NIH HHS
ID: U24 CA210978
Country: United States
Agency: NCI NIH HHS
ID: U24 CA210950
Country: United States
Agency: NCI NIH HHS
ID: U24 CA210974
Country: United States
Agency: NCI NIH HHS
ID: U24 CA210989
Country: United States
Agency: NCI NIH HHS
ID: U24 CA210952
Country: United States
Agency: NCI NIH HHS
ID: U24 CA210957
Country: United States
Agency: NCI NIH HHS
ID: R01 CA175486
Country: United States
Agency: NCI NIH HHS
ID: U24 CA210949
Country: United States
Agency: NCI NIH HHS
ID: U24 CA209851
Country: United States
Agency: NCI NIH HHS
ID: U24 CA210990
Country: United States
Agency: NCI NIH HHS
ID: U24 CA211000
Country: United States
Agency: NIEHS NIH HHS
ID: P30 ES010126
Country: United States
Agency: NCI NIH HHS
ID: P30 CA016672
Country: United States
Agency: NCI NIH HHS
ID: U24 CA210969
Country: United States
Agency: NCI NIH HHS
ID: U24 CA210988
Country: United States
Agency: NCI NIH HHS
ID: U24 CA143883
Country: United States
Agency: NCI NIH HHS
ID: U24 CA211006
Country: United States
Agency: NCI NIH HHS
ID: U24 CA210999
Country: United States
Copyright information
Copyright © 2019 The Authors. Published by Elsevier Inc. All rights reserved.