Legacy Data Confound Genomics Studies.

batch effect; imputation; mutational signature; population genetics; reference cohorts; statistical genetics

Journal

Molecular biology and evolution
ISSN: 1537-1719
Abbreviated title: Mol Biol Evol
Country: United States
NLM ID: 8501455

Publication information

Publication date:
01 Jan 2020
History:
pubmed: 11 Sep 2019
medline: 6 Feb 2020
entrez: 11 Sep 2019
Status: ppublish

Abstract

Recent reports have identified differences in the mutational spectra across human populations. Although some of these reports have been replicated in other cohorts, most have been reported only in the 1000 Genomes Project (1kGP) data. While investigating an intriguing putative population stratification within the Japanese population, we identified a previously unreported batch effect leading to spurious mutation calls in the 1kGP data and to the apparent population stratification. Because the 1kGP data are used extensively, we find that the batch effects also lead to incorrect imputation by leading imputation servers and a small number of suspicious GWAS associations. Lower quality data from the early phases of the 1kGP thus continue to contaminate modern studies in hidden ways. It may be time to retire or upgrade such legacy sequencing data.

Identifiers

pubmed: 31504792
pii: 5556817
doi: 10.1093/molbev/msz201

Publication types

Journal Article Review

Languages

eng

Citation subsets

IM

Pagination

2-10

Copyright information

© The Author(s) 2019. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

Authors

Luke Anderson-Trocmé (L)

Department of Human Genetics, McGill University, Montreal, QC, Canada.
McGill University and Genome Quebec Innovation Centre, Montreal, QC, Canada.

Rick Farouni (R)

Department of Human Genetics, McGill University, Montreal, QC, Canada.
McGill University and Genome Quebec Innovation Centre, Montreal, QC, Canada.

Mathieu Bourgey (M)

Department of Human Genetics, McGill University, Montreal, QC, Canada.
McGill University and Genome Quebec Innovation Centre, Montreal, QC, Canada.

Yoichiro Kamatani (Y)

Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan.

Koichiro Higasa (K)

Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan.

Jeong-Sun Seo (JS)

Bioinformatics Institute, Macrogen Inc, Seoul, Republic of Korea.
Precision Medicine Center, Seoul National University Bundang Hospital, Seongnam, Republic of Korea.

Changhoon Kim (C)

Bioinformatics Institute, Macrogen Inc, Seoul, Republic of Korea.

Fumihiko Matsuda (F)

Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan.

Simon Gravel (S)

Department of Human Genetics, McGill University, Montreal, QC, Canada.
McGill University and Genome Quebec Innovation Centre, Montreal, QC, Canada.
