A Fast and Robust Strategy to Remove Variant-Level Artifacts in Alzheimer Disease Sequencing Project Data.
Journal
Neurology. Genetics
ISSN: 2376-7839
Abbreviated title: Neurol Genet
Country: United States
NLM ID: 101671068
Publication information
Publication date:
Oct 2022
History:
received: 2022-02-02
accepted: 2022-05-31
entrez: 2022-08-15
pubmed: 2022-08-16
medline: 2022-08-16
Status:
epublish
Abstract
Background and Objectives: Exome sequencing (ES) and genome sequencing (GS) are expected to be critical for further elucidating the missing genetic heritability of Alzheimer disease (AD) risk by identifying rare coding and/or noncoding variants that contribute to AD pathogenesis. In the United States, the Alzheimer Disease Sequencing Project (ADSP) has taken a leading role in sequencing AD-related samples at scale, with the resultant data being made publicly available to researchers to generate new insights into the genetic etiology of AD. To achieve sufficient power, the ADSP has adopted a study design where subsets of larger AD cohorts are collected and sequenced across multiple centers, using a variety of sequencing platforms. This approach may lead to variable variant quality across sequencing centers and/or platforms. In this study, we sought to implement and evaluate filters that can be applied quickly to robustly remove variant-level artifacts in the ADSP data.
Methods: We implemented a robust quality control procedure to handle ADSP data. We evaluated this procedure while performing exome-wide and genome-wide association analyses on AD risk using the latest ADSP whole ES (WES) and whole GS (WGS) data releases (NG00067.v5).
Results: We observed that many variants displayed large variation in allele frequencies across sequencing centers/platforms and contributed to spurious association signals with AD risk. We also observed that sequencing platform/center adjustment in association models could not fully account for these spurious signals. To address this issue, we designed and implemented variant filters that could capture and remove these center-specific/platform-specific artifactual variants.
Discussion: We derived a fast and robust approach to filter variants that represent sequencing center-related or platform-related artifacts underlying spurious associations with AD risk in ADSP WES and WGS data. This approach will be important to support future robust genetic association studies on ADSP data, as well as other studies with similar designs.
Identifiers
pubmed: 35966919
doi: 10.1212/NXG.0000000000200012
pii: NNG-2022-200015
pmc: PMC9372872
Publication types
Journal Article
Languages
eng
Pagination
e200012
Grants
Agency: NIA NIH HHS
ID: P30 AG066509
Country: United States
Agency: NIA NIH HHS
ID: P30 AG066515
Country: United States
Copyright information
Copyright © 2022 The Author(s). Published by Wolters Kluwer Health, Inc. on behalf of the American Academy of Neurology.