Pan-conserved segment tags identify ultra-conserved sequences across assemblies in the human pangenome.
k-mer
pan-conserved segment
pangenome
reference genome
structural polymorphism
structural variations
Journal
Cell reports methods
ISSN: 2667-2375
Abbreviated title: Cell Rep Methods
Country: United States
NLM ID: 9918227360606676
Publication information
Publication date:
28 08 2023
History:
received: 17 11 2022
revised: 14 04 2023
accepted: 06 07 2023
medline: 7 9 2023
pubmed: 6 9 2023
entrez: 6 9 2023
Status:
epublish
Abstract
The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. We employed a k-mer indexing strategy for comparative analysis across multiple assemblies, including the pangenome reference, GRCh38, and CHM13, a telomere-to-telomere reference assembly. Our k-mer indexing approach enabled us to identify a valuable collection of universally conserved sequences across all assemblies, referred to as "pan-conserved segment tags" (PSTs). By examining intervals between these segments, we discerned highly conserved genomic segments and those with structurally related polymorphisms. We found 60,764 polymorphic intervals with unique geo-ethnic features in the pangenome reference. In this study, we utilized ultra-conserved sequences (PSTs) to forge a link between human pangenome assemblies and reference genomes. This methodology enables the examination of any sequence of interest within the pangenome, using the reference genome as a comparative framework.
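The k-mer idea described in the abstract can be illustrated with a toy sketch. The abstract does not give the value of k, the index structure, or the filtering rules, so everything below is an assumption made for illustration: the function names (`unique_kmers`, `conserved_tags`, `polymorphic_intervals`), the choice of k = 8, and the two short toy sequences are all hypothetical. The sketch treats a tag as a k-mer that occurs exactly once in every assembly, then reports neighbouring tag pairs whose spacing differs between assemblies. It ignores reverse complements and rearrangements, and a real genome would need a dedicated k-mer counter rather than in-memory dictionaries.

```python
from collections import Counter


def unique_kmers(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start position."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return {kmer: i for i, kmer in enumerate(kmers) if counts[kmer] == 1}


def conserved_tags(assemblies, k):
    """For each assembly, give positions of k-mers that are unique in every assembly."""
    index = {name: unique_kmers(seq, k) for name, seq in assemblies.items()}
    shared = set.intersection(*(set(pos) for pos in index.values()))
    return {name: {kmer: pos[kmer] for kmer in shared} for name, pos in index.items()}


def polymorphic_intervals(tag_positions, ref_name):
    """List adjacent tag pairs (ordered along ref_name) whose spacing varies."""
    ref = tag_positions[ref_name]
    ordered = sorted(ref, key=ref.get)  # tags sorted by reference coordinate
    out = []
    for left, right in zip(ordered, ordered[1:]):
        spacing = {name: pos[right] - pos[left] for name, pos in tag_positions.items()}
        if len(set(spacing.values())) > 1:  # spacing differs in some assembly
            out.append((left, right, spacing))
    return out


# Hypothetical toy data: "hap1" carries a 2-bp insertion relative to "ref".
toy = {
    "ref": "ACGTTGCA" + "GG" + "TTAGCCAT",
    "hap1": "ACGTTGCA" + "GGGG" + "TTAGCCAT",
}
tags = conserved_tags(toy, k=8)
variable = polymorphic_intervals(tags, "ref")
for left, right, spacing in variable:
    print(left, right, spacing)
```

In this toy case the tags flanking the insertion keep their order, but their spacing differs between the two sequences, which is the kind of signal the abstract attributes to intervals between conserved segments. Anchoring the order on a single reference is a simplification chosen here for brevity.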
Identifiers
pubmed: 37671027
doi: 10.1016/j.crmeth.2023.100543
pii: S2667-2375(23)00180-7
pmc: PMC10475782
Publication types
Journal Article
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
100543
Grants
Agency: NIMH NIH HHS
ID: K01 MH129758
Country: United States
Investigators
Wen-Wei Liao
(WW)
Mobin Asri
(M)
Jana Ebler
(J)
Daniel Doerr
(D)
Marina Haukness
(M)
Glenn Hickey
(G)
Shuangjia Lu
(S)
Julian K Lucas
(JK)
Jean Monlong
(J)
Haley J Abel
(HJ)
Silvia Buonaiuto
(S)
Xian H Chang
(XH)
Haoyu Cheng
(H)
Justin Chu
(J)
Vincenza Colonna
(V)
Jordan M Eizenga
(JM)
Xiaowen Feng
(X)
Christian Fischer
(C)
Robert S Fulton
(RS)
Shilpa Garg
(S)
Cristian Groza
(C)
Andrea Guarracino
(A)
William T Harvey
(WT)
Simon Heumos
(S)
Kerstin Howe
(K)
Miten Jain
(M)
Tsung-Yu Lu
(TY)
Charles Markello
(C)
Fergal J Martin
(FJ)
Matthew W Mitchell
(MW)
Katherine M Munson
(KM)
Moses Njagi Mwaniki
(MN)
Adam M Novak
(AM)
Hugh E Olsen
(HE)
Trevor Pesout
(T)
David Porubsky
(D)
Pjotr Prins
(P)
Jonas A Sibbesen
(JA)
Chad Tomlinson
(C)
Flavia Villani
(F)
Mitchell R Vollger
(MR)
Lucinda L Antonacci-Fulton
(LL)
Gunjan Baid
(G)
Carl A Baker
(CA)
Anastasiya Belyaeva
(A)
Konstantinos Billis
(K)
Andrew Carroll
(A)
Pi-Chuan Chang
(PC)
Sarah Cody
(S)
Daniel E Cook
(DE)
Omar E Cornejo
(OE)
Mark Diekhans
(M)
Peter Ebert
(P)
Susan Fairley
(S)
Olivier Fedrigo
(O)
Adam L Felsenfeld
(AL)
Giulio Formenti
(G)
Adam Frankish
(A)
Yan Gao
(Y)
Carlos Garcia Giron
(CG)
Richard E Green
(RE)
Leanne Haggerty
(L)
Kendra Hoekzema
(K)
Thibaut Hourlier
(T)
Hanlee P Ji
(HP)
Alexey Kolesnikov
(A)
Jan O Korbel
(JO)
Jennifer Kordosky
(J)
HoJoon Lee
(H)
Alexandra P Lewis
(AP)
Hugo Magalhães
(H)
Santiago Marco-Sola
(S)
Pierre Marijon
(P)
Jennifer McDaniel
(J)
Jacquelyn Mountcastle
(J)
Maria Nattestad
(M)
Nathan D Olson
(ND)
Daniela Puiu
(D)
Allison A Regier
(AA)
Arang Rhie
(A)
Samuel Sacco
(S)
Ashley D Sanders
(AD)
Valerie A Schneider
(VA)
Baergen I Schultz
(BI)
Kishwar Shafin
(K)
Jouni Sirén
(J)
Michael W Smith
(MW)
Heidi J Sofia
(HJ)
Ahmad N Abou Tayoun
(AN)
Françoise Thibaud-Nissen
(F)
Francesca Floriana Tricomi
(FF)
Justin Wagner
(J)
Jonathan Md Wood
(JM)
Aleksey V Zimin
(AV)
Alice B Popejoy
(AB)
Guillaume Bourque
(G)
Mark Jp Chaisson
(MJ)
Paul Flicek
(P)
Adam M Phillippy
(AM)
Justin M Zook
(JM)
Evan E Eichler
(EE)
David Haussler
(D)
Erich D Jarvis
(ED)
Karen H Miga
(KH)
Ting Wang
(T)
Erik Garrison
(E)
Tobias Marschall
(T)
Ira Hall
(I)
Heng Li
(H)
Benedict Paten
(B)
Copyright information
© 2023 The Authors.
Conflict of interest statement
The authors declare no competing interests.