Getting insight into the pan-genome structure with PangTree.
Affinity tree
Multiple genome alignment
Pan-genome
Journal
BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
16 Apr 2020
History:
entrez: 18 Apr 2020
pubmed: 18 Apr 2020
medline: 14 Jan 2021
Status:
epublish
Abstract
BACKGROUND
The term pan-genome was proposed to denote collections of genomic sequences that are jointly analyzed or used as a reference. The constant growth of genomic data intensifies the development of data structures and algorithms for investigating pan-genomes efficiently.
RESULTS
This work focuses on providing a tool for discovering and visualizing the relationships between the sequences constituting a pan-genome. A new structure for representing such relationships, called an affinity tree, is proposed. Each node of this tree is assigned a subset of genomes, together with their homogeneity level and an averaged consensus sequence. Moreover, the subsets assigned to sibling nodes form a partition of the genomes assigned to their parent.
CONCLUSIONS
The functionality of the affinity tree is demonstrated on simulated data and on the Ebola virus pan-genome. Furthermore, two software packages are provided: PangTreeBuild constructs the affinity tree, while PangTreeVis presents its result.
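The node contents and the sibling-partition property described in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the PangTreeBuild implementation: the class name `AffinityNode`, its field names, the helper `children_partition_parent`, and the toy genomes, sequences and homogeneity values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass
class AffinityNode:
    """One node of an affinity tree (hypothetical field names)."""
    genomes: FrozenSet[str]   # subset of genome identifiers assigned to this node
    homogeneity: float        # homogeneity level of that subset (scale is an assumption)
    consensus: str            # averaged consensus sequence for the subset
    children: List["AffinityNode"] = field(default_factory=list)


def children_partition_parent(node: AffinityNode) -> bool:
    """Return True if, at every internal node, the children's genome subsets
    are non-empty, pairwise disjoint, and together equal the parent's subset."""
    if not node.children:
        return True  # a leaf has nothing to check
    seen = set()
    for child in node.children:
        # Reject empty subsets and subsets overlapping an earlier sibling.
        if not child.genomes or seen & child.genomes:
            return False
        seen |= child.genomes
    if seen != node.genomes:
        return False  # siblings do not cover the parent exactly
    return all(children_partition_parent(child) for child in node.children)


# Toy tree: three made-up genomes split into two sibling groups.
root = AffinityNode(
    genomes=frozenset({"g1", "g2", "g3"}),
    homogeneity=0.80,
    consensus="ACGTAC",
    children=[
        AffinityNode(frozenset({"g1", "g2"}), 0.95, "ACGTAC"),
        AffinityNode(frozenset({"g3"}), 1.00, "ACGAAC"),
    ],
)
print(children_partition_parent(root))
```

A tree in which a child omits one of its parent's genomes, or in which two siblings share a genome, fails this check.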
Identifiers
pubmed: 32299360
doi: 10.1186/s12864-020-6610-4
pii: 10.1186/s12864-020-6610-4
pmc: PMC7161101
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
274