Chromosome-length genome assembly and structural variations of the primal Basenji dog (Canis lupus familiaris) genome.

Keywords: Artificial selection; Canine genome; Comparative genomics; Domestication

Journal

BMC Genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258

Publication information

Publication date:
16 Mar 2021
History:
received: 09 Dec 2020
accepted: 28 Feb 2021
entrez: 17 Mar 2021
pubmed: 18 Mar 2021
medline: 20 May 2021
Status: epublish

Abstract

Basenjis are considered an ancient dog breed of central African origin that still live and hunt with tribesmen in the African Congo. Nicknamed the barkless dog, Basenjis possess unique phylogeny, geographical origins and traits, making their genome structure of great interest. The increasing number of available canid reference genomes allows us to examine the impact of reference genome choice with regard to reference genome quality and breed relatedness. Here, we report two high-quality de novo Basenji genome assemblies: a female, China (CanFam_Bas), and a male, Wags. We conduct pairwise comparisons and report structural variations between assembled genomes of three dog breeds: Basenji (CanFam_Bas), Boxer (CanFam3.1) and German Shepherd Dog (GSD) (CanFam_GSD). CanFam_Bas is superior to CanFam3.1 in terms of genome contiguity and comparable overall to the high-quality CanFam_GSD assembly. By aligning short read data from 58 representative dog breeds to three reference genomes, we demonstrate how the choice of reference genome significantly impacts both read mapping and variant detection. The growing number of high-quality canid reference genomes means the choice of reference genome is an increasingly critical decision in subsequent canid variant analyses. The basal position of the Basenji makes it suitable for variant analysis for targeted applications of specific dog breeds. However, we believe more comprehensive analyses across the entire family of canids are better suited to a pangenome approach. Collectively, this work highlights the importance of reference genome choice in all variation studies.

Identifiers

pubmed: 33726677
doi: 10.1186/s12864-021-07493-6
pii: 10.1186/s12864-021-07493-6
pmc: PMC7962210

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

188

Grants

Agency: NHGRI NIH HHS
ID: UM1 HG009375
Country: United States
Agency: NIH HHS
ID: UM1HG009375
Country: United States

Authors

Richard J Edwards (RJ)

School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia.

Matt A Field (MA)

Centre for Tropical Bioinformatics and Molecular Biology, Australian Institute of Tropical Health and Medicine, James Cook University, Cairns, QLD, 4878, Australia.
John Curtin School of Medical Research, Australian National University, Canberra, ACT, 2600, Australia.

James M Ferguson (JM)

Kinghorn Center for Clinical Genomics, Garvan Institute of Medical Research, Victoria Street, Darlinghurst, NSW, 2010, Australia.

Olga Dudchenko (O)

The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
Department of Computer Science, Rice University, Houston, TX, USA.
Center for Theoretical and Biological Physics, Rice University, Houston, TX, USA.

Jens Keilwagen (J)

Julius Kühn-Institut, Erwin-Baur-Str. 27, 06484 Quedlinburg, Germany.

Benjamin D Rosen (BD)

Animal Genomics and Improvement Laboratory, Agricultural Research Service USDA, Beltsville, MD, 20705, USA.

Gary S Johnson (GS)

Department of Veterinary Pathobiology, University of Missouri, Columbia, MO, 65211, USA.

Edward S Rice (ES)

Department of Surgery, University of Missouri, Columbia, MO, 65211, USA.

La Deanna Hillier (D)

Genome Sciences, University of Washington, Seattle, WA, 98195, USA.

Jillian M Hammond (JM)

Kinghorn Center for Clinical Genomics, Garvan Institute of Medical Research, Victoria Street, Darlinghurst, NSW, 2010, Australia.

Samuel G Towarnicki (SG)

School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia.

Arina Omer (A)

The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
Department of Computer Science, Rice University, Houston, TX, USA.

Ruqayya Khan (R)

The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA.
Department of Computer Science, Rice University, Houston, TX, USA.

Ksenia Skvortsova (K)

Genomics and Epigenetics Division, Garvan Institute of Medical Research, Victoria Street, Darlinghurst, NSW, 2010, Australia.
St Vincent's Clinical School, Faculty of Medicine, University of New South Wales, Sydney, NSW, 2010, Australia.

Ozren Bogdanovic (O)

School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, NSW, 2052, Australia.
Genomics and Epigenetics Division, Garvan Institute of Medical Research, Victoria Street, Darlinghurst, NSW, 2010, Australia.

Robert A Zammit (RA)

Vineyard Veterinary Hospital, 703 Windsor Rd, Vineyard, NSW, 2765, Australia.

Erez Lieberman Aiden (EL)

The Center for Genome Architecture, Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA. erez@erez.com.
Department of Computer Science, Rice University, Houston, TX, USA. erez@erez.com.
Center for Theoretical and Biological Physics, Rice University, Houston, TX, USA. erez@erez.com.
Faculty of Science, UWA School of Agriculture and Environment, University of Western Australia, Perth, WA, 6009, Australia. erez@erez.com.
Shanghai Institute for Advanced Immunochemical Studies, ShanghaiTech University, Shanghai, China. erez@erez.com.

Wesley C Warren (WC)

Department of Animal Sciences, University of Missouri, Columbia, MO, 65211, USA. warrenwc@missouri.edu.

J William O Ballard (JWO)

Department of Ecology, Environment and Evolution, La Trobe University, Melbourne, Victoria, 3086, Australia. jwoballard@gmail.com.
School of Biosciences, University of Melbourne, Parkville, Victoria, 3052, Australia. jwoballard@gmail.com.
