Authoritative subspecies diagnosis tool for European honey bees based on ancestry informative SNPs.
Apis mellifera, European subspecies
Biodiversity
Conservation
Machine learning
Prediction
Journal
BMC genomics
ISSN: 1471-2164
Abbreviated title: BMC Genomics
Country: England
NLM ID: 100965258
Publication information
Publication date:
03 Feb 2021
History:
received: 29 May 2020
accepted: 08 Jan 2021
entrez: 04 Feb 2021
pubmed: 05 Feb 2021
medline: 15 May 2021
Status:
epublish
Abstract
BACKGROUND
With numerous endemic subspecies representing four of its five evolutionary lineages, Europe holds a large fraction of Apis mellifera genetic diversity. This diversity and the natural distribution range have been altered by anthropogenic factors. The conservation of this natural heritage relies on the availability of accurate tools for subspecies diagnosis. Based on pool-sequence data from 2145 worker bees representing 22 populations sampled across Europe, we employed two highly discriminative approaches (PCA and F
RESULTS
Using a supervised machine learning (ML) approach and a set of 3896 genotyped individuals, we showed that the 4094 selected single nucleotide polymorphisms (SNPs) allow accurate ancestry inference in European honey bees. The best ML model was the Linear Support Vector Classifier (Linear SVC), which correctly assigned most individuals to one of the 14 subspecies or different genetic origins with a mean accuracy of 96.2% ± 0.8 SD. The remaining 3.8% of test individuals were misclassified, most probably because of limited differentiation between subspecies in close geographical proximity, human interference with the genetic integrity of reference subspecies, or a combination of the two.
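The classification step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn and NumPy are available, uses randomly generated genotypes in place of real SNP data, and the group count, sample sizes, SNP count, fold count, and hyperparameters are all hypothetical choices made for illustration. It shows how a linear support vector classifier can be scored with stratified cross-validation and summarized as mean accuracy with a standard deviation.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic stand-in for real genotypes (assumption, not study data):
# 3 hypothetical groups, 60 individuals each, 200 biallelic SNPs
# coded as alternate-allele counts (0, 1 or 2).
n_groups, n_per_group, n_snps = 3, 60, 200

# Each group gets its own allele frequency at every SNP.
freqs = rng.uniform(0.05, 0.95, size=(n_groups, n_snps))

# Draw diploid genotypes: two Bernoulli draws per SNP per individual.
X = np.vstack(
    [rng.binomial(2, freqs[g], size=(n_per_group, n_snps)) for g in range(n_groups)]
)
y = np.repeat(np.arange(n_groups), n_per_group)

# Linear support vector classifier; C and max_iter are illustrative values.
clf = LinearSVC(C=1.0, max_iter=10000)

# Stratified folds keep the group proportions equal in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv, scoring="accuracy")

# Summarize as mean accuracy with the sample standard deviation across folds.
mean_acc = scores.mean()
sd_acc = scores.std(ddof=1)
print(f"accuracy: {mean_acc:.3f} ± {sd_acc:.3f} SD")
```

With real data, `X` would hold one row per genotyped individual and one column per selected SNP, and `y` would hold the reference subspecies labels. The share of misclassified individuals is simply one minus the accuracy.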
CONCLUSIONS
The diagnostic tool presented here will contribute to sustainable conservation and support breeding activities aimed at preserving the genetic heritage of European honey bees.
Identifiers
pubmed: 33535965
doi: 10.1186/s12864-021-07379-7
pii: 10.1186/s12864-021-07379-7
pmc: PMC7860026
Publication types
Journal Article
Languages
eng
Citation subsets
IM
Pagination
101
Grants
Agency: European Commission FP7 KBBE program
ID: 2013.1.3-02, SmartBees Grant Agreement number 613960
Agency: Basque Government
ID: IT1233-19