Reliable genotyping of recombinant genomes using a robust hidden Markov model.
Journal
Plant physiology
ISSN: 1532-2548
Abbreviated title: Plant Physiol
Country: United States
NLM ID: 0401224
Publication information
Publication date:
31 05 2023
History:
received: 19 04 2022
accepted: 27 01 2023
pmc-release: 22 03 2024
medline: 2 6 2023
pubmed: 23 3 2023
entrez: 22 3 2023
Status:
ppublish
Abstract
Meiotic recombination is an essential mechanism during sexual reproduction and includes the exchange of chromosome segments between homologous chromosomes. New allelic combinations are transmitted to the new generation, introducing novel genetic variation in the offspring genomes. With the improvement of high-throughput whole-genome sequencing technologies, large numbers of recombinant individuals can now be sequenced with low sequencing depth at low costs, necessitating computational methods for reconstructing their haplotypes. The main challenge is the uncertainty in haplotype calling that arises from the low information content of a single genomic position. Straightforward sliding window-based approaches are difficult to tune and fail to place recombination breakpoints precisely. Hidden Markov model (HMM)-based approaches, on the other hand, tend to over-segment the genome. Here, we present RTIGER, an HMM-based model that exploits in a mathematically precise way the fact that true chromosome segments typically have a certain minimum length. We further separate the task of identifying the correct haplotype sequence from the accurate placement of haplotype borders, thereby maximizing the accuracy of border positions. By comparing segmentations based on simulated data with known underlying haplotypes, we highlight the reasons for RTIGER outperforming traditional segmentation approaches. We then analyze the meiotic recombination pattern of segregants of 2 Arabidopsis (Arabidopsis thaliana) accessions and a previously described hyper-recombining mutant. RTIGER is available as an R package with an efficient Julia implementation of the core algorithm.
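The minimum-segment-length idea in the abstract can be illustrated with a small sketch. This is not RTIGER's implementation, which is not described in this record. It is a generic state-expansion trick for Viterbi decoding, under stated assumptions: each haplotype state is copied into a chain of `min_len` sub-states, so a path can only switch haplotype after spending at least `min_len` positions in the current one. The function names (`min_length_viterbi`, `noisy_call_logp`), the per-position error rate, and the switch probability are all hypothetical choices made for illustration.

```python
import math

NEG_INF = float("-inf")


def noisy_call_logp(state, obs, error=0.1):
    """Assumed toy emission: the observed call equals the true state
    with probability 1 - error."""
    return math.log(1.0 - error) if state == obs else math.log(error)


def min_length_viterbi(obs, n_states, emit_logp, min_len, p_switch=0.01):
    """Most likely state path in which every segment spans >= min_len positions.

    Expanded state (h, k): haplotype h, with k + 1 positions already spent in
    the current segment (k is capped at min_len - 1).
    """
    if min_len < 1 or len(obs) < min_len:
        raise ValueError("need min_len >= 1 and len(obs) >= min_len")
    last = min_len - 1
    log_stay = math.log(1.0 - p_switch)
    log_switch = math.log(p_switch / (n_states - 1)) if n_states > 1 else NEG_INF

    # Every path starts a fresh segment (k = 0) with a uniform prior.
    score = [[NEG_INF] * min_len for _ in range(n_states)]
    for h in range(n_states):
        score[h][0] = -math.log(n_states) + emit_logp(h, obs[0])

    pointers = []
    for o in obs[1:]:
        new = [[NEG_INF] * min_len for _ in range(n_states)]
        ptr = [[None] * min_len for _ in range(n_states)]

        def relax(src, dst, cost):
            cand = score[src[0]][src[1]] + cost
            if cand > new[dst[0]][dst[1]]:
                new[dst[0]][dst[1]] = cand
                ptr[dst[0]][dst[1]] = src

        for g in range(n_states):
            for j in range(min_len):
                if score[g][j] == NEG_INF:
                    continue
                if j < last:
                    # Segment still too short: staying is forced.
                    relax((g, j), (g, j + 1), 0.0)
                else:
                    # Segment is long enough: stay, or open a new segment.
                    relax((g, j), (g, last), log_stay)
                    for h in range(n_states):
                        if h != g:
                            relax((g, j), (h, 0), log_switch)

        for h in range(n_states):
            e = emit_logp(h, o)
            for k in range(min_len):
                if new[h][k] != NEG_INF:
                    new[h][k] += e
        score = new
        pointers.append(ptr)

    # The final segment must also have reached the minimum length.
    h = max(range(n_states), key=lambda s: score[s][last])
    node = (h, last)
    path = [h]
    for ptr in reversed(pointers):
        node = ptr[node[0]][node[1]]
        path.append(node[0])
    path.reverse()
    return path
```

Calling `min_length_viterbi(calls, 2, noisy_call_logp, min_len=5)` on a list of 0/1 allele calls returns one haplotype label per position. Setting `min_len=1` reduces the sketch to ordinary Viterbi decoding, which makes it easy to compare how isolated noisy calls are handled with and without the length constraint.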
Identifiers
pubmed: 36946207
pii: 7083383
doi: 10.1093/plphys/kiad191
pmc: PMC10231367
Publication types
Journal Article
Research Support, Non-U.S. Gov't
Languages
eng
Citation subsets
IM
Pagination
821-836
Copyright information
© American Society of Plant Biologists 2023. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Conflict of interest statement
Conflict of interest statement. None declared.