Shotgun DNA sequencing for human identification: Dynamic SNP selection and likelihood ratio calculations accounting for errors.

Evidential weight; Forensic genetics; Genotyping error model; Human identification (HID); Shotgun DNA sequencing; Whole-genome sequencing

Journal

Forensic science international. Genetics
ISSN: 1878-0326
Abbreviated title: Forensic Sci Int Genet
Country: Netherlands
NLM ID: 101317016

Publication information

Publication date:
07 Sep 2024
History:
received: 29 07 2024
revised: 04 09 2024
accepted: 05 09 2024
medline: 14 9 2024
pubmed: 14 9 2024
entrez: 13 9 2024
Status: aheadofprint

Abstract

Shotgun sequencing is a DNA analysis method that potentially determines the nucleotide sequence of every DNA fragment in a sample, unlike the PCR-based genotyping methods that are widely used in forensic genetics and target predefined short tandem repeats (STRs) or predefined single nucleotide polymorphisms (SNPs). Shotgun DNA sequencing is particularly useful for highly degraded low-quality DNA samples, such as ancient samples or those from crime scenes. Here, we developed a statistical model for human identification using shotgun sequencing data and developed formulas for calculating the evidential weight as a likelihood ratio (LR). The model uses a dynamic set of binary SNP loci and takes the error rate from shotgun sequencing into consideration in a probabilistic manner. To our knowledge, the method is the first to make this possible. Results from replicated shotgun sequencing of buccal swabs (high-quality samples) and hair samples (low-quality samples) were arranged in a genotype-call confusion matrix to estimate the calling error probability by maximum likelihood and Bayesian inference. Different genotype quality filters may be applied to account for genotyping errors. An error probability of zero resulted in the commonly used LR formula for the weight of evidence. Error probabilities above zero reduced the LR contribution of matching genotypes and increased the LR in the case of a mismatch between the genotypes of the trace and the person of interest. In the latter scenario, the LR increased from zero (occurring when the error probability was zero) to low positive values, which allow for the possibility that the mismatch may be due to genotyping errors. We developed an open-source R package, wgsLR, which implements the method, including estimation of the calling error probability and calculation of LR values. The R package includes all formulas used in this paper and the functionalities to generate the formulas.
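
The qualitative behaviour described in the abstract can be illustrated with a minimal single-locus sketch. This is not the paper's formula and does not use the wgsLR package. It assumes a binary SNP with genotypes coded 0, 1, 2 (number of alternative alleles), Hardy-Weinberg genotype probabilities with alternative allele frequency q, and an illustrative error model in which each allele is independently miscalled with probability w. The function names and the error model are assumptions made for this example only.

```python
def genotype_probs(q):
    """Hardy-Weinberg probabilities for genotypes 0, 1, 2 (assumed population model)."""
    p = 1.0 - q
    return [p * p, 2.0 * p * q, q * q]


def call_probs(w):
    """E[z][x] = P(called genotype x | true genotype z).

    Assumption for this sketch: each of the two alleles is independently
    miscalled (flipped) with probability w.
    """
    keep = 1.0 - w
    return [
        [keep * keep, 2.0 * w * keep, w * w],
        [w * keep, keep * keep + w * w, w * keep],
        [w * w, 2.0 * w * keep, keep * keep],
    ]


def lr_single_locus(x_trace, x_poi, q, w):
    """LR for one locus: same donor (numerator) vs. unrelated donors (denominator)."""
    prior = genotype_probs(q)
    err = call_probs(w)
    # Same donor: trace and person of interest share one true genotype z.
    same = sum(prior[z] * err[z][x_trace] * err[z][x_poi] for z in range(3))
    # Different donors: the two called genotypes are independent.
    p_trace = sum(prior[z] * err[z][x_trace] for z in range(3))
    p_poi = sum(prior[z] * err[z][x_poi] for z in range(3))
    return same / (p_trace * p_poi)


if __name__ == "__main__":
    for w in (0.0, 0.01):
        print(w, lr_single_locus(0, 0, 0.5, w), lr_single_locus(0, 2, 0.5, w))
```

Under these assumptions, w = 0 reduces the matching case to the familiar 1 / P(genotype) and gives an LR of exactly zero for a mismatch. With w > 0, the matching LR shrinks and the mismatch LR becomes a small positive value, in line with the behaviour reported above.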

Identifiers

pubmed: 39270548
pii: S1872-4973(24)00142-X
doi: 10.1016/j.fsigen.2024.103146

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

103146

Copyright information

Copyright © 2024 The Author(s). Published by Elsevier B.V. All rights reserved.

Conflict of interest statement

The authors declare that they have no competing interests.

Authors

Mikkel Meyer Andersen (MM)

Department of Mathematical Sciences, Aalborg University, DK-9220 Aalborg, Denmark; Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2100 Copenhagen, Denmark. Electronic address: mikl@math.aau.dk.

Marie-Louise Kampmann (ML)

Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2100 Copenhagen, Denmark.

Alberte Honoré Jepsen (AH)

Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2100 Copenhagen, Denmark.

Niels Morling (N)

Department of Mathematical Sciences, Aalborg University, DK-9220 Aalborg, Denmark; Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2100 Copenhagen, Denmark.

Poul Svante Eriksen (PS)

Department of Mathematical Sciences, Aalborg University, DK-9220 Aalborg, Denmark.

Claus Børsting (C)

Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2100 Copenhagen, Denmark.

Jeppe Dyrberg Andersen (JD)

Section of Forensic Genetics, Department of Forensic Medicine, Faculty of Health and Medical Sciences, University of Copenhagen, DK-2100 Copenhagen, Denmark.

MeSH classifications