Construction of a new chromosome-scale, long-read reference genome assembly for the Syrian hamster, Mesocricetus auratus.


Journal

GigaScience
ISSN: 2047-217X
Abbreviated title: Gigascience
Country: United States
NLM ID: 101596872

Publication information

Publication date:
2022-05-28
History:
received: 2021-07-05
revised: 2021-11-03
accepted: 2022-03-29
entrez: 2022-05-31
pubmed: 2022-06-01
medline: 2022-06-03
Status: ppublish

Abstract

The Syrian hamster (Mesocricetus auratus) has been suggested as a useful mammalian model for a variety of diseases and infections, including infection with respiratory viruses such as SARS-CoV-2. The MesAur1.0 genome assembly was generated in 2013 using whole-genome shotgun sequencing with short-read sequence data. More advanced sequencing technologies and assembly methods now permit the generation of near-complete genome assemblies with higher quality and greater continuity. Here, we report an improved, chromosome-scale assembly of the M. auratus genome (BCM_Maur_2.0) produced with Oxford Nanopore Technologies long-read sequencing. The total length of the new assembly is 2.46 Gb, similar to the 2.50-Gb length of the previous assembly, MesAur1.0. BCM_Maur_2.0 exhibits substantially improved continuity, with a scaffold N50 6.7 times greater than that of MesAur1.0. Furthermore, 21,616 protein-coding genes and 10,459 noncoding genes are annotated in BCM_Maur_2.0, compared to 20,495 protein-coding genes and 4,168 noncoding genes in MesAur1.0. The new assembly also reduces the proportion of unresolved bases, as measured by nucleotide ambiguities, from ∼17.11% in MesAur1.0 to 3.00% in BCM_Maur_2.0. Access to a more complete reference genome with improved accuracy and continuity will facilitate more detailed, comprehensive, and meaningful research in a wide variety of future studies using Syrian hamsters as models.
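
The two assembly metrics quoted in the abstract, scaffold N50 and the percentage of unresolved bases, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `n50` and `unresolved_fraction` are hypothetical, the scaffolds are made-up toy strings, and counting only `N`/`n` as "unresolved" is an assumption (the abstract says only "nucleotide ambiguities").

```python
from typing import Iterable, List


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the largest length L such that sequences of
    length >= L together cover at least half of the total length."""
    ordered: List[int] = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def unresolved_fraction(sequences: Iterable[str]) -> float:
    """Fraction of bases that are unresolved.

    Assumption: only N/n is counted as unresolved; other IUPAC
    ambiguity codes are ignored in this sketch.
    """
    total = 0
    unresolved = 0
    for seq in sequences:
        total += len(seq)
        unresolved += seq.count("N") + seq.count("n")
    return unresolved / total if total else 0.0


# Toy scaffolds, made up for illustration (not hamster data).
toy_scaffolds = ["ACGTNNACGT", "ACGTAC", "NNAC"]
print("scaffold N50:", n50(len(s) for s in toy_scaffolds))
print("unresolved fraction:", unresolved_fraction(toy_scaffolds))
```

Under these definitions, a 6.7-fold higher scaffold N50 means that half of the assembly sits in scaffolds at least 6.7 times longer than before, and the unresolved fraction times 100 corresponds to the percentages reported for each assembly.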

Identifiers

pubmed: 35640223
pii: 6594469
doi: 10.1093/gigascience/giac039
pmc: PMC9155146

Publication types

Journal Article
Research Support, N.I.H., Extramural

Languages

eng

Citation subsets

IM

Grants

Agency: NIH HHS
ID: P51 OD011106
Country: United States
Agency: NIGMS NIH HHS
ID: T32 GM135119
Country: United States
Agency: NIAID NIH HHS
ID: HHSN272201600007C
Country: United States

Copyright information

© The Author(s) 2022. Published by Oxford University Press GigaScience.

Authors

R Alan Harris (RA)

Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

Muthuswamy Raveendran (M)

Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

Dustin T Lyfoung (DT)

Wisconsin National Primate Research Center, University of Wisconsin, 1220 Capitol Court, Madison, WI 53711, USA.

Fritz J Sedlazeck (FJ)

Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

Medhat Mahmoud (M)

Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

Trent M Prall (TM)

Department of Pathology and Laboratory Medicine, University of Wisconsin, 3170 UW Medical Foundation Centennial Building (MFCB), 1685 Highland Avenue, Madison, WI 53711, USA.

Julie A Karl (JA)

Department of Pathology and Laboratory Medicine, University of Wisconsin, 3170 UW Medical Foundation Centennial Building (MFCB), 1685 Highland Avenue, Madison, WI 53711, USA.

Harshavardhan Doddapaneni (H)

Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

Qingchang Meng (Q)

Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

Yi Han (Y)

Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

Donna Muzny (D)

Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.

Roger W Wiseman (RW)

Wisconsin National Primate Research Center, University of Wisconsin, 1220 Capitol Court, Madison, WI 53711, USA.
Department of Pathology and Laboratory Medicine, University of Wisconsin, 3170 UW Medical Foundation Centennial Building (MFCB), 1685 Highland Avenue, Madison, WI 53711, USA.

David H O'Connor (DH)

Wisconsin National Primate Research Center, University of Wisconsin, 1220 Capitol Court, Madison, WI 53711, USA.
Department of Pathology and Laboratory Medicine, University of Wisconsin, 3170 UW Medical Foundation Centennial Building (MFCB), 1685 Highland Avenue, Madison, WI 53711, USA.

Jeffrey Rogers (J)

Human Genome Sequencing Center and Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.
