Unraveling C-to-U RNA editing events from direct RNA sequencing.

C-to-U editing; Direct RNA sequencing; RNA editing; RNA modifications; iForest

Journal

RNA biology
ISSN: 1555-8584
Abbreviated title: RNA Biol
Country: United States
NLM ID: 101235328

Publication information

Publication date:
Jan 2024
History:
medline: 13 12 2023
pubmed: 13 12 2023
entrez: 13 12 2023
Status: ppublish

Abstract

In mammals, RNA editing events involve the conversion of adenosine (A) to inosine (I) by ADAR enzymes or the hydrolytic deamination of cytosine (C) to uracil (U) by the APOBEC family of enzymes, mostly APOBEC1. RNA editing has a plethora of biological functions, and its deregulation has been associated with various human disorders. While the large-scale detection of A-to-I events is quite straightforward using Illumina RNAseq technology, the identification of C-to-U events is a non-trivial task. This difficulty arises from the rarity of such events in eukaryotic genomes and the challenge of distinguishing them from background noise. Direct RNA sequencing by Oxford Nanopore Technologies (ONT) permits the direct detection of Us on sequenced RNA reads. Surprisingly, using ONT reads from wild-type (WT) and APOBEC1-knock-out (KO) murine cell lines, as well as in vitro synthesized RNA without any modification, we identified a systematic error affecting the accuracy of C base calls, which leads to incorrect identification of C-to-U events. To overcome this issue in direct RNA reads, we introduce here a novel machine learning strategy based on the Isolation Forest (iForest) algorithm, in which C-to-U editing events are treated as sequencing anomalies. Using in vitro synthesized and human ONT reads, our model optimizes the signal-to-noise ratio, improving the detection of C-to-U editing sites with an accuracy above 90% in all samples tested. Our results suggest that iForest, known for its rapid implementation and minimal memory requirements, is a promising tool to denoise ONT reads and reliably identify RNA modifications.
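
The abstract does not describe the features or the implementation used by the authors. Purely as an illustration of the general idea of treating rare events as anomalies with an Isolation Forest, the Python sketch below builds a minimal forest from scratch and scores simulated per-site data. Everything in it is an assumption made for the example: the function names (`c_factor`, `build_tree`, `path_length`, `anomaly_scores`), the two per-site features (C-to-U mismatch frequency and mean base quality), and the simulated values, which are not taken from the study.

```python
import math
import random

def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    harmonic = math.log(n - 1) + 0.5772156649  # approximation of H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def build_tree(points, depth, max_depth, rng):
    """Recursively partition points with random axis-aligned splits."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # cannot split on a constant feature
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {"dim": dim, "split": split,
            "left": build_tree(left, depth + 1, max_depth, rng),
            "right": build_tree(right, depth + 1, max_depth, rng)}

def path_length(point, node, depth=0):
    """Depth at which a point lands, adjusted for the size of the leaf it reaches."""
    if "size" in node:
        return depth + c_factor(node["size"])
    child = node["left"] if point[node["dim"]] < node["split"] else node["right"]
    return path_length(point, child, depth + 1)

def anomaly_scores(points, n_trees=100, sample_size=256, seed=0):
    """Return one score per point in (0, 1); higher means easier to isolate."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
             for _ in range(n_trees)]
    norm = c_factor(sample_size)
    return [2.0 ** (-sum(path_length(p, t) for t in trees) / (n_trees * norm))
            for p in points]

# Simulated data (hypothetical): 200 background cytosine sites whose apparent
# C-to-U frequency reflects only sequencing noise, plus 3 sites with a much
# higher frequency standing in for genuine editing candidates.
data_rng = random.Random(42)
sites = [(data_rng.gauss(0.05, 0.01), data_rng.gauss(10.0, 1.0)) for _ in range(200)]
sites += [(0.45, 10.2), (0.65, 9.8), (0.85, 10.5)]  # indices 200, 201, 202

scores = anomaly_scores(sites)
top = sorted(range(len(sites)), key=lambda i: scores[i], reverse=True)[:3]
print("Most anomalous site indices:", sorted(top))
```

In this toy setting the sites with an unusually high mismatch frequency are isolated after only a few random splits, so they receive the highest scores. Real ONT data would require features derived from the reads themselves and a threshold calibrated on controls such as unmodified in vitro transcribed RNA or knock-out samples.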

Identifiers

pubmed: 38090878
doi: 10.1080/15476286.2023.2290843

Publication types

Journal Article

Languages

eng

Citation subsets

IM

Pagination

1-14

Authors

Adriano Fonzino (A)

Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy.

Caterina Manzari (C)

Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy.

Paola Spadavecchia (P)

Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy.

Uday Munagala (U)

Core Research Laboratory, ISPRO, Florence, Italy.

Serena Torrini (S)

Core Research Laboratory, ISPRO, Florence, Italy.

Silvestro Conticello (S)

Core Research Laboratory, ISPRO, Florence, Italy.
National Research Council, Institute of Clinical Physiology, Pisa, Italy.

Graziano Pesole (G)

Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy.
National Research Council, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), Bari, Italy.
Consorzio Interuniversitario Biotecnologie, Trieste, Italy.

Ernesto Picardi (E)

Department of Biosciences, Biotechnology and Environment, University of Bari, Bari, Italy.
National Research Council, Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies (IBIOM), Bari, Italy.
National Institute of Biostructures and Biosystems (INBB), Roma, Italy.

MeSH classifications