iPseU-TWSVM: Identification of RNA pseudouridine sites based on TWSVM.

RNA max-relevance and min-redundancy pseudouridine sites twin support vector machine

Journal

Mathematical biosciences and engineering : MBE
ISSN: 1551-0018
Titre abrégé: Math Biosci Eng
Pays: United States
ID NLM: 101197794

Informations de publication

Date de publication:
19 09 2022
Historique:
entrez: 19 1 2023
pubmed: 20 1 2023
medline: 21 1 2023
Statut: ppublish

Résumé

Biological sequence analysis is an important basic research work in the field of bioinformatics. With the explosive growth of data, machine learning methods play an increasingly important role in biological sequence analysis. By constructing a classifier for prediction, the input sequence feature vector is predicted and evaluated, and the knowledge of gene structure, function and evolution is obtained from a large amount of sequence information, which lays a foundation for researchers to carry out in-depth research. At present, many machine learning methods have been applied to biological sequence analysis such as RNA gene recognition and protein secondary structure prediction. As a biological sequence, RNA plays an important biological role in the encoding, decoding, regulation and expression of genes. The analysis of RNA data is currently carried out from the aspects of structure and function, including secondary structure prediction, non-coding RNA identification and functional site prediction. Pseudouridine (У) is the most widespread and rich RNA modification and has been discovered in a variety of RNAs. It is highly essential for the study of related functional mechanisms and disease diagnosis to accurately identify У sites in RNA sequences. At present, several computational approaches have been suggested as an alternative to experimental methods to detect У sites, but there is still potential for improvement in their performance. In this study, we present a model based on twin support vector machine (TWSVM) for У site identification. The model combines a variety of feature representation techniques and uses the max-relevance and min-redundancy methods to obtain the optimum feature subset for training. The independent testing accuracy is improved by 3.4% in comparison to current advanced У site predictors. The outcomes demonstrate that our model has better generalization performance and improves the accuracy of У site identification. iPseU-TWSVM can be a helpful tool to identify У sites.

Identifiants

pubmed: 36654069
doi: 10.3934/mbe.2022644
doi:

Substances chimiques

RNA 63231-63-0
Pseudouridine 1445-07-4

Types de publication

Journal Article Research Support, Non-U.S. Gov't

Langues

eng

Sous-ensembles de citation

IM

Pagination

13829-13850

Auteurs

Mingshuai Chen (M)

Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China.
Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China.

Xin Zhang (X)

Beidahuang Industry Group General Hospital, Harbin, China.

Ying Ju (Y)

School of Informatics, Xiamen University, Xiamen, China.

Qing Liu (Q)

Department of Anesthesiology, Hospital (T.C.M) Affiliated to Southwest Medical University, Luzhou, China.

Yijie Ding (Y)

Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China.

Articles similaires

Exploring blood-brain barrier passage using atomic weighted vector and machine learning.

Yoan Martínez-López, Paulina Phoobane, Yanaima Jauriga et al.
1.00
Blood-Brain Barrier Machine Learning Humans Support Vector Machine Software

Understanding the role of machine learning in predicting progression of osteoarthritis.

Simone Castagno, Benjamin Gompels, Estelle Strangmark et al.
1.00
Humans Disease Progression Machine Learning Osteoarthritis
Humans Colorectal Neoplasms Biomarkers, Tumor Prognosis Gene Expression Regulation, Neoplastic

Classifications MeSH