TY - JOUR
T1 - Deepred-Mt
T2 - Deep representation learning for predicting C-to-U RNA editing in plant mitochondria
AU - Edera, Alejandro A.
AU - Small, Ian
AU - Milone, Diego H.
AU - Sanchez-Puerta, M. Virginia
PY - 2021/9
Y1 - 2021/9
N2 - In land plant mitochondria, C-to-U RNA editing converts cytidines into uridines at highly specific RNA positions called editing sites. This editing step is essential for the correct functioning of mitochondrial proteins. When using sequence homology information, edited positions can be computationally predicted with high precision. However, predictions based on the sequence contexts of such edited positions often result in lower precision, which is limiting further advances on novel genetic engineering techniques for RNA regulation. Here, a deep convolutional neural network called Deepred-Mt is proposed. It predicts C-to-U editing events based on the 40 nucleotides flanking a given cytidine. Unlike existing methods, Deepred-Mt was optimized by using editing extent information, novel strategies of data augmentation, and a large-scale training dataset, constructed with deep RNA sequencing data of 21 plant mitochondrial genomes. In comparison to predictive methods based on sequence homology, Deepred-Mt attains significantly better predictive performance, in terms of average precision as well as F1 score. In addition, our approach is able to recognize well-known sequence motifs linked to RNA editing, and shows that the local RNA structure surrounding editing sites may be a relevant factor regulating their editing. These results demonstrate that Deepred-Mt is an effective tool for predicting C-to-U RNA editing in plant mitochondria. Source code, datasets, and detailed use cases are freely available at https://github.com/aedera/deepredmt.
KW - C-to-U RNA editing
KW - Convolutional neural networks
KW - Land plants
KW - Mitochondrial genomes
KW - Representation learning
KW - Sequence classification
UR - http://www.scopus.com/inward/record.url?scp=85111476362&partnerID=8YFLogxK
U2 - 10.1016/j.compbiomed.2021.104682
DO - 10.1016/j.compbiomed.2021.104682
M3 - Article
C2 - 34343887
AN - SCOPUS:85111476362
SN - 0010-4825
VL - 136
JO - Computers in Biology and Medicine
JF - Computers in Biology and Medicine
M1 - 104682
ER -