Redefining the structural motifs that determine RNA binding and RNA editing by pentatricopeptide repeat proteins in land plants

S. Cheng, Bernard Gutmann, X. Zhong, Y. Ye, Mark Fisher, F. Bai, Ian Castleden, Y. Song, B. Song, Jiaying Huang, X. Liu, X. Xu, B.L Lim, Charlie Bond, S.M. Yiu, Ian Small

Research output: Contribution to journal (Article)

Abstract

The pentatricopeptide repeat (PPR) proteins form one of the largest protein families in land plants. They are characterised by tandem 30–40 amino acid motifs that form an extended binding surface capable of sequence-specific recognition of RNA strands. Almost all of them are post-translationally targeted to plastids and mitochondria, where they play important roles in post-transcriptional processes including splicing, RNA editing and the initiation of translation. A code describing how PPR proteins recognise their RNA targets promises to accelerate research on these proteins, but making use of this code requires accurate definition and annotation of all of the various nucleotide-binding motifs in each protein. We have used a structural modelling approach to define 10 different variants of the PPR motif found in plant proteins, in addition to the putative deaminase motif that is found at the C-terminus of many RNA-editing factors. We show that the super-helical RNA-binding surface of RNA-editing factors is potentially longer than previously recognised. We used the redefined motifs to develop accurate and consistent annotations of PPR sequences from 109 genomes. We report a high error rate in PPR gene models in many public plant proteomes, due to gene fusions and insertions of spurious introns. These consistently annotated datasets across a wide range of species are valuable resources for future comparative genomics studies, and an essential pre-requisite for accurate large-scale computational predictions of PPR targets. We have created a web portal (http://www.plantppr.com) that provides open access to these resources for the community.
Original language: English
Pages (from-to): 532-547
Journal: The Plant Journal
Volume: 85
Issue number: 4
DOI: https://doi.org/10.1111/tpj.13121
Publication status: Published - 12 Feb 2016

Fingerprint

Embryophyta
RNA Editing
embryophytes
RNA
Proteins
Amino Acid Motifs
Plant Proteins
Nucleotide Motifs
Plastids
Gene Fusion
Insertional Mutagenesis
Proteome
Genomics
Introns

Cite this

Cheng, S. ; Gutmann, Bernard ; Zhong, X. ; Ye, Y. ; Fisher, Mark ; Bai, F. ; Castleden, Ian ; Song, Y. ; Song, B. ; Huang, Jiaying ; Liu, X. ; Xu, X. ; Lim, B.L ; Bond, Charlie ; Yiu, S.M. ; Small, Ian. / Redefining the structural motifs that determine RNA binding and RNA editing by pentatricopeptide repeat proteins in land plants. In: The Plant Journal. 2016 ; Vol. 85, No. 4. pp. 532-547.
@article{5d675d6266d340debcbc8fe69435b1b9,
title = "Redefining the structural motifs that determine RNA binding and RNA editing by pentatricopeptide repeat proteins in land plants",
abstract = "The pentatricopeptide repeat (PPR) proteins form one of the largest protein families in land plants. They are characterised by tandem 30–40 amino acid motifs that form an extended binding surface capable of sequence-specific recognition of RNA strands. Almost all of them are post-translationally targeted to plastids and mitochondria, where they play important roles in post-transcriptional processes including splicing, RNA editing and the initiation of translation. A code describing how PPR proteins recognise their RNA targets promises to accelerate research on these proteins, but making use of this code requires accurate definition and annotation of all of the various nucleotide-binding motifs in each protein. We have used a structural modelling approach to define 10 different variants of the PPR motif found in plant proteins, in addition to the putative deaminase motif that is found at the C-terminus of many RNA-editing factors. We show that the super-helical RNA-binding surface of RNA-editing factors is potentially longer than previously recognised. We used the redefined motifs to develop accurate and consistent annotations of PPR sequences from 109 genomes. We report a high error rate in PPR gene models in many public plant proteomes, due to gene fusions and insertions of spurious introns. These consistently annotated datasets across a wide range of species are valuable resources for future comparative genomics studies, and an essential pre-requisite for accurate large-scale computational predictions of PPR targets. We have created a web portal (http://www.plantppr.com) that provides open access to these resources for the community.",
author = "S. Cheng and Bernard Gutmann and X. Zhong and Y. Ye and Mark Fisher and F. Bai and Ian Castleden and Y. Song and B. Song and Jiaying Huang and X. Liu and X. Xu and B.L Lim and Charlie Bond and S.M. Yiu and Ian Small",
year = "2016",
month = "2",
day = "12",
doi = "10.1111/tpj.13121",
language = "English",
volume = "85",
pages = "532--547",
journal = "The Plant Journal",
issn = "0960-7412",
publisher = "John Wiley \& Sons",
number = "4"
}

Cheng, S, Gutmann, B, Zhong, X, Ye, Y, Fisher, M, Bai, F, Castleden, I, Song, Y, Song, B, Huang, J, Liu, X, Xu, X, Lim, BL, Bond, C, Yiu, SM & Small, I 2016, 'Redefining the structural motifs that determine RNA binding and RNA editing by pentatricopeptide repeat proteins in land plants', The Plant Journal, vol. 85, no. 4, pp. 532-547. https://doi.org/10.1111/tpj.13121

Redefining the structural motifs that determine RNA binding and RNA editing by pentatricopeptide repeat proteins in land plants. / Cheng, S.; Gutmann, Bernard; Zhong, X.; Ye, Y.; Fisher, Mark; Bai, F.; Castleden, Ian; Song, Y.; Song, B.; Huang, Jiaying; Liu, X.; Xu, X.; Lim, B.L; Bond, Charlie; Yiu, S.M.; Small, Ian.

In: The Plant Journal, Vol. 85, No. 4, 12.02.2016, p. 532-547.

TY  - JOUR
T1  - Redefining the structural motifs that determine RNA binding and RNA editing by pentatricopeptide repeat proteins in land plants
AU  - Cheng, S.
AU  - Gutmann, Bernard
AU  - Zhong, X.
AU  - Ye, Y.
AU  - Fisher, Mark
AU  - Bai, F.
AU  - Castleden, Ian
AU  - Song, Y.
AU  - Song, B.
AU  - Huang, Jiaying
AU  - Liu, X.
AU  - Xu, X.
AU  - Lim, B.L
AU  - Bond, Charlie
AU  - Yiu, S.M.
AU  - Small, Ian
PY  - 2016/2/12
Y1  - 2016/2/12
AB  - The pentatricopeptide repeat (PPR) proteins form one of the largest protein families in land plants. They are characterised by tandem 30–40 amino acid motifs that form an extended binding surface capable of sequence-specific recognition of RNA strands. Almost all of them are post-translationally targeted to plastids and mitochondria, where they play important roles in post-transcriptional processes including splicing, RNA editing and the initiation of translation. A code describing how PPR proteins recognise their RNA targets promises to accelerate research on these proteins, but making use of this code requires accurate definition and annotation of all of the various nucleotide-binding motifs in each protein. We have used a structural modelling approach to define 10 different variants of the PPR motif found in plant proteins, in addition to the putative deaminase motif that is found at the C-terminus of many RNA-editing factors. We show that the super-helical RNA-binding surface of RNA-editing factors is potentially longer than previously recognised. We used the redefined motifs to develop accurate and consistent annotations of PPR sequences from 109 genomes. We report a high error rate in PPR gene models in many public plant proteomes, due to gene fusions and insertions of spurious introns. These consistently annotated datasets across a wide range of species are valuable resources for future comparative genomics studies, and an essential pre-requisite for accurate large-scale computational predictions of PPR targets. We have created a web portal (http://www.plantppr.com) that provides open access to these resources for the community.
U2  - 10.1111/tpj.13121
DO  - 10.1111/tpj.13121
M3  - Article
VL  - 85
SP  - 532
EP  - 547
JO  - The Plant Journal
JF  - The Plant Journal
SN  - 0960-7412
IS  - 4
ER  -