Agriculture faces increasing demand for yield and for higher plant-derived protein content and diversity, while also being under pressure to achieve sustainability. Although the genomes of many important crops have been sequenced, the subcellular locations of most of the encoded proteins remain unknown or are only predicted. Protein subcellular location is crucial in determining protein function and accumulation patterns in plants and is critical for targeted improvements of yield and resilience. Integrating location data from >800 studies for 12 major crop species into the data collection cropPAL2020 showed that while >80% of proteins in most species are not localised by experimental data, combining species data or integrating predictions can help bridge gaps at similar accuracy. The collation and integration of >61,505 experimental localisations and >6 million predictions showed that the relative sizes of the protein catalogues located in different subcellular compartments are comparable between crops and Arabidopsis. A comprehensive cross-species comparison showed that between 50% and 80% of the subcellulomes are conserved across species and that conservation depends only to some degree on the phylogenetic relationship between species. Protein subcellular locations in major biosynthesis pathways are more often conserved than those in metabolism. Underlying this conservation is a clear potential for diversity in subcellular protein location between species by means of gene duplication and alternative splicing. Our cropPAL data set and search platform (https://crop-pal.org) provide a comprehensive subcellular proteomics resource to drive compartmentation-based approaches for improving yield, protein composition and resilience in future crop varieties.
Hooper, C. (Creator), Castleden, I. (Data Manager), Aryamanesh, N. (Data Collector), Black, K. (Contributor), Grasso, S. (Data Collector) & Millar, H. (Creator), The University of Western Australia, Jan 2020