SUBAcon: a consensus algorithm for unifying the subcellular localization data of the Arabidopsis proteome

Cornelia Hooper, Sandra Tanz, Ian Castleden, Michael Vacher, Ian Small, Harvey Millar

Research output: Contribution to journal › Article

Abstract

Motivation: Knowing the subcellular location of proteins is critical for understanding their function and developing accurate networks representing eukaryotic biological processes. Many computational tools have been developed to predict proteome-wide subcellular location, and abundant experimental data from green fluorescent protein (GFP) tagging or mass spectrometry (MS) are available in the model plant, Arabidopsis. None of these approaches is error-free, and thus, results are often contradictory.

Results: To help unify these multiple data sources, we have developed the SUBcellular Arabidopsis consensus (SUBAcon) algorithm, a naive Bayes classifier that integrates 22 computational prediction algorithms, experimental GFP and MS localizations, protein–protein interaction and co-expression data to derive a consensus call and probability. SUBAcon classifies protein location in Arabidopsis more accurately than single predictors.
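The way a naive Bayes classifier can turn conflicting location calls into a single consensus call and probability can be illustrated with a minimal sketch. This is not the SUBAcon implementation: the compartment list, the source names (`predictor_A`, `gfp`, `ms`), the priors and every likelihood value below are hypothetical and chosen only for illustration, and the real algorithm integrates far more evidence sources.

```python
import math

# Hypothetical prior P(location); these numbers are NOT from the paper.
PRIOR = {"plastid": 0.3, "mitochondrion": 0.2, "cytosol": 0.5}

# Hypothetical likelihoods P(call | true location), indexed as
# LIKELIHOOD[source][true_location][call]. Each inner row sums to 1.
LIKELIHOOD = {
    "predictor_A": {
        "plastid": {"plastid": 0.7, "mitochondrion": 0.2, "cytosol": 0.1},
        "mitochondrion": {"plastid": 0.2, "mitochondrion": 0.6, "cytosol": 0.2},
        "cytosol": {"plastid": 0.1, "mitochondrion": 0.1, "cytosol": 0.8},
    },
    "gfp": {
        "plastid": {"plastid": 0.8, "mitochondrion": 0.1, "cytosol": 0.1},
        "mitochondrion": {"plastid": 0.1, "mitochondrion": 0.8, "cytosol": 0.1},
        "cytosol": {"plastid": 0.1, "mitochondrion": 0.1, "cytosol": 0.8},
    },
    "ms": {
        "plastid": {"plastid": 0.6, "mitochondrion": 0.3, "cytosol": 0.1},
        "mitochondrion": {"plastid": 0.2, "mitochondrion": 0.7, "cytosol": 0.1},
        "cytosol": {"plastid": 0.2, "mitochondrion": 0.2, "cytosol": 0.6},
    },
}


def consensus(calls):
    """Return (best_location, posterior) for a dict of source -> location call.

    Naive Bayes assumption: sources are conditionally independent given the
    true location, so their likelihoods multiply (summed here in log space).
    """
    log_post = {}
    for loc, prior in PRIOR.items():
        score = math.log(prior)
        for source, call in calls.items():
            score += math.log(LIKELIHOOD[source][loc][call])
        log_post[loc] = score

    # Normalise with the log-sum-exp trick to avoid underflow.
    top = max(log_post.values())
    unnorm = {loc: math.exp(s - top) for loc, s in log_post.items()}
    total = sum(unnorm.values())
    posterior = {loc: v / total for loc, v in unnorm.items()}
    best = max(posterior, key=posterior.get)
    return best, posterior


# Two sources say plastid, one says mitochondrion.
calls = {"predictor_A": "plastid", "gfp": "plastid", "ms": "mitochondrion"}
best, posterior = consensus(calls)
print(best, round(posterior[best], 3))
```

In this toy setup a single dissenting source lowers the posterior for the majority call but does not overturn it, which is the sense in which a probabilistic consensus can tolerate contradictory inputs.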
Language: English
Pages: 3356-3364
Journal: Bioinformatics
Volume: 30
Issue number: 23
Early online date: 22 Aug 2014
DOI: 10.1093/bioinformatics/btu550
State: Published - 1 Dec 2014

Cite this

@article{d92b3be5689c406f97a770ea80754a38,
title = "SUBAcon: a consensus algorithm for unifying the subcellular localization data of the Arabidopsis proteome",
abstract = "Motivation: Knowing the subcellular location of proteins is critical for understanding their function and developing accurate networks representing eukaryotic biological processes. Many computational tools have been developed to predict proteome-wide subcellular location, and abundant experimental data from green fluorescent protein (GFP) tagging or mass spectrometry (MS) are available in the model plant, Arabidopsis. None of these approaches is error-free, and thus, results are often contradictory. Results: To help unify these multiple data sources, we have developed the SUBcellular Arabidopsis consensus (SUBAcon) algorithm, a naive Bayes classifier that integrates 22 computational prediction algorithms, experimental GFP and MS localizations, protein–protein interaction and co-expression data to derive a consensus call and probability. SUBAcon classifies protein location in Arabidopsis more accurately than single predictors.",
author = "Cornelia Hooper and Sandra Tanz and Ian Castleden and Michael Vacher and Ian Small and Harvey Millar",
year = "2014",
month = "12",
day = "1",
doi = "10.1093/bioinformatics/btu550",
language = "English",
volume = "30",
pages = "3356--3364",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "23",
}

SUBAcon: a consensus algorithm for unifying the subcellular localization data of the Arabidopsis proteome. / Hooper, Cornelia; Tanz, Sandra; Castleden, Ian; Vacher, Michael; Small, Ian; Millar, Harvey.

In: Bioinformatics, Vol. 30, No. 23, 01.12.2014, p. 3356-3364.

TY - JOUR
T1 - SUBAcon: a consensus algorithm for unifying the subcellular localization data of the Arabidopsis proteome
AU - Hooper, Cornelia
AU - Tanz, Sandra
AU - Castleden, Ian
AU - Vacher, Michael
AU - Small, Ian
AU - Millar, Harvey
PY - 2014/12/1
Y1 - 2014/12/1
AB - Motivation: Knowing the subcellular location of proteins is critical for understanding their function and developing accurate networks representing eukaryotic biological processes. Many computational tools have been developed to predict proteome-wide subcellular location, and abundant experimental data from green fluorescent protein (GFP) tagging or mass spectrometry (MS) are available in the model plant, Arabidopsis. None of these approaches is error-free, and thus, results are often contradictory. Results: To help unify these multiple data sources, we have developed the SUBcellular Arabidopsis consensus (SUBAcon) algorithm, a naive Bayes classifier that integrates 22 computational prediction algorithms, experimental GFP and MS localizations, protein–protein interaction and co-expression data to derive a consensus call and probability. SUBAcon classifies protein location in Arabidopsis more accurately than single predictors.
U2 - 10.1093/bioinformatics/btu550
DO - 10.1093/bioinformatics/btu550
M3 - Article
VL - 30
SP - 3356
EP - 3364
JO - Bioinformatics
T2 - Bioinformatics
JF - Bioinformatics
SN - 1367-4803
IS - 23
ER -