TY - JOUR
T1 - Analysis of High Throughput Protein Expression in Escherichia coli
AU - Benita, Y.
AU - Wise, Michael
AU - Lok, M.C.
AU - Humphrey-Smith, I.
AU - Oosting, R.S.
PY - 2006
Y1 - 2006
N2 - The ability to efficiently produce hundreds of proteins in parallel is the most basic requirement of many aspects of proteomics. Overcoming the technical and financial barriers associated with high throughput protein production is essential for the development of an experimental platform to query and browse the protein content of a cell (e.g. protein and antibody arrays). Proteins are inherently different one from another in their physicochemical properties; therefore, no single protocol can be expected to successfully express most of the proteins. Instead of optimizing a protocol to express a specific protein, we used sequence analysis tools to estimate the probability of a specific protein to be expressed successfully using a given protocol, thereby avoiding a priori proteins with a low success probability. A set of 547 proteins, to be used for antibody production and selection, was expressed in Escherichia coli using a high throughput protein production pipeline. Protein properties derived from sequence alone were correlated to successful expression, and general guidelines are given to increase the efficiency of similar pipelines. A second set of 68 proteins was expressed to investigate the link between successful protein expression and inclusion body formation. More proteins were expressed in inclusion bodies; however, the formation of inclusion bodies was not a requirement for successful expression.
AB - The ability to efficiently produce hundreds of proteins in parallel is the most basic requirement of many aspects of proteomics. Overcoming the technical and financial barriers associated with high throughput protein production is essential for the development of an experimental platform to query and browse the protein content of a cell (e.g. protein and antibody arrays). Proteins are inherently different one from another in their physicochemical properties; therefore, no single protocol can be expected to successfully express most of the proteins. Instead of optimizing a protocol to express a specific protein, we used sequence analysis tools to estimate the probability of a specific protein to be expressed successfully using a given protocol, thereby avoiding a priori proteins with a low success probability. A set of 547 proteins, to be used for antibody production and selection, was expressed in Escherichia coli using a high throughput protein production pipeline. Protein properties derived from sequence alone were correlated to successful expression, and general guidelines are given to increase the efficiency of similar pipelines. A second set of 68 proteins was expressed to investigate the link between successful protein expression and inclusion body formation. More proteins were expressed in inclusion bodies; however, the formation of inclusion bodies was not a requirement for successful expression.
U2 - 10.1074/mcp.M600140-MCP200
DO - 10.1074/mcp.M600140-MCP200
M3 - Article
C2 - 16822774
SN - 1535-9476
VL - 5
SP - 1567
EP - 1580
JO - Molecular and Cellular Proteomics
JF - Molecular and Cellular Proteomics
ER -