The ability to efficiently produce hundreds of proteins in parallel is the most basic requirement of many aspects of proteomics. Overcoming the technical and financial barriers associated with high-throughput protein production is essential for the development of an experimental platform to query and browse the protein content of a cell (e.g., protein and antibody arrays). Proteins inherently differ from one another in their physicochemical properties; therefore, no single protocol can be expected to express most proteins successfully. Instead of optimizing a protocol to express a specific protein, we used sequence analysis tools to estimate the probability that a specific protein will be expressed successfully using a given protocol, thereby excluding a priori proteins with a low probability of success. A set of 547 proteins, to be used for antibody production and selection, was expressed in Escherichia coli using a high-throughput protein production pipeline. Protein properties derived from sequence alone were correlated with successful expression, and general guidelines are given to increase the efficiency of similar pipelines. A second set of 68 proteins was expressed to investigate the link between successful protein expression and inclusion body formation. More proteins were expressed in inclusion bodies; however, the formation of inclusion bodies was not a requirement for successful expression.