Quantitative validation of the CamSol method
Correlation between the CamSol solubility scores (x-axis) and the measured monomer concentrations (y-axis) of 8 different human single-domain antibody variants. The solubility of each variant was measured by analytical size-exclusion chromatography (SEC) as the monomer concentration after a 4-hour incubation at room temperature at a total concentration of 70 μM. Homology-derived structures of the wild type and the most soluble variant are shown. The surface of each model is color-coded according to the structurally corrected solubility profile.
The CamSol method of protein solubility prediction comprises three algorithms that can be used individually for specific tasks or together to rationally design protein variants with enhanced solubility.
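These algorithms are: (i) the calculation of an intrinsic solubility profile from the amino acid sequence, (ii) the calculation of a structural correction to that profile, and (iii) the identification and screening of mutations or insertions that increase solubility.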
The intrinsic solubility score, which the first algorithm computes from the amino acid sequence, can be used to rank libraries of protein variants by solubility. For example, in vitro antibody discovery techniques (e.g. phage display) usually yield a large number of antibody variants that bind their antigen with high affinity. Because these variants share a high degree of sequence similarity, the CamSol method can rank them accurately by solubility, reducing the need for experiments and helping to select the best candidates for development.
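The ranking workflow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scoring function below uses the Kyte-Doolittle hydropathy scale as a stand-in and is not the CamSol intrinsic solubility score, and the function names and example sequences are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here ONLY as a stand-in.
# This is NOT the CamSol intrinsic solubility score.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def toy_solubility_score(sequence):
    """Placeholder score: negative mean hydropathy (higher = more polar)."""
    return -sum(KD[aa] for aa in sequence) / len(sequence)


def rank_variants(variants, score=toy_solubility_score):
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, score(seq)) for name, seq in variants.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical library: closely related variants differing at one position.
library = {
    "wt": "AVLIGSTK",
    "L3K": "AVKIGSTK",
    "L3D": "AVDIGSTK",
}

ranking = rank_variants(library)
for name, value in ranking:
    print(f"{name}\t{value:.3f}")
```

Passing the scoring function as an argument keeps the ranking step separate from the score itself, so a real solubility predictor could be substituted for the placeholder without changing the rest of the code.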
The three algorithms that constitute the CamSol method can be used together for the structure-based design of soluble protein variants. In this mode the method rapidly and systematically screens, in silico, tens of thousands of possible amino acid substitutions or insertions to identify the mutations predicted to increase the solubility of a protein the most while preserving its fundamental properties, including its functional structure and binding affinity. The method requires the native structure of the target protein, which may be obtained experimentally or computationally (e.g. by homology modeling); high resolution is not required. The structural correction is used to distinguish, among the residues classified as poorly soluble, those that are required for functional reasons (e.g. the residues that form the hydrophobic core) from those that remain exposed to the solvent and are not strictly necessary. The user can also provide a list of residues that are important for function or that otherwise cannot be mutated, together with the maximum number of mutations the algorithm is allowed to make, so that the wild-type sequence is largely conserved.
The method then performs four steps automatically: (i) calculation of the intrinsic solubility profile, (ii) calculation of the structural correction to the intrinsic solubility profile, (iii) identification of suitable mutation sites using the structurally corrected solubility profile, and (iv) screening of all possible mutations/insertions at those sites to identify the most soluble variant.
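The four steps can be illustrated with a toy pipeline. The sketch below is not the CamSol implementation: the per-residue score (negative Kyte-Doolittle hydropathy averaged over a window), the structural correction (multiplication by a relative solvent exposure between 0 and 1), the threshold, the substitution alphabet, and all function names are assumptions made for illustration. It handles substitutions only, not insertions, so that the exposure values stay aligned with the sequence.

```python
# Toy stand-in scale (Kyte-Doolittle hydropathy); NOT the CamSol parameters.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
POLAR_CANDIDATES = "DEKRNQST"  # hypothetical substitution alphabet


def intrinsic_profile(seq, window=3):
    """Step (i): toy per-residue score averaged over a sliding window."""
    raw = [-KD[aa] for aa in seq]
    half = window // 2
    profile = []
    for i in range(len(raw)):
        chunk = raw[max(0, i - half): i + half + 1]
        profile.append(sum(chunk) / len(chunk))
    return profile


def structural_correction(profile, exposure):
    """Step (ii): down-weight buried residues by their exposure (0..1)."""
    return [score * exp for score, exp in zip(profile, exposure)]


def candidate_sites(corrected, protected, threshold=-0.5):
    """Step (iii): poorly scoring positions that are allowed to mutate."""
    return [i for i, score in enumerate(corrected)
            if score < threshold and i not in protected]


def total_score(seq, exposure):
    """Sum of the structurally corrected profile for a sequence."""
    return sum(structural_correction(intrinsic_profile(seq), exposure))


def design(seq, exposure, protected=frozenset(), max_mutations=1):
    """Step (iv): greedily screen substitutions at the candidate sites."""
    current, mutations = seq, []
    for _ in range(max_mutations):
        corrected = structural_correction(intrinsic_profile(current), exposure)
        best_score, best_site, best_aa = total_score(current, exposure), None, None
        for i in candidate_sites(corrected, protected):
            for aa in POLAR_CANDIDATES:
                trial = current[:i] + aa + current[i + 1:]
                score = total_score(trial, exposure)
                if score > best_score:
                    best_score, best_site, best_aa = score, i, aa
        if best_site is None:
            break  # no substitution improves the score
        mutations.append(f"{current[best_site]}{best_site + 1}{best_aa}")
        current = current[:best_site] + best_aa + current[best_site + 1:]
    return current, mutations


# Hypothetical input: sequence, per-residue exposure, one protected residue.
wild_type = "AVLIGSTK"
exposure = [0.2, 0.1, 0.9, 0.8, 0.5, 0.9, 0.9, 1.0]
designed, mutations = design(wild_type, exposure, protected={3}, max_mutations=1)
print(designed, mutations)
```

The `protected` set and `max_mutations` argument mirror the two user-supplied constraints mentioned in the text: residues that must not be changed, and a cap on how far the design may drift from the wild-type sequence.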
P. Sormanni, F. A. Aprile and M. Vendruscolo.
The CamSol method of rational design of protein mutants with enhanced solubility.
J. Mol. Biol. 2015. doi:10.1016/j.jmb.2014.09.026.