Inclusion Body Formation by Mutants of the Tenth Human Fibronectin Type III Domain
Inclusion bodies (IBs) are intracellular, insoluble protein aggregates, commonly observed when a protein of interest is expressed at high concentrations in a bacterial cell-based expression system. The molecular determinants of IB formation are poorly understood, and are of both fundamental and biotechnological significance. The stability, folding, and structure of the tenth human fibronectin type III domain (10Fn3) have been studied previously, making it an attractive model system to investigate IB formation. A library of 10Fn3 mutants was provided by Bristol-Myers Squibb; 31 of these mutants were expressed in Escherichia coli and analyzed. The percentage of the expressed protein found within IBs was quantified at different expression time points using densitometric analysis of soluble and inclusion body (insoluble) cell lysate fractions separated by centrifugation and subjected to polyacrylamide gel electrophoresis. Although most of these mutants differ from each other in only 3 amino acid positions, all found within a single flexible loop of the protein, the extent of IB formation varies greatly. This data set was used to test the performance of a variety of amino acid sequence-based protein aggregation prediction methods. Several of these methods produced predictions that correlate moderately well with the IB formation data (R² > 0.6), suggesting that while the intrinsic aggregation propensity of sequence segments strongly influences IB formation, other factors are also relevant. We hypothesized that improved predictions might be made possible by the consideration of additional structural context, i.e., aggregation-prone sequence segment exposure. Thermodynamic stabilities determined using differential scanning calorimetry correlate poorly with IB formation; all of the mutants are sufficiently stable that no significant fraction of protein is likely to be denatured at equilibrium.
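The IB percentage described above can be illustrated with a minimal sketch. It assumes that the percentage is the insoluble band intensity divided by the sum of the soluble and insoluble band intensities for the same sample; the abstract does not give the exact formula. The function name and the intensity values are hypothetical and are not data from the study.

```python
def percent_in_inclusion_bodies(soluble_intensity, insoluble_intensity):
    """Percentage of expressed protein found in the insoluble (IB) fraction.

    Assumption: both arguments are background-corrected densitometric band
    intensities, in the same arbitrary units, for the same protein band.
    """
    if soluble_intensity < 0 or insoluble_intensity < 0:
        raise ValueError("band intensities must be non-negative")
    total = soluble_intensity + insoluble_intensity
    if total == 0:
        raise ValueError("no protein detected in either fraction")
    return 100.0 * insoluble_intensity / total


# Hypothetical intensities for one mutant at three expression time points:
# hours after induction -> (soluble, insoluble)
time_course = {1: (900.0, 100.0), 2: (600.0, 400.0), 4: (250.0, 750.0)}

for hours, (soluble, insoluble) in time_course.items():
    percent = percent_in_inclusion_bodies(soluble, insoluble)
    print(f"{hours} h: {percent:.1f}% of expressed protein in IBs")
```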
To describe the variable structure of the flexible loop in which the mutant sequences differ, ensembles of homology models were constructed. IB formation was found to correlate with the ensemble average energy scores of the homology models. The ensemble average scores may capture subtle shifts in the energetic bias toward native structure that restricts the exposure of aggregation-prone sequence segments. A linear combination of sequence-based aggregation predictions and ensemble average homology model scores correlates much better with IB formation (R² > 0.8) than either parameter does individually.
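As an illustration of how two predictors can be combined linearly and scored by R², here is a minimal sketch using ordinary least squares. The abstract does not state how the linear combination was fitted, so the fitting method is an assumption. All function names and all numbers below are hypothetical; they are not data from the study.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; fit is not unique")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_two_predictors(x1, x2, y):
    """Least-squares fit of y = a*x1 + b*x2 + c; returns (a, b, c)."""
    rows = [[u, v, 1.0] for u, v in zip(x1, x2)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    return tuple(solve_linear_system(xtx, xty))


def r_squared(y, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for five mutants (NOT data from the study):
aggregation_score = [0.0, 1.0, 0.0, 1.0, 2.0]  # sequence-based prediction
model_energy = [0.0, 0.0, 1.0, 1.0, 1.0]       # ensemble average model score
ib_percent = [1.0, 3.0, 4.0, 6.0, 8.0]         # observed IB formation

a, b, c = fit_two_predictors(aggregation_score, model_energy, ib_percent)
predicted = [a * u + b * v + c for u, v in zip(aggregation_score, model_energy)]
print(f"a={a:.3f}, b={b:.3f}, c={c:.3f}, R^2={r_squared(ib_percent, predicted):.3f}")
```

Comparing this R² with the R² from fitting each predictor alone would show whether the combination adds explanatory power, which is the comparison the abstract reports.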