Loop Modeling in Proteins Using a Database Approach with Multi-Dimensional Scaling
Holtby, Daniel James
Modeling loops is often a necessary step in protein structure and function determination, even when experimental X-ray and NMR data are available, and it is well known to be difficult. Database techniques have the advantage of producing a higher proportion of predictions with sub-angstrom accuracy than ab initio techniques, but the disadvantage of often being unable to produce usable results, as they depend entirely on the loop already being represented within the database. My contribution is the LoopWeaver protocol, a database method that uses multidimensional scaling to rapidly achieve better clash-free, low-energy placement of loops obtained from a database of protein structures. This maintains the above-mentioned advantage while avoiding the disadvantage by permitting the use of lower-quality matches that would not otherwise fit. Test results show that, before refinement, this method achieves significantly better results than all other methods, including Modeler, Loopy, SuperLooper, and Rapper. With refinement, the results (LoopWeaver and Loopy combined) are better than ROSETTA's, with 0.53Å RMSD on average for 206 loops of length 6, 0.75Å local RMSD for 168 loops of length 7, 0.93Å RMSD for 117 loops of length 8, and 1.13Å RMSD for loops of length 9, while ROSETTA scores 0.66Å, 0.93Å, 1.23Å, and 1.56Å, respectively, at the same average time limit (3 hours on a 2.2 GHz Opteron). When ROSETTA is allowed to run for over a week against LoopWeaver's and Loopy's combined 3 hours, it approaches, but does not surpass, this accuracy.
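The accuracy figures above are root-mean-square deviations (RMSD) between predicted and reference loop coordinates. As a minimal sketch of that metric, the Python below computes RMSD over paired atom positions. The abstract does not state which atoms are compared or how the structures are superimposed, so this sketch assumes the two coordinate lists are already in the same frame; the function name `rmsd` and the sample coordinates are hypothetical and are not taken from LoopWeaver.

```python
import math


def rmsd(predicted, reference):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes both lists are already in the same coordinate frame
    (no superposition is performed here).
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("coordinate lists must be non-empty and the same length")
    total = 0.0
    for (px, py, pz), (rx, ry, rz) in zip(predicted, reference):
        # Squared Euclidean distance between corresponding atoms.
        total += (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
    return math.sqrt(total / len(predicted))


# Hypothetical coordinates in angstroms, invented for illustration only.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (1.5, 0.5, 0.0), (3.0, 0.5, 0.0)]

print(round(rmsd(predicted, reference), 2))  # prints 0.5
```

Every atom in this toy example is displaced by 0.5 Å, so the RMSD is 0.5 Å, which is the same scale as the sub-angstrom averages reported for loops of length 6 to 8.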