Fast and Robust Mathematical Modeling of NMR Assignment Problems
NMR spectroscopy is used not only for protein structure determination, but also for drug screening and for studies of dynamics and interactions. In all of these applications, one of the main bottlenecks is backbone assignment. When a homologous structure is available, it can accelerate assignment; such structure-based methods are the focus of this thesis. The thesis aims for fast and robust methods for NMR assignment problems, in particular structure-based backbone assignment and chemical shift mapping. For speed, we identified situations in which the number of 15N-labeled experiments required for structure-based assignment can be reduced, namely when a homologous assignment or chemical shift mapping information is available. For robustness, we modeled and directly addressed the errors. Binary integer linear programming, a well-studied method in operations research, was used to model the problems and to provide practically efficient solutions with optimality guarantees. Our approach improved on the most robust method for structure-based backbone assignment on 15N-labeled data, first by increasing the accuracy by 10% on average on 9 proteins, and then by handling typing errors, which had previously been ignored. We show that such errors can have a large impact, decreasing the accuracy from 95% or greater to between 40% and 75%. On automatically picked peaks, which are much noisier than manually picked peaks, we achieved an accuracy of 97% on ubiquitin. In chemical shift mapping, peak tracking is often done manually because the problem is inherently visual. We developed a computer vision approach for tracking peak movements that achieved an average accuracy of over 95% on three proteins, with fewer than 1.5 residues predicted per peak. One of the proteins tested is larger than any tested by existing automated methods, and it has more titration peak lists.
We then combined peak tracking with backbone assignment to take into account contact information, which resulted in an average accuracy of 94% on one-to-one assignments for these three proteins. Finally, we applied peak tracking and backbone assignment to protein-ligand docking to illustrate the potential for fast 3D complex determination.
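To make the binary integer linear programming view of assignment concrete, the following is a minimal sketch and not the formulation used in the thesis. It assumes a hypothetical cost matrix `cost[i][j]` that scores how poorly peak `i` matches residue `j`, and it chooses binary variables `x[i][j]` so that every peak is assigned to exactly one residue and every residue is used at most once. The function name `solve_assignment_bilp` and the numbers are illustrative assumptions, and the exhaustive search stands in for a real integer programming solver, which would be needed for proteins of realistic size.

```python
from itertools import permutations


def solve_assignment_bilp(cost):
    """Minimise sum_ij cost[i][j] * x[i][j] over binary x[i][j], subject to:
    each peak i is assigned to exactly one residue, and
    each residue j receives at most one peak.

    Brute-force stand-in for an ILP solver; only practical for tiny inputs.
    Assumes there are at least as many residues as peaks.
    """
    n_peaks = len(cost)
    n_residues = len(cost[0])
    best_value, best_x = None, None
    # Every injective map peaks -> residues is exactly one feasible 0/1 matrix x.
    for chosen in permutations(range(n_residues), n_peaks):
        value = sum(cost[i][j] for i, j in enumerate(chosen))
        if best_value is None or value < best_value:
            best_value = value
            best_x = [
                [1 if chosen[i] == j else 0 for j in range(n_residues)]
                for i in range(n_peaks)
            ]
    return best_value, best_x


# Hypothetical mismatch costs for 2 peaks and 3 candidate residues.
cost = [
    [0.2, 1.5, 0.9],
    [1.1, 0.3, 0.8],
]
value, x = solve_assignment_bilp(cost)
```

Because the search enumerates every feasible binary matrix, the returned assignment is optimal for the given costs, which mirrors the optimality guarantee mentioned above, although a dedicated solver reaches it far more efficiently.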