Developing Engineering Principles for Computational Protein Design
| Metadata Field | Value | Language |
|---|---|---|
| dc.contributor.advisor | Pantazes, Robert | |
| dc.contributor.author | Richard, Alan | |
| dc.date.accessioned | 2025-11-05T14:18:51Z | |
| dc.date.available | 2025-11-05T14:18:51Z | |
| dc.date.issued | 2025-11-05 | |
| dc.identifier.uri | https://etd.auburn.edu/handle/10415/10039 | |
| dc.description.abstract | Accurate and efficient computational design of binding proteins remains an outstanding challenge with implications for therapeutics, diagnostics, sensing, and more. The recent successes of machine learning methods for protein structure prediction and binder design have thrust computational protein engineering into the spotlight. These methods have revolutionized how researchers think about protein design and have the accolades to show for it. However, the problem is far from solved, as many of these methods are plagued by low success rates, massive compute requirements, difficulty of use, and low mechanistic interpretability. To integrate well into generalizable methods for protein binder design, novel tools should not only yield high success rates but also retain a biological explanation and rationale. To extend to many applications and retain interpretability, contemporary methods should be built on first-principles thinking. Genuine engineering principles are necessary both to understand the successes of machine learning methods and to rationally design novel protein interactions that are beyond the reach of black boxes. In this work, the Expected Persistent Pairwise Interaction (EPPI) features, properties of a protein interface that are important to protein binding, were identified. These features were derived from a molecular dynamics study of the archetypal binding proteins, antibodies, and encode biologically relevant properties of residue-level protein interactions. A random forest classifier trained on this feature space significantly reduced the rate of false positives from docking simulations compared with traditional Rosetta-based metrics. Further, when applied to the SKEMPI v2.0 database of mutations, a model trained on the EPPI features accurately predicted changes in binding free energy (∆∆G), outperforming all comparable models in the literature. The feature space was then incorporated into a computational redesign framework to optimize protein interfaces for more EPPIs. This method was validated on failed designs in a binder recovery campaign, and the designs were characterized with a suite of computational methods. Finally, it was demonstrated that a similar feature space, which considered pairwise residue stabilities, increased the hit rate of de novo designed minibinders to an industrially relevant analyte, Interleukin-6. All of these pieces fit together in a framework that incorporates biological hypotheses and engineering principles to create novel proteins from data-driven decisions. Together, these results suggest that the development of engineering principles not only guides the rational design of protein binders but also illuminates the successes of machine learning approaches. | en_US |
| dc.rights | EMBARGO_NOT_AUBURN | en_US |
| dc.subject | Chemical Engineering | en_US |
| dc.title | Developing Engineering Principles for Computational Protein Design | en_US |
| dc.type | PhD Dissertation | en_US |
| dc.embargo.length | MONTHS_WITHHELD:24 | en_US |
| dc.embargo.status | EMBARGOED | en_US |
| dc.embargo.enddate | 2027-11-05 | en_US |
| dc.creator.orcid | 0009-0001-8573-434X | en_US |
