Developing Engineering Principles for Computational Protein Design

Richard, Alan

Metadata Field	Value	Language
dc.contributor.advisor	Pantazes, Robert
dc.contributor.author	Richard, Alan
dc.date.accessioned	2025-11-05T14:18:51Z
dc.date.available	2025-11-05T14:18:51Z
dc.date.issued	2025-11-05
dc.identifier.uri	https://etd.auburn.edu/handle/10415/10039
dc.description.abstract	Accurate and efficient computational design of binding proteins remains an outstanding challenge with implications for therapeutics, diagnostics, sensing, and more. The recent successes of machine learning methods for protein structure prediction and binder design have thrust computational protein engineering into the spotlight. These methods have revolutionized how researchers think about protein design and have the accolades to show for it. However, the problem is far from solved, as many of these methods are plagued by low success rates, massive compute requirements, difficulty of use, and low mechanistic interpretability. To fit into generalizable methods for protein binder design, novel tools should not only yield high success rates but also retain a biological explanation and rationale. To extend to many applications and retain interpretability, contemporary methods should be built on first-principles thinking. Genuine engineering principles are necessary both to understand the successes of machine learning methods and to rationally design novel protein interactions that are beyond the reach of black boxes. In this work, the Expected Persistent Pairwise Interaction (EPPI) features were identified; these are properties of a protein interface that are important to protein binding. These features were derived from a molecular dynamics study of the archetypal binding proteins, antibodies, and encode biologically relevant properties of residue-level protein interactions. Training a random forest classifier on this feature space significantly reduced the rate of false positives from docking simulations compared with traditional Rosetta-based metrics. Further, when applied to the SKEMPI v2.0 database of mutations, a model trained on the EPPI features accurately predicted changes in binding free energy (ΔΔG), outperforming all comparable models in the literature.
The feature space was then incorporated into a computational redesign framework to optimize protein interfaces for more EPPIs. This method was validated on failed designs in a binder recovery campaign, and the designs were characterized with a suite of computational methods. Finally, it was demonstrated that a similar feature space, which considered pairwise residue stabilities, increased the hit rate of de novo designed minibinders to an industrially relevant analyte, Interleukin-6. All of these pieces fit together in a framework that incorporates biological hypotheses and engineering principles to create novel proteins from data-driven decisions. Together, these results suggest that the development of engineering principles not only guides the rational design of protein binders but also illuminates the successes of machine learning approaches.	en_US
dc.rights	EMBARGO_NOT_AUBURN	en_US
dc.subject	Chemical Engineering	en_US
dc.title	Developing Engineering Principles for Computational Protein Design	en_US
dc.type	PhD Dissertation	en_US
dc.embargo.length	MONTHS_WITHHELD:24	en_US
dc.embargo.status	EMBARGOED	en_US
dc.embargo.enddate	2027-11-05	en_US
dc.creator.orcid	0009-0001-8573-434X	en_US

Files in this item

Name: ACR_Final_Dissertation.pdf
Size: 34.41 MB
