| dc.description.abstract | The adaptive immune response plays an essential role in viral detection and clearance and comprises B cells, T cells, and antibodies. The collective immune response arises from a series of single binding events with high specificity between immune molecules and a pathogen. Experimental methods have enabled measurement of single binding events, which can be used to identify disease-specific motifs or model the overall immune response. Such measurements help to improve diagnostics, therapeutics, vaccine development, and surveillance efforts. Understanding what the immune system can detect is crucial to improving overall health.
T cell responses are driven in large part by the recognition of linear viral peptides displayed by Major Histocompatibility Complex (MHC) class II molecules, which are encoded by Human Leukocyte Antigen (HLA) genes. Computational tools built on experimental binding assay data, such as NetMHCIIpan, can predict the binding between MHC class II alleles and peptides. These methods have been used to identify vaccine targets, design therapeutics, and determine a subset of population-level responses to viral infections. However, there is no framework for tracking how viral mutations affect MHC class II-mediated immune recognition at a global population level.
To address this gap, this dissertation introduces POP MHCII, a population-level MHC class II immune recognition framework that integrates predicted peptide-MHC class II binding, allele frequency data, ethnicity weightings, and protein abundance to quantify mutation-driven changes in immune recognition. The framework generates interpretable metrics, including Overall Impact, Protein Impact, and Ethnicity Impact scores. POP MHCII was first applied to longitudinal influenza A virus sequencing data, revealing recurring seasonal patterns characterized by population-level increases in predicted immune recognition early in the influenza season followed by cumulative decreases during peak circulation. These trends are consistent with immune evasion acting as a secondary evolutionary pressure during viral spread. Application of POP MHCII to SARS-CoV-2 variants showed that pre-Omicron lineages were generally associated with increased predicted MHC class II recognition, whereas post-Omicron variants more frequently exhibited decreases in recognition, reinforcing the interpretation that MHC-mediated immune evasion emerges later in viral evolution.
Determining what immune molecules actually recognize requires experimental investigation. Linear peptides have been used for this purpose, serving both as linear epitopes and as mimics of structural epitopes (mimotopes), and have helped identify disease-specific motifs to improve diagnostics and therapeutics. Identifying a motif requires many similar patterns, which are derived from a large number of peptides. Reducing the number of peptides needed to determine a motif would increase the amount of information that can be learned per experiment. It is hypothesized that a pattern-based peptide library would achieve this reduction. A major limitation of such a library would be the display of hydrophobic peptides.
To address this experimental bottleneck, this dissertation also presents computational work toward the design of a human-derived protein scaffold for N-terminal display of peptide patterns, including hydrophobic ones. A small human protein domain was identified based on predicted solubility, structural stability, and compatibility with peptide display, and rigid linker architectures were evaluated to preserve peptide accessibility while minimizing scaffold interference. | en_US |