This Is Auburn

Minimizing Data Dependency through Predictive and Few-Shot Approaches

Date

2025-11-25

Author

Trehan, Shubham

Abstract

The advancement of deep learning in complex real-world applications is fundamentally constrained by the scarcity of high-quality labeled data. This challenge is particularly acute in two critical domains: biomedical diagnostics, where data acquisition is costly and class frequencies follow a long-tailed distribution, and social activity recognition (SAR), where the need for dense annotations conflicts with privacy concerns. This dissertation addresses the data bottleneck by developing novel frameworks that leverage self-supervised learning (SSL) and few-shot learning (FSL) principles to learn robust, generalizable representations from minimal labeled supervision or from unlabeled data. We present three primary contributions. First, ProtoKD is a few-shot classification framework that integrates prototypical networks with self-distillation and domain-specific augmentation to achieve state-of-the-art multi-class parasitic ova recognition from extremely scarce data, learning from only one sample per class. Second, building on this, we introduce FSP-DETR, a unified few-shot object detection and open-set recognition framework for microscopic imaging. FSP-DETR extends prototype-based learning to the object detection paradigm by pairing a class-agnostic DETR backbone with prototype-guided embedding refinement, enabling simultaneous few-shot detection, open-set rejection, and cross-domain generalization without retraining. We also establish a new ova species detection benchmark with 20 parasite classes and publicly release the dataset to drive systematic low-shot evaluation. Third, complementing our work in static image analysis, we present a novel self-supervised multi-actor predictive learning framework for SAR in streaming videos, moving beyond the traditional single-actor/single-action assumption.
This approach models social interactions via an action graph and applies spatial-temporal graph smoothing, enabling robust group activity recognition and individual action detection from unlabeled streaming data, with performance competitive with many supervised methods. Collectively, this research pioneers scalable and versatile deep learning solutions for data-constrained, high-stakes environments, demonstrating that principled design based on metric learning, self-distillation, and predictive modeling can overcome the reliance on massive labeled datasets.