Advancements in machine learning, particularly deep learning, have revolutionized real-world applications of artificial intelligence in recent years. A central property of deep neural models is their ability to learn a task from a set of examples, that is, the training data. Although the state-of-the-art performance of such models is promising in many tasks, this often holds only as long as the inputs to the model are “sufficiently similar” to the training data. Mathematically, a ubiquitous assumption in machine learning studies is that the test data used to evaluate a model are sampled from the same probability distribution as the training data. Any problem in which this assumption is violated is challenging to approach, as it requires handling Out-Of-Distribution (OOD) data, i.e., data points that are systematically different from the training (in-distribution) data. In particular, one might be interested in detecting OOD inputs at test time given an unlabeled training set, which is the main problem explored in this thesis. This type of OOD detection (also known as novelty or anomaly detection) has various applications in discovering unusual events and phenomena, as well as in improving the safety of AI systems. Another challenging problem in deep learning is that a model might rely on trivial relations (spurious correlations) present in the training data to solve a task. Such “shortcuts” can yield high performance on in-distribution data but may collapse on more realistic OOD data. Mitigating the effects of shortcut learning in deep models is therefore vital, and it is the second topic studied in this thesis.
Part of the present thesis is concerned with leveraging pretrained deep models for OOD detection on images, without modifying their standard training algorithms. For invertible (flow-based) generative models, a method based on null-hypothesis-testing ideas is proposed, leading to an OOD detection method that is fast and more reliable than the traditional likelihood-based method. Diffusion (score-based) models are another type of modern generative model used for OOD detection in this thesis, in combination with pretrained deep encoders. A further contribution of the thesis lies in leveraging the power of large self-supervised models for fully unsupervised fine-grained OOD detection. It is shown that the simple k-nearest-neighbor distance in the representation space of such models yields reasonable performance, which can be boosted substantially through the proposed adjustments, without any model fine-tuning. The local geometry of the representations and background (irrelevant) features are considered to this end.
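The baseline mentioned above, scoring a test input by its k-nearest-neighbor distance in a frozen representation space, can be sketched as follows. This is an illustrative sketch only: the feature dimensions, the value of k, and the random Gaussian features standing in for pretrained-encoder representations are all assumptions, and the thesis's proposed adjustments (for local geometry and background features) are not shown.

```python
import numpy as np

def knn_ood_scores(train_feats, test_feats, k=5):
    """Score each test point by the Euclidean distance to its k-th nearest
    training representation; a larger score suggests an OOD input."""
    # Pairwise distances between every test and every training point.
    d = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=-1)
    # Distance to the k-th nearest training representation.
    return np.sort(d, axis=1)[:, k - 1]

# Toy stand-ins for encoder features: in-distribution data near the origin,
# OOD data shifted away from it.
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(200, 8))
test_in = rng.normal(0.0, 1.0, size=(20, 8))
test_ood = rng.normal(5.0, 1.0, size=(20, 8))

# Shifted points sit far from all training representations, so their
# kNN-distance scores are larger on average.
assert knn_ood_scores(train, test_ood).mean() > knn_ood_scores(train, test_in).mean()
```

In practice, `train` and `test_*` would be feature vectors extracted by the frozen pretrained encoder rather than synthetic Gaussians.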
OOD detection with time-series data is another problem studied in this thesis. Specifically, a method is proposed based on Contrastive Predictive Coding (CPC), a self-supervised learning framework, and applied to detecting novel categories in human activity data. It is demonstrated, both empirically and through theoretical motivation, that modifying CPC to use a radial basis function instead of the conventional log-bilinear function is a requirement for reliable and efficient OOD detection. This extension is combined with quantization of the representation vectors to achieve better performance.
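To make the contrast concrete, the two similarity (critic) functions can be compared in a toy form. The log-bilinear critic z&#8314;Wc is the one used in the original CPC formulation; the radial-basis form below, its bandwidth sigma, and the variable names are illustrative stand-ins, not the thesis's actual parameterization. The key property is that the radial-basis score is bounded above and peaks exactly when the predicted representation matches the observed one, which is what makes it usable as an in-distribution score.

```python
import numpy as np

def log_bilinear_critic(z, c, W):
    # Conventional CPC critic: z^T W c, unbounded in the norm of z.
    return z @ W @ c

def rbf_critic(z, c, W, sigma=1.0):
    # Radial-basis critic (illustrative form): log of a Gaussian kernel
    # between z and the prediction W c; bounded above by 0 and maximal
    # when z exactly matches W c.
    return -np.sum((z - W @ c) ** 2) / (2.0 * sigma ** 2)

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
c = rng.normal(size=4)
z = W @ c  # a "perfectly predicted" future representation

# The RBF score is maximal (zero) at an exact match and drops as z moves away.
assert rbf_critic(z, c, W) == 0.0
assert rbf_critic(z + 0.5, c, W) < 0.0

# The log-bilinear score, by contrast, scales without bound with the norm of z,
# so its magnitude alone does not indicate how typical an input is.
assert np.isclose(log_bilinear_critic(10 * z, c, W), 10 * log_bilinear_critic(z, c, W))
```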
This thesis also addresses the problem of learning deep representations (transfer learning) when a shortcut exists in the data. In this setting, a deep model is trained on a shortcut-biased image dataset to solve a self-supervised or supervised classification task. The representations learned by this model are then used to train a smaller model on a related but different downstream task, where the adverse effect of the shortcut is verified empirically. Moreover, a method is proposed to enhance representation learning in this scenario, based on an auxiliary model trained in an adversarial manner alongside the upstream classifier.
Linköping: Linköping University Electronic Press, 2025, p. 76
2025-05-16, Ada Lovelace, B-building, Campus Valla, Linköping, 09:30 (English)