Digitala Vetenskapliga Arkivet

Handling Novel and Out-Of-Distribution Data in Deep Learning: OOD Detection and Shortcut Mitigation
Ahmadian, Amirhossein. Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning; Linköping University, Faculty of Science & Engineering. ORCID iD: 0000-0001-7411-2177
2025 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Advancements in machine learning, and particularly deep learning, have revolutionized the real-world applications of artificial intelligence in recent years. A main property of deep neural models is their ability to learn a task based on a set of examples, that is, the training data. Although the state-of-the-art performance of such models is promising in many tasks, this often holds only as long as the inputs to the model are “sufficiently similar” to the training data. Mathematically, a ubiquitous assumption in machine learning studies is that the test data used for evaluating a model are sampled from the same probability distribution as the training data. Any problem where this assumption is violated is challenging to approach, as it requires handling Out-Of-Distribution (OOD) data, i.e., data points that are systematically different from the training (in-distribution) data. In particular, one might be interested in detecting OOD inputs at test time given an unlabeled training set, which is the main problem explored in this thesis. This type of OOD detection (a.k.a. novelty/anomaly detection) has various applications in discovering unusual events and phenomena as well as improving safety in AI systems. Another challenging problem in deep learning is that a model might rely on certain trivial relations (spurious correlations) in the training data to solve a task. Such “shortcuts” can yield high performance on in-distribution data but may collapse on more realistic OOD data. It is therefore vital to mitigate the effects of shortcut learning in deep models, which is the second topic studied in this thesis.
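For concreteness, the distributional assumption mentioned above can be written out as follows; this is the standard formulation of the OOD detection setting rather than anything specific to this thesis.

```latex
% Standard formulation of the OOD detection setting (not specific to this thesis).
% Training data are assumed to be drawn i.i.d. from an unknown in-distribution:
\[
x_1, \dots, x_n \overset{\text{i.i.d.}}{\sim} P_{\text{in}},
\qquad
x_{\text{test}} \sim
\begin{cases}
P_{\text{in}} & \text{(in-distribution)} \\
P_{\text{out}} \neq P_{\text{in}} & \text{(out-of-distribution)}
\end{cases}
\]
% A detector assigns a scalar score s(x) to each input and flags it as OOD
% when the score exceeds a threshold \tau:
\[
\hat{y}(x) = \mathbb{1}\{\, s(x) > \tau \,\}
\]
```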

A part of the present thesis is concerned with leveraging pretrained deep models for OOD detection on images, without modifying their standard training algorithms. A method is proposed that uses invertible (flow-based) generative models together with null hypothesis testing ideas, leading to an OOD detector that is fast and more reliable than the traditional likelihood-based approach. Diffusion (score-based) models are another type of modern generative model used for OOD detection in this thesis, in combination with pretrained deep encoders. Another contribution of the thesis lies in leveraging the power of large self-supervised models for fully unsupervised fine-grained OOD detection. It is shown that the simple k-nearest neighbor distance in the representation space of such models already gives reasonable performance, and that this can be boosted substantially through the proposed adjustments, without any model fine-tuning. The local geometry of representations and background (irrelevant) features are considered to this end.
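As a minimal illustration of the k-nearest-neighbor baseline mentioned above (not the boosted method proposed in the thesis), the sketch below assumes features have already been extracted with a frozen self-supervised encoder; the choice of k and the L2 normalization are placeholder assumptions.

```python
# Minimal sketch of a k-NN OOD score in a pretrained representation space.
# This is only the simple baseline discussed above, not the adjusted method
# proposed in the thesis; the features are assumed to come from any frozen
# encoder (e.g., a DINO-pretrained vision transformer).
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_ood_scores(train_feats: np.ndarray, test_feats: np.ndarray, k: int = 5) -> np.ndarray:
    """Return the distance from each test point to its k-th nearest training representation.

    Larger scores indicate inputs farther from the (unlabeled) in-distribution
    training data, i.e., more likely OOD.
    """
    # L2-normalizing features is a common (but not mandatory) choice with
    # self-supervised encoders.
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    test = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)

    nn = NearestNeighbors(n_neighbors=k).fit(train)
    dists, _ = nn.kneighbors(test)   # shape: (n_test, k)
    return dists[:, -1]              # distance to the k-th neighbor
```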

OOD detection with time series data is another problem studied in this thesis. Specifically, a method is proposed based on Contrastive Predictive Coding (CPC) self-supervised learning and applied to detecting novel categories in human activity data. It is demonstrated, both empirically and through theoretical motivation, that modifying CPC to use a radial basis function instead of the conventional log-bilinear function is a requirement for reliable and efficient OOD detection. This extension is combined with quantization of the representation vectors to achieve better performance.
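For reference, the conventional CPC critic is the log-bilinear function of van den Oord et al.; a radial-basis-style alternative of the kind referred to above could take the following form, though the exact parameterization used in the thesis may differ.

```latex
% Conventional CPC critic (log-bilinear), scoring a future latent z_{t+k}
% against the context representation c_t:
\[
f_k(z_{t+k}, c_t) = \exp\!\left( z_{t+k}^{\top} W_k\, c_t \right)
\]
% A radial-basis-function style critic of the general kind discussed above
% (the exact form used in the thesis may differ):
\[
f_k(z_{t+k}, c_t) = \exp\!\left( -\frac{\lVert z_{t+k} - W_k c_t \rVert^2}{\sigma^2} \right)
\]
% Either critic is plugged into the InfoNCE objective, which contrasts the
% positive pair against negatives z_j drawn from other time steps or sequences:
\[
\mathcal{L}_{\text{NCE}} = -\,\mathbb{E}\!\left[ \log \frac{f_k(z_{t+k}, c_t)}{\sum_{j} f_k(z_j, c_t)} \right]
\]
```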

This thesis also addresses the problem of learning deep representations (transfer learning) when a shortcut exists in the data. In this setting, a deep model is trained on a shortcut-biased image dataset to solve a self-supervised or supervised classification task. The representations learned by this model are then used to train a smaller model on a related but different downstream task, where the adverse effect of the shortcut is verified empirically. Moreover, a method is proposed to enhance representation learning in this scenario, based on an auxiliary model trained in an adversarial manner alongside the upstream classifier.

Place, publisher, year, edition, pages
Linköping: Linköping University Electronic Press, 2025, p. 76
Series
Linköping Studies in Science and Technology. Dissertations, ISSN 0345-7524 ; 2445
National Category
Artificial Intelligence
Identifiers
URN: urn:nbn:se:liu:diva-212978
DOI: 10.3384/9789181180749
ISBN: 9789181180732 (print)
ISBN: 9789181180749 (electronic)
OAI: oai:DiVA.org:liu-212978
DiVA id: diva2:1951761
Public defence
2025-05-16, Ada Lovelace, B-building, Campus Valla, Linköping, 09:30 (English)
Available from: 2025-04-14. Created: 2025-04-14. Last updated: 2025-04-14. Bibliographically approved.
List of papers
1. Likelihood-free Out-of-Distribution Detection with Invertible Generative Models
2021 (English). In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI 2021), International Joint Conferences on Artificial Intelligence (IJCAI), 2021, p. 2119-2125. Conference paper, Published paper (Refereed)
Abstract [en]

The likelihood of generative models has traditionally been used as a score to detect atypical (Out-of-Distribution, OOD) inputs. However, several recent studies have found this approach to be highly unreliable, even with invertible generative models, where computing the likelihood is feasible. In this paper, we present a different framework for generative-model-based OOD detection that employs the model to construct a new representation space, instead of using it directly to compute typicality scores; we emphasize that the score function should be interpretable as the similarity between the input and the training data in the new space. In practice, with a focus on invertible models, we propose to extract low-dimensional features (statistics) based on the model encoder and the complexity of input images, and then use a One-Class SVM to score the data. Contrary to recently proposed OOD detection methods for generative models, our method does not require computing likelihood values. Consequently, it is much faster when using invertible models with iteratively approximated likelihood (e.g., iResNet), while its performance remains competitive with other related methods.
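A heavily simplified sketch of the general recipe described above (low-dimensional statistics plus a One-Class SVM) is given below. The concrete feature choices here, such as latent-norm statistics and a compression-based complexity proxy, are illustrative placeholders rather than the exact statistics used in the paper.

```python
# Illustrative sketch of the "features + One-Class SVM" recipe described above.
# The features used here (latent-norm statistics and a PNG-compression complexity
# proxy) are placeholders, not the exact statistics of the paper.
import io
import numpy as np
from PIL import Image
from sklearn.svm import OneClassSVM

def complexity_proxy(image: np.ndarray) -> float:
    """Approximate the complexity of a uint8 image by its losslessly compressed size in bytes."""
    buf = io.BytesIO()
    Image.fromarray(image).save(buf, format="PNG")
    return float(len(buf.getvalue()))

def make_features(latents: np.ndarray, images: list[np.ndarray]) -> np.ndarray:
    """Stack a few low-dimensional statistics per input: latent norm, latent mean,
    and a compression-based complexity estimate."""
    flat = latents.reshape(len(latents), -1)
    norms = np.linalg.norm(flat, axis=1)
    means = flat.mean(axis=1)
    comp = np.array([complexity_proxy(im) for im in images])
    return np.stack([norms, means, comp], axis=1)

def fit_and_score(train_latents, train_images, test_latents, test_images, nu: float = 0.1):
    """Fit a One-Class SVM on in-distribution feature statistics and return
    OOD scores for the test inputs (higher = more OOD-like)."""
    ocsvm = OneClassSVM(kernel="rbf", nu=nu)
    ocsvm.fit(make_features(train_latents, train_images))
    return -ocsvm.decision_function(make_features(test_latents, test_images))
```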

Place, publisher, year, edition, pages
International Joint Conferences on Artificial Intelligence (IJCAI), 2021
Series
Proceedings of the International Joint Conference on Artificial Intelligence, ISSN 1045-0823
Keywords
Deep Learning, Anomaly/Outlier Detection, Uncertainty Representations
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:liu:diva-188936
DOI: 10.24963/ijcai.2021/292
001202335502027 ()
Scopus ID: 2-s2.0-85125461759
ISBN: 9780999241196
Conference
International Joint Conference on Artificial Intelligence (IJCAI), 19-26 August, 2021
Available from: 2022-10-03. Created: 2022-10-03. Last updated: 2025-04-14. Bibliographically approved.
2. Unsupervised Novelty Detection in Pretrained Representation Space with Locally Adapted Likelihood Ratio
2024 (English). In: International Conference on Artificial Intelligence and Statistics 2024, Proceedings of Machine Learning Research, 2024, Vol. 238. Conference paper, Published paper (Refereed)
Abstract [en]

Detecting novelties given unlabeled examples of normal data is a challenging task in machine learning, particularly when the novel and normal categories are semantically close. Large deep models pretrained on massive datasets can provide a rich representation space in which the simple k-nearest neighbor distance works as a novelty measure. However, as we show in this paper, the basic k-NN method might be insufficient in this context because it ignores the 'local geometry' of the distribution over representations as well as the impact of irrelevant 'background features'. To address this, we propose a fully unsupervised novelty detection approach that integrates the flexibility of k-NN with a locally adapted scaling of dimensions, based on the 'neighbors of nearest neighbor', and computes a 'likelihood ratio' in pretrained (self-supervised) representation spaces. Our experiments with image data show the advantage of this method when off-the-shelf vision transformers (e.g., pretrained by DINO) are used as the feature extractor without any fine-tuning.
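To make the 'local geometry' point concrete, the sketch below normalizes a test point's neighbor distances by the distances its nearest training neighbors see among themselves. This is only one illustrative way to adapt the scale locally, not the locally adapted likelihood-ratio estimator proposed in the paper.

```python
# Illustrative local adaptation of a k-NN novelty score: the raw distance to a
# test point's nearest training neighbors is normalized by the distances those
# neighbors see among themselves ("neighbors of nearest neighbor"). This conveys
# the intuition only; it is not the method proposed in the paper.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def locally_scaled_knn_score(train_feats: np.ndarray, test_feats: np.ndarray, k: int = 5) -> np.ndarray:
    nn = NearestNeighbors(n_neighbors=k + 1).fit(train_feats)

    # Local scale of each training point: mean distance to its own k neighbors
    # (column 0 is the point itself at distance 0, so it is skipped).
    train_d, _ = nn.kneighbors(train_feats)
    local_scale = train_d[:, 1:].mean(axis=1)             # shape: (n_train,)

    # Raw distances from each test point to its k nearest training points.
    test_d, test_idx = nn.kneighbors(test_feats, n_neighbors=k)

    # Normalize by the average local scale of those neighbors.
    neighbor_scale = local_scale[test_idx].mean(axis=1)   # shape: (n_test,)
    return test_d.mean(axis=1) / (neighbor_scale + 1e-12)
```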

Series
Proceedings of Machine Learning Research, ISSN 2640-3498
National Category
Computer Sciences; Computer graphics and computer vision; Signal Processing
Identifiers
URN: urn:nbn:se:liu:diva-203391
001221034002024 ()
Conference
27th International Conference on Artificial Intelligence and Statistics (AISTATS), Valencia, SPAIN, MAY 02-04, 2024
Available from: 2024-05-08. Created: 2024-05-08. Last updated: 2025-04-14.
3. Enhancing Representation Learning with Deep Classifiers in Presence of Shortcut
2023 (English). In: Proceedings of IEEE ICASSP 2023, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

A deep neural classifier trained on an upstream task can be leveraged to boost the performance of another classifier on a related downstream task through the representations learned in its hidden layers. However, the presence of shortcuts (easy-to-learn features) in the upstream task can considerably impair the versatility of the intermediate representations and, in turn, the downstream performance. In this paper, we propose a method to improve the representations learned by deep neural image classifiers despite a shortcut in the upstream data. In our method, the upstream classification objective is augmented with a type of adversarial training in which an auxiliary network, the so-called lens, fools the classifier by exploiting the shortcut when reconstructing images. Empirical comparisons in self-supervised and transfer learning problems with three shortcut-biased datasets suggest the advantages of our method in terms of downstream performance and/or training time.
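A minimal sketch of how such an adversarial pairing between a classifier and an auxiliary reconstruction network ("lens") could be wired up is given below; the network architectures, loss weighting, and update schedule are illustrative assumptions, not the exact training procedure of the paper.

```python
# Sketch of an adversarial pairing between an upstream classifier and an
# auxiliary reconstruction network ("lens"). Here the lens tries to reconstruct
# inputs while increasing the classifier's loss, and the classifier is trained
# on both the original and lens-processed images. All weights and the update
# schedule are illustrative assumptions, not the paper's recipe.
import torch
import torch.nn.functional as F

def train_step(classifier, lens, opt_cls, opt_lens, x, y, recon_weight: float = 1.0):
    # --- Lens update: reconstruct x while trying to fool the classifier ---
    opt_lens.zero_grad()
    x_lens = lens(x)
    recon_loss = F.mse_loss(x_lens, x)
    fool_loss = -F.cross_entropy(classifier(x_lens), y)   # adversarial term
    (recon_weight * recon_loss + fool_loss).backward()
    opt_lens.step()

    # --- Classifier update: classify both original and lens-processed images ---
    opt_cls.zero_grad()
    cls_loss = F.cross_entropy(classifier(x), y) \
             + F.cross_entropy(classifier(lens(x).detach()), y)
    cls_loss.backward()
    opt_cls.step()
    return cls_loss.item()
```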

Keywords
Deep Representation Learning, Shortcut Learning, Transfer Learning, Adversarial Methods, Computer Vision
National Category
Computer Sciences; Computer graphics and computer vision
Identifiers
URN: urn:nbn:se:liu:diva-198763
DOI: 10.1109/ICASSP49357.2023.10096346
Conference
2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Available from: 2023-10-26. Created: 2023-10-26. Last updated: 2025-04-14.

Open Access in DiVA

fulltext: FULLTEXT01.pdf (9856 kB, application/pdf)