Estimating p-values for outlier detection
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Outlier detection is useful in a vast number of domains, wherever there is data and a need for analysis. The research area is large, and the number of available approaches is constantly growing. Most approaches produce a binary result: a data point either is an outlier or is not. This work investigates approaches that detect outliers by producing a p-value estimate. Such approaches are interesting because their results can easily be compared against each other, followed over time, or used with a variable threshold.
Four approaches are subjected to a variety of tests that measure their suitability when the data is distributed in a number of different ways. The first approach, the R2S, was developed at Halmstad University and is based on finding the mid-point of the data. The second approach is based on one-class support vector machines (OCSVM). The third and fourth approaches are both based on conformal anomaly detection (CAD) but use different nonconformity measures (NCMs): the Mahalanobis distance to the mean and a variation of k-NN.
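As an illustration of how a nonconformity measure can be turned into a p-value estimate, the following Python sketch scores a new point against a reference sample using the two NCMs named above. It is a simplified, assumption-laden example and not the implementation evaluated in the thesis: the function names, the choice k = 3, the synthetic Gaussian data, and the shortcut of scoring each point against a fixed reference sample (rather than recomputing every score with the test point included) are all assumptions made for brevity.

```python
import numpy as np


def mahalanobis_ncm(train, points):
    """Nonconformity score: Mahalanobis distance from each point to the mean of `train`."""
    mean = train.mean(axis=0)
    cov = np.atleast_2d(np.cov(train, rowvar=False))
    inv_cov = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    diff = points - mean
    sq = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return np.sqrt(np.maximum(sq, 0.0))  # clip tiny negative rounding errors


def knn_ncm(train, points, k=3, exclude_self=False):
    """Nonconformity score: mean distance to the k nearest points in `train`.

    Set exclude_self=True when `points` is `train` itself, so that the
    zero distance of each point to itself is skipped.
    """
    dists = np.linalg.norm(points[:, None, :] - train[None, :, :], axis=2)
    dists.sort(axis=1)
    start = 1 if exclude_self else 0
    return dists[:, start:start + k].mean(axis=1)


def conformal_p_value(train_scores, test_score):
    """Share of scores (test point included) at least as large as the test score."""
    n = len(train_scores)
    return (np.sum(train_scores >= test_score) + 1) / (n + 1)


# Synthetic, hypothetical data: a unimodal, symmetric 2-D reference sample.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 2))
inlier = np.array([[0.0, 0.0]])
outlier = np.array([[6.0, 6.0]])

maha_train = mahalanobis_ncm(train, train)
knn_train = knn_ncm(train, train, k=3, exclude_self=True)

p_maha_in = conformal_p_value(maha_train, mahalanobis_ncm(train, inlier)[0])
p_maha_out = conformal_p_value(maha_train, mahalanobis_ncm(train, outlier)[0])
p_knn_in = conformal_p_value(knn_train, knn_ncm(train, inlier, k=3)[0])
p_knn_out = conformal_p_value(knn_train, knn_ncm(train, outlier, k=3)[0])

print(f"Mahalanobis NCM: inlier p={p_maha_in:.3f}, outlier p={p_maha_out:.3f}")
print(f"k-NN NCM:        inlier p={p_knn_in:.3f}, outlier p={p_knn_out:.3f}")
```

A point would then be flagged as an outlier when its p-value falls below a chosen threshold, and because the output is a p-value rather than a binary label, that threshold can be varied without recomputing any scores.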
The R2S and the CAD Mahalanobis are both good at estimating p-values from data generated by unimodal and symmetrical distributions. The CAD k-NN is good at estimating p-values when the data is generated by a bimodal or extremely asymmetric distribution. The OCSVM does not excel in any scenario, but produces good average results in most of the tests. The approaches are also applied to real data, where they all produce comparable results.
Place, publisher, year, edition, pages
2014. 117 p.
Keywords
p-value, outlier, CAD, OCSVM
Identifiers
URN: urn:nbn:se:hh:diva-25662
Local ID: IDE1408
OAI: oai:DiVA.org:hh-25662
DiVA: diva2:725550
Subject / course
Computer science and engineering
Rögnvaldsson, Thorsteinn, Professor
Byttner, Stefan, Lecturer
Järpe, Eric, Senior Lecturer
Verikas, Antanas, Professor