Estimating p-values for outlier detection
Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE).
2014 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Outlier detection is useful in a vast number of different domains, wherever there is data and a need for analysis. The research area related to outlier detection is large, and the number of available approaches is constantly growing. Most of the approaches produce a binary result: either outlier or not. This work investigates approaches that detect outliers by producing a p-value estimate. Such approaches are interesting because their results can easily be compared against each other, followed over time, or used with a variable threshold.

Four approaches are subjected to a variety of tests that measure their suitability for data distributed in a number of ways. The first approach, R2S, was developed at Halmstad University and is based on finding the mid-point of the data. The second approach is based on one-class support vector machines (OCSVM). The third and fourth approaches are both based on conformal anomaly detection (CAD) but use different nonconformity measures (NCMs): the Mahalanobis distance to the mean and a variation of k-NN.
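To make the CAD idea concrete, the following is a minimal sketch of how a conformal p-value can be computed with the two kinds of NCM named above. It is not the thesis's implementation. The function names are hypothetical, the k-NN measure shown (sum of distances to the k nearest neighbours) is an assumed stand-in because the abstract does not specify which k-NN variation is used, and the synthetic Gaussian data is for illustration only.

```python
import numpy as np


def mahalanobis_ncm(train, x):
    """Nonconformity score: Mahalanobis distance from x to the mean of train."""
    mean = train.mean(axis=0)
    cov = np.atleast_2d(np.cov(train, rowvar=False))
    inv_cov = np.linalg.pinv(cov)  # pseudo-inverse tolerates a singular covariance
    diff = x - mean
    return float(np.sqrt(diff @ inv_cov @ diff))


def knn_ncm(train, x, k=3):
    """Nonconformity score (assumed variant): sum of distances to the k nearest neighbours."""
    dists = np.linalg.norm(train - x, axis=1)
    return float(np.sort(dists)[:k].sum())


def conformal_p_value(train, x_new, ncm):
    """Fraction of examples (including x_new) that are at least as nonconforming as x_new.

    Each example is scored against all the others, with x_new added to the bag.
    """
    bag = np.vstack([train, x_new])
    scores = np.empty(len(bag))
    for i in range(len(bag)):
        rest = np.delete(bag, i, axis=0)
        scores[i] = ncm(rest, bag[i])
    return float(np.mean(scores >= scores[-1]))


# Illustrative data: 200 points from a 2-D standard normal distribution.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 2))
typical = np.array([0.0, 0.0])
distant = np.array([8.0, 8.0])

for name, ncm in [("Mahalanobis", mahalanobis_ncm), ("k-NN", knn_ncm)]:
    p_typical = conformal_p_value(train, typical, ncm)
    p_distant = conformal_p_value(train, distant, ncm)
    print(f"{name}: p(typical)={p_typical:.3f}, p(distant)={p_distant:.3f}")
```

A small p-value marks the new example as unusual relative to the training data, and the caller can choose the threshold afterwards, which is the flexibility the abstract attributes to p-value-based approaches. The leave-one-out loop is quadratic in the number of examples and is kept here only for clarity.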

R2S and CAD with the Mahalanobis NCM are both good at estimating p-values from data generated by unimodal and symmetrical distributions. CAD with the k-NN NCM is good at estimating p-values when the data is generated by a bimodal or extremely asymmetric distribution. The OCSVM does not excel in any scenario, but produces good average results in most of the tests. The approaches are also applied to real data, where they all produce comparable results.

Place, publisher, year, edition, pages
2014. 117 p.
Keyword [en]
p-value, outlier, cad, ocsvm
National Category
Computer Science
URN: urn:nbn:se:hh:diva-25662
Local ID: IDE1408
OAI: diva2:725550
Subject / course
Computer science and engineering
Available from: 2014-06-18. Created: 2014-06-16. Last updated: 2014-06-18. Bibliographically approved.
