Machine learning to detect anomalies in datacenter
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Systems and Control.
2019 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

This thesis investigates the possibility of using anomaly detection on performance data of virtual servers in a datacenter to detect malfunctioning servers. Using anomaly detection can potentially reduce the time a server is malfunctioning, as the server can be detected and checked before the error has a significant impact.

Several approaches and methods were applied and evaluated on one virtual server: the K-nearest neighbor algorithm, the support-vector machine, the K-means clustering algorithm, self-organizing maps, the CPU-memory usage ratio using a Gaussian model, and time series analysis using a neural network and linear regression.
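
As an illustration of the simplest approach in the list above, the following is a minimal sketch of flagging anomalies in the memory-to-CPU usage ratio with a Gaussian model. It is not the thesis's implementation: the function name `ratio_anomalies`, the 3-sigma threshold, the choice to estimate the model on an initial window, and the sample values are all assumptions made for illustration.

```python
from statistics import mean, stdev


def ratio_anomalies(cpu, mem, train_n, k=3.0, eps=1e-9):
    """Return indices whose memory/CPU ratio lies more than k standard
    deviations from the mean estimated on the first train_n samples.

    Assumption: the first train_n samples represent normal operation.
    """
    # eps avoids division by zero when CPU usage is reported as 0.
    ratios = [m / (c + eps) for c, m in zip(cpu, mem)]
    mu = mean(ratios[:train_n])
    sigma = stdev(ratios[:train_n])
    return [i for i, r in enumerate(ratios) if abs(r - mu) > k * sigma]


# Hypothetical usage percentages; the last sample has low CPU but high memory.
cpu = [20.0, 22.0, 19.0, 21.0, 20.0, 21.0, 2.0]
mem = [40.0, 43.0, 39.0, 42.0, 41.0, 42.0, 45.0]
print(ratio_anomalies(cpu, mem, train_n=6))
```

In this toy data only the last sample is flagged, because its ratio is far outside the spread seen in the first six samples.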

The evaluation and comparison of the methods were mainly based on the errors reported during the period in which they were tested. The better the anomalies detected by a method matched the reported errors, the higher the score the method received.
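
The abstract does not state how the score was computed. One possible way to quantify how well detected anomalies match reported errors is sketched below, under stated assumptions: the function `match_score`, the matching window, and the timestamps are hypothetical and are not taken from the thesis.

```python
def match_score(anomaly_times, error_times, window=60.0):
    """Fraction of reported errors that have at least one detected
    anomaly within `window` time units (before or after the report)."""
    if not error_times:
        return 0.0
    matched = sum(
        any(abs(a - e) <= window for a in anomaly_times)
        for e in error_times
    )
    return matched / len(error_times)


# Hypothetical timestamps, e.g. minutes since the start of the test period.
errors = [100.0, 500.0, 900.0]
anomalies = [90.0, 130.0, 520.0, 700.0]
print(match_score(anomalies, errors))
```

This sketch rewards only coverage of reported errors and does not penalize anomalies that match no error; a fuller score would also need to account for false alarms.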

It turned out that anomalies in performance data could, to some extent, be linked to real errors in the server. This makes anomaly detection on performance data a possible way to detect malfunctioning servers. The simplest method, looking at the ratio between memory usage and CPU usage, was the most successful, detecting most errors. However, the anomalies were often detected just after the error had been reported. The support-vector machine was more successful at detecting anomalies before the errors were reported. The proportion of anomalies played a big role, however, and K-nearest neighbor received a higher score when the proportion of anomalies was higher.
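
To show why the proportion of anomalies matters for a distance-based detector, here is a minimal sketch of one common K-nearest-neighbor variant, in which the points with the largest mean distance to their k nearest neighbors are flagged. The function `knn_anomalies`, the parameter values, and the data are assumptions for illustration; the thesis may use a different formulation.

```python
import math


def knn_anomalies(points, k=2, proportion=0.1):
    """Flag the `proportion` of points with the largest mean distance
    to their k nearest neighbours (at least one point is flagged)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    n_flag = max(1, round(proportion * len(points)))
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_flag])


# Hypothetical (CPU %, memory %) samples with one clear and one mild outlier.
points = [(20, 40), (21, 41), (19, 39), (22, 42), (20, 41),
          (21, 40), (19, 40), (22, 41), (60, 10), (24, 45)]
print(knn_anomalies(points, proportion=0.1))
print(knn_anomalies(points, proportion=0.2))
```

Raising `proportion` from 0.1 to 0.2 flags the mild outlier in addition to the clear one, so the chosen proportion directly determines how many samples are reported as anomalies.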

Place, publisher, year, edition, pages
2019, p. 53
Series
UPTEC F, ISSN 1401-5757 ; 19040
Keywords [en]
machine learning, anomaly detection, server, support vector machine, performance data
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:uu:diva-388616
OAI: oai:DiVA.org:uu-388616
DiVA, id: diva2:1334370
External cooperation
Sandvik Coromant
Educational program
Master Programme in Engineering Physics
Available from: 2019-07-02. Created: 2019-07-02. Last updated: 2019-07-02. Bibliographically approved.

Open Access in DiVA

Full text (1627 kB)
File information
File name: FULLTEXT01.pdf
File size: 1627 kB
Checksum (SHA-512): 565b3c1a8be29868aafe80d402263706d7c487adfabc2d4f995df95ecb6d8cdf9caec89fcce53410c0a866dc7e48b07a167b524f0c06548dbefe0bc4f3d42203
Type: fulltext
Mimetype: application/pdf
