Automated Bug Report Routing
Linköping University, Department of Computer and Information Science, The Division of Statistics and Machine Learning.
2017 (English)
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Abstract [en]

As the software industry grows larger by the minute, the need for automated solutions within bug report management is on the rise. Although some research has been conducted in the area of bug handling, new, faster or more precise approaches are yet to be developed. A bug report typically contains a free-text observations field in which a human can describe the issue. Research on processing this type of field is extensive; however, bug reports are often accompanied by system log files, which have received less attention so far. In the 4G LTE telecommunications network, many system log files are available, and several are likely to aid the routing of bug reports. In this thesis, one system log file was chosen for evaluation: the alarm log. The alarm logs are time series count data containing alarms raised by the system. The alarm log data have been pre-processed with data mining techniques. The Apriori algorithm has been used to mine for specific alarms and alarming objects which indicate that the bug report should be solved by a particular developer group. We extend the Apriori algorithm to a temporal setting by using a customised time-dependent confidence measure. To further mine for interesting sequences of events in the logs, the sequence mining approach SPADE has been used. The extracted class-associated sequences from both pre-processing approaches are transformed into binary features that can be used as predictors in any prediction model.

The results have been evaluated by predicting the correct developer group with two different methods: logistic regression and DO-probit. Logistic regression was regularised with the elastic net penalty to avoid computational issues and to handle the sparse covariate set. DO-probit was used with a horseshoe prior; it is well suited for the sparse covariate regression problem as it is customised to obtain signals in sparse, noisy data. The results indicate that a data mining approach for processing alarm logs is promising.

The results show that the rules obtained with the Apriori mining process are suitable for mining the alarm logs, as most binary representations of the rules used as covariates in logistic regression are kept in the equations for the expected classes with strongly positive coefficients. Although the overall improvement in accuracy from using the alarm logs in addition to the learned topics from free-text fields is modest, the alarm logs are concluded to be a good complement to the free-text information, as some Apriori covariates appear to be better suited to predicting some classes than some topics.
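
The step from mined rules to binary predictors, as described in the abstract, can be illustrated with a minimal sketch. It is not the thesis implementation: it uses the standard support and confidence measures rather than the customised time-dependent confidence, it considers only single-alarm antecedents (the first level of an Apriori-style search), and the alarm names, group names, thresholds and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical toy data: for each bug report, the set of alarms seen in its
# alarm log and the developer group that solved the report.
reports = [
    ({"A1", "A2"}, "group_x"),
    ({"A1"}, "group_x"),
    ({"A1", "A3"}, "group_y"),
    ({"A2", "A3"}, "group_y"),
    ({"A3"}, "group_y"),
]


def class_rules(reports, min_support=0.2, min_confidence=0.6):
    """Return rules `alarm -> group` that pass standard support/confidence.

    Only single-alarm antecedents are considered, which corresponds to the
    first level of an Apriori-style search.
    """
    n = len(reports)
    alarm_counts = Counter()  # reports containing the alarm
    joint_counts = Counter()  # reports containing the alarm AND solved by group
    for alarms, group in reports:
        for alarm in alarms:
            alarm_counts[alarm] += 1
            joint_counts[(alarm, group)] += 1

    rules = []
    for (alarm, group), joint in joint_counts.items():
        support = joint / n
        confidence = joint / alarm_counts[alarm]
        if support >= min_support and confidence >= min_confidence:
            rules.append((alarm, group, support, confidence))
    return sorted(rules)


def to_binary_features(alarms, rules):
    """Encode one report as 0/1 indicators: does each rule's alarm occur?"""
    return [1 if alarm in alarms else 0 for alarm, _, _, _ in rules]


rules = class_rules(reports)
features = to_binary_features({"A1", "A2"}, rules)
```

Each retained rule becomes one column of the covariate matrix, so a report is represented by a sparse 0/1 vector that any classifier can consume, for example a regularised logistic regression.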

Place, publisher, year, edition, pages
2017. p. 49
National Category
Probability Theory and Statistics
Identifiers
URN: urn:nbn:se:liu:diva-139037
ISRN: LIU-IDA/STAT-A--17/009--SE
OAI: oai:DiVA.org:liu-139037
DiVA, id: diva2:1117262
Subject / course
Statistics
Supervisors
Examiners
Available from: 2017-06-29 Created: 2017-06-28 Last updated: 2017-06-29
Bibliographically approved

Open Access in DiVA

fulltext (937 kB), 214 downloads
File information
File name: FULLTEXT01.pdf
File size: 937 kB
Checksum SHA-512:
6bf6cd13244375c5e3c8662367f809164c49d3e27da950cddfcfa02fbbb2ea036eff57e096045d31c4163bf8834cbed1a9530dec59ec5179157cdfbf13d075d4
Type: fulltext
Mimetype: application/pdf

Author
Svahn, Caroline