Digitala Vetenskapliga Arkivet

Machine Learning for Software Bug Categorization
Uppsala universitet, Teknisk-naturvetenskapliga vetenskapsområdet, Matematisk-datavetenskapliga sektionen, Institutionen för informationsteknologi.
2019 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
Abstract [en]

The pursuit of flawless software is often an exhausting task for software developers. Code defects range from soft issues to hard issues with unforgiving consequences. DICE has its own system that automatically collects these defects and groups them into buckets. However, this system sometimes groups unrelated issues together and misses apparent duplicates. This flaw creates time-consuming extra work for software developers and wastes company resources. It also degrades the data quality of the system's defect-tracking datasets, which turns into a vicious circle. In this thesis, we investigate measuring the similarity between reports in order to reduce incorrectly grouped issues and duplicate reports. Prototype models for bug categorization and bucketing have been built using convolutional neural networks. For each report, the prototype provides developers with candidate related issues, each with a likelihood metric indicating whether the issues are related. The similarity measurement is made in the representation phase of the neural networks, which we call the latent space. We also use Kullback–Leibler divergence in this space to obtain better similarity metrics. The results show important findings and insights for further improvement in the future. In addition, we discuss methods and strategies for detecting outliers using Mahalanobis distance in order to prevent incorrectly grouped reports.
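
The two measures named in the abstract can be illustrated with a small, self-contained sketch. It is not the thesis implementation: the function names, the toy 2-D vectors, and the choice to turn latent vectors into probability distributions with a softmax before applying Kullback–Leibler divergence are all assumptions made here for illustration.

```python
import math

def softmax(v):
    """Map a latent vector to a probability distribution (an assumption of this sketch)."""
    m = max(v)
    exps = [math.exp(x - m) for x in v]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def symmetric_kl(u, v):
    """Symmetrised KL between two latent vectors; lower means more similar."""
    p, q = softmax(u), softmax(v)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))

def mean_and_covariance(points):
    """Sample mean and unbiased sample covariance of a bucket of latent vectors."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    cov = [[sum((p[i] - mean[i]) * (p[j] - mean[j]) for p in points) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def mahalanobis(x, mean, cov):
    """sqrt((x - mean)^T cov^{-1} (x - mean)), computed without forming the inverse."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return math.sqrt(sum(d * s for d, s in zip(diff, solve(cov, diff))))

# Hypothetical bucket of 2-D latent vectors (toy data, not from the thesis).
bucket = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (2.5, 2.5)]
bucket_mean, bucket_cov = mean_and_covariance(bucket)

if __name__ == "__main__":
    print(symmetric_kl([1.0, 2.0, 3.0], [1.0, 2.0, 2.5]))
    print(mahalanobis((4.5, 4.5), bucket_mean, bucket_cov))
    print(mahalanobis((4.5, 0.5), bucket_mean, bucket_cov))
```

In this toy bucket the points (4.5, 4.5) and (4.5, 0.5) are equally far from the bucket mean in Euclidean terms, yet their Mahalanobis distances are about 2.0 and 4.0, because the second point lies against the bucket's direction of correlation. A threshold on that distance is one plausible way to flag a report as an outlier for its bucket; the threshold itself would have to be tuned on real data.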

Place, publisher, year, edition, pages
2019. p. 63
Series
UPTEC IT, ISSN 1401-5749 ; 19018
National category
Identifiers
URN: urn:nbn:se:uu:diva-395253; OAI: oai:DiVA.org:uu-395253; DiVA, id: diva2:1361472
Educational program
Master of Science Programme in Information Technology Engineering
Supervisor
Examiner
Available from: 2019-10-16. Created: 2019-10-16. Last updated: 2021-02-18. Bibliographically approved.

Open Access in DiVA

fulltext (7238 kB), 800 downloads
File information
File: FULLTEXT01.pdf
File size: 7238 kB
Checksum (SHA-512): 75a7c52138c15bcdb77255fea319324ba4cb71dd112bf20f49ca52067c4f997ca73c6ebd3294acaa85f293bb84a081412155face39740e786bd3f0a07c5f8ab0
Type: fulltext
Mimetype: application/pdf
