Digitala Vetenskapliga Arkivet

A Systematic Study of Semi-Supervised Learning Based on Shapley Value Data Valuation
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2022 (English). Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Semi-supervised learning algorithms seek to train prediction models on both labelled and unlabelled data that outperform prediction models trained only on labelled data. Semi-supervised learning is often realised through the selection of unlabelled instances with predicted pseudo-labels. The standard approach in the literature is to select pseudo-labelled instances based on the confidence values from the prediction models. An alternative, more direct approach proposed in the literature selects pseudo-labelled instances based on their contribution to the performance of a classifier; its authors use Shapley value based data valuation to realise this. We identify two areas of possible variance: when labels are provided for unlabelled instances and when the Shapley values are calculated. We propose five algorithms that employ cross-validation committee and bootstrapping strategies from ensemble learning to attempt to reduce these potential variances, and we provide a systematic study of semi-supervised learning using Shapley value based data valuation. It is experimentally shown that the proposed semi-supervised methods outperform methods trained only using labelled data.
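To make the idea of selecting pseudo-labelled instances by their contribution concrete, here is a minimal sketch of a generic Monte Carlo (permutation sampling) estimate of Shapley values. It is not one of the five algorithms proposed in the thesis, which this record does not describe. The 1-nearest-neighbour classifier, the toy one-dimensional data, the "keep if the value is positive" rule, and all function names are assumptions chosen for illustration.

```python
import random

def predict_1nn(train, x):
    """Return the label of the training point nearest to x (squared Euclidean)."""
    return min(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[1]

def accuracy(train, val):
    """Validation accuracy of a 1-NN classifier; 0.0 if there is no training data."""
    if not train:
        return 0.0
    return sum(predict_1nn(train, x) == y for x, y in val) / len(val)

def monte_carlo_shapley(labelled, pseudo, val, n_permutations=200, seed=0):
    """Estimate each pseudo-labelled point's Shapley value by permutation sampling.

    The utility of a subset is the validation accuracy of a classifier trained
    on the labelled data plus that subset (an assumed utility function).
    """
    rng = random.Random(seed)
    values = [0.0] * len(pseudo)
    order = list(range(len(pseudo)))
    for _ in range(n_permutations):
        rng.shuffle(order)
        train = list(labelled)
        prev = accuracy(train, val)
        for i in order:
            train.append(pseudo[i])
            cur = accuracy(train, val)
            values[i] += cur - prev  # marginal contribution of point i
            prev = cur
    return [v / n_permutations for v in values]

# Hypothetical toy data: (features, label) pairs on a line.
labelled = [((0.0,), 0), ((10.0,), 1)]
val = [((1.0,), 0), ((2.0,), 0), ((6.0,), 0), ((8.0,), 1), ((9.0,), 1)]
# The first pseudo-label agrees with the validation data; the second does not.
pseudo = [((5.0,), 0), ((3.0,), 1)]

values = monte_carlo_shapley(labelled, pseudo, val)
# Assumed selection rule: keep pseudo-labelled points with a positive value.
selected = [p for p, v in zip(pseudo, values) if v > 0]
print(values, selected)
```

In this toy setting the helpful pseudo-labelled point receives a positive value and the mislabelled one a negative value, so only the former is kept. Because both the pseudo-labels and the sampled permutations are sources of randomness in realistic settings, repeated runs can disagree, which is the kind of variance the abstract refers to.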

Place, publisher, year, edition, pages
2022. p. 43
Series
IT ; 22 062
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:uu:diva-485139
OAI: oai:DiVA.org:uu-485139
DiVA, id: diva2:1697410
Educational program
Master's Programme in Data Science
Available from: 2022-09-20 Created: 2022-09-20 Last updated: 2023-07-12

Open Access in DiVA

fulltext (1869 kB), 324 downloads
File information
File name: FULLTEXT01.pdf
File size: 1869 kB
Checksum (SHA-512):
b0f206a452c714e8a550f511bbb9075c4af7e948b90e7524f5bd5dd9d1a72c022e3ed88a792622a48ba4deee4eafd11bbab26c614c3a2061b4f576f45006a1f9
Type: fulltext
Mimetype: application/pdf

Total: 325 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 210 hits