A comparative study of structured prediction methods for sequence labeling
KTH, School of Computer Science and Communication (CSC).
2016 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

Some machine learning tasks have a complex output rather than a single real number or class label. Such outputs are composed of elements that have interdependencies and structural properties. Methods that take the form of the output into account are known as structured prediction techniques. This study evaluates and compares the performance of those techniques on sequence labeling tasks, using natural language processing tasks as benchmarks.

The principal task evaluated is part-of-speech tagging. Datasets in different languages (English, Spanish, Portuguese and Dutch) and from different environments (newspapers, Twitter and chats) are used for a general analysis. Shallow parsing and named entity recognition are also examined. The algorithms considered are the structured perceptron, conditional random fields, structured support vector machines and trigram hidden Markov models. They are also compared with other approaches to these problems.

The results show that, in general, the structured perceptron performs best for sequence labeling under the conditions evaluated. However, with few training examples, structured support vector machines can achieve similar or superior accuracy. Moreover, the results for conditional random fields are close to those of these two methods. The relative results of the algorithms are similar across datasets, but the absolute accuracies depend on the specifics of each dataset.
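
To make the idea of structured prediction for sequence labeling concrete, the following is a minimal sketch of a structured perceptron tagger with Viterbi decoding. It is not the implementation used in the thesis. The feature set (word/tag emission features plus tag-bigram transition features), the function names, the start symbol "<s>" and the toy data are all assumptions chosen for illustration, and the weights are not averaged. Sentences are assumed to be non-empty.

```python
from collections import defaultdict

START = "<s>"  # hypothetical start-of-sentence symbol

def viterbi(words, tags, weights):
    """Return the highest-scoring tag sequence for a non-empty sentence."""
    # best[i][t] = (score of best path ending in tag t at position i, previous tag)
    best = [{t: (weights[("emit", words[0], t)] + weights[("trans", START, t)], None)
             for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            emit = weights[("emit", words[i], t)]
            column[t] = max(
                ((best[i - 1][p][0] + weights[("trans", p, t)] + emit, p) for p in tags),
                key=lambda pair: pair[0],
            )
        best.append(column)
    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

def features(words, tag_seq):
    """Count emission and transition features of a tagged sentence."""
    counts = defaultdict(int)
    prev = START
    for word, tag in zip(words, tag_seq):
        counts[("emit", word, tag)] += 1
        counts[("trans", prev, tag)] += 1
        prev = tag
    return counts

def train(data, tags, epochs=5):
    """Perceptron update: reward gold features, penalise predicted ones."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tags, weights)
            if pred != gold:
                for feat, count in features(words, gold).items():
                    weights[feat] += count
                for feat, count in features(words, pred).items():
                    weights[feat] -= count
    return weights

# Toy data, invented for this sketch only.
TAGS = ["DET", "NOUN", "VERB"]
DATA = [
    (["the", "dog", "barks"], ["DET", "NOUN", "VERB"]),
    (["a", "cat", "sleeps"], ["DET", "NOUN", "VERB"]),
]
WEIGHTS = train(DATA, TAGS)
print(viterbi(["the", "cat", "barks"], TAGS, WEIGHTS))
```

The whole tag sequence is scored and predicted jointly, so the transition features let the tag chosen for one word influence the tags of its neighbours. This is the sense in which the method takes the structure of the output into account, in contrast to classifying each word independently.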

Place, publisher, year, edition, pages
2016.
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-186385
OAI: oai:DiVA.org:kth-186385
DiVA: diva2:927145
Presentation
2016-05-02, E32, Stockholm, 11:15 (English)
Available from: 2016-05-18 Created: 2016-05-11 Last updated: 2016-05-18 Bibliographically approved

Open Access in DiVA

fulltext (1239 kB), 116 downloads
File information
File name: FULLTEXT01.pdf
File size: 1239 kB
Checksum (SHA-512): 062a114e798e6515ceb4826fb6dc993593367df18516ca015f7834099f693d5f1ca521da3ba80bf5a843e6e48bf5ebc6bb082608e62c9a0bcecd3a27aac3cd2f
Type: fulltext
Mimetype: application/pdf
