Is Simple Wikipedia simple? A study of readability and guidelines
Linköping University, Department of Computer and Information Science.
2018 (English). Independent thesis, Basic level (degree of Bachelor), 12 credits / 18 HE credits. Student thesis.
Abstract [en]

Creating easy-to-read text has traditionally required manual work. With advancing research in natural language processing, however, automatic systems for text simplification are being developed. These systems often need parallel-aligned training data, and for several years Simple Wikipedia has been the main source of this data. In the current study, several readability measures have been tested on a popular simplification corpus. A selection of guidelines from Simple Wikipedia has also been operationalized and tested. The results imply that the guidelines are not followed to a greater extent in Simple Wikipedia than in standard Wikipedia. There are, however, differences in the readability measures. The syntactic structures of Simple Wikipedia seem to be less complex than those of standard Wikipedia. A continuation of this study would be to examine other readability measures and to evaluate the guidelines not covered in the current work.

Place, publisher, year, edition, pages
2018. p. 26
Keywords [en]
corpus, readability, Wikipedia, automatic text simplification
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:liu:diva-161890
ISRN: LIU-IDA/KOGVET-G--18/029--SE
OAI: oai:DiVA.org:liu-161890
DiVA, id: diva2:1369352
Subject / course
Cognitive science
Supervisors
Examiners
Available from: 2019-11-14. Created: 2019-11-11. Last updated: 2019-11-14. Bibliographically approved.

Open Access in DiVA

fulltext (209 kB), 9 downloads
File information
File name: FULLTEXT01.pdf
File size: 209 kB
Checksum SHA-512:
88803ee7cd09a3b3ba945e022cf13828643a6ab8e6b15287c4a618893c607eaf3477a937d6ad2d7254a33975be812d5155315b8e38b266928bddaaa54a771b1a
Type: fulltext
Mimetype: application/pdf

Author
Isaksson, Fabian