An Entropy Estimate of Written Language and Twitter Language: A Comparison between English and Swedish
Linnaeus University, Faculty of Technology, Department of Mathematics.
2017 (English). Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis.
Abstract [en]

The purpose of this study is to estimate and compare the entropy and redundancy of written English and Swedish. We also investigate and compare the entropy and redundancy of Twitter language. This is done by extracting sequences of n consecutive characters, called n-grams, and calculating their frequencies. Exact values cannot be obtained, because the amount of text is finite whereas the entropy is defined in the limit as the text length tends to infinity. However, we do obtain results for n = 1,...,6, and the results show that written Swedish has higher entropy and lower redundancy than written English. Comparing Twitter language with the standard written languages, we find that Twitter language has higher entropy and lower redundancy.
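
The n-gram approach described in the abstract can be sketched roughly as follows. This is an illustration only, not the thesis's own code: the function names, the sample text, and the alphabet size of 27 are assumptions made for this example. It uses the standard Shannon-style estimate, where the block entropy H_n is computed from n-gram frequencies, the per-character estimate is F_n = H_n - H_(n-1), and redundancy is 1 - F_n / log2(alphabet size).

```python
import math
from collections import Counter


def ngram_counts(text, n):
    """Count every run of n consecutive characters in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def block_entropy(text, n):
    """Entropy in bits of the empirical n-gram distribution (H_n)."""
    if n == 0:
        return 0.0  # convention: H_0 = 0, so that F_1 = H_1
    counts = ngram_counts(text, n)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(text, n):
    """Per-character estimate F_n = H_n - H_(n-1)."""
    return block_entropy(text, n) - block_entropy(text, n - 1)


def redundancy(entropy_per_char, alphabet_size):
    """Redundancy relative to a uniform source over the alphabet."""
    return 1.0 - entropy_per_char / math.log2(alphabet_size)


if __name__ == "__main__":
    # Hypothetical toy input; a real estimate needs a large corpus.
    sample = "the quick brown fox jumps over the lazy dog " * 50
    alphabet_size = 27  # assumed: 26 letters plus space
    for n in range(1, 7):
        f_n = conditional_entropy(sample, n)
        print(n, round(f_n, 3), round(redundancy(f_n, alphabet_size), 3))
```

With a finite text, the n-gram counts become sparse as n grows, so the estimates for larger n are biased downward. This matches the abstract's remark that exact values cannot be obtained from a finite amount of text.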

Place, publisher, year, edition, pages
2017. 42 p.
Keyword [en]
entropy, entropy rate, redundancy, Twitter, natural language
National Category
Mathematics
Identifiers
URN: urn:nbn:se:lnu:diva-64952
OAI: oai:DiVA.org:lnu-64952
DiVA: diva2:1106653
Subject / course
Mathematics
Educational program
Applied Mathematics Programme, 180 credits
Supervisors
Examiners
Available from: 2017-06-08. Created: 2017-06-07. Last updated: 2017-06-08. Bibliographically approved.

Open Access in DiVA

fulltext (1115 kB), 34 downloads
File information
File name: FULLTEXT01.pdf
File size: 1115 kB
Checksum (SHA-512):
44cd35f4e24765673c34aa620e3e3c9de68c2df4c644299f2788d0d923684df3f8c38d15c11abf35f114f8999b054da83974fd2aa3b40d41396077a70159256b
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Juhlin, Sanna
By organisation
Department of Mathematics
Mathematics

Total: 34 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 129 hits