Digitala Vetenskapliga Arkivet

Change search
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
Cleartext detection and language identification in ciphers
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
2021 (English)Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE creditsStudent thesis
Abstract [en]

In historical cryptology, cleartext represents text written in a known language ina cipher (a hand-written manuscript aiming at hiding the content of a message).Cleartext can give us an historical interpretation and contextualisation of themanuscript and could help researchers in cryptanalysis, but to these days thereis still no research on how to automatically detect cleartext and identifying itslanguage. In this paper, we investigate to what extent we can automaticallydistinguish cleartext from ciphertext in transcribed historical ciphers and towhat extent we are able to identify its language. We took a rule-based approachand run 7 different models using historical language models on ciphertextsprovided by the DECRYPT-Project. Our results show that using unigrams andbigrams on a word-level combined with 3-grams, 4-grams and 5-grams on acharacter-level is the best approach to tackle cleartext detection.

Place, publisher, year, edition, pages
2021. , p. 33
Keywords [en]
historical cryptology, digital humanities
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:uu:diva-446439OAI: oai:DiVA.org:uu-446439DiVA, id: diva2:1569329
Subject / course
Digital humanities
Educational program
Master Programme in Language Technology
Supervisors
Examiners
Available from: 2021-06-19 Created: 2021-06-19 Last updated: 2021-06-19Bibliographically approved

Open Access in DiVA

fulltext(1322 kB)422 downloads
File information
File name FULLTEXT01.pdfFile size 1322 kBChecksum SHA-512
42aaa9fa30a03159a4eb1eb72eb123801c2a2924f419466cec19ab9c2f4d0a41beb35607951e0cc365e187d051babff2dec51a9284040b5fd5fead7e0efaa4e2
Type fulltextMimetype application/pdf

By organisation
Department of Linguistics and Philology
Language Technology (Computational Linguistics)

Search outside of DiVA

GoogleGoogle Scholar
Total: 422 downloads
The number of downloads is the sum of all downloads of full texts. It may include eg previous versions that are now no longer available

urn-nbn

Altmetric score

urn-nbn
Total: 2421 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf