Digitala Vetenskapliga Arkivet

Data extraction of digitized old newspaper content to streamline the search process for users with a genealogy perspective
Linköping University, Department of Science and Technology, Media and Information Technology. Linköping University, Faculty of Science & Engineering.
2019 (English). Independent thesis, Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

This thesis presents the extraction of data from digitized old newspapers and the implementation of a search function that simplifies finding relevant content. It was developed as a master’s degree project at Linköping University. The application lets the user search a database of articles for content of interest and can be used by genealogists, local historians and novices alike. The database is populated with data from OCR-scanned newspapers, and the user can search it either on their own or with the help of their family tree. The family tree is implemented by reading the user's GEDCOM file and extracting useful information, which is then used to improve the search results. The results are returned to the user in the form of digital articles. The work concludes that the information in GEDCOM files can be used to find new and interesting facts, and that the user should be allowed to influence how the data is reduced, through article categorization and filtering.
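The abstract does not say how the GEDCOM file is read, so the following is only a minimal sketch of the general idea, not the thesis implementation. It assumes the standard GEDCOM line layout (`level [xref] tag [value]`) and pulls out each person's name plus birth and death details. The function names `parse_individuals` and `search_terms`, the dictionary keys, and the sample data are all hypothetical, chosen for illustration.

```python
import re

# Hypothetical sample data in standard GEDCOM line layout.
SAMPLE = """0 HEAD
0 @I1@ INDI
1 NAME Anna /Svensson/
1 BIRT
2 DATE 12 MAR 1885
2 PLAC Linköping
1 DEAT
2 DATE 1950
0 @I2@ INDI
1 NAME Karl /Svensson/
1 BIRT
2 DATE ABT 1860
0 TRLR
"""

def parse_individuals(gedcom_text):
    """Collect name, birth and death details for each INDI record."""
    individuals = []
    current = None  # the INDI record currently being filled in
    event = None    # the level-1 event (BIRT or DEAT) we are inside
    for raw in gedcom_text.splitlines():
        parts = raw.strip().split(" ", 2)
        if len(parts) < 2 or not parts[0].isdigit():
            continue  # skip blank or malformed lines
        level = int(parts[0])
        if level == 0:
            # "0 @I1@ INDI" starts a person; any other level-0 line ends it.
            if len(parts) == 3 and parts[2] == "INDI":
                current = {"id": parts[1]}
                individuals.append(current)
            else:
                current = None
            event = None
            continue
        if current is None:
            continue
        tag = parts[1]
        value = parts[2] if len(parts) == 3 else ""
        if level == 1:
            event = tag if tag in ("BIRT", "DEAT") else None
            if tag == "NAME":
                # GEDCOM wraps the surname in slashes: "Anna /Svensson/".
                current.setdefault("name", value.replace("/", "").strip())
        elif level == 2 and event and tag in ("DATE", "PLAC"):
            current[f"{event.lower()}_{tag.lower()}"] = value
    return individuals

def search_terms(person):
    """Reduce one parsed person to fields a search query could use."""
    def year(key):
        match = re.search(r"\b(\d{4})\b", person.get(key, ""))
        return int(match.group(1)) if match else None
    return {
        "name": person.get("name"),
        "birth_year": year("birt_date"),
        "death_year": year("deat_date"),
        "place": person.get("birt_plac"),
    }

people = parse_individuals(SAMPLE)
terms = [search_terms(p) for p in people]
```

Terms like these could then narrow a query against the article database, for example by restricting matches on a name to issues published within the person's lifetime. How the thesis actually combines them is not stated in this record.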

Place, publisher, year, edition, pages
2019. p. 61
Keywords [en]
genealogy, MySQL database, newspapers
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:liu:diva-160533
ISRN: LiU-ITN-TEK-A--19/026--SE
OAI: oai:DiVA.org:liu-160533
DiVA, id: diva2:1354620
Subject / course
Media Technology
Uppsok
Technology
Supervisors
Examiners
Available from: 2019-09-25 Created: 2019-09-25 Last updated: 2025-02-18 Bibliographically approved

Open Access in DiVA

Data extraction of digitized old newspaper content to streamline the search process for users with a genealogy perspective (5281 kB), 739 downloads
File information
File name: FULLTEXT01.pdf
File size: 5281 kB
Checksum SHA-512: 37c1ae36c3ea8d5c67ea9710e4f477e133a8a255495a4eea134e5e0c694979927ede125f2557e9f584c0a85c3fca6b9d944ef797b205d778acad2d8b356ba7c7
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Pettersson, Sandra
By organisation
Media and Information Technology
Faculty of Science & Engineering
Computer and Information Sciences

Total: 742 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 972 hits