Optical Character and Symbol Recognition using Tesseract
2016 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

The objective of this thesis was to examine and evaluate Optical Character Recognition technology for symbol recognition. Can the technology be used to recognize and verify symbols, and if so, how well? Today, symbol recognition relies on other image registration technologies. A second objective was to add Optical Character Recognition functionality to an existing automation tool at Volvo Cars Corporation. The implementation should be stable, efficient and robust, and should support various testing functions so that automated test cases can be set up. The thesis work was conducted in the VU-team at Volvo Cars Corporation, Gothenburg. The working method was agile, with two-week sprints and constant deliveries.
Symbols could indeed be recognized using Optical Character Recognition, some even very accurately. Results show that symbols which were recognized with a confidence above 73% in perfect conditions could very likely be recognized accurately when exposed to various noise. Symbols were recognized with a confidence above 75% even though the image resolution decreased from 96 to 16 PPI. The Tesseract OCR engine was able to recognize most symbols confidently even after they were put through heavy noise filtering, where pixel-to-pixel comparison techniques would fail. Generally, symbols sharing the same outlines were difficult to recognize correctly because the engine relies heavily on analysis of symbol outlines. This might change when a Neural Network classifier is fully implemented, which could make the Tesseract OCR engine a possible candidate for symbol recognition in the future. Symbol recognition using Optical Character Recognition cannot directly replace image registration, but the two techniques could complement each other. However, the process of enabling symbols to be recognized is very protracted. The thesis implementation resulted in a stable and efficient program that will be used for automated testing to localize, recognize and verify text in Volvo Cars' display interfaces.
The implementation is packaged as a JAR file and can be added as a library to any Java project. This format eases future development and reuse if the functionality is needed in other projects at Volvo Cars.

Place, publisher, year, edition, pages
2016. 67 p.
Keyword [en]
Technology
Keyword [sv]
Teknik, OCR, Optical Character Recognition, Symbol Recognition, Automation, System testing, Volvo Car Corporation
Identifiers
URN: urn:nbn:se:ltu:diva-46531
Local ID: 428c7b3f-ddc3-4728-9cb6-9ed0c2a65fbc
OAI: oai:DiVA.org:ltu-46531
DiVA: diva2:1019846
Subject / course
Student thesis, at least 30 credits
Educational program
Computer Science and Engineering, master's level
Note
Validated; 20160701 (global_studentproject_submitter)
Available from: 2016-10-04 Created: 2016-10-04 Bibliographically approved

Open Access in DiVA

fulltext (8798 kB), 1492 downloads
File information
File name: FULLTEXT02.pdf
File size: 8798 kB
Checksum (SHA-512): ef9c49fec19b2f6e29566f2dbccbe5804c844cadbd371fb22cb22eb122f7c9cd903ca0ddb9bbe188346c64fafc8e0f6076b2f269d77740bee5a318baf465bf88
Type: fulltext
Mimetype: application/pdf

By author/editor
Ohlsson, Victor

Total: 1492 downloads
The number of downloads is the sum of all downloads of full texts. It may include, e.g., previous versions that are no longer available.

Total: 631 hits