Optical Character and Symbol Recognition using Tesseract
2016 (English)
Independent thesis Advanced level (professional degree), 20 credits / 30 HE credits
Student thesis
Abstract [en]

The first objective of the thesis was to examine and evaluate Optical Character Recognition (OCR) technology for symbol recognition. Can the technology be used to recognize and verify symbols, and if so, how well? Today, symbol recognition relies on other image registration technologies. The second objective was to add Optical Character Recognition functionality to an existing automation tool at Volvo Cars Corporation. The implementation should be stable, efficient, and robust, and should support various testing functions so that automated test cases can be set up. The thesis work was conducted in the VU-team at Volvo Cars Corporation, Gothenburg. The working method was agile, with two-week sprints and continuous deliveries.

Symbols could indeed be recognized using Optical Character Recognition, some even very accurately. Results show that symbols which were recognized with a confidence above 73% in perfect conditions could very likely be recognized accurately when exposed to various kinds of noise. Symbols were recognized with a confidence above 75% even when the image resolution was decreased from 96 to 16 PPI. The Tesseract OCR engine was able to recognize most symbols confidently even after they were put through heavy noise filtering, where pixel-to-pixel comparison techniques would fail. In general, symbols sharing the same outlines were difficult to recognize correctly, because the recognition relies heavily on analysis of symbol outlines. This might change when a Neural Network classifier is fully implemented, which could make the Tesseract OCR engine a possible candidate for symbol recognition in the future. Symbol recognition using Optical Character Recognition cannot directly replace image registration, but the two techniques could complement each other. However, the process of enabling symbols to be recognized is very protracted. The thesis implementation resulted in a stable and efficient program that will be used in automated testing to localize, recognize, and verify text in the display interfaces of Volvo Cars.
The implementation is packaged as a JAR file and can be added as a library to any Java project. This format eases future development and reuse of the functionality in other projects at Volvo Cars.

Place, publisher, year, edition, pages
2016. 67 p.
Keyword [en]
Keyword [sv]
Technology, OCR, Optical Character Recognition, Symbol Recognition, Automation, System testing, Volvo Car Corporation
URN: urn:nbn:se:ltu:diva-46531
Local ID: 428c7b3f-ddc3-4728-9cb6-9ed0c2a65fbc
OAI: diva2:1019846
External cooperation
Subject / course
Student thesis, at least 30 credits
Educational program
Computer Science and Engineering, master's level
Validated; 20160701 (global_studentproject_submitter)
Available from: 2016-10-04
Created: 2016-10-04
Bibliographically approved

Open Access in DiVA

File information
File name: FULLTEXT02.pdf
File size: 8798 kB
Checksum SHA-512
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Ohlsson, Victor
