Context Dependent Thresholding and Filter Selection for Optical Character Recognition
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
2012 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis.
Abstract [en]

Thresholding algorithms and filters are of great importance when using OCR to extract information from text documents such as invoices. Invoice documents vary greatly, and since the performance of image processing methods applied to them varies accordingly, selecting appropriate methods is critical if a high recognition rate is to be obtained.

This paper aims to determine whether a document recognition system that automatically selects optimal processing methods, based on the characteristics of the input images, yields a higher recognition rate than can be achieved by manual selection. Such a recognition system, including a learning framework for selecting optimal thresholding algorithms and filters, was developed and evaluated. It was established that automatic selection ensures a high recognition rate when applied to a set of arbitrary invoice images, by successfully adapting to the input and avoiding the methods that yield poor recognition rates.
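
The idea of choosing a thresholding method from image characteristics can be illustrated with a minimal sketch. It is not taken from the thesis: the two features (mean intensity and contrast), the two candidate methods, the nearest-neighbour rule, and all names such as `MethodSelector` are assumptions made here for illustration only, and the recognition rates in the demo are invented placeholders.

```python
from statistics import mean, pstdev


def features(image):
    """Describe a grayscale image (list of rows, values 0-255) by
    mean intensity and contrast (population standard deviation)."""
    pixels = [p for row in image for p in row]
    return (mean(pixels), pstdev(pixels))


def threshold_fixed(image, t=128):
    """Binarize with a fixed global threshold (hypothetical candidate)."""
    return [[1 if p >= t else 0 for p in row] for row in image]


def threshold_mean(image):
    """Binarize with the image's own mean intensity as threshold."""
    t = mean(p for row in image for p in row)
    return [[1 if p >= t else 0 for p in row] for row in image]


# Candidate methods the selector may choose between (illustrative only).
METHODS = {"fixed": threshold_fixed, "mean": threshold_mean}


class MethodSelector:
    """Nearest-neighbour selector: remembers which method gave the best
    recognition rate on each training image and reuses that choice for
    the most similar new image."""

    def __init__(self):
        self.examples = []  # list of (feature vector, best method name)

    def train(self, image, scores):
        """scores maps method name -> recognition rate measured on image."""
        best = max(scores, key=scores.get)
        self.examples.append((features(image), best))

    def select(self, image):
        """Return the method name stored with the closest training image."""
        f = features(image)

        def squared_distance(example):
            return sum((a - b) ** 2 for a, b in zip(f, example[0]))

        return min(self.examples, key=squared_distance)[1]


# Demo with tiny made-up images and placeholder recognition rates.
dark = [[10, 20], [30, 200]]
bright = [[200, 220], [240, 90]]

selector = MethodSelector()
selector.train(dark, {"fixed": 0.4, "mean": 0.9})
selector.train(bright, {"fixed": 0.95, "mean": 0.6})

new_image = [[15, 25], [35, 190]]
chosen = selector.select(new_image)
binary = METHODS[chosen](new_image)
```

A real system would measure recognition rates by running OCR on labelled documents and would likely use richer features and a more robust learner; the sketch only shows the shape of the selection step.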

Place, publisher, year, edition, pages
2012. 45 p.
UPTEC F, ISSN 1401-5757 ; 12 036
Keyword [en]
digital image analysis, image thresholding, image filtering, machine learning
National Category
Engineering and Technology
URN: urn:nbn:se:uu:diva-197460
OAI: diva2:613004
External cooperation
ReadSoft AB
Educational program
Master Programme in Engineering Physics
Available from: 2013-05-06. Created: 2013-03-26. Last updated: 2013-05-06. Bibliographically approved.

Open Access in DiVA

fulltext (2666 kB), 436 downloads
File information
File name: FULLTEXT01.pdf
File size: 2666 kB
Checksum: SHA-512
Type: fulltext
Mimetype: application/pdf
