Neural Ctrl-F: Segmentation-free query-by-string word spotting in handwritten manuscript collections
Wilkinson, Tomas. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Visual Information and Interaction; Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction. ORCID iD: 0000-0002-6783-1744
Lindström, Jonas. Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Arts, Department of History. ORCID iD: 0000-0002-5245-937X
Brun, Anders. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Visual Information and Interaction; Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction. ORCID iD: 0000-0002-4405-6888
2017 (English). In: 2017 IEEE International Conference on Computer Vision (ICCV), IEEE, 2017, p. 4443-4452. Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, we approach the problem of segmentation-free query-by-string word spotting for handwritten documents. In other words, we use methods inspired by computer vision and machine learning to search for words in large collections of digitized manuscripts. In particular, we are interested in historical handwritten texts, which are often far more challenging than modern printed documents. This task is important, as it provides people with a way to quickly find what they are looking for in large collections that are tedious and difficult to read manually. To this end, we introduce an end-to-end trainable model based on deep neural networks that we call Ctrl-F-Net. Given a full manuscript page, the model simultaneously generates region proposals and embeds these into a distributed word embedding space, where searches are performed. We evaluate the model on common benchmarks for handwritten word spotting, outperforming the previous state-of-the-art segmentation-free approaches by a large margin, and in some cases even segmentation-based approaches. One interesting real-life application of our approach is to help historians find and count specific words in court records that are related to women's sustenance activities and division of labor. We provide promising preliminary experiments that validate our method on this task.

Place, publisher, year, edition, pages
IEEE, 2017. p. 4443-4452
Series
IEEE International Conference on Computer Vision, E-ISSN 1550-5499
Keywords [en]
Segmentation-free Word Spotting, Deep Learning, Convolutional Neural Network, Query-by-String
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
URN: urn:nbn:se:uu:diva-335926
DOI: 10.1109/ICCV.2017.475
ISI: 000425498404054
ISBN: 978-1-5386-1032-9 (electronic)
OAI: oai:DiVA.org:uu-335926
DiVA, id: diva2:1164427
Conference
16th IEEE International Conference on Computer Vision (ICCV), Venice, Italy, October 22-29, 2017
Projects
q2b
Funder
Swedish Research Council, 2012-5743; Riksbankens Jubileumsfond, NHS14-2068:1
Available from: 2017-12-11. Created: 2017-12-11. Last updated: 2018-05-24. Bibliographically approved

Open Access in DiVA

fulltext (9742 kB), 31 downloads
File information
File name: FULLTEXT01.pdf
File size: 9742 kB
Checksum (SHA-512): 6815bb2fc88742034c79f612c67cc17f0812111b20b79da6a18a601c5917b65dd2d887655762f98a2e90a02c5663717b12174bc4907f133fcd7b4ca183a06332
Type: fulltext
Mimetype: application/pdf

Search in DiVA

By author/editor
Wilkinson, Tomas; Lindström, Jonas; Brun, Anders